Dan Callaghan

Selecting a preferred variant with Apache MultiViews

When Apache performs content negotiation, it follows an exhaustive, well-thought-out algorithm for selecting the best variant. A typical web browser sends something like this in its request headers:

Accept: application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,*/*;q=0.5

If, like me, you want to serve up multiple content types using negotiation, but show HTML by default for normal users, then step 1 of the Apache algorithm will already do the right thing for you. The browser has given HTML a higher q value than plain text or the */* catch-all, so Apache will pick the HTML variant.

Unfortunately Google (and other search engines) send only this:

Accept: */*

in which case every variant is equally acceptable and the other rules in the Apache algorithm come into play. In my case, the plain text variant of my resources was being selected because of step 8 in the algorithm (pick the variant with the smallest content length), so Apache was serving plain text to Google and HTML to everyone else.

Not only is this against Google’s rules (it would be considered “cloaking”), but it’s also really bad for rankings!

The solution is to configure “quality of source” (qs) multipliers, such that your preferred content type comes out with the highest quality score in cases where the client sends Accept: */*. That’s easy if you’re using type map files, but who wants to create a type map for every resource on their site? MultiViews gives you automatic type mapping, but the docs don’t explain how to set the qs multiplier for MultiViews.

Luckily, it is possible, just not obvious. I eventually found a thread on the Apache users mailing list from 2002 which explains: you can use the AddType directive to redefine content types with a qs parameter, which Apache will then apply in its negotiation algorithm. Your config would look something like this:

<Directory "/var/www/localhost/htdocs">
    Options +MultiViews
    AddType application/atom+xml;qs=0.5 .atom
    AddType text/plain;qs=0.1 .txt
    # all other content types will have a default qs multiplier of 1.0
    ...
</Directory>

Unfortunately Apache will pass the qs parameter on to clients in its response headers. As mentioned in the mailing list thread, it may or may not adhere to the spec, and it certainly is ugly, but in my testing it doesn’t seem to cause any harm.
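
To see why the multipliers in the configuration above change the outcome for Accept: */*, here is a rough Python sketch of the arithmetic. It is not Apache's implementation. It models only step 1 (client q value multiplied by the server's qs) and uses file size as a stand-in for the step 8 tie-break, treating every step in between as a tie. It also ignores Apache's special case of lowering the q value of wildcards when the Accept header contains no q factors at all, which does not change the result here because every variant is affected equally. The function names and the file sizes are made up for illustration.

```python
def parse_accept(header):
    """Parse an Accept header into a list of (media_range, q) pairs."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        q = 1.0  # q defaults to 1.0 when the client omits it
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        ranges.append((parts[0], q))
    return ranges


def client_q(media_type, ranges):
    """Return the q value of the most specific media range matching media_type."""
    main_type = media_type.split("/")[0]
    best_specificity, best_q = -1, 0.0
    for media_range, q in ranges:
        if media_range == media_type:
            specificity = 2      # exact match, e.g. text/html
        elif media_range == main_type + "/*":
            specificity = 1      # partial wildcard, e.g. text/*
        elif media_range == "*/*":
            specificity = 0      # full wildcard
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def pick_variant(accept_header, variants):
    """Pick a media type from variants, a list of (media_type, qs, size) tuples."""
    ranges = parse_accept(accept_header)
    # Highest q * qs wins; the negated size makes the smaller file win a tie.
    best = max(variants, key=lambda v: (client_q(v[0], ranges) * v[1], -v[2]))
    return best[0]


# Hypothetical variants of one resource: (media type, qs, size in bytes).
without_qs = [
    ("text/html", 1.0, 5200),
    ("application/atom+xml", 1.0, 7400),
    ("text/plain", 1.0, 1800),
]
with_qs = [
    ("text/html", 1.0, 5200),
    ("application/atom+xml", 0.5, 7400),
    ("text/plain", 0.1, 1800),
]

browser = "application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,*/*;q=0.5"

print(pick_variant("*/*", without_qs))  # text/plain (all scores tie, smallest wins)
print(pick_variant("*/*", with_qs))     # text/html
print(pick_variant(browser, with_qs))   # text/html
```

With equal qs values, a client sending only */* scores every variant the same, so the smallest file wins. Once the qs values differ, HTML comes out on top in step 1 and the later tie-breaks never get a say.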