MSNbot QBHP referrers
2008-07-20 traffic

When I Wikiized the site and started indexing the Wiki changes, I naturally also wanted to start looking at incoming traffic and referrers, as you can see on the "recent" page linked from the main menu. And of course I then started refining it to suit my tastes.

I already had a "preproc.pl" script to preprocess the logs and give me only the hits I want to see. It screens out spiders, everything I myself do from home, and (lately) any IP that posts spam to the forum or Wiki. The remainder is proving pretty interesting.
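The kind of screening described above could be sketched roughly as follows. This is a Python illustration, not the actual preproc.pl (which is Perl and is not reproduced here). The function name `keep_hit`, the list of spider markers, and the example IP addresses are all made up for illustration.

```python
# Hypothetical substrings that commonly appear in crawler user-agent strings.
SPIDER_MARKERS = ("bot", "spider", "crawler", "slurp")

# Placeholder addresses from the documentation ranges; real values would differ.
HOME_IPS = {"203.0.113.7"}      # assumed: the site owner's own address
SPAM_IPS = {"198.51.100.23"}    # assumed: IPs seen posting forum or Wiki spam


def keep_hit(ip, agent):
    """Return True if a log hit should survive preprocessing."""
    agent = agent.lower()
    if any(marker in agent for marker in SPIDER_MARKERS):
        return False  # looks like a search engine spider
    if ip in HOME_IPS:
        return False  # the owner's own browsing
    if ip in SPAM_IPS:
        return False  # known spam poster
    return True
```

Matching on the agent string only works for spiders that identify themselves, which is exactly the assumption that breaks down in the case described next.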

Normally, one can filter out search engine spiders based on their user agent. But Microsoft, as always, follows its own rules (a little research on "search.live.com" and "QBHP" will show you plenty of griping). They send a normal IE agent string, and instead mark these requests with a "form" parameter in the search query URL that shows up as the referrer.

And you know, normally I wouldn't care. But their search queries are weird. They consist of a single word, usually (but not always) one found on the page, and if you're actually paying attention to search queries to determine what it is about your site people find interesting, these won't help.

So now my preproc script blocks everything from the 65.55.*.* block with "form=QBHP" in the referrer. You just have to wonder what Microsoft is thinking, sometimes.
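As a rough illustration of that rule, here is a minimal Python sketch. It is not the code in preproc.pl, which is not shown. It assumes that "65.55.*.*" means the whole 65.55.0.0/16 range, and it matches the "form" parameter case-insensitively, which is an extra assumption. The name `is_qbhp_hit` is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit, parse_qsl

# Assumption: "65.55.*.*" is treated as the full 65.55.0.0/16 network.
MSN_BLOCK = ipaddress.ip_network("65.55.0.0/16")


def is_qbhp_hit(client_ip, referrer):
    """Return True if a hit comes from the 65.55 block with form=QBHP in its referrer."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # not a parseable IP address, so the rule does not apply
    if addr not in MSN_BLOCK:
        return False
    # Look for a form=QBHP pair in the referrer's query string.
    for key, value in parse_qsl(urlsplit(referrer).query):
        if key.lower() == "form" and value.upper() == "QBHP":
            return True
    return False
```

Requiring both conditions keeps the rule narrow: ordinary visitors who arrive from a search.live.com results page at some other address are still counted, as are other hits from the 65.55 range that lack the marker.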

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.