Fun with User-Agents: Firefox and IE7
One of the key parts of FeedBurner's stats processing is trying to determine if a request for a feed represents a casual, drive-by browse or an intentioned subscriber. This has gotten more complicated lately, as some clients serve double duty as both a browser and a feed reader. I wanted to share with you how we handle the requests from the two most popular browsers, Firefox and Internet Explorer 7.
The Firefox browser can also be used as a feed-reading client with the Live Bookmarks feature. So, the key for FeedBurner is to determine whether a request for a feed is coming from Live Bookmarks (where we can count it as a subscriber) or from a visitor who just happened to click on the feed chicklet (where we just report it as a browser hit). Up until Firefox version 220.127.116.11, we really have to guess, since the requests for the most part look identical: they both have a User-Agent that looks like Mozilla/.*Firefox/.*. So we look at a couple of other headers, X-Moz and Referer, and the logic tree looks like this:
If version < 18.104.22.168 and (X-Moz: prefetch or Referer is not empty), then it's a browser.
Otherwise, it's a Live Bookmarks request.
That's not ideal, because if someone just types the feed URL into the location bar or launches the feed URL from a different app, we'll count it as a Live Bookmarks hit because the Referer will be empty. But we have nothing else to hang onto.
Firefox 22.214.171.124 has a wonderful new addition that makes this tracking much more accurate. Now, if the request is coming from Live Bookmarks, there will be an X-Moz: livebookmarks header. We can detect that and we don't have to do the referrer guessing game.
If version >= 126.96.36.199 and X-Moz: livebookmarks, then it's a Live Bookmarks request.
Otherwise, it's a browser request.
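The two Firefox rules can be sketched in a few lines of Python. This is only an illustration of the logic tree described above, not FeedBurner's actual code: the function names, the return labels, and the plain-dict representation of request headers are all assumptions. The version at which the X-Moz: livebookmarks header appears is passed in as a parameter rather than hard-coded, and the value in the usage lines is a placeholder.

```python
import re
from typing import Mapping, Optional, Tuple

# The post says the User-Agent "looks like Mozilla/.*Firefox/.*";
# the capture group for the version is added for this sketch.
FIREFOX_UA = re.compile(r"Mozilla/.*Firefox/(\S+)")


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '1.2.3' into (1, 2, 3).

    Parsing stops at the first piece that does not start with a digit.
    """
    parts = []
    for piece in text.split("."):
        digits = re.match(r"\d+", piece)
        if digits is None:
            break
        parts.append(int(digits.group()))
    return tuple(parts)


def classify_firefox(
    headers: Mapping[str, str],
    livebookmarks_header_version: Tuple[int, ...],
) -> Optional[str]:
    """Return 'live_bookmarks', 'browser', or None if this is not Firefox.

    `headers` is assumed to be a dict keyed by canonical header names.
    `livebookmarks_header_version` is the first Firefox version that
    sends "X-Moz: livebookmarks" (supplied by the caller).
    """
    match = FIREFOX_UA.match(headers.get("User-Agent", ""))
    if match is None:
        return None

    version = parse_version(match.group(1))
    x_moz = headers.get("X-Moz", "").strip().lower()
    referer = headers.get("Referer", "").strip()

    if version >= livebookmarks_header_version:
        # Newer Firefox: trust the explicit header, no guessing needed.
        return "live_bookmarks" if x_moz == "livebookmarks" else "browser"

    # Older Firefox: a prefetch or any referrer suggests a browser hit.
    if x_moz == "prefetch" or referer:
        return "browser"
    return "live_bookmarks"


# Usage: (2, 0) is a placeholder threshold chosen only for this example.
threshold = (2, 0)
old_firefox = {"User-Agent": "Mozilla/5.0 (Windows; U) Gecko/2006 Firefox/1.5.0.4"}
print(classify_firefox(old_firefox, threshold))
```

Keeping the threshold as a parameter also makes it easy to test both branches without tying the code to one release number.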
Internet Explorer 7
The latest version of Internet Explorer adds feed reading capabilities by leveraging the Windows RSS Platform. So, on the surface, things seem really straight-forward, since the Windows RSS Platform has its own User-Agent that's distinct from the IE7 User-Agent.
If User-Agent matches Windows[- ]RSS[- ]Platform/\S+ .*, then it's a "Windows RSS Platform" subscription.
At this point, however, things get complicated. Outlook 2007 has a cool feed reading capability. Unfortunately, the Microsoft Office team didn't get the memo: Outlook identifies itself the same as IE7 instead of leveraging the Windows RSS Platform, which would have made much more sense. So how do we distinguish between IE7 browser hits and Outlook 2007 subscriptions? We use the old referrer trick: if there's no referrer, assume it came from the automated poller fueling Outlook.
If User-Agent matches Mozilla/4\.0 \(compatible; MSIE 7.* and Referer is empty, then it's an Outlook 2007 subscription.
But wait ... there's more! It turns out that some Microsoft Vista Gadgets also identify themselves as IE7, and we think it's more appropriate to treat those requests as subscriptions rather than browser hits. Fortunately, there's a hook: we can look at the Referer, and if it starts with x-gadget:///, then the request is coming from a Gadget.
If User-Agent matches Mozilla/4\.0 \(compatible; MSIE 7.* and Referer starts with x-gadget:///, then it's a Vista Gadget subscription.
Finally, if none of the other rules match, we treat it as an IE7 browser hit.
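Put together, the IE7 rules form a short chain of checks. The sketch below uses the two regular expressions quoted above; everything else (the function name, the return labels, and the plain-dict headers) is an assumption made for illustration rather than a description of FeedBurner's implementation.

```python
import re
from typing import Mapping, Optional

# Patterns as quoted in the rules above.
WINDOWS_RSS_PLATFORM_UA = re.compile(r"Windows[- ]RSS[- ]Platform/\S+ .*")
IE7_UA = re.compile(r"Mozilla/4\.0 \(compatible; MSIE 7.*")


def classify_ie7(headers: Mapping[str, str]) -> Optional[str]:
    """Classify a request that may come from the IE7 family of clients.

    `headers` is assumed to be a dict keyed by canonical header names.
    Returns None when neither User-Agent pattern applies.
    """
    user_agent = headers.get("User-Agent", "")
    referer = headers.get("Referer", "").strip()

    # The Windows RSS Platform has its own distinct User-Agent.
    if WINDOWS_RSS_PLATFORM_UA.match(user_agent):
        return "windows_rss_platform"

    if IE7_UA.match(user_agent):
        if not referer:
            # No referrer: assume the automated poller behind Outlook 2007.
            return "outlook_2007"
        if referer.startswith("x-gadget:///"):
            return "vista_gadget"
        # None of the subscription rules matched.
        return "ie7_browser"

    return None


# Usage: an IE7 User-Agent arriving with an ordinary web referrer.
request = {
    "User-Agent": "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)",
    "Referer": "http://example.com/blog",
}
print(classify_ie7(request))
```

Note that the empty-referrer check has the same weakness as the older Firefox rule: a feed URL typed directly into the IE7 address bar would be counted as an Outlook 2007 subscription.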
So, those are the kinds of decisions that we make when evaluating each of the over 300 million feed requests we get each day. We're constantly reviewing the list of User-Agents we get in those requests in an effort to make these stats as accurate as they can be. What really makes our lives easier is when we can definitively discern through request headers whether the request is for an intentioned subscription vs. "other". With developments like a distinct User-Agent header for the Windows RSS Platform and the new X-Moz header, we're getting closer!