
How FeedBurner Rewrites Links

I've been meaning to write this post for a while ... it probably belongs somewhere on the FeedBurner site, but I'll just put it here for now. I wanted to document exactly how and when FeedBurner will rewrite <link> elements in feeds.

The first thing you might be thinking is "Why is FeedBurner rewriting links?" Well, one of the services that we offer to publishers is "clickthrough tracking", where we report the number of clickthroughs on a per-item basis. To do this, we rewrite the URL associated with the item, and when it's clicked, we log the click and then forward the reader on to the original item URL.

The second thing you need to know is that we never rewrite the value of the <guid> element in an RSS 2.0 feed, although we may switch the isPermaLink attribute from "true" to "false" in some cases. The guid element is used in many cases to uniquely identify a feed item, and we don't want to mess with that.

Last thing to know is that if we ever rewrite the link to an item, we'll stick the original value of the link in a <feedburner:origLink> element, which is useful for some search engines and other consumers of the feed to reconcile items that may be syndicated into multiple endpoints.
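To make the log-and-forward idea concrete, here is a minimal sketch in Python. It is not FeedBurner's actual implementation: the TRACKING_PREFIX endpoint, the URL scheme, and the function names are all hypothetical, chosen only to illustrate the flow.

```python
from collections import Counter
from urllib.parse import quote, unquote

# Hypothetical tracking endpoint; the real rewritten-URL scheme is not
# described in this post.
TRACKING_PREFIX = "https://tracker.example/click?to="

# Per-item clickthrough counts, keyed by the original item URL.
click_counts = Counter()


def rewrite_url(original_url):
    """Return a tracking URL that carries the original URL as a query value."""
    return TRACKING_PREFIX + quote(original_url, safe="")


def handle_click(tracking_url):
    """Log the click, then return the original URL to redirect the reader to."""
    original_url = unquote(tracking_url[len(TRACKING_PREFIX):])
    click_counts[original_url] += 1
    return original_url
```

In this sketch the value returned by rewrite_url would go into the item's <link>, while the untouched original would be kept alongside it in <feedburner:origLink>.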

If you want more details on exactly how the rewriting works, they're in the extended entry below.

Here is how rewriting works with the different formats. I'm pretty sure this is how we're handling each case, but this is from an older document and I should probably go in and check, so it's possible there are minor errors. Also, this doesn't show the origLink element that will be added.

RSS 0.91, 0.92, 0.93:

S.O.L. ... have to rewrite the <link> element, and no namespace extensions are allowed.  Thankfully, use of these formats is declining.

RSS 1.0

Preserve rdf:about as the item unique id

Case 1
Source feed:
<item rdf:about="URL1">
  <link>URL1</link>
</item>
Burned feed:
<item rdf:about="URL1">
  <link>Rewrite-URL1</link>
</item>

Case 2
Source feed:
<item rdf:about="URL1">
  (no link element)
</item>
Burned feed:
<item rdf:about="URL1">
  <link>Rewrite-URL1</link>
</item>

Case 3
Source feed:
<item rdf:about="URI">
  <link>URL1</link>
</item>
Burned feed:
<item rdf:about="URI">
  <link>Rewrite-URL1</link>
</item>

Some notes:
  • Clients should never use rdf:about as a link destination (except maybe in the case when no <link> is provided), as it is only guaranteed to be a URI, not a URL.  Bloglines violates this principle by putting the URI in the href for the item title, even when it's not a valid URL.
  • Obviously, the value for the URI is also used in the <items> element earlier in the feed as the attribute value for rdf:resource.
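The three RSS 1.0 cases above can be sketched with Python's standard xml.etree.ElementTree. This is only an illustration of the rules as listed, not the production code: tracked() is a hypothetical stand-in for whatever produces the real rewritten URL, and the origLink element is left out, as in the cases above.

```python
import xml.etree.ElementTree as ET

RSS = "http://purl.org/rss/1.0/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def tracked(url):
    # Hypothetical placeholder for the real clickthrough URL.
    return "Rewrite-" + url


def burn_rss10_item(item):
    """Rewrite (or add) the item's <link>; rdf:about is never modified."""
    link = item.find(f"{{{RSS}}}link")
    if link is None:
        # Case 2: no <link>, so the rdf:about value serves as the destination.
        link = ET.SubElement(item, f"{{{RSS}}}link")
        link.text = item.get(f"{{{RDF}}}about")
    # Cases 1 and 3: rewrite the existing link, whatever rdf:about holds.
    link.text = tracked((link.text or "").strip())
    return item
```

Note that the sketch treats rdf:about as a URL in case 2 without checking; what should happen when it is a non-URL URI and no <link> exists is not covered by the cases above.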

RSS 2.0

Preserve guid as the item unique id, but not necessarily the isPermaLink attribute

Case 1
Source feed:
<item>
  <link>URL1</link>
  <guid isPermaLink="false">ID</guid>
</item>
Burned feed:
<item>
  <link>Rewrite-URL1</link>
  <guid isPermaLink="false">ID</guid>
</item>

Case 2
Source feed:
<item>
  <guid isPermaLink="true">URL1</guid>
</item>
Burned feed:
<item>
  <link>Rewrite-URL1</link>
  <guid isPermaLink="false">URL1</guid>
</item>

Case 3
Source feed:
<item>
  <link>URL1</link>
</item>
Burned feed:
<item>
  <link>Rewrite-URL1</link>
  <guid isPermaLink="false">URL1</guid>
</item>

Case 4
Source feed:
<item>
  <link>URL1</link>
  <guid isPermaLink="true">URL1</guid>
</item>
Burned feed:
<item>
  <link>Rewrite-URL1</link>
  <guid isPermaLink="false">URL1</guid>
</item>

Case 5
Source feed:
<item>
  <link>URL1</link>
  <guid isPermaLink="true">URL2</guid>
</item>
Burned feed:
<item>
  <link>Rewrite-URL1</link>
  <guid isPermaLink="false">URL2</guid>
</item>

Some notes:
  • Case #5 is ambiguous because it provides both an item/link and an item/guid[@isPermaLink!="false"], and they refer to different URLs.  We disambiguate by treating the link as a link and the guid as a non-permalinked uid.
  • Clients and aggregators should use the guid as the item's id, but if it is not present they should use the link.
  • In cases where neither an item/link nor an item/guid[@isPermaLink!="false"] is present, there's nothing to rewrite, so the item should pass through unchanged.
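The five RSS 2.0 cases and the pass-through rule can be folded into one small function. The sketch below uses Python's xml.etree.ElementTree and is only my reading of the cases as listed: tracked() is a hypothetical stand-in for the real rewritten URL, the origLink element is omitted, and a missing isPermaLink attribute is treated as "true", which is the RSS 2.0 default.

```python
import xml.etree.ElementTree as ET


def tracked(url):
    # Hypothetical placeholder for the real clickthrough URL.
    return "Rewrite-" + url


def burn_rss20_item(item):
    """Rewrite <link> and keep the guid text; only isPermaLink may change."""
    link = item.find("link")
    guid = item.find("guid")
    # A guid acts as a permalink unless isPermaLink is explicitly "false".
    guid_is_permalink = (
        guid is not None and guid.get("isPermaLink", "true") != "false"
    )

    if link is not None:
        url = (link.text or "").strip()      # cases 1, 3, 4, 5
    elif guid_is_permalink:
        url = (guid.text or "").strip()      # case 2
    else:
        return item                          # nothing to rewrite

    if guid is None:
        # Case 3: preserve the original URL as the item's id.
        guid = ET.SubElement(item, "guid")
        guid.text = url
    if link is None:
        # Case 2: the permalink guid supplies the destination.
        link = ET.Element("link")
        item.insert(0, link)

    link.text = tracked(url)
    # The guid no longer points at the rewritten link, so it is not a permalink.
    guid.set("isPermaLink", "false")
    return item
```

The guid text is never assigned after the item is parsed, except when a guid is created from the link in case 3, which mirrors the "preserve guid as the item unique id" rule.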

Atom

Preserve id as the id for the item

Case 1
Source feed:
<entry>
  <link rel="alternate" type="text/html" href="URL1"/>
  <id>ID</id>
</entry>
Burned feed:
<entry>
  <link rel="alternate" type="text/html" href="Rewrite-URL1"/>
  <id>ID</id>
</entry>

Not much to say here.  In cases where there might be multiple entry/link[@rel="alternate"] elements, we will prefer to rewrite entry/link[@rel="alternate" and @type="text/html"]/@href or, if not found, the first entry/link[@rel="alternate"].
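The preference order for Atom links can be sketched the same way. Again this is illustrative only: tracked() is a hypothetical stand-in for the real rewritten URL, and the ATOM constant assumes the Atom 1.0 namespace, so it would need changing for feeds that use an older Atom namespace.

```python
import xml.etree.ElementTree as ET

# Assumption: Atom 1.0 namespace; older Atom drafts used a different one.
ATOM = "http://www.w3.org/2005/Atom"


def tracked(url):
    # Hypothetical placeholder for the real clickthrough URL.
    return "Rewrite-" + url


def pick_link_to_rewrite(entry):
    """Prefer the text/html alternate link, else the first alternate link."""
    alternates = [
        link for link in entry.findall(f"{{{ATOM}}}link")
        if link.get("rel") == "alternate"
    ]
    for link in alternates:
        if link.get("type") == "text/html":
            return link
    return alternates[0] if alternates else None


def burn_atom_entry(entry):
    """Rewrite the chosen link's href; <id> and other links are untouched."""
    link = pick_link_to_rewrite(entry)
    if link is not None:
        link.set("href", tracked(link.get("href", "")))
    return entry
```

An entry with no alternate link at all simply passes through unchanged in this sketch.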
