Dancing About Architecture: November 2003 Archives

Photos + Location

There's a neat convergence of location-based information and digital photography that's been talked about for a little while, with sites like GeoSnapper and a Japanese prototype called T'o'rip Space (whatever that means).

Came across this website World-Wide Media eXchange: WWMX and it looks like it's a really nice implementation of associating location information with an image. Once camera phones have location capabilities nicely integrated into the platform, that would seamlessly add a cool dimension to moblogging.

Source: World-Wide Media eXchange: WWMX

Posted by Eric at 11:20 AM | Permalink | Comments (0)

RSS 1.0 vs. RSS 2.0 vs. Atom

I've been asked a few times (no really, I have!) where I stand with the whole RSS-RDF-Atom religious war. There's a very contentious history there, with some strong personalities that have made the domain of content syndication more polarized than perhaps it could have been. I'd really like to stay above the politics and look at the different specs from a technologist's viewpoint... so that's what I'm going to try to do here.

(Okay, if you've discovered RSS recently and want to know what I'm talking about, here are a few links for you, but I'm warning you now it's not pretty: History of the RSS fork, Why Choose RSS 1.0?, and Motivation for Atom.)

The three different specs I'm talking about are RSS 1.0 (which is RDF-based), RSS 2.0 (which is the evolution of RSS 0.91 and RSS 0.92), and Atom (which has also been referred to as "Echo" and is kind of a remix of RSS 2.0).

When we talk about content syndication, there are a few different aspects we need to talk about. The most interesting facet to me is the "feed format", which is really the schema of the XML that represents the channel and items within that channel. So let's get one thing straight right away: these formats are way more alike than they are different.

If I want to express a feed using these three different formats, it's not that tough to do the mapping. In fact, except for some date format goofiness, you could almost get away with using XSLT to do the transcoding. They all deal with representing a feed (called a <channel> in RSS 1.0 and RSS 2.0, a <feed> in Atom) of content items (called an <item> in RSS 1.0 and RSS 2.0, an <entry> in Atom) with certain (and slightly different) attributes of the feed and items being considered "core". There are no show stoppers translating from one feed to another, just annoyances, so what's <issued>2003-11-14T13:46:03-06:00</issued> in Atom is <dc:date>2003-11-14T13:46:03-06:00</dc:date> in RSS 1.0 and <pubDate>Fri, 14 Nov 2003 13:46:03 CST</pubDate> in RSS 2.0. We can deal with that.
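The date mismatch is the one part of that mapping that plain XSLT doesn't handle gracefully, so here is a minimal sketch of it, assuming Python's standard library. The function names are made up for illustration and aren't part of any of the three specs.

```python
from datetime import datetime
from email.utils import format_datetime, parsedate_to_datetime


def rfc822_to_iso8601(pub_date: str) -> str:
    """Convert an RSS 2.0 <pubDate> value to the ISO 8601 form used by
    <dc:date> in RSS 1.0 and <issued> in Atom."""
    return parsedate_to_datetime(pub_date).isoformat()


def iso8601_to_rfc822(iso_date: str) -> str:
    """Convert an ISO 8601 timestamp (with a numeric offset) to RFC 822."""
    return format_datetime(datetime.fromisoformat(iso_date))


print(rfc822_to_iso8601("Fri, 14 Nov 2003 13:46:03 -0600"))
print(iso8601_to_rfc822("2003-11-14T13:46:03-06:00"))
```

The sketch sticks to numeric offsets like -0600 on the RSS 2.0 side, since they map directly onto the -06:00 form the other two formats use.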

It's possible to extend all three formats in a well-defined manner using what are called "modules" or "namespaces" (let's just call them "modules" that use "XML namespaces"). A properly defined module can be referenced in all three formats. This is the important part! So, for example, while RSS 1.0 relies on using the Dublin Core to represent much of the metadata associated with an item, there's no reason that an RSS 2.0 or Atom feed can't reference the Dublin Core module and use the exact same elements. This is key for evolving the content models and allows us to think about the content without caring as much about the "feed format envelope" they'll be delivered in.
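To make that concrete, here's a rough sketch of an RSS 2.0 item that borrows two Dublin Core elements, built with Python's xml.etree.ElementTree. The item values (title, link, author) are placeholders; the only real identifier is the Dublin Core namespace URI.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set
ET.register_namespace("dc", DC_NS)


def build_item(title: str, link: str, creator: str, date: str) -> ET.Element:
    """Build an RSS 2.0 <item> that carries Dublin Core metadata."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    # The same two elements an RSS 1.0 feed would use, in the same namespace.
    ET.SubElement(item, f"{{{DC_NS}}}creator").text = creator
    ET.SubElement(item, f"{{{DC_NS}}}date").text = date
    return item


# Placeholder values, for illustration only.
item = build_item("Hello", "http://example.org/hello",
                  "Example Author", "2003-11-14T13:46:03-06:00")
print(ET.tostring(item, encoding="unicode"))

# A consumer finds the element by namespace URI, not by prefix.
print(item.findtext(f"{{{DC_NS}}}date"))
```

Because the lookup keys on the namespace URI, the same reading code works whether the element shows up inside an RSS 1.0, RSS 2.0, or Atom envelope.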

That being said, I prefer different feed formats for different reasons. I like RSS 2.0 because it's simple, but don't like the date format it uses (can't represent it with an xsd:dateTime in XML Schema) or the looseness of the specification. I like Atom's model for representing content, but don't like the fact it's very very early in the adoption curve. I like RSS 1.0 for its formality, but don't like it for its formality -- okay what do I mean by that? I mean RDF is great, but quite frankly you gotta be able to take some big bites to do it right. I think it's just too much for this domain, and I really kind of think of RSS 2.0 as being syntactic sugar on an underlying similar model. I know, I know, that's not totally accurate, and turning your back on RDF means not being able to take advantage of monumental effort and thinking that's already been put into it by very smart people. But you know what? It's too much. And I like RDF -- back when we did DKA in 1998, we chose RDF as the underlying content model for our proto-blog product. It's just that RSS 2.0, with its ability to extend via XML namespaces, is good enough for where we need to go, warts and all (IMHO).

So the key to evolving content syndication forward seems to be in the modules. Great. I think the current pseudo-standard template for documenting modules is fine (see examples for a content module and an admin module), but there are plenty of examples out there that don't follow this format (like this photo module, this review module, and this blogChannel module).

To tell you the truth, I don't really care about what the documentation for these modules looks like, but could we please all just agree to define and reference an XML Schema for these modules?!? I don't know what you think about XML Schema, but I personally like it a whole lot better than DTDs and I really like how other people have done the hard work in creating schema validators for me. Plus tools like XMLSpy are really smart about helping you to create documents that conform to an XML Schema. We can then even use the xsi:schemaLocation attribute when we declare these namespaces in the XML document to really automate the validation. I've got some examples of how to do this correctly I'll post someday soon.
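In the meantime, here's a rough sketch of what pairing a module namespace with its schema might look like, again with Python's xml.etree.ElementTree. The photo namespace, the schema URL, and the caption element are all hypothetical stand-ins; only the xsi namespace URI and the schemaLocation attribute come from XML Schema itself.

```python
import xml.etree.ElementTree as ET
from typing import Dict

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
# Hypothetical module namespace and schema URL, for illustration only.
PHOTO_NS = "http://example.org/rss/modules/photo/1.0/"
PHOTO_XSD = "http://example.org/rss/modules/photo/1.0/photo.xsd"

ET.register_namespace("xsi", XSI_NS)
ET.register_namespace("photo", PHOTO_NS)


def build_feed_root() -> ET.Element:
    """Build an RSS 2.0 skeleton that declares where the module schema lives."""
    rss = ET.Element("rss", {"version": "2.0"})
    # schemaLocation holds whitespace-separated pairs:
    # a namespace URI followed by the location of its schema.
    rss.set(f"{{{XSI_NS}}}schemaLocation", f"{PHOTO_NS} {PHOTO_XSD}")
    channel = ET.SubElement(rss, "channel")
    item = ET.SubElement(channel, "item")
    # Hypothetical module element.
    ET.SubElement(item, f"{{{PHOTO_NS}}}caption").text = "Sunset"
    return rss


def schema_hints(root: ET.Element) -> Dict[str, str]:
    """Return {namespace URI: schema URL} from the root's schemaLocation."""
    tokens = root.get(f"{{{XSI_NS}}}schemaLocation", "").split()
    return dict(zip(tokens[0::2], tokens[1::2]))


root = build_feed_root()
print(ET.tostring(root, encoding="unicode"))
print(schema_hints(root))
```

The standard library doesn't validate against XML Schema, so this only shows how a document advertises its schemas and how a client could discover them; the actual validation would be handed to a schema-aware validator.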

As long as we're on the topic of modules and schemas, another thing you module writers really need to do is version your schemas from the very beginning. You should be able to evolve your schema over time, but when you do, the URI for your module (and its corresponding XML Schema) needs to change, and the old one needs to stay available. A lot of module writers are good about this, but many are not.

Okay, one more thing to say. In addition to the actual format of the feed, another dimension that we haven't talked about yet is the API for moving this content around. Now, if you're just a consumer of feeds in your personal aggregator or client, this statement might be confusing. "Don't I just have to do an HTTP GET request (hopefully with an If-Modified-Since header) to the URL that has the feed?" The answer is yes, that's all there is to it on the consumer side. On the publisher's side, however, we find there are several APIs for managing the content that will eventually be pushed out via the feed. So while this isn't directly related to the topic of feed formats, let me just say that I really like how the Atom project is thinking about this. The article The Atom API by Mark Pilgrim does a great job of laying out why we need to start thinking of an item (or, in Atom's case, an "entry") outside of the context of a feed. It also makes a good case for keeping things as simple as they can be: in this case, using the basic HTTP methods to manage content. I like this way better than SOAP or XML-RPC or any of the other APIs mentioned in the article. Don't get me started on SOAP.
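For the consumer side mentioned above, a polite fetch might look something like the sketch below, using Python's urllib. The function names are mine, and it assumes the server sends a Last-Modified header that the client hands back on its next poll.

```python
import urllib.error
import urllib.request
from typing import Optional, Tuple


def build_conditional_request(url: str,
                              last_modified: Optional[str]) -> urllib.request.Request:
    """Build a GET request that includes If-Modified-Since when we have a
    Last-Modified value from an earlier fetch."""
    request = urllib.request.Request(url)
    if last_modified:
        request.add_header("If-Modified-Since", last_modified)
    return request


def fetch_feed(url: str,
               last_modified: Optional[str] = None) -> Tuple[Optional[bytes], Optional[str]]:
    """Return (body, last_modified). The body is None when the server
    answers 304 Not Modified, meaning the cached copy is still good."""
    request = build_conditional_request(url, last_modified)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.read(), response.headers.get("Last-Modified")
    except urllib.error.HTTPError as err:
        if err.code == 304:  # urllib reports 304 as an HTTPError
            return None, last_modified
        raise
```

A client would store the returned Last-Modified value next to its cached copy and pass it back in on the next poll, so an unchanged feed costs the publisher a header exchange instead of the whole document.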

So, that's the long answer to what I think about RSS 1.0 vs. RSS 2.0 vs. Atom. The short answer is: I don't really care all that much as long as we can extend them all in the same, well-defined manner. Oh, you just wanted the short answer? Sorry about that.

Posted by Eric at 12:16 AM | Permalink | Comments (2) | TrackBacks (6)

Kill Bill: Vol. 1

Oh wow, it's a lot of fun to watch a Quentin Tarantino movie when he's at the top of his game like he is here. I've read and heard quite a few reviews/opinions about this movie, and I can understand why it's a love-it-or-hate-it kind of thing: it all depends on your reaction to the violence and gore, which is the polarizing factor with this film. I chose to put on my "comic book violence"-colored glasses and I thoroughly enjoyed the experience.

There are two things that QT did so right with both Pulp Fiction and Kill Bill: the use of music to create the perfect atmosphere, and his mastery of a non-linear presentation of the storyline. First, the music: QT has an uncanny ability to bring a smile to your face (or make you cringe in the case of Reservoir Dogs) simply by the music he chooses. I don't know how he does it, but it's always *just* *right*. From the opening credits of Pulp Fiction (with the changing of the radio stations) to the Nancy Sinatra tune and Mexi-Japanese songs in Kill Bill, it helps to create a memorable experience. Second, the non-linear storyline. While it's not used quite as effectively as in Pulp Fiction, it's still effortless and entertaining. I mean really, there's not much of a plot here, so it adds some interest, but the story is kind of beside the point. It's a tribute to the genre with the QT spin on it. Can't wait for Vol. 2.

Kill Bill: Vol. 1 (****)

Posted by Eric at 08:11 PM | Permalink | Comments (1)

The First $20 Million Is Always the Hardest

I enjoy the writing of Po Bronson, with Bombardiers being right up there with Catch-22 as one of the funniest books I have ever read. I remember enjoying The First $20 Million is Always the Hardest: A Novel as well, although I think I keep confusing it in my head with Microserfs by Douglas Coupland. When I saw this movie was playing on Showtime, I had no idea that it was made into a movie. I guess it was only released in L.A. and New York in 2002. Well, it's certainly a cheeseball interpretation of the book, with sub-par writing by Jon Favreau, who adapted the screenplay. But, being a movie about the dot-com generation, I can't resist. It's not horrible, and having Rosario Dawson cast as the female lead certainly doesn't suck.

The First $20 Million Is Always the Hardest (**1/2)

Posted by Eric at 11:31 PM | Permalink | Comments (0)

The Nine Lives of Fritz the Cat

WTF? This is one messed up movie. I guess I should have expected that with anything remotely related to Robert Crumb, but this movie redefines the boundaries of tastelessness for me. I was sometimes incredulous, often disgusted, and never amused. The only value to this movie is as a historical record (made in 1974) of a sick and twisted mind. I like subversive material as much as the next guy, but come on!

Nine Lives of Fritz the Cat (*)

Posted by Eric at 09:17 PM | Permalink | Comments (0)

RSS in the Seattle Times

Nice piece here that shows that RSS is starting to peek into the mainstream. It's kind of funny that he alludes to Pointcast as a true "push" technology -- it really wasn't any more pushy than current RSS clients, and in fact polled just like the RSS clients do. The problem was with its popularity, the inability to customize what was being retrieved, and the general network capacities at the time. If naive RSS clients take off the way Pointcast did, we could have a lot of the same problems if we don't start thinking about millions of clients polling millions of feeds every five minutes.

Sorry, got off track there. This article still serves as a nice introduction to feeds and explains their benefits with respect to email distribution.

Source: The Seattle Times: Reeling in what you want from the Web

Posted by Eric at 01:46 PM | Permalink | Comments (3)

My Microfeeds

Well, the entry the other day about microfeeds got me thinking that I should be eating my own dog food. If they're such a good idea, why don't I have any? So, I've done a little feed maintenance and added an RSS 2.0 "Microfeed" to each entry. This microfeed contains the comments and trackbacks made on that entry, so if there's some entry you want to keep track of you can go ahead and subscribe to its microfeed.

As I mentioned in the post the other day, if you're using a traditional 3-pane RSS reader you'll probably find all of these microfeeds more trouble than they're worth. But who knows what other clients might show up that can deal with these in an almost transparent fashion. I think what I'd like to do is define an extension (in its own namespace, of course) that allows a microfeed to point to its parent feed, which could be a valuable hint for a client that knew how to deal with a parent-child relationship between feeds. I'm assuming this doesn't exist right now -- please correct me if I'm wrong.
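Just to make the idea concrete, here's a rough sketch of what such an extension could look like, in Python with xml.etree.ElementTree. Everything about the module is hypothetical: the namespace URI, the mf prefix, and the parent element are invented for illustration and don't refer to any existing module.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical namespace and element name, invented for this sketch.
MF_NS = "http://example.org/rss/modules/microfeed/1.0/"
ET.register_namespace("mf", MF_NS)


def build_microfeed(title: str, link: str, parent_feed_url: str) -> ET.Element:
    """Build an RSS 2.0 skeleton whose channel points back at its parent feed."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Comments and trackbacks"
    # The hint for clients: the URL of the feed this microfeed belongs to.
    ET.SubElement(channel, f"{{{MF_NS}}}parent").text = parent_feed_url
    return rss


def parent_of(rss: ET.Element) -> Optional[str]:
    """Return the parent feed URL, or None for an ordinary feed."""
    return rss.findtext(f"channel/{{{MF_NS}}}parent")


# Placeholder URLs, for illustration only.
feed = build_microfeed("Comments on: My Microfeeds",
                       "http://example.org/archives/000123.html",
                       "http://example.org/index.xml")
print(parent_of(feed))
```

A client that doesn't know the namespace would simply ignore the extra element, which is the whole appeal of doing this as a module.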

I also created a blog-wide comments feed in case you're interested.

Posted by Eric at 07:16 PM | Permalink | Comments (1) | TrackBacks (1)

Darkness Falls

Ugh, what a piece of crap. A pitifully acted story about a killer tooth fairy that has to stay out of the light. Touching. I didn't even make it to the end of the movie. The opening scene was the only thing halfway decent: it was genuinely scary and suspenseful, but it sure took a nosedive after that.

Darkness Falls (*)

Posted by Eric at 10:34 PM | Permalink | Comments (0)

Microfeeds

So there's an example or two of companies using RSS for press releases: Dick talks about it here and Jason talks about it here. I think the big idea here is that we can transform a static piece of information into an ongoing conversation with potential customers. By representing the information as a syndication feed, it becomes a "living press release". Potential customers can "subscribe" to that press release and the company can issue news and updates related to the release as new content items. Cool, eh?

Following the life of a press release is an example of what we might call a "microfeed". Think of a piece of content as anchoring a discussion. The other obvious example of a microfeed would be the comments on a blog item. One approach is to have a feed associated with each item ... in fact, something like that is proposed with the includedComments module.

The problem that emerges isn't that there are so many feeds to generate -- that's a piece of cake on the publisher's side, as it's just another page. The problem is that the traditional three-pane nntp^H^H^H^H RSS reader isn't the right model for dealing with potentially hundreds of microfeeds, many of which might never be heard from again. I'm not sure what the correct interaction model is, but it probably has to do with recognizing these microfeeds in the context of the original parent, treating them like "child feeds". The list of feeds that I'm following probably looks a lot more like a big threaded discussion than a serial list of feeds.
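As a very rough sketch of that "child feeds" idea, assuming a client somehow knows which parent each microfeed belongs to (nothing in RSS tells it that today), the subscription list could be folded into a tree like this. The URLs and the function name are placeholders.

```python
from typing import Dict, List, Optional, Tuple


def group_by_parent(feeds: List[Tuple[str, Optional[str]]]) -> Dict[str, List[str]]:
    """Map each top-level feed URL to the microfeeds that name it as parent.

    Each input pair is (feed URL, parent URL or None for a top-level feed).
    """
    tree: Dict[str, List[str]] = {}
    for url, parent in feeds:
        if parent is None:
            tree.setdefault(url, [])  # top-level feed, possibly childless
        else:
            tree.setdefault(parent, []).append(url)
    return tree


# Placeholder subscriptions, for illustration only.
subscriptions = [
    ("http://example.org/index.xml", None),
    ("http://example.org/comments/123.xml", "http://example.org/index.xml"),
    ("http://example.org/comments/124.xml", "http://example.org/index.xml"),
    ("http://example.com/news.xml", None),
]
for parent, children in group_by_parent(subscriptions).items():
    print(parent)
    for child in children:
        print("   ", child)
```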

Then again, the right answer is probably something completely different. What other ideas are out there?

Posted by Eric at 02:16 PM | Permalink | Comments (0)

Microcontent and Feeds

Ken brings to light an issue that will become more and more important as we introduce additional content types (namespace extensions) to our syndicated feeds. Since "feeds" are temporal entities that really serve the purpose of notification, putting information in the feed that will wink out of existence once enough new items have been pushed on the stack is not the way to go if that information has value in its permanence. Put another way, don't put stuff exclusively in the feed that has value outside the feed.

That makes sense if you equate RSS (and its variants) with a "notification feed", and in fact Ken provides some really good suggestions for creating "microcontent" and linking to that from the different syndication formats.

I think what we're seeing, however, is that RSS is evolving beyond being simply a feed format, and is instead becoming more of a standard for content representation. You can argue that this is a misuse of the format, but there seems to be a great deal of innovation going on related to the <item> element even taken out of the context of the <channel> element. Why would that be? My guess is because it's an easy starting point, with common meta-information that almost any kind of content would require, AND because RSS has promoted the use of XML namespaces and provided enough examples to make extensions accessible.

So anyway, I don't think there's a right or wrong answer here, but take Ken's advice and make sure that any information you're publishing that has value in its permanence doesn't get washed away in the feed.

Is a Feed the right place for your Data?

Posted by Eric at 10:33 AM | Permalink | Comments (0) | TrackBacks (1)

RSS to Help Battle Spam

This viewpoint has been around for a while, but it's usually (unfortunately) characterized as "RSS will replace email", which is just silly. But there is a real idea in here: RSS feeds could offer a cure for all of those sites that you register for where you have to give your email address and then uncheck 20 little boxes that try to get you to sign up for their mailing lists.

I personally would be much more likely to "try out" some of those information feeds if they were presented as RSS because I have control. I can unsubscribe without having to rely on the content provider's mailing list server, I don't have to give out my email address, and I might find something useful in their feed. I probably won't, but at least I would be exposed to their information instead of never receiving it in the first place.

Anyway, this is one possible (likely?) future for content syndication feeds once the pieces fall into place (ubiquitous readers, a standardized feed:// addressing scheme). This would be a big win.

Here's a nice article that summarizes why this could be a good thing.

RSS offers no-muss, no-fuss information

Posted by Eric at 09:22 AM | Permalink | Comments (1)

Dancing About Architecture

Feeds, music, life

November 25, 2003

Photos + Location

November 20, 2003

RSS 1.0 vs. RSS 2.0 vs. Atom

November 19, 2003

Kill Bill: Vol. 1

November 14, 2003

The First $20 Million Is Always the Hardest

The Nine Lives of Fritz the Cat

RSS in the Seattle Times

November 12, 2003

My Microfeeds

November 07, 2003

Darkness Falls

Microfeeds

November 06, 2003

Microcontent and Feeds

November 05, 2003

RSS to Help Battle Spam