Is a Feed the right place for your Data?

Nov 05, 2003

Locked in a feed

Having worked with RSS since extensibility became good practice, I'm used to hearing things like "I'm going to add this [something] data to my feed and then everyone will [something]." My question is, "Where was this data before you put it in your feed, and why isn't it still there now?"

A "feed" (RSS/RDF, RSS/XML, Atom) is a snapshot of recent entries from some source, typically an episodic web site that provides new items of information on a regular basis. Outside of online journals and weblogs, one of the most common examples of a data feed is a "stock ticker".

In some cases the data is not already conveniently available on the web, or is not relevant past the timeframe of the feed snapshot. That's not the kind of data I'm talking about. I'm talking about data that is already conveniently on the web, or could be, or data that has relative permanence: what is called "microcontent". A recent example is the Review (RVW) Module for RSS 2.0. Review data has permanence, linkability, searchability, and reusability, so why is it locked in a syndication feed for use pretty much only by syndication clients?

Freed on the web

Instead of creating a "module" for a syndication feed (yes, I know, we told everybody to do that; let's push it in the right direction now), we can use virtually the same format (XML, RDF, HTML, or what have you), put the data at its own URL, and use feed extensibility to link to it.

This has several distinct advantages:

  • Most importantly, microcontent resources at URLs have permanence, linkability, searchability, reusability, and other Web goodness.
  • You're not constrained to a particular feed format or tied to tools that only support that format.
  • Your content is available, to everyone, long after the item in the feed has scrolled off. Within a feed, only feed readers and super aggregators that preserve items will keep your content around.
  • You can link to your content from every feed format, easily. This is where the extensibility of feed formats works for you and you don't have to play favorites.
  • Your microcontent is more easily embedded in or linked from your relevant web page, which is where the vast majority of links from search engines and others will point.
  • Microcontent can still just as easily be served from your content database, some other database, web services, flat files, or what have you.

Linking from HTML and common feed formats

Web pages are the most common place from which microcontent in other formats is linked. Embedding a link in the web page related to your microcontent is described below.

Feed formats serve the purpose of "notification": they let readers know when new items appear on your site. If your microcontent is your item and there's no corresponding web page, put the URL of your item in the <link> element (and in the 'rdf:about' attribute in RSS 1.0). If your microcontent is an alternate representation of, part of, linked from, or otherwise related to a web page on your site, put that web page's URL in the <link> element and extend your feeds as described below (in order of spec publication):

X/HTML

HTML uses a <link> tag within the HTML <head> element to link to related resources. The @rel attribute of the <link> element indicates the relationship of the linked item. If your microcontent is the "same" information as displayed on the web page, but in a machine-readable format, the relationship is rel='alternate'. If the web page adds to or embellishes the content, you can make up an appropriate relationship, like rel='review' or rel='source'. In any case, the @type attribute of the <link> element should give the media type of the content, like type='application/x.rvw+xml' for XML. Lastly, put the link to the content in the @href attribute and a descriptive title in the @title attribute.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
   <HEAD>
      <TITLE>Review: Life of Pi</TITLE>
      <LINK REL="ALTERNATE" TYPE="application/x.rvw+xml"
            HREF="http://www.pmbrowser.info/hublog/archives/1234.rvw"
            TITLE="RVW XML Format">
   </HEAD>
   <BODY>
      <P>Review of Life of Pi by Yann Martel
   </BODY>
</HTML>
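
To show why this kind of link is useful to a client, here is a rough sketch of the consuming side. It assumes Python and its standard html.parser module; the names AlternateLinkFinder and find_alternates are made up for this illustration and are not part of any spec or tool mentioned here.

```python
from html.parser import HTMLParser


class AlternateLinkFinder(HTMLParser):
    """Collect <link> elements whose rel attribute includes 'alternate'.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <LINK REL=...>
        # is seen here as tag "link" with attribute "rel".
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel is a space-separated list; attribute values keep their case,
        # so normalise before comparing.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "alternate" in rel_values:
            self.links.append(
                {
                    "type": attributes.get("type"),
                    "href": attributes.get("href"),
                    "title": attributes.get("title"),
                }
            )


def find_alternates(html_text):
    """Return a list of dicts describing each rel='alternate' link."""
    finder = AlternateLinkFinder()
    finder.feed(html_text)
    finder.close()
    return finder.links


# The sample page from above, unchanged.
PAGE = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
   <HEAD>
      <TITLE>Review: Life of Pi</TITLE>
      <LINK REL="ALTERNATE" TYPE="application/x.rvw+xml"
            HREF="http://www.pmbrowser.info/hublog/archives/1234.rvw"
            TITLE="RVW XML Format">
   </HEAD>
   <BODY>
      <P>Review of Life of Pi by Yann Martel
   </BODY>
</HTML>"""

if __name__ == "__main__":
    for link in find_alternates(PAGE):
        print(link["type"], link["href"])
```

A client that only understands one media type can filter the returned list on the "type" value and ignore the rest.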

RSS 1.0

In RSS 1.0, linking to your microcontent is a matter of choosing an XML namespace and property name that means, effectively, "see also this microcontent". If your microcontent already has an XML namespace, feel free to use it here. The URL for your microcontent goes in the 'rdf:resource' attribute of the property.

<rdf:RDF 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rvw="http://purl.org/NET/RVW/0.1/"
  xmlns="http://purl.org/rss/1.0/">

    ...

    <item rdf:about="http://www.pmbrowser.info/hublog/archives/1234.htm">
      <title>The review headline goes here</title>
      <link>http://www.pmbrowser.info/hublog/archives/1234.htm</link>
      <rvw:item rdf:resource="http://www.pmbrowser.info/hublog/archives/1234.rvw" />

      ...
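
A reader of such a feed might follow the property like this. This is only a sketch, assuming Python's standard xml.etree.ElementTree module; the function name rss1_microcontent_links is hypothetical, and the FEED string is the fragment above closed off by hand so that it parses (it leaves out the channel element a real RSS 1.0 feed would carry).

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS_NS = "http://purl.org/rss/1.0/"
RVW_NS = "http://purl.org/NET/RVW/0.1/"

# Trimmed-down copy of the sample above; not a complete RSS 1.0 document.
FEED = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rvw="http://purl.org/NET/RVW/0.1/"
  xmlns="http://purl.org/rss/1.0/">
  <item rdf:about="http://www.pmbrowser.info/hublog/archives/1234.htm">
    <title>The review headline goes here</title>
    <link>http://www.pmbrowser.info/hublog/archives/1234.htm</link>
    <rvw:item rdf:resource="http://www.pmbrowser.info/hublog/archives/1234.rvw" />
  </item>
</rdf:RDF>"""


def rss1_microcontent_links(feed_xml):
    """Return (web page URL, microcontent URL) pairs, one per rvw:item property."""
    root = ET.fromstring(feed_xml)
    pairs = []
    # Only elements named "item" in the RSS 1.0 namespace are feed items;
    # rvw:item lives in a different namespace and is not matched here.
    for item in root.iter(f"{{{RSS_NS}}}item"):
        page_url = item.findtext(f"{{{RSS_NS}}}link")
        for prop in item.findall(f"{{{RVW_NS}}}item"):
            pairs.append((page_url, prop.get(f"{{{RDF_NS}}}resource")))
    return pairs


if __name__ == "__main__":
    for page_url, microcontent_url in rss1_microcontent_links(FEED):
        print(page_url, "->", microcontent_url)
```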

RSS 2.0

In RSS 2.0, you must likewise pick an XML namespace and element local name for the element that links to your microcontent. The URL goes in the content of that element.

<rss version="2.0"
  xmlns:rvw="http://purl.org/NET/RVW/0.1/">

    ...

    <item>
      <title>The review headline goes here</title>
      <link>http://www.pmbrowser.info/hublog/archives/1234.htm</link>
      <rvw:item>http://www.pmbrowser.info/hublog/archives/1234.rvw</rvw:item>

      ...
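
On the producing side, an item like the one above could be generated along these lines. This is a minimal sketch assuming Python's standard xml.etree.ElementTree module; build_rss2_item is a hypothetical helper, and a real publishing tool will have its own templating.

```python
import xml.etree.ElementTree as ET

RVW_NS = "http://purl.org/NET/RVW/0.1/"

# Ask ElementTree to serialise this namespace with the "rvw" prefix
# instead of an auto-generated one such as "ns0".
ET.register_namespace("rvw", RVW_NS)


def build_rss2_item(title, page_url, microcontent_url):
    """Build an RSS 2.0 <item> that links to both the page and the microcontent."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    # <link> stays pointed at the human-readable web page.
    ET.SubElement(item, "link").text = page_url
    # The namespaced extension element carries the microcontent URL as text.
    ET.SubElement(item, f"{{{RVW_NS}}}item").text = microcontent_url
    return item


if __name__ == "__main__":
    item = build_rss2_item(
        "The review headline goes here",
        "http://www.pmbrowser.info/hublog/archives/1234.htm",
        "http://www.pmbrowser.info/hublog/archives/1234.rvw",
    )
    print(ET.tostring(item, encoding="unicode"))
```

When the item is serialised on its own, ElementTree declares the namespace on the <item> element rather than on <rss>; the two forms are equivalent to a namespace-aware parser.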

Atom

Atom uses the same <link> element as X/HTML, described above, placed within the Atom <entry>. Use rel='alternate' with an appropriate type for the microcontent. You can have two <link> elements with rel='alternate', one pointing to the web page and one pointing to the microcontent, each with an appropriate @type attribute.

<entry xmlns="http://example.com/newformat#" >
  <title>Review: Life of Pi</title>

  <author>
    <name>alf eaton</name>
    <url>http://www.pmbrowser.info/hublog</url>
  </author>

  <issued>2003-02-05T12:29:29</issued>
  <created>2003-02-05T14:10:58Z</created>
  <modified>2003-02-05T14:10:58Z</modified>

  <link rel="alternate" type="text/html"
    href="http://www.pmbrowser.info/hublog/archives/1234.htm"/>
  <link rel="alternate" type="application/x.rvw+xml"
    href="http://www.pmbrowser.info/hublog/archives/1234.rvw"
    title="RVW XML Format"/>

  ...
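
With two rel='alternate' links in one entry, a client picks the one it wants by media type. A sketch of that selection, assuming Python's standard xml.etree.ElementTree module: alternates_by_type is a hypothetical name, the ENTRY string is a shortened copy of the sample above closed off so that it parses, and the namespace is simply the one the sample uses, passed as a parameter because other Atom versions define their own.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the sample entry above (an assumption of this sketch).
ATOM_NS = "http://example.com/newformat#"

ENTRY = """<entry xmlns="http://example.com/newformat#">
  <title>Review: Life of Pi</title>
  <link rel="alternate" type="text/html"
    href="http://www.pmbrowser.info/hublog/archives/1234.htm"/>
  <link rel="alternate" type="application/x.rvw+xml"
    href="http://www.pmbrowser.info/hublog/archives/1234.rvw"
    title="RVW XML Format"/>
</entry>"""


def alternates_by_type(entry_xml, namespace=ATOM_NS):
    """Map media type -> href for every rel='alternate' link in one entry."""
    entry = ET.fromstring(entry_xml)
    result = {}
    for link in entry.findall(f"{{{namespace}}}link"):
        if link.get("rel") == "alternate":
            result[link.get("type")] = link.get("href")
    return result


if __name__ == "__main__":
    links = alternates_by_type(ENTRY)
    # A review-aware client asks for the RVW representation; anything else
    # falls back to the web page.
    print(links.get("application/x.rvw+xml", links.get("text/html")))
```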