AlBlue’s Blog

Macs, Modularity and More

Why the update site should be an atom feed

2006, atom, eclipse, update

One thing most people agree needs work is the Eclipse update manager. (Unfortunately, there doesn't seem to be a plan item for it at the moment.) I think it's important that the update manager is changed to use an Atom feed for its meta-information, and I've raised bug 127236 to reflect this. However, bugs only percolate to the top of the pile when enough people vote for them, which lets the planners know that an issue matters to more than just one person who rants on about the same thing :-)

So why is an Atom feed the right choice for an update site? Well, the Atom Enabled website, which discusses the format of Atom feeds, has some answers.

Unlike RSS, an Atom document can be either a feed or a standalone entry. Feeds can contain entries, but entries can also exist on their own outside of feeds (or, more likely, be included in several feeds). One way to do this is for a feed to contain a bunch of link elements that point to entries which are encoded in more depth elsewhere, or which are shared between feeds.

Also, unlike RSS, Atom feeds are namespace-aware XML documents. In fact, all of the Atom elements must be in the atom: namespace (well, more strictly, the http://www.w3.org/2005/Atom namespace). That means that it's possible to transmit any kind of XML payload with any kind of metadata encoded into the entry itself. You could even make up your own elements like osgi:bundleName or eclipse:autostart, and that'd be fine with Atom. Heck, you can even decorate parts of it with html:p if you want to have human-readable text in there as well (for example, for license, copyright, or other types of text). And because it's namespace/XML compliant, you can tag each piece of text with xml:lang and automatically give the user the right language.
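To make the namespace point a bit more concrete, here is a minimal sketch of an Atom entry carrying foreign-namespace elements, read with Python's standard library. Only the Atom namespace is real; the osgi and eclipse namespace URIs, the eclipse:license element, the entry id and all of the values are invented for this example.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
# Hypothetical extension namespaces, made up for this sketch.
OSGI = "http://example.org/ns/osgi"
ECLIPSE = "http://example.org/ns/eclipse"
# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

ENTRY = """\
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:osgi="http://example.org/ns/osgi"
       xmlns:eclipse="http://example.org/ns/eclipse">
  <id>urn:example:bundle:org.example.foo:1.0.1</id>
  <title>org.example.foo 1.0.1</title>
  <updated>2006-08-14T10:00:00Z</updated>
  <osgi:bundleName>org.example.foo</osgi:bundleName>
  <eclipse:autostart>true</eclipse:autostart>
  <eclipse:license xml:lang="en">Example licence</eclipse:license>
  <eclipse:license xml:lang="fr">Licence d'exemple</eclipse:license>
</entry>
"""


def parse_entry(xml_text, lang="en"):
    """Read the standard Atom fields plus the made-up extension elements."""
    entry = ET.fromstring(xml_text)
    # Index the licence texts by their xml:lang attribute.
    licenses = {
        element.get(XML_LANG): element.text
        for element in entry.findall(f"{{{ECLIPSE}}}license")
    }
    return {
        "id": entry.findtext(f"{{{ATOM}}}id"),
        "title": entry.findtext(f"{{{ATOM}}}title"),
        "updated": entry.findtext(f"{{{ATOM}}}updated"),
        "bundle": entry.findtext(f"{{{OSGI}}}bundleName"),
        "autostart": entry.findtext(f"{{{ECLIPSE}}}autostart") == "true",
        "license": licenses.get(lang),
    }
```

Calling parse_entry(ENTRY, lang="fr") picks the French licence text, while a reader that only knows Atom can still use id, title and updated and skip the rest.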

But there's more. There's a whole bunch of libraries dedicated to reading Atom feeds in a whole host of languages. That's because, regardless of the content of an Atom feed, it always has:

  • id - a unique URI that identifies the entry
  • title - a human readable title
  • updated - a time stamp in full ISO format, e.g. 2003-12-13T18:30:02-05:00

So any feed reader will notice that an update has occurred, and can act upon it, even if it doesn't know what an Eclipse plugin is. You could have an entry scrolling across your Slimp3, or a message popped up on your TiVo, or even broadcast over Jabber. Most of these feed readers are also intelligent enough not to waste bandwidth: they send their query to the HTTP server along with the ETag, which identifies the state of the data that the HTTP server last sent. If the data hasn't changed, the reader gets a 304 code to let it know that it doesn't need to update its information. You could even build aggregators that combine multiple sources (much like planeteclipse.org is aggregating this post) to provide internal update sites that are refreshed automatically.
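The ETag exchange can be sketched in a few lines of Python. The fetch_if_changed helper and its opener parameter are names invented here (the parameter only exists so the function can be tried without a network); the If-None-Match header, the ETag header and the 304 status are standard HTTP.

```python
import urllib.error
import urllib.request


def fetch_if_changed(url, etag=None, opener=urllib.request.urlopen):
    """Fetch a feed, returning (body, etag).

    body is None when the server answers 304 Not Modified, meaning the
    copy we already hold is still current.
    """
    request = urllib.request.Request(url)
    if etag is not None:
        # Tell the server which version we already have.
        request.add_header("If-None-Match", etag)
    try:
        with opener(request) as response:
            return response.read(), response.headers.get("ETag")
    except urllib.error.HTTPError as error:
        if error.code == 304:
            # urllib reports 304 as an HTTPError; nothing new to download.
            return None, etag
        raise
```

A reader would store the returned ETag next to its cached copy of the feed and pass it back on the next poll, so an unchanged feed costs only a short header exchange.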

Even better, once the top-level Atom feed is standardised, there could be multiple implementations. For example, the Oscar bundle repository has a repository feed, and OSGi also has a separate bundle repository format. If they were all Atom feeds, then they could all have entries encoded in the same way for determining what bundles have changed, and have appropriate namespaced tags for each type of provider. Then, an update system that didn't know how to parse (say) OSGi bundle XML would just ignore the entry. (Of course, the goal would be to have a ubiquitous format that everyone uses.)
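As a rough sketch of that "ignore what you don't understand" behaviour, the snippet below walks a feed and keeps only the entries whose payload lives in a namespace the consumer knows. The two payload namespaces, the element names inside them and the sample feed are all hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
# Hypothetical payload namespaces, made up for this sketch.
ECLIPSE = "http://example.org/ns/eclipse"
OBR = "http://example.org/ns/obr"

FEED = f"""\
<feed xmlns="{ATOM}" xmlns:eclipse="{ECLIPSE}" xmlns:obr="{OBR}">
  <id>urn:example:feed</id>
  <title>Example update site</title>
  <updated>2006-08-14T10:00:00Z</updated>
  <entry>
    <id>urn:example:1</id>
    <title>org.example.ui 2.0.0</title>
    <updated>2006-08-14T10:00:00Z</updated>
    <eclipse:feature version="2.0.0"/>
  </entry>
  <entry>
    <id>urn:example:2</id>
    <title>org.example.log 1.3.0</title>
    <updated>2006-08-13T09:00:00Z</updated>
    <obr:resource version="1.3.0"/>
  </entry>
</feed>
"""


def namespace_of(element):
    """Return the namespace URI of an element, or '' if it has none."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def understood_entries(feed_xml, known_namespaces):
    """Return (title, payload) pairs for entries with a payload we can parse.

    Entries whose extension elements are all in unknown namespaces are
    skipped rather than treated as errors.
    """
    feed = ET.fromstring(feed_xml)
    result = []
    for entry in feed.findall(f"{{{ATOM}}}entry"):
        title = entry.findtext(f"{{{ATOM}}}title")
        for child in entry:
            if namespace_of(child) in known_namespaces:
                result.append((title, child))
                break
    return result
```

An update system that only knows the (made-up) Eclipse namespace would call understood_entries(FEED, {ECLIPSE}) and see just the first entry; one that also handles the repository namespace would pass both and see the two.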

So, to summarise:

  • Atom allows arbitrary namespace-partitioned XML content to be encapsulated with the Atom entry, opening up the possibility of encoding Manifest.MF-type information in an XML manner as well as other details
  • Atom feeds are designed to have in-line and fragmented entries. Having lots of entries in different locations might be good; you could have (e.g.) a JDT feature that had an Atom feed, that pointed to the same set of entries for the Platform.
  • Atom feeds define standard headers for mandatory fields (including a standardised ISO date that sorts lexicographically in date order, at least when all timestamps use the same timezone offset), as well as several other optional ones (like authors, copyright, summary etc.). All feed readers will be able to display these fields, even if they don't know about any Eclipse- or OSGi-specific metadata encoded in the XML
  • Having an Atom feed read from many different Atom entries will allow HTTP optimisations (such as ETag and 304) to be used
  • Parsers exist for Atom feeds in a variety of different languages already
  • Aggregation of Atom feeds is a well-known subject, and one that can be used to allow for distribution of updates internally
  • It should be trivial for every individual bundle to have its own Atom feed/entries for new versions as they are created. That would allow (for example) the Xerces parser to be updated in Eclipse without needing to have a specific feature being created just to hold a single bundle. In fact, repositories would probably automatically generate such Atom entries when bundles are uploaded, so that they can be embedded in any Atom feed that you want.
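The aggregation point can be sketched as follows, assuming each source is already available as an Atom feed document. The aggregate and parse_updated names are invented for this example. Entries are keyed on their mandatory id, the most recently updated copy wins, and the timestamps are parsed before comparing so that differing timezone offsets don't upset the ordering.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

ATOM = "http://www.w3.org/2005/Atom"


def parse_updated(text):
    """Parse an Atom timestamp into a timezone-aware datetime."""
    # A trailing "Z" is allowed in Atom dates, but older versions of
    # datetime.fromisoformat only accept numeric offsets.
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def aggregate(feed_documents):
    """Merge several Atom feeds into one list, newest entry first.

    Returns (id, title, updated) tuples; when the same id appears in more
    than one feed, only the most recently updated copy is kept.
    """
    latest = {}
    for document in feed_documents:
        feed = ET.fromstring(document)
        for entry in feed.findall(f"{{{ATOM}}}entry"):
            entry_id = entry.findtext(f"{{{ATOM}}}id")
            title = entry.findtext(f"{{{ATOM}}}title")
            updated = parse_updated(entry.findtext(f"{{{ATOM}}}updated"))
            if entry_id not in latest or updated > latest[entry_id][0]:
                latest[entry_id] = (updated, title)
    ordered = sorted(latest.items(), key=lambda item: item[1][0], reverse=True)
    return [(entry_id, title, updated) for entry_id, (updated, title) in ordered]
```

An internal update site could run something like this over the upstream feeds it trusts and republish the merged result as a single feed of its own.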

So, if you think that having the update manager use an Atom feed is a good idea, please vote for it. Eclipse has traditionally been the leader when it comes to certain choices (like basing the platform on OSGi) and this would be yet another example of Eclipse showing others the way. But you need to tell the Eclipse team that this is a desirable thing in order for it to get on the plan for 3.3; and with 3.3M1 already out the door, time waits for no-one.

You might also like to vote for bug 126732, which co-incidentally is an anagram of the first one. I discussed that one in a separate post and blog entry.