RSS Best Practices

Dean Allemang, chief scientist at TopQuadrant, Inc., shares some tips on providing newsfeeds for knowledge dissemination on the Web.

Newsfeeds for the Masses

A lot of information on the Web has a life span of weeks, months or even years. Titles of books on Amazon don’t change; prices change, but only on a slow schedule. An Amazon user has no need to check back each day for the current price of the Harry Potter books.


But What Is RSS?

Be sure to read Part I, Demystifying RSS Technologies, to learn more about RSS terminology and metadata.


But there is some information on the Web that is relatively short-lived. The best and oldest example of this is news: a news article from yesterday is, well, yesterday’s news. Its value decreases rapidly with age, from something that everyone is rushing to see down to something of interest only to historians and researchers – sometimes in a matter of hours.

It is this sort of volatile information that newsfeed technologies are designed to deal with. In addition to news stories, other examples of volatile information include weather reports, earthquake reports and blog updates.

Web Page or Newsfeed?

While a newsfeed is a bit more complicated than a Web page, it isn’t difficult to understand. Like a Web page, a newsfeed is specified by a URL. But unlike a Web page, a newsfeed is expected to change on a regular basis, even minute by minute.

Let’s take a simple example: the headline in the sports section of The New York Times. At one moment, the headline could refer to a story about a retired Olympian. A few minutes later, when a new report comes in, it could refer to a story about an unusual performance in the World Series. The entire content of the headline changes to reflect the most current news.

Figure 1: Screenshot from The New York Times RSS feed

Now we come to the most important difference between a newsfeed and a Web page. Because a newsfeed changes so quickly and unpredictably, regular readers could miss a story. For this reason, a newsfeed includes a chronological history of its recent items. This is why it is called a ‘feed’: it is a sequence of items, organized chronologically.
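
To make the idea of a feed as a chronological sequence of items concrete, here is a minimal sketch in Python. It assumes an RSS 2.0 document; the feed content, the example.org URLs and the function name recent_items are all invented for illustration and do not come from any real newsfeed.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical feed content, invented for illustration only.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Sports Headlines</title>
    <link>http://example.org/sports</link>
    <description>Illustrative feed, not a real one</description>
    <item>
      <title>Retired Olympian opens training center</title>
      <link>http://example.org/sports/1</link>
      <pubDate>Mon, 06 Oct 2008 09:15:00 GMT</pubDate>
    </item>
    <item>
      <title>Unusual performance in the World Series</title>
      <link>http://example.org/sports/2</link>
      <pubDate>Mon, 06 Oct 2008 09:40:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def recent_items(feed_xml):
    """Return the feed's items as dictionaries, newest first."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            # RSS 2.0 dates use the RFC 822 format, which the
            # standard library's email.utils module can parse.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    items.sort(key=lambda entry: entry["published"], reverse=True)
    return items


if __name__ == "__main__":
    for entry in recent_items(SAMPLE_FEED):
        print(entry["published"].isoformat(), entry["title"])
```

A reader who checks in late still sees both stories, because the earlier item stays in the feed alongside the newer one.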

Advice for Providing RSS Feeds

Suppose you are in charge of providing information for a non-profit organization or a government agency, and you are considering newsfeed technology as a way to deliver that information to your readers. I have prepared a list of guidelines to help you and your readers get the most out of the effort you put into your feed.

  1. Select appropriate content. Newsfeeds are appropriate for information that changes rapidly. In some cases, it is obvious that the information comes this way (news stories). In others, it is not so obvious. Flickr is a photo archiving site, but updates to these archives are like news items: they are interesting for a short period of time after they occur, and fade into uselessness shortly afterwards.

    Before you put data into a feed, think about whether it really needs a feed – is there new information being provided on a regular basis? Are there consumers who need to be informed of this information in a timely fashion? If so, then a feed is probably a good way to deliver the information.

  2. Provide structured information whenever possible. The more information you provide in your feed, the easier it is for someone to re-purpose your information and display it in a novel and useful way. As providers of public information, we need to encourage such reuse as much as we can.

    Furthermore, provide as much structure in the information as you can. For example, early versions of the Flickr newsfeed included tag information as a space-delimited stream of terms e.g., “pink flowers autumn flower fall.” Later versions provided more structure, separating out each term into its own data item. The USGS provides structured geolocation information, making it easy to place earthquake events on a map.

  3. There’s no reason to be stingy with the number of feeds you provide. The USGS provides a handful of feeds about earthquakes indexed by severity (magnitude greater than 1, 2.5 and 5, over the past day or week, in the United States or globally). The Washington Post provides a few hundred feeds on its feeds page, some of which are generalizations of others, e.g. “Politics” and “Bush Administration.” Flickr goes one step further: any tag or set of tags can be used to define a feed, effectively defining tens of thousands of feeds. An end user can specify their own feed about pink flowers growing in the autumn.

    The more ways you cut up your data into feeds, the more specialized audiences you can address, and the more relevant use can be made of your information.

  4. Use standard vocabulary. The downside of extensible metadata is the possibility of a proliferation of ways to say the same thing. The New York Times uses its own vocabulary to specify geographical location, making it more difficult to cross-reference location information between stories in the Times and, say, Flickr photos. The same goes for tagging: Flickr and del.icio.us allow their user base to create new tags and use them as they see fit. Newspapers like the Times have their own topic indexes that they use to classify their news stories.

    But you can maximize the understandability of your information if you can find a shared vocabulary that is relevant to your domain. Some examples of public vocabularies are AGROVOC (the agricultural vocabulary of the United Nations Food and Agriculture Organization), UNSPSC (the United Nations Standard Products and Services Code, for product categories), and the West American Digest System.
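
Guidelines 2 and 3 can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: the earthquake records, the example.org link and the function names build_feed and feeds_by_magnitude are hypothetical, and the sketch is not how the USGS or Flickr actually build their feeds. Each tag goes into its own RSS 2.0 category element rather than a space-delimited string, and the same records are cut into several feeds using the magnitude thresholds mentioned above.

```python
import xml.etree.ElementTree as ET

# Hypothetical earthquake records, invented for illustration only.
EVENTS = [
    {"title": "M 1.4 near Example Town", "magnitude": 1.4,
     "tags": ["earthquake", "minor"]},
    {"title": "M 3.1 off Example Coast", "magnitude": 3.1,
     "tags": ["earthquake", "offshore"]},
    {"title": "M 5.6 in Example Valley", "magnitude": 5.6,
     "tags": ["earthquake", "significant"]},
]


def build_feed(title, events):
    """Serialize events as an RSS 2.0 document, one <category> per tag."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = "http://example.org/feeds"
    ET.SubElement(channel, "description").text = title
    for event in events:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = event["title"]
        # Structured tags: each term is its own data item, so a
        # consumer never has to split a string to recover them.
        for tag in event["tags"]:
            ET.SubElement(item, "category").text = tag
    return ET.tostring(rss, encoding="unicode")


def feeds_by_magnitude(events, thresholds=(1.0, 2.5, 5.0)):
    """Cut one data set into several feeds, one per magnitude threshold."""
    return {
        threshold: build_feed(
            f"Magnitude greater than {threshold}",
            [e for e in events if e["magnitude"] > threshold],
        )
        for threshold in thresholds
    }


if __name__ == "__main__":
    for threshold, xml_text in feeds_by_magnitude(EVENTS).items():
        count = len(ET.fromstring(xml_text).findall("./channel/item"))
        print(threshold, count)
```

Adding another feed is then just a matter of adding another filter, which is why there is little cost to offering many of them.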

Don’t Miss the Boat

Newsfeed technologies have already established a place for themselves on the Web as the source for volatile information. They are in widespread use at newspapers, blogging sites, photo-sharing sites and elsewhere. They provide a standardized way for an information provider to send new items to information consumers. The extensibility of these systems opens the door to sophisticated uses of the information, beyond the plans of the original information provider.

Dr. Allemang is chief scientist at TopQuadrant, Inc., specializing in innovative applications of Semantic Web technology. He developed the curriculum for TopQuadrant’s successful training series for Semantic Web technologies, which he has been presenting to customers worldwide for over four years. Dean has completed a master’s degree at the University of Cambridge as a Marshall scholar, a PhD at the Ohio State University as a National Science Foundation Graduate Scholar, and is a two-time winner of the Swiss Prize for Innovation in Technology. He recently published, along with co-author Prof. Jim Hendler, Semantic Web for the Working Ontologist (Morgan-Kaufmann, 2008), which has recently appeared in Korean translation.

