Demystifying RSS

Dean Allemang, chief scientist at TopQuadrant, Inc., demystifies newsfeed technologies and describes what unique value they provide in the information landscape of the Web.

As the Web matures and everyone becomes familiar with its capabilities, we are seeing more and more sophisticated ways to deliver information and link it together into a true web. One technology contributing to this phenomenon goes by the name “RSS.”

While RSS isn’t particularly new (early versions date back to the ’90s), it has fairly recently come into its own, becoming familiar to a wide range of Web users. RSS links, also known as feeds, can be seen on the front page of any major newspaper. Government agencies commonly distribute information using RSS. Social networking sites like LiveJournal, MySpace and Facebook are tightly integrated with RSS in a number of ways.

But What Is It?

RSS is the name of one of several technologies for sharing rapidly-changing information over the Web. These technologies are known as newsfeed technologies.

But newsfeed technologies (especially RSS) have a lot of technological baggage that makes them more mysterious to everyday users than they have to be. The “RS” in “RSS” used to stand for really simple, but nowadays, RSS seems anything but!

What is the difference between all the versions of RSS (e.g., 0.9, 1.0, 2.0)? What about “Atom,” another newsfeed technology that seems to appear often alongside RSS? More fundamentally, what can a casual, or even serious, Web user do with newsfeed technologies? And finally, and most importantly for non-profit information providers, how can — and should — these technologies be used to publish information so as to get the most value from using them?

Reading a News Feed

The idea of a newsfeed isn’t new with the Web; newspapers have worked with newsfeeds for years. The latest headlines are printed sequentially on a strip of paper, which reporters in a press room read and tear off.

But how should this work for a newsfeed on the Web? The usual Web experience is to point the browser at a Web page and see what’s there. But how should a Web browser handle a feed? If it just shows the current top headline, then previous headlines, even just minutes old, will never be seen.

For this reason, most modern Web browsers include a specialized capability for processing newsfeeds. When the browser is directed to a newsfeed (for example, when a reader clicks a “feeds” link on The New York Times Web page), it recognizes the feed and presents the top several stories, along with headlines, source, date of submission and the first line of the full story when available. Figure 1, for example, was produced by the newsfeed reader in Mozilla Firefox.

Figure 1: Screenshot from The New York Times RSS feed

This sort of display allows a reader to see a summary of the latest news, and to click through to the full story. Many browsers provide further capabilities; for instance, Mozilla Firefox offers the capability to include an RSS feed in the ‘shortcuts’ bar with a convenient pull-down menu to display the latest headlines. For advanced users of newsfeed technology, a wide variety of newsfeed reading software is available.

Another way to read a newsfeed requires no special software on the reader’s computer at all. Many Web portals (e.g., Yahoo!) let users include newsfeeds in their customized front pages. The portal then takes responsibility for presenting the news stories in an understandable way. Figure 2 shows a “My Yahoo!” front page, customized for someone with a particular interest in women’s basketball.

Figure 2: Screenshot from My Yahoo!

Feed Technologies

One of the most confusing things about newsfeed technologies is the bewildering array of standards for doing essentially the same thing. While the history of how there came to be so many competing standards is fraught with emotional disagreements, fortunately for content providers and consumers alike, there is little practical difference among the standards today.

There are three standards in common use today: two versions of RSS (RSS 1.0 and RSS 2.0), and Atom. For the purposes of this article, all versions of RSS and Atom provide the same thing: structured information about volatile entities in a newsfeed.
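To make “the same thing” concrete, here is a minimal Python sketch that reads an RSS 2.0 document and an Atom document into the same simple structure. The two sample documents are invented for illustration, the function name read_items is hypothetical, and RSS 1.0 (which wraps its items in RDF) is left out to keep the sketch short.

```python
import xml.etree.ElementTree as ET

# Atom elements live in this XML namespace; RSS 2.0 core elements have none.
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample documents, invented for illustration.
RSS2_SAMPLE = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>First story</title>
      <link>http://example.org/first</link>
      <pubDate>Mon, 06 Oct 2008 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

ATOM_SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Blog</title>
  <entry>
    <title>First post</title>
    <link href="http://example.org/post"/>
    <updated>2008-10-06T09:00:00Z</updated>
  </entry>
</feed>"""


def read_items(xml_text):
    """Return a list of {title, link, date} dicts from RSS 2.0 or Atom text."""
    root = ET.fromstring(xml_text)
    items = []
    if root.tag == ATOM + "feed":
        # Atom: entries sit directly under <feed>; the link is an attribute.
        for entry in root.findall(ATOM + "entry"):
            link = entry.find(ATOM + "link")
            items.append({
                "title": entry.findtext(ATOM + "title"),
                "link": link.get("href") if link is not None else None,
                "date": entry.findtext(ATOM + "updated"),
            })
    elif root.tag == "rss":
        # RSS 2.0: items sit under <channel>; the link is element text.
        for item in root.iter("item"):
            items.append({
                "title": item.findtext("title"),
                "link": item.findtext("link"),
                "date": item.findtext("pubDate"),
            })
    return items


for sample in (RSS2_SAMPLE, ATOM_SAMPLE):
    for item in read_items(sample):
        print(item["title"], "|", item["link"], "|", item["date"])
```

The element names differ between the two formats, but each item ends up with the same three pieces of information: a title, a link and a date.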

Show Me the Data

One of the key features of any feed representation is extensibility in the data that it can represent about an item in a feed. In the earliest days of RSS, it was sufficient to have a title, a date and a link to more information (a Web page) about the item. But as more and more types of information came to be managed in feeds, it became necessary to include more and more information in the feed.

The photo-sharing site Flickr is a good example. In addition to a link to a photo, its feeds began including a link to a small-scale thumbnail of the photo, tag information, source information about the photo (e.g., who took it and using what equipment), and eventually geographical information about where the photo was taken. Newspapers began to include headlines and bylines. Blog sources include links to comments and security settings. The newsfeed standards differ slightly in how they handle this sort of extension.
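In RSS 2.0, extensions of this kind are typically carried in elements from additional XML namespaces. The Python sketch below reads a thumbnail and a latitude/longitude pair from a single item, assuming the feed uses the Media RSS and W3C Basic Geo namespaces. The sample document is invented for illustration and is not actual output from any photo site; read_photo_items is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used when looking up extension elements.
NS = {
    "media": "http://search.yahoo.com/mrss/",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
}

# Hypothetical photo feed, invented for illustration.
SAMPLE = """<rss version="2.0"
     xmlns:media="http://search.yahoo.com/mrss/"
     xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
  <channel>
    <title>Example Photos</title>
    <item>
      <title>Tulips</title>
      <link>http://example.org/photos/1</link>
      <media:thumbnail url="http://example.org/photos/1_small.jpg"/>
      <geo:lat>40.78</geo:lat>
      <geo:long>-73.97</geo:long>
    </item>
  </channel>
</rss>"""


def read_photo_items(xml_text):
    """Return core fields plus optional thumbnail and coordinates per item."""
    root = ET.fromstring(xml_text)
    photos = []
    for item in root.iter("item"):
        thumb = item.find("media:thumbnail", NS)
        lat = item.findtext("geo:lat", namespaces=NS)
        lon = item.findtext("geo:long", namespaces=NS)
        photos.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            # Extension fields are optional, so missing ones become None.
            "thumbnail": thumb.get("url") if thumb is not None else None,
            "lat": float(lat) if lat is not None else None,
            "lon": float(lon) if lon is not None else None,
        })
    return photos


print(read_photo_items(SAMPLE))
```

A reader that does not know about these namespaces can simply ignore the extra elements and still display the title and link, which is what makes this style of extension practical.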

But what is the point, to the information consumer, of all this extensible metadata? Why would they care? The answer to this lies in the utilization patterns of newsfeed data.

Most newsfeed readers today sort news items by source. In Figure 2, we see a set of stories from the Stanford Women’s Basketball newsfeed, and another set from The Washington Post, etc. But since each item has a date stamp, it is possible to display all stories about basketball in the past 24 hours, regardless of source.
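As a rough sketch of that idea, the Python below pools items from several sources, keeps those whose title mentions a keyword and whose date stamp falls within the past 24 hours, and lists them newest first. The items, the fixed “current time” and the function name recent_stories are all invented for illustration, and matching a keyword against the title is a deliberately crude stand-in for real topic detection.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Hypothetical items already pulled from several feeds (RSS 2.0 style dates).
ITEMS = [
    {"source": "Stanford Women's Basketball",
     "title": "Cardinal basketball wins opener",
     "date": "Mon, 06 Oct 2008 20:00:00 GMT"},
    {"source": "The Washington Post",
     "title": "Basketball season preview",
     "date": "Mon, 06 Oct 2008 08:30:00 GMT"},
    {"source": "The Washington Post",
     "title": "Basketball schedule announced",
     "date": "Fri, 03 Oct 2008 12:00:00 GMT"},
    {"source": "The Washington Post",
     "title": "Election roundup",
     "date": "Mon, 06 Oct 2008 18:00:00 GMT"},
]


def recent_stories(items, keyword, now, window=timedelta(hours=24)):
    """Return items mentioning keyword within the window, newest first."""
    matches = []
    for item in items:
        published = parsedate_to_datetime(item["date"])
        in_window = now - window <= published <= now
        if in_window and keyword.lower() in item["title"].lower():
            matches.append((published, item))
    # Sort on the date only, ignoring which feed the item came from.
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in matches]


# A fixed "current time" so the example gives the same result on every run.
now = datetime(2008, 10, 7, 6, 0, tzinfo=timezone.utc)
for story in recent_stories(ITEMS, "basketball", now):
    print(story["source"], "-", story["title"])
```

The date stamp is what makes the cross-source view possible: once every item carries one, the reader no longer has to organize stories by where they came from.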

As more and more metadata is available in the feed itself, this can become more and more elaborate. “Show me pictures of flowers in Central Park.” “Show me earthquakes of magnitude greater than 4.5 within a 30-mile radius of my home.” We can even combine dynamic and static data: “display reports of voter fraud on a map that displays racial demographics of the country.” The more structured metadata is available in a feed, the more sophisticated a presentation can be made.
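The earthquake query can be sketched in a few lines of Python once each item carries a magnitude and coordinates. The items and the home location below are invented for illustration, the field names are assumptions rather than any feed's actual vocabulary, and distance is estimated with the haversine great-circle formula on a spherical Earth.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean radius; a spherical approximation


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


# Hypothetical feed items with structured metadata, invented for illustration.
QUAKES = [
    {"title": "M 4.8 near home", "magnitude": 4.8, "lat": 37.80, "lon": -122.30},
    {"title": "M 3.1 near home", "magnitude": 3.1, "lat": 37.75, "lon": -122.40},
    {"title": "M 6.0 far away", "magnitude": 6.0, "lat": 34.05, "lon": -118.25},
]

HOME = (37.77, -122.42)  # hypothetical home location (latitude, longitude)


def nearby_quakes(items, home, min_magnitude=4.5, radius_miles=30.0):
    """Keep quakes stronger than min_magnitude within radius_miles of home."""
    return [
        q for q in items
        if q["magnitude"] > min_magnitude
        and distance_miles(home[0], home[1], q["lat"], q["lon"]) <= radius_miles
    ]


for quake in nearby_quakes(QUAKES, HOME):
    print(quake["title"])
```

None of this filtering is possible if the magnitude and location appear only in the free text of a headline; it depends on the publisher putting them in the feed as structured fields.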

So What Do I Do with All This Data?

There is an overwhelming quantity of data on the Web. How can your organization get its information out there in a useful way?

Continue on to read Part II, which shares best practices for developing your own newsfeed.



Dr. Allemang is chief scientist at TopQuadrant, Inc., specializing in innovative applications of Semantic Web technology. He developed the curriculum for TopQuadrant’s successful training series for Semantic Web technologies, which he has been presenting to customers worldwide for over four years. Dean has completed a master’s degree at the University of Cambridge as a Marshall scholar, a PhD at the Ohio State University as a National Science Foundation Graduate Scholar, and is a two-time winner of the Swiss Prize for Innovation in Technology. He recently published, along with co-author Prof. Jim Hendler, Semantic Web for the Working Ontologist (Morgan-Kaufmann, 2008), which has recently appeared in Korean translation. 

