Atom Feeds
Release: 2009-10-31
News Feeds
After the explosion of competing versions and structures among the many RSS Feed formats, the community wanted a clean, modern format that was separate from the RSS name. Depending on the version and the organisation that developed the specification, RSS stood for Rich Site Summary, RDF Site Summary or Really Simple Syndication.
The new feed format was developed in an open forum and was eventually named Atom. Atom reached version 1.0 in 2005 and, by the end of 2006, was supported by all major Feed Readers and Aggregators.
Atom Feeds are XML-based documents that use the extension .atom (or .xml, or any server-side script extension) and the 'application/atom+xml' MIME Media Type. Just as with the old RSS Feeds, web browsers that support News Feeds auto-discover Atom Feeds associated with a webpage via HTML's link element, for instance: <link rel="alternate" type="application/atom+xml" href="feeds/latestnews.atom" title="The Latest News (Atom)">. Browsers that support both Atom Feeds and the old RSS Feeds include Mozilla Firefox 1.5 and higher, Opera 7.2 and higher, Apple Safari 2.0 and higher and Microsoft Internet Explorer 7 and higher.
You can subscribe to Atom Feeds that provide the latest news, blog entries or other headline-related information. You can drag, or copy and paste, the Atom Feed's URI into a dedicated Feed Reader or Aggregator. As feeds are only XML they are quick to download, and the appearance of the information is handled by the browser or Feed Reader and can usually be customized. Feeds let people be notified of any changes or updates right in their browser, plugin, email client or other Atom-supporting program without having to search the actual website. If an entry carries just a headline or summary, it will link to the full information on the actual webpage.
Atom Feeds
The basic structure of an Atom Feed usually has an XML Declaration followed by the feed element, which declares Atom's namespace (xmlns="http://www.w3.org/2005/Atom", usually as the default namespace) and states the Human language with XML's xml:lang attribute, as:
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
</feed>
The required elements providing information about the feed itself are the title element for the feed headline and the id element as a unique identifier, usually a URI or IRI (such as an http: address) or a 'tag:' URI such as tag:example.com,2009:articles.entry01 or tag:me@example.com,2009:myposts.entry-01.
The 'tag' URI/IRI scheme provides uniqueness not just in space but in time as well. A UTC date can be used for the date part, written as a year, a year and month, or a full date, such as 2009, 2009-01 or 2009-01-01. After the second colon (:) you can use slashes (/) and/or dots (.) to separate any 'words'.
Two link Empty Elements should also be included: one that points to the main webpage or website and one that points to the feed itself (useful for other feeds that import entries from the current feed), as:
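As a rough sketch of how 'tag:' identifiers might sit in a feed, the example below gives the feed and one entry their own tag-based id. The domain example.com, the dates, the titles and the path 'words' are all made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
  <title>Example Articles</title>
  <!-- Feed id: authority (example.com), a year-only date, then a name chosen by the publisher -->
  <id>tag:example.com,2009:articles</id>
  <updated>2009-01-01T00:00:00Z</updated>
  <author>
    <name>Example Author</name>
  </author>
  <entry>
    <title>First article</title>
    <!-- Entry id: a full date, with a slash and a dot separating the 'words' -->
    <id>tag:example.com,2009-01-01:articles/entry.01</id>
    <link rel="alternate" type="text/html" href="http://www.example.com/articles/entry01"/>
    <updated>2009-01-01T00:00:00Z</updated>
    <summary>A placeholder summary.</summary>
  </entry>
</feed>
```

Once published, an id should stay the same even if the webpage it describes later moves to a different address.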
<link rel="alternate" type="text/html" href="http://www.example.com/news/"/>
<link rel="self" href="http://www.example.com/news/news.atom"/>
More link Empty Elements can be used for other references or alternates by adding an hreflang attribute, which indicates that the reference or alternate is for a particular Human language. You may have many of these as long as no two of them share the same combination of type and hreflang attribute values.
<link rel="alternate" type="text/html" hreflang="en" href="http://www.example.com/news/"/>
<link rel="alternate" type="text/html" hreflang="fr" href="http://www.example.com/fr/news/"/>
<link rel="alternate" type="application/pdf" hreflang="en" href="http://www.example.com/news/news.pdf"/>
At least one author, such as an individual's name or an organisation's name, is stated within a name element inside an author element. The date and time of the last update, in W3C/UTC Date and Time form (for example 2009-02-11T18:32:00+00:00), is stated within an updated element.
Our example would expand to the following:
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
<title>Web News</title>
<id>http://www.example.co.uk/</id>
<link rel="alternate" type="text/html" href="http://www.example.co.uk/"/>
<link rel="self" href="http://feeds.example.co.uk/latestnews.atom"/>
<updated>2009-02-11T18:32:00+00:00</updated>
<author>
<name>FeedCorp</name>
</author>
</feed>
or
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
<title>Web News</title>
<id>tag:example.co.uk,2009:latestnews</id>
<link rel="alternate" type="text/html" href="http://www.example.co.uk/"/>
<link rel="self" href="http://www.example.co.uk/latestnews.atom"/>
<updated>2009-02-11T18:32:00+00:00</updated>
<author>
<name>FeedCorp</name>
</author>
</feed>
Entries
After the main feed information you have one or more entries, each declared in an entry element. Within that element you must have a title element, an id element and a link Empty Element that points to the webpage, or part of a webpage, holding the full story. You also need an updated element and a summary and/or content element, for the summary or the full information respectively.
<entry>
<title>Email Watch: Mozilla Thunderbird 2.0.0.23 released</title>
<id>http://www.getthunderbird.com/</id>
<link rel="alternate" type="text/html" href="http://www.getthunderbird.com/"/>
<summary>If you have Mozilla Thunderbird 1.5 or higher you can go to Help, Check For Updates to update your copy. Otherwise the 2.0.0.23 version is available from the Mozilla Thunderbird site.</summary>
<updated>2009-08-24T18:31:00+01:00</updated>
</entry>
Information Types
The default Content Model of the text-carrying elements, such as title in the feed information and in each entry, and summary and content in each entry, is plain text. Any markup in them is therefore treated as literal text and not as elements (although XML's own character entity references are resolved by the XML parser before the feed itself is processed).
To change the content model you can add a type attribute to the element. The default value of the type attribute is naturally 'text'. The value 'html' means the element holds HTML markup that has been escaped, which the Feed Reader unescapes and then processes as HTML. XML's five predefined entities (&lt;, &amp;, &gt;, &quot; and &apos;) and any decimal or hexadecimal numeric character references are guaranteed to work. But with XML in browsers and other environments, support for other named character entity references, such as the (X)HTML entities &copy;, &euro; and &eacute;, cannot be guaranteed. Some feed authors instead wrap the markup in an XML CDATA Section such as:
<content type="html"><![CDATA[
<p>Full unescaped elements <br>& other 'characters'.</p>
]]></content>
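The same kind of content could also be written without a CDATA Section by escaping the markup characters by hand. A minimal sketch, assuming a paragraph similar to the one above:

```xml
<!-- Each < becomes &lt;, each > becomes &gt; and the bare & becomes &amp; -->
<content type="html">&lt;p&gt;Full escaped elements &lt;br&gt;&amp; other 'characters'.&lt;/p&gt;</content>
```

Both forms should give a Feed Reader the same HTML once the XML parser has done its work. The CDATA form is simply easier to write and read by hand.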
The third, preferred and more appropriate content model value is 'xhtml'. This allows a full native XHTML div element, with the XHTML namespace usually declared as the default namespace on the div element. This makes the content of the title, summary or content element an 'isolated XHTML island', as:
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<p>Any thing you can use in a <code>div</code> element <br/>can be used here.</p>
</div>
</content>
Figure 8: Type XHTML content
The type attribute on the content element (but not on title or summary) may hold a MIME Media Type instead of any of the three main values, with content of that type placed between the start and end tags. If the content is held separately from the feed, you can leave the element empty and refer to the content with the src attribute, such as <content type="" src=""/>.
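As an illustrative sketch of the src form, the entry below points at a PDF document held outside the feed. The titles, the example.com addresses and the dates are assumptions made up for this example. A summary is included so that readers still have some text to show.

```xml
<entry>
  <title>Annual report published</title>
  <id>tag:example.com,2009:reports/annual</id>
  <link rel="alternate" type="text/html" href="http://www.example.com/reports/annual"/>
  <updated>2009-10-31T12:00:00Z</updated>
  <summary>The annual report is available as a PDF document.</summary>
  <!-- The content lives outside the feed, so the element is empty and src points at it -->
  <content type="application/pdf" src="http://www.example.com/reports/annual.pdf"/>
</entry>
```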
Optional Features
Optional features for the feed information itself include the xml:base attribute, which helps with relative URIs and IRIs, and a subtitle element for a strapline, which can also use the type attribute. An icon element references a small web image with a width to height aspect ratio of 1:1, and a logo element references a larger logo web image with a width to height aspect ratio of 2:1. Both of these elements contain a URI/IRI that references the web image.
You can add the fact that the feed was created by some software tool by using the generator element as: <generator uri="http://path.com/to/software/website" version="3.0.2.1">Feed Creator 3000</generator>. Rights including copyrights may be stated in a rights element.
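Put together, these optional feed-level features might look like the sketch below. The example.com addresses, the image paths and the tool name 'Example Feed Tool' are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- xml:base lets the relative references further down resolve against one address -->
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"
      xml:base="http://www.example.com/news/">
  <title>Web News</title>
  <subtitle type="text">Web related news topics</subtitle>
  <id>tag:example.com,2009:news</id>
  <!-- Resolves to http://www.example.com/news/news.atom -->
  <link rel="self" href="news.atom"/>
  <updated>2009-10-31T12:00:00Z</updated>
  <author>
    <name>FeedCorp</name>
  </author>
  <!-- Square image, 1:1 -->
  <icon>images/icon.png</icon>
  <!-- Wide image, 2:1 -->
  <logo>images/logo.png</logo>
  <generator uri="http://www.example.com/feedtool/" version="1.0">Example Feed Tool</generator>
  <rights>Copyright 2009 FeedCorp. All rights reserved.</rights>
</feed>
```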
A Feed may be associated with one or more categories using the category element, whose term attribute provides the category name. Two optional attributes may be used on this element: a scheme attribute provides an IRI for the scheme in which the category is defined (such as an OWL or RDF document), and a label attribute provides a Human-readable label for display in Feed Readers and Web Browsers.
Additional information about the author can include a uri element stating the homepage, contact page or employee page, and an email element stating the email address of the individual or the main public email address of the organisation. You can also add one or more contributor elements with the same child elements as the author element.
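A small sketch of a category element placed inside the feed element. The term, scheme address and label are invented for illustration.

```xml
<!-- term is the machine-readable name; label is what a Feed Reader would display -->
<category term="web-standards"
          scheme="http://www.example.com/categories/"
          label="Web Standards"/>
```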
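For instance, a fuller author element followed by a contributor might be sketched as below. The names and addresses are placeholders.

```xml
<author>
  <name>Jane Example</name>
  <uri>http://www.example.com/staff/jane</uri>
  <email>jane@example.com</email>
</author>
<contributor>
  <name>John Example</name>
  <email>john@example.com</email>
</contributor>
```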
Each entry may use several optional elements including their own author information using the same elements author, name and the optional uri and email as well as their own contributor information. A published element can be added for an entry providing the date and time when the entry was originally published. Rights of an individual entry may be conveyed with a rights element.
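A hedged sketch of an entry that uses these optional elements. All names, dates and addresses are assumptions made up for the example.

```xml
<entry>
  <title>Guest column: Feed formats compared</title>
  <id>tag:example.com,2009:news/columns/12</id>
  <link rel="alternate" type="text/html" href="http://www.example.com/news/columns/12"/>
  <!-- published: when the entry first appeared; updated: its most recent change -->
  <published>2009-08-20T09:00:00+01:00</published>
  <updated>2009-08-24T18:31:00+01:00</updated>
  <!-- This author applies to this entry only -->
  <author>
    <name>Jane Example</name>
    <uri>http://www.example.com/staff/jane</uri>
  </author>
  <contributor>
    <name>John Example</name>
  </contributor>
  <rights>Copyright 2009 Jane Example.</rights>
  <summary>A short comparison of feed formats.</summary>
</entry>
```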
When an entry is imported into another feed it should have a source element with the original feed's main information as the source element's children to preserve the main metadata.
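As a sketch, an entry imported from the 'Web News' feed shown earlier might carry that feed's metadata as below. The entry's own title, id and link are invented for illustration.

```xml
<entry>
  <title>Imported story</title>
  <id>tag:example.co.uk,2009:latestnews/story-1</id>
  <link rel="alternate" type="text/html" href="http://www.example.co.uk/story-1"/>
  <updated>2009-02-11T18:32:00+00:00</updated>
  <summary>A story that first appeared in another feed.</summary>
  <!-- Metadata copied from the feed the entry originally came from -->
  <source>
    <title>Web News</title>
    <id>tag:example.co.uk,2009:latestnews</id>
    <link rel="self" href="http://www.example.co.uk/latestnews.atom"/>
    <updated>2009-02-11T18:32:00+00:00</updated>
    <author>
      <name>FeedCorp</name>
    </author>
  </source>
</entry>
```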
Extensions to an Atom Feed can be added by declaring the namespace of each extra module of elements and/or attributes. The declaration can go on the feed start tag (for use in the whole feed), on an entry start tag (for use in that particular entry), on the first element of a group of extension elements, or on each extension element individually.
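The sketch below shows two of these placements. Both namespaces (http://www.example.com/ns/feed-extras and http://www.example.com/ns/location) and the elements ex:readingTime and loc:place are hypothetical. They do not belong to any real extension module and only illustrate where a declaration can go.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The ex prefix is declared on the feed start tag, so it is usable in the whole feed -->
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:ex="http://www.example.com/ns/feed-extras"
      xml:lang="en">
  <title>Web News</title>
  <id>tag:example.com,2009:news</id>
  <updated>2009-10-31T12:00:00Z</updated>
  <author>
    <name>FeedCorp</name>
  </author>
  <entry>
    <title>Extended entry</title>
    <id>tag:example.com,2009:news/1</id>
    <link rel="alternate" type="text/html" href="http://www.example.com/news/1"/>
    <updated>2009-10-31T12:00:00Z</updated>
    <summary>An entry carrying one extension element.</summary>
    <ex:readingTime unit="minutes">4</ex:readingTime>
  </entry>
  <entry>
    <title>Locally declared extension</title>
    <id>tag:example.com,2009:news/2</id>
    <link rel="alternate" type="text/html" href="http://www.example.com/news/2"/>
    <updated>2009-10-30T09:00:00Z</updated>
    <summary>An entry whose extension element declares its own namespace.</summary>
    <!-- The loc prefix is declared on the extension element itself -->
    <loc:place xmlns:loc="http://www.example.com/ns/location">Leeds</loc:place>
  </entry>
</feed>
```

Feed Readers that do not recognise an extension namespace would normally be expected to ignore those elements and still show the rest of the entry.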
An Atom Feed:
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
<title>Web News</title>
<id>tag:newscorp.com,2097-01-05:news</id>
<link rel="alternate" type="text/html" href="http://www.newscorp.com/"/>
<link rel="self" href="http://www.newscorp.com/latestnews.atom"/>
<subtitle>Web related news topics</subtitle>
<icon>http://www.example.co.uk/images/webit.png</icon>
<updated>2103-12-11T18:31:00+00:00</updated>
<author>
<name>FeedCorp</name>
</author>
<entry>
<title>Life of information</title>
<id>tag:newscorp.com,2097-01-05:news/article/84849</id>
<link rel="alternate" type="text/html" href="http://www.newscorp.com/news/article/84849"/>
<summary type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<p>The ever evolving information landscape.</p>
</div>
</summary>
<updated>2103-12-11T18:31:00+00:00</updated>
</entry>
<!-- ... -->
<entry>
<title>Down by the village</title>
<id>tag:newscorp.com,2097-01-05:news/article/84684</id>
<link rel="alternate" type="text/html" href="http://www.newscorp.com/news/article/84684"/>
<summary type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<p>Due to severe weather conditions, a collection of farm barns slid down into the village last night.</p>
</div>
</summary>
<updated>2103-11-17T09:24:00+00:00</updated>
</entry>
</feed>
Copyright ©2005-2010 Legend Scrolls and Peter Davison. All rights reserved.