Legend Scrolls

Structure your webpages with XHTML

Release: 2008-01-27

Hypertext

The life of Hypertext on the web has gone through four of the five stages to a proper webpage document structure.
Stage 1 was the initial development of HyperText Markup Language (HTML) by Tim Berners-Lee, followed by the formal specification of HTML 2 by the Internet Engineering Task Force (IETF) and then HTML 3.2 by the World Wide Web Consortium (W3C). Stage 2 began the path to Web Standards with the W3C HTML 4 and 4.01 specifications, which provided a strict edition and two other editions (transitional and frameset) for backwards compatibility with deprecated code. All three editions added internationalization and accessibility.
Stage 3 upgrades them to the realm of eXtensible Markup Language (XML) as eXtensible HyperText Markup Language (XHTML) 1.0. Stage 4 breaks up the whole XHTML specification into a collection of reusable modules, including a Legacy Module for those still using flaky elements and attributes, and allows new elements to be added by new modules. Two main specifications are built from the Modular XHTML Collection: XHTML Basic 1.0 for low-end Internet devices such as smartphones and PDAs, and XHTML 1.1 for full native XHTML environments.
Right now the fifth stage has two specifications being developed as possible replacements. XHTML2 is a proper multi-moulding, rewritten webpage structure featuring internationalization, accessibility and natural fallbacks for all non-literal-text information such as images, audio, video and animation. HTML5 is a more backwards-compatible, incremental upgrade supporting Web Applications.

Our focus is from Stage 3: XHTML 1.0 and higher.
Native XHTML can be saved in .xhtml documents or, if generated by dynamic documents such as PHP, must be sent with the MIME Media Type (Content-Type) of application/xhtml+xml. It can also be used inside generic XML documents as long as the XML parser recognizes the XHTML namespace: http://www.w3.org/1999/xhtml.

XHTML 1.0 can be processed by non-XHTML handling web browsers by following the HTML-Compatibility Guidelines (mainly all empty elements must have a space before the forward slash) and being in .html files or at least having the (usually default) Content-Type of text/html if in a dynamic document like PHP.

First, for those who still use the deprecated HTML, here are the changes needed to upgrade to HTML-Compatible XHTML 1.0. These changes arise because XHTML is written in XML, not in SGML as HTML is.

  1. As XML does care about the case of element and attribute names, XHTML is declared in lowercase,
  2. All elements that can have a start and end tag such as <p> and <li> must have both start and end tags present,
  3. All elements that only have a start tag like <img> and <br> must be Empty Elements: either have a space (if using HTML-Compatible mode) and a slash before the greater-than character such as <br /> (preferred) or be written as a normal element with absolutely no element content, not even new lines, such as <img></img>,
  4. All attribute values must be quoted - commonly with double quotes,
  5. If you use double quotes in attribute values that are surrounded by double quotes then use the &quot; entity within the attribute value instead of a literal double quote (otherwise you would be ending the attribute before you intended; plus when you go to native-XHTML the XML parser will throw errors at you),
  6. If you use apostrophes or single quotes in attribute values that are surrounded by single quotes then use the &#x0027; or &#39; entities within the attribute value instead of a literal apostrophe or single quote. It is best to surround attribute values with double quotes these days,
  7. As ampersands (&) are used to start entities then if you need to use it as an ampersand use the &amp; entity in attribute values and element content,
  8. As less-than characters (<) are used to start elements and you need to use it as a less-than character then use the &lt; entity in attribute values and element content - greater-than characters (>) do not pose a problem,
  9. Any 'minimized' attributes such as checked, multiple or noresize must be expanded to have the attribute name as the value such as checked="checked", multiple="multiple" and noresize="noresize",
  10. Start using the id attribute at least in addition to the name attribute, as the name attribute is being deprecated in favour of the global standard of ID type attributes like id and xml:id,
  11. It is best to move from using the language attribute in <script language="JavaScript"> </script> elements to the global standard and required type attribute as <script type="text/javascript"> </script>,
  12. In native-XHTML document.write() and document.writeln() still exist but do not work, because they would break the XML Document Well-Formed Rules. Instead you will have to use the HTML Document Object Model to manipulate the existing markup,
  13. Use the XHTML DOCTYPEs instead of the deprecated HTML DOCTYPEs,
  14. It is best to move from using character sets like ISO-8859-1 to the most supported global standard and Unicode supporting character set UTF-8,
  15. Element nesting must be proper - for instance: <strong>This is an <em>invalid</strong> nesting</em> as the <em> element was started within the <strong> element then the <em> element should end within the <strong> element as <strong>This is <em>valid</em> nesting</strong>,
  16. Elements <script></script>, <style></style> and <pre></pre> should carry the xml:space="preserve" attribute (the XHTML 1.0 DTDs fix it to this value) to preserve leading and trailing spaces, tabs, newlines and other 'whitespace' characters, and any multiple whitespace between words, at the XML parser level,
  17. As XHTML is an XML language, wherever the lang attribute is used, the xml:lang attribute must accompany it,
  18. Also XHTML needs to declare its XHTML namespace (xmlns="http://www.w3.org/1999/xhtml", usually in the <html> start tag), which uniquely identifies in the world of XML that the elements and attributes are from the XHTML language and are not to be confused with any other XML language element or attribute that might have the same name.
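
As a rough illustration of several of the rules above, a fragment of HTML-Compatible XHTML 1.0 might look like the sketch below. The file name, id values and text are made up for the example and are not taken from any real page.

```xml
<!-- Rules 1 and 2: lowercase names, every start tag has an end tag -->
<p id="intro" xml:lang="en-GB" lang="en-GB">
  <!-- Rules 7 and 8: ampersand and less-than are written as entities -->
  Fish &amp; chips for &lt; 5 pounds
  <!-- Rule 3: empty elements end with a space and a slash -->
  <br />
  <!-- Rules 4 and 5: quoted values, &quot; for quotes inside them -->
  <img src="images/chips.png" alt="A plate of &quot;proper&quot; chips" />
</p>
<!-- Rule 9: minimized attributes are expanded -->
<p><input type="checkbox" id="salt" name="salt" checked="checked" /></p>
<!-- Rule 15: elements close in the reverse order they were opened -->
<p><strong>This is <em>valid</em> nesting</strong></p>
```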

XHTML 1.0

Using the strict flavour of XHTML 1.0 you have the general support for webpage features such as lists, sections, headings, hyperlinks, tables, forms, images and general object inclusion.
XHTML Documents can have the XML Declaration:

<?xml version="1.0" encoding="UTF-8"?>

Generation 3 and earlier web browsers will display the XML Declaration as text, but Generation 4 and later web browsers will correctly hide it. The XML Declaration is actually optional but preferred when using native XHTML.

The XHTML DOCTYPE for version 1.0's strict flavour:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html> </html> is still the Document Element but with xmlns="http://www.w3.org/1999/xhtml" to declare the XHTML namespace and the xml:lang for XML and XHTML natural language selection. Plus the usual lang attribute for backwards compatible natural language selection and any other typical attributes such as dir="ltr" or dir="rtl" (dictates if text goes left-to-right or right-to-left).
<head> </head> and <body> </body> are the child containers. In the head element you have <title> </title> to have the descriptive webpage title.

For various associated information for the webpage you use <meta http-equiv="" content="" /> or <meta name="" content="" /> empty elements. Typically there is one to tell the HTML parser (processor) which character set is used - the mime type is not generally used, because the web server sends this information along with the webpage:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> or
<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" />

In native XHTML and in XML generally the character set is told to the XML parser by the encoding attribute in the XML Declaration.
<meta name="description" content="" /> is used with the <title> </title> element for search engine use. But don't forget to actually market the website: it's pointless creating a brochure, leaving it on a table and expecting people to come and collect it when they have no idea it exists. Rankings can't be guaranteed either (there may be 4,300 other websites marketing the same thing!).
There was also a keywords version but, due to mass keyword spamming, search engines no longer deal with the 'meta keywords' element.
What-you-see-is-what-you-get (WYSIWYG) webpage editors such as Adobe Dreamweaver, GoLive and Microsoft FrontPage can identify themselves with the generator value for the name attribute. Authors who hand-code webpages do not need this meta generator element.
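
Putting the pieces so far together, a minimal HTML-Compatible XHTML 1.0 Strict document could look like the following sketch. The title, description and paragraph text are placeholders invented for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-GB" lang="en-GB" dir="ltr">
  <head>
    <!-- Character set for HTML parsers; XML parsers use the XML Declaration -->
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>My example webpage</title>
    <meta name="description" content="A short summary of what this page is about" />
  </head>
  <body>
    <!-- Strict does not allow text directly in body, so it sits in a paragraph -->
    <p>Hello from XHTML 1.0 Strict.</p>
  </body>
</html>
```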

Other files may be associated with your XHTML Document using the <link rel="" type="" href="" title="" /> empty element. The rel attribute stands for relation (why the open community can't afford to have element and attribute names fully spelt out I have no idea); the type attribute states the content-type of the related file; href provides the URI of the associated file; and title provides advisory information to be displayed in an application menu list or similar. One such association is a Cascading Style Sheet, which applies precise and natural layout and style to your document.
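
For instance, stylesheets might be associated as in the sketch below. The file paths and the title value are assumptions made for the example; the media attribute is one way of limiting a stylesheet to printed output.

```xml
<head>
  <title>My example webpage</title>
  <!-- Main stylesheet; the path and title are placeholders -->
  <link rel="stylesheet" type="text/css" href="styles/main.css" title="Default" />
  <!-- A stylesheet that only applies when the page is printed -->
  <link rel="stylesheet" type="text/css" href="styles/print.css" media="print" />
</head>
```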

Icons

Most Generation 6 web browsers and some Generation 5 web browsers support Icons, used in Bookmarks and Microsoft Favourites. They can be either a Microsoft Windows Icon (.ico) (supported on other platforms too) or another web image such as a Portable Network Graphic (.png).

Microsoft introduced the 'favicon' by having an icon in a favicon.ico file sitting in the domain root folder, which all icon supporting browsers can download and use. But using the <link /> empty element allows you to name the file anything you want, put it anywhere within the website and have per webpage icons. Microsoft's Internet Explorer only supports the per page version using the rel="shortcut icon" relation. But this is an incorrect use of rel values, as they are a space separated list of terms: it would be referring to two relations, shortcut and icon. All icon supporting browsers nevertheless support 'shortcut icon' as a single term purely as an Internet Explorer compatibility feature. The official way to relate a per page icon is to use rel="icon". All icon supporting browsers except Internet Explorer support rel="icon".

<!-- Icon: IE Compatibility -->
<link rel="shortcut icon" type="image/x-icon" href="mywebpage.ico" />
<!-- Icon: Standards -->
<link rel="icon" type="image/x-icon" href="mywebpage.ico" />

or
<!-- Icon: IE Compatibility -->
<link rel="shortcut icon" type="image/png" href="mywebpage.png" />
<!-- Icon: Standards -->
<link rel="icon" type="image/png" href="mywebpage.png" />

These Icons are usually 16 x 16 pixels with 16 to 256 colours (so 8-bit PNGs will do). These days you can also have true colour icons as 24-bit Windows Icons, Windows XP Icons or 24-bit or higher PNG images.

Document Collections

A document collection or online book / multiple page article can be linked together by relations of top, start, home, first, prev, next, last, contents (table of contents), chapter, section, subsection, appendix, index, glossary, copyright and others.

<link rel="first" href="http://www.legendscrolls.co.uk/webstandards/" title="Web Standards" />
<link rel="prev" href="xml101.xhtml" title="A standard flexible document exchange format, XML" />
<link rel="next" href="morexhtml.xhtml" title="More on XHTML" />
<link rel="last" href="opendocument.xhtml" title="OASIS OpenDocument Format" />

The href attribute is a required attribute for <link /> empty elements.

In-Document Styles and Scripts

Styles can be put 'in-document' within the <style type="text/css"> </style> element within the head section. But as earlier web browsers may not have supported Stylesheets, they may display the style code as text. So in HTML environments you must surround the style code in a Comment:

<style type="text/css" xml:space="preserve"><!--
   /* style code */
--></style>

The Comment hides the code from older browsers while browsers that support Stylesheets still process it. But in an XML environment anything within comments will be ignored, so you must use a CDATA Section:

<style type="text/css" xml:space="preserve"><![CDATA[
   /* style code */
]]></style>

It is best to keep style separate in .css files and associate them with <link /> empty elements or, in native XHTML, preferably with XML-Stylesheet Processing Instructions. As for the xml:space="preserve" attribute, this is an added feature that XHTML enjoys from XML: it enforces the spacing to be preserved. For more details on Cascading Style Sheets see the CSS article.
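
As a sketch of the Processing Instruction approach in native XHTML, the instruction sits in the document prolog, before the <html> start tag. The stylesheet path and the text content are placeholders for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Associates an external stylesheet at the XML level -->
<?xml-stylesheet type="text/css" href="styles/main.css"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-GB">
  <head>
    <title>Styled through a processing instruction</title>
  </head>
  <body>
    <p>Native XHTML content.</p>
  </body>
</html>
```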
Scripting to provide high functionality in the web browser can also be placed in the head section much like the style element:

<script type="text/javascript" xml:space="preserve"><!--
   // script code
// -->
</script>

<script type="text/javascript" xml:space="preserve"><![CDATA[
   // script code
// ]]>
</script>
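
A pattern often seen for pages that may be served in either mode is to hide the CDATA markers inside comments of the embedded language, so that an HTML parser sees only a comment and an XML parser sees only a CDATA Section. The sketch below assumes the page may be sent as either text/html or application/xhtml+xml; the style rule and the function name are made up for the example.

```xml
<style type="text/css" xml:space="preserve">/*<![CDATA[*/
   p { margin: 1em 0; }
/*]]>*/</style>

<script type="text/javascript" xml:space="preserve">//<![CDATA[
   // The less-than sign here is why the CDATA Section is needed in XML
   function isSmaller(a, b) { return a < b; }
//]]>
</script>
```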

But as it is better to have the style separate, it is also better to have the scripting separated and refer to it using the src attribute as:

<script type="text/javascript" xml:space="preserve" src="dosomething.js"></script>

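The type attribute is required for style and script elements. The xml:space attribute is fixed to preserve for them by the XHTML 1.0 DTDs, so writing it out simply makes that explicit for parsers that do not read the DTD.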

Anything in the <head> </head> section doesn't get displayed in the web content area of web browsers. Only certain parts may be displayed by the application itself: the <title> </title> is displayed in the browser title bar and tab, and the title attributes from various <link /> empty elements or other elements may be displayed in browser or plugin menu items, sidebars, etc.

Body Section

The rest of the features, that do display in the web content area, are coded within the <body> </body> element. You can't put normal text directly within the <body> </body> in XHTML 1.0 Strict.

There are two types of element that are used within the body section: Block-level, also known as Structural Elements, which push other elements above or below the current element. Such Block-level elements are headings, sections, paragraphs, lists, tables, noscript and form boxes.
The other type is Inline-level, also known as Textual Elements, which do not push other elements and text below or above itself and so can quite comfortably sit within normal text. No Block like elements may be within any Inline elements but Inline elements can be within Block elements. Such Inline-level elements are hyperlinks, images, objects, scripts, linebreaks, emphasis and form controls.

The most used Structural Elements are the division (div) and the paragraph (p):
<div> </div> standing for division is a generic block type element. You can have other block and Inline elements and/or literal text within <div> </div> elements.

<p> </p> provides paragraphs - another block type element but with some presentational spacing around it. <p> </p> can only have Inline elements and/or literal text. To break lines of text you use the <br /> empty element. Some people prefer to use <div> </div> and separate paragraphs with two <br /> empty elements rather than use a <p> </p> element, but this is not semantic (meaningful) structure as they are not proper paragraphs.
A <br /> empty element is one of the Inline type of elements except it does push elements and text after it onto the next line.
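
A small sketch of these elements working together is shown below; the id value and the text are invented for the example.

```xml
<div id="article">
  <!-- One paragraph with a line break inside it -->
  <p>The first paragraph of the article.<br />
     This sentence starts on a new line but belongs to the same paragraph.</p>
  <!-- A second, properly separate paragraph -->
  <p>A second paragraph.</p>
</div>
```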

Other Inline elements include <i> </i> to make text italic, but as <i> </i> is presentational and not structural, <em> </em> for emphasis is a more structural replacement. Likewise <b> </b> was for making text bold, but <strong> </strong> for stronger emphasis is a more structural replacement.
<span> </span> is the generic Inline element - good for applying style directly on the text that <span> </span> surrounds by using the style global attribute.

Global Attributes

Global attributes can be used on pretty much any XHTML element. Besides the style attribute they include the class attribute, which holds a space separated list of Stylesheet classes that can be used to provide multiple layered styling. There is also the id attribute, which provides a unique identifier that can be a target for the fragment part of a URI and also for singular style rules. id values must be unique throughout the document; they can only start with a letter, underscore ( _ ) or colon (:), followed by any number of letters, underscores, colons, numbers, dashes (-) and full stops (.). To identify that the enclosed text is in a particular language you can use the xml:lang attribute and, for backwards compatibility, the lang attribute. In HTML-Compatible XHTML 1.0 if you use one you must also use the other (with native, you can just use xml:lang):

<span xml:lang="en-GB" lang="en-GB">This is English text</span>
<span xml:lang="fr-FR" lang="fr-FR">C'est texte Français</span>
<span xml:lang="el-GR" lang="el-GR">Αυτό είναι ελληνικό κείμενο</span>

This presents 'This is English text' in British English, 'This is French text' in French and 'This is Greek text' in Greek.
To provide advisory information about the text or other feature you can add a title attribute.
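
The other global attributes could be combined as in this sketch; the id, class names and text are hypothetical.

```xml
<p id="summary" class="note important" title="A short summary of the article">
  This paragraph can be reached with a URI ending in #summary and picks up
  the styles of both the <span style="font-style: italic;">note</span> and
  <span style="font-style: italic;">important</span> classes.
</p>
```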

Headings

Sections need headings (and again will it hurt the community to use proper names!). There are 6 levels of heading where level 1 is the highest (and largest - in text size perspective) and level 6 is the lowest:

<h1>A level 1 heading</h1>
<h2>A level 2 heading</h2>
<h3>A level 3 heading</h3>
<h4>A level 4 heading</h4>
<h5>A level 5 heading</h5>
<h6>A level 6 heading</h6>

Headings should follow the level order: avoid having a <h1> </h1> and then a <h3> </h3> without a <h2> </h2> appearing between them, and similarly for the other levels. The Strict DTD itself cannot enforce this ordering, so a validator will not report skipped levels. You can have Inline elements and text within the headings.

Lists

Hypertext can list items in three ways: ordered, unordered or as a definition list.
To order items you provide an ordered list group with the <ol> </ol> block element, which contains one or more <li> </li> special list-item type block elements as the items. By default each ordered item will have a number and a dot before it. To have something similar to bulleted lists use a <ul> </ul> block element around the <li> </li> elements:

<ol>
  <li>Item 1</li>
  <li>Item 2</li>
  <li>Item 3</li>
</ol>

produces:
  1. Item 1
  2. Item 2
  3. Item 3
and
<ul>
  <li>Item 1</li>
  <li>Item 2</li>
  <li>Item 3</li>
</ul>

produces:
  • Item 1
  • Item 2
  • Item 3

<li> </li> elements may have Block elements including list groups plus Inline elements and/or text within it. Stylesheets can customize the ordered or unordered markers:

<ul>
  <li>Item 1</li>
  <li>Item 2</li>
  <li>Item 3</li>
  <li>
    <ul>
      <li>Item 4.1</li>
      <li>Item 4.2</li>
      <li>Item 4.3</li>
    </ul>
  </li>
</ul>

produces:
  • Item 1
  • Item 2
  • Item 3
    • Item 4.1
    • Item 4.2
    • Item 4.3

To get rid of the fourth item marker you can add this style: style="list-style-type: none; list-style-image: none;" in the <li> start tag:

  • Item 1
  • Item 2
  • Item 3
    • Item 4.1
    • Item 4.2
    • Item 4.3
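
For reference, the markup for that last version might look like this sketch, with the style attribute placed on the list item that wraps the nested list:

```xml
<ul>
  <li>Item 1</li>
  <li>Item 2</li>
  <li>Item 3</li>
  <!-- The style removes the marker from this wrapper item only -->
  <li style="list-style-type: none; list-style-image: none;">
    <ul>
      <li>Item 4.1</li>
      <li>Item 4.2</li>
      <li>Item 4.3</li>
    </ul>
  </li>
</ul>
```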

Definition lists consist of a definition list group, <dl> </dl>, containing one or more definition terms, <dt> </dt>, each followed by a definition description, <dd> </dd>:

<dl>
  <dt>Term</dt>
  <dd>Description</dd>
</dl>

producing:
Term
Description

Hyperlinks

Hyperlinks, more commonly known as links, allow you to get from one webpage to another:

<a href=""> </a>

or, in the old fashioned way, identify a part of the document:

<a name=""> </a>

(known as the fragment identifier). The a stands for anchor because the fragment identifier was an anchor for URIs to take you to a specific part of the existing webpage. The modern way of representing a fragment identifier is to just add an id attribute on an element, so any existing element in the document can be a target for a fragment URI. XHTML 1.0 retains <a name=""> </a> only for backwards compatibility; other XML languages and XML in general purely use an attribute of type ID.
To get to other documents the first form is used. The href attribute's value is a URI to the other webpage or document:

<a href="there.xhtml">To go there</a>
<a href="there.xhtml#here">Taking you here</a>
<a href="http://www.w3.org/TR/xhtml1/" title="W3C spec">XHTML 1.0</a>

You can add the title attribute to give extra popup information about where the link is going. <a> </a> is an Inline element. To aid in navigating and activating links using the keyboard, as part of standard access and accessibility, you can use the accesskey attribute on <a href=""> </a> elements. This attribute takes one character from the keyboard; pressing alt (or control on a Mac) together with that character then focuses on the link.
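
As a sketch of the id-based fragment identifier together with accesskey, the first two lines below would sit in there.xhtml and the link in another document. The heading text and the choice of the h key are assumptions made for the example.

```xml
<h2 id="here">The part of the page being targeted</h2>
<p>Some text in that part.</p>

<!-- In another document: jump straight to the element with id="here" -->
<p><a href="there.xhtml#here" accesskey="h" title="Jump to the targeted part">Taking you here</a></p>
```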

As you can have relative URIs in the href attribute, you can help the browser or whatever is displaying the webpage to construct absolute URIs by adding a <base href="" /> empty element in the <head> </head> section, where the href attribute holds the absolute URI leading up to where your relative URI would start:

<base href="http://www.meexample.com/levels/overthere/" />
    ...
<a href="hereweare.xhtml">go here</a>

Alternatively, in native XHTML only, you could use the standard xml:base attribute that XML provides; it is a more flexible version of <base href="" /> and can be used on any element:

<p xml:base="http://www.meexample.com/levels/overthere/">
  <a href="hereweare.xhtml">go here</a>
</p>

Tables

The Tables Specification was originally developed by the Internet Engineering Task Force (IETF) after their original HTML 2.0 to add the ability to display tabular data values, and has since been integrated into the HTML and XHTML specifications. A simple table is produced by a <table> </table> containing usually more than one <tr> </tr> for table rows, and each table row contains usually more than one <td> </td> for table data cells. These elements are block elements, except that <td> </td> elements can sit next to each other (a kind of 'Inline-Block' if you will). You can have block elements, Inline elements and text within the table data cells. Because tabular data has headings, you can have <th> </th> for table headings within the table rows:

<table>
  <tr>
    <th scope="col">Task</th>
    <th scope="col">Entries</th>
    <th scope="col">Identified By</th>
  </tr>
  <tr>
    <td>Data</td>
    <td>256</td>
    <td>set_3</td>
  </tr>
  <tr>
    <td>Analysis</td>
    <td>3027</td>
    <td>set_52</td>
  </tr>
  <tr>
    <td>Collate</td>
    <td>48</td>
    <td>group_187</td>
  </tr>
</table>

produces:
Task      Entries  Identified By
Data      256      set_3
Analysis  3027     set_52
Collate   48       group_187

Adding a summary attribute to the <table> start tag gives a general roundup of what this data table is for; it could be read out by a screen reader to a person who is partially sighted or blind. A width attribute can be added to the <table> start tag to set the width of the whole table; the width attribute on each <td> and <th>, which sets the field's width, is only available in the Transitional flavour. You can add a border by adding this attribute border="1" in the <table> start tag.

But Cascading Style Sheets provide a more flexible way of styling borders for tables and other elements. The scope attribute tells the browser that the table heading is for the current column (scope="col") or current row (scope="row").

Unfortunately webpage authors have abused tables by using them for general visual document layout, which is not what tables are for. Laying your page out with tables actually concretes the layout (environments that don't support tables or don't have much screen, like a smartphone, will display things in a corrupted way) and slows down the downloading and processing of the webpage.
A lot of screen readers do not have the concept of multiple columns. Take for instance a three column layout table with a width of 150 (pixels), each table cell having a width of 50 (pixels), and this text in the first column:

This is a file full of letters that will detail the ontology of the topic.

In the second column:

Presenting the new Firefox browser featuring tabbed-browsing, web-standards support and a high multiple operating system, application platform.

And the third column with:

Weather patterns increase producing a landslide taking several farm barns down into the village valley.

It would be read out as:

This is a Presenting the Weather file full new Firefox patterns of letters browser featuring increase that will tabbed-browsing, producing detail web standards a the support and a landslide ontology high multiple taking of the operating system, several topic. application farm platform. barns down into the village valley.

Confused? What if you were using a text-to-speech facility giving you important traffic or destination information while driving? You would curse the author who concreted the layout of the information page with tables. Cascading Style Sheets provide a much quicker and more flexible way of laying things out, including the ability to switch to an alternate layout without changing the physical webpage, say by switching to a print or speech stylesheet.

Captions can be stated with <caption> </caption> placed immediately after the <table> start tag.
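
A short sketch of a captioned table follows; the summary and caption wording are invented for the example, while the data reuses the values from the earlier table.

```xml
<table summary="Number of entries recorded for each task">
  <!-- The caption is the first child of the table element -->
  <caption>Task entries</caption>
  <tr>
    <th scope="col">Task</th>
    <th scope="col">Entries</th>
  </tr>
  <tr>
    <th scope="row">Data</th>
    <td>256</td>
  </tr>
</table>
```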

More complex tabular data can be handled by giving the <th> </th> elements an id attribute, in addition to or instead of the scope attribute, and giving every <td> </td> element (and possibly some <th> </th> elements) a headers attribute. The value of the headers attribute is a space separated list of the id values of the <th> </th> elements that the particular cell falls under.

<table summary="A complex table example" border="1">
  <tr>
    <th id="task">Task</th>
    <th id="entry">Entries</th>
    <th id="idby">Identified By</th>
  </tr>
  <tr>
    <th id="data" headers="task">Data</th>
    <td headers="entry data">256</td>
    <td headers="idby data">set_3</td>
  </tr>
  <tr>
    <th id="sys" headers="task">Analysis</th>
    <td headers="entry sys">3027</td>
    <td headers="idby sys">set_52</td>
  </tr>
  <tr>
    <th id="collate" headers="task">Collate</th>
    <td headers="entry collate">48</td>
    <td headers="idby collate">group_187</td>
  </tr>
</table>

Optional <thead> </thead> and <tfoot> </tfoot> elements and one or more <tbody> </tbody> elements can be used to organise complex tables. In the markup the <tfoot> </tfoot> must come before the first <tbody> </tbody>, even though it is displayed at the bottom of the table. If you are not using any <thead> </thead> or <tfoot> </tfoot> elements then in XHTML you don't have to use the <tbody> </tbody> (in the HTML spec you do, but not in the XHTML specs).

<table summary="A complex table with sections example" border="1">
  <thead>
    <tr>
      <th scope="col">Task</th>
      <th scope="col">Entries</th>
      <th scope="col">Identified By</th>
    </tr>
  </thead>
  <tfoot>
    <tr>
      <td>DomainA</td>
      <td>1</td>
      <td>Cat_11</td>
    </tr>
    <tr>
      <td>DomainB</td>
      <td>2</td>
      <td>Cat_23</td>
    </tr>
    <tr>
      <td>DomainC</td>
      <td>1</td>
      <td>GroupCollection_448</td>
    </tr>
  </tfoot>
  <tbody>
    <tr>
      <td>Data</td>
      <td>256</td>
      <td>set_3</td>
    </tr>
    <tr>
      <td>Analysis</td>
      <td>3027</td>
      <td>set_52</td>
    </tr>
    <tr>
      <td>Collate</td>
      <td>48</td>
      <td>group_187</td>
    </tr>
  </tbody>
  <tbody>
    <tr>
      <td>Data</td>
      <td>48</td>
      <td>set_400</td>
    </tr>
    <tr>
      <td>Analysis</td>
      <td>322</td>
      <td>set_88</td>
    </tr>
    <tr>
      <td>Collate</td>
      <td>1353</td>
      <td>group_107</td>
    </tr>
  </tbody>
</table>

Column handling is by default done by the environment, such as a web browser, based on the row and other cell information. But you can specify column related details using one or more <col span="" width="" /> empty elements, or a <colgroup span="" width=""> </colgroup> which may contain <col span="" width="" /> empty elements. The span attribute specifies how many columns are affected and the width attribute states the standard column width for that many columns. Style can be attached via the style, class and id attributes.

A rowspan attribute on any table cell and/or table header can expand the cell over many rows and colspan over many columns - rowgroup and colgroup values are available to the scope attribute if you are using thead, tbody or tfoot in a complex table.
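
The column and spanning attributes could be combined as in the sketch below. The team name, widths and summary text are assumptions made for the example; the data values reuse those from the earlier tables.

```xml
<table summary="Entries per task, grouped by team" border="1">
  <!-- First column group: one column, 120 pixels wide -->
  <colgroup span="1" width="120"></colgroup>
  <!-- Second column group: two columns, 80 pixels wide each -->
  <colgroup>
    <col span="2" width="80" />
  </colgroup>
  <tr>
    <th scope="col">Team</th>
    <!-- This heading stretches over both columns of the second group -->
    <th scope="colgroup" colspan="2">Task and entries</th>
  </tr>
  <tr>
    <!-- This heading stretches down over the two rows that follow -->
    <th scope="row" rowspan="2">Team A</th>
    <td>Data</td>
    <td>256</td>
  </tr>
  <tr>
    <td>Analysis</td>
    <td>3027</td>
  </tr>
</table>
```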

You can categorize data by using the axis attribute on any <td> </td> and <th> </th> elements:

<table summary="A complex table example" border="1">
  <tr>
    <th id="task">Task</th>
    <th id="entry">Entries</th>
    <th id="idby">Identified By</th>
  </tr>
  <tr>
    <th id="data" headers="task">Data</th>
    <td headers="entry data">256</td>
    <td headers="idby data" axis="Collections">set_3</td>
  </tr>
  <tr>
    <th id="sys" headers="task">Analysis</th>
    <td headers="entry sys">3027</td>
    <td headers="idby sys" axis="Collections">set_52</td>
  </tr>
  <tr>
    <th id="collate" headers="task">Collate</th>
    <td headers="entry collate">48</td>
    <td headers="idby collate" axis="Collections">group_187</td>
  </tr>
</table>

Forms

Forms allow visitors to provide information or choose options and receive a customised response, such as submitting a request form, filling out a profile in a membership scheme or shopping online.
To provide a form you use a <form action="" method=""> </form> block element. The action attribute takes a URI for the destination of the form: an email address, a dynamic page like a PHP script, or a static/dynamic webpage with JavaScript. Data can be transferred in a few ways. Most notable are the get value, which adds the information to the end of the URI as a query string, and the post value, which sends the data in the hidden body of the webpage request. As the form controls are actually Inline elements, the Strict flavour of XHTML 1.0 requires a block element within the <form> </form> element to hold them. So you could use a <div> </div> or surround each field with a <p> </p> element. If you want to provide a title-bordered grouping of the form controls use <fieldset> </fieldset> and <legend> </legend> as:

<form action="response.php" method="post">
  <fieldset><legend>The group title text</legend>
    <!-- form controls -->
  </fieldset>
</form>

Most form controls are provided as an <input type="" id="" name="" value="" /> empty element. The type attribute dictates what kind of form field it is (text, password, checkbox, radio) or what button type it is (button, submit, image, reset). The id attribute is the form field's fragment identifier, used to bind its <label> </label> to it, and can also be used for styling or scripting purposes. Wrapping the form control's label text in a <label> </label> and binding it with the for attribute to the form control's id not only gives an association to screen readers; clicking on or navigating to the label also automatically moves focus to the form field itself (only Generation 5 and higher browsers support this). The name attribute is used for generating the variable name in the dynamic script that handles the response and can also be used for scripting purposes. To pre-populate the form field you use the value attribute. Not all types use all the attributes and some use extra attributes.
Single line text fields don't need the value attribute but need the others:

<label for="yaname">Your Name:</label> <input type="text" id="yaname" name="name" value="Your name goes here" />

Password fields are similar, but each character typed is only shown as a star or circle; they usually don't use the value attribute but do need the others:

<label for="pass">Your Suggested Password:</label> <input type="password" id="pass" name="password" />

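Checkboxes handle 'if true or false' fields. When ticked, a checkbox without a value attribute gives a value of 'on' to the script; when unticked, it is left out of the submitted data altogether, so the script sees it as missing. A value attribute replaces 'on' with your own text, so leave it out if the script tests for 'on':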

<input type="checkbox" id="wantthis" name="wantthis" /> <label for="wantthis">Do you want this?</label>

Checkboxes can be initially ticked or checked:

<input type="checkbox" id="wantthis" name="wantthis" checked="checked" /> <label for="wantthis">Do you want this?</label>

Radio buttons provide 'this or this or this' functionality. Every button in a group has the same value for its name attribute, while each has its own id and value:

<input type="radio" id="that1" name="that" value="picked this 1" /> <label for="that1">Pick this one</label>
<input type="radio" id="that2" name="that" value="picked this 2" /> <label for="that2">Or pick this one</label>

One of the radio buttons can also be preset or checked:

<input type="radio" id="that1" name="that" value="picked this 1" /> <label for="that1">Pick this one</label>
<input type="radio" id="that2" name="that" value="picked this 2" checked="checked" /> <label for="that2">Or pick this one</label>

Buttons don't use the <label> </label> element; the value attribute provides their label instead.
General purpose buttons can use either the <input /> empty element or the <button> </button> element, which allows a much richer (markup or styled) label:

<input type="button" id="doit" onclick="doit();" value="Do It" /> or
<button id="doit" onclick="doit();"><img src="images/action.png" alt="" /><br /><span class="fancyLabel">Do It</span></button>

The standard submit button:

<input type="submit" id="submitit" value="Send It" />

Or an image submit button, which uses a src attribute for the source of the image and an alt attribute for alternative text in environments that can't display images and for screen readers to read out to people who can't see the image:

<input type="image" id="isubmitit" src="images/submit.png" alt="Submit It" />

And the less used reset button:

<input type="reset" id="startagain" value="Start Again" />

To use multiple line text fields you use the <textarea id="" name="" rows="" cols=""> </textarea> element:

<label for="yadayada">Description:</label> <textarea id="yadayada" name="desc" rows="5" cols="20">Your description goes here</textarea>

Selecting from a group of options can be done with a <select id="" name=""> </select> element, usually containing more than one <option value=""> </option> element, which creates a drop-down selection box:

<select id="box" name="collection">
  <option value="orange">Orange Chocolate</option>
  <option value="coconut">Coconut Chocolate</option>
  <option value="turkish">Turkish Delight</option>
</select>

One option from the selection can be pre-selected:

<select id="box" name="collection">
  <option value="orange">Orange Chocolate</option>
  <option value="coconut" selected="selected">Coconut Chocolate</option>
  <option value="turkish">Turkish Delight</option>
</select>

It is possible to select more than one option at a time by adding a multiple="multiple" attribute to the <select> start tag. If you are using PHP as the target of the form then put a pair of square brackets ([]) on the end of the value of the name attribute to indicate that you are sending multiple values for this variable:

<select id="box" name="collection[]" multiple="multiple">
  <option value="orange">Orange Chocolate</option>
  <option value="coconut">Coconut Chocolate</option>
  <option value="turkish">Turkish Delight</option>
</select>

Plus you can group several options together within the select by wrapping the options in <optgroup label=""> </optgroup> elements (most browsers today support this):

<select id="box" name="collection">
  <optgroup label="Chocolate">
    <option value="orange">Orange Chocolate</option>
    <option value="coconut">Coconut Chocolate</option>
    <option value="turkish">Turkish Delight</option>
  </optgroup>
  <optgroup label="Juice">
    <option value="orangejuice">Orange</option>
    <option value="apple">Apple</option>
    <option value="cranberry">Cranberry</option>
  </optgroup>
</select>

A size attribute can be used on some fields. On <input /> elements of type text and password it sets how many characters are visibly shown - a hint for the width of the field. On the <select> </select> element it states how many options are visible - turning the field into a list box.
An accesskey attribute can also be used on <input />, <button> </button>, <textarea> </textarea>, <label> </label> and <legend> </legend> elements to aid in navigating the fields and activating the buttons.
To disable a form control use the disabled="disabled" attribute. To make <input /> elements of type text and password and <textarea> </textarea> elements uneditable, add a readonly="readonly" attribute to them instead.
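
The attributes above could be combined as in the following sketch. The field names, values and access keys are invented for illustration only.

```html
<form action="response.php" method="post">
  <fieldset><legend>Account details</legend>
    <p>
      <!-- size="30" hints at a field roughly 30 characters wide -->
      <label for="fullname" accesskey="n">Your Name:</label>
      <input type="text" id="fullname" name="fullname" size="30" />
    </p>
    <p>
      <!-- readonly: shown and submitted, but cannot be edited -->
      <label for="memberid">Member ID:</label>
      <input type="text" id="memberid" name="memberid" value="12345" readonly="readonly" />
    </p>
    <p>
      <!-- size="3" turns the drop-down into a list box showing three options -->
      <label for="flavour">Flavour:</label>
      <select id="flavour" name="flavour" size="3">
        <option value="orange">Orange Chocolate</option>
        <option value="coconut">Coconut Chocolate</option>
        <option value="turkish">Turkish Delight</option>
      </select>
    </p>
    <p>
      <!-- disabled: cannot be activated until a script removes the attribute -->
      <input type="submit" id="sendit" value="Send It" disabled="disabled" accesskey="s" />
    </p>
  </fieldset>
</form>
```

How an access key is triggered (which modifier key to hold) depends on the browser and operating system, so it is worth stating the key in the visible label text.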

Images

To bring in pixel-based images such as Portable Network Graphics (PNG) you can use the <img src="" alt="" /> Inline empty element. The src attribute (short for source) takes a URI referencing the image, and the alt attribute provides simple alternative text, which matters most when there is critical text within the image. The text in the alt attribute is used by environments such as web browsers that can't handle images or can't find the image, and by screen readers to read out to visually impaired or blind users. The width and height attributes can be used to specify the dimensions of the image, but this can be done more flexibly by stylesheets unless markup is required. A name attribute can be used for scripting purposes, but it is being deprecated in favour of the id attribute, which you can already use now.
Unfortunately the environment has to guess what the image is from the file extension, and if the environment doesn't support the image format or can't find it then the only other option is the simple text from the alt attribute - a design flaw that is being looked at in Stage 5 (XHTML 2.0).

<img src="images/mescannedpic.png" alt="Scanned picture on 2005-06-07" class="dropright" />

If the image is purely for decoration and has no text that the viewer needs to know or the text that is associated with the image is next to the image then you don't need to have a value in the alt attribute - but you still need an alt attribute present as alt and src attributes are required for <img /> empty elements.

<img src="images/landscape.png" alt="" style="width: 50px; height: 23px;" />
<a href="gohere.xhtml"><img src="images/menuicon.png" alt="" /> Go Here</a>

Rather than the whole image being used as an image link you can make it a client-side image map by adding a usemap attribute and defining a map with <map id="" name=""> </map> containing one or more <area shape="" coords="" href="" alt="" /> empty elements or <a shape="" coords="" href=""> </a> hyperlink elements. The value of the image's usemap attribute must be a fragment URI referring to the map's name (deprecated) or id attribute. In the area empty element or a element the shape attribute dictates what shape the linked region of the image has. Values can be rect, circle, poly or default to use the whole image. The value of the coords attribute (coordinates) depends on the shape and is measured in pixels from the top-left of the image. With shape="rect", coords="0,0,25,25" states that the link runs from the absolute top-left of the image to 25 pixels across and down. With shape="circle", coords="15,15,5" is a circular region whose center is 15 pixels across and down from the top-left of the image and whose radius is 5 pixels. With shape="poly", coords="2,2,4,4,11,11,13,11,5,17,2,8,2,2" provides a polygon-shaped region following those paired coordinates.
The href attribute provides the URI path for the area's link, or the nohref="nohref" attribute states that the region has no link. The alt attribute is required to give alternative text for that part of the image.

<img src="mapped.png" usemap="#mapit" alt="A Client-side image map taking you here, there and everywhere" />
<map id="mapit" name="mapit">
  <area shape="rect" coords="0,0,5,5" href="here.xhtml" alt="Going here" />
  <area shape="circle" coords="15,15,5" href="there.xhtml" alt="Going there" />
  <area shape="poly" coords="32,32,34,34,41,41,43,41,35,37,32,38,32,32" href="everywhere.xhtml" alt="Going everywhere" />
</map>

It is recommended to have text links as an alternative to help in accessibility of the webpage.
For more complex image maps you can use server-side image maps simply by adding an ismap="ismap" attribute on the image (only if used as an image link) as:

<a href="herethereeverywhere.php"><img src="mapped.png" ismap="ismap" alt="A Server-side image map taking you here, there and everywhere" /></a>

For accessibility reasons client-side image maps are preferred over server-side image maps.

Objects

The <object> </object> Inline element provides a generic inclusion feature that can bring in images, other webpages, Java Applets, audio, video and multimedia such as Flash Movies. If used for images you can use the usemap and ismap attributes to provide image maps. Variable information can be passed to the object via <param name="" value="" /> empty elements within the <object> </object> element. If the object isn't supported or can't be accessed then any text or markup inside the element, other than the param empty elements, is used as a free-form fallback.
In the <object> </object> element you use the data attribute to reference the object as a URI and the type attribute to state the MIME Media Type that is associated with the object. The classid attribute can be used for certain scheme-dependent access to the object, and support for it varies. Microsoft Internet Explorer uses the classid attribute with the clsid: scheme to reference ActiveX Objects. Classic Netscape will think you are trying to load an OLE Object. Gecko based products and Opera support the java: scheme in classid; for any other classid they ignore the whole <object> </object> element and use the fallback. They do, however, support this element using the data and type attributes. Other attributes that can be used include width, height, style, class, id, etc.
For example:

Flash Movies

<!-- Compact code to bring in Flash into MS Internet Explorer 4+, Netscape 4, Netscape 6+, Opera, Mozilla Firefox, SeaMonkey, Apple Safari, Konqueror and other Generation 4+ web browsers -->
<object type="application/x-shockwave-flash" data="myflashmovie.swf" width="400" height="400" id="myflashmovie">
  <param name="movie" value="myflashmovie.swf" />
  <div class="transcript">Welcome to Graphics Interchange. We will present<br />...
  </div>
</object>

or

<!--[if !ie]> Extended code to bring in Flash into MS Internet Explorer 5+, Netscape 4, Netscape 6+, Opera, Mozilla Firefox, SeaMonkey, Apple Safari, Konqueror and other Generation 5+ web browsers -->
<object type="application/x-shockwave-flash" data="c.swf" width="400" height="400" id="cflashwebstandards">
<!--<![endif]-->
<!--[if ie]>
<object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=9,0,115,0" width="400" height="400" id="cflashinie">
   <param name="movie" value="c.swf" />
<![endif]-->
   <a href="http://www.macromedia.com/go/getflashplayer">Get Macromedia Flash Player (9.0.115.0)</a>
<!--[if ie]>
</object>
<![endif]-->
<!--[if !ie]>-->
</object>
<!--<![endif]-->

Current support for Flash: over 97% of China, India, South Korea, Russia, Taiwan and over 99% of US/Canada, UK, Europe, Japan support Flash 7 including Flash Video (using Sorenson Spark Video codec).
Over 94% of China, India, South Korea, Russia, Taiwan and over 98% of US/Canada, UK, Europe, Japan support Flash 8 including Flash Video (using Sorenson Spark or On2 VP6 Video codec).
Just under 90% of China, India, South Korea, Russia, Taiwan and over 93% of US/Canada, UK, Europe, Japan support Flash 9 including Flex 2 Flash-based applications.

Java Applets

<!--[if !ie]> Code to bring in Java Applets into MS Internet Explorer 4+, Netscape 6+, Opera, Mozilla Firefox, SeaMonkey, Apple Safari, Konqueror and other Generation 5+ web browsers -->
<object classid="java:Interact.class" type="application/x-java-applet" archive="Interact.jar" height="300" width="450">
<!--<![endif]-->
<!--[if ie]>
<object classid="clsid:8AD9C840-044E-11D1-B3E9-00805F499D93" codebase="http://java.sun.com/update/1.6.0/jinstall-1_6_0-windows-i586.cab" height="300" width="450">
   <param name="code" value="Interact" />
   <param name="archive" value="Interact.jar" />
<![endif]-->
   <a href="http://java.sun.com/products/plugin/downloads/index.html">Get the latest Java Plug-in here.</a>
<!--[if ie]>
</object>
<![endif]-->
<!--[if !ie]>-->
</object>
<!--<![endif]-->

Quicktime media

<!--[if !ie]> Code to bring in Quicktime media to MS Internet Explorer 5+, Netscape 6+, Opera, Mozilla Firefox, SeaMonkey, Apple Safari, Konqueror and other Generation 5+ web browsers -->
<object type="video/quicktime" data="quicktime.mov" width="400" height="400">
  <param name="controller" value="true" />
<!--<![endif]-->
<!--[if ie]>
<object classid="clsid:02BF25D5-8C17-4B23-BC80-D3488ABDDC6B" codebase="http://www.apple.com/qtactivex/qtplugin.cab" width="400" height="400">
   <param name="src" value="quicktime.mov" />
   <param name="controller" value="true" />
<![endif]-->
   <div class="transcript">Welcome to Graphics Interchange. We will present<br />...
   </div>
<!--[if ie]>
</object>
<![endif]-->
<!--[if !ie]>-->
</object>
<!--<![endif]-->

Images

<!-- Code to bring in images to Netscape 6+, Opera, Mozilla Firefox, SeaMonkey, Apple Safari, Konqueror and other Generation 6 web browsers -->
<object type="image/png" data="flowchart.png" width="200" height="200">
   <span class="articleText">A flow chart depicting...</span>
</object>

Webpages

<!-- Code to bring in webpages to MS Internet Explorer 5+, Netscape 6+, Opera, Mozilla, Firefox, Apple Safari, Konqueror and other Generation 5+ web browsers -->
<object type="application/xhtml+xml" data="listings.xhtml" width="400" height="550">
   <a href="listings.xhtml">Listings Page</a>
</object>

The <!--[if ie]>   <![endif]--> blocks are MS Internet Explorer 5+ Conditional Comments - these are safe to use as non MS Internet Explorer environments will only see them as HTML/XML comments.
Don't forget that any text, speech or other relevant information must have an alternative for accessibility purposes, such as a transcript of the interview or presentation. The actual media, such as the Flash Movie or Java Applet, must itself be accessible too.
It is also best to provide a link to the appropriate plugin for users to download and install if they do not have it.

Body Scripting

Scripting may be used within the body section too. But make sure there is a <noscript> </noscript> block element somewhere near the <script> </script> element for accessibility considerations. <noscript> </noscript> needs a Block-level element within it (usually a <div> </div> element). The static content within it should either provide the same information or describe what the dynamic function would provide.

<noscript>
   <div>
     Interactive list, feature requires JavaScript which is disabled or not supported.
   </div>
</noscript>

A defer="defer" attribute can be used on <script> start tags in the head or body sections to hold off processing the script code until the complete XHTML document has been loaded.
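
A minimal sketch of a deferred external script follows; interactive.js is a hypothetical file name. The specification treats defer as a hint to the browser, so it is safest to assume some browsers will run the script immediately anyway.

```html
<!-- interactive.js is a hypothetical script file used for illustration -->
<script type="text/javascript" src="interactive.js" defer="defer"></script>
```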

One thing to note is that in native XHTML the document.write() and document.writeln() methods still exist but do not work. This is because writing raw markup into the document while it is being parsed would break the XML well-formedness rules. Instead you will have to use the Document Object Model (DOM) to manipulate the existing markup.
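
As a sketch of the DOM approach, the fragment below adds a paragraph to an existing element instead of calling document.write(). The listing id and the text are made up for this example, and it assumes a browser that supports the namespace-aware createElementNS() method, which native XHTML environments generally do.

```html
<div id="listing"></div>
<script type="text/javascript">
//<![CDATA[
  // The XHTML namespace, so the new element is a real XHTML paragraph
  var XHTML_NS = "http://www.w3.org/1999/xhtml";
  // "listing" is a hypothetical id for the element that receives the content
  var target = document.getElementById("listing");
  var para = document.createElementNS(XHTML_NS, "p");
  para.appendChild(document.createTextNode("Added through the DOM."));
  target.appendChild(para);
//]]>
</script>
```

The script sits after the div so that the element already exists when the code runs. The CDATA markers, hidden behind script comments, keep characters such as < and & inside the script from being read as markup.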

A collection of 'event attributes' is also available for directly attaching script functions and small lines of script code to elements, such as the onclick attribute on <input type="button" /> and <button> </button> elements. This is a device-dependent event - it requires a mouse click or something similar to activate the code that is in the attribute value. Other device-dependent events include ondblclick, onmousedown, onmouseup, onmouseover, onmouseout and onmousemove for activating scripts when double clicking, part clicking, moving over or off the element or just generally moving the mouse around within the element. Plus onkeypress, onkeyup and onkeydown are device-dependent events requiring a keyboard action.
It is best to use device-independent events as these do not need a particular device, for instance in case you are using a handheld and do not have a mouse. If you need to use one of the device-dependent events then most information access ports have some sort of keyboard, keypad or touch-sensitive palette, so onkeypress, onkeyup and onkeydown can be used.
Device-independent events include onload and onunload, mainly attached to the <body> </body> element, and onsubmit and onreset attached to a <form> </form>. Focusing events can be handled on links, labels and form controls by the onfocus and onblur attributes. A script can run when you simply select text within a text field (<input type="text" />, <input type="password" />, <textarea> </textarea>) by attaching an onselect attribute. When a text field or a selection box loses focus after its text has changed or a different option has been chosen, the onchange attribute on those form controls can trigger a script.
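
The device-independent events could be attached as in this sketch. The showHint(), hideHint() and checkName() functions, the namehint id and the fullname field are all hypothetical names invented for the example.

```html
<script type="text/javascript">
//<![CDATA[
  // Hypothetical helpers for this sketch
  function showHint(id) { document.getElementById(id).style.display = "block"; }
  function hideHint(id) { document.getElementById(id).style.display = "none"; }
  function checkName(form) {
    // Returning false from onsubmit cancels the submission
    return form.elements["fullname"].value !== "";
  }
//]]>
</script>
<form action="response.php" method="post" onsubmit="return checkName(this);">
  <p>
    <label for="fullname">Your Name:</label>
    <input type="text" id="fullname" name="fullname" onfocus="showHint('namehint');" onblur="hideHint('namehint');" />
  </p>
  <p id="namehint" style="display: none;">Enter the name you want shown on your profile.</p>
  <p><input type="submit" id="submitit" value="Send It" /></p>
</form>
```

Because onfocus, onblur and onsubmit fire however the user reaches the field or submits the form, the hint and the check work the same with a mouse, a keyboard or a handheld keypad.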

Others

Horizontal rules can be rendered by having a <hr /> empty element but some screen readers interpret horizontal rules as a long sequence of underscores. It is best to style the bottom border or top border of an element to provide a similar presentational effect.
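
One possible way to get a similar look without <hr /> is sketched below; the colour and spacing values are arbitrary choices for illustration, and in practice they would normally live in a stylesheet rather than a style attribute.

```html
<!-- A bottom border on the first block stands in for a horizontal rule -->
<div style="border-bottom: 1px solid #999999; padding-bottom: 1em; margin-bottom: 1em;">
  <p>End of the first section.</p>
</div>
<div>
  <p>Start of the next section.</p>
</div>
```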

Most of the following elements are Inline elements:
<abbr title=""> </abbr> provides the expanded form in the title attribute of the abbreviated form within the element.
<acronym title=""> </acronym> provides the expanded form in the title attribute of the acronym within the element.
<dfn title=""> </dfn> provides the definition in the title attribute of the term within the element - a singular version of the definition list.
<code> </code> displays the text as if it was programming code style.
<kbd> </kbd> displays the text as if it was keyboard text style.
<em> </em> (as mentioned before) displays the text in emphasis (and visually in italics).
<strong> </strong> (as mentioned before) displays text in stronger emphasis (and visually bolded).
<samp> </samp> displays the text as if it was sample text.
<var> </var> displays text as if variable names.
<cite> </cite> displays the text as if it was a citation style text.
<address> </address> is a Block-level element that displays text as if on an address label.
<del datetime=""> </del> puts a line through the text - the datetime attribute has the date and time of the 'deletion' and the element can have a cite attribute with a URI to a citation.
<ins datetime=""> </ins> marks the text as newly inserted - the datetime attribute has the date and time of the 'insertion' and the element can have a cite attribute with a URI to a citation.
<pre xml:space="preserve"> </pre> is a generic Block-level element that preserves any spacing, uses a simple font and can't have Block-level or pre elements within it.
<sup> </sup> displays the text superscripted.
<sub> </sub> displays text subscripted.
<q> </q> surrounds the text in quotes (quote characters depends on the natural language which can be changed by xml:lang="" and lang="").
And finally <blockquote> </blockquote> is a Block-level element that pads the left and right of the paragraph like a quote block and must only be used for quote blocks.
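
Several of these elements might be combined as in the sketch below. The wording, dates and values are placeholders invented for the example.

```html
<p>The <abbr title="World Wide Web Consortium">W3C</abbr> recommends
  <acronym title="Portable Network Graphics">PNG</acronym> for pixel based images.
  Press <kbd>F5</kbd> to reload and the script sets <var>count</var> back to <code>0</code>.</p>
<p>Water is H<sub>2</sub>O and the room is 25 m<sup>2</sup>.
  The meeting is on <del datetime="2008-01-20T09:00:00Z">Monday</del>
  <ins datetime="2008-01-20T09:00:00Z">Tuesday</ins>.</p>
<p>As the reviewer put it, <q xml:lang="en" lang="en">short quoted phrase goes here</q>.</p>
<!-- In the Strict flavour a blockquote needs Block-level content such as a paragraph -->
<blockquote>
  <p>A longer quoted paragraph goes here.</p>
</blockquote>
```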

Cascading Style Sheets (CSS) are to be used for any further styling or to replace any presentation markup.
Continue to More on XHTML to find out about the other flavours of XHTML 1.0 and further XHTML versions.

Copyright ©2005-2008 Legend Scrolls and Peter Davison.
The Globe icon from Crystal Project Icons: LGPL, Copyright © Everaldo.
All rights reserved.