From HTML to HTML 5
Introduction
This tutorial is for those who know HTML (HyperText Markup Language) and/or HTML-Compatible XHTML (eXtensible HyperText Markup Language) 1.0 and are interested in upgrading to HTML5.
It begins with a little background on why HTML5 came about, then gives the steps for upgrading old HTML to the subset of HTML5 that web browsers already support.
Background
HTML (HyperText Markup Language), up to version 4.01, is defined as an application of SGML (Standard Generalized Markup Language). SGML is the mother of all markup languages: a massive language for describing markup languages. It has been used to define languages such as HTML and DocBook.
But HTML was specified in theory, and the specification had to be modified in practice in order to implement it in programs like web browsers.
The World Wide Web Consortium (W3C) was taking the HTML specification down the road to XHTML (eXtensible HyperText Markup Language) version 2, which developers of web browsers and other programs found impossible to implement, and which carried a massive learning curve for both browser developers and web authors.
HTML5 is a proposal from the WHATWG (Web Hypertext Application Technology Working Group). This working group was founded in 2004 by people from the Mozilla Foundation, Opera Software and Apple after a W3C workshop on web applications and compound documents. The WHATWG didn't like where the W3C was taking XHTML and HTML: some concepts were an improvement, but they did not address recent web features such as Web Applications and non-document-based webpages.
Since then, the W3C has recognized HTML 5 and the WHATWG takes part in the new W3C HTML Working Group; together they are developing HTML 5.
HTML 5 is currently a Draft, but the basic parts of the specification are modelled on the existing implementations of HTML, XHTML and browser extensions, so parts of HTML 5 are already supported in browsers.
HTML 5 Improvements
HTML (HyperText Markup Language) version 5 is an abstract language: a single vocabulary for webpages that both HTML 4.01 and under and XHTML 1.1 and under can upgrade to. Native XHTML 1.0 and higher can upgrade to XHTML 5. HTML 4.01 and under, and HTML-Compatible XHTML 1.0, can upgrade to the custom HTML form of HTML 5.
The custom HTML form of HTML 5 is not based on SGML (Standard Generalized Markup Language), as HTML 4.01 and under are, but is modelled on the actual implementation of HTML in applications such as web browsers.
An HTML 5 document is in 'the custom HTML form of HTML 5', or simply 'HTML5', not because of the Doctype or the syntax used but because it is served with the MIME Media Type text/html.
All major web browsers, such as Mozilla Firefox, SeaMonkey, Netscape, Konqueror, Opera, Apple Safari and Microsoft Internet Explorer, support the basics of HTML5.
Some simple notes to upgrade from HTML to HTML5:
- Block level elements are now categorised as Flow Content and Inline level elements are categorised as Phrasing Content;
- Elements like <img>, <iframe></iframe> and <object></object> are categorised as Embedded Content and are also Phrasing Content;
- As with HTML, HTML5 element and attribute names are case insensitive: it does not matter whether they are uppercase or lowercase. But writing in uppercase is old hat and does not make your code any clearer, so these days HTML in general is written in lowercase;
- All elements that can have a start and end tag, such as <p> and <li>, should have both start and end tags present to make your code clearer;
- All elements that only have a start tag, like <img> and <br>, are known as Void Elements: before the greater-than character (>) they may have an optional space followed by an optional forward slash, such as <br>, <br/> or <br />;
- All attribute values should be quoted to make your code clearer, commonly with double quotes;
- If you use double quotes in attribute values that are surrounded by double quotes then use the &quot; entity within the attribute value instead of a literal double quote (otherwise you would end the attribute before you intended and it would mess up your webpage);
- If you use apostrophes or single quotes in attribute values that are surrounded by single quotes then use the &#39;, &#x27; or &apos; entities within the attribute value instead of a literal apostrophe or single quote. It is best to surround attribute values with double quotes these days;
- As ampersands (&) are used to start entities, if you need a literal ampersand use the &amp; entity in attribute values and element content;
- As less-than characters (<) are used to start elements, if you need a literal less-than character use the &lt; entity in attribute values and element content; greater-than characters (>) do not pose a problem;
- Any 'minimized' or boolean attributes such as checked or multiple may be expanded to have the attribute name as the value, such as checked="checked" and multiple="multiple";
- The id attribute has replaced the name attribute for most elements in regards to identifying parts of the document for scripting and styling purposes and as link targets. Only <iframe></iframe> and <object></object> will retain the name attribute for compatibility with some web browsers, but use the id attribute as well for other and future web browsers. <meta> and form controls continue to use name as usual;
- The language attribute in <script language="JavaScript"> </script> elements has been replaced with the global standard type attribute, as in <script type="text/javascript"> </script>;
- document.write() and document.writeln() do still exist but they don't work in all circumstances and are old hat. Instead you should use innerHTML or other DOM (Document Object Model) features to manipulate the existing markup;
- Use the HTML5 Doctype (<!doctype html>) instead of the deprecated HTML and XHTML Doctypes to trigger Standards Compliant mode in web browsers and other programs;
- It is best to move from character sets like ISO-8859-1 to UTF-8, the most widely supported global standard character set, which covers Unicode;
- Element nesting must be proper to make your code clearer. For instance, <strong>This is an <em>invalid</strong> nesting</em> is wrong: as the <em> element was started within the <strong> element, the <em> element should also end within the <strong> element, as in <strong>This is <em>valid</em> nesting</strong>;
- Unlike XHTML 1.0, <script></script>, <style></style> and <pre></pre> do not need the xml:space="preserve" attribute to preserve leading and trailing spaces and any multiple spaces between words at the XML and HTML5 parser level;
- Unlike XHTML 1.0, where the lang attribute also needs the xml:lang attribute, in the custom HTML form of HTML5 you only use the lang attribute.
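The notes above can be pulled together in one small page. This is only a sketch under stated assumptions: the title, ids, text and form field are hypothetical and chosen purely to show the doctype, the UTF-8 character set, lowercase names, quoted attribute values, entities, a void element, proper nesting and an expanded boolean attribute side by side.

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Upgraded page (hypothetical example)</title>
</head>
<body>
  <!-- Lowercase names, quoted attribute values, explicit end tags -->
  <!-- &amp; and &quot; stand in for literal & and " inside the attribute value -->
  <p id="intro" title="Tom &amp; Jerry say &quot;hello&quot;">
    <strong>This is <em>valid</em> nesting</strong><br>
    1 &lt; 2 &amp; 3 &gt; 2
  </p>
  <form action="#">
    <p>
      <!-- Boolean attribute written in its expanded form -->
      <input type="checkbox" name="agree" id="agree" checked="checked">
      <label for="agree">I agree</label>
    </p>
  </form>
</body>
</html>
```

Writing <br /> and <input ... /> instead would be equally acceptable here, which is what makes the move from HTML-Compatible XHTML straightforward.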
For those using HTML-Compatible XHTML, the namespace and the XML-like syntax in Void Elements and expanded boolean attributes are only supported to provide an easier way for you to upgrade to HTML5. The namespace, which is optional in the custom HTML form of HTML 5, must be declared as the default namespace with the same URI, as in:
<html xmlns="http://www.w3.org/1999/xhtml" ...>...</html>
Some deprecated elements have changed semantics (become more meaningful): <i></i> and <b></b> are now allowed for text 'offset from the normal text without any other meaning', for which screen readers, for example, could use a different voice or pitch. But these two elements are only to be used as an absolute last resort: there are far more meaningful elements at your disposal.
A few existing elements also have improved semantics: <small></small> is for small print and <strong></strong> is for importance. Multiple nested <em></em> elements convey stronger emphasis and multiple nested <strong></strong> elements convey stronger importance.
The first time a <dfn></dfn> term appears on the webpage, its expanded value (previously usually placed in a title attribute) should now be given in the surrounding text, as should the expanded value of each <abbr></abbr> abbreviation on its first use.
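As a rough illustration of these semantics, the sketch below nests <strong></strong> and <em></em> for extra weight, uses <small></small> for small print, and expands an abbreviation in the surrounding text rather than in a title attribute. The wording of the paragraphs is hypothetical.

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Text-level semantics (hypothetical example)</title>
</head>
<body>
  <!-- The expansion of the term sits in the text itself, not in a title attribute -->
  <p><dfn><abbr>SGML</abbr></dfn> (Standard Generalized Markup Language) is a language for describing markup languages.</p>

  <!-- The inner strong element conveys stronger importance than the outer one -->
  <p><strong>Warning: <strong>back up your files</strong> before you upgrade.</strong></p>

  <!-- The inner em element conveys stronger emphasis than the outer one -->
  <p>This is <em>worth reading, and this part <em>even more so</em></em>.</p>

  <!-- small marks up small print -->
  <p><small>Terms and conditions apply.</small></p>
</body>
</html>
```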
To accompany the style attribute (which states style properties only) for local styling on an element, HTML5 adds 'scoped stylesheets', placed as the first children of elements that can have Flow Content within them, such as the <div></div> element.
A scoped stylesheet is like a normal <style></style> element but has a scoped or scoped="scoped" attribute. You are free to use selectors, media At-Rules and other typical CSS stylesheet code (the default styling language in HTML5 is CSS (text/css)). Scoped stylesheets provide local styling for the parent element (such as the <div></div>) and its other children. Current web browsers ignore the scoped or scoped="scoped" attribute, but to be valid you need to keep it in.
For specifying the character set you can use the old <meta http-equiv="content-type" content="text/html; charset=UTF-8"> element as usual but HTML 5 introduces a simpler form:
<meta charset="UTF-8">
This element must be the first within the <head></head> element, even before the <title></title> element.
Unlike HTML-Compatible XHTML 1.0 the XML Prolog or Declaration, any other XML Processing Instructions (other than PHP's) and CDATA Sections are forbidden in the custom HTML form of HTML 5.
Now for ids. Ids are coming into their own in XML Documents and especially in HTML 4, 4.01, XHTML and HTML 5: a powerful yet simple, basic feature of webpages. The id attribute replaces the name attribute for identifying an element for scripting purposes. The Document Object Model (DOM) even has a method called getElementById(), which takes the value of an id attribute as its parameter and returns the element that has that id attribute value.
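A minimal sketch of getElementById() in use is given here. The id value "message", the variable name and the replacement text are hypothetical. The script sits after the paragraph so that the element already exists when the lookup runs, and it changes the markup through innerHTML rather than document.write().

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>getElementById (hypothetical example)</title>
</head>
<body>
  <p id="message">Waiting...</p>

  <script type="text/javascript">
    // Look the element up by the value of its id attribute
    var messageElement = document.getElementById("message");

    // Replace the paragraph's content; the string is parsed as markup
    messageElement.innerHTML = "Updated <em>without</em> document.write()";
  </script>
</body>
</html>
```

If no element has the given id, getElementById() returns null, so a real page would normally check the result before using it.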
Ids can be used to attach styles too using the hash or sharp character (#) as:
div#navigation {
  width: 98%;
  background-color: aqua;
}
This attaches the width and background colour styles to a div element with id="navigation" on it.
The id attribute also replaces <a name=""></a> elements as fragment identifiers, the targets of URIs such as mypage.html#fourthParagraph. As it is an attribute, any element with an id can be the target of such a URI.
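The sketch below shows the idea with a same-page link. The id value fourthParagraph comes from the URI example above; the paragraph text is hypothetical, and the old <a name=""></a> form is kept in a comment for comparison only.

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Fragment identifiers (hypothetical example)</title>
</head>
<body>
  <!-- The part after # matches the id of the target element -->
  <p><a href="#fourthParagraph">Jump to the fourth paragraph</a></p>

  <p>First paragraph.</p>
  <p>Second paragraph.</p>
  <p>Third paragraph.</p>

  <!-- Old style: <p><a name="fourthParagraph"></a>Fourth paragraph.</p> -->
  <p id="fourthParagraph">Fourth paragraph.</p>
</body>
</html>
```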
An id is a unique identifier, so its value must be unique throughout the document. To stay compatible with older HTML and with XML, it is safest to start id values with a letter, an underscore (_) or a colon (:), followed by any number of letters, numbers, underscores, colons or dashes (-), and never to include spaces.
As mentioned above, the Doctype of the custom HTML form of HTML 5 is a simpler, shorter one whose only purpose is to switch on Standards Compliance Mode in web browsers:
<!doctype html>
It can be this short because most web browsers do not validate webpages. To validate an HTML 5 document you use an HTML 5 validator such as the one at http://html5.validator.nu/.
Several elements and attributes have been dropped. The cellspacing, cellpadding and summary attributes on <table></table> are replaced by CSS spacing and padding properties (such as border-spacing and padding) and by HTML's <caption></caption> element.
Also frameborder, border, hspace, vspace, leftmargin, rightmargin, topmargin, bottommargin, valign and align are replaced by CSS's border, margin, padding, vertical-align and text-align properties.
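As a hedged illustration of moving these presentational attributes into CSS, the sketch below styles a small table. The id, caption and cell contents are hypothetical, and it assumes border-spacing as the CSS counterpart of cellspacing and cell padding as the counterpart of cellpadding.

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Table styling (hypothetical example)</title>
  <style type="text/css">
    /* In place of the cellspacing and border attributes */
    table#prices {
      border-collapse: separate;
      border-spacing: 4px;
      border: 1px solid gray;
    }
    /* In place of the cellpadding, valign and align attributes */
    table#prices th,
    table#prices td {
      padding: 6px;
      vertical-align: top;
      text-align: left;
    }
  </style>
</head>
<body>
  <table id="prices">
    <!-- The caption describes the table in place of the summary attribute -->
    <caption>Prices per item, in pounds</caption>
    <tr><th>Item</th><th>Price</th></tr>
    <tr><td>Scroll</td><td>2.50</td></tr>
  </table>
</body>
</html>
```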
Only <img>, <embed>, <object></object>, <canvas></canvas>, <video></video> may have the width and height attributes; others can use the CSS width and height properties.
classid and a few less common attributes of <object></object>, and language from <script></script>, have also been dropped.
<font></font>, <noframes></noframes>, <frameset></frameset> and <frame> are all dropped.
<embed></embed> is only supported as the <embed> Void Element, for compatibility with ancient plugins. You should use <object></object> these days.
This is just the subset of HTML 5 that is currently supported by all web browsers. New markup is also being introduced and tested: sectioning elements (<section></section>, <article></article>, <aside></aside>, <nav></nav>, <header></header> and <footer></footer>); improved forms with Web Forms 2, such as <input> types including email, url, datetime, time, week, number and range, plus markup repetition templates and combo boxes; <canvas></canvas> for dynamically drawing bitmap images and animating them; and <audio></audio> and <video></video> for improved embedded multimedia, including a multimedia API (Application Programming Interface) for dynamically creating your own controls. HTML 5 also addresses Web Applications, including client-side data storage.
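To give a feel for this newer markup, here is a small sketch that combines the sectioning elements with two of the new <input> types. The ids, headings and field names are hypothetical, and how much of it a given browser understands is an open assumption; a browser that does not recognise an input type falls back to treating it as a plain text field.

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Beyond the subset (hypothetical example)</title>
</head>
<body>
  <header><h1>Site title</h1></header>
  <nav><p><a href="#news">News</a></p></nav>

  <section id="news">
    <article>
      <h2>Article heading</h2>
      <p>Article text.</p>
      <form action="#">
        <!-- New input types from Web Forms 2 -->
        <p><label for="email">Email</label> <input type="email" id="email" name="email"></p>
        <p><label for="amount">Amount</label> <input type="number" id="amount" name="amount"></p>
      </form>
    </article>
  </section>

  <aside><p>Related links.</p></aside>
  <footer><p>Footer text.</p></footer>
</body>
</html>
```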
Brand new and upcoming versions of web browsers are starting to add support for some of these 'beyond the subset' features.
End.
Copyright ©2008 Legend Scrolls and Peter Davison.
All rights reserved.

