The DOCTYPE declaration for HTML5 is very simple:
The character encoding (charset) declaration is also very simple:
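As a minimal sketch, both declarations can be seen at the top of a small page. The title and body text are placeholders, not taken from the original:

```html
<!DOCTYPE html>
<html>
<head>
  <!-- Character encoding declaration -->
  <meta charset="UTF-8">
  <title>Page title (placeholder)</title>
</head>
<body>
  <p>Page content (placeholder)</p>
</body>
</html>
```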
Note: The default character encoding in HTML5 is UTF-8.
The most interesting new elements are:
New semantic elements like <header>, <footer>, <article>, and <section>.
New input types for form controls, like number, date, time, and range.
New graphic elements: <svg> and <canvas>.
New multimedia elements: <audio> and <video>.
Note: In the chapter HTML5 Support, you will learn how to "teach" old browsers to handle HTML5 semantic elements.
The most interesting new APIs are:
Note: Local storage is a powerful replacement for cookies.
The following HTML4 elements have been removed from HTML5:
Element | Use instead |
---|---|
<acronym> | <abbr> |
<applet> | <object> |
<basefont> | CSS |
<big> | CSS |
<center> | CSS |
<dir> | <ul> |
<font> | CSS |
<frame> | |
<frameset> | |
<noframes> | |
<strike> | CSS |
<tt> | CSS |
Note: In the chapter HTML5 Migration, you will learn how to easily migrate from HTML4 to HTML5.
Since the early days of the web, there have been many versions of HTML:
Milestone | Year |
---|---|
Tim Berners-Lee invented the World Wide Web | 1989 |
Tim Berners-Lee invented HTML | 1991 |
Dave Raggett drafted HTML+ | 1993 |
HTML Working Group defined HTML 2.0 | 1995 |
W3C Recommended HTML 3.2 | 1997 |
W3C Recommended HTML 4.01 | 1999 |
W3C Recommended XHTML 1.0 | 2000 |
HTML5 WHATWG First Public Draft | 2008 |
HTML5 WHATWG Living Standard | 2012 |
HTML5 W3C Final Recommendation | 2014 |
Tim Berners-Lee invented the "World Wide Web" in 1989, and the Internet took off in the 1990s.
From 1991 to 1998, HTML developed from version 1 to version 4.
In 2000, the World Wide Web Consortium (W3C) recommended XHTML 1.0.
The XHTML syntax was strict, and the developers were forced to write valid and "well-formed" code.
In 2004, WHATWG (Web Hypertext Application Technology Working Group) was formed in response to slow W3C development and to W3C's decision to close down the development of HTML in favor of XHTML.
WHATWG wanted to develop HTML in a way consistent with how the web was used, while remaining backward compatible with older versions of HTML.
In the period 2004-2006, the WHATWG initiative gained support from the major browser vendors.
In 2006, W3C announced that they would support WHATWG.
In 2008, the first HTML5 public draft was released.
In 2012, WHATWG and W3C decided on a separation:
WHATWG will develop HTML as a "Living Standard".
A living standard is never fully complete, but is always updated and improved. New features can be added, but old functionality cannot be removed.
The WHATWG Living Standard was published in 2012, and is continuously updated.
W3C will develop a definitive HTML5 and XHTML5 standard, as a "snapshot" of WHATWG.
The W3C HTML5 recommendation was released on 28 October 2014.
You can teach older browsers to handle HTML5 correctly.
HTML5 is supported in all modern browsers.
In addition, all browsers, old and new, automatically handle unrecognized elements as inline elements.
Because of this, you can "teach" older browsers to handle "unknown" HTML elements.
Note: You can even teach IE6 (Windows XP 2001) how to handle unknown HTML elements.
HTML5 defines eight new semantic HTML elements. All of these are block-level elements.
To ensure correct behavior in older browsers, you can set the CSS display property to block:
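A sketch of such a rule follows. The eight element names in the selector are an assumption about which elements are meant; the text above does not list them:

```html
<style>
  /* Make older browsers render the new semantic elements as blocks.
     The selector list is an assumption, not taken from the text. */
  header, section, footer, aside, nav, main, article, figure {
    display: block;
  }
</style>
```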
You can also add any new element to HTML with a browser trick.
This example adds a new element called <myHero> to HTML, and defines a display style for it:
The JavaScript statement document.createElement("myHero") is added only to satisfy IE.
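A minimal sketch of such a page could look like this. The title, text, and style values are placeholders chosen for illustration:

```html
<!DOCTYPE html>
<html>
<head>
  <title>Creating an HTML element (placeholder title)</title>
  <!-- Registers the unknown element name; needed only for old IE -->
  <script>document.createElement("myHero")</script>
  <style>
    /* Unknown elements are inline by default, so set a display style */
    myHero {
      display: block;
      background-color: #ddd;
      padding: 50px;
      font-size: 30px;
    }
  </style>
</head>
<body>
  <h1>My first heading</h1>
  <p>My first paragraph.</p>
  <myHero>My first hero</myHero>
</body>
</html>
```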
You could use the solution described above for all new HTML5 elements, but there is a limitation:
Note: Internet Explorer 8 and earlier do not allow styling of unknown elements.
Thankfully, Sjoerd Visscher created the "HTML5 Enabling JavaScript", also called "the shiv":
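A minimal sketch of including the shiv, assuming a local copy of the script saved at the hypothetical path js/html5shiv.js (the path and the title are placeholders):

```html
<head>
  <meta charset="UTF-8">
  <title>Page title (placeholder)</title>
  <!-- Conditional comment: only Internet Explorer versions before 9
       load the script; js/html5shiv.js is a hypothetical local path -->
  <!--[if lt IE 9]>
    <script src="js/html5shiv.js"></script>
  <![endif]-->
</head>
```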
The shiv is loaded from inside a conditional comment: other browsers treat it as an ordinary comment, but versions of Internet Explorer before IE9 will read it (and understand it).
The link to the shiv code must be placed in the <head> element, because Internet Explorer needs to know about all new elements before reading them.
Below is a list of the new HTML5 elements, and a description of what they are used for.
HTML5 offers new elements for better document structure:
Tag | Description |
---|---|
<article> | Defines an article in the document |
<aside> | Defines content aside from the page content |
<bdi> | Defines a part of text that might be formatted in a different direction from other text |
<details> | Defines additional details that the user can view or hide |
<dialog> | Defines a dialog box or window |
<figcaption> | Defines a caption for a <figure> element |
<figure> | Defines self-contained content, like illustrations, diagrams, photos, code listings, etc. |
<footer> | Defines a footer for the document or a section |
<header> | Defines a header for the document or a section |
<main> | Defines the main content of a document |
<mark> | Defines marked or highlighted text |
<menuitem> | Defines a command/menu item that the user can invoke from a popup menu (since removed from the HTML standard) |
<meter> | Defines a scalar measurement within a known range (a gauge) |
<nav> | Defines navigation links in the document |
<progress> | Defines the progress of a task |
<rp> | Defines what to show in browsers that do not support ruby annotations |
<rt> | Defines an explanation/pronunciation of characters (for East Asian typography) |
<ruby> | Defines a ruby annotation (for East Asian typography) |
<section> | Defines a section in the document |
<summary> | Defines a visible heading for a <details> element |
<time> | Defines a date/time |
<wbr> | Defines a possible line-break |
Read more about HTML5 Semantics.
Tag | Description |
---|---|
<datalist> | Defines pre-defined options for input controls |
<keygen> | Defines a key-pair generator field for forms (since removed from the HTML standard) |
<output> | Defines the result of a calculation |
Read all about old and new form elements in HTML Form Elements.
HTML5 also introduces new input types and new input attributes.
Learn all about old and new input types in HTML Input Types.
Learn all about input attributes in HTML Input Attributes.
HTML5 allows four different syntaxes for attributes.
This example demonstrates the different syntaxes used in an <input> tag:
Type | Example |
---|---|
Empty | <input type="text" value="John" disabled> |
Unquoted | <input type="text" value=John> |
Double-quoted | <input type="text" value="John Doe"> |
Single-quoted | <input type="text" value='John Doe'> |
In HTML5, all four syntaxes may be used, depending on what is needed for the attribute.
Tag | Description |
---|---|
<canvas> | Defines graphic drawing using JavaScript |
<svg> | Defines graphic drawing using SVG |
Read more about HTML5 Canvas.
Read more about HTML5 SVG.
Tag | Description |
---|---|
<audio> | Defines sound or music content |
<embed> | Defines containers for external applications (like plug-ins) |
<source> | Defines sources for <video> and <audio> |
<track> | Defines tracks for <video> and <audio> |
<video> | Defines video or movie content |
Read more about HTML5 Video.
Read more about HTML5 Audio.
Semantics is the study of the meanings of words and phrases in language.
Semantic elements are elements with a meaning.
A semantic element clearly describes its meaning to both the browser and the developer.
Examples of non-semantic elements: <div> and <span>. They tell nothing about their content.
Examples of semantic elements: <form>, <table>, and <img>. They clearly define their content.
HTML5 semantic elements are supported in all modern browsers.
In addition, you can "teach" older browsers how to handle "unknown elements".
Many web sites contain HTML code like <div id="nav">, <div class="header">, and <div id="footer"> to indicate navigation, header, and footer.
HTML5 offers new semantic elements to define different parts of a web page:
The <section> element defines a section in a document.
According to W3C's HTML5 documentation: "A section is a thematic grouping of content, typically with a heading."
A Web site's home page could be split into sections for introduction, content, and contact information.
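A home page split this way might be sketched as follows. The headings and text are placeholders:

```html
<section>
  <h2>Introduction</h2>
  <p>Placeholder text introducing the site.</p>
</section>

<section>
  <h2>Content</h2>
  <p>Placeholder text for the main content.</p>
</section>

<section>
  <h2>Contact information</h2>
  <p>Placeholder contact details.</p>
</section>
```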
The <article> element specifies independent, self-contained content.
An article should make sense on its own, and it should be possible to read it independently from the rest of the web site.
Examples of where an <article> element can be used:
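One such case is a blog post. A minimal sketch with placeholder content (the heading and text are assumptions, not from the original):

```html
<article>
  <h2>Post title (placeholder)</h2>
  <p>Placeholder text of a post that makes sense on its own,
     even when read outside the rest of the site.</p>
</article>
```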
In the HTML5 standard, the <article> element defines a complete, self-contained block of related elements.
The <section> element is defined as a block of related elements.
Can we use the definitions to decide how to nest elements? No, we cannot!
On the Internet, you will find HTML pages with <section> elements containing <article> elements, and <article> elements containing <section> elements.
You will also find pages with <section> elements containing <section> elements, and <article> elements containing <article> elements.
Note: Think of a newspaper, where each sports article in the sports section has its own technical section.
The <header> element specifies a header for a document or section.
The <header> element should be used as a container for introductory content.
You can have several <header> elements in one document.
The following example defines a header for an article:
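A sketch of such a header, with placeholder headings and text:

```html
<article>
  <!-- Introductory content for this article -->
  <header>
    <h1>Article heading (placeholder)</h1>
    <p>Introductory line (placeholder)</p>
  </header>
  <p>Article body text (placeholder).</p>
</article>
```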
The <footer> element specifies a footer for a document or section.
A <footer> element should contain information about its containing element.
A footer typically contains the author of the document, copyright information, links to terms of use, contact information, etc.
You can have several <footer> elements in one document.
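For illustration, a footer could be sketched like this. The author name and e-mail address are placeholders:

```html
<footer>
  <p>Posted by: Author Name (placeholder)</p>
  <p>Contact:
    <a href="mailto:someone@example.com">someone@example.com</a>
  </p>
</footer>
```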
The <nav> element defines a set of navigation links.
The <nav> element is intended for large blocks of navigation links. However, not all links in a document should be inside a <nav> element!
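A minimal sketch of a navigation block, using hypothetical link targets:

```html
<nav>
  <!-- The href values are hypothetical paths -->
  <a href="/html/">HTML</a> |
  <a href="/css/">CSS</a> |
  <a href="/js/">JavaScript</a>
</nav>
```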
The <aside> element defines some content aside from the content it is placed in (like a sidebar).
The aside content should be related to the surrounding content.
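As a sketch, an aside placed next to a related paragraph might look like this (all text is placeholder content):

```html
<p>My family and I visited a theme park this summer.</p>

<aside>
  <!-- Related to, but separate from, the paragraph above -->
  <h4>About the park</h4>
  <p>Placeholder background information about the park.</p>
</aside>
```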
In books and newspapers, it is common to have captions with images.
The purpose of a caption is to add a visual explanation to an image.
With HTML5, images and captions can be grouped together in <figure> elements:
The <img> element defines the image, and the <figcaption> element defines the caption.
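A sketch of such a grouping, assuming a hypothetical image file named pic_mountain.jpg:

```html
<figure>
  <!-- pic_mountain.jpg is a hypothetical file name -->
  <img src="pic_mountain.jpg" alt="A mountain landscape"
       width="304" height="228">
  <figcaption>Fig. 1 - A mountain landscape (placeholder caption).</figcaption>
</figure>
```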
With HTML4, developers used their own favorite id and class names to style page elements:
header, top, bottom, footer, menu, navigation, main, container, content, article, sidebar, topnav, ...
This made it difficult for search engines to identify the correct web page content.
With HTML5 elements like <header>, <footer>, <nav>, <section>, and <article>, this will become easier.
According to the W3C, a Semantic Web:
"Allows data to be shared and reused across applications, enterprises, and communities."
Below is an alphabetical list of the new semantic elements in HTML5.
The links go to the complete HTML5 Reference.
Tag | Description |
---|---|
<article> | Defines an article |
<aside> | Defines content aside from the page content |
<details> | Defines additional details that the user can view or hide |
<figcaption> | Defines a caption for a <figure> element |
<figure> | Specifies self-contained content, like illustrations, diagrams, photos, code listings, etc. |
<footer> | Defines a footer for a document or section |
<header> | Specifies a header for a document or section |
<main> | Specifies the main content of a document |
<mark> | Defines marked/highlighted text |
<nav> | Defines navigation links |
<section> | Defines a section in a document |
<summary> | Defines a visible heading for a <details> element |
<time> | Defines a date/time |
This chapter is entirely about how to migrate from a typical HTML4 page to a typical HTML5 page.
This chapter demonstrates how to convert an HTML4 page into an HTML5 page, without destroying any of the original content or structure.
Note: You can migrate to HTML5 from HTML4 or XHTML using the same recipe.
Typical HTML4 | Typical HTML5 |
---|---|
<div id="header"> | <header> |
<div id="menu"> | <nav> |
<div id="content"> | <section> |
<div id="post"> | <article> |
<div id="footer"> | <footer> |
Change the doctype, from the HTML4 doctype:
to the HTML5 doctype:
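As a before/after sketch, assuming the old page used the HTML 4.01 Transitional doctype (one of several HTML4 variants). The comments are only labels for this listing; in a real page the doctype stays on the first line:

```html
<!-- Before: HTML 4.01 Transitional -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

<!-- After: HTML5 -->
<!DOCTYPE html>
```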
Change the encoding information, from HTML4:
to HTML5:
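A before/after sketch of the encoding declaration, assuming the page is encoded as UTF-8:

```html
<!-- Before: HTML4 -->
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">

<!-- After: HTML5 -->
<meta charset="utf-8">
```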
HTML5 semantic elements are supported in all modern browsers.
In addition, you can "teach" older browsers how to handle "unknown elements".
Add the shiv for Internet Explorer support:
Note: Read about the shiv in HTML5 Browser Support.
Look at your existing CSS styles:
Duplicate with equal CSS styles for HTML5 semantic elements:
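The following sketch shows the idea. The existing rules are hypothetical, since the original style sheet is not shown, and the duplicated rules simply repeat the same declarations for the semantic elements:

```html
<style>
  /* Existing HTML4-era rules (hypothetical) */
  div#header, div#footer, div#content, div.post {
    border: 1px solid grey;
    margin: 5px;
    padding: 8px;
  }
  div#menu li {
    display: inline-block;
  }

  /* The same rules duplicated for the HTML5 semantic elements */
  header, footer, section, article {
    border: 1px solid grey;
    margin: 5px;
    padding: 8px;
  }
  nav li {
    display: inline-block;
  }
</style>
```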
Change the <div> elements with id="header" and id="footer":
to HTML5 semantic <header> and <footer> elements:
Change the <div> element with id="menu":
to an HTML5 semantic <nav> element:
Change the <div> element with id="content":
to an HTML5 semantic <section> element:
Change all <div> elements with class="post":
to HTML5 semantic <article> elements:
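The replacements described above can be sketched on a small page body. The comments show which <div> each semantic element replaced; the headings, links, and text are placeholders:

```html
<body>
  <!-- was: div id="header" -->
  <header>
    <h1>Site title (placeholder)</h1>
  </header>

  <!-- was: div id="menu" -->
  <nav>
    <ul>
      <li><a href="#">News</a></li>
      <li><a href="#">Sports</a></li>
    </ul>
  </nav>

  <!-- was: div id="content" -->
  <section>
    <h2>News section</h2>

    <!-- was: div class="post" -->
    <article>
      <h2>First news article</h2>
      <p>Placeholder article text.</p>
    </article>

    <!-- was: div class="post" -->
    <article>
      <h2>Second news article</h2>
      <p>Placeholder article text.</p>
    </article>
  </section>

  <!-- was: div id="footer" -->
  <footer>
    <p>Placeholder copyright notice.</p>
  </footer>
</body>
```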
Remove these "no longer needed" <style> elements:
Finally, you can remove the <head> tags. They are not required in HTML5:
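A sketch of a page without <head> tags follows (placeholder title and content). The browser still creates a head element for the <meta> and <title> elements:

```html
<!DOCTYPE html>
<html lang="en">
<meta charset="utf-8">
<title>HTML5 page (placeholder title)</title>
<body>
  <h1>Heading (placeholder)</h1>
  <p>Page content (placeholder).</p>
</body>
</html>
```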
There is a confusing (lack of) difference in the HTML5 standard between <article>, <section>, and <div>.
In the HTML5 standard, the <section> element is defined as a block of related elements.
The <article> element is defined as a complete, self-contained block of related elements.
The <div> element is defined as a block of child elements.
How to interpret that?
In the example above, we have used <section> as a container for related <article> elements.
But we could have used <article> as a container for articles as well.
Here are some different examples:
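The following sketches show three possible nestings, all with placeholder content. None of them is presented as the one correct choice:

```html
<!-- <article> inside <article> -->
<article>
  <h2>Famous Cities</h2>
  <article>
    <h2>London</h2>
    <p>Placeholder text about London.</p>
  </article>
</article>

<!-- <div> inside <article> -->
<article>
  <h2>Famous Cities</h2>
  <div class="city">
    <h2>London</h2>
    <p>Placeholder text about London.</p>
  </div>
</article>

<!-- <div> inside <section> inside <article> -->
<article>
  <section>
    <h2>Famous Cities</h2>
    <div class="city">
      <h2>London</h2>
      <p>Placeholder text about London.</p>
    </div>
  </section>
</article>
```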
Web developers are often uncertain about the coding style and syntax to use in HTML.
Between 2000 and 2010, many web developers converted from HTML to XHTML.
With XHTML, developers were forced to write valid and "well-formed" code.
HTML5 is a bit more lenient when it comes to code validation.
With HTML5, you must create your own Best Practice, Style Guide and Coding Conventions.
Consistent use of style makes it easier for others to understand and use your HTML.
In the future, programs like XML readers may want to read your HTML.
Using a well-formed, "close to XHTML" syntax can be smart.
Note: Always keep your style smart, tidy, clean, and well-formed.
For reference, here are some of the older document type declarations:
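Two well-known older declarations are listed here as an illustration; which ones the original text showed is not known. The comments are only labels for this listing:

```html
<!-- HTML 4.01 Strict -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

<!-- XHTML 1.0 Transitional -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
```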
Always declare the document type as the first line in your document:
If you want consistency with lower case tags, you can use:
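Both spellings below are accepted, because the doctype is not case sensitive in HTML. The comments are only labels for this listing:

```html
<!-- Common form -->
<!DOCTYPE html>

<!-- Lowercase form, consistent with lowercase tags -->
<!doctype html>
```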
HTML5 allows mixing uppercase and lowercase letters in element names.
We recommend using lowercase element names:
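A short sketch of the difference, with placeholder text:

```html
<!-- Looking bad: uppercase element names -->
<SECTION>
  <P>This is a paragraph.</P>
</SECTION>

<!-- Looking good: lowercase element names -->
<section>
  <p>This is a paragraph.</p>
</section>
```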
In HTML5, you don't have to close all elements (for example the <p> element).
We recommend closing all HTML elements:
Looking bad:
Looking good:
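The two styles can be sketched as follows, with placeholder text:

```html
<!-- Looking bad: the end tags of the paragraphs are omitted -->
<section>
  <p>This is a paragraph.
  <p>This is a paragraph.
</section>

<!-- Looking good: every element is closed -->
<section>
  <p>This is a paragraph.</p>
  <p>This is a paragraph.</p>
</section>
```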
In HTML5, it is optional to close empty elements.
This is allowed:
This is also allowed:
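Using the <meta> element as an example of an empty element, the two forms might look like this:

```html
<!-- Without a closing slash -->
<meta charset="utf-8">

<!-- With a closing slash, as XHTML and XML require -->
<meta charset="utf-8" />
```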
The slash (/) is required in XHTML and XML.
If you expect XML software to access your page, it might be a good idea to keep it.
HTML5 allows mixing uppercase and lowercase letters in attribute names.
We recommend using lowercase attribute names:
Looking bad:
Looking good:
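A sketch of both styles, using a hypothetical class name:

```html
<!-- Looking bad: uppercase attribute name -->
<div CLASS="menu">Placeholder content</div>

<!-- Looking good: lowercase attribute name -->
<div class="menu">Placeholder content</div>
```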
HTML5 allows attribute values without quotes.
We recommend quoting attribute values:
This will not work, because the value contains spaces:
This will work:
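As a sketch with two hypothetical class names: without quotes, the value ends at the first space, so the second word is parsed as a separate attribute.

```html
<!-- Will not work: class is "table", and striped becomes its own attribute -->
<table class=table striped>
  <tr><td>Placeholder cell</td></tr>
</table>

<!-- Will work: the quoted value keeps both class names together -->
<table class="table striped">
  <tr><td>Placeholder cell</td></tr>
</table>
```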
Always use the alt attribute with images. It is important when the image cannot be viewed.
Always define image size. It reduces flickering because the browser can reserve space for images before they are loaded.
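A minimal sketch applying both recommendations, assuming a hypothetical image file named html5.gif:

```html
<!-- alt describes the image; width and height let the browser
     reserve space before the image has loaded -->
<img src="html5.gif" alt="HTML5 logo" width="128" height="128">
```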
Spaces around equal signs are legal:
But omitting the spaces is easier to read, and groups entities together better:
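Both forms are sketched below, using a hypothetical style sheet named styles.css:

```html
<!-- Legal, but harder to read -->
<link rel = "stylesheet" href = "styles.css">

<!-- Easier to read: each attribute and its value stay together -->
<link rel="stylesheet" href="styles.css">
```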
When using an HTML editor, it is inconvenient to scroll right and left to read the HTML code.
Try to avoid code lines longer than 80 characters.
Do not add blank lines without a reason.
For readability, add blank lines to separate large or logical code blocks.
For readability, add 2 spaces of indentation. Do not use TAB.
Do not use unnecessary blank lines and indentation. It is not necessary to use blank lines between short and related items. It is not necessary to indent every element:
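A before/after sketch with placeholder content. The listing shows two alternative versions of the same body, not one document:

```html
<!-- Unnecessary: blank lines and indentation around short, related items -->
<body>

  <h1>Famous Cities</h1>

  <h2>Tokyo</h2>

  <p>Placeholder text about Tokyo.</p>

</body>

<!-- Better: compact, with indentation only where it shows nesting -->
<body>
<h1>Famous Cities</h1>
<h2>Tokyo</h2>
<p>Placeholder text about Tokyo.</p>
<ul>
  <li>London</li>
  <li>Paris</li>
  <li>Tokyo</li>
</ul>
</body>
```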
In the HTML5 standard, the <html> tag and the <body> tag can be omitted.
The following code will validate as HTML5:
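A sketch of such a page, with placeholder title and text:

```html
<!DOCTYPE html>
<head>
  <title>Page title (placeholder)</title>
</head>

<h1>This is a heading</h1>
<p>This is a paragraph.</p>
```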
We do not recommend omitting the <html> and <body> tags.
The <html> element is the document root. It is the recommended place for specifying the page language:
Declaring a language is important for accessibility applications (screen readers) and search engines.
Omitting <html> or <body> can crash DOM and XML software.
Omitting <body> can produce errors in older browsers (IE9).
In the HTML5 standard, the <head> tag can also be omitted.
By default, browsers will add all elements before <body> to a default <head> element.
You can reduce the complexity of HTML by omitting the <head> tag:
Note: Omitting tags is unfamiliar to web developers. It needs time to be established as a guideline.
The <title> element is required in HTML5. Make the title as meaningful as possible:
To ensure proper interpretation, and correct search engine indexing, both the language and the character encoding should be defined as early as possible in a document:
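A sketch of the top of a document that declares the language, the character encoding, and a meaningful title early. The language code and title are placeholders:

```html
<!DOCTYPE html>
<html lang="en-US">
<head>
  <!-- Encoding first, so everything after it is decoded correctly -->
  <meta charset="UTF-8">
  <title>HTML5 Syntax and Coding Style (placeholder title)</title>
</head>
```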
Short comments should be written on one line, with a space after <!-- and a space before -->:
Long comments, spanning many lines, should be written with <!-- and --> on separate lines:
Long comments are easier to read if they are indented 2 spaces.
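Both comment styles can be sketched as follows:

```html
<!-- This is a short comment -->

<!--
  This is a long comment example. This is a long comment example.
  This is a long comment example. This is a long comment example.
-->
```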
Use simple syntax for linking style sheets (the type attribute is not necessary):
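For example, assuming a hypothetical style sheet named styles.css:

```html
<link rel="stylesheet" href="styles.css">
```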
Short rules can be written compressed, on one line, like this:
Long rules should be written over multiple lines:
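A sketch of both styles inside a <style> element. The selectors and values are placeholders:

```html
<style>
  /* Short rule, compressed on one line */
  p.intro { font-family: Verdana; font-size: 1.2em; }

  /* Long rule, written over multiple lines */
  body {
    background-color: lightgrey;
    font-family: "Arial Black", Helvetica, sans-serif;
    font-size: 1em;
    color: black;
  }
</style>
```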
Note: Adding a space after a comma or a semicolon is a general rule in all types of writing.
Use simple syntax for loading external scripts (the type attribute is not necessary):
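For example, assuming a hypothetical script file named myscript.js:

```html
<script src="myscript.js"></script>
```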
Using "untidy" HTML styles can result in JavaScript errors.
These two JavaScript statements will produce different results:
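A sketch of the problem, assuming a page with one paragraph whose id is "demo". Id lookups are case sensitive in standards-compliant browsers:

```html
<p id="demo">Placeholder paragraph.</p>

<script>
  // No element has the id "Demo", so this lookup finds nothing
  var notFound = document.getElementById("Demo");

  // This lookup matches the paragraph above
  var found = document.getElementById("demo");
</script>
```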
If possible, use the same naming convention (as JavaScript) in HTML.
Visit the JavaScript Style Guide.
Most web servers (Apache, Unix) are case sensitive about file names:
london.jpg cannot be accessed as London.jpg.
Other web servers (Microsoft, IIS) are not case sensitive:
london.jpg can be accessed as London.jpg or london.jpg.
If you use a mix of upper and lower case, you have to be extremely consistent.
If you move from a case-insensitive to a case-sensitive server, even small errors will break your web site.
To avoid these problems, always use lower case file names (if possible).
HTML files should have a .html extension (or .htm).
CSS files should have a .css extension.
JavaScript files should have a .js extension.
There is no difference between the .htm and .html extensions. Both will be treated as HTML by any web browser or web server.
The differences are cultural:
.htm "smells" of early DOS systems where the system limited the extensions to 3 characters.
.html "smells" of Unix operating systems that did not have this limitation.
When a URL does not specify a filename (like http://www.html_201.com/), the server returns a default file. Common default filenames are index.html, index.htm, default.html, and default.htm.
If your server is configured only with "index.html" as default filename, your file must be named "index.html", not "index.htm".
However, servers can be configured with more than one default filename, and normally you can set up as many default filenames as needed.
Anyway, the full extension for HTML files is .html, and there's no reason it should not be used.