Wednesday, May 16, 2007

HTML 5 Proposal Approved by W3C

Exciting news for front-end developers!

Back in October 2006, the father of HTML, Tim Berners-Lee, currently the W3C Director, acknowledged that the W3C has had difficulty keeping the HTML language fresh and evolving:

"The perceived accountability of the HTML group has been an issue. Sometimes this was a departure from the W3C process, sometimes a sticking to it in principle, but not actually providing assurances to commenters. An issue was the formation of the breakaway WHAT WG, which attracted reviewers though it did not have a process or specific accountability measures itself."

Reinventing HTML, by Tim Berners-Lee on October 10, 2006

In a surprise session, HTML 5 was proposed by the WHAT Working Group (founded by representatives of Mozilla, Apple and Opera) to the Consortium and accepted by the majority of the voting participants. They've also agreed on the following:
  • The WHAT Working Group’s HTML5 (Web Applications 1.0 and Web Forms 2.0) will become the current working draft, and an extensive review by the new working group will now take place.
  • The final W3C specification will be named "HTML 5".
  • The W3C specification will be edited by Ian Hickson (Google), editor of the WHAT-WG’s HTML5, and David Hyatt (Apple/Safari).

So, why was the WHAT Working Group formed?

In 2004, after a W3C workshop, Apple, Mozilla and Opera grew increasingly concerned about the W3C’s direction with XHTML, its lack of interest in HTML and its apparent disregard for the needs of real-world authors. In response, these organisations set out to address these concerns, and the Web Hypertext Application Technology Working Group was born.

These days, the WHATWG is a growing community of browser vendors, web developers, and other people interested in the development of the next generation of HTML and related technologies, specifically designed to allow authors to write and deploy applications over the World Wide Web.

Now, what everyone is itching to know...

What improvements will HTML 5 bring to our world?

This is an extremely exciting question, and at first glance this new language is very promising:
  • New DOCTYPEs and DTDs
  • New Structures
  • New Semantics
  • New Controls - Whoo-hoo!
  • Client Side Form Validation
  • DOM APIs
  • And the introduction of the Repetition Model

New elements

Document Structure

Data

Applications


In addition to the above, the input element's type attribute can now have the following new values, which enable a bunch of new native controls:

  • datetime, datetime-local, date, month, week, time, number, range, email, url
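
To make the list above a little more concrete, here is a minimal sketch of a form that uses a few of the new type values. The form's action URL and every id and name value are made up for illustration, and since the specification is still a draft, the details may change. A browser that does not recognise a type value should simply fall back to a plain text field.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>New input types (sketch)</title>
</head>
<body>
  <!-- "/signup" and all id/name values below are hypothetical -->
  <form action="/signup" method="post">
    <p><label for="email">Email</label>
       <input type="email" id="email" name="email"></p>
    <p><label for="site">Website</label>
       <input type="url" id="site" name="site"></p>
    <p><label for="dob">Date of birth</label>
       <input type="date" id="dob" name="dob"></p>
    <p><label for="qty">Quantity</label>
       <input type="number" id="qty" name="qty"></p>
    <p><label for="vol">Volume</label>
       <input type="range" id="vol" name="vol"></p>
    <p><button type="submit">Send</button></p>
  </form>
</body>
</html>
```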



New Attributes

An overview of all elements from HTML4 that got new attributes in HTML5.

Element – Attributes

  • a – media?, ping
  • area – ping
  • base – target
  • button – autofocus, form, replace, template
  • fieldset – disabled, form
  • form – data, replace
  • input – autocomplete, autofocus, form, inputmode, list, min, max, pattern, step, replace, required, template
  • li – value (no longer deprecated)
  • meta – charset
  • ol – start (no longer deprecated)
  • select – autofocus, data, form
  • script – async, defer
  • style – scoped
  • textarea – autofocus, form, inputmode, required
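
As a rough illustration of how some of these attributes might be combined, here is a small sketch that sticks to required, autofocus, pattern, min, max and step on form controls, plus start on ol and async on script. The action URL, the field names, the five-digit pattern and the app.js file name are assumptions of mine, not something taken from the draft.

```html
<!-- "/order", the field names and "app.js" are hypothetical -->
<form action="/order" method="post">
  <!-- required: the field must be filled in before the form is submitted;
       autofocus: the control receives focus when the page loads -->
  <p><label for="customer">Name</label>
     <input type="text" id="customer" name="customer" required autofocus></p>

  <!-- pattern: the value must match this regular expression -->
  <p><label for="zip">ZIP code</label>
     <input type="text" id="zip" name="zip" pattern="[0-9]{5}"
            title="Five-digit ZIP code"></p>

  <!-- min, max and step: limit the accepted numeric values -->
  <p><label for="qty">Quantity</label>
     <input type="number" id="qty" name="qty" min="1" max="10" step="1" required></p>

  <p><button type="submit">Order</button></p>
</form>

<!-- start is no longer deprecated: this list begins numbering at 3 -->
<ol start="3">
  <li>Third step</li>
  <li>Fourth step</li>
</ol>

<!-- async: the script may run as soon as it is available,
     without blocking the parsing of the rest of the page -->
<script src="app.js" async></script>
```

This is also what the "Client Side Form Validation" item boils down to in practice: the constraints live in the markup, and a supporting browser checks them before submitting.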

HTML4 didn't have a concept of an attribute that applies to every element. HTML5 calls such attributes global attributes. The following attributes from HTML4 are made global attributes:

  • class, dir, id, lang, title

The following new attributes are global attributes:

  • contenteditable, contextmenu, draggable, tabindex
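
Here is a small sketch showing three of the new global attributes next to the ones carried over from HTML4. The id, class and text content are invented for the example, and the exact attribute values may still change while the draft is under review.

```html
<!-- contenteditable: the user can edit this paragraph in place -->
<p contenteditable="true">Click here and edit this paragraph.</p>

<!-- draggable marks the element as a drag source;
     tabindex="0" puts an otherwise unfocusable element in the tab order.
     The id "card" and class "note" are hypothetical. -->
<div id="card" class="note" lang="en" title="A draggable note"
     draggable="true" tabindex="0">
  Drag me, or reach me with the Tab key.
</div>
```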

HTML5 also has global attributes that can be applied to elements from other vocabularies (when namespaced):

  • repeat (Web Forms 2)
  • repeat-start (Web Forms 2)
  • repeat-min (Web Forms 2)
  • repeat-max (Web Forms 2)

Changed Elements

These elements have new meanings in HTML5 which are incompatible with HTML4. The new meanings better reflect the way the elements are used on the Web, or give them a purpose so people can start using them.

  • a – The a element without an href attribute represents a "placeholder link".
  • address – The address element is now scoped by the new concept of sectioning.
  • b – The b element now represents a span of text to be stylistically offset from the normal prose without conveying any extra importance, such as key words in a document abstract, product names in a review, or other spans of text whose typical typographic presentation is emboldened.
  • hr – The hr element now represents a paragraph-level thematic break.
  • i – The i element now represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, a ship name, or some other prose whose typical typographic presentation is italicized. Usage varies widely by language.
  • label – The browser should not transfer focus from the label to the control unless such behaviour is standard for the underlying platform user interface.
  • menu – The menu element is redefined to be useful for actual menus.
  • small – The small element now represents small print (for side comments and legal print).
  • strong – The strong element now represents importance rather than strong emphasis.
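
To see a few of the redefined elements side by side, here is a short sketch based on the descriptions above. The sentences themselves are invented for the example.

```html
<!-- b: a product name, offset from the prose without extra importance -->
<p>We tested the <b>WidgetPhone 2</b> for a week.</p>

<!-- i: a taxonomic designation, offset in an alternate voice -->
<p>The domestic cat (<i lang="la">Felis catus</i>) is a popular pet.</p>

<!-- strong: importance, not just strong emphasis -->
<p><strong>Warning:</strong> back up your files before upgrading.</p>

<!-- hr: a paragraph-level thematic break -->
<hr>

<!-- small: small print such as legal text or side comments -->
<p><small>Prices do not include tax. Offer valid while supplies last.</small></p>
```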

Dropped Elements!

Dropping these elements means that authors are no longer allowed to use them. User agents will still have to support them, and HTML5 will probably get a rendering section in due course that says exactly how. (isindex, for instance, is already supported by the parser.)

  • acronym (use abbr instead)
  • applet (use object instead)
  • basefont
  • big
  • center
  • dir
  • font (allowed when inserted by WYSIWYG editors)
  • frame
  • frameset
  • isindex
  • noframes
  • noscript (only dropped in XHTML5)
  • s
  • strike
  • tt
  • u
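
For authors wondering what to write instead, here is a minimal sketch. Replacing acronym with abbr comes straight from the list above; using CSS for the purely presentational elements is my own suggestion rather than something quoted from the draft, and the class names are hypothetical.

```html
<!-- HTML 4 style markup that is no longer allowed for authors:

     <center><font color="red" size="5">Sale!</font></center>
     <acronym title="World Wide Web Consortium">W3C</acronym>
     <u>underlined</u> and <tt>monospaced</tt> text
-->

<style>
  /* Presentation moves into the style sheet */
  .banner     { text-align: center; color: red; font-size: 1.5em; }
  .underlined { text-decoration: underline; }
  .mono       { font-family: monospace; }
</style>

<p class="banner">Sale!</p>
<p><abbr title="World Wide Web Consortium">W3C</abbr></p>
<p><span class="underlined">underlined</span> and
   <span class="mono">monospaced</span> text</p>
```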

Dropped Attributes!

Some attributes for elements included in HTML4 are not allowed in HTML5:

Element Attributes

In addition, HTML5 has none of the presentational attributes that were in HTML4. Any attributes defined on elements that are not in HTML5 are (obviously) also not in HTML5.


APIs

HTML5 introduces a number of APIs that should help in creating web applications. These can be used together with the new elements introduced for applications:



Character Encoding

The character encoding can be declared using the meta element, but the syntax of the meta element has changed. In HTML 4.01 and earlier, the declaration looked like this:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

In HTML5, the syntax was simplified to remove the unnecessary markup while remaining compatible with the encoding detection implemented in most existing browsers.

<meta charset="UTF-8">


DOCTYPE

In HTML 4, the DOCTYPE was long and complicated, and very few people could actually remember it all. The complex PUBLIC and SYSTEM identifiers are used to refer to the DTD. But because there is no DTD in HTML5, the PUBLIC and SYSTEM identifiers have been taken out, leaving the minimal amount of code that is both easy to remember and triggers standards mode. For comparison, the HTML 4.01 Strict DOCTYPE is:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

Thus, in HTML 5, the DOCTYPE will simply be:

<!DOCTYPE html>

This does not apply to XHTML 5, for which there is no DOCTYPE sniffing and no need for any DOCTYPE at all.
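
Putting the short DOCTYPE and the simplified character encoding declaration together, a complete page in the HTML (non-XML) serialization could look roughly like this sketch. The title and body text are placeholders.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- Declare the encoding early in head so the browser sees it
       before any non-ASCII text -->
  <meta charset="UTF-8">
  <title>Hello, HTML 5</title>
</head>
<body>
  <p>A minimal page that triggers standards mode.</p>
</body>
</html>
```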




Here’s a very interesting presentation by Lachlan Hunt at the Web Standards Group meeting in Sydney on 2007-01-25.

You may also download the presentation slides (PowerPoint) and audio recording (Ogg Vorbis).

Cheers!
Marcelo Paiva

3 comments:

Unknown said...

Marcelo, thanks for the great overview. I think it behooves all serious front-end coders to join the W3C working group for HTML5, and naturally to promote those browser vendors moving toward compliance.

Marcelo Paiva said...

Thanks Aaron. Let's hope things move quickly. As soon as I learn about the timeframe I'll post it here. -mp

Charusmitha said...

Marcelo, the W3C recommendation is targeted to be out in Q3 of 2010.