Web Devout tidings


Archive for the 'HTML 5' Category

Self-contradictions in the HTML WG

Tuesday, May 22nd, 2007

HTML 5 will include the embed element, font element, and other elements which could easily be replaced using better and well-supported features. Ian Hickson wants them included anyway.

HTML 5 will not include the headers attribute for tables, which was designed to aid accessibility. Ian’s reason for not including it? It can easily be replaced with another feature (the scope attribute).

Ian wants all of the bad practice features of HTML included in HTML 5 because they are widely used.

Ian doesn’t want the good practice headers attribute included even though it’s widely used.

Can someone please tell me what’s going on?

Update 2007-06-01: Here’s a relevant article from Juicy Studio: The HTML Scope/Headers Debate.
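
For readers who have not used either attribute, here is a minimal sketch of the same table marked up both ways. The ids and cell contents are made up for illustration; both attributes are defined in HTML 4.01.

```html
<!-- Association by scope: each header cell declares what it covers -->
<table>
  <tr>
    <th scope="col">Browser</th>
    <th scope="col">Engine</th>
  </tr>
  <tr>
    <td>Firefox</td>
    <td>Gecko</td>
  </tr>
</table>

<!-- Association by headers: each data cell points at its header cells by id -->
<table>
  <tr>
    <th id="browser">Browser</th>
    <th id="engine">Engine</th>
  </tr>
  <tr>
    <td headers="browser">Firefox</td>
    <td headers="engine">Gecko</td>
  </tr>
</table>
```

The headers form is more verbose, but it lets a data cell name its header cells explicitly, which matters in tables where row or column position alone does not identify them.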

Other web standards experts worried about HTML 5

Friday, May 11th, 2007

More web standards experts have begun expressing worries about the direction HTML 5 is currently taking. Roger Johansson, writer for the excellent 456 Berea Street blog, has written a few posts on the subject. From one of the posts: “What is currently going on in the W3C HTML Working Group is very disappointing and something I never expected to see when I joined it. I was naive enough to think that everybody joining the HTML WG would be doing so out of a desire to improve the Web. Unfortunately, that does not seem to be the case”.

Check out the posts and comments in the following links:

HTML 5: common practice vs. good practice

Sunday, April 29th, 2007

Because of the way it documents current browser practices, the HTML 5 specification may inadvertently encourage bad web development practices.

One of the big reasons for a lot of the junk that’s currently planned in HTML 5 is that new user agents developed in the future should be able to reasonably handle preexisting content on the Web. The HTML 5 specification will describe what browsers are currently doing with a lot of content which maybe wasn’t considered part of a standard before now, and future web browsers would only have to follow the HTML 5 specification in order to handle much of the legacy content on the Web.

I do recognize the need for this. However, I don’t think this idea is being delivered properly. If we aren’t careful, web developers will end up looking at things like the font and embed elements and say, “Well hey, they’re here, they’re standard, why not use them?” I don’t care how many times you say in the specification that authors shouldn’t use them; if people receive even the slightest hint that they’re considered standard, they will.

What I feel needs to happen is a much clearer, physical separation between the parts of the standard meant only for browsers rendering legacy content and the parts meant for web developers following good practices. I’m not sure if I’d go as far as publishing two separate and complementary standards, but I feel that the legacy stuff should at least be isolated into its own major section of the specification with a fat heading something along the lines of “Crap that browsers support for legacy reasons”. None of this legacy content should pass a webpage validation, and it should be made perfectly clear that use of those features on new webpages is a violation of the standard.

But I somehow have a feeling that this won’t happen. Even the current Web Applications 1.0 specification says that WYSIWYG editors are allowed to use font tags in their output. It mentions nothing to dissuade the use of the embed element even though common uses like Flash can be done purely through an object element in all of today’s major browsers. The specification generally carries an attitude of “if it’s out there right now, it’s valid” which I think will end up only encouraging bad practices and resulting in a lot more problems in the future.
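
As a rough sketch of what embedding Flash through an object element alone can look like (the file name movie.swf and the dimensions are placeholders, and this is one commonly used pattern rather than the only one):

```html
<!-- movie.swf is a placeholder file name -->
<object type="application/x-shockwave-flash" data="movie.swf" width="400" height="300">
  <!-- The movie param repeats the URL, which Internet Explorer's plugin handling relies on -->
  <param name="movie" value="movie.swf">
  <!-- Fallback content for user agents without the plugin -->
  <p>This content requires the Flash plugin.</p>
</object>
```

No embed element appears anywhere, and the fallback paragraph gives user agents without the plugin something meaningful to render.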

The whimsical world of HTML 5

Monday, April 23rd, 2007

A lot of scary stuff is going on in HTML 5 development. You know all the things we’ve learned about browser/engine-neutral code, building standards on top of other standards, using semantic markup, and so on? Well from what I’ve seen, the HTML working group seems to be throwing all of that out the window.

I should first note that I only just recently subscribed to the HTML WG mailing list, and I haven’t yet had a chance to read the full breadth of the discussion, but the talk right now seems to be gathered around something called “bugmode”, a new standard mechanism for browsers to add an infinite number of “quirks modes” which webpages can subscribe to. It’s currently proposed as something like this:

<html bugmode="ie7 gecko1.8 opera9">

This would basically cause these browsers to use snapshots of the respective layout engines when displaying the page. All future versions of Internet Explorer would use the IE 7 engine, all future versions of Firefox would use the Gecko 1.8 engine, and all future versions of Opera would use the Opera 9 engine.

Am I the only one who thinks this is a terrible idea?

First of all, since when do web developers experience significant problems with new versions of Firefox or Opera? I’ve never had anything important break with the release of a new version. I’ve only experienced such a problem in Internet Explorer, since IE has to fix major implementation flaws in very fundamental areas of the standards, like the basic behavior of the width and height properties. IE is uniquely in this position because most of its engine was developed before the current CSS standards were in place (Microsoft basically extrapolated from CSS 1 however it saw fit at the time) and the engine then went half a decade with no development work to correct the inconsistencies.

So I personally wouldn’t mind it if IE added some sort of conditional comment type of thing to target new quirks modes in IE, but I don’t see why there should be a whole new attribute added to the HTML standard just for triggering new browser-specific quirks modes.
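
For reference, this is what IE’s existing conditional comments look like. The example only shows the current mechanism for targeting a specific IE version; it is not a proposal for how a quirks-mode opt-in would be spelled, and the stylesheet name is hypothetical.

```html
<!-- Other browsers see an ordinary comment; IE 7 processes the contents -->
<!--[if IE 7]>
  <link rel="stylesheet" type="text/css" href="ie7-fixes.css">
<![endif]-->
```

Because the whole construct is an ordinary comment to every other parser, it adds nothing to the HTML standard itself, which is the property that makes this kind of approach attractive.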

I should point out that this is still very much a brainstorming session, and this idea may fade away in a couple of weeks, but I’m still bothered by the number of people who seem to be taking this discussion seriously.

I talked a little more about this issue in a comment on Chris Wilson’s blog.

Now, Ian Hickson, who was responsible for a lot of the WHATWG Web Applications 1.0 work and will serve as an editor for the W3C HTML 5 specification, has it in his mind that the HTML WG is chartered to deviate HTML 5 from SGML. That is, he believes it is one of the stated intentions of the HTML WG that future versions of HTML will not be SGML languages.

Here is the charter quote from which he derived this idea:

The Group will define conformance and parsing requirements for ‘classic HTML’, taking into account legacy implementations; the Group will not assume that an SGML parser is used for ‘classic HTML’.

The charter uses the term “classic HTML” to refer to non-XHTML HTML. In SGML terms, this would be the markup using HTML 4.01’s SGML declaration, rather than XML as used by XHTML. Currently, no major web browser uses a full-featured SGML parser to parse classic HTML content. Therefore, it is wise not to assume that a browser can handle any SGML rules thrown at it in a new version of HTML. What the charter is saying is that the group will take into account this fact when developing the new standard. It does not say that HTML 5 shouldn’t be parseable by an SGML parser; it just says not to assume that an SGML parser will always be used.

However, Ian Hickson and others in the HTML WG have used this twisted interpretation of the charter as an excuse to unnecessarily break compatibility with the SGML standard. I’ll say it again: unnecessarily breaking compatibility with the SGML standard. I haven’t yet seen anything they’re trying to accomplish with HTML 5 that couldn’t be done in an SGML-compatible way.

They want to circumvent the issue of XML-style self-closing tag constructs causing problems in user agents which support the default SGML syntax for null end tags? Just set NETENABL to NO in the SGML declaration. This wouldn’t expressly allow XML-style self-closing constructs in HTML, and in these cases the “/” character would be considered invalid, but it brings a fully-compliant SGML parser to the behavior that all major browsers currently exhibit. Note that this would only be handled as intended when used on elements defined as EMPTY, as is currently the case in all major web browsers. If you were to truly support XML-style self-closing tags even for non-EMPTY elements (which may indeed require a significant departure from how HTML is currently constructed), that would cause problems with legacy user agents, which the HTML WG charter says to avoid. A change to the SGML declaration would be somewhat of an issue for fully compliant SGML parsers, since they generally use the Content-type header to determine which SGML grammar is being used, and we should probably avoid giving HTML 5 a different content-type than HTML 4, but at least this would keep HTML 5 compatible with SGML so it isn’t impossible for an SGML parser to parse it.
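
To illustrate the null end tag problem, here is a sketch of how a strict SGML parser working from HTML 4.01’s declaration would, as I understand the rules, read XML-style syntax. Browsers do not actually behave this way, which is the whole point.

```html
<!-- To a strict SGML parser, the "/" ends the start tag as a null end tag,
     so the ">" that follows is plain character data: -->
<br />

<!-- which such a parser treats roughly like: -->
<br>&gt;

<!-- The same shorthand on a non-empty element: -->
<p/Hello/
<!-- is equivalent to: -->
<p>Hello</p>
```

Disabling null end tags in the SGML declaration would make the first case an error rather than a stray character, which matches what authors already expect from browsers.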

It has also been proposed that HTML 5 should have no DTD. For similar reasons, I ask, why? I’ve seen the proposed elements in Web Applications 1.0, which is roughly considered the starting point for HTML 5 development, and I don’t see anything there that would require the absence of a DTD. I’m curious what the W3C Validator development team thinks of this. The W3C Validator currently operates strictly via an SGML/DTD parser (the upcoming new version of the Validator also comes equipped with an XML parser in order to also check for well-formedness). Without a DTD, the validator would have to hard-code all of the rules for HTML 5. And how exactly does omitting a DTD benefit anyone?

Not only does Ian Hickson want to omit a DTD, but he doesn’t seem to think that a version indicator is even necessary. His proposed new doctype declaration is simply <!DOCTYPE html>. So that’s it. Every future version of HTML had better be 100% backwards compatible. No mistakes may be made or else the HTML standard is screwed for life. I think history has shown us that this assumption that we can reasonably keep a sane standard backwards-compatible forever is a bit unwise. At one time, the isindex element seemed like a good idea. There are plenty of people who want the q element redefined in HTML 5 so that the browser doesn’t display quotation marks by itself. HTML 5 already attempts to redefine some elements and attributes from HTML 4. I guarantee that there are features currently in Web Applications 1.0 which people are going to see as a mistake several years down the road and want to correct. It will end up causing compatibility problems if there isn’t a version number to go along with those changes. Maybe we’ll have to use bugmode after all.
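
For comparison, here are the two declarations side by side. They are alternatives, not lines meant to appear in the same document.

```html
<!-- HTML 4.01 Strict: the public identifier names the version and points at a DTD -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

<!-- Proposed for HTML 5: no version, no DTD reference -->
<!DOCTYPE html>
```

With the first form, a validator or user agent can tell which version of the language the author was targeting. With the second, it cannot.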

Speaking of new features, let’s talk about some of them. To start off, there are some good things proposed in Web Applications 1.0. I like the section element, nav element, article element, aside element, the redefinition of the dl element, and some of the other stuff. But there are some elements and attributes that just make me scratch my head:

  • Why do we have a canvas element? Why not simply use a script to apply some state to any given element to turn it into a canvas? People who have worked with the Google Maps API are familiar with the idea of using a script to replace an arbitrary element (be it a div, p, etc.) with a new object. In most cases, a canvas element could be replaced with a div element, and then the script just sets it to a canvas just as browsers often allow scripts to set arbitrary elements to be contentEditable. What ever happened to semantic markup? What semantics does a canvas element express?
  • ping attributes? In my a? Thanks for slowing my Web experience and using up more of my bandwidth so that advertising companies can track my habits. Much appreciated. I hope my browser quickly adds an option to disable this functionality, because I for one don’t want it. If a website is going to gossip to others about how I’m using the site, it should put in the effort to do it server-side with its own bandwidth.
  • embed element, why won’t you die? Is it the popular thing these days to just call whatever is out there on the Web “the standard”?
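
A small sketch of the markup in question, for readers who have not looked at the draft. The ids, URLs, and link text are placeholders, and no script API for turning a div into a drawing surface is assumed to exist.

```html
<!-- A dedicated element whose only role is to be a drawing surface -->
<canvas id="chart" width="300" height="150"></canvas>

<!-- The alternative argued for above: a generic container that a script
     would turn into a drawing surface (no such script API is assumed here) -->
<div id="chart-alt"></div>

<!-- ping: when the link is followed, the browser also notifies the listed URL -->
<a href="http://example.com/article" ping="http://tracker.example.com/log">Read the article</a>
```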

I could go on, but my point is that a lot of stuff is being proposed pretty quickly, and I question the motivation and thought behind a lot of these propositions. People seem to be caught up on how to add such-and-such functionality to web apps rather than focusing on semantics and other things we were supposed to have learned since the old boom days of the Web. I dunno, it just feels like we’ve been through all of this before. Even though this is being discussed in a public forum, the types of propositions are all too reminiscent of the seemingly random “sounded-good-at-the-time” features Netscape and Internet Explorer kept adding during the last browser wars. Does anyone know where I can buy some cheap shock collars?

W3C to resume HTML standard development

Monday, October 30th, 2006

Tim Berners-Lee, the W3C director and inventor of the Web, recently made a blog post announcing plans to charter a new HTML working group to make incremental additions to the HTML standard. In his post, he acknowledged problems with getting the Web switched over to XHTML, and determined that such a progression must be done more gradually.

Some things are clearer with hindsight of several years. It is necessary to evolve HTML incrementally. The attempt to get the world to switch to XML, including quotes around attribute values and slashes in empty tags and namespaces all at once didn’t work. The large HTML-generating public did not move, largely because the browsers didn’t complain.

One of the chief problems with the adoption of XHTML is the complete lack of support by Internet Explorer and a number of search engines and other user agents. As a result, webpages that are marked up as XHTML are often sent to the browser using the text/html content type instead of the proper application/xhtml+xml content type, thus causing browsers to treat the page like HTML instead of XHTML. This has led to lots of “bad” XHTML that, if a browser were to attempt to treat it like real XHTML, would completely fall apart. More problems with XHTML are discussed in the Beware of XHTML article.
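
A minimal, made-up example of the kind of “bad” XHTML described above. Served as text/html, browsers render it without complaint; served as application/xhtml+xml, an XML parser would stop at the first well-formedness error.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Menu</title>
  </head>
  <body>
    <!-- Unescaped ampersand and unclosed br: tolerated by HTML parsers,
         fatal well-formedness errors for an XML parser -->
    <p>Fish & chips<br></p>
  </body>
</html>
```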

Tim Berners-Lee also mentioned the advent of the Web Hypertext Application Technology Working Group (WHAT WG), an open standards organization that works separately from the W3C in an attempt to more immediately address the interests of real-world web applications developers. WHAT WG is led by Ian Hickson, who has participated in the development of both Opera and Mozilla products and currently works for Google. WHAT WG has received some criticism that it has departed from the ideals of the semantic web and some of the foundation of today’s established standards. Largely through Ian Hickson’s influence, Opera and Firefox have over the last few versions added support for a number of features in WHAT WG’s Web Applications 1.0 specification.

Berners-Lee hopes that with the chartering of the new HTML working group, parties that are interested in the development of the HTML standard will return from separate efforts like WHAT WG back to the W3C process.

The plan is to charter a completely new HTML group. Unlike the previous one, this one will be chartered to do incremental improvements to HTML, as also in parallel xHTML. It will have a different chair and staff contact. It will work on HTML and xHTML together. We have strong support for this group, from many people we have talked to, including browser makers.