Web Devout tidings

Archive for February, 2007

Validity and well-formedness

Tuesday, February 20th, 2007

I’ve just published a new web development article called Validity and Well-Formedness, which explains the distinctions between valid and well-formed XHTML.

If the W3C HTML Validator says your XHTML page is valid, that means it’s also well-formed, right? Wrong! This article has several examples of XHTML documents which are perfectly valid but malformed, and which won’t even load in an XML parser.
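As a rough illustration of what “won’t even load in an XML parser” means, here is a minimal Python sketch of a well-formedness check. The function name and both sample documents are my own assumptions for illustration; they are not the examples from the article, and the malformed sample below would fail validation too. The sketch only shows that well-formedness is tested by a different tool (an XML parser) than validity.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the XML parser accepts the markup without a fatal error."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        # Any well-formedness violation is fatal to an XML parser.
        return False
    return True


# Hypothetical sample documents (not taken from the article).
WELL_FORMED = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>t</title></head>"
    "<body><p>Hi<br /></p></body></html>"
)
MALFORMED = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>t</title></head>"
    "<body><p>Hi<br></p></body></html>"  # <br> is never closed
)

if __name__ == "__main__":
    print(is_well_formed(WELL_FORMED))
    print(is_well_formed(MALFORMED))
```

Note that this check says nothing about validity: it never looks at a DTD, only at whether the markup is syntactically acceptable XML.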

XHTML 1.1 Second Edition WD allows text/html… why?

Saturday, February 17th, 2007

An XHTML 1.1 Second Edition Working Draft has just been published in an attempt to correct some problems with the previous Recommendation. However, in the Strictly Conforming Documents section, I feel that they have introduced a brand new problem. The paragraph in question is:

XHTML 1.1 documents SHOULD be labeled with the Internet Media Type text/html as defined in [RFC2854] or application/xhtml+xml as defined in [RFC3236]. For further information on using media types with XHTML, see the informative note [XHTMLMIME].

text/html is now included with application/xhtml+xml as one of the content types XHTML 1.1 “should” be sent as. I see this as a big mistake that should be fixed before the specification advances further. In theory and in practice, the content type header tells the user agent what kind of file it is dealing with. Every major web browser parses text/html documents as HTML, not XML / XHTML. The fact that XHTML 1.0 allowed this has led to a widespread misunderstanding of what XHTML is and how user agents handle it. As a result, the vast majority of so-called “XHTML” documents are in such a state that, if they were handled as XHTML was meant to be handled, they would fall apart. Even supposed web standards experts fall victim to this misunderstanding all the time, as is evident in this list of standards-related sites that break as XHTML.
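To make the role of the content type concrete, here is a small Python sketch of a user agent that picks its parser from the media type alone. It is a toy model, not how any real browser is implemented: the function and class names and the sample markup are hypothetical, and Python’s html.parser stands in for a browser’s forgiving HTML parser.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Lenient HTML parser that records the start tags it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def handle_document(content_type: str, body: str) -> str:
    """Choose a parser from the Content-Type header, never from the markup."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "text/html":
        collector = TagCollector()
        collector.feed(body)
        collector.close()
        return "parsed as HTML: " + ",".join(collector.tags)
    if media_type == "application/xhtml+xml":
        try:
            ET.fromstring(body)
        except ET.ParseError:
            return "fatal XML error"
        return "parsed as XML"
    return "unsupported media type"


# Hypothetical sloppy page: the <p> elements are never closed.
SLOPPY = "<html><body><p>one<p>two</body></html>"

if __name__ == "__main__":
    print(handle_document("text/html; charset=utf-8", SLOPPY))
    print(handle_document("application/xhtml+xml", SLOPPY))
```

The same bytes are tolerated under text/html and rejected outright under application/xhtml+xml, which is why a page that has only ever been served as text/html gives its author no feedback about whether it works as XHTML.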

XHTML was designed in part to move on from the old state of the Web, which tolerated invalid markup and sloppy legacy behaviors in CSS and elsewhere. However, because XHTML was allowed to be sent as text/html, people are in essence writing XHTML just like they wrote HTML before, or even worse. The document looks like XHTML, but its author depends on browsers treating it as HTML. Most so-called XHTML pages on the Web today aren’t well-formed, which is exactly what XML and XHTML were designed not to tolerate. XHTML on the Web has been the same disaster HTML was, except the situation is even more complicated than before. XHTML 1.0 has failed.

Now, XHTML 1.1 is about to do the same thing. By allowing the use of an incorrect content type that instructs browsers to use incorrect behavior, the specification authors are promoting the incorrect use of XHTML.

What warrants this change? Is it because most XHTML pages on the Web use the wrong content type? The road to progress is not to simply approve of whatever poor and harmful practices are used on the Web. XHTML is an XML format. That’s the only significant thing that sets it apart from HTML. If you’re going to allow it to be served and handled as plain old HTML, why bother having an XHTML standard at all? To put it another way, if you’re going to allow an XHTML document to be sent as text/html, which in turn would cause all major browsers to treat it as plain old HTML, why not instead recommend the use of HTML for those documents? Doing otherwise simply pollutes the already poor state of XHTML on the Web even further.

For further reading about the problems with XHTML on the Web today, see the Beware of XHTML article.

Web Devout infrastructure changes

Saturday, February 3rd, 2007

Web Devout has adopted a somewhat more user- and search engine-friendly format for its URLs. For example, instead of /browser_support.php it’s now /browser-support. The old URLs should automatically redirect to the new ones, but it’s possible that something was overlooked. If you experience an unexpected 404 or other error, please let me know as soon as possible.
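For anyone updating links by hand, the change in the example above can be sketched as a small Python function. This is only a guess generalized from that single example (drop the .php extension, turn underscores into hyphens); the function name is hypothetical, and the actual redirect rules on the server may differ for some pages.

```python
from typing import Optional


def new_style_path(old_path: str) -> Optional[str]:
    """Map an old-style path such as /browser_support.php to /browser-support.

    Assumed rule, inferred from one example: strip ".php" and replace
    underscores with hyphens. Returns None for paths without ".php".
    """
    if not old_path.endswith(".php"):
        return None  # already new-style, or not covered by the assumed rule
    return old_path[: -len(".php")].replace("_", "-")


if __name__ == "__main__":
    print(new_style_path("/browser_support.php"))
```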