Web Devout tidings

Re: 55 reasons to design in XHTML/CSS

Tuesday, May 8th, 2007

Someone pointed me to an article entitled “55 reasons to design in XHTML/CSS”, which attempts to explain some of the benefits of certain types of webpage design over others. I’m not going to argue against the points which favor semantic markup over presentational markup (because semantic markup is indeed a good thing), but there are lots of false or plain pointless claims about XHTML in there, so I want to address the points one by one. Open the above link in another tab so you can see the original points along with my responses.

1a. The CSS Zen Garden is about separation of content and presentation, not XHTML versus HTML. They happen to use XHTML (incorrectly, I might add, since most of the designs fall apart when you force a browser to parse it as XML instead of HTML like they’re currently doing), but this site doesn’t demonstrate any benefits of XHTML over HTML. If anything, it demonstrates problems with XHTML, for the reason described in the previous sentence.

1b. Stylegala accepts either XHTML or HTML content, as long as it’s valid. Like the Zen Garden, this is a separation of content and presentation issue, not an XHTML vs. HTML issue.

1c. CSS Import is the same deal. It’s about separation of content and presentation, not XHTML vs. HTML.

1d. CSS Beauty is the same deal again.

2. This point is ridiculous. I work exclusively with HTML, but I don’t have to spend any extra time thinking about whether or not to quote attribute values. I always quote attribute values. I always use lower-case tag names and attribute names. I always include end tags for non-empty elements. I always escape my ampersands and less-than signs with entities (as well as double quotes and greater-than signs, if only for consistency, although that isn’t a difference between HTML and XHTML). These are just good practice markup rules, and anyone who does web design regularly has their own styles (hopefully subtle variants of already established best practices) which they do without thinking. If you’re sitting at your computer contemplating whether or not you should write “input” or “INPUT”, you must not have much experience.

3. How is this any different from XHTML following the legacy compatibility guidelines (which are required for Internet Explorer support)? If you’re serving XHTML as text/html, you simply cannot write <div /> and expect browsers to consider that closed. Likewise, you simply cannot write <br></br> and expect all browsers to consider that a single element. Several major browsers actually consider that last example to be two br elements! In XHTML, not only do you have to think about which elements require which style of closing, but you also have to think about new compatibility issues between typical HTML parsing of XHTML (which is currently the most common way XHTML is parsed) and proper XML parsing of XHTML (which, sadly, is often overlooked, even though this should be the single correct way of parsing).
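
The XML half of that comparison can be checked with any conforming XML parser. The sketch below uses Python's standard-library ElementTree as a stand-in; the helper name child_tags and the sample fragments are made up for illustration. It does not reproduce what a browser's HTML parser does with the same markup, only what "proper XML parsing" means here.

```python
import xml.etree.ElementTree as ET


def child_tags(markup):
    """Parse markup as XML and return the tags of the root's direct children."""
    return [child.tag for child in ET.fromstring(markup)]


# Both spellings of an empty br produce exactly one br element.
print(child_tags("<p>one<br></br>two</p>"))   # ['br']
print(child_tags("<p>one<br />two</p>"))      # ['br']

# An empty div is closed on the spot, so the paragraph is its sibling,
# not its child.
print(child_tags("<body><div /><p>after</p></body>"))  # ['div', 'p']
```

Under XML rules the two closing styles are interchangeable. The compatibility trouble comes entirely from the HTML parsers that handle the same bytes when they arrive as text/html.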

4. That’s a semantic markup issue, not HTML vs. XHTML.

5. Another semantic markup issue.

6. This isn’t really true. XHTML 2.0 is, by design, completely incompatible with XHTML 1.x. They’re both supposedly parseable as XML, but so what? You can do an XSL transformation? The semantics and document structure in the current proposed drafts of XHTML 2.0 have some fundamental differences from XHTML 1.x, and you can’t just perform an XSL transformation on any XHTML 1.x document and expect it to result in XHTML 2.0 with all of the proper semantics. In order for XHTML 2.0 to be used correctly, you’ll have to do the markup from scratch. That is, unless you don’t mind the Web being polluted with a bunch of XHTML 2.0 documents containing improper semantics.

7. Separation of content and presentation issue, not HTML vs. XHTML.

8. Semantic markup / separation of content and presentation issue, not HTML vs. XHTML.

9. Semantic markup issue, not HTML vs. XHTML.

10. Separation of content and presentation issue, not HTML vs. XHTML.

11. Separation of content and presentation issue, not HTML vs. XHTML.

12. (X)HTML vs. Flash issue, not HTML vs. XHTML.

13. (X)HTML vs. Flash issue, not HTML vs. XHTML.

14. Separation of content and presentation issue, not HTML vs. XHTML.

15. Separation of content and presentation issue, not HTML vs. XHTML.

16. I’m not even sure what this point is trying to say. CMSs basically exist so that you don’t have to write a whole site backend yourself. What does clean markup have to do with whether or not you need an elaborate site backend?

17. I completely agree with this point, but it has nothing to do with HTML vs. XHTML. Check out the source at http://www.webdevout.net/. How is this any less “clean” than the XHTML equivalent at http://www.webdevout.net/?output=xhtml (aside from the removal of the comments, which was done just to simplify the on-demand conversion)? If anything, XHTML has more junk thrown in. You may personally prefer seeing “/” characters in anything that ends an element, but you must be pretty darn new to web design if you don’t realize that link elements always end immediately after starting. And if you’re that new, you probably can’t read the XHTML version any more easily than the HTML version. In any case, I have seen XHTML markup far, far uglier than my HTML markup. Following best practices, typical HTML and typical XHTML do not have significant differences in “cleanliness” compared to other issues like indentation and semantic markup.

18. Separation of content and presentation issue, not HTML vs. XHTML.

19. Semantic markup / separation of content and presentation issue, not HTML vs. XHTML.

20. Separation of content and presentation issue, not HTML vs. XHTML.

21. Semantic markup / separation of content and presentation issue, not HTML vs. XHTML. And I probably wouldn’t use the word “automatically” here.

22. This is more a CSS issue than a markup issue at this time, and I don’t think it’s quite as true as a lot of people seem to believe. In regard to XHTML, the pace of browser development isn’t likely to be swayed to any significant degree by a few more websites adopting “real” (IE-incompatible) XHTML. The Internet Explorer development team has already said that they plan to add real XHTML support in an upcoming version, but they just want to make sure they support it correctly before they roll it out. It would suck for IE to launch it early and have bugs in well-formedness detection (some well-formed pages might not display at all, or malformed pages might accidentally be seen as well-formed, which means the web developer might not catch the error and other browsers might not be able to display the page) or the kinds of CSS and DOM bugs that Opera currently has with XHTML parsed as XML.

23. In HTML, all elements are closed. An empty element with an omittable end tag is automatically closed right after the start tag in HTML (and any other SGML-based language that allows omittable end tags). I assume he meant it looks cleaner with end tags (or null end tags, the /> things). That’s a matter of personal preference. As far as the markup itself, I think XML generally looks less clean with the extra / characters everywhere and/or end tags immediately after start tags. Sure, you could say that the document structure in XML is more obvious without having to know any of the DTD rules, but hey, the navigation structure at craigslist is obvious, but I wouldn’t call it “clean”.

24. This is absolutely wrong. Since XHTML sent with the typical text/html content type isn’t even parsed as XML by any major browser, well-formedness is a non-issue when it comes to how those browsers parse the document. Take a well-formed XHTML page, change every instance of /> to just >, send it along as text/html, and see what browsers do with it. No difference. You have a horribly malformed XML document, and there isn’t a single major browser that even notices. Now go back to your original well-formed XHTML document and add a few <div /> things here and there. It’s still well-formed, and those few empty divs shouldn’t make much of a difference, right? Wrong. Most major browsers suddenly make a complete mess of your page because they think everything after each of the <div /> things is inside the div! Now take an XHTML document with an XML declaration, and insert a blank line above the XML declaration. Go ahead and validate that with the W3C Validator. It says it’s valid? Great, except guess what? It isn’t well-formed. The current W3C Validator doesn’t check for well-formedness, only validity with an SGML parser in XML mode, so when parsed with a real XML parser (any major non-IE browser if the document is sent as application/xhtml+xml), this lovely XHTML page to which the W3C Validator gave the green light completely fails to load! Try the new W3C Validator beta which uses a real XML parser. The new one says it failed validation, while the old one said it passed. Looks like well-formedness is a much more slippery issue than the writer of this article would have you believe.
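
Two of those experiments are easy to reproduce outside a browser. The following is a minimal sketch using Python's standard-library ElementTree (built on the expat parser) as a stand-in for "a real XML parser"; the tiny sample document and the helper name well_formedness_error are assumptions made for illustration, not anything from the original article or the W3C Validator.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(markup):
    """Return None if markup is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return str(err)
    return None


# A small, well-formed XHTML document (hypothetical sample).
GOOD = (
    '<?xml version="1.0"?>\n'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>t</title></head>"
    "<body><p>one<br />two</p></body></html>"
)

# Experiment 1: change every "/>" to ">".
NO_SLASH = GOOD.replace("/>", ">")

# Experiment 2: put a blank line above the XML declaration.
BLANK_FIRST = "\n" + GOOD

for label, doc in [("good", GOOD), ("no slash", NO_SLASH), ("blank first", BLANK_FIRST)]:
    print(label, "->", well_formedness_error(doc))
```

Only the first document parses. The other two are fatal errors to an XML parser, even though an HTML parser handling them as text/html would not care about either change.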

25. Huh? You really think any major browser today is spending any significant amount of time on the HTML parser error handling anymore? Are we still living in the ’90s? That job is pretty much done. Instead, the author is suggesting that browsers like IE spend more time adding support for proper XHTML rather than things like CSS? Seems a bit self-contradictory to me.

26. Almost every page on the Web is currently sent with the text/html content type. This means all major browsers use an HTML parser on them. Because no major web browsers are enforcing well-formedness, the vast majority of XHTML content on the web is malformed and can only be parsed using a typical HTML parser. Nearly the entire Web currently requires an HTML parser. Future web browsers will have to support this content. This is why HTML 5 is going to define exactly what browsers are currently doing with HTML (rather than referencing the SGML specification which browsers were supposed to be following). If you use HTML, it will work in future web browsers, or else nearly the entire Web will be broken in them. No browser developers who want people to use their product for general web browsing will let all of that content just break. In the real world, HTML is very future-compatible.
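
The role of the content type can be sketched as a simple dispatch. This is a toy model, assuming Python's standard-library html.parser as a stand-in for a forgiving HTML parser and ElementTree for a strict XML one; the function name load, the XML_TYPES set, and the sample markup are all hypothetical, and real browser parsers are far more elaborate.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Content types that select the XML parser in this toy model.
XML_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}


def load(markup, content_type):
    """Pick a parser from the content type, as browsers do.

    Returns (loaded, note).
    """
    if content_type in XML_TYPES:
        try:
            ET.fromstring(markup)
        except ET.ParseError as err:
            return False, f"XML parser: fatal error ({err})"
        return True, "XML parser: well-formed"
    # Anything else goes to the HTML parser, which never rejects a
    # document for being malformed.
    parser = HTMLParser()
    parser.feed(markup)
    parser.close()
    return True, "HTML parser: no well-formedness check performed"


# Typical "XHTML" found in the wild: the br element is never closed.
MALFORMED = "<html><body><p>one<br>two</p></body></html>"

print(load(MALFORMED, "text/html"))
print(load(MALFORMED, "application/xhtml+xml"))
```

The same bytes load under text/html and fail under application/xhtml+xml, which is why malformed XHTML survives only as long as it is served as HTML.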

27. It’s true that there are still some mobile browsers which use XML parsers exclusively, but these will die out in time. It’s already possible for a full HTML-parsing browser to run on a mobile device without a significant difference in energy consumption compared to an XML-only one. As technology improves, it will be trivial to put something like Opera or KHTML (Konqueror’s free and lightweight engine, from which Safari’s engine was forked) on any mobile device with no significant energy or performance impact. The XML-only mobile browsers today are seldom intended for regular web browsing, since well-formed and valid XHTML is nearly nonexistent on the Web, relatively speaking, and that isn’t likely to change any time in the near future. Market forces are simply going to require HTML parsing on those devices if they are intended for general web browsing. Again, let’s be realistic. HTML isn’t going away.

28. Again, no major browser treats typical XHTML as XML, so you aren’t really dealing with XML in the first place. You’re dealing with bad HTML that has adopted some aspects of XML, but is still treated as just HTML. Learning XML by using XHTML is like learning web design by going to an MS FrontPage class: you may acquire some superficial exposure to it, but you’ll probably learn a bunch of crap in the process.

29. Separation of content and presentation issue, not HTML vs. XHTML.

30. What exactly would I be converting XHTML to? I’ve never come across a situation where such a thing would be useful with HTML/XHTML. If I want to provide an RSS or Atom feed, I probably already have the data in an SQL database to begin with. Doing an XSL transformation in this case would cause unnecessary overhead. And then, if I really want to, it’s not exactly difficult to convert typical HTML to XHTML. Remember the ?output=xhtml thing I mentioned above? I wrote that converter in under an hour as an afterthought. If you have a specific document you’re working with in a specific format, it’s even easier to toss around a couple regular expressions and get the data you want. I always hear people referring to XSL as an advantage of XHTML, but I wonder how many of those people have even used XSL. I’ve never needed to.
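
To give a sense of how little is involved, here is a rough sketch of an HTML-to-XHTML converter in Python. It is not the converter behind ?output=xhtml; the class name XhtmlWriter, the function to_xhtml, and the sample fragment are invented for illustration. It assumes the input already follows the habits described under point 2 (explicit end tags, escaped ampersands), so the only real work is closing empty elements. Comments and the doctype are simply dropped.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

# Elements that HTML 4.01 defines as always empty.
VOID = {
    "area", "base", "basefont", "br", "col", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param",
}


class XhtmlWriter(HTMLParser):
    """Re-emit parsed HTML with XML-style empty elements and escaping."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Minimized attributes such as "checked" arrive with value None.
        rendered = "".join(
            f" {name}={quoteattr(name if value is None else value)}"
            for name, value in attrs
        )
        closer = " />" if tag in VOID else ">"
        self.parts.append(f"<{tag}{rendered}{closer}")

    def handle_endtag(self, tag):
        if tag not in VOID:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(escape(data))


def to_xhtml(fragment):
    writer = XhtmlWriter()
    writer.feed(fragment)
    writer.close()
    return "".join(writer.parts)


SAMPLE = '<p class="x">Fish &amp; chips<br>now <img src="a.png" alt=""></p>'
print(to_xhtml(SAMPLE))
ET.fromstring(to_xhtml(SAMPLE))  # would raise ParseError if the output were malformed
```

A converter meant for arbitrary HTML would also have to supply omitted end tags, which this sketch does not attempt.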

31. Semantic markup issue, not HTML vs. XHTML.

32. Semantic markup issue, not HTML vs. XHTML.

33. Ha. Well, thankfully, I don’t have to use XHTML in order to put it on my resume.

34. This isn’t really an issue of HTML vs. XHTML, but it’s worth pointing out that Firefox 2.0 and below typically render XHTML content parsed as XML more slowly than HTML. Whether it’s HTML or XML, parsing speed is usually faster than download speed, so the browser has usually parsed the entire document by the time it finishes downloading. When using the HTML parser, it begins to display the webpage while it’s being parsed (and thus, while it’s still downloading). However, when it’s using the XML parser, it won’t display anything until it has checked for well-formedness throughout the entire document. That means nothing gets displayed to the user until the entire XHTML page has downloaded. So under both HTML and XML modes, the document usually finishes rendering at about the same time, but the HTML parser starts rendering much sooner. Firefox 3 will support incremental rendering of XML content, so the two will be about the same speed on typical Internet connection speeds.
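
The reason a cautious XML renderer waits can be shown with a toy model. The sketch below feeds a document to Python's standard-library expat bindings in chunks, as if they were arriving over the network; the function name parse_in_chunks and the three-chunk sample page are assumptions for illustration, and none of this reflects Firefox's actual implementation.

```python
import xml.parsers.expat


def parse_in_chunks(chunks):
    """Feed chunks to a strict XML parser as if they arrived over the network.

    Returns (elements_seen, error); error is None when the whole document
    turned out to be well-formed.
    """
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: seen.append(name)
    try:
        for index, chunk in enumerate(chunks):
            is_final = index == len(chunks) - 1
            parser.Parse(chunk, is_final)
    except xml.parsers.expat.ExpatError as err:
        return seen, str(err)
    return seen, None


PAGE = [
    "<html><body><p>first paragraph</p>",
    "<p>second paragraph</p>",
    "<p>third<br></p></body></html>",  # unclosed br: not well-formed
]

seen, error = parse_in_chunks(PAGE)
print(seen)
print(error)
```

The first two chunks are perfectly good, and the fatal error only shows up in the last one. A renderer that refuses to show anything from a malformed document cannot know it is safe to paint until the final chunk has been checked, whereas an HTML parser has no such verdict to wait for.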

35. But what is “the right way” exactly?

36. They aren’t: Roger Johansson, Anne van Kesteren, Jonathan Snook, Eric Meyer, the Safari team

The links in the following sub-points go to a page which simply sends the webpage contents with the application/xhtml+xml content type, which triggers XML parsing in web browsers. This is exactly how you would see the respective websites if they sent the correct XHTML content type.

36a. Not only does SimpleBits not validate, it isn’t even well-formed, so an XML parser would completely fail to load it.

36b. Shame the “Job Board” section on Jeffrey Zeldman’s site doesn’t work when the page is parsed as XML. document.write and document.writeln don’t exist in XML documents. You’d think a supposed standards expert would know that.

36c. Wow, the stylesheet really falls apart when Jason Santa Maria’s page is parsed as XML. Perhaps he doesn’t realize that CSS isn’t supposed to follow the legacy HTML rules when the page is parsed as XML, and when you set the background on the body element, it really goes on the body element, not the html element. Too bad.

36d. Shaun Inman should fix those weird spacing issues that happen when the page is parsed as XML. The search box looks out of shape.

36e. Cameron Moll’s site has a whole slew of validation errors, plus well-formedness errors. The page doesn’t display at all when parsed as XML.

36f. On StopDesign, the “Latest links” don’t appear when the page is parsed as XML. Once again, document.write doesn’t work in XML.

36g. Dave Shea’s mezzoblue has a bunch of validation and well-formedness errors. The page doesn’t display at all when parsed as XML.

For those keeping count, every single example the article gave of web standards experts using XHTML had problems when parsed as XML. Three of them (that’s 43%) couldn’t even be parsed as XML. XHTML was designed specifically to be an XML version of HTML. If these sites don’t work correctly when treated as XML, why are they XHTML? If they depend on browsers treating them like HTML and don’t make use of any benefit XHTML is supposed to offer, why weren’t they just written in HTML?

37. You’re part of the masses who use XHTML without really understanding it. I consider myself part of a movement to educate people about the problems with using XHTML this way. To each his own.

38. This isn’t really specific to the HTML vs. XHTML issue.

39. Hooray! Although I wish the same fate on some other elements which are still lingering in drafts of future specifications.

40. Which is why I write my HTML to strict guidelines.

41. Thankfully, I can write books about XHTML without using it.

42. Yes, it’s always good to know the technologies. I know XHTML very well. Of course, it doesn’t mean I use it when HTML is the better option.

43. For someone who doesn’t seem to understand that browsers handle most XHTML content on the Web as regular old HTML, I’m not sure this author is one to speak.

44. Sounds like a CSS issue.

45. You should be caring about this stuff if you’re using HTML, too.

46. Separation of content and presentation issue, not HTML vs. XHTML.

47. “XHTML has a cooler name than HTML”. This doesn’t warrant a response.

48. Yeah, and I’ve definitely seen a lot of disadvantages to using XHTML. Unfortunately, there are a lot of people who get religious about it and refuse to listen to anything bad about XHTML. But for those among you who are open-minded, I hope you read my articles on the subject.

49. Separation of content and presentation issue, not HTML vs. XHTML.

50. Or free tools, like Bluefish on Linux.

51. 1,060,000 results for “columbus discovered the world was round” > 826,000 results for “columbus didn’t discover the world was round”. The myth that Columbus had any connection with a debate over whether or not the world was round originated entirely from a fictional work published in 1828. If you read the actual history of Christopher Columbus, it was not a debate over whether or not the world was round (it was already well-known and proven that the world was round, and it was contested by very few among the masses and even fewer from more educated backgrounds); it was actually a debate over the circumference of the Earth. Columbus had some errors in his calculation (including confusing two different “mile” units from different measurement systems) and thought the world was much smaller than it turned out to be. In every respect, Columbus was wrong about his predictions as he set sail, and had absolutely no part in proving that the world was round. Yet, according to Google search results, the popular belief is that which was derived from the 1828 work of fiction. Just because something is popular on Google doesn’t mean it’s true.

52. Oh yes, just like those sites I covered in number 36. All I did for my initial check was use the Force Content-Type extension in Firefox, which just causes the browser to treat the site as if it had the application/xhtml+xml content type (also known as a MIME type). Every single one had problems. All of them. I sure hope everyone who’s writing XHTML content takes a moment to check how an XML parser would see the page, because there’s a very high occurrence of things in the markup, stylesheets, and scripts breaking, often to a major degree. If you’ve only been testing the page when it’s sent as text/html, you’ll probably have a lot more work to do than just switching the content type.

53. As mentioned before, Microsoft is already working on it. Making pages that break in IE and thus get very few visitors isn’t going to put any significant pressure on Microsoft to put more people on the job or work faster. All it means is that a lot of visitors will think your site is broken. Let Microsoft take their time and release a good XHTML engine. There’s no rush. Really, XHTML isn’t that immediately important.

54. That’s just valid markup, whether it’s HTML or XHTML.

55. Just 16 of the points above were about HTML vs. XHTML. The rest were about other issues like semantic markup. Among those 16, most of them were plainly false, and the rest were pretty much false or irrelevant. Check out my Beware of XHTML article, which explains a lot more reasons why you should probably use HTML rather than XHTML.

By the way, the article entitled “55 reasons to design in XHTML/CSS” contains invalid and malformed XHTML, and a browser attempting to parse it as XML completely fails to load the page. Here’s the parsed-as-XML version. The only reason a browser is able to display the original article at all is that all major browsers normally treat his XHTML as regular old HTML. I suggest learning the technology before spreading myths about it.