Web Devout tidings

Archive for April, 2007

HTML 5: common practice vs. good practice

Sunday, April 29th, 2007

In the way it documents current browser practices, the HTML 5 specification may inadvertently encourage bad web development practices.

One of the big reasons for a lot of the junk that’s currently planned in HTML 5 is that new user agents developed in the future should be able to reasonably handle preexisting content on the Web. The HTML 5 specification will describe what browsers are currently doing with a lot of content which maybe wasn’t considered part of a standard before now, and future web browsers would only have to follow the HTML 5 specification in order to handle much of the legacy content on the Web.

I do recognize the need for this. However, I don’t think this idea is being delivered properly. If we aren’t careful, web developers will end up looking at things like the font and embed elements and saying, “Well hey, they’re here, they’re standard, why not use them?” I don’t care how many times the specification says that authors shouldn’t use them; if people receive even the slightest hint that they’re considered standard, they will.

What I feel needs to happen is a much clearer and more physical separation between the parts of the standard meant only for browsers rendering legacy content and the parts meant for web developers following good practices. I’m not sure if I’d go as far as publishing two separate and complementary standards, but I feel that the legacy stuff should at least be isolated into its own major section of the specification with a fat heading something along the lines of “Crap that browsers support for legacy reasons”. None of this legacy content should pass webpage validation, and it should be made perfectly clear that use of those features on new webpages is a violation of the standard.

But I somehow have a feeling that this won’t happen. Even the current Web Applications 1.0 specification says that WYSIWYG editors are allowed to use font tags in their output. It mentions nothing to dissuade the use of the embed element even though common uses like Flash can be done purely through an object element in all of today’s major browsers. The specification generally carries an attitude of “if it’s out there right now, it’s valid” which I think will end up only encouraging bad practices and resulting in a lot more problems in the future.
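To illustrate the embed point: all of today’s major browsers can play a Flash movie through object alone, along roughly these lines (the file name and dimensions here are just placeholders):

```html
<object type="application/x-shockwave-flash" data="example.swf"
        width="400" height="300">
  <!-- Older versions of Internet Explorer read the movie location
       from this param rather than from the data attribute -->
  <param name="movie" value="example.swf">
  <p>Fallback content for user agents without Flash.</p>
</object>
```

The fallback content inside the object element is also something embed never offered cleanly.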

The whimsical world of HTML 5

Monday, April 23rd, 2007

A lot of scary stuff is going on in HTML 5 development. You know all the things we’ve learned about browser/engine-neutral code, building standards on top of other standards, using semantic markup, and so on? Well from what I’ve seen, the HTML working group seems to be throwing all of that out the window.

I should first note that I only recently subscribed to the HTML WG mailing list and haven’t yet had a chance to read the full breadth of the discussion, but the talk right now seems to be centered on something called “bugmode”: a new standard mechanism for browsers to add an infinite number of “quirks modes” which webpages can subscribe to. It’s currently proposed as something like this:

<html bugmode="ie7 gecko1.8 opera9">

This would basically cause these browsers to use snapshots of the respective layout engines when displaying the page. All future versions of Internet Explorer would use the IE 7 engine, all future versions of Firefox would use the Gecko 1.8 engine, and all future versions of Opera would use the Opera 9 engine.

Am I the only one who thinks this is a terrible idea?

First of all, since when do web developers experience significant problems with new versions of Firefox or Opera? I’ve never had anything important break with the release of a new version. I’ve only experienced such a problem in Internet Explorer, since IE has to fix major implementation flaws in very fundamental areas of the standards, like the basic behavior of the width and height properties. IE is uniquely in this position because most of their engine was developed before the current CSS standards were in place (they basically extrapolated off of CSS 1 however they saw fit at the time) and the engine had no development work for half a decade to correct the inconsistencies.

So I personally wouldn’t mind it if IE added some sort of conditional comment type of thing to target new quirks modes in IE, but I don’t see why there should be a whole new attribute added to the HTML standard just for triggering new browser-specific quirks modes.
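IE’s existing conditional comments already make this kind of version targeting possible without any new HTML attribute. As a sketch (the stylesheet name is a placeholder):

```html
<!-- Only IE 7 acts on this block; every other browser treats the
     whole construct as an ordinary HTML comment -->
<!--[if IE 7]>
  <link rel="stylesheet" type="text/css" href="ie7-fixes.css">
<![endif]-->
```

The fixes stay opt-in and browser-specific, instead of being enshrined in the markup language itself.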

I should point out that this is still very much a brainstorming session, and this idea may fade away in a couple weeks, but I’m still bothered by the number of people who seem to be taking this discussion seriously.

I talked a little more about this issue in a comment on Chris Wilson’s blog.

Now, Ian Hickson, who was responsible for a lot of the WHATWG Web Applications 1.0 work and will serve as an editor for the W3C HTML 5 specification, has it in his mind that the HTML WG is chartered to deviate HTML 5 from SGML. That is, he believes it is one of the stated intentions of the HTML WG that future versions of HTML will not be SGML languages.

Here is the charter quote from which he derived this idea:

The Group will define conformance and parsing requirements for ‘classic HTML’, taking into account legacy implementations; the Group will not assume that an SGML parser is used for ‘classic HTML’.

The charter uses the term “classic HTML” to refer to non-XHTML HTML. In SGML terms, this would be the markup using HTML 4.01’s SGML declaration, rather than XML as used by XHTML. Currently, no major web browser uses a full-featured SGML parser to parse classic HTML content. Therefore, it is wise not to assume that a browser can handle any SGML rules thrown at it in a new version of HTML. What the charter is saying is that the group will take into account this fact when developing the new standard. It does not say that HTML 5 shouldn’t be parseable by an SGML parser; it just says not to assume that an SGML parser will always be used.

However, Ian Hickson and others in the HTML WG have used this twisted interpretation of the charter as an excuse to unnecessarily break compatibility with the SGML standard. I’ll say it again: unnecessarily breaking compatibility with the SGML standard. I haven’t yet seen anything they’re trying to accomplish with HTML 5 that couldn’t be done in an SGML-compatible way.

They want to circumvent the issue of XML-style self-closing tag constructs causing problems in user agents which support the default SGML syntax for null end tags? Just set NETENABL to NO in the SGML declaration. This wouldn’t expressly allow XML-style self-closing constructs in HTML, and in these cases the “/” character would be considered invalid, but it brings a fully-compliant SGML parser to the behavior that all major browsers currently exhibit.

Note that this would only be handled as intended when used on elements defined as EMPTY, as is currently the case in all major web browsers. If you were to truly support XML-style self-closing tags even for non-EMPTY elements (which may indeed require a significant departure from how HTML is currently constructed), that would cause problems with legacy user agents, which the HTML WG charter says to avoid.

A change to the SGML declaration would be somewhat of an issue for fully compliant SGML parsers, since they generally use the Content-type header to determine which SGML grammar is being used, and we should probably avoid giving HTML 5 a different content-type than HTML 4. But at least this would keep HTML 5 compatible with SGML, so it isn’t impossible for an SGML parser to parse it.
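For illustration, the relevant knob lives in the SHORTTAG portion of the FEATURES section of the SGML declaration. This is only a rough sketch: most of the declaration is elided, and the exact spelling of the split-out SHORTTAG subfeatures comes from the Web SGML adaptations (ISO 8879 TC2), so it should be double-checked against that text.

```
<!SGML -- illustrative excerpt only; the rest of the declaration
         is elided --
  FEATURES
    MINIMIZE
      SHORTTAG
        STARTTAG
          NETENABL NO -- no null-end-tag-enabling start tags, so a
                         "/" before ">" is simply invalid, matching
                         what browsers actually do --
  ...
>
```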

It has also been proposed that HTML 5 should have no DTD. For similar reasons, I ask: why? I’ve seen the proposed elements in Web Applications 1.0, which is roughly considered the starting point for HTML 5 development, and I don’t see anything there that would require the absence of a DTD. I’m curious what the W3C Validator development team thinks of this. The W3C Validator currently operates strictly via an SGML/DTD parser (the upcoming new version of the Validator is also equipped with an XML parser to check for well-formedness). Without a DTD, the validator would have to hard-code all of the rules for HTML 5. And how exactly does omitting a DTD benefit anyone?

Not only does Ian Hickson want to omit a DTD, but he doesn’t seem to think that a version indicator is even necessary. His proposed new doctype declaration is simply <!DOCTYPE html>. So that’s it. Every future version of HTML had better be 100% backwards compatible. No mistakes may be made, or else the HTML standard is screwed for life. I think history has shown that the assumption that we can keep a sane standard backwards-compatible forever is unwise. At one time, the isindex element seemed like a good idea. There are plenty of people who want the q element redefined in HTML 5 so that the browser doesn’t display quotation marks by itself. HTML 5 already attempts to redefine some elements and attributes from HTML 4. I guarantee that there are features currently in Web Applications 1.0 which people will see as mistakes several years down the road and want to correct. It will cause compatibility problems if there isn’t a version number to go along with those changes. Maybe we’ll have to use bugmode after all.
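For contrast, here are the two doctype styles side by side. The HTML 4.01 declaration names a version and points a validator at a machine-readable DTD; the proposed one does neither:

```html
<!-- HTML 4.01 Strict: carries a version and references a DTD -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">

<!-- Proposed HTML 5 doctype: no version, no DTD -->
<!DOCTYPE html>
```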

Speaking of new features, let’s talk about some of them. To start off, there are some good things proposed in Web Applications 1.0. I like the section element, nav element, article element, aside element, the redefinition of the dl element, and some of the other stuff. But there are some elements and attributes that just make me scratch my head:

  • Why do we have a canvas element? Why not simply use a script to apply some state to any given element to turn it into a canvas? People who have worked with the Google Maps API are familiar with the idea of using a script to replace an arbitrary element (be it a div, p, etc.) with a new object. In most cases, a canvas element could be replaced with a div element, and then the script would just set it to be a canvas, just as browsers often allow scripts to set arbitrary elements to be contentEditable. Whatever happened to semantic markup? What semantics does a canvas element express?
  • ping attributes? In my a? Thanks for slowing my Web experience and using up more of my bandwidth so that advertising companies can track my habits. Much appreciated. I hope my browser quickly adds an option to disable this functionality, because I for one don’t want it. If a website is going to gossip to others about how I’m using the site, it should put in the effort to do it server-side with its own bandwidth.
  • embed element, why won’t you die? Is it the popular thing these days to just call whatever is out there on the Web “the standard”?

I could go on, but my point is that a lot of stuff is being proposed pretty quickly, and I question the motivation and thought behind a lot of these propositions. People seem to be caught up on how to add such-and-such functionality to web apps rather than focusing on semantics and other things we were supposed to have learned since the old boom days of the Web. I dunno, it just feels like we’ve been through all of this before. Even though this is being discussed in a public forum, the types of propositions are all too reminiscent of the seemingly random “sounded-good-at-the-time” features Netscape and Internet Explorer kept adding during the last browser wars. Does anyone know where I can buy some cheap shock collars?
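To make the canvas point above concrete, here is a sketch of the script-driven alternative. Note that setCanvas is entirely hypothetical and exists in no specification or browser; only getContext comes from the actual canvas proposal, and the element content is a placeholder.

```html
<!-- An ordinary, semantic element with meaningful fallback content... -->
<div id="chart">Placeholder: a textual summary of the chart data.</div>

<script type="text/javascript">
  // ...promoted to a drawing surface by script, the same way
  // contentEditable promotes arbitrary elements to editors.
  var el = document.getElementById("chart");
  el.setCanvas(true);            // hypothetical API, not a real one
  var ctx = el.getContext("2d"); // then draw as with canvas
</script>
```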

Tech Center Current blog

Friday, April 20th, 2007

If you’re interested in more of my technology-related writings, I’ve recently been posting on Tech Center Current, the blog for the California Community Colleges Technology Center where I currently work. Most of the posts are less advanced and less industry-centric than those I typically make here, and posts are divided into three different levels of technical familiarity, so the blog reaches a wider audience. Although I may go to work for either Microsoft or Mozilla in the near future, I’ll continue to act as an invited expert on the Tech Center Current blog for a while afterward.

Safari displays 1×1 alphatransparent PNGs too dark

Friday, April 20th, 2007

I finally figured out why Safari was displaying the heading backgrounds on the main Web Devout site too dark: In general, Safari 2.0 seems to screw up the brightness or gamma correction on 1-pixel by 1-pixel alphatransparent PNGs. This is even true for PNGs which don’t have any gamma correction information included. Interestingly, if you change the image size to anything else, the brightness problem goes away. Why does Safari decide to darken 1×1 PNGs? Your guess is as good as mine.

I was using a repeating 1×1 alphatransparent PNG as the background in order to simulate an RGBA value in a CSS 2.x-compatible way. To fix the problem in Safari, I simply changed the image size to 2×1.
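In CSS terms, the workaround looks something like this (the selector, file name, and opacity are placeholders):

```css
/* Simulates a CSS3-style rgba() background in CSS 2.x by tiling a
   tiny alpha-transparent PNG. The tile is 2×1 rather than 1×1 to
   sidestep Safari 2.0's darkening bug described above. */
h2 {
  background: url("white-50pct-2x1.png") repeat;
}
```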

I just wanted to point this out in case anyone else runs into it and becomes stumped like I was for a while. The problem seems unique to Safari/WebKit; Konqueror doesn’t seem to have this problem.

Frankly, this is just one of a seemingly endless list of bang-your-head-on-the-desk bugs I regularly find in Safari in quite basic areas. Another one that bothered me for a while: background images in Safari will repeat if the box is shorter or narrower than the background image, even if you have background-repeat: no-repeat; set, which you’ll notice if you also use a background-position. This just shows that being the first to pass something like Acid2 doesn’t necessarily mean you’re the cream of the crop. Please exterminate these weird bugs.

Job opportunities: an interesting dilemma

Thursday, April 12th, 2007

This week, I was approached with job opportunities from both Mozilla and Microsoft’s Internet Explorer team. It turns out this is a tougher decision than I thought it would be.

Anyone who knows me knows what I think of Internet Explorer. Let me briefly summarize what, in my mind, are the two biggest problems with Internet Explorer as a product and what I feel are the primary sources for those problems:

First in my mind is standards support. Internet Explorer has by far the worst standards support of any major web browser, period. Anyone serious about web development knows this. Over time, Microsoft has been accused of things like not caring about standards and what have you. But I don’t think that’s really the core issue. I honestly believe that the IE developers fully intend to follow standards whenever they’re available. IE’s nonstandard event model wasn’t the result of deliberately deviating from the standard; there was no event model standard when IE added support. A lot of the so-called “nonstandard behavior” with CSS properties is the result of bugs and design flaws that the IE developers intend to fix. The main problem isn’t that they don’t care.

What I believe is the primary cause of IE’s currently miserable situation with standards support is the fact that Microsoft disbanded the platform development team back in 2001, and thus, aside from security updates, IE layout engine development was completely abandoned for five years. Five years. Half a decade. Roughly half of Internet Explorer’s entire life to date was spent sitting idle. IE 6 wasn’t a bad browser when it first came out, but other browsers have now had twice the time IE had to add standards support, fix bugs, and generally snazz up their engines. Internet Explorer was simply neglected for too long.

The second main problem with Internet Explorer as a product is its security record. Every piece of software as complex as a web browser will have plenty of security problems. And naturally, if you have 80% or higher market share, there will be lots of people trying to pick apart your browser piece by piece. But this isn’t the main problem.

The main problem with IE’s security is the security response process. Internet Explorer simply takes too long to fix its vulnerabilities, and it leaves so many vulnerabilities unfixed. Internet Explorer has taken on average several times as long as Firefox to patch its known vulnerabilities. We just passed the fourth Patch Tuesday of the year, yet according to Secunia, 78% of IE 7’s known vulnerabilities are still unfixed. That isn’t even counting the several-year-old IE 6 vulnerabilities that were never fixed and probably still exist in IE 7. Microsoft says that this is all due to their quality assurance process, but I dunno… I’ve heard about as many cases of IE patch problems as Firefox patch problems. Too many issues are swept under the rug. It’s another case of neglect.

So here I am with an opportunity to help do something about this. I have a chance to help give IE attention where it needs it. Internet Explorer is used by around 75% to 80% of the Internet population. It is, in many or most cases, the single immediate factor holding back professional web developers from doing their jobs as quickly, correctly, and efficiently as they otherwise could.

Meanwhile, I may also have the opportunity to work for Mozilla. Mozilla is an entirely different situation. They already have the groundwork laid out. They have an engine that is very much in line with the standards. I have little doubt that the Gecko engine code is much more consistent, well-structured, and mature than the Trident code in Internet Explorer. Mozilla isn’t struggling to correct a lot of broken foundation; it’s working to perfect its well-written engine and to develop the new groundwork for future standards.

Working with Internet Explorer would be working to bring a dated but important engine into the present, while working with Mozilla would be working to lead a modern but not-quite-as-prominent engine into the future. Both are very important tasks and both are tasks which I would much like to be a part of. But alas, there is only one of me, and I have to make a choice. I feel like I would better enjoy the work and atmosphere at Mozilla, but I might be able to drive a bigger near-future impact on the Web by working with Internet Explorer. If, in the end, both options are available to me, what should I do?