Web Devout tidings


Archive for the 'Web design theory' Category

The problem with the NET

Sunday, April 16th, 2006

Use of XHTML today is decidedly harmful to the health of the Web when the documents are sent with the text/html content type for Internet Explorer compatibility. This is why I wrote the Beware of XHTML article, explaining some of the reasons why this misuse can lead to future problems and illustrating the situation with some examples. Now I’d like to talk about another potential problem with XHTML sent as text/html: Null End Tags.

As explained in the article, when a document is sent with the text/html content type, most major browsers including Internet Explorer, Firefox, and Opera treat the webpage as if it is actually regular HTML. As a result, they don’t correctly handle self-closing tags, instead treating them as start tags with an erroneous character inside them.

However, there’s more of a problem than just this. Technically speaking, if the browsers are really treating the page like HTML (and therefore SGML), they shouldn’t think that the closing slash is an erroneous character. According to the rules of SGML, they should treat it as part of a Null End Tag, a kind of shorthand for simple elements that only contain character data. For example, according to the rules of SGML, a title element could be written as <title/This is the title of the page/, which would be equivalent to <title>This is the title of the page</title>. The first slash finishes the start tag, and the second slash represents the end tag if the element can have one.

Now let’s look at a situation in which this would present a problem. Think about this markup: <div>This<br/>is<br/>a<br/>test.</div>. That would work as expected in XHTML, but here’s how a browser should see it when treating the page as HTML: <div>This<br>>is<br>>a<br>>test.</div>. Notice the extra > after each br tag. Since the br element is defined as an empty element (it doesn’t have contents and doesn’t have an end tag), only one slash is relevant for each element. The slash finishes the tag right there, meaning the > character isn’t considered part of the tag, but rather character data after the tag. The presence of a space before the slash makes no difference.

The above markup should result in the following output when treated like HTML:

This
>is
>a
>test.

…That is, if browsers supported Null End Tags. Unfortunately, the major ones currently don’t, meaning that this issue is entirely overlooked by most web developers. Rather than properly treating the slash as part of a Null End Tag, they treat it as an error and simply skip past the character, often producing something much like the correct XHTML rendering, but for the wrong reasons.
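If you need to keep serving the page as text/html, the simplest way to sidestep the whole ambiguity is to write plain HTML 4.01 in the first place. Here is a minimal sketch of the same line breaks in both forms (nothing here beyond the example above):

<!-- XHTML form: only renders as intended because browsers skip the “erroneous” slash -->
<div>This<br />is<br />a<br />test.</div>

<!-- HTML 4.01 form: no self-closing syntax, so no Null End Tag question ever arises -->
<div>This<br>is<br>a<br>test.</div>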

Keep in mind that there are many issues like this, so use of XHTML should be avoided unless it is done correctly, with a proper XHTML content type such as application/xhtml+xml.

CSS Naked Day

Wednesday, April 5th, 2006

Today is the first annual CSS Naked Day. This is the day we take down our stylesheets in support of structural markup.

As all professional web designers know, it is important to separate webpage content from presentation. The purpose of HTML is to express the content of the webpage — to describe the meaning of each component of the webpage in a way that is useful to automated agents such as search engines, which are trying to understand the information your webpage is expressing. CSS then provides a presentation layer to define how human beings should experience the page content.

If you aren’t following this model correctly, there will likely be problems when the stylesheet is disabled. You may see things shoved up against the right side of the page, images cut up into pieces, border or background images placed oddly here and there, or other oddities that make the webpage very difficult to use. However, if you’re following the model correctly, everything should be nicely lined up against one side of the page, the font and color should be consistent, images shouldn’t look out of place, navigation and hierarchical information structures should be in nice organized lists, and the page should generally look like a well-organized text document ready for printing.

Annual CSS Naked Day is a way to encourage web developers to write proper structural markup. The idea is that the quality of your markup should be reflected by how readable and usable your webpage is when stylesheets are disabled. The exercise is pointless if you are using presentational elements like font, big, and b. In most cases, those elements should be replaced with some combination of structural elements like p, h2, and strong, with CSS defining how those elements should appear. There are special cases where elements like b and br should be used, but they are quite rare.
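As a rough before-and-after sketch (the text, sizes, and style values here are invented purely for illustration):

<!-- Presentational markup: the visual effect is there, but the meaning is lost on non-visual agents -->
<font size="5"><b>Site news</b></font>
<br><br>
<b>Important:</b> The server will be down tonight.

<!-- Structural markup: roughly the same appearance once styled, but the meaning survives without CSS -->
<h2>Site news</h2>
<p><strong>Important:</strong> The server will be down tonight.</p>

/* In the stylesheet */
h2 { font-size: 1.5em; font-weight: bold; }
p  { margin: 1em 0; }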

The biggest problem this exercise tries to expose is the use of tables for layout purposes. Table elements have structural meaning and are the correct elements to use in cases where the information being presented is tabular in nature. However, when table elements are used just to visually position things on a page, it sends the wrong message to search engines and other agents that try to make use of the markup semantics. The agent is told that each row and column is a set of strictly related data, when in fact the only relationship is the intended visual position of the table cell contents. Imagine that the agent is trying to learn something new from you and you tell it that a pig is to bacon as a list of cities is to both a book and a copyright statement. You can see how the agent might walk away quite confused by your website. That probably isn’t what you want to do to your blind visitors, who expect the page to be read out in some kind of logical order, or to search engines if you want them to give you a high ranking.
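To make that analogy concrete, here is a hedged sketch (the city names, image file, ids, and the float-based positioning are all invented for illustration; any CSS layout technique would do):

<!-- Layout table: claims the city list, the book cover, and the copyright notice are related tabular data -->
<table>
  <tr>
    <td><ul><li>London</li><li>Paris</li><li>Tokyo</li></ul></td>
    <td><img src="book-cover.png" alt="Our new book"></td>
    <td>Copyright 2006 Example Inc.</td>
  </tr>
</table>

<!-- Structural markup: each piece keeps its own meaning, and CSS handles the visual positioning -->
<ul id="cities"><li>London</li><li>Paris</li><li>Tokyo</li></ul>
<img id="cover" src="book-cover.png" alt="Our new book">
<p id="copyright">Copyright 2006 Example Inc.</p>

/* In the stylesheet */
#cities    { float: left; width: 30%; }
#cover     { float: left; }
#copyright { clear: both; }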

So check out how your page looks without a stylesheet. In Firefox, go to the View menu » Page Style » No Style to see the current page without stylesheets. In Opera, go to the View menu » Style » User mode and uncheck any boxes below it. Internet Explorer and Safari don’t have a straightforward way to disable author stylesheets, even though it is a CSS 2.1 conformance requirement. Fix up your page if you need to, make sure it validates with a Strict doctype, and have a nice CSS Naked Day!
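If you’re stuck in a browser without such a menu option, one rough workaround is a bookmarklet that flips the disabled flag on every stylesheet in the document (just a sketch using the standard document.styleSheets collection; it won’t undo inline style attributes):

javascript:(function(){var s=document.styleSheets;for(var i=0;i<s.length;i++){s[i].disabled=true;}})();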

Whose standards?

Thursday, March 2nd, 2006

Occasionally during public discussions about web standards, particularly standards compliance in web browsers, someone comes out of the crowd and asks these two questions: “What good is a web standard if no browser has perfect support for it?” and “Why is a W3C standard any better than a Microsoft standard?”

The second question has an easy answer: even Microsoft is dumping their “standards”. Internet Explorer Group Program Manager Chris Wilson stated in an official blog post, “I want to be clear that our intent is to build a platform that fully complies with the appropriate web standards, in particular CSS 2 (2.1, once it’s been Recommended).” The web development community generally agrees that the models endorsed by the W3C are easier to work with, more flexible, more intuitive, and simply make more sense than Internet Explorer’s implementations. Other browsers have been aiming at W3C standards compliance for quite some time now, and the Internet Explorer development team has made it clear that they aim to follow standards more rigorously in future versions, even when it means breaking websites that rely on Internet Explorer quirks, as we have seen in the big commotion regarding the widely-used * html hack.

It’s true that Microsoft has made some positive contributions to web technology. Some contributions have made it into W3C standards, some have not. When it comes to a battle between an Internet Explorer implementation and a W3C-endorsed web standard, the other browser developers and the web development community typically flock to the W3C standard. Why? Because we are aware of the great mess that came from the browser wars between Netscape and Internet Explorer, when standards were all but thrown aside and each browser went its own way. It became a tremendous struggle for web developers to get something working across all major browsers. Standards ensure that there is common ground for future development. Considering that all browser developers now aim to follow this one source of rules — rules that are largely the product of discussion and agreement among these browser developers — we as web developers know that these are the rules we can expect to hold up in the future.

This leads to the answer to the first question. Standards are important even if a particular standard isn’t yet well-supported. When no major browsers support a standard, it usually equates to lack of interest. But when interest does spark up, web developers and browser developers can know exactly where to go, since the tracks are already laid out. Standards are more about the future than anything else, and as time goes on they will help to keep our technology models clean, organized, and progressively easier to utilize in a widely compatible way.

Open for comments

Tuesday, February 21st, 2006

Nothing gets my goat quite like a poorly researched news article with no public commenting system. This is the year 2006 and the era of weblogs and open communication on the Web, and we have developed a certain expectation for interaction with our news sources. Those in technological fields know quite well that errors in news stories are all too common, and some form of public review is essential to ensure that less knowledgeable readers don’t get a heap of misinformation.

For the most part, news organizations get it. The provision of some form of commenting system is very much the norm on online news sites. Whether they’re called comments, TalkBack, public discussion, or reader responses, most popular online news sources have some form of public feedback system that gives readers immediate access to the responses.

However, there are still plenty of news sources — some of them quite major — that haven’t caught up to the times. Often they will simply link to their generic message board system, which significantly discourages both the posting of responses and the reading of those responses, as the messages are not directly linked with the article. In other cases, they will only provide a way to privately contact the author of the article or the editor, and you will find that your response very seldom affects the content of the article in question, even if the author confirms that you are right.

Luckily, this is a problem that seems to be slowly dying away, as the news industry has generally recognized this change in culture and the benefits it has produced. Market forces may also be playing a role, as people will naturally flock to sources where they can see more discussion on the subject and even participate in said discussion. This concept was at the core of Tim Berners-Lee’s original vision of the Web, and it seems inevitable that this is the direction in which the Web will continue to progress.

Push for health: the great trade-off

Sunday, February 12th, 2006

The goal for this website is to promote the long-term health of the Web. One of the most important factors defining the health of the Web is how open and accessible the information is. This is why standards, proper document formation, and semantic markup are so important.

In an ideal world, all user agents in use would be fully up-to-date, supporting all of the most recent Web standards. Unfortunately, that isn’t real life. A fairly large percentage of the world’s population still uses less-than-current web browser versions, and most of the world’s population uses Internet Explorer, which has the worst overall support for Web standards of any major brand.

Graceful degradation only goes so far, especially when you start getting into the blossoming world of XML. The big question is, at what point do we decide it’s unreasonable to support older, outdated browsers? How much of our document’s efficiency and semantic accuracy are we willing to sacrifice in order to cater to older user agents? And, more importantly for the purpose of this website, what effects do these decisions have on the long-term health of the Web?

I think we can all agree that, as far as the Web goes, it’s unhealthy for people to be using outdated web browsers. I think we can also agree that it’s unhealthy for developers of popular web browser brands to halt development for several years. We can’t actually force people to update their computers or browser developers to improve their browsers, but wouldn’t you agree that it’s unhealthy to directly reduce incentive for these actions to take place?

The original question is still the same: at what point do we decide it’s unreasonable to support older, outdated browsers? A growing sentiment is that it’s unreasonable to support Internet Explorer 5.5. Where money and necessary services aren’t involved, a few even argue that it’s unreasonable to support Internet Explorer 6 and that users should be made aware that their browser version is five years old, so they should expect it to fail on newer documents. That’s certainly a bit extreme for most types of websites, but it’s a push that, as we have learned over the last half decade, is needed to some extent if we hope for the situation to become more pleasant.

The answer to the question is rather subjective, and it depends on the content of your website. The owner of a small personal weblog could more easily ignore Internet Explorer 5.5 than a business website could, and the negative end of the trade-off would be much less significant. As you will see, when discussing complex ideas like the overall health of the Web, the answers will rarely be black and white.