Web Devout tidings


Archive for September, 2006

There is no solution for the Q element

Tuesday, September 26th, 2006

Stacey Cordoni recently posted an article on A List Apart entitled "Long Live the Q Tag". The article discusses the problems with the q element stemming from Internet Explorer’s continuing lack of support and talks about some alternative solutions. The solution she settles on is to use q elements with the default quotation marks removed via CSS styling and manual quotation mark characters added directly in the HTML source outside the element.

This is not an adequate solution. It completely ignores user agents that don’t support CSS or have it disabled. A text browser that correctly supports HTML but not CSS would render such a quotation delimited by two pairs of quotation marks: its own generated pair plus the pair typed into the source. Lynx is such a browser.
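
To make the doubling concrete, here is a minimal sketch that imitates the two rendering behaviours. The render function and its generates_quotes flag are hypothetical stand-ins for a user agent, not a model of any real browser's code.

```python
import re

# Markup in the style the article recommends: literal quotation marks typed
# outside the q element, with the generated marks meant to be removed by CSS.
MARKUP = "She said, \u201c<q>hello</q>\u201d."


def render(markup: str, generates_quotes: bool) -> str:
    """Render q elements the way a hypothetical user agent would.

    generates_quotes is True when the user agent inserts its own quotation
    marks around q content and nothing (such as CSS) removes them.
    """
    replacement = "\u201c\\1\u201d" if generates_quotes else "\\1"
    return re.sub(r"<q>(.*?)</q>", replacement, markup)


# CSS-capable browser: the style sheet removes the generated marks.
print(render(MARKUP, generates_quotes=False))
# Text browser with q support but no CSS: the generated marks remain,
# so the quotation ends up wrapped in two pairs of marks.
print(render(MARKUP, generates_quotes=True))
```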

I have found that there is no true solution for the problem with the q element. Unfortunately, the problem isn’t exclusive to Internet Explorer either, as there are other user agents that fail to handle the q element correctly. ELinks behaves like Internet Explorer in this respect.

The reality is that the q element simply won’t behave consistently in all major browsers no matter what you do, and so its use should be avoided.

Web Devout browser version usage statistics

Saturday, September 16th, 2006

I have added to the Visitor statistics page information about user agent (web browser) version usage on this site. User agents whose version identifiers weren’t successfully determined by the script aren’t listed, nor are versions that had only one user during the two-week period. Currently, the script doesn’t know exactly where to find the version information for many lesser-used user agents and instead tries to guess based on the user agent string. I plan to improve this over time.
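
The kind of guessing described above might look roughly like the following. This is only a sketch under my own assumptions, not the script the site uses: the KNOWN_TOKENS list, the guess_version name, and the fallback rule are all hypothetical.

```python
import re
from typing import Optional, Tuple

# Hypothetical list of product tokens to look for, in priority order.
# Opera comes first because some of its user agent strings also contain "MSIE".
KNOWN_TOKENS = ["Opera", "Firefox", "MSIE"]


def guess_version(user_agent: str) -> Optional[Tuple[str, str]]:
    """Return (product, version) for a user agent string, or None if unknown."""
    for token in KNOWN_TOKENS:
        match = re.search(re.escape(token) + r"[/ ]([0-9][0-9A-Za-z.]*)", user_agent)
        if match:
            return token, match.group(1)
    # Fallback guess: the first "Name/version" pair that is not "Mozilla".
    for name, version in re.findall(r"([A-Za-z][\w-]*)/([0-9][\w.]*)", user_agent):
        if name != "Mozilla":
            return name, version
    return None


# Sample strings for illustration only.
print(guess_version("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"))
print(guess_version("Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.4 (like Gecko)"))
```

A fallback like this is exactly where wrong guesses come from: it picks the first plausible token, which is not always the one that names the browser.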

SHORTTAG and OMITTAG

Monday, September 11th, 2006

Errors reported by the HTML Validator often mention SHORTTAG and OMITTAG in the description. This has caused some confusion, so I will explain what these two features are and where they come from.

Every SGML language, including HTML and XML, has something called an SGML declaration that defines the lexical rules of the language. It defines which characters are used to delimit tags and other constructs, which character ranges can be used for element names, what kinds of constructs may exist in the document, and other things. The SGML declaration usually isn’t included in the document itself (in fact, most user agents won’t support it if it is). Instead, it is buried somewhere in the language’s specifications: see the HTML SGML declaration and the XML SGML declaration. Although browsers don’t parse actual SGML declarations, they typically choose which parsing rules to follow based on the HTTP Content-Type header. The HTML Validator is unusual in that it selects the parsing mode based on the doctype, so it parses a document with an XHTML doctype as XML even if it’s sent with Content-Type: text/html.
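
The difference between the two selection strategies can be sketched as follows. Both functions are deliberately simplified and hypothetical: real browsers and the validator consider more than this, and the list of XML media types here is just the common ones.

```python
XML_MEDIA_TYPES = ("application/xhtml+xml", "application/xml", "text/xml")


def mode_from_content_type(content_type: str) -> str:
    """Pick parsing rules from the Content-Type header, as browsers typically do."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return "xml" if media_type in XML_MEDIA_TYPES else "html"


def mode_from_doctype(doctype: str) -> str:
    """Pick parsing rules from the doctype, as the validator is described as doing."""
    return "xml" if "XHTML" in doctype.upper() else "html"


XHTML_DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
)

# The same XHTML document served as text/html gets two different answers.
print(mode_from_content_type("text/html; charset=utf-8"))
print(mode_from_doctype(XHTML_DOCTYPE))
```

That mismatch is why an XHTML page served as text/html can draw XML-style SHORTTAG errors from the validator while browsers parse it under HTML rules.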

The SGML declaration defines a hierarchy of settings. One of the main categories is FEATURES, whose first subcategory is MINIMIZE. This is where you will find the SHORTTAG and OMITTAG feature settings.

OMITTAG defines whether or not start or end tags may ever be omitted. If YES, elements may define in the DTD whether start or end tags may be omitted. If NO, regardless of what the DTD says, they may never be omitted. OMITTAG is YES in HTML, but NO in XML and thus XHTML.

SHORTTAG then defines whether or not general shorthand features may be used. The format for this is different between the HTML SGML declaration and the XML SGML declaration. XML uses an extended format that toggles a number of features individually, while HTML (and classic SGML declarations) uses a single boolean value for all of the features. SHORTTAG consists of three main categories: STARTTAG, ENDTAG, and ATTRIB.

STARTTAG deals with start tags and contains three features: EMPTY, UNCLOSED, and NETENABL.

EMPTY defines whether or not the contents of the tag may be omitted. This is not the same as whether or not the contents of the element may be omitted. An empty start tag may look like this: <>. Instead of specifying the element name, it is assumed to be the same kind of element as the previous sibling (the element that most recently closed). This is legal (YES) in HTML, although no major browser supports it. It is illegal (NO) in XML and thus in XHTML.

UNCLOSED defines whether or not the start tag needs to be closed. Again, this is not the same as whether or not the element needs to be closed. Here is an application of an unclosed start tag: <div<p>This is a P inside a DIV.</p></div>. The end of the start tag is assumed by the beginning of the next tag. This is legal (YES) in HTML, although it is poorly supported. It is illegal (NO) in XML and thus XHTML.

NETENABL defines whether or not the start tag may use Null End Tag (NET) notation. This replaces the start tag’s closing delimiter and the end tag with special single-character delimiters. Here is an example of an element using a Null End Tag: <title/This is the title of the page/. The value for this feature may be NO, ALL (which is implied if SHORTTAG is simply YES), or IMMEDNET. Null End Tags are always legal (ALL) in HTML, although, as you might have guessed, no major browser supports them. In XML, it is IMMEDNET, meaning that Null End Tags are supported, but only when the closing delimiter comes immediately after the opening delimiter, which in turn means that the element must have no contents. XML also uses a different character for the closing Null End Tag delimiter: “>”. Therefore, a Null End Tag in XML looks like this: <br/>, which people familiar with XML should recognize.
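
As a rough illustration of what the HTML form of NET notation stands for, the sketch below rewrites the simplest case into ordinary tags. It is an assumption-laden toy, not an SGML parser: the expand_net name is hypothetical, and it only handles attribute-free start tags whose content contains no slash or nested markup.

```python
import re

# "<name/" opens the element, and the next "/" closes it.
NET_PATTERN = re.compile(r"<([A-Za-z][A-Za-z0-9]*)/([^/<]*)/")


def expand_net(markup: str) -> str:
    """Rewrite simple Null End Tag elements as ordinary start and end tags."""
    return NET_PATTERN.sub(r"<\1>\2</\1>", markup)


print(expand_net("<title/This is the title of the page/"))
# prints: <title>This is the title of the page</title>
```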

ENDTAG deals with end tags and contains two features: EMPTY and UNCLOSED.

This EMPTY is similar to the one in STARTTAG, but it applies to end tags and it is assumed to close the most recent element that is open. For example, if you have <div>Foo <span>bar</span> baz</>, the empty end tag closes the div element. This is legal (YES) in HTML, but illegal (NO) in XML and thus XHTML.
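
The rule "close the most recent element that is open" is a stack lookup. Here is a minimal sketch that rewrites empty end tags into named ones. The function name is hypothetical, and the pattern only recognizes tags without attributes.

```python
import re

# Matches only attribute-free tags: <name>, </name>, <> and </>.
SIMPLE_TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)?>")


def expand_empty_end_tags(markup: str) -> str:
    """Replace each empty end tag with the end tag of the innermost open element."""
    open_elements = []  # stack of element names that are currently open

    def replace(match):
        is_end_tag = match.group(1) == "/"
        name = match.group(2)
        if not is_end_tag:
            if name is not None:
                open_elements.append(name)
            return match.group(0)  # start tags pass through ("<>" is not handled)
        if name is None:
            # Empty end tag: close the most recently opened element.
            return f"</{open_elements.pop()}>" if open_elements else match.group(0)
        # Named end tag: unwind the stack down to the matching element.
        if name in open_elements:
            while open_elements.pop() != name:
                pass
        return match.group(0)

    return SIMPLE_TAG.sub(replace, markup)


print(expand_empty_end_tags("<div>Foo <span>bar</span> baz</>"))
# prints: <div>Foo <span>bar</span> baz</div>
```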

This UNCLOSED is also similar to the one in STARTTAG, and applies to end tags. The end of the end tag is assumed by the beginning of the next tag. For example, <div><div>Foo</div<p>Bar</p</div>. This is legal (YES) in HTML, but illegal (NO) in XML and thus XHTML.

ATTRIB deals with attributes and contains three features: DEFAULT, OMITNAME, and VALUE.

DEFAULT defines whether or not attributes may have default values that are defined in the DTD. This is enabled (YES) in both HTML and XML and thus XHTML.

OMITNAME defines whether or not attribute names may be omitted. In such a case, the given attribute value will be used for both the attribute name and attribute value. For example, <input type="checkbox" checked> is equivalent to <input type="checkbox" checked="checked">. This is legal (YES) in HTML, although several major browsers don’t treat it literally in some areas like CSS attribute selectors. It is illegal (NO) in XML and thus XHTML.
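
Python's standard html.parser module happens to make the minimized form easy to see: it reports an attribute written without a value as having the value None. The sketch below uses that to produce the expanded name="name" form. The class and function names are my own, and treating a missing value as "same as the name" is the equivalence described above, not something the parser does itself.

```python
from html.parser import HTMLParser


class AttributeExpander(HTMLParser):
    """Collect start tags, expanding minimized attributes to name="name"."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # html.parser gives a minimized attribute the value None.
        expanded = [(name, name if value is None else value) for name, value in attrs]
        self.tags.append((tag, expanded))


def expand_minimized(markup: str):
    parser = AttributeExpander()
    parser.feed(markup)
    parser.close()
    return parser.tags


print(expand_minimized('<input type="checkbox" checked>'))
# prints: [('input', [('type', 'checkbox'), ('checked', 'checked')])]
```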

VALUE defines whether or not attribute values may be specified without delimiting quotation marks if the value uses certain ranges of characters. This is legal (YES) in HTML, but it is illegal (NO) in XML and thus XHTML.
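
For reference, the HTML 4.01 specification limits unquoted values to letters, digits, hyphens, periods, underscores, and colons. A small check along those lines could look like this; the function name is hypothetical, and the character set comes from that specification rather than from the SGML declaration itself.

```python
import re

# Characters HTML 4.01 allows in an attribute value written without quotes.
UNQUOTED_VALUE = re.compile(r"[A-Za-z0-9._:-]+")


def may_omit_quotes(value: str) -> bool:
    """Return True if the value could be written without quotation marks in HTML."""
    return UNQUOTED_VALUE.fullmatch(value) is not None


print(may_omit_quotes("checkbox"))             # prints: True
print(may_omit_quotes("http://example.com/"))  # prints: False (contains slashes)
```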

So here’s the summary: HTML has a simple YES for both OMITTAG and SHORTTAG, meaning all of the above features are allowed. XML has NO for OMITTAG and has a feature breakdown for SHORTTAG, amounting to YES for ATTRIB DEFAULT, IMMEDNET for NETENABL, and NO for everything else.
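
The same summary can be written down as a lookup table, which some people find easier to scan. The values are taken from the descriptions above; the names MINIMIZE_FEATURES, setting, and differing_features are just illustrative.

```python
# Feature settings as summarized above: (HTML value, XML value).
# XHTML follows the XML column.
MINIMIZE_FEATURES = {
    "OMITTAG": ("YES", "NO"),
    "SHORTTAG STARTTAG EMPTY": ("YES", "NO"),
    "SHORTTAG STARTTAG UNCLOSED": ("YES", "NO"),
    "SHORTTAG STARTTAG NETENABL": ("ALL", "IMMEDNET"),
    "SHORTTAG ENDTAG EMPTY": ("YES", "NO"),
    "SHORTTAG ENDTAG UNCLOSED": ("YES", "NO"),
    "SHORTTAG ATTRIB DEFAULT": ("YES", "YES"),
    "SHORTTAG ATTRIB OMITNAME": ("YES", "NO"),
    "SHORTTAG ATTRIB VALUE": ("YES", "NO"),
}


def setting(feature: str, language: str) -> str:
    """Look up a feature's value; language is "HTML" or "XML"."""
    html_value, xml_value = MINIMIZE_FEATURES[feature]
    return html_value if language == "HTML" else xml_value


def differing_features():
    """List the features whose settings differ between HTML and XML."""
    return [name for name, (h, x) in MINIMIZE_FEATURES.items() if h != x]


print(setting("SHORTTAG STARTTAG NETENABL", "XML"))  # prints: IMMEDNET
print(len(differing_features()))                     # prints: 8
```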

Although it is technically legal to write your own SGML declaration right into an HTML document, extremely few user agents will even recognize it, let alone support it correctly. It is strictly illegal to write your own SGML declaration into an XML document. SHORTTAG and OMITTAG aren’t options you can toggle to please the browser. They are inherent traits of HTML and XML, and valid documents must conform to those rules.

Big site updates

Monday, September 4th, 2006

Web Devout has received some big updates, including a touched-up look and several new articles. This is part of a lot of behind-the-scenes development work that has been going on for a month or so. There is a lot more to come, including some full HTML, CSS, and DOM references that are currently being written, but I wanted to get this update through the door in the meantime.

Here are some newly published articles:

About Web Devout
Some background information about Web Devout and its mission, with a timeline of events.
Common issues in web design
Solutions to commonly encountered problems people experience in web design. This article is expected to grow over time.
CSS hacks
CSS hacks can cause problems down the road, but if people are going to use them, they might as well know the options and weigh the potential consequences appropriately. This article describes many known CSS hacks, including Internet Explorer’s conditional comments and plenty of CSS selector hacks.
Escaping style and script data
Lots of people use HTML comments in their inline style and script data without really understanding what’s going on. This article explains the concepts of hiding style and script data from unsupporting browsers and maximizing document compatibility between HTML and XHTML.
URLs
Did you know that //www.w3.org is a valid URL? What about http:foo.html? This article explains the parsing structure of a URL with examples and additional notes.
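
As a quick taste of the URLs article, here is how the two examples split into components. This sketch uses Python's standard urllib.parse rather than anything from the article, and the describe helper is hypothetical.

```python
from urllib.parse import urlsplit


def describe(url: str):
    """Return the (scheme, authority, path) components of a URL reference."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path


# No scheme, but an authority: the scheme is inherited from the base URL.
print(describe("//www.w3.org"))   # prints: ('', 'www.w3.org', '')
# A scheme, but no authority: only a relative path follows the colon.
print(describe("http:foo.html"))  # prints: ('http', '', 'foo.html')
```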

The poorly supported title attribute

Saturday, September 2nd, 2006

For such an important attribute, it’s strange that the title attribute doesn’t have better support than it does in modern web browsers. title is used to provide secondary descriptive text about an element, often rendered as a tooltip on mouse hover. Unfortunately, all major web browsers have bugs with whitespace and/or character reference handling on this attribute, making the feature often unusable for multiple-line texts.

I have constructed a title attribute tooltip test suite which I would like all major browsers to pass. Currently, Internet Explorer, Firefox, Opera, Safari, and Konqueror all fail at least one of the tests.

Internet Explorer allows line breaks in the tooltip value, but in the incorrect manner: it will display a line break if there are newlines or carriage returns in the source HTML. Newlines in the attribute source should be ignored and carriage returns should be converted to spaces, as in other CDATA attributes. The proper way to represent a line break in the attribute value is to use a newline character reference (&#10;), which Internet Explorer also (correctly) converts into a line break. IE also handles tab characters in the source incorrectly.

Firefox has perhaps the most disruptive bug, which limits the tooltip text to a small number of characters on a single line. Progress on fixing that bug is blocked by another bug relating to the calculation of tooltip heights. Firefox also has several problems with newlines, carriage returns, and tabs in the source HTML and doesn’t convert newline character references into line breaks. This is overall the worst implementation of the title attribute of all major browsers.

Opera fares well in the whitespace handling, but it doesn’t convert newline character references into line breaks, making multi-line values impossible.

Safari displays line breaks for newlines and carriage returns in the source and renders tab characters as wide spacing. It should instead ignore the newlines and convert carriage returns and tabs into spaces.

Konqueror generally does well, but it handles newline characters in the source like Internet Explorer does. I would say this is the best implementation so far, although still not perfect.
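
The handling I expect from browsers can be sketched in a few lines. This is a minimal illustration of the rules described above, assuming that whitespace in the source is normalized before character references are resolved; the normalize_title name is hypothetical.

```python
import html


def normalize_title(raw: str) -> str:
    """Turn the source text of a title attribute into the tooltip text."""
    # Literal newlines in the source are ignored.
    text = raw.replace("\n", "")
    # Literal carriage returns and tabs become single spaces.
    text = text.replace("\r", " ").replace("\t", " ")
    # Character references are resolved last, so &#10; survives as a line break.
    return html.unescape(text)


print(repr(normalize_title("Line one&#10;Line two")))  # prints: 'Line one\nLine two'
print(repr(normalize_title("wrapped\nsource\ttext")))  # prints: 'wrappedsource text'
```

The order matters: if character references were resolved first, the newline produced by &#10; would be stripped along with the literal ones.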

With the growing number of web-based applications aiming to provide the functionality and feel of traditional desktop applications, proper tooltip display of the title attribute value is becoming increasingly important. Please petition the browser developers for attention to this issue.

Edit: I’ve corrected the link to the Firefox bug report. The previously linked report was specifically for Mozilla/SeaMonkey.