Thursday, July 12, 2007

Website finally updated, hooray!

After years of malnourishment and two weeks of development, my little old static website (now using a smaller domain name!) is live. The old website, like the new website, was created via a templating system. However, the former website's templating system was homegrown using PHP4 classes (disgusting, I know...but that's all I could use at the time). Even more disgusting about my system was that it was HTML comment directives plus a regular expression parser. I was so young and naïve, and I hadn't taken a compilers class yet. So this time around, I said "screw it" and went with a) my favorite language, Python, and b) the template software that I had been using for my Trac-AtomPub plugin (yes, not -atompp anymore, per the lengthy discussion on the atom-protocol mailing list).

The Journey

As the new website was a chance to experiment with new things, I decided to take the plunge and use HTML5 to markup my website. And with any sort of experimental technology, there were many problems.

First, I tried to use the genshihtml5 plugin, but strangely enough the code was a bit buggy (e.g., it was missing an import), and I could never figure out how to get it to output proper HTML5 while still removing end tags from tags which don't need them, e.g. <link/>, while retaining them for tags which require one, e.g., <script/>.

Next, I tried to use html5lib's Genshi-Stream-based tree walker. For some reason, it simply would not output any data. I don't remember all of the details, but I do remember inserting a lot of print statements in html5lib to see if I could find the bad piece.

Finally, I gave up and made Genshi just output XHTML plus the extra HTML5 tags. I figured that all of the debugging trouble simply wasn't worth it for the timeframe I had envisioned.

(As an aside, I do plan on submitting the patches that I've made as a result of this...exercise (for lack of a better word) so that they can be integrated in future releases of the respective software.)

Actual usage of new HTML5 tags was...interesting to debug. If you're writing HTML5 and not XHTML5, and you're viewing the page in Firefox, this is what the DOM tree looks like (according to Firebug):

<figure _moz-userdefined="" />
<img src="..." alt="..." />

For comparison, this is what it looks like when rendered as XHTML5:

<img src="..." alt="..." />

That completely broke my CSS files, as I was using child/descendant rules utilizing the new tags. This sort of thing is why I love using Firebug.


I've really only thoroughly tested this website on Firefox 2.x (Windows & Linux). I just checked it on Opera 9.20 (Linux) and a relatively old development version of Gtk-Webcore (AKA WebKit), and the only bug that I see (in both of them, strangely) is some sort of CSS error in calculating the spacing for the <dd/> box for "Special Skills" in my CV.


Future plans include packing both the CSS and the JavaScript, via csstidy and packer, respectively. Right now there are several bugs with regards to integrating the two applications with my build system. csstidy interprets white-space values incorrectly, particularly the vendor-specific values. I'm currently trying to integrate packer via this nifty little python module that uses ctypes to create an interface with Mozilla's Spidermonkey JavaScript engine. Unfortunately, there's a recursive reference somewhere in base2, and the module is choking on it, so I have to figure out how to resolve that (if possible). Another future plan involves making the site fully dynamic in that the page layout stays the same, while background XMLHttpRequests retrieve the page contents when internal links are clicked. Obviously the current behavior would be retained as a fallback.

Anyhow, there are more details about how I made my website on the colophon. Bug reports, suggestions and feature requests are welcome!


Thomas Broyer said...

Hi, could you report the issue on either genshihtml5 or html5lib, with some sample code to reproduce the bug, please?

I'd love to see genshihtml5 used somewhere (it's almost abandoned but I'd be happy to work on it again if someone expresses interest), though if you only need the serialization thing, I suggest you rather use html5lib's tree walker, which is exercised from within html5lib's test suite (genshihtml5 really lacks unit tests).

Thomas Broyer said... for the parsing of figure and other new HTML5 elements, AFAIK Opera 9 and IE7 will get them right.

In the mean time, I guess using HTML5's XML-like syntax (having your document being at the same time valid XHTML5 and HTML5) is the way to go (serving it as application/xhtml+xml to Firefox at least)