Showing posts with label website. Show all posts

Friday, May 15, 2009

Rshrtnr: The private URI shortener

Over the past couple of days, I've implemented a private URI shortener service for myself, which I have named "Rshrtnr". The derivation of the name is left as an exercise for the reader.

My main motivation for writing it was a criticism of public URI shortening services that I have been seeing in blogs for a long time: if the service has some downtime or suddenly disappears, all of the links that you have created with it are useless. With my approach, I regain some control of where my shortened links point to, and if the service has downtime and/or disappears, I have more options for restoring it.

The code itself is written in Python. SLOCCount says that the core module runs at around 100 SLOC. Most of my time, however, was taken up by working around problems relating to my webhost's Python installation. The supported Python version is 2.4.x, which is ridiculously old (for reference, Gentoo was the last major Linux distribution to switch from Python 2.4 to 2.5, around July 2008). Additionally, for some reason, if I attempt to change the sys.path variable (i.e., the "include path") to use locally installed modules (I am on a shared host), the entire script breaks with zero logged messages anywhere. It runs fine via the command line, but in FastCGI mode, the strangeness occurs.

The two third-party modules that I used were Paste and mysql-python. I store the URIs and their associated aliases in a simple SQL table, and I use Paste for various WSGI/HTTP-related utilities. I "manually" handle routing via parsing the PATH_INFO environment variable.
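As a rough illustration of routing on PATH_INFO by hand, here is a minimal WSGI sketch. It is not the Rshrtnr code: it targets Python 3 and the standard library only (no Paste, no MySQL), and the `ALIASES` dict and the `app` function are made-up stand-ins for the SQL table and the real application.

```python
# Hypothetical in-memory stand-in for the SQL table of alias -> URI rows.
ALIASES = {"example": "http://example.com/"}


def app(environ, start_response):
    """Minimal WSGI app that routes by inspecting PATH_INFO by hand."""
    # PATH_INFO is the part of the request path after the script's mount point.
    alias = environ.get("PATH_INFO", "").strip("/")
    target = ALIASES.get(alias)
    if target is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Unknown alias\n"]
    # A known alias redirects to the stored URI.
    start_response("301 Moved Permanently", [("Location", target)])
    return []
```

With only a couple of routes, a lookup like this is arguably simpler than pulling in a routing library.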

There are two ways to specify an alias: either explicitly send a custom one as a query parameter with the URI, or let the app generate one for you. In the latter case, it hashes the URI to produce an eight-character "unique" alias. Since there are (in theory) 64^8 possibilities, I don't think I'll run out of aliases any time soon, especially since custom aliases can be anywhere from 1 to 15 characters long.
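One way to get an eight-character alias over a 64-character alphabet is to hash the URI and truncate a Base64 encoding of the digest. The sketch below assumes SHA-1 and URL-safe Base64; the post does not say which hash or alphabet Rshrtnr uses, and `make_alias` is a hypothetical name.

```python
import base64
import hashlib


def make_alias(uri, length=8):
    """Derive a short alias from a URI (illustrative, not Rshrtnr's scheme)."""
    # SHA-1 is an assumption; the post does not name the hash function.
    digest = hashlib.sha1(uri.encode("utf-8")).digest()
    # URL-safe Base64 uses a 64-character alphabet (A-Z, a-z, 0-9, '-', '_').
    encoded = base64.urlsafe_b64encode(digest).decode("ascii")
    return encoded[:length]


# Size of the alias space for eight characters over a 64-character alphabet.
ALIAS_SPACE = 64 ** 8  # 281,474,976,710,656
```

Because the hash is truncated, two different URIs can in principle map to the same alias, which is presumably why "unique" is in scare quotes; a real implementation would want to check the table before inserting.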

In my opinion, the most interesting feature is that adding URIs requires one to send an OpenPGP-encoded query string, which needs a public key recognized by the app for the operation to succeed. To write this, I simply parsed the output from sending the OpenPGP message to the gpg binary.
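A minimal sketch of the "parse gpg's output" approach, assuming the query string arrives as a signed OpenPGP message and that the app keeps a set of trusted key fingerprints. The function names and the trusted-fingerprint set are hypothetical, and this is Python 3 rather than the 2.4 the host offered. It relies on gpg's machine-readable status lines (`--status-fd`), where a good signature is reported as a `VALIDSIG` line carrying the signing key's fingerprint.

```python
import subprocess


def valid_signer(status_text):
    """Return the fingerprint from a VALIDSIG status line, or None."""
    for line in status_text.splitlines():
        parts = line.split()
        # Status lines look like: [GNUPG:] VALIDSIG <fingerprint> ...
        if len(parts) >= 3 and parts[0] == "[GNUPG:]" and parts[1] == "VALIDSIG":
            return parts[2]
    return None


def verify_query(message, trusted_fingerprints):
    """Run gpg on a signed message (bytes); return the payload if trusted."""
    result = subprocess.run(
        # Status lines go to fd 2, so they arrive mixed into stderr.
        ["gpg", "--batch", "--status-fd", "2", "--decrypt"],
        input=message,
        capture_output=True,
    )
    signer = valid_signer(result.stderr.decode("utf-8", "replace"))
    if result.returncode != 0 or signer not in trusted_fingerprints:
        return None
    return result.stdout
```

Shelling out to the gpg binary keeps the app free of an OpenPGP library dependency, at the cost of requiring gpg and a keyring on the host.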

Finally, mod_rewrite magic is used to prettify the shortened URIs. Nothing too exciting about that part.

I had thought about hosting a version of Rshrtnr on Google App Engine, but a key component is missing - OpenPGP support.

If anyone wants me to release it, please comment below. There's currently a bunch of webhost-specific things that I would need to abstract out before I release the code to the general public, and unless someone gives me a very good reason, it will be licensed under the AGPL version 3.

Thursday, May 07, 2009

On Bindings

One of the more interesting areas in software development, to me at least, is language bindings. Being able to interface with a library written in one language in another language is kind of satisfying, as it allows me to develop without having to reinvent the wheel. There are two specific projects that I use and work on so that I can enhance the software that I develop: GObject Introspection (G-I) and python-spidermonkey.

GObject Introspection

As a quick overview, the goal of this project is to give C libraries the tools to provide enough metadata about their API so that bindings can be written with minimal effort. Given the time and effort that I have put into maintaining the Awn bindings, it is not very surprising that I would be willing to help get this framework working for Awn. My ultimate goal is to eliminate the bindings/python folder in the Awn source tree. That folder is basically a mixture of a Scheme definition file plus a very bizarrely formatted "override" file for custom definitions, all integrated into autotools to produce a C library that is ready to be dynamically loaded into Python via import. To meet this goal, I am contributing to the PyBank project, which is a prototype Python module that interfaces with the GObject Introspection library to read compiled library metadata files (called "typelibs") on the fly so that classes, functions, etc. can be loaded and called at runtime. In addition to myself, a Google Summer of Code student and a Sugar Labs developer are also working on the module, with Johan Dahlin overseeing it all. So far, I've contributed a unit test suite, ported from the gjs project (JavaScript bindings for GLib-based libraries, built on the SpiderMonkey VM), and working type bindings for various simple types (e.g., int64 and float).

I have also put some coding effort toward G-I integration in Vala. Vala supports G-I by both reading GIR files (the XML serialization of G-I metadata) to produce VAPI files (short for Vala API files), and writing GIR files when producing a library written in Vala (e.g., libdesktop-agnostic). My contributions mostly amount to workarounds in the GIR reading code for Vala/G-I behavioral inconsistencies. Didier 'Ptitjes' has done much, much more solid work than I have on both fronts, which I greatly appreciate.

python-spidermonkey

This project, as the README states, lets you [execute] arbitrary JavaScript code from Python[, and allows] you to reference arbitrary Python objects and functions in the JavaScript VM. As I've stated in an earlier blog post, I use this in my custom website build system to both validate and pack my JavaScript code, via JSLint and Packer, respectively. Since I published that post almost two years ago, the project has been revived twice: once by a Mozilla employee (and co-founder of Humanized, which is quite awesome) named Atul Varma, and again in its latest incarnation, which is on GitHub. Since it is based on the original implementation in C, and not the Python-based ctypes version, the Base2 recursion problem does not exist, and so I have happily written modules and scripts which wrap the two JavaScript utilities. Recently, I have made them available in a public project on Launchpad called python-jsutils. I haven't really announced it until now because it currently relies on a change I made to python-spidermonkey which allows one to iterate over a JavaScript array directly, instead of having to write "unpythonic" code like for x in range(0, len(foo)): #.... While the change is in my fork, it has not been merged into the "official" repository.
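To show the difference that iteration support makes, here is a pure-Python analogy. python-spidermonkey itself is a C extension, so this is not the actual change; `JSArrayProxy` is a hypothetical stand-in for a wrapped JavaScript array.

```python
class JSArrayProxy:
    """Hypothetical stand-in for a wrapped JavaScript array."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __iter__(self):
        # Supporting the iterator protocol is what enables "for x in foo".
        return iter(self._items)


foo = JSArrayProxy(["a", "b", "c"])

# Index-based loop, the only option when a wrapper is not iterable.
by_index = [foo[i] for i in range(0, len(foo))]

# Direct iteration, available once the wrapper is iterable.
direct = [x for x in foo]
```

Both loops visit the same elements; the second simply reads the way Python code usually does.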

Saturday, April 25, 2009

Website Internals: The JavaScript Tag Module

I need something to get myself blogging again, so I figured that I should write about how my website is written. I'm going to start with one of the parts written in JavaScript: the Tag module. It consists of a simple JavaScript object with static methods and properties, and no instances. Its main purpose is to create (X)HTML nodes for use in what used to be known as "dynamic" HTML. Incidentally, it's also part of my "old projects" series: I originally created it for an Intel-sponsored research project that I was working on during my time in university. (For more details on the project, please see my curriculum vitae.)

In practice, calls using Tag.create() don't look too horrible. Consider the example of building a minimal HTML5 document:

var html = Tag.create('html', {children: [
    Tag.create('head', {children: Tag.create('title')}),
    Tag.create('body')
]});

Or, a data table:

var table = Tag.create('table', {
    attributes: {summary: 'MLS Statistics'},
    classes: ['sortable', 'centered'], // uses sorttable
    styles: {border: '1px red inset'}, // use CSS style names, not JS ones
    children: [
        Tag.create('thead', {children: Tag.create('tr', {children: [
            // text is automatically converted
            Tag.create('th', {children: 'Team'}),
            Tag.create('th', {children: 'Goals For'}),
            Tag.create('th', {children: 'Goals Against'})
        ]})}),
        Tag.create('tbody', {children: [
            Tag.create('tr', {
                classes: ['west-coast', 'usa'],
                children: [
                    Tag.create('td', {children: 'Seattle Sounders FC'}),
                    Tag.create('td', {children: '9'}),
                    Tag.create('td', {children: '3'})
                ]
            })
            // add more teams here...
        ]})
    ]
});

I recently modified the function so that it deserializes an object (originally a JSON string) into a DOM node tree. This is particularly useful when you're sending partial HTML documents as JSON strings (which I prefer to sending HTML strings and dealing with that mess). So, the first example would look like this:

var html = Tag.create({
    "name": "html",
    "children": [
        {
            "name": "head",
            "children": {"name": "title"}
        },
        {"name": "body"}
    ]
});

Interestingly enough, I created this without knowing that JSONML or any of its brethren existed, although I suspected that JSON-HTML converters were already out there.
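The object-to-tree deserialization described above can also be sketched in Python, since the standard library's xml.dom.minidom exposes the same createElement/createTextNode/appendChild calls as the browser DOM. This is only an analogy to Tag.create(), not a port of it: the `create` function is hypothetical and handles just the "name" and "children" keys shown in the example.

```python
from xml.dom.minidom import Document


def create(doc, spec):
    """Recursively turn a JSON-style spec into a DOM node."""
    if isinstance(spec, str):
        # Plain strings become text nodes.
        return doc.createTextNode(spec)
    node = doc.createElement(spec["name"])
    children = spec.get("children", [])
    if not isinstance(children, list):
        # A single child may be given without wrapping it in a list.
        children = [children]
    for child in children:
        node.appendChild(create(doc, child))
    return node


doc = Document()
html = create(doc, {
    "name": "html",
    "children": [
        {"name": "head", "children": {"name": "title"}},
        {"name": "body"},
    ],
})
markup = html.toxml()  # serialized form of the tree that was just built
```

The recursion is the whole trick: every child is either a string or another spec, so one function covers the entire tree.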

Three of the functions in the module are basically wrapper functions. Tag.createWithText() creates an HTML element with a text child node, and Tag.createHeader() creates an HTML header element (e.g., <h1>...</h1>) with a text child node. Tag.text() is shorthand for the DOM's document.createTextNode() method.

In the current iteration, nearly all of the event-related code is commented out, as there are far more competent JavaScript libraries out there which deal with cross-browser events. The only event-related function left is Tag.dispatchEvent(), which sends a "synthetic" event for a given HTML element. If I remember correctly, I coded for both the W3C and Microsoft models, but I don't remember testing it on browsers other than IE6/7 and Firefox 2. At some point, I'll probably reintegrate event support to Tag.create() at minimum, using Base2.

The remaining function is a utility function. Tag.inXHTML() determines whether the document in question is in XHTML mode, using a variety of heuristics. I'm sure there's a better way of doing it, but I couldn't find one when I was researching it.

This module has been tested in IE6/7, Firefox 1.5/2/3, Safari 2/3, and Opera 9.5, although not all in the same time periods. It's licensed under the Apache License (version 2.0) and is currently somewhere in my compressed JavaScript file. If there's any interest, I'll put up the non-compressed version somewhere and update this post.

Wednesday, January 16, 2008

This'll be interesting...

I switched my DNS host from zoneedit.com to editdns.net, because ZoneEdit doesn't support DNS SRV records. They're needed for XMPP server-to-server federation support. My XMPP address is <${my_second_level_domain} at ${my_second_level_domain} dot com>. Note that this is different from my email address.

Thursday, October 25, 2007

TODO List, 2007/10/29

Avant Window Navigator

  • Finish file monitor wrapper
  • Fix python bindings for awn.DesktopItem
  • Fix launcher bugs
  • Add test programs for filemonitor wrapper and desktop item wrapper
  • Fix inter-process config handling

Pidgin Status Updater

  • Add project/source code to Launchpad
  • add Jaiku support (use xmlrpc-c)
  • put HTTP requests in a separate thread
  • cache cookie-based user authentication

Website

  • Make pages unobtrusively load dynamically
  • Add section on Avant Window Navigator

Thursday, July 12, 2007

Website finally updated, hooray!

After years of malnourishment and two weeks of development, my little old static website (now using a smaller domain name!) is live. The old website, like the new website, was created via a templating system. However, the former website's templating system was homegrown using PHP4 classes (disgusting, I know...but that's all I could use at the time). Even more disgusting about my system was that it was HTML comment directives plus a regular expression parser. I was so young and naïve, and I hadn't taken a compilers class yet. So this time around, I said "screw it" and went with a) my favorite language, Python, and b) the template software that I had been using for my Trac-AtomPub plugin (yes, not -atompp anymore, per the lengthy discussion on the atom-protocol mailing list).

The Journey

As the new website was a chance to experiment with new things, I decided to take the plunge and use HTML5 to mark up my website. And as with any sort of experimental technology, there were many problems.

First, I tried to use the genshihtml5 plugin, but strangely enough the code was a bit buggy (e.g., it was missing an import), and I could never figure out how to get it to output proper HTML5 while still removing end tags from tags which don't need them, e.g. <link/>, while retaining them for tags which require one, e.g., <script/>.

Next, I tried to use html5lib's Genshi-Stream-based tree walker. For some reason, it simply would not output any data. I don't remember all of the details, but I do remember inserting a lot of print statements in html5lib to see if I could find the bad piece.

Finally, I gave up and made Genshi just output XHTML plus the extra HTML5 tags. I figured that all of the debugging trouble simply wasn't worth it for the timeframe I had envisioned.

(As an aside, I do plan on submitting the patches that I've made as a result of this...exercise (for lack of a better word) so that they can be integrated in future releases of the respective software.)

Actual usage of new HTML5 tags was...interesting to debug. If you're writing HTML5 and not XHTML5, and you're viewing the page in Firefox, this is what the DOM tree looks like (according to Firebug):

<figure _moz-userdefined="" />
<img src="..." alt="..." />
<legend>...</legend>

For comparison, this is what it looks like when rendered as XHTML5:

<figure>
<img src="..." alt="..." />
<legend>...</legend>
</figure>

That completely broke my CSS files, as I was using child/descendant rules utilizing the new tags. This sort of thing is why I love using Firebug.

Testing

I've really only thoroughly tested this website on Firefox 2.x (Windows & Linux). I just checked it on Opera 9.20 (Linux) and a relatively old development version of Gtk-Webcore (AKA WebKit), and the only bug that I see (in both of them, strangely) is some sort of CSS error in calculating the spacing for the <dd/> box for "Special Skills" in my CV.

Future

Future plans include packing both the CSS and the JavaScript, via csstidy and packer, respectively. Right now there are several bugs with regard to integrating the two applications with my build system. csstidy interprets white-space values incorrectly, particularly the vendor-specific values. I'm currently trying to integrate packer via this nifty little Python module that uses ctypes to create an interface with Mozilla's SpiderMonkey JavaScript engine. Unfortunately, there's a recursive reference somewhere in base2, and the module is choking on it, so I have to figure out how to resolve that (if possible). Another future plan involves making the site fully dynamic, in that the page layout stays the same while background XMLHttpRequests retrieve the page contents when internal links are clicked. Obviously the current behavior would be retained as a fallback.

Anyhow, there are more details about how I made my website on the colophon. Bug reports, suggestions and feature requests are welcome!