Monday, November 15, 2010

A Year With Mark @ DevHub

Note: verb tenses might be a little out of whack, because it's kind of strange to write this the day before my last day at DevHub.

Other Note: Obviously, this is my personal blog and in no way is indicative of the opinion of either company.


I've been working on the DevHub platform for just over a year, and I've decided that it's time for me to move on. I'll be working for an educational startup, Dreambox Learning, which creates web-based math software. Specifically, I'll be working on their marketing website (the one that I linked to). I might have more time to work on other projects, but we'll see.

Why now?

Honestly, I expected to be working at EVO Media Group for at least another year, building up my "professional experience" so that I wouldn't have to go through the hell that was my last job-seeking "adventure". Additionally, I loved the work that I did - I was writing challenging code in my favorite programming language, Python, and it was being used by thousands upon thousands of people every day, on hundreds of thousands of sites. It also helps that I enjoyed working with my co-workers, even during extreme crunch time (I'll get to that later).

But there are a few things which this new opportunity will give me, in no particular order (sorry, I rather like bulleted lists):

  • The chance to work at a company where one of the primary focuses is social change. This has been one of my career goals for a while now (along with, "working for a company that primarily creates open source software" - we'll see when that one is checked off). I recently tweeted that "[t]he Seattle school district averages for 10th grade math/science proficiencies (2009-10) are <50%." As a person who enjoyed both of those subjects in school, I would love to help fix that problem. Granted, the company doesn't do high school math curricula, but giving younger students a firmer grasp of basic math concepts will surely help.
  • The opportunity to work in a larger company. Dreambox is several times larger than my current place of work, and working with more (and different types of) people is always a good learning experience - one that will also help my career.
  • An excuse to learn Ruby. Sadly, there aren't that many Python jobs around (though strangely, I've been getting cold-called by recruiters on a much more frequent basis lately), and it doesn't hurt to be more versatile. Particularly when I refuse to work with Java servlets, and to a lesser extent, the .NET Framework. I've also been avoiding PHP work, now that I know how wonderful Django is.
  • I won't have to do customer technical support anymore. Not that I absolutely hate doing it - I voluntarily did it for the Avant Window Navigator project for years. I like helping people, I just don't like being strongly encouraged to do so, every single day. Speaking of Awn…
  • I'll (probably) have more time and energy for side projects. So much of both was taken up by work, especially during the Month of Hell™, when we were working nonstop on creating the gamified version of the site editor (and I slept in the office for a week). I've been told that the likelihood that I'll be pulling an all-nighter at my new gig is low - we'll see. But I really, really want to get back to having side projects, and possibly to resuming work on Awn and related technology. At my current job, I've mainly been worried about burnout, which was a strong factor in my putting off other coding projects. I really love to code, and it would be terrible if I just started hating it. (On a somewhat related note, one of my metrics for whether I should start looking for a new job is when my life starts sounding like the first verse of Jonathan Coulton's "Code Monkey". Not that I currently feel like that about DevHub.)

Hopefully, the reasoning above shows that I have put some thought into whether I should change jobs, unlike what certain people (whom I will not name) have insinuated.

What did I do at DevHub?

I've been relatively quiet about what I've worked on at DevHub. You can see bits and pieces of it via Twitter and LinkedIn (not to mention BitBucket and GitHub), but I wanted to give an overall view of what I did, without violating NDAs or anything like that.

My primary focus was the application layer. As the DevHub developers page says, it's a Django-based environment. Interestingly enough, when I applied for the job, I didn't know Django at all. I was aware that it existed, and I had tried learning Pylons a few months prior (that ended badly). I did, however, know WSGI fairly well, as my URL shortener uses it. So, dealing with Python and the web wasn't a completely foreign concept to me. I would say that it's a testament to how awesome Django is that I was able to pick it up and port the simple to-do web application that I was writing in PHP (using Doctrine as the ORM) in under a day. Of course, as soon as I was hired, I was made aware that certain major components of Django (the ORM and template systems) weren't being used; SQLAlchemy and Jinja2 were used instead. That's another good thing about Django - it may be heavily opinionated, but it's not necessarily "my way or the highway".
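For the curious, that combination looks roughly like this. This is a hedged sketch, not DevHub's actual code - the table, template, and view-function names are all invented - but it shows the shape of a view that uses Jinja2 instead of Django templates and SQLAlchemy instead of the Django ORM:

```python
from jinja2 import Template
from sqlalchemy import create_engine, text

# In-memory SQLite stands in for the real database.
engine = create_engine('sqlite:///:memory:')
with engine.begin() as conn:
    conn.execute(text('CREATE TABLE todo (title TEXT)'))
    conn.execute(text("INSERT INTO todo VALUES ('write blog post')"))

# A Jinja2 template in place of Django's template language.
todo_template = Template(
    '<ul>{% for title in todos %}<li>{{ title }}</li>{% endfor %}</ul>')

def todo_view():
    # SQLAlchemy in place of Django's ORM.
    with engine.connect() as conn:
        rows = conn.execute(text('SELECT title FROM todo')).fetchall()
    return todo_template.render(todos=[row[0] for row in rows])
```

The point is that Django doesn't fight you when you swap those layers out; the URL routing, middleware, and the rest of the framework carry on as usual.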

This particular aspect is important, mostly because about ten months later, I was given the task of writing a "macroframework" around this particular combination of technologies, using all of the best practices that we had accumulated since I was hired. I genuinely hope that it gets open sourced, because it's a fairly complete framework - it ports many popular Django apps and, as a good Django-based package should, it has a lot of unit tests and documentation. In the process of writing it, I've also contributed fixes to the apps that I've ported, when I've seen areas that needed improvement.

There's one other library that I wrote, which I hope will be open sourced. It's essentially a domain name parser. It can tell you whether a given domain name is syntactically valid, and provide relevant and proper concatenations of the constituent parts, such as the subdomain and the domain. It also handles IDNs just fine. It's a bit domain-specific (no pun intended), but works well, mostly due to the amount of unit/doc/regression tests I've written for it.
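To illustrate the core idea (this is a toy sketch, not the library itself - the suffix set, function name, and return format are all invented for the example), splitting a hostname against a public-suffix list looks something like this:

```python
# Tiny, hypothetical public-suffix set; a real implementation would use
# the full Public Suffix List.
PUBLIC_SUFFIXES = {'com', 'org', 'net', 'co.uk'}

def split_domain(hostname):
    """Split a hostname into (subdomain, registered domain)."""
    labels = hostname.lower().rstrip('.').split('.')
    # Find the longest public suffix that ends the hostname.
    for i in range(len(labels)):
        suffix = '.'.join(labels[i:])
        if suffix in PUBLIC_SUFFIXES:
            if i == 0:
                raise ValueError('hostname is a bare public suffix')
            domain = '.'.join(labels[i - 1:])
            subdomain = '.'.join(labels[:i - 1]) or None
            return subdomain, domain
    raise ValueError('unknown public suffix')
```

The interesting edge cases (multi-label suffixes like co.uk, IDNs, trailing dots) are exactly where the unit/doc/regression tests earn their keep.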

In late January, I was assigned the task of porting the DevHub platform from PHP to Python (one of the reasons I was hired was that I knew both languages fairly well). And that began a six-month journey, along with my co-workers (who included the other, more senior developer and two recently hired designers), where we would be working incredibly long hours to get the new-and-improved DevHub launched in early July.

I had been doing some experimenting with PyPy, because of its sandboxing capabilities. Unfortunately, due to several factors, it was deemed infeasible to use. Since then, however, I have been keeping tabs on its development to see if any of said factors have been eliminated. Regardless of that setback, by March I had made a reasonable amount of progress on the port, and the unpaid overtime began. (Yay for exempt status!)

In the process of porting the platform, I had to deal with a number of third-party APIs, because one of DevHub's features is that it supports a number of third-party services by default (as opposed to having to add HTML embed code given by the third party). The quality of these APIs ranged from half-decent to just plain terrible. Mind you, I've worked with other APIs prior to DevHub (in fact, I won a t-shirt in an API contest), but they were at least decently documented and their structure made some sense. It's amazing how little thought some of these API providers give to their users.

In May, a few things happened: The platform port was essentially complete, our hosting provider took forever to move our server instances cross-country, and it was decided that the site editor needed to be gamified. I wrote a small prototype to see how that would work. Eventually, it was decided that most of that would be scrapped and that we would be using the BigDoor API. We were already partners, so it seemed like a natural fit.

June was the aforementioned "Month of Hell™". At one point, I was at the office for 14 days straight. At the end, I began my week of sleeping at the office (AKA, "The Week of Utter Hell™"). Quite possibly, the one good thing that came out of that experience, on a personal level, was that I was given my current phone, a Motorola Droid, as recognition of how much time I spent at the office. (My boss had gone to Google I/O and had gotten one for "free", and was/is an iPhone user and thus on AT&T, so it wasn't much use to him.)

By the launch in July, I was as close to burned out as I ever wanted to be. Fortunately, I had made sure that I got a week of vacation in mid-July (where I would be going to OSCON, independent of the company, and also taking in some of the sights of Portland). By the time I was back at work, some people had noticed that I was a significantly different person (i.e., not ridiculously stressed out). I don't really want to think about what would've happened had I not taken that trip when I did.

Relative to the previous couple of months, August was pretty calm. We (the company) did play a game of dodgeball with a company that we were going to partner with. For me, that just indicated that I was really out of shape. I immediately began jogging when the CEO insinuated that there may be more of these games. (To date, there hasn't been another one.)

September was pretty awesome, mostly because I was fortunate enough to go to DjangoCon. (The company paid for most of it, as part of an agreement during The Month of Hell™.) I talked to some fellow web developers, plus sat in on some pretty interesting talks. I really wish that I could have stayed for all three days, but alas. One interesting thing came out of the experience. One of the technologies that people were consistently touting as a must-use package was celery, a distributed task queue. About a week after DjangoCon, we had a big problem with a long-running task during the request process. I remembered celery, and in under a week, I experimented with it on the development server, documented the process to install the subsystem (for the benefit of our sysadmin), helped my co-worker patch the task to use celery, tested the patch, and deployed it to the live servers.

October was the month where I was both working on a client project and dealing with the decision of whether to change jobs, so I've covered most of that already. One thing that I think is worth mentioning is that I started to use code from the HTML5 Boilerplate project. I liked it so much, I'm using it in my current side project, the recently resurrected to-do app. And I plan on using it in the next job, too.

The End

And here we are, in the "present". I know it's a bit cliché, but I'd like to publicly thank the execs at EVO Media Group for hiring me 13 months ago. I really, really appreciate the confidence that you had in my work, and I hope that DevHub becomes even more popular and awesome than it is now.

Friday, December 04, 2009

Open Letter to Sherman Alexie

While going through my backlog of TV shows from the past week, I was watching the Colbert Report from Tuesday (December 1). During his interview with Sherman Alexie, I heard something that sounded rather offensive to my ears. Skip to 3:14 to hear it.

For those of you who hate Flash or don't wish to watch the video, the context here is that Colbert is asking why Alexie doesn't allow his books to be digitized. His response, up to the point referenced above, was the typical line about how the music industry is losing money because of rampant piracy, and how the only way to make money now is via live shows. And then he makes this statement:

[...]and with the open-source culture on the Internet, the idea of ownership, of artistic ownership goes away.

Mr. Alexie, to use a colloquialism: What have you been smoking? I know that you're BFF with The Stranger, but this is ridiculous. You're supposed to be intelligent, not ignorant. There are a couple of things that are inaccurate with your statement.

First, I'm pretty sure you're referring to the Free culture movement. "Open source", while it can refer to non-technical ideas, is more closely associated with software and its licenses. But I'm being pedantic.

Secondly (and more importantly), where does the idea of ownership go away? Maybe if you release the work into the public domain, sure. However, the majority of "free culture licenses" (e.g., Creative Commons licenses) ensure that one still owns the work that one creates. The significant difference between traditional copyright and those licenses is that certain rights are granted by default, instead of having to ask the author for them. For example, this blog post (and the entire blog, for that matter) is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License (as explicitly stated in the sidebar). What this means is that I have granted noncommercial entities the right to reproduce (or even "remix") my work (in full or in part) in other works, so long as I am credited and the work retains the same license. Nowhere did I relinquish my ownership of this blog post. If a commercial entity wants to use my work, or someone wants to reproduce/remix my work under a different license, traditional copyright applies and they have to ask (assuming the use doesn't fall under "fair use").

As one of the literate (see the interview for the reference) and also a dead-tree book lover, I suggest that you read two books, both by Professor Lawrence Lessig: Free Culture and Remix: Making Art and Commerce Thrive in the Hybrid Economy. Both should be available at your local bookstore.

Finally, thank you, Mr. Alexie, for reminding me that I still need to donate to the Creative Commons this year. I want to spread the Free culture movement as quickly as possible, making sure that the views that you expressed are corrected just as quickly.

Saturday, June 27, 2009

A Quick Note about Planet Awn

Weirdness which I can now attribute to "PHP Library Hell" (similar to DLL hell) had caused the full text of posts for most of the feeds to not be syndicated in the planet feed for several weeks. This is now fixed. (It seems I had to upgrade to WordPress 2.8 for the actual error to appear and a solution to present itself.) I apologize for the inconvenience.

Wednesday, June 03, 2009

Avant Window Navigator (Awn) 0.4 Progress Report: June 2009

I've been a bit quiet about Awn in my blog. This is partially because I've been working on two major areas: libdesktop-agnostic and Vala/GObject Introspection (as explained in my previous post, On Bindings).

Since my work on libdesktop-agnostic is directly related to Awn, I'll address it here (even though it may be boring to most of you). I currently have abstractions for configuration, desktop entries, and virtual file systems (e.g., GIO). The area that needs the most work currently is the configuration support. What will ultimately happen is that there will be a class which lets you have per-instance configuration for a given app(let), regardless of the backend that is being used. What this means is that Awn users won't be shackled to using GConf if they want a stable dock. Plus, if a user wants to switch backends for some bizarre reason, they don't have to recompile Awn to do so. Obviously, they'd have to migrate the settings themselves, but that's the price one pays.

Anyway, on to Awn. Here's a screencast I took (CC-BY-SA 3.0 licensed!) of a development version of Awn in an Ubuntu Jaunty virtual machine (hooray for VirtualBox!):

(Direct link to YouTube video)

The bar colors are kind of ugly, mostly because I was testing some of the color-related code on that VM.

Obviously, the features shown in the video (orientation, panel style, and applet loading indicator) aren't the only new features in 0.4. I'll be showing off more shininess in successive videos.

Moonbeam has covered most of the current status of Awn 0.4, which I will repeat here, briefly, for those of you who are scared of clicking on links:

  • Moonbeam is working on an API that allows applets to have text and graphics overlay the applet icon. Think Awn plugin support for applets.
  • He is also rewriting Awn System Monitor so that one can monitor multiple things from the dock, among other things.
  • Certain panel/task animations still need to be implemented.
  • Plugin (not applet) support still needs to be implemented.

I'd like to take a moment to talk about the Awn plugin system, particularly to those who actually write the plugins. The D-Bus plugin API that was in 0.2.x and 0.3.x will be deprecated in the 0.4 series, and removed in the 0.6 series. There will be a new API in the 0.4 series.

With regards to what I'm working on - in addition to libdesktop-agnostic integration, I will most likely be working on implementing the D-Bus plugin API, once Moonbeam is done with the API mentioned above.

On the Awn Extras front, I have the Garbage applet waiting to be added. This is waiting on Vala support being re-activated (which is dependent upon adding GObject Introspection support to Awn), and on porting the applet to the 0.4 API. There is also an rTorrent frontend and a social aggregator applet that I have on the back burner (both written in Vala), which may or may not make it into 0.4.0.

And finally, a note for Ubuntu users: we are not making available PPA builds of 0.4 until there are no feature regressions in the rewrite. This release is targeted for the official Karmic Koala repositories, just as 0.3.2 was targeted for the Jaunty Jackalope repositories. We anticipate building packages for Hardy and above.

Remember, the best way to keep track of new developments in Awn/Awn Extras is to subscribe to Planet Awn. Be with us next time for "Autohide and Seek", or "The Incredible Shrinking Dock"!

Friday, May 15, 2009

Rshrtnr: The private URI shortener

Over the past couple of days, I've implemented a private URI shortener service for myself, which I have named "Rshrtnr". The derivation of the name is left as an exercise for the reader.

My main motivation for writing it was a criticism of public URI shortening services that I have been seeing in blogs for a long time: if the service has some downtime or suddenly disappears, all of the links that you have created with it are useless. With my approach, I regain some control of where my shortened links point to, and if the service has downtime and/or disappears, I have more options for restoring it.

The code itself is written in Python. SLOCCount says that the core module weighs in at around 100 SLOC. Most of my time, however, was taken up by working around problems relating to my webhost's Python installation. The supported Python version is 2.4.x, which is ridiculously old (for reference, Gentoo was the last major Linux distribution to switch from Python 2.4 to 2.5, around July 2008). Additionally, for some reason, if I attempt to change the sys.path variable (i.e., the "include path") to use locally installed modules (I am on a shared host), the entire script breaks with zero logged messages anywhere. It runs fine via the command line, but in FastCGI mode, the strangeness occurs.

The two third-party modules that I used were Paste and mysql-python. I store the URIs and their associated aliases in a simple SQL table, and I use Paste for various WSGI/HTTP-related utilities. I "manually" handle routing via parsing the PATH_INFO environment variable.
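The routing boils down to something like this. It's a sketch, not the actual Rshrtnr code - the alias table is a stand-in for the MySQL lookup, and it's written in Python 3 style even though the original ran on 2.4:

```python
# Stand-in for the MySQL alias -> URI lookup.
ALIASES = {'blog': 'http://example.com/blog/'}

def application(environ, start_response):
    """Minimal WSGI app: treat PATH_INFO as a short-link alias."""
    alias = environ.get('PATH_INFO', '').strip('/')
    target = ALIASES.get(alias)
    if target is None:
        start_response('404 Not Found', [('Content-Type', 'text/plain')])
        return [b'Unknown alias']
    # Redirect the visitor to the original (long) URI.
    start_response('301 Moved Permanently', [('Location', target)])
    return [b'']
```

Everything else is bookkeeping: inserting new aliases, and the mod_rewrite rule that maps pretty URIs onto PATH_INFO.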

There are two ways to specify an alias: either explicitly send a custom one as a query parameter with the URI, or let the app make a random one for you. With the latter behavior, it hashes the URI to generate an eight character "unique" alias. Since there are (in theory) 64^8 possibilities, I don't think I'll run out of aliases any time soon, especially since custom aliases can be anywhere from 1 to 15 characters long.
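The random-alias generation can be sketched like so. This is my description in code form rather than the app's actual implementation - the choice of SHA-1 is an assumption - but it shows where the 64^8 figure comes from: eight URL-safe base64 characters, each with 64 possible values:

```python
import base64
import hashlib

def make_alias(uri):
    # Hash the URI (SHA-1 here is illustrative), then keep the first
    # 8 URL-safe base64 characters: 64**8 possible aliases.
    digest = hashlib.sha1(uri.encode('utf-8')).digest()
    return base64.urlsafe_b64encode(digest)[:8].decode('ascii')
```

Since the alias is derived from a hash, the same URI always maps to the same alias; collisions are possible in principle, so a real insert path would check the table and re-hash (or salt) on a clash.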

In my opinion, the most interesting feature is that adding URIs requires one to send an OpenPGP-encoded query string, which needs a public key recognized by the app for the operation to succeed. To write this, I simply parsed the output from sending the OpenPGP message to the gpg binary.

Finally, mod_rewrite magic is used to prettify the shortened URIs. Nothing too exciting about that part.

I had thought about hosting a version of Rshrtnr on Google App Engine, but a key component is missing - OpenPGP support.

If anyone wants me to release it, please comment below. There's currently a bunch of webhost-specific things that I would need to abstract out before I release the code to the general public, and unless someone gives me a very good reason, it will be licensed under the AGPL version 3.

Thursday, May 07, 2009

On Bindings

One of the more interesting areas in software development, to me at least, is language bindings. Being able to interface with a library written in one language in another language is kind of satisfying, as it allows me to develop without having to reinvent the wheel. There are two specific projects that I use and work on so that I can enhance the software that I develop: GObject Introspection (G-I) and python-spidermonkey.

GObject Introspection

As a quick overview, the goal of this project is to give C libraries the tools to provide enough metadata about their API so that bindings can be written with minimal effort. Given the time and effort that I have put into maintaining the Awn bindings, it is not very surprising that I would be willing to help get this framework working for Awn. My ultimate goal is to eliminate the bindings/python folder in the Awn source tree, which is basically a mixture of a Scheme definition file plus a very bizarrely formatted "override" file for custom definitions, all integrated into autotools to produce a C library that is ready to be dynamically loaded into Python via import. To meet this goal, I am contributing to the PyBank project, a prototype Python module that interfaces with the GObject Introspection library to read compiled library metadata files (called "typelibs") on the fly, so that classes, functions, etc. can be loaded and called at runtime. In addition to myself, a Google Summer of Code student and a Sugar Labs developer are also working on the module, with Johan Dahlin overseeing it all. So far, I've contributed a unit test suite ported from the gjs project (JavaScript bindings for GLib-based libraries, based on the SpiderMonkey VM), as well as working type bindings for various simple types (e.g., int64 and float).

I have also put some coding effort toward G-I integration in Vala. Vala supports G-I by both reading GIR files (the XML serialization of G-I metadata) to produce VAPI files (short for Vala API files), and writing GIR files when producing a library written in Vala (e.g., libdesktop-agnostic). I have contributed mostly what amounts to workarounds in the GIR reading code, with regards to Vala/G-I behavioral inconsistencies. Didier 'Ptitjes' has done much, much more solid work than I have on both fronts, which I greatly appreciate.


python-spidermonkey

This project, as the README states, lets you "[execute] arbitrary JavaScript code from Python[, and allows] you to reference arbitrary Python objects and functions in the JavaScript VM". As I've stated in an earlier blog post, I use this in my custom website build system to both validate and pack my JavaScript code, via JSLint and Packer, respectively. Since I published that post almost two years ago, the project has been revived twice - first by a Mozilla employee (and co-founder of Humanized, which is quite awesome) named Atul Varma; the latest incarnation is on GitHub. Since it is based on the original implementation in C, and not the Python-based ctypes version, the Base2 recursion problem does not exist, and so I have happily written modules and scripts which wrap the two JavaScript utilities. Recently, I made them available in a public project on Launchpad called python-jsutils. I hadn't really announced it until now because it currently relies on a change I made to python-spidermonkey which allows one to iterate over a JavaScript array, instead of having to write "unpythonic" code like for x in range(0, len(foo)): #.... While the change is in my fork, it has not yet been merged into the "official" repository.

Saturday, April 25, 2009

Website Internals: The JavaScript Tag Module

I need something to get myself blogging again, so I figured that I should write about how my website is written. I'm going to start with one of the parts written in JavaScript: the Tag module. It consists of a simple JavaScript object with static methods and properties - no instances. Its main purpose is to create (X)HTML nodes for use in what used to be known as "dynamic" HTML. Incidentally, it's also part of my "old projects" series: I originally created it for an Intel-sponsored research project that I worked on during my time at university. (For more details on the project, please see my curriculum vitae.)

In practice, calls using Tag.create() don't look too horrible. Consider the example of building a minimal HTML5 document:

var html = Tag.create('html', {children: [
    Tag.create('head', {children: Tag.create('title')}),
    Tag.create('body')
]});

Or, a data table:

var table = Tag.create('table', {
    attributes: {summary: 'MLS Statistics'},
    classes: ['sortable', 'centered'], // uses sorttable
    styles: {border: '1px red inset'}, // use CSS style names, not JS ones
    children: [
        Tag.create('thead', {children: Tag.create('tr', {
            children: [
                // text is automatically converted
                Tag.create('th', {children: 'Team'}),
                Tag.create('th', {children: 'Goals For'}),
                Tag.create('th', {children: 'Goals Against'})
            ]
        })}),
        Tag.create('tbody', {children: [
            Tag.create('tr', {
                classes: ['west-coast', 'usa'],
                children: [
                    Tag.create('td', {children: 'Seattle Sounders FC'}),
                    Tag.create('td', {children: '9'}),
                    Tag.create('td', {children: '3'})
                ]
            })
            // add more teams here...
        ]})
    ]
});

I recently modified the function so that it deserializes an object (originally a JSON string) into a DOM node tree. This is particularly useful when you're sending partial HTML documents as JSON strings (which I prefer to sending HTML strings and dealing with that mess). So, the first example would look like this:

var html = Tag.create({
    "name": "html",
    "children": [
        {
            "name": "head",
            "children": {"name": "title"}
        },
        {"name": "body"}
    ]
});
Interestingly enough, I created this without knowledge of the existence of JsonML or any of its brethren, although I suspected that JSON-HTML converters already existed.

Three of the functions in the module are basically wrapper functions. Tag.createWithText() creates an HTML element with a text child node, and Tag.createHeader() creates an HTML heading element (e.g., <h1>...</h1>) with a text child node. Tag.text() is shorthand for the DOM's document.createTextNode() method.

In the current iteration, nearly all of the event-related code is commented out, as there are far more competent JavaScript libraries out there which deal with cross-browser events. The only event-related function left is Tag.dispatchEvent(), which sends a "synthetic" event for a given HTML element. If I remember correctly, I coded for both the W3C and Microsoft models, but I don't remember testing it on browsers other than IE6/7 and Firefox 2. At some point, I'll probably reintegrate event support to Tag.create() at minimum, using Base2.

The remaining function is a utility function. Tag.inXHTML() determines whether the document in question is in XHTML mode, using a variety of heuristics. I'm sure there's a better way of doing it, but I couldn't find one when I was researching it.

This module has been tested in IE6/7, Firefox 1.5/2/3, Safari 2/3, and Opera 9.5 - although not all in the same time periods. It's licensed under the Apache License (version 2.0) and is currently somewhere in my compressed JavaScript file. If there's any interest, I'll put up the non-compressed version somewhere and update this post.