Google App Engine, Bobby Tables, And #bugsthatuwontfind

Google App Engine went into extended downtime last Thursday. The Tweetosphere was up in arms, and there were a lot of pissed off people roaming the interwebs. Understandable. Downtime sucks for everyone, whether it’s your app, or you’re the user and you just need that damn duck jambalaya that you saved on that spiffy new app you discovered last week. The cause of the downtime? A single malformed file pointer passed from a single application unintentionally exploited a bug in the GFS Master Node that runs the Data Store. More from the official Google post:

The root cause of the outage was a bug in the GFS Master server caused by another client in the data center sending it an improperly formed file handle which had not been safely sanitized on the server side, and thus caused a stack overflow on the Master when processed.

I’m oversimplifying. That single fault actually caused a cascade of unforeseen circumstances that took a not-insignificant amount of time for Google’s engineers to patch up. The details are in the extended downtime information that Google provided and I won’t go into them here.

What I find both comforting and disturbing about this catastrophe is that it uncovered a who-knows-how-old bug in GFS. The problem had been experienced a week earlier, but given the nature of the App Engine DataStore, I have to think that if the engineers had known what caused it, it would have been patched sooner rather than later:

8:00 AM — The cause of the GFS Master failures has not yet been identified. However, a similar-looking issue that had been seen in a different data center the week prior had been resolved by an upgrade to a newer version of the GFS software.

It sounds to me like this bug was accidentally fixed, rather than discovered, recreated, and intentionally patched. It’s difficult, however, to do much more than guess at Google’s internal software upgrade schedule, so this is all speculation. It just goes to show that you can never be too paranoid about what your users are doing on your platform. I’m reminded of the XKCD comic Exploits of a Mom:

Exploits of a Mom [via XKCD]

Meebo misses me

When I was a student at MSUM, we had one small computer lab that was the unofficial geek hangout: the Linux Lab.

The computers were nothing extravagant, and until the last year of my undergrad degree they still had CRT monitors, but it was our lab. The one downside was that the lab was so locked down network-wise that we couldn’t use the already-installed IM clients (probably for good reason). Fortunately, Meebo was coming into its own during the same time period.

Meebo is pretty cool in its own right, but I pretty much had no use for it after I finished college, and so I forgot about it until the other day when I received this email:

Hello from Meebo :-)

I don’t know exactly why, but somehow this obviously automated email seemed strangely personal. Someone at Meebo went out of their way to solicit some feedback about why I stopped using their service, which is enough to make me think, “This company is worth my time.” Tiny glimpses at the personality driving services like Meebo make software seem not so impersonal, and that can really make a difference in how your userbase views you as a company.

Opera’s Standards Compliance a Detriment?

There’s an interesting compatibility debate going on in the comments thread of a blog post on my.opera.com.

The issue the article poses is that Opera has gone to great lengths to follow the standards that the W3C put forth. As a result, the author has run into a number of major sites that do not work properly due to their abuse^H^H^H^H^H use of JavaScript’s setTimeout method.

When I first got into web development, I was one of the purists who thought that page validation and standards were really important.

Scratch that.

I still think that standards and validation are important. I’ve seen some bizarre behavior in all of the major browsers (Firefox, Opera, IE, Safari) due to even more bizarre markup. To some extent, however, Joel had a point when he argued that standards don’t matter. Standards are great, until you end up with a non-standards-compliant implementation that, for a long time, becomes the standard upon which others are judged (there are several issues I have with Joel’s position, but those are best left for another article). Back when IE6 was the browser and Firefox was sitting on a puny 2% market share, if a page displayed and functioned properly in IE but not in Firefox, well dammit, it was Firefox’s fault. Eventually, however, we learned that Firefox did a much better job of following the W3C standards, and it was a little easier to develop for, and people started using it. Several years later, the tables have turned, and it’s the Internet Explorer team that’s taking the bad rap for not having committed to standards early on.

Opera is in the same boat today as Firefox was back then. Firefox has become the standard by which everything else is judged (though WebKit is rapidly gaining support), and as a consequence of Firefox having to implement some of IE’s quirks for sanity’s sake, Opera adheres just a little bit better to the W3C standards than Firefox does. Parsing markup is one thing, but JavaScript performance has become a huge interest lately. Don’t believe me? Ask Google.

Speed Comparison Hits

Given what John Resig discovered about timer implementations across various browsers, it’s no surprise that the sites mentioned in the Opera article have wildly different results depending on which browser you’re using. But c’mon people, it’s 2009 already. Haven’t we found some sane way to work around issues like this?

Actually yes.

This is exactly why the more popular JavaScript frameworks were developed and have caught on so widely. They take care of these browser differences so you don’t have to. It’s a little surprising to see that large organizations such as AOL and the New York Times haven’t picked up one of the popular libraries, though I suppose it’s just as reasonable to assume that they’ve developed their own code in-house, and it’s just not quite as robust as it should be.
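The article doesn’t say exactly how those sites use setTimeout, so this is only a guess at the general class of problem: code that assumes every timer callback fires exactly on schedule will behave differently wherever the browser’s timer granularity differs. One common way around that is to work from elapsed wall-clock time instead of counting callbacks. A minimal sketch, where animationProgress and slide are hypothetical names and not taken from any library:

```javascript
// Hypothetical helper: how far along an animation should be, as a number
// from 0 to 1, based on elapsed time rather than on how many callbacks ran.
function animationProgress(startMs, nowMs, durationMs) {
  var elapsed = nowMs - startMs;
  if (elapsed <= 0) return 0;
  if (elapsed >= durationMs) return 1;
  return elapsed / durationMs;
}

// Hypothetical usage: move an element distancePx to the right over durationMs.
// The element ends up in the same place whether the browser fires the timer
// every 10ms or only every 16ms, because position depends on the clock.
function slide(element, distancePx, durationMs) {
  var start = new Date().getTime();
  function step() {
    var p = animationProgress(start, new Date().getTime(), durationMs);
    element.style.left = Math.round(p * distancePx) + "px";
    if (p < 1) {
      // The delay is a request to the browser, not a guarantee.
      setTimeout(step, 10);
    }
  }
  step();
}
```

The libraries do a good deal more than this, but the idea of trusting the clock over the callback count is the part that makes the result look the same from browser to browser.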

Should Opera get slammed for something that is apparently an oversight on the part of the web developers?

Preload your CSS Images

Here’s something I come across occasionally that I find rather annoying.

You’re browsing around the web at 2am, searching for the perfect gift for that special someone. You mouse over the tab navigation and... WTF... the tab completely disappears for a second. After another round-trip to the server, the on-hover image for the tab finally loads and displays as it was intended.

Sometimes this isn’t a big deal, since it’s only a 200ms round-trip to the server for a tiny image, but occasionally on a high-traffic server, or a site that has quite a bit of latency, it creates a noticeable eyesore, and to me, it seems flat out lazy when there are easy ways of avoiding it, such as:

1. Image Sprites

Sprites are becoming more and more common these days. The basic idea of an image sprite is that you combine many small images into a single large image and use CSS to manipulate the position of the image so only the desired section shows up where needed. For example, to get the hover state image for the undo icon, you would set the background-position to -200px -40px (200px from the left, 40px from the top of the image sprite).
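The offsets are just arithmetic on the icon’s position in the grid. Here is a small sketch of that calculation, assuming every icon occupies a cell of the same size; the 20px cell size and the column and row numbers are my assumptions, picked so that they reproduce the -200px -40px value above, and spritePosition is a hypothetical helper name:

```javascript
// Hypothetical helper: build a CSS background-position value for the icon
// at a given column and row (both counted from 0) of a sprite sheet.
// Assumes a uniform grid; cellWidth and cellHeight are in pixels.
function spritePosition(column, row, cellWidth, cellHeight) {
  function offset(index, size) {
    var px = index * size;
    // Offsets are negative because the image is shifted up and to the left
    // behind the element that acts as a window onto it.
    return px === 0 ? "0" : "-" + px + "px";
  }
  return offset(column, cellWidth) + " " + offset(row, cellHeight);
}

// Assumed layout: 20px cells, undo icon in column 10, hover state in row 2.
var undoHover = spritePosition(10, 2, 20, 20); // "-200px -40px"
```

In a real stylesheet you would normally write these values by hand or let a sprite generator emit them; the function is only there to show where the numbers come from.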

[Image: sprite_ex]

Instead of the client having to download 7 icons that display on a page, the use of a sprite collapses 7 HTTP GET requests into a single request, albeit one that takes slightly longer due to the large image size. Take a look at the sprite as it’s used in the Stack Overflow Markdown editor.

[Image: sofia1]

With a bit of context, you can see that the 3 rows represent the regular, disabled, and hover states, from top to bottom, for each icon. They’ve reduced 39 requests to a single 4 KB request. This saves a substantial amount of time in setting up and tearing down TCP connections, and it’s fewer hits for your webserver to service, so everybody wins. There are some good sprite generators around the web, which take almost all of the hard work out of the equation for you.

2. Preload page images with JavaScript

This one is not quite as elegant, but it’s still used quite a bit, and can be a reasonable solution to a one-off problem.

When I had to build an online portfolio for my Technical Writing class at MSUM, I was required to add links to three tools that I use on a regular basis. I wanted to dazzle a bit, so I took screenshots of the three sites I chose, and used LightBox to load larger, high-quality screenshots.

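Logically, the user is going to scan the first paragraph, then move to the first section and start scanning. My guess was that at that point, they’d be bored and would click on the image to see a larger version. By then, enough time has probably elapsed to download all of the high-quality versions, so why not make the browser fetch them in the background? Easy.

// Large screenshots to fetch in the background while the reader scans the page.
var imgs = ["images/kuler_large.jpg", "images/reddit_large.jpg", "images/mdc_large.jpg", "images/wikimedia_large.jpg"];
// Use an indexed loop with a declared counter: for...in on an array also visits
// any properties added to Array.prototype, and a bare i leaks into the global scope.
for (var i = 0; i < imgs.length; i++) {
    // Setting src on a detached Image makes the browser download and cache the file.
    var newimg = new Image();
    newimg.src = imgs[i];
}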


3. Make jQuery do everything for you

jQuery is a wonderful library. It’s made even more wonderful by the vast number of extensions, one of which is the CSS Image Preloader, written by the JavaScript masters at the Filament Group.

What’s cool about this approach is that it’s a great balance between the other two: It’s simple, only requiring two JavaScript files (jQuery core and the plugin) and a couple lines of code, but unlike the manual method of typing filenames, this plugin automatically parses linked and imported stylesheets. So if you change image URLs, or add or remove images from a page, the plugin does the tough work for you. No need to update anything else. They have a great demo set up over at their website as well.

$(document).ready(function(){
  $.preloadCssImages();
});
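To give a feel for what “parses stylesheets” means, here is a rough sketch of the core idea: scan CSS text for url(...) values that look like images, then hand them to the same Image trick from approach 2. This is not the plugin’s actual implementation, and findCssImageUrls and preloadAll are hypothetical names; a real version also has to fetch each stylesheet, follow @import rules, and resolve URLs relative to the stylesheet rather than the page.

```javascript
// Hypothetical helper: return the distinct image URLs referenced in a
// string of CSS. Only looks at url(...) values ending in a common image
// extension; anything else (fonts, imported stylesheets) is ignored.
function findCssImageUrls(cssText) {
  // Matches url(x), url('x') and url("x"); group 2 is the URL itself.
  var pattern = /url\(\s*(['"]?)([^'")]+)\1\s*\)/g;
  var urls = [];
  var match;
  while ((match = pattern.exec(cssText)) !== null) {
    var url = match[2].trim();
    var looksLikeImage = /\.(png|gif|jpe?g)(\?.*)?$/i.test(url);
    if (looksLikeImage && urls.indexOf(url) === -1) {
      urls.push(url);
    }
  }
  return urls;
}

// Browser-only: ask the browser to download and cache each URL.
function preloadAll(urls) {
  for (var i = 0; i < urls.length; i++) {
    var img = new Image();
    img.src = urls[i];
  }
}
```

The appeal is the same as the plugin’s: the stylesheet stays the single source of truth, so there’s no second list of filenames to keep in sync.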

Ultimately, the ‘right’ approach depends on what your individual needs are. I can see applications and arguments for and against all three of these examples. The first is pure CSS, which is always a plus in my book, but it also requires that you decide on a good way to organize your related images into sprites. The latter two both require JavaScript, which is almost always fine these days, but some people prefer to browse with it turned off, and it’s up to us developers to make sure that their experience isn’t hindered (much) because they chose to not trust our scripts.

Simple, Repeatable, Powerful

In software, simple is better.

As developers, we spend much more time trying to understand code than we do writing it. It’s going to be a lot easier for your colleagues to understand what you meant by a method named doesUserExist() than everybodyWangChungTonight().

Tracking down obscure bugs is insanely frustrating. It’s even more frustrating when you have to struggle to grasp a shred of meaning from the code that is supposedly buggy. Imagine what your college professors must have gone through.

Dr. Brekke: I’m sorry I picked random animal names for variable names on that assignment. It was late and OperationHerdCats() seemed hilarious at the time.

There. That feels better.

Similarly, if you have to jump through an arbitrary number of hoops to perform some action at work, be it kicking off an integration build, generating a report, or making a new pot of coffee, it’s going to be hard to talk anyone into taking over or sharing in those duties.

I’ve been gradually working my way through The Pragmatic Programmer over the last few weeks. Some things are old hat (thanks Dr. Walker), while others are less obvious, but insanely practical. One of the ideas was that everything you do should be easily reproducible with simple steps. This falls right in line with #2 of The Joel Test.

The Power of Simple.

In building the template that this blog used to use, I ran into a few minor hiccups. I’m currently hosting the site on GoDaddy shared hosting, so I’m limited to using FTP for transferring files. That, by itself, is not a huge deal. I’d prefer rsync, but I can make do.

While chatting with a friend one night, I mentioned that I’d just updated a bunch of stylesheet rules. He replied that it looked the same as it did the previous week, which was odd since I had made several minor updates since then. It turned out to be a simple CSS caching problem, caused by me forgetting to update the version number following the stylesheet href in the HTML. Previously, the link looked like this in the template:

<link rel="stylesheet" type="text/css" href="<?php bloginfo('stylesheet_url'); ?>?4815162342" />

In this example, 4815162342 is just an arbitrary number appended to the stylesheet URL as a query string. The browser treats a URL with a different query string as a different resource, so bumping the number forces it to fetch a fresh copy, while leaving the number alone lets it keep using the cached stylesheet. Rails does something similar, but it appends the last-modified timestamp of the external CSS or JavaScript file in place of the arbitrary number I’ve used.
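The trick itself is small enough to sketch. The helpers below are hypothetical names, written in JavaScript purely for illustration (the template above does this in PHP and HTML): one appends a version to a URL, and the other derives a version from a file’s last-modified time, in the spirit of what Rails does.

```javascript
// Hypothetical helper: append a version stamp to a URL so that a changed
// stamp looks like a brand new resource to the browser's cache.
function withCacheBuster(url, version) {
  // If the URL already has a query string, add to it instead of starting one.
  var separator = url.indexOf("?") === -1 ? "?" : "&";
  return url + separator + version;
}

// Hypothetical helper: turn a last-modified time in milliseconds into a
// whole-second stamp. In Node the input could come from
// fs.statSync(path).mtimeMs; that wiring is an assumption and is left out.
function versionFromMtime(mtimeMs) {
  return Math.floor(mtimeMs / 1000);
}

var href = withCacheBuster("style.css", 4815162342); // "style.css?4815162342"
```

Using the modification time means the stamp changes exactly when the file does, which removes the step I forgot.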

I thought about it for a little while, and then I realized that what I really needed was a 1-click (err, command) Build Process that did the following:

  1. Create a dist directory to allow for an uncontaminated base point for a release
  2. Copy all of the image files to the dist directory (I really need to get ambitious and make some sprites)
  3. Copy all of the relevant PHP files to the dist directory
  4. Minify the external CSS and JavaScript files and put them in the new directory
  5. Update my arbitrary number in the header.php file so I don’t have to ever think about it again.
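Step 5 is the one that is easiest to forget by hand, so here is a sketch of just that step. It is written in JavaScript for illustration and is not the Ant target from the actual build; bumpStylesheetVersion is a hypothetical name, and the pattern assumes the link tag looks like the one shown earlier in header.php.

```javascript
// Hypothetical helper: given the text of header.php, replace the number
// that follows the stylesheet URL with a new version and return the result.
function bumpStylesheetVersion(headerSource, newVersion) {
  // Group 1 captures everything up to and including the "?" that starts
  // the query string; the digits after it are what get replaced.
  var pattern = /(bloginfo\('stylesheet_url'\);\s*\?>\?)\d+/;
  if (!pattern.test(headerSource)) {
    // Fail loudly rather than silently shipping a stale version number.
    throw new Error("stylesheet version marker not found");
  }
  // A replacer function avoids "$1" followed by digits being misread
  // as a reference to a different capture group.
  return headerSource.replace(pattern, function (whole, prefix) {
    return prefix + newVersion;
  });
}
```

A build would read header.php, pass its contents through this with a timestamp or counter, and write the result into the dist directory.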

Using Ant, this ended up being a simple yet valuable introduction to the Java build world. The only thing I really needed to farm out was the minification, which a trivial shell script on my desktop handled using the YUI Compressor. My build process went from more steps than I laid out above to the following:

  1. ant deploy

Even my SO could handle that. Simple. Repeatable. Powerful.
