Visualizing Data

I love data visualizations.

Raw data can tell you a lot by itself, but putting that data through a process that creates a visual context is a great way to make people fanatical about your software. For example, I’m absolutely addicted to checking my Flickr stats.

Flickr View Statistics

It’s strangely fascinating to log in day after day and see how many people took the time to flip through my photos when I don’t spend an iota of time trying to drive traffic there. While I don’t get a large amount of traffic, it’s exciting to try to draw conclusions based on the view stats.

Charts and graphs can make tedious or otherwise boring information seem downright exciting. Take Zipdecode, for example. Zip codes are not very interesting. However, slap them into a Processing app and suddenly people are interested enough to spend some time figuring out how zip codes work. In fact, I’m willing to bet that the Zipdecode example is powerful enough that you can learn more about zip codes in 30 seconds than you’ve learned in the rest of your time scrawling them on envelopes.

Twitter is another interesting point of study pertaining to what data conveys about an individual. Enter Twitter Stats: a Perl script that chews on some data from Twitter’s API and spits back some slick graphs that go beyond the traditional Twitter context of, “What are you doing?”

Tweet Density

What does this say about me? I’m more of a night owl than an early bird. I’m more active on Wednesday mornings than on any other morning, and for quite a span of time. I’m also more apt to post on Wednesday than any other day, with Saturday seeing the least activity of all (I try to stay unplugged on weekends to some extent). Overall? Yeah, it’s pretty accurate.
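For the curious, here is a rough sketch of how a density chart like this could be put together. It is not the Twitter Stats script itself (that one is Perl, and I haven't reproduced its logic); it is a minimal Python illustration that assumes you already have a list of post timestamps. The function names `tweet_density` and `render` and the sample dates are made up for the example.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def tweet_density(timestamps):
    """Count posts per (weekday, hour) bucket; Monday is weekday 0."""
    counts = Counter()
    for ts in timestamps:
        counts[(ts.weekday(), ts.hour)] += 1
    return counts


def render(counts):
    """Draw one text row per day: '#' for hours with posts, '.' for quiet hours."""
    lines = []
    for day in range(7):
        row = "".join("#" if counts[(day, hour)] else "." for hour in range(24))
        lines.append(f"{DAYS[day]} {row}")
    return "\n".join(lines)


# Hypothetical sample data, not real tweets.
sample = [
    datetime(2008, 3, 5, 9, 15),   # a Wednesday morning
    datetime(2008, 3, 5, 9, 40),
    datetime(2008, 3, 5, 23, 5),   # Wednesday, late night
    datetime(2008, 3, 8, 14, 0),   # a Saturday afternoon
]

density = tweet_density(sample)
print(render(density))
```

Swap the text rows for a real plotting library and you have the bones of a day-by-hour heat map.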

But it doesn’t end there.

I was late to jump on the Google Reader bandwagon. OK, I just started last week. They have graphs too!

Google Reader Stats

According to Google, I don’t read shit in the morning.

When I worked for the IT department at MSUM, we used Cacti to track the basic server vitals: memory consumption, processor utilization, spam statistics, that sort of thing. Of course, it’s useful to see some quick stats from SpamAssassin on the command line, but what’s truly useful is being able to see what changes over time.

Take the WebOps Visualization photo pool, started by some of the Flickr server admins. Flickr handles an insane amount of data every day. It’s wildly fascinating to see what other organizations are doing with their mundane data.