Zk | 2010-11-01 10:12:04

blog fossil diary

I just realized that I posted this to Twitter and not anywhere else; whoops!

Anyhow, I’m an avid reader of FlowingData, because Nathan Yau, the man behind it, does some pretty awesome stuff.  His visualizations are clear and still aesthetically pleasing, and his concepts are always nice.  Of particular interest to me, when I first started reading, was your.flowingdata, which is a means to track your own life through Twitter.  For example, you can tell it when and how far you ride your bike every day and have it automatically generate a visualization of distances ridden over time.

Recently, however, he posted a little challenge of sorts.  Given a dataset, we, the readers, were to visualize it our own way and draw some conclusions from our visualizations (that, after all, being the point of visualizations).  I’d never done anything like that before for various reasons.  I didn’t want to learn a new domain-specific language such as R that would then require me to edit my results in the form of an image in some other program such as Gimp or Inkscape.  Also, Gimp and Inkscape have some quirks that I’m still learning, and I didn’t want to have to choose between learning those and buying Adobe CS.  However, I have been working quite a bit with JavaScript recently, so when I found two libraries for visualization in JS - Flot and Protovis - it made sense to use one of these ‘Visualize This’ challenges to learn one of them.  It’ll definitely be helpful in the future.

The most recent challenge was visualizing data from the National Survey of Sexual Health and Behavior.  Given a small set of data - percentage of respondents in different age groups admitting to engaging in nine different behaviors over the past year - I worked hard to learn Protovis from scant documentation in order to pull together a visualization.  Since it takes place over three ‘slides’ and has text to go along with it, I’ll let it speak for itself here.

I think I did fairly well, given the fact that I wound up doing exactly what I didn’t really want to do - learn a new DSL.  Granted, this one will be useful in my web design in the future!  With the time limitation of a due date and the fact that I was learning as I went, I didn’t quite pull off exactly what I wanted, and the trends I was interested in looking at weren’t as apparent as I was hoping.  The problem was mostly due to inadequate documentation on Protovis - much of the documentation that wasn’t simply API documentation was either examples or brief write-ups about concepts in statistics as they applied to Protovis.  I learned the most from the examples, after I had picked up some of the basics from the API docs.

I’ll probably find another dataset that interests me and visualize it soon, but I also expect that I’ll be implementing the visualization process in my own projects as well.  I’ve got lots of ideas.