Parsing the Data and Ideology of the We Are 99% Tumblr


One of the most fascinating things to come out of the current We Are 99%/Occupy Wall Street protests is the We Are 99% Tumblr. On the site, people hold up signs that explain their current circumstances, and together the signs tell the story of a whole range of Americans struggling in the Lesser Depression. It is highly recommended.


The site features pictures of individuals holding their signs, with HTML text underneath each image. Sometimes that text is blank, sometimes it is a different message, but often it reproduces the text of the sign itself.

To get a slightly better empirical handle on this important Tumblr, I created a script that reads all of the pages and parses out the HTML text on the site. It doesn’t read the images (can anyone in the audience automate calls to…
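
For readers who want to try something similar, here is a minimal sketch of what such a script could look like in Python, using only the standard library. It is not the script described above. The page URL pattern in BASE_URL and the "caption" class name are assumptions about how the Tumblr is laid out, and the names CaptionParser, extract_captions, and fetch_page are made up for illustration. Check the real page source and adjust before relying on it.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

# Assumption: pages are numbered like this. The post does not give the URL.
BASE_URL = "https://wearethe99percent.tumblr.com/page/{n}"

# Void elements never get a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class CaptionParser(HTMLParser):
    """Collect the text inside every element carrying a given CSS class."""

    def __init__(self, caption_class="caption"):
        super().__init__(convert_charrefs=True)
        self.caption_class = caption_class  # hypothetical class name
        self.depth = 0       # greater than 0 while inside a caption element
        self.chunks = []     # text pieces of the caption being read
        self.captions = []   # finished captions, blank ones included

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0:
            self.depth += 1              # nested tag inside a caption
        elif self.caption_class in classes:
            self.depth = 1               # entering a new caption
            self.chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:              # caption element just closed
            text = " ".join(" ".join(self.chunks).split())
            self.captions.append(text)

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def extract_captions(html, caption_class="caption"):
    """Return the caption text of each post found in one page of HTML."""
    parser = CaptionParser(caption_class)
    parser.feed(html)
    parser.close()
    return parser.captions


def fetch_page(n, timeout=10):
    """Download page n of the site and return it as a string."""
    with urlopen(BASE_URL.format(n=n), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Offline demo on a made-up snippet: one post with text, one left blank.
    sample = (
        '<div class="post"><img src="a.jpg">'
        '<div class="caption"><p>I am 27.<br>I have $40k in debt.</p></div></div>'
        '<div class="post"><div class="caption"></div></div>'
    )
    for caption in extract_captions(sample):
        print(repr(caption))
```

Blank captions are kept as empty strings so that posts with no text under the image can still be counted. To cover the whole site, one would loop fetch_page over page numbers, pausing between requests, until a page comes back with no captions. The depth counter assumes every non-void tag inside a caption is properly closed, so sloppy markup may need a more forgiving parser.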

