Archive for the 'I Made This' Category

Pixels and Politics

Saturday, December 16th, 2006

So last month, just before the elections, I was thinking about electoral shifts. With everyone pretty much convinced that the Democrats were about to take over Congress, or at least the House, I saw a lot of people making comparisons to the 1994 Republican takeover of Congress, and I saw one person make the claim that the 1994 Republican takeover wasn’t really that big of a shift compared to previous Congressional swings earlier in American history.

This made me curious. I started to wonder how one might go about judging such a thing, and started to realize that although detailed histories of the U.S. presidency abound, there really is not very much well-organized information out there about the historical makeup of the Congress.

I decided to solve this the way I solve most problems in my life: by making animated GIFs. I downloaded the rosters of the first through 109th congresses from the Congressional Biographical Directory, and then a month later, when things had settled down a bit, added the information from Wikipedia’s tentative listing of the newly-elected 110th congress. Then I wrote some Perl to convert the Congressional rosters to graphs, with one colored pixel marking the party which held each seat in each of the 110 elected congresses. You can find the results below.
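
For readers who want to try something similar, here is a minimal Python sketch of the roster-to-pixels step. It is not the original Perl, which isn't shown here. The record layout `(congress_number, state, party)`, the palette colors, and the function names `build_grid` and `to_ppm` are all assumptions made for illustration. Each state gets as many pixel rows as its busiest congress needs, one column per congress, and the result is written as a plain-text PPM image.

```python
from collections import defaultdict

# Hypothetical palette: party name -> (R, G, B). The post's exact colors are not given.
PALETTE = {
    "Democrat": (0, 0, 255),
    "Republican": (255, 0, 0),
}
OTHER = (128, 128, 128)  # any party not listed in the palette
EMPTY = (0, 0, 0)        # no member recorded in that slot


def build_grid(records, num_congresses):
    """records: iterable of (congress_number, state, party), congresses numbered from 1.

    Returns a list of pixel rows, grouped by state (alphabetically),
    with one column per congress.
    """
    by_state = defaultdict(lambda: defaultdict(list))
    for congress, state, party in records:
        by_state[state][congress].append(party)

    rows = []
    for state in sorted(by_state):
        columns = by_state[state]
        # A state needs as many rows as its largest delegation in any congress.
        height = max(len(parties) for parties in columns.values())
        for slot in range(height):
            row = []
            for congress in range(1, num_congresses + 1):
                parties = sorted(columns.get(congress, []))
                if slot < len(parties):
                    row.append(PALETTE.get(parties[slot], OTHER))
                else:
                    row.append(EMPTY)
            rows.append(row)
    return rows


def to_ppm(rows):
    """Serialize the grid as a plain-text PPM (P3) image."""
    width = len(rows[0]) if rows else 0
    lines = ["P3", f"{width} {len(rows)}", "255"]
    for row in rows:
        lines.append(" ".join(f"{r} {g} {b}" for r, g, b in row))
    return "\n".join(lines) + "\n"
```

Because this sketch sorts parties within each state rather than tracking districts, a given seat does not stay in a fixed row from one congress to the next.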

For starters, here’s just one big graph of everything, sorted from top to bottom by state, with senate and house seats separated. As with any of the images in this post, if you want to see it closer, you can click to zoom in:

Although this graph is to some extent cryptic since it doesn’t tell us exactly why any of these pixels change colors with time, if you look closely you can actually see many of the important events of American history reflected quite visibly in this graph. For the most obvious example, the rise and fall of the Federalist and Whig parties are clearly visible in the early part of the graph as big blobs of purple and green, accompanied by a wave of gray pixels around the 1820s, just before the Whigs appeared, marking the collapse of the Democratic Republicans before the party was reborn under Andrew Jackson. A solid line of gray pixels is also visible at the beginning of the graph, marking those heady few early years of American politics before any political parties existed at all. The Civil War is clearly visible as a long vertical black streak cutting through the graph around 1860, marking the period when the southern states simply didn’t participate in the Congress. After the Civil War most of the southern states turn very briefly red, then blue again, as blacks suddenly gained the right to vote, then lost it again with the rise of Jim Crow and the Ku Klux Klan. After this point the “solid south” phenomenon becomes incredibly marked, with the northern states in the graph a patchwork of red and blue pixels, but the southern states a solid sea of blue for a hundred years as a result of post-Reconstruction animosity toward the Republicans. In the decades after the 1950s, the great northern/southern swap as the Democratic and Republican parties in many ways reversed themselves is visible as a great gradual blur of colors swapping, followed by a solid wall of change around 1994– Massachusetts and Texas are almost mirrors of one another in this period, with Massachusetts slowly turning from nearly solid red to solid blue, and Texas doing the same in reverse.

When we look at the graph by states this way, of course, shifts in Congressional control– which is more of a numbers game– are not so clear. Some of the big shifts are visible– for example stripes of red and blue are clearly visible around the beginning of the 1900s, as first the Republicans sweep congress during the Spanish-American War, then the Democrats sweep congress during the Great Depression and WWII. The 1994 Republican Revolution is visible in some states, but not others– and in those places where it does occur, it seems less like a solid switch than just an acceleration of the steady progression from blue to red in many parts of the country that followed Nixon’s “southern strategy”, and the steady emergence of the “red state” phenomenon. The 2006 elections– the last column of pixels on the right– are barely visible at all.

The shifts become a little more clearly visible if we choose not to sort by state:

In the graph on the left here, sorting still occurs by state, but rather than being separated neatly the states are all just mashed together. This graph is a little hard to make sense of. More clear is the graph on the right, where pixels are instead sorted by party. Here the shifts in congressional control are quite blatant; very brief swings in power, like the Democratic powergrabs following the Mexican-American war and the Watergate scandal, become easier to see, and it’s easier to see which numeric swings were lasting and which weren’t. The “Republican Revolution” is a lot more visible on this graph than on any other, and at the very end of the graph, someone familiar with the politics of the last decade can almost chart the rise and fall of the Republican congressional majority pixel by pixel: Republican control spikes like crazy in 1994; then drops off just a little bit as voters become disillusioned with the Republicans in the aftermath of the impeachment circus; voters then warm toward the Republicans again in one final one-pixel spike, representing the halo effect of Bush’s 2004 campaign; then suddenly the numbers swing toward the Democrats again in that last final rightmost pixel.
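
The by-party sort is easy to sketch. The Python below is an illustration under assumptions, not the code behind these graphs: `party_sorted_column` regroups one congress's members so each party forms a contiguous band, and `swing` measures how much a party's share of seats changed between two congresses. Both function names and the choice to measure swing as a fraction of all seats are hypothetical.

```python
from collections import Counter


def party_sorted_column(parties, order):
    """Regroup one congress's members so each party forms one contiguous band.

    `order` lists party names from top to bottom; parties not listed
    go last, in alphabetical order.
    """
    counts = Counter(parties)
    rank = {name: i for i, name in enumerate(order)}
    names = sorted(counts, key=lambda n: (rank.get(n, len(order)), n))
    column = []
    for name in names:
        column.extend([name] * counts[name])
    return column


def swing(prev, curr, party):
    """Change in `party`'s share of seats from one congress to the next, as a fraction."""
    return curr.count(party) / len(curr) - prev.count(party) / len(prev)
```

Comparing `swing` values across election years is one rough way to put a number on claims like "1994 was a bigger shift than 2006", though it ignores which chamber the seats are in.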

One thing that stands out to me in this particular graph is that though the swing towards the Democrats in 2006 is quite pronounced, it’s certainly not nearly as pronounced as the swing that put the Republicans in power in 1994. Although the Democrats still hold a decent majority, and it looks like they’re about on par with where they were at what looks like the beginning of the Reagan revolution, they don’t hold nearly as much power as most of the historical Democratic majorities since FDR have. Although there are other reasons besides pure numbers to think that in this particular election the voters meant to send a message– although it’s not really visible in any of the graphs above, one of the interesting facts about the 2006 elections is that no congressional seats or governorships held by the Democrats went to the Republicans in 2006, only the other way around– in terms of pure numbers the 2006 elections were not really that big of a shift, and the Democrats are only really halfway to replicating the feat that the Republicans pulled off in the 90s. If nothing else, this means that the Democrats are going to have to govern carefully to keep control of the situation with their relatively thin majority– and will have to really convince the voters they’re doing something worthwhile with that majority from day one, because it will not take much to lose it all in 2008.

These graphs aren’t perfect. The chief problem with them is that they aren’t exactly sorted by seat. The data that I’m working off of here doesn’t show who served in which district, only who served in which state. This means that if someone holds a particular congressional seat for 20 years, they’ll show up on the graph as a solid line of 10 horizontal pixels– but their replacement for that same seat won’t necessarily be in the exact same horizontal position as they were. Also, I don’t have records of who won the elections– Congress’s listings only showed who served during each two-year period, so if more than one Congressperson occupied the same seat during some period (for example, because one of them died and was replaced with the second), both show up in the graph. (This is why, although each state only has two Senators, many of the “Senate” lines in the by-state graph at the top are occasionally taller than two pixels.)

What I’d be curious about doing with these graphs in future is getting hold of some more specific data concerning who served in exactly which Congressional district, so the graphs can more accurately reflect the shifts within states– for example, so that if most of a state votes one way, but there’s one specific city or region that consistently votes another, it would be clearly visible. It also might be interesting, with that information in hand, to try to rework some of these graphs as colored maps, although I’ve never found a good way of making maps in software. Another interesting possibility might be implementing some sort of mouseover feature, so that by moving the cursor over any particular pixel you can see the name of the person that pixel represents.

The other thing that I’d like to try to fix about these graphs, though I’m less sure how, is that they’re kind of a lot to take in all at once– they’re too tall to fit on a computer monitor, and without zooming in a lot of the features are hard to make out. This is a little bit helped on the graph that I think is my favorite, since it serves very well as a kind of “summary”– the graph where House reps are ignored and only the Senate is displayed. On this graph we get a good general idea of how people are voting but the graph is still small enough to take in at a glance, so the nature of the big party shifts by region and event is most “obvious”:

If anyone has any other suggestions for ways that these graphs could possibly be improved, I’d be curious to hear them.

As one final bonus, here’s an animated graph, with columns of pixels from left to right representing states:

I have made a website

Monday, May 8th, 2006

So: I have made this website. You can find it at http://datafall.org/.

The idea of datafall.org is that if you have a website that is something like a blog– like, a copy of WordPress or Movable Type, or a LiveJournal, or a Blogspot account, or a MySpace page, or basically anything that uses RSS– you can add it to Datafall, and after that everything you post to your site will automatically appear at Datafall also. It’s kind of like Slashdot, except that instead of being a group blog for CmdrTaco and Zonk and the three other people who can post at Slashdot, it’s a group blog for the entire internet.

Or, if you don’t have a blog or know what I’m talking about: Datafall is an open site that (hopefully) collects the best bits of other sites, and puts them in one place for your reading pleasure.

Why I did this, and why you might care

Lately a lot of the good content and discussion on the internet has been posted in what are called “blogs”. This is a word that is supposed to be short for “weblogs” but basically just means a site where people frequently post things they wrote.

A problem with blogs, at least in my opinion, is that they aren’t very good at forming communities. Almost all blogs have comment sections, so there’s usually a little community there; but these communities usually aren’t very large, and they can sometimes be very insular. Also, most blogs have links to blogs they like and those blogs usually link back, so you sometimes get little rings of blogs that all tie together; but these usually aren’t communities so much as they are cliques. Sometimes you see “group blogs” where a couple different blogs band together, like the excellent Panda’s Thumb; but this is not common, and the tools for setting this sort of thing up don’t seem to be very good.

To me, a good internet community should be something where a whole bunch of people come together to some kind of common ground that no single person exactly controls, the way most web forums work and by-invite blog cliques don’t. When communities are open like this, you get a much wider and more interesting range of opinions, and people are encouraged to respond to things they don’t agree with instead of just shutting them out. Another nice thing about big “common ground” sort of sites is that finding the good stuff is easier– content comes to you, instead of you having to go to it. Good blogs, in my opinion, are after all kind of hard to find. On the other hand, look at something like Slashdot– it’s not very good, but it’s consistent, and that makes it easy. The links on Slashdot are usually just whatever the rest of the blogosphere was talking about three days ago, so you could get the same links by just reading a bunch of different blogs– but the links do get to Slashdot eventually, and personally, I’d just rather read something like Slashdot because it’s easier.

The problem is, though, that while some kind of big centralized site like a webforum may be what I’d prefer as a reader, the people who are actually writing good, interesting stuff prefer to do it in their blogs rather than something like a web forum. And this makes sense. Who wants to pour their heart and soul into writing something really good if it’s just going to get a Score:5 stamp in a Slashdot story and then disappear forever? If you save your best writing for a blog, not only do you get more attention, it’s safer– the blogger has control over their own site, so they never have to worry about somebody else screwing it all up for them. (I’ve seen at least two collaborative writing sites fall apart, partly because the people running them couldn’t consistently keep the hardware up and running.) It’s easy enough to get people to collaborate and submit good stuff when your site is nothing but links, like the front pages of Slashdot or Fark or Digg are. But what if you want actual writing– things like news analysis, or political commentary, or interesting stories? Well, that’s what blogs are for.

I wish there was some way that you could blend the best advantages of blogs with the best advantages of something like Slashdot or a big web forum.

So I decided to try to create one.

How this works

One of the common features all blogs share is what’s called an “RSS Feed”. RSS is a way of displaying posts on a website without displaying the website itself. A lot of people use these programs called “RSS Aggregators” to read blogs. RSS Aggregators (Firefox and Safari each have one built in) keep bookmarks of all your favorite sites, and when you open the aggregator it shows you all the new posts from all of your favorite sites, all mixed together in one place.

Datafall is kind of like an RSS aggregator that’s shared by the entire internet; anyone can add their site as a bookmark at Datafall by going here. (All you have to do is give Datafall a link to your site– Datafall figures out the rest from there.) Once a site is bookmarked on Datafall, Datafall will automatically notice when the site updates, and add an excerpt of the new post, with a “Read more” link that leads to the full post on the blog where it was posted.
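
As a rough illustration of the excerpting step, here is a minimal Python sketch that reads an RSS 2.0 document and produces one excerpt per post. This is not Datafall's actual code. The function name `excerpts_from_rss`, the 200-character default, and the output dictionary keys are all assumptions; only the `item`, `title`, `link`, and `description` elements come from RSS itself.

```python
import xml.etree.ElementTree as ET


def excerpts_from_rss(rss_xml, max_chars=200):
    """Parse an RSS 2.0 document and return one excerpt dict per <item>."""
    root = ET.fromstring(rss_xml)
    posts = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        body = (item.findtext("description") or "").strip()
        # Keep short posts whole; truncate long ones and mark the cut.
        if len(body) <= max_chars:
            excerpt = body
        else:
            excerpt = body[:max_chars].rstrip() + "..."
        posts.append({"title": title, "excerpt": excerpt, "read_more": link})
    return posts
```

The `read_more` value is simply the item's own link, so the full post always stays on the blog where it was written.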

There are a few different ways to find posts on Datafall. Every post on Datafall has a post “type” (is it news, an op-ed, a diary?) and a post “topic” (is it about politics, computers, culture..?). By default, everything on Datafall gets posted in “Diaries” (which is basically the “anything goes” section) and doesn’t have a topic. You can move one of your posts to a different type or topic by clicking the “Edit or Moderate” link that appears under every post.

Aside from this, there is also a “Front Page” section, which is supposed to be the best of the best from all story types and topics. Like on Kuro5hin or Digg, the users vote on which stories are the best ones and worthy of going to the front page (again, by clicking the “Edit or Moderate” link under the post).

Regardless of type or topic, you can always see the newest posts on Datafall by looking at the sidebar on the right side of every page.

The hope is that Datafall will eventually work like a big collaborative RSS filter, with a bunch of feeds coming in and the very best stuff coming out on the front page, with the will of the users deciding what goes where. (Of course, since there are no users yet, all it takes to get something to the front page right now is for a single person to click on the “nominate” button.)
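
The type, topic, and front-page rules described above can be sketched as a small data model. The Python below is illustrative only. The class name `Post`, its field names, and the `threshold` parameter are assumptions; the defaults (type “Diaries”, no topic, a single nomination being enough) follow the description in this post.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Post:
    title: str
    post_type: str = "Diaries"    # the default "anything goes" section
    topic: Optional[str] = None   # posts start with no topic
    nominated_by: Set[str] = field(default_factory=set)

    def move(self, post_type=None, topic=None):
        """Reassign the type and/or topic, as "Edit or Moderate" would."""
        if post_type is not None:
            self.post_type = post_type
        if topic is not None:
            self.topic = topic

    def nominate(self, user):
        """Record one user's front-page nomination; repeat clicks are ignored."""
        self.nominated_by.add(user)


def front_page(posts, threshold=1):
    """Posts with at least `threshold` distinct nominations, most-nominated first."""
    chosen = [p for p in posts if len(p.nominated_by) >= threshold]
    return sorted(chosen, key=lambda p: len(p.nominated_by), reverse=True)
```

With `threshold=1` this matches the current behavior, where one click is enough; raising the threshold is one possible way to require broader agreement once there are more users.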

In principle, there are several sites that work kind of like Datafall already– sites like Feedster or Blogsearch.google.com, which take in many RSS feeds and help you find things within them. However, these are not communities. They are search engines. They do not bring different blogs together any more than Google brings different forums together, and they’re all pull, no push– you can’t get anything out of Feedster unless you already know what you’re looking for.

Datafall can be different.

Site principles

Datafall is far from finished (the section after this one describes some of the things that need to be done), and along with the work that isn’t done yet, there are also going to be a number of decisions that need to be made about how the site should work and how the community should look. As [if] the site gains momentum, these are the principles I am going to try to shape everything that happens around:

  1. The site should be interesting and readable. All other goals must kneel before this one. If looking at the Datafall front page doesn’t immediately produce something worthwhile to read, then what’s the point?
  2. The site should be controlled by the users. Group moderation should be used everywhere. Datafall isn’t “my” site. If I just wanted to run a blog, I’d just do that. Actually, I’m doing it already, now. Datafall, on the other hand, should be a site that exists for, and is controlled by, the people who post there. I am only one of those. Whenever it is possible for a decision about the site– about what kinds of features get implemented, about what does and doesn’t get moderated well, about how (if at all) the site is policed– to be in some way deferred to the userbase at large, it should be.
  3. Filter, don’t exclude. Of course, there’s a big problem with the above idea: not all of the users are going to agree on everything. Different users might have different ideas about what is good content, or a good feature. Whenever possible, the users on the losing side of the decisionmaking process should be given some way to split away and continue on as they like. The entire point of Datafall is about bridging gaps and bringing different sites together, but it’s important to realize that this isn’t always possible, and you need to have a plan for what to do when it isn’t. If it’s decided that content doesn’t belong on Datafall (short of it actually being spam), it should be hidden, not deleted. If it reaches the point where a subset of the users wind up with a vision of what Datafall should be which is entirely opposed to that of the rest of the userbase, and it turns out there really is no way to reconcile this, the minority should be given some way to split off and carry on without the rest of us (see “groups” and “open source” below).
     There are two reasons for this. First off, collaborative processes can succumb to groupthink. Whether content is good or bad doesn’t have much to do with whether it is popular or unpopular– but democratic processes, like voting on which stories are the best, are better at picking out what is the popular thing to say than what is the right thing to say. This means eventually content gets excluded which does not deserve to be. The best way to avoid this is to try not to exclude content, at least not all the way. Second off, and more importantly, excluding people never works. Sad as it is to say, every site winds up accumulating people who really shouldn’t be there; but ironically enough, invariably the ones who most deserve to be thrown off the boat turn out to be the ones who are best at keeping themselves from being thrown off the boat. In a best case scenario this “certain kind of person” does this by manipulating the emotions of the people responsible for policing the site, in a worst case scenario by cheating and evading bans. The best way to deal with this, I think, is to just go ahead and give these people their soapbox, and then give everyone else the tools to avoid having to listen to it.
  4. Never stop experimenting. The Internet never stops changing; you can’t survive on the internet unless you do the same. I have seen (and used) enough small sites that failed miserably to know this. Datafall should always be a work in progress, and the site should always be incorporating new ideas, even if they’re bad ones. If they turn out to be bad ideas we can just take them out again.
  5. AJAX. This is a technical issue, but it’s an important one. AJAX is this new fancypants internet technology that lets webpages update without reloading. Like most things on the internet, AJAX has the potential to allow a lot of cool and interesting things, and also the potential to allow a lot of abuse. AJAX is used on Datafall in the following ways:
    • AJAX should always be used for controls. Everything on the site like reporting a bad post, or voting on a good one, is controlled by AJAX. You should never have to suffer a pageload just to change the state of something, and so far, on Datafall, you don’t– the only forms that trigger pageloads are when you’re logging in or signing up for an account, and I may be able to remove even those eventually.
    • AJAX should never be used to navigate. That’s what pageloads are for. The “back” button is sacred and it should always do exactly what you expect.
    • The site should always work exactly the same with Javascript turned off as it does with Javascript turned on.
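
One common way to satisfy all three AJAX rules at once is to let the same server action answer both kinds of request. The Python sketch below is an assumption-laden illustration, not Datafall's Rails code. It relies on the widespread convention that JavaScript libraries send an `X-Requested-With: XMLHttpRequest` header, and the function name `vote_response` and the `/posts/<id>` URL are hypothetical.

```python
import json


def vote_response(headers, post_id, new_score):
    """Build (status, headers, body) for a vote request, after the vote is recorded.

    `headers` is assumed to be a dict of request headers keyed exactly as shown.
    """
    is_ajax = headers.get("X-Requested-With") == "XMLHttpRequest"
    if is_ajax:
        # JavaScript on: return just the new state so the page updates in place.
        body = json.dumps({"post_id": post_id, "score": new_score})
        return 200, {"Content-Type": "application/json"}, body
    # JavaScript off: the vote was an ordinary form post, so redirect back
    # to the post's page (hypothetical URL) and let a normal pageload show it.
    return 303, {"Location": f"/posts/{post_id}"}, ""
```

Either way the vote is recorded by the same code, so the site behaves the same with JavaScript on or off; only the shape of the response differs.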

Future plans

Things about Datafall that should change in the near term:

  1. Voting is not as robust as it should be. Right now, anyone can change any article to any section, and anyone can nominate something to the front page. I have features in place that would do this better, but they are not turned on– again, because there aren’t any users on the site yet, so right now they’d just make things needlessly complicated. Eventually the site will have something like “I liked this / I didn’t like this” counters on every story. If a lot of people like a story, it will get shown on the front page. If a lot of people dislike a story, it will get cast back down into the diary section.
  2. Hilariously, although the entire site is made up of RSS feeds, Datafall itself doesn’t offer an RSS feed yet.
  3. This is an important one– pinging. Blog engines offer ways to automatically notify sites like Feedster or Datafall when they have updated. I don’t actually even know how this works exactly. I need to find out. Right now Datafall doesn’t immediately know that one of its bookmarked sites has updated– it just checks for changes periodically. This is bad.
  4. More story types– we need a “Links” section eventually, and I’m considering a “podcasts” section.
  5. Deletion. Right now, if you make a post on Datafall, you can’t remove it. Nobody can delete posts but me. This is probably bad and stuff.
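
On pinging (item 3 above): the convention popularized by Weblogs.com is, as far as I can tell, an XML-RPC call named `weblogUpdates.ping` whose parameters are the site's name and its URL. The Python sketch below shows both sides using only the standard library. The helper names `build_ping` and `handle_ping` and the `known_feeds` set are assumptions, and how Datafall would actually receive and act on pings is left open.

```python
import xmlrpc.client


def build_ping(site_name, site_url):
    """Blog side: serialize a weblogUpdates.ping call as an XML-RPC request body."""
    return xmlrpc.client.dumps((site_name, site_url), methodname="weblogUpdates.ping")


def handle_ping(request_xml, known_feeds):
    """Receiver side: decode a ping and report which site should be re-fetched.

    Returns the site URL if it is one of `known_feeds`, otherwise None.
    """
    params, method = xmlrpc.client.loads(request_xml)
    if method != "weblogUpdates.ping" or len(params) < 2:
        return None
    site_url = params[1]
    return site_url if site_url in known_feeds else None
```

A receiver built this way could fetch a feed as soon as it is pinged and keep the periodic check as a fallback for blogs that never ping.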

Things about Datafall that should change in the long term:

  1. Groups. Right now, the only ways to sort things on Datafall are the type and topic sections linked at the top of every page. There should be ways for users to create new types, new topics, or entire other ways of categorizing things. In principle, this should work like “Groups” on LiveJournal– LiveJournal lets you make specialized group blogs that act kind of like message boards, and that you post to as if you were making a post in a LiveJournal. Of course, you can only post to a LiveJournal group by making a post specifically to it on LiveJournal.com. Datafall groups, of course, should be able to take in posts from anywhere. Eventually this can hopefully even work such that it’s possible for users to create their own totally autonomous subsites with Datafall, with their own moderation rules and everything.
  2. Ripping off Feedster and Digg. Right now, posts only enter Datafall if the person who owns the RSS feed wills it. It doesn’t have to work this way. If Datafall ever gets ridiculously large, we could add a separate “best of the internet” section that works kind of like Fark. The outputs would be voted on the same way that any other Datafall post is, but the inputs would be the entire blogosphere instead of just Datafall’s diaries– for example, maybe Datafall users could nominate articles they liked but didn’t write. Now, granted, I really don’t think this is a good idea. It doesn’t fit with any of the site’s goals, and it also introduces various difficulties (both legal and technical). However, it’s something worth considering.
  3. Comment and account tracking. This, on the other hand, is something I really do want to try: Datafall bridges the gaps between sites by putting articles in a central place. However, comments on different Datafall blogs may as well be in different universes. I am curious what can be done about this. Think back to Slashdot: If you post in six Slashdot threads in one day, you can come back to Slashdot later, go to your user page, and have nice convenient links to all your posts, along with how many replies each one got. If you post in six different threads in the Blogosphere in one day, on the other hand, the only way to see what happened to them later is to go back and track down your posts in each of those six threads. There must be a better way to do this.
     Right now, a Datafall account isn’t really used for anything except creating feeds. It would be interesting to try to make it so that the posts you make in the comments section of a blog that uses Datafall are automatically recognized as being part of your Datafall account. (Right now there are a couple of “shared account” services which let you access many blogs with a single signin. But as far as I know, none of them are very open or, for that matter, open source.) In addition to, or maybe instead of, this, Datafall could track comments made on Datafall blogs (some, but not all, blog engines offer RSS syndication for comments) and provide a “comments I have made on any Datafall blog” page. I think this entire concept would be something extremely useful, maybe something even more useful than the part of Datafall I’ve implemented so far. However, it would not be trivial. Each blog would have to individually support the comments features; not only is there the problem that not everyone would want to participate in this, but also there is the problem that (by the very nature of Datafall) every blog linked from Datafall is running different software. But, of course, this leads me to:
  4. Blog plugins. Blog engines like WordPress or Movable Type all support plugins. I would like to look into making plugins for these blog engines that make posting on Datafall easier. A simple version of this plugin might do nothing more than add “type” and “topic” menus whenever you post a story, so you don’t have to go through the silly step of, every time you make a post, fishing it off Datafall and rescuing it from the Diary section. I don’t think this would be very hard (though, on the other hand, I don’t think I really want to do this unless people are actually interested).
  5. Open source. One last thing: I want to release the code that runs Datafall as a Ruby on Rails plugin. I have not actually figured out how to do this yet. Once I have this worked out, however, I intend to release Datafall’s software under the GNU LGPL.
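
To make the comment-tracking idea (item 3 above) a little more concrete, here is a minimal Python sketch of the “comments I have made on any Datafall blog” page. It assumes the hard part is already solved, namely that each blog's comment feed has been fetched and each comment matched to an account name. The input layout, the function name `comments_by_account`, and the output keys are all hypothetical.

```python
from collections import defaultdict


def comments_by_account(comment_feeds):
    """Group comments from many blogs by the account that wrote them.

    comment_feeds: {blog_url: [(author, comment_url, reply_count), ...]}
    Returns {author: [{"blog": ..., "url": ..., "replies": ...}, ...]}.
    """
    index = defaultdict(list)
    for blog_url, comments in comment_feeds.items():
        for author, comment_url, replies in comments:
            index[author].append(
                {"blog": blog_url, "url": comment_url, "replies": replies}
            )
    return dict(index)
```

Looking up one account name in the result gives the list a user page would show: every comment, where it lives, and how many replies it got.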

That’s about it. I hope you find Datafall useful or at least interesting. If you have any thoughts on this experiment, please leave them as a comment below.