The day after Barack Obama won a second term as president of the United States, the blog Jezebel published a slideshow. The gallery displayed a collection of screen-capped tweets. Among them was this:
There were, both shockingly and unsurprisingly, many more where that came from. And many of those tweets were geocoded: Embedded in them were data about where in the U.S. they were sent from.
Floating Sheep, a group of geography academics, took advantage of that fact to turn hatred -- and, just as often, stupidity -- into information. The team searched Twitter for racism-revealing terms that appeared in tweets mentioning "Obama," "re-elected," or "won." That search turned up 395 tweets, a number both shockingly high and surprisingly low. The team then sorted the tweets by the state they were sent from and compared each state's racist tweets to the total number of geocoded tweets coming from that state during the same period (November 1-7). To put states with very different tweet volumes on the same footing, the team then used a measure inspired by the location quotient -- a tool from economic geography that compares a region's share of some activity to the national share -- to compare a state's racist tweets to the national average of racist tweets.
So, per the team's model, a score of 1.0 indicates that racist tweets make up the same share of a state's geocoded tweets as they do of geocoded tweets nationally. A score above 1.0 indicates that racist tweets make up a larger share of the state's tweets than they do of the nation's.
Here's the LQ formula the team used:
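In words, a location quotient of this kind divides a state's share of the nation's hate tweets by its share of all the nation's geocoded tweets. The sketch below assumes that standard form, which fits the description above but may differ in detail from the team's exact calculation; the function name and the state counts are invented purely for illustration and are not Floating Sheep's data.

```python
def location_quotient(state_hate, state_total, national_hate, national_total):
    """Return a state's share of hate tweets divided by its share of all tweets.

    Assumed standard location-quotient form; not necessarily the team's exact formula.
    """
    if state_total <= 0 or national_hate <= 0 or national_total <= 0:
        raise ValueError("totals must be positive")
    hate_share = state_hate / national_hate      # state's slice of all hate tweets
    tweet_share = state_total / national_total   # state's slice of all geocoded tweets
    return hate_share / tweet_share


# Hypothetical counts: (hate tweets, all geocoded tweets) per state.
counts = {
    "State A": (8, 10_000),
    "State B": (2, 10_000),
    "State C": (0, 5_000),
}

national_hate = sum(hate for hate, _ in counts.values())
national_total = sum(total for _, total in counts.values())

scores = {
    name: location_quotient(hate, total, national_hate, national_total)
    for name, (hate, total) in counts.items()
}

for name, score in scores.items():
    print(f"{name}: LQ = {score:.2f}")
```

With these made-up numbers, State A sends 80 percent of the hate tweets but only 40 percent of all tweets, so it scores above 1.0; State B scores below 1.0; and State C, with no hate tweets at all, scores zero.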
Alabama and Mississippi have the highest LQ measures: They have scores of 8.1 and 7.4, respectively. And the states surrounding these two core states -- Georgia, Louisiana, and Tennessee -- also have very high LQ scores and form a fairly distinctive cluster in the southeast.
What might be most surprising, though, is the distribution of tweets beyond that cluster. North Dakota and Utah both had relatively high LQ scores (3.5 each), as did Missouri (3). And Oregon and Minnesota, though they don't score as high when it comes to LQ, have a higher number of hate tweets than their overall Twitter usage would suggest.
In the team's map, the locations of individual tweets, indicated by red dots, are overlaid on color-coded states. Yellow shading indicates states that have relatively few election-oriented hate tweets compared with their overall tweeting patterns, and the states shaded in green have relatively more -- the darker the green, the higher the LQ measure.
States shaded in gray had no geocoded hate tweets in the Floating Sheep database, which may reflect the fact that many of them (Montana, Idaho, Wyoming, and South Dakota) have a relatively low level of Twitter use overall. The analysts also point out that their sample measures tweets rather than authors -- so a single user could be responsible for several racist tweets, bumping up a state's numbers. And given the overall low incidence of racist tweets -- the good news veiled in all this -- it's worth noting that once you get past Alabama and Mississippi, the variation between the more-racist and less-racist states is relatively small.
Still, though. The analysis is a revealing exercise -- and a nice reminder that in the age of the quantified self, biases are just one more thing that can be publicized and analyzed and, finally, judged.
Here, with all that in mind, are the states at the top of the racist-tweet list: