Users Are from Mars, Commenters Are from Venus

At the beginning of this year David Sparks mapped the distribution of commenter locations in the continental U.S. and across the globe. His post got me thinking about how similar the geographic distribution of the HASTAC user base is to that of the users who commented on the website. Would the comparison tell us something we did not know?

So I geocoded the location information provided in the profile field of members and plotted the locations on a map. I only managed to geocode the locations of 720 user IDs while the database includes twice as many users that provided some sort of geographic information (more or less complete).
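For readers curious about what this step involves, here is a minimal sketch of batch-geocoding free-text profile locations. It is not the script used for this post: the function name `geocode_profiles`, the sample profiles, and the tiny `FAKE_GAZETTEER` lookup are all hypothetical, and the actual geocoding service is left as a pluggable function so the example runs offline.

```python
from typing import Callable, Dict, List, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)


def geocode_profiles(
    locations: Dict[str, str],
    geocode: Callable[[str], Optional[Coordinates]],
) -> Tuple[Dict[str, Coordinates], List[str]]:
    """Geocode each user's free-text location, looking up each distinct string once."""
    cache: Dict[str, Optional[Coordinates]] = {}
    resolved: Dict[str, Coordinates] = {}
    unresolved: List[str] = []
    for user_id, raw in locations.items():
        # Normalize case and whitespace so near-duplicate strings share one lookup.
        key = " ".join(raw.lower().split())
        if key not in cache:
            cache[key] = geocode(key) if key else None
        coords = cache[key]
        if coords is None:
            unresolved.append(user_id)  # keep track of what could not be placed
        else:
            resolved[user_id] = coords
    return resolved, unresolved


# Hypothetical stand-in for a real geocoding service (approximate coordinates).
FAKE_GAZETTEER = {
    "durham, nc": (35.99, -78.90),
    "cairo, egypt": (30.04, 31.24),
}


def fake_geocode(query: str) -> Optional[Coordinates]:
    return FAKE_GAZETTEER.get(query)


# Made-up profile entries, for illustration only.
profiles = {
    "u1": "Durham, NC",
    "u2": "  durham,  NC ",
    "u3": "somewhere nice",
    "u4": "Cairo, Egypt",
}

resolved, unresolved = geocode_profiles(profiles, fake_geocode)
print(len(resolved), "geocoded;", len(unresolved), "unresolved")
```

In practice the `geocode` argument would wrap a call to whatever geocoding service is available. Keeping the list of unresolved IDs makes it easy to report how much of the user base ended up on the map.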


Not surprisingly, there are more users on the East Coast than on the West Coast. There is a significant number of users in Western Europe, but also a relatively high number of users from African countries, with 9 users based in Egypt and 5 in South Africa. There are also 20 users from Malaysia, 19 from Lithuania, and 4 from China.

The geographic distribution in North America is similar to what David found when plotting the comments. The difference between this map and David's approach is that David used the IP addresses of commenters who posted to the website to map their locations. The plot you see above, on the other hand, is based on users' self-reported location.

The comparison between the maps based on comments and users' self-reported location shows something interesting. It shows how language impacts participation. David's maps show that comments are hugely concentrated in English-speaking countries and areas where English is spoken as a second language. For the sake of clarity, we can contrast the two maps based on different sources of geographic information.

The user base is geographically more diverse than the commenters. This is interesting, and it also highlights the role that language plays in facilitating or hindering communication in the network. In fact, the lower activity relative to the number of users observed in certain areas is one of those things that get lost in translation -- or rather in the lack of translations. We used ScapeToad to create cartograms and to compare the location of the user base and commenters (thanks to James Cheshire for the initial post).

The following cartograms show that the U.S. is the hotspot for both the user base and the commenters, but that the user base also has a substantial presence in Europe, Australia, and regions of Africa. The comparison between the two maps shows that these areas are not present in the cartogram of commenters. We exported the cartograms as SVG (Scalable Vector Graphics) images and made them available here and here.

Not only language, but also geography is a driving factor in the differences between the location of the user base and commenters. While 65% of the user base is located in the U.S., the share of commenters located there is much higher, at 91%. This is not surprising given the geographic ties between HASTAC Scholars and American universities. In fact, the percentage of users from non-English-speaking countries is higher in the user base as a whole than among commenters. The following bar chart shows that commenters are mostly concentrated in the U.S. and India, and that the user base is spread across a number of countries. (The bar chart should be read horizontally: bars of the same color sum up to 100%.)

The data actually show interesting variations across countries. The following table shows the top 10 countries by user base and commenters. The complete and cleaned version of this table is available here and here. The tables can also be accessed from our GitHub repo and visualized here and here. Take a look at this data and drop me a line if you see anything strange in the figures.

Distribution of Users per Country

Country           Commenters   % of commenters   User base   % of user base
United States          5,432             91%         1,032            65%
United Kingdom           111             1.9%          131            8.3%
Canada                   139             2.3%           89            5.6%
France                     5             0.1%           35            2.2%
Australia                 14             0.2%           25            1.6%
India                     43             0.7%           11            0.7%
Germany                   10             0.2%           18            1.1%
Netherlands               14             0.2%           14            0.9%
Austria                    2             0.0%           14            0.9%
Spain                      8             0.1%           12            0.8%
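
One way to read the table above is to divide each country's share of commenters by its share of the user base. This ratio is my own illustration rather than a measure used in the original analysis: a value above 1 means a country comments more than its share of users would suggest, and a value below 1 means it comments less. The sketch below uses the rounded percentages from the table, so the ratios are approximate (Austria's 0.0% is a rounded figure, not a true zero).

```python
# (share of commenters %, share of user base %) copied from the table above.
SHARES = {
    "United States": (91.0, 65.0),
    "United Kingdom": (1.9, 8.3),
    "Canada": (2.3, 5.6),
    "France": (0.1, 2.2),
    "Australia": (0.2, 1.6),
    "India": (0.7, 0.7),
    "Germany": (0.2, 1.1),
    "Netherlands": (0.2, 0.9),
    "Austria": (0.0, 0.9),
    "Spain": (0.1, 0.8),
}


def representation_ratios(shares):
    """Return commenter share divided by user-base share for each country."""
    return {
        country: commenters / users
        for country, (commenters, users) in shares.items()
    }


ratios = representation_ratios(SHARES)

# List countries from most over-represented to most under-represented.
for country, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{country:<15} {ratio:.2f}")
```

By this reading, only the U.S. is over-represented among commenters, India sits at parity, and every other country in the top ten comments less than its share of the user base would suggest.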


This material is based upon work supported by the National Science Foundation under Grant Number 1243622. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.


