Blog Post

Reflections on Group Project

Our project is to provide a computational analysis of ‘Black Twitter’ that exposes relationships, connections, and digital social circles within the online community. The inherent challenge is the sheer scale of the data that needs to be processed, and we believe computation is the best way around that constraint. For example, our script was able to gather over 100,000 relevant tweets within minutes. We would never be able to read, understand, connect, and organize that many tweets by hand in a reasonable period of time. Therefore, the bulk of our project is finding and building the tools to do so.
The final deliverable of our project will be either a website or a PowerPoint deck showcasing the results of the analysis. We chose these formats because the results of our work will span several media, from prose to statistical tables and data visualizations. So far our project has focused on identifying the necessary data and then using Kevin’s Twitter script to pull it from online. My role will be to help run the various computational tools that expose the underlying relationships between tweets and accounts, and to search for meaning in them. One of my areas of focus will be personally coding a few of the tools, or writing a couple of scripts in Python, to clean and format the Twitter information so that it is usable and readable. Our group has a broad range of CS and literary backgrounds that we can use in conjunction with one another. This will allow us to explore a wider range of areas and methods of analysis than any of us could alone.
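To give a sense of what the cleaning step could look like, here is a minimal sketch in Python. It is not the script our group actually uses; the function names clean_tweet and extract_hashtags are made up for illustration, and it assumes each tweet has already been reduced to a plain string of text.

```python
import re

# Patterns are illustrative assumptions, not taken from the group's script.
URL_RE = re.compile(r"https?://\S+")
WHITESPACE_RE = re.compile(r"\s+")
HASHTAG_RE = re.compile(r"#(\w+)")


def clean_tweet(text):
    """Remove URLs and collapse runs of whitespace.

    Hashtags and @mentions are kept, since they are the signifiers
    the later analysis depends on.
    """
    text = URL_RE.sub("", text)
    text = WHITESPACE_RE.sub(" ", text)
    return text.strip()


def extract_hashtags(text):
    """Return the hashtags in a tweet, lowercased, without the '#'."""
    return [tag.lower() for tag in HASHTAG_RE.findall(text)]


if __name__ == "__main__":
    raw = "Hello   world https://t.co/abc  #BlackTwitter"
    print(clean_tweet(raw))
    print(extract_hashtags(raw))
```

Lowercasing the hashtags means that variants such as #BlackTwitter and #blacktwitter are counted as the same tag, which is probably what we want for grouping but would hide any meaning carried by capitalization.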
We have a Twitter script that uses specific keywords and accounts to gather data. That data will form the foundation onto which we can apply sentiment analysis, geo visualizations, clustering around metadata to expose new subgroups, and vocabulary analysis. An inherent issue with this method is that it relies on a set of prominent, known signifiers to collect the sample from Twitter. We used hashtags, accounts, and words that are already known because that is what we could find. Social justice often means giving a voice to those who are unknown and disenfranchised, and our current methodology runs the risk of excluding them. Overcoming the difficulty of getting a representative sample is the core issue in ensuring a socially equitable form of analysis. The best way to combat this is to cast as wide a digital dragnet as possible, to ensure inclusivity, without introducing too much noise.
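One rough way to see how much the sample leans on a handful of known signifiers is to count how many collected tweets each one matches. The sketch below is only an illustration under stated assumptions: signifier_counts and share_from_top are hypothetical helpers, tweets are assumed to be plain strings, and matching is simple case-insensitive substring search, so a short signifier will also match longer words that contain it.

```python
from collections import Counter


def signifier_counts(tweets, signifiers):
    """Count how many tweets contain each signifier (case-insensitive).

    A tweet that contains several signifiers is counted once for each.
    """
    counts = Counter()
    for text in tweets:
        lowered = text.lower()
        for signifier in signifiers:
            if signifier.lower() in lowered:
                counts[signifier] += 1
    return counts


def share_from_top(counts, total_tweets):
    """Fraction of all tweets matched by the single most common signifier."""
    if total_tweets == 0 or not counts:
        return 0.0
    top_count = counts.most_common(1)[0][1]
    return top_count / total_tweets


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = ["#A hello", "#a and #b together", "nothing relevant"]
    counts = signifier_counts(sample, ["#a", "#b"])
    print(counts)
    print(share_from_top(counts, len(sample)))
```

If one hashtag accounts for most of the sample, that is a sign the dragnet is narrower than it looks and the keyword list may need widening.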



I really like your point about the difficulty of getting a representative sample being an important issue. Even if we do several interesting computational analyses, they could very easily be skewed by the keywords and hashtags we are filtering on. I think this is something we should keep in mind, and if we find that our sample is not representative enough, we should re-evaluate our keywords and recollect data if necessary.


I believe that this is going to be a solid project with a lot of potential. The analytical program that you mention is definitely something I would use, and giving users access to this kind of software would be a game changer in the realm of digital humanities. The rhetorical analysis of 'Black Twitter' is definitely something that can be unearthed through this project, and I look forward to seeing what output we get from it.