This blog post is a review of my experience using the data visualization tool Cytoscape to map out police shootings across the United States.
I found Cytoscape rather unsatisfying for analysis. It lacks the contextual data needed to draw rigorous conclusions, and it does not provide enough levels of clustering to generate much new insight. The clusters around race gave me a visual display of police shootings by race and ethnicity. Yet with that information and layout, there are very few questions I can actually answer, because the representation lacks the context that insight requires. For example, without knowing how many people belong to each ethnic group, I cannot determine the rate of police shootings within each group. At first glance, the Cytoscape visualization suggests that white people are shot far more often than Black people. That reading ignores that there are far more white people in the general population, and it does not control for any of the other characteristics needed to construct an accurate counterfactual. I would need far more data and many more variables to begin drawing conclusions about social justice issues like discrimination, improper enforcement, demography and the disproportionate impact of oppressive laws.
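To make the base-rate point concrete, here is a minimal sketch in Python of the comparison the visualization cannot show. The group names and every number in it are hypothetical placeholders that I made up for illustration; they are not taken from the practicum dataset or from any real statistics.

```python
def rate_per_million(shootings, population):
    """Return shootings per one million residents of a group."""
    return shootings * 1_000_000 / population


# Hypothetical placeholder figures, NOT real statistics.
groups = {
    "group_a": {"shootings": 500, "population": 200_000_000},
    "group_b": {"shootings": 250, "population": 40_000_000},
}

# Normalize each raw count by the size of the group it comes from.
rates = {
    name: rate_per_million(g["shootings"], g["population"])
    for name, g in groups.items()
}

for name, rate in rates.items():
    count = groups[name]["shootings"]
    print(f"{name}: {count} shootings, {rate:.2f} per million residents")
```

With these made-up inputs, group_a has twice as many shootings in absolute terms, yet group_b has the higher rate once population is taken into account. A layout that only shows raw counts would display the first fact and hide the second, and even the per-capita rate still leaves out the other variables needed for a real counterfactual.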
A Picture is Worth a Thousand Words
While I think Cytoscape is a poor tool for analysis in the context of police shootings and most social science data, I believe it can be a powerful way to convey a message and evoke a stronger emotional response than plain text or a table. Numbers are incredibly information dense. The downside is that, on the page, 100 and 0.01 look fairly similar. Rationally we understand the difference, but it registers more weakly at a subconscious level. By the same logic, a policy maker giving a presentation about inequality will evoke a much stronger response by showing a graph of wealth inequality over time than by showing a table of Gini coefficients. A correct conclusion does not matter much if it fails to make an impact. In the case of the practicum dataset, Cytoscape is great for showing how shootings cluster around specific geographic locations, ethnicities and outcomes. It is a worse tool for trying to deduce whether and where there is systemic discrimination.