Blog Post

Network Analysis with Cytoscape

The objective of this practicum was to create visualizations in Cytoscape that allow us to analyze and compare the relationships between law enforcement agencies and the race of the deceased in 2015 and 2016. Going into the assignment, I expected to see which demographic groups were most frequently targeted, and whether certain agencies were disproportionately targeting particular groups. The 2015 visualization below appears to indicate that Whites, Blacks, and Hispanics/Latinos were the most targeted groups that year. The most targeted groups seem to remain the same in the second visualization, which displays data from 2016.


It took me some time to understand how to interpret the frequency of black lines that seem to circle around particular groups, but I believe these more circular connections indicate that certain groups, particularly Blacks and Hispanics/Latinos, are disproportionately targeted by particular agencies in both years. In other words, although the absolute number of fatalities is largest for whites in both years, the density of these circular lines around Blacks and Hispanics/Latinos suggests that these groups have more connections to particular agencies than you would expect, given that each has fewer fatalities in absolute terms than whites.
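One rough way to check this reading against the data, rather than against the density of lines, would be to count how many distinct agencies connect to each group and compare that to the group's number of fatalities. The sketch below is only an illustration: the rows, the agency names, and the `summarize` helper are all made up, and it assumes the data can be read as one (agency, race) row per fatality.

```python
from collections import Counter, defaultdict

# Hypothetical edge list: one (agency, race) row per fatality.
# These rows are invented for illustration and are not the practicum data.
rows = [
    ("Agency A", "White"),
    ("Agency A", "White"),
    ("Agency A", "White"),
    ("Agency B", "White"),
    ("Agency C", "Black"),
    ("Agency D", "Black"),
    ("Agency E", "Black"),
    ("Agency F", "Hispanic/Latino"),
    ("Agency G", "Hispanic/Latino"),
]

def summarize(rows):
    """Return {race: (fatalities, distinct_agencies, agencies_per_fatality)}."""
    fatalities = Counter(race for _, race in rows)
    agencies = defaultdict(set)
    for agency, race in rows:
        agencies[race].add(agency)
    return {
        race: (count, len(agencies[race]), len(agencies[race]) / count)
        for race, count in fatalities.items()
    }

summary = summarize(rows)
for race, (deaths, n_agencies, ratio) in sorted(summary.items()):
    print(f"{race}: {deaths} fatalities, {n_agencies} agencies, "
          f"{ratio:.2f} agencies per fatality")
```

In this toy data the White group has the most fatalities but the fewest agencies per fatality, which is the kind of pattern the dense lines might be hinting at. Whether the real 2015 and 2016 data behave this way is exactly what would need checking.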

What I also found interesting in the visualization, and what I didn't expect to be able to see going into this practicum, was the relative distance between the different targets. I believe targets that are closer together are more often targeted by the same agencies, because the network connections are originating from similar sources (or agencies). So I think our Cytoscape visualizations also tell us that, to some degree, there seems to be a group of agencies that is targeting minorities more than whites: in both visualizations the white target seems to sit a bit apart from the minority targets, which seem more clustered together.
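Since I am not certain that distance in the layout really reflects shared agencies, the same idea could be tested directly by measuring how much the sets of agencies overlap between each pair of targets. Here is a minimal sketch using Jaccard overlap (shared agencies divided by all agencies linked to either group). The agency sets and the `jaccard` helper are hypothetical and only show the shape of the calculation.

```python
from itertools import combinations

# Hypothetical agency sets per target group (invented, not the practicum data).
agencies_by_group = {
    "White": {"Agency A", "Agency B", "Agency C"},
    "Black": {"Agency C", "Agency D", "Agency E"},
    "Hispanic/Latino": {"Agency D", "Agency E", "Agency F"},
}

def jaccard(a, b):
    """Share of agencies linked to either group that are linked to both."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Overlap score for every pair of groups, keyed by the (sorted) pair of names.
overlap = {
    (g1, g2): jaccard(agencies_by_group[g1], agencies_by_group[g2])
    for g1, g2 in combinations(sorted(agencies_by_group), 2)
}

for (g1, g2), score in overlap.items():
    print(f"{g1} / {g2}: {score:.2f}")
```

If targets that sit close together in the visualization also score high here, that would support reading distance as shared agencies. If they do not, the closeness may just be a side effect of the layout.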

As for the main differences I noticed across years, it seems that the concentration of the circular lines increases around Blacks and Hispanics/Latinos, and decreases around whites, in 2016 compared to 2015. In other words, the 2016 visualization seems to be indicative of racialized police brutality to a greater degree. Also, I noticed that the Asian/Pacific Islander ellipse appeared closer to the Hispanic/Latinos ellipse in 2016 than in 2015, and while I am not sure if we can confidently draw a conclusion from this, I think the idea that particular minority groups are more closely linked and targeted by particular agencies could be an interesting one to explore.

What I found a bit confusing or hard to discern in Cytoscape was whether the singular blue sources that appear more toward the edge of the visualization truly indicate that the agency was associated with only one fatality. I believe this should be the case, but when I searched "flagstaff," which appeared to be linked only to the Native American target, there was also an additional source, not connected to the first, that linked only to the white target. I wonder if this is due to the fact that there are two different police departments both named flagstaff - but then I wonder how, based on our data, Cytoscape knew that these were different agencies. If these really are the same agency, wouldn't it have been more helpful for the connections to Native American and White to have come from the same source/blue circle?
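One way to investigate this would be to go back to the agency column and list every raw spelling that matches the search. My working assumption, which I have not confirmed, is that Cytoscape draws one node per distinct name string, so two spellings of the same agency (or a stray trailing space) would show up as two separate blue circles. The names and helpers below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical agency names as they might appear in the source column.
# These spellings are invented; the real data may look different.
raw_names = [
    "Flagstaff Police Department",
    "Flagstaff Police Department ",   # same name with a trailing space
    "flagstaff police department",    # same name in lowercase
    "Example County Sheriff's Office",
]

def normalize(name):
    """Collapse whitespace and ignore case so near-identical names compare equal."""
    return " ".join(name.split()).casefold()

def find_variants(names):
    """Return {normalized_name: [raw spellings]} for names with more than one spelling."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize(name)].add(name)
    return {key: sorted(v) for key, v in groups.items() if len(v) > 1}

variants = find_variants(raw_names)

# Every distinct raw string containing the search term.
matches = sorted({n for n in raw_names if "flagstaff" in n.casefold()})

for key, spellings in variants.items():
    print(key, "->", [repr(s) for s in spellings])
```

If the real data shows several spellings for one agency, cleaning the column before importing would merge them into one source. If the matches turn out to be genuinely different agencies whose names both contain "flagstaff," then two circles are the correct picture.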

I also see a couple of parallels between how we used Cytoscape in this practicum and how my project team is using Voyant for our final project. We have been using Voyant as a tool to expose us to deeper levels of close analysis that we would not otherwise have explored. In other words, we are using Voyant as a tool that informs our close analysis, rather than a tool that can independently produce all of the analysis we need. After playing around in Cytoscape, I feel that a similar point can be made about these network analyses. The visualizations can pique our interest in certain trends, but they do not allow us to immediately draw conclusions with a high degree of certainty. Rather, they prompt questions that send us back to the data to verify whether these trends really exist.


1 comment

Hi Fiona, Yes to so many of the thoughts and questions in your post. I want to quickly try to answer your question about interpreting the distance between nodes. The short answer is I don't know if we can hypothesize about "Asian/Pacific Islander" being closer to the "Hispanic/Latinos" ellipse in 2016 than in 2015, as you suggest we look into. It's a great question! I did some research on this in the Cytoscape documentation to help answer your question, but I came up empty. Each kind of layout has its own way of treating or moving its nodes in relation to edge length. We used a "force-directed" layout, though there are others. See the documentation description here: I cannot find documentation that answers your question about the length of the edge having a readable meaning out of the box, that is, without manipulating the edge data. You can weight the length of the edge based on other node data, but I don't know if it loads pre-weighted based on any specific criteria. I do find Cytoscape documentation lacking at times!