Blog Post

Data Mining Versus Paying Attention: The Hazards of Hype

In my latest column for Foreign Policy magazine, I explore the deeper story behind the early warning signs of the current Ebola outbreak and what it can teach us about data mining, translation, and the need to listen rather than simply mine data. The simple version of the story is that, despite enormous media coverage earlier this month claiming that Harvard’s HealthMap service had provided the earliest known warning signs of the outbreak this past March by data mining millions of social media posts, the reality is far less exciting: the government of Guinea had formally announced the outbreak on national television the day before HealthMap’s earliest warnings. The problem is that Guinean media is largely broadcast and published in French and traditionally receives little attention in the media of other countries.
Instead of focusing on better ways of accessing local media throughout the world, and especially on improving the ability to translate and monitor materials in languages other than English, the majority of US Government funding to date has emphasized computer science and data mining. Yet even the most exhaustive data mining of the New York Times is unlikely to provide a view of the Ebola outbreak any more detailed than the one American health authorities already have. On the other hand, simply paying attention to local Guinean media in March 2014 would have offered early warning of the impending outbreak far more detailed than anything data mining of Western social media was able to offer.
While “big data” has enormous potential for the future, we must be careful to couple data mining with a deeper cultural understanding of what we are trying to monitor or measure, and to recognize that sometimes just listening a bit better can yield results far beyond what the most sophisticated data mining could deliver.

1 comment

This is an interesting post. It certainly makes one think about the cultural differences that surround investigation. I agree with you that this element must be included when one goes looking for information in the vastness that is the social media cavern.