Ever have one of those epiphanies where you realize that you don’t know as much about something as you think you do?
I’ve had plenty of those moments throughout my undergrad experience as a Media Studies major, and many of them have come within the last year as I’ve dug more deeply into the substance of my concentration. Prior to this semester, I’d heard the term “predictive analytics” tossed around, but I’d never considered what it actually meant in the context of big data. To be completely honest, I didn’t even know what big data was. Since covering predictive analytics in a weeklong unit in class, I’ve taken quite an interest in it and have decided to look more thoroughly into what it means to me and, more importantly, what it means for the bigger picture.
I’ve come to realize that big data and predictive analytics can be used in exciting and beneficial new ways, such as creating user-tailored content for entertainment and educational enrichment. At the same time, I’ve come to realize that they can be used in ways I’m less comfortable with, such as data aggregation and monitoring. Lastly, I’ve come to realize that they can be used in ways that I’m deeply skeptical of, if not flat-out against. For the purpose of this blog post, I’ve separated these uses of predictive analytics, and how they pertain to big data, into three categories: the good, the bad, and the ugly.
Let’s start with the good.
I’m not a complete cynic; I realize that predictive analytics has a lot of positive and productive perks. One that immediately comes to mind is the creation of media content for the people, decided by the people (at least in a way). In the words of Professor Viktor Mayer-Schönberger, from his book Big Data: A Revolution That Will Transform How We Live, Work, and Think, “In the media, the content that gets created and publicized on websites like Huffington Post, Gawker, and Forbes is regularly determined by data, not just the judgment of human editors. The data can reveal what people want to read about better than the instincts of seasoned journalists.” As a consequence of what Professor Mayer-Schönberger describes, data in the modern media landscape dictates not only how content is tailored, but what the content is in the first place. Perhaps the best example of this is the original programming Netflix has used its site metrics to produce. Shows like “House of Cards” got made because the powers that be at Netflix predicted, based on a number of statistical factors, that they would succeed. In this way, predictive analytics delivers on the promise of big data. I enjoy Netflix as much as the next guy. So, suffice it to say, the world might be deprived of some quality entertainment and some other equally cool stuff if it weren’t for predictive analytics.
Now that I’ve spent a little bit of time on the good, I’d like to switch gears and segue into the bad, which, as much of a bummer as it is, is where I’ll be spending the majority of my time. (I know what you’re thinking: it’s all downhill from here. Bear with me.)
A substantial chunk of what makes me so undecided and suspicious when it comes to predictive analytics involves the processes behind data aggregation and monitoring. Although I pay close attention to my own digital footprint and make sure that the content I post online is as private as possible, I’m still aware that I provide a wealth of data that could be of use to certain third-party interests. Much of what anybody would need to know, or like to know, about my interests could be mined from my search history. Everything from the sites I visit to the places I do my shopping provides useful insight into not only how I might be reached, but how I might be persuaded. In other words, what I’m searching for and what I’m reading about feeds targeting on a very complex level.
As J.J. Sylvia, the instructor of our big data course, writes in the book Controversies in Digital Ethics:
“Much of the data now being generated is related to individuals on a personal level. It can range from something basic, like what books he or she buys online and which websites he or she has visited, to the more advanced biometric data of the kind collected by those involved with self-tracking movement such as the quantified self, which is often collected using proprietary hardware and software that stores data on the cloud, thus clearly putting such data in the hands of businesses and their big databases.”
Aside from the general data I’ve generated in the past through the parameters, statuses, and demographics I fall under, I know that my pictures, and especially their captions, have given away quite a bit of information about me that makes a juicy target for direct advertising interests. These visualizations of myself (for all intents and purposes) seem to draw attention from ad targeting largely because of the text that accompanies them. Keywords are where I would imagine that my data are really quantified, collected, compiled, and analyzed for patterns. A real-world example of all this is my interest in politics. Because I follow political affairs closely, I stand to be targeted by political interests based on what I search for, reveal, and publish online. Political interests can and do use my data to promote their candidates, platforms, and agendas in an effort to win me over and persuade me to invest in what they’re selling.
At the end of the day, that’s what it’s all about anyway: commodities and products. In that respect, I’m not crazy about predictive analytics being used to force things on me.
Since I’m already on the bad, I might as well go ahead and jump into the ugly.
Like any practice, and maybe more so, predictive analytics in a big data world runs the risk of being disingenuous, deceptive, and downright nefarious. As Ira S. Rubinstein, a Research Fellow at the New York University School of Law, writes in his article “Big Data: The End of Privacy or a New Beginning?”: “Rather, profiling technologies now extend to every aspect and phase of individual and social life, with [Big Data] supplying the necessary horsepower to find hidden correlations and make interesting predictions, some of which may benefit individuals or society, while others may be more problematic.”
For a class assignment, we were each asked to download all the data Google had stored on us. The data that Google had compiled from my online activity was interesting, to say the least. Among the most active folders in terms of data accrual were Bookmarks, Drive, and YouTube. The extent of the compilation itself was eerie, and its implications left something of a bad taste in my mouth. At the time, my Bookmarks revealed where I was going online and where I intended to return. My Drive revealed what I was sharing. And my YouTube account, which was probably the most active of the three, showed what I was watching and what I was likely to watch. In and of themselves, these pieces of data weren’t particularly disconcerting, but the more I thought about it, the more anxious I got. If Google can amass such data, what’s standing in the way of it storing bank statements or medical records? The privacy of the entire technology-literate world could be at stake. I understand there are laws, but there are also loopholes and far more gray area than anything else. The thought of how predictive analytics could theoretically be used in ways that take the power out of my hands definitely constitutes, in my mind, the ugly of my three-headed analysis.
In summary, while I’ve learned a lot over the past few years and the past semester specifically, perhaps the most important thing I’ve realized is that there is a good side, a bad side, and an ugly side when it comes to predictive analytics and big data in general. I’ve read a little too much Orwell and Huxley, and as a result, I’ve crafted at least a few conspiracy theories on how corporations could exploit my data. However, I’ve pretty much come to accept that this is just the nature of things, and that it all comes with the territory. I realize that most of the ads, messages, and videos I see are merely the product of choices that I’ve consciously made in contributing my own data. I’ve agreed to the terms and conditions, and I’ve sat through more than a few classes that were considerably critical of media, so I essentially know what I’m getting myself into.
Ultimately, as with anything, knowledge is power. Staying well informed about predictive analytics, among other things, is the best policy for navigating a seemingly infinite big data landscape. And I feel like I’m equipped to do that.