After recording a day’s worth of all the possible contributions to big data that I’ve made, I found that a great deal of my online activity could be giving away more personal information than I would have considered before I took this class. The site I believe could be getting the least amount of data out of me is, ironically, the one I probably use the most. At the start of the day I began cataloguing my activity, approximately 9 a.m., I logged onto the social media website Tumblr. The website is designed in such a way that posts created by other users appear on my blog if I’m following those users or if someone I’m following reposts them onto their blog. If I “reblog” a piece of content because I found it interesting, that circulates the piece to the blogs that follow me. Tumblr tracks the posts that I reblog or “like”, and I can view these for myself on my dashboard. While I spend a great deal of time blogging about my various interests, I think that the sporadic nature of a Tumblr user’s posting makes it difficult to uncover any discernible patterns from their entries, at least from reblogs. Likes might paint a clearer picture, at least for me, since I tend to use them more sparingly.
The next website I used for most of the day, starting at around 10 a.m., was YouTube. I use this site while I study to listen to music that I can’t download in MP3 format. My personal tastes mean I tend not to listen to a lot of mainstream pieces, so once again there’s no really useful pattern here if someone’s looking to advertise. I also use YouTube, usually around 12:00 p.m., to watch videos while I eat lunch. These videos are typically centered on “Let’s Players”, content creators who play video games and provide commentary on them. This presents a more useful form of data, even if it might not be one hundred percent accurate. Analysts could assume from my habits that I have an interest in video games to some extent, which is a kind of product that could be sold to me. The inaccuracy comes from the fact that the videos I watch don’t necessarily reflect my personal taste in video games, since I tend to be viewing them for the personality of the commentator. The “other videos you might like” feature shows me that YouTube actually has a lot of difficulty distinguishing between my various interests, either by subject matter or content. Of course, the validity of any of this is thrown into question since I turned off YouTube’s ability to track my watch and search history before I started taking this class.
If there’s one thing that I’ve learned, however, it’s that companies can find ways to access my data regardless of how careful I am with my online browsing habits. For instance, I didn’t actually know how to turn off the YouTube app’s tracking abilities until recently. Even then, both the app and the website still seem to have some power to suggest videos for me to watch, and that can only be reset by clearing the watch history I believed was no longer being tracked. YouTube is also linked to the Google Plus account I didn’t really want, which means that Google may be getting data related to my video interests as well. I don’t tend to use the Google Plus account by itself, but I know it is connected to the Gmail account that was made for me when I started college. I can’t help but wonder if some combination of my Google search and YouTube watch history has something to do with why I’ve been getting so many spam messages on that account, while my usual email account is essentially junk-free.
I used Facebook briefly, around 1:00 p.m., to message the supervisor for an internship that I’m taking part in. While I doubt that Facebook could glean any useful information from that one message on that day, my Facebook activity has increased since it’s my main tool for the internship, and the website in general is one of the largest potential threats in terms of forming my online persona. I’ve recently had to post on my personal page a number of articles that I wrote about the stereotypical slacker college lifestyle, including a number of activities I have no interest in (such as drinking), and that could result in a serious misconstruction of my personality. I also received a message from the person I’d been speaking to on the dating app Tinder. This app uses Facebook to build a profile, meaning that the social media outlet has access to parts of my love life that I’d prefer to have kept private.
Toward the end of my day, from 3 p.m. through 6 p.m., I used the gaming platform Steam. Along with my name, phone number, and credit card information, Steam has access to my video game interests as well. I have no knowledge as to whether or not my data here could be crossed over with my data from YouTube/Google, but if all of my information is being centralized in some way then I have to assume there’s at least some overlap. I used Netflix briefly that day as well, around 8 p.m., which I imagine gathers many of the same kinds of data that Steam does, except for films/TV shows. These two platforms might be the most accurate sources of data that I output, since I tend to buy/browse similar varieties of games and also tend to re-watch shows/movies I like. This repetition would make it much easier to confirm the accuracy of my interests.
The picture that I believe these services will paint of me is of someone who has specific interests in film, TV, video games, and other facets of popular culture. I don’t believe I have enough negative/misleading information out there to form an “incorrect” picture of me, but it’s difficult to be certain. Even if I wanted to opt out of the big data world, it would be impractical for someone like me who enjoys the multitude of benefits that come from having several digital services. As we’ve discussed previously, in order to participate in this digital culture I’ve already had to give up so much, both in money for the gadgets I’d need to even write this post and in the data needed to access the services that essentially everyone has to have.
At what point will a website’s ability to read my data allow it to know what my interests and habits are before even I do, with more accuracy than current versions of that service? As I’ve pondered this question myself, I’ve come to realize that if we’re going to have to live in a digital world where our personal information is a form of currency we’ll have to pay, I’d rather the data on me be as accurate and as much to my benefit as possible. Obviously companies that aggregate data should be less secretive/suspicious about their activities, but the fact that they collect information isn’t inherently bad. The less accurate that information is, however, the more harmful it could become. If future generations are going to live in a world where their data becomes a commodity, let’s at least make sure they won’t be taken advantage of due to a clerical mistake.