How Are You Today?: Understanding the Algorithmic Self Through AI

How Are You Today? is a chatbot based on a person’s entire Facebook chat history. It can be found here: https://arcane-retreat-1724.herokuapp.com/. By looking at patterns of speech and vocabulary, HAYT? can simulate a conversation with the past: it’s an artificial intelligence that lets you talk to a younger self.

In creating a replica of the speaker based on these digital artifacts, HAYT? hopes users will examine the ways digital technologies are central to their selfhood rather than ancillary. Technology and biological bodies couple to form distributed networks of agency, intent, and causality. By recognizing themselves as inseparable from their technology through HAYT?, users can start to consider ways that human-machine hybrids are changing culture, government, and economies.

However, rather than being just an uncanny speculative design, HAYT? also works as a piece of data and algorithmic pedagogy. Resisting the technological urge toward the black box, HAYT? gives users the option to see how the algorithm works.

 

Part I: Chat

Entering the site for the first time, users are greeted with a dropdown menu to choose which past self they would like to talk to. Maybe they want to chat about high school prom (“May 2011”), or maybe give some much-needed romantic advice to their freshman self (“March 2012”).

After entering the date, the site brings the user to an interface much like GChat, Facebook Message, or AIM. From there, they’re free to chat as much as they’d like with the bot.

This project was first motivated by a personal frustration with pop culture’s understanding of algorithms and AI. As someone whose future work will certainly entail building algorithms for data analysis, I’m frustrated by the current state of the dialogue surrounding AI and technology.

This summer’s blockbusters show pop culture’s tradition of AI paranoia is alive and well. Avengers: Age of Ultron continues Marvel’s financial success to the tune of $875 million in global earnings, as moviegoers flock to see their favorite superheroes battle an evil robot army. The Terminator franchise, like its eponymous killing machine, seems indestructible, with yet another film released in July.

Of course, society should be asking questions about the power dynamics behind technology, and what possibilities are being left unconsidered in the techno-optimism of Silicon Valley’s own hype. And dismissing science fiction completely on the basis of improbable technologies is largely tossing the baby out with the bathwater.

But when killer robots are our only frame of reference, then that discourse becomes partitioned into dichotomies: Choose to either unplug from the network or submit to your inevitable robot overlords.

Thus, the project was meant to complicate users’ understanding of their relationship to technology—rather than a simple antagonism between people and machines, I wanted users to consider ways that they’re already deeply and inseparably imbricated with technology. In many ways, I want users to start seeing themselves as cyborgs.

Not literal cybernetic organisms, with bits of machine embedded in their biological bodies, but Donna Haraway’s “chimeras, theorized and fabricated hybrids of machine and organism” which render “thoroughly ambiguous the difference between natural and artificial, mind and body, self-developing and externally designed, and many other distinctions that used to apply to organisms and machines.”

But how does chatting about prom night on Facebook reveal your cyborg self? We use Facebook as a social prosthetic—it supplements our ability to keep track of relationships, communicate, and share experiences with one another. Thus, we can’t imagine a social self as preceding or existing separate from Facebook. Michael Wheeler elaborates on this in Computational Culture, through a theory of extended cognition. He rejects separating humans from their tools: Such partitions are arbitrary and unproductive. “Tools are part of us,” Wheeler writes. We should consider technology as much a part of us as our fingers, eyes, and brain.

Thus, HAYT? seeks to ask the question: What happens when these tools are so much a part of us that they can begin to act on their own? After all, HAYT? will have the mannerisms, turns of phrase, and quirks of speech that users had when they were younger. It will remember things that users themselves had forgotten, and will talk about anxieties and hopes that are long gone. In other words, HAYT? claims to be a more faithful replica of the past than the user is.

Haraway embraces such a blurring of boundaries. With cyborgs, there’s no preceding natural body/self which then gets transgressed or upgraded via technology. Instead they constitute a “disassembled and reassembled, postmodern collective and personal self.”

 

Part II: Visualization

The second part of the project was a visualization of how the algorithm worked. This was because I didn’t want to work with the aesthetics of the creepy and uncanny. Tropes of the almost-human robot abound in sci-fi, and they seem only to reinforce the same techno-phobia that I’m arguing against.

Besides, if the goal of the project was to shock users into reconsidering the source of selfhood, then the resulting chatbot fails spectacularly in that regard. Any illusion of intelligence, autonomy, or faithful replication falls away within a few seconds of use. The fragmented, nonsensical responses of the computer become cute and nonthreatening, rather than unsettling.

Instead, then, I decided to go the opposite route. Rather than abstracting away and obscuring the inner workings to maintain the illusion of humanness, I wanted to expose the machine logic in its entirety and provide transparency.

For each salient word in the user’s prompt (we consider words like “university” or “green” but not “the” or “it”), the program gathers all previous conversations that contained that word. These conversations are represented by circles, each of which is color-coded by the indexing word. Furthermore, the size of each circle is determined by how similar that conversation is to the user prompt (for example, given the prompt “What are your plans for the summer?,” the conversation with “Got plans this summer?” will be judged more similar than “I’m going to go to bed”).

Once all the candidate conversations are gathered, one is randomly chosen, weighted by the similarity. The response from the chosen conversation is then displayed as the response to the user.
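The similarity-weighted draw can be sketched in a few lines of Python. This is a minimal illustration, not the project’s actual code: the `choose_response` function and the `(response_text, similarity)` pair format are hypothetical stand-ins for whatever data structures the real implementation uses.

```python
import random

def choose_response(candidates):
    """Pick one candidate's response, weighted by its similarity score.

    `candidates` is a hypothetical list of (response_text, similarity)
    pairs; higher similarity means a higher chance of being chosen.
    """
    responses = [text for text, _ in candidates]
    weights = [score for _, score in candidates]
    # random.choices performs the similarity-weighted draw in one call
    return random.choices(responses, weights=weights, k=1)[0]
```

A conversation scored 0.8 is thus four times as likely to supply the response as one scored 0.2, but even low-scoring candidates occasionally surface, which keeps the bot from being fully deterministic.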

Afterward, however, the user is free to explore the pool of all candidate conversations. Mousing over the circles brings up the conversation and date (conversations have been anonymized).

This transparency, in many ways, turns the bot from an AI into a rote machine in the eye of the user. Gone is any potential of real intelligence. The program has no understanding of words or questions or ideas—underneath it’s all puppetry and animatronics.

But why go through all the hassle of making a chatbot, only to show my hand and ruin the magic?

When I first finished the chat algorithm, I started testing the chatbot on other computer scientists. This was a matter of convenience but proved to be serendipitous, because it eventually led me to make the visualization. For other programmers, HAYT? very quickly became a game of “breaking” the chatbot. How did it parse the user prompts? What was salient and what was different? How were responses generated?

Despite my minimalist, flat interface, other programmers didn’t have to take my product at face value. There was nothing uncanny or threatening about my chatbot—no fundamental threat to selfhood—because they had the tools to break it apart and examine the programming.

How could I give lay-users this same empowerment?

This was motivated mainly by Tara McPherson’s call to action in U.S. Operating Systems at Mid-Century. She argues that “to study image, narrative, visuality will never be enough if we do not engage as well the non-visual dimensions of code and their organization of the world.” HAYT? thus became a way for me to not only explore our digital constitution of self, but more importantly, support the array of computational and algorithmic literacies that McPherson argues are needed in the digital humanities.

By embracing a philosophy of transparency rather than obfuscation, HAYT? hopes to call for optimism in the face of increasing algorithmic ubiquity. It is insufficient to just recognize oneself as a post-human subject. A deeper understanding of the mechanisms of algorithms points to reclaiming agency in a world of Big Data and ubiquitous surveillance. Algorithms and AI aren’t magic, but processes that can be understood, demystified, and ultimately changed.


Footnotes: Technology

The code to the project can be found at my Github (https://github.com/jason-h-hu/MCMFinal). Currently, all paths are hardcoded, so this cannot be deployed elsewhere.

Backend

This project was written in Python, and deployed using Heroku. HTML parsing and organizing was done using the Beautiful Soup library. HTTP requests were routed and served using Flask, with the help of WTForms.

Frontend

The frontend was styled using Bootstrap CSS, and the data visualization was built using D3.js.

Algorithm Design

For tokenization and parsing, two copies of each conversation were maintained: one was the original conversation, split on whitespace; the second normalized every word by lowercasing it, removing punctuation, replacing URLs with a standard token, and throwing out stopwords. By keeping both lists, we could return a tokenized word to its original form when generating responses.
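A minimal sketch of that normalization step, assuming an illustrative stopword list and a hypothetical `<URL>` placeholder token (the project’s actual stopword set and URL token may differ):

```python
import re

STOPWORDS = {"the", "it", "a", "an", "is", "are", "what", "for"}  # illustrative subset
URL_RE = re.compile(r"https?://\S+")

def tokenize(message):
    """Return (original_tokens, normalized_tokens) for one message.

    The original tokens are kept verbatim so that responses can later be
    generated in their original form; the normalized list is lowercased,
    stripped of punctuation, URL-substituted, and stopword-filtered.
    """
    original = message.split()
    normalized = []
    for word in original:
        if URL_RE.match(word):
            normalized.append("<URL>")
            continue
        cleaned = re.sub(r"[^\w]", "", word.lower())
        if cleaned and cleaned not in STOPWORDS:
            normalized.append(cleaned)
    return original, normalized
```

For example, `tokenize("Check the link https://example.com NOW!")` keeps the raw tokens intact while normalizing the second list to `["check", "link", "<URL>", "now"]`.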

I used a reverse index data structure to keep track of words and conversations. For each date, we kept track of all words that occurred in a conversation. Each word then served as a key in a hashtable that mapped to a list of conversation IDs. This way, we only had to store each conversation once.
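Such a reverse index can be sketched as follows. The `{conversation_id: tokens}` input shape is an assumption for illustration; the real project keys conversations by date.

```python
from collections import defaultdict

def build_reverse_index(conversations):
    """Map each normalized word to the IDs of conversations containing it.

    `conversations` is a hypothetical {conversation_id: [tokens]} dict.
    Each conversation is stored once; the index holds only its ID, so a
    word shared by many conversations costs one list entry per match.
    """
    index = defaultdict(list)
    for conv_id, tokens in conversations.items():
        for word in set(tokens):  # set() avoids duplicate IDs per word
            index[word].append(conv_id)
    return index
```

Looking up a salient word from the user’s prompt is then a single hashtable access, which is what makes gathering all candidate conversations cheap at chat time.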

Finally, the similarity of two sentences was evaluated by taking the Jaccard Score over the respective set of tokenized words. 
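The Jaccard score is just intersection over union of the two token sets, which is a few lines in Python:

```python
def jaccard(tokens_a, tokens_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| over two token lists."""
    a, b = set(tokens_a), set(tokens_b)
    if not a | b:
        return 0.0  # two empty sentences: define similarity as 0
    return len(a & b) / len(a | b)
```

So “What are your plans for the summer?” and “Got plans this summer?”, after stopword removal, share most of their tokens and score high, while an unrelated sentence scores near zero.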

2 comments

This is really cool. One thing I found especially interesting is how you designed the visuals so that everyone can see what's going on and try to break the program if that's their inclination, even if they aren't computer scientists and don't understand how these sorts of bots tend to be designed. You're giving people the chance to think like computer scientists even if they don't have the experience that would usually let someone do so easily.

I do have one question about the exploration bit -- especially since what's being represented is, in the end, text, is this accessible for screen reader users?

Thanks for the kind words! Unfortunately I don't believe it's accessible to screen readers at the moment. As I start dusting off old projects, I'll try to keep that in mind.
