
Intertwining Race and Artificial Intelligence


I’ve been thinking a lot about Tara McPherson’s piece U.S. Operating Systems at Mid-Century: the Intertwining of Race and UNIX. As someone who’s invested in both computation and the digital humanities, I appreciated her appeal for new programmatic and algorithmic literacies. Her analysis of UNIX and race was a testament to the need for “new hybrid practices.” By making a serious attempt to dive deeply into the logics of both computation and the digital humanities—rather than making topical gestures toward interdisciplinary analysis—she was able to draw new and unexpected parallels between the organizing logics of computer systems and identity politics.

I have also been thinking a lot about artificial intelligences, both the highly speculative future and the more pedestrian considerations of building a modern AI in Python (i.e. my homework). Media theory gets to analyze the sexiest parts of AI: cyborgs, mass surveillance, and algorithmic culture. AI research, however, needs to remain firmly grounded and tethered to mathematical and programmatic formalisms. Terms that we can take for granted—state, action, reward—require careful and concrete definition. Sci-fi musings and the philosophy of AI are reserved for lunch breaks or the bar.

So, like McPherson, I want to try connecting two disparate threads from my own disciplines. What happens when we look at racism through the formal computational discipline of artificial intelligence? To do this, I’ll first review some foundational AI concepts. Then I’ll see how those principles can be used to analyze the dynamics of race and racism.


Search: My First A.I.

Before I start describing algorithms, try navigating the maze above. Rather than focusing on finding the right path, notice how you problem-solve. Are you going through the maze haphazardly? Is there a method or organization to your exploration? How would you describe your methodology to a very intelligent child who has never seen a maze before?

The first program an AI student writes is usually a search algorithm, where there’s a desired goal, but we don’t know how to reach it. Finding a solution is a matter of systematically and intelligently considering potential solutions—it’s being smart about guess-and-check.

Consider tic-tac-toe. A good strategy might be to consider all possible moves one could make. From those resulting boards, imagine what your opponent would do, and then repeat the process. With enough focus, you could determine which of these futures result in wins, losses, and ties, and thus which moves are better than others.

The state-space of tic-tac-toe

Teaching a computer to play is a matter of expressing this strategy more formally. There are states (any valid arrangement of X’s and O’s on the board), and some states are connected to others (if one can be reached from the other via a single valid move). We have a goal test, some way of determining whether the game is finished (win, loss, or tie). The search agent plays the game by starting at some state (the blank board) and systematically exploring the legal sequences of states until it has found its goal.
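
To make this concrete, here’s a minimal sketch of such a search agent in Python—breadth-first search, one of the simplest variants. It assumes states are hashable, and takes the goal test and the move generator as plain functions:

```python
from collections import deque

def breadth_first_search(start, is_goal, successors):
    """Explore states outward from the start, level by level,
    until the goal test passes; return the path that got there."""
    frontier = deque([[start]])        # paths still awaiting exploration
    visited = {start}                  # states we've already encountered
    while frontier:
        path = frontier.popleft()
        state = path[-1]
        if is_goal(state):
            return path                # the legal sequence of states to the goal
        for neighbor in successors(state):   # states one valid move away
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append(path + [neighbor])
    return None                        # no solution exists
```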

This should seem simple and straightforward. Algorithms, after all, aren’t black magic, and they certainly aren’t unknowable gods. They’re simply a clearly defined set of instructions. A recipe for lemon cake can be thought of as an algorithm, as can the steps of doing long division with pen and paper.

So if it seems like search algorithms are trivially obvious and overly formal, it’s because … well … they are. We’re rational agents who are constantly solving problems. Writing an algorithm is oftentimes a matter of introspection: articulating in general terms how we came to a solution.

But the strength of algorithms lies precisely in their formalism and abstraction. Our example was tic-tac-toe, but if you can clearly define any problem as a state-space (the states and their connections) with a goal test, then it can be solved with the search algorithm. Because the algorithm operates only on the level of states and connections, it doesn’t care about what that state-space models. Problems from fields as disparate as robot navigation, finance, marketing strategy, and networking can all be solved with the same search algorithm once they have been articulated as a formal, mathematical system.

For example, consider another canonical problem for search algorithms: navigating a maze. States are the discrete locations of the maze. States are connected if the locations they represent are adjacent to one another. The goal test is simply whether the agent has reached the end. Thus, the algorithm is systematically checking various partial routes until it reaches the goal. From PacMan to Google Maps, this algorithm lies at the heart of numerous navigation problems in computation.
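
Because the breadth_first_search sketch above only sees states and connections, pointing it at a maze is just a matter of redefining those two things. A sketch, assuming a tiny hand-coded grid where walls are simply missing cells:

```python
# Open cells of a small 3 x 3 maze; the middle column is mostly wall.
maze = {(0, 0), (0, 1), (0, 2),
                        (1, 2),
        (2, 0), (2, 1), (2, 2)}
start, goal = (0, 0), (2, 0)

def adjacent_open_cells(state):
    """States are connected if their locations are adjacent open cells."""
    r, c = state
    moves = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
    return [cell for cell in moves if cell in maze]

path = breadth_first_search(start, lambda s: s == goal, adjacent_open_cells)
print(path)
# [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)]
```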

Search algorithms can methodically consider all possible futures, but as problems become more complex the number of options explodes exponentially. Yes, it might find a solution, but without a smarter algorithm, you might be waiting for hours as the machine sorts through the deluge of scenarios and situations.

For a small problem like tic-tac-toe or navigating a 5 × 5 grid maze, the algorithm only needs to consider a relatively small number of states—around 1,000—which is reasonable for any modern processor. A medium-sized problem like chess has 10⁴⁷ states, more states than seconds have elapsed in the universe. Moving out of the sandbox of neatly defined games, real-world problems have infinitely more possibilities to consider, making even the most seemingly simple search computationally intractable.
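
The blow-up is easy to feel with a quick back-of-the-envelope calculation (35 is a commonly cited average branching factor for chess):

```python
# Number of positions a naive search would consider, by lookahead depth.
branching_factor = 35          # roughly the average legal moves in a chess position
for depth in (2, 4, 8):
    print(depth, branching_factor ** depth)
# 2 1225
# 4 1500625
# 8 2251875390625
```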

To avoid being paralyzed by indecision, AIs turn to heuristic functions. A heuristic is computational intuition, in a sense. As people, when facing a branching set of possible options, we have some notion of which choice is probably better. In chess, if a move results in the loss of our queen, then without further exploring the possible futures from that move, we could probably conclude it’s a bad move. In trying to navigate a city without a map, when faced with a fork in the road, we would probably take the road pointing toward our destination rather than away from it.

Of course, these gut feelings are fallible—sacrificing the queen might really be a dastardly gambit, and heading blindly toward our destination might result in a longer route or even a dead end. But they give us agency, summarizing the endless spirals of possible futures into a single, actionable value.
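
In the maze, for instance, one classic heuristic is Manhattan distance: how far away the exit would be if every wall vanished. Here’s a sketch building on the grid example above—greedy best-first search is one simple way to put a heuristic to work (a fuller algorithm like A* would also account for the cost already paid):

```python
import heapq

def manhattan_distance(state, goal):
    """Gut feeling: grid distance to the goal, ignoring all walls.
    It may underestimate the real path length but never overestimates it."""
    (r1, c1), (r2, c2) = state, goal
    return abs(r1 - r2) + abs(c1 - c2)

def greedy_search(start, goal, successors, heuristic):
    """Always expand the frontier state the heuristic likes most."""
    frontier = [(heuristic(start, goal), [start])]
    visited = {start}
    while frontier:
        _, path = heapq.heappop(frontier)
        state = path[-1]
        if state == goal:
            return path
        for neighbor in successors(state):
            if neighbor not in visited:
                visited.add(neighbor)
                heapq.heappush(frontier,
                               (heuristic(neighbor, goal), path + [neighbor]))
    return None
```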

For the AI programmer, after designing an expressive state-space (one that meaningfully models the real world), heuristics are the next engineering challenge. Designing a heuristic for chess denies abstraction: The engineer needs to understand rooks, feints, and endgames in order to encode them into a function.

Heuristics require creativity and ingenuity. They require a careful consideration of the problem at hand, consulting with experts about the nature of their work and domain. While heuristics can be subject to a battery of mathematical tests for efficiency and correctness, their genesis resists such mechanical or theoretical foundations.

As my AI professor said, heuristics are more art than science.


So What?

I now want to draw connections between the organizing logic of artificial intelligence and the way society operates in 21st-century America. How do the principles of AI get reborn in our understanding of race and gender?

The organizing logic of search problems boils down to two principles:

  1. Discretize the world into knowable, manageable states
  2. Engineer heuristics to allow for the clever navigation of those states

Like McPherson, I want to show how these technological principles are deeply imbricated with race—both in how it’s structured and produced and in how it’s consumed and deployed.


State/Space

Search algorithms necessarily discretize the problem—slicing and partitioning the world into knowable states that allow for systematic exploration. This works in well-defined models like chess. But in the real world, defining a state-space (or feature-space, in the case of more advanced machine learning algorithms) is riddled with assumptions and blind spots.

These encodings need not be born out of malicious intent. The creators of these algorithms aren’t Scrooge McDuck fat-cats sipping champagne in a town-car, but usually hoodie-and-jeans programmers with a fair-trade latte and an eco-friendly bike.

But when the vast majority of Silicon Valley fits this archetypal mold, it suggests that only a narrow range of experiences (primarily male, White and Asian, and middle class) is being considered and actively designed around.

It’s in these ways that “computers are themselves encoders of cultures.” Our biases are formally encoded into the epistemology of the algorithm: The questions we ask our machines are largely limited by the ways we’ve constructed them in the first place.

Consider the ways that Google and other online spaces have re-enacted racism and sexism in the allegedly “apolitical” Internet. “Beauty” according to Google is an archetypal white woman.

More optimistically, however, this is also a site for possible intervention and interdisciplinary dialogue. After all, I don’t want to argue that we shouldn’t try to define state-spaces or build algorithms, but that understanding the challenges of algorithm engineering provides a common ground for theorists and programmers.

Digital humanities should produce criticism of how companies, industries, and academic researchers intentionally or inadvertently encode biases and violence into the very assumptions that precede the design of an algorithm. Framing critiques of race, gender, or class in technical terms (e.g. “Your state-space is poorly designed because it doesn’t account for cases x, y, and z”) gives engineers actionable ways to reduce harm and helps theory “engage [with] the non-visual dimensions of code and their organization of the world.”


Racism as Heuristic

If identity politics and UNIX were born from the same cultural impulses, how might we update that understanding for the Internet Age? McPherson acknowledges in U.S. Operating Systems at Mid-Century that modern turns in computing seem to favor networks over nodes. If this is a shift in how we organize and understand race, what does that imply?

The search algorithm is in many ways an ideal technological metaphor for the current moment. If the modularity of the 1950s produced discrete identity politics, then those identity politics are today reborn as heuristics. When we face vast amounts of data to process, these pre-encoded notions of how race operates allow us to make sense of politics and society.

Consider the ways racial tropes get recycled online and in the news in the aftermath of police shootings and protests. Or consider how allegedly post-racial narratives focus on themes of meritocracy and refuse to acknowledge the systemic. In either case, we can see these as heuristics: They provide an easy way of summarizing a situation, letting us bypass further introspection and analysis. As a result, they suggest and privilege certain solutions, responses, and actions over others.

I should make a few important qualifications. As stated previously, we employ heuristics all the time to process our day-to-day data, much of it in non-digital contexts. Thus, I’m certainly not arguing that heuristics, computational or otherwise, are bad. I’m arguing that racism is an example of a bad heuristic. In AI, we’d describe it as inconsistent and inadmissible.
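
For the curious, those two terms have precise textbook definitions. Here’s a sketch of what the checks look like, with hypothetical helpers—true_cost and step_cost aren’t defined anywhere above; they stand in for the actual cheapest cost to the goal and the cost of a single move:

```python
def is_admissible(heuristic, true_cost, states, goal):
    """Admissible: the heuristic never overestimates the real remaining cost."""
    return all(heuristic(s, goal) <= true_cost(s, goal) for s in states)

def is_consistent(heuristic, step_cost, successors, states, goal):
    """Consistent: the estimate can't drop by more than a single step costs."""
    return all(heuristic(s, goal) <= step_cost(s, n) + heuristic(n, goal)
               for s in states for n in successors(s))
```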

Furthermore, I want to shy away from claims that the challenge of data overload is somehow exceptional to the Digital Age, or even to modernity. Tabloids, wild speculation, and stereotyping are ways we take cognitive shortcuts, and they’re certainly not specific to the digital.

However, acknowledging race as a heuristic in the context of digital media complicates our understanding of how knowledge gets generated and disseminated online. The question shifts from “What is being produced?” to “How are people finding it?”

Even back in 1945, Vannevar Bush pointed out in As We May Think that the challenge of the 20th Century wouldn’t be making new scientific discoveries, but sifting through and analyzing the body of work that was growing every day. As the production of data has only exploded since the post-war years, with all aspects of the Information Age seeping into the everyday, that challenge becomes more and more pronounced. Internally processing the mountain of information grows intractable: We fall back on heuristics both digital (Google) and analog (race). 
