Last month, I put out a call to a number of tech-oriented scholarly listservs asking for references on the history of synthetic speech, a topic at the core of my dissertation research. My project is a qualitative study of parents whose children with significant communication impairments (due to developmental disabilities such as autism and cerebral palsy) use an iPad equipped with a special app that helps them “say” words and sentences.
In this way, the iPad is used as an “augmentative and alternative communication” (AAC) device, a term perhaps more familiar to someone getting a Ph.D. in communication sciences than to someone in communication studies, such as myself. With my dissertation, I am focused on how this configuration of technologies fits into family life, but there are also implications for how we conceive of the relationship between ICTs, embodiment, and interpersonal communication.
As it pertains to the latter area of research, I have been struck by a recurring theme in my interviews and observations: parents referring to the speech produced by the iPad as their child’s “voice” and to the iPad itself as a “talker.” There is no neat way to distinguish between the metaphorical and literal uses of voice and talk here. I’ve been looking into the history of synthetic speech (along with the history of prosthetics) in order to understand why the voices sound the way that they do, what other technologies (e.g. earlier personal computers, non-electronic devices) have been used to produce artificial speech, and for what purposes.
In all cases, humans, with their own biases and backgrounds, make speech and speech machines. People impart some of themselves to the various tools for artificial speech production, be it a talking Siri, a talking doll, or a talking AAC app. We’re still a long way from synthetic speech sounding like Scarlett Johansson’s breathy and raspy voice in Spike Jonze’s film Her, but the path to where we are now involves a long history of ideas about what might make an “imitation” voice better than the “real” thing (and “better” in a variety of ways). With those ideas also come assumptions about race, class, gender, sexuality, and ability. The history of “talking machines,” dating back to ancient times, is enmeshed with the development of social, cultural, political, economic, and religious thought.
In case anyone else is interested, here are a few good sources I’ve gathered so far, many from the field of science and technology studies. My dissertation doesn’t have room to dive as deep into this topic as I’d like, but because surprisingly little has been written on it (and even less that incorporates a cultural/critical studies perspective), I’m very interested in pursuing the history of artificial speech in the future, after I get this dissertation stuff squared away.
If you have any other sources to recommend, please feel free to leave a comment or send me an email at firstname.lastname@example.org!
Dudley, H. W. (1939). The Vocoder. Bell Labs Record, 17, 122-126. (Plus a YouTube video of a 1939 demonstration of Bell Labs’ “Voder”: http://www.youtube.com/watch?v=0rAyrmm7vv0)
Frantz, G. (2013). The Speak N Spell. Retrieved from http://cnx.org/content/col11501/1.4/
Gitelman, L. (1999). Scripts, grooves, and writing machines: Representing technology in the Edison era. Stanford, CA: Stanford University Press.
Hankins, T. L., & Silverman, R. J. (1995). Vox Mechanica: The history of speaking machines. In Instruments of the imagination (pp. 178-220). Princeton, NJ: Princeton University Press.
Kevorkian, M. (2006). Integrated circuits. In Color monitors: The black face of technology in America (pp. 74-114). Ithaca: Cornell University Press.
Lindsay, D. (1997). Talking head. American Heritage of Invention & Technology, 13, 57-63.
Mills, M. (2012). Media and prosthesis: The Vocoder, the Artificial Larynx, and the history of signal processing. Qui Parle: Critical Humanities and Social Sciences, 21(1), 107-149.
Olive, J. P. (1997). “The Talking Computer”: Text to speech synthesis. In D. G. Stork (Ed.), HAL’s legacy: 2001’s computer as dream and reality (pp. 101-129). Cambridge, MA: MIT Press.
Ott, K. (2002). The sum of its parts: An introduction to modern histories of prosthetics. In K. Ott, D. Serlin, & S. Mihm (Eds.), Artificial parts, practical lives: Modern histories of prosthetics (pp. 1-42). New York: NYU Press.
Peters, J. D. (1999). Speaking into the air: A history of the idea of communication. Chicago: University of Chicago Press.
Ronell, A. (1989). The telephone book: Technology, schizophrenia, electric speech. Lincoln: University of Nebraska Press.
Sconce, J. (2000). Haunted media: Electronic presence from telegraphy to television. Durham, NC: Duke University Press.
Spiegel, A. (2013, March 11). New voices for the voiceless: Synthetic speech gets an upgrade. National Public Radio. Retrieved from http://www.npr.org/blogs/health/2013/03/11/173816690/new-voices-for-the-...
Sterne, J. (2003). The audible past: Cultural origins of sound reproduction. Durham, NC: Duke University Press.