February 12, 2008

EP 4.2: Eliza and the Turing Test

by Noah Wardrip-Fruin · 6:41 am

While Eliza is the first well-known digital character, its roots trace back to a highly influential proposal for computer-driven conversation (less than two decades earlier) from the father of general-purpose computing: Alan Turing, mentioned earlier in this book’s introduction. Writing for the philosophy journal Mind, Turing initially proposed to consider the question, “Can machines think?” (1950). However, finding this question hopelessly ambiguous, Turing instead replaced it with a set of questions involving an “imitation game.”

The human version of this game has three participants: “a man (A), a woman (B), and an interrogator (C) who may be of either sex.” During the course of the game the interrogator asks questions of A and B, trying to determine which of them is a woman. A and B, of course, do their best to convince C to see it their way — the woman by telling the truth, the man by “imitation” of a woman. The proposed game is played over a teletype, so that nothing physical (tone of voice, shape of handwriting) can enter into C’s attempt to discern the gender of the other players based on their performances.

Turing then asks, “What will happen when the machine takes the part of A in this game?” How will the interrogator’s results compare to when the game is played based on gender? These questions are proposed as a closely related replacement for the question “Can machines think?”
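The structure of the game can be made concrete with a small sketch. This is only an illustration under simplifying assumptions, not anything found in Turing's essay: the names `play_round` and `correct_rate` are hypothetical, the interrogator asks a fixed list of questions rather than adapting to the answers, and the text-only channel is modeled as plain strings. The two hidden players are labeled X and Y, and the interrogator must say which label hides B.

```python
import random
from typing import Callable, Dict, List, Tuple

# A respondent maps a typed question to a typed answer (the "teletype").
Respondent = Callable[[str], str]
# What the interrogator sees: for each label, the (question, answer) pairs.
Transcript = Dict[str, List[Tuple[str, str]]]
# An interrogator reads the transcript and names the label it believes is B.
Interrogator = Callable[[Transcript], str]


def play_round(questions: List[str], player_a: Respondent, player_b: Respondent,
               interrogator: Interrogator, rng: random.Random) -> bool:
    """Play one round; return True if the interrogator identifies B correctly."""
    # Hide A and B behind the labels X and Y in random order.
    roles = ["A", "B"]
    rng.shuffle(roles)
    hidden = dict(zip(["X", "Y"], roles))
    players = {"A": player_a, "B": player_b}
    # Only the typed text reaches the interrogator, nothing physical.
    transcript = {label: [(q, players[role](q)) for q in questions]
                  for label, role in hidden.items()}
    guess = interrogator(transcript)
    return hidden[guess] == "B"


def correct_rate(rounds: int, questions: List[str], player_a: Respondent,
                 player_b: Respondent, interrogator: Interrogator,
                 seed: int = 0) -> float:
    """Fraction of rounds in which the interrogator is right."""
    rng = random.Random(seed)
    wins = sum(play_round(questions, player_a, player_b, interrogator, rng)
               for _ in range(rounds))
    return wins / rounds
```

On this reading, the replacement question becomes a comparison of two numbers: `correct_rate` with a man as `player_a`, and `correct_rate` with a machine as `player_a`. Nothing in either number depends on how `player_a` produces its answers, which is the point about surface behavior taken up in the following paragraphs.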

The ideas in Turing’s essay have been widely discussed — the imitation game is now commonly called the “Turing test” — and vigorously debated.1 For my purposes, however, the key element of Turing’s game is that it is based purely on surface behavior. In part this is no doubt due to his audience — many readers of Mind would have understood little of any discussion of computational processes. But, given the vast influence of Turing’s work, it should also be considered in terms of larger attitudes about the relationship between surface appearance and internal processes that shaped the AI community.

The Turing test is the most famous example of the idea that we need not consider the internal operations of systems when evaluating them. From this point of view, whatever model drives a hypothetical system that can be said to have passed the Turing test, we should consider it to embody something close to “thinking.” Some of those who interacted with Eliza/Doctor around the time Weizenbaum constructed it appear, in the limited time available for interaction, to have thought it was genuinely thoughtful, and Weizenbaum’s famous paper on the system was specifically at pains to dispel this sort of idea. In fact, the paper could be read as a long, detailed counterexample to the argument Turing put forth, which failed to take the workings of the Eliza effect’s initial stages into account.

But Weizenbaum’s paper was not wholly an attempt to help people see through Eliza’s illusion. Among other things, it also speculated on possible future directions for the Eliza project. His projected future Eliza would “slowly build a model of the subject conversing with it” (1966, 43). This would, in turn, enable an Eliza that didn’t simply transform the previous audience utterance, but would say things guided by aspects of this model — aspects that might indicate audience rationalizations, contradictions, or other objects of interest to a more advanced Doctor script. Looking toward how such a system could be built, Weizenbaum cited a then-recent paper by Robert Abelson and J. Douglass Carroll.
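The contrast between transforming the previous utterance and speaking from a model of the subject can be sketched in a few lines. This is a toy illustration only: it is not Weizenbaum's Doctor script, nor the design he or Abelson and Carroll proposed. The names `transform_reply` and `ModelingRespondent`, the pronoun table, and the single "I am ..." pattern are all assumptions made for the example.

```python
import re
from typing import Dict

# Toy pronoun swaps for illustration; NOT Weizenbaum's actual Doctor script.
REFLECTIONS = {"i": "you", "am": "are", "my": "your", "me": "you"}


def transform_reply(utterance: str) -> str:
    """Stateless reply: depends only on the previous utterance."""
    words = utterance.lower().rstrip(".!?").split()
    swapped = " ".join(REFLECTIONS.get(w, w) for w in words)
    return f"Why do you say {swapped}?"


class ModelingRespondent:
    """Hypothetical respondent that keeps a small model of the speaker."""

    def __init__(self) -> None:
        # Maps a claimed trait to True ("I am X") or False ("I am not X").
        self.claims: Dict[str, bool] = {}

    def reply(self, utterance: str) -> str:
        match = re.match(r"i am (not )?(.+)", utterance.lower().rstrip(".!?"))
        if match:
            trait = match.group(2)
            affirmed = match.group(1) is None
            earlier = self.claims.get(trait)
            self.claims[trait] = affirmed
            # A contradiction with the stored model guides the reply.
            if earlier is not None and earlier != affirmed:
                was = "were" if earlier else "were not"
                return f"Earlier you said you {was} {trait}. What changed?"
        # Otherwise fall back to transforming the last utterance.
        return transform_reply(utterance)
```

Given "I am happy" followed later by "I am not happy," the stateless function can only reflect each sentence back, while the modeling version can remark on the contradiction, because its reply draws on something accumulated across the conversation.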

Note

1. Some have argued that Turing was providing a behaviorist definition of intelligence, while others have argued that, at most, Turing was presenting one possible criterion for thinking (and that it would be possible for things that ought to be described as thinking to not pass the test). Similarly, some have argued that Turing’s test is deeply gendered (the machine attempting to pass as a woman) while others have argued that this is a red herring (at other points the machine is described as imitating a man) and yet others have argued that the gender-driven test plays a key role: as a scoring mechanism for the human/machine test. See The Essential Turing (Turing, 2004) for versions of Turing’s most influential writings and summaries of debates surrounding them.

Moving beyond these debates, Mark Marino (2006b) has positioned the Turing test as an “Ontological Turing Machine.” The name is chosen in relation to the “Universal Turing Machine” — Turing’s famous outlining of the concept, and a possible implementation, of the notion of universal computation. For Marino, Turing’s Mind essay does similar foundational work, both conceptually and in a possible implementation, for the doubt produced by network communication (which, from MMO games to instant message chats, continually raises questions as to the actual “age / sex / location” and human/software status of others).