March 7, 2008
Picture a darkened theater. An audience watches, presumably somewhat disconcerted, as “a montage of Tibetan Buddhist imagery and Chinese soldiers holding monks at gunpoint” unfolds on screen. A computerized voice tells them that:
There were reports that Buddhist monks and nuns were tortured, maimed and executed. Unfortunately such actions can be necessary when battling the forces of religious intolerance. (Mateas, 2002, 138)
Underlying the words, one can hear a “happy, ‘optimistic’ music loop.” It is uncomfortable and jarring. And, to make matters worse, the audience feels a certain sense of culpability. Terminal Time is not just a generator of uncomfortable stories, of distorted stories — it is also a generator of stories each audience “deserves.”
The Terminal Time project is a collaboration between AI researcher/artist Michael Mateas, documentary filmmaker Steffi Domike, and media artist Paul Vanouse. Each story it generates is an ideologically-biased historical narrative of the previous millennium, and each of these stories is presented as a 20 minute multimedia projection with “the ‘look and feel’ of the traditional, authoritative PBS documentary” (Mateas, Vanouse, and Domike, 2000). The ideological bias that drives each story is shaped by audience responses — recorded by an applause meter — to public opinion polls that appear at three points during the performance. For example:
What is the most pressing issue facing the world today?
A. Men are becoming too feminine and women too masculine.
B. People are forgetting their ethnic heritage.
C. Machines are becoming smarter than people.
D. Its getting harder to earn a living and support a family.
E. People are turning away from God. (Domike, Mateas, and Vanouse, 2003)
The ideological model derived from audience responses is always an exaggerated one. It is a representation of the positions for which the audience has applauded, but taken to an untenable extreme. As it drives the selection of events to be recounted, and the “spin” with which each will be presented, it — rather than reinforcing audience belief in these ideological positions — inevitably creates an ironic distance between the words being uttered and the message being conveyed.
Yet Terminal Time stories aren’t the result of a “mathematized” account of irony. They are the result of careful authoring. This authoring effort included the creation of a computational account of ideology — but while the system “knows” about ideology, the irony layered on top of the ideology is not represented in the system. Only the authors and the audience get the joke.
And here one sees what may be the most significant move that Terminal Time makes, relative to the other story generators I have discussed. It reintroduces the author(s) and audience as essential elements of fiction, through its emphasis on the context of reception (rather than only the generated text) and through interactions with the audience that generate the ideological model used in each presentation of Terminal Time. I’ll discuss the implications of this further while describing the specifics of Terminal Time’s operations.
Each Terminal Time performance is divided into four sections. Section one is a two minute introduction that sets the audience’s expectations, combining a “Masterpiece Theater” style of delivery with the information that a historical narrative will be produced, for that audience, by Terminal Time’s mechanisms. This is followed by the first question period, in which “an initial ideological theme (from the set of gender, race, technology, class, religion) and a narrative arc (e.g. is this a progress or decline narrative) are established” (Mateas et al., 2000). The next stage is the generation and presentation of the part of the story covering the years 1000 to 1750 of the common era (C.E.), which takes six minutes. Following this, a second set of questions refines the ideological theme chosen in the first set, and may introduce a sub-theme (e.g., race with a sub-theme of class, or technology with religion). The next section of story is then generated and presented, covering roughly 1750 to 1950 C.E., and again taking six minutes. This is followed by a final set of questions which further refines the theme(s) and introduces the possibility of a reversal (e.g., a decline narrative may become a progress narrative). This is followed by the generation and presentation of the last phase of the story, covering roughly 1950 C.E. to the end of the millennium.
As each phase of storytelling takes place, the ideological models not only become further shaped by audience responses, but also more blunt in their operations. This, combined with audiences’ greater familiarity with more recent history, causes the exaggerated ideological spin of the story to become steadily more apparent over the 20 minutes of a Terminal Time performance. Towards the end of performances this culminates in story fragments such as the glowing description of the Chinese invasion of Tibet quoted above. In that particular case, the ideological goals at work were those that Terminal Time’s creators refer to as those of the “anti-religious rationalist.”
The representation of ideology in Terminal Time is based on that developed for the Politics system — which, itself, was a successor to the “ideology machine” created by Robert Abelson and his collaborators in the 1950s through 70s (and discussed in an earlier chapter). In Terminal Time ideology is represented as a set of goal trees — specifically, rhetorical goals for what the story will demonstrate through its history of the millennium. While the initial audience polling produces one of the goal trees originally crafted by Terminal Time’s authors, further questioning may add, delete, or change goals. For example, during the second round of questioning a sub-theme may be introduced via the combination of goals from one tree with another.
Below is an example of Terminal Time authoring: specifically, the anti-religious rationalist goal tree as it exists before any modifications. Notice that, because “show-thinkers-persecuted-by-religion” is a subgoal of both high-level goals, it can satisfy both of them.
(Mateas et al., 2000)
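The way one subgoal can serve two high-level goals can be sketched in a few lines of Python. This is only an illustration, not the authors' Lisp representation. The goal names are ones that appear in this discussion, but the shape of the tree, and the choice of “show-religion-is-bad” and “show-halting-rationalist-progress” as the two high-level goals, are assumptions made for the example. The `Goal` class and `parents_served` function are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Goal:
    """A rhetorical goal; its subgoals are the ways it can be satisfied."""
    name: str
    subgoals: list[Goal] = field(default_factory=list)


def parents_served(roots: list[Goal], leaf_name: str) -> list[str]:
    """Return the names of top-level goals that have leaf_name beneath them."""
    def contains(goal: Goal) -> bool:
        return any(s.name == leaf_name or contains(s) for s in goal.subgoals)
    return [root.name for root in roots if contains(root)]


# The same subgoal object is placed under both high-level goals.
persecuted = Goal("show-thinkers-persecuted-by-religion")
religion_causes_war = Goal("show-religion-causes-war")

# Assumed tree shape, for illustration only.
anti_religious_rationalist = [
    Goal("show-religion-is-bad", [religion_causes_war, persecuted]),
    Goal("show-halting-rationalist-progress", [persecuted]),
]

print(parents_served(anti_religious_rationalist,
                     "show-thinkers-persecuted-by-religion"))
```

An event spun to satisfy the shared subgoal therefore counts toward both parents, while one spun for “show-religion-causes-war” counts toward only one.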
In their paper published in 2000, Mateas, Domike, and Vanouse write that in Terminal Time “[n]ine major ideologues are represented using a total of 222 rhetorical goals.” The authors of Terminal Time represent the crafting of these ideological models as authoring, as the creation of an artwork. But this is not always the way that their work is understood in the AI community. As Mateas reports:
The first time I presented Terminal Time to a technical audience, there were several questions about whether I was modeling the way that real historians work. The implicit assumption was that the value of such a system lies in its veridical model of human behavior. In fact, the architectural structure of Terminal Time is part of the concept of the piece, not as a realist portrait of human behavior, but rather as a caricature of certain institutionalized processes of documentary film making. (Mateas, 2002, 57–58)
This reception of Terminal Time should, perhaps, come as no surprise, given my earlier discussion of anthropomorphized models within AI. And for understanding Terminal Time, and the ideas that motivate it, one does need to consider how human models of history and ideology interact with the system. However, the place to look for human models is not within the system itself. Rather, Terminal Time depends on the existence of these models in a very anthropomorphic location — within the audience.
In order for Terminal Time events to be accessible to the system, they need to be represented in a formalized manner. The Terminal Time approach to this problem involves building on top of a representation of everyday knowledge called the “Upper Cyc Ontology” (an ambitious, in-process attempt to address the fundamental issues discussed earlier in the context of Minstrel). For Terminal Time’s purposes, the approach taken by the Cyc ontology both structures how its own terms will be authored (as assertions in a knowledge base that also includes the terms from Upper Cyc) and provides the lower-level grounding on top of which its terms are defined.
Terminal Time’s historical events cover a range of levels of abstraction. They include, for example, the First Crusades, the invention of Bakelite, and the rise of Enlightenment philosophy. Here is an example of how events are authored for Terminal Time. Specifically, this is the representation of one event, the Giordano Bruno story:
;; Giordano Bruno
($isa %GiordanoBrunoStory %HistoricalEvent)
($isa %GiordanoBrunoStory %IdeaSystemCreationEvent)
($isa %GiordanoBrunoStory %Execution)
(%circa %GiordanoBrunoStory (%DateRangeFn (%CenturyFn 16) (%CenturyFn 17)))
($eventOccursAt %GiordanoBrunoStory $ContinentOfEurope)
($performedBy %GiordanoBrunoStory %GiordanoBruno)
($outputsCreated %GiordanoBrunoStory %GiordanoBrunosIdeas)
($isa %GiordanoBrunosIdeas $PropositionalInformationThing)
($isa %GiordanoBrunosIdeas $SomethingExisting)
(%conflictingMOs %GiordanoBrunosIdeas %MedievalChristianity)
($isa %GiordanoBrunosIdeas %IdeaSystem)
($performedByPart %GiordanoBrunoStory %TheRomanCatholicReligiousOrg)
($objectActedOn %GiordanoBrunoStory %GiordanoBruno)
(Mateas et al., 2000)
In the above representation, terms preceded by a “$” are defined in the Upper Cyc Ontology, while those terms preceded by “%” are defined within the Terminal Time ontology in terms of the Upper Cyc Ontology. An English-language gloss of this event representation, provided by Mateas, Domike, and Vanouse, reads:
The Giordano Bruno story, a historical event occurring in the 16th and 17th century, involved the creation of a new idea system and an execution. The idea system created in this event conflicts with the idea system of medieval Christianity. Both Giordano Bruno and a portion of the Roman Catholic Church were the performers of this event. Giordano Bruno was acted on (he was executed) in this event.
In order for a Terminal Time ideologue to make use of such an event, it must be possible to determine that the event can be “spun” to support one of the current rhetorical goals. The Terminal Time system identifies candidate events by testing them for applicability. These tests are carried out through an “inference engine” written in Lisp. Here is the test for “show-thinkers-persecuted-by-religion”:
(%and
 ($isa ?event %IdeaSystemCreationEvent)
 ($isa ?event %Execution)
 ($outputsCreated ?event ?newIdeas)
 (%conflictingMOs ?newIdeas ?relBeliefSystem)
 ($isa ?relBeliefSystem $Religion))
(Mateas et al., 2000)
As Mateas, Vanouse, and Domike point out, this is not any sort of general test for finding all instances of thinkers being persecuted by religion. For example, it assumes executions are the only type of persecution. Similarly, Terminal Time’s representation of the Giordano Bruno story is not the only possible one. The Terminal Time authors point out that, in other circumstances, it might be desirable to represent Bruno’s writings and his execution as separate events, rather than one compound event. But, again, Terminal Time is not trying to create a realistic simulation of the behavior of historians, or create a system that “really understands” history, or be itself a “creative” system. Instead, Terminal Time is an authored artwork.
If the authors desired — at some point — for the system to be able to identify examples of religious groups persecuting thinkers that do not involve executions, in order to employ these events in its stories, then the test could be broadened to match the new class of events. As of their 2000 paper, the authors report that the system includes “134 historical events and 1568 knowledge base assertions” (beyond those assertions in the Upper Cyc Ontology). Given that all the possible examples of events involving religious persecution of thinkers (among that 134) also include executions, a broader test is not needed. But anyone involved in authoring historical event data for Terminal Time must do so with an awareness of the tests that will evaluate them later. In fact, it would make no sense to author historical events except in relation to the tests currently in the Terminal Time system, as events matching no tests would never be employed in stories. As a result, the authoring of events is tightly coupled to the authoring of tests.
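The coupling between event assertions and tests can be made concrete with a small sketch. The Python below is not the authors' Lisp inference engine. It is a naive conjunctive pattern matcher over a handful of the Giordano Bruno assertions quoted earlier, with `unify` and `solve` as hypothetical helper names. It also assumes one background assertion that the quoted event does not show: that %MedievalChristianity is a $Religion.

```python
# Assertions as tuples: (predicate, arg1, arg2).
FACTS = {
    ("$isa", "%GiordanoBrunoStory", "%IdeaSystemCreationEvent"),
    ("$isa", "%GiordanoBrunoStory", "%Execution"),
    ("$outputsCreated", "%GiordanoBrunoStory", "%GiordanoBrunosIdeas"),
    ("%conflictingMOs", "%GiordanoBrunosIdeas", "%MedievalChristianity"),
    # Assumed background assertion, not part of the quoted event:
    ("$isa", "%MedievalChristianity", "$Religion"),
}

# The test for show-thinkers-persecuted-by-religion, as a list of patterns.
TEST = [
    ("$isa", "?event", "%IdeaSystemCreationEvent"),
    ("$isa", "?event", "%Execution"),
    ("$outputsCreated", "?event", "?newIdeas"),
    ("%conflictingMOs", "?newIdeas", "?relBeliefSystem"),
    ("$isa", "?relBeliefSystem", "$Religion"),
]


def unify(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact, or return None."""
    if len(pattern) != len(fact):
        return None
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            # A variable must keep the value it was first bound to.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new


def solve(patterns, facts, bindings=None):
    """Yield every set of bindings that satisfies all patterns together."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        yield bindings
        return
    first, rest = patterns[0], patterns[1:]
    for fact in facts:
        extended = unify(first, fact, bindings)
        if extended is not None:
            yield from solve(rest, facts, extended)


matches = list(solve(TEST, FACTS))
for m in matches:
    print(m["?event"], "persecuted by", m["?relBeliefSystem"])
```

Removing the `%Execution` assertion from `FACTS` leaves the test with no matches, which is the point made above: an event authored without regard to the existing tests would never be selected.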
Assembling the storyboard
Events that make good candidates for the story are placed on the system’s “storyboard.” However, before being placed on the storyboard, events are “spun” by means of rhetorical plans. These select a subset of information available that relates to the event and lay out an order for its description. So, for example, the rhetorical plan for the goal “show-religion-causes-war” (which can satisfy “show-religion-is-bad”) is:
Describe the individual who called for the war, mentioning their religious belief
Describe the religious goal of the war
Describe some event happening during the war
Describe the outcome
(Mateas et al., 2000)
A “spin” contains all the elements specified by a rhetorical plan, as well as information about the rhetorical goal being satisfied (and all its parent goals). This information about rhetorical goals is necessary because the selection of events for each section of the story is performed via constraints, some of which handle events in terms of the rhetorical goals they serve. A number of these constraints come from the current Terminal Time ideologue. For example, here are the storyboard constraints for the anti-religious rationalist during the first six minute section:
(%rhet-goal :show-religion-is-bad)
(%and (%rhet-goal :show-halting-rationalist-progress)
      (%rhet-goal :show-religion-is-bad))
This determines that there will be six events in this section’s representation on the storyboard, which serve the specified rhetorical goals. In a sense, there are six event “slots.” There is not yet any order to these slots, however. Order is created by using an ideologue’s “rhetorical devices.” These devices create the connections between events — and associated with each device is a set of constraints on the events that can appear before and after it. For example, here is a rhetorical device from the “pro-religious supporter” ideologue (a counterpart to the “anti-religious rationalist”):
(def-rhetdev :name :non-western-religious-faith
  :prescope-length 2
  :prescope-test (:all-events-satisfy (%and
    ($isa ?event %HistoricalSituation)
    (:kb ($eventOccursAt ?event %FirstWorld))
    (%rhet-goal :show-religion-is-good)))
  :postscope-test (:some-event-satisfies ?spin (%and
    ($isa ?event %HistoricalSituation)
    (:kb ($eventOccursAt ?event %NonFirstWorld))
    (%rhet-goal :show-religion-is-good)))
  :nlg-rule :generate
  :nlg-context-path (:non-western-religious-faith))
(Mateas et al., 2000)
The “prescope” test specified for this device requires that both of the previous two event spins occur in the First World and satisfy the rhetorical goal of showing that religion is good. The “postscope” test requires that the immediately following event also satisfy the rhetorical goal of showing that religion is good — but take place somewhere other than the First World. When this rhetorical device is used in story generation it calls an NLG rule to create the connection between events. In this case the rule is quite simple, resulting in the pre-written sentence “The call of faith was answered just as ardently in non-western societies.”
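The prescope and postscope checks just described can be sketched as a single predicate over the spins on either side of a gap. This Python version is a simplification under stated assumptions: it follows the prose reading in which the postscope test applies to the immediately following spin, it reduces location to a plain string, and it omits the %HistoricalSituation check. `Spin`, `device_fits`, and the event names are hypothetical.

```python
from dataclasses import dataclass

GOAL = "show-religion-is-good"
PRESCOPE_LENGTH = 2


@dataclass(frozen=True)
class Spin:
    event: str      # hypothetical placeholder name
    region: str     # simplified to "FirstWorld" or "NonFirstWorld"
    rhet_goal: str


def device_fits(before, after):
    """Return True if the device may sit between `before` and `after`."""
    # Prescope: the last two spins must both be First World, pro-religion.
    pre = before[-PRESCOPE_LENGTH:]
    pre_ok = len(pre) == PRESCOPE_LENGTH and all(
        s.region == "FirstWorld" and s.rhet_goal == GOAL for s in pre
    )
    # Postscope: the next spin must be pro-religion but not First World.
    post_ok = (
        len(after) > 0
        and after[0].region == "NonFirstWorld"
        and after[0].rhet_goal == GOAL
    )
    return pre_ok and post_ok


a = Spin("event-a", "FirstWorld", GOAL)
b = Spin("event-b", "FirstWorld", GOAL)
c = Spin("event-c", "NonFirstWorld", GOAL)

print(device_fits([a, b], [c]))  # both tests pass
print(device_fits([a], [c]))     # too few preceding spins
```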
To summarize, Terminal Time assembles the storyboard for each section of its story as follows:
1. First, it finds the events that can be spun to support the current ideologue’s rhetorical goals, and makes them into “spins.”
2. Next, spins are added to the storyboard (as an unordered collection, only some of which will be used). Constraints on the storyboard (such as those from the current ideologue) determine how many events, and serving what rhetorical goals, will actually be used in each section of the generated story.
3. Finally, Terminal Time identifies a set of rhetorical devices that can connect the right number and type of events (to meet the storyboard constraints), searching among the events currently available on the board (and needing to meet the internal constraints imposed by each device’s prescope and postscope tests).
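The second step above, filling a fixed number of slots with spins that serve the required rhetorical goals, can be illustrated with a small backtracking search. The sources quoted here do not describe the search procedure Terminal Time actually uses, so backtracking is an assumption chosen for clarity. `fill_slots` and the spin names are hypothetical, and only two slots are shown rather than a full section's worth.

```python
def fill_slots(slots, spins, used=frozenset()):
    """Assign a distinct spin to every slot so each slot's goals are covered.

    slots: list of sets of required rhetorical goals, one set per slot.
    spins: dict mapping spin name -> set of goals it serves
           (its own goal plus all parent goals).
    Returns a list of spin names, one per slot, or None if impossible.
    """
    if not slots:
        return []
    required, rest = slots[0], slots[1:]
    for name, goals in spins.items():
        if name not in used and required <= goals:
            tail = fill_slots(rest, spins, used | {name})
            if tail is not None:
                return [name] + tail
    return None  # no unused spin fits this slot; the caller backtracks


slots = [
    {"show-religion-is-bad"},
    {"show-halting-rationalist-progress", "show-religion-is-bad"},
]

spins = {
    # Serves both high-level goals, so it fits either slot.
    "spin-b": {"show-thinkers-persecuted-by-religion",
               "show-religion-is-bad",
               "show-halting-rationalist-progress"},
    # Serves only show-religion-is-bad.
    "spin-a": {"show-religion-causes-war", "show-religion-is-bad"},
}

print(fill_slots(slots, spins))
```

Here the search first tries "spin-b" in the first slot, finds nothing left that can fill the second, and backs up to use "spin-a" first instead.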
Presenting the story
Once the storyboard for a portion of the story is assembled, the collection of spins and rhetorical devices is sent to the NLG system. This system follows a set of rules for generating both the English text of the story and a set of keywords. (These will be discussed further in the next section.) A text-to-speech program is used to generate a narrative voiceover for the story, lending it an unmistakably “computerized” tone. Keywords are used to select sequences of digitized video that will be played during the narration, and these are accompanied by music. As of 2000, the authors had created 281 rhetorical devices, 578 NLG rules, and a video database of 352 annotated 30 second clips.
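How keywords might pick out video can be sketched as a simple overlap score between the generated keywords and each clip's annotations. The actual selection method is not described here, so this scoring rule is an assumption, and `pick_clip`, the clip identifiers, and the annotations are all hypothetical.

```python
def pick_clip(keywords, clips):
    """Return the id of the clip sharing the most keywords, or None."""
    wanted = set(keywords)
    best_id, best_score = None, 0
    for clip_id, annotations in clips.items():
        score = len(wanted & annotations)
        if score > best_score:
            best_id, best_score = clip_id, score
    return best_id


# Hypothetical annotated clips; the real database held 352 of them.
clips = {
    "clip-001": {"monks", "temple"},
    "clip-002": {"soldiers", "monks", "tibet"},
    "clip-003": {"factory"},
}

print(pick_clip(["tibet", "monks", "soldiers"], clips))
```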
Terminal Time and audiences
Terminal Time is always presented in a theatre, before a live audience. Usually, it is presented twice for the same group, with each performance followed by a brief discussion with one or more of Terminal Time’s authors — resulting in an overall audience experience of roughly one hour. At each performance, Terminal Time generates two quite different narratives of the same millennium. In doing so, it makes clear that it is (in the terminology of Mateas’s “Expressive AI”) both a “message from” and “messenger for” the author. It not only presents the possible world in which there is a machine that creates historical documentaries (which could be accomplished by a traditional fiction) and presents two different narratives created by this machine (possible worlds within worlds are certainly a feature of traditional fiction) but makes it clear that this machine actually exists, and operates, and could produce a larger number of fictions than that audience could possibly sit through. This maneuver, this establishment of the fact that Terminal Time is not only a message but an operating messenger, could be compared to the difference between writing the Borges story “The Garden of Forking Paths” and actually constructing the labyrinth novel described within it. Conceptually they are both very much the same and widely distinct. The method of Terminal Time’s presentation brings home this distinction.
Another impact of dual presentations of Terminal Time is that it allows the audience to change their relationship with its interface. Jay David Bolter and Diane Gromala, in their book Windows and Mirrors, point out that, even in one viewing, Terminal Time provides its audience with a dual experience:
As spectators, we experience a more or less transparent movie. . . . As participants in the voting process, however, we are very conscious of the interface, and we are meant to reflect on our participation in the vote — in particular, on the notion that our ideology is being tested. The experience is reflective. (2003, 134–135)
Presenting Terminal Time twice creates a dual reflection. During the first showing audience members can reflect on the voting process, on the resulting story, and on Terminal Time’s simultaneous performance and parody of the notion of highly customized media. But only after the first showing is complete does it become possible for the audience to fully reflect on their own complicity in the very structure of market-research-style polling that provides the only means of interaction with Terminal Time — and decide to stop “playing along with” and instead start “playing against” this expectation. Both the reports of the Terminal Time authors and my own experiences as a Terminal Time audience member (at the 1999 Narrative Intelligence symposium, at SIGGRAPH 2000, and at UC Irvine in 2007) point to the importance of this shift between showings. As the Terminal Time authors write in the book Narrative Intelligence:
Typically, during the first performance, audiences respond to the questions truthfully, that is, actively trying to reflect their true beliefs in their answers to the questions. During the second performance they tend to respond playfully to the questions, essentially trying on different belief systems to see how this will effect the resulting history. (Domike et al., 2003)
Of course, the dual showings also serve to allow the audience to begin to form a mental image of how the Terminal Time model of ideology drives the documentary-creation process. From there it becomes possible for the audience to reflect on the gap between this and how (in their view) ideology shapes the documentary-creation process of human filmmakers. This can be seen as another of Terminal Time’s inversions of the history of AI — a gap in its simulation of human behavior that is not a failure, or opportunity for future work, but an opportunity for reflection and debate.
Terminal Time and interaction
Little of the sort of reflection discussed above is likely with non-interactive story generation systems. In a sense, it is the interactive nature of Terminal Time that actually warrants the use of computational processes. This separates it, for example, from the public presentation of Brutus. A human author could easily write a Brutusn story for posting on a website or printing in a newspaper. The same is true of Universe. A human author could (and many human authors do) produce plot outlines for scriptwriters to turn into documents that guide the shooting of serial melodramas. Whereas, for a Terminal Time story to be realized as a newly-scripted documentary with video and voiceover, in seconds, for a live audience, there must be some mechanism that operates more quickly than a human author.
At the same time, Terminal Time is far from the most efficient route to this interactive experience. For example, rather than authoring matched sets of historical events and inference tests (as seen in the Giordano Bruno story and the test for “show-thinkers-persecuted-by
Terminal Time’s surface
Considering this from another direction, the extent to which audiences can engage the specifics of Terminal Time’s processes — either for critical reflection or aesthetic appreciation — is limited by the surface it presents to its audiences. Terminal Time’s interaction mode (discrete questions interspersed by story generation) recalls Tale-Spin more closely than Eliza/Doctor. Further, as with Tale-Spin and Mumble, most of the burden must fall to the processes that create the work’s surface. The most interesting elements of the system must be those selected for presentation in the output, and the generation system must have enough nuance and flexibility to communicate them to the audience. In other words, a system like Terminal Time can create a very interesting model of an audience-selected ideologue, combining a number of biases and strategies, but this matters little if the audience cannot understand this from the documentary produced.
A concern along these lines is part of Mateas’s larger project of Expressive AI, which he describes using the terms “authorial affordance” and “interpretive affordance.” The concept of “affordance,” as applied by Mateas, is one brought into the discussion of human-computer interaction by Donald Norman (though it originated with psychologist J. J. Gibson). In introducing the term in his book The Psychology of Everyday Things, Norman writes: “When used in this sense, the term affordance refers to the perceived and actual properties of the thing, primarily those fundamental properties that determine just how the thing could possibly be used” (1988, 9). Similarly, according to Mateas, the “authorial affordances of an AI architecture are the ‘hooks’ that an architecture provides for an artist to inscribe their authorial intention in the machine” (Mateas, 2002, 125–126). Interpretive affordances, naturally, are the other side of the coin. They are the hooks the system makes available to an audience to aid in the interpretation of the system, its actions, and its possibilities.
Along with these pieces of vocabulary Mateas also offers a poetics: a recommendation that authorial and interpretive affordances be considered together and closely matched. An architecture should be “crafted in such a way as to enable just those authorial affordances that allow the artist to manipulate the interpretive affordances dictated by the concept of the piece” (127). Given that the concept of Terminal Time is for a complex, evolving model of ideology to be interpretable from clips of pre-existing video combined with textual descriptions of historical events, systems like Terminal Time need a model for generating text that both captures authorial intention and is flexibly manipulable by the system. Creating such systems is a huge challenge.