February 28, 2008
Scott Turner, like many before and since, first became interested in story generation after coming upon Vladimir Propp’s analysis of Russian folktales (1968). Propp provides a grammar that describes the structure of many folktales. As linguists and computer scientists know, grammars can be used for describing the structure of given things, and also for generating new things. But, as Turner soon discovered, this task is not easily accomplished with Propp’s grammar. Its elements are rather abstract, making them workable for analysis but insufficient for generation.5
Turner was a senior in college at the time. A few years later, while doing graduate research in UCLA’s Computer Science department, he began work on a radically different vision of story generation, embodied in his Minstrel system. This would culminate in a dissertation more than 800 pages long (setting a new record in his department) that he distilled down to fewer than 300 as the book The Creative Process: A Computer Model of Storytelling and Creativity (1994).
Though Turner wasn’t at Yale, his work was still pursued in the context of scruffy AI. His UCLA advisor was the newly-arrived Michael Dyer, who had recently completed a dissertation at Yale influenced by Schank and Abelson’s ideas.6 Over the better part of a decade of work, the shape of Minstrel was influenced by two important factors: the evolving scruffy approach to AI and the evolution of Turner’s aims.
At first, the primary aim was to build a better story generation system, one that took the goals of a simulated author into account — especially the goal of telling a story with a theme, as understood in Dyer’s concept of “Thematic Abstraction Units” (Turner and Dyer, 1985, 373).7 The initial technical approach was to create an improved Tale-Spin. As Turner describes it:
Talespin was essentially a planning engine, so it seemed reasonable to build a better storytelling program by simply augmenting the Talespin model with a “meta” level of goals and plans representing what the author was trying to achieve with his storytelling. And, in fact, the first versions of Minstrel operated just this way. One problem became immediately obvious with this approach: the stories weren’t original. (Turner, 2007)
The problem of originality, of creativity, became increasingly central to Turner’s research. As he puts it, “Storytelling went from being an end in itself to being the domain in which Minstrel demonstrated creativity.” At the same time, the account of intelligence in the scruffy AI community was shifting. Particularly, in Schank’s lab the model of dynamic memory and its adaptations was extended into the idea of “Case-Based Reasoning” (CBR). The basic idea of CBR is in some ways quite close to that of scripts: in the main people do not decide what to do in each situation by reasoning from first principles, but rather by drawing on previous knowledge. However, rather than suggesting that each of us has a “restaurant script” and a “sports event script” and so on, case-based reasoning assumes that we remember many cases, and reason from them — much as the learning of previous cases is formalized in legal and business education.
According to CBR theory, humans have three major types of cases we work with. There are “ossified cases” that have been abstracted to the point where they are essentially rules, such as proverbs. There are “paradigmatic cases,” each of which turns out to be the only experience we have that is relevant to a particular current situation, and which we adapt in order to understand the new situation. Finally, the most complex structures are “stories,” which Schank and Riesbeck characterize as “unique and full of detail, like paradigmatic cases, but with points, like proverbs” (1989, 13). The continuing reinterpretation of stories is described as the “basis of creativity in a cognitive system” — but most work in CBR focused, instead, on “understanding and problem solving in everyday situations” (14).
When Meehan began work on Tale-Spin, he rejected scripts as the basis for stories. Instead, he chose the technique that scruffy AI then posited as the approach used when no script was available: planning. Given Turner’s focus on creativity, he similarly rejected the straight employment of case-based knowledge in stories. But this forced him to develop an implementable model of creativity for case-based reasoning that could be employed to generate stories — no small task.
Creating stories from stories
Minstrel begins storytelling much as some human authors might: with a theme to be illustrated. The audience can request a particular theme, or Minstrel can be “reminded” of a story with a similar theme. Minstrel is reminded by being given a pool of fragments structured according to the internal schema representations it uses for characters and scenes. Matching fragments against stories in memory will result in one story being selected, and then Minstrel will have the goal of telling a story on the same theme.
Minstrel uses case-based reasoning to meet its goals, including this one. But goals also need to be organized. Rather than running a simulation of character behavior through time, as Tale-Spin does, Minstrel organizes its goals as an internal agenda. Planning proceeds in cycles, with each cycle attempting to accomplish the goal that currently has the highest priority on the agenda. If a goal fails it can be put back on the agenda at a lower priority, with the hope that later circumstances will make it possible to achieve. The initial goal is to “tell a story,” which “breaks down into subgoals including selecting a theme, illustrating a theme, applying drama goals, checking the story for consistency, and presenting the story to the reader” (Turner, 1994, 77).
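The agenda cycle described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not Turner's implementation (Minstrel was written in Lisp): the `Agenda` class, the numeric priorities, the fixed `penalty`, and the toy `attempt` rule are all hypothetical.

```python
import heapq
from itertools import count


class Agenda:
    """Toy priority agenda of author-level goals (hypothetical, not Minstrel's code)."""

    def __init__(self):
        self._heap = []
        self._tie = count()  # keeps insertion order among equal priorities

    def add(self, priority, goal):
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._tie), goal))

    def run(self, attempt, penalty=1, max_cycles=50):
        """Run planning cycles; attempt(goal) returns True when the goal succeeds."""
        achieved = []
        for _cycle in range(max_cycles):
            if not self._heap:
                break
            neg_priority, _order, goal = heapq.heappop(self._heap)
            priority = -neg_priority
            if attempt(goal):
                achieved.append(goal)
            elif priority - penalty > 0:
                # A failed goal goes back on the agenda at a lower priority.
                self.add(priority - penalty, goal)
        return achieved


done = set()


def attempt(goal):
    # Assumed toy rule: a theme cannot be illustrated before one is selected.
    if goal == "illustrate theme" and "select theme" not in done:
        return False
    done.add(goal)
    return True


agenda = Agenda()
agenda.add(5, "illustrate theme")
agenda.add(4, "select theme")
agenda.add(1, "present story")
order = agenda.run(attempt)
print(order)
```

In this toy run the first attempt to illustrate the theme fails, the goal is requeued one step lower, and it succeeds on a later cycle once a theme has been selected.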
Minstrel’s themes are also represented in its schema system. Each theme is actually a piece of advice about planning, and represents the kinds of characters, plans, and outcomes necessary for illustrating the theme. Though Minstrel is designed to tell stories in the domain of King Arthur’s knights, its “planning advice themes” (PATs) are drawn from Romeo and Juliet, It’s a Wonderful Life, and proverbs. For example, one of the PATs drawn from Romeo and Juliet is PAT:Hasty-Impulse-Regretted, based on Romeo killing himself upon discovering what he believes is Juliet’s lifeless body (though, if he had waited a moment longer, she would have awakened from her simulated death). Turner summarizes his schema representation of this as follows:
Decision: &Romeo believes something (&Belief.1) that causes a goal failure for him (&Goal.1). This and his hasty disposition motivate him to do something irreversible (&Act.1).
Connection: &Romeo learns something new (&State.4) that supersedes the evidence for his earlier belief (&Belief.1).
Consequence: &Romeo now has a different belief, which motivates him to retract his earlier goal (&Goal.2) but he cannot, because his earlier action (&Act.1) is irreversible. (104)
Of course, as Turner notes, this is not actually what happens in Shakespeare’s play. Romeo kills himself, and never knows that Juliet was not actually dead — much less regrets his decision. That it is represented this way is an artifact of the larger system design. In Minstrel, character-level goals and plans are represented in the schema, and so can be transformed (as outlined below). Author-level plans, on the other hand, are each structured, independent blocks of code in the Lisp programming language — presumably for reasons of authoring and execution efficiency.8 Therefore, author-level plans are opaque to Minstrel’s transformation procedures, which operate on the schema representations. As a result, if PATs are going to be transformed, which is Minstrel’s primary engine for producing new stories, then they must be represented at the character level, rather than at the authorial level.
In any case, once a theme has been selected, this adds a set of goals to the agenda: instantiating examples of the decision, connection, consequence, and context of the PAT. Once transformation plans succeed in creating the sequence of story events that accomplish these goals, other goals can come into play. For example, “drama goals” include suspense, tragedy, foreshadowing, and characterization. To illustrate, a characterization goal would add a story scene showing that a character has an important personality element (e.g., makes decisions in haste) before the section of the story that embodies the PAT. As mentioned above, another set of goals, “consistency goals,” fill out elements that aren’t the bare-bones embodiments of the PAT. For example, if a knight kills another person, consistency goals make sure that he is near the person first, and make sure that he has an emotional reaction afterward. Finally, presentation goals make the final selection of the scenes that will be in the story, their ordering, and how they will be expressed in English. But all of this, while it represents a fuller method of story generation than Tale-Spin’s, is only the enabling machinery around Minstrel’s primary operational logic: TRAMs.
At the heart of Minstrel’s approach to generating stories is the implementation of Turner’s theory of creativity: TRAMs. These “Transform-Recall-Adapt Methods” are a way of finding cases in the system memory that are related to the current situation and adapting elements of these previous cases for new uses. In this way stories can be generated that illustrate a particular theme without reusing previous stories verbatim. The approach is based on transforming the problem repeatedly, in carefully-crafted ways, rather than doing an exhaustive search through all the possible solutions.
One TRAM example that Turner provides shows Minstrel trying to instantiate a scene of a knight committing suicide. Minstrel’s first TRAM is always TRAM:Standard-Problem-Solving, which attempts to use a solution that already exists in memory. This TRAM can fail in two ways. First, it is possible that there is no case in memory that matches. Second, it is possible that the matching cases in memory have already been used twice, which results in them being assessed as “boring” by the system — so a new solution must be found. For either type of failure, the next step is to transform the problem and look for a case matching the transformed problem.
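The two failure modes of TRAM:Standard-Problem-Solving can be sketched as follows. This is a minimal illustration, assuming cases are flat slot-value records and that use counts are tracked per case. The function names, the dictionary representation, and the status strings are hypothetical. Only the "used twice" threshold comes from the description above.

```python
from collections import Counter

BORING_AFTER = 2  # per the text: a case already used twice counts as "boring"


def matches(case, problem):
    """A case matches when it agrees with every slot the problem constrains."""
    return all(case.get(slot) == value for slot, value in problem.items())


def standard_problem_solving(problem, memory, uses):
    """Return (case, status), where status is 'found', 'no match', or 'boring'."""
    candidates = [i for i, case in enumerate(memory) if matches(case, problem)]
    if not candidates:
        return None, "no match"  # first failure mode: nothing in memory fits
    for i in candidates:
        if uses[i] < BORING_AFTER:
            uses[i] += 1
            return memory[i], "found"
    return None, "boring"  # second failure mode: every match is overused


# Toy memory with a single episode (assumed representation).
memory = [{"actor": "knight", "act": "fight", "object": "troll"}]
uses = Counter()

fight = {"actor": "knight", "act": "fight"}
statuses = [standard_problem_solving(fight, memory, uses)[1] for _ in range(3)]

suicide = {"actor": "knight", "act": "kill", "object": "self"}
_, suicide_status = standard_problem_solving(suicide, memory, uses)
print(statuses, suicide_status)
```

Either failure status would be the signal, in this sketch, to transform the problem and try again.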
In Turner’s example, Minstrel’s memory only contains the schemas for two episodes. In the first a knight fights a troll with his sword, killing the troll and being injured in the process. In the second a princess drinks a potion and makes herself ill. Neither of these is a direct match for suicide, so Minstrel must transform the problem.
One possible transformation is TRAM:Generalize-Constraint. This can be used to relax one of the constraints in a schema. In this case, it is used to relax the requirement of a knight killing himself. This is the “Transform” step in a TRAM, and it is followed by the “Recall” step. Here the system searches for a scene of a knight killing anything — not just himself — and succeeds in finding the scene of the knight killing a troll. Since this was successful, the next step is to attempt to “Adapt” this solution to the new situation, by reinstating the constraint that was relaxed. The result is then assessed, and deemed appropriate, so Minstrel determines that the knight can kill himself with his sword. Here we can see Minstrel — on some level — producing something that wasn’t already present in the system’s data. This is the key to how Minstrel’s model of story generation not only goes beyond shuffling pre-written elements, it also goes beyond simulation via previously-encoded actions, as found in Tale-Spin and F.E.A.R.
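The Transform, Recall, and Adapt steps of this example can be sketched in a few lines. This is a hedged illustration rather than Minstrel's actual Lisp code: the flat dictionary schemas, the slot names, and the functions `recall` and `generalize_constraint` are assumptions, and the assessment step is left out.

```python
# The two episodes from Turner's example, in an assumed flat representation.
MEMORY = [
    {"actor": "knight", "act": "kill", "object": "troll", "instrument": "sword"},
    {"actor": "princess", "act": "make ill", "object": "self", "instrument": "potion"},
]


def recall(problem, memory):
    """Return the first case that agrees with every constrained slot, else None."""
    for case in memory:
        if all(case.get(slot) == value for slot, value in problem.items()):
            return case
    return None


def generalize_constraint(problem, slot, memory):
    """Relax one slot (Transform), search memory (Recall), restore the slot (Adapt)."""
    relaxed = {k: v for k, v in problem.items() if k != slot}  # Transform
    case = recall(relaxed, memory)                             # Recall
    if case is None:
        return None
    adapted = dict(case)                                       # Adapt
    adapted[slot] = problem[slot]
    return adapted


problem = {"actor": "knight", "act": "kill", "object": "self"}
direct = recall(problem, MEMORY)  # no stored episode is a suicide
solution = generalize_constraint(problem, "object", MEMORY)
print(direct, solution)
```

Relaxing the `object` slot lets the troll episode be recalled, and reinstating it yields a knight killing himself with a sword, an event that appears nowhere in the toy memory.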
Further, the example above is only the most simple use of Minstrel’s TRAMs. The system finds other methods of suicide by a more complex route. For example, there is also TRAM:Similar-Outcomes-Partial
And that’s not all — the TRAM:Similar-Outcomes-Partial
Through various series of small, recursive transformations such as those outlined above, Minstrel is able to produce story events significantly different from any in its memory. While it can only elaborate as many themes as it has hand-coded PATs, with a large enough schema library it could presumably fill out the structures of those themes with a wide variety of events, creating many different stories. But enabling a wide variety of storytelling is not actually Turner’s goal. He writes: “Minstrel begins with a small amount of knowledge about the King Arthur domain, as if it had read one or two short stories about King Arthur. Using this knowledge, Minstrel is able to tell more than ten complete stories, and many more incomplete stories and story fragments” (8–9).
One reason for this sparsity of initial data is, simply, that encoding knowledge into the schema used by Minstrel is time-consuming. Turner’s main task was to work on Minstrel’s processes, rather than its body of data. Another reason is that starting with a small body of cases shows off Minstrel’s creativity to greater effect. It ensures that TRAM:Standard-Problem-Solving will be nearly useless when the program begins, so recursively-built solutions will be needed almost immediately. But another reason is that, by its very design, the more Minstrel knows the more it gets in trouble. The pattern of this trouble points to a deep issue for systems that seek to model part of human intelligence.
Turner provides a simple, amusing example of this trouble: a knight kills and eats a princess, adapting a plan from a dragon (278). A more complex example arises from Turner’s attempt to add a new theme after the system was relatively well developed. Unfortunately, the story produced by PAT:PRIDE is seriously flawed:
Once upon a time, a hermit named Bebe told a knight named Grunfeld that if Grunfeld fought a dragon then something bad would happen. Grunfeld was very proud. Because he was very proud, he wanted to impress the king. Grunfeld moved to a dragon. Grunfeld fought a dragon. The dragon was destroyed, but Grunfeld was wounded. Grunfeld was wounded because he fought a knight. Grunfeld being wounded impressed the king. (240, original emphasis)
The basic problem is this. The more material there is in the Minstrel system (the larger its microworld of knowledge), the more its “creative” transformation procedures will succeed in finding episodes to adapt to new circumstances. But the nature of this adaptation is, precisely, that it exceeds the bounds of the knowledge already available about the microworld. (If the knowledge were already present it could have been accessed by TRAM:Standard-Problem-Solving, and creativity would not come into play.) Because of this, the more data is available the more Minstrel will generate inappropriate episodes, and have no principled means of rejecting them. Unless Minstrel’s TRAM processes are to be reined in, significantly diminishing the interest value of the system, the only way around the problem is to carefully limit and shape the system data.
In other words, the problem with Minstrel lies in trying to simulate one part of human intelligence — a particular kind of creativity — but not the rest. This is a recurring problem in AI systems designed around models of human cognition. Its most widely discussed manifestation is the “common-sense reasoning problem,” which Murray Shanahan has called “the nemesis of artificial intelligence” (Mueller, 2006, xvii). This problem is an umbrella for all that normal human beings know and infer about the world: water makes things wet; if I pick up my water glass and leave the room, it leaves too; if I refill the water glass I probably intend to drink more water, my throat might be dry. Both scruffy AI’s scripts and cases can be seen as attempts to encode common-sense knowledge — but the problem is so daunting that, in decades of effort, no one has succeeded in developing a robust solution.
In 2007, thirteen years after the publication of his book, Turner put it this way:
Minstrel was a brittle program. My contention is that if you give me a robust, non-creative program that demonstrates all the world knowledge, intelligence and problem-solving ability of the average 21 year old, I’ll be able to implement a robust creative program atop that. But I didn’t have that luxury. I had to build just exactly those parts of that robust intelligence I needed to demonstrate my thesis. Naturally, if you stray even slightly from those limits, things break. (Turner, 2007)
5. I’ll discuss story grammars in more detail in the next chapter, in the context of Brutus.
6. Dyer’s dissertation became the book In-Depth Understanding: A Computer Model of Integrated Processing for Narrative Comprehension (MIT Press, 1983).
7. “Thematic Abstraction Units” are also called “Thematic Affect Units.”
8. Turner writes: “Minstrel’s author-level plans are represented as structures of Lisp code, and Minstrel’s TRAMs do not know how to adapt Lisp code. Minstrel’s author-level plans are opaque and non-adaptable, and so Minstrel cannot adapt author-level plans” (83). He explains, “Although the same type of representation could be used for Minstrel’s author-level plans, it would be clumsy and time consuming. Schemas for complicated computational actions such as looping, recursion, and so on would have to be defined and an interpreter built to perform these actions” (81).