March 2, 2006
Turner’s Minstrel, part 2
In my previous two posts (1 2) I gave some background about two story generation systems, Minstrel and Universe, and outlined the basic set of plans and goals used by Minstrel. In this post I’ll discuss the main engine Minstrel uses for creating new stories: transformation and adaptation. As we’ll see, it’s both intriguing and problematic.
At the heart of Minstrel’s approach to generating stories is Turner’s take on creativity, one which (like the structures of PATs) is based on case-based reasoning: TRAMs. These “Transform-Recall-Adapt Methods” are a way of finding cases in the system memory that are related to the current situation and adapting elements of these previous cases for new uses. In this way stories can be generated that illustrate a particular theme without reusing previous stories verbatim.
One example that Turner provides shows Minstrel trying to instantiate a scene of a knight committing suicide (though it is unclear which PAT this will help illustrate). Minstrel’s first TRAM is always TRAM:Standard-Problem-Solving, which attempts to use a solution that already exists in memory. This TRAM can fail in two ways. First, it is possible that there is no case in memory that matches. Second, it is possible that the matching cases in memory have already been used twice, which results in them being assessed as “boring” by the system — so a new solution must be found. For either type of failure, the next step is to transform the problem and look for a case matching the transformed problem.
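The retrieve-or-fail behavior of TRAM:Standard-Problem-Solving can be sketched roughly as follows. This is a hypothetical Python reconstruction, not Turner's code (Minstrel was written in Lisp), and the schema representation, function name, and "boring after two uses" threshold are my own simplifications of his description:

```python
def recall(problem, memory, use_counts, boring_after=2):
    """Return a stored case matching every constraint in `problem`,
    skipping cases the system has already reused too often."""
    for i, case in enumerate(memory):
        if use_counts.get(i, 0) >= boring_after:
            continue  # assessed as "boring" -- a new solution must be found
        if all(case.get(k) == v for k, v in problem.items()):
            use_counts[i] = use_counts.get(i, 0) + 1
            return case
    return None  # failure: transform the problem and try again


# Episodes as flat attribute-value schemas (a stand-in for Turner's
# richer schema language):
memory = [
    {"actor": "knight", "act": "fight", "object": "troll"},
    {"actor": "princess", "act": "drink-potion", "goal": "injure"},
]
counts = {}
# A knight killing himself matches neither episode directly:
print(recall({"actor": "knight", "act": "kill", "object": "knight"},
             memory, counts))  # -> None
```

On the third request for the same matching case, `recall` returns `None` even though the case is still in memory, modeling the system's refusal to reuse a "boring" solution.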
In Turner’s example, Minstrel’s memory only contains the schemas for two episodes. In the first a knight fights a troll with his sword, killing the troll and being injured in the process. In the second a princess drinks a potion and makes herself ill. Neither of these is a direct match for suicide, so Minstrel must transform the problem.
One possible transformation is TRAM:Generalize-Constraint. This can be used to relax one of the constraints in a schema. In this case, it is used to relax the requirement of a knight killing himself. This is the “Transform” step in a TRAM, and it is followed by the “Recall” step. Here the system searches for a scene of a knight killing anything — not just himself — and succeeds in finding the scene of the knight killing a troll. Since this was successful, the next step is to attempt to “Adapt” this solution to the new situation, by reinstating the constraint that was relaxed. The result is then assessed, and deemed appropriate, so Minstrel determines that the knight can kill himself with his sword.
But this is only the simplest use of Minstrel’s TRAMs, and the system finds other methods of suicide by a more complex route. For example, there is also TRAM:Similar-Outcomes-Partial-Change. According to Turner, this TRAM “recognizes that being killed is similar to being injured” (p. 49) and transforms the schema to one in which a knight purposefully injures himself. This, however, returns no matching cases. The knight fighting the troll is not retrieved, because the injury was accidental. The princess drinking the potion is not retrieved, because the actor was not a knight. But this does not cause Minstrel to simply give up on the direction proposed by TRAM:Similar-Outcomes-Partial-Change. Instead the TRAM process begins again, recursively, using the already transformed problem and applying a different TRAM to it. In this next stage, by applying TRAM:Generalize-Constraint to the actor, it is able to find the princess drinking a potion to injure herself. It adapts by reapplying the generalized constraint to create a schema for a knight drinking a potion to injure himself, and then returns to the original TRAM. This adapts by changing from injuring to killing, and the result is an event of a knight drinking a potion to kill himself. This is assessed as successful, added to the story, and added to memory so that it can become a case retrieved by other TRAM processes.
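The recursive Transform-Recall-Adapt loop described above can be sketched as follows. Again this is my own illustrative reconstruction under simplifying assumptions, not Turner's implementation: each TRAM is modeled as a pair of functions, one that transforms the problem schema and one that adapts a recalled solution back toward the original constraints:

```python
def tram_solve(problem, memory, trams, depth=3):
    # "Recall": look for a stored case matching every constraint.
    for case in memory:
        if all(case.get(k) == v for k, v in problem.items()):
            return dict(case)
    if depth == 0:
        return None
    # "Transform": try each TRAM, recursing on the transformed problem.
    for transform, adapt in trams:
        new_problem = transform(problem)
        if new_problem is None:
            continue  # this TRAM does not apply to this schema
        solution = tram_solve(new_problem, memory, trams, depth - 1)
        if solution is not None:
            # "Adapt": reinstate the constraints the transform relaxed.
            return adapt(solution, problem)
    return None


# Two toy TRAMs in the spirit of the example: drop the actor constraint
# (a crude TRAM:Generalize-Constraint) and weaken "kill" to "injure"
# (a crude TRAM:Similar-Outcomes-Partial-Change).
def generalize_actor(p):
    if "actor" not in p:
        return None
    q = dict(p); del q["actor"]; return q

def readapt_actor(solution, original):
    s = dict(solution); s["actor"] = original["actor"]; return s

def kill_to_injure(p):
    if p.get("goal") != "kill":
        return None
    q = dict(p); q["goal"] = "injure"; return q

def injure_to_kill(solution, original):
    s = dict(solution); s["goal"] = "kill"; return s


memory = [{"actor": "princess", "act": "drink-potion", "goal": "injure"}]
trams = [(kill_to_injure, injure_to_kill),
         (generalize_actor, readapt_actor)]
# A knight killing himself matches nothing directly, but two nested
# transformations reach the princess episode and adapt it back:
print(tram_solve({"actor": "knight", "goal": "kill"}, memory, trams))
# -> {'actor': 'knight', 'act': 'drink-potion', 'goal': 'kill'}
```

The two-level recursion mirrors Turner's trace: kill is weakened to injure, the actor constraint is then dropped so the princess episode is recalled, and the two adapt steps restore the knight and the lethal goal, yielding the knight drinking a potion to kill himself.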
And that’s not all — the TRAM:Similar-Outcomes-Partial-Change also helps generate another plan for suicide when used as a second-level TRAM. In this case the first-level transformation is TRAM:Intention-Switch, which changes the schema from a knight purposefully killing himself to accidentally killing himself. When this, at the next level, is transformed from death to injury, the fight with the troll is found in memory. Minstrel then produces a story of a knight going into battle in order to die. With three different suicide methods found for the knight, Turner’s example comes to an end as well.
Through various series of small, recursive transformations such as those outlined above, Minstrel is able to produce story events significantly different from any in its memory. While it can only elaborate as many themes as it has hand-coded PATs, with a large enough schema library it could presumably fill out the structures of those themes with a wide variety of events, creating many different stories. But enabling a wide variety of storytelling is not actually Turner’s goal. He writes: “Minstrel begins with a small amount of knowledge about the King Arthur domain, as if it had read one or two short stories about King Arthur. Using this knowledge, Minstrel is able to tell more than ten complete stories, and many more incomplete stories and story fragments” (p. 8-9). We are told that accomplishing this requires about 17,000 lines of code for Minstrel, and another 10,000 lines of code for the tools package upon which it is built.
With such elaborate processes, requiring so much time to develop and so many lines of code to implement, why starve Minstrel for data — only giving it the schemas equivalent to one or two short stories? Certainly no human storyteller was ever so starved for data. We all hear and read many, many stories before we begin to tell successful stories ourselves. Certainly the reason is not to achieve greater connection with Minstrel’s underlying theories from cognitive science. In Schank’s CBR theories an expert — such as an expert storyteller — is someone with access to a large body of cases that are effectively indexed and retrieved.
One possible explanation is that starting with a small body of cases shows off Minstrel’s creativity to greater effect. It ensures that TRAM:Standard-Problem-Solving will be nearly useless when the program begins, so recursively-built solutions will be needed almost immediately. The number of stories the system is able to create (about ten) is also clearly much larger than the number it begins with (about two).
But it is more likely that the complex, and in some ways fascinating, model of Minstrel was also exceedingly brittle. It may have produced more and more mis-spun tales as more data was added to the system, due to the unpredictable emergent behavior encouraged by the TRAM system. Turner gives some indication of this when he reports on his attempt to add a new theme after the system was complete. Unfortunately, the story produced by PAT:PRIDE is seriously flawed:
Once upon a time, a hermit named Bebe told a knight named Grunfeld that if Grunfeld fought a dragon then something bad would happen.
Grunfeld was very proud. Because he was very proud, he wanted to impress the king. Grunfeld moved to a dragon. Grunfeld fought a dragon. The dragon was destroyed, but Grunfeld was wounded. Grunfeld was wounded because he fought a knight. Grunfeld being wounded impressed the king.
(p. 240, original emphasis)
The problem arises because of the actions of a transformation called TRAM:Similar-Thwart-State, and Turner was able to revise this TRAM to remove portions of episodes that it was not able to adapt. But it is important to remember that this problem arose with the completed system (and not an incomplete one, as with the mis-spun tales of Tale-Spin reprinted by Aarseth, Murray, Bolter, and others). A similar error occurs when a knight kills and eats a princess, adapting a plan from a dragon (p. 278). Of course, a problem such as this could also be easily solved with further changes to the system. But it seems likely that, as any further data was added to the system, more emergent behavior problems would keep cropping up. Rafael Pérez y Pérez and Mike Sharples suggest something along these lines in their evaluation of Minstrel, writing:
[T]he reader can imagine a Knight who is sewing his socks and pricked himself by accident; in this case, because the action of sewing produced an injury to the Knight, Minstrel would treat sewing as a method to kill someone.
(p. 21, “Three Computer-Based Models of Story-Telling: BRUTUS, MINSTREL and MEXICA”)
In all of these examples we can see, in Minstrel, symptoms of a much larger problem, one that Turner alone could have done little to address. By the late 1980s it was clear that AI systems in general were not living up to the expectations that had been created over the three previous decades. Many successful systems had been created — by both “neats” and “scruffies” — but all of these worked on very small sets of data. Based on these successes, significant funding had been dedicated to attempting to scale up to larger, more real-world amounts of data. But these attempts failed, perhaps most spectacularly in the once high-flying area of “expert systems.” The methods of AI had produced, rather than operational simulations of intelligence, a panoply of idiosyncratic encodings of researchers’ beliefs about humanity. Guy Steele and Richard Gabriel, in their history of the LISP programming language, note that by 1988 the term “AI winter” had been introduced to describe the growing backlash and resulting loss of funding for many AI projects.
While Minstrel looked, from inside AI, like a significant improvement over Tale-Spin — with an improved model of human cognition, and a smarter choice of which humans to simulate — from our current perspective the conclusion is, to put it charitably, debatable. Instead Minstrel looks like an inventively constructed Rube Goldberg device, massive and complex, continually reconfiguring a few small pieces at its center, and likely to break if given a piece of even slightly different shape. It attempts to create more complete and structured fictional worlds than Tale-Spin, by bringing an author into the processes, but gives that author so little to work with that its alternate universes are mostly uninhabited. The end result of trying to simulate a human author, even with compromises toward greater system functionality, is 27,000 lines of code that produce roughly 10 stories of “one-half to one page in length” (p. 8).
And that’s all for my examination of Minstrel, but soon I’ll write about the rather different approach and results of Universe.
March 2nd, 2006 at 9:55 pm
Noah, you’ve done an extremely lucid job of overviewing Minstrel’s algorithms, thank you very much for that. I’m sure you’ll be introducing Turner’s work for the first time to many GTxA readers, which is great.
[W]hy starve Minstrel for data — only giving it the schemas equivalent to one or two short stories? … One possible explanation is that starting with a small body of cases shows off Minstrel’s creativity to greater effect.
Yes, I think so. It’s impressive to see what a generative system can do with a minimal amount of knowledge.
But it is more likely that the complex, and in some ways fascinating model of Minstrel was also exceedingly brittle. It may have produced more and more mis-spun tales as more data was added to the system, due to the unpredictable emergent behavior encouraged by the TRAM system. …it seems likely that, as any further data was added to the system, more emergent behavior problems would keep cropping up.
Another way to spin it is that Minstrel is imaginative, willing to stretch the bounds of believability into fantasy and surrealism. That a system can think of sewing needles as a way to kill someone, or knights eating princesses, is seriously amusing, and not in a bad way.
But obviously most authors/players will want more realistic, believable stories than that; nonetheless, I wouldn’t call Minstrel exceedingly brittle, I’d call it overly imaginative, with the need for more common-sense knowledge and more of an ability to judge what’s believable or not, to rein its imagination in.
While Minstrel looked, from inside AI, like a significant improvement … from our current perspective the conclusion is, to put it charitably, debatable. Instead Minstrel looks like an inventively constructed Rube Goldberg device…
The fact that adding more raw PATs requires that and other bits of knowledge to be refined to be used believably is, I think, fine and should be expected. Minstrel (or techniques with a similar flavor) seems to be on the right track to me. I would love to see some R&D applying these techniques and filling out (and doing the required knowledge engineering for) a large database of PATs and a well-designed set of TRAMs.
But would that effort become exponentially intractable? That’s not obvious. Knowledge engineering — the massaging of data and its annotations by talented humans — is a craft that can be done well, or poorly. My gut tells me there would be ways to keep the process tractable, by writer/programmers who understand how the algorithms will be using the data. (It won’t work to create a webpage interface for naive users to input knowledge.)
Again, it’s not obvious yet what aspects of this can be applied to interactive stories; that would be another discussion.
March 3rd, 2006 at 12:16 am
Andrew, thanks for your thoughtful comment.
I think that Minstrel is a totally fascinating architecture, and I agree that I’d really like to see what would happen with an out of control recursive TRAM system working on a much larger set of data. Knights eating princesses is just the tip of the iceberg. It could be amusing, and I think it could also be quite thought provoking if done with the right TRAMs and data.
But that wasn’t Turner’s goal. And I think the real problem, here, is that Turner was trying to serve two goals, which didn’t go together well. One was to simulate an author through operationalizing CBR theory. The other was to create well-formed stories. He ended up with a compromise that didn’t really accomplish either goal, but took huge amounts of effort to produce.
In a way, I see Minstrel as a cautionary tale: figure out your goal, and design a system for that goal. If your goal is to produce a certain type of story, TRAMs might actually prove an interesting way of getting there. But choose your tools after your goals, if you want to produce stories. If you want to explore a particular set of tools, don’t expect good stories.
March 3rd, 2006 at 7:04 am
Wow, thanks for posting this, Noah.
I’ve been puzzled about this for some time. I did some work on designing schemes for generating interactive stories, and for research I looked into generating linear stories; I was somewhat puzzled when it seemed the community considered the non-interactive storytelling problem “solved” by things like Minstrel. I’m glad to hear that I’m not the only one reacting to it as “from our current perspective the conclusion is, to put it charitably, debatable,” although I’d not be so charitable.
That a system can think of sewing needles as a way to kill someone, or knights eating princesses, is seriously amusing, and not in a bad way.
I don’t know, Andrew. My reaction to this is more like reading Markov-model-generated text (a.k.a. Dissociated Press), which indeed some people find amusing. It’s possible for us to try to read something into it, but it isn’t captured by the underlying representation/model of the generating system, so if that meaning wasn’t put in specifically by the author of the source data, it’s entirely in the eyes of the beholder. I noticed this with Minstrel with things like Noah’s quote above where the story of the Knight intentionally harming himself mutates into accidentally coming to harm as a result of fighting the troll, which we can interpret as “the Knight going into battle in order to die”, but this is entirely our interpretation, or was implicit in the author’s choice of the snippet given the transformation rules. I guess the knitting example translates to this case as well: the system would generate the knight sitting down and knitting in order to die. This obviously doesn’t make sense, and so the whole thing comes off as really only just barely working because of the exact choice of source material.
Basically my reaction to all of these systems has been, if it’s not stated in the generated text, it doesn’t to me count as effective storytelling, because there’s too much danger that we as humans will re-interpret the text in a way that makes sense to us but in a way that is not available to the computer. (Thus I have a severe dislike for any system whose convincing results rely on rewriting into English by a human.) Of course if your only goal is to generate stories for entertainment purposes, this may be more tolerable, but when looking to use what we’ve learned from these systems for further purposes (say, interactive storytelling, or, say, generating plots that are then used to generate dialogue) clearly the system needs to actually “know” what’s going on and not rely on serendipitous viewer interpretation.
March 3rd, 2006 at 9:55 am
I don’t know, Andrew… Is it really useful to attribute an abstract trait like “imagination” to a program in this way? Call me unromantic, but I prefer to just assume that the persons who write such programs are imaginative; maybe even that they are willing to stretch the bounds of believability into fantasy and surrealism (although this might be a fancy way to describe what is otherwise known as a bug). But just because I can write programs that say that they can think doesn’t mean that I believe they do. I’m using a medium, that’s all.
March 3rd, 2006 at 12:05 pm
I think there’s more to Andrew’s assertion than romanticism. Isn’t it possible that in encoding a set of TRAMs, the author is creating implicit connections that can then be extrapolated by the system, and which manifest in surprising ways?
The ‘imagination’ that you end up seeing from the system is thus an implicit property of the author-created scripts, simply interpreted by the system, yet can still be surprising to the author if they haven’t thought out all the implications of the connections that they have authored.
Minstrel to me represents some sort of colossus that has to be defeated in order to bring life to the dream of true interactive storytelling. There’s some very compelling stuff in Turner’s thesis, and his use of TRAMs seemed both inspired and a serious obstacle. I couldn’t help thinking that the system would benefit from some sort of simplified overall scheme that would enable these TRAMs to fit into a more abstracted architecture.
March 3rd, 2006 at 1:07 pm
This is my first introduction to either Minstrel or Tale-Spin, so you’ll have to forgive me if I bring up known issues.
However, to me, there’s a question as to whether the absurdist emergent behavior is the result of fundamental design flaws, or if it’s the result of simple bugs in the implementation. On a case-by-case analysis, it’s easy to come up with ways to fix individual problems with these stories. Because I’m not sure of how the input data is formatted, perhaps it’s a question of what parameters the system is given: knights can’t eat princesses, so perhaps it’s necessary to distinguish types of actors, such as “beast” or “human”.
The question of what is important to the system might be insurmountable, but if the system were properly designed to account only for the parameters given, perhaps the scope of possible outputs would be decreased, but the results would be more acceptable. (This might be a question of the tools vs. results conflict that was mentioned earlier.)
Another question that I had regarding the criticisms of the system is, to what purpose is the story generation system being built? If we only allow the system to generate stories that are obvious to us, should we have just generated those stories ourselves? Perhaps the lack of limitations is a strength of the system: allowing us to experience stories outside our usual understanding, i.e., stories that we never could have written ourselves. Why is it that a knight wouldn’t kill and eat a princess? Perhaps this could reveal something about our society’s taboos against cannibalism? Or was the system built to mimic (and perhaps elucidate) the processes the human mind uses to compose stories? In this case, only those stories that make sense to us are desirable.
To me, the stories generated seem like the works of a child, lacking complete understanding of the world and how it operates. The question of the viability of the system is then: Is it possible to correct the child’s incorrect impressions of the world by simply adding more information, or does the child’s brain first have to learn how to determine what information is important on its own?
March 3rd, 2006 at 3:43 pm
Minstrel seems imaginative to me because, and I believe this was Turner’s intention, it’s doing some of the things we do when we are creative: it’s sort of searching for variations on existing known story structures to truly come up with something new. E.g., “well, what if it was y instead of x that did z… and hey, what if they did p…” This isn’t the only way we’re creative, but it’s one way. At a minimum, it reminds me of surrealism. It’s more bottom-up, not top-down; it’s more free-associative, less goal/motivation-based.
What this kind of creative technique requires, at a minimum, to avoid generating childlike or overly absurd stories is some way to judge the quality / believability / fitness of the new PATs it generates, as I alluded to earlier. (Even a top-down character-goal/motivation-based method of generation can benefit from evaluation techniques, although it wouldn’t be as critical, since motivation-based generation should be more believable, by definition.)
Sean wrote, the whole thing comes off as really only just barely working because of the exact choice of source material.
Minstrel needs more PATs — more knowledge. It’s certainly no fundamental flaw or bug that a system is only as smart as the knowledge you give it.
It’s possible for us to try to read something into it, but it isn’t captured by the underlying representation/model of the generating system, so if that meaning wasn’t put in specifically by the author of the source data, it’s entirely in the eyes of the beholder.
I’d argue Minstrel does retain some understanding of the meaning / structure of what it’s generating. It has the “hooks” (the potential for the ability) to know and reason about the fact that it is relaxing constraints, or substituting actors, or whatever. Minstrel is not as simplistic as (surrealist) Mad Libs, nor is it random generation of language based on the local statistical properties of source text, e.g. Gnoetry. PATs and TRAMs are higher-order creative techniques.
clearly the system needs to actually “know” what’s going on and not rely on serendipitous viewer interpretation.
Of course. But these PATs and TRAMs are knowledge — the concept that fighting a troll can injure you, and that such an act can be applied to trying to commit suicide, is knowledge. The system, perhaps without much enhancement, could generate the statement “The Knight wanted to commit suicide; he knew that fighting trolls could injure or kill him; he decided to fight a troll in an attempt to commit suicide.” That’s rational behavior. You might argue, why doesn’t the Knight just jump into a moat to kill himself? Simply because he doesn’t know about that method. We can’t blame the system for making non-ideal writing choices if the system doesn’t have anything more to work with.
You might argue the system could generate “The Knight wanted to commit suicide; he knew that sewing could injure him; he decided to sew in an attempt to commit suicide.” That’s rational, but not believable, because the system doesn’t know that sewing causes such light injury, and probably never could cause death. Also, additional knowledge needs to be there to help the system understand that fighting trolls is a valiant effort, and that valiant deaths are good deaths; sewing is not valiant. (If you think a suicidal person would not want a valiant death, then that knowledge needs to be encoded as well, and non-valiant death methods are needed… this is not a problem, this is just educating the system.)
Noah wrote, I think the real problem, here, is that Turner was trying to serve two goals, which didn’t go together well. One was to simulate an author through operationalizing CBR theory. The other was to create well-formed stories. He ended up with a compromise that didn’t really accomplish either goal, but took huge amounts of effort to produce.
I’m not sure I see that as a problem; isn’t simulating an author’s techniques a potentially fruitful way to create well-formed stories? (Again, more is needed than what was done so far in Minstrel, but it’s on the path.) Also, aren’t huge amounts of effort just par for the course here?
Charles wrote, To me, the stories generated seem like the works of a child, lacking complete understanding of the world and how it operates.
Yes, great observation.
The question of the viability of the system is then: Is it possible to correct the child’s incorrect impressions of the world by simply adding more information, or does the child’s brain first have to learn how to determine what information is important on its own?
I believe yes to the former, although knowledge engineering (designing PATs and TRAMs that will be used effectively and elegantly) isn’t “simple”, I’m sure it’s difficult to do well.
March 3rd, 2006 at 4:38 pm
Andrew, thanks — that’s a good clarification of how you’re thinking about Minstrel.
I guess I should clarify as well. I think the problem with Minstrel is that it’s not clear which is the primary goal: simulating authors or generating stories (which means that Charles’s question, wondering to what purpose Minstrel was designed, can’t be answered definitively). Certainly either one could be a subsidiary goal of the other, but when they fight for supremacy I don’t think it turns out well.
So, yes, simulating an author’s techniques might be a fruitful technique for producing stories (though I don’t think any convincing piece of work has been produced along these lines). But it would have to be clear that the stories were the goal, and the techniques borrowed from human authors were only used to the extent they proved useful. Unless, of course, the system is about simulating authors — but then I think the stories (unless they’re meant to illustrate this process) would be a subsidiary goal.
Still, I shouldn’t give the impression that I’m down on story generation in general. Hopefully my forthcoming posts about Universe will help give more nuance to my position.
March 3rd, 2006 at 10:24 pm
I said, the whole thing comes off as really only just barely working because of the exact choice of source material.
Andrew replied, Minstrel needs more PATs — more knowledge. It’s certainly no fundamental flaw or bug that a system is only as smart as the knowledge you give it.
I guess my comment was actually totally redundant to Noah’s brittleness concern. The question is whether you can usefully encode more knowledge in the existing system, or whether something rather different is needed. The impression I came away with from my reading of the papers is that you probably can’t, and that seems to be Noah’s concern (“more likely…”).
Charles: On a case-by case analysis, its easy to come up with ways to fix individual problems with these stories.
Sure, but the question is, is that just “engineering fixup”, or is this in essence a new system, one that needs to be researched and documented?
My suspicion, based on absolutely no information and no real understanding of the knowledge representation in Minstrel, is that some sort of different knowledge representation needs to be added to it to give it more “smarts”, and it needs to use that in a fairly different way. This may be totally tractable, but as far as I can tell (a) it hasn’t been done, and therefore (b) it’s a bad idea to treat the problem as solved!
As a historical note, when I read whichever (long) Minstrel paper it was that I read and found it strangely focused on characterizing Minstrel as the computerized demonstration of the model of creativity espoused in the paper, rather than as the end-goal of the research, I hypothesized that this was because it came from some period in which AI funding was hard to come by and they hacked around it by characterizing it as research into theories of creativity with no mention of “AI” anywhere. If that really was the goal, that’s kind of interesting and I would blame Turner less for not going further with it.
October 30th, 2007 at 8:58 am
[…] posted a series of thoughts about two story generation systems: Minstrel and Universe (1 2 3 4 5). I had some critical things to say about the Minstrel system, but they were b […]