June 7, 2005
Thoughts on AIIDE
Andrew did a great job posting his talk notes for AIIDE. In this post I’ll describe some of my reactions to the talks and to the conversations I had at AIIDE.
Andrew and I are certainly in agreement with Chris about the need to increase verb counts in order to achieve interactive story. But Chris strongly wants to avoid natural language, and instead move to a custom logographic language. Further, he wants to use parse technology to provide constraints as the player writes sentences in the custom language – I imagine something like pop-up menus. I understand the impulse to avoid natural language (it seems like an impossible, AI-complete problem) and to prevent the player from being able to form nonsensical sentences, but I worry that:
1) logographic languages will feel unnatural
2) a pre-parse interface that constrains what symbols you can use based on the symbols you’ve used so far will prevent players from being able to speak in their own style.
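To make the worry concrete, here is a minimal sketch of what such a pre-parse interface might look like: a grammar maps the symbols entered so far to the symbols the pop-up menu may offer next. The grammar, verbs, and character names here are all invented for illustration, not taken from Chris’s actual design.

```python
# Hypothetical sketch of a pre-parse constrained interface: given the
# symbols typed so far, a small grammar dictates which symbols the
# pop-up menu may offer next. Vocabulary and grammar are invented.

GRAMMAR = {
    (): ["flirt-with", "criticize", "greet"],   # a sentence starts with a verb
    ("flirt-with",): ["Trip", "Grace"],         # each verb takes a character object
    ("criticize",): ["Trip", "Grace"],
    ("greet",): ["Trip", "Grace"],
}

def allowed_next(symbols_so_far):
    """Return the symbols the menu should offer, or [] if the sentence is complete."""
    return GRAMMAR.get(tuple(symbols_so_far), [])

# The player can only ever build well-formed sentences:
assert allowed_next([]) == ["flirt-with", "criticize", "greet"]
assert "Grace" in allowed_next(["flirt-with"])
assert allowed_next(["flirt-with", "Grace"]) == []  # sentence is complete
```

The upside is that nonsensical sentences are impossible by construction; the downside, as above, is that the grammar fixes the set of expressible styles in advance.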
Chris says that based on his own experiences with Siboot, players were able to pick up languages right away; it will be an empirical question to see how natural or unnatural people find such languages, especially as the vocabulary size grows.
As I’ve argued before, part of the allure of natural language interfaces is that people can say things in their own words, increasing the agency they experience. If a logographic language provided only one way to express each discrete verb (e.g. “flirt with Trip”, “criticize Grace”), this sense of speaking in your own style would be lost.
An advantage of logographic languages is that internally, the characters can speak to each other in exactly the same formalism that the player uses to speak to characters. This symmetry allows the AI characters to interact and coordinate with each other in exactly the same way they do with the player. On the other hand, the player is different from the rest of the characters; the whole world is organized around creating an experience for the player. So perhaps the AI characters should be able to more tightly coordinate with each other, read each other’s minds, etc. in order to collaborate on creating an experience for the player. We certainly found this useful in Facade.
As a design experiment I’m very curious to play an interactive story that uses a logographic language for communication. But the jury is definitely still out as to whether natural language or constrained artificial languages are the way to go.
Will points out that AI isn’t just a single technology or design, but rather a bag of tricks. This is the prevalent view within academia as well. The problem with this definition, however, is that it fails to describe what AI is a bag of tricks for. There are lots of tricks and techniques one can use while writing programs; only a subset of these tricks and techniques are AI. AI is about creating behavior that can be read intentionally, that is, behavior that can be interpreted by a viewer (player) as if it were being produced by an intelligent entity. Often the best way to do this is to build a simple simulation of the intelligent entity; that way there are fairly obvious connections between chunks of code (e.g. a chunk of code called a “plan”) and the perception of intelligence in the viewer (e.g. “that monster looks like it has a plan”). Without those connections, it’s difficult for an AI artist to craft the code in such a way as to produce the interpretations she wants to produce in the player’s head.
He specifically mentioned interactive drama as a difficult design problem. Drama consists of complex causal chains; one break in the chain can blow your cover. But creating and maintaining those complex causal chains is crucial for giving the player global agency, a sense that what they do now has complex and evolving ramifications in the future. Interactive drama is hard precisely because it takes the modeling and simulation of open-world games, applies it to character psychology instead of character physics, and requires that the overall experience satisfy complex global constraints.
During one of the breaks I had a nice conversation with Will about prototyping. Will’s design approach is to have his team write hundreds of throw-away prototypes, each of which abstracts some aspect of the game. The prototypes themselves might just look like squares and circles with numbers next to them and a few sliders or menu items to select. After having seen a number of these prototypes, I noticed that they were all either about gameplay or procedural graphics. A gameplay prototype explores some small verb space by modeling how the verbs directly impact the abstract, score-like state, ignoring all the intermediate detail that would exist between the verbs and the abstract state. A procedural graphics prototype explores how you might automatically generate curvy roads or character flow behavior in a city, without worrying about all the work it takes to make the graphics pretty or to hook the inputs of the model to the rest of the game state. But none of the prototypes I’ve seen are what I’d call AI prototypes: something that abstracts away a bunch of code detail while still letting you explore an AI approach (for example, a particular architecture for character AI). I hypothesized that prototyping doesn’t work for AI: all the code detail is the AI; there isn’t a useful level of abstraction that would give you real design feedback. In Facade, Andrew and I didn’t use any prototypes; our design work was all done in the context of building the complete system. So I posed this hypothesis to Will to see what he thought. He completely disagreed, politely saying that if you can’t build a prototype, then you don’t understand what you’re trying to accomplish. I briefly described to him a pet project I’m just starting in automatic game generation (for simple games) and asked how you could prototype that. He suggested that you could build a prototype that spits out games as abstract features (not a playable experience) and explore the design space at that level first.
After talking for a while, I figured out that one of the keys to making prototypes work is imagination; you have to be able to look at the very abstract output of your prototype and with confidence imagine what that abstract output means in the fully-fleshed-out completed experience. For AI prototypes this means developing a good architectural imagination, that is, being able to see the implications for the full architecture just from abstract little pieces of it, and being able to imagine what that architecture would be like to author within, and what kinds of player experiences the architecture affords. Will has a ton of experience designing games and so with high confidence can perform this act of imagination. In my own work I’m going to start using rapid AI prototypes and see if I can develop this skill.
I’ll stop here. I’ll post more on AIIDE later.
June 7th, 2005 at 8:36 pm
What if you used a logographic language internally but let the player enter their stuff in natural language? Just below where the player is typing, you would show the logographic interpretation of the player’s command. That way the player could see whether what they are typing is being interpreted correctly, and everyone would be using the same language. When the other characters are communicating with the player, their speech could be translated into natural language for the player, and maybe give the logographic version underneath again.
To me, that seems to give the advantages of both approaches.
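A toy sketch of this round-trip idea might look like the following: free-form input is mapped onto internal logographic symbols, and the result (or a failure) is echoed back so the player can see the interpretation. The lexicon here is invented purely for illustration.

```python
# Toy illustration of the round-trip idea: the player types natural
# language, and the game echoes back its logographic interpretation so
# mis-parses are immediately visible. Verb/name lexicons are invented.

VERBS = {"flirt": "FLIRT", "tease": "FLIRT", "criticize": "CRITICIZE"}
NAMES = {"trip": "TRIP", "grace": "GRACE"}

def interpret(utterance):
    """Map free-form input onto logographic symbols; None if unrecognized."""
    verb = obj = None
    for word in utterance.lower().split():
        verb = verb or VERBS.get(word)
        obj = obj or NAMES.get(word)
    return (verb, obj) if verb and obj else None

assert interpret("why don't you flirt with Trip") == ("FLIRT", "TRIP")
assert interpret("tease Grace a little") == ("FLIRT", "GRACE")
assert interpret("dance with the lamp") is None  # shown to the player as a mis-parse
```

Even this crude keyword mapping shows the appeal: the player keeps her own words while the characters all operate over one shared formalism.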
June 7th, 2005 at 11:54 pm
I wish I had more time to respond, but great comments. And the NYT article rocks!
June 8th, 2005 at 3:40 am
does that mean you now think you should prototype the full drama?
Also, are there terms that separate agency in the twofold sense you are using, or perhaps I am reading too far into it? With NLP there seems to be a sense of agency:
- agency as in control over one’s own actions as player
- agency as in realising one’s own random or buried wordlist when one searches for words (oh, I wonder why I suddenly came up with that word).. perhaps this is not agency but parsed text input
- restrains, i.e. limits, the player, hence reducing the sense of agency
- takes away the ‘gee whiz, how did I think of that word’ feeling
i also wonder if a few parsed commands really constitute a language. They may not feel unnatural, because we only have to remember combinations of letters, but we are not in the languageworld unless there is a full descriptive linguistic way of thinking.
Hmm, are languages really defined by concepts that don’t slip easily between different languages? If so, if languages are defined by their unique special meanings, then a parsed language is not really a language unless we can only completely think inside that parseable (a word?!) languageworld.
Sorry for a bit of philosophizing there. Enjoyed the notes.
June 8th, 2005 at 6:16 am
Michael Mateas wrote:
I agree with interactive drama being hard, but I’m afraid that if you actually do it the way you say you do it here, you’re making it too hard for me, as a writer. To me, applying the modeling and open-world simulation of games to “character psychology” rather than “character physics” seems impossible. I’m not saying that you can’t do it, but I can tell you why it’s too hard for me.
In Spore, the player gets a GUI to design animatable characters. Let’s say that I design a two-headed, three-legged character called Love Child. Because of the genius of Will Wright and his colleagues, I can actually get my character to move from A to B, though it will move in a fairly weird way. Haha, that looks funny! Do I know what it means? Sure I do – it means that the laws of physics are in effect in this game. Since on the box it said “Create interactive characters and see them buckle under realistic gravity!”, I expected this to happen. Since the whole effect is accomplished by numerical calculations, and the context doesn’t shift (as long as the laws of physics don’t change), Will Wright’s system has to create the right numbers and types and pass them around, but he never has to explain their meaning to the player, since the player is expected to do that part of the job herself. As long as the developers get their numbers right, this kind of modeling and simulation will work within the limits of this kind of application, for the foreseeable future.
But what if I get the GUI to design Love Child, but on the box it says “Create interactive characters and talk to them!”? I create Love Child and watch him hobble along; “Haha, you look funny!”, I tell him. And what does he reply? “What do you expect, you cruel idiot?! You only gave me three legs, and those two heads aren’t doing much for my balance, either!”
Can we get there? I don’t know, but I know that this sets the direction in which I want to move. Am I trying to get there by applying the modeling and simulation used in current games to “character psychology” instead of “character physics”? Hell no – as a writer, I stand no chance to do that with numbers. Maybe you can figure out how to do that, but I’ll have to use words instead of numbers, and instead of using the laws of physics, I’ll have to do my modeling and simulation using the laws of storytelling.
Like the laws of physics, the laws of storytelling are based on the assumption of a general cause/effect relationship. In non-interactive stories, writers are free to use this property of stories to cause all kinds of dramatic effects: 1. something happens, the cause of which the audience can’t possibly know – 2. later, something else happens, also by unknown cause, but which the audience might or might not feel has a connection to 1 – 3. finally, something happens that shines a wholly different light on everything that happened so far; in a moment of revelation, the cause/effect relationships of all past events become clear (provided that the audience has been active the whole time).
But if the audience is not only expected to be active, but is expected to be interactive, I can’t write it that way. If I’m an interactor in an interactive story, and something happens that I don’t understand, I expect to be able to ask a fellow interactor – whether human or virtual – who has witnessed the same event why this has happened. And if they can’t tell me, I want to know how that is motivated. And if that motivation seems interesting enough, I might want to know the motivation for that, and so on…
Even worse, if I see a creature that moves in a funny way, and the creature is supposed to be “intelligent”, and supposed to understand natural language, I expect it to be able to give an intelligent answer to the question “Why do you hobble around so funnily?”, even if I know perfectly well why, because I made it that way! To please me, the creators have to explicitly explain the cause/effect relationships for everything that can possibly happen in an interactive story. I do know that some engineers claim that they can do interactive characters by abstracting away details, by averaging, by using “dialog acts” and “drama managers”. Good for you. But I can’t possibly understand how you can cause this effect, given the fact that I, as a writer, work in a world where “average” is antithetical to “character”, and where the detail is virtually all that matters. However, if you can pull this off, I’ll be full of reverence.
June 8th, 2005 at 7:30 am
ErikC, directed at Michael, asked:
I’ll hijack this :-P. All the tellers of non-interactive stories I know prototype their drama. Most storytelling teachers teach the use of prototyping tools for storytelling, such as exposés, treatments, story layouts, story event networks built from index cards and pieces of string, and various other graphing formats. My beloved Dramatica can be regarded as a rapid prototyping tool for stories.
I think (and I’ve witnessed) that the question of how to express such prototypes in interactive code is currently being attacked from various angles. Michael’s formulation of the goal – a machine that outputs abstract representations of story events, given inputs of various types – works very well for me. I even went as far as choosing a prototype-based over a class-based system architecture, so all newly-created objects start out as renamed copies of prototypes, and, after modification, can themselves be prototypes.
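The clone-and-rename scheme described here can be sketched in a few lines; the field names and characters below are invented for illustration, not taken from any actual system.

```python
import copy

# Minimal sketch of a prototype-based object scheme: new objects start
# out as renamed deep copies of a prototype, and a modified copy can
# itself serve as a prototype. All fields here are invented.

def clone(prototype, name, **overrides):
    obj = copy.deepcopy(prototype)
    obj["name"] = name
    obj.update(overrides)
    return obj

character = {"name": "character", "mood": "neutral", "goals": []}
villain = clone(character, "villain", mood="angry")
henchman = clone(villain, "henchman")   # a modified clone is itself a prototype

assert henchman["mood"] == "angry"      # inherited from the villain prototype
assert henchman["name"] == "henchman"
assert character["mood"] == "neutral"   # prototypes are unaffected by clones
```

The appeal for authoring is that any interesting object you build along the way immediately becomes reusable raw material, with no class hierarchy to plan in advance.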
June 8th, 2005 at 4:26 pm
I really like this idea.
The first objection might be that it makes parsing bugs transparent to the user. But this could (maybe, possibly) be turned into an advantage by allowing the user to correct the parser’s mistake. Maybe this could be a user-friendly method of giving input to some form of a learning parser.
Then you get the game to asynchronously share data it’s learned on translating natural language with other copies of the game (or a central server which processes the data and shares a summary form with clients) …
June 9th, 2005 at 2:51 am
Problem: User A corrects “parser mistake” X one way, User B corrects the same X another way. What does the program do?
Hm – maybe give each client a separate parser, which they can customize? As long as it’s guaranteed that each client runs in its own thread, each could evolve its own language…
Problem with the above “solution”: networked games installed on public terminals, e.g. at libraries. You’d probably have to have a non-customizable parser set as a default, with a “customize parser” option in the “advanced” settings.
Oh, and have the “advanced” settings turned off by default, and only turn-on-able using a password.
June 9th, 2005 at 4:56 am
Ok, another go at defining Interactive Story: a story which, in its own unfolding, reflects the player’s behavior, by giving the player an evaluated interpretation (evaluated using the values that the creators have defined for that story) of her behavior. A story which finds closure at the moment the player recognizes herself: “Oh, that’s what I did!”
June 9th, 2005 at 1:08 pm
Dirk – “To please me, the creators have to explicitly explain the cause/effect relationships for everything that can possibly happen in an interactive story.”
Not everything that can possibly happen – everything that does happen. You don’t have to work out complex causal chains on the fly that way. Just give the system a start condition, and remember all the events and which events caused which. Then when the player asks why something happened, a character can tell them. Or not, if you want to have deception happening.
I think this approach is one of the most promising: center an interactive story system on the concept of each character as an artificial storyteller that remembers what they’ve seen and done and heard, and tells the other characters and the player parts of the story.
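The remember-what-caused-what idea above can be sketched very compactly: log each event with its causes as it occurs, then answer “why?” by walking the recorded chain. The events here are invented examples.

```python
# Sketch of the "remember what caused what" approach: record each event
# with its causes as it happens; answering "why?" is a walk over the
# recorded chain. Event names are invented for illustration.

class EventLog:
    def __init__(self):
        self.causes = {}          # event -> list of events that caused it

    def record(self, event, caused_by=()):
        self.causes[event] = list(caused_by)

    def explain(self, event):
        """Return the causal chain behind an event, oldest first."""
        chain = []
        def walk(e):
            for cause in self.causes.get(e, []):
                walk(cause)
                chain.append(cause)
        walk(event)
        return chain

log = EventLog()
log.record("Grace insulted Trip")
log.record("Trip stormed out", caused_by=["Grace insulted Trip"])
log.record("the party ended", caused_by=["Trip stormed out"])

assert log.explain("the party ended") == ["Grace insulted Trip", "Trip stormed out"]
```

A character answering the player would then narrate (or deceptively edit) some slice of this chain rather than reconstructing causality on the fly.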
June 9th, 2005 at 4:55 pm
I think that the ability of remembering and explaining external events is necessary, but not sufficient, Chris. What I’m interested in is how to simulate that a virtual character is emotionally affected by those events, why something makes a character angry, why something makes her cry. The same event makes a different character deliriously happy – why is that? What will happen if I side with one of them? What will happen if I try to change sides for some reason? What are the values that a character holds dear, and why? Will a character change to succeed? Will I have to change to succeed?
I don’t particularly like the perspective, either, but four years of experimental results indicate to me that any such depth has to be handcrafted in if I want it in. There’s neither technique nor technology that will enable a virtual character to make emotionally meaningful decisions or motivate it to take passionate action on its own. It’s all in the authoring.
Implementing artificial self-awareness is, as I’m experiencing it, a gruesomely laborious process. Tools will help, but most of them can only be built when the work process is mapped out much better, and much more is known about the actual results on humans; right now I’m afraid that more sophisticated tools will close off or cloak possible routes of exploration without me even noticing. Practice will definitely help; I expect to be much better at this in a couple years. Technically, everything works like I want it to; I can rig up a system that does quite fancy stuff with abstract prototypes in an amazingly short time now. The hard part really is the content creation – “motivate, motivate, motivate”. If I ever hear a screenwriting teacher use that phrase again, I might tell him that he doesn’t know what he’s talking about until he writes stories with characters in them whose motivation is completely unknown.
June 9th, 2005 at 5:07 pm
great post michael (and great responses)..
having spent time working with will & the spore folks specifically on prototypes, they do in fact prototype nearly everything (as he suggests), even rather complex AI-based scenarios.. this does require lots of imagination, but perhaps not so much more than imagining a flock of 2d pixels as fully-fleshed 3d characters might.. from a design perspective, it’s a very powerful way of working — it often becomes clear quite quickly when an idea is not going to fly, before significant resources have been spent on it.. we’ve used this approach on rapunsel as well.. though one thing becomes very apparent: you need the right language &/or framework w’ which to build such prototypes or the process falls apart.. so it’s great to hear dirk mention prototype-based languages, an approach i think has significant potential here.. [as a side note: i’ve been working recently on such a language for web-shareable content (untyped/class-free w’ java/c syntax, compiling to java byte-code) and hope to have a version on sourceforge in the next couple of months.. perhaps i can find some alpha testers here ;) ]
June 10th, 2005 at 6:57 pm
Use a statistical or layered learning system of some sort. The system could weigh the user’s own corrections much higher, but use data pooled at the server to fill in the gaps.
(One of those pie-in-the-sky ideas that should maybe work, but I haven’t given any thought to how much computing power it would take to pull it off … maybe a lot.)
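One way to read the weighting suggestion above: score each candidate interpretation by pooled server counts plus the user’s own corrections, with local evidence counted several times more heavily. The weight, phrases, and symbol names below are all invented for illustration.

```python
# Rough sketch of the layered-weighting idea: combine the user's own
# parser corrections with data pooled at the server, counting local
# evidence more heavily. Weight and phrases are invented assumptions.

LOCAL_WEIGHT = 5.0   # one local correction outweighs several pooled ones

def best_interpretation(phrase, local_counts, server_counts):
    """Pick the interpretation with the highest weighted vote, or None."""
    scores = {}
    for interp, n in server_counts.get(phrase, {}).items():
        scores[interp] = scores.get(interp, 0.0) + n
    for interp, n in local_counts.get(phrase, {}).items():
        scores[interp] = scores.get(interp, 0.0) + n * LOCAL_WEIGHT
    return max(scores, key=scores.get) if scores else None

server = {"chat up": {"FLIRT": 12, "GREET": 3}}
local = {"chat up": {"GREET": 4}}          # this user corrected it to GREET

assert best_interpretation("chat up", local, server) == "GREET"   # 3 + 4*5 = 23 > 12
assert best_interpretation("chat up", {}, server) == "FLIRT"      # fall back to the pool
```

This also resolves the User A vs. User B conflict raised earlier: each client keeps its own local counts, so the same pooled data yields different, personalized interpretations.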