May 24, 2005

Toward Authentically Interactive Characters and Stories

by Andrew Stern · 3:07 pm

Janet Murray asked for the answers I would have given to the questions I posed to Warren Spector, Neil Young and Tim Schafer at the recent GDC panel, Why Isn’t the Game Industry Making Interactive Stories? I found it useful for myself to write these out, to clarify my own thinking, and to hopefully get feedback from anyone interested.

I’ll try to be succinct and specific. These answers are informed by my experience over the past 13 years developing interactive characters and stories and closely following the industry and academic R&D in the field, helping me identify what I believe is important and what’s not. (Also I’m guessing these would be answers similar to what Michael would have said had he been given more time to participate in the actual panel discussion.) For some background on the panel, you may first want to read what the panelists said: 1 2 3.

Question 1: What do you consider the most important qualities and pleasures we *don’t* yet find in today’s interactive entertainment? And why are they needed?

Boiling it down, I see three major areas sorely deficient in today’s games, that if given substantial attention from game developers, e.g. 3+ focused years of R&D, I believe would lead to some true progress toward creating authentically interactive, much more satisfying characters and stories.

1. We need engineering techniques to manage multiple, varied “chains of events” that can intermix and be pursued in any order; ideally the game would be designed to allow the player to be the agent primarily in control of the order in which they are pursued, so as to react to what the player wants to make happen. These intermixable “chains of events” involve NPCs, and can vary in scope and size. They can be small chains, lasting 10 seconds to a minute, such as short-term NPC behaviors to explore a room, to describe how they’re feeling, extended use of objects such as loading weapons or fixing drinks, to sob for a few moments, etc.; characters need to be able to start these behaviors, get interrupted, come back to them, do multiple behaviors simultaneously, and deal with conflicts appropriately. … These event chains can be bigger in scope, such as conversations about specific topics, that last many minutes or longer; any one conversation needs to be able to go off in multiple directions; multiple conversations need to be able to be active, get interrupted, resumed, intelligently aborted if necessary, etc. Larger scope event chains include subplots, such as spending several minutes accomplishing a complex task such as driving across town and interrogating a suspect, or making dinner and proposing marriage, or digging through the ground to discover a secret tunnel and then uncovering treasure. The system needs to be able to keep track of and manage all of these event chains in a coherent, dramatically paced manner. These behaviors / conversations / subplots, while varying in scope, can have overlap in their underlying structure, and therefore can overlap in implementation technique. Their authoring needs to be a tractable production process, and avoid becoming a QA (testing) nightmare; testing becomes a major production issue as complexity of behavior increases.

2. Players need a far broader array of discourse acts that they can express: the things they can say and do, what Chris Crawford calls verbs. The de facto game controller interface, a joystick and a few buttons, isn’t expressive enough. (Requiring players to do ridiculously convoluted button combos to increase their options is just awful; I can’t believe players put up with that. Choosing from menus is a little better, but not much.) Natural language via a keyboard is probably the only viable interface for increasing the player’s expressive range; voice recognition in an emotional game situation isn’t here yet, and may not be for quite a while. The real technical challenge with increasing player expression via language is in making the system capable of understanding it all, and reacting in meaningful, context-sensitive ways (discourse management). Natural language understanding alone is a major technical challenge; pragmatic, robust solutions that are at least minimally capable will need to be developed. The required variety of reactions can be supported by implementing the intermixable chains of events above; more complex event chains go hand-in-hand with increased player expression and discourse management.

3. We need techniques for more expressive, procedural faces and bodies. The best existing game development, e.g. the Half-Life 2 character engine, is pushing in this direction somewhat, although it may not be as procedural yet as it needs to be for truly reactive and dynamic characters. While challenging to accomplish, of the three areas listed here it is the best understood and the most achievable in the short term.
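To make the first area a bit more concrete, here is a minimal sketch in Python of what "start, get interrupted, come back, run simultaneously, deal with conflicts" could look like as a data structure. Everything in it is an assumption for illustration: the names `Chain` and `ChainManager`, the idea of tagging each chain with the "resources" it occupies (hands, voice), and the simple priority rule are hypothetical, not a description of ABL, Façade, or any shipped engine.

```python
from dataclasses import dataclass, field


@dataclass
class Chain:
    """One chain of events: ordered steps plus the resources it occupies."""
    name: str
    steps: list[str]
    resources: frozenset[str]   # hypothetical labels such as "hands" or "voice"
    priority: int = 0
    position: int = 0           # next step to run; survives interruptions

    @property
    def done(self) -> bool:
        return self.position >= len(self.steps)


@dataclass
class ChainManager:
    active: list[Chain] = field(default_factory=list)
    suspended: list[Chain] = field(default_factory=list)

    def start(self, chain: Chain) -> bool:
        """Start a chain; lower-priority chains using the same resources are suspended."""
        conflicts = [c for c in self.active if c.resources & chain.resources]
        if any(c.priority >= chain.priority for c in conflicts):
            self.suspended.append(chain)    # it waits rather than being dropped
            return False
        for c in conflicts:
            self.active.remove(c)
            self.suspended.append(c)
        self.active.append(chain)
        return True

    def tick(self) -> list[str]:
        """Advance each active chain by one step, then resume whatever is now free."""
        log = []
        for c in list(self.active):
            if not c.done:
                log.append(f"{c.name}: {c.steps[c.position]}")
                c.position += 1
            if c.done:
                self.active.remove(c)
        # Resume suspended chains, highest priority first, when nothing blocks them.
        for c in sorted(self.suspended, key=lambda ch: -ch.priority):
            busy = set().union(*(a.resources for a in self.active))
            if not (c.resources & busy):
                self.suspended.remove(c)
                self.active.append(c)
        return log


# Usage: two chains run side by side until a phone call interrupts both.
mgr = ChainManager()
mgr.start(Chain("fix drink", ["take glass", "pour", "serve"], frozenset({"hands"})))
mgr.start(Chain("small talk", ["ask about day", "react"], frozenset({"voice"})))
print(mgr.tick())
mgr.start(Chain("answer phone", ["pick up", "hang up"],
                frozenset({"hands", "voice"}), priority=1))
for _ in range(4):
    print(mgr.tick())
```

The interrupted chains keep their `position`, so the character goes back to pouring the drink after the call instead of starting over. A real system would also need transitions in and out of an interruption ("sorry, where was I?"), which is where most of the authoring and testing burden described above would land.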

Question 2: Why haven’t these qualities been achieved? What are the obstacles, how daunting? Are they so technically difficult that incremental progress is not possible? Are players happy enough with what they’ve got? Too great a divide between programmers and designers?

The primary reason these haven’t been accomplished is simple: these qualities are complex. It’s not because developers and players don’t want them, it’s because they’re complicated and onerous to implement, in terms of both design and technology. Game developers over the years have taken stabs at these qualities, but are forced to back off when the complexity begins to conflict with 12-to-18 month production cycles.

Unlike graphics and physics, the underlying “laws” of character behavior, narrative event chains and discourse management are not deeply understood or easily modelled. (Facial and body expression is better understood, thanks to years of animation production; this book published in 1981 is one of the earlier bibles on the subject.)

When it comes down to it, these qualities are fundamentally about people, not about objects and environments. People are very complicated, of course — that’s why they’re so interesting! But although they’re complicated, implementing interesting virtual versions of them should not be intractable. It should be possible to make abstracted, dramatized, simplified but effective versions of people. This is at the heart of Chris Crawford’s new book, and why it’s worth reading. The Sims is a good beginning, but they’re too simple; to more deeply engage the player, virtual characters and stories need to become more personal, first person and directly interactive.

Ultimately we’re talking about creating characters, worlds and systems that have the flexibility and generative power of simulations, but are designed to temporally progress at a good pace, perhaps integrated with drama management, to keep the pace of the experience moving forward and avoiding play that devolves into repetitive, rote labor or long stretches of inaction — some of that dramatic compression Janet is talking about. One might call these “directed simulations”.
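As a rough illustration of what "directed" could mean here, the sketch below lets a simulation propose candidate events and has a small drama manager pick the one that keeps tension closest to an authored arc, while penalising recent repeats. The tension annotations, the shape of the arc, the 0.5 repetition penalty, and the names `Event`, `target_tension` and `pick_next` are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    tension_change: float   # hypothetical authored annotation


def target_tension(progress: float) -> float:
    """Assumed dramatic arc: rise to a peak at 80% of the experience, then fall."""
    peak = 0.8
    if progress <= peak:
        return progress / peak
    return (1.0 - progress) / (1.0 - peak)


def pick_next(candidates, tension, progress, recent=frozenset()):
    """Pick the event whose outcome lands closest to the target arc.

    Recently used events are penalised so play does not become rote repetition.
    """
    def score(event):
        miss = abs((tension + event.tension_change) - target_tension(progress))
        penalty = 0.5 if event.name in recent else 0.0
        return miss + penalty

    return min(candidates, key=score)


# Usage: the same candidates yield different choices at different points in the arc.
options = [Event("idle chat", 0.0), Event("accusation", 0.3), Event("apology", -0.2)]
print(pick_next(options, tension=0.2, progress=0.4).name)   # accusation
print(pick_next(options, tension=0.9, progress=0.9).name)   # apology
```

The simulation still generates the candidates, so the player's actions decide what is possible; the manager only nudges which possibility happens next.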

Sustained R&D and product experimentation is the required path to get there. In a future post I’m going to pitch an idea of how professional game developers and academic research groups could get together in a new way, to accomplish this kind of thing.

Are today’s players content? On the whole, no… I think many are running out of patience, something has got to happen. We’ve got this latest round of game consoles hitting the market over the next year or so, which will dazzle us with higher fidelity physical objects, environments, and shells of characters — nicely rendered minimally-interactive animatronic mannequins, and gorgeous, pea-brained monsters.

We’ve talked about the need to close the artist-programmer divide.

Question 3: Give a realistic prescription for how to surmount these obstacles in the foreseeable future. Or, describe how this is not possible, or the wrong direction. Near-term technical and design milestones to shoot for? How can publishers become more experimental? Do we need to train a new generation of designer-programmers?

I’d wager this will require a big coordinated effort, not merely bits and pieces of progress here and there — like a mini-Human Genome project, or specifically, a Homme Virtuel project. Starting with, say the HL2 world engine, building the required AI from scratch would take 5+ years. Building on academic research results would be better, saving ~2 years of work, I’d guess. Recapping the conclusion of my Hard to Believe post from last December:

If someone said to me, Andrew, tell me what you need to put together a team to do this — a Manhattan project for believable characters — here is what I’d say I’d need, to create just one really, really good believable character:

* 3-4 talented, experienced behavior programmers — very hard to find
* 2-3 talented, experienced character animators
* 2-3 talented, experienced procedural animation programmers, with knowledge of existing techniques
* A behavior language such as ABL, or Zoesis’ tech, or something equivalent
* A producer to manage the team
* A creative director
* Office space
* 12 months to assemble the team
* 24 months of production

This would probably cost about 3 million dollars.

Once the first character has been created, it would probably only take 25% of the original effort to make an additional character. And so on.

This would be difficult, relatively risky R&D.

Add in basic (not overly complex, but basic) natural language understanding and generation, and we’re talking 24-36 months of 3-4 talented, experienced engineers and writers, starting with the best of today’s NL technology, adding about another two mil to the budget. (The $5 million virtual man… we can build him… we have the technology…)
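For anyone who wants to play with these back-of-the-envelope numbers, here is the arithmetic as a few lines of Python. Two assumptions go beyond what I wrote above: that cost scales directly with effort, and that every additional character costs a flat 25% of the first one (the "and so on" could also be read as costs continuing to fall). The function name `estimated_budget` is just for this sketch.

```python
FIRST_CHARACTER_COST = 3_000_000    # "about 3 million dollars"
EXTRA_CHARACTER_PERCENT = 25        # "25% of the original effort"
LANGUAGE_COST = 2_000_000           # "about another two mil"


def estimated_budget(characters: int, with_language: bool = True) -> int:
    """Rough total in dollars, assuming a flat cost for each additional character."""
    if characters < 1:
        raise ValueError("need at least one character")
    extra_each = FIRST_CHARACTER_COST * EXTRA_CHARACTER_PERCENT // 100
    total = FIRST_CHARACTER_COST + (characters - 1) * extra_each
    if with_language:
        total += LANGUAGE_COST      # assumed to be paid once, not per character
    return total


print(estimated_budget(1))                        # 5000000
print(estimated_budget(1, with_language=False))   # 3000000
print(estimated_budget(3))                        # 6500000
```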

Realistically, could this happen? Sure. With the right team and the right business plan, I like to believe one could find investors interested in such a venture. (Existing game companies? Probably not.)

Over time, we’d probably see a cycle of try-and-fail attempts that can hopefully build upon each other. Zoesis is the most substantial first attempt I know of.

On designer-programmers — we’ve talked about the need to train a new generation of them.

Janet’s reverse question: What moments in people’s own gameplay have created intense story pleasure or have aroused story expectations and succeeded or failed in satisfying them in a memorable way?

I’ll just say that I’m hard to please in this area, with an allergic reaction to contrived interfaces and non-natural modes of interaction. I don’t have any personal success stories to report. When it comes to experiencing intense interactive story pleasure, I’m still a virgin. :-) Façade included, I’m afraid.

(Certainly as a kid I’ve had aroused story expectations, from the best C-64 games and from text-based IF, akin to what is described in chapter 16 of Richard Powers’ Plowing the Dark.)

I’m always interested to hear what specific, fundamental pleasurable qualities of stories you’d like to see in games and interactive entertainment, e.g. this from yesterday. Try to be specific, not overly hand-wavy if you can.