January 30, 2004

Emotion in games

by Michael Mateas · 4:09 pm

There’s a new article at MSNBC.com on the future of emotion in games, a topic we like to talk about here on GTxA. A variety of game developers and researchers are quoted, including Andrew and me. It describes our work on Facade as an example of the advances in AI required to support emotion-rich game experiences.

16 Responses to “Emotion in games”


  1. andrew Says:

    Jonathan Sykes, one of the people interviewed in the article, is this month’s contributor to the IGDA’s The Ivory Tower column, writing on the topic of “Affective Gaming”.

    Complementary to these emotion-in-games articles is a PC Gamer article, Graphics R.I.P.? “In some [games], it’s clear that we’ve reached [the graphics ceiling] … How much actual use will further graphical excess be in advancing the genre? … the days when graphics ruled videogames are rapidly drawing to a close.”

  2. ian Says:

    Michael & Andrew —

    I agree that we need new AI models that encapsulate emotional representation instead of, say, ragdoll physics to advance the emotional game space.

    I haven’t read the complete ABL specification (I have read “A Behavior Language for Story-based Believable Agents”), but as far as I can tell, ABL relies on success tests (boolean results). From what I understand, subgoals in ABL also wait for boolean return values, and the success of an entire beat is dependent on the success or failure of its subgoals, even if ABL allows certain subgoals to be deprioritized or even omitted (is that right?).

    I’m wondering how this procedural assumption positively or negatively affects the whole idea of “believable agents.” I’d contend that one thing we need to do is think about what procedural completion states mean. In a real social interaction, we could say that actions we conduct never really correspond with simple success or failure states… and in fact, these ambiguities are where everything interesting happens. Was she looking at me? Is he angry with me? Should I touch her shoulder, or her hand, or not at all?

    One possible answer is that the player does the work of mentally refining the game environment’s representations of the beat subgoal outcomes. I think this is a viable position and not a compromise, but it’s also not an excuse to remain satisfied with our current tools.

    Another possible answer is to seek out a procedural treatment of completion states that is more subtle than success/failure. One method I wonder whether you’ve considered is Bayesian analysis. Bayesian algorithms have become very popular in anti-spam software, which essentially responds to input in shades of gray based on patterns learned from previous inputs.
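
    To make the “shades of gray” idea concrete, here’s a toy sketch (in Python, with invented cue probabilities, and nothing like real ABL code) of how a naïve-Bayes-style score yields a graded confidence rather than a bare boolean:

        import math

        # Toy "social signal" classifier: instead of returning True/False for
        # "is she annoyed with me?", it returns a probability based on observed cues.
        # The cue likelihoods and prior below are invented purely for illustration.
        CUE_LIKELIHOODS = {            # P(cue | annoyed), P(cue | not annoyed)
            "short_reply": (0.70, 0.30),
            "avoids_gaze": (0.60, 0.25),
            "smiles":      (0.10, 0.55),
        }
        PRIOR_ANNOYED = 0.3            # prior belief before seeing any cues

        def p_annoyed(observed_cues):
            """Return P(annoyed | cues) via a naive Bayes combination of the cues."""
            log_odds = math.log(PRIOR_ANNOYED / (1 - PRIOR_ANNOYED))
            for cue in observed_cues:
                p_if_annoyed, p_if_not = CUE_LIKELIHOODS[cue]
                log_odds += math.log(p_if_annoyed / p_if_not)
            return 1 / (1 + math.exp(-log_odds))

        print(p_annoyed(["short_reply"]))                  # ~0.5, mildly suspicious
        print(p_annoyed(["short_reply", "avoids_gaze"]))   # ~0.7, more confident
        print(p_annoyed(["smiles"]))                       # ~0.07, probably fine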

    At the level of code, the author would still wire up responses to subgoal results based on logic tests (after all, that’s how code works), but wider variants could be explored.

    Now, I do understand that, hypothetically, the ABL author could create a series of more subtle or granular subgoals and use their return status to edge the larger experience in this general direction. But, looking at the structure of ABL, I’d say this would be an extremely laborious process.

    Finally, you two have spent a LOT more time on this than I have, so forgive me if I’m oversimplifying. The complexity and subtlety of representation ABL provides is quite apparent to me, as I hope I’ve made clear in these comments.

  3. michael Says:

    The completion of a behavior (and hence of a subgoal) doesn’t require that a certain boolean condition pertain in the world. Rather, a behavior completes when all of its steps have completed with success, where completing with success can often just mean executing, with no tests at all. For example, a behavior to express that you’re angry might contain several physical acts to move the hands and change the facial expression in such a way as to show that you’re angry. No formal specification of “angry” as a test of some world model is required. The author writes expressive behaviors, and the decision-making logic of ABL sequences (including mixing together) these behaviors for you.

    Success tests and context conditions are continuously monitored tests you can write to spontaneously succeed a step or fail a behavior if certain changes in the world are detected. The author is not required to write such tests for all steps or behaviors, but can include them if it wouldn’t be believable for a character to continue some behavior under certain conditions. For example, a behavior in which a character expresses their anger by yelling at the player probably shouldn’t continue if the player walks out of the room. A context condition could be written to fail the behavior if the player walks out on the character, perhaps causing the selection of another behavior (to accomplish the same subgoal) in which the character follows, saying “I can’t believe you’re walking out on me”, etc. But, if the player doesn’t walk out, just completing the initial yelling behavior would constitute success, thus causing the success of the parent subgoal.

    Another way of thinking about it is that success and failure (and the way they propagate within the active behavior tree) are just control flow constructs, resources available to the ABL programmer for structuring behavioral dynamics. This is quite different from modeling the world and needing to define simple boolean tests that capture complex social situations like “have I made you jealous”, or “have they fallen in love”, or “have I successfully shown that I feel hopeful”, “should I touch her shoulder”, etc. This isn’t captured in simple tests; it’s captured as procedural knowledge distributed among a bunch of behaviors that dynamically subgoal each other in complicated ways. Success tests and context conditions are used to maintain reactivity to changes in the world during behavior performance. Preconditions are used to dynamically decide which behavior to try to accomplish a subgoal (and subsequent changes in the world may cause the character to dynamically re-choose a different behavior). These tests aren’t used to unequivocally determine whether certain predicates obtain in the world.
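
    To make the yelling example a bit more concrete for readers who haven’t seen ABL, here’s a very rough sketch in Python (not actual ABL syntax; the names and structure are invented for illustration). The behavior succeeds simply by finishing its steps, a context condition can fail it mid-performance, and a precondition selects an alternative behavior for the same subgoal:

        # Rough Python sketch (not real ABL) of the yelling example above.
        # World state is faked with a dict; the names are invented for illustration.

        world = {"player_in_room": True}

        class BehaviorFailed(Exception):
            pass

        def yell_at_player():
            """Succeeds simply by performing its steps, unless the context
            condition (player still in the room) stops holding mid-performance."""
            for step in ["glare at player", "raise voice", "deliver angry line"]:
                if not world["player_in_room"]:      # context condition violated
                    raise BehaviorFailed("player walked out")
                print("performing:", step)
            # no world-model test of "anger achieved"; finishing the steps is success

        def follow_and_complain():
            print("performing: follow the player, saying 'I can't believe you're walking out on me'")

        def express_anger_subgoal():
            """Subgoal with two alternative behaviors: if the first fails,
            a fallback whose precondition holds is chosen instead."""
            try:
                yell_at_player()
            except BehaviorFailed:
                if not world["player_in_room"]:      # precondition of the fallback
                    follow_and_complain()

        express_anger_subgoal()                      # player stays: yelling completes
        world["player_in_room"] = False
        express_anger_subgoal()                      # player gone: fallback is chosen

    In actual ABL the success tests and context conditions are monitored continuously by the runtime rather than polled by hand like this, and the choice among alternative behaviors for a subgoal is handled by the decision-making logic rather than an explicit try/except.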

    I’m familiar with a number of Bayesian methods (e.g. naïve Bayes classifiers, which I used along with k-nearest-neighbor classifiers for Office Plant #1, Bayesian networks, etc.). I’m currently considering a number of strategies for adding statistical learning methods to ABL, though for now I’m mostly thinking about reinforcement learning.

  4. ian Says:

    Michael — this makes sense. Sometime I’ll have to look deeper under the hood so I understand where the code couples with the world.

    I was confusing narrative control flow with narrative timbre, so to speak. Where would you say emotional response “happens” in ABL-authored systems? I guess it has a lot to do with authorship, and determining how to couple and decouple narrative subgoals to one another?

  5. andrew Says:

    The MSNBC article was slashdotted, and followed up by a discussion in which commenters focused on their memorable emotional moments in games.

    Ian, I’d like to write some follow-up comments to Michael’s, hopefully later tonight or tomorrow…

  6. michael Says:

    Ian, we have authored a number of “low-level” utility behaviors to help us portray mood. As different beat goals happen (subgoals within beats that carry out the story functions of the beat), immediate and longer term emotional responses are set. The longer term responses last longer than a single beat goal (and may cross beat boundaries). The emotion behaviors work to portray the current emotional state by changing facial expression, perhaps mixing in small physical actions, and so forth. This blends with the beat-specific behaviors. So, for example, there may be a beat goal in which Grace invites the player to sit down on the couch. The base performance of this is friendly, but imagine that recently something else has happened to make Grace angry. In this case the performance of the couch invitation would come out as friendly but with a mixture of tension. The detailed body-control behaviors are flexible enough to do this kind of mixing (and more extreme mixing, like Trip performing the same lines while making a drink, even though those beat behaviors weren’t explicitly authored to support making a drink). With techniques like these we try to avoid a lockstep sense of the player mechanically moving through beat goals and beats, giving the characters an internal life that happens on top of the story specific behaviors.
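
    A crude sketch of the kind of blending I’m describing (in Python, with invented parameter names and numbers, much simpler than the actual body-control behaviors): a beat goal’s base performance gets tinted by whatever longer term mood is currently active:

        # Toy sketch of mixing a beat goal's base performance with a longer term mood.
        # Parameter names (warmth, tension) and numbers are invented for illustration.

        current_mood = {"tension": 0.6}    # set earlier, e.g. by something that angered Grace

        def blend(base_performance, mood, mood_weight=0.4):
            """Return expression parameters: mostly the base performance, tinted by the mood."""
            params = dict(base_performance)
            for channel, value in mood.items():
                params[channel] = params.get(channel, 0.0) * (1 - mood_weight) + value * mood_weight
            return params

        invite_to_couch_base = {"warmth": 0.8, "tension": 0.1}
        print(blend(invite_to_couch_base, current_mood))
        # friendly, but with a noticeable undercurrent of tension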

    Of course, ultimately the emotional response happens in the head of the player, in how she reads Grace’s and Trip’s activity.

  7. ian Says:

    Michael — thanks for the clarifications. It’s very slick. I’m going to have to think more about procedural emotion before I say more. I’m wondering if you can describe some of the low-level emotive behaviors, from a procedural perspective. For example, are we talking about things like “furrow brow” or “do angry face.” Clearly there are a lot of highly nuanced facial expressions, so I’m assuming the low-level detail is quite, well, detailed.

    On another note, how interesting were the comments on Slashdot! I guess Slashdotters are always looking to debunk things, but I found it interesting that the ones who voiced their opinions there were content to cite fear and anger and frustration as good enough evidence for emotion in games.

    There is one particularly interesting comment: “There is no emoticon for what I am feeling right now!” Emoticons aren’t procedural in the strict sense, but they are encapsulated. I wonder if we might use this as a goal for emotion in games: we’re talking about emotions that cannot be represented in a simple emoticon. The interaction between emotional signals in Facade relies heavily on authorship through low-level procedural actions.

    On yet another note, from a neuropsychological level, I’m hoping soon to look into the role of mirror neurons in game emotion. There may be some help to be had there.

  8. andrew Says:

    Just a quick comment on behavior-based programming (e.g., the Oz Project’s Hap, ABL), which is the topic of our upcoming GDC talk, by the way:

    ABL offers several ways to control program flow, such as parallel behaviors, success and failure of goals, preconditions and context conditions, etc. etc. that make it easy to write powerful and expressive behavior in relatively few lines of code (as well as behavior that goes haywire if you’re not careful).

    But ABL is “general” in the sense that you’re not forced to rely on those if you don’t want to. You’re not restricted to programming in a certain way; you can write C++-like functions in ABL if you want. For example, you can annotate a behavior (which is akin to a C++ function) with ignore_failure, which will stop failure from propagating up to parent behaviors (functions).
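
    As a minimal illustration (Python pseudocode, not real ABL syntax), ignore_failure amounts to letting an optional child behavior fail without failing its parent:

        # Python pseudocode, not real ABL: a child behavior annotated this way can
        # fail without failing its parent, so the parent keeps going.

        class Failure(Exception):
            pass

        def risky_flourish():
            raise Failure("prop not reachable")      # this optional step happens to fail

        def run_ignoring_failure(behavior):
            """Run a child behavior but swallow its failure, so the failure
            does not propagate up to the parent behavior."""
            try:
                behavior()
                return True
            except Failure:
                return False

        def greet_player():                          # parent behavior
            print("wave at player")
            run_ignoring_failure(risky_flourish)     # embellishment marked ignore_failure
            print("say hello")                       # parent still completes

        greet_player()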

    As an author, I try my best to write code that’s “natural” to ABL, e.g. taking best advantage of the properties of the active behavior tree, but sometimes I write code in a more traditional imperative style. In general though, while ABL gives great assistance in writing behaviors, much of the burden remains on the author to devise the particular control structures for the particular characters / narrative situations you’re working on.

    On the topic of “actions we conduct never really correspond with simple success or failure states…” One powerful technique that Scott Reilly (now my officemate at Zoesis) developed as part of the Oz Project was Em — infrastructure in which behaviors that express emotion are automatically triggered by success and/or failure of certain goals. This is based on the OCC psychological model of emotion. (To read more, on this page, scroll down to “Believable Social and Emotional Agents”). Depending on which goals you as the author decide should trigger emotion — again, careful authoring is key — it can work as a decent, if simplified, model of how people emotionally react to what’s going on.
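
    Here’s a toy sketch of the Em idea (in Python; this is not Scott’s actual code, and the goal names and numbers are invented): the author marks which goals carry emotional weight, and goal outcomes automatically raise OCC-style emotions:

        # Toy sketch of the Em idea (not Scott's actual code): the author marks which
        # goals carry emotional weight, and goal outcomes automatically raise
        # OCC-style emotions. Goal names and numbers are invented for illustration.

        EMOTIONAL_GOALS = {
            # goal name: (importance, agent blamed/credited for the outcome, if any)
            "get_player_to_stay": (0.8, "player"),
            "fix_the_drink":      (0.2, None),
        }

        emotions = {"joy": 0.0, "distress": 0.0, "anger": 0.0}

        def goal_outcome(goal, succeeded):
            """Called by the behavior system whenever a goal succeeds or fails."""
            if goal not in EMOTIONAL_GOALS:
                return                               # most goals carry no emotional charge
            importance, other_agent = EMOTIONAL_GOALS[goal]
            if succeeded:
                emotions["joy"] += importance
            else:
                emotions["distress"] += importance
                if other_agent is not None:          # failure attributed to someone else
                    emotions["anger"] += importance

        goal_outcome("fix_the_drink", succeeded=True)
        goal_outcome("get_player_to_stay", succeeded=False)
        print(emotions)    # {'joy': 0.2, 'distress': 0.8, 'anger': 0.8}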

    On our procedural facial expressions, they achieve their expressiveness by way of parallel behaviors that control the individual parts/regions of the face in a parameterized way, combined with the real-time renderer that adds additional randomness/noise/subtle motion to the expression. So the granularity of control is more detailed than just “do angry face”. (However sometimes we write higher-level behaviors that encapsulate a lot of expression into one behavior that can be subgoaled by narrative-oriented behaviors. When authoring we vary between using fewer higher- vs. more lower-level emotional expression behaviors, as needed.)
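
    And a rough sketch of the facial expression idea (again in Python, with invented region names and values, far cruder than the real parameterized face): parallel behaviors each contribute per-region targets, the targets are blended, and a little noise keeps the face from freezing:

        import random

        # Toy sketch of parameterized facial regions controlled in parallel, blended,
        # and given a little per-frame noise so the face never sits perfectly still.
        # Region names and values are invented for illustration.

        def angry_face(intensity):
            return {"brow_lower": 0.9 * intensity,
                    "eyelid_tighten": 0.6 * intensity,
                    "mouth_press": 0.7 * intensity}

        def concerned_face(intensity):
            return {"brow_raise_inner": 0.8 * intensity,
                    "mouth_corner_down": 0.5 * intensity}

        def blend(*expressions):
            """Combine per-region targets from parallel behaviors (max per region)."""
            out = {}
            for expr in expressions:
                for region, value in expr.items():
                    out[region] = max(out.get(region, 0.0), value)
            return out

        def add_noise(params, amount=0.03):
            """Subtle jitter, standing in for the renderer's added motion."""
            return {r: min(1.0, max(0.0, v + random.uniform(-amount, amount)))
                    for r, v in params.items()}

        frame = add_noise(blend(angry_face(0.4), concerned_face(0.7)))
        print(frame)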

    No one of these pieces is particularly complicated, but when you combine them all together you get behavior that seems somewhat alive.

  9. michael Says:

    Em indeed offers powerful authorial affordances. The most complete version of Em was implemented in the Oz text version of Hap. Scott implemented three worlds that made use of Em as an expressive resource: The Playground, in which the player is trading baseball cards on a playground with Melvin the nerd and Sluggo the bully; Office Politics, in which the player is embroiled in backstabbing office politics; and The Robbery, in which the player confronts an armed robber in a convenience mart. I used Em extensively for an Oz text world I built called Fastfood World, in which I explored the idea of a subjective avatar that actively filters the player’s perception of the world as a function of the avatar’s subjective state (including emotional state). The player is a GenXer stuck in a dead-end fast food job, lorded over by an abusive boss.

    I certainly intend to add Em (with some extensions I’ve been thinking about) to future versions of ABL.

  10. ian w Says:

    Sorry if this post is coming a little late but I figured that this would be the best place for me to jump into the conversation.

    By way of an introduction for those of you who I have not met or talked to: my particular areas of concentration are simulating and modeling non-cognitive emotional behavior that is consistent with personality, age and gender (hence this being a good topic to start on). I have developed a software engine for this purpose that is based on conceptual simulations of neuronal and hormonal systems.

    Probably like many of the readers here, I have been “evangelizing” to (read “pestering”) the game industry to adopt emotional behavior as a source of depth and differentiation in their products for quite a number of years now. As you have no doubt discussed in the past, there are a number of reasons why this has not yet really happened, primarily the lack of expertise in these areas.

    To be more precise, there are two categories I define within emotion in games: one is evoking emotional responses in the user, and the other (my area) is the simulation and generation of emotion in game characters. Both are unfortunately (for the reader) labeled “emotion in games”. Of course these two areas are distinct in the techniques required to achieve them, i.e. music and camera work in the former and convincing simulations of behavior in the latter. I see the MSNBC article mixes the two, which is not unreasonable as they are closely connected.

    My point to game developers, especially in the pre-Sims days, was that this kind of technology is essential not just to enhance your existing [insert genre here] products but, more importantly, to open up new markets, i.e. female players, casual players and the “mass market”. This has yet to really happen in a diverse way, as the technology is difficult to create, but applications like The Sims are showing the industry that this is where their main focus should be.

    However I do feel that we are at the “priming point” where all of the elements are in place to move to the next step in adding emotion to games. I believe this will start out simply, with the next step in technology for game developers being the addition of convincing facial gestures to characters. This will be the ice breaker. Once developers realize the power of deeply emotional interaction, and how much it can differentiate a product, I think a great deal of energy and resources will be turned to enhancing this element of interactive experiences. Anyway I certainly hope so, for all our sakes ;)

  11. andrew Says:

    Ian W, glad to hear your voice on GTxA! I can confirm that you have been proselytizing the need for emotional characters in games for quite some time now, e.g., at AAAI symposia where we’ve met.

    On the challenge and slow progress of adding emotional characters to interactive entertainment, i.e. more than just screams of pain as enemies die, and also emotion as part of realtime interactivity, not frozen in non-interactive cut scenes — sure, like you say, this is partly due to lack of expertise on the part of game developers on how to model emotions.

    But of course modeling is one part of the puzzle; the expression of emotion is at least as crucial, if not more. I’m sure you’d agree, effective expression of emotion must be more than just facial expression; to be convincing and high-impact, it needs to affect a character’s body language, dialog, etc., and even more deeply, its overall behavior. To take the example of a typical action game, for emotion to be properly read by the player, a virtual character should fight / run / etc. in significantly different ways if scared vs. angry vs. confident. This is very difficult to accomplish, both technically and production-cost-wise. Technically it requires the ability to do varied and dynamic high-impact facial expressions, body language and action. And these all need to fluidly combine. (I’m not talking about photorealism here, btw.) Perhaps the only way to do this effectively and relatively cheaply is through procedural animation, or a blend of procedural and keyframe. The “low-level” control of procedural bodies and faces requires some degree of AI behavior, e.g., a reactive planner, akin to controlling a robot.

    And the higher-level AI behavior needs to be modulated by emotion as well, e.g. deciding what to do next based on how I feel. (This is separate from the model of emotion itself, which is keeping track of what emotions are happening, but not deciding what to do next based on those emotions.)
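
    A toy sketch of that modulation (in Python, with invented action and emotion names): the same decision point yields different actions depending on the character’s current emotional state:

        # Toy sketch of emotion-modulated decision making: the same situation yields
        # different actions depending on the character's current emotional state.
        # Action names, emotion names and weights are invented for illustration.

        ACTION_BIASES = {
            # action: which emotions make it more appealing
            "charge_enemy":     {"anger": 1.0, "confidence": 0.8},
            "take_cover":       {"fear": 1.0},
            "retreat_shouting": {"fear": 0.6, "anger": 0.6},
        }

        def choose_action(emotional_state):
            def score(action):
                return sum(emotional_state.get(emotion, 0.0) * weight
                           for emotion, weight in ACTION_BIASES[action].items())
            return max(ACTION_BIASES, key=score)

        print(choose_action({"fear": 0.7, "anger": 0.2, "confidence": 0.1}))   # take_cover
        print(choose_action({"fear": 0.1, "anger": 0.9, "confidence": 0.6}))   # charge_enemy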

    There are many academic research groups (and presumably some game industry R&D teams in secret) working on the low-level expression technology (and some on the higher-level decision-making technology as well), but it’s very difficult and slow-going. But it’s key for creating convincing emotional characters. In my understanding, this lack of non-trivial expression technology is a primary reason we haven’t seen sophisticated emotional characters yet. Without it, it would be like having really bad, flat actors performing Hamlet.

    I’d argue that a very expressive animation system like the one above would be high impact, even with a simple model of emotion driving it under the hood. That’s not to say that we shouldn’t develop more sophisticated models of emotion, just that from the player’s perspective (which ultimately is all that matters), expression is the first requirement, and sophisticated internal modeling comes second.

    Secondly, with few exceptions, game designs to date haven’t integrated emotion-laden situations well, other than fear, conquest, visceral thrills and frustration. That is, even if the technology existed to believably express more sophisticated realtime emotion, e.g. things like love, jealousy, pity, shame, etc., along with the underlying emotion models to drive them, current game designs wouldn’t put them to much use (in a realtime way, not in a cut-scene way). This is of course partially because game designs to date haven’t bothered trying, since the realtime expression & modeling technology isn’t there yet. But additionally this is because it’s just plain difficult to design deeply interactive, interesting, emotional, non-action games, e.g. an interactive love story, or a family drama, or a Greek / Shakespearean drama.

    I wouldn’t say the lack of emotional characters in games is because game developers don’t care about that or don’t want that. They do, they want it, but have no easy path to make it happen. Combine that with conservative game companies who don’t want to do much risky R&D, and you end up in the position we’re in today.

    Are we at the priming point? Maybe there’s enough expressive technology just coming out now that game companies can get started, e.g., facial expression technology. But as far as I know there is no strong solution out there yet for realtime, dynamic, emotionally expressive procedural body animation.

  12. andrew Says:

    Jesper Juul reports that his vote for top event at Imagina 2004, a European media tradeshow/conference, was a panel with several noted game designers and academics called, “Video games: Where do we go next”, in which “the meta-issue became the question of how to have more interesting emotional content in video games – which isn’t easy, of course.”

  13. ian wilson Says:

    I would definitely agree with the spirit of your comments, but rather than needing animation systems before emotion systems or vice versa, it is more a “chicken and egg” situation in that you need the one to drive the other. Both systems are required together.

    With my own work (please take a look at my web site here), funnily enough (well, actually not funny at all) I have been struggling with this very issue for a number of years; in fact my technology was ready 4 years ago, except that I could not find an animation system to both test and demonstrate what my emotion engine could generate (and my funding was exhausted). So it sat gathering dust for 3 years until I finally found a procedural facial animation system (here) that I could use to test, debug and demo the engine.

    My technology deals with what I would call “background character” level emotion, in that it simulates the fundamental processes involved in behavior but not higher-level cognitive behavior. This was intentional for two primary reasons: 1. it made the design an achievable goal in terms of the research available (from 1993 onwards), and 2. it makes the adoption of the technology easier, as it is not positioned as something that can reproduce full human behavior but it *can* vastly improve the behavior of interactive characters. It is also general purpose and can be used in any situation.

    With facial gestures we have two nerve pathways, one that is under conscious control and the main pathway that is not. This means that in this realm the most fundamental facial gestures do not require cognitive processing. Hence emotional behaviors by themselves can produce very engaging characters.

    With body gestures the same situation again holds: a complete lack of available procedural technology (and it has to be procedural to really do what we want, rather than replaying a few canned animations). I have in the past worked with character technology that utilized Inverse Kinematics, and this held the promise of truly procedural movement. However IK is an immense computational burden, and as yet I have not seen a solution that can produce particularly high quality movement procedurally. But this is what is required.

    Which brings us to the question of how, and at what point, developers of Interactive Entertainment (I much prefer that term to “games” as it is more encompassing) will start deploying this type of technology. As Andrew points out, they need complete solutions. For my own technology that would mean ideally offering not just an emotion engine but also, at the least, a procedural facial animation solution that fits with game technology. This means finding technology partners who each have a part of the “jigsaw” and who are looking to partner to offer their customers a more complete product. Building these partnerships is, I think, key for all of us involved in this area of technology.

    What is important to bear in mind (and forgive me if I am covering old ground here) is that for most current games this kind of technology is a quantum leap in terms of sophistication. Shooting a target is, relatively speaking, trivial compared to modeling social interactions. Current technology is based in general on well understood scientific principles and concepts (i.e. geometry, physics, light properties, etc.), whereas what we are developing is based, in many cases, on our own theories or on research that is at the very leading edge of science.

    This means that for us, education and exposure are also very important elements of increasing interest in this area. Education in the sense that we need to show the industry that this technology can open up markets much larger than those they are currently servicing and exposure in the sense that we need to ensure that the general public knows what kind of technology is available. I see a great deal of truly dreadful scientific journalism where the only important point seems to be interviewing some well meaning but entirely uninformed “luminary” who is there to assure the public that the technology that we are all working on is not actually possible and no, “computers will never have emotions” and no, “they will never be able to produce engaging social interactions”, well not for 30 years anyway.

    Ironically I saw over the last 5 years or so the game industry almost talk itself into using AI, which has both good and bad sides to it.

  14. ian wilson Says:

    cont..

    Ironically I saw over the last 5 years or so the game industry almost talk itself into using AI, which has both good and bad sides to it. Initially “fan sites” would talk about the AI in games, or AI’s (I despise that word), although the games were in general not actually employing any AI. However, as the programmers in the industry also read the fan sites, they picked up the term and started applying it to simple scripting. Then the marketing department picked up the term and started trumpeting “amazingly advanced AI” (no doubt to the embarrassment of the developers), and so on it went, around in a feedback loop. It was aided by a group of “usual suspects” who believed that AI did in fact start with Finite State Machines and end with Hierarchical Finite State Machines, as was being used at the time.

    The good side of this is that expectations were raised through the feedback loop and actually created the requirement for ever more sophisticated behavior in characters. The downside was (is) that the lack of experience has limited the horizons for most teams (we should be eternally grateful to Will Wright for having the foresight, passion and most importantly power to open up new vistas for the industry). This is where we need to stand up and make our voices heard (and I know many of you are).

    I see the same thing happening now with social interaction and emotion (hence this thread). The current levels of behavior are crude but that is generating press, which is generating hype (i.e. I would be surprised if Half Life 2 really does use facial animation meaningfully although I know some of those guys and they are first rate), which will generate higher expectations, which will pressure developers for solutions, which is where we need to be ready to step up to the plate. Maybe in 5 years someone will be calling us the “usual suspects” ;)

    Apologies to those of you who, like me, have varying levels of ADD, for the length of this post, but living in Tokyo means that I don’t get to speak English much (hence the lousy grammar), so this is a nice change.

  15. Water Cooler Games Says:
    Emotion and Games at MSNBC
    There’s a good article on the future of emotion in games at MSNBC, featuring Ion Storm’s Warren Spector and our friends Michael Mateas and Andrew Stern from Grand Text Auto. The article both mentions and shows a screenshot of Facade,…

  16. andrew Says:

    By the same writer, here’s a new MSNBC article reviewing interactive dramas Indigo Prophecy and Facade.
