July 16, 2006

Notes from Computational Aesthetics at AAAI

by Nick Montfort, 6:20 pm

Notes from today’s workshop are by Nick and Michael. We’ve tried to take notes as best we can to advertise what work is going on, but please consult the actual academic publications of these individuals for the official word about their projects!

Memex Music and Gambling Games: EVE’s Take on Lucky Number 13

Kevin Burns (MITRE Corporation)
Shlomo Dubnov (University of California at San Diego, US)

Bayesian mathematics + information theory, games + music. EVE is a computational theory of aesthetics – a tradeoff between being able to predict and being surprised: Expectations (E), Violations (V), and E’xplanations (E’).

Slot machines could be set to almost any P (probability of payout), and the payout amount varies. They are empirically set to P=0.13, a value that may represent “peak fun.” Enjoyment can be computed in terms of marginal entropies: a “Goldilocks” function, plotting pleasure against surprise, peaks around 0.13.
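A rough sketch of why a surprise-based measure can have a “just right” peak. This is not the EVE function from the talk: it uses a single assumed term, the probability-weighted surprisal of a win, P * -log2(P), which on its own peaks at P = 1/e (about 0.37). The fuller function described in the talk, which combines several entropy terms, is the one reported to peak near 0.13. The name expected_surprise is hypothetical.

```python
import math

def expected_surprise(p):
    """Assumed stand-in term: probability-weighted surprisal of a win, p * -log2(p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p)

# Scan payout probabilities 0.01 .. 0.99 and report where the term is largest.
grid = [i / 100 for i in range(1, 100)]
best_p = max(grid, key=expected_surprise)
print(best_p)
```

The point is only the shape: very rare wins contribute little because they seldom happen, and very common wins contribute little because they are unsurprising.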

Memex music: each note is linked to the next one in a piece (e.g., first note in Beethoven’s Fifth to second note) and is also linked forward to other notes with similar history. Independently, they were set to branch away with probability 13%. Changing the pleasure function to a product rather than sum of terms (surprise * resolve), an S-shaped function arises.
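The linking-and-branching idea can be sketched as a walk over a note list. This is an illustration under stated assumptions, not the authors' system: “similar history” is reduced to “same current note,” the piece wraps around at its end, and memex_walk and its parameters are hypothetical names. Only the 13% branching probability comes from the talk.

```python
import random

def memex_walk(piece, length, branch_prob=0.13, seed=0):
    """Walk a note sequence, occasionally jumping to another similar context."""
    rng = random.Random(seed)
    # Index every position by the note found there (a one-note "history").
    positions = {}
    for i, note in enumerate(piece):
        positions.setdefault(note, []).append(i)

    i = 0
    out = [piece[i]]
    while len(out) < length:
        # With probability branch_prob, jump to some occurrence of the same note.
        if rng.random() < branch_prob:
            i = rng.choice(positions[piece[i]])
        # Follow the link to the next note, wrapping at the end of the piece.
        i = (i + 1) % len(piece)
        out.append(piece[i])
    return out

# Opening motif of Beethoven's Fifth, as note names.
motif = ["G", "G", "G", "Eb", "F", "F", "F", "D"]
print(memex_walk(motif, 16))
```

With branch_prob set to 0 the walk simply replays the piece; raising it produces more departures from the original order.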

If a player’s/listener’s P and Q vary from the real ones, there is a difference in pleasure. May lead to a theory of aesthetic utility.

Bringing the Text to Life Automatically

Carlo Strapparava
Alessandro Valitutti
(Istituto Trentino di Cultura/Istituto Ricerca Scientifica e Tecnologica, IT)

Concept: automatically create animation of text (graphemes) based on lexical semantic analysis. This automates a graphic design task (for TV, etc.) and allows study of affect in natural language. Animation might help with memorization and learning, too.

Some words directly represent emotional states: “love”, “joy”. Indirect affective words (“cry”, “monster”) are induced from a corpus. A new lexical resource, WordNet-Affect, was developed for direct affective words. Latent Semantic Analysis is used for dimensionality reduction on the term-by-document matrix. Words, texts, and synsets can be represented homogeneously as vectors. Synsets distinguish, e.g., terrific {fantastic, howling, marvelous, marvellous, rattling, terrific, tremendous, wonderful, wondrous} from terrific {terrific, terrifying}.
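Once words and emotion synsets live in the same vector space, an indirect affective word can be tied to an emotion by similarity. The sketch below assumes cosine similarity as the measure and uses made-up three-dimensional vectors in place of a real LSA space; the vectors and the names cosine and closest_emotion are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical toy vectors standing in for a reduced LSA space.
emotion_vectors = {"joy": [0.9, 0.1, 0.0], "fear": [0.0, 0.2, 0.9]}
word_vectors = {"monster": [0.1, 0.3, 0.8], "gift": [0.8, 0.2, 0.1]}

def closest_emotion(word):
    """Return the emotion whose vector is most similar to the word's vector."""
    vec = word_vectors[word]
    return max(emotion_vectors, key=lambda e: cosine(vec, emotion_vectors[e]))

print(closest_emotion("monster"))
```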

Kinetic typography engine from CMU used, development environment and scripting language added to allow simple animations: linear, oscillate, pulse, jitter. Mappings based on physical responses to emotion and stereotypical motions. Headlines from Google News were then animated automatically. People recognized automatically animated headlines better (50% less time), did worse when “inconsistent” animations were automatically added.

Structuring Interactive Narrative Poetry Generation: Walking Blues Changes Undersea

D. Fox Harrell (University of California at San Diego, US)

Interested in: Formal representation of semantics, user feedback, generative multimedia, reconfigurable discourse. Generative poetry provides a research basis, privileges metaphor, no virtual world inhabited by a character. Cognitive linguistics and algebraic semiotics are the main fields drawn upon. Cognitive patterns, formally represented.

Conceptual blending is used to elaborate metaphor theory. Double-scope stories provide clashing, sometimes cascading conceptual blends. Provides a theoretical model used in Walking Blues Changes Undersea.

Algebraic semiotics provides typed elements (sorts) for signs in spaces, partially ordered. Constructors build level n signs from signs of level n or less. Morphisms map between spaces. The Alloy blending algorithm computes all possible ways to integrate two conceptual spaces, ordered by optimality.

Different levels: Polypoems (e.g., “The Girl with Skin of Haints and Seraphs”) are built on the GRIOT system, then a person provides input, then a person performs the poem. A discourse structure is specified for polypoems. Demo of the latest polypoem, “Walking Blues Changes Undersea,” in which the world fills with water as a person goes through a daily routine. Keywords specific to locations change the disposition.

Brief Poster Presentations

Automatic Dream Sentiment Analysis – David Nadeau(1,2), Catherine Sabourin(1), Joseph De Koninck(1), Stan Matwin(1), Peter D. Turney(2); (1, University of Ottawa, CA; 2, National Research Council Canada, CA)

Dream sentiment analysis – system that analyses a textual description of a dream and pulls out the affective tone of the dream.

Bayesian Beauty: On the ART of EVE’ and the Act of Enjoyment – Kevin Burns (MITRE Corporation, US)

Painting/drawing as process, with expectations and violations as in music or gambling. Demo: connect-the-dots. An aesthetic theory based on Aristotle, Birkhoff’s proposed symmetry-based measures, Bayes.

An Initiation Rite for Intelligent Machinery – Orkan Telhan (Massachusetts Institute of Technology, US)

A take on Turing’s imitation game, from a cultural theory perspective. Consider intelligence (humanness) as a club for which there is an initiation rite; it’s not necessary to outperform people, but to join the club. Questions about what an intelligent machine should have: ConceptNet is used to generate replies and further the conversation, feeding the database for the future.

ColorCocktail: an Ontology-Based Recommender System – Yu-hsin Chen, Ting-hsiang Huang, David Chawei Hsu, Jane Yung-jen Hsu (National Taiwan University, TW)

Given a knowledge base of different cocktails (colors, types of alcohol, shape of glass), suggests cocktails based on the drinker’s emotional state, using commonsense reasoning.

Exploring the Compositionality of Emotions in Text: Word Emotions, Sentence Emotions and Automated Tagging – Virginia Francisco & Pablo Gervás (Universidad Complutense de Madrid, ES)

Human judges evaluate the affect of individual words and sentences in a sentence corpus. Affect is tagged as a triple in a 3D emotion space. Based on this tagged corpus, the system can determine the emotional tone of newly generated sentences. Work like this helps inform the pragmatic variation in NLG that will be necessary for automatically generating interactive drama dialog.

LyQ – A Commonsense Music Player – David Chawei Hsu & Jane Yung-jen Hsu (National Taiwan University, TW)

A music player that automatically selects songs to match an emotional mood. The emotion of a song is based on both melody and lyrics – the current focus is on lyrics. ConceptNet is used to infer the emotional tone of words in lyrics.

Painting as a Thinking Machine – Simon Ingram (Auckland University of Technology, NZ)

Describes a painting studio practice founded in automatism (seen in surrealism) and implemented using cellular automata. Theoretical foundations in emergence, autopoiesis, Deleuze and Guattari’s notion of the machinic; interested in building a painting that thinks and paints itself. His automata paintings are hand-painted outputs of 1D cellular automata. He’s also built painting machines using Lego Mindstorms.
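For readers unfamiliar with 1D cellular automata, here is a minimal sketch of an elementary (two-state, nearest-neighbor) automaton. The notes do not say which rules the paintings use; rule 90, the wraparound edges, and the grid size are assumptions chosen only for illustration.

```python
def step(cells, rule):
    """Apply an elementary CA rule once, treating the row as a ring."""
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        # The three neighbors form a 3-bit index into the rule number's bits.
        index = (left << 2) | (center << 1) | right
        out.append((rule >> index) & 1)
    return out

def run(rule, width=15, steps=7):
    """Start from a single live cell in the middle and collect each generation."""
    row = [0] * width
    row[width // 2] = 1
    rows = [row]
    for _ in range(steps):
        row = step(row, rule)
        rows.append(row)
    return rows

for row in run(90):
    print("".join("#" if c else "." for c in row))
```

Each printed row is one generation; stacked top to bottom they form the kind of 2D pattern that could be transcribed by hand onto a canvas.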

A Reconstructed Neo-Aristotelian Theory of Interactive Drama – Zach Tomaszewski & Kim Binsted (University of Hawaii at Manoa, US)

Reviews the evolving descriptive model (poetics) of drama through 4 different authors and suggests a reconstructed theory based on this analysis. Aristotle, Smiley (first person to apply material and formal cause to Aristotle’s model), Laurel, Mateas (brings in agency). The newly proposed model tweaks the descriptive levels to highlight the distinction between object and medium.

Saurus: an emotionally-weighted thesaurus – Jim Gouldstone et al. (Massachusetts Institute of Technology, US)

Describes a system that uses an affect-labeled thesaurus (using affect triples: pleasure, arousal, dominance) to rewrite an entered sentence to match the emotional tone of an entered guide phrase. With a guide phrase of “violent hate,” “enormous gap” becomes “heinous breach.”
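The rewriting step can be sketched as a nearest-neighbor search in affect space: for each word, pick the synonym whose (pleasure, arousal, dominance) triple lies closest to that of the guide phrase. Everything concrete below is hypothetical: the scores, the tiny thesaurus, the use of Euclidean distance, and the name rewrite. It is not the Saurus implementation.

```python
import math

# Hypothetical (pleasure, arousal, dominance) scores in [-1, 1].
pad = {
    "enormous": (0.1, 0.4, 0.5),
    "huge": (0.2, 0.3, 0.4),
    "heinous": (-0.9, 0.7, 0.3),
    "gap": (-0.1, 0.0, 0.0),
    "breach": (-0.6, 0.6, 0.1),
}
# Hypothetical thesaurus entries; each word is listed among its own candidates.
synonyms = {"enormous": ["enormous", "huge", "heinous"], "gap": ["gap", "breach"]}

def rewrite(sentence, guide):
    """Swap each word for the synonym whose affect triple is nearest the guide."""
    out = []
    for word in sentence.split():
        scored = [w for w in synonyms.get(word, [word]) if w in pad]
        if scored:
            out.append(min(scored, key=lambda w: math.dist(pad[w], guide)))
        else:
            out.append(word)  # No affect data: leave the word unchanged.
    return " ".join(out)

violent_hate = (-0.8, 0.8, 0.4)  # assumed triple for the guide phrase
print(rewrite("enormous gap", violent_hate))
```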

Identification of Lifestyle Behavior Patterns with Prediction of the Happiness of an Inhabitant in a Smart Home

Vikramaditya R. Jakkula
G. Michael Youngblood
Diane J. Cook
(University of Texas at Arlington, US)

The work is motivated by statistics on the number of elderly people living at home. [I’ve noticed that over the last few years, in-home health care seems to have become the application area for smart homes.] Sensor metrics are divided into three levels of abstraction: low-level sensor readings (motion, temperature, etc.), mid-level metrics (motion on bed, distance travelled at home), and high-level metrics (instrumental activities such as using the fridge). They did a six-week study in a home – the inhabitant filled out a web form each day to record their emotional state. Experiments: look for correlations and perform t-tests on metrics, and use machine learning, specifically k-nearest neighbors (k-NN), support vector machines, and decision tree learning, to try to predict reported emotion levels based on metrics. k-NN gave the best prediction (78% accuracy).
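A minimal sketch of nearest-neighbor prediction on this kind of data. The daily metrics, labels, and value of k are invented for illustration and are not from the study; a real pipeline would also scale the features so that one metric does not dominate the distance.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical days: (hours of motion on bed, metres travelled at home) -> mood.
days = [
    ((8.0, 400.0), "happy"),
    ((7.5, 450.0), "happy"),
    ((7.0, 380.0), "happy"),
    ((4.0, 120.0), "unhappy"),
    ((5.0, 100.0), "unhappy"),
    ((4.5, 150.0), "unhappy"),
]
print(knn_predict(days, (7.8, 420.0)))
```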

The Role of Abduction in Automatic Storytelling

Rafael Pérez y Pérez
Atocha Aliseda
(Universidad Nacional Autónoma de México, MX)

Writing is an activity of discovering what will be said; abduction is useful in discovery. In the MEXICA system, anomalies arise and are explained. Emotional links can be established between characters; tension can be produced by situations (making the story more interesting); operators or story actions (with preconditions and postconditions) activate emotional links and tensions. After three cycles of engagement, producing three actions, the system reflects, evaluating the coherence of the events.

Abductive reasoning is triggered by a surprising phenomenon, and involves determining a cause for some effects. MEXICA observes the story it is generating during plot generation, not the outside world, to find such situations. The explanation requires modifying the story, to satisfy unsatisfied preconditions of operators and of “the law of smooth progress” coordinating action with emotional ties. After reflecting, new events are inserted to explain the ones that were generated.
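The repair step can be sketched as follows: walk through the story, and whenever an action's precondition has not been established, insert an earlier action whose postconditions supply it. This is a simplified illustration, not MEXICA itself; the action table, the fact strings, and the name explain are hypothetical, and emotional links and tensions are left out.

```python
# Hypothetical story actions: name -> (preconditions, postconditions).
ACTIONS = {
    "princess heals knight": ({"knight is wounded"}, {"knight is grateful"}),
    "enemy attacks knight": (set(), {"knight is wounded"}),
}

def explain(story):
    """Insert actions so every precondition holds before the action needing it."""
    repaired, facts = [], set()
    for action in story:
        pre, post = ACTIONS[action]
        for missing in sorted(pre - facts):
            if missing in facts:
                continue  # Already supplied by an earlier inserted action.
            # Abductive step: find an applicable action whose effects explain the gap.
            for candidate, (c_pre, c_post) in ACTIONS.items():
                if missing in c_post and c_pre <= facts:
                    repaired.append(candidate)
                    facts |= c_post
                    break
        repaired.append(action)
        facts |= post
    return repaired

print(explain(["princess heals knight"]))
```

Here the unexplained healing is accounted for by inserting an attack before it; a story whose preconditions already hold is returned unchanged.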

Natural Language Generation and Narrative Variation in Interactive Fiction

Nick Montfort
(University of Pennsylvania, US)

Opens by briefly covering a timeline of IF, ending with the rise of the hobby community. To demonstrate the range of contemporary IF, shows three examples: Bronze (Emily Short), Bad Machine (Dan Shiovitz), Book and Volume (Nick Montfort). Brief recapitulation of the definition of IF from Twisty Little Passages: IF offers choices like choose-your-own-adventure or hypertext, but adds a simulated world.

Some insights from narratology: stories are composed of existents (objects, places) and events. In IF, there are rich representations for existents, but not for events. Events are directly mapped to generated text – each turn is one event.

His architecture separates discourse and story; by explicitly representing events, can choose to render them into discourse in different ways. The narrator, the system component responsible for narrating events, consists of a reply planner, microplanner, and realizer.
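A toy illustration of why representing events explicitly matters: the same event list can be told in more than one way. The Event fields, the narrate function, and the two knobs shown (order and tense) are assumptions made for this sketch; they are not the system's actual representation or its planner/realizer pipeline.

```python
from dataclasses import dataclass

@dataclass
class Event:
    time: int          # position in the underlying story
    actor: str
    verb_past: str
    verb_present: str
    obj: str

def narrate(events, order="chronological", tense="past"):
    """Render the same story events into text with a chosen order and tense."""
    ordered = sorted(events, key=lambda e: e.time, reverse=(order == "retrograde"))
    sentences = []
    for e in ordered:
        verb = e.verb_past if tense == "past" else e.verb_present
        sentences.append(f"{e.actor} {verb} {e.obj}.")
    return " ".join(sentences)

story = [
    Event(1, "You", "opened", "open", "the door"),
    Event(2, "You", "took", "take", "the lamp"),
]
print(narrate(story))
print(narrate(story, order="retrograde", tense="present"))
```

The story data never changes between the two calls; only the telling does.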

Narrative variations currently implemented in Nick’s system:

A Text Generation System that Uses Simple Rhetorical Figures

Francisco C. Pereira (Universidade de Coimbra, PT)
Raquel Hervás (Universidad Complutense de Madrid, ES)
Pablo Gervás (Universidad Complutense de Madrid, ES)
Amilcar Cardoso (Universidade de Coimbra, PT)

For aesthetic language generation, need to expand the range of options available for saying something.

Some options for enriching options:

cFROGS – their pipelined architecture for NLG. This is the baseline they start with for adding analogy and metaphor. They use WordNet as their concept dictionary.

Analogies “X is the Y of Z” are accomplished in four phases:

In the standard NLG pipeline, have to add these steps to handle metaphor and analogy:

A Computational Model of Narrative Generation for Suspense

Yun-Gyung Cheong
R. Michael Young
(North Carolina State University, US)

Suspense is excitement or anxiety when anticipating an uncertain event. Little done in story generation in the area – just MINSTREL. But narratology deals with the issue: suspense is related to the number of potential actions. A three-part model for story: Fabula (underlying events), Sjuzhet (order, presentation), Discourse (text).

System takes fabula, intended suspense, point at which to measure suspense as input; sjuzhet as output. Use planning in Crossbow to approximate user’s narrative comprehension and estimate suspense. The Suspenser does this: a Skeleton Builder extracts the kernel and evaluates coherence, then organizes the structure to produce suspense using a generate-and-test approach and heuristics to identify new actions to add. When plan space has a smaller percentage of successful plans, the level of suspense is seen as higher. To increase suspense, hide actions that support the goal state, show actions that threaten the goal state.
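The relationship between plan space and suspense can be written down as a simple ratio. The exact measure used by the Suspenser is not given in these notes, so the formula below (one minus the fraction of successful plans) is an assumption that only preserves the stated direction: fewer successful plans, more suspense. The name suspense_level is hypothetical.

```python
def suspense_level(successful_plans, total_plans):
    """Assumed measure: the share of plans that fail to reach the goal state."""
    if total_plans <= 0:
        raise ValueError("total_plans must be positive")
    if not 0 <= successful_plans <= total_plans:
        raise ValueError("successful_plans must be between 0 and total_plans")
    return 1.0 - successful_plans / total_plans

# Hiding an action that supports the goal leaves the reader with fewer
# successful plans to imagine, so the estimated suspense rises.
print(suspense_level(5, 10))
print(suspense_level(1, 10))
```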

In a pilot study, two computer-generated stories (high-suspense, low-suspense) and two human-generated stories (both seeking high suspense) were tested. The computer-generated high-suspense story was seen as suspenseful.