July 16, 2006
Notes from Computational Aesthetics at AAAI
Notes from today’s workshop are by Nick and Michael. We’ve tried to take notes as best we can to advertise what work is going on, but please consult the actual academic publications of these individuals for the official word about their projects!
Memex Music and Gambling Games: EVE’s Take on Lucky Number 13
Kevin Burns (MITRE Corporation)
Shlomo Dubnov (University of California at San Diego, US)
Bayesian mathematics + information theory, games + music. EVE is a computational theory of aesthetics – a tradeoff between being able to predict and being surprised: Expectations (E), Violations (V), and E’xplanations (E’).
Slot machines could be set to almost any P (probability of payout) and the payout amount varies. They are empirically set to P=0.13, a value that may represent “peak fun.” Enjoyment can be computed in terms of marginal entropies: A “Goldilocks” function showing pleasure at different values of surprise, peaks around 0.13.
Memex music: each note is linked to the next one in a piece (e.g., first note in Beethoven’s Fifth to second note) and is also linked forward to other notes with similar history. Independently, they were set to branch away with probability 13%. Changing the pleasure function to a product rather than sum of terms (surprise * resolve), an S-shaped function arises.
If a player’s/listener’s P and Q vary from the real ones, there is a difference in pleasure. May lead to a theory of aesthetic utility.
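To make the memex music idea concrete, here is a minimal sketch of how such a walk might work. It is not the authors' implementation: the function name `memex_walk`, the two-note `history` window used to decide that notes have a "similar history," and the toy input are all assumptions made for illustration. Only the 13% branching probability comes from the talk.

```python
import random

def memex_walk(notes, steps, branch_p=0.13, history=2, seed=0):
    """Generate a note sequence by walking `notes`, occasionally jumping.

    At each step the walk normally follows the link to the next note. With
    probability `branch_p` it instead jumps to another position in the piece
    that is preceded by the same `history` notes, and continues from there.
    """
    rng = random.Random(seed)
    # Index every position by the tuple of notes that precede it.
    by_context = {}
    for i in range(history, len(notes)):
        by_context.setdefault(tuple(notes[i - history:i]), []).append(i)

    pos = history
    out = list(notes[:history])
    for _ in range(steps):
        if pos >= len(notes):
            break  # walked off the end of the piece
        context = tuple(notes[pos - history:pos])
        alternatives = [i for i in by_context.get(context, []) if i != pos]
        if alternatives and rng.random() < branch_p:
            pos = rng.choice(alternatives)  # branch to a note with the same history
        out.append(notes[pos])
        pos += 1
    return out

# Toy input loosely based on the opening motif of Beethoven's Fifth (assumed, not from the talk).
piece = ["G", "G", "G", "Eb", "F", "F", "F", "D", "G", "G", "G", "Eb"]
print(memex_walk(piece, steps=8))
```

Because a jump only lands on a position with the same preceding notes, every short window of the output also occurs somewhere in the original piece, which is one plausible reading of "linked forward to other notes with similar history."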
Bringing the Text to Life Automatically
Carlo Strapparava
Alessandro Valitutti
(Istituto Trentino di Cultura/Istituto Ricerca Scientifica e Tecnologica, IT)
Concept: automatically create animation of text (graphemes) based on lexical semantic analysis. This automates a graphic design task (for TV, etc.) and allows study of affect in natural language. Animation might help with memorization and learning, too.
Some words directly represent emotional states: “love”, “joy”. Inducing indirect affective words (“cry”, “monster”) from a corpus. A new lexical resource, WordNet-Affect, was developed for direct affective words. Latent Semantic Analysis is used to do dimensionality reduction on the term-by-document matrix. Words, texts, and synsets can be represented homogeneously as vectors. Synsets distinguish, e.g., terrific {fantastic, howling, marvelous, marvellous, rattling, terrific, tremendous, wonderful, wondrous} from terrific {terrific, terrifying}.
Kinetic typography engine from CMU used, development environment and scripting language added to allow simple animations: linear, oscillate, pulse, jitter. Mappings based on physical responses to emotion and stereotypical motions. Headlines from Google News were then animated automatically. People recognized automatically animated headlines better (50% less time), did worse when “inconsistent” animations were automatically added.
Structuring Interactive Narrative Poetry Generation: Walking Blues
Changes Undersea
D. Fox Harrell (University of California at San Diego, US)
Interested in: Formal representation of semantics, user feedback, generative multimedia, reconfigurable discourse. Generative poetry provides a research basis, privileges metaphor, no virtual world inhabited by a character. Cognitive linguistics and algebraic semiotics are the main fields drawn upon. Cognitive patterns, formally represented.
Conceptual blending is used to elaborate metaphor theory. Double-scope stories provide clashing, sometimes cascading conceptual blends. Provides a theoretical model used in Walking Blues Changes Undersea.
Algebraic semiotics provides typed elements (sorts) for signs in spaces, partially ordered. Constructors build level n signs from signs of level n or less. Morphisms map between spaces. The alloy blending algorithm computes all possible ways to integrate two conceptual spaces, ordered by optimality.
Different levels: Polypoems (e.g., “The Girl with Skin of Haints and Seraphs”) are built on the GRIOT system, then a person provides input, then a person performs the poem. A discourse structure is specified for polypoems. Demo of the latest polypoem, “Walking Blues Changes Undersea,” in which the world fills with water as a person goes through a daily routine. Keywords specific to locations change the disposition.
Brief Poster Presentations
Automatic Dream Sentiment Analysis – David Nadeau(1,2), Catherine Sabourin(1), Joseph De Koninck(1), Stan Matwin(1), Peter D. Turney(2); (1, University of Ottawa, CA; 2, National Research Council Canada, CA)
Dream sentiment analysis – system that analyses a textual description of a dream and pulls out the affective tone of the dream.
Bayesian Beauty: On the ART of EVE’ and the Act of Enjoyment – Kevin Burns (MITRE Corporation, US)
Painting/drawing as process, with expectations and violations as in music or gambling. Demo: connect-the-dots. An aesthetic theory based on Aristotle, Birkhoff’s proposed symmetry-based measures, Bayes.
An Initiation Rite for Intelligent Machinery – Orkan Telhan (Massachusetts Institute of Technology, US)
A take on Turing’s imitation game, from a cultural theory perspective. Consider intelligence (humanness) as a club for which there is an initiation rite; it’s not necessary to outperform people, but to join the club. Questions about what an intelligent machine should have: ConceptNet is used to generate replies and further the conversation, feeding the database for the future.
ColorCocktail: an Ontology-Based Recommender System – Yu-hsin Chen, Ting-hsiang Huang, David Chawei Hsu, Jane Yung-jen Hsu (National Taiwan University, TW)
Given a knowledge base of different cocktails (colors, types of alcohol, shape of glass), suggests cocktails based on the drinker’s emotional state, using commonsense reasoning.
Exploring the Compositionality of Emotions in Text: Word Emotions, Sentence Emotions and Automated Tagging – Virginia Francisco & Pablo Gervás (Universidad Complutense de Madrid, ES)
Human judges evaluate the affect of individual words and sentences in a sentence corpus. Affect is tagged as a triple in a 3D emotion space. Based on this tagged corpus, the system can determine the emotional tone of newly generated sentences. Work like this helps inform the pragmatic variation in NLG that will be necessary for automatically generating interactive drama dialog.
LyQ – A Commonsense Music Player – David Chawei Hsu & Jane Yung-jen Hsu (National Taiwan University, TW)
A music-player that automatically selects songs to match an emotional mood. The emotion of a song is based on both melody and lyrics – he’s currently focusing on lyrics. Using ConceptNet to infer emotional tone of words in lyrics.
Painting as a Thinking Machine – Simon Ingram (Auckland University of Technology, NZ)
Describes a painting studio practice founded in automatism (seen in surrealism) and implemented using cellular automata. Theoretical foundations in emergence, autopoiesis, Deleuze and Guattari’s notion of the machinic; interested in building a painting that thinks and paints itself. His automata paintings are hand-painted outputs of 1D cellular automata. He’s also built painting machines using Lego Mindstorms.
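For readers who haven't met 1D cellular automata, here is a minimal sketch of the kind of process whose output could be hand-painted row by row. The notes don't say which rules Ingram uses; rule 90, the wrap-around edges, and the single-cell starting row are assumptions chosen only because they give a recognizable pattern.

```python
def step(cells, rule):
    """Apply one update of an elementary (1D, two-state) cellular automaton.

    `rule` is the Wolfram rule number (0-255); the row wraps around at the edges.
    """
    n = len(cells)
    out = []
    for i in range(n):
        left, centre, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        neighbourhood = (left << 2) | (centre << 1) | right  # value 0..7
        out.append((rule >> neighbourhood) & 1)
    return out

def run(width, generations, rule):
    """Return successive rows, starting from a single live cell in the middle."""
    row = [0] * width
    row[width // 2] = 1
    rows = [row]
    for _ in range(generations - 1):
        row = step(row, rule)
        rows.append(row)
    return rows

# Rule 90 is an assumed example; each printed line is one generation.
for row in run(width=31, generations=8, rule=90):
    print("".join("#" if c else "." for c in row))
```

Each row depends only on the row above it, so the whole image unfolds mechanically from the rule and the starting row.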
A Reconstructed Neo-Aristotelian Theory of Interactive Drama – Zach Tomaszewski & Kim Binsted (University of Hawaii at Manoa, US)
Reviews the evolving descriptive model (poetics) of drama through 4 different authors and suggests a reconstructed theory based on this analysis. Aristotle, Smiley (first person to apply material and formal cause to Aristotle’s model), Laurel, Mateas (brings in agency). The newly proposed model tweaks the descriptive levels to highlight the distinction between object and medium.
Saurus: an emotionally-weighted thesaurus – Jim Gouldstone et al. (Massachusetts Institute of Technology, US)
Describes a system that uses an affect-labeled thesaurus (using affect triples: pleasure, arousal, dominance) to rewrite an entered sentence to match the emotional tone of an entered guide phrase. With a guide phrase of “violent hate,” “enormous gap” becomes “heinous breach.”
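A rough sketch of how an affect-labeled thesaurus could drive this kind of rewriting. This is not the Saurus implementation: the scores in `AFFECT`, the entries in `SYNONYMS`, and the nearest-neighbor selection are all invented for illustration, and the sketch assumes every word of the guide phrase has a score.

```python
import math

# Hypothetical (pleasure, arousal, dominance) scores in [-1, 1]; invented for illustration.
AFFECT = {
    "enormous": (0.1, 0.4, 0.3), "huge": (0.1, 0.3, 0.3), "heinous": (-0.9, 0.7, 0.2),
    "gap": (-0.1, 0.0, 0.0), "opening": (0.3, 0.1, 0.1), "breach": (-0.6, 0.6, 0.1),
    "violent": (-0.8, 0.9, 0.5), "hate": (-0.9, 0.7, 0.3),
}
# Hypothetical thesaurus entries; the example in the notes treats these as substitutable.
SYNONYMS = {
    "enormous": ["enormous", "huge", "heinous"],
    "gap": ["gap", "opening", "breach"],
}

def mean_affect(words):
    """Average the affect triples of the words that have an entry."""
    triples = [AFFECT[w] for w in words if w in AFFECT]
    return tuple(sum(axis) / len(triples) for axis in zip(*triples))

def rewrite(sentence, guide):
    """Swap each word for the listed alternative closest in affect to the guide phrase."""
    target = mean_affect(guide.split())
    result = []
    for word in sentence.split():
        options = [w for w in SYNONYMS.get(word, [word]) if w in AFFECT]
        best = min(options, key=lambda w: math.dist(AFFECT[w], target)) if options else word
        result.append(best)
    return " ".join(result)

print(rewrite("enormous gap", "violent hate"))
```

With these made-up scores the sketch turns “enormous gap” into “heinous breach,” matching the example above; real scores would of course come from the labeled thesaurus.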
Identification of Lifestyle Behavior Patterns with Prediction of the Happiness of an Inhabitant in a Smart Home
Vikramaditya R. Jakkula
G. Michael Youngblood
Diane J. Cook
(University of Texas at Arlington, US)
The work is motivated by statistics on the number of elderly people living at home. [I’ve noticed that over the last few years, in-home health care seems to have become the application area for smart homes.] Sensor metrics are divided into three levels of abstraction: low-level sensor readings (motion, temperature, etc.), mid-level metrics (motion on bed, distance travelled at home), and high-level metrics (instrumental activities such as using the fridge). They did a six-week study in a home; the inhabitant filled out a web form each day to record their emotional state. Experiments: look for correlations and perform t-tests on the metrics, and use machine learning (specifically k-nearest neighbors, support vector machines, and decision tree learning) to try to predict reported emotion levels from the metrics. k-nearest neighbors gave the best prediction (78% accuracy).
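As a reminder of what k-nearest-neighbor prediction does, here is a minimal sketch. The feature choice, the numbers, and the two mood labels are invented; the notes don't record the actual features, labels, or value of k used in the study.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest neighbours.

    `train` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical daily metrics: (hours of motion on bed, metres travelled at home, fridge uses)
# paired with a self-reported mood. All values are invented.
days = [
    ((7.5, 420.0, 6), "happy"),
    ((8.0, 390.0, 5), "happy"),
    ((7.0, 450.0, 7), "happy"),
    ((10.5, 120.0, 2), "unhappy"),
    ((11.0, 150.0, 1), "unhappy"),
    ((9.5, 100.0, 2), "unhappy"),
]
print(knn_predict(days, (7.8, 400.0, 6)))
```

Note that the features here are not rescaled, so the distance-travelled column dominates the distance; a real system would likely normalize each metric first.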
The Role of Abduction in Automatic Storytelling
Rafael Pérez y Pérez
Atocha Aliseda
(Universidad Nacional Autónoma de México, MX)
Writing is an activity of discovering what will be said; abduction is useful in discovery. In the MEXICA system, anomalies arise and are explained. Emotional links can be established between characters; tension can be produced by situations (making the story more interesting); operators, or story actions (with preconditions and postconditions), activate emotional links and tensions. After three cycles of engagement, producing three actions, the system reflects, evaluating the coherence of the events.
Abductive reasoning is triggered by a surprising phenomenon, and involves determining a cause for some effects. MEXICA observes the story it is generating during plot generation, not the outside world, to find such situations. The explanation requires modifying the story, to satisfy unsatisfied preconditions of operators and of “the law of smooth progress” coordinating action with emotional ties. After reflecting, new events are inserted to explain the ones that were generated.
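The reflection step can be pictured with a small sketch: scan the story so far, and whenever an action's preconditions are not yet true, insert an earlier event whose postconditions supply the missing fact. This is a heavily simplified illustration, not MEXICA itself: the operators, the fact strings, and the function `explain` are hypothetical, and emotional links, tensions, and the law of smooth progress are left out.

```python
# Hypothetical story operators: name -> (preconditions, postconditions).
OPERATORS = {
    "princess heals knight": ({"knight is wounded"}, {"knight is grateful to princess"}),
    "enemy wounds knight": ({"enemy hates knight"}, {"knight is wounded"}),
    "knight insults enemy": (set(), {"enemy hates knight"}),
}

def explain(story, operators, max_insertions=10):
    """Insert events so that every action's preconditions hold when it occurs."""
    story = list(story)
    for _ in range(max_insertions):
        facts = set()
        for i, action in enumerate(story):
            pre, post = operators[action]
            missing = pre - facts
            if missing:
                wanted = sorted(missing)[0]
                # Abductive step: look for an operator that would cause the missing fact.
                cause = next((name for name, (_, p) in operators.items() if wanted in p), None)
                if cause is None:
                    raise ValueError(f"no operator explains {wanted!r}")
                story.insert(i, cause)  # explain the anomaly with an earlier event
                break
            facts |= post
        else:
            return story  # no unexplained actions remain
    raise RuntimeError("gave up: too many insertions")

print(explain(["princess heals knight"], OPERATORS))
```

Starting from the single action “princess heals knight,” the sketch works backward to add the wounding and then the insult that motivates it, so the explanation itself can raise new things to explain.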
Natural Language Generation and Narrative Variation in Interactive Fiction
Nick Montfort
(University of Pennsylvania, US)
Opens by briefly covering a timeline of IF, ending with the rise of the hobby community. To demonstrate the range of contemporary IF, shows three examples: Bronze (Emily Short), Bad Machine (Dan Shiovitz), Book and Volume (Nick Montfort). Brief recapitulation of the definition of IF from Twisty Little Passages: IF offers choices like choose-your-own-adventure or hypertext, but adds a simulated world.
Some insights from narratology: stories are composed of existents (objects, places) and events. In IF, there are rich representations for existents, but not for events. Events are directly mapped to generated text – each turn is one event.
His architecture separates discourse and story; by explicitly representing events, can choose to render them into discourse in different ways. The narrator, the system component responsible for narrating events, consists of a reply planner, microplanner, and realizer.
Narrative variations currently implemented in Nick’s system:
- order (retrograde, analepsis, prolepsis, syllepsis, etc.)
- speed (length of event’s telling vs. length of the event’s happening)
- frequency (narrate 1 event once, n events once, 1 event n times)
- focalization (presenting information relative to what a specific character knows)
- time of narrating (prior, simultaneous, after)
- explicitly include signs of narrator and narratee (“I tell you…”)
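A small sketch of why representing events explicitly makes these variations possible: once events are data rather than canned text, the same story can be told in a different order or with a different frequency. The `Event` class and `narrate` function are hypothetical stand-ins, not Nick's architecture, and only two of the listed variations (order and frequency) are shown.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    time: int   # when the event happens in the story world
    text: str   # a pre-written clause describing it

def narrate(events, order="chronological", times_told=1):
    """Render the same events into discourse in different ways.

    order: "chronological" (story order) or "retrograde" (reverse order).
    times_told: how often each event is narrated (frequency).
    """
    ordered = sorted(events, key=lambda e: e.time, reverse=(order == "retrograde"))
    sentences = []
    for event in ordered:
        sentences.extend([event.text.capitalize() + "."] * times_told)
    return " ".join(sentences)

events = [
    Event(1, "you take the lamp"),
    Event(2, "you open the door"),
    Event(3, "the troll wakes"),
]
print(narrate(events))
print(narrate(events, order="retrograde"))
```

A fuller narrator would also adjust tense and connectives when the order changes, which is where the reply planner, microplanner, and realizer come in.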
A Text Generation System that Uses Simple Rhetorical Figures
Francisco C. Pereira (Universidade de Coimbra, PT)
Raquel Hervás (Universidad Complutense de Madrid, ES)
Pablo Gervás (Universidad Complutense de Madrid, ES)
Amilcar Cardoso (Universidade de Coimbra, PT)
For aesthetic language generation, need to expand the range of options available for saying something.
Some ways of enriching the options:
- Word level (multiple synonyms for each concept, use hypernyms)
- Playing with concepts (finding mappings between concepts “An F-18 is as scary as a dragon”)
- Richer conceptual mappings (“Fire is a dragon’s artillery” – requires analogical mappings)
cFROGS – their pipelined architecture for NLG. This is the baseline they start with for adding analogy and metaphor. They use WordNet as their concept dictionary.
Analogies “X is the Y of Z” are accomplished in four phases:
- Establish source domain – context of tale + enrichment of WordNet relations.
- Choose a target domain
- Build cross-domain mappings (graph isomorphisms)
- Realization of analogy
In the standard NLG pipeline, have to add these steps to handle metaphor and analogy:
- Establish possible comparison
- Construct comparison mapping
- Build mapping
- Construct analogy messages
- Partially replace some messages in original input with analogies
- Establish relative order of source and analogy messages
- Establish relative position of output of comparison messages
- Consider hypernyms
- Consider elements from the target domain mapped to x as possible references for x
- Identify synonyms
A Computational Model of Narrative Generation for Suspense
Yun-Gyung Cheong
R. Michael Young
(North Carolina State University, US)
Suspense is excitement or anxiety when anticipating an uncertain event. Little done in story generation in the area – just MINSTREL. But narratology deals with the issue: suspense is related to the number of potential actions. A three-part model for story: Fabula (underlying events), Sjuzhet (order, presentation), Discourse (text).
System takes fabula, intended suspense, point at which to measure suspense as input; sjuzhet as output. Use planning in Crossbow to approximate user’s narrative comprehension and estimate suspense. The Suspenser does this: a Skeleton Builder extracts the kernel and evaluates coherence, then organizes the structure to produce suspense using a generate-and-test approach and heuristics to identify new actions to add. When plan space has a smaller percentage of successful plans, the level of suspense is seen as higher. To increase suspense, hide actions that support the goal state, show actions that threaten the goal state.
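The relation between plan space and suspense can be sketched in a few lines. This is only an illustration of the stated idea (fewer successful plans, more suspense), not the Suspenser's actual measure: representing a plan as the set of actions it relies on, and counting a plan as successful only when the reader has been shown all of them, are assumptions.

```python
def suspense_level(candidate_plans, shown_actions):
    """Share of candidate plans the reader cannot see succeeding.

    candidate_plans: list of sets, each holding the actions one plan for the goal relies on.
    shown_actions: set of actions the telling has revealed to the reader so far.
    """
    if not candidate_plans:
        raise ValueError("need at least one candidate plan")
    viable = sum(1 for needs in candidate_plans if needs <= shown_actions)
    return 1.0 - viable / len(candidate_plans)

# Hypothetical plans by which the protagonist might escape.
plans = [
    {"hero finds key"},
    {"hero finds key", "guard falls asleep"},
    {"ally arrives"},
    {"hero picks lock"},
]
# Showing the supporting actions leaves most plans viable (low suspense).
print(suspense_level(plans, {"hero finds key", "guard falls asleep", "ally arrives"}))
# Hiding them leaves no viable plan in the reader's view (high suspense).
print(suspense_level(plans, {"guard falls asleep"}))
```

Under these assumptions, hiding actions that support the goal lowers the number of plans the reader can see working, which is the direction of the effect described above.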
In a pilot study, two computer-generated stories (high-suspense, low-suspense) and two human-generated stories (both seeking high suspense) were tested. The computer-generated high-suspense story was seen as suspenseful.
July 17th, 2006 at 11:00 pm
[Hello all. Thanks for the write-up and summaries. I thought I would clarify a few of my ideas for those interested. My clarifications are interleaved between square brackets below.]
Interested in: Formal representation of semantics, user feedback, generative multimedia, reconfigurable discourse. Generative poetry provides a research basis, privileges metaphor, no virtual world inhabited by a character. Cognitive linguistics and algebraic semiotics are the main fields drawn upon. Cognitive patterns, formally represented.
[Narrative poetry provides a good initial “case study,” which is something less grand and fundamental than a “research basis” (the phrase used above). I think (hope) it is a good way to express my attitude toward the potential of interactive narrative and my theoretical ideas and foundations without placing a huge amount of effort at this stage toward other related issues (nice graphics, NLG, etc.). The danger is being seen as “the poetry guy” or having the ideas seen as limited according to the dimensions portrayed by the prose poetry created so far. What I want to convey is a deep structure for semantics as necessary and desirable for interaction at deep and meaningful levels with narrative content, and that many new and experimental forms are ripe to be discovered. The idea of having “no virtual world inhabited by a character” is just meant to emphasize that interaction and narrative can intersect along an endless variety of dimensions. Although at various times I focus more or less on user agency within the story as a central feature, I do not believe that it has to be *the* defining feature of interacting within a narrative.]
Conceptual blending is used to elaborate metaphor theory.
[It is Fauconnier’s claim that blending elaborates metaphor theory, but also that it describes a fundamental feature of human creative thought. Controversies with the theory tend to center around its lack of predictive results and its tendency to subsume many diverse cognitive phenomena under the heading of blending. Nonetheless, Fauconnier has a very compelling and careful forthcoming account of how spatial metaphors for time are not completely accounted for by the two-space model in metaphor theory.]
Double-scope stories provide clashing, sometimes cascading conceptual blends. Provides a theoretical model used in Walking Blues Changes Undersea.
[The “double scope” aspect here refers to blends of spaces with clashing (very different in terms of structure and content) frames. It is an amazing aspect of human cognition that, in fact, we do not perceive these blends as clashing and fluidly and extemporaneously perform such blending operations and elaborate imaginatively based on such blends.]
Algebraic semiotics provides typed elements (sorts) for signs in spaces, partially ordered. Constructors build level n signs from signs of level n or less. Morphisms map between spaces. The alloy blending algorithm computes all possible ways to integrate two conceptual spaces, ordered by optimality.
[Algebraic semiotics provides representations for sign systems, including the idea that signs are built from other signs. From computer science and algebraic specification it brings data types (here known as sorts) and a precise formalization of sign systems as semiotic theories. One important difference between using algebraic semiotics as a basis for conceptual integration and other approaches to analogy is the fact that we use theories, as opposed to models, to describe conceptual structures. This captures the idea that we are not describing completely understood set-theoretic type objects, but rather partially understood representations which can always be refined or elaborated. Morphisms map between spaces, and in the case of blending map multiple spaces to spaces.]
Different levels: Polypoems (e.g., “The Girl with Skin of Haints and Seraphs”) are built on the GRIOT system, then a person provides input, then a person performs the poem. A discourse structure is specified for polypoems. Demo of the latest polypoem, “Walking Blues Changes Undersea,” in which the world fills with water as a person goes through a daily routine. Keywords specific to locations change the disposition.
[In the new polypoem “Walking Blues Changes Undersea,” the idea that “the world fills with water” refers to undersea concepts being integrated with a mundane, rather pessimistic view of unfulfilling mundane life. This is meant to conjure a more fantastic world with varying emotional tint, depending on user interaction. In this fantastic world the undersea elements can appear as strange, threatening, oppressive, or otherwise…as can the everyday elements of the world. In particular the weight and pressure of the sea are pervasive in the content, along with the myth of Atlantis and undersea creatures.]