December 9, 2005


by Andrew Stern · 3:33 am

The ALICE bot developers (Richard Wallace and co.) are making an interesting resource available for purchase: based on 10 years of conversation logs between their award-winning chatterbot and thousands of users, they have compiled a list of ALICE’s 10,000 most common user inputs. (Their bot is in fact composed of short responses to 4x that number of inputs.) Further, they’ve abstracted this raw data into the top 10,000 patterns of input, which I’d guess are drawn from the top 30,000 inputs or more. This “Superbot” data, in the form of Excel spreadsheets, can be yours for $999.
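
To make "patterns of input" concrete, here is a minimal sketch of how several raw inputs could collapse into one wildcard pattern and be tallied by frequency. It is an illustration only, not ALICE's or AIML's actual matcher: the pattern list, the sample inputs, and the function names (`normalize`, `matches`, `first_match`, `tally`) are all hypothetical, and the only assumptions borrowed from AIML are that input is upper-cased with punctuation removed and that `*` stands for one or more words.

```python
from collections import Counter

# Hypothetical pattern list, most specific first; the bare "*" is the catch-all default.
PATTERNS = ["CAN YOU SPEAK ABOUT *", "WHAT IS YOUR NAME", "*"]


def normalize(text):
    """Upper-case the input and replace punctuation with spaces, then split into words."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text)
    return cleaned.upper().split()


def matches(pattern_words, input_words):
    """Return True if the pattern matches all the words; '*' consumes one or more words."""
    if not pattern_words:
        return not input_words
    head, rest = pattern_words[0], pattern_words[1:]
    if head == "*":
        # Try every non-empty prefix of the remaining input as the wildcard's content.
        return any(matches(rest, input_words[i:]) for i in range(1, len(input_words) + 1))
    return bool(input_words) and input_words[0] == head and matches(rest, input_words[1:])


def first_match(text):
    """Return the first pattern in PATTERNS that matches, or None for empty input."""
    words = normalize(text)
    for pattern in PATTERNS:
        if matches(pattern.split(), words):
            return pattern
    return None


def tally(inputs):
    """Count how often each pattern is activated by a list of raw inputs."""
    return Counter(first_match(text) for text in inputs)


if __name__ == "__main__":
    sample = [
        "Can you speak about robots?",
        "can you speak about the weather",
        "What is your name?",
        "wanna cyber",
    ]
    for pattern, count in tally(sample).most_common():
        print(count, pattern)
```

Here the two "can you speak about ..." inputs are counted under a single pattern, and anything unrecognized falls through to the bare `*` entry, which is why a list of top patterns can cover more raw inputs than a list of top inputs of the same length.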

I actually think that’s a decent value for such data, even if it’s somewhat tied to the general design and interface of ALICE. That is, there’d be all kinds of new inputs users would say that are not on the list, once you make a conversational agent that can have deeper conversations than the very broad but shallow ALICE, or if you made agents with a more focused, less generic domain, such as Grace and Trip in Façade. Still, I’m sure many items on this list would be said to most any bot, at least in this early era of overall bot intelligence.

Speaking of bots and what people say to them, I came across the webpage for an intriguing symposium held at Interact 2005 as well as a sequel to be held at CHI 2006: Agent Abuse, the dark side of human computer interaction. Here’s the symposia’s abstract:

The goal of this workshop is to address the darker side of HCI by examining how computers sometimes bring about the expression of negative emotions. In particular, we are interested in the phenomena of human beings abusing computers. Such behavior can take many forms, ranging from the verbal abuse of conversational agents to physically attacking the hardware. In some cases, particularly in the case of embodied conversational agents, there are questions about how the machine should respond to verbal assaults. This workshop is also interested in understanding the psychological underpinnings of negative behavior involving computers. In this regard, we are interested in exploring how HCI factors influence human-to-human abuse in computer mediated communication. The overarching objective of this workshop is to sketch a research agenda on the topic of the misuse and abuse of interactive technologies that will lead to design solutions capable of protecting users and restraining disinhibited behaviors.

Papers from 2005 can be found here, and the 2006 cfp here.

We’ve discussed the abuse of agents on GTxA a couple of times (1, 2); also ALICE several times, including this discussion.

10 Responses to “CAN YOU SPEAK ABOUT *”

  1. Dr. Rich Wallace Says:

    Thanks for the kind review, Andrew. You certainly get the idea of where we
    are going with this concept. The Superbot data is valuable even if
    you are not using AIML. We recently translated it to Buddyscript, for use
    with Conversagent bots. You can see a sample at

  2. Malcolm Ryan Says:

    Anyone wanna bet on the #1 entry?
    I’m guessing it’s “Wanna cyber?”


  3. andrew Says:

    Hi Rich, thanks for the comment. Procedural Arts may become a Superbot customer at some point, depending on what we end up doing next.

    I’m curious, have you played Facade? If so, we’d love to hear any feedback you have on it.

    Malcolm, yeah, I’d guess the top 100 patterns are variants of sexual propositions and insults. Would make for an interesting study of human-computer interaction, and fodder for the Agent Abuse symposium above…

  4. andrew Says:

    Christy at WRT has an interesting post about ALICE dressed up as God — a seed of what could make a great contribution to the post just after this one, CFP: Alien/Other.

  5. Dr. Rich Wallace Says:

    No, sorry, I haven’t had a chance to play Facade. I do like games very much though.
    I would like to make creating a bot as fun and easy as playing a video game. Lately
    I’ve been working on a bot called John the Pickup Artist, based on the Superbot,
    using Pandorabots, and writing the content from Psychology, Assertiveness Training, and
    Hypnosis textbooks. I uploaded the Superbot data to Pandorabots initially with the
    templates set to “response0” to “response10210”. I wrote a set of random Pickup Lines for the
    * (ultimate default) category (the #1 most activated category; about 2-5% of inputs are
    not matched by any more specific category, depending on the bot). Then I began typing in
    transcripts from the hypnosis, therapy, and assertiveness training sessions to get the
    content for the character. Each input activates a category from the Superbot, and then
    I use the transcripts to think up clever AIML responses. I use both sides of the
    transcript (client and therapist) as a source of both inputs and responses. This is not
    quite as much fun as playing a video game, but I like to think of it as a battle
    and I am shooting at linguistic targets. The more points I score, up to 10,000, the
    smarter John becomes.

  6. Leena Says:

    I would like to know more about what “pattern” exactly means in this context. Aren’t user inputs highly related to what the bot says and what patterns it uses? All kinds of specific words and patterns do get “copied” from the users to the bot chat, just as in normal conversation people start using each other’s words and style. Hence, perhaps, this data is not applicable to any bot as such but needs to be adapted to each bot character’s characteristics.

  7. Dr. Rich Wallace Says:

    You are quite correct that the distribution of client inputs depends on the bot’s
    personality and responses. One of the most frequently activated patterns in the ALICE bot for example
    is “WHO IS DR RICHARD WALLACE”, which would probably not be very high on the list of
    patterns for a character in a Facade game. Yet there remain thousands of inputs common
    to all bots, or to conversational language in general. This is something some people have
    a hard time believing at first, that their clients or customers will try to break the
    bot intentionally by going off topic. A bot unprepared to handle these off topic inputs
    will seem stupid.

    Basically all bots work by some kind of pattern recognition; the word “pattern” also has
    a specific meaning in AIML. The AIML language also has features for imitating, reflecting,
    and pacing the client as well as taking context into account. But I’m not sure this is the best
    forum for me to go on blowing my own horn about the world’s greatest, most popular,
    and totally free, bot scripting language, which is described more fully at

  8. Gordon Says:


    I was interested to hear about your investigation of HCI abuse. This is a topic which has come up in our use of Alice with our talking bot application. We believe that users will deliberately push further and be much more abusive to a computer than they would in any normal circumstance. Perhaps because of anonymity or a sense of danger in using either imprecation or talking about taboo subjects. If there is any doubt about it being a human or computer response the reply is almost always one of shame and an apology which is elicited if the user is not sure. We believe the aspect of a voice without an avatar, which (the avatar) then places focus on a computer program, is a good combination and will allow the user to interpret the sometimes very human responses as real and from a real person. This area of the technology has many uses but could also help in the specialised answer to say an epidemic and what symptoms would be obvious etc. Any service which answers repetitive clearly defined replies could benefit from this powerful combination of technology. The user could be guided into the areas of specialism quite easily or sent to a specialist Bot which more nearly has answers to specific subjects. Clearly what the virtual world of ecommerce needs is a virtual shop assistant and this combination is being tried by many sites to get beyond a gimmick.

  9. hypnose Says:

    Now this is a highly interesting resource that shows not only how people communicate, but also how they react to the response. If there is something like a “mirror” for linguistic patterns – this is it. A “communication mirror”, but of course: It’s more than just that.

  10. Seeker Says:

    Dr. Wallace, from all I have learned of him so far, seems to be a brilliant man, yet he cannot find himself, really find! A.L.I.C.E. and Pandorabots are the products of a master of creative creation, yet they are the results of yearning for knowledge of self, which I believe many people share with Dr. Wallace. Pandorabots is an absolutely awesome comparative opening of something that every human at their core wants to know about self: who am I, why am I here and what, if any, is my purpose? I do not know Dr Wallace personally, but I respect, admire and thank him for his efforts in opening and allowing others to open a door to explore human self. It is not the only door, perhaps not the best door, but it is a door into a world of discovery of self, others, life and beyond. Whatever his motives are for this journey, good or bad as you judge them and him, I personally am honored that he has and is willing to share the path that he is on with anyone else that ventures to follow or at least head the same direction of destiny that calls!!!
