Deikto: A Language for Interactive Storytelling
The personal computer has been with us for twenty-five years now, and it has revolutionized the world around us. But in the arts, the computer has yet to approach its potential. Yes, the computer has dramatically changed the execution of existing artistic fields: movies, music, writing, and image creation will never be the same. These, however, are matters of applying the computer as a tool rather than exploiting it as a medium of expression. Yes, many artists have attempted to express themselves directly through the computer, but their efforts, while laudable extensions of existing artistic media, do not begin to use the computer as a medium in its own right.
Partly this is because so many artists fail to recognize that the essence of the computer as a medium of expression lies in the interactivity it makes possible. Yes, the computer is multitalented and can achieve great things as an image presentation medium or a sound presentation medium - but that is not its true genius. Movies can show text and play music, but text and music are not central to cinema as a medium of expression. Visual action is the essence of cinema; everything else is secondary. In the same fashion, interactivity is the essence of the computer; everything else is secondary.
I give interactivity this special place because interactivity is the one element that is unique to the computer. The other nice features of the computer - its graphics, its sound, its animation - are all offered by other media. Only the computer can provide interactivity. That is its strength. Using a computer for any purpose other than interactivity is like raising Arabian horses for any purpose other than riding.
The flip side of this emphasis on interactivity is a de-emphasis on the finer points of other media. Good literature lives or dies by the quality of the writing, the turns of phrase, the verbal imagery. Interactive art must sacrifice this kind of thing to concentrate on its strengths. Thus, it simply won't do to think of interactive art as "Old Art Form X with interactivity added." The phrases "interactive movie," "interactive literature," "interactive painting," "interactive novel," and so forth are all misleading. Interactivity is not a feature to be tacked onto an existing medium. It is the very essence of a new medium of expression. Interactive art, whatever form it takes, will most certainly not compete with or be seen as in any way related to any of the existing art forms. It is the first truly new artistic medium we have seen in a long time - and it cannot be extrapolated from any existing art form.
Of course, interactivity is difficult to understand, especially when we have so few models to guide us. Computer games offer lots of interactivity, but they are so drenched in violence and puerile fantasy that it's difficult to generalize from them.
Another reason for the dearth of artistic interactivity lies in the intrinsic abstractness of interactive design. Every other art form communicates its message through a specific instance of the message. Let's take the theme of pathos from loss of a child. Michelangelo created a magnificent instantiation of this theme in his Pietà. Countless paintings addressed the same theme, with imagery of the grief-stricken face of Mary. Some of the greatest works of modern photography are images of grief-stricken mothers burying their children. The Lord of the Rings uses this theme twice. It is an important and ever-present theme.
We know how to present this theme in images and stories - but how would we present it interactively? That would require us to create algorithms for the processes attending the loss of a child: the circumstances that created it, with possible parental culpability or inattention; the nature of the death and the intrinsic horror of all human loss; the means by which the news of the death is communicated to the parent; and the means by which a parent can express grief. All these factors must be determined at a level of abstraction previously unaddressed in art. How can we calculate the degree to which a parent's actions contribute to the probability of death of the child? The computation is conceivable, but immensely difficult. How do we calculate the events that lead up to and follow the death? Again, this can be done but it requires a level of abstraction and complexity that boggles the mind. This is the challenge of interactive art; it is also its potential.
The Dearth of Tools
On top of all these obstacles, we have the greatest obstacle of all: the dearth of tools that give artists the power to express themselves in a manner both artistically meritorious and interactively substantial. Sure, there are plenty of level editors for creating your own blast-em-up, but they don't offer much artistic potential. There are plenty of wonderful tools for creating great images, music, novels, and movies - but nothing that gives an artist access to the interactive power of the computer. What we need is a programming language for artists.
Programming languages are created by programmers, mostly for programmers. They reflect the needs and interests of programmers, not artists. There have been a few programming languages designed for non-programmers, but none of these address the needs of artists.
What would a programming language for artists look like? I can assure you that it won't look like this:
NEW story =
a dusty western town
+ a boy and his dog
+ a man realizing that he is no longer what he once was
+ a prostitute with a heart of gold
+ a chase scene
The problem is that the computer is too dumb to understand any of these concepts. If you want a programming language that can add two numbers together, that's easy, because computers readily understand addition and numbers. But a programming language for artists cannot include most of the elements that artists consider fundamental to their work. We need to reduce artistic fundamentals to even smaller fundamentals, those of the computer: addition, subtraction, multiplication, and division with numbers. This is not a task for artists; it demands cerebration, not inspiration. It is instead a task for mutant artist/programmers who unnaturally straddle the divide between the arts and the sciences. And what they produce will still demand that artists work hard to understand the mathematical constructs of the artistic programming language.
From Hot Air to Hot Circuits
At this point, I shall turn from concept to practice and describe my own work in this area: the Erasmatron. Now in its fourth generation, the Erasmatron provides the artist with the means to create interactive storytelling products. This technology incorporates a great many ideas that have their own special terminology; that is the way of all technology. For example, the product of the artist's labors with the Erasmatron is called a storyworld. It's a complete dramatic environment, populated with characters possessed of personality traits, moods, and relationships. It has stages where action takes place and props for the actors to use. These elements are precisely defined so that the artist can manipulate them precisely; here is a brief description of the major elements in a storyworld:
An actor is a character in the storyworld, endowed with the following basic traits:
location (which stage the actor is occupying)
spying on (whom the actor is spying on, if anybody)
strength (overall physical strength)
wealth (how rich the actor is)
loquacity (how talkative the actor is)
There are also basic traits such as age, weight, height, and so forth. Next come the three basic moods that affect an actor's behavior:
These moods are bipolar: the opposite of joy is sadness, so if an actor has joy/sadness equal to 0.0, then the actor is in a normal mood, but positive values indicate joy, and negative values indicate sadness.
Last come the personality traits. I spent many years honing this system and, while it isn't exactly what we'd want for good drama, it is (I think) the best compromise between dramatic requirements and computational capabilities.
These are also bipolar traits, so that an actor with honesty equal to 0.0 is normal; an actor with a positive honesty is more honest than normal, and an actor with negative honesty is more dishonest than normal.
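One way to picture an actor record is the minimal Python sketch below. It is an illustration only: the class name, the field names, the use of plain floats, and the helper describe_bipolar are all assumptions, and the numbers are invented. Nothing here is taken from the Erasmatron's actual internals.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Actor:
    """Hypothetical container for an actor's traits (names are illustrative)."""
    name: str
    location: str                      # which stage the actor is occupying
    spying_on: Optional[str] = None    # whom the actor is spying on, if anybody
    strength: float = 0.0
    wealth: float = 0.0
    loquacity: float = 0.0
    # Bipolar values: 0.0 is normal, positive leans toward the first pole,
    # negative leans toward the second pole.
    moods: Dict[str, float] = field(default_factory=dict)
    personality: Dict[str, float] = field(default_factory=dict)


def describe_bipolar(value: float, positive: str, negative: str) -> str:
    """Turn a bipolar value into a word: the sign picks the pole."""
    if value > 0.0:
        return positive
    if value < 0.0:
        return negative
    return "normal"


# Invented example values.
mary = Actor(name="Mary", location="town",
             moods={"joy/sadness": -0.4},
             personality={"honesty": -0.2})

print(describe_bipolar(mary.moods["joy/sadness"], "joy", "sadness"))         # sadness
print(describe_bipolar(mary.personality["honesty"], "honest", "dishonest"))  # dishonest
```

Keeping moods and personality traits in dictionaries keyed by name is just one convenient choice; it lets a storyworld add or rename traits without changing the class.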
What makes this system special is that it directly leads to emotional relationships through what I call "perceived traits." This is easiest to understand when we talk about honesty. Suppose that Mary has an honesty value of -0.2; Mary is somewhat dishonest. But suppose also that Fred perceives Mary's honesty value as +0.3. Then Fred trusts Mary more than she deserves to be trusted. This discrepancy is, of course, the fodder on which drama feeds. Thus, for every actor, we have an honesty value, but we also have a value for how every other actor perceives the actor's honesty value - their trust in that actor.
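The Mary and Fred example can be worked through in a few lines of Python. This is a sketch under assumptions: the dictionary layout and the function name misplaced_trust are hypothetical, chosen only to make the arithmetic of the discrepancy visible.

```python
# True trait values, keyed by actor name.
honesty = {"Mary": -0.2}

# perceived_honesty[observer][subject]: how the observer rates the subject.
perceived_honesty = {"Fred": {"Mary": 0.3}}


def misplaced_trust(observer: str, subject: str) -> float:
    """Positive result: the observer trusts the subject more than deserved."""
    return perceived_honesty[observer][subject] - honesty[subject]


gap = misplaced_trust("Fred", "Mary")
print(round(gap, 2))  # 0.5
```

A positive gap marks undeserved trust and a negative gap marks undeserved suspicion; either one is the kind of discrepancy the text describes as dramatic fodder.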
But it goes even one step further in abstraction. We also keep track of how every actor perceives every other actor's perception of every actor's honesty. "Huh?" you stammer. "What's that?" Think of it in terms of this snippet of dialogue from a story:
Don't ask Mary to borrow the money from Tom; you must not have known that Tom doesn't trust her. Ask Fred to borrow the money; Tom trusts Fred.
This atom of dramatic exchange relies on the fact that the speaker knows how Tom perceives the honesty of both Mary and Fred. It's a three-way relationship. Here's another example of the dramatic utility of this idea:
You invited both Mary and Annabeth to your party? How could you be so stupid? Don't you know they hate each other?
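The three-way relationship in the money-borrowing example can be sketched as a table keyed by three actors. The layout, the name believed_trust, the function best_go_between, and every number are assumptions made for illustration; the speaker is simply labeled "Speaker" here.

```python
from typing import Dict, List, Tuple

# Key: (knower, perceiver, subject). Value: what the knower believes about
# the perceiver's rating of the subject's honesty. Numbers are invented.
believed_trust: Dict[Tuple[str, str, str], float] = {
    ("Speaker", "Tom", "Mary"): -0.5,   # speaker believes Tom distrusts Mary
    ("Speaker", "Tom", "Fred"): 0.6,    # speaker believes Tom trusts Fred
}


def best_go_between(knower: str, lender: str, candidates: List[str]) -> str:
    """Pick the candidate the knower believes the lender trusts most."""
    return max(candidates, key=lambda c: believed_trust[(knower, lender, c)])


print(best_go_between("Speaker", "Tom", ["Mary", "Fred"]))  # Fred
```

The same three-part key would cover the party example: the host's mistake is a missing or wrong entry for how Mary and Annabeth regard each other.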
There are also stages, on which all events take place, and props, the objects that actors use during their interactions. I will skip the detailed specifications for these elements of the storyworld, as this kind of thing is common in the world of programming.
The other major component of the Erasmatron technology is Deikto, a special language of interaction designed for interactive storytelling. We seldom realize it, but language controls the way we interact with others. That was the foundation for the idea of Newspeak in the novel 1984; if the language lacked terms for freedom and justice, then nobody could express thoughts about those ideas and it would be easier to control the people. This concept, known in linguistics as the Sapir-Whorf hypothesis, has been disproved in its strongest form but confirmed in its softer applications. Basically, the Sapir-Whorf hypothesis addresses the influence that language has on our thinking. The strong version of the hypothesis, which never had many supporters, posits that language actually controls our thinking; it is impossible to think ideas that are not reflected in the language we use. This flavor of the hypothesis, as I said, has fallen into disfavor. The softer version of the hypothesis posits that language merely influences our thinking, and a variety of studies have supported this softer version of the hypothesis.
I decided to take advantage of this concept by creating my own little language that would express the concepts required for interactive storytelling. Any such language is necessarily limited; it is simply impossible to capture the richness of natural human language inside a computer. But if we tackle the much smaller problem of creating a narrow-purpose language, we bring the problem within reach of current computers. The trick here lies in keeping the language small and manageable; engaging in wild flights of fancy would surely doom any such effort to failure.
The most important reason for design parsimony in creating my toy language is the requirement that every single word used in this language must be fully computable. If I add the verb "give" to my toy language, then I must program the computer to understand everything about that verb: that it represents the transfer of an object from one person to another. The concept of ownership must be part of the program; and the program must understand that people can only give objects that they already own. That's a lot of programming - and that's only what's required for a simple verb such as "give."
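To make the cost of a single verb concrete, here is a minimal Python sketch of "give" with the ownership rule described above. The World class, its owner_of dictionary, and the method signature are hypothetical; the real implementation would need far more than this.

```python
class World:
    """Hypothetical ownership ledger: maps each prop to its current owner."""

    def __init__(self):
        self.owner_of = {}

    def give(self, giver: str, receiver: str, prop: str) -> bool:
        """Transfer prop from giver to receiver; fail if giver does not own it."""
        if self.owner_of.get(prop) != giver:
            return False
        self.owner_of[prop] = receiver
        return True


world = World()
world.owner_of["candlestick"] = "Me"

print(world.give("Fred", "Jane", "candlestick"))  # False: Fred does not own it
print(world.give("Me", "Jane", "candlestick"))    # True
print(world.owner_of["candlestick"])              # Jane
```

Even this toy version needs a model of ownership and a rule about who may give what, which is the point of the paragraph above.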
Here is the list of verbs I have established so far in Deikto:
agree to deal
make love to
search for actor
search for prop
small talk with
You may find this list laughably short; how can one possibly tell a story using such a paucity of verbs? First, it is important to remember that in interactive storytelling, we need expressive richness but not descriptive richness. We don't need the nuance and poetry that a truly gifted writer weaves into a good story. That's necessary for literature, but interactive storytelling is not literature and could never hope to compete with literature. We need the characters to be able to do things, but we don't need synonyms.
Even though this list is short, it imposes many complexities. Consider the difference between the verbs "advise," "command," and "request." Each of these verbs relies on a different relationship between the subject and the direct object. Advice relies on the advisee's respect for the wisdom of the advisor (his perceived value of the subject's Smart trait); commands rely on the direct object's deference to the subject (his perceived value of the subject's Dominating trait); and requests hinge on the good will that the direct object feels toward the subject (his perceived value of the subject's Good trait).
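The dependence of each verb on a different perceived trait can be sketched as a lookup table. The mapping of verbs to the Smart, Dominating, and Good traits comes from the paragraph above; the data layout, the function name leverage, and the numbers are assumptions.

```python
# Which perceived trait of the subject each verb leans on.
VERB_TRAIT = {
    "advise": "Smart",
    "command": "Dominating",
    "request": "Good",
}

# perceived[observer][subject][trait]; the numbers are invented.
perceived = {
    "Mary": {"Doc": {"Smart": 0.7, "Dominating": -0.1, "Good": 0.4}},
}


def leverage(verb: str, subject: str, direct_object: str) -> float:
    """How much weight the direct object gives the subject's utterance."""
    trait = VERB_TRAIT[verb]
    return perceived[direct_object][subject][trait]


print(leverage("advise", "Doc", "Mary"))   # 0.7
print(leverage("command", "Doc", "Mary"))  # -0.1
```

With these invented numbers, Mary would weigh Doc's advice heavily but would be unmoved by a command from him.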
Worse, such verbs require multiple clauses, such as this:
direct object that direct object (executes some verb) (to some other direct object).
As you can see, this can become quite messy.
This is where a second feature of Deikto comes into play. Deikto is a toy language that will never be spoken nor written; it will appear only on computer screens. Accordingly, it suffers none of the constraints - nor enjoys all the benefits - of speech and writing. In particular, Deikto can be displayed on the two-dimensional computer screen; why should such a language be restricted to the one-dimensional structure used in human languages? The result of this realization looks like this:
This Deikto sentence says, "I advise Mary (with medium urgency) that she should go somewhere." The last word in the sentence has not yet been filled in; this demonstrates the user interface for Deikto. The user builds sentences word by word, clicking on words like "where?," which brings up a pop-up menu showing a list of words appropriate for insertion into the blank spot.
Here's another, more complicated sentence showing off Deikto at its most complicated:
This sentence says, "I offer Jane a deal: if Jane will kiss Fred, then I will give Jane the candlestick."
The concept of a deal can be extended to a threat by use of the negation icon (the yin-yang symbol just above the verb). A version of this deal presenting a threat might read, "I offer Jane a deal: if Jane will kiss Fred, then I will not kill Jane."
However, I have drawn the line at the complexities of temporality in deal-making. How soon should Jane kiss Fred? Must I give her the candlestick before she kisses Fred, or vice versa? These matters are computationally simple, but expressively difficult: Deikto cannot express such considerations without undue complexity. I have decided to make all deals implicitly immediate; this is an example of the many trade-offs I had to make between dramatic completeness and artistic conciseness.
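A deal with a negatable consequence might be modeled as below. This is a sketch, not Deikto's actual data structure: the Clause and Deal classes, their fields, and the linear rendering are assumptions. The negated flag stands in for the negation icon, and the absence of any timing field reflects the decision to make all deals implicitly immediate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clause:
    subject: str
    verb: str
    direct_object: str
    prop: Optional[str] = None
    negated: bool = False   # stands in for the negation icon

    def linear(self) -> str:
        """Render the clause as a flat string of words."""
        words = [self.subject]
        if self.negated:
            words.append("not")
        words += [self.verb, self.direct_object]
        if self.prop:
            words.append(self.prop)
        return " ".join(words)


@dataclass
class Deal:
    offerer: str
    offeree: str
    condition: Clause     # what the offeree must do
    consequence: Clause   # what the offerer will then do, or refrain from doing

    def linear(self) -> str:
        return (f"{self.offerer} offer deal {self.offeree}: "
                f"{self.consequence.linear()} if {self.condition.linear()}")


bribe = Deal("I", "Jane",
             condition=Clause("Jane", "kiss", "Fred"),
             consequence=Clause("I", "give", "Jane", prop="candlestick"))
threat = Deal("I", "Jane",
              condition=Clause("Jane", "kiss", "Fred"),
              consequence=Clause("I", "kill", "Jane", negated=True))

print(bribe.linear())   # I offer deal Jane: I give Jane candlestick if Jane kiss Fred
print(threat.linear())  # I offer deal Jane: I not kill Jane if Jane kiss Fred
```

The only difference between the bribe and the threat is one flag on the consequence clause, which mirrors how a single icon turns a deal into a threat.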
Of course, there's much more to creating a language than merely creating a list of verbs and designing a graphical scheme for displaying sentences. A huge pile of algorithms must be created for all the words in the language. I earlier mentioned the need to create algorithms reflecting the physical results of a sentence, such as transferring ownership of a given object. But there's much more to be put into algorithms. The most complicated problem arises from determining the reaction of characters to events. Will Jane decide to accept my candlestick in return for kissing Fred? That depends on how much she wants the candlestick and how she feels about Fred. And she has more options than merely accepting or rejecting my offer. She might propose a counteroffer, or react angrily to my offer. All of these possibilities must be built directly into the language.
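As a purely illustrative sketch of such a reaction algorithm, the function below picks one of the four reactions named above from two inputs: how much Jane wants the candlestick and how she feels about Fred. The additive formula, the thresholds, and the function name are all invented; the source does not say how the Erasmatron actually weighs these factors.

```python
def react_to_deal(desire_for_reward: float, feeling_toward_target: float) -> str:
    """Choose a reaction; both inputs are bipolar, with 0.0 being neutral.

    The scoring formula and the thresholds are invented for this sketch.
    """
    appeal = desire_for_reward + feeling_toward_target
    if appeal > 0.5:
        return "accept"
    if appeal > 0.0:
        return "counteroffer"
    if appeal > -0.5:
        return "reject"
    return "react angrily"


print(react_to_deal(0.6, 0.2))    # accept
print(react_to_deal(0.4, -0.2))   # counteroffer
print(react_to_deal(0.1, -0.9))   # react angrily
```

A real storyworld would presumably let the storybuilder shape this calculation per verb and per character; the sketch only shows that the choice among several reactions can be reduced to arithmetic on traits.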
Expressing the Loss of a Child in an Imaginary Interaction via Deikto
Here I will use the space-saving convention of reducing Deikto sentences to linear structures. This is not the only way that an interaction involving the loss of a child could be handled; it is one of many thousands of possible ways. That's the beauty of interactivity: it addresses possibilities, not instances. That is also what makes interactivity so difficult to comprehend.
This is not a storyworld; it is a representation of a single journey through a storyworld. Interactive storyworlds offer the player zillions of possible stories; the player, in effect, builds a story in concert with the storybuilder. The storybuilder defines the content of the story while creating the storyworld; the player elicits that content in the playing of the storyworld, instantiating it into a single story. Thus, this presentation only offers a single possibility, not the full richness of the storyworld. To hint at the greater possibilities of such a storyworld, I have provided in italics a few alternative options that the player (playing Mom) could have chosen.
Billy request Mom that Mom permit that Billy go to town.
Mom permit that Billy go to town.
Mom not permit that Billy go to town.
Mom require that Billy go to town with Mom.
Mom offer deal Billy: Mom permit that Billy go to town
if Billy promise Mom that Billy not go to lake.
Mom offer deal Billy: Mom permit that Billy go to town
if Billy promise Mom that Billy not play with Johnny.
Mom command Billy that Billy not go to lake.
Billy go to town.
Billy meet Johnny.
Billy play with Johnny.
Johnny command Billy that Billy go to lake.
Billy slightly not accept command Johnny.
Johnny greatly insult Billy.
Billy slightly accept command Johnny that Billy go to lake.
Johnny go to lake.
Billy go to lake.
Johnny play with Billy.
Fate injure Billy.
Johnny flee from Billy.
Billy exclaim greatly.
Sheriff find Billy.
Sheriff take Billy to home.
Sheriff tell Mom story.
Mom exclaim greatly.
Sheriff go to town.
Sheriff find Doc.
Sheriff tell Doc story.
Sheriff request Doc that Doc go to home.
Doc go to home.
Sheriff find Dad.
Sheriff tell Dad story.
Dad go to home.
Doc tell Mom that Billy health very negative.
Mom request Doc that Doc make Billy health positive.
Doc tell Mom that Doc not able make Billy health positive.
Doc advise Mom that Mom take Billy to city.
Mom accept advice.
Mom go to town.
Dad go to town.
Mom find Jethro.
Mom tell Jethro story.
Mom request Jethro that Jethro give Mom $300.
Jethro not accept request Mom.
Jethro offer deal that Jethro give Mom $300 and Mom give Jethro home.
Mom request urgently that Jethro give Mom $300.
Mom accept deal.
Jethro refuse request.
Dad threaten Jethro that Jethro give Mom $300 lest Dad injure Jethro.
Mom advise Dad that Dad not fight Jethro.
Jethro insult Dad.
Dad hit Jethro with fist.
Jethro hit Dad with fist.
Sheriff take Dad to jail.
Mom go home.
Mom request Sheriff that Sheriff give Mom $300.
Mom request Hank that Hank give Mom $300.
Mom rob bank.
Fate kill Billy.
The key point here is that, even though the language is skeletal, the story comes through quite clearly. This example demonstrates just how different interactive storytelling is from conventional storytelling. In terms of conventional storytelling, this story is wretched: there is no color, no subtlety, no nuance. However, this kind of storytelling boasts one advantage - and only one - over conventional storytelling: it is interactive. Mom could have made decisions that would have taken the story in different directions.
Another interesting point about this system is the role played by Fate in the story. Fate is an active character in the story who executes actions that are usually presented in passive form. The villain doesn't accidentally slip at the crucial moment and fall from the impossibly high precipice; Fate does that to the villain. Luke Skywalker didn't just accidentally run into Obi-Wan Kenobi; Fate made that happen. Fate is the most important character in interactive storyworlds, the direct representative of the storybuilder, who maintains the dramatic thrust of the story as it develops. The storybuilder must program Fate to watch over the story as it develops and keep everything moving in the proper direction. This most commonly takes the form of goosing up the action when the player manages to wander into a boring situation. The artist can program Fate to detect the loss of dramatic momentum and inject some new event guaranteed to get the story moving again.
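The idea of Fate detecting lost momentum and injecting an event can be sketched as follows. Everything specific here is an assumption: the set of "dramatic" verbs, the window of three recent events, the threshold, and the function name fate_step are placeholders that a storybuilder would define differently.

```python
from typing import List, Optional, Tuple

Event = Tuple[str, str]  # (subject, verb)

# Assumed set of verbs that count as dramatic for this sketch.
DRAMATIC_VERBS = {"insult", "injure", "threaten", "hit", "flee"}


def fate_step(history: List[Event],
              intervention: Event,
              window: int = 3,
              threshold: float = 0.34) -> Optional[Event]:
    """Return the intervention if recent events lack drama, else None."""
    recent = history[-window:]
    if not recent:
        return None
    dramatic = sum(1 for (_, verb) in recent if verb in DRAMATIC_VERBS)
    if dramatic / len(recent) < threshold:
        return intervention
    return None


quiet = [("Billy", "go"), ("Billy", "meet"), ("Billy", "play")]
tense = [("Johnny", "insult"), ("Billy", "go"), ("Johnny", "flee")]

print(fate_step(quiet, ("Fate", "injure")))  # ('Fate', 'injure')
print(fate_step(tense, ("Fate", "injure")))  # None
```

Here Fate stays silent while the story carries itself and acts only when the recent stretch has gone flat, which is the "goosing up the action" role described above.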
I conclude with a small but critical observation. Most researchers working on interactive storytelling technology use the term "drama manager" for the system of algorithms that I call "Fate." Their term is technically superior, because it more precisely describes the function of this software. However, bridging the gap between artist and programmer will require terminological compromise, and I find "Fate" a snappier and more recognizable term than "drama manager." Here is a partial list of technical terms used by the Erasmatron; note how often I steal from the world of drama:
Actor (not "avatar")
Prop (not "object")
Inclination (I might change this to "Preference")
The tradeoff between artistic power and manageability has been the most vexing problem in designing the Erasmatron. I solved the technical problems in the hidden areas of the program, and placed the means of solving the more artistic problems into the programming language I created. The separation of the artistic from the technical is not clean; I had to make some of the artistic decisions in building the Erasmatron, and the artist has to make some technical decisions in adjusting the language. Users always demand more power, but they seldom realize just how complicated a powerful program can be. Consider word processing. Perhaps you recall the clean and simple word processors of the early years. They were intuitive and easy to learn. Compare those word processors with a beast such as Microsoft Word. Yes, Word is certainly a powerful word processor - but do you know anybody who has truly mastered the program, who understands every nook and cranny of its capabilities? Consider the heavy books explaining how to use Microsoft Word. If interactive storytelling required the artist to master such books, do you think it could ever get off the ground?
The Erasmatron is not as easy to learn as the early word processors. It does require the equivalent of a book to explain its utilization. However, that book will be a hundred pages long, not a thousand. With the passage of time, as artists grow more comfortable with the Erasmatron, I will add more power to the program and its manual will someday challenge Word's manual in the World Heavyweight Manual championships - but not just yet.
The Erasmatron is an early effort, and as such it is flawed. Programmers and artists face a huge task in bringing computer technology within the reach of artists. This task will require a great deal of trial and error. Both sides must stretch to the utmost to bring their fingertips within touching distance of each other. Artists must commit themselves to working with clumsy, weak, hard-to-learn software in order to show programmers how to make that software less clumsy, more powerful, and easier to use.