A Preliminary Poetics
The builder of Façade, an “interactive story world,” Michael Mateas offers both a poetics and a neo-Aristotelian project (for interactive drama and games).
Interactive drama has been discussed for a number of years as a new AI-based interactive experience (Laurel 1986; Bates 1992). While there has been substantial technical progress in building believable agents (Bates, Loyall, and Reilly 1992; Blumberg 1996; Hayes-Roth, van Gent, and Huber 1996), and some technical progress in interactive plot (Weyhrauch 1997), no work has yet been completed that combines plot and character into a full-fledged dramatic experience. The game industry has been producing plot-based interactive experiences (adventure games) since the beginning of the industry, but only a few of them (such as The Last Express) begin to approach the status of interactive drama. Part of the difficulty in achieving interactive drama is due to the lack of a theoretical framework guiding the exploration of the technological and design issues surrounding interactive drama. This paper proposes a theory of interactive drama based on Aristotle’s dramatic theory, but modified to address the interactivity added by player agency. This theory provides both design guidance for interactive dramatic experiences that attempt to maximize player agency (answering the question “What should I build?”) and technical direction for the AI work necessary to build the system (answering the question “How should I build it?”). In addition to clarifying notions of interactive drama, the model developed in this essay also provides a general framework for analyzing player agency in any interactive experience (e.g., interactive games).
This neo-Aristotelian theory integrates Murray’s (1998) proposed aesthetic categories for interactive stories and Aristotle’s structural categories for drama. The theory borrows from Laurel’s treatment of Aristotle in an interactive context (Laurel 1986, 1991) but extends it by situating Murray’s category of agency within the model; the new model provides specific design guidelines for maximizing user agency. First, I present the definition of interactive drama motivating this theory and situate this definition with respect to other notions of interactive story. Next, I present Murray’s three categories of immersion, agency, and transformation. Then, I present a model of Aristotle’s categories relating them in terms of formal and material causation. Within this model, agency will be situated as two new causal chains inserted at the level of character. Finally, I use the resulting model to clarify conceptual and technical issues involved in building interactive dramatic worlds, and briefly describe a current project informed by this model.
Defining Interactive Drama
Many game designers, writers, and theorists have wrestled with the vexing question, “What is interactive story?” This paper continues a specific thread of discussion with respect to this question, the thread begun by Laurel’s adoption of an Aristotelian framework first for interactive drama (Laurel 1986) and then more generally for interactive experiences (Laurel 1991) and continued by Murray’s description of the experiential pleasures and properties of interactive stories (Murray 1998). Whereas Murray explores a variety of interactive story types, this essay focuses explicitly on the notion of interactive drama as defined in Laurel’s thesis (Laurel 1986) and pursued by the Oz Project at Carnegie Mellon University (Bates, Loyall, and Reilly 1992; Weyhrauch 1997).
In this conception of interactive drama, the player assumes the role of a first-person character in a dramatic story. The player does not sit above the story, watching it as in a simulation, but is immersed in the story.
Following Laurel, dramatic (Aristotelian) stories are distinguished from narrative stories by the following properties:
Enactment vs. Description
Intensification vs. Extensification
Unity of Action vs. Episodic Structure
Enactment refers to action. Dramas utilize action rather than description to tell a story. Intensification is achieved by arranging incidents so as to intensify emotion and condense time. In contrast, narrative forms often “explode” incidents by offering many interpretations of the same incident, examining the incident from multiple perspectives, and expanding time. Unity of action refers to the arrangement of incidents such that they are all causally related to a central action. One central theme organizes all the incidents that occur in the story. Narratives tend to employ episodic structure, in which the story consists of a collection of causally unrelated incidents.
Certainly not all interactive story experiences must have the properties of Aristotelian drama. In fact, most interactive story experiences built to date have either been highly episodic (generally those narrative experiences built by the game industry, e.g., adventure games), have employed a hypertextual logic of association rather than a logic of dramatic probability and causality (generally those experiences built by fine artists and writers), or have focused on story not as a highly structured experience created by an author for consumption by an audience, but rather as a shared social construction facilitating human communication (e.g., multiuser worlds such as MUDs, MOOs, and avatar spaces; massive multiplayer games such as Everquest and Ultima Online; and games such as Purple Moon’s Rocket series or Will Wright’s The Sims). Additionally, the interaction in an interactive story does not necessarily have to be first-person interaction as a character within the story. The neo-Aristotelian poetics developed here informs a specific niche within the space of interactive narrative and provides a principled way of distinguishing this niche from other interactive narrative experiences.
Murray’s Aesthetic Categories
Murray (1998) proposes three aesthetic categories for the analysis of interactive story experiences: immersion, agency, and transformation.
Immersion is the feeling of being present in another place and engaged in the action therein. Immersion is related to Coleridge’s “willing suspension of disbelief” – when a participant is immersed in an experience, they are willing to accept the internal logic of the experience, even though this logic deviates from the logic of the real world. A species of immersion is telepresence, the feeling of being physically present (from a first-person point of view) in a remote environment.
Agency is the feeling of empowerment that comes from being able to take actions in the world whose effects relate to the player’s intention. This is not mere interface activity. If there are many buttons and knobs for the player to twiddle, but all this twiddling has little effect on the experience, there is no agency. Furthermore, the effect must relate to the player’s intention. If, in manipulating the interface elements, the player does have an effect on the world, but the effects are not the ones the player intended (perhaps the player was randomly trying things because they didn’t know what to do, or perhaps the player thought that an action would have one effect, but it instead had another), then there is no agency.
Transformation is the most problematic of Murray’s three categories. Transformation has at least three distinct meanings:
Transformation as masquerade. The game experience allows the player to transform themselves into someone else for the duration of the experience.
Transformation as variety. The game experience offers a multitude of variations on a theme. The player is able to exhaustively explore these variations and thus gain an understanding of the theme.
Personal transformation. The game experience takes the player on a journey of personal transformation.
Transformation as masquerade and variety can be seen as means to effect personal transformation.
Integrating Agency into Aristotle
Murray’s categories are phenomenological categories of the interactive story experience, that is, categories describing what it feels like to participate in an interactive story. Aristotle’s categories (described later) are structural categories for the analysis of drama, that is, categories describing what parts a dramatic story is made out of. The trick in developing a theoretical framework for interactive drama is integrating the phenomenological (that is, what it feels like) aspect of first-person experiences with the structural aspects of carefully crafted stories. In attempting this integration, I first discuss the primacy of the category of agency. Second, I briefly present an interpretation of the Aristotelian categories in terms of material and formal cause. Finally, agency is integrated into this model.
Primacy of Agency
From an interactive dramatic perspective, agency is the most fundamental of Murray’s three categories. Immersion, in the form of engagement, is already implied in the Aristotelian model. Engagement and identification with the protagonist are necessary in order for an audience to experience catharsis. Transformation, in the form of change in the protagonist, also already exists in the Aristotelian model. Murray’s discussion of transformation as variety, particularly in the form of the kaleidoscopic narrative that refuses closure, is contrary to the Aristotelian ideals of unity and intensification. To the extent that we want a model of interactive drama, as opposed to interactive narrative, much of Murray’s discussion of transformation falls outside the scope of such a model. While immersion and transformation exist in some form in noninteractive drama, the audience’s sense of having agency within the story is a genuinely new experience enabled by interactivity. For these reasons, agency will be the category integrated with Aristotle.
Following Laurel (1991), Aristotle’s theory of drama is represented in Figure 3.1.
Aristotle analyzed plays in terms of six hierarchical categories, corresponding to different “parts” of a play. These categories are related via material cause and formal cause. The material cause of something is the material out of which the thing is created. For example, the material cause of a building is the building materials of which it is constructed. The formal cause of something is the abstract plan, goal, or ideal towards which something is heading. For example, the formal cause of a building is the architectural blueprints.
In drama, the formal cause is the authorial view of the play. The author has constructed a plot that attempts to explicate some theme. The characters required in the play are determined by the plot; the plot is the formal cause of the characters. A character’s thought processes are determined by the kind of character they are. The language spoken by the characters is determined by their thought. The patterns (song) present in the play are determined, to a large extent, by the characters’ language (more generally, their actions). The spectacle, the sensory display presented to the audience, is determined by the patterns enacted by the characters.
In drama, the material cause is the audience’s view of the play. The audience experiences a spectacle, a sensory display. In this display, the audience detects patterns. These patterns are understood as character actions (including language). Based on the character’s actions and spoken utterances, the audience infers the characters’ thought processes. Based on this understanding of the characters’ thought processes, the audience develops an understanding of the characters, the characters’ traits and propensities. Based on all this information, the audience understands the plot structure and the theme. In a successful play, the audience is then able to recapitulate the chain of formal causation. When the plot is understood, there should be an “a-ha” experience in which the audience is now able to understand how the characters relate to the plot (and why they must be the characters they are), why those types of characters think the way they do, why they took the actions they did and said what they did, how their speech and actions created patterns of activity, and how those patterns of activity resulted in the spectacle that the audience saw. By a process of interpretation, the audience works up the chain of material cause in order to recapitulate the chain of formal cause.
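The two readings of the hierarchy can be laid out as a small data structure. The Python below is only an illustrative sketch, not part of Aristotle’s or Laurel’s account: the names CATEGORIES, formal_chain, and material_chain are hypothetical, and the six labels simply follow the order used in the preceding paragraphs.

```python
# Aristotle's six categories, ordered from the top of the hierarchy (plot)
# to the bottom (spectacle). The labels follow the essay; the code structure
# is an assumption made for illustration.
CATEGORIES = ["plot", "character", "thought", "language", "pattern", "spectacle"]


def formal_chain():
    """Return (cause, effect) pairs read top-down: the authorial view."""
    return list(zip(CATEGORIES, CATEGORIES[1:]))


def material_chain():
    """Return (cause, effect) pairs read bottom-up: the audience's view."""
    bottom_up = CATEGORIES[::-1]
    return list(zip(bottom_up, bottom_up[1:]))


if __name__ == "__main__":
    for cause, effect in formal_chain():
        print(f"formal:   {cause} determines {effect}")
    for cause, effect in material_chain():
        print(f"material: {cause} is the material for {effect}")
```

Read this way, the audience’s interpretive work is a traversal of material_chain() that ends by reconstructing formal_chain() in reverse.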
Adding interaction to the Aristotelian model can be considered the addition of two new causal chains at the level of character.
In Figure 3.2, the gray arrows are the traditional chains of material and formal causation.
The player has been added to the model as a character who can choose his or her own actions. This has the consequence of introducing two new causal chains. The player’s intentions become a new source of formal causation. By taking action in the experience, the player’s intentions become the formal cause of activity happening at the levels from language down to spectacle. But this ability to take action is not completely free; it is constrained from below by material resources and from above by authorial formal causation from the level of plot.
The elements present below the level of character provide the player with the material resources (material cause) for taking action. The only actions available are the actions supported by the material resources present in the game. The notion of affordance (Norman 1988) from interface design is useful here. In interface design, affordances are the opportunities for action made available by an object or interface. But affordance is even stronger than implied by the phrase “made available”; in order for an interface to be said to afford a certain action, the interface must in some sense “cry out” for the action to be taken. There should be a naturalness to the afforded action that makes it the obvious thing to do. For example, the handle on a teapot affords picking up the teapot with your hand. The handle cries out to be grasped. In a similar manner, the material resources in an interactive drama afford action. Thus these resources not only limit what actions can be taken (the negative form of constraint) but cry out to make certain actions obvious (the positive form of constraint). Several examples of the material affordances in interactive drama are provided later.
The characters in an interactive drama should be rich enough that the player can infer a consistent model of the characters’ thoughts. If the characters’ thoughts can be understood (e.g., goals, motivations, desires), then these thoughts become a material resource for player action. By reasoning about the other characters’ thoughts, the player can take actions to influence these characters, either to change their thoughts, or actively help or hinder them in their goals and plans.
The dialogue (language) spoken by the characters and the opportunities for the player to engage in dialogue are other material resources for action. Dialogue is a powerful means for characters to express their thoughts, thus instrumental for helping the player to infer a model of the characters’ thoughts. Conversely, dialogue is a powerful means to influence character behavior. If the experience makes dialogue available to the player (and most contemporary interactive experiences do not), this becomes a powerful resource for expressing player intention.
The objects available in the experience (I place the presence of interactive objects somewhere between spectacle and pattern) are yet another resource for player action.
Finally, the mechanics of interaction (spectacle) provide the low-level resources for player actions. The mechanics provide the interface conventions for taking action.
In addition to the material affordances (constraints) from below, the player experiences formal constraints from above. Of course, these constraints are not directly perceived by the player, but, just as in noninteractive drama, are understood by recapitulating the author’s chain of formal causation by making inferences along the chain of material causation. In noninteractive drama, understanding the formal chain of causation allows the audience to appreciate how all the action of the play stems from the dramatic necessity of the plot and theme. In interactive drama, the understanding of the formal causation from the level of plot to character additionally helps the player to have an understanding of what to do, that is, why they should take action within the story world at all. Just as the material constraints can be considered as affording action from the levels of spectacle through thought, the formal constraints afford motivation from the level of plot. This motivation is conveyed as dramatic probability. By understanding what actions are dramatically probable, the player understands what actions are worth considering.
We are now ready to propose a prescriptive, structural model for agency. A player will experience agency when there is a balance between the material and formal constraints. When the actions motivated by the formal constraints (affordances) via dramatic probability in the plot are commensurate with the material constraints (affordances) made available from the levels of spectacle, pattern, language, and thought, then the player will experience agency. An imbalance results in a decrease in agency. This will be made clearer by considering several examples.
Many puzzle-based adventures suffer from the imbalance of providing more material affordances than formal affordances. This results in the feeling of having many things to do (places to go, objects to fiddle with) without having any sense of why any one action would be preferable to another. For example, Zork Grand Inquisitor offers a rich world to navigate and many objects to collect and manipulate. Yet, since there is no unity of action, there is no way to relate current actions to the eventual goal of defeating the Grand Inquisitor. This leaves the player in the position of randomly wandering about trying strange juxtapositions of objects. This detracts from the sense of agency – though the player can take action, this action is often not tied to a high-level player intention. Notice that adding more material opportunities for action would not help the matter. The problem is not a lack of things to do; the problem is having insufficient formal constraint to decide between choices.
Quake (and its ilk) induce agency by providing a nice balance between material and formal constraints. The proto-plot establishes the following formal constraints (dramatic probabilities):
Everything that moves will try to kill you.
You should try to kill everything.
You should try to move through as many levels as possible.
From these three principles, all the rest of the action follows. The material affordances perfectly balance these formal affordances. The player can run swiftly and smoothly through the space. The player can pick up a wide array of lethal weapons. The player can fire these weapons at monsters and produce satisfying, gory deaths. The monsters’ behavior is completely consistent with the “kill or be killed” ethos. Everything that one would want to try and do given the formal constraints is doable. There are no extraneous actions available (for example, being able to strike up a conversation with a monster) that are not dictated by the formal constraints.
Note that though these example games are not specifically interactive drama, the model can still be used to analyze player agency within these games. Though the model is motivated by interactive drama, it can be used to analyze the sense of agency in any interactive experience by analyzing the experience in terms of the dramatic categories offered by the model. For example, though Quake has neither plot nor characters in the strict sense, there are top-down player expectations established by a “proto-plot.” This “proto-plot” is communicated by the general design of the spectacle (e.g., the design of the creepy industrial mazes) as well as the actions of the characters, even if these characters do have primitive diction and thought.
Again, in order to invoke a sense of agency, an interactive experience must strike a balance between the material and formal constraints. An experience that successfully invokes a sense of agency inhabits a “sweet spot” in design space. Trying to add additional formal constraints (more plot) or additional material constraints (more actions) to a balanced experience will likely move it out of the sweet spot.
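The balance heuristic can be made concrete with a toy calculation. The essay gives no quantitative measure of agency, so everything in the sketch below is an assumption introduced for illustration: formal and material affordances are treated as plain sets of action names, balance is scored with Jaccard overlap, and the two example experiences and their action lists are hypothetical stand-ins rather than descriptions of any real game.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experience:
    """A hypothetical interactive experience reduced to two sets of actions."""
    name: str
    formal: frozenset    # actions the plot makes dramatically probable
    material: frozenset  # actions the world materially supports


def balance(exp):
    """Jaccard overlap of formal and material affordances (assumed metric).

    1.0 means the two sets coincide; lower values mean an imbalance.
    """
    union = exp.formal | exp.material
    if not union:
        return 0.0  # nothing motivated and nothing doable
    return len(exp.formal & exp.material) / len(union)


def unmotivated(exp):
    """Actions the player can take but has no plot-level reason to take."""
    return exp.material - exp.formal


def unsupported(exp):
    """Actions the plot motivates but the world does not allow."""
    return exp.formal - exp.material


# Hypothetical shooter: every motivated action is doable, and nothing else is.
shooter_actions = frozenset({"run", "pick up weapon", "fire weapon", "advance level"})
shooter = Experience("shooter", formal=shooter_actions, material=shooter_actions)

# Hypothetical puzzle adventure: many things to do, few reasons to do them.
puzzle = Experience(
    "puzzle adventure",
    formal=frozenset({"reach goal"}),
    material=frozenset({"walk", "collect object", "combine objects", "reach goal"}),
)

if __name__ == "__main__":
    for exp in (shooter, puzzle):
        print(exp.name, balance(exp), sorted(unmotivated(exp)), sorted(unsupported(exp)))
```

In this toy model, adding more material actions to the puzzle adventure enlarges the union without enlarging the intersection, so the score falls further, which mirrors the point that more things to do does not repair a shortage of formal constraint.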
Relationship to Immersion and Transformation
In the previous section, agency was taken as the fundamental Murray category to integrate with Aristotle. In this section, I examine what the new, integrated model has to say about immersion and transformation.
Murray suggests three ways of inducing immersion: structuring participation with a mask (an avatar), structuring participation as a visit, and making the interaction conventions (the interface mechanics) seamless. These three mechanisms can be viewed, in turn, as a way to provide material and formal constraints, as a design suggestion for balancing the constraints, or as a design suggestion for providing effective material constraints at the level of spectacle. Agency is a necessary condition for immersion.
An avatar can provide both material and formal constraints on a player’s actions. The avatar can provide character exposition through such traits as physical mannerisms and speech patterns. This character exposition helps the player to recapitulate the formal, plot constraints. Through both input and output filtering (e.g., the characters in Everquest, or Mateas 1997), the avatar can provide material constraints (affordances) for action.
A visit is one metaphor for balancing material and formal constraints when the material opportunities for action are limited. From the formal side, the conventions of a visit tell the player that they won’t be able to do much. Visits are about just looking around, possibly being guided through a space. Given the limited expectations for action communicated by the formal constraints, the designer can get away with providing (and in fact must provide only) limited material means for action.
The mechanics provide the material resources for action at the level of spectacle (the interface can be considered part of the spectacle). Providing a clean, transparent interface ensures that agency (and thus immersion) will not be disrupted.
Most of Murray’s discussion of transformation examines transformation as variety, particularly in the form of kaleidoscopic narratives that can be reentered multiple times so as to experience different aspects of the story. Agency, however, requires that a plot structure be present to provide formal constraints. An open-ended story without a clear point of view may disrupt the plot structure too much, thus disrupting agency. However, transformation as variety is necessary to make interaction really matter. If, every time a player enters the dramatic world, roughly the same story events occur regardless of the actions taken by the player, the player’s interaction will seem inconsequential; the player will actually have no real effect on the story.
One way to resolve the apparent conflict between transformation and agency is to note that agency is a first-person experience induced by making moment-by-moment decisions within a balanced (materially and formally) interactive system, while transformation as variety is a third-person experience induced by observing and reflecting on a number of interactive experiences. Imagine an interactive drama system that guides the player through a fixed plot. As the player interacts in the world, the system, through a number of clever and subtle devices, moves the fixed plot forward. Given that these devices are clever and subtle, the player never experiences them as coercive; the player is fully engaged in the story, forming intentions, acting on them, and experiencing agency. Imagine an observer who watches many players interact with this system. The observer notices that no matter what the players do, the same plot happens (meaning that roughly the same story events occur in the same order, leading to the same climax).
By watching many players interact with the system, the observer has begun to discern the devices that control the plot in the face of player interaction. This observer will conclude that the player has no true agency, that the player is not able to form any intentions within the dramatic world that actually matter. But the first-time player within the world is experiencing agency. The designer of the dramatic world could conclude – because they are designing the world for the player, not for the observer – that as long as the player experiences a true sense of interactive freedom (that is, agency) transformation as variety is not an important design consideration.
The problem with this solution to the agency vs. transformation dilemma becomes apparent as the player interacts with the world a second time. On subsequent replays of the world, the player and the observer become the same person. The total interactive experience consists of both first-person engagement within the dramatic world and third-person reflection across multiple experiences in the world. In order to support the total experience, the dramatic world must support both first-person engagement and third-person reflection; must provide agency and transformation as variety.
A dramatic world supporting this total experience could provide agency (and the concomitant need to have a plot structure providing formal constraints) and transformation by actively constructing the player experience such that each run-through of the story has a clean, unitary plot structure, but multiple run-throughs have different, unitary plot structures. Small changes in the player’s choices early on result in experiencing a different unfolding plot. The trick is to design the experience such that, once the end occurs, any particular run-through has the force of dramatic necessity.
The story should have the dramatic probabilities smoothly narrowing to a necessary end. Early choices may result in different necessary ends – later choices can have less effect on changing the whole story, since the set of dramatically probable events has already significantly narrowed. Change in the plot should not be traceable to distinct branch points; the player will not be offered an occasional small number of obvious choices that force the plot in a different direction. Rather, the plot should be smoothly mutable, varying in response to some global state that is itself a function of the many small actions performed by the player throughout the experience.
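One way to picture a plot that is smoothly mutable rather than branching is a global state accumulated from many small player actions, with the space of probable endings sharpening as the story proceeds. The sketch below is a minimal illustration under stated assumptions, not a description of any implemented system: the function names, the geometric decay that gives earlier actions more weight, the logistic mapping to two possible endings, and all constants are hypothetical choices.

```python
import math


def global_state(nudges, decay=0.8):
    """Accumulate many small action effects into one number.

    Each nudge is a small signed value contributed by one player action.
    Earlier actions are weighted more heavily (assumed geometric decay),
    so later choices change the whole story less.
    """
    return sum(n * decay ** i for i, n in enumerate(nudges))


def ending_probability(nudges, total_steps, decay=0.8):
    """Probability of ending 'A' rather than ending 'B' (two-ending toy model).

    The sharpness term grows with story progress, so the dramatic
    probabilities narrow smoothly toward a single necessary end.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    progress = len(nudges) / total_steps      # 0.0 at the start, 1.0 at the end
    sharpness = 1.0 + 9.0 * progress          # assumed constants
    return 1.0 / (1.0 + math.exp(-sharpness * global_state(nudges, decay)))


if __name__ == "__main__":
    history = []
    for step in range(10):
        history.append(0.1)                   # ten small pushes toward ending A
        print(step + 1, round(ending_probability(history, total_steps=10), 3))
```

No single action in this model acts as a branch point: the outcome depends on the accumulated state, and the same small nudge matters less late in the story than it does early on.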
The Type of Experience Informed by the Model
This neo-Aristotelian poetics clarifies a specific conceptual experiment in the space of interactive stories. Specifically, the experiment consists of creating an interactive dramatic experience with the experiential properties of traditional drama, namely enactment, intensity, catharsis, unity, and closure. The Aristotelian analytic categories describe the structure (parts and relationships) of a story experience that induces these experiential properties. The way in which interaction has been incorporated into this model clarifies what is meant by interactive dramatic experience. Here, interaction means first-person interaction as a character within the story. Further, the essential experiential property of interactivity is taken to be agency. The interactive dramatic experience should be structured in such a way as to maximize the player’s sense of agency within the story. The model provides prescriptive structural guidance for maximizing agency, namely, to balance material and formal constraints. So the conceptual experiment informed by this model can be more precisely stated as follows: build a first-person, interactive dramatic world that, in addition to the classical experiential properties of Aristotelian drama, also provides the player with a strong sense of agency.
In addition to clarifying conceptual and design issues in interactive drama, the neo-Aristotelian model informs a technical agenda of AI research necessary to enable this kind of experience.
The primary heuristic offered by the model is, again, that to maintain a sense of player agency in an interactive experience, material and formal constraints must be balanced. As the sophistication of the theme and plot of an experience increases, maintaining this balance will require characters whose motivations and desires are inferable from their actions. In addition, these characters will have to respond to the player’s actions. Believable agents, that is, computer-controlled characters with rich personalities and emotions, will be necessary. Additionally, in many cases (e.g., domestic dramas in which the plot centers around relationships, trust, betrayal, infidelity, and self-deception), language is necessary to communicate the plot.
In order to convey the formal constraints provided by the plot, the characters must have a rich repertoire of dialogue available. In addition, the player must be able to talk back. One can imagine a system in which the characters can engage in complex dialogue but the player can only select actions from menus or click on hotspots on the screen; this is, in fact, the strategy employed by character-based multimedia artwork and contemporary adventure games. But this strategy diminishes agency precisely by unbalancing material and formal constraints. The characters are able to express complex thoughts through language. However, the player is not able to influence their thoughts except at the coarse level provided by the mouse-click interactivity. Thus maximizing player agency requires providing at least a limited form of natural language dialogue.
The function of interactive characters is primarily to communicate material and formal constraints. That is, the player should be able to understand why characters take the actions they do, and how these actions relate to the plot. Sengers (this volume, 1998a) provides a nice analysis of how an audience-based focus on agents as communication requires changes in agent architectures. When the focus changes from “doing the right thing” (action selection) to “doing the thing right” (action expression), the technical research agenda changes (Sengers 1998b). The neo-Aristotelian model indicates that action expression is exactly what is needed. In addition, an interactive drama system must communicate dramatic probability (likely activity given the plot) while smoothly narrowing the space of dramatic probability over time. This means that story action must be coordinated in such a way as to communicate these plot-level constraints. Thus it is not enough for an individual character’s actions to be “readable” by an observer. Multiple characters must be coordinated in such a way that their joint activity communicates both formal and material (plot and character level) affordances. This requires a technical solution that blurs the firm plot/character distinction usually made in AI architectures for interactive drama (Blumberg and Galyean 1995; Weyhrauch 1997).
Façade: An Interactive Drama Guided by the Model
The author is currently engaged in a three-year collaboration with Andrew Stern to build Façade (Mateas and Stern 2000; Stern this volume), an interactive story world that seeks to carry out the conceptual and technical experiment informed by the neo-Aristotelian poetics. Together we will:
Create a compelling, well-written story that obeys dramatic principles, designed with many potential ways to play out.
Build artificial intelligence (AI) that can control the behavior of real-time-animated computer characters, to be used for performing the roles of all but one of the characters in the story.
Create a user interface that allows the player to move easily within the world, and converse and gesture with the computer characters.
Build AI that can understand natural language and gestural input within the context of the story.
Build AI that can integrate the user’s interactions into the space of potential plot directions and character behaviors in the story.
Collaborate with voice actors and animators to author spoken dialogue, character behavior and story events within the engine, to construct the finished story world.
The story requirements describe the properties we wish our story to have. These are not intended to be absolute requirements; that is, this is not a description of the properties that all interactive stories must have. Rather, these requirements are the set of assumptions grounding the design of the particular interactive story we intend to build.
Short One-Act Play. Any one run of the scenario should take the player 15 to 20 minutes to complete. We focus on the short story for a couple of reasons. Building an interactive story has all the difficulties of writing and producing a noninteractive story (film or play) plus all the difficulty of supporting true player agency in the story. In exploring this new interactive art form it makes sense to first work with a distilled form of the problem, exploring scenarios with the minimum structure required to support dramatically interesting interaction. In addition, a short one-act play is an extreme, contrarian response to the many hours of gameplay celebrated in the design of contemporary computer games. Instead of providing the player with 40 to 60 hours of episodic action and endless wandering in a huge world, we want to design an experience that provides the player with 15 to 20 minutes of emotionally intense, tightly unified, dramatic action. The story should have the intensity, economy, and catharsis of traditional drama.
Relationships. Rather than being about manipulating magical objects, fighting monsters, and rescuing princesses, the story should be about the emotional entanglements of human relationships. We are interested in interactive experiences that appeal to the adult, non-computer-geek, movie-and-theater-going public.
Three Characters. The story should have three characters, two controlled by the computer and one controlled by the player. Three is the minimum number of characters needed to support complex social interaction without placing the responsibility on the player to continually move the story forward. If the player is shy or confused about interacting, the two computer-controlled characters can conspire to set up dramatic situations, all the while trying to get the player involved.
Player as Protagonist. Ideally the player should experience the change in the protagonist as a personal journey. The player should be more than an “interactive observer,” not simply poking at the two computer-controlled characters to see how they change.
Embodied Interaction Matters. Though dialogue should be a significant (perhaps the primary) mechanism for character interaction, it should not be the sole mechanism. Embodied interaction, such as moving from one location to another, picking up an object, or touching a character, should play a role in the action. These physical actions should carry emotional and symbolic weight, and should have a real influence on the characters and their evolving interaction. The physical representation of the characters and their environment should support action significant to the plot.
Action in a Single Location. This provides unity of space and forces a focus on plot and character interaction.
Player’s Role not Over-constrained. The amount of noninteractive exposition describing the player’s role should be minimal. The player should not have the feeling of playing a role, of actively having to think about how the character they are playing would react. Rather, the player should be able to be themselves as they explore the dramatic situation. Any role-related scripting of the interactor (Murray 1998) should occur as a natural by-product of their interaction in the world. The player should “ease into” their role; the role should be the “natural” way to act in the environment, given the dramatic situation.
Our story, which satisfies these story requirements, is a domestic drama in which a married couple has invited the player over for dinner. (Assume for the moment that the player’s character is male.) Grace and Trip are apparently a model couple, socially and financially successful, well-liked by all. Grace and Trip both know the player from work. Trip and the player are friends; Grace and the player have gotten to know each other fairly recently. Shortly after arriving at their house for dinner, Grace confesses to the player that she has fallen in love with him. Throughout the rest of the evening, the player discovers that Grace and Trip’s marriage is actually falling apart. Their marriage has been sour for years; deep differences, buried frustrations, and unspoken infidelities have killed their love for each other. How the façade of their marriage cracks, what is revealed, and the final disposition of Grace and Trip’s marriage, and Grace and the player’s relationship, depends on the actions of the player. The story’s controlling idea: to be happy you must be true to yourself.
The story world is presented to the player as an animated, three-dimensional environment. The environment and characters within the environment are rendered in an illustrative style reminiscent of graphic novels. The player is able to move about this environment from a first-person point of view, gesture and pick up objects, and converse with the other characters by typing. The computer-controlled characters look directly out of the screen to gesture and talk to the player. The conversation discourse is real-time; that is, if the player is typing, it is as if they are speaking those words in (pseudo) real-time.
The story is structured as a classic Aristotelian plot arc. The AI plot system explicitly attempts to change dramatic values (e.g., the love between Trip and Grace, the trust between the player and Trip) in such a way as to make a well-formed plot arc happen. In the theory of (classical) dramatic writing, the smallest unit of value change is the beat (McKee 1997). Roughly, a beat consists of an action/reaction pair between characters. Beats are sequenced to make scenes, scenes to make acts, acts to make stories. The AI plot system contains a library of beats appropriate for our story. The system dynamically sequences beats in such a way as to respond to player activity and yet maintain a well-formed plot arc. For the player, each run-through of the story should have the force of dramatic necessity. Explicit decision points, which would highlight the nonlinearity of the story, should not be visible. However, in multiple run-throughs of the story, the player’s actions have a significant influence on what events occur in the plot, which are left out, and how the story ends. Only after playing the experience six or seven times should the player begin to feel they have “exhausted” the interactive story. In fact, full appreciation of the experience requires that the story be played multiple times. In Façade, our goal is to create an interactive story experience that provides the player with the agency to have an effect on the trajectory of the story, yet has the feel of a traditional, linear, dramatic experience.
The architecture for Façade is informed by the neo-Aristotelian poetics of interactive drama, specifically by the technical agenda following from the poetics to:
Support the coordination of multiple characters’ actions to communicate material and formal affordances; that is, the coordination of multiple characters in carrying out dramatic action, and
Support natural language dialogue so as to maintain player agency in an interactive story with a complex theme.
Again, the architectural basis for providing each of these capabilities is the smallest unit of dramatic value change, the beat.
In Façade, beats are architectural entities. A beat consists of preconditions, a description of the values changed by the beat, success and failure conditions, and joint behaviors that coordinate the characters in order to carry out the specific beat. Scenes have a similar structure, except that instead of having joint behaviors, a scene has a collection of beats it can use to try to make the scene happen. Preconditions and effects are used to first select a scene, and then, within the scene, beats. When a beat is selected, the joint behaviors associated with this beat are activated in the characters. These joint behaviors extend the reactive behaviors of Hap (Loyall and Bates 1991; Loyall 1997) to include explicit support for multi-agent (in our case, multicharacter) coordination in a manner similar to the STEAM architecture (Tambe 1997). As the player interacts within the beat, she will influence the specific performance of the beat. Because the beat is trying to cause specific value changes, it may turn out that there is no performance of the beat that believably incorporates player interaction while appropriately changing the values. In this case the beat is aborted and another beat is selected.
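The beat and scene structure described above can be illustrated with a minimal sketch. It is not the Façade implementation: the class and function names, the numeric dramatic values, the example beats, and the `fails_on` test (a stand-in for a beat's failure condition) are all assumptions made for illustration, and joint behaviors are reduced to placeholder strings.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, float]  # dramatic values, e.g. trust between the player and Trip


@dataclass
class Beat:
    name: str
    precondition: Callable[[State], bool]
    value_changes: Dict[str, float]   # values the beat is trying to change
    joint_behaviors: List[str]        # placeholder names for coordinated behaviors
    # Assumed stand-in for the failure condition: returns True when the
    # player's action leaves no believable performance of the beat.
    fails_on: Callable[[str], bool] = lambda player_action: False


@dataclass
class Scene:
    name: str
    precondition: Callable[[State], bool]
    beats: List[Beat]


def select(candidates, state: State) -> Optional[object]:
    """Return the first candidate whose precondition holds, or None."""
    for candidate in candidates:
        if candidate.precondition(state):
            return candidate
    return None


def run_beat(beat: Beat, state: State, player_action: str) -> bool:
    """Perform a beat; return False (abort) if its failure condition is met."""
    if beat.fails_on(player_action):
        return False
    for key, delta in beat.value_changes.items():
        state[key] = state.get(key, 0.0) + delta
    return True


def play_scene(scene: Scene, state: State, player_actions: List[str]) -> List[str]:
    """For each player action, select a beat; if it aborts, select another."""
    performed: List[str] = []
    for action in player_actions:
        aborted = set()
        while True:
            candidates = [b for b in scene.beats
                          if b.name not in performed and b.name not in aborted]
            beat = select(candidates, state)
            if beat is None:
                break
            if run_beat(beat, state, action):
                performed.append(beat.name)
                break
            aborted.add(beat.name)
    return performed


# Hypothetical content: two beats in one scene.
state = {"trip_grace_love": 0.5, "player_trip_trust": 0.5}
scenes = [
    Scene("arrival", lambda s: True, [
        Beat("grace_confides", lambda s: s["player_trip_trust"] >= 0.5,
             {"player_trip_trust": -0.2}, ["grace_pulls_player_aside"],
             fails_on=lambda action: action == "leave_room"),
        Beat("trip_jokes", lambda s: True,
             {"trip_grace_love": -0.1}, ["trip_teases_grace"]),
    ]),
]
scene = select(scenes, state)
performed = play_scene(scene, state, ["leave_room", "sit_down"])
print(scene.name, performed, state)
```

In this sketch the first beat is aborted when the player leaves the room, so the second beat is selected instead; the aborted beat remains available and is performed on the player's next action. A fuller system would also check success conditions and activate the joint behaviors in the characters rather than simply applying the value changes.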
Most approaches to computer-controlled characters have been driven by a notion of strong autonomy; that is, by the idea that the character independently chooses moment-by-moment what action to take next, based on local state (what has recently happened in the world). But interactive drama requires that character action make sense globally as well as locally; all of a character’s actions must “add up” to a consistent set of material and formal affordances, while still providing immediate response to player interaction. Rather than putting all the “character-ness” in the characters and all the “story-ness” in a drama manager, the architectural construct of the beat tightly binds character-specific and story-specific knowledge, just as character and plot are tightly related in the neo-Aristotelian poetics. Character behavior is now organized around the dramatic functions that the behavior serves, rather than organized around a conception of the character as independent of the dramatic action.
Natural Language Dialogue
Natural language understanding is a notoriously difficult AI problem; it is commonly agreed that building a system that is as good as a human being at participating in dialogue would be tantamount to modeling all of human intelligence. Thus, at first blush, our desire to have the player engage in unrestricted dialogue with the characters seems ludicrous. But here the fact that what we really want is dramatic dialogue within a specific story context comes to the rescue. The player’s dialogue and actions are additional material causes in the story (a contribution to the material out of which the story is being built), while the player’s intentions are additional formal causes in the story.
Of course these material and formal contributions must be consonant with the author-provided chains of material and formal causation. So for natural language understanding, we don’t need something that can glean the open-ended meaning out of arbitrary utterances, but rather something that interprets dialogue as contributions within a specific dramatic context. This is accomplished as follows: template rules map from surface text to a small number of discourse acts (things like “praise Grace,” or “praise Trip,” or “mention-topic marriage”). This is a many-to-few mapping, in which a huge number of surface productions get turned into a few discourse acts out of a small set of possible acts. Forward chaining rules then map the initial discourse acts to final discourse acts in a context-specific way. Discourse context is maintained by beats; the current active beat is the current active discourse context. Associated with beats are the beat-specific mapping rules that get added to the general rules when the beat is activated. When an utterance is not understood (no mapping rule is activated), recovery mechanisms try to mask the failure to understand while moving the story forward.
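The two-stage mapping just described (surface text to initial discourse acts, then forward chaining to final acts under the active beat) might be sketched as follows. This is an illustrative toy, not the Façade rule system: the regular expressions, the rule format, the beat name, the "flirt-with Grace" and "praise couple" acts, and the "not-understood" marker that stands in for the recovery mechanisms are all assumptions.

```python
import re

PRAISE = r"(?:wonderful|great|lovely)"


def praise_pattern(name):
    """Match a praise word and a character name in either order."""
    return re.compile(
        rf"\b{name}\b.*\b{PRAISE}\b|\b{PRAISE}\b.*\b{name}\b", re.I)


# Stage 1 (many-to-few): hypothetical template rules, surface text -> act.
TEMPLATES = [
    (praise_pattern("grace"), "praise Grace"),
    (praise_pattern("trip"), "praise Trip"),
    (re.compile(r"\b(?:marriage|married)\b", re.I), "mention-topic marriage"),
]

# Stage 2: forward chaining rules as (premises, conclusion) pairs.
GENERAL_RULES = [
    (frozenset({"praise Grace", "praise Trip"}), "praise couple"),
]

# Hypothetical beat-specific rules, added while that beat is active.
BEAT_RULES = {
    "grace_confides": [(frozenset({"praise Grace"}), "flirt-with Grace")],
}


def surface_to_acts(text):
    """Return every initial discourse act whose template matches the text."""
    return {act for pattern, act in TEMPLATES if pattern.search(text)}


def forward_chain(facts, rules):
    """Apply rules until no new discourse act can be derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts


def interpret(text, active_beat=None):
    """Map an utterance to final discourse acts in the active beat's context."""
    initial = surface_to_acts(text)
    if not initial:
        # Placeholder for recovery: a real system would mask the failure
        # while moving the story forward.
        return {"not-understood"}
    rules = GENERAL_RULES + BEAT_RULES.get(active_beat, [])
    derived = forward_chain(initial, rules) - initial
    # If no rule fires, the initial acts pass through unchanged.
    return derived if derived else initial


print(interpret("Grace, you look wonderful tonight"))
print(interpret("Grace, you look wonderful tonight", "grace_confides"))
print(interpret("asdf qwerty"))
```

The same utterance yields different final acts depending on the active beat, which is the sense in which the beat serves as the discourse context. Real template rules would need to be far more tolerant of variation than these keyword patterns.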
In this essay, Murray’s concept of agency was integrated into Laurel’s Aristotelian structural model to yield a proposed Aristotelian interactive poetics. This model illuminates the general conditions under which a user will experience agency in any interactive experience and provides design and technology guidance for the particular case of building interactive dramatic experiences. The design of Façade, an interactive dramatic world being built by the author and Andrew Stern, is informed by this interactive poetics.
Aarseth, Espen (1997). Cybertext: Perspectives on Ergodic Literature. Baltimore: Johns Hopkins University Press.
Aristotle (1997). The Poetics. Mineola, NY: Dover.
Avedon, Elliot M. and Brian Sutton-Smith (1971). The Study of Games. New York: Wiley.
Bates, J. (1992). “Virtual Reality, Art, and Entertainment.” Presence: The Journal of Teleoperators and Virtual Environments 1, no.1 (1992): 133-138.
Bates, J., A.B. Loyall, and W.S. Reilly (1992). “Integrating Reactivity, Goals and Emotion in a Broad Agent.” Technical Report (CMU-CS-92-142), Department of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania.
Blumberg, B. (1996). “Old Tricks, New Dogs: Ethology and Interactive Creatures.” Ph.D. Thesis, MIT Media Lab. Cambridge, Massachusetts.
—., and T. Galyean (1995). “Multi-level Direction of Autonomous Creatures for Real-Time Virtual Environments.” In Proceedings of SIGGRAPH 95, (1995): 47-54.
Eskelinen, Markku (2001). “The Gaming Situation.” Game Studies 1, no.1 (July 2001). http://www.gamestudies.org/0101/eskelinen/.
Frasca, Gonzalo (2001). “Ephemeral Games: Is it Barbaric to Design Videogames after Auschwitz?” In Cybertext Yearbook 2001, edited by Markku Eskelinen and Raine Koskimaa. Jyväskylä: Research Centre for Contemporary Culture, University of Jyväskylä. http://www.jacaranda.org/frasca/ephemeralFRASCA.pdf.
Hayes-Roth, B., R. van Gent, and D. Huber (1996). “Acting in Character.” In Creating Personalities for Synthetic Actors, edited by R. Trappl and P. Petta. Berlin and New York: Springer. Also available as Stanford Knowledge Systems Laboratory Report KSL-96-13 (1996).
Kelso, M.T., P. Weyhrauch, and J. Bates (1993). “Dramatic Presence.” Presence: The Journal of Teleoperators and Virtual Environments 2, no. 1 (Winter 1993): 1-15. http://www2.cs.cmu.edu/afs/cs.cmu.edu/project/oz/web/papers/CMU-CS-92-195.ps.
Laurel, Brenda (1986). “Towards the Design of a Computer-Based Interactive Fantasy System.” Ph.D. Thesis, Ohio State University, Columbus, Ohio.
—. (1991). Computers as Theatre. Reading, MA: Addison-Wesley.
Loyall, A.B. (1997). “Believable Agents.” Ph.D. Thesis (Tech report CMU-CS-97-123), Carnegie Mellon University, Pittsburgh, Pennsylvania.
Loyall, A.B., and J. Bates (1991). “Hap: A Reactive, Adaptive Architecture for Agents.” Technical Report CMU-CS-91-147, Department of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania.
Mateas, Michael (1997). “Computational Subjectivity in Virtual World Avatars.” In Working Notes of the Socially Intelligent Agents Symposium, 1997 AAAI Fall Symposium Series. Menlo Park, CA: AAAI Press.
—. and Andrew Stern (2000). “Towards Integrating Plot and Character for Interactive Drama.” In Working Notes of the Socially Intelligent Agents: Human in the Loop Symposium, 2000 AAAI Fall Symposium Series. Menlo Park, CA: AAAI Press.
McKee, Robert (1997). Story: Substance, Structure, Style and the Principles of Screenwriting. New York: Harper Collins.
Murray, Janet (1998). Hamlet on the Holodeck. Cambridge, MA: MIT Press.
Norman, Don (1988). The Design of Everyday Things. New York: Doubleday.
Sengers, Phoebe (1998a). “Anti-Boxology: Agent Design in Cultural Context.” Ph.D. Thesis (Technical Report CMU-CS-98-151), School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania.
Sengers, Phoebe (1998b). “Do the Thing Right: An Architecture for Action Expression.” In Proceedings of the Second International Conference on Autonomous Agents, 24-31. May, 1998.
Tambe, Milind (1997). “Towards Flexible Teamwork.” Journal of Artificial Intelligence Research 7 (1997): 83-124.
Weyhrauch, P. (1997). “Guiding Interactive Drama.” Ph.D. Thesis (Technical Report CMU-CS-97-109), School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania.