1. Introduction
The essence of the computer as a representational medium is procedurality - the ability of the computer to engage in arbitrary mechanical processes to which observers can ascribe meaning. Computers do, of course, participate in the production of imagery, support communication between people via the mediation of long-distance signals, control electromechanical devices, and support the storage and interlinking of large quantities of human-readable data. Many tools are available that allow users to engage these various capacities of the computer, such as image manipulation or Web page authoring, without requiring users to think procedurally. But it is precisely the computer's ability to morph into these special-purpose machines that highlights the computer's procedural nature. These special-purpose machines (e.g., tools) are made out of computational processes; the computer's ability to engage in arbitrary processes allows it to morph into arbitrary machines.
Taking full representational advantage of the computer thus requires procedurally literate authorship; that is, artists and writers who are able to think about and work within computational frameworks. By procedural literacy, we mean the ability to read and write processes, to engage in procedural representation and aesthetics, to understand the interplay between the culturally embedded practices of human meaning-making and technically mediated processes. Even for new media practitioners who don't themselves write much code, procedural literacy is necessary for successfully participating in interdisciplinary collaborative teams, and for understanding the space of possibility for digital works. Many authors find themselves engaged in some level of programming, especially for interactive work that, of necessity, requires conditional response to input, and thus the specification of a process. In the extreme case of developing new modes of computational expression, authors must be highly proficient in the use of general purpose programming languages, used to construct new languages and tools specialized for the new representational mode.
In this chapter, we provide a case study, using the interactive drama Façade, of this last type of procedural authorship. Façade represents a new mode of computational representation, interactive drama, combining the gamelike pleasure of moment-by-moment interaction with believable characters and the storylike pleasure of participating in and influencing a long-term, well-formed dramatic progression. As procedural authors, we undertook several design-plus-programming tasks: deconstructing a dramatic narrative into a hierarchy of story and behavior pieces; designing an AI (artificial intelligence) architecture and a collection of special-purpose languages within the architecture, which respond to and integrate the player's moment-by-moment interactions to reconstruct a real-time dramatic performance from those pieces; and writing an engaging, compelling story within this new framework.
This essay makes a case for the importance of procedural authorship, describes the design goals of Façade and how these goals could only be met through a highly procedural approach to interactive narrative, and finally describes Façade's architecture, content organization, and the experience of authoring within this framework.
2. Procedurality
Janet Murray has identified four essential properties of the computer as a representational medium: that computers are procedural, participatory, encyclopedic and spatial (Murray 1998). The procedural, of course, refers to the machinic nature of computers, that they embody complex causal processes, and in fact can be made to embody any arbitrary process. The participatory refers to the interactive nature of computers, that they can dynamically respond to outside signals, and be made to respond to those signals in a way that treats those signals as having the meaning ascribed to them by people (that is, nonarbitrary response). The encyclopedic refers to the vast storage capacity of digital computers, and their ability to organize, retrieve, and index stored material. The spatial refers to the ability of digital computers to represent space, whether that is the physical space of virtual reality and games or the abstract space of networks of information.
Various communities of practice tend to hold different properties as central. Here we provide a few examples of the privileging of various properties. For the Demoscene, a largely competition-oriented subculture with groups and individual artists competing against each other in technical and artistic excellence (Wikipedia 2005), procedurality is central; the aim is to procedurally generate as rich an audiovisual experience as possible using the minimum amount of stored content. The participatory is privileged in rhetorics of agency, control, and co-authorship, and has been adopted by communities as diverse as user-interface design, interactive art, and digital marketing. Database art privileges the encyclopedic, sometimes viewing all new media art practice as metaphorically related to the manipulation and resequencing of data stores. Spatiality is privileged by such diverse communities as virtual reality, game design, and hypertext.
While all of these properties play some role in various computational media, procedurality is the essential, defining property of computational media, without which the other properties could not exist.
Any participatory system requires the specification of potential action that is carried out in response to a stimulus. Capturing a space of potential action requires specifying a machine or process that can actualize the potential under different contingencies. In other words, participatory systems require procedurality. The converse is not true; there can be procedural systems that are not participatory, but rather execute a fixed process without accepting input. Many generative art systems, such as Aaron (McCorduck 1991), exhibit procedurality without being participatory.
Encyclopedic systems are similarly dependent on procedurality. Without the ability to perform operations on data, to be able to access, resequence, search, modify, index, and so forth, large data stores are useless. Without the procedural competencies of Web search technologies, for instance, the Web literally could not exist at its current scale. There would be no reason to create a new Web page without the ability to relate the page to other, already published pages, and the ability for others to be able to find and view your page. Again, the converse is not true. Processes can create elaborate experiences from very small kernels; this capability is, in fact, the inspiration for the Demoscene.
The spatial is clearly a derivative property, a representational illusion actively maintained by a process. Graphical spatial representations make use of procedural models to compute and dynamically update the displayed space. Interactive spaces, which create the sense of space by supporting active navigation through the space, and which may not make use of 2D or 3D graphical representations at all, depend on the participatory, which in turn is dependent on procedurality.
The goal here is not simply to play a dominance game between the various representational properties of computers, but to avoid serious confusions that can arise in new media theory and practice from misunderstanding the central importance of procedurality. Without a deep understanding of the relationship between what lies on and beneath the screen, scholars are unable to deeply read new media work, while practitioners, living in the prison-house of "art-friendly" tools, are unable to tap the true representational power of computation as a medium.
Without an understanding of procedurality, of how code operates as an expressive medium, new media scholars are forced to treat the operation of the media artifacts they study as a black box, losing the crucial relationship between authorship, code, and audience reception. Code is a kind of writing; just as literary scholars wouldn't dream of reading only translated glosses of work, never reading the full work in its original language, so new media scholars must read code, not just at the simple level of primitive operations and control flow, but at the level of the procedural rhetoric, aesthetics and poetics encoded in a work.
New media practitioners without procedural literacy are confined to producing those interactive systems that happen to be possible to produce within existing authoring tools. To date, such tools tend to have an encyclopedic orientation; in the absence of significant support for procedural authorship (i.e., programming), authorship consists of the gathering together of numerous media assets (video, sound, text, image, etc.), and the spatial and temporal composition of those assets within the procedural framework supported by the tool (e.g., linking). This approach fundamentally limits the size and complexity of new media artifacts. For interactive works, this problem is especially severe, as it forces the author to pre-specify and explicitly author responses to all possible interactive situations.
2.1 Procedurality and Content
To describe the relationship between computation and media assets, Chris Crawford introduced the term process intensity (Crawford 1987). Process intensity is the "crunch per bit," the ratio of computation to the size of the media assets being manipulated by the system. If a game (or any interactive software) primarily triggers media playback in response to interaction, it has low process intensity. The code is doing very little work - it's essentially just shoveling bits from the hard drive or CD-ROM to the screen and speakers. As a game (or any interactive software) manipulates and combines media assets, its process intensity increases. Algorithmically generated images and sound that make no use of assets produced offline have maximum process intensity.
Process intensity directly enables richness of interactivity. As process intensity decreases, the author must produce a greater number of offline assets (e.g., pre-rendered chunks of text, animations or video) to respond to the different possible interactions. The number of offline assets required to maintain a given level of interactivity increases exponentially as process intensity decreases; therefore, in general, decreases in process intensity result in decreases in the richness of interactivity.
Although games have a relatively high process intensity within the space of new media artifacts, contemporary games are pushing against authoring limits caused by an overreliance on non-procedural, static assets. Contemporary games such as Electronic Arts' The Lord of the Rings franchise currently contain more media files than lines of code (Mateas 2005). Even open-world games such as the Grand Theft Auto franchise, lauded for their simulated, procedural worlds, still use static assets for every vehicle, every type of person, every building, every weapon, and so forth.
Furthermore, developers at a recent Game Developers Conference voiced concern that next-generation console game hardware will only exacerbate this content crisis (Taylor 2005). The requirement for ever-more detailed graphics to entice consumers to purchase next-generation consoles means that assets become more expensive to produce, requiring ever-larger teams, making games more expensive. Consumers want more gameplay, meaning larger games, thus requiring even more assets to be produced; this all results in a positive-feedback loop that is considered by many to be unsustainable.
Where insufficient procedurality is creating a crisis in the authoring of traditional games, it has prevented some long sought-after genres of interactive art and entertainment, such as the high-agency interactive story, from even getting off the ground. Bringing process intensive, AI-based techniques to the problem of interactive story was one of the fundamental research goals of our interactive drama Façade.
3. Procedural Content in the Interactive Drama Façade
3.1 A Case Study for Procedural Content
Motivated by our belief that research into highly procedural authoring methods will enable yet-to-be-realized genres of interactive art and entertainment, we undertook the development of the interactive drama Façade. The dream of interactive drama, perhaps best envisioned by the Star Trek Holodeck and first presented in an academic context by Brenda Laurel in Computers as Theatre (Laurel 1991), has players interacting with compelling, psychologically complex characters, and through these interactions having a real influence on a dynamically evolving storyline. Starting from a decade of prior research by the Carnegie Mellon Oz Project (Bates 1992; Loyall 1997), and believing that a fully realized interactive drama had not yet been built, we embarked on a five-year effort to develop procedural authoring methods for believable characters, natural language conversation, and a dynamic storyline, integrated into a small but complete, playable experience. Publicly released in July 2005, Façade has been downloaded by over 150,000 players worldwide as of this writing, and has received widespread critical acclaim (Montfort 2005).
Enjoyable video games tend to be highly procedural in implementation, because among implementation methods, procedurality affords the greatest degree of dynamism and reactivity - features very satisfying to players. The best procedural video games excel at giving players high-agency experiences; that is, providing ample opportunities for the player to take action and receive immediate feedback. With Façade, we wanted to create an interactive drama that provides the level of immediate, moment-by-moment agency, that is, local agency, found in games. But unlike games, we also wanted the player to experience global agency, that is, longer-term player influence on the overall story arc: over which topics get brought up, how the characters feel about the player over time, and how the story ends.
Like contemporary games, Façade is set in a simulated world with real-time 3D animation and sound, and offers the player a first-person, continuous, direct-interaction interface, with unconstrained navigation and the ability to pick up and use objects. But like drama, particularly theatrical drama about personal relationships such as Who's Afraid of Virginia Woolf? (Albee 1962), Façade uses unconstrained natural language and emotional gesture as a primary mode of expression for all characters, including the player. Rather than being about saving the world, fighting monsters, or rescuing princesses, the story is about the emotional entanglements of human relationships, specifically about the dissolution of a marriage. There is unity of time and space - all action takes place in an apartment - and the overall event structure is modulated to align to a well-formed Aristotelian tension arc, that is, inciting incident, rising tension, crisis, climax, and denouement, independent of the details of exactly what events occur in any one run-through of the experience.
Additionally, the story-level choices in Façade are intended to not feel like obvious branch points. We believe that when a player is faced with obvious choice points consisting of a small number of choices (e.g., being given a menu of three different possible things to say), it detracts from the sense of agency; the player feels railroaded into doing what the designer has dictated. Instead, in Façade, the story progression changes in response to many small actions performed by the player throughout the experience. Later in this chapter we describe Façade's procedural content in detail, and how it achieves these design goals.
3.2 Hindrances of Low- or Non-Procedural Content
Authors have faced a longtime conundrum when undertaking the construction of interactive stories: how can a story be structured to incorporate interaction, yet retain a satisfying, well-formed plot when experienced by the reader/player? Historically, the designs of low- or non-procedural interactive stories have been forced to make a tradeoff between these two goals. The resulting "interactive story" may have a well-formed plot, but can only be minimally influenced by the reader/player, as seen in the linear narrative threads of most games and some text-adventure interactive fiction (IF).
Alternatively, the design tradeoff may be made in the other direction, resulting in interactive experiences that can vary significantly as a result of player action, but lack the degree of coherence, pacing and focus that are pleasurable in well-constructed stories. A non-procedural, encyclopedic design approach, in which the author creates a large number of static story pieces (assets) that are sequenced by a simple system, inevitably forces this design tradeoff. The author can choose to place minimal constraints on the ordering of story pieces, allowing the local sequencing of pieces to depend on the local player interaction. But then the sequences produced will lack the coherency of well-formed story arcs. Fragmented plots, or plots heavily diluted with unorganized or non-useful bits of action, are common in hypertext fiction as well as some IF, making them problematic to characterize as proper stories.
Within an encyclopedic design approach, the only way to increase interactivity is to author extraordinary amounts of content by brute force. This strategy has proven impractical; even the most successful Choose Your Own Adventure books or their digital equivalents, where the plot may vary significantly in response to the reader's choices and still be well-formed, necessarily offer an unsatisfyingly short series of infrequent, binary choices in order to avoid a combinatorial explosion of explicitly rendered (prewritten) plot directions. Such works expose the limited and cumbersome nature of a non-procedural, encyclopedic approach.
The encyclopedic tradeoff between coherency and the combinatorial explosion seen at the plot level is mirrored at the more detailed level of character dialogue. The low-coherency, simple-process approach to dialogue is exemplified by chatterbots, in which lines of dialogue are sequenced from a large pool in response to each player interaction, making use of little to no context, and depending primarily on simple stimulus/response rules. The high-coherency Choose Your Own Adventure-approach to dialogue is exemplified by dialogue trees, in which an author must explicitly and statically represent discourse context by pre-specifying all possible paths through the dialogue, resulting in the same combinatorial explosions suffered by story graphs.
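To make the combinatorial explosion concrete, the following minimal sketch counts the prewritten responses that a fully enumerated dialogue tree or story graph would need. The function name, branching factors, and depths are illustrative assumptions, not figures drawn from any particular work.

```python
def tree_nodes(choices_per_turn: int, turns: int) -> int:
    """Count the authored nodes in a fully enumerated dialogue tree.

    Every distinct history of player choices needs its own prewritten
    response, so level k of the tree holds choices_per_turn ** k nodes.
    """
    return sum(choices_per_turn ** k for k in range(1, turns + 1))


if __name__ == "__main__":
    # Hypothetical settings: binary and three-way choices at several depths.
    for turns in (3, 6, 10):
        print(turns, tree_nodes(2, turns), tree_nodes(3, turns))
```

Under these assumptions, even binary choices over ten turns call for 2,046 authored nodes, and three-way choices for 88,572, which suggests why such works keep their choices few and infrequent.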
Based on such frustrating limitations in prior approaches to interactive story, local and global agency have commonly been seen as incompatible.
3.3 Procedural Story Design
Our solution in Façade to this long-time conundrum is to recast player interactions within a story in terms of abstract social games. Games, which are procedural by nature, achieve the high degree of event variability and player agency that we desire; the challenge becomes how to design and structure games that reflect the particular meanings we wish our story to exhibit, and how to dramatically perform the games as coherent, focused, well-paced narratives.
Further, to be compatible with the procedural, simulation-oriented nature of games, the granularity of immutable story content pieces must be made unusually small, on the order of individual and recombinable facial expressions, gestures and lines of dialogue, rather than multi-sentence lexias of text or extended cutscenes. As described in detail later, Façade's content pieces are organized into multiple, mixable hierarchical levels, sequenced by procedures written in multiple, mixable authoring languages.
At a high level, Façade's abstract social games are organized around a numeric "score," such as the affinity between a character and the player. However, unlike traditional video games where there is a fairly direct connection between player interaction (e.g., pushing a button to fire a gun) and score state (e.g., a decrease in the health of a monster), Façade's social games have several levels of abstraction separating atomic player interactions from changes in social "score." Instead of jumping over obstacles or firing a gun, in Façade, players fire off a variety of discourse acts in natural language, such as praise, criticism, flirtation, and provocation (see table 30.1). While these discourse acts will generate immediate reactions from the characters, it may take story-context-specific patterns of discourse acts to influence the social game score. Furthermore, the score is not directly communicated to the player via numbers or sliders, but rather via enriched, theatrically dramatic performance.
As a friend invited over for drinks at a make-or-break moment in the collapsing marriage of the protagonists Grace and Trip, the player unwittingly becomes an antagonist of sorts, forced by Grace and Trip into playing psychological "head games" with them (Berne 1964). During the first part of the story, Grace and Trip interpret all of the player's discourse acts in terms of a zero-sum affinity game that determines whose side Trip and Grace currently believe the player to be on. Simultaneously, the hot-button game is occurring, in which the player can trigger incendiary topics such as sex or divorce, progressing through tiers to gain more character and backstory information, and, if the player pushes too far on a topic, causing affinity reversals. The second part of the story is organized around the therapy game, where the player is (purposefully or not) potentially increasing each character's degree of self-realization about their own problems, represented internally as a series of counters. Additionally, the system keeps track of the overall story tension level, which is affected by player moves in the various social games. Every change in each game's state is performed by Grace and Trip in emotionally expressive, dramatic ways. On the whole, because their attitudes, levels of self-awareness, and overall tension are regularly progressing, the experience takes on the form and aesthetic of a loosely plotted domestic drama.
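The story state behind these social games can be pictured with a small sketch. Everything concrete here is an assumption made for illustration: the field names, the integer encoding of affinity, the number of tiers, and the size of each update are hypothetical, and the sketch omits the story-context-specific patterns of discourse acts that actually gate score changes.

```python
from dataclasses import dataclass, field

MAX_TIER = 3  # assumed number of tiers per hot-button topic


@dataclass
class StoryState:
    # Hypothetical encoding: negative means the player is seen as siding
    # with Grace, positive with Trip. A single number keeps the game
    # zero-sum: any gain for one character is a loss for the other.
    affinity: int = 0
    tension: int = 0
    # Tier reached on each hot-button topic (absent = not yet raised).
    hot_button_tier: dict = field(default_factory=dict)
    # Self-realization counters for the therapy game.
    self_realization: dict = field(
        default_factory=lambda: {"Grace": 0, "Trip": 0}
    )


def side_with(state: StoryState, character: str) -> None:
    """Shift the zero-sum affinity one step toward the named character."""
    state.affinity += 1 if character == "Trip" else -1


def push_hot_button(state: StoryState, topic: str) -> bool:
    """Advance a topic by one tier; return True if it was pushed too far.

    Pushing past the last tier reverses affinity. Raising tension by one
    on every push is an assumption of this sketch.
    """
    tier = state.hot_button_tier.get(topic, 0) + 1
    state.hot_button_tier[topic] = min(tier, MAX_TIER)
    state.tension += 1
    if tier > MAX_TIER:
        state.affinity = -state.affinity
        return True
    return False
```

In a sketch like this, none of the numbers would be shown to the player; each change would instead select a different dramatic performance by Grace and Trip.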
As the granularity of the atomic pieces of story content (e.g., dialogue, emotion and gestural expression) becomes very small, and the procedures to sequence and combine them into a coherent narrative performance become primary to the realization of the experience for the player, the author's activity shifts from that of a writer of prose into a writer of procedures; that is, into becoming a programmer.
The following is an excerpt of a play session of Façade. Before this example began, the player chose the name Brenda. All she is told initially is that she is friends with Grace and Trip from college, hasn't seen them in a long time, and has been invited over for drinks. The drama begins with Brenda standing in a foyer at the front door of Grace and Trip's apartment.
From a first-person point of view, Brenda can freely walk and move about using the arrow keys, pick up objects and gesture using the mouse-controlled hand cursor, and speak at any time by typing and entering text, which is displayed at the bottom of the screen. Grace and Trip animate fluidly and speak their dialogue out loud.
A dialogue trace in the form of a stageplay, like the one below, is generated each time Façade is played.
GRACE (offscreen, audible behind the door)
Trip, when are you going to get rid of this?
TRIP (offscreen, audible behind the door)
What, Grace...this?
GRACE
Yes, you know how I feel about it -
TRIP
I know, I know, I'll do it right now, alright?! - (interrupted)
(Brenda knocks on the front door.)
TRIP
Oh, she's here!
GRACE
What?! You told me it'd be an hour from now!
TRIP
No, she's right on time!
GRACE
God...Trip!
(Trip opens the front door.)
TRIP
Brenda!! Ah, I'm so happy you could make it! We haven't seen you in so long, how's it going?
BRENDA
hi trip, how are you?
TRIP
Oh, we're great. I mean really, really great.
TRIP
Come on in!
BRENDA
great thanks
(Grace enters the living room.)
GRACE
Brenda, hi! How are you? God, it's been a while!
TRIP
Yeah, how - how are you doing?
GRACE
I just asked her that...
TRIP
Well, I'm asking her, too!
GRACE
(frustrated sigh) - (interrupted)
BRENDA
i'm good
GRACE
Oh, H-mmm (happy smile sound)... Well, come on in, make yourself at home!
TRIP
So! Drinks!
(Trip closes the front door.)
TRIP
What would you like? How does a martini sound? Everybody tells me I fix the best drinks, so I'm sure you're gonna love this.
GRACE
Now Trip, don't get too worked up with the drinks tonight...
BRENDA
sure, a martini sounds great
TRIP
Beautiful!
(Trip trots over to the bar.)
GRACE
No no, Brenda, maybe you'd like some juice, or a mineral water?
TRIP
Oh come on...
BRENDA
what's wrong grace
GRACE
(anxious) What do you...No, we're fine, everything's fine...(clears throat)
BRENDA
ok
TRIP
Grace, I assume you want your usual..."a lovely, very cold glass of Chardonnay."
GRACE
(distant) Yes, a glass of Chardonnay sounds nice.
TRIP
Of course.
GRACE
(anxious) So, Brenda, I'm hoping you can help me understand where I went wrong with my new decorating, ha ha.
TRIP
Oh, Grace, let's not do that.
GRACE
(little sigh) You know, for this corner of the room, I had a desire for something...big...and bold...
TRIP
Yeah, this is a huge couch...
(Grace picks up her glass of wine from the bar.)
GRACE
...but now I can see how I should have chosen a simple, comfortable...love seat.
TRIP
uhh...
(Trip hands Brenda her drink.)
TRIP
Here we are...hope you like it!
(Brenda takes her drink from Trip.)
BRENDA
Thanks, i'm thirsty.
GRACE
I'm sure I can return most of this, and try to - (interrupted)
BRENDA
No, I think it looks fine.
GRACE
Really?
(Brenda sips her drink.)
TRIP
See, Grace, you're the only one unhappy with your decorating.
GRACE
(sarcastic) Oh well, I guess I'm wrong then. Ha ha. Thanks for clearing that up.
BRENDA
But it does look good, Grace.
(Grace sips her drink.)
GRACE
I can tell I'm going to need another one of these.
TRIP
(little sigh)
BRENDA
Well, the couch looks sexy.
TRIP
Oh! Brr, I'm going to have to turn up the thermostat if we're going to talk about sex.
GRACE
Trip, come on, that's not funny.
BRENDA
Oops.
TRIP
(sigh) Brenda, I should warn you, I never know how much of what I say is true.
3.4 Richness Through Coherent Intermixing
To dramatically perform Façade's social games as coherent, focused, well-paced narratives, an organizing principle is required that breaks away from the constraints of traditional branching narrative structures, to avoid the combinatorial explosion that occurs with complex causal event chains (Crawford 1989). Our approach to this in Façade is twofold. First, we divide the narrative into multiple fronts of progression, often causally independent, only occasionally interdependent. Second, we build a variety of narrative sequencers to sequence these multiple narrative progressions. These procedural sequencers, described next, operate in parallel and can coherently intermix their performances with one another.
Façade's architecture and content structure are two sides of the same coin, and will be described in tandem; along the way, we will describe how the coherent intermixing is achieved.
3.4.1 Architecture and Content Framework
The Façade system consists of several procedural subsystems that operate simultaneously and communicate with one another (Mateas and Stern 2000, 2003a, 2003b, 2004a, 2004b). Each is briefly described here.
The dynamic, moment-by-moment performance of the characters Grace and Trip - how they perform their dialogue, how they express emotion, how they follow the player around and use objects - is written as a vast collection of behaviors, which are short reactive procedures representing numerous goals and sub-goals for the characters, arranged in a large, hierarchical, dynamically changing tree structure. These behaviors are written in a reactive-planning language called "A Behavior Language" (ABL), developed as part of the Façade project, that manages both parallel and sequential behavior interrelations such as sub-goal success and failure, priority, conflict, preconditions and context conditions.
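The flavor of goals, behaviors, preconditions, and sub-goal success and failure can be suggested with a small Python sketch. This is not ABL syntax and not how ABL is implemented: the class, the function names, and the example behaviors are hypothetical, and the sketch covers only sequential steps, leaving out parallelism, priorities, conflicts, and context conditions.

```python
from typing import Callable, List, Union

Step = Union[str, Callable[[dict], bool]]  # a sub-goal name or a primitive act


class Behavior:
    """One way of achieving a goal: a precondition plus ordered steps."""

    def __init__(self, goal: str, precondition: Callable[[dict], bool],
                 steps: List[Step]) -> None:
        self.goal = goal
        self.precondition = precondition
        self.steps = steps


def run_step(step: Step, library: List[Behavior], world: dict) -> bool:
    if callable(step):
        return bool(step(world))          # primitive act reports success
    return pursue(step, library, world)   # a string names a sub-goal


def pursue(goal: str, library: List[Behavior], world: dict) -> bool:
    """Try each applicable behavior for a goal until one succeeds."""
    for behavior in library:
        if behavior.goal != goal or not behavior.precondition(world):
            continue
        # all() stops at the first failing step, which fails the behavior.
        if all(run_step(step, library, world) for step in behavior.steps):
            return True
    return False  # no behavior applied, or every one failed


def walk_to_bar(world: dict) -> bool:
    world["at_bar"] = True
    return True


def mix_martini(world: dict) -> bool:
    world["log"].append("Trip mixes a martini.")
    return True


# Hypothetical behavior library: two ways to serve a drink.
LIBRARY = [
    Behavior("serve_drink", lambda w: w["at_bar"], [mix_martini]),
    Behavior("serve_drink", lambda w: True, ["go_to_bar", "serve_drink"]),
    Behavior("go_to_bar", lambda w: True, [walk_to_bar]),
]
```

Starting from a world in which the character is not at the bar, `pursue("serve_drink", LIBRARY, world)` skips the first behavior because its precondition fails, walks to the bar through a sub-goal, and then succeeds with the first behavior on the retry.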
The narrative sequencers for the social games are also written in ABL, taking advantage of ABL's ability to perform meta-behaviors that modify the runtime state of other behaviors.
The highest-level narrative sequencer, a subsystem called the drama manager, sequences dramatic beats according to specifications written in a custom drama management language. Beats in Façade are large groups of behaviors organized around a particular topic, described in the next section.
Another subsystem is a set of rules for understanding and interpreting natural language (NL) and gestural input from the player. These rules are written in a custom language implemented with Jess, a forward-chaining rule language. When the player enters dialogue, these NL rules interpret one or more meanings (the aforementioned discourse acts). A second set of rules called reaction proposers further interpret these discourse acts in context-specific ways, such as agreement, disagreement, alliance, or provocation, and send this interpretation to the behaviors and drama manager to react to.
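The two-phase interpretation described above can be sketched as follows. Plain Python regular expressions stand in here for the Jess rules, the patterns are toy examples, and the reaction labels and the beat context mapping are assumptions chosen to fit the play excerpt; only the discourse act names come from table 30.1.

```python
import re
from typing import Dict, List

# Phase 1: surface text -> discourse acts (toy patterns, assumed).
NL_RULES = [
    (re.compile(r"\b(hi|hello|hey)\b"), "greet"),
    (re.compile(r"\b(thanks|thank you)\b"), "thank"),
    (re.compile(r"\b(yes|yeah|sure)\b"), "agree"),
    (re.compile(r"\b(no|nope)\b"), "disagree"),
    (re.compile(r"\blooks? (good|great|fine|nice)\b"), "praise"),
    (re.compile(r"\b(ugly|awful|terrible)\b"), "criticize"),
]


def discourse_acts(text: str) -> List[str]:
    """Return every discourse act whose pattern matches the player's text."""
    lowered = text.lower()
    return [act for pattern, act in NL_RULES if pattern.search(lowered)]


# Phase 2: discourse acts + story context -> proposed reaction.
# Hypothetical context for a beat in which Grace disparages her own
# decorating: praising the room sides with Trip, criticizing it with Grace.
DECORATING_BEAT: Dict[str, str] = {"praise": "Trip", "criticize": "Grace"}


def propose_reaction(acts: List[str], beat_context: Dict[str, str]) -> str:
    """Interpret the first act that is meaningful in the current context."""
    for act in acts:
        if act in beat_context:
            return "sideWith" + beat_context[act]
    return "noReaction"
```

For example, the player's line "No, I think it looks fine." yields the acts `disagree` and `praise`, and in this assumed beat context the proposer reads the praise as siding with Trip. The same acts could be interpreted differently in another context.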
The final subsystem is a custom animation engine that performs character action, emotional expression and spoken dialogue by way of real-time non-photorealistic procedural rendering, as well as music and sound. The animation engine is driven by the ABL behaviors; the engine also senses information about the location and actions of each character for the behaviors to use.
Table 30.1. Façade's discourse acts.
agree
disagree
generalExclamation
positiveExclamation
negativeExclamation
express <emotion>
maybeUnsure
dontUnderstand
thank
apologize
referTo <character> | <object> | <topic> | <theme>
physicallyFavor <object>
praise
hugComfort
flirt
kiss
showConcern
howAreYou
areYouOkay
showSupport
pacify
criticize
oppose
greet
goodbye
getAttention
inappropriateObscene
explain <explainAdviceDescriptor>
advice <explainAdviceDescriptor>
explainRelationship <character1> <relationshipDescriptor> <character2>
leaveApartment
leaveForKitchen
uncooperativeNotSpeaking
uncooperativeNotMoving
uncooperativeFidgety
systemDoesntUnderstand
3.4.2 Beats, Beat Goals, and Beat Mix-ins
Façade's primary narrative sequencing occurs within a beat, inspired by the smallest unit of dramatic action in the theory of dramatic writing (McKee 1997). However, Façade's beats ended up being larger structures than the canonical beats of dramatic writing. In dramatic writing, a beat tends to consist of just a few lines of dialogue that convey a single narrative action/reaction pair.[1] A Façade beat, however, is comprised of anywhere from 10 to 100 joint dialogue behaviors (JDBs), written in ABL. Each beat itself is a narrative sequencer, responsible for sequencing a subset of its JDBs in response to player interaction. Only one beat is active at any time. A JDB, Façade's atomic unit of dramatic action (and closer to the canonical beat of dramatic writing), consists of a tightly coordinated, dramatic exchange of 1 to 5 lines of dialogue between Grace and Trip, typically lasting a few seconds. JDBs typically consist of 50 to 200 lines of ABL code. A beat's JDBs are organized around a common narrative goal, such as a brief conflict about a topic, like Grace's obsession with redecorating, or the revelation of an important secret, like Trip's attempt to force Grace to enjoy their second honeymoon in Italy. Each JDB is capable of changing one or more values of story state, such as the affinity game's value, or any of the therapy game's self-revelation progression counters, or the overall story tension level. Within-beat narrative sequencers implement the affinity game; the topic of a beat is a particular instance of the affinity game.
Each beat can be viewed as a bag of procedural content, specifically JDBs, which are dynamically sequenced by the specific logic of each beat. The drama manager is, in turn, a bag of procedural content, specifically beats, which are dynamically sequenced by the general logic of the drama manager, as influenced by the preconditions, weights, priorities, etc. specified for each beat. The logic required to sequence individual lines of dialogue is more detailed and complex than can be easily described in the declarative annotations at the drama management level; this is precisely why our beats turned out to be larger than traditional beats of dramatic writing.
There are two typical uses of JDBs within beats: as beat goals and beat mix-ins. A beat consists of a canonical sequence of narrative goals called beat goals. The typical canonical sequence consists of a transition-in goal that provides a narrative transition into the beat (e.g., bringing up a new topic, perhaps connecting it to the previous topic), several body goals that accomplish the beat (in affinity game beats, the body goals establish topic-specific conflicts between Grace and Trip that force the player to choose sides), a wait goal in which Grace and Trip wait for the player to respond to the head game established by the beat, and a default transition-out that transitions out of the beat in the event of no player interaction. In general, transition-out goals both reveal information and communicate how the player's action within the beat has changed the affinity dynamic.
A beat's canonical beat goal sequence captures how the beat would play out in the absence of interaction. In addition to the beat goals, there is a set of meta-behaviors, called handlers, which wait for specific interpretations of player dialogue (discourse acts), and modify the canonical sequence in response, typically using beat mix-ins. That is, the handler logic implements the custom narrative sequencer for the beat. Beat mix-in JDBs are beat-specific reactions used to respond to player actions and connect the interaction back to the canonical sequence. Handlers are responsible both for adding, removing, and reordering future beat goals as needed, and for interjecting beat mix-ins into the canonical sequence. By factoring the narrative sequencing logic and the beat goals in this way, we avoid having to manually unwind the sequencing logic into the beat goal JDBs themselves, thus avoiding the dialogue tree problem mentioned earlier.
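The factoring of canonical sequence and handlers can be illustrated with a minimal Python sketch. This is not ABL and not Façade's implementation; the class `Beat`, the handler `agree_with_trip`, and all goal names are hypothetical, chosen only to show a handler rewriting the remaining beat goals and interjecting a beat mix-in.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List


@dataclass
class Beat:
    """A canonical queue of beat goals plus handlers keyed by discourse act."""
    goals: Deque[str]
    handlers: Dict[str, Callable[["Beat"], None]] = field(default_factory=dict)
    performed: List[str] = field(default_factory=list)

    def on_discourse_act(self, act: str) -> None:
        # Handlers are meta-behaviors: they edit the future of the sequence.
        handler = self.handlers.get(act)
        if handler is not None:
            handler(self)

    def step(self) -> bool:
        # Perform the next goal; return False once the beat has finished.
        if not self.goals:
            return False
        self.performed.append(self.goals.popleft())
        return True


def agree_with_trip(beat: Beat) -> None:
    # Hypothetical handler: interject a mix-in next, then swap the ending.
    beat.goals.appendleft("mixin_trip_pleased")
    if "transition_out_default" in beat.goals:
        beat.goals.remove("transition_out_default")
    beat.goals.append("transition_out_sided_with_trip")


def run_beat(player_acts: Dict[int, str]) -> List[str]:
    """Run the beat; player_acts maps a step index to a discourse act."""
    beat = Beat(
        goals=deque(["transition_in", "body_1", "body_2", "wait",
                     "transition_out_default"]),
        handlers={"agree": agree_with_trip},
    )
    step_index = 0
    while beat.step():
        act = player_acts.get(step_index)
        if act is not None:
            beat.on_discourse_act(act)
        step_index += 1
    return beat.performed
```

With no player input, `run_beat({})` simply plays the canonical sequence. A single `agree` act after the second goal changes both the next goal performed and the ending, without any of the goal content having to encode that branching itself.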
For Façade, an experience that lasts about 20 minutes and requires several replays to see all of the content available (any one run-through performs at most 25% of the total content available), we authored about 2,500 JDBs. Approximately 66% of those 2,500 are in beat goals and beat mix-ins, organized into 27 distinct beats, of which approximately 15 are encountered by the player in any one run-through (see the drama management section).
3.4.3 Global Mix-in Progressions
Another type of narrative sequencer, which operates in parallel to, and can intermix with, beat goals and beat mix-ins, is the global mix-in progression. (How coherent intermixing is achieved is described later.) Each category of global mix-in has three tiers, progressively digging deeper into a topic; advancement of tiers is caused by player interaction, such as referring to the topic. Each tier in the progression is constructed from one or more JDBs, just like beat goals or beat mix-ins. Global mix-ins focus on satellite topics such as marriage, divorce, sex, and therapy; on objects such as the furniture, drinks, their wedding photo, the brass bull, or the view; or serve as generic reactions to praise, criticism, flirtations, oppositions, and the like. Additionally, there are a variety of generic deflection and recovery global mix-ins for responding to overly confusing or inappropriate input from the player. In total, there are about 20 instances of this type of narrative sequencer in Façade, comprising about 33% of the roughly 2,500 total JDBs.
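A tiered progression of this kind can be sketched in a few lines of Python. The class and field names are hypothetical, each tier is reduced to a single JDB name for brevity, and the behavior once all tiers are used (returning `None`) is an assumption rather than something the text specifies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GlobalMixinProgression:
    """One hot-button topic with tiers that dig progressively deeper."""
    topic: str
    tiers: List[str]      # one JDB name per tier (a real tier may hold several)
    next_tier: int = 0    # index of the tier to perform on the next trigger

    def trigger(self) -> Optional[str]:
        """Called when the player refers to the topic; advances one tier."""
        if self.next_tier >= len(self.tiers):
            return None   # assumption: an exhausted progression yields nothing
        jdb = self.tiers[self.next_tier]
        self.next_tier += 1
        return jdb


# Illustrative data only; the tier names are placeholders.
brass_bull = GlobalMixinProgression(
    topic="brass_bull",
    tiers=["bull_tier_1", "bull_tier_2", "bull_tier_3"],
)
```

Because the progression keeps its own counter, it can advance independently of whichever beat happens to be active when the player refers to the topic.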
3.4.4 Drama Management (Beat Sequencing)
The coarsest narrative sequencing in Façade occurs in the drama manager, or beat sequencer, as seen in table 30.2.
PlayerArrives
TripGreetsPlayer
PlayerEntersTripGetsGrace
GraceGreetsPlayer
ArgueOverRedecorating
ExplainDatingAnniversary
ArgueOverItalyVacation
FightOverFixingDrinks
PhoneCallFromParents
TransitionToTension2
GraceStormsToKitchen
PlayerFollowsGraceToKitchen
GraceReturnsFromKitchen
TripStormsToKitchen
PlayerFollowsTripToKitchen
TripReturnsFromKitchen
TripReenactsProposal
BlowupCrisis
PostCrisis
TherapyGame
RevelationsBuildup
Revelations
EndingNoRevelations
EndingSelfRevelationsOnly
EndingRelationshipRevelationsOnly
EndingBothNotFullySelfAware
EndingBothSelfAware
The drama manager lies dormant most of the time, becoming active only when the current beat finishes or is aborted (by the beat's own decision, or by a global mix-in). It is at the beat sequencing level that causal dependence between major events is handled - that is, where high-level plot decisions are made.
In a beat sequencing language, the author annotates each beat with selection knowledge consisting of preconditions, weights, weight tests, priorities, priority tests, and story value effects - the overall tension level, in Façade's case. Given a collection of beats represented in the beat language, such as the twenty-seven listed in table 30.2, the beat sequencer selects the next beat to be performed. The unused beat whose preconditions are satisfied and whose story tension effects most closely match the near-term trajectory of an author-specified story tension arc (in Façade, an Aristotelian tension arc) is the one chosen; weights and priorities also influence the decision (Mateas and Stern 2003b).
Beat sequencing is further discussed in the Coherent Intermixing section, as well as that on Failures and Successes.
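As a rough illustration of selection from annotated beats, the following Python sketch filters on preconditions, keeps the highest-priority candidates, and then prefers the beat whose tension effect is closest to the change the tension arc currently calls for. The scoring formula, the deterministic choice, and all numeric values are assumptions made for the example; the actual selection procedure is the one described in Mateas and Stern (2003b) and may differ.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set

StoryState = Dict[str, float]


@dataclass
class BeatSpec:
    name: str
    precondition: Callable[[StoryState], bool]
    tension_effect: float    # change in tension the beat is expected to cause
    priority: int = 0
    weight: float = 1.0


def select_next_beat(beats: List[BeatSpec], used: Set[str], state: StoryState,
                     target_tension_delta: float) -> Optional[BeatSpec]:
    """Pick an unused, satisfied beat that best fits the tension arc."""
    candidates = [b for b in beats
                  if b.name not in used and b.precondition(state)]
    if not candidates:
        return None
    # Priorities act as a hard filter: only the top tier competes.
    top = max(b.priority for b in candidates)
    candidates = [b for b in candidates if b.priority == top]

    def score(b: BeatSpec) -> float:
        # Assumed scoring: weight scaled by closeness to the desired change.
        closeness = 1.0 / (1.0 + abs(b.tension_effect - target_tension_delta))
        return b.weight * closeness

    return max(candidates, key=score)


# Beat names come from table 30.2; preconditions and numbers are invented.
BEATS = [
    BeatSpec("ArgueOverRedecorating", lambda s: s["tension"] < 2, 0.0),
    BeatSpec("FightOverFixingDrinks", lambda s: s["tension"] < 2, 0.5),
    BeatSpec("BlowupCrisis", lambda s: s["tension"] >= 2, 1.0),
]
```

For example, with `{"tension": 1.0}`, a target change of 0.5, and `ArgueOverRedecorating` already used, the sketch selects `FightOverFixingDrinks`.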
3.4.5 Long-term Autonomous Mix-in Behaviors
Long-term autonomous behaviors, such as fixing drinks and sipping them over time, or compulsively playing with an advice ball toy, last longer than a sixty-second beat or a ten-second global mix-in. While perhaps performing only a minor narrative function, occasionally mixing a JDB into the current beat (comprising only 1% of Façade's JDBs), they contribute a great deal to the appearance of intelligence in the characters, by having them perform extended, coherent series of low-level actions in the background over the course of many minutes, across several beat boundaries. By simultaneously performing completely autonomous behaviors and joint behaviors, Façade characters are a hybrid between the "one-mind" and "many-mind" extremes of approaches to agent coordination, becoming in effect "multi-mind" agents (Mateas and Stern 2004a).
3.5 Strategies for Coherent Intermixing
Since global mix-ins for the hot-button game are sequenced among beat goals/mix-ins for the affinity game, which both operate in parallel with the drama manager that is occasionally progressing overall story tension, several strategies are needed to maintain coherency, both in terms of discourse management and narrative flow.
First, global mix-in progressions are written to be causally independent of any beat's narrative flow. For example, while quibbling about their second honeymoon in Italy, or arguing about what type of drinks Trip should serve (affinity game beats, chosen by the drama manager), it is safe to mix in dialogue about, for example, sex, or the wedding photo (hot-button game mix-ins, triggered by a player's reference to their topics). Each mix-in's dialogue is written and voice-acted as if it were a slightly tangential topic being jutted into the flow of conversation ("Oh, that photo, yeah, it's really...").
At the discourse level, mechanisms exist for smoothly handling such interruptions. During a beat goal, such as Trip's reminiscing about the food in Italy, if a global mix-in is triggered, such as the player picking up (thereby referring to) the brass bull, a gift from Trip's lover, the current Italy beat goal will immediately stop mid-performance, and the brass bull global mix-in will begin performing, at whichever tier that hot-button game has already reached. At the time of interruption, if the current Italy beat goal had not yet passed its gist point, which is an author-determined point in a beat goal's JDBs, it will need to be repeated when the global mix-in completes. Short alternate uninterruptible dialogue is authored for each beat goal for that purpose. Also, each beat goal has a reestablish JDB that gets performed if returning to the beat from a global mix-in ("So, I was going to say, about Italy..."). Mix-ins themselves can be interrupted by other mix-ins, but if so, are not repeated as beat goals are.
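The gist-point rule can be expressed as a small decision function. The Python sketch below is illustrative only: the `BeatGoal` fields and the line-count representation of a gist point are assumptions, and apart from the reestablish line quoted above, the dialogue strings are placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BeatGoal:
    name: str
    lines: List[str]           # normal dialogue, in performance order
    gist_index: int            # lines that must be spoken to pass the gist point
    repeat_lines: List[str]    # short alternate dialogue used on a repeat
    reestablish_line: str      # spoken when returning from a global mix-in


def passed_gist(goal: BeatGoal, lines_spoken: int) -> bool:
    """True if the interruption came after the goal's gist point."""
    return lines_spoken >= goal.gist_index


def resume_lines(goal: BeatGoal, lines_spoken: int) -> List[str]:
    """Dialogue to perform for this goal once the global mix-in completes."""
    if passed_gist(goal, lines_spoken):
        # The gist was conveyed, so the goal is not repeated; the beat
        # moves on to its next goal (not modelled in this sketch).
        return []
    # Otherwise re-establish context, then perform the repeat dialogue.
    return [goal.reestablish_line] + goal.repeat_lines


italy = BeatGoal(
    name="trip_reminisces_about_italy",
    lines=["<italy line 1>", "<italy line 2>", "<italy line 3>"],
    gist_index=2,
    repeat_lines=["<short italy repeat line>"],
    reestablish_line="So, I was going to say, about Italy...",
)
```

Here an interruption after one line triggers the reestablish line followed by the repeat dialogue, while an interruption after two or more lines lets the beat continue without repeating the goal.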
With only a few exceptions, the affinity game beats themselves are also designed to be causally independent of one another. For example, in terms of maintaining coherency, it does not matter in which order Grace and Trip argue about Italy, their parents, redecorating, fixing drinks, or their dating anniversary. When beat sequencing, this allows the drama manager to prefer sequencing any beats related to past topics brought up by the player. Likewise, hot-button mix-ins can be safely triggered in any order, into almost any beat at any time.
However, great authorial effort was taken to make the tone of each beat goal/mix-in and global mix-in match each other during performance. Most JDBs are authored with three to five alternates for expressing their narrative contents at different combinations of player affinity and tension level. These include variations in word choice, voice acting, emotion, gesture, and appropriate variation of information revealed. By having the tone of hot-button global mix-ins and affinity game beat goals/mix-ins always match each other, players often perceive them as causally related, even though they are not. Additionally, for any one tone, most JDBs are authored with two to four dialogue alternates, equivalent in narrative functionality but helping create a sense of freshness and non-roboticness in the characters between run-throughs of the drama.
4. Detailed Example of Authoring Procedural Content in Façade
To make concrete our discussion of authoring narrative and dialogue within a procedural framework, we will describe the process of authoring a specific story beat of the interactive drama Façade. Authoring a Façade beat involves a combination of interaction design, dialogue writing, and programming, summarized here.
4.1 Designing the Core Structure of a Façade Beat
Our example will be the beat "FightOverFixingDrinks," in which Trip and Grace argue over what kind of drink to make for the player, intended to reveal some of the underlying tension between them, and to further develop their characters. In the first half of the drama during which this beat can occur, the couple Grace and Trip, whose marriage has reached its breaking point, are trying their best to act like nothing is wrong. Specifically in this beat, we'll have Trip use fixing drinks as a way to brag about how well-off and cultured he thinks they are. Grace, however, emboldened by the presence of the player, will counter Trip with an attempted attack on his materialism and faux-sophistication. Both Grace and Trip will challenge the player to take sides on these differences.
We will first lay out a relatively simple outline for the beat, to which we can add additional richness as we go. We designed a basic structure for this beat as follows, as a sequence of beat goals:
• Transition-in to the beat - Trip brings up the idea of drinks.
• Trip makes an initial suggestion, with bragging; Grace initially reacts to the brag. They wait for a few seconds for a player response, if any.
• Grace counters with her own suggestion based on what the player said, attacking Trip; Trip resists. They wait for a few seconds for another player response, if any.
• Transition-out of the beat - Trip and Grace each react to the player's decision, and Trip begins making the drinks.
It is important that each beat goal described here be relatively short, for example, no more than ten seconds each, ideally five seconds or less. A small granule size for beat goals allows other beat goals to be intermixed more easily into this sequence (as described next). If a beat goal were longer than ten seconds, we'd want to split it up into multiple smaller beat goals.
4.2 Reactivity Adds Richness
Next we will describe the additional reactivity requirements for this beat, which will add further richness to the interaction. These requirements include:
• At any time during the beat, the player should be able to interrupt what Grace and Trip are saying and get an immediate response of some sort. Whatever dialogue was interrupted should be re-spoken afterward in a believable way, as needed.
• At any time during the beat, the player should be able to bring up other topics or do actions that are not directly related to the topic of fixing drinks, and still get a response from Grace and Trip, as described earlier. These global mix-ins include progressing responses to tangential topics such as divorce, sex, or therapy, or about objects such as the furniture, their wedding photo, or the brass bull, or generic reactions to praise, criticism, flirtations, oppositions, and the like. After the response, Grace and Trip should return to progressing the original beat itself, in a coherent way.
• Any time after the beat, once in another beat, the player should be able to refer to what previously happened during this beat and get a response of some sort; we call this a post-beat mix-in.
To support these reactivity requirements, we will add the following specific features to the beat's structure:
• Gist points: each beat goal needs to be annotated with a gist point, to know how far into a beat goal the player must have gotten to avoid needing to repeat it if interrupted to perform some other mix-in.
• Repeat-dialogue: Each beat goal needs dialogue variation used in case the beat goal needs to be repeated, because it got interrupted in order to perform a mix-in.
• Reestablish-dialogue: Each beat goal needs a prefatory line of dialogue that can re-establish its context, in case the previous beat goal was a global mix-in and the current beat goal is returning to what it was talking about. These often play as a prefix to the repeat-dialogue.
• Local-deflect-dialogue: Each beat goal needs a small set of local deflect dialogue, to be used in case the player interrupts the beat goal with a very generic utterance, for which there is no appropriate global mix-in. These are essentially local mix-ins.
4.3 Performance in a Variety of Contexts Adds Richness
In addition to the reactivity requirements described thus far, we want this beat to operate in a variety of contexts. For example, its specific dialogue, and perhaps its structure, should vary if the beat is performed early in the drama when the tension is still low, versus a bit further along when the tension has increased. (Once the tension has reached a very high level, as authors we've decided that Trip won't be in the mood to fix anyone a drink, and this beat won't be allowed to occur.)
Also, the beat should vary in specific dialogue, and perhaps structure, if the player has been siding with Grace, or with Trip, or stayed neutral, independent of tension level. In fact, if the player's affinity changes during the beat, the beat should use its varying dialogue/structure appropriately.
Finally, this beat, by its nature, can be performed a second time, if enough time has passed since the first time it was performed. That is, if the player wants Trip to make a second drink for her, that should be possible. There needs to be enough internal dialogue and structure variation to avoid unbelievably repeating the same dialogue a second time.
To support such context variety, we will add the following specific features to our beat's structure:
• Each beat goal will be written with dialogue variations for each combination of tension level (low or medium) and each player affinity value (neutral, siding-with-Grace, siding-with-Trip), for a total of 2 x 3 = 6 variations.
• When the beat is occurring at the second (medium) tension level, we will author alternate transition-out beat goals (endings) for the beat, in which Grace reveals aloud one of Trip's Façade-shattering alcohol-related secrets, such as a secret dislike of the taste of liquor, his secret job in college as a lowly bartender, or how he regularly sneaks off to a working-class sports bar down the street. We will divvy these up among the tension/affinity structure variations.
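The 2 x 3 variation grid can be pictured as a lookup table keyed by tension level and player affinity. The following Python sketch uses hypothetical names throughout, and each variation is represented only by an identifier string standing in for authored dialogue.

```python
from itertools import product
from typing import Dict, Tuple

TENSIONS = ("low", "medium")
AFFINITIES = ("neutral", "siding_with_grace", "siding_with_trip")

VariationKey = Tuple[str, str]


def build_variations(goal_name: str) -> Dict[VariationKey, str]:
    """One placeholder variation per (tension, affinity) combination."""
    return {
        (tension, affinity): f"{goal_name}/{tension}/{affinity}"
        for tension, affinity in product(TENSIONS, AFFINITIES)
    }


def pick_variation(variations: Dict[VariationKey, str],
                   tension: str, affinity: str) -> str:
    # Looked up at performance time, so an affinity change that happens
    # mid-beat is reflected in the next beat goal performed.
    return variations[(tension, affinity)]


trip_suggests = build_variations("trip_suggests_drink")
```

Calling `len(trip_suggests)` gives 6, matching the count above, and `pick_variation(trip_suggests, "medium", "siding_with_grace")` returns the identifier for that cell of the grid.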
Meeting the requirements listed in this and the previous section contributes to creating agency for the player, because it allows the player to cause this beat to happen when she wishes. It also contributes to dramatic believability, because it only makes sense that drinks could be requested to be fixed at any time, at least until the tension level of the drama becomes too great. Without supporting these requirements, the timing and structure of the discourse and drama overall can seem arbitrarily and unnaturally constrained, significantly reducing agency and believability; these are the aforementioned problems with the status quo of commercial and noncommercial interactive stories.
4.4 Alternate Dialogue Adds Richness
Ideally each line of dialogue has several variations; for example, three to five alternates, all with the same dramatic meaning but with different phrasings and word choice. While only one alternate will be heard for any line of dialogue per performance, the player will have the opportunity to notice this variation the next time she plays Façade and experiences this beat again, or if this beat happens a second time in the same session.
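One simple way to make such alternates feel fresh is to avoid reusing an alternate until all of its siblings have been heard. The Python sketch below shows that policy; it is an assumption for illustration, since the text does not say how Façade chooses among alternates, and `AlternatePicker` is a hypothetical name.

```python
import random
from typing import Dict, List, Optional, Set


class AlternatePicker:
    """Chooses among equivalent phrasings, avoiding repeats where possible."""

    def __init__(self, rng: Optional[random.Random] = None) -> None:
        self._rng = rng or random.Random()
        self._used: Dict[str, Set[int]] = {}   # line id -> indices already heard

    def pick(self, line_id: str, alternates: List[str]) -> str:
        used = self._used.setdefault(line_id, set())
        if len(used) >= len(alternates):
            used.clear()                       # all heard: start a new cycle
        unused = [i for i in range(len(alternates)) if i not in used]
        choice = self._rng.choice(unused)
        used.add(choice)
        return alternates[choice]
```

If the picker's state persists across sessions, a player replaying the beat hears a different phrasing of each line until every alternate has been used once.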
4.5 Parallel Behavior Adds Richness
Critical for lifelikeness and dramatic believability, Grace and Trip are required to perform expressive, parallel behavior as part of their beat goals:
• As Grace and Trip speak their dialogue, they should emote their current mood through facial expression, gaze and gesture. The specific dialogue they are speaking during the beat will affect their mood, of course, but overall mood can also be affected by whatever other events happened before this beat, as well as by whatever mix-ins may occur during the beat. For example, if a global mix-in occurs about divorce during this beat, that may sour Trip's mood, even if he started off somewhat chipper about fixing drinks. Additionally, while a character is speaking, all nonspeaking characters should react dynamically to the speaking character. This is why the author must write joint dialogue behaviors for each character; behavior must still be written for the nonspeaking characters that control how they react to the dialogue being spoken by the speaking character.
• As characters speak their dialogue, they should tend to follow the player to wherever she walks within the room. This means that, in general, the dialogue should be written to not depend on where the character is standing when it is spoken.
• At almost any time during this beat, we could have Trip autonomously decide to walk behind the bar and begin preparing drinking glasses as he speaks, in anticipation of pouring drinks. Like alternate-dialogue variation, this timing variation will be noticed in subsequent performances of this beat in this session or the next. This requires the beat's dialogue to be written to be believable whether or not Trip is behind the bar.
4.6 Simplifications/Abstractions to Reduce Complexity
There are a few aspects of this design that can be simplified and/or abstracted to reduce the complexity of its implementation, while still achieving a satisfying level of agency and believability for the player.
• Simplify the mapping of player utterances/actions to meanings, reducing the number of story reactions to author. Ideally, we would create a distinct reaction (plus alternate dialogue) for each discourse act the player could express, for each distinct context in a beat. However, there are dozens of supported discourse acts (see table 30.1), and potentially as



