Representation, Enaction, and the Ethics of Simulation

Simon Penny

Do violent games train us for violence? Drawing on social psychology and cognitive science, Simon Penny examines the “ethics of simulation.”

The goal of this essay is twofold, academic and activist. The academic goal is to attempt to enhance critical discussion of interactive media practice and interactive media cultural practice by introducing a consideration of the implications of embodied involvement in the process. The activist dimension arises from this, and raises a question of ethical responsibility regarding cultural objects that might function as training environments to build behaviors that will ultimately be expressed in the real world.

While sociologists and anthropologists have examined virtual communities, gaming culture, and related cyberspatial phenomena, interest has centered upon issues of identity, subjectivity, and community. The evaluation of the psychological and sociological aspects of interactive entertainment has, to my mind, been limited. The embodied, enacted dimension of interactive entertainment has not been adequately considered. In particular, embodied interaction with a representation, where bodily action changes the representation in a way which is analogous to, and is designed to be analogous to, human action in the world of physical objects and forces, raises scenarios which conventional critiques of representation, and those aspects of art theory that remain influenced by traditional psychology of visual perception, are not well equipped to deal with.

The core of this conversation then is in the space between pictorial representation and simulation, or rather, in the gray and murky area where they overlap. We need a new way to think about the relation between user behavior and digital representations in interactive entertainment. The embodied aspects of simulation feed back onto representation, make representation not inert but interactive. In order to gain some purchase on this territory, I want to juxtapose three aspects of human activity: interactive entertainment, professional simulator training, and the not-technologically-facilitated learning of bodily disciplines and regimes of behavior.

Body Training

One need go no further than Foucault for persuasive argument and evidence that bodily training is a powerful tool in the formation of citizens (Foucault 1977, 3-8). Repetitive physical actions have been an integral part of education and socialization since preliterate times. Anthropological observations by Marcel Mauss (and others) attest to the unacknowledged but pervasive power of physical behaviors in social and cultural formation. Indeed, physical imitation is a key component in social development. The establishment of gender roles, for instance, through such emulation, voluntary or coerced, is well documented.

Pierre Bourdieu (1977) and others have established that social behaviors are often learned without conscious intellectual understanding. The way someone rationalizes or explains an activity on an intellectual level and the behaviors that have been learned and are enacted can be different, even diametrically opposed. Maria Fernandez has argued (at the Performative Sites Symposium 2000: Intersecting Art, Technology and the Body, Oct. 24-28, Penn State) that racist behavior exhibits both ideological and bodily dimensions. Alarmingly, she observes that racist bodily behaviors can be in place and expressed even while a subject believes she is not racist, and believes that she is not behaving in a racist manner. Such training exists below the level of consciousness, verbalization and rationalization.

(A version of this paper will appear in a forthcoming anthology of papers from the conference, edited by Charles Garoian and Yvonne Gaudelius.)

There is a small industry of corporate training in the reading and deployment of body language. Legion are the training and rehabilitation systems that rely on repetitive physical action, even to exhaustion. From this perspective, military boot camp, football training, some forms of yoga and other spiritual training, ballet lessons, and some schools of drug, juvenile delinquency and other psychosocial rehabilitation are almost indistinguishable, except on the level of academicization of the particular techniques.

One quality common to sports training, martial arts training and military training is anti-intellectuality. Whether an activity is introduced verbally and methodically or is instilled by discipline and repetition, it is universally acknowledged by both teacher and (successful) student that the training is only really effective when it becomes automatic, a reflex. It becomes not conscious. The unsuccessful student is told, “You think too much.”

The Military-Entertainment Complex

Computer simulated immersive environments are clearly an effective tool for bodily training, demonstrated by their use in civil aviation and in the military. Over the last decade, applications have broadened; VR simulations have even been applied to psychotherapy. Such simulations create a useful environment for desensitizing phobic patients who transfer what they’ve learned in the “simulated” world to the “real” world, allowing them to ride elevators and cross high bridges. So, while the electronic game industry vehemently counters claims that interactive electronic games have any real-life consequences, psychotherapists employ simulation technologies precisely because they have effect in people’s lives.

Early simulator systems focused on training human users of machine systems whose behavior was relatively easy to simulate. “Psychological tracking and targeting research [of the early 1940s] became an ergonomic discipline aimed at the construction of integrated human-machine cyborgs. Among its most fertile offshoots were military experimental training simulators similar to modern videogames, complete with joystick controls” (Edwards 1996, 199). This technology trickled down from the military and found particular application in the many generations of flight simulators, from the “Link trainer” on. The Link flight trainer was designed in the U.S. by Edwin Link, primarily as an amusement park flying simulator. However, with the developments in aviation in the 1930s, it was adapted as an instrument flying trainer. In general, simulators find application wherever the cost of the simulator is less than the cost of the real item, as is clearly the case with commercial aircraft.

Training simulation and interactive entertainment were born joined at the hip. There is no better place to examine that join than SIGGRAPH, the Special Interest Group in Computer Graphics and Interactive Techniques of the Association for Computing Machinery (ACM). SIGGRAPH holds a vast annual conference that is attended by academic computer scientists, the computer industry, the entertainment industry and military personnel. Here the military simulator development community, the academic computer science community and the high-end civilian computer graphics and animation community blend. For in truth, there is substantial overlap, and the movement of personnel between the communities is constant and smooth.

During the 1980s, DARPA (the Defense Advanced Research Projects Agency) and the U.S. Army developed Simnet, a set of simulators for tanks and vehicles that were integrated in a local network, a multiuser virtual environment. About the same time that Simnet became public in the early 1990s, a high-end immersive multiuser battle simulation game called Battletech opened in Chicago. In the early 1990s, Simnet became integrated into STOW, the military-wide “Synthetic Theater of War.” STOW is a “synthetic battlespace” in which “Computer Generated Forces” are integrated with live military exercises and manned Simnet simulators, with all data integrated, allowing for overall Command and Control. In late 1999, the University of Southern California received a five-year contract from the U.S. Army to establish the Institute For Creative Technologies (ICT). The ICT’s mandate is: “to enlist the resources and talents of the entertainment and game development industries and work collaboratively with computer scientists to advance the state of immersive training simulation.” The fact that the U.S. military has invested millions in Simnet, STOW and other simulation training systems is proof enough that immersive simulation environments are an effective tool for such training, and that the training transfers usefully to the “real world.”

In the mid-90s, it was revealed that the U.S. Marines had licensed Doom from Id Software and built “Marine Doom,” to use as a tactical training tool. The U.S. Army MARKS military training device is manufactured by Nintendo. It is highly reminiscent of Duck Hunt except the gun is a plastic M-16, and the targets are images of people, not ducks. More recently the Navy has been using The Sims to model the organisation of terrorist cells (Kaplan 2001). So, in the spirit of “what’s good for the goose is good for the gander,” we are drawn to the conclusion that what separates the first person shooter from the high-end battle simulator is the location of one in an adolescent bedroom and the other in a military base. And having accepted that simulators are effective environments for training, we must accept that so too are the desktop shooter games. The question is: what exactly is the user being trained to do?

David Grossman, a retired Lieutenant-Colonel and expert at desensitizing soldiers to increase their killing efficiency, is well known (1996) for his opposition to violent video games, on the basis that the entertainment industry conditions the young in exactly the same way the military does: they hardwire young people for shooting at humans. (Grossman is an expert on the psychology of killing. Retired from the U.S. Army, he now teaches psychology at Arkansas State University and directs the Killology Research Group in Jonesboro, Arkansas. See David Grossman (1996), On Killing: The Psychological Cost of Learning to Kill in War and Society; and David Grossman and Gloria DeGaetano (1999), Stop Teaching Our Kids to Kill: A Call To Action Against TV, Movie & Video Game Violence.) On the other hand, advocates for game culture do their best to downplay such associations.

Game designers and theorists Frank Lantz and Eric Zimmerman (1999), in an apologia for Quake, argue:

In single player mode, and especially in multiplayer “deathmatch” mode, Quake’s blend of lightspeed tactics and hand-eye coordination has more in common with the cerebral athletics of tennis than the spectacular violence of Rambo. Quake and games like it have succeeded in creating meaningful spaces for play where the extravagant promises of virtual reality have failed. They have focused design on what participants are actually doing from moment to moment in the game, rather than on just the visual and kinetic sensations of moving through an immersive space.

This discussion of first-person shooters should not be regarded as a tirade against gaming and game culture per se. I am strongly sympathetic to the open source, hacktivist sentiments adduced by Lantz and Zimmerman, and the rejection of sterile corporate product in favor of the vigor of anarchic street culture. I believe that gaming and its offshoots are perhaps the most vibrant and novel cultural forms to have arisen in digital media. Perhaps the most extraordinary of these offshoots is the “machinimation” (my term) community, who make synthetic movies using game engines.

In their choice of tennis as a comparison, Lantz and Zimmerman may be suspected of a little tactical disingenuousness. War and combat are clearly present in games such as rugby, chess and Quake, metaphorized to varying degrees. But in tennis one does not claim territory, there is no body contact; it is even difficult to regard the ball as a metaphorized projectile. The narrative logic of the game seems to map more clearly the strategic exchange of a token (a metaphorization of commerce) or debate (a metaphorization of diplomacy), whereas Quake is mired in images of gobs of steaming bloody flesh and graphic depictions of death of the most violent kind. (In the 11th or 12th centuries a simple handball game called jeu de paume (“game of the hand”) was played in French monasteries. Later it was adopted by the nobility and moved indoors, where it became known as tennis. It seems to have been devoid of militaristic narratives.)

Lantz and Zimmerman continue:

Quake is an undeniably elaborate and ritualized spectacle of violence. But what is the best way to frame this violence? As culture, Quake is serious hardcore pulp, a self-consciously adolescent blood and gore frenzy. But should we consider Quake as the ultimate embodiment of male computerdom’s phallocentric obsessions? Or as the refusal of the mess and blood of the body to be excluded from the clean and infinite perspectives of cyberspace? As the ironic product of a generation of young men with no war to fight?

Or, one is tempted to continue: as the cultural/psychological backwash of two generations of Cold War mentality, of the militarization of education and entertainment, or possibly as an enactment, in the most graphic way, of the reigning dog-eat-dog ethic of the business world?

Is it unfair to blame such atrocities as the Columbine and Jonesboro school massacres purely on such products? Clearly most people, even most Quake players, have a reasonable grasp of the difference between simulation and real life. But equally clearly, these games would not find a market if a larger cultural formation had not prepared the ground. It is in this context that we must ask: what behaviors do these games train? While Zimmerman and Lantz remain equivocal on the issue of whether such game play anesthetizes players to the horrors of real-world violence, others, such as Grossman, are explicit, not just about their desensitizing role but about their ability to efficiently build killing skills:

Whatever you train to do, under stress, is coming out the other end. That’s why we do fire drills. That’s why we do flight simulators… Well, when the children play the violent video games, they’re drilling, drilling, drilling – not two times a year – every night, to kill every living creature in front of you, until you run out of targets or you run out of bullets… we’re reasonably confident that in Pearl, Mississippi, and in Paducah, Kentucky, and in Jonesboro, Arkansas, these juvenile, adolescent killers set out to shoot just one person: usually their girlfriend… maybe a teacher. But, then, they kept on going! And, they gunned down every living creature in front of them, until they ran out of targets or ran out of bullets…! [A]fterwards, the police asked them… “Okay. You shot the person you were mad at. Why did you shoot all these others? Some of ’em were your friends!” And the kids don’t know. (Steinberg, 2000)

In the same interview, Grossman continues: “I guess the classic example was in Paducah, Kentucky. In Paducah, a 14-year-old boy stole a 22-caliber pistol from a neighbor’s house. Now, prior to stealing that gun, he had never fired a pistol before in his life. He fired a few shots, on a couple of nights before the killings, with the neighbor boy. And, then he brought that gun into school, and he fired eight shots.

“Now, the FBI says that the average officer in the average engagement hits with one bullet in five. In the Amadou Diallo shooting, they fired 41 shots at point-blank range, against an unarmed man: They hit 19 times. The guy that went into the Jewish daycare center in Los Angeles last summer, fired 70 shots, and hit five of those helpless children. So, this boy fires eight shots. How many hits does he get? Eight shots, eight hits, on eight different children. Five of them are head shots. The other three are upper torso. This is stunning.”

Grossman argues that not only do such games train children “to kill every living creature in front of you,” but, as with real training simulators, the children become excellent shots.

Simulation and Metaphorization

Between the full force-feedback VR suit fantasy of the early 1990s (or even the direct neural jack) and the “choose-your-own-adventure” book lies a vast range of technologies of simulation in which bodily action is more or less metaphorized. Often, interactive interfaces depend on complex layers of metaphorization of bodily behavior. The notion of “navigation,” in graphically rendered virtual spaces is a good case example, whether we’re talking about Quake, about immersive VR or navigable three-dimensional web environments. Even in immersive stereoscopic environments (such as the CAVE) the user is navigating not a real space, but a pictorial representation of a space, according to certain culturally established pictorial conventions of spatial representation (such as perspective) established centuries ago for static images. One is not navigating space, but projecting, in the imagination, the implications of manipulating an interactive image medium in a way that will generate a presumed logical next step in a stream of images that represent a space perspectivally from a sequence of points of view.

Little of the proprioceptive or perceptuo-motor correlation characteristic of bodily movement in real space is simulated or accommodated. In the case of the CAVE, for instance, such correlation is utterly scrambled in paradigmatically mechanistic style by the disconnection of forward movement from turning, of “drive” from “steering.” One can bodily turn, but one cannot bodily walk, lest one rend the screen or wrench the gear off one’s head. The illusion of forward movement is achieved by dragging the world under one’s feet using codified button clicks. This is a laughable example of the way that such systems often impose awkward and paradoxical user constraints as a result of hardware limitations. More surprising is that this inconsistency seems not to have been found problematic, even in the professional literature.

“Spatial navigation” on the desktop is achieved by the utilization of streams of perspectivally rendered images which correspond to the movement of an avatar in a virtual world, combined with arbitrary combinations of mouse movements and keystrokes which correspond to movement on several degrees of freedom (DOF). The notion of “navigation” in a highly metaphorical “space” of data is several degrees more abstracted. At this point the notion of “navigation” is so highly metaphorized that a substantial amount of cultural background is necessary to make the use of the term comprehensible. Web “navigation,” like many computer applications, leverages and metaphorizes human skills in spatial location and spatial navigation to facilitate information searches. Without doubt, a substantial part of our capability as humans is our enormously rich sensory-motor coordination that allows us to move and work so well in the 2.1-dimensional space we live in. (I use the fractal nomenclature of fractional dimension to emphasize the fact that we do not really live in three dimensions, as birds and fish do. Whereas manually, on a small scale, we work in three dimensions, on the larger scale we exist in just over two, even if that “thick plane” is folded and crumpled into architecture and landscape.) But whether that spatial sensibility (developed to varying degrees and in various ways in different individuals and probably cultures) can be usefully exploited in computational systems depends entirely on what sort of data is being represented and what the desired mode of interaction might be, and most importantly the affordances of each particular interface technology and hardware platform.

The degree of literalness of simulation depends substantially upon the precision with which bodily behaviors germane to that task in the real world can be accommodated and measured in a simulator environment. Commercial simulators and many interactive artworks make the assumption that a close and accurate accommodation of bodily behaviors results in a more persuasive simulation. The Legible City, Jeffrey Shaw’s paradigmatic immersive artwork of 1988, provides a good case example (figure 7.1). The interface for The Legible City was a stationary bicycle, instrumented so that the speed of pedaling and angle of steering could be extracted. These data directly drove the projection of streetlike imagery on a large projection screen in front of the user. The effect was a fairly complete and persuasive simulation of riding a bicycle through a city.

Figure 7.1: The Legible City
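The coupling described for The Legible City, in which measured pedaling speed and steering angle directly drive the projected viewpoint, can be illustrated with a minimal sketch. This is not Shaw’s implementation, about which the text gives no detail; it assumes a simple kinematic bicycle model, and every name in it (Viewpoint, step, wheelbase) is hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Viewpoint:
    """Hypothetical camera pose in the virtual city (ground plane only)."""
    x: float = 0.0
    y: float = 0.0
    heading: float = 0.0  # radians; 0 means facing along the +x axis


def step(view: Viewpoint, pedal_speed: float, steering_angle: float,
         dt: float, wheelbase: float = 1.0) -> Viewpoint:
    """Advance the viewpoint by one time step.

    pedal_speed    -- forward speed derived from the pedals (units per second)
    steering_angle -- handlebar angle in radians (positive = turn left)
    dt             -- elapsed time in seconds
    wheelbase      -- assumed distance between the wheels (illustrative value)
    """
    # Kinematic bicycle model: the turning rate grows with speed and steering.
    heading = view.heading + (pedal_speed / wheelbase) * math.tan(steering_angle) * dt
    # Move forward along the updated heading.
    x = view.x + pedal_speed * math.cos(heading) * dt
    y = view.y + pedal_speed * math.sin(heading) * dt
    return Viewpoint(x, y, heading)


# Usage: pedal straight ahead for half a second, then steer left.
view = step(Viewpoint(), pedal_speed=2.0, steering_angle=0.0, dt=0.5)
view = step(view, pedal_speed=2.0, steering_angle=0.3, dt=0.5)
```

The point of the sketch is that the image is a function of bodily input: if the rider stops pedaling, the viewpoint stops, and no amount of steering changes what is seen.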

In a similar vein, Janet Murray relates that her immersion in the Mad Dog McCree arcade game “depended heavily on the heft and six shooter shape of the laser gun controller and on the way it was placed in a hip height holster ready for quick draw contests. As soon as I picked up that gun, I was transported back to my childhood and to the world of TV Westerns” (Murray 1997, 146). She notes that her son needed no such tangible interface to enjoy the desktop version of the game. The interface hardware stimulated a bodily memory in Murray, which itself was connected with voluminous cultural background from her childhood, experiences not shared by her son.

If you play Quake on a standard desktop PC, there is no gun-sized, gun-shaped input device. There is a QWERTY keyboard. It must be acknowledged that, as the pen is mightier than the sword, so in the post-Gulf War era, such a device is regularly the interface to machines which kill, maim and destroy, at distances greater than the flight of most projectiles. All that notwithstanding, a keyboard is not like a gun. And this is often the (naïve) argument made to adduce that first-person shooters do not induce violent behavior. But the user’s relation to the system is not that simplistic. The user of a first-person shooter sees the front end of a weapon on the screen. She can point that weapon in various directions. She can press a key with an index finger to see and hear a plume of fire emanate from the weapon and incinerate some alien beast who writhes in agony in a rewarding fashion before collapsing into a steaming heap.

Many “mouse/keyboard” games can be played with joysticks, which are essentially a pistol grip complete with a trigger. More recent joystick peripherals provide force feedback: the user feels a recoil jolt in the hand when the trigger is pulled. And of course, more elaborate arcade game interfaces simulate such effects more completely. Janet Murray notes: “The most compelling aspect of the fighting game is the tight visceral match between the game controller and the screen action. A palpable click on the mouse or joystick results in an explosion. It requires very little imaginative effort to enter such a world because the sense of agency is so direct” (Murray 1997, 146). This statement demonstrates the way that conventional critiques of representation are rendered inadequate in this fusing of bodily action and real-time effect in modeled 3D worlds. The weapon is no longer just a picture. The representation is controlled, driven. In this space between mere pictures and the “real world” the embodied aspects of simulation influence representation in real time.

Why Theories of Visual Representation Are Inadequate

In the postwar years, theories of visual perception based in gestalt psychology, by such authors as Rudolf Arnheim, Ernst Gombrich, and R. L. Gregory, had a significant effect in art theory and criticism. (See Rudolf Arnheim (1954), Art and Visual Perception: A Psychology of the Creative Eye; Rudolf Arnheim (1966), Toward a Psychology of Art; E. H. Gombrich (1960), Art and Illusion: A Study in the Psychology of Pictorial Representation; and R. L. Gregory (1990), Eye and Brain: The Psychology of Seeing.) From a contemporary point of view, this work, especially when it considered art, was characterized by a conception of vision and visual perception as a one-way process of information inflow, through the eyes into the brain, Gregory’s reporting of Held and Hein’s research (cited later) notwithstanding. This conception of the detached observer eye, the disembodied mindlike eye, the eye as extension of mind, is dualist and objectivist. The shortcoming of such approaches is that they disregard the dynamic perceptuo-motoric nature of visual learning. In a classic experiment by Held and Hein (1958), a group of kittens was reared in total darkness. The kittens were fitted in pairs into a gantry arrangement with two baskets. One basket had holes for the legs such that the physical movements of one kitten would drive both animals through roughly similar spatial experiences. In each case, after a few weeks the kitten that walked and could associate visual information with its own physical movement developed effective vision. The rider in the basket remained functionally blind.

We could never interpret an image of a domestic space, had we not actually moved about in such spaces. Physical experience does not simply disambiguate, it is the key by which images are understood. A baby learns about its visual system via physical exploration and hand-eye experiments. Such experiments might be said to “calibrate” the baby’s visual system but in fact they build it, they build a correlation between visual signals and a kinesthetic/tactile nature of the world. Vision is remote sensing but it is grounded on touch and proprioception, on reaching and grasping, stumbling and falling.

In interactive media a user is not simply exposed to images that may contain representations of things and actions. The user is trained in the enaction of behaviors in response to images, and images appear in response to behaviors in the same way that a pilot is trained in a flight simulator. Passive observation may be shown to have some effect on the beliefs or even the actions of an observer, but an enacted training regime must be a more powerful technique. So critiques of representation derived from painting, photography, film and video are inadequate for discussing the power of interactive experience.

Much debate has occurred on the correlation between pornographic images and sex-crime. Conversations about representations of violence typically conflate movies and computer games, as if they were in the same category. Whatever the power of images, interactive media is more. Not “just a picture,” it is an interactive picture that responds to my actions. Our analysis of interactive media must therefore go beyond theories of representation in images. The image is just the target, the surface. The interactive image cannot be spoken of in the terms of traditional passive images because it is procedural. The content is as much in the routine that runs the image as it is in the image itself. Interactive applications are not pictures, they are machines which generate pictures.

Interactive Art and Antisocial Behavior

In late 2000, I visited an interactive installation that in my opinion was unfortunate and ill-conceived. I mention it here only because in a rather extreme way it raises these issues in a fine art context. The work, Kan Xuan by Alexander Brandt, consisted of a full-length image of a naked Asian woman lying face up projected, life-size, on a crumpled cloth on the cold stone floor in a dark corner. The only way to interact with this work was to stomp on the woman, and the only reward was that she recoiled in pain. If you stomped a lot she faded away. But after a few seconds she returned, like the repressed. The figure never objected or defended herself, but neither did she encourage such treatment. It was simply the only possible mode of engagement presented to the user. Inescapably: the audience is invited to enact violence against a naked, prone woman of color. Here is a case study of the potential of electronic representations to encourage or reinforce behaviors in the real world, in this case racist and/or misogynist behaviors.

There was no explanatory text or any other device to encourage a reflexive reading of the work. It might be argued that the work was blackly ironic: that is, by presenting an extreme image, the work promotes in the viewer a powerfully negative response, and has therefore triggered critical thought. Although this argument sometimes has merit, it falls into the double trap of double coding. First, there is always a sector of the population for whom the work is not ironic, and in such cases it serves to reinforce the values it (ostensibly) works against. Second, ironic double coding allows the speaker or maker equivocality, to be simultaneously for and against. Beyond the projected image and the sheet, the only other element was the soundtrack of a plaintive, wailing song sung by an Asian woman, which added to the air of misery. It bears emphasizing that the figure was not a fashionably fetishized object of S&M porn: she was not wearing high heels or makeup, and there was nothing about her pose contrived to induce desire. Her flesh tones were grayed out and pale, her eyes were closed, abject. While its subject matter was, to my mind, deeply unfortunate, the work was quite proficient technically and formally. Its interface was well designed and self-explanatory, with excellent economy. Kan Xuan by Alexander Brandt was included in the “Beyond the Screen” exhibition of ISEA 2000 in Paris, at the École Nationale Supérieure des Beaux-Arts.

What is immediately relevant to this and other interactive work, is the (to my knowledge, unremarked) gap between “representation” and “enactment.” In the area of digital media practice that is attached to the visual arts, theories of visual representation have been a powerful critical force. But, as I have been arguing, an interactive “representation” is more than a representation. In desktop computer-based interactives, there are additional levels of metaphorization (i.e., of mouse clicks for body movement) but “embodied interactives” are in a kinesthetically different and more literal territory. Although a mouse-click on an HTML document may “represent,” in a manner similar to a picture, the action of turning a page, it is still in the realm of the symbolic. But actually lifting one’s leg and stomping down on a face moves the action several steps toward the literal. It was not a flesh-and-blood face, nor a rubber model of a face, but on the continuum, a responding photo-real interactive image is definitely more “real” than a block of text or a diagrammatic pistol-shooting target.

When soldiers shoot at targets shaped like people, this trains them to shoot real people. When pilots work in flight simulators, the skills they develop transfer to the real world. When children play “first-person shooters,” they develop skills of marksmanship. So we must accept that there is something that qualitatively separates a work like the one discussed above from a static image of a misogynistic beating, or even a movie of the same subject. That something is the potential to build behaviors that can exist without or separate from, and possibly contrary to, rational argument or ideology.

Why Doesn’t the Army Teach Acting?

I have proposed that interactive entertainments must be assumed to have the same pedagogical power as training simulators, and we must therefore consider their ramifications in this light. I have argued that use of such technologies, both for entertainment and for training, is a bodily training and thus its techniques and effects can be related to nontechnological techniques of body training. Yet, a counterexample is sometimes proposed, which appears to problematize the entire argument. It might be phrased this way: If training in simulated worlds is productive of real-world skills, then why don’t actors who play serial killers usually go on to be serial killers? By introducing another term into the equation, that of the theater, such an objection allows us to explore in more detail the relation between the various parts of my argument.

The counterargument assumes that the theater is a virtual world and that acting is a bodily training. This raises certain other questions: What is the difference between acting and simulation? Why doesn’t the army teach acting? What is the difference between, say, battle simulation training and Macbeth, in terms of its residual effect on the psyche of the enactant?

In order to disentangle these questions, we must first decide whether we are considering the actor or the audience. Given that the audience’s relation with the spectacle is not significantly interactive, we must assume we are talking about the actor. Yet the environment of the (conventional) play is not in any substantial sense interactive. The illusion has been built up, painstakingly, by actor and director, piece by piece, over weeks or months. Every movement is known. The complete illusion, with all the characters, costumes, props, lighting, and sound effects is usually only assembled at a late stage in the process. The illusion is not, it must be remembered, for the actor, but for the audience.

Theater is reflexive and double. A world is conjured but it is a world open on one side to the audience, who are not in that world, but are keen to engage in the illusion that they are. People cry over deaths in the theater, knowing full well that the same actor will die the same way the next night. The actor plays to the audience and gauges their reactions; this is the measure of her success. The audience knows she is the character, and also the actor. The stage is colored lights and painted cardboard, but it is also High Dunsinane. The actor learns the part to help produce a persuasive illusion for the audience. The soldier indulges in a persuasive illusion in order to become better at fighting a war, killing and surviving: a distinctly nonillusory experience.

In simulation, in VR, one is encouraged to believe there is no “outside.” The desire is for complete enveloping illusion. The soldier or the student pilot is not encouraged to look reflexively at the artifice but is encouraged to make no distinction between the simulation and the simulated. But cultural experience rides on the celebration of artifice, the manipulation of the threshold of illusion, where it simultaneously is and isn’t. The holy grail of total immersion is ultimately either psychotic or hallucinogenic. No surprise then that Timothy Leary was enlisted as an early booster for VR, which he referred to as “electronic acid.” Much art, on the other hand, enlists the active intelligence of the viewer to discern fugitive patterns on a perceptual, conceptual or narrative level.

Serious Play

My conclusion is that the objection concerning acting and serial killing does not significantly destabilize my basic argument. It may be proposed that engagement with interactive work, or gaming for that matter, is “just a game,” is “play,” and therefore doesn’t matter. This turn of phrase obscures the fact that the truth is the opposite: “play” is a powerful training tool. To return to the key question: what behaviors do these games train? For certainly they do train. Some game production companies are known to implement so-called “reinforcement schedules” based on the long-standing psychological knowledge that intermittent reinforcement is a more effective training technique than consistent reinforcement. In the case of Quake, a simplistic response would be that they improve hand-eye coordination at the computer console. This would be true to their cybernetic human-machine interaction roots. Yet this would be to ignore any interaction between the physical action and the imagery and narrative of the game.

We know that simulators do train effectively. Skills learned in simulation are elicited in the world. A pilot responds “instinctually” to a situation familiar from training. No reasoning is necessary; indeed, that is exactly the point: this training, like football training, is about making responses rapid, reflexive. Any Quake-playing kid knows how to blow away approaching enemies – knows, in fact, according to the logic of the game, that any approaching stranger is an enemy and must therefore be blown away immediately. We must assume that these “learned responses” can also transfer to the real world, if triggered. Such a bodily training may not correlate with any considered political position. Indeed, one might contend that the power of interactive experience is to inculcate behaviors, and these behaviors do not require any ideological correlation. In fact they might work best without such correlation. Such learned behaviors are triggered, without conscious decision making, when the current context matches the conditions of the training context. How does the subject know, unconsciously, to produce a specific trained behavior in a specific context? Subtle codes and emotional tenor must play key roles in such triggering. There is the possibility that such behaviors might be expressed in situations which resemble the visual context or emotional tenor of the gameplay. Which is to say, games and interactive media in general can be powerful inculcators of behaviors, and these learned behaviors can be expressed outside the realm of the game. And if this is true, then it is hard to escape the conclusion that an interactive artwork might encourage misogynistic violence or that first-person shooters actively contribute to an increase in gun violence among kids.

Coda: The Aesthetics of Interactivity

The emerging range of digital social practices calls for a new range of sociological studies, among them a consideration of the implications of interactive technologies, the “cyber-social.” In the arts, the same technologies demand the development of new areas of aesthetics. The “aesthetics of interactivity” can be divided into two mirroring aspects. The machine-centric aspect is concerned with semi-autonomous programs and systems which generate variations of output in response to real-time inputs, with behaviors defined by algorithmic procedures. The second aspect is user-centric and is concerned with the users’ behavior and experience. A deeper analysis of these two aspects, and of the theoretical background relevant to them, is the subject of a more substantial forthcoming study.

Part of the machine-centric aspect concerns the temporal modeling of lifelike systems, with Grey Walter’s “turtles” being early examples. A generation later, Craig Reynolds’s Boids system, of 1987, generated the temporal behavior of a flock of birds in a highly abstracted 3D environment. The Boids are the paradigmatic example of “procedural modeling,” a set of techniques that transformed computer animation. No longer was animation a matter of successions of images, but of computational entities that exhibited temporal behaviors. Reynolds’s work was highly suggestive for the group of researchers who, about that time, claimed the title of “Artificial Life” for their field. Out of this group emerged a new paradigm of semi-autonomous systems premised on genetic and evolutionary metaphors, of which Tom Ray’s Tierra attained early notoriety. Around the same time, some robotics researchers were working with emergent paradigms of robot behavior; many of these were grouped around Luc Steels in Belgium and Rodney Brooks at MIT. Since the late 1980s, the notion of semi-autonomous software entities has proven a rich catalyst for experimentation in both the fine and the applied ends of the electronic arts.
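To make “computational entities that exhibited temporal behaviors” concrete, here is a minimal sketch of the three steering rules usually associated with Reynolds’s Boids: separation, alignment, and cohesion. It is an illustration, not Reynolds’s code. The function names, the two-dimensional setting, and every weight and radius are assumptions chosen for brevity.

```python
import random

def step(boids, radius=10.0, sep_dist=2.0,
         w_sep=0.05, w_ali=0.05, w_coh=0.01, max_speed=2.0):
    """Advance every boid one tick. Each boid is a tuple (x, y, vx, vy).

    All parameter values are illustrative assumptions, not Reynolds's.
    """
    new_boids = []
    for i, (x, y, vx, vy) in enumerate(boids):
        # A boid only "sees" flockmates within a fixed radius.
        neighbors = [b for j, b in enumerate(boids)
                     if j != i and (b[0] - x) ** 2 + (b[1] - y) ** 2 < radius ** 2]
        ax = ay = 0.0
        if neighbors:
            n = len(neighbors)
            # Cohesion: steer toward the neighbors' average position.
            ax += w_coh * (sum(b[0] for b in neighbors) / n - x)
            ay += w_coh * (sum(b[1] for b in neighbors) / n - y)
            # Alignment: steer toward the neighbors' average velocity.
            ax += w_ali * (sum(b[2] for b in neighbors) / n - vx)
            ay += w_ali * (sum(b[3] for b in neighbors) / n - vy)
            # Separation: steer away from neighbors that are too close.
            for bx, by, _, _ in neighbors:
                dx, dy = x - bx, y - by
                if dx * dx + dy * dy < sep_dist ** 2:
                    ax += w_sep * dx
                    ay += w_sep * dy
        vx, vy = vx + ax, vy + ay
        # Clamp speed so the flock does not accelerate without bound.
        speed = (vx * vx + vy * vy) ** 0.5
        if speed > max_speed:
            vx, vy = vx / speed * max_speed, vy / speed * max_speed
        new_boids.append((x + vx, y + vy, vx, vy))
    return new_boids

def make_flock(n, seed=0):
    """Create n boids with random positions and velocities (hypothetical helper)."""
    rng = random.Random(seed)
    return [(rng.uniform(0, 20), rng.uniform(0, 20),
             rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
```

Nothing in the code describes a flock. Any flocking that appears arises from each entity repeatedly applying local rules to its neighbors, which is the sense of “procedural” at issue here.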

In recent years, complex autonomous entities called agents have been a subject of much excitement, and subgenres of research such as “socially intelligent agents” have arisen. (In my presentation at one gathering devoted to this research, the 1997 AAAI Fall workshop on Socially Intelligent Agents at MIT, I coined the term “culturally intelligent agents” to describe my own work; see Penny (2000), “Agents as Artworks and Agent Design as Artistic Practice.”) In such agents, procedural systems are combined with real-time response to both digital and real-world entities. One of the applications for such agents is in the field of interactive drama. This area of work has affinities with hypertextual writing, which itself must be regarded as “procedural literature.” In hypertext, the aesthetic work is as much in the design of the system that will present text according to the user’s behavior, as it is in the construction of the textual elements themselves. In this spirit, Joseph Weizenbaum’s Eliza program of 1965, which modeled the behavior of a nondirective psychotherapist, must be acknowledged as the first procedural “portrait” or character study, the first “socially intelligent agent.”
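The sense in which a program like Eliza is a procedural “portrait” can be suggested with a toy sketch. What follows is not Weizenbaum’s program or script. The three rules, the fallback line, and the name `respond` are invented here for illustration. They show only the general technique of matching a pattern in the user’s words and reflecting it back as a question.

```python
import re

# Hypothetical rules: each pairs a pattern with a reply template.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (\w+)", re.IGNORECASE), "Tell me more about your {0}."),
]
DEFAULT_REPLY = "Please go on."  # nondirective fallback when nothing matches

def respond(utterance):
    """Return a reply built from the first rule that matches the utterance."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Reflect the user's own words back inside the template.
            return template.format(match.group(1))
    return DEFAULT_REPLY

print(respond("I feel trapped."))     # Why do you feel trapped?
print(respond("My mother hates me"))  # Tell me more about your mother.
print(respond("Hello"))               # Please go on.
```

The “character” resides in no single reply. It resides in the rule system that produces replies according to the user’s behavior, which is the point made above about hypertext.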

Viewed as a class, the common aspect of these projects is their “procedurality.” The systems generate behavior on the fly, in the real world or a simulated physical or social context. This phenomenon of computational entities responding to their environments implies a new aesthetic field that we might call “procedural aesthetics” or an “aesthetics of automated behavior.”

It is the second aspect of the aesthetics of interactivity that has concerned me most in this paper: the design of (a context for) user behavior. Each work affords, accommodates, or permits only certain types of behavior. So, the behavior of the user is constrained and in a sense, modeled. The quality of this behavior becomes a key component of the user’s experience. Many works, both at the desktop and as installations such as The Legible City, encourage a calm contemplative manner. Some, such as Quake, elicit rapid reactions and adrenaline rushes, but little physical movement. Others encourage large athletic movement, like my own work, Fugitive.

Yet, consciousness of the bodily action of the user per se is only part of this complex. It is necessary that we begin to be able to discuss with some precision the relationship between user behavior and system behavior, or system expression: graphics, text, sound and mechanical events. As I have argued in this paper, the persuasiveness of interactivity is not in the images per se, but in the fact that bodily behavior is intertwined with the formation of representations. It is the ongoing interaction between these representations and the embodied behavior of the user that makes such images more than images. This interaction renders conventional critiques of representation inadequate, and calls for the theoretical and aesthetic study of embodied interaction.




Arnheim, Rudolf (1954). Art and Visual Perception: A Psychology of the Creative Eye. Berkeley: University of California Press.

—. (1966). Toward a Psychology of Art. Berkeley: University of California Press.

Bourdieu, Pierre (1977). Outline of a Theory of Practice. New York: Cambridge University Press.

Card, Orson Scott (1985). Ender’s Game. New York: Tor Books.

Connerton, Paul (1989). How Societies Remember (Themes in the Social Sciences). Cambridge, UK: Cambridge University Press.

Edwards, Paul N. (1996). The Closed World: Computers and the Politics of Discourse in Cold War America. Cambridge, MA: The MIT Press.

Fernandez, Maria (2000). Presentation at Performative Sites Symposium 2000, Intersecting Art, Technology and the Body, October 24-28, Pennsylvania State University, State College, Pennsylvania.

Foucault, Michel (translated by Alan Sheridan) (1977). Discipline and Punish: The Birth of the Prison. New York: Vintage.

Gombrich, E.H. (1960). Art and Illusion: A Study in the Psychology of Pictorial Representation. Princeton: Princeton University Press.

Gregory, R.L. (1990). Eye and Brain: The Psychology of Seeing. Princeton: Princeton University Press.

Grossman, David (1996). On Killing: The Psychological Cost of Learning to Kill in War and Society. Boston: Little, Brown and Co.

—., and Gloria DeGaetano (1999). Stop Teaching Our Kids to Kill: A Call To Action Against TV, Movie & Video Game Violence. New York: Random House.

Held, R., and A. Hein (1958). “Adaptation of Disarranged Hand-Eye Coordination Contingent upon Re-afferent Stimulation.” Perceptual and Motor Skills 8 (1958): 87-97.

Kaplan, Karen (2001). “The Sims Take on Al Qaeda.” The Los Angeles Times (Nov 2, 2001): A1.

Lantz, Frank, and Eric Zimmerman (1999). “Rules, Play, and Culture: Checkmate.” Merge Magazine 1, no.5 (Summer 1999): 41-43.

Mauss, Marcel (translated by Ben Brewster) (1973). “Techniques of the Body.” Economy and Society 2.1 (1973). Original in Journal de Psychologie Normale et Pathologique, Paris Année xxxii (1935): 271-93.

Murray, Janet (1997). Hamlet on the Holodeck: The Future of Narrative in Cyberspace. New York: Free Press.

Penny, Simon (2000). “Agents as Artworks and Agent Design as Artistic Practice.” In Human Cognition and Social Agent Technology, edited by Kerstin Dautenhahn. Amsterdam: John Benjamins.

Steinberg, Jeffrey (2000). “Interview with David Grossman: Giving Children the Skill and the Will To Kill.” The Executive Intelligence Review, March 17, 2000.