Embodied AI Characters for Emergent Narrative

How AI and augmented reality will usher in a new genre of interactive character design

Imagine yourself five years from now. Apple has come out with a new version of its augmented reality glasses. You have just purchased a pair. Stepping into a café, you order an espresso and sit down to open the shiny new box. After trying on the glasses – and with a bit of fumbling – you manage to get through the mile-long Apple terms and conditions without touching your computer. Instead, you scroll down to the bottom of the form by gesturing in the air with your index finger, making sure not to poke the eye of the person sitting across from you. You tap the air to indicate “Yes – I have read the terms and conditions”… which of course is a lie. Finally, just for kicks, you start running a cool-sounding app: something about “an ongoing narrative with virtual characters”. Now you are ready to go for a walk. You finish your espresso and head out into the street, wearing your new glasses.

Once on the street, you immediately notice a couple of animated characters – emitting an eerie glow and slightly out of place – just outside of the café. They are deep in conversation. They are speaking in a strange accent. You stop and listen, and one of them glances over at you with a glare, annoyed that you are eavesdropping. This makes you a bit nervous and embarrassed.

And that is strange, because you know that these are not real people: They are virtual characters in an ongoing story that is taking place among the streets of your town.

These characters, and several others, have been talking for several months now. They are debating the changes to society that have disrupted their lifestyles. Some of the characters are able to peer into the future and engage with us living here and now in the year 2017 – those of us who happen to have the augmented reality app.

As we join in these virtual conversations, we become incorporated into the unfolding fiction that is playing out. It is an open-ended narrative, acted out by artificially intelligent characters that are experienced only in augmented reality. Increasingly, we humans here in meatspace become intertwined in the narrative. The boundary between fiction and reality dissolves into a fractal curve. Our lives fuse with the narrative – the narrative fuses with our lives. And that takes some getting used to: the nature of narratives changes as we become participants in them.

The vignette I have just described is one possible future manifestation of a set of technologies that point toward a new genre, one in which the boundary between reality and fiction is increasingly difficult to detect. And the key ingredient is a set of artificially intelligent characters with real embodiment: They occupy real place and real time, via augmented reality, geolocation, computer vision, and other technologies that situate them in the physical world. This embodiment is critical to how their narratives play out.

Emergent Narratives vs. Branching Stories

“A story should have a beginning, a middle and an end, but not necessarily in that order.”
– Jean-Luc Godard

Our lives are awash in a sea of overlapping narratives. “Narrative” can be defined as “a spoken or written account of connected events” (Wikipedia). Although the term “story” is often included in this definition, it must be emphasized that a story is a fixed work of creation, which has a beginning, middle, and end. A story is contained, like a song.

But consider Godard’s quote: the linearity of a story can be deconstructed in many ways. Branching storylines have been developed extensively in digital games in which the player has agency in the story, as well as in adaptations of traditional literary media (e.g., the Choose Your Own Adventure book series). Branching stories are assembled from discrete building blocks. And while the ordering of these building blocks may be open-ended, the blocks themselves are fixed and static. The blocks fit together in various ways, but the story is still essentially set by the content of the various blocks.
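
As a rough illustration of this point, a branching story can be sketched as a small graph of fixed blocks. This is only a minimal sketch: the block names, the texts, and the play function are all hypothetical and invented for this example, not taken from any particular game or book. The reader's choices change which blocks appear and in what order, but every line of text comes from the same static set.

```python
from typing import Dict, List

# Hypothetical story blocks, invented for illustration only.
# Each block has fixed text and a fixed set of outgoing choices.
STORY: Dict[str, dict] = {
    "cafe": {"text": "You put on the glasses in the café.",
             "choices": {"walk": "street", "stay": "table"}},
    "table": {"text": "You order a second espresso.",
              "choices": {"walk": "street"}},
    "street": {"text": "Two glowing characters argue outside.",
               "choices": {"listen": "glare", "leave": "home"}},
    "glare": {"text": "One of them glares at you.", "choices": {}},
    "home": {"text": "You walk home alone.", "choices": {}},
}


def play(story: Dict[str, dict], start: str, choices: List[str]) -> List[str]:
    """Follow a sequence of choices and return the block texts visited."""
    node = start
    visited = [story[node]["text"]]
    for choice in choices:
        options = story[node]["choices"]
        if choice not in options:
            raise ValueError(f"'{choice}' is not available at block '{node}'")
        node = options[choice]
        visited.append(story[node]["text"])
    return visited


if __name__ == "__main__":
    # Two readers, two orderings, one fixed pool of blocks.
    for line in play(STORY, "cafe", ["walk", "listen"]):
        print(line)
    print("---")
    for line in play(STORY, "cafe", ["stay", "walk", "leave"]):
        print(line)
```

However the reader chooses, nothing new is ever said: the paths differ, but the content of each block was written in advance.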

Can a building block be broken down into sub-chunks? Certainly. But beware. There is no logical end to this line of reasoning – one falls into a kind of Zeno’s Paradox. The smaller and more numerous the chunks, the harder it becomes to compose the glue that holds these chunks together to create meaning. After all, a story is more than just a sequence of events. So, if you want infinitely small and infinitely many story “atoms”, you’ll be left with just glue… and a pile of atoms.

Simulation

But don’t despair: simulation can come to the rescue! We have only just begun to tap the vast potential of artificial intelligence and other technologies to design simulations that permit narratives to emerge from the very atoms of virtual reality. But while we are waiting for the technology to become advanced enough to make this viable, I would claim that we have a perfect shortcut: Us!

It may be tempting to claim that a deep simulation using the power of Artificial Intelligence (AI), physics, behavioral psychology, and other basic laws of nature can be tuned just right to make spontaneous events happen in a virtual world that are “meaningful.” But that is a tall order, and some may even claim that it can never happen. This is why I am suggesting the inclusion of us – real people with already rich intertwined narratives – in the mix. Think of the simulation matrix as dead soil: Add water, microbes, and seeds, and something will start to grow. True AI doesn’t just work right out of the box – it has to grow, it has to learn – it needs fertile soil.

For this reason, I suggest that we do not need super-intelligent AI systems that emulate high-level human reasoning, emotion, and narrative intelligence. The AI can be just smart enough and, more importantly, able to react to us and learn from us, absorbing our own meanings into the fabric of the simulation.

The Artificial Life Approach: Starting with a Primordial Soup

For several decades I have been developing a technology that I started while doing research at the MIT Media Lab in the early ’90s. It takes as inspiration the craft of artificial life: Designing virtual petri dishes from which lifelike behaviors emerge. Since that time, the toolset has exploded to include more sophisticated genetic algorithms, physics simulations, neural nets and much more. Concurrently, the rise of machine learning algorithms will help tap vast databases to extract something resembling meaning.

But there is still something missing from this toolset: Virtual body language. In order for AI to be expressive, it needs some form of embodiment. For this reason, I’ve focused on cartoon-style characters – having just the right set of expressive affordances and the ability to learn adaptations to provide an affective dimension. These characters also have a degree of reactive agency, such that narrative-like moments can emerge spontaneously.

With this simplified approach, one can avoid the uncanny valley as well as keep things real and actionable. We will eventually get to human-like intelligence, but I am in no rush. And besides, replicating ourselves accurately may not actually be what the future calls for.

What the future may be calling for is an augmentation of our own narratives with highly connected artificial agents that have emotional intelligence, expressive body language, access to the internet’s crowd-wisdom, and a strong association with time and place – being truly embodied and situated. Their responsiveness to us complex humans here in meatspace would give the characters endless fodder for generating continuous emergent narrative.