By Dani in Gaming — May 29, 2023

An architectural framework for Generative Agents

Exploring the generative agent architecture proposed by researchers from Stanford University and Google, and its three modules.

Introduction

In this article, I will explore a generative agent architecture proposed by Park et al., a team of researchers from Stanford University and Google. It consists of three major components: the Memory Stream module, the Reflections module and the Planning and Reacting module.

This revolutionary approach opens up new possibilities for AI-driven entities that mimic human-like behaviour and cognition, paving the way for advanced and captivating gaming and virtual environments.

This article is part of a series dedicated to exploring the fascinating world of generative agents. To read the first part of the series, see the link below:

[Part 1] How generative agents will revolutionise believability in Video Games

The Generative Agent Architecture

Park et al. [1] proposed a generative agent architecture that includes three major components:

a. Memory Stream: A long-term memory module that records the agent’s experiences in the form of ‘memory objects’. Each memory object contains a natural-language description, the time it was created and the time it was most recently accessed by the agent. The most basic item of a memory stream is an observation, which is something that the agent perceives while acting within the environment. An observation can relate to the agent itself, to an object or to another agent.

To identify relevant memory objects, agents utilise a retrieval function assigning a score to each memory based on how recent it is, how important it is, and how related it is to what’s happening now.

The top-ranked memories that fit in the language model’s context window are included in the prompt.

b. Reflections: Higher-level, more abstract thoughts that the agent periodically generates from its memories. Reflections are stored in the memory stream and included alongside observations during retrieval. They provide an advantage when making decisions that require a deeper synthesis of experiences. Because a reflection can build on observations as well as on earlier reflections, reflections form trees of memory objects.

c. Planning and Reacting: A module that keeps agent behaviour coherent and believable over time by converting observations and reflections into high-level action plans, which are then broken down into sequences of finer-grained actions. Agents can react and update their plans in response to new observations stored in their memory or when they engage in dialogue with other agents.
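To make the retrieval step more concrete, here is a minimal Python sketch of a memory object and a scoring function that combines recency, importance and relevance. It is an illustration under stated assumptions, not the authors' implementation: the names `MemoryObject`, `retrieval_score` and `retrieve` are hypothetical, the decay factor, the 1 to 10 importance scale and the equal weighting of the three terms are illustrative choices, the embeddings are assumed to come from some text-embedding model, and a word count stands in for a real tokenizer.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryObject:
    description: str        # natural-language description of the experience
    created_at: float       # game time (in hours) when the memory was recorded
    last_accessed: float    # game time (in hours) when it was last retrieved
    importance: float       # assumed 1-10 scale, e.g. rated by a language model
    embedding: list[float]  # assumed output of some text-embedding model


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieval_score(memory, query_embedding, now, decay=0.995):
    """Sum of recency, importance and relevance (equal weights assumed)."""
    recency = decay ** (now - memory.last_accessed)  # fades as hours pass
    importance = memory.importance / 10.0            # rescaled to 0.1-1.0
    relevance = cosine_similarity(memory.embedding, query_embedding)
    return recency + importance + relevance


def retrieve(memories, query_embedding, now, token_budget):
    """Return the top-ranked memories that fit within the token budget."""
    ranked = sorted(
        memories,
        key=lambda m: retrieval_score(m, query_embedding, now),
        reverse=True,
    )
    selected, used = [], 0
    for memory in ranked:
        cost = len(memory.description.split())  # crude stand-in for a tokenizer
        if used + cost > token_budget:
            break  # stop at the first memory that no longer fits
        selected.append(memory)
        used += cost
        memory.last_accessed = now  # retrieval refreshes the access time
    return selected
```

The descriptions of the memories returned by `retrieve` would then be placed in the prompt. Updating `last_accessed` on retrieval means that memories the agent keeps coming back to stay "recent", while untouched ones gradually fade.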

Built around the memory stream, these modules enable non-player characters to store and recall observed events, reflect on past experiences, plan their actions and respond to current events, resulting in more realistic and convincing behaviour.
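The way reflections accumulate into trees can be sketched as follows. This is a simplified illustration rather than the method from the paper: the `Memory` class, the `maybe_reflect` and `depth` functions, the threshold value and the rule of triggering a reflection once the summed importance of new observations crosses that threshold are all assumptions made for the example. The `summarise` argument is a placeholder for a language-model call that turns several descriptions into one higher-level statement.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    description: str
    importance: float
    kind: str = "observation"  # either "observation" or "reflection"
    evidence: list["Memory"] = field(default_factory=list)  # child nodes


def maybe_reflect(stream, summarise, threshold=15.0):
    """Append a reflection if enough has happened since the last one."""
    # Walk backwards to collect observations made since the last reflection.
    recent, previous = [], None
    for memory in reversed(stream):
        if memory.kind == "reflection":
            previous = memory
            break
        recent.append(memory)
    recent.reverse()

    # Assumed trigger: summed importance of new observations hits a threshold.
    if sum(m.importance for m in recent) < threshold:
        return None

    # The new reflection builds on the previous one (if any) plus the new
    # observations, which is what makes the structure a tree.
    evidence = ([previous] if previous is not None else []) + recent
    reflection = Memory(
        description=summarise([m.description for m in evidence]),
        importance=max(m.importance for m in evidence),
        kind="reflection",
        evidence=evidence,
    )
    stream.append(reflection)
    return reflection


def depth(memory):
    """Height of the tree below a memory; plain observations have depth 0."""
    if not memory.evidence:
        return 0
    return 1 + max(depth(child) for child in memory.evidence)
```

Since reflections are appended to the same stream as observations, a retrieval function can rank both kinds of memory together without treating them differently.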

To implement the proposed architecture effectively, developers should ensure that the memory stream captures a comprehensive list of experiences relevant to the context of the game, and that agents can make informed decisions based on the reflections they generate. They should also make sure that agents have robust planning and reacting capabilities, so that their behaviour and personality remain consistent as they adapt to state changes in the setting.

When designing generative agents for a storyworld, it’s important to also keep the player’s possible behaviour in mind. As the “human-in-the-loop,” players provide ongoing feedback to both the agents and the environment. Generative agents should therefore be able to respond appropriately to unexpected feedback, so that no shocks are introduced into the storyworld that could disrupt the narrative and the overall design. Players may not always provide clear and concise inputs, so an agent may need to infer the player’s intent from their actions. Players may also change their mind, make mistakes, or intentionally try to disrupt the game, so the agent needs to adapt to these changes and eventually re-align with the original design of the storyworld.

Creating generative agents with this architecture has the potential to revolutionise the way we interact with NPCs. Because generative agents are not necessarily bound to deterministic processes and rules, they can act in more complex and unpredictable ways while remaining coherent with the storyworld that unfolds around them. This could allow the design of more engaging, interactive and immersive experiences in gaming and virtual environments, and pave the way for more advanced AI-driven entities that mimic human-like behaviour and cognition.

In this article we unravelled the capabilities and potential of generative agents. In the next one, we will delve even deeper and explore the potential application of generative agents in contexts that go beyond the gaming industry.

Read part 3 here