Mental Models

Some images on this post have been lost.

The reality that we sense in front of us is a fiction created by our brains. A host of modules process information in various ways and the end result is a mental model of the outside world. Knowing how this works is crucial to game development as the shape of these mental simulations has a huge effect on how a game feels and plays.

Look around the room or the place you are currently in. It certainly feels like what you are seeing is really there, right? However, that’s not really the case. Reality is in fact made up of subatomic particles that constantly exchange various force particles amongst each other [1]. What you think of as a chair is really just a collection of particles that happen to form a temporarily semi-stable configuration. The reason why you see it as a chair only has to do with how your brain chooses to process the various data that it collects through its senses.

In the previous post on presence I mentioned how the brain is made up of modules, each of them having their own specific purpose. The results from these various modules are then used to form a collective image of your surroundings. For instance, there is a particular module that recognizes faces. If it is damaged, the affected person can no longer recognize people by their faces – they will only see an object made up of some hair, a nose, two eyes and so forth. Recognizing individual people will only be possible if they have a particularly stand-out feature, like a large beard. Apart from that, all faces will look alike to this person. The normal flow of information is broken and something that most of us take for granted, an intrinsic part of our reality, is no longer present.

This is an extremely important point and it’s essential to fully grasp it. It’s not as if people who lose the ability to see faces still really see faces but don’t “recognize” them. This is the good old “homunculus in the head” fallacy. When you look at the world around you, you are not really seeing details. You are being fed a stream of information and that stream contains things like “that is a chair”, “the chair is made of wood”, “that is the face of your mother”, and so on. If the brain module that does the processing needed for a particular piece of information is damaged, it’s not like your “mental view” remains the same – information is what your mental view is made up of. To get a better idea of this, look at this image:

When first looking at it, most people see this image as simply a collection of dots. But if you look carefully for a bit you will see the form of a dog appearing. Once you have managed to spot this dog, it becomes impossible to unsee. Your brain has gone from interpreting the image as a collection of dots to seeing it as a dog. If you were to lose a brain module this process would be reversed. What was once an image of a dog would turn into a collection of dots. The dog would not still “be there” – it would be erased from your perception of reality.

Your view of reality is not what reality is like; it is a mental simulation based on interpretations of data collected by your senses. You are really living your life in a sort of virtual world that the brain constructs for you [2].

This doesn’t mean that your view of reality is a complete lie, though. It is still based on things that do exist and is a crucial tool for getting around and being able to make decisions. Even though a chair is a made-up concept with no basis in reality, it still is very useful. It tells you something about what to expect and what your options are. For instance, if you are presented with the choice of sitting down either on a chair or on a pile of broken glass, your mental simulations are invaluable and can quickly give you pretty accurate estimates of what sitting down on each of the alternatives would mean. Note that these mental simulations are not confined to a single aspect of an object. There are things like shape, materials, current light conditions, the physical dimensions, emotional attachment, ownership and many other things that are all connected to an object. When you focus your gaze on an object, that is what you “see” – not some crystal clear pixel-by-pixel representation.

This array of properties is not always correct, though. For instance, if you try and pick up a carton of milk that your brain has modeled as filled (=heavy) and it turns out to be empty (=light), you will lift it with way too much force. But most of the time, because of the practice you’ve had at experiencing reality, your brain is pretty good at providing a good simulation.

Let’s move on to games. When you are playing a game, you are not playing the game that is presented on the screen. You are playing the game that you are currently modelling in your mind. The brain turns clusters of pixels into abstract icons (e.g. “a power-up”) and then attaches all sorts of concepts to them, in just the same way as it does when you encounter a chair in real life. The modules in your brain use pre-existing knowledge and experience from interacting with the game and build up a mental model of how it is all connected.

The best example I know of this is from Brian Upton’s book “The Aesthetics of Play”. In the book he presents the example of navigating an environment in a game. What doesn’t happen is that the player bumps into every wall and object, trying to figure out the bounds of the simulation. Instead the player analyses the scene in front of them and then mentally figures out a path to follow. This means that there is a lot of gameplay that takes place inside the player’s head. In fact, unless the player is actively trying to test the systemic bounds of the game, almost all gameplay happens within the player’s mental simulation of the game.

What all of this means is that we should be less concerned about the data (images, sounds, etc.) that we send to our players and focus more on the sorts of mental simulations it gives rise to. This is an extremely important aspect of game making, and it has far-reaching consequences. No matter how much more realistically you render an object, it doesn’t matter if the player’s mental model chooses to represent it as something else.

The mental model is closely linked to our ability to anticipate. This is something that happens in all kinds of media [3]. For instance, when we watch a film and a character steps on a banana peel, we predict that they will slip and fall. As we see the foot approaching the banana, our brain is already simulating possible outcomes, and various filmic tricks, such as editing, are based around this happening in our minds. All mediums rely on this, but creating anticipation in games is extra tricky because of interaction.

In order for us to work with this we need to learn how these mental models are formed. There are three basic ways in which this happens: by using built-in knowledge, extrapolating from past experiences or learning through experimentation. These three modes complement one another, but it is useful to start by looking at them one at a time.

Built-in Knowledge

This is the knowledge our brains come equipped with when we are born. It is essential to being human and you can pretty much assume that anyone playing the game will have it. Basic things like shape, lighting, perspective and so forth are all part of this category. It also includes an understanding of behaviors, like how pouring the content of a large glass into a smaller one will cause it to overflow, how 3D shapes rotate and how objects ought to act if you drop them. Social things like facial expressions are also part of this sort of knowledge. The facial expression connected to disgust is universal, hardwired, and does not depend on mimicking.

The one thing you need to realize about any built-in knowledge is that it’s extremely hard to break. It takes a lot of effort to convince a person that dropping a ball will make it fall upwards. It is basically impossible to make a person intuitively see a mad face as a positive response. This is all hardwired knowledge that comes with equally interesting pros and cons.

If you can tie some basic functionality of your game directly to some built-in knowledge then it will instantly come off as intuitive to any player. For instance, if you want the player to feel disgusted by an enemy it’s good to know that disgust is a disease-avoiding behavior. This knowledge allows you to trigger built-in responses and also suggests what sort of events and interactions will strengthen a mental model that gives rise to feelings of disgust.

Conversely, if your gameplay relies on something that goes against built-in knowledge, you either need to be prepared to spend a lot of time building the proper mental model or to ditch the concept altogether. Sometimes it is of course OK to break the rules, but remember that conforming to built-in knowledge is what makes a world seem believable. And if you want to focus on evoking basic human emotions, this basic believability is crucial. Without that you also lose a bunch of connections which are foundational to our emotional world.

Past Experiences

This is a huge area and it includes everything the player has learned throughout life. It is also something that can vary culturally. What I will focus on right now are two parts of this: past experiences with games versus past experiences with real life.

When you are first presented with a scene in the game there is a ton of stuff for you to process. If you see a red barrel and you have played games in the past, there is a big chance that you will think the barrel will explode when shot. This interpretation relies on more than simply having encountered this specific object before. It relies heavily on what sort of game you are playing (a point and click behaves differently from a Quake-like shooter), what actions you think are possible (can you shoot it?), and so forth. So players come in with a lot of expectations and preconceptions on how things ought to behave. All of these will not just change how the player feels about the game, they will directly affect what the player thinks the game is actually like.

A monster can either be a horrible threat that you wanna keep away from, or it can be the source of what makes the game fun in the first place. The view the player takes directly affects how they behave and also has a far-reaching effect on the experience of playing the game. For instance, in our game Penumbra the player has the ability to use weapons but they are very weak and inefficient. For players that interpreted the game as one where you’d best avoid any monsters, this worked great and they used the weapons as a last desperate effort to escape – as we had intended. Their mental model was one where the weapons and monsters were just like in real life. For other players the game was interpreted as one where you could fight back. For these players it didn’t work at all. The weapons felt frustrating to use and the monster was an annoyance. Their mental model was based on how videogames usually work. Despite interacting with the same system, seeing the same visuals and hearing the same sounds, these two types of players experienced radically different games [4].

To combat this in Amnesia: The Dark Descent we started the game with a quick notice on how the game was supposed to be played. This, together with other design changes of course, made a huge difference in how players approached the game. Unlike built-in knowledge, things learned from past events are quite malleable and it is possible to adapt them according to new situations. Which leads us to the final foundational way in which mental models are formed.

Experimentation

From the moment we are born (and possibly even earlier) our brains are hardwired to analyze, generalize and make assumptions. Whenever we encounter a new object we try it out in a variety of ways (squeezing, chewing, throwing, etc) in order to figure out what it is like. We then store that information and pull it out whenever we encounter a similar object. Everyone who has been near a small child knows about this process, and so does everyone who has played an unfamiliar game.

As noted before, the moment we see a scene from a new game, we make a whole load of assumptions about what everything is like and how it functions. But it is not until we get to interact with the scene that our assumptions get confirmation and are cemented. Unless the game is similar to another game we’ve already played, we know that we have new lessons to learn. These first impressions are crucial to how the rest of our experience is shaped [5]. This is why the opening of a game is so important. If a player gets the wrong idea about something it can be really hard to get rid of that faulty mental model.

Once the player interacts with something it will tell them about some aspect of the object, for instance whether they can pick it up or not. The player will then try to generalize this knowledge, often by using pre-existing information. So if a glass bottle can be picked up, they will assume that it’s possible to pick up plastic bottles as well. Furthermore, if the player throws a glass bottle and it breaks, they will assume that everything made of glass is breakable. And so the experimentation continues as the game is played. Every new aspect is connected to other things the player already knows and an increasingly detailed mental simulation is built. The next time the player finds a bottle lying around, a lot of attributes will be assumed the moment it comes into view.
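
The generalization described above can be sketched as a small toy model. This is only an illustration, not a claim about how the brain or any particular game engine works; the class `MentalModel` and the attribute names (`shape`, `material`, `can_pick_up`, `breakable`) are made up for the example. An observation about one object is stored against one of its properties and is then reused as a prediction for objects the player has not touched yet.

```python
from collections import defaultdict


class MentalModel:
    """Toy model of a player generalizing from single observations."""

    def __init__(self):
        # Maps (property, value) -> {behaviour: outcome}, for example
        # ("material", "glass") -> {"breakable": True}
        self.beliefs = defaultdict(dict)

    def observe(self, obj, behaviour, outcome, because_of):
        """Record an outcome and attribute it to one property of the object."""
        key = (because_of, obj[because_of])
        self.beliefs[key][behaviour] = outcome

    def predict(self, obj, behaviour):
        """Return the assumed outcome, or None if nothing similar was seen."""
        for prop, value in obj.items():
            outcome = self.beliefs.get((prop, value), {}).get(behaviour)
            if outcome is not None:
                return outcome
        return None


# Hypothetical objects described only by the properties the player notices.
glass_bottle = {"shape": "bottle", "material": "glass"}
plastic_bottle = {"shape": "bottle", "material": "plastic"}
glass_vase = {"shape": "vase", "material": "glass"}

model = MentalModel()
model.observe(glass_bottle, "can_pick_up", True, because_of="shape")
model.observe(glass_bottle, "breakable", True, because_of="material")

print(model.predict(plastic_bottle, "can_pick_up"))  # True: it is a bottle
print(model.predict(glass_vase, "breakable"))        # True: it is glass
print(model.predict(plastic_bottle, "breakable"))    # None: no assumption yet
```

Two interactions with a single glass bottle are enough for the model to hold expectations about objects it has never touched, which is why an early wrong lesson spreads so far.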

The basic gist of the above shouldn’t be too surprising, as it’s pretty basic stuff. But the key thing to remember here is that these are not just things that form opinions. They form actual reality for the player.

To be able to look at an object and assume a bunch of attributes is what makes the world feel alive. It allows the player to use their hardwired brain faculties to explore, interact and make plans. The world might be rendered using toon shaders and feature talking rabbits, but if it allows for a rich mental model it will feel “real”. Remember, it isn’t about the objective facts of what you see (e.g. a teapot using highly realistic PBR-based shading), but what processing it gives rise to.

In order to make this happen, you can’t just put objects and interactions into a world at random. The player must be able to explore the elements of the world, and in doing so they must be met by a consistent set of rules. The brain doesn’t have an infinite amount of resources, and will therefore optimize when possible.

So if an object looks like something found in the real world, but you are unable to interact with it, it will not be given any further attributes. As it isn’t of any importance, it will simply become part of the background. In a similar vein, the simplest explanation will also be used when possible. If there are ten keys lying on a table, but only the one that unlocks the door can be picked up, then players will stop modeling these objects as any sort of real keys. They will instead be seen as quest items, possible to pick up when it is convenient for the designer. When there’s no consistency of any sort, the player’s brain will just skip trying to do any modeling and rely on direct experimentation when needed (trial and error, basically). In these cases, players will have a very fuzzy mental model of an object and the object won’t feel very “real”.
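
One way to catch the ten-keys problem early is to audit the object data for inconsistent rules. The following is a rough sketch under assumed data: the `WorldObject` record, its `kind` and `can_pick_up` fields and the `find_inconsistent_kinds` helper are all hypothetical and not taken from any real engine. It simply flags every kind of object where some instances allow an interaction and others do not.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class WorldObject:
    name: str          # hypothetical identifier
    kind: str          # what the player will read the object as, e.g. "key"
    can_pick_up: bool  # the rule the game actually applies


def find_inconsistent_kinds(objects):
    """Return kinds whose instances disagree on whether they can be picked up."""
    rules_by_kind = defaultdict(set)
    for obj in objects:
        rules_by_kind[obj.kind].add(obj.can_pick_up)
    return sorted(kind for kind, rules in rules_by_kind.items() if len(rules) > 1)


# Ten keys on a table, but only the first one (the quest key) can be picked up.
table = [WorldObject(f"key_{i}", "key", can_pick_up=(i == 0)) for i in range(10)]
table.append(WorldObject("candle", "candle", can_pick_up=True))

print(find_inconsistent_kinds(table))  # ['key']
```

A flagged kind is not automatically a bug, but it marks a place where players are likely to stop modeling the objects as real things and start treating them as quest items.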

An important aspect of this is that it’s not always a bad thing that the brain optimizes away things. For instance, if you are making a simple shooter you don’t really need to take any wall ornaments into account. You should just focus on the overall layout and the positions of the monsters. Everything else is a distraction.

It is, however, crucial to keep all of this in mind. There may be many cases where you don’t want the player to optimize away certain objects. If you want the player to feel like the environment is a real place, you really need to make sure that as many details as possible can have intricate attributes in the player’s mental simulation. It becomes even more important for characters where you want the player to model internal emotions, needs and goals. If your goal is to make the player feel like they are encountering real people, you want those people to be part of their mental model. This is what it means to make something feel real and alive.

All of this doesn’t mean that one’s goal should be to model everything in as detailed a way as possible. In fact, in many cases this may be counterproductive. Details could mean the player makes more assumptions, leading to the structure being more fragile and more likely to crumble. Keep in mind that all we want to worry about is the end result – how the player perceives the experience. The actual content – images, sounds and so on – that we send to the player is just a means to an end.

It’s at this point where narrative-focused games become very different from classical ones. In a classical videogame, it’s almost always a good thing for the player to learn the systems exactly as they are. The better the player understands how all the underlying mechanics work together, the more competently they can play the game and the more fun they will have. Narrative-focused games are different. Here we often want to suggest a lot more than what is in the systems that we have at our direct disposal. Pulling this off requires a collection of tricks where the common thread is to try and make the player do the hard work. I will go over these tricks in future blog posts.

Next week there will be a discussion on how systems and story come together to form a mental model and more discussions on the most common pitfalls and opportunities when designing for mental simulations that feel alive.

Footnotes:

[1] It is actually much more complicated than this, as your current reality is a sort of vertical slice of a much larger Hilbert space where everything is modeled as waveforms.

[2] And even the idea of a “you” is a mental construct. Check the previous blog on presence for some discussion on this.

[3] Brian Upton goes very in-depth into this area in his book.

[4] The game was not this evenly divided into groups, but the general gist was this kind of behavior.

[5] There are a lot of psychological reasons for this, such as the ultimate attribution error and anchoring.