Soon, video games may no longer be programmed line by line by developers, but generated in real time by an AI capable of creating dynamic, interactive environments unique to each game. This is no longer science fiction: it’s the reality that Google DeepMind, in collaboration with its partners, is making possible with its project GameNGen.

GameNGen, Google’s new innovation, is a game engine powered entirely by a neural model.

It redefines the way video games are designed and played, replacing traditional coding methods with AI capable of generating every element of the game in real time.

This project, which has already surprised many industry experts, could mark the beginning of a new era for video game creators, making the development process faster, cheaper and more accessible to a wider range of creators.

But what makes GameNGen so special? What challenges might this technology overcome, and how might it transform the video game industry as a whole?

This article takes a deep dive into the details of this innovation, exploring how it works, its early achievements, including the recreation of the iconic game Doom, as well as its future potential in various fields beyond mere entertainment.

What is GameNGen?

GameNGen is the result of a collaboration between Google DeepMind, Google Research, and Tel Aviv University.

In contrast to traditional game engines, such as Unreal Engine or Unity, which rely on predefined code and rules set by developers, GameNGen stands out for its ability to simulate and create every element of the game on the fly, based on the player’s actions.

To fully understand the impact of this innovation, it’s essential to compare GameNGen to traditional game engines.

Traditional engines, such as those used for popular games like Fortnite or Minecraft, are designed to interpret player commands (like pressing a key or moving the mouse), update the game state accordingly, and then render the image on screen.
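The loop described above (read input, update state, render) can be sketched in a few lines. This is a deliberately tiny illustration, not how any real engine is written: the `GameState` fields, the command names and the text-based `render` function are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GameState:
    x: int = 0            # player position on a hypothetical one-dimensional map
    shots_fired: int = 0


def update(state: GameState, command: str) -> GameState:
    """Apply hand-written rules to the current state (the 'update' step)."""
    if command == "left":
        return GameState(state.x - 1, state.shots_fired)
    if command == "right":
        return GameState(state.x + 1, state.shots_fired)
    if command == "shoot":
        return GameState(state.x, state.shots_fired + 1)
    return state  # unknown commands leave the state unchanged


def render(state: GameState) -> str:
    """Turn the state into something displayable (the 'render' step)."""
    return f"player at x={state.x}, shots fired={state.shots_fired}"


def run(commands: List[str]) -> List[str]:
    """Classic loop: read a command, update the state, render, repeat."""
    state = GameState()
    frames = []
    for command in commands:
        state = update(state, command)
        frames.append(render(state))
    return frames


frames = run(["right", "right", "shoot", "left"])
```

The key point is that every rule lives in code a developer wrote by hand, and the image is derived from an explicit game state.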

With GameNGen, this approach is completely reinvented. The engine uses artificial intelligence, specifically a neural model, to observe player actions in real time and instantly generate game content.

How GameNGen works

At the heart of GameNGen is a “diffusion model”. This model works much like an advanced predictive system: it “watches” how the game evolves frame by frame, like an observer watching a frame-by-frame animation, and then predicts what should appear next based on the player’s actions.

In one respect, GameNGen’s diffusion model resembles large language models such as GPT: it is autoregressive. Just as a language model predicts the next word from the words that came before, GameNGen predicts the next image from a sequence of past images and actions.

The result is a game that evolves dynamically and realistically, with the player’s every action directly influencing the game environment forming in front of them.

With GameNGen, not only is the game generated in real time, but it also maintains a fluidity that rivals that of traditional game engines.

For example, if a player decides to turn left or shoot, GameNGen’s neural model predicts and generates the new game state and renders the corresponding images.
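The frame-by-frame prediction described above can be sketched as a loop in which each generated frame is fed back in as history. Everything here is a toy: `predict_next_frame` is a hypothetical stand-in for the neural model (it simply shifts a row of pixels), and the frame format and `context_len` value are assumptions chosen to keep the example short.

```python
from collections import deque
from typing import Deque, List

Frame = List[int]  # a toy "image": one row of pixel values


def predict_next_frame(frames: Deque[Frame], actions: Deque[str]) -> Frame:
    """Hypothetical stand-in for the neural model.

    A real model would run a diffusion network here; this toy version just
    shifts the latest frame according to the latest action.
    """
    last = frames[-1]
    action = actions[-1]
    if action == "left":
        return last[1:] + last[:1]
    if action == "right":
        return last[-1:] + last[:-1]
    return list(last)


def play(first_frame: Frame, inputs: List[str], context_len: int = 4) -> List[Frame]:
    """Autoregressive rollout: each prediction becomes part of the context."""
    frames: Deque[Frame] = deque([first_frame], maxlen=context_len)
    actions: Deque[str] = deque(maxlen=context_len)
    shown = []
    for action in inputs:
        actions.append(action)
        frame = predict_next_frame(frames, actions)
        frames.append(frame)  # no separate game state: history is all there is
        shown.append(frame)
    return shown


generated = play([1, 0, 0, 0], ["right", "right", "left"])
```

Unlike a traditional engine, there is no explicit game state in this loop: the only thing carried forward is the recent history of frames and actions.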

This process is efficient enough for GameNGen to run Doom at over 20 frames per second on a single TPU (Tensor Processing Unit).

This is a major achievement, because for a game to be fluid and responsive, it needs to run at a high frame rate. Gamers are sensitive to “lags” or slowdowns in a game, which can seriously detract from the gaming experience.

GameNGen model architecture

GameNGen’s architecture has three parts: an automatic agent that plays the game and collects data, a modified diffusion model that generates consistent images from this data, and a fine-tuned decoder that ensures high image quality, even in the finest detail.

Data collection via an automatic agent

To train a generative model like GameNGen, large amounts of game data are required.

However, obtaining this data using human players would be too costly and time-consuming. Therefore, as a first step, a reinforcement learning agent (RL-agent) is used.

This agent is a program that learns to play the game on its own, by observing actions and results. This agent’s game episodes, including actions taken and images observed, are then recorded and used as training data for GameNGen’s generative model.
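The recording step can be pictured as follows. This is a minimal sketch under loud assumptions: the one-dimensional `toy_step` game is invented for the example, and a random policy stands in for the agent, whereas the real agent is trained with reinforcement learning.

```python
import random
from typing import List, Tuple

ACTIONS = ["left", "right", "shoot"]


def toy_step(position: int, action: str) -> int:
    """Hypothetical one-dimensional game: only movement changes the observation."""
    if action == "left":
        return position - 1
    if action == "right":
        return position + 1
    return position


def record_episode(steps: int, rng: random.Random) -> List[Tuple[int, str]]:
    """Play one episode and log (observation, action) pairs as training data."""
    position = 0
    trajectory = []
    for _ in range(steps):
        action = rng.choice(ACTIONS)  # a trained RL agent would choose here
        trajectory.append((position, action))
        position = toy_step(position, action)
    return trajectory


rng = random.Random(0)
dataset = [record_episode(steps=5, rng=rng) for _ in range(3)]
```

Each entry pairs what was on screen with what the agent did next, which is exactly the kind of sequence the generative model needs to learn from.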

Training the generative diffusion model

The base model used to generate the images in the game is a modified version of a model called Stable Diffusion v1.4.

Stable Diffusion normally generates images from a text prompt. In the modified version, the model is instead conditioned on a sequence of past actions and observations, i.e. previous game images and player inputs.

To improve the accuracy and stability of the generated images, Gaussian noise is added to the past images fed to the model during training.

This noise simulates errors or imperfections, and the model thus learns to correct these errors, which is crucial for maintaining stable, consistent images, even over long periods of play.
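A rough sketch of this idea, using only the standard library: past frames are corrupted with Gaussian noise before being shown to the model, so it learns to cope with imperfect history. The function names, the uniform sampling of the noise level and the `max_sigma` value are assumptions made for illustration, not details taken from this article.

```python
import random
from typing import List, Tuple


def add_gaussian_noise(frame: List[float], sigma: float, rng: random.Random) -> List[float]:
    """Return a copy of the frame with zero-mean Gaussian noise of std sigma."""
    return [pixel + rng.gauss(0.0, sigma) for pixel in frame]


def corrupt_context(
    context: List[List[float]], max_sigma: float, rng: random.Random
) -> Tuple[List[List[float]], float]:
    """Noise every past frame with one randomly sampled noise level.

    The sampled level is returned too, so that a model could be told how
    corrupted its context is (an assumption of this sketch).
    """
    sigma = rng.uniform(0.0, max_sigma)
    noisy = [add_gaussian_noise(frame, sigma, rng) for frame in context]
    return noisy, sigma


rng = random.Random(42)
clean_context = [[0.0, 0.5, 1.0], [0.1, 0.6, 0.9]]
noisy_context, level = corrupt_context(clean_context, max_sigma=0.7, rng=rng)
```

During training, the model would receive `noisy_context` but still be asked to produce the clean next frame, which is what teaches it to recover from its own small mistakes at play time.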

Latent decoder fine-tuning

Stable Diffusion uses an auto-encoder to compress game images: each 8×8 pixel patch is reduced to 4 latent channels.

However, this compression can introduce artifacts, i.e. imperfections in the generated images, particularly in fine details such as the display of on-screen information (for example, the player’s status bar).
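To get a feel for how aggressive this compression is, the arithmetic can be worked through directly. The 320×240 frame size below is an assumed example resolution, and the default of 4 latent channels reflects Stable Diffusion’s standard auto-encoder.

```python
def latent_shape(width: int, height: int, patch: int = 8, channels: int = 4):
    """Size of the latent grid when each patch x patch block becomes `channels` values."""
    if width % patch or height % patch:
        raise ValueError("frame dimensions must be multiples of the patch size")
    return (width // patch, height // patch, channels)


def compression_ratio(width: int, height: int, patch: int = 8, channels: int = 4) -> float:
    """How many RGB values are squeezed into each latent value."""
    rgb_values = width * height * 3  # three colour values per pixel
    w, h, c = latent_shape(width, height, patch, channels)
    return rgb_values / (w * h * c)


shape = latent_shape(320, 240)       # (40, 30, 4)
ratio = compression_ratio(320, 240)  # 48.0
```

Each latent value therefore has to summarise 48 colour values, which helps explain why small, sharp details such as status-bar digits are the first to suffer.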

To improve image quality, only the decoder of the auto-encoder is fine-tuned; the rest of the model is left untouched.

The decoder is fine-tuned with an MSE (Mean Squared Error) loss, which measures the pixel-by-pixel difference between generated images and real images in order to sharpen fine detail.
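As a reminder of what an MSE loss measures, here is a minimal version operating on flat lists of pixel values. The sample `target` and `blurry` values are made up for the example; a real training setup would compute this over full image tensors.

```python
from typing import List


def mse(generated: List[float], target: List[float]) -> float:
    """Mean of the squared pixel-by-pixel differences between two images."""
    if len(generated) != len(target):
        raise ValueError("images must have the same number of pixels")
    return sum((g - t) ** 2 for g, t in zip(generated, target)) / len(target)


target = [0.0, 0.5, 1.0, 1.0]   # pixels of the real frame (illustrative values)
blurry = [0.1, 0.4, 0.8, 1.0]   # pixels produced by the decoder (illustrative values)
loss = mse(blurry, target)      # approximately 0.015
```

The loss is zero only when the two images match exactly, so minimising it pushes the decoder towards reproducing fine detail faithfully.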

The making of Doom with GameNGen

Doom, a case study

Doom, the legendary video game released in 1993, was chosen by Google to demonstrate GameNGen’s capabilities.

Why Doom, when there are so many other games?

Doom is not only a classic adored by gamers the world over; it has also become a symbol of hacker culture, famous for having been made to run on almost any device, from calculators to pregnancy tests.

Its popularity and relative complexity made it an ideal candidate for testing and demonstrating the power of an AI-generated game engine.

The choice of Doom is also explained by its technical characteristics. At the time of its release, Doom revolutionized the video game industry by introducing textured 3D graphics and an unprecedented immersive experience for gamers.

Reproducing Doom using GameNGen therefore represents a real technical challenge: not only is it a matter of recreating the visual and interactive elements of the game, but also of capturing the very essence of what made Doom such an icon.

Performance and limitations

The engine is capable of simulating Doom at over 20 frames per second, which is an impressive feat for a technology that generates every frame in real time from scratch.

This means that every monster, every gunshot, and every set element is created by the AI in real time, without recourse to pre-rendered graphics or fixed code.

However, this technical feat comes with certain limitations.

First of all, the AI’s memory is limited to around three seconds of game history, which can sometimes lead to inconsistencies or errors in image generation.

For example, if a player moves quickly through several rooms or interacts unexpectedly with the environment, the engine may start to “hallucinate” or generate elements that don’t exactly match the reality of the game.

These errors are comparable to the “hallucinations” of language models like GPT, where the AI generates text that seems plausible but is in fact incorrect.
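The effect of such a short memory can be illustrated with a fixed-size buffer. This sketch assumes a context of exactly 3 seconds at 20 frames per second (60 frames), which is only an approximation of the figures quoted above; frames are represented by their index for simplicity.

```python
from collections import deque

FPS = 20                                # frame rate quoted for the Doom demo
MEMORY_SECONDS = 3                      # approximate history length quoted above
CONTEXT_FRAMES = FPS * MEMORY_SECONDS   # 60 frames in this approximation

# A bounded buffer: appending beyond maxlen silently drops the oldest frame.
history = deque(maxlen=CONTEXT_FRAMES)

for frame_id in range(200):             # ten seconds of play at 20 FPS
    history.append(frame_id)

oldest_remembered = history[0]          # 140: frames 0 to 139 are gone
```

Anything the player saw before the oldest remembered frame, such as a room left behind several seconds ago, is simply no longer available to the model, which is why it may reinvent it incorrectly.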

Moreover, although GameNGen works remarkably well with a relatively simple game like Doom, it remains to be seen how the engine will perform with more modern and complex games.

Today’s games, with their open worlds, ultra-realistic graphics, and sophisticated game mechanics, pose far greater challenges.

Google researchers recognize these challenges and are already working on improvements to enable GameNGen to handle more advanced games.

Despite these limitations, the results obtained with Doom are very promising.

Tests show that the AI-generated version of Doom is visually and functionally very close to the original.

When human testers were asked to distinguish between clips from the original game and those generated by GameNGen, they identified the AI clips only slightly better than chance, a testament to the realism of the AI-generated environments.

GameNGen’s impact on the video game industry

Towards accelerated video game production

One of the most revolutionary implications of GameNGen is the possibility of dramatically speeding up the video game development process.

Traditionally, creating a video game requires thousands of hours of work, involving teams of developers, designers, artists and testers.

With GameNGen, developers could simply define general rules or intentions, and let the AI do the rest.

For example, a designer could describe verbally or with sketches the appearance of a game world, and GameNGen could generate it instantly, with dynamic interactions and visual consistency.

This automation of the creation process could significantly reduce development costs.

Smaller game studios, which often don’t have the resources to compete with the industry’s big players, would thus be able to produce quality games in much less time.

In addition, it would open the door to a new generation of creators who, while talented in design or storytelling, lack the programming skills needed to realize their ideas.

GameNGen would make game creation accessible to a wider range of people, democratizing the industry.

Future applications beyond video games

Although GameNGen is currently focused on video game creation, the potential applications of this technology go far beyond mere entertainment.

The real-time generation capabilities of interactive environments can be exploited in many other fields, such as training, education, simulation or even therapy.

  • In training, for example, GameNGen could be used to create realistic simulations to train professionals in safe, controlled environments.
  • Imagine surgeons training in AI-generated simulations, where each scenario is unique and adapts according to the decisions made by the practitioner.
  • Similarly, in education, teachers could use interactive simulations to illustrate complex concepts, immersing students in dynamic virtual worlds that react to their actions.
  • The entertainment industry could also benefit from this technology, with interactive movies or TV shows where the viewer controls the action in real time.

GameNGen could therefore not only redefine the future of video games, but also open up new possibilities for the creation of interactive content in many other sectors.

Challenges and future prospects

Technical challenges

Although GameNGen represents a major advance in the way video games can be created and generated, there are still significant technical challenges to overcome before the technology can be fully exploited in large-scale productions.

Among these challenges, the management of AI memory and the prevention of generation errors, or “hallucinations”, occupy a central place.

One of GameNGen’s major limitations is that it can remember the state of the game only over a short period of time.

As mentioned above, GameNGen’s AI can only retain memory for a few seconds of gameplay, which can lead to inconsistencies or errors when the player interacts with the environment in complex or unexpected ways.

This problem is particularly noticeable when the player moves quickly or changes direction abruptly, which can confuse the AI and cause it to generate incorrect scenes.

To mitigate this problem, Google researchers have introduced a technique called “noise augmentation”, which involves adding controlled Gaussian noise to the past frames the model sees during training.

This approach allows the AI to correct its errors and remain true to the reality of the game. However, this solution doesn’t completely solve the problem of long-term memory, and GameNGen developers will need to find ways to increase the AI’s ability to retain a longer history of actions and game states.

Another issue is the fidelity of simulations compared to original games. Although GameNGen has shown that it can faithfully reproduce games such as Doom, minor visual errors persist, particularly in details such as numbers, text, or small graphic elements.

Finally, there’s the question of computing power.

Generating real-time environments with GameNGen requires considerable hardware resources.

Currently, the engine runs on tensor processing units (TPUs), which are not accessible to the general public.

This means that for now, this technology is primarily reserved for research purposes, and it will take advances in hardware before AI-generated games can become accessible to all gamers.

Prospects for developers and players

As AI technology continues to advance, it’s likely that GameNGen and similar engines will become increasingly common tools in video game development.

This could transform the role of developers from coders to designers of worlds and experiences, collaborating with AI to create unique games.

For gamers, this technology opens the door to a new era of personalized gaming experiences.

Rather than playing games created by others, they could soon be playing games generated specifically for them, based on their preferences, actions and decisions.

Each game session would be different, offering endless possibilities for replay and discovery of new elements in constantly evolving worlds.

Moreover, engines like GameNGen could offer gamers tools to create their own games, without any programming knowledge.

In the future, generating a video game could be as simple as writing a text description or drawing a few basic sketches.

This accessibility would transform game creation into a mass-market hobby, allowing anyone to turn their video game ideas into reality, even without technical skills.

Conclusion

GameNGen is an innovation that could well profoundly transform the video game industry.

Although the technology is still in development and has certain limitations, its potential applications, both in the gaming world and in other sectors, are wide-ranging.

As GameNGen continues to improve, it’s exciting to imagine a future where every player could enjoy a unique gaming experience, tailor-made by an AI.

What are your thoughts on GameNGen and the future of AI-generated video games? Share your thoughts in the comments.