Reinforcement learning improves game testing, AI team finds



As game worlds grow more vast and complex, making sure they are playable and bug-free is becoming increasingly difficult for developers. And gaming companies are looking for new tools, including artificial intelligence, to help overcome the mounting challenge of testing their products.

A new paper by a group of AI researchers at Electronic Arts shows that deep reinforcement learning agents can help test games and make sure they are balanced and solvable.

The technique, presented in a paper titled "Adversarial Reinforcement Learning for Procedural Content Generation," is a novel approach that addresses some of the shortcomings of previous AI methods for testing games.

Testing large game environments


“Today’s big titles can have more than 1,000 developers and often ship cross-platform on PlayStation, Xbox, mobile, etc.,” Linus Gisslén, senior machine learning research engineer at EA and lead author of the paper, told TechTalks. “Also, with the latest trend of open-world games and live service we see that a lot of content has to be procedurally generated at a scale that we previously have not seen in games. All this introduces a lot of ‘moving parts’ which all can create bugs in our games.”

Developers currently have two main tools at their disposal to test their games: scripted bots and human play-testers. Human play-testers are very good at finding bugs. But they can be slowed down immensely when dealing with vast environments. They can also get bored and distracted, especially in a very big game world. Scripted bots, on the other hand, are fast and scalable. But they can't match the complexity of human testers, and they perform poorly in large environments such as open-world games, where mindless exploration isn't necessarily a successful strategy.

“Our goal is to use reinforcement learning (RL) as a method to merge the advantages of humans (self-learning, adaptive, and curious) with scripted bots (fast, cheap and scalable),” Gisslén said.

Reinforcement learning is a branch of machine learning in which an AI agent tries to take actions that maximize its rewards in its environment. For example, in a game, the RL agent starts by taking random actions. Based on the rewards or punishments it receives from the environment (staying alive, losing lives or health, earning points, finishing a level, etc.), it develops an action policy that results in the best outcomes.
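
To make this concrete, the loop below is a minimal, self-contained sketch of that trial-and-error process, using tabular Q-learning on a toy corridor "level." The environment, rewards, and hyperparameters are illustrative inventions, not anything from EA's system.

```python
import random

# A minimal sketch of the reinforcement learning loop described above.
# The corridor environment, rewards, and hyperparameters are toys.

GOAL = 9           # the agent must walk from cell 0 to cell 9
ACTIONS = [-1, 1]  # step left or right along a 10-cell corridor

q_table = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
alpha, gamma, epsilon = 0.1, 0.95, 0.1

for episode in range(500):
    state = 0
    while state != GOAL:
        # Explore randomly at first; exploit the learned policy over time.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q_table[(state, a)])
        next_state = min(max(state + action, 0), GOAL)
        # Reward for finishing the level, a small penalty otherwise.
        reward = 1.0 if next_state == GOAL else -0.01
        best_next = max(q_table[(next_state, a)] for a in ACTIONS)
        q_table[(state, action)] += alpha * (
            reward + gamma * best_next - q_table[(state, action)]
        )
        state = next_state

# After training, the greedy policy walks straight toward the goal.
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(GOAL)]
print(policy)  # expected: [1, 1, 1, 1, 1, 1, 1, 1, 1]
```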

Testing game content with adversarial reinforcement learning

In the past decade, AI research labs have used reinforcement learning to master complicated games. More recently, gaming companies have also become interested in using reinforcement learning and other machine learning techniques in the game development lifecycle.

For example, in game-testing, an RL agent can be trained to learn a game by letting it play on existing content (maps, levels, etc.). Once the agent masters the game, it can help find bugs in new maps. The problem with this approach is that the RL system often ends up overfitting on the maps it has seen during training. This means that it will become very good at exploring those maps but terrible at testing new ones.

The technique proposed by the EA researchers overcomes these limits with “adversarial reinforcement learning,” a technique inspired by generative adversarial networks (GAN), a type of deep learning architecture that pits two neural networks against each other to create and detect synthetic data.

In adversarial reinforcement learning, two RL agents compete and collaborate to create and test game content. The first agent, the Generator, uses procedural content generation (PCG), a technique that automatically generates maps and other game elements. The second agent, the Solver, tries to finish the levels the Generator creates.

There is a symbiosis between the two agents. The Solver is rewarded by taking actions that help it pass the generated levels. The Generator, on the other hand, is rewarded for creating levels that are challenging but not impossible to finish for the Solver. The feedback that the two agents provide each other enables them to become better at their respective tasks as the training progresses.

The generation of levels takes place in a step-by-step fashion. For example, if the adversarial reinforcement learning system is being used for a platform game, the Generator creates one game block and moves on to the next one after the Solver manages to reach it.
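
The interaction between the two agents can be sketched as a simple loop. The stub classes and reward values below are hypothetical placeholders; in the paper, both agents are deep RL policies trained on these reward signals rather than hand-written rules.

```python
import random

# A simplified sketch of the Generator/Solver loop described above.
# `Generator`, `Solver`, and the reward values are stand-ins, not EA's code.

class Generator:
    """Proposes the next block of the level, one step at a time."""
    def propose_block(self, level):
        # Placeholder policy: a random block height near the previous one.
        last = level[-1] if level else 0
        return last + random.choice([-1, 0, 1])

class Solver:
    """Tries to traverse the level produced so far."""
    def attempt(self, level):
        # Placeholder: fails if any jump between blocks is too high.
        return all(abs(b - a) <= 1 for a, b in zip(level, level[1:]))

generator, solver = Generator(), Solver()
level = [0]

for step in range(20):
    level.append(generator.propose_block(level))
    solved = solver.attempt(level)

    # The Solver is rewarded for passing the level; the Generator is
    # rewarded for levels that are challenging but still solvable
    # (a difficulty term would be added in the full method).
    solver_reward = 1.0 if solved else -1.0
    generator_reward = 1.0 if solved else -1.0

    # In the real system, both rewards feed gradient updates to the
    # agents' policies; here we simply revert an unsolvable block.
    if not solved:
        level.pop()
```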

“Using an adversarial RL agent is a vetted method in other fields, and is often needed to enable the agent to reach its full potential,” Gisslén said. “For example, DeepMind used a version of this when they let their Go agent play against different versions of itself in order to achieve super-human results. We use it as a tool for challenging the RL agent in training to become more general, meaning that it will be more robust to changes that happen in the environment, which is often the case in game-play testing where an environment can change on a daily basis.”

Gradually, the Generator will learn to create a variety of solvable environments, and the Solver will become more versatile in testing different environments.

A robust game-testing reinforcement learning system can be very useful. For example, many games have tools that allow players to create their own levels and environments. A Solver agent that has been trained on a variety of PCG-generated levels will be much more efficient at testing the playability of user-generated content than traditional bots.

One of the interesting details in the adversarial reinforcement learning paper is the introduction of “auxiliary inputs.” This is a side-channel that affects the rewards of the Generator and enables the game developers to control its learned behavior. In the paper, the researchers show how the auxiliary input can be used to control the difficulty of the levels generated by the AI system.
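
One plausible way to picture this mechanism is as an extra term in the Generator's reward that compares the requested difficulty against how hard the generated level actually turned out to be. The function below is a hypothetical illustration of that idea; the paper's actual reward terms differ.

```python
def generator_reward(solved, difficulty_estimate, aux_difficulty):
    """Hypothetical reward shaping for the Generator.

    `aux_difficulty` is the auxiliary input set by the developer
    (0 = easy, 1 = hard); `difficulty_estimate` measures how hard the
    generated level turned out to be (e.g., the Solver's failure rate).
    """
    if not solved:
        return -1.0  # unsolvable levels are always penalized
    # Reward levels whose measured difficulty matches the requested one.
    return 1.0 - abs(difficulty_estimate - aux_difficulty)

# A designer asking for a hard level rewards the Generator more when the
# Solver barely succeeds than when it breezes through.
print(generator_reward(solved=True, difficulty_estimate=0.9, aux_difficulty=1.0))  # 0.9
print(generator_reward(solved=True, difficulty_estimate=0.2, aux_difficulty=1.0))  # 0.2
```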

EA’s AI research team applied the technique to a platform game and a racing game. In the platform game, the Generator gradually places blocks from the starting point to the goal. The Solver is the player and must jump from block to block until it reaches the goal. In the racing game, the Generator places the segments of the track, and the Solver drives the car to the finish line.

The researchers show that by using the adversarial reinforcement learning system and tuning the auxiliary input, they were able to control and adjust the difficulty of the generated game environments.

Their experiments also show that a Solver trained with adversarial reinforcement learning is much more robust than traditional game-testing bots or RL agents that have been trained on fixed maps.

Applying adversarial reinforcement learning to real games

The paper does not provide a detailed explanation of the architecture used for the reinforcement learning system. The little information it does include shows that the Generator and Solver use simple, two-layer neural networks with 512 units, which should not be very costly to train. However, the example games in the paper are very simple, and the architecture of the reinforcement learning system would need to vary with the complexity of the environment and action space of the target game.
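
For reference, a network of the stated size is straightforward to express in a framework like PyTorch. The input and output dimensions and the activation function below are assumptions, since the paper does not specify them.

```python
import torch
import torch.nn as nn

# A sketch of a policy network matching the paper's stated size: two
# hidden layers of 512 units. Observation size, action count, and the
# ReLU activation are assumptions for illustration.
class PolicyNet(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
            nn.Linear(512, n_actions),  # action logits
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

# e.g., a Solver observing a 64-feature state and choosing among 6 actions
policy = PolicyNet(obs_dim=64, n_actions=6)
logits = policy(torch.randn(1, 64))
action = torch.distributions.Categorical(logits=logits).sample()
print(action.item())
```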

“We tend to take a pragmatic approach and try to keep the training cost at a minimum as this has to be a viable option when it comes to ROI for our QV (Quality Verification) teams,” Gisslén said. “We try to keep the skill range of each trained agent to just include one skill/objective (e.g., navigation or target selection) as having multiple skills/objectives scales very poorly, causing the models to be very expensive to train.”

The work is still in the research stage, Konrad Tollmar, research director at EA and co-author of the paper, told TechTalks. “But we’re having collaborations with various game studios across EA to explore if this is a viable approach for their needs. Overall, I’m truly optimistic that ML is a technique that will be a standard tool in any QV team in the future in some shape or form,” he said.

Adversarial reinforcement learning agents can help human testers focus on evaluating parts of the game that can’t be tested with automated systems, the researchers believe.

“Our vision is that we can unlock the potential of human playtesters by moving from mundane and repetitive tasks, like finding bugs where the players can get stuck or fall through the ground, to more interesting use-cases like testing game-balance, meta-game, and ‘funness,’” Gisslén said. “These are things that we don’t see RL agents doing in the near future but are immensely important to games and game production, so we don’t want to spend human resources doing basic testing.”

The RL system can become an important part of creating game content, as it will enable designers to evaluate the playability of their environments as they create them. In a video that accompanies their paper, the researchers show how a level designer can get help from the RL agent in real-time while placing blocks for a platform game.

Eventually, this and other AI systems can become an important part of content and asset creation, Tollmar believes.

“The tech is still new and we still have a lot of work to be done in production pipeline, game engine, in-house expertise, etc. before this can fully take off,” he said. “However, with the current research, EA will be ready when AI/ML becomes a mainstream technology that is used across the gaming industry.”

As research in the field continues to advance, AI could eventually play a more important role in other parts of game development and the gaming experience.

“I think as the technology matures and acceptance and expertise grows within gaming companies this will be not only something that is used within testing but also as game-AI whether it is collaborative, opponent, or NPC game-AI,” Tollmar said. “A fully trained testing agent can of course also be imagined being a character in a shipped game that you can play against or collaborate with.”

Ben Dickson is a software engineer and the founder of TechTalks. He writes about technology, business, and politics.

This story originally appeared on Bdtechtalks.com. Copyright 2021
