October 5, 2023

Researchers train AI with reinforcement learning to defeat champion Street Fighter players

by Singapore University of Technology and Design

Researchers from the Singapore University of Technology and Design (SUTD) have successfully applied reinforcement learning to a video game problem. The research team created new software for designing complex movements, based on an approach that has proven effective in board games such as chess and Go. In a single test, the movements produced by the new approach appeared to be superior to those of top human players.

These findings could influence robotics and automation, ushering in a new era of movement design. The team's article in Advanced Intelligent Systems is titled "A Phase-Change Memristive Reinforcement Learning for Rapidly Outperforming Champion Street Fighter Players."

"Our findings demonstrate that reinforcement learning can do more than just master simple board games. The program excelled in creating more complex movements when trained to address long-standing challenges in movement science," said principal investigator Desmond Loke, Associate Professor, SUTD.

"If this method is applied to the right research problems," he said, "it could accelerate progress in a variety of scientific fields."

The study marks a watershed moment in the use of artificial intelligence to advance movement science studies. The possible applications are numerous, ranging from the development of more autonomous automobiles to new collaborative robots and aerial drones.

Reinforcement learning is a kind of machine learning in which a computer program learns to make decisions by experimenting with various actions and getting feedback. For example, an algorithm can learn to play chess by testing millions of possible moves that result in success or defeat on the board. The approach is intended to help programs learn from their experiences and improve their decision-making over time.
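The trial-and-feedback loop described above can be illustrated with a minimal sketch. It uses tabular Q-learning, a standard textbook reinforcement learning method, on a hypothetical five-cell corridor. It is not the SUTD team's system: the environment, the function names `train_q_table` and `greedy_policy`, and all parameter values are assumptions chosen only for illustration.

```python
import random

def train_q_table(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor (illustration only).

    States are 0..n_states-1. The agent starts at state 0 and receives a
    reward of 1.0 only when it reaches the last state.
    Actions: 0 = step left, 1 = step right.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the length of each episode
            # Experiment: act randomly with probability epsilon, or when
            # both actions currently look equally good.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Feedback: nudge the estimate toward the observed reward plus
            # the discounted value of the best action in the next state.
            future = 0.0 if next_state == goal else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = next_state
            if state == goal:
                break
    return q

def greedy_policy(q):
    """Best-known action for every non-goal state."""
    return [0 if left > right else 1 for left, right in q[:-1]]

if __name__ == "__main__":
    table = train_q_table()
    print(greedy_policy(table))
```

In this toy setting the program is never told that moving right is correct. It only receives a reward at the goal, and the value estimates for earlier states improve as that feedback propagates backward over repeated episodes.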

To create a reinforcement learning program for movement design, the research team provided the computer with millions of initial motions. The program then made several attempts to improve each move through random changes directed toward a specific objective. The computer tweaked character movements or adjusted its strategy until it learned how to make moves that overcome the built-in AI.

Human-level performance in Street Fighter game using phase-change memory reinforcement learning. Credit: SUTD

Associate Prof Loke added, "Our approach is unique because we use reinforcement learning to solve the problem of creating movements that outperform those of top human players. This was simply not possible using prior approaches, and it has the potential to transform the types of moves we can create."

As part of their research, the scientists created motions to compete with various built-in AIs. They confirmed that the moves could overcome different built-in AI opponents.

"Not only is this approach effective, but it is also energy efficient," stated Associate Prof Loke. The phase-change memory-based system, for example, was able to make motions with a hardware energy consumption of about 26 fJ, which is 141 times less than that of existing GPU systems. "Its potential for making ultra-low-hardware-energy movements has yet to be fully explored," he added.
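As a rough check on the scale of those figures, the short calculation below derives the GPU energy implied by the two numbers quoted above. It assumes that "141 times less" means a simple ratio between the two energies. The article does not say what operation the 26 fJ applies to, so the result is only an order-of-magnitude illustration, and the function name is hypothetical.

```python
FEMTO = 1e-15  # joules per femtojoule
PICO = 1e-12   # joules per picojoule

def implied_gpu_energy_joules(pcm_energy_fj=26.0, reduction_factor=141.0):
    """GPU energy implied by the quoted figures, assuming a plain ratio."""
    return pcm_energy_fj * FEMTO * reduction_factor

if __name__ == "__main__":
    gpu_joules = implied_gpu_energy_joules()
    print(f"Implied GPU energy: {gpu_joules / PICO:.2f} pJ")
```

Under that assumption, the comparison is between roughly 26 fJ for the phase-change memory hardware and a few picojoules for the GPU baseline.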

The team focused on creating new motions capable of defeating top human players in a short amount of time. This required the use of decay-based algorithms to create the motions.

Algorithm testing revealed that the new AI-designed motions were effective. As a measure of how successful the design system had become, the researchers noted numerous good qualities, such as reasonable game etiquette, management of inaccurate information, the ability to attain specific game states, and the short times needed to defeat opponents.

In other words, the program exhibited exceptional physical and mental qualities. This is referred to as effective movement design. For example, motions were more successful at overcoming opponents because the decay-based technique used for training neural networks takes fewer training steps than conventional decay methods.

The researchers envision a future in which this strategy will allow them and others to build movements, skills, and other actions that were not previously possible.

"The more effective the technology becomes, the more potential applications it opens up, including the continued progression of competitive tasks that computers can facilitate for the best players, such as in Poker, Starcraft, and Jeopardy," Associate Prof Loke said. "We may also see high-level realistic competition for training professional players, discovering new tactics, and making video games more interesting."

SUTD researchers Shao-Xiang Go and Yu Jiang also contributed to the study.

More information: Shao-Xiang Go et al, A Phase‐Change Memristive Reinforcement Learning for Rapidly Outperforming Champion Street‐Fighter Players, Advanced Intelligent Systems (2023). DOI: 10.1002/aisy.202300335

Provided by Singapore University of Technology and Design

Citation: Researchers train AI with reinforcement learning to defeat champion Street Fighter players (2023, October 5) retrieved 29 June 2024 from https://techxplore.com/news/2023-10-ai-defeat-champion-street-fighter.html

