April 28, 2021

New algorithm makes it easier for computers to solve decision making problems

by Chinese Association of Automation

Computer scientists often encounter problems relevant to real-life scenarios. For instance, "multiagent problems," a category characterized by multi-stage decision-making by multiple decision makers or "agents," has relevant applications in search-and-rescue missions, firefighting, and emergency response.

Multiagent problems are often solved using a machine learning technique known as reinforcement learning (RL), which concerns itself with how intelligent agents make decisions in an environment unfamiliar to them. An approach usually adopted in such an endeavor is policy iteration (PI), which starts off with a 'base policy' and then improves on it to generate a 'rollout policy' (with the process of generation called a rollout). Rollout is simple, reliable, and well-suited for an on-line, model-free implementation.

There is, however, a serious issue. "In a standard rollout algorithm, the amount of total computation grows exponentially with the number of agents. This can make the computations prohibitively expensive even for a modest number of agents," explains Prof. Dimitri Bertsekas from Massachusetts Institute of Technology and Arizona State University, USA, who studies large-scale computation and optimization of communication and control.

In essence, PI is simply a repeated application of rollout, in which the rollout policy at each iteration becomes the base policy for the next iteration. Usually, in a standard multiagent rollout policy, all agents are allowed to influence the rollout algorithm at once ("all-agents-at-once" policy). Now, in a new study published in the IEEE/CAA Journal of Automatica Sinica, Prof. Bertsekas has come up with an approach that might be a game changer.
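The view of PI as repeated rollout can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: a single agent on a five-cell line pays a cost of 1 per move until it reaches the goal cell, the base policy (always move left) is deliberately poor, and the names `policy_cost`, `rollout`, and `policy_iteration` are hypothetical.

```python
# Toy problem (assumed for illustration): walk along cells 0..4, goal is cell 4.
GOAL = 4
ACTIONS = (-1, 1)          # move left or move right
STATES = range(GOAL + 1)

def next_state(s, a):
    """Deterministic move, clamped to the ends of the line."""
    return min(max(s + a, 0), GOAL)

def policy_cost(policy, s, horizon=20):
    """Cost of following `policy` from state s, truncated after `horizon` moves."""
    total = 0
    for _ in range(horizon):
        if s == GOAL:
            break
        s = next_state(s, policy[s])
        total += 1
    return total

def rollout(base):
    """One rollout: one-step lookahead, with the base policy used afterwards."""
    improved = dict(base)
    for s in STATES:
        if s != GOAL:
            improved[s] = min(
                ACTIONS,
                key=lambda a: 1 + policy_cost(base, next_state(s, a)),
            )
    return improved

def policy_iteration(base, max_iters=50):
    """Repeat rollout; each rollout policy becomes the next base policy."""
    for _ in range(max_iters):
        improved = rollout(base)
        if improved == base:   # no further improvement
            return base
        base = improved
    return base

start = {s: -1 for s in STATES}   # poor base policy: always move left
final = policy_iteration(start)
print(final, policy_cost(final, 0))
```

In this toy setting, each pass of `rollout` can only match or lower the cost of the policy it starts from, which is why repeating it eventually settles on a policy that heads straight for the goal.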

In his paper, Prof. Bertsekas focused on applying PI to problems with a multiple-component control, each component selected by a different agent. He assumed that all agents had perfect state information and shared it among themselves. He then reformulated the problem by trading off control space complexity with state space complexity. Additionally, instead of an all-agents-at-once policy, he adopted an agent-by-agent policy wherein only one agent was allowed to execute a rollout algorithm at a time, with coordinating information provided by the other agents.
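The contrast between the two schemes can be sketched on a toy problem. The setup is an assumption made for illustration, not the formulation used in the paper: agents move along an integer line and pay a per-step cost equal to the number of targets not yet reached, and the base policy sends every agent toward its nearest target. The sketch also assumes one reading of "coordinating information": when an agent optimizes its own move, earlier agents' choices are fixed and later agents are filled in by the base policy. All function names are hypothetical.

```python
from itertools import product

MOVES = (-1, 0, 1)  # each agent may step left, stay, or step right

def step(positions, targets, moves):
    """Apply one joint move; the stage cost is the number of targets left."""
    new_pos = tuple(p + d for p, d in zip(positions, moves))
    remaining = frozenset(t for t in targets if t not in new_pos)
    return new_pos, remaining, len(remaining)

def base_move(pos, targets):
    """Base policy for one agent: head toward the nearest remaining target."""
    if not targets:
        return 0
    nearest = min(targets, key=lambda t: (abs(t - pos), t))
    return (nearest > pos) - (nearest < pos)

def base_policy(positions, targets):
    return tuple(base_move(p, targets) for p in positions)

def q_value(positions, targets, moves, horizon=10):
    """Cost of applying `moves` now, then the base policy for `horizon` steps."""
    positions, targets, total = step(positions, targets, moves)
    for _ in range(horizon):
        if not targets:
            break
        positions, targets, cost = step(
            positions, targets, base_policy(positions, targets))
        total += cost
    return total

def rollout_all_at_once(positions, targets):
    """Compare every joint move: len(MOVES) ** num_agents candidates."""
    return min(product(MOVES, repeat=len(positions)),
               key=lambda u: q_value(positions, targets, u))

def rollout_agent_by_agent(positions, targets):
    """Optimize one agent at a time: len(MOVES) * num_agents candidates."""
    moves = list(base_policy(positions, targets))  # later agents: base policy
    for i in range(len(positions)):
        def cost_with(d, i=i):
            trial = moves.copy()
            trial[i] = d
            return q_value(positions, targets, tuple(trial))
        moves[i] = min(MOVES, key=cost_with)       # earlier choices stay fixed
    return tuple(moves)

positions, targets = (0, 0), frozenset({-3, 4})
print(base_policy(positions, targets))
print(rollout_agent_by_agent(positions, targets))
print(rollout_all_at_once(positions, targets))
```

Because each agent's own base move is always among the candidates it compares, the agent-by-agent choice in this sketch can never score worse than the base policy under `q_value`.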

The result was impressive. Instead of computation that grows exponentially with the number of agents, Prof. Bertsekas found only linear growth, leading to a dramatic reduction in computational cost. Moreover, the simplification did not sacrifice the quality of the improved policy: the agent-by-agent scheme performed on par with the standard rollout algorithm.
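The gap between exponential and linear growth is easy to see with a small count. The sketch below assumes, purely for illustration, that every agent picks from the same number of choices and that each candidate costs one simulation of the base policy; the function names are hypothetical.

```python
def joint_candidates(num_agents: int, choices_per_agent: int) -> int:
    """Candidates compared per step when all agents are optimized at once."""
    return choices_per_agent ** num_agents

def sequential_candidates(num_agents: int, choices_per_agent: int) -> int:
    """Candidates compared per step when agents are optimized one at a time."""
    return choices_per_agent * num_agents

# Assumed example: 4 choices per agent, varying the number of agents.
for m in (2, 5, 10, 20):
    print(m, joint_candidates(m, 4), sequential_candidates(m, 4))
```

With 10 agents and 4 choices each, the joint scheme compares 1,048,576 candidates per step, while the one-at-a-time scheme compares 40.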

Prof. Bertsekas then explored exact and approximate PI algorithms using the new version of agent-by-agent policy improvement and repeated application of rollout. For highly complex problems, he explored the use of neural networks to encode the successive rollout policies, and to precompute signaling policies that coordinate the parallel computations of different agents.

Overall, Prof. Bertsekas is optimistic about his findings and the future prospects of his approach. "The idea of agent-by-agent rollout can be applied to challenging multidimensional control problems, as well as deterministic discrete/combinatorial optimization problems, involving constraints that couple the controls of different stages," he observes. He has written two books on RL; one of them, "Rollout, Policy Iteration, and Distributed Reinforcement Learning," soon to be published by Tsinghua Press, China, deals with the subject of his study in detail.

The new approach to multiagent systems might very well revolutionize how complex sequential decision problems are solved.

More information: Dimitri Bertsekas. Multiagent Reinforcement Learning: Rollout and Policy Iteration, IEEE/CAA Journal of Automatica Sinica (2021). DOI: 10.1109/JAS.2021.1003814

Provided by Chinese Association of Automation

Citation: New algorithm makes it easier for computers to solve decision making problems (2021, April 28) retrieved 23 April 2024 from https://techxplore.com/news/2021-04-algorithm-easier-decision-problems.html

