December 6, 2021 feature

SEIHAI: The hierarchical AI that won the NeurIPS-2020 MineRL competition

by Ingrid Fadelli, Tech Xplore

In recent years, computational tools based on reinforcement learning have achieved remarkable results in numerous tasks, including image classification and robotic object manipulation. Meanwhile, computer scientists have also been training reinforcement learning models to play specific games, including video games.

To challenge research teams working on reinforcement learning techniques, the Neural Information Processing Systems (NeurIPS) annual conference introduced the MineRL competition, a contest in which different algorithms are tested on the same task in Minecraft, the renowned computer game developed by Mojang Studios. Specifically, contestants must create algorithms that learn to obtain a diamond in the game from raw pixels.

The algorithms can be trained for no more than four days, on at most 8,000,000 samples from the MineRL simulator, using a single GPU machine. In addition to these simulator samples, participants are provided with a large collection of human demonstrations (i.e., video frames in which the task is solved by human players).

A team of researchers at Huawei Noah's Ark Lab, Tianjin University and Tsinghua University won the NeurIPS-2020 MineRL competition. Using a sample-efficient hierarchical artificial intelligence (AI) tool called SEIHAI, the researchers were able to outperform all other algorithms participating in the contest.

"We present SEIHAI, a sample-efficient hierarchical AI that fully takes advantage of the human demonstrations and the task structure," Hangyu Mao and his colleagues wrote in a paper outlining their AI, which was pre-published on arXiv. "Specifically, we split the task into several sequentially dependent subtasks and train a suitable agent for each subtask using reinforcement learning and imitation learning."

To obtain a diamond in Minecraft, players need to follow a series of steps. Sequentially, they need to chop a tree to collect logs, use the logs to craft a wooden pickaxe, and use that pickaxe to mine cobblestone. The cobblestone is then crafted into a stone pickaxe and a furnace, which are used to mine and smelt iron ore. The resulting iron ingots are crafted into an iron pickaxe, the tool needed to mine diamond. Diamond is rare in the game, which further complicates the task for MineRL participants.
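The sequential dependency between these steps can be pictured as an ordered chain of milestones. The snippet below is a minimal Python sketch of that idea. The milestone names follow the usual Minecraft crafting progression, including intermediate tools such as stone and iron pickaxes, and both the names and the helper `next_milestone` are assumptions made for illustration, not code from the SEIHAI paper.

```python
# Illustrative milestone chain for the diamond task. The item names and the
# simplified ordering are assumptions for this sketch, not the authors' code.
MILESTONES = [
    "log",
    "planks",
    "crafting_table",
    "wooden_pickaxe",
    "cobblestone",
    "stone_pickaxe",
    "furnace",
    "iron_ore",
    "iron_ingot",
    "iron_pickaxe",
    "diamond",
]


def next_milestone(reached):
    """Return the first milestone not yet reached, or None when all are done.

    `reached` is the set of milestones obtained at any point so far. A set of
    past achievements is used instead of the current inventory because items
    such as logs are consumed when crafting later items.
    """
    for item in MILESTONES:
        if item not in reached:
            return item
    return None


if __name__ == "__main__":
    print(next_milestone(set()))
    print(next_milestone({"log", "planks"}))
```

Because each milestone depends on the ones before it, an agent that has not yet collected logs cannot make progress on any later step, which is why the reward signal in this task is so sparse.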

To tackle the task most effectively, Mao and his colleagues divided it into a series of subtasks, each of which required different skills and capabilities. They then trained different agents to tackle each of the subtasks individually, using reinforcement learning or imitation learning, depending on which one best suited the problem they were trying to solve.

To decide which agent should act at any given moment, the researchers used a scheduler, a component that selects an agent based on the characteristics of the subtask that currently needs to be completed. The resulting hierarchical model outperformed all the other algorithms and models participating in the MineRL 2020 contest.
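The scheduler idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three agents are hard-coded placeholders standing in for policies that would be trained with reinforcement or imitation learning, and the inventory keys, thresholds and action labels are hypothetical rather than taken from the SEIHAI paper.

```python
from typing import Callable, Dict

# Hypothetical observation: item name -> count currently in the inventory.
Observation = Dict[str, int]
# An agent maps an observation to an action label (a stand-in for a policy).
Agent = Callable[[Observation], str]


def chop_tree_agent(obs: Observation) -> str:
    # Placeholder for a policy trained to collect logs.
    return "attack"


def craft_agent(obs: Observation) -> str:
    # Placeholder for a policy that turns logs into a wooden pickaxe.
    return "craft"


def mine_agent(obs: Observation) -> str:
    # Placeholder for a policy that digs for stone and ore.
    return "dig"


def scheduler(obs: Observation) -> Agent:
    """Pick the agent for the current subtask from the inventory contents."""
    has_pickaxe = obs.get("wooden_pickaxe", 0) > 0
    if not has_pickaxe and obs.get("log", 0) < 3:  # threshold is an assumption
        return chop_tree_agent
    if not has_pickaxe:
        return craft_agent
    return mine_agent


def act(obs: Observation) -> str:
    """One step of the hierarchy: the scheduler picks an agent, the agent acts."""
    return scheduler(obs)(obs)


if __name__ == "__main__":
    for inventory in ({}, {"log": 3}, {"wooden_pickaxe": 1}):
        print(inventory, "->", act(inventory))
```

Keeping the scheduler separate from the agents means each agent only has to learn one narrow skill, and a subtask can be retrained or swapped without touching the others.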

"We won first place in the preliminary and final of the NeurIPS-2020 MineRL competition, which demonstrates the efficiency of our hierarchical method, SEIHAI," the researchers wrote in their paper. "We believe that developing methods that properly combine human priors and sample-efficient learning-based techniques is a competitive way to solve complex tasks with limited demonstrations, sparse rewards but an explicit task structure."

More information: Hangyu Mao et al, SEIHAI: A sample-efficient hierarchical AI for the MineRL competition. arXiv:2111.08857v1 [cs.LG], arxiv.org/abs/2111.08857

Citation: SEIHAI: The hierarchical AI that won the NeurIPS-2020 MineRL competition (2021, December 6) retrieved 29 June 2024 from https://techxplore.com/news/2021-12-seihai-hierarchical-ai-won-neurips-.html

