April 29, 2020

Automating the search for entirely new 'curiosity' algorithms

by Kim Martineau, Massachusetts Institute of Technology

Driven by an innate curiosity, children pick up new skills as they explore the world and learn from their experience. Computers, by contrast, often get stuck when thrown into new environments.

To get around this, engineers have tried encoding simple forms of curiosity into their algorithms with the hope that an agent pushed to explore will learn about its environment more effectively. An agent with a child's curiosity might go from learning to pick up, manipulate, and throw objects to understanding the pull of gravity, a realization that could dramatically accelerate its ability to learn many other things.

Engineers have discovered many ways of encoding curious exploration into machine learning algorithms. A research team at MIT wondered if a computer could do better, based on a long history of enlisting computers in the search for new algorithms.

In recent years, the design of deep neural networks, algorithms that search for solutions by adjusting numeric parameters, has been automated with software like Google's AutoML and auto-sklearn in Python. That's made it easier for non-experts to develop AI applications. But while deep nets excel at specific tasks, they have trouble generalizing to new situations. Algorithms expressed in code, in a high-level programming language, by contrast, have the capacity to transfer knowledge across different tasks and environments.

"Algorithms designed by humans are very general," says study co-author Ferran Alet, a graduate student in MIT's Department of Electrical Engineering and Computer Science and Computer Science and Artificial Intelligence Laboratory (CSAIL). "We were inspired to use AI to find algorithms with curiosity strategies that can adapt to a range of environments."

The researchers created a "meta-learning" algorithm that generated 52,000 exploration algorithms. They found that the top two were entirely new—seemingly too obvious or counterintuitive for a human to have proposed. Both algorithms generated exploration behavior that substantially improved learning in a range of simulated tasks, from navigating a two-dimensional grid based on images to making a robotic ant walk. Because the meta-learning process generates high-level computer code as output, both algorithms can be dissected to peer inside their decision-making processes.

The paper's senior authors are Leslie Kaelbling and Tomás Lozano-Pérez, both professors of computer science and electrical engineering at MIT. The work will be presented at the virtual International Conference on Learning Representations later this month.

The paper received praise from researchers not involved in the work. "The use of program search to discover a better intrinsic reward is very creative," says Quoc Le, a principal scientist at Google who has helped pioneer computer-aided design of deep learning models. "I like this idea a lot, especially since the programs are interpretable."

The researchers compare their automated algorithm design process to writing sentences with a limited number of words. They started by choosing a set of basic building blocks to define their exploration algorithms. After studying other curiosity algorithms for inspiration, they picked nearly three dozen high-level operations, including basic programs and deep learning models, to guide the agent to do things like remember previous inputs, compare current and past inputs, and use learning methods to change its own modules. The computer then combined up to seven operations at a time to create computation graphs describing 52,000 algorithms.

Even with a fast computer, testing them all would have taken decades. So, instead, the researchers limited their search by first ruling out algorithms predicted to perform poorly, based on their code structure alone. Then, they tested their most promising candidates on a basic grid-navigation task requiring substantial exploration but minimal computation. If the candidate did well, its performance became the new benchmark, eliminating even more candidates.

Four machines searched over 10 hours to find the best algorithms. More than 99 percent were junk, but about a hundred were sensible, high-performing algorithms. Remarkably, the top 16 were both novel and useful, performing as well as, or better than, human-designed algorithms at a range of other virtual tasks, from landing a moon rover to raising a robotic arm and moving an ant-like robot in a physical simulation.

All 16 algorithms shared two basic exploration functions.

In the first, the agent is rewarded for visiting new places where it has a greater chance of making a new kind of move. In the second, the agent is also rewarded for visiting new places, but in a more nuanced way: One neural network learns to predict the future state while a second recalls the past, and then tries to predict the present by predicting the past from the future. If this prediction is erroneous it rewards itself, as it is a sign that it discovered something it didn't know before. The second algorithm was so counterintuitive it took the researchers time to figure out.

"Our biases often prevent us from trying very novel ideas," says Alet. "But computers don't care. They try, and see what works, and sometimes we get great unexpected results."

More researchers are turning to machine learning to design better machine learning algorithms, a field known as AutoML. At Google, Le and his colleagues recently unveiled a new algorithm-discovery tool called Auto-ML Zero. (Its name is a play on Google's AutoML software for customizing deep net architectures for a given application, and Google DeepMind's Alpha Zero, the program that can learn to play different board games by playing millions of games against itself.)

Their method searches through a space of algorithms made up of simpler primitive operations. But rather than look for an exploration strategy, their goal is to discover algorithms for classifying images. Both studies show the potential for humans to use machine-learning methods themselves to create novel, high-performing machine-learning algorithms.

"The algorithms we generated could be read and interpreted by humans, but to actually understand the code we had to reason through each variable and operation and how they evolve with time," says study co-author Martin Schneider, a graduate student at MIT. "It's an interesting open challenge to design algorithms and workflows that leverage the computer's ability to evaluate lots of algorithms and our human ability to explain and improve on those ideas."

More information: Meta-learning curiosity algorithms. arxiv.org/abs/2003.05325. arXiv:2003.05325v1 [cs.LG]

Effective, interpretable algorithms for curiosity automatically discovered by evolutionary search: lis.csail.mit.edu/wp-content/u … osity-compressed.pdf

Provided by Massachusetts Institute of Technology

This story is republished courtesy of MIT News (web.mit.edu/newsoffice/), a popular site that covers news about MIT research, innovation and teaching.

Citation: Automating the search for entirely new 'curiosity' algorithms (2020, April 29) retrieved 30 June 2024 from https://techxplore.com/news/2020-04-automating-curiosity-algorithms.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.

Explore further

Researchers measure reliability, confidence for next-gen AI

111 shares

Feedback to editors

Researchers develop novel 3D printing strategy with controllable gradients porous structures

Jun 28, 2024

Researchers develop the fastest possible flow algorithm

Jun 28, 2024

Real-time modeling of 3D temperature distributions within nuclear microreactors to improve safety systems

Jun 28, 2024

Is ChatGPT the key to stopping deepfakes? Study asks LLMs to spot AI-generated images

Jun 27, 2024

Wireless receiver blocks interference for better mobile device performance

Jun 27, 2024

Researchers successfully develop domestic 6G antenna measurement system

Jun 27, 2024

Research shows how common plastics could passively cool and heat buildings with the seasons

Jun 27, 2024

Researchers suggest smart solution to harness waste heat from industry

Jun 27, 2024

Robotic hand with tactile fingertips achieves new dexterity feat

Jun 27, 2024

Help or hindrance? ER robots have potential to aid health care workers

Jun 27, 2024

Load comments (0)

Automating the search for entirely new 'curiosity' algorithms

Researchers develop novel 3D printing strategy with controllable gradients porous structures

Researchers develop the fastest possible flow algorithm

Real-time modeling of 3D temperature distributions within nuclear microreactors to improve safety systems

Is ChatGPT the key to stopping deepfakes? Study asks LLMs to spot AI-generated images

Wireless receiver blocks interference for better mobile device performance

Researchers successfully develop domestic 6G antenna measurement system

Research shows how common plastics could passively cool and heat buildings with the seasons

Researchers suggest smart solution to harness waste heat from industry

Robotic hand with tactile fingertips achieves new dexterity feat

Help or hindrance? ER robots have potential to aid health care workers

Researchers measure reliability, confidence for next-gen AI

Computer scientists create a 'laboratory' to improve streaming video

Synergy emergence in deep reinforcement motor learning

Google's robot learns to walk in real world

Chemists show how bias can crop up in machine learning algorithm results

How CERN machine-learning techniques could improve autonomous vehicles

Researchers develop the fastest possible flow algorithm

Is ChatGPT the key to stopping deepfakes? Study asks LLMs to spot AI-generated images

Robotic hand with tactile fingertips achieves new dexterity feat

Sony introduces AI for single-instrument accompaniment generation in music production

Mechanical computer relies on kirigami cubes, not electronics

New work explores optimal circumstances for reaching a common goal with humanoid robots

Phys.org

Medical Xpress

Science X

Automating the search for entirely new 'curiosity' algorithms

Researchers develop novel 3D printing strategy with controllable gradients porous structures

Researchers develop the fastest possible flow algorithm

Real-time modeling of 3D temperature distributions within nuclear microreactors to improve safety systems

Is ChatGPT the key to stopping deepfakes? Study asks LLMs to spot AI-generated images

Wireless receiver blocks interference for better mobile device performance

Researchers successfully develop domestic 6G antenna measurement system

Research shows how common plastics could passively cool and heat buildings with the seasons

Researchers suggest smart solution to harness waste heat from industry

Robotic hand with tactile fingertips achieves new dexterity feat

Help or hindrance? ER robots have potential to aid health care workers

Related Stories

Researchers measure reliability, confidence for next-gen AI

Computer scientists create a 'laboratory' to improve streaming video

Synergy emergence in deep reinforcement motor learning

Google's robot learns to walk in real world

Chemists show how bias can crop up in machine learning algorithm results

How CERN machine-learning techniques could improve autonomous vehicles

Recommended for you

Researchers develop the fastest possible flow algorithm

Is ChatGPT the key to stopping deepfakes? Study asks LLMs to spot AI-generated images

Robotic hand with tactile fingertips achieves new dexterity feat

Sony introduces AI for single-instrument accompaniment generation in music production

Mechanical computer relies on kirigami cubes, not electronics

New work explores optimal circumstances for reaching a common goal with humanoid robots

Your Privacy