A technique to improve machine learning inspired by the behavior of human infants

A detailed diagram of the approach developed by the researchers. (Bottom right) For every pair of objects, the researchers feed their features into a relation encoder to obtain the relation rij and object i's state sobji. (Top left) Using a greedy method, for each object they find the maximum Q value to select the focus object, relation object, and action. (Top right) Having gathered the focus object and relation object, they feed their states and all of their relations to the decoders to predict the change in position and change in velocity. Credit: Choi & Yoon.
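The pipeline described in the caption can be sketched roughly as follows. This is an illustrative mock-up, not the authors' implementation: the layer sizes, the `mlp` stand-ins (random, untrained linear layers in place of learned networks), and the names `q_head` and `delta_decoder` are all assumptions made here for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(in_dim, out_dim):
    # A single random linear layer standing in for a learned MLP (illustrative only).
    W = rng.normal(size=(in_dim, out_dim)) * 0.1
    return lambda x: np.tanh(x @ W)

N_OBJ, FEAT, REL, STATE, N_ACT = 3, 6, 8, 8, 4  # assumed dimensions

relation_encoder = mlp(2 * FEAT, REL + STATE)   # pair features -> (s_i, r_ij)
q_head = mlp(STATE + REL, N_ACT)                # (s_i, r_ij) -> Q values per action
delta_decoder = mlp(STATE + REL, 6)             # -> 3-D delta-position + 3-D delta-velocity

objects = rng.normal(size=(N_OBJ, FEAT))

# (Bottom right) for every ordered pair (i, j), encode relation r_ij and state s_i.
# (Top left) greedily pick the (focus object, relation object, action) with max Q.
best = None
for i in range(N_OBJ):
    for j in range(N_OBJ):
        if i == j:
            continue
        enc = relation_encoder(np.concatenate([objects[i], objects[j]]))
        s_i, r_ij = enc[:STATE], enc[STATE:]
        q = q_head(np.concatenate([s_i, r_ij]))
        a = int(np.argmax(q))
        if best is None or q[a] > best[0]:
            best = (q[a], i, j, a, s_i, r_ij)

q_max, focus, rel_obj, action, s_i, r_ij = best

# (Top right) decode the predicted change in position and velocity.
delta = delta_decoder(np.concatenate([s_i, r_ij]))
d_pos, d_vel = delta[:3], delta[3:]
print(focus, rel_obj, action, d_pos.shape, d_vel.shape)
```

In the paper the encoder and decoders are trained networks and the Q values come from the reinforcement-learning component; the sketch only shows how the pieces connect.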

From their first years of life, human beings have the innate ability to learn continuously and build mental models of the world, simply by observing and interacting with things or people in their surroundings. Cognitive psychology studies suggest that humans make extensive use of this previously acquired knowledge, particularly when they encounter new situations or when making decisions.

Despite the significant recent advances in the field of artificial intelligence (AI), most virtual agents still require hundreds of hours of training to achieve human-level performance in several tasks, while humans can learn how to complete these tasks in a few hours or less. Recent studies have highlighted two key contributors to humans' ability to acquire knowledge so quickly—namely, intuitive physics and intuitive psychology.

These intuition models, which have been observed in humans from early stages of development, might be the core facilitators of future learning. Based on this idea, researchers at the Korea Advanced Institute of Science and Technology (KAIST) have recently developed an intrinsic reward normalization method that allows AI agents to select actions that most improve their intuition models. In their paper, pre-published on arXiv, the researchers specifically proposed a graphical physics network integrated with deep reinforcement learning, inspired by the learning behavior observed in human infants.

"Imagine human infants in a room with toys lying around at a reachable distance," the researchers explain in their paper. "They are constantly grabbing, throwing and performing actions on objects; sometimes, they observe the aftermath of their actions, but sometimes, they lose interest and move on to a different object. The 'child as a scientist' view suggests that human infants are intrinsically motivated to conduct their own experiments, discover more information, and eventually learn to distinguish different objects and create richer internal representations of them."

Psychology studies suggest that in their first years of life, humans are continuously experimenting with their surroundings, and this allows them to form a key understanding of the world. Moreover, when children observe outcomes that do not meet their prior expectations, which is known as expectancy violation, they are often encouraged to experiment further to achieve a better understanding of the situation they're in.

The team of researchers at KAIST tried to reproduce these behaviors in AI agents using a reinforcement-learning approach. In their study, they first introduced a graphical physics network that can extract physical relationships between objects and predict their subsequent behaviors in a 3-D environment. Subsequently, they integrated this network with a deep-reinforcement learning model, introducing an intrinsic reward normalization technique that encourages an AI agent to explore and identify actions that will continuously improve its intuition model.
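One common way to realize such an intrinsic reward is to treat the intuition model's prediction error as "surprise" and normalize it by a running statistic of past errors, so the reward scale stays stable as the model improves. The sketch below follows that general scheme; the class name, the use of Welford's online variance update, and the exact normalization are assumptions made here and may differ from the paper's method.

```python
import numpy as np

class NormalizedIntrinsicReward:
    """Hypothetical sketch: intrinsic reward = the intuition model's prediction
    error, divided by a running standard deviation of past errors."""

    def __init__(self, eps=1e-8):
        self.count, self.mean, self.m2, self.eps = 0, 0.0, 0.0, eps

    def __call__(self, predicted, observed):
        # "Surprise": how far the observed outcome is from the model's prediction.
        error = float(np.linalg.norm(np.asarray(predicted) - np.asarray(observed)))
        # Welford's online update of the mean and variance of past errors.
        self.count += 1
        d = error - self.mean
        self.mean += d / self.count
        self.m2 += d * (error - self.mean)
        std = (self.m2 / self.count) ** 0.5 if self.count > 1 else 1.0
        return error / (std + self.eps)

rewarder = NormalizedIntrinsicReward()
# A large expectancy violation yields a positive intrinsic reward.
r = rewarder(np.zeros(3), np.ones(3))
print(r)
```

Actions whose outcomes violate the agent's expectations thus earn higher reward, mirroring the expectancy-violation behavior described above, while the normalization keeps the reward from collapsing to zero as predictions sharpen.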

Using a 3-D physics engine, the researchers demonstrated that their graphical physics network can efficiently infer the positions and velocities of different objects. They also found that their approach allowed the deep reinforcement learning network to continuously improve its intuition model, encouraging it to interact with objects solely based on intrinsic motivation.

In a series of evaluations, the new technique devised by this team of researchers achieved remarkable accuracy, with the AI agent performing a greater number of different exploratory actions. In the future, it could inform the development of machine learning tools that can learn from their past experiences faster and more effectively.

"We have tested our network on both stationary and non-stationary problems in various scenes with spherical objects with varying masses and radii," the researchers explain in their paper. "Our hope is that these pre-trained intuition models will later be used as prior knowledge for other goal-oriented tasks such as ATARI games or video prediction."


More information: Intrinsic motivation driven intuitive physics learning using deep reinforcement learning with intrinsic reward normalization. arXiv:1907.03116 [cs.LG]. arxiv.org/abs/1907.03116

© 2019 Science X Network

Citation: A technique to improve machine learning inspired by the behavior of human infants (2019, July 19) retrieved 25 August 2019 from https://techxplore.com/news/2019-07-technique-machine-behavior-human-infants.html
This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.


User comments

Jul 19, 2019
Quantumeering technology Company is developing ENIGMA, which stands for Enhanced Numeric Intuitive Grammatical Machine Algorithm. It is an AI-based programming platform combined with GENESIS, which stands for Graphics Enabled Numerically Engaged Secure Interface System. DEJAVU is a Language Protocol Interrupter Feature and is deployed via an optical Octonion Linear Logic track of advanced holographic design incorporating laser optics with a quantum data feedback/response monitoring system. This will allow the operator to see the results of the training in real time and correct language input. The most amazing aspect of this new platform is that it will utilize vocal input to teach the computer and receive a response from the data stream of ENIGMA in the form of synthesized speech and code in text form.

Jul 22, 2019
I was at a deep learning conference just a few days ago. One of the hot topics is data augmentation (because labeled training data is sometimes not available in the volume one would need to achieve good performance).
Current efforts focus on creating synthetic data and/or using generative adversarial networks (in essence, two neural networks working against one another: one trying to fool the other while the other sharpens its discriminatory capabilities).
Another way was to train on a simple "game" task with lots of data and then use that to initialize the net trained on the real data, cutting down training time substantially. Particularly the latter seems akin to how humans learn: first get to grips with something simple before moving on to more complicated tasks.

I wonder how the above method would fare against (or in conjunction) with these approaches.

Jul 29, 2019
I'm not sure how novel this idea is. I suppose if someone went back in time to the year 1984, they could use that principle for the plot basis of the 1985 movie D.A.R.Y.L.
