Researchers at the University of Lorraine have recently devised a new type of transfer learning based on model-free deep reinforcement learning with continuous sensorimotor space enlargement. Their approach, described in a paper presented at the eighth Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics and freely available on HAL archives-ouvertes, is inspired by child development, particularly by the growth of the sensorimotor space that occurs as a child acquires useful new strategies.
"The formal framework of reinforcement learning can be used to model a wide range of problems," said Matthieu Zimmer, one of the researchers who carried out the study. "In this framework, an agent uses a trial-and-error method to slowly learn what sequence of actions is the most appropriate to reach a desired goal. If some requisites are met, then theory tells us that we have algorithms that the agent can use to find the optimal solution to the problem, yet this can take long periods of time. To speed up this process, we explored ways for an agent to attain good performance in fewer trials, even when it has nearly no knowledge of the task it will have to solve."
The transfer learning method proposed by Zimmer and his colleagues adds developmental layers to neural networks, allowing them to develop new strategies to complete tasks, especially when these tasks are somehow related. These developmental layers progressively uncover some dimensions of the sensorimotor space, following an intrinsic motivation heuristic.
To mitigate the effects of "catastrophic forgetting," a common issue in the development of neural networks, the researchers took inspiration from elastic weight consolidation theory, using it to regulate the learning of the neural controller.
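As a rough illustration of the idea behind elastic weight consolidation, the sketch below computes the standard quadratic penalty that pulls each parameter back toward the value it had after the previous learning stage, weighted by an importance estimate. The function names, the `lam` strength parameter, and the toy numbers are assumptions made for this example; the article does not describe the exact regularizer the researchers used.

```python
from typing import List, Sequence


def ewc_penalty(params: Sequence[float], anchor: Sequence[float],
                importance: Sequence[float], lam: float = 1.0) -> float:
    """Quadratic EWC-style penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2.

    params     -- current parameter values (theta)
    anchor     -- parameter values saved after the earlier stage (theta*)
    importance -- per-parameter importance, e.g. a diagonal Fisher estimate (F)
    lam        -- hypothetical strength of the regularization
    """
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor, importance)
    )


def ewc_penalty_grad(params: Sequence[float], anchor: Sequence[float],
                     importance: Sequence[float], lam: float = 1.0) -> List[float]:
    """Gradient of the penalty with respect to each parameter."""
    return [lam * f * (p - a) for p, a, f in zip(params, anchor, importance)]
```

In training, this penalty would be added to the loss of the new learning stage, so that parameters judged important for the earlier behavior change slowly while unimportant ones remain free to adapt.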
"The basic idea of our work is for the agent to start with very limited perception and action capabilities and then develop these in a developmental way, inspired by how a child learns," said Alain Dutech, another researcher who carried out the study. "The space in which the agent searches for a solution is thus reduced, and this solution, albeit to a degraded problem, can be found more easily. Then we increase the capabilities of the agent, taking advantage of the previous solution found."
To better explain how their transfer learning approach works, the researchers use the example of a child learning to grab a pen. Initially, the child might use only her elbow and shoulder, learning how to touch the pen. Later, having grasped the basics of how best to make initial contact with the pen, she might start using her hand and fingers. This is a gradual learning process, in which the child acquires sensorimotor strategies step by step, without having to learn too many things at once.
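The pen example can be pictured as a mask over the agent's sensors and motors that is relaxed stage by stage. The sketch below is only an illustration under stated assumptions: the `DevelopmentalMask` class, its stage layout, and the manual `grow()` call are hypothetical, whereas the researchers' method uses developmental layers inside the network and an intrinsic motivation heuristic to decide when to uncover new dimensions.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class DevelopmentalMask:
    """Hides sensor and motor dimensions until a later developmental stage.

    Each stage is a pair (visible sensor indices, usable motor indices).
    This layout is a hypothetical simplification for illustration.
    """
    stages: Sequence[Tuple[Sequence[int], Sequence[int]]]
    stage: int = 0

    def observe(self, observation: Sequence[float]) -> List[float]:
        # Sensor dimensions not yet uncovered are replaced by zero.
        visible = set(self.stages[self.stage][0])
        return [x if i in visible else 0.0 for i, x in enumerate(observation)]

    def act(self, action: Sequence[float]) -> List[float]:
        # Motor dimensions not yet uncovered are forced to zero.
        usable = set(self.stages[self.stage][1])
        return [a if i in usable else 0.0 for i, a in enumerate(action)]

    def grow(self) -> bool:
        """Move to the next stage; returns False when already at the last one."""
        if self.stage < len(self.stages) - 1:
            self.stage += 1
            return True
        return False


# Toy arm with four joints: 0 = shoulder, 1 = elbow, 2 = wrist, 3 = fingers.
arm = DevelopmentalMask(stages=[
    ([0, 1], [0, 1]),              # early stage: shoulder and elbow only
    ([0, 1, 2, 3], [0, 1, 2, 3]),  # later stage: the whole arm
])
early_action = arm.act([0.5, -0.2, 0.9, 0.3])  # wrist and fingers are zeroed
arm.grow()
late_action = arm.act([0.5, -0.2, 0.9, 0.3])   # all four joints pass through
```

In the first stage the learner only has to search over two joints; once `grow()` is called, the same controller is exposed to the full arm.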
The researchers validated their new approach using two state-of-the-art deep reinforcement learning algorithms, namely DDPG and NFAC, tested on Half-Cheetah and Humanoid, two high-dimensional benchmark environments. Their results suggest that searching for a suboptimal solution in a subset of the parameter space before considering the full space is a helpful way to bootstrap learning algorithms, yielding better performance with less training.
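One way to picture "a subset of the parameter space" is a linear controller whose weights for not-yet-uncovered inputs are held at zero. The sketch below, which is an assumption-laden toy and not the researchers' network architecture, widens such a controller with zero-initialized weights so that the solution found in the smaller space is preserved exactly as the starting point in the larger one. The helper names `linear` and `widen_inputs` are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def linear(weights: Matrix, inputs: Sequence[float]) -> List[float]:
    """Apply a bias-free linear controller: one output per weight row."""
    for row in weights:
        if len(row) != len(inputs):
            raise ValueError("each weight row must match the input size")
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def widen_inputs(weights: Matrix, extra_inputs: int) -> Matrix:
    """Return a copy of the controller that accepts extra input dimensions.

    New weights start at zero, so the widened controller initially ignores
    the newly uncovered inputs and behaves like the smaller one.
    """
    return [list(row) + [0.0] * extra_inputs for row in weights]


# Controller learned on a reduced, two-dimensional sensor space.
small = [[0.5, -1.0],
         [2.0, 0.25]]
before = linear(small, [1.0, 2.0])

# Uncover two more sensor dimensions; the outputs are unchanged at first.
wide = widen_inputs(small, 2)
after = linear(wide, [1.0, 2.0, 7.0, -3.0])
```

Because `after` equals `before`, further training in the full space starts from the earlier solution rather than from scratch, and the new weights are then free to move away from zero.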
"In the very active and stimulating field of deep-reinforcement learning, we have shown that developmental methods like ours, as well as other similar ones explored by other researchers, could be combined with deep-learning methods to allow learning from scratch, with little prior knowledge," Zimmer said.
Despite its promising results, the study carried out by Zimmer and his colleagues also highlighted the gap that still exists between the abilities of deep neural networks and those of human beings. Even with developmental reinforcement learning, most existing agents remain far less versatile and efficient than humans.
"Sometimes, humans can learn in just one trial, yet even the most efficient artificial learning will require a complex combination of different algorithms to learn, estimate, memorize, compare, and optimize," Zimmer said. "Moreover, some of these algorithms are still not clearly defined."
Dutech and his colleagues are now exploring new horizons within the field of AI and deep learning. For instance, they would like to develop new ways for a learning agent to properly categorize the stimuli it perceives.
"Learning is much more efficient when the agent can interpret what it 'sees' or 'feels'," Dutech explained. "Today, the trend is to use deep learning and neural networks to do this. We are now exploring other methods of extracting pertinent and useful information from the raw perception of artificial agents, which are less dependent on having a huge corpus of examples, such as unsupervised learning and self-organization."
More information: Developmental reinforcement learning through sensorimotor space enlargement. HAL Id: hal-01876995. hal.archives-ouvertes.fr/hal-01876995/document
Deep developmental reinforcement learning repo: github.com/matthieu637/ddrl
More resources: matthieu-zimmer.net/publicatio … /icdl2018_slides.zip
Provided by Tech Xplore
© 2018 Tech Xplore