September 24, 2018 feature
Using reinforcement learning to achieve human-like balance control strategies in robots
Researchers at the University of Edinburgh have developed a hierarchical framework based on deep reinforcement learning (RL) that can acquire a variety of strategies for humanoid balance control. Their framework, outlined in a paper pre-published on arXiv and presented at the 2017 International Conference on Humanoid Robotics, produces far more human-like balancing behaviors than conventional controllers do.
When standing or walking, human beings innately and effectively use a number of techniques for under-actuated control that help them to keep their balance. These include toe tilting and heel rolling, which create better foot-ground clearance. Replicating similar behaviors in humanoid robots could greatly improve their motor and movement capabilities.
"Our research focuses on using deep RL to solve dynamic locomotion of humanoid robots," Dr. Zhibin Li, a lecturer in robotics and control at the University of Edinburgh, who carried out the study, told TechXplore. "In the past, locomotion was mainly done using conventional analytical approaches—model based, which are limited because they require human effort and knowledge, and demand high computing power to run online."
Requiring less human effort and manual tuning, machine learning techniques could lead to the development of more effective and specialized controllers than traditional engineering approaches. A further advantage of using RL is that the bulk of the computation can be moved offline, resulting in faster online performance for high-dimensional control systems such as humanoid robots.
The framework developed by Dr. Li, in collaboration with Dr. Taku Komura and Ph.D. student Chuanyu Yang, uses deep RL to train high-level control policies. Continuously receiving feedback on the robot's state, these policies output desired joint angles at a relatively low frequency.
"At the low-level, proportional and derivative (PD) controllers are used at a much higher control frequency to guarantee the stable joint motions," Ph.D. student Chuanyu said. "The inputs for the low-level PD controller are desired joint angles produced by the high-level neural network, and the outputs are the desired torques for joint motors."
The researchers tested the performance of their algorithm and achieved highly promising results. They found that transferring human knowledge from control engineering into the reward design of the RL algorithm yielded balance control strategies that resembled those used by humans. Moreover, because RL algorithms improve through trial and error, automatically adapting to new situations, their framework requires little hand-tuning or other intervention by human engineers.
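One way such control-engineering knowledge can enter the reward design is through shaping terms that favor physically sensible balance behavior. The sketch below is a hypothetical example in that spirit, not the reward from the paper: the three terms (upright torso, center of mass over the support area, low actuation effort) and their weights are illustrative assumptions.

```python
import numpy as np

def balance_reward(torso_tilt, com_offset, torques,
                   w_tilt=1.0, w_com=1.0, w_effort=0.001):
    """Average of exponential penalty terms, each in (0, 1].

    torso_tilt : torso deviation from vertical (rad)
    com_offset : horizontal center-of-mass offset from the support center (m)
    torques    : applied joint torques (N*m)
    """
    r_tilt = np.exp(-w_tilt * torso_tilt ** 2)            # stay upright
    r_com = np.exp(-w_com * com_offset ** 2)              # keep CoM over feet
    r_effort = np.exp(-w_effort * np.sum(torques ** 2))   # economize effort
    return (r_tilt + r_com + r_effort) / 3.0

# A perfectly balanced, effortless state scores the maximum reward of 1.0.
r_best = balance_reward(0.0, 0.0, np.zeros(2))
# Tilting the torso reduces the reward.
r_tilted = balance_reward(0.5, 0.0, np.zeros(2))
```

Encoding such priors in the reward steers the trial-and-error search toward human-like strategies without prescribing the controller itself.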
Dr. Li and his colleagues are now working on an extension of their study that applies RL to a full-body Valkyrie robot in a 3-D simulation. In this new research effort, they have been able to generalize the human-resembling balancing strategies to walking and other locomotion tasks.
"Eventually, we would like to apply this hierarchical framework of combining machine learning and robot control to real humanoid robots, as well as to other robotic platforms," Dr. Li said.
© 2018 Tech Xplore