Enhanced imitation learning algorithms using human gaze data

Enhancing imitation learning algorithms using human gaze data — Input image stack fed to the algorithms. Credit: Saran et al.

Past psychology studies suggest that the human gaze can encode the intentions of humans as they perform everyday tasks, such as making a sandwich or a hot beverage. Similarly, human gaze has been found to enhance the performance of imitation learning methods, which allow robots to learn how to complete tasks by imitating human demonstrators.

Inspired by these previous findings, researchers at the University of Texas at Austin and Tufts University have recently devised a novel strategy to enhance imitation learning algorithms using human gaze-related data. The method they developed, outlined in a paper pre-published on arXiv, uses the gaze of a human demonstrator to direct the attention of imitation learning algorithms toward areas that they believe are important, based on the fact that human users attended to them.

"Deep-learning algorithms have to learn to identify important features in visual scenes, for instance, a video game character or an enemy, while also learning how to use those features for decision making," Prof. Scott Niekum from the University of Texas at Austin told TechXplore. "Our approach makes this easier, using the human's gaze as a cue that indicates which visual elements of the scene are most important for decision-making."

The approach devised by the researchers entails the use of human gaze-related information as guidance, directing the attention of a deep learning model to particularly important features in the data it is analyzing. This gaze-related guidance is encoded in the loss function applied to deep learning models during training.

"Prior research exploring the use of gaze data to enhance imitation learning approaches typically integrated gaze data by training algorithms with more learnable parameters, making the learning computationally expensive and requiring gaze information at both train and test time," Akanksha Saran, a Ph.D. student at the University of Texas at Austin who was involved in the study, told TechXplore. "We wanted to explore alternative avenues to easily augment existing imitation learning approaches with human gaze data, without increasing learnable parameters."

The strategy developed by Niekum, Saran and their colleagues can be applied to most existing convolutional neural network (CNN)-based architectures. Using an auxiliary gaze loss component that guides the architectures toward more effective policies, their approach can ultimately enhance the performance of a variety of deep-learning algorithms.

Short video showing some examples of how the learning algorithms perform with and without the use of human gaze. Credit: Saran et al.

The new approach has several advantages over other strategies that use gaze-related data to guide deep learning models. The two most notable ones are that it does not require access to gaze data at test time and the addition of supplementary learnable parameters.

The researchers evaluated their approach in a series of experiments, using it to enhance different deep learning architectures and then testing their performance on the Atari games. They found that it significantly improved the performance of three different imitation learning algorithms, outperforming a baseline method that uses human gaze data. Moreover, the researchers' approach matched the performance of another strategy that uses gaze-related data both during training and at test time, but that entails increasing the number of learnable parameters.

"Our findings suggest that the benefits of some previously proposed approaches come from an increase in the number of learnable parameters themselves, not from the use of gaze data alone," Saran said. "Our method shows comparable improvements without adding parameters to existing imitation learning techniques."

While conducting their experiments, the researchers also observed that the motion of objects in a given scene alone does not fully explain the information encoded by gaze. In the future, the strategy they developed could be used to enhance the performance of imitation learning algorithms on a variety of different tasks. The researchers hope that their work will also inform further studies aimed at using human gaze-related data to advance computational techniques.

"While our method reduces computational needs during test time, it requires tuning of hyperparameters during training to get good performance," Saran said. "Alleviating this burden during training by encoding other intuitions of human gaze behavior will be an aspect of future work."

The approach developed by Saran and her colleagues has so far proved to be highly promising, yet there are several ways in which it could be improved further. For instance, it does not currently model all aspects of human gaze-related data that could be beneficial for imitation learning applications. The researchers hope to focus on some of these other aspects in their future studies.

"Finally, temporal connections of gaze and action have not yet been explored and might be critical to attain more benefits in performance," Saran said. "We are also working on using other cues from human teachers to enhance imitation learning, such as human audio accompanying demonstrations."

More information: Efficiently guiding imitation learning algorithms with human gaze. arXiv: 2002.12500 [cs.LG]. arxiv.org/abs/2002.12500

Enhanced imitation learning algorithms using human gaze data

An imitation learning approach to train robots without the need for real human demonstrations

Study explores why human-inspired machines can be perceived as eerie

Adobe's VideoGigaGAN uses AI to make blurry videos sharp and clear

Emulating neurodegeneration and aging in artificial intelligence systems

Microsoft claims that small, localized language models can be powerful as well

Scientists pioneer new X-ray microscopy method for data analysis 'on the fly'

New tech could help traveling VR gamers experience 'ludicrous speed' without motion sickness

On the trail of deepfakes, researchers identify 'fingerprints' of AI-generated video

New approach could make reusing captured carbon far cheaper, less energy-intensive

How much energy can offshore wind farms in the U.S. produce? New study sheds light

Engineers uncover key to efficient and stable organic solar cells

Mask-inspired perovskite smart windows enhance weather resistance and energy efficiency

Researchers increase storage, efficiency and durability of capacitors

High-energy-density capacitors with 2D nanomaterials could significantly enhance energy storage

Study shows potential of super grids when hurricanes overshadow solar panels

Rubber-like stretchable energy storage device fabricated with laser precision

Why can't robots outrun animals?

Virtual sensors help aerial vehicles stay aloft when rotors fail

New insights lead to better next-gen solar cells

Enhanced imitation learning algorithms using human gaze data

Let us know if there is a problem with our content

Thank you for taking time to provide your feedback to the editors

Share article

E-MAIL THE STORY