October 31, 2022

New software allows nonspecialists to intuitively train machines using gestures

Machine learning, from you — In each image of the HuTics custom data set, the users’ hands are visualized in blue and the object in green. HuTics is used to train a machine learning model. Credit: ©2022 Yatani and Zhou

Many computer systems that people interact with on a daily basis require knowledge about certain aspects of the world, or models, to work. These systems have to be trained, often needing to learn how to recognize objects from video or image data. This data frequently contains superfluous content that reduces the accuracy of models. So, researchers found a way to incorporate natural hand gestures into the teaching process. This way, users can more easily teach machines about objects, and the machines can also learn more effectively.

You've probably heard the term machine learning before, but are you familiar with machine teaching? Machine learning is what happens behind the scenes when a computer uses input data to form models that can later be used to perform useful functions. But machine teaching is the somewhat less explored part of the process, which deals with how the computer gets its input data to begin with.

In the case of visual systems, for example ones that can recognize objects, people need to show objects to a computer so it can learn about them. But the ways this is typically done have drawbacks, which researchers from the University of Tokyo's Interactive Intelligent Systems Laboratory sought to address.

The model made with HuTics allows LookHere to use gestures and hand positions to provide extra context for the system to pick out and identify the object, highlighted in red. Credit: ©2022 Yatani and Zhou

"In a typical object training scenario, people can hold an object up to a camera and move it around so a computer can analyze it from all angles to build up a model," said graduate student Zhongyi Zhou.

"However, machines lack our evolved ability to isolate objects from their environments, so the models they make can inadvertently include unnecessary information from the backgrounds of the training images. This often means users must spend time refining the generated models, which can be a rather technical and time-consuming task. We thought there must be a better way of doing this that's better for both users and computers, and with our new system, LookHere, I believe we have found it."

Zhou, working with Associate Professor Koji Yatani, created LookHere to address two fundamental problems in machine teaching. The first is teaching efficiency: minimizing the time and technical knowledge required of users. The second is learning efficiency: ensuring that machines get better learning data from which to create models.

LookHere addresses both by doing something novel and surprisingly intuitive. It incorporates users' hand gestures into the way an image is processed before the machine adds it to its model, relying on a custom data set known as HuTics. For example, a user can point to or present an object to the camera in a way that emphasizes its significance compared to the other elements in the scene. This is exactly how people might show objects to each other. And because the added emphasis on what is actually important in the image strips away extraneous details, the computer gains better input data for its models.
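To make the idea of "eliminating extraneous details" concrete, here is a minimal sketch in Python. It is not LookHere's actual pipeline, which the article does not describe. It assumes that some earlier step has already produced a per-pixel mask marking the presented object, and the function name `suppress_background` and the toy frame are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (red, green, blue), each 0-255
Image = List[List[Pixel]]      # rows of pixels
Mask = List[List[bool]]        # True where the pixel belongs to the object


def suppress_background(image: Image, object_mask: Mask,
                        fill: Pixel = (0, 0, 0)) -> Image:
    """Keep pixels where object_mask is True; replace the rest with fill.

    Hypothetical helper: the mask is assumed to come from a gesture-aware
    model, which is not implemented here.
    """
    if len(image) != len(object_mask):
        raise ValueError("image and mask must have the same height")
    result = []
    for row, mask_row in zip(image, object_mask):
        if len(row) != len(mask_row):
            raise ValueError("image and mask must have the same width")
        result.append([px if keep else fill
                       for px, keep in zip(row, mask_row)])
    return result


# Toy 2x2 frame: only the top-left pixel belongs to the presented object.
frame = [[(200, 10, 10), (50, 50, 50)],
         [(60, 60, 60), (70, 70, 70)]]
mask = [[True, False],
        [False, False]]
cleaned = suppress_background(frame, mask)
```

In this toy case, only the object pixel survives and the background pixels are replaced by the fill color, so a model trained on `cleaned` never sees the background.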

"The idea is quite straightforward, but the implementation was very challenging," said Zhou. "Everyone is different and there is no standard set of hand gestures. So, we first collected 2,040 example videos of 170 people presenting objects to the camera into HuTics. These assets were annotated to mark what was part of the object and what parts of the image were just the person's hands.

"LookHere was trained with HuTics, and when compared to other object recognition approaches, can better determine what parts of an incoming image should be used to build its models. To make sure it's as accessible as possible, users can use their smartphones to work with LookHere and the actual processing is done on remote servers. We also released our source code and data set so that others can build upon it if they wish."
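The annotations Zhou describes separate the pixels of the object from the pixels of the person's hands. The sketch below shows one way such a color-coded annotation could be split into two masks. It assumes, purely for illustration, that hands are painted pure blue and the object pure green, echoing how the HuTics images are visualized; the real data set's file format is not described in the article, and `split_annotation` is a hypothetical name.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (red, green, blue), each 0-255
Mask = List[List[bool]]

# Assumed color coding for this sketch only.
HAND_COLOR: Pixel = (0, 0, 255)     # blue
OBJECT_COLOR: Pixel = (0, 255, 0)   # green


def split_annotation(annotation: List[List[Pixel]]) -> Tuple[Mask, Mask]:
    """Return (hand_mask, object_mask) from a color-coded annotation image."""
    hand_mask = [[px == HAND_COLOR for px in row] for row in annotation]
    object_mask = [[px == OBJECT_COLOR for px in row] for row in annotation]
    return hand_mask, object_mask


# Toy 1x3 annotation: a hand pixel, an object pixel, and a background pixel.
toy_annotation = [[(0, 0, 255), (0, 255, 0), (0, 0, 0)]]
hands, objects = split_annotation(toy_annotation)
```

Keeping the two masks separate is what would let a model learn that hands give context about where the object is without being treated as part of the object.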

Once the reduced demand on users' time is factored in, Zhou and Yatani found that LookHere can build models up to 14 times faster than some existing systems. At present, LookHere teaches machines about physical objects and uses exclusively visual data as input. But in theory, the concept could be extended to other kinds of input data, such as sound or scientific data, and models made from that data would benefit from similar improvements in accuracy.

The research was published as part of The 35th Annual ACM Symposium on User Interface Software and Technology.

More information: Zhongyi Zhou et al, Gesture-aware Interactive Machine Teaching with In-situ Object Annotations, The 35th Annual ACM Symposium on User Interface Software and Technology (2022). DOI: 10.1145/3526113.3545648

Provided by University of Tokyo

Citation: New software allows nonspecialists to intuitively train machines using gestures (2022, October 31) retrieved 17 July 2024 from https://techxplore.com/news/2022-10-software-nonspecialists-intuitively-machines-gestures.html

