January 26, 2022

Instagram teaches AI to recognize rooms

Humans have little trouble recognizing an indoor environment, but teaching an artificial intelligence (AI) system to distinguish an office from a library is hard. AI systems are usually trained on images only, and recognizing a space just by looking at the objects in it can easily go wrong. That is why computer scientist Estefanía Talavera Martínez added a second data modality, audio, to the material the AI system is trained on. This resulted in a high success rate in recognizing indoor spaces, and in a new dataset of real-world videos for use in research. Her work was published in the journal Neural Computing and Applications on 22 January.

Estefanía Talavera Martínez is interested in developing algorithms for the automatic analysis of human behavior. In previous work, she relied on photo streams gathered by wearable cameras to gain an understanding of people's daily behavior. These images were first analyzed using AI systems. Doing the same with video is the next step, and one with more applications. "This could also be used to help robots find where they are, or to monitor the elderly, for example," explains Talavera Martínez. However, this requires an automated system that can identify indoor spaces.

Speech

Previous attempts to teach AI to recognize indoor spaces have not been very successful. "One of the reasons for this is that most systems are trained using just one modality, usually recognition of objects in a room." Therefore, Talavera Martínez decided to train her system using a second modality: transcribed texts of speech recorded in the videos.

She trained her AI system on real-world videos from Instagram, using both the images and the speech they contain. The speech was transcribed using standard Google speech recognition software. Talavera Martínez and her then Master's student Andreea Glavan tried different approaches to combining information from images and audio to find out which produced the best result. The resulting system could recognize videos from nine different types of indoor spaces with 70 percent accuracy, which is higher than previously published systems achieved. "Tests that we performed confirmed that using this combination results in a better performance of this system than training it using only images or text," says Talavera Martínez.
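The article does not say which way of combining the two modalities worked best. As a rough illustration of one common option, late fusion, the sketch below averages the class probabilities from an image model and a text model and picks the highest combined score. The function names, the three scene labels, the weight and all the numbers are made up for the example and do not come from the study.

```python
from typing import Dict


def late_fusion(image_probs: Dict[str, float],
                text_probs: Dict[str, float],
                image_weight: float = 0.5) -> Dict[str, float]:
    """Combine two per-class probability maps with a weighted average.

    Hypothetical helper for illustration; not the method from the paper.
    """
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must be between 0 and 1")
    if image_probs.keys() != text_probs.keys():
        raise ValueError("both modalities must score the same labels")

    fused = {
        label: image_weight * image_probs[label]
        + (1.0 - image_weight) * text_probs[label]
        for label in image_probs
    }
    total = sum(fused.values())
    if total == 0:
        raise ValueError("all fused scores are zero")
    # Renormalize so the fused scores sum to 1.
    return {label: score / total for label, score in fused.items()}


def predict(image_probs: Dict[str, float],
            text_probs: Dict[str, float],
            image_weight: float = 0.5) -> str:
    """Return the label with the highest fused score."""
    fused = late_fusion(image_probs, text_probs, image_weight)
    return max(fused, key=fused.get)


# Made-up scores: the image model leans toward "office",
# while the transcribed speech points strongly to "library".
image_probs = {"office": 0.45, "library": 0.40, "kitchen": 0.15}
text_probs = {"office": 0.20, "library": 0.70, "kitchen": 0.10}

print(predict(image_probs, text_probs))
```

In this toy case the image scores alone would favor "office", but the speech evidence tips the fused prediction to "library", which is the kind of correction a second modality can provide.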

Behavior

The research project has also produced a dataset of 3,788 Instagram videos depicting nine indoor scenes. In addition, a selection of 900 YouTube videos was used to confirm the results of the training program. "We have made both datasets publicly available, the first of their kind."

Talavera Martínez would like to use the new AI system to further analyze human behavior from videos: "They contain a lot of information, both as individual frames and as sequences. Importantly, our new system would be able to recognize the type of environment in which the images were made."

Apart from studying behavior, the system could be used, for example, to monitor patients, with a special focus on healthy aging. It could also be used to identify positive experiences for people to relive. "And we know that people often have a very subjective view of their own life. Our system could provide them with an objective registration and analysis."

More information: Andreea Glavan et al, InstaIndoor and multi-modal deep learning for indoor scene recognition, Neural Computing and Applications (2022). DOI: 10.1007/s00521-021-06781-2

Provided by University of Groningen

Citation: Instagram teaches AI to recognize rooms (2022, January 26) retrieved 26 April 2024 from https://techxplore.com/news/2022-01-instagram-ai-rooms.html

