June 29, 2019 weblog

Researchers step back to mannequin viral wave to explore depth

by Nancy Cohen, Tech Xplore

Who said the viral craze called Mannequin Challenge (MC) is done and dusted? Not so. Researchers have turned to the Challenge that won attention in 2016 to serve their goal. They used the MC for training a neural network that can reconstruct depth information from the videos.

"Learning the Depths of Moving People by Watching Frozen People" is the name of their paper, now up on arXiv, authored by Zhengqi Li, Tali Dekel, Forrester Cole, Richard Tucker, Noah Snavely, Ce Liu and William Freeman. The paper was submitted in April this year.

The Mannequin Challenge? Who can forget? This was a YouTube trend gone viral. Anthony Alford in InfoQ brought readers back to 2016, when an internet meme had groups of people impersonate mannequins. They stayed "frozen" while a videographer moved around the scene, shooting video from different angles.

Alford wrote that because the camera is moving and the rest of the scene is static, parallax methods can easily reconstruct accurate depth maps of human figures in a variety of poses.
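The parallax idea can be illustrated with the simplest two-view case. What follows is a minimal sketch, assuming two rectified views of a static point separated by a known sideways camera shift; it is not the multi-view pipeline used to build the dataset, and the function name and the numbers are hypothetical.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in meters of a static point seen from two rectified camera positions.

    Uses the standard two-view relation Z = f * B / d, where f is the focal
    length in pixels, B is the distance the camera moved sideways in meters,
    and d is how far the point shifted between the two images in pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the camera")
    return focal_px * baseline_m / disparity_px


# Hypothetical numbers: 1000 px focal length, camera moved 0.25 m sideways.
near = depth_from_disparity(1000.0, 0.25, 50.0)  # large image shift: close point
far = depth_from_disparity(1000.0, 0.25, 10.0)   # small image shift: distant point
print(near, far)
```

The relation only holds when the point itself does not move between the two views, which is why people holding still are so useful here.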

As the authors stated, the videos involved freezing in diverse, natural poses, while a hand-held camera toured the scene.

For training the neural network, the team converted 2,000 of the videos into 2-D images with high-resolution depth data.

Alford said that the 2,000 YouTube MC videos yielded a dataset of 4,690 sequences with a total of more than 170K valid image-depth pairs. The target of the learning system was the known depth map for the input image, computed from the MC videos. The DNN learned to take the input image, initial depth map, and human mask, and output a "refined" depth map where the depth values of humans were filled in.
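The three inputs described above could be combined along these lines. This is only an illustrative sketch, not the authors' implementation: the function name, the single-intensity pixels, the use of 0.0 to mark missing depth, and the tiny example are all assumptions made for the sake of the illustration.

```python
def build_network_input(image, initial_depth, human_mask):
    """Combine per-pixel inputs into (intensity, depth, mask) triples.

    Assumption: depth inside the human mask is blanked out (set to 0.0),
    because an initial depth map is treated as usable only where the scene
    is static. A trained network would be expected to fill those pixels in.
    """
    rows = []
    for img_row, depth_row, mask_row in zip(image, initial_depth, human_mask):
        row = []
        for pixel, depth, is_human in zip(img_row, depth_row, mask_row):
            masked_depth = 0.0 if is_human else depth
            row.append((pixel, masked_depth, int(bool(is_human))))
        rows.append(row)
    return rows


# Hypothetical 1x2 frame: the second pixel belongs to a person.
example = build_network_input(
    image=[[0.2, 0.8]],
    initial_depth=[[3.0, 1.5]],
    human_mask=[[0, 1]],
)
print(example)
```

In this sketch the background pixel keeps its initial depth while the person pixel carries no depth at all, which mirrors the description of a refined output in which the human regions are filled in.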

Christine Fisher, Engadget: "To train the neural network, the researchers converted the clips into 2-D images, estimated the camera pose and created depth maps. The AI was then able to predict the depth of moving objects in videos with higher accuracy than previously possible."

Two of the paper's co-authors described taking up the challenge in a Google blog post back in May.

"Because the entire scene is stationary (only the camera is moving), triangulation-based methods—like multi-view-stereo (MVS)—work, and we can get accurate depth maps for the entire scene including the people in it. We gathered approximately 2000 such videos, spanning a wide range of realistic scenes with people naturally posing in different group configurations." Tali Dekel, research scientist and Forrester Cole, software engineer, machine perception, wrote more about the challenge they took on.

"The human visual system has a remarkable ability to make sense of our 3-D world from its 2-D projection. Even in complex environments with multiple moving objects, people are able to maintain a feasible interpretation of the objects' geometry and depth ordering. The field of computer vision has long studied how to achieve similar capabilities by computationally reconstructing a scene's geometry from 2-D image data, but robust reconstruction remains difficult in many cases."

Why this matters: "While there is a recent surge in using machine learning for depth prediction, this work is the first to tailor a learning-based approach to the case of simultaneous camera and human motion," they said in the May blog. "In this work, we focus specifically on humans because they are an interesting target for augmented reality and 3-D video effects."

Talking about results, Karen Hao, MIT Technology Review, said the researchers converted 2,000 of the videos into 2-D images with high-resolution depth data and used them to train a neural network. It was then able to predict the depth of moving objects in a video at much higher accuracy than was possible with previous state-of-the-art methods.

More information: Learning the Depths of Moving People by Watching Frozen People, arXiv:1904.11111 [cs.CV] arxiv.org/abs/1904.11111

Moving Camera, Moving People: A Deep Learning Approach to Depth Prediction: ai.googleblog.com/2019/05/movi … ing-people-deep.html

Citation: Researchers step back to mannequin viral wave to explore depth (2019, June 29) retrieved 16 August 2024 from https://techxplore.com/news/2019-06-mannequin-viral-explore-depth.html
