June 15, 2018 report

Generation query network lets computer create multi-view 3-D model from 2-D photographs

by Bob Yirka, Tech Xplore

A team of researchers working with Google's DeepMind division in London has developed what they describe as a Generative Query Network (GQN), a system that allows a computer to build a 3-D model of a scene from 2-D photographs and then view that scene from different angles. In their paper published in the journal Science, the team describes the new type of neural network system and what it represents. They also offer a more personal take on their project in a post on their website. Matthias Zwicker, with the University of Maryland, offers a Perspective on the work done by the team in the same journal issue.

In computer science, big jumps in systems engineering can look small because the results appear so simple; it is not until someone applies the results that the leap is truly recognized. This was the case, for example, when the first systems began to appear that were able to listen to what a person says and extract meaning from it. In this new endeavor, the team at DeepMind might have made a similar leap.

In traditional computer applications, including deep learning networks, a computer must be spoon-fed data, typically large sets of examples labeled by people, in order to behave as if it has learned something. That is not the case for the GQN, which learns purely from observation, much as human infants do. The system can observe a scene, such as blocks sitting on a table, and then build a model of it that can show the scene from other angles. At first glance, as Zwicker notes, this might not seem all that groundbreaking. It is only when considering what the system must do to come up with those new angles that the real power of the system becomes clear. Using only the 2-D information provided by cameras, it has to look at the scene and infer the characteristics of objects that are partly hidden from view. It has no radar or depth finder, and no stored images of what blocks are supposed to look like. All it has to work with are the few photographs it takes.

Accomplishing this, the team explains, involves using two neural networks: one to analyze the scene, the other to use the resulting data to create a 3-D model of it that can be viewed from angles not shown in the photographs. There is much more work to be done, of course, most obviously determining whether the approach can be extended to more complex objects. But even in its primitive form, it clearly represents a new way to allow computers to learn.
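For readers who want to picture how the two networks hand data to each other, here is a rough sketch in Python. It is not DeepMind's code: the function names, the vector sizes, the single-layer "networks" and the random, untrained weights are all assumptions made for illustration, and the real generation network is considerably more elaborate. The sketch shows only the flow of data. Each photograph and its camera position are turned into a short code, the codes are summed into one description of the scene (summing is assumed here as a simple way to make the order of the photographs irrelevant), and that description plus a new camera position is used to produce an image.

```python
import math
import random

# All sizes below are arbitrary, hypothetical choices for this toy example.
IMAGE_PIXELS = 16   # a tiny flattened "photograph"
VIEW_DIM = 3        # camera position described by three numbers
REPR_DIM = 8        # length of the scene description vector


def make_layer(n_in, n_out, rng):
    """Create a random weight matrix; a real system would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def apply_layer(weights, x):
    """One dense layer with tanh, standing in for a full neural network."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def represent(weights, image, viewpoint):
    """First network: turn one photograph and its viewpoint into a code."""
    return apply_layer(weights, image + viewpoint)


def aggregate(codes):
    """Sum the per-photograph codes so their order does not matter."""
    return [sum(values) for values in zip(*codes)]


def generate(weights, scene_code, query_viewpoint):
    """Second network: predict the image seen from a new viewpoint."""
    return apply_layer(weights, scene_code + query_viewpoint)


rng = random.Random(0)
repr_weights = make_layer(IMAGE_PIXELS + VIEW_DIM, REPR_DIM, rng)
gen_weights = make_layer(REPR_DIM + VIEW_DIM, IMAGE_PIXELS, rng)

# Three made-up photographs of one scene, each paired with a camera position.
observations = [
    ([rng.random() for _ in range(IMAGE_PIXELS)],
     [rng.uniform(-1.0, 1.0) for _ in range(VIEW_DIM)])
    for _ in range(3)
]

codes = [represent(repr_weights, img, view) for img, view in observations]
scene_code = aggregate(codes)

# Ask for the scene from a camera position that was never photographed.
query_viewpoint = [0.0, 1.0, 0.5]
predicted_image = generate(gen_weights, scene_code, query_viewpoint)
print(len(scene_code), len(predicted_image))
```

Because the weights here are random, the predicted image is meaningless. In a trained system, both sets of weights would be adjusted together until the predicted images match real views of the scene.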

GQN agent “imagining” new viewpoints in rooms with multiple objects. Credit: DeepMind

GQN agent operating in partially observed maze environments. Credit: DeepMind

GQN agent performing the Shepard-Metzler object rotation task. Credit: DeepMind

More information: S. M. Ali Eslami et al. Neural scene representation and rendering, Science (2018). DOI: 10.1126/science.aar6170

Abstract
Scene representation—the process of converting visual sensory data into concise descriptions—is a requirement for intelligent behavior. Recent work has shown that neural networks excel at this task when provided with large, labeled datasets. However, removing the reliance on human labeling remains an important open problem. To this end, we introduce the Generative Query Network (GQN), a framework within which machines learn to represent scenes using only their own sensors. The GQN takes as input images of a scene taken from different viewpoints, constructs an internal representation, and uses this representation to predict the appearance of that scene from previously unobserved viewpoints. The GQN demonstrates representation learning without human labels or domain knowledge, paving the way toward machines that autonomously learn to understand the world around them.

Journal information: Science

Citation: Generation query network lets computer create multi-view 3-D model from 2-D photographs (2018, June 15) retrieved 17 July 2024 from https://techxplore.com/news/2018-06-query-network-multi-view-d.html

