November 11, 2019

New tool highlights what generative models leave out when reconstructing a scene

by Kim Martineau, Massachusetts Institute of Technology

Anyone who has spent time on social media has probably noticed that GANs, or generative adversarial networks, have become remarkably good at drawing faces. They can predict what you'll look like when you're old and what you'd look like as a celebrity. But ask a GAN to draw scenes from the larger world and things get weird.

A new demo by the MIT-IBM Watson AI Lab reveals what a model trained on scenes of churches and monuments decides to leave out when it draws its own version of, say, the Pantheon in Paris, or the Piazza di Spagna in Rome. The larger study, Seeing What a GAN Cannot Generate, was presented at the International Conference on Computer Vision last week.

"Researchers typically focus on characterizing and improving what a machine-learning system can do: what it pays attention to, and how particular inputs lead to particular outputs," says David Bau, a graduate student at MIT's Department of Electrical Engineering and Computer Science and Computer Science and Artificial Intelligence Laboratory (CSAIL). "With this work, we hope researchers will pay as much attention to characterizing the data that these systems ignore."

In a GAN, a pair of neural networks, a generator and a discriminator, compete with each other to create hyper-realistic images patterned after examples they've been given. Bau became interested in GANs as a way of peering inside black-box neural nets to understand the reasoning behind their decisions. An earlier tool developed with his advisor, MIT Professor Antonio Torralba, and IBM researcher Hendrik Strobelt, made it possible to identify the clusters of artificial neurons responsible for organizing the image into real-world categories like doors, trees, and clouds. A related tool, GANPaint, lets amateur artists add and remove those features from photos of their own.

One day, while helping an artist use GANPaint, Bau hit on a problem. "As usual, we were chasing the numbers, trying to optimize numerical reconstruction loss to reconstruct the photo," he says. "But my advisor has always encouraged us to look beyond the numbers and scrutinize the actual images. When we looked, the phenomenon jumped right out: People were getting dropped out selectively."

Just as GANs and other neural nets find patterns in heaps of data, they ignore patterns, too. Bau and his colleagues trained different types of GANs on indoor and outdoor scenes. But no matter where the pictures were taken, the GANs consistently omitted important details like people, cars, signs, fountains, and pieces of furniture, even when those objects appeared prominently in the image. In one GAN reconstruction, a pair of newlyweds kissing on the steps of a church are ghosted out, leaving an eerie wedding-dress texture on the cathedral door.

"When GANs encounter objects they can't generate, they seem to imagine what the scene would look like without them," says Strobelt. "Sometimes people become bushes or disappear entirely into the building behind them."

The researchers suspect that machine laziness could be to blame; although a GAN is trained to create convincing images, it may learn it's easier to focus on buildings and landscapes and skip harder-to-represent people and cars. Researchers have long known that GANs have a tendency to overlook some statistically meaningful details. But this may be the first study to show that state-of-the-art GANs can systematically omit entire classes of objects within an image.

An AI that drops some objects from its representations may achieve its numerical goals while missing the details most important to us humans, says Bau. As engineers turn to GANs to generate synthetic images to train automated systems like self-driving cars, there's a danger that people, signs, and other critical information could be dropped without humans realizing. It shows why model performance shouldn't be measured by accuracy alone, says Bau. "We need to understand what the networks are and aren't doing to make sure they are making the choices we want them to make."
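To see how a model might hit its numerical target while losing what matters, consider a toy illustration in Python. It is not the researchers' method or code: the 16-by-16 grayscale "scene," the four-pixel "person," and the names `make_scene`, `mse`, `dropped`, and `shifted` are all assumptions invented for this sketch. It compares a reconstruction that erases the person with one that keeps the person but is slightly off everywhere, scoring each by mean squared error over the whole image and over the person's pixels alone.

```python
# Toy illustration only (hypothetical data and names, not the study's code):
# a whole-image reconstruction loss can prefer an image that drops a small object.

SIZE = 16           # the scene is SIZE x SIZE grayscale pixels
BACKGROUND = 0.5    # brightness of the "building" background
PERSON = 1.0        # brightness of the small "person" patch
PERSON_PIXELS = {(r, c) for r in (7, 8) for c in (7, 8)}  # 4 of 256 pixels


def make_scene():
    """Return the original scene: a flat background with a small bright person."""
    return [
        [PERSON if (r, c) in PERSON_PIXELS else BACKGROUND for c in range(SIZE)]
        for r in range(SIZE)
    ]


def mse(a, b, pixels=None):
    """Mean squared error between images a and b over the given pixels
    (all pixels when none are given)."""
    if pixels is None:
        pixels = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    return sum((a[r][c] - b[r][c]) ** 2 for r, c in pixels) / len(pixels)


scene = make_scene()

# Reconstruction 1: everything is exact, but the person is replaced by background.
dropped = [[BACKGROUND] * SIZE for _ in range(SIZE)]

# Reconstruction 2: the person is kept, but every pixel is slightly too bright.
shifted = [[value + 0.1 for value in row] for row in scene]

whole_dropped = mse(scene, dropped)
whole_shifted = mse(scene, shifted)
person_dropped = mse(scene, dropped, PERSON_PIXELS)
person_shifted = mse(scene, shifted, PERSON_PIXELS)

print("whole-image error, person dropped:", whole_dropped)
print("whole-image error, person kept:   ", whole_shifted)
print("person-only error, person dropped:", person_dropped)
print("person-only error, person kept:   ", person_shifted)
```

In this toy case the whole-image score favors the reconstruction that erased the person, because four wrong pixels out of 256 barely move the average. Only the person-only score exposes the omission, which is the kind of check that looking at the actual images provides.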

More information: Seeing What a GAN Cannot Generate: ganseeing.csail.mit.edu/

Provided by Massachusetts Institute of Technology

This story is republished courtesy of MIT News (web.mit.edu/newsoffice/), a popular site that covers news about MIT research, innovation and teaching.

Citation: New tool highlights what generative models leave out when reconstructing a scene (2019, November 11) retrieved 17 April 2024 from https://techxplore.com/news/2019-11-tool-highlights-reconstructing-scene.html

