
Engineers look to an old source to empower the future of computer vision

Princeton researchers have developed an open-source software system that generates an infinite number of photorealistic scenes of the natural world, an advance that could improve the training of autonomous cars and other robots. Image courtesy of the researchers. Credit: Princeton University

Artificial intelligence seems perfect for creating massive sets of images needed to train autonomous cars and other machines to see their environment, but current generative AI systems have shortcomings that can limit their use. Now, engineers at Princeton have developed a software system to overcome those limits and quickly create image sets to prepare machines for nearly any visual setting.

The new system, called Infinigen, relies on mathematics to create natural-looking objects and environments in three dimensions. Infinigen is a procedural generator, a program that creates content from automated, human-designed algorithms rather than from labor-intensive manual data entry or the trained neural networks that power modern AI. In this way, the new program generates myriad 3D objects using only randomized mathematical rules.
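The core idea of procedural generation can be sketched in a few lines. The following is an illustrative toy, not Infinigen's actual terrain code: a handful of randomly drawn sine waves are summed into a heightfield, so the same seed always reproduces the same "world" while different seeds yield endless variety.

```python
import math
import random

def generate_terrain(width, height, seed, roughness=0.5):
    """Return a height grid built from randomized mathematical rules.

    A minimal sketch of procedural generation: a few randomly drawn
    sine waves are summed to produce smooth, varied heights, and the
    same seed always reproduces the same terrain.
    """
    rng = random.Random(seed)
    # Each "rule" is a random wave: (frequency, phase, amplitude).
    waves = [(rng.uniform(0.01, 0.1), rng.uniform(0.0, 2 * math.pi),
              rng.uniform(0.0, 1.0)) for _ in range(8)]
    return [[roughness * sum(a * math.sin(f * (x + 2 * y) + p)
                             for f, p, a in waves)
             for x in range(width)]
            for y in range(height)]
```

Because the rules are seeded rather than hand-authored, the generator can emit an unbounded stream of distinct landscapes on demand.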

Infinigen is "a dynamic program for building unlimited, diverse, and realistic natural scenes," said Jia Deng, an associate professor of computer science at Princeton and senior author of a new study that details the system. The paper was presented at the CVPR 2023 conference.

Infinigen's mathematical approach allows it to create labeled visual data, which is needed to train computer vision models, including those deployed on home robots and autonomous cars. Because Infinigen generates every image programmatically, creating a 3D world first, populating it with objects, and placing a camera to take a picture, Infinigen can automatically provide detailed labels about each image, including the category and location of each object.
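Because the program builds the 3D world itself, labels fall out of the scene description for free. The sketch below is hypothetical (the object categories, camera intrinsics, and function names are invented for illustration): objects are placed at known 3D positions and projected through a pinhole camera, so each one's category and pixel location are known exactly at render time.

```python
import random

def build_scene(seed, num_objects=5, image_size=(640, 480)):
    """Assemble a toy 3D scene and return per-object labels for free.

    Real systems place full meshes and render them; here each object
    is just a point projected with a pinhole camera model, which is
    enough to show why labels come directly from the scene graph.
    """
    rng = random.Random(seed)
    focal = 500.0                       # assumed focal length, pixels
    cx, cy = image_size[0] / 2, image_size[1] / 2
    categories = ["tree", "rock", "fish", "cloud"]
    labels = []
    for _ in range(num_objects):
        # Known 3D position in camera coordinates (z > 0 is in front).
        x = rng.uniform(-1.5, 1.5)
        y = rng.uniform(-1.0, 1.0)
        z = rng.uniform(3.0, 10.0)
        u = cx + focal * x / z          # pinhole projection to pixels
        v = cy + focal * y / z
        labels.append({"category": rng.choice(categories),
                       "pixel": (round(u, 1), round(v, 1)),
                       "world": (x, y, z)})
    return labels
```

Nothing has to be annotated after the fact: the generator wrote the scene, so it already knows every answer a human labeler would have to guess at.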

The images with automatic labels can then be used to train a robot to recognize and locate objects given only an image as input. Such labeled visual data would not be possible with existing AI image generators, according to Deng, because those programs generate images using a deep neural network that does not allow the extraction of labels.

In addition, Infinigen's users have fine-grained control of the system's settings, such as the precise lighting and viewing angle, and can fine-tune the system to make images more useful as training data.

Besides generating virtual worlds populated by digital objects with natural shapes, sizes, textures and colors, Infinigen's capabilities extend to synthetic representations of natural phenomena including fire, clouds, rain and snow.

"We expect that Infinigen will prove to be a useful resource not just for creating training data for computer vision, but also for augmented and virtual reality, game development, film-making, 3D printing, and content generation in general," Deng said.

To build Infinigen, the Princeton researchers started with Blender, a free, open-source 3D graphics suite of prebuilt software tools that dates to the 1990s. In keeping with the spirit of Blender, the Princeton researchers have released Infinigen's code under a GPL-compatible license, meaning anyone can freely use it.

Another key advantage of Infinigen is that, by vastly expanding the menu of 3D-rendered objects and landscapes, it can boost machines' ability to reconstruct in 3D, from 2D pixels alone, the complex spaces they will operate within. While moving away from real-world images to synthetic images to develop cars and robots that will move in the real world might seem counterintuitive, real image datasets have key limitations, Deng said.

For starters, the computers that guide robots and smart cars do not perceive images and other visual objects like humans do. An image that looks three-dimensional to a human is just a two-dimensional collection of pixels to a computer. To allow robots to perceive an image in 3D, the image needs to come with an annotation called a "3D ground truth." This is difficult to produce for existing 2D images, but easy for a system like Infinigen.
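A 3D ground truth can be as simple as the true 3D point behind each pixel. A synthetic renderer knows every pixel's depth exactly, so recovering that point is just the inverse of the pinhole camera model. The intrinsics below (focal length and principal point) are assumed example values, not Infinigen's:

```python
def backproject(u, v, depth, focal=500.0, cx=320.0, cy=240.0):
    """Recover the 3D point behind pixel (u, v) given its depth.

    Inverse pinhole projection: a renderer knows `depth` exactly for
    every pixel, which is what makes attaching dense 3D ground truth
    to a synthetic image essentially free.
    """
    x = (u - cx) * depth / focal
    y = (v - cy) * depth / focal
    return (x, y, depth)
```

For a real photograph, that depth value would have to be measured with extra sensors or estimated, which is exactly the annotation burden synthetic data sidesteps.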

"Synthetic datasets of 3D images have shown great initial promise," said Deng, "and we developed Infinigen to further deliver on this promise."

For Infinigen, the Princeton researchers designed subprograms, dubbed generators, that specialize in producing single distinct types of digital objects—for instance, "fish" or "mountains." Users can work with the subprograms to tailor a range of parameters including size, texture, color and reflectivity.
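A per-category generator of this kind boils down to a parameter set plus seeded random sampling. The sketch below invents its names and ranges for illustration; it is not Infinigen's actual "fish" generator, but it mirrors the pattern of a subprogram whose parameters a user can tune:

```python
import random
from dataclasses import dataclass

@dataclass
class FishParams:
    length: float        # body length in meters
    color: tuple         # RGB components in [0, 1]
    reflectivity: float  # 0 = matte, 1 = mirror

def sample_fish(seed, min_length=0.1, max_length=0.6):
    """Hypothetical 'fish' generator: draw one randomized parameter set.

    Users would adjust the ranges (here, length bounds and a low
    reflectivity cap) to control the variety of generated assets.
    """
    rng = random.Random(seed)
    return FishParams(
        length=rng.uniform(min_length, max_length),
        color=(rng.random(), rng.random(), rng.random()),
        reflectivity=rng.uniform(0.0, 0.3),
    )
```

Widening or narrowing those ranges is how a user trades realism against diversity for a given training task.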

"Users can tweak the parameters to create as much realness or un-realness as they desire for their particular task," said Deng. "The expansiveness can help ensure that machines are being broadly trained to handle and navigate the full spectrum of encounterable environments."

The researchers hope that Infinigen will become a collaborative tool, allowing users to add more features as it develops.

"A goal is for Infinigen coverage to become so good that the project becomes the go-to place for computer vision, whatever the task is," said Deng. "We want Infinigen to become a collaborative, community-driven effort that provides a useful tool for a lot of users."

More information: Report: Infinite Photorealistic Worlds Using Procedural Generation

Citation: Engineers look to an old source to empower the future of computer vision (2023, July 7) retrieved 27 April 2024 from https://techxplore.com/news/2023-07-source-empower-future-vision.html
