April 10, 2023 report

Powerful new Meta AI tool can identify individual items within images

by Peter Grad, Tech Xplore

Meta took a big leap forward this week with the unveiling of a model that can detect and isolate objects in an image even if it has never seen them before. The technology is introduced and described in an article on the arXiv pre-print server.

The AI tool represents a major advance in one of technology's tougher challenges: allowing computers to detect and comprehend the elements of a previously unseen image and isolate them for user interaction.

It recalls a concept that Robert O. Work, former vice chair of the National Security Commission on Artificial Intelligence, once described: "What AI and machine learning allows you to do is find the needle in the haystack."

In this instance, Meta's Segment Anything Model (SAM) hunts for related pixels in an image and identifies the common components that make up all the pieces of the picture.

"SAM has learned a general notion of what objects are, and it can generate masks for any object in any image or any video, even including objects and image types that it had not encountered during training," Meta AI announced in a blog post Wednesday.

The recognition task is called segmentation. We do it daily without a moment's thought. We recognize items on our office desks such as smartphones, cables, a computer screen, a lamp, a melting candy bar, a cup of coffee.

But without prior programming, a computer must strain to distinguish all components down to the last pixel in a two-dimensional image, and it's more complicated when there are overlapping items, shadows or an irregular or partitioned shape.
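To make the idea of a pixel-level mask concrete, here is a minimal Python sketch. It is not how SAM works: SAM predicts masks with a trained neural network, whereas this toy uses a classic flood fill that groups connected pixels sharing the same value. The function name `flood_fill_mask` and the tiny grid `IMAGE` are made up for illustration.

```python
from collections import deque


def flood_fill_mask(image, seed):
    """Return a boolean mask of the pixels connected to `seed` that share its value."""
    rows, cols = len(image), len(image[0])
    r0, c0 = seed
    target = image[r0][c0]
    mask = [[False] * cols for _ in range(rows)]
    mask[r0][c0] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbours (up, down, left, right).
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not mask[nr][nc] and image[nr][nc] == target:
                mask[nr][nc] = True
                queue.append((nr, nc))
    return mask


# A hypothetical 4x5 "image": 0 is background, 1 and 2 are two separate objects.
IMAGE = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 0, 2],
    [0, 0, 0, 2, 2],
]

if __name__ == "__main__":
    # "Click" on a pixel of object 1 and print its mask, one character per pixel.
    for row in flood_fill_mask(IMAGE, (0, 1)):
        print("".join("#" if cell else "." for cell in row))
```

A mask is simply a yes-or-no answer for every pixel. Real photographs rarely offer such clean, uniform regions, which is why overlapping items and shadows make the task hard.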

Prior approaches to segmentation usually required human intervention to define a mask. Earlier automated segmentation permitted detection of objects but, according to Meta AI, that required "thousands or even tens of thousands of examples" of objects along with "computer resources and technical expertise to train the segmentation model."

SAM incorporates the two approaches in a fully automated system. It was trained on a dataset of more than 1 billion masks, which allows it to generalize to new types of objects.

"This ability to generalize means that, by and large, practitioners will no longer need to collect their own segmentation data and fine-tune a model for their use case," the Meta blog stated.

One reviewer called SAM "Photoshop's 'Magic Wand' tool on steroids."

SAM can be activated by user clicks or text prompts. Meta researchers envision SAM's further utilization in the AR/VR realm. When users focus on an object, it can be delineated, defined and "lifted" into a 3D image and incorporated into a movie, game or presentation.

A free working model is available online. Users can select from an image gallery or upload their own photos. They can then tap anywhere on the screen or draw a rectangle around an item of interest and watch SAM define, for instance, the outline of a nose, face or entire body. Another option directs SAM to identify every object in an image.
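A tap can be ambiguous: a click on a nose could equally mean the nose, the face or the whole body, while a rectangle narrows the choice. The sketch below is a toy illustration of that idea, not Meta's implementation. The candidate masks are hand-written rectangles of pixel coordinates rather than model predictions, and the names `rect`, `masks_for_point`, `mask_for_box` and `CANDIDATES` are hypothetical.

```python
def rect(top, left, bottom, right):
    """Return the set of (row, col) pixels inside an inclusive rectangle."""
    return {(r, c) for r in range(top, bottom + 1) for c in range(left, right + 1)}


def masks_for_point(candidates, point):
    """Return the names of all candidate masks containing `point`, smallest first."""
    hits = [(len(pixels), name) for name, pixels in candidates.items() if point in pixels]
    return [name for _, name in sorted(hits)]


def mask_for_box(candidates, box):
    """Return the name of the candidate mask that best overlaps `box` (highest IoU)."""
    box_pixels = rect(*box)

    def iou(pixels):
        # Intersection over union: shared pixels divided by all pixels covered.
        return len(pixels & box_pixels) / len(pixels | box_pixels)

    return max(candidates, key=lambda name: iou(candidates[name]))


# Hypothetical nested masks: the nose lies inside the face, the face inside the body.
CANDIDATES = {
    "nose": rect(2, 4, 3, 5),
    "face": rect(1, 3, 4, 6),
    "body": rect(0, 2, 9, 7),
}

if __name__ == "__main__":
    print(masks_for_point(CANDIDATES, (2, 4)))    # a click on the nose matches all three
    print(mask_for_box(CANDIDATES, (1, 3, 4, 6)))  # a box drawn around the face
```

In this toy, a point prompt returns every mask that contains the click, ordered from the smallest part to the whole, and a box prompt picks the single mask that fits the rectangle most tightly.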

Although SAM has not yet been deployed on Facebook, similar technology is already used for familiar processes such as photo tagging, moderation and tagging of disallowed content, and generation of recommended posts on both Facebook and Instagram.

More information: Alexander Kirillov et al, Segment Anything, arXiv (2023). DOI: 10.48550/arxiv.2304.02643

ai.facebook.com/blog/segment-a … -image-segmentation/

segment-anything.com/

Journal information: arXiv

Citation: Powerful new Meta AI tool can identify individual items within images (2023, April 10) retrieved 26 April 2024 from https://techxplore.com/news/2023-04-powerful-meta-ai-tool-individual.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
