June 23, 2022

Turning senses into media: Can we teach artificial intelligence to perceive?

Humans perceive the world through different senses: we see, feel, hear, taste and smell. Each sense is a separate channel of information, and perceiving through several channels at once is known as multimodal perception. Does this mean that what we perceive can be seen as multimedia?

Xue Wang, Ph.D. candidate at the Leiden Institute of Advanced Computer Science (LIACS), translates perception into multimedia and uses artificial intelligence (AI) to extract information from multimodal processes, similar to how the brain processes information. In her research she tested the learning processes of AI in four different ways.

Putting words into vectors

First, Xue looked into word embedding learning: the translation of words into vectors. A vector is a quantity with two properties, namely a direction and a magnitude. This part of the research deals with how the classification of information can be improved. Xue proposed a new AI model that links words to images, making it easier to classify words. While the model was being tested, an observer could intervene if the AI did something wrong. The research shows that this model performs better than a previously used model.
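To make the idea of words as vectors concrete, here is a minimal Python sketch. It is not the model from the research: the three-dimensional vectors and the helper names (`magnitude`, `cosine_similarity`, `nearest`) are invented for illustration. It only shows how a vector's magnitude is computed and how comparing directions can indicate which words are related.

```python
import math

def magnitude(v):
    """Length of a vector (its magnitude)."""
    return math.sqrt(sum(x * x for x in v))

def cosine_similarity(a, b):
    """Compare the directions of two vectors: 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (magnitude(a) * magnitude(b))

# Hypothetical toy embeddings; real models learn hundreds of dimensions from data.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

def nearest(word):
    """Return the other word whose vector points in the most similar direction."""
    others = [w for w in embeddings if w != word]
    return max(others, key=lambda w: cosine_similarity(embeddings[word], embeddings[w]))

print(nearest("cat"))
```

In this toy setup, "cat" ends up closer to "dog" than to "car" because their vectors point in nearly the same direction.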

Looking at sub-categories

The second focus of the research is images accompanied by other information. For this topic Xue explored the potential of labeling sub-categories, also known as fine-grained labeling. She used a specific AI model to make it easier to categorize images that have little text around them. The model merges coarse labels, which are general categories, with fine-grained labels, the sub-categories. The approach is effective and helps to structure both easy and difficult categorizations.
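The relation between coarse and fine-grained labels can be sketched in a few lines of Python. This is an illustration only, not the model used in the research: the label hierarchy, the scores and the function names are all hypothetical. The sketch sums fine-grained scores per general category, picks the best general category, and then picks the best sub-category within it.

```python
# Hypothetical label hierarchy: each fine-grained label belongs to one coarse label.
FINE_TO_COARSE = {
    "sparrow": "bird",
    "eagle": "bird",
    "beagle": "dog",
    "poodle": "dog",
}

def coarse_scores(fine_scores):
    """Sum the fine-grained scores that fall under each coarse category."""
    totals = {}
    for fine, score in fine_scores.items():
        coarse = FINE_TO_COARSE[fine]
        totals[coarse] = totals.get(coarse, 0.0) + score
    return totals

def predict(fine_scores):
    """Choose the best coarse category first, then the best sub-category inside it."""
    coarse = coarse_scores(fine_scores)
    best_coarse = max(coarse, key=coarse.get)
    candidates = {f: s for f, s in fine_scores.items() if FINE_TO_COARSE[f] == best_coarse}
    best_fine = max(candidates, key=candidates.get)
    return best_coarse, best_fine

# Made-up scores for one image.
scores = {"sparrow": 0.30, "eagle": 0.25, "beagle": 0.35, "poodle": 0.10}
print(predict(scores))
```

With these made-up scores, "beagle" has the highest individual score, but the two bird sub-categories together outweigh the dogs, so the combined decision is "bird" and then "sparrow". This shows how a general category can steer a difficult sub-category decision.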

Finding relations between images and text

Third, Xue researched the association between images and text. A problem with this topic is that the transformation between the two kinds of information is not linear, which makes the relationship difficult to measure. Xue found a potential solution for this problem: a kernel-based transformation. Kernel methods are a specific class of algorithms in machine learning. With this model, it is now possible for AI to see the relationship of meaning between images and text.
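The article does not say which kernel was used. As a rough illustration of what a kernel does, the Python sketch below uses the widely known radial basis function (RBF) kernel, which is an assumption made here. The feature vectors and the `gamma` value are invented. A kernel turns two feature vectors into a similarity score that can capture non-linear relationships.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel: exp(-gamma * squared distance). Returns 1.0 for identical vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def kernel_matrix(rows, cols, gamma=0.5):
    """Similarity of every vector in `rows` to every vector in `cols`."""
    return [[rbf_kernel(r, c, gamma) for c in cols] for r in rows]

# Hypothetical feature vectors standing in for two images and two captions.
image_features = [[1.0, 0.0], [0.0, 1.0]]
text_features = [[0.9, 0.1], [0.2, 0.8]]

for row in kernel_matrix(image_features, text_features):
    print([round(value, 3) for value in row])
```

In this toy example each image scores highest against the caption whose features lie closest to its own, so the largest values sit on the diagonal of the matrix.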

Finding contrast in images and text

Lastly, Xue focused on images accompanied by text. In this part the AI had to look at contrasts between words and images. The AI model performed a task called phrase grounding, which is the linking of nouns in image captions to parts of the image. No observer could intervene during this task. The research showed that AI can link image regions to nouns with an accuracy that is average for this field of research.

The perception of artificial intelligence

This research contributes to the field of multimedia information: it shows that AI can classify words, categorize images and link images to text. Further research can make use of the methods proposed by Xue and will hopefully lead to even better insights into the multimedia perception of AI.

Provided by Leiden University

Citation: Turning senses into media: Can we teach artificial intelligence to perceive? (2022, June 23) retrieved 23 April 2024 from https://techxplore.com/news/2022-06-media-artificial-intelligence.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
