September 2, 2022

Revolutionizing image generation through AI: Turning text into images

by Ludwig Maximilian University of Munich

Image generated from the text "Happy vegetables waiting for supper." Credit: Ludwig Maximilian University of Munich

Creating images from text in seconds—and doing so with a conventional graphics card and without supercomputers? As fanciful as it may sound, this is made possible by the new Stable Diffusion AI model. The underlying algorithm was developed by the Machine Vision & Learning Group led by Prof. Björn Ommer (LMU Munich).

"Even for laypeople not blessed with artistic talent and without special computing know-how and computer hardware, the new model is an effective tool that enables computers to generate images on command. As such, the model removes a barrier to ordinary people expressing their creativity," says Ommer. But there are benefits for seasoned artists as well, who can use Stable Diffusion to quickly convert new ideas into a variety of graphic drafts. The researchers are convinced that such AI-based tools will be able to expand the possibilities of creative image generation with paintbrush and Photoshop as fundamentally as computer-based word processing revolutionized writing with pens and typewriters.

In their project, the LMU scientists had the support of the start-up Stability AI, on whose servers the AI model was trained. "This additional computing power and the extra training examples turned our AI model into one of the most powerful image synthesis algorithms," says Ommer.

The essence of billions of training images

A special aspect of the approach is that the trained model, for all its power, is compact enough to run on a conventional graphics card; it does not require a supercomputer, as image synthesis formerly did. To achieve this, the model distills the essence of billions of training images into just a few gigabytes.
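A rough calculation gives a feel for how strong this distillation is. The short Python sketch below divides a model size by a number of training images. Both input figures are illustrative assumptions chosen for round numbers, since the researchers state only "billions" of images and "a few gigabytes", and the function name is hypothetical.

```python
# Back-of-envelope: how many bytes of model are there per training image?
# Both constants are illustrative assumptions, not published figures.
ASSUMED_TRAINING_IMAGES = 2_000_000_000  # hypothetical: 2 billion images
ASSUMED_MODEL_SIZE_BYTES = 4 * 10**9     # hypothetical: 4 GB model


def bytes_per_image(model_size_bytes: int, num_images: int) -> float:
    """Return the average number of model bytes per training image."""
    if num_images <= 0:
        raise ValueError("num_images must be positive")
    return model_size_bytes / num_images


if __name__ == "__main__":
    ratio = bytes_per_image(ASSUMED_MODEL_SIZE_BYTES, ASSUMED_TRAINING_IMAGES)
    print(f"{ratio:.1f} bytes of model per training image")
```

Under these assumed figures the model has on the order of two bytes per training image, far too little to store the pictures themselves. It can only retain patterns shared across many images.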

"Once such AI has really understood what constitutes a car or what characteristics are typical for an artistic style, it will have apprehended precisely these salient features and should ideally be able to create further examples, just as the students in an old master's workshop can produce work in the same style," explains Ommer. In pursuit of the LMU scientists' goal of getting computers to learn how to see—that is to say, to understand the contents of images—this is another big step forward, which further advances basic research in machine learning and computer vision.

The trained model was recently released free of charge under the "CreativeML Open RAIL-M" license in order to facilitate further research and application of this technology more widely. "We are excited to see what will be built with the current models as well as to see what further works will be coming out of open, collaborative research efforts," says doctoral researcher Robin Rombach.

More information: Robin Rombach et al, High-Resolution Image Synthesis with Latent Diffusion Models, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2022)

Provided by Ludwig Maximilian University of Munich

Citation: Revolutionizing image generation through AI: Turning text into images (2022, September 2) retrieved 29 June 2024 from https://techxplore.com/news/2022-09-revolutionizing-image-ai-text-images.html

