A model to generate artistic images based on text descriptions

Artificial intelligence (AI) tools have proved to be highly valuable for completing a wide range of tasks. While they are primarily used to increase productivity or simplify everyday processes, they have also shown promise for automatically generating creative texts and artistic images.

Researchers at the University of Waterloo and New York University's Courant Institute have recently created an AI tool that can automatically generate unique artistic images based on text descriptions. Their method, introduced in a paper pre-published on arXiv, builds on a dynamic memory generative adversarial network (DM-GAN), a model made up of two artificial neural networks, a generator and a discriminator, that are trained against each other so that the generated images become increasingly convincing.

"We create an end-to-end solution that can generate artistic images from text descriptions," Qinghe Tian and Prof. Jean-Claude Franchitti wrote in their paper.

The key idea behind the recent work by Tian and Franchitti was to create a model that turns text descriptions provided by users into artistic images matching those descriptions. This would allow people whose disabilities prevent them from drawing effectively, as well as people with little drawing skill, to produce artistic images of the specific subjects they have in mind.

Most existing datasets for training generative models, however, contain either labeled images or texts, rather than images paired with their text descriptions. The researchers therefore had to come up with an alternative way of training their model.

"Due to the lack of datasets with paired text description and artistic images, it is hard to directly train an algorithm which can create art based on text input," the researchers explained in their paper. "To address this issue, we split our task into three steps."

First, the researchers used their DM-GAN model to generate a realistic image that represents a text description. They then used ResNet, a deep artificial neural network with many layers, to classify the image produced by the DM-GAN into one of the genre categories outlined by the WikiArt dataset.

The WikiArt dataset, which has often been used to train deep learning methods, contains more than 40,000 artistic paintings produced by 195 artists. Once the image produced by the DM-GAN has been classified into one of the genre categories outlined by WikiArt, the model selects a painting style compatible with that genre category and, in the third step, transfers it to the generated image using a neural artistic stylization network.
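The three steps described above can be sketched as a simple chain of functions. The following is a minimal illustration only, not the researchers' code: the class name `TextToArtPipeline`, the toy stand-in functions, the genre-to-style lookup table and the fallback style are all hypothetical, and the toy functions merely mimic the role that the DM-GAN, ResNet and stylization network play in the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Placeholder image type for this sketch: a 2-D grid of grayscale values.
Image = List[List[float]]


@dataclass
class TextToArtPipeline:
    """Chains the three stages; every callable is a hypothetical stand-in."""
    generate_image: Callable[[str], Image]   # step 1: text -> realistic image
    classify_genre: Callable[[Image], str]   # step 2: image -> genre label
    stylize: Callable[[Image, str], Image]   # step 3: image + style -> art
    genre_to_style: Dict[str, str]           # assumed genre -> style lookup
    default_style: str = "impressionism"     # assumed fallback, not from the paper

    def run(self, description: str) -> Image:
        realistic = self.generate_image(description)
        genre = self.classify_genre(realistic)
        style = self.genre_to_style.get(genre, self.default_style)
        return self.stylize(realistic, style)


# Toy stand-ins so the sketch runs without any trained model.
def toy_generator(description: str) -> Image:
    # Brightness depends only on the length of the description.
    value = (len(description) % 10) / 10.0
    return [[value, value], [value, value]]


def toy_classifier(image: Image) -> str:
    # Bright images count as "landscape", dark ones as "portrait".
    pixel_count = sum(len(row) for row in image)
    mean = sum(sum(row) for row in image) / pixel_count
    return "landscape" if mean >= 0.5 else "portrait"


def toy_stylizer(image: Image, style: str) -> Image:
    # Pretend each style is just a brightness offset, clamped to [0, 1].
    offset = {"impressionism": 0.1, "baroque": -0.1}.get(style, 0.0)
    return [[min(1.0, max(0.0, px + offset)) for px in row] for row in image]


pipeline = TextToArtPipeline(
    generate_image=toy_generator,
    classify_genre=toy_classifier,
    stylize=toy_stylizer,
    genre_to_style={"landscape": "impressionism", "portrait": "baroque"},
)
art = pipeline.run("a small bird with a red head")
```

Keeping each stage behind its own function reflects the reason the researchers give for splitting the task: because no dataset pairs text with artistic images, each component can be trained or swapped separately, and only the interfaces between the stages need to match.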

The researchers evaluated their multi-framework method in a series of initial experiments. While the results were encouraging, they would like to improve its performance further in future work.

"In general, we obtain acceptable results for multiple combinations of text inputs and desired styles," the researchers wrote in their paper. "However, there are still many areas of our solution that can be improved. In particular, we plan to add a speech recognition module to make it possible for people with hand disabilities to specify their inputs via voice instead of typing."

In the future, the technique developed by Tian and Franchitti could potentially be integrated into graphics and drawing applications, allowing all individuals to produce high-quality artistic images, irrespective of their abilities and artistic talents. The code for the model devised by the researchers is publicly available on GitHub. In future studies, the team also plans to compare the method's performance with that of other image generation methods and to improve its individual components.

More information: Qinghe Tian, Jean-Claude Franchitti, Text to artistic image generation. arXiv:2205.02439v1 [cs.CV], arxiv.org/abs/2205.02439

github.com/Astatine-213-Tian/T … tic-image-generation
