November 26, 2023 feature

Creating artistic collages using reinforcement learning

by Ingrid Fadelli , Tech Xplore

Researchers at Seoul National University have recently tried to train an artificial intelligence (AI) agent to create collages (i.e., artworks created by sticking various pieces of materials together), reproducing renowned artworks and other images. Their proposed model was introduced in a paper pre-printed on arXiv and presented at ICCV 2023 in October.

"Collage art requires high human artistry, and we wondered what collage artworks created by AI would look like," the authors told Tech Xplore by email, "Existing AI image generation tools like DALL-E or StableDiffusion can already generate collage images, but they are just 'collage-imitations' from pixels, not the actual collage from carrying out the real steps of collage artwork, What we wanted to do was to train AI to create 'real collage'."

In a previous study focusing on painting generation, researchers used reinforcement learning (RL) to teach AI to paint following steps similar to those followed by humans. They then started wondering if this could also be achieved for the creation of collages and started working on their reinforcement learning-based autonomous collage artwork generator.

The primary objective of their recent paper was thus to train an AI agent to create collages that are as similar to target images (e.g., paintings, photographs, etc.) as possible by tearing and pasting multiple materials, using reinforcement learning. These collages would be created using a set of materials provided by human users.

"Our RL model needs to make an agent understand what a collage is and how to do it well," the authors explained. "As RL basically requires many trials and errors, the model needs to gain experience interacting with a canvas and producing an actual collage."

As collages are made of various scraps of materials, to effectively create these artworks, an agent first needs to test diverse cut-and-paste options to ultimately determine which materials produce a collage that best resembles target images. The researchers found that initially, their model performed very poorly, yet over time, its skills significantly improved.

"The RL agent learns to make the reward bigger, where the reward is defined as an improvement in the similarity between their canvas and a target image," the authors said. "The reward function also keeps evolving over time, learning to better evaluate the similarity between agent-made collage and the target image."

During training, the researchers' model was fed a randomly assigned random image and tried to create a collage reproducing this image on a white canvas. At every step of the collage, the agent selects a random material among available options and chooses how to cut it, scrap it, and paste it onto the canvas.

"Because the target images and materials are randomly given in training, the agent becomes able to deal with any targets and materials at a later stage," the authors said. "This whole process is a bit complicated for using existing model-free RL, so we developed a differentiable collaging environment to allow the agent to track the collage's dynamics easily. This allowed us to apply model-based RL and enhance performance."

The model-based RL training scheme developed by the researchers draws inspiration from the previous work about RL-based paintings. However, the team developed their own model-based RL algorithm that addressed the dynamics associated with creating collages, which are more complex than those underpinning painting.

"While painting uses a predefined brushstroke, a collage needs to observe how the given material looks and figure out how to manipulate it to make a proper image fragment for the total collage, comprehending shape, texture, colors, and coordinates," the authors said. "Since SAC allows an agent to experience diverse actions more effectively in the continuous action space than DDPG, which was used in paintings, SAC matches our case."

To effectively generate collages, the authors used their trained model as a partial collage generator unit. This unit was found to produce high-resolution collages which closely resembled various target images.

"We also developed a module for analyzing target image complexity to assign more workload for partial collage generator to the place where the complexity is high," Lee explained. "This module can enhance the aesthetic quality of collages."

A crucial advantage of the team's architecture is that it does not require any collage samples and demonstration data, as it was simply trained using examples of materials and target images. Notably, these materials and images are far easier to collect than original artworks.

"Without artistic data or knowledge, the agent independently learned how to make a collage," the authors said. "The final collaging ability was made by the agent's own exploration, which is the notable finding of this work; it shows the mighty ability of RL as a data-free learning domain."

As the team's trained model gradually grasped the process of collage-making, it could generalize well across a wide range of images and scenarios. So far, it has only been tested in simulations. However, if applied to a humanoid robot or a robotic hand, the model could also provide 'blueprints' for the creation of physical collages.

"Building an environment in which the RL agent can learn properly was very challenging," the authors said. "We spent a lot of time developing and defining collage dynamics and actions that are legit for RL. Also, to save training time, we should keep them as compact and efficient as possible. Even more, we had to keep the dynamics differentiable for our model-based RL scheme as well."

As art is highly subjective, evaluating the quality of collages produced by the model is challenging. The researchers initially carried out a user study, asking various human participants to share their opinions and feedback on the AI-created collages.

"We conducted a user study, but this may not be enough," the authors said. "After much consideration for more objective evaluation, we decided to use CLIP, a large vision-language pre-trained model. Because CLIP is trained with about 400M text-image pairs, we believe it has the ability to evaluate more objectively than user study. With user study and CLIP, we compared our model with other pixel-based generation models by evaluating generated images' collage-ness and content consistency."

The user study and the CLIP-based evaluation carried out by the researchers yielded similar results. In both these tests, the new model was found to outperform other models for collage generation.

The model introduced in this recent paper could soon be developed further and tested to allow customized styles using a broader range of images and materials. In addition, the team's work could inspire the development of additional AI tools for generating various types of artwork.

"We are now interested in developing strategies that allow our models to cope with various style preferences," the authors added said. "As a future work, we consider developing a user-interactive interface, which can reflect user's preference during our model's creating collages."

More information: Ganghun Lee et al, Neural Collage Transfer: Artistic Reconstruction via Material Manipulation, arXiv (2023). DOI: 10.48550/arxiv.2311.02202

Ganghun Lee et al, From Scratch to Sketch: Deep Decoupled Hierarchical Reinforcement Learning for Robotic Sketching Agent, 2022 International Conference on Robotics and Automation (ICRA) (2022). DOI: 10.1109/ICRA46639.2022.9811858

Journal information: arXiv

Citation: Creating artistic collages using reinforcement learning (2023, November 26) retrieved 29 June 2024 from https://techxplore.com/news/2023-11-artistic-collages.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.

Explore further

Google adds cinematic touch, automated collages and favorite people to Memories

62 shares

Feedback to editors

Researchers develop novel 3D printing strategy with controllable gradients porous structures

23 hours ago

Researchers develop the fastest possible flow algorithm

Jun 28, 2024

Real-time modeling of 3D temperature distributions within nuclear microreactors to improve safety systems

Jun 28, 2024

Is ChatGPT the key to stopping deepfakes? Study asks LLMs to spot AI-generated images

Jun 27, 2024

Wireless receiver blocks interference for better mobile device performance

Jun 27, 2024

Researchers successfully develop domestic 6G antenna measurement system

Jun 27, 2024

Research shows how common plastics could passively cool and heat buildings with the seasons

Jun 27, 2024

Researchers suggest smart solution to harness waste heat from industry

Jun 27, 2024

Robotic hand with tactile fingertips achieves new dexterity feat

Jun 27, 2024

Help or hindrance? ER robots have potential to aid health care workers

Jun 27, 2024

Load comments (0)

Creating artistic collages using reinforcement learning

Researchers develop novel 3D printing strategy with controllable gradients porous structures

Researchers develop the fastest possible flow algorithm

Real-time modeling of 3D temperature distributions within nuclear microreactors to improve safety systems

Is ChatGPT the key to stopping deepfakes? Study asks LLMs to spot AI-generated images

Wireless receiver blocks interference for better mobile device performance

Researchers successfully develop domestic 6G antenna measurement system

Research shows how common plastics could passively cool and heat buildings with the seasons

Researchers suggest smart solution to harness waste heat from industry

Robotic hand with tactile fingertips achieves new dexterity feat

Help or hindrance? ER robots have potential to aid health care workers

Google adds cinematic touch, automated collages and favorite people to Memories

A deep learning framework to enhance the capabilities of a robotic sketching agent

Research team develops an AI model for effectively removing biases in a dataset

Addressing copyright, compensation issues in generative AI

Synthetic imagery sets new bar in AI training efficiency

Images of simulated cities help artificial intelligence to understand real streetscapes

Researchers develop the fastest possible flow algorithm

Is ChatGPT the key to stopping deepfakes? Study asks LLMs to spot AI-generated images

Robotic hand with tactile fingertips achieves new dexterity feat

Sony introduces AI for single-instrument accompaniment generation in music production

Mechanical computer relies on kirigami cubes, not electronics

New work explores optimal circumstances for reaching a common goal with humanoid robots

Phys.org

Medical Xpress

Science X

Creating artistic collages using reinforcement learning

Researchers develop novel 3D printing strategy with controllable gradients porous structures

Researchers develop the fastest possible flow algorithm

Real-time modeling of 3D temperature distributions within nuclear microreactors to improve safety systems

Is ChatGPT the key to stopping deepfakes? Study asks LLMs to spot AI-generated images

Wireless receiver blocks interference for better mobile device performance

Researchers successfully develop domestic 6G antenna measurement system

Research shows how common plastics could passively cool and heat buildings with the seasons

Researchers suggest smart solution to harness waste heat from industry

Robotic hand with tactile fingertips achieves new dexterity feat

Help or hindrance? ER robots have potential to aid health care workers

Related Stories

Google adds cinematic touch, automated collages and favorite people to Memories

A deep learning framework to enhance the capabilities of a robotic sketching agent

Research team develops an AI model for effectively removing biases in a dataset

Addressing copyright, compensation issues in generative AI

Synthetic imagery sets new bar in AI training efficiency

Images of simulated cities help artificial intelligence to understand real streetscapes

Recommended for you

Researchers develop the fastest possible flow algorithm

Is ChatGPT the key to stopping deepfakes? Study asks LLMs to spot AI-generated images

Robotic hand with tactile fingertips achieves new dexterity feat

Sony introduces AI for single-instrument accompaniment generation in music production

Mechanical computer relies on kirigami cubes, not electronics

New work explores optimal circumstances for reaching a common goal with humanoid robots

Your Privacy