Our approach DragGAN allows users to "drag" the content of any GAN-generated images. Users only need to click a few handle points (red) and target points (blue) on the image, and our approach will move the handle points to precisely reach their corresponding target points. Users can optionally draw a mask of the flexible region (brighter area), keeping the rest of the image fixed. This flexible point-based manipulation enables control of many spatial attributes like pose, shape, expression, and layout across diverse object categories. Credit: arXiv (2023). DOI: 10.48550/arxiv.2305.10973

A team of computer scientists from the Max Planck Institute for Informatics, MIT, Google and the University of Pennsylvania has developed a new AI imaging tool that lets users interactively manipulate the content of 2D images as though the depicted objects were three-dimensional. The team published a paper describing the tool, called DragGAN, on the arXiv preprint server, along with short videos showing what it can do.

Photoshop was first released in 1990, and since that time, it and similar apps have been used to edit photographs. Such use has become a standard part of social media: people photoshop images before posting them online as a way to "improve" them. In this new effort, the research team has taken image editing to a new level by adding artificial intelligence.

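At first glance, DragGAN looks very much like any other image-editing tool. But videos posted by the research team show that it is capable of things conventional editing applications have not come close to achieving, allowing users to alter images on the fly as though their subjects were three-dimensional. The researchers describe the newly generated regions as "hallucinated occluded content," meaning imagery the model invents for parts of a subject that were hidden in the original picture.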


Photographs, by their very nature, are two-dimensional. Previous tools have allowed for blurring, coloring or even patching in other imagery. But all such editing is based on user effort—the user has to direct the color correction or blur out wrinkles. An AI-based photo editing tool, taught to recognize features through analyzing thousands or millions of other images, can infer what missing parts of a picture might look like and make changes based on that, with user prompting.

In one video, for example, a photograph of an angry person can be changed to show the same person smiling—all with just a click and a drag. The person's face can be turned, as well, revealing parts of the head that were never captured in the original photograph. Likewise, cars, animals or landscapes can be drastically altered using just a few clicks and drags. Adding AI to photo editing adds a whole new dimension to the category—one that could make as big a splash as Photoshop did when it was first introduced.
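For readers who want a feel for the interaction described in the figure caption, the short Python sketch below models only the user-facing idea: handle points that are stepped toward their target points, plus an optional mask marking the flexible region. It is a toy illustration, not the researchers' method. DragGAN itself operates on the internal features of a GAN, which this sketch does not attempt to reproduce, and every name in it (`step_toward`, `drag`, `inside_mask`) is hypothetical.

```python
import math

# A point is an (x, y) pixel coordinate. All names here are hypothetical
# and chosen for illustration; none come from the DragGAN paper or code.
Point = tuple[float, float]


def step_toward(handle: Point, target: Point, step: float = 1.0) -> Point:
    """Move a handle point at most `step` pixels along the line to its target."""
    dx, dy = target[0] - handle[0], target[1] - handle[1]
    dist = math.hypot(dx, dy)
    if dist <= step:
        # Close enough: snap onto the target so the loop can terminate exactly.
        return target
    return (handle[0] + step * dx / dist, handle[1] + step * dy / dist)


def drag(handles, targets, step: float = 1.0, max_iters: int = 1000):
    """Repeatedly step every handle toward its paired target.

    Returns the final handle positions. This only moves coordinates; a real
    system would also re-render the image after every step.
    """
    current = [tuple(h) for h in handles]
    goals = [tuple(t) for t in targets]
    for _ in range(max_iters):
        if current == goals:
            break
        current = [step_toward(h, t, step) for h, t in zip(current, goals)]
    return current


def inside_mask(mask, point: Point) -> bool:
    """Return True if `point` falls in the flexible (editable) region.

    `mask` is a list of rows of booleans, indexed as mask[y][x].
    """
    x, y = round(point[0]), round(point[1])
    return 0 <= y < len(mask) and 0 <= x < len(mask[0]) and bool(mask[y][x])


if __name__ == "__main__":
    # One handle at the origin dragged to (3, 4), one pixel per step.
    print(drag([(0.0, 0.0)], [(3.0, 4.0)]))
    # A 2x2 mask whose right-hand column is editable.
    print(inside_mask([[False, True], [False, True]], (1, 0)))
```

Moving the handles in small steps rather than jumping straight to the targets mirrors the gradual motion seen in the team's videos, though how the real tool decides each step is described in the paper, not here.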

More information: Xingang Pan et al, Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold, arXiv (2023). DOI: 10.48550/arxiv.2305.10973

Project page: vcai.mpi-inf.mpg.de/projects/DragGAN/

Journal information: arXiv