DeepMind demonstrates Genie, an AI app that can generate playable 2D worlds from a single image

Playing from Image Prompts: We can prompt Genie with images generated by text-to-image models, hand-drawn sketches or real-world photos. In each case we show the prompt frame and a second frame after taking one of the latent actions four consecutive times. In each case we see clear character movement, despite some of the images being visually distinct from the dataset. Credit: arXiv (2024). DOI: 10.48550/arxiv.2402.15391

AI researchers at Google's DeepMind, working with colleagues at the University of British Columbia, have announced the development of Genie, an AI-backed application capable of turning a single image into a playable 2D virtual world.

The team has posted a paper on the arXiv preprint server outlining their work and has also posted an announcement page on DeepMind's research site.

Two-dimensional video games, such as Super Mario Brothers, allow players to manipulate a character on a screen as they proceed through a virtual world. In this new effort, the team at DeepMind has automated the process of creating 2D video games by allowing Genie to accept a single image, such as a character in front of an imagined background, and then using it to generate the rest of the game. This was made possible by training it on thousands of hours of video from hundreds of 2D video games.

To create Genie, the team first built an AI application that was able to tokenize video frames into millions of parameters that it could use to build new frames. They then added what they describe as a "latent action model" to make predictions about what a given next scene might look like based on the current image.

Next, they added a module, a dynamics model, to make guesses about possible next sequences based on what it learned during the training phase. The result is a series of frames linked together to form what looks like a 2D video game.
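The frame-by-frame loop described above can be illustrated with a toy sketch. This is not DeepMind's code: the learned tokenizer and next-frame predictor are replaced by trivial stand-in functions, and every name here (`tokenize`, `dynamics_step`, `play`, `NUM_LATENT_ACTIONS`, `CONTEXT_FRAMES`) is hypothetical. It only shows the control flow of turning a prompt image plus a sequence of player actions into new frames.

```python
from collections import deque
from typing import Deque, List, Sequence, Tuple

Frame = List[List[int]]   # toy "image": a small grid of integers
Tokens = Tuple[int, ...]  # toy "token" sequence for one frame

NUM_LATENT_ACTIONS = 8    # assumption: size of the discrete action set
CONTEXT_FRAMES = 16       # assumption: how many past frames are remembered


def tokenize(frame: Frame) -> Tokens:
    """Stand-in for a learned video tokenizer: just flatten the grid."""
    return tuple(value for row in frame for value in row)


def detokenize(tokens: Tokens, width: int) -> Frame:
    """Inverse of the toy tokenizer: rebuild rows of the given width."""
    return [list(tokens[i:i + width]) for i in range(0, len(tokens), width)]


def dynamics_step(history: Sequence[Tokens], action: int) -> Tokens:
    """Stand-in for a learned next-frame predictor.

    A real model would predict tokens from the whole history; this toy
    version rotates the latest tokens right by `action` positions.
    """
    last = history[-1]
    shift = action % len(last)
    return last[-shift:] + last[:-shift] if shift else last


def play(prompt: Frame, actions: Sequence[int]) -> List[Frame]:
    """Generate one new frame per action, starting from a prompt image."""
    width = len(prompt[0])
    history: Deque[Tokens] = deque([tokenize(prompt)], maxlen=CONTEXT_FRAMES)
    frames = [prompt]
    for action in actions:
        if not 0 <= action < NUM_LATENT_ACTIONS:
            raise ValueError(f"unknown latent action: {action}")
        next_tokens = dynamics_step(list(history), action)
        history.append(next_tokens)  # oldest frame drops out once full
        frames.append(detokenize(next_tokens, width))
    return frames


if __name__ == "__main__":
    # Repeat one action four times, as in the figure caption above.
    for frame in play([[1, 0, 0], [0, 0, 0]], [1, 1, 1, 1]):
        print(frame)
```

In this toy version the "character" (the single 1 in the grid) moves one cell per step, which loosely mirrors the caption's demonstration of repeating one latent action four times. In the real system each of these steps is a trained neural network rather than a fixed rule.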

Credit: Google DeepMind

The researchers acknowledge that Genie is still very much a work in progress and has several limitations not easily seen in the examples provided. It is slow, running approximately 20 to 30 times slower than what the average player would consider normal speed. It makes a lot of mistakes and can create unrealistic worlds that are not playable. It is also currently limited in scope: it can only run 16 frames at a time.

Still, the team at DeepMind suggests that Genie demonstrates a new step forward in generative AI, allowing users to generate their own games based on their own unique preferences.

More information: Jake Bruce et al, Genie: Generative Interactive Environments, arXiv (2024). DOI: 10.48550/arxiv.2402.15391

Genie: Generative Interactive Environments: sites.google.com/view/genie-2024/home and
deepmind.google/research/publications/60474/

Journal information: arXiv

© 2024 Science X Network

Citation: DeepMind demonstrates Genie, an AI app that can generate playable 2D worlds from a single image (2024, March 6) retrieved 15 April 2024 from https://techxplore.com/news/2024-03-deepmind-genie-ai-app-generate.html
