August 17, 2018

AI could make dodgy lip sync dubbing a thing of the past

Researchers have developed a system using artificial intelligence that can edit the facial expressions of actors to accurately match dubbed voices, saving time and reducing costs for the film industry. It can also be used to correct gaze and head pose in video conferencing, and enables new possibilities for video postproduction and visual effects.

The technique was developed by an international team led by a group from the Max Planck Institute for Informatics and including researchers from the University of Bath, Technicolor, TU Munich and Stanford University. The work, called Deep Video Portraits, was presented for the first time at the SIGGRAPH 2018 conference in Vancouver on 16th August.

Unlike previous methods, which focus only on movements of the face interior, Deep Video Portraits can animate the whole face, including the eyes, eyebrows, and head position, using controls familiar from computer graphics face animation. It can even synthesise a plausible static video background if the head is moved around.

Hyeongwoo Kim from the Max Planck Institute for Informatics explains: "It works by using model-based 3-D face performance capture to record the detailed movements of the eyebrows, mouth, nose, and head position of the dubbing actor in a video. It then transposes these movements onto the 'target' actor in the film to accurately sync the lips and facial movements with the new audio."
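The transfer step Kim describes can be pictured with a small sketch. This is an illustration only, not the researchers' software: the `FaceFrame` class, its field names, and the `transfer_performance` function are hypothetical, and the sketch assumes that face capture has already produced one set of parameters per video frame. It covers only the idea of copying the dubbing actor's movements onto the target actor while leaving the target's identity untouched; the capture and the rendering of the final video are not shown.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class FaceFrame:
    """Hypothetical per-frame face parameters (all names are illustrative)."""
    identity: Tuple[float, ...]            # whose face it is; never transferred
    expression: Tuple[float, ...]          # mouth, eyebrow and nose movement
    head_pose: Tuple[float, float, float]  # head rotation as (pitch, yaw, roll)
    eye_gaze: Tuple[float, float]          # gaze direction


def transfer_performance(
    source: List[FaceFrame], target: List[FaceFrame]
) -> List[FaceFrame]:
    """Copy the source actor's movements onto the target actor, frame by frame.

    The target keeps its own identity; only expression, head pose and eye
    gaze are taken from the source. zip() stops at the shorter sequence.
    """
    return [
        replace(
            t,
            expression=s.expression,
            head_pose=s.head_pose,
            eye_gaze=s.eye_gaze,
        )
        for s, t in zip(source, target)
    ]


# One frame of a dubbing actor (source) and of the on-screen actor (target).
dubbing_actor = [FaceFrame((0.1, 0.2), (0.9, 0.0), (0.0, 5.0, 0.0), (0.1, -0.1))]
screen_actor = [FaceFrame((0.7, 0.3), (0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0))]

edited = transfer_performance(dubbing_actor, screen_actor)
```

In this toy example the edited frame still carries the on-screen actor's identity values but takes its expression, head pose, and gaze from the dubbing actor.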

The research is currently at the proof-of-concept stage and does not yet run in real time. However, the researchers anticipate that the approach could make a real difference to the visual entertainment industry.

Professor Christian Theobalt, from the Max Planck Institute for Informatics, said: "Despite extensive post-production manipulation, dubbing films into foreign languages always presents a mismatch between the actor on screen and the dubbed voice.

"Our new Deep Video Portrait approach enables us to modify the appearance of a target actor by transferring head pose, facial expressions, and eye motion with a high level of realism."

Co-author of the paper, Dr. Christian Richardt, from the University of Bath's motion capture research centre CAMERA, adds: "This technique could also be used for post-production in the film industry where computer graphics editing of faces is already widely used in today's feature films."

A great example is 'The Curious Case of Benjamin Button' where the face of Brad Pitt was replaced with a modified computer graphics version in nearly every frame of the movie. This work remains a very time-consuming process, often requiring many weeks of work by trained artists.

"Deep Video Portraits shows how such a visual effect could be created with less effort in the future. With our approach even the positioning of an actor's head and their facial expression could be easily edited to change camera angles or subtly change the framing of a scene to tell the story better."

The approach can also be used in other applications, which the authors show on their project website. In video and VR teleconferencing, for instance, it can correct gaze and head pose so that the conversation feels more natural. The software enables many new creative applications in visual media production, but the authors are also aware of the potential for misuse of modern video editing technology.

Dr. Michael Zollhöfer, from Stanford University, explains: "The media industry has been touching up photos with photo-editing software for many years, meaning most of us have learned to take what we see in photos with a pinch of salt. With ever improving video editing technology, we must also start being more critical about the video content we consume every day, especially if there is no proof of origin. We believe that the field of digital forensics should and will receive a lot more attention in the future to develop approaches that can automatically prove the authenticity of a video clip. This will lead to ever better approaches that can spot such modifications even if we humans might not be able to spot them with our own eyes."

To address this, the research team is using the same technology to develop, in tandem, neural networks trained to detect synthetically generated or edited video with high precision, making forgeries easier to spot. The authors have no plans to make the software publicly available, but state that any software implementing the many creative use cases should include watermarking schemes to clearly mark modifications.

More information: richardt.name/publications/deep-video-portraits/

Provided by University of Bath

Citation: AI could make dodgy lip sync dubbing a thing of the past (2018, August 17) retrieved 17 July 2024 from https://techxplore.com/news/2018-08-ai-dodgy-lip-sync-dubbing.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.

