Shoot better drone videos with a single word
The pros make it look easy, but making a movie with a drone can be anything but.
First, it takes skill to fly the often expensive pieces of equipment smoothly and without crashing. And once you've mastered flying, there are camera angles, panning speeds, trajectories and flight paths to plan.
With all the sensors and processing power onboard a drone and embedded in its camera, there must be a better way to capture the perfect shot.
"Sometimes you just want to tell the drone to make an exciting video," said Rogerio Bonatti, a Ph.D. candidate in Carnegie Mellon University's Robotics Institute.
Bonatti was part of a team from CMU, the University of Sao Paulo and Facebook AI Research that developed a model that enables a drone to shoot a video based on a desired emotion or viewer reaction. The drone uses camera angles, speeds and flight paths to generate a video that could be exciting, calm, enjoyable or nerve-wracking, depending on what the filmmaker tells it.
The team presented their paper on the work at the 2021 International Conference on Robotics and Automation this month.
"We are learning how to map semantics, like a word or emotion, to the motion of the camera," Bonatti said.
But before "Lights! Camera! Action!" the researchers needed hundreds of videos and thousands of viewers to capture data on what makes a video evoke a certain emotion or feeling. Bonatti and the team collected a few hundred diverse videos. A few thousand viewers then watched 12 pairs of videos and gave them scores based on how the videos made them feel.
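To make the data-collection step concrete, here is a minimal sketch of how viewer ratings for pairs of videos could be averaged into one score per video for a single emotion. The clip names, the 1-to-5 scale and the `mean_scores` helper are illustrative assumptions, not details from the team's paper.

```python
from collections import defaultdict


def mean_scores(ratings):
    """Average the scores each video received for one emotion.

    `ratings` is a list of (video_a, video_b, score_a, score_b) tuples,
    one per pair shown to a viewer. This layout is a hypothetical one.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for video_a, video_b, score_a, score_b in ratings:
        totals[video_a] += score_a
        counts[video_a] += 1
        totals[video_b] += score_b
        counts[video_b] += 1
    return {video: totals[video] / counts[video] for video in totals}


# Toy "excitement" ratings on an assumed 1-to-5 scale (made-up numbers).
ratings = [
    ("clip_1", "clip_2", 4, 2),
    ("clip_1", "clip_3", 5, 3),
    ("clip_2", "clip_3", 1, 3),
]
scores = mean_scores(ratings)
print(scores)
```

Averaging is only the simplest way to combine such ratings; the researchers may well have used a different aggregation.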
The researchers then used the data to train a model that directed the drone to mimic the cinematography corresponding to a particular emotion. If fast-moving, tight shots created excitement, the drone would use those elements to make an exciting video when the user requested it. The drone could also create videos that were calm, revealing, interesting, nervous and enjoyable, among other emotions and their combinations, like an interesting and calm video.
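As a rough illustration of mapping a requested emotion to camera behavior, the sketch below swaps the team's learned model for a plain nearest-neighbor lookup: it picks the example shot whose viewer scores are closest to the requested mix of emotions. The shot parameters (`speed_m_s`, `distance_m`), the scores and the `pick_shot` function are all hypothetical.

```python
import math

# Hypothetical example shots: camera parameters plus averaged viewer
# scores between 0 and 1. All numbers are made up for illustration.
EXAMPLE_SHOTS = [
    {"params": {"speed_m_s": 8.0, "distance_m": 3.0},
     "scores": {"exciting": 0.9, "calm": 0.1}},
    {"params": {"speed_m_s": 1.0, "distance_m": 12.0},
     "scores": {"exciting": 0.1, "calm": 0.9}},
    {"params": {"speed_m_s": 4.0, "distance_m": 6.0},
     "scores": {"exciting": 0.5, "calm": 0.5}},
]


def pick_shot(target, examples=EXAMPLE_SHOTS):
    """Return the parameters of the shot whose scores best match `target`.

    This is a nearest-neighbor lookup in emotion-score space, a stand-in
    for the learned model described in the article.
    """
    def distance(example):
        # Euclidean distance between requested and observed scores.
        return math.sqrt(sum((example["scores"][emotion] - value) ** 2
                             for emotion, value in target.items()))

    return min(examples, key=distance)["params"]


exciting_shot = pick_shot({"exciting": 1.0, "calm": 0.0})
calm_shot = pick_shot({"exciting": 0.0, "calm": 1.0})
print(exciting_shot, calm_shot)
```

With these toy numbers, a request for excitement selects the fast, close shot and a request for calm selects the slow, distant one. A learned model could also interpolate between examples to produce intermediate degrees of an emotion, which a lookup like this cannot.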
"I was surprised that this worked," said Bonatti. "We were trying to learn something incredibly subjective, and I was surprised that we obtained good quality data."
The team tested their model by creating sample videos, like a chase scene or someone dribbling a soccer ball, and asked viewers for feedback on how the videos felt. Bonatti said that not only did the team create videos intended to be exciting or calming that actually felt that way, but they also achieved different degrees of those emotions.
The team's work aims to improve the interface between people and cameras, whether that be helping amateur filmmakers with drone cinematography or providing on-screen directions on a smartphone to capture the perfect shot.
"This opens this door to many other applications, even outside filming or photography," Bonatti said. "We designed a model that maps emotions to robot behavior."
More information: "Batteries, camera, action! Learning a semantic control space for expressive robot cinematography," arXiv:2011.10118 [cs.RO], arxiv.org/pdf/2011.10118.pdf