A new algorithm enables more realistic sound effects in VR

When we watch movies or play video games, the right sound effects can help make scenes more realistic. When a grizzled gambler rolls a silver dollar across a card table in a silver-screen saloon, the sound seems to travel from ear to ear, a trick moviemakers accomplish by splicing in a prerecorded sound that moves from speaker to speaker.

But creating such sensations in virtual reality has so far been practically impossible because VR is unscripted. It's hard to predict what noises an object might make, or where they might be heard. To make VR sound realistic, engineers would have to create a vast number of "sound models," the computerized equivalents of prerecordings. Each sound model would enable the VR system to synthesize a particular sound at the precise moment it was needed. Until now, it has taken a cluster of computers many hours to create even a single sound model. Because many different models are needed to synthesize the many different sounds a scene might call for, realistic sound in interactive environments has remained an elusive goal.

Now computer scientists at Stanford have invented an algorithm that can create sound models in seconds, making it cost-effective to simulate sounds for many different objects in a virtual environment. When an action occurs that demands a sound, a model built this way can synthesize a sound every bit as realistic as those generated by the much slower, still-experimental algorithms that preceded it. "Making it easier to create models makes it practical to build interactive environments with realistic sound effects," said Doug James, a professor of computer science with a courtesy appointment in music.

Earlier algorithms for creating sound models built on the work of the 19th-century scientist Hermann von Helmholtz, who gave his name to an equation that describes how sounds propagate. Working from this theoretical underpinning, scientists designed algorithms to create 3-D sound models: software routines that synthesize audio that seems realistic because the volume and direction of the sound change depending on where the action occurs relative to the listener. Until now, the best algorithms for creating 3-D sound models relied on the boundary element method (BEM), a slow process that was too costly for commercial use.
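To make the idea of position-dependent volume and direction concrete, here is a minimal sketch. It is not the Stanford algorithm and not BEM. It assumes an idealized point source in open air, whose pressure field p = exp(ikr) / (4πr) is the textbook free-space solution of the Helmholtz equation. The function names, the 0.18 m ear spacing and the 343 m/s speed of sound are illustrative assumptions, not details from the researchers' work.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def monopole_at_ear(source, ear, frequency_hz):
    """Idealized point source heard at one ear position.

    Uses the free-space solution p = exp(i*k*r) / (4*pi*r) of the
    Helmholtz equation and returns (amplitude, delay_seconds, phase_radians).
    """
    r = math.dist(source, ear)                          # source-to-ear distance
    k = 2.0 * math.pi * frequency_hz / SPEED_OF_SOUND   # wavenumber
    amplitude = 1.0 / (4.0 * math.pi * r)               # louder when closer
    delay = r / SPEED_OF_SOUND                          # arrives sooner when closer
    phase = (k * r) % (2.0 * math.pi)
    return amplitude, delay, phase


def stereo_cues(source, head_center=(0.0, 0.0, 0.0),
                ear_spacing=0.18, frequency_hz=1000.0):
    """Return the (left, right) ear cues for a source position.

    Hypothetical helper: ears sit on the x-axis, ear_spacing metres apart.
    """
    cx, cy, cz = head_center
    left_ear = (cx - ear_spacing / 2.0, cy, cz)
    right_ear = (cx + ear_spacing / 2.0, cy, cz)
    left = monopole_at_ear(source, left_ear, frequency_hz)
    right = monopole_at_ear(source, right_ear, frequency_hz)
    return left, right


# A coin landing to the listener's right is louder and earlier in the right ear.
left, right = stereo_cues(source=(1.0, 0.0, 1.0))
```

A real object such as a rolling coin radiates sound in a far more complicated pattern than a single point source, which is why a precomputed sound model is needed for each object in the first place.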

James and his graduate student collaborator, Jui-Hsien Wang, developed an algorithm that calculates sound models hundreds to thousands of times faster by avoiding the Helmholtz equation and BEM. Their approach is inspired by the 20th-century Austrian composer Fritz Heinrich Klein, who found a way to blend a great many piano tones and notes into a single, pleasant sound known as the Mother Chord. The scientists, who named their algorithm KleinPAT in a nod to Klein, explain how their approach creates sound models in a scientific paper they are presenting at the ACM SIGGRAPH 2019 conference on computer graphics and interactive techniques. "We think this is a game changer for interactive environments," James said.