Google Duo audio boost won't leave you hanging on the phone

"It's good to hear your voice, you know it's been so long
If I don't get your calls, then everything goes wrong…
Your voice across the line gives me a strange sensation"
— Blondie, "Hanging on the Telephone"

In 1978, Debbie Harry propelled her new wave band Blondie to the top of the charts with a plaintive tale of yearning to hear her boyfriend's voice from afar and insisting he not leave her "hanging on the telephone."

But the question arises: What if it were 2020 and she were speaking over VoIP with intermittent packet losses, audio jitter, network delays and out-of-sequence packet transmissions?

We'll never know.

But Google this week announced details of a new technology for its popular Duo voice and video app that will help ensure smoother voice transmissions and reduce momentary gaps that sometimes mar internet-based connections. We'd like to think Debbie would approve.

We've all experienced internet audio jitter. It occurs when one or more of the data packets that make up an audio stream are delayed or shuffled out of order between caller and listener. Methods employing voice packet buffers and artificial intelligence generally can smooth over jitter of 20 milliseconds or less. But the interruptions become more noticeable when the missing packets add up to 60 milliseconds or more.
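To make the idea of shuffled and missing packets concrete, here is a minimal sketch of how a receiver might put packets back in order and measure the longest gap. It is an illustration only, not a description of how Duo works: the `Packet` class, the function names and the 20-millisecond frame length per packet are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

FRAME_MS = 20  # assumed audio duration carried by one packet


@dataclass
class Packet:
    seq: int        # sequence number assigned by the sender (hypothetical field)
    payload: bytes  # encoded audio for one frame


def reorder(packets: List[Packet]) -> List[Optional[Packet]]:
    """Return packets in sequence order, with None marking each missing slot."""
    if not packets:
        return []
    by_seq: Dict[int, Packet] = {p.seq: p for p in packets}
    first, last = min(by_seq), max(by_seq)
    return [by_seq.get(s) for s in range(first, last + 1)]


def longest_gap_ms(ordered: List[Optional[Packet]]) -> int:
    """Length in milliseconds of the longest run of consecutive missing packets."""
    longest = run = 0
    for slot in ordered:
        run = run + 1 if slot is None else 0
        longest = max(longest, run)
    return longest * FRAME_MS


# Packets 2 and 3 arrive swapped; packets 4, 5 and 6 never arrive.
arrived = [Packet(1, b"a"), Packet(3, b"c"), Packet(2, b"b"), Packet(7, b"g")]
ordered = reorder(arrived)
gap = longest_gap_ms(ordered)
```

Under these assumptions, three consecutive lost packets add up to a 60-millisecond gap, the point at which the article says interruptions become more noticeable.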

Google says virtually all calls experience some data packet loss: one-fifth of all calls lose 3 percent of their audio and one-tenth lose 8 percent.

This week, Google researchers at the DeepMind division reported that they have begun using a program called WaveNetEQ to address these issues. The algorithm excels at filling in momentary sound gaps with synthesized but natural-sounding speech elements. Relying on a voluminous library of speech data, WaveNetEQ fills in sound gaps of up to 120 milliseconds. This technique of filling in for lost audio is called packet loss concealment (PLC).
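For readers curious what packet loss concealment looks like in its simplest form, the sketch below fills each missing frame with a quieter copy of the frame before it. This is a deliberately naive stand-in and not WaveNetEQ, which synthesizes new speech with a trained generative model. The function name, the `fade` parameter and the representation of audio as lists of floating-point samples are assumptions made for illustration.

```python
from typing import List, Optional

Frame = List[float]  # one frame of audio samples (assumed representation)


def conceal(frames: List[Optional[Frame]], fade: float = 0.5) -> List[Frame]:
    """Replace each missing frame (None) with a faded copy of the previous output frame.

    If audio is missing before any frame has been received, silence is used instead.
    """
    # Use the first received frame to learn the frame size for leading silence.
    size = next((len(f) for f in frames if f is not None), 0)
    out: List[Frame] = []
    for frame in frames:
        if frame is not None:
            out.append(list(frame))
        elif out:
            # Repeat the last output frame at reduced volume so long gaps fade out.
            out.append([sample * fade for sample in out[-1]])
        else:
            out.append([0.0] * size)
    return out


# Two consecutive frames are lost in the middle of the stream.
received = [[1.0, -1.0], None, None, [0.5, 0.5]]
repaired = conceal(received)
```

Repeating and fading the last frame is cheap, but it tends to sound robotic over longer gaps, which is the kind of artifact a generative approach aims to avoid.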

"WaveNetEQ is a generative model based on DeepMind's WaveRNN technology," Google's AI Blog reported April 1, "that is trained using a large corpus of speech data to realistically continue short speech segments enabling it to fully synthesize the raw waveform of missing speech."

The program analyzed sounds from 100 speakers in 48 languages, zeroing in on "the characteristics of human speech in general, instead of the properties of a specific language," the report explained.

In addition, the sound analysis was tested against a wide variety of background noise to help ensure it works accurately for speakers on busy city sidewalks, in train stations or in cafeterias.

All WaveNetEQ processing must run on the receiver's phone so that encryption services are not compromised. But the extra demand on processing speed is minimal, Google asserts. WaveNetEQ is "fast enough to run on a phone, while still providing state-of-the-art audio quality and more natural sounding PLC than other systems currently in use."

Sound samples illustrating audio jitter and the improvement with WaveNetEQ are posted in the Google AI Blog report.

More information: ai.googleblog.com/2020/04/impr … ity-in-duo-with.html
