Supervised speech enhancement approach improves quality of voice communication

For voice communication, it is important to suppress background noise without introducing unnatural distortion. Deep learning-based speech enhancement approaches can effectively suppress background noise components.

However, under noise-mismatched conditions, where the noise encountered in use differs from the noise seen during training, these approaches produce unnatural residual noise that seriously degrades listening comfort.

Recently, researchers from the Institute of Acoustics of the Chinese Academy of Sciences (IACAS) proposed a supervised speech enhancement approach with residual noise control for voice communication.

By artificially maintaining a low level of residual noise, the researchers aimed to jointly maximize noise reduction and minimize speech distortion, leading to better perceptual comfort in the enhanced speech.

To address common shortcomings of existing loss functions, the researchers introduced multiple adjustable hyperparameters and derived a generalized loss function.

They then selected suitable parameter configurations, which let the enhanced speech strike a flexible and effective balance between the two objectives. Meanwhile, by introducing low-level residual noise, they improved the subjective perceptual quality.
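To make the idea of a tunable trade-off concrete, here is a minimal sketch in Python. It is not the loss function derived in the paper. It is a simplified illustration that assumes a per-frequency gain is applied separately to the speech and noise magnitudes. The function name `tradeoff_loss` and the parameters `alpha` (weight on speech distortion) and `beta` (target fraction of noise left in the output) are hypothetical names chosen for this example.

```python
def tradeoff_loss(gain, speech, noise, alpha=0.5, beta=0.1):
    """Illustrative trade-off loss (an assumption, NOT the paper's formula).

    gain   -- per-bin gain applied by the enhancer
    speech -- per-bin clean speech magnitude
    noise  -- per-bin noise magnitude
    alpha  -- hypothetical weight on the speech distortion term (0..1)
    beta   -- hypothetical target fraction of noise kept as residual noise
    """
    if not gain or not (len(gain) == len(speech) == len(noise)):
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    for g, s, n in zip(gain, speech, noise):
        # How far the processed speech is from the clean speech.
        speech_distortion = (g * s - s) ** 2
        # How far the remaining noise is from the desired low-level floor.
        residual_noise_error = (g * n - beta * n) ** 2
        total += alpha * speech_distortion + (1.0 - alpha) * residual_noise_error
    return total / len(gain)


if __name__ == "__main__":
    speech = [1.0, 0.5, 0.0]
    noise = [0.2, 0.2, 0.2]
    for alpha in (0.2, 0.5, 0.8):
        loss = tradeoff_loss([0.6, 0.6, 0.6], speech, noise, alpha=alpha)
        print(f"alpha={alpha}: loss={loss:.4f}")
```

In this sketch, raising `alpha` penalizes speech distortion more heavily, while a nonzero `beta` rewards leaving a small, even noise floor instead of removing noise entirely.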

Experimental results showed that, with suitable parameter configurations, the enhanced speech outperformed previous approaches in both objective metrics and subjective evaluations.

This work could be used for noise suppression and speech information extraction in speech communication devices.

The study, published in Applied Sciences, was supported by the National Natural Science Foundation of China.

More information: Andong Li et al., A Supervised Speech Enhancement Approach with Residual Noise Control for Voice Communication, Applied Sciences (2020). DOI: 10.3390/app10082894
Citation: Supervised speech enhancement approach improves quality of voice communication (2020, July 23) retrieved 24 May 2022 from