Physiological-physical feature fusion for automatic voice spoofing detection

The proposed model structure. Credit: Frontiers Journals

Biometric speech recognition systems are often subject to various spoofing attacks, the most common of which are speech synthesis and voice conversion attacks. Such attacks can cause the system to incorrectly accept spoofed speech as genuine, which compromises its security. Researchers have made many efforts to address this problem, but existing voice spoofing detection methods consider only the physical features of speech, which results in poor detection performance.

To solve the problem, a research team led by Junxiao Xue published their new research on March 27, 2023 in Frontiers of Computer Science.

The team proposed a spoofing detection method based on physiological-physical feature fusion. The method includes a feature extractor, a densely connected network with squeeze-and-excitation blocks (SE-DenseNet), and a feature fusion strategy. Compared to existing methods, the tandem detection cost function (t-DCF) and equal error rate (EER) scores improved by 5% and 7% respectively.

Specifically, physiological features in the audio were first extracted using a pre-trained convolutional network. SE-DenseNet was then used to extract the physical features. Such a densely connected model has high parameter efficiency, and the squeeze-and-excitation blocks enhance the efficiency of feature transmission. Finally, the two kinds of features were integrated into the classification network for voice spoofing detection.
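
As a rough illustration of the squeeze-and-excitation idea and the fusion step described above, the following minimal Python sketch reweights the channels of a feature map and then joins two feature vectors. It is not the team's implementation: the function names, the toy weights, and the use of plain concatenation as the fusion strategy are all assumptions made for illustration.

```python
import math


def squeeze_excite(feature_map, w1, w2):
    """Reweight channels of a feature map (hypothetical helper).

    feature_map: list of C channels, each a flat list of activations.
    w1: R x C weight matrix of the bottleneck layer (assumed values).
    w2: C x R weight matrix of the gating layer (assumed values).
    """
    # Squeeze: summarize each channel by its global average.
    z = [sum(ch) / len(ch) for ch in feature_map]
    # Excitation: bottleneck layer with ReLU, then gating layer with sigmoid.
    hidden = [max(0.0, sum(w * zi for w, zi in zip(row, z))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every activation in a channel by that channel's gate.
    return [[g * v for v in ch] for g, ch in zip(gates, feature_map)]


def fuse(physiological, physical):
    """Fuse two feature vectors by concatenation (an assumed strategy)."""
    return list(physiological) + list(physical)


# Toy example: two channels, one hidden unit, zero bottleneck weights,
# so every gate equals sigmoid(0) = 0.5.
scaled = squeeze_excite([[1.0, 3.0], [2.0, 2.0]], [[0.0, 0.0]], [[1.0], [1.0]])
fused = fuse([0.1, 0.2], [0.3])
```

In a real network the weights would be learned, and the fused vector would be passed to the classification network that decides whether the speech is genuine or spoofed.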

They compared the proposed model with some of the best single systems. The experiments showed that the proposed model performed better on both EER and t-DCF. To validate the effectiveness of the face features, they also evaluated several baseline models augmented with face features. The different baseline methods showed varying degrees of performance improvement when combined with the face features, indicating that the face features are also useful for the baseline models.
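
For readers unfamiliar with the equal error rate, the sketch below shows one simple way to estimate it from detection scores. It assumes that higher scores mean "more likely genuine" and uses a plain threshold sweep. The function name and the toy scores are hypothetical, and the paper's exact evaluation procedure may differ.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Estimate the EER by sweeping thresholds over the observed scores.

    A trial is accepted when its score is at or above the threshold.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False acceptance rate: spoofed trials that pass the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # False rejection rate: genuine trials that fall below it.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        gap = abs(far - frr)
        # Keep the operating point where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores, invented for illustration only.
eer_overlap = equal_error_rate([0.9, 0.6, 0.4, 0.8], [0.1, 0.5, 0.7, 0.2])
eer_separable = equal_error_rate([0.9, 0.8], [0.1, 0.2])
```

A lower EER means the detector makes fewer mistakes at the point where false acceptances and false rejections are balanced.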

Future work may attempt to extract more accurate face features and study more effective feature fusion strategies to detect spoofing attacks.

More information: Junxiao Xue et al, Physiological-physical feature fusion for automatic voice spoofing detection, Frontiers of Computer Science (2022). DOI: 10.1007/s11704-022-2121-6

Provided by Frontiers Journals
Citation: Physiological-physical feature fusion for automatic voice spoofing detection (2023, May 15) retrieved 1 October 2023 from
