Researchers propose high-density surface electromyography technique for automatic speech recognition

Distribution of the high-density sEMG electrodes on the left and right sides of the face and neck: (left) four main parts of the face; (right) symmetrical arrangement. Credit: CHEN Shixiong

Verbal communication is an important way to engage in social interactions. Normal speech requires the coordinated contraction of many articulatory muscles in the face and neck.

Surface electromyography (sEMG) signals, which carry electrophysiological information associated with speaking, are often considered an alternative input for automatic speech recognition.

A research team led by Prof. Chen Shixiong from the Shenzhen Institutes of Advanced Technology (SIAT) of the Chinese Academy of Sciences proposed a high-density (HD) sEMG technique using dense arrays of individual electrodes to acquire muscle activities over a relatively large area with a rich set of information for adequate motion classification.

In an sEMG-based speech recognition system, the locations of the electrodes used to record the sEMG signals are the main factor affecting classification performance. In previous studies, however, electrode placement depended on the knowledge of individual researchers, without prior quantitative analysis or a benchmark standard.

Chen's team compared the contributions of sEMG signals from the left and right sides of the facial and neck muscles when classifying everyday words in speaking tasks in English and in Chinese.

In their study, HD sEMG signals were recorded from the facial and neck muscles of eight subjects using surface electrodes with 120 channels.

Classification accuracies for the speaking tasks were obtained using recordings from the electrode arrays on the left and right sides of the facial and neck muscles, and were compared with the accuracies obtained from the HD sEMG signals.

The results showed that HD sEMG recordings from the left and right sides of the neck yielded similar classification accuracies. In contrast, classification accuracies differed significantly between signals from the left and right facial muscles.
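
The left-versus-right comparison described above can be illustrated with a minimal sketch. This is not the team's method: the article does not name the features or the classifier they used, so the nearest-centroid classifier, the synthetic per-channel features, the channel counts and every function and parameter name below (`make_trials`, `centroid_accuracy`, `left_sep`, `right_sep`) are assumptions made for illustration. The data are randomly generated and do not reproduce the study's findings.

```python
import random

def make_trials(n_per_class, n_classes, left_sep, right_sep, n_ch_per_side, rng):
    """Generate synthetic trials: one feature per channel (hypothetical, not real sEMG).

    left_sep / right_sep are illustrative knobs controlling how well the
    left-side and right-side channels separate the word classes.
    """
    trials = []
    for label in range(n_classes):
        for _ in range(n_per_class):
            left = [rng.gauss(label * left_sep, 1.0) for _ in range(n_ch_per_side)]
            right = [rng.gauss(label * right_sep, 1.0) for _ in range(n_ch_per_side)]
            trials.append((left + right, label))
    rng.shuffle(trials)
    return trials

def centroid_accuracy(train, test, channels):
    """Nearest-centroid accuracy using only the given channel indices."""
    sums, counts = {}, {}
    for feats, label in train:
        sub = [feats[c] for c in channels]
        if label not in sums:
            sums[label] = [0.0] * len(channels)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], sub)]
        counts[label] += 1
    centroids = {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

    correct = 0
    for feats, label in test:
        sub = [feats[c] for c in channels]
        # Predict the class whose centroid is closest in squared Euclidean distance.
        pred = min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(sub, centroids[lab])),
        )
        correct += pred == label
    return correct / len(test)

rng = random.Random(0)        # fixed seed so the sketch is repeatable
N_SIDE = 8                    # assumed channels per side (the study used 120 in total)
trials = make_trials(40, 4, left_sep=1.0, right_sep=1.0, n_ch_per_side=N_SIDE, rng=rng)
train, test = trials[:120], trials[120:]

left_channels = list(range(N_SIDE))
right_channels = list(range(N_SIDE, 2 * N_SIDE))
all_channels = left_channels + right_channels

acc_left = centroid_accuracy(train, test, left_channels)
acc_right = centroid_accuracy(train, test, right_channels)
acc_all = centroid_accuracy(train, test, all_channels)
print(f"left: {acc_left:.2f}  right: {acc_right:.2f}  all channels: {acc_all:.2f}")
```

Running the same classifier on each channel subset and on the full set keeps everything except electrode location fixed, so any gap between the accuracies can be attributed to the channels themselves. A real analysis would use recorded sEMG features, cross-validation and a statistical test across subjects.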

"The HD sEMG signals from symmetrical positions in the neck are consistent in their contribution to speech recognition, whereas facial signals are not," said Prof. Chen.

"The proposed HD sEMG technique can determine the appropriate placement of electrodes used for automatic speech recognition, which might provide a potential tool for reducing the electrode number and selecting the optimal location of channels for speech recognition."

Provided by Chinese Academy of Sciences