August 13, 2018 feature

Using multi-task learning for low-latency speech translation

by Ingrid Fadelli, Tech Xplore

Researchers from the Karlsruhe Institute of Technology (KIT), in Germany, have recently applied multi-task machine learning to low-latency neural speech translation. Their study, which was pre-published on arXiv, addresses some of the limitations of existing neural machine translation (NMT) techniques.

Advances in the field of deep learning have led to significant improvements in human speech and text translation. NMT, a widely used approach to machine translation, trains a large neural network to read a sentence and output its translation, generally modeling entire sentences within a single integrated model.

Compared to traditional approaches, such as rule-based or statistical machine translation, NMT typically achieves more fluent translations, both for speech and written text. It can effectively capture more complex dependencies between source and target languages, but it requires substantial amounts of training data to perform well consistently.

"When applying partial sentence translation to neural machine translation systems, we encounter the problem that the MT system has only been trained on complete sentences, and thus the decoder is biased to generate complete target sentences," the researchers wrote in their paper. "When receiving inputs which are partial sentences, the translation outputs are not guaranteed to exactly match with the input content. We observe that the translation is often 'fantasized' by the model to be a full sentence, as would have occurred in the training data."

In other instances, the decoder can fall into an over-generation state, repeating the last word that was fed to it several times in its translation. To address these issues, the KIT researchers focused on speech translation in cases in which an NMT system needs to provide an initial translation in real time, before a speaker has finished their sentence.

"In this work, we aim to remedy the problem of partial sentence translation in NMT," the researchers wrote. "Ideally, we want a model that is able to generate appropriate translations for incomplete sentences, without any compromise during other translation use cases."

As datasets with partial sentences are not readily available, the researchers created artificial data that could be used in the training process. They trained the network using multi-task learning, a deep learning strategy that has often been used in natural language processing (NLP) to train a single model for several tasks, reducing costs and improving performance.
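To give a rough idea of how artificial partial-sentence data might be produced and mixed with full sentences, here is a minimal Python sketch. It is not taken from the paper: the proportional truncation rule, the function names `truncate_pair` and `build_mixed_corpus`, and the "full"/"partial" task labels are all assumptions made for illustration.

```python
import random


def truncate_pair(source, target, k):
    """Return the first k source words and a proportional target prefix."""
    src = source.split()
    tgt = target.split()
    if not 1 <= k <= len(src):
        raise ValueError("k must be between 1 and the source length")
    # Assumption: the target prefix covers the same fraction of the sentence
    # as the source prefix, and at least one target word is kept.
    tgt_len = max(1, (k * len(tgt)) // len(src))
    return " ".join(src[:k]), " ".join(tgt[:tgt_len])


def build_mixed_corpus(pairs, seed=0):
    """Mix full sentence pairs with one artificial partial pair each."""
    rng = random.Random(seed)
    examples = []
    for source, target in pairs:
        # Keep the original pair so full-sentence translation is still trained.
        examples.append(("full", source, target))
        n = len(source.split())
        if n >= 2:
            # Cut at a random point strictly inside the sentence.
            k = rng.randint(1, n - 1)
            examples.append(("partial", *truncate_pair(source, target, k)))
    rng.shuffle(examples)
    return examples


if __name__ == "__main__":
    corpus = [("ich gehe heute nach hause", "i am going home today")]
    for task, src, tgt in build_mixed_corpus(corpus):
        print(task, "|", src, "|", tgt)
```

Cutting the target at the same relative position as the source is only a crude stand-in for word alignment, since word order differs between languages. Keeping the full pairs in the same training set is what lets a single model serve both tasks.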

Their study achieved promising results, suggesting that NMT systems could be adapted to perform well even in cases where task-specific data is not available, without losing performance on the original task they were trained for. "We first showed that simple techniques to generate artificial data are effective to get more fluent output with less correction," the researchers concluded in their paper. "We also illustrated that multi-task learning can help adapt the model to the new inference condition, without losing the original capability to translate full sentences."

Their adaptation of NMT achieved high-quality translations at low latency, reducing the number of corrected words by 45 percent without a decrease in translation quality. In the future, their study could have meaningful practical implications, helping to develop better tools for real-time speech translation.
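The idea of "corrected words" can be made concrete with a small Python sketch. It assumes one plausible definition, which may differ from the exact metric used in the paper: each time the system replaces its displayed translation, every previously shown word that does not survive as part of the shared prefix counts as one correction. The function names are hypothetical.

```python
def common_prefix_len(a, b):
    """Number of leading words that two word lists share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def count_corrections(outputs):
    """Count displayed words that later updates had to overwrite.

    `outputs` is the sequence of translations shown to the user as more
    of the spoken sentence arrives (assumed whitespace-tokenized).
    """
    total = 0
    prev = []
    for text in outputs:
        cur = text.split()
        # Words of the previous output beyond the shared prefix were changed
        # or removed, so each one counts as a correction.
        total += len(prev) - common_prefix_len(prev, cur)
        prev = cur
    return total


if __name__ == "__main__":
    shown = ["I", "I think", "I thought that", "I thought that was"]
    print(count_corrections(shown))
```

In this example only "think" is overwritten (by "thought"), so the sequence contains a single correction. A model that "fantasizes" full sentences from partial input would tend to show many more words that are later replaced.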

More information: Low-Latency Neural Speech Translation, arXiv: 1808.00491v1 [cs.CL]. arxiv.org/abs/1808.00491

Abstract
Through the development of neural machine translation, the quality of machine translation systems has been improved significantly. By exploiting advancements in deep learning, systems are now able to better approximate the complex mapping from source sentences to target sentences. But with this ability, new challenges also arise. An example is the translation of partial sentences in low-latency speech translation. Since the model has only seen complete sentences in training, it will always try to generate a complete sentence, though the input may only be a partial sentence. We show that NMT systems can be adapted to scenarios where no task-specific training data is available. Furthermore, this is possible without losing performance on the original training data. We achieve this by creating artificial data and by using multi-task learning. After adaptation, we are able to reduce the number of corrections displayed during incremental output construction by 45%, without a decrease in translation quality.

Journal information: arXiv

Citation: Using multi-task learning for low-latency speech translation (2018, August 13) retrieved 29 June 2024 from https://techxplore.com/news/2018-08-multi-task-low-latency-speech.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
