This article has been reviewed according to Science X's editorial process and policies. Editors have highlighted the following attributes while ensuring the content's credibility: fact-checked, peer-reviewed publication, trusted source, proofread.

Meta's AI can translate dozens of under-resourced languages

Architecture of the LASER3 teacher-student approach. Credit: Nature (2024). DOI: 10.1038/s41586-024-07335-x

The technology behind Meta's artificial intelligence model, which can translate 200 different languages, is described in a paper published in Nature. The model expands the number of languages that can be translated via machine translation.

Neural machine translation models use artificial neural networks to translate between languages. These models typically need large amounts of data available online to train on, which may not be publicly, cheaply, or commonly available for some languages, termed "low-resource languages." Expanding the number of languages a model translates can also degrade the quality of its translations.

Marta Costa-jussà and the No Language Left Behind (NLLB) team have developed a cross-language approach that allows neural models to learn to translate low-resource languages using their pre-existing ability to translate high-resource languages.
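The figure caption above refers to a teacher-student setup (LASER3). The following is a minimal illustrative sketch of that general idea, not Meta's actual code: a fixed "teacher" encoder embeds high-resource sentences, and a "student" encoder for a low-resource language is trained so that parallel sentences land near the teacher's embeddings. All encoders and data here are toy linear models and random vectors.

```python
# Toy teacher-student distillation: the student learns to map the
# low-resource side of parallel sentence pairs onto the teacher's
# embeddings of the high-resource side (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

DIM_IN, DIM_EMB, N_PAIRS = 8, 4, 64

# Fixed teacher encoder (stands in for a pretrained multilingual encoder).
W_teacher = rng.normal(size=(DIM_EMB, DIM_IN))

# Toy "sentences": paired feature vectors; the low-resource side is a
# noisy copy of the high-resource side, mimicking parallel text.
X_high = rng.normal(size=(N_PAIRS, DIM_IN))
X_low = X_high + 0.1 * rng.normal(size=(N_PAIRS, DIM_IN))

targets = X_high @ W_teacher.T  # teacher embeddings of the high-resource side

# Train the student by gradient descent on the mean-squared embedding distance.
W_student = rng.normal(size=(DIM_EMB, DIM_IN))
lr = 0.01

def mse(W):
    return float(np.mean((X_low @ W.T - targets) ** 2))

loss_before = mse(W_student)
for _ in range(500):
    diff = X_low @ W_student.T - targets     # (N_PAIRS, DIM_EMB)
    grad = 2.0 * diff.T @ X_low / N_PAIRS    # gradient of the squared error
    W_student -= lr * grad
loss_after = mse(W_student)

print(f"student loss: {loss_before:.3f} -> {loss_after:.3f}")
```

Once trained this way, a student encoder places low-resource sentences in the same embedding space as high-resource ones, which is what lets translation ability transfer between them.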

As a result, the researchers have developed an online multilingual tool, called NLLB-200, that includes 200 languages, contains three times as many low-resource languages as high-resource languages, and performs 44% better than pre-existing systems.

Given that the researchers had access to only 1,000–2,000 samples for many low-resource languages, they used a language identification system to find more instances of those languages and so increase the volume of training data for NLLB-200. The team also mined bilingual textual data from Internet archives, which helped improve the quality of the translations NLLB-200 provided.
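Mining bilingual text typically means embedding sentences from both languages into a shared space and matching likely translation pairs. Below is a hedged sketch of one common technique for this, margin-based scoring over cosine similarities (the data are toy random vectors standing in for real sentence embeddings, and this is not the NLLB team's implementation):

```python
# Toy margin-based bitext mining: score candidate sentence pairs by cosine
# similarity normalised by each side's nearest-neighbour similarities,
# which down-weights "hub" sentences that are close to everything.
import numpy as np

rng = np.random.default_rng(1)

def normalize(M):
    return M / np.linalg.norm(M, axis=1, keepdims=True)

# Toy "sentence embeddings": 5 source sentences; the targets are a shuffled,
# slightly noisy copy, so each source has exactly one true translation.
src = normalize(rng.normal(size=(5, 16)))
perm = rng.permutation(5)
tgt = normalize(src[perm] + 0.05 * rng.normal(size=(5, 16)))

sim = src @ tgt.T                # cosine similarities (rows: source sentences)
k = 3

# Mean similarity of each sentence's k nearest neighbours on the other side.
nn_src = np.sort(sim, axis=1)[:, -k:].mean(axis=1, keepdims=True)
nn_tgt = np.sort(sim, axis=0)[-k:, :].mean(axis=0, keepdims=True)
margin = sim / ((nn_src + nn_tgt) / 2.0)

mined = margin.argmax(axis=1)    # best-scoring target for each source sentence
print("recovered alignment:", mined.tolist())
print("true alignment:     ", np.argsort(perm).tolist())
```

In a real pipeline the embeddings would come from a multilingual sentence encoder and the candidate pool would contain millions of web sentences, with a threshold on the margin score deciding which pairs are kept as training data.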

The authors note that this tool could help people who speak rarely translated languages access the Internet and other technologies. Additionally, they highlight education as a particularly significant application, as the model could help speakers of low-resource languages access more books and research articles. However, Costa-jussà and co-authors acknowledge that mistranslations may still occur.

More information: Scaling neural machine translation to 200 languages, Nature (2024). DOI: 10.1038/s41586-024-07335-x

Journal information: Nature
Citation: Meta's AI can translate dozens of under-resourced languages (2024, June 7) retrieved 22 June 2024 from https://techxplore.com/news/2024-06-meta-ai-dozens-resourced-languages.html
This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
