Microsoft announced Wednesday that its researchers have developed an AI machine translation system that can translate from Chinese to English as accurately as a human can. The researchers work at Microsoft labs in Asia and the U.S.
Microsoft regards this as a historic milestone in neural machine translation: the system has reached human parity for Chinese-to-English translation.
Well, it is no small feat. In translation, there is no single "right" answer, since the same thought can be relayed in more than one way. On the other hand, we know what it is like to click on "English" on a document not in English, or click on "translate this page," only to discover unintelligible strings of English words that one simply cannot use.
"Hitting human parity in a machine translation task is a dream that all of us have had," said Xuedong Huang, technical fellow in charge of Microsoft's speech, natural language and machine translation efforts. Huang was quoted in The AI Blog (an official Microsoft blog).
What backs up their claim? According to Microsoft, the team used an industry-standard test set of news stories to compare human and machine translation results.
Not only that: the team hired a group of bilingual human evaluators, who were asked to compare the results against a different set of human-produced translations.
So what made their attempt successful? The key phrase appears to be deep neural networks, a method of training AI systems.
The advantage is that you get more fluent, natural-sounding translations.
"Much of our research is really inspired by how we humans do things," said Tie-Yan Liu, a principal research manager with Microsoft Research Asia in Beijing.
In The AI Blog, Allison Linn named and described their techniques: dual learning, deliberation networks, joint training, and agreement regularization.
Dual learning works as a kind of fact-checking of the system's own output: every time the researchers sent a sentence through the system to be translated from Chinese to English, they also translated it back from English to Chinese. The advantage is that "it allowed the system to refine and learn from its own mistakes."
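The round-trip idea can be illustrated with a minimal Python sketch. It is not Microsoft's implementation: the word-for-word lookup tables, the function names and the scoring rule are all assumptions made for illustration, standing in for neural models and for the training signal a real system would derive from the round trip.

```python
# Toy word-for-word "models" (hypothetical); real systems are neural networks.
ZH_TO_EN = {"我": "I", "喜欢": "like", "茶": "tea"}
EN_TO_ZH = {"I": "我", "like": "喜欢", "tea": "茶"}


def translate(tokens, table):
    """Translate token by token, leaving unknown tokens unchanged."""
    return [table.get(tok, tok) for tok in tokens]


def round_trip_score(source_tokens, forward, backward):
    """Fraction of source tokens recovered after translating there and back."""
    english = translate(source_tokens, forward)
    recovered = translate(english, backward)
    matches = sum(1 for a, b in zip(source_tokens, recovered) if a == b)
    return matches / len(source_tokens)


sentence = ["我", "喜欢", "茶"]
good_score = round_trip_score(sentence, ZH_TO_EN, EN_TO_ZH)

# A faulty backward table maps "tea" to the wrong word, so one token is lost.
FAULTY_EN_TO_ZH = {"I": "我", "like": "喜欢", "tea": "咖啡"}
bad_score = round_trip_score(sentence, ZH_TO_EN, FAULTY_EN_TO_ZH)

print(good_score, bad_score)
```

A low round-trip score flags a translation pair that one of the two directions got wrong, which is the kind of signal a system could learn from.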
With deliberation networks, "the researchers taught the system to repeat the process of translating the same sentence over and over, gradually refining and improving the responses."
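The repeated-refinement idea can be sketched in a few lines of Python. This is a loose analogy under stated assumptions, not the researchers' method: the rewrite rules, `refine` and `deliberate` are hypothetical names, and a real deliberation network refines a draft with a trained decoder rather than with hand-written rules.

```python
# Hypothetical polishing rules standing in for a learned refinement model.
REWRITE_RULES = [
    ("very very", "very"),
    ("is be", "is"),
    ("a apple", "an apple"),
]


def refine(draft):
    """One polishing pass: apply the first rewrite rule that matches."""
    for old, new in REWRITE_RULES:
        if old in draft:
            return draft.replace(old, new)
    return draft


def deliberate(draft, max_passes=5):
    """Keep refining the draft until a pass changes nothing or passes run out."""
    passes_used = 0
    for _ in range(max_passes):
        improved = refine(draft)
        if improved == draft:
            break
        draft = improved
        passes_used += 1
    return draft, passes_used


final_text, passes_used = deliberate("this is be a apple")
print(final_text, passes_used)
```

Each pass starts from the previous draft rather than from scratch, which is the point of the technique: later passes can fix what earlier ones left behind.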
In joint training, the English-to-Chinese translation system translates new English sentences into Chinese in order to obtain new sentence pairs. Those pairs are then used to augment the training dataset for the opposite direction, Chinese to English. The same procedure is then applied in the other direction. As the two systems converge, the performance of both improves.
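The data flow in joint training can be sketched as follows. This is a toy under stated assumptions: `train` here merely memorizes a word-level lookup from aligned sentence pairs, and the sentences, variable names and two-round loop are invented for illustration. A real system would retrain neural models on the augmented data.

```python
def train(pairs):
    """Hypothetical 'training': memorize a word-level lookup from aligned pairs."""
    table = {}
    for src, tgt in pairs:
        for s, t in zip(src.split(), tgt.split()):
            table.setdefault(s, t)
    return lambda sentence: " ".join(table.get(w, w) for w in sentence.split())


# Small parallel corpus plus extra sentences that exist in one language only.
zh_en_pairs = [("我 喜欢 茶", "I like tea"), ("你 喜欢 书", "you like books")]
en_zh_pairs = [(en, zh) for zh, en in zh_en_pairs]
mono_en = ["you like tea"]
mono_zh = ["我 喜欢 书"]

zh_to_en = train(zh_en_pairs)
en_to_zh = train(en_zh_pairs)

for _ in range(2):
    # Each direction produces synthetic pairs for the opposite direction.
    synthetic_zh_en = [(en_to_zh(en), en) for en in mono_en]
    synthetic_en_zh = [(zh_to_en(zh), zh) for zh in mono_zh]
    # Both systems are retrained on their original plus synthetic data.
    zh_to_en = train(zh_en_pairs + synthetic_zh_en)
    en_to_zh = train(en_zh_pairs + synthetic_en_zh)

print(synthetic_zh_en, synthetic_en_zh)
```

The sketch shows only how each system feeds the other new training pairs; the quality gains reported for the real system come from what neural models learn from that extra data.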
In agreement regularization, the translation can be generated by having the system read the sentence from left to right or from right to left. If both directions produce the same translation, the result is considered more trustworthy than if they disagree.
"Machine translation is much more complex than a pure pattern recognition task," said Ming Zhou of Microsoft Research Asia. "People can use different words to express the exact same thing, but you cannot necessarily say which one is better."
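The agreement check behind agreement regularization, comparing a left-to-right output with a right-to-left one, can be sketched in Python. The two candidate outputs below are made up, and `agreement` is a hypothetical helper; in a real system the two translations would come from two decoders, and their agreement would shape training rather than act as a simple after-the-fact check.

```python
def agreement(l2r_tokens, r2l_tokens):
    """Share of positions where the two outputs contain the same token."""
    longest = max(len(l2r_tokens), len(r2l_tokens))
    if longest == 0:
        return 1.0
    same = sum(1 for a, b in zip(l2r_tokens, r2l_tokens) if a == b)
    return same / longest


# Made-up outputs of a left-to-right and a right-to-left decoder.
l2r_output = "the bank raised interest rates".split()
r2l_matching = "the bank raised interest rates".split()
r2l_differing = "the shore raised interest rates".split()

full_agreement = agreement(l2r_output, r2l_matching)
partial_agreement = agreement(l2r_output, r2l_differing)

print(full_agreement, partial_agreement)
```

When the score falls below 1.0, the two reading directions disagree somewhere in the sentence, and the translation deserves less trust.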
A discussion about "neural machine translation" technologies appears in the research paper, "Achieving Human Parity on Automatic Chinese to English News Translation."
The authors said their evaluation found that their system reached parity with professional human translations on the WMT 2017 Chinese to English news task.
So, is the work on such a neural machine translation system over? Are human translators about to become irrelevant?
Liu, according to The AI Blog, said that nobody knows if machine translation systems will ever be good enough to translate any text in any language pair with the accuracy and lyricism of a human translator.
At the same time, he added, the breakthroughs allow the teams to move on to the next big steps toward that goal and other AI achievements.