New technology to greatly improve video communication tested during dive to Titanic wreck
The COVID-19 pandemic has enormously boosted the popularity of video communication—but sometimes poor transmission quality, dropouts, and connection failures during meetings or conference calls tax the participants' patience. Researchers at Karlsruhe Institute of Technology (KIT) and Carnegie Mellon University (CMU) have developed a method for transmitting video conferences over very low bandwidth connections, enabling such transmissions even under extreme conditions. It was tested during a dive to the wreck of the Titanic, which lies at a depth of nearly 4,000 meters in the North Atlantic.
"Transmitting data from a depth of four kilometers through salt water without any loss is extremely difficult," says Professor Alex Waibel, who conducts research on speech translation at KIT and CMU. Natural conditions allow sonar transmission from the submersible to the mother ship at sea-surface level only, since radio communication does not work in salt water. The researchers have developed synthetic methods to convert video data into text. The sound recording is first converted to text in the submersible and then transmitted to the surface by sonar sound pulses, where the video is reconstructed from the text. "The video then features a synthetic voice that is mapped to the voice of the person who is speaking, so that it sounds like the voice of that person. In addition, the video synthesis is controlled in such a way that the lips of the speaker move in sync with the sound," explains Waibel, who has been doing research in speech recognition, speech processing, and speech translation for decades. "In the future, this will facilitate remote communicate in spoken language," says Waibel. However, it is also suitable for synthesizing videos in a different language or for lip-syncing videos.
The technology tested by Waibel at the wreck of the Titanic builds on decades of pioneering work in speech translation. Waibel's developments include the Lecture Translator, which is used at KIT to automatically record the lecturer's speech and simultaneously translate it into written English text. This means that students can follow the lecture on their laptop, smartphone, or tablet.