Baidu Research is keen on addressing transcription pain points

(Tech Xplore)—Artificial intelligence powered transcription software? How, where? Professionals in many sectors who have to cope with transcriptions of interviews and recorded statements know how tiring transcribing can be. Yet assignments have deadlines, and feeling tired is no excuse for getting the words wrong or leaving out chunks of what the speaker actually said.

Baidu Research has a fresh answer. They are introducing SwiftScribe. Tian Wu, project manager, made an announcement.

Also, they have a page where you can try it out, and it includes a link if you want to sign up for their closed beta.

Yes, this is a beta launch. "To begin, we will invite between 30-50 transcriptionists to test the beta version." (That is, Baidu is inviting just 30 to 50 transcriptionists to participate.)

Why SwiftScribe? The team recognized transcription's pain point – that is, the time-consuming process of manually transcribing word-by-word.

For transcriptionists and the groups they work for, higher productivity would not be such a bad thing. Target users for SwiftScribe range from freelancers to transcriptionists working for transcription service companies to data entry specialists. They may work in industries requiring some or substantial work in transcription, including medical, legal, business and media.

"It typically takes between four to six hours to transcribe one hour of audio data, and the going rate for transcriptions is somewhere around one dollar per audio minute."

Baidu Research aims to fix the pain point. The idea is that you use SwiftScribe to transcribe voice recordings quickly and easily.

Using SwiftScribe, the time a transcriptionist spends on a project is on average cut down by 40 percent. (VentureBeat wrote, "Wu's team believes SwiftScribe can help people transcribe audio 1.67 times faster—in 40 percent less time—than they would on their own. That would imply that they could do more work and ultimately get paid more for their work, Wu said.")
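The two figures in that quote describe the same improvement, and the arithmetic is easy to check. The short Python sketch below is only an illustration, not anything published by Baidu: the function names are made up for this example, and it assumes the 40 percent saving applies evenly to the whole job and that pay stays at the quoted one dollar per audio minute.

```python
def speedup_from_time_saved(fraction_saved: float) -> float:
    """Return how many times faster a task gets when its duration shrinks by fraction_saved."""
    if not 0 <= fraction_saved < 1:
        raise ValueError("fraction_saved must be in [0, 1)")
    return 1.0 / (1.0 - fraction_saved)


def effective_hourly_rate(rate_per_audio_minute: float, work_hours_per_audio_hour: float) -> float:
    """Earnings per hour worked, given pay per audio minute and work hours per audio hour."""
    return rate_per_audio_minute * 60 / work_hours_per_audio_hour


# 40 percent less time means the work takes 60 percent as long: 1 / 0.6.
print(round(speedup_from_time_saved(0.40), 2))  # 1.67

# Quoted range: four to six hours of work per hour of audio, at $1 per audio minute.
for hours in (4, 6):
    before = effective_hourly_rate(1.0, hours)
    after = effective_hourly_rate(1.0, hours * (1 - 0.40))  # assumed uniform 40% saving
    print(hours, round(before, 2), round(after, 2))
```

Under those assumptions, a transcriptionist earning roughly $10 to $15 per hour worked would earn roughly $16.67 to $25, which is the "get paid more" point Wu makes.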

The drivers behind SwiftScribe are speech recognition technology and editing tools.

The company is vocal about its speech recognition capability, with the engine Deep Speech 2. A neural network has been trained on audio, learning to associate sounds with words and phrases.

SwiftScribe said its product is in a space apart from its competitors, for "as users transcribe and make edits, the system can learn and improve along the way."

As VentureBeat noted, you will need to go in and make changes such as capitalizing, adding punctuation or changing spellings of certain words but, wrote VB's Jordan Novet, "Keyboard shortcuts help you more efficiently change the speed of audio, rewind, and add a line break."

Novet provided some background on how this announcement gels with past interest in speech recognition. "Baidu in the past few years has been honing its DeepSpeech software for speech recognition. Last year, the company introduced TalkType, an Android keyboard that, using DeepSpeech, puts speech input first and typing second, based on the idea that you can enter information more quickly when you say it than when you peck."

More … nscription-software/

© 2017 Tech Xplore

Citation: Baidu Research is keen on addressing transcription pain points (2017, March 15), retrieved 22 July 2024.