November 30, 2017 report

Mozilla releases transcription model and huge voice dataset

by Bob Yirka, Tech Xplore

(Tech Xplore)—Mozilla (maker of the Firefox browser) has announced the release of an open source speech recognition model along with a large voice dataset. The release marks the advent of open source speech recognition development. Sean White, senior vice president of emerging technologies at Mozilla, suggests in the announcement that it will "result in more internet-connected products that can listen and respond to us than ever before."

Up until now, virtually every commercially available speech recognition product has come from a major company, such as Microsoft or Google. This, White notes, is because such applications require a huge investment and an equally huge voice dataset to learn how to recognize and interpret human speech. Mozilla, he adds, promotes efforts to make technology more available to developers and users alike. To that end, the company set a goal of developing a speech recognition model that could be made publicly available for free, an effort it calls Project DeepSpeech. Alongside it, the company created Project Common Voice, a website where people can volunteer to record their voices and to validate recordings made by others. White claims the dataset now holds 400,000 downloadable samples from more than 20,000 people, making it the second-largest publicly available voice dataset in the world.

Project DeepSpeech is based on Baidu's Deep Speech research and is built with TensorFlow, Google's open source machine learning framework. The newly released model allows developers to create applications with voice recognition abilities without having to pay royalties, and the Project Common Voice dataset gives them a large, free body of voice data for training such models. The end result could be an onslaught of new applications, some likely in the form of apps for smartphone users. White claims that the transcription engine has a word error rate of just 6.5 percent, which is very close to what humans can do. If that holds up, new apps should be better at recognizing what users say than earlier products.
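Error rates for speech recognition are commonly reported as word error rate (WER): the number of word substitutions, insertions and deletions needed to turn the engine's transcript into the correct one, divided by the number of words in the correct transcript. The sketch below assumes that is the metric behind the 6.5 percent figure. The function name and the sample sentences are made up for illustration and are not taken from Mozilla's code or data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper only; not part of DeepSpeech or Common Voice.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    correct = "the cat sat on the mat"
    transcribed = "the cat sat on a mat"
    print(f"WER: {word_error_rate(correct, transcribed):.3f}")
```

Under this definition, a rate of 6.5 percent corresponds to roughly one wrong word in every 15 spoken.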

White also notes that currently, the model and voice dataset only work for English, but promises that multiple languages will soon be supported as well, some as early as next year. He also encourages people to visit the Common Voice website to add to the dataset, making it better for everyone.

Citation: Mozilla releases transcription model and huge voice dataset (2017, November 30) retrieved 23 April 2024 from https://techxplore.com/news/2017-11-mozilla-transcription-huge-voice-dataset.html
