Microsoft’s Voice Recognition Technology Almost as Accurate as Humans

Microsoft reached a new milestone in the development of more accurate speech recognition.
Using a cluster of Tesla M40 GPUs and the cuDNN-accelerated Computational Network Toolkit (CNTK), the latest version of the company's technology achieved the lowest word error rate (WER) in the industry.
“Our best single system achieves an error rate of 6.9% on the NIST 2000 Switchboard set,” said the researchers in their recent research paper. “We believe this is the best performance reported to date for a recognition system not based on system combination. An ensemble of acoustic models advances the state of the art to 6.3% on the Switchboard test data.”
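For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between the recognizer's output and a reference transcript, divided by the number of reference words. The Python sketch below illustrates that standard definition only. It is not Microsoft's or NIST's scoring code, the function name `word_error_rate` is hypothetical, and it assumes simple whitespace tokenization with no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: assumes whitespace tokenization and exact,
    case-sensitive word matching (real scoring tools normalize text first).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missing)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a WER of 6.3% corresponds to roughly 63 word errors per 1,000 reference words.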

Historical progress of speech recognition WER on more and more difficult tasks. Twenty years ago, the best published research system had a WER of greater than 43 percent.

These advances will directly benefit the future of Microsoft's digital assistants, such as Cortana, and its real-time Skype Translator service. Microsoft said “the speech research is significant to Microsoft’s overall artificial intelligence strategy of providing systems that can anticipate users’ needs instead of responding to their commands, and to the company’s overall ambitions for providing intelligent systems that can see, hear, speak and even understand, augmenting how humans work today.”

