Microsoft reached a new milestone in the development of more accurate speech recognition.
Using a cluster of NVIDIA Tesla M40 GPUs and the cuDNN-accelerated Computational Network Toolkit (CNTK), the latest version of the company's technology achieved the lowest word error rate (WER) reported in the industry.
“Our best single system achieves an error rate of 6.9% on the NIST 2000 Switchboard set,” said the researchers in their recent research paper. “We believe this is the best performance reported to date for a recognition system not based on system combination. An ensemble of acoustic models advances the state of the art to 6.3% on the Switchboard test data.”
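For readers unfamiliar with the metric, WER is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that textbook definition with a word-level edit distance. It is not the NIST scoring procedure used in the paper; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Textbook WER: word-level edit distance / number of reference words.

    Assumption: words are separated by whitespace and compared exactly
    (no text normalization, unlike real evaluation tooling).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("please call me tomorrow", "please call we tomorrow"))
```

Under this definition, a 6.3% WER corresponds to roughly 63 word errors per 1,000 reference words.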
These advances will directly benefit digital assistants like Cortana, as well as Microsoft's real-time Skype Translator service. Microsoft said “the speech research is significant to Microsoft’s overall artificial intelligence strategy of providing systems that can anticipate users’ needs instead of responding to their commands, and to the company’s overall ambitions for providing intelligent systems that can see, hear, speak and even understand, augmenting how humans work today.”
Microsoft’s Voice Recognition Technology Almost as Accurate as Humans
Sep 15, 2016

AI-Generated Summary
- Microsoft reached a milestone in speech recognition, achieving the industry's lowest word error rate (WER) using a cluster of NVIDIA Tesla M40 GPUs and the cuDNN version of Computational Network Toolkit (CNTK).
- The researchers reported a 6.9% error rate on the NIST 2000 Switchboard set with their best single system, and an ensemble of acoustic models further reduced the error rate to 6.3%.
- This advancement will benefit digital assistants like Cortana and Skype Translator, contributing to Microsoft's artificial intelligence strategy of creating intelligent systems that can anticipate users' needs.