Inception Spotlight: Supercharging Synthetic Speech with Resemble AI

Deep learning is proving to be a powerful tool for developing and customizing high-quality synthetic speech. Resemble AI, a Toronto-based startup and NVIDIA Inception member, is upping the stakes with a new generative voice tool that creates high-quality synthetic AI voices.

The technology can generate cross-lingual, natural-sounding voices in over 50 of the most popular languages. With Resemble Fill, users can create programmatic audio and edit or replace words in audio clips.

The ability to build, deploy, and scale realistic AI voices stands to help a multitude of industries. Applications range from creating AI-generated text for advertisements, to interactive voice response systems, to video game development.

Since July 2020, the Resemble AI team has worked closely with the conversational AI team at NVIDIA to integrate the NVIDIA Riva multimodal conversational AI SDK into their speech pipeline. According to Resemble AI Founder and CEO, Zohaib Ahmed, the experience gave them unique insights into the entire conversational AI pipeline.

“The NVIDIA Inception Program has been helpful with providing key insights into the conversational AI space, as well as technical support on recommending GPU compute for every workload that we have as a product,” Ahmed said.

For both training and inference of their speech models, the team uses Amazon Elastic Kubernetes Service (Amazon EKS) with clusters of NVIDIA T4 GPUs. They then use the NVIDIA Triton Inference Server to deploy their trained AI models at scale in production.

A recent demo of Resemble AI synthetic speech integrated with NVIDIA Omniverse Audio2Face showcases how the combined technology can create expressive facial animations and voices from a single audio source. 

“Audio2Face is a good example of a powerful tool that can be combined easily with generative AI speech to produce results in seconds, which otherwise would take days,” Ahmed said. 

The company has grown to more than 150,000 users, who have built over 60,000 voices. To date, Resemble AI has over 240 paying customers in industries including telecommunications, finance, contact centers, education, gaming, and media and entertainment.

Do you have a startup? Join NVIDIA Inception’s global network of over 8,500 startups.