United States call centers generate more than a billion hours of audio recordings every year, yet less than 25% of that audio is ever made searchable or analyzed.
Launching out of Y Combinator’s Winter 2016 class, DeepGram uses deep learning and GPUs hosted in the Amazon Web Services cloud to quickly index audio and make it searchable.
The free demo on the DeepGram site lets you upload an audio file or provide a URL, enter a keyword or phrase, and quickly analyze the audio to pinpoint every place the keyword is mentioned.
Below is the result – the red mark indicates where ‘NVIDIA’ is mentioned in a recent Share Your Science interview with Jeroen Tromp of Princeton.
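The demo's behavior can be thought of as keyword search over a time-aligned transcript: a speech model emits words with timestamps, and a query phrase is matched against those words to recover the moments it was spoken. Below is a minimal illustrative sketch of that matching step; the `(word, start, end)` tuple format and the `find_keyword` function are hypothetical assumptions for illustration, not DeepGram's actual API.

```python
# Illustrative sketch of keyword search over time-aligned ASR output.
# The (word, start_sec, end_sec) transcript format is an assumption,
# not DeepGram's actual API.

def find_keyword(transcript, phrase):
    """Return (start, end) time spans where `phrase` occurs in order."""
    words = phrase.lower().split()
    n = len(words)
    hits = []
    # Slide a window of len(phrase) words across the transcript.
    for i in range(len(transcript) - n + 1):
        window = transcript[i:i + n]
        if [w.lower() for w, _, _ in window] == words:
            # Span runs from the first word's start to the last word's end.
            hits.append((window[0][1], window[-1][2]))
    return hits

transcript = [
    ("we", 0.0, 0.2), ("use", 0.2, 0.4), ("nvidia", 0.4, 0.9),
    ("gpus", 0.9, 1.3), ("for", 1.3, 1.5), ("training", 1.5, 2.0),
]
print(find_keyword(transcript, "NVIDIA"))  # → [(0.4, 0.9)]
```

Each returned span corresponds to one red mark on the demo's timeline; multi-word phrases are handled by matching consecutive words and merging their timestamps.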
The Y Combinator blog post outlines how speech search is motivated by market factors. There has been a structural change in phone support from on-site employees to a distributed, international workforce which makes quality assurance more challenging. Businesses are also focusing more on data and analytics, and they want actionable insights from their information-rich audio datasets.
Read more >>