The article below is a guest post by Nuance, a company focused on conversational AI. In this post, Nuance engineers describe their use of NVIDIA’s automatic mixed precision to speed up their AI models in the healthcare industry.
By Wenxuan Teng, Ralf Leibold, and Gagandeep Singh
Nuance’s ambient clinical intelligence (ACI) technology shows how the company is accelerating the development of solutions for urgent problems in the U.S. healthcare system. Nuance trains the automatic speech recognition (ASR) and natural language processing (NLP) models behind ACI using NVIDIA’s Automatic Mixed Precision capabilities on Volta and Turing GPUs with Tensor Cores.
ACI addresses what the World Medical Association calls a “pandemic of physician burnout” caused by huge amounts of electronic paperwork. Doctors spend two hours completing documentation for every hour they deliver care. This documentation burden forces physicians to interact with their computers during the patient encounter, intrudes into their home lives after hours, or both.
Nuance developed ACI based on its core competencies in ASR and NLP, and domain expertise developed over 20 years in speech technology for healthcare. It uses ambient sensing and conversational AI to document a patient exam automatically. ACI removes the distraction of documentation and allows the doctor to get back to focusing on patient care as opposed to reporting.
Making ACI available to more doctors is a top priority for Nuance. Time to market depends on how quickly Nuance can train and assess deep learning algorithms to advance model performance.
Using Automatic Mixed Precision running on TensorFlow, Nuance has realized a 50% speedup in ASR and NLP model training on NVIDIA Volta GPUs without loss of accuracy, helping to reduce its time to market. Activating these performance gains required only a single line of code.
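As a rough illustration of what such a one-line change can look like, the sketch below sets the `TF_ENABLE_AUTO_MIXED_PRECISION` environment variable, one of the switches that NVIDIA's TensorFlow integration reads. The post does not say which mechanism Nuance used, so treat this as an assumption rather than their actual code.

```python
import os

# Assumption: the training script runs in a TensorFlow build that includes
# NVIDIA's Automatic Mixed Precision graph rewrite (TensorFlow 1.14+ or
# NVIDIA's NGC TensorFlow containers). Set the variable before the graph
# is built so the rewrite can pick it up.
os.environ["TF_ENABLE_AUTO_MIXED_PRECISION"] = "1"

# An alternative one-line form in TensorFlow 1.14+ wraps the optimizer instead
# (left commented out so this sketch runs without TensorFlow installed):
# optimizer = tf.train.experimental.enable_mixed_precision_graph_rewrite(optimizer)
```

Either form leaves the model definition itself untouched, which is the point of the "single line" claim.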
Mixed-precision training executes the majority of operations using half-precision (FP16) floating point arithmetic, yielding improved throughput and a reduced memory footprint, while maintaining the accuracy of full-precision (FP32) floating point. NVIDIA’s Automatic Mixed Precision feature, integrated in major deep learning frameworks such as TensorFlow and PyTorch, makes all the required adjustments automatically, obviating the need for manual changes to the network model or its parameters.
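One of the adjustments that mixed-precision training generally relies on is loss scaling, which keeps small gradient values from underflowing to zero in FP16. The sketch below illustrates the idea using only Python's standard library. The gradient value and the fixed `LOSS_SCALE` are hypothetical choices for illustration; Automatic Mixed Precision chooses and adjusts the scale on its own.

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Hypothetical values chosen for illustration only.
LOSS_SCALE = 1024.0   # a power of two, so scaling and unscaling are exact
true_grad = 1e-8      # smaller than FP16's smallest subnormal (about 6e-8)

# Without scaling, the gradient is lost when stored in FP16.
naive_grad = to_fp16(true_grad)

# With scaling, the gradient survives the FP16 round trip and is
# unscaled again in FP32 before the weight update.
scaled_grad = to_fp16(true_grad * LOSS_SCALE)
recovered_grad = to_fp32(scaled_grad) / LOSS_SCALE

print("FP16 without scaling:", naive_grad)
print("FP16 with scaling, unscaled in FP32:", recovered_grad)
```

The unscaled value underflows to zero, while the scaled one is recovered to within a fraction of a percent. This is why FP16 throughput does not have to cost accuracy.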
Nuance researchers expect further gains by leveraging the reduced memory footprint to increase training batch size. They are also achieving significant performance gains in other deep learning language processing applications. These shorter training times allow Nuance to deploy more accurate clinical conversational applications into the hands of doctors to improve patient care.
ACI and other clinical solutions from Nuance have benefited from their use of Automatic Mixed Precision on NVIDIA Tensor Core GPUs. Ultimately, we all benefit because it lets doctors get back to doing what they trained for and love, and gives us our doctors’ undivided attention when we need it the most.