SIGGRAPH 2019: Deep Learning for Content Creation and Real-Time Rendering - Deep Learning - Practical Considerations on the Workstation

Chris Hebert, NVIDIA
You may already use NVIDIA's cuDNN library to accelerate your deep neural network inference, but are you getting the most out of it on NVIDIA's newest GPU architectures, Volta and Turing? We'll discuss how to avoid the most common pitfalls in porting CPU-based inference to the GPU and demonstrate best practices in a step-by-step optimization of an example network, including how to perform graph surgery to minimize computation and maximize memory throughput. Learn how to deploy your deep neural network inference in both the fastest and most memory-efficient way, using cuDNN and Tensor Cores, which accelerate FP16 inference on Volta and Turing and INT8 and INT4 inference on Turing. We will also examine optimization methods within a streamlined workflow that goes directly from traditional frameworks such as TensorFlow to WinML via ONNX.
