
GTC Silicon Valley 2019, ID S9713: Quantized Neural Networks and QEngine

Yifan Zhang (Institute of Automation, Chinese Academy of Sciences)
We'll discuss network quantization: its background, motivation, methods, and achievements. Deep neural networks (DNNs) have achieved remarkable performance in a wide range of tasks, but they are computationally intensive and resource-consuming, which hinders their use in embedded systems. We'll explain how we're working to alleviate this problem with quantized neural networks and a lightweight framework for efficient inference of these networks.
