GTC Silicon Valley 2019, ID S9713: Quantized Neural Networks and QEngine
Yifan Zhang (Institute of Automation, Chinese Academy of Sciences)
We'll discuss network quantization: its background, the motivation behind it, methods, and achievements. Deep neural networks (DNNs) have achieved remarkable performance in a wide range of tasks, but they are computationally intensive and resource-consuming, which hinders their use in embedded systems. We'll explain how we're working to alleviate this problem with quantized neural networks and a lightweight framework for efficient inference of these networks.