GTC Silicon Valley-2019: Using ONNX for Accelerated Inferencing on Cloud and Edge


GTC Silicon Valley-2019 ID: S9979: Using ONNX for Accelerated Inferencing on Cloud and Edge

Kevin Chen (NVIDIA), Prasanth Pulavarthi (Microsoft)
Are you a developer looking to operationalize machine learning models from different sources without compromising performance? Are you a data scientist who wishes you could use the machine learning framework of your choice without worrying about how to deploy models to a variety of endpoints on cloud and edge? We'll describe ONNX, which provides a common model format supported by many popular AI frameworks and hardware platforms. Learn about ONNX and its core concepts, and find out how to create ONNX models using frameworks like TensorFlow, PyTorch, and scikit-learn. We'll explain how to deploy models to cloud or edge using the high-performance, cross-platform ONNX Runtime, which leverages accelerators like NVIDIA TensorRT. Our talk will include case studies of Microsoft teams that improved latency and reduced costs by using ONNX.
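
As a rough illustration of how ONNX Runtime can leverage accelerators like TensorRT while still running on plain CPU endpoints, the sketch below picks execution providers in order of preference. It is not taken from the talk. The helper names `pick_providers` and `make_session` and the preference order are assumptions made for illustration. The provider name strings, `onnxruntime.get_available_providers()`, and the `providers` argument of `onnxruntime.InferenceSession` are part of the ONNX Runtime Python API.

```python
from typing import Iterable, List, Sequence

# Execution provider names as used by ONNX Runtime.
# The order is an assumed preference: most accelerated first, CPU as the fallback.
PREFERRED_PROVIDERS = (
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
)


def pick_providers(
    available: Iterable[str],
    preferred: Sequence[str] = PREFERRED_PROVIDERS,
) -> List[str]:
    """Return the preferred providers that are available, in preference order.

    Hypothetical helper, not part of ONNX Runtime.
    """
    available_set = set(available)
    chosen = [name for name in preferred if name in available_set]
    if not chosen:
        raise RuntimeError("none of the preferred execution providers is available")
    return chosen


def make_session(model_path: str):
    """Open an ONNX model with the best providers this machine offers.

    Hypothetical helper. onnxruntime is imported lazily so that
    pick_providers can be used and tested without it installed.
    """
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

With a helper like this, the same model file could in principle be served from a GPU machine in the cloud and a CPU-only edge device: `make_session("model.onnx")` (the file name is a placeholder) would use TensorRT or CUDA where the installed ONNX Runtime build exposes them and fall back to the CPU provider elsewhere.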
