Operationalizing PyTorch Models Using ONNX and ONNX Runtime
Emma Ning, Microsoft | Spandan Tiwari, Microsoft
GTC 2020
We'll demonstrate how product teams delivering ML scenarios with PyTorch models can take advantage of ONNX and ONNX Runtime to improve performance and model interoperability in their workflows. ONNX is an open-standard format for representing machine-learning models that has been adopted by several organizations. ONNX Runtime is an inference engine fully compatible with the ONNX format, designed as a platform where different backend implementations, including hardware-specific accelerators, can be plugged in and executed seamlessly. NVIDIA's TensorRT runtime has been integrated as an "execution provider" in ONNX Runtime and delivers substantial inference speedups on a range of models. PyTorch is a popular deep-learning framework that natively supports ONNX. Come for an overview of PyTorch, ONNX, and ONNX Runtime; the basics of creating a PyTorch model and the details of exporting it to ONNX; and a walkthrough of running inference with ONNX Runtime and getting better performance with accelerators such as TensorRT.
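As a rough sketch of the workflow the session covers (not code from the talk itself), the snippet below exports a small placeholder PyTorch model to ONNX with torch.onnx.export and runs it through ONNX Runtime, requesting the TensorRT execution provider first. The TinyNet class, the file name, and the input/output names are illustrative assumptions.

```python
import torch
import torch.nn as nn
import onnxruntime as ort

# A toy model standing in for the PyTorch models discussed in the session.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 2)

    def forward(self, x):
        return self.fc(x)

model = TinyNet().eval()
dummy_input = torch.randn(1, 10)

# Export to the ONNX format using PyTorch's native exporter.
torch.onnx.export(
    model,
    dummy_input,
    "tiny_net.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)

# Run inference with ONNX Runtime. Providers are tried in the order listed;
# TensorrtExecutionProvider requires an ONNX Runtime build with TensorRT enabled.
session = ort.InferenceSession(
    "tiny_net.onnx",
    providers=[
        "TensorrtExecutionProvider",
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ],
)
outputs = session.run(None, {"input": dummy_input.numpy()})
print(outputs[0])
```

Listing TensorRT first lets ONNX Runtime use the accelerated backend where the build and hardware support it, while keeping CUDA and CPU execution as fallbacks.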