GTC 2020: Operationalizing PyTorch Models Using ONNX and ONNX Runtime
Emma Ning, Microsoft | Spandan Tiwari, Microsoft
We'll demonstrate how product teams delivering ML scenarios with PyTorch models can take advantage of ONNX and ONNX Runtime to improve their workflows for better performance and model interoperability. ONNX is an open-standard format, adopted by many organizations, for representing machine-learning models. ONNX Runtime is an inference engine that is fully compatible with the ONNX format. It is designed as a platform into which different backend implementations, including hardware-specific accelerators, can be plugged and executed seamlessly. NVIDIA's TensorRT runtime is integrated into ONNX Runtime as an "execution provider" and delivers top-of-the-line inference speedups on several models. PyTorch is a popular deep-learning framework that natively supports ONNX. Come for an overview of PyTorch, ONNX, and ONNX Runtime; the basics of creating a PyTorch model and the details of exporting it to ONNX; and how to run inference with ONNX Runtime and get better performance using accelerators such as TensorRT.
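To give a flavor of the export step described above, here is a minimal sketch using PyTorch's built-in ONNX exporter. It assumes a torchvision ResNet-18 as a stand-in for your own model; the file name, tensor names, and opset version are illustrative choices, not prescribed by the session.

```python
import torch
import torchvision

# Load a pretrained ResNet-18 as a stand-in model (assumption: any
# torchvision classification model exports the same way).
model = torchvision.models.resnet18(pretrained=True)
model.eval()

# Dummy input fixing the expected shape: (batch, channels, height, width).
dummy_input = torch.randn(1, 3, 224, 224)

# Export to ONNX. "resnet18.onnx" and the tensor names are illustrative.
torch.onnx.export(
    model,
    dummy_input,
    "resnet18.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    opset_version=11,  # an opset supported by ONNX Runtime around GTC 2020
)
```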
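And a companion sketch of the inference side: loading the exported model in ONNX Runtime and requesting the TensorRT execution provider first, with CUDA and CPU as fallbacks. Whether the TensorRT provider is actually available depends on how your onnxruntime package was built; the input name "input" matches the hypothetical export above.

```python
import numpy as np
import onnxruntime as ort

# Preferred providers in priority order; ONNX Runtime uses the first one
# available in the installed build for each supported subgraph.
providers = [
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]
session = ort.InferenceSession("resnet18.onnx", providers=providers)

# Random data standing in for a preprocessed image batch.
batch = np.random.randn(1, 3, 224, 224).astype(np.float32)

# None -> return all outputs; "input" is the name chosen at export time.
(logits,) = session.run(None, {"input": batch})
print(logits.shape)  # (1, 1000) class logits for ResNet-18
```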