GTC Silicon Valley-2019: Fast Neural Network Inference with TensorRT on Autonomous Vehicles
Session ID: S9895
Josh Park (NVIDIA), Jeff Pyke (Zoox), Zejia Zheng (Zoox)
Autonomous driving systems use various neural network models that require extremely accurate and efficient computation on GPUs. This session will outline how Zoox employs two strategies to improve the inference performance (i.e., reduce the latency) of trained neural network models without loss of accuracy: (1) inference with NVIDIA TensorRT, and (2) inference at lower precision (i.e., FP16 and INT8). We will share the lessons we learned deploying neural networks with TensorRT, along with our current conversion workflow for addressing the limitations we encountered.