NVIDIA TensorRT™ is a high-performance deep learning inference optimizer and runtime that delivers low-latency, high-throughput inference for deep learning applications. NVIDIA released TensorRT last year with the goal of accelerating deep learning inference for production deployment. A new NVIDIA Technical Blog post introduces TensorRT 3, which improves performance over previous versions and adds new features that make it easier to use.
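To make the workflow concrete, here is a minimal sketch of importing a trained TensorFlow model into a TensorRT engine through the Python API and UFF intermediate format introduced with TensorRT 3, loosely following the style of the examples in the post. The file name, input/output node names, the (1, 28, 28) input shape, and the exact module paths (such as trt.utils.uff_to_trt_engine) are illustrative assumptions; signatures differ between TensorRT releases.

```python
# Minimal sketch (assumptions noted above): convert a frozen TensorFlow
# graph to UFF, parse it, and build an optimized TensorRT inference engine.
import tensorrt as trt
import uff
from tensorrt.parsers import uffparser

# Logger TensorRT uses to report build and runtime messages.
G_LOGGER = trt.infer.ConsoleLogger(trt.infer.LogSeverity.ERROR)

# Convert a frozen TensorFlow graph to the UFF intermediate format
# ("frozen_model.pb" and "fc2/Relu" are placeholder names).
uff_model = uff.from_tensorflow_frozen_model("frozen_model.pb", ["fc2/Relu"])

# Register the network input/output tensors with the UFF parser.
parser = uffparser.create_uff_parser()
parser.register_input("Placeholder", (1, 28, 28), 0)  # CHW input shape (assumed)
parser.register_output("fc2/Relu")

# Build the optimized engine (max batch size 1, 1 MB build workspace).
engine = trt.utils.uff_to_trt_engine(G_LOGGER, uff_model, parser, 1, 1 << 20)
```

The resulting engine can then be serialized to disk and later deserialized by the TensorRT runtime for deployment, so the optimization step runs once rather than at every application start.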