TensorRT 5 GA Now Available

Nov 08, 2018

By Brad Nemire

NVIDIA has announced the latest version of TensorRT, its high-performance deep learning inference optimizer and runtime. Today we are releasing TensorRT 5 for general availability (GA). TensorRT 5 supports the new Turing architecture and provides new optimizations and INT8 APIs, achieving up to 40x faster inference than CPU-only platforms. This latest version also dramatically speeds up inference for recommenders, neural machine translation, speech, and natural language processing apps.
TensorRT 5 Highlights:

Speeds up inference by up to 40x over CPUs for models such as translation, using mixed precision on Turing Tensor Cores
Optimizes inference models with new INT8 APIs
Supports Xavier-based NVIDIA DRIVE platforms and the NVIDIA DLA accelerator for FP16
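
For readers new to INT8 inference, the general idea behind the second highlight can be sketched in plain Python. This is a minimal illustration of symmetric per-tensor quantization with a simple max-abs calibration. It is not TensorRT's API and does not reproduce its calibration algorithm, and the function names `calibrate_scale`, `quantize_int8`, and `dequantize` are hypothetical, chosen only for this example.

```python
def calibrate_scale(values):
    """Pick a per-tensor scale so the largest magnitude maps to 127 (assumed max-abs calibration)."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale works.
    return max_abs / 127.0 if max_abs > 0 else 1.0


def quantize_int8(values, scale):
    """Map floats to integers in [-127, 127] by rounding and clamping."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Recover approximate floats from the INT8 codes."""
    return [q * scale for q in q_values]


# Hypothetical activation values, used only to exercise the functions.
activations = [0.02, -1.27, 0.64, 0.9]
scale = calibrate_scale(activations)
codes = quantize_int8(activations, scale)
restored = dequantize(codes, scale)

# Worst-case difference between the original and the round-tripped values.
max_error = max(abs(a - r) for a, r in zip(activations, restored))
```

In this scheme the round-trip error of any in-range value is bounded by half of `scale`, which is why the choice of calibration range matters: a range that is too wide wastes precision, and one that is too narrow clips large values.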

TensorRT 5 GA is available now to all members of the NVIDIA Developer Program.
To learn how to get started, read the new NVIDIA Developer Blog post, “How to Speed Up Deep Learning Inference Using TensorRT”.

About the Authors

About Brad Nemire
Brad Nemire leads the Developer Communications team at NVIDIA. Prior to NVIDIA, he worked at Arm on the Developer Relations team. Brad graduated from San Diego State University and currently resides in Silicon Valley.
