TensorRT: What’s New

The upcoming version, TensorRT 8.2, includes new optimizations to run billion-parameter language models in real time.

TensorRT is also integrated natively with PyTorch and TensorFlow.

Highlights:

  • TensorRT 8.2 - Optimizations for T5 and GPT-2 deliver real-time translation and summarization with 21x faster performance than CPUs
  • TensorRT 8.2 - Simple Python API for developers using Windows
  • Torch-TensorRT - Integration of TensorRT with PyTorch delivers 3x faster performance than in-framework inference on GPUs with just one line of code (see the sketch below)
  • TensorFlow-TensorRT - Integration of TensorRT with TensorFlow delivers 3x faster performance than in-framework inference on GPUs (see the example below)

TensorRT 8.2 is planned for release in late November and will be available from the TensorRT page.

Torch-TensorRT is planned for release in late November and will ship in the NGC PyTorch container.
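
As an illustration of the one-line workflow highlighted above, here is a minimal sketch of compiling a PyTorch model with Torch-TensorRT. The model, input shape, and precision settings are placeholder choices for this example, not part of the announcement.

    import torch
    import torchvision
    import torch_tensorrt

    # Any TorchScript-compatible model works; ResNet-50 is just a stand-in here.
    model = torchvision.models.resnet50().eval().cuda()

    # The single compile call: Torch-TensorRT replaces supported subgraphs
    # with TensorRT engines and leaves the rest running in PyTorch.
    trt_model = torch_tensorrt.compile(
        model,
        inputs=[torch_tensorrt.Input((1, 3, 224, 224), dtype=torch.float32)],
        enabled_precisions={torch.float32, torch.half},  # allow FP16 kernels
    )

    # Run inference exactly as you would with the original module.
    x = torch.randn(1, 3, 224, 224, device="cuda")
    with torch.no_grad():
        output = trt_model(x)

The compiled module behaves like a regular PyTorch module, so existing inference code does not need to change.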

The TensorFlow-TensorRT integration is available for download today.
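
For reference, converting an existing TensorFlow SavedModel with TensorFlow-TensorRT looks roughly like the sketch below; the directory names and the FP16 precision setting are placeholders for the example.

    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    # Convert an existing SavedModel (placeholder paths).
    params = trt.TrtConversionParams(precision_mode=trt.TrtPrecisionMode.FP16)
    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir="my_saved_model",
        conversion_params=params,
    )
    converter.convert()                   # replace supported subgraphs with TensorRT ops
    converter.save("my_saved_model_trt")  # write the optimized SavedModel

The optimized model is then loaded and served like any other SavedModel.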

TensorRT 8.0 is freely available to members of the NVIDIA Developer Program today.



You can find additional resources on the NVIDIA Developer Blog or connect with other TensorRT developers on the NVIDIA Developer Forum.




Introductory Resources



Introductory Blog

Learn how to apply TensorRT optimizations and deploy a PyTorch model to GPUs.

Read Blog

Introductory Webinar

Watch to learn more about TensorRT 8.2 features and the tools that simplify the inference workflow.

Watch Webinar

Developer Guide

See how to get started with TensorRT in this step-by-step developer guide and API reference.

Read Guide
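
As a taste of what the guide covers, the sketch below uses the TensorRT Python API to build and serialize an engine from an ONNX model; the file names are placeholders and error handling is kept minimal.

    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    )
    parser = trt.OnnxParser(network, logger)

    # Parse a trained model exported to ONNX (placeholder file name).
    with open("model.onnx", "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("Failed to parse the ONNX model")

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30  # 1 GiB scratch space (TensorRT 8.x setting)

    # Build and serialize the optimized engine for later deployment.
    serialized_engine = builder.build_serialized_network(network, config)
    with open("model.engine", "wb") as f:
        f.write(serialized_engine)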




Additional TensorRT Resources

  • Framework Integrations
  • Conversational AI
  • Image and Video
  • Recommendation Systems

Ethical AI

NVIDIA’s platforms and application frameworks enable developers to build a wide array of AI applications. Consider potential algorithmic bias when choosing or creating the models being deployed. Work with the model’s developer to ensure that it meets the requirements for the relevant industry and use case; that the necessary instruction and documentation are provided to understand error rates, confidence intervals, and results; and that the model is being used under the conditions and in the manner intended.