
TensorRT: What’s New

NVIDIA® TensorRT-LLM greatly speeds the optimization of large language models (LLMs). Building on TensorRT™, FasterTransformer, and more, TensorRT-LLM accelerates LLMs with targeted optimizations such as Flash Attention, in-flight batching, and FP8, exposed through an open-source Python API, so developers can get optimal inference performance on GPUs.

NVIDIA TensorRT 8.6 improves cross-compatibility between GPUs and software stacks, making TensorRT more versatile across hardware deployments and upgrades.


TensorRT 8.6 GA is a free download for members of the NVIDIA Developer Program.

Download Now Documentation

Ways to Get Started With NVIDIA TensorRT

TensorRT and TensorRT-LLM are available for free on multiple platforms for development. For mission-critical AI inference with enterprise-grade security, stability, manageability, and support, you can purchase NVIDIA AI Enterprise, an end-to-end AI software platform that includes TensorRT and TensorRT-LLM. Contact sales or apply for a 90-day NVIDIA AI Enterprise evaluation license to get started.


TensorRT

TensorRT is available to download for free as a binary for multiple platforms or as a container on NVIDIA NGC™.


Download Now Pull Container From NGC Documentation
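For the container route, a typical workflow looks like the sketch below. The image name comes from the NGC catalog; the release tag shown is an example, so check the catalog for the tag matching your CUDA driver.

```shell
# Pull the TensorRT container from the NGC catalog
# (tag is illustrative; pick the current release from NGC).
docker pull nvcr.io/nvidia/tensorrt:23.04-py3

# Launch it with GPU access for development work.
docker run --gpus all -it --rm nvcr.io/nvidia/tensorrt:23.04-py3
```

The container bundles the TensorRT libraries, headers, and samples, so no separate binary install is needed inside it.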


TensorRT-LLM

TensorRT-LLM is available for free on GitHub.


Download Now Documentation
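Getting the sources is a standard GitHub clone; the wheel install shown after it is a hedged example of the pip route described in the repository's own documentation (package name and index URL per that documentation).

```shell
# Clone the TensorRT-LLM sources from GitHub.
git clone https://github.com/NVIDIA/TensorRT-LLM.git
cd TensorRT-LLM

# Alternatively, install the Python wheel from NVIDIA's package index,
# as documented in the repository.
pip3 install tensorrt_llm --extra-index-url https://pypi.nvidia.com
```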

Ways to Get Started With NVIDIA TensorRT Frameworks

Torch-TensorRT and TensorFlow-TensorRT are available for free as containers on the NGC catalog. For mission-critical AI inference with enterprise-grade security, stability, manageability, and support, you can purchase NVIDIA AI Enterprise. Contact sales or apply for a 90-day NVIDIA AI Enterprise evaluation license to get started.


Torch-TensorRT

Torch-TensorRT is available in the PyTorch container from the NGC catalog.


Pull Container From NGC Documentation
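Inside the PyTorch container, compiling a model with Torch-TensorRT is a one-call workflow. The sketch below assumes an NVIDIA GPU is available; the model (ResNet-50) and input shape are illustrative, not required by the API.

```python
import torch
import torch_tensorrt
import torchvision.models as models

# Any traceable PyTorch model works; ResNet-50 here is illustrative.
model = models.resnet50(weights=None).eval().cuda()

# Compile with Torch-TensorRT, asking for FP16 TensorRT engines
# for the supported layers.
trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224), dtype=torch.half)],
    enabled_precisions={torch.half},
)

# Inference uses the regular PyTorch call convention.
x = torch.randn(1, 3, 224, 224, dtype=torch.half, device="cuda")
with torch.no_grad():
    out = trt_model(x)
```

The compiled module is a drop-in replacement for the original, so existing inference code does not need to change.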


TensorFlow-TensorRT

TensorFlow-TensorRT is available in the TensorFlow container from the NGC catalog.


Pull Container From NGC Documentation
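Inside the TensorFlow container, TF-TRT converts a SavedModel offline and writes a new SavedModel with TensorRT-accelerated subgraphs. This is a hedged sketch: the paths are placeholders, and it assumes an NVIDIA GPU and a TF-TRT-enabled TensorFlow build (as shipped in the NGC container).

```python
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Convert an existing SavedModel; both paths below are placeholders.
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="/models/my_saved_model",
    precision_mode=trt.TrtPrecisionMode.FP16,
)
converter.convert()  # replaces supported subgraphs with TensorRT ops
converter.save("/models/my_saved_model_trt")
```

The converted SavedModel is loaded and served like any other, with unsupported ops falling back to native TensorFlow.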

Explore More TensorRT Resources


Stay up to date on the latest inference news from NVIDIA.

Sign Up