TensorRT Getting Started
TensorRT 8.0: What’s New
TensorRT 8.0 adds transformer optimizations, quantization-aware training for accurate INT8 inference, and sparsity support that leverages the sparse Tensor Cores on NVIDIA Ampere GPUs.
- BERT-Large Inference in 1.2 ms with new Transformer Optimizations
- Achieve accuracy equivalent to FP32 with INT8 precision using Quantization Aware Training
- Sparsity support for faster inference on Ampere GPUs
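The Ampere sparse Tensor Cores accelerate weights that follow a 2:4 structured sparsity pattern: at most two nonzero values in every group of four. As a rough illustration of that pattern (a NumPy sketch of magnitude pruning, not NVIDIA's actual pruning tooling such as ASP):

```python
import numpy as np

def prune_2_4(weights: np.ndarray) -> np.ndarray:
    """Zero the two smallest-magnitude values in each group of four.

    Illustrates the 2:4 structured sparsity pattern that Ampere sparse
    Tensor Cores accelerate. Assumes weights.size is divisible by 4;
    real workflows would use NVIDIA's pruning tools instead.
    """
    flat = weights.reshape(-1, 4).copy()
    # Indices of the two smallest |values| in each group of four.
    drop = np.argsort(np.abs(flat), axis=1)[:, :2]
    np.put_along_axis(flat, drop, 0.0, axis=1)
    return flat.reshape(weights.shape)

w = np.array([[0.9, -0.1, 0.4, 0.05],
              [-0.7, 0.2, -0.03, 0.6]])
print(prune_2_4(w))
# Each row keeps only its two largest-magnitude weights:
# [[ 0.9  0.   0.4  0. ]
#  [-0.7  0.   0.   0.6]]
```

After pruning, the network is typically fine-tuned to recover accuracy before the sparse weights are handed to TensorRT.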
TensorRT 8.0 is freely available to members of the NVIDIA Developer Program.
Learn how to apply TensorRT optimizations and deploy a PyTorch model to GPUs.
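One common deployment path is to export the PyTorch model to ONNX (e.g. with `torch.onnx.export`) and then build a TensorRT engine from the ONNX file. A minimal sketch using the TensorRT 8 Python API follows; the file names and workspace size are placeholder assumptions, and running it requires a machine with TensorRT installed and a supported NVIDIA GPU:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path: str, engine_path: str) -> None:
    """Parse an ONNX model and serialize a TensorRT engine to disk."""
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30    # 1 GiB of builder scratch space
    config.set_flag(trt.BuilderFlag.FP16)  # allow FP16 kernels where profitable

    # In TensorRT 8 the builder can return a serialized engine directly.
    engine = builder.build_serialized_network(network, config)
    with open(engine_path, "wb") as f:
        f.write(engine)

build_engine("model.onnx", "model.trt")  # placeholder file names
```

The serialized engine can then be deserialized with a `trt.Runtime` and executed, or run from the command line with the bundled `trtexec` tool.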
Watch videos to learn more about TensorRT 8.0 features and the tools that simplify the inference workflow.
Download pre-trained models optimized for TensorRT to get started quickly.
Additional TensorRT Resources
Conversational AI
- Real-Time Natural Language Processing with BERT Using TensorRT (Blog)
- Automatic Speech Recognition with TensorRT (Notebook)
- Accelerating Real-Time Text-to-Speech with TensorRT (Blog)
- NLU with BERT (Notebook)
- Real Time Text-to-Speech (Sample)
- Neural Machine Translation (NMT) Using A Sequence To Sequence (seq2seq) Model (Sample Code)
- Building An RNN Network Layer By Layer (Sample Code)
Image and Video
- Accelerating Inference with Sparsity using Ampere Architecture and TensorRT (Blog)
- Achieving FP32 Accuracy in INT8 using Quantization Aware Training with TensorRT (Blog)
- Optimize Object Detection with EfficientDet and TensorRT 8 (Notebook)
- Speeding up Deep Learning Inference using TensorFlow, ONNX, and TensorRT (Semantic Segmentation Blog)
- Object detection with SSD network (Python Code Sample)
- Object detection with SSD, Faster R-CNN networks (C++ Code Samples)
Recommenders
- Accelerating Wide and Deep with TensorRT (Blog)
- Movie Recommendation Using Neural Collaborative Filter (NCF) (Sample Code)
- Deep Recommender (Sample Code)
- Intro to Recommenders in TensorRT (Video)
NVIDIA’s platforms and application frameworks enable developers to build a wide array of AI applications. Consider potential algorithmic bias when choosing or creating the models being deployed. Work with the model’s developer to ensure that it meets the requirements for the relevant industry and use case; that the necessary instruction and documentation are provided to understand error rates, confidence intervals, and results; and that the model is being used under the conditions and in the manner intended.