Posts by Jay Rodge
News
Dec 02, 2021
NVIDIA Announces TensorRT 8.2 and Integrations with PyTorch and TensorFlow
Learn about TensorRT 8.2 and the new TensorRT framework integrations, which accelerate inference in PyTorch and TensorFlow with just one line of code.
2 MIN READ
Technical Walkthrough
Dec 02, 2021
Optimizing T5 and GPT-2 for Real-Time Inference with NVIDIA TensorRT
TensorRT 8.2 optimizes HuggingFace T5 and GPT-2 models. You can build real-time translation, summarization, and other online NLP apps.
9 MIN READ
News
Nov 09, 2021
ICYMI: New AI Tools and Technologies Announced at NVIDIA GTC Keynote
New AI software tools include Riva Custom Voice, TensorRT, Triton Inference Server, Merlin, NeMo Megatron, and DeepStream.
5 MIN READ
News
Oct 05, 2021
NVIDIA GTC: Can’t-Miss Sessions in AI and Deep Learning this November
Register now for AI and deep learning GTC sessions focused on topics such as training, inference, frameworks, and tools.
4 MIN READ
News
Jul 20, 2021
NVIDIA Announces TensorRT 8 Slashing BERT-Large Inference Down to 1 Millisecond
NVIDIA announced TensorRT 8.0, which brings BERT-Large inference latency down to 1.2 ms with new optimizations.
3 MIN READ
Technical Walkthrough
Jul 20, 2021
Achieving FP32 Accuracy for INT8 Inference Using Quantization Aware Training with NVIDIA TensorRT
TensorRT is an SDK for high-performance deep learning inference, and with TensorRT 8.0, you can import models trained using Quantization Aware Training (QAT) to run inference in INT8 precision with...
17 MIN READ