Posts by Nick Comly
Generative AI / LLMs
Dec 04, 2023
NVIDIA TensorRT-LLM Enhancements Deliver Massive Large Language Model Speedups on NVIDIA H200
Large language models (LLMs) have seen dramatic growth over the last year, and the challenge of delivering great user experiences depends on both high-compute...
5 MIN READ
Generative AI / LLMs
Oct 19, 2023
Optimizing Inference on Large Language Models with NVIDIA TensorRT-LLM, Now Publicly Available
Today, NVIDIA announces the public release of TensorRT-LLM to accelerate and optimize inference performance for the latest LLMs on NVIDIA GPUs. This open-source...
10 MIN READ
Top Stories
Sep 09, 2023
NVIDIA TensorRT-LLM Supercharges Large Language Model Inference on NVIDIA H100 GPUs
Large language models (LLMs) offer incredible new capabilities, expanding the frontier of what is possible with AI. However, their large size and unique...
9 MIN READ
Data Science
Jul 20, 2022
Optimizing and Serving Models with NVIDIA TensorRT and NVIDIA Triton
Imagine that you have trained your model with PyTorch, TensorFlow, or the framework of your choice, are satisfied with its accuracy, and are considering...
11 MIN READ