Posts by Erin Ho
Generative AI
Aug 28, 2024
Boosting Llama 3.1 405B Performance up to 1.44x with NVIDIA TensorRT Model Optimizer on NVIDIA H200 GPUs
The Llama 3.1 405B large language model (LLM), developed by Meta, is an open-source community model that delivers state-of-the-art performance and supports a...
7 MIN READ
Generative AI
Aug 15, 2024
NVIDIA TensorRT Model Optimizer v0.15 Boosts Inference Performance and Expands Model Support
NVIDIA has announced the latest v0.15 release of NVIDIA TensorRT Model Optimizer, a state-of-the-art quantization toolkit of model optimization techniques...
5 MIN READ
Conversational AI
Jul 12, 2024
Train Generative AI Models More Efficiently with New NVIDIA Megatron-Core Functionalities
First introduced in 2019, NVIDIA Megatron-LM sparked a wave of innovation in the AI community, enabling researchers and developers to use the underpinnings of...
11 MIN READ
Generative AI
May 08, 2024
Accelerate Generative AI Inference Performance with NVIDIA TensorRT Model Optimizer, Now Publicly Available
In the fast-evolving landscape of generative AI, the demand for accelerated inference speed remains a pressing concern. With the exponential growth in model...
9 MIN READ
Generative AI
Mar 07, 2024
NVIDIA TensorRT Accelerates Stable Diffusion Nearly 2x Faster with 8-bit Post-Training Quantization
In the dynamic realm of generative AI, diffusion models stand out as the most powerful architecture for generating high-quality images with text prompts. Models...
7 MIN READ