Erin Ho

Erin Ho is the product manager for TensorRT quantization and Megatron-Core at NVIDIA, where her experience spans both training and inference. Her current focus is shaping the direction of NVIDIA's AI software to better serve the community. She holds an M.S. in computer science from National Tsing Hua University, complemented by a business degree from Carnegie Mellon University.

Posts by Erin Ho

Generative AI

Boosting Llama 3.1 405B Performance up to 1.44x with NVIDIA TensorRT Model Optimizer on NVIDIA H200 GPUs

The Llama 3.1 405B large language model (LLM), developed by Meta, is an open-source community model that delivers state-of-the-art performance and supports a... 7 MIN READ
Generative AI

NVIDIA TensorRT Model Optimizer v0.15 Boosts Inference Performance and Expands Model Support

NVIDIA has announced the latest v0.15 release of NVIDIA TensorRT Model Optimizer, a state-of-the-art quantization toolkit of model optimization techniques... 5 MIN READ
Conversational AI

Train Generative AI Models More Efficiently with New NVIDIA Megatron-Core Functionalities

First introduced in 2019, NVIDIA Megatron-LM sparked a wave of innovation in the AI community, enabling researchers and developers to use the underpinnings of... 11 MIN READ
Generative AI

Accelerate Generative AI Inference Performance with NVIDIA TensorRT Model Optimizer, Now Publicly Available

In the fast-evolving landscape of generative AI, the demand for accelerated inference speed remains a pressing concern. With the exponential growth in model... 9 MIN READ
Generative AI

NVIDIA TensorRT Accelerates Stable Diffusion Nearly 2x Faster with 8-bit Post-Training Quantization

In the dynamic realm of generative AI, diffusion models stand out as the most powerful architecture for generating high-quality images with text prompts. Models... 7 MIN READ