Posts by Fred Oh
Simulation / Modeling / Design
Mar 06, 2024
CUDA Toolkit 12.4 Enhances Support for NVIDIA Grace Hopper and Confidential Computing
The latest release of CUDA Toolkit, version 12.4, continues to advance accelerated computing performance on the latest NVIDIA GPUs. This post explains the new...
9 MIN READ
Simulation / Modeling / Design
Jan 05, 2024
Improving CUDA Initialization Times Using cgroups in Certain Scenarios
Many CUDA applications running on multi-GPU platforms use only a single GPU for their compute needs. In such scenarios, a performance penalty is paid by...
5 MIN READ
Simulation / Modeling / Design
Nov 01, 2023
CUDA Toolkit 12.3 Delivers New Features for Accelerated Computing
The latest release of CUDA Toolkit continues to push the envelope of accelerated computing performance using the latest NVIDIA GPUs. New features of this...
4 MIN READ
Generative AI / LLMs
Oct 19, 2023
Optimizing Inference on Large Language Models with NVIDIA TensorRT-LLM, Now Publicly Available
Today, NVIDIA announces the public release of TensorRT-LLM to accelerate and optimize inference performance for the latest LLMs on NVIDIA GPUs. This open-source...
10 MIN READ
Top Stories
Sep 09, 2023
NVIDIA TensorRT-LLM Supercharges Large Language Model Inference on NVIDIA H100 GPUs
Large language models (LLMs) offer incredible new capabilities, expanding the frontier of what is possible with AI. However, their large size and unique...
9 MIN READ
Simulation / Modeling / Design
Aug 22, 2023
Simplifying GPU Application Development with Heterogeneous Memory Management
Heterogeneous Memory Management (HMM) is a CUDA memory management feature that extends the simplicity and productivity of the CUDA Unified Memory programming...
16 MIN READ