Hopper

Sep 16, 2025

Autodesk Research Brings Warp Speed to Computational Fluid Dynamics on NVIDIA GH200

Computer-aided engineering (CAE) forms the backbone for modern product development across industries, from designing safer aircraft to optimizing renewable...

8 MIN READ

Sep 05, 2025

Accelerate Large-Scale LLM Inference and KV Cache Offload with CPU-GPU Memory Sharing

Large Language Models (LLMs) are at the forefront of AI innovation, but their massive size can complicate inference efficiency. Models such as Llama 3 70B and...

7 MIN READ

Sep 02, 2025

Improving GEMM Kernel Auto-Tuning Efficiency on NVIDIA GPUs with Heuristics and CUTLASS 4.2

Selecting the best possible General Matrix Multiplication (GEMM) kernel for a specific problem and hardware is a significant challenge. The performance of a...

8 MIN READ

Aug 21, 2025

Less Coding, More Science: Simplify Ocean Modeling on GPUs With OpenACC and Unified Memory

NVIDIA HPC SDK v25.7 delivers a significant leap forward for developers working on high-performance computing (HPC) applications with GPU acceleration. This...

11 MIN READ

Jun 10, 2025

How Modern Supercomputers Powered by NVIDIA Are Pushing the Limits of Speed — and Science

Modern high-performance computing (HPC) is enabling more than just quick calculations — it’s powering AI systems that are unlocking scientific...

6 MIN READ

May 30, 2025

Telcos Across Five Continents Are Building NVIDIA-Powered Sovereign AI Infrastructure

AI is becoming the cornerstone of innovation across industries, driving new levels of creativity and productivity and fundamentally reshaping how we live and...

12 MIN READ

May 27, 2025

Advanced Optimization Strategies for LLM Training on NVIDIA Grace Hopper

In the previous post, Profiling LLM Training Workflows on NVIDIA Grace Hopper, we explored the importance of profiling large language model (LLM) training...

10 MIN READ

May 27, 2025

Profiling LLM Training Workflows on NVIDIA Grace Hopper

The rapid advancements in AI have resulted in an era of exponential growth in model sizes, particularly in the domain of large language models (LLMs). These...

12 MIN READ

May 01, 2025

Boosting Matrix Multiplication Speed and Flexibility with NVIDIA cuBLAS 12.9

The NVIDIA CUDA-X math libraries empower developers to build accelerated applications for AI, scientific computing, data processing, and more. Two...

8 MIN READ

Apr 02, 2025

NVIDIA Blackwell Delivers Massive Performance Leaps in MLPerf Inference v5.0

The compute demands for large language model (LLM) inference are growing rapidly, fueled by the combination of growing model sizes, real-time latency...

10 MIN READ

Mar 03, 2025

AI Model Offers Conservationists New Tools to Protect Fisheries, Wildlife at Scale

In an effort to rein in illicit fishing, researchers have unveiled a new open-source AI model that can accurately identify what virtually all of the world’s...

5 MIN READ

Feb 28, 2025

Build an AI Agent with Expert Reasoning Capabilities Using the DeepSeek-R1 NIM

AI agents are transforming business operations by automating processes, optimizing decision-making, and streamlining actions. Their effectiveness hinges on...

9 MIN READ

Feb 28, 2025

Spotlight: NAVER Place Optimizes SLM-Based Vertical Services with NVIDIA TensorRT-LLM

NAVER is a popular South Korean search engine company that offers Naver Place, a geo-based service that provides detailed information about millions of...

13 MIN READ

Feb 20, 2025

Spotlight: University of Tokyo Uses NVIDIA Grace Hopper for Groundbreaking Energy-Efficient Seismic Research

Supercomputers are the engines of groundbreaking discoveries. From predicting extreme weather to advancing disease research and designing safer, more efficient...

6 MIN READ

Feb 13, 2025

Simplify System Memory Management with the Latest NVIDIA GH200 NVL2 Enterprise RA

NVIDIA Enterprise Reference Architectures (Enterprise RAs) can reduce the time and cost of deploying AI infrastructure solutions. They provide a streamlined...

8 MIN READ

Feb 12, 2025

Automating GPU Kernel Generation with DeepSeek-R1 and Inference Time Scaling

As AI models extend their capabilities to solve more sophisticated challenges, a new scaling law known as test-time scaling or inference-time scaling is...

6 MIN READ