Blackwell
Jan 09, 2025
NVIDIA Project DIGITS, A Grace Blackwell AI Supercomputer On Your Desk
Powered by the new GB10 Grace Blackwell Superchip, Project DIGITS can tackle large generative AI models of up to 200B parameters.
1 MIN READ
Nov 21, 2024
Advancing Ansys Workloads with NVIDIA Grace and NVIDIA Grace Hopper
Accelerated computing is enabling giant leaps in performance and energy efficiency compared to traditional CPU computing. Delivering these advancements requires...
10 MIN READ
Nov 13, 2024
NVIDIA Blackwell Doubles LLM Training Performance in MLPerf Training v4.1
As models grow larger and are trained on more data, they become more capable, making them more useful. To train these models quickly, more performance,...
8 MIN READ
Oct 08, 2024
Bringing AI-RAN to a Telco Near You
Inference for generative AI and AI agents will drive the need for AI compute infrastructure to be distributed from edge to central clouds. IDC predicts that...
14 MIN READ
Sep 26, 2024
Low Latency Inference Chapter 2: Blackwell is Coming. NVIDIA GH200 NVL32 with NVLink Switch Gives Signs of Big Leap in Time to First Token Performance
Many of the most exciting applications of large language models (LLMs), such as interactive speech bots, coding co-pilots, and search, need to begin responding...
8 MIN READ
Sep 24, 2024
NVIDIA GH200 Grace Hopper Superchip Delivers Outstanding Performance in MLPerf Inference v4.1
In the latest round of MLPerf Inference – a suite of standardized, peer-reviewed inference benchmarks – the NVIDIA platform delivered outstanding...
7 MIN READ
Aug 28, 2024
NVIDIA Blackwell Platform Sets New LLM Inference Records in MLPerf Inference v4.1
Large language model (LLM) inference is a full-stack challenge. Powerful GPUs, high-bandwidth GPU-to-GPU interconnects, efficient acceleration libraries, and a...
13 MIN READ
Jul 02, 2024
Achieving High Mixtral 8x7B Performance with NVIDIA H100 Tensor Core GPUs and NVIDIA TensorRT-LLM
As large language models (LLMs) continue to grow in size and complexity, the performance requirements for serving them quickly and cost-effectively continue to...
9 MIN READ
Jul 01, 2024
How Cutting-Edge Computer Chips Are Speeding Up the AI Revolution
Featured in Nature, this post delves into how GPUs and other advanced technologies are meeting the computational challenges posed by AI.
1 MIN READ
Jun 12, 2024
Demystifying AI Inference Deployments for Trillion Parameter Large Language Models
AI is transforming every industry, addressing grand human scientific challenges such as precision drug discovery and the development of autonomous vehicles, as...
14 MIN READ
Jun 10, 2024
Confidential and Self-Sovereign AI: Best Practices for Enhancing Security and Autonomy
Join the webinar on June 11th with NVIDIA and Super Protocol to learn about the benefits of Confidential Computing for Web3 AI.
1 MIN READ
May 08, 2024
Accelerate Generative AI Inference Performance with NVIDIA TensorRT Model Optimizer, Now Publicly Available
In the fast-evolving landscape of generative AI, the demand for faster inference remains a pressing concern. With the exponential growth in model...
9 MIN READ
Apr 22, 2024
Enhanced DU Performance and Workload Consolidation for 5G/6G with NVIDIA Aerial CUDA-Accelerated RAN
Aerial CUDA-Accelerated radio access network (RAN) enables acceleration of telco workloads, delivering new levels of spectral efficiency (SE) on a cloud-native...
14 MIN READ
Mar 25, 2024
New Architecture: NVIDIA Blackwell
Learn how the NVIDIA Blackwell GPU architecture is revolutionizing AI and accelerated computing.
1 MIN READ
Mar 18, 2024
NVIDIA GB200 NVL72 Delivers Trillion-Parameter LLM Training and Real-Time Inference
What is driving the interest in trillion-parameter models? We know many of the use cases today, and interest is growing due to the promise of an increased capacity for:...
9 MIN READ