CUDA

May 04, 2026

Optimize Supply Chain Decision Systems Using NVIDIA cuOpt Agent Skills

Modern supply chains operate under the constant pressures of fluctuating demand, volatile costs, constrained capacity, and interdependent decision-making....

6 MIN READ

Apr 30, 2026

Automating GPU Kernel Translation with AI Agents: cuTile Python to cuTile.jl

NVIDIA CUDA Tile (cuTile) is a tile-based programming model that enables developers to write GPU kernels in terms of tile-level operations—loads, stores, and...

9 MIN READ

Apr 22, 2026

Simplify Sparse Deep Learning with Universal Sparse Tensor in nvmath-python

In a previous post, we introduced the Universal Sparse Tensor (UST), enabling developers to decouple a tensor’s sparsity from its memory layout for greater...

11 MIN READ

Apr 14, 2026

NVIDIA NVbandwidth: Your Essential Tool for Measuring GPU Interconnect and Memory Performance

When you’re writing CUDA applications, one of the most important things you need to focus on to write great code is data transfer performance. This applies to...

8 MIN READ

Apr 14, 2026

NVIDIA Ising Introduces AI-Powered Workflows to Build Fault-Tolerant Quantum Systems

NVIDIA Ising is the world's first family of open AI models for building quantum processors, launching with two model domains: Ising Calibration and Ising...

9 MIN READ

Apr 09, 2026

How to Accelerate Protein Structure Prediction at Proteome-Scale

Proteins rarely function in isolation as individual monomers. Most biological processes are governed by proteins interacting with other proteins, forming...

10 MIN READ

Apr 01, 2026

CUDA Tile Programming Now Available for BASIC!

Note: CUDA Tile Programming in BASIC is an April Fools’ joke, but it's also real and actually works, demonstrating the flexibility of CUDA. CUDA 13.1...

7 MIN READ

Mar 25, 2026

Maximize AI Infrastructure Throughput by Consolidating Underutilized GPU Workloads

In production Kubernetes environments, the difference between model requirements and GPU size creates inefficiencies. Lightweight automatic speech recognition...

9 MIN READ

Mar 16, 2026

How NVIDIA Dynamo 1.0 Powers Multi-Node Inference at Production Scale

Reasoning models are growing rapidly in size and are increasingly being integrated into agentic AI workflows that interact with other models and external tools....

14 MIN READ

Mar 09, 2026

CUDA 13.2 Introduces Enhanced CUDA Tile Support and New Python Features

CUDA 13.2 arrives with a major update: NVIDIA CUDA Tile is now supported on devices of compute capability 8.X architectures (NVIDIA Ampere and NVIDIA Ada), as...

15 MIN READ

Mar 05, 2026

Tuning Flash Attention for Peak Performance in NVIDIA CUDA Tile

In this post, we dive into one of the most critical workloads in modern AI: Flash Attention, where you’ll learn: How to implement Flash Attention using NVIDIA...

20 MIN READ

Mar 05, 2026

Controlling Floating-Point Determinism in NVIDIA CCCL

A computation is considered deterministic if multiple runs with the same input data produce the same bitwise result. While this may seem like a simple property...

7 MIN READ

Mar 03, 2026

cuTile.jl Brings NVIDIA CUDA Tile-Based Programming to Julia

NVIDIA CUDA Tile is one of the most significant additions to NVIDIA CUDA programming and unlocks automatic access to tensor cores and other specialized...

5 MIN READ

Feb 19, 2026

Accelerating Data Processing with NVIDIA Multi-Instance GPU and Locality Domains

NVIDIA flagship data center GPUs in the NVIDIA Ampere, NVIDIA Hopper, and NVIDIA Blackwell families all feature non-uniform memory access (NUMA) behaviors, but...

12 MIN READ

Feb 18, 2026

Topping the GPU MODE Kernel Leaderboard with NVIDIA cuda.compute

Python dominates machine learning for its ergonomics, but writing truly fast GPU code has historically meant dropping into C++ to write custom kernels and to...

5 MIN READ

Feb 10, 2026

Using Accelerated Computing to Live-Steer Scientific Experiments at Massive Research Facilities

Scientists and engineers who design and build unique scientific research facilities face similar challenges. These include managing massive data rates that...

13 MIN READ