CUDA

Jun 16, 2026

Building AI Agents for AR Glasses and XR Devices with NVIDIA XR AI

Developers building for AR glasses and wearable devices face an infrastructure gap. The hardware is ready, but creating AI experiences requires integrating...

8 MIN READ

Jun 16, 2026

Build Your Own Transaction Foundation Model for Financial Intelligence

Every swipe, transfer, and payment on a modern financial network encodes a pattern of human behavior. Transaction data is one of the richest signals an...

11 MIN READ

Jun 16, 2026

How to Optimize Transformer-Based Models for Low-Precision Training

Transformer architectures are the backbone of many modern large language and generative AI models. As these models grow in size, training runs consume more GPU...

9 MIN READ

Jun 16, 2026

NVIDIA Blackwell Tops MLPerf Training 6.0 with Industry-Leading Scale and Performance

NVIDIA delivered a clean sweep in MLPerf Training v6.0, the latest edition of industry-standard AI training benchmarks developed by the MLCommons consortium....

11 MIN READ

Jun 15, 2026

Boosting MoE Training Throughput with Advanced Fusion Kernels

Mixture-of-experts (MoE) models have quickly become a foundational component of modern, large-scale AI systems. They are widely adopted because they enable...

9 MIN READ

Jun 01, 2026

Deploy Agentic-Ready AI at the Edge with Memory Efficiency in NVIDIA JetPack 7.2

As AI agents move from the digital world to the physical environment, they can readily use NVIDIA Jetson to accelerate real-world deployment with optimized...

10 MIN READ

May 26, 2026

Extract More Kernel Performance with NVIDIA CompileIQ Auto-Tuning

NVIDIA CompileIQ tackles one of the hardest problems in performance engineering: finding the compiler options that unlock the best performance for a specific...

12 MIN READ

May 26, 2026

Develop High-Performance GPU Kernels in C++ with NVIDIA CUDA Tile

Developers can now use NVIDIA CUDA Tile programming within large existing C++ GPU codebases to develop highly optimized GPU kernels using tile-based...

14 MIN READ

May 26, 2026

NVIDIA CUDA 13.3 Enhances GPU Development with Tile Programming in C++, Compiler Autotuning, and Python Updates

NVIDIA CUDA 13.3 brings new capabilities and performance optimizations to developers across the CUDA ecosystem. The launch of NVIDIA CUDA Tile programming in...

13 MIN READ

May 13, 2026

Accelerated X-Ray Analysis for Nanoscale Imaging (XANI) of Novel Materials

A massive-scale X-ray free-electron laser (XFEL) enables tracking structural and electron dynamics in novel systems, including fusion materials,...

11 MIN READ

May 04, 2026

Optimize Supply Chain Decision Systems Using NVIDIA cuOpt Agent Skills

Modern supply chains operate under the constant pressures of fluctuating demand, volatile costs, constrained capacity, and interdependent decision-making....

6 MIN READ

Apr 30, 2026

Automating GPU Kernel Translation with AI Agents: cuTile Python to cuTile.jl

NVIDIA CUDA Tile (cuTile) is a tile-based programming model that enables developers to write GPU kernels in terms of tile-level operations—loads, stores, and...

9 MIN READ

Apr 22, 2026

Simplify Sparse Deep Learning with Universal Sparse Tensor in nvmath-python

In a previous post, we introduced the Universal Sparse Tensor (UST), enabling developers to decouple a tensor’s sparsity from its memory layout for greater...

11 MIN READ

Apr 14, 2026

NVIDIA NVbandwidth: Your Essential Tool for Measuring GPU Interconnect and Memory Performance

When you’re writing CUDA applications, one of the most important things you need to focus on to write great code is data transfer performance. This applies to...

8 MIN READ

Apr 09, 2026

How to Accelerate Protein Structure Prediction at Proteome-Scale

Proteins rarely function in isolation as individual monomers. Most biological processes are governed by proteins interacting with other proteins, forming...

10 MIN READ

Apr 01, 2026

CUDA Tile Programming Now Available for BASIC!

Note: CUDA Tile Programming in BASIC is an April Fools’ joke, but it’s also real and actually works, demonstrating the flexibility of CUDA. CUDA 13.1...

7 MIN READ