CUDA
Dec 04, 2025
NVIDIA CUDA 13.1 Powers Next-Gen GPU Programming with NVIDIA CUDA Tile and Performance Gains
NVIDIA CUDA 13.1 introduces the largest and most comprehensive update to the CUDA platform since it was invented two decades ago. In this release,...
11 MIN READ
Dec 04, 2025
Simplify GPU Programming with NVIDIA CUDA Tile in Python
The release of NVIDIA CUDA 13.1 introduces tile-based programming for GPUs, making it one of the most fundamental additions to GPU programming since CUDA was...
7 MIN READ
Dec 04, 2025
Focus on Your Algorithm—NVIDIA CUDA Tile Handles the Hardware
With its largest advancement since the NVIDIA CUDA platform was invented in 2006, CUDA 13.1 is launching NVIDIA CUDA Tile. This exciting innovation introduces a...
5 MIN READ
Nov 19, 2025
Building Better Qubits with GPU-Accelerated Computing
Quantum computing promises to revolutionize science and industry, from drug discovery to materials science. But building a useful, large-scale quantum computer...
5 MIN READ
Nov 17, 2025
NVIDIA NVQLink Architecture Integrates Accelerated Computing with Quantum Processors
Quantum computing is entering an era where progress will be driven by the integration of accelerated computing with quantum processors. The hardware that...
8 MIN READ
Nov 13, 2025
Achieve CUTLASS C++ Performance with Python APIs Using CuTe DSL
CuTe, a core component of CUTLASS 3.x, provides a unified algebra for describing data layouts and thread mappings, and abstracts complex memory access patterns...
9 MIN READ
Nov 13, 2025
How to Get Started with Neural Shading for Your Game or Application
For the past 25 years, real-time rendering has been driven by continuous hardware improvements. The goal has always been to create the highest fidelity image...
21 MIN READ
Oct 24, 2025
Unlocking Tensor Core Performance with Floating Point Emulation in cuBLAS
NVIDIA CUDA-X math libraries provide the fundamental numerical building blocks that enable developers to deploy accelerated applications across multiple...
11 MIN READ
Oct 23, 2025
Bring Your Circuits to CUDA-Q Using QGEAR
Download NERSC’s QGEAR project to easily import Qiskit circuits into GPU-accelerated CUDA-Q kernels.
1 MIN READ
Oct 14, 2025
Understanding Memory Management on Hardware-Coherent Platforms
If you’re an application developer or a cluster administrator, you’ve likely seen how non-uniform memory access (NUMA) can impact system performance. When an...
6 MIN READ
Sep 29, 2025
Unlock GPU Performance: Global Memory Access in CUDA
Managing memory is one of the most important performance characteristics to consider when writing a GPU kernel. This post walks you through the important...
15 MIN READ
Sep 16, 2025
Autodesk Research Brings Warp Speed to Computational Fluid Dynamics on NVIDIA GH200
Computer-aided engineering (CAE) forms the backbone for modern product development across industries, from designing safer aircraft to optimizing renewable...
8 MIN READ
Sep 11, 2025
Build High-Performance Vision AI Pipelines with NVIDIA CUDA-Accelerated VC-6
The constantly increasing compute throughput of NVIDIA GPUs presents a new opportunity for optimizing vision AI workloads: keeping the hardware fed with data....
13 MIN READ
Sep 10, 2025
Developers Can Now Get NVIDIA CUDA Directly from Their Favorite Third-Party Platforms
Building and deploying applications can be challenging for developers, requiring them to navigate the complex relationship between hardware and software...
3 MIN READ
Sep 03, 2025
Accelerate Autonomous Vehicle Development with the NVIDIA DRIVE AGX Thor Developer Kit
Autonomous vehicle (AV) technology is rapidly evolving, fueled by ever-larger and more complex AI models deployed at the edge. Modern vehicles now require not...
8 MIN READ
Sep 02, 2025
Improving GEMM Kernel Auto-Tuning Efficiency on NVIDIA GPUs with Heuristics and CUTLASS 4.2
Selecting the best possible General Matrix Multiplication (GEMM) kernel for a specific problem and hardware is a significant challenge. The performance of a...
8 MIN READ