CUDA
May 10, 2024
Dynamic Control Flow in CUDA Graphs with Conditional Nodes
CUDA Graphs can provide a significant performance increase, as the driver is able to optimize execution using the complete description of tasks and...
7 MIN READ
May 07, 2024
NVIDIA GTC Training Labs On Demand Available Now
Missed GTC or want to replay your favorite training labs? Find it on demand with the NVIDIA GTC Training Labs playlist.
1 MIN READ
Apr 19, 2024
Measuring the GPU Occupancy of Multi-stream Workloads
NVIDIA GPUs are becoming increasingly powerful with each new generation. This increase generally comes in two forms. Each streaming multi-processor (SM), the...
11 MIN READ
Mar 27, 2024
Efficient CUDA Debugging: Using NVIDIA Compute Sanitizer with NVIDIA Tools Extension and Creating Custom Tools
NVIDIA Compute Sanitizer is a powerful tool that can save you time and effort while improving the reliability and performance of your CUDA applications....
14 MIN READ
Mar 25, 2024
Building High-Performance Applications in the Era of Accelerated Computing
AI is augmenting high-performance computing (HPC) with novel approaches to data processing, simulation, and modeling. Because of the computational requirements...
6 MIN READ
Mar 14, 2024
Just Released: NVIDIA cuSPARSELt 0.6
NVIDIA cuSPARSELt harnesses Sparse Tensor Cores to accelerate general matrix multiplications. Version 0.6. adds support for the NVIDIA Hopper architecture.
1 MIN READ
Mar 06, 2024
CUDA Toolkit 12.4 Enhances Support for NVIDIA Grace Hopper and Confidential Computing
The latest release of CUDA Toolkit, version 12.4, continues to push accelerated computing performance using the latest NVIDIA GPUs. This post explains the new...
9 MIN READ
Feb 28, 2024
Optimizing OpenFold Training for Drug Discovery
Predicting 3D protein structures from amino acid sequences has been an important long-standing question in bioinformatics. In recent years, deep...
7 MIN READ
Jan 12, 2024
Just Released: cuBLASDx
cuBLASDx allows you to perform BLAS calculations inside your CUDA kernel, improving the performance of your application. Available to download in Preview...
1 MIN READ
Jan 05, 2024
Improving CUDA Initialization Times Using cgroups in Certain Scenarios
Many CUDA applications running on multi-GPU platforms usually use a single GPU for their compute needs. In such scenarios, a performance penalty is paid by...
5 MIN READ
Dec 20, 2023
Just Released: cuBLASMp
cuBLASMp is a high-performance, multi-process, GPU-accelerated library for distributed basic dense linear algebra. It is available to download in Preview now.
1 MIN READ
Nov 29, 2023
CUDA-Q 0.5 Delivers New Features for Quantum-Classical Computing
NVIDIA CUDA-Q is a platform for building quantum-classical computing applications. It is an open-source programming model for heterogeneous computing such as...
4 MIN READ
Nov 21, 2023
Unlocking GPU Intrinsics in HLSL
There are some useful intrinsic functions in the NVIDIA GPU instruction set that are not included in standard graphics APIs. Updated from the original 2016 post...
9 MIN READ
Nov 17, 2023
Boosting Custom ROS Graphs Using NVIDIA Isaac Transport for ROS
NVIDIA Isaac Transport for ROS (NITROS) is the implementation of two hardware-acceleration features introduced with ROS 2 Humble-type adaptation and type...
11 MIN READ
Nov 16, 2023
Unlock the Power of NVIDIA Grace and NVIDIA Hopper Architectures with Foundational HPC Software
High-performance computing (HPC) powers applications in simulation and modeling, healthcare and life sciences, industry and engineering, and more. In the modern...
7 MIN READ
Nov 07, 2023
CUDA-Accelerated Robot Motion Generation in Milliseconds with NVIDIA cuRobo
Real-time autonomous robot navigation powered by a fast motion-generation algorithm can enable applications in several industries such as food and services,...
3 MIN READ