CUDA

Aug 22, 2023
Simplifying GPU Application Development with Heterogeneous Memory Management
Heterogeneous Memory Management (HMM) is a CUDA memory management feature that extends the simplicity and productivity of the CUDA Unified Memory programming...
16 MIN READ

Jul 19, 2023
Programming the Quantum-Classical Supercomputer
Heterogeneous computing architectures—those that incorporate a variety of processor types working in tandem—have proven extremely valuable in the continued...
9 MIN READ

Jun 28, 2023
Improving GPU Performance by Reducing Instruction Cache Misses
GPUs are specially designed to crunch through massive amounts of data at high speed. They have a large number of compute resources, called streaming...
11 MIN READ

Jun 05, 2023
CUDA 12.1 Supports Large Kernel Parameters
CUDA kernel function parameters are passed to the device through constant memory and have been limited to 4,096 bytes. CUDA 12.1 increases this parameter limit...
5 MIN READ

Jun 02, 2023
Harnessing the Power of NVIDIA AI Enterprise on Azure Machine Learning
AI is transforming industries, automating processes, and opening new opportunities for innovation in the rapidly evolving technological landscape. As more...
7 MIN READ

May 17, 2023
Webinar: Performant Multiphase Flow Simulation at Leadership-Class Scale
On June 6, learn how researchers use OpenACC for GPU acceleration of multiphase and compressible flow solvers that obtain speedups at scale.
1 MIN READ

May 16, 2023
Asynchronous Error Reporting: When printf Just Won’t Do
Some programming situations call for reporting “soft” errors asynchronously. While printf can be a useful tool, it can increase register use and impact...
17 MIN READ

Apr 20, 2023
Debugging a Mixed Python and C Language Stack
Debugging is difficult. Debugging across multiple languages is especially challenging, and debugging across devices often requires a team with varying skill...
18 MIN READ

Apr 14, 2023
A Guide to CUDA Graphs in GROMACS 2023
GPUs continue to get faster with each new generation, and it is often the case that each activity on the GPU (such as a kernel or memory copy) completes very...
13 MIN READ

Apr 04, 2023
Topic Modeling and Image Classification with Dataiku and NVIDIA Data Science
The Dataiku platform for everyday AI simplifies deep learning. Use cases are far-reaching, from image classification to object detection and natural language...
11 MIN READ

Mar 07, 2023
Developing an End-to-End Auto Labeling Pipeline for Autonomous Vehicle Perception
Accurately annotated datasets are crucial for camera-based deep learning algorithms to perform autonomous vehicle perception. However, manually labeling data is...
6 MIN READ

Mar 06, 2023
Maximizing Performance with Massively Parallel Hash Maps on GPUs
Decades of computer science history have been devoted to devising solutions for efficient storage and retrieval of information. Hash maps (or hash tables) are a...
19 MIN READ

Mar 01, 2023
Just Released: CUDA Toolkit 12.1
Available now for download, the CUDA Toolkit 12.1 release provides support for the NVIDIA Hopper and NVIDIA Ada Lovelace architectures.
1 MIN READ

Feb 28, 2023
Just Released: NVIDIA Nsight Compute 2023.1
NVIDIA Nsight Compute 2023.1 adds more metrics and usability to the source view, a sample for shared memory banks, and improves the application replay...
1 MIN READ

Feb 10, 2023
Fast Large-Scale Agent-based Simulations on NVIDIA GPUs with FLAME GPU
The COVID-19 pandemic has brought agent-based modeling and simulation (ABMS) to the public’s attention. It’s a powerful computational technique...
19 MIN READ

Jan 31, 2023
Accelerating Python Applications with cuNumeric and Legate
cuNumeric is a library that aims to provide a distributed and accelerated drop-in replacement for the NumPy API that supports all NumPy features, such as...
14 MIN READ