Tag: CUDA C++

AI / Deep Learning

Fast, Flexible Allocation for NVIDIA CUDA with RAPIDS Memory Manager

When I joined the RAPIDS team in 2018, NVIDIA CUDA device memory allocation was a performance problem. RAPIDS cuDF allocates and deallocates memory at high… 24 MIN READ

Unified Memory for CUDA Beginners

This post introduces CUDA programming with Unified Memory, a single memory address space that is accessible from any GPU or CPU in a system. 16 MIN READ
Accelerated Computing

High-Performance Geometric Multi-Grid with GPU Acceleration

Algorithms and optimizations for accelerating geometric multi-grid in the HPGMG benchmark with GPUs, including scalability on supercomputers. 16 MIN READ
Accelerated Computing

Cutting Edge Parallel Algorithms Research with CUDA

Leyuan Wang, a Ph.D. student in the UC Davis Department of Computer Science, presented one of only two “Distinguished Papers” of the 51 accepted at Euro-Par… 14 MIN READ
Accelerated Computing

Accelerating Materials Discovery with CUDA

In this post, we discuss how CUDA has facilitated materials research in the Department of Chemical and Biomolecular Engineering at UC Berkeley and Lawrence… 15 MIN READ

Voting and Shuffling to Optimize Atomic Operations

Some years ago I started work on my first CUDA implementation of the Multiparticle Collision Dynamics (MPC) algorithm, a particle-in-cell code used to… 10 MIN READ