DEVELOPER BLOG
Tag: Algorithms
Accelerated Computing
Oct 04, 2017
Cooperative Groups: Flexible CUDA Thread Programming
In efficient parallel algorithms, threads cooperate and share data to perform collective computations. To share data, the threads must synchronize.
16 MIN READ
Accelerated Computing
Oct 19, 2015
Cutting Edge Parallel Algorithms Research with CUDA
Leyuan Wang, a Ph.D. student in the UC Davis Department of Computer Science, presented one of only two “Distinguished Papers” of the 51 accepted at Euro-Par…
14 MIN READ
HPC
Aug 06, 2015
Voting and Shuffling to Optimize Atomic Operations
Some years ago I started work on my first CUDA implementation of the Multiparticle Collision Dynamics (MPC) algorithm, a particle-in-cell code used to…
10 MIN READ
HPC
Mar 17, 2015
GPU Pro Tip: Fast Histograms Using Shared Atomics on Maxwell
Histograms are an important data representation with many applications in computer vision, data analytics and medical imaging. A histogram is a graphical…
9 MIN READ
Accelerated Computing
Oct 01, 2014
CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics
This post introduces warp-aggregated atomics, a useful technique to improve performance when many CUDA threads atomically update a single counter.
14 MIN READ
HPC
Feb 13, 2014
Faster Parallel Reductions on Kepler
Parallel reduction is a common building block for many parallel algorithms. A presentation from 2007 by Mark Harris provided a detailed strategy for…
12 MIN READ
