Posts by Andy Adinets
Technical Walkthrough
May 21, 2021
Sparse Forests with FIL
The RAPIDS Forest Inference Library, affectionately known as FIL, dramatically accelerates inference (prediction) for tree-based models…
6 MIN READ
Technical Walkthrough
Oct 01, 2014
CUDA Pro Tip: Optimized Filtering with Warp-Aggregated Atomics
This post introduces warp-aggregated atomics, a useful technique to improve performance when many CUDA threads atomically update a single counter.
14 MIN READ
Technical Walkthrough
Jun 12, 2014
A CUDA Dynamic Parallelism Case Study: PANDA
Learn how Dynamic Parallelism on NVIDIA GPUs is being used to accelerate particle physics discoveries at the PANDA experiment, part of the Facility for Antiproton and Ion Research in Europe (FAIR).
11 MIN READ
Technical Walkthrough
May 20, 2014
CUDA Dynamic Parallelism API and Principles
This post is the second in a series on CUDA Dynamic Parallelism. In my first post, I introduced Dynamic Parallelism by using it to compute images of the…
13 MIN READ
Technical Walkthrough
May 06, 2014
Adaptive Parallel Computation with CUDA Dynamic Parallelism
Early CUDA programs had to conform to a flat, bulk parallel programming model. Programs had to perform a sequence of kernel launches, and for best performance…
13 MIN READ