Technical Blog
Tag: CUDA C++
Technical Walkthrough
Mar 23, 2022
Boosting Application Performance with GPU Memory Prefetching
NVIDIA GPUs have enormous compute power and typically must be fed data at high speed to deploy that power. That is possible, in principle, because GPUs also...
10 MIN READ
Technical Walkthrough
Feb 10, 2022
Implementing High-Precision Decimal Arithmetic with CUDA int128
“Truth is much too complicated to allow anything but approximations.” -- John von Neumann
The history of computing has demonstrated that there is no limit...
19 MIN READ
Technical Walkthrough
Dec 08, 2020
Fast, Flexible Allocation for NVIDIA CUDA with RAPIDS Memory Manager
When I joined the RAPIDS team in 2018, NVIDIA CUDA device memory allocation was a performance problem. RAPIDS cuDF allocates and deallocates memory at high...
24 MIN READ
Technical Walkthrough
Jun 19, 2017
Unified Memory for CUDA Beginners
My previous introductory post, "An Even Easier Introduction to CUDA C++", introduced the basics of CUDA programming by showing how to write a simple program...
16 MIN READ
Technical Walkthrough
Feb 23, 2016
High-Performance Geometric Multi-Grid with GPU Acceleration
Linear solvers are probably the most common tool in scientific computing applications. There are two basic classes of methods that can be used to solve an...
16 MIN READ