Beginner
Jun 19, 2017
Unified Memory for CUDA Beginners
My previous introductory post, "An Even Easier Introduction to CUDA C++", introduced the basics of CUDA programming by showing how to write a simple program...
16 MIN READ
Jan 25, 2017
An Even Easier Introduction to CUDA
This post is a super simple introduction to CUDA, the popular parallel computing platform and programming model from NVIDIA. I wrote a previous post, Easy...
13 MIN READ
Jan 07, 2013
How to Access Global Memory Efficiently in CUDA C/C++ Kernels
In the previous two posts we looked at how to move data efficiently between the host and device. In this sixth post of our CUDA C/C++ series we discuss how to...
9 MIN READ
Jan 03, 2013
How to Access Global Memory Efficiently in CUDA Fortran Kernels
CUDA Fortran for Scientists and Engineers shows how high-performance application developers can...
8 MIN READ
Dec 04, 2012
How to Optimize Data Transfers in CUDA C/C++
In the previous three posts of this CUDA C & C++ series we laid the groundwork for the major thrust of the series: how to optimize CUDA C/C++ code. In this...
10 MIN READ
Nov 29, 2012
How to Optimize Data Transfers in CUDA Fortran
CUDA Fortran for Scientists and Engineers shows how high-performance application developers can...
12 MIN READ
Nov 07, 2012
How to Implement Performance Metrics in CUDA C/C++
In the first post of this series we looked at the basic elements of CUDA C/C++ by examining a CUDA C/C++ implementation of SAXPY. In this second post we discuss...
8 MIN READ
Nov 05, 2012
How to Implement Performance Metrics in CUDA Fortran
CUDA Fortran for Scientists and Engineers shows how high-performance application developers can...
9 MIN READ
Oct 29, 2012
An Easy Introduction to CUDA Fortran
CUDA Fortran for Scientists and Engineers shows how high-performance application developers can...
9 MIN READ
Jun 11, 2012
CUDA 101: Get Ahead of the CUDA Curve with Practice!
After a recent talk I gave called "CUDA 101: Intro to GPU Computing", a student asked "What's the best way for me to get experience in parallel programming and...
5 MIN READ