Technical Walkthrough

Unified Memory for CUDA Beginners

My previous introductory post, "An Even Easier Introduction to CUDA C++", introduced the basics of CUDA programming by showing how to write a simple program... 16 MIN READ
Technical Walkthrough

An Even Easier Introduction to CUDA

This post is a super simple introduction to CUDA, the popular parallel computing platform and programming model from NVIDIA. I wrote a previous post, Easy... 13 MIN READ
Technical Walkthrough

How to Access Global Memory Efficiently in CUDA C/C++ Kernels

In the previous two posts we looked at how to move data efficiently between the host and device. In this sixth post of our CUDA C/C++ series we discuss how to... 9 MIN READ
Technical Walkthrough

How to Access Global Memory Efficiently in CUDA Fortran Kernels

CUDA Fortran for Scientists and Engineers shows how high-performance application developers can... 8 MIN READ
Technical Walkthrough

How to Optimize Data Transfers in CUDA C/C++

In the previous three posts of this CUDA C & C++ series we laid the groundwork for the major thrust of the series: how to optimize CUDA C/C++ code. In this... 10 MIN READ
Technical Walkthrough

How to Optimize Data Transfers in CUDA Fortran

CUDA Fortran for Scientists and Engineers shows how high-performance application developers can... 12 MIN READ