Combine OpenACC and Unified Memory for Productivity and Performance

Features, CUDA, LULESH, OpenACC, Optimization, Unified Memory

Nadeem Mohammad, posted Oct 01 2015

The post "Getting Started with OpenACC" covered four steps to progressively accelerate your code with OpenACC.


Simple, Portable Parallel C++ with Hemi 2 and CUDA 7.5

Features, C++, C++11, Hemi, Lambda

Nadeem Mohammad, posted Sep 24 2015

The last two releases of CUDA have added support for the powerful new features of C++.


CUDA 7.5: Pinpoint Performance Problems with Instruction-Level Profiling

Features, CUDA 7.5, Optimization, Profiling, Tools

Nadeem Mohammad, posted Sep 14 2015

[Note: Thejaswi Rao also contributed to the code optimizations shown in this post.] Today NVIDIA released CUDA 7.5, the latest release of the powerful CUDA Toolkit.
