Combine OpenACC and Unified Memory for Productivity and Performance

Features, CUDA, LULESH, OpenACC, Optimization, Unified Memory

Nadeem Mohammad, posted Oct 01 2015

The post "Getting Started with OpenACC" covered four steps to progressively accelerate your code with OpenACC.

Customize CUDA Fortran Profiling with NVTX

CUDA Pro Tip, CUDA Fortran, Optimization, Profiling

Nadeem Mohammad, posted Sep 29 2015

The NVIDIA Tools Extension (NVTX) library lets developers annotate custom events and ranges within the profiling timelines generated by tools such as the NVIDIA Visual Profiler (NVVP) and Nsight.

CUDA 7.5: Pinpoint Performance Problems with Instruction-Level Profiling

Features, CUDA 7.5, Optimization, Profiling, Tools

Nadeem Mohammad, posted Sep 14 2015

[Note: Thejaswi Rao also contributed to the code optimizations shown in this post.] Today NVIDIA released CUDA 7.5, the latest version of the powerful CUDA Toolkit.
