Combine OpenACC and Unified Memory for Productivity and Performance

Features, CUDA, LULESH, OpenACC, Optimization, Unified Memory

Nadeem Mohammad, posted Oct 01 2015

The post “Getting Started with OpenACC” covered four steps to progressively accelerate your code with OpenACC. It’s often necessary to use OpenACC directives to express both loop parallelism and data locality in order to get good performance with accelerators. After expressing available parallelism, excessive data movement generated by the compiler can be a bottleneck, and correcting this by

Read more

Customize CUDA Fortran Profiling with NVTX

CUDA Pro Tip, CUDA Fortran, Optimization, Profiling

Nadeem Mohammad, posted Sep 29 2015

The NVIDIA Tools Extension (NVTX) library lets developers annotate custom events and ranges within the profiling timelines generated using tools such as the NVIDIA Visual Profiler (NVVP) and Nsight. In my own optimization work, I rely heavily on NVTX to better understand internal as well as customer codes and to spot opportunities for better interaction

Read more

Simple, Portable Parallel C++ with Hemi 2 and CUDA 7.5

Features, C++, C++11, Hemi, Lambda

Nadeem Mohammad, posted Sep 24 2015

The last two releases of CUDA have added support for the powerful new features of C++. In the post “The Power of C++11 in CUDA 7” I discussed the importance of C++11 for parallel programming on GPUs, and in the post “New Features in CUDA 7.5” I introduced a new experimental feature in the NVCC CUDA C++ compiler:

Read more

GTC16 Call for Submissions


David Coombes, posted Sep 23 2015

GTC is the largest GPU technology conference in the world. It’s all about exploring the innovative ways GPU technology helps developers overcome computational and graphical challenges. GTC16 will be in the heart of Silicon Valley in April 2016. The event attracts developers, researchers, and technologists from some of the top companies, universities, research firms and government agencies from around the world.

Read more

Announcing GTX980 Notebooks: The VR Developer’s Dream

GameWorks, VRWorks

David Coombes, posted Sep 23 2015

Content creators have been asking for VR-ready performance in a laptop. This week we announced the availability of notebooks powered by the GTX 980 GPU.

These notebooks use the same 2,048-core GM204 GPU found in our GeForce GTX 980 graphics cards. They’re equipped with GDDR5 memory, fast CPUs, multiple USB 3.0 ports and direct HDMI out, making them the world’s first notebooks to meet (and exceed) the recommended spec for Oculus Rift.

Read more