Register Cache: Caching for Warp-Centric CUDA Programs

Features, Cooperative Groups, CUDA, Optimization

Nadeem Mohammad, posted Oct 12 2017

In this post we introduce the “register cache”, an optimization technique that builds a virtual caching layer for the threads in a single warp. It is a software abstraction implemented on top of the NVIDIA GPU shuffle primitive.

Linux Graphics Debugger 2.2 released with enhanced performance analysis and frame capture serialization

Linux Graphics Debugger, GameWorks

Robert Bischof, posted Oct 11 2017

Linux Graphics Debugger 2.2 is available for download under the NVIDIA GameWorks Registered Developer Program.

Mixed-Precision Training of Deep Neural Networks

Features, Deep Learning, FP16, Mixed Precision, Tensor Cores, Volta

Nadeem Mohammad, posted Oct 11 2017

Deep Neural Networks (DNNs) have led to breakthroughs in a number of areas, including image processing and understanding, language modeling, language translation, speech processing, game playing, and many others.

Aftermath 1.3 Update

Aftermath, GameWorks

Alex Dunn, posted Oct 10 2017

It has been only a short while since we first released Aftermath to the public, and since then Aftermath has helped approximately 1000 developers worldwide debug GPU crashes in their applications on both DirectX 11 and DirectX 12.

VRWorks 2.5 SDK Release

Virtual Reality

Nadeem Mohammad, posted Oct 10 2017

NVIDIA released VRWorks SDK V2.5 for application and headset developers along with the NVIDIA display drivers 387.92 (Windows) and 387.12 (Linux/Beta).
