Posts by Jeremy Appleyard
Simulation / Modeling / Design
Oct 17, 2017
Programming Tensor Cores in CUDA 9
A defining feature of the new NVIDIA Volta GPU architecture is Tensor Cores, which give the NVIDIA V100 accelerator a peak throughput that is 12x...
16 MIN READ
Data Science
Apr 06, 2016
Optimizing Recurrent Neural Networks in cuDNN 5
This week at GTC 2016, we announced the latest update to the NVIDIA Deep Learning SDK, which now includes cuDNN 5. Version 5 offers new features, improved...
10 MIN READ
Simulation / Modeling / Design
Aug 07, 2014
CUDA Pro Tip: Optimize for Pointer Aliasing
Often cited as the main reason that naïve C/C++ code cannot match FORTRAN performance, pointer aliasing is an important topic to understand when considering...
6 MIN READ