NVIDIA cuDNN

GPU Accelerated Deep Learning

The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. cuDNN is part of the NVIDIA Deep Learning SDK.

Deep learning researchers and framework developers worldwide rely on cuDNN for high-performance GPU acceleration. It allows them to focus on training neural networks and developing software applications rather than spending time on low-level GPU performance tuning. cuDNN accelerates widely used deep learning frameworks, including Caffe, Caffe2, TensorFlow, Theano, Torch, and Microsoft Cognitive Toolkit. See supported frameworks for more details. cuDNN is freely available to members of the Accelerated Computing Developer Program.

Download

cuDNN 7 For Volta GPUs

cuDNN 7 delivers up to 2.5x faster training of Microsoft’s ResNet network used for image classification and up to 3x faster training of the OpenNMT language translation network on a Tesla V100 GPU, powered by Volta, compared to a Tesla P100 GPU.

cuDNN 7 will become available in July. Register for the NVIDIA developer program to be notified when cuDNN 7 becomes available.

Key Features

  • Forward and backward paths for many common layer types, such as pooling, LRN, LCN, batch normalization, ReLU, sigmoid, softmax, and tanh
  • Forward and backward convolution routines, including cross-correlation, designed for convolutional neural networks
  • LSTM and GRU Recurrent Neural Networks (RNN) and Persistent RNNs
  • Arbitrary dimension ordering, striding, and sub-regions for 4D tensors, allowing easy integration into any neural net implementation
  • Tensor transformation functions
  • Context-based API allows for easy multithreading
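
To make the "arbitrary dimension ordering and striding" feature concrete, here is a minimal Python sketch of how per-dimension strides follow from a memory layout of a 4D tensor. It does not call cuDNN; the helper names `strides_4d` and `offset` are hypothetical and exist only for this illustration. In the cuDNN C API, strides of this kind are what a caller would pass to `cudnnSetTensor4dDescriptorEx` (as `nStride`, `cStride`, `hStride`, `wStride`); the sketch assumes a fully packed tensor with no padding.

```python
def strides_4d(n, c, h, w, layout="NCHW"):
    """Return (n_stride, c_stride, h_stride, w_stride), in elements,
    for a fully packed tensor stored in the given dimension order.
    Hypothetical helper for illustration; not part of cuDNN."""
    if sorted(layout) != sorted("NCHW"):
        raise ValueError("layout must be a permutation of 'NCHW'")
    dims = {"N": n, "C": c, "H": h, "W": w}
    strides = {}
    step = 1
    # Walk from the innermost (fastest-varying) axis outward.
    for axis in reversed(layout):
        strides[axis] = step
        step *= dims[axis]
    # Always report strides in N, C, H, W order, whatever the layout.
    return tuple(strides[axis] for axis in "NCHW")


def offset(index, strides):
    """Linear element offset of an (n, c, h, w) index under the given strides."""
    return sum(i * s for i, s in zip(index, strides))


nchw = strides_4d(2, 3, 4, 5, "NCHW")
nhwc = strides_4d(2, 3, 4, 5, "NHWC")
print(nchw)  # (60, 20, 5, 1)
print(nhwc)  # (60, 1, 15, 3)
```

The logical index `(n, c, h, w)` stays the same in both cases; only the strides change. This is why a library that accepts explicit strides can consume a framework's tensors without the framework having to reorder its data first.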

cuDNN is supported on Windows, Linux, and macOS systems with Pascal, Maxwell, Kepler, Tegra K1, Tegra X1, and Tegra X2 GPUs.

cuDNN Accelerated Frameworks

Learn More