Deep learning frameworks offer building blocks for designing, training, and validating deep neural networks through a high-level programming interface. Widely used deep learning frameworks such as Caffe2, Cognitive Toolkit, MXNet, PyTorch, TensorFlow, and others rely on GPU-accelerated libraries such as cuDNN and NCCL to deliver high-performance multi-GPU accelerated training.

Developers, researchers, and data scientists can get easy access to NVIDIA-optimized deep learning framework containers that are performance-tuned and tested for NVIDIA GPUs. This eliminates the need to manage packages and dependencies or build deep learning frameworks from source. Visit NVIDIA GPU Cloud (NGC) to learn more and get started.

Following is a list of popular deep learning frameworks, with learning resources and getting-started links for each.


Caffe2

Caffe2 is a deep-learning framework designed to easily express all model types (for example, CNNs, RNNs, and more) in a friendly Python-based API, and to execute them using a highly efficient C++ and CUDA back-end. Users have the flexibility to assemble their model from combinations of high-level, expressive operations in Python, which allows for easy visualization, or to serialize the created model and use the underlying C++ implementation directly. Caffe2 supports single-GPU, multi-GPU, and multi-node execution.


Model Deployment:

For high-performance inference deployment for Caffe2 trained models, export to ONNX format and optimize and deploy with NVIDIA TensorRT inference accelerator.

Learning Resources


Cognitive Toolkit

The Microsoft Cognitive Toolkit, formerly known as CNTK, is a unified deep-learning toolkit that describes neural networks as a series of computational steps via a directed graph. In this directed graph, leaf nodes represent input values or network parameters, while other nodes represent matrix operations upon their inputs.
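The directed-graph description above can be made concrete with a small sketch. The following plain-Python example is only an illustration of the idea, not Cognitive Toolkit code: the class names `Leaf`, `MatMul`, and `Add` are hypothetical, and matrices are represented as nested lists. Leaf nodes hold an input or a parameter, and every other node applies a matrix operation to the nodes feeding into it.

```python
# Illustrative only: a tiny directed computation graph in plain Python.
# Leaf, MatMul, and Add are hypothetical names, not Cognitive Toolkit APIs.

class Leaf:
    """Leaf node: holds an input value or a network parameter (a matrix)."""
    def __init__(self, name, value):
        self.name = name
        self.value = value

    def evaluate(self):
        return self.value


class MatMul:
    """Interior node: matrix product of its two input nodes."""
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def evaluate(self):
        a = self.left.evaluate()
        b = self.right.evaluate()
        # zip(*b) iterates over the columns of b
        return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
                for row in a]


class Add:
    """Interior node: element-wise sum of its two input nodes."""
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def evaluate(self):
        a = self.left.evaluate()
        b = self.right.evaluate()
        return [[x + y for x, y in zip(row_a, row_b)]
                for row_a, row_b in zip(a, b)]


# Graph for y = x @ W + b
x = Leaf("x", [[1.0, 2.0]])                # input, shape 1x2
W = Leaf("W", [[3.0, 4.0], [5.0, 6.0]])    # parameter, shape 2x2
b = Leaf("b", [[0.5, -0.5]])               # parameter, shape 1x2
y = Add(MatMul(x, W), b)

result = y.evaluate()
print(result)
```

Evaluating the root node pulls values up from the leaves, so each interior node only needs to know its direct inputs.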


Model Deployment:

For high-performance inference deployment for Cognitive Toolkit trained models, export to ONNX format and optimize and deploy with NVIDIA TensorRT inference accelerator.

Learning Resources


MATLAB

MATLAB makes deep learning easy for engineers, scientists, and domain experts. Alongside tools and functions for managing and labeling large data sets, MATLAB offers specialized toolboxes for working with machine learning, neural networks, computer vision, and automated driving. With just a few lines of code, MATLAB lets you create and visualize models and deploy them to servers and embedded devices, even if you are not a deep learning expert. MATLAB also enables users to automatically generate high-performance CUDA code for deep learning and vision applications from MATLAB code.


Model Deployment:

For high-performance inference deployment of MATLAB trained models, use MATLAB GPU Coder to automatically generate TensorRT optimized inference engines from cloud to embedded deployment environments.

Learning Resources


MXNet

MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to mix the flavors of symbolic programming and imperative programming to maximize efficiency and productivity.

At its core is a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations on the fly. A graph optimization layer on top of that makes symbolic execution fast and memory efficient. The library is portable and lightweight, and it scales to multiple GPUs and multiple machines.
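To illustrate the difference between the two programming styles mentioned above, here is a minimal plain-Python sketch. It does not use MXNet: the classes `Expr`, `Var`, and `Op` are hypothetical names introduced only for this example. The imperative version computes a value as each statement runs, while the symbolic version first builds a description of the computation and evaluates it later with inputs bound.

```python
# Illustrative only: contrasts imperative and symbolic styles in plain Python.
# Expr, Var, and Op are hypothetical names, not MXNet APIs.

# Imperative style: each statement runs immediately and yields a value.
a = 2.0
b = 3.0
imperative_result = (a + b) * b


# Symbolic style: describe the computation first, run it later.
class Expr:
    def __add__(self, other):
        return Op("add", self, other)

    def __mul__(self, other):
        return Op("mul", self, other)


class Var(Expr):
    """Placeholder for a value supplied at evaluation time."""
    def __init__(self, name):
        self.name = name

    def evaluate(self, bindings):
        return bindings[self.name]


class Op(Expr):
    """Deferred binary operation on two sub-expressions."""
    def __init__(self, kind, left, right):
        self.kind = kind
        self.left = left
        self.right = right

    def evaluate(self, bindings):
        lhs = self.left.evaluate(bindings)
        rhs = self.right.evaluate(bindings)
        return lhs + rhs if self.kind == "add" else lhs * rhs


sa, sb = Var("a"), Var("b")
graph = (sa + sb) * sb                       # nothing is computed yet
symbolic_result = graph.evaluate({"a": 2.0, "b": 3.0})

print(imperative_result, symbolic_result)
```

Because the symbolic version exists as a whole graph before any values flow through it, a framework has the opportunity to optimize that graph before execution, which is the role the text attributes to the graph optimization layer.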


Model Deployment:

For high-performance inference deployment for MXNet trained models, export to ONNX format and optimize and deploy with NVIDIA TensorRT inference accelerator.

Learning Resources


NVIDIA Caffe

Caffe is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. NVIDIA Caffe, also known as NVCaffe, is an NVIDIA-maintained fork of BVLC Caffe tuned for NVIDIA GPUs, particularly in multi-GPU configurations.


Model Deployment:

For high-performance inference deployment for Caffe trained models, import, optimize and deploy with NVIDIA TensorRT's built-in Caffe model importer.

Learning Resources


PyTorch

PyTorch is a Python package that provides two high-level features:

  • Tensor computation (like NumPy) with strong GPU acceleration
  • Deep Neural Networks built on a tape-based autograd system

You can reuse your favorite Python packages such as NumPy, SciPy, and Cython to extend PyTorch when needed.
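The phrase "tape-based autograd" in the feature list can be illustrated with a small sketch. The example below is plain Python and is not PyTorch's implementation: `Tape` and `Scalar` are hypothetical names, and it handles only scalar addition and multiplication. Each operation records a backward step on a tape while the forward computation runs, and gradients are obtained by replaying the tape in reverse.

```python
# Illustrative only: a minimal tape-based reverse-mode autodiff for scalars.
# Tape and Scalar are hypothetical names; this is not PyTorch's implementation.

class Tape:
    def __init__(self):
        self.backward_steps = []   # closures recorded in execution order


class Scalar:
    def __init__(self, value, tape):
        self.value = value
        self.grad = 0.0
        self.tape = tape

    def __add__(self, other):
        out = Scalar(self.value + other.value, self.tape)

        def step():
            # d(out)/d(self) = 1 and d(out)/d(other) = 1
            self.grad += out.grad
            other.grad += out.grad

        self.tape.backward_steps.append(step)
        return out

    def __mul__(self, other):
        out = Scalar(self.value * other.value, self.tape)

        def step():
            # d(out)/d(self) = other.value and d(out)/d(other) = self.value
            self.grad += other.value * out.grad
            other.grad += self.value * out.grad

        self.tape.backward_steps.append(step)
        return out

    def backward(self):
        # Seed the output gradient, then replay the tape in reverse order.
        self.grad = 1.0
        for step in reversed(self.tape.backward_steps):
            step()


tape = Tape()
x = Scalar(3.0, tape)
w = Scalar(2.0, tape)
b = Scalar(1.0, tape)

y = x * w + b      # the forward pass records two steps on the tape
y.backward()

print(y.value, x.grad, w.grad, b.grad)
```

Since the tape is filled in as ordinary Python code executes, the recorded computation can differ from one run to the next, for example when the forward pass contains data-dependent branches or loops.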


Model Deployment:

For high-performance inference deployment for PyTorch trained models, export to ONNX format and optimize and deploy with NVIDIA TensorRT inference accelerator.

Learning Resources


TensorFlow

TensorFlow is an open-source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code. For visualizing TensorFlow results, TensorFlow offers TensorBoard, a suite of visualization tools.


Model Deployment:

For high-performance inference deployment for TensorFlow trained models, you can:

  1. Use TensorFlow-TensorRT integration to optimize models within TensorFlow and deploy with TensorFlow.
  2. Export TensorFlow models and import, optimize, and deploy with NVIDIA TensorRT's built-in TensorFlow model importer.

Learning Resources


Chainer

Chainer is a Python-based deep learning framework that aims for flexibility. It provides automatic differentiation APIs based on the define-by-run approach, also known as dynamic computational graphs, as well as object-oriented high-level APIs to build and train neural networks. It supports CUDA and cuDNN through CuPy for high-performance training and inference.


Model Deployment:

For high-performance inference deployment for Chainer trained models, export to ONNX format and optimize and deploy with NVIDIA TensorRT inference accelerator.

Learning Resources


PaddlePaddle

PaddlePaddle provides an intuitive and flexible interface for loading data and specifying model structures. It supports CNNs, RNNs, and multiple variants, and makes it easy to configure complicated deep models.

It also provides highly optimized operations, memory recycling, and network communication. PaddlePaddle makes it easy to scale heterogeneous computing resources and storage to accelerate the training process.

Learning Resources