NVIDIA Deep Learning Institute Online Labs

Online Self-Paced Labs

The NVIDIA Deep Learning Institute (DLI) offers hands-on training for developers, data scientists, and researchers looking to solve the world’s most challenging problems with deep learning and accelerated computing.

Through self-paced online labs and instructor-led workshops, DLI provides training on the latest techniques for designing, training, and deploying neural networks across a variety of application domains. DLI also teaches you how to optimize your code for performance using NVIDIA CUDA® and OpenACC.

Create an account to take a two-hour hands-on lab online.

Deep Learning Labs

Applications of Deep Learning with Caffe, Theano, and Torch

Prerequisites: None
Industry: Fundamentals
Frameworks: Caffe, Theano, Torch

Learn how deep learning will change the future of computing. In this hands-on session (no technical background required), you will:

  • Compare deep learning to traditional methods
  • Run training and inference with three different deep learning frameworks
  • Learn how deep learning works and why the GPU is integral

Upon completion, you will be better equipped to decide how you or your organization can get started with deep learning.

Start this FREE lab


Image Classification with DIGITS

Prerequisites: None
Industry: Fundamentals
Framework: Caffe

Deep learning enables entirely new solutions by replacing hand-coded instructions with models learned from examples. Train a deep neural network to recognize handwritten digits by:

  • Loading image data to a training environment
  • Choosing and training a network
  • Testing with new data and iterating to improve performance

Upon completion, you will be able to assess which data you should use for training.

Start this FREE lab


Object Detection with DIGITS

Prerequisites: Image Classification with DIGITS
Industry: Fundamentals
Framework: Caffe

Deep learning has established solutions to many problems, but sometimes a problem is unique. Create your own solution to detect whale faces from aerial images by:

  • Combining traditional computer vision with deep learning
  • Performing minor “brain surgery” on an existing neural network using the deep learning framework Caffe
  • Hiring an army of experts to build your dream network

Upon completion, you will be able to solve unique problems with deep learning.

Start lab


Neural Network Deployment with DIGITS and TensorRT

Prerequisites: Image Classification with DIGITS
Industry: Fundamentals
Framework: Caffe

Deep learning allows us to map inputs to outputs, but the resulting models can be extremely computationally intensive to run. Learn to deploy deep learning to applications that recognize images and detect pedestrians in real time by:

  • Accessing and understanding the files that make up a trained model
  • Building from each function’s unique input and output
  • Optimizing the most computationally intense parts of your application for different performance metrics like throughput and latency

Upon completion, you will be able to implement deep learning to solve problems in the real world.

Start lab


Image Segmentation with TensorFlow

Prerequisites: Image Classification with DIGITS
Industry: Fundamentals
Framework: TensorFlow

Image (or semantic) segmentation is the task of placing each pixel of an image into a specific class. In this lab, you will segment MRI images to measure parts of the heart by:

  • Comparing image segmentation with other computer vision problems
  • Experimenting with TensorFlow tools such as TensorBoard and the TensorFlow Python API
  • Learning to implement effective metrics for assessing model performance

Upon completion, you will be able to set up most computer vision workflows using deep learning.

Start lab


Linear Classification with TensorFlow

Prerequisites: None
Industry: Fundamentals
Framework: TensorFlow

Learn to make predictions from structured data using TensorFlow’s TFLearn API. Through the challenge of predicting a person’s income when given the rest of their census data, you will learn to:

  • Load, view, and organize data from a CSV for machine learning
  • Split an existing dataset into features and labels (input and output) of a neural network
  • Train and evaluate a linear model
  • Build from linear to deep models and assess the difference in performance

Upon completion, you will be able to make predictions from your own structured data.

Start lab


Signal Processing Using DIGITS

Prerequisites: None
Industry: Fundamentals
Framework: Caffe

The fact that deep neural networks are better at classifying images than humans has implications beyond what we typically think of as computer vision. In this lab, you will convert Radio Frequency (RF) signals into images to detect a weak signal corrupted by noise and learn:

  • How non-image data can be treated like image data
  • How to implement a deep learning workflow (load, train, test, adjust) in DIGITS
  • Programmatic ways to test performance and guide performance improvement

Upon completion, you will be able to classify both image and image-like data using deep learning.

Start lab


Deep Learning Workflows with TensorFlow, MXNet, and NVIDIA-Docker

Prerequisites: Bash terminal familiarity
Industry: Fundamentals
Frameworks: TensorFlow, MXNet

The NVIDIA-Docker plugin makes it possible to containerize production-grade deep learning workflows using GPUs. Learn to considerably reduce host configuration and administration by:

  • Learning to work with Docker images and manage the container lifecycle
  • Accessing images on the public Docker image registry Docker Hub for maximum reuse in creating composable lightweight containers
  • Training neural networks using both TensorFlow and MXNet frameworks

Upon completion, you will be able to containerize and distribute pre-configured images for deep learning.

Start lab


Deep Learning for Genomics using DragoNN with Keras and Theano

Prerequisites: Basic understanding of genomics
Industry: Healthcare
Frameworks: Keras, Theano

Learn to interpret deep learning models to discover predictive genome sequence patterns. Use the DragoNN toolkit on simulated and real regulatory genomic data to:

  • Demystify popular DragoNN (Deep Regulatory Genomics Neural Network) architectures
  • Explore guidelines for modeling and interpreting regulatory sequence using DragoNN models
  • Identify when DragoNN is a good choice for a learning problem in genomics and how to obtain high-performance models

Upon completion, you will be able to use the discovery of predictive genome sequence patterns to gain new biological insights.

Start lab


Image Classification with TensorFlow: Radiomics - 1p19q Chromosome Status Classification

Prerequisites: Basic understanding of convolutional neural networks and genomics
Industry: Healthcare
Framework: TensorFlow

Thanks to work being performed at the Mayo Clinic, using deep learning techniques to predict genomic biomarkers from MRI imaging (Radiomics) has led to more effective treatments and better health outcomes for patients with brain tumors. Learn to detect the 1p19q co-deletion biomarker by:

  • Designing and training Convolutional Neural Networks (CNNs)
  • Using Imaging Genomics (Radiomics) to create biomarkers that identify the genomics of a disease without the use of an invasive biopsy
  • Exploring the Radiogenomics work being done at the Mayo Clinic

Upon completion, you will have unique insight into the promising results of using deep learning to predict genomic biomarkers from imaging.

Start lab


Medical Image Analysis with R and MXNet

Prerequisites: Image Classification with DIGITS
Industry: Healthcare
Framework: MXNet

Convolutional neural networks (CNNs) can be applied to medical image analysis to infer patient status from imagery that goes beyond everyday photographs. Train a CNN to infer the volume of the left ventricle of the human heart from time-series MRI data and learn to:

  • Extend a canonical 2D CNN to more complex data
  • Use the framework MXNet through the standard Python API and through R
  • Process high dimensionality imagery that may be volumetric and have a temporal component

Upon completion, you will know how to apply CNNs to complex medical imagery, including volumetric and time-series data.

Start lab


Medical Image Segmentation with DIGITS

Prerequisites: None
Industry: Healthcare
Framework: Caffe

Image (or semantic) segmentation is the task of placing each pixel of an image into a specific class. In this lab, you will segment MRI images to measure parts of the heart by:

  • Extending Caffe with custom Python layers
  • Implementing the process of transfer learning
  • Creating fully convolutional neural networks from popular image classification networks

Upon completion, you will be able to set up most computer vision workflows using deep learning.

Start lab


Modeling Time Series Data with Recurrent Neural Networks in Keras

Prerequisites: Some experience training CNNs
Industry: Healthcare
Frameworks: Keras, Theano

Recurrent Neural Networks (RNNs) allow models to classify or forecast time-series data, like natural language, markets, and in the case of this lab, a patient’s health over time. You will:

  • Create training and testing datasets using electronic health records stored in HDF5 (Hierarchical Data Format version 5)
  • Prepare datasets for use with recurrent neural networks (RNNs), which allows modeling of very complex data sequences
  • Construct a long short-term memory (LSTM) model, a specific RNN architecture, using the Keras library running on top of Theano, and evaluate model performance against baseline data

Upon completion, you will be able to model time-series data using Recurrent Neural Networks.

Start lab

Accelerated Computing Labs

Accelerating Applications with CUDA C/C++

Prerequisites: None
Industry: Accelerated Computing

Learn how to accelerate your C/C++ application using CUDA to harness the massively parallel power of NVIDIA GPUs. You will program with CUDA to:

  • Accelerate SAXPY algorithms
  • Accelerate Matrix Multiply algorithms
  • Accelerate heat conduction algorithms

Upon completion, you will be able to use the CUDA platform to accelerate C/C++ applications.
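
For a sense of the style of code this lab covers, here is a minimal, illustrative CUDA C sketch of the SAXPY exercise; the unified-memory allocation and launch configuration are simplifying assumptions, not the lab’s exact source:

    #include <cuda_runtime.h>
    #include <stdio.h>

    // SAXPY: y = a*x + y, computed with one GPU thread per element
    __global__ void saxpy(int n, float a, const float *x, float *y)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            y[i] = a * x[i] + y[i];
    }

    int main(void)
    {
        const int n = 1 << 20;
        float *x, *y;
        cudaMallocManaged(&x, n * sizeof(float));   // unified memory keeps the sketch short
        cudaMallocManaged(&y, n * sizeof(float));
        for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

        saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);   // enough 256-thread blocks to cover n
        cudaDeviceSynchronize();

        printf("y[0] = %f\n", y[0]);   // expect 4.0
        cudaFree(x);
        cudaFree(y);
        return 0;
    }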

Start lab


GPU Memory Optimizations with CUDA C/C++

Prerequisites: Basic CUDA C/C++ competency
Industry: Accelerated Computing

In this lab, you'll learn useful memory optimization techniques to use when programming with CUDA C/C++ on an NVIDIA GPU, and how to use the NVIDIA Visual Profiler (NVVP) to support these optimizations. You will:

  • Implement a naive matrix transpose algorithm
  • Perform several cycles of profiling the algorithm with NVVP and then optimizing its performance

Upon completion, you will have learned how to analyze and improve both global and shared memory access patterns, and will be able to optimize your accelerated C/C++ applications.
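
For a flavor of the optimization explored here, the sketch below contrasts a naive transpose kernel with a shared-memory tiled version. It assumes square matrices and 32×32 thread blocks and is illustrative only, not the lab’s exact code:

    // Naive transpose: reads are coalesced, but writes stride through memory
    __global__ void transposeNaive(float *out, const float *in, int n)
    {
        int x = blockIdx.x * 32 + threadIdx.x;
        int y = blockIdx.y * 32 + threadIdx.y;
        if (x < n && y < n)
            out[x * n + y] = in[y * n + x];
    }

    // Staging a 32x32 tile in shared memory makes both reads and writes coalesced;
    // the +1 padding avoids shared-memory bank conflicts
    __global__ void transposeShared(float *out, const float *in, int n)
    {
        __shared__ float tile[32][32 + 1];
        int x = blockIdx.x * 32 + threadIdx.x;
        int y = blockIdx.y * 32 + threadIdx.y;
        if (x < n && y < n)
            tile[threadIdx.y][threadIdx.x] = in[y * n + x];
        __syncthreads();
        x = blockIdx.y * 32 + threadIdx.x;   // swap block offsets for the write
        y = blockIdx.x * 32 + threadIdx.y;
        if (x < n && y < n)
            out[y * n + x] = tile[threadIdx.x][threadIdx.y];
    }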

Start lab


Accelerating Applications with GPU-Accelerated Libraries in CUDA C/C++

Prerequisites: Basic CUDA C/C++ competency
Industry: Accelerated Computing

Learn how to accelerate your C/C++ application using drop-in libraries to harness the massively parallel power of NVIDIA GPUs. You will work through three exercises, including:

  • Using cuBLAS to accelerate a matrix multiplication algorithm
  • Combining libraries by adding cuRAND API calls to the previous cuBLAS calls
  • Using nvprof to profile and further optimize your accelerated code

Upon completion, you will be able to utilize CUDA optimized libraries to accelerate your C/C++ applications.
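
For reference, a minimal sketch of the kind of cuBLAS call involved; it assumes dA, dB, and dC are device pointers holding column-major n×n single-precision matrices, and omits error checking:

    #include <cublas_v2.h>

    // C = alpha*A*B + beta*C using cuBLAS single-precision GEMM.
    // cuBLAS follows the BLAS convention of column-major storage.
    void gemm_example(const float *dA, const float *dB, float *dC, int n)
    {
        cublasHandle_t handle;
        cublasCreate(&handle);

        const float alpha = 1.0f, beta = 0.0f;
        cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                    n, n, n,          // m, n, k for square matrices
                    &alpha, dA, n,    // A and its leading dimension
                    dB, n,            // B and its leading dimension
                    &beta, dC, n);    // C and its leading dimension

        cublasDestroy(handle);
    }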

Start lab


Using Thrust to Accelerate C++

Prerequisites: Basic CUDA C/C++ competency
Industry: Accelerated Computing

Thrust is a parallel algorithms library loosely based on the C++ Standard Template Library that allows developers to quickly embrace the power of parallel computing. Thrust code can be compiled to run on massively parallel NVIDIA GPUs, as well as on multicore CPUs through OpenMP and Intel’s Threading Building Blocks.
In this lab, you will learn the following Thrust features, and incorporate them all into a case study:

  • Iterators, containers, and functions
  • Porting to CPU processing
  • Exception and error handling

Upon completion, you will be able to build GPU-accelerated applications in C/C++ that utilize the powerful Thrust library.
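
A minimal, illustrative sketch of the Thrust style this lab covers, combining containers, algorithms, and a built-in functor (not the lab’s case-study code):

    #include <thrust/device_vector.h>
    #include <thrust/transform.h>
    #include <thrust/reduce.h>
    #include <thrust/functional.h>
    #include <iostream>

    int main()
    {
        const int n = 1 << 20;
        thrust::device_vector<float> x(n, 1.0f);   // containers manage device memory
        thrust::device_vector<float> y(n, 2.0f);

        // Algorithm + functor: y = x + y, executed on the GPU
        thrust::transform(x.begin(), x.end(), y.begin(), y.begin(),
                          thrust::plus<float>());

        // Parallel reduction on the device; prints 3 * n
        float sum = thrust::reduce(y.begin(), y.end(), 0.0f, thrust::plus<float>());
        std::cout << "sum = " << sum << std::endl;
        return 0;
    }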

Start lab


Accelerating Applications with GPU-Accelerated Libraries in Python

Prerequisites: None
Industry: Accelerated Computing

Learn how to accelerate your Python application using GPU drop-in libraries to harness the massively parallel power of NVIDIA GPUs. You will work through three exercises, including:

  • Using a Python profiler to determine which part of the application would benefit most from acceleration
  • Using a cuRAND API call to optimize the application
  • Profiling and optimizing again, using the CUDA Runtime API to improve data movement

Upon completion, you will be ready to start accelerating your Python applications using CUDA and CUDA optimized libraries.

Start lab


Accelerating Applications with CUDA Fortran

Prerequisites: None
Industry: Accelerated Computing

Learn how to accelerate your Fortran application using CUDA to harness the massively parallel power of NVIDIA GPUs. You will work through three exercises:

  • Accelerating SAXPY algorithms
  • Accelerating Matrix Multiply algorithms
  • Accelerating heat conduction algorithms

Upon completion, you will be able to use the CUDA platform to accelerate Fortran applications.

Start lab


GPU Memory Optimizations with CUDA Fortran

Prerequisites: Accelerating Applications with CUDA Fortran
Industry: Accelerated Computing

In this lab, you'll learn useful memory optimization techniques to use when programming with CUDA Fortran on an NVIDIA GPU, and how to use the NVIDIA Visual Profiler (NVVP) to support these optimizations. You will:

  • Implement a naive matrix transpose algorithm
  • Perform several cycles of profiling the algorithm with NVVP and then optimizing its performance

Upon completion, you will have learned how to analyze and improve both global and shared memory access patterns, and will be able to optimize your accelerated Fortran applications.

Start lab


Accelerating Applications with GPU-Accelerated Libraries in Fortran

Prerequisites: Basic CUDA Fortran competency
Industry: Accelerated Computing

Learn how to accelerate your Fortran application using drop-in libraries to harness the massively parallel power of NVIDIA GPUs. You will work through three exercises:

  • Using cuBLAS to accelerate a matrix multiplication algorithm
  • Combining libraries by adding cuRAND API calls to the previous cuBLAS calls
  • Using nvprof to profile and further optimize your accelerated code

Upon completion, you will be able to utilize CUDA optimized libraries to accelerate your Fortran applications.

Start lab


Introduction to Accelerated Computing

Prerequisites: None
Industry: Accelerated Computing

This lab will expose you to a collection of techniques for accelerating applications. You will:

  • Use CUDA-accelerated libraries to accelerate application code
  • Use compiler directives like OpenACC to accelerate application code
  • Use the CUDA platform to accelerate application code

Upon completion, you will be ready to accelerate your applications using a variety of acceleration techniques.

Start lab


OpenACC – 2X in 4 Steps

Prerequisites: None
Industry: Accelerated Computing

Learn how to accelerate your C/C++ or Fortran application using OpenACC, a directive-based approach to parallel computing, in order to harness the massively parallel power of NVIDIA GPUs. In this lab, you will learn the following four-step approach to utilizing OpenACC in your application:

  • Characterize and profile your application
  • Add compute directives
  • Add directives to optimize data movement
  • Optimize your application using kernel scheduling

Upon completion, you will be able to use OpenACC to accelerate your C/C++ or Fortran applications.
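
As a taste of what the compute and data directives from steps two and three look like, here is an illustrative C sketch of a single stencil sweep; the function, array names, and sizes are assumptions rather than the lab’s code:

    // One sweep of a simple averaging stencil, decorated with OpenACC directives
    void stencil_sweep(const float *restrict A, float *restrict Anew, int n, int m)
    {
        // Keep both arrays resident on the GPU for the whole region;
        // copy(Anew...) also preserves the host's boundary values
        #pragma acc data copyin(A[0:n*m]) copy(Anew[0:n*m])
        {
            // Ask the compiler to parallelize the collapsed loop nest on the GPU
            #pragma acc parallel loop collapse(2)
            for (int j = 1; j < n - 1; ++j)
                for (int i = 1; i < m - 1; ++i)
                    Anew[j*m + i] = 0.25f * (A[j*m + i+1] + A[j*m + i-1]
                                           + A[(j+1)*m + i] + A[(j-1)*m + i]);
        }
    }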

Start lab


Profiling and Parallelizing with OpenACC

Prerequisites: Basic OpenACC competency
Industry: Accelerated Computing

In this lab, you will gain experience with the first two steps of the OpenACC programming cycle: identifying critical application paths, and expressing parallelism. You will:

  • Profile a provided C or Fortran application using NVIDIA’s nvprof profiler
  • Use the PGI OpenACC compiler to accelerate the application

Upon completion, you will be able to identify candidate application paths for acceleration and accelerate them with OpenACC.

Start lab


Expressing Data Movement and Optimizing Loops with OpenACC

Prerequisites: Basic OpenACC competency
Industry: Accelerated Computing

This lab continues the work completed in the lab Profiling and Parallelizing with OpenACC by:

  • Adding OpenACC data management directives
  • Optimizing the code using the OpenACC loop directive
  • Optimizing your application using kernel scheduling

Upon completion, you will be able to use the PGI compiler and NVIDIA Visual Profiler to optimize accelerated applications.
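
For illustration, a sketch of how data clauses and the loop directive can tune where data lives and how a loop nest is scheduled; the gang/vector split and the vector length of 128 are assumed tuning choices, not the lab’s code:

    // Distribute the outer loop across gangs and run the inner loop in
    // vector fashion; data clauses keep the arrays resident on the GPU
    void stencil_tuned(const float *restrict A, float *restrict Anew, int n, int m)
    {
        #pragma acc parallel loop gang vector_length(128) \
            copyin(A[0:n*m]) copy(Anew[0:n*m])
        for (int j = 1; j < n - 1; ++j) {
            #pragma acc loop vector
            for (int i = 1; i < m - 1; ++i)
                Anew[j*m + i] = 0.25f * (A[j*m + i+1] + A[j*m + i-1]
                                       + A[(j+1)*m + i] + A[(j-1)*m + i]);
        }
    }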

Start lab


Pipelining Work on the GPU with OpenACC

Prerequisites: Basic OpenACC competency
Industry: Accelerated Computing

This lab teaches OpenACC programmers the parallel programming performance optimization technique of pipelining, which reduces or eliminates the overhead of copying data to and from device memory. In this lab, you will:

  • Use the OpenACC routine directive to allow on-device function calls
  • Learn to run GPU code asynchronously
  • Overlap GPU computation and PCIe data motion

Upon completion, you will be able to leverage pipelining to reduce the overhead of data movement between host and device in your accelerated applications.
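
To make the pattern concrete, here is an illustrative C sketch of pipelining with OpenACC: chunks are processed on two asynchronous queues so the data transfer of one chunk overlaps the computation of another. The work function, chunking scheme, and queue count are assumptions, not the lab’s code:

    // The routine directive compiles this function for the device so the
    // parallel loop below can call it
    #pragma acc routine seq
    static float do_work(float x) { return 2.0f * x; }

    void process_pipelined(float *a, int n, int nchunks)
    {
        int chunk = n / nchunks;            // assumes nchunks evenly divides n
        #pragma acc data create(a[0:n])
        {
            for (int c = 0; c < nchunks; ++c) {
                int start = c * chunk;
                // Alternate between queues 0 and 1 so copies and compute overlap
                #pragma acc update device(a[start:chunk]) async(c % 2)
                #pragma acc parallel loop async(c % 2)
                for (int i = start; i < start + chunk; ++i)
                    a[i] = do_work(a[i]);
                #pragma acc update host(a[start:chunk]) async(c % 2)
            }
            #pragma acc wait                // drain both queues before leaving
        }
    }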

Start lab


Introduction to Multi GPU Programming with MPI and OpenACC

Prerequisites: Basic OpenACC and MPI competency
Industry: Accelerated Computing

In this lab, you will learn how to program multi-GPU systems using MPI and OpenACC. The topics covered by this lab are:

  • Exchanging data between different GPUs using CUDA-aware MPI and OpenACC
  • Overlapping communication with computation to hide communication times
  • Using the NVIDIA performance analysis tools to support optimizations

Upon completion, you will be able to combine OpenACC and MPI in order to accelerate your applications across a cluster of GPUs.
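
As a small illustration of the core idea, the sketch below exchanges halo rows of a row-decomposed grid directly from GPU memory. It assumes the array is already present on the device and that the MPI library is CUDA-aware; the names and decomposition are illustrative, not the lab’s code:

    #include <mpi.h>

    // Exchange halo rows of a row-decomposed 2D grid directly from GPU memory.
    // host_data use_device hands the device address of u straight to MPI,
    // so no staging through host buffers is needed.
    void exchange_halos(float *u, int n, int m, int top, int bottom)
    {
        #pragma acc host_data use_device(u)
        {
            // Send the first interior row to the neighbor above and receive
            // this rank's bottom halo row from the neighbor below
            MPI_Sendrecv(&u[1 * m],       m, MPI_FLOAT, top,    0,
                         &u[(n - 1) * m], m, MPI_FLOAT, bottom, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }
    }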

Start lab


Profile-Driven Approach to Accelerate Seismic Application with OpenACC

Prerequisites: None
Industry: Accelerated Computing

In this lab, you will use PGPROF, a host and GPU profiling tool, to accelerate an open-source seismic processing application. The lab follows the four-stage APOD (Assess, Parallelize, Optimize, Deploy) development cycle, and in it you will:

  • Assess critical regions of the application, profile baseline CPU code, and decorate key loops with directives in order to parallelize them
  • Use profiles and verbose compiler output to add data directives, optimize, and measure the resulting performance
  • Use the compiler’s “multicore” option with OpenACC directives for portable performance

Upon completion, you will be able to use PGPROF with OpenACC to accelerate your C/C++ applications.

Start lab


Advanced Multi GPU Programming with MPI and OpenACC

Prerequisites: Introduction to Multi GPU Programming with MPI and OpenACC
Industry: Accelerated Computing

Learn how to improve a multi-GPU MPI and OpenACC program by exploring:

  • Overlapping communication with computation to hide communication times
  • Handling noncontiguous halo updates with a 2D tiled domain decomposition

Upon completion, you will be able to optimize the performance of your OpenACC and MPI applications running on a cluster of GPUs.

Start lab


Want more? Visit www.nvidia.com/DLI for upcoming DLI workshops, educational deep learning resources, and more.

Join the NVIDIA Developer Program for access to the latest software, tools, and information you need to develop applications using NVIDIA solutions.