Existing University Courses

This page lists online courses to help you get started programming or teaching CUDA, as well as links to universities teaching CUDA.

This page is organized into three sections to get you started:

Introductory CUDA Technical Training Courses
CUDA University Courses
CUDA Seminars and Tutorials

Introductory CUDA Technical Training Courses

Udacity: CS344 Intro To Parallel Programming
Volume I: Introduction to CUDA Programming
- Exercises (for Linux and Mac)
- Visual Studio Exercises (for Windows)
- Instructions for Exercises
Volume II: CUDA Case Studies

Check out our CUDAcasts playlist on YouTube

CUDA University Courses

University of Illinois: Current Course: ECE408/CS483
Taught by Professor Wen-mei W. Hwu and David Kirk, NVIDIA CUDA Scientist.

Introduction to GPU Computing (60.2 MB)
CUDA Programming Model (75.3 MB)
CUDA API (32.4 MB)
Simple Matrix Multiplication in CUDA (46.0 MB)
CUDA Memory Model (109 MB)
Shared Memory Matrix Multiplication (81.4 MB)
Additional CUDA API Features (22.4 MB)
Useful Information on CUDA Tools (15.7 MB)
Threading Hardware (140 MB)
Memory Hardware (85.8 MB)
Memory Bank Conflicts (115 MB)
Parallel Thread Execution (32.6 MB)
Control Flow (96.6 MB)
Precision (137 MB)

These classes are each downloadable CUDAcasts with video pre-scaled to be compatible with major players.

All PowerPoint class presentations can be found on the Fall 2014 webpage: ECE408/CS483

Stanford University: CS 193G: Programming Massively Parallel Processors with CUDA
Taught by Jared Hoberock and David Tarjan

Course Materials

University of Oxford: CUDA Programming on NVIDIA GPUs
Taught by Mike Giles, Professor

Course Materials

UC Davis: EE171: Parallel Computer Architecture
Taught by John Owens, Associate Professor

Course Materials

University of Sheffield: COM4521: Parallel Computing with GPUs
Taught by Paul Richmond

Course Materials

CUDA Seminars and Tutorials

GPU Technology Conference: search for recordings
SC10
- NVIDIA GPU Computing Theatre
SC09
- NVIDIA GPU Computing Theatre
- SC09 Tutorial: High Performance Computing with CUDA
SC08 Tutorial: High Performance Computing with CUDA
SC07 Tutorial: High Performance Computing with CUDA
Dr. Dobb's Article Series
- CUDA, Supercomputing for the Masses: Part 1 : CUDA lets you work with familiar programming concepts..
- CUDA, Supercomputing for the Masses: Part 2 : A first kernel
- CUDA, Supercomputing for the Masses: Part 3 : Error handling and global memory performance limitations
- CUDA, Supercomputing for the Masses: Part 4 : Understanding and using shared memory (1)
- CUDA, Supercomputing for the Masses: Part 5 : Understanding and using shared memory (2)
- CUDA, Supercomputing for the Masses: Part 6 : Global memory and the CUDA profiler
- CUDA, Supercomputing for the Masses: Part 7 : Double the fun with next-generation CUDA hardware
- CUDA, Supercomputing for the Masses: Part 8 : Using libraries with CUDA
- CUDA, Supercomputing for the Masses: Part 9 : Extending High-level Languages with CUDA
- CUDA, Supercomputing for the Masses: Part 10 : CUDPP, a powerful data-parallel CUDA library
- CUDA, Supercomputing for the Masses: Part 11 : Revisiting CUDA memory spaces
- CUDA, Supercomputing for the Masses: Part 12 : CUDA 2.2 changes the data movement paradigm
- CUDA, Supercomputing for the Masses: Part 13 : Using texture memory in CUDA
- CUDA, Supercomputing for the Masses: Part 14 : Debugging CUDA and using CUDA-GDB
- CUDA, Supercomputing for the Masses: Part 15 : Using Pixel Buffer Objects with CUDA and OpenGL
- CUDA, Supercomputing for the Masses: Part 16 : CUDA 3.0 provides expanded capabilities
- CUDA, Supercomputing for the Masses: Part 17 : CUDA 3.0 provides expanded capabilities and makes development easier
- CUDA, Supercomputing for the Masses: Part 18 : Using Vertex Buffer Objects with CUDA and OpenGL
- CUDA, Supercomputing for the Masses: Part 19 : Parallel Nsight Part 1: Configuring and Debugging Applications
- CUDA, Supercomputing for the Masses: Part 20 : Parallel Nsight Part 2: Using the Parallel Nsight Analysis capabilities
- CUDA, Supercomputing for the Masses: Part 21 : The Fermi architecture and CUDA
- Unified Memory in CUDA 6: A Brief Overview
