NVIDIA Developer Zone

GPU Computing Webinars

Parallel programming has never been this easy. The CUDA programming model, tools and powerful libraries have provided the foundation; this webinar series will fuel your development. Get trained directly by GPU Computing experts in CUDA, OpenCL and DirectCompute, and find out about the latest developments from companies around the world leading the GPU Computing revolution.

Advance registration is required. You will be kept informed of updates and future webinars, added to our CUDA Newsletter mailing list, and invited to become a registered developer.

Previously recorded sessions    Additional Parallel Nsight and Tools Webinar Records  GTC Express Webinar Series

IMPORTANT NOTE: Some of the webinars are marked "Reg.Dev Priority". These are special webinars that are part of the membership benefits of our free-to-join CUDA Registered Developer Program. To join, complete the short application: Apply Now. Members will be given priority registration when the webinars are oversubscribed.

New Webinars - Sign Up Now

Series | Webinar Title and Brief Description | Registration Links (Pacific Time)
CUDA Partner

Introduction to Bright Cluster Manager - Advanced Clusters Made Easy
Learn how you can manage your GPU cluster with this powerful tool. The session will include a technical feature overview and a live Q&A.

To be rescheduled

CUDA Partner

CUDA X86 - Running your CUDA Code on multi-core CPUs
PGI's CUDA X86 compiler enables developers to create a single code base using CUDA C/C++ optimized for parallel execution on systems with and without GPU Computing acceleration. This webinar will provide an overview and insight into this powerful solution.
Featuring a live Q&A session and technical presenters from The Portland Group.

Jan 31
10am(PST)
Register Now

CUDA
Intro
CUDA Toolkit 4.1 - Technical & Performance Overview
Now in production, this release features many improvements, including a new LLVM-based compiler, over 1000 new image processing functions, major improvements in the Visual Profiler, and much more. Presented by NVIDIA's CUDA PM, featuring a live Q&A session.

Feb 1,
10am(PST)
Register Now

Feb 3
10am
(Indian Standard Time)
Register Now

OpenACC
Intro
 

OpenACC 1.0 - Technical Overview
OpenACC is the new standard for compiler directives for parallel programming, enabling GPU acceleration with just hours of programming effort. It is a major new initiative providing an open standard for compiler hints, or directives, the easiest way to leverage the performance of parallel computers. This webinar will provide a technical overview of the OpenACC API 1.0 specification, featuring a live Q&A session with members of the OpenACC board.

Feb 14,
10am (PST)
Register Now
GTC Express

Debugging CUDA with TotalView
With TotalView 8.9.2 and the NVIDIA CUDA add-on, you can debug both the CPU and the GPU code in applications that use CUDA. You can set breakpoints, step, and dive in code running on the CUDA device using all the familiar TotalView GUI methods. TotalView supports unified virtual addressing and multi-device debugging, handles CUDA function inlining, and provides type qualification in the expression system. You can display how your logical threads are being mapped to hardware and navigate kernel threads using either hardware or logical coordinates.

The webinar will also preview the upcoming TotalView 8.10 with support for CUDA 4.1.

Feb 22,
9am(PST)
Register Now

Previous Webinars 

Webinar Title Links to recordings 
5x in 5 Hours: Accelerating SEISMIC_CPML Using High-level GPU Programming
Programming GPU accelerators involves 3 basic aspects: splitting the source code between host and GPU, managing data allocation and movement between host memory and GPU memory, and optimizing GPU kernels. Much of this process can be automated using modern compiler technology and high level programming techniques. In this webinar, Mat Colgrove, Applications Engineer for The Portland Group, will present a case study on using PGI Accelerator compiler directives to achieve a 5x speed-up in approximately 5 hours of programming time on this popular geophysics code.
Links Coming Soon

Debugging workshop for CUDA 4.1 using Allinea DDT
Presented by David Lecomber, CTO, Allinea Software. Learn how Allinea DDT for CUDA can address your GPU debugging requirements. See how easy and powerful it is to debug your code on the host CPU and CUDA boards.
The agenda for this meeting:
A) Overview of the capabilities of the DDT debugger
B) Multiple use cases that highlight how DDT can improve programmer productivity
C) Mini training on how to install, configure and run the debugger

With Live Q&A.

Links Coming Soon
CUDA Toolkit 4.1 Feature Overview
Now available for all developers, this new release features 3 major improvements: a new LLVM-based compiler, over 1000 new image processing functions, and major improvements in the Visual Profiler, plus much more. Presented by NVIDIA's CUDA PM, featuring a Q&A session. Updated Dec 2011.

Video(mp4)

PDF

Heterogeneous Data-Parallel Programming
Presented by Satnam Singh, Professor of Reconfigurable Computing, School of Computer Science, University of Birmingham (UK)
Easy and efficient data-parallel programming will be essential for getting performance from today's massively parallel systems. In this talk Dr Singh will share his vision and demonstrate the novel Microsoft Accelerator system, a language-neutral solution that works on NVIDIA GPUs.
Video(mp4)
Getting Started with TotalView 8.9.2 and CUDA 4.0
Presented by Chris Gottbrath, Principal Product Manager, Rogue Wave
With TotalView 8.9.2 and the CUDA add-on you can debug both the CPU and the GPU code. Set breakpoints, step, and dive in code running on the GPU using all the familiar TotalView GUI methods. TotalView supports many advanced features such as UVA and multi-GPU debugging.
Video(mp4)
CUDA Optimization : Register Spilling and Local Memory Usage with Live Q&A  by Dr Paulius Micikevicius, NVIDIA
Final installment of our optimization series.
Focusing on one of the more subtle performance limiters, this webinar provides strategies that you will be able to implement immediately to extract extra performance from your application.

Video(mp4)

PDF

Trace Based Performance Analysis for GPUs Using Vampir Trace Collector
Learn how to use Vampir Trace Collector event logging to identify performance bottlenecks. Understand how the graphical analysis of this data provides insight into how the various layers of parallelism interact, so that you can see what is really happening.
Video(mp4)
PGI Accelerator for Fortran - Simplified GPU Programming Using Directives, by Michael Wolfe, Portland Group
This webinar provides an overview and real examples of how to parallelize your application with intuitive compiler directives. A fast and easy complement to programming directly in CUDA Fortran or CUDA C/C++. 
Video(mp4)

Overview and Usage of LibJacket CUDA Library, Presented by AccelerEyes
A powerful library with hundreds of fast prepacked convolutions, reductions, matrix indexing, linear algebra, image processing, signal processing, and statistics functions. It handles N-dimensional data and scales to multiple GPUs. It also includes a powerful GFOR loop for running FOR-loop iterations in parallel on the GPU cores, and a graphics library.

Video(mp4)

PDF

CUDA Optimization : Memory Bandwidth Limited Kernels + Live Q&A  by Tim Schroeder
This webinar focuses on one of the main performance limiters and provides actionable ideas and strategies for performance optimization. Don't miss the live Q&A session which will follow.

Video(mp4)

PDF

PGI Accelerator for C -  Simplified GPU Programming Using Directives

Presented by Michael Wolfe, PGI's leading compiler expert, who provides a technical overview and real code examples using this exciting and innovative programming solution.

Video(mp4)
CUDA Optimization : Instruction Limited Kernels with Live Q&A  by Gernot Ziegler
Get field-proven advice on methods to improve kernels whose performance is limited by instructions. A live webinar not to be missed.

Video(mp4)

PDF

GPU Direct and Unified Virtual Addressing+ Live Q&A by Tim Schroeder
These new features of CUDA 4.0 have enabled huge opportunities. If you haven't looked at how to leverage them for your application, this is an excellent way to find out the latest hints and tips.

Video(mp4)
PDF

Multi-GPU and Host Multi-Threading Considerations+ Live Q&A by Dr Paulius Micikevicius
Get critical, field-proven tips from one of the world's leading CUDA experts.

Video(mp4) 
PDF

CUDA Optimization: Identifying Performance Limiters by Dr Paulius Micikevicius
A great opportunity to learn how to identify what is limiting the performance of your application. This is the first of a series of optimization webinars which will help you extract even more performance from your applications. A must-attend for all CUDA developers.

Video(mp4)

PDF

Introduction to CUDA Libraries+ Live Q&A by Dr Justin Luitjens
Get an insight into how to best leverage the powerful libraries that come as part of the CUDA Toolkit.

Video(mp4) 
PDF

CUDA Texture Memory & Graphics Interop + Live Q&A with Gernot Ziegler
Video(mp4)
Special Live Q&A with Numeric Library Team
Your opportunity to engage with NVIDIA Engineering and ask questions about CUDA and the CUDA Libraries. These fast-paced webinars are a must-attend for all CUDA developers.
Team Manager: Ujval Kapasi
Video(mp4)
CUDA Warps and Occupancy Considerations + Live Q&A with Dr Justin Luitjens, NVIDIA

Topics  covered in this talk include:
* Thread Blocks and Warps
* Tesla vs Fermi Warps
* Hiding memory access latency
* Smart use of profiler information
* Other considerations which may have a higher impact than occupancy

Presented by  Dr Justin Luitjens, GPU Computing Expert and NVIDIA Devtech
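
As a rough illustration of the thread block, warp and occupancy topics listed above, the following host-side C++ sketch estimates how many warps a block needs and an upper bound on occupancy. The `SmLimits` struct and both function names are hypothetical, and the default limits (32 threads per warp, 48 resident warps and 8 resident blocks per multiprocessor) are the commonly quoted Fermi-class figures, assumed here rather than taken from the webinar. Register and shared memory limits are deliberately ignored, so real occupancy may be lower.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-multiprocessor limits. The defaults are assumed
// Fermi-class values; adjust them for other GPU generations.
struct SmLimits {
    int warp_size = 32;          // threads per warp
    int max_warps_per_sm = 48;   // resident warps per multiprocessor
    int max_blocks_per_sm = 8;   // resident blocks per multiprocessor
};

// Number of warps one thread block occupies. A partially filled warp
// still takes a whole warp slot, hence the round-up.
int warps_per_block(int threads_per_block, const SmLimits& sm = SmLimits{}) {
    assert(threads_per_block > 0);
    return (threads_per_block + sm.warp_size - 1) / sm.warp_size;
}

// Upper bound on occupancy (resident warps / maximum warps), taking into
// account only the warp-slot and block-slot limits.
double occupancy_upper_bound(int threads_per_block,
                             const SmLimits& sm = SmLimits{}) {
    const int wpb = warps_per_block(threads_per_block, sm);
    const int blocks = std::min(sm.max_blocks_per_sm,
                                sm.max_warps_per_sm / wpb);
    return static_cast<double>(blocks * wpb) / sm.max_warps_per_sm;
}
```

Under these assumptions, a block of 256 threads can fill every warp slot, while a block of 32 threads is capped by the block-slot limit at 8 of 48 warps. To model a different generation, create an `SmLimits` object and change its fields before passing it in.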

Video(mp4)

PDF

Overview of Latest Release of CULA Tools, An Accelerated Linear Algebra Solution for Professionals by John Humphrey, Engineering Director, EM Photonics Inc

EM Photonics produces the popular CULA library for dense linear algebra functions using CUDA GPUs. CULA has hundreds of routines for system solutions, eigenvalue problems, and matrix factorizations. Recently, we have introduced an exciting new feature called the Link Interface, which is compile-time and link-time compatible with other matrix libraries. With the Link Interface, you re-link your application and CULA handles all the details, so you can try GPUs with very little effort! This webinar will cover the CULA library and show real-world usage of the Link Interface, including use in Matlab.

Video(mp4)
CUDA Shared Memory  & Cache + Live Q&A with Dr Steve Rennich, NVIDIA

Some of the topics discussed in this technical webinar include:
* Shared memory usage vs L2 Cache
* Shared memory banking overview
* Minimizing bank conflicts and maximizing performance
* Hints and tips for optimal use of Fermi's L2 Cache

Presented by Dr Steven Rennich, GPU Computing Expert and NVIDIA DevTech
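
The shared memory banking topic above can be explored with a small host-side C++ model. This is only a sketch: it assumes 32 banks that are each one 32-bit word wide and a warp of 32 threads (the usual Fermi-class description), and the function name `bank_conflict_degree` is hypothetical. It reports how many distinct words the worst-hit bank must serve when thread `t` reads word `t * stride`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Assumed hardware model: 32 banks, each one 32-bit word wide, and
// 32 threads per warp. Successive words map to successive banks.
constexpr int kNumBanks = 32;
constexpr int kWarpSize = 32;

// Worst-case number of distinct words requested from a single bank when
// thread t of a warp reads word index t * stride_in_words.
// A result of 1 means the access is conflict-free; threads that read the
// same word count once, since that case is served as a broadcast.
int bank_conflict_degree(int stride_in_words) {
    assert(stride_in_words >= 0);
    std::vector<std::set<int>> words_per_bank(kNumBanks);
    for (int t = 0; t < kWarpSize; ++t) {
        const int word = t * stride_in_words;
        words_per_bank[word % kNumBanks].insert(word);
    }
    std::size_t worst = 0;
    for (const auto& words : words_per_bank) {
        worst = std::max(worst, words.size());
    }
    return static_cast<int>(worst);
}
```

In this model a stride of 1 is conflict-free, a stride of 2 gives a 2-way conflict, and a stride of 32 sends every thread to the same bank. A stride of 33 is conflict-free again, which is the reasoning behind the common trick of padding a 32-column shared array to 33 columns.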
 
Video(mp4)
The Practical Reality of Heterogeneous Super Computing

This presentation will cover the typical concerns that developers and owners of HPC solutions have about GPU adoption and how they are being resolved; in particular, the concern that two code bases have to be maintained. The presentation emphasizes that GPU Computing has evolved to a point that it is now possible to write and maintain codes that will work on GPUs and CPUs using CUDA & CUDA x86.

Presented by Dr Rob Farber, Author and GPU Computing Expert

Video(mp4)

PDF

CUDA  Global Memory Usage & Strategy + Live Q&A with Dr Justin Luitjens, NVIDIA

Smartly using the CUDA memory model is critical for writing high performance code.
This talk will cover global memory coalescing and strategies for data structures and access patterns, followed by a live Q&A.
Presented by  Dr Justin Luitjens, GPU Computing Expert and NVIDIA Devtech
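
To make the link between access patterns and coalescing concrete, here is a host-side C++ sketch that counts how many aligned memory segments one warp touches for a strided access. The 128-byte segment size, the 32-thread warp and the aligned base address are assumptions of this model, and `segments_touched` is a hypothetical helper, not part of any CUDA API. Fewer segments per warp means better use of memory bandwidth.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Assumed model: a warp has 32 threads, global memory is fetched in
// aligned 128-byte segments, and the array starts on a segment boundary.
constexpr std::size_t kWarpThreads = 32;
constexpr std::size_t kSegmentBytes = 128;

// Number of distinct segments touched when thread t of a warp reads the
// element at index t * stride_in_elements.
int segments_touched(std::size_t element_size,
                     std::size_t stride_in_elements) {
    assert(element_size > 0 && element_size <= kSegmentBytes);
    std::set<std::size_t> segments;
    for (std::size_t t = 0; t < kWarpThreads; ++t) {
        const std::size_t first_byte = t * stride_in_elements * element_size;
        const std::size_t last_byte = first_byte + element_size - 1;
        // An element may straddle a boundary, so record both ends.
        segments.insert(first_byte / kSegmentBytes);
        segments.insert(last_byte / kSegmentBytes);
    }
    return static_cast<int>(segments.size());
}
```

With 4-byte elements, unit stride touches a single segment, a stride of 2 touches two, and a stride of 32 touches 32 segments, one per thread. The same counting shows why an array of 12-byte structures read with unit stride spreads one warp over three segments, which is one motivation for structure-of-arrays layouts.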

Video (mp4)
PDF  
How to accelerate the incomplete-LU and Cholesky preconditioned iterative methods on a GPU using CUSPARSE and CUBLAS libraries
Video (mp4)
WhitePaper
Introduction to GPU Computing & CUDA by Sarah Tariq, NVIDIA
Video (mp4)
PDF
Getting Started with CUDA & GPU Computing  + Live Q&A
A detailed review of how to get up and running using CUDA C/C++ by Sarah Tariq
Video (mp4)
Floating Point Capabilities and Accuracy of Latest NVIDIA GPUs

Video (mp4)
Slides(PDF)
WhitePaper

Introduction to MainConcept's CUDA H.264/AVC Encoder
Learn how to take advantage of NVIDIA GPUs with the new MainConcept CUDA H.264 Video Encoder, which delivers video encoding as much as 700% faster than on the CPU. Presented by MainConcept, an industry-leading supplier of video and audio codec solutions for the multimedia, consumer electronics, broadcast and professional video production, digital signage, medical, and security markets.
Video (.mp4)
Slides (PDF)
Monitoring and Managing GPU Clusters with Bright Cluster Management
Presented by the CEO and Founder of Bright Computing, Dr. Matthijs van Leeuwen. This technical presentation covers Bright Cluster Manager, a complete cluster management software solution that offers comprehensive functionality for monitoring and managing GPUs.
Video (.mp4)
GPU Computing using CUDA C – An Introduction (2010)
An introduction to the basics of GPU computing using CUDA C. Concepts will be illustrated with walkthroughs of code samples. No prior GPU Computing experience required
Video (mp4)
GPU Computing using CUDA C – Advanced 1 (2010)
First level optimization techniques such as global memory optimization, and processor utilization. Concepts will be illustrated using real code examples
Video (wmv)
Slides (pdf)
GPU Computing using CUDA C - Advanced 2 (2010)
Advanced topics such as execution configuration, instruction and warp optimization with a focus on real applications
Video (wmv)
Slides (pdf)
GPU Computing using OpenCL- An Introduction (2010)
An introduction to the OpenCL API leveraging NVIDIA's CUDA parallel computing architecture. Topics covered include comparison between CUDA C and OpenCL API, memory and threading models. No prior GPU Computing experience is required.
Video (wmv)
Slides (pdf)
GPU Computing using OpenCL Advanced 1 (2010)
NVIDIA presents tricks and tips on how to write great OpenCL code. Topics covered include memory usage best practises, achieving best processor occupancy and instruction throughput.
Video (wmv)
Slides (pdf)
Parallel Nsight - An Introduction and Overview (2010)
This webinar will provide an overview of the powerful features of Parallel Nsight, NVIDIA's latest tool to help develop, debug and analyse parallel code on CPUs and GPUs.
Video (mp4 format)
Thrust, A C++ Standard Template Library for CUDA C - An Introduction (2010)
Thrust is a powerful C++ standard template library providing highly optimized, high-level CUDA C kernels, which can be a fast track for adding GPU acceleration to existing applications. Thrust is one of NVIDIA's many open source projects. This webinar provides an overview and example uses.
Video (mp4 format)
Slides (pdf format)
PGI Fortran - An Overview Video (mp4 )
DirectCompute for GPU Computing - An Introduction by Microsoft Video (.mp4 )
An Introduction to the MAGMA project - acceleration of dense linear algebra by Prof. Jack Dongarra Video (mp4 )
An Introduction to CULA GPU Accelerated Linear Algebra Video (mp4 )
Rapid Application Development platform for GPGPUs – Jacket with MATLAB® Video (mp4 )
CUDA 4.0 Overview Video (.mp4)
Live Q&A with Ian Buck Audio Only (.mp3)
CUDA-MEMCHECK Overview and Demo Video (.mp4)
CUDA-GDB Overview and Demo Video (.mp4)
CUDA Numeric Libraries 3.2  Performance Overview Video (.mp4)
CUDA 3.2 Introduction & Overview Video (.mp4)
Introduction to GPU.NET Video (.mp4)

OpenCL is a trademark of Apple Inc. used under license to the Khronos Group Inc.