Parallel programming has never been this easy. The CUDA programming model, tools and powerful libraries have provided the foundation; this webinar series will fuel your development. Get trained directly by GPU Computing experts in CUDA, OpenCL and DirectCompute, and find out about the latest developments from companies around the world leading the GPU Computing revolution.
Advance registration is required. You will be kept informed of updates and future webinars, added to our CUDA Newsletter mailing list, and invited to become a registered developer.
Previously recorded sessions · Additional Parallel Nsight and Tools Webinar Records · GTC Express Webinar Series
IMPORTANT NOTE: Some of the webinars are "Reg.Dev Priority". These are special webinars that are part of the membership benefits of our free-to-join CUDA Registered Developer Program. To join, complete the short application (Apply Now). Members will be given priority registration when the webinars are oversubscribed.
New Webinars - Sign Up Now
| Series | Webinar Title and Brief Description | Registration Links (Pacific Time) |
|---|---|---|
| CUDA Partner | Introduction to Bright Cluster Manager - Advanced Clusters Made Easy. Learn how you can manage your GPU cluster with this powerful tool. The session will include a technical feature overview and live Q&A. | To be rescheduled |
| CUDA Partner | CUDA X86 - Running your CUDA Code on multi-core CPUs | Jan 31 |
| CUDA Intro | CUDA Toolkit 4.1 - Technical & Performance Overview. Now in production release, CUDA Toolkit 4.1 features many improvements, including a new LLVM-based compiler, over 1000 new image processing functions, major improvements in the Visual Profiler, and much more. Presented by NVIDIA's CUDA PM, featuring a live Q&A session. | Feb 1, Feb 3 |
| OpenACC Intro | OpenACC 1.0 - Technical Overview | Feb 14, 10am (PST) Register Now |
| GTC Express | Debugging CUDA with TotalView. With TotalView 8.9.2 and the NVIDIA CUDA add-on, you can debug both the CPU and the GPU code in applications that use CUDA. You can set breakpoints, step, and dive in code running on the CUDA device using all the familiar TotalView GUI methods. TotalView supports unified virtual addressing and multi-device debugging, handles CUDA function inlining, and provides type qualification in the expression system. You can display how your logical threads are being mapped to hardware and navigate kernel threads using either hardware or logical coordinates. The webinar will also preview the upcoming TotalView 8.10 with support for CUDA 4.1. | Feb 22, 9am (PST) Register Now |
Previous Webinars
| Webinar Title | Links to recordings |
|---|---|
| 5x in 5 Hours: Accelerating SEISMIC_CPML Using High-level GPU Programming. Programming GPU accelerators involves three basic aspects: splitting the source code between host and GPU, managing data allocation and movement between host memory and GPU memory, and optimizing GPU kernels. Much of this process can be automated using modern compiler technology and high-level programming techniques. In this webinar, Mat Colgrove, Applications Engineer for The Portland Group, will present a case study on using PGI Accelerator compiler directives to achieve a 5x speed-up in approximately 5 hours of programming time on this popular geophysics code. | Links Coming Soon |
| Debugging Workshop for CUDA 4.1 Using Allinea DDT, with Live Q&A | Links Coming Soon |
| CUDA Toolkit 4.1 Feature Overview. Now available for all developers, this new release features three major improvements: a new LLVM-based compiler, over 1000 new image processing functions, and major improvements in the Visual Profiler, plus much more. Presented by NVIDIA's CUDA PM, featuring a Q&A session. Updated Dec 2011. | |
| Heterogeneous Data-Parallel Programming. Presented by Satnam Singh, Professor of Reconfigurable Computing, School of Computer Science, University of Birmingham (UK). Easy and efficient data-parallel programming will be essential for getting performance from today's massively parallel systems. In this talk Dr Singh will share his vision and demonstrate a novel Microsoft Accelerator System, a language-neutral solution working on NVIDIA GPUs. | Video(mp4) |
| Getting Started with TotalView 8.9.2 and CUDA 4.0. Presented by Chris Gottbrath, Principal Product Manager, Rogue Wave. With TotalView 8.9.2 and the CUDA add-on you can debug both the CPU and the GPU code. Set breakpoints, step, and dive in code running on the GPU using all the familiar TotalView GUI methods. TotalView supports many advanced features such as UVA and multi-GPU debugging. | Video(mp4) |
| CUDA Optimization: Register Spilling and Local Memory Usage, with Live Q&A, by Dr Paulius Micikevicius, NVIDIA. The final installment of our optimization series. Focusing on one of the more subtle factors affecting performance, this webinar provides strategies that you will be able to implement immediately to extract extra performance from your application. | |
| Trace Based Performance Analysis for GPUs Using Vampir Trace Collector. Learn how to use Vampir Trace Collector event logging to identify performance bottlenecks. Understand how the graphical analysis of this data provides insight into how the various layers of parallelism interact, so you can see what is really happening. | Video(mp4) |
| PGI Accelerator for Fortran - Simplified GPU Programming Using Directives, by Michael Wolfe, Portland Group. This webinar provides an overview and real examples of how to parallelize your application with intuitive compiler directives. A fast and easy complement to programming directly in CUDA Fortran or CUDA C/C++. | Video(mp4) |
| Overview and Usage of LibJacket CUDA Library, Presented by AccelerEyes | |
| CUDA Optimization: Memory Bandwidth Limited Kernels + Live Q&A by Tim Schroeder. This webinar focuses on one of the main performance limiters and provides actionable ideas and strategies for performance optimization. Don't miss the live Q&A session which will follow. | |
| PGI Accelerator for C - Simplified GPU Programming Using Directives. Presented by Michael Wolfe, PGI's leading compiler expert, who provides a technical overview and real code examples using this exciting and innovative programming solution. | Video(mp4) |
| CUDA Optimization: Instruction Limited Kernels, with Live Q&A, by Gernot Ziegler. Get field-proven advice on methods to improve kernels whose performance is limited by instructions. A not-to-be-missed live webinar. | |
| GPU Direct and Unified Virtual Addressing + Live Q&A by Tim Schroeder. These new features of CUDA 4.0 have enabled huge opportunities. If you haven't looked at how to leverage them for your application, this would be an excellent way to find out the latest hints and tips. | |
| Multi-GPU and Host Multi-Threading Considerations + Live Q&A by Dr Paulius Micikevicius. Get critical, field-proven tips from one of the world's leading CUDA experts. | |
| CUDA Optimization: Identifying Performance Limiters, by Dr Paulius Micikevicius. A great opportunity to learn how to identify what is limiting the performance of your application. This is the first in a series of optimization webinars which will help you extract even more performance from your applications. A must-attend for all CUDA developers. | |
| Introduction to CUDA Libraries + Live Q&A by Dr Justin Luitjens. Get insight into how best to leverage the powerful libraries that come as part of the CUDA Toolkit. | |
| CUDA Texture Memory & Graphics Interop + Live Q&A with Gernot Ziegler | Video(mp4) |
| Special Live Q&A with Numeric Library Team. Your opportunity to engage with NVIDIA Engineering and ask questions about CUDA and the CUDA Libraries. These fast-paced webinars are a must-attend for all CUDA developers. Team Manager: Ujval Kapasi. | Video(mp4) |
| CUDA Warps and Occupancy Considerations + Live Q&A with Dr Justin Luitjens, NVIDIA. Topics covered in this talk include: | |
| Overview of Latest Release of CULA Tools, An Accelerated Linear Algebra Solution for Professionals, by John Humphrey, Engineering Director, EM Photonics Inc. EM Photonics produces the popular CULA library for dense linear algebra functions using CUDA GPUs. CULA has hundreds of routines for system solutions, eigenvalue problems, and matrix factorizations. Recently, we have introduced an exciting new feature called the Link Interface, which is compile-time and link-time compatible with other matrix libraries. With the Link Interface, you re-link your application and CULA handles all the details, so you can try GPUs with very little effort. This webinar will cover the CULA library and show real-world usage of the Link Interface, including use in MATLAB. | Video(mp4) |
| CUDA Shared Memory & Cache + Live Q&A with Dr Steve Rennich, NVIDIA. Some of the topics discussed in this technical webinar include: | Video(mp4) |
| The Practical Reality of Heterogeneous Super Computing. This presentation will cover the typical concerns that developers and owners of HPC solutions have about GPU adoption and how they are being resolved; in particular, the concern that two code bases have to be maintained. The presentation emphasizes that GPU Computing has evolved to the point that it is now possible to write and maintain codes that will work on GPUs and CPUs using CUDA & CUDA x86. Presented by Dr Rob Farber, Author and GPU Computing Expert. | |
| CUDA Global Memory Usage & Strategy + Live Q&A with Dr Justin Luitjens, NVIDIA. Smartly using the CUDA memory model is critical for writing high performance code. | Video (mp4) |
| How to accelerate the incomplete-LU and Cholesky preconditioned iterative methods on a GPU using the CUSPARSE and CUBLAS libraries | Video (mp4) WhitePaper |
| Introduction to GPU Computing & CUDA, by Sarah Tariq, NVIDIA | Video (mp4) |
| Getting Started with CUDA & GPU Computing + Live Q&A. A detailed review of how to get up and running using CUDA C/C++, by Sarah Tariq. | Video (mp4) |
| Floating Point Capabilities and Accuracy of Latest NVIDIA GPUs | |
| Introduction to MainConcept's CUDA H.264/AVC Encoder. Learn how to take advantage of NVIDIA GPUs with the new MainConcept CUDA H.264 Video Encoder, which delivers video encoding performance improvements of as much as 700% over CPU. Presented by MainConcept, an industry-leading supplier of video and audio codec solutions for the Multimedia, Consumer Electronics, Broadcast & Professional video production, Digital Signage, Medical, and Security markets. | Video (.mp4) Slides (PDF) |
| Monitoring and Managing GPU Clusters with Bright Cluster Manager. Presented by the CEO and Founder of Bright Computing, Dr. Matthijs van Leeuwen. This technical presentation covers Bright Cluster Manager, a complete cluster management software solution that offers comprehensive functionality for monitoring and managing GPUs. | Video (.mp4) |
| GPU Computing using CUDA C – An Introduction (2010). An introduction to the basics of GPU computing using CUDA C. Concepts will be illustrated with walkthroughs of code samples. No prior GPU Computing experience required. | Video (mp4) |
| GPU Computing using CUDA C – Advanced 1 (2010). First-level optimization techniques such as global memory optimization and processor utilization. Concepts will be illustrated using real code examples. | Video (wmv) Slides (pdf) |
| GPU Computing using CUDA C - Advanced 2 (2010). Advanced topics such as execution configuration and instruction and warp optimization, with a focus on real applications. | Video (wmv) Slides (pdf) |
| GPU Computing using OpenCL - An Introduction (2010). An introduction to the OpenCL API leveraging NVIDIA's CUDA parallel computing architecture. Topics covered include a comparison between the CUDA C and OpenCL APIs, and memory and threading models. No prior GPU Computing experience is required. | Video (wmv) Slides (pdf) |
| GPU Computing using OpenCL - Advanced 1 (2010). NVIDIA presents tricks and tips on how to write great OpenCL code. Topics covered include memory usage best practises, achieving best processor occupancy, and instruction throughput. | Video (wmv) Slides (pdf) |
| Parallel Nsight - An Introduction and Overview (2010). This webinar will provide an overview of the powerful features of Parallel Nsight, NVIDIA's latest tool to help develop, debug and analyse parallel code on CPUs and GPUs. | Video (mp4 format) |
| Thrust, A C++ Standard Template Library for CUDA C - An Introduction (2010). Thrust is a powerful C++ template library providing highly optimized high-level CUDA C kernels, which can be a fast track for adding GPU acceleration to existing applications. Thrust is one of NVIDIA's many open source projects. This webinar provides an overview and example uses. | Video (mp4 format) Slides (pdf format) |
| PGI Fortran - An Overview | Video (mp4) |
| DirectCompute for GPU Computing - An Introduction by Microsoft | Video (.mp4) |
| An Introduction to the MAGMA project - acceleration of dense linear algebra by Prof. Jack Dongarra | Video (mp4) |
| An Introduction to CULA GPU Accelerated Linear Algebra | Video (mp4) |
| Rapid Application Development platform for GPGPUs – Jacket with MATLAB® | Video (mp4) |
| CUDA 4.0 Overview | Video (.mp4) |
| Live Q&A with Ian Buck | Audio Only (.mp3) |
| CUDA memcheck Overview and Demo | Video (.mp4) |
| CUDAgdb Overview and Demo | Video (.mp4) |
| CUDA Numeric Libraries 3.2 Performance Overview | Video (.mp4) |
| CUDA 3.2 Introduction & Overview | Video (.mp4) |
| Introduction to GPU.NET | Video (.mp4) |
OpenCL is a trademark of Apple Inc. used under license to the Khronos Group Inc.



