GTC 2020: Roofline Performance Model for HPC and Deep-Learning Applications
Charlene Yang, NERSC, Lawrence Berkeley National Laboratory | Samuel Williams, CRD, Lawrence Berkeley National Laboratory | Yunsong Wang, NERSC, Lawrence Berkeley National Laboratory
Learn how to use the Roofline model to analyze the performance of GPU-accelerated applications. We'll cover the basics of the model, explain how to use tools such as nvprof and Nsight Systems/Compute to automate data collection, and demonstrate how to track progress using Roofline for both HPC and deep-learning applications. Using examples such as GPP from materials science, high-performance geometric multigrid from adaptive mesh refinement, and two kernels from TensorFlow, we'll show how characteristics such as arithmetic intensity, memory access pattern, and thread divergence/predication can all be captured by Roofline, offering useful insights into performance optimization.
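As a rough illustration of the basic idea, the sketch below computes the standard Roofline bound: attainable performance is the smaller of peak compute throughput and arithmetic intensity multiplied by peak memory bandwidth. It is not taken from the talk. The function names, the machine peaks, and the kernel FLOP and byte counts are all hypothetical values chosen for illustration; in practice these numbers would come from a profiler and the hardware specification.

```python
def arithmetic_intensity(flops, bytes_moved):
    """FLOPs performed per byte of memory traffic."""
    return flops / bytes_moved


def attainable_gflops(ai, peak_gflops, peak_bandwidth_gbs):
    """Roofline bound: min(peak compute, AI * peak memory bandwidth)."""
    return min(peak_gflops, ai * peak_bandwidth_gbs)


# Hypothetical machine peaks (assumed for illustration, not measured).
PEAK_GFLOPS = 7000.0   # GFLOP/s
PEAK_BW_GBS = 900.0    # GB/s

# Arithmetic intensity at which a kernel stops being bandwidth-bound.
ridge_point = PEAK_GFLOPS / PEAK_BW_GBS

# Two hypothetical kernels with made-up FLOP and byte counts.
low_ai = arithmetic_intensity(flops=2.0e9, bytes_moved=8.0e9)    # 0.25 FLOP/byte
high_ai = arithmetic_intensity(flops=4.0e11, bytes_moved=8.0e9)  # 50 FLOP/byte

low_bound = attainable_gflops(low_ai, PEAK_GFLOPS, PEAK_BW_GBS)
high_bound = attainable_gflops(high_ai, PEAK_GFLOPS, PEAK_BW_GBS)

for name, ai, bound in [("low-AI kernel", low_ai, low_bound),
                        ("high-AI kernel", high_ai, high_bound)]:
    regime = "memory-bound" if ai < ridge_point else "compute-bound"
    print(f"{name}: AI={ai:.2f} FLOP/byte, bound={bound:.1f} GFLOP/s ({regime})")
```

Under these assumed numbers, the first kernel falls on the bandwidth-limited slope of the roofline and the second on the flat compute ceiling. Comparing a kernel's measured throughput against its bound is what lets Roofline track optimization progress.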