
Roofline Performance Model for HPC and Deep-Learning Applications

Charlene Yang, NERSC, Lawrence Berkeley National Laboratory | Samuel Williams, CRD, Lawrence Berkeley National Laboratory | Yunsong Wang, NERSC, LBNL

GTC 2020

Learn how to use the Roofline model to analyze the performance of GPU-accelerated applications. We'll cover the basics of the model, explain how to use tools such as nvprof and Nsight Systems/Compute to automate data collection, and demonstrate how to track optimization progress with Roofline for both HPC and deep-learning applications. We'll use examples such as GPP from materials science, high-performance geometric multigrid from adaptive mesh refinement, and two kernels from TensorFlow to show how characteristics such as arithmetic intensity, memory access pattern, and thread divergence/predication can all be captured by Roofline, offering useful insights into performance optimization.
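As a rough illustration of the basic model mentioned above: Roofline bounds a kernel's attainable performance by the lower of the machine's peak compute rate and its arithmetic intensity (FLOPs per byte moved) multiplied by peak memory bandwidth. The sketch below is not taken from the talk. The function names are hypothetical, and the ceilings and kernel counts are made-up placeholder values, not measurements of any real GPU or of the kernels named in the abstract.

```python
def arithmetic_intensity(flops, bytes_moved):
    """FLOPs performed per byte of memory traffic."""
    return flops / bytes_moved


def roofline_bound(ai, peak_gflops, peak_bandwidth_gbs):
    """Attainable GFLOP/s: the lower of the compute and bandwidth ceilings."""
    return min(peak_gflops, ai * peak_bandwidth_gbs)


def ridge_point(peak_gflops, peak_bandwidth_gbs):
    """Arithmetic intensity at which a kernel stops being bandwidth-bound."""
    return peak_gflops / peak_bandwidth_gbs


# Placeholder machine ceilings (assumed values, for illustration only).
PEAK_GFLOPS = 7000.0
PEAK_BANDWIDTH_GBS = 900.0

# Placeholder kernel counts, as a profiler might report them (assumed values).
kernel_flops = 2.0e9
kernel_bytes = 8.0e9

ai = arithmetic_intensity(kernel_flops, kernel_bytes)
bound = roofline_bound(ai, PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
ridge = ridge_point(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)

regime = "bandwidth-bound" if ai < ridge else "compute-bound"
print(f"AI = {ai:.2f} FLOP/byte, bound = {bound:.1f} GFLOP/s, {regime}")
```

With these placeholder numbers the kernel falls to the left of the ridge point, so under this simple model reducing memory traffic would help more than raising the FLOP rate.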



