GTC Silicon Valley-2019: Performance Analysis of GPU-Accelerated Applications using the Roofline Model
Session ID: S9624
Samuel Williams (Lawrence Berkeley National Laboratory), Charlene Yang (Lawrence Berkeley National Laboratory)
Learn how to use the roofline model to analyze the performance of GPU-accelerated applications. We'll cover the basics of the model and explain how to use it to analyze application performance and track optimization progress. We'll also explain how to use nvprof to automate data collection on GPU-accelerated systems. Demonstrations will use DOE proxy applications to show the effects of arithmetic intensity, memory stride, memory coalescing, and thread divergence/predication, all of which can be captured within the roofline methodology.
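As a rough illustration of the basic idea behind the model, the sketch below computes a kernel's arithmetic intensity (FLOPs per byte moved) and the resulting roofline bound, min(peak compute, arithmetic intensity × peak bandwidth). It is a minimal sketch and not the presenters' tooling: the function names, the peak figures, and the kernel counts are all hypothetical placeholders, and in practice the FLOP and byte counts would come from profiler measurements.

```python
def arithmetic_intensity(flops, bytes_moved):
    """FLOPs performed per byte of memory traffic."""
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    return flops / bytes_moved


def roofline_bound(ai, peak_gflops, peak_gbs):
    """Attainable GFLOP/s: the lower of the compute and bandwidth ceilings."""
    return min(peak_gflops, ai * peak_gbs)


def ridge_point(peak_gflops, peak_gbs):
    """Arithmetic intensity at which the two ceilings meet."""
    return peak_gflops / peak_gbs


# Hypothetical machine ceilings (placeholders, not measured values).
PEAK_GFLOPS = 7000.0   # assumed peak compute rate, GFLOP/s
PEAK_GBS = 900.0       # assumed peak memory bandwidth, GB/s

# Hypothetical kernel counts (placeholders for profiler output).
kernel_flops = 2.0e9   # floating-point operations executed
kernel_bytes = 8.0e9   # bytes moved to and from memory

ai = arithmetic_intensity(kernel_flops, kernel_bytes)
bound = roofline_bound(ai, PEAK_GFLOPS, PEAK_GBS)
regime = "memory-bound" if ai < ridge_point(PEAK_GFLOPS, PEAK_GBS) else "compute-bound"
print(f"AI = {ai:.2f} FLOP/byte, bound = {bound:.1f} GFLOP/s ({regime})")
```

With these placeholder numbers the kernel sits well to the left of the ridge point, so its attainable performance is limited by memory bandwidth rather than by peak compute. Comparing a kernel's measured GFLOP/s against this bound is one way to track progress as optimizations are applied.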