What the Profiler is Telling You: How to Get the Most Performance out of Your Hardware
Markus Hrywniak, NVIDIA | Milos Maric, NVIDIA
GTC 2020
We'll explore how to analyze and optimize the performance of GPU-accelerated applications. Working with a real-world example, we'll start by identifying high-level bottlenecks, then walk through an analysis-driven process leading to a series of kernel-level optimizations. Using NVIDIA's Nsight Systems and Nsight Compute profiling tools as examples, you'll learn about the fundamental performance limiters: instruction throughput, memory throughput, and latency. We'll present strategies to identify and tackle each type of limiter.