As the demand for high-performance computing (HPC) and AI applications grows, so does the importance of energy efficiency. NVIDIA Principal Developer Technology Engineer Alan Gray shares insights on optimizing energy and power efficiency for a range of applications running on the latest NVIDIA technologies, including NVIDIA H100 Tensor Core GPUs and NVIDIA DGX A100 systems.
Traditionally, the focus has been on maximizing performance by reducing the time to solution. However, rising energy costs and the environmental impact of data centers are pushing developers to consider energy consumption as a critical factor. Energy usage, defined as the product of power and time, can be optimized by carefully tuning GPU settings and application-level configurations.
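To make the power-times-time relationship concrete, here is a minimal Python sketch. The function name and every number in it are illustrative assumptions, not measurements from the session. It shows how a configuration that runs somewhat longer can still use less energy if its average power draw drops far enough.

```python
def energy_joules(avg_power_watts: float, runtime_seconds: float) -> float:
    """Energy is average power multiplied by elapsed time (W * s = J)."""
    return avg_power_watts * runtime_seconds


# Hypothetical numbers, chosen only to illustrate the trade-off.
baseline = energy_joules(avg_power_watts=400.0, runtime_seconds=100.0)
tuned = energy_joules(avg_power_watts=300.0, runtime_seconds=110.0)

# Fraction of energy saved by the tuned run relative to the baseline.
saving = 1.0 - tuned / baseline

print(f"baseline: {baseline:.0f} J, tuned: {tuned:.0f} J, saving: {saving:.1%}")
```

In this made-up case the tuned run takes 10% longer but draws 25% less power, so it uses less energy overall.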
This session is perfect for HPC and AI developers, data center operators, and GPU programmers looking to optimize energy efficiency in conjunction with performance. It’s also valuable for researchers using applications like GROMACS or AI inference models, as well as IT teams focused on reducing energy costs and environmental impact.
Follow along with a PDF of the session as Gray dives into several key topics focused on optimizing energy and power efficiency for HPC and AI applications running on NVIDIA GPUs, including:
- Introduction to energy optimization: key considerations for balancing performance and energy efficiency in HPC and AI applications.
- GPU clock frequency tuning: exploring how adjusting clock frequency affects power consumption, runtime, and overall energy savings.
- Application benchmarks: insights from energy optimization in workloads such as GROMACS, Quantum ESPRESSO, and TensorRT-LLM.
- Non-GPU power impact: addressing energy consumption from CPUs, memory, and cooling systems, and optimizing with techniques like Direct Liquid Cooling (DLC).
- Energy efficiency on NVIDIA H100 and DGX A100: analysis of energy-saving potential on these platforms and how non-GPU components affect total power consumption.
- Application-level optimizations: various application-level techniques to optimize for performance and energy efficiency.
- Holistic data center energy strategies: a comprehensive approach to minimizing energy usage through hardware and software optimizations in data centers.
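The clock frequency tuning topic in the list above can be pictured as a small selection problem: measure average power and runtime at several clock settings, then choose the setting with the lowest energy that keeps the slowdown acceptable. The sketch below assumes this framing. The `Sample` type, the `pick_clock` helper, the 10% slowdown limit, and all of the sweep values are hypothetical placeholders rather than results from the talk.

```python
from typing import NamedTuple, Sequence


class Sample(NamedTuple):
    """One hypothetical measurement at a fixed GPU clock setting."""
    clock_mhz: int
    avg_power_w: float
    runtime_s: float

    @property
    def energy_j(self) -> float:
        # Energy = average power * runtime.
        return self.avg_power_w * self.runtime_s


def pick_clock(samples: Sequence[Sample], max_slowdown: float = 0.10) -> Sample:
    """Return the lowest-energy sample whose runtime is within
    max_slowdown (as a fraction) of the fastest measured runtime."""
    fastest = min(s.runtime_s for s in samples)
    allowed = [s for s in samples if s.runtime_s <= fastest * (1.0 + max_slowdown)]
    return min(allowed, key=lambda s: s.energy_j)


# Made-up sweep values, for illustration only.
sweep = [
    Sample(clock_mhz=1980, avg_power_w=600.0, runtime_s=100.0),
    Sample(clock_mhz=1600, avg_power_w=450.0, runtime_s=108.0),
    Sample(clock_mhz=1200, avg_power_w=330.0, runtime_s=135.0),
]

best = pick_clock(sweep)                                      # at most 10% slower
unconstrained = pick_clock(sweep, max_slowdown=float("inf"))  # energy only

print(best.clock_mhz, best.energy_j)
print(unconstrained.clock_mhz, unconstrained.energy_j)
```

The slowdown limit is a design choice: without it, the lowest-energy setting may be too slow for a workload where time to solution still matters.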
Watch the advanced talk on Energy and Power Efficiency for Applications on the Latest NVIDIA Technology, explore more videos on NVIDIA On-Demand, and gain valuable skills and insights from industry experts by joining the NVIDIA Developer Program.
This content was partially crafted with the assistance of generative AI and LLMs. It underwent careful review and was edited by the NVIDIA Technical Blog team to ensure precision, accuracy, and quality.