The electrical grid is designed to support loads that are relatively steady, such as lighting, household appliances, and industrial machines that operate at constant power. But data centers today, especially those running AI workloads, have changed the equation.
Data centers consume a significant share of power plant and transformer capacity. Traditionally, the diverse, uncorrelated activities inside a data center averaged out its consumption. Training large-scale AI models, however, causes sudden, large swings in power demand and poses unique challenges for grid operators:
- If power demand suddenly ramps up, it can take anywhere from one minute to 90 minutes for generation resources to respond because their ramp rates are physically limited.
- Repeated power transients can excite resonances and stress grid equipment.
- If the data center suddenly reduces its power consumption, generation resources are left with excess energy and no outlet.
These sudden changes can be felt by other grid customers as spikes or sags in supplied voltage.
In this blog, we’ll detail how NVIDIA addresses this challenge with a new power supply unit (PSU) with energy storage in the GB300 NVL72. It smooths power spikes from AI workloads and reduces peak grid demand by up to 30%, and it’s also coming to GB200 NVL72 systems.
We will describe the mechanisms used at each phase of a training workload: ramp-up at the start, steady-state operation at full load, and ramp-down at the end of the run. Then we’ll share measured results from this new power smoothing solution.
The impact of synchronized workloads
In AI training, thousands of GPUs operate in lockstep and perform the same computation on different data. This synchronization results in power fluctuations at the grid level. Unlike traditional data center workloads, where uncorrelated tasks “smooth out” the load, AI workloads cause abrupt transitions between idle and high-power states, as shown in Figure 1.

Visualizing the individual GPUs as rows on a heatmap illustrates why AI data centers pose unique challenges to the power delivery grid. (See Figure 2 below.) Traditional data center workloads operate asynchronously across the compute infrastructure. The AI training workload heatmap highlights how GPUs operate synchronously, causing the total power drawn by a GPU cluster to mirror and amplify the power pattern of a single node.
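To make the aggregation effect concrete, the short simulation below (our illustrative sketch, not NVIDIA tooling) compares the aggregate power of a 1,024-GPU cluster when per-GPU load bursts are synchronized versus uncorrelated. All power levels and timings are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_gpus, n_steps = 1024, 1000
idle_w, peak_w = 200.0, 1000.0  # illustrative per-GPU power levels

# Per-GPU square wave: compute bursts followed by communication/idle gaps.
period, duty = 50, 0.6
phase_sync = np.zeros(n_gpus, dtype=int)            # AI training: lockstep
phase_async = rng.integers(0, period, size=n_gpus)  # traditional: uncorrelated

def cluster_power(phases):
    t = np.arange(n_steps)
    # on[i, t] is True while GPU i is in its high-power compute burst
    on = ((t[None, :] + phases[:, None]) % period) < duty * period
    return np.where(on, peak_w, idle_w).sum(axis=0)

for name, phases in [("synchronized", phase_sync), ("uncorrelated", phase_async)]:
    p = cluster_power(phases)
    print(f"{name:>12}: swing = {p.max() - p.min():,.0f} W "
          f"({(p.max() - p.min()) / p.mean():.0%} of mean)")
```

The synchronized case swings by the full per-GPU amplitude times the GPU count, while the uncorrelated case flattens to a small ripple around the mean.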

Power smoothing in GB300 NVL72
To address these challenges, NVIDIA is introducing a comprehensive power smoothing solution in the GB300 platform. It comprises several mechanisms across different operational phases. Figure 3 (below) shows the power cap, energy storage, and GPU burn mechanisms that together smooth the power demand from the rack. We will explore each mechanism in the figure from left to right.
The example AI training GPU power consumption is again shown as a gray line. A green line shows the desired power profile: a smooth ramp-up, a flat steady state, and a smooth ramp-down.
With the new power cap feature, GPU power draw at the start of a workload is capped by the power controller. New maximum power levels are sent to the GPUs and gradually increased, aligning with the ramp rates the grid can tolerate. A more complex strategy is used for ramp-down; if the workload ends abruptly, the GPU burn system continues to dissipate power by operating the GPU in a special power burner mode. This ensures a smooth transition rather than a sharp drop (Figures 3 and 5).
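As a minimal sketch of the ramp-up side, the loop below raises a power cap toward the target at a fixed rate. The floor, target, rate, and the `set_gpu_power_limit` callback are hypothetical stand-ins, not the actual GB300 power controller interface.

```python
import time

def ramp_up_power_cap(set_gpu_power_limit, floor_w=300.0, target_w=1000.0,
                      ramp_w_per_s=100.0, interval_s=0.5):
    """Raise the GPU power cap from floor_w to target_w at a fixed,
    grid-tolerable ramp rate. All values are illustrative."""
    cap = floor_w
    set_gpu_power_limit(cap)  # workload starts under a low cap
    while cap < target_w:
        time.sleep(interval_s)
        cap = min(target_w, cap + ramp_w_per_s * interval_s)
        set_gpu_power_limit(cap)  # GPUs clip their draw to the new cap

# Example: print each cap update instead of programming real hardware.
ramp_up_power_cap(lambda w: print(f"cap -> {w:.0f} W"))
```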

For rapid, short-term power fluctuations during steady-state operation, energy storage elements—specifically electrolytic capacitors—have been integrated into the GB300 NVL72 power shelves. Energy storage charges during times of low GPU power demand and discharges during times of high GPU power demand (Figure 4).
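As a back-of-the-envelope illustration of why capacitors suit this role (the component values below are ours, not GB300 design parameters): the usable energy between two bus voltages is E = ½C(V_max² − V_min²), and dividing by the power shortfall gives the ride-through time.

```python
# Usable energy when a capacitor bank discharges from v_max to v_min:
#   E = 0.5 * C * (v_max**2 - v_min**2)
# Component values below are illustrative, not GB300 design parameters.
C = 0.040                      # 40 mF capacitor bank
v_max, v_min = 400.0, 330.0    # allowed DC bus voltage window, in volts

usable_j = 0.5 * C * (v_max**2 - v_min**2)
print(f"usable energy: {usable_j:.0f} J")                   # -> 1022 J

# Ride-through time for a power shortfall of delta_p watts:
delta_p = 5000.0
print(f"ride-through: {usable_j / delta_p * 1000:.0f} ms")  # -> 204 ms
```

Hundreds of milliseconds is exactly the timescale of the rapid, short-term fluctuations this mechanism targets; slower ramps are handled by the power cap and burn mechanisms.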

The ramp-down solution pairs power burn hardware with a software algorithm that uses a running average to sense when GPU power has dropped to idle levels. The software driver that implements the power smoothing algorithm then engages the hardware power burner. The burner holds power consumption constant while it waits for the workload to resume. If the GPU workload does resume, the burner disengages instantly; if the workload has ended, the burner tapers off the power draw at a rate consistent with grid capabilities and then disengages.
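The description above maps naturally onto a small state machine. The sketch below is our reconstruction under stated assumptions; the thresholds, hold time, taper rate, and class interface are illustrative, not NVIDIA's driver code.

```python
from enum import Enum, auto

# Illustrative thresholds and rates; the real driver's values are not public.
IDLE_W = 250.0         # running-average power below this counts as "idle"
ACTIVE_W = 600.0       # instantaneous power above this counts as "resumed"
HOLD_S = 5.0           # how long the burner holds flat before tapering
TAPER_W_PER_S = 100.0  # grid-tolerable ramp-down rate

class State(Enum):
    ACTIVE = auto()  # workload running; burner off
    HOLD = auto()    # burner holds total power flat, awaiting a resume
    TAPER = auto()   # burner ramps its power down toward zero

class PowerSmoother:
    """Toy reconstruction of the ramp-down smoothing logic."""

    def __init__(self, dt=0.1):
        self.state = State.ACTIVE
        self.steady_w = 0.0  # last observed full-load power level
        self.burn_w = 0.0    # watts the burner is currently dissipating
        self.held_s = 0.0
        self.dt = dt

    def step(self, avg_power_w, inst_power_w):
        """One control tick; returns the burner's power setpoint in watts."""
        if self.state == State.ACTIVE:
            self.steady_w = max(self.steady_w, inst_power_w)
            if avg_power_w < IDLE_W:  # running average fell to idle: engage
                self.state = State.HOLD
                self.burn_w = max(0.0, self.steady_w - inst_power_w)
                self.held_s = 0.0
        elif inst_power_w > ACTIVE_W:  # workload resumed: disengage instantly
            self.state, self.burn_w = State.ACTIVE, 0.0
        elif self.state == State.HOLD:
            self.held_s += self.dt
            if self.held_s >= HOLD_S:  # no resume: begin the smooth taper
                self.state = State.TAPER
        else:  # State.TAPER
            self.burn_w = max(0.0, self.burn_w - TAPER_W_PER_S * self.dt)
            if self.burn_w == 0.0:     # fully ramped down: disengage
                self.state = State.ACTIVE
                self.steady_w = 0.0
        return self.burn_w
```

In the real system, the equivalent logic runs in the software driver, and the "burner" is the GPU itself operating in its special power burner mode.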
Configurable parameters let users fine-tune this behavior. The table below lists the key parameters exposed to the user, and Figure 5 provides a visual guide to the parameters in the table's first column. They can be set using the NVIDIA SMI tool or the Redfish protocol.

Figure 5. How the key configuration parameters in the table below affect power demand.
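Redfish is a standard REST/JSON management protocol, so programming such parameters typically amounts to an HTTP PATCH against a BMC resource. The sketch below shows the general shape; the endpoint path and property names are hypothetical placeholders, and the platform's published Redfish schema should be consulted for the real ones.

```python
import requests

BMC = "https://bmc.example.com"  # the rack's management controller
session = requests.Session()
session.auth = ("admin", "password")  # use real credentials or session tokens
session.verify = False                # lab-only; keep TLS verification in production

# Hypothetical resource path and property names, for illustration only;
# consult the platform's published Redfish schema for the real ones.
url = f"{BMC}/redfish/v1/Chassis/GB300_NVL72/Controls/PowerSmoothing"
payload = {
    "RampUpRateWattsPerSec": 100,
    "RampDownRateWattsPerSec": 100,
    "IdleHoldSeconds": 5,
}
resp = session.patch(url, json=payload, timeout=10)
resp.raise_for_status()
print("applied:", resp.status_code)
```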

Measured benefits and results
Empirical results with both the previous-generation (GB200) power supply units and the new (GB300) units with energy storage demonstrate significant improvements. To show this, we instrumented a power shelf in a GB200 rack, as depicted in Figure 6:

With the old power supply, the AC power drawn from the grid mirrors the fluctuations in rack power consumption. With the new energy-storage-enhanced power shelves, these input power variations are largely eliminated. Notably, the peak power demand seen by the grid is reduced by 30% when training the Megatron LLM, and rapid fluctuations are substantially dampened, as shown in Figure 7.

Looking inside the GB300 power supply, we find that about half of the volume is occupied by capacitors for energy storage. NVIDIA worked with power supply vendor LITEON Technology to optimize the power electronics for size and filled the remaining space with 65 joules per GPU of energy storage. Coupled with a new charge management controller, this delivers a rack-level fast-transient power smoothing solution.
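To put 65 joules per GPU in perspective, a quick calculation (the assumed power swing is ours, not a published figure):

```python
energy_j = 65.0   # per-GPU energy storage in the GB300 power shelves
swing_w = 650.0   # assumed per-GPU swing between compute and communication phases
print(f"bridge time: {energy_j / swing_w * 1000:.0f} ms")  # -> 100 ms
```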

System design implications
Integrating energy storage not only smooths transients but also lowers the peak demand requirements for the broader data center. Facilities previously needed to be provisioned for the maximum instantaneous power consumption. Now, with effective energy storage, provisioning can be closer to the target average consumption, enabling more racks within the same power budget or allowing for reduced total power allocation.
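A quick worked example of that provisioning benefit, using made-up facility numbers and the measured 30% peak reduction:

```python
budget_kw = 10_000.0    # facility power budget (illustrative)
rack_peak_kw = 150.0    # per-rack peak before smoothing (illustrative)
peak_reduction = 0.30   # measured peak reduction with energy storage

racks_before = budget_kw // rack_peak_kw
racks_after = budget_kw // (rack_peak_kw * (1 - peak_reduction))
print(f"racks: {racks_before:.0f} -> {racks_after:.0f} "
      f"(+{racks_after / racks_before - 1:.0%})")  # 66 -> 95 racks (+44%)
```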
The design ensures that the fluctuations within the rack are tolerated; the computing nodes and internal DC buses are built to accommodate rapid power state changes. The energy storage mechanism is only used to optimize the load profile seen by the grid and does not provide energy back to the utility.
Both the GB200 and GB300 NVL72 systems employ multiple power shelves within each rack. As a result, strategies for integrating energy storage and load smoothing must consider aggregation at the rack and data hall levels. Power reductions at peak enable either increased rack density or reduced provisioning requirements for the entire data center.
Takeaways
The innovations in energy storage and advanced ramp-rate management algorithms in GB300 NVL72 power shelves achieve a significant reduction in peak and transient load presented to the grid. The advanced PSU with energy storage—plus the hardware and software to implement the power cap and power burn elements—will be available with GB300 NVL72.
All data center operators should consider integrating advanced power smoothing and energy storage technologies to optimize peak power consumption, enable increased compute density, and save on operating costs.
Contributors to this research include Jared Huntington, Gabriele Gorla, Apoorv Gupta, Mostafa Mosa, Chad Plummer, Nilesh Dattani, Tom Li, Pratik Patel, Kevin Wei, Ajay Kamalvanshi, and Divya Ramakrishnan.