AI scaling is incredibly complex, and new techniques in training and inference continually demand more from the data center. While data center capabilities are scaling quickly, data center infrastructure is bound by fundamental physical limits that do not constrain algorithms and models. Power availability, cooling capacity, and space constraints cap the physical footprint of an AI factory. To continue growing, new data centers must be built, and connectivity over distance becomes the key to pooling these resources so they can work in tandem on a single training or disaggregated inference workload.
Traditionally, when connecting data centers with long-haul Ethernet built from "off-the-shelf" merchant silicon, the principal objective was simply to ensure that data made it to its destination. Because distances can be long and latencies high, the likelihood of congestion is also high, and its impact can be severe.
To mitigate this challenge and prevent packets from being dropped, off-the-shelf Ethernet vendors employ deep packet buffers capable of absorbing large bursts of network traffic. While these deep-buffer switches are a workable solution for long-haul service providers and telecoms, they introduce problems for AI.
In particular, switches with deep buffers inherently suffer from higher latencies. In addition, when a buffer starts to fill, it must "drain." From the perspective of an AI workload, this draining is unpredictable, causing a large amount of jitter, or variance in data delivery. The high latency and unpredictability of this shock-absorber technique become problematic for training and disaggregated inference, which are synchronous in nature and require predictable performance from the network.
This post explains how NVIDIA Spectrum-XGS Ethernet technology for scale-across networking enables inter-data center connectivity with the high performance needed for AI.
What is scale-across networking?
Scale-across networking is a new category of AI compute fabric connectivity that can be thought of as a new dimension, orthogonal to the existing connectivity options of scale-up and scale-out. With Spectrum-XGS Ethernet for scale-across networking, multiple data centers of varying sizes and distances can be unified as one large AI factory. For the first time, the network can deliver the performance needed for large-scale single job AI training and inference across geographically separated data centers.

How does NVIDIA Spectrum-XGS Ethernet enable scale-across networking?
NVIDIA Spectrum-XGS Ethernet is a new technology addition to the NVIDIA Spectrum-X Ethernet platform. It is based on the same hardware combination of Spectrum-X Ethernet switches and ConnectX-8 SuperNICs, and leverages the same stack of software and libraries used for scale-out connectivity within the data center.
With Spectrum-XGS Ethernet, the connectivity is between AI factories over long distances; that is, beyond 500 meters. This could mean connectivity between buildings in a campus, or over tens or hundreds of miles, across cities or even states and countries. To make scale-across connectivity feasible, the algorithms responsible for ensuring high effective bandwidth and performance isolation had to evolve.
What is the role of distance-aware algorithms in scale-across networking?
One of the challenges with moving data across long distances is increased latency—even for data traversing an optical fiber as light. Light propagating through glass fiber incurs about 5 nanoseconds of delay per meter, so traveling 1 kilometer takes 5 microseconds, one way. These numbers may seem small in absolute terms, but for GPU-to-GPU communication, every microsecond counts.
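To make the arithmetic concrete, the short sketch below (plain Python, unrelated to any Spectrum-XGS API) turns the ~5 ns/m figure above into one-way and round-trip delays for a few representative distances:

```python
# Light in glass fiber travels at roughly 2e8 m/s, which works out
# to about 5 nanoseconds of propagation delay per meter.
NS_PER_METER = 5

def propagation_delay_us(distance_m: float) -> float:
    """One-way fiber propagation delay in microseconds."""
    return distance_m * NS_PER_METER / 1000

for label, meters in [("500 m (campus)", 500),
                      ("1 km", 1_000),
                      ("10 km (multi-site)", 10_000)]:
    one_way = propagation_delay_us(meters)
    print(f"{label}: {one_way:.1f} us one-way, {2 * one_way:.1f} us round-trip")
```

At 10 km, the distance used in the benchmark later in this post, propagation alone adds 50 microseconds each way before any switching or queuing delay is counted.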
Spectrum-XGS Ethernet features modified telemetry-based congestion control and adaptive routing algorithms that are optimized around the distance between communicating devices. Whenever a connection is initiated, the network notes whether the two devices are inside the same data center or separated by distance.
This helps the switch choose the best load-balancing approach for adaptive routing, and informs the SuperNIC how to pace its injection rate for congestion control. At the network level, this enables Spectrum-XGS Ethernet to handle communications holistically without incurring additional latency.
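Conceptually, distance-aware pacing follows from the bandwidth-delay product: a longer path needs more data in flight to keep the link full. The sketch below is purely illustrative (the class, the RTT cutoff, and the field names are invented, not the Spectrum-XGS implementation), showing how a congestion controller might seed its state differently for local versus long-haul peers:

```python
from dataclasses import dataclass

# Hypothetical cutoff separating intra-data-center peers from
# long-haul peers, based on measured round-trip time.
INTRA_DC_RTT_US = 10.0

@dataclass
class ConnectionState:
    rtt_us: float     # measured round-trip time to the peer (microseconds)
    link_gbps: float  # line rate of the SuperNIC port (gigabits per second)

    @property
    def is_long_haul(self) -> bool:
        return self.rtt_us > INTRA_DC_RTT_US

    def initial_window_bytes(self) -> float:
        # Bandwidth-delay product: the amount of data that must be in
        # flight to keep the pipe full over this path.
        bytes_per_sec = self.link_gbps / 8 * 1e9
        return bytes_per_sec * self.rtt_us * 1e-6

local = ConnectionState(rtt_us=5.0, link_gbps=400)    # same-rack peer
remote = ConnectionState(rtt_us=100.0, link_gbps=400) # 10 km away
print(local.is_long_haul, local.initial_window_bytes())   # ~250 KB window
print(remote.is_long_haul, remote.initial_window_bytes()) # ~5 MB window
```

The point of the sketch is the ratio: at the same 400 Gbps line rate, a 100 µs round trip requires roughly 20x more data in flight than a 5 µs one, which is why injection-rate logic tuned only for intra-data-center distances underperforms on long-haul links.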
Some of the key benefits of Spectrum-XGS Ethernet technology to scale-across networking include:
- Integrated, unified network architecture: Both Spectrum-X Ethernet scale-out and Spectrum-XGS Ethernet scale-across are based on the same hardware, software, and libraries. This leads to a unified approach to workload management and network operations that isn’t possible with off-the-shelf Ethernet.
- End-to-end, telemetry-based congestion control: The unified architecture also enables a global approach to network visibility. With comprehensive telemetry data from the network both inside and outside the data center, telemetry-based congestion management can be handled without the need for deep buffer switching.
- Intelligent, auto-adjusting load balancing: The Spectrum-X Ethernet AI fabric is both distance-aware and NVIDIA Collective Communications Library (NCCL)-aware, able to account for network traffic patterns that vary by site and compensate by dynamically adjusting thresholds and limits to ensure the highest performance.
- Minimized latency for scale-across workloads: Spectrum-XGS Ethernet is tuned to deliver predictable outcomes. This enables the network to account for data flows traveling over long distances and compensate for them, mitigating further latency penalties without introducing the jitter risk that comes with deep buffers.
- Elastic scale-across capacity: Since the same hardware can be used for scale-out and scale-across, networking resources can be re-allocated to support intra- or inter-data center traffic. Off-the-shelf shallow buffer Ethernet switches cannot be re-purposed for long-haul connectivity.
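To illustrate the load-balancing idea in the list above in its simplest form: telemetry-driven adaptive routing steers each flow or packet toward the least-congested candidate path. The snippet below is a toy sketch (port names and the telemetry shape are invented; the real Spectrum-X algorithms are far more sophisticated, folding in distance and NCCL awareness):

```python
# Toy adaptive-routing decision: among candidate egress ports, pick
# the one with the lowest estimated queue occupancy, as reported by
# in-band telemetry. Illustrative only; not the Spectrum-X algorithm.

def pick_egress(queue_occupancy: dict[str, int]) -> str:
    """Return the least-loaded egress port (occupancy in bytes)."""
    return min(queue_occupancy, key=queue_occupancy.get)

telemetry = {"uplink-0": 12_000, "uplink-1": 3_500, "uplink-2": 9_800}
print(pick_egress(telemetry))  # selects "uplink-1", the least loaded
```

The distance-aware refinement described above amounts to making both the occupancy estimates and the selection thresholds sensitive to how far away each path's endpoint is, rather than treating all candidate ports identically.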
What are the performance benefits of NVIDIA Spectrum-XGS Ethernet?
To show the impact of NVIDIA Spectrum-XGS Ethernet on scale-across performance, NVIDIA engineers ran NCCL primitives across multiple sites at a distance of 10 km and compared the results to off-the-shelf Ethernet. The results, shown in Figure 2 below, were significant:

NVIDIA Spectrum-XGS Ethernet delivers up to 1.9x higher NCCL all-reduce bandwidth over off-the-shelf Ethernet. The greatest speedup occurs with the larger message sizes, which are the most common with AI training workloads. These improvements to NCCL performance translate into faster job completion times for AI applications.
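For context on how such numbers are reported: the standard NCCL benchmarks derive a "bus bandwidth" for all-reduce by scaling the measured algorithm bandwidth (message size over time) by 2(n-1)/n, which makes results comparable across rank counts. The sketch below reproduces that arithmetic; the sample size, time, and rank count are made up for illustration:

```python
def allreduce_busbw_gbps(size_bytes: int, time_s: float, n_ranks: int) -> float:
    """Bus bandwidth for an all-reduce, in GB/s."""
    # Algorithm bandwidth: message size moved per unit time.
    algbw = size_bytes / time_s
    # An all-reduce moves 2*(n-1)/n of the data over the busiest link.
    return algbw * 2 * (n_ranks - 1) / n_ranks / 1e9

# Hypothetical run: a 1 GiB all-reduce across 64 ranks completing in 25 ms.
print(round(allreduce_busbw_gbps(1 << 30, 0.025, 64), 1))
```

Because job steps in synchronous training cannot proceed until the collective finishes, a bandwidth improvement at large message sizes translates almost directly into shorter iteration times.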
How does scale-across networking increase ROI for AI factories?
NVIDIA Spectrum-XGS Ethernet enhances the fungibility of AI infrastructure. By enabling data centers to communicate over any distance without performance degradation, Spectrum-XGS Ethernet creates a common architecture shared between scale-out and scale-across networking. Data centers built on Spectrum-XGS Ethernet can be seamlessly combined to operate as a single system, no matter how far apart they are. This allows mission-critical AI infrastructure to pool resources and consistently deliver value for advanced AI workloads.
To learn more about the technical innovations underpinning NVIDIA Spectrum-X Ethernet, see NVIDIA Spectrum-X Network Platform Architecture.