Posts by Thomas Gillis
Data Center / Cloud
Jul 14, 2025
Enabling Fast Inference and Resilient Training with NCCL 2.27
As AI workloads scale, fast and reliable GPU communication becomes vital, not just for training, but increasingly for inference at scale. The NVIDIA Collective...
9 MIN READ
Networking / Communications
Jul 14, 2025
NCCL Deep Dive: Cross Data Center Communication and Network Topology Awareness
As the scale of AI training increases, a single data center (DC) is not sufficient to deliver the required computational power. Most recent approaches to...
9 MIN READ
Networking / Communications
Jan 31, 2025
New Scaling Algorithm and Initialization with NVIDIA Collective Communications Library 2.23
The NVIDIA Collective Communications Library (NCCL) implements multi-GPU and multinode communication primitives optimized for NVIDIA GPUs and networking. NCCL...
9 MIN READ