Technical Walkthrough

Accelerating NVSHMEM 2.0 Team-Based Collectives Using NCCL

NVSHMEM 2.0 is introducing a new API for performing collective operations based on the Team Management feature of the OpenSHMEM 1.5 specification. A team is a… 9 MIN READ
Technical Walkthrough

Case Study: ResNet50 with DALI

Let’s imagine a situation. You buy a brand-new, cutting-edge, Volta-powered DGX-2 server. You’ve done your math right, expecting a 2x performance increase in… 12 MIN READ
Technical Walkthrough

Scaling Deep Learning Training with NCCL

NVIDIA Collective Communications Library (NCCL) provides optimized implementations of inter-GPU communication operations, such as allreduce and variants. 6 MIN READ
Technical Walkthrough

NVSwitch Accelerates NVIDIA DGX-2

NVIDIA CEO Jensen Huang described the NVIDIA® DGX-2™ server as "the world's largest GPU" at its launch during GPU Technology Conference earlier this year. 8 MIN READ
Technical Walkthrough

Training AI for Self-Driving Vehicles: the Challenge of Scale

Modern deep neural networks, such as those used in self-driving vehicles, require a mind-boggling amount of computational power. Today a single computer… 22 MIN READ
Technical Walkthrough

NVIDIA DGX-1: The Fastest Deep Learning System

One year ago today, NVIDIA announced the NVIDIA® DGX-1™, an integrated system for deep learning. DGX-1 (shown in Figure 1) features eight Tesla P100 GPU… 12 MIN READ