Technical Walkthrough

Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model

MT-NLG has 3x the number of parameters compared to the existing largest model of this type and demonstrates unmatched accuracy in a broad set of natural… 13 MIN READ
News

New on NGC: Latest Versions of NeMo, HPC SDK, DOCA, PyTorch Lightning, and More 

Learn about the latest additions and software updates to the NVIDIA NGC catalog, a hub of GPU-optimized software that simplifies and accelerates workflows. 3 MIN READ
Technical Walkthrough

Discovering New Features in CUDA 11.4

This post shares an overview of the key capabilities released in CUDA 11.4. 14 MIN READ
Technical Walkthrough

Using the NVIDIA CUDA Stream-Ordered Memory Allocator, Part 2

In part 1 of this series, we introduced new API functions that enable memory allocation and deallocation to be stream-ordered operations. In this post… 9 MIN READ
Technical Walkthrough

Using the NVIDIA CUDA Stream-Ordered Memory Allocator, Part 1

This post introduces new API functions that enable memory allocation and deallocation to be stream-ordered operations. 14 MIN READ
News

NVIDIA Announces Availability for Arm HPC Developer Kit with New HPC SDK v21.7

The DevKit is an integrated hardware-software platform for creating, evaluating, and benchmarking HPC, AI, and scientific computing applications for Arm server… 2 MIN READ