Technical Walkthrough

Reducing Application Build Times Using CUDA C++ Compilation Aids

The CUDA 11.5 C++ compiler introduces enhancements that help reduce CUDA application build times. 13 MIN READ
Technical Walkthrough

Revealing New Features in the CUDA 11.5 Toolkit

A technical description of the new features and capabilities in the CUDA Toolkit 11.5 release. 11 MIN READ
Technical Walkthrough

Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model

MT-NLG has 3x the number of parameters compared to the existing largest model of this type and demonstrates unmatched accuracy in a broad set of natural… 13 MIN READ
News

New on NGC: Latest Versions of NeMo, HPC SDK, DOCA, PyTorch Lightning, and More 

Learn about the latest additions and software updates to the NVIDIA NGC catalog, a hub of GPU-optimized software that simplifies and accelerates workflows. 3 MIN READ
Technical Walkthrough

Discovering New Features in CUDA 11.4

This post shares an overview of the key capabilities released in CUDA 11.4. 14 MIN READ
Technical Walkthrough

Using the NVIDIA CUDA Stream-Ordered Memory Allocator, Part 2

In part 1 of this series, we introduced new API functions that enable memory allocation and deallocation to be stream-ordered operations. In this post… 9 MIN READ