Technical Walkthrough

Scaling VASP with NVIDIA Magnum IO

You could make an argument that the history of civilization and technological advancement is the history of the search and discovery of materials. Ages are... 22 MIN READ
Technical Walkthrough

Accelerating Scientific Applications in HPC Clusters with NVIDIA DPUs Using the MVAPICH2-DPU MPI Library

High-performance computing (HPC) and AI have driven supercomputers into wide commercial use as the primary data processing engines enabling research, scientific... 7 MIN READ
News

Achieve up to 75% Performance Improvement for Communication Intensive HPC Applications with NVTAGS

Many GPU-accelerated HPC applications spend a substantial portion of their time in non-uniform, GPU-to-GPU communications. Additionally, in many HPC systems,... 2 MIN READ
Technical Walkthrough

Fast Multi-GPU collectives with NCCL

Today many servers contain 8 or more GPUs. In principle then, scaling an application from one to many GPUs should provide a tremendous performance boost. But in... 10 MIN READ
Technical Walkthrough

GPU Pro Tip: Track MPI Calls In The NVIDIA Visual Profiler

Often when profiling GPU-accelerated applications that run on clusters, one needs to visualize MPI (Message Passing Interface) calls on the GPU timeline in the... 5 MIN READ
Technical Walkthrough

Benchmarking GPUDirect RDMA on Modern Server Platforms

NVIDIA GPUDirect RDMA is a technology which enables a direct path for data exchange between the GPU and third-party peer devices using standard features of PCI... 13 MIN READ