Tag: MPI


Accelerating Scientific Applications in HPC Clusters with NVIDIA DPUs Using the MVAPICH2-DPU MPI Library

HPC and AI have driven supercomputers into wide commercial use as the primary data processing engines enabling research, scientific discoveries… 7 MIN READ

Achieve up to 75% Performance Improvement for Communication Intensive HPC Applications with NVTAGS

NVTAGS automates intelligent GPU assignment by profiling HPC applications and launching them with a custom GPU assignment tailored to an application and system… 2 MIN READ
Artificial Intelligence

Fast Multi-GPU collectives with NCCL

Today many servers contain 8 or more GPUs. In principle, then, scaling an application from one to many GPUs should provide a tremendous performance boost. 10 MIN READ
Accelerated Computing

GPU Pro Tip: Track MPI Calls In The NVIDIA Visual Profiler

Often when profiling GPU-accelerated applications that run on clusters, one needs to visualize MPI (Message Passing Interface) calls on the GPU timeline in the… 5 MIN READ
Accelerated Computing

Benchmarking GPUDirect RDMA on Modern Server Platforms

NVIDIA GPUDirect RDMA is a technology that enables a direct path for data exchange between the GPU and third-party peer devices using standard features of PCI… 13 MIN READ
Accelerated Computing

CUDA Pro Tip: Profiling MPI Applications

Use nvprof and NVTX to profile your MPI+CUDA application. 4 MIN READ