Technical Blog
Tag: MPI
Technical Walkthrough
Jun 28, 2021
Accelerating Scientific Applications in HPC Clusters with NVIDIA DPUs Using the MVAPICH2-DPU MPI Library
HPC and AI have driven supercomputers into wide commercial use as the primary data processing engines enabling research, scientific discoveries, and product development. These systems can carry comple...
7 MIN READ
News
Jun 23, 2021
Achieve up to 75% Performance Improvement for Communication Intensive HPC Applications with NVTAGS
NVTAGS automates intelligent GPU assignment by profiling HPC applications and launching them with a custom GPU assignment tailored to an application and system to minimize communication costs.
2 MIN READ
Technical Walkthrough
Apr 07, 2016
Fast Multi-GPU collectives with NCCL
Today, many servers contain 8 or more GPUs. In principle, then, scaling an application from one to many GPUs should provide a tremendous performance boost.
10 MIN READ
Technical Walkthrough
May 05, 2015
GPU Pro Tip: Track MPI Calls In The NVIDIA Visual Profiler
Often when profiling GPU-accelerated applications that run on clusters, one needs to visualize MPI (Message Passing Interface) calls on the GPU timeline in the…
5 MIN READ
Technical Walkthrough
Oct 07, 2014
Benchmarking GPUDirect RDMA on Modern Server Platforms
NVIDIA GPUDirect RDMA is a technology which enables a direct path for data exchange between the GPU and third-party peer devices using standard features of PCI…
13 MIN READ
Technical Walkthrough
Jun 19, 2014
CUDA Pro Tip: Profiling MPI Applications
Use nvprof and NVTX to profile your MPI+CUDA application.
4 MIN READ