Technical Walkthrough 0

Using Fortran Standard Parallel Programming for GPU Acceleration

We present lessons learned from refactoring a Fortran application to use modern do concurrent loops in place of OpenACC for GPU acceleration. 12 MIN READ
Technical Walkthrough 0

Multi-GPU Programming with Standard Parallel C++, Part 2

By developing applications using MPI and standard C++ language features, it is possible to program for GPUs without sacrificing portability or performance. 13 MIN READ
Technical Walkthrough 0

Multi-GPU Programming with Standard Parallel C++, Part 1

By developing applications using MPI and standard C++ language features, it is possible to program for GPUs without sacrificing portability or performance. 17 MIN READ
Technical Walkthrough 5

Multinode Multi-GPU: Using NVIDIA cuFFTMp FFTs at Scale

cuFFTMp is a multi-node, multi-process extension to cuFFT that enables scientists and engineers to solve challenging problems on exascale platforms. 9 MIN READ
Technical Walkthrough 0

Analyzing the RNA-Sequence of 1.3M Mouse Brain Cells with RAPIDS on NVIDIA GPUs

Learn how using RAPIDS to accelerate single-cell RNA-sequence analysis on a single NVIDIA V100 GPU delivers a massive performance increase. 8 MIN READ
News 0

Achieve up to 75% Performance Improvement for Communication Intensive HPC Applications with NVTAGS

NVTAGS automates intelligent GPU assignment by profiling HPC applications and launching them with a custom GPU assignment tailored to an application and system to minimize communication costs. 2 MIN READ