Posts by Jiri Kraus
Technical Walkthrough
Jan 22, 2021
Accelerating NVSHMEM 2.0 Team-Based Collectives Using NCCL
NVSHMEM 2.0 introduces a new API for performing collective operations based on the Team Management feature of the OpenSHMEM 1.5 specification. A team is a…
9 MIN READ
Technical Walkthrough
Nov 17, 2014
Increase Performance with GPU Boost and K80 Autoboost
NVIDIA® GPU Boost™ is a feature available on NVIDIA® GeForce® and Tesla® GPUs that boosts application performance by increasing GPU core and memory clock rates…
11 MIN READ
Technical Walkthrough
Jun 19, 2014
CUDA Pro Tip: Profiling MPI Applications
Use nvprof and NVTX to profile your MPI+CUDA application.
4 MIN READ
Technical Walkthrough
Jun 03, 2014
Accelerating a C++ CFD Code with OpenACC
This post describes the process of accelerating the ZFD C++ Computational Fluid Dynamics code using OpenACC and Tesla K40 GPUs.
12 MIN READ
Technical Walkthrough
Sep 03, 2013
CUDA Pro Tip: Generate Custom Application Profile Timelines with NVTX
The last time you used the timeline feature in the NVIDIA Visual Profiler, Nsight VSE, or the new Nsight Systems to analyze a complex application…
8 MIN READ
Technical Walkthrough
Mar 27, 2013
Benchmarking CUDA-Aware MPI
I introduced CUDA-aware MPI in my last post, with an introduction to MPI and a description of the functionality and benefits of CUDA-aware MPI. In this post I…
8 MIN READ