GTC 2020: Accelerating Large-Scale GW Calculations in Material Science
Charlene Yang, NERSC, Lawrence Berkeley National Laboratory | Mauro Del Ben, CRD, LBNL
Learn the balancing act of porting a large-scale HPC code to modern GPUs, where a plethora of architectural characteristics can both accelerate and limit performance. We'll showcase various techniques used to accelerate the materials science code BerkeleyGW on NVIDIA GPUs, targeting large-scale simulations with thousands of atoms, matrices of up to 1 million by 1 million, and reductions over thousands of billions of numbers. These techniques include the use of cuBLAS and cuFFT, pinned memory, streams, batched operations, shared memory, and the overlapping of Message Passing Interface (MPI) communication with GPU computation. Excellent strong and weak scaling is observed on thousands of Volta GPUs, and a 16x improvement in FLOPs/Watt efficiency is obtained compared to the CPU-only implementation.
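Two of the techniques named above, pinned memory and streams, are commonly combined to overlap host-device transfers with kernel execution. The following is a minimal generic sketch of that pattern, not code from BerkeleyGW or from the talk. The kernel `scale`, the helper `run_pipeline`, the chunk sizes, and the choice of two streams are all hypothetical and chosen only for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: doubles-precision in-place scaling, a stand-in for real work.
__global__ void scale(double *x, int n, double a) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

// Hypothetical helper: processes n_chunks chunks of `chunk` doubles each.
// Returns the number of mismatched elements (0 on success), or -1 on a CUDA error.
int run_pipeline(int n_chunks, int chunk) {
    const size_t n = (size_t)n_chunks * chunk;
    const size_t chunk_bytes = (size_t)chunk * sizeof(double);
    double *h = nullptr, *d = nullptr;

    // Pinned (page-locked) host memory lets cudaMemcpyAsync run asynchronously.
    if (cudaMallocHost(&h, n * sizeof(double)) != cudaSuccess) return -1;
    if (cudaMalloc(&d, n * sizeof(double)) != cudaSuccess) {
        cudaFreeHost(h);
        return -1;
    }
    for (size_t i = 0; i < n; ++i) h[i] = (double)i;

    // Two streams (an arbitrary choice): work in one stream can overlap
    // with copies in the other.
    const int n_streams = 2;
    cudaStream_t streams[n_streams];
    for (int k = 0; k < n_streams; ++k) cudaStreamCreate(&streams[k]);

    for (int c = 0; c < n_chunks; ++c) {
        cudaStream_t st = streams[c % n_streams];
        double *hp = h + (size_t)c * chunk;
        double *dp = d + (size_t)c * chunk;
        // Copy in, compute, copy out: ordered within a stream,
        // free to overlap across streams.
        cudaMemcpyAsync(dp, hp, chunk_bytes, cudaMemcpyHostToDevice, st);
        scale<<<(chunk + 255) / 256, 256, 0, st>>>(dp, chunk, 2.0);
        cudaMemcpyAsync(hp, dp, chunk_bytes, cudaMemcpyDeviceToHost, st);
    }

    // Wait for all streams, then verify on the host.
    int bad = 0;
    if (cudaDeviceSynchronize() != cudaSuccess) {
        bad = -1;
    } else {
        for (size_t i = 0; i < n; ++i)
            if (h[i] != 2.0 * (double)i) ++bad;
    }

    for (int k = 0; k < n_streams; ++k) cudaStreamDestroy(streams[k]);
    cudaFree(d);
    cudaFreeHost(h);
    return bad;
}

int main() {
    int bad = run_pipeline(8, 1 << 20);
    printf("mismatches: %d\n", bad);
    return bad == 0 ? 0 : 1;
}
```

Whether the copies and kernels actually overlap depends on the device's copy engines and on the chunk size, so a profiler such as Nsight Systems is the usual way to confirm it. The same chunked structure is often extended to overlap MPI communication with GPU work, though the details of how the talk does this are not given in the abstract.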