This week’s Spotlight is on Yifeng Cui, director of the High Performance GeoComputing Lab (HPGeoC) at the San Diego Supercomputer Center (and adjunct professor in the Department of Geological Sciences at San Diego State University).
HPGeoC was recently named a winner of the HPC Innovation Excellence Award by IDC for developing a highly scalable computer code that promises to dramatically cut research times and energy costs in simulating seismic hazards. The video below shows some of the earthquake simulations performed by HPGeoC.
NVIDIA: How are GPUs helping you solve key challenges in your field?
Yifeng: Our team developed a scalable GPU code based on AWP-ODC, an Anelastic Wave Propagation code originally developed by Prof. Kim Olsen of SDSU. This code has two versions that give equivalent results. The first version can efficiently calculate extreme-scale ground motions at many sites. The second can efficiently calculate ground motions from many single-site ruptures in a capacity-computing mode. The optimization of the code yields roughly a 110x speedup over the CPU version in the key strain tensor calculations critical to probabilistic seismic hazard analysis.
NVIDIA: What specific approaches do you use to apply GPU computing to your work?
Yifeng: This code is a memory-bound stencil code whose performance is limited by low computational intensity and poor data locality. We redesigned the Fortran code in C to maximize throughput and memory locality. Good scalability was achieved through a two-layer 2D decomposition and an algorithm-level communication-reduction scheme, which eliminates the stress data exchange otherwise needed in every iteration.
CUDA asynchronous memory copy operations allow PCIe data transfers to overlap effectively with GPU computation. A two-layer scalable I/O technique was developed to efficiently handle many terabytes of dynamic source and media inputs, as well as 3D volume velocity outputs. We are also tuning co-scheduling to fully utilize both CPUs and GPUs in hybrid heterogeneous systems. We are grateful for NVIDIA's support during our implementation process.
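The transfer/compute overlap described above can be sketched with CUDA streams: the interior update runs in one stream while a halo buffer is staged across PCIe in another. This is a minimal illustrative sketch, not the AWP-ODC implementation; the kernel, buffer names, and sizes are made up:

```cuda
// Sketch: overlap a PCIe host-to-device copy with GPU computation
// using two CUDA streams. Names are illustrative, not AWP-ODC's.
#include <cuda_runtime.h>

__global__ void update_interior(float *v, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) v[i] += 1.0f;  // stand-in for the stencil update
}

void step(float *d_vel, float *h_halo, float *d_halo,
          int n_interior, int n_halo,
          cudaStream_t compute, cudaStream_t copy) {
    // 1. Launch the interior update; it does not touch halo data.
    update_interior<<<(n_interior + 255) / 256, 256, 0, compute>>>(
        d_vel, n_interior);

    // 2. Concurrently stage the halo across PCIe. h_halo must be
    //    pinned (cudaHostAlloc) for the copy to be truly asynchronous.
    cudaMemcpyAsync(d_halo, h_halo, n_halo * sizeof(float),
                    cudaMemcpyHostToDevice, copy);

    // 3. Wait for both streams before a boundary kernel (not shown)
    //    consumes the freshly arrived halo.
    cudaStreamSynchronize(copy);
    cudaStreamSynchronize(compute);
}
```

Because the copy and the kernel live in different streams, the hardware can execute them concurrently, hiding much of the PCIe transfer time behind useful computation.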
NVIDIA: Tell us about your use of the Titan system at Oak Ridge National Lab.
Yifeng: We simulated realistic 0-10 hertz ground motions on a mesh of 443 billion elements, in a calculation that includes both small-scale fault geometry and media complexity, at a model size far beyond what had been done previously. This was done in collaboration with Profs. Olsen and Steve Day of SDSU; Prof. Thomas Jordan of USC, the Director of SCEC; and others at SCEC. The validation simulation on Titan demonstrated ideal scalability up to 8K Titan nodes and sustained 2.3 petaflop/s on 16K Titan nodes.
NVIDIA: How did you get interested in this field?
Yifeng: It was the NPACI program (the National Partnership for Advanced Computational Infrastructure, housed at SDSC until 2005) that attracted me to HPC more than a decade ago. Working with many talented colleagues from diverse science backgrounds, in an environment that encouraged innovation through freedom, inspired me to pursue new ideas on computational problems. In 2004, I helped enable the San Andreas fault scenario called TeraShake, the first terascale earthquake simulation. It discovered how the rupture directivity of the southern San Andreas fault, a source effect, could couple with the excitation of sedimentary basins, a site effect, to substantially increase the seismic hazard in Los Angeles. Since then, I have been very fortunate to work with some of the nation's finest seismologists and support their earthquake research.
NVIDIA: When did you first think of applying GPUs to earthquake simulations?
Yifeng: AWP-ODC had been transformed from a 'personal' researcher code into a SCEC community code for large-scale earthquake simulations. Although highly scalable, the CPU version of the code would still take hundreds of millions of core-hours to generate the California state-wide seismic hazard map at a maximum frequency of 1 hertz. Using GPUs provided an alternative solution.
NVIDIA: What advice would you offer other seismologists looking at using CUDA?
Yifeng: Know your code and what you are looking for, and develop a good long-term plan before jumping into any new programming model. Time-to-solution is, I would say, the single most important metric for scientific applications. Computing is changing more rapidly than ever before, which is particularly challenging for earthquake applications; however, these changes also offer an unprecedented opportunity to redesign your code. Keep an open mind and enjoy working with emerging many-core architectures.
Read more GPU Computing Spotlights.