Technical Walkthrough

Case Study: ResNet50 with DALI

Let’s imagine a situation. You buy a brand-new, cutting-edge, Volta-powered DGX-2 server. You’ve done your math right, expecting a 2x performance increase... 11 MIN READ
Technical Walkthrough

Machine Learning Acceleration in Vulkan with Cooperative Matrices

Machine learning harnesses computing power to solve a variety of ‘hard’ problems that seemed impossible to program using traditional languages and... 8 MIN READ
Technical Walkthrough

Tensor Core Programming Using CUDA Fortran

The CUDA Fortran compiler from PGI now supports programming Tensor Cores with NVIDIA’s Volta V100 and Turing GPUs. This enables scientific programmers using... 12 MIN READ
Technical Walkthrough

Speeding Up Semantic Segmentation Using MATLAB Container from NVIDIA NGC

Gone are the days of using a single GPU to train a deep learning model. With computationally intensive algorithms such as semantic segmentation, a single GPU... 8 MIN READ
Technical Walkthrough

Video Series: Mixed-Precision Training Techniques Using Tensor Cores for Deep Learning

Neural networks with thousands of layers and millions of neurons demand high performance and faster training times. The complexity and size of neural networks... 5 MIN READ
Technical Walkthrough

Using Tensor Cores for Mixed-Precision Scientific Computing

Double-precision floating point (FP64) has been the de facto standard for doing scientific simulation for several decades. Most numerical methods used in... 9 MIN READ