Paulius Micikevicius is a Director in the Compute Architecture and Applied Deep Learning Research groups at NVIDIA. He joined NVIDIA in 2007, prior to which he was an assistant professor of computer science at Armstrong Atlantic State University. Paulius holds a PhD in computer science from the University of Central Florida.

Accelerating AI Training with NVIDIA TF32 Tensor Cores

NVIDIA Ampere GPU architecture introduced the third generation of Tensor Cores, with the new TensorFloat32 (TF32) mode for accelerating FP32 convolutions and… 10 MIN READ

Tips for Optimizing GPU Performance Using Tensor Cores

Our most popular question is "What can I do to get great GPU performance for deep learning?" We’ve recently published a detailed Deep Learning Performance Guide… 13 MIN READ

Mixed-Precision Training of Deep Neural Networks

Deep Neural Networks (DNNs) have led to breakthroughs in a number of areas, including image processing and understanding, language modeling… 9 MIN READ