NVIDIA Apex: Tools for Easy Mixed-Precision Training in PyTorch

Accelerated Computing, Artificial Intelligence, Amp, Featured, FP16_Optimizer, machine learning and AI, Mixed Precision, PyTorch

Nadeem Mohammad, posted Dec 03 2018

Most deep learning frameworks, including PyTorch, train using 32-bit floating point (FP32) arithmetic by default.

New Optimizations To Accelerate Deep Learning Training on NVIDIA GPUs

Accelerated Computing, Artificial Intelligence, Containers, cuDNN, DALI, machine learning and AI, MXNet, NGC, PyTorch, Software Tools and Libraries, TensorFlow

Nadeem Mohammad, posted Dec 03 2018

The pace of AI adoption across diverse industries depends on maximizing data scientists’ productivity.

The Peak-Performance-Percentage Analysis Method for Optimizing Any GPU Workload

Game Development, DX11, DX12, GameWorks, HBAO+, Nsight Graphics, Nsight Visual Studio Edition, OpenGL

Nadeem Mohammad, posted Nov 27 2018

Figuring out how to reduce the GPU frame time of a rendering application on PC is challenging for even the most experienced PC game developers.

Parallel Shader Compilation for Ray Tracing Pipeline States

Design & Visualization, Game Development, DXR, pipeline state objects, PSO, ray tracing, Shaders

Nadeem Mohammad, posted Nov 19 2018

In ray tracing, a single pipeline state object (PSO) can contain any number of shaders. This number can grow large depending on the scene content and the ray types handled by the PSO, and the construction cost of the state object can increase significantly as a result.

Accelerating Intelligent Video Analytics with Transfer Learning Toolkit

Artificial Intelligence, Smart Cities, DeepStream SDK, IVA, machine learning and AI, TLT, Transfer Learning Toolkit

Nadeem Mohammad, posted Nov 13 2018

Over the past several years, NVIDIA has been developing solutions to make AI and its benefits accessible to every industry.
