Sparsity
May 16, 2023
Sparsity in INT8: Training Workflow and Best Practices for NVIDIA TensorRT Acceleration
The training stage of deep learning (DL) models consists of learning numerous dense floating-point weight matrices, which results in a massive amount of...
12 MIN READ
Jul 20, 2021
Accelerating Inference with Sparsity Using the NVIDIA Ampere Architecture and NVIDIA TensorRT
This post was updated July 20, 2021, to reflect NVIDIA TensorRT 8.0 updates. When deploying a neural network, it's useful to think about how the network could be...
8 MIN READ
Dec 08, 2020
Exploiting NVIDIA Ampere Structured Sparsity with cuSPARSELt
Deep neural networks achieve outstanding performance in a variety of fields, such as computer vision, speech recognition, and natural language processing. The...
9 MIN READ
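The three posts above all build on the 2:4 fine-grained structured sparsity pattern supported by NVIDIA Ampere architecture Tensor Cores: in every group of four consecutive weights, at most two are nonzero. As a minimal sketch of that pattern (not code from any of these posts; the function name and the use of NumPy are illustrative assumptions), magnitude-based 2:4 pruning looks like this:

    import numpy as np

    def prune_2_to_4(weights: np.ndarray) -> np.ndarray:
        """Illustrative 2:4 structured pruning: in every group of 4
        consecutive values along the last axis, keep the 2 entries with
        the largest magnitude and zero out the other 2."""
        assert weights.shape[-1] % 4 == 0, "last dim must be a multiple of 4"
        groups = weights.reshape(-1, 4)               # one row per group of 4
        # Indices of the 2 smallest-magnitude entries in each group.
        drop = np.argsort(np.abs(groups), axis=1)[:, :2]
        pruned = groups.copy()
        np.put_along_axis(pruned, drop, 0.0, axis=1)  # zero the 2 smallest
        return pruned.reshape(weights.shape)

    w = np.random.randn(8, 16).astype(np.float32)
    w_sparse = prune_2_to_4(w)
    # At least 2 zeros in each group of 4: 50% structured sparsity.
    assert np.all((w_sparse.reshape(-1, 4) == 0).sum(axis=1) >= 2)

Libraries such as cuSPARSELt and TensorRT consume weights in this pattern (plus compressed metadata) to dispatch the sparse Tensor Core kernels; the sketch only produces the zero pattern itself.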
May 14, 2020
Defining AI Innovation with NVIDIA DGX A100
Organizations of all kinds are incorporating AI into their research, development, product, and business processes. This helps them meet and exceed their...
15 MIN READ
May 14, 2020
State-of-the-Art Language Modeling Using Megatron on the NVIDIA A100 GPU
Recent work has demonstrated that larger language models dramatically advance the state of the art in natural language processing (NLP) applications such as...
9 MIN READ
May 14, 2020
NVIDIA Ampere Architecture In-Depth
Today, during the 2020 NVIDIA GTC keynote address, NVIDIA founder and CEO Jensen Huang introduced the NVIDIA A100 GPU, based on the new NVIDIA Ampere GPU...
30 MIN READ
Sep 11, 2017
Gradient Boosting, Decision Trees and XGBoost with CUDA
Gradient boosting is a powerful machine learning algorithm used to achieve state-of-the-art accuracy on a variety of tasks such as regression,...
17 MIN READ
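For readers who want to try the CUDA-accelerated gradient boosting described above, the following is a minimal, hedged usage sketch: the synthetic data and parameter values are illustrative, and tree_method="gpu_hist" assumes an XGBoost build with CUDA support (newer XGBoost releases express the same thing as device="cuda" with tree_method="hist").

    import numpy as np
    import xgboost as xgb

    # Illustrative synthetic regression data.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((10_000, 20)).astype(np.float32)
    y = X[:, 0] * 2.0 + rng.standard_normal(10_000).astype(np.float32)

    dtrain = xgb.DMatrix(X, label=y)
    params = {
        "objective": "reg:squarederror",
        "tree_method": "gpu_hist",  # CUDA-accelerated histogram algorithm
        "max_depth": 6,
    }
    booster = xgb.train(params, dtrain, num_boost_round=100)
    preds = booster.predict(dtrain)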
Apr 28, 2015
Parallel Direct Solvers with cuSOLVER: Batched QR
[Note: Lung Sheng Chien from NVIDIA also contributed to this post.] A key bottleneck for most science and engineering simulations is the solution of sparse...
15 MIN READ
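As a conceptual illustration of the batched QR idea in that post (factoring and solving many small independent systems at once), here is a NumPy sketch; it shows only the math, not the cuSOLVER API, and stacked np.linalg.qr assumes NumPy 1.22 or newer:

    import numpy as np

    # Solve many small independent systems A_i x_i = b_i in a batch.
    # cuSOLVER's batched routines do this on the GPU; this NumPy
    # version only demonstrates the underlying linear algebra.
    batch, n = 1024, 8
    rng = np.random.default_rng(1)
    A = rng.standard_normal((batch, n, n))
    b = rng.standard_normal((batch, n))

    Q, R = np.linalg.qr(A)               # batched QR factorization
    # Solve R x = Q^T b for each system in the batch.
    qtb = np.einsum("bij,bi->bj", Q, b)  # Q^T b per batch element
    x = np.stack([np.linalg.solve(R[i], qtb[i]) for i in range(batch)])

    assert np.allclose(np.einsum("bij,bj->bi", A, x), b, atol=1e-6)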