Sparsity
May 16, 2023
Sparsity in INT8: Training Workflow and Best Practices for NVIDIA TensorRT Acceleration
The training stage of deep learning (DL) models consists of learning numerous dense floating-point weight matrices, which results in a massive amount of...
12 MIN READ
Jul 20, 2021
Accelerating Inference with Sparsity Using the NVIDIA Ampere Architecture and NVIDIA TensorRT
This post was updated July 20, 2021, to reflect NVIDIA TensorRT 8.0 updates. When deploying a neural network, it's useful to think about how the network could be...
8 MIN READ
Dec 08, 2020
Exploiting NVIDIA Ampere Structured Sparsity with cuSPARSELt
Deep neural networks achieve outstanding performance in a variety of fields, such as computer vision, speech recognition, and natural language processing. The...
9 MIN READ
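The three posts above all build on the 2:4 fine-grained structured sparsity pattern supported by NVIDIA Ampere architecture Tensor Cores: in every group of four consecutive weights, at most two are nonzero. As a minimal sketch of that pattern (not code from any of these posts; the function name and the use of NumPy are illustrative assumptions), magnitude-based 2:4 pruning looks like this:

    import numpy as np

    def prune_2_to_4(weights: np.ndarray) -> np.ndarray:
        """Illustrative 2:4 structured pruning: in every group of 4
        consecutive values along the last axis, keep the 2 entries with
        the largest magnitude and zero out the other 2."""
        assert weights.shape[-1] % 4 == 0, "last dim must be a multiple of 4"
        groups = weights.reshape(-1, 4)               # one row per group of 4
        # Indices of the 2 smallest-magnitude entries in each group.
        drop = np.argsort(np.abs(groups), axis=1)[:, :2]
        pruned = groups.copy()
        np.put_along_axis(pruned, drop, 0.0, axis=1)  # zero the 2 smallest
        return pruned.reshape(weights.shape)

    w = np.random.randn(8, 16).astype(np.float32)
    w_sparse = prune_2_to_4(w)
    # At least 2 zeros in each group of 4: 50% structured sparsity.
    assert np.all((w_sparse.reshape(-1, 4) == 0).sum(axis=1) >= 2)

Libraries such as cuSPARSELt and TensorRT consume weights in this pattern (plus compressed metadata) to dispatch the sparse Tensor Core kernels; the sketch only produces the zero pattern itself.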
May 14, 2020
Defining AI Innovation with NVIDIA DGX A100
Organizations of all kinds are incorporating AI into their research, development, product, and business processes. This helps them meet and exceed their...
15 MIN READ
May 14, 2020
State-of-the-Art Language Modeling Using Megatron on the NVIDIA A100 GPU
Recent work has demonstrated that larger language models dramatically advance the state of the art in natural language processing (NLP) applications such as...
9 MIN READ
May 14, 2020
NVIDIA Ampere Architecture In-Depth
Today, during the 2020 NVIDIA GTC keynote address, NVIDIA founder and CEO Jensen Huang introduced the NVIDIA A100 GPU, based on the new NVIDIA Ampere GPU...
30 MIN READ
Sep 11, 2017
Gradient Boosting, Decision Trees and XGBoost with CUDA
Gradient boosting is a powerful machine learning algorithm used to achieve state-of-the-art accuracy on a variety of tasks such as regression,...
17 MIN READ
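For readers who want to try the CUDA-accelerated gradient boosting described above, the following is a minimal, hedged usage sketch: the synthetic data and parameter values are illustrative, and tree_method="gpu_hist" assumes an XGBoost build with CUDA support (newer XGBoost releases express the same thing as device="cuda" with tree_method="hist").

    import numpy as np
    import xgboost as xgb

    # Illustrative synthetic regression data.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((10_000, 20)).astype(np.float32)
    y = X[:, 0] * 2.0 + rng.standard_normal(10_000).astype(np.float32)

    dtrain = xgb.DMatrix(X, label=y)
    params = {
        "objective": "reg:squarederror",
        "tree_method": "gpu_hist",  # CUDA-accelerated histogram algorithm
        "max_depth": 6,
    }
    booster = xgb.train(params, dtrain, num_boost_round=100)
    preds = booster.predict(dtrain)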
Apr 28, 2015
Parallel Direct Solvers with cuSOLVER: Batched QR
[Note: Lung Sheng Chien from NVIDIA also contributed to this post.] A key bottleneck for most science and engineering simulations is the solution of sparse...
15 MIN READ
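As a conceptual illustration of the batched QR idea in that post (factoring and solving many small independent systems at once), here is a NumPy sketch; it shows only the math, not the cuSOLVER API, and stacked np.linalg.qr assumes NumPy 1.22 or newer:

    import numpy as np

    # Solve many small independent systems A_i x_i = b_i in a batch.
    # cuSOLVER's batched routines do this on the GPU; this NumPy
    # version only demonstrates the underlying linear algebra.
    batch, n = 1024, 8
    rng = np.random.default_rng(1)
    A = rng.standard_normal((batch, n, n))
    b = rng.standard_normal((batch, n))

    Q, R = np.linalg.qr(A)               # batched QR factorization
    # Solve R x = Q^T b for each system in the batch.
    qtb = np.einsum("bij,bi->bj", Q, b)  # Q^T b per batch element
    x = np.stack([np.linalg.solve(R[i], qtb[i]) for i in range(batch)])

    assert np.allclose(np.einsum("bij,bj->bi", A, x), b, atol=1e-6)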