Linear Algebra

Dec 17, 2025

Solving Large-Scale Linear Sparse Problems with NVIDIA cuDSS

Solving large-scale problems in Electronic Design Automation (EDA), Computational Fluid Dynamics (CFD), and advanced optimization workflows has become the norm...

16 MIN READ

Feb 25, 2025

NVIDIA cuDSS Advances Solver Technologies for Engineering and Scientific Computing

NVIDIA cuDSS is a first-generation sparse direct solver library designed to accelerate engineering and scientific computing. cuDSS is increasingly adopted in...

12 MIN READ

Apr 20, 2023

A Comprehensive Overview of Regression Evaluation Metrics

As a data scientist, evaluating machine learning model performance is a crucial aspect of your work. To do so effectively, you have a wide range of statistical...

17 MIN READ

Dec 05, 2017

CUTLASS: Fast Linear Algebra in CUDA C++

Update May 21, 2018: CUTLASS 1.0 is now available as Open Source software at the CUTLASS repository. CUTLASS 1.0 has changed substantially from our preview...

25 MIN READ

Oct 17, 2017

Programming Tensor Cores in CUDA 9

A defining feature of the new NVIDIA Volta GPU architecture is Tensor Cores, which give the NVIDIA V100 accelerator a peak throughput that is 12x...

16 MIN READ

Feb 27, 2017

Pro Tip: cuBLAS Strided Batched Matrix Multiply

There’s a new computational workhorse in town. For decades, general matrix-matrix multiply, known as GEMM in Basic Linear Algebra Subprograms (BLAS)...

10 MIN READ

Jun 09, 2015

Graph Coloring: More Parallelism for Incomplete-LU Factorization

In this blog post I will briefly discuss the importance and simplicity of graph coloring and its application to one of the most common problems in sparse...

12 MIN READ

Oct 23, 2014

Optimizing the High Performance Conjugate Gradient Benchmark on GPUs

[This post was co-written by Everett Phillips and Massimiliano Fatica.] The High Performance Conjugate Gradient Benchmark (HPCG) is a new benchmark intended to...

11 MIN READ

Apr 29, 2014

CUDA Pro Tip: Fast and Robust Computation of Givens Rotations

A Givens rotation [1] represents a rotation in a plane represented by a matrix of the form $latex G(i, j, \theta) = \begin{bmatrix} 1 & \cdots & 0...

3 MIN READ