Pro Tip: cuBLAS Strided Batched Matrix Multiply

Research, Algorithms & Numerical Techniques, CUDA, Education & Training, Machine Learning & Artificial Intelligence

Nadeem Mohammad, posted Feb 28 2017

There’s a new computational workhorse in town. For decades, general matrix-matrix multiply, known as GEMM in Basic Linear Algebra Subprograms (BLAS) libraries, has been a standard benchmark for computational performance.
