Posts by Matthew Nicely
News
Nov 29, 2021
Programming Distributed Multi-GPU Tensor Operations with cuTENSOR v1.4
The NVIDIA cuTENSOR library v1.4 supports 64-dimensional tensors and distributed multi-GPU tensor operations, and improves tensor contraction performance models.
2 MIN READ
News
Nov 23, 2021
Implementing High Performance Matrix Multiplication Using CUTLASS v2.8
High performance CUTLASS template abstractions support matrix multiply operations (GEMM), Convolution AI, and improved Strided-DGrad.
2 MIN READ
News
Nov 15, 2021
Accelerating ReLu and GeLu Activation Functions, and Batched Sparse GEMM in cuSPARSELt v0.2.0
NVIDIA cuSPARSELt v0.2 now supports ReLu and GeLu activation functions, a bias vector, and batched Sparse GEMM.
2 MIN READ
News
Nov 15, 2021
Using Fully Redesigned Batch API and Performance Optimizations in nvCOMP v2.1.0
The new nvCOMP v2.1.0 library features a redesigned batch API and performance optimizations.
2 MIN READ
News
May 10, 2021
cuSOLVERMp v0.0.1 Now Available: Through Early Access
cuSOLVERMp provides a distributed-memory multi-node and multi-GPU solution for solving systems of linear equations at scale! In the future, it will also solve eigenvalue and singular value problems.
< 1 MIN READ
News
Apr 30, 2021
nvCOMP v2.0.0 Now Available: With New Compressors
nvCOMP is a CUDA library that features generic compression interfaces to enable developers to use high-performance GPU compressors in their applications.
< 1 MIN READ