Tensor Linear Algebra on NVIDIA GPUs


cuTENSOR is a GPU-accelerated tensor linear algebra library, the first of its kind, providing tensor contraction, reduction, and elementwise operations. It is used to accelerate applications in deep learning training and inference, computer vision, quantum chemistry, and computational physics. Applications built on cuTENSOR automatically benefit from regular performance improvements and from support for new GPU architectures as the library is updated.

cuTENSOR Performance

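The cuTENSOR library is highly optimized for performance on NVIDIA GPUs. The newest version adds support for DMMA (double-precision matrix multiply-accumulate on Tensor Cores) and TF32 (TensorFloat-32, a reduced-precision format for single-precision workloads).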

cuTENSOR Key Features

  • Tensor contraction, reduction and elementwise operations
  • Mixed precision support
  • Expressive API allowing elementwise operation fusion
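
To make the three operation classes above concrete, here is a minimal reference sketch in plain Python. It only illustrates what each operation computes; it runs on the CPU with nested lists and makes no cuTENSOR calls. The function names (contract, reduce_last_mode, fused_elementwise) and the chosen index layouts are hypothetical, picked for illustration, and say nothing about how the cuTENSOR API itself is shaped.

```python
# Reference semantics only: plain Python loops, no GPU and no cuTENSOR calls.
# All function names here are hypothetical and chosen for illustration.

def contract(A, B):
    """Tensor contraction: C[m][n] = sum over k, l of A[m][k][l] * B[k][l][n].

    The modes k and l appear in both inputs and are summed away.
    """
    M, K, L = len(A), len(A[0]), len(A[0][0])
    N = len(B[0][0])
    return [[sum(A[m][k][l] * B[k][l][n]
                 for k in range(K) for l in range(L))
             for n in range(N)]
            for m in range(M)]


def reduce_last_mode(A):
    """Tensor reduction: R[m][k] = sum over l of A[m][k][l]."""
    return [[sum(fiber) for fiber in matrix] for matrix in A]


def fused_elementwise(alpha, A, beta, B):
    """Fused elementwise operation: D[i][j] = alpha * A[i][j] + beta * B[j][i].

    Scaling, the transpose of B, and the addition all happen in one pass
    over the data, which is the idea behind elementwise operation fusion.
    """
    I, J = len(A), len(A[0])
    return [[alpha * A[i][j] + beta * B[j][i] for j in range(J)]
            for i in range(I)]


# A has modes (m, k, l) with extents (2, 2, 2).
A = [[[1, 2], [3, 4]],
     [[5, 6], [7, 8]]]
# B has modes (k, l, n) with extents (2, 2, 1).
B = [[[1], [0]],
     [[0], [1]]]

C = contract(A, B)              # [[5], [13]]
R = reduce_last_mode(A)         # [[3, 7], [11, 15]]
D = fused_elementwise(2, [[1, 2], [3, 4]],
                      1, [[10, 20], [30, 40]])  # [[12, 34], [26, 48]]
```

A GPU library earns its keep on exactly these patterns: the loops are independent across output elements, and fusing the scale, permute, and add steps avoids writing intermediate tensors to memory.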