Today, NVIDIA is announcing the availability of cuTENSOR version 1.3.0. The software is available now as a free download for members of the NVIDIA Developer Program.
What’s New
- Support for up to 40-dimensional tensors
- Support for 64-bit strides
- Support for BFloat16 element-wise operations
- Improved performance for direct tensor contractions
- Bug fixes
See the cuTENSOR Release Notes for more information.
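As a rough illustration of why 64-bit strides matter, the sketch below computes element strides for a large, densely packed tensor and checks whether they still fit in a signed 32-bit integer. It assumes a layout in which the first mode varies fastest; the helper name `column_major_strides` and the example shape are hypothetical and not part of the cuTENSOR API.

```python
INT32_MAX = 2**31 - 1

def column_major_strides(extents):
    """Element strides for a densely packed tensor whose first mode is fastest-varying."""
    strides = []
    step = 1
    for extent in extents:
        strides.append(step)
        step *= extent
    return strides

# Hypothetical shape: a 3-mode tensor with 2**33 elements in total.
extents = [65536, 65536, 2]
strides = column_major_strides(extents)

# The stride of the last mode is 2**32, which no longer fits in 32 bits.
needs_64_bit = max(strides) > INT32_MAX
```

Here the last mode alone already has a stride of 2**32 elements, so a 32-bit stride type could not describe this tensor.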
About cuTENSOR
cuTENSOR is a high-performance CUDA library for tensor primitives; its key features are:
- Extensive mixed-precision support:
  - FP64 inputs with FP32 compute.
  - FP32 inputs with FP16, BF16, or TF32 compute.
  - Complex-times-real operations.
  - Conjugate (without transpose) support.
- Support for up to 40-dimensional tensors.
- Arbitrary data layouts.
- Trivially serializable data structures.
- Main computational routines:
  - Element-wise tensor operations:
    - Support for various activation functions.
    - Arbitrary tensor permutations.
    - Conversion between different data types.
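To make "arbitrary tensor permutations" over "arbitrary data layouts" more concrete, here is a minimal pure-Python reference of what a permutation computes on flat buffers. It is only a sketch under stated assumptions: both tensors are densely packed with the first listed mode varying fastest, and the function `permute` and the mode labels are hypothetical illustrations, not the cuTENSOR API or its implementation.

```python
from itertools import product

def permute(a, extent, modes_a, modes_b):
    """Reference permutation B[modes_b] = A[modes_a] on flat, densely packed buffers."""
    def strides(modes):
        out, step = {}, 1
        for m in modes:  # first listed mode is fastest-varying
            out[m] = step
            step *= extent[m]
        return out

    sa, sb = strides(modes_a), strides(modes_b)
    b = [None] * len(a)
    # Visit every multi-index once and copy the element to its new offset.
    for idx in product(*(range(extent[m]) for m in modes_a)):
        pos = dict(zip(modes_a, idx))
        off_a = sum(pos[m] * sa[m] for m in modes_a)
        off_b = sum(pos[m] * sb[m] for m in modes_a)
        b[off_b] = a[off_a]
    return b

extent = {"i": 2, "j": 3}
a = [0, 1, 2, 3, 4, 5]                          # A[i,j] stored at offset i + 2*j
b = permute(a, extent, ["i", "j"], ["j", "i"])  # B[j,i] stored at offset j + 3*i
# b == [0, 2, 4, 1, 3, 5]
```

Describing each tensor by its mode order, extents, and strides is what lets the same loop handle any layout. A GPU library would perform the copy in parallel rather than element by element.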
Learn more:
- GTC 2021: S31754 Recent Developments in NVIDIA Math Libraries
- GTC 2021: S31286 A Deep Dive into the Latest HPC Software
- GTC 2021: CWES1098 Tensor Core-Accelerated Math Libraries for Dense and Sparse Linear Algebra in AI and HPC
- cuTENSOR Product Documentation