CUDA 10.1 is now available for download. This version includes a new lightweight GEMM library, new functionality and performance updates to existing libraries, and improvements to the CUDA Graphs API.
With CUDA 10.1, you get:
- cuBLASLt, a new lightweight GEMM library with a flexible API and Tensor Core support for INT8 inputs and FP16 CGEMM split-complex matrix multiplication
- New selective eigensolvers SYEVDX and SYGVDX in cuSOLVER, and performance improvements of up to 1.5X for full-spectrum eigensolvers
- New encoding and batched decoding functionality in nvJPEG
- Up to 4X faster performance for a broad set of random number generators in cuRAND
- Improved performance and support for fork/join kernels in the CUDA Graphs API
Additionally, CUDA 10.1 includes bug fixes, support for new operating systems, and updates to the Nsight Systems and Nsight Compute developer tools.
Since CUDA 10, NVIDIA and Microsoft have worked closely to ensure a smooth experience for CUDA developers on Windows. CUDA 10.1 adds host compiler support for the latest versions of Microsoft Visual Studio 2017 and 2019 (Previews for RTW, and future updates). A full list of supported compilers is available in the system requirements documentation.