NVIDIA cuBLAS introduces cuBLASDx APIs, device-side API extensions for performing BLAS calculations inside your CUDA kernel. Fusing numerical operations into a single kernel reduces latency and improves the performance of your application.
Refer to the cuBLASDx documentation for hardware and software requirements.
By downloading and using the software, you agree to fully comply with the terms and conditions of the NVIDIA Software License Agreement.
Download cuBLASDx TAR
TAR local installer instructions (x86):
$ wget https://developer.download.nvidia.com/compute/cublasdx/redist/cublasdx/nvidia-cublasdx-24.01.0.tar.gz
Download cuBLASDx ZIP
ZIP local installer instructions (x86):
$ wget https://developer.download.nvidia.com/compute/cublasdx/redist/cublasdx/nvidia-cublasdx-24.01.0.zip
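Either download leaves an archive that still has to be unpacked. The sketch below is one possible way to do that and is not part of the official instructions: the helper name `unpack_cublasdx` and the destination directory are hypothetical, and it assumes `tar` and `unzip` are available on the system.

```shell
#!/bin/sh
# Hypothetical helper (not from the NVIDIA instructions): unpack a
# downloaded cuBLASDx archive into a directory of your choice.
unpack_cublasdx() {
    archive="$1"   # e.g. nvidia-cublasdx-24.01.0.tar.gz or the .zip
    dest="$2"      # assumed destination; any writable directory works

    mkdir -p "$dest" || return 1

    # Pick the extraction tool from the file extension.
    case "$archive" in
        *.tar.gz) tar -xzf "$archive" -C "$dest" ;;
        *.zip)    unzip -q "$archive" -d "$dest" ;;
        *)        echo "unsupported archive: $archive" >&2; return 1 ;;
    esac
}

# Example call (run after one of the wget commands above):
# unpack_cublasdx nvidia-cublasdx-24.01.0.tar.gz "$HOME/cublasdx"
```

After unpacking, list the destination directory and check it against the cuBLASDx documentation to find the include path to pass to your compiler. The layout inside the archive is not described here, so no specific paths are assumed.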