Fast Fourier Transforms for NVIDIA GPUs


The cuFFT library provides GPU-accelerated fast Fourier transform (FFT) implementations that run up to 10x faster than CPU-only alternatives. cuFFT is used to build commercial and research applications across disciplines such as deep learning, computer vision, computational physics, molecular dynamics, quantum chemistry, and seismic and medical imaging. Applications that use cuFFT automatically benefit from regular performance improvements and from support for new GPU architectures. The cuFFT library is included in both the NVIDIA HPC SDK and the CUDA Toolkit.

cuFFT Device Extensions

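cuFFT device extensions (cuFFTDx) allow applications to inline FFTs into user kernels. Because the FFT runs inside the application's own kernel, it can be fused with surrounding operations, avoiding extra kernel launches and round trips through global memory. This can substantially improve performance over the cuFFT host APIs. cuFFTDx is currently part of the Math Libraries Device Extensions (MathDx).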

cuFFT Performance

The cuFFT library is highly optimized for performance on NVIDIA GPUs. Note that the second chart compares performance across 16 Volta GV100 GPUs with performance across eight newer GA100 GPUs based on the NVIDIA Ampere architecture.

cuFFT Key Features

  • 1D, 2D, and 3D transforms of complex and real data types
  • Support for up to 16-GPU systems
  • Multi-GPU support for complex-to-complex (C2C), real-to-complex (R2C), and complex-to-real (C2R) transforms
  • Familiar API similar to the FFTW Advanced Interface
  • Flexible data layouts allowing arbitrary strides between individual elements and array dimensions
  • Streamed asynchronous execution
  • Half-, single-, and double-precision transforms
  • Batch execution
  • In-place and out-of-place transforms
  • Thread-safe and callable from multiple host threads