GTC Silicon Valley-2019: Fast Convolutions Via the Overlap-and-Save Method Using Shared Memory FFT
Session ID: S9352
Karel Adamek (Department of Engineering Sciences, University of Oxford)
We will present optimizations that increase the performance of overlap-and-save calculations of linear convolution using a shared memory FFT. The overlap-and-save method is used when a long signal must be convolved with many filters. We'll explain how we implemented a custom FFT that uses shared memory to eliminate most of the device memory transfers normally required when calculating a convolution. We'll also show how this yields significant performance gains for certain problem sizes.
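For readers unfamiliar with the method, the following is a minimal CPU reference sketch of overlap-and-save in plain Python. It is not the speaker's shared memory CUDA implementation; it only illustrates the segmentation that such an implementation would have to perform. The names `fft`, `ifft`, `overlap_save`, `direct_convolution` and the parameter `seg_len` are hypothetical and chosen for illustration, and the radix-2 FFT assumes a power-of-two segment length.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(spectrum):
    """Inverse FFT, normalised by the transform length."""
    n = len(spectrum)
    return [v / n for v in fft(spectrum, inverse=True)]


def overlap_save(signal, filt, seg_len=8):
    """Linear convolution of a long signal with a short filter.

    Each segment of seg_len samples overlaps the previous one by
    len(filt) - 1 samples; the first len(filt) - 1 outputs of every
    circular convolution are aliased and are discarded.
    """
    m = len(filt)
    step = seg_len - (m - 1)  # valid output samples per segment
    assert seg_len & (seg_len - 1) == 0, "seg_len must be a power of two"
    assert step > 0, "seg_len must exceed len(filt) - 1"

    # The filter spectrum is computed once and reused for every segment.
    filt_spectrum = fft(list(filt) + [0.0] * (seg_len - m))

    out_len = len(signal) + m - 1
    n_seg = -(-out_len // step)  # ceiling division
    # Prepend m - 1 zeros, then zero-pad the tail so the last segment is full.
    padded = [0.0] * (m - 1) + list(signal)
    padded += [0.0] * ((n_seg - 1) * step + seg_len - len(padded))

    out = []
    for s in range(n_seg):
        segment = padded[s * step : s * step + seg_len]
        seg_spectrum = fft(segment)
        product = [a * b for a, b in zip(seg_spectrum, filt_spectrum)]
        y = ifft(product)
        out.extend(v.real for v in y[m - 1 :])  # keep only the valid part
    return out[:out_len]


def direct_convolution(signal, filt):
    """Straightforward O(N*M) linear convolution, used as a reference."""
    out = [0.0] * (len(signal) + len(filt) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(filt):
            out[i + j] += s * h
    return out
```

Calling `overlap_save(signal, filt, seg_len=8)` should agree with `direct_convolution(signal, filt)` up to floating-point rounding. In this sketch every segment's forward FFT, pointwise multiplication, and inverse FFT are separate passes over ordinary memory; the talk's point is that on a GPU these steps can be kept in shared memory so that intermediate results do not travel through device memory.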