Posts by Leopold Cambier
Models / Libraries / Frameworks
Sep 02, 2025
Improving GEMM Kernel Auto-Tuning Efficiency on NVIDIA GPUs with Heuristics and CUTLASS 4.2
Selecting the best possible General Matrix Multiplication (GEMM) kernel for a specific problem and hardware is a significant challenge. The performance of a...
8 MIN READ
Models / Libraries / Frameworks
Dec 14, 2024
Introducing Tile-Based Programming in Warp 1.5.0
With the latest release of Warp 1.5.0, developers now have access to new tile-based programming primitives in Python. Leveraging cuBLASDx and cuFFTDx, these new...
14 MIN READ
Simulation / Modeling / Design
Jan 27, 2022
Multinode Multi-GPU: Using NVIDIA cuFFTMp FFTs at Scale
Today, NVIDIA announces the release of cuFFTMp for Early Access (EA). cuFFTMp is a multi-node, multi-process extension to cuFFT that enables scientists and...
10 MIN READ