Pradeep Ramani

Pradeep Ramani is a senior deep learning architect at NVIDIA, where he designs abstractions for speed-of-light linear algebra computations on GPUs. Pradeep has over 14 years of experience across multiple layers of the GPU stack, including hardware design, architecture, programming models, and library design (CUTLASS). He received his M.Sc. in electrical and computer engineering from the University of California, Santa Barbara.
Posts by Pradeep Ramani

Generative AI

OpenAI Triton on NVIDIA Blackwell Boosts AI Performance and Programmability

Matrix multiplication and attention mechanisms are the computational backbone of modern AI workloads. While libraries like NVIDIA cuDNN provide highly optimized...

5 MIN READ