Rakib Hasan

Rakib Hasan is a senior AI developer technology engineer at NVIDIA, specializing in optimizing deep learning workloads, including large language model (LLM) inference. He contributed to TensorRT-LLM by adding support for Llama models and implementing features like RoPE scaling and speculative decoding. Rakib earned his PhD from Louisiana State University (LSU), where his research focused on optimizing mathematical libraries on x64 and ARM CPUs.

Posts by Rakib Hasan

Generative AI

NVIDIA TensorRT-LLM Now Supports Recurrent Drafting for Optimizing LLM Inference

Recurrent drafting (referred to as ReDrafter) is a novel speculative decoding technique developed and open-sourced by Apple for large language model (LLM)... 6 MIN READ