Posts by Rakib Hasan
Agentic AI / Generative AI
Nov 10, 2025
How to Achieve 4x Faster Inference for Math Problem Solving
Large language models can solve challenging math problems. However, making them work efficiently at scale requires more than a strong checkpoint. You need the...
7 MIN READ
Agentic AI / Generative AI
Dec 18, 2024
NVIDIA TensorRT-LLM Now Supports Recurrent Drafting for Optimizing LLM Inference
Recurrent drafting (referred to as ReDrafter) is a novel speculative decoding technique developed and open-sourced by Apple for large language model (LLM)...
6 MIN READ