Benjamin Chislett

Benjamin Chislett is a senior software engineer at NVIDIA and a maintainer of the vLLM inference engine. He works on speculative decoding algorithms and performance optimization for LLM inference.

Posts by Benjamin Chislett

Data Center / Cloud

Boost Inference Performance up to 15x on NVIDIA Blackwell Using DFlash Speculative Decoding

As AI systems move from single-turn interactions to coordinated multiagent workflows, low-latency inference becomes increasingly important. Autoregressive LLMs...

7 MIN READ