Michael Iovine

Michael Iovine is a senior software engineer at NVIDIA. He currently works on inference optimization for TensorRT-LLM and leads the development of the framework’s speculative decoding module. He holds a bachelor’s degree in Computer Science from the California Institute of Technology.

Posts by Michael Iovine

Data Center / Cloud

Boost Inference Performance up to 15x on NVIDIA Blackwell Using DFlash Speculative Decoding

As AI systems move from single-turn interactions to coordinated multiagent workflows, low-latency inference becomes increasingly important. Autoregressive LLMs...

7 MIN READ