Keval Morabia

Keval Morabia is a senior deep learning engineer on the NVIDIA TensorRT Model Optimizer team, where he focuses on algorithms for optimizing LLMs. More specifically, Keval works on optimization techniques such as pruning, neural architecture search, and knowledge distillation, which have delivered significant speedups in past MLPerf Inference submissions. Keval joined NVIDIA through the acquisition of OmniML Inc., where he was an early ML engineer. He received his master's degree in Computer Science from the University of Illinois at Urbana-Champaign and his bachelor's degree in Computer Science from BITS Pilani, India.

Posts by Keval Morabia

Generative AI / LLMs

NVIDIA TensorRT Model Optimizer v0.15 Boosts Inference Performance and Expands Model Support

NVIDIA has announced the latest v0.15 release of NVIDIA TensorRT Model Optimizer, a state-of-the-art quantization toolkit of model optimization techniques... 5 MIN READ