Saurav Muralidharan

Saurav Muralidharan is a senior research scientist at NVIDIA Research, working on the Deep Learning Efficiency Research (DLER) team. Saurav’s work focuses on improving the runtime performance and efficiency of deep neural networks, especially large language models (LLMs), using model compression techniques such as sparsity, low-rank factorization, and distillation, as well as neural architecture search (NAS).

Posts by Saurav Muralidharan

Generative AI / LLMs

Mistral-NeMo-Minitron 8B Foundation Model Delivers Unparalleled Accuracy

Last month, NVIDIA and Mistral AI unveiled Mistral NeMo 12B, a leading state-of-the-art large language model (LLM). Mistral NeMo 12B consistently outperforms... 5 MIN READ
Generative AI / LLMs

How to Prune and Distill Llama-3.1 8B to an NVIDIA Llama-3.1-Minitron 4B Model

Large language models (LLMs) are now a dominant force in natural language processing and understanding, thanks to their effectiveness and versatility. LLMs such... 12 MIN READ