Sharath Sreenivas

Sharath Sreenivas is a senior deep learning engineer at NVIDIA and is interested in the development and optimization of learning algorithms. He received his M.Sc. in computer science with a focus on machine learning from the University of California, Santa Cruz.

Posts by Sharath Sreenivas

Generative AI

Mistral-NeMo-Minitron 8B Model Delivers Unparalleled Accuracy

This post was originally published August 21, 2024 but has been revised with current data. Recently, NVIDIA and Mistral AI unveiled Mistral NeMo 12B, a leading... 7 MIN READ
Generative AI

How to Prune and Distill Llama-3.1 8B to an NVIDIA Llama-3.1-Minitron 4B Model

Large language models (LLMs) are now a dominant force in natural language processing and understanding, thanks to their effectiveness and versatility. LLMs such... 12 MIN READ
Conversational AI

Pretraining BERT with Layer-wise Adaptive Learning Rates

Training with larger batches is a straightforward way to scale training of deep neural networks to larger numbers of accelerators and reduce the training time.... 10 MIN READ