Aditya Vavre

Aditya Vavre is a deep learning algorithms engineer at NVIDIA, where he focuses on advancing efficient large-scale language model training and architecture design. His past work includes 4-bit and 8-bit LLM pretraining, quantization-aware training and distillation, and sparse attention mechanisms, enabling more efficient long-context and large-scale transformer models. Prior to NVIDIA, he contributed to research and development in NLP and AI applications during his time as a Research Engineer at Sony, building retrieval-based dialogue systems and text-to-video generation pipelines. Aditya holds a master’s degree in Computer Science from The University of Texas at Austin and a bachelor’s degree from IIT Bombay. His interests lie at the intersection of scalable deep learning systems, model efficiency, and next-generation foundation model architectures.

Posts by Aditya Vavre

Agentic AI / Generative AI

Using NVFP4 Low-Precision Model Training for Higher Throughput Without Losing Accuracy

As the sizes of AI models and datasets continue to increase, relying only on higher-precision BF16 training is no longer sufficient. Key challenges such as...
8 MIN READ