Utkarsh Uppal

Utkarsh Uppal is a senior applied deep learning solutions architect at NVIDIA, where he specializes in building high-performance deep learning pipelines across domains like language and speech. His primary focus is on developing end-to-end conversational AI systems, including training LLMs from scratch, particularly for Indic languages, and building domain-specific models with enterprises. He also has deep expertise in designing and optimizing inference architectures for production, with a focus on low-precision formats (FP4, FP8), decoding strategies, and KV-cache optimizations.

Posts by Utkarsh Uppal

Generative AI

Faster Training Throughput in FP8 Precision with NVIDIA NeMo

In previous posts on FP8 training, we explored the fundamentals of FP8 precision and took a deep dive into the various scaling recipes for practical large-scale... 12 MIN READ
Generative AI

Per-Tensor and Per-Block Scaling Strategies for Effective FP8 Training

In this blog post, we’ll break down the main FP8 scaling strategies—per-tensor scaling, delayed and current scaling, and per-block scaling (including the... 10 MIN READ
Generative AI

Floating-Point 8: An Introduction to Efficient, Lower-Precision AI Training

With the growth of large language models (LLMs), deep learning is advancing both model architecture design and computational efficiency. Mixed precision... 11 MIN READ