Shubham Agrawal

Shubham Agrawal is an AI developer technology engineer at NVIDIA, where he works on the Metropolis team. He focuses on bringing generative AI-based solutions to industry using vision language models (VLMs). His previous research concentrated on computer vision in the medical domain. He holds an M.Sc. in computer science from Columbia University and a B.Sc. in information technology from NITK Surathkal.

Posts by Shubham Agrawal

Computer Vision / Video Analytics

Advance Video Analytics AI Agents Using the NVIDIA AI Blueprint for Video Search and Summarization

Vision language models (VLMs) have transformed video analytics by enabling broader perception and richer contextual understanding compared to traditional... 15 MIN READ
Computer Vision / Video Analytics

Build Real-Time Multimodal XR Apps with NVIDIA AI Blueprint for Video Search and Summarization

With the recent advancements in generative AI and vision foundational models, VLMs present a new wave of visual computing wherein the models are capable of... 9 MIN READ
Computer Vision / Video Analytics

Vision Language Model Prompt Engineering Guide for Image and Video Understanding

Vision language models (VLMs) are evolving at a breakneck speed. In 2020, the first VLMs revolutionized the generative AI landscape by bringing visual... 12 MIN READ
Generative AI

Build an Agentic Video Workflow with Video Search and Summarization

Building a question-answering chatbot with large language models (LLMs) is now a common workflow for text-based interactions. What about creating an AI system... 11 MIN READ
Computer Vision / Video Analytics

New Foundational Models and Training Capabilities with NVIDIA TAO 5.5

NVIDIA TAO is a framework designed to simplify and accelerate the development and deployment of AI models. It enables you to use pretrained models, fine-tune... 13 MIN READ