Shubham Agrawal

Shubham Agrawal is an AI developer technology engineer at NVIDIA, where he works on the Metropolis team. He focuses on bringing generative AI-based solutions to industry using vision language models (VLMs). His previous research concentrated on computer vision in the medical domain. He holds an M.Sc. in computer science from Columbia University and a B.Sc. in information technology from NITK Surathkal.

Posts by Shubham Agrawal

Computer Vision / Video Analytics

Build Real-Time Multimodal XR Apps with NVIDIA AI Blueprint for Video Search and Summarization

With the recent advancements in generative AI and vision foundational models, VLMs present a new wave of visual computing wherein the models are capable of... 9 MIN READ
Computer Vision / Video Analytics

Vision Language Model Prompt Engineering Guide for Image and Video Understanding

Vision language models (VLMs) are evolving at a breakneck speed. In 2020, the first VLMs revolutionized the generative AI landscape by bringing visual... 12 MIN READ
Generative AI

Build an Agentic Video Workflow with Video Search and Summarization

Building a question-answering chatbot with large language models (LLMs) is now a common workflow for text-based interactions. What about creating an AI system... 11 MIN READ
Computer Vision / Video Analytics

New Foundational Models and Training Capabilities with NVIDIA TAO 5.5

NVIDIA TAO is a framework designed to simplify and accelerate the development and deployment of AI models. It enables you to use pretrained models, fine-tune... 13 MIN READ