Shubham Agrawal

Shubham Agrawal is an AI developer technology engineer at NVIDIA, where he works on the Metropolis team. He focuses on bringing generative AI-based solutions to industry using vision language models (VLMs). His previous research concentrated on computer vision in the medical domain. He holds an M.Sc. in computer science from Columbia University and a B.Sc. in information technology from NITK Surathkal.

Posts by Shubham Agrawal

Computer Vision / Video Analytics Jul 15, 2026

Build a Multi-Camera 3D Tracking Application with NVIDIA DeepStream 9.1 Skills

Developers building video analytics applications across large spaces must track the same object as it moves between camera views. Single-camera 2D tracking... 12 MIN READ

Computer Vision / Video Analytics Sep 25, 2025

How to Integrate Computer Vision Pipelines with Generative AI and Reasoning

Generative AI is opening new possibilities for analyzing existing video streams. Video analytics are evolving from counting objects to turning raw video content... 14 MIN READ

Computer Vision / Video Analytics May 18, 2025

Advance Video Analytics AI Agents Using the NVIDIA AI Blueprint for Video Search and Summarization

Vision language models (VLMs) have transformed video analytics by enabling broader perception and richer contextual understanding compared to traditional... 15 MIN READ

Computer Vision / Video Analytics Mar 11, 2025

Build Real-Time Multimodal XR Apps with NVIDIA AI Blueprint for Video Search and Summarization

With the recent advancements in generative AI and vision foundational models, VLMs present a new wave of visual computing wherein the models are capable of... 9 MIN READ

Computer Vision / Video Analytics Feb 26, 2025

Vision Language Model Prompt Engineering Guide for Image and Video Understanding

Vision language models (VLMs) are evolving at a breakneck speed. In 2020, the first VLMs revolutionized the generative AI landscape by bringing visual... 12 MIN READ

Agentic AI / Generative AI Dec 03, 2024

Build an Agentic Video Workflow with Video Search and Summarization

Building a question-answering chatbot with large language models (LLMs) is now a common workflow for text-based interactions. What about creating an AI system... 11 MIN READ