Live media workflows are increasingly using AI microservices to augment production capabilities. However, advanced AI models are mostly hosted in the cloud, making it challenging to process high-bitrate, uncompressed media streams due to constraints around network latency, bandwidth, and real-time scalability.
NVIDIA has released new AI reference applications that simplify AI development. These applications can interface with uncompressed ST 2110 streams and apply real-time media effects with minimal latency.
AI reference applications
The latest AI reference applications available on Holoscan for Media offer a powerful starting point for building real-time AI solutions specifically tailored for live media workflows.
AI virtual cameras
A simple application built with PyTorch and the NVIDIA DeepStream SDK creates a virtual camera for each presenter in a video. After detecting and tracking the individuals in a high-resolution, uncompressed ST 2110 input stream, the reference app produces multiple cropped virtual camera outputs, each focused on a detected individual. With these AI-generated camera feeds, operators can create more dynamic production shots from a single static camera.
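The cropping step can be sketched as a pure function: given a detected presenter's bounding box, compute a padded crop window at a fixed output aspect ratio, clamped to the frame. This is an illustrative sketch, not the reference app's actual DeepStream pipeline; the function name, padding scheme, and parameters are hypothetical.

```python
def virtual_camera_crop(bbox, frame_w, frame_h, aspect=16 / 9, pad=0.25):
    """Compute a padded, aspect-correct crop window around one detection.

    bbox is (x, y, w, h) in pixels; returns (x, y, w, h) clamped to the frame.
    Hypothetical helper, not the reference app's implementation.
    """
    x, y, w, h = bbox
    cx, cy = x + w / 2, y + h / 2           # center on the detected presenter
    crop_h = min(frame_h, h * (1 + pad))    # add headroom around the subject
    crop_w = min(frame_w, crop_h * aspect)  # enforce the output aspect ratio
    crop_h = crop_w / aspect                # shrink height if width was clamped
    # Shift the window so it stays fully inside the frame.
    cx = min(max(cx, crop_w / 2), frame_w - crop_w / 2)
    cy = min(max(cy, crop_h / 2), frame_h - crop_h / 2)
    return (round(cx - crop_w / 2), round(cy - crop_h / 2),
            round(crop_w), round(crop_h))
```

Each virtual camera output is then just this window scanned out of the full-resolution frame, so one static UHD camera can feed several framed shots at once.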

Automatic speech recognition
This reference application performs real-time automatic speech recognition (ASR) on an ST 2110-30 audio source using the NVIDIA Riva Parakeet ASR NIM. A simple web user interface displays live captions of the incoming stream and provides a search field for finding words in the transcription. The front end is a starting point for developers to refine, customize, and extend.
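The transcript-search feature behind such a web UI can be approximated with a small buffer that accumulates timestamped captions and answers word queries. The sketch below is independent of the Riva ASR NIM itself; the class and method names are hypothetical, shown only to illustrate the search behavior.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptBuffer:
    """Accumulates timestamped captions and supports whole-word search."""

    segments: list = field(default_factory=list)  # (start_seconds, text) pairs

    def add(self, start_seconds: float, text: str) -> None:
        """Append one finalized ASR segment to the transcript."""
        self.segments.append((start_seconds, text))

    def search(self, word: str) -> list:
        """Return (timestamp, caption) pairs whose caption contains the word."""
        needle = word.lower()
        return [(t, s) for t, s in self.segments
                if needle in s.lower().split()]
```

In a real deployment, finalized segments from the ASR stream would be pushed into the buffer as they arrive, and the search field would query it on each keystroke.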

How to get started
Before you begin building with the AI reference applications, ensure you have the following prerequisites in place to streamline your development process and avoid common setup issues:
- An AI workstation with an NVIDIA RTX Pro GPU and an NVIDIA ConnectX network interface card (with loopback cable or switch connectivity) or a certified multi-GPU system.
- A functional NVIDIA Holoscan for Media environment using either a local developer setup with Kubernetes or the platform reference deployment guide with a jump node.
- Visual Studio Code or another IDE for Linux, along with a build toolchain such as the GNU Compiler Collection (GCC).
To install v25.4, refer to the developer guides available on the Holoscan for Media collection page.
To install the AI applications, follow the steps on the NGC AI Reference Applications resource page.
Additional updates
In addition to these AI applications, the Holoscan for Media 25.4 release improves monitoring for both production (OpenShift) and local developer (cloud-native stack) environments, adding application-specific Grafana dashboards for the SR-IOV network, PTP, and the NMOS registry.
Automation is also improved for single-node OpenShift installations and compact three-node clusters, with support for more networking variants and red/blue networking for ST 2022-7 redundancy. The release also simplifies the local developer setup (now supporting Ubuntu 24.04) and automates installation of reference applications such as the Helm dashboard, NMOS registry, NMOS controller, and media gateway.
Conclusion
Holoscan for Media has enabled container orchestration for multi-vendor live production since its launch over a year ago. The latest 25.4 release provides the first AI reference applications for developers, delivering on the promise of real-time AI for live media on software-defined infrastructure on premises.
Get started with Holoscan for Media.