Live media workflows are increasingly using AI microservices to augment production capabilities. However, advanced AI models are mostly hosted in the cloud, making it challenging to process high-bitrate, uncompressed media streams due to constraints around network latency, bandwidth, and real-time scalability.
NVIDIA has released new AI reference applications that simplify AI development. These applications can interface with uncompressed ST 2110 streams and apply real-time media effects with minimal latency.
AI reference applications
The latest AI reference applications available on Holoscan for Media offer a powerful starting point for building real-time AI solutions specifically tailored for live media workflows.
AI virtual cameras
A simple application built with PyTorch and the NVIDIA DeepStream SDK creates a virtual camera for each presenter in a video. After detecting and tracking the individuals in a high-resolution, uncompressed ST 2110 input stream, the reference app produces multiple cropped virtual camera outputs, each focused on a detected individual. With these AI-generated camera feeds, operators can create more dynamic production shots from a single static camera.
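The cropping step can be sketched as a pure function: given a detected presenter's bounding box, compute a padded crop window at a fixed output aspect ratio, clamped to the frame. This is an illustrative sketch, not the reference app's actual DeepStream pipeline; the function name, padding scheme, and parameters are hypothetical.

```python
def virtual_camera_crop(bbox, frame_w, frame_h, aspect=16 / 9, pad=0.25):
    """Compute a padded, aspect-correct crop window around one detection.

    bbox is (x, y, w, h) in pixels; returns (x, y, w, h) clamped to the frame.
    Hypothetical helper, not the reference app's implementation.
    """
    x, y, w, h = bbox
    cx, cy = x + w / 2, y + h / 2           # center on the detected presenter
    crop_h = min(frame_h, h * (1 + pad))    # add headroom around the subject
    crop_w = min(frame_w, crop_h * aspect)  # enforce the output aspect ratio
    crop_h = crop_w / aspect                # shrink height if width was clamped
    # Shift the window so it stays fully inside the frame.
    cx = min(max(cx, crop_w / 2), frame_w - crop_w / 2)
    cy = min(max(cy, crop_h / 2), frame_h - crop_h / 2)
    return (round(cx - crop_w / 2), round(cy - crop_h / 2),
            round(crop_w), round(crop_h))
```

Each virtual camera output is then just this window scanned out of the full-resolution frame, so one static UHD camera can feed several framed shots at once.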

Automatic speech recognition
This reference application performs real-time automatic speech recognition (ASR) on an ST 2110-30 audio source using the NVIDIA Riva Parakeet ASR NIM. A simple web user interface displays live captions of the incoming stream and provides a search field for finding words in the transcription. The front end is a starting point for developers to refine, customize, and extend.
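The transcript-search feature behind such a web UI can be approximated with a small buffer that accumulates timestamped captions and answers word queries. The sketch below is independent of the Riva ASR NIM itself; the class and method names are hypothetical, shown only to illustrate the search behavior.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptBuffer:
    """Accumulates timestamped captions and supports whole-word search."""

    segments: list = field(default_factory=list)  # (start_seconds, text) pairs

    def add(self, start_seconds: float, text: str) -> None:
        """Append one finalized ASR segment to the transcript."""
        self.segments.append((start_seconds, text))

    def search(self, word: str) -> list:
        """Return (timestamp, caption) pairs whose caption contains the word."""
        needle = word.lower()
        return [(t, s) for t, s in self.segments
                if needle in s.lower().split()]
```

In a real deployment, finalized segments from the ASR stream would be pushed into the buffer as they arrive, and the search field would query it on each keystroke.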

How to get started
Before you begin building with the AI reference applications, ensure you have the following prerequisites in place to streamline your development process and avoid common setup issues:
- An AI workstation with an NVIDIA RTX Pro GPU and an NVIDIA ConnectX network interface card (with loopback cable or switch connectivity) or a certified multi-GPU system.
- A functional NVIDIA Holoscan for Media environment using either a local developer setup with Kubernetes or the platform reference deployment guide with a jump node.
- Visual Studio Code or another IDE for Linux, along with a build toolchain such as the GNU Compiler Collection (GCC).
To install v25.4, refer to the developer guides available on the Holoscan for Media collection page.
To install the AI applications, follow the steps on the NGC AI Reference Applications resource page.
Additional updates
In addition to these AI applications, the Holoscan for Media 25.4 release improves monitoring for both production (OpenShift) and local developer (cloud-native stack) environments, adding application-specific Grafana dashboards for the SR-IOV network, PTP, and the NMOS registry.
Automation is also improved for single-node OpenShift installations and compact three-node clusters, with support for more networking variants and red/blue networking for ST 2022-7 redundancy. The release also simplifies the local developer setup (now supporting Ubuntu 24.04) and automates installation of reference applications such as the Helm dashboard, NMOS registry, NMOS controller, and media gateway.
Conclusion
Holoscan for Media has enabled container orchestration for multi-vendor live production since its launch over a year ago. The latest 25.4 release provides the first AI reference applications for developers, delivering on the promise of real-time AI for live media on software-defined infrastructure on premises.
Get started with Holoscan for Media.