
NVIDIA Maxine Elevates Video Conferencing in the Cloud


Real-time remote communication has become the new normal, yet many office workers still experience poor video and audio quality, which hurts collaboration and interpersonal engagement. NVIDIA Maxine was developed specifically to address these challenges, using state-of-the-art AI models that greatly improve the clarity of video conferencing calls. These capabilities have been widely demonstrated at recent NVIDIA GTC events.

Now, NVIDIA Maxine has expanded to provide microservices that can be deployed in private or public clouds, enabling developers to leverage GPU power from remote servers. This post covers the recent feature updates, as well as details on microservices and the NVIDIA Maxine Thin Client software that can efficiently leverage these services from any Windows-based PC.

NVIDIA Maxine technology suite

NVIDIA Maxine is a suite of pretrained AI models built to improve the video conferencing experience. Developers can now experience, develop with, and deploy NVIDIA Maxine models.

  • Experience: Download and install the free NVIDIA Broadcast app to test the latest NVIDIA Maxine features on a PC with an NVIDIA RTX GPU.
  • Develop: Use NVIDIA Maxine SDKs to integrate AI features of choice into your software.
  • Deploy: Leverage NVIDIA Maxine microservices in cloud deployments to offload AI inference to GPU-powered nodes. Add the AI inference to your existing server architecture or deploy it in your dedicated video conferencing appliance.

NVIDIA Maxine SDKs

NVIDIA Maxine includes three different SDKs focused on key aspects of high-quality video conferencing experiences:

  • Audio Effects SDK to enhance audio with AI
    • Background noise removal, acoustic echo cancellation, room echo removal, audio super-resolution, and speaker focus 
  • Video Effects SDK to enhance video quality
    • Video super-resolution and upscaler, artifact reduction, video noise removal, and virtual background 
  • Augmented Reality SDK to augment your calls with interactions
    • Face mesh, face tracking and face landmark tracking, body pose estimation, face expression estimation, and eye contact

Maxine SDKs are available for download now on NGC. Audio Speaker Focus and Eye Contact are currently available only in early access. NVIDIA is working with partners to improve these features before making them available to a wider audience. Register for the NVIDIA Maxine SDK Early Access Program and reach out to your NVIDIA contact to accelerate access.

NVIDIA Maxine cloud-native microservices

NVIDIA is accelerating its effort to provide cloud-native microservices that enable disaggregated computing in the cloud, so that workloads can scale out beyond a single GPU and resources can be managed more efficiently. NVIDIA Maxine microservices can be integrated with your existing software and deployed in Kubernetes clusters with GPUs in the cloud. This also simplifies deployment on cloud infrastructure and lets companies leverage Maxine in private or public clouds.

Three types of microservices are provided:

  • Audio Effects microservice: Includes background noise removal, room echo removal, acoustic echo cancellation, and audio super-resolution (available since mid-2022)
  • Video Effects microservice: Includes virtual background and eye contact (available since early 2023)
  • Live Portrait microservice: For animating a picture from your webcam feed input (recently available)

“We’re on the way using NVIDIA Maxine for audio background noise removal and NVIDIA Riva speech-to-speech microservices to support video conferencing experiences between our new plant in Arizona and TSMC HQ in Taiwan using NVIDIA Maxine Thin Client and Microsoft Teams. NVIDIA cloud-ready microservices are definitely building the future of automated scale-out AI services to ensure best remote collaboration in enterprises,” TSMC Arizona said.

All NVIDIA microservices are compliant with the NVIDIA Unified Compute Framework (UCF), so you can easily connect and chain several of them together into a multi-feature pipeline. Other microservices made available by NVIDIA, such as the NVIDIA Riva speech-to-speech microservice, can also be chained and added to the mix. UCF comes with dedicated tools to handle custom integrations, including dependencies and connections between components.
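
To make the idea of chaining concrete, here is a minimal conceptual sketch in Python. It does not use UCF or any Maxine API: the stage names, the dictionary-based "frame," and the chain helper are all hypothetical, chosen only to illustrate how independent processing stages can be composed into one pipeline.

```python
from functools import reduce
from typing import Callable, Dict, List

# Hypothetical stand-in for a media frame passed between stages.
Frame = Dict[str, list]
Stage = Callable[[Frame], Frame]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only records that it ran."""
    def stage(frame: Frame) -> Frame:
        return {**frame, "applied": frame["applied"] + [name]}
    return stage


def chain(stages: List[Stage]) -> Stage:
    """Compose stages so the output of one feeds the next, in order."""
    def pipeline(frame: Frame) -> Frame:
        return reduce(lambda acc, stage: stage(acc), stages, frame)
    return pipeline


# Hypothetical stage names, loosely modeled on the features listed above.
pipeline = chain([
    make_stage("background_noise_removal"),
    make_stage("virtual_background"),
    make_stage("eye_contact"),
])

result = pipeline({"applied": []})
print(result["applied"])
```

In a real deployment, each stage would be a separate microservice and the connections would be declared with UCF tools rather than written as function calls.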

Presently, all NVIDIA Maxine microservices are available only in early access, while NVIDIA collects feedback from a few partners. If you are interested in testing these microservices, register for the NVIDIA Maxine Microservices Early Access Program and reach out to your NVIDIA contact to accelerate access.

NVIDIA Maxine cloud reference application and Thin Client

The Maxine cloud reference application is a real-time media processing service for streaming clients that combines multiple NVIDIA microservices. It can be hosted in private or public clouds and used as a reference for developing custom software. A Helm chart for the Maxine cloud reference application can be generated using UCF tools. The application also comes with NVIDIA components for authentication, logging, and metrics, which can be replaced by open-source solutions, if desired.

Figure 1. NVIDIA Maxine cloud reference application for streaming clients

Using custom metrics such as the number of active sessions or GPU utilization, the Maxine cloud reference application can scale seamlessly through the following:

  • Kubernetes Horizontal Pod Autoscaling that automatically updates workload resources to match demands on the Kubernetes clusters.
  • Amazon EKS Autoscaling that automatically provisions virtual machines (VMs) in the Amazon Cloud environment. Additional VMs are added when pods fail to start due to insufficient resources, and are removed when nodes are underutilized.
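
As an illustration of the first mechanism, the sketch below reproduces the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance (0.1 by default). The function name, the session numbers, and the replica bounds are assumptions made for this example, not values taken from the Maxine reference application.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,   # average metric value per pod, e.g. active sessions
    target_metric: float,    # target average value per pod
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,  # Kubernetes default tolerance
) -> int:
    """Approximate the HPA scaling rule for a single metric."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)

    # Clamp to the configured replica bounds.
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical example: 4 pods averaging 15 active sessions each,
# with a target of 10 sessions per pod.
print(desired_replicas(4, 15, 10))
```

The real autoscaler also handles missing metrics, pods that are not yet ready, and stabilization windows, all of which this sketch leaves out.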

A thin-client application provides easy access to inference in the cloud. This lightweight software intercepts signals to and from physical devices (microphone, speaker, and webcam) so that they can be processed remotely by the Maxine cloud reference application.

Thin Client maps the physical endpoints to virtual devices (using virtual audio and video drivers) and makes them available in any video conferencing application of choice. Both the Maxine cloud reference application and Thin Client are available on request through the NVIDIA Maxine Microservices Early Access Program.

Figure 2. NVIDIA Maxine Thin Client deployed on local user system

Summary

Beyond new features like Eye Contact, which is also available in the NVIDIA Maxine SDKs, Maxine microservices offer a new standard way to develop cloud-ready applications at scale. Enterprises can also use the same technology on private clouds. Register for the NVIDIA Maxine Microservices Early Access Program and reach out to your NVIDIA contact to accelerate access.

To learn more about NVIDIA Maxine, join the NVIDIA Maxine sessions at NVIDIA GTC 2023.
