NVIDIA Maxine

Reinventing Real-Time Video Communication with AI



What is NVIDIA Maxine?

NVIDIA Maxine is a suite of GPU-accelerated AI SDKs and cloud-native microservices for deploying AI features that enhance audio, video, and augmented reality effects in real time. Maxine’s state-of-the-art models create high-quality effects that can be achieved with standard microphone and camera equipment. Maxine can be deployed on premises, in the cloud, or at the edge.


Succeeding while working remotely, on the road, or in a customer service center requires increased presence, so video conferencing services and communications platforms must enable workers to be seen and heard clearly. Personal engagement increases when audio and video quality improves on these platforms, and shared eye contact during video calls helps improve interpersonal connection.


NVIDIA Maxine is part of NVIDIA AI Enterprise. NVIDIA AI Enterprise is an extensive library of full-stack software, including AI solution workflows, frameworks, pretrained models, and infrastructure optimization.

Available on PC, data center, and cloud.





What Are the Benefits of NVIDIA Maxine?

State-of-the-Art NVIDIA AI Capabilities

NVIDIA Maxine offers world-class pretrained models for developers to deploy premium augmented reality, audio, and video quality features.

Real-Time AI Performance

Maxine includes accelerated and optimized AI features for real-time inference on GPUs, resulting in low-latency audio, video, and AR effects with high network resilience.

Complete AI Pipeline

Maxine offers video decode, transcode, encode, conversational AI, computer vision, video streaming, and analytics to complete your AI pipeline.

Multi-Cloud, Customizable Deployment

Maxine’s cloud-native microservices allow for flexible, fast deployment and updates.

Access NVIDIA Maxine Microservices

Maxine’s cloud-native microservices allow developers to build real-time AI applications for high-quality audio and video communications. The microservices are ready-to-use, containerized packages of cloud applications built from Maxine algorithms. Each package contains an end-to-end application with its necessary dependencies, can be deployed on public and private clouds, and lets client applications use NVIDIA Maxine algorithms through cloud-based GPU computing. Microservices can be managed and deployed independently within an application, which shortens development time.


Audio Effects Microservice offers the following GPU-accelerated AI-based audio effects:

  • Speaker Focus
  • Noise Removal
  • Room Echo Removal
  • Audio Super Resolution
  • Acoustic Echo Cancellation

Video Effects Microservice offers the following GPU-accelerated AI-based video effects:

  • Virtual Background
  • Eye Contact

Live Portrait Microservice contains the Live Portrait feature, which animates a person's portrait photo through their live video feed by matching the head movement and facial expressions to the photo.


NVIDIA Maxine microservices early access is now available.





Discover the NVIDIA Maxine SDKs

Audio Effects SDK

The Audio Effects SDK delivers multi-effect, low-latency audio quality enhancement algorithms, improving end-to-end conversation quality for narrowband, wideband, and ultra-wideband audio.


High-performance, optimized AI models enable users to process thousands of audio streams per GPU in real time, enhancing audio quality by up to two mean opinion score (MOS) points in subjective and objective quality metrics, including Perceptual Evaluation of Speech Quality (PESQ) and Perceptual Objective Listening Quality Analysis (POLQA). In desktop applications, the optimized models allow the effects to run concurrently with other applications, such as games, with minimal impact on the quality of either.


Developers can integrate the Audio Effects SDK into standalone Windows and Linux applications to process microphone and speaker audio, or into high-density servers for processing thousands of audio streams per server.


Key features include:

  • [Updated] Audio Super Resolution: Improves audio quality by increasing the temporal resolution of the audio signal. It currently supports upsampling from 8 kHz to 16 kHz and from 16 kHz to 48 kHz. Now with enhanced quality and over 50% lower latency.
  • Acoustic Echo Cancellation: Cancels acoustic device echo from the input audio stream in real time, eliminating mismatched acoustic pairs and double-talk. The AI-based approach achieves more effective cancellation than traditional digital signal processing.
  • Noise Removal: Removes common background noise using state-of-the-art AI models, while preserving the speaker’s natural voice.
  • Room Echo Removal: Removes reverberations from audio using state-of-the-art AI models, restoring the clarity of a speaker’s voice.
  • [Updated] Speaker Focus: Separates the audio tracks of foreground and background speakers, making each voice more intelligible. Now in general availability.

Using these features, developers can also create innovative multi-effects by combining Noise Removal and Room Echo Removal, or Speaker Focus and Noise Removal, while delivering optimized, real-time performance.
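
As a rough illustration of what the two upsampling paths listed under Audio Super Resolution mean in practice, the following Python sketch computes how many samples one fixed-duration audio frame holds before and after each conversion. It does not call the Audio Effects SDK. The function name `upsampled_frame_length`, the `SUPPORTED_PATHS` table, and the 10 ms frame duration are assumptions chosen for illustration only.

```python
# Illustrative arithmetic only; this does not use the Audio Effects SDK.
# Upsampling paths as listed above: input rate (Hz) -> output rate (Hz).
SUPPORTED_PATHS = {8_000: 16_000, 16_000: 48_000}


def upsampled_frame_length(input_rate_hz: int, frame_ms: int = 10) -> tuple[int, int]:
    """Return (input_samples, output_samples) for one frame of audio.

    frame_ms is a hypothetical frame duration, not an SDK requirement.
    """
    if input_rate_hz not in SUPPORTED_PATHS:
        raise ValueError(f"no upsampling path listed for {input_rate_hz} Hz")
    output_rate_hz = SUPPORTED_PATHS[input_rate_hz]
    input_samples = input_rate_hz * frame_ms // 1000
    output_samples = output_rate_hz * frame_ms // 1000
    return input_samples, output_samples


if __name__ == "__main__":
    for rate, target in SUPPORTED_PATHS.items():
        n_in, n_out = upsampled_frame_length(rate)
        print(f"{rate} Hz -> {target} Hz: {n_in} -> {n_out} samples per frame")
```

Under these assumptions, the 8 kHz path doubles the number of samples per frame and the 16 kHz path triples it, which is the sense in which temporal resolution increases.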


Get started with the Audio Effects SDK  









Video Effects SDK

Maxine’s Video Effects SDK enables AI-based visual effects that run with standard webcam input and can be easily integrated into video conference pipelines. The underlying deep learning models are optimized with NVIDIA AI using NVIDIA TensorRT for high-performance inference, making it possible for developers to apply multiple effects in real-time applications.


Key features include:

  • [Updated] Virtual Background: Segments a person and applies AI-powered background removal, replacement, or blur. Now includes enhanced temporal stability. Updated with latency improvements.
  • Super Resolution: Generates a detail-enhanced video using neural networks that reduce artifacts and preserve texture, with up to 4X high-quality scaling.
  • Upscaler: Delivers high-throughput and up to 4X high-quality scaled video with an adjustable sharpening parameter.
  • Artifact Reduction: Reduces compression artifacts from encoded video while preserving original details.
  • Video Noise Removal: Removes low-light camera noise introduced in the video capture process while preserving details.
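
Background replacement of the kind described under Virtual Background is commonly done by blending the camera frame with a new background using a per-pixel foreground mask. The Python sketch below shows that generic blend, out = m * frame + (1 - m) * background, on tiny grayscale images stored as nested lists. It is not the Video Effects SDK's implementation or API; the function `composite` and the example mask values are hypothetical.

```python
# Generic alpha compositing sketch; not the Video Effects SDK API.
def composite(frame, background, mask):
    """Blend frame over background using a per-pixel mask in [0.0, 1.0].

    All three inputs are equally sized 2D lists of numbers.
    A mask value of 1.0 keeps the frame pixel (the person);
    a value of 0.0 shows the replacement background.
    """
    if not (len(frame) == len(background) == len(mask)):
        raise ValueError("inputs must have the same number of rows")
    out = []
    for f_row, b_row, m_row in zip(frame, background, mask):
        if not (len(f_row) == len(b_row) == len(m_row)):
            raise ValueError("inputs must have the same number of columns")
        out.append([m * f + (1.0 - m) * b for f, b, m in zip(f_row, b_row, m_row)])
    return out


if __name__ == "__main__":
    frame = [[200, 200], [200, 200]]    # camera pixels
    background = [[0, 0], [0, 0]]       # replacement background
    mask = [[1.0, 0.5], [0.0, 0.0]]     # hypothetical segmentation output
    print(composite(frame, background, mask))
```

Fractional mask values, as in the second pixel of the example, produce a soft edge between the person and the background instead of a hard cutout.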

Get started with the Video Effects SDK  

Augmented Reality SDK

The Augmented Reality SDK offers AI-powered, real-time 3D face tracking and body pose estimation based on a standard webcam feed. Developers can create unique AR effects, such as overlaying 3D content on a face, driving 3D characters, and enabling virtual interactions in real time.


Key features include:

  • [Updated] Face Expression Estimation: Tracks the face and infers the subject’s expression. Estimated blendshape coefficients are used to animate a properly rigged model to accurately mirror the subject’s expression. Updated with an enhanced AI model, a new six-degree-of-freedom (6-DOF) head pose, and a new face model with updated blendshapes and face area partitioning.
  • [Updated] Eye Contact: Simulates eye contact by estimating and aligning gaze with the camera. Updated with performance improvements via CUDA graph functionality.
  • Face Tracking: Detects human faces in images and videos and specifies location and size of the bounding box.
  • [Updated] Face Landmark Tracking: Recognizes facial features and contours using 126 key points. It also tracks head pose and facial deformation due to head movement and expression in three degrees of freedom in real time. Now includes a Quality mode for even higher-quality tracking.
  • [Updated] Face Mesh: Represents a human face as a 3D mesh with up to 3,000 vertices and six degrees of freedom. Now includes a 3D morphable model from the USC Institute for Creative Technologies.
  • Body Pose Estimation: Predicts and tracks 34 key points of the human body in 2D and 3D. Commonly used in activity recognition, motion transfer, and virtual interactions in real time.
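
The Face Expression Estimation entry mentions blendshape coefficients that animate a rigged model. In the standard linear blendshape formulation, each deformed vertex is the neutral vertex plus a weighted sum of per-shape offsets. The Python sketch below illustrates that formulation only. It is not the Augmented Reality SDK's API, and the names `apply_blendshapes`, `neutral`, `deltas`, and the two shape names are hypothetical.

```python
# Standard linear blendshape model; not the Augmented Reality SDK API.
def apply_blendshapes(neutral, deltas, coefficients):
    """Return deformed vertices: neutral + sum_i(coefficient_i * delta_i).

    neutral:      list of (x, y, z) vertices for the resting face
    deltas:       dict mapping shape name -> list of (dx, dy, dz) offsets
    coefficients: dict mapping shape name -> weight, typically in [0.0, 1.0]
    """
    result = [list(v) for v in neutral]
    for name, weight in coefficients.items():
        offsets = deltas[name]
        if len(offsets) != len(neutral):
            raise ValueError(f"shape {name!r} has the wrong vertex count")
        for vertex, offset in zip(result, offsets):
            for axis in range(3):
                vertex[axis] += weight * offset[axis]
    return [tuple(v) for v in result]


if __name__ == "__main__":
    neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    deltas = {
        "jaw_open": [(0.0, -1.0, 0.0), (0.0, -1.0, 0.0)],  # hypothetical shape
        "smile": [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0)],       # hypothetical shape
    }
    print(apply_blendshapes(neutral, deltas, {"jaw_open": 0.5, "smile": 1.0}))
```

In a real pipeline the coefficients would come from the expression estimator on every frame, and the deltas from the rigged model's blendshape targets.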

Get started with the Augmented Reality SDK  





Maxine Builds on Powerful NVIDIA AI SDKs

Explore technologies that integrate with Maxine’s modular, customizable, and scalable pipeline. To enable better communication and understanding, Maxine integrates NVIDIA Riva’s real-time translation and text-to-speech capabilities with its Live Portrait photo animation and Eye Contact features.

Partners

  • Avaya
  • Corsair
  • Elgato
  • Headroom
  • Logitech
  • OBS
  • Pexip
  • SoftBank
  • Tencent Cloud

NVIDIA Maxine Resources


NVIDIA Omniverse Avatar Cloud Engine

Omniverse ACE is a collection of cloud-based AI models and services for developers to easily build, customize, and deploy interactive avatars.



Avaya Delivers Enhanced Video Conferencing Experience From the Cloud with NVIDIA Maxine

Avaya’s new cloud media processing framework delivers high-quality real-time voice and video with minimal latency while supporting innovative AI algorithms provided by Maxine.


Reimagining Virtual Meetings With Speech AI and Deep Learning

With Maxine integrated into Pexip’s flexible, secure digital infrastructure, these advanced features are delivered at the server level, meaning all participants in the video meeting will have the same enhanced experience.


Want to help improve NVIDIA Broadcast App features? Check out our interactive crowdsource page.



Find more resources.

Discover NVIDIA AI technologies

Read about the latest developer software released at GTC 2022, including tools for conversational AI, inference, and more.


Watch the GTC 2022 keynote

Learn about the latest updates to NVIDIA Maxine from NVIDIA CEO Jensen Huang.


Read the latest Maxine news

Read how leading collaboration, content creation, and streaming providers are using NVIDIA Maxine.


NVIDIA Maxine on GitHub

NVIDIA Maxine source code is available on GitHub.


NVIDIA Maxine is free to download for members of the NVIDIA Developer Program.

