NVIDIA Maxine


Reinventing Real-Time Video Communication with AI



What is NVIDIA Maxine?

NVIDIA Maxine is paving the way for real-time audio and video communications. Whether for a video conference, a call made to a customer service center, or a live stream, Maxine enables clear communications to enhance virtual interactions.

Succeeding while working remotely, on the road, or in a customer service center requires increased presence, so video conferencing services and communications platforms must enable workers to be seen and heard clearly. Personal engagement increases when audio and video quality improves, and shared eye contact during video calls helps improve interpersonal connection.

NVIDIA Maxine is a suite of GPU-accelerated AI SDKs and cloud-native microservices for deploying AI features that enhance audio, video, and augmented reality effects in real time. Maxine’s state-of-the-art models create high-quality effects using standard microphone and camera equipment. Maxine can be deployed on premises, in the cloud, or at the edge.

NVIDIA Maxine is part of the NVIDIA AI platform. NVIDIA AI consists of hundreds of SDKs that developers can use to build business solutions.

Available on PC, data center, and cloud.





What Are The Benefits of NVIDIA Maxine?

State-of-the-Art NVIDIA AI Capabilities

Maxine, built on the NVIDIA AI platform, offers world-class pretrained models for developers to deploy premium audio and video quality features.

Real-Time AI Performance

Maxine includes accelerated and optimized AI features for real-time inference on GPUs, resulting in low-latency audio, video, and AR effects with high network resilience.

Complete AI Pipeline

Maxine offers video decode, transcode, encode, conversational AI, computer vision, video streaming, and analytics to complete your AI pipeline.

Multi-Cloud, Customizable Deployment

Maxine’s cloud-native microservices allow for flexible, fast deployment and updates.

Access NVIDIA Maxine Microservices

Maxine’s cloud-native microservices allow developers to build real-time AI applications for high-quality audio and video communications. Microservices can be independently managed and deployed within the application, accelerating development time.

Early access to the NVIDIA Maxine Audio Effects microservice is available now. Features include:

  • Audio Super Resolution: Improves audio quality by increasing the temporal resolution of the audio signal. It currently supports upsampling from 8 kHz to 16 kHz and from 16 kHz to 48 kHz.
  • Acoustic Echo Cancellation: Cancels real-time acoustic device echo from the input audio stream, eliminating mismatched acoustic pairs and double-talk. AI-based technology achieves more effective cancellation than traditional digital signal processing.
  • Noise Removal: Removes common background noise using state-of-the-art AI models, while preserving the speaker’s natural voice.
  • Room Echo Removal: Removes reverberations from audio using state-of-the-art AI models, restoring the clarity of a speaker’s voice.
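
As a rough illustration of what the two supported Audio Super Resolution rate pairs mean for frame sizes, the sketch below checks a rate pair and computes the upsampled frame length. The function name and the 10 ms frame length are assumptions made for this example; this is not the Maxine API, and it does no actual audio enhancement.

```python
# Supported (input_rate_hz, output_rate_hz) pairs, taken from the feature list above.
SUPPORTED_RATES = {(8_000, 16_000), (16_000, 48_000)}


def upsampled_length(num_samples: int, in_rate: int, out_rate: int) -> int:
    """Return how many samples a frame contains after upsampling."""
    if (in_rate, out_rate) not in SUPPORTED_RATES:
        raise ValueError(f"unsupported rate pair: {in_rate} -> {out_rate}")
    # Both supported pairs are integer ratios (2x and 3x), so this is exact.
    return num_samples * out_rate // in_rate


# A 10 ms frame (the frame length is an assumption for illustration).
frame_8k = 8_000 * 10 // 1000    # 80 samples at 8 kHz
frame_16k = 16_000 * 10 // 1000  # 160 samples at 16 kHz

print(upsampled_length(frame_8k, 8_000, 16_000))    # prints 160
print(upsampled_length(frame_16k, 16_000, 48_000))  # prints 480
```

Going from 8 kHz to 48 kHz is not a listed pair, so the sketch rejects it rather than guessing whether the two steps can be chained.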

Discover the NVIDIA Maxine SDKs

Audio Effects SDK

The Audio Effects SDK delivers multi-effect, low-latency audio quality enhancement algorithms, improving end-to-end conversation quality for narrowband, wideband, and ultra-wideband audio.


High-performance, optimized AI models let users process thousands of audio streams per GPU in real time, enhancing audio quality by up to two mean-opinion-score (MOS) points in subjective and objective quality metrics, including Perceptual Evaluation of Speech Quality (PESQ) and Perceptual Objective Listening Quality Analysis (POLQA). In desktop applications, the optimized models allow multiple applications, such as games, to run concurrently with minimal impact on the quality of each.


Developers can integrate the Audio Effects SDK into standalone Windows and Linux applications to process microphone and speaker audio, or into high-density servers for processing thousands of audio streams per server.


Key features include:

  • [Updated] Audio Super Resolution: Improves audio quality by increasing the temporal resolution of the audio signal. It currently supports upsampling from 8 kHz to 16 kHz and from 16 kHz to 48 kHz. Now with enhanced quality.
  • Acoustic Echo Cancellation: Cancels real-time acoustic device echo from the input audio stream, eliminating mismatched acoustic pairs and double-talk. With AI-based technology, more effective cancellation is achieved than with traditional digital signal processing.
  • Noise Removal: Removes common background noise using state-of-the-art AI models, while preserving the speaker’s natural voice.
  • Room Echo Removal: Removes reverberations from audio using state-of-the-art AI models, restoring the clarity of a speaker’s voice.
  • [New] Speaker Focus: Separates the audio tracks of foreground and background speakers, making each voice more intelligible. (Early access only)

Using these features, developers can also create multi-effects, for example by combining Noise Removal and Room Echo Removal, while still delivering optimized, real-time performance.
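
Combining effects amounts to running each audio frame through several stages in order. The following is a minimal sketch of that idea using toy stand-in functions. The names `chain`, `toy_noise_gate`, and `toy_gain` are hypothetical and are not part of the Audio Effects SDK, whose effects run AI models rather than the simple rules shown here.

```python
from typing import Callable, List, Sequence

Frame = List[float]
Effect = Callable[[Frame], Frame]


def chain(effects: Sequence[Effect]) -> Effect:
    """Combine several per-frame effects into one, applied left to right."""
    def run(frame: Frame) -> Frame:
        for effect in effects:
            frame = effect(frame)
        return frame
    return run


# Hypothetical placeholders standing in for real effects.
def toy_noise_gate(frame: Frame) -> Frame:
    # Zero out samples whose magnitude is below a fixed threshold.
    return [s if abs(s) >= 0.05 else 0.0 for s in frame]


def toy_gain(frame: Frame) -> Frame:
    # Double each sample and clamp the result to [-1.0, 1.0].
    return [min(1.0, max(-1.0, s * 2.0)) for s in frame]


multi_effect = chain([toy_noise_gate, toy_gain])
print(multi_effect([0.01, 0.25, -0.6]))  # prints [0.0, 0.5, -1.0]
```

Keeping each stage as an independent callable makes it easy to reorder or drop effects per stream.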


Get started with the Audio Effects SDK  









Video Effects SDK

Maxine’s Video Effects SDK enables AI-based visual effects that run with standard webcam input and can be easily integrated into video conference pipelines. The underlying deep learning models are optimized with NVIDIA AI using NVIDIA TensorRT for high-performance inference, making it possible for developers to apply multiple effects in real-time applications.


Key features include:

  • Super Resolution: Generates a detail-enhanced video using neural networks that reduce artifacts and preserve texture with up to 4X high-quality scaling.
  • Upscaler: Delivers high-throughput and up to 4X high-quality scaled video with an adjustable sharpening parameter.
  • Artifact Reduction: Reduces compression artifacts from encoded video while preserving original details.
  • Video Noise Removal: Removes low-light camera noise introduced in the video capture process while preserving details.
  • [Updated] Virtual Background: Segments a person and applies AI-powered background removal, replacement, or blur. Now includes enhanced temporal stability.
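
To make the Virtual Background idea concrete: once a segmentation step has produced a per-pixel mask for the person, replacement reduces to alpha blending the camera frame over the new background. The sketch below shows only that blending step on tiny nested lists. The function name and data layout are assumptions for illustration, and the mask here is supplied by hand, not produced by the SDK.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def composite(frame: List[List[Pixel]],
              background: List[List[Pixel]],
              mask: List[List[float]]) -> List[List[Pixel]]:
    """Blend frame over background; mask is 1.0 on the person, 0.0 elsewhere."""
    out = []
    for frame_row, bg_row, mask_row in zip(frame, background, mask):
        row = []
        for fg, bg, alpha in zip(frame_row, bg_row, mask_row):
            # Standard alpha blend, applied per color channel.
            row.append(tuple(round(alpha * f + (1.0 - alpha) * b)
                             for f, b in zip(fg, bg)))
        out.append(row)
    return out


# A 1x2 "image": the left pixel is the person, the right pixel is background.
frame = [[(200, 100, 50), (10, 10, 10)]]
new_background = [[(0, 0, 255), (0, 0, 255)]]
mask = [[1.0, 0.0]]

print(composite(frame, new_background, mask))
# prints [[(200, 100, 50), (0, 0, 255)]]
```

Fractional mask values along the person's outline give soft edges instead of a hard cutout.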

Get started with the Video Effects SDK  

Augmented Reality SDK

The Augmented Reality SDK offers AI-powered, real-time 3D face tracking and body pose estimation based on a standard webcam feed. Developers can create unique AR effects, such as overlaying 3D content on a face, to drive 3D characters and virtual interactions in real time.


Key features include:

  • Face Tracking: Detects human faces in images and videos and reports the location and size of each bounding box.
  • [Updated] Face Landmark Tracking: Recognizes facial features and contours using 126 key points. It also tracks head pose and facial deformation due to head movement and expression in three degrees of freedom in real time. Now includes a Quality mode for even higher-quality tracking.
  • [Updated] Face Mesh: Represents a human face as a 3D mesh with up to 3,000 vertices and six degrees of freedom. Now includes a 3D morphable model from the USC Institute for Creative Technologies.
  • [Updated] Body Pose Estimation: Predicts and tracks 34 key points of the human body in 2D and 3D. Commonly used in activity recognition, motion transfer, and virtual interactions in real time.
  • [New] Face Expression Estimation: Tracks the face and infers the subject’s expression. Estimated blendshape coefficients are used to animate a properly rigged model to accurately mirror the subject’s expression.
  • [New] Eye Contact: Simulates eye contact by estimating and aligning gaze with the camera.
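
For readers unfamiliar with blendshape coefficients, the sketch below shows the standard linear blendshape formulation: each vertex of a rigged mesh is its neutral position plus a weighted sum of per-shape offsets. This is a generic illustration, not the SDK's implementation. The function name, the two-vertex mesh, and the shape names are all made up for the example.

```python
from typing import List, Sequence, Tuple

Vertex = Tuple[float, float, float]


def apply_blendshapes(neutral: Sequence[Vertex],
                      deltas: Sequence[Sequence[Vertex]],
                      weights: Sequence[float]) -> List[Vertex]:
    """Return neutral + sum_i weights[i] * deltas[i], vertex by vertex."""
    if len(deltas) != len(weights):
        raise ValueError("need exactly one weight per blendshape")
    result = [list(v) for v in neutral]
    for shape, weight in zip(deltas, weights):
        for vertex, offset in zip(result, shape):
            for axis in range(3):
                vertex[axis] += weight * offset[axis]
    return [tuple(v) for v in result]


# Hypothetical two-vertex mesh with two blendshapes.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
jaw_open = [(0.0, -1.0, 0.0), (0.0, 0.0, 0.0)]  # moves vertex 0 down
smile = [(0.0, 0.0, 0.0), (0.0, 0.5, 0.0)]      # moves vertex 1 up

# Estimated coefficients: jaw half open, full smile.
print(apply_blendshapes(neutral, [jaw_open, smile], [0.5, 1.0]))
# prints [(0.0, -0.5, 0.0), (1.0, 0.5, 0.0)]
```

In this formulation the coefficients are the only per-frame data needed, so any model rigged with matching blendshapes can mirror the estimated expression.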

Get started with the Augmented Reality SDK  





Maxine Builds on Powerful NVIDIA AI SDKs

Explore technologies that integrate with Maxine’s modular, customizable, and scalable pipeline. To enable better communication and understanding, Maxine integrates NVIDIA Riva’s real-time translation and text-to-speech capabilities with Maxine’s photo animation “live portrait” and eye contact features. Maxine is also a reference application for Omniverse ACE, a technology platform for generating interactive AI avatars.

Partners

  • Avaya
  • Corsair
  • Elgato
  • Headroom
  • Logitech
  • OBS
  • Pexip
  • SoftBank
  • Tencent Cloud


NVIDIA Omniverse Avatar Cloud Engine

Omniverse ACE is a collection of cloud-based AI models and services for developers to easily build, customize, and deploy interactive avatars.


GPU-Accelerated Video Encode and Decode

The Video Codec SDK is a comprehensive set of APIs, including high-performance tools, samples, and documentation for hardware-accelerated video encode and decode on Windows and Linux.


Speech AI

NVIDIA Riva, part of the NVIDIA AI platform, is a GPU-accelerated SDK for building speech AI applications that deliver real-time performance on GPUs.



Find more resources.

Discover NVIDIA AI technologies

Read about the latest developer software released at GTC 2022, including tools for conversational AI, inference, and more.

Watch the GTC 2022 keynote

Learn about the latest updates to NVIDIA Maxine from NVIDIA CEO Jensen Huang.


Read the latest Maxine news

Read how leading collaboration, content creation, and streaming providers are using NVIDIA Maxine.



NVIDIA Maxine is free to download for members of the NVIDIA Developer Program.
