NVIDIA MAXINE

Cloud-AI Video-Streaming Platform






Video conferencing features powered by NVIDIA Maxine and NVIDIA Tensor Core GPUs.


What is Maxine?


NVIDIA Maxine is a fully accelerated platform SDK for developers of video conferencing services to build and deploy AI-powered features that use state-of-the-art models in their cloud. Video conferencing applications based on Maxine can use AI video compression to cut video bandwidth to one-tenth of what H.264 requires, dramatically reducing costs.

Maxine includes APIs for the latest innovations from NVIDIA research, such as face alignment, gaze correction, face re-lighting, and real-time translation, in addition to capabilities such as super-resolution, noise removal, closed captioning, and virtual assistants. These capabilities are fully accelerated on NVIDIA GPUs to run in real-time video streaming applications in the cloud.

Maxine-based applications let service providers offer the same features to every user on any device, including computers, tablets, and phones. Applications built with Maxine can easily be deployed as microservices that scale to hundreds of thousands of streams in a Kubernetes environment.


NVIDIA Maxine Features


Easy-to-Use SDK

Includes libraries, tools, and example pipelines for developers to quickly add AI features to their applications.

Ultra-Low Bandwidth

AI video compression uses one-tenth the bandwidth of the H.264 video compression standard.

State-of-the-Art AI Models

Includes pre-trained models with thousands of hours of training on NVIDIA DGX™ A100.

Fully GPU-Accelerated

Optimizes end-to-end pipelines for the highest performance on NVIDIA Tensor Core GPUs.



Key Technologies

Face Re-animation


New AI research makes it possible to identify key facial points of each person on a video call, then combine those points with a still image to reanimate the person’s face on the other side of the call using generative adversarial networks (GANs).

These key points can be used for face alignment, where faces are rotated so that people appear to be facing each other during a call, as well as gaze correction to help simulate eye contact, even if a person’s camera isn’t aligned with their screen.

Developers can also add features that allow call participants to choose their own avatars that are realistically animated in real time by their voice and emotional tone.
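
To get a feel for why sending key facial points is so much lighter than sending pixels, the sketch below packs one frame's worth of keypoints into bytes and estimates the resulting bitrate. It is only an illustration under stated assumptions: the keypoint count (68), the float16 encoding, the 30 fps frame rate, and the helper names pack_keypoints and unpack_keypoints are all hypothetical and do not describe Maxine's actual format or API.

```python
import struct

# Assumptions for illustration only: 68 two-dimensional keypoints per face
# and 30 frames per second. Maxine's real keypoint count, encoding, and
# frame rate are not stated in this text.
NUM_KEYPOINTS = 68
FPS = 30


def pack_keypoints(points):
    """Pack (x, y) pairs, normalized to [0, 1], as little-endian float16."""
    flat = [coord for point in points for coord in point]
    return struct.pack(f"<{len(flat)}e", *flat)


def unpack_keypoints(payload):
    """Inverse of pack_keypoints: bytes back to a list of (x, y) tuples."""
    count = len(payload) // 2  # each float16 value occupies 2 bytes
    flat = struct.unpack(f"<{count}e", payload)
    return list(zip(flat[0::2], flat[1::2]))


# Hypothetical keypoints spread along a diagonal, standing in for a detector.
points = [(i / NUM_KEYPOINTS, i / NUM_KEYPOINTS) for i in range(NUM_KEYPOINTS)]
payload = pack_keypoints(points)

bytes_per_frame = len(payload)
kbps = bytes_per_frame * 8 * FPS / 1000

print(f"{bytes_per_frame} bytes per frame, about {kbps:.1f} kbit/s at {FPS} fps")
```

A real system would also need to send the still image at least once and would likely carry more than bare 2D coordinates, so this figure is a rough lower bound on the idea rather than a measurement.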


Figure 1: Face alignment using generative adversarial networks (GANs).


Video & Audio Effects



Figure 2: AI-powered audio and video effects such as super resolution with NVIDIA Maxine.

AI-based super-resolution and artifact reduction can convert lower-resolution video to higher resolution in real time. This lowers bandwidth requirements for video conference providers and improves the call experience for users with limited bandwidth. Developers can also add features that filter out common background noise and frame the camera on a user’s face for a more personal and engaging conversation.

Additional AI models can help remove the noise introduced by low-light conditions, creating a more appealing picture.
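
The bandwidth benefit of super-resolution comes from transmitting fewer pixels and upscaling on arrival. The short sketch below compares pixel counts for two resolutions. The resolutions are illustrative assumptions (the text does not say which ones Maxine targets), and pixel count is only a rough proxy for bitrate, since codecs do not scale linearly with resolution.

```python
def pixel_ratio(low, high):
    """Return how many times more pixels `high` has than `low`.

    Each argument is a (width, height) tuple in pixels.
    """
    low_w, low_h = low
    high_w, high_h = high
    return (high_w * high_h) / (low_w * low_h)


# Illustrative resolutions, chosen only for the arithmetic.
sent = (640, 360)        # resolution transmitted over the network
displayed = (1280, 720)  # resolution after upscaling on the receiving side

ratio = pixel_ratio(sent, displayed)
print(f"Displayed frame has {ratio:.0f}x the pixels of the transmitted frame")
```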


Conversational AI


Maxine-based applications can use NVIDIA Jarvis, a fully accelerated conversational AI framework with state-of-the-art models optimized for real-time performance. Using Jarvis, developers can integrate virtual assistants to take notes, set action items, and answer questions in human-like voices.

Additional conversational AI services such as translations, closed captioning and transcriptions help ensure everyone can understand what’s being discussed on the call.


Figure 3: Real-time conversational AI services with NVIDIA Jarvis.


Reduce Video Bandwidth vs H.264



Figure 4: Transfer only keypoints over the internet slashing bandwidth versus H.264 using AI Video Compression.

With AI-based video compression technology running on NVIDIA GPUs, developers can reduce bandwidth use to one-tenth of what the H.264 video compression standard requires.

This cuts costs for providers and delivers a smoother video conferencing experience for end users, who can enjoy more AI-powered services while streaming less data on their computers, tablets, and phones.
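
As a rough back-of-the-envelope illustration of the one-tenth figure, the sketch below compares aggregate bandwidth for a provider before and after the reduction. Only the factor of ten comes from the text. The per-stream H.264 bitrate and the number of concurrent streams are assumed values, and the function names are hypothetical.

```python
REDUCTION_FACTOR = 10  # "one-tenth of H.264", as stated in the text


def ai_bitrate_kbps(h264_kbps, reduction_factor=REDUCTION_FACTOR):
    """Per-stream bitrate after applying the claimed reduction."""
    return h264_kbps / reduction_factor


def aggregate_gbps(per_stream_kbps, streams):
    """Total bandwidth across all streams, in Gbit/s."""
    return per_stream_kbps * streams / 1_000_000


# Assumed inputs, for illustration only.
h264_kbps = 1500   # hypothetical H.264 bitrate for one video stream
streams = 100_000  # hypothetical number of concurrent streams

before = aggregate_gbps(h264_kbps, streams)
after = aggregate_gbps(ai_bitrate_kbps(h264_kbps), streams)

print(f"H.264: {before:.1f} Gbit/s, AI video compression: {after:.1f} Gbit/s")
```

Swapping in a provider's own bitrate and stream count gives a first estimate of the potential savings. Actual results would depend on content, resolution, and how the compression behaves in practice.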



Apply for exclusive news, updates, and early access to NVIDIA Maxine.

