NVIDIA MAXINE

Accelerated SDK with state-of-the-art AI features for building virtual collaboration and content creation applications.


Get Started




What Is NVIDIA Maxine?


All of the NVIDIA Broadcast Engine SDKs Are Now Included in NVIDIA Maxine

NVIDIA Maxine™ is a GPU-accelerated SDK with state-of-the-art AI features for developers to build virtual collaboration and content creation applications such as video conferencing and live streaming.

Maxine’s AI SDKs—Video Effects, Audio Effects, and Augmented Reality (AR)—are highly optimized and include modular features that can be chained into end-to-end pipelines to deliver the highest performance possible on GPUs, both on PCs and in data centers. Maxine can also be used with NVIDIA Riva, an SDK for building conversational AI applications, to offer world-class language-based capabilities such as transcription and translation.

Developers can add Maxine AI effects into their existing applications or develop new pipelines from scratch using NVIDIA DeepStream, an SDK for building intelligent video analytics, and NVIDIA Video Codec, an SDK for accelerated encode, decode, and transcode.



Benefits


State-of-the-Art AI Capabilities

World-class, pre-trained models for high-quality audio, video, and AR capabilities.

Real-Time AI Performance

Accelerated and optimized AI features for real-time AI inference on GPUs.

End-to-End Solution

Complete workflow with capabilities for video decode, transcode, encode, conversational AI, computer vision, video streaming, and analytics.


Touchcast utilizes state-of-the-art rendering and AI technologies for running beautiful online events with stunning life-like virtual venues and real-time collaboration capabilities. As the leader in powering the next era of computing, NVIDIA Maxine is paving the future of video communications—a future where AI and neural networks enhance and enrich content in entirely new ways. By working with NVIDIA, Touchcast can continue to be at the forefront of building the world’s most incredible experiences for its clients.


Edo Segal, Founder and CEO, Touchcast

SoftBank Corp. is committed to providing the best communication experience possible and Maxine dramatically improves communication clarity and quality. With capabilities such as audio background noise removal and video super resolution, our users see and hear each other more clearly, making their communications more efficient and effective.

Ryuji Wakikawa, Vice President, Head of Advanced Technology Division, SoftBank Corp.

Pexip has always pushed the boundaries of video communications with its distributed, virtualized conferencing platform. We're exploring how NVIDIA Maxine capabilities like audio noise removal and virtual background can support premium video conferencing experiences for enterprises of all sizes. Together with NVIDIA, we look forward to providing the next generation of AI-powered video communications—creating virtual meetings that are better than meetings in person.

Giles Chamberlin, CTO and Co-founder, Pexip

We believe real-time AI can take the work out of video conferencing so that people can meet without distractions. NVIDIA Maxine is the first platform that supports those real-time AI video conferencing features. Maxine allows our users to communicate more consistently and effectively, focusing on the content of the discussion instead of the distractions.



Julian Green, CEO, Headroom

The exciting noise cancellation performance of the Maxine Audio SDK has proven to be easy to use and incredibly powerful. We envision using Maxine to allow our customers to have clear and intelligible conversations in situations never thought possible before.



John Chow, Product Manager, CounterPath

By processing our video streams with Maxine in the cloud, we are able to give our customers advanced abilities, without them having to invest in expensive equipment. According to our users, the quality of Maxine's video output, enhanced with AI features, is the best in the entire market. Working with the Maxine SDK allowed us to create state-of-the-art solutions for our customers, in record time.



Tzafrir Rehan, CTO, Be.Live

Maxine gives our users access to state-of-the-art, real-time, AI-driven body tracking and background removal. They can track and mask performers in a live performance setting, which in turn enables a whole world of creative use cases—and all just using a standard camera feed, eliminating the challenges of special hardware tracking solutions, which is a real game-changer. The integration of the Maxine SDK was very easy and took just a few days to complete.



Matt Swoboda, Founder and Director, Notch

NVIDIA Maxine's AI-powered features let us enhance the production quality of our game streamers, starting with dynamic and intelligent noise removal for microphones to ensure clear speech during broadcasts. We also plan on integrating other features such as video denoising and upscaling as well as background removal without a green screen in the near future.



Miguel Molina, Technical Product Manager, Gamecaster



Maxine SDKs


Video Effects SDK

Maxine’s Video Effects SDK enables AI-based visual effects that run with standard webcam input and can easily be integrated into video conference and content creation pipelines. The underlying deep learning models are optimized using NVIDIA® TensorRT™ for high-performance inference, making it possible for developers to apply multiple effects in real-time applications.

Key features include:

  • Super resolution: AI network that enhances details, sharpens output, and preserves textural detail. Delivers up to 4X scaling.
  • Artifact reduction: Removes compression artifacts from encoded video while preserving original details.
  • Video noise removal: Removes low-light camera noise introduced in the video capture process while preserving details.
  • Virtual background: Segments a person and applies AI-powered background removal, replacement, or blur.
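The virtual background feature above comes down to two steps: a segmentation model produces a per-pixel person mask, and the mask is then used to blend the camera frame with a replacement background. The following is a minimal sketch of the blending step only, assuming grayscale images stored as nested lists and a mask that some segmentation model has already produced. The names `composite`, `Image`, and `Pixel` are hypothetical and are not part of the Video Effects SDK.

```python
from typing import List

Pixel = float  # grayscale intensity in [0.0, 1.0]; a real frame would have 3 channels
Image = List[List[Pixel]]


def composite(frame: Image, background: Image, mask: Image) -> Image:
    """Blend a camera frame over a background using a per-pixel person mask.

    mask[y][x] is 1.0 where the (assumed, not shown) segmentation model
    believes the pixel belongs to the person and 0.0 where it is background.
    Fractional values produce soft edges around the person.
    """
    if not (len(frame) == len(background) == len(mask)):
        raise ValueError("frame, background, and mask must have the same height")
    out: Image = []
    for f_row, b_row, m_row in zip(frame, background, mask):
        if not (len(f_row) == len(b_row) == len(m_row)):
            raise ValueError("frame, background, and mask rows must have the same width")
        # Standard alpha blend: person pixels come from the frame,
        # everything else comes from the replacement background.
        out.append([m * f + (1.0 - m) * b for f, b, m in zip(f_row, b_row, m_row)])
    return out


if __name__ == "__main__":
    frame = [[0.8, 0.8], [0.8, 0.8]]
    background = [[0.2, 0.2], [0.2, 0.2]]
    mask = [[1.0, 0.0], [0.5, 0.0]]
    print(composite(frame, background, mask))
```

Background blur would follow the same pattern, with `background` replaced by a blurred copy of the frame itself.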

Get started with the Video Effects SDK











Augmented Reality SDK

The Augmented Reality SDK offers AI-based, real-time 3D face tracking and body pose estimation based on a standard web camera feed. Developers can create unique AR effects, such as overlaying 3D content on a face and driving 3D characters and virtual interactions in real time.

Key features include:

  • Face tracking: Detects human faces in images and videos and specifies location and size of the bounding box.
  • Face landmark tracking: Recognizes facial features and contours using 126 key points and tracks head pose and facial deformation due to head movement and expression in three degrees of freedom in real time.
  • Face mesh: 3D mesh representation of a human face with up to 3,000 vertices and six degrees of freedom.
  • Body pose estimation: Predicts and tracks 34 key points of the human body in 2D and 3D. Commonly used in activity recognition, motion transfer, and virtual interactions in real time.
  • Eye contact (coming soon): Simulates eye contact by estimating and aligning gaze with the camera.
  • Audio2Face (coming soon): Animates a 2D or 3D digital face with high fidelity based on just an audio input.
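The features above describe tracking results in two forms: a bounding box with a location and a size, and sets of key points (126 for face landmarks, 34 for body pose). As a rough illustration of how the two relate, the sketch below derives an axis-aligned bounding box from a set of 2D key points. The `BoundingBox` type and `bounding_box` function are hypothetical and do not reflect the Augmented Reality SDK's own data structures.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box: top-left corner plus size (hypothetical layout)."""
    x: float
    y: float
    width: float
    height: float


def bounding_box(points: Iterable[Point]) -> BoundingBox:
    """Return the smallest axis-aligned box that contains every key point."""
    pts = list(points)
    if not pts:
        raise ValueError("at least one key point is required")
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return BoundingBox(min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))


if __name__ == "__main__":
    # Three made-up key points standing in for a tracker's output.
    print(bounding_box([(120.0, 80.0), (200.0, 60.0), (160.0, 150.0)]))
```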

Get started with the Augmented Reality SDK



Audio Effects SDK

The Audio Effects SDK delivers AI-based audio quality enhancement algorithms, improving end-to-end conversation quality for narrowband, wideband, and ultra-wideband audio.

High-performance, optimized AI models allow thousands of audio streams to be processed in real time per GPU, enhancing the audio quality by up to two mean-opinion-score (MOS) points in subjective and objective quality metrics such as Perceptual Evaluation of Speech Quality (PESQ) and Perceptual Objective Listening Quality Analysis (POLQA). In desktop applications, the optimized models allow the effects to run concurrently with other applications, such as games, with minimal impact on the quality of either.

Developers can integrate the SDK into standalone Windows and Linux applications to process microphone and speaker audio, or into high-density servers that process thousands of audio streams per server.

Key features include:

  • Noise removal (NR): Removes several common background noises using state-of-the-art AI models while preserving the speaker’s natural voice.
  • Room echo removal (REC): Removes reverberations from audio using state-of-the-art AI models, restoring clarity of a speaker’s voice.

Using these features, developers can also create innovative multi-effects by combining NR and REC while delivering optimized performance and real-time latency.
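Combining effects such as NR and REC amounts to running each audio frame through a sequence of processing stages. The sketch below shows one way to express that composition, assuming each effect is a function from a frame of samples to a frame of samples. The `chain` helper and the two toy effects (`noise_gate`, `gain`) are hypothetical stand-ins; they are not AI models and are not the Audio Effects SDK's API.

```python
from functools import reduce
from typing import Callable, List

Frame = List[float]               # one frame of mono audio samples in [-1.0, 1.0]
Effect = Callable[[Frame], Frame]


def chain(*effects: Effect) -> Effect:
    """Compose effects so that each frame passes through them left to right."""
    def run(frame: Frame) -> Frame:
        return reduce(lambda acc, effect: effect(acc), effects, frame)
    return run


def noise_gate(threshold: float) -> Effect:
    """Toy stand-in for a noise removal stage: silence very quiet samples."""
    def apply(frame: Frame) -> Frame:
        return [s if abs(s) >= threshold else 0.0 for s in frame]
    return apply


def gain(factor: float) -> Effect:
    """Toy stand-in for a second stage: scale every sample by a constant."""
    def apply(frame: Frame) -> Frame:
        return [s * factor for s in frame]
    return apply


if __name__ == "__main__":
    multi_effect = chain(noise_gate(0.1), gain(0.5))
    print(multi_effect([0.05, 0.4, -0.8]))
```

Because the composed pipeline is itself an `Effect`, it can be applied frame by frame to a stream in the same way as a single stage.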

Get started with the Audio Effects SDK








Building on Powerful NVIDIA SDKs

Explore technologies that power Maxine. Taking advantage of the latest NVIDIA architecture, this technology provides a modular pipeline with customizability and scalability to support dynamic workloads.


Video and Image Analytics

The DeepStream SDK delivers an end-to-end streaming pipeline for AI-based, multi-sensor processing and video and image understanding.

Learn More

Video Encode and Decode

The Video Codec SDK is a comprehensive set of APIs, including high-performance tools, samples, and documentation, for hardware-accelerated video encode and decode on Windows and Linux. AI Face Codec (coming soon) will enable smoother video and a bandwidth reduction of up to 10X.

Learn More

Conversational AI

The Riva SDK is an application framework for multimodal conversational AI services that delivers real-time performance on GPUs.

Learn More




Resources

Reinvent Video Applications

Learn how developers from Notch, Headroom, Be.Live, and Touchcast are using NVIDIA Maxine.


Watch Now

New AI Technologies

Read about the latest developer software tools released at GTC 2021, including conversational AI, inference, and more.

Read News

GTC 2021 Keynote

Learn about the latest update for NVIDIA Maxine from NVIDIA’s CEO, Jensen Huang.


Watch Now

Latest Maxine News

Read how leading collaboration, content creation, and streaming providers are using NVIDIA Maxine.


Read News



NVIDIA Maxine is free to download for members of the NVIDIA Developer Program.


Download Now