Content Creation / Rendering

Reallusion Brings Digital Characters to Life with NVIDIA AI

In today’s digital age, creating realistic animated characters is crucial for filmmakers, game developers, and content creators looking to bring their visions to life. Reallusion is at the forefront of this cutting-edge art form, using powerful AI technologies like NVIDIA Audio2Face and NVIDIA Maxine to craft lifelike digital humans and character animations.

A major challenge lies in capturing facial animation that stays synchronized with audio while producing the final output in a single pass. Reallusion's adoption of the Maxine AR SDK addresses this by bypassing complex multi-track timing synchronization: chained effects yield a motion capture (mocap) solution that generates animation and audio as a single stream.

Figure 1. Reallusion’s AccuFACE avatar tracks facial expressions

Transforming character performances: Audio2Face integration with Reallusion’s iClone and Character Creator

Audio2Face is an advanced AI technology that can automatically generate expressive facial animations and lip-syncing just from an audio or text input. It supports multiple languages and can animate characters speaking or even singing. 

It’s not just limited to lip-syncing dialogue in various languages. The most recent standalone release of Audio2Face incorporates functionality to animate realistic facial expressions too. Animators can use slider and keyframe controls to animate even the most complex emotions and personality through the characters’ expressions in tandem with the speech animations.

Reallusion has integrated Audio2Face through plugins in their Character Creator and iClone applications, enabling a seamless AI-assisted animation workflow. With just one click, you can prepare an asset for animation, which generates live facial movements matched to any supplied voice track. The resulting animation can then be transitioned directly into iClone, enabling additional polishing and refinements before rendering it out for use in 3D apps, game engines, and other production environments.

Figure 2. Audio2Face workflow with Character Creator and iClone

Born from a close collaboration between NVIDIA and Reallusion, the CC Character Auto Setup plugin consolidates a previously cumbersome 18-step process into just one simple operation. 

Import a Character Creator asset and select either the Mark or Claire training model, then immediately see that 3D character come alive with lifelike facial animations lip-synced to any audio input. 

You can further sculpt the performance using Audio2Face's motion sliders, auto-expression tools, and keyframe controls before passing the animation directly into iClone for final production refinements. This is where iClone's powerful facial editing capabilities come into play, providing the tools needed to turn raw performances into polished character animations ready for production. 

Within iClone’s user-friendly interface, you have granular control over every aspect of the facial animation. Expression levels, head motions, and intricate details such as simulated eye darts can all be precisely adjusted to authentically convey a character’s distinctive personality: 

  • Jaw ranges can be exaggerated to convey powerful emotional intensity when needed. 
  • Tongue motion can be carefully sculpted to authentically mimic enunciation and speech patterns. 
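Conceptually, this kind of exaggeration amounts to scaling an animation curve's per-frame weights while keeping them in a valid range. The sketch below illustrates the idea with a hypothetical jaw-open blendshape curve; it is not Reallusion's actual API or data format.

```python
# Conceptual sketch (hypothetical data, not iClone's API): exaggerating
# a jaw-open blendshape curve by scaling its per-frame weights,
# clamped to the valid [0, 1] range.

def exaggerate_curve(weights, gain):
    """Scale per-frame blendshape weights by `gain`, clamping to [0, 1]."""
    return [min(1.0, max(0.0, w * gain)) for w in weights]

jaw_open = [0.0, 0.2, 0.45, 0.6, 0.3]   # hypothetical per-frame weights
boosted = exaggerate_curve(jaw_open, gain=1.5)
print(boosted)  # jaw motion amplified, capped at full extension
```

The clamp matters: without it, a large gain could push weights past the mesh's full-open extreme and distort the face.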

iClone can also incorporate head movements sourced from mocap tools, such as AccuFACE or iPhone Live Face.

AccuFACE: Next-gen AI face mocap powered by the NVIDIA Maxine AR SDK

Authentic facial performances encompass more than lip sync; they depend heavily on nuanced facial expressions that match the emotion behind the words. 

For example, conveying sarcasm relies heavily on facial expression: saying "It tastes fine" while exhibiting a look of distaste carries a markedly different meaning. 

To capture these implicit cues, your production workflow must integrate visual performance capture alongside the audio component. 

AccuFACE taps into the NVIDIA Maxine AR SDK to enable unprecedented real-time facial capture quality and capabilities. Powered by NVIDIA GPUs with Tensor Cores, the Maxine AR SDK offers AI-driven 3D facial tracking and modeling, body pose estimation, and more. It instantly analyzes expressions through parallel processing while accelerating throughput and minimizing latency. 

Key features leveraged by AccuFACE include the following:

  • Precise landmark mapping
  • Head pose and deformation tracking
  • Facial mesh reconstruction
  • Robust face detection and localization

By harnessing cutting-edge Maxine technology, AccuFACE translates captured facial data into seamless digital animation and enables you to generate expressive facial animations, drive responsive 3D avatars in real time, and facilitate interactive virtual conversations.

Video 1. AccuFACE – Video-based AI Facial Mocap | Live from Webcam or Recorded Video | iClone 8

AccuFACE provides a comprehensive toolkit to let you refine the AI-generated tracking for professional-grade results: 

  • Device settings, like smooth filtering and denoising, address jitters, twitches, and other tracking artifacts. 
  • Anti-interference cancellation prevents brow, mouth, and head movements from erroneously cross-triggering one another, protecting against muddied or exaggerated expressions. 
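To illustrate what smooth filtering does in principle, the sketch below applies an exponential moving average to a stream of tracked landmark coordinates. This is a minimal conceptual example, not AccuFACE's actual filter.

```python
# Minimal illustration of the "smooth filtering" idea: an exponential
# moving average applied to jittery tracked landmark positions.
# Conceptual sketch only -- not AccuFACE's actual algorithm.

def smooth(points, alpha=0.3):
    """Exponentially smooth a sequence of (x, y) landmark positions.
    Lower alpha = heavier smoothing (less jitter, more latency)."""
    out = []
    prev = None
    for x, y in points:
        if prev is None:
            prev = (x, y)  # first sample passes through unchanged
        else:
            # blend the new sample toward the previous filtered value
            prev = (prev[0] + alpha * (x - prev[0]),
                    prev[1] + alpha * (y - prev[1]))
        out.append(prev)
    return out

noisy = [(100, 50), (103, 49), (98, 52), (101, 50)]  # jittery samples
print(smooth(noisy))
```

The `alpha` parameter captures the usual trade-off in mocap cleanup: stronger smoothing removes twitches but makes fast expression changes lag slightly behind the actor.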

To capture distinct expressions, further calibration and refinement can regionally strengthen motion intensity across the eyes, cheeks, and lips, preserving the authenticity of individual actors and their unique performances.
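One way to picture regional calibration is as per-region gain factors applied to tracked expression channels. The channel names and region mapping below are hypothetical, purely to show the shape of the idea; they are not AccuFACE's actual parameter set.

```python
# Conceptual sketch of regional calibration: per-region gains applied
# to tracked expression channels. Channel and region names are
# hypothetical, not AccuFACE's actual parameters.

REGION_GAINS = {"eyes": 1.2, "cheeks": 0.9, "lips": 1.4}

CHANNEL_REGION = {
    "eye_blink_left": "eyes",
    "cheek_raise": "cheeks",
    "lip_pucker": "lips",
}

def calibrate(frame):
    """Scale each channel's weight by its region gain, clamped to [0, 1]."""
    return {ch: min(1.0, w * REGION_GAINS[CHANNEL_REGION[ch]])
            for ch, w in frame.items()}

frame = {"eye_blink_left": 0.5, "cheek_raise": 0.5, "lip_pucker": 0.8}
print(calibrate(frame))
```

Tuning the gains per actor is what lets one performer's subtle brow work and another's broad mouth shapes both read correctly on the same character rig.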

Reallusion’s partnership with NVIDIA showcases the transformative potential of AI in animation; professional-grade facial motion capture and animation are now within reach of a broader audience. 

With intuitive tools for refining facial animations and synchronization, you can achieve high-quality results without the need for extensive expertise or specialized equipment, revolutionizing the landscape of digital character animation.


From enhancing day-to-day video conferencing to integrating AI technology, NVIDIA Maxine offers high-quality video communications for all professionals.

The latest Maxine production release is included exclusively with NVIDIA AI Enterprise, which enables you to tap into production-ready features such as Triton Inference Server, enterprise support, and more.

If you’re interested in early, non-production access to current and soon-to-be-released features, see the Maxine Early Access program (requires login).

To help improve features in upcoming releases, provide feedback on the NVIDIA Maxine and NVIDIA Broadcast App survey.
