
NVIDIA Open Sources Audio2Face Animation Model

State-of-the-art generative AI technology brings lifelike expression to 3D avatars.

By leveraging large language and speech models, generative AI is creating intelligent 3D avatars that can engage users in natural conversation, in applications ranging from video games to customer service. To make these characters truly lifelike, they need human-like expressions. NVIDIA Audio2Face accelerates the creation of realistic digital characters by providing real-time facial animation and lip-sync driven by generative AI.

Today, NVIDIA is open sourcing our Audio2Face technology to accelerate adoption of AI-powered avatars in games and 3D applications.

Video 1. Demo of the NVIDIA Audio2Face 3.0 diffusion model in action

Audio2Face uses AI to generate realistic facial animations from audio input. It works by analyzing acoustic features like phonemes and intonation to create a stream of animation data, which is then mapped to a character’s facial poses. This data can be rendered offline for pre-scripted content or streamed in real time for dynamic, AI-driven characters, providing accurate lip-sync and emotional expressions.
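To make that flow concrete, below is a minimal conceptual sketch of the pipeline just described: audio is windowed into per-frame acoustic features, a model maps those features to facial poses (blendshape weights), and the resulting animation stream is applied to a character. All function names, shapes, and the feature and blendshape representations are illustrative assumptions, not the Audio2Face SDK API.

```python
# Conceptual sketch of the Audio2Face data flow described above, NOT the real SDK API.
# Every name and shape here is a hypothetical placeholder for illustration.
import numpy as np

FRAME_RATE = 30          # animation frames per second (assumed)
SAMPLE_RATE = 16_000     # audio sample rate in Hz (assumed)

def extract_acoustic_features(audio: np.ndarray) -> np.ndarray:
    """Stand-in for the acoustic analysis stage (phonemes, intonation).
    Here we simply window the waveform into per-frame energy features."""
    samples_per_frame = SAMPLE_RATE // FRAME_RATE
    n_frames = len(audio) // samples_per_frame
    frames = audio[: n_frames * samples_per_frame].reshape(n_frames, samples_per_frame)
    return np.abs(frames).mean(axis=1, keepdims=True)  # shape: (n_frames, 1)

def infer_blendshape_weights(features: np.ndarray, n_blendshapes: int = 52) -> np.ndarray:
    """Stand-in for the generative model that maps features to facial poses.
    A real model would predict per-frame blendshape weights for the character rig."""
    rng = np.random.default_rng(0)
    projection = rng.uniform(0.0, 1.0, size=(features.shape[1], n_blendshapes))
    return np.clip(features @ projection, 0.0, 1.0)  # shape: (n_frames, n_blendshapes)

def stream_to_character(weights: np.ndarray) -> None:
    """Stand-in for applying each frame of animation data to a character rig,
    either baked offline or streamed in real time."""
    for frame_index, frame_weights in enumerate(weights):
        pass  # a real integration would drive the rig here, frame by frame

# Two seconds of placeholder audio standing in for recorded or streamed speech.
audio = np.random.default_rng(1).standard_normal(SAMPLE_RATE * 2)
stream_to_character(infer_blendshape_weights(extract_acoustic_features(audio)))
```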

Figure 1. Speech audio and emotional triggers generate facial animations and lip-sync.

NVIDIA is open sourcing the Audio2Face models and SDK so every game and 3D application developer can build and deploy high-fidelity characters with cutting-edge animations. We’re also open sourcing the Audio2Face training framework, so anyone can fine-tune and customize our pre-existing models for specific use cases.

See the tables below for the complete list of open source tools and learn more at NVIDIA Developer.

Package | Use
Audio2Face SDK | Libraries and documentation for authoring and runtime facial animations on-device or in the cloud
Autodesk Maya plugin | Reference plugin (v2.0) with local execution that allows users to send audio inputs and receive facial animation for characters in Maya
Unreal Engine 5 plugin | UE5 plugin (v2.5) for UE 5.5 and 5.6 that allows users to send audio inputs and receive facial animation for characters in Unreal Engine 5
Audio2Face Training Framework | Framework (v1.0) to create Audio2Face models with your data
Table 1. Audio2Face SDK and plugins
Package | Use
Audio2Face Training Sample Data | Example data to get started with the training framework
Audio2Face Models | Regression (v2.2) and diffusion (v3.0) models to generate lip-sync
Audio2Emotion Models | Production (v2.2) and experimental (v3.0) models to infer emotional state from audio
Table 2. Audio2Face models and training data
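To illustrate what fine-tuning on your own data means conceptually, the sketch below fits a simple mapping from per-frame acoustic features to captured blendshape targets. It is a toy stand-in using plain NumPy; the actual Audio2Face Training Framework interface, data formats, and model architecture will differ.

```python
# Hypothetical illustration of fine-tuning on your own data: fit a mapping from
# per-frame acoustic features to captured blendshape weights.
# This is NOT the Audio2Face Training Framework API; names and shapes are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_features, n_blendshapes = 600, 32, 52

# Placeholder "dataset": paired acoustic features and ground-truth facial poses
# (in practice these would come from your own audio and facial-capture data).
features = rng.standard_normal((n_frames, n_features))
targets = rng.uniform(0.0, 1.0, (n_frames, n_blendshapes))

# Train a linear mapping with gradient descent on mean squared error.
weights = np.zeros((n_features, n_blendshapes))
learning_rate = 1e-2
for step in range(200):
    predictions = features @ weights
    gradient = features.T @ (predictions - targets) / n_frames  # MSE gradient
    weights -= learning_rate * gradient

mse = float(((features @ weights - targets) ** 2).mean())
print(f"final training MSE: {mse:.4f}")
```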

Open sourcing technology allows developers, students, and researchers to learn from and build upon state-of-the-art code. This creates a feedback loop where the community can add new features and optimize the technology for diverse use cases. We’re excited to make high-quality facial animation more accessible and can’t wait to see what the community creates with it. Join our NVIDIA Audio2Face developer community on Discord and share your latest work.

The industry-leading Audio2Face model is deployed widely across the gaming, media and entertainment, and customer service industries. Numerous ISVs and game developers, including Convai, Codemasters, GSC Game World, Inworld AI, NetEase, Reallusion, Perfect World Games, Streamlabs, and UneeQ Digital Humans, have integrated Audio2Face in their applications.

Video 2. NVIDIA Audio2Face technology in F1 25

Reallusion, which offers a platform for creators to build 3D characters, integrated Audio2Face within its suite of tools. “Audio2Face uses AI to create expressive, multilingual facial animation from audio,” said Elvis Huang, head of innovation at Reallusion, Inc. “Its seamless integration with Reallusion’s iClone, Character Creator, and iClone AI Assistant, plus advanced editing tools like face-key editing, face puppeteering, and AccuLip make it easier than ever to produce high-quality character animation.”

Survios, developer of Alien: Rogue Incursion Evolved Edition, sped up its animation process, making it possible to deliver high-quality character experiences sooner. “By integrating Audio2Face into Evolved Edition, we streamlined the pipeline for lip-syncing and facial capture while ensuring a more immersive and authentic character experience for our players,” said Eugene Elkin, game director and lead engineer at Survios.

The Farm 51, creators of the Chernobylite game series, integrated Audio2Face in their latest game. “The integration of NVIDIA Audio2Face technology in Chernobylite 2: Exclusion Zone has been a game-changer for us,” said Wojciech Pazdur, creative director at The Farm 51. “It has allowed us to generate highly detailed facial animations directly from audio, saving countless hours of animation work. Ideas that were impossible in the original Chernobylite are now possible, which brings a new level of realism and immersion to the characters, making their performances feel more authentic than ever.”

Below are the other announcements for game developers released this month.

Latest updates to RTX Kit 

RTX Kit is our suite of neural rendering technologies for ray tracing games with AI, rendering scenes with immense geometry, and creating game characters with photorealistic visuals.

RTX Neural Texture Compression SDK dramatically reduces the memory usage of high-quality textures without sacrificing visual quality, and has received a host of improvements, including:

  • Library optimizations for very large texture sets and improved performance with Cooperative Vectors on DX12
  • Expanded feature set for the rendering sample, with improved performance and DLSS support
  • Command-line tool improvements when compressing and decompressing very large texture sets
  • New Intel Sponza scene, well suited for benchmarking

RTX Global Illumination SDK provides ray-traced indirect lighting solutions and has also received improvements:

  • Addition of a VSync option to the path tracer sample
  • Addition of cache visualization with a material demodulation toggle
  • Spatially Hashed Radiance Cache (SHaRC) algorithm updates: the compaction option is removed, and optional material demodulation, an additional debug pass, and documentation updates are introduced

NVIDIA vGPU scales up the game development environment 

NVIDIA virtual GPU (vGPU) technology enables GPU sharing among multiple users in a virtualized environment, allowing scalable GPU resources to support game developers across the entire organization. Activision overhauled its global integration, delivery, and deployment pipeline with NVIDIA vGPU, replacing 100 legacy servers with just six RTX GPU-powered units. The results: 

  • 82% reduction in footprint
  • 72% drop in power usage
  • Over 250,000 tasks run daily across 3,000 developers and 500+ systems 
Video 3. Activision created a global testing and deployment platform with NVIDIA vGPU 

By consolidating infrastructure and enabling dynamic GPU allocation, Activision built a scalable, automated testing platform that supports everything from multiplayer validation to visual regression and performance testing, accelerating iteration speed and raising code quality across the board.

Explore the Activision story to see how centralized GPU scheduling is redefining AAA development pipelines.

Graphics development and performance tuning sessions from SIGGRAPH 2025

NVIDIA hosted a range of training sessions and technical presentations at SIGGRAPH 2025. Of particular interest to game developers were hands-on labs showcasing the latest advancements in the Nsight suite of graphics developer tools. Recordings of these sessions are now available to stream on NVIDIA On-Demand.

Nsight Graphics in Action: Develop and Debug Modern Ray-Tracing Applications focuses on inspection and debugging of frames to identify and diagnose common rendering bugs and performance blockers, including use of the new Graphics Capture tool that provides expanded and modernized workflows. 

Nsight Graphics in Action: Optimize Shaders in Modern Ray-Tracing Applications is a deep dive into the GPU Trace Profiler, which lets you drill down into individual lines of shader code to find runtime execution bottlenecks. 

Optimize VRAM Management With NVIDIA Nsight Systems shows how to attain a holistic view of application performance and resource utilization across both the CPU and GPU using traces that can be minutes long. Special emphasis is given to the new Graphics Hotspot Analysis tool, which converts raw timeline data into a web-based interface with easy-to-read summaries of concurrency analysis, frame stutters, and more.

Download Nsight Graphics and Nsight Systems to get started optimizing your own games and graphics applications. 

What’s Next

If you weren’t able to catch our “Level up with NVIDIA” webinar episode this morning on RTX Mega Geometry in Unreal Engine 5.6, be sure to watch it on-demand here.

See our full list of game developer resources here and follow us to stay up to date with the latest NVIDIA game development news.
