We have been pursuing the creation of digital humans for years.
Traditionally, digital humans have been widely used in media and entertainment, from video game characters to CGI characters in movies. But the process of creating a digital human is extremely labor-intensive and manual, requiring hundreds of domain experts such as artists, programmers, and technical artists. On top of that, people can tell when an artificial human is fake. We’re extremely sensitive to the uncanny valley effect, so we know when something is off.
NVIDIA is researching tools and developing ways to accelerate and simplify digital human creation, and we believe that AI and simulation are the key to doing this.
What is a digital human?
At its core, a digital human is a digital form of ourselves in the virtual world. In the early days of 3D games, Virtua Fighter was one of the first to show 3D characters fighting each other. Today, players can experience the journeys of memorable characters from popular games like God of War and The Last of Us. You may also recognize digital humans from popular movies, like the villain Thanos in Avengers: Endgame or the digitally aged Brad Pitt in The Curious Case of Benjamin Button.
There are also new use cases emerging in entertainment, where stories are told through digital avatars. Companies like Fable AI and Baobab are creating interactive virtual stories that involve digital characters. But how do we define what a digital human is? What metrics can we use to describe different types of digital humans?
There are generally three scales, or axes, on which we measure digital humans:
- Realistic compared to stylized
- Real-time compared to offline
- AI-driven compared to human-driven
Bringing digital humans to life
As previously mentioned, the process of creating a digital human can be challenging. There are three main components to making a digital human, and each requires a different combination of art and technology: generation, animation, and intelligence.
To generate a digital human, teams must first create the 3D models, textures, shaders, and skeleton rig, along with the skin deformation that makes the surface follow the skeleton.
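For a sense of what that last step involves, below is a minimal sketch of linear blend skinning, one common way to make a skin mesh follow a posed skeleton. The array shapes, weights, and joint transforms here are illustrative placeholders, not a specific production implementation.

```python
import numpy as np

def linear_blend_skinning(rest_positions, joint_transforms, skin_weights):
    """Deform rest-pose vertices so the skin follows the posed skeleton.

    rest_positions:   (V, 3) vertex positions in the rest (bind) pose
    joint_transforms: (J, 4, 4) per-joint matrices mapping bind space to posed space
    skin_weights:     (V, J) per-vertex joint weights, each row summing to 1
    """
    # Homogeneous coordinates so the 4x4 joint matrices can be applied directly.
    homogeneous = np.concatenate(
        [rest_positions, np.ones((rest_positions.shape[0], 1))], axis=1)  # (V, 4)

    # Transform every vertex by every joint: result shape (J, V, 4).
    per_joint = np.einsum('jab,vb->jva', joint_transforms, homogeneous)

    # Blend the per-joint results with the skin weights: result shape (V, 4).
    blended = np.einsum('vj,jva->va', skin_weights, per_joint)
    return blended[:, :3]
```

In production pipelines, riggers paint (or increasingly, learn) those per-vertex weights; the deformation itself is essentially this weighted blend of joint transforms.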
For animation and movement, artists must consider the physical elements of the digital human, from the body and face to hair and clothing, and typically combine deformation and simulation to achieve the right motion for each part. Until now, there have been two main ways to achieve a realistic performance: animating by hand or using performance capture techniques to record motion data. Often, it is a combination of the two.
Within the last few years, using artificial intelligence (AI) to generate or synthesize animation has started to appear more often. These techniques still play a smaller role and typically handle specific types of performance, but that is changing rapidly.
The goal of all of these approaches is to create context-based behavior so that the digital human can act and move in a believable way. Showing emotion and behavior like a real human is still difficult, but new technologies in AI and simulation are helping to make it easier.
Lastly, artists must bring intelligence to digital humans, and they can do this through bidirectional interaction. Through natural language processing and speech technologies such as NVIDIA Riva, Ensemble Health AI, and Replica, a digital human can have conversations with real humans. Digital humans can also have vision within the virtual world as well as the real world: they can recognize objects and navigate their surroundings, and they can see the users talking to them and look and respond accordingly.
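To make that interaction loop concrete, here is a minimal, vendor-neutral sketch of a single listen-understand-respond turn. The `transcribe`, `generate_reply`, and `synthesize_speech` callables are hypothetical placeholders for whichever speech and dialog components (Riva or otherwise) a team plugs in.

```python
from dataclasses import dataclass, field

@dataclass
class ConversationState:
    """Rolling dialog context the avatar carries between turns."""
    history: list = field(default_factory=list)

def avatar_turn(audio_in, state, transcribe, generate_reply, synthesize_speech):
    """One interaction turn: speech in, spoken (and animatable) reply out.

    transcribe, generate_reply, and synthesize_speech are placeholders for
    the ASR, dialog, and TTS components of whatever speech stack is used.
    """
    user_text = transcribe(audio_in)                # speech -> text
    state.history.append(("user", user_text))

    reply_text = generate_reply(state.history)      # dialog context -> response text
    state.history.append(("avatar", reply_text))

    reply_audio = synthesize_speech(reply_text)     # text -> speech
    # The same reply audio can also drive lip-sync and facial animation downstream.
    return reply_text, reply_audio
```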
The importance of digital humans
Digital humans may have started in media and entertainment, but the need for them is growing and spanning industries. Today, we are already seeing their benefits and potential use cases.
To start, AI digital assistants have great potential in industries like healthcare and retail. For medical professionals, digital assistants can help improve training and procedures. Doctors can operate in a realistic simulation and run it hundreds of times to ensure the best results before performing the surgery in real life. In retail, AI digital assistants can enhance customer service by providing a more personalized experience.
For this to work, AI digital assistants need strong comprehension of verbal communication. This is key to helping people interact and converse with digital assistants so they can accomplish the tasks they need to get done.
For companies in industries like architecture and manufacturing, digital twins are helping teams simulate workers and occupants in large environments, from factories and buildings to entire cities. With the help of digital humans, companies can assess risks and test scenarios with accurate simulations, helping them ensure that physical buildings are optimally designed.
The intelligence required here is different from that of an AI digital assistant. When you put a digital human in a virtual environment, it must know how to navigate and behave like a person, whether it's a factory worker or a tourist walking through a skyscraper.
Lastly, digital humans will help improve synthetic data generation. Data for training neural networks is the essence of AI. Companies like Synthesis AI, Microsoft, DataGen, Epic Games, and Reallusion are already working on capturing and synthesizing 3D digital human data to train AI models. But we continuously need more data, and synthetic data generated from digital humans is key to meeting that need as AI grows.
Every voice will have a face
What’s on the horizon for digital humans? As digital humans are adopted more widely, people will find creative uses for them, and as we move toward experiences in virtual worlds, this will become even more prominent.
Digital humans are essential for virtual world experiences. In fact, everyone will one day have their own digital version of themselves, whether it’s an accurate or stylized avatar.
With NVIDIA Omniverse, we want to create a framework where many types of digital humans can coexist. Pixar’s Universal Scene Description (USD) serves as a standard format across 3D industries, so everyone can exchange data and talk together. Omniverse is helping drive these efforts toward USD, which is key to enabling different applications and technologies to bridge and collaborate in creating digital humans.
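As a rough illustration of the kind of exchange USD makes possible, the snippet below uses Pixar's `pxr` Python bindings to compose a character asset into a shared scene by reference. The file paths and prim names are made up for the example.

```python
from pxr import Usd, UsdGeom, Gf

# Create a shared scene that several applications can open and layer on top of.
stage = Usd.Stage.CreateNew("shared_scene.usda")
UsdGeom.SetStageUpAxis(stage, UsdGeom.Tokens.y)

# Reference a character authored in another tool (paths and names are illustrative).
character = UsdGeom.Xform.Define(stage, "/World/DigitalHuman")
character.GetPrim().GetReferences().AddReference("character_asset.usd")

# Place the character in the scene; a downstream app can override this non-destructively.
character.AddTranslateOp().Set(Gf.Vec3d(0.0, 0.0, 0.0))

stage.GetRootLayer().Save()
```

Because the character lives in its own layer and is only referenced here, an animation tool, a simulation tool, and a rendering application can each contribute opinions to the same scene without overwriting one another's work.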
For large worlds with huge numbers of digital people, we must be able to scale up. Simulating interconnected worlds where large numbers of digital humans interact together in real time is a challenging computational problem. Omniverse is addressing this challenge: the platform can power large worlds and simulations. This is crucial for the future of physical and virtual worlds, where massive numbers and varieties of digital humans can participate and interact together.
Over time, the connection between real humans and digital humans will grow. It will go beyond watching a puppet on the computer. Eventually, the computer will read and interact with us, just as we do in real life.
We’ll be able to talk with digital humans and even order goods like merchandise, food, and prescriptions through a digital person, with the tangible, real-world items then delivered to us. Communication and interaction will become a two-way street, and this provides a new element of freedom and reinvention.
What’s next?
Join us at NVIDIA GTC to see where the future of digital humans is heading. Register for free and check out our session on Digital Human Technologies, as well as our Expert Breakout for an in-depth look.
Don’t miss the special keynote, presented by our CEO Jensen Huang on Nov. 9, to hear about the latest technologies in AI and graphics.