Data is the lifeblood of AI systems, which rely on robust datasets to learn and make predictions or decisions. For perception AI models specifically, it is essential that data reflects real-world environments and incorporates a wide array of scenarios. This includes edge cases for which data is often difficult to collect, such as street traffic and manufacturing assembly lines.
To bootstrap and accelerate the training of computer vision models, AI and machine learning (ML) engineers can leverage the power of synthetic data in conjunction with real-world data. Synthetic data generation (SDG) enables AI and ML engineers to generate large, diverse sets of training data for use cases spanning visual inspection, robotics, and autonomous driving.
With the latest update of NVIDIA Omniverse Replicator, a core extension of the NVIDIA Omniverse platform built on Universal Scene Description (OpenUSD), developers can build more powerful synthetic data generation pipelines than ever before. New feature highlights include:
- Unlocking the power of synthetic data for AI developers with a low-code, YAML-based configurator.
- Scaling the overall rendering process through asynchronous rendering that disaggregates the sensor simulation from rendering tasks.
- Achieving greater flexibility during the data generation process with event-based conditional triggers.
Omniverse Replicator enables developers to build a data factory for training computer vision models. Replicator is also highly customizable and extensible, making it easy to fit into many computer vision workflows.
Replicator is integrated into NVIDIA Isaac Sim for robotics and NVIDIA DRIVE Sim for autonomous vehicle workflows. At ROSCon 2023, NVIDIA announced major updates to the NVIDIA Isaac Robotics platform that simplify building and testing performant AI-based robotics applications.
Simplified and tailored solutions
Previously, the Replicator extension required developers to write extensive code to generate data for model training. AI and ML engineers unfamiliar with 3D content generation lacked an efficient method for generating data.
Now, rather than writing extensive code for a pre-existing scene, developers can use a YAML-based description file to declare which parameters to vary (lights, environment, and object locations, for example) in a simple syntax. This approach makes it easier to track SDG parameters as part of the model creation and performance lineage, enabling a truly data-centric approach to model development.
In addition, developers can use the YAML file to batch-generate data with Replicator through Omniverse Farm running on an NVIDIA OVX system, with minimal user intervention. Users can easily share and distribute these recipes, create new versions of the same file, and build an automated pipeline for data generation.
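For reference, the following is a minimal sketch of the kind of randomization such a configuration file describes, expressed here through Replicator's Python API. The "crate" semantic label, the output path, and the value ranges are hypothetical placeholders for a pre-existing scene; refer to the Replicator documentation for the exact YAML schema.

```python
import omni.replicator.core as rep

with rep.new_layer():
    # Camera and render product for the pre-existing scene
    camera = rep.create.camera(position=(0, 2, 8), look_at=(0, 0, 0))
    render_product = rep.create.render_product(camera, (1024, 1024))

    # Lights and scene objects to randomize ("crate" is a hypothetical semantic label)
    lights = rep.create.light(light_type="Sphere", count=2)
    crates = rep.get.prims(semantics=[("class", "crate")])

    # Vary lighting intensity and object placement on every frame
    with rep.trigger.on_frame(num_frames=100):
        with lights:
            rep.modify.attribute("intensity", rep.distribution.uniform(500, 5000))
        with crates:
            rep.modify.pose(
                position=rep.distribution.uniform((-2, 0, -2), (2, 0, 2)),
                rotation=rep.distribution.uniform((0, -180, 0), (0, 180, 0)),
            )

    # Write RGB images and 2D bounding boxes to disk
    writer = rep.WriterRegistry.get("BasicWriter")
    writer.initialize(output_dir="_output", rgb=True, bounding_box_2d_tight=True)
    writer.attach([render_product])
```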
Scaling synthetic data generation with asynchronous rendering
World simulation, sensor simulation, and rendering tasks for SDG are typically implemented as a tightly integrated synchronous application. This limits the flexibility to simulate sensors operating at different rates without compromising performance.
Asynchronous rendering runs the simulation and rendering of sensors asynchronously from one another, empowering users with finer control over the entire process. This enables developers to render synthetic data at scale using multiple GPUs, thereby increasing throughput.
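As a rough sketch of how this might look in practice, the snippet below attaches two cameras that could represent sensors running at different rates and enables the asynchronous rendering path before generating data. The setting path "/omni/replicator/asyncRendering" is an assumption rather than a confirmed flag, so check the Replicator documentation for the authoritative way to enable this mode.

```python
import carb
import omni.replicator.core as rep

# Assumed setting path for Replicator's asynchronous rendering mode; verify the
# exact flag name and its default value in the Replicator documentation.
carb.settings.get_settings().set("/omni/replicator/asyncRendering", True)

# Two cameras standing in for sensors that may operate at different rates
cam_a = rep.create.camera(position=(0, 2, 8), look_at=(0, 0, 0))
cam_b = rep.create.camera(position=(8, 2, 0), look_at=(0, 0, 0))
rp_a = rep.create.render_product(cam_a, (1920, 1080))
rp_b = rep.create.render_product(cam_b, (1280, 720))

# Write RGB output for both render products
writer = rep.WriterRegistry.get("BasicWriter")
writer.initialize(output_dir="_output_async", rgb=True)
writer.attach([rp_a, rp_b])

rep.orchestrator.run()
```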
Superior flexibility for SDG with event-based triggers
In Replicator, triggers dictate when specific nodes, such as randomizers or writers, are activated. The system supports on-frame triggers, which activate nodes every frame, and on-time triggers, which activate nodes at set time intervals.
The latest Replicator release also introduces conditional triggers, which enable the activation of nodes based on specific events or conditions. Developers can now establish custom logic through their own functions, offering more refined control over randomizers and writers.
Roboticists using the newest version of NVIDIA Isaac Sim can use this feature to initiate the movement of an autonomous mobile robot (AMR) in response to a particular event. This offers a robust method for controlling when and how synthetic data is generated based on simulation events.
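A minimal sketch of this pattern, assuming Replicator's custom-event trigger API (rep.trigger.on_custom_event paired with rep.utils.send_og_event): a randomizer is registered against a named event and fires only when that event is sent, for example from simulation logic that detects an AMR reaching a waypoint. The event name and value ranges are illustrative.

```python
import omni.replicator.core as rep

# Objects whose placement should change only when a specific event occurs
spheres = rep.create.sphere(count=5)

# Conditional trigger: the randomizer fires only when the named event is sent
with rep.trigger.on_custom_event(event_name="randomize_spheres"):
    with spheres:
        rep.modify.pose(position=rep.distribution.uniform((-3, 0, -3), (3, 0, 3)))

# Elsewhere in the simulation logic (for example, when an AMR reaches a waypoint),
# send the event to activate the randomizer and capture a new data point
rep.utils.send_og_event(event_name="randomize_spheres")
```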
Start developing with Omniverse Replicator
These are just a few of the new Omniverse Replicator 1.10 features for boosting SDG pipelines. To learn about additional features, including material support, post-render augmentations, and 2D and 3D scatter node enhancements, see the Replicator documentation.
To start developing your own SDG applications with Omniverse Replicator, download Omniverse free and follow the instructions for getting started with Replicator in Omniverse Code.
To learn more about Replicator, check out the Replicator tutorials. Join the NVIDIA Omniverse Discord Server to chat with the community, and check out the synthetic data generation Discord channel.
Follow Omniverse on Instagram, Twitter, YouTube, and Medium for additional resources and inspiration. You can also check out the NVIDIA Developer Forums for information from Omniverse experts.