Accelerating Model Development and AI Training with Synthetic Data, SKY ENGINE AI platform, and NVIDIA TAO Toolkit

In AI and computer vision, data acquisition is costly and time-consuming, and human labeling can be error-prone. Model accuracy also suffers when the data is insufficient or poorly balanced, and improving a deep learning model takes a long time because it typically requires acquiring more data in the real world.

Collecting and preparing data, and developing accurate and reliable software solutions based on AI training, is an extremely laborious process. The required investment can offset the expected benefits of deploying the system.

One way to bridge the data gap and accelerate model training is to use synthetic data instead of real data for training. SKY ENGINE provides an AI platform that moves deep learning to virtual reality. It generates synthetic data using simulations, and the synthetic images come with annotations that can be used directly to train AI models.

Synthetic data can now be directly exported to run on the NVIDIA TAO Toolkit, an AI training toolkit that simplifies training by abstracting away the AI/DL framework complexity. This enables you to build production-quality models faster without needing any AI expertise. With the SKY ENGINE AI platform and the toolkit, you can quickly iterate and build AI.

In this post, you learn how to harness the power of synthetic data by taking preannotated synthetic data and using it to train a model with TAO Toolkit. I demonstrate a simple inspection use case: identifying antennas on a telco tower using segmentation.

About the SKY ENGINE AI approach

SKY ENGINE introduces a full-stack AI platform for deep learning in virtual reality, which is the next-generation active learning AI system for image and video analysis applications. The SKY ENGINE AI platform can generate data using a proprietary, dedicated simulation system where images come already annotated and ready for deep learning.

The output data stream can include any of the following:

Rendered images or other simulated sensor data in selected modalities
Object bounding boxes
3D bounding boxes
Semantic masks
2D or 3D skeletons
Depth maps
Normal vector maps

SKY ENGINE AI also includes advanced domain adaptation algorithms that can understand the characteristics of real data examples. They help ensure that any trained AI model performs well during inference.

The SKY ENGINE simulation system enables physics-driven sensor simulations (cameras, thermal vision, IR, lidars, radars, and more) and sensor data fusion. It is tightly coupled with a deep learning pipeline, so the data can evolve along with the model. During training, SKY ENGINE AI can spot ambiguous situations that deteriorate the accuracy of the AI model. It then generates more imagery reflecting those problematic situations, so that the accuracy of the model can improve quickly. SKY ENGINE AI learns more with every experiment performed.

SKY ENGINE AI delivers a collection of deep neural networks that are fully implemented, tested, and optimized. The provided models are dedicated to popular computer vision tasks like object detection and semantic segmentation. More sophisticated topologies are also included, designed and implemented for 3D position and pose estimation, 3D geometry reasoning, or representation learning.

SKY ENGINE AI does not require sophisticated rendering and imaging knowledge, so the entry barrier is very low. It has a Python API with a large number of helpers for quickly building and configuring the environment.

Neural network optimization

The SKY ENGINE AI platform can generate the datasets and enable the training of deep learning models that accept input data originating from any source. The input stream for AI model training in TAO Toolkit and for AI-driven inference can include low-quality images obtained using smartphones, data from CCTV cameras, or images from cameras mounted on drones.

You can deploy analytical modules for telecommunication network performance optimization on the cloud, including data storage and multi-GPU scaling. The majority of software projects driven by machine learning in this space are unable to reach the final stage of solution deployment. This could be because machine learning capabilities depend heavily on the quality of the input data. Developing AI models trained on synthetic data, as offered by SKY ENGINE, makes project development more predictable and supports deployment in several industrial business processes.

Telecommunication equipment detection and classification

One of the common computer vision tasks is the localization and classification of the equipment of interest. In this post, I present the process of neural network optimization for bounding box localization of antenna instances on a telecommunication tower using the TAO Toolkit environment with MaskRCNN. You use the synthetic data from SKY ENGINE AI to train the MaskRCNN model. The high-level workflow is as follows:

Generate synthetic data with annotations.
Convert the data format to COCO as required by the TAO Toolkit MaskRCNN model.
Configure the NGC environment and data preprocessing.
Train and evaluate the MaskRCNN model on synthetic data.
Perform inference using the trained AI model on synthetic and real telco towers.
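Step 2 of this workflow targets the COCO annotation format. As a rough illustration of what such a file contains, here is a minimal sketch of a COCO-style dictionary with one image and one antenna instance, plus a small consistency check. The file name, image size, coordinates, and the check_coco helper are hypothetical and are not part of the SKY ENGINE AI export module or of TAO Toolkit.

```python
import json

# Hypothetical single-image dataset in COCO-style layout.
coco = {
    "images": [
        {"id": 1, "file_name": "tower_0001.png", "width": 1024, "height": 768}
    ],
    "categories": [
        {"id": 1, "name": "antenna", "supercategory": "equipment"}
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            # COCO boxes are [x, y, width, height] in pixels.
            "bbox": [412.0, 150.0, 60.0, 180.0],
            "area": 60.0 * 180.0,
            # One polygon, flattened as x1, y1, x2, y2, ...
            "segmentation": [[412.0, 150.0, 472.0, 150.0,
                              472.0, 330.0, 412.0, 330.0]],
            "iscrowd": 0,
        }
    ],
}


def check_coco(dataset):
    """Return a list of problems found; an empty list means the basic checks pass."""
    problems = []
    image_ids = {img["id"] for img in dataset["images"]}
    category_ids = {cat["id"] for cat in dataset["categories"]}
    for ann in dataset["annotations"]:
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image_id")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id")
        _, _, width, height = ann["bbox"]
        if width <= 0 or height <= 0:
            problems.append(f"annotation {ann['id']}: empty bbox")
    return problems


problems = check_coco(coco)
serialized = json.dumps(coco)  # what would be written to the annotation file
print(problems)
```

A check like this is cheap to run before training and catches annotations that point at missing images or categories.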

To follow along, see the SKY ENGINE AI Jupyter notebook on GitHub.

Given the real samples of a telco tower, I used the SE Rendering Engine to create an annotated synthetic dataset.

*Figure 2. Synthetic images include automatically applied labels generated in the SKY ENGINE AI platform. (left) Synthetic images; (right) Semantic masks*

To launch the automatic generation of labeled data using SKY ENGINE AI and to prepare the data source object, you must define basic tools such as an empty renderer context, as well as the paths where the assets for the synthetic scene are located.

In this rendering scenario, I randomized the following:

The number of antennas on a given telecommunication tower
The direction of the light
The positions of the camera
The camera’s horizontal field of view
A background map
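To make the randomization above concrete, here is a minimal sketch that samples one set of scene parameters per render using only the Python standard library. It does not use the SKY ENGINE AI Python API. The SceneParams class, the value ranges, and the background file names are all assumptions chosen for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class SceneParams:
    """Hypothetical container for one randomized render (not a SKY ENGINE AI type)."""
    num_antennas: int
    light_direction_deg: tuple  # (azimuth, elevation), assumed convention
    camera_position: tuple      # (x, y, z) in meters, assumed units
    horizontal_fov_deg: float
    background_map: str


def sample_scene(rng, backgrounds):
    """Draw one random scene configuration; all ranges are illustrative."""
    return SceneParams(
        num_antennas=rng.randint(1, 12),
        light_direction_deg=(rng.uniform(0.0, 360.0), rng.uniform(10.0, 90.0)),
        camera_position=(
            rng.uniform(-30.0, 30.0),
            rng.uniform(-30.0, 30.0),
            rng.uniform(5.0, 60.0),
        ),
        horizontal_fov_deg=rng.uniform(30.0, 90.0),
        background_map=rng.choice(backgrounds),
    )


# A seeded generator keeps the sampled dataset reproducible.
rng = random.Random(42)
backgrounds = ["city.hdr", "field.hdr", "overcast.hdr"]  # hypothetical file names
scenes = [sample_scene(rng, backgrounds) for _ in range(5)]
for scene in scenes:
    print(scene)
```

Sampling every parameter independently for each render is the simplest choice. Correlated parameters would need a joint sampling scheme instead.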

In many projects, the samples returned by SKY ENGINE are not shuffled enough, for example when the rendering process follows a camera trajectory and consecutive frames look alike. For this reason, I recommend extra shuffling of the data before dividing it into train and test sets.
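The extra shuffling recommended here can be sketched in a few lines of standard-library Python. The shuffle_and_split helper, the 80/20 ratio, and the frame names are assumptions for illustration, not part of SKY ENGINE AI or TAO Toolkit.

```python
import random


def shuffle_and_split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples, then split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(samples)  # copy so the caller's order is untouched
    if len(shuffled) < 2:
        raise ValueError("need at least two samples to split")
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Frames rendered along a camera trajectory arrive in sequential order.
frames = [f"frame_{i:04d}.png" for i in range(100)]
train, test = shuffle_and_split(frames, test_fraction=0.2, seed=7)
print(len(train), len(test))
```

Fixing the seed makes the split reproducible, so later experiments are evaluated on the same test images.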

After generating the images, convert the dataset to COCO format using the data export module of SKY ENGINE. This format is required by the TAO Toolkit MaskRCNN model. After you prepare the configuration file according to the documentation, you can run the training for the TAO Toolkit pretrained MaskRCNN model with the TensorFlow backend:

!tao mask_rcnn train -e $SPECS_DIR/maskrcnn_train_telco_resnet50.txt \
                      -d $USER_EXPERIMENT_DIR/experiment_telco_anchors \
                      -k $KEY \
                      --gpus 1

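As a final step, run the trained deep learning model for inference to see whether it accurately performs the task of interest. The following command runs it on the validation images. To test on real data, point -i at a directory of real images: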

!tao mask_rcnn inference -i $DATA_DIR/valid_images \
                          -o $USER_EXPERIMENT_DIR/se_telco_maskrcnn_inference_synth \
                          -e $SPECS_DIR/maskrcnn_train_telco_resnet50.txt \
                          -m $USER_EXPERIMENT_DIR/experiment_telco_anchors/model.step-20000.tao \
                          -l $SPECS_DIR/telco_labels.txt \
                          -t 0.5 \
                          -b 1 \
                          -k $KEY \
                          --include_mask
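The command above passes -t 0.5. Assuming this value acts as a minimum confidence score for the detections that are kept, the effect can be sketched in plain Python as follows. The detection records and the filter_by_score helper are hypothetical and do not reflect the TAO Toolkit output format.

```python
def filter_by_score(detections, threshold=0.5):
    """Keep only detections whose confidence score reaches the threshold."""
    return [det for det in detections if det["score"] >= threshold]


# Hypothetical raw detections for a single image.
detections = [
    {"label": "antenna", "score": 0.91, "bbox": [412, 150, 60, 180]},
    {"label": "antenna", "score": 0.62, "bbox": [520, 140, 55, 175]},
    {"label": "antenna", "score": 0.18, "bbox": [90, 400, 30, 40]},
]

kept = filter_by_score(detections, threshold=0.5)
print([det["score"] for det in kept])
```

Raising the threshold removes more low-confidence detections at the cost of possibly missing some antennas, so the value is worth tuning on a few real images.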

Figure 3 shows some results of telecommunication antenna detection.

*Figure 3. Application of trained AI models on real images.*

Summary

In this post, I demonstrated how you can reduce your data collection and annotation effort by using synthetic data from SKY ENGINE to train and optimize a model with TAO Toolkit. I presented a single SKY ENGINE AI use case for the telecommunication industry. However, the platform also supports many other applications and delivers several advanced functionalities:

Automated dataset balancing (active learning)
Domain adaptation
Pretrained deep learning models for 3D reasoning
Simulations of sensors and training of deep learning models for sensor fusion

For more information, see the SKY ENGINE AI solution on GitHub. For more computer vision use cases developed in the SKY ENGINE AI Platform, see the following videos: