In AI and computer vision, data acquisition is costly and time-consuming, and human labeling can be error-prone. Model accuracy also suffers from insufficient or poorly balanced data, and improving a deep learning model takes a long time because each iteration requires reacquiring data in the real world.
Collecting and preparing data and developing accurate, reliable AI-based software is an extremely laborious process. The required investment can offset the expected benefits of deploying the system.
One way to bridge the data gap and accelerate model training is to train on synthetic data instead of real data. SKY ENGINE provides an AI platform that moves deep learning to virtual reality. Simulations generate synthetic images that come with annotations and can be used directly to train AI models.
Synthetic data can now be directly exported to run on the NVIDIA Transfer Learning Toolkit (TLT), an AI training toolkit that simplifies training by abstracting away the AI/DL framework complexity. This enables you to build production-quality models faster without needing any AI expertise. With the SKY ENGINE AI platform and TLT, you can quickly iterate and build AI.
In this post, you learn how to harness the power of synthetic data by taking preannotated synthetic data and training a model on it with TLT. I demonstrate a simple inspection use case: identifying antennas on a telco tower using segmentation.
About the SKY ENGINE AI approach
SKY ENGINE introduces a full-stack AI platform for deep learning in virtual reality, which is the next-generation active learning AI system for image and video analysis applications. The SKY ENGINE AI platform can generate data using a proprietary, dedicated simulation system where images come already annotated and ready for deep learning.
The output data stream can include any of the following:
- Rendered images or other simulated sensor data in selected modalities
- Object bounding boxes
- 3D bounding boxes
- Semantic masks
- 2D or 3D skeletons
- Depth maps
- Normal vector maps
SKY ENGINE AI also includes advanced domain adaptation algorithms that can learn the characteristics of real data examples. They help ensure that a trained AI model performs well during inference.
The SKY ENGINE simulation system enables physics-driven sensor simulations (cameras, thermal vision, IR, lidars, radars, and more) and sensor data fusion. It is tightly coupled with a deep learning pipeline so that the data and the model evolve together. During training, SKY ENGINE AI can spot ambiguous situations that degrade the accuracy of the AI model. It then generates more imagery that reflects those problematic situations, so that the accuracy of the model can improve quickly. SKY ENGINE AI learns more with every experiment performed.
SKY ENGINE AI delivers a garden of deep neural networks that are fully implemented, tested, and optimized. The provided models cover popular computer vision tasks such as object detection and semantic segmentation. The collection also includes more sophisticated topologies designed for 3D position and pose estimation, 3D geometry reasoning, and representation learning.
SKY ENGINE AI does not require sophisticated rendering and imaging knowledge, so the entry barrier is very low. It has a Python API that includes a large number of helpers for quickly building and configuring the environment.
Neural network optimization
The SKY ENGINE AI platform can generate datasets and enable the training of deep learning models whose input data can originate from any source. The input stream for AI model training in NVIDIA TLT and for AI-driven inference can include low-quality images taken with smartphones, footage from CCTV cameras, or images from cameras mounted on drones.
You can deploy analytical modules for telecommunication network performance optimization in the cloud, including data storage and multi-GPU scaling. The majority of software projects driven by machine learning in this space never reach the final stage of solution deployment. One likely reason is that machine learning capabilities depend heavily on the quality of the input data. Developing AI models with deep training on synthetic data, as offered by SKY ENGINE, is a solution with predictable project development and guaranteed deployment in several industrial business processes.
Telecommunication equipment detection and classification
One common computer vision task is the localization and classification of equipment of interest. In this post, I present the process of neural network optimization for bounding box localization of antenna instances on a telecommunication tower, using the NVIDIA TLT environment with MaskRCNN. You use the synthetic data from SKY ENGINE AI to train the MaskRCNN model. The high-level workflow is as follows:
- Generate synthetic data with annotations.
- Convert the data format to COCO as required by the NVIDIA TLT MaskRCNN model.
- Configure the NGC environment and data preprocessing.
- Train and evaluate the MaskRCNN model on synthetic data.
- Perform inference using the trained AI model on synthetic and real telco towers.
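For the COCO conversion step in the workflow above, it can help to see roughly what the target layout looks like. The following is a minimal sketch of a COCO-style annotation file with one image and one antenna mask. It is not the SKY ENGINE export module, which handles this step for you. The top-level keys and per-annotation fields follow the public COCO format, while `build_coco`, the helper functions, the file name, and the polygon values are hypothetical and chosen only for illustration.

```python
import json


def polygon_area(flat_xy):
    """Area of a polygon given as [x0, y0, x1, y1, ...] (shoelace formula)."""
    xs, ys = flat_xy[0::2], flat_xy[1::2]
    n = len(xs)
    twice_area = sum(
        xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n)
    )
    return abs(twice_area) / 2


def polygon_bbox(flat_xy):
    """Axis-aligned box in COCO order: [x_min, y_min, width, height]."""
    xs, ys = flat_xy[0::2], flat_xy[1::2]
    return [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]


def build_coco(images, masks, category_names):
    """Assemble a COCO-style dictionary.

    images: list of (file_name, width, height)
    masks:  list of (image_index, category_name, flat polygon)
    """
    categories = [{"id": i + 1, "name": n} for i, n in enumerate(category_names)]
    name_to_id = {c["name"]: c["id"] for c in categories}
    coco_images = [
        {"id": i + 1, "file_name": f, "width": w, "height": h}
        for i, (f, w, h) in enumerate(images)
    ]
    annotations = []
    for ann_id, (image_index, name, polygon) in enumerate(masks, start=1):
        annotations.append({
            "id": ann_id,
            "image_id": coco_images[image_index]["id"],
            "category_id": name_to_id[name],
            "segmentation": [polygon],      # list of polygons per instance
            "bbox": polygon_bbox(polygon),  # [x, y, width, height]
            "area": polygon_area(polygon),
            "iscrowd": 0,
        })
    return {"images": coco_images, "annotations": annotations, "categories": categories}


# Hypothetical example: one rendered frame with one rectangular antenna mask.
coco = build_coco(
    images=[("tower_0001.png", 1280, 720)],
    masks=[(0, "antenna", [100, 50, 140, 50, 140, 200, 100, 200])],
    category_names=["antenna"],
)
coco_json = json.dumps(coco, indent=2)
```

Writing `coco_json` to a file gives a single annotation file per dataset split, which is the usual convention for COCO-style datasets.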
To follow along, see the SKY ENGINE AI Jupyter notebook on GitHub.
Given the real samples of a telco tower, I used the SE Rendering Engine to create an annotated synthetic dataset.
To launch automatic generation of labeled data using SKY ENGINE AI and to prepare the data source object, you must define basic tools such as an empty renderer context, as well as the paths where the assets for the synthetic scene are located.
In this rendering scenario, I randomized the following:
- The number of antennas on a given telecommunication tower
- The direction of the light
- The positions of the camera
- The camera’s horizontal field of view
- A background map
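As a rough illustration of this kind of randomization, the following sketch samples one set of scene parameters per rendered frame. It does not use the SKY ENGINE API. The `SceneParams` class, the `sample_scene` function, the value ranges, and the background file names are all hypothetical and only show the idea of drawing each randomized quantity independently.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class SceneParams:
    """Hypothetical container for one randomized rendering setup."""
    num_antennas: int
    light_direction: tuple      # unit vector (x, y, z)
    camera_position: tuple      # (x, y, z), arbitrary scene units
    horizontal_fov_deg: float
    background_map: str


def sample_scene(rng, backgrounds):
    """Draw one random scene configuration (all ranges are assumptions)."""
    num_antennas = rng.randint(1, 12)

    # Light direction: random azimuth, elevation between 10 and 80 degrees.
    azimuth = rng.uniform(0.0, 2.0 * math.pi)
    elevation = rng.uniform(math.radians(10), math.radians(80))
    light = (
        math.cos(elevation) * math.cos(azimuth),
        math.cos(elevation) * math.sin(azimuth),
        math.sin(elevation),
    )

    # Camera placed on a ring around the tower at a random distance and height.
    cam_angle = rng.uniform(0.0, 2.0 * math.pi)
    cam_dist = rng.uniform(10.0, 40.0)
    camera = (
        cam_dist * math.cos(cam_angle),
        cam_dist * math.sin(cam_angle),
        rng.uniform(5.0, 40.0),
    )

    fov = rng.uniform(30.0, 90.0)
    background = rng.choice(backgrounds)
    return SceneParams(num_antennas, light, camera, fov, background)


# A seeded generator makes the sampled dataset reproducible.
rng = random.Random(0)
backgrounds = ["bg_city.hdr", "bg_field.hdr"]  # hypothetical asset names
scenes = [sample_scene(rng, backgrounds) for _ in range(5)]
```

Seeding the generator is a design choice worth keeping in a real pipeline, because it lets you regenerate exactly the same dataset when you need to reproduce a training run.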
In many projects, the samples returned by SKY ENGINE are not shuffled enough. One example is a rendering process that follows a camera trajectory, so that consecutive frames look alike. For this reason, I recommend extra shuffling of the data before dividing it into train and test sets.
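A minimal sketch of that extra shuffling step, using only the Python standard library, could look like the following. The function name `shuffle_and_split`, the 80/20 split, the seed, and the frame file names are assumptions for illustration and are not part of SKY ENGINE or TLT.

```python
import random


def shuffle_and_split(samples, test_fraction=0.2, seed=42):
    """Shuffle a copy of the samples, then split into (train, test)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(samples)               # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical frame names rendered in trajectory order.
frames = [f"frame_{i:04d}.png" for i in range(100)]
train, test = shuffle_and_split(frames)
```

Because the shuffle happens before the split, neighboring frames from the same stretch of a camera trajectory are spread across both sets instead of the test set being one contiguous segment.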
After generating the images, convert them to COCO format using the data export module of SKY ENGINE, as required by the NVIDIA TLT framework. After you prepare the configuration file according to the documentation, you can run the training for the TLT pretrained MaskRCNN model with the TensorFlow backend:
!tlt mask_rcnn train -e $SPECS_DIR/maskrcnn_train_telco_resnet50.txt \
                     -d $USER_EXPERIMENT_DIR/experiment_telco_anchors \
                     -k $KEY \
                     --gpus 1
As a final step, run the trained deep learning model for inference on real data to see whether it performs the task of interest accurately.
!tlt mask_rcnn inference -i $DATA_DIR/valid_images \
                         -o $USER_EXPERIMENT_DIR/se_telco_maskrcnn_inference_synth \
                         -e $SPECS_DIR/maskrcnn_train_telco_resnet50.txt \
                         -m $USER_EXPERIMENT_DIR/experiment_telco_anchors/model.step-20000.tlt \
                         -l $SPECS_DIR/telco_labels.txt \
                         -t 0.5 \
                         -b 1 \
                         -k $KEY \
                         --include_mask
Figure 3 shows some results of telecommunication antenna detection.
In this post, I demonstrated how you can reduce your data collection and annotation effort by using synthetic data from SKY ENGINE and training and optimizing a model on it with NVIDIA TLT. I presented a single SKY ENGINE AI use case for the telecommunication industry. However, the platform supports many further applications through several advanced capabilities:
- Automated dataset balancing (active learning)
- Domain adaptation
- Pretrained deep learning models for 3D reasoning
- Simulations of sensors and training of deep learning models for sensor fusion
For more information, see the SKY ENGINE AI solution on GitHub. For more computer vision use cases developed in the SKY ENGINE AI Platform, see the following videos: