NVIDIA Research: Fast Uncertainty Quantification for Deep Object Pose Estimation

Jun 08, 2021

By Yuke Zhu

Researchers from NVIDIA, the University of Texas at Austin, and Caltech developed a simple, efficient, plug-and-play uncertainty quantification method for 6-DoF (six degrees of freedom) object pose estimation. The method uses an ensemble of K pre-trained estimators with different architectures, different training data sources, or both.

The researchers presented their paper, “Fast Uncertainty Quantification for Deep Object Pose Estimation” (FastUQ), at the 2021 International Conference on Robotics and Automation (ICRA 2021).

FastUQ focuses on uncertainty quantification for deep object pose estimation. A major challenge in deep learning-based object pose estimation (see NVIDIA DOPE) is that the estimators can be overconfident in their pose predictions.

For example, the two figures below show pose estimation results for the “Ketchup” object from a DOPE model in a manipulation task. The model reports high confidence for both results, but the left one is incorrect.

Another challenge FastUQ addresses is the sim2real gap. Deep learning-based pose estimators are typically trained on synthetic datasets (generated with NVIDIA’s ray-tracing renderer, NViSII), but we want to apply these estimators in the real world and quantify their uncertainty there. For example, the left figure is from the synthetic NViSII dataset, and the right one is from the real world.

In this project, we propose an ensemble-based method for fast uncertainty quantification of deep learning-based pose estimators. The following two figures illustrate the idea. In the left figure, the deep models in the ensemble disagree with each other, which implies more uncertainty. In the right figure, the models agree with each other, which implies less uncertainty.
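The agree/disagree idea can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from the paper: the function names (`rotation_angle`, `ensemble_disagreement`, `rot_z`) are hypothetical, and the two disagreement measures used here (mean pairwise translation distance and mean pairwise rotation angle) are assumptions chosen for simplicity. Each ensemble member is assumed to return a pose as a 3x3 rotation matrix and a 3-vector translation.

```python
import itertools

import numpy as np


def rotation_angle(R_a, R_b):
    """Geodesic angle in radians between two 3x3 rotation matrices."""
    cos_theta = (np.trace(R_a.T @ R_b) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from round-off.
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))


def ensemble_disagreement(poses):
    """Summarize how much K pose estimates disagree.

    poses: list of (R, t) tuples, one per ensemble member (assumed format).
    Returns (mean pairwise translation distance, mean pairwise rotation angle).
    Larger values mean the members disagree more, i.e. more uncertainty.
    """
    pairs = list(itertools.combinations(poses, 2))
    if not pairs:
        raise ValueError("need at least two pose estimates")
    trans = np.mean([np.linalg.norm(t_a - t_b) for (_, t_a), (_, t_b) in pairs])
    rot = np.mean([rotation_angle(R_a, R_b) for (R_a, _), (R_b, _) in pairs])
    return float(trans), float(rot)


def rot_z(theta):
    """Rotation by theta radians about the z-axis (helper for the demo)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


if __name__ == "__main__":
    t = np.array([0.10, 0.00, 0.50])
    agreeing = [(rot_z(0.00), t), (rot_z(0.01), t + 0.001), (rot_z(-0.01), t)]
    disagreeing = [(rot_z(0.0), t), (rot_z(1.5), t + 0.2), (rot_z(-0.8), t - 0.1)]
    print("agreeing ensemble:   ", ensemble_disagreement(agreeing))
    print("disagreeing ensemble:", ensemble_disagreement(disagreeing))
```

In a sketch like this, a downstream planner could treat a large disagreement score as a signal to slow down, re-observe the object, or reject the pose estimate. Because the score only needs the members' outputs, no retraining of the pre-trained estimators is required.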

This interdisciplinary research was a joint effort of several research teams at NVIDIA:

The AI Algorithms team, led by Anima Anandkumar, and the NVIDIA AI Robotics Research Lab in Seattle worked on the uncertainty quantification methods.
The Learning and Perception Research team, led by Jan Kautz, trained the deep object pose estimation models and provided photorealistic synthetic data from NVIDIA’s ray-tracing renderer, NViSII.

To train the deep estimators and generate the high-fidelity, photorealistic synthetic datasets, the team used NVIDIA V100 GPUs and NVIDIA OptiX (C++/CUDA back end) for acceleration.

FastUQ is a fast uncertainty quantification method for deep object pose estimation that is efficient, plug-and-play, and supports a general class of pose estimation tasks. This research could have a significant impact on autonomous driving and general autonomy by enabling more robust and safer perception, as well as uncertainty-aware control and planning.

To learn more about the research, visit the FastUQ project website.

Thank you to Guanya Shi at Caltech for his help with the figures and text of this blog post.

About the Authors

About Yuke Zhu
Yuke Zhu is a researcher on the NVIDIA AI Algorithms team. He received his master’s and Ph.D. degrees from Stanford. His Ph.D. thesis centers around closing the perception-action loop to make robot intelligence more generalized and applicable to less-controlled environments. His research lies at the intersection of robotics, machine learning, and computer vision. He develops computational methods of perception and control that give rise to intelligent robot behaviors. Through his work, he aspires to teach robots to understand and interact with the visual world around them. His expertise has gained attention from a variety of news outlets, leading tech institutions, and award organizations. His publications have won several awards and nominations, including the Best Conference Paper Award at ICRA 2019. His work has been covered by media outlets such as MIT Technology Review and Stanford News.
