GTC Silicon Valley-2019 ID:S9571:NVGaze: Anatomy-Aware Augmentation for Low-Latency, Near-Eye Gaze Estimation
Joohwan Kim (NVIDIA), Michael Stengel (NVIDIA)
Training a deep-learning gaze estimator requires a massive, diverse, and high-quality training set, which is challenging to produce because subjects must be photographed and their pupil position and gaze direction labeled accurately by hand. We'll describe how we created anatomically informed 3D eye and face models for infrared illumination and rendered near-eye images annotated with accurate gaze labels. We use 4 million synthetic eye images in combination with real-world images to train a near-eye gaze estimator for VR and AR headsets. Over a wide 40° × 30° field of view, the estimator achieves higher accuracy on real subjects than previous methods. The estimator uses an optimized network architecture that requires fewer convolutional layers than previous deep-learning-based gaze trackers, achieving low latency on desktop and mobile hardware. In addition to gaze estimation, we'll show how to use the network for robust pupil estimation and accurate remote-gaze estimation.