Piloting a drone or an unmanned vehicle using only your gaze sounds like a scene out of a science fiction movie, but it is now a reality. Researchers from the University of Pennsylvania, New York University, and collaborators developed a deep learning system that uses NVIDIA GPUs to let a user control a drone simply by looking where they want it to go.

“We show how a set of glasses equipped with [a] gaze tracker, a camera, and an Inertial Measurement Unit (IMU) can be used to (a) estimate the relative position of the human with respect to a quadrotor, (b) decouple the gaze direction from the head orientation, and (c) allow the human [to] spatially task (i.e., send new 3D navigation waypoints to) the robot in an uninstrumented environment,” the researchers stated in their video.

The work, first highlighted by IEEE Spectrum, describes how the researchers used deep learning and a pair of NVIDIA Jetson TX2 GPUs, one mounted onboard the drone and one paired with specialized eye-tracking glasses, to control the drone.

What makes this system unique is that it is self-contained: the drone doesn’t rely on external sensors.

Using NVIDIA GeForce GTX 1080 Ti GPUs and the cuDNN-accelerated TensorFlow deep learning framework, the team trained their neural networks to compute the 3D navigation waypoint from the 2D gaze coordinate provided by the glasses. They also trained the networks to perform object detection.
“Our pipeline is able to successfully achieve human-guided autonomy for spatial tasking,” the researchers said. “[We are able to] compute a pointing vector from the glasses and then randomly select the waypoint depth within a predefined safety zone. Ideally, the 3D navigation waypoint would come directly from the eye tracking glasses, but we found in our experiments that the depth component reported by the glasses was too noisy to use effectively. In the future, we hope to further investigate this issue in order to give the user more control over depth.”
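
The idea of a pointing vector combined with a randomly chosen depth can be illustrated with a small geometric sketch. This is not the researchers' code, and their system uses trained neural networks for this step; the pinhole camera model, the function name `gaze_to_waypoint`, the intrinsics, the head rotation matrix, and the 1 to 3 meter safety zone are all assumptions made for illustration.

```python
import math
import random

def gaze_to_waypoint(gaze_px, intrinsics, head_rotation, glasses_position,
                     depth_range=(1.0, 3.0), rng=random):
    """Turn a 2D gaze pixel into a 3D waypoint (hypothetical helper).

    gaze_px:          (u, v) gaze coordinate in the glasses' camera image
    intrinsics:       (fx, fy, cx, cy) of an assumed pinhole camera
    head_rotation:    3x3 rotation (camera frame -> world frame), e.g. from the IMU
    glasses_position: [x, y, z] of the glasses in the world frame
    depth_range:      assumed safety zone, as distance along the gaze ray in meters
    """
    u, v = gaze_px
    fx, fy, cx, cy = intrinsics

    # Back-project the pixel into a unit ray in the camera frame.
    ray = [(u - cx) / fx, (v - cy) / fy, 1.0]
    norm = math.sqrt(sum(c * c for c in ray))
    ray = [c / norm for c in ray]

    # Rotate the ray into the world frame so that the pointing vector
    # no longer depends on which way the head is turned.
    ray_world = [sum(head_rotation[i][j] * ray[j] for j in range(3))
                 for i in range(3)]

    # The measured gaze depth is treated as unusable, so pick a distance
    # uniformly at random inside the safety zone instead.
    distance = rng.uniform(*depth_range)
    return [p + distance * r for p, r in zip(glasses_position, ray_world)]

# Example: looking straight through the image center with a level head.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
waypoint = gaze_to_waypoint((320, 240), (500.0, 500.0, 320.0, 240.0),
                            IDENTITY, [0.0, 0.0, 0.0])
```

In this sketch the waypoint always lies on the gaze ray, and only its distance from the user is random, which mirrors the trade-off the researchers describe: the direction comes from the user, while the depth does not.
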
Since the tracking glasses don’t have much computing power, they are connected to a Jetson TX2 GPU. When a user puts on the glasses and looks at the drone, the GPU detects the drone and its location relative to the user.
“The proposed approach can be employed in a wide range of scenarios including inspection, first response, and it can be used by people with disabilities that affect their mobility,” the researchers said.
The paper has been submitted to ICRA 2019 and IEEE Robotics and Automation Letters.