Share Your Science: Microsoft Developing Applications for the Visually Impaired
Jul 18, 2016

Ken Tran, Senior Research Engineer at Microsoft Research, shares how the team is using deep learning to create applications for people who are blind or visually impaired.
Using NVIDIA Tesla M40 and TITAN X GPUs with the cuDNN-accelerated Caffe deep learning framework, the team trained a language model that describes images and scenes in natural language.
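Microsoft has not published the Seeing AI models or training code, but a minimal pycaffe sketch of the GPU-accelerated feature-extraction step that image-captioning pipelines of this era typically started from might look like the following. The file names (`deploy.prototxt`, `weights.caffemodel`, `street_scene.jpg`) and the `fc7` layer are hypothetical placeholders, assuming a CaffeNet/VGG-style CNN encoder whose features would then be fed to the captioning language model:

```python
import caffe

# Run on the first GPU; when Caffe is built with cuDNN,
# this path is cuDNN-accelerated.
caffe.set_device(0)
caffe.set_mode_gpu()

# Hypothetical model files -- the actual Seeing AI models are not public.
net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)

# Standard Caffe preprocessing: HWC->CHW, RGB->BGR, [0,1]->[0,255].
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_channel_swap('data', (2, 1, 0))
transformer.set_raw_scale('data', 255.0)

# Load an example image and run a forward pass.
image = caffe.io.load_image('street_scene.jpg')
net.blobs['data'].data[...] = transformer.preprocess('data', image)
net.forward()

# Visual features (here assumed to live in an 'fc7' layer) that a
# captioning language model would consume to generate a description.
features = net.blobs['fc7'].data.copy()
```

In a captioning system of this kind, the CNN acts as an image encoder and a recurrent language model decodes the feature vector into a sentence one word at a time.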
The work powers Seeing AI, a Microsoft research project that uses computer vision and natural language processing to describe a person's surroundings, read text, answer questions, and even identify emotions on people's faces.
Share your GPU-accelerated science with us at http://nvda.ly/Vpjxr and with the world on #ShareYourScience.
Watch more scientists and researchers share how accelerated computing is benefiting their work at http://nvda.ly/X7WpH.