Share Your Science: Microsoft Developing Applications for the Visually Impaired
Jul 18, 2016

Ken Tran, Senior Research Engineer at Microsoft Research, shares how the team is using deep learning to create applications for people who are blind or visually impaired.
Using NVIDIA Tesla M40 and TITAN X GPUs with the cuDNN-accelerated Caffe deep learning framework, the team trained a language model that describes images and scenes in natural language.
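Microsoft has not published the Seeing AI models or training code, but a minimal pycaffe sketch of the GPU-accelerated feature-extraction step that image-captioning pipelines of this era typically started from might look like the following. The file names (`deploy.prototxt`, `weights.caffemodel`, `street_scene.jpg`) and the `fc7` layer are hypothetical placeholders, assuming a CaffeNet/VGG-style CNN encoder whose features would then be fed to the captioning language model:

```python
import caffe

# Run on the first GPU; when Caffe is built with cuDNN,
# this path is cuDNN-accelerated.
caffe.set_device(0)
caffe.set_mode_gpu()

# Hypothetical model files -- the actual Seeing AI models are not public.
net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)

# Standard Caffe preprocessing: HWC->CHW, RGB->BGR, [0,1]->[0,255].
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_channel_swap('data', (2, 1, 0))
transformer.set_raw_scale('data', 255.0)

# Load an example image and run a forward pass.
image = caffe.io.load_image('street_scene.jpg')
net.blobs['data'].data[...] = transformer.preprocess('data', image)
net.forward()

# Visual features (here assumed to live in an 'fc7' layer) that a
# captioning language model would consume to generate a description.
features = net.blobs['fc7'].data.copy()
```

In a captioning system of this kind, the CNN acts as an image encoder and a recurrent language model decodes the feature vector into a sentence one word at a time.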
The work powers Seeing AI, a Microsoft research project that uses computer vision and natural language processing to describe a person's surroundings, read text, answer questions, and even identify emotions on people's faces.
Share your GPU-accelerated science with us at http://nvda.ly/Vpjxr and with the world on #ShareYourScience.
Watch more scientists and researchers share how accelerated computing is benefiting their work at http://nvda.ly/X7WpH.