Share Your Science: Microsoft Developing Applications for the Visually Impaired
Jul 18, 2016

Ken Tran, Senior Research Engineer at Microsoft Research, shares how his team is using deep learning to create applications for people who are blind or visually impaired.
Using Tesla M40 and TITAN X GPUs with the cuDNN-accelerated Caffe deep learning framework, they have trained their language model to describe images or scenes in natural language.
The work is being used in Microsoft’s research project Seeing AI, which uses computer vision and natural language processing to describe a person’s surroundings, read text, answer questions, and even identify emotions on people’s faces.
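For readers curious what GPU-accelerated training in Caffe looks like in practice, the following is a minimal sketch using the pycaffe interface. It is not the Seeing AI code; the solver file name is a hypothetical placeholder, and it simply shows how a model is trained on a GPU (with cuDNN-accelerated layers when Caffe is built with cuDNN support).

```python
# Minimal sketch: GPU-mode training with pycaffe.
# "captioning_solver.prototxt" is a hypothetical placeholder solver file,
# not Microsoft's actual Seeing AI configuration.
import caffe

caffe.set_device(0)   # select the first GPU, e.g. a Tesla M40 or TITAN X
caffe.set_mode_gpu()  # run forward/backward passes on the GPU

solver = caffe.SGDSolver('captioning_solver.prototxt')
solver.solve()        # train until the solver's max_iter is reached
```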
Share your GPU-accelerated science with us at http://nvda.ly/Vpjxr and with the world on #ShareYourScience.
Watch more scientists and researchers share how accelerated computing is benefiting their work at http://nvda.ly/X7WpH.