Ken Tran, Senior Research Engineer at Microsoft Research, shares how his team is using deep learning to create applications for people who are blind or visually impaired.
Using Tesla M40 and TITAN X GPUs with the cuDNN-accelerated Caffe deep learning framework, the team trained a model that describes images and scenes in natural language.
The work powers Seeing AI, a Microsoft research project that uses computer vision and natural language processing to describe a person's surroundings, read text, answer questions, and even identify emotions on people's faces.
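Image-captioning systems like the one described typically pair a vision model that extracts image features with a language model that emits a caption one word at a time. The sketch below illustrates only the decoding step, using greedy word-by-word selection; the `toy_next_word_scores` function and its tiny vocabulary are hypothetical stand-ins for a trained decoder, not Microsoft's actual model.

```python
# Minimal sketch of greedy caption decoding. In a real system, a trained
# network (e.g. a Caffe CNN encoder + recurrent decoder) would score the
# next word from image features; here a toy lookup table stands in for it.

def toy_next_word_scores(image_features, prefix):
    """Hypothetical stand-in for a trained decoder: maps a caption prefix
    to scores over the next word. A real model would also use the image."""
    vocab = {
        (): {"a": 1.0},
        ("a",): {"person": 0.9, "dog": 0.1},
        ("a", "person"): {"walking": 0.8, "<end>": 0.2},
        ("a", "person", "walking"): {"<end>": 1.0},
    }
    return vocab.get(tuple(prefix), {"<end>": 1.0})

def greedy_caption(image_features, max_len=10):
    """Repeatedly pick the highest-scoring next word until <end>."""
    words = []
    for _ in range(max_len):
        scores = toy_next_word_scores(image_features, words)
        best = max(scores, key=scores.get)
        if best == "<end>":
            break
        words.append(best)
    return " ".join(words)

print(greedy_caption(image_features=None))  # -> "a person walking"
```

Production systems usually replace the greedy choice with beam search, which keeps several candidate prefixes at each step and tends to produce more fluent captions.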
Share your GPU-accelerated science with us at http://nvda.ly/Vpjxr and with the world on #ShareYourScience.
Watch more scientists and researchers share how accelerated computing is benefiting their work at http://nvda.ly/X7WpH
Share Your Science: Microsoft Developing Applications for the Visually Impaired
Jul 18, 2016
