Computer-generated holograms powered by deep learning could make real-time 3D holography feasible on laptops and smartphones, an advancement with potential applications in fields including virtual reality, microscopy, and 3D printing.
Published this week in Nature, an MIT study outlines a novel approach called tensor holography, where researchers trained and optimized a convolutional neural network to create holograms from images with depth information. The compact network requires under 1 MB of memory and crafts holograms in milliseconds.
“People previously thought that with existing consumer-grade hardware, it was impossible to do real-time 3D holography computations,” said lead author Liang Shi, Ph.D. student in MIT’s Department of Electrical Engineering and Computer Science. “It’s often been said that commercially available holographic displays will be around in 10 years, yet this statement has been around for decades.”
Old-school holograms use laser beams to depict a static scene with both color and a sense of depth. Traditionally, computer-generated holography has relied on supercomputers to simulate this optical setup, making it possible to create digital holograms that can also capture motion and be easily reproduced and shared.
Simulating the underlying physics of a hologram, however, is computationally intensive; rendering a single holographic image can take minutes on a supercomputer cluster.
“Because each point in the scene has a different depth, you can’t apply the same operations for all of them,” Shi said. “That increases the complexity significantly.”
To speed things up and increase the photorealistic precision of the holograms, the researchers turned to deep learning.
The team created custom high-quality training data — the first such database for 3D holograms — made up of 4,000 images with depth information, and a corresponding 3D hologram for each image. Training on NVIDIA Tensor Core GPUs, the CNN learned to generate accurate holograms from images with depth information, which can be acquired by standard modern smartphones with multi-camera setups or LiDAR sensors.
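For a sense of what this kind of training loop looks like, here is a minimal sketch of a compact convolutional network that maps an RGB-D image (color plus depth) to a two-channel hologram and is trained on paired examples. This is not the authors' tensor holography network; the architecture, tensor shapes, and hyperparameters are illustrative assumptions.

```python
# Illustrative sketch only: a small CNN mapping RGB-D images to holograms.
# Architecture and hyperparameters are assumptions, not MIT's actual network.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

class HologramCNN(nn.Module):
    def __init__(self, hidden=24, layers=8):
        super().__init__()
        blocks = [nn.Conv2d(4, hidden, 3, padding=1), nn.ReLU()]   # RGB + depth in
        for _ in range(layers - 2):
            blocks += [nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU()]
        blocks += [nn.Conv2d(hidden, 2, 3, padding=1)]             # amplitude + phase out
        self.net = nn.Sequential(*blocks)

    def forward(self, rgbd):
        return self.net(rgbd)

# Random placeholder tensors standing in for the paired RGB-D / hologram data.
rgbd = torch.rand(16, 4, 192, 192)       # batch of RGB-D inputs
target = torch.rand(16, 2, 192, 192)     # matching hologram targets
loader = DataLoader(TensorDataset(rgbd, target), batch_size=4)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = HologramCNN().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

for epoch in range(2):                    # real training would run far longer
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)       # penalize deviation from the target hologram
        loss.backward()
        optimizer.step()
```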
Using an NVIDIA TITAN RTX GPU and the NVIDIA TensorRT SDK for inference, the optimized neural network runs in real time, achieving a speedup of more than two orders of magnitude compared to physical simulation. The final model uses just 617 kilobytes of memory, allowing it to run interactively on low-power AI chips on mobile and edge devices.
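A common way to deploy a trained PyTorch network with TensorRT is to export it to ONNX and build an optimized engine from that file. The sketch below shows that workflow under the same assumed shapes as above; the stand-in model, file names, and settings are placeholders, not the researchers' actual pipeline.

```python
# Illustrative sketch only: exporting a trained RGB-D-to-hologram network to
# ONNX so TensorRT can build an optimized real-time engine from it.
# The stand-in model and file names are assumptions.
import torch
from torch import nn

model = nn.Sequential(                      # stand-in for the trained network
    nn.Conv2d(4, 24, 3, padding=1), nn.ReLU(),
    nn.Conv2d(24, 2, 3, padding=1),
).eval()

example = torch.rand(1, 4, 192, 192)        # one RGB-D frame
torch.onnx.export(model, example, "hologram_cnn.onnx",
                  input_names=["rgbd"], output_names=["hologram"],
                  opset_version=17)

# A TensorRT engine can then be built and benchmarked from the ONNX file,
# for example with the trtexec tool that ships with TensorRT:
#   trtexec --onnx=hologram_cnn.onnx --fp16 --saveEngine=hologram_cnn.plan
```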
The resulting method not only accelerated the process, but also produced holograms with accurate occlusion and per-pixel focal control, improving the images’ realism.
Read the news release from MIT and visit the researchers’ project page. The full paper can be found in Nature.