To help potential homebuyers get a 360-degree tour of a home, Zillow, the online real estate database company, recently launched a new app and service across North America that relies on machine learning to generate 3D walkthroughs of a home.
“Previously, 3D tours were only found on high-end or expensive homes, due to the high cost and time-intensive capture process,” said Josh Weisberg, Zillow’s Senior Director of Product Development, 3D Computer Vision. “Now with 3D Home, adding an immersive experience to a home listing is fast, easy, and free.”
The app relies on an iPhone’s camera, or a Ricoh Theta V or Theta Z1 camera, to capture the immersive 360-degree panoramas.
“What this means for the backend algorithms is that we need to support (1) high-quality hand-held 360-degree panorama stitching, and (2) automatic and smooth inter-panorama transitions,” the company wrote in a blog post.
Using NVIDIA TITAN Xp GPUs with the cuDNN-accelerated TensorFlow and PyTorch deep learning frameworks, the researchers trained a machine learning model to automatically stitch together multiple panoramas captured on a hand-held camera and to generate automatic, smooth transitions between the in-home panoramas.
The algorithm relies on both the videos captured on the device and its inertial measurement unit (IMU) motion data.
When a user opens the app, capture begins with a panorama at a user-selected location. The first panorama is followed by what the company describes as a link capture: the trajectory between the first and subsequent panorama locations.
“Our input for panorama generation is an upright rotating video at a user-selected capturing location. We first decompose the captured video into a sequence of image frames, then stitch them spatially and blend them photometrically into a panorama, as shown in the image below,” the researchers said.
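As a rough illustration of that pipeline, and not Zillow's actual implementation, the sketch below samples frames from a rotating capture and hands them to OpenCV's high-level stitcher; the file name and frame-sampling step are hypothetical choices.

```python
# Illustrative sketch only: decompose a rotating video into frames, then
# stitch them into a single panorama with OpenCV's built-in Stitcher.
import cv2

def video_to_panorama(video_path, frame_step=15):
    """Sample frames from a rotating capture and stitch them into a panorama."""
    capture = cv2.VideoCapture(video_path)
    frames = []
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % frame_step == 0:   # keep every Nth frame to limit overlap
            frames.append(frame)
        index += 1
    capture.release()

    stitcher = cv2.Stitcher_create(cv2.Stitcher_PANORAMA)
    status, panorama = stitcher.stitch(frames)
    if status != cv2.Stitcher_OK:
        raise RuntimeError(f"Stitching failed with status {status}")
    return panorama

# Example usage (hypothetical file names):
# pano = video_to_panorama("living_room_capture.mp4")
# cv2.imwrite("living_room_pano.jpg", pano)
```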
To correct for the parallax effect, the displacement in the apparent position of an object that occurs when the camera moves slightly, the team applied optical flow alignment to create a nonlinear, pixel-to-pixel warp. As a result, objects both near and far end up aligned in the final panorama.
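As a sketch of how flow-based alignment works in general, assuming OpenCV's Farneback dense optical flow rather than Zillow's own alignment code, the example below warps one overlapping frame onto another pixel by pixel.

```python
# A minimal sketch of flow-based alignment: dense optical flow produces a
# per-pixel displacement field, and cv2.remap applies it as a nonlinear warp.
import cv2
import numpy as np

def flow_align(reference, moving):
    """Warp `moving` onto `reference` using dense optical flow (Farneback)."""
    ref_gray = cv2.cvtColor(reference, cv2.COLOR_BGR2GRAY)
    mov_gray = cv2.cvtColor(moving, cv2.COLOR_BGR2GRAY)

    # Per-pixel displacement field from the reference frame to the moving frame
    flow = cv2.calcOpticalFlowFarneback(ref_gray, mov_gray, None,
                                        0.5, 4, 21, 3, 5, 1.1, 0)

    # Build the sampling map: each output pixel pulls from (x + dx, y + dy)
    h, w = ref_gray.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)

    return cv2.remap(moving, map_x, map_y, interpolation=cv2.INTER_LINEAR)
```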
To address image blending issues that can occur when panoramic frames are stitched together geometrically, the researchers applied a non-local exposure correction that adjusts underexposed regions and improves the overall brightness of the scene.
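The post does not spell out the non-local correction itself; the following is only a simplified, hypothetical stand-in that matches mean brightness between two overlapping frames and lifts underexposed pixels with a gamma curve.

```python
# Simplified exposure-matching sketch, not Zillow's non-local method: scale a
# darker frame toward a reference frame's mean luminance, then apply a mild
# gamma lift so underexposed regions brighten more than already-bright ones.
import cv2
import numpy as np

def match_exposure(reference, target, gamma_floor=0.6):
    """Adjust `target` so its overall brightness matches `reference`."""
    ref_l = cv2.cvtColor(reference, cv2.COLOR_BGR2GRAY).mean()
    tgt_l = cv2.cvtColor(target, cv2.COLOR_BGR2GRAY).mean()

    gain = ref_l / max(tgt_l, 1e-6)                    # global gain toward reference
    corrected = np.clip(target.astype(np.float32) * gain, 0, 255)

    # Gamma < 1 brightens dark regions more strongly than bright ones
    gamma = max(gamma_floor, min(1.0, tgt_l / max(ref_l, 1e-6)))
    corrected = 255.0 * (corrected / 255.0) ** gamma
    return corrected.astype(np.uint8)
```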
To link the panoramas between rooms, the algorithm uses accelerometer and gyroscope readings from the phone to classify the walking pattern for each link capture.
“The model should recognize walking steps from the raw, noisy IMU accelerometer data,” the team wrote in a blog post.
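Zillow trains a model for this task, but a classical baseline helps illustrate the problem: the sketch below, in which the sample rate, thresholds, and minimum peak spacing are assumed values, low-pass filters the accelerometer magnitude and counts peaks as steps.

```python
# Illustrative baseline only: peak detection on noisy accelerometer data to
# locate walking steps; Zillow's production approach is a learned model.
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def count_steps(accel_xyz, sample_rate_hz=100.0):
    """Estimate step timestamps from raw 3-axis accelerometer samples (N x 3)."""
    # Magnitude removes dependence on phone orientation
    magnitude = np.linalg.norm(accel_xyz, axis=1)
    magnitude -= magnitude.mean()                     # remove the gravity offset

    # Low-pass filter: walking cadence is roughly 1-3 Hz
    b, a = butter(2, 3.0 / (sample_rate_hz / 2.0), btype="low")
    smoothed = filtfilt(b, a, magnitude)

    # Peaks spaced at least 0.3 s apart count as individual steps
    peaks, _ = find_peaks(smoothed, height=0.5,
                          distance=int(0.3 * sample_rate_hz))
    return peaks / sample_rate_hz                     # step times in seconds
```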
You can read more about the techniques used to design the algorithms here.
The app was initially field tested in select markets during a pilot program in 2018. With the recent release, the app is now available across the United States and Canada.