End-of-Life Notice

The VRWorks 360 video SDK is no longer in development. Support, updates and bug fixes are no longer provided. Users who have previously downloaded the SDK can continue to use it on supported hardware and CUDA versions.


VRWorks 360 Video is NVIDIA’s SDK to enable VR developers and content creators to capture, stitch, and stream 360-degree videos.

The SDK supports 360-degree mono and stereo stitching, post-production (offline) and real-time workflows.

360-degree video processing is complex and computationally intensive. By leveraging NVIDIA GPUs, VRWorks 360 Video SDK provides a high-performance, high-quality and low-latency GPU-accelerated implementation that can be integrated into 360 video workflows.

Real-time stereo stitching is particularly challenging. The basic process involves ingesting, decoding, and calibrating multiple high-resolution streams, stereo stitching them, and encoding the result. The entire pipeline has to execute in real time while maintaining the highest level of image quality. To accomplish this, NVIDIA has developed a new set of motion-based algorithms for superior-quality stereo stitching that are optimized for real-time processing. Using consecutive frames, the algorithms estimate the motion of objects in a video stream, noting how they match and move across a seam while accounting for stereo disparity.
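The SDK's stereo algorithms themselves are proprietary, but the underlying idea of motion estimation between consecutive frames can be illustrated with classic block matching. The following is a minimal CPU sketch of that general technique, not the SDK's implementation; all names are ours.

```cpp
#include <climits>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Sum of absolute differences between an 8x8 block at (bx, by) in the
// previous frame and the same block displaced by (dx, dy) in the next
// frame. Frames are flat 8-bit grayscale arrays of the given width.
static long sad8x8(const std::vector<uint8_t>& prev,
                   const std::vector<uint8_t>& next,
                   int width, int bx, int by, int dx, int dy) {
    long sum = 0;
    for (int y = 0; y < 8; ++y)
        for (int x = 0; x < 8; ++x)
            sum += std::abs(int(prev[(by + y) * width + (bx + x)]) -
                            int(next[(by + dy + y) * width + (bx + dx + x)]));
    return sum;
}

// Exhaustive block-matching search over displacements in [-range, range].
// The caller must keep the block and all candidate displacements inside
// the frame; a production stitcher would run such a search densely on the GPU.
static void estimateMotion(const std::vector<uint8_t>& prev,
                           const std::vector<uint8_t>& next,
                           int width, int bx, int by, int range,
                           int* bestDx, int* bestDy) {
    long best = LONG_MAX;
    for (int dy = -range; dy <= range; ++dy)
        for (int dx = -range; dx <= range; ++dx) {
            long cost = sad8x8(prev, next, width, bx, by, dx, dy);
            if (cost < best) { best = cost; *bestDx = dx; *bestDy = dy; }
        }
}
```

Within the overlap of two cameras, matches found this way can be further constrained by the disparity expected between the left and right views, which is what lets a stitcher keep seams consistent in both eyes and over time.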

Hardware: Compatible with NVIDIA GPU architectures from Maxwell through Turing (GeForce GTX 900 series up to Quadro RTX 8000).

Note: recommended specifications for 360 stitching:
Mono: GeForce GTX 1060 6GB / Quadro P4000
Stereo offline: GeForce GTX 1080 / Quadro P4000
Stereo live: dual GeForce GTX 1080 Ti / dual Quadro P6000

Software (Windows): 64-bit Windows
NVIDIA graphics driver 411.63 or later
Microsoft Visual Studio 2015 (MSVC 14.0) or later
CUDA 10.0 Toolkit
CMake 3.2 or later

Linux: Ubuntu 16.04 or later, Fedora 25 or later


360-degree video

360-degree videos have gained popularity due to their ability to encompass a wide field of view. The proliferation of virtual reality devices has given rise to an increasing need for videos with a 360-degree field of view that deliver an enhanced sense of immersion and presence.

360-degree videos are generated by aligning multiple streams and combining them appropriately in the areas of overlap to produce a seamless 360-degree view. Capturing them requires a rig with a sufficient number of cameras, each with a wide enough field of view, to cover the full 360-degree space.


[Video: content provided by Kevin Alderweireldt of Yume VR, featuring Umamido; full result available in over/under format]




[Video: content provided by Z Cam]

360-degree video pipeline



Ingest

The VRWorks 360 Video SDK can ingest MP4-compressed video. It can also read raw RGB files as well as CUDA arrays. The SDK natively supports hardware-accelerated decode of compressed input. It currently supports up to 32 video inputs, which should accommodate most 360 camera arrays. Audio can be included in the input videos or ingested from an external source, and there is no restriction on the number of audio inputs.
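The SDK exposes a C API whose exact types ship with the download; the structure below is only a hypothetical sketch (the names are ours, not the SDK's) of the kind of per-stream description the ingest stage needs.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical per-stream ingest description; the real SDK defines its
// own C structs, but the information carried is of this kind.
enum class InputKind { Mp4File, RgbFile, CudaArray };

struct InputStream {
    InputKind kind;          // compressed MP4, raw RGB frames, or a CUDA array
    std::string uri;         // file path for file-based inputs
    uint32_t width, height;  // frame dimensions in pixels
    double fps;              // nominal frame rate
    bool hasAudio;           // audio may also come from an external source
};

// Up to 32 video inputs are supported; audio inputs are unrestricted.
struct IngestConfig {
    std::vector<InputStream> videoInputs;  // size() <= 32
    std::vector<std::string> audioInputs;  // external audio sources, any count
};
```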

Decode

NVIDIA GPUs contain one or more hardware-based decoders and encoders (separate from the CUDA cores) that provide fully accelerated hardware-based video decoding and encoding for most video formats. With decoding and encoding offloaded, the graphics engine and the CPU are free for other operations.

During import, video files are decompressed and decoded into a raw format for image processing. The VRWorks 360 Video SDK drives this dedicated hardware through NVDEC (part of the NVIDIA Video Codec SDK).

Calibrate

During the calibration process, the individual video streams are calibrated in relation to each other. The SDK calibrates for lens distortion as well as the rotation and translation between the cameras in the rig. Using intrinsic and extrinsic parameters particular to the rig, the individual streams are aligned and calibrated, then stitched into a single 360-degree video. The SDK also supports camera calibration without requiring input estimates for the camera parameters: it automatically computes estimates of focal length, rotation, principal point, and fisheye radius if they are not provided. Non-homogeneous camera rigs are also supported as long as the constraints on overlap are satisfied; such rigs can mix cameras with different resolutions, lens distortion types, or focal lengths. The SDK also includes auto-balancing for automatic equatorial alignment of the camera rig.
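To make "intrinsic parameters" concrete: for a fisheye lens, the intrinsics tie each pixel's distance from the principal point to the angle of the incoming ray. Below is a minimal sketch of the common equidistant fisheye model; this is standard textbook math used for illustration, not necessarily the SDK's exact distortion model.

```cpp
#include <cmath>

// Equidistant fisheye model: radial distance from the principal point is
// proportional to the ray's angle from the optical axis, r = f * theta.
struct FisheyeIntrinsics {
    double fx, fy;  // focal length in pixels
    double cx, cy;  // principal point in pixels
};

// Back-project pixel (u, v) to a unit ray (x, y, z) in camera coordinates.
void pixelToRay(const FisheyeIntrinsics& K, double u, double v,
                double* x, double* y, double* z) {
    double nx = (u - K.cx) / K.fx;
    double ny = (v - K.cy) / K.fy;
    double r = std::sqrt(nx * nx + ny * ny);  // normalized radial distance
    double theta = r;                         // equidistant: theta = r
    double s = (r > 1e-9) ? std::sin(theta) / r : 1.0;
    *x = nx * s;
    *y = ny * s;
    *z = std::cos(theta);
}
```

The extrinsic parameters (per-camera rotation and translation) then rotate such rays into a common rig coordinate frame so that all streams can be aligned on the sphere.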

Learn more about the calibration process in this developer blog: Calibrating Stitched Videos with VRWorks 360 Video SDK

Stitch

The process of generating 360-degree videos typically begins with capturing monocular footage from multiple cameras that is then stitched into a single 360-degree video. In the case of monoscopic 360-degree video, the output is a single 360-degree panorama generated by aligning the input videos and blending them in the areas of overlap. Stereoscopic stitching produces a pair of 360-degree panoramas, one for each eye; this projection model is often referred to as omnidirectional stereo (ODS). To avoid artifacts, the process requires careful handling of camera calibration, parallax in the source videos, and temporal consistency, which makes stitching complex and computationally challenging.
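Concretely, each pixel of an equirectangular panorama names a direction on the sphere, and the stitcher fills it by sampling whichever calibrated camera (or blend of cameras) sees that direction. A minimal sketch of the standard pixel-to-direction mapping:

```cpp
#include <cmath>

static const double kPi = 3.14159265358979323846;

// Map an equirectangular pixel (u, v) to a unit direction on the sphere.
// Longitude spans [-pi, pi) across the width; latitude spans
// [pi/2, -pi/2] from the top row to the bottom row.
void equirectToDirection(int u, int v, int width, int height,
                         double* x, double* y, double* z) {
    double lon = ((u + 0.5) / width) * 2.0 * kPi - kPi;
    double lat = kPi / 2.0 - ((v + 0.5) / height) * kPi;
    *x = std::cos(lat) * std::sin(lon);
    *y = std::sin(lat);
    *z = std::cos(lat) * std::cos(lon);
}
```

For omnidirectional stereo, this mapping is evaluated twice per pixel, with the ray origins offset along a viewing circle whose diameter approximates the interpupillary distance, yielding one panorama per eye.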

The VRWorks 360 Video SDK is a GPU-accelerated solution designed to achieve real-time stitching of images into stereo 360-degree videos. Its performance scales across GPUs, allowing multiple GPUs to produce a live stitched 4K 30 fps stream from an eight-camera 4K 30 fps input.



The VRWorks 360 Video SDK offers real-time solutions for both monoscopic and stereoscopic 360 stitching. Both modes use GPU-accelerated decoding and encoding to achieve real-time ingest and output of high-resolution streams. Mono stitching uses GPU-accelerated multiband blending to avoid over-smoothing, distortion, and visible seams. Stereo stitching is more computationally intensive and applies motion-vector estimation to produce seamless stereoscopic output regardless of the captured scene.
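Multiband blending hides seams by mixing low-frequency content gradually across the whole overlap while switching high-frequency detail abruptly at the seam, so edges are never ghosted. Below is a minimal two-band, single-scanline sketch of the idea (our illustration, not the SDK's GPU implementation; assumes the overlap is at least two pixels wide).

```cpp
#include <vector>

// Box blur as a cheap low-pass filter (radius 1, clamped at the edges).
static std::vector<float> lowpass(const std::vector<float>& s) {
    std::vector<float> out(s.size());
    for (size_t i = 0; i < s.size(); ++i) {
        float a = s[i > 0 ? i - 1 : 0];
        float c = s[i + 1 < s.size() ? i + 1 : s.size() - 1];
        out[i] = (a + s[i] + c) / 3.0f;
    }
    return out;
}

// Two-band blend of overlapping scanlines `left` and `right` (equal
// length): low frequencies mix over the whole overlap (smooth mask),
// while high frequencies switch abruptly at the seam (hard mask), so
// fine detail is never ghosted.
std::vector<float> twoBandBlend(const std::vector<float>& left,
                                const std::vector<float>& right) {
    size_t n = left.size();
    std::vector<float> lowL = lowpass(left), lowR = lowpass(right);
    std::vector<float> out(n);
    for (size_t i = 0; i < n; ++i) {
        float smooth = float(i) / float(n - 1);  // gradual mask for low band
        float hard = (i < n / 2) ? 0.0f : 1.0f;  // hard seam for high band
        float low = (1 - smooth) * lowL[i] + smooth * lowR[i];
        float highL = left[i] - lowL[i], highR = right[i] - lowR[i];
        out[i] = low + (1 - hard) * highL + hard * highR;
    }
    return out;
}
```

The full technique applies the same split over a multi-level Laplacian pyramid and runs per pixel on the GPU.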



Depth-based mono stitch is a newer stitching pipeline that uses depth-based alignment to improve stitching quality in scenes with objects close to the camera rig and across the region of overlap between two cameras.


Region-of-interest stitch enables adaptive stitching by defining the desired field of view rather than stitching a complete panorama. This opens up new use cases such as 180-degree VR and can reduce execution time, since less of the sphere has to be stitched.
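Because equirectangular width and height are proportional to horizontal and vertical field of view, restricting the region of interest shrinks the output, and the stitching work, proportionally. A hypothetical sketch (the structure and names are ours, not the SDK's API):

```cpp
#include <cstdint>

// Hypothetical region-of-interest specification: stitch only the given
// angular window instead of the full 360x180-degree panorama.
struct RoiSpec {
    double yawDeg;    // horizontal center of the window, in degrees
    double pitchDeg;  // vertical center of the window, in degrees
    double hFovDeg;   // horizontal field of view, e.g. 180 for 180-degree VR
    double vFovDeg;   // vertical field of view, up to 180
};

// Output size scales linearly with the angular window: a 180x180-degree
// window needs half the pixels (and roughly half the work) of a full
// panorama at the same angular resolution.
void roiOutputSize(const RoiSpec& roi, uint32_t fullWidth, uint32_t fullHeight,
                   uint32_t* outWidth, uint32_t* outHeight) {
    *outWidth = uint32_t(fullWidth * roi.hFovDeg / 360.0);
    *outHeight = uint32_t(fullHeight * roi.vFovDeg / 180.0);
}
```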



Moveable seams enable developers to adjust the seam location within the region of overlap between two cameras, preserving visual fidelity particularly when objects are close to the camera.
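One way to see why seam placement matters: within the overlap, the two cameras disagree most where parallax is strongest, i.e. on nearby objects. The sketch below scores candidate seam columns by that disagreement (our illustration; in the SDK the seam position is a developer-controlled adjustment, but a cost like this can guide where to place it).

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Given two aligned overlap images (flat grayscale arrays), pick the
// column where the cameras disagree least; placing the seam there avoids
// cutting through objects that exhibit parallax between the two views.
size_t bestSeamColumn(const std::vector<float>& camA,
                      const std::vector<float>& camB,
                      size_t width, size_t height) {
    size_t best = 0;
    float bestCost = -1.0f;
    for (size_t x = 0; x < width; ++x) {
        float cost = 0.0f;
        for (size_t y = 0; y < height; ++y)
            cost += std::fabs(camA[y * width + x] - camB[y * width + x]);
        if (bestCost < 0.0f || cost < bestCost) { bestCost = cost; best = x; }
    }
    return best;  // seam column within the overlap
}
```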

Encode

During export, video files are encoded and compressed into the desired delivery format. The VRWorks 360 Video SDK drives the dedicated hardware through NVENC (part of the NVIDIA Video Codec SDK).

Output

The VRWorks 360 Video SDK supports several output formats: RGB textures, H.264-compressed streams, and MP4 files. Compression is GPU-accelerated, and equirectangular projection is supported. The SDK gives the user the option of multiplexing the audio and video into an MP4 file or outputting audio and video as separate files.

Audio

The SDK supports both AAC and PCM audio and allows a user-specified gain to be applied. Audio stereo spread can optionally be applied to the incoming streams during blending.

Audio can be included in the input videos or ingested from an external source. Blended audio output can be muxed into the output panoramic video or output as a separate audio stream.

Ambisonic audio is a technique for recording, mixing, and playing back 3D 360-degree audio. The ambisonic pipeline enables 3D, omnidirectional audio in which the perceived direction of sound sources changes as viewers change their orientation, resulting in a much more immersive 360-degree video experience.
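For reference, first-order ambisonics encodes a mono source into four channels (W, X, Y, Z) from its direction of arrival; at playback, the decode is rotated to match the viewer's head orientation, which is what makes sound sources stay fixed in the scene. A minimal sketch of the standard first-order (traditional B-format) encoding equations, not SDK-specific code:

```cpp
#include <cmath>

// Encode a mono sample into first-order ambisonics (traditional B-format).
// azimuth/elevation give the source direction; W carries the omni
// component with the conventional 1/sqrt(2) gain.
void encodeBFormat(float sample, float azimuthRad, float elevationRad,
                   float* W, float* X, float* Y, float* Z) {
    *W = sample * 0.7071068f;                                    // omnidirectional
    *X = sample * std::cos(azimuthRad) * std::cos(elevationRad); // front-back
    *Y = sample * std::sin(azimuthRad) * std::cos(elevationRad); // left-right
    *Z = sample * std::sin(elevationRad);                        // up-down
}
```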

Warp 360

Warp 360 provides highly optimized image warping and distortion removal by converting images between a number of projection types, including perspective, fisheye, and equirectangular. It can transform equirectangular stitched output into projection formats such as cubemap to reduce streaming bandwidth and improve performance.
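Conceptually, the equirectangular-to-cubemap conversion evaluates, for every cube-face pixel, the 3D direction it represents and samples the equirectangular image at the matching longitude and latitude. A minimal sketch for one face (standard math, not the SDK's API):

```cpp
#include <cmath>

static const double kPi = 3.14159265358979323846;

// For a pixel on the +Z cube face, compute the normalized equirectangular
// coordinates (u, v) in [0, 1) to sample from. Other faces differ only in
// how the face coordinates map to the 3D direction (x, y, z).
void cubeFaceToEquirect(int px, int py, int faceSize, double* u, double* v) {
    // Face coordinates in [-1, 1]; the +Z face looks down the z axis.
    double a = 2.0 * (px + 0.5) / faceSize - 1.0;
    double b = 2.0 * (py + 0.5) / faceSize - 1.0;
    double x = a, y = -b, z = 1.0;                        // direction (unnormalized)
    double lon = std::atan2(x, z);                        // [-pi, pi]
    double lat = std::atan2(y, std::sqrt(x * x + z * z)); // [-pi/2, pi/2]
    *u = (lon + kPi) / (2.0 * kPi);
    *v = (kPi / 2.0 - lat) / kPi;
}
```

Because a cubemap avoids the heavy oversampling that equirectangular projection suffers near the poles, the same visual quality fits in fewer pixels, which is where the bandwidth savings come from.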