AV1 Encoding and Optical Flow: Video Performance Boosts and Higher Fidelity on the NVIDIA Ada Architecture

Sep 22, 2022

By Rohit Naskulwar, Aurobinda Maharana, Hareshkumar Borse and Robert Jensen

Announced at GTC 2022, the next generation of NVIDIA GPUs (the NVIDIA GeForce RTX 40 series, NVIDIA RTX 6000 Ada Generation, and NVIDIA L40 for data center) is built on the new NVIDIA Ada Architecture.

The NVIDIA Ada Architecture features third-generation ray tracing cores, fourth-generation Tensor Cores, multiple video encoders, and a new optical flow accelerator.

To enable you to fully harness the new hardware upgrades, NVIDIA is announcing accompanying updates to the Video Codec SDK and Optical Flow SDK.

NVIDIA Video Codec SDK 12.0

AV1 is the state-of-the-art video coding format that offers both substantial performance boosts and higher fidelity compared to H.264, the popular standard. The Video Codec SDK first added AV1 decoding support with the NVIDIA Ampere Architecture. Now, with Video Codec SDK 12.0, NVIDIA Ada-generation GPUs also support AV1 encoding.

Hardware-accelerated AV1 encoding is a huge milestone in transitioning AV1 to be the new standard video format. Figure 1 shows how the AV1 bit-rate savings translate into impressive performance boosts and higher fidelity images.

PSNR (peak signal-to-noise ratio) is a measure of video quality. To achieve 42 dB PSNR, AV1 video needs a bit rate of 7 Mbps, while H.264 needs upwards of 12 Mbps. Across all resolutions, AV1 encoding is on average 40% more efficient than H.264. This fundamental performance difference opens the door for AV1 to support higher-quality video, increased throughput, and high dynamic range (HDR).
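
To make these numbers concrete, here is a minimal Python sketch of the standard PSNR definition and of the bit-rate reduction implied by the 7 Mbps and 12 Mbps figures above. The function names are illustrative and are not part of the Video Codec SDK. Because the H.264 figure is quoted as "upwards of" 12 Mbps, the computed saving should be read as a lower bound for this one example.

```python
import math

def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(distorted):
        raise ValueError("inputs must have the same number of samples")
    # Mean squared error between the reference and the distorted samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(max_value ** 2 / mse)

def bitrate_reduction(baseline_mbps, candidate_mbps):
    """Fraction of the baseline bit rate saved by the candidate at equal quality."""
    return 1 - candidate_mbps / baseline_mbps

# Figures quoted in the text: about 12 Mbps for H.264 and 7 Mbps for AV1 at 42 dB.
saving = bitrate_reduction(12.0, 7.0)
print(f"bit-rate reduction: {saving:.1%}")  # prints "bit-rate reduction: 41.7%"
```

A reduction of roughly 42% for this single operating point is in line with the 40% average quoted across resolutions.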

Figure 2. Bit-rate saving for AV1 compared to H.264. The bar chart shows that at 2160p, AV1 has a 1.45x bit-rate saving compared to NVENC H.264.

As Figure 2 shows, at 1440p and 2160p, NVENC AV1 is 1.45x more efficient than NVENC H.264. This new performance headroom enables higher image quality than before, including 8k.

The benefits of AV1 are best realized together with the multi-encoder design featured on the NVIDIA Ada Architecture. New in Video Codec SDK 12.0, on chips with multiple NVENC units the processing load is distributed evenly across the encoders, which work simultaneously. This optimization creates a huge reduction in encoding times. Multiple encoders in combination with the AV1 format allow NVIDIA Ada to support an incredible 8k at 60 fps video encode in real time.
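
As a rough illustration of why splitting the load matters, the following Python sketch computes the pixel throughput of 8k at 60 fps and the idealized share per encoder. It assumes that 8k means 7680x4320, that the split is perfectly even with no overhead, and it uses hypothetical encoder counts, because this post does not state how many NVENC units a given GPU has.

```python
def pixel_rate(width, height, fps):
    """Pixels that must be encoded per second for a given format."""
    return width * height * fps

def per_encoder_rate(total_rate, encoder_count):
    """Per-encoder load if the work is split evenly (idealized, no overhead)."""
    return total_rate / encoder_count

rate_8k60 = pixel_rate(7680, 4320, 60)  # assumes 8k means 7680x4320
rate_4k60 = pixel_rate(3840, 2160, 60)

print(rate_8k60)               # 1990656000 pixels per second
print(rate_8k60 // rate_4k60)  # 4, the pixel rate of four 2160p60 streams

for count in (1, 2, 3):        # hypothetical encoder counts
    print(count, per_encoder_rate(rate_8k60, count))
```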

AV1 encoding across multiple hardware NVENC is enabling the next generation of video performance and fidelity. Broadcasters can achieve higher livestream resolutions, video editors can export video at 2x speed, and all this is enabled by the Video Codec SDK.

Learn more about NVIDIA Video Codec SDK 12.0, which is available for download.

NVIDIA Optical Flow SDK 4.0

The new Optical Flow SDK 4.0 release introduces NVIDIA Optical Flow assisted frame rate up conversion (NvOFFRUC), which interpolates new frames using optical flow vectors to double the effective frame rate of a video. This results in smoother video playback and better perceived visual quality.

The NVIDIA Ada Lovelace Architecture has a new optical flow accelerator, NVOFA, that is 2.5x more performant than the NVIDIA Ampere Architecture NVOFA. It provides a 15% quality improvement on popular benchmarks including KITTI and MPI Sintel.

The NvOFFRUC library uses the NVOFA and CUDA to interpolate frames significantly faster than software-only methods. It also works seamlessly with custom DirectX or CUDA applications, making it easy for developers to integrate.

The Optical Flow SDK 4.0 includes the NvOFFRUC library and sample application, in addition to basic Optical Flow sample applications. The NvOFFRUC library exposes NVIDIA NvOFFRUC APIs that take two consecutive frames and return an interpolated frame in between them. These APIs can be used for the up-conversion of any video.

Frame interpolation using the NvOFFRUC library is extremely fast compared to software-only methods. The APIs are easy to use and support ARGB and NV12 input surface formats. The library can be directly integrated into any DirectX or CUDA application.

The sample application source code included in the SDK demonstrates how to use NvOFFRUC APIs for video frame rate up conversion. This source code can be reused or modified as required to build a custom application.

The Video 1 sample was created using the NvOFFRUC library. As you can see, the motion of foreground objects and background appears much smoother in the right video compared to the left video.

Video 1. Side-by-side comparison of original video and frame rate up-converted video. (left) Original video played at 15 fps. (right) Frame rate up-converted video played at 30 fps. Video created using the NvOFFRUC library. (Source: http://ultravideo.fi/#testsequences)

Inside the NvOFFRUC library

Here is a brief explanation about how the NvOFFRUC library processes a pair of frames and generates an interpolated frame.

A pair of consecutive frames (previous and next) are input into the NvOFFRUC library (Figure 4).

Using NVIDIA Optical Flow APIs, forward and backward flow vector maps are generated.

Flow vectors in the map are then validated using a forward-backward consistency check. Flow vectors that do not pass the consistency check are rejected. The black portions in this figure are flow vectors that did not pass the forward-backward consistency check.
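
A forward-backward consistency check can be sketched in a few lines of Python. This is a generic illustration of the technique, not the NvOFFRUC implementation: the flow maps are assumed to be lists of rows of `(dx, dy)` tuples, and the function name and the 1-pixel threshold are hypothetical choices.

```python
import math

def consistency_mask(forward, backward, threshold=1.0):
    """Return a mask that is True where a forward flow vector passes the check.

    A pixel passes if following its forward vector and then the backward
    vector found at the destination leads back close to the starting point.
    """
    height, width = len(forward), len(forward[0])
    mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            fx, fy = forward[y][x]
            tx, ty = round(x + fx), round(y + fy)  # destination in the next frame
            if not (0 <= tx < width and 0 <= ty < height):
                continue  # vector leaves the frame: rejected
            bx, by = backward[ty][tx]
            # For consistent flow, forward + backward is close to zero.
            mask[y][x] = math.hypot(fx + bx, fy + by) <= threshold
    return mask

# One-row example: every pixel moves one step to the right.
forward = [[(1, 0), (1, 0), (1, 0), (1, 0)]]
backward = [[(-1, 0), (-1, 0), (3, 0), (-1, 0)]]  # the vector at x=2 is inconsistent
print(consistency_mask(forward, backward))
# [[True, False, True, False]]
```

In the example, the pixel at x=1 is rejected because the backward vector at its destination does not point back to it, and the pixel at x=3 is rejected because its vector leaves the frame.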

Using available flow vectors and advanced CUDA accelerated techniques, more accurate flow vectors are generated to fill in the rejected flow vectors. Figure 7 shows the infilled flow vector map generated.
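
The post does not describe the infilling technique itself. As a simple stand-in, the sketch below fills each rejected vector with the average of its valid 4-connected neighbors and repeats until nothing more can be filled. The function name and data layout (rows of `(dx, dy)` tuples plus a boolean validity mask) are assumptions for illustration only.

```python
def infill_flow(flow, mask):
    """Fill rejected flow vectors from valid neighbors (simple stand-in method)."""
    height, width = len(flow), len(flow[0])
    flow = [row[:] for row in flow]   # work on copies, leave the inputs untouched
    mask = [row[:] for row in mask]
    while True:
        updates = []
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    continue
                neighbors = [
                    flow[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < height and 0 <= nx < width and mask[ny][nx]
                ]
                if neighbors:
                    n = len(neighbors)
                    mean = (sum(v[0] for v in neighbors) / n,
                            sum(v[1] for v in neighbors) / n)
                    updates.append((y, x, mean))
        if not updates:
            return flow  # nothing left to fill, or no valid vectors to fill from
        for y, x, vector in updates:
            flow[y][x] = vector
            mask[y][x] = True

flow = [[(1.0, 0.0), (9.0, 9.0), (3.0, 0.0)]]
mask = [[True, False, True]]          # the middle vector was rejected
print(infill_flow(flow, mask))
# [[(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]]
```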

Using a complete flow vector map between the two frames, the algorithm generates an interpolated frame between the two input frames. Such an image may contain a few holes (pixels that don't have a valid color). This figure shows a few small gray regions near the head of the horse and in the sky that are holes.
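
To see where holes come from, consider a minimal forward-warping sketch: each pixel of the previous frame is moved halfway along its flow vector, and any output pixel that nothing lands on remains a hole. This is an illustrative simplification, not the NvOFFRUC algorithm; the function name, the `HOLE` marker, and the single-channel frame layout are assumptions.

```python
HOLE = None  # marker for pixels without a valid color (an assumption of this sketch)

def interpolate_midpoint(prev_frame, flow, t=0.5):
    """Move each pixel of prev_frame a fraction t along its flow vector."""
    height, width = len(prev_frame), len(prev_frame[0])
    out = [[HOLE] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            tx, ty = round(x + t * dx), round(y + t * dy)
            if 0 <= tx < width and 0 <= ty < height:
                out[ty][tx] = prev_frame[y][x]  # later writes overwrite earlier ones
    return out

prev_frame = [[10, 20, 30, 40]]
flow = [[(0, 0), (0, 0), (2, 0), (2, 0)]]  # the two right-hand pixels move right
print(interpolate_midpoint(prev_frame, flow))
# [[10, 20, None, 30]]
```

The pixel at x=2 moves to x=3 and the pixel at x=3 leaves the frame, so nothing lands on x=2 and it is left as a hole.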

Holes in the interpolated frame are filled using image domain hole infilling techniques to generate the final interpolated image. This is the output of the NvOFFRUC library.
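
The specific image-domain infilling technique is not described in this post. One simple possibility, shown here only as an assumed stand-in, is to fill each hole with the mean of the valid pixels in its 3x3 neighborhood and repeat until no fillable holes remain. The function name and the `HOLE` marker are hypothetical.

```python
HOLE = None  # marker for pixels without a valid color (an assumption of this sketch)

def fill_holes(image):
    """Fill holes with the mean of valid pixels in the 3x3 neighborhood."""
    height, width = len(image), len(image[0])
    image = [row[:] for row in image]  # work on a copy
    while True:
        updates = []
        for y in range(height):
            for x in range(width):
                if image[y][x] is not HOLE:
                    continue
                values = [
                    image[ny][nx]
                    for ny in range(max(0, y - 1), min(height, y + 2))
                    for nx in range(max(0, x - 1), min(width, x + 2))
                    if image[ny][nx] is not HOLE
                ]
                if values:
                    updates.append((y, x, sum(values) / len(values)))
        if not updates:
            return image  # no holes left, or no valid pixels to fill from
        for y, x, value in updates:
            image[y][x] = value

print(fill_holes([[10, 20, HOLE, 30]]))
# [[10, 20, 25.0, 30]]
```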

The calling application can interleave this interpolated frame with the original frames to increase the frame rate of a video or game. Figure 10 shows the interpolated frame interleaved between the previous and next images.
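
On the application side, the interleaving step might look like the following Python sketch. The `interpolate` callback is a hypothetical placeholder for whatever produces the in-between frame (for example, a wrapper around the NvOFFRUC library); it is not an SDK function. Numbers stand in for frames to keep the example short.

```python
def interleave(frames, interpolate):
    """Insert one interpolated frame between each pair of consecutive frames."""
    output = []
    for previous, following in zip(frames, frames[1:]):
        output.append(previous)
        output.append(interpolate(previous, following))
    if frames:
        output.append(frames[-1])  # the last original frame has no successor
    return output

# Stand-in interpolator: the average of the two "frames".
frames = [0, 2, 4]
print(interleave(frames, lambda a, b: (a + b) / 2))
# [0, 1.0, 2, 3.0, 4]
```

For N input frames, this produces 2N - 1 output frames, so playing the result over the same duration roughly doubles the frame rate.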

Lastly, to expand the platforms that can harness the NVOFA hardware, Optical Flow SDK 4.0 also introduces support for Windows Subsystem for Linux.

Harness the NVIDIA Ada Architecture and the NvOFFRUC library with the NVIDIA Optical Flow SDK 4.0, now available. If you have any questions, contact Video DevTech Support.

About the Authors

About Rohit Naskulwar
Rohit Naskulwar is a Senior System Software Engineer at NVIDIA in the Multimedia driver and applications team. He has worked on VR and optical flow use cases on NVIDIA GPUs. Prior to NVIDIA, he worked at SIEMENS on PLM TeamCenter server-side development. Rohit holds a B.E. degree in Computer Engineering from the University of Pune, India.

About Aurobinda Maharana
Aurobinda Maharana is a Senior System Software Engineer at NVIDIA in the Multimedia driver team. He has worked on NVIDIA optical flow driver and application programming interface design. Previously, he worked on the NVIDIA video driver, NVIDIA streaming, and deep learning solutions. He has an M.Tech. degree in System Science and Automation from the Indian Institute of Science, Bengaluru, India.

About Hareshkumar Borse
Hareshkumar Borse is a Senior System Software Manager at NVIDIA in the Multimedia driver and applications team. He has worked on video, audio, 3D Vision, and optical flow use cases on NVIDIA GPUs. Prior to NVIDIA, he worked on video and graphics application development at C-DAC. He holds an M.Tech. degree in Communication Engineering from IIT Mumbai, India.

About Robert Jensen
Robbie is a product marketing manager at NVIDIA who is working to enable adoption of NVIDIA’s game development SDKs, especially Nsight Developer Tools. He holds a bachelor’s degree in Computer Science from Connecticut College.
