Harnessing the NVIDIA Ada Architecture for Frame-Rate Up-Conversion in the NVIDIA Optical Flow SDK

The NVIDIA Optical Flow SDK 4.0 is now available, enabling you to fully harness the new NVIDIA Optical Flow Accelerator on the NVIDIA Ada architecture with NvOFFRUC.

Optical flow on the NVIDIA Ada Lovelace architecture

Starting from the NVIDIA Turing architecture, NVIDIA GPUs have dedicated hardware for optical flow computation between a pair of frames. NVIDIA has continued to invest in improving the optical flow hardware engine in the NVIDIA Ampere architecture and NVIDIA Ada Lovelace architecture generations, thanks to the continued feedback from application developers and researchers.

Significant performance improvements

The optical flow algorithm requires certain pre- and post-processing steps to improve the quality of the flow vectors.

In the NVIDIA Turing and NVIDIA Ampere architecture generation GPUs, most of these algorithms use a compute engine to perform the required tasks. As a result, when the compute engine workload is high, the performance of the NVIDIA Optical Flow Accelerator (NVOFA) could be affected.

On NVIDIA Ada-generation GPUs, most of these algorithms are moved to dedicated hardware within the NVOFA, reducing the dependency on the compute engine significantly.

In addition, NVIDIA Ada-generation GPUs bring several other optimizations that reduce the overhead of interaction between the driver and the hardware. This improves overall performance, including context switching between the various hardware engines on the GPU.

With these changes, the NVOFA in the NVIDIA Ada Lovelace architecture is roughly 2x faster than the NVOFA in the NVIDIA Ampere architecture.

Quality improvements

Based on feedback on earlier generations of NVOFA, several quality improvements have been incorporated in the hardware. Using the same preset, you can see a 10-15% improvement in quality (tested on the KITTI2015 data set) compared to NVIDIA Ampere architecture GPUs.

For more information, see 1.4 NVOFA Quality and Performance.

Optical Flow SDK 4.0

The NVIDIA Optical Flow SDK enables you to access NVOFA functionality. The NVIDIA Optical Flow SDK is a set of Optical Flow C APIs, reusable C++ wrapper classes, and a set of sample applications. These APIs and C++ wrapper classes facilitate the programming of the NVOFA for the efficient computation of the optical flow between a pair of images.

Optical Flow SDK 4.0 comes with the following enhancements and features:

External hint support
NVIDIA Optical Flow-assisted Frame-Rate Up-Conversion (NvOFFRUC)

External hint support

When hints are generated from low-resolution images or are available from other sources such as a game engine, NVOFA can refine the hints further to improve the quality of the flow vectors.

Although external hint support is already available through the C API, it was missing from the C++ wrapper classes in earlier versions of the SDK.

Optical Flow SDK 4.0 adds the necessary support to the C++ classes, and the sample application AppOFCuda demonstrates the use of external hints. The hint format is the same as the output flow vector format: an array of NV_OF_FLOW_VECTOR structures. Each array element represents a motion vector for the corresponding block in raster scan order.
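The raster-scan layout described above can be illustrated with a small sketch. This is not SDK code: the function names, the grid_size parameter, and the rounding-up of partial blocks are assumptions made for illustration. It also assumes that each vector component is a signed 16-bit value in S10.5 fixed point (5 fractional bits), which is our understanding of NV_OF_FLOW_VECTOR from the SDK header; check that against the SDK version you use.

```python
import math
import struct

FRACTIONAL_BITS = 5  # assumed S10.5 layout: one unit equals 1/32 of a pixel


def hint_grid_dims(width, height, grid_size):
    """Blocks per row and per column; partial blocks at the edges are counted (assumption)."""
    return math.ceil(width / grid_size), math.ceil(height / grid_size)


def block_index(x, y, width, grid_size):
    """Raster-scan index of the block that contains pixel (x, y)."""
    blocks_per_row = math.ceil(width / grid_size)
    return (y // grid_size) * blocks_per_row + (x // grid_size)


def to_s10_5(value):
    """Quantize a displacement in pixels to a signed 16-bit fixed-point integer."""
    fixed = round(value * (1 << FRACTIONAL_BITS))
    return max(-32768, min(32767, fixed))  # clamp to the int16 range


def pack_hints(vectors):
    """Pack (dx, dy) pixel displacements as consecutive little-endian int16 pairs."""
    out = bytearray()
    for dx, dy in vectors:
        out += struct.pack("<hh", to_s10_5(dx), to_s10_5(dy))
    return bytes(out)
```

For a 1920x1080 frame and a hypothetical 4x4 grid, this gives 480x270 blocks, and the block covering pixel (0, 4) sits at index 480, the first element of the second row.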

AppOFCuda accepts hints in Middlebury .flo format and converts them into the required format (an array of NV_OF_FLOW_VECTOR structures) before passing them to the NVOF API. NVOFA prioritizes external hints when they are provided, so you are expected to provide hints of reasonable quality.
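As a rough sketch of this kind of conversion (not the code AppOFCuda uses), the following reads a Middlebury .flo payload and turns its floating-point vectors into integer pairs. The .flo layout (a float32 tag of 202021.25, then int32 width and height, then interleaved float32 u and v values in row-major order) is the published Middlebury format. The function names are hypothetical, and the sketch assumes that the file already holds one vector per hint block and that the target components use 5 fractional bits.

```python
import struct

FLO_MAGIC = 202021.25  # sanity-check tag at the start of every .flo file


def read_flo(data):
    """Parse .flo bytes into (width, height, [(u, v), ...]) in raster order."""
    magic, width, height = struct.unpack_from("<fii", data, 0)
    if magic != FLO_MAGIC:
        raise ValueError("not a Middlebury .flo payload")
    count = width * height
    # Flow values follow the 12-byte header as interleaved u, v floats.
    values = struct.unpack_from("<%df" % (2 * count), data, 12)
    return width, height, list(zip(values[0::2], values[1::2]))


def flo_to_hints(data):
    """Convert .flo floats to (flowx, flowy) integers with 5 fractional bits (assumed)."""
    width, height, vectors = read_flo(data)
    hints = [(round(u * 32), round(v * 32)) for u, v in vectors]
    return width, height, hints
```

A real converter would also need to clamp values to the 16-bit range and to handle any mismatch between the .flo resolution and the hint grid, both of which are left out here.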

NVIDIA Optical Flow-Assisted Frame-Rate Up-Conversion

Frame-rate up-conversion (FRUC) is a technique that generates higher frame-rate video from lower frame-rate video by inserting interpolated frames into it. Such high frame-rate video shows smooth continuity of motion across frames, improving the perceived visual quality of the video.

*Figure 1. Diagram of the NvOFFRUC process: interpolated frames are generated in between the original frames to create a smoother image*

The NvOFFRUC library exposes APIs that take two consecutive frames and generate an interpolated frame in between. The interpolation instant, that is, the point in time that the generated frame represents, does not have to be exactly in the middle of the two frames: it can be specified arbitrarily. For more information, see the NVOFA FRUC Programming Guide.
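To make the idea of an arbitrary interpolation point concrete, here is a minimal sketch of how an application might decide which frame pair and which in-between position each output frame needs when converting between two frame rates. It does not call NvOFFRUC, and the function name and return format are hypothetical. It assumes integer frame rates and evenly spaced source frames.

```python
from fractions import Fraction


def interpolation_schedule(source_fps, target_fps, num_source_frames):
    """Return (left_index, right_index, position) for each output frame.

    position is the fraction of the way from the left source frame to the
    right one; 0 means the left source frame can be reused unchanged.
    """
    step = Fraction(source_fps, target_fps)  # source-frame units per output frame
    last = num_source_frames - 1
    schedule = []
    k = 0
    while k * step <= last:
        t = k * step                 # output time measured in source frames
        left = int(t)                # floor, since t is never negative
        right = min(left + 1, last)  # the final frame has no right neighbour
        schedule.append((left, right, t - left))
        k += 1
    return schedule
```

Going from 30 to 60 fps, every second output frame lands at position 1/2, the midpoint. Going from 24 to 60 fps, the positions cycle through 0, 2/5, 4/5, 1/5, and 3/5, which is the kind of case where an arbitrary interpolation point is needed.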

These APIs can be used for up-conversion of any video content. Internally, the library uses the NVOFA hardware engine and CUDA compute cores. As a result, frame interpolation using the NvOFFRUC library is much faster than software-only methods.

For more information, see AV1 Encoding and FRUC: Video Performance Boosts and Higher Fidelity on the NVIDIA Ada Lovelace Architecture.

Optical Flow SDK 4.0 is available now.