
Improving Video Quality and Performance with AV1 and NVIDIA Ada Lovelace Architecture


Jan 18, 2023

By Prathap Muthana, Sampurnananda Mishra and Abhijit Patait



AV1 is the new gold standard video format, with superior efficiency and quality compared to the older H.264 and H.265 (HEVC) formats. It is the most recent royalty-free, high-efficiency video coding format standardized by the Alliance for Open Media.

NVIDIA Ampere architecture introduced hardware-accelerated AV1 decoding. NVIDIA Ada Lovelace architecture supports both AV1 encoding and decoding. NVIDIA Ada architecture also brings back support for multiple encoders per GPU (up to three encoders and four decoders per GPU), enabling higher throughput compared to previous generations.

NVIDIA NVENC AV1 performance

NVIDIA NVENC AV1 offers substantially better compression efficiency than H.264 and HEVC, at higher performance. To quantify the quality improvements, we measured peak signal-to-noise ratio (PSNR) and video multimethod assessment fusion (VMAF) scores for AV1 and H.264. PSNR and VMAF are video quality metrics frequently used to gauge encoding quality.

PSNR score

PSNR is a decibel value that quantifies the reconstruction quality of images. It is the ratio between the maximum possible power of the signal (the original image or video) and the power of the noise introduced by compression. As shown in Figure 1, NVENC AV1 encoding results in ~1.5-2 dB higher PSNR compared to NVENC H.264 at the same bit rate. In other words, to achieve the same PSNR, H.264 encoding requires a considerably higher bit rate than AV1. For example, AV1 achieves 42 dB PSNR at 7 Mbps, whereas H.264 needs 11 Mbps.
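The definition above can be made concrete with a minimal sketch. It assumes 8-bit samples (peak value 255) and takes each image as a flat sequence of pixel values. The function name `psnr_db` is illustrative and is not part of any NVIDIA tool.

```python
import math


def psnr_db(original, compressed, peak=255):
    """Return PSNR in dB between two equal-length sequences of pixel values."""
    if len(original) != len(compressed) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error: the average power of the compression noise.
    mse = sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs: no noise at all
    # Ratio of peak signal power to noise power, on a logarithmic (dB) scale.
    return 10 * math.log10(peak ** 2 / mse)
```

Because the scale is logarithmic, a gain of 1.5 to 2 dB at the same bit rate corresponds to roughly 29% to 37% lower mean squared error.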

This translates into a bit rate savings of about 40% for AV1 over H.264 at 1080p60 at similar quality. Measured against H.264 with a low-latency preset, the savings reach up to 40%, representing more than 1.8 GB of data saved over two hours of 1080p video streamed at 5 Mbps. Similar bit rate savings were observed at 720p, 1440p, and 4K.
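The arithmetic behind these figures can be checked with a short sketch. It assumes a constant bit rate, decimal units (1 GB = 10^9 bytes), and that the 5 Mbps figure refers to the H.264 stream. The helper names are illustrative.

```python
def data_volume_gb(bit_rate_mbps, duration_s):
    """Data volume in GB (10^9 bytes) for a constant-bit-rate stream."""
    bits = bit_rate_mbps * 1_000_000 * duration_s
    return bits / 8 / 1_000_000_000


def savings_fraction(reference_mbps, new_mbps):
    """Fraction of the reference bit rate saved by the new bit rate."""
    return 1 - new_mbps / reference_mbps


two_hours_s = 2 * 60 * 60

# Two hours at 5 Mbps is 4.5 GB; a 40% saving on that is about 1.8 GB.
h264_gb = data_volume_gb(5, two_hours_s)
saved_gb = h264_gb * 0.40

# The PSNR example (7 Mbps for AV1 versus 11 Mbps for H.264 at 42 dB)
# works out to a saving of roughly 36%.
psnr_example_savings = savings_fraction(11, 7)
```

If 5 Mbps is instead read as the AV1 bit rate, the equivalent H.264 stream would need about 8.3 Mbps and the saving over two hours would be about 3 GB, which is consistent with "more than 1.8 GB."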

VMAF score

VMAF is a video quality metric with high correlation to human perception of streaming video quality. The VMAF scores plotted in Figure 2 were collected with the same set of videos used for the PSNR evaluation. NVENC AV1 outperforms NVENC H.264 in terms of quality. The advantage is largest at low bit rates, so AV1 provides better visual quality in bandwidth-constrained QoS scenarios. As expected, the gap in perceptual quality between H.264 and AV1 encoded videos narrows as the bit rate increases.

Video 1 shows a quality comparison of AV1 video encoded on an NVIDIA Ada Lovelace architecture GPU versus H.264 video encoded using x264 software. The H.264 video is encoded using medium presets at 30 Mbps, while the AV1 video is encoded at 18 Mbps using the high-performance presets. The quality of both videos is comparable. The throughput of the AV1 encoder is 500 fps, almost 9x faster than the x264 encoder.

Video 1. Quality comparison of an AV1 stream encoded with NVENC at 18 Mbps versus an H.264 stream encoded with x264 at 30 Mbps

Performance in frames per second across resolutions/presets

NVENC performance has been steadily increasing with every generation. The NVIDIA Turing and NVIDIA Ampere GPU architectures both had one encoder per chip, while the NVIDIA Ada architecture can support up to three encoders per chip.

With NVIDIA Ada architecture, the driver handles the load balancing among the multiple encoders automatically. This enables any application to take advantage of the multiple NVENCs, and the higher encoder throughput they provide, without any special code. However, the throughput is subject to clocks, hardware performance limits, and available memory.

The NVENCODE API exposes several presets, rate control modes, and tuning information modes for programming the hardware for different use cases. A combination of these parameters enables video encoding at varying quality and performance levels. This lets the application choose the desired quality versus encoding performance tradeoff at a granular level.

Max resolution support

Table 2 shows the max resolution support for AV1, HEVC, and H.264. NVIDIA Ada is the first generation of GPUs supporting 10-bit 8K60 encoding for AV1 and HEVC.

The dedicated encoder hardware, NVENC, can perform 8- and 10-bit AV1 encoding in addition to 8-bit H.264 and 8- and 10-bit HEVC encoding. For more details about NVENC capabilities, see the NVIDIA Hardware Video Encoder documentation.

The hardware-accelerated video encoding and decoding functionality is accessible to applications through NVENCODE and NVDECODE APIs, respectively, which are a part of the NVIDIA Video Codec SDK.

NVIDIA Video Codec SDK 12.0 features

Video Codec SDK 12.0, which was released in November 2022, contains support for NVIDIA Ada Lovelace GPU hardware, along with the new features detailed below.

Split encoding 8K60

Video Codec SDK 12.0 on NVIDIA Ada GPUs supports a feature called Split Frame Encoding for AV1 and HEVC, which can encode frames with resolutions greater than 4K using multiple encoders, whenever available. With this feature, the frame is split into two parts. If the GPU contains multiple encoders, each part is sent to a different encoder. This helps improve the overall encoding performance.

This feature is enabled automatically only at high resolutions, under the conditions shown in Table 3. Note that splitting the frame across independent encoders may result in quality that is suboptimal compared to that achieved by encoding the entire frame on a single encoder. Therefore, this method of performance improvement is not enabled across all presets and resolutions.

Tuning info / Preset	p1 (fastest)	p2	p3	p4	p5	p6	p7 (slowest)
High quality	Split frame	Split frame	Normal	Normal	Normal	Normal	Normal
Low latency	Split frame	Split frame	Split frame	Split frame	Normal	Normal	Normal
Ultra-low latency	Split frame	Split frame	Split frame	Split frame	Normal	Normal	Normal

Table 3. Preset and tuning criteria that determine when split encoding is enabled

If certain features in NVENC are enabled, split encoding gets disabled automatically regardless of whether the tuning and preset conditions outlined in Table 3 are met. The features not compatible with split frame encoding are listed below.

HEVC

Weighted prediction
Alpha layer
Subframe mode
Bitstream output into video memory
Picture timing / buffering period SEI message insertion on the DX12 path

AV1

Bitstream output into video memory
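The rules in Table 3 and the incompatibility lists above can be summarized in a small lookup. This is only a sketch for reasoning about when split encoding is expected. The actual decision is made inside the driver, and every name below (function, dictionary keys, feature labels) is a hypothetical label rather than an NVENCODE API identifier. The resolution condition is passed in as a flag because the exact threshold is not given here.

```python
# Slowest preset (p1 = fastest ... p7 = slowest) that Table 3 marks "Split frame".
SPLIT_FRAME_MAX_PRESET = {
    "high_quality": 2,
    "low_latency": 4,
    "ultra_low_latency": 4,
}

# Features listed as incompatible with split frame encoding, per codec.
INCOMPATIBLE_FEATURES = {
    "hevc": {
        "weighted_prediction",
        "alpha_layer",
        "subframe_mode",
        "bitstream_output_in_video_memory",
        "timing_sei_on_dx12",
    },
    "av1": {"bitstream_output_in_video_memory"},
}


def split_frame_expected(codec, tuning, preset, exceeds_4k, num_encoders, features=()):
    """Return True if split frame encoding is expected under the documented rules."""
    if not 1 <= preset <= 7:
        raise ValueError("preset must be in 1..7 (p1..p7)")
    if codec not in INCOMPATIBLE_FEATURES:
        return False  # the feature is described for AV1 and HEVC only
    if tuning not in SPLIT_FRAME_MAX_PRESET:
        return False  # tuning mode not covered by Table 3
    if not exceeds_4k or num_encoders < 2:
        return False  # needs a high resolution and more than one encoder
    if preset > SPLIT_FRAME_MAX_PRESET[tuning]:
        return False  # Table 3 lists "Normal" for this preset
    # Any incompatible feature disables split encoding regardless of Table 3.
    return not (set(features) & INCOMPATIBLE_FEATURES[codec])
```

For example, `split_frame_expected("hevc", "low_latency", 4, True, 3)` is true under these rules, but adding `features=["alpha_layer"]` makes it false.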

Multiple NVENCs for higher throughput

Some NVIDIA Ada GPUs have more than one NVENC. This enables support for encoding more streams in parallel. When encoding a single stream, its frames are sent to the NVENCs one after another rather than in parallel. Therefore, using multiple NVENCs does not improve the throughput when encoding a single video stream, but it can increase the overall throughput when encoding two or more video streams in parallel. On GPUs with multiple NVENCs, frames from different streams are scheduled across the NVENCs, keeping all of them fully utilized and thereby increasing the throughput.

More NVENCs also help in video editing workflows, in which independent sections of a video (split at GOP boundaries) can be sent to different NVENCs. Such splitting of the video to be encoded can be performed manually by the user (sections with scene changes or different clips being put together, for example) or automatically by the application.

As an example, a video can be split into three time slots: t₀-t₁, t₁-t₂, and t₂-t₃, where t₀, t₁, t₂, and t₃ are monotonically increasing times in the video. With multiple encoders, the resulting shorter segments can be encoded in parallel, giving a higher overall encoding throughput.
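The splitting step can be sketched as follows. The sketch assumes a fixed GOP length so that segment boundaries can be computed from frame counts alone; a real application might cut at scene changes or clip boundaries instead. `encode_segment` is a hypothetical placeholder for an actual encode session, and all names here are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


def split_at_gop_boundaries(total_frames, gop_length, num_segments):
    """Split [0, total_frames) into up to num_segments ranges that start on GOP boundaries."""
    if total_frames <= 0 or gop_length <= 0 or num_segments <= 0:
        raise ValueError("all arguments must be positive")
    total_gops = -(-total_frames // gop_length)  # ceiling division
    num_segments = min(num_segments, total_gops)
    base, extra = divmod(total_gops, num_segments)
    segments = []
    start_gop = 0
    for i in range(num_segments):
        gops = base + (1 if i < extra else 0)  # spread leftover GOPs over the first segments
        start = start_gop * gop_length
        end = min((start_gop + gops) * gop_length, total_frames)
        segments.append((start, end))
        start_gop += gops
    return segments


def encode_segment(segment):
    """Placeholder for a real encode session; returns the frame count it would encode."""
    start, end = segment
    return end - start


def encode_in_parallel(segments):
    """Submit one independent encode job per segment and collect results in order."""
    with ThreadPoolExecutor(max_workers=len(segments)) as pool:
        return list(pool.map(encode_segment, segments))
```

Each segment starts on a GOP boundary, so it can be encoded independently of the others, which is what allows the jobs to be distributed across encoders.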

Batch encoding is a feature that leverages multiple encoders. This feature is useful for transcode-type workloads. Transcoding involves decoding an encoded input stream, scaling it, and re-encoding it in the desired formats and resolutions. This is easily done on NVIDIA Ada GPUs, as the driver automatically handles the load balancing of the decoded stream and splits the work among the encoders.

Support for AV1 in FFmpeg

FFmpeg is the most popular multimedia tool and is used extensively for video and audio transcoding. FFmpeg supports NVENC-accelerated AV1 encode and NVDEC-accelerated AV1 decode. Applications using FFmpeg now have access to GPU-accelerated AV1 encoding and decoding.

Summary

The superior PSNR and VMAF scores, bit rate savings, and split encoding performance of AV1 compared to existing codecs make it a very attractive option for video encoding. NVIDIA Ada GPUs support AV1, and this support can be accessed through the latest version of the NVIDIA Video Codec SDK.


About the Authors

About Prathap Muthana
Prathap Muthana is a senior product manager at NVIDIA focusing on Video Technologies and Data Center products. He has worked at NVIDIA for over 15 years. Prior to his role as a product manager, Prathap worked in the hardware engineering group focusing on signal integrity, power distribution, and substrate designs. He holds a graduate degree in engineering from Georgia Tech and an MBA from Cornell University.


About Sampurnananda Mishra
Sampurnananda Mishra is a senior manager at NVIDIA responsible for the multimedia driver. He has worked on a variety of multimedia use cases supported on NVIDIA GPUs. His interests include video coding, computer vision, video security, deep learning, and system software. He holds a master's degree in electrical engineering, specializing in digital signal processing, from IIT Kanpur, India.


About Abhijit Patait
Abhijit Patait is the senior director of Multimedia and AI Software at NVIDIA, responsible for audio/video technologies, algorithms, SDKs, drivers, and cloud applications. His general interests include signal processing, audio/speech and video encoding and enhancement algorithms, VoIP, and wireless/baseband. Prior to NVIDIA, Abhijit worked in senior management roles at Motorola, Ericsson, and Ditech Networks (acquired by Nuance/Microsoft). Abhijit is a regular speaker at GTC and other conferences. Abhijit holds an MSEE degree from Missouri S&T University and an MBA from University of California, Berkeley.
