All NVIDIA GPUs starting with the Kepler generation support fully accelerated hardware video encoding, and all GPUs starting with the Fermi generation support fully accelerated hardware video decoding through the NVIDIA Video Codec SDK.
The NVENC preset design in Video Codec SDK 9.1 and earlier evolved around various NVENC use cases that emerged over time. The presets currently exposed in the NVENCODE API are as follows:
- Latency-sensitive encoding:
  - LOW_LATENCY_HP (Low latency high performance)
  - LOW_LATENCY_DEFAULT
  - LOW_LATENCY_HQ (Low latency high quality)
- Latency-tolerant encoding:
  - HP
  - DEFAULT
  - HQ
- Blu-ray-compatible encoding:
  - BD (Blu-ray-compatible)
- Lossless:
  - LOSSLESS_HP
  - LOSSLESS_DEFAULT
These presets evolved over the years, with multiple hardware generations and encoder features getting introduced in each generation. Because NVENCODE API guarantees binary backward compatibility, it has been difficult to reorganize the presets. This has led to many problems:
- A large number of presets: There are currently nine presets, but they often offer an inadequate tradeoff between quality and performance.
- Non-uniform preset behavior for different resolutions: For example, a 60% improvement in frames per second (FPS) is achievable on Turing at the cost of just 2% BD-BR degradation compared with the Low Latency High Performance preset.
- Quality compared to performance: The HQ preset doesn’t always provide the best quality in exchange for its lower performance. Up to 5% bitrate savings is achievable on some clips.
To address these issues and give you better control, new presets are being introduced in Video Codec SDK 10.
Preset design
The preset design in Video Codec SDK 10 is built to give finer-grained control over the tradeoff between performance and quality for NVENC. The following major changes have been introduced for improved flexibility:
- Tuning information: Specify the use scenario from the following options.
  - High quality
  - Low latency
  - Ultra-low latency
  - Lossless
- Predefined presets: Choose from a set of seven predefined presets, from P1 (fastest, lowest quality) to P7 (slowest, highest quality). The preset determines the set of tools used for the encoding, for example GOP structure, B frames, look-ahead encoding, and so on.
- Simplified rate control modes: Choose VBR, CBR, or constant-QP encoding, and set the desired bitrate and mode (1-pass or 2-pass). If you are using the 2-pass rate control mode, you can choose whether to run the first pass in quarter resolution.
Presets and tuning information parameters are orthogonal to each other. Mixing and matching results in 28 combinations for better control over the encoding process.
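The count of 28 is simply seven presets times four tuning options. As a minimal sketch, the combinations can be enumerated as shown here. The strings are labels taken from the wording of this article, not identifiers from the NVENCODE API headers.

```python
from itertools import product

# Labels follow the article's wording; they are illustrative names only,
# not constants from the NVENCODE API.
PRESETS = [f"P{i}" for i in range(1, 8)]  # P1 (fastest) .. P7 (highest quality)
TUNING_INFO = ["high quality", "low latency", "ultra-low latency", "lossless"]

# Because the two settings are orthogonal, every pairing is a valid choice.
combinations = list(product(PRESETS, TUNING_INFO))

print(len(combinations))
for preset, tuning in combinations[:4]:
    print(f"{preset} + {tuning}")
```

Each entry pairs one toolset (the preset) with one use scenario (the tuning information), which is why the two can be chosen independently.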
The following sections compare legacy and new presets at 1080p and 2160p resolution with the H.264 and H.265 codecs using the popular open source library FFmpeg. NVIDIA is integrating the new preset architecture into the FFmpeg support for NVENC soon.
Experiments
To compare legacy and new presets, we conducted four transcoding experiments.
H.264 transcoding:
- We encoded the raw FullHD YUV420 sequence to 50-Mbps, high-quality H.264 video stored in RAM.
- To minimize performance fluctuations, we transcoded the high-quality H.264 video to medium-quality H.264 fully on the GPU.
- We calculated the peak signal-to-noise ratio (PSNR) between the high- and medium-quality videos.
H.265 transcoding:
- We encoded the raw UltraHD YUV420 sequence to 80-Mbps, high-quality H.265 video stored in RAM.
- To minimize performance fluctuations, we transcoded the high-quality H.265 video to medium-quality H.265 fully on the GPU.
- We calculated the PSNR between the high- and medium-quality videos.
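The article does not say which tool computed the PSNR values. For reference, here is a minimal sketch of the standard definition, PSNR = 10 * log10(MAX^2 / MSE), assuming 8-bit samples (MAX = 255) and frames passed in as flat sequences of sample values. The function name and input layout are assumptions made for this illustration.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Return the PSNR in dB between two equal-length sequences of samples.

    `reference` and `distorted` are flat sequences of sample values
    (for example, the luma plane of one frame). `max_value` is the peak
    sample value, 255 for 8-bit video.
    """
    if len(reference) != len(distorted):
        raise ValueError("both inputs must contain the same number of samples")
    if len(reference) == 0:
        raise ValueError("inputs must not be empty")

    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs have no noise at all

    return 10.0 * math.log10(max_value ** 2 / mse)


# Every sample off by one: MSE is 1, so PSNR is 10 * log10(255^2).
print(round(psnr([100, 100, 100, 100], [101, 101, 101, 101]), 2))
```

A higher value means the medium-quality output is closer to the high-quality input. Identical inputs give an infinite PSNR.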
The following code example shows the FFmpeg command lines:
ffmpeg -c:v $decoder -hwaccel cuvid -i $input_file -c:v $encoder -preset $preset -tuning_info $info -y $out_file
Here’s the range of values:
$decoder: h264_cuvid, hevc_cuvid
$encoder: h264_nvenc, hevc_nvenc
$preset: 'P1', 'P2', 'P3', 'P4', 'P5', 'P6', 'P7', 'default', 'slow', 'medium', 'fast', 'hp', 'hq', 'bd', 'll', 'llhq', 'llhp'
$info: 'hq', 'll', 'ull'
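To run the template over the new presets, a small driver script can generate one command per preset and tuning value. The sketch below only builds and prints the commands. The `-tuning_info` option name is copied from the command line in this article and may be spelled differently in a given FFmpeg build (check `ffmpeg -h encoder=h264_nvenc`). The helper names and output file naming are hypothetical.

```python
import shlex

# Decoder and encoder names as listed in the article.
DECODERS = {"h264": "h264_cuvid", "hevc": "hevc_cuvid"}
ENCODERS = {"h264": "h264_nvenc", "hevc": "hevc_nvenc"}
NEW_PRESETS = ["P1", "P2", "P3", "P4", "P5", "P6", "P7"]
TUNING_INFO = ["hq", "ll", "ull"]  # lossless was excluded from the tests


def build_command(codec, preset, info, input_file, out_file):
    """Return the argument list for one run, mirroring the article's template."""
    return [
        "ffmpeg",
        "-c:v", DECODERS[codec],
        "-hwaccel", "cuvid",
        "-i", input_file,
        "-c:v", ENCODERS[codec],
        "-preset", preset,
        # Option name taken from the article; it is an assumption that
        # your FFmpeg build accepts it under this spelling.
        "-tuning_info", info,
        "-y", out_file,
    ]


def new_preset_commands(codec, input_file):
    """Yield one command per (preset, tuning info) pair for the new presets."""
    for preset in NEW_PRESETS:
        for info in TUNING_INFO:
            out_file = f"out_{codec}_{preset}_{info}.mp4"  # hypothetical naming
            yield build_command(codec, preset, info, input_file, out_file)


for command in new_preset_commands("h264", "input_hq.mp4"):
    print(shlex.join(command))
```

Passing each argument list to `subprocess.run` would execute the runs. Building lists instead of shell strings avoids quoting problems with file names.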
Lossless scenarios weren’t included in the testing, as they show high PSNR values and make the charts unreadable.
H.264 results comparison
We used two different input sequences for comparison: Traffic Flow and Ducks Take Off.
The Traffic Flow video features local, medium-speed motion of objects on a static background. Figure 1 shows that both old and new presets did a good job of providing a flexible choice between quality and performance. However, the Video Codec SDK 10 presets give you finer control and the results are closer to the trend line. For the end user, this means that the presets behave just as expected: P1 HQ is the fastest, P7 HQ gives the best quality, and the other presets are spread in between, giving almost linear scaling.
The Ducks Take Off sequence is difficult for encoders, as it has lots of local, high-magnitude, chaotic motion and water splash textures in the background. Figure 3 shows a big difference between old and new presets. For the old presets, the trend line does not run from the top-left corner of the chart to the bottom-right corner, which means that there is no consistent tradeoff between quality and performance. The results are scattered across the plot, so preset quality and performance behavior is unpredictable.
However, the Video Codec SDK 10 presets show a good trend line, with the P7 preset group in the top-left corner and the P1 and P2 presets in the bottom-right corner. The preset points are located near the trend line, giving predictable encoding performance and quality.
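One way to quantify "close to the trend line" is to fit a least-squares line to the (performance, quality) points and look at the slope and the coefficient of determination R². The sketch below assumes performance (FPS) on the horizontal axis and PSNR on the vertical axis, which matches the corner descriptions above. The numbers are made up purely for illustration and are not measurements from these experiments.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Return R^2 of the least-squares fit: 1.0 means all points lie on the line."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Illustrative values only, NOT results from the article.
fps = [100, 200, 300, 400]
predictable_psnr = [40.0, 39.0, 38.0, 37.0]  # quality falls steadily as speed rises
scattered_psnr = [38.0, 40.0, 37.0, 39.0]    # no consistent relationship

print(fit_line(fps, predictable_psnr)[0], r_squared(fps, predictable_psnr))
print(fit_line(fps, scattered_psnr)[0], r_squared(fps, scattered_psnr))
```

A clearly negative slope with R² near 1 corresponds to a predictable tradeoff. A slope near zero with low R² corresponds to the scattered behavior described for the old presets.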
H.265 results comparison
We used two different input sequences for comparison: Jockey and Kayaking.
The first sequence is Jockey, which features slow, local, and global motion. Figure 5 shows that the Video Codec SDK 10 presets behave very well. The points are spread uniformly near the trend line, while the old presets don’t give enough control over the tradeoff between performance and quality. Just one Low Latency preset is located near the middle of the trend line, which means users have few options.
The last 2160p sequence is Kayaking, which is difficult for encoders as it features chaotic, unpredictable motion and lots of water splash textures.
The situation is similar to the Ducks Take Off results. The new presets give you better control over the encoding process and more choice, unlike the old presets, which show high dispersion and an unpredictable relationship between quality and performance.
Conclusion
The Video Codec SDK 10 presets are designed to give you better control over the encoding process and simplify the choice by splitting up the control into two orthogonal settings: predefined presets that define the set of tools used during the encoding process, and tuning information parameters for scenarios. The new presets are already supported in the popular FFmpeg library.
For more information, see the latest Video Codec SDK release.