
Encoding for DirectX 12 with NVIDIA Video Codec SDK 11.1

DirectX 12 is a low-level programming API from Microsoft that reduces driver overhead compared to its predecessors. It provides more flexibility and finer-grained control over the underlying hardware through command queues, command lists, and similar constructs, which results in better resource utilization. You can use these capabilities to optimize your applications and get better performance than with earlier DirectX versions. In exchange, the application itself must take care of resource management, synchronization, and so on.

More and more game titles and other graphics applications are adopting the DirectX 12 APIs. Video Codec SDK 11.1 introduces DirectX 12 support for encoding on Windows 10 20H1 and later. This enables DirectX 12 applications to use NVENC across all generations of supported GPUs. The Video Codec SDK package contains the NVENCODE API headers, sample applications demonstrating their usage, and the programming guide for using the APIs. The sample applications contain C++ wrapper classes, which can be reused or modified as required.

typedef struct _NV_ENC_FENCE_POINT_D3D12
{
    void*       pFence;  /**< [in]: Pointer to ID3D12Fence. This fence object is
                                    used for synchronization. */
    uint64_t    value;   /**< [in]: Fence value to reach or exceed before the GPU
                                    operation, or fence value to set the fence to
                                    after the GPU operation. */
} NV_ENC_FENCE_POINT_D3D12;

The client application must also specify the input buffer format when initializing NVENC.

Even though most of the parameters passed to the Encode picture API in DirectX 12 are the same as those in other interfaces, there are certain functional differences. Synchronization at the input (the client application writing to the input surface and NVENC reading it) and at the output (NVENC writing the bitstream surface and the application reading it out) must be managed using fences. This is unlike previous DirectX interfaces, where synchronization was taken care of automatically by the OS runtime and driver.
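To make the fence contract concrete, here is a minimal CPU-only sketch of the idea. It is not DirectX 12 or NVENCODE API code: `MockFence`, `MockCommand`, and `MockQueue` are hypothetical names invented for illustration, and the "encode" step is a placeholder. The sketch only models the two rules described above: a consumer does not touch a surface until the producer's fence value has been reached or exceeded, and it sets its own fence once its work is done.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for a fence: a counter that only moves forward.
struct MockFence {
    uint64_t value = 0;
    void Signal(uint64_t v) { if (v > value) value = v; }
};

// One unit of queued work with a wait point before it and a signal point after it.
struct MockCommand {
    MockFence*            waitFence;    // fence to wait on (may be null)
    uint64_t              waitValue;    // value to reach or exceed before running
    std::function<void()> work;         // placeholder for the GPU operation
    MockFence*            signalFence;  // fence to set afterwards (may be null)
    uint64_t              signalValue;  // value to set the fence to
};

// Hypothetical in-order queue that stalls on unsatisfied waits.
class MockQueue {
public:
    void Submit(MockCommand cmd) {
        assert(static_cast<bool>(cmd.work));
        pending_.push_back(std::move(cmd));
    }

    // Runs commands in submission order and stops at the first one whose
    // wait point has not been reached. Returns how many commands ran.
    int Drain() {
        int executed = 0;
        while (!pending_.empty()) {
            MockCommand& cmd = pending_.front();
            if (cmd.waitFence && cmd.waitFence->value < cmd.waitValue) break;
            cmd.work();
            if (cmd.signalFence) cmd.signalFence->Signal(cmd.signalValue);
            pending_.pop_front();
            ++executed;
        }
        return executed;
    }

private:
    std::deque<MockCommand> pending_;
};
```

In this model, an "encode" command submitted with a wait on the input fence does nothing until the application signals that fence, and the application in turn knows the bitstream is ready only once the output fence has reached its signal value.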

In DirectX 12, the Encode picture API requires fences and fence values as additional input parameters. They are used to synchronize CPU-GPU and GPU-GPU operations. The application must pass pointers to the following input and output structs, which contain the fences and fence values, in NV_ENC_PIC_PARAMS::inputBuffer and NV_ENC_PIC_PARAMS::outputBitstream:

typedef struct _NV_ENC_INPUT_RESOURCE_D3D12
{
    NV_ENC_REGISTERED_PTR       pInputBuffer;
    NV_ENC_FENCE_POINT_D3D12    inputFencePoint;
} NV_ENC_INPUT_RESOURCE_D3D12;

typedef struct _NV_ENC_OUTPUT_RESOURCE_D3D12
{
    NV_ENC_REGISTERED_PTR       pOutputBuffer;
    NV_ENC_FENCE_POINT_D3D12    outputFencePoint;
} NV_ENC_OUTPUT_RESOURCE_D3D12;
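As a rough sketch of how these structs might be populated for one frame, the snippet below fills in an input and an output resource. The struct definitions are local stand-ins that mirror the fields quoted above so that the example compiles without the SDK; in a real application you would include nvEncodeAPI.h instead. `FrameResources` and `MakeFrameResources` are hypothetical helpers, and the reading that the input value is a wait point and the output value is a signal point is an assumption based on the field comment of NV_ENC_FENCE_POINT_D3D12.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins for the SDK declarations (assumption: same field layout
// as the structs quoted in this post; use nvEncodeAPI.h in real code).
typedef void* NV_ENC_REGISTERED_PTR;

struct NV_ENC_FENCE_POINT_D3D12 {
    void*    pFence;  // would point to an ID3D12Fence
    uint64_t value;   // value to wait for, or value to set
};

struct NV_ENC_INPUT_RESOURCE_D3D12 {
    NV_ENC_REGISTERED_PTR    pInputBuffer;
    NV_ENC_FENCE_POINT_D3D12 inputFencePoint;
};

struct NV_ENC_OUTPUT_RESOURCE_D3D12 {
    NV_ENC_REGISTERED_PTR    pOutputBuffer;
    NV_ENC_FENCE_POINT_D3D12 outputFencePoint;
};

// Hypothetical per-frame bundle. It has to outlive the encode call, because
// the output resource is needed again when the bitstream is locked.
struct FrameResources {
    NV_ENC_INPUT_RESOURCE_D3D12  input;
    NV_ENC_OUTPUT_RESOURCE_D3D12 output;
};

// Hypothetical helper that fills both resources for one frame.
FrameResources MakeFrameResources(NV_ENC_REGISTERED_PTR registeredInput,
                                  NV_ENC_REGISTERED_PTR registeredOutput,
                                  void* inputFence, uint64_t inputReadyValue,
                                  void* outputFence, uint64_t outputDoneValue) {
    assert(inputFence != nullptr && outputFence != nullptr);

    FrameResources frame{};
    // Input: the encoder should not read until the fence reaches this value.
    frame.input.pInputBuffer           = registeredInput;
    frame.input.inputFencePoint.pFence = inputFence;
    frame.input.inputFencePoint.value  = inputReadyValue;

    // Output: the fence is set to this value once the bitstream is written.
    frame.output.pOutputBuffer           = registeredOutput;
    frame.output.outputFencePoint.pFence = outputFence;
    frame.output.outputFencePoint.value  = outputDoneValue;
    return frame;
}
```

With a bundle like this, the addresses of `frame.input` and `frame.output` are what would go into NV_ENC_PIC_PARAMS::inputBuffer and NV_ENC_PIC_PARAMS::outputBitstream. Incrementing the two fence values per frame is one common way to keep successive frames distinguishable.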

To retrieve the encoded output in the asynchronous mode of operation, the application should wait on a completion event before calling NvEncLockBitstream. In the synchronous mode of operation, the application can call NvEncLockBitstream directly, because the NVENCODE API makes sure that encoding has finished before returning the encoded output. In both cases, however, the client application should pass, in NV_ENC_LOCK_BITSTREAM::outputBitstream, a pointer to the same NV_ENC_OUTPUT_RESOURCE_D3D12 that was used in the NvEncEncodePicture call.
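The control flow for the two modes can be sketched as follows. This is an illustration of the call ordering only: `EncodeMode`, `RetrievalHooks`, and `RetrieveOutput` are hypothetical names, and the two callbacks are placeholders for waiting on the completion event and for calling NvEncLockBitstream with the output resource pointer.

```cpp
#include <cassert>
#include <functional>

enum class EncodeMode { Synchronous, Asynchronous };

// Hypothetical hooks standing in for the real calls.
struct RetrievalHooks {
    // Placeholder for blocking on the completion event (asynchronous mode only).
    std::function<void()> waitForCompletionEvent;
    // Placeholder for NvEncLockBitstream; receives the pointer that would be
    // stored in NV_ENC_LOCK_BITSTREAM::outputBitstream. Returns true on success.
    std::function<bool(void* outputResource)> lockBitstream;
};

// Retrieves one encoded frame. outputResource must be the same
// NV_ENC_OUTPUT_RESOURCE_D3D12 pointer that was passed to the encode call.
bool RetrieveOutput(EncodeMode mode, void* outputResource,
                    const RetrievalHooks& hooks) {
    assert(outputResource != nullptr);
    if (mode == EncodeMode::Asynchronous) {
        // Asynchronous mode: wait for the completion event first.
        hooks.waitForCompletionEvent();
    }
    // Both modes: lock the bitstream using the original output resource.
    return hooks.lockBitstream(outputResource);
}
```

The only difference between the two modes in this sketch is the extra wait; the pointer handed to the lock step is the same in both.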

For more information, see the Video Codec SDK programming guide. Encoder performance in DirectX 12 is close to that of the other DirectX interfaces. Encoder quality is the same across all interfaces.
