Profiling#
Introduction#
Profiling efficiently and accurately measures the end-to-end pipeline latency and the per-module latency, including the initialization, transmission, submission, and execution latency. A latency report is printed to the console when the application exits. The latency data can also be dumped for further analysis and for diagnosing performance issues.
Functionality and Behavior#
The pipeline consists of multiple modules derived from CBaseModule, which are connected to each other through NvStreams. Each module instantiates a profiler, which records the initialization time (once per instance) and, for every packet, the transmission time, submission time, and execution time. If the module is the last one in its pipeline, the profiler also records the pipeline time for every packet. The terms are defined as follows.
Initialization time
The time to initialize the current module
Transmission time
The time from when the previous module sends the packet until the current module receives it.
Submission time
The time to submit the task to the hardware engine.
Execution time
The time from submitting the current task to the completion of that task.
Pipeline time
The processing time of a packet from the first module to the last module of the same pipeline. By default, only the last module of the pipeline records pipeline latency.
All times except the transmission time are recorded by the module itself, with no interference from other modules. The transmission time spans both the sending of the packet in the upstream module and the receiving of the packet in the downstream module, so processing in the upstream module may affect the value measured downstream.
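The definitions above can be illustrated with a minimal sketch that derives the per-packet latencies from a set of timestamps. The `PacketTimestamps` and `PacketLatenciesMs` structures and the `computeLatencies` function are hypothetical names chosen for this illustration and are not part of the sample application or of NvPlayfair. The sketch also assumes that all timestamps come from one shared clock, are expressed in microseconds, and that execution time is measured from the end of submission.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-packet timestamps, in microseconds on one shared clock.
struct PacketTimestamps {
    std::uint64_t captureStartUs;  // frame capture start (first module)
    std::uint64_t upstreamSendUs;  // upstream module sends the packet
    std::uint64_t receiveUs;       // current module receives the packet
    std::uint64_t submitStartUs;   // current module starts submitting the task
    std::uint64_t submitDoneUs;    // task handed over to the hardware engine
    std::uint64_t completeUs;      // hardware engine completes the task
};

// Hypothetical per-packet result, in milliseconds.
struct PacketLatenciesMs {
    double transmission;
    double submission;
    double execution;
    double pipeline;  // meaningful only in the last module of a pipeline
};

// Converts a [start, end] interval in microseconds to milliseconds.
inline double toMs(std::uint64_t startUs, std::uint64_t endUs) {
    assert(endUs >= startUs);
    return static_cast<double>(endUs - startUs) / 1000.0;
}

inline PacketLatenciesMs computeLatencies(const PacketTimestamps& t) {
    PacketLatenciesMs out{};
    // Transmission spans two modules: sent upstream, received downstream.
    out.transmission = toMs(t.upstreamSendUs, t.receiveUs);
    // Submission and execution are local to the current module.
    out.submission = toMs(t.submitStartUs, t.submitDoneUs);
    out.execution = toMs(t.submitDoneUs, t.completeUs);  // assumption: starts at end of submission
    // Pipeline time covers the packet from capture start to completion here.
    out.pipeline = toMs(t.captureStartUs, t.completeUs);
    return out;
}
```

Only the transmission value depends on a timestamp taken in another module, which is why it is the one measurement that upstream processing can influence.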
Example:
./nvsipl_multicast -c F008A120RM0AV2_CPHY_x4 -m "0x0001 0 0 0" -p "multicast" -K -r 4
The profiling report is then shown in the console as follows. It shows the pipeline latency of Enc and Cuda because those two modules are the last modules in their respective pipelines.
============ Perf. Report: multicast_channel_Enc_0 pipeline latency
[NOTE] All time-values are in MSEC
------------------------------------
min : 41.201
max : 42.234
average : 41.259
99.99-pct : 42.234
Std. Deviation : 0.183
Num. of Samples : 116
============ Perf. Report: multicast_channel_Cuda_0 pipeline latency
[NOTE] All time-values are in MSEC
------------------------------------
min : 32.672
max : 32.816
average : 32.685
99.99-pct : 32.816
Std. Deviation : 0.014
Num. of Samples : 116
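For readers who dump the latency data and want to recompute the fields of such a report, the following is a minimal sketch. The `LatencyStats` structure and the `summarize` function are hypothetical names. The nearest-rank percentile and the population standard deviation are assumptions made for this illustration; the method that NvPlayfair actually uses may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical summary mirroring the fields of the console report.
struct LatencyStats {
    double min;
    double max;
    double average;
    double percentile;
    double stdDev;
    std::size_t numSamples;
};

// Summarizes latency samples (in milliseconds).
// Assumptions: nearest-rank percentile, population standard deviation.
inline LatencyStats summarize(std::vector<double> samplesMs, double pct = 99.99) {
    assert(!samplesMs.empty());
    assert(pct > 0.0 && pct <= 100.0);
    std::sort(samplesMs.begin(), samplesMs.end());

    const std::size_t n = samplesMs.size();
    double sum = 0.0;
    for (double s : samplesMs) sum += s;
    const double mean = sum / static_cast<double>(n);

    double sqDiff = 0.0;
    for (double s : samplesMs) sqDiff += (s - mean) * (s - mean);

    // Nearest-rank: the smallest 1-based rank covering pct percent of samples.
    std::size_t rank = static_cast<std::size_t>(
        std::ceil(pct / 100.0 * static_cast<double>(n)));
    if (rank < 1) rank = 1;
    if (rank > n) rank = n;

    LatencyStats out{};
    out.min = samplesMs.front();
    out.max = samplesMs.back();
    out.average = mean;
    out.percentile = samplesMs[rank - 1];
    out.stdDev = std::sqrt(sqDiff / static_cast<double>(n));
    out.numSamples = n;
    return out;
}
```

Under the nearest-rank assumption, a 99.99 percentile over 116 samples selects the largest sample, which is consistent with the reports above, where the 99.99-pct value equals the max value.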
Constraints#
NvPlayfair must be installed and working properly.
If the NvPlayfair library cannot load successfully, the application terminates with a corresponding error.
Metadata must be carried in the NvStream packet.
The frame capture start TSC is stored in the metadata of the packet. Without metadata, it is impossible to calculate the time consumed in the last module.
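The dependency on metadata can be sketched as follows. The `PacketMetadata` structure, the `pipelineLatencyMs` function, and the `tscHz` parameter are hypothetical names introduced for this illustration; the actual metadata layout and the way the TSC frequency is obtained are not described here. The sketch assumes that the current TSC value and the capture start TSC come from the same counter.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical metadata carried with each packet.
struct PacketMetadata {
    std::uint64_t captureStartTsc;  // frame capture start, in TSC ticks
};

// Returns the pipeline latency in milliseconds, or std::nullopt when it
// cannot be computed (no metadata, unknown frequency, or inconsistent ticks).
inline std::optional<double> pipelineLatencyMs(
        const std::optional<PacketMetadata>& meta,
        std::uint64_t nowTsc,
        std::uint64_t tscHz) {
    if (!meta.has_value() || tscHz == 0 || nowTsc < meta->captureStartTsc) {
        return std::nullopt;
    }
    const std::uint64_t ticks = nowTsc - meta->captureStartTsc;
    // ticks / tscHz gives seconds; multiply by 1000 for milliseconds.
    return static_cast<double>(ticks) * 1000.0 / static_cast<double>(tscHz);
}
```

When the packet carries no metadata, the function has no start point to subtract from, so it returns no value rather than a misleading latency.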