Supporting Low-Latency Streaming Video for AI-Powered Medical Devices with Clara Holoscan

Nov 15, 2021

By Yaniv Lazimy and Ian Stewart

NVIDIA Clara Holoscan provides a scalable medical device computing platform for developers to create AI microservices and deliver insights in real time. The platform optimizes every stage of the data pipeline: from high-bandwidth data streaming and physics-based analysis to accelerated AI inference, and graphic visualizations.

The NVIDIA Clara AGX Developer Kit, which is now available, combines the efficient Arm-based embedded computing of the AGX Xavier SoC with the powerful NVIDIA RTX 6000 GPU and the 100 GbE connectivity of the NVIDIA ConnectX-6 network processor. This brings real-time AI acceleration to the next generation of intelligent, software-defined, embedded medical devices. Developers using the Clara AGX Developer Kit for surgical video applications—such as AI-enhanced endoscopy, laparoscopy, or other minimally invasive procedures—require the minimum possible end-to-end latency in their video processing path. Customers can use the Clara Holoscan SDK v0.1 on the Clara AGX Developer Kit today and on the next-generation developer kit in the second half of 2022.

The demands of surgical video necessitate consistent and reliable low latency between the image captured by the endoscope and the image projected on a monitor. This gives surgeons real-time control of their tools and real-time monitoring of the patient.

In a typical endoscopy system, the image is digitized at the camera sensor in the endoscope, serialized by an FPGA or ASIC and transmitted to a video processor where it is written to an input frame buffer, processed, written to an output frame buffer, and then transmitted serially to the monitor. Each of these steps adds latency to the video pipeline. Developers who wish to add advanced GPU-accelerated AI processing are then faced with additional transmission latency due to the need to write the data from the video capture card to system memory, then transfer it via the CPU and PCIe bus to the GPU.

GPU compute performance is a key component of the NVIDIA Clara Holoscan platform. To optimize GPU-based video processing applications, NVIDIA has partnered with AJA Video Systems to integrate their line of video capture cards with the Clara AGX Developer Kit. AJA provides a wide range of proven, professional video I/O devices. The partnership between NVIDIA and AJA has led to the addition of Clara AGX Developer Kit support in the AJA NTV2 SDK and device drivers as of the NTV2 SDK 16.1 release.

The AJA drivers and SDK now offer GPUDirect support for NVIDIA GPUs. This feature uses remote direct memory access (RDMA) to transfer video data directly from the capture card to GPU memory. This significantly reduces latency and system PCIe bandwidth use for GPU video processing applications, because copies from system memory to the GPU are eliminated from the processing pipeline.

RDMA support is now also incorporated into the AJA GStreamer plug-in, enabling zero-copy GPU buffer integration with the DeepStream SDK. DeepStream applications can now process video data along the entire pipeline, from the initial capture to final display, without leaving GPU memory.

NVIDIA Clara Holoscan SDK v0.1 builds on the features of the previous Clara AGX SDK and adds tools for detailed measurement of video transfer latency between video I/O cards, the CPU, and the GPU. Users can measure latency in various configurations, identify bottlenecks, and optimize their workflows for minimum end-to-end latency.

Data transfer latency was measured using the Clara AGX Developer Kit with an AJA capture card using the internal PCIe Gen3 x8 connection. The following tables demonstrate the latency reduction that can be achieved using GPUDirect.

Format	Width	Height	Bytes/pixel	Frames/sec
720p YUV	1280	720	2	60
1080p YUV	1920	1080	2	60
4K UHD YUV	3840	2160	2	60
720p RGBA	1280	720	4	60
1080p RGBA	1920	1080	4	60
4K UHD RGBA	3840	2160	4	60

Table 1. Video formats tested.
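
To put these formats in perspective, the following minimal sketch computes the size of one uncompressed frame and the sustained one-way data rate for each row of Table 1, then compares that rate with the raw bandwidth of a PCIe Gen3 x8 link. The function and constant names are illustrative and are not part of any NVIDIA or AJA SDK. The link figure assumes the standard PCIe Gen3 rate of 8 GT/s per lane with 128b/130b encoding and ignores packet and protocol overhead, so it is an upper bound rather than a measured value.

```python
# Video formats from Table 1:
# (name, width, height, bytes per pixel, frames per second)
FORMATS = [
    ("720p YUV", 1280, 720, 2, 60),
    ("1080p YUV", 1920, 1080, 2, 60),
    ("4K UHD YUV", 3840, 2160, 2, 60),
    ("720p RGBA", 1280, 720, 4, 60),
    ("1080p RGBA", 1920, 1080, 4, 60),
    ("4K UHD RGBA", 3840, 2160, 4, 60),
]

# Raw PCIe Gen3 link rate: 8 GT/s per lane with 128b/130b encoding.
# Packet and protocol overhead lower the usable rate further.
PCIE_GEN3_TRANSFERS_PER_S = 8.0e9
PCIE_GEN3_ENCODING = 128 / 130


def frame_bytes(width, height, bytes_per_pixel):
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel


def stream_bytes_per_s(width, height, bytes_per_pixel, fps):
    """Sustained one-way data rate of the uncompressed stream."""
    return frame_bytes(width, height, bytes_per_pixel) * fps


def pcie_gen3_bytes_per_s(lanes):
    """Raw one-direction bandwidth of a PCIe Gen3 link, in bytes per second."""
    return PCIE_GEN3_TRANSFERS_PER_S * PCIE_GEN3_ENCODING * lanes / 8


link_rate = pcie_gen3_bytes_per_s(8)
for name, width, height, bpp, fps in FORMATS:
    size = frame_bytes(width, height, bpp)
    rate = stream_bytes_per_s(width, height, bpp, fps)
    print(
        f"{name:12s} {size / 1e6:6.2f} MB/frame "
        f"{rate / 1e6:7.1f} MB/s "
        f"{100 * rate / link_rate:5.1f}% of raw Gen3 x8"
    )
```

For 4K UHD RGBA this works out to about 33.2 MB per frame and roughly 1.99 GB/s, which is about a quarter of the raw x8 link rate in one direction. Any extra copy through system memory adds to that traffic.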

The total time for video data transfer to and from the GPU, as well as time remaining for processing in the GPU, was then measured with and without GPUDirect enabled:

Format	Without GPUDirect: transfer time, no processing (ms)	Without GPUDirect: time remaining for processing (ms)	With GPUDirect: transfer time, no processing (ms)	With GPUDirect: time remaining for processing (ms)
720p YUV	1.945	14.721	0.956	15.710
1080p YUV	3.865	12.801	1.723	14.943
4K UHD YUV	12.805	3.861	6.256	10.410
720p RGBA	3.451	13.215	1.548	15.118
1080p RGBA	6.816	9.850	3.225	13.444
4K UHD RGBA	23.686	-7.020	12.406	4.260

Table 2. Latency (ms) with and without GPUDirect.

Note that GPUDirect cuts transfer time approximately in half by removing the need for writes to system memory. It also makes it possible to transfer and process 4K UHD RGBA inputs at 60 fps: with GPUDirect, the 12.406 ms transfer fits within the 16.666 ms frame time, whereas without GPUDirect the 23.686 ms transfer alone exceeds it. As a result, uncompressed high-resolution video can be natively alpha-blended with overlays from AI workflows, with no conversion from YUV to RGBA formats and no compromise in the 60 fps frame rate.
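
The figures in Table 2 can be checked with a short calculation. The sketch below takes the transfer times from Table 2, with rows in the same order as Table 1, and derives the speedup from GPUDirect and the processing time left in each frame. It assumes that the time-remaining columns are the 16.666 ms frame time minus the transfer time, which matches the published values to within a few microseconds. The names used here are illustrative only.

```python
# Frame time at 60 fps, as used in Table 2 (milliseconds).
FRAME_TIME_MS = 16.666

# Transfer times from Table 2 in milliseconds:
# (format, without GPUDirect, with GPUDirect)
TRANSFER_MS = [
    ("720p YUV", 1.945, 0.956),
    ("1080p YUV", 3.865, 1.723),
    ("4K UHD YUV", 12.805, 6.256),
    ("720p RGBA", 3.451, 1.548),
    ("1080p RGBA", 6.816, 3.225),
    ("4K UHD RGBA", 23.686, 12.406),
]


def remaining_ms(transfer_ms, frame_time_ms=FRAME_TIME_MS):
    """Time left for GPU processing in one frame (assumed: frame time minus transfer)."""
    return frame_time_ms - transfer_ms


def speedup(without_ms, with_ms):
    """Ratio of transfer time without GPUDirect to transfer time with it."""
    return without_ms / with_ms


for name, without_gd, with_gd in TRANSFER_MS:
    fits_without = remaining_ms(without_gd) > 0
    fits_with = remaining_ms(with_gd) > 0
    print(
        f"{name:12s} speedup {speedup(without_gd, with_gd):4.2f}x  "
        f"remaining {remaining_ms(without_gd):7.3f} -> {remaining_ms(with_gd):6.3f} ms  "
        f"fits 60 fps: {fits_without} -> {fits_with}"
    )
```

The speedups fall between roughly 1.9x and 2.2x across the six formats, and 4K UHD RGBA is the only format whose remaining time moves from negative to positive.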

For instructions on how to set up and use an AJA device with the Clara AGX Developer Kit, including RDMA and DeepStream integration, go to Chapter 9 of the Clara Holoscan SDK User Guide.

About the Authors

About Yaniv Lazimy
Yaniv Lazimy is a technical product manager on the healthcare team at NVIDIA, focused on accelerated computing and connectivity solutions for medical devices. Prior to joining NVIDIA, Yaniv was an embedded systems engineer at NeuWave Medical and Johnson and Johnson.

About Ian Stewart
Ian Stewart is a software engineer on the Clara Holoscan team, focused on the optimization and deployment of GPU-accelerated medical devices. Ian has been with NVIDIA for over a decade across various graphics, imaging, and embedded development teams.
