Deploying AI-Accelerated Medical Devices with NVIDIA Clara Holoscan

The ability to deploy real-time AI in clinics and research facilities is critical to enabling the next frontiers in surgery, diagnostics, and drug discovery. From robotic surgery to studying new approaches in biology, doctors and scientists need medical devices to evolve into continuous sensing systems to research and treat disease.

To realize the next generation of intelligent medical devices, a unique combination of AI, accelerated computing, and advanced visualization is needed. NVIDIA Clara Holoscan includes the Clara AGX Developer Kit and the Clara Holoscan SDK, which combine to provide a powerful development environment for creating AI-enabled medical devices. Deploying these devices at the clinical edge calls for production hardware based on NVIDIA IGX Orin and a software platform designed for medical-grade certification.

NVIDIA Clara Holoscan accelerates deployment of production-quality medical applications by providing a set of OpenEmbedded build recipes and reference configurations that can be leveraged to customize and build Clara Holoscan-compatible Linux4Tegra (L4T) embedded board support packages (BSP). With the release of Clara Holoscan SDK v0.3, developers can deploy medical AI even faster using customized OpenEmbedded distributions.

Creating customized Linux distributions with OpenEmbedded

OpenEmbedded is a build framework that allows developers to create fully customized Linux distributions for embedded systems. Developers can fully customize distributions using just the software components and configuration specific to the application. In contrast, commercial Linux distributions provide full operating systems from predefined software collections that often include graphical user interfaces, package management software, GNU tools and libraries, and system configuration tools.

Customizability is particularly important for embedded deployments: the memory, speed, safety, and security of the embedded device can be optimized, while a single preconfigured BSP simplifies the deployment process. In the regulated medical device industry, customizability also reduces process overhead, because analysis, testing, and documentation of Software of Unknown Provenance (SOUP) can be limited to the minimal set of software components required for the essential performance of the medical device.

Comparison to HoloPack

HoloPack is the implementation of NVIDIA JetPack SDK specific to Clara Holoscan. It provides a full development environment for Clara Holoscan developer kits and includes Jetson Linux with bootloader, Linux Kernel, Ubuntu desktop environment, and a complete set of libraries for acceleration of GPU computing, multimedia, graphics, and computer vision. This is the Clara Holoscan development stack.

Using customized OpenEmbedded distributions allows you, as the developer, to include just the software components that are actually needed for your application’s deployment. The final runtime BSP can be easily optimized with respect to memory usage, speed, security, and power requirements. This is the Clara Holoscan deployment stack.

To illustrate this, the following tables compare various measurements of a HoloPack installation versus an OpenEmbedded-based Clara Holoscan build, both including the Clara Holoscan Embedded SDK available on GitHub.

Resource usage after initial boot (when idle):

	Development Stack	Deployment Stack	Difference
Processes	408	198	210 (51.4% fewer processes)
Disk Used	22 GB	7 GB	15 GB (68.1% less disk usage)
Memory Used	1,621 MB	744 MB	877 MB (54.1% less memory usage)

RTX6000 measurements when running the tracking_replayer Clara Holoscan SDK application:

	Development Stack	Deployment Stack	Difference
Power	71 W	67 W	4 W (5.6% less power)
Temperature	50 °C	48 °C	2 °C (4% cooler)
GPU Usage	15%	11%	4 percentage points (26.7% less GPU usage)

Job runtime statistics (in milliseconds) as reported by the tracking_replayer Clara Holoscan SDK application:

	Development Stack	Deployment Stack	Difference
Visualizer	4.51	3.18	29.4%
Visualizer Format Converter	1.13	0.85	24.7%
Inference	10.69	5.73	46.3%
Inference Format Converter	1.00	0.93	7%
Replayer	31.11	30.09	3.2%
Total	48.44	40.78	15.8%
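
The percentages in the Difference columns above are relative reductions with respect to the development stack. As a minimal sketch of that arithmetic, the following shell function recomputes two of them from the table values. The name `pct_reduction` is a hypothetical helper introduced here for illustration and is not part of the Clara Holoscan tooling. Printed values are rounded to one decimal place, so some rows may differ from the tables in the last digit.

```shell
#!/bin/sh
# pct_reduction: hypothetical helper, not part of any Clara Holoscan tool.
# Prints how much smaller the deployment-stack value is, as a percentage
# of the development-stack value.
pct_reduction() {
    # $1 = development stack value, $2 = deployment stack value
    LC_ALL=C awk -v dev="$1" -v dep="$2" \
        'BEGIN { printf "%.1f\n", (dev - dep) / dev * 100 }'
}

pct_reduction 48.44 40.78   # total job runtime in ms
pct_reduction 1621 744      # memory used after boot in MB
```

The same function applies to any row of the three tables, since each one compares a development-stack value against a deployment-stack value.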

The customized OpenEmbedded/Yocto distribution includes only the minimal set of packages that are actually needed to run the Clara Holoscan SDK application. This saves disk space, memory, and CPU/GPU cycles, which results in higher overall performance when running Clara Holoscan sample applications.

Although the flexibility of a desktop experience with HoloPack is desirable during the early stages of development (easy installation of new apt packages, for example), this comparison shows some clear benefits of switching to the customized OpenEmbedded/Yocto deployment stack for the later stages of medical device productization.

Get started with NVIDIA Clara Holoscan

The Clara Holoscan OpenEmbedded/Yocto recipes are open source and are kept up to date alongside the releases of the NVIDIA Clara Holoscan SDK.

The Clara Holoscan OpenEmbedded/Yocto recipes, and the BSP build in general, depend on other open-source OpenEmbedded components that include (but are not limited to):

OpenEmbedded Core
OpenEmbedded BitBake
Community-driven meta-tegra OpenEmbedded layer, responsible for most of the core Jetson/L4T BSP support leveraged by Clara Holoscan

If you are already familiar with OpenEmbedded or Yocto, check out the meta-tegra-clara-holoscan-mgx repo on GitHub. The README within that repo provides a guide and full list of requirements needed to build and flash a Clara Holoscan BSP.

NVIDIA also provides the Clara Holoscan OpenEmbedded Builder on the NVIDIA GPU Cloud (NGC) website to simplify the process of getting started with these recipes. All the required tools and dependencies are included either within the container or as part of a setup script that initializes a local build tree, so building and flashing a Clara Holoscan BSP takes just a few simple commands.

To build a Clara Holoscan BSP for IGX Orin Developer Kit using the default configuration, which includes the Clara Holoscan SDK and sample applications, first ensure that your Docker runtime is logged into NGC. Then run the following commands in a new directory:

$ export IMAGE=nvcr.io/nvidia/clara-holoscan/holoscan-mgx-oe-builder:v0.3.0
$ docker run --rm -v $(pwd):/workspace ${IMAGE} setup.sh ${IMAGE} $(id -u) $(id -g)
$ ./bitbake.sh core-image-x11

Note that this build will require at least 200 GB of free disk space, and a first full build will take three or more hours. Once the build is complete, the IGX Orin Developer Kit can be put into recovery mode and flashed with the following command:

$ ./flash.sh core-image-x11
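
Because a full build needs at least 200 GB of free disk space, it can help to check the build directory before starting. The following is a minimal sketch of such a check and is not part of the Clara Holoscan OpenEmbedded Builder. The function name `has_free_space` is hypothetical, and the script assumes a POSIX `df` that supports the `-P` and `-k` options.

```shell
#!/bin/sh
# has_free_space: hypothetical pre-flight helper, not provided by the builder.
# Succeeds if the filesystem holding the given directory has at least the
# requested number of gigabytes available.
has_free_space() {
    # $1 = directory to check, $2 = required free space in GB
    # df -Pk prints one POSIX-format data line; field 4 is available KB.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $(( $2 * 1024 * 1024 )) ]
}

# 200 GB is the minimum quoted above for a first full build.
if has_free_space . 200; then
    echo "Enough free space for a full build"
else
    echo "Less than 200 GB free in $(pwd)" >&2
fi
```

Run it from the directory that will hold the build tree, since that is where the setup script and build output are placed.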

One major feature of the Clara Holoscan deployment stack is support for both iGPU and dGPU on developer kits. In the iGPU configuration, the majority of the runtime components come from the standard Tegra packages used by the meta-tegra layer, and developers can use the onboard HDMI or DisplayPort connection on the developer kit. For more details, see meta-tegra-clara-holoscan-mgx on GitHub.

Develop custom medical AI using ultra-high frame rates

With customized OpenEmbedded distributions on Clara Holoscan SDK v0.3, it is easier than ever to deploy production-quality AI for unique medical applications at the clinical edge. The SDK provides a lightning-fast frame rate of 240 Hz for 4K video, enabling developers to combine data from more sensors for building accelerated AI pipelines.

To learn how to get started with NVIDIA Clara Holoscan, follow the instructions on the Clara Holoscan SDK page.