
Rapidly Build AI-Streaming Apps with Python and C++

The computational needs for AI processing of sensor streams at the edge are increasingly demanding. Edge devices must keep up with high rates of incoming data streams, processing, displaying, archiving, and streaming results or closing a control loop in real time. This requires powerful, efficient, and accurate hardware and software solutions capable of high performance computing.

Edge devices must also transfer data quickly and securely to other edge devices, on-prem data centers, or to the cloud, for storing and analyzing the data received. Advanced edge AI processing solutions quickly process large amounts of sensor data and produce actionable insights in real time.

The NVIDIA Holoscan SDK v0.4 now delivers even more efficient processing for streaming AI applications at the edge. Developers can build their own streaming applications in Python and C++ using the SDK, which contains acceleration libraries, pretrained AI models, and reference applications.

First introduced for medical AI use cases, Holoscan is now ready for a broader range of applications across multiple industries for high performance computing at the edge.

New Holoscan SDK v0.4 features include:

  • A Python developer experience for rapid application development.
  • Significant improvements to the C++ developer experience, including native operators.
  • Efficient multi-AI inferencing.
  • Low-latency field-programmable gate array (FPGA) alpha blending.
  • HoloHub, a centralized repository for collecting contributions to Holoscan.

In addition, the deployment stack has been updated to be in sync with the new features added in v0.4. Going forward, the deployment stack will be updated with new SDK releases.

Python developer experience

The Holoscan SDK now provides a Python application development experience. Without compiling any code, developers can quickly prototype and deploy workflows on x86_64 workstations with NVIDIA GPUs, and on the NVIDIA Clara AGX and NVIDIA IGX Orin Developer Kits.

Developers can also integrate with other GPU-accelerated Python libraries, such as RAPIDS and CuPy, to use NVIDIA hardware and optimize processing pipelines. 

The built-in Tensor class supports DLPack as well as the NumPy array interface (__array_interface__) and the CUDA array interface (__cuda_array_interface__). This makes it compatible with the CuPy, PyTorch, JAX, TensorFlow, and Numba libraries for multidimensional array processing.

The Holoscan Tensor object can be used with cuSignal and cuCIM for efficient signal and multidimensional image processing.
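As a rough illustration of what these interfaces provide, the sketch below uses only NumPy. HostBuffer is a hypothetical class invented for this example and is not the Holoscan Tensor class. It shows how an object that exposes __array_interface__ can be consumed without copying, and how DLPack allows the same kind of zero-copy exchange. It assumes NumPy 1.22 or later for np.from_dlpack.

```python
import numpy as np


class HostBuffer:
    """Hypothetical container (not the Holoscan Tensor) exposing the NumPy array interface."""

    def __init__(self, values):
        self._data = np.ascontiguousarray(values, dtype=np.float32)

    @property
    def __array_interface__(self):
        # Describe the underlying memory (shape, dtype, pointer) to any consumer.
        return self._data.__array_interface__


# Any library that understands __array_interface__ can wrap the buffer without a copy.
buf = HostBuffer([[1, 2], [3, 4]])
view = np.asarray(buf)
view[0, 0] = 10.0  # writes through to the memory owned by buf

# DLPack offers the same kind of zero-copy exchange between libraries.
source = np.arange(4, dtype=np.float32)
dl_view = np.from_dlpack(source)
```

The same pattern applies on the GPU through __cuda_array_interface__, where a library such as CuPy would take the place of NumPy as the consumer.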

The following sample code demonstrates how simple it is to create a Holoscan application using the Python API. The compose() function defines the overall workflow of the application by instantiating the operators and connecting them into a workflow. 

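from holoscan.conditions import CountCondition
from holoscan.core import Application

# SignalGeneratorOp, PulseCompressionOp, MTIFilterOp, RangeDopplerOp, CFAROp, and SinkOp
# are the application's own operators. They and `iterations` (the number of times the
# source runs) are defined in the full source on HoloHub.


class BasicRadarFlow(Application):
    def compose(self):
        # Instantiate the operators; the source stops after `iterations` executions.
        src = SignalGeneratorOp(self, CountCondition(self, iterations), name="src")
        pulse_compression = PulseCompressionOp(self, name="pulse-compression")
        mti_filter = MTIFilterOp(self, name="mti-filter")
        range_doppler = RangeDopplerOp(self, name="range-doppler")
        cfar = CFAROp(self, name="cfar")
        sink = SinkOp(self, name="sink")

        # Connect the operators; the set maps (output port, input port) pairs.
        self.add_flow(src, pulse_compression, {("x", "x"), ("waveform", "waveform")})
        self.add_flow(pulse_compression, mti_filter)
        self.add_flow(mti_filter, range_doppler)
        self.add_flow(range_doppler, cfar)
        self.add_flow(cfar, sink)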

The full Python source code of this application can be found in the HoloHub repository.
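To make the idea of composing operators into a workflow more concrete, here is a toy, pure-Python stand-in for an operator graph. It is not the Holoscan API: MiniOp, MiniApp, and the three lambda operators are hypothetical names made up for this sketch, and it only handles simple chains where each operator has one upstream neighbor. It shows how a set of (output port, input port) pairs can route data from one operator to the next.

```python
from collections import deque


class MiniOp:
    """Hypothetical operator: a named function from a dict of inputs to a dict of outputs."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn


class MiniApp:
    """Toy stand-in for an application graph (not the Holoscan API)."""

    def __init__(self):
        self.edges = {}  # upstream operator name -> list of (downstream op, port map)

    def add_flow(self, upstream, downstream, port_map):
        # port_map is a set of (output port, input port) pairs.
        self.edges.setdefault(upstream.name, []).append((downstream, port_map))

    def run(self, source):
        """Execute a chain starting at `source`; return the run order and the last outputs."""
        order, outputs = [], {}
        pending = deque([(source, {})])
        while pending:
            op, inputs = pending.popleft()
            outputs = op.fn(inputs)
            order.append(op.name)
            for downstream, port_map in self.edges.get(op.name, []):
                routed = {dst: outputs[src] for src, dst in port_map}
                pending.append((downstream, routed))
        return order, outputs


src = MiniOp("src", lambda _: {"x": [1, 2, 3], "gain": 2})
scale = MiniOp("scale", lambda i: {"y": [v * i["gain"] for v in i["x"]]})
sink = MiniOp("sink", lambda i: {"total": sum(i["y"])})

app = MiniApp()
app.add_flow(src, scale, {("x", "x"), ("gain", "gain")})
app.add_flow(scale, sink, {("y", "y")})
order, result = app.run(src)
```

In the real SDK, scheduling, conditions such as CountCondition, and memory management are handled by the framework; this sketch only mirrors the wiring step.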

The Holoscan Python package is distributed as Python wheels and can be installed with pip install holoscan. Refer to the instructions on PyPI for prerequisites.

C++ developer experience

The Holoscan SDK now provides a full C++ application development experience for creating Holoscan operators and flows. Previously, the only way to create a Holoscan operator was by wrapping a GXF codelet. Now you can create operators directly with the Holoscan SDK and easily integrate them with other C++ libraries. Learn about native operators in the Holoscan SDK User’s Guide.

Multi-AI inference

The Holoscan SDK supports multi-AI pipelines and parallel inference with multiple AI models on the same input stream. Parallel inference through the multi-AI inference module can improve performance by approximately 30%, which lets you bring more models into the inference module under the same time constraints.
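The fan-out pattern behind parallel inference can be sketched in plain Python. This is only an illustration of running several models on the same input concurrently, not the Holoscan multi-AI inference module. The function run_models_in_parallel and the two stand-in "models" are hypothetical, and Python threads will not reproduce the GPU-level speedup quoted above.

```python
from concurrent.futures import ThreadPoolExecutor


def run_models_in_parallel(models, frame):
    """Submit every model on the same input frame and collect {name: output}."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = {name: pool.submit(fn, frame) for name, fn in models.items()}
        return {name: future.result() for name, future in futures.items()}


# Hypothetical stand-ins for AI models; real models would run inference on the GPU.
def mean_model(frame):
    return sum(frame) / len(frame)


def peak_model(frame):
    return max(frame)


models = {"mean": mean_model, "peak": peak_model}
outputs = run_models_in_parallel(models, [10, 50, 90])
```

Because every model receives the same frame, results can be gathered by name and passed to a single downstream step.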

Learn how an NVIDIA Inception member uses NVIDIA Clara Holoscan to run multi-AI pipelines in real time.

Low-latency FPGA alpha blending

Certain video I/O cards, such as the AJA KONA 5, support alpha blending on an FPGA. This feature passes the video signal through from input to output with sub-millisecond latency. This also includes time for AI inference blending with the Holoscan flow.

Beyond low latency, FPGA blending also provides a safety feature that mitigates failures in the AI flow. If the AI pipeline fails, the original video feed continues streaming from the capture card to the display. The following diagram shows the low-latency FPGA alpha-blending workflow for surgical tool tracking. See the Holoscan SDK User’s Guide for details.

A diagram showing the workflow from SDI source to AJA hardware with Holoscan SDK enabling real-time surgical tool tracking inference
Figure 1. Diagram of the workflow for surgical tool tracking inference with an AJA card
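The blending and fallback behavior described above can be expressed as simple arithmetic. The sketch below is a software illustration only; on the AJA card this work is done in FPGA hardware, and the function names blend_pixel and output_frame are hypothetical. It applies the standard alpha-blending formula, out = alpha * overlay + (1 - alpha) * video, with 8-bit alpha, and passes the video through unchanged when no AI overlay is available.

```python
def blend_pixel(video, overlay, alpha):
    """Blend one RGB pixel: alpha=255 shows only the overlay, alpha=0 only the video."""
    return tuple(
        (alpha * o + (255 - alpha) * v + 127) // 255  # +127 rounds to nearest
        for v, o in zip(video, overlay)
    )


def output_frame(video_frame, overlay_frame=None):
    """Blend an RGBA overlay onto an RGB video frame.

    If the overlay is missing (for example, the AI pipeline failed),
    the original video frame is passed through untouched.
    """
    if overlay_frame is None:
        return list(video_frame)
    return [
        blend_pixel(v, o[:3], o[3]) for v, o in zip(video_frame, overlay_frame)
    ]


video = [(100, 100, 100), (200, 200, 200)]
overlay = [(255, 0, 0, 255), (0, 0, 0, 0)]  # opaque red pixel, fully transparent pixel

blended = output_frame(video, overlay)
passthrough = output_frame(video)
```

The passthrough branch is the point of the safety feature: the display path does not depend on the overlay being present.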


Holoviz

The Holoscan SDK v0.4 release uses Holoviz in all sample application pipelines. Features supported in the previous OpenGL visualization operators are enabled through Holoviz using Vulkan. The operator is easily configurable and handles compositing, blending, and visualization of RGB/RGBA images, masks, geometric primitives, and text.

It also supports headless rendering, streaming to an output, and a low-latency mode that bypasses the desktop compositor.


HoloHub

This release introduces a new repository called HoloHub. As a public repository, HoloHub hosts a collection of sample applications and operators, and publishes contributions provided by the developer community.

With HoloHub, NVIDIA partners, including sensor providers, can implement and distribute Holoscan support to the community for quick implementation of new processing workflows.

Get started with Holoscan SDK

The quickest way to get started with Holoscan SDK 0.4 is to run the examples and sample applications from the Holoscan container on Holoscan Developer Kits or x86 devices. This updated container uses the runtime configuration for testing current applications. It also provides C++ and Python development tools and examples to modify and create new processing workflows. 

The Holoscan SDK is available through PyPI for Python 3.8 to 3.11 and as Debian packages for Ubuntu 20.04.
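Before installing, it can help to check the local environment against the Python range stated above. The following is a small sketch using only the standard library. The helper names python_supported and holoscan_version are hypothetical, and the version bounds are taken from this post rather than queried from PyPI.

```python
import sys
from importlib import metadata

# Supported Python range as stated in this post (inclusive bounds).
MIN_PYTHON = (3, 8)
MAX_PYTHON = (3, 11)


def python_supported(version=None):
    """Return True if the (major, minor) version falls inside the stated range."""
    major_minor = tuple(version or sys.version_info[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


def holoscan_version():
    """Return the installed holoscan package version, or None if it is not installed."""
    try:
        return metadata.version("holoscan")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_supported())
    print("holoscan installed:", holoscan_version() or "no")
```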

For developers looking to build the Holoscan SDK, the source code is available under the Apache 2.0 license from the nvidia-holoscan GitHub repository.
