Accelerating Solution Development with DOCA on NVIDIA BlueField DPUs

DOCA is a software framework for developing applications on BlueField DPUs. By using DOCA, you can offload infrastructure workloads from the host CPU and accelerate them with the BlueField DPU. This enables an infrastructure that is software-defined yet hardware accelerated, maximizing both performance and flexibility in the data center.

DOCA is here!

NVIDIA first introduced DOCA in October 2020. The NVIDIA BlueField-2 DPU is now generally available and DOCA is in early availability, making it easy to develop and enhance solutions that leverage the ability of BlueField to offload, accelerate, and isolate infrastructure workloads including networking, security, storage, and management. In this post, we discuss exactly what is in DOCA and how developers and ISVs can use it to create DPU-based solutions.

First, here’s a quick overview of what the BlueField DPU is and does. It contains a powerful SmartNIC with fast Ethernet or InfiniBand interfaces, a set of Arm cores, DRAM, and a PCIe switch, all connected by a fast mesh fabric. The embedded ConnectX SmartNIC includes many accelerators (networking, cloud, storage, encryption, media streaming, time synchronization, and so on), and BlueField adds accelerators and features for security, storage virtualization, hardware isolation, and remote management.

Stack diagram shows how DOCA enables DPU programming both in the business application domain on the host CPU and in the infrastructure services domain, with functional isolation between the two domains.
Figure 1. The BlueField DPU and DOCA framework allow infrastructure services to be moved to the DPU, offloading and accelerating these services. DOCA allows development in both the application and the infrastructure services domains.

Benefits of using DOCA

Many of the individual functions and accelerators in BlueField can be accessed through specific APIs, open-source SDKs, or existing drivers, so you may ask why you should use DOCA. The main benefit is simplifying the development and deployment process for infrastructure applications and functions that use the DPU. This allows for faster time to market for applications and other benefits:

  • Unified access to all the DPU features—You don’t have to track down and work with several disparate tools.
  • Higher-level libraries with an abstraction layer on top of low-level DPU APIs—You can integrate at a higher level for easier and faster development that is tuned for best performance or integrate at a lower level for more detailed control.
  • Forward/backward compatibility—Developing with DOCA means that your app runs seamlessly on future versions of the BlueField DPU while gaining higher performance and scale.
  • DPU provisioning and deployment of containerized services—DOCA includes tools to simplify DPU setup, provisioning, and services orchestration.

Developer’s journey with DOCA 1.0

DOCA consists of a software development kit (SDK) and a runtime platform for the DPUs. The SDK contains APIs, development libraries, developer tools, and reference application sources, while the runtime includes services, reference application executables, and runtime tools. The drivers support the DOCA libraries, which in turn support the reference applications included in DOCA 1.0. Alongside these are DOCA services, such as filtered telemetry export, management tools for the DPU and SDK, and the interface for programming software-defined networking (SDN) on either the data plane (accelerated by DPDK in this release) or the control plane.

Your journey with DOCA starts with choosing the type of application to run on or integrate with the DPU. The next step is determining whether the application should run on the host CPU, the DPU itself, or a combination of both. Apps running on the host must be compiled for the host CPU (usually x86), while apps running on the DPU must be compiled for Arm. Either way, the app can use DOCA to access the offloads and accelerators on the DPU, and you select which DOCA sample applications, libraries, and APIs to use for development. If the main application continues to run on the host CPU, you can create a small agent to run on the DPU Arm cores that activates the BlueField hardware offloads without requiring significant changes to the existing application.
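Because the host and DPU builds target different CPU architectures, a shared script or agent may need to know which side it is running on. The following is a minimal sketch in Python that uses only the standard library. The mapping and the function name `deployment_target` are hypothetical and not part of DOCA, and the check is only a heuristic: an Arm-based host server would also report an Arm architecture.

```python
import platform

# Hypothetical mapping from architecture strings to where the code is
# probably running. This is an assumption for illustration only: an
# Arm-based host would also report "aarch64".
ARCH_TO_TARGET = {
    "x86_64": "host",
    "amd64": "host",
    "aarch64": "dpu",
    "arm64": "dpu",
}


def deployment_target(machine=None):
    """Return 'host', 'dpu', or 'unknown' for an architecture string.

    When no string is given, the architecture of the current machine
    is read with platform.machine().
    """
    if machine is None:
        machine = platform.machine()
    return ARCH_TO_TARGET.get(machine.lower(), "unknown")


if __name__ == "__main__":
    print(f"{platform.machine()} -> {deployment_target()}")
```

A build or deployment script could use such a check to pick the matching binary, although an explicit configuration setting is more reliable than guessing from the architecture.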

Flow diagram shows the developer’s journey from the DOCA DevZone to designing applications with DOCA libraries then deploying them on host or on the BlueField DPU
Figure 2. The developer journey with DOCA, from development to deployment.

DOCA programming options

When you program to the DPU, it’s possible in many cases to access the drivers directly. This usually requires low-level programming and detailed knowledge of the drivers. In most cases, it’s much easier to program to the DOCA libraries, which provide a higher-level, more abstracted view of the drivers. A further benefit is that the libraries are already tuned for the best acceleration performance per use case. The reference applications provide working code with examples of how to use the DOCA libraries. In some cases, the reference applications can be used or modified as the foundation for applications and solutions that run on the DPU.

For example, suppose that you want to build an accelerated load balancer or integrate the agent of a distributed firewall to run on the DPU. You can use the reference applications, which use the DOCA deep packet inspection (DPI) libraries. These libraries, in turn, run on top of the DPDK libraries and use the stateful connection tracking and regular expression (regex) matching engines inside the DPU.

|  | Accelerated Load Balancer | NGFW Agent | Elastic Storage |
|---|---|---|---|
| DOCA reference application | Load Balancer | Firewall Agent | <coming June 2021> |
| DOCA libraries | DOCA Flow and DPI | DOCA Flow and DPI | SPDK |
| Low-level API/Lib | RTE_FLOW, DPDK SFT, DPDK REGEX, DPDK | RTE_FLOW, DPDK RegEx | SPDK, BlueField SNAP |
| DPU hardware feature | eSwitch, Connection Tracking, RegEx | eSwitch, Connection Tracking, RegEx | RDMA, BlueField SNAP, PCIe switch |

Table 1. Three examples of how DOCA runs on top of and enables access to lower-level APIs/Libs and DPU hardware features.
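To make the load balancer example more concrete, here is a software-only sketch in Python of the idea behind it: classify a flow by matching its first payload against regex signatures, pick a backend, and remember the decision per connection. It does not use the DOCA, DPDK, or BlueField APIs, and it says nothing about how the reference application is implemented. The signatures, backend addresses, and names such as `FlowTable` are hypothetical. On a DPU, the regex matching and connection tracking would be handled by the hardware engines rather than by code like this.

```python
import re
import zlib

# Hypothetical payload signatures, checked in order. Real DPI signature
# sets are far larger; these two are for illustration only.
SIGNATURES = [
    ("http", re.compile(rb"^(GET|POST|PUT|DELETE|HEAD) \S+ HTTP/1\.[01]\r\n")),
    ("tls", re.compile(rb"^\x16\x03[\x00-\x04]")),
]

# Hypothetical backend pools, one per detected application type.
BACKENDS = {
    "http": ["10.0.0.1", "10.0.0.2"],
    "tls": ["10.0.1.1", "10.0.1.2"],
    "unknown": ["10.0.9.1"],
}


def classify(payload: bytes) -> str:
    """Return the name of the first signature matching the payload."""
    for name, pattern in SIGNATURES:
        if pattern.match(payload):
            return name
    return "unknown"


class FlowTable:
    """Tiny software stand-in for stateful connection tracking."""

    def __init__(self):
        self._flows = {}

    def route(self, flow: tuple, payload: bytes) -> tuple:
        """Return (application, backend) for a flow.

        The first packet of a flow is classified and hashed onto a
        backend; later packets of the same flow reuse that decision
        without being inspected again.
        """
        if flow not in self._flows:
            app = classify(payload)
            pool = BACKENDS[app]
            index = zlib.crc32(repr(flow).encode()) % len(pool)
            self._flows[flow] = (app, pool[index])
        return self._flows[flow]


if __name__ == "__main__":
    table = FlowTable()
    flow = ("192.168.1.5", 40000, "10.0.0.100", 80, "tcp")
    print(table.route(flow, b"GET / HTTP/1.1\r\nHost: example\r\n\r\n"))
    print(table.route(flow, b"later packet without a signature"))
```

Caching the decision per flow matters because only the first packets of a connection usually carry a recognizable signature. Every later packet must still reach the same backend.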

In most cases, you could program to the low-level API/Lib instead of to the DOCA libraries, but it’s easier to program to the DOCA libraries or even modify the DOCA reference application whenever they are available. For some DOCA libraries—such as SPDK—there isn’t a DOCA reference application yet, but the library is in DOCA 1.0 and available for immediate use. A storage reference application is expected to be added to DOCA later. In other cases, such as the time synchronization or IPSec encryption features, the APIs and functionality are available now through the latest update to the BlueField OS. The functionality will be available through DOCA libraries in a future release.

Two sides to DOCA tools

From a developer standpoint, DOCA can be divided into two major areas:

  • SDK components to help you build applications that run on or use the DPU.
  • DOCA runtime set, which consists of the components needed to run the applications on the DPU.

The SDK side contains development libraries, drivers, and toolkits, as well as documentation and source code for the reference applications.

The runtime side includes the binary form of libraries, runtime binaries, compiler tools, installation utilities, benchmarks, and various DOCA service agents. These enable you to set up a DPU card, install the proper OS, and run your software on the DPU, using the different DPU APIs and features. The runtime side also includes management tools for provisioning and supporting the DPU cards in the servers and across the network, while enabling the orchestration of containerized and accelerated services.

Reference applications included with DOCA

DOCA 1.0 includes reference applications for an accelerated load balancer using DPI and a next-generation firewall agent that uses both DPI and regex pattern matching. These take advantage of the DOCA libraries and special features on the DPU. The reference applications ship with source files. You don't need them to program the DPU, but they can simplify application development and integration by providing examples of how to use the DPU APIs and libraries. Additional reference applications may be added in future DOCA releases.

DPU management tools and additional features

  • SDK manager—Helps install and update the BlueField SDK on the machines that you use to run the DPU. It installs the DOCA SDK and runtime on the host and installs the development container on the host used to update the OS and DOCA on BlueField.
  • Provisioning tools—Designed to simplify and automate the management of many DPU cards across a data center and can work with scripting and management tools. These tools are not included in DOCA 1.0 but are expected to be added to DOCA soon.
  • Telemetry—Enables selective capture and sharing of key networking and server telemetry on the DPU followed by sharing or collection of that telemetry with log management, data analysis, or cyber security tools.
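The selective capture described for telemetry can be pictured as a filter between the raw counters and the tool that receives them. The following Python sketch illustrates only that idea. It is not the DOCA telemetry service or its API, and the counter names and function names are hypothetical.

```python
import json


def filter_telemetry(sample: dict, wanted: set) -> dict:
    """Keep only the counters that a downstream tool asked for."""
    return {name: value for name, value in sample.items() if name in wanted}


def export_line(sample: dict, wanted: set) -> str:
    """Serialize the filtered counters as one JSON line for a log collector."""
    return json.dumps(filter_telemetry(sample, wanted), sort_keys=True)


if __name__ == "__main__":
    # Hypothetical counters; real names depend on the telemetry source.
    sample = {"rx_bytes": 1200, "tx_bytes": 800, "cpu_temp_c": 55, "rx_drops": 0}
    print(export_line(sample, {"rx_bytes", "rx_drops"}))
```

Filtering at the source keeps the exported volume small, which matters when many DPUs report to the same log management or analysis system.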

Some other BlueField DPU features are not yet supported in the DOCA 1.0 SDK but are supported by the DOCA runtime. These features are accessible through the BlueField DPU software or the Mellanox OFED libraries and will be added to the DOCA SDK in future releases:

  • Encryption of network traffic (using IPSec or TLS)
  • Super-accurate Precision Time Protocol (PTP) to support a time-synchronized data center
  • HPC/AI collective offloads
  • Support for NVIDIA GPUDirect Storage (GDS)
  • And more…

Similarly, BlueField SNAP uses the power of the DPU to virtualize network storage as either a local NVMe SSD or as a virtio-blk (block storage) device. SNAP capabilities are included now in the DOCA runtime with high-level developer SDK access through the SPDK library, with additional SNAP features coming to the DOCA SDK later.

DOCA vision

The roadmap for DOCA includes the ability to access practically any feature of the BlueField DPU using DOCA. Figure 3 shows the planned DOCA stack, which includes support for many types of high-level applications that run on top of an expanding set of DOCA services, libraries, and drivers.

Diagram shows the planned DOCA model including 6 applications, 5 DOCA services, 10 categories of DOCA libraries, and an expanded section of DOCA drivers
Figure 3. DOCA is evolving to include comprehensive support for nearly all BlueField DPU features.

NVIDIA wants to give you powerful access to all the DPU features while also simplifying your work to create new applications on the DPU or integrate existing applications with the BlueField DPU. A continuing series of DOCA releases will expand the drivers, libraries, services, and sample applications in DOCA. You will be able to create more sophisticated and more efficient solutions using the NVIDIA DPU to accelerate infrastructure services. Use DOCA to improve data center performance, efficiency, security, and manageability. You can apply for access to DOCA now.

For more information, see the following resources: