Compute Graph Framework SDK Reference  5.10
Custom Node and Integration

Develop a C++ node using DriveWorks Core & framework APIs

To create the C++ implementation of a node, the user can run the node.json-to-C++ template Python script to generate a code template from the node JSON file. In this section, we discuss the C++ code structure in more detail. There are two types of nodes and, when creating one, you inherit from one of these two interfaces:

  • ProcessNode
  • SensorNode

In this document, we will focus mainly on process nodes. A process node covers anything that performs data processing; for instance, ISPNode would be a process node. When creating a node that is not based on a DriveWorks module, the developer is free to inherit directly from the abstract classes ProcessNode or SensorNode, but care must be taken to ensure that the node catches its own exceptions, as there is no guarantee of that otherwise. It is therefore highly recommended to inherit from the Simple and ExceptionSafe classes instead.

Creating a public node interface

For each node, there is a public node interface and an implementation file. When creating a public node interface file, the interface should inherit from ExceptionSafeProcessNode for a process node type. This is an exception-catching layer that translates exceptions thrown in the implementation into status codes. Upon construction, ExceptionSafeProcessNode requires a pointer to the implementation. Thus, when creating the HelloWorldNode node, a pointer to HelloWorldNodeImpl is passed in the constructor call to the parent class ExceptionSafeProcessNode. An example can be found in the public header of the HelloWorld custom node, HelloWorldNode.hpp.
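
As a minimal sketch (the exact include paths and the base-class constructor signature are assumptions based on the description above, not copied from the sample), the public header could be structured like this:

// HelloWorldNode.hpp (illustrative sketch)
#include <memory>

#include <dwcgf/node/Node.hpp>        // assumed header providing ExceptionSafeProcessNode
#include "HelloWorldNodeImpl.hpp"     // implementation class and HelloWorldNodeParams

namespace dw
{
namespace framework
{

class HelloWorldNode : public ExceptionSafeProcessNode
{
public:
    static constexpr char LOG_TAG[] = "HelloWorldNode";

    // pass, port, and parameter descriptors (see the following sections) go here

    HelloWorldNode(const HelloWorldNodeParams& params, const dwContextHandle_t ctx)
        : ExceptionSafeProcessNode(std::make_unique<HelloWorldNodeImpl>(params, ctx))
    {
    }
};

} // namespace framework
} // namespace dw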

Pass Definition and Pass Descriptors

Pass Definition

Each pass is registered with the NODE_REGISTER_PASS() macro, which is defined in <dwcgf/node/SimpleNodeT.hpp>. In HelloWorldNodeImpl.cpp, each pass is registered in the initPasses() method:

// PROCESS pass running on the CPU
NODE_REGISTER_PASS(
    "PROCESS"_sv,
    [this]() { return process(); });

Pass Descriptors

In the public header, we can also describe each pass using the pass descriptor APIs to enable pass visualization in the DW Graph GUI tool. The pass descriptor APIs are defined in <dwcgf/pass/PassDescriptor.hpp>:

static constexpr auto passes()
{
    return describePassCollection(
        describePass("SETUP"_sv, DW_PROCESSOR_TYPE_CPU),
        describePass("PREPROCESS"_sv, DW_PROCESSOR_TYPE_GPU),
        describePass("INFERENCE"_sv, DW_PROCESSOR_TYPE_GPU),
        describePass("POSTPROCESS"_sv, DW_PROCESSOR_TYPE_CPU),
        describePass("TEARDOWN"_sv, DW_PROCESSOR_TYPE_CPU));
};

Port Definition and Port Descriptors

Port Definition

In addition to defining passes, each input and output port must be defined with a name. Ports can also be defined with the portIndex function, which is defined in <dwcgf/port/PortDescriptor.hpp>. Port initialization is done with macros defined in SimpleNodeT.hpp. An input port definition example from SumNode is listed below:

// Setup input ports.
NODE_INIT_INPUT_PORT("VALUE_0"_sv);

Port Descriptors

Similar to the pass descriptors, each port can be described using the port descriptor APIs to enable port visualization in the DW Graph GUI tool. The port descriptor APIs are defined in <dwcgf/port/PortDescriptor.hpp>.

static constexpr auto describeInputPorts()
{
    using namespace dw::framework;
    // HelloWorldNode has no input ports
    return describePortCollection();
};
static constexpr auto describeOutputPorts()
{
    using namespace dw::framework;
    return describePortCollection(
        DW_DESCRIBE_PORT(int, "VALUE_0"_sv, PortBinding::REQUIRED),
        DW_DESCRIBE_PORT(int, "VALUE_1"_sv, PortBinding::REQUIRED));
};

Parameter Descriptors

The last describe function needed is the describeParameters API, which describes the parameters of the node. Below is example code from the HelloWorldNode. The function describes the parameter "name", which is also specified in HelloWorldNode.node.json:

static constexpr auto describeParameters()
{
    return describeConstructorArguments<HelloWorldNodeParams, dwContextHandle_t>(
        describeConstructorArgument(
            DW_DESCRIBE_PARAMETER(
                std::string,
                "name"_sv,
                &HelloWorldNodeParams::name)),
        describeConstructorArgument(
            DW_DESCRIBE_UNNAMED_PARAMETER(
                dwContextHandle_t)));
}
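
For reference, a minimal sketch of the parameter struct referenced above (assuming only the single "name" field shown in the descriptor; the actual sample struct may contain additional fields) is:

// requires <string>
struct HelloWorldNodeParams
{
    std::string name; // populated from the "name" parameter described above
};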

Constructor

There should be a constructor declared in the public header that instantiates the node. Commonly, a struct named <node_name>Params is passed as a parameter and contains the desired parameters to configure the node; a parameter of type dwContextHandle_t is commonly passed separately. Here is an example:

HelloWorldNodeImpl(const HelloWorldNodeParams& params, const dwContextHandle_t ctx);

Besides the constructor, a node needs a static create function, and the node must be registered with a macro in the .cpp file (see Register custom node below). The simplest approach is to utilize something like the following:

dw::framework::create<Node>(ParameterProvider& provider);
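
A minimal sketch of such a static create function (assuming the HelloWorldNode class above and that dw::framework::create returns a unique_ptr to the node) might be:

static std::unique_ptr<HelloWorldNode> create(ParameterProvider& provider)
{
    // dw::framework::create reads the parameters declared in describeParameters()
    // from the provider and forwards them to the node constructor
    return dw::framework::create<HelloWorldNode>(provider);
}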

Creating a node implementation

When creating an implementation file, the implementation class should inherit from the concrete class SimpleProcessNode for a process node type. Following the same naming pattern as before, the name of the node implementation should be the name of the node plus "Impl" (e.g., "HelloWorldNodeImpl"). The implementation class should also store the unique_ptr instances for the input and output ports.

Init Passes

void ISPNodeImpl::initPasses()
{
    // SETUP and TEARDOWN passes are using default impl
    NODE_REGISTER_PASS("PROCESS"_sv, [this]() { return process(); });
}

Init Ports

void ISPNodeImpl::initPorts()
{
    // Setup input ports, waitTime is zero by default
    NODE_INIT_INPUT_PORT("IMAGE"_sv);

    // Setup output ports
    const auto& transParams = m_params.transParams;
    dwImageProperties imageProps{};
    imageProps.width        = transParams.inputRes.x;
    imageProps.height       = transParams.inputRes.y;
    imageProps.type         = DW_IMAGE_CUDA;
    imageProps.memoryLayout = DW_IMAGE_MEMORY_TYPE_PITCH;
    imageProps.format       = DW_IMAGE_FORMAT_RGB_FLOAT16_PLANAR;

    // Create RGB (fp16) full resolution image
    FRWK_CHECK_DW_ERROR(dwImage_create(&m_convertedFullImage, imageProps, m_ctx));

    imageProps.width  = transParams.outputRes.x;
    imageProps.height = transParams.outputRes.y;
    NODE_INIT_OUTPUT_PORT("IMAGE"_sv, imageProps);
    NODE_INIT_OUTPUT_PORT("IMAGE_FOVEAL"_sv, imageProps);
    bool refSignal{};
    NODE_INIT_OUTPUT_PORT("FOVEAL_SIGNAL"_sv, refSignal);
}

Pass Implementation

A pass is the real meat of the execution abstraction model: it provides the scheduler with timing information about each task and guarantees that each task is atomic with respect to the processor type it runs on. This theoretically allows a scheduler to efficiently schedule parallel work. When creating a pass, the heart of it is the function passed into the NODE_REGISTER_PASS macro. This function typically captures a pointer to the node implementation (e.g., via the lambda's [this] capture) and should return a status. The user doesn't need to explicitly write setup and teardown pass functions; these are implemented in the background. The setup pass gets executed first every time the node is executed, and the teardown pass gets called at the end of execution. The teardown is where all the outputs are sent, which is done by calling the send method of each output port. Below is an example of the function passed into the NODE_REGISTER_PASS macro in the HelloWorldNode:

Process Pass Implementation

dwStatus HelloWorldNodeImpl::process()
{
    auto& outPort0 = NODE_GET_OUTPUT_PORT("VALUE_0"_sv);
    auto& outPort1 = NODE_GET_OUTPUT_PORT("VALUE_1"_sv);
    if (outPort0.isBufferAvailable() && outPort1.isBufferAvailable())
    {
        *outPort0.getBuffer() = m_port0Value++;
        DW_LOGD << "[Epoch " << m_epochCount << "] Sent value0 = " << *outPort0.getBuffer() << Logger::State::endl;
        outPort0.send();

        *outPort1.getBuffer() = m_port1Value--;
        DW_LOGD << "[Epoch " << m_epochCount << "] Sent value1 = " << *outPort1.getBuffer() << Logger::State::endl;
        outPort1.send();
    }
    DW_LOGD << "[Epoch " << m_epochCount++ << "] Greetings from HelloWorldNodeImpl: Hello " << m_params.name.c_str() << "!" << Logger::State::endl;
    return DW_SUCCESS;
}

Logging

A static LOG_TAG should also be provided to identify the node. This label is used for logging inside the framework and should be the same as the node name. For example, for HelloWorldNode it would be:

static constexpr char LOG_TAG[] = "HelloWorldNode";

Any message logged from within the node (e.g., the message of an exception) will be prefixed with the LOG_TAG, so it is sufficient for the message itself to follow the pattern <function name>: <message>. As an example, an exception could be thrown with the following message:

throw Exception(DW_NOT_IMPLEMENTED, "validate: not implemented");

Validation

The validate function is implemented by default in SimpleNodeT.hpp. The user can override this function with a custom validate implementation. A typical use of validate() is to verify that all required ports are bound to the appropriate channels. For example, a camera node may have processed-output and raw-output ports, but only one of them is required to be hooked up; its validate method can inspect the ports to make sure at least one is bound. In the ISPNode, we validate that the foveal output ports are bound when foveal processing is enabled:

dwStatus ISPNodeImpl::validate()
{
    dwStatus status = Base::validate();
    // Check foveal ports are bound when foveal enabled
    if (status == DW_SUCCESS &&
        (fovealEnabled() && (!NODE_GET_OUTPUT_PORT("IMAGE_FOVEAL"_sv).isBound() || !NODE_GET_OUTPUT_PORT("FOVEAL_SIGNAL"_sv).isBound())))
    {
        return DW_NOT_READY;
    }

    return status;
}

Reset

The reset function is implemented by default in SimpleNodeT.hpp. However, the user can override the reset function with a custom reset implementation:

dwStatus HelloWorldNodeImpl::reset()
{
    m_port0Value = 0;
    m_port1Value = 10000;
    return Base::reset();
}

Register custom node

To make the node discoverable by the framework, register it in the node's .cpp file:

#include <dwcgf/node/NodeFactory.hpp>
DW_REGISTER_NODE(dw::framework::HelloWorldNode)

JSON description

Node JSON

  • library: The basename of the shared library containing the node. Omitting this key indicates that the node has no implementation. The value ‘static’ indicates that the node is part of a statically linked library rather than a dynamically loaded shared library
  • name: The fully qualified C++ type name of the node class
  • inputPorts: The input ports. The order is user defined and matches the order in the C++ code if applicable
  • outputPorts: The output ports. The order is user defined and matches the order in the C++ code if applicable
  • parameters: The parameters. The order is user defined and matches the order in the C++ code if applicable
  • passes: The passes. The order is user defined and matches the order in the C++ code if applicable. In the default case where all passes are sequential, this means the passes must follow topological order
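
As an illustration only, using the keys listed above, a node description for the HelloWorldNode discussed earlier could look roughly like the following (the key spellings inside passes, e.g. processorTypes, and the port/parameter layout are assumptions and may differ from the shipped HelloWorldNode.node.json):

{
    "name": "dw::framework::HelloWorldNode",
    "library": "libcgf_custom_nodes.so",
    "inputPorts": {},
    "outputPorts": {
        "VALUE_0": { "type": "int" },
        "VALUE_1": { "type": "int" }
    },
    "parameters": {
        "name": { "type": "std::string" }
    },
    "passes": [
        { "name": "SETUP", "processorTypes": ["CPU"] },
        { "name": "PROCESS", "processorTypes": ["CPU"] },
        { "name": "TEARDOWN", "processorTypes": ["CPU"] }
    ]
}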

Graphlet JSON

  • name: The unique name of the graphlet
  • inputPorts: The input ports. The order is user defined
  • outputPorts: The output ports. The order is user defined
  • parameters: The parameters. The order is user defined
  • subcomponents: The subcomponents. The order should be alphabetical
  • connections: The connections. The order should be alphabetical based on the source and the connection parameter names
    • type: Specifies the channel type connected to the port. Currently NvSciStream, socket, or localshmem are supported. The default value is localshmem
    • singleton: true/false. Requests a single buffer on the producer side. A read/write lock is enforced on that buffer so that there is no conflict between reading and writing. The singleton flag is NOT supposed to be used together with mailbox
    • mailbox: true/false. Identifies mailbox channels in the graph, which is useful for STM schedule table generation. A channel with mailbox=true does not impose scheduling dependencies between its producer and consumer. It is usually specified when the producer and consumer are not in the same epoch
    • reuse: true/false. Only applicable when mailbox is true; marks the mailbox channel as a reuse mailbox channel. The packet in a reuse mailbox is reused by the consumer if no new packet arrives from the producer side
    • fifo-size: An optional integer setting the FIFO size of the channel; defaults to 1 if not set
    • indirect: true/false. If set to true, the downstream consumer requires the upstream data of the previous frame
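
As an illustration only, and assuming these channel parameters are placed in the per-destination object of a connection entry (the exact placement of connection-level versus per-destination parameters may differ in the shipped schema), a connection using some of the flags above might look like:

{ "src": "helloworld.VALUE_0", "dests": { "sum.VALUE_0": { "mailbox": true, "reuse": true, "fifo-size": 4 } } }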

Application JSON

  • name: application name
  • logSpec: application log name
  • parameters: application parameters
  • requiredSensors: The required sensors JSON file
  • subcomponents: The subcomponents (same as for graphlets).
  • connections: The connections (same as for graphlets).
  • states: Commonly a single default state which maps to the STM schedule.
  • stmSchedules: Scheduling information as described in the next subsection.
  • processes: OS processes to be instantiated
    • executable: executable file
    • argv: command line arguments of the process, e.g. for the STM master process:
      • --log: STM master log name
      • --soc: Specifies which Tegra the STM master runs on
      • -m: true = no skip on hyperepoch overrun
      • -v: verbose mode
    • logSpec: log of process
    • subcomponents: list of subcomponents if the process is a LoaderLite
  • extraInfo: ExtraInfo JSON file
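
As a purely illustrative sketch built from the keys listed above (the values and the exact nesting are assumptions, not copied from the shipped CGFDemo.app.json), a processes entry might look like:

"processes": {
    "stm_master": {
        "executable": "stm_master",
        "logSpec": "stm_master.log",
        "argv": {
            "--log": "stm_master.log",
            "--soc": "TegraA",
            "-v": true
        }
    }
}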

Schedule Configuration

  • wcet: Worst Case Execution Time (WCET) YAML file
  • hyperepochs: A hyperepoch is a resource partition that runs a fixed configuration of epochs that share the resources in that partition. It is periodic in nature, and it respawns the contained epochs at the specified period
    • period: The period at which the hyperepoch respawns its epochs
    • epochs: A periodic timebase which dictates the rate at which a sub-graph of passes is respawned. It is confined to the boundaries of the hyperepoch
      • period: The period specified for the epoch dictates the rate at which a frame of passes is spawned, up to the number of frames specified, within the hyperepoch's period
      • frames: The number of frames each epoch spawns within the specified hyperepoch period. The epoch period specifies the time by which consecutive frames are separated
      • passes: A pass is an atomic unit of work, generally work that can be encompassed by a single function running on a single engine. Passes can have dependencies on other passes
    • resources: Any hardware engine (e.g. CPU, GPU, etc.) or software resource (e.g. CUDA streams, scheduling mutexes, etc.) shared by multiple passes that needs to be modeled by the STM compiler
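
To make the nesting concrete, a heavily simplified and purely illustrative hyperepoch sketch (names and values are assumptions; the pass list follows the renderEpoch example shown later in this document) could look like:

"hyperepochs": {
    "renderHyperepoch": {
        "period": 33000000,
        "epochs": {
            "renderEpoch": {
                "period": 33000000,
                "frames": 1,
                "passes": [ "cgfDemo.helloworld", "cgfDemo.sum" ]
            }
        },
        "resources": { "CPU0": [] }
    }
}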

SensorLayout JSON

  • sensor layout configuration

Steps to create a compute graph framework application

  1. Describe node interfaces in JSON format (demo JSON files provided in directories src/dwframework/dwnodes as well as src/cgf/nodes)
    1. External interface: input/output ports, parameters
    2. Internal / scheduling interface: passes, meta data relevant for scheduling
    3. Node description JSON files are included in directory src/cgf/graphs/descriptions/nodes
  2. Describe graphlet interfaces in JSON format, composed of nodes and other graphlets (demo JSON files provided in directory src/cgf/graphs/descriptions/graphlets)
    1. External interface: input/output ports, parameters
    2. Internal / scheduling interface: nodes and nested graphlets
    3. Graphlet description JSON files are included in directory src/cgf/graphs/descriptions/graphlets
  3. Describe the complete compute graph in CGFDemo.graphlet.json (file provided in src/cgf/graphs/descriptions/graphlets) in JSON format, which is composed of nodes and graphlets
  4. (Optional, but recommended) Visualize the nodes, graphlets, DAGs
    1. Using the graphical editor to view the nodes/graphlets and their connections with CGFDemo.graphlet.json
    2. GUI tool release is included in a separate tar file. Please refer to README.md for usage information
  5. Describe the application specific metadata (e.g. parameters, epochs, hyperepochs) in JSON format
    1. Application description JSON files are included in the directory src/cgf/graphs/descriptions/systems. Please refer to the previous section for these metadata descriptions
  6. (Optional) Adding a custom node to the demo
    1. Using the provided tool in tools/descriptionScheduleYamlGenerator, convert the application description JSON files into a YAML file for the STM compiler

      command: ./descriptionScheduleYamlGenerator.py --graph CGFDemo.app.json --output CGFDemo.yaml

    2. Using the stmcompiler tool from the STM package, the .stm binary file can be generated from the YAML input

      command: ./stmcompiler -i CGFDemo.yaml -o CGFDemo.stm

    3. Fine-tune performance in the schedule table. Performance tuning and architecture analysis tools will be released in a future release
    4. If running with an added custom node, copy the new STM binary, CGFDemo.stm, into the cgf/graphs directory on DDPX. In addition, copy the custom node directory and the updated JSON files, such as CGFDemo.graphlet.json and CGFDemo.schedule.json, onto DDPX
  7. Once set up, the demo can be launched with the command sudo ./run_cgf.sh

Some demo components are released in binary form. Please refer to the description of each binary:

  • launcher: The launcher parses the application description and launches the loader, the STM master, and the SSM master
  • LoaderLite: LoaderLite takes in the application descriptions and instantiates system-wide parameters and handles. It also instantiates the corresponding dwGraphlet, instantiates the SSMClone and STM client, registers passes, and enters them into the scheduler
  • sensor_sync_server: Sensor sync server
  • descriptionScheduleYamlGenerator: A tool that converts application description files into YAML format for STM compilation
  • stm_master: Precompiled STM (System Task Manager)
  • vanillassm: Precompiled SSM (System State Manager)

Integrate a custom node JSON into the application

With the node structure explained above, we can now create a custom node based on the HelloWorld and Sum sample code provided. Follow the steps below to add the HelloWorld and Sum sample nodes to the demo compute graph:

  1. Set up your cross compile environment for DriveWorks sample based on the DriveWorks SDK documentation
  2. After building the samples, libcgf_custom_nodes.so will be generated under the build/src/cgf_node directory
  3. Copy libcgf_custom_nodes.so into DDPX targets/linux-amd64-ubuntu/lib directory
  4. Copy HelloWorldNode.node.json and SumNode.node.json into driveworks/src/cgf/nodes folder
  5. In CGFDemo.graphlet.json, HelloWorld and Sum nodes can be added in these three sections:
    1. Under subcomponents:
      "helloworld": {
          "componentType": "../../../../../nodes/HelloWorldNode.node.json"
          "parameters": { "name": "$name" }
      },
      "sum": {
          "componentType": "../../../nodes/SumNode.node.json"
      }
      
    2. Under parameters, add HelloWorld name parameter:
      "name": { "type": "std::string", "default": "Demo" }
      
    3. Under connections:
      { "src": "helloworld.VALUE_0", "dests": {"sum.VALUE_0": {}} },
      { "src": "helloworld.VALUE_1", "dests": {"sum.VALUE_1": {}} },
      
  6. In CGFDemo.app.json, add HelloWorld and Sum:

    "renderEpoch": { "passes": [ ... "cgfDemo.arender", "cgfDemo.helloworld", "cgfDEmo.sum" ] }

  7. To create a new STM binary, first convert the .app.json into YAML with the descriptionScheduleYamlGenerator tool. Using descriptionScheduleYamlGenerator.py in the tools folder, use command: ./descriptionScheduleYamlGenerator.py --graph CGFDemo.app.json --output CGFDemo.yaml
  8. Now generate the STM binary with the stmcompiler tool. Using stmcompiler in the tools folder, use command: ./stmcompiler -i CGFDemo.yaml -o CGFDemo.stm
  9. Replace the existing CGFDemo.stm in the src/cgf/graphs folder with the newly generated one

Verify the functionality of custom node

To quickly verify that the node has been added to the framework successfully, log prints can be added in the C++ implementation. An example of this can be found in the HelloWorld node.
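
For instance, a minimal sketch of such a verification print inside one of the node's pass functions (reusing the DW_LOGD logger shown earlier) could be:

// Hypothetical verification print inside a pass function of the custom node
DW_LOGD << "HelloWorldNodeImpl: pass executed" << Logger::State::endl;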