Chip to Chip Communication

The NVIDIA Software Communication Interface for Chip-To-Chip over direct PCIe connection (NvSciC2cPcie) provides the ability for user applications to exchange data across two NVIDIA DRIVE AGX Orin Devkits interconnected over a direct PCIe connection. The direct PCIe connection is between one NVIDIA DRIVE AGX Orin Devkit acting as a PCIe Root Port and the other NVIDIA DRIVE AGX Orin Devkit acting as a PCIe Endpoint.

Supported Platform Configurations

Platform
  • NVIDIA DRIVE AGX Orin Devkit
SoC
  • NVIDIA DRIVE Orin as PCIe Root Port
  • NVIDIA DRIVE Orin as PCIe Endpoint
Topology
  • NVIDIA DRIVE AGX Orin Devkit as PCIe Root Port <> NVIDIA DRIVE AGX Orin Devkit as PCIe Endpoint

Platform Setup

The following platform configurations are required for NvSciC2cPcie communication.

  • miniSAS Port-A of NVIDIA DRIVE AGX Orin Devkit 1 connected to miniSAS Port-B of NVIDIA DRIVE AGX Orin Devkit 2 with a PCIe miniSAS cable.
  • The PCIe controllers of the two NVIDIA DRIVE AGX Orin Devkits, when interconnected back-to-back, have PCIe re-timers in the path; the PCIe re-timer firmware must be flashed for the appropriate PCIe lane configuration.

Execution Setup

Linux Kernel Module Insertion

NvSciC2cPcie runs only on select platforms: the NVIDIA DRIVE AGX Orin Devkit. Before user applications can exercise the NvSciC2cPcie interface, you must insert the Linux kernel modules for NvSciC2cPcie; they are not loaded by default on DRIVE OS Linux boot. To insert the required Linux kernel modules:

  • On the NVIDIA DRIVE AGX Orin Devkit configured as the PCIe Root Port (miniSAS cable connected to miniSAS Port-A):

sudo modprobe nvscic2c-pcie-epc

  • On the NVIDIA DRIVE AGX Orin Devkit configured as the PCIe Endpoint (miniSAS cable connected to miniSAS Port-B):

sudo modprobe nvscic2c-pcie-epf
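
As an optional sanity check (a suggestion, not part of the official procedure), you can confirm on each NVIDIA DRIVE AGX Orin Devkit that the corresponding module is loaded before proceeding. Note that lsmod reports module names with underscores:

lsmod | grep nvscic2c

On the PCIe Root Port Devkit the output should list nvscic2c_pcie_epc, and on the PCIe Endpoint Devkit it should list nvscic2c_pcie_epf.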

PCIe Hot-Plug

Once the modules are loaded, the NVIDIA DRIVE AGX Orin Devkit enabled as the PCIe Endpoint is hot-plugged and enumerated as a PCIe device by the NVIDIA DRIVE AGX Orin Devkit configured as the PCIe Root Port (miniSAS cable connected to miniSAS Port-A). The following must be executed on the NVIDIA DRIVE AGX Orin Devkit configured as the PCIe Endpoint (miniSAS cable connected to miniSAS Port-B):

sudo -s
cd /sys/kernel/config/pci_ep/
mkdir functions/nvscic2c_epf_22CC/func
echo 0x10DE > functions/nvscic2c_epf_22CC/func/vendorid
echo 0x22CC > functions/nvscic2c_epf_22CC/func/deviceid
ln -s functions/nvscic2c_epf_22CC/func controllers/141c0000.pcie_ep
echo 0 > controllers/141c0000.pcie_ep/start
echo 1 > controllers/141c0000.pcie_ep/start
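
As an optional, hedged sanity check (assuming the lspci utility is available on the target), the NVIDIA DRIVE AGX Orin Devkit configured as the PCIe Root Port should now enumerate the endpoint function with the vendor and device IDs programmed above (0x10DE/0x22CC):

lspci -d 10de:22cc

If no device is listed, re-check the miniSAS cabling and the previous steps.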

The previous steps, including the Linux kernel module insertion, can be added as a Linux systemd service to make the NvSciC2cPcie software automatically available at boot.
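
For example, a minimal sketch for the PCIe Endpoint Devkit is shown below; the script path and any unit naming or ordering are assumptions for illustration and must be adapted to your system (a similar script for the PCIe Root Port Devkit would only need the modprobe nvscic2c-pcie-epc step):

#!/bin/bash
# Hypothetical helper script (for example, /usr/local/bin/nvscic2c-epf-setup.sh)
# that repeats the documented module insertion and hot-plug steps so it can be
# invoked from a systemd oneshot service at boot.
modprobe nvscic2c-pcie-epf
cd /sys/kernel/config/pci_ep/
mkdir functions/nvscic2c_epf_22CC/func
echo 0x10DE > functions/nvscic2c_epf_22CC/func/vendorid
echo 0x22CC > functions/nvscic2c_epf_22CC/func/deviceid
ln -s functions/nvscic2c_epf_22CC/func controllers/141c0000.pcie_ep
echo 0 > controllers/141c0000.pcie_ep/start
echo 1 > controllers/141c0000.pcie_ep/start

Such a script can then be referenced from the ExecStart of a systemd oneshot unit and enabled with systemctl enable so that it runs on every boot.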

NvSciIpc (INTER_CHIP, PCIe) Channels

Once the Linux kernel module insertion and the PCIe hot-plug complete successfully, the following NvSciIpc channels are available for use with NvStreams producer or consumer applications.

NVIDIA DRIVE AGX Orin Devkit as PCIe Root Port    NVIDIA DRIVE AGX Orin Devkit as PCIe Endpoint
nvscic2c_pcie_s0_c5_1                             nvscic2c_pcie_s0_c6_1
nvscic2c_pcie_s0_c5_2                             nvscic2c_pcie_s0_c6_2
nvscic2c_pcie_s0_c5_3                             nvscic2c_pcie_s0_c6_3
nvscic2c_pcie_s0_c5_4                             nvscic2c_pcie_s0_c6_4
nvscic2c_pcie_s0_c5_5                             nvscic2c_pcie_s0_c6_5
nvscic2c_pcie_s0_c5_6                             nvscic2c_pcie_s0_c6_6
nvscic2c_pcie_s0_c5_7                             nvscic2c_pcie_s0_c6_7
nvscic2c_pcie_s0_c5_8                             nvscic2c_pcie_s0_c6_8
nvscic2c_pcie_s0_c5_9                             nvscic2c_pcie_s0_c6_9
nvscic2c_pcie_s0_c5_10                            nvscic2c_pcie_s0_c6_10
nvscic2c_pcie_s0_c5_11                            nvscic2c_pcie_s0_c6_11

NvSciIpc (INTER_CHIP, PCIe) channel names can be modified as needed.

If the user application on the NVIDIA DRIVE AGX Orin Devkit as PCIe Root Port opens nvscic2c_pcie_s0_c5_1, the peer user application on the other NVIDIA DRIVE AGX Orin Devkit as PCIe Endpoint must open nvscic2c_pcie_s0_c6_1 to exchange data across the SoCs. The same pairing applies to the remaining channels listed above.

By default, each NvSciIpc (INTER_CHIP, PCIe) channel is configured with 16 frames of 32 KB each.
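
As a hedged way to see which INTER_CHIP channels are registered on a given Devkit, you can list the entries of the NvSciIpc configuration file referenced later in this section (assuming the default /etc/nvsciipc.cfg on the target):

grep INTER_CHIP /etc/nvsciipc.cfg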

Reconfiguration

The following reconfiguration information is based on the default NvSciC2cPcie support offered for NVIDIA DRIVE AGX Orin Devkit.

A different platform, or a different PCIe controller configuration on the same NVIDIA DRIVE AGX Orin Devkit, requires adding a new set of device-tree node entries for NvSciC2cPcie on the PCIe Root Port (nvidia,tegra-nvscic2c-pcie-epc) and the PCIe Endpoint (nvidia,tegra-nvscic2c-pcie-epf). For example, changing the PCIe controller ID, or changing the role of a PCIe controller from PCIe Root Port to PCIe Endpoint (or vice versa) relative to the default NVIDIA DRIVE AGX Orin Devkit configuration, is possible through device-tree node changes or additions, but these changes are not straightforward to document exhaustively. They are one-time changes and can be made in coordination with your NVIDIA point of contact.

BAR Size

The BAR size for NVIDIA DRIVE AGX Orin as the PCIe Endpoint is configured to 1 GB by default. When required, this can be reduced or increased by modifying the nvidia,bar-win-size property of the device-tree node nvscic2c-pcie-s0-c6-epf.

file: <PDK_TOP>/drive-foundation/platform-config/hardware/nvidia/platform/t23x/automotive/kernel-dts/p3710/common/tegra234-dcb-p3710-0010.dtsi/tegra234-p3710-0010-nvscic2c-pcie.dtsi

nvscic2c-pcie-s0-c6-epf {
	compatible = "nvidia,tegra-nvscic2c-pcie-epf";
--	nvidia,bar-win-size = <0x40000000>;  /* 1GB. */
++	nvidia,bar-win-size = <0x20000000>;  /* 512MB. */
};
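
Once the modified device tree is in use and the PCIe Endpoint is hot-plugged again, one hedged way to confirm the effective BAR size is to inspect the enumerated endpoint function from the PCIe Root Port Devkit (assuming lspci is available; 10de:22cc are the IDs assigned during the hot-plug step):

sudo lspci -vv -d 10de:22cc | grep -i region

The reported memory region size should match the configured nvidia,bar-win-size.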

NvSciIpc (INTER_CHIP, PCIe) Channel Properties

The NvSciIpc (INTER_CHIP, PCIe) channel properties can be modified on a use-case basis.

Modify Channel Properties

To change the channel properties (number of frames and frame size), the change must be made for both the NVIDIA DRIVE AGX Orin Devkit as PCIe Root Port and the NVIDIA DRIVE AGX Orin Devkit as PCIe Endpoint, in the device-tree nodes nvscic2c-pcie-s0-c5-epc and nvscic2c-pcie-s0-c6-epf, respectively.

file: <PDK_TOP>/drive-foundation/platform-config/hardware/nvidia/platform/t23x/automotive/kernel-dts/p3710/common/tegra234-dcb-p3710-0010.dtsi/tegra234-p3710-0010-nvscic2c-pcie.dtsi

The following illustrates a change in the number of frames for the NvSciIpc (INTER_CHIP, PCIe) channel pair nvscic2c_pcie_s0_c5_2 (PCIe Root Port) and nvscic2c_pcie_s0_c6_2 (PCIe Endpoint):

nvscic2c-pcie-s0-c5-epc {
	nvidia,endpoint-db =
	"nvscic2c_pcie_s0_c5_1,     16,     00032768",
--	"nvscic2c_pcie_s0_c5_2,     16,     00032768",
++	"nvscic2c_pcie_s0_c5_2,     08,     00032768",
	"nvscic2c_pcie_s0_c5_3,     16,     00032768",
	…..
};
nvscic2c-pcie-s0-c6-epf {
	nvidia,endpoint-db =
	"nvscic2c_pcie_s0_c6_1,     16,     00032768",
--	"nvscic2c_pcie_s0_c6_2,     16,     00032768",
++	"nvscic2c_pcie_s0_c6_2,     08,     00032768",
	"nvscic2c_pcie_s0_c6_3,     16,     00032768",
	…..
};

The following illustrates a change in frame size for the NvSciIpc (INTER_CHIP, PCIe) channel pair nvscic2c_pcie_s0_c5_2 (PCIe Root Port) and nvscic2c_pcie_s0_c6_2 (PCIe Endpoint):

nvscic2c-pcie-s0-c5-epc {
	nvidia,endpoint-db =
	"nvscic2c_pcie_s0_c5_1,     16,     00032768",
--	"nvscic2c_pcie_s0_c5_2,     16,     00032768",
++	"nvscic2c_pcie_s0_c5_2,     16,     00028672",
	"nvscic2c_pcie_s0_c5_3,     16,     00032768",
	…..
};
nvscic2c-pcie-s0-c6-epf {
	nvidia,endpoint-db =
	"nvscic2c_pcie_s0_c6_1,     16,     00032768",
--	"nvscic2c_pcie_s0_c6_2,     16,     00032768",
++	"nvscic2c_pcie_s0_c6_2,     16,     00028672",
	"nvscic2c_pcie_s0_c6_3,     16,     00032768",
	…..
};

New Channel Addition

To introduce additional NvSciIpc (INTER_CHIP, PCIe) channels, the change must be made for both the NVIDIA DRIVE AGX Orin Devkit as PCIe Root Port and the NVIDIA DRIVE AGX Orin Devkit as PCIe Endpoint, in the device-tree nodes nvscic2c-pcie-s0-c5-epc and nvscic2c-pcie-s0-c6-epf, respectively.

File: <PDK_TOP>/drive-foundation/platform-config/hardware/nvidia/platform/t23x/automotive/kernel-dts/p3710/common/tegra234-dcb-p3710-0010.dtsi/tegra234-p3710-0010-nvscic2c-pcie.dtsi

nvscic2c-pcie-s0-c5-epc {
	nvidia,endpoint-db =
	"nvscic2c_pcie_s0_c5_1,     16,     00032768",
	……
	--  "nvscic2c_pcie_s0_c5_11,     16,     00032768";
	++ "nvscic2c_pcie_s0_c5_11,     16,     00032768",
	++ "nvscic2c_pcie_s0_c5_12,     16,     00032768";
};
nvscic2c-pcie-s0-c6-epf {
	nvidia,endpoint-db =
	"nvscic2c_pcie_s0_c6_1,     16,     00032768",
	……
	--  "nvscic2c_pcie_s0_c6_11,     16,     00032768";
	++ "nvscic2c_pcie_s0_c6_11,     16,     00032768",
	++ "nvscic2c_pcie_s0_c6_12,     16,     00032768";
};

File: /etc/nvsciipc.cfg (on target)

INTER_CHIP      nvscic2c_pcie_s0_c5_11   0000
++ INTER_CHIP      nvscic2c_pcie_s0_c5_12   0000
…..
INTER_CHIP      nvscic2c_pcie_s0_c6_11   0000
++ INTER_CHIP      nvscic2c_pcie_s0_c6_12   0000

Changes can also be made to reduce the number of existing NvSciIpc (INTER_CHIP, PCIe) channels or to remove any of them.

For a given pair of NVIDIA DRIVE AGX Orin Devkit as PCIe Root Port and NVIDIA DRIVE AGX Orin Devkit as PCIe Endpoint, a maximum of 16 NvSciIpc (INTER_CHIP, PCIe) channels is supported.

PCIe Hot-Unplug

To tear down the connection between the PCIe Root Port and the PCIe Endpoint, you must PCIe hot-unplug the PCIe Endpoint from the PCIe Root Port. Refer to the Restrictions section for more information.

PCIe hot-unplug is always executed from the PCIe Endpoint [NVIDIA DRIVE AGX Orin Devkit (miniSAS cable connected to miniSAS Port-B)] by initiating the power-down of the PCIe Endpoint controller and subsequently unbinding the nvscic2c-pcie-epf module from the PCIe Endpoint.

Prerequisite: PCIe hot-unplug must be attempted only when the PCIe Endpoint is successfully hot-plugged into the PCIe Root Port and the NvSciIpc (INTER_CHIP, PCIe) channels are enumerated.

To PCIe hot-unplug, execute the following on the NVIDIA DRIVE AGX Orin Devkit configured as the PCIe Endpoint (miniSAS cable connected to miniSAS Port-B). This makes the NvSciIpc (INTER_CHIP, PCIe) channels disappear on both PCIe inter-connected NVIDIA DRIVE AGX Orin Devkits.

sudo -s
cd /sys/kernel/config/pci_ep/
echo 0 > controllers/141c0000.pcie_ep/start
unlink controllers/141c0000.pcie_ep/func

A successful PCIe hot-unplug of the PCIe Endpoint from the PCIe Root Port makes the NvSciIpc (INTER_CHIP, PCIe) channels listed above go away on both NVIDIA DRIVE AGX Orin Devkits, after which you can power-cycle or power off one or both of the NVIDIA DRIVE AGX Orin Devkits.
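
A hedged check that the hot-unplug took effect (assuming lspci is available) is that the endpoint function no longer enumerates on the NVIDIA DRIVE AGX Orin Devkit configured as the PCIe Root Port:

lspci -d 10de:22cc

The command should produce no output once the hot-unplug has completed.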

PCIe Hot-Replug

To re-establish the PCIe connection between the PCIe Endpoint and the PCIe Root Port, you must PCIe hot-replug the PCIe Endpoint to the PCIe Root Port.

If both SoCs were power-cycled after the earlier PCIe hot-unplug, you must follow the usual PCIe hot-plug steps. However, if only one of the two SoCs was power-cycled/rebooted, a PCIe hot-replug is required to re-establish the connection between them.

Note: NVIDIA DRIVE AGX Orin Devkits with PCIe retimer firmware (FW) have known issues that prevent PCIe hot-replug and PCIe hot-unplug without rebooting both PCIe inter-connected NVIDIA DRIVE AGX Orin Devkits. Therefore, for platforms that have PCIe retimers, you must power-cycle/reboot both NVIDIA DRIVE AGX Orin Devkits and re-establish the PCIe connection between them after a PCIe hot-unplug. See the Execution Setup section for connection information.

Prerequisite: PCIe hot-replug is attempted when one of the two SoCs is power-cycled/rebooted after a successful PCIe hot-unplug between them. If both SoCs were power-cycled/rebooted, then the same steps as listed in the Execution Setup section are required to establish the PCIe connection between them.

For platforms that do not have PCIe retimers, to achieve PCIe hot-replug after the connection was PCIe hot-unplugged and one NVIDIA DRIVE AGX Orin Devkit has since rebooted, execute the following on the NVIDIA DRIVE AGX Orin Devkit configured as the PCIe Endpoint (miniSAS cable connected to miniSAS Port-B). This makes the NvSciIpc (INTER_CHIP, PCIe) endpoints reappear on both PCIe inter-connected NVIDIA DRIVE AGX Orin Devkits.

Case 1: When only the PCIe Root Port SoC was power-recycled/rebooted

On the PCIe Root Port SoC [NVIDIA DRIVE AGX Orin Devkit (miniSAS cable connected to miniSAS Port-A)]:

Follow the same steps as listed in Linux Kernel Module Insertion under Execution Setup.

On the PCIe Endpoint SoC [NVIDIA DRIVE AGX Orin Devkit (miniSAS cable connected to miniSAS Port-B)]:

sudo -s
cd /sys/kernel/config/pci_ep/
ln -s functions/nvscic2c_epf_22CC/func controllers/141c0000.pcie_ep
echo 0 > controllers/141c0000.pcie_ep/start
echo 1 > controllers/141c0000.pcie_ep/start

Case 2: When only the PCIe Endpoint SoC was power-recycled/rebooted

On the PCIe Endpoint SoC [NVIDIA DRIVE AGX Orin Devkit (miniSAS cable connected to miniSAS Port-B)]:

Follow the same steps as listed in the Execution Setup section.

On the PCIe Root Port SoC [NVIDIA DRIVE AGX Orin Devkit (miniSAS cable connected to miniSAS Port-A)]:

Nothing is required. The module is already inserted.
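
In either case, once the hot-replug steps complete on the PCIe Endpoint SoC, re-enumeration can be confirmed from the PCIe Root Port Devkit with the same hedged check used after the initial hot-plug (assuming lspci is available on the target):

lspci -d 10de:22cc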

Assumptions

  • NVIDIA Software Communication Interface for Chip-To-Chip (NvSciC2cPcie) is offered only between the inter-connected DRIVE Orin SoC as PCIe Root Port and a DRIVE Orin SoC as PCIe Endpoint.
  • NVIDIA Software Communication Interface for Chip-To-Chip (NvSciC2cPcie) is offered from a single Guest OS Virtual Machine of a DRIVE Orin SoC as PCIe Root Port to a single Guest OS Virtual Machine of another DRIVE Orin SoC as PCIe Endpoint.
  • User applications are responsible for the steps to tear down an ongoing chip-to-chip transfer pipeline on all SoCs in a coordinated and graceful manner.
  • Out-of-the-box support is ensured for an NVIDIA DRIVE AGX Orin Devkit inter-connected with another NVIDIA DRIVE AGX Orin Devkit. The default configuration is PCIe controller C5 in PCIe Root Port mode and PCIe controller C6 in PCIe Endpoint mode. Any change in PCIe controller mode, or a move to another set of PCIe controllers for NvSciC2cPcie, requires changes in the tegra234-p3710-0010-nvscic2c-pcie.dtsi device-tree include file.
  • For steps to recompile changes in tegra234-p3710-0010-nvscic2c-pcie.dtsi (using the -d option), refer to Bind Partition Options.

Restrictions

  • Before powering-off/recycling one of the two PCIe inter-connected NVIDIA DRIVE AGX Orin Devkits when one NVIDIA DRIVE AGX Orin Devkit is PCIe hot-plugged into another NVIDIA DRIVE AGX Orin Devkit, you must tear down the PCIe connection between them (PCIe hot-unplug).
  • Before tearing down the PCIe connection between the two SoCs (PCIe hot-unplug), all applications or streaming pipelines on both SoCs that use the corresponding NvSciIpc (INTER_CHIP, PCIe) channels must exit or be purged. Before they exit or are purged, each in-use NvSciIpc (INTER_CHIP, PCIe) channel must be closed with NvSciIpcCloseEndpoint().
  • On the two PCIe inter-connected NVIDIA DRIVE AGX Orin Devkits, before closing a corresponding NvSciIpc (INTER_CHIP, PCIe) channel with NvSciIpcCloseEndpoint(), you must ensure the following for that channel:
    • No pipelined NvSciSync waits are pending.
    • All NvSciIpc (INTER_CHIP, PCIe) channel messages sent have been received.
    • All NvSciBuf and NvSciSync source and target handles, export and import handles, registered and CPU-mapped with the NvSciC2cPcie layer, must be unregistered and their mappings deleted from the NvSciC2cPcie layer by invoking the relevant NvSciC2cPcie programming interfaces.
  • Unloading of NvSciC2cPcie Linux kernel modules is not supported.
  • Error-handling of NvSciC2cPcie transfers leads to timeouts in the software layers exercising NvSciC2cPcie.
  • With SC-7 suspend and resume, NVIDIA Chip to Chip Communication must be established only after completing SC-7 suspend and SC-7 resume.
    • Subsequent SC-7 suspend and SC-7 resume cycles are not supported once NVIDIA Chip to Chip Communication is established.