Wireless networks are the backbone of modern connectivity, serving billions of 5G users through millions of cell sites globally. The opportunities and benefits of AI-RAN are driving the transformation of telecom networks and ecosystems toward AI-native wireless networks.
The mission is to create an intelligent network fabric that connects hundreds of billions of AI-powered endpoints such as smartphones, cameras, robots, and AI agents. This requires embedding AI into radio signal processing for X-factor performance and efficiency gains, and accelerating cell sites to serve AI traffic, bringing AI inference as close as possible to users.
AI-RAN makes this possible by evolving today's ASIC-based, single-purpose 5G/RAN-only systems into multi-purpose, commercial off-the-shelf (COTS) systems capable of hosting AI and RAN workloads, AI-for-RAN algorithms, and AI-on-RAN applications on the same platform. This transformation promises numerous benefits for telecom operators: enabling new AI services, generating new revenue, and improving network utilization, spectral efficiency, energy efficiency, and performance. AI-RAN also provides a future-proof path to 6G through a 100% software-defined approach leveraging virtualized RAN (vRAN).
For AI-RAN to become a reality in the field, it must support centralized and distributed scenarios depending on demand for AI and RAN capacity at a given location. For example, centralized RAN (C-RAN) may aggregate AI and 5G vRAN workloads from tens of cell sites requiring higher-capacity systems versus distributed RAN (D-RAN) deployed at a single cell site needing lower total capacity. The goal is to have a ubiquitous 5G vRAN software layer over a homogeneous accelerated infrastructure across any deployment scenario, to capture the full benefits of AI-RAN with ease of management and operation.
NVIDIA AI Aerial is set to power high-density AI-RAN deployments using NVIDIA Grace Hopper and NVIDIA Grace Blackwell-based Aerial RAN Computer systems suited for AI-centric scenarios.
The new NVIDIA Compact Aerial RAN Computer (ARC-Compact) extends these capabilities to the edge, enabling AI-RAN at individual cell sites where space and power are at a premium, and RAN-centric workloads dominate. Together, they support both centralized and distributed AI-RAN.
NVIDIA ARC-Compact: a low-power AI-RAN solution for cell sites
ARC-Compact is designed to enable distributed AI-RAN deployment scenarios. Combining power efficiency, GPU-accelerated radio processing, and high-performance vRAN, it uses the Arm ecosystem to transform cell sites into multifunctional 5G and AI hubs. It addresses the unique constraints of cell sites by providing optimal cell capacity within typical power and space limitations, while meeting form factor requirements.
ARC-Compact supports the same software-defined codebase across C-RAN and D-RAN, following the same first principle of being software-upgradable to 6G. It is built on the NVIDIA Grace CPU C1, featuring 72 high-performance, power-efficient Arm Neoverse V2 cores. A PCIe plug-in card with an NVIDIA L4 Tensor Core GPU accelerates radio functions or AI workloads, and high-speed Ethernet connectivity is provided by an NVIDIA ConnectX-7 network interface card (NIC).
ARC-Compact meets key requirements for distributed AI-RAN as outlined in Table 1, based on internal NVIDIA benchmarking.
| Requirement | ARC-Compact advantage |
| --- | --- |
| Energy efficiency | Gbps/watt comparable to traditional baseband systems today • leverages the low-power 72 W L4 GPU and power-efficient Arm CPU to keep total power consumption within the typical cell-site range |
| 5G vRAN performance | Supports a mix of TDD and FDD, standard 4TR and massive 64TR MIMO • up to 30 sector carriers with 25 Gbps system throughput • supports O-RAN 7-2 and 7-3 RU splits • supports both in-line and lookaside Layer 1 acceleration • embedded NVIDIA CUDA-accelerated RAN algorithms (cuPHY, cuMAC) for improved spectral efficiency and performance |
| AI processing | Advanced AI models and AI-for-RAN innovations, including neural networks and site-specific learning accelerated on NVIDIA GPUs • add-on NVIDIA GPU options for AI applications such as video search and summarization, cybersecurity, computer vision, and LLM inference |
| Environmental compliance | 2RU height, 17-inch half depth, ideal for cell-site cabinets • supports a -5 °C to +55 °C operating range • NEBS Level 3 (GR-63-CORE and GR-1089-CORE) compliant |
| Software upgradability | Fully software-defined and upgradeable to 6G |
ARC-Compact adoption and availability
- ARC-Compact is expected to become available through multiple OEM and ODM partners such as Foxconn, Lanner, Quanta Cloud Technology, and Supermicro, with whom we are partnering on the development of Grace CPU C1-based systems. We expect various configurations supporting distributed AI-RAN use cases to reach the market later this year.
- The joint AI-RAN Innovation Center partnership announced at T-Mobile's Capital Markets Day informed the development of the NVIDIA ARC-Compact solution for distributed AI-RAN deployments. The new solution will serve as the D-RAN reference architecture in the next phase of the AI-RAN collaboration.
- Vodafone is continuing its collaboration with NVIDIA and evaluating the Arm-based NVIDIA ARC-Compact solution for distributed AI-RAN, in line with its key OpenRAN objective of delivering higher-performing, more efficient distributed compute on edge-optimized, short-depth servers.
- Nokia received seed systems of NVIDIA ARC-Compact as part of the early access program and is testing its 5G Cloud RAN software on them; early benchmarking shows the suitability of ARC-Compact for distributed RAN deployment scenarios. This furthers the ongoing collaboration between Nokia and NVIDIA on AI-RAN.
- Samsung is expanding its AI-RAN collaboration with NVIDIA to include integration of its 5G vRAN with NVIDIA ARC-Compact for distributed AI-RAN solutions. Last year, Samsung completed a proof of concept verifying seamless integration between its vRAN software and NVIDIA L4 GPUs, demonstrating enhanced network performance and efficiency. Samsung is now evaluating its vRAN software with the NVIDIA Grace C1 and NVIDIA L4 Tensor Core GPUs to accelerate additional AI workloads, including AI/ML algorithms, for further performance and efficiency gains.
NVIDIA has been at the forefront of AI-RAN solutions with its AI Aerial portfolio. The existing Aerial RAN Computer systems are already part of AI-RAN engagements with customers such as Indosat Ooredoo Hutchison, SoftBank, and T-Mobile, and solution partners such as Capgemini, Fujitsu, Kyocera, and SynaXG. With ARC-Compact, the portfolio now includes both high-density and low-density systems to cater to AI-centric, RAN-centric, or even AI-only and RAN-only modes. This enables a homogeneous software and hardware architecture across centralized and distributed AI-RAN deployment scenarios, a key requirement for operators building AI-RAN networks.
ARC-Compact key building blocks

ARC-Compact is designed to efficiently process 5G vRAN and AI workloads, leveraging the following hardware and software components.
NVIDIA Grace CPU
The NVIDIA Grace CPU is designed for modern data centers running AI, vRAN, cloud, edge, and high-performance computing applications. It provides 2x the energy efficiency of today’s leading server processors. NVIDIA Grace architecture is fully compatible with the Arm ecosystem, ensuring that any application designed for Arm in data centers will operate seamlessly on Grace, and vice versa, giving telecom operators the needed supplier diversity for their vRAN deployments.
NVIDIA L4 Tensor Core GPU
The L4 Tensor Core GPU PCIe plug-in card provides a cost-effective, energy-efficient solution for high-throughput, low-latency AI workloads and RAN acceleration. It supports FP8 precision on its Tensor Cores and includes 24 GB of GPU memory, delivering up to 485 teraFLOPS of performance. This suits edge AI workloads such as video search and summarization and vision language models, in addition to radio Layer 1 and some Layer 2 functions such as scheduling. L4 delivers up to 120x higher AI video performance than CPU-based solutions, in a low-profile form factor operating within a 72 W TDP low-power envelope.
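As a back-of-the-envelope check on the power story, the figures quoted above imply the following peak efficiency. This is a sketch using only the datasheet numbers in this post; sustained throughput under a real RAN or AI workload will differ.

```python
# Rough efficiency estimate from the L4 figures quoted above.
# These are peak datasheet numbers, not sustained workload measurements.
L4_PEAK_TFLOPS = 485   # peak Tensor Core throughput (teraFLOPS)
L4_TDP_WATTS = 72      # thermal design power (watts)

tflops_per_watt = L4_PEAK_TFLOPS / L4_TDP_WATTS
print(f"{tflops_per_watt:.1f} peak TFLOPS per watt")  # ~6.7
```

Roughly 6.7 peak TFLOPS per watt is what allows the card to fit inside the power budget of a typical cell-site cabinet.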
NVIDIA ConnectX-7
NVIDIA ConnectX-7 provides high-speed, low-latency Ethernet connectivity supporting fronthaul, midhaul, or backhaul, and can also route AI traffic or provide advanced offloads. It offers up to four ports and up to 400 Gbps of total throughput, and delivers hardware-accelerated networking, storage, security, and manageability services at data center scale for telecommunications, including in-line hardware acceleration for Transport Layer Security (TLS), IP Security (IPsec), and MAC Security (MACsec).
Software architecture
ARC-Compact leverages both the CPU and the GPU to deliver 5G vRAN, based on the NVIDIA Aerial CUDA-Accelerated RAN software implementation. Table 2 shows the compute used for the various RAN functions and AI workloads supported by ARC-Compact.
| Workloads | Grace CPU C1 | L4 GPU |
| --- | --- | --- |
| RAN and CNFs | FDD Layer 1 • TDD and FDD Layer 2+ • distributed UPF • RIC applications | TDD Layer 1 for standard and massive MIMO • embedded AI/ML algorithms in Layer 1 and some Layer 2 functions such as the accelerated scheduler (cuPHY, cuMAC) |
| AI | N/A | Advanced AI models, such as neural network models for channel estimation • LLM and VLM inference, video search and summarization, cybersecurity, smart city/IoT AI apps, computer vision, and more • RIC applications |
Core benefits for telecom service providers
For telecom service providers looking to deploy distributed AI-RAN at cell sites, NVIDIA ARC-Compact offers a low-power, compact, and economical solution for delivering high-performance 5G and AI inferencing. The key benefits include:
- Energy efficient solution for cell sites: ARC-Compact delivers both high-performance 5G vRAN and AI workloads within a low power envelope, enabling sustainable, cost-effective distributed AI-RAN deployments.
- AI-powered radio and edge innovation: By integrating PCIe plug-in L4 GPUs, it enables advanced AI/ML algorithms for radio signal processing, improving spectral efficiency and network utilization while unlocking new AI-driven services at the edge.
- Leverages the Arm ecosystem for flexibility and diversity: Built on the NVIDIA Grace C1 CPU with Arm cores, ARC-Compact enables telecom operators to benefit from the growing Arm ecosystem and diversify their supplier base for vRAN solutions.
- Homogeneous, software-defined vRAN across all deployments: ARC-Compact runs the same 5G vRAN software as centralized AI-RAN sites, enabling a unified, fully software-defined network that is easily upgradeable to 6G and simplifies management irrespective of deployment scenario.
- Optimized for real-world cell site requirements: With a compact form factor, broad temperature range, and compliance with global telecom standards, it is purpose-built for AI-RAN deployment at the edge on a single platform.
The system can be configured flexibly to support various AI-RAN use cases, including:
- RAN-centric or RAN-only: Expected to be the primary use case for most distributed deployments, served by a single Grace C1 CPU and a single L4 GPU.
- AI-centric: Primarily uses the Grace CPU for RAN (for example, FDD) and dedicates the L4 GPU to AI or visual processing applications.
- RAN- and AI-centric: Adds a second L4 GPU dedicated to AI or visual processing, while the single C1 and first L4 simultaneously handle high-end RAN workloads.
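The three modes above can be summarized as a simple mapping from mode to hardware allocation. The sketch below is illustrative only; the mode names and field names are hypothetical and do not correspond to an NVIDIA API or configuration format.

```python
# Illustrative mapping of the three ARC-Compact configuration modes
# described above. Mode and field names are hypothetical.
ARC_COMPACT_MODES = {
    "ran_centric": {  # primary distributed use case
        "grace_c1_cpus": 1, "l4_gpus": 1,
        "gpu_role": "RAN acceleration (TDD Layer 1, cuPHY/cuMAC)",
    },
    "ai_centric": {   # RAN (e.g. FDD) stays on the CPU
        "grace_c1_cpus": 1, "l4_gpus": 1,
        "gpu_role": "AI or visual processing",
    },
    "ran_and_ai_centric": {  # second L4 dedicated to AI
        "grace_c1_cpus": 1, "l4_gpus": 2,
        "gpu_role": "one GPU for RAN, one for AI or visual processing",
    },
}

for mode, cfg in ARC_COMPACT_MODES.items():
    print(f"{mode}: {cfg['grace_c1_cpus']}x C1, {cfg['l4_gpus']}x L4")
```

The point of the table is that all three modes share the same chassis and software; only the number of L4 cards and the role assigned to them changes.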
Conclusion: the AI-RAN catalyst

NVIDIA introduced the Aerial RAN Computer-1 in 2024, demonstrating AI-RAN benefits in an external field trial: 5x new revenue per dollar of CAPEX, 40% better energy efficiency versus best-in-class ASIC systems, and 3x better capacity utilization. This marked a turning point for AI-RAN technology and its ecosystem, and many customers and partners began to advance their AI-RAN goals.
NVIDIA ARC-Compact is the next catalyst for AI-RAN adoption, enabling telcos to deploy powerful, energy-efficient, and flexible AI-RAN solutions at every cell site. Coupled with Aerial RAN Computer-1, it completes the AI-RAN building blocks: a full-stack platform with scalable hardware, common software, and an open architecture to deliver high-performance AI-RAN together with ecosystem partners, across any deployment scenario.
Telcos also value the full NVIDIA AI Aerial portfolio, with three computer systems to train, simulate, and deploy AI-RAN. For example, the same NVIDIA Aerial CUDA-Accelerated RAN software version is implemented in both the Aerial Omniverse Digital Twin and the live Aerial RAN Computer, allowing customers to predictably simulate the performance of new AI models before deploying them in the field, and to continue fine-tuning them via a data loop.
As the industry accelerates toward AI-native wireless networks, NVIDIA AI Aerial provides the foundation for a new era of distributed intelligence, unlocking unprecedented efficiency, innovation, and value across the wireless landscape.
Learn more about AI-RAN in action from a panel of telecom leaders at NVIDIA GTC 2025.