Building Telco Reasoning Models for Autonomous Networks with NVIDIA NeMo

Autonomous networks are quickly becoming one of the top priorities in telecommunications. According to the latest NVIDIA State of AI in Telecommunications report, 65% of operators said AI is driving network automation, and 50% named autonomous networks as the top AI use case for ROI.

Yet many telcos still report gaps in AI and data science expertise. This makes it difficult to scale safe, closed-loop automation across complex, multidomain networks.

Most telecom network operations centers (NOCs) today operate using reactive, alarm-driven workflows. Engineers manually triage thousands of incidents across multiple tools, sift through a high volume of alarm and performance data, and stitch together fragmented dashboards and logs before applying a fix or dispatching a field team. NOCs are a natural starting point for autonomous networks, because they concentrate high-volume, repeatable tasks where AI can directly cut MTTR and OPEX.

Tech Mahindra, a leading global provider of technology consulting and digital solutions to enterprises across industries, and NVIDIA are collaborating to close this AI skills gap. They’re doing so by making autonomous network building blocks—open models, tools, and implementation guides—into assets telecom developers can readily adopt and adapt in their own environments.

This post outlines how to fine‑tune reasoning models with NVIDIA NeMo so they behave like NOC engineers, safely driving closed‑loop, self‑healing workflows. It shows how to:

Generate synthetic, telecom‑realistic incident data
Translate expert procedures into structured reasoning traces using the production-grade reference workflows. This teaches the model to coordinate tools, reason over network state, and execute fault‑management tasks end to end

The result is a repeatable method that telco teams can use to build their own specialized AI agents for network operations. These agents can perform triage, root‑cause analysis, and resolution for high‑volume incident classes, helping operators progress toward TM Forum Level 4 highly autonomous networks and beyond.

Why do network operations centers need reasoning models?

Traditional NOC automation is mostly rule‑based and open‑loop: scripts trigger on fixed conditions but struggle with noisy signals, cross‑domain dependencies, and constantly changing network behavior. As a result, many Level 1 and Level 2 tasks—triage, root‑cause analysis, validation after a change—still depend on manual effort, keeping MTTR high and limiting how far operators can move toward truly autonomous operations.

Diagram comparing a traditional NOC where a human engineer handles alarms and technician requests with an AI-driven workflow where an AI agent powered by a reasoning model sits between technician requests, topology data, and the NOC to automate alarm validation and resolution. — *Figure 1. Shifting from manual NOC alarm handling to a reasoning agent embedded in the NOC workflow*

A telco reasoning model becomes the engine for an AI agent that can take on this work pattern in a controlled, auditable way. Instead of hard‑coded runbooks and point scripts, the agent uses the model to interpret incidents, decide which tools to call, and adapt its actions based on live responses. Key features include:

AI reasoning plus tool-calling: Replaces manual alarm triage by invoking NOC tools for validation, root‑cause analysis, and remediation across existing systems
End-to-end automation: Handles alarm validation, RCA, and healing for various incident types such as outages, flaps, congestion, and configuration issues
Noise reduction: Filters self‑clearing or low‑value alarms using historical patterns so engineers can focus on higher priorities
Resolution in seconds, not hours: Shrinks resolution time for high‑volume, well‑understood incidents from hours to seconds, significantly reducing MTTR

The outcome is a closed‑loop, self‑healing network. Specialized NOC agents handle routine triage and resolution, and engineers shift from reactive alarm handling to proactive optimization and complex problem-solving.

Designing a telco reasoning pipeline

The technical approach to this solution combines the following components into one reproducible pipeline:

Synthetic incident data
Expert NOC procedures
Structured reasoning traces
Supervised fine‑tuning
Evaluation

Instead of trying to learn from raw logs and alarms directly, the model is trained on curated examples that show how an experienced engineer would analyze an incident, call tools, and decide when a fix is complete.

Diagram of a three‑stage agent training pipeline. Step 1 generates synthetic reasoning data from historical incidents using a teacher model and mock tools. Step 2 uses NeMo Skills and NeMo RL to fine‑tune a reasoning model on that data. Step 3 evaluates the trained model with a ReAct agent, using synthetic reasoning data to assess tool‑calling, reasoning quality, and final conclusions. — *Figure 2. Agent training pipeline, from synthetic incident generation to reasoning model, fine-tuning, and evaluation across tool-calling, reasoning, and conclusions*

In this case, Qwen3-32B is the base reasoning modeling that is fine-tuned for telco NOC workflows using the following design principles:

Focusing on a small number of high‑impact faults, which account for the majority of incidents and require deliberate action. This enables the model to learn deeply on the fault classes that matter most.
Defining step-by-step operational guidelines for each problem type including RCA and remediation steps and NOC tools that agents must use.
Generate synthetic reasoning traces that capture multistep tool calls and the rationale behind each decision, using the NeMo Skills reference workflow to automate trace and incident generation.

NeMo Skills orchestrates this pipeline end to end, using its CLI, vLLM or TensorRT LLM servers, and training utilities to move from raw incidents to a fine-tuned telco reasoning model.

Synthetic incidents and NOC tool-calling

The input to the pipeline is a fully synthetic incident dataset that is modeled on real NOC behavior. Each record includes fields such as region, domain, priority, problem type, possible cause, and time stamps. Engineer notes are also included, describing intermediate steps and close notes summarizing the final resolution and close code.

An incident summary captures why the network was degraded or down and is the backbone of what the model is trained to solve. The pipeline concentrates on the most frequent, high-impact faults that account for the bulk of incident volume and require explicit action. The reasoning model learns deeply on the cases that drive MTTR and OPEX.

To model realistic NOC workflows, a set of custom tools are defined for agents to call in multistep procedures, such as:

Acknowledging and tracking the initial alert
Checking site and equipment status
Performing remote actions (reset, unlock, enable)
Monitoring for automatic recovery or alarm clearance
Checking topology, power, and fiber, plus public outage information
Applying configuration fixes
Rechecking alarm status when it remains active
Investigating persistent or recurring alarms
Documenting actions and status updates
Coordinating onsite dispatch or hardware replacement
Confirming final site health and closing the incident

For each problem type, domain experts translate existing workflows into step‑by‑step guidelines that map onto these tools. Examples include which triage toolkit to consult first; which alarms to query; when to reboot a device; and how to verify a fiber cut, power outage, or network element faults.

These guidelines become blueprints for the synthetic reasoning traces the model will learn from. They later define the action space that NOC agents use when executing closed‑loop workflows in production.

Turn expert procedures into reasoning traces

To turn expert NOC procedures into training data for a telco‑specialized reasoning model, follow the three-step NeMo Skills workflow outlined below. It converts runbooks into structured, multiturn reasoning traces ready for autonomous NOC agents.

Step 1: Generate structured action sequences

Using a reference workflow from NeMo Skills, a teacher model generates standardized action sequences for each incident based on prompts that include incident fields and guideline templates. The steps map directly to NOC tools.

Traces are formatted so each step records the action, its parameters, the tool call, and the immediate result, forming a structured view of the NOC workflow.

Step 2: Attach per‑step reasoning

A second pass enriches every action with reasoning text that explains why the step is taken, what signals it uses, and how it influences the next decision. This creates a chain of reasoning that reflects how an experienced NOC engineer reasons over topologies, alarms, and historical behavior.

Because raw traces can be verbose or repetitive, a squashing phase merges related steps while preserving key decision points, making sequences more efficient for training.

Step 3: Formatting for multiturn, tool‑calling models

Using another workflow from NeMo Skills, the formatted traces are converted into a Qwen-compatible format that encodes both the dialogue-style interaction and tool-calling actions over multiple turns. Multiturn tokenization simulates realistic interactions where the agent alternates between reasoning, calling tools, and interpreting tool responses, which is essential for deploying a ReAct-style NOC agent.

The result is a curriculum-structured dataset where easier cases and shorter traces appear earlier, while more complex multi-step incidents appear later, supporting curriculum learning during model training.

Fine-tuning the telco reasoning model

The fine-tuning phase uses a standard train/test split on the compiled reasoning dataset, with NeMo Skills orchestrating data preparation and Qwen3 32B serving as the base reasoning model. NeMo Skills prepare_data utilities apply a telco‑specific prompt template (noc_reasoning_sft) and the Qwen tokenizer. This makes each trace in the training split into a supervised fine‑tuning (SFT) example that includes:

Incident context and NOC signals
Multistep tool calls and intermediate results
Reasoning traces explaining each decision
Final resolution and incident summary

This produces a single JSONL file of SFT-ready examples for the telco reasoning model.

To improve learning efficiency, curriculum learning is applied by ordering samples from simple, single‑problem incidents to more complex multistep, multitool cases. This allows the model to master core NOC behaviors before tackling long, multiturn troubleshooting patterns.

Multiturn tokenization ensures that each example preserves realistic sequences of queries, tool calls, responses, and follow‑up actions, rather than isolated single‑turn prompts. These capabilities are critical for downstream ReAct‑style agents that must coordinate multiple tools over long contexts.

Ultimately, Qwen3‑32B is fine‑tuned on this telco reasoning curriculum with long sequence lengths and tensor model parallelism across GPUs. Checkpointing and experiment tracking allow teams to iterate on data quality, curriculum design, and hyperparameters.

The result is a telco‑specialized reasoning model that understands incident fields, close codes, and NOC procedures, and can reliably drive multitool, multiturn tool‑calling workflows in production.

Evaluating incident summary accuracy and safety

Initial evaluation focuses on incident summary accuracy: how well the model, embedded in a ReAct‑style agent with tools, predicts and executes the correct resolution path for a given incident.

Experiments compare the fine‑tuned telco reasoning model against a baseline Qwen3‑32B on held‑out incidents, measuring accuracy, precision, and recall across problem and close‑code categories. Incident summary accuracy can also be analyzed within a single problem type to highlight where reasoning traces and curriculum learning deliver the largest gains, informing future iterations of synthetic data generation and guideline design.

Evaluations across multiple iterations show that the fine-tuned model improves accuracy from roughly 20% to 60%.

Beyond incident summary metrics, additional evaluation methods can be introduced over time to further harden the system, including:

LLM‑as‑a‑judge setups to evaluate reasoning traces for correctness, completeness, and safety
LLM‑as‑a‑judge to assess final conclusions and remediation plans
Tool‑calling benchmarks such as BFCLv3 to measure how reliably the agent sequences and interprets tool calls
Rollout and rejection sampling to stress‑test behavior across many simulated incidents
Controlled errors injected into traces to teach the model to detect and recover from its own mistakes
Incorporation of retrieval‑augmented generation (RAG) with historical few‑shot examples to improve robustness on long‑tail scenarios

Get started building telco reasoning models for autonomous networks

Telco‑specific reasoning models—powered by synthetic data, structured traces, and safe tool‑calling—can move NOCs toward zero‑touch, self‑healing operations. By focusing on high‑impact close codes, encoding expert guidelines as multiturn reasoning traces, and fine‑tuning large models with the NVIDIA NeMo software toolkit, operators can build agents that reliably take on real NOC engineer tasks.

The pipeline is reusable and adaptable, so this approach can be tailored to each operator’s tools, data, and policies. This accelerates the industry’s transition from manual alarm handling to intelligent, autonomous network operations.

To get started fine-tuning a reasoning model to build AI agents for network operations, see Teaching a Model to Reason over Telecom Network Incidents.