Unlike traditional systems that follow predefined paths, AI agents are autonomous systems that use large language models (LLMs) to make decisions, adapt to changing requirements, and perform complex reasoning.
In this guide to the self-paced workshop for building a report generation agent, you’ll gain:
- An understanding of the four core considerations of any AI agent, explored with NVIDIA Nemotron, an open model family with open data and weights.
- A working document generation agent that can research and write reports.
- Knowledge of how to build agents using LangGraph and OpenRouter.
- A turnkey, portable development environment.
- Your own customized agent, ready to share as an NVIDIA Launchable.
Workshop deployment
Launch the workshop as an NVIDIA Brev Launchable:

Configuration for setting up secrets
To follow along with this workshop, you’ll need to gather and configure a few project secrets.
- OpenRouter API key: This enables access to the NVIDIA Nemotron Nano 9B V2 model through OpenRouter.
- Tavily API key: This enables access to the Tavily web search API for real-time web search.
With your JupyterLab environment running, use the Secrets Manager tile under NVIDIA DevX Learning Path in the JupyterLab Launcher to configure these secrets for your workshop development environment. Verify in the Logs tab that the secrets have been added successfully.

Next, locate the NVIDIA DevX Learning Path section of the JupyterLab Launcher. Select the 1. Introduction to Agents tile to open up the lab instructions and get started.

Introduction to agent architecture
Once your workshop environment is set up, the first section introduces developers to agents. It’s crucial to understand what differentiates agents from simpler AI applications before diving into implementation.
Unlike traditional LLM-based applications, agents can dynamically choose tools, incorporate complex reasoning, and adapt their analysis approach based on the situation at hand. Developers will learn about the four key considerations fundamental to all agents:
- Model: An LLM that serves as the brain, deciding which tools to use and how to respond.
- Tools: Functions that enable the LLM to perform actions like mathematical calculations, database queries, or API calls.
- Memory and state: Information available to the LLM during and between conversations.
- Routing: Logic that determines what the agent should do next based on the current state and LLM decisions.
You’ll learn how to put these components together to build your first basic, calculator-equipped agent in code/intro_to_agents.ipynb. By the end of this exercise, you’ll have an agent that completes the following:
[{'content': 'What is 3 plus 12?', 'role': 'user'},
 {'content': None,
  'role': 'assistant',
  'tool_calls': [{'function': {'arguments': '{"a": 3, "b": 12}', 'name': 'add'},
                  'id': 'chatcmpl-tool-b852128b6bdf4ee29121b88490174799',
                  'type': 'function'}]},
 {'content': '15',
  'name': 'add',
  'role': 'tool',
  'tool_call_id': 'chatcmpl-tool-b852128b6bdf4ee29121b88490174799'},
 {'content': 'The answer is 15.', 'role': 'assistant'}]
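The transcript above can be reproduced by a small tool-dispatch loop. The following is a hypothetical sketch in plain Python (the function and variable names are illustrative, not the notebook’s actual code): the model proposes a tool call, the runtime executes it, and the result is returned as a tool message.

```python
import json

def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

# Registry mapping tool names to callables.
TOOLS = {"add": add}

def execute_tool_call(tool_call: dict) -> dict:
    """Run the tool the model requested and wrap the result as a tool message."""
    fn = TOOLS[tool_call["function"]["name"]]
    args = json.loads(tool_call["function"]["arguments"])
    result = fn(**args)
    return {
        "role": "tool",
        "name": tool_call["function"]["name"],
        "content": str(result),
        "tool_call_id": tool_call["id"],
    }

# Simulate the assistant's tool call from the transcript above.
call = {
    "id": "chatcmpl-tool-b852128b6bdf4ee29121b88490174799",
    "type": "function",
    "function": {"name": "add", "arguments": '{"a": 3, "b": 12}'},
}
print(execute_tool_call(call)["content"])  # 15
```

In the real agent, the LLM produces the tool call and consumes the tool message; here both ends are simulated so the dispatch step stands on its own.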
Report generation components
The rest of this workshop centers around building a multi-layered agentic system using LangGraph and NVIDIA NIM hosted as an OpenRouter endpoint. The architecture consists of four interconnected agentic components, each handling specific aspects of the document generation process:
- Initial research: Gather comprehensive information about the topic.
- Outline planning: Create a structured document outline based on research.
- Section writing: Generate detailed content for each section with additional research as needed.
- Final compilation: Assemble all sections into a professional report.
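Conceptually, the four components form a linear pipeline. A minimal sketch, with hypothetical stand-in functions rather than the workshop’s actual implementations:

```python
# Hypothetical stand-ins for the four pipeline stages.
def initial_research(topic: str) -> list[str]:
    """Gather research notes about the topic."""
    return [f"notes on {topic}"]

def outline_planning(notes: list[str]) -> list[str]:
    """Create a structured outline (section names) from the research."""
    return ["Introduction", "Applications", "Conclusion"]

def section_writing(sections: list[str], notes: list[str]) -> dict[str, str]:
    """Draft content for each section, using the gathered notes."""
    return {name: f"{name}: draft based on {len(notes)} note(s)" for name in sections}

def final_compilation(drafts: dict[str, str]) -> str:
    """Assemble all section drafts into one report."""
    return "\n\n".join(drafts.values())

def generate_report(topic: str) -> str:
    notes = initial_research(topic)
    sections = outline_planning(notes)
    drafts = section_writing(sections, notes)
    return final_compilation(drafts)
```

In the workshop, each stage is an agentic component backed by an LLM rather than a plain function, but the data flow between stages is the same.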
Learn and implement the code
Now that we understand the concepts, let’s dive into the technical implementation. We’ll start with the foundational considerations mentioned above and build up to the complete agent:
- Choose a model
- Select tools
- Build a researcher
- Build an author
- Build the final agent
- Manage and route the agent
Foundations: the model
The workshop relies on NVIDIA NIM endpoints for the core model powering the agent. NVIDIA NIM provides high-performance inference capabilities, including:
- Tool binding: Native support for function calling.
- Structured output: Built-in support for Pydantic models.
- Async operations: Full async/await support for concurrent processing.
- Enterprise reliability: Production-grade inference infrastructure.
This example shows the ChatNVIDIA Connector using NVIDIA NIM hosted as an OpenRouter endpoint.
import os

from langchain_nvidia_ai_endpoints import ChatNVIDIA

llm = ChatNVIDIA(
    base_url="https://openrouter.ai/api/v1",
    model="nvidia/nemotron-nano-9b-v2:free",
    api_key=os.getenv("OPENROUTER_API_KEY"),
)
llm_with_tools = llm.bind_tools([tools.search_tavily])
While clear, quality instructions are important in any LLM-based application, they’re especially critical for agents, as they remove ambiguity and clarify decision-making processes. One such example from code/docgen_agent/prompts.py is provided as follows:
research_prompt: Final[str] = """
Your goal is to generate targeted web search queries that will gather comprehensive information for writing a technical report section.
Topic for this section:
{topic}
When generating {number_of_queries} search queries, ensure they:
1. Cover different aspects of the topic (e.g., core features, real-world applications, technical architecture)
2. Include specific technical terms related to the topic
3. Target recent information by including year markers where relevant (e.g., "2024")
4. Look for comparisons or differentiators from similar technologies/approaches
5. Search for both official documentation and practical implementation examples
Your queries should be:
- Specific enough to avoid generic results
- Technical enough to capture detailed implementation information
- Diverse enough to cover all aspects of the section plan
- Focused on authoritative sources (documentation, technical blogs, academic papers)"""
This prompt shows a few key principles of reliable LLM prompting:
- Role specification: Clear definition of the agent’s expertise and responsibilities.
- Task decomposition: Breaking down complex requirements into specific, actionable steps or criteria.
- Specificity: References examples of temporal specificity and authoritative sources.
- Structured inputs / outputs: Specific instructions for desired response structure and expected input structure.
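The curly-brace placeholders in the template are filled at run time with Python’s str.format before the prompt is sent to the model. A brief sketch, using an abbreviated, hypothetical version of the template above:

```python
from typing import Final

# Abbreviated, hypothetical version of the research prompt template.
research_prompt: Final[str] = (
    "Topic for this section:\n"
    "{topic}\n"
    "When generating {number_of_queries} search queries, ensure they cover "
    "different aspects of the topic."
)

# Fill the placeholders before sending the prompt to the model.
system_message = research_prompt.format(
    topic="AI agents in healthcare",
    number_of_queries=3,
)
print(system_message)
```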
Foundations: the tools
The agent’s capabilities are defined through its tools. The workshop uses Tavily, a search API designed specifically for AI agents, as the primary tool for information gathering.
# imports and constants omitted


@tool(parse_docstring=True)
async def search_tavily(
    queries: list[str],
    topic: Literal["general", "news", "finance"] = "news",
) -> str:
    """Search the web using the Tavily API.

    Args:
        queries: List of queries to search.
        topic: The topic of the provided queries.
            general - General search.
            news - News search.
            finance - Finance search.

    Returns:
        A string of the search results.
    """
    search_jobs = []
    for query in queries:
        search_jobs.append(
            asyncio.create_task(
                tavily_client.search(
                    query,
                    max_results=MAX_RESULTS,
                    include_raw_content=INCLUDE_RAW_CONTENT,
                    topic=topic,
                    days=days,  # type: ignore[arg-type]
                )
            )
        )

    search_docs = await asyncio.gather(*search_jobs)

    return _deduplicate_and_format_sources(
        search_docs,
        max_tokens_per_source=MAX_TOKENS_PER_SOURCE,
        include_raw_content=INCLUDE_RAW_CONTENT,
    )
Key architectural decisions in the tools module implementation include:
- Async operation: Using asyncio.gather() for concurrent searches.
- Deduplication: A helper function prevents redundancy across multiple searches.
- Structured documentation: Google-style docstrings help the LLM understand tool usage.
Now that we have established a foundational understanding of models and tools in code, let’s assemble them into an actual, working agent. We haven’t explored state management and routing considerations yet, but we’ll revisit them later on once we have our agent components built.
Implementing the researcher
The researcher component of the agent implements the reasoning and action pattern (ReAct), one of the most effective architectures for tool-using agents. This pattern creates a loop where the agent thinks about what to do, takes an action, and then decides on the next steps based on the results. This continues until the agent evaluates that no more actions are needed to complete a task.
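The loop’s exit condition can be expressed as a small routing function. The following is a hedged sketch in plain Python (dict-based messages and illustrative names, not the workshop’s exact code): if the latest assistant message requests tool calls, route to tool execution; otherwise, the agent is done.

```python
def route_after_model(state: dict) -> str:
    """Decide the next step of the ReAct loop from the latest message."""
    last_message = state["messages"][-1]
    if last_message.get("tool_calls"):
        return "execute_tools"  # the model asked for an action
    return "end"                # no more actions needed

# The model requested a search, so the loop continues into tool execution.
state = {"messages": [{"role": "assistant",
                       "tool_calls": [{"function": {"name": "search_tavily"}}]}]}
print(route_after_model(state))  # execute_tools
```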

The code for this researcher component of the agent is implemented in code/docgen_agent/researcher.py and can be tested with code/researcher_client.ipynb.
state = ResearcherState(
    topic="Examples of AI agents in various industries.",
    number_of_queries=3,
)
state = await graph.ainvoke(state)

for message in state["messages"]:
    print("ROLE: ", getattr(message, "role", "tool_call"))
    print(message.content[:500] or message.additional_kwargs)
    print("")
You can also see each action taken by the researcher during execution.
INFO:docgen_agent.researcher:Calling model.
INFO:docgen_agent.researcher:Executing tool calls.
INFO:docgen_agent.researcher:Executing tool call: search_tavily
INFO:docgen_agent.tools:Searching the web using the Tavily API
INFO:docgen_agent.tools:Searching for query: Technical architecture of AI agents in healthcare 2024
INFO:docgen_agent.tools:Searching for query: Comparison of machine learning frameworks for AI agents in finance
INFO:docgen_agent.tools:Searching for query: Real-world applications of natural language processing in AI agents for customer service 2024
INFO:httpx:HTTP Request: POST https://api.tavily.com/search "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.tavily.com/search "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.tavily.com/search "HTTP/1.1 200 OK"
INFO:docgen_agent.researcher:Calling model.
As well as the final output, including the tool call request, the tool output, and the final message summarizing the results.
ROLE: assistant
{'tool_calls': [{'id': 'chatcmpl-tool-b7185ba8eb3a44259b0bdf930495ece5', 'type': 'function', 'function': {'name': 'search_tavily', 'arguments': '{"queries": ["Technical architecture of AI agents in healthcare 2024", "Comparison of machine learning frameworks for AI agents in finance", "Real-world applications of natural language processing in AI agents for customer service 2024"], "topic": "general"}'}}]}
ROLE: tool_call
"Sources:\n\nSource AI Agents in Modern Healthcare: From Foundation to Pioneer:\n===\nURL: https://www.preprints.org/manuscript/202503.1352/v1\n===\nMost relevant content from source: T. Guo et al., \"Large language model based multi-agents: A survey of progress and challenges,\" arXiv preprint arXiv:2402.01680, 2024. J. Ruan et al., \"TPTU: Large Language Model-based AI Agents for Task Planning and Tool Usage. | **Partner Agent** | * True healthcare team partners * Generates clinical hypotheses
ROLE: assistant
Based on the search results, here are some potential search queries that could be used to gather comprehensive information for writing a technical report section on examples of AI agents in various industries:
1. "Technical architecture of AI agents in healthcare 2024"
2. "Comparison of machine learning frameworks for AI agents in finance"
3. "Real-world applications of natural language processing in AI agents for customer service 2024"
Implementing the author
Simple ReAct agents are powerful on their own, but they’re often combined with additional steps to support more complicated workflows. The section author needs to perform additional section-related research, but only when requested. Once the necessary research is available, it must use that research to write the section.
In the modified architecture diagram, a gating function has been added before the ReAct-style agent to determine whether additional research is required, as well as a writing step at the end.
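The gating step amounts to a routing decision over the section’s research flag. A minimal sketch (hypothetical routing function, not the exact code in code/docgen_agent/author.py):

```python
from dataclasses import dataclass

@dataclass
class Section:
    name: str
    description: str
    research: bool
    content: str

def route_section(section: Section) -> str:
    """Gate: research the section first only when the plan asks for it."""
    return "research" if section.research else "write"

section = Section(
    name="Real-World Applications",
    description="Examples of AI agents in various industries",
    research=True,
    content="",
)
print(route_section(section))  # research
```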

The code for this author component of the agent is implemented in code/docgen_agent/author.py and can be tested with code/author_client.ipynb.
state = SectionWriterState(
    index=1,
    topic="Examples of AI agents in various industries.",
    section=Section(
        name="Real-World Applications",
        description="Examples of AI agents in various industries",
        research=True,
        content="",
    ),
)
state = await graph.ainvoke(state)
Markdown(state["section"].content)
You can also see each action taken by the author during execution.
INFO:docgen_agent.author:Researching section: Real-World Applications
INFO:docgen_agent.author:Executing tool calls for section: Real-World Applications
INFO:docgen_agent.author:Executing tool call: search_tavily
INFO:docgen_agent.tools:Searching the web using the Tavily API
INFO:docgen_agent.tools:Searching for query: AI agents in healthcare industry 2024
INFO:docgen_agent.tools:Searching for query: Implementation of AI agents in finance sector 2023
...
INFO:httpx:HTTP Request: POST https://api.tavily.com/search "HTTP/1.1 200 OK"
INFO:docgen_agent.author:Researching section: Real-World Applications
INFO:docgen_agent.author:Writing section: Real-World Applications
As well as the final output, which consists of the written section in markdown format.
## Real-World Applications
AI agents have numerous applications across various industries...
### Healthcare
AI agents in healthcare are revolutionizing patient care and medical innovation. They are being used to automate administrative tasks, enhance diagnostics, and improve workflow efficiency. For instance...
### Finance
AI agents in finance are driving innovation, success, and compliance. They are being used to automate tasks such as data entry, transaction processing, and compliance checks. AI agents are also being used to detect fraud, improve customer service, and provide personalized investment advice. For example...
Implementing the final agent
Using these two components, we can now put together our document generation agent’s workflow. This architecture is the simplest so far: a linear workflow that researches the topic, writes the sections, and compiles the finalized report.

The code for this final agent is implemented in code/docgen_agent/agent.py and can be tested with code/agent_client.ipynb.
state = AgentState(
    topic="The latest developments with AI Agents in 2025.",
    report_structure="This article should be...",
)
state = await graph.ainvoke(state)
Markdown(state["report"])
You can also see each action taken by the agent during execution.
INFO:docgen_agent.agent:Performing initial topic research.
INFO:docgen_agent.researcher:Calling model.
INFO:docgen_agent.researcher:Executing tool calls.
INFO:docgen_agent.researcher:Executing tool call: search_tavily
INFO:docgen_agent.tools:Searching the web using the Tavily API
INFO:docgen_agent.tools:Searching for query: AI Agents 2025 core features
INFO:docgen_agent.tools:Searching for query: Real-world applications of AI Agents in 2025
...
INFO:httpx:HTTP Request: POST https://api.tavily.com/search "HTTP/1.1 200 OK"
INFO:docgen_agent.researcher:Calling model.
INFO:docgen_agent.agent:Calling report planner.
INFO:docgen_agent.agent:Orchestrating the section authoring process.
INFO:docgen_agent.agent:Creating author agent for section: Introduction
INFO:docgen_agent.agent:Creating author agent for section: Autonomous Decision-Making
INFO:docgen_agent.agent:Creating author agent for section: Integration with Physical World
INFO:docgen_agent.agent:Creating author agent for section: Agentic AI Trends
INFO:docgen_agent.agent:Creating author agent for section: AI Agents in Customer Support
INFO:docgen_agent.agent:Creating author agent for section: AI Agents in Healthcare
INFO:docgen_agent.agent:Creating author agent for section: Conclusion
INFO:docgen_agent.agent:Throttling LLM calls.
INFO:docgen_agent.author:Writing section: Introduction
INFO:docgen_agent.author:Researching section: Autonomous Decision-Making
...
The finalized report is generated in markdown format. A sample research report generated by this agent is provided: sample markdown output.
Foundations: state management and routing
With the agent components built, let’s take a step back and explore how to use LangGraph as the agent framework for advanced state management and flow control, connecting all three components together into a single agentic AI system. LangGraph provides a couple of key advantages:
- Conditional routing: Conditional edges enable dynamic flow control based on runtime conditions, enabling agents to make intelligent decisions about their next actions.
- Graph compilation and execution: Compiled graphs can be invoked asynchronously, supporting concurrent execution and complex orchestration patterns essential for multi-agent systems.
In the following example from code/docgen_agent/agent.py, we can see which components we previously built correspond to which node, as well as the edges that connect, or route, intermediate outputs from one node to the next.
main_workflow = StateGraph(AgentState)
main_workflow.add_node("topic_research", topic_research)
main_workflow.add_node("report_planner", report_planner)
main_workflow.add_node("section_author_orchestrator", section_author_orchestrator)
main_workflow.add_node("report_author", report_author)
main_workflow.add_edge(START, "topic_research")
main_workflow.add_edge("topic_research", "report_planner")
main_workflow.add_edge("report_planner", "section_author_orchestrator")
main_workflow.add_edge("section_author_orchestrator", "report_author")
main_workflow.add_edge("report_author", END)
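The main workflow above uses only linear edges. The conditional routing mentioned earlier is wired with LangGraph’s add_conditional_edges, which maps a router function’s return value to the next node. A plain-Python sketch of such a router (hypothetical condition and node names, not part of the workshop’s graph):

```python
def route_after_planning(state: dict) -> str:
    """Hypothetical router for a conditional edge: replan if the outline is empty."""
    if not state.get("sections"):
        return "report_planner"            # loop back and plan again
    return "section_author_orchestrator"   # proceed to section authoring

# In LangGraph, this router would be registered with something like:
#   main_workflow.add_conditional_edges("report_planner", route_after_planning)

print(route_after_planning({"sections": []}))         # report_planner
print(route_after_planning({"sections": ["Intro"]}))  # section_author_orchestrator
```

The researcher component built earlier relies on exactly this mechanism to decide between executing tools and finishing.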
Congratulations! In walking through each step of this developer workshop, you have just built your own LangGraph agent. Test your new agent using the code/agent_client.ipynb notebook.
Summary
Building AI agents requires understanding both the theoretical foundations and practical implementation challenges. This workshop provides a comprehensive path from basic concepts to complex agentic systems, emphasizing hands-on learning with production-grade tools and techniques.
By completing this workshop, developers gain practical experience with:
- Fundamental agent concepts: Understanding the difference between workflows and intelligent agents.
- State management: Implementing complex state transitions and persistence.
- Tool integration: Creating and managing external tool capabilities.
- Modern AI stack: Working with LangGraph, NVIDIA NIM, and associated tooling.
Learn more
For hands-on learning, tips, and tricks, join our Nemotron Labs Livestream, “Building an AI Agent for Report Generation with NVIDIA Nemotron on OpenRouter” on Tuesday, September 16th at 11am PT.
- Try NVIDIA Nemotron on OpenRouter and Hugging Face. See the workshop on GitHub.
- Ask questions on the Nemotron developer forum or the Nemotron channel on Discord.
Stay up to date on Agentic AI, Nemotron, and more by subscribing to NVIDIA news, joining the community, and following NVIDIA AI on LinkedIn, Instagram, X, and Facebook. Explore the self-paced video tutorials and livestreams here.