Generative AI / LLMs

Building AI Agents with NVIDIA NIM Microservices and LangChain


NVIDIA NIM, part of NVIDIA AI Enterprise, now supports tool calling for models like Llama 3.1. It also integrates with LangChain to provide you with a production-ready solution for building agentic workflows. NIM microservices provide the best performance for open-source models such as Llama 3.1 and are available to test for free from the NVIDIA API Catalog in LangChain applications.

Building AI agents with NVIDIA NIM 

The Llama 3.1 NIM microservice enables you to build generative AI applications with advanced functionality for production deployments. You can use an accelerated open model with state-of-the-art agentic capabilities to build more sophisticated and reliable applications. For more information, see Supercharging Llama 3.1 across NVIDIA Platforms.

NIM provides an OpenAI-compatible tool-calling API for familiarity and consistency. You can now use LangChain to bind tools to NIM microservices, producing structured outputs that bring agent capabilities to your applications.
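As a sketch of what an OpenAI-compatible tool-calling request looks like on the wire (the model ID shown is an assumption), each tool is declared in the tools field as a JSON Schema description of the function and its parameters:

```python
import json

# Hypothetical request body for an OpenAI-compatible chat completion with
# tool calling; each entry in "tools" describes one function as JSON Schema.
request_body = {
    "model": "meta/llama-3.1-70b-instruct",  # assumed model ID
    "messages": [
        {"role": "user", "content": "What is the weather in Boston?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather for a location.",
                "parameters": {
                    "type": "object",
                    "properties": {"location": {"type": "string"}},
                    "required": ["location"],
                },
            },
        }
    ],
}

# Serialize the payload as it would be sent to the endpoint.
payload = json.dumps(request_body)
```

Because the schema matches the OpenAI format, existing clients and frameworks that speak that API can target a NIM endpoint without code changes.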

Tool usage with NIM

Tools accept structured output from a model, execute an action, and return results in a structured format back to the model. They often involve external API calls, but this isn’t mandatory. 

For instance, a weather tool might get the current weather in San Diego, while a web search tool might get the current San Francisco 49ers football game score.

To support tool usage in an agent workflow, a model must first be trained to detect when to call a function and to output a structured response, such as JSON, containing the function name and its arguments. The model is then optimized as a NIM microservice for NVIDIA infrastructure and easy deployment, making it compatible with frameworks like LangChain's LangGraph.
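To make the structured response concrete, here is a hedged sketch of what such a model-emitted tool call might look like and how an application would parse it (the field names mirror the OpenAI-style format, but the exact payload is illustrative):

```python
import json

# Illustrative example of the structured output a tool-calling model emits:
# the model names the function to invoke and JSON-encodes its arguments.
model_output = {
    "tool_calls": [
        {
            "id": "call_0",
            "function": {
                "name": "get_current_weather",
                "arguments": json.dumps({"location": "San Diego"}),
            },
        }
    ]
}

# The application decodes the arguments and routes the call to the
# matching tool implementation.
call = model_output["tool_calls"][0]
args = json.loads(call["function"]["arguments"])
print(call["function"]["name"], args)  # get_current_weather {'location': 'San Diego'}
```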

Use LangChain to develop LLM applications with tools

Here’s how to use LangChain with models like Llama 3.1 that support tool calling. For more information about installing packages and setting up the ChatNVIDIA library, see the LangChain NVIDIA documentation.

To get a list of models that support tool calling, run the following command:

from langchain_nvidia_ai_endpoints import ChatNVIDIA

# Filter the available models down to those that support tool calling
tool_models = [model for model in ChatNVIDIA.get_available_models() if model.supports_tools]

You can create your own functions or tools and bind them to models using LangChain’s bind_tools function.

from langchain_core.pydantic_v1 import Field
from langchain_core.tools import tool

@tool
def get_current_weather(
    location: str = Field(..., description="The location to get the weather for.")
):
    """Get the current weather for a location."""
    ...

llm = ChatNVIDIA(model=tool_models[0].id).bind_tools(tools=[get_current_weather])
response = llm.invoke("What is the weather in Boston?")
response.tool_calls

You can implement get_current_weather with something like the Tavily API for generic search or the National Weather Service API for weather data.
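To close the loop, the tool calls returned by the model can be dispatched to the matching functions and their results sent back to the model. Here is a minimal, framework-agnostic sketch; the stubbed weather lookup is illustrative, though the dict shape mirrors the id/name/args entries in LangChain's tool_calls:

```python
def get_current_weather(location: str) -> str:
    # Stubbed lookup; a real tool might call the National Weather Service API.
    return f"Sunny, 22 C in {location}"

# Registry mapping tool names to their implementations
TOOLS = {"get_current_weather": get_current_weather}

def dispatch(tool_calls):
    """Execute each requested tool call and collect results to return to the model."""
    results = []
    for call in tool_calls:
        fn = TOOLS[call["name"]]
        results.append({"tool_call_id": call["id"], "content": fn(**call["args"])})
    return results

# Simulate the tool_calls structure a model response might contain
results = dispatch([
    {"id": "call_0", "name": "get_current_weather", "args": {"location": "Boston"}}
])
print(results[0]["content"])  # Sunny, 22 C in Boston
```

In a LangChain application, each result would typically be wrapped as a tool message and appended to the conversation so the model can compose its final answer.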

Explore more resources

The previous code is just a small example of how models can support tools. LangChain’s LangGraph integrates with NIM microservices, as shown in the NVIDIA NIMs with Tool Calling for Agents example on GitHub.

Check out other LangGraph examples to build stateful, multi-actor applications with NIM microservices for use cases such as customer support, coding assistants, advanced RAG, and evaluation.

Advanced RAG can be built with LangGraph and NVIDIA NeMo Retriever in an agent workflow that uses strategies such as self-RAG and corrective RAG. For more information, see Build an Agentic RAG Pipeline with Llama 3.1 and NVIDIA NeMo Retriever NIM Microservices and the associated notebook.

Get started on building your applications with NVIDIA NIM microservices.
