Join our contest, which runs through June 17, and showcase your innovation by building cutting-edge generative AI-powered applications with NVIDIA and LangChain technologies. To get you started, we explore a few applications to inspire your creative journey and share tips and best practices to help you succeed in the development process.
Jumpstart your creativity
There are many different practical applications for generative AI agents. Agents or copilot applications developed in previous contests use large language models (LLMs) or small language models (SLMs) depending on the application’s privacy, security, and computational requirements.
These examples include:
- A plug-in for Outlook that uses locally hosted LLMs to help users compose emails, summarize email threads, and answer inbox questions.
- A command-line assistant that enhances the command-line interface by translating plain English instructions into executable commands.
- A visual exploration tool that analyzes images and provides intuitive photo analysis capabilities.
Developers can create applications for content generation in domains such as gaming, healthcare, and media and entertainment. Other options include summarization, question answering, sentiment analysis, and real-time translation. In healthcare, agents can help diagnose diseases by analyzing patient symptoms, medical history, and clinical data.
Many of these ideas are adaptable to your data and the problem you’re looking to solve, whether it’s using an agent to improve your weekly grocery shopping or to optimize customer service responses in a business setting.
Quick tips for your development journey
Developing an application powered by LLMs or SLMs involves integrating multiple components. This process encompasses preparing data, choosing the appropriate foundation model, fine-tuning the selected foundation model, and orchestrating the model for various downstream tasks. These tasks may include agent creation, inference services, and other specialized functionalities.
Let’s walk through the scenario of creating an LLM-based agent application. Selecting the appropriate foundation model is crucial, because the model plays a pivotal role in comprehending user queries accurately and efficiently. This decision raises several important questions, such as whether to choose an LLM or an SLM, and whether to quantize the model.
The answers to these questions aren’t straightforward and are influenced by factors such as the application’s requirements, the deployment infrastructure, the desired inference speed, and the accuracy requirements.
The following pointers are helpful to keep in mind.
If your application is deployed on GPUs with limited memory, consider using a quantized model or quantizing an existing model before using it. Tools that can help include quantization frameworks such as Model Optimizer and plugins such as NVIDIA TensorRT-LLM, which are available in the LangChain framework.
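To make the memory trade-off concrete, the following is a minimal back-of-the-envelope sketch, not part of any NVIDIA or LangChain API. It estimates only the memory needed to hold model weights at different precisions. The function names, the 7-billion-parameter model size, and the 20% headroom figure are assumptions chosen for illustration; real deployments also need memory for activations, the KV cache, and runtime overhead.

```python
# Rough estimate of the GPU memory needed just to hold model weights.
# Assumption: activations, KV cache, and runtime overhead are NOT included.

# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params: float, precision: str) -> float:
    """Return the approximate weight memory in GiB for a given precision."""
    return num_params * BYTES_PER_PARAM[precision] / 2**30


def precisions_that_fit(num_params: float, gpu_memory_gib: float,
                        headroom: float = 0.2) -> list[str]:
    """List the precisions whose weights fit after reserving some headroom.

    `headroom` is a hypothetical safety margin (fraction of GPU memory)
    left free for everything that is not model weights.
    """
    budget = gpu_memory_gib * (1.0 - headroom)
    return [p for p in BYTES_PER_PARAM
            if weight_memory_gib(num_params, p) <= budget]


if __name__ == "__main__":
    params = 7e9  # hypothetical 7-billion-parameter model
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: {weight_memory_gib(params, precision):.1f} GiB")
    print("Fits on an 8 GiB GPU:", precisions_that_fit(params, 8.0))
```

Under these assumptions, halving the bits per weight halves the weight memory, which is why a model that is too large at 16-bit precision may still fit on a smaller GPU once quantized to 8 or 4 bits.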
If inference accuracy is important, use foundation models that align with your use case. Note, however, that some of these models require GPUs with large memory.
If you plan to use retrieval-augmented generation (RAG) in your application, formatting and curating your documents is an important part of development. You can use tools such as NVIDIA NeMo Curator or document loaders that support different document modalities. Our recent blog post about NeMo Curator offers additional insights.
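As a rough illustration of what document preparation for RAG can involve, here is a minimal sketch in plain Python. It does not use NeMo Curator or LangChain document loaders, and it is not their API. The helper names (`clean_text`, `chunk_words`, `curate`) and the default chunk sizes are hypothetical. It normalizes whitespace, drops empty and exact-duplicate documents, and splits the rest into overlapping word-based chunks ready for embedding.

```python
import re


def clean_text(raw: str) -> str:
    """Collapse runs of whitespace so chunks are not padded with layout debris."""
    return re.sub(r"\s+", " ", raw).strip()


def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into chunks of up to `chunk_size` words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    chunk boundary still appears with some context in the next chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def curate(documents: list[str], chunk_size: int = 200,
           overlap: int = 40) -> list[str]:
    """Clean documents, skip empty and exact-duplicate ones, and chunk the rest."""
    seen = set()
    chunks = []
    for doc in documents:
        cleaned = clean_text(doc)
        if not cleaned or cleaned in seen:
            continue
        seen.add(cleaned)
        chunks.extend(chunk_words(cleaned, chunk_size, overlap))
    return chunks
```

Dedicated curation tools handle far more than this sketch, such as multiple file formats, quality filtering, and near-duplicate detection, but the same basic steps of cleaning, deduplicating, and chunking usually apply.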
These topics should help you get started with your application. For more advanced use cases, such as fine-tuning and building multi-agent applications, you can explore the NeMo framework and LangGraph.
Register for the developer contest and start building your next-gen AI application now.