The evolution of modern application development has led to a significant shift toward microservice-based architectures. This approach offers great flexibility and scalability, but it also introduces new complexities, particularly in the realm of security. In the past, engineering teams were responsible for a handful of security concerns in their monolithic applications. With microservices, those responsibilities have multiplied: teams must now manage network security, identity and access management, TLS certificates, and vulnerability scanning, not just for one application but potentially for hundreds of individual services.
The sheer scale of this challenge makes manual vulnerability patching impractical, if not impossible. This is where automation becomes not just beneficial, but essential. Automation enables teams to implement security measures consistently across all services, respond rapidly to threats, and maintain compliance with regulatory requirements. Moreover, as applications grow and evolve, automation ensures that security practices scale accordingly, providing the control and governance needed to manage complex, distributed systems effectively.
Running applications in containers is a common approach for building microservices. It enables developers to maintain the same continuous integration (CI) pipeline for their applications, regardless of the container orchestration platform used. No matter which programming language you use for your application, the deployable artifact is a container image that commonly includes the application code and its dependencies. Application development teams must scan those images for vulnerabilities to confirm they are safe before deploying them to cloud environments.
This post showcases how engineering teams can efficiently automate vulnerability remediation early in their continuous integration pipelines using the NVIDIA AI Blueprint for vulnerability analysis with NVIDIA NIM microservices, NVIDIA Morpheus, and AWS cloud-native services like Amazon EKS, AWS Lambda, and Amazon Inspector.
NVIDIA Morpheus for near real-time threat detection
NVIDIA Morpheus is a GPU-accelerated, end-to-end AI framework to build, customize, and scale cybersecurity applications. It provides developers with an innovative AI-powered cybersecurity SDK designed to tackle the growing cybersecurity challenges in cloud and enterprise environments.
Morpheus leverages the power of GPUs to process and analyze vast amounts of data at unprecedented speeds and uses machine learning (ML) models and large language models (LLMs) to identify patterns and anomalies that might indicate security threats, such as phishing attempts, malware infections, or insider threats. The framework can be integrated with existing security infrastructure, as described in this post, enhancing the organization's ability to detect and respond to threats in near real-time.
NVIDIA AI Blueprint for vulnerability analysis
Built with Morpheus and Llama 3 NIM microservices, the NVIDIA AI Blueprint for vulnerability analysis is a reference application to help organizations efficiently automate the detection and remediation of common vulnerabilities and exposures (CVEs).
The AI Blueprint for vulnerability analysis workflow starts with receiving a collection of assets as input parameters:
- The list of CVEs detected by a designated security scanner, such as Amazon Inspector or Docker Scout
- The software bill of materials (SBOM) file
- The location of the application source code and documentation (GitHub URLs, for example)
The application then begins the automated vulnerability analysis workflow, as outlined below.
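For illustration, the sketch below shows what such an input collection might look like as a Python payload. Every field name and value here is an assumption for readability, not the blueprint's actual request schema.

```python
# Hypothetical illustration of the three inputs listed above. Field names
# are assumptions, not the blueprint's actual request schema.
scan_request = {
    "cves": [  # findings reported by the image scanner
        {"id": "CVE-2023-44487", "package": "nghttp2", "severity": "HIGH"},
    ],
    "sbom_uri": "s3://example-bucket/sboms/my-service-1.2.3.cdx.json",
    "source_repositories": [  # code and docs used to build the knowledge base
        "https://github.com/example-org/my-service",
        "https://github.com/example-org/my-service-docs",
    ],
}
```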
Building the knowledge base
The first step in the workflow is to create a comprehensive knowledge base. It starts by pulling the code repositories specified by the user. These repositories are then processed through an embedding model, a type of ML model that converts text into numerical vectors. The resulting embeddings are stored in vector databases (VDBs), which enable efficient similarity searches. This step provides the system with a deep understanding of the codebase context.
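The sketch below illustrates the idea in its simplest form, using the sentence-transformers and FAISS libraries as stand-ins for the blueprint's own embedding model and vector database; the repository path and query are placeholders.

```python
# Minimal knowledge-base sketch: embed source files and index them for
# similarity search. sentence-transformers and FAISS stand in for whatever
# embedding model and vector database the blueprint actually uses.
from pathlib import Path

import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model

# Read every Python file in a cloned repository (path is illustrative).
documents = [p.read_text(errors="ignore") for p in Path("my-service").rglob("*.py")]

# Convert text into dense vectors and add them to a flat L2 index.
embeddings = model.encode(documents, convert_to_numpy=True)
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings.astype(np.float32))

# Later, an agent can retrieve the most relevant code snippets for a CVE.
query = model.encode(
    ["Where is nghttp2 used for HTTP/2 stream handling?"], convert_to_numpy=True
)
_, hits = index.search(query.astype(np.float32), 3)
print([documents[i][:80] for i in hits[0]])
```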
Gathering vulnerability intelligence
During the vulnerability intelligence phase, the workflow collects detailed information about each CVE in the supplied list. This involves web scraping and data retrieval from public security databases such as the GitHub Security Advisory (GHSA) database, distribution-specific (distro) databases, and the National Institute of Standards and Technology (NIST) CVE records. It also incorporates data from specialized threat intelligence feeds. This comprehensive approach ensures the system has the most up-to-date and relevant information about each vulnerability.
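As a simplified example, the following sketch pulls the record for a single CVE from the NIST National Vulnerability Database (NVD) API. The blueprint itself aggregates multiple sources; the parsing below assumes the NVD 2.0 JSON schema.

```python
# Sketch of pulling public intelligence for one CVE from the NIST NVD API.
# Only the NVD call is shown; GHSA, distro databases, and threat feeds are
# queried separately in the real workflow.
import requests

NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"

def fetch_cve_record(cve_id: str) -> dict:
    response = requests.get(NVD_URL, params={"cveId": cve_id}, timeout=30)
    response.raise_for_status()
    return response.json()

record = fetch_cve_record("CVE-2023-44487")
descriptions = record["vulnerabilities"][0]["cve"]["descriptions"]
english = next(d["value"] for d in descriptions if d["lang"] == "en")
print(english)
```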
Processing the Software Bill of Materials
The Software Bill of Materials (SBOM) is a crucial document that lists all components in a piece of software. During this step, the workflow processes this document into a format that AI can easily ingest and analyze. This step provides vital context about the software’s composition and dependencies.
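For example, a minimal sketch of flattening a CycloneDX SBOM into a simple package list might look like the following; the file name is a placeholder and the blueprint's internal representation may differ.

```python
# Sketch of flattening a CycloneDX SBOM into the package list the analysis
# agents reason over. Field names follow the CycloneDX JSON schema.
import json

with open("my-service-1.2.3.cdx.json") as f:  # placeholder SBOM file
    sbom = json.load(f)

packages = [
    {
        "name": component.get("name"),
        "version": component.get("version"),
        "purl": component.get("purl"),
    }
    for component in sbom.get("components", [])
]
print(f"{len(packages)} components, e.g. {packages[:2]}")
```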
Generating a tailored checklist
Using the gathered vulnerability information, a NIM LLM generates a context-sensitive task checklist. This checklist is designed to guide the impact analysis process, ensuring that all relevant aspects of each vulnerability are thoroughly examined.
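Because NIM microservices expose an OpenAI-compatible API, this step can be approximated with a standard chat completion call, as in the sketch below. The endpoint, model name, and prompt wording are placeholders, not the blueprint's actual prompts.

```python
# Sketch of generating a per-CVE task checklist with a Llama 3 NIM through
# its OpenAI-compatible endpoint. Endpoint, model name, and prompt are
# placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://nim-llm:8000/v1", api_key="not-used")

prompt = (
    "You are a security analyst. Given the CVE summary and the affected "
    "package below, produce a numbered checklist of checks needed to decide "
    "whether the vulnerability is exploitable in this container.\n\n"
    "CVE: CVE-2023-44487 (HTTP/2 rapid reset)\nPackage: nghttp2 1.52.0"
)

completion = client.chat.completions.create(
    model="meta/llama3-70b-instruct",  # placeholder NIM model name
    messages=[{"role": "user", "content": prompt}],
    temperature=0.2,
)
checklist = completion.choices[0].message.content.splitlines()
```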
Creating the task agent loop
At the center of this agentic workflow, agents work in parallel on the tailored checklist. For each item, a prompt generator creates a detailed query that includes information about available tools and data sources. The agent then uses these prompts to search various databases and employ validation tools, gathering all the information needed to address the item. This process continues until all items are resolved satisfactorily.
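Conceptually, the loop resembles the following sketch, which reuses the NIM client and checklist from the previous example; the prompt template, tool names, and single-completion "agent" are placeholders rather than the blueprint's implementation.

```python
# Conceptual sketch of the task-agent loop: each checklist item becomes a
# prompt naming the tools the agent may use, and items run in parallel.
from concurrent.futures import ThreadPoolExecutor

def build_prompt(item: str) -> str:
    return (
        f"Task: {item}\n"
        "Available tools: code_vdb_search, sbom_lookup, cve_intel_search.\n"
        "Gather evidence with the tools, then state a conclusion."
    )

def run_agent(item: str) -> dict:
    # A real agent would iterate over tool calls; a single completion
    # stands in for that loop here.
    answer = client.chat.completions.create(
        model="meta/llama3-70b-instruct",  # placeholder NIM model name
        messages=[{"role": "user", "content": build_prompt(item)}],
    ).choices[0].message.content
    return {"item": item, "finding": answer}

# Checklist items are independent, so they can be processed concurrently.
with ThreadPoolExecutor(max_workers=8) as pool:
    findings = list(pool.map(run_agent, checklist))
```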
Summarizing the findings
Once the agents have completed the checklist, a summarization NIM LLM condenses the results into a concise, human-readable paragraph. This step ensures that complex technical details are presented as an accessible finding.
Assigning justification status
Based on the summary, another NIM LLM assigns a Vulnerability Exploitability eXchange (VEX) status to each CVE. If the vulnerability is deemed exploitable, it’s categorized as “vulnerable.” If not exploitable, the system selects from 10 predefined categories to explain why, ranging from “false positive” to various forms of protection mechanisms.
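This classification can be approximated as a constrained completion, as sketched below. The status values are an illustrative subset only, not the blueprint's full set of 10 categories, and the summary string is a placeholder; the NIM client from the earlier sketches is reused.

```python
# Sketch of the VEX status assignment step. ALLOWED_STATUSES is an
# illustrative subset, not the blueprint's actual category list.
ALLOWED_STATUSES = [
    "vulnerable",
    "false_positive",
    "vulnerable_code_not_in_execute_path",
    "inline_mitigations_already_exist",
]

# Placeholder summary standing in for the output of the summarization step.
summary = "The vulnerable nghttp2 code path is not reachable from the service."

classification_prompt = (
    "Given the analysis summary below, respond with exactly one of: "
    + ", ".join(ALLOWED_STATUSES)
    + f"\n\nSummary:\n{summary}"
)

status = client.chat.completions.create(
    model="meta/llama3-70b-instruct",  # placeholder NIM model name
    messages=[{"role": "user", "content": classification_prompt}],
    temperature=0.0,
).choices[0].message.content.strip()

assert status in ALLOWED_STATUSES, f"Unexpected status: {status}"
```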
Preparing the final output and human review
The workflow concludes by preparing a comprehensive output file that contains all gathered and generated information in a human-readable format. This file is then passed to security analysts for a final review. These experts have the ultimate authority to determine whether the container meets the security requirements for publication.
This process uses the LangChain library, optimized and parallelized with Morpheus. This agentic approach enhances efficiency and reduces redundant effort, making the vulnerability analysis workflow both thorough and streamlined.
Applying NVIDIA AI Blueprints to containerized applications on AWS
The sample solution uses a combination of AWS services, such as Amazon ECR to store container images, Amazon Inspector to scan images for vulnerabilities, Amazon EventBridge and AWS Lambda to connect solution components in an event-driven serverless manner, and Amazon EKS to run the AI agent for vulnerability analysis.
The solution also uses Amazon Bedrock, a fully managed service that offers a choice of high-performing foundation models. For further efficiency when generating the GitHub issue content, the solution uses in-context learning, a technique that tailors AI responses to narrow scenarios: it builds generative AI prompts based on the programming language in question and a previously generated example of what a similar issue might look like.
This approach underscores a crucial point: for some narrow use cases, a smaller LLM (such as Llama 2 13B) with an assisted prompt might yield results as effective as those of a larger LLM (such as Llama 2 70B). We recommend that you evaluate both few-shot prompts with smaller LLMs and zero-shot prompts with larger LLMs to find the model that works most efficiently for you. Read more about providing prompts and examples in the Amazon Bedrock documentation.
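A minimal sketch of such a few-shot call through the Bedrock Converse API follows; the model ID, the example issue, and the analysis summary are placeholders.

```python
# Few-shot (in-context learning) prompt sent to Amazon Bedrock to draft
# GitHub issue content. Model ID, example issue, and summary are placeholders.
import boto3

bedrock = boto3.client("bedrock-runtime")

example_issue = (
    "Title: Bump nghttp2 to 1.57.0 to fix CVE-2023-44487\n"
    "Body: The base image bundles nghttp2 1.52.0, which is affected by ..."
)
analysis_summary = "CVE-2023-44487 is exploitable; upgrade nghttp2 in the base image."

prompt = (
    "You write remediation issues for a Python service.\n"
    "Here is an example of a well-formed issue:\n"
    f"{example_issue}\n\n"
    "Now write an issue for the following analysis result:\n"
    f"{analysis_summary}"
)

response = bedrock.converse(
    modelId="meta.llama3-8b-instruct-v1:0",  # placeholder model ID
    messages=[{"role": "user", "content": [{"text": prompt}]}],
    inferenceConfig={"temperature": 0.2, "maxTokens": 1024},
)
issue_text = response["output"]["message"]["content"][0]["text"]
```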
Full solution architecture
Before packaging an application as a container, engineering teams should make sure that their continuous integration pipeline includes steps such as static code scanning with tools like SonarQube or Amazon CodeGuru, and image analysis with tools like Amazon Inspector or Docker Scout. Validating your code for vulnerabilities at this stage aligns with the shift-left mentality, enabling engineers to detect and address potential threats during the earliest stages of development. The steps involved in this process are detailed below.
Steps 1-4
After the new application code is packaged and pushed to Amazon ECR, image scanning with Amazon Inspector is triggered. As the scan runs, Amazon Inspector emits an EventBridge finding event for each vulnerability detected, as well as a scan completion event at the end, as shown in Figure 4.
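For reference, the sketch below shows one way to wire Inspector finding events to a Lambda function with an EventBridge rule. The sample solution does this through IaC; the rule name and Lambda ARN here are placeholders.

```python
# Sketch of routing Amazon Inspector finding events to a Lambda function.
# In the sample solution this wiring (plus the Lambda invoke permission) is
# created by IaC rather than ad hoc boto3 calls.
import json

import boto3

events = boto3.client("events")

finding_pattern = {
    "source": ["aws.inspector2"],
    "detail-type": ["Inspector2 Finding"],
}

events.put_rule(
    Name="inspector-finding-to-lambda",
    EventPattern=json.dumps(finding_pattern),
    State="ENABLED",
)
events.put_targets(
    Rule="inspector-finding-to-lambda",
    Targets=[{
        "Id": "aggregate-findings",
        "Arn": "arn:aws:lambda:us-east-1:111122223333:function:aggregate-findings",
    }],
)
```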
Steps 5-11
EventBridge is configured to invoke a Lambda function for each finding event. The function aggregates the findings and updates an Amazon DynamoDB table with the details of each one. Once Amazon Inspector finishes, it emits the completion event to EventBridge, which invokes a Lambda function to retrieve the findings and the application metadata (such as the SBOM and source code URL), build the Morpheus LLM Agent request body, and trigger the analysis.
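A minimal sketch of the aggregation Lambda might look like the following, assuming a hypothetical DynamoDB table named InspectorFindings keyed by resource ID and CVE ID; the field paths reflect the Inspector2 Finding event structure.

```python
# Sketch of the finding-aggregation Lambda. Table name and key schema are
# assumptions; field paths follow the Inspector2 Finding event detail.
import boto3

table = boto3.resource("dynamodb").Table("InspectorFindings")

def handler(event, context):
    detail = event["detail"]
    resource = detail["resources"][0]
    table.put_item(
        Item={
            "resourceId": resource["id"],  # partition key (ECR image ARN)
            "cveId": detail["packageVulnerabilityDetails"]["vulnerabilityId"],  # sort key
            "severity": detail["severity"],
            "title": detail["title"],
        }
    )
    return {"status": "stored"}
```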
Steps 12-14
The AI Blueprint runs as described in detail in the previous section. Based on the received SBOM, list of CVEs, and knowledge base, it produces analysis results and persists them in an Amazon S3 bucket.
Steps 15-17
A Lambda function is invoked when a new analysis result object is stored in the S3 bucket. The function retrieves the analysis results, builds a prompt, and invokes Amazon Bedrock to generate the description and content of an issue. The same Lambda function then creates the issue in the source code repository.
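A condensed sketch of that final Lambda follows; the bucket trigger comes from the S3 event notification, and the model ID, repository, and token environment variable are placeholders.

```python
# Sketch of the final Lambda: read the analysis result from S3, ask Bedrock
# for issue text, and open an issue through the GitHub REST API. The requests
# library must be packaged with the function.
import json
import os

import boto3
import requests

s3 = boto3.client("s3")
bedrock = boto3.client("bedrock-runtime")

def handler(event, context):
    record = event["Records"][0]["s3"]
    obj = s3.get_object(Bucket=record["bucket"]["name"], Key=record["object"]["key"])
    analysis = json.loads(obj["Body"].read())

    response = bedrock.converse(
        modelId="meta.llama3-8b-instruct-v1:0",  # placeholder model ID
        messages=[{
            "role": "user",
            "content": [{"text": f"Write a GitHub issue for: {json.dumps(analysis)}"}],
        }],
    )
    issue_body = response["output"]["message"]["content"][0]["text"]

    requests.post(
        "https://api.github.com/repos/example-org/my-service/issues",  # placeholder repo
        headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"},
        json={"title": "Automated CVE remediation proposal", "body": issue_body},
        timeout=30,
    ).raise_for_status()
```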
Engineering teams are notified so they can validate the PR or issue and merge it into the source code repository. Over time, as engineering teams gain trust in the process, they might consider automating the merge step as well.
The sample IaC and code for the solution discussed in this blog are available in the aws-samples/gen-ai-cve-patching GitHub repo. To provision this solution in your AWS account, follow the step-by-step instructions in the README file.
Conclusion
Using the AI Blueprint for vulnerability analysis together with native AWS services such as Amazon Inspector, Amazon EKS, AWS Lambda, and Amazon Bedrock helps to simplify what has been a complex and laborious process. Having an automated workflow in place enables engineering teams to focus on delivering business value, thereby improving overall security without extra operational overhead. To learn more, see Applying Generative AI for CVE Analysis at an Enterprise Scale.
Ready to start building your agentic applications with the AI Blueprint for vulnerability analysis? Get a jump start with the interactive notebook experience and dive into the code base.