Advancing Production AI with NVIDIA AI Enterprise

While harnessing the potential of AI is a priority for many of today’s enterprises, developing and deploying an AI model takes significant time and effort. Challenges must often be overcome to move a model into production, especially for mission-critical business operations. According to IDC research, only 18% of enterprises surveyed could put an AI model into production in under a month.

This post explores the challenges that slow down AI deployments and introduces the benefits of using a consistent, secure, and reliable platform to accelerate the journey of taking AI into production.

The ever-growing complexity of the AI software stack

Open-source software (OSS) plays a critical role in advancing AI adoption. According to The State of the Octoverse 2023, there were 65K public generative AI-related GitHub projects in 2023, representing 248% year-over-year growth. While the open-source community has helped fuel the AI era, the diverse range of OSS used in building AI applications makes maintaining a reliable, enterprise-grade AI software stack a complex and resource-intensive endeavor, comparable to maintaining an open-source operating system.

For example, NVIDIA Triton Inference Server, used to standardize and scale AI deployments, relies on countless software dependencies. In Figure 1, green dots represent CUDA libraries, white dots represent OSS packages, and the lines in between represent dependencies. Any single change, such as a routine software update or security patch, can introduce an API change and cause application failure or downtime.

A graphic representation of NVIDIA Triton Inference Server software dependencies. Green dots represent CUDA libraries, white dots represent OSS packages, and the lines in between represent dependencies.
Figure 1. Software dependencies of NVIDIA Triton Inference Server

Continuous security monitoring 

The inevitable increase in security vulnerabilities makes maintaining the AI software stack even more challenging. A recent open-source security and risk analysis report by Synopsys indicates a 236% surge in high-risk attack patterns in OSS vulnerabilities for big data, AI, business intelligence, and machine learning over a 5-year period.

New vulnerabilities are constantly being discovered. For example, Figure 2 shows a comparison of security scanning results for the NVIDIA Triton container. In just over 3 weeks, one critical vulnerability was identified. In addition, the number of high vulnerabilities grew from 4 to 11. Continuous monitoring and rapid response times to fix vulnerabilities are critical for maintaining business continuity.
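The kind of drift Figure 2 illustrates can be tracked programmatically. The following sketch (a minimal illustration, assuming scanner output has already been reduced to per-severity counts; the numbers mirror the Triton example above and are not real scan data) compares two scan snapshots:

```python
from collections import Counter

def severity_delta(before: Counter, after: Counter) -> dict:
    """Report how vulnerability counts changed between two scans."""
    return {sev: after.get(sev, 0) - before.get(sev, 0)
            for sev in set(before) | set(after)}

# Illustrative counts matching the Triton scans described above
scan_week_0 = Counter({"CRITICAL": 0, "HIGH": 4})
scan_week_3 = Counter({"CRITICAL": 1, "HIGH": 11})

delta = severity_delta(scan_week_0, scan_week_3)
print(sorted(delta.items()))  # [('CRITICAL', 1), ('HIGH', 7)]
```

In practice the counts would come from a container scanner's report rather than hard-coded values, and a nonzero delta would trigger an alert or a patching workflow.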

Two screenshots showing that the vulnerabilities of NVIDIA Triton increased in 3 weeks.
Figure 2. Security scan results comparison for NVIDIA Triton 

NVIDIA AI Enterprise for production AI 

To help address these challenges, NVIDIA introduced NVIDIA AI Enterprise, an end-to-end, cloud-native software platform that accelerates data science pipelines and streamlines development and deployment of production-grade AI. Built on open source and curated, optimized, and supported by NVIDIA, the NVIDIA AI Enterprise software platform enables developers to focus on building and deploying new AI services.

NVIDIA AI Enterprise includes three supported branches: production branches, feature branches, and long-term support branches. Customers have access to all three branches and can use any mix of the three.

  • Production branches ensure API stability and regular security updates, making them ideal for deploying AI in production when stability is required. Released every 6 months with a 9-month lifecycle.
  • Feature branches include the top-of-tree software updates, making them ideal for AI developers who want the latest, fastest-moving development environment. Released monthly.
  • Long-term support branches are ideal for highly regulated industries. Released every 2.5 years with a lifecycle of up to 3 years.

API stability and security

Throughout the 9-month lifecycle of each NVIDIA AI Enterprise production branch, NVIDIA continuously monitors critical and high common vulnerabilities and exposures (CVEs) and releases monthly security patches (Figure 3). As a result, the AI frameworks, libraries, models, and tools included in NVIDIA AI Enterprise can receive security fixes without the risk of breaking an API.

Graphic of NVIDIA AI Enterprise production branch lifecycle timeline.
Figure 3. NVIDIA AI Enterprise production branch lifecycle timeline

Figure 4 compares the version of Triton available through the production branch release of NVIDIA AI Enterprise to the open-source version of Triton. The commercial version available with the production branch of NVIDIA AI Enterprise has zero critical and high vulnerabilities, while the open-source version has nine high vulnerabilities. 

Two screenshots of vulnerability scanning results for two Triton images: one open-source image from NGC, and one from the NVIDIA AI Enterprise production branch.
Figure 4. Triton security scan results comparison

Security through transparency 

In addition to production branches with monthly CVE patches and bug fixes, NVIDIA AI Enterprise customers can also receive security advisories and exploitability information from NVIDIA, including Vulnerability Exploitability eXchange (VEX) documents, a Software Bill of Materials (SBOM), vulnerability context, and remediation guidance.

A VEX document is a relatively recent addition to the field of cybersecurity. Unlike a CVE entry, which provides general information about a vulnerability, a VEX document programmatically provides important context-specific details. It indicates whether a vulnerability is relevant (or exploitable) to particular components within the AI stack. It is also used to communicate false positives flagged by vulnerability scanning tools. VEX documents at NVIDIA are delivered in the CycloneDX format, which provides a standardized, machine-readable way to share the information with customers.
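Because CycloneDX VEX documents are machine-readable, a consumer can filter out the vulnerabilities a vendor has marked as not affecting the product and focus only on exploitable ones. The sketch below uses a minimal, hypothetical document (the CVE IDs and field values are illustrative, not from any real NVIDIA advisory) to show the idea:

```python
import json

# A minimal, hypothetical CycloneDX VEX document. Real documents
# carry far more detail; the CVE IDs here are made up.
vex_json = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "vulnerabilities": [
    {"id": "CVE-2024-0001",
     "analysis": {"state": "not_affected",
                  "justification": "code_not_reachable"}},
    {"id": "CVE-2024-0002",
     "analysis": {"state": "exploitable"}}
  ]
}
"""

def exploitable_cves(doc: dict) -> list:
    """Return IDs of vulnerabilities marked as exploitable."""
    return [v["id"] for v in doc.get("vulnerabilities", [])
            if v.get("analysis", {}).get("state") == "exploitable"]

doc = json.loads(vex_json)
print(exploitable_cves(doc))  # ['CVE-2024-0002']
```

Here CVE-2024-0001 is suppressed because the vendor's analysis states the vulnerable code is not reachable, which is exactly the false-positive context a plain CVE feed cannot convey.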

Software optimization over time for better performance and lower TCO

As NVIDIA continues to optimize its AI software over time, these advances deliver up to 54% performance gains without a hardware upgrade. Figure 5 compares NVIDIA MLPerf Inference v3.0 and v2.1 submission results with NVIDIA H100 GPUs. This not only improves efficiency and performance, but also reduces energy consumption, footprint, and investment in the data center or cloud.

Chart of NVIDIA MLPerf Inference v3.0 compared to v2.1 submission results on NVIDIA H100.
Figure 5. NVIDIA inference software delivers up to 54% performance gains without a hardware upgrade

Enterprise support 

Enterprise support is included with every NVIDIA AI Enterprise subscription, enabling organizations to benefit from the transparency of open source with the assurance of full software stack support provided by NVIDIA. Business-standard support includes: 

  • Unlimited technical support cases accepted through the customer portal and phone 24 hours a day, 7 days a week 
  • Escalation support during local business hours
  • Timely resolution from NVIDIA experts and engineers
  • Up to 3 years of long-term support

Whether you need to connect with AI experts, access knowledge base resources, or troubleshoot performance issues, NVIDIA is here to help you and provide the support you need to keep your AI stable and secure.

Get started with NVIDIA AI Enterprise

NVIDIA AI Enterprise reduces the costs and burden of maintaining and securing the complex software platform for production AI, freeing organizations to focus on building AI and harnessing its game-changing insights. 

To experience the enterprise platform, request a free 90-day evaluation license that grants access to all software branches and enterprise support. 

Already an NVIDIA AI Enterprise user? Access the latest version of the production branch.
