NVIDIA Cloud Functions (NVCF)
NVIDIA Cloud Functions (NVCF) is a unified API layer for running and scaling inference, fine-tuning, batch, and simulation workloads across Kubernetes clusters. It integrates into NVIDIA Cloud Partner (NCP) and independent software vendor (ISV) environments with secure, multi-tenant isolation inherited from the underlying infrastructure.
How NVIDIA Cloud Functions Works
NVIDIA Cloud Functions (NVCF) lets AI builders easily deploy and scale agentic AI, physical AI, and simulation workloads using NVIDIA models like Nemotron™ and Cosmos™, as well as supported open-source models from build.nvidia.com, giving teams the flexibility to run the models that best suit their workloads. By integrating with NVIDIA Dynamo, NVIDIA Grove, and KAI Scheduler, NVCF delivers autoscaling, multi-tenancy, and high GPU utilization while reducing fragmentation. It provides a single unified API for distributed multi-node inference that simplifies scaling and operations for even the most complex workloads and accelerates time to market.

NVIDIA Cloud Functions Key Features
Auto-Scaling to Zero
With NVCF, you can scale down to zero instances during periods of inactivity to optimize resource utilization and reduce costs. There’s no extra cost for cold-boot start times, and the system is optimized to minimize them.
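As a rough illustration of what scaling to zero involves, the sketch below computes a desired instance count from current load and idle time. It is not NVCF's autoscaler or API: `ScalingPolicy`, `desired_instances`, and every field and default value are hypothetical names chosen for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    # All fields are hypothetical names for this sketch, not NVCF settings.
    min_instances: int = 0         # 0 allows scaling to zero
    max_instances: int = 8         # upper bound on running instances
    target_concurrency: int = 4    # in-flight requests one instance should handle
    idle_timeout_s: float = 300.0  # idle time to wait before dropping to the floor


def desired_instances(policy, in_flight, seconds_since_last_request):
    """Return how many instances should be running right now."""
    if in_flight > 0:
        # Enough instances to keep each at or below the target concurrency,
        # capped at max_instances and never fewer than one while traffic exists.
        needed = math.ceil(in_flight / policy.target_concurrency)
        return max(1, policy.min_instances, min(needed, policy.max_instances))
    if seconds_since_last_request >= policy.idle_timeout_s:
        # Idle long enough: drop to the floor, which may be zero.
        return policy.min_instances
    # Idle but still inside the grace period: keep one warm instance
    # so a returning request does not pay a cold start.
    return max(1, policy.min_instances)
```

In this sketch, the idle timeout is the knob that trades cost against cold starts: a shorter timeout frees GPUs sooner, while a longer one keeps an instance warm for bursty traffic.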
BYO Observability
NVCF lets you integrate your preferred monitoring tools, such as Splunk, for comprehensive insight into your AI workloads.
Broad Workload Support
NVCF offers flexible deployment options, whether you bring your own containers, models, and Helm charts, or use NVIDIA’s open model fleet on build.nvidia.com, including Nemotron, Cosmos, GR00T, Clara™, Alpamayo, Apollo, and Earth-2. You can seamlessly create and scale functions tailored to your specific AI workloads.
Sovereign Deployment
NVCF can be deployed on your self-hosted infrastructure, including CSP environments, NVIDIA Cloud Partners (NCPs), or on premises, enabling you to keep models and data within required regions and meet sovereignty requirements.
Get Started With NVIDIA Cloud Functions

Try
Experience leading models on build.nvidia.com, accelerated by NVIDIA DGX™ Cloud with NVCF.

Start Here
A unified API layer for scaling inference and simulation workloads across one or more Kubernetes clusters.
GitHub

NVIDIA Cloud Functions Ecosystem
ISVs are core to the NVCF ecosystem, building AI applications on top of NVCF’s runtime capabilities such as autoscaling, load balancing, and observability. They transform the platform and models into vertical, customer-ready solutions that drive real-world adoption.
Ethical AI
NVIDIA believes trustworthy AI is a shared responsibility, and we have established policies and practices to support the development of AI across a wide array of applications. When downloading or using this model in accordance with our terms of service, developers should work with their supporting model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI Concerns here.
