NVIDIA AI Inference Manager SDK
The NVIDIA AI Inference Manager (AIM) SDK streamlines AI model deployment and integration for PC application developers. The SDK pre-configures the PC with the necessary AI models, engines, and dependencies; orchestrates AI inference seamlessly across PC and cloud through a unified inference API; and supports all major inference backends across different hardware accelerators (GPU, NPU, and CPU). The AIM SDK is available for early access.
Product Overview
Delivering optimal AI-enabled user experiences across the breadth of the PC ecosystem can be complex. Developers need to manage models, libraries, and dependencies across different types of devices, and switch to the cloud when on-device resources are limited. AIM offers an easy path to integrate AI models into applications and orchestrate deployment across cloud and PC.
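To make the idea of a unified inference API concrete, here is a minimal sketch of what an integration could look like. The header, namespace, and call names are illustrative assumptions, not the actual AIM SDK interface.

```cpp
// Hypothetical sketch only: aim::Config, aim::Context, and infer() are
// illustrative placeholders, not the actual AIM SDK interface.
#include <iostream>
#include <string>

namespace aim {
struct Config {
    std::string model;   // model identifier to deploy
    std::string policy;  // e.g., "prefer-local" with cloud fallback
};
struct Context {
    explicit Context(const Config&) { /* verify hardware, fetch model, pick backend */ }
    // Same call site whether inference ends up running locally or in the cloud.
    std::string infer(const std::string& prompt) {
        return "<completion for: " + prompt + ">";
    }
};
}  // namespace aim

int main() {
    aim::Context ctx(aim::Config{"llama-3-8b-instruct", "prefer-local"});
    std::cout << ctx.infer("Hello!") << "\n";
}
```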
Application Deployment
Perform hardware compatibility checks, and manage the installation and configuration of models, engines, and runtime dependencies on the user's device. Easily define an application-specific policy to orchestrate across cloud and local PC execution.
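As an illustration of what such a policy could encode, the sketch below routes inference based on available GPU memory. The types, fields, and threshold logic are hypothetical, not the AIM SDK's actual policy format.

```cpp
// Hypothetical policy sketch: the struct, fields, and decision logic are
// illustrative assumptions, not the AIM SDK's actual policy format.
#include <cstdint>
#include <iostream>

enum class ExecutionTarget { Local, Cloud };

struct AppPolicy {
    std::uint64_t min_vram_bytes;  // minimum GPU memory required to run locally
    bool allow_cloud;              // whether cloud fallback is permitted
};

// Route execution based on what the hardware compatibility check reported.
ExecutionTarget choose_target(const AppPolicy& policy, std::uint64_t detected_vram) {
    if (detected_vram >= policy.min_vram_bytes)
        return ExecutionTarget::Local;  // device can host the model
    if (policy.allow_cloud)
        return ExecutionTarget::Cloud;  // offload to a cloud endpoint
    return ExecutionTarget::Local;      // e.g., fall back to a smaller local model
}

int main() {
    AppPolicy policy{8ull << 30, true};  // require 8 GiB of VRAM, allow cloud
    bool cloud = choose_target(policy, 4ull << 30) == ExecutionTarget::Cloud;
    std::cout << (cloud ? "cloud" : "local") << "\n";  // prints "cloud"
}
```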
Multiple Inference Backends
Use your choice of backend, whether DirectML, TensorRT, Llama.cpp, PyTorch-CUDA, or a custom backend, to optimize and run models on end-user devices.
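One way an application might map detected hardware to a backend is sketched below. The enum and the mapping are assumptions for illustration; the AIM SDK's actual backend-selection API may differ.

```cpp
// Illustrative backend-selection sketch: the enum and the mapping below are
// assumptions, not the AIM SDK's actual backend API.
#include <iostream>
#include <string>

enum class Backend { TensorRT, DirectML, LlamaCpp, PyTorchCUDA };

// Map the accelerator reported by the compatibility check to a backend.
Backend select_backend(const std::string& accelerator) {
    if (accelerator == "nvidia-gpu") return Backend::TensorRT;  // optimized path on RTX GPUs
    if (accelerator == "npu")        return Backend::DirectML;  // vendor-neutral NPU path
    return Backend::LlamaCpp;                                   // CPU fallback
}

int main() {
    // A custom backend could be slotted into the same decision.
    std::cout << (select_backend("npu") == Backend::DirectML) << "\n";  // prints 1
}
```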
Local and Cloud Execution
AIM SDK supports any cloud API endpoint, including NVIDIA NIM, as well as local execution on PCs. For local execution, developers can use either an in-process execution method that integrates directly into latency-sensitive applications, or an out-of-process execution method that runs as a service alongside the application.
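The sketch below contrasts the two local integration styles. The class names, methods, and endpoint are assumptions for illustration, not the AIM SDK's actual API.

```cpp
// Illustrative sketch of the two local integration styles. Class names,
// methods, and the endpoint string are assumptions, not AIM's actual API.
#include <iostream>
#include <string>
#include <utility>

// In-process: the runtime lives in the application's address space, so an
// inference call is a direct function call with no IPC overhead. Suited to
// latency-sensitive applications such as games.
struct InProcessRuntime {
    std::string infer(const std::string& prompt) {
        return "<in-process completion>";  // stub standing in for real inference
    }
};

// Out-of-process: inference runs in a separate service process; the app
// sends requests over a local endpoint. Faults and memory are isolated
// from the application at the cost of an IPC round trip.
struct ServiceClient {
    explicit ServiceClient(std::string endpoint) : endpoint_(std::move(endpoint)) {}
    std::string infer(const std::string& prompt) {
        return "<completion via " + endpoint_ + ">";  // stub for the IPC call
    }
    std::string endpoint_;
};

int main() {
    InProcessRuntime local;
    ServiceClient service("ipc://localhost/aim");  // hypothetical endpoint
    std::cout << local.infer("hi") << "\n" << service.infer("hi") << "\n";
}
```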
Integrated with Graphics Pipelines
AIM offers native integration into game pipelines, with simultaneous CUDA and graphics execution at low latency.
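Conceptually, this means inference work shares the GPU with rendering without stalling it. The sketch below illustrates the general technique with plain CUDA streams (not AIM's API): inference is queued on a separate low-priority stream so the driver favors graphics work when both are pending.

```cpp
// Conceptual illustration using plain CUDA streams, not the AIM SDK's API.
#include <cuda_runtime.h>

__global__ void inference_step(float* activations, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) activations[i] *= 0.5f;  // stand-in for real model math
}

int main() {
    // CUDA stream priorities: numerically larger = lower priority; the range
    // query returns the least and greatest priorities the device supports.
    int least, greatest;
    cudaDeviceGetStreamPriorityRange(&least, &greatest);

    // Put inference on a low-priority, non-blocking stream so the driver
    // favors the graphics queue when both have pending work.
    cudaStream_t inferStream;
    cudaStreamCreateWithPriority(&inferStream, cudaStreamNonBlocking, least);

    const int n = 1 << 20;
    float* activations = nullptr;
    cudaMalloc(&activations, n * sizeof(float));

    // A render loop would keep submitting graphics work on its own queue
    // while this runs; here we just launch and wait to keep the sketch short.
    inference_step<<<(n + 255) / 256, 256, 0, inferStream>>>(activations, n);
    cudaStreamSynchronize(inferStream);

    cudaFree(activations);
    cudaStreamDestroy(inferStream);
    return 0;
}
```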
Benefits
Flexibility
Built in a modular fashion with C++ plugins, AIM gives developers full flexibility to design application-specific experiences for users: integrate your choice of inference backend, configure custom execution policies, and more.
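As a sketch of what a C++ plugin for a custom backend could look like, consider the interface below. It is a hypothetical illustration, not AIM's actual plugin ABI.

```cpp
// Hypothetical plugin sketch: this interface illustrates how a C++ plugin
// system for custom backends could look; it is not AIM's actual plugin ABI.
#include <string>
#include <vector>

// A backend plugin: implementations would wrap DirectML, TensorRT,
// Llama.cpp, or an application's own inference engine.
class IInferenceBackend {
public:
    virtual ~IInferenceBackend() = default;
    virtual std::string name() const = 0;
    virtual bool load_model(const std::string& path) = 0;
    virtual std::vector<float> run(const std::vector<float>& input) = 0;
};

// A custom backend an application could register alongside the built-ins.
class MyCustomBackend : public IInferenceBackend {
public:
    std::string name() const override { return "my-custom-backend"; }
    bool load_model(const std::string& path) override { return true; }      // stub
    std::vector<float> run(const std::vector<float>& in) override { return in; }  // stub
};
```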
Ease of Use
Easily deploy AI capabilities into applications without managing installation processes, models, engines, and runtime dependencies on end-user systems.
Scale Across Platforms
Scale across thousands of end-user system configurations, including different accelerators (GPU, NPU, and CPU), while delivering a superior user experience through either cloud or local PC deployments of AI models.
Related Products
RTX AI Toolkit
A suite of tools for Windows developers to accelerate customization, optimization, and deployment of AI models across RTX PCs and the cloud.
NVIDIA ACE
NVIDIA ACE is a suite of NIMs that helps developers bring digital humans to life with generative AI.
Resources
AIM is currently available to developers as part of an early access program.