NVIDIA In-Game Inferencing SDK (Beta)

The NVIDIA In-Game Inferencing SDK streamlines AI model deployment and integration for PC application developers. The SDK pre-configures the PC with the necessary AI models, engines, and dependencies; orchestrates AI inference seamlessly across PC and cloud from a unified inference API; and supports all major inference backends across different hardware accelerators (GPU, NPU, and CPU). The In-Game Inferencing SDK is available in beta.

Formerly available as the NVIDIA AI Inference Manager (AIM) SDK.

Product Overview

Delivering optimal AI-enabled user experiences across the breadth of the PC ecosystem can be complex. Developers need to manage models, libraries, and dependencies across different types of devices, and switch to the cloud when on-device resources are limited. The In-Game Inferencing SDK offers an easy path to integrate AI models into applications and orchestrate deployment across cloud and PC.

Application Deployment

Perform hardware compatibility checks and manage the installation and configuration of models, engines, and runtime dependencies on the user's device. Easily define an application-specific policy to orchestrate execution across cloud and local PC.
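
To make the policy idea concrete, here is a minimal sketch of how an application might gate execution between the local GPU and a cloud endpoint. The types, names, and threshold logic are illustrative assumptions, not the SDK's actual API.

    // Hypothetical policy sketch: prefer local execution when the device can
    // hold the model, otherwise fall back to a cloud endpoint. All names here
    // are illustrative, not SDK types.
    #include <cstddef>

    struct SystemCaps {
        std::size_t vramBytes;     // available GPU memory
        bool        hasTensorCores;
    };

    enum class ExecutionTarget { LocalGPU, Cloud };

    ExecutionTarget choosePolicy(const SystemCaps& caps, std::size_t modelBytes) {
        if (caps.hasTensorCores && caps.vramBytes >= modelBytes) {
            return ExecutionTarget::LocalGPU;
        }
        return ExecutionTarget::Cloud;
    }

A real policy could also weigh battery state, network quality, or user preference; the point is that the decision lives in application code, not in the runtime.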

Multiple Inference Backends

Use your choice of backend (DirectML, TensorRT, llama.cpp, PyTorch CUDA, or a custom backend) to optimize and run models on end-user devices.
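
As a sketch of what backend choice can look like at the application level, the dispatch below maps each backend named above to a runtime library. The mapping and function are illustrative, not part of the SDK.

    // Illustrative backend dispatch. The enum mirrors the backends named
    // above; the library names are examples, not an SDK contract.
    #include <stdexcept>
    #include <string>

    enum class Backend { DirectML, TensorRT, LlamaCpp, PyTorchCUDA, Custom };

    std::string backendLibrary(Backend b) {
        switch (b) {
            case Backend::DirectML:    return "DirectML.dll"; // GPU/NPU via DirectX 12
            case Backend::TensorRT:    return "nvinfer";      // optimized for NVIDIA GPUs
            case Backend::LlamaCpp:    return "llama";        // CPU/GPU GGUF models
            case Backend::PyTorchCUDA: return "torch_cuda";   // illustrative name
            case Backend::Custom:      return "app_plugin";   // application-supplied
        }
        throw std::invalid_argument("unknown backend");
    }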

Local and Cloud Execution

NVIDIA In-Game Inferencing SDK supports any cloud API endpoint, including NVIDIA NIM, as well as local execution on PCs. For local execution, developers can choose in-process execution, which integrates directly into latency-sensitive applications, or out-of-process execution, which runs as a service alongside the application.
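
The difference between the two local modes can be sketched as follows. The interfaces are hypothetical; they stand in for a real backend call and a real IPC round trip, respectively.

    // Sketch of the two local execution shapes described above. Names are
    // hypothetical; the point is the call-path difference, not real SDK types.
    #include <iostream>
    #include <memory>
    #include <string>

    struct InferenceRunner {
        virtual ~InferenceRunner() = default;
        virtual std::string infer(const std::string& prompt) = 0;
    };

    // In-process: the model runs in the application's address space,
    // avoiding IPC overhead for latency-sensitive callers (e.g., a game loop).
    struct InProcessRunner : InferenceRunner {
        std::string infer(const std::string& prompt) override {
            return "local result for: " + prompt;   // placeholder for a backend call
        }
    };

    // Out-of-process: requests go to a separate service process, isolating
    // the runtime and its dependencies from the application.
    struct ServiceRunner : InferenceRunner {
        std::string infer(const std::string& prompt) override {
            return "service result for: " + prompt; // placeholder for an IPC round trip
        }
    };

    int main() {
        std::unique_ptr<InferenceRunner> runner = std::make_unique<InProcessRunner>();
        std::cout << runner->infer("hello") << "\n";
    }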

Integrated with Graphics Pipelines

Offers native integration into game pipelines and simultaneous CUDA and graphics execution with low latency.
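
One common technique behind this kind of GPU sharing is CUDA stream priorities, sketched below with the standard CUDA runtime API. This illustrates the general approach, not the SDK's internal scheduler.

    // Minimal sketch of sharing the GPU between rendering and inference
    // using CUDA stream priorities (error checks omitted for brevity).
    #include <cuda_runtime.h>
    #include <cstdio>

    int main() {
        int lowest = 0, highest = 0;
        cudaDeviceGetStreamPriorityRange(&lowest, &highest);

        // Run inference on a low-priority stream so render work submitted at
        // higher priority is delayed less by long-running inference kernels.
        cudaStream_t inferStream;
        cudaStreamCreateWithPriority(&inferStream, cudaStreamNonBlocking, lowest);

        // ... enqueue inference kernels / backend work on inferStream ...

        cudaStreamSynchronize(inferStream);
        cudaStreamDestroy(inferStream);
        std::printf("priority range: %d (lowest) .. %d (highest)\n", lowest, highest);
        return 0;
    }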


Benefits

Flexibility

Built in a modular fashion with C++ plugins, NVIDIA In-Game Inferencing gives developers full flexibility to design application-specific experiences for users. This modularity lets you integrate your choice of inference backend, configure custom execution policies, and more.
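
As an illustration of the plugin shape (not the SDK's actual interface), a custom backend could be a shared library exporting a factory for an interface like this:

    // Hypothetical plugin boundary illustrating the modular design described
    // above; the interface and export name are assumptions, not SDK API.
    #include <string>

    struct IInferencePlugin {
        virtual ~IInferencePlugin() = default;
        virtual const char* name() const = 0;                  // backend identifier
        virtual bool load(const std::string& modelPath) = 0;   // prepare a model
        virtual std::string run(const std::string& input) = 0; // execute inference
    };

    // A custom backend ships as a shared library exporting a C factory,
    // so the host can discover and load it at runtime.
    extern "C" IInferencePlugin* CreatePlugin();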

Ease of Use

Easily deploy AI capabilities into applications without worrying about installing and managing models, engines, and runtime dependencies on end-user systems.

Scale Across Platforms

Scale across thousands of end-user system configurations, including different accelerators (GPU, NPU, and CPU), while delivering a superior user experience through either cloud or local PC deployment of AI models.


Related Products

RTX AI Toolkit

A suite of tools for Windows developers to accelerate customization, optimization, and deployment of AI models across RTX PCs and cloud.

NVIDIA ACE

NVIDIA ACE is a suite of NIM microservices that helps developers bring digital humans to life with generative AI.


Resources

Streamline AI-Powered App Development with NVIDIA RTX AI Toolkit for Windows RTX PCs

A suite of tools for Windows developers to accelerate customization, optimization, and deployment of AI models across RTX PCs and cloud.

Get started with the NVIDIA In-Game Inferencing SDK.

Download Beta