Today, brands and their creative agencies are under enormous pressure to create and deliver high-quality, accurate product images at scale, from campaign key visuals to packshots for e-commerce. Audience-targeted content, such as personalized and localized visual variations, adds further layers of complexity to production.
Production costs, short timelines, limited resources, and the need to maintain brand identity are recurring hurdles that prevent marketing teams from creating more assets and more targeted content for their audience segments.
For example, an espresso machine manufacturer might want to target a wide range of audiences for an upcoming product launch, from young professionals living in a city to older generations enjoying retirement in the countryside. Historically, this would require multiple workstreams, locations, teams, and review cycles to execute, which is often not feasible, limiting the content available to marketing teams for targeting.
To generate high-quality, brand-accurate content at scale for wide-ranging audience segments, creative teams can now harness generative AI workflows. Integrating generative AI into tools and applications used for brand-accurate visual asset generation and content production can unlock new possibilities and efficiencies for the content supply chain.
Many developers are already working to make this a reality.
In this post, we introduce the NVIDIA Omniverse Blueprint for 3D conditioning for precise visual generative AI, explain how it works and what you can use it for, and share how a few industry leaders are thinking about the development of this field.
NVIDIA Omniverse Blueprints are reference workflows that enable you to easily implement and build 3D, simulation, and digital twin applications.
Model conditioning to unlock generative AI for scalable and controlled asset creation
Integrating generative AI into a workflow to create precise, on-brand images is problematic without control over the visual representation of the product. Specific geometry, colors, logos, and brand guidelines can be misinterpreted or lost without proper conditioning.
Model conditioning means providing a model with specific information or rules to help it make better predictions or decisions based on what you want it to do. To condition an LLM, you provide text-based instructions, examples, context, or previous conversation history. For image generators, you can provide text or a sample image.
Text and image inputs alone, however, provide only limited control over the AI model. This is why 3D conditioning is required.
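To make the idea concrete, here is a minimal sketch of conditioning an image generator with both a text prompt and a sample image, using the open-source Hugging Face diffusers library as a stand-in for any diffusion model. The model ID, file names, and strength value are illustrative assumptions, not part of the blueprint:

```python
# A minimal sketch of text- plus image-conditioning a diffusion model,
# using diffusers as a stand-in. File names are hypothetical.
import torch
from diffusers import AutoPipelineForImage2Image
from PIL import Image

pipe = AutoPipelineForImage2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

reference = Image.open("espresso_machine_render.png")  # hypothetical render

result = pipe(
    prompt="espresso machine on a marble counter, soft morning light",
    image=reference,   # the image condition constrains layout and composition
    strength=0.5,      # lower strength preserves more of the reference image
).images[0]
result.save("conditioned_output.png")
```

Even with both conditions, the model may still drift on fine details like logos and exact geometry, which motivates the 3D approach below.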
Setting the stage in 3D gives artists full creative control over the generated visuals. An easy-to-use UI for end-user interaction lets non-technical teams iterate and create content in a controlled, conditioned framework, while keeping branded assets untouched by the AI.
This Omniverse Blueprint takes a multimodal, hybrid approach: 3D for the hero asset and simple environment geometry, combined with 2D render passes for rapid inpainting to complete the controlled scene. Masking maintains the integrity of the product digital twin, and you can frame the shot by adjusting the camera angle and zoom in a 3D viewport.
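Because the mask defines exactly which pixels belong to the hero asset, protecting the product is a straightforward composite. The following sketch (with hypothetical file names) shows the pattern:

```python
# Sketch: composite the original render back over the AI-generated frame
# through the hero mask, so the product pixels are untouched by the AI.
from PIL import Image

render = Image.open("hero_render.png").convert("RGB")         # 3D render pass
generated = Image.open("inpainted_scene.png").convert("RGB")  # AI output
mask = Image.open("hero_mask.png").convert("L")               # white = hero asset

# Image.composite takes pixels from the first image where the mask is white.
final = Image.composite(render, generated, mask)
final.save("final_frame.png")
```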
Building a 3D-conditioned workflow for precise visual generative AI involves a handful of key components:
- On-brand hero asset: A finalized asset, built by an artist and typically approved by a brand manager and art director. For this example, we provided a simple espresso machine.
- A simple, untextured 3D scene: Provided by a 3D artist, used for staging the hero asset and controlling layout and composition.
- Custom application: Built with the Kit App Template based on Kit 106.2.
- Generative AI microservices and Kit extensions: Add generative AI functionality to your custom application. In this case, a diffusion model handles inpainting (see the sketch after this list).
- Solution testing: Verifies the functionality and performance of your integrated workflow.
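As a rough illustration of the microservice integration, the sketch below posts a render pass and hero mask to a diffusion inpainting service. The endpoint URL, payload fields, and response shape are assumptions for illustration; the blueprint's workflow guide documents the actual NIM interface:

```python
# A hedged sketch of calling a diffusion inpainting microservice over REST.
# Endpoint, payload fields, and response shape are assumptions.
import base64
import os
import requests

def encode(path: str) -> str:
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

INVOKE_URL = "https://ai.api.nvidia.com/v1/example/inpainting"  # hypothetical
headers = {
    "Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}",
    "Accept": "application/json",
}
payload = {
    "prompt": "sunlit Scandinavian kitchen, shallow depth of field",
    "image": encode("viewport_render.png"),  # 2D render pass from the app
    "mask": encode("hero_mask.png"),         # protects the hero asset
}

response = requests.post(INVOKE_URL, headers=headers, json=payload, timeout=120)
response.raise_for_status()
print(response.json())  # response structure depends on the deployed service
```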
For this workflow, we specifically explored microservices that enable you to use generative AI while also taking advantage of OpenUSD for 3D application and workflow development.
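For example, staging the hero asset with OpenUSD is non-destructive: the approved file is referenced into the scene rather than edited. A minimal sketch with the pxr Python API (asset paths are hypothetical):

```python
# Sketch: reference the approved hero asset into a simple staging scene
# with OpenUSD. The reference leaves the approved file unmodified.
from pxr import Usd, UsdGeom

stage = Usd.Stage.CreateNew("staging_scene.usda")
UsdGeom.Xform.Define(stage, "/World")

hero = UsdGeom.Xform.Define(stage, "/World/EspressoMachine")
hero.GetPrim().GetReferences().AddReference("assets/espresso_machine.usd")

# Place the hero asset and define a camera to frame the shot in a viewport.
UsdGeom.XformCommonAPI(hero).SetTranslate((0.0, 0.0, 0.0))
UsdGeom.Camera.Define(stage, "/World/ShotCamera")

stage.GetRootLayer().Save()
```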
Omniverse Blueprints are designed to be extensible and customizable. Here are some additional components that you can introduce to the workflow:
- Large multimodal models (LMMs) + ComfyUI: Fast generative text-to-image models that can synthesize photorealistic images from a text prompt.
- Edify 360 NIM: An early-access preview of Shutterstock’s generative 3D service for 360° High Dynamic Range Image (HDRI) generation, trained on NVIDIA Edify using Shutterstock’s licensed creative libraries.
- Edify 3D NIM: Shutterstock’s generative 3D service for 3D asset generation, used here for additional 3D objects for scene dressing. Also trained on NVIDIA Edify using Shutterstock’s licensed creative libraries.
- USD Code: A language model that answers OpenUSD knowledge queries and generates USD Python code.
- USD Search: An AI-powered search for OpenUSD data, 3D models, images, and assets using text- or image-based inputs (sketched after this list).
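As a hedged example of how such a microservice might be called, the sketch below issues a text query to USD Search over REST. The endpoint, payload fields, and response shape are assumptions; refer to the USD Search API documentation for the actual interface:

```python
# A hedged sketch of a text-based USD Search query over REST.
# Endpoint, payload fields, and response shape are assumptions.
import os
import requests

SEARCH_URL = "https://ai.api.nvidia.com/v1/omniverse/usdsearch"  # hypothetical
headers = {"Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}"}

resp = requests.post(
    SEARCH_URL,
    headers=headers,
    json={"query": "ceramic espresso cup", "limit": 5},  # assumed fields
    timeout=60,
)
resp.raise_for_status()
for asset in resp.json().get("results", []):  # assumed response shape
    print(asset.get("url"))
```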
By the end of the workflow guide, you will be able to develop your own custom, AI-enabled app to accelerate your creative and marketing teams. All microservices are currently available as previews on build.nvidia.com, where you can make API calls for evaluation.
Marketing ecosystem builds with NVIDIA Omniverse Blueprints
Developers at independent software vendors (ISVs) and production services agencies are building the next generation of content creation solutions, infused with controllable generative AI, built on OpenUSD.
For example, Accenture Song, GRIP, Monks, WPP, and Collective World are adopting Omniverse Blueprints to accelerate development.
Developing a scalable AI solution for on-brand asset creation
This blueprint provides an example architecture for building controllable generative AI applications. You or your clients can use the resulting app for:
- Multimodal AI-generated final-frame campaign assets
- Rapid concepting and ideation for key visuals
- Batch processing of prompt inputs, generating potentially hundreds of visual outputs from predefined text prompts fed from a database (see the sketch after this list)
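A batch run can be as simple as iterating over stored prompts and saving each result. In this sketch, generate_image is a hypothetical stand-in for the inpainting call shown earlier, and the database schema is assumed:

```python
# Sketch: batch-generate variations from prompts stored in a database.
# generate_image() and the table/column names are hypothetical.
import sqlite3

def generate_image(prompt: str) -> bytes:
    raise NotImplementedError  # call the inpainting microservice here

conn = sqlite3.connect("campaign_prompts.db")
rows = conn.execute("SELECT id, prompt FROM prompts WHERE market = 'EMEA'")
for prompt_id, prompt in rows:
    with open(f"output/variant_{prompt_id}.png", "wb") as f:
        f.write(generate_image(prompt))
conn.close()
```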
By implementing this blueprint, you or your client get the following benefits:
- Accelerated time to market: Significantly decrease the time it takes to create high-resolution branded assets, so products can reach market faster.
- Low-effort localization: Create localized imagery almost instantly, helping brands address cultural trends or requirements in different markets.
- Increased productivity: Easy-to-use tools built on 3D data lower the technical barrier traditionally associated with high-fidelity asset creation.
Get started
In this post, we introduced the NVIDIA Omniverse Blueprint for 3D conditioning for precise visual generative AI and showed you ways to benefit from building generative AI applications for brand-accurate visual asset generation and content production.
For more information, see the following resources:
- 3D conditioning for precise visual generative AI blueprint with the interactive demo in the NVIDIA API catalog
- GA release of USD Search API, including a downloadable Helm chart for self-deployment to interface with your own data on your own infrastructure
- Reference architecture sample workflow with a step-by-step guide to implementing the blueprint
- /NVIDIA-Omniverse-blueprints/3d-conditioning GitHub repo, including a workflow guide