
Generating Financial Market Scenarios Using NVIDIA NIM

While generative AI can be used to create clever rhymes, cool images, and soothing voices, a closer look at the techniques behind these impressive content generators reveals probabilistic learners, compression tools, and sequence modelers. When applied to quantitative finance, these methods can help disentangle and learn complex associations in financial markets. 

Market scenarios are crucial for risk management, strategy backtesting, portfolio optimization, and regulatory compliance. These hypothetical data models represent potential future market conditions, helping financial institutions to simulate and assess outcomes and make informed investment decisions. 

Different generative methods excel in different areas, such as:

  • Data generation with variational autoencoders or denoising diffusion models
  • Modeling sequences with intricate dependencies using transformer-based generative models
  • Understanding and predicting time-series dynamics with state-space models

While these methods may operate in distinct ways, they can also be combined to yield powerful results. 

This post explores how variational autoencoders (VAEs), denoising diffusion models (DDMs), and other generative tools can be integrated with large language models (LLMs) to efficiently create market scenarios with desired properties. It showcases a scenario generation reference architecture powered by NVIDIA NIM, a collection of microservices designed to accelerate the deployment of generative models.

One toolset, many applications

Generative AI provides a unified framework for a variety of quantitative finance problems that have previously been addressed with distinct approaches. Once a model has been trained to learn the distribution of its input data, it can serve as a foundation model for a range of downstream tasks. 

For instance, it could generate samples for the creation of simulations or risk scenarios. It could also pinpoint which samples are out of distribution, acting as an outlier detector or stress scenario generator. As market data moves at different frequencies, cross-sectional market snapshots have gaps. A generative model can supply the missing data that fits with the actual data in a plausible way, which is beneficial for nowcasting models or dealing with illiquid points. Finally, autoregressive next-token prediction and state space models can help with forecasting. 
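As a minimal sketch of this idea (the class and method names below are illustrative assumptions, not part of any particular library), a single trained generative model exposing an encoder and a decoder can back several of these tasks:

import numpy as np

class GenerativeMarketModel:
    """Hypothetical wrapper around a trained generative model of market objects."""

    def __init__(self, encode, decode, latent_dim):
        self.encode = encode          # market object -> latent vector
        self.decode = decode          # latent vector -> market object
        self.latent_dim = latent_dim

    def sample(self, n):
        """Scenario generation: draw latent points from the prior and decode them."""
        z = np.random.standard_normal((n, self.latent_dim))
        return np.stack([self.decode(zi) for zi in z])

    def outlier_score(self, x):
        """Outlier or stress detection: distance of the latent code from the prior mean."""
        return float(np.linalg.norm(self.encode(x)))

    def impute(self, x_partial, observed_mask, n_candidates=256):
        """Gap filling: return the generated sample closest to the observed entries."""
        candidates = self.sample(n_candidates)
        errors = (((candidates - x_partial) ** 2) * observed_mask).sum(axis=1)
        return candidates[int(np.argmin(errors))]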

A significant bottleneck for domain experts leveraging such generative models is the lack of platform support that bridges their ideas and intentions with the complex infrastructure needed to deploy these models. While LLMs have gained mainstream use across various industries including finance, they are primarily used for knowledge processing tasks, such as Q&A and summarization, or coding tasks, like generating code stubs for further enhancement by human developers and integration into proprietary libraries. 

Integrating LLMs with complex models can bridge the communication gap between quantitative experts and generative AI models, as explained below.

Market scenario generation

Traditionally, the generation of market scenarios has relied on techniques including expert specifications (“shift the US yield curve up by 50 bp in parallel”), factor decompositions (“bump the EUR swap curve by -10 bp along the first PCA direction”), and statistical methods such as variance-covariance or bootstrapping. While these techniques help produce new scenarios, they lack a full picture of the underlying data distribution and often require manual adjustment. Generative approaches, which learn data distributions implicitly, elegantly overcome this modeling bottleneck.
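For concreteness, here is a minimal NumPy sketch of the two traditional bump styles mentioned above, assuming historical curves are stored as a matrix with one row per date and one column per tenor, and rates expressed as decimals:

import numpy as np

def parallel_shift(curve, shift_bp=50):
    """Expert-specified scenario: shift every tenor by shift_bp basis points."""
    return curve + shift_bp / 1e4

def pca_bump(historical_curves, curve, component=0, size_bp=-10):
    """Factor-based scenario: bump a curve along a principal component direction."""
    centered = historical_curves - historical_curves.mean(axis=0)
    # Right singular vectors of the centered data are the PCA directions (unit norm)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return curve + size_bp / 1e4 * vt[component]

Both produce plausible-looking curves, but neither draws on the full joint distribution of historical curve shapes, which is precisely what the generative approaches described next aim to learn.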

LLMs can be combined with scenario generation models in powerful ways to enable simplified interaction, while also acting as natural language user interfaces for market data exploration. For example, a trader might wish to assess her book's exposure if markets were to behave as they did during a previous event, such as the great financial crisis, the last U.S. election, the Flash Crash, or the dot-com bubble burst. An LLM trained on recorded knowledge of such events could find and extract the characteristics of interest for the event or historical period and pass them to a generative market model to create similar market conditions for use in downstream applications. 

Figure 1 illustrates a reference architecture for market scenario generation, connecting user specifications with suitable generative tools. Sample code for an implementation powered by NVIDIA NIM is shown in the Sample Implementation section. 

The process starts with a user instruction; for example, requesting a simulation of an interest rate environment similar to the one "at the peak of the financial crisis." An agent (or collection of agents) processes this request by first routing it to an LLM-powered interpreter that converts the natural language request into an intermediate format (in this case, a JSON file). 

The LLM then translates the “peak of financial crisis” to a concrete historical period (September 15 to October 15, 2008) and maps the market objects of interest (U.S. swap curves and swaption volatility surfaces, for example) to their respective pre-trained generative models (VAE, DDM, and so on). Information about the historical period of interest can be retrieved through a data retriever component and passed on to the corresponding generative tools to generate similar market data. 

Reference architecture for market scenario generation, connecting user specifications with suitable generative tools. Starting with a user request to generate market scenarios similar to those of a given historical period, an agent first routes it to an LLM-powered Interpreter that translates the request to a concrete historical period and maps the market objects of interest (e.g., swap curves, volatility surfaces) to corresponding pre-trained generative models. Historical data is retrieved by a data retriever component and passed on to these models which, in turn, generate novel market scenarios.
Figure 1. Market scenario generator reference architecture using NIM microservices
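A minimal sketch of this routing and dispatch step is shown below; the registry, retriever signature, and generate call are illustrative assumptions, not part of NIM or of the reference implementation shown later.

import json

# Hypothetical registry mapping (method, object type) pairs to pre-trained generators
MODEL_REGISTRY = {
    ("VAE", "yield curve"): "vae_yield_curves",
    ("DDPM", "swaption volatility surface"): "ddpm_swaption_vols",
}

def dispatch_scenarios(interpreter_json, retrieve_history, generators):
    """Route each scenario spec from the LLM interpreter to its generative model."""
    scenarios = json.loads(interpreter_json)["scenarios"]
    results = []
    for spec in scenarios:
        model = generators[MODEL_REGISTRY[(spec["method"], spec["object_type"])]]
        # Retrieve historical data for the requested period to condition generation
        history = retrieve_history(spec["object_type"],
                                   spec["period"]["start_date"],
                                   spec["period"]["end_date"])
        results.append(model.generate(conditioning_data=history))
    return results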

The generation process entails running inference on generative models that have been pre-trained on market data. Figures 2 and 3 illustrate an example of yield curve scenarios, corresponding to the start of the COVID-19 pandemic, generated using a VAE model.

3D latent space projections shown as points that represent historical (circles) and synthetic (triangles) US yield curves at the start of the COVID-19 pandemic (Late Feb to mid April 2020).
Figure 2. Latent space projections of historical U.S. yield curves at the start of the COVID-19 pandemic, late February to mid-April 2020 (left), and scenarios added by sampling in the latent space in the neighborhoods of realized yield curves (right)
Historical and synthetically generated yield curve scenarios for the start of COVID-19. Historical curves are shown in solid lines. Newly-generated curves are shown with dashed lines. One can observe that the newly-generated curves respect the overall shape of the input data, without directly copying it.
Figure 3. Historical and synthetically generated yield curve scenarios for the start of the COVID-19 pandemic

Table 1 exemplifies additional market scenario requests along with corresponding JSON outputs. In this case, the user query indicates the model to be used for illustration purposes. In a production environment, a routing agent could automatically select the right model. 

Query: "Using a VAE model, output yield curves similar to those in the second half of 2020."
Answer:
{"scenarios": [
    {
        "method": "VAE",
        "object_type": "yield curve",
        "period": {
            "start_date": "2020-07-01",
            "end_date": "2020-12-31"
        }
    }
]}
Model: llama3-70b-instruct
Comment: Correctly finds the second half of 2020.

Query: "Generate yield curves similar to those in the year post-COVID."
Answer:
{"scenarios": [
    {
        "method": "VAE",
        "object_type": "yield curve",
        "period": {
            "start_date": "2020-03-01",
            "end_date": "2022-12-31"
        }
    }
]}
Model: llama3-70b-instruct
Comment: Correctly selects "VAE" as the default method and identifies the post-COVID period.

Query: "Using DDPM, generate implied vol surfaces similar to Q1 in 2024."
Answer:
{"scenarios": [
    {
        "method": "DDPM",
        "object_type": "implied vol surfaces",
        "period": {
            "start_date": "2024-01-01",
            "end_date": "2024-03-31"
        }
    }
]}
Model: mixtral-8x22b-instruct-v0.1
Comment: Correctly dates Q1 of 2024.

Query: "Generate yield curves similar to the two-week period surrounding the Flash Crash."
Answer:
{"scenarios": [
    {
        "method": "VAE",
        "object_type": "yield curve",
        "period": {
            "start_date": "2010-04-30",
            "end_date": "2010-05-13"
        }
    }
]}
Model: llama3-70b-instruct
Comment: Correctly dates the Flash Crash on May 6, 2010, and a two-week period around it.

Query: "Simulate inflation curves similar to those of last year."
Answer:
{"scenarios": [
    {
        "method": "VAE",
        "object_type": "inflation curve",
        "period": {
            "start_date": "2022-01-01",
            "end_date": "2022-12-31"
        }
    }
]}
Model: llama3-70b-instruct
Comment: Caution: What is the model's knowledge cut-off date? Find out from the model card in the NVIDIA catalog. In this case, it is March 2023, so "last year" for this model was 2022.

Query: "Simulate yield curves and swaption vol surfaces at the peak of the financial crisis."
Answer:
{"scenarios": [
    {
        "method": "VAE",
        "object_type": "yield curve",
        "period": {
            "start_date": "2008-09-15",
            "end_date": "2008-10-15"
        }
    },
    {
        "method": "VAE",
        "object_type": "swaption volatility surface",
        "period": {
            "start_date": "2008-09-15",
            "end_date": "2008-10-15"
        }
    }
]}
Model: meta/llama3-70b-instruct
Comment: Correctly identifies two types of market objects, as well as the peak of the financial crisis (Lehman Brothers collapsed on September 15, 2008).
Table 1. Scenario generation Q&A examples obtained with NVIDIA NIM

Market structure analysis using generative models

Financial markets are inherently complex, characterized by noisy, high-dimensional data typically observed as multivariate time series. Detecting fleeting patterns that lead to financial gains at scale requires clever modeling and substantial computations. 

One way to reduce dimensionality is to consider the intrinsic structure present in the data: curves, surfaces, and higher-dimensional structures. These structures carry information that can be leveraged to reduce complexity. They can also be viewed as units of information to be embedded and analyzed through latent spaces of generative models. In this section we review examples of how VAEs and DDMs can be used in this context.

VAEs for learning the distribution of market curves

Bond yields and swap, inflation, and foreign exchange rates can be thought of as having one-dimensional term structures, such as zero-coupon, spot, forward, or basis curves. Similarly, option volatilities can be seen as (hyper-)surfaces in 2D or higher-dimensional spaces. A typical swap curve could have as many as 50 tenors. Instead of studying how the 50 corresponding time series relate to each other, one can consider a single time series of swap curves and use a VAE to learn the distribution of these objects, as detailed in Multiresolution Signal Processing of Financial Market Objects.

The strength of this approach lies in its ability to integrate previously isolated data: the behavior of market objects, traditionally modeled in isolation either by currency (USD, EUR, BRL, for example) or by task (scenario generation, nowcasting, outlier detection, for example), can now be integrated into the training of a single generative model reflecting the interconnected nature of markets. 

Figure 4 illustrates the training loop of a VAE on market data objects such as yield curves: an encoder compresses the input objects into a latent space with a Gaussian distribution, and a decoder reconstructs curves from points in this space. The latent space is continuous, explicit, and usually lower-dimensional, making it intuitive to navigate. Curves representing different currencies or market regimes cluster in distinct areas, allowing the model to navigate within or between these clusters to generate new curves, either unconditionally or conditioned on desired market characteristics.

Diagram showing the high-level architecture of a VAE training on market data: an encoder compresses input objects such as yield curves into a latent space with a Gaussian distribution; a decoder reconstructs curves from points in this space. The latent space of a VAE is continuous, explicit, typically of lower dimensionality, and intuitive to navigate.
Figure 4. VAE training loop on market data such as yield curves
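A minimal PyTorch sketch of the training loop in Figure 4 is shown below; the layer sizes and architecture are illustrative, not those actually used for the results in this post.

import torch
from torch import nn

class CurveVAE(nn.Module):
    """Minimal VAE for yield curves represented as fixed-length vectors of tenors."""

    def __init__(self, n_tenors=50, latent_dim=3):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_tenors, 32), nn.ReLU())
        self.to_mu = nn.Linear(32, latent_dim)
        self.to_logvar = nn.Linear(32, latent_dim)
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(),
                                     nn.Linear(32, n_tenors))

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.decoder(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    """Reconstruction error plus KL divergence to the standard Gaussian prior."""
    recon = ((x - x_hat) ** 2).sum(dim=1).mean()
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()
    return recon + kl

Training such a model on batches of historical curves yields a latent space like the ones visualized in Figures 2 and 5.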

In particular, the generation process can be conditioned on specific historical periods to generate curves with shapes similar to historical ones, without being exact replicas. For example, the U.S. Treasury yield curves corresponding to the start of the COVID-19 pandemic are shown in Figure 2 (left) as circles in the 3D latent space of a VAE trained on yield curve data. Since VAE latent spaces are continuous, they naturally lend themselves to defining neighborhoods around points of interest and sampling from those neighborhoods to generate novel yield curve scenarios that are similar to (but not the same as) historical ones, shown as triangles in Figure 2 (right). 
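As a sketch of this neighborhood sampling (reusing the illustrative CurveVAE above; the radius parameter is an assumption):

import torch

def sample_similar_curves(model, historical_curves, n_samples=100, radius=0.1):
    """Generate curves similar to a historical window by perturbing its latent codes."""
    model.eval()
    with torch.no_grad():
        mu = model.to_mu(model.encoder(historical_curves))   # latent codes of the window
        anchors = mu[torch.randint(len(mu), (n_samples,))]   # pick historical anchor points
        z = anchors + radius * torch.randn_like(anchors)     # sample in their neighborhoods
        return model.decoder(z)                              # decode to novel scenarios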

Figure 5 shows a more complete picture of the yield curves used for training the VAE model, along with clusters grouped by the level of rates, as well as an outlier corresponding to the mistaken inclusion of a data point from a day when Treasury markets were closed (Good Friday, 2017). One can easily imagine additional applications beyond scenario generation, such as non-linear factor decomposition and outlier detection.

Figure 5. In the 3D latent space of a VAE trained on U.S. Treasury curves data (1993-2019), the financial crisis forms a distinct cluster, highlighting a significant deviation from pre-crisis conditions and transitioning into a unique post-crisis low-rates environment

DDMs for learning volatility surfaces

DDMs approach the generative process through the prism of reversible diffusion. As shown in Figure 6, noise is gradually added to the data in the forward pass until the resulting object is indistinguishable from a standard Gaussian. In the backward pass, the model learns the denoising steps needed to reconstruct the original surface, so that new samples can be generated starting from pure noise. To learn more, see Generative AI Research Spotlight: Demystifying Diffusion-Based Models.

A diagram showing the high-level architecture of a DDM trained to learn the distribution of 2D market objects such as volatility surfaces. It consists of a forward pass of gradual noise injection and a backward pass of learning the denoising process.
Figure 6. High-level architecture of a DDM trained to learn the distribution of 2D market objects such as volatility surfaces
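A minimal sketch of the two passes in Figure 6, following the standard DDPM formulation with a closed-form forward noising step and a noise-prediction training objective (the noise schedule and the denoiser interface are illustrative assumptions):

import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)            # linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative signal retention

def forward_noise(x0, t):
    """Forward pass: corrupt clean surfaces x0 (batch, channels, height, width) to noise level t."""
    a_bar = alphas_bar[t].view(-1, 1, 1, 1)
    eps = torch.randn_like(x0)
    return a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * eps, eps

def training_loss(denoiser, x0):
    """Backward pass objective: predict the noise that was injected at a random step."""
    t = torch.randint(0, T, (x0.shape[0],))
    x_t, eps = forward_noise(x0, t)
    return ((denoiser(x_t, t) - eps) ** 2).mean()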

This example explores the ability of a DDM (specifically, a denoising diffusion probabilistic model, or DDPM) to learn the distribution of implied volatility surfaces. It uses a synthetic data set of roughly 20,000 volatility surfaces generated with the SABR (stochastic alpha-beta-rho) stochastic volatility model described in Managing Smile Risk (subject to initial conditions F_0 and \alpha_0):

dF_t = \alpha_t (F_t)^{\beta} dW_t
d\alpha_t = \nu \alpha_t dZ_t
\langle dW_t, dZ_t \rangle = \rho dt
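For reference, a minimal Euler-Maruyama simulation of these dynamics is sketched below; note that the training data are implied volatility surfaces, which in practice are typically obtained from an approximation such as Hagan's formula rather than from direct path simulation.

import numpy as np

def simulate_sabr(F0, alpha0, beta, nu, rho, T=1.0, n_steps=252, n_paths=10000, seed=0):
    """Simulate SABR forward and volatility paths with correlated Brownian drivers."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    F = np.full(n_paths, F0, dtype=float)
    alpha = np.full(n_paths, alpha0, dtype=float)
    for _ in range(n_steps):
        dW = np.sqrt(dt) * rng.standard_normal(n_paths)
        dZ = rho * dW + np.sqrt(1.0 - rho**2) * np.sqrt(dt) * rng.standard_normal(n_paths)
        F = np.maximum(F + alpha * F**beta * dW, 1e-8)   # Euler step, floored at zero
        alpha *= np.exp(nu * dZ - 0.5 * nu**2 * dt)      # exact step for d(alpha) = nu * alpha * dZ
    return F, alpha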

An example input surface is shown in Figure 7.

The goal is to evaluate the ability of such a model to recover the input data distribution, in this case a SABR distribution. For empirical surfaces implied directly from market prices of options, the true distribution would be unknown. A tool that can capture it non-parametrically therefore offers a valuable alternative to sparse parametric models that lack sufficient degrees of freedom to represent the data. Such a model can subsequently be used to generate volatility surface scenarios, or to fill in missing regions in plausible ways.

Example of an input SABR volatility surface used for model training. The surface is shown as a 3D plot with axes: log-moneyness, time to maturity in years, and implied log-volatility.
Figure 7. Example of an input SABR surface used for model training

A simplified version of the architecture described in the post referenced above was adapted for this work. The inputs are 16 \times 16 grids of implied volatilities corresponding to various option moneyness and time-to-maturity pairs. Other types of inputs, such as volatility cubes (moneyness-maturity-underlying), could be tackled in a similar fashion.

Figure 8 illustrates a few steps in the training of the DDPM model on volatility surface inputs.

Screenshot showing progressions of 16x16 images during the first few epochs of training a DDPM model. Beginning with pure noise, the model progressively learns the noise process needed to denoise the images and generate SABR-like surfaces.
Figure 8. A few training steps: beginning with pure noise, the model progressively learns the noise process

Figure 9 shows two synthetically generated surfaces that were subsequently fitted with SABR models. The green points represent the fitted SABR surfaces, confirming that the DDPM has learned SABR-like shapes.

Figure 9. Transition from noise to a generated sample, showing an intermediate noisy surface (top) and newly generated volatility surfaces (in purple, bottom)

Sample implementation

This section presents an example of using NVIDIA-hosted NIM endpoints, including the Llama 3.1 70B Instruct LLM, to build the LLMQueryInterpreter component of the reference architecture in Figure 1. Note that many other NVIDIA and open-source LLMs are available with NIM, including Nemotron, Mixtral, and Gemma. Accessing these through NIM guarantees that they are optimized for inference on NVIDIA-accelerated infrastructure and offers a quick and easy way to compare responses from multiple models.

import os
from langchain_nvidia_ai_endpoints import ChatNVIDIA

# NVIDIA API configuration
NVIDIA_API_KEY = os.environ.get("NVIDIA_API_KEY") # click “Get API Key” at https://www.nvidia.com/en-us/ai/
if not NVIDIA_API_KEY:
    raise ValueError("NVIDIA_API_KEY environment variable is not set")

class LLMQueryInterpreter:
    """ NVIDIA NIM-powered class that processes scenario requests from user."""
    
    def __init__(self, llm_name="meta/llama-3.1-70b-instruct", temperature=0):
        self._llm = ChatNVIDIA(model=llm_name, temperature=temperature) 
    
        # define output JSON format
        self._scenario_template = """ 
        {"scenarios": [ 
            { 
                "method": <generative method name, e.g., "VAE">,
                "object_type": <object type, e.g., "yield curve">,
                "period": {
                    "start_date": <start date of event>,
                    "end_date"  : <end date of event>
                }
            }, 
            { 
                "method": <generative method name, e.g., "DDPM">,
                "object_type": <object type, e.g., "volatility surface">,
                "period": {
                    "start_date": <start date of event>,
                    "end_date"  : <end date of event>
                }
            }, 

            // Additional scenarios, as needed 
        ] } 
        """

        # instructions for the LLM
        self._prompt_template = """" 
        Your task is to generate a single, high-quality response in JSON format. \
        Format the output according to the following template: {scenario_template}. \
        Offer a refined, accurate, and comprehensive reply to the instruction. \

        Every query has a structure of this form: \
        "Use <method> to generate <market_object_type> similar to <period>". \
        Valid methods are "VAE" or "DDPM"; use "VAE" by default, if no method is specified.\
        If a different method is requested, return "invalid" as the method type. \

        Answer the following query without any additional explanations, return only a JSON output:
        {query}  

        """

    def process_query(self, query):
        # query = "Using a VAE model, output yield curves similar to those in the second half of 2020"

        llm_request = self._prompt_template.format(query=query, scenario_template=self._scenario_template)
        llm_answer = self._llm.invoke(llm_request) 
        
        return llm_answer.content

# Example usage
query = "Using a VAE model, output yield curves similar to the Flash Crash"
llm = LLMQueryInterpreter()
print(llm.process_query(query))

# Output
{
"scenarios": [
    {
        "method": "VAE",
        "object_type": "yield curve",
        "period": {
            "start_date": "2010-05-06",
            "end_date": "2010-05-06"
        }
    }
]
}

Conclusion

It’s exciting to imagine a future where quants, traders, and investment professionals increasingly collaborate with AI tools to model and explore financial markets. The integration of these advanced tools enhances financial modeling and market exploration, promising to expand the capabilities and insights of market participants. These tools can be combined in innovative ways and served with ease using NVIDIA NIM.
