Posts by Neelay Shah
        
                    Agentic AI / Generative AI
        
        
        May 06, 2025
      
      LLM Inference Benchmarking Guide: NVIDIA GenAI-Perf and NIM
                                                This is the second post in the LLM Benchmarking series, which shows how to use GenAI-Perf to benchmark the Meta Llama 3 model when deployed with NVIDIA NIM. ...
                          
          
            11 MIN READ
          
        
      
    
        
                    Agentic AI / Generative AI
        
        
        Apr 02, 2025
      
      LLM Inference Benchmarking: Fundamental Concepts
                                                This is the first post in the large language model latency-throughput benchmarking series, which aims to instruct developers on common metrics used for LLM...
                          
          
            15 MIN READ
          
        
      
    
        
                    Developer Tools & Techniques
        
        
        Mar 18, 2025
      
      NVIDIA Dynamo, A Low-Latency Distributed Inference Framework for Scaling Reasoning AI Models
                                                NVIDIA announced the release of NVIDIA Dynamo at GTC 2025. NVIDIA Dynamo is a high-throughput, low-latency open-source inference serving framework for deploying...
                          
          
            14 MIN READ
          
        
      
    
        
                    Data Center / Cloud
        
        
        Mar 07, 2024
      
      Generate Stunning Images with Stable Diffusion XL on the NVIDIA AI Inference Platform
                                                Diffusion models are transforming creative workflows across industries. These models generate stunning images based on simple text or image inputs by...
                          
          
            14 MIN READ
          
        
      