Posts by Ivan Goldwasser
AI Platforms / Deployment
Sep 05, 2025
Accelerate Large-Scale LLM Inference and KV Cache Offload with CPU-GPU Memory Sharing
Large language models (LLMs) are at the forefront of AI innovation, but their massive size can complicate inference efficiency. Models such as Llama 3 70B and...
7 MIN READ
Data Center / Cloud
May 20, 2025
NVIDIA 800 VDC Architecture Will Power the Next Generation of AI Factories
The exponential growth of AI workloads is increasing data center power demands. Traditional 54 V in-rack power distribution, designed for kilowatt (kW)-scale...
8 MIN READ
Data Center / Cloud
May 18, 2025
Integrating Semi-Custom Compute into Rack-Scale Architecture with NVIDIA NVLink Fusion
Data centers are being re-architected for efficient delivery of AI workloads. This is a hugely complicated endeavor, and NVIDIA is now delivering AI factories...
7 MIN READ
AI Platforms / Deployment
May 18, 2025
NVIDIA ConnectX-8 SuperNICs Advance AI Platform Architecture with PCIe Gen6 Connectivity
As AI workloads grow in complexity and scale—from large language models (LLMs) to agentic AI reasoning and physical AI—the demand for faster, more scalable...
5 MIN READ
Data Center / Cloud
May 16, 2025
Building the Modular Foundation for AI Factories with NVIDIA MGX
The exponential growth of generative AI, large language models (LLMs), and high-performance computing has created unprecedented demands on data center...
6 MIN READ
Data Center / Cloud
Mar 11, 2025
Efficient ETL with Polars and Apache Spark on NVIDIA Grace CPU
The NVIDIA Grace CPU Superchip delivers outstanding performance and best-in-class energy efficiency for CPU workloads in the data center and in the cloud. The...
7 MIN READ