Peter Kisfaludi

Peter Kisfaludi is a senior software engineer on the TensorRT Multi-Device team. In this role, he focuses on developing scalable runtime architectures and reducing communication overhead to deliver low-latency execution for multi-GPU model serving. Before joining NVIDIA in 2022, Peter was an independent consultant designing low-latency, mission-critical software for real-time embedded systems.

Posts by Peter Kisfaludi

Developer Tools & Techniques Jun 25, 2026

Scaling AI Inference Across Multiple GPUs Using NVIDIA TensorRT with Multi-Device Inference Support

Generative AI workloads are rapidly outgrowing the memory and compute budget of single GPUs. For inference developers building media generation pipelines, the... 11 MIN READ