Sergio Perez

Sergio Perez is a solution architect at NVIDIA who specializes in the training and inference of LLMs. Sergio works alongside AI developers in public supercomputing centers and in sectors such as energy, automotive, finance, telecommunications, and internet services. He has contributed to production applications of LLMs, including RAG systems, inference server optimization, pretraining LLMs from scratch, custom LLM evaluation, and quantization using FP8 formats. Sergio holds a Ph.D. in computational fluid dynamics from Imperial College London.

Posts by Sergio Perez

Generative AI

Benchmarking LLM Inference Costs for Smarter Scaling and Deployment

This is the third post in the large language model latency-throughput benchmarking series, which aims to teach developers how to determine the cost of LLM... 10 MIN READ
Generative AI

Floating-Point 8: An Introduction to Efficient, Lower-Precision AI Training

With the growth of large language models (LLMs), deep learning is advancing in both model architecture design and computational efficiency. Mixed precision... 11 MIN READ
Data Center / Cloud

Continued Pretraining of State-of-the-Art LLMs for Sovereign AI and Regulated Industries with Domyn and NVIDIA DGX Cloud

In recent years, large language models (LLMs) have achieved extraordinary progress in areas such as reasoning, code generation, machine translation, and... 17 MIN READ