Ashraf Eassa

Ashraf Eassa is a senior product marketing manager at NVIDIA, focusing on deep learning, training and inference. He holds bachelor's degrees in computer science and mathematics from the University of Vermont.
Avatar photo

Posts by Ashraf Eassa

Image of an HGX H200
Generative AI / LLMs

Low Latency Inference Chapter 1: Up to 1.9x Higher Llama 3.1 Performance with Medusa on NVIDIA HGX H200 with NVLink Switch

As large language models (LLMs) continue to grow in size and complexity, multi-GPU compute is a must-have to deliver the low latency and high throughput that... 5 MIN READ
Generative AI / LLMs

Boosting Llama 3.1 405B Performance up to 1.44x with NVIDIA TensorRT Model Optimizer on NVIDIA H200 GPUs

The Llama 3.1 405B large language model (LLM), developed by Meta, is an open-source community model that delivers state-of-the-art performance and supports a... 7 MIN READ
Decorative image.
Generative AI / LLMs

NVIDIA Triton Inference Server Achieves Outstanding Performance in MLPerf Inference 4.1 Benchmarks

Six years ago, we embarked on a journey to develop an AI inference serving solution specifically designed for high-throughput and time-sensitive production use... 8 MIN READ
Data Center / Cloud

NVIDIA Blackwell Platform Sets New LLM Inference Records in MLPerf Inference v4.1

Large language model (LLM) inference is a full-stack challenge. Powerful GPUs, high-bandwidth GPU-to-GPU interconnects, efficient acceleration libraries, and a... 13 MIN READ
Decorative image of linked modules.
Generative AI / LLMs

NVIDIA NVLink and NVIDIA NVSwitch Supercharge Large Language Model Inference

Large language models (LLM) are getting larger, increasing the amount of compute required to process inference requests. To meet real-time latency requirements... 8 MIN READ
NVIDIA Hopper GPU and NVIDIA Grace CPUs on a black background.
Data Center / Cloud

Revolutionizing Data Center Efficiency with the NVIDIA Grace Family

The exponential growth in data processing demand is projected to reach 175 zettabytes by 2025. This contrasts sharply with the slowing pace of CPU performance... 16 MIN READ