Ekin Karabulut

Ekin Karabulut is a data scientist and developer advocate, previously at Run:ai and now at NVIDIA, exploring the efficient usage of large models in different production scenarios. Earlier, she worked on the privacy implications of federated learning, focused on distributed training techniques, and became fascinated by inefficiencies in GPU usage in research and industry settings. She established the AI Infrastructure Club and is based in Munich, Germany.

Posts by Ekin Karabulut

Data Center / Cloud Feb 27, 2026

Maximizing GPU Utilization with NVIDIA Run:ai and NVIDIA NIM

Organizations deploying LLMs are challenged by inference workloads with different resource requirements. A small embedding model might use only a few gigabytes... 11 MIN READ

Data Center / Cloud Feb 18, 2026

Unlock Massive Token Throughput with GPU Fractioning in NVIDIA Run:ai

As AI workloads scale, achieving high throughput, efficient resource usage, and predictable latency becomes essential. NVIDIA Run:ai addresses these challenges... 13 MIN READ

Data Center / Cloud Jan 28, 2026

Ensuring Balanced GPU Allocation in Kubernetes Clusters with Time-Based Fairshare

NVIDIA Run:ai v2.24 introduces time-based fairshare, a new scheduling mode that brings fair-share scheduling with time awareness for over-quota resources to... 11 MIN READ

Agentic AI / Generative AI Nov 10, 2025

Streamline Complex AI Inference on Kubernetes with NVIDIA Grove

Over the past few years, AI inference has evolved from single-model, single-pod deployments into complex, multicomponent systems. A model deployment may now... 10 MIN READ

Agentic AI / Generative AI Oct 03, 2025

Enable Gang Scheduling and Workload Prioritization in Ray with NVIDIA KAI Scheduler

NVIDIA KAI Scheduler is now natively integrated with KubeRay, bringing the same scheduling engine that powers high-demand and high-scale environments in... 10 MIN READ

Data Center / Cloud Sep 29, 2025

Smart Multi-Node Scheduling for Fast and Efficient LLM Inference with NVIDIA Run:ai and NVIDIA Dynamo

The exponential growth in large language model complexity has created challenges, such as models too large for single GPUs, workloads that demand high... 9 MIN READ