Ekin Karabulut

Ekin Karabulut is a data scientist and developer advocate, previously at Run:ai and now at NVIDIA, exploring the efficient use of large models in different production scenarios. She earlier worked on the privacy implications of federated learning, focused on distributed training techniques, and became fascinated by inefficiencies in GPU usage in research and industry settings. She established the AI Infrastructure Club and is based in Munich, Germany.

Posts by Ekin Karabulut

Data Center / Cloud

Maximizing GPU Utilization with NVIDIA Run:ai and NVIDIA NIM

Organizations deploying LLMs are challenged by inference workloads with different resource requirements. A small embedding model might use only a few gigabytes... 11 MIN READ
Data Center / Cloud

Unlock Massive Token Throughput with GPU Fractioning in NVIDIA Run:ai

As AI workloads scale, achieving high throughput, efficient resource usage, and predictable latency becomes essential. NVIDIA Run:ai addresses these challenges... 13 MIN READ
Data Center / Cloud

Ensuring Balanced GPU Allocation in Kubernetes Clusters with Time-Based Fairshare

NVIDIA Run:ai v2.24 introduces time-based fairshare, a new scheduling mode that brings fair-share scheduling with time awareness for over-quota resources to... 11 MIN READ
Agentic AI / Generative AI

Streamline Complex AI Inference on Kubernetes with NVIDIA Grove

Over the past few years, AI inference has evolved from single-model, single-pod deployments into complex, multicomponent systems. A model deployment may now... 10 MIN READ
Agentic AI / Generative AI

Enable Gang Scheduling and Workload Prioritization in Ray with NVIDIA KAI Scheduler

NVIDIA KAI Scheduler is now natively integrated with KubeRay, bringing the same scheduling engine that powers high‑demand and high-scale environments in... 10 MIN READ
Data Center / Cloud

Smart Multi-Node Scheduling for Fast and Efficient LLM Inference with NVIDIA Run:ai and NVIDIA Dynamo

The exponential growth in large language model complexity has created challenges, such as models too large for single GPUs, workloads that demand high... 9 MIN READ