Tejash Shah

Tejash Shah is a principal product manager within the AI Platform Software group at NVIDIA, responsible for managing the JAX and MLX frameworks. Before NVIDIA, Tejash held software engineering roles at semiconductor companies. He holds five patents across a wide range of technological domains. He earned a master's degree in Computer Science from The University of Texas at Dallas and a bachelor's degree in Information Technology from Gujarat University.

Posts by Tejash Shah

Developer Tools & Techniques

Achieve CUTLASS C++ Performance with Python APIs Using CuTe DSL

CuTe, a core component of CUTLASS 3.x, provides a unified algebra for describing data layouts and thread mappings, and abstracts complex memory access patterns... 9 MIN READ
Data Center / Cloud

Optimizing for Low-Latency Communication in Inference Workloads with JAX and XLA

Running inference with large language models (LLMs) in production requires meeting stringent latency constraints. A critical stage in the process is LLM decode,... 6 MIN READ
Developer Tools & Techniques

CUTLASS 3.x: Orthogonal, Reusable, and Composable Abstractions for GEMM Kernel Design

GEMM optimization on GPUs is a modular problem. Performant implementations need to specify hyperparameters such as tile shapes, math and copy instructions, and... 12 MIN READ
Agentic AI / Generative AI

CUTLASS: Principled Abstractions for Handling Multidimensional Data Through Tensors and Spatial Microkernels

In the era of generative AI, utilizing GPUs to their maximum potential is essential to training better models and serving users at scale. Often, these models... 12 MIN READ