Asha Anoosheh

Asha Anoosheh is a deep learning algorithms engineer at NVIDIA working on the TensorRT Model Optimizer library. He holds an M.Sc. in robotics from ETH Zürich, with a focus on computer vision.

Posts by Asha Anoosheh

Developer Tools & Techniques

Pruning and Distilling LLMs Using NVIDIA TensorRT Model Optimizer

Large language models (LLMs) have set a high bar in natural language processing (NLP) tasks such as coding, reasoning, and math. However, their deployment... 11 MIN READ
Agentic AI / Generative AI

LLM Model Pruning and Knowledge Distillation with NVIDIA NeMo Framework

Model pruning and knowledge distillation are powerful, cost-effective strategies for obtaining smaller language models from an initial larger sibling. ... 10 MIN READ