Kezhi Kong

Kezhi Kong is a research scientist at NVIDIA and a member of the Foundation Model team. He received his PhD from the Department of Computer Science at the University of Maryland, College Park. His research focuses on building state-of-the-art large language models, particularly by improving the quality and scale of pretraining data and by enhancing pretraining algorithms.

Posts by Kezhi Kong

Agentic AI / Generative AI, Jan 09, 2025

Announcing Nemotron-CC: A Trillion-Token English Language Dataset for LLM Pretraining

NVIDIA is excited to announce the release of Nemotron-CC, a 6.3-trillion-token English language Common Crawl dataset for pretraining highly accurate large... 4 MIN READ