Nemotron™ 3 VoiceChat is a 12B‑parameter, full‑duplex speech‑to‑speech model that unifies automatic speech recognition (ASR), a large language model (LLM), and text‑to‑speech (TTS) in a single architecture for real‑time voice agents. It is designed to deliver natural, interruptible conversations with sub‑second response latency on NVIDIA GPUs, with open, inspectable weights for enterprise deployment.
This early access program gives qualified developers hands‑on access to the model, reference deployment containers, benchmark results, and a guided fine‑tuning path for domain‑specific, full‑duplex voice agents.