Lalit Adithya

Lalit Adithya is a senior system software engineer at NVIDIA, working within the DGX Cloud organization. He specializes in distributed systems, accelerator resiliency, and multi-cloud and hybrid-cloud infrastructure, with a strong focus on building resilient, self-healing Kubernetes platforms for accelerator-dependent workloads. Lalit was a founding member of the NVSentinel project. With over a decade of experience in software engineering, he has contributed to designing and deploying large-scale cloud-native solutions spanning business-critical web applications, DevOps, CI/CD automation, and thick-client development, always with security as a first principle, never an afterthought. He is also the co-author of the Jenkins Administrator’s Guide, a comprehensive resource for managing production-grade CI/CD systems.

Posts by Lalit Adithya

Data Center / Cloud

Automate Kubernetes AI Cluster Health with NVSentinel

Kubernetes underpins a large portion of all AI workloads in production. Yet, maintaining GPU nodes and ensuring that applications are running, training jobs are...

7 MIN READ