Bright Cluster Manager

Bright Computing

Bright Computing provides comprehensive software solutions for deploying and managing enterprise-grade Linux clusters. Bright provisions, monitors and manages GPU clusters, and makes it an ongoing practice to incorporate the latest enhancements in NVIDIA GPU technology into its products, enabling Bright customers to seamlessly use NVIDIA technology.

Key Features

  • Intuitive web interface provides comprehensive view of GPU and cluster metrics
  • Powerful Cluster Management Shell (CMSH) as alternative administrative interface
  • Full support for NVIDIA libraries, CUDA, OpenCL, OpenACC, CUDA-aware libraries, NCCL, and CUB
  • Comprehensive GPU monitoring and health checking
  • Provisions GPU resources from public (AWS, Azure) and private (OpenStack) clouds within minutes
  • Auto-scaling of hybrid cloud resources based on workload and configured policies
  • Supports several popular Linux distributions: RHEL and derivatives, SUSE SLES and Ubuntu LTS
  • GPU-enabled Docker and Singularity containers
  • Offers a complete deep learning stack
  • Deployment for popular HPC file systems and management of fast interconnects

Bright Cluster Manager can sample and monitor metrics from supported GPUs and GPU Computing Systems, such as the NVIDIA Tesla V100, P100, and P40 GPU cards as well as commodity GPUs.

Examples of supported metrics include:

  • GPU temperatures
  • GPU exclusivity modes
  • GPU fan speeds
  • System fan speeds
  • PSU voltages and currents
  • System LED states
  • GPU ECC statistics
  • Job-based metrics
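
As a rough illustration of what sampling a few of these metrics can look like on a single node, the sketch below reads GPU temperature, fan speed, and compute (exclusivity) mode through the standard `nvidia-smi` query interface. This is not Bright Cluster Manager's own collection mechanism; the function names `parse_gpu_metrics` and `sample_gpu_metrics` are hypothetical and chosen for this example only.

```python
import csv
import io
import subprocess

# Query fields accepted by `nvidia-smi --query-gpu`. They mirror a few of the
# metric types listed above; this is an illustrative assumption, not the
# mechanism Bright Cluster Manager itself uses.
QUERY_FIELDS = ["index", "temperature.gpu", "fan.speed", "compute_mode"]


def parse_gpu_metrics(csv_text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if not record:  # skip blank lines
            continue
        values = [value.strip() for value in record]
        rows.append(dict(zip(QUERY_FIELDS, values)))
    return rows


def sample_gpu_metrics():
    """Run nvidia-smi once (requires an NVIDIA driver) and parse its output."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_metrics(result.stdout)
```

Values are kept as strings because some fields may be reported as `[N/A]`, for example fan speed on passively cooled data center cards.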

Bright Cluster Manager leverages NVIDIA’s Data Center GPU Manager (DCGM) for GPU health monitoring, diagnostics, and validation.

Key benefits:

  • Unprecedented ease of use
  • Significant cost and time savings
  • Increased uptime and productivity
  • Scalability up to 30,000 compute nodes
  • Significant cost savings through dynamic scaling

Bright for Data Science

Bright empowers organizations to gain actionable insights from rich, complex data. To achieve this, Bright offers a comprehensive deep learning solution that includes:

  • A modern deep learning environment - Bright provides everything needed to spin up a deep learning environment and manage it effectively
  • Choice of machine learning frameworks - Bright Cluster Manager provides a choice of machine learning frameworks, including TensorFlow, TensorFlow 2, Horovod, Keras, PyTorch, Chainer, fast.ai, DyNet, MXNet, and Theano, to simplify deep learning projects
  • Frameworks are provided for both Python 3.6 and Python 3.7
  • Choice of machine learning libraries - Bright includes a selection of the most popular machine learning libraries to help access datasets, including MLPython, the NVIDIA CUDA Deep Neural Network library (cuDNN), NVIDIA TensorRT (an SDK for high-performance deep learning inference), the Deep Learning GPU Training System (DIGITS), CaffeOnSpark (a Spark package for deep learning), and more
  • Supporting infrastructure elements - Bright takes care of finding, configuring, and deploying all of the dependent pieces needed to run deep learning libraries and frameworks. This includes over 400 MB of Python modules that support the machine learning packages, plus the NVIDIA hardware drivers, CUDA (parallel computing platform API) drivers, CUB (CUDA building blocks), and NCCL2 (a library of standard collective communication routines)
  • Because the machine learning frameworks and libraries are constantly being updated, please see the Bright packages dashboard for the most up-to-date information

For more information: