Bright Cluster Manager

Bright Computing provides comprehensive software solutions for deploying and managing HPC clusters, big data clusters, deep learning environments, and OpenStack in the data center and in the cloud. Bright Computing provisions, monitors, and manages GPU clusters, and routinely incorporates the latest enhancements in NVIDIA GPU technology into its products, so Bright customers can support NVIDIA technology directly from the Bright GUI.

Key Features

  • Intuitive web app provides comprehensive view of GPU and cluster metrics
  • Powerful Cluster Management Shell as alternative user interface
  • Full support for NVIDIA libraries, CUDA, OpenCL, OpenACC, CUDA-aware libraries, NCCL, and CUB
  • Comprehensive GPU monitoring
  • Brings in GPU resources from public (AWS, Azure) and private (OpenStack) clouds within minutes
  • Automated scaling of the cluster based on pre-defined policies
  • Supports several popular Linux distributions: RHEL and derivatives, SUSE SLES and Ubuntu LTS
  • GPU-enabled Docker containers
  • Offers a complete deep learning stack
  • Deployment for popular HPC filesystems and management of fast interconnects

Bright Cluster Manager can sample and monitor metrics from supported GPUs and GPU Computing Systems, such as the NVIDIA Tesla V100, P100, and P40 GPU cards as well as commodity GPUs such as the GeForce GTX 1080.

Examples of supported metrics include:

  • GPU temperatures;
  • GPU exclusivity modes;
  • GPU fan speeds;
  • system fan speeds;
  • PSU voltages and currents;
  • system LED states;
  • GPU ECC statistics;
  • job-based metrics.

Beginning with version 8.0, Bright Cluster Manager uses NVIDIA’s Data Center GPU Manager (DCGM) for GPU health monitoring, diagnostics, and validation.

Key benefits:

  • Unprecedented ease of use
  • Significant cost and time savings
  • Increased uptime and productivity
  • Pain-free scalability
  • Significant cost savings through dynamic scaling

Bright for Deep Learning

Bright empowers organizations to gain actionable insights from rich, complex data. To achieve this, Bright offers a comprehensive deep learning solution that includes:

  • A modern deep learning environment - Bright provides everything needed to set up a deep learning environment and manage it effectively
  • Choice of machine learning frameworks - Bright Cluster Manager provides a choice of machine learning frameworks, including Caffe, Torch, TensorFlow, and Theano, to simplify deep learning projects
  • Choice of machine learning libraries - Bright includes a selection of the most popular machine learning libraries to help access datasets, including MLPython, NVIDIA CUDA Deep Neural Network library (cuDNN), Deep Learning GPU Training System (DIGITS), and CaffeOnSpark (a Spark package for deep learning)
  • Supporting infrastructure elements - Bright takes care of finding, configuring, and deploying all of the dependent pieces needed to run deep learning libraries and frameworks. This includes over 400 MB of Python modules that support the machine learning packages, plus the NVIDIA hardware drivers, CUDA (parallel computing platform API) drivers, CUB (CUDA building blocks), and NCCL (library of standard collective communication routines)

For more information: