TORQUE is an open source resource manager providing control over batch jobs and distributed compute nodes. It is a community effort based on the original PBS project. With more than 1,200 patches, it has incorporated significant advances in scalability, fault tolerance, and feature extensions contributed by NCSA, OSC, USC, the U.S. Dept. of Energy, Sandia, PNNL, U of Buffalo, TeraGrid, and many other leading-edge HPC organizations.
TORQUE can integrate with Moab Workload Manager to improve overall utilization, scheduling and administration on a cluster. Customers who purchase Moab family products also receive free support for TORQUE.
TORQUE can automatically detect NVIDIA GPUs using nvidia-smi (the default) or NVML. GPUs are allocated and consumed in much the same way as processors (np). With the qsub command, users can request a total number of GPUs for the whole job rather than a number of GPUs per node: -l ncpus=X,gpus=Y
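
As a rough illustration of the job-wide request form above, the following Python sketch assembles a qsub command line with totals for CPUs and GPUs. The function name build_qsub_command and the script name train.sh are hypothetical and used only for this example. The sketch builds and prints the command without running it, so it can be tried on a machine without TORQUE installed.

```python
import shlex


def build_qsub_command(script, ncpus, gpus):
    """Return a qsub argument list requesting job-wide CPU and GPU totals.

    Hypothetical helper for illustration; it only formats the
    "-l ncpus=X,gpus=Y" resource list and does not contact TORQUE.
    """
    if ncpus < 1 or gpus < 0:
        raise ValueError("ncpus must be >= 1 and gpus must be >= 0")
    # Totals apply to the whole job, not to each node.
    resources = f"ncpus={ncpus},gpus={gpus}"
    return ["qsub", "-l", resources, script]


# "train.sh" is a placeholder job script name.
cmd = build_qsub_command("train.sh", ncpus=8, gpus=2)
print(shlex.join(cmd))  # qsub -l ncpus=8,gpus=2 train.sh
```

Keeping the command as an argument list rather than a single string means it could later be passed to subprocess.run without shell quoting concerns, assuming qsub is on the PATH of the submitting host.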