Common Node Resource Specification
BC Project: FY14-01
Date of Policy: 13 Mar 2014
Last Updated: 06 Jan 2022 (see Revision Log)
This project defines a common method for specifying the type of nodes required on all allocated HPC systems with heterogeneous node configurations.
Defining a common method promotes consistency across participating HPCMP Centers and systems.
Currently, the centers use Workload Management System directives to specify resources. Examples of node resources are memory size and the presence of an accelerator, such as an Intel Many Integrated Core (MIC) device (also known as Xeon Phi) or a Graphics Processing Unit (GPU). The following generic names will be used to specify the listed resources:
|bigmem||A node that has more memory than other nodes|
|nmics||Number of Intel MIC (Xeon Phi) devices per node|
|ngpus||Number of NVIDIA GPU devices per node|
|nmlas||Number of Machine Learning Architecture devices per node|
|inference||A node type intended for Machine Learning Inference workloads|
|training||A node type intended for Machine Learning Training|
|visualization||A node type intended for visualization|
|nvme||A node resource that specifies Non-Volatile Memory local storage on the node|
The list of resources may change as needed to accommodate new capabilities.
DSRCs will coordinate with the Baseline Configuration Team prior to introducing any new node resource types.
A summary of usage for each system at the participating HPCMP centers can be found on the HPC Centers website in the "Guide to the Batch Queuing System."
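As an illustration of how the generic names above might appear in a scheduler directive, the following is a minimal sketch that assembles a PBS-style `select` resource request. The helper name `build_select`, the split of names into per-node counts and on/off node types, and the `name=1` form for node types are assumptions made for this example and are not defined by this policy. The exact syntax on any given system is documented in that system's guide.

```python
# Generic resource names taken from the table in this policy.
# Grouping them into "counts" and "flags" is an assumption of this sketch.
COUNT_RESOURCES = {"nmics", "ngpus", "nmlas"}
FLAG_RESOURCES = {"bigmem", "inference", "training", "visualization", "nvme"}


def build_select(nodes, ncpus, **resources):
    """Return a PBS-style select string using the generic resource names.

    Hypothetical helper: it validates names against the policy's list and
    joins them in the chunk form select=N:ncpus=M:resource=value.
    """
    if nodes < 1 or ncpus < 1:
        raise ValueError("nodes and ncpus must be positive")
    parts = [f"select={nodes}", f"ncpus={ncpus}"]
    # Sort names so the generated directive is deterministic.
    for name, value in sorted(resources.items()):
        if name in COUNT_RESOURCES:
            # Counts must be positive integers (bool is rejected explicitly).
            if isinstance(value, bool) or not isinstance(value, int) or value < 1:
                raise ValueError(f"{name} must be a positive integer")
            parts.append(f"{name}={value}")
        elif name in FLAG_RESOURCES:
            # Assumed convention: a requested node type is written as name=1.
            if value:
                parts.append(f"{name}=1")
        else:
            raise ValueError(f"unknown node resource: {name}")
    return ":".join(parts)


if __name__ == "__main__":
    # Two nodes, 48 cores each, one GPU per node.
    print("#PBS -l " + build_select(2, 48, ngpus=1))
    # One large-memory node.
    print("#PBS -l " + build_select(1, 48, bigmem=True))
```

Rejecting unknown names in the helper mirrors the intent of the policy: a job script written against the generic names either uses a name from the agreed list or fails early, before it reaches the scheduler.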
Notice: Future HPCMP allocated HPC systems may use a variety of HPC job schedulers (PBS Pro, Slurm, LSF) for batch job submission.
Revision Log
|06 Jan 2022||BC Team Audit - Policy Name modification|
|08 Dec 2020||Replaced PBS scheduler by a generic HPC job scheduler|
|26 Apr 2018||BC Team Audit|
|16 Jun 2016||BC Team Audit|
|13 Mar 2014||First release of policy|