Common Node Resource Specification

BC Project: FY14-01
Date of Policy: 13 Mar 2014
Last Updated: 06 Jan 2022 (see Revision Log)

This project defines a common method for specifying the types of nodes required on all allocated HPC systems with heterogeneous node configurations.

Defining a common method promotes consistency across participating HPCMP Centers and systems.

Currently, the centers use Workload Management System directives to specify resources. Examples of node resources are memory size and the presence of an accelerator, such as an Intel Many Integrated Core (MIC) device, also known as Xeon Phi, or a Graphics Processing Unit (GPU). The following generic names will be used to specify the listed resources:

Node Attribute Specification

Attribute      Description
bigmem         A node that has more memory than other nodes
nmics          Number of Intel MIC (Xeon Phi) devices per node
ngpus          Number of NVIDIA GPU devices per node
nmlas          Number of Machine Learning Architecture devices per node
inference      A node type intended for Machine Learning Inference workloads
training       A node type intended for Machine Learning Training workloads
visualization  A node type intended for visualization
nvme           A node resource that specifies Non-Volatile Memory local storage on the node
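
As an illustration only, the attribute names above could be checked and rendered into a PBS-style resource request as in the minimal sketch below. The attribute names come from the table; the function name build_select, the split into count-valued and flag-valued attributes (including treating nvme as a flag), and the exact directive form are assumptions made for this example and are not defined by this policy. The "Guide to the Batch Queuing System" for each system gives the authoritative syntax.

```python
# Attribute names are taken from the Node Attribute Specification table.
# Everything else (function name, count/flag split, output form) is a
# hypothetical illustration, not a documented HPCMP or scheduler interface.
COUNT_ATTRIBUTES = {"nmics", "ngpus", "nmlas"}
FLAG_ATTRIBUTES = {"bigmem", "inference", "training", "visualization", "nvme"}


def build_select(nodes, ncpus, **attributes):
    """Render a PBS-style select statement requesting the given node attributes."""
    if nodes < 1 or ncpus < 1:
        raise ValueError("nodes and ncpus must be positive")
    parts = [f"select={nodes}", f"ncpus={ncpus}"]
    for name, value in attributes.items():
        if name in COUNT_ATTRIBUTES:
            # Count attributes carry a number of devices per node.
            if isinstance(value, bool) or not isinstance(value, int) or value < 1:
                raise ValueError(f"{name} must be a positive integer")
            parts.append(f"{name}={value}")
        elif name in FLAG_ATTRIBUTES:
            # Flag attributes (an assumption here) are requested as name=1.
            if value is not True:
                raise ValueError(f"{name} must be given as True")
            parts.append(f"{name}=1")
        else:
            # Reject names that are not in the policy's attribute list.
            raise ValueError(f"unknown node attribute: {name}")
    return ":".join(parts)


if __name__ == "__main__":
    print("#PBS -l " + build_select(2, 48, ngpus=1))
```

Run as a script, this prints "#PBS -l select=2:ncpus=48:ngpus=1". Rejecting unknown names keeps a request limited to the generic attributes listed in the table.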

The list of resources may change as needed to accommodate new capabilities.

DSRCs will coordinate with the Baseline Configuration Team prior to introducing any new node resource types.

A summary of usage for each system at the participating HPCMP centers can be found on the HPC Centers website in the "Guide to the Batch Queuing System."

Notice: Future HPCMP allocated HPC systems may use a variety of HPC job schedulers (PBS Pro, Slurm, LSF) for batch job submission.

Revision Log
Date         Revision
06 Jan 2022  BC Team Audit - Policy Name modification
08 Dec 2020  Replaced PBS scheduler by a generic HPC job scheduler
26 Apr 2018  BC Team Audit
16 Jun 2016  BC Team Audit
13 Mar 2014  First release of policy