BC News

Look to this space for information on completed Baseline Configuration (BC) Initiative projects.

2022

Feb - After several months of work, the BC Team completed an audit of all BC policies. The entire set of updated policies may be found in the Policies and Projects section.

2020

Applicability of Baseline Configuration Policies to Emerging Architectures

At a joint meeting of the HPCMP User Advocacy Group (UAG) and Baseline Configuration Team (BCT) in Lorton, VA, on 11 March 2020, the HPCMP Associate Director for HPC Centers posed a series of questions to both groups regarding three emerging architectures in the HPCMP:

  1. HPC in the Cloud
    1. How can users interact with both the HPCMP and Cloud environments?
    2. Are BC/consistency policies applicable to the Cloud?
  2. Highly Classified HPC
    1. What are the user requirements for these systems?
    2. What has been the impact on users resulting from fewer unclassified resources?
    3. How will we allocate these systems?
    4. Do we apply BC/consistency policies to these systems?
  3. Data-Intensive/Deployable HPC
    1. What allocation unit shall we use for these systems?
    2. Do we apply BC/consistency policies to these systems?

During the spring and summer months of 2020, the BC Team discussed these questions in its weekly meetings, focusing on the sub-questions about the applicability of BC/consistency policies to emerging architectures.

On December 30, 2020, the BC Team submitted to the HPCMP Associate Director for HPC Centers a report on the applicability of each BC policy to the following three emerging architectures:

  • HPC Cloud Computing,
  • Highly Classified HPC Implementations,
  • Deployable Containerized HPC Implementations.

The BC Team report will be provided upon request.

2017

mpiprocs Hook

In the current batch job scheduling environment, a job submitted to a Cray system is rejected at run time if the number of MPI processes (mpiprocs) does not have a valid value with respect to the number of processors requested to run the job.

Rather than let a job fail at run time, a tool called the "mpiprocs hook" was developed to catch the failure at submission time. Rejecting the job at submission time saves users effort, conserves system resources, and eliminates the time wasted when a job sits in the queue only to fail at run time.

The "mpiprocs hook" is designed to reject jobs that have exceeded the maximum total mpiprocs per job request or if the job's mpiprocs value is not a factor of the number of requested cpu's. This validation is done by querying a job's mpiprocs on Cray systems prior to accepting the job into the queues. The user will know immediately if their mpiprocs is valid. If not, the user will get a warning message that explains why the mpiprocs setting was rejected along with a recommendation for a valid mpiprocs setting.

Key features, illustrated in the sketch following this list, are:

  • Reject jobs that have exceeded the maximum total mpiprocs.
  • Reject a job if the mpiprocs value is not a factor of the number of requested CPUs.
  • Support for accelerator nodes (nmics), which is machine configurable.
  • Support for multiple chunks within a select statement.
  • Support for system default values; the hook also accepts a job if mpiprocs is not set or is set to zero.
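
The hook was presumably implemented as a PBS Professional "queuejob" hook, since PBS is the batch scheduler on HPCMP Cray systems and its site hooks are written in Python. The following is a minimal sketch of how such a hook might enforce the rules above. The pbs.event(), accept(), and reject() calls belong to the standard PBS hook API, but the MAX_TOTAL_MPIPROCS limit, the select-string parsing, and the messages are illustrative assumptions rather than the BC Team's production code; accelerator-node (nmics) handling is omitted for brevity.

    import pbs

    MAX_TOTAL_MPIPROCS = 4096  # hypothetical site-configurable per-job limit

    e = pbs.event()
    select = e.job.Resource_List["select"]

    if select is None:
        e.accept()  # no select statement; let system defaults apply

    total_mpiprocs = 0

    # A select statement is one or more chunks joined by '+', e.g.
    # "4:ncpus=32:mpiprocs=32+2:ncpus=32:mpiprocs=16".
    for chunk in str(select).split("+"):
        parts = chunk.split(":")
        count = int(parts[0]) if parts[0].isdigit() else 1
        resources = dict(p.split("=", 1) for p in parts if "=" in p)

        ncpus = int(resources.get("ncpus", "0"))
        mpiprocs = int(resources.get("mpiprocs", "0"))

        # Accept system defaults: mpiprocs unset or zero is allowed.
        if mpiprocs == 0:
            continue

        # mpiprocs must be a factor of the chunk's requested CPUs.
        if ncpus > 0 and ncpus % mpiprocs != 0:
            e.reject("mpiprocs=%d is not a factor of ncpus=%d; "
                     "a valid setting would be mpiprocs=%d"
                     % (mpiprocs, ncpus, ncpus))

        total_mpiprocs += count * mpiprocs

    # Enforce the per-job maximum on total mpiprocs.
    if total_mpiprocs > MAX_TOTAL_MPIPROCS:
        e.reject("total mpiprocs %d exceeds the per-job maximum of %d"
                 % (total_mpiprocs, MAX_TOTAL_MPIPROCS))

    e.accept()

With a check of this kind in place, a submission such as qsub -l select=4:ncpus=32:mpiprocs=12 is refused immediately with an explanatory message, instead of entering the queue and failing at run time.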

The newly developed "mpiprocs hook" is now in production on all HPCMP-allocated HPC systems.

An article by the BC Team on the "mpiprocs Hook" was published in the April 2017 edition of the What's New @ HPCMP quarterly newsletter.