Unclassified Systems

Carpenter is currently running in a degraded state.

Carpenter is an HPE Cray EX4000 system located at the ERDC DSRC. It has 1,632 standard compute nodes, 4 large-memory nodes, and 8 GPU nodes (a total of 313,344 compute cores). It has 585 TB of usable memory and is rated at 17.65 peak PFLOPS.

Maintenance
Date / Time | Details
2024 Oct 10 16:00 CT - TBD (In Progress) | System Maintenance
2024 Oct 21 08:00 - Oct 28 08:00 CT | Archive Maintenance
Node Configuration
Node Type | Login | Standard | Large-Memory | Visualization
Total Nodes | 10 | 1,632 | 4 | 8
Processor | AMD 9654 Genoa | AMD 9654 Genoa | AMD 9654 Genoa | AMD 7713 Milan
Processor Speed | 2.4 GHz | 2.4 GHz | 2.4 GHz | 2.0 GHz
Sockets / Node | 2 | 2 | 2 | 2
Cores / Node | 192 | 192 | 192 | 128
Total CPU Cores | 1,920 | 313,344 | 768 | 1,024
Usable Memory / Node | 8 GB | 349 GB | 2.973 TB | 467 GB
Accelerators / Node | None | None | None | 1
Accelerator | n/a | n/a | n/a | NVIDIA A40 PCIe 4
Memory / Accelerator | n/a | n/a | n/a | 48 GB
Storage on Node | 1.3 TB NVMe SSD | None | 8.8 TB NVMe SSD | None
Interconnect | Ethernet | HPE Slingshot | HPE Slingshot | HPE Slingshot
Operating System | SLES 15 | SLES 15 | SLES 15 | SLES 15
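
The Total CPU Cores row is the product of Total Nodes and Cores / Node. The Python sketch below recomputes it from the Carpenter figures in the table above. The dictionary layout and variable names are illustrative assumptions, not part of any HPCMP tool.

```python
# Carpenter node counts and cores per node, copied from the table above.
# The (nodes, cores_per_node) tuple layout is an illustrative assumption.
carpenter = {
    "Login": (10, 192),
    "Standard": (1632, 192),
    "Large-Memory": (4, 192),
    "Visualization": (8, 128),
}

# Total CPU cores per node type = total nodes * cores per node.
total_cores = {name: nodes * cores for name, (nodes, cores) in carpenter.items()}

# Login nodes do not run batch jobs, so leave them out of the compute total.
compute_cores = sum(v for name, v in total_cores.items() if name != "Login")

for name, cores in total_cores.items():
    print(f"{name}: {cores:,} cores")
print(f"All compute node types: {compute_cores:,} cores")
```

By this arithmetic the standard nodes alone account for 313,344 cores, which is the figure quoted in the system summary; adding the large-memory and visualization nodes gives 315,136.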
Queue Descriptions and Limits on Carpenter
Priority | Queue Name | Max Wall Clock Time | Max Cores Per Job | Description
Highest | urgent | 24 Hours | 9,408 | Jobs belonging to DoD HPCMP Urgent Projects
↓ | debug* | 1 Hour | 13,824 | Time/resource-limited for user testing and debug purposes
↓ | HIE | 24 Hours | 192 | Rapid response for interactive work. For more information see the HPC Interactive Environment (HIE) User Guide.
↓ | high_lw | 168 Hours | 7,488 | Long-walltime jobs belonging to DoD HPCMP High Priority Projects
↓ | high_lg | 24 Hours | 100,032 | Large jobs belonging to DoD HPCMP High Priority Projects
↓ | high_sm | 24 Hours | 9,408 | Small jobs belonging to DoD HPCMP High Priority Projects
↓ | frontier_lw | 168 Hours | 7,488 | Long-walltime jobs belonging to DoD HPCMP Frontier Projects
↓ | frontier_lg | 24 Hours | 100,032 | Large jobs belonging to DoD HPCMP Frontier Projects
↓ | frontier_sm | 24 Hours | 9,408 | Small jobs belonging to DoD HPCMP Frontier Projects
↓ | standard_lw | 168 Hours | 7,488 | Long-walltime standard jobs
↓ | standard_lg | 24 Hours | 100,032 | Large standard jobs
↓ | standard_sm | 24 Hours | 9,408 | Small standard jobs
↓ | serial | 168 Hours | 1 | Single-core serial jobs
↓ | transfer | 48 Hours | 1 | Data transfer for user jobs. See the ERDC DSRC Archive Guide, section 5.2.
Lowest | background** | 4 Hours | 9,408 | User jobs that are not charged against the project allocation
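
Carpenter splits each project class into _sm, _lg, and _lw queues with different wall-clock and core limits. As a rough illustration, the Python sketch below lists which standard queues admit a given request, using only the upper limits from the table above. The function name is hypothetical, and the table does not say whether a queue also enforces a minimum job size or other scheduler policy, so treat the result as a first filter only.

```python
# Limits for Carpenter's standard queues, copied from the table above:
# queue name -> (max wall clock hours, max cores per job).
STANDARD_QUEUES = {
    "standard_sm": (24, 9408),
    "standard_lg": (24, 100032),
    "standard_lw": (168, 7488),
}

def candidate_queues(cores, hours):
    """Return the standard queues whose upper limits admit the request.

    Hypothetical helper: only the maximums from the table are checked.
    """
    return [
        name
        for name, (max_hours, max_cores) in STANDARD_QUEUES.items()
        if cores <= max_cores and hours <= max_hours
    ]

print(candidate_queues(1920, 12))    # fits all three queues
print(candidate_queues(50000, 24))   # too many cores for _sm and _lw
print(candidate_queues(7488, 100))   # too long for _sm and _lg
print(candidate_queues(9408, 48))    # too long for _sm/_lg, too large for _lw
```

The last request returns an empty list: it would have to be shortened to 24 hours or reduced to 7,488 cores to fit one of the standard queues.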
Narwhal is currently Up.

Narwhal is an HPE Cray EX system located at the Navy DSRC. It has 2,304 standard compute nodes, 26 large-memory nodes, 16 visualization accelerated nodes, 32 Single-GPU MLA accelerated nodes, and 32 Dual-GPU MLA accelerated nodes (a total of 2,410 compute nodes or 308,480 compute cores). It has 640 TB of memory and is rated at 13.5 peak PFLOPS.

Maintenance
Date / Time | Details
2024 Oct 15 09:00 - 17:00 CT | System Maintenance
2024 Oct 15 13:00 - 18:00 CT | Archive Maintenance
Node Configuration
Node Type | Login | Standard | Large-Memory | Visualization | Single-GPU MLA | Dual-GPU MLA
Total Nodes | 11 | 2,304 | 26 | 16 | 32 | 32
Processor | AMD 7H12 Rome | AMD 7H12 Rome | AMD 7H12 Rome | AMD 7H12 Rome | AMD 7H12 Rome | AMD 7H12 Rome
Processor Speed | 2.6 GHz | 2.6 GHz | 2.6 GHz | 2.6 GHz | 2.6 GHz | 2.6 GHz
Sockets / Node | 2 | 2 | 2 | 2 | 2 | 2
Cores / Node | 128 | 128 | 128 | 128 | 128 | 128
Total CPU Cores | 1,408 | 294,912 | 3,328 | 2,048 | 4,096 | 4,096
Usable Memory / Node | 226 GB | 238 GB | 995 GB | 234 GB | 239 GB | 239 GB
Accelerators / Node | None | None | None | 1 | 1 | 2
Accelerator | n/a | n/a | n/a | NVIDIA V100 PCIe 3 | NVIDIA V100 PCIe 3 | NVIDIA V100 PCIe 3
Memory / Accelerator | n/a | n/a | n/a | 32 GB | 32 GB | 32 GB
Storage on Node | 880 GB SSD | None | 1.8 TB SSD | None | 880 GB SSD | 880 GB SSD
Interconnect | HPE Slingshot | HPE Slingshot | HPE Slingshot | HPE Slingshot | HPE Slingshot | HPE Slingshot
Operating System | SLES | SLES | SLES | SLES | SLES | SLES
Queue Descriptions and Limits on Narwhal
Priority | Queue Name | Max Wall Clock Time | Max Cores Per Job | Description
Highest | urgent | 24 Hours | 16,384 | Jobs belonging to DoD HPCMP Urgent Projects
↓ | frontier | 168 Hours | 65,536 | Jobs belonging to DoD HPCMP Frontier Projects
↓ | high | 168 Hours | 32,768 | Jobs belonging to DoD HPCMP High Priority Projects
↓ | debug | 30 Minutes | 8,192 | Time/resource-limited for user testing and debug purposes
↓ | HIE | 24 Hours | 3,072 | Rapid response for interactive work. For more information see the HPC Interactive Environment (HIE) User Guide.
↓ | viz | 24 Hours | 128 | Visualization jobs
↓ | standard | 168 Hours | 32,768 | Standard jobs
↓ | mla | 24 Hours | 128 | Machine Learning Accelerated jobs that require a GPU node; PBS assigns the next available smla (1-GPU) or dmla (2-GPU) node.
↓ | smla | 24 Hours | 128 | Machine Learning Accelerated jobs that require an smla (Single-GPU MLA) node.
↓ | dmla | 24 Hours | 128 | Machine Learning Accelerated jobs that require a dmla (Dual-GPU MLA) node.
↓ | serial | 168 Hours | 1 | Serial jobs
↓ | bigmem | 96 Hours | 1,280 | Large-memory jobs
↓ | transfer | 48 Hours | 1 | Data transfer for user jobs. See the Navy DSRC Archive Guide, section 5.2.
Lowest | background | 4 Hours | 1,024 | User jobs that are not charged against the project allocation
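
The queue limits are stated in cores, while job sizes are often thought of in nodes. Because every Narwhal node type has 128 cores, the limits convert directly. The Python sketch below does that conversion for several queues from the table above; it assumes jobs are sized in whole nodes, which this page does not state.

```python
CORES_PER_NODE = 128  # Narwhal, all node types (from the Node Configuration table)

# Max Cores Per Job for selected Narwhal queues, copied from the table above.
max_cores = {
    "urgent": 16384,
    "frontier": 65536,
    "high": 32768,
    "debug": 8192,
    "standard": 32768,
    "bigmem": 1280,
    "background": 1024,
}

# Whole nodes that fit under each core limit (assumes whole-node requests).
max_nodes = {queue: cores // CORES_PER_NODE for queue, cores in max_cores.items()}

for queue, nodes in max_nodes.items():
    print(f"{queue}: up to {nodes} nodes ({max_cores[queue]:,} cores)")
```

Under that assumption, the viz, mla, smla, and dmla limits of 128 cores correspond to a single node per job.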
Nautilus is currently Up.

Nautilus is a Penguin Computing TrueHPC system located at the Navy DSRC. It has 1,384 standard compute nodes, 16 large-memory nodes, 16 visualization accelerated nodes, 32 AI/ML nodes, and 32 High Core Performance nodes (a total of 1,480 compute nodes or 186,368 compute cores). It has 386 TB of memory and is rated at 8.5 peak PFLOPS.

Node Configuration
Node Type | Login | Standard | Large-Memory | Visualization | AI/ML | High Core Performance
Total Nodes | 14 | 1,384 | 16 | 16 | 32 | 32
Processor | AMD 7713 Milan | AMD 7713 Milan | AMD 7713 Milan | AMD 7713 Milan | AMD 7713 Milan | AMD 73F3 Milan
Processor Speed | 2 GHz | 2 GHz | 2 GHz | 2 GHz | 2 GHz | 3.4 GHz
Sockets / Node | 2 | 2 | 2 | 2 | 2 | 2
Cores / Node | 128 | 128 | 128 | 128 | 128 | 32
Total CPU Cores | 1,792 | 177,152 | 2,048 | 2,048 / 16 | 4,096 / 128 | 1,024
Usable Memory / Node | 433 GB | 231 GB | 998 GB | 491 GB | 491 GB | 491 GB
Accelerators / Node | None | None | None | 1 | 4 | None
Accelerator | n/a | n/a | n/a | NVIDIA A40 PCIe 4 | NVIDIA A100 SXM 4 | n/a
Memory / Accelerator | n/a | n/a | n/a | 48 GB | 40 GB | n/a
Storage on Node | 1.92 TB NVMe SSD | None | 1.92 TB NVMe SSD | None | 1.92 TB NVMe SSD | None
Interconnect | HDR InfiniBand | HDR InfiniBand | HDR InfiniBand | HDR InfiniBand | HDR InfiniBand | HDR InfiniBand
Operating System | RHEL | RHEL | RHEL | RHEL | RHEL | RHEL
Queue Descriptions and Limits on Nautilus
Priority | Queue Name | Max Wall Clock Time | Max Cores Per Job | Description
Highest | urgent | 24 Hours | 16,384 | Jobs belonging to DoD HPCMP Urgent Projects
↓ | debug | 30 Minutes | 10,752 | Time/resource-limited for user testing and debug purposes
↓ | HIE | 24 Hours | 3,072 | Rapid response for interactive work. For more information see the HPC Interactive Environment (HIE) User Guide.
↓ | frontier | 168 Hours | 65,536 | Jobs belonging to DoD HPCMP Frontier Projects
↓ | high | 168 Hours | 65,536 | Jobs belonging to DoD HPCMP High Priority Projects
↓ | serial | 168 Hours | 1 | Single-core serial jobs
↓ | standard | 168 Hours | 16,384 | Standard jobs
↓ | transfer | 48 Hours | 1 | Data transfer for user jobs. See the Navy DSRC Archive Guide, section 5.2.
Lowest | background | 4 Hours | 4,096 | User jobs that are not charged against the project allocation
Raider is currently running in a degraded state.

Raider is a Penguin Computing TrueHPC system located at the AFRL DSRC. It has 1,480 standard compute nodes, 8 large-memory nodes, 24 visualization nodes, 32 MLA nodes, and 64 High Clock nodes (a total of 199,680 compute cores). It has 447 TB of memory and is rated at 9 peak PFLOPS.

Maintenance
Date / Time | Details
2024 Sep 11 14:30 ET - TBD (In Progress) | Archive Maintenance
Node Configuration
Node Type | Login | Login-viz | Standard | Large-Memory | Visualization | MLA | High Clock | Transfer
Total Nodes | 6 | 4 | 1,480 | 8 | 24 | 32 | 64 | 2
Processor | AMD 7713 Milan | AMD 7713 Milan | AMD 7713 Milan | AMD 7713 Milan | AMD 7713 Milan | AMD 7713 Milan | AMD 73F3 Milan | AMD 7713 Milan
Processor Speed | 2.0 GHz | 2.0 GHz | 2.0 GHz | 2.0 GHz | 2.0 GHz | 2.0 GHz | 3.4 GHz | 2.0 GHz
Sockets / Node | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2
Cores / Node | 128 | 128 | 128 | 128 | 128 | 128 | 32 | 128
Total CPU Cores | 768 | 512 | 189,440 | 1,024 | 3,072 | 4,096 | 2,048 | 256
Usable Memory / Node | 503 GB | 503 GB | 251 GB | 2.0 TB | 503 GB | 503 GB | 503 GB | 503 GB
Accelerators / Node | 1 | 1 | None | None | 1 | 4 | None | None
Accelerator | NVIDIA A40 PCIe 4 | NVIDIA A100 SXM 4 | n/a | n/a | NVIDIA A40 PCIe 4 | NVIDIA A100 SXM 4 | n/a | n/a
Memory / Accelerator | 45 GB | 40 GB | n/a | n/a | 45 GB | 40 GB | n/a | n/a
Storage on Node | 960 GB NVMe SSD | 960 GB NVMe SSD | 1.91 TB NVMe SSD | 7.68 TB NVMe SSD | None | 3.84 TB NVMe SSD | None | None
Interconnect | HDR InfiniBand | HDR InfiniBand | HDR InfiniBand | HDR InfiniBand | HDR InfiniBand | HDR InfiniBand | HDR InfiniBand | HDR InfiniBand
Operating System | RHEL | RHEL | RHEL | RHEL | RHEL | RHEL | RHEL | RHEL
Queue Descriptions and Limits on Raider
Priority | Queue Name | Max Wall Clock Time | Max Cores Per Job | Description
Highest | urgent | 168 Hours | 92,160 | Jobs belonging to DoD HPCMP Urgent Projects
↓ | debug | 1 Hour | 3,840 | Time/resource-limited for user testing and debug purposes
↓ | high | 168 Hours | 92,160 | Jobs belonging to DoD HPCMP High Priority Projects
↓ | frontier | 168 Hours | 92,160 | Jobs belonging to DoD HPCMP Frontier Projects
↓ | standard | 168 Hours | 92,160 | Standard jobs
↓ | HIE | 24 Hours | 256 | Rapid response for interactive work. For more information see the HPC Interactive Environment (HIE) User Guide.
↓ | transfer | 48 Hours | 1 | Data transfer for user jobs. See the AFRL DSRC Archive Guide, section 5.2.
Lowest | background | 24 Hours | 3,840 | User jobs that are not charged against the project allocation
SCOUT is currently Up.

SCOUT is an IBM POWER9 system located at the ARL DSRC. It has 22 Training nodes, each with 6 NVIDIA V100 GPUs; 128 Inference nodes, each with 4 NVIDIA T4 GPUs; and 2 Visualization nodes, each with 2 NVIDIA V100 GPUs (a total of 152 compute nodes or 6,080 cores). It has 45 TB of memory.

Node Configuration
Node Type | Login | Training | Inference | Visualization
Total Nodes | 4 | 22 | 128 | 2
Processor | IBM POWER9 | IBM POWER9 | IBM POWER9 | IBM POWER9
Processor Speed | 2.55 GHz | 2.55 GHz | 2.55 GHz | 2.55 GHz
Sockets / Node | 2 | 2 | 2 | 2
Cores / Node | 40 | 40 | 40 | 40
Total CPU Cores | 160 | 880 | 5,120 | 80
Usable Memory / Node | 502 GB | 690 GB | 246 GB | 502 GB
Accelerators / Node | None | 6 | 4 | 2
Accelerator | n/a | NVIDIA V100 PCIe 3 | NVIDIA T4 PCIe 3 | NVIDIA V100 PCIe 3
Memory / Accelerator | n/a | 32 GB | 16 GB | 16 GB
Storage on Node | 1.4 TB PCIe | 12 TB PCIe | 2.1 TB PCIe | 5.9 TB PCIe
Interconnect | InfiniBand EDR | InfiniBand EDR | InfiniBand EDR | InfiniBand EDR
Operating System | RHEL | RHEL | RHEL | RHEL
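
SCOUT is sized by its accelerators rather than its CPU cores, so the GPU totals are worth working out from the table above. The Python sketch below multiplies nodes by accelerators per node and by memory per accelerator; the tuple layout and names are illustrative assumptions.

```python
# SCOUT compute node types, copied from the table above:
# name -> (total nodes, accelerators per node, memory per accelerator in GB).
scout = {
    "Training": (22, 6, 32),
    "Inference": (128, 4, 16),
    "Visualization": (2, 2, 16),
}

# GPUs per node type = nodes * accelerators per node.
gpus = {name: nodes * per_node for name, (nodes, per_node, _) in scout.items()}

# Aggregate GPU memory per node type = GPUs * memory per accelerator.
gpu_mem_gb = {name: gpus[name] * scout[name][2] for name in scout}

total_gpus = sum(gpus.values())
total_gpu_mem_gb = sum(gpu_mem_gb.values())

for name in scout:
    print(f"{name}: {gpus[name]} GPUs, {gpu_mem_gb[name]:,} GB GPU memory")
print(f"Total: {total_gpus} GPUs, {total_gpu_mem_gb:,} GB GPU memory")
```

With the table's figures this gives 132 training GPUs, 512 inference GPUs, and 4 visualization GPUs, or 648 in total.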
Queue Descriptions and Limits on SCOUT
Priority | Queue Name | Max Wall Clock Time | Max Cores Per Job | Description
Highest | transfer | 48 Hours | N/A | Data transfer for user jobs. See the ARL DSRC Archive Guide, section 5.2.
↓ | urgent | 96 Hours | N/A | Jobs belonging to DoD HPCMP Urgent Projects
↓ | debug | 1 Hour | N/A | Time/resource-limited for user testing and debug purposes
↓ | high | 168 Hours | N/A | Jobs belonging to DoD HPCMP High Priority Projects
↓ | frontier | 168 Hours | N/A | Jobs belonging to DoD HPCMP Frontier Projects
↓ | HIE | 24 Hours | N/A | Rapid response for interactive work. For more information see the HPC Interactive Environment (HIE) User Guide.
↓ | interactive | 12 Hours | N/A | Interactive jobs
↓ | standard | 168 Hours | N/A | Standard jobs
Lowest | background | 24 Hours | N/A | User jobs that are not charged against the project allocation
Warhawk is currently running in a degraded state.

Warhawk is an HPE Cray EX system located at the AFRL DSRC. It has 1,024 standard compute nodes, 4 large-memory nodes, 24 1-GPU visualization nodes, and 40 2-GPU Machine-Learning nodes (a total of 1,092 compute nodes or 139,776 compute cores). It has 564 TB of memory and is rated at 6.86 peak PFLOPS.

Maintenance
Date / Time | Details
2024 Sep 11 14:30 ET - TBD (In Progress) | Archive Maintenance
2024 Oct 14 08:00 - Oct 25 17:00 ET | System Maintenance
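
With wall-clock limits of up to 168 hours, a long job can easily run into a scheduled maintenance window. The Python sketch below checks a planned job against the Warhawk system maintenance window listed above. The function name is hypothetical, the datetimes are naive and assumed to be Eastern Time, and this page does not say how the scheduler itself treats jobs that would overlap maintenance.

```python
from datetime import datetime, timedelta

# Warhawk system maintenance window from the table above.
# Naive datetimes for simplicity; all values are assumed to be Eastern Time.
window_start = datetime(2024, 10, 14, 8, 0)
window_end = datetime(2024, 10, 25, 17, 0)

def overlaps_maintenance(job_start, walltime_hours):
    """Return True if the job's run interval intersects the window."""
    job_end = job_start + timedelta(hours=walltime_hours)
    # Two intervals intersect when each starts before the other ends.
    return job_start < window_end and job_end > window_start

# A 168-hour job starting Oct 8 at 08:00 would end Oct 15 at 08:00.
print(overlaps_maintenance(datetime(2024, 10, 8, 8, 0), 168))

# A 24-hour job starting Oct 12 at 08:00 would end Oct 13 at 08:00.
print(overlaps_maintenance(datetime(2024, 10, 12, 8, 0), 24))
```

The first job overlaps the window and the second finishes a day before it opens.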
Node Configuration
Node Type | Login | Standard | Large-Memory | Visualization | Machine-Learning Accelerated
Total Nodes | 7 | 1,024 | 4 | 24 | 40
Processor | AMD 7H12 Rome | AMD 7H12 Rome | AMD 7H12 Rome | AMD 7H12 Rome | AMD 7H12 Rome
Processor Speed | 2.6 GHz | 2.6 GHz | 2.6 GHz | 2.6 GHz | 2.6 GHz
Sockets / Node | 2 | 2 | 2 | 2 | 2
Cores / Node | 128 | 128 | 128 | 128 | 128
Total CPU Cores | 896 | 131,072 | 512 | 3,072 | 5,120
Usable Memory / Node | 995 GB | 503 GB | 995 GB | 503 GB | 503 GB
Accelerators / Node | None | None | None | 1 | 2
Accelerator | n/a | n/a | n/a | NVIDIA V100 PCIe 3 | NVIDIA V100 PCIe 3
Memory / Accelerator | n/a | n/a | n/a | 32 GB | 32 GB
Storage on Node | None | None | None | None | None
Interconnect | Cray Slingshot | Cray Slingshot | Cray Slingshot | Cray Slingshot | Cray Slingshot
Operating System | SLES | SLES | SLES | SLES | SLES
Queue Descriptions and Limits on Warhawk
Priority | Queue Name | Max Wall Clock Time | Max Cores Per Job | Description
Highest | urgent | 168 Hours | 69,888 | Jobs belonging to DoD HPCMP Urgent Projects
↓ | debug | 1 Hour | 2,816 | Time/resource-limited for user testing and debug purposes
↓ | high | 168 Hours | 69,888 | Jobs belonging to DoD HPCMP High Priority Projects
↓ | frontier | 168 Hours | 69,888 | Jobs belonging to DoD HPCMP Frontier Projects
↓ | standard | 168 Hours | 69,888 | Standard jobs
↓ | HIE | 24 Hours | 256 | Rapid response for interactive work. For more information see the HPC Interactive Environment (HIE) User Guide.
↓ | transfer | 48 Hours | 1 | Data transfer for user jobs. See the AFRL DSRC Archive Guide, section 5.2.
Lowest | background | 24 Hours | 2,816 | User jobs that are not charged against the project allocation