Compute Resources
Overview
ElGato
During the quarterly maintenance cycle on April 27, 2022, the ElGato K20s were removed because they are no longer supported by Nvidia.
Implemented at the start of 2014, ElGato has since been reprovisioned with CentOS 7 and new compilers and libraries. It has used Slurm for job submission since July 2021. ElGato is our smallest cluster, with 130 standard nodes, each with 16 CPUs. It was purchased with an NSF MRI grant by researchers in Astronomy and SISTA.
Ocelote
Implemented in the middle of 2016, Ocelote is designed to support the majority of workloads on its standard nodes. Additionally, Ocelote has one large-memory node with 2 TB of memory and 46 nodes with Nvidia P100 GPUs for GPU-accelerated workflows.
Puma
Implemented in 2020, Puma is the biggest cat yet. Similar to Ocelote, it has standard CPU nodes (with 94 cores and 512 GB of memory per node), GPU nodes (with Nvidia V100) and two high-memory nodes (3 TB). Local scratch storage increased to ~1.4 TB. Puma runs on CentOS 7.
Free vs. Buy-In
The HPC resources at UArizona differ from those at many other universities in that a significant portion of the available resources is centrally funded. Each PI receives a standard monthly allocation of hours at no charge. Windfall usage is not charged against the allocation, which has proven very valuable for researchers with substantial compute requirements.
Research groups can 'Buy-In' (add compute nodes) to the base HPC systems as funding becomes available. Buy-In research groups have the highest priority on the resources they add to the system. If a Buy-In group does not fully utilize its expansion resources, they are made available to all users as windfall.
Compute System Details
During the quarterly maintenance cycle on April 27, 2022, the ElGato K20s and Ocelote K80s were removed because they are no longer supported by Nvidia.
| Name | El Gato | Ocelote | Puma |
|---|---|---|---|
| Model | IBM System X iDataPlex dx360 M4 | Lenovo NeXtScale nx360 M5 | Penguin Altus XE2242 |
| Year Purchased | 2013 | 2016 (2018 P100 nodes) | 2020 |
| Node Count | 131 | 400 | 236 CPU-only |
| Total System Memory (TB) | 26 | 82.6 | 128 |
| Processors | 2x Xeon E5-2650v2 8-core (Ivy Bridge) | 2x Xeon E5-2695v3 14-core (Haswell) | 2x AMD EPYC 7642 48-core (Rome) |
| Cores / Node (schedulable) | 16c | 28c (48c - High-memory node) | 94c |
| Total Cores | 2160* | 11528* | 23616* |
| Processor Speed | 2.66GHz | 2.3GHz (2.4GHz - Broadwell CPUs) | 2.4GHz |
| Memory / Node | 256GB - GPU nodes | 192GB (2TB - High-memory node) | 512GB (3TB - High-memory nodes) |
| Accelerators | | 46 NVIDIA P100 (16GB) | 29 NVIDIA V100S |
| /tmp | ~840 GB spinning | ~840 GB spinning | ~1440 GB NVMe |
| HPL Rmax (TFlop/s) | 46 | 382 | |
| OS | CentOS 7 | CentOS 7 | CentOS 7 |
| Interconnect | FDR InfiniBand | FDR InfiniBand for node-node | 1x 25Gb/s Ethernet RDMA (RoCEv2) |

* Includes high-memory and GPU node CPUs.
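
When sizing a job, it can help to turn the per-node figures in the table into memory per core and a rough node count. The following is a minimal sketch, not a site-provided tool: the names `STANDARD_NODE`, `mem_per_core_gb`, and `nodes_needed` are hypothetical, and the numbers are simply the standard-node values from the table above. El Gato is left out because the table lists its memory only for GPU nodes. The scheduler may reserve some memory for the system, so the usable amount per core could be lower than the raw ratio computed here.

```python
import math

# Standard-node figures copied from the table above (assumed to apply
# to every standard node in the cluster).
STANDARD_NODE = {
    "Ocelote": {"cores": 28, "mem_gb": 192},
    "Puma": {"cores": 94, "mem_gb": 512},
}


def mem_per_core_gb(cluster: str) -> float:
    """Raw memory per schedulable core on a standard node, in GB."""
    node = STANDARD_NODE[cluster]
    return node["mem_gb"] / node["cores"]


def nodes_needed(cluster: str, cores: int, mem_gb: float) -> int:
    """Smallest number of standard nodes covering both the requested
    cores and the requested total memory (hypothetical helper)."""
    node = STANDARD_NODE[cluster]
    by_cores = math.ceil(cores / node["cores"])
    by_mem = math.ceil(mem_gb / node["mem_gb"])
    return max(by_cores, by_mem)


if __name__ == "__main__":
    for name in STANDARD_NODE:
        print(f"{name}: {mem_per_core_gb(name):.2f} GB per core")
    # Illustrative request: 200 cores and 600 GB of memory in total.
    print("Puma nodes:", nodes_needed("Puma", cores=200, mem_gb=600))
```

For the illustrative request of 200 cores and 600 GB, the core count is the binding constraint on both clusters: three standard nodes on Puma and eight on Ocelote.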