HPC Clusters Overview


New HPC Documentation Website!

New documentation is coming that will replace our current Confluence website (the one you're viewing right now). We will send an announcement when the site goes live. Interested in taking a peek? Check out this page for the beta version. Note: the URL is likely to change.


New GPUs on Ocelote!

We have recently added 22 new P100 GPUs to Ocelote. Do you need multiple GPUs on a node, but find Puma queue times too long? You can now request two GPUs per node on Ocelote using --gres=gpu:2.
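
As a rough sketch of such a request, the batch script below asks Slurm for two GPUs on a single node using the --gres=gpu:2 option. The job name, time limit, and the count_gpus helper are illustrative assumptions, and site-specific options (such as an account or partition) are left out. It also assumes Slurm exports CUDA_VISIBLE_DEVICES for GPU jobs, which is typical but depends on how the cluster is configured.

```shell
#!/bin/bash
#SBATCH --job-name=two-gpu-check   # hypothetical job name
#SBATCH --nodes=1                  # both GPUs must be on the same node
#SBATCH --ntasks=1
#SBATCH --gres=gpu:2               # request two GPUs on that node
#SBATCH --time=00:10:00            # illustrative time limit

# Count the GPUs exposed to the job. Assumes CUDA_VISIBLE_DEVICES holds a
# comma-separated list of device indices (e.g. "0,1"), or is unset/empty.
count_gpus() {
  if [ -z "${CUDA_VISIBLE_DEVICES:-}" ]; then
    echo 0
  else
    echo "$CUDA_VISIBLE_DEVICES" | tr ',' '\n' | wc -l | tr -d ' '
  fi
}

echo "GPUs visible to this job: $(count_gpus)"
```

Saved under a name of your choosing (for example two_gpu.sh) and submitted with sbatch, the job output gives a quick check that both GPUs were allocated before you launch a real workload.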

Puma

Puma is our latest supercomputer, which came online in the middle of 2020. As with our other supercomputers, we used the RFP process to get the best value for our financial resources while meeting our technical requirements. This time Penguin Computing won with AMD processors. This is tremendously valuable, as each node comes with:

  • Two AMD Zen2 48 core processors
  • 512GB RAM
  • 25Gb path to storage
  • 25Gb path to other nodes for MPI
  • 2TB internal NVMe disk (largely available as /tmp)
  • Qumulo all flash storage array for shared filesystems
  • Two large-memory nodes with 3TB of memory and the same processors as the other nodes
  • Six nodes with four Nvidia V100S GPUs each

Ocelote

Ocelote arrived in 2016. Lenovo's NeXtScale M5 technology won the RFP, mainly on price, performance, and meeting our specific requirements. This cluster is effectively the next generation of the IBM cluster we call ElGato, as Lenovo purchased IBM's Intel server line in 2015.

In 2021, Ocelote's operating system was upgraded from CentOS 6 to CentOS 7, and the cluster was configured to use Slurm, like Puma. Ocelote will remain in service until it is either too expensive to maintain or replaced by something else.

Features:

  • Intel Haswell V3 processors, 28 cores per node
  • 192GB RAM per node
  • FDR InfiniBand for fast MPI interconnect
  • Qumulo all flash storage array (all HPC storage is integrated into one array)
  • One large-memory node with 2TB RAM and 48 Intel Ivy Bridge V2 cores
  • 46 nodes with Nvidia P100 GPUs

ElGato

ElGato is the cluster we obtained prior to Ocelote. It was rebuilt last year with CentOS 7, and in July 2021 it was upgraded to use Slurm.