Running Your Code


Welcome to the UA HPC User Guide! This section includes information and detailed guides on how to run jobs on HPC. 

This page is under construction! We cannot guarantee completion or accuracy of information herein while in development.

Pages Summary


GUI Jobs

Open OnDemand, an NSF-funded open-source HPC portal, provides web browser access for graphical interfacing with HPC. This service is available at https://ood.hpc.arizona.edu/

Batch Jobs & Slurm

Some jobs don't need a GUI, may take a long time to run, or require no user input while running. In these cases, batch jobs are useful: a user requests resources for a job, and the job then runs to completion automatically without any further input. The user can even fully log off of HPC, and the submitted jobs will continue to run. The program on HPC that takes resource requests and assigns resources at ideal times to optimize cluster usage is called a scheduler, and the name of our scheduler is Slurm. All three clusters, Puma, Ocelote, and ElGato, use Slurm for resource management and job scheduling.
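As a minimal sketch, a batch job is just a shell script whose #SBATCH comment lines tell Slurm what resources to request. The partition and account names below are placeholders; substitute your own group's allocation:

```shell
# Write a minimal batch script. The #SBATCH lines are directives that
# Slurm reads at submission time; to the shell they are ordinary comments.
cat > hello_world.slurm <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello_world
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=1gb
#SBATCH --time=00:05:00
#SBATCH --partition=standard
#SBATCH --account=YOUR_GROUP

echo "Hello from $(hostname)"
EOF
```

Submit the script with `sbatch hello_world.slurm`; Slurm prints the assigned job ID, and by default the job's output is written to a slurm-&lt;jobid&gt;.out file in the submission directory.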

Example Batch Jobs

Included here are a few key examples to demonstrate the workflow of batch jobs. Examples include serial jobs, single-node MPI jobs, and a basic array job. More examples can be found on our GitHub page.
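For instance, an array job runs many copies of the same script, each with a different value of SLURM_ARRAY_TASK_ID, which is handy for sweeping over numbered input files. The file naming below is hypothetical:

```shell
# Write a basic array-job script: --array=1-5 launches five tasks,
# and each task sees its own index in $SLURM_ARRAY_TASK_ID.
cat > array_job.slurm <<'EOF'
#!/bin/bash
#SBATCH --job-name=array_demo
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --partition=standard
#SBATCH --account=YOUR_GROUP
#SBATCH --array=1-5

# Hypothetical per-task inputs: input_1.txt ... input_5.txt
echo "Processing input_${SLURM_ARRAY_TASK_ID}.txt"
EOF
```

Submitting this once with `sbatch array_job.slurm` queues all five tasks; each writes its own output file.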

Managing Batch Jobs

It may be helpful to get information about queued, running, or past jobs. Slurm provides several commands for this. Note that all of the following commands apply to the cluster you are currently logged into. To query a different cluster, first switch to it by entering its name (e.g., puma, ocelote, or elgato), then run the command.
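The workhorse commands are squeue (queued and running jobs), sacct (accounting for past jobs), and scancel (removing a job). As a sketch, small wrappers like those below, which you might add to ~/.bashrc, keep the common invocations in one place (the function names and format strings are our own choices, not a Slurm standard):

```shell
# Show only your own jobs: ID, partition, name, state, runtime, and node/reason.
myjobs() {
    squeue --user="$USER" --format="%.12i %.10P %.20j %.8T %.10M %R"
}

# Show accounting info for a past job, e.g.: jobhistory 12345
jobhistory() {
    sacct --jobs "$1" --format=JobID,JobName,State,ExitCode,Elapsed
}

# Cancel a job by ID, e.g.: killjob 12345
killjob() {
    scancel "$1"
}
```

The job ID 12345 above is only an example; use the ID that sbatch or squeue reports for your job.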

Slurm Reference

Slurm is a highly configurable workload manager with numerous configuration options and environment variables. This page lists commonly used options as a reference when creating batch jobs.
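As a quick sketch of what such a reference covers, these are some of the most frequently used #SBATCH directives (the values shown are examples only, and the account name is a placeholder):

```shell
#SBATCH --job-name=my_job         # name shown in queue listings
#SBATCH --account=YOUR_GROUP      # group allocation to charge (placeholder)
#SBATCH --partition=standard      # partition (queue) to run in
#SBATCH --nodes=1                 # number of nodes
#SBATCH --ntasks=4                # number of tasks (CPU cores, for most jobs)
#SBATCH --mem-per-cpu=5gb         # memory per CPU core
#SBATCH --time=24:00:00           # walltime limit, HH:MM:SS
#SBATCH --output=%x-%j.out        # stdout file; %x = job name, %j = job ID
#SBATCH --mail-type=ALL           # email notifications on begin/end/fail
```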

Interactive Jobs

Sometimes it is necessary to test or run code in a quick, interactive shell environment. In these cases, batch jobs are inconvenient due to queue times and lack of interactivity, and the Open OnDemand graphical options may be too cumbersome when only the command line is needed. Interactive sessions are ideal here. The term "interactive session" typically refers to jobs run from the command line in a terminal client.
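A minimal way to start one is salloc, which waits for the requested resources and then drops you into a shell on a compute node. The wrapper below is only a convenience sketch (the defaults are our own choices; pass your group's account name as the first argument):

```shell
# Request an interactive shell on a compute node.
# Usage: interactive_shell YOUR_GROUP [ntasks] [mem] [time]
interactive_shell() {
    salloc \
        --account="$1" \
        --partition=standard \
        --ntasks="${2:-1}" \
        --mem="${3:-4gb}" \
        --time="${4:-01:00:00}"
}
```

Type `exit` inside the session to release the allocation back to the scheduler.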

Parallelization

To take advantage of the full computing power that HPC has to offer, codes can be run in parallel to spread the workload across multiple CPUs, potentially yielding significant improvements in performance. This is often easier said than done. Some codes are developed with parallelization in mind, so running them in parallel can be as simple as calling mpirun; other codes may need to be modified or run in tandem with parallelization libraries such as mpi4py.
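For a code that already supports MPI, the batch script mostly just needs to request multiple tasks and hand them to mpirun. The module and executable names below are placeholders for whatever your cluster and project actually provide:

```shell
# Write a single-node MPI batch script: 16 tasks on one node,
# all launched together by mpirun.
cat > mpi_job.slurm <<'EOF'
#!/bin/bash
#SBATCH --job-name=mpi_demo
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --time=00:30:00
#SBATCH --partition=standard
#SBATCH --account=YOUR_GROUP

module load openmpi3                          # module name varies by cluster
mpirun -n "$SLURM_NTASKS" ./my_mpi_program    # placeholder executable
EOF
```

Because the script reads $SLURM_NTASKS at run time, changing --ntasks is all it takes to scale the job up or down.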