Managing Batch Jobs
This page is under construction! We cannot guarantee completion or accuracy of information herein while in development.
Overview
It can be helpful to get information about queued, running, or past jobs. Slurm provides several commands that help with this. Note that all of the following commands apply to the current cluster. To run them against a different cluster, type the name of that cluster first, and then run the command.
Queued & Running Jobs
To see jobs that are in the queue or currently running on the current cluster, use squeue --user $USER. Note that this command will not show jobs that have already finished, whether through completion, cancellation, or any other reason.
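As a minimal sketch, the squeue call above can be wrapped in a small shell function that optionally filters by job state. The function name my_jobs is hypothetical and is not a command provided by the system; --user and --states are standard squeue options.

```shell
# List the current user's queued and running jobs on the current cluster.
# Usage: my_jobs [STATE]    e.g. my_jobs PENDING  or  my_jobs RUNNING
# my_jobs is an illustrative helper name, not an installed command.
my_jobs() {
    state="${1:-}"
    if [ -n "$state" ]; then
        # Show only jobs in the requested state
        squeue --user "$USER" --states="$state"
    else
        # Show all pending and running jobs for this user
        squeue --user "$USER"
    fi
}
```

Calling my_jobs with no argument is equivalent to the plain squeue --user $USER command; finished jobs are still not shown.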
Previous Jobs
Users may want to review details of past jobs, or to study their performance to improve future jobs. The following commands may help:
past-jobs
job-history
seff
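One way to combine these commands is sketched below, assuming a job ID is already known (for example from the output of past-jobs). The function name review_job is hypothetical; job-history and seff are invoked with a job ID as their only argument, as in the examples in the command table on this page.

```shell
# Print the history and the efficiency report for one job.
# Usage: review_job JOBID
# review_job is an illustrative helper name, not an installed command.
review_job() {
    jobid="${1:-}"
    if [ -z "$jobid" ]; then
        echo "usage: review_job JOBID" >&2
        return 1
    fi
    # History of a running or completed job
    job-history "$jobid"
    # Memory and CPU efficiency of a completed job
    seff "$jobid"
}

# Example workflow (run manually):
#   past-jobs -d 5       # list jobs from the past 5 days
#   review_job $JOBID    # inspect one of them
```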
Slurm and System Commands
Additional helpful commands are included here. For a complete list of Slurm commands and options, please refer to the official Slurm documentation.
Command | Purpose | Example(s) |
---|---|---|
Native Slurm Commands | ||
sbatch | Submits a batch script for execution | sbatch script.slurm |
srun | Runs parallel jobs. Can be used in place of mpirun/mpiexec. Can be used interactively as well as in batch scripts | srun -n 1 --mpi=pmi2 a.out |
salloc | Requests a session to work on a compute node interactively | see: Interactive Jobs |
squeue | Checks the status of pending and running jobs | |
scancel | Cancels a running or pending job | |
scontrol hold | Places a hold on a job to prevent it from being executed | scontrol hold $JOBID |
scontrol release | Releases a hold placed on a job allowing it to be executed | scontrol release $JOBID |
System Commands | ||
va | Displays your group membership, your account usage, and CPU allocation. Short for "view allocation" | va |
interactive | Shortcut for quickly requesting an interactive job. Use "interactive --help" to get full usage. | interactive -a $GROUP_NAME |
job-history | Retrieves a running or completed job's history in a user-friendly format | job-history $JOBID |
seff | Retrieves a completed job's memory and CPU efficiency | seff $JOBID |
past-jobs | Retrieves past jobs run by user. Can be used with option "-d N" to search for jobs run in the past N days. | past-jobs -d 5 |
job-limits | View your group's job resource limits and current usage. | job-limits $GROUP |
nodes-busy | Display a visualization of nodes on a cluster and their usage | nodes-busy --help |
system-busy | Display a text-based summary of a cluster's usage | system-busy |
cluster-busy | Display a visualization of all three clusters' overall usage | cluster-busy --help |
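Several of the native commands in the table can be chained together. The sketch below, under stated assumptions, submits a script, captures its job ID, and then places or releases a hold. The function names submit_job and set_hold are hypothetical. The --parsable option of sbatch is standard Slurm and prints the job ID, optionally followed by a semicolon and the cluster name.

```shell
# Submit a batch script and print only the numeric job ID.
# submit_job and set_hold are illustrative helper names, not installed commands.
submit_job() {
    # --parsable prints "JOBID" or "JOBID;CLUSTER"; keep the part before ';'
    sbatch --parsable "$1" | cut -d';' -f1
}

# Place a hold on a job, or release an existing hold.
# Usage: set_hold hold|release JOBID
set_hold() {
    case "${1:-}" in
        hold|release)
            scontrol "$1" "$2"
            ;;
        *)
            echo "usage: set_hold hold|release JOBID" >&2
            return 1
            ;;
    esac
}

# Example workflow (run manually):
#   JOBID=$(submit_job script.slurm)
#   set_hold hold "$JOBID"       # keep the job from starting
#   set_hold release "$JOBID"    # allow it to be scheduled again
```

A hold only prevents a pending job from starting; it does not stop a job that is already running.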