Example Batch Jobs
- Ethan Jahn
This page is under construction! We cannot guarantee completion or accuracy of information herein while in development.
Slurm Job Examples
Included here are a few key examples to demonstrate the workflow of batch jobs. If you find that your use case is not represented below, please check out our GitHub page for a more complete set of examples. If there is something you would like us to include here, please let us know!
Serial Job
This script runs a serial, single-CPU job using the standard queue.
Any line beginning with #SBATCH issues a SLURM directive that controls aspects of your job such as job name, output filename, memory requirements, number of CPUs, number of nodes, etc.
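For example, directives requesting an output filename, memory, and CPUs per task can be added the same way. The option names below are standard Slurm, but the values are only illustrative and are not part of the example script that follows:

# write output to a file named after the job name (%x) and job ID (%j)
#SBATCH --output=%x-%j.out
# request 4 GB of memory and 4 CPUs for the single task
#SBATCH --mem=4gb
#SBATCH --cpus-per-task=4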
To run the script, replace YOUR_GROUP with the name of your PI's group on HPC. To find this information, you can use the command va. You can submit the job with sbatch <script_name>. This job assumes there is a Python script named hello_world.py in your working directory (included in the downloadable example above).
Submission Script
#!/bin/bash
#SBATCH --job-name=Sample_Slurm_Job
#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --time=00:01:00
#SBATCH --partition=standard
#SBATCH --account=YOUR_GROUP

module load python/3.8
python3 hello_world.py
Example Python Script
#!/usr/bin/env python3
import os
print("Hello world! I'm running on compute node: %s" % os.environ["HOSTNAME"])
Example Job Submission
(ocelote) [netid@junonia ~]$ ls
hello_world.py  serial-job.slurm
(ocelote) [netid@junonia ~]$ sbatch serial-job.slurm
Submitted batch job 73224
Output
(ocelote) [netid@junonia ~]$ ls
slurm-73224.out  hello_world.py  serial-job.slurm
(ocelote) [netid@junonia ~]$ cat slurm-73224.out
Hello world! I'm running on compute node: i4n0
Detailed performance metrics for this job will be available at https://metrics.hpc.arizona.edu/#job_viewer?action=show&realm=SUPREMM&resource_id=5&local_job_id=73224 by 8am on 2021/08/05.
(ocelote) [netid@junonia ~]$
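While a job is queued or running, you can check on it with squeue; for example, using the job ID from the session above:

squeue --job 73224   # status of this specific job
squeue -u netid      # all of your queued and running jobs (replace netid with your own)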
Single Node MPI Job
This script compiles and runs an MPI job using 30 CPUs on a single node.
Note: the C file can also be compiled manually in an interactive session.
Slurm Script:
#!/bin/bash
#SBATCH --job-name=Single-Node-MPI-Job
#SBATCH --ntasks=30
#SBATCH --nodes=1
#SBATCH --time=00:01:00
#SBATCH --partition=standard
#SBATCH --account=YOUR_GROUP

module load gnu8 openmpi3
mpicc -o hello_world hello_world.c
/usr/bin/time mpirun -np $SLURM_NTASKS ./hello_world
Companion MPI Script
The following C program was used to create the executable for this workflow. It is included in the example available for download above.
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    MPI_Init(NULL, NULL);

    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    printf("Hello world from node %s. My rank is %d out of %d processors\n",
           processor_name, world_rank, world_size);

    MPI_Finalize();
}
To compile the program manually, start an interactive terminal session using the interactive command, then:
module load gnu8 openmpi3
mpicc -o hello_world hello_world.c
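If the compile step fails, a quick sanity check is to confirm that the modules actually loaded and that the MPI compiler wrapper is on your path, for example:

module list    # gnu8 and openmpi3 should appear in the list of loaded modules
which mpicc    # should print the path to the MPI C compiler wrapper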
Script Submission
(puma) [netid@junonia ~]$ sbatch Single-Node-MPI-Job.slurm
Submitted batch job 1694351
Output Files
(puma) [netid@junonia ~]$ ls *.out
slurm-1694351.out
Additionally, the executable hello_world will be generated and stored in your working directory.
File Contents
(puma) [netid@junonia ~]$ head slurm-1694351.out
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 28 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 8 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 14 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 17 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 6 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 9 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 10 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 2 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 4 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 5 out of 30 processors
Basic Array Job
Array jobs are used to execute the same script multiple times with different input.
What problem does this help fix?
To execute multiple analyses, a user may be tempted to submit jobs with a scripted loop, e.g.:
for i in $( seq 1 10 ); do sbatch script.slurm <submission options> ; done
This isn't a good solution: it submits many jobs in rapid succession, which can overload the scheduler. An array job achieves the same result with a single submission.
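Arrays also let you limit how many subjobs run at once by appending % and a count to the index range. The directive below is illustrative and is not part of the example that follows:

# run 100 subjobs in total, with at most 10 running at any one time
#SBATCH --array=1-100%10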
Example
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --time=00:01:00
#SBATCH --partition=standard
#SBATCH --account=YOUR_GROUP
#SBATCH --array 1-5

echo "./sample_command input_file_${SLURM_ARRAY_TASK_ID}.in"
Script Breakdown
What differentiates the script above from standard submissions is the --array directive. This is what tells SLURM that you're submitting an array. Following this flag, you specify the range of task indices you wish to run. In this case, we're running five subjobs with indices 1 through 5:
#SBATCH --array 1-5
Each job in the array has its own value of the environment variable SLURM_ARRAY_TASK_ID that can be used to differentiate subjobs. To demonstrate how this can be used to read in different input files, we'll print a sample command:
echo "./sample_command input_file_${SLURM_ARRAY_TASK_ID}.in"
Script Submission
(ocelote) [netid@junonia ~]$ sbatch basic_array_job.slurm
Submitted batch job 73958
Output Files
Each of the subjobs in the array will produce its own output file of the form slurm-<job_id>_<array_index>.out, for example slurm-73958_1.out.
For more information on naming SLURM files, see our online documentation.
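If you would rather set these names explicitly, the --output directive accepts filename patterns; the directive below reproduces the default array naming and is shown only as an illustration:

# %A expands to the parent array job ID, %a to the array task index
#SBATCH --output=slurm-%A_%a.out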
File Contents
Below is a concatenation of the job’s output files. Notice how the array indices function to differentiate the input files in the sample command:
(ocelote) [netid@junonia ~]$ cat slurm-73958_* | grep sample
./sample_command input_file_1.in
./sample_command input_file_2.in
./sample_command input_file_3.in
./sample_command input_file_4.in
./sample_command input_file_5.in