Slurm Job Examples
Code Block | ||||
---|---|---|---|---|
| ||||
#!/bin/bash
#SBATCH --job-name=Sample_Slurm_Job
#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --time=00:01:00
#SBATCH --partition=standard
#SBATCH --account=hpcteam
module load python/3.8
python3 hello_world.py |
Example Python Script
Code Block | ||||
---|---|---|---|---|
| ||||
#!/usr/bin/env python3
import os
print("Hello world! I'm running on compute node: %s"%os.environ["HOSTNAME"]) |
Example Job Submission
Code Block | ||||
---|---|---|---|---|
| ||||
(ocelote) [netid@junonia ~]$ ls
hello_world.py serial-job.slurm
(ocelote) [netid@junonia ~]$ sbatch serial-job.slurm
Submitted batch job 73224 |
Output
Code Block | ||||
---|---|---|---|---|
| ||||
(ocelote) [netid@junonia ~]$ ls
slurm-73224.out hello_world.py serial-job.slurm
(ocelote) [netid@junonia ~]$ cat slurm-73224.out
Hello world! I'm running on compute node: i4n0
Detailed performance metrics for this job will be available at https://metrics.hpc.arizona.edu/#job_viewer?action=show&realm=SUPREMM&resource_id=5&local_job_id=73224 by 8am on 2021/08/05.
(ocelote) [netid@junonia ~]$ |
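The job ID reported at submission determines the name of the output file, as the listing above shows (job 73224 wrote slurm-73224.out). The following is a minimal sketch, assuming the default slurm-<jobid>.out naming and the "Submitted batch job <id>" message format seen above, that derives the output file name from the submission message. The helper name output_file_for is hypothetical, and the message is hard-coded so the snippet runs without a scheduler:

```shell
#!/bin/bash
# output_file_for MESSAGE: print the default output file name for the job
# announced in an sbatch submission message (hypothetical helper).
output_file_for() {
    local message="$1"
    # The job ID is the last space-separated word of the message.
    local job_id="${message##* }"
    echo "slurm-${job_id}.out"
}

# In a real session this would be: message="$(sbatch serial-job.slurm)"
message="Submitted batch job 73224"
output_file_for "$message"
```

This can be handy in wrapper scripts that need to locate a job's output once the job has finished.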
Single Node MPI Job
This script compiles and runs an MPI job using 30 CPUs on a single node.
Note: the C file can also be compiled manually in an interactive session.
Slurm Script:
Code Block | ||||
---|---|---|---|---|
| ||||
#!/bin/bash
#SBATCH --job-name=Single-Node-MPI-Job
#SBATCH --ntasks=30
#SBATCH --nodes=1
#SBATCH --time=00:01:00
#SBATCH --partition=standard
#SBATCH --account=YOUR_GROUP
module load gnu8 openmpi3
mpicc -o hello_world hello_world.c
/usr/bin/time mpirun -np $SLURM_NTASKS ./hello_world |
Companion MPI Script
The following C program was compiled to create the executable for this workflow. It is included in the example available for download above.
Code Block | ||||
---|---|---|---|---|
| ||||
#include <mpi.h>
#include <stdio.h>
int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);
    // Get the total number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    // Get the rank of this process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    // Get the name of the node this process is running on
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);
    printf("Hello world from node %s. My rank is %d out of %d processors\n",
           processor_name, world_rank, world_size);
    // Shut down the MPI environment
    MPI_Finalize();
    return 0;
} |
To compile the program manually, start an interactive terminal session using the interactive command, then run:
Code Block | ||||
---|---|---|---|---|
| ||||
module load gnu8 openmpi3
mpicc -o hello_world hello_world.c |
Script Submission
Code Block | ||||
---|---|---|---|---|
| ||||
(puma) [netid@junonia ~]$ sbatch Single-Node-MPI-Job.slurm
Submitted batch job 1694351 |
Output Files
Code Block | ||||
---|---|---|---|---|
| ||||
(puma) [netid@junonia ~]$ ls *.out
slurm-1694351.out |
Additionally, the executable hello_world will be generated and stored in your working directory.
File Contents
Code Block | ||||
---|---|---|---|---|
| ||||
(puma) [netid@junonia ~]$ head slurm-1694351.out
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 28 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 8 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 14 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 17 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 6 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 9 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 10 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 2 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 4 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 5 out of 30 processors |
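The ranks in the output above appear in no particular order because each MPI process writes its line independently. The following is a small sketch, assuming the message format printed by hello_world.c (where the rank is the ninth whitespace-separated field), that keeps only the greeting lines and sorts them by rank. The helper name sort_by_rank is hypothetical, and three lines copied from the output above stand in for the real file:

```shell
#!/bin/bash
# sort_by_rank [FILE...]: print the hello_world greeting lines sorted
# numerically by rank. Reads standard input when no file is given.
# Assumes the rank is the 9th whitespace-separated field of each line.
sort_by_rank() {
    # grep drops unrelated lines (e.g. the timing summary) before sorting.
    grep -h 'My rank is' "$@" | sort -k9,9n
}

# Sample lines copied from the output above; with a real job, run:
#   sort_by_rank slurm-1694351.out
sort_by_rank <<'EOF'
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 28 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 8 out of 30 processors
Hello world from node r2u05n1.puma.hpc.arizona.edu. My rank is 14 out of 30 processors
EOF
```

With the sample input, the lines are printed in the order of ranks 8, 14, 28.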
Basic Array Job
Array jobs are used to execute the same script multiple times with different input.
What problem does this help fix?
To execute multiple analyses, a user may be tempted to submit jobs with a scripted loop, e.g.:
Code Block | ||||
---|---|---|---|---|
| ||||
for i in $( seq 1 10 ); do sbatch script.slurm <submission options> ; done |
This isn’t a good solution because every iteration submits a separate job, and submitting many jobs in quick succession can overload the scheduler. Instead, an array job achieves the same result with a single submission.
Example
Code Block | ||||
---|---|---|---|---|
| ||||
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --time=00:01:00
#SBATCH --partition=standard
#SBATCH --account=YOUR_GROUP
#SBATCH --array 1-5
echo "./sample_command input_file_${SLURM_ARRAY_TASK_ID}.in" |
Script Breakdown
What differentiates the script above from standard submissions is the --array directive. This is what tells Slurm that you’re submitting an array job. Following this flag, you specify the range of task indices to run. In this case, the range 1-5 runs five subjobs:
Code Block | ||||
---|---|---|---|---|
| ||||
#SBATCH --array 1-5 |
Each job in the array has its own associated environment variable, SLURM_ARRAY_TASK_ID, that can be used to differentiate subjobs. To demonstrate how this variable can be used to read in different input files, we’ll print a sample command:
Code Block | ||||
---|---|---|---|---|
| ||||
echo "./sample_command input_file_${SLURM_ARRAY_TASK_ID}.in" |
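Input files are not always numbered the way the sample command above expects. One common alternative, sketched here under stated assumptions, is to keep a text file with one input name per line and let each subjob pick the line matching its index. The helper name input_for_task and the file names are hypothetical. Outside of Slurm the variable SLURM_ARRAY_TASK_ID is unset, so the sketch falls back to 1 to stay runnable:

```shell
#!/bin/bash
# Outside of a Slurm array job SLURM_ARRAY_TASK_ID is unset; default to 1.
task_id="${SLURM_ARRAY_TASK_ID:-1}"

# input_for_task LIST_FILE INDEX: print the INDEX-th line of LIST_FILE.
input_for_task() {
    sed -n "${2}p" "$1"
}

# Build a small demonstration list (hypothetical file names).
list_file="$(mktemp)"
printf '%s\n' sample_a.in sample_b.in sample_c.in > "$list_file"

# Select this subjob's input and print the command it would run.
input_file="$(input_for_task "$list_file" "$task_id")"
echo "./sample_command ${input_file}"
rm -f "$list_file"
```

With this layout, the array range should match the number of lines in the list, for example 1-3 for the three names used here.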
Script Submission
Code Block | ||||
---|---|---|---|---|
| ||||
(ocelote) [netid@junonia ~]$ sbatch basic_array_job.slurm
Submitted batch job 73958 |
Output Files
Each of the subjobs in the array produces its own output file named slurm-<jobid>_<arrayid>.out. For the job submitted above, these are slurm-73958_1.out through slurm-73958_5.out.
For more information on naming Slurm output files, see our online documentation.
File Contents
Below is a concatenation of the job’s output files. Notice how the array indices function to differentiate the input files in the sample command:
Code Block | ||||
---|---|---|---|---|
| ||||
(ocelote) [netid@junonia ~]$ cat slurm-73958_* | grep sample
./sample_command input_file_1.in
./sample_command input_file_2.in
./sample_command input_file_3.in
./sample_command input_file_4.in
./sample_command input_file_5.in |
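Because every subjob writes its own file, a quick way to spot a subjob that never ran is to check that each expected file exists. The following is a minimal sketch, assuming the slurm-<jobid>_<taskid>.out names matched by the cat command above. The helper name missing_outputs is hypothetical, and the demonstration uses a scratch directory with empty placeholder files rather than real job output:

```shell
#!/bin/bash
# missing_outputs JOB_ID FIRST LAST [DIR]: print the name of every expected
# array output file that does not exist in DIR (default: current directory).
# Assumes the slurm-<jobid>_<taskid>.out naming used by the job above.
missing_outputs() {
    local job_id="$1" first="$2" last="$3" dir="${4:-.}"
    local task_id file
    for (( task_id = first; task_id <= last; task_id++ )); do
        file="slurm-${job_id}_${task_id}.out"
        [ -e "${dir}/${file}" ] || echo "$file"
    done
}

# Demonstration in a scratch directory: create outputs 1-4 and omit 5.
demo_dir="$(mktemp -d)"
for task_id in 1 2 3 4; do
    touch "${demo_dir}/slurm-73958_${task_id}.out"
done
missing_outputs 73958 1 5 "$demo_dir"   # prints slurm-73958_5.out
rm -rf "$demo_dir"
```

In the working directory of the example job, the equivalent call would be missing_outputs 73958 1 5, which prints nothing when all five files are present.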