Job submission examples

Job types

This page describes how to submit batch jobs based on job type. Job types can be classified as sequential jobs, shared memory parallel jobs, distributed memory parallel jobs, GPU jobs, and job arrays. Interactive jobs can be submitted using the srun and salloc commands; batch jobs are submitted using sbatch.

Sequential Jobs

The srun command is not required to launch a simple single-step job, such as the hostname example below.

#!/bin/bash

#SBATCH --mail-user=<user_id>@uwaterloo.ca
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --job-name="hostname_name"
#SBATCH --partition=cpu_mosaic_guest
#SBATCH --time=00:00:05
#SBATCH --mem=1GB
### %x replaced by job-name, and
### %j replaced by job-id
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err

# Your code below this line
echo -n "I'm on host: "
hostname
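
To submit this script, save it to a file (hostname_job.sh below is only an illustrative placeholder name) and pass it to sbatch. You can then monitor the job with squeue and, once it finishes, inspect the output file named according to the %x-%j pattern set above:

# submit the batch script; Slurm replies with "Submitted batch job <job_id>"
sbatch hostname_job.sh
# list your pending and running jobs
squeue -u $USER
# after the job completes, view the output file
cat hostname_name-<job_id>.out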

Parallel Jobs

Shared Memory Jobs (e.g. OpenMP)

SMP parallelization is based on dynamically created threads (the fork-join model) that share memory on a single node.

  • specify N parallel threads with --cpus-per-task=N
  • OpenMP is not Slurm-aware, so you must also set the environment variable OMP_NUM_THREADS
  • $OMP_NUM_THREADS must equal the value of --cpus-per-task

For example, assuming you have an OpenMP program called openmp_test:

#!/bin/bash
#SBATCH --mail-user=<user_id>@uwaterloo.ca
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --job-name="openmp_job"
#SBATCH --partition=cpu_mosaic_guest
#SBATCH --time=00:01:00
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-cpu=2GB
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err

# Your code below this line
#
# to prevent the program from spawning more threads than requested,
# set OMP_NUM_THREADS to the value of --cpus-per-task;
# if the --cpus-per-task option was not set, fall back to 1
if [ -n "$SLURM_CPUS_PER_TASK" ]; then
    export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
else
    # --cpus-per-task was not set, so use a single thread
    export OMP_NUM_THREADS=1
fi

# run the command without srun
./openmp_test

Setting the correct number of threads is crucial for proper resource management, notably to prevent oversubscribing the compute node. The assignment OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK ensures that your program does not spawn more threads than the number of CPUs requested.
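
As a quick sanity check, you can print both values just before launching the program. This is a minimal sketch; the echo lines are illustrative additions rather than part of the example script above:

# report the thread count OpenMP will use and the CPUs granted by Slurm
echo "OMP_NUM_THREADS     = $OMP_NUM_THREADS"
echo "SLURM_CPUS_PER_TASK = ${SLURM_CPUS_PER_TASK:-unset}"
./openmp_test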

MPI Jobs (e.g. Open MPI)

Message Passing Interface (MPI) parallelization allows applications to run as parallel processes that communicate by passing messages. Since they do not rely on shared memory, these processes can be distributed across several discrete compute nodes. Slurm provides job options that control how MPI processes are distributed among nodes:

  • To control only the total number of tasks, use:
    #SBATCH --ntasks=N
  • To control instead how tasks are distributed among nodes, use the following two lines:
    #SBATCH --nodes=N
    #SBATCH --ntasks-per-node=M

For example:

#!/bin/bash
#SBATCH --mail-user=<user_id>@uwaterloo.ca
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --job-name="mpi_job"
#SBATCH --partition=cpu_mosaic_guest
#SBATCH --time=00:01:00
#===============================================
# use either --ntasks alone, e.g. --ntasks=40, or
# use --nodes and --ntasks-per-node, shown below
#==============================================
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=10
#SBATCH --mem-per-cpu=1GB
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err

# Your code below this line
# Load the Open MPI environment module.
module load mpi/openmpi
#
# run the parallel job
mpirun ./mpi-test
# or, srun --mpi=pmi2 ./mpi-test
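
To confirm how the 40 tasks were actually spread across the 4 nodes, one simple check is to run hostname as a job step and count how many tasks landed on each node. This is a sketch that you could add to the job script after the module load line:

# launch one copy of hostname per task and count occurrences per node
srun hostname | sort | uniq -c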

GPU jobs

Each partition's machines use different GPU devices. Therefore, when selecting a partition you also have to identify its corresponding --gres resource. The table below lists the GPU partitions with their allowed --gres resources.

Partition name         Allowed --gres
gpu_a100               --gres=gpu:a100:1 or --gres=gpu:a100_80G:1
gpu_mosaic_owner       --gres=gpu:k20:1
gpu_k20_mosaic_guest   --gres=gpu:k20:1
gpu_p100               --gres=gpu:p100:1
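
If you are unsure which GPU types a partition actually advertises, you can query Slurm directly. This is a sketch; the exact output format depends on the site configuration:

# show the partition, its generic resources (GPUs), and its node list
sinfo -p gpu_p100 -o "%P %G %N"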

The example below uses the gpu_p100 partition and its corresponding --gres=gpu:p100:1, with the nvidia-smi command as the program.

#!/bin/bash

#SBATCH --mail-user=<user_id>@uwaterloo.ca
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --job-name="gpu_test"
#SBATCH --partition=gpu_p100
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --time=12:00:00
#SBATCH --mem-per-cpu=2000
#SBATCH --gres=gpu:p100:1
#SBATCH --output=%x_%j.out
#SBATCH --error=%x_%j.err

echo "Cuda device: $CUDA_VISIBLE_DEVICES"
echo "======= Start memory test ======="

nvidia-smi
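
Once the job is running, you can confirm which GPU was allocated and watch the results as they are written. This is a sketch; <job_id> stands for the ID printed by sbatch, and the output file name follows the %x_%j pattern set above:

# show the job's full record, including the allocated GRES (GPU)
scontrol show job <job_id>
# follow the nvidia-smi output as the job writes it
tail -f gpu_test_<job_id>.out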

Interactive jobs

Interactive jobs let you run commands on a compute node as if you were on an interactive node. Interactive jobs, or sessions, are useful for jobs that require direct user input. Examples include:

  • Compiling your code, especially when the compute node architecture differs from the headnode architecture. For example, it is best to compile CUDA code on a GPU machine whose architecture is similar to that of the target GPU device.
  • Testing and debugging code.
  • Running applications with a graphical user interface (GUI), such as X Windows applications.

To launch an interactive job, use srun with the --pty option. The basic form of this command is:
srun --pty bash -i

This behaves like a terminal session with an interactive bash shell. With no resources explicitly specified, the job runs under the default Slurm settings: default account, default partition, and default resource allocations such as the number of CPUs and memory size.
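
For example, a CPU-only interactive session with explicit limits could look like the following sketch; it reuses the cpu_mosaic_guest partition from the batch examples above, and the CPU, memory, and time values are only illustrative:

srun --partition=cpu_mosaic_guest --cpus-per-task=4 \
      --mem=8GB --time=01:00:00 --pty bash -i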

Using srun command line options, you can request any resources that are available to you. The following example requests an interactive session using one GPU in the gpu_p100 partition under the normal account:
srun --gres=gpu:p100:1 --partition=gpu_p100 \
      --account=normal --pty /bin/bash -i

You may not get an interactive session immediately. Remember that the srun command submits the job to a queue; you will get an interactive session on a compute node as soon as the requested resources become available.

Interactive jobs with X-forwarding

If you want to run an X11/GUI application on one of the Slurm cluster's compute nodes, you first need to enable ssh X11 forwarding between your desktop machine and the headnode machine (rsubmit.math). Use ssh's -Y flag as shown below:

ssh -Y userID@rsubmit.math.private.uwaterloo.ca

Then, from the headnode, run an interactive job (as described above) with the --x11 flag.

srun --gres=gpu:p100:1 --partition=gpu_p100 \
      --account=normal --x11 --pty /bin/bash -i
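
Once the interactive shell opens on the compute node, you can verify that X forwarding works by launching a simple X client. This assumes a basic client such as xclock is installed on the compute node; a window should appear on your desktop:

xclock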