Workflow example

This example walks through a complete workflow: copying your source code to the head node, logging in to the head node, compiling the code, creating a Slurm batch script, submitting the job, checking its status, and copying the results back.

You could also develop your code directly on the head node, which makes the first step unnecessary.

In the example, <user_id> represents your UW userid.

copy files

Copy the local source file (my_test_mpi.c) to the /work/<user_id>/demo directory on the cluster.

% scp my_test_mpi.c <user_id>@rsubmit.math.private.uwaterloo.ca:/work/<user_id>/demo/

log in to headnode

% ssh <user_id>@rsubmit.math.private.uwaterloo.ca

Use your WatIAM credentials to log in.

compile your code

Compile the MPI source code using the mpicc command.

% cd /work/<user_id>/demo

% mpicc my_test_mpi.c -o my_mpi_test

create Slurm batch script

An example script (my_mpi_job.sh):

#!/bin/bash
# Set Slurm job characteristics
#SBATCH --mail-user=<user_id>@uwaterloo.ca
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --job-name="my_mpi_job"
#SBATCH --partition=cpu_mosaic_guest
#SBATCH --account=normal
#SBATCH --time=00:01:00
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=6
#SBATCH --mem-per-cpu=1GB
## %x for job-name and %j for job-id
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err

# Your code below this line
# First set the environment for using Open MPI
module load mpi/openmpi-uw

# Run the executable using the srun command
srun --mpi=pmi2 ./my_mpi_test
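As a quick check of what the script above requests, the totals can be worked out from the #SBATCH values. The sketch below assumes one CPU per task (the script has no --cpus-per-task line, and one is the Slurm default); the variable names are illustrative only.

```shell
# Values copied from the #SBATCH lines in my_mpi_job.sh
nodes=4
ntasks_per_node=6
mem_per_cpu_gb=1

# Total MPI ranks that srun would launch across all nodes
total_tasks=$((nodes * ntasks_per_node))

# Total memory requested, assuming one CPU per task
total_mem_gb=$((total_tasks * mem_per_cpu_gb))

echo "MPI ranks: $total_tasks, memory requested: ${total_mem_gb} GB"
```

With these values the job asks for 24 MPI ranks and, under the same assumption, 24 GB of memory in total.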

submit Slurm job

% sbatch my_mpi_job.sh

The output would be

Submitted batch job 655

where 655 is the job ID.
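If later commands need the job ID, it can be captured in a shell variable instead of being copied by hand. This is a minimal sketch, assuming the output has the "Submitted batch job <id>" form shown above; parse_job_id is a hypothetical helper name, and the sample line stands in for a real sbatch call.

```shell
# Extract the job ID (fourth word) from a "Submitted batch job <id>" line on stdin.
parse_job_id() {
    awk '/^Submitted batch job/ { print $4 }'
}

# On the cluster this would be: job_id=$(sbatch my_mpi_job.sh | parse_job_id)
# Here a sample line is used so the snippet runs anywhere.
job_id=$(echo "Submitted batch job 655" | parse_job_id)
echo "$job_id"
```

sbatch also has a --parsable option that prints the job ID without the surrounding text, which may be simpler when scripting.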

check job status

To check job status, use the squeue command. For example, to check the status of the job submitted above, run

% squeue --jobs=655
JOBID PARTITION NAME USER ST TIME NODES ...
655 cpu_mosaic_guest my_mpi_job <user_id> R 1:11 1 ...
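The ST column holds a short job state code. The sketch below reads it from one line of squeue output, assuming the default column order shown above and a job name without spaces. describe_state is a hypothetical helper, and it covers only a few common Slurm state codes.

```shell
# Translate a Slurm short state code into a readable word.
describe_state() {
    case "$1" in
        PD) echo "pending" ;;
        R)  echo "running" ;;
        CG) echo "completing" ;;
        CD) echo "completed" ;;
        F)  echo "failed" ;;
        TO) echo "timed out" ;;
        CA) echo "cancelled" ;;
        *)  echo "unknown ($1)" ;;
    esac
}

# Sample line taken from the squeue output above; ST is the fifth column.
line='655 cpu_mosaic_guest my_mpi_job <user_id> R 1:11 1'
state=$(echo "$line" | awk '{ print $5 }')
echo "Job is $(describe_state "$state")"
```

For the sample line this reports that the job is running.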

copy results back

% scp my_mpi_job-655.out linux.math.uwaterloo.ca:/u/<user_id>/
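The file name in the command above follows from the --output=%x-%j.out pattern in the batch script: %x expands to the job name and %j to the job ID. The sketch below, with illustrative variable names, builds both the output and error file names and prints the matching scp command rather than running it.

```shell
# Job name and ID as used in this example
job_name="my_mpi_job"
job_id=655

# File names produced by --output=%x-%j.out and --error=%x-%j.err
out_file="${job_name}-${job_id}.out"
err_file="${job_name}-${job_id}.err"

# Print the command instead of running it; replace <user_id> and drop "echo" to copy.
echo scp "$out_file" "$err_file" 'linux.math.uwaterloo.ca:/u/<user_id>/'
```

Copying the .err file along with the .out file is useful when a job fails, since error messages are written there.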