This example shows the workflow for copying your source code to the head node, logging in to the head node, compiling the code, creating a Slurm batch script, submitting the job, checking its status, and copying the results back.
You could also develop your code directly on the head node, which would make the first step unnecessary.
In the example, <user_id> represents your UW user ID.
copy files
Copy the local source file (my_test_mpi.c) to the /work/<user_id>/demo directory on the cluster.
% scp my_test_mpi.c <user_id>@rsubmit.math.private.uwaterloo.ca:/work/<user_id>/demo/
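If the /work/<user_id>/demo directory does not exist yet, create it first, since scp will not create it for you:
% ssh <user_id>@rsubmit.math.private.uwaterloo.ca 'mkdir -p /work/<user_id>/demo'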
log in to head node
% ssh <user_id>@rsubmit.math.private.uwaterloo.ca
Use your WatIAM credentials to log in.
compile your code
Compile the MPI source code using the mpicc command.
% cd /work/<user_id>/demo
% mpicc my_test_mpi.c -o my_mpi_test
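If mpicc is not found, loading the same MPI module that the batch script below uses (module load mpi/openmpi-uw) should provide it; whether this is needed depends on the head node's default environment.
Here my_test_mpi.c stands in for your own MPI program. If you just want to test the workflow, a minimal hello-world such as the following sketch will do (it is an illustration, not a site-provided file):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* rank of this task */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of tasks */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();                        /* shut down the MPI runtime */
    return 0;
}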
create Slurm batch script
An example script (my_mpi_job.sh) is:
#!/bin/bash
# Set Slurm job characteristics
#SBATCH --mail-user=<user_id>@uwaterloo.ca
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --job-name="my_mpi_job"
#SBATCH --partition=cpu_mosaic_guest
#SBATCH --account=normal
#SBATCH --time=00:01:00
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=6
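## 4 nodes x 6 tasks per node = 24 MPI tasks in total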
#SBATCH --mem-per-cpu=1GB
## %x for job-name and %j for job-id
#SBATCH --output=%x-%j.out
#SBATCH --error=%x-%j.err
# Your code below this line
# First set the environment for using Open MPI
module load mpi/openmpi-uw
# Run the executable using the srun command
srun --mpi=pmi2 ./my_mpi_test
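Here srun launches one instance of my_mpi_test per requested task (24 in this example), and --mpi=pmi2 tells Slurm to expose the PMI-2 interface, which Open MPI uses to coordinate the tasks across nodes.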
submit Slurm job
% sbatch my_mpi_job.sh
The output will look like
Submitted batch job 655
where 655 is the job ID.
check job status
To check job status, use the squeue command. For example, to check the status of the job above, run
% squeue -j 655
JOBID  PARTITION         NAME        USER       ST  TIME  NODES ...
655    cpu_mosaic_guest  my_mpi_job  <user_id>  R   1:11  1     ...
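The ST column shows the job state (for example, R = running, PD = pending). To list all of your jobs instead of a single job ID, run
% squeue -u <user_id>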
copy results back
% scp my_mpi_job-655.out <user_id>@linux.math.uwaterloo.ca:/u/<user_id>/
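The job's standard error is written to my_mpi_job-655.err (from the --error pattern in the batch script) and can be copied back the same way.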