Managing jobs with Slurm - teaching

Introduction

This page describes how to use Slurm commands to manage jobs on Slurm-managed clusters. A workload manager schedules jobs based on the availability of resources to meet specified needs for CPU, memory, run time, etc. Slurm is the same workload manager used by the Digital Research Alliance of Canada (formerly Compute Canada / SHARCNET).
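As a rough illustration of how such resource needs are stated, the following is a minimal sketch of a batch script. The #SBATCH options used (--job-name, --ntasks, --cpus-per-task, --mem, --time) are standard sbatch options, but the job name and all values are assumptions chosen for illustration; suitable limits depend on the cluster.

```shell
#!/bin/bash
#SBATCH --job-name=example     # hypothetical job name
#SBATCH --ntasks=1             # one task (process)
#SBATCH --cpus-per-task=2      # CPU cores for that task (illustrative value)
#SBATCH --mem=4G               # memory per node (illustrative value)
#SBATCH --time=00:30:00        # run-time limit, HH:MM:SS (illustrative value)

# Slurm sets these variables inside a job; the defaults after ':-' only
# apply when the script is run outside Slurm, where #SBATCH lines are
# plain comments.
job_id="${SLURM_JOB_ID:-none}"
cpus="${SLURM_CPUS_PER_TASK:-1}"

echo "job=${job_id} cpus=${cpus} host=${HOSTNAME:-unknown}"
```

Assuming the script is saved as example.sh (a hypothetical file name), it would be submitted with: sbatch example.sh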

Slurm commands are described in three groups: job submission, monitoring, and control. Some commands, such as scontrol, span more than one group and can be used for monitoring, configuration, and control.

For more help on any Slurm command, see its man page. The command-line options --help and --usage also print a summary of the available options. All command-line options are case sensitive.
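The help options can be tried as in the sketch below, which uses sbatch as the example command; any other Slurm command could be substituted. The check for whether the command is installed is an addition for illustration, so that the snippet fails gracefully on a machine without Slurm.

```shell
cmd=sbatch   # example command; squeue, scancel, etc. work the same way

if command -v "$cmd" >/dev/null 2>&1; then
    "$cmd" --usage   # brief summary of the command syntax
    "$cmd" --help    # short description of each option
    # man "$cmd"     # full manual page (interactive, opens a pager)
else
    echo "$cmd is not installed on this machine" >&2
fi

# Case matters: for sbatch, -n is short for --ntasks, while -N is short for --nodes.
```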

Slurm does not automatically copy executable or data files to the compute nodes allocated to a job. The files must be present either on a local disk or in some network-mounted file system. Another option is to use the sbcast command to transfer files to local storage on the allocated compute nodes.
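A minimal sketch of the sbcast approach inside a batch script is shown below. The program name my_program, the destination under /tmp, and the node count are assumptions for illustration; whether /tmp is node-local storage depends on the cluster. The check on SLURM_JOB_ID is added only so the script does nothing harmful when run outside a job.

```shell
#!/bin/bash
#SBATCH --nodes=2              # illustrative node count
#SBATCH --time=00:10:00        # illustrative run-time limit

dest="/tmp/my_program"         # hypothetical path on each node's local disk

if [ -n "${SLURM_JOB_ID:-}" ]; then
    # Copy the (hypothetical) executable from the submission directory
    # to local storage on every node allocated to this job.
    sbcast my_program "$dest"
    # Run the local copy on the allocated nodes.
    srun "$dest"
else
    echo "Not inside a Slurm job; submit this script with sbatch." >&2
fi
```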

Procedures

Job submission commands

Monitoring commands

Controlling and job signalling commands