Monitoring the Slurm system: nodes, partitions, jobs

Introduction

Slurm provides commands for obtaining information about nodes, partitions, jobs, and job steps at different levels of detail. These commands are sinfo, squeue, sstat, scontrol, and sacct. The output of sinfo and squeue can be formatted with the --format (-o) or --Format (-O) option and sorted with the --sort (-S) option; sacct and sstat also accept --format (-o). Man pages are available for all commands. Most options have both a short form and a long form (e.g. -o <output_format> and --format=<output_format>); for readability, the long form is recommended.

sinfo

Reports status information about nodes and partitions.

Syntax

sinfo [Options...]

Using command-line options, the output can be filtered, sorted, and formatted. See the sinfo man page (man sinfo) for more detailed information. You can also use --help or --usage for a short list of the command-line options, or visit the sinfo page on the SchedMD website. Below is a summary of the output format as well as node and partition states.

sinfo command common options

long form	short	Description
--Node	-N	Print information in a node-oriented format with one line per node. The default is to print information in a partition-oriented format.
--nodes	-n	Print information only about the specified node(s). Multiple nodes may be comma-separated or expressed using a node range expression.
--partition	-p	Print information only about the specified partition(s). Multiple partitions are separated by commas.
--long	-l	Print more detailed information.
--exact	-e	If set, do not group node information on multiple nodes unless their configurations to be reported are identical
--summarize	-s	List only a partition state summary with no node state details
--list-reasons	-R	List reasons nodes are in the down, drained, fail or failing state

Output format

sinfo command by default will display the following fields:

PARTITION AVAIL TIMELIMIT NODES STATE NODELIST

The table below describes these fields:

Header title	Description
PARTITION	Name of a partition. Default partition identified by "*" suffix.
AVAIL	Partition state: up or down
TIMELIMIT	Maximum time limit for any user job in days-hours:minutes:seconds
NODES	Count of nodes with this particular configuration
STATE	State of the nodes. Possible states include: allocated, completing, down, drained, draining, fail, failing, future, idle
NODELIST	Names of nodes associated with the configuration/partition.
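
The default fields can also be requested explicitly, which makes it possible to widen a column or change the sort order. The following is a minimal sketch of one way to do this; the function name partition_report and the column widths are illustrative choices, not part of Slurm. The format specifiers used are %P (partition), %a (availability), %l (time limit), %D (node count), %t (compact state), and %N (node list).

```shell
# Illustrative format string: same fields as the default sinfo output,
# with a wider partition column. Widths are arbitrary choices.
SINFO_FMT='%20P %.5a %.12l %.6D %.8t %N'

# Hypothetical wrapper: print the fields above, sorted by partition name
# (ascending) and then by node state (descending). Extra arguments such
# as --partition=<name> are passed through to sinfo unchanged.
partition_report() {
  sinfo --format="$SINFO_FMT" --sort=P,-t "$@"
}

# Usage on a cluster: partition_report --partition=cpu_pr3
```

Keeping the format string in a variable means the same layout can be reused in several scripts without retyping it.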

sinfo examples

Report basic node and partition configurations:
$ sinfo
PARTITION           AVAIL  TIMELIMIT   NODES  STATE  NODELIST
cpu_pr3             up     7-12:01:00      1  down*  hpc-pr3-07
cpu_pr3             up     7-12:01:00      7  idle   hpc-pr3-[01-06,08]
gpu_p100            up     7-12:01:00      1  idle   gpu-pr1-01
hagrid_batch        up     8-08:01:00      1  mix    hagrid03
hagrid_batch        up     8-08:01:00      7  idle   hagrid[01-02,04-08]
hagrid_interactive  up     4:01:00         1  idle   hagrid-storage
barrio1             up     infinite        1  idle   barrio1
cpu_mosaic_owner    up     7-04:01:00      1  mix    mosaic-21
cpu_mosaic_owner    up     7-04:01:00      3  idle   mosaic-[22-24]
... ...
Report partition summary information:
$ sinfo -s
PARTITION           AVAIL  TIMELIMIT   NODES(A/I/O/T)  NODELIST
cpu_pr3             up     7-12:01:00  0/7/1/8         hpc-pr3-[01-08]
gpu_p100            up     7-12:01:00  0/1/0/1         gpu-pr1-01
hagrid_batch        up     8-08:01:00  1/7/0/8         hagrid[01-08]
hagrid_interactive  up     4:01:00     0/1/0/1         hagrid-storage
barrio1             up     infinite    0/1/0/1         barrio1
cpu_mosaic_owner    up     7-04:01:00  1/3/0/4         mosaic-[21-24]
... ...
Report more complete information about a certain partition:
$ sinfo --long --partition=cpu_mosaic_guest
Fri Aug 26 12:49:44 2022
PARTITION         AVAIL  TIMELIMIT   JOB_SIZE  ...  NODES  STATE  NODELIST
cpu_mosaic_guest  up     7-04:01:00  1-2       ...      1  mixed  mosaic-21
cpu_mosaic_guest  up     7-04:01:00  1-2       ...      3  idle   mosaic-[22-24]
Report only those nodes that are in state idle:
$ sinfo --state=idle
PARTITION           AVAIL  TIMELIMIT   NODES  STATE  NODELIST
cpu_pr3             up     7-12:01:00      7  idle   hpc-pr3-[01-06,08]
gpu_p100            up     7-12:01:00      1  idle   gpu-pr1-01
hagrid_batch        up     8-08:01:00      7  idle   hagrid[01-02,04-08]
hagrid_interactive  up     4:01:00         1  idle   hagrid-storage
barrio1             up     infinite        1  idle   barrio1
cpu_mosaic_owner    up     7-04:01:00      3  idle   mosaic-[22-24]
... ...
Report node-oriented information with details and exact matches:
$ sinfo -Nel
Fri Aug 26 12:56:54 2022
NODELIST    NODES  PARTITION     STATE  CPUS  S:C:T   ...
barrio1         1  barrio1       idle      8  1:4:2   ...
gpu-pr1-01      1  gpu_p100      idle     28  2:14:1  ...
gpu-pr1-02      1  gpu_a100      mixed    64  2:32:1  ...
hagrid01        1  hagrid_batch  idle     20  2:10:1  ...
hagrid02        1  hagrid_batch  idle     20  2:10:1  ...
hagrid03        1  hagrid_batch  mixed    20  2:10:1  ...
hagrid04        1  hagrid_batch  idle     20  2:10:1  ...
hagrid05        1  hagrid_batch  idle     20  2:10:1  ...
hagrid06        1  hagrid_batch  idle     20  2:10:1  ...
... ...
Report only down, drained and draining nodes and their reason field:
$ sinfo -R
REASON                USER   TIMESTAMP            NODELIST
Not responding        root   2020-06-20T06:49:16  hpc-pr3-[01-02]
Not responding        root   2020-06-20T06:49:17  hpc-pr3-[03-04]
Hardware failure,ETA  slurm  2020-07-20T06:25:05  hpc-pr3-07,mosaic-05
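
In the partition summary, the NODES(A/I/O/T) column packs four counts into one field: allocated, idle, other, and total. The sketch below shows one possible way to pull the idle and total counts out of that field with awk. The function name idle_per_partition is hypothetical, and the code assumes the default summary layout, in which the counts are the fourth whitespace-separated column.

```shell
# Hypothetical filter: reads `sinfo -s` output on stdin and prints
# "<partition> <idle>/<total>" for each partition line.
# Lines whose fourth column is not of the form A/I/O/T (the header,
# elided lines) are skipped. The "*" suffix that marks the default
# partition is stripped from the name.
idle_per_partition() {
  awk '$4 ~ /^[0-9]+\/[0-9]+\/[0-9]+\/[0-9]+$/ {
         split($4, n, "/")
         p = $1
         sub(/\*$/, "", p)
         printf "%s %d/%d\n", p, n[2], n[4]
       }'
}

# Usage on a cluster: sinfo -s | idle_per_partition
```

Applied to the summary output shown above, the cpu_pr3 line would yield "cpu_pr3 7/8".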

squeue

Use the squeue command to get a high-level overview of all active (running and pending) jobs in the cluster.

Syntax

$ squeue [options]

These commonly-used options filter the output of the squeue command.

option	Description
--user=user_list	Request job data for a user or a comma-separated list of users.
--jobs=job_list	Request data based on specific job_list. job_list can be a single job ID or a comma-separated list of job IDs
--partition=part_list	Get information on jobs running on a partition or a comma-separated list of partitions
--states=state_list	Display data on jobs in specific states. state_list can be a single state, "all", or a comma-separated list of states. Default: "PD,R,CG"

squeue command output table header

The headers of the squeue command's default output are:

Header title	Description
JOBID	Job or step ID. For array jobs, the job ID format will be of the form <job_id>_<index>
PARTITION	Partition of the job/step
NAME	Name of the job/step
USER	Owner of the job/step
ST	State of the job/step. See below for a description of the most common states
TIME	Time used by the job/step. Format is days-hours:minutes:seconds
NODES	Number of nodes allocated to the job or the minimum number of nodes required by a pending job
NODELIST(REASON)	For pending and failed jobs, this field displays the reason for pending or failure. Otherwise, this field shows a list of allocated nodes. See below for a list of the most common reason codes

You can easily tailor the output format of squeue to your own needs using the --format (-o) or --Format (-O) options. See the man page for more information: man squeue
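
As an illustration of a tailored layout, the sketch below wraps squeue in a small function. The function name my_jobs and the column widths are hypothetical choices. The specifiers are %i (job ID), %P (partition), %j (job name), %u (user), %T (state in long form), %M (time used), %D (node count), and %R (node list or reason).

```shell
# Illustrative format string: the default squeue columns, but with a
# wider job name and the state spelled out (%T) instead of abbreviated.
SQUEUE_FMT='%.10i %.12P %.20j %.8u %.10T %.10M %.5D %R'

# Hypothetical wrapper: show the jobs of one user (default: the caller),
# highest priority first, then by job ID.
my_jobs() {
  squeue --user="${1:-$USER}" --sort=-p,i --format="$SQUEUE_FMT"
}

# Usage on a cluster: my_jobs jsmith
```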

Job States

During its lifetime, a job passes through several states. The most common states are PENDING, RUNNING, SUSPENDED, COMPLETING, and COMPLETED.

State symbol	Description
PD	Pending. Job is waiting for resource allocation
R	Running. Job has an allocation and is running
S	Suspended. Execution has been suspended and resources have been released for other jobs
CA	Cancelled. Job was explicitly cancelled by the user or the system administrator
CG	Completing. Job is in the process of completing. Some processes on some nodes may still be active
CD	Completed. Job has terminated all processes on all nodes with an exit code of zero
F	Failed. Job has terminated with non-zero exit code or other failure condition

REASON column

The REASON column of the squeue output gives you a hint why your job is not running.

Reason code	Description
AssociationJobLimit	The job's association has reached its maximum job count.
AssociationResourceLimit	The job's association has reached some resource limit.
AssociationTimeLimit	The job's association has reached its time limit.
Dependency	This job is waiting for a dependent job to complete.
InvalidAccount	The job's account is invalid.
JobLaunchFailure	The job could not be launched. This may be due to a file system problem, invalid program name, etc.
NodeDown	A node required by the job is down.
NonZeroExitCode	The job terminated with a non-zero exit code.
PartitionDown	The partition required by this job is in a DOWN state.
PartitionNodeLimit	The number of nodes required by this job is outside of its partition's current limits.
PartitionTimeLimit	The job's time limit exceeds its partition's current time limit.
Priority	One or more higher priority jobs exist for this partition or advanced reservation.
Resources	The job is waiting for resources to become available.
SystemFailure	Failure of the Slurm system, a file system, the network, etc.
TimeLimit	The job exhausted its time limit.
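
To see which reasons dominate the queue, the reason codes of all pending jobs can be tallied. The following is a rough sketch, assuming the %r format specifier (the reason a job is in its current state) and the --noheader option of squeue; the function name pending_reasons is hypothetical.

```shell
# Hypothetical helper: count pending jobs per reason code, most frequent
# first. Extra arguments (e.g. --partition=<name>) are passed to squeue.
pending_reasons() {
  squeue --noheader --states=PENDING --format=%r "$@" |
    sort | uniq -c | sort -k1,1nr -k2,2 |
    # Normalize the padded "count reason" lines produced by uniq -c.
    awk '{ n = $1; $1 = ""; sub(/^ /, ""); print n, $0 }'
}

# Usage on a cluster: pending_reasons --partition=cpu_pr3
```

Each output line consists of a count followed by the reason code, for example "2 Priority".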

squeue command examples

List all pending and running jobs of user jsmith:
squeue --user=jsmith --states=PD,R
List all currently running jobs of user jsmith in partition cpu_pr3:
squeue --user=jsmith --partition=cpu_pr3 --states=R
Print the job steps in the gpu_p100 partition sorted by user:
squeue -s -p gpu_p100 -S u
Print information only about jobs 12345, 12346, and 12348:
squeue --jobs 12345,12346,12348
Print information only about job step 65552.1:
squeue --steps 65552.1

scontrol

Use the scontrol show command to collect detailed information about nodes, partitions, jobs, and job steps.

Syntax

scontrol [options] show entity=entityID (or entity entityID)

Some useful options are --all (-a), --details (-d), and --verbose (-v). Examples of entities are node, partition, job, and step.

Examples

Show detailed information about job with ID 500:
scontrol --details show job=500
Show even more detailed information about job with ID 500 (including the jobscript):
scontrol -dd show job 500
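
The output of scontrol show consists of Key=Value pairs, so a single field can be extracted in a script. The sketch below is one possible approach, using the --oneliner (-o) option to print each record on one line. The function name job_field is hypothetical, and the code assumes the requested value contains no spaces, which holds for fields such as JobState or NodeList but not necessarily for paths.

```shell
# Hypothetical helper: print the value of one field of a job record.
#   $1 = job ID, $2 = field name (e.g. JobState, NodeList)
# Only the first match is printed. Values containing spaces are not
# handled, because the record is split on spaces.
job_field() {
  scontrol --oneliner show job "$1" |
    tr ' ' '\n' |
    awk -F= -v key="$2" '$1 == key { print substr($0, length(key) + 2); exit }'
}

# Usage on a cluster: job_field 500 JobState
```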

sacct

Displays accounting data for all jobs and job steps in the Slurm job accounting log or Slurm database.

Syntax

sacct [options]

Common options

option	description
--endtime=end_time	Select jobs in any state before the specified time.
--starttime=start_time	Select jobs in any state after the specified time.
--state=state_list	Select jobs based on their state during the time period given.

By default, sacct reports jobs owned by the current user in any state after 00:00:00 of the current day. To select older jobs, you must specify a start time with the --starttime option. When --state is specified and no start time is given, the start and end times both default to the current time ('now'), and hence only currently running jobs will be displayed.
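
Because of this default, a state filter is usually combined with an explicit start time. The following sketch shows one way to do so; the function name failed_since and the selection of output fields are illustrative choices. FAILED and TIMEOUT are sacct job states, and the fields passed to --format are standard sacct field names.

```shell
# Hypothetical helper: list the current user's failed or timed-out jobs
# since a given date.
#   $1 = start time, e.g. 2022-08-01 (YYYY-MM-DD)
failed_since() {
  sacct --starttime="$1" \
        --state=FAILED,TIMEOUT \
        --format=JobID,JobName,Partition,State,Elapsed,ExitCode
}

# Usage on a cluster: failed_since 2022-08-01
```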
