Putting it all together
Slurm offers a wide range of controls for defining and enforcing resource limits, but we use only two:
- partitions: set per-job limits such as maximum runtime and maximum number of nodes
- accounts: set per-user limits such as maximum number of jobs, CPUs, memory, etc.
As an example, the table below shows all combinations of resource limits for two partitions (gpu_a100, cpu_pr3) and one account (normal account).
- Use the --account and --partition Slurm command options to select the Slurm account and partition, respectively. It is crucial to know the available resources before submitting your Slurm job: if the job exceeds the limits set by the selected account and partition, the submission will fail.
| Account | gpu_a100 | cpu_pr3 |
|---|---|---|
| normal | | |
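
As a minimal sketch of how the account and partition selection might look in practice, the snippet below writes a small batch script and submits it only if `sbatch` is available. It assumes the account is literally named `normal` and uses the `gpu_a100` partition from the example; the file name, job name, node count, and time limit are placeholder values chosen for illustration, not limits documented here.

```shell
#!/bin/bash
# Sketch: write a minimal batch script that selects an account and a partition.
# "normal" and "gpu_a100" are taken from the example in this section;
# job name, node count, and time limit are illustrative placeholders.
cat > example_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=limits_demo
#SBATCH --account=normal
#SBATCH --partition=gpu_a100
#SBATCH --nodes=1
#SBATCH --time=00:10:00

srun hostname
EOF

# Submit only where Slurm is installed; otherwise just print the script.
if command -v sbatch >/dev/null 2>&1; then
    sbatch example_job.sh
else
    cat example_job.sh
fi
```

The same options can also be passed on the command line, for example `sbatch --account=normal --partition=gpu_a100 example_job.sh`. Before submitting, `scontrol show partition gpu_a100` is one way to inspect the limits configured for a partition.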