Slurm accounts

Slurm accounts are different from login accounts (Nexus/WatIAM).

  • Accounts in Slurm track resource utilization so that Slurm can enforce limits on particular users or groups of users.
  • Slurm accounts, together with Slurm partitions and Slurm QoS objects, control and limit access to cluster resources.
  • To use Slurm, your Nexus/WatIAM user ID must be associated with one or more Slurm accounts.
  • Normal account: applies to all Slurm-managed resources except privately owned resources that have dedicated accounts. The normal account is the default, so Slurm selects it automatically and you do not need to pass the --account=normal option.
  • Dedicated accounts: these apply to privately owned resources. To select one of these accounts, use the --account=<account name> option (e.g. --account=hagrid).
    • Use the account name hagrid for the Hagrid cluster. To access Hagrid, both the --account=hagrid and --partition=hagrid_batch options have to be passed to sbatch/srun. In an sbatch script, that means including the following two options:
      
      #SBATCH --account=hagrid
      #SBATCH --partition=hagrid_batch
      
    • Use the account name barrio1 for the barrio1 machine. To access barrio1, use the following options:
      
      #SBATCH --account=barrio1
      #SBATCH --partition=barrio1 
      
    • Use the account name mosaic_owners if you are one of the owners (or a student of one of the owners) of the Mosaic cluster. Combine this account with the cpu_mosaic_owner partition (Mosaic CPU cluster) or the gpu_k20_mosaic_owner partition (Mosaic GPU cluster) to get high-priority access to the Mosaic CPU/GPU machines. For example, to get high-priority access to the Mosaic GPU machines, use the following options in your sbatch script:
      
      #SBATCH --account=mosaic_owners
      #SBATCH --partition=gpu_k20_mosaic_owner              
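
The account/partition pairs listed above can also be passed on the sbatch command line instead of as #SBATCH lines (command-line options take precedence over directives in the script). The following is only a sketch: slurm_account_flags is a hypothetical helper, not part of Slurm, and the short resource names mosaic_cpu and mosaic_gpu are labels invented here for illustration.

```shell
#!/bin/bash
# Sketch: map a resource name to the --account/--partition pair listed above.
# slurm_account_flags is a hypothetical helper, not a Slurm command.
slurm_account_flags() {
    case "$1" in
        hagrid)     echo "--account=hagrid --partition=hagrid_batch" ;;
        barrio1)    echo "--account=barrio1 --partition=barrio1" ;;
        mosaic_cpu) echo "--account=mosaic_owners --partition=cpu_mosaic_owner" ;;
        mosaic_gpu) echo "--account=mosaic_owners --partition=gpu_k20_mosaic_owner" ;;
        normal)     echo "--account=normal" ;;  # default account, so this flag is optional
        *)          echo "unknown resource: $1" >&2; return 1 ;;
    esac
}

# Usage (job.sh is a placeholder for your own batch script):
#   sbatch $(slurm_account_flags mosaic_gpu) job.sh
```

Keeping the pairs in one place avoids submitting with an account that does not match the partition.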
      

Slurm quality of service (QoS) objects are used to set resource limits, job priority and job preemption rules. The resource limits of a Slurm account are set by associating the account with one of the QoS objects. The table below shows the currently defined accounts and their respective resource limits.

Account Name    QoS Name        Max submit jobs   CPUs   Mem     GPUs   GPU types   Default
normal          normal          120               60     120GB   0      -           YES
hagrid          qos_hagrid      200               120    500GB   0      -           NO
barrio1         qos_barrio1     16                8      60GB    1      gpu:p100    NO
mosaic_owners   qos_mosaic_hi   50                200    500GB   5      gpu:k20     NO
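
As an illustration of staying within these limits, here is a sketch of a batch script for the barrio1 account. The GRES name gpu:p100 is assumed from the GPU types listed for barrio1, and the CPU, memory and wall-time values are arbitrary examples chosen to fit under the limits; report_job is a hypothetical helper name.

```shell
#!/bin/bash
#SBATCH --account=barrio1
#SBATCH --partition=barrio1
#SBATCH --gres=gpu:p100:1       # assumed GRES name, taken from the GPU types entry
#SBATCH --cpus-per-task=4       # within the 8-CPU limit
#SBATCH --mem=16G               # within the 60GB limit
#SBATCH --time=00:10:00         # illustrative wall time; no time limit is stated above

# Report where the job landed. Slurm sets SLURM_JOB_ACCOUNT and
# SLURM_JOB_PARTITION for batch jobs; the fallbacks cover runs outside Slurm.
report_job() {
    echo "account=${SLURM_JOB_ACCOUNT:-unset} partition=${SLURM_JOB_PARTITION:-unset}"
}

report_job
```

Submitted with sbatch, the job's output file should show the account and partition the job was charged to, which is a quick way to confirm the options took effect.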

Commands for Slurm account info

The resource limits of a Slurm account are set through a Slurm QoS object. To find the limits for an account, you therefore need the name of the QoS associated with it. You can get that name from the table above or with the Slurm sacctmgr command. For example, the table shows that the "normal" account is associated with the "normal" QoS, so to see the limits of the "normal" account run the following command:


sacctmgr show qos normal format=Name%15,MaxJobsPU,MaxSubmitjobsPU,MaxTresPU%40
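
If you would rather not read the QoS name off the table, the lookup can be scripted. The sketch below assumes that sacctmgr is on your PATH and that you are allowed to query your own associations; user_qos_names and show_qos_limits are hypothetical helper names.

```shell
#!/bin/bash
# Sketch: find the QoS names attached to a user's Slurm associations,
# then print the per-user limits of each QoS.

# List the QoS names for a user, one per line.
# -n drops the header line, -P gives plain "|"-delimited output.
user_qos_names() {
    local user="${1:-$USER}"
    sacctmgr -nP show associations user="$user" format=QOS \
        | tr ',' '\n' | sed '/^$/d' | sort -u
}

# Show the limits of one QoS, using the same columns as the command above.
show_qos_limits() {
    sacctmgr show qos "$1" format=Name%15,MaxJobsPU,MaxSubmitjobsPU,MaxTresPU%40
}

# Usage: print the limits of every QoS associated with the current user.
#   for q in $(user_qos_names); do show_qos_limits "$q"; done
```

An association can carry several comma-separated QoS names, which is why the first helper splits on commas and removes duplicates.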