Teaching GPU cluster Slurm partitions

Partitions in Slurm are a resource abstraction. A partition configuration defines resource limits and access controls for a group of nodes. Slurm allocates resources to a job within the selected partition by weighing the resources you request for the job against the partition's available resources and restrictions.

MFCF will adjust partition configurations as we observe usage patterns.

There are three partitions in the teaching GPU cluster: gpu-gen, gpu-k80, and gpu-gtx1080ti.

  • Use the gpu-gen partition when you don't care which type of GPU you use.

    • gpu-gen total available resources:
      Partition name:         gpu-gen
      Total available memory: 600 GB
      Max cores:              78
      Threads per core:       2
      GPU devices:            8 K80 and 16 GTX1080ti
      GPU memory per device:  12 GB
      Compute nodes:          gpu-pt1-02, gpu-pt1-02, gpu-pt1-03, gpu-pt1-04
      gpu-gen partition specifications
    • gpu-gen resource limits:
      Max runtime: 12 hours
      Max nodes:   1
      gpu-gen partition limits
  • Use the gpu-gtx1080ti partition to use the GTX1080ti GPUs.

    • gpu-gtx1080ti total available resources:
      Partition name:         gpu-gtx1080ti
      Total available memory: 480 GB
      Max cores:              48
      Threads per core:       2
      GPU devices:            8 GTX1080ti
      GPU memory per device:  12 GB
      Compute nodes:          gpu-pt1-02, gpu-pt1-03
      gpu-gtx1080ti partition specifications
    • gpu-gtx1080ti resource limits:
      Max runtime: 24 hours
      Max nodes:   1
      gpu-gtx1080ti partition limits
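
To make the limits above concrete, here is a minimal Python sketch that checks a job request against the listed runtime and node limits and, if it fits, builds the matching sbatch options. The helper name sbatch_args and the PARTITION_LIMITS table are hypothetical and exist only for this illustration. The options --partition, --nodes, --time and --gres are standard sbatch options, but the GRES name "gpu" is an assumption about how this cluster is configured. The limits are copied from this page and may be out of date, since partition configurations can change.

```python
from datetime import timedelta

# Limits copied from the tables on this page (assumed current).
# gpu-k80 is left out because its limits are not listed here.
PARTITION_LIMITS = {
    "gpu-gen": {"max_runtime": timedelta(hours=12), "max_nodes": 1},
    "gpu-gtx1080ti": {"max_runtime": timedelta(hours=24), "max_nodes": 1},
}


def sbatch_args(partition, runtime, nodes=1, gpus=1):
    """Return sbatch options for the request.

    Raises ValueError if the partition is unknown to this sketch or the
    request exceeds the partition's runtime or node limit.
    """
    limits = PARTITION_LIMITS.get(partition)
    if limits is None:
        raise ValueError(f"no limits recorded for partition {partition!r}")
    if runtime > limits["max_runtime"]:
        raise ValueError(f"runtime {runtime} exceeds the limit of {partition}")
    if nodes > limits["max_nodes"]:
        raise ValueError(f"{nodes} nodes exceeds the limit of {partition}")

    # Slurm accepts a time limit written as hours:minutes:seconds.
    total_minutes = int(runtime.total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return [
        f"--partition={partition}",
        f"--nodes={nodes}",
        f"--time={hours:02d}:{minutes:02d}:00",
        f"--gres=gpu:{gpus}",  # assumes the GRES is named "gpu" on this cluster
    ]
```

For example, " ".join(["sbatch", *sbatch_args("gpu-gen", timedelta(hours=2)), "job.sh"]) produces a command line for a two-hour, single-GPU job, where job.sh stands in for your own batch script. A request for 13 hours on gpu-gen raises ValueError instead, because it exceeds that partition's 12-hour limit.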