Overview of specialty machines

The specialty machines controlled by the Slurm workload manager include:

Head node

  • Hostname: rsubmit.math.private.uwaterloo.ca
  • The head node is for job submission and short tasks only (e.g. code compilation), not for running compute jobs; see the example batch script after the hardware specification below.
  Head node hardware specification
    # Nodes                  1
    Node names               slurm-pr2-01 (alias rsubmit.math.private)
    CPU model (2 per node)   Xeon(R) CPU E5-2620 @ 2.00GHz (Sandy Bridge EP)
    # Cores per node         12
    Threads per core         2
    System Memory per node   64 GB
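
A typical workflow is to compile on the head node, write a batch script, and hand the work to Slurm with sbatch. The sketch below is a minimal, generic example (the executable name is a placeholder and no partition is specified; check sinfo for the partitions available to you):

    #!/bin/bash
    # Minimal Slurm batch script; submit from rsubmit with: sbatch job.sh
    #SBATCH --job-name=example
    #SBATCH --time=00:30:00          # wall-clock limit
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=1
    #SBATCH --mem=2G
    #SBATCH --output=%x-%j.out       # %x = job name, %j = job ID

    # Everything below runs on a compute node, not on the head node.
    ./my_program                     # placeholder for your compiled executable

After submitting, squeue -u $USER shows the job's state; the head node only schedules the work, while the commands in the script run on a compute node.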

CPU clusters

This resource is intended for medium-sized parallel jobs (OpenMP or MPI) and for developing and testing parallel code; an example batch script follows the hpc-pr3 table below. Large jobs are better served by the Digital Research Alliance of Canada (formerly Compute Canada).

  • hpc-pr3 cluster: compute node hardware specifications
    # Nodes                  8
    Node names               hpc-pr3-01 to hpc-pr3-08
    CPU model (2 per node)   Xeon(R) Gold 6326 2.9 GHz (Ice Lake)
    # Cores per node         32
    Threads per core         2
    System Memory per node   128 GB
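
As a sketch sized for a single hpc-pr3 node (the executable name is a placeholder, and the time and memory limits are illustrative, not site policy), an OpenMP job script might look like this:

    #!/bin/bash
    # Sketch: OpenMP job using the 32 cores of one hpc-pr3 node
    #SBATCH --job-name=omp-test
    #SBATCH --nodes=1
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=32       # all physical cores on one node
    #SBATCH --time=02:00:00
    #SBATCH --mem=120G               # stay under the 128 GB per node
    #SBATCH --output=%x-%j.out

    export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # match threads to allocated cores
    ./my_openmp_program                           # placeholder executable

An MPI job is similar, except that --ntasks sets the number of ranks and the program is launched with srun.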

The Hagrid cluster is for members of the Bauch Lab.

  • hagrid cluster: hardware specifications
    # Nodes                  8 compute, 1 storage
    Node names               hagrid01 to hagrid08, hagrid-storage
    CPU model (2 per node)   Xeon(R) Silver 4114 CPU @ 2.20GHz
    # Cores per node         10
    Threads per core         1
    System Memory per node   187 GB

GPU servers

This resource is intended for small-scale GPU-specific computing. Medium- and large-scale GPU jobs are better served by SHARCNET. See the example single-GPU batch script after the hardware table below.

  • Research GPU server hardware specifications

    gpu-pr1-01
      CPU model               2x Intel Xeon E5-2680v4 2.4 GHz (Broadwell)
      # Cores                 28
      Threads per core        1
      System Memory           128 GB
      GPU Type                NVIDIA Tesla P100
      # of GPU devices        4
      GPU memory per device   16 GB

    gpu-pr1-02
      CPU model               2x AMD EPYC 7542 2.9 GHz
      # Cores                 64
      Threads per core        1
      System Memory           1024 GB
      GPU Type                NVIDIA Ampere A100 PCI
      # of GPU devices        8
      GPU memory per device   four with 40 GB, four with 80 GB

    gpu-pr1-03
      CPU model               2x Intel Xeon 8480+ 2.0 GHz (Sapphire Rapids)
      # Cores                 56
      Threads per core        1
      System Memory           1024 GB
      GPU Type                NVIDIA Hopper H100 SXM
      # of GPU devices        4
      GPU memory per device   80 GB

    gpu-pr1-04
      CPU model               1x AMD EPYC 9634 2.25 GHz
      # Cores                 84
      Threads per core        1
      System Memory           768 GB
      GPU Type                NVIDIA L40S
      # of GPU devices        3
      GPU memory per device   48 GB
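
As an illustrative sketch (the CPU, memory, and time values and the executable name are assumptions, and any GRES type string is site-specific; check scontrol show node gpu-pr1-01 for the configured GRES), a single-GPU job request looks like this:

    #!/bin/bash
    # Sketch: request one GPU device on a GPU server
    #SBATCH --job-name=gpu-test
    #SBATCH --gres=gpu:1             # one GPU; use gpu:<type>:1 if the site defines GPU types
    #SBATCH --cpus-per-task=4
    #SBATCH --mem=32G
    #SBATCH --time=01:00:00
    #SBATCH --output=%x-%j.out

    nvidia-smi                       # confirm which GPU was allocated
    ./my_gpu_program                 # placeholder executable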