Workload management

TrinityX uses OpenHPC for the user space, including Slurm.

Slurm (OpenHPC)

TrinityX is configured to use the default paths for Slurm. The configuration directory /etc/slurm is shared across all nodes: it is a link to /trinity/shared/etc/slurm, where all the files reside.

For better readability, the configuration has been split into several files, which are included from the main slurm.conf file (see the sketch after the table below).

File                   Description
slurm.conf             Main configuration file
acct_gather.conf       Slurm configuration file for the acct_gather plugins (see acct_gather.conf)
slurm-health.conf      Health check configuration (where applicable)
slurm-partitions.conf  Partition configuration
topology.conf          Slurm configuration file for defining the network topology (see topology.conf)
cgroup.conf            Slurm configuration file for cgroup support (see cgroup.conf)
slurmdbd.conf          Slurm Database Daemon (SlurmDBD) configuration file (see slurmdbd.conf, https://slurm.schedmd.com/slurmdbd.conf.html)
slurm-nodes.conf       Slurm node configuration file
slurm-user.conf        Slurm user configuration file (e.g. QoS, priorities)
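
A minimal sketch of how this split might look inside slurm.conf, using Slurm's Include directive; the exact contents in TrinityX may differ, and the cluster and host names below are examples. Files such as cgroup.conf, acct_gather.conf, topology.conf and slurmdbd.conf do not need to be included explicitly, since Slurm reads them by name from the configuration directory.

    # Hypothetical excerpt from /trinity/shared/etc/slurm/slurm.conf
    ClusterName=trinityx          # example cluster name
    SlurmctldHost=controller      # example controller host name
    # Pull in the split-out configuration files listed above:
    Include slurm-health.conf
    Include slurm-nodes.conf
    Include slurm-partitions.conf
    Include slurm-user.conf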

By default, Luna generates the configuration for nodes and partitions, where each partition is based on a group name. This approach works well for homogeneous clusters, where a default node definition contains the detailed configuration for CPUs, cores and RAM. When more complexity is desired, e.g. when different node types are present, the automation can be overridden by manual configuration in these files (see the sketch below), or the graphical Slurm configurator can be used.
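As an illustration, a manual override for a cluster with two node types might look like the following; all node names, counts and hardware values are hypothetical, and the GPU line assumes a matching gres.conf.

    # Hypothetical slurm-nodes.conf for a cluster with two node types
    NodeName=node[001-016] Sockets=2 CoresPerSocket=32 ThreadsPerCore=1 RealMemory=256000
    NodeName=gpu[01-04]    Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 RealMemory=512000 Gres=gpu:4

    # Hypothetical slurm-partitions.conf exposing them as separate partitions
    PartitionName=compute Nodes=node[001-016] Default=YES MaxTime=INFINITE State=UP
    PartitionName=gpu     Nodes=gpu[01-04] MaxTime=INFINITE State=UP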

Graphical Slurm configuration application

The graphical Slurm configurator in action: