External How-Tos & Guides
This page links to external projects, manuals, how-tos, and best-practice guides that we consider valuable references.
- EasyBuild - Framework for reproducible building and installation of scientific/HPC software stacks.
- Grafana (project) - Visualisation and dashboarding platform for metrics, logs, and traces.
- Linux Administration Best Practices - IEEE Xplore - Peer-reviewed perspectives on Linux management patterns and pitfalls in enterprise and research settings.
- Linux System Administration Best Practices - Penn Engineering - Concise guidance on secure, maintainable Linux operations (updates, access control, automation).
- Pacemaker / PCS - High-availability clustering: Pacemaker manages resource failover; PCS is the CLI/web tooling to configure it.
- Prometheus (project) - Monitoring and alerting toolkit with a pull-based model and built-in time-series database.
- Slurm - Open-source workload manager for HPC clusters; schedules, queues, and runs jobs across nodes.
- ZFS - Copy-on-write file system and volume manager with checksumming, snapshots, compression, and replication.