External How-Tos & Guides

This page links to external projects, manuals, how-tos, and best-practice guides that we consider valuable references.

  • EasyBuild - Framework for reproducible building and installation of scientific/HPC software stacks.

  • Grafana (project) - Visualisation and dashboarding platform for metrics, logs, and traces.

  • Linux Administration Best Practices - IEEE Xplore - Peer-reviewed perspectives on Linux management patterns and pitfalls in enterprise and research settings.

  • Linux System Administration Best Practices - Penn Engineering - Concise guidance on secure, maintainable Linux operations (updates, access control, automation).

  • Pacemaker / PCS - High-availability clustering: Pacemaker manages resource failover; PCS is the CLI/web tooling to configure it.

  • Prometheus (project) - Monitoring and alerting toolkit with a pull-based model and built-in time-series database.

  • Slurm - Open-source workload manager for HPC clusters; schedules, queues, and runs jobs across nodes.

  • ZFS - Copy-on-write file system and volume manager with checksumming, snapshots, compression, and replication.