luna2-daemon — Architecture & Node Boot Flow

Component: luna2-daemon (Luna 2 Project / TrinityX)
Source branch: development (HEAD 371511ce, version 2.1)
Listens on: TCP [::]:7050 (REST API, JSON)
Last updated: 10 Jun 2026

Scope — Internal architecture of luna2-daemon: the role of the base, utils and routes layers, how the plugin system works (and how plugins are selected per node/group/hostname/distro, §6.4), the templ_install installer template, base/boot.py, and the end-to-end node boot/provisioning sequence. It also covers the node-side counterpart luna2-client (§12) — the deb/rpm dracut module that provides the pre-pivot hook which authenticates the node and runs the installer. Written from the development branch (luna2-daemon) and main branch (luna2-client); for developers and integrators (not end users).

Two halves of one boot. luna2-daemon (server-side — all sections except §12) renders and serves the iPXE menu, the node-boot script and the installer. luna2-client (node-side, §12) is a dracut module baked into every OS image that consumes those artefacts: it brings up the NIC, authenticates via /tpm/<node>, fetches /boot/install/<node> and runs it — all before the initramfs pivots into the installed system.


1. Overview & design philosophy

Luna 2 Daemon is the core cluster-management service of TrinityX: a stateless Flask REST API served by Gunicorn, backed by a SQL database (SQLite / MySQL / PostgreSQL via unixODBC). It owns everything needed to define and provision a cluster — nodes, groups, OS images, networks, DNS, DHCP, BMC setup, secrets, switches, racks, monitoring and HA replication.

Its single most important job is netbooting and installing compute nodes: it generates the iPXE menus, the kernel/initrd boot scripts, and the bash installer that runs inside the node's initramfs.

Property | Value
Entry point (WSGI app) | daemon/luna.py → daemon (Flask object)
Process manager | Gunicorn, config daemon/config/gunicorn.py
systemd unit | daemon/config/luna2-daemon.service
Bind / port | [::]:7050, 4 workers
Introspection routes | GET /version, GET /all-routes
Config file | luna.ini → parsed into CONSTANT (common/constant.py)

⚠️ daemon/daemon.py contains only stub install/update/upgrade placeholders — it is not the running service. The live application is daemon/luna.py, loaded by Gunicorn via import luna in gunicorn.py.

1.1 Design philosophy

Five principles recur throughout the codebase and shape the rest of this document:

  • Strict layering. A request flows routes → base → utils; the HTTP edge stays free of logic and the logic stays free of HTTP. One base method per route. (§2–§5)
  • Plugin-driven, not patched. Site-, OS- and vendor-specific behaviour lives in plugins/, selected per node/group/hostname/distro and injected into outputs — the daemon core never changes for a new distro or BMC. (§6)
  • Replicate intent, not rows. Multi-controller HA re-executes the method call that changed state on every peer rather than copying database rows, so heterogeneous controllers converge by running the same business logic. (§8)
  • One return contract. Every layer returns (status, message[, request_id]); a message from the deepest plugin reaches the CLI unaltered, synchronously or via a polled status stream. (§15)
  • Stateless, template-rendered output. The daemon holds no per-request state; everything a node sees (iPXE, installer, dhcp, dns) is a Jinja2 template rendered from the DB on demand. (§7; §9–§12)

1.2 Configuration inheritance (node → group → …)

Most of a node's settings are not stored on the node at all — they are resolved at read time from a precedence chain, so an administrator configures a group once and every member inherits it, overriding only what differs on a specific node. base/node.py:get_node() performs this resolution for every field and records the winning level in a _<field>_source marker (node / group / osimage / cluster / default), plus an _override flag when the node itself set the value — this is exactly what the CLI prints as "Config differs from parent — local overrides".

The order of precedence, most specific first: node → group → OS image (+ tag) → cluster → daemon defaults — i.e. the lowest level in the map below that sets a field wins. Two profiles are referenced out-of-band: bmcsetup (BMC credentials) and the node/group interface definitions (network membership, DHCP). The map reads top-down, general defaults refining toward the specific node:

[Figure: Node configuration inheritance map]

A few fields compose rather than simply override — e.g. effective kerneloptions can be taken from the node, the group or the OS image; and an OS image whose imagefile is kickstart forces provision_method=kickstart regardless of the group or cluster.
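
The read-time resolution described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in base/node.py: the `resolve_field` helper, the layout of the `layers` dict and the sample values are all hypothetical, and only the precedence order and the `_<field>_source` / `_override` markers come from the text. Fields that compose rather than override are not modelled.

```python
from typing import Any, Dict

# Precedence chain from the text above, most specific first.
LEVELS = ("node", "group", "osimage", "cluster", "default")


def resolve_field(field: str, layers: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Return the effective value of `field` plus hypothetical source markers."""
    for level in LEVELS:
        value = layers.get(level, {}).get(field)
        if value is not None:
            return {
                field: value,
                f"_{field}_source": level,
                "_override": level == "node",  # the node itself set the value
            }
    return {field: None, f"_{field}_source": None, "_override": False}


# Illustrative data only: names and values are made up for this sketch.
layers = {
    "node": {"provision_method": None, "kerneloptions": "console=ttyS0"},
    "group": {"provision_method": "torrent", "kerneloptions": "quiet"},
    "cluster": {"provision_method": "http"},
}

inherited = resolve_field("provision_method", layers)
overridden = resolve_field("kerneloptions", layers)
```

Here `inherited` is won by the group level because the node leaves the field unset, while `overridden` is won by the node and carries the override flag.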


2. High-level architecture

Layered application under daemon/. A request flows downward; output is rendered from templates.

[Figure: High-level architecture of luna2-daemon]

2.1 Supporting packages

Package | Role
daemon/common/ | Process-wide foundation: constants, bootstrap, auth/input decorators and the DB schema. constant.py loads luna.ini into CONSTANT + LOGGER + LUNAKEY; bootstrap.py seeds a first-run cluster; validate_auth.py / validate_input.py provide the @token_required / @validate_name decorators; database_layout.py declares every table. Constants & validation: §17; first-run bootstrap & schema: §14.
daemon/config/ | Shipped config: gunicorn.py, luna.ini, luna2-daemon.service, bootstrap.ini, nginx/transmission units.
daemon/templates/ | Jinja2 templates the daemon renders and serves (iPXE menus, node boot scripts, installer, dhcpd/kea, DNS zones).
daemon/log/ | Runtime log directory.

3. The routes layer

routes/ is the HTTP edge. Each file defines a Flask Blueprint registered in luna.py. Handlers are deliberately thin: apply decorators, call one base method, translate the (status, data) tuple into an HTTP code, and (for boot endpoints) render a template. No business logic here.

Two guard decorators:

  • @token_required (common/validate_auth.py): enforces a JWT x-access-tokens header.
  • @validate_name (common/validate_input.py): sanitises path params (node/group names).

3.1 Route catalogue

Blueprint file | Mount / purpose
auth.py | /token: authenticate, issue JWT
boot.py | /boot, /boot/search/mac/…, /boot/manual/…, /boot/install/<node>, /kickstart/install/<node>: the netboot + install entry points
boot_roles.py | /boot/roles/<role>: systemd role units + scripts
boot_scripts.py | /boot/scripts/<script>: pre/part/post install script bodies
config_node.py | /config/node/…: node CRUD, interfaces, status
config_group.py | /config/group/…: group CRUD
config_osimage.py / config_osgroup.py | OS image + os-group ops (pack, grab, push, tags)
config_network.py | networks, IP allocation
config_dns.py | DNS zone management
config_bmcsetup.py | BMC credential / channel profiles
config_switch.py | switch definitions (switchport detection)
config_otherdev.py | non-node devices (PDUs, etc.)
config_rack.py | rack / datacenter layout
config_secrets.py | /config/secrets/node/<name>: encrypted per-node/group secrets
config_cloud.py | cloud / alternative-provisioning targets
config_cluster.py | cluster-wide settings
config_status.py | long-running task status messages
control.py | /control/…: power actions via control plugins
monitor.py | /monitor/node/<name>: node state updates (installer posts here)
service.py | start/stop/reload of managed services (dhcp, dns…)
files.py | /files/…: serves kernels, initrds, image files
journal.py, ha.py, tables.py | HA replication, journal replay, raw table access between controllers
tracker.py | torrent tracker endpoint (torrent provisioning)
plugin_export.py / plugin_import.py | export/import of config + boot plugins between controllers

4. The base layer

base/ holds the business logic — one class per managed resource. These classes assemble datasets from the DB, enforce domain rules, enqueue service actions, call plugins, and return (status, data) tuples.

Class (file) | Responsibility
Boot (boot.py) | Provisioning brain: node discovery, iPXE menu, node-boot script, installer assembly (§9–§10)
Node (node.py) | node datasets, filtering, CRUD
Group (group.py) | group definitions + inheritance of image/kernel/netboot settings
Interface (interface.py) | node/group interfaces, MAC↔node binding
Network / Dns | IP networks, DNS zones, resolution
OSImage (osimage.py) | image pack/grab/push, tags, kernel/initrd
BMCSetup (bmcsetup.py) | BMC credential profiles
Switch / OtherDev / Rack | switches, other devices, rack layout
Secret (secret.py) | per-node/group encrypted secrets served at install time
Roles / Scripts | systemd role units; pre/part/post install scripts
Cloud / Cluster | cloud bursting targets, cluster settings
Control (control.py) | power control orchestration (delegates to control plugins)
Monitor / Service | node state tracking, managed-service control
Authentication | login + JWT issuance
Journal / Tables | HA: record & replay mutating calls to peer controllers
PluginExport / PluginImport | bundle and transfer config + plugins

Pattern: a routes handler calls exactly one base method; the base method owns the DB queries and plugin calls. Handlers stay free of logic; base stays free of Flask/HTTP concerns.
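
The split between the two layers can be illustrated with a small, framework-free sketch. Everything here is a stand-in: the `Node` class below is not base/node.py, `handle` is a hypothetical helper rather than a Flask view, and the choice of 200/404 and of the `{"message": ...}` body is an assumption made for the example. Only the shape (one base call returning a `(status, data)` tuple, translated at the edge) is taken from the text.

```python
from typing import Any, Callable, Dict, Tuple


class Node:
    """Stand-in for a base class: owns the lookup, knows nothing about HTTP."""

    def __init__(self, records: Dict[str, Dict[str, Any]]):
        self._records = records  # stand-in for the database

    def get_node(self, name: str) -> Tuple[bool, Any]:
        record = self._records.get(name)
        if record is None:
            return False, f"Node {name} is not available"
        return True, {"config": {"node": {name: record}}}


def handle(base_method: Callable[..., Tuple[bool, Any]], *args: Any,
           ok_code: int = 200, fail_code: int = 404) -> Tuple[Any, int]:
    """Stand-in for a routes handler: one base call, then status -> HTTP code."""
    status, data = base_method(*args)
    if status is True:
        return data, ok_code
    return {"message": data}, fail_code


nodes = Node({"node001": {"group": "compute"}})
found = handle(nodes.get_node, "node001")
missing = handle(nodes.get_node, "node999")
```

Because the handler never inspects `data`, the message produced by the base method reaches the caller unchanged, which is the behaviour the return contract in §1.1 relies on.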


5. The utils layer

utils/ provides the shared services and helpers used by every base class — DB access, plugin loading, service control, queueing, HA mechanics, template assembly.

Module | Responsibility
database.py | ODBC abstraction: get_record, get_record_join, update, insert, delete; the single DB gateway
helper.py | catch-all helpers, incl. plugin_finder() / plugin_load() (plugin entry points), bool/base64/jinja helpers, nodes_and_groups()
plugin_manager.py | PluginManager: resolves & caches plugin classes with search/priority + on-disk reload detection (§6)
plugin_tree.py | build_plugin_tree(): walks a plugin dir into a nested dict
template_manager.py | TemplateManager: same tree mechanism, locates .templ files
plugin_sync.py | PluginSync: replicates boot-plugin files between HA controllers (background worker)
config.py | builds rendered config sets (interfaces, DHCP, DNS) from DB records
service.py | Service: start/stop/reload/status of managed daemons
queue.py | Queue: internal task queue; mutating work is enqueued and drained by housekeeper threads
housekeeper.py | Housekeeper: background "mother" threads (status cleanup, queue drain, switchport scan, journal replication, invalid-config sweep, osimage tasks)
boot.py (UBoot) | boot-side helpers, e.g. verify_bootpause() (don't boot a node while its image is being packed)
osimage.py / downloader.py | image packing/pushing and download helpers
journal.py, controller.py, ha.py, request.py, tables.py, dbstructure.py | HA: track/replicate/replay mutating requests; manage DB schema (deep dive: §8)
monitor.py, status.py | node state + status message storage
log.py | Log.get_logger(): shared logger
filter.py, model.py, files.py, ping.py, control.py | input filtering, data models, file validation, ping registry, control offload

6. How plugins work

Plugins let site- and OS-specific behaviour be swapped in without changing the daemon. They live under daemon/plugins/ and are loaded dynamically at request time.

6.1 Plugin categories

Path | Kind | Provides
plugins/boot/provision/ | fetch var + create/cleanup | how a node downloads its image: http, torrent, kickstart, default
plugins/boot/network/ | string vars | per-distro interface config: redhat8/9/10, ubuntu, opensuse (+ .templ)
plugins/boot/bmc/ | config var | BMC config bash: default (ipmitool), dell
plugins/boot/scripts/ | methods | install scripts: diskfull, raid1, nodhcp, default
plugins/boot/roles/ | class | systemd role builders: bond, default
plugins/boot/detection/ | class | node identity discovery: switchport, cloud
plugins/control/ | class | power backends: default (ipmi), dell
plugins/hooks/ | class | lifecycle hooks: hooks/luna (startup/shutdown), hooks/config/*, hooks/control, hooks/monitor
plugins/osimage/ | class | image ops: operations/image, osgrab, ospush, filesystem, other/cleanup
plugins/export/ & plugins/import/ | class | config/Prometheus rule + boot-plugin export/import

6.2 Loading mechanism

Two helpers on Helper (utils/helper.py) are the public entry points; they delegate to PluginManager:

boot_plugins = Helper().plugin_finder(f'{plugins_path}/boot')                      # build tree
provision_plugin = Helper().plugin_load(boot_plugins, 'boot/provision', 'http')    # resolve class
  1. Tree build: build_plugin_tree() (plugin_tree.py) walks the dir with os.walk into a nested dict {dir: {...}, file.py: None}.
  2. Resolution + priority: PluginManager.load() tries candidate module names in order for a (root, levelone, leveltwo) request. E.g. boot/network + redhat + 9:
       a. plugins.boot.network.redhat9 (concatenated)
       b. plugins.boot.network.redhat.9 (sub-module)
       c. plugins.boot.network.redhat.default
       d. plugins.boot.network.redhat (single)
       e. plugins.boot.network.default (final fallback)
  3. Class selection: _resolve_plugin_class() returns the requested class name (default Plugin), falling back to Plugin.
  4. Caching + hot reload: resolved classes are cached in _class_cache. Before serving from cache, _module_changed_on_disk() compares the file's (mtime_ns, size) fingerprint; if it changed, the cache entry is invalidated and the module is reloaded with importlib.reload. Operators can therefore edit plugins live.
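
Two of the steps above, the tree build and the change-on-disk check, can be sketched with the standard library alone. This is an illustration under assumptions, not the code in plugin_tree.py or plugin_manager.py: `build_tree`, `fingerprint`, `FingerprintCache` and `read_text` are hypothetical names, and the loader simply reads the file where the real daemon would import or reload a module.

```python
import os
import tempfile
from typing import Any, Callable, Dict, Tuple


def build_tree(root: str) -> Dict[str, Any]:
    """Walk `root` into a nested dict: directories map to dicts, .py files to None."""
    tree: Dict[str, Any] = {}
    for current, dirs, files in os.walk(root):
        dirs[:] = [d for d in dirs if d != "__pycache__"]  # prune bytecode dirs
        node = tree
        rel = os.path.relpath(current, root)
        if rel != ".":
            for part in rel.split(os.sep):
                node = node.setdefault(part, {})
        for name in dirs:
            node.setdefault(name, {})
        for name in files:
            if name.endswith(".py"):
                node[name] = None
    return tree


def fingerprint(path: str) -> Tuple[int, int]:
    """(mtime_ns, size) pair used to detect on-disk changes."""
    stat = os.stat(path)
    return stat.st_mtime_ns, stat.st_size


class FingerprintCache:
    """Serve a cached object until the backing file's fingerprint changes."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[Tuple[int, int], Any]] = {}

    def get(self, path: str, loader: Callable[[str], Any]) -> Any:
        current = fingerprint(path)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == current:
            return cached[1]
        loaded = loader(path)  # the real daemon would (re)load the module here
        self._entries[path] = (current, loaded)
        return loaded


def read_text(path: str) -> str:
    with open(path) as handle:
        return handle.read()


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "network", "redhat"))
    plugin = os.path.join(root, "network", "redhat", "9.py")
    for path in (os.path.join(root, "network", "default.py"), plugin):
        with open(path, "w") as handle:
            handle.write("class Plugin:\n    pass\n")
    tree = build_tree(root)

    cache = FingerprintCache()
    first = cache.get(plugin, read_text)
    with open(plugin, "a") as handle:  # an operator edits the plugin live
        handle.write("# edited\n")
    second = cache.get(plugin, read_text)
```

Comparing size as well as mtime means an edit is still detected when two writes land within the same timestamp granularity, as long as the file length differs.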

6.3 The segment-injection pattern (most important)

Provision/network/bmc/script plugins do not run on the node. Each exposes its behaviour as a bash string attribute (or method source). The daemon reads the raw installer template, string-substitutes each plugin's snippet into a named marker, then Jinja2-renders the whole thing.

Example — the http provision plugin (plugins/boot/provision/http.py):

class Plugin():
    def create(self, ...):  return True, "Success"
    def cleanup(self, ...): return True, "Success"
    # 'fetch' is the bash injected into the installer to download the image
    fetch = """
    curl $INTERFACE -H "x-access-tokens: $LUNA_TOKEN" -s \
      {{ WEBSERVER_PROTOCOL }}://{{ LUNA_CONTROLLER }}:{{ WEBSERVER_PORT }}/files/{{ LUNA_IMAGEFILE }} \
      > /{{ LUNA_SYSTEMROOT }}/{{ LUNA_IMAGEFILE }}
    return $?
    """

In base/boot.py:install() the snippet is wrapped and substituted into the marker:

segment = str(provision_plugin().fetch)
segment = f"function download_{method} {{\n{segment}\n}}\n## FETCH CODE SEGMENT"
template_data = template_data.replace("## FETCH CODE SEGMENT", segment)

The same mechanism fills every ## … CODE SEGMENT marker in templ_install.cfg: NETWORK INIT, INTERFACE, GATEWAY, DNS (+ IPv6), BMC, and SCRIPT PRE/PART/POST. The ## FETCH CODE SEGMENT marker is re-appended each time so primary + fallback provision methods can be stacked.
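
Why re-appending the marker allows stacking can be seen in a small standalone sketch. The `inject_fetch` helper and the toy template are hypothetical, and the `echo` bodies stand in for real fetch snippets; the wrapping and the replace call mirror the three lines from base/boot.py:install() quoted above.

```python
MARKER = "## FETCH CODE SEGMENT"


def inject_fetch(template_data: str, method: str, fetch: str) -> str:
    """Wrap a fetch snippet in a bash function and keep the marker for the next one."""
    segment = f"function download_{method} {{\n{fetch}\n}}\n{MARKER}"
    return template_data.replace(MARKER, segment)


# Toy installer template; the real templ_install.cfg is far larger.
template = "#!/bin/bash\n## FETCH CODE SEGMENT\ndownload_image\n"

# Primary method first, then the fallback; each call leaves the marker in place.
template = inject_fetch(template, "torrent", "    echo fetching via torrent")
template = inject_fetch(template, "http", "    echo fetching via http")
```

After both calls the template defines `download_torrent` followed by `download_http`, and the marker still appears exactly once, so a further method could be stacked in the same way.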

6.4 How a plugin is selected (node / hostname / group / distro)

The key idea: plugin selection is driven by attributes of the specific node being provisioned. PluginManager.load() accepts a list for levelone and tries each entry in order, falling through to default. So a list like [nodename, group, distribution] produces a most-specific-wins hierarchy: hostname → group → distro → default.

All selection happens in base/boot.py:install() (and discover_*) via Helper().plugin_load(tree, root, levelone, leveltwo):

Plugin | Selected by (levelone → leveltwo) | Effect
osimage/operations/image | distribution → osrelease | per-distro image unpack + systemroot (e.g. ubuntu vs default)
boot/provision | provision_method, then provision_fallback | how the node downloads its image (http/torrent/kickstart), taken from node/group/cluster config
boot/network | distribution → osrelease | per-distro/release interface config (redhat+9 → redhat9, ubuntu, opensuse)
boot/bmc | [nodename, group] | BMC config: a node-specific plugin overrides a group one, else default
boot/scripts | [nodename, group, distribution] | install scripts resolved node → group → distro → default
boot/detection | switchport, cloud (pre-loaded) | identifies which node a booting MAC is, by switch port or cloud metadata

For each candidate, PluginManager's internal resolution order is: exact levelone+leveltwo file → levelone/leveltwo submodule → levelone/default → levelone.py → <root>/default. This is what lets you drop a plugins/boot/bmc/<hostname>.py or plugins/boot/network/redhat9.py and have it picked up automatically for matching nodes, with default.py always as the safety net.
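
The search order can be made concrete with a short sketch that only generates candidate module names. It is an illustration of the order described in this section, not PluginManager's code: `candidates` and `select` are hypothetical helpers, the node and group names are made up, and the assumption that a list-valued levelone is expanded entry by entry before the shared default follows the description above.

```python
from typing import Iterable, List, Optional, Set, Union


def candidates(root: str, levelone: Union[str, Iterable[str]],
               leveltwo: Optional[str] = None) -> List[str]:
    """List candidate module names, most specific first."""
    prefix = "plugins." + root.replace("/", ".")
    entries = [levelone] if isinstance(levelone, str) else list(levelone)
    names: List[str] = []
    for one in entries:
        if not one:
            continue  # skip unset attributes such as a missing group
        if leveltwo:
            names.append(f"{prefix}.{one}{leveltwo}")   # exact levelone+leveltwo file
            names.append(f"{prefix}.{one}.{leveltwo}")  # levelone/leveltwo submodule
        names.append(f"{prefix}.{one}.default")         # levelone/default
        names.append(f"{prefix}.{one}")                 # levelone.py
    names.append(f"{prefix}.default")                   # <root>/default
    return names


def select(names: List[str], available: Set[str]) -> Optional[str]:
    """Return the first candidate that exists, or None."""
    for name in names:
        if name in available:
            return name
    return None


network = candidates("boot/network", "redhat", "9")
bmc = candidates("boot/bmc", ["node001", "compute"])
chosen = select(bmc, {"plugins.boot.bmc.compute", "plugins.boot.bmc.default"})
```

With only a group-level and a default BMC plugin on disk, `chosen` is the group plugin: the node-specific candidates are tried first, found missing, and skipped.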

distribution/osrelease come from the node's assigned OS image; provision_method/provision_fallback from node/group/cluster provisioning settings; nodename/group from the resolved node. Change any of these in the DB and a different plugin is selected on the next boot — no daemon restart needed (plugins hot-reload, §6.2).


7. Templates — dynamic config rendering

Almost nothing luna2-daemon emits is static. Every artefact a node or a managed service sees — the iPXE menu, the installer, the DHCP and DNS configuration — is a Jinja2 template under daemon/templates/, rendered on demand from the current database state. This is what keeps controllers stateless (§1.1): change a row, re-render, never hand-edit a config file.

7.1 The template set

Group | Templates | Rendered for
Boot | templ_boot_ipxe[_short].cfg, templ_nodeboot.cfg, templ_install.cfg, templ_boot_disk.cfg, templ_boot_failed.cfg (+ _kickstart variants) | served to the node during netboot (§9–§11)
DHCP | templ_dhcpd.cfg, templ_dhcpd6.cfg, templ_kea-dhcp4.cfg, templ_kea-dhcp6.cfg | the controller's DHCP server config
DNS | templ_dns_conf.cfg, templ_dns_zone.cfg, templ_dns_zones_conf.cfg | named/zone files

7.2 Two rendering paths

  • Per-request boot artefacts. A routes handler renders a boot template with render_template / render_template_string, fills it with DB-derived data, and returns the text straight to the node, e.g. base/boot.py:install() → templ_install.cfg (§10). Nothing is written to disk.
  • On-change service configs. When a network, node or DNS record changes, utils/config.py (Config) renders the DHCP/DNS templates with a Jinja2 Environment(FileSystemLoader(...)) from the DB, writes the result into the working TMP_DIRECTORY, validates it (dhcpd -t -cf …), then deploys it (e.g. /etc/dhcp/dhcpd.conf) and reloads the service. This is the dynamic config rendering behind luna network change (§13.4): the operator never edits dhcpd.conf or zone files — they are regenerated from the templates.
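
The render, validate, deploy sequence of the second path can be sketched as below. This is deliberately not the code in utils/config.py: `string.Template` is swapped in for Jinja2 so the example needs only the standard library, `render_and_deploy` and `has_subnet` are hypothetical, the toy validator stands in for a real syntax check such as `dhcpd -t -cf …`, and the service reload is left out.

```python
import os
import shutil
import tempfile
from string import Template
from typing import Callable, Dict, Tuple


def render_and_deploy(template_text: str, data: Dict[str, str], target: str,
                      tmp_dir: str, validate: Callable[[str], bool]) -> Tuple[bool, str]:
    """Render into tmp_dir, validate the candidate, then overwrite `target`."""
    rendered = Template(template_text).substitute(data)  # Jinja2 in the real daemon
    fd, candidate = tempfile.mkstemp(dir=tmp_dir, suffix=".conf")
    with os.fdopen(fd, "w") as handle:
        handle.write(rendered)
    try:
        if not validate(candidate):
            return False, "validation failed, deployed config left untouched"
        shutil.copyfile(candidate, target)
        return True, "config deployed"
    finally:
        os.remove(candidate)  # the working copy is never left behind


def has_subnet(path: str) -> bool:
    """Toy validator standing in for the real syntax check."""
    with open(path) as handle:
        return "subnet" in handle.read()


with tempfile.TemporaryDirectory() as work:
    target = os.path.join(work, "dhcpd.conf")
    good = render_and_deploy("subnet $network netmask $netmask {}\n",
                             {"network": "10.141.0.0", "netmask": "255.255.0.0"},
                             target, work, has_subnet)
    with open(target) as handle:
        deployed = handle.read()
    bad = render_and_deploy("# empty for $network\n", {"network": "10.141.0.0"},
                            target, work, has_subnet)
    with open(target) as handle:
        after_bad = handle.read()
```

Validating the candidate before it replaces the live file means a broken render leaves the previously deployed config in place, which is the property that makes regenerating on every change safe.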

7.3 Template resolution & overrides (shared with plugins)

utils/template_manager.py (TemplateManager) locates templates with the same tree + search/priority mechanism as the plugin loader (§6.2). A .templ file can therefore be selected per distribution/osrelease/group/node and takes precedence over the equivalent .py plugin — e.g. a node's interface block is taken from plugins/boot/network/redhat9.templ if present, otherwise the redhat9.py network plugin's bash segments are injected instead (§6.3, §6.4).

At startup, constant.py copies the templates listed in TEMPLATES.TEMPLATE_LIST into the working TMP_DIRECTORY, so the running daemon always renders from an isolated, validated set.


8. High availability, the journal & replication philosophy

8.1 Philosophy — replicate intent, not rows

luna2-daemon scales to multiple controllers with an active/active, eventually-consistent model that replicates intent, not data. Rather than replicating SQL rows or shipping a binary DB log, each controller records the method call that changed state and re-executes that same call on every peer. The unit of replication is a logical request ("Node.node_update on object X with this payload"), not a row diff.

Consequences of this choice:

  • Controllers may run different DB backends (one SQLite, one MySQL) and still converge — only the logical operation crosses the wire.
  • Replay runs the same base business logic on each controller, so derived state (rendered configs, queued service restarts, journalled sub-calls) is reproduced naturally — a row-copy would skip all of that.
  • It is best-effort + self-healing: entries are delivered redundantly (push and pull), and a periodic table-checksum comparison repairs any drift the journal missed.

Flow at a glance:

[Figure: Journal replication flow between controllers]

8.2 Roles & state — the ha table

A single-row ha table holds the controller's view of the cluster (§14.2):

Field | Meaning
enabled | H/A mode on/off. If off, Journal.add_request is a no-op ("Not in H/A mode")
master | is this controller the master? One master at a time owns master-only operations
insync | has this controller pulled the full journal and converged? Until true it refuses new work
sharedip | the cluster presents a single floating beacon IP; the beacon controller currently owns it
overrule | emergency bypass of the insync gate (operator-forced)
syncimages | whether OS-image bits are rsynced between controllers
updated | timestamp, used for last-writer-wins master election (set_role rejects an older request)

Related identity (utils/controller.py, utils/ha.py): every controller knows me (its own hostname), whether it is a beacon (owns the shared IP) or a shadow (standby / read-mostly, skipped as a replication source).

8.3 The journal entry

Each mutating base method, when it changes state, calls Journal().add_request(...), writing one row to the journal table:

Column | Purpose
function | Class.method to replay, e.g. Interface.change_node_interface
object / param | positional args for the replayed call
payload | base64-encoded JSON body for the call
masteronly / remoteonly | replay only on the master / only on non-masters
sendby | originating controller
sendto | a replicator hop (forward via the beacon when sharedip)
sendfor | the controller this entry must ultimately be applied on
tries / created | retry count + ordering key

add_request blocks until the controller is insync (up to keeptrying seconds) unless overrule is set, so a freshly-started controller never originates work before it has caught up.

8.4 Replay — handle_requests()

On the receiving side the journal thread reads entries where sendfor = me, ordered by created, and for each:

  1. drops it if masteronly/remoteonly does not match this controller's role;
  2. decodes the payload and resolves the class dynamically: repl_class = globals()[class_name] → e.g. base.node.Node, then repl_function = getattr(repl_class, function_name);
  3. calls it with the stored args — the identical code path that ran on the origin;
  4. for OSImage operations (pack / clone_osimage / grab) it additionally queues an image rsync task (method replay alone cannot move image bits — §8.6);
  5. always deletes the journal row afterwards, success or failure (drift is caught later by table hashing, not by infinite retry).
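
The replay steps above can be sketched in a few lines. This is an illustrative model, not the daemon's handle_requests(): the journal is a plain list of dicts instead of a table, `REGISTRY` stands in for the globals() lookup, the `Node` class and the controller names are made up, and the OSImage rsync step (4) is omitted.

```python
import base64
import json
from typing import Any, Dict, List, Tuple


class Node:
    """Stand-in for base.node.Node; records what was replayed."""
    updated: Dict[str, Any] = {}

    def node_update(self, name: str, request_data: Dict[str, Any]) -> Tuple[bool, str]:
        Node.updated[name] = request_data
        return True, f"Node {name} updated"


REGISTRY = {"Node": Node}  # stand-in for the globals() lookup


def encode_payload(payload: Dict[str, Any]) -> str:
    """base64-encoded JSON body, as stored in the payload column."""
    return base64.b64encode(json.dumps(payload).encode()).decode()


def handle_requests(journal: List[Dict[str, Any]], me: str,
                    is_master: bool) -> List[Tuple[bool, str]]:
    """Replay entries addressed to `me` in creation order; always drop the row."""
    results: List[Tuple[bool, str]] = []
    mine = sorted((e for e in journal if e["sendfor"] == me), key=lambda e: e["created"])
    for entry in mine:
        try:
            if entry.get("masteronly") and not is_master:
                continue
            if entry.get("remoteonly") and is_master:
                continue
            class_name, function_name = entry["function"].split(".", 1)
            method = getattr(REGISTRY[class_name](), function_name)
            payload = json.loads(base64.b64decode(entry["payload"]))
            results.append(method(entry["object"], payload))
        except Exception as exc:  # a failed replay is reported, not retried
            results.append((False, f"{type(exc).__name__}: {exc}"))
        finally:
            journal.remove(entry)  # deleted on success or failure
    return results


journal = [
    {"function": "Node.node_update", "object": "node001",
     "payload": encode_payload({"group": "gpu"}), "sendfor": "controller2", "created": 2},
    {"function": "Ghost.vanish", "object": "x",
     "payload": encode_payload({}), "sendfor": "controller2", "created": 1},
    {"function": "Node.node_update", "object": "node002",
     "payload": encode_payload({}), "sendfor": "controller3", "created": 3},
]
results = handle_requests(journal, me="controller2", is_master=False)
```

The entry naming an unknown class fails and is still removed, while the entry addressed to another controller is left untouched, matching steps 1 to 3 and 5.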

8.5 Convergence loop — journal_mother

The replication thread (one of the elected background workers, §16; it exits immediately if enabled is false) runs roughly every 5 s:

  1. startup — pull the journal from every peer until successful, then set_insync(True) and clear overrule;
  2. push (pushto_controllers()): POST my sendby=me entries (and forward sendto=me replicator entries) to each target's /journal; delete locally only on a successful POST;
  3. pull (pullfrom_controllers()): GET /journal/<me> from each peer (skipping beacon/shadow as appropriate);
  4. verify — periodically compare per-table checksums across controllers (Tables.verify_tablehashes_controllers); on mismatch, hard-copy the table from the authoritative host (import_table_from_host).

Push and pull together mean a single entry still arrives even if one controller was briefly down.
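
The verify step, comparing per-table checksums to find drift, might look like the following sketch. The hashing scheme is an assumption made for illustration (SHA-256 over canonically serialised, order-independent rows); the source does not say how Tables.verify_tablehashes_controllers computes its hashes, and `table_hash` and `drifted_tables` are hypothetical names.

```python
import hashlib
import json
from typing import Any, Dict, List

Row = Dict[str, Any]


def table_hash(rows: List[Row]) -> str:
    """Order-independent checksum of a table's rows."""
    canonical = sorted(json.dumps(row, sort_keys=True) for row in rows)
    return hashlib.sha256("\n".join(canonical).encode()).hexdigest()


def drifted_tables(local: Dict[str, List[Row]], remote: Dict[str, List[Row]]) -> List[str]:
    """Names of tables whose checksum differs between two controllers."""
    names = sorted(set(local) | set(remote))
    return [name for name in names
            if table_hash(local.get(name, [])) != table_hash(remote.get(name, []))]


# Made-up table contents: same node rows in a different order, different group row.
local = {
    "node": [{"id": 1, "name": "node001"}, {"id": 2, "name": "node002"}],
    "group": [{"id": 1, "name": "compute"}],
}
remote = {
    "node": [{"name": "node002", "id": 2}, {"name": "node001", "id": 1}],
    "group": [{"id": 1, "name": "gpu"}],
}
drift = drifted_tables(local, remote)
```

Only the tables listed in `drift` would need the hard copy from the authoritative host; tables that merely differ in row or key order hash identically.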

8.6 OS image synchronisation between controllers

Replaying a method (§8.4) keeps database state identical, but an OS image is also files on disk — a packed imagefile/kernelfile/initrdfile under FILES.IMAGE_FILES, plus an unpacked chroot tree under FILES.IMAGE_DIRECTORY. Those bytes must physically move between controllers. Luna does this as a journalled task chain, not a raw copy, so every controller performs (and records) the same steps.

Trigger. When an OSImage mutation (pack, clone_osimage, grab) replays on a peer, handle_requests calls queue_source_sync / queue_target_sync / queue_source_sync_by_node_name, which enqueue a parked sync_osimage_with_master task (subsystem osimage) carrying <osimage>:<master>.

Execution. tasks_mother (§16.1) picks up sync_osimage_with_master and issues a chain of journalled sub-requests, so the sync is itself replicated consistently:

  1. OsImager.schedule_cleanup — clear the target image area;
  2. OSImage.update_osimage (payload with kernelfile/initrdfile/imagefile) — bring DB metadata in line with the master;
  3. Downloader.pull_image_files: download the kernel, initrd and image tarball from the master over the authenticated /files/ endpoint (Request().download_file) into FILES.IMAGE_FILES;
  4. OsImager.schedule_provision — schedule the provisioning artefacts;
  5. only if HA().get_syncimages() is true — queue unpack_osimage:<osimage>, which unpacks the tarball into the chroot tree under IMAGE_DIRECTORY. With syncimages=false the files are pulled but the unpacked filesystem is not replicated.
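
As a compact restatement of the chain above, the sketch below turns a parked task payload into the ordered list of steps. It only models the ordering and the syncimages condition; `parse_sync_task` and `sync_chain` are hypothetical helpers, the image and controller names are made up, and nothing here calls the real OsImager, OSImage or Downloader classes.

```python
from typing import List, Tuple


def parse_sync_task(task: str) -> Tuple[str, str]:
    """Split the '<osimage>:<master>' payload carried by the parked task."""
    osimage, master = task.split(":", 1)
    return osimage, master


def sync_chain(task: str, syncimages: bool) -> List[str]:
    """Ordered sub-requests for one sync_osimage_with_master task."""
    osimage, _master = parse_sync_task(task)
    chain = [
        "OsImager.schedule_cleanup",
        "OSImage.update_osimage",
        "Downloader.pull_image_files",
        "OsImager.schedule_provision",
    ]
    if syncimages:  # only then is the tarball unpacked into the chroot tree
        chain.append(f"unpack_osimage:{osimage}")
    return chain


with_unpack = sync_chain("compute:controller1", syncimages=True)
files_only = sync_chain("compute:controller1", syncimages=False)
```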

A second path, Downloader.pull_image_data, syncs the unpacked tree directly between controllers via the filesystem plugin's sync() (rsync -aHv --numeric-ids --one-file-system --delete-after …).

Are plugins relevant? Yes — centrally. The whole osimage/ plugin family abstracts how images are stored, copied and moved, so the sync is filesystem-agnostic:

Plugin | Selected by | Role in sync
osimage/filesystem/<x> | PLUGINS.IMAGE_FILESYSTEM (default: default) | performs the actual cross-controller copy: getpath, clone, sync (the rsync), extract (untar into the tree). Override it for ZFS/btrfs snapshotting or versioned read-only image stores
osimage/operations/image/<distro> | distribution → osrelease | how a chroot becomes a bootable image (kernel/initrd/tarball), i.e. the pack/build step
osimage/operations/osgrab/<…> | [node, distro, osimage, group] | grab an image from a running node
osimage/operations/ospush/<…> | [node, distro, osimage, group] | push / live-sync an image to nodes
boot/provision/<method> | provision_method | .create() prepares the downloadable artefact (http file, torrent) that nodes later fetch with the same plugin's fetch (§6.3)

Because the filesystem and operations plugins are chosen exactly as in §6.4 (by IMAGE_FILESYSTEM, distro/release, or node/group), image storage and replication can be tailored per site and per OS without touching daemon code.

syncimages (in the ha table) is the master switch for replicating image filesystem content; image metadata and the packed files are always kept in step through the journal + /files/ download. A manual resync can be triggered with GET /ha/syncimage/<name>.

8.7 Inter-controller API

Controllers talk over the normal REST API using short-lived tokens (utils/request.py: get_token, get_request, post_request, download_file):

Endpoint | Use
POST /journal | receive a batch of journal entries
GET /journal/<name> | hand a controller the entries queued for it
GET /journal/<name>/_delete | drop applied entries
GET /ha/state, /ha/master, /ha/controllers | inspect H/A status
GET /ha/master/_set, /ha/overrule/_set | force master / bypass insync
GET /table/hashes, /table/data/<name> | checksum compare + hard table copy
GET /ping | liveness for verify_pings

Mental model: the journal is a replicated to-do list of method calls; every controller drains the part addressed to it by re-running the same base method, and a table-checksum sweep is the backstop that guarantees eventual convergence even when individual journal deliveries are lost.


9. base/boot.py — the provisioning brain

The Boot class backs every /boot* endpoint. At import it pre-loads the switchport and cloud detection plugins; in __init__ it builds the boot and osimage plugin trees and resolves controller identity (IP, network, beacon).

Method | Endpoint | Output
default() | GET /boot | templ_boot_ipxe.cfg: the iPXE menu (ask / unassigned / choose / category / enter…)
boot_short() | GET /boot/short | templ_boot_ipxe_short.cfg
boot_disk() | GET /boot/disk | templ_boot_disk.cfg: local-disk boot
discover_mac(mac) | GET /boot/search/mac/<mac> | finds/creates the node for a MAC, renders templ_nodeboot.cfg
discover_group_mac(group, mac) | GET /boot/manual/group/… | picks the next free node in a group
discover_hostname_mac(host, mac) | GET /boot/manual/hostname/… | binds a chosen hostname to a MAC
install(node[, 'kickstart']) | GET /boot/install/<node> | assembles & returns templ_install.cfg

9.1 What node selection (discover_mac) does

  • Looks up the node owning the MAC (via nodeinterface join). If none, can auto-create/assign one with find_next_suitable_node() (honouring group provision_interface, nextnode, makeupname).
  • On (re)assignment, binds the MAC to the node's BOOTIF via Interface.change_node_interface (journalled for HA) and clears the MAC from any other node (clear_existing_mac).
  • Resolves IP/gateway (IPv4/IPv6/dhcp), OS image + tag (node overrides group), kernel options and the netboot flag — then renders templ_nodeboot.cfg.

9.2 What install() does

  • Re-resolves node, image/tag, kernel options; restarts dhcp/dhcp6 via the queue.
  • Branches on image type: regular → templ_install.cfg; kickstart → templ_nodeboot_kickstart.cfg; netboot=false → templ_boot_disk.cfg. It also honours verify_bootpause() so a node won't boot while its image is being packed.
  • Injects all plugin segments (provision fetch, network init/interface/gateway/dns, bmc config, script pre/part/post), sets state install.rendered, returns the assembled template + all render variables.
  • If any required value is None, returns a rendered failure template (templ_boot_failed.cfg) with a human-readable reason.
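
The last bullet, falling back to the failure template when a required value is missing, can be sketched as a simple guard. The helper name `pick_template`, the keys treated as required and the wording of the reason are all assumptions for illustration; the kickstart and netboot=false branches are not modelled here.

```python
from typing import Any, Dict, Iterable, Optional, Tuple


def pick_template(data: Dict[str, Any], required: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Return the template to render and, on failure, a human-readable reason."""
    missing = [key for key in required if data.get(key) is None]
    if missing:
        return "templ_boot_failed.cfg", "missing value(s): " + ", ".join(missing)
    return "templ_install.cfg", None


# Hypothetical required keys, chosen only to make the example concrete.
REQUIRED = ("imagefile", "provision_method", "systemroot")

complete = pick_template(
    {"imagefile": "compute.tar.bz2", "provision_method": "http", "systemroot": "sysroot"},
    REQUIRED,
)
incomplete = pick_template({"imagefile": None, "provision_method": "http"}, REQUIRED)
```

Collecting every missing key before returning gives the operator one complete reason instead of a fix-and-retry loop over individual fields.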

10. The installer template: templ_install.cfg

daemon/templates/templ_install.cfg is the bash script that runs inside the node's initramfs (dracut) to install the OS image. It is delivered by GET /boot/install/<node> after the daemon (a) injects the plugin segments and (b) Jinja2-renders the node-specific variables.

10.1 How it is rendered

  1. routes/boot.py:boot_install (token-protected) calls Boot().install(node).
  2. base/boot.py:install() reads templ_install.cfg, injects plugin segments (§6.3), sets node state install.rendered, returns data['template_data'].
  3. The route renders it with render_template_string(...), passing ~30 LUNA_* / NODE_* / PROVISION_* variables.

10.2 Key rendered variables

Variable | Source / meaning
NODE_NAME, NODE_HOSTNAME | resolved node identity
LUNA_GROUP, LUNA_DISTRIBUTION, LUNA_OSRELEASE | group + OS image metadata
LUNA_IMAGEFILE | image tarball to fetch and unpack
LUNA_SYSTEMROOT | target root (e.g. sysroot)
PROVISION_METHOD / PROVISION_FALLBACK | primary + backup provision plugin names
LUNA_INTERFACES | per-interface config (rendered by network plugin)
LUNA_PRESCRIPT/PARTSCRIPT/POSTSCRIPT | base64 user scripts
LUNA_ROLES / LUNA_SCRIPTS | comma-lists fetched at install time
LUNA_SETUPBMC / LUNA_BMC | BMC config block (optional)
LUNA_TOKEN | JWT used by the installer to call back to the API
LUNA_API_PROTOCOL/PORT, WEBSERVER_* | controller endpoints

10.3 Execution order (bottom of the script)

lunainit                 # create /lunatmp and /sysroot
dynamic_ip_check         # if DHCP boot, POST real IP back to /config/node/<name>
node_scripts             # (if LUNA_SCRIPTS) fetch pre/part/post script bodies
prescript                # run pre script + custom 'pre' scripts
bmcsetup                 # (if LUNA_SETUPBMC) configure BMC channel/credentials
partscript               # partition / filesystem creation
download_image           # PROVISION_METHOD then PROVISION_FALLBACK (provision plugin 'fetch')
unpack_imagefile         # tar -I lbzip2 extract image into /sysroot (ACL-aware)
collect_mac_n_name_net   # map MACs <-> interface names
change_net               # write NetworkManager connections (network plugin segments)
node_secrets             # GET /config/secrets/node/<name> -> write secret files (chmod 600)
postscript               # run post script + custom 'post' scripts
node_roles               # GET /boot/roles/<role> -> install + enable luna-<role>.service
fix_capabilities         # restore file capabilities (ping, arping…)
restore_selinux_context  # setfiles relabel if SELinux policy present
update_system_info       # dmidecode vendor/serial -> POST /config/node/<name>
cleanup                  # wipe /sysroot/tmp
update_status "install.success"

Throughout, update_status "<state>" POSTs to /monitor/node/<name> so the controller (and luna2-cli) show live progress: install.prescript, install.download, install.unpack, install.setnet, install.secrets, install.roles, install.success, etc.
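
The download_image step above tries PROVISION_METHOD first and PROVISION_FALLBACK second. The real installer is a bash script; the Python below is only a minimal sketch of that try-then-fall-back control flow. The function name, the fetcher callables and the method names used in the usage line are hypothetical stand-ins, not the provision plugin API.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-in for a provision plugin's 'fetch' step:
# takes an image file name and returns True on success.
Fetcher = Callable[[str], bool]


def download_image(imagefile: str, method: str, fallback: Optional[str],
                   fetchers: Dict[str, Fetcher]) -> str:
    """Try the primary provision method, then the fallback.

    Returns the name of the method that succeeded, or raises RuntimeError.
    """
    for name in (method, fallback):
        if not name or name not in fetchers:
            continue  # unset or unknown method: skip it
        try:
            if fetchers[name](imagefile):
                return name
        except OSError:
            pass  # treat an I/O error like a failed fetch and move on
    raise RuntimeError(f"could not fetch {imagefile}")
```

With a failing primary and a working fallback, download_image("img.tar.bz2", "torrent", "http", fetchers) would return "http".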


11. How a node typically boots (end to end)

Figure: How a node boots, end to end.

A re-provision is just this sequence again. A node configured with netboot=false short-circuits at Stage 2/4 to templ_boot_disk.cfg and boots its local disk instead of re-installing.


12. luna2-client — the node-side pre-pivot hook

luna2-daemon produces and serves boot artefacts; luna2-client is what actually runs them on the node. It is the node-side half of the boot story, shipped from a separate GitLab repo (clustervision/luna2-client, branch main). It is not a Python service — it is a dracut module named 95luna baked into every OS image as an OS package.

12.1 Packaging (deb / rpm)

One package per distro-family × architecture; all carry the same payload:

Path | Distro family | Arch | Builder
redhat/x86, redhat/arm | RHEL / Rocky / Alma | x86_64 / aarch64 | create_rpm.sh + luna2-client.spec
opensuse/x86, opensuse/arm | openSUSE / SLES | x86_64 / aarch64 | create_rpm.sh + luna2-client.spec
ubuntu/x86, ubuntu/arm | Ubuntu / Debian | x86_64 / aarch64 | create_deb.sh

The package delivers the 95luna dracut module under /usr/lib/dracut/modules.d/95luna/, an /etc/dracut.conf.d/luna2.conf that pulls extra tools into the initramfs, a /usr/sbin/dhclient-script-luna, and libluna-fakeuname.so. The spec/control file pulls in the runtime tools the installer needs (curl, aria2, tar, lbzip2, parted, gdisk, lvm2, mdadm, ipmitool, tpm2-tools, dropbear, dmidecode, jq…).

How it reaches the node: luna2-client package → installed into the OS image → dracut bakes the 95luna module into that image's initrd → the daemon serves that initrd as OSIMAGE_INITRDFILE in templ_nodeboot.cfg. So the initrd a node netboots already knows how to talk to luna2-daemon.

12.2 The two dracut hooks (where the pre-pivot start is wired in)

module-setup.sh registers two hooks and bakes node-side credentials/keys into the initramfs:

  • inst_hook cmdline 99 luna-parse-cmdline.sh — runs in dracut's cmdline phase. Its whole job is [ $root = "luna" ] && rootok=1, telling dracut the synthetic root=luna (set by templ_nodeboot.cfg) is valid so it won't hang waiting for a real root device.
  • inst_hook initqueue/finished 99 luna2-start.sh — runs at the end of the initqueue, once the network device has settled but before dracut pivots (switch_root) into the real root. This is the initial pre-pivot luna start.

It also bakes in: the node-side client config /trinity/local/luna/node/config/luna.ini (holds API_USERNAME/API_PASSWORD), the host SSH keys + /root/.ssh/authorized_keys (so an operator can SSH/dropbear into a node during install), and the CA bundle.

12.3 luna2-start.sh — what the pre-pivot hook does

Guarded by if [ "x$root" = "xluna" ], so it only fires on Luna-provisioned boots:

  1. luna_init / luna_start — set hostname from luna.hostname; start dropbear (or sshd) and a rescue shell on tty2 for live debugging; match luna.mac to a NIC (find_nic) and bring networking up — DHCP (luna.bootproto=dhcp) or static luna.ip + luna.gw. Exposes LUNA_BOOTIF/LUNA_BOOTPROTO for the installer.
  2. fetch_token — read API_USERNAME/API_PASSWORD from the baked-in luna.ini, read the TPM PCR (tpm2_pcrread sha256:0), and POST {tpm_sha256,username,password} to ${luna.url}/tpm/${luna.node}. The daemon (base/authentication.py) matches or first-boot-registers the node's tpm_sha256 and returns a JWT. This is how a node authenticates itself before it owns any secrets.
  3. fetch install script — loop GET ${luna.url}/boot/install/${luna.node} with x-access-tokens: $token, retrying every 10 s until HTTP 200. That body is the daemon-assembled templ_install.cfg (§10).
  4. run it → /bin/sh /lunatmp/install.sh. On success the loop ends; on failure the whole token→fetch→run loop repeats, forwarding /tmp/luna_install.log to luna.loghost via logger.
  5. service mode — if luna.service=1, skip install entirely and drop to an interactive shell (rescue / manual work).
  6. luna_finish — forward logs, kill sshd/dropbear/dhclient, flush + down the NIC, move the install log to /sysroot/var/log/, kill the tty2 shell. Control returns to dracut, which switch_roots into the freshly installed /sysroot — the pivot.
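
The token→fetch→run loop in steps 2 to 4 can be sketched as follows. luna2-start.sh is a shell script, so this Python is an illustration only: the three callables are hypothetical stand-ins for the curl and /bin/sh calls, and the single retry delay simplifies the real script, which retries the fetch on its own every 10 s.

```python
import time
from typing import Callable, Optional


def provision_loop(fetch_token: Callable[[], Optional[str]],
                   fetch_script: Callable[[str], Optional[str]],
                   run_script: Callable[[str], int],
                   retry_delay: float = 10.0,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Repeat token -> fetch -> run until the install script exits 0.

    Returns the number of attempts that were needed.
    """
    attempts = 0
    while True:
        attempts += 1
        token = fetch_token()                      # POST /tpm/<node>
        script = fetch_script(token) if token else None  # GET /boot/install/<node>
        if script is not None and run_script(script) == 0:
            return attempts
        sleep(retry_delay)  # back off, then repeat the whole loop
```

Passing the steps in as callables keeps the loop testable without a controller or a TPM.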

12.4 Contract between the two halves

The daemon side (base/boot.py → templ_install.cfg) only produces the install script; it relies on luna2-client to (a) accept root=luna, (b) bring up the right NIC by MAC, (c) authenticate via /tpm/<node>, and (d) fetch and execute the script before pivot. The kernel arguments that glue them together are emitted by templ_nodeboot.cfg (rendered by Boot().discover_mac) and consumed by luna2-start.sh:

Kernel arg | Set by (daemon) | Used by (luna2-client)
root=luna / boot=ramdisk | templ_nodeboot.cfg | luna-parse-cmdline.sh (rootok=1)
luna.url | controller API endpoint | /tpm/<node> + /boot/install/<node> calls
luna.node / luna.hostname | resolved node | node identity, hostname
luna.mac | node BOOTIF MAC | find_nic → bring up NIC
luna.ip / luna.gw / luna.bootproto | node IP config | static vs DHCP network setup
luna.loghost | controller / beacon | remote log forwarding via logger
luna.service | node service flag | 1 → service/rescue shell, skip install
luna.verifycert | API verify-certificate setting | --insecure toggle on curl
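
To make the contract concrete, here is a small sketch of how the root= and luna.* arguments in the table could be pulled out of a kernel command line. The real consumer is the shell code in luna2-start.sh; the function names below are hypothetical and exist only for illustration.

```python
import shlex
from typing import Dict


def parse_luna_cmdline(cmdline: str) -> Dict[str, str]:
    """Collect root= and luna.* key=value arguments from a kernel command line."""
    args: Dict[str, str] = {}
    for word in shlex.split(cmdline):
        key, sep, value = word.partition("=")  # split on the first '=' only
        if sep and (key == "root" or key.startswith("luna.")):
            args[key] = value
    return args


def is_luna_boot(args: Dict[str, str]) -> bool:
    """Mirror the guard in luna2-start.sh: only act when root=luna."""
    return args.get("root") == "luna"
```

On a node, the input would be the contents of /proc/cmdline; any argument outside the table is ignored.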

luna2-client is intentionally OS-image-resident and arch/distro-specific (note the main-arm, *-opensuse, twans_new branches). When adding support for a new distro you typically pair a new boot/network/<distro>.py plugin on the daemon (§6.4) with a matching luna2-client build so the baked-in initrd has the right tools.


13. Worked examples — request walkthroughs

These tie the layers together. Inline markers (rendered faintly) show where each subsystem is incorporated:

(HA) replication touchpoint · (PLUGIN) a plugin is selected & loaded · (QUEUE) handed to a background mother (§16.1) · (SVC) service config re-rendered from a template. Status/return behaviour follows §15.

13.1 luna osimage pack → GET /config/osimage/<name>/_pack

luna osimage pack — request flow

Packs a chroot image tree into a bootable kernel / initrd / tarball.

  1. Route config_osimage_pack (@token_required). (HA) checks HA().get_hastate() / get_role():
  2. not master → Journal().add_request("OSImage.pack", object=name, masteronly=True, misc=request_id) and returns a request_id: the master does the packing; the CLI polls status by id. (HA)
  3. master (or non-HA) → continue locally.
  4. OSImage().pack(name) clears the changed flag, then (QUEUE) add_task_to_queue("pack_n_build_osimage", subsystem="osimage"); if first in line it wakes osimage_mother_wrapper() in a child process.
  5. (QUEUE) osimage_tasks_mother / the child runs pack_osimage:
  6. (PLUGIN) osimage/filesystem/<IMAGE_FILESYSTEM> → getpath/mount the tree (override for ZFS/btrfs);
  7. (PLUGIN) osimage/operations/image/<distribution>·<osrelease> → the distro-specific pack (kernel, initrd, tarball into IMAGE_FILES).
  8. Progress streams to /monitor / the status table under request_id; the CLI tails it (§15).
  9. (HA) on success the route calls Journal().queue_source_sync(name, request_id) → a sync_osimage_with_master task so every peer pulls the new files (and unpacks them if syncimages, §8.6).

Plugins: filesystem + per-distro image operation. HA: master-routing up front, file replication afterwards.

13.2 luna node show → GET /config/node/<name>

luna node show — request flow

The read path — deliberately simple, no side effects.

  1. Route config_node_get (@token_required @validate_name) → Node().get_node(name).
  2. base assembles the dataset: Database().get_record_join over node ⋈ group ⋈ nodeinterface …, applying node-overrides-group inheritance (osimage, kernel options, bmc, scripts, netboot).
  3. Returns JSON; the CLI renders the table.

Plugins: none. HA: none for the read — it is served straight from this controller's local replica, which the journal keeps in step with its peers, so any controller answers identically.
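
The node-overrides-group inheritance in step 2 boils down to "use the node's value if set, else the group's". A minimal sketch, assuming rows are plain dicts and that None means "not set on the node"; the field names are illustrative, not the actual column names:

```python
from typing import Any, Dict, Iterable

# Fields the text lists as inherited; these key names are assumptions.
INHERITED = ("osimage", "kernel_options", "bmc", "scripts", "netboot")


def effective_node(node: Dict[str, Any], group: Dict[str, Any],
                   fields: Iterable[str] = INHERITED) -> Dict[str, Any]:
    """Return a copy of the node row with unset fields filled from its group."""
    merged = dict(node)
    for field in fields:
        if merged.get(field) is None:  # None means "not set on the node"
            merged[field] = group.get(field)
    return merged
```

Note that an explicit False on the node (for example netboot=False) is kept; only a missing value falls through to the group.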

13.3 luna node change → POST /config/node/<name>

luna node change — request flow

  1. Route config_node_post (@provision_token_required — so in-install scripts may also update — @validate_name, @input_filter(checks=['config:node'])).
  2. (HA) Journal().add_request("Node.update_node", object=name, payload=request.data) first — this records the entry for replication and gates on insync (a not-yet-synced controller refuses the write). If it returns true:
  3. Node().update_node(name, data) runs locally: validates and writes the node/interface rows, then:
  4. (QUEUE) enqueues restart dhcp, restart dhcp6, reload dns, and a run_bulk node:master task;
  5. (PLUGIN) loads hooks/config/node and calls the node hook so sites can react to the change.
  6. (HA) on every peer, handle_requests replays Node.update_node(name, payload) — the same method, queue calls and hook run there too, so all controllers converge.

Plugins: hooks/config/node. HA: journalled write + identical replay; (QUEUE) decouples the dhcp/dns regeneration.

13.4 luna network change → POST /config/network/<name>

luna network change — request flow

  1. Route config_network_post (@token_required @validate_name @input_filter(checks=['config:network'])).
  2. (HA) Journal().add_request("Network.update_network", object=name, payload=request.data) → if accepted:
  3. Network().update_network(name, data): validates DHCP ranges and the mutually-exclusive DHCP modes, writes the network row, may re-allocate IPs (QUEUE), then:
  4. (SVC) Service().queue('dns','reload'), Service().queue('dhcp','restart'), Service().queue('dhcp6','restart') → the DHCP/DNS configs are re-rendered from the templ_kea-dhcp4/6.cfg and templ_dns_* templates and the services bounced.
  5. (HA) replayed on every peer, so each controller re-renders its own Kea/named config and restarts its own dhcp/dns — there is no shared config file.

Plugins: none directly (templates, not plugins, drive dhcp/dns output). HA: journalled write + per-controller service regeneration.

13.5 luna control / lpower — power one node, and a bunch (threaded)

luna control / lpower — single and threaded batch

Two endpoints, one plugin path. Power is a live action, so, unlike §13.3/13.4, it is not journalled.

One node (synchronous): lpower n1 on → GET /control/action/power/n1/_on:

  1. Route control_action_get (@token_required) → Control().control_action("n1","power","on"), which blocks until done.
  2. base joins node ⋈ BMC interface ⋈ bmcsetup (node's bmcsetupid, else the group's) to get the BMC IP + credentials.
  3. (PLUGIN) NodeControl.control_action → plugin_load("control", [nodename, group]) selects the power backend (default ipmitool, or dell) in node→group→default order → control_plugin().power_on(...) → (True, "…Up/On").
  4. (PLUGIN) plugin_load("hooks/control", [nodename, group]) runs the optional site hook; on success Monitor().update_nodestatus.
  5. returns (status, {"control": {"power": …}}) → HTTP 204 / 200 / 501. No (HA) — it runs on whichever controller received it.

A bunch (threaded, asynchronous): lpower node[001-064] on → POST /control/action/power/_on (body lists the nodes):

  1. Route control_action_post (@token_required @input_filter(checks=['control'])) → Control().bulk_action(data).
  2. base builds a pipeline of nodes, reads BMC_BATCH_SIZE / BMC_BATCH_DELAY (the BMCCONTROL config, §17.1), generates a request_id, and (QUEUE) spawns NodeControl().control_mother(pipeline, request_id, size, delay) in a thread — then returns the request_id immediately (§15).
  3. control_mother loops while the pipeline has nodes: a ThreadPoolExecutor(max_workers=10) fires control_child ×batch, each popping a node and running the same (PLUGIN) control/[node,group] path; per-node results (n012:power on:True) are appended to the status table; it sleeps delay between batches to spare the BMCs, then writes EOF.
  4. the CLI polls GET /control/status/<request_id> and streams per-node outcomes until EOF (§15).

Plugins: control + hooks/control, selected per node→group→default. HA: none (live action). Scale: BMC_BATCH_SIZE/BMC_BATCH_DELAY throttle how many BMCs are hit at once.
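
The batching behaviour of control_mother can be sketched with the standard library alone. This is a simplified illustration, not the daemon's code: it slices the pipeline into batches instead of letting each child pop a node, it returns the messages instead of writing them to the status table, and the action callable is a hypothetical stand-in for the control plugin path.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def control_mother(pipeline: List[str],
                   action: Callable[[str], Tuple[bool, str]],
                   batch_size: int, delay: float,
                   sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Run `action` over all nodes in batches; return the status messages."""
    messages: List[str] = []
    nodes = list(pipeline)
    with ThreadPoolExecutor(max_workers=10) as pool:
        while nodes:
            batch, nodes = nodes[:batch_size], nodes[batch_size:]
            # pool.map keeps results aligned with the order of the batch
            for node, (ok, _msg) in zip(batch, pool.map(action, batch)):
                messages.append(f"{node}:power on:{ok}")
            if nodes:
                sleep(delay)  # spare the BMCs between batches
    messages.append("EOF")  # terminator the CLI polls for
    return messages
```

batch_size and delay play the role of BMC_BATCH_SIZE and BMC_BATCH_DELAY.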


14. Startup — first boot, bootstrap & schema

What the daemon does the first time it starts (and re-checks on every start): it validates or seeds the cluster from a bootstrap file, against a declarative database schema. validate_bootstrap() is invoked from on_starting (§16) before any worker forks.

14.1 bootstrap.py — first-run cluster bootstrap

validate_bootstrap() is called by luna.py:on_starting before any worker forks (§16). It:

  1. checks the database is reachable (db_status() detects the driver — SQLite / MySQL / PostgreSQL) and whether the schema already exists (DBStructure().check_db_tables());
  2. if /trinity/local/luna/daemon/config/bootstrap.ini is present and the tables are empty, parses it and seeds the initial cluster — HOSTS (hostname, controller, nodelist, domain), NETWORKS (cluster/ipmi/ib), GROUPS, OSIMAGE, BMCSETUP;
  3. runs post-start fixes regardless: verify_and_set_beacon(), legacy_and_forward_fixes(), cleanup_queue_and_status(), cleanup_and_init_ping().

Returns False (daemon exits) if the DB is unavailable, or if the schema needs init but bootstrap.ini is missing. After a successful bootstrap the operator removes bootstrap.ini.

14.2 database_layout.py — declarative schema

A pure-data module: ~30 DATABASE_LAYOUT_<table> lists, each describing a table's columns/types/keys. utils/dbstructure.py reads these to create and upgrade tables (driven by the bootstrap table check, §14.1). Adding a column = editing the layout list; there are no separate migration scripts.

Tables defined: node, group, network, nodeinterface, groupinterface, ipaddress, reservedipaddress, osimage, osimagetag, bmcsetup, switch, otherdevices, rack, rackinventory, cloud, controller, cluster, dns, roles, nodesecrets, groupsecrets, user, monitor, status, queue, tracker, journal, ha, ping, reference.
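
The "edit the layout list, no migration scripts" idea can be illustrated with SQLite. The layout shape below (dicts with column/datatype/key) is an assumption made for this sketch; the real DATABASE_LAYOUT_<table> entries and utils/dbstructure.py may differ.

```python
import sqlite3
from typing import Dict, List

# Hypothetical layout shape; the real DATABASE_LAYOUT_<table> format may differ.
DATABASE_LAYOUT_rack = [
    {"column": "id", "datatype": "INTEGER", "key": "PRIMARY"},
    {"column": "name", "datatype": "TEXT"},
]


def existing_columns(conn: sqlite3.Connection, table: str) -> List[str]:
    """Column names currently present (empty list if the table is missing)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def sync_table(conn: sqlite3.Connection, table: str,
               layout: List[Dict[str, str]]) -> None:
    """Create the table if absent, else add any columns the layout gained."""
    have = existing_columns(conn, table)
    if not have:
        cols = ", ".join(
            f'"{c["column"]}" {c["datatype"]}'
            + (" PRIMARY KEY" if c.get("key") == "PRIMARY" else "")
            for c in layout
        )
        conn.execute(f'CREATE TABLE "{table}" ({cols})')
        return
    for c in layout:
        if c["column"] not in have:
            conn.execute(
                f'ALTER TABLE "{table}" ADD COLUMN "{c["column"]}" {c["datatype"]}'
            )
```

Calling sync_table a second time with an extra entry in the layout adds just that column, which is the upgrade path the paragraph above describes.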


15. The return contract — how status & messages travel

Every layer of luna2-daemon speaks the same shape: a method returns (status, message_or_data) — a boolean plus a string or dict — and, for work that outlives the HTTP request, a third element request_id. This single convention is what lets a message born in the deepest place (a plugin, a child thread on a peer controller) surface verbatim at the CLI, with no per-endpoint plumbing.

The tuple, layer by layer (a power-on, deepest → outermost):

control plugin   power_on()           -> (True, "Chassis Power Control: Up/On")   (PLUGIN)
utils  NodeControl.control_action()   -> (True, "Chassis Power Control: Up/On")
base   Control.control_action()       -> (True, {"control": {"power": "...Up/On"}})
route  control_action_get()           -> json body + HTTP 204 / 200 / 501
CLI    lpower n1 on                    -> prints the message

base returns either a 2-tuple (status, response) or, for deferred work, a 3-tuple (status, response, request_id). The route turns status into an HTTP code — often through Helper().get_access_code(status, response) — and JSON-encodes response.

Two channels:

  • Synchronous — the (status, message) tuple bubbles straight up and is the HTTP body. Used for reads and quick mutations (node show, node change, network change).
  • Asynchronous status stream — for queued/threaded/long work (osimage pack, bulk lpower), the base returns immediately with (True, response, request_id) while the deep code keeps appending progress to the status table:

Status().add_message(request_id, "lpower", "node005:power on:True", status=200)
…
Status().add_message(request_id, "lpower", "EOF")   # terminator

The CLI then polls GET /control/status/<request_id> (or /status/<request_id>); the daemon replays the accumulated messages (deleting them as they are read) until it sees EOF. This is how a line emitted by a child thread — or by a plugin running inside osimage_mother — reaches the user after the original request already returned.
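
A minimal in-memory sketch of this read-and-delete stream follows. The daemon keeps these rows in the database status table; the StatusTable class and its two-argument add_message here are hypothetical simplifications for illustration.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple


class StatusTable:
    """In-memory stand-in for the status table."""

    def __init__(self) -> None:
        self._rows: Dict[str, Deque[str]] = defaultdict(deque)

    def add_message(self, request_id: str, message: str) -> None:
        """Append one progress line for a request."""
        self._rows[request_id].append(message)

    def read(self, request_id: str) -> Tuple[List[str], bool]:
        """Return (messages, done); messages are deleted as they are read."""
        queue = self._rows[request_id]
        out: List[str] = []
        while queue:
            message = queue.popleft()
            if message == "EOF":
                return out, True  # terminator seen: the client can stop polling
            out.append(message)
        return out, False
```

A client would call read() repeatedly, printing each batch of messages, until done is True.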

Across HA. When a request was journalled and replayed on a peer (§8.4), handle_requests captures the replayed method's (status, message, request_id) and calls Status().forward_status_request(...), so progress produced on the master is forwarded back to the controller the user actually talked to. The return contract therefore spans controllers, not just layers.

Why it matters: because every boundary — plugin, utils, base, route — returns the same (status, message[, request_id]), any layer can answer directly or defer to the stream, and an error string from the lowest plugin reaches the CLI unaltered. New endpoints inherit this for free.


16. Process lifecycle & background workers

Gunicorn drives the daemon through hooks defined in luna.py and wired in gunicorn.py:

Hook | When | What luna.py does
on_starting | master, before fork | validate bootstrap; enqueue initial dhcp/dhcp6/dns (re)starts; fire hooks/luna startup plugin
post_worker_init | each worker, after fork | elect one worker (non-blocking flock on /var/lib/luna2-daemon-background.lock) to run the singleton background threads
worker_exit | worker exits | stop that worker's background threads
on_reload | service reload | drain the queue
on_exit | service stop | fire hooks/luna shutdown plugin
worker_abort | worker timeout | dump a traceback for debugging

The elected worker starts these Housekeeper / PluginSync "mother" threads (start_background_workers()):

  • status-message cleanup (cleanup_mother)
  • queue housekeeper / task drain (tasks_mother)
  • switch/port/MAC detection (switchport_scan)
  • boot-plugin sync between HA controllers (boot_plugins_mother)
  • journal / replication (journal_mother), see §8
  • invalid-config sweep (invalid_config_mother)
  • osimage tasks (osimage_tasks_mother)
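
The post_worker_init election relies on a non-blocking exclusive flock: the first worker to take the lock runs the background threads, every other worker gets an immediate refusal and carries on serving requests. A minimal Linux-only sketch of that pattern; the function name is hypothetical and the daemon's own code may be structured differently:

```python
import fcntl
import os
from typing import Optional


def try_elect(lock_path: str) -> Optional[int]:
    """Try to take an exclusive, non-blocking flock on lock_path.

    Returns the open file descriptor if this caller won, else None.
    The winner must keep the descriptor open while it holds the role.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)  # another holder already has the lock
        return None
    return fd
```

Because the kernel drops the lock when the holder's descriptor closes (including on process exit), a crashed worker never leaves a stale election behind.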

16.1 The background "mothers"

utils/housekeeper.py (Housekeeper) defines most of these singleton loops; each *_mother runs while True guarded by the shared stop event, and publishes its health to /monitor (the mother item, 200/500). They run only on the flock-elected worker and, where relevant, only while the controller is HA-insync and/or master.

Thread | Cadence | Responsibility
tasks_mother | drains the housekeeper queue continuously | the workhorse dispatcher, a match over queued tasks: service restart/reload (dhcp, dhcp6, dns), sync_osimage_with_master (§8.6), provision_osimage, unpack_osimage, remote osimage removal, etc.
osimage_tasks_mother | periodic, master-only | waits for insync; if not master, clears queued osimage tasks and returns; expires stale tasks; spawns the heavy image worker (OsImage().osimage_mother_wrapper) in a separate process to run pack/build/grab/push/copy
cleanup_mother | periodic | expires old status / message buffers so the status table doesn't grow unbounded
switchport_scan | periodic | loads the switchport detection plugin and scans configured switches to learn which MAC sits on which port; feeds node identification at boot (§9.1)
invalid_config_mother | periodic | sweeps for incomplete/invalid configuration (e.g. nodes/interfaces missing required data) and flags it
journal_mother | ~5 s | HA replication loop (§8.5)
boot_plugins_mother | periodic | lives in utils/plugin_sync.py (PluginSync), not housekeeper; replicates boot-plugin files between HA controllers

Each loop is defensive: exceptions are caught, logged with line numbers, and surfaced as a 500 on the mother monitor item; on the next clean pass it flips back to 200. The heavy osimage work is deliberately pushed to a child process (not just a thread) so a crash or signal there can't take down the worker.
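
The defensive loop shape described above can be sketched as follows. This is an illustration under assumptions, not the Housekeeper code: the health dict stands in for the /monitor mother item, print stands in for the logger, and run_mother is a hypothetical name.

```python
import threading
from typing import Callable, Dict


def run_mother(name: str, work: Callable[[], None], stop: threading.Event,
               health: Dict[str, int], interval: float = 5.0) -> None:
    """Call `work` repeatedly until `stop` is set, recording 200 or 500."""
    while not stop.is_set():
        try:
            work()
            health[name] = 200  # clean pass flips the item back to healthy
        except Exception as exc:  # keep the loop alive on any failure
            health[name] = 500
            print(f"{name}: {exc!r}")
        stop.wait(interval)  # sleeps, but wakes early when stop is set
```

Using stop.wait() instead of time.sleep() lets worker_exit end the thread promptly rather than waiting out a full interval.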


17. The common package — constants & validation

daemon/common/ is the foundation imported by almost every other module. It is not a request-handling layer; it provides process-wide configuration and the auth/input decorators. (The first-run bootstrap and the database schema — also part of common/ — are covered separately under Startup, §14.)

17.1 constant.py → CONSTANT, LOGGER, LUNAKEY

Imported once at process start; reads /trinity/local/luna/daemon/config/luna.ini into the global CONSTANT dict and exposes the shared LOGGER and LUNAKEY. Everything else simply does from common.constant import CONSTANT, LOGGER.

  • Parses the ini into a fixed section schema (below) and sanity-checks that the log dir, image dir, template dirs, plugin dir and keyfile exist and are writable.
  • Normalises human durations to seconds — EXPIRY (h/m/s, default 24h), COOLDOWN, MAXPACKAGINGTIME, BMC_BATCH_DELAY.
  • Loads LUNAKEY from FILES.KEYFILE — the symmetric key used to encrypt/decrypt node & group secrets.
  • Copies the template files listed in TEMPLATES.TEMPLATE_LIST (a JSON manifest) into the working TMP_DIRECTORY.

CONSTANT section | Key settings
LOGGER | LEVEL, LOGFILE
API | USERNAME, PASSWORD, EXPIRY, SECRET_KEY, PROTOCOL, ENDPOINT
DATABASE | DRIVER, DATABASE, DBUSER, DBPASSWORD, HOST, PORT
FILES | KEYFILE, IMAGE_FILES, IMAGE_DIRECTORY, TMP_DIRECTORY, MAXPACKAGINGTIME
PLUGINS | PLUGINS_DIRECTORY, IMAGE_FILESYSTEM
SERVICES | DHCP, DNS, CONTROL, COOLDOWN, COMMAND
DHCP | OMAPIKEY
BMCCONTROL | BMC_BATCH_SIZE, BMC_BATCH_DELAY
TEMPLATES | TEMPLATE_FILES, TEMPLATE_LIST, TMP_DIRECTORY, VARS

CONSTANT is read at import only — editing luna.ini requires a daemon restart to take effect.
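
As an illustration of the duration normalisation mentioned above (EXPIRY, COOLDOWN and friends), a small sketch follows. The accepted grammar (one integer with an optional h/m/s suffix, bare numbers meaning seconds) is an assumption; constant.py may accept other forms.

```python
import re

_UNITS = {"h": 3600, "m": 60, "s": 1}


def to_seconds(value: str, default: str = "24h") -> int:
    """Convert strings such as '24h', '30m' or '45s' to seconds.

    A bare number is taken as seconds; an empty value uses `default`.
    """
    text = (value or default).strip().lower()
    match = re.fullmatch(r"(\d+)\s*([hms]?)", text)
    if not match:
        raise ValueError(f"unrecognised duration: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS.get(unit, 1)
```

Normalising once at import means the rest of the code can compare plain integers and never has to parse a unit again.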

17.2 validate_auth.py — token decorators

JWT (HS256, signed with API.SECRET_KEY) verification wrappers used by the routes:

Decorator Use
@token_required standard protected endpoints; rejects a provision-scoped token (403) — those may only reach install endpoints
@provision_token_required endpoints the in-install kickstart must reach; accepts an admin token or a node-scoped provision token whose node claim matches the requested node/name
@agent_check sets cli=True/False from the User-Agent (Luna2-web ⇒ GUI, else CLI)

This is the JWT consumed by luna2-client after it authenticates at /tpm/<node> (§12.3).
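
For readers unfamiliar with HS256, the check these decorators perform can be sketched with the standard library. This is not the daemon's implementation (which presumably uses a JWT library): it skips header and expiry validation, and the scope claim name is hypothetical. Only the node claim is taken from the description above.

```python
import base64
import hashlib
import hmac
import json
from typing import Any, Dict, Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64url(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(claims: Dict[str, Any], secret: str) -> str:
    """Build a compact HS256 token (used here only to feed the demo)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url(json.dumps(claims).encode())
    mac = hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256)
    return f"{header}.{body}.{_b64url(mac.digest())}"


def verify(token: str, secret: str) -> Optional[Dict[str, Any]]:
    """Return the claims if the signature checks out, else None."""
    try:
        header, body, signature = token.split(".")
        mac = hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256)
        if not hmac.compare_digest(_b64url(mac.digest()), signature):
            return None
        return json.loads(_unb64url(body))
    except ValueError:  # wrong segment count, bad base64 or bad JSON
        return None


def provision_allowed(claims: Dict[str, Any], requested_node: str) -> bool:
    """Admin tokens pass; provision tokens must match the requested node."""
    if claims.get("scope") != "provision":  # hypothetical claim name
        return True
    return claims.get("node") == requested_node
```

The node comparison is what stops a token issued to one node from being replayed against another node's endpoints.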

17.3 validate_input.py — boundary input sanitation

The security layer applied at the HTTP edge (the "validate at system boundaries" rule). It holds a dictionary of named regex patterns and the decorators that enforce them:

  • @validate_name — sanitises the name / path params on a route.
  • input_filter(checks=[...]) — validates named JSON-body fields before the handler runs; returns 400 on mismatch.
  • strips control characters; named patterns include name, strictname, interface, ipaddress, macaddress, domainname, integer, loosecsv, anything, etc.
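
A decorator of this kind can be sketched in a few lines. Everything here is illustrative: the regexes are not the daemon's patterns, field_filter takes a field-to-pattern mapping rather than the real checks=[...] argument, and the handler receives the body directly instead of reading a Flask request.

```python
import functools
import re
from typing import Any, Callable, Dict, Tuple

# Hypothetical patterns; the daemon's real regexes are not reproduced here.
PATTERNS = {
    "name": re.compile(r"[A-Za-z0-9._-]+"),
    "macaddress": re.compile(r"(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}"),
    "integer": re.compile(r"[0-9]+"),
}

Response = Tuple[Dict[str, Any], int]


def field_filter(fields: Dict[str, str]) -> Callable:
    """Reject the call with a 400 if a listed body field fails its pattern."""
    def decorator(handler: Callable[[Dict[str, Any]], Response]):
        @functools.wraps(handler)
        def wrapper(body: Dict[str, Any]) -> Response:
            for field, pattern in fields.items():
                value = body.get(field)
                if value is not None and not PATTERNS[pattern].fullmatch(str(value)):
                    return {"message": f"invalid value for {field}"}, 400
            return handler(body)  # only reached with clean input
        return wrapper
    return decorator


@field_filter({"name": "name", "macaddress": "macaddress"})
def update_node(body: Dict[str, Any]) -> Response:
    return {"message": "ok"}, 200
```

Using fullmatch rather than search matters: a value must consist entirely of allowed characters, not merely contain some.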

18. Quick reference

I want to… Look at
Add a new download method plugins/boot/provision/<name>.py (define fetch, create, cleanup)
Support a new distro's networking plugins/boot/network/<distro>.py (init/interface/gateway/dns) plus a matching luna2-client build (§12)
Change the node-side pre-pivot behaviour luna2-client repo → */src/usr/lib/dracut/modules.d/95luna/luna2-start.sh
Add a tool to the install initramfs luna2-client module-setup.sh (dracut_install …) + etc/dracut.conf.d/luna2.conf
Change the iPXE menu templates/templ_boot_ipxe.cfg + Boot().default()
Change the installer steps templates/templ_install.cfg (order at bottom) + matching ## CODE SEGMENT markers
Add an API resource new routes/<x>.py blueprint + register in luna.py + base/<x>.py class
Add a power backend plugins/control/<vendor>.py
Hook startup/shutdown behaviour plugins/hooks/luna/default.py
Run logic on exactly one worker start_background_workers() in luna.py (flock-elected)

Generated from the luna2-daemon development branch. File and method references are accurate as of HEAD 371511ce.