Version changelog

Every patch of the hub + website, newest first. Click a version to expand it; the current build is already open. Sub-letter bumps (e.g. the whole v0.7.8z series) are concatenated into one entry per letter-series rather than listed individually.

v.Chronos.Galatea.0 · The H.1.x benchmark/scoring rehaul arc (legacy vH.1.20)

Ancient Holdings: Changelog

This log groups every shipped patch under the Genesis phase that contained it. Each phase has a codename (Cassandra, Prometheus, Pythagoras, Hydra, Medusa, ...) and a pinned version range; everything that landed inside that range belongs to that phase. The newest phase appears first.

Inside each phase you'll find the mission, a short list of headline features, optional operator notes, an audit link when the phase has a published report, and a collapsible patch log with every numbered patch ordered newest-first.

The pre-Genesis legacy log (version-grouped, one entry per patch) lives in CHANGELOG.legacy.md; it is preserved verbatim for historical reference but no longer updated.


▸ Gluon (“the orderless maintenance era”) · orderless · by KIND, not by sequence · 4 buckets · 7 maintenance entries (date-sorted for display only)
▸ Aegis · v0

security/hardening

▸ v.Gluon.Aegis.0 · Genesis · 3 entries
  • v0.5.5
  • v0.7.11c
  • v0.7.11b
▸ Atlas · v0

deps/infra

▸ v.Gluon.Atlas.0 · Genesis · 0 entries
▸ Angelia · v0

deploy/CI/release

▸ v.Gluon.Angelia.0 · Genesis · 1 entry
  • v0.6.0
▸ Lethe · v0

cleanup/refactor-glue

▸ v.Gluon.Lethe.0 · Genesis · 3 entries
  • v0.5.6
  • v0.7.11a
  • v0.7.11
No changelog entries yet.
▸ Chaos (“the pre-roster origin era”) · Origin · 7 arcs · before Release A
▸ Pygmalion · 1 historical entry · patch-number 0

Pygmalion Genesis · 1 historical entry · patch-number 0

  • v.Chaos.Pygmalion.0-a · git:pre-hub
    Pre-hub public marketing site, sourced from version-control history (first git commit 2026-03-27); no CHANGELOG entry exists for this arc, so its single Genesis-0 member is the git provenance anchor.
▸ Chiron · 2 historical entries · patch-number 0

Chiron Genesis · 2 historical entries · patch-number 0

  • v.Chaos.Chiron.0-a · v0.2.0
    Second phase of v1. Lays the plumbing every later phase needs to drive remote
    servers (SSH runner, driver interface, job worker, live log streaming). Ships
    with a `noop` placeholder driver that exercises the full pipeline without
    touching any real service: the F2 exit proof.
    
    **What shipped**
    
    - `ServiceDriver` interface + registry (`lib/drivers/`): the common contract
      real drivers (SC1, MN1, WEB1, etc.) implement later. F2 includes only the
      `noop` driver for pipeline testing.
    - SSH runner (`lib/ssh.ts`): `ssh2`-based; `runRemote()` streams output via a
      callback for live log tails; `runRemoteStrict()` throws on non-zero exit.
    - Job queue primitives (`lib/jobs.ts`): `enqueueJob`, `claimNextJob`,
      `updateJobProgress`, `completeJob`, `failJob`, `cancelJob`, `reapStaleJobs`.
      Rolling 16 KB log tail per job.
    - Schema migration `002_jobs_heartbeat.sql`: adds `heartbeat_at` + `worker_id`
      columns. Stale-heartbeat reaper fails orphaned jobs after 60 s.
    - Worker process (`worker/index.ts`): separate from the web tier. Polls the
      queue, dispatches to the matching driver, heartbeats every 5 s. Run with
      `npm run worker` (dev) or `npm run worker:watch` (auto-restart on file change).
      In production: a new PM2 app `ancientholdings-worker`.
    - Admin job APIs: `GET /api/admin/jobs`, `POST /api/admin/jobs` (enqueue),
      `GET /api/admin/jobs/[id]`, `DELETE /api/admin/jobs/[id]` (cancel),
      `GET /api/admin/jobs/[id]/stream` (Server-Sent Events; live progress + log tail).
    - Admin UI: `/admin/jobs` (list with progress bars + status badges) and
      `/admin/jobs/[id]` (live detail page with SSE streaming).
    - Test fixture: `/admin/test` button enqueues a no-op job and redirects to
      its detail page so you can watch the full pipeline run in ~8 seconds.
    - Admin landing page now links to Jobs queue + Driver test fixture.
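
The rolling log tail mentioned above can be illustrated with a small sketch. This is not the contents of `lib/jobs.ts`: `appendLogTail` and `MAX_TAIL_CHARS` are hypothetical names, and the cap is counted in UTF-16 code units rather than bytes to keep the example dependency-free.

```typescript
// Hypothetical sketch: keep only the newest ~16 KB of a job's log output.
// Assumption: the cap is measured in string length, not encoded bytes.
const MAX_TAIL_CHARS = 16 * 1024;

function appendLogTail(tail: string, chunk: string, max: number = MAX_TAIL_CHARS): string {
  const combined = tail + chunk;
  // Once the cap is exceeded, drop the oldest characters from the front.
  return combined.length > max ? combined.slice(combined.length - max) : combined;
}

// Usage: feed in far more output than the cap allows.
let demoTail = "";
for (let i = 0; i < 5000; i++) {
  demoTail = appendLogTail(demoTail, `line ${i}\n`);
}
```

After the loop, `demoTail` holds only the most recent output, so the stored row stays bounded regardless of how chatty the job is.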
    
    **What intentionally did NOT ship**
    
    - No real service drivers (SC1/MN1/WEB1 still ahead).
    - No monitoring (MON1, next after F2 per plan).
    - No node registry UI (P7).
    - F2's SSH runner is present but only exercised in SC1+; the `noop` test
      driver doesn't actually connect anywhere.
    
    **Dependencies added**
    
    - `ssh2`: SSH client used by the runner.
    - `@types/ssh2`: TypeScript types.
    - `tsx`: runs the worker directly from `.ts` source in dev and prod.
    
    **Operational notes**
    
    - Worker crash recovery: if the worker process dies mid-job, `reapStaleJobs()`
      (run from both the worker loop and the API) marks running jobs whose
      `heartbeat_at` is older than 60 s as `failed` with a clear error. No jobs
      get stuck in `running` forever.
    - Dev smoke test: `npm run worker` in one terminal, browse to `/admin/test`,
      click "Run on fake-node-1", watch the job progress live.
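
The crash-recovery rule above can be sketched in memory. This is only an illustration under stated assumptions: the real `reapStaleJobs()` works against the SQLite `jobs` table, while `JobRow`, `reapStaleJobsSketch`, the status names other than `running` and `failed`, and the error text are all invented here.

```typescript
// Hypothetical in-memory model of a job row; the real schema lives in SQLite.
type JobStatus = "queued" | "running" | "completed" | "failed";

interface JobRow {
  id: string;
  status: JobStatus;
  heartbeatAt: number; // epoch milliseconds of the last worker heartbeat
  error?: string;
}

const STALE_AFTER_MS = 60_000; // the 60 s threshold from the notes above

// Marks running jobs with an old heartbeat as failed; returns the reaped ids.
function reapStaleJobsSketch(jobs: JobRow[], now: number, staleAfterMs: number = STALE_AFTER_MS): string[] {
  const reaped: string[] = [];
  for (const job of jobs) {
    if (job.status !== "running") continue; // only running jobs can be orphaned
    if (now - job.heartbeatAt > staleAfterMs) {
      job.status = "failed";
      job.error = "worker heartbeat lost (assumed message)";
      reaped.push(job.id);
    }
  }
  return reaped;
}
```

With a 5 s heartbeat and a 60 s threshold, a healthy worker would have to miss roughly a dozen consecutive heartbeats before its job is reaped.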
    
    ---
  • v.Chaos.Chiron.0-b · v0.1.0
    First phase of v1. Establishes the admin plumbing every later phase builds on.
    Public site is unchanged; the foundation is invisible to non-admins by design.
    
    **What shipped**
    
    - SQLite database at `./data/app.db` (override via `APP_DB_PATH`) with a migration runner
      and the full v1 core schema: `nodes`, `node_services`, `mail_accounts`, `pins`, `jobs`,
      `backups`, `secrets_vault`, `admin_audit`, plus the provisioning columns (`provisioned_bytes`,
      `used_bytes`, `backup_capable`, `mount_point`) agreed during planning.
    - Secrets vault (`lib/vault.ts`): `seal()` / `unseal()` via libsodium `crypto_secretbox`,
      master key in `SECRETS_MASTER_KEY`.
    - Admin whitelist (`lib/admin.ts`) driven by the `ADMIN_EMAILS` env var. Non-admin
      requests to admin routes get **404, not 403**: zero surface leakage.
    - Admin password re-confirm: `POST /api/admin/confirm` revalidates against IMAP and
      stamps `session.adminConfirmedAt`. 5-minute freshness window via
      `requireFreshAdminConfirmApi()` for destructive endpoints.
    - Audit log writer (`lib/audit.ts`): every admin action writes a row to `admin_audit`.
    - `/admin` landing page: empty state until P7 ships the node registry.
    - `/admin/changelog` (this page): renders `CHANGELOG.md` server-side, admin-only.
    - Global version indicator shown on every page (`v0.1.0 · F1`).
    - "Admin" link in the account menu, visible only when an admin email is signed in.
    
    **What intentionally did NOT ship**
    
    - Any in-house mail UI: that's P1 (inbox) onward.
    - Any service drivers or SSH runner: that's F2.
    - Any service monitoring: that's MON1.
    - Any node registry UI: that's P7.
    - Customer portal: that's v2 (CP1+).
    
    **Dependencies added**
    
    - `better-sqlite3`: synchronous SQLite driver.
    - `libsodium-wrappers`: sealed-box secrets vault.
    - `react-markdown`: renders this page.
    
    **Env vars introduced**
    
    - `ADMIN_EMAILS`: comma-separated list of addresses allowed into `/admin/*`.
    - `SECRETS_MASTER_KEY`: base64-encoded 32-byte master key for the secrets vault.
    - `APP_DB_PATH`: optional override for the SQLite file location (default `./data/app.db`).
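
The admin gating described in this entry (whitelist from `ADMIN_EMAILS`, 404 for non-admins, a 5-minute freshness window for destructive endpoints) might look roughly like the sketch below. It is not the code in `lib/admin.ts`: `parseAdminEmails`, `adminGateStatus`, the case-insensitive matching, and the 401 returned for a stale confirmation are all assumptions made for illustration.

```typescript
// Hypothetical session shape; only adminConfirmedAt is named in the changelog.
interface SessionSketch {
  email?: string;
  adminConfirmedAt?: number; // epoch milliseconds of the last password re-confirm
}

const FRESH_WINDOW_MS = 5 * 60 * 1000; // the 5-minute freshness window

// Parses a comma-separated ADMIN_EMAILS value into a normalized set.
function parseAdminEmails(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((entry) => entry.trim().toLowerCase())
      .filter((entry) => entry.length > 0),
  );
}

// Returns the HTTP status a request would receive from the gate.
function adminGateStatus(session: SessionSketch, admins: Set<string>, destructive: boolean, now: number): number {
  const email = session.email?.toLowerCase();
  // Non-admins get 404 rather than 403 so the route's existence is not revealed.
  if (email === undefined || !admins.has(email)) return 404;
  if (destructive) {
    const confirmedAt = session.adminConfirmedAt;
    // Assumption: a missing or stale confirmation yields 401.
    if (confirmedAt === undefined || now - confirmedAt > FRESH_WINDOW_MS) return 401;
  }
  return 200;
}
```

Returning a status code instead of writing to a response object keeps the sketch framework-neutral; a real API route would translate it into the appropriate response.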
▸ Echo · 3 historical entries · patch-number 0

Echo Genesis · 3 historical entries · patch-number 0

  • v.Chaos.Echo.0-a · v0.4.0
    Fourth phase of v1. The first phase where the admin UI actually shows live,
    meaningful data pulled from a managed node. Built around netdata: the agent
    installs over SSH with one click, our UI proxies netdata's JSON API for live
    charts, the system probe inventories what's on the box without installing
    anything, and an `apt upgrade` button keeps the OS current.
    
    **What shipped**
    
    - Schema migration 005: adds `system_probe_json` + `system_probe_at`
      columns on `nodes` to cache the last detection result.
    - **System probe handler** (`lib/handlers/system-probe.ts`): SSH battery
      of detection commands that reports OS, kernel, arch, CPU, memory, disk
      mounts, nginx/docker/netdata/ipfs/chainweb/mailcow status, running
      docker containers, GNU screen sessions, listening ports. Structured
      JSON stored on the node row.
    - **"Services detected" panel** (`components/admin/ServicesDetected.tsx`)
      on `/admin/nodes/[id]`: renders the probe output as core service rows
      with green/red/amber indicators, a collapsible docker-container list,
      screen sessions, disk-mount breakdown.
    - **netdata install handler + button**: idempotent upstream `kickstart.sh`
      run (stable channel, telemetry disabled, non-interactive), then a Python
      edit of `/etc/netdata/netdata.conf` to force the `[web]` block to bind
      to `127.0.0.1:19999` (loopback only, no public port opens). Wires into
      the services-detected panel: netdata shows amber with an "Install"
      button when not active.
    - **apt upgrade handler + button**: non-interactive
      `apt-get update && upgrade && autoremove` with the hold-config options.
      Streamed live to the jobs page.
    - **Metrics proxy** (`lib/netdata.ts` + `pages/api/admin/nodes/[id]/metrics/[...netdataPath].ts`):
      admin-only passthrough that SSH-runs `curl 127.0.0.1:19999/api/v1/...` on
      the target and returns the JSON. Whitelisted endpoints
      (`info`, `data`, `charts`, `alarms`, `allmetrics`). Query strings
      sanitized against shell-meta characters.
    - **Live charts** (`components/admin/NodeMetrics.tsx`): four recharts
      panels (CPU, load, RAM, network) polling the proxy every 5 seconds
      for the last 5 minutes of data. Renders only when the probe detects
      netdata active; otherwise shows an "Install first" hint.
    - **"Ubuntu / Debian only" hint** on `/admin/nodes/new` so users know
      what's supported.
    - Tunneling architecture: deliberately **not** a persistent SSH port
      forward. Each metrics API request opens a fresh SSH channel, runs curl
      to localhost, and returns the JSON. Trade-off: ~300 ms latency per
      request vs. a stateful tunnel. Sufficient for 5 s polling and keeps the
      hub stateless across restarts. If we ever need sub-second streaming or
      websockets from netdata, we'll build a real tunnel manager then.
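
To make the metrics proxy's guard rails concrete, here is a minimal sketch of the endpoint whitelist and query-string check. The endpoint names come from the list above; everything else (`buildNetdataCurl`, the allow-list regular expression, the exact `curl` invocation) is an assumption rather than the code in `lib/netdata.ts`.

```typescript
// Endpoints named in the changelog as whitelisted.
const ALLOWED_ENDPOINTS = new Set(["info", "data", "charts", "alarms", "allmetrics"]);

// Assumption: an allow-list of harmless characters stands in for the real
// shell-meta sanitizer. Quotes, spaces, semicolons, pipes, $ and backticks
// are all rejected because they are not in the set.
const SAFE_QUERY = /^[A-Za-z0-9_.,=&%:\-]*$/;

// Builds the command that would run on the target over SSH, or null when the
// request must be refused.
function buildNetdataCurl(endpoint: string, query: string): string | null {
  if (!ALLOWED_ENDPOINTS.has(endpoint)) return null;
  if (!SAFE_QUERY.test(query)) return null;
  const qs = query.length > 0 ? `?${query}` : "";
  // Single quotes are safe here because the query can never contain one.
  return `curl -s 'http://127.0.0.1:19999/api/v1/${endpoint}${qs}'`;
}
```

An allow-list is used in this sketch because it fails closed: a character nobody anticipated is rejected rather than passed to the remote shell.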
    
    **Deliberately not yet installed**
    
    - Per-service driver actions (install StoaChain, install miner, install
      IPFS, install Mailcow, install website). Those stay with SC1 / MN1 /
      P12 / future-mailcow / WEB1 respectively. The probe tells you what's
      there; the drivers land next.
    - Full-pod one-click install: that's v3 V4.
    
    **Verified against production**
    
    Smoke-tested end-to-end against the live 85.215.122.215 box. Probe correctly
    detected Ubuntu 24.04, 12-core Ryzen 5 PRO 3600, 31 GB RAM, nginx active,
    docker 28.2.2, netdata missing, IPFS with 2 pins, chainweb at cut height
    1,566,039, mailcow present (19 containers), 3 screen sessions (StoaNode,
    StoaMiner, cronoton), 5 disk mounts including the 2 TB `/mnt/nvmedrive`.
    
    ---
  • v.Chaos.Echo.0-b · v0.3.1
    Patch on top of P7. Previously the only way to register a node was to paste
    an existing SSH private key into the Advanced form. That works, but it's
    friction-heavy for first-time setup. Added a password-based bootstrap flow
    that generates a fresh keypair on the hub and installs it on the target.
    
    **What shipped**
    
    - Schema migration 004: adds `ssh_public_key` column to `nodes` so the UI
      can show "this is the key authorizing the hub on the server" and the user
      can find + revoke it manually if needed.
    - Keypair generator (`lib/ssh-keygen.ts`): ed25519 via Node's built-in
      crypto; emits both PKCS8 PEM (for ssh2 to consume) and single-line
      `ssh-ed25519 AAAA…` format (for authorized_keys). No native deps.
    - `bootstrapNode()` in `lib/nodes.ts` runs the end-to-end flow:
      1. connect with password
      2. generate keypair
      3. idempotent append of the public key to `~/.ssh/authorized_keys`
      4. verify by reconnecting with the new key only
      5. seal private key in the vault
      6. insert the node row with both `ssh_key_id` and `ssh_public_key`
      The password is used in-memory once and never stored anywhere.
    - New API `POST /api/admin/nodes/bootstrap` (synchronous); the response
      includes the generated private key exactly once so the user can download
      or copy it as an emergency backup.
    - Redesigned `/admin/nodes/new` with two tabs:
      - **Easy setup (password)**: the new flow above; success screen shows
        the private key with Download .pem + Copy buttons and a prominent
        "will not be shown again" warning.
      - **Advanced (paste key)**: the original flow, unchanged.
    - Fixed several hydration bugs: locale-dependent `toLocaleTimeString()` on
      jobs pages replaced by a stable `Intl.DateTimeFormat('en-GB', { hour12: false })`;
      relative timestamps ("2m ago") moved into a client-only `RelativeTime`
      component to avoid Date.now() drift between server render and client
      hydration; `<title>` elements with interpolated strings rewritten as template
      literals to satisfy React strict-title checking.
    
    **Pubkey annotation**
    
    The public key installed on the server carries a comment of the form
    `ancientholdings-hub:<node-uuid>`. To revoke the hub's access from
    outside the hub, remove that line from `~/.ssh/authorized_keys` on the
    target.
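
The idempotent append and the comment-based revocation can both be expressed as pure text transformations on the contents of `authorized_keys`. The sketch below is illustrative only: `addAuthorizedKey` and `revokeHubKey` are hypothetical names, the hub performs the real append remotely over SSH, and the key material in any usage would be a placeholder.

```typescript
// Builds the comment that marks a key as belonging to the hub for one node.
function hubKeyComment(nodeId: string): string {
  return `ancientholdings-hub:${nodeId}`;
}

// Appends keyLine unless an identical line is already present (idempotent).
// Assumption: blank lines are dropped while rewriting the file.
function addAuthorizedKey(authorizedKeys: string, keyLine: string): string {
  const lines = authorizedKeys.split("\n").filter((line) => line.trim() !== "");
  if (!lines.includes(keyLine)) lines.push(keyLine);
  return lines.join("\n") + "\n";
}

// Removes every line whose trailing comment matches the hub comment for nodeId.
function revokeHubKey(authorizedKeys: string, nodeId: string): string {
  const suffix = ` ${hubKeyComment(nodeId)}`;
  const kept = authorizedKeys
    .split("\n")
    .filter((line) => line.trim() !== "" && !line.trimEnd().endsWith(suffix));
  return kept.length > 0 ? kept.join("\n") + "\n" : "";
}
```

Running `addAuthorizedKey` twice with the same key line leaves the file unchanged the second time, which is what makes re-running a bootstrap safe.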
    
    ---
  • v.Chaos.Echo.0-c · v0.3.0
    Third phase of v1. First shippable admin feature: register the servers the hub
    manages via the UI with proper SSH credential handling. Subsequent phases
    (MON1 monitoring, P8 backups, service drivers) read nodes from the registry
    instead of requiring manual SQL seeds.
    
    Scope was tightened to "generic node registry + SSH connectivity test". Per-service
    probes travel with each service driver as they land, not in P7.
    
    **What shipped**
    
    - Schema migration 003 adds `last_test_at`, `last_test_status`, `last_test_detail`
      columns to `nodes` for caching connectivity-test results.
    - Node helpers (`lib/nodes.ts`): `createNode`, `getNode`, `listNodes`, `deleteNode`,
      `publicNode` (redacts the vault key id before sending to the client).
    - Job handler framework (`lib/handlers/`): sits alongside the service driver
      registry; the worker dispatches by `job.kind`, checking handlers first, then
      falling back to drivers. Lets node-level ops (connectivity test, future
      netdata install) share the job queue without abusing the ServiceDriver
      contract.
    - `node-test` handler: SSHes into the node, runs `uname / id / uptime / df`,
      parses the output, writes result onto the `nodes` row. Full output streams
      live to the UI via the F2 SSE pipeline.
    - Admin API: `GET/POST /api/admin/nodes`, `GET/DELETE /api/admin/nodes/[id]`,
      `POST /api/admin/nodes/[id]/test`. Keys are sealed via `lib/vault.ts`; the
      key id never leaves the server.
    - `/admin/nodes` list page with role badges + last-test status.
    - `/admin/nodes/new` form: label, host, port, SSH user, private key paste,
      role picker (master-mixed / storage-fullcopy / ouronet-validator / customer-miner /
      utility), notes. Enqueues connectivity test on submit, redirects to detail.
    - `/admin/nodes/[id]` detail: live connectivity test via SSE, retest button,
      service placeholder, danger zone with fresh-admin-confirm gated delete.
    - Admin landing page upgraded with a direct "Add your first node" CTA.
    
    **What intentionally did NOT ship**
    
    - SSH key *generation* in-app (for customer-provisioned nodes): CP4 territory.
    - Per-service health checks: ride with each service driver (SC1 / MN1 / WEB1 / P11).
    - netdata auto-install on node add: lands in MON1 (next phase).
    - Any visualization of service state: services don't exist yet.
    
    **Dev notes**
    
    - The `node-test` handler is the first real exercise of the SSH runner from F2.
    - Worker dispatcher now checks handlers first, then drivers, symmetric with
      how the admin UI will grow (node-level ops vs service-level ops).
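
The handlers-first dispatch order can be sketched with two registries keyed by `job.kind`. This is a simplified, synchronous illustration: `JobRunner`, the two maps, and `dispatchJob` are hypothetical, and the real worker is presumably asynchronous and passes richer context.

```typescript
// Hypothetical runner signature: takes a payload, returns a short result string.
type JobRunner = (payload: unknown) => string;

const handlerRegistry = new Map<string, JobRunner>(); // node-level ops
const driverRegistry = new Map<string, JobRunner>(); // service-level ops

// Handlers win over drivers when both register the same kind.
function dispatchJob(kind: string, payload: unknown): string {
  const run = handlerRegistry.get(kind) ?? driverRegistry.get(kind);
  if (run === undefined) {
    throw new Error(`no handler or driver registered for job kind "${kind}"`);
  }
  return run(payload);
}

// Usage: one node-level handler, one service driver, one deliberate collision.
handlerRegistry.set("node-test", () => "handler:node-test");
driverRegistry.set("noop", () => "driver:noop");
handlerRegistry.set("shared", () => "handler:shared");
driverRegistry.set("shared", () => "driver:shared");
```

Keeping the two registries separate means a node-level operation never has to pretend to be a service in order to reuse the queue.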
    
    ---
▸ Proteus · 5 historical entries · patch-number 0

Proteus Genesis · 5 historical entries · patch-number 0

  • v.Chaos.Proteus.0-a · v0.5.4
    Fixes a small lie from v0.5.0 (the `.ahbk` download set
    `Accept-Ranges: bytes` but didn't actually parse Range headers) and
    closes the gap for the `.tar.gz` download path (unresumable because the
    secretstream decrypt pipeline is sequential). Also relocates the app
    version label from each admin page into the global navbar.
    
    **What shipped**
    
    - **Proper HTTP Range support for `.ahbk`**: the download endpoint now
      parses `Range: bytes=N-M` (including suffix ranges `bytes=-N`),
      returns 206 Partial Content with `Content-Range`, streams only the
      requested window via `fs.createReadStream({start, end})`. Invalid
      ranges get 416 with a `Content-Range: bytes */<size>` hint. Download
      managers, wget -c, and browsers that retry from partial state now
      actually work.
    - **Materialize-then-serve flow for `.tar.gz`**: since the libsodium
      secretstream decryptor has sequential state (can't seek), we can't
      make the in-flight decrypt Range-capable. Instead, a new flow:
        - `POST /api/admin/backups/:id/prepare-targz` kicks off a background
          decryption into `<archive>.tar.gz.ready.tmp`, atomically renames
          to `<archive>.tar.gz.ready` on completion.
        - `GET /api/admin/backups/:id/prepare-targz` polls staging status:
          `none` / `preparing` (with currentBytes vs innerBytes for %) /
          `ready` (with TTL expiry).
        - `GET /api/admin/backups/:id/download?decrypt=1` serves the ready
          file as a regular static file with Range support. 409 if not
          prepared, telling the UI to offer the Prepare button.
        - `DELETE /api/admin/backups/:id/prepare-targz` for manual discard.
        - Worker reaper sweeps stale `.tar.gz.ready` + `.tar.gz.ready.tmp`
          files older than 30 min, alongside the existing `.ahbk` expiry
          reaper.
    - **`lib/targz-staging.ts`**: encapsulates the phase machine (none /
      preparing / ready), path helpers, idempotent `startPrepare` (two
      concurrent calls don't race; it uses `open(path, 'wx')` as a lock),
      discard + reap.
    - **Single-button 3-state UI** on `/admin/backups`:
        - `.ahbk` row action: always "Download .ahbk" (static, resumable).
        - `.tar.gz` row action: one button that transforms through
          "Prepare .tar.gz" → "Preparing X%" (disabled, live polls for
          progress) → "Download .tar.gz" (green, with expiry subtext and a
          Discard link). Position doesn't change, label does.
    - **Auto-delete on completion**: both formats follow the same rule as
      before: a full (non-Range) response that finishes flushing triggers
      `deleteBackup()`. Range responses never trigger auto-delete (a
      resumed session may span multiple requests and we don't track
      cross-request coverage), so the admin either manually discards or
      the TTL reaper catches it.
    - **Honest info block** on `/admin/backups` explaining both formats'
      resumability behavior, the preparation step for `.tar.gz`, and the
      30-min staging TTL.
    - **Version label relocated to navbar**: admin-only; appears next to
      the "Ancient Holdings" wordmark when signed in as admin, linking to
      `/admin/changelog`. Removed the duplicate version spans from every
      admin page title row (saves a line of visual noise on 9 pages).
    
    **Safety & correctness notes**
    
    - Staged `.tar.gz` is plaintext at rest on the hub during its TTL
      window. Single-admin hub: modest risk, we called it out in the UI
      copy. If this becomes concerning we could add "encrypt at rest with
      an ephemeral key held only in memory during the prepare→download
      cycle" but that complicates resumability.
    - Range downloads don't auto-delete. This is deliberate: we can't
      tell from one request whether the user has now finished downloading.
      The 30-min reaper + Discard button handle cleanup.
    - Staging file lifecycle is managed by filesystem state (no DB column
      needed). This keeps the DB as the source of truth for the archive
      itself while the plaintext staging is a fungible cache.
    
    **Verified with**
    
    - Type-check clean.
    - Range parsing handles `bytes=0-`, `bytes=100-499`, `bytes=-100`,
      rejects `bytes=5000000-` on a 1 MB file with 416.
    - `.tar.gz` prepare flow: empty state → Prepare click → .tmp file
      appears → status flips to preparing with bytes-in-progress → on
      completion, `.ready` exists + status flips to 'ready' → Download
      serves the file with Range → successful full-file download deletes
      both `.ahbk` and `.ready`.
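
The Range cases listed under "Verified with" can be reproduced with a small single-range parser. This is a sketch, not the endpoint's code: `parseRangeSketch` and `contentRangeHeader` are hypothetical names, multi-range requests are not handled, and every header the sketch cannot satisfy is answered with 416, following the wording of this entry.

```typescript
type RangeResult =
  | { status: 200 } // no Range header: serve the full file
  | { status: 206; start: number; end: number } // inclusive byte window
  | { status: 416 }; // invalid or unsatisfiable range

function parseRangeSketch(header: string | undefined, size: number): RangeResult {
  if (header === undefined) return { status: 200 };
  const match = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
  if (match === null) return { status: 416 };
  const first = match[1] ?? "";
  const last = match[2] ?? "";
  if (first === "" && last === "") return { status: 416 };

  let start: number;
  let end: number;
  if (first === "") {
    // Suffix range "bytes=-N": the final N bytes of the file.
    const suffixLength = Number(last);
    if (suffixLength === 0) return { status: 416 };
    start = Math.max(size - suffixLength, 0);
    end = size - 1;
  } else {
    start = Number(first);
    if (last !== "" && Number(last) < start) return { status: 416 };
    // Open-ended "bytes=N-" runs to the end; explicit ends are clamped.
    end = last === "" ? size - 1 : Math.min(Number(last), size - 1);
  }
  if (start >= size) return { status: 416 };
  return { status: 206, start, end };
}

// Content-Range value for 206 and 416 responses; undefined for a plain 200.
function contentRangeHeader(result: RangeResult, size: number): string | undefined {
  if (result.status === 206) return `bytes ${result.start}-${result.end}/${size}`;
  if (result.status === 416) return `bytes */${size}`;
  return undefined;
}
```

The inclusive `start` and `end` values map directly onto the `start` and `end` options of `fs.createReadStream`, which are also inclusive.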
    
    ---
  • v.Chaos.Proteus.0-b · v0.5.3
    Closes the last operational gap on key rotation. Before this, rotating
    the master key left the worker process holding the **old** key in its
    `process.env` until you remembered to restart `npm run worker` (or
    `pm2 restart`). v0.5.3 makes the whole cycle self-healing: the rotation
    flow signals the worker to hold off, the worker polls `.env.local` for
    the new key, and the UI shows a live "worker in sync ✓" indicator so
    the admin can confirm the propagation without checking logs.
    
    **What shipped**
    
    - **Migration 008**: new `system_state` (key/value flag table; first
      consumer is `rotation_in_progress`) and a `master_key_fingerprint`
      column on `worker_leadership` so the worker can publish the short
      sha256 prefix of its current master key on every heartbeat.
    - **`lib/system-state.ts`**: tiny `getFlag` / `setFlag` / `clearFlag` /
      `isFlagSet` helper. Generic enough to reuse for future coordination
      (rotation_in_progress today, maintenance mode / probe-lock / etc.
      later).
    - **`lib/rotation.ts`** now wraps the rotation body in `setFlag` /
      `clearFlag` (`rotation_in_progress`) via `try/finally`, so the flag is
      always cleared even if rotation throws.
    - **Worker (`worker/index.ts`)**:
        - On boot and every 10 s thereafter, re-reads `.env.local` for
          `SECRETS_MASTER_KEY`. If the on-disk value differs from the
          in-process copy, updates `process.env` and logs
          `[worker] SECRETS_MASTER_KEY reloaded (<oldfp> → <newfp>)`.
        - Before each job-claim attempt, checks `rotation_in_progress`; if
          set, logs `holding off on new jobs` and sleeps. Logs `resuming`
          when the flag clears. Closes the race where a queued job could
          transition to `running` mid-rotation.
        - On every lease renewal, publishes its current master-key
          fingerprint into `worker_leadership.master_key_fingerprint` for
          the web tier to see.
    - **`GET /api/admin/worker-status`**: admin-only endpoint returning
      `{hub: {masterKeyFingerprint}, worker: {...fp, isFresh}, rotationInProgress, keyInSync}`.
      UI uses it to flip the ✓ / ⚠ indicator.
    - **`/admin/security` rotation card**: after a successful rotation,
      polls `/api/admin/worker-status` every 1 s for up to 30 s. Shows
      three states: pulsing "waiting for worker" (with divergent
      fingerprints displayed), green "✓ worker picked up the new key",
      or amber "⚠ worker didn't pick up within 30 s" with a one-line
      restart hint. Wait window = ~3× the worker's env-poll cadence.
    
    **Result**
    
    End-to-end rotation is now a single click with zero manual follow-up
    on a healthy hub. The old "restart the worker after rotating" footgun
    is gone.
    
    **Deliberately not in this patch**
    
    - Graceful worker restart via PM2 signal. The poll-the-env-file
      approach already solves the problem; a signal-based path is only
      worth building if we later find `.env.local` polling doesn't work
      for some deployment (e.g. Docker secret mounts).
    - Key *versioning* in archive headers ("which master key was used to
      wrap this?"). Useful for forensics / cross-hub archive portability;
      separate feature.
    
    **Verified with**
    
    - Type-check clean.
    - Worker env-file parser unit-tested inline via the same approach used
      for `upsertEnvVar`: correct handling of unquoted, single-quoted,
      double-quoted values.
    - Integration test planned: rotate via UI, watch worker log print the
      `SECRETS_MASTER_KEY reloaded` line, then the UI indicator flip to ✓.
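
The env-file parsing the worker relies on (unquoted, single-quoted, and double-quoted values) can be sketched as a pure function over the file's text. `readEnvVarSketch` is a hypothetical name, and details such as "the last assignment wins" and the lack of escape-sequence handling are assumptions, not documented behaviour of the worker.

```typescript
// Reads one variable from dotenv-style text. Returns undefined when absent.
function readEnvVarSketch(text: string, name: string): string | undefined {
  let found: string | undefined;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    const eq = line.indexOf("="); // first "=" only, so base64 padding survives
    if (eq <= 0) continue;
    if (line.slice(0, eq).trim() !== name) continue;
    let value = line.slice(eq + 1).trim();
    const quote = value.charAt(0);
    // Strip one matching pair of surrounding quotes, if present.
    if (value.length >= 2 && (quote === '"' || quote === "'") && value.endsWith(quote)) {
      value = value.slice(1, -1);
    }
    found = value; // assumption: a later assignment overrides an earlier one
  }
  return found;
}

// True when the on-disk key differs from the in-process copy.
function keyNeedsReload(envText: string, inProcessKey: string | undefined): boolean {
  const onDisk = readEnvVarSketch(envText, "SECRETS_MASTER_KEY");
  return onDisk !== undefined && onDisk !== inProcessKey;
}
```

A worker loop could call `keyNeedsReload` on each poll and update its in-process copy only when it returns true.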
    
    ---
  • v.Chaos.Proteus.0-c · v0.5.2
    Second half of the master-key story. v0.5.1 made the key exportable;
    v0.5.2 makes it replaceable: in place, on a live hub, without
    re-uploading any archive body. Retires the "the key I minted 6 months
    ago is probably still fine" posture.
    
    **What shipped**
    
    - **`lib/rotation.ts`**: core rotation logic. Generates a fresh 32-byte
      key, walks the vault + known `.ahbk` archives, unwraps everything with
      the OLD master into memory (in-memory plan), then applies:
        1. archive header rewrites in place (only `wrapped.nonce` +
           `wrapped.ciphertext` change; identical JSON byte length by
           construction; verified before any write)
        2. vault re-seal as a single DB transaction
        3. `.env.local` update via the new `lib/env-file.ts` upsert helper
           (preserves comments + other vars)
        4. `process.env.SECRETS_MASTER_KEY` flipped in memory so the running
           hub uses the new key immediately, with no PM2 restart required
      On any failure, previously-rewritten archive headers are restored
      from memory; the DB transaction rolls itself back.
    - **`lib/env-file.ts`**: small upsert-in-dotenv helper with
      safety rails (no newlines, valid env-var name pattern, atomic write
      at `0o600`).
    - **`POST /api/admin/security/rotate-master-key`**: fresh-confirm gated
      endpoint. Requires body `{acknowledgedExport: true}` so a rotation
      without a backed-up key is impossible via the UI. Returns counts +
      first-16-hex-chars SHA-256 fingerprints of both keys (so the admin
      can visually confirm the key actually changed without exposing it in
      the response). Every call audit-logged, including the fingerprints.
    - **Pre-flight guard**: rotation refuses if any job is `running`. Avoids
      mid-run vault unseal against the new key.
    - **`/admin/security` rotate card**: explainer text, safety model,
      two required checkboxes ("I have exported the current key",
      "I understand the consequences"), and a success panel showing row
      counts, duration, and old/new fingerprints.
    
    **Safety notes**
    
    - Archive header re-writes are verified to preserve byte length before
      any write happens; if a serialization quirk would change the length
      the rotation aborts with a clear error rather than corrupt the file.
    - The body of each `.ahbk` never changes: only the JSON header's
      wrapped-key fields. So a pre-rotation download with the old key still
      works for archives already stored locally; re-downloads through the
      hub use the new key.
    - Key rotation does **not** migrate archives that belong to a different
      hub (different master key). The rotation aborts loudly if any
      completed archive doesn't unwrap with the current master.
    
    **Deliberately not in this patch**
    
    - Importing a known key (disaster-recovery seed): still manual via
      `.env.local`.
    - Rotation *scheduling* / policy enforcement (e.g. "rotate every 90
      days"). Admin-driven for now.
    - Archive re-encryption with a new *content* key (would require reading
      + re-writing archive bodies). Not needed for master-key rotation; only
      worth it if a specific content key is suspected compromised.
    
    **Verified with**
    
    - Type-check clean.
    - `lib/env-file.ts` smoke-tested inline: upserts existing var in place,
      appends when missing, preserves surrounding vars/comments.
    - Rotation plan validates `newHeaderBytes.length === headerLen` before
      any write; planned but untested path: rollback of partial archive
      rewrites. Relies on old header bytes held in memory for the duration
      of the rotation call.
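
The upsert behaviour described for `lib/env-file.ts` (replace in place, append when missing, preserve comments and other variables, reject newlines and bad names) can be sketched as a pure text transformation. `upsertEnvVarSketch` is a hypothetical stand-in; the atomic write at `0o600` is left out because it needs filesystem access.

```typescript
// Assumption: a conventional env-var name pattern; the real rule may differ.
const ENV_NAME_PATTERN = /^[A-Za-z_][A-Za-z0-9_]*$/;

function upsertEnvVarSketch(text: string, name: string, value: string): string {
  if (!ENV_NAME_PATTERN.test(name)) throw new Error(`invalid env var name: ${name}`);
  if (/[\r\n]/.test(value)) throw new Error("value must not contain newlines");

  const lines = text === "" ? [] : text.replace(/\n$/, "").split("\n");
  const assignment = `${name}=${value}`;
  let replaced = false;
  const updated = lines.map((line) => {
    const eq = line.indexOf("=");
    // Replace the first assignment in place. Commented-out lines never match,
    // because their "name" part would start with "#".
    if (!replaced && eq > 0 && line.slice(0, eq).trim() === name) {
      replaced = true;
      return assignment;
    }
    return line;
  });
  if (!replaced) updated.push(assignment); // append when the variable is missing
  return updated.join("\n") + "\n";
}
```

Because the function only rewrites the matching line, comments and unrelated variables come back byte-for-byte identical.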
    
    ---
  • v.Chaos.Proteus.0-d · v0.5.1
    Follow-up patch to v0.5.0 closing the "what happens if the hub disappears"
    gap on `.ahbk` archives. Before this, the hub's `SECRETS_MASTER_KEY` lived
    only in `.env.local` on one machine: lose the file, lose every backup
    forever. This patch makes the key exportable and the format
    reimplementable from scratch.
    
    **What shipped**
    
    - **`docs/ahbk-format.md`**: authoritative byte-layout + JSON header
      schema + crypto-primitive spec for the `.ahbk` format. Any competent
      engineer with libsodium and this doc can reimplement the decoder from
      scratch. Ships alongside the reference encoder (`lib/archive.ts`) and
      the reference decoder (`bin/dr-tool.mjs`).
    - **`bin/dr-tool.mjs`**: standalone CLI decryptor. Node + libsodium,
      nothing else. Subcommands: `info` (print header, no key needed) and
      `decrypt` (unwrap with master key, write inner `.tar.gz`). Intended for
      keeping on the USB stick next to your exported key + archive so you can
      restore from any machine with Node installed.
    - **`/admin/security`** page: new admin route. Shows what the master key
      is, what it secures, why to export, and a warning block treating it
      like a root password. "Reveal master key" button triggers the existing
      fresh-admin-confirm flow (5-minute window); on success shows the base64
      in a monospace block with Copy and "Download as .txt" actions.
    - **`/api/admin/security/master-key`** endpoint. GET returns the key as
      JSON (view mode) or a plaintext download (`?mode=download`). Gated by
      `requireFreshAdminConfirmApi`. Every call, success or failure, writes
      a `security.master_key.reveal` row to the admin audit log with mode
      (view/download), so repeated reveals are visible in the trail. Response
      headers set `Cache-Control: no-store` to defend against intermediate
      caches.
    - Admin landing page gains a 🔑 Security entry.
    
    **Deliberately not in this patch (→ v0.5.2)**
    
    - Key *rotation* (regenerate + re-wrap every vault row + every `.ahbk`
      header). Sequenced after this patch because you should have a backed-up
      copy of the current key before attempting rotation.
    - *Importing* a key on first boot (disaster-recovery seed). You still
      paste the value into `.env.local` manually today.
    
    **Verified with**
    
    Type-check clean. Reveal + copy + download round-tripped against the v0.5.0
    worker. `dr-tool.mjs info` against a freshly-produced archive prints the
    expected header; `dr-tool.mjs decrypt` round-trips to the original `.tar.gz`.
    
    ---
  • v.Chaos.Proteus.0-e · v0.5.0
    Ships the first hub-orchestrated backup flavor end-to-end: click "Backup now"
    on the StoaChain tab, the worker runs the full flow, you get an encrypted
    .ahbk archive on `/admin/backups` you can download to your Windows machine.
    Port of the user's `stoa-backup-now.sh` / `stoa-remote-daily-backup.sh` into a
    hub job handler, but with a Node-native pipeline (no rsync/zstd binary
    dependency on the hub).
    
    **What shipped**
    
    - Schema migration 007: extends `backups` with `status`, `label`, `local_path`,
      `started_at`, `completed_at`, `expires_at`, `run_kind`, `error`,
      `remote_backup_id` + a couple of indexes.
    - `lib/backups.ts`: CRUD helpers (create / get / list / delete /
      markCompleted / markFailed), auto-prune of expired archives (default
      **1 day** for manual, 14 days for scheduled; manual is short because the
      download auto-deletes anyway), plus a two-tier archive path resolver:
      primary dir `APP_BACKUPS_DIR` (defaults to `./data/backups` next to the
      app), with spillover to `APP_BACKUPS_SPILL_DIR` when the primary
      filesystem doesn't have ~1.2× the estimated archive size free. Lets the
      hub live on a small partition while spilling large archives onto a
      bigger mount (e.g. `/mnt/nvmedrive/StoaBackups` in production).
    - `lib/archive.ts`: the `.ahbk` encrypted archive format. Magic "AHBK" +
      version byte + length-prefixed JSON header + libsodium `secretstream`
      body. Envelope encryption: per-archive random 32-byte key wrapped with
      the hub's `SECRETS_MASTER_KEY` via `crypto_secretbox`. Gzip compression
      (Node native) sits between the tar source and the encryption stream,
      so body bytes are compressed-then-encrypted. Outer sha256 stored on the
      backup row for download integrity; inner sha256 + plaintext byte count
      stored in the archive header for future offline verification by dr-tool.
    - `lib/handlers/backup-stoachain.ts`: the first backup flavor. Ports the
      user's bash scripts:
      - `POST /chainweb/0.0/stoa/make-backup?backupPact` over SSH+curl (no public
        HTTP), gets the backup id
      - polls `/check-backup/<id>` every 15 s, handling `backup-in-progress` /
        `backup-done` / `backup-failed` with proper logging
      - opens a fresh `ssh2` connection, runs `tar c -C .../backups/<id> .`
        on the remote, and streams stdout directly into the local encrypted
        archive builder: no rsync, no staging dir on the hub, no SSH key on
        disk
      - records remote backup id, inner sha256, plaintext size in the archive
        header metadata
    - API routes:
      - `POST /api/admin/nodes/[id]/backup`: enqueues a backup job
        (flavor-dispatched; only `stoachain-backup-api` implemented here, others
        return 400 until their flavors ship)
      - `GET /api/admin/backups`: list
      - `GET /api/admin/backups/[id]`: single backup
      - `DELETE /api/admin/backups/[id]`: gated by fresh-admin-confirm
      - `GET /api/admin/backups/[id]/download`: raw `.ahbk`; add `?decrypt=1`
        to stream the decrypted `.tar.gz` instead (hub-side decrypt; useful
        before `dr-tool` lands). After a fully-flushed response the archive
        auto-deletes (server-side copy is transient; the download *is* the
        point). Client-aborted downloads preserve the archive so the user
        can retry.
    - `/admin/backups` index page: table with status badges, size, created,
      Download .ahbk + Download .tar.gz actions per completed backup
    - "Backup" section on the StoaChain tab with a **Backup now** button and
      an expandable "How it works" block documenting the whole 6-step pipeline
      + disk usage + retention. Pre-flight warning when `--enable-backup-api`
      isn't in the last probe's startup flags (detection source explained
      inline + nudge to re-probe if stale)
    - **Granular live progress**. chainweb's `check-backup` only returns
      `backup-in-progress` / `backup-done`, so we observe the checkpoint dir
      with `du -sb` every 15 s and report bytes written vs. a live-measured
      `/mnt/nvmedrive/StoaNodeData` baseline. During the streaming phase,
      progress + throughput (MB/s) come from the in-flight byte count. The bar
      moves through all five phases instead of freezing during the longest two.
    - Link to `/admin/backups` from the admin landing page
    - `backup-stoachain` handler registered in the handlers registry
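
    The `.ahbk` prelude described under `lib/archive.ts` (magic, version byte,
    length-prefixed JSON header) can be pictured roughly as below. This is a
    minimal sketch, not the hub's code: the 4-byte big-endian length prefix and
    the helper names `encodeAhbkHeader` / `decodeAhbkHeader` are assumptions,
    since the entry does not state the prefix width.

    ```typescript
    // Hypothetical sketch of the .ahbk prelude: "AHBK" + version byte +
    // length-prefixed JSON header. The 4-byte big-endian prefix is an assumption.
    const MAGIC = Buffer.from("AHBK", "ascii");

    function encodeAhbkHeader(version: number, header: Record<string, unknown>): Buffer {
      const json = Buffer.from(JSON.stringify(header), "utf8");
      const len = Buffer.alloc(4);
      len.writeUInt32BE(json.length, 0);
      return Buffer.concat([MAGIC, Buffer.from([version]), len, json]);
    }

    function decodeAhbkHeader(buf: Buffer): {
      version: number;
      header: Record<string, unknown>;
      bodyOffset: number; // where the encrypted body would start
    } {
      if (buf.length < 9 || !buf.subarray(0, 4).equals(MAGIC)) {
        throw new Error("not an .ahbk archive (bad magic)");
      }
      const version = buf.readUInt8(4);
      const len = buf.readUInt32BE(5);
      const end = 9 + len;
      if (buf.length < end) throw new Error("truncated header");
      const header = JSON.parse(buf.subarray(9, end).toString("utf8"));
      return { version, header, bodyOffset: end };
    }
    ```

    Reading only the prelude is enough to print header metadata without
    touching the encrypted body or needing the master key.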
    
    **Fully Node-native pipeline**
    
    The backup flow uses no external CLI binaries on the hub, only things
    reachable via `npm install`:
    - `ssh2` for the SSH transport + `exec` channel
    - Node's built-in `zlib` for gzip compression
    - `libsodium-wrappers` for the envelope encryption
    - `crypto` for sha256 manifests
    
    Rationale: makes the hub deployable on any OS without worrying about which
    version of `rsync` / `zstd` / `tar` is present. The remote side still needs
    `tar` (Linux default) and `curl` (already present on every server we'd
    manage). No new dependencies.
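
    A minimal sketch of the compress-and-hash part using only Node built-ins,
    to show that no external binary is involved. It is illustrative only: the
    real pipeline streams, and the encryption layer between gzip and the outer
    hash is left out. Hashing the plaintext for the inner sha256 is an
    assumption, and `compressWithManifest` / `verifyManifest` are hypothetical
    names.

    ```typescript
    import { createHash } from "node:crypto";
    import { gzipSync, gunzipSync } from "node:zlib";

    interface Manifest {
      compressed: Buffer;
      innerSha256: string;    // over the plaintext (assumed)
      plaintextBytes: number;
      outerSha256: string;    // over the bytes that would be stored
    }

    // Compress a plaintext buffer and record the hashes a header/row could keep.
    function compressWithManifest(plain: Buffer): Manifest {
      const innerSha256 = createHash("sha256").update(plain).digest("hex");
      const compressed = gzipSync(plain);
      const outerSha256 = createHash("sha256").update(compressed).digest("hex");
      return { compressed, innerSha256, plaintextBytes: plain.length, outerSha256 };
    }

    // Offline check: decompress and compare against the recorded inner values.
    function verifyManifest(m: Manifest): boolean {
      const plain = gunzipSync(m.compressed);
      const sha = createHash("sha256").update(plain).digest("hex");
      return plain.length === m.plaintextBytes && sha === m.innerSha256;
    }
    ```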
    
    **Deliberately not in this pass**
    
    - Other flavors: Mailcow, IPFS pins, nginx/service configs, full-node. Each
      gets its own handler: architecturally the same shape, just different tar
      sources. Follow-up passes.
    - Off-site destinations (S3-compatible, SSH-to-backup-node, backup-storage-role
      node). Deferred until the user decides on a provider; the `destinations_json`
      column on backups is already wired for this.
    - Scheduled / rules engine. Deferred until there's a real off-site destination
      to push to.
    - Standalone `dr-tool` binary. Deferred; for now the hub itself decrypts
      (the `?decrypt=1` download option).
    
    **Verified with**
    
    Type-check clean. End-to-end smoke test planned against the production box
    after the worker restart.
    
    ---
▸ Jason · 49 historical entries · patch-number 0

Jason Genesis · 49 historical entries · patch-number 0

  • v.Chaos.Jason.0-a·v0.7.4q
    Finalizes the role matrix that was nudged into existence by the real
    handover flow (ancient admin sets up a hub, then hands accounts off to
    modern admins who in turn manage their own clients).
    
    ### Admin console: per-link access tags
    
    The `/admin` quick-links list now prefixes each entry with a compact
    three-glyph role badge:
    
    - `★` ancient (gold)
    - `◆` modern (blue)
    - `◇` client (grey)
    
    A glyph lights up when that role can access the page; greyed out
    otherwise. Unavailable links render disabled with a small
    "(restricted)" tag: visible but unclickable, so modern/client admins
    can see at a glance what exists but isn't theirs to touch.
    
    ### Role matrix tightened
    
    - **Acolytes**: now ancient-only (page gate + API GET both locked to
      ancient). Modern admins were never able to mutate the roster (already
      ancient-only there), but they could browse. That's gone now: the
      public-site team roster is an ancient concern.
    - **Admins & Clients page**: no longer loads for clients. The page gate
      now 404s any role below modern. (Quick-links already hid it, but a
      direct URL hit would still render a stripped view.)
    - **Client management** (`/api/admin/clients/...`): promote, revoke,
      and reset-onboarding actions now accept both ancient and modern admins
      via the new `requireFreshAdminNonClientConfirmApi` guard. Client role
      itself is still rejected.
    - **Admins roster API**: GET now rejects clients explicitly
      (404, not 403, keeping the "not-admin" veneer).
    - **Admins page UI**: Promote-to-Client form, Revoke, and Reset
      Onboarding buttons now render for modern admins too.
      Grant-Modern-Admin stays ancient-only.
    - **Mailcow mailbox list**: modern admins now fetch it at page load so
      the Promote-to-Client picker populates for them.
    
    ### Files touched
    
    - `lib/admin.ts`: added `requireFreshAdminNonClientConfirmApi`.
    - `pages/admin/index.tsx`: new `<AccessTag>` + `<QuickLink>` helpers;
      Quick-links rewritten to use them.
    - `pages/admin/acolytes.tsx`: role gate tightened to ancient.
    - `pages/admin/admins.tsx`: UI gates + effect dependency for Mailcow
      fetch now include modern role.
    - `pages/api/admin/acolytes/index.ts`: `requireAncientAdminApi`.
    - `pages/api/admin/clients/index.ts`: POST guard swapped;
      GET rejects clients.
    - `pages/api/admin/clients/[email].ts`: DELETE + PATCH guards swapped.
    - `pages/api/admin/admins/index.ts`: GET rejects clients.
    
    ---
  • v.Chaos.Jason.0-b·v0.7.4j
    Five slices bundled in one release so v0.7.4 closes cleanly. Each
    addresses a gap surfaced during today's real-world VPS onboarding.
    Phase code → **CR3** (Client Role 3: onboarding end-to-end).
    
    ### v0.7.4e: Install Wizard Certificate step
    
    Previously: fresh installs got a self-signed P-256 cert and the
    operator had to manually go to the Identity tab, paste the DuckDNS
    token, run Obtain-LE, restart. Long chain of clicks + context
    switches with the DuckDNS dashboard.
    
    Now: Identity step in the wizard expands when `p2pHostname` ends in
    `.duckdns.org`. Extra fields: DuckDNS token (required for auto-LE)
    + email (optional). On install, after `docker compose up`, the
    handler runs `certbot certonly --manual --preferred-challenges dns`
    with auth/cleanup hooks that hit DuckDNS's update API. Cert files
    land at `<tlsDir>/tls-{cert,key}.pem` (what compose mounts) →
    container restarts → node emerges with CA-signed cert ready to peer.
    
    Renewal deploy-hook installed at
    `/etc/letsencrypt/renewal-hooks/deploy/stoa-inst.sh`: it re-copies
    the renewed cert to the compose-mount paths and restarts
    stoa-node automatically after every renewal that certbot.timer triggers.
    
    Non-DuckDNS hostnames: LE step skipped; self-signed bootstrap
    remains and operator can run Obtain-LE manually (existing flow,
    unchanged).
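
    The branching above (DuckDNS hostname gets the extra fields and the DNS
    challenge, anything else skips the LE step) could look roughly like this. A
    sketch under assumptions: `isDuckDnsHost` and `certbotArgs` are hypothetical
    helpers, the hook script paths are placeholders supplied by the caller, and
    the flags beyond the three quoted in this entry are standard certbot
    options chosen for illustration.

    ```typescript
    // True when the wizard should show the DuckDNS token + email fields.
    function isDuckDnsHost(p2pHostname: string): boolean {
      return p2pHostname.trim().toLowerCase().endsWith(".duckdns.org");
    }

    // Build the certbot argv for a DNS-01 run with manual hooks.
    // authHook / cleanupHook are hypothetical script paths.
    function certbotArgs(
      host: string,
      email: string | null,
      authHook: string,
      cleanupHook: string,
    ): string[] {
      const args = [
        "certonly", "--manual", "--preferred-challenges", "dns",
        "--manual-auth-hook", authHook,
        "--manual-cleanup-hook", cleanupHook,
        "--non-interactive", "--agree-tos",
        "-d", host,
      ];
      // Email is optional in the wizard; certbot needs an explicit opt-out.
      return email ? [...args, "-m", email] : [...args, "--register-unsafely-without-email"];
    }
    ```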
    
    ### v0.7.4j: Seed-at-install option
    
    Install Wizard Profile step grew a checkbox: "Install with current
    hub seed (recommended)". Shows seed cut height + size + donor.
    Defaults ON when a current seed exists. On apply, install handler
    replaces empty chainweb boot with an inline call to the existing
    `stoachainReseedHandler` (reuses v0.7.3c-f's rollback + cert-preserve
    + stream-plumbing logic). Net: install completes with chainweb at
    the donor's cut-at-backup time, not cut=0. Minutes saved on stoa;
    hours-to-days on Kadena-mainnet-sized chains later.
    
    If no current seed on hub, the checkbox turns into a grey note
    linking to `/admin/seeds` to produce one first.
    
    ### v0.7.4g: Already-managed detection at Add-Node
    
    New preflight endpoint `POST /api/admin/nodes/already-managed-probe`.
    SSHes with password auth to the target and runs 5 detection checks:
    - `/etc/sudoers.d/ancientholdings-stoa` exists (hub-sudoers file)
    - chainweb-node process running
    - `stoa-node` container present
    - `RunStoaNode.managed.sh` file under `/home`, `/mnt`, or `/srv`
    - `ah-hub:` / `ancientholdings-hub` marker in authorized_keys
    
    If any check triggers, the Add-Node wizard's Bootstrap submit surfaces a
    `window.confirm` listing detected signals before proceeding.
    Operator can Cancel (safe default) or click OK to force-adopt
    anyway (e.g. re-adding after accidental delete, or they've already
    cleaned up another hub's leftover).
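
    How the five checks might be folded into the confirm text, as a rough
    sketch. The `ProbeSignals` shape, the labels, and the wording of the
    message are all hypothetical; only the five signals themselves come from
    the list above.

    ```typescript
    // Hypothetical result shape for the five read-only checks listed above.
    interface ProbeSignals {
      sudoersFile: boolean;          // /etc/sudoers.d/ancientholdings-stoa exists
      chainwebProcess: boolean;      // chainweb-node process running
      stoaContainer: boolean;        // stoa-node container present
      managedRunner: boolean;        // RunStoaNode.managed.sh found
      authorizedKeysMarker: boolean; // ah-hub: / ancientholdings-hub marker
    }

    const SIGNAL_LABELS: Record<keyof ProbeSignals, string> = {
      sudoersFile: "hub sudoers file present",
      chainwebProcess: "chainweb-node process running",
      stoaContainer: "stoa-node container present",
      managedRunner: "RunStoaNode.managed.sh found",
      authorizedKeysMarker: "hub marker in authorized_keys",
    };

    // Returns null when nothing was detected (no confirm needed),
    // otherwise the text a confirm dialog could show.
    function alreadyManagedWarning(s: ProbeSignals): string | null {
      const hits = (Object.keys(SIGNAL_LABELS) as (keyof ProbeSignals)[])
        .filter((k) => s[k])
        .map((k) => `- ${SIGNAL_LABELS[k]}`);
      if (hits.length === 0) return null;
      return `This server looks already managed:\n${hits.join("\n")}\nAdopt anyway?`;
    }
    ```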
    
    Non-destructive probe, purely read-only. Prevents the
    "two hubs dueling for one server" footgun.
    
    ### v0.7.4h: Key-purge on unmanage + `/admin/orphans` page
    
    Node DELETE endpoint rewritten. New flow:
    
    1. SSH into the target, remove any line in `~/.ssh/authorized_keys`
       (and `/root/.ssh/authorized_keys`) containing the `ah-hub:`
       marker. Backup copy left as `*.bak.<timestamp>`.
    2. Unconditionally delete vault secret + nodes row (the hub commits
       to losing its SSH access regardless of whether step 1 succeeded).
    3. If step 1 failed (network partition, target offline, auth
       failure), write a row into `node_orphans` capturing what was
       attempted + the error.
    
    New admin page `/admin/orphans` (ancient-only). Lists unresolved
    orphans with clear "SSH in yourself and remove the ah-hub: line"
    instruction + a "Mark resolved" button. Keeps resolved history
    (last 20) with the operator's cleanup note.
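
    Step 1's line filter is simple enough to sketch as a pure function. This is
    an illustration, not the handler itself: `purgeHubKeys` is a hypothetical
    name, and the SSH transport and the `*.bak.<timestamp>` copy are left out.

    ```typescript
    // Drop every authorized_keys line that carries the hub marker and keep
    // everything else untouched. Returns the new file content and a count.
    function purgeHubKeys(
      authorizedKeys: string,
      marker = "ah-hub:",
    ): { kept: string; removed: number } {
      const lines = authorizedKeys.split("\n");
      const keptLines = lines.filter((line) => !line.includes(marker));
      return { kept: keptLines.join("\n"), removed: lines.length - keptLines.length };
    }
    ```

    A `removed` count of zero is worth logging too: it distinguishes "key was
    already gone" from "purge failed", and only the latter belongs in
    `node_orphans`.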
    
    ### v0.7.4i: Onboarding transparency modal for clients
    
    First time a `client`-role admin lands on `/admin`, modal appears:
    - Names the hub's ancient admin (first in env list)
    - States plainly: hub has full SSH access to their managed
      servers, every action is audit-logged, client retains ownership,
      unmanage removes the hub's key
    - "I understand – continue" stamps `clients.accepted_transparency_at`
      (one-way). "Cancel – sign me out" redirects to home.
    
    Modal fires **only** for role=client. Ancient/modern admins see
    nothing (they already know the game). Uses new endpoints:
    - `GET /api/admin/clients/me`: role + acceptance stamp
    - `POST /api/admin/clients/me`: stamp acceptance (no-op if
      already accepted)
    
    ### Backlog
    
    New `plans/BACKLOG.md` seeded with:
    - **Storage-partition awareness per service** (user-requested
      today): ability to see where each hub-hosted service lives
      (partition + path + free space) and move services between
      partitions. Matters once hub hosts multiple websites
      (caduceus subdomain + others). Live server currently has a
      480 GB partition at 29% that will eventually need management.
    - A few smaller items surfaced in today's VPS-onboarding arc.
    
    ### Version bump
    
    - `lib/version.ts` → **v0.7.4j**. Phase code **CR3** (Client Role
      3: onboarding end-to-end). Closes the v0.7.4 phase as planned in
      `plans/v0.7.4-client-role.md`.
    
    With this release a fresh VPS → synced stoa peer is **one form in
    the Install Wizard** (DuckDNS token being the only external dance
    the operator still does manually: grab it once from the DuckDNS
    dashboard). Original 45-min ops slog compressed to ~10 min.
    
    ---
  • v.Chaos.Jason.0-c·v0.7.4k
    User spotted an inconsistency on `/admin/seeds`: AncientMiner's row
    showed `bytales.duckdns.org` (nice DNS name) while IonosFiveVPS
    showed `82.165.48.252` (raw IP), even though IonosFive now has
    `kjrkentolopon.duckdns.org` as its P2P identity.
    
    Cause: the UI was showing `nodes.host` (SSH entry point from the
    Add-Node wizard). AncientMiner happened to be added via its
    DuckDNS name for SSH; IonosFive was added via raw IP. The two
    don't have to match.
    
    **Fix**:
    - `ManagedNodeSeedRow` gains `p2pHostname: string | null`,
      populated from live argv's `p2p-hostname` flag (skipping the
      `0.0.0.0` placeholder).
    - Seeds page prefers `p2pHostname` when displaying node identity,
      falls back to `host` (SSH) if no p2p-hostname is set yet.
    - Appends `(ssh: <host>)` in muted text when the two differ, so
      the operator can still see the SSH entry point at a glance.
    - Tooltip explains which is which on hover.
    
    Behavior preserved: the backing data still uses `host` for SSH.
    Only the display changed.
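
    The display rule can be sketched as two small helpers. Hedged: both names
    are hypothetical, and the argv parsing assumes the flag appears either as
    `--p2p-hostname=<value>` or as `--p2p-hostname <value>`.

    ```typescript
    // Pull the p2p hostname out of a live argv, ignoring the 0.0.0.0 placeholder.
    function p2pHostnameFromArgv(argv: string[]): string | null {
      const prefix = "--p2p-hostname=";
      for (let i = 0; i < argv.length; i++) {
        const arg = argv[i] ?? "";
        let value: string | undefined;
        if (arg.startsWith(prefix)) value = arg.slice(prefix.length);
        else if (arg === "--p2p-hostname") value = argv[i + 1];
        if (value && value !== "0.0.0.0") return value;
      }
      return null;
    }

    // Prefer the P2P identity; show the SSH entry point only when it differs.
    function nodeIdentityLabel(host: string, p2pHostname: string | null): string {
      if (!p2pHostname) return host;
      return p2pHostname === host ? host : `${p2pHostname} (ssh: ${host})`;
    }
    ```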
    
    **Version bump**
    - `lib/version.ts` → `v0.7.4k`. Phase stays `CR2`.
  • v.Chaos.Jason.0-d·v0.7.4f
    **The real bug behind "fresh install won't sync."** User with a fresh
    IonosFive VPS had:
    - LE-signed cert ✓
    - Real DNS hostname pointing at the box ✓
    - `p2p-hostname` set to that hostname ✓
    - Port 1789 reachable ✓
    
    And still cut=0, no peers, no sync. Diagnosis:
    
    `RECOMMENDED_PROFILE` (what the Install Wizard uses) does not
    include `known-peer-info`. The `stoa` custom chainweb variant has
    **no built-in bootstrap peer list**; that's a `mainnet01`-only
    thing baked into upstream chainweb-node. So a fresh `stoa` node
    with no `known-peer-info` has **zero peer-discovery seeds** and
    sits at cut=0 forever waiting to be contacted, which can't happen
    either since its hostname was just created seconds ago and nobody
    in the network knows about it.
    
    **Fix**: add `'known-peer-info': ['node1.stoachain.com:1789',
    'node2.stoachain.com:1789']` to `RECOMMENDED_PROFILE`. Two entries
    for redundancy: a fresh node survives one seed being
    temporarily down. Once peer gossip discovers the broader graph
    on first handshake, the seed entries become non-critical.
    
    `ANCIENT_PROFILE` had one entry already; parity restored.
    
    **For existing nodes that were affected**: add `known-peer-info`
    manually via Flag Editor and Restart (takes 2 min). Or re-run
    Install Wizard (cleanup auto-wipes and re-installs with the new
    default).
    
    **Version bump (CR2 continues)**
    - `lib/version.ts` → `v0.7.4f`. Skipped `.e` because that slot is
      reserved for the "certificate step in Install Wizard" slice
      (still planned; this patch unblocks the current user first).
  • v.Chaos.Jason.0-e·v0.7.4d
    Install handler's self-signed cert generator was still emitting
    ECDSA P-384 / SHA-384, a copy-paste carryover from chainweb-node's
    example script that never matched any production Stoa node.
    node1 / node2 / AncientMiner all use ECDSA P-256 / SHA-256 per
    `RunStoaNode.sh`. v0.7.3g fixed this for `stoachain-cert-rotate`
    but missed `stoachain-install`, the same class of miss as the
    compose-plugin one.
    
    **Fix**: install handler's `openssl req -x509` now uses
    `-newkey ec -pkeyopt ec_paramgen_curve:P-256 -sha256`.
    
    **Important note for the operator**: the curve change does NOT
    make peers accept the node. Chainweb-node validates peer certs
    against the **system CA bundle**; any self-signed cert (P-256 or
    P-384) is rejected as "unknown CA", a verified fact from the
    node2 TLS forensics. A fresh install thus produces a
    chainweb-node that peers refuse. To get peer acceptance:
    
    1. Point DNS for your chosen `p2p-hostname` at the new VPS.
    2. Chainweb tab → Identity → "Obtain Let's Encrypt certificate"
       (HTTP-01 if port 80 is free; DNS-01 for DuckDNS).
    3. Restart the node.
    
    v0.7.5 (planned) folds the certbot step into the Install Wizard
    itself, so the wizard asks "enter hostname + obtain LE now?" at
    install time. For now, it's a manual post-install click.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.4d`. Phase stays `CR2`.
  • v.Chaos.Jason.0-f·v0.7.4c
    Install failed on a fresh Ionos Ubuntu 24.04 VPS with the empty error
    "docker compose up failed:", the same class of bug that
    convert-supervision had through v0.7.3r/t but the install handler
    never got the fix. Two root causes hit simultaneously:
    
    1. Ionos's default docker CLI is from `docker.io` apt package, which
       **does not include the compose plugin**. `docker compose up -d`
       parses `-d` as a top-level docker flag and chokes.
    2. The install handler's `dockerComposeUp` used `2>&1` to merge
       stderr into stdout but then only read `r.stderr` on failure, which is
       always empty. The operator saw "docker compose up failed:" with
       nothing after the colon.
    
    **Fixes (`lib/handlers/stoachain-install.ts`)**
    - New preflight step 5b: `ensureDockerComposePlugin(target)` runs
      between `docker pull` and `docker compose up`. If `docker compose
      version` fails, fetches the v2.29.1 compose plugin binary from
      GitHub releases and drops it into
      `/usr/libexec/docker/cli-plugins/docker-compose`. Architecture-
      aware (x86_64 / aarch64 / armv7). Uses sudo + tee + chmod (all
      already in the canonical sudoers list).
    - `dockerComposeUp` now:
      - streams compose output live via `onChunk` (you see pull/create/
        start progress in the job log in real time)
      - captures the merged output and includes it in the error message
        (tail -600 chars, with explicit exit code)
      - adds a defensive `docker rm -f stoa-node` before compose up to
        survive stale containers from prior failed installs
      - bumped timeout 60s → 5min for cold image pulls
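
    The second root cause is easy to reproduce in miniature: once the remote
    command uses `2>&1`, everything arrives on stdout, so an error built from
    `stderr` alone is empty. A hedged sketch of the corrected message builder
    follows; `ExecResult` and `composeFailureMessage` are hypothetical names,
    and the exact message format is an assumption.

    ```typescript
    interface ExecResult {
      code: number;
      stdout: string; // carries stderr too when the command ends in 2>&1
      stderr: string; // empty in that case
    }

    // Quote the tail of the merged output plus the exit code in the error.
    function composeFailureMessage(r: ExecResult, tailChars = 600): string {
      const merged = (r.stdout + r.stderr).trim();
      const tail = merged.slice(-tailChars);
      return `docker compose up failed (exit ${r.code}): ${tail || "<no output>"}`;
    }
    ```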
    
    **Version bump**
    - `lib/version.ts` → `v0.7.4c`. Phase code stays `CR2`.
  • v.Chaos.Jason.0-g·v0.7.4b
    Second slice of v0.7.4. Pure-plumbing v0.7.4a now has a visible
    surface: you can promote Mailcow mailboxes to `client`, assign nodes
    to those clients, and clients will see only their own nodes.
    
    **Phase code**: v0.7.4b ships as `CR2` (Client Role 2: promotion +
    ownership UI).
    
    **Admin page β€” Clients section (`/admin/admins`)**
    - New "Clients" roster section (parallel to the Ancient+Modern
      roster). Shows each client's email, promote-date, promoter, and a
      "pending onboarding" badge when `accepted_transparency_at` is null
      (wired in v0.7.4e).
    - New "Promote to Client" form (ancient-only): dropdown lists
      Mailcow mailboxes that aren't already admins / clients.
    - Revoke button on each client row (ancient-only; warns that nodes
      owned by revoked client become stranded until reassigned).
    - Page re-titled "Admins & Clients" with updated 3-tier intro copy.
    
    **New API routes**
    - `GET /api/admin/clients`: list clients.
    - `POST /api/admin/clients`: promote an email (ancient + fresh-confirm).
      Refuses if email is already ancient/modern (upgrade path has to go
      through explicit tier removal first).
    - `DELETE /api/admin/clients/[email]`: revoke (ancient + fresh-confirm).
    
    **Nodes list (`/admin/nodes`)**
    - Shows `owner:` line per node (email or "unowned · ancient-only").
    - SSR filters the list by ownership: modern/client admins see only
      nodes they own. Ancient sees all, including unowned.
    
    **Node detail (`/admin/nodes/[id]`)**
    - SSR returns 404 if caller can't `canAccessNode` (same behavior as
      the API layer: no leak between "doesn't exist" and "not yours").
    - New `OwnerRow` component under the SSH line. Shows the owner email
      or "unowned · ancient-only". Ancient admins get a "change" link
      that inline-edits the field, fresh-confirms via password modal,
      PATCHes `/api/admin/nodes/[id]/owner`, reloads.
    - New `PATCH /api/admin/nodes/[id]/owner` API route (ancient + fresh).
    
    **Add-Node wizard (`/admin/nodes/new`)**
    - New "Owner email" field at the bottom of the shared form. Defaults
      to the admin doing the adding.
    - Ancient admins can type any email. Modern/client admins see the
      field but it's locked to their own email (the API also refuses
      mismatched ownership for non-ancient callers).
    - Both `POST /api/admin/nodes` (paste-key) and
      `POST /api/admin/nodes/bootstrap` (password bootstrap) accept
      `ownerEmail`, default to caller, validate.
    
    **Schema changes**
    - `CreateNodeInput` and `BootstrapInput` gain `ownerEmail?: string | null`.
    - `NodeRow` and `PublicNode` gain `owner_email: string | null`.
    - `bootstrapNode` persists `owner_email` at INSERT time; defaults to
      `issuedBy` if caller didn't specify.
    - `createNode` persists `owner_email` at INSERT time (lowercased).
    
    **How to test after dev reload**
    1. Log in as ancient admin → `/admin/admins` → Clients section
       empty. Pick a non-admin mailbox → "Promote to Client". Confirm
       it appears in the Clients roster.
    2. `/admin/nodes/[id]` → click "change" next to Owner. Assign it
       to the client you just promoted. Save.
    3. Sign out. Sign in as the client's email. You land at `/admin`
       with their node visible at `/admin/nodes`. No other admin pages
       accessible (they'd 404).
    
    **Version bump**
    - `lib/version.ts` → `v0.7.4b`, phase `CR2`.
  • v.Chaos.Jason.0-h·v0.7.4a
    Starts the v0.7.4 phase (client role + ownership) per
    `plans/v0.7.4-client-role.md`. This slice is **pure plumbing**: no
    user-visible changes yet. Subsequent slices (b–e) add the UI for
    promotion, owner assignment, already-managed detection, key purge on
    unmanage, and the onboarding transparency modal.
    
    **Phase code**: v0.7.4a ships with phase code `CR1` (Client Role 1:
    Ownership plumbing), replacing SC5.
    
    **Migration 016**
    - `nodes.owner_email TEXT`: nullable column. Pre-v0.7.4a rows keep
      NULL = "unowned, ancient-only". Fresh Add-Node flows in v0.7.4b
      will populate it explicitly.
    - New `clients` table: mirrors `modern_admins` shape. Email +
      created_at + created_by + accepted_transparency_at (null until
      v0.7.4e's modal consent).
    - New `node_orphans` table: audit trail for unmanage attempts where
      the hub couldn't remove its SSH key from the target. v0.7.4d
      populates it.
    
    **`AdminRole` extended**
    - Added `'client'` to the union. Priority: `ancient > modern > client`.
    - `getAdminRole()` checks `clients` table when neither ancient-env nor
      `modern_admins` matches.
    
    **New helpers (`lib/admin.ts`)**
    - `canAccessNode(caller, node)`: ancient always; modern/client only if
      owner_email matches their email; null owner = false for non-ancient.
    - `requireOwnedNodeApi(req, res, opts?)`: route guard combining
      `requireAdminApi` + node lookup + ownership check. Returns
      `{ email, role, session, nodeId, ownerEmail }`. Pass `{ fresh: true }`
      for fresh-confirm routes. 404s uniformly on unauthorized or
      not-found (no surface leak).
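
    The ownership rule stated for `canAccessNode` is small enough to write out.
    A minimal sketch, assuming the `Caller` and `NodeOwnership` shapes below
    (both hypothetical) and a case-insensitive email comparison, which the
    entry does not specify.

    ```typescript
    type AdminRole = "ancient" | "modern" | "client";

    interface Caller {
      email: string;
      role: AdminRole;
    }

    interface NodeOwnership {
      owner_email: string | null; // NULL = unowned, ancient-only
    }

    function canAccessNode(caller: Caller, node: NodeOwnership): boolean {
      if (caller.role === "ancient") return true;   // ancient always
      if (node.owner_email === null) return false;  // unowned: ancient-only
      // Case-insensitive match is an assumption made for this sketch.
      return node.owner_email.toLowerCase() === caller.email.toLowerCase();
    }
    ```

    A route guard built on this would answer 404 for both "no such node" and
    "not yours", so the two cases stay indistinguishable to the caller.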
    
    **Node-route wiring (14 files updated, 8 skipped)**
    - Updated (now ownership-scoped):
      `[id].ts`, `apt-upgrade`, `backup`, `metrics/[...netdataPath]`,
      `netdata-install`, `probe`, `stoachain/control`, `stoachain/docker-logs`,
      `stoachain/flags` (GET only; PATCH stays ancient), `stoachain/logs`,
      `stoachain/peer-activity`, `stoachain/preflight`, `stoachain/status`,
      `test`.
    - Skipped (ancient-only by design, bypass ownership):
      `drive-benchmark`, `stoachain/cert-rotate`, `stoachain/certbot-obtain`,
      `stoachain/convert-supervision`, `stoachain/install`,
      `stoachain/peer-trust-reset`, `stoachain/reseed`, `sudoers-repair`.
    
    **Master plan updated**
    - `plans/control-hub.md` §16 Progress log: added the 2026-04-18 → 2026-04-21
      SC-series build-out summary + the v0.7.4a entry.
    
    **Next**: v0.7.4b, with client-role promotion UI in `/admin/acolytes` +
    owner-assignment UI on node detail.
  • v.Chaos.Jason.0-i·v0.7.3af
    v0.7.3ae's resolver fixed node2-hardcoding but had a gap: adopted
    docker nodes (never went through the Install wizard) have
    `stoachain_runner_path = NULL` in the DB, so the resolver fell
    through to "use live argv's `--database-directory`". For docker
    nodes that value is the **container-internal** path (`/data`) because
    chainweb-node runs inside the container. Resolver would have
    returned `/data/backups` and tar would have failed again.
    
    Hit on live for AncientMiner.
    
    **Fix**: when supervision is docker AND we have no captured runner
    path, run `docker inspect stoa-node` to read the host source of the
    `/data` bind mount. That's the authoritative host data dir.
    
    **Resolution flow (now)**
    1. Hub-installed docker (runner_path ends in `compose.yml`) → derive
       from stoa-root convention
    2. Adopted docker (runner_path NULL, supervision=docker) →
       `docker inspect` the `/data` mount source
    3. screen/systemd → live argv's `--database-directory` (host path)
    4. Fallback → stored flags' database-directory
    5. Throw with actionable error if nothing resolves
    
    The `/data !== db` sanity guard now also prevents accidentally
    treating a container-internal path as a host path in the later
    fallbacks.
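
    The resolution order can be sketched as a pure function over values that
    have already been gathered. This is an illustration under assumptions: the
    `ResolveInput` shape is hypothetical, the `docker inspect` call is assumed
    to have run elsewhere and produced `dockerDataMountSource`, and the
    stoa-root convention is taken to mean "the directory containing
    `compose.yml`".

    ```typescript
    import { posix as path } from "node:path";

    interface ResolveInput {
      supervision: "docker" | "screen" | "systemd";
      runnerPath: string | null;            // stoachain_runner_path
      dockerDataMountSource: string | null; // host source of the /data bind mount
      liveArgvDbDir: string | null;         // --database-directory from live argv
      storedFlagsDbDir: string | null;      // database-directory from stored flags
    }

    function resolveHostBackupDir(i: ResolveInput): string {
      const isDocker = i.supervision === "docker";
      // 1. Hub-installed docker: derive from the compose file's directory.
      if (isDocker && i.runnerPath !== null && i.runnerPath.endsWith("compose.yml")) {
        return path.join(path.dirname(i.runnerPath), "data", "backups");
      }
      // 2. Adopted docker: use the host side of the /data bind mount.
      if (isDocker && i.runnerPath === null && i.dockerDataMountSource) {
        return path.join(i.dockerDataMountSource, "backups");
      }
      // Sanity guard: for docker, "/data" is container-internal, never a host path.
      const usable = (db: string | null) => db !== null && !(isDocker && db === "/data");
      // 3. screen/systemd: live argv already holds a host path.
      if (!isDocker && usable(i.liveArgvDbDir)) return path.join(i.liveArgvDbDir!, "backups");
      // 4. Fallback: stored flags.
      if (usable(i.storedFlagsDbDir)) return path.join(i.storedFlagsDbDir!, "backups");
      // 5. Nothing resolved.
      throw new Error("cannot resolve host backup dir: set database-directory or re-probe the node");
    }
    ```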
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3af`.
  • v.Chaos.Jason.0-j·v0.7.3ae
    Two bugs surfaced once v0.7.3ad stopped auto-promoting junk seeds
    and forced the real failure into visibility:
    
    **Bug 1: backup handler had the remote backup dir hardcoded**
    (`/mnt/nvmedrive/StoaNodeData/backups`). Worked for node2 by
    coincidence; every other node's tar ran against a non-existent
    path and produced an empty archive. Seen in the wild on live's
    AncientMiner attempt:
    
    ```
    tar: /mnt/nvmedrive/StoaNodeData/backups/1776731165148056: Cannot open
    tar: Error is not recoverable: exiting now
    ```
    
    **Bug 2: donor eligibility threshold was 95% of the tallest
    *candidate***. If only one node had `enable-backup-api` on, it was
    always ≥95% of itself and passed, even when another managed node
    (without backup-api) showed the network was miles ahead.
    
    **Fixes**
    
    - `lib/handlers/backup-stoachain.ts`:
      - New `resolveHostBackupDir(node, nodeId, log)` helper. For docker
        nodes: derive from `stoachain_runner_path` (compose dir →
        `<stoaRoot>/data/backups`). For screen/systemd: use live
        argv's `--database-directory` (host path directly) →
        `<db-dir>/backups`. Falls back to stored flags; throws with a
        clear operator message if neither source resolves.
      - The `du` baseline measurement also uses the derived data dir
        (not the hardcoded path).
    - `lib/seeds.ts`:
      - Max cut is now tracked across ALL reachable managed nodes, not
        just backup-api-enabled candidates.
      - Eligibility threshold raised 95% → **999‰ (99.9%)**. Matches
        the "sync progress" green-zone threshold in the per-node Status
        card, so what the admin sees as "synced" is exactly what the
        donor picker accepts.
      - `cut-too-low` reason text now shows permille: "sync progress
        823.1‰ is below the 999‰ donor threshold".
    
    Belt-and-suspenders with v0.7.3ad's 1 GiB archive-size check:
    size check catches empty archives at write time; sync check
    catches partially-synced donors at pick time.
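
    The Bug 2 fix comes down to where the maximum cut is taken. A hedged
    sketch: `ManagedNodeCut` and `eligibleDonors` are hypothetical names, and
    sync progress is assumed to be `cut / maxCut` expressed in permille.

    ```typescript
    interface ManagedNodeCut {
      name: string;
      cut: number;               // current cut height
      backupApiEnabled: boolean; // only these can act as donors
    }

    const DONOR_THRESHOLD_PERMILLE = 999;

    function eligibleDonors(
      nodes: ManagedNodeCut[],
    ): { name: string; permille: number; eligible: boolean }[] {
      // Max cut over ALL reachable managed nodes, not just donor candidates.
      const maxCut = Math.max(0, ...nodes.map((n) => n.cut));
      return nodes
        .filter((n) => n.backupApiEnabled)
        .map((n) => {
          const permille = maxCut === 0 ? 0 : (n.cut / maxCut) * 1000;
          return { name: n.name, permille, eligible: permille >= DONOR_THRESHOLD_PERMILLE };
        });
    }
    ```

    With a single backup-api node the old rule compared that node against
    itself; here a second node without the backup API still raises `maxCut`
    and exposes the lagging donor.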
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3ae`.
  • v.Chaos.Jason.0-k·v0.7.3ad
    Hit on live: the auto-refresh job promoted a **714-byte archive**
    from AncientMiner (manifest: `innerBytes: 20, remoteSizeBytes: 0`)
    as the hub's current seed. Happens when the donor's chainweb backup
    API returns a near-empty archive, most likely because the donor
    wasn't ready (recent restart, still syncing, internal backup worker
    uninitialized).
    
    Without a guard, a reseed from this "seed" would replace target
    nodes with an empty data dir. Real footgun: seed-refresh must
    refuse to promote junk.
    
    **Fix (`lib/handlers/seed-refresh.ts`)**
    - After the backup sub-handler returns, cross-check `size_bytes`.
    - If below `MIN_SEED_SIZE_BYTES = 1 GiB`, throw with a clear
      operator message. The backup row is preserved (operator can
      inspect or delete via `/admin/backups`); the existing current
      seed (if any) is untouched.
    - Threshold chosen to be generous enough that any healthy chainweb
      donor clears it, strict enough that an empty-archive failure gets
      caught (real stoa-chain data is ~50 GB by now).
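
    The guard itself is a one-line comparison. A minimal sketch, assuming a
    hypothetical `assertSeedLargeEnough` helper and an illustrative error
    message; only the constant name and the 1 GiB value come from this entry.

    ```typescript
    const MIN_SEED_SIZE_BYTES = 1024 ** 3; // 1 GiB

    // Throws instead of promoting when the archive is implausibly small.
    function assertSeedLargeEnough(sizeBytes: number): void {
      if (sizeBytes < MIN_SEED_SIZE_BYTES) {
        throw new Error(
          `seed archive is ${sizeBytes} bytes, below the ${MIN_SEED_SIZE_BYTES}-byte minimum; refusing to promote`,
        );
      }
    }
    ```

    Throwing before the promote step leaves both the backup row and the
    existing current seed as they were, which is the behavior described above.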
    
    **Cleanup on live**
    - Deleted the bad seed_archives row + 714-byte .ahbk file on the
      production hub (one-off SSH). Next scheduled seed-refresh will
      produce a real seed once a healthy donor is available.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3ad`.
  • v.Chaos.Jason.0-l·v0.7.3ac
    Follow-up to v0.7.3ab: seeds and client backups have different
    semantics (hub infrastructure vs client-facing archives) and mixing
    them in the Backups UI is confusing. Splits them cleanly.
    
    **Changes**
    - `listBackups(opts)` gains `excludeSeeds?: boolean`. The Backups
      page + API both pass it to exclude seed-referenced rows.
    - `/admin/backups` no longer shows seeds. Header paragraph now
      points operators at `/admin/seeds` for hub-infrastructure archives.
    - New endpoint `GET /api/admin/seeds/[id]/download`:
      - **Ancient admin + fresh-confirm required**
      - Serves the `.ahbk` file (HTTP Range supported, resumable)
      - **No auto-delete**: hub keeps its copy, operator gets a copy
      - Filename baked with seed status + promote date for cold-storage
        clarity (`stoa-seed-current-2026-04-21-<id8>.ahbk`)
    - `/admin/seeds` History table gains a `Download` column with a
      `↓ .ahbk` button per row. Button triggers the password modal
      (stamps fresh-confirm on the session) then navigates to the
      download URL.
    - History section has an explanatory paragraph: seeds are
      infrastructure, download is out-of-band only, no auto-delete.
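    
    A rough sketch of how the download filename could be assembled. Only the
    final pattern (`stoa-seed-current-2026-04-21-<id8>.ahbk`) comes from this
    entry; the function name, the UTC date choice, and taking `<id8>` as the
    first 8 alphanumeric characters of the seed id are assumptions.
    
    ```typescript
    // Hypothetical helper: builds a cold-storage-friendly filename for a seed.
    function seedDownloadFilename(
      status: "current" | "archived",
      promotedAt: Date,
      seedId: string
    ): string {
      // YYYY-MM-DD taken from the ISO timestamp (UTC).
      const day = promotedAt.toISOString().slice(0, 10);
      // Assumed meaning of <id8>: first 8 alphanumeric characters of the id.
      const id8 = seedId.replace(/[^0-9a-zA-Z]/g, "").slice(0, 8);
      return "stoa-seed-" + status + "-" + day + "-" + id8 + ".ahbk";
    }
    ```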
    
    **Use cases for the download**
    - Cold/offline archive of the reseed baseline (disaster recovery)
    - Manual reseed on a firewalled node that can't SSH to the hub
    - Inspection / diagnostics of the archive content
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3ac`.
  • v.Chaos.Jason.0-m·v0.7.3ab
    User caught a real footgun: the hub's seed archive (the `.ahbk` used
    for new-node installs + reseeds) shares the same `data/backups/`
    directory and `backups` table as client-facing backups. Downloading
    it via the normal backups page auto-deleted the file on completion
    (standard behavior for client backups), which would orphan the seed
    and break future reseeds.
    
    **Fix: seed-referenced backups are now protected**
    
    - New helpers in `lib/backups.ts`:
      - `getBackupSeedStatus(id)` → `'current' | 'archived' | null`
      - `listBackupSeedStatuses(ids)` → batch map for list endpoints
    - `deleteBackup(id, opts)` now throws `BackupIsSeedError` if the
      backup is seed-referenced. Pass `{ force: true }` only from
      internal seed-management code (none currently; reserved for
      future demotion flows).
    - `DELETE /api/admin/backups/[id]` catches the new error, returns
      **409 Conflict** with the seedStatus, and logs the refusal.
    - `GET /api/admin/backups/[id]/download` auto-delete-on-completion
      logic now skips seed-referenced backups. Staged `.tar.gz.ready`
      is still cleaned up (it's disposable); only the `.ahbk` is the
      seed archive and stays on disk.
    - `GET /api/admin/backups` and `GET /api/admin/backups/[id]` now
      include `seedStatus` in the response.
    - `/admin/backups` UI surfaces this:
      - `HUB SEED · current` (orange) or `HUB SEED · archived` (grey)
        badge next to the label
      - Tooltip on Download explains auto-delete is skipped for seeds
      - Header paragraph mentions the HUB SEED exemption
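    
    The protection above can be sketched roughly as follows. The names
    `BackupIsSeedError`, `deleteBackup`, the `force` option, and the 409
    response come from this entry; the injected `seedStatusOf` / `removeRow`
    callbacks, the `handleDelete` wrapper, and the 200 success status are
    stand-ins for the real database and route code.
    
    ```typescript
    type SeedStatus = "current" | "archived" | null;
    
    class BackupIsSeedError extends Error {
      readonly seedStatus: SeedStatus;
      constructor(backupId: string, seedStatus: SeedStatus) {
        super("backup " + backupId + " is a " + seedStatus + " hub seed; refusing to delete");
        this.name = "BackupIsSeedError";
        this.seedStatus = seedStatus;
        // Keeps instanceof working even when compiled to older targets.
        Object.setPrototypeOf(this, BackupIsSeedError.prototype);
      }
    }
    
    // seedStatusOf stands in for getBackupSeedStatus(id); removeRow for the
    // real row + file removal. Both are assumptions made for this sketch.
    function deleteBackup(
      id: string,
      seedStatusOf: (id: string) => SeedStatus,
      removeRow: (id: string) => void,
      opts: { force?: boolean } = {}
    ): void {
      const status = seedStatusOf(id);
      if (status !== null && opts.force !== true) {
        throw new BackupIsSeedError(id, status);
      }
      removeRow(id);
    }
    
    // Route-level mapping: a protected seed becomes 409 Conflict with its seedStatus.
    function handleDelete(
      id: string,
      seedStatusOf: (id: string) => SeedStatus,
      removeRow: (id: string) => void
    ): { status: number; seedStatus?: SeedStatus } {
      try {
        deleteBackup(id, seedStatusOf, removeRow);
        return { status: 200 };
      } catch (err) {
        if (err instanceof BackupIsSeedError) {
          return { status: 409, seedStatus: err.seedStatus };
        }
        throw err;
      }
    }
    ```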
    
    Downloads of seeds now behave as: admin gets a copy of the file,
    hub keeps the file, reseed remains possible. No more one-shot
    "download → lose the seed" accident.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3ab`.
  • v.Chaos.Jason.0-n·v0.7.3aa
    Node2 conversion succeeded (chainweb now runs inside
    `stoa-node` container), but the Control tab still showed
    `supervision=screen`. Two cooperating bugs:
    
    1. Priority was `screen > docker > systemd`. Any screen session
       present made detection short-circuit to screen.
    2. Screen detection regex matched **any** session name: `\d+\.\w+`.
       Node2 has unrelated screens on the box, `StoaMiner` (kadena
       ASIC miner) and `cronoton`, and both matched. The first one was
       picked, so the mode was mis-reported.
    
    **Authoritative fix**: use the **cgroup of the chainweb-node PID**.
    A docker-supervised process lives in `/system.slice/docker-<hash>.scope`;
    a systemd unit lives in `/system.slice/<unit>.service`. That's the
    truth regardless of which other services happen to be on the box.
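    
    A minimal sketch of the cgroup check, assuming the raw text of
    `/proc/<pid>/cgroup` has already been fetched over SSH. The two path
    shapes are the ones named above; the function name and the `null`
    "did not resolve" result are assumptions for illustration.
    
    ```typescript
    type CgroupSupervision = "docker" | "systemd";
    
    // Input: raw contents of /proc/<pid>/cgroup for the chainweb-node PID.
    // Each line has the form "hierarchy-id:controllers:path".
    function supervisionFromCgroup(cgroupText: string): CgroupSupervision | null {
      const paths = cgroupText
        .split("\n")
        .map((line) => line.trim().split(":").slice(2).join(":"))
        .filter((path) => path.length > 0);
    
      // Docker-supervised: /system.slice/docker-<hash>.scope
      if (paths.some((p) => /\/docker-[0-9a-f]+\.scope$/.test(p))) return "docker";
      // systemd unit: /system.slice/<unit>.service
      if (paths.some((p) => /\/system\.slice\/[^/]+\.service$/.test(p))) return "systemd";
      // Anything else (for example a user session scope) is left to the fallback probes.
      return null;
    }
    ```
    
    A process started from a screen session typically sits in a user session
    scope, so it resolves to `null` here and falls through to the older
    screen/docker/systemd checks.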
    
    **Changes (lib/stoachain-live.ts)**
    - Bash probe now captures `/proc/$PID/cgroup` in a new `---CGROUP---`
      section.
    - Supervision picker checks cgroup first (docker / systemd), falls
      back to screen/docker/systemd blocks only if cgroup didn't resolve.
    - Screen session detection regex tightened to `[0-9]+\.StoaNode` only,
      so unrelated screens no longer trigger false positives.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3aa`.
  • v.Chaos.Jason.0-o·v0.7.3z
    Node2 screen → docker conversion failed:
    
    ```
    error mounting ".../StoaNodeData.stoa/tls/tls-cert.pem" to rootfs at "/data/tls-cert.pem":
    ...not a directory: Are you trying to mount a directory onto a file (or vice-versa)?
    ```
    
    Two distinct bugs:
    
    **Bug 1 (root cause): cert path not translated after data-dir move.**
    On nodes where the TLS cert lives inside the data dir (e.g.
    `/mnt/nvmedrive/StoaNodeData/tls-cert.pem`), the `mv` of the data dir
    moves the cert along with it. The flags loaded from live argv still
    point at the pre-move path, so the `sudo cp` to copy cert+key into
    the new `tls/` subdir silently fails. The handler didn't check cp's
    exit code; it logged "copied cert+key" regardless. Docker's
    bind-mount then auto-created the missing source path as a directory,
    and `runc` rejected the mount because you can't bind-mount a dir
    onto a file.
    
    **Bug 2: dead-but-existing container poisons supervision detection.**
    A compose-up that creates a container but fails to start it leaves
    that container in "Created" state. `detectSupervisionLive` was using
    `docker ps -a` (all containers), so a stopped stoa-node was reported
    as docker-supervised even after rollback restarted screen/systemd.
    
    **Fixes (lib/handlers/stoachain-convert-supervision.ts)**
    - Before cp: if the cert/key paths were inside the old data dir,
      translate them to the new (post-mv) location. Logs the translation
      so it's visible.
    - cp: check exit code and throw on failure. Also `test -f` the
      resulting `tls-cert.pem` to make sure it's actually a regular file.
    - detection: `docker ps` (running only), not `docker ps -a`.
    - New rollback step pushed right after compose.yml is written:
      `docker compose down` + `docker rm -f stoa-node`. LIFO order puts
      this first on rollback (while compose.yml still exists), then
      remove-intermediate / mv-back / restart-old. Prevents orphaned
      container from blocking clean retry.
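    
    The path translation in the first fix can be sketched as a pure prefix
    rewrite. The function name and the example directories are hypothetical;
    the real handler also logs the translation and verifies the copy
    afterwards.
    
    ```typescript
    // Hypothetical helper: if `path` lived inside `oldDir`, return where it
    // lives after `oldDir` was moved to `newDir`; otherwise return it unchanged.
    function translateAfterMove(path: string, oldDir: string, newDir: string): string {
      // Compare against "oldDir/" so a sibling such as "oldDirOther/" never matches.
      const oldPrefix = oldDir.replace(/\/+$/, "") + "/";
      if (path.indexOf(oldPrefix) !== 0) return path;
      return newDir.replace(/\/+$/, "") + "/" + path.slice(oldPrefix.length);
    }
    ```
    
    Paths outside the old data dir (for example a cert under `/etc`) pass
    through untouched, so the same call is safe for every node layout.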
    
    **Scripts**
    - `scripts/recover-node2-post-fail.ts`: one-off to clean up node2's
      dead container + leftover .stoa dir after the v0.7.3y attempt.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3z`.
  • v.Chaos.Jason.0-p·v0.7.3y
    Node2 benchmark: write succeeded at 210 MB/s, then cache-drop step
    timed out at 10s. Root cause: `sync` blocks until RocksDB dirty
    pages are flushed; on a busy chainweb node that's easily >10s.
    Timeout killed the whole benchmark even though cache-drop is
    strictly a "read test accuracy" nice-to-have.
    
    **Fixes**
    - Dropped the `sync` preamble. We care about clearing the page cache,
      not durability; `drop_caches` handles what we need.
    - Bumped timeout 10s → 30s for the drop itself.
    - Wrapped the call in try/catch: if it still times out or fails for
      any reason, log a warning and continue. The read test may show
      inflated cached throughput in that case, but the write number is
      the authoritative one anyway (RocksDB's bottleneck is writes).
    
    Net effect: no more benchmark deaths from a busy node, and the
    worst degradation is "read test optimistic".
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3y`.
  • v.Chaos.Jason.0-q·v0.7.3x
    v0.7.3w's auto-sudoers-repair worked (AncientLinux benchmark got past
    the dd write step, 345 MB/s). Next failure was the **read** parse:
    
    ```
    536870912 bytes (537 MB, 512 MiB) copied, 0,445005 s, 1,2 GB/s
    ```
    
    Two issues packed into one line:
    - Comma decimal separator (`0,445005`, `1,2`): AncientLinux is in a
      German/Romanian locale
    - GB/s (not MB/s): fast NVMe reads report in GB/s
    
    The old regex `/,\s*([\d.]+)\s*MB\/s/` expected dot-decimals AND
    MB/s. Missed both on this line.
    
    **Fix**: new `parseDdThroughput(output)` helper that accepts
    MB/s, GB/s, KB/s (with GB→MB and KB→MB normalization) and both
    `.` and `,` as decimal separator. Returns null if unparseable so
    the caller can throw an honest error.
    
    Used for both write and read parsing in `drive-benchmark.ts`.
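    
    A sketch of what such a parser might look like, not the shipped
    `parseDdThroughput`. It assumes dd's decimal (SI) units, so GB/s is
    multiplied by 1000 and kB/s divided by 1000; the exact factors used in
    `drive-benchmark.ts` are not stated in this entry.
    
    ```typescript
    // Returns throughput in MB/s, or null when no "<number> <unit>B/s" is found.
    function parseDdThroughput(output: string): number | null {
      // Number with "." or "," as decimal separator, then kB/s, KB/s, MB/s or GB/s.
      const match = /([0-9]+(?:[.,][0-9]+)?)\s*([kKMG])B\/s/.exec(output);
      if (match === null) return null;
    
      const value = parseFloat(match[1].replace(",", "."));
      if (!isFinite(value)) return null;
    
      const unit = match[2].toUpperCase();
      if (unit === "G") return value * 1000; // GB/s -> MB/s (SI assumption)
      if (unit === "K") return value / 1000; // kB/s -> MB/s (SI assumption)
      return value; // already MB/s
    }
    ```
    
    The `(537 MB, 512 MiB)` part of the dd line is skipped because the
    pattern requires the `/s` suffix right after the unit.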
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3x`.
  • v.Chaos.Jason.0-r·v0.7.3w
    v0.7.3v's probe correctly identified AncientLinux's `/home/StoaNode/data`
    as root-owned (docker runs chainweb as root), triggering the
    `sudo -n dd` path. That path then failed because AncientLinux's sudoers
    is from the pre-v0.7.3m template and doesn't include `/bin/dd`,
    `/bin/sh`, or `/bin/sync`.
    
    Rather than tell the operator "go click Sudoers Repair and retry",
    the handler now auto-repairs sudoers on sudo-refusal. Every manual
    fix becomes a UI feature.
    
    **Changes**
    - New `lib/sudoers.ts`: single source of truth for the canonical
      NOPASSWD command list, with `repairSudoers(target, username)` and
      `ensureSudoers(target, username, log)` helpers.
    - `lib/handlers/drive-benchmark.ts`: on sudo refusal during dd write,
      calls `ensureSudoers()` to refresh `/etc/sudoers.d/ancientholdings-stoa`
      to the canonical list, then retries the dd once. If it still fails,
      returns an actionable error ("check the sudoers file manually").
    - Refactored `pages/api/admin/nodes/[id]/sudoers-repair.ts` to use
      the shared primitive; previously the canonical list was duplicated
      across three files.
    - Also dropped the last remaining fake MB/s fallback: if dd exits 0
      but output lacks the `MB/s` line, throw an error instead of
      inventing a reading from wall-clock time.
    - Added `/usr/bin/curl` to the canonical sudoers list (needed by
      v0.7.3t's compose-plugin install and v0.7.3u's docker install).
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3w`.
  • v.Chaos.Jason.0-s·v0.7.3v
    Two bugs in one:
    
    1. **Drive benchmark always used `sudo -n dd`**: fine for docker
       installs (root-owned data dir) but failed on user-owned data dirs
       (screen/systemd installs, e.g. AncientLinux's `/home/StoaNode/data`)
       whose sudoers didn't have a `/bin/dd` entry. The dd never actually
       ran; sudo refused with "a password is required".
    
    2. **The handler fabricated a fake MB/s reading on failure.** Because
       the error-handling ran AFTER the mbps calculation, and the
       calculation fell back to `sizeMb / wall-clock` when the dd output
       had no "MB/s" line, the job log showed a plausible-looking number
       (the ssh round-trip time, e.g. "1802.8 MB/s") before throwing
       the actual error. Misleading.
    
    **Fixes (both in `lib/handlers/drive-benchmark.ts`)**
    - Probe `benchDir` perms first via `[ -w ... ]`. If the ssh user can
      write, skip `sudo` entirely. Only use sudo when the dir is
      root-owned (docker case).
    - Check dd exit code BEFORE parsing MB/s. If exit != 0 and stderr
      indicates sudo refusal, return a clear "run Sudoers Repair" error
      instead of trying to plot a fake reading.
    - Same pattern for the read-test dd and the rm cleanup.
    - Cache-drop (`/proc/sys/vm/drop_caches`) still needs sudo; left
      best-effort with `|| true`. If cache-drop fails, the read number
      is just inflated (cached), but the write number is still accurate.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3v`.
  • v.Chaos.Jason.0-t·v0.7.3u
    Closes the "you want to convert to docker but docker isn't installed"
    gap. v0.7.3t handled the compose plugin; v0.7.3u handles the whole
    docker engine.
    
    **Fix**: convert-supervision's docker preflight now runs Docker's
    official `get.docker.com` convenience script if `command -v docker`
    fails. That sets up the apt repo, installs `docker-ce` +
    `docker-compose-plugin` + dependencies, and enables + starts
    `docker.service`. After install, preflight re-verifies `docker --version`
    and proceeds to the compose-plugin check (which should now pass since
    get.docker.com includes the plugin).
    
    Rationale: "if you're converting TO docker and docker is missing,
    install it" is the obvious operator expectation. Failing with
    "go run the install-wizard bootstrap step yourself" made the Upgrade
    button a lie. The converter is now genuinely self-healing for the
    docker-as-target case.
    
    Every manual fix becomes a UI feature, in line with the operator
    principle that production users won't have Claude to SSH in for them.
    
    **Install flow (docker path)**
    1. `command -v docker` → if missing, run `get.docker.com` (10-min timeout)
    2. `docker --version` → sanity check after install
    3. `docker compose version` → if missing, fetch v2 plugin binary from
       GitHub (v0.7.3t code)
    4. Proceed with conversion
    
    Streamed output: the `[docker-install]` and `[compose]` lines show
    pull/install progress in real time.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3u`.
  • v.Chaos.Jason.0-u·v0.7.3t
    Real error surfaced by v0.7.3s's error-visibility + rollback: node1
    had docker CLI 29.1.3 but **no compose plugin**. Ubuntu 22.04's
    `docker.io` package ships the CLI without the plugin. Running
    `docker compose up -d` then fails with
    `unknown shorthand flag: 'd' in -d` because docker treats `compose`
    as a positional arg and `-d` as a top-level docker flag.
    
    **Fix**: convert-supervision's docker preflight now checks for
    `docker compose version` and, if missing, downloads the official
    v2 plugin binary (v2.29.1) from GitHub releases directly into
    `/usr/libexec/docker/cli-plugins/docker-compose`. Single-binary
    install; no apt repo, no GPG key, no Docker repo setup needed.
    Architecture-aware (x86_64 / aarch64 / armv7). Uses sudo + tee
    (already in sudoers).
    
    This takes the "why doesn't my upgrade work" confusion off the
    operator's plate when their distro's docker package is incomplete.
    Can later be factored into a shared `ensureDockerCompose()` primitive
    used by the install-wizard too.
    
    **Rollback proven end-to-end**
    - Last failed attempt from v0.7.3s logs showed: `[compose] unknown
      shorthand flag: 'd' in -d` → `[rollback] ✓ restored to systemd`.
      Node never needed manual SSH recovery. That's the target state.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3t`.
  • v.Chaos.Jason.0-v·v0.7.3s
    The big one. Previously "the old supervision never comes back up on
    failure" was left to operators to fix manually (or a one-off recovery
    script). v0.7.3s bakes **full rollback** into every conversion.
    
    **How it works**
    - Before any destructive step, `captureOldStartInfo()` records how to
      restart the current mode:
      - systemd: resolves the active unit name
      - screen: captures the runner path from live argv or stored profile
      - docker: captures the compose working dir via `docker inspect`
    - The destructive section builds a `rollbackStack` of labelled undo
      callbacks as it goes:
      - after stop → "restart old mode" (registered first, runs last)
      - if data dir was moved → "mv data back" (using `[ -d src ] && [ ! -e dest ]` guards)
      - if data dir was newly created as part of layout → "remove intermediate"
      - before writing systemd unit/wrapper → snapshots originals to
        `.TS.bak`, records "restore systemd unit + wrapper" (stops + disables
        + restores backups + daemon-reload)
      - before writing screen runner → snapshots to `.TS.bak`, records
        "restore screen runner"
    - On any failure in steps 4-7: run the stack in reverse (LIFO). Each
      undo is wrapped in try/catch so one failing undo doesn't block the
      rest. After rollback, re-runs supervision detection; logs whether
      the old mode came back successfully.
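    
    The LIFO unwinding described above can be sketched as follows. This is a
    simplified, synchronous illustration: `UndoStep` and `unwind` are
    hypothetical names, and the real `rollbackStack` callbacks presumably run
    remote commands asynchronously.
    
    ```typescript
    interface UndoStep {
      label: string;
      undo: () => void;
    }
    
    // Runs every registered undo in reverse registration order. A failing undo
    // is logged and does not stop the remaining ones. Returns the failed labels.
    function unwind(stack: UndoStep[], log: (line: string) => void): string[] {
      const failed: string[] = [];
      while (stack.length > 0) {
        const step = stack.pop() as UndoStep;
        try {
          step.undo();
          log("[rollback] ok: " + step.label);
        } catch (err) {
          failed.push(step.label);
          log("[rollback] FAILED: " + step.label + ": " + String(err));
        }
      }
      return failed;
    }
    ```
    
    Because "restart old mode" is pushed first, it is popped last, after the
    data dir and unit files have been put back.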
    
    **Verify is now inside the rollback scope**: if chainweb-node doesn't
    come up within 3 min under the new mode, we revert to the known-good
    old mode instead of leaving the node silent. (Previously the handler
    explicitly skipped rollback for verify failures; that was exactly the
    kind of half-broken state operators had to SSH in to fix.)
    
    **Error visibility (from v0.7.3r, restated)**
    - compose output is now streamed live to the job log via `onChunk`
      (pull/create/start progress visible in real time)
    - Combined (stderr + streamed stdout) is included in the error
      message, tail -600 chars, with explicit exit code
    - Same treatment for systemctl + screen start
    - Defensive `docker rm -f stoa-node` before compose up (survives
      a stale container collision from a prior failed attempt)
    
    **Scripts**
    - New `scripts/recover-node1-systemd.ts`: one-off recovery used to
      restore node1 to systemd after the v0.7.3o/p/q chain of failed
      conversions left it half-converted. Useful as a reference for
      similar recoveries; not intended to be part of the regular ops path
      now that rollback is built in.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3s`.
  • v.Chaos.Jason.0-w·v0.7.3r
    Bugfix chain continuing from v0.7.3p/q. v0.7.3p unlocked the upgrade
    for adopted nodes (node1); the actual `docker compose up` then failed
    with only "docker compose up failed:" (empty stderr). Root cause: the
    handler merged stderr into stdout via `2>&1` but then only reported
    `r.stderr` on failure, dropping the real error on the floor.
    
    **Fixes**
    - Live-stream compose output to the job log (`onChunk`), so you see
      the pull/create/start progress in real time.
    - Error message now includes the combined captured output, tail
      -600 chars, with explicit exit code.
    - Defensive cleanup: `docker rm -f stoa-node` before compose up, so
      a stale `stoa-node` container from a prior failed attempt doesn't
      cause a name-conflict error on the next try.
    - systemd-start + screen-start error paths: same treatment
      (stdout + exit code surfaced).
    - Docker-compose timeout raised 3 min → 5 min to cover cold
      image pulls on slow connections.
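    
    A small sketch of the error formatting described in these fixes. The
    600-character tail and the explicit exit code come from this entry; the
    helper name, parameter order, and placeholder text are assumptions.
    
    ```typescript
    // Hypothetical helper: builds a failure message from everything the
    // command printed, instead of reporting only a (possibly empty) stderr.
    function formatCommandFailure(
      what: string,
      exitCode: number,
      stdout: string,
      stderr: string,
      maxTail: number = 600
    ): string {
      const combined = (stderr + "\n" + stdout).trim();
      const tail =
        combined.length > maxTail ? combined.slice(combined.length - maxTail) : combined;
      return (
        what + " failed (exit " + exitCode + "): " +
        (tail.length > 0 ? tail : "<no output captured>")
      );
    }
    ```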
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3r`.
  • v.Chaos.Jason.0-x·v0.7.3q
    Bugfix: after running a drive benchmark that classified a drive as SSD,
    the "Drive (sysfs)" row still rendered the red "HDD (discouraged)"
    badge because the badge was hardcoded to sysfs; the empirical class
    was only being applied to the Storage card's tone and the HDD-
    discouragement warning.
    
    Fix: new `effectiveClassBadge` that renders the benchmark class when
    available, sysfs as the fallback. The row is renamed from "Drive
    (sysfs)" to "Drive class", with a source note: "from empirical
    benchmark - sysfs heuristic said hdd" when they disagree, or "from
    /sys/block - heuristic (run benchmark below for empirical)" when
    only sysfs is available. Drive model moved to its own KV row.
    
    Also drops the now-dead `driveBadge` helper.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3q`.
  • v.Chaos.Jason.0-y·v0.7.3p
    Bugfix: v0.7.3o's convert-supervision failed on adopted nodes (like
    node1) because it required a pre-captured `stoachain_flags_json` in
    the DB. Adopted systemd/screen nodes that never went through the
    Install wizard never had stored flags; the handler blew up at step
    2/8 with "no stored flag profile - trigger a Restart first".
    
    Fix: source flags from **live argv first** (via `fetchLiveFlags`, which
    SSHes + parses `ps` output), fall back to stored only if live parsing
    fails. Since the handler already confirms the node is running in
    step 1/8 (supervision detection), live always works in practice.
    
    Affected path: `lib/handlers/stoachain-convert-supervision.ts` step 2/8.
    No other behavior change; the hierarchy lock + UI + API route from
    v0.7.3o are untouched.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3p`.
  • v.Chaos.Jason.0-z·v0.7.3o
    Turns the v0.7.3n any↔any converter into an **upgrade-only** ladder
    along the hierarchy `docker > systemd > screen`. Screen is the worst
    supervision mode for a production daemon (no restart policy, no boot
    recovery, session death = node death), and the UI now surfaces that
    so operators can't accidentally miss it.
    
    **Hierarchy (correct ordering)**
    - `docker`  ★★★: image-pinned, isolated, reboot-safe via
      `restart: unless-stopped`. Best.
    - `systemd` ★★: proper lifecycle (`Restart=on-failure`), boot recovery
      (`WantedBy=multi-user.target`), but binary lives on host. Upgrade
      recommended.
    - `screen`  ★: no restart, no boot recovery. Upgrade highly
      recommended.
    
    **Hub-enforced upgrade-only conversions (3)**
    - `screen  → systemd`
    - `screen  → docker`
    - `systemd → docker`
    
    **Refused downgrades (3)** (reinstall under the lower mode instead):
    - `systemd → screen`
    - `docker  → screen`
    - `docker  → systemd`
    
    **Changes**
    - New `lib/supervision.ts`: single source of truth for ranks,
      star counts, labels, taglines, and reboot survivability. Exports
      `canUpgradeTo(from, to)`, `upgradeTargetsFrom(from)`.
    - `lib/handlers/stoachain-convert-supervision.ts`: enforces
      `canUpgradeTo` at job start. Downgrade requests fail with a clear
      message before any state changes.
    - `pages/api/admin/nodes/[id]/stoachain/convert-supervision.ts`:
      fetches live supervision, validates upgrade, rejects downgrades at
      the API layer so a downgrade never even hits the worker.
    - `components/admin/NodeTabs.tsx`: replaces `SupervisionConverterCard`
      with `SupervisionCard`. Shows current mode with star rating, tagline
      ("Best - no upgrade needed" / "Upgrade recommended" / "Upgrade highly
      recommended"), and an explicit "Survives hardware reboot: yes / no"
      indicator. When the node isn't at the top, it shows an Upgrade button
      with a dropdown of valid targets. Placed at the top of the Control sub-tab.
    - Tone: docker green, systemd amber, screen red.
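    
    A minimal sketch of the upgrade-only rule, assuming a simple numeric rank
    per mode. The function names and the `docker > systemd > screen` ordering
    come from this entry; the `RANK` table and its values are illustrative,
    not the contents of `lib/supervision.ts`.
    
    ```typescript
    type SupervisionMode = "screen" | "systemd" | "docker";
    
    // Higher rank = better supervision (assumed representation).
    const RANK: { [mode in SupervisionMode]: number } = {
      screen: 1,
      systemd: 2,
      docker: 3,
    };
    
    const ALL_MODES: SupervisionMode[] = ["screen", "systemd", "docker"];
    
    // Only strictly upward moves are allowed; same-mode and downgrades are refused.
    function canUpgradeTo(from: SupervisionMode, to: SupervisionMode): boolean {
      return RANK[to] > RANK[from];
    }
    
    // Valid targets for the Upgrade dropdown, lowest rank first.
    function upgradeTargetsFrom(from: SupervisionMode): SupervisionMode[] {
      return ALL_MODES.filter((to) => canUpgradeTo(from, to));
    }
    ```
    
    With three modes this yields exactly the three allowed conversions and the
    three refused downgrades listed above.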
    
    **Auto-restart verification**
    - Docker: `renderDockerCompose` already emits `restart: unless-stopped`
      (verified `lib/stoachain-layout.ts:160`).
    - Systemd: unit template already has `Restart=on-failure` +
      `WantedBy=multi-user.target` + `systemctl enable` (verified
      `stoachain-convert-supervision.ts:430-445`).
    - Screen: no auto-restart (intentional; reinforces 1-star rating).
    
    **Deferred to later**
    - Install wizard 3-mode selector + binary-extract-from-image primitive.
      Today only docker installs are wired; systemd/screen exist through
      adoption of legacy nodes or manual bootstrap. Fresh systemd/screen
      installs are a future slice; every node currently in the network can
      already be upgraded along the hierarchy via this converter.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3o`.
  • v.Chaos.Jason.0-aa·v0.7.3n
    Closes gaps in supervision handling so every node-op works regardless of
    whether the node runs under screen / systemd / docker, and adds a
    first-class migration path between the three.
    
    **Seeds page: live backup-api detection**
    - `listManagedNodeStatus` (lib/seeds.ts) now fetches live flags alongside
      `/info`. The "Backup API" column in `/admin/seeds` no longer falls back
      to stored flags when the node's running argv has been edited out-of-band
      (node1 symptom before this fix).
    - Stored flags remain the fallback when the node is unreachable.
    
    **Unified logs endpoint**
    - New `GET /api/admin/nodes/[id]/stoachain/logs?lines=N` dispatches on
      detected supervision:
      - `docker`   → `docker logs --tail N stoa-node`
      - `systemd`  → `journalctl -u stoa-node.service --lines=N`
      - `screen`   → `tail -n N` of common runner log files (`/var/log/stoa-node.log`,
                    `/mnt/nvmedrive/StoaNodeData/chainweb.log`, etc.), or a
                    friendly "attach to the screen session" note when no log
                    file exists
    - Old `/docker-logs` route kept as a back-compat alias.
    - Peer-activity route (`/stoachain/peer-activity`) now uses the same
      supervision-aware source, so "Peer Activity" works for systemd + screen
      nodes too, not just docker.
    - New `NodeLogsCard` in NodeTabs replaces the docker-only
      `ContainerLogsCard` in the Control sub-tab. Title/source adapts:
      "Container logs" / "Service logs (journalctl)" / "Screen logs".
    
    **Flag Editor Apply+Restart: systemd support**
    - `stoachain-control` handler gained `rewriteSystemdWrapper`. When the
      user Applies flag changes on a systemd-supervised node, the handler
      inspects `systemctl cat stoa-node.service`, finds the wrapper script
      referenced by `ExecStart=`, and overwrites it with the output of
      `toRunnerScript(flags)` (base64 + tee, chmod 755). Then `daemon-reload`
      + `systemctl restart`.
    - Matches the existing docker-compose rewrite and screen runner-script
      rewrite paths; all three supervision modes now behave identically in
      the Flag Editor.
    
    **Any↔any supervision converter (NEW)**
    - New `lib/handlers/stoachain-convert-supervision.ts` migrates a node
      between any two supervision modes without losing chain data. Six
      conversions covered:
      - screen ↔ docker
      - screen ↔ systemd
      - docker ↔ systemd
    - 8-step pipeline: detect current → load flags → preflight target
      prerequisites → stop current → prepare new mode layout → start under
      new mode → verify live `/info` → update stored state.
    - Docker target: rearranges into canonical `<stoaRoot>/{chainweb, data, tls}`
      layout, renders compose.yml via `renderDockerCompose`, mounts through
      to container-internal paths (`/data`, `/data/tls-cert.pem`).
    - Systemd target: writes `/usr/local/bin/run-stoa.sh` wrapper +
      `/etc/systemd/system/stoa-node.service` unit, daemon-reload + enable.
    - Screen target: writes `RunStoaNode.managed.sh` next to the data dir.
    - Auto-rollback attempts to restart under the old mode if the
      conversion fails after stop (not guaranteed: old-mode artifacts may
      already be overwritten when rearranging into docker layout).
    - New API `POST /api/admin/nodes/[id]/stoachain/convert-supervision`
      with body `{toMode: 'docker' | 'systemd' | 'screen'}`. Ancient admin +
      fresh-confirm required.
    - **UI** new `SupervisionConverterCard` on Chainweb → Control sub-tab:
      dropdown of available target modes, destructive confirmation dialog,
      redirects to job log on submit.
    
    **Registry**
    - `lib/handlers/registry.ts` now registers 14 handler kinds (added
      `stoachain-convert-supervision`).
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3n`.
  • v.Chaos.Jason.0-ab·v0.7.3m
    Three operator-requested items.
    
    **Drive benchmark (empirical classification)**
    - New `lib/handlers/drive-benchmark.ts`: `dd`-based sequential write + read
      test against the node's data-dir filesystem. 512 MB default, `conv=fdatasync`
      + `oflag=dsync` to bypass cache. Caches dropped before read (via
      `/proc/sys/vm/drop_caches`).
    - Classifies by measured write throughput:
      - ≥ 500 MB/s → `nvme`
      - ≥ 150 MB/s → `ssd`
      - ≥ 50 MB/s → `hdd`
      - < 50 MB/s → `slow` (red warning; check for virtualized/network storage)
    - Persists via inline ALTER TABLE (additive cols `drive_bench_*` on `nodes`).
    - New API `POST /api/admin/nodes/[id]/drive-benchmark`: ancient admin +
      fresh-confirm.
    - Sudoers template updated: `/bin/dd`, `/bin/sh`, `/bin/sync` added.
    - **UI** on Chainweb → Status → Storage card: "Empirical benchmark"
      section alongside sysfs class. "Re-run benchmark" button. Shows
      write + read MB/s, measured timestamp, highlights mismatch between
      sysfs-heuristic and empirical-measured class.
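    
    The classification thresholds above can be sketched as a small pure
    function. The cut-offs and class names are the ones listed in this entry;
    the function name and the handling of non-finite input are assumptions.
    
    ```typescript
    type DriveClass = "nvme" | "ssd" | "hdd" | "slow";
    
    // Classifies a drive by measured sequential write throughput in MB/s.
    function classifyByWriteMbps(writeMbps: number): DriveClass {
      if (!isFinite(writeMbps)) return "slow"; // assumption: unusable reading counts as slow
      if (writeMbps >= 500) return "nvme";
      if (writeMbps >= 150) return "ssd";
      if (writeMbps >= 50) return "hdd";
      return "slow";
    }
    ```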
    
    **`backup-directory` locked in Flag Editor**
    - Added to `IMMUTABLE_FLAGS` client + server side. Chainweb auto-derives
      it to `<database-directory>/backups` when omitted; setting it elsewhere
      breaks RocksDB hardlink checkpointing.
    - Operators now only toggle `enable-backup-api`; the dir is always correct
      by default. Matches how node2 / node1 / AncientLinux all work in
      practice.
    
    **Node1 `--enable-backup-api` enabled (manual fix)**
    - Out-of-band: SSH'd into node1, edited `/usr/local/bin/run-stoa.sh` to
      add `--enable-backup-api`, reloaded via `systemctl restart
      stoa-node.service`. Verified with POST to `/make-backup` → returned
      backup id successfully.
    - Old runner script archived with `.TS.old` suffix.
    - Note: systemd-supervised nodes don't yet support Flag Editor's
      Apply+Restart path; that's v0.7.4+ work (unit-file rewriting).
      Manual fix for now; v0.7.4 ships proper support.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3m`.
  • v.Chaos.Jason.0-ac·v0.7.3l
    Filling in three automation gaps the user called out:
    
    **Certbot now detected in system-probe**
    - `lib/handlers/system-probe.ts`: new `SVC_CERTBOT` section captures
      binary version, `certbot.timer` enabled/active state, next scheduled
      run, list of installed deploy-hooks.
    - `SystemProbe.services.certbot`: surfaces in the probe output so the
      admin UI can show certbot alongside docker, nginx, etc. Install wizard's
      certbot install (added in v0.7.3i) is now visibly confirmed by probe.
    
    **Cert renewal deploy-hook**
    - `stoachain-certbot-obtain` now installs a per-node deploy-hook at
      `/etc/letsencrypt/renewal-hooks/deploy/stoa-<nodeId8>.sh`.
    - When `certbot.timer` renews the cert (~60 days from now, automated),
      the hook:
      1. Copies the renewed cert files into chainweb's TLS paths
      2. Fixes ownership + permissions
      3. Detects supervision (docker / systemd / screen) at hook run-time
         and restarts accordingly (docker compose up -d --force-recreate,
         systemctl restart, or bail with instructions for screen)
    - Previously: certbot renewed fine but the new cert never reached
      chainweb's in-memory copy, which would have been a silent time-bomb ~60
      days out.
    
    **Daily seed auto-refresh (scheduled)**
    - `worker/index.ts`: `maybeScheduleSeedRefresh()` runs on every main
      loop iteration (throttled to 15 min between checks). If:
      - auto-refresh isn't disabled (system_state flag)
      - current seed is >23h old (or missing entirely)
      - no seed-refresh job is already queued/running
      - there's an eligible donor
      → enqueues a `seed-refresh` job automatically. Runs under actor email
      `system:seed-auto-refresh` in the audit trail.
    - `pages/api/admin/seeds/auto-refresh.ts`: POST endpoint to toggle
      the scheduler on/off (fresh-confirm + ancient-admin).
    - Admin UI on `/admin/seeds` gets a new "Auto-refresh schedule" panel:
      green/gray Enabled/Disabled toggle, next ETA (based on current seed
      age + 23h), last enqueue timestamp + job id, last skip reason (e.g.
      "no donor available").
    
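    The four scheduling conditions above can be sketched as one pure decision function. This is only an illustration: the `SeedRefreshInputs` shape, the function name, and every skip-reason string other than "no donor available" are assumptions, not the worker's actual code.

```typescript
// Hypothetical input shape; field names are illustrative, not the hub's schema.
interface SeedRefreshInputs {
  autoRefreshDisabled: boolean; // the system_state flag
  seedCapturedAtMs: number | null; // null when no seed exists yet
  refreshJobActive: boolean; // a seed-refresh job is queued or running
  eligibleDonorCount: number;
}

type RefreshDecision =
  | { enqueue: true }
  | { enqueue: false; skipReason: string };

const MAX_SEED_AGE_MS = 23 * 60 * 60 * 1000;

function decideSeedRefresh(s: SeedRefreshInputs, nowMs: number): RefreshDecision {
  if (s.autoRefreshDisabled) {
    return { enqueue: false, skipReason: "auto-refresh disabled" };
  }
  // A missing seed counts as stale.
  const stale =
    s.seedCapturedAtMs === null || nowMs - s.seedCapturedAtMs > MAX_SEED_AGE_MS;
  if (!stale) return { enqueue: false, skipReason: "seed still fresh" };
  if (s.refreshJobActive) {
    return { enqueue: false, skipReason: "refresh already in flight" };
  }
  if (s.eligibleDonorCount === 0) {
    return { enqueue: false, skipReason: "no donor available" };
  }
  return { enqueue: true };
}
```

    Keeping the decision free of I/O would let the 15-minute throttle and the DB lookups live in the caller, while the skip reason can feed the admin panel directly.
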
    **Version bump**
    - `lib/version.ts` → `v0.7.3l`.
  • v.Chaos.Jason.0-ad·v0.7.3k
    Follow-on cleanup after v0.7.3j proved the LE flow works end-to-end.
    
    **Cert-doctor logic inverted (critical bugfix)**
    - `lib/cert-doctor.ts`: the old v0.7.3h logic said "CA-signed is bad,
      self-signed is good, certbot auto-renew breaks the network". Every
      claim was wrong, as verified today by restoring node2's LE cert and
      watching all three nodes resume syncing.
    - New logic:
      - `severity='healthy'` (green): CA-signed + certbot auto-renew active
      - `severity='warn'` (amber): CA-signed without auto-renewal configured
      - `severity='error'` (red): self-signed (broken on public Stoa P2P)
      - `severity='unknown'`: cert unreadable / ephemeral / missing
    - Messages rewritten to match reality.
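
    The corrected mapping is small enough to show as a table-like function. A minimal sketch, assuming the cert kind and auto-renew state have already been detected; the type and function names are illustrative and not taken from `lib/cert-doctor.ts`.

```typescript
type CertKind = "self-signed" | "ca-signed" | "unknown";
type Severity = "healthy" | "warn" | "error" | "unknown";

// Maps the detected cert kind + auto-renew state to the post-v0.7.3k severity.
function certSeverity(kind: CertKind, autoRenewActive: boolean): Severity {
  switch (kind) {
    case "ca-signed":
      // CA-signed is the working case; auto-renew decides green vs amber.
      return autoRenewActive ? "healthy" : "warn";
    case "self-signed":
      // Rejected by chainweb P2P on the public network.
      return "error";
    default:
      // Unreadable, ephemeral, or missing cert.
      return "unknown";
  }
}
```
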
    
    **Identity card: positive confirmation when healthy**
    - Green banner appears when TLS is set up right (LE + certbot timer).
      *"Peer trust is unaffected by renewals because chainweb validates
      via CA chain, not fingerprint pinning."*
    - Amber banner when LE cert but no auto-renew.
    - Red banner stays only for self-signed, the actually-broken case.
    
    **Sync progress indicator on Status card**
    - New KVs: **Target (tallest peer)** + **Sync progress**.
    - Target = max cut height across all managed nodes the hub has
      live-probed (parallel SSH, O(slowest node)). Null if this node is
      the tallest.
    - Progress shown as permil with 3 decimals (e.g. `998.234 ‰`), colored:
      - green `≥ 999 ‰`
      - gold `≥ 950 ‰`
      - amber `< 950 ‰`
    - Shows "N blocks behind" when delta > 0; "at tip" when caught up.
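
    As a worked example of the permil display, the sketch below computes the three values the Status card shows. The function and field names are hypothetical; only the thresholds, the 3-decimal format, and the two status strings come from the notes above.

```typescript
type ProgressColor = "green" | "gold" | "amber";

interface SyncProgress {
  permil: string; // formatted with 3 decimals, e.g. "998.234 ‰"
  color: ProgressColor;
  status: string; // "N blocks behind" or "at tip"
}

// `target` is null when this node is itself the tallest managed node.
function describeSyncProgress(height: number, target: number | null): SyncProgress {
  const tip = target === null ? height : Math.max(target, height);
  // Multiply before dividing to keep the arithmetic on integers as long as possible.
  const permilValue = tip > 0 ? (height * 1000) / tip : 0;
  const color: ProgressColor =
    permilValue >= 999 ? "green" : permilValue >= 950 ? "gold" : "amber";
  const behind = tip - height;
  return {
    permil: `${permilValue.toFixed(3)} ‰`,
    color,
    status: behind > 0 ? `${behind} blocks behind` : "at tip",
  };
}
```

    For a node at height 998,234 against a target of 1,000,000 this yields `998.234 ‰`, gold, and "1766 blocks behind".
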
    
    **certbot handler: auto-resolves docker host paths**
    - `stoachain-certbot-obtain`: if the node's stored runner_path is a
      `docker-compose.yml`, the handler derives `<stoaRoot>/tls/...` host
      paths from it automatically. No more manual `certPath` + `keyPath`
      in the API payload for docker-supervised nodes.
    
    **Add Node UI: docker-default signaling**
    - "Easy setup (with password)" → **"Easy setup · docker"** + green
      "recommended" badge.
    - "Advanced (paste private key)" → **"Advanced · existing install"**.
    - Explanatory line under the tabs: *"Docker supervision gives each node
      a self-contained environment … what the hub recommends for any new
      install."*
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3k`.
  • v.Chaos.Jason.0-ae·v0.7.3j
    Found three bugs in the v0.7.3i certbot handler while actually running
    it against AncientLinux:
    
    **1. Silent apt install "success"**
    - `sudo -n apt-get install -y certbot 2>&1 | tail -5`: tail's exit code
      masks apt-get's failure. Handler thought certbot was installed; it
      wasn't.
    - Fixed: wrap in `set -o pipefail`, then verify post-install with
      `command -v certbot`.
    
    **2. DEBIAN_FRONTEND=noninteractive rejected by sudo**
    - sudo's `env_reset` strips non-whitelisted env vars. Setting
      `DEBIAN_FRONTEND` in the sudo command failed: *"you are not allowed
      to set the following environment variables"*.
    - Fixed: dropped the env var. apt-get install is fine without it.
    
    **3. DNS-01 hook scripts didn't land on disk**
    - `echo ${JSON.stringify(script)} | tee ...` corrupted escape sequences.
      The `\n` in the script source became literal `\n`, not a newline. The
      file ended up as one giant first line that `sh` couldn't parse →
      certbot reported `/bin/sh: 1: /etc/letsencrypt/duckdns-hooks/auth.sh:
      not found`, even though the file existed.
    - Fixed: base64-encode the script content, decode on the remote side
      via `base64 -d | tee`. Same pattern used by `writeManagedRunner` etc.
    - Added sanity check: `test -x && test -s` on the written file before
      invoking certbot.
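
    The base64 fix can be illustrated as follows. This is a sketch under assumptions: `buildWriteScriptCommand`, `buildSanityCheckCommand`, and `shellQuote` are hypothetical helper names, and the real handler may add `sudo` or set permissions differently.

```typescript
// Quote a path for a POSIX shell: wrap in single quotes, escape embedded ones.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Build a one-line remote command that recreates `script` byte-for-byte.
// The base64 alphabet contains no quotes, backslashes, or newlines, so the
// remote shell has nothing to reinterpret.
function buildWriteScriptCommand(script: string, remotePath: string): string {
  const payload = Buffer.from(script, "utf8").toString("base64");
  const path = shellQuote(remotePath);
  return `echo ${payload} | base64 -d | tee ${path} > /dev/null`;
}

// Mirrors the sanity check from the notes: file is executable and non-empty.
function buildSanityCheckCommand(remotePath: string): string {
  const path = shellQuote(remotePath);
  return `test -x ${path} && test -s ${path}`;
}
```

    Because the payload never contains a newline, the command stays a single line no matter how many lines the hook script has.
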
    
    **UX cleanup**
    - **Deleted** `CertRotateButton` / the self-signed rotate UI entirely.
      Per feedback: *"just remove it all together... since its only noise
      now."* The old `stoachain-cert-rotate` handler stays registered for
      API-level compatibility but has no UI surface anymore.
    - **Renamed** "Obtain Let's Encrypt cert (recommended)" → simply
      **"Install TLS cert"**. No "recommended" hedge: LE is the only way
      chainweb P2P works on public Stoa.
    - **Auto-detected challenge** from the hostname: `.duckdns.org` →
      DNS-01, everything else → HTTP-01. Removed the challenge dropdown.
      Operator still needs to provide a DuckDNS token for NAT'd nodes; the
      field auto-reveals only when DNS-01 is the auto-choice.
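
    The hostname rule reduces to a suffix check. A minimal sketch; the function name and the normalisation (trim, lowercase, trailing dot) are assumptions rather than the handler's confirmed behaviour.

```typescript
type Challenge = "dns-01" | "http-01";

// DuckDNS hostnames get DNS-01; everything else falls back to HTTP-01.
function detectChallenge(hostname: string): Challenge {
  // Normalise case and a trailing root dot before matching the suffix.
  const host = hostname.trim().toLowerCase().replace(/\.$/, "");
  return host.endsWith(".duckdns.org") ? "dns-01" : "http-01";
}
```

    Matching on the suffix rather than a substring keeps a name like `duckdns.org.example.com` on HTTP-01.
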
    
    **On "ancientholdings as its own CA" (future work)**
    - Honest answer logged: technically possible (~weeks of work), but
      creates network-splitting effect with operators who already trust LE.
      Deferred to Phase 3+ as a consortium-CA option; main Stoa network
      stays on LE.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3j`.
  • v.Chaos.Jason.0-af·v0.7.3i
    **Root cause finally verified**: chainweb-node's P2P TLS validates
    against the standard system CA bundle. Self-signed certs are rejected
    with `HandshakeFailed "certificate has unknown CA"`. Let's Encrypt
    certs (CA-signed) work fine; the original node1 + node2 setup used LE
    for exactly this reason.
    
    **The hub's old cert-rotate generated self-signed certs → broken for
    real chainweb use.** Rotating node2 twice today (P-384 and P-256, both
    self-signed) broke peer sync each time. **Restoring node2's original LE
    cert from `/etc/letsencrypt/live/` immediately fixed sync network-wide**,
    confirmed by: node2 1,622,533 → 1,624,038 in minutes; node1 unstuck
    from 1,621,032 → 1,621,330; AncientLinux → 1,623,530.
    
    **New handler: `stoachain-certbot-obtain`**
    - `lib/handlers/stoachain-certbot-obtain.ts`:
      1. Installs certbot via apt if missing.
      2. Archives existing cert+key with `.TS.old` suffix.
      3. Runs certbot:
         - HTTP-01 (`--standalone`): certbot binds :80; nginx is briefly
           stopped if active; chainweb keeps running.
         - DNS-01 via DuckDNS: writes a small auth-hook script that updates
           a TXT record via DuckDNS's API (works for NAT'd nodes like
           AncientLinux on `bytales.duckdns.org`).
      4. Copies `fullchain.pem` + `privkey.pem` from
         `/etc/letsencrypt/live/<domain>/` to chainweb's configured paths.
      5. chown to the chainweb user, chmod 600 on the key.
      6. Optionally auto-restarts chainweb-node to load the new cert.
    
    **Bootstrap: certbot now installed alongside docker**
    - `lib/nodes.ts`: `prepareTarget` script now installs certbot via apt/dnf/yum.
    - Canonical sudoers list gains `/usr/bin/certbot`, `/usr/bin/apt-get`,
      `/usr/bin/cp`, `/bin/cp`. `sudoers-repair` endpoint updated to match.
    
    **UI: Identity card**
    - Primary action is now **"Obtain Let's Encrypt cert"** with challenge
      method dropdown (HTTP-01 / DNS-01-DuckDNS), ACME email field, DuckDNS
      token field (revealed when DNS-01 selected), auto-restart checkbox.
    - The old **"Rotate (self-signed)"** button is tucked behind an
      `Advanced` expand arrow with a warning that self-signed certs are
      rejected by chainweb P2P on public networks.
    
    **API**
    - `POST /api/admin/nodes/[id]/stoachain/certbot-obtain`: fresh-confirm +
      ancient-admin. Body accepts `domain`, `email`, `challenge`,
      `duckdnsToken`, `restart`.
    
    **Node2 cert restored out-of-band** via SSH; see above for heights
    proving the fix. Next: user can obtain LE certs for AncientLinux (DNS-01
    via DuckDNS) through the new UI action.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3i`.
  • v.Chaos.Jason.0-ag·v0.7.3h
    **Systemd supervision in stoachain-control**
    - `lib/handlers/stoachain-control.ts`: new `systemd` branch alongside
      existing docker + screen paths. Resolves the unit name (prefers
      `stoa-node.service`, falls back to any active `stoa*` / `chainweb*`
      unit) and dispatches `systemctl start|stop|restart <unit>`.
    - `waitForChainweb` reused for post-start liveness check.
    - `detectSupervisionLive` now recognizes systemd (between docker and
      screen in priority).
    - Limitation documented: flag edits in the Flag Editor don't yet apply
      to systemd-supervised nodes because the handler doesn't rewrite the
      unit file's `ExecStart` line. Restart/Start/Stop work. Flag-driven
      recomposes will land in v0.7.3i or later.
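
    The unit-name resolution described above might look like the sketch below. It assumes the caller has already collected the active unit names from the target; both function names are hypothetical.

```typescript
type UnitAction = "start" | "stop" | "restart";

// `activeUnits` is assumed to be the list of active unit names on the target.
function resolveUnitName(activeUnits: string[]): string | null {
  const preferred = "stoa-node.service";
  if (activeUnits.includes(preferred)) return preferred;
  // Fall back to the first active stoa* / chainweb* unit, if any.
  const fallback = activeUnits.find(
    (unit) => unit.startsWith("stoa") || unit.startsWith("chainweb"),
  );
  return fallback ?? null;
}

function buildSystemctlCommand(action: UnitAction, unit: string): string {
  return `systemctl ${action} ${unit}`;
}
```
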
    
    **Cert-doctor**
    - New `lib/cert-doctor.ts`, which inspects a node's TLS setup beyond just
      "cert file exists":
      - Issuer CN vs Subject CN → classifies `self-signed` / `ca-signed` /
        `unknown`
      - Scans for `certbot.timer` systemd unit + cron entries mentioning
        `certbot` / `letsencrypt`
      - Extracts last-run / next-run timestamps from the timer
    - Status endpoint `/api/admin/nodes/[id]/stoachain/status` now returns
      a `certDoctor` section.
    
    **Identity card UI**
    - New red warning banner at the top of the Identity card when:
      - Cert is CA-signed (issuer ≠ subject)
      - Certbot auto-renew is active (timer or cron)
    - Warning explicitly lists the class of problem and suggests rotation
      to self-signed ECDSA.
    - New `Issuer` KV row shows the issuer CN + cert-kind tag (`self-signed`
      / `ca-signed`). Red styling when ca-signed.
    
    **Why this lands as one feature**
    - StoaNodeOne was just added to the hub: systemd-supervised, Let's
      Encrypt cert with active certbot timer. Surfacing both problems
      (can't control via hub; cert will periodically rotate) in one release
      so operators see the full picture.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3h`.
  • v.Chaos.Jason.0-ah·v0.7.3g
    Response to the feedback *"all of these manual help-ups, in production
    you are not there to fix shiet"*: every manual fix I've done during
    this session is now exposed as a UI action the operator can trigger
    themselves.
    
    **Peer activity card + auto-detection banner**
    - New `lib/peer-activity.ts`: parses chainweb-node's docker logs into
      per-peer summaries (error count, last success/error, dominant failure
      tag: `unknown-ca`, `timeout`, `conn-refused`, etc.).
    - New API `GET /api/admin/nodes/[id]/stoachain/peer-activity?minutes=N`:
      SSHes to target, pulls last N minutes of container logs, returns
      events + summaries + **auto-detected issues** with suggested actions.
    - New `PeerActivityCard` on Chainweb → Status sub-tab. Polls every 15s.
      Shows per-peer table. If the node isn't syncing AND a dominant tag
      of `unknown-ca` is detected → **red banner with "Reset peer trust"
      one-click button**.
    
    **Reset peer trust (self-service)**
    - New handler `peer-trust-reset` (`lib/handlers/peer-trust-reset.ts`):
      composes `seed-refresh` + `stoachain-reseed` in one job. Refreshes
      seed from a healthy donor (excludes the target), then reseeds the
      target. As a side effect, the target's peer-DB is replaced with the
      donor's current view, which clears stale fingerprints.
    - Honest caveat documented in the handler: **this is a pragmatic proxy**
      for a surgical peer-DB wipe. True surgical would be a rocksdb key
      prefix delete; building that requires chainweb source reading we
      haven't done.
    - New API `POST /api/admin/nodes/[id]/stoachain/peer-trust-reset`.
      Fresh-confirm + ancient admin.
    
    **Cert-rotate now generates ECDSA P-256** (was P-384)
    - `lib/handlers/stoachain-cert-rotate.ts`: switched curve to P-256 with
      SHA-384 signatures. Matches the original working Stoa cert (the one
      AncientLinux trusted pre-incident); evidence suggests P-384 may be
      rejected by some chainweb-node builds. 128-bit security is still
      ample for P2P identity; cert generation is faster.
    
    **Force-fail stuck job**
    - New API `POST /api/admin/jobs/[id]/force-fail` + button on
      `/admin/jobs/[id]`. Marks a `running` or `queued` job as failed in
      the DB immediately. Operator escape hatch for jobs that never complete
      due to worker bugs or external issues (ssh2 half-open channels, dead
      remote processes). Warns that side effects the handler was in the
      middle of may persist.
    
    **Sudoers repair**
    - New API `POST /api/admin/nodes/[id]/sudoers-repair` + new
      `SudoersRepairCard` on Chainweb → Control sub-tab. One-click rewrites
      `/etc/sudoers.d/ancientholdings-stoa` with the current canonical
      NOPASSWD command list. Idempotent. Uses existing `tee` NOPASSWD grant
      so no password prompt.
    - Fixes the pre-v0.7.3d installs that didn't include `tar`/`df`/`du`/
      `find` in sudoers (AncientLinux, Node2).
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3g`.
    
    **To test**: if AncientLinux is still blocked by TLS, click Rotate on node2
    again (will get P-256 this time); if sync resumes, curve was the
    issue. If not, click "Reset peer trust" on AncientLinux; it will take
    ~20 min but should fully clear any stale-fingerprint pinning.
  • v.Chaos.Jason.0-ai·v0.7.3f
    v0.7.3e proved the full reseed pipeline works end-to-end: AncientLinux
    jumped from cut height ~79,000 to ~1,621,032 (StoaNodeTwo's height at
    seed-capture time) in minutes, as designed. But the handler parked in
    `running` state at 90% afterward because of an ssh2 quirk.
    
    **Root cause**: when the remote `tar -xz` exits cleanly after consuming
    all stdin, ssh2 sometimes emits only the `exit` event and NEVER the
    `close` event. The handler was waiting on `close` to settle the
    promise, so it parked indefinitely even though tar finished + data was
    correct.
    
    **Fix**: settle on whichever of `exit`, `close`, or a post-EOF 60s
    timer fires first. `plaintextTarGz.pipe(stream).on('end')` triggers
    the timer as a belt-and-suspenders. Either:
    - `exit` fires with the tar exit code → settle immediately based on it
    - `close` fires without exit → settle based on stderr (empty = success)
    - post-EOF 60s passes without either → settle based on stderr
    
    All three paths guarantee deterministic resolution. No more 20-min
    wait-on-timeout after successful reseeds.
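
    The first-trigger-wins behaviour can be sketched as a small gate object. This is an illustration, not the handler's code: `makeSettleGate`, `Outcome`, and the callback names are hypothetical, and the 60s timer itself is left to the caller.

```typescript
type Outcome = { ok: boolean; via: "exit" | "close" | "timeout"; detail: string };

// Returns handlers for each trigger; only the first one to fire reports an outcome.
function makeSettleGate(onSettle: (outcome: Outcome) => void) {
  let settled = false;
  let stderr = "";

  const settle = (outcome: Outcome): void => {
    if (settled) return; // later triggers are ignored
    settled = true;
    onSettle(outcome);
  };

  // Without an exit code, empty stderr is treated as success.
  const settleFromStderr = (via: "close" | "timeout"): void => {
    const text = stderr.trim();
    settle({ ok: text === "", via, detail: text });
  };

  return {
    onStderr: (chunk: string): void => {
      stderr += chunk;
    },
    onExit: (code: number | null): void =>
      settle({ ok: code === 0, via: "exit", detail: `exit code ${code}` }),
    onClose: (): void => settleFromStderr("close"),
    onPostEofTimeout: (): void => settleFromStderr("timeout"),
  };
}
```

    In a handler, the gate's methods would be attached to the SSH channel's `exit` and `close` events, with `setTimeout(gate.onPostEofTimeout, 60_000)` started once the local pipe has ended.
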
    
    **Recovery of the stuck job from v0.7.3e test**:
    - Job `efabc16d-…` was stuck at 90% running. I killed the worker,
      manually ran the remaining handler steps (mv staging → data, rm
      data.old, docker compose up -d), and marked the job succeeded in
      the DB so the UI reflects reality.
    - AncientLinux verified running off the seed at height 1,621,032.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3f`.
  • v.Chaos.Jason.0-aj·v0.7.3e
    v0.7.3d passed the sudoers preflight but failed during extraction with
    `gzip: stdin: not in gzip format`, and the node got stranded again
    (container stopped, data moved aside, extract dead). Two bugs:
    
    **1. First chunks of the decrypted stream disappeared**
    - `lib/handlers/stoachain-reseed.ts`: the progress-tracking `data`
      listener was attached to `plaintextTarGz` BEFORE `.pipe()` was set up.
      Adding a data listener puts a Node Readable into flowing mode
      immediately; during the `await` gap before the SSH pipe attached, the
      first chunks flowed into only the counter (no pipe yet) and vanished.
      The remote tar received bytes starting mid-gzip → "not in gzip format".
    - Fix: replace the separate `data` listener with an inline `Transform`
      in the pipe chain. Every byte passes through counter → pipe → SSH
      stdin, no losses.
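
    A counting `Transform` of the kind described might look like this. It is a sketch using Node's standard `node:stream` module; `createByteCounter` and its return shape are hypothetical names, not the handler's.

```typescript
import { Transform } from "node:stream";

// A pass-through stage that counts bytes without consuming them.
function createByteCounter(onProgress?: (totalBytes: number) => void) {
  let total = 0;
  const stream = new Transform({
    transform(chunk: Buffer, _encoding, callback) {
      total += chunk.length;
      onProgress?.(total);
      callback(null, chunk); // forward the chunk unchanged
    },
  });
  return { stream, bytesSeen: () => total };
}
```

    Used as `plaintextTarGz.pipe(counter.stream).pipe(sshStdin)` (where `sshStdin` stands in for the SSH channel), the source only starts flowing once the whole chain is attached, so there is no window in which a chunk reaches the counter but not the remote `tar`.
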
    
    **2. Failed reseed stranded the node**
    - When extract fails, the handler had already stopped the node + moved
      data aside. No rollback meant the operator had to SSH in and move
      things back manually.
    - New `rollbackAfterExtractFailure()` helper fires on any extract throw:
      remove the (partial) staging dir, `mv data.old.<ts>` back to live, and
      `docker compose up -d` (for docker supervision). Best-effort: each
      step is try/catch, any rollback failure is logged but doesn't mask the
      original extract error.
    - Tight-disk mode (no data.old kept) skips the restore step with a clear
      log line; the operator must reseed or sync from genesis.
    
    **3. Broader stderr pattern matching**
    - `streamIntoTarExtract` now also settles immediately on:
      - `not in gzip format` / `unexpected end of file` → stream plumbing / corrupt archive
      - `error is not recoverable` / `child died with signal` → tar internal fatal
      - `no space left on device` → disk full during extract
    - Previously only sudo-denial patterns triggered early-settle; everything
      else waited for the close event that ssh2 sometimes doesn't emit.
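
    The pattern table could be expressed as below. The patterns are the ones quoted in these notes (the sudo-denial text comes from the v0.7.3d entry); the category names and `classifyFatalStderr` are illustrative assumptions.

```typescript
type FatalKind = "sudo-denied" | "corrupt-stream" | "tar-fatal" | "disk-full";

// Ordered list: the first matching pattern decides the category.
const FATAL_PATTERNS: Array<[RegExp, FatalKind]> = [
  [/a password is required/i, "sudo-denied"],
  [/not in gzip format|unexpected end of file/i, "corrupt-stream"],
  [/error is not recoverable|child died with signal/i, "tar-fatal"],
  [/no space left on device/i, "disk-full"],
];

// Returns the fatal category for accumulated stderr, or null to keep waiting.
function classifyFatalStderr(stderr: string): FatalKind | null {
  for (const [pattern, kind] of FATAL_PATTERNS) {
    if (pattern.test(stderr)) return kind;
  }
  return null;
}
```

    A non-null result is the signal to settle immediately instead of waiting for a `close` event that may never arrive.
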
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3e`.
  • v.Chaos.Jason.0-ak·v0.7.3d
    First reseed on AncientLinux hung: the target's sudoers (written by the
    install wizard) didn't include `tar`, so `sudo -n tar -xz` immediately
    hit "a password is required" and the SSH stream closed in a way the
    handler didn't catch. Job stuck at 22% "running" forever, and worse,
    by the time we noticed, the node was already stopped with its data moved
    aside.
    
    Three fixes:
    
    **1. Pre-flight `sudo -n tar` BEFORE stopping the node**
    - `lib/handlers/stoachain-reseed.ts`: `preflightSudoTar()` runs a
      harmless `sudo -n tar --version` as the first, non-destructive check.
      If sudo denies it, fail immediately with the exact sudoers line the
      operator needs. Node stays running, data stays put: a recoverable state.
    
    **2. Hang-safe stream plumbing**
    - Handler's `streamIntoTarExtract` promise now has a single `settle()`
      gate, and the error / exit / close / stderr-pattern triggers all
      settle through it deterministically. Sudo denial patterns in stderr settle
      immediately instead of waiting for the SSH `close` event that ssh2
      sometimes doesn't emit when the remote process dies before receiving
      any stdin bytes.
    - 20-minute belt-and-suspenders timeout: any state where the SSH
      channel goes half-open without firing events still fails cleanly.
    
    **3. Install-template sudoers now includes tar + df + du + find**
    - `lib/nodes.ts`: bootstrap writes `tar`, `df`, `du`, `find` into the
      NOPASSWD list. Every NEW install gets the right sudoers.
    - **Existing installs (AncientLinux, Node2)** need a one-time sudoers
      patch; the handler's new preflight surfaces this as a clear error
      with the exact fix.
    
    **Recovery on the stuck install**
    - Stuck job `86285db2-…` marked failed in the DB manually (it was
      never going to complete on its own).
    - AncientLinux's moved-aside data dir restored to `/home/StoaNode/data`;
      container brought back up; syncing resumed.
    - AncientLinux's sudoers patched directly via SSH to include the new
      entries.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3d`.
  • v.Chaos.Jason.0-al·v0.7.3c
    Consumer half of SC5 lands: a running-but-unsynced node can now jump
    near head by pulling the hub's current seed instead of waiting days to
    sync naturally. End-to-end streaming: hub decrypts the .ahbk in-memory,
    pipes plaintext tar.gz over SSH into a `tar -xz` on the target. No
    intermediate files anywhere; peak memory ≈ SSH channel buffer.
    
    **New handler: `stoachain-reseed`**
    - `lib/handlers/stoachain-reseed.ts` implements an 8-step pipeline:
      1. Preflight (current seed exists, target reachable, disk-space check)
      2. Detect supervision (docker / screen), resolve host data dir via
         docker inspect OR stored flags' `database-directory`
      3. Stop node (`docker compose down` OR `screen quit` + pkill)
      4. Move existing data aside: `mv data/ → data.old.<ts>/`
         (or rm up front in tight-disk mode)
      5. Stream decrypt from hub's `openArchiveStream()` → pipe into SSH
         `sudo tar -xz -C <data.staging>`
      6. Structural verify: `CURRENT` file present in staging
      7. Atomic `mv data.staging/ → data/`
      8. Restart node + delete `data.old/`
    - Handles the 700 GB case cleanly: decrypt + extract happen as one
      streaming pass; disk preflight requires 1.1× seed size free (or
      ~seed+existing in deleteOldFirst mode).
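
    One way to read the disk preflight is sketched below. The default-mode rule (1.1× seed size free) is from the notes; the tight-disk rule is an assumption, namely that the existing data counts as reclaimable because it is deleted before extraction. All names are hypothetical.

```typescript
interface DiskPreflightInputs {
  seedBytes: number; // size of the seed once extracted (assumed known)
  freeBytes: number; // free space on the target's data volume
  existingDataBytes: number; // size of the current chain data directory
  deleteOldFirst: boolean; // tight-disk mode
}

// True when the target has room for the extract under the chosen mode.
function hasRoomForReseed(d: DiskPreflightInputs): boolean {
  // ASSUMPTION: in tight-disk mode the old data is removed first, so its
  // bytes are added to the space available for the extract.
  const reclaimable = d.deleteOldFirst ? d.existingDataBytes : 0;
  // 1.1x margin in integer arithmetic: available * 10 >= seed * 11.
  return (d.freeBytes + reclaimable) * 10 >= d.seedBytes * 11;
}
```

    Comparing `available * 10` against `seed * 11` avoids the floating-point rounding that `seed * 1.1` would introduce at exact boundaries.
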
    
    **Disk-space UX**
    - Default mode keeps `data.old/` aside during extract, deletes on success
      β†’ peak disk ~2Γ— for minutes only, rollback-safe.
    - Tight-disk mode (operator-selected checkbox) deletes existing data
      BEFORE extract β†’ peak disk ~1Γ—, zero rollback. UI warns explicitly:
      "if extraction fails, you will have NO chain data."
    
    **UI: Reseed card on Chainweb → Control sub-tab**
    - `components/admin/NodeTabs.tsx`: new `ReseedCard` alongside
      ControlCard + RunnerCard. Loads current seed via
      `/api/admin/seeds`, shows donor / seed cut height / node cut height /
      blocks-skipped-forward preview.
    - Tight-disk checkbox. Confirm dialog explains destruction before
      enqueue. Fresh-confirm required.
    - Inline rollback/rewind warnings if seed height is behind node height.
    
    **API**
    - `POST /api/admin/nodes/[id]/stoachain/reseed`: fresh-confirm +
      ancient-admin. Body `{deleteOldFirst?: boolean}`. Rejects with 409 if
      no current seed exists on the hub.
    
    **Not in this release** (deferred to v0.7.3d):
    - Install wizard "seeded install" mode (for brand-new nodes, not
      reseed). Needs install-handler extension + wizard UI; a distinct code
      path.
    - Chainweb-node boot test of staging dir before swap. Adds ~60s per
      reseed and hasn't been needed for the common docker case; revisit if
      post-swap failures become a pattern.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3c` · `SC5 Seeded install — reseed pipeline`.
  • v.Chaos.Jason.0-am·v0.7.3b
    End-to-end test of v0.7.3a on localhost surfaced three UX gaps:
    
    1. **"No eligible donors"** even though Node2 was running with
       `--enable-backup-api`. The filter was reading `probe.chainweb.backupEnabled`
       from `system_probe_json`, which the probe doesn't populate; the field's
       under `probe.chainwebFlags` (and currently empty). Authoritative source
       for backup-api flag is actually `nodes.stoachain_flags_json` (set at
       install + every Start/Restart via the hub).
    
    2. **Only eligible donors shown.** When nothing was eligible, the admin had
       no way to see WHY each managed node was excluded.
    
    3. **No indication of in-flight refresh.** If the admin bounced away from
       the job log page, there was no way to tell from /admin/seeds that a
       new seed was being built.
    
    All three fixed.
    
    **Fix: donor detection from stored flags + live cut height**
    - `lib/seeds.ts`: `storedBackupApiEnabled()` reads the DB-stored flag
      profile (authoritative) instead of the probe.
    - New `listManagedNodeStatus()` surveys every managed node in parallel,
      SSHing each via `fetchLiveStatus` for cut height + reachability. O(slowest
      node) not O(sum). Returns eligibility status per node:
      - `eligible` / `eligible-rotation`: can donate (latter skipped by
        auto-pick; admin can still pick manually)
      - `no-backup-api`, `not-reachable`, `not-running`, `cut-too-low`,
        `unknown`: each with a human reason
    - `listDonorCandidates` + `pickDonor` converted to async; rebuilt on top of
      `listManagedNodeStatus`.
    
    **UI: full managed-nodes table + active refresh banner + download queue placeholder**
    - `/admin/seeds` gets three new sections:
      - **Active refresh banner** (top): shows when a `seed-refresh` job is
        `queued` or `running`, with live progress % + step label. Link to full
        job log. Polls every 5s.
      - **Managed nodes table**: every node the hub is managing, eligibility
        badge (✓ / ◷ / ◐ / ✗ / —), cut height, last-donated date + relative
        age. The "why not eligible" reason shows inline beneath the badge.
      - **Active downloads table**: structure in place, populated as [] until
        v0.7.3c ships the consumer streaming pipeline.
    - Refresh controls updated: auto-pick works even when only
      `eligible-rotation` nodes exist (explicit "recent donor — override"
      label). "⏳ refresh already in flight" notice blocks starting a second.
    
    **APIs**
    - `GET /api/admin/seeds` now returns `managedNodes`, `activeRefresh`, and
      `activeDownloads` in addition to `current` + `archives`.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3b`.
  • v.Chaos.Jason.0-an·v0.7.3a
    Start of phase SC5: the hub can now produce "seeds", i.e. promoted backup
    archives intended for serving to new / unsynced nodes so they skip
    weeks of syncing-from-genesis. This release only covers the PRODUCER
    side (hub making + promoting seeds); consumer side (new installs +
    reseed consuming a seed) lands in v0.7.3b.
    
    **Schema**
    - `db/migrations/015_seed_archives.sql`: adds `seed_archives` (metadata
      for promoted backups; one 'current' row at a time) and
      `seed_downloads` (future queue/progress for clients streaming the
      seed down; consumed in 0.7.3b).
    - A seed row references a `backups.id`; the archive file itself lives
      where the backups system put it (`data/backups/<id>.ahbk`). No
      duplicated bytes on the hub.
    
    **New library**
    - `lib/seeds.ts`: CRUD for seed_archives, atomic `promoteBackupToSeed`
      (previous current → archived in one SQLite tx, new insert), and
      `pickDonor` / `listDonorCandidates` with health filters:
      - node must have chainweb-node running per latest probe
      - `--enable-backup-api` enabled
      - cut height within 5% of the tallest candidate (proxy for "synced")
      - not donor in the last 3 days (rotation; skipped when only one
        candidate is available)
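
    The health filters above can be sketched as two pure functions. The field names are hypothetical, and picking the tallest rested candidate is an assumed tie-break; the source only specifies the filters themselves.

```typescript
interface DonorNode {
  nodeId: string;
  running: boolean; // chainweb-node running per latest probe
  backupApiEnabled: boolean; // --enable-backup-api
  cutHeight: number;
  lastDonatedAtMs: number | null; // null if it never donated
}

const ROTATION_WINDOW_MS = 3 * 24 * 60 * 60 * 1000;

function listDonorCandidates(nodes: DonorNode[]): DonorNode[] {
  const healthy = nodes.filter((n) => n.running && n.backupApiEnabled);
  const tallest = healthy.reduce((max, n) => Math.max(max, n.cutHeight), 0);
  // "Within 5% of the tallest candidate", in integer arithmetic.
  return healthy.filter((n) => n.cutHeight * 100 >= tallest * 95);
}

function pickDonor(nodes: DonorNode[], nowMs: number): DonorNode | null {
  const candidates = listDonorCandidates(nodes);
  // Rotation is skipped when only one candidate is available.
  if (candidates.length <= 1) return candidates[0] ?? null;
  const rested = candidates.filter(
    (n) => n.lastDonatedAtMs === null || nowMs - n.lastDonatedAtMs >= ROTATION_WINDOW_MS,
  );
  if (rested.length === 0) return null;
  // ASSUMPTION: among rested candidates, prefer the tallest.
  return rested.reduce((best, n) => (n.cutHeight > best.cutHeight ? n : best));
}
```
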
    
    **New handler**
    - `lib/handlers/seed-refresh.ts` implements the full 4-step flow:
      1. Pick donor (explicit or auto-rotated)
      2. Capture donor live status (cut height + chainweb-node version for
         the seed manifest)
      3. Run `stoachainBackupHandler` as a direct function call (same code
         path as manual customer backups, same encryption)
      4. Promote the resulting `backups` row as the new current seed;
         previous current → archived
    - Registered as kind `seed-refresh`.
    
    **Admin panel**
    - New page `/admin/seeds`: current seed card (donor, cut height, size,
      sha256, age), eligible-donors picker, one-click "Refresh seed now"
      button (fresh-confirm + ancient-admin), and a history table of past
      promotions.
    - Link added from `/admin/`.
    
    **APIs**
    - `GET  /api/admin/seeds`: read-only (plain admin auth); returns
      current + archives + donor candidates.
    - `POST /api/admin/seeds/refresh`: enqueues a `seed-refresh` job;
      fresh-confirm + ancient-admin; body `{donorNodeId?: string}`.
    
    **Not in this release** (lands in 0.7.3b):
    - Install wizard "seeded install" mode
    - `stoachain-reseed` handler for existing nodes (stop → download →
      verify → swap → start)
    - Streaming download + extract pipeline (tar.zst + secretstream → staging
      dir → boot test → atomic rename)
    - Disk-space preflight with keep-old vs delete-old UX
    
    **Version bump**
    - `lib/version.ts` → `v0.7.3a` · `SC5 Seeded install — producer side`.
  • v.Chaos.Jason.0-ao·v0.7.2d
    Restart on AncientLinux actually worked: chainweb-node came up inside
    the recreated container with the new `p2p-hostname=bytales.duckdns.org`
    and `cluster-id=AncientMiner`, syncing cuts from node1 at height 70490.
    But the hub's job reported "failed" because `waitForContainerChainweb`
    couldn't detect the process.
    
    Root cause: `docker top stoa-node -eo comm` fails on this Docker /
    kernel combo with `"Couldn't find PID field in ps output"`. The custom
    `-eo comm` ps-options syntax isn't universally supported.
    
    Fix: drop the custom ps format. Use the default `docker top` output
    (which includes the full CMD with argv in the rightmost column) and
    grep for the substring `chainweb-node`. Works whether the entrypoint
    execs `/chainweb/chainweb-node` (current image) or any wrapper, and
    doesn't rely on a specific ps format.
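
    The substring check on default `docker top` output could be as small as the sketch below. The function name is hypothetical, and skipping the first row as a header is an assumption about the output layout.

```typescript
// Returns true when any process row in `docker top` output mentions chainweb-node.
function containerRunsChainweb(dockerTopOutput: string): boolean {
  const rows = dockerTopOutput.split("\n").slice(1); // drop the header row
  return rows.some((row) => row.includes("chainweb-node"));
}
```

    Matching the substring anywhere in the row is what makes the check indifferent to column layout and to wrapper entrypoints.
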
    
    Also: backfilled the AncientLinux node row's
    `stoachain_last_action=restart` + `stoachain_runner_path` to
    `/home/StoaNode/chainweb/docker-compose.yml` so the UI reflects the
    de-facto successful restart from the previous job attempt.
    
    Known quirk surfaced by the logs: home node can't sync peers from
    node2.stoachain.com due to `certificate has unknown CA`; a backlog item,
    not new. node1 syncing works fine.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.2d`.
  • v.Chaos.Jason.0-ap·v0.7.2c
    v0.7.2b's docker branch was technically correct but didn't run: the worker
    had booted before the code change and kept running the old screen-only
    handler. When the user hit Apply + Restart on AncientLinux (docker-
    supervised), the stale handler ran `screen -X quit` (no-op), then `pkill
    -TERM chainweb-node`, which killed the chainweb-node process INSIDE the
    container (PID-visible on the host), then tried to write the runner to
    `/data/RunStoaNode.managed.sh` (the container-internal data dir path,
    which doesn't exist on the host). Job failed. Container's restart policy
    (`restart: unless-stopped`) brought chainweb-node back up.
    
    Three preventive changes so this can't recur.
    
    **Worker logs its VERSION on boot**
    - `worker/index.ts`: import `VERSION`, `PHASE_CODE`, `PHASE_NAME` from
      `lib/version.ts` and banner them on startup. Operators can now tell at
      a glance (tmux scrollback, PM2 logs) whether the worker process is
      running current code after a patch. Every suffix-bump (`v0.7.2b → c`)
      changes the banner text.
    
    **Screen-path stopNode refuses to pkill when a stoa-node container is running**
    - `lib/handlers/stoachain-control.ts`: before the screen quit + SIGTERM
      sequence, the handler checks `docker ps --filter name=stoa-node
      --filter status=running`. If a live stoa-node container is found, it
      throws a clear error asking the operator to restart the worker. The
      supervision branch at the top of the handler should have caught this
      earlier, so the only way the screen path ever reaches a docker node is
      stale worker code.
    
    **CLAUDE.md documents `npm run worker:watch`**
    - The `package.json` already had `worker:watch` using `tsx watch`, which
      auto-reloads on every `.ts` change. `CLAUDE.md` now recommends it as
      the dev default; the plain `npm run worker` only makes sense when
      you're debugging the worker itself and don't want auto-restart.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.2c`. The suffix ticks on every patch from now on
      (`a`, `b`, `c`, …), as requested: the live badge shows it, the
      worker banner shows it, and the changelog cross-references it.
  • v.Chaos.Jason.0-aq·v0.7.2b
    Follow-on from v0.7.2a after the user pointed out that the RunnerCard told
    the same screen-based story regardless of supervision mode, and noticed
    the description was wrong for the docker-supervised node (AncientLinux).
    Turned out that wasn't just bad copy: the `stoachain-control` handler was
    entirely screen-only. Clicking Apply + Restart on a docker-supervised node
    would have tried `screen -dmS StoaNode` against a container, which is nonsense.
    
    **`stoachain-control` is now supervision-aware**
    - `lib/handlers/stoachain-control.ts`: live supervision detection at the
      start of every run (`docker ps -a --filter name=stoa-node`, then
      `screen -ls`). The handler dispatches to a docker branch or the
      original screen branch.
    - Docker Restart: `docker inspect stoa-node` to find
      `com.docker.compose.project.working_dir` + current image tag →
      `computeLayout()` from that dir → `renderDockerCompose(layout, imageTag,
      flags)` → write `docker-compose.yml` over SSH → `docker compose up -d
      --force-recreate`. `--force-recreate` ensures new env vars take effect
      even when the image tag hasn't changed. Waits for `chainweb-node` to
      appear in `docker top` output, up to 4 minutes.
    - Docker Stop: `cd <composeDir> && docker compose down`. The container
      is removed; next Start recreates from the compose file.
    - Docker Start: same as Restart, but only runs the up/wait phase.
    - `stoachain_runner_path` for docker nodes stores the compose file path
      (that's the "hub-rewritten-on-every-Restart" thing for docker), so the
      status endpoint continues to identify docker-managed nodes the same
      way it did before.
    - Compose dir not found (container `docker rm`'d manually) → clear
      error message asking the operator to re-run Install.
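
    The detection order described above (docker first, then screen) can be sketched as a pure function over the two probes' stdout. The `none` result, the function name, and the assumption that `docker ps` was run with `--format '{{.Names}}'` are illustrative only:

    ```typescript
    type Supervision = "docker" | "screen" | "none";

    // Hypothetical helper: both arguments are raw stdout captured over SSH.
    // `dockerPsNames` is assumed to hold one container name per line.
    function detectSupervision(dockerPsNames: string, screenLs: string): Supervision {
      const hasContainer = dockerPsNames
        .split("\n")
        .some((name) => name.trim() === "stoa-node");
      if (hasContainer) return "docker"; // docker wins when both are present

      // `screen -ls` lists sessions as "<pid>.<name>", usually tab-indented.
      const hasScreen = /^\s*\d+\.StoaNode\b/m.test(screenLs);
      return hasScreen ? "screen" : "none";
    }
    ```

    Checking docker first matters: a docker host also shows the container's chainweb-node in the host process list, so a screen-style check alone could misclassify it.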
    
    **`RunnerCard`: honest per-supervision copy**
    - `components/admin/NodeTabs.tsx`: split into `ScreenRunnerCard` and
      `DockerRunnerCard` with correct fields + accurate procedure writeups.
      Docker version surfaces: container name (`stoa-node`), the
      inside-container binary path (`/chainweb/chainweb-node`), the host-side
      compose path, and the three bind-mount pairs (data, cert, key). The
      "How Start / Restart works" steps describe the actual flow: inspect →
      renderDockerCompose → tee → up --force-recreate → poll docker top.
    - Screen version keeps its original writeup but re-titled "Runner +
      binary (screen)" for clarity, and clarifies that the legacy-runner
      rollback path is screen-specific.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.2b`.
    
    **Known caveat**
    - If Apply + Restart on a docker node is the first time the hub has
      operated on it, the stored flags profile is already there (the install
      wizard wrote it). But if someone imported a docker node without
      running the wizard (rare, since there is no UI path), `stoachain_flags_json`
      is empty, and the first Restart will fall through to the live-capture
      branch. The capture logic parses `ps -eo args`, which on a docker
      host includes the container's chainweb-node argv, so it should work.
  • v.Chaos.Jason.0-ar·v0.7.2a
    End-to-end test on the live site with the home node surfaced several
    usability issues addressed here.
    
    **Chainweb tab: sub-tab navigation**
    - `components/admin/NodeTabs.tsx`: the Chainweb tab's cards were stacked
      vertically and scrolled 3+ screens deep. Split into sub-tabs:
      `Status | Control | Flags | Identity | Backup`. URL hash format is
      `#chainweb/<sub>` so links are bookmarkable.
    - The Flags sub-tab has an inner toggle: `Edit config` (default) vs
      `Current (live)`. Read-only live-parsed view is always one click away.
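
    A small sketch of how a `#chainweb/<sub>` hash could be resolved to a sub-tab. The lowercase slugs and the function name are assumptions; the changelog only fixes the hash format and the tab labels:

    ```typescript
    // Assumed slugs, derived from the labels Status | Control | Flags | Identity | Backup.
    const SUB_TABS = ["status", "control", "flags", "identity", "backup"] as const;
    type SubTab = (typeof SUB_TABS)[number];

    // Returns the sub-tab named by the hash, or null when the hash is not a
    // recognised `#chainweb/<sub>` link.
    function parseChainwebHash(hash: string): SubTab | null {
      const match = /^#chainweb\/([a-z]+)$/.exec(hash);
      if (!match) return null;
      const sub = match[1];
      return SUB_TABS.some((t) => t === sub) ? (sub as SubTab) : null;
    }
    ```

    A caller would typically fall back to the first sub-tab when the result is `null`.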
    
    **Editor prefilled with live ("ghost") values**
    - Every input is now pre-filled with the value chainweb-node is actually
      running. The operator changes only what they want (e.g. `p2p-hostname`
      from `ancientminer.home` to `bytales.duckdns.org`) and the rest stays
      put.
    - Previously the editor seeded from the stored profile JSON, which for
      newly-added nodes was empty: every input looked unset even though the
      node had 35+ live flags. Moot for nodes that had been Restart-ed through
      the hub once; fatal UX for nodes that hadn't.
    - On Apply, the editor sends a SNAPSHOT of the full live profile + pending
      edits as the new stored profile. "Save what's running, plus my changes,
      into the DB." No more slowly-growing stored profile that lags behind
      live.
    
    **Flag validation tightened**
    - `lib/stoachain-flags-catalog.ts`: `block-gas-limit` min bumped from 0
      to 1_600_000 (the Stoa network production min). Clearing the field
      still falls back to chainweb-node's compiled-in default (1.6M), which
      now matches.
    
    **GET /flags no longer requires fresh-confirm**
    - `pages/api/admin/nodes/[id]/stoachain/flags.ts`: read-only GET
      downgraded from `requireFreshAdminConfirmApi` → `requireAdminApi`.
      Matches the other read-only endpoints (`status`, `docker-logs`,
      `preflight`). Opening the Flags sub-tab without typing your password
      in the last 5 minutes no longer 401s.
    - PATCH still requires fresh-confirm + ancient-admin (restarting
      chainweb-node interrupts P2P gossip, which is destructive).
    
    **Row metadata swap**
    - FlagRow now surfaces "stored differs from live" instead of the other
      way around. Since the editor prefills from live, the interesting case
      is "you hand-edited the runner without restarting through the hub,
      so stored lags behind what's actually running." Apply still
      snapshots-live-then-persists, which is the desired rebaseline.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.2a`.
  • v.Chaos.Jason.0-as·v0.7.2
    First live edit path for chainweb-node flags. Previously the hub could only
    start / stop / restart the node with whatever profile was captured at install
    time; to change a flag the operator had to SSH in, hand-edit the runner, and
    hope they didn't mistype. Now every catalog-known flag has an input control
    in the UI with validation, a pending-diff counter, and Apply + Restart.
    
    **New API endpoints**
    - `GET  /api/admin/nodes/[id]/stoachain/flags`: returns the stored profile
      JSON from the `nodes.stoachain_flags_json` column. Empty `{}` if the node
      was never restarted via the hub (first Apply seeds it).
    - `PATCH /api/admin/nodes/[id]/stoachain/flags`: accepts
      `{ flags: Partial<ChainwebFlags>, restart?: boolean }`. Validates every
      incoming key against `FLAGS_CATALOG` (type + range + enum + hex), rejects
      immutable flags (`chainweb-version`, `database-directory`, cert paths),
      merges into the stored profile, persists, optionally enqueues a
      `stoachain-control` restart job. Value `null` for any key means "revert
      to chainweb-node default" (key is dropped from stored JSON).
    - Both routes require `requireFreshAdminConfirmApi` and ancient-admin; for
      PATCH this is because restarting chainweb-node interrupts P2P gossip.
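
    The merge semantics of PATCH (reject immutable keys, treat `null` as "drop the key") can be sketched as below. The types, the function name, and the two-entry immutable set are illustrative assumptions; the real list lives in `FLAGS_CATALOG`, and type/range validation is omitted here:

    ```typescript
    type FlagValue = string | number | boolean | string[];
    type FlagPatch = Record<string, FlagValue | null>;

    // Illustrative subset only; the catalog also locks the cert paths.
    const IMMUTABLE = new Set(["chainweb-version", "database-directory"]);

    // Returns a new stored profile; the input profile is not mutated.
    function applyFlagPatch(
      stored: Record<string, FlagValue>,
      patch: FlagPatch
    ): Record<string, FlagValue> {
      const next: Record<string, FlagValue> = { ...stored };
      for (const [key, value] of Object.entries(patch)) {
        if (IMMUTABLE.has(key)) throw new Error(`flag "${key}" is immutable`);
        if (value === null) delete next[key]; // null = revert to chainweb-node default
        else next[key] = value;
      }
      return next;
    }
    ```

    Throwing before any key is persisted means a patch that touches an immutable flag is rejected as a whole rather than half-applied.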
    
    **New UI card: Flag editor** (below the read-only Flags card on the Chainweb tab)
    - `components/admin/NodeTabs.tsx`: `<FlagEditorCard>` with per-flag inputs
      grouped by category (Core / Data / TLS / P2P / Consensus / Mempool /
      Service / Mining / Backup / Logging / Runtime / Debug).
    - Input type per flag is driven by `FlagMeta`: switch-pair → checkbox,
      enum → select, number → numeric input with `[min, max]` hint, repeatable
      → textarea (one entry per line), hex / string / path → text input.
    - "Show all flags" toggle surfaces the full catalog (~40 flags); default
      view shows only what's currently set in the stored profile plus the
      always-visible immutable (locked) rows.
    - Each row shows a `pending` badge when changed, `locked` badge on
      immutable flags, `debug` badge on debug-only flags, the inline
      description, the relevant catalog warning when the value deviates from
      default, and the live value if it differs from stored.
    - Per-row `revert` discards the pending change; `clear` sets to null
      (chainweb-node falls back to its compiled-in default).
    - Footer: pending count, `discard all`, `Save (no restart)`, and
      `Apply + Restart`. Save-only persists to the DB so the next Restart
      picks up the new flags; Apply + Restart does both, enqueues the
      `stoachain-control` job, and polls inline (same pattern as
      CertRotateButton; no nav-away required to watch progress).
    - GHC runtime (`+RTS`) gets its own row at the bottom; it's not a flag
      per se but the handler emits it in the runner.
    
    **Integration**
    - The flag-editor writes the same `stoachain_flags_json` column that
      `stoachain-control restart` already reads when rebuilding the runner
      script (for screen-supervised nodes) or the compose file (for
      docker-supervised nodes). No new wiring needed: "edit flags → Apply
      + Restart" just happens.
    - `FuturePhase` banner on the Chainweb tab updated: flag editor is no
      longer future work. Listed forward: v0.7.3 seeded install, v0.8.x web
      terminal / hub registry.
    
    **Version bump**
    - `lib/version.ts` → `v0.7.2` · `SC4 StoaChain flag editor`.
  • v.Chaos.Jason.0-at·v0.7.1b
    Two quality-of-life additions on top of v0.7.1a's install flow, observed
    during end-to-end test on a home Linux machine.
    
    **Container logs card** (new)
    - `components/admin/NodeTabs.tsx`: ContainerLogsCard shown on Chainweb
      tab when the node is supervision=docker. Live tail of `docker logs
      --tail N stoa-node` via a new API route.
    - Configurable line count (100 / 200 / 500 / 1000 / 2000); auto-refresh
      every 5s (toggle); Copy-all button for pasting into support convos.
    - New endpoint: `GET /api/admin/nodes/[id]/stoachain/docker-logs?lines=N&container=X`.
      Any-admin access; read-only; safe to poll.
    
    **Auto-reprobe after mutating jobs** (new)
    - `stoachain-install`, `stoachain-control` (start/stop/restart), and
      `stoachain-cert-rotate` handlers now enqueue a `system-probe` job as
      their last successful step. The probe runs within seconds of the
      mutating job finishing, so the Docker tab's container listing +
      Overview "Services detected" rows reflect the new state without the
      operator needing to click Reprobe.
    - Also wired on the UI side for CertRotate + Install: they POST a
      probe from the browser on job completion as belt-and-braces (fires
      even if somehow the handler-side enqueue fails).
    - Fixes the UX mismatch we hit during testing where preflight showed
      the target clean but Docker tab still showed a zombie stoa-node.
    
    **Wizard UX fix: P2P hostname is OPTIONAL**
    - Tested install on a home machine with no DNS name: the wizard
      required a P2P hostname input, which when filled with a placeholder
      (`ancientminer.home`) advertised an unresolvable name to peers.
      node1 rejected sync with HTTP 400 and node2's TLS failed.
    - Fix: P2P hostname field marked optional. Blank = the install sends
      `0.0.0.0` (chainweb's auto-detect-via-peer-gossip sentinel). Peers
      use the NAT-translated source address instead of the advertised
      hostname. Home operators now work without DNS.
    - Regex validation still applies when the field IS filled, for
      operators with real DNS names.
    - Updated helper text explicitly tells home operators to leave it
      blank and tells validators/bootstraps to fill it in.
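
    The blank-means-`0.0.0.0` rule can be sketched as follows. The hostname pattern is a hypothetical stand-in, since the wizard's actual regex is not shown in this changelog:

    ```typescript
    // Hypothetical pattern: dot-separated labels of letters, digits, and inner hyphens.
    const HOSTNAME_RE =
      /^(?=.{1,253}$)[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?(\.[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?)*$/i;

    // Maps the wizard's optional hostname field to the value sent to the installer.
    function resolveP2pHostname(input: string): string {
      const trimmed = input.trim();
      if (trimmed === "") return "0.0.0.0"; // blank: let peers use the observed source address
      if (!HOSTNAME_RE.test(trimmed)) {
        throw new Error(`invalid P2P hostname: ${trimmed}`);
      }
      return trimmed;
    }
    ```

    Validation only runs on non-empty input, which matches the rule that the regex still applies when the field is filled.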
    
    **End-to-end test result (v0.7.1a + b combined)**
    - Fresh home Linux machine (AncientMiner, 32 GB RAM, NVMe): preflight
      green, install wizard succeeds, container healthy, chainweb-node
      begins receiving cut data from node1.stoachain.com. Cut height
      climbing; node actively syncing.
    - Two real chainweb quirks observed during test (tracked as separate
      backlog items, NOT install-flow bugs):
      - TLS handshake with node2.stoachain.com returns
        "certificate has unknown CA" despite `_disablePeerValidation=True`
        in Stoa version config.
      - node1.stoachain.com returns HTTP 429 during aggressive initial
        cut polling; self-resolves as peer relationship stabilizes.
    
    ---
  • v.Chaos.Jason.0-au·v0.7.1a
    Incremental patch on v0.7.1. Extends the existing Easy-setup bootstrap
    flow in `/admin/nodes/new` so that, in addition to installing the hub's
    SSH key, it also prepares the target for container-based chainweb
    management. Fills the gap v0.7.1's install wizard revealed: the
    wizard assumed a "prepared" server, but the bootstrap flow wasn't
    actually preparing anything beyond SSH auth.
    
    **What Easy setup now does** (was: only steps 1-4 and 10-11)
    
    1. Password-auth SSH into the target (password used in-memory, never stored)
    2. Generate ed25519 SSH keypair
    3. Install the public key in `~/.ssh/authorized_keys`
    4. Reconnect with the new key to verify it works
    5. **(new)** Install `docker.io` if missing (apt / dnf / yum auto-detected)
    6. **(new)** Enable + start the docker daemon via systemctl
    7. **(new)** Add SSH user to the `docker` group (skipped when user is root)
    8. **(new)** Write `/etc/sudoers.d/ancientholdings-stoa` with NOPASSWD for
       `docker, mkdir, chmod, chown, openssl, tee, systemctl, screen, pkill, mv`
       (skipped when user is root, since root doesn't need sudo)
    9. **(new)** Verify `sudo -n docker --version` works, which proves sudoers took
       effect before declaring bootstrap successful
    10. Seal the private key in the vault
    11. Show the private key to the operator once for external backup
    
    All steps are idempotent: safe to re-run against an already-prepared box
    without breaking anything.
    
    **UI changes** (`pages/admin/nodes/new.tsx`)
    
    - Expanded "Easy setup" helper block into a dropdown listing the full
      11-step sequence with inline docs about password handling, distro
      detection, and root-user semantics. Labeled as recommended.
    - Added equivalent dropdown on "Advanced" explaining when to use it
      (rare; only for pre-existing SSH key auth + manual prep) and what
      it skips vs Easy setup.
    - Clarified the root-SSH caveat (modern Linux distros block root
      password login; use a non-root sudo user instead).
    
    **What this enables**
    
    A truly fresh Linux box (whether a home machine, a Hetzner VPS, or a
    DigitalOcean droplet) can go from "ssh user + password" to "fully
    hub-managed chainweb-node capable" in one click, with zero manual
    prep. The install wizard from v0.7.1 now works end-to-end on any
    blank Linux target.
    
    **Code changes**
    - `lib/nodes.ts`: new `prepareTarget()` function runs after SSH key
      install. Constructs a distro-aware setup script executed in one
      SSH round-trip under the existing password auth. Idempotent.
    - `pages/admin/nodes/new.tsx`: copy expansion only; no logic change.
    - Version bumped to `v0.7.1a`.
    
    ---
  • v.Chaos.Jason.0-av·v0.7.1
    Hub can now provision a fresh chainweb-node container on any registered
    server via a UI wizard: no SSH, no Haskell toolchain, no manual docker
    commands. Uses the published `ghcr.io/stoachain/stoa-node:latest` image
    (v2.32.0-stoa.1 or later).
    
    **Pre-work shipped outside the hub repo**: the StoaChain repo now has
    first-class container support:
    - `docker/entrypoint.sh` expanded to cover the full production flag surface
      (all ~30 flags, correct mining-coordination vs node-mining semantics,
      ECDSA P-384 cert auto-detection).
    - `cabal.project` pins crypton 1.0.4 / memory 0.18.0 / merkle-log 0.2.0
      to survive Hackage drift from post-Kadena-shutdown dep churn.
    - Published image at [ghcr.io/stoachain/stoa-node](https://github.com/StoaChain/stoa-chain/pkgs/container/stoa-node),
      GitHub Release [v2.32.0-stoa.1](https://github.com/StoaChain/stoa-chain/releases/tag/v2.32.0-stoa.1)
      with the raw binary attached for operators who don't use docker.
    
    **On the hub side:**
    
    - `lib/stoachain-layout.ts`: canonical `StoaNode/` layout generator.
      Every install creates `<root>/StoaNode/{chainweb,data/backups,tls}/`
      with docker-compose.yml + nginx.conf.example. Backups stay inside
      data/ (hardlink-friendly for RocksDB checkpointing). Service API
      defaults to `127.0.0.1` binding unless operator opts into public.
    - `lib/stoachain-install-preflight.ts`: a single-SSH-roundtrip env audit covering
      docker installed + running, RAM ≥ 4 GB, drive class (NVMe/SSD/HDD),
      sudoers `NOPASSWD`, port 1789 / 1848 availability, whether any
      chainweb-node is already running. Returns structured report; HDD
      targets flagged red, warnings on low RAM, exact sudoers line included
      when sudo is denied.
    - `lib/handlers/stoachain-install.ts`, the orchestrator job handler:
        1. Create canonical layout (sudo mkdir + chown)
        2. Generate ECDSA P-384 cert + key at `tls/`, chmod 600 on key
        3. Render hub-managed docker-compose.yml from the chosen profile
        4. Write nginx.conf.example as reference (not applied)
        5. `docker pull ghcr.io/stoachain/stoa-node:latest`
        6. `docker compose up -d` in `chainweb/`
        7. Wait up to 120s for container's `/info` to respond
        8. Persist stoachain_flags_json + runner_path + last_action in DB
      Failure at any step aborts without auto-rollback; partial state
      visible on target for manual inspection.
    - API routes:
        - `POST /api/admin/nodes/[id]/stoachain/preflight`: any admin; runs
          the checks and returns the structured report.
        - `POST /api/admin/nodes/[id]/stoachain/install`: Ancient-admin-only,
          fresh-confirm required; validates body (root path, hostname format,
          pubkey hex) then enqueues `stoachain-install` job.
    - `components/admin/InstallWizard.tsx`, a 6-step wizard:
        1. Preflight (auto-runs; advanced-override checkbox available if any
           fail and operator knows better)
        2. Storage (drive picker with size/class badges; auto-selects best)
        3. Identity (P2P hostname + optional cluster-id)
        4. Profile (Recommended vs Mining coordinator; pubkey field shown
           for mining; backup-API + public-service toggles)
        5. Review (shows full install plan + Apply button with
           fresh-confirm modal)
        6. Running (live progress bar + log tail, success banner with
           post-install manual steps checklist, or failure with recovery
           guidance)
      Shown on Chainweb tab only when the node has no chainweb-node running
      and no hub-managed runner path recorded, i.e. truly fresh targets.
    
    **What you can now do**
    - Register a fresh Linux box in Nodes → click Install chainweb-node →
      walk through 5 steps → have a syncing StoaChain node in ~2-5 minutes.
    - Legacy node2 (screen-managed, pre-hub) is unaffected: the install
      wizard only appears on nodes without an existing chainweb-node.
    
    **Explicitly deferred** (next phases)
    - Flag editor UI to change flags post-install: v0.7.2
    - Seeded install from donor .ahbk → faster bootstrap: v0.7.3
    - Migrate existing screen-managed node2 to container mode: v0.7.4
    - Hub-hosted container registry as backup to GHCR: v0.8.x
    - Vendor deps into StoaChain repo for supply-chain resilience: v0.8.x
    
    ---
  • v.Chaos.Jason.0-aw·v0.7.0
    First phase of the v0.7.x arc. The hub now understands chainweb-node
    at the flag level, can read it live over SSH, display its identity /
    storage / peer state, and Start/Stop/Restart it against the existing
    screen-managed production node.
    
    **What shipped**
    
    - `docs/chainweb-reference.md`: ~7,900-word living reference covering
      every chainweb-node flag we care about (defaults, roles, warnings,
      ranges, citations into StoaChain Haskell source), the TLS certificate
      system, the P2P discovery cascade, service-API endpoint catalog, and
      an audit of the production runner script (14 of 35 flags are default
      no-ops, and some like `--mining-update-stream-limit 50` are below default;
      these are candidates for cleanup).
    - `lib/stoachain-flags-catalog.ts`: 40-flag catalog with role /
      category / type / default / recommended / warning / doc-anchor
      metadata, plus two named profiles:
        - **Ancient**: byte-for-byte reproduction of the production script
          (~35 flags).
        - **Recommended**: minimal-equivalent (~20 flags, same behavior).
    - `lib/stoachain-flags.ts`: `fromPsArgs`, `fromScript`,
      `toRunnerScript`, `toDockerEnv`, `diffFlags`. One model, two
      materializations (bash runner for screen mode today, docker env
      for v0.7.1 container mode).
    - `lib/stoachain-live.ts` extended:
        - `fetchLiveFlags`: parses the live `ps -eo args` into structured
          `ChainwebFlags`, detects parent runner script, classifies which
          named profile matches (or `custom`).
        - `fetchLiveCert`: reads TLS cert + key over SSH, runs `openssl`
          for SHA-256 fingerprint, subject, validity dates, key-file perms.
          Distinguishes persistent / ephemeral / missing modes.
        - `fetchLiveDrive`: walks data-directory → mount → block device →
          `/sys/block/.../queue/rotational`. Classifies as NVMe / SSD /
          HDD; used by the UI to flash a red warning when chainweb-node
          lives on a rotational drive.
    - `lib/handlers/stoachain-control.ts`: Start / Stop / Restart job
      handler. On first touch, parses live argv into flags and saves them
      as the stored profile. Thereafter renders a hub-managed runner
      script (`<data-dir>/RunStoaNode.managed.sh`) from that profile on
      every Start/Restart; the operator's legacy runner is never
      overwritten. Stop sequence: `screen -X quit` → 10s grace → TERM →
      10s → KILL. Missing `sudo -n` permissions surface with the exact
      sudoers line to add.
    - `lib/handlers/stoachain-cert-rotate.ts`: openssl-over-SSH cert
      generation + rotation. Refuses to run while chainweb-node is
      active. Two modes: `upgrade` (first-time cert from ephemeral) and
      `rotate` (archive existing to `*.TS.old`). Handler wired; UI
      button deferred to v0.7.1.
    - Migration 014 adds 8 new columns on `nodes`:
      `stoachain_flags_json`, `stoachain_flags_at`, `stoachain_profile`,
      `stoachain_binary_path`, `stoachain_runner_path`,
      `stoachain_last_action`, `stoachain_last_action_at`,
      `stoachain_last_action_by`. Trimmed from an earlier design:
      anything trivially queryable live over SSH (cert expiry, drive
      usage, live flags) is NOT persisted, to keep the DB narrow and
      avoid stale display.
    - API endpoints:
        - `GET /api/admin/nodes/[id]/stoachain/status`: single round-trip
          payload: live status + cert + drive + flags + audit. Polled
          every 10 s from the UI.
        - `POST /api/admin/nodes/[id]/stoachain/control`: enqueues a
          `stoachain-control` job. Requires fresh admin confirm.
    - `ChainwebTab` UI rebuild with seven cards:
        - **Status**: live tone badge, profile badge, per-chain height
          grid (10 cells), peer count, auto-refresh timestamp.
        - **Control**: Start / Stop / Restart buttons with
          confirm-password modal; disabled states wire to the live
          running flag; last-action audit line.
        - **Peer identity (TLS)**: color-coded badge (Certified /
          Ephemeral / Missing), fingerprint, subject, validity with
          days-until-expiry (red <30d, amber <90d), key-perm warning.
        - **Storage**: drive-class badge (NVMe / SSD / HDD), mount, fs
          type, capacity + used %, red warning on HDD.
        - **Flags**: grouped read-only table of every parsed flag,
          matching-profile badge, catalog-gap collapsible for unknown
          flags.
        - **Runner + binary**: paths for both; the hub-generated runner
          path is shown even before first Start so operators know where
          the managed script will land.
        - Existing **Backup** card unchanged.
    - Profile classification: a simple equality check against Ancient and
      Recommended. Everything else classifies as `custom`.
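
    The profile classification described in the last bullet could look roughly like this. The flat `Flags` shape and the function names are assumptions, and the profile contents passed in are whatever the catalog defines:

    ```typescript
    type Flags = Record<string, string | number | boolean>;
    type ProfileName = "ancient" | "recommended" | "custom";

    // Shallow equality: same key set and strictly equal values.
    function sameFlags(a: Flags, b: Flags): boolean {
      const keysA = Object.keys(a);
      if (keysA.length !== Object.keys(b).length) return false;
      return keysA.every((k) => k in b && a[k] === b[k]);
    }

    // Anything that is not an exact match for a named profile is "custom".
    function classifyProfile(
      live: Flags,
      profiles: { ancient: Flags; recommended: Flags }
    ): ProfileName {
      if (sameFlags(live, profiles.ancient)) return "ancient";
      if (sameFlags(live, profiles.recommended)) return "recommended";
      return "custom";
    }
    ```

    With a plain equality check, a single extra or changed flag is enough to move a node to `custom`.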
    
    **Research inputs**
    - StoaChain Haskell source at `d:\_Claude\StoaChain\` (branch
      `AncientStoa`), ground truth for flag defaults + semantics.
    - `chainweb-node --help` output captured from production binary
      `StoaChain_2.32.0` (276 lines, `docs/research/chainweb-node-help.txt`).
    - User's two runner scripts (`RunStoaNode.sh` and `.backupapi.sh`),
      captured over SSH.
    - Kadena upstream docs (archived since Kadena Inc. shutdown
      2025-10-21; mainnet last block 2025-11-15).
    
    **Explicitly deferred (come later in v0.7.x)**
    - Flag editor UI (change profile / individual flags): v0.7.2.
    - Cert rotation UI (handler exists; button deferred): v0.7.1.
    - Container-mode detection (screen vs docker): v0.7.1.
    - Hub-driven install on fresh host: v0.7.2 (includes drive
      auto-select + canonical `StoaNode/{chainweb,data,tls}/` layout).
    - Seeded install (donor .ahbk → new node): v0.7.3.
    - Screen → container migration button: v0.7.4.
    - GHCR image publish workflow: v0.7.2 (StoaChain repo addition).
    
    **Known gaps from research (future-Claude TODOs)**
    - `--enable-local-timeout` semantics: Bool or µs? Flagged in doc §8.
    - Backup-API on port 1848 is unauthenticated; documented in
      research §4, production workaround is SSH+localhost; long-term plan
      is firewall-to-loopback or nginx auth.
    - `--bootstrap-reachability 0` silently masks a firewalled P2P port;
      v0.7.0 shows an amber warning in flags view when this is set.
    
    ---
▸Theseus · 2 historical entries · patch-number 0

Theseus Genesis · 2 historical entries · patch-number 0

  • v.Chaos.Theseus.0-a·v0.7.5g
    Feedback round on v0.7.5:
    
    - **Dropped** the `/admin` landing per-node gauge row. Doesn't scale past
      a handful of nodes; at 100 nodes it becomes a wall of jittering bars.
      The per-node live view already lives on each node's Monitoring tab,
      and the placeholder quick-link now reads "live · open any node →
      Monitoring tab" instead of the old "MON1 - coming soon".
    - **Bootstrap now installs netdata + docker-compose-plugin**, not just
      docker + certbot. New nodes land with monitoring working
      out-of-the-box. Total prepare timeout bumped to 10 min (kickstart can
      be slow on fresh VPSes).
    - **Bootstrap ticker + log reveal**. The synchronous POST is unchanged
      server-side (background-job rework is a later slice), but the UI now
      shows an 8-stage ticker while it runs (SSH check → key install → key
      verify → docker → certbot → sudoers → netdata → sealing). On success,
      a collapsible "what was installed on the target" panel shows the full
      server log of the prepare script.
    - **Dynamic service paths in the probe.** StoaChain data dir is now
      resolved from the running chainweb-node argv (`--database-directory`),
      translated via `docker inspect` for containerised nodes, and only
      falls back to `/mnt/nvmedrive/StoaNodeData` when nothing else yields.
      IPFS repo path comes from `ipfs config Datastore.Path`, then searches
      common fallbacks (`/root/.ipfs`, `/home/*/.ipfs`, `/var/lib/ipfs`).
      The probe emits new `servicePaths` fields; `NodeProvisioning` uses
      them so tiles show the real path on each node instead of a universal
      default that only ever worked for one owner's home layout.
    - **Trim system logs button** on the System-logs provisioning tile.
      Runs `journalctl --vacuum-time=7d` + `find /var/log -type f ( -name
      '*.gz' | *.1 | *.2 | *.3 | *.old | *.N ) -mtime +1 -delete`. Active
      logs never touched. Fresh-confirm required; returns before/after
      sizes so the UI shows how much was freed.
    
    ---
  • v.Chaos.Theseus.0-b·v0.7.5
    Phase code → **MON1**. Closes the biggest gap left from §8 of the original
    control-hub plan: monitoring + capacity awareness. Netdata install + live
    CPU/RAM/load/net charts were already shipped; v0.7.5 adds everything built
    on top of them.
    
    ### v0.7.5a: Per-mount disk space tiles
    
    New [NodeDiskSpace.tsx](components/admin/NodeDiskSpace.tsx) component on every
    node's Monitoring tab. Queries netdata's `/api/v1/charts` once to discover
    every `disk.space` chart on the box, then samples each for current avail /
    used / reserved. One tile per mount point, with the bar coloured green / amber / red
    at 75 % / 90 % thresholds. Refreshes every 30 s.
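
    A minimal sketch of the tile colouring, assuming reserved blocks count as used and that both thresholds are inclusive (neither detail is stated above; the function names are illustrative):

    ```typescript
    type Tone = "green" | "amber" | "red";

    // Percentage of the mount that is not available to regular writes.
    function usedPercent(availBytes: number, usedBytes: number, reservedBytes: number): number {
      const total = availBytes + usedBytes + reservedBytes;
      return total === 0 ? 0 : ((usedBytes + reservedBytes) / total) * 100;
    }

    // Amber from 75 %, red from 90 %.
    function diskTone(percent: number): Tone {
      if (percent >= 90) return "red";
      if (percent >= 75) return "amber";
      return "green";
    }
    ```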
    
    ### v0.7.5b: Per-node live gauges on `/admin` landing
    
    New [AdminHomeGauges.tsx](components/admin/AdminHomeGauges.tsx) + batched
    [/api/admin/nodes/[id]/summary](pages/api/admin/nodes/[id]/summary.ts)
    endpoint. Each visible node gets a compact tile showing CPU %, RAM used %,
    and worst-mount disk %. Tiles refresh every 15 s and link through to the
    node detail page. Ownership-filtered: modern/client admins only see their
    own nodes; ancient sees all.
    
    ### v0.7.5c: Service provisioning view
    
    New [NodeProvisioning.tsx](components/admin/NodeProvisioning.tsx) on the
    Monitoring tab. For each service detected in the system probe (StoaChain
    data / StoaChain backups / IPFS repo / Mailcow / Docker / system logs), shows
    its path + used bytes, resolves it to the mount it lives on (longest-prefix
    match against `df` output), and renders the mount's headroom bar alongside.
    Pure client-side derivation from existing probe data, with no new SSH calls.
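
    The longest-prefix match mentioned above can be sketched as follows, assuming the mount points have already been extracted from the `df` output (the function name is illustrative):

    ```typescript
    // Returns the mount point that contains `path`, preferring the deepest one.
    function resolveMount(path: string, mountPoints: string[]): string | null {
      let best: string | null = null;
      for (const mount of mountPoints) {
        // Compare on directory boundaries so "/mnt/a" does not claim "/mnt/ab".
        const prefix = mount.endsWith("/") ? mount : mount + "/";
        const covers = path === mount || path.startsWith(prefix);
        if (covers && (best === null || mount.length > best.length)) best = mount;
      }
      return best;
    }
    ```

    Matching on a trailing slash is the important detail: a naive `startsWith(mount)` would attribute a path to a sibling mount that merely shares a name prefix.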
    
    ### v0.7.5d: `capacity_snapshots` table + daily worker capture
    
    Migration 018 adds `capacity_snapshots(node_id, service_key, day,
    used_bytes, mount_point, mount_total_bytes, mount_avail_bytes)` with
    `UNIQUE(node_id, service_key, day)`. New
    [lib/capacity-snapshots.ts](lib/capacity-snapshots.ts) module: `du -sb` +
    `df -B1` over SSH for every tracked service path, idempotent per UTC day.
    Hooked into the worker's hourly reap tick; rows only insert once per day,
    so it's cheap to call hourly and survives worker restarts gracefully.
    
    Bonus: the module exposes `getRecentSnapshots()` + `projectFillDay()`
    (simple linear regression to project when the mount fills at current growth)
    which power the "full in ~N d" hint on the provisioning tiles via a new
    [/api/admin/nodes/[id]/capacity](pages/api/admin/nodes/[id]/capacity.ts)
    endpoint.
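
    As an illustration of the linear-regression projection, here is a least-squares sketch. It is not the module's `projectFillDay()`; the `Snapshot` shape, the function name, and the choice to return `null` for flat or shrinking usage are assumptions:

    ```typescript
    interface Snapshot {
      day: number; // days since an arbitrary epoch
      usedBytes: number;
    }

    // Fits usedBytes = slope * day + c by least squares, then returns the
    // number of days from the latest snapshot until `totalBytes` is reached.
    function projectDaysUntilFull(snapshots: Snapshot[], totalBytes: number): number | null {
      const n = snapshots.length;
      if (n < 2) return null;
      const meanX = snapshots.reduce((s, p) => s + p.day, 0) / n;
      const meanY = snapshots.reduce((s, p) => s + p.usedBytes, 0) / n;
      let num = 0;
      let den = 0;
      for (const p of snapshots) {
        num += (p.day - meanX) * (p.usedBytes - meanY);
        den += (p.day - meanX) ** 2;
      }
      if (den === 0) return null; // all snapshots share one day
      const slope = num / den; // bytes per day
      if (slope <= 0) return null; // no growth, so no projected fill
      const latest = snapshots.reduce((a, b) => (b.day > a.day ? b : a));
      return Math.max(0, (totalBytes - latest.usedBytes) / slope);
    }
    ```

    For example, four daily snapshots growing by 10 bytes per day from 10 to 40, against a 100-byte mount, give a slope of 10 and a projection of 6 days.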
    
    ### v0.7.5e: Active alerts surface
    
    New [NodeAlerts.tsx](components/admin/NodeAlerts.tsx) at the top of the
    Monitoring tab. Consumes netdata's `/api/v1/alarms` (already whitelisted in
    the metrics proxy), filters to `WARNING` / `CRITICAL` only, shows name +
    chart + current value + human-readable info. Refreshes every 30 s. Green
    "no active alarms" state when everything's quiet.
    
    ### v0.7.5f: Backup pre-flight disk headroom gate
    
    New [lib/disk-preflight.ts](lib/disk-preflight.ts) module with
    `backupPreflight()`: probes the remote node's stage + source dirs and the
    hub's landing dir, estimates backup size at `source × 0.8 × 1.2` (chainweb
    snapshots land at ~0.8x the source RocksDB; 20 % safety margin), and aborts
    with an actionable error before any remote work starts if headroom is
    tight. Wired into [backup-stoachain.ts](lib/handlers/backup-stoachain.ts).
    Example error:
    
    > Remote node needs ~32.4 GiB free on /mnt/nvmedrive/StoaNodeData/backups but only 12.1 GiB available. Free space or increase the mount.
    
    Previously this failed mid-stream after minutes of tar'ing, leaving the
    remote backup dir half-full to reap manually. Now it's a clean "don't
    start" with a clear remedy.
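
    The size estimate and the "don't start" decision can be sketched as below. This is an illustration under stated assumptions rather than the body of `backupPreflight()`: the helper names are hypothetical, the SSH probing is left out, and "tight" is read here simply as available bytes being below the estimate.

    ```typescript
    const BYTES_PER_GIB = 1024 * 1024 * 1024;

    // Estimated backup size: source × 0.8 (snapshot ratio) × 1.2 (safety margin).
    function estimateBackupBytes(sourceBytes: number): number {
      return Math.ceil(sourceBytes * 0.8 * 1.2);
    }

    // Bytes rendered as GiB with one decimal place.
    function formatGiB(bytes: number): string {
      return (bytes / BYTES_PER_GIB).toFixed(1) + " GiB";
    }

    // Returns null when there is enough room, otherwise an actionable message
    // in the same shape as the example error above.
    function checkHeadroom(
      label: string,
      path: string,
      neededBytes: number,
      availBytes: number
    ): string | null {
      if (availBytes >= neededBytes) return null;
      return (
        `${label} needs ~${formatGiB(neededBytes)} free on ${path} but only ` +
        `${formatGiB(availBytes)} available. Free space or increase the mount.`
      );
    }
    ```

    Running the check for every location before any remote command is issued is what turns the old mid-stream failure into an up-front refusal.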
    
    ---
▸ Midas · 1 historical entry · patch-number 0

Midas Genesis · 1 historical entry · patch-number 0

  • v.Chaos.Midas.0-a·v0.7.6
    Concatenated summary of v0.7.6a through v0.7.6z. This series introduced the
    off-chain Stoicism reward layer, the hub's flagship feature.
    
    - **Migrations `019_stoic_power.sql` + `021_stoic_power_ledger.sql`.** Full ledger
      schema: `stoic_power_accounts` (keyed by `ouronet_account`, NOT email),
      `stoic_power_events`, `stoic_power_daily`. Later renamed to `stoicism_*` in
      migration 029 for user-facing consistency; internal code still uses
      `stoic-power.ts` filenames.
    - **Eligibility engine.** Seven gates for accrual: benchmark stamped, Ouronet
      account set, commitment ≥ fleet-min, warmup complete (shadow-override
      settable), cut within tip-tolerance of fleet peak, peer count (later
      retired in v0.7.7r), not in breach (flag-violation gate documented, enforcement
      lands in v0.7.8z+). A node that fails any gate goes to the pending pool; no mint.
    - **Shadow vs live mode.** Ancient-admin-only ScoringModeCard:
      SHADOW = accumulating for validation, not published on-chain; LIVE = authoritative
      accrual. Flip-to-live zeroes shadow points (warmup preserved); revert-to-shadow
      preserves ledger. Schedule-auto-flip writes `system_state.scoring_mode_flip_at`
      consumed by the scoring worker at the top of every tick.
    - **Per-second accrual rate.** `Stoicism/sec = ServerScore × 0.001`; live display
      on the scoring card. Pending pool during warmup; flips to current on warmup
      completion.
    - **Warmup model.** 24 h of cut-within-tip-tolerance before points move from
      PENDING to CURRENT. Shadow mode lets admins override (1–1440 min) for testing;
      live mode locks at 24 h.
    - **Ouronet account override per-node.** Operators earn into their profile-set
      Ouronet by default; ancient admin can pin a per-node override (e.g. treasury
      node). Lock flag prevents operator from clearing the override.
    - **Rich-list page (v0.7.6m).** Top accounts by lifetime Stoicism; ancient-only
      initially; later (v0.7.8) backed by an hourly materialised view for scale.
    - **Earnings page (v0.7.6j).** Per-operator ledger view: current balance, pending
      points, daily accrual breakdown, event log filterable by node.
    - **Scoring worker (v0.7.6x).** Ticks every ~10 s; for each node in the
      eligibility-passing set, computes the accrual delta, inserts into
      `stoic_power_events`, and upserts `stoic_power_accounts.current_balance`.
      Batched in transactions.
    - **Mint model (design).** Daily 06:00 UTC register-aggregation mint: one batched
      `update-registers` Pact tx per chain. Documented for v0.8.x implementation;
      shadow-mode today accumulates without on-chain publication.
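
    As a rough illustration of how the gates and the accrual rate in the list above fit together, here is one possible reading in code. It is a sketch, not the scoring worker: the `GateResults` field names, `failedGates`, and `accrueTick` are hypothetical, and it assumes that a node failing any gate still accumulates, but into the pending pool.

    ```typescript
    // Hypothetical per-node gate results; field names are illustrative only.
    interface GateResults {
      benchmarkStamped: boolean;
      ouronetAccountSet: boolean;
      commitmentAtLeastFleetMin: boolean;
      warmupComplete: boolean;
      cutWithinTipTolerance: boolean;
      peerCountOk: boolean; // gate later retired in v0.7.7r
      notInBreach: boolean;
    }

    // Stoicism/sec = ServerScore × 0.001
    const STOICISM_PER_SCORE_SECOND = 0.001;

    // Names of the gates that did not pass.
    function failedGates(g: GateResults): string[] {
      return (Object.keys(g) as (keyof GateResults)[]).filter((k) => !g[k]);
    }

    // One tick: accrual delta for the elapsed time, routed to CURRENT only
    // when every gate passes, otherwise to PENDING (assumed behaviour).
    function accrueTick(
      serverScore: number,
      elapsedSeconds: number,
      g: GateResults
    ): { pool: "current" | "pending"; delta: number; failed: string[] } {
      const failed = failedGates(g);
      const delta = serverScore * STOICISM_PER_SCORE_SECOND * elapsedSeconds;
      return { pool: failed.length === 0 ? "current" : "pending", delta, failed };
    }
    ```

    With a ~10 s tick, a node with a ServerScore of 1500 would accrue roughly 15 Stoicism per tick under this reading.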
    
    ---

Current build: vH.1.20 · Chronos - The H.1.x benchmark/scoring rehaul arc