docs: spec for ZFS pool detail enrichment
Compact per-pool block with type, capacity bar, used/free/total, scrub state, and vdev summary. Collector gets pool_type derivation, scan state, and vdev list, with no new shell-outs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent f616d466eb
commit 45f59eb163
1 changed files with 159 additions and 0 deletions
159
docs/superpowers/specs/2026-04-22-zfs-pool-detail-design.md
Normal file
@@ -0,0 +1,159 @@
# ZFS Pool Detail — Design

Date: 2026-04-22
Status: approved (pending implementation)

## Problem

The host detail view currently renders one compact row per ZFS pool (`server/lib/server_web/live/host_detail_live.ex:67`):

```
rpool [ONLINE]
cap 0% · frag 0% · err 0 · vdevs 4 (deg 0) scrub never
```

This hides information the user needs at first glance:

- Total / used / free size (bytes are already collected but never rendered).
- Pool layout (mirror / raidz1 / raidz2 / stripe / mixed), which is not collected.
- Scan state: only `end_time` is kept, so an in-progress scrub looks like a finished one.

The original concept doc calls for "Health, Capacity-Bar, Fragmentation, Error-Counters, Scrub-Info, vdev-Liste" (vdev list) per pool (`proxmox-monitor-konzept.md:227`). We never finished that.

## Goal

One compact block per pool that answers at a glance: *is it healthy, what layout is it, how full is it, is a scrub running*. No drill-down yet.

## Scope

In scope:

1. Agent collector enrichment: derive `pool_type`, keep the vdev summary list, keep scan function/state. No new shell-outs; `zpool status -j --json-flat-vdevs --json-int` already returns all of this.
2. Host detail LiveView: replace the current single-line pool row with a richer compact block (see layout below).
3. Capacity bar styling in `server/assets/css/app.css`.
4. Tests: extend `agent/test/proxmox_agent/collectors/zfs_test.exs` fixtures and assertions for the new fields.

Out of scope (YAGNI):

- Drill-down view with per-vdev disk state, resilver progress bars, or scan history.
- Persistence schema changes: the payload is stored as a JSON blob, so adding keys is additive.
- Storage/dataset/VM panel changes: separate conversation.

## Agent changes

### Collector output

Extend `ProxmoxAgent.Collectors.Zfs.pool_summary` with four fields:

```elixir
%{
  # existing fields unchanged (values elided):
  name:, health:, size_bytes:, allocated_bytes:, free_bytes:,
  fragmentation_percent:, capacity_percent:, error_count:,
  vdev_count:, degraded_vdev_count:, last_scrub_end:,

  # new:
  pool_type: String.t(),           # "mirror" | "raidz1" | "raidz2" | "raidz3" | "stripe" | "mixed"
  scan_function: String.t() | nil, # "scrub" | "resilver" | nil
  scan_state: String.t() | nil,    # "SCANNING" | "FINISHED" | "CANCELED" | nil
  vdevs: [%{name: String.t(), type: String.t(), state: String.t(),
            read_errors: non_neg_integer(), write_errors: non_neg_integer(),
            checksum_errors: non_neg_integer()}]
}
```

### Derivation rules

`pool_type` is derived from the set of `vdev_type` values across top-level vdevs:

- All vdevs are the same redundant type → that type (`"mirror"`, `"raidz1"`, `"raidz2"`, `"raidz3"`).
- All vdevs are `disk` (plain top-level disk with no redundancy) → `"stripe"`.
- Anything else → `"mixed"`.

Special vdev types (`log`, `cache`, `spare`, `dedup`, `special`) are ignored for layout classification because they don't change the data redundancy story. They are still included in the `vdevs` list.
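
The rules above can be sketched as a small pure function. This is a minimal illustration, not the collector's actual code: the module name `PoolTypeSketch` and the function `derive/1` are hypothetical, the input is assumed to be a plain list of `vdev_type` strings, and classifying an empty list as `"mixed"` is an assumption the spec does not state.

```elixir
defmodule PoolTypeSketch do
  # Hypothetical module for illustration; not part of the agent codebase.
  @special ~w(log cache spare dedup special)
  @redundant ~w(mirror raidz1 raidz2 raidz3)

  # Takes the vdev_type strings of all top-level vdevs and returns the pool layout.
  def derive(vdev_types) when is_list(vdev_types) do
    data_types =
      vdev_types
      |> Enum.reject(&(&1 in @special))
      |> Enum.uniq()

    case data_types do
      ["disk"] -> "stripe"
      [type] when type in @redundant -> type
      # Covers differing types, unknown types, and (by assumption) an empty list.
      _ -> "mixed"
    end
  end
end
```

Deduplicating first means each rule becomes a match on a one-element list, so the `"mixed"` fallback needs no extra conditions.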

`scan_function` and `scan_state` are read with `get_in(status_info, ["scan", "function"])` and `get_in(status_info, ["scan", "state"])` respectively.
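
As a small illustration of why `get_in/2` suits this lookup, the sketch below reads both values from a hand-written map. The map contents are invented for the example, and the idea that a never-scanned pool lacks the `"scan"` key entirely is an assumption.

```elixir
# Hand-written stand-in for one pool's decoded status JSON (illustrative values).
status_info = %{"scan" => %{"function" => "scrub", "state" => "FINISHED"}}

scan_function = get_in(status_info, ["scan", "function"])
scan_state = get_in(status_info, ["scan", "state"])

# Assumption: a pool that was never scanned has no "scan" key at all.
# get_in/2 then yields nil instead of raising.
never_scanned = %{}
missing_state = get_in(never_scanned, ["scan", "state"])
```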

Per-vdev numeric fields (`read_errors`, `write_errors`, `checksum_errors`) are parsed the same way `error_count` already is, tolerating both string and integer values.
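
A tolerant parser along these lines might look like the sketch below. The module and function names are hypothetical, and falling back to `0` for missing or unparseable values is an assumption; the existing `error_count` parsing may behave differently.

```elixir
defmodule TolerantIntSketch do
  # Hypothetical helper; the agent's real error_count parser may differ.

  # Integers pass through unchanged (negative values fall to the last clause).
  def to_count(value) when is_integer(value) and value >= 0, do: value

  # Strings must parse completely as a non-negative integer.
  def to_count(value) when is_binary(value) do
    case Integer.parse(String.trim(value)) do
      {n, ""} when n >= 0 -> n
      _ -> 0
    end
  end

  # Assumption: anything else (nil, floats, maps) counts as zero errors.
  def to_count(_other), do: 0
end
```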

### Tests

`agent/test/fixtures/zfs/zpool_status.json` already has a mirror and a raidz2 pool; extend assertions in `zfs_test.exs`:

- `rpool.pool_type == "mirror"`
- `tank.pool_type == "raidz2"`
- `rpool.scan_state == "FINISHED"`
- `rpool.vdevs` has length 1 with `type: "mirror"`, `state: "ONLINE"`

Add one new fixture-free unit test covering the `"stripe"` and `"mixed"` branches by injecting a synthetic runner.

## Server changes

None in the collector pipeline. The channel handler already stores the whole `zfs_pools.pools` list as JSON (`server/lib/server_web/channels/host_channel.ex`, to be confirmed in the plan) and the LiveView reads it with `get_in/2`. New keys flow through automatically.

## UI changes

### Layout

Replace the current `.pool-row` flex block in `host_detail_live.ex:69-86` with a per-pool compact block:

```
rpool mirror [ONLINE]
████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 40%
used 200.0 GB · free 300.0 GB · total 500.0 GB
frag 17% · err 0 · vdevs 1 (deg 0) · scrub finished 2026-04-19
```

Element mapping:

- Line 1: pool name (bright mono, bold) · pool_type (muted) · health badge (right).
- Line 2: capacity bar (div with width % + background color keyed to capacity thresholds).
- Line 3: used / free / total, rendered with the existing `format_bytes/1` helper.
- Line 4: the existing compact details line, plus scrub state: `scrub scanning` / `scrub finished <date>` / `scrub never`.

### Capacity bar

CSS in `server/assets/css/app.css`:

```css
.capbar {
  height: 4px; background: var(--panel-2); border-radius: 2px;
  overflow: hidden; margin: 0.25rem 0;
}
.capbar > span { display: block; height: 100%; background: var(--ok); }
.capbar[data-level="warn"] > span { background: var(--warn); }
.capbar[data-level="crit"] > span { background: var(--crit); }
```

Thresholds, evaluated top to bottom (matching the concept doc at `proxmox-monitor-konzept.md:218-219`):

- `cap >= 90` → `data-level="crit"`
- `cap >= 80` → `data-level="warn"`
- else → default (ok green).
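
The threshold mapping can be sketched as a three-clause function. The module name `CapLevelSketch` and the choice to return `nil` for the default style are assumptions made for this illustration.

```elixir
defmodule CapLevelSketch do
  # Hypothetical helper. Returns the data-level value, or nil for the default (ok) style.
  # The is_number/1 guard matters: in Elixir's term ordering atoms sort above
  # numbers, so a bare `nil >= 90` would be true.
  def level(cap) when is_number(cap) and cap >= 90, do: "crit"
  def level(cap) when is_number(cap) and cap >= 80, do: "warn"
  def level(_cap), do: nil
end
```

Clause order does the work here: the `crit` clause is tried first, so the `warn` clause only ever sees values below 90.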

### Degraded pool callout

For ONLINE pools with `degraded_vdev_count == 0`, do not render per-vdev detail, to keep the block simple. For anything else, render one line per non-ONLINE vdev below the detail line:

```
! mirror-1 DEGRADED r=0 w=0 cksum=12
```

Styled with the existing `.callout.err` class.
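
One way to produce these callout lines is sketched below. The module name is hypothetical, and the string keys assume the pool reaches the LiveView as decoded JSON; the real payload shape should be confirmed against the channel handler.

```elixir
defmodule DegradedCalloutSketch do
  # Hypothetical helper. String keys assume a decoded-JSON payload.

  # Healthy pool: no per-vdev detail at all.
  def lines(%{"health" => "ONLINE", "degraded_vdev_count" => 0}), do: []

  # Anything else: one callout line per vdev whose state is not ONLINE.
  def lines(pool) do
    pool
    |> Map.get("vdevs", [])
    |> Enum.reject(&(Map.get(&1, "state") == "ONLINE"))
    |> Enum.map(fn vdev ->
      "! #{vdev["name"]} #{vdev["state"]} " <>
        "r=#{Map.get(vdev, "read_errors", 0)} " <>
        "w=#{Map.get(vdev, "write_errors", 0)} " <>
        "cksum=#{Map.get(vdev, "checksum_errors", 0)}"
    end)
  end
end
```

Because the `vdevs` list also carries special vdevs, an implementation may want to decide explicitly whether entries such as idle spares should appear in the callout.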

### Scrub rendering

- `scan_state == "SCANNING"` → `"scrub scanning"` (no date).
- `scan_state == "FINISHED"` and `last_scrub_end` present → `"scrub finished #{format_date(last_scrub_end)}"`.
- Otherwise → `"scrub never"`.

`last_scrub_end` is a string like `"Sun Apr 19 02:00:00 2026"`. Either keep it as-is or reformat it to `YYYY-MM-DD` with a tiny helper (strptime isn't stdlib-trivial in Elixir; the simplest approach is to split on whitespace and reorder). Accept "as-is" if reformatting is ugly.
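
The three rendering rules and the split-and-reorder date helper could be sketched as follows. The module name `ScrubLabelSketch` and the function `label/2` are hypothetical, the input is assumed to be in the five-field form shown above, and returning the raw string for anything unexpected is one possible reading of the "as-is" fallback.

```elixir
defmodule ScrubLabelSketch do
  # Hypothetical helpers for illustration only.
  @months ~w(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)

  def label("SCANNING", _last_end), do: "scrub scanning"

  def label("FINISHED", last_end) when is_binary(last_end),
    do: "scrub finished #{format_date(last_end)}"

  def label(_state, _last_end), do: "scrub never"

  # Splits on whitespace and reorders into YYYY-MM-DD.
  # Input that does not have the expected five fields is returned unchanged.
  def format_date(raw) do
    with [_dow, mon, day, _time, year] <- String.split(raw),
         idx when is_integer(idx) <- Enum.find_index(@months, &(&1 == mon)),
         {d, ""} <- Integer.parse(day),
         {y, ""} <- Integer.parse(year) do
      "#{y}-#{pad(idx + 1)}-#{pad(d)}"
    else
      _ -> raw
    end
  end

  defp pad(n), do: n |> Integer.to_string() |> String.pad_leading(2, "0")
end
```

`String.split/1` collapses runs of whitespace, so a space-padded single-digit day still yields five fields.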

## Risks

- ZFS JSON output has changed shape between OpenZFS releases. The concept doc requires `OpenZFS 2.3+`. Agent code tolerates missing keys via `Map.get/3` defaults; keep that discipline.
- `zpool status --json-flat-vdevs` flattens nested mirrors-of-mirrors. Top-level vdevs are keyed by name; pool_type derivation inspects only top-level entries (no child vdev walking needed in the flat form).
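
As a small illustration of the `Map.get/3` discipline, the sketch below reads from a pool entry that lacks the newer keys. The map is invented for the example and the key names other than `"scan"` and `"state"` are assumptions about the JSON shape.

```elixir
# Hypothetical pool entry from a release whose JSON lacks some keys.
pool_json = %{"name" => "rpool", "state" => "ONLINE"}

# Defaults keep the rest of the collector working on an empty structure.
vdevs = Map.get(pool_json, "vdevs", %{})
scan = Map.get(pool_json, "scan", %{})
scan_state = Map.get(scan, "state")

# Top-level vdevs are keyed by name, so the names are simply the map keys.
top_level_names = Map.keys(vdevs)
```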

## Rollout

Additive collector changes + additive UI. No DB migration, no breaking payload change. Old agents without the new fields take the gracefully degraded path: `pool_type` shows as `—`, the scrub line falls back to `never`, and the capacity bar still renders from the existing bytes.