ZFS Pool Detail — Design
Date: 2026-04-22 · Status: approved (pending implementation)
Problem
The host detail view renders a compact row per ZFS pool today (server/lib/server_web/live/host_detail_live.ex:67):
```
rpool [ONLINE]
cap 0% · frag 0% · err 0 · vdevs 4 (deg 0) scrub never
```
This hides information the user needs at first glance:
- Total / used / free size (bytes are already collected but never rendered).
- Pool layout (mirror / raidz1 / raidz2 / stripe / mixed) — not collected.
- Scan state — only `end_time` is kept, so an in-progress scrub looks like a finished one.
The original concept doc calls for "health, capacity bar, fragmentation, error counters, scrub info, vdev list" per pool (proxmox-monitor-konzept.md:227). We never finished that.
Goal
One compact block per pool that answers at a glance: is it healthy, what layout is it, how full is it, is a scrub running. No drill-down yet.
Scope
In scope:
- Agent collector enrichment — derive `pool_type`, keep a vdev summary list, keep scan function/state. No new shell-outs; `zpool status -j --json-flat-vdevs --json-int` already returns all of this.
- Host detail LiveView — replace the current single-line pool row with a richer compact block (see layout below).
- Capacity bar styling in `assets/css/app.css`.
- Tests — extend `agent/test/proxmox_agent/collectors/zfs_test.exs` fixtures and assertions for the new fields.
Out of scope (YAGNI):
- Drill-down view with per-vdev disk state, resilver progress bars, or scan history.
- Persistence schema changes — payload is stored as JSON blob; adding keys is additive.
- Storage/dataset/VM panel changes — separate conversation.
Agent changes
Collector output
Extend `ProxmoxAgent.Collectors.Zfs.pool_summary` with four new fields:
```elixir
%{
  # existing fields, unchanged:
  #   name, health, size_bytes, allocated_bytes, free_bytes,
  #   fragmentation_percent, capacity_percent, error_count,
  #   vdev_count, degraded_vdev_count, last_scrub_end
  # new:
  pool_type: String.t(),            # "mirror" | "raidz1" | "raidz2" | "raidz3" | "stripe" | "mixed"
  scan_function: String.t() | nil,  # "scrub" | "resilver" | nil
  scan_state: String.t() | nil,     # "SCANNING" | "FINISHED" | "CANCELED" | nil
  vdevs: [
    %{
      name: String.t(),
      type: String.t(),
      state: String.t(),
      read_errors: non_neg_integer(),
      write_errors: non_neg_integer(),
      checksum_errors: non_neg_integer()
    }
  ]
}
```
Derivation rules
`pool_type` is derived from the set of `vdev_type` values across top-level vdevs:
- All vdevs the same type → that type (`"mirror"`, `"raidz1"`, `"raidz2"`, `"raidz3"`).
- All vdevs are `disk` (plain top-level disks with no redundancy) → `"stripe"`.
- Anything else → `"mixed"`.
Special vdev types (log, cache, spare, dedup, special) are ignored for layout classification — they don't change the data redundancy story. They are still included in the vdevs list.
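A minimal sketch of that classification, assuming the flat-vdev entries arrive as string-keyed maps with a `"vdev_type"` key (the key name and function placement are assumptions, not the final collector API):

```elixir
# Special vdev classes that never affect data redundancy.
@special_vdev_types ~w(log cache spare dedup special)

# Classify the pool layout from its top-level vdev types.
def derive_pool_type(vdevs) do
  data_types =
    vdevs
    |> Enum.map(&Map.get(&1, "vdev_type"))
    |> Enum.reject(&(is_nil(&1) or &1 in @special_vdev_types))
    |> Enum.uniq()

  case data_types do
    ["disk"] -> "stripe"  # plain top-level disks, no redundancy
    [single] -> single    # homogeneous: "mirror" | "raidz1" | "raidz2" | "raidz3"
    _ -> "mixed"          # heterogeneous (or nothing classifiable) — fall back to "mixed"
  end
end
```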
`scan_function` / `scan_state` are read with `get_in(status_info, ["scan", "function"])` and `get_in(status_info, ["scan", "state"])` respectively.
Per-vdev numeric fields (`read_errors`, `write_errors`, `checksum_errors`) are parsed the same way `error_count` already is (tolerant of both string and integer input).
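That tolerant parse could look like this (a sketch; the existing `error_count` code may already provide an equivalent helper):

```elixir
# Accept integers as-is, parse decimal strings, default anything else to 0.
defp to_int(n) when is_integer(n), do: n

defp to_int(s) when is_binary(s) do
  case Integer.parse(s) do
    {n, _rest} -> n
    :error -> 0
  end
end

defp to_int(_), do: 0
```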
Tests
`agent/test/fixtures/zfs/zpool_status.json` already has a mirror and a raidz2 pool; extend the assertions in `zfs_test.exs`:
- `rpool.pool_type == "mirror"`
- `tank.pool_type == "raidz2"`
- `rpool.scan_state == "FINISHED"`
- `rpool.vdevs` has length 1 with `type: "mirror"`, `state: "ONLINE"`
Add one new fixture-free unit test covering the "stripe" and "mixed" branches by injecting a synthetic runner.
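A sketch of that test, assuming the derivation is exposed as `derive_pool_type/1` (if it stays private, the same branches can be driven through the collector via the synthetic runner instead):

```elixir
test "pool_type classifies stripe and mixed layouts" do
  stripe = [%{"vdev_type" => "disk"}, %{"vdev_type" => "disk"}]
  mixed  = [%{"vdev_type" => "mirror"}, %{"vdev_type" => "raidz1"}]

  assert Zfs.derive_pool_type(stripe) == "stripe"
  assert Zfs.derive_pool_type(mixed) == "mixed"
end
```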
Server changes
None in the collector pipeline. The channel handler already stores the whole `zfs_pools.pools` list as JSON (server/lib/server_web/channels/host_channel.ex — to confirm in plan) and the LiveView reads it with `get_in/2`. New keys flow through automatically.
UI changes
Layout
Replace the current `.pool-row` flex block in host_detail_live.ex:69-86 with a per-pool compact block:
```
rpool mirror [ONLINE]
████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 40%
used 200.0 GB · free 300.0 GB · total 500.0 GB
frag 17% · err 0 · vdevs 1 (deg 0) · scrub finished 2026-04-19
```
Element mapping:
- Line 1: pool name (bright mono, bold) · `pool_type` (muted) · health badge (right).
- Line 2: capacity bar (div with width % + background color keyed to capacity thresholds).
- Line 3: used / free / total — rendered with the existing `format_bytes/1` helper.
- Line 4: the existing compact details line, plus scrub state — `scrub scanning` / `scrub finished <date>` / `scrub never`.
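A HEEx sketch of lines 1 and 2, assuming pools arrive as string-keyed maps and that a `health_class/1` badge helper exists — both are assumptions, and the class names are illustrative (`cap_level/1` is sketched under "Capacity bar" below):

```heex
<div :for={pool <- @zfs_pools} class="pool-block">
  <div class="pool-head">
    <span class="pool-name"><%= pool["name"] %></span>
    <span class="pool-type"><%= pool["pool_type"] || "—" %></span>
    <span class={"badge #{health_class(pool["health"])}"}><%= pool["health"] %></span>
  </div>
  <div class="capbar" data-level={cap_level(pool["capacity_percent"])}>
    <span style={"width: #{pool["capacity_percent"] || 0}%"}></span>
  </div>
</div>
```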
Capacity bar
CSS in `server/assets/css/app.css`:
```css
.capbar {
  height: 4px; background: var(--panel-2); border-radius: 2px;
  overflow: hidden; margin: 0.25rem 0;
}
.capbar > span { display: block; height: 100%; background: var(--ok); }
.capbar[data-level="warn"] > span { background: var(--warn); }
.capbar[data-level="crit"] > span { background: var(--crit); }
```
Thresholds (matching the concept doc's thresholds at proxmox-monitor-konzept.md:218-219):
- `cap >= 90` → `data-level="crit"`
- `cap >= 80` → `data-level="warn"`
- else → default (ok green).
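The `cap_level/1` helper referenced in the HEEx sketch above could map those thresholds like so (a sketch; returning `nil` omits the attribute, leaving the default ok green):

```elixir
# Map capacity percent to the capbar's data-level attribute.
defp cap_level(cap) when is_number(cap) and cap >= 90, do: "crit"
defp cap_level(cap) when is_number(cap) and cap >= 80, do: "warn"
defp cap_level(_), do: nil
```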
Degraded pool callout
For `ONLINE` pools with `degraded_vdev_count == 0`, do not render per-vdev detail — keep it simple. For anything else, render one line per non-`ONLINE` vdev below the detail line:
```
! mirror-1 DEGRADED r=0 w=0 cksum=12
```
Styled with the existing `.callout.err` class.
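A sketch of that loop, under the same string-keyed assumptions as the HEEx block above:

```heex
<%= for v <- pool["vdevs"] || [], v["state"] != "ONLINE" do %>
  <div class="callout err">
    ! <%= v["name"] %> <%= v["state"] %>
    r=<%= v["read_errors"] %> w=<%= v["write_errors"] %> cksum=<%= v["checksum_errors"] %>
  </div>
<% end %>
```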
Scrub rendering
scan_state == "SCANNING"→"scrub scanning"(no date).scan_state == "FINISHED"andlast_scrub_endpresent →"scrub #{format_date(last_scrub_end)}".- Otherwise →
"scrub never".
`last_scrub_end` is a string like `"Sat Apr 19 02:00:00 2026"` — keep it as-is, or reformat to YYYY-MM-DD with a tiny helper (`strptime`-style parsing isn't stdlib-trivial in Elixir; simplest: split on whitespace and reorder, as sketched below). Accept "as-is" if reformatting turns out ugly.
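A sketch of that split-and-reorder helper (hypothetical; it falls back to the raw string whenever the shape is unexpected):

```elixir
@months %{"Jan" => "01", "Feb" => "02", "Mar" => "03", "Apr" => "04",
          "May" => "05", "Jun" => "06", "Jul" => "07", "Aug" => "08",
          "Sep" => "09", "Oct" => "10", "Nov" => "11", "Dec" => "12"}

# "Sat Apr 19 02:00:00 2026" → "2026-04-19"; anything else passes through.
defp format_date(raw) when is_binary(raw) do
  with [_dow, mon, day, _time, year] <- String.split(raw),
       {:ok, mm} <- Map.fetch(@months, mon) do
    "#{year}-#{mm}-#{String.pad_leading(day, 2, "0")}"
  else
    _ -> raw
  end
end
```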
Risks
- ZFS JSON output has changed shape between OpenZFS releases. The concept doc requires OpenZFS 2.3+. Agent code tolerates missing keys via `Map.get/3` defaults — keep that discipline.
- `zpool status --json-flat-vdevs` flattens nested mirrors-of-mirrors. Top-level vdevs are keyed by name; `pool_type` derivation inspects only top-level entries (no child vdev walking needed in the flat form).
Rollout
Additive collector changes + additive UI. No DB migration, no breaking payload change. Old agents without the new fields take the graceful-degradation path: `pool_type` renders as `—`, the scrub line falls back to `never`, and the capacity bar still renders from the existing byte counts.