docs: implementation plan for ZFS pool detail

Four tasks: collector enrichment (pool_type/scan/vdevs), classification
coverage tests, CSS for capacity bar + pool block, LiveView rendering
and test updates.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Carsten 2026-04-22 17:40:31 +02:00
parent 45f59eb163
commit a4f4d3ca51


@@ -0,0 +1,532 @@
# ZFS Pool Detail Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Show type, total/used/free size, capacity bar, and scan state per ZFS pool on the host detail page — a simple at-a-glance view with no drill-down yet.
**Architecture:** Extend the existing agent collector (`ProxmoxAgent.Collectors.Zfs.collect_pools/1`) to derive `pool_type`, `scan_function`, `scan_state`, and a compact `vdevs` list from the already-fetched `zpool status -j --json-flat-vdevs` JSON. No new shell-outs. The Phoenix channel stores pool payloads as opaque JSON, so server/DB layers need no change. The host detail LiveView renders a new compact per-pool block using the enriched fields plus a thin capacity bar driven by existing `capacity_percent` thresholds.
**Tech Stack:** Elixir / Phoenix LiveView, ExUnit, existing `assets/css/app.css`. Design doc: `docs/superpowers/specs/2026-04-22-zfs-pool-detail-design.md`.
---
## File Structure
**Modify**
- `agent/lib/proxmox_agent/collectors/zfs.ex` — extend `merge_pools/2` to emit `pool_type`, `scan_function`, `scan_state`, and `vdevs` list.
- `agent/test/proxmox_agent/collectors/zfs_test.exs` — extend existing assertions, add new test cases for stripe, mixed, and ignored special vdev types.
- `server/lib/server_web/live/host_detail_live.ex` — replace the pool row markup (current lines 69-86), add `capbar_level/1`, `pool_scrub_line/1`, and `pool_layout/1` helpers.
- `server/assets/css/app.css` — add `.capbar` rules.
- `server/test/server_web/live/host_detail_live_test.exs` — extend the fast-sample pool fixture with the new fields and add assertions.
**No new files.** All changes are additive and land inside existing modules.
---
## Task 1: Agent collector — pool_type, scan state, vdev list
**Files:**
- Modify: `agent/lib/proxmox_agent/collectors/zfs.ex`
- Modify: `agent/test/proxmox_agent/collectors/zfs_test.exs`
- Modify: `agent/test/fixtures/zfs/zpool_status.json`
### - [ ] Step 1: Add per-vdev error counters to the fixture so tests can assert on them
Replace `agent/test/fixtures/zfs/zpool_status.json` with:
```json
{
"output_version": { "command": "zpool status", "vers_major": 0, "vers_minor": 1 },
"pools": {
"rpool": {
"name": "rpool",
"state": "ONLINE",
"scan": {
"function": "scrub",
"state": "FINISHED",
"end_time": "Sat Apr 19 02:00:00 2026"
},
"error_count": "0",
"vdevs": {
"mirror-0": {
"name": "mirror-0",
"vdev_type": "mirror",
"state": "ONLINE",
"read_errors": "0",
"write_errors": "0",
"checksum_errors": "0"
}
}
},
"tank": {
"name": "tank",
"state": "DEGRADED",
"scan": {
"function": "scrub",
"state": "SCANNING",
"end_time": "Tue Mar 01 08:00:00 2026"
},
"error_count": "2",
"vdevs": {
"raidz2-0": {
"name": "raidz2-0",
"vdev_type": "raidz2",
"state": "DEGRADED",
"read_errors": "0",
"write_errors": "0",
"checksum_errors": "2"
}
}
}
}
}
```
(The only change from the current fixture is `"tank"`'s `scan.state` → `"SCANNING"`, so that a scrub-in-progress case is covered.)
### - [ ] Step 2: Extend the existing fixture-based test with new field assertions
Edit `agent/test/proxmox_agent/collectors/zfs_test.exs`. Inside `describe "collect_pools/1"`, replace the `"returns a summary per pool"` test body with:
```elixir
test "returns a summary per pool" do
sample = Zfs.collect_pools(runner: fake_runner())
assert is_list(sample.pools)
assert length(sample.pools) == 2
rpool = Enum.find(sample.pools, &(&1.name == "rpool"))
tank = Enum.find(sample.pools, &(&1.name == "tank"))
assert rpool.health == "ONLINE"
assert rpool.capacity_percent == 40
assert rpool.fragmentation_percent == 17
assert rpool.size_bytes == 500_000_000_000
assert rpool.error_count == 0
assert rpool.degraded_vdev_count == 0
assert rpool.pool_type == "mirror"
assert rpool.scan_function == "scrub"
assert rpool.scan_state == "FINISHED"
assert [%{name: "mirror-0", type: "mirror", state: "ONLINE",
read_errors: 0, write_errors: 0, checksum_errors: 0}] = rpool.vdevs
assert tank.health == "DEGRADED"
assert tank.error_count == 2
assert tank.degraded_vdev_count == 1
assert tank.pool_type == "raidz2"
assert tank.scan_state == "SCANNING"
assert [%{name: "raidz2-0", type: "raidz2", state: "DEGRADED",
checksum_errors: 2}] = tank.vdevs
end
```
### - [ ] Step 3: Run tests — expect FAIL
Run: `cd agent && mix test test/proxmox_agent/collectors/zfs_test.exs`
Expected: the `"returns a summary per pool"` test fails because `:pool_type`, `:scan_function`, `:scan_state`, and `:vdevs` are not yet on the pool map.
### - [ ] Step 4: Implement the new fields in the collector
Edit `agent/lib/proxmox_agent/collectors/zfs.ex`. Update the `@type pool_summary` and `merge_pools/2` function as follows:
```elixir
@type vdev_summary :: %{
name: String.t(),
type: String.t(),
state: String.t(),
read_errors: non_neg_integer(),
write_errors: non_neg_integer(),
checksum_errors: non_neg_integer()
}
@type pool_summary :: %{
name: String.t(),
health: String.t(),
size_bytes: non_neg_integer(),
allocated_bytes: non_neg_integer(),
free_bytes: non_neg_integer(),
fragmentation_percent: non_neg_integer(),
capacity_percent: non_neg_integer(),
error_count: non_neg_integer(),
vdev_count: non_neg_integer(),
degraded_vdev_count: non_neg_integer(),
pool_type: String.t(),
scan_function: String.t() | nil,
scan_state: String.t() | nil,
last_scrub_end: String.t() | nil,
vdevs: [vdev_summary()]
}
```
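The `| nil` on the scan fields covers pools that report no scan information. A small sketch, assuming a decoded status entry may simply lack a `"scan"` key (the maps below are illustrative, not captured `zpool` output): `get_in/2` with string keys returns `nil` instead of raising when a key along the path is missing.

```elixir
# Illustrative status entries; not real `zpool status` output.
scrubbed = %{"scan" => %{"function" => "scrub", "state" => "FINISHED"}}
never_scanned = %{"state" => "ONLINE"}

# get_in/2 walks the nested string keys and yields nil on any missing key.
scan_state = fn status_info -> get_in(status_info, ["scan", "state"]) end

IO.inspect(scan_state.(scrubbed))
IO.inspect(scan_state.(never_scanned))
IO.inspect(scan_state.(%{}))
```

A `nil` here is what the UI layer would later have to treat as "no scrub recorded".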
Replace the body of `merge_pools(%{"pools" => list_pools}, %{"pools" => status_pools})` with:
```elixir
defp merge_pools(%{"pools" => list_pools}, %{"pools" => status_pools}) do
Enum.map(list_pools, fn {name, list_info} ->
status_info = Map.get(status_pools, name, %{})
raw_vdevs = Map.get(status_info, "vdevs", %{}) |> Map.values()
vdevs = Enum.map(raw_vdevs, &vdev_summary/1)
%{
name: name,
health: Map.get(list_info, "health"),
size_bytes: Map.get(list_info, "size", 0),
allocated_bytes: Map.get(list_info, "alloc", 0),
free_bytes: Map.get(list_info, "free", 0),
fragmentation_percent: Map.get(list_info, "frag", 0),
capacity_percent: Map.get(list_info, "cap", 0),
error_count: to_int(Map.get(status_info, "error_count", "0")),
vdev_count: length(vdevs),
degraded_vdev_count: Enum.count(vdevs, &(&1.state != "ONLINE")),
pool_type: derive_pool_type(vdevs),
scan_function: get_in(status_info, ["scan", "function"]),
scan_state: get_in(status_info, ["scan", "state"]),
last_scrub_end: get_in(status_info, ["scan", "end_time"]),
vdevs: vdevs
}
end)
end
defp vdev_summary(v) do
%{
name: Map.get(v, "name"),
type: Map.get(v, "vdev_type"),
state: Map.get(v, "state"),
read_errors: to_int(Map.get(v, "read_errors", "0")),
write_errors: to_int(Map.get(v, "write_errors", "0")),
checksum_errors: to_int(Map.get(v, "checksum_errors", "0"))
}
end
@data_vdev_types ~w(mirror raidz1 raidz2 raidz3 disk)
@special_vdev_types ~w(log cache spare dedup special)
defp derive_pool_type(vdevs) do
data_types =
vdevs
|> Enum.map(& &1.type)
|> Enum.reject(&(&1 in @special_vdev_types))
|> Enum.uniq()
case data_types do
[] -> "unknown"
["disk"] -> "stripe"
[t] when t in @data_vdev_types -> t
_ -> "mixed"
end
end
```
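The code above calls `to_int/1`, which this plan does not define; presumably it already exists in `zfs.ex`, since the existing test compares `error_count` against an integer while the fixture stores `"0"`. As a rough sketch only, assuming counters arrive as decimal strings or integers, such a helper might look like this (`ToIntSketch` is a hypothetical module name, not part of the codebase):

```elixir
defmodule ToIntSketch do
  # Hypothetical stand-in for the collector's existing to_int/1.
  # Integers pass through unchanged.
  def to_int(n) when is_integer(n), do: n

  # Strings are parsed as decimal; unparsable input falls back to 0.
  def to_int(s) when is_binary(s) do
    case Integer.parse(s) do
      {n, _rest} -> n
      :error -> 0
    end
  end

  # Anything else (nil, floats, maps) also falls back to 0.
  def to_int(_), do: 0
end

IO.inspect(ToIntSketch.to_int("2"))
IO.inspect(ToIntSketch.to_int(nil))
```

If the real helper behaves differently on malformed input, the `Map.get(v, "read_errors", "0")` defaults still keep missing counters at zero.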
### - [ ] Step 5: Run tests — expect PASS
Run: `cd agent && mix test test/proxmox_agent/collectors/zfs_test.exs`
Expected: all tests pass.
### - [ ] Step 6: Commit
```bash
git add agent/lib/proxmox_agent/collectors/zfs.ex \
agent/test/proxmox_agent/collectors/zfs_test.exs \
agent/test/fixtures/zfs/zpool_status.json
git commit -m "feat(agent): enrich zpool summary with type, scan state, vdev list"
```
---
## Task 2: Agent collector — stripe, mixed, and special-vdev coverage
**Files:**
- Modify: `agent/test/proxmox_agent/collectors/zfs_test.exs`
### - [ ] Step 1: Add test for plain stripe, mixed layout, and special-vdev filtering
Append this block inside `describe "collect_pools/1"` in `agent/test/proxmox_agent/collectors/zfs_test.exs`:
```elixir
test "classifies pool_type for stripe, mixed, and special vdevs" do
list_json =
Jason.encode!(%{
"pools" => %{
"stripe" => %{"name" => "stripe", "size" => 1, "alloc" => 0, "free" => 1,
"frag" => 0, "cap" => 0, "health" => "ONLINE"},
"mixed" => %{"name" => "mixed", "size" => 1, "alloc" => 0, "free" => 1,
"frag" => 0, "cap" => 0, "health" => "ONLINE"},
"mirror_with_log" => %{"name" => "mirror_with_log", "size" => 1, "alloc" => 0, "free" => 1,
"frag" => 0, "cap" => 0, "health" => "ONLINE"}
}
})
vdev = fn name, type ->
{name, %{"name" => name, "vdev_type" => type, "state" => "ONLINE",
"read_errors" => "0", "write_errors" => "0", "checksum_errors" => "0"}}
end
status_json =
Jason.encode!(%{
"pools" => %{
"stripe" => %{
"name" => "stripe", "state" => "ONLINE", "error_count" => "0",
"vdevs" => Map.new([vdev.("sda", "disk"), vdev.("sdb", "disk")])
},
"mixed" => %{
"name" => "mixed", "state" => "ONLINE", "error_count" => "0",
"vdevs" => Map.new([vdev.("mirror-0", "mirror"), vdev.("raidz1-1", "raidz1")])
},
"mirror_with_log" => %{
"name" => "mirror_with_log", "state" => "ONLINE", "error_count" => "0",
"vdevs" => Map.new([vdev.("mirror-0", "mirror"), vdev.("log-0", "log")])
}
}
})
runner = fn
"zpool", ["list" | _] -> {:ok, list_json}
"zpool", ["status" | _] -> {:ok, status_json}
end
sample = Zfs.collect_pools(runner: runner)
by_name = Map.new(sample.pools, &{&1.name, &1})
assert by_name["stripe"].pool_type == "stripe"
assert by_name["mixed"].pool_type == "mixed"
assert by_name["mirror_with_log"].pool_type == "mirror"
# log vdev is retained in the per-pool vdevs list even though it's ignored for layout classification
assert Enum.any?(by_name["mirror_with_log"].vdevs, &(&1.type == "log"))
end
```
### - [ ] Step 2: Run the new test — expect PASS (collector already implements the logic)
Run: `cd agent && mix test test/proxmox_agent/collectors/zfs_test.exs`
Expected: all tests pass, including the new case.
### - [ ] Step 3: Commit
```bash
git add agent/test/proxmox_agent/collectors/zfs_test.exs
git commit -m "test(agent): cover stripe, mixed, and special-vdev pool_type classification"
```
---
## Task 3: UI — capacity bar CSS
**Files:**
- Modify: `server/assets/css/app.css`
### - [ ] Step 1: Add `.capbar` rules after the existing `.pool-row` block
In `server/assets/css/app.css`, locate the `.pool-row` rules (around lines 249-258) and insert the following immediately after them:
```css
.capbar {
height: 4px;
background: var(--panel-2);
border-radius: 2px;
overflow: hidden;
margin: 0.25rem 0 0.4rem;
}
.capbar > span {
display: block;
height: 100%;
background: var(--ok);
transition: width 0.3s ease;
}
.capbar[data-level="warn"] > span { background: var(--warn); }
.capbar[data-level="crit"] > span { background: var(--crit); }
.pool-block {
padding: 0.6rem 0.9rem;
border-bottom: 1px solid var(--border);
}
.pool-block:last-child { border-bottom: none; }
.pool-block .head {
display: flex;
justify-content: space-between;
align-items: baseline;
gap: 0.6rem;
}
.pool-block .head .layout { color: var(--muted); font-size: 0.8rem; margin-left: 0.5rem; }
.pool-block .sizes { font-family: var(--mono); font-size: 0.78rem; color: var(--fg); }
.pool-block .details { color: var(--muted); font-family: var(--mono); font-size: 0.78rem; }
```
### - [ ] Step 2: Commit
```bash
git add server/assets/css/app.css
git commit -m "style(ui): capacity bar and per-pool block styles"
```
---
## Task 4: UI — render per-pool block with type, capacity bar, sizes, scrub state
**Files:**
- Modify: `server/lib/server_web/live/host_detail_live.ex`
- Modify: `server/test/server_web/live/host_detail_live_test.exs`
### - [ ] Step 1: Extend the LiveView test fixture with the new fields and add assertions
In `server/test/server_web/live/host_detail_live_test.exs`, replace the `"zfs_pools"` block inside the `fast` fixture (currently ~lines 15-25) with:
```elixir
"zfs_pools" => %{
"pools" => [
%{
"name" => "rpool",
"health" => "ONLINE",
"pool_type" => "mirror",
"size_bytes" => 500_000_000_000,
"allocated_bytes" => 200_000_000_000,
"free_bytes" => 300_000_000_000,
"capacity_percent" => 40,
"fragmentation_percent" => 17,
"error_count" => 0,
"vdev_count" => 1,
"degraded_vdev_count" => 0,
"scan_function" => "scrub",
"scan_state" => "FINISHED",
"last_scrub_end" => "Sat Apr 19 02:00:00 2026",
"vdevs" => [
%{"name" => "mirror-0", "type" => "mirror", "state" => "ONLINE",
"read_errors" => 0, "write_errors" => 0, "checksum_errors" => 0}
]
}
]
},
```
Then in the `"renders sections..."` test, after the existing assertions, add:
```elixir
assert html =~ "mirror"
assert html =~ "465.7 GB" # size_bytes formatted
assert html =~ "186.3 GB" # allocated_bytes formatted
assert html =~ "279.4 GB" # free_bytes formatted
assert html =~ "capbar"
assert html =~ "scrub"
```
(How `format_bytes/1` arrives at these values: the helper divides by 1024 per step, so the decimal byte counts come out binary-scaled but are printed with a `GB` suffix. 500 GB decimal → 465.7 GiB; 200 GB → 186.3 GiB; 300 GB → 279.4 GiB.)
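The real `format_bytes/1` lives in the server code and is not shown in this plan. The following is only a sketch of the arithmetic described in the note above, assuming the helper divides by 1024 per unit step and prints one decimal place; `FormatBytesSketch` and its unit list are hypothetical.

```elixir
defmodule FormatBytesSketch do
  # Assumed unit labels; the real helper may use a different set.
  @units ~w(B KB MB GB TB PB)

  def format_bytes(bytes) when is_integer(bytes) and bytes >= 0 do
    scale(bytes * 1.0, @units)
  end

  # Stop at the last unit, or as soon as the value drops below 1024.
  defp scale(value, [unit]), do: render(value, unit)
  defp scale(value, [unit | _rest]) when value < 1024, do: render(value, unit)
  defp scale(value, [_unit | rest]), do: scale(value / 1024, rest)

  # One decimal place, then the unit label.
  defp render(value, unit) do
    :erlang.float_to_binary(value, decimals: 1) <> " " <> unit
  end
end

IO.puts(FormatBytesSketch.format_bytes(500_000_000_000))
IO.puts(FormatBytesSketch.format_bytes(200_000_000_000))
IO.puts(FormatBytesSketch.format_bytes(300_000_000_000))
```

Under these assumptions the three fixture values come out as `465.7 GB`, `186.3 GB`, and `279.4 GB`, matching the assertions above.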
### - [ ] Step 2: Run LiveView test — expect FAIL
Run: `cd server && mix test test/server_web/live/host_detail_live_test.exs`
Expected: `"renders sections..."` fails on the new assertions (`"mirror"`, sizes, `"capbar"`).
### - [ ] Step 3: Replace the pool-rendering block in the LiveView
In `server/lib/server_web/live/host_detail_live.ex`, replace the panel that renders ZFS pools (current lines 65-87, the `<div class="panel">` containing `<header><span>ZFS pools</span>` down through its closing `</div>`) with:
```heex
<div class="panel">
<header><span>ZFS pools</span><span class="mono">{length(pools(@fast))}</span></header>
<div class="body tight">
<div :if={pools(@fast) == []} class="empty">No data.</div>
<div :for={pool <- pools(@fast)} class="pool-block">
<div class="head">
<div>
<span class="mono" style="color: var(--fg-bright); font-weight: 600;">{pool["name"]}</span>
<span class="layout">{pool_layout(pool)}</span>
</div>
<span class="badge" style={pool_badge_style(pool["health"])}>{pool["health"]}</span>
</div>
<div class="capbar" data-level={capbar_level(pool["capacity_percent"])}>
<span style={"width: #{pool["capacity_percent"] || 0}%"}></span>
</div>
<div class="sizes">
used {format_bytes(pool["allocated_bytes"] || 0)} ·
free {format_bytes(pool["free_bytes"] || 0)} ·
total {format_bytes(pool["size_bytes"] || 0)}
<span class="muted">({pool["capacity_percent"] || 0}%)</span>
</div>
<div class="details">
frag {pool["fragmentation_percent"] || 0}% ·
err {pool["error_count"] || 0} ·
vdevs {pool["vdev_count"] || 0} (deg {pool["degraded_vdev_count"] || 0}) ·
{pool_scrub_line(pool)}
</div>
<div :for={v <- degraded_vdevs(pool)} class="callout err" style="margin-top: 0.4rem;">
{v["name"]} {v["state"]} · r={v["read_errors"]} w={v["write_errors"]} cksum={v["checksum_errors"]}
</div>
</div>
</div>
</div>
```
Then add these helper functions near the other private helpers (after `pool_badge_style/1`):
```elixir
defp pool_layout(pool) do
case pool["pool_type"] do
nil -> "—"
"" -> "—"
t -> t
end
end
defp capbar_level(cap) when is_number(cap) and cap >= 90, do: "crit"
defp capbar_level(cap) when is_number(cap) and cap >= 80, do: "warn"
defp capbar_level(_), do: "ok"
defp pool_scrub_line(%{"scan_state" => "SCANNING"}), do: "scrub scanning"
defp pool_scrub_line(%{"scan_state" => "FINISHED", "last_scrub_end" => end_time})
when is_binary(end_time) and end_time != "",
do: "scrub #{end_time}"
defp pool_scrub_line(%{"last_scrub_end" => end_time}) when is_binary(end_time) and end_time != "",
do: "scrub #{end_time}"
defp pool_scrub_line(_), do: "scrub never"
defp degraded_vdevs(pool) do
(pool["vdevs"] || [])
|> Enum.filter(fn v -> Map.get(v, "state") not in [nil, "ONLINE"] end)
end
```
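The clause order in these helpers decides the edge cases, so it can help to poke at them in isolation. Below is a sketch with public copies of two of the helpers in a hypothetical `PoolHelpersSketch` module (the real ones stay private in the LiveView). The `FINISHED` clause is folded into the generic `last_scrub_end` clause here, since both return the same string.

```elixir
defmodule PoolHelpersSketch do
  # Thresholds copied from the plan: 90 and above is "crit", 80 and above is "warn".
  def capbar_level(cap) when is_number(cap) and cap >= 90, do: "crit"
  def capbar_level(cap) when is_number(cap) and cap >= 80, do: "warn"
  def capbar_level(_), do: "ok"

  # A running scan wins over any stale end time.
  def pool_scrub_line(%{"scan_state" => "SCANNING"}), do: "scrub scanning"

  def pool_scrub_line(%{"last_scrub_end" => end_time})
      when is_binary(end_time) and end_time != "",
      do: "scrub #{end_time}"

  def pool_scrub_line(_), do: "scrub never"
end

# Boundary values around the two thresholds, plus a missing value.
for cap <- [79, 80, 89.9, 90, nil] do
  IO.inspect({cap, PoolHelpersSketch.capbar_level(cap)})
end

IO.puts(PoolHelpersSketch.pool_scrub_line(%{"scan_state" => "SCANNING", "last_scrub_end" => "old"}))
IO.puts(PoolHelpersSketch.pool_scrub_line(%{}))
```

A missing or non-numeric `capacity_percent` falls through to `"ok"`, which matches the `|| 0` width fallback in the template.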
### - [ ] Step 4: Run LiveView test — expect PASS
Run: `cd server && mix test test/server_web/live/host_detail_live_test.exs`
Expected: all tests pass.
### - [ ] Step 5: Run the full server suite to catch regressions
Run: `cd server && mix test`
Expected: all tests pass.
### - [ ] Step 6: Run the full agent suite to catch regressions
Run: `cd agent && mix test`
Expected: all tests pass.
### - [ ] Step 7: Manual visual check (dev server)
Start the server locally (`cd server && mix phx.server`), log in, open a host detail page with live agent data, and confirm:
- Each pool shows `name pool_type` on line 1 with the health badge on the right.
- The capacity bar renders at the correct width and turns yellow/red at 80% / 90%.
- `used / free / total` line shows bytes formatted like `200.0 GB`.
- The `details` line shows frag/err/vdevs and a scrub label (`scrub <end time>`, `scrub scanning`, or `scrub never`).
- Degraded pools list each non-ONLINE vdev in a red `.callout.err` line; ONLINE pools don't.
If the manual check reveals a rendering issue, fix it in `host_detail_live.ex` and re-run `cd server && mix test`.
### - [ ] Step 8: Commit
```bash
git add server/lib/server_web/live/host_detail_live.ex \
server/test/server_web/live/host_detail_live_test.exs
git commit -m "feat(ui): detailed per-pool block with type, capacity bar, scrub state"
```