Labeling & Asset Management Masterclass
This guide covers everything about properly naming, labeling, tagging, and managing infrastructure assets on OpenZFS — from the sticker on a physical drive to automated fleet inventory pulled directly from ZFS properties. It starts at the physical layer and ends with a fully automated CMDB built from filesystem metadata.
By the end, every disk in your fleet has a scannable QR code linking to its RMA URL, every pool name tells you its environment, region, purpose, SLA, and media class, every dataset carries machine-readable tags that drive backup, replication, and alerting policy automatically — and a new operator can walk up to any rack in any datacenter and know exactly what they are looking at without opening a single wiki page.
Why labeling is the foundation of operations: At 3 AM when a disk fails, you need to know: which pool, which vdev, which slot, which rack, which vendor, and where to order the replacement. Without labels, you are guessing. With labels, you are executing. The difference between a homelab and production is not the hardware — it is the labeling. A homelab has tank and /dev/sda. Production has prd-caw1-db-gold-nvme and UB-DSK-CAW1-88322. The second tells you everything: production, CA-West-1, database tier, gold SLA, NVMe class. The first tells you nothing.
What this masterclass builds: A complete labeling system — physical disk labels, pool naming conventions, ZFS custom properties as infrastructure tags, dataset hierarchies, and automated inventory — everything needed to run a production environment where any operator can walk up to any rack and know exactly what they are looking at. This masterclass teaches you to name things so the name IS the documentation.
1. Physical Disk Labels — What Goes on the Sticker
A physical disk label is not a bureaucratic nicety. It is the first line of incident response. When a drive fails, you walk to the rack, pull the right drive, and start the replacement. The label on that drive contains everything you need to do the next ten steps without stopping to look anything up.
The complete label template used in production:
PHYSICAL LOCATION
Region: CA-WEST-1
Datacenter: YVR01
Building: A
Row: R12
Rack: R12-08
Chassis: CH01
Slot: SLOT07
ZFS INFORMATION
ZFS Pool: prd-caw1-db-gold-nvme
VDEV: slot07
Role: DATA VDEV
Layout: draid2:10d:2c:128s
HARDWARE DETAILS
Vendor: Samsung
Model: PM9A3
Interface: NVMe PCIe 4.0 x4
Capacity: 3.84 TB
Serial: S6ZUNX0R123456A
Firmware: EDA92Q5Q
SMART Base: 2025-01-01
LIFECYCLE / INVENTORY
Asset ID: UB-DSK-CAW1-88322
Installed: 2025-02-12
Warranty: 2028-02-12
Supplier: CDW Canada
RMA URL: https://cdw.ca/rma/S6ZUNX0R123456A
Reorder URL: https://cdw.ca/p/samsung-pm9a3-3.84tb/PM9A3-3840
What each field means and why it matters
Physical Location — the drill-down hierarchy that gets a tech to the right chassis
slot. Region is the geographic zone (aligned with your WireGuard mesh topology so the
naming is consistent across infrastructure layers). Datacenter is the facility code.
Building, Row, Rack, Chassis, and Slot are the physical path. A tech who has never been
to this datacenter reads YVR01 / A / R12 / R12-08 / CH01 / SLOT07 and walks directly
to the right drive without asking anyone for directions.
ZFS Information — what OpenZFS knows about this drive. The pool name encodes
environment, location, role, tier, and media class (covered in section 3). The VDEV name
matches the physical slot identifier so you can correlate zpool status output directly
to the label. The Role field tells you whether this is a data vdev, a spare, a SLOG, or
an L2ARC — which matters when you are deciding whether to pull the drive immediately or
let resilver finish first. The Layout field records the exact dRAID or RAIDZ geometry at
the time of installation.
Hardware Details — the vendor, model, interface, and firmware needed for procurement and warranty claims. The Serial is the primary key for warranty and RMA lookups. SMART Base records the date when baseline SMART data was captured so you can compute drive age and track normalized attribute degradation over time.
Lifecycle / Inventory — the operational fields. Asset ID follows a structured scheme (covered below). Installed and Warranty dates let you compute time-to-warranty-expiry without a spreadsheet. The Supplier field tells you who to call. RMA URL and Reorder URL are the killer features: a tech scans the QR code, taps the RMA URL, and the return authorization process starts immediately from their phone.
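The time-to-warranty-expiry arithmetic mentioned above can be sketched in a few lines of shell. This is a minimal illustration, assuming GNU `date` (for `-d`); the function name `days_between` is hypothetical and not part of any tool described here.

```shell
#!/bin/bash
# days_between START END
# Prints the number of whole days from START to END (both YYYY-MM-DD, UTC).
# Assumes GNU date; the function name is illustrative only.
days_between() {
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1
  echo $(( (end - start) / 86400 ))
}

# Dates from the example label: installed 2025-02-12, warranty 2028-02-12
days_between 2025-02-12 2028-02-12
```

Passing today's date as the first argument (`days_between "$(date -u +%F)" 2028-02-12`) gives the remaining warranty in days; a negative result means the warranty has already expired.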
Asset ID structure
The Asset ID format UB-DSK-CAW1-88322 encodes: organization prefix (UB), asset
class (DSK for disk, SRV for server, NET for network gear, PDU for power), region
code (CAW1), and a sequential 5-digit number within that region and class. The sequential
number is assigned at procurement — not at installation — so an ordered drive already has
an asset ID before it arrives, and the label can be printed before the drive ships.
# Asset ID prefix table
UB-SRV-{REGION}-{SEQ} — server (compute node)
UB-DSK-{REGION}-{SEQ} — disk (any storage medium)
UB-NET-{REGION}-{SEQ} — network device (switch, router, ToR)
UB-PDU-{REGION}-{SEQ} — PDU or UPS
UB-CAB-{REGION}-{SEQ} — cable or patch panel
UB-JBD-{REGION}-{SEQ} — JBOD enclosure
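A structured ID is only useful if malformed ones are rejected before they reach a label printer. The sketch below splits an Asset ID into its four parts and validates each against the prefix table above. The function name `parse_asset_id` and the `key=value` output format are assumptions made for illustration.

```shell
#!/bin/bash
# parse_asset_id ID
# Splits an Asset ID such as UB-DSK-CAW1-88322 and validates each part.
# Prints "org=... class=... region=... seq=..." or returns 1 if malformed.
parse_asset_id() {
  id=$1
  # Peel fields off the right-hand side, one hyphen at a time
  seq=${id##*-};    rest=${id%-*}
  region=${rest##*-}; rest=${rest%-*}
  class=${rest##*-};  org=${rest%-*}

  [ "$org" = "UB" ] || return 1
  case $class in SRV|DSK|NET|PDU|CAB|JBD) ;; *) return 1 ;; esac
  case $region in ''|*[!A-Z0-9]*) return 1 ;; esac
  case $seq in [0-9][0-9][0-9][0-9][0-9]) ;; *) return 1 ;; esac

  echo "org=$org class=$class region=$region seq=$seq"
}

parse_asset_id UB-DSK-CAW1-88322
```

Because the fields are peeled from the right, an ID with extra hyphens leaves a hyphen in the organization prefix and fails the first check.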
QR code: JSON metadata on the label
The QR code on the label encodes the full drive record as JSON. Any phone can scan it. The JSON is the same structure used by the inventory database, so a scan is also an inventory lookup:
{
"asset_id": "UB-DSK-CAW1-88322",
"serial": "S6ZUNX0R123456A",
"zfs": {
"pool": "prd-caw1-db-gold-nvme",
"vdev": "slot07",
"role": "data",
"layout": "draid2:10d:2c:128s"
},
"location": {
"region": "CA-WEST-1",
"datacenter": "YVR01",
"building": "A",
"row": "R12",
"rack": "R12-08",
"chassis": "CH01",
"slot": "SLOT07"
},
"hardware": {
"vendor": "Samsung",
"model": "PM9A3",
"capacity_tb": 3.84,
"interface": "NVMe",
"firmware": "EDA92Q5Q"
},
"lifecycle": {
"installed": "2025-02-12",
"warranty_expiry": "2028-02-12",
"supplier": "CDW Canada"
}
}
Generating labels from live system data
Labels should be generated from actual hardware data, not typed by hand. This script pulls SMART data, correlates it with ZFS pool membership, and outputs the label text ready to print:
#!/bin/bash
# gen-disk-label.sh — generate a disk label from live system data
# Usage: gen-disk-label.sh /dev/disk/by-id/nvme-Samsung_PM9A3_S6ZUNX0R123456A
#
# Requires: smartmontools, jq, zpool
DISK="$1"
if [[ -z "$DISK" ]]; then
echo "Usage: $0 /dev/disk/by-id/..." >&2
exit 1
fi
# Resolve to real device
REALDEV=$(realpath "$DISK")
# Pull SMART data
SMART=$(smartctl -j -a "$REALDEV" 2>/dev/null)
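# .device.type is the transport ("nvme", "sat"), not the vendor. As an
# approximation, take the first word of the model name as the vendor.
VENDOR=$(echo "$SMART" | jq -r '(.model_name // "unknown") | split(" ")[0]')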
MODEL=$(echo "$SMART" | jq -r '.model_name // "unknown"')
SERIAL=$(echo "$SMART" | jq -r '.serial_number // "unknown"')
FIRMWARE=$(echo "$SMART" | jq -r '.firmware_version // "unknown"')
CAPACITY=$(echo "$SMART" | jq -r '(.user_capacity.bytes // 0) / 1e12 | . * 100 | round / 100 | tostring + " TB"')
# Find which pool this disk belongs to
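# -L makes zpool status print kernel device names (nvme0n1) instead of symlink names
DEVNAME=$(basename "$REALDEV")
POOL=$(zpool status -L | awk -v dev="$DEVNAME" '
/^ *pool:/ { pool=$2 }
$1 == dev { print pool; exit }
')
# The vdev label is the by-slot symlink (see section 3) that points at this device
VDEV=unknown
for link in /dev/disk/by-slot/*; do
  [[ "$(realpath "$link" 2>/dev/null)" == "$REALDEV" ]] && VDEV=$(basename "$link")
done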
# Output label text
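# Location and lifecycle fields come from the environment (REGION, DC, RACK, ASSET_ID).
# Add the remaining template fields the same way.
cat <<EOF
PHYSICAL LOCATION
Region: ${REGION:-unknown}
Datacenter: ${DC:-unknown}
Rack: ${RACK:-unknown}
ZFS INFORMATION
ZFS Pool: ${POOL:-unknown}
VDEV: ${VDEV:-unknown}
HARDWARE DETAILS
Vendor: $VENDOR
Model: $MODEL
Capacity: $CAPACITY
Serial: $SERIAL
Firmware: $FIRMWARE
LIFECYCLE / INVENTORY
Asset ID: ${ASSET_ID:-unknown}
EOF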
Set the environment variables (REGION, DC, RACK, ASSET_ID, etc.) from your
provisioning system or a per-rack config file before running. The script fills in
everything it can from live hardware data automatically.
2. Pool Naming Conventions — The Name IS the Documentation
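A ZFS pool name is effectively permanent. A pool cannot be renamed while it is imported: the only way is to export it and re-import it under the new name (zpool export old-name, then zpool import old-name new-name), which takes every dataset offline and breaks every mountpoint, script, and replication target that references the old name. Get the naming convention right before creating the first pool, because you are living with it.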
The convention: {env}-{region}-{role}-{tier}-{media}
env — environment
prd production — stg staging — dev development — tst test — dr disaster recovery
The first token tells you the blast radius. Never confuse prd with dev.
region — geographic zone
caw1 CA-West-1 — use1 US-East-1 — euw1 EU-West-1 — aps1 AP-South-1
Matches your WireGuard mesh region codes. Consistent naming across layers.
role — workload purpose
db database — web web servers — stor object/file storage — vm virtual machines — k8s Kubernetes — mon monitoring — bak backup
The purpose of the data, not the technology. Databases go on db pools regardless of engine.
tier — SLA class
gold: highest durability, mirrored or dRAID2+, replicated to DR
silver: standard production, RAIDZ2, replicated daily
bronze: dev/test/backup, RAIDZ1 or single disk, no DR target
The tier drives the backup frequency, replication target, and monitoring sensitivity.
media — storage class
nvme NVMe SSD — ssd SATA/SAS SSD — hdd spinning disk — mix heterogeneous (SLOG on NVMe, data on HDD)
Media class tells you the expected I/O characteristics without running benchmarks.
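Because the convention is positional, a pool name can be checked mechanically before `zpool create` is ever run. The following is a minimal sketch of such a check. The function name `parse_pool_name` is hypothetical, and the allowed values are copied from the token lists above; the region token is only checked for shape, since the set of regions is site-specific.

```shell
#!/bin/bash
# parse_pool_name NAME
# Validates {env}-{region}-{role}-{tier}-{media} and prints the parts.
# Returns 1 if any token is missing or not in the allowed set.
parse_pool_name() {
  name=$1
  # Peel tokens off the right-hand side
  media=${name##*-};  rest=${name%-*}
  tier=${rest##*-};   rest=${rest%-*}
  role=${rest##*-};   rest=${rest%-*}
  region=${rest##*-}; env=${rest%-*}

  case $env in prd|stg|dev|tst|dr) ;; *) return 1 ;; esac
  case $region in ''|*[!a-z0-9]*) return 1 ;; esac
  case $role in db|web|stor|vm|k8s|mon|bak) ;; *) return 1 ;; esac
  case $tier in gold|silver|bronze) ;; *) return 1 ;; esac
  case $media in nvme|ssd|hdd|mix) ;; *) return 1 ;; esac

  echo "env=$env region=$region role=$role tier=$tier media=$media"
}

parse_pool_name prd-caw1-db-gold-nvme
```

A provisioning wrapper could refuse to create any pool whose name fails this check, which keeps names like tank out of production by construction.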
Examples
# Production examples
prd-caw1-db-gold-nvme # production, CA-West-1, database, gold SLA, NVMe
prd-caw1-vm-gold-nvme # production, CA-West-1, VMs, gold SLA, NVMe
prd-caw1-stor-silver-hdd # production, CA-West-1, storage, silver SLA, HDD
prd-use1-db-gold-nvme # production, US-East-1, database, gold SLA, NVMe
# Staging examples
stg-caw1-db-silver-ssd # staging, CA-West-1, database, silver SLA, SSD
stg-use1-web-bronze-ssd # staging, US-East-1, web, bronze SLA, SSD
# Development examples
dev-use1-web-bronze-ssd # development, US-East-1, web, bronze SLA, SSD
dev-caw1-db-bronze-ssd # development, CA-West-1, database, bronze SLA, SSD
# DR site examples
dr-euw1-bak-silver-hdd # DR, EU-West-1, backup, silver SLA, HDD
dr-aps1-db-gold-nvme # DR, AP-South-1, database, gold SLA, NVMe
After adopting this convention, zpool list becomes an infrastructure overview:
$ zpool list
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH
dr-euw1-bak-silver-hdd 120T 67.2T 52.8T - - 12% 56% 1.00 ONLINE
prd-caw1-db-gold-nvme 46T 38.1T 7.90T - - 3% 82% 1.00 ONLINE
prd-caw1-stor-silver-hdd 240T 189T 51.0T - - 18% 78% 1.00 ONLINE
prd-caw1-vm-gold-nvme 92T 71.3T 20.7T - - 2% 77% 1.00 ONLINE
prd-use1-db-gold-nvme 46T 21.4T 24.6T - - 1% 46% 1.00 ONLINE
stg-caw1-db-silver-ssd 7.68T 3.1T 4.58T - - 1% 40% 1.00 ONLINE
Six pools, six lines. You see every environment, every region, every role, every SLA tier, every media class — without opening any documentation. The database team's gold NVMe pool in CA-West-1 is at 82% capacity. You know to order disks. No Grafana required to see that.
An operator reads prd-caw1-db-gold-nvme and knows exactly what it is without any documentation. A pool named tank or data or storage1 tells you nothing. When that pool is at 82% capacity and you need to explain urgency to a manager at 11 PM, prd-caw1-db-gold-nvme carries the urgency itself. tank does not.
3. VDEV Naming and Disk Identification
The most dangerous thing you can do with ZFS is create a pool using raw device names
like /dev/sda, /dev/sdb, /dev/nvme0n1. Device names are ephemeral. They are
assigned at boot time by the kernel based on discovery order. Add a USB drive to a
server and everything shifts. Replace a failing drive and the replacement gets a
different device name. The pool continues to function, but your mental model of which
physical disk is which is now wrong.
Use /dev/disk/by-id/ for persistent identification
The /dev/disk/by-id/ path is stable across reboots. It is derived from the device's
serial number, which is burned into the hardware at manufacture:
# Never do this — device name can change between reboots
zpool create tank sda sdb sdc sdd
# Always do this — stable across reboots and replacements
zpool create prd-caw1-db-gold-nvme \
draid2:10d:2c:128s \
/dev/disk/by-id/nvme-Samsung_PM9A3_S6ZUNX0R123456A \
/dev/disk/by-id/nvme-Samsung_PM9A3_S6ZUNX0R789012B \
/dev/disk/by-id/nvme-Samsung_PM9A3_S6ZUNX0R345678C \
... (all 12 drives)
# Verify what's in the pool after creation
zpool status -v prd-caw1-db-gold-nvme
VDEV labels: use slot identifiers
OpenZFS allows you to assign friendly names to vdevs using udev rules. Map physical
enclosure slots to names that match your label format, so zpool status output reads
in terms of slots — the same identifiers that are on the physical labels:
# /etc/udev/rules.d/99-zfs-slots.rules
# Maps NVMe PCI paths (ID_PATH) to slot names
# Generate these by running: ls -la /dev/disk/by-path/ | grep nvme
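# Each slot needs two rules: one for the whole disk and one for its partitions.
# OpenZFS partitions a whole-disk vdev and then opens <name>-part1. Without the
# DEVTYPE match, the disk and its partitions would all claim the same symlink.
KERNEL=="nvme*", SUBSYSTEM=="block", ENV{DEVTYPE}=="disk", \
ENV{ID_PATH}=="pci-0000:01:00.0-nvme-1", SYMLINK+="disk/by-slot/slot01"
KERNEL=="nvme*", SUBSYSTEM=="block", ENV{DEVTYPE}=="partition", \
ENV{ID_PATH}=="pci-0000:01:00.0-nvme-1", SYMLINK+="disk/by-slot/slot01-part%n"
KERNEL=="nvme*", SUBSYSTEM=="block", ENV{DEVTYPE}=="disk", \
ENV{ID_PATH}=="pci-0000:02:00.0-nvme-1", SYMLINK+="disk/by-slot/slot02"
KERNEL=="nvme*", SUBSYSTEM=="block", ENV{DEVTYPE}=="partition", \
ENV{ID_PATH}=="pci-0000:02:00.0-nvme-1", SYMLINK+="disk/by-slot/slot02-part%n"
KERNEL=="nvme*", SUBSYSTEM=="block", ENV{DEVTYPE}=="disk", \
ENV{ID_PATH}=="pci-0000:03:00.0-nvme-1", SYMLINK+="disk/by-slot/slot03"
KERNEL=="nvme*", SUBSYSTEM=="block", ENV{DEVTYPE}=="partition", \
ENV{ID_PATH}=="pci-0000:03:00.0-nvme-1", SYMLINK+="disk/by-slot/slot03-part%n"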
# ... continue for all slots
# After creating rules, reload udev
udevadm control --reload-rules
udevadm trigger
# Create pool using slot names — now zpool status shows slot01, slot02
zpool create prd-caw1-db-gold-nvme \
draid2:10d:2c:128s \
/dev/disk/by-slot/slot01 \
/dev/disk/by-slot/slot02 \
/dev/disk/by-slot/slot03 \
...
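Writing one rule pair per slot by hand gets tedious on a large chassis. As a sketch, the rules can be generated from a plain "slot ID_PATH" map. The function name `emit_slot_rules` and the map format are assumptions for illustration. The generated rules match on `DEVTYPE` so that the whole disk and its partitions get distinct symlinks.

```shell
#!/bin/bash
# emit_slot_rules
# Reads "slot id_path" pairs on stdin (blank lines and # comments are skipped)
# and prints a disk rule plus a partition rule for each slot.
emit_slot_rules() {
  while read -r slot id_path; do
    case $slot in ''|'#'*) continue ;; esac
    printf 'SUBSYSTEM=="block", ENV{DEVTYPE}=="disk", ENV{ID_PATH}=="%s", SYMLINK+="disk/by-slot/%s"\n' \
      "$id_path" "$slot"
    # %%n becomes udev's %n, the kernel partition number
    printf 'SUBSYSTEM=="block", ENV{DEVTYPE}=="partition", ENV{ID_PATH}=="%s", SYMLINK+="disk/by-slot/%s-part%%n"\n' \
      "$id_path" "$slot"
  done
}

# Example map for the first two slots from the rules file above
printf '%s\n' \
  'slot01 pci-0000:01:00.0-nvme-1' \
  'slot02 pci-0000:02:00.0-nvme-1' | emit_slot_rules
```

Redirecting the output into a rules file and reloading udev would then replace the hand-written rules; the map itself can live in version control next to the rack layout.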
With slot-based udev rules in place, zpool status reports:
$ zpool status prd-caw1-db-gold-nvme
pool: prd-caw1-db-gold-nvme
state: ONLINE
config:
NAME STATE READ WRITE CKSUM
prd-caw1-db-gold-nvme ONLINE 0 0 0
draid2:10d:2c:128s ONLINE 0 0 0
slot01 ONLINE 0 0 0
slot02 ONLINE 0 0 0
slot03 FAULTED 5 0 0 too many errors
slot04 ONLINE 0 0 0
...
slot03 is the failed drive. The physical label on slot 3 in chassis CH01 tells you
everything you need to know: the RMA URL, the reorder URL, the pool it belongs to,
the warranty status. You do not need to open any tool other than zpool status.
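Once vdevs are named after slots, pulling the unhealthy ones out of `zpool status` is a single awk filter. The sketch below assumes the `slotNN` naming shown above; the function name `faulted_vdevs` and the list of states treated as unhealthy are illustrative choices.

```shell
#!/bin/bash
# faulted_vdevs
# Reads `zpool status` output on stdin and prints "slot state" for every
# slot-named vdev that is not healthy.
faulted_vdevs() {
  awk '$1 ~ /^slot[0-9]+$/ &&
       $2 ~ /^(FAULTED|DEGRADED|UNAVAIL|REMOVED|OFFLINE)$/ { print $1, $2 }'
}

# Typical use:
#   zpool status prd-caw1-db-gold-nvme | faulted_vdevs
# Demonstration on two lines of captured output:
printf '%s\n' \
  '    slot02  ONLINE   0 0 0' \
  '    slot03  FAULTED  5 0 0  too many errors' | faulted_vdevs
```

The slot name it prints is the same identifier that is on the drive's sticker, so the output can go straight into a page or ticket.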
Mapping physical slots: 24-disk JBOD example
#!/bin/bash
# map-jbod-slots.sh — print slot to device mapping for a JBOD enclosure
# Requires sg3-utils for enclosure management
# List all SES (SCSI Enclosure Services) devices
for enc in /sys/class/enclosure/*/; do
encdev=$(basename "$enc")
echo "Enclosure: $encdev"
# List slots and their current disk
for slot in "$enc"*/; do
slotnum=$(basename "$slot")
# Get the disk in this slot
if [[ -L "$slot/device" ]]; then
diskdev=$(ls "$slot/device/block/" 2>/dev/null | head -1)
if [[ -n "$diskdev" ]]; then
serial=$(cat /sys/block/"$diskdev"/device/serial 2>/dev/null || \
smartctl -i /dev/"$diskdev" | awk '/Serial/{print $NF}')
echo " slot$slotnum -> /dev/$diskdev serial: $serial"
else
echo " slot$slotnum -> empty"
fi
fi
done
done
Addressing a 24-disk JBOD as /dev/sda through /dev/sdx is a ticking time bomb. Plug in a USB drive during a maintenance window and /dev/sda may shift. Replace a drive under pressure at 3 AM and the mapping changes. The pool continues to work, because OpenZFS tracks its members by internal vdev GUIDs, but the moment you need to correlate a fault to a physical drive, you are doing detective work instead of executing a procedure. Use /dev/disk/by-id/ always. Add udev slot rules for large JBODs. The 20 minutes spent writing udev rules saves hours of correlation work the first time a drive fails.
4. ZFS Custom Properties: Infrastructure Tags
OpenZFS allows you to set arbitrary key-value metadata on any pool, dataset, or volume. These are called user properties or custom properties. They are stored inside ZFS itself — no separate database, no API, no sync service. They survive snapshots, clones, and replication. A dataset tagged in Vancouver arrives at your DR site in Frankfurt with every tag intact.
The namespace convention
Custom properties must be namespaced with a colon. The convention used throughout
kldload is com.kldload:{key}. This prevents collisions with OpenZFS built-in
properties and with other tools that set custom properties:
# Setting a custom property
zfs set com.kldload:region=CA-WEST-1 prd-caw1-db-gold-nvme
# Setting multiple properties at once
zfs set \
com.kldload:region=CA-WEST-1 \
com.kldload:tier=production \
com.kldload:app=postgres \
com.kldload:owner=team-database \
com.kldload:sla=gold \
com.kldload:backup-policy=hourly \
com.kldload:dr-target=dr-euw1-bak-silver-hdd \
com.kldload:cost-center=eng-001 \
prd-caw1-db-gold-nvme/postgres/main
# Reading a single property
zfs get com.kldload:sla prd-caw1-db-gold-nvme/postgres/main
# Reading all custom properties on a dataset
zfs get -r -s local all prd-caw1-db-gold-nvme/postgres/main | grep com.kldload
# Reading a specific property recursively across a pool
zfs get -r -s local com.kldload:tier prd-caw1-db-gold-nvme
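OpenZFS rejects malformed property names at `zfs set` time, but checking them earlier gives clearer errors in provisioning scripts. This sketch encodes the documented rules for user property names: they must contain a colon, may use only lowercase letters, digits, and `: . _ -`, must not begin with a dash, and are limited to 256 characters. The function name `valid_user_prop` is hypothetical.

```shell
#!/bin/bash
# Force the C locale so the a-z range matches lowercase ASCII only
LC_ALL=C; export LC_ALL

# valid_user_prop NAME
# Returns 0 if NAME is an acceptable ZFS user property name, 1 otherwise.
valid_user_prop() {
  name=$1
  [ "${#name}" -le 256 ] || return 1
  case $name in
    -*) return 1 ;;                  # must not begin with a dash
  esac
  case $name in
    *:*) ;;                          # must contain a colon
    *) return 1 ;;
  esac
  case $name in
    *[!a-z0-9:._-]*) return 1 ;;     # lowercase letters, digits, : . _ - only
  esac
  return 0
}

valid_user_prop com.kldload:region && echo "ok: com.kldload:region"
```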
Standard tag library
com.kldload:region
Geographic zone: CA-WEST-1, US-EAST-1, EU-WEST-1. Matches WireGuard mesh region codes and pool naming convention. Drives replication topology.
com.kldload:tier
Environment tier: production, staging, development, testing, dr. Drives alert sensitivity and change control gates.
com.kldload:app
Application name: postgres, nginx, redis, kafka, prometheus. Drives workload-specific tuning and monitoring dashboards.
com.kldload:owner
Team responsible: team-database, team-platform, team-security. Drives quota allocation and alert routing — incidents page the owner.
com.kldload:sla
SLA class: gold, silver, bronze. Drives scrub frequency, snapshot retention, and on-call urgency. Gold pages primary and secondary simultaneously.
com.kldload:backup-policy
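Snapshot schedule: 15min, hourly, daily, weekly, none. This tag IS the backup configuration: the snapshot scripts in section 5 select datasets by it. Sanoid itself is configured through sanoid.conf rather than ZFS properties, so pairing it with this tag means generating that file from the tag values.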
com.kldload:dr-target
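Replication destination: dr-euw1-bak-silver-hdd, dr-aps1-db-gold-nvme, none. The replication script in section 5 reads this tag and hands the destination to syncoid, which does not read it on its own. The tag IS the replication topology.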
com.kldload:cost-center
Cost center code: eng-001, ops-002, fin-003. Drives capacity chargebacks. Total used space per cost center from a single query.
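Since tag values drive automation, a typo such as sla=glod silently drops a dataset out of every policy that selects on it. A small validator for the enumerated tags in the library above can catch that before `zfs set` runs. The function name `valid_tag` is an assumption; free-form tags (region, app, owner, dr-target, cost-center) are accepted unchecked in this sketch.

```shell
#!/bin/bash
# valid_tag KEY VALUE
# KEY is the tag name without the com.kldload: prefix.
# Returns 0 if VALUE is allowed for KEY, 1 otherwise.
valid_tag() {
  key=$1 value=$2
  case $key in
    tier)          allowed="production staging development testing dr" ;;
    sla)           allowed="gold silver bronze" ;;
    backup-policy) allowed="15min hourly daily weekly none" ;;
    *)             return 0 ;;   # free-form tags are not checked here
  esac
  for a in $allowed; do
    [ "$a" = "$value" ] && return 0
  done
  return 1
}

valid_tag sla gold && echo "sla=gold accepted"
valid_tag sla glod || echo "sla=glod rejected"
```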
Properties survive replication
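This is the property that makes ZFS tagging more powerful than any external tag system.
When you replicate a dataset to another host with zfs send -p (or -R) | zfs receive, its
locally set custom properties travel with it (a tag inherited from a parent outside the
stream is not included):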
# Replicate with all properties intact
syncoid --sendoptions="-p" \
prd-caw1-db-gold-nvme/postgres/main \
dr-euw1-bak-silver-hdd/postgres/main
# On the DR host, verify tags arrived
zfs get com.kldload:dr-target dr-euw1-bak-silver-hdd/postgres/main
NAME PROPERTY VALUE SOURCE
dr-euw1-bak-silver-hdd/postgres/main com.kldload:dr-target dr-euw1-bak-silver-hdd received
The SOURCE column shows received — the tag was set on the source and transmitted
via zfs send. No sync job, no separate tagging step, no tag drift between production
and DR.
5. Tag-Based Operations — The Real Power
Tags without automation are bureaucracy. Tags with automation are policy. Every tag applied to a dataset becomes a selector for every automated operation in your infrastructure. Adding a tag to a new dataset automatically enrolls it in backup, replication, monitoring, and quota enforcement — with no configuration file to edit.
Replicate by tag: everything tagged production to DR
#!/bin/bash
# replicate-by-tag.sh — replicate all datasets tagged for a given dr-target
# Usage: replicate-by-tag.sh prd-caw1-db-gold-nvme dr-host.euw1.internal
POOL="$1"
DR_HOST="$2"
# Find all datasets with a dr-target set
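# -t filesystem,volume keeps snapshots (which inherit the tag) out of the list
zfs get -r -H -t filesystem,volume -o name,value com.kldload:dr-target "$POOL" | \
awk -F'\t' '$2 != "-" && $2 != "none"' | \
while IFS=$'\t' read -r dataset target; do
  # Destination = pool named in the tag + the dataset path below the source pool
  DR_POOL="${target%%/*}"
  DR_PATH="${dataset#*/}"
  DEST="${DR_POOL}/${DR_PATH}"
  echo "Replicating $dataset -> $DR_HOST:$DEST"
  syncoid \
    --sendoptions="-p" \
    --no-sync-snap \
    "$dataset" \
    "${DR_HOST}:${DEST}"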
done
Capacity planning by tag: storage usage per team
#!/bin/bash
# capacity-by-owner.sh — total used space per cost-center tag
# Output: cost-center, used-bytes, dataset-count
printf '%-16s %-16s %s\n' "COST-CENTER" "USED-BYTES" "DATASETS"
for pool in $(zpool list -H -o name); do
  # -p prints exact byte counts. usedbydataset excludes children and snapshots,
  # so a parent and its child datasets are not counted twice.
  zfs get -r -Hp -t filesystem,volume -o name,property,value \
    com.kldload:cost-center,usedbydataset "$pool"
done | awk -F'\t' '
$2 == "com.kldload:cost-center" { cc[$1] = $3 }
$2 == "usedbydataset" { used[$1] = $3 }
END {
  for (d in cc) {
    if (cc[d] == "-") continue   # untagged dataset
    total[cc[d]] += used[d]
    count[cc[d]]++
  }
  for (c in total) printf "%-16s %-16.0f %d\n", c, total[c], count[c]
}
' | sort
Snapshot by tag: gold SLA datasets every 15 minutes
#!/bin/bash
# snapshot-by-sla.sh — snapshot all datasets matching a given SLA tag
# Run from cron: */15 * * * * /usr/local/bin/snapshot-by-sla.sh gold
SLA="${1:-gold}"
SNAP_NAME="$(date +%Y%m%d-%H%M)"
for pool in $(zpool list -H -o name); do
# -t filesystem,volume: snapshots inherit the tag but cannot themselves be snapshotted
zfs get -r -H -t filesystem,volume -o name,value com.kldload:sla "$pool" | \
awk -F'\t' -v sla="$SLA" '$2 == sla {print $1}' | \
while read -r dataset; do
zfs snapshot "${dataset}@auto-${SNAP_NAME}"
echo "Snapped: ${dataset}@auto-${SNAP_NAME}"
done
done
Quota enforcement by tag
#!/bin/bash
# apply-quotas.sh — apply quotas from a policy file keyed by owner tag
# Policy file format: owner quota
# Example: team-database 2T
POLICY_FILE="/etc/kldload/quota-policy.conf"
for pool in $(zpool list -H -o name); do
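# -s local: apply the quota where the owner tag is set, not on every child that
# inherits it. -t filesystem: snapshots and zvols do not take a quota.
zfs get -r -H -s local -t filesystem -o name,value com.kldload:owner "$pool" | \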
while IFS=$'\t' read -r dataset owner; do
quota=$(awk -v o="$owner" '$1 == o {print $2}' "$POLICY_FILE")
if [[ -n "$quota" ]]; then
zfs set quota="$quota" "$dataset"
echo "Set quota $quota on $dataset (owner: $owner)"
fi
done
done
The inventory report: entire infrastructure from ZFS properties
#!/bin/bash
# inventory-report.sh — JSON inventory of all tagged datasets across all pools
# Output: one JSON object per dataset, suitable for jq/CMDB import
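# Each tagged dataset is emitted as its own JSON object; jq -s at the end
# collects them into one array. The key names below are assumed.
for pool in $(zpool list -H -o name); do
zfs list -r -H -o name,used,avail,refer "$pool" | \
while IFS=$'\t' read -r name used avail refer; do
# Gather all com.kldload: properties for this dataset
props=$(zfs get -H -o property,value all "$name" 2>/dev/null | \
grep "^com.kldload:" | \
awk -F'\t' '{
key=$1; val=$2
sub(/^com.kldload:/, "", key)
printf " \"%s\": \"%s\",\n", key, val
}')
[[ -z "$props" ]] && continue # skip untagged datasets
# Tags go first so the trailing comma on each tag line stays valid JSON
cat <<EOF
{
$props
 "name": "$name",
 "used": "$used",
 "avail": "$avail",
 "refer": "$refer"
}
EOF
done
done | jq -s '.'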
zfs get -r -H -t filesystem,volume -o name,value com.kldload:backup-policy rpool | awk '$2 == "hourly" {print $1}' | xargs -I{} syncoid {} dr-host:{} replicates every dataset tagged for hourly backup. No configuration file. No list to maintain. Add the tag to a new dataset, it is automatically included in replication. Remove the tag, it is excluded. The tags ARE the policy. Every operational procedure in this section is triggered by a tag value. The tag is the single source of truth. There is no other place where "this dataset gets hourly backups" is recorded.
6. Dataset Hierarchy Conventions
OpenZFS datasets are cheap to create — there is no pre-allocation, no minimum size, no formatting step. Create as many as you need. The discipline is not minimizing the number of datasets — it is designing a hierarchy where properties, quotas, and snapshots are set at the right level so children inherit correctly.
The canonical hierarchy
# General pattern: pool / category / application / instance
prd-caw1-db-gold-nvme/
postgres/ # category: postgres databases
main/ # instance: primary database cluster
replica/ # instance: replica cluster
analytics/ # instance: analytics replica (read-heavy tuning)
redis/ # category: redis instances
cache/ # instance: application cache
session/ # instance: session store
mysql/ # category: mysql databases
legacy/ # instance: legacy application
prd-caw1-vm-gold-nvme/
vms/ # category: virtual machine disks
web-1/ # instance: web server VM
web-2/ # instance: web server VM
app-1/ # instance: application server VM
images/ # category: base OS images
centos-9/ # instance: CentOS 9 base image
debian-13/ # instance: Debian 13 base image
prd-caw1-stor-silver-hdd/
media/ # category: media files
raw/ # instance: raw ingest
processed/ # instance: processed output
backups/ # category: backup data
postgres/ # instance: database backups
config/ # instance: configuration backups
logs/ # category: log archives
nginx/ # instance: nginx logs
app/ # instance: application logs
# Home NAS hierarchy
tank/
home/ # category: home directories
alice/ # instance: user alice
bob/ # instance: user bob
media/ # category: media library
movies/ # instance: movies
tv/ # instance: TV series
music/ # instance: music
downloads/ # category: download staging
Inheritance: set once, inherit everywhere
Set properties at the highest appropriate level. Children inherit and you can override at any lower level. This is how you avoid setting the same property 50 times:
# Set region and tier on the pool — all datasets inherit
zfs set com.kldload:region=CA-WEST-1 prd-caw1-db-gold-nvme
zfs set com.kldload:tier=production prd-caw1-db-gold-nvme
zfs set com.kldload:sla=gold prd-caw1-db-gold-nvme
# Set database-specific tags on the postgres subtree — all postgres datasets inherit
zfs set com.kldload:app=postgres prd-caw1-db-gold-nvme/postgres
zfs set com.kldload:owner=team-database prd-caw1-db-gold-nvme/postgres
zfs set com.kldload:backup-policy=hourly prd-caw1-db-gold-nvme/postgres
zfs set com.kldload:dr-target=dr-euw1-bak-silver-hdd/postgres prd-caw1-db-gold-nvme/postgres
# Override for a specific instance that needs different settings
zfs set com.kldload:backup-policy=daily prd-caw1-db-gold-nvme/postgres/analytics
zfs set com.kldload:dr-target=none prd-caw1-db-gold-nvme/postgres/analytics
# The analytics dataset has all inherited tags (region, tier, sla, owner)
# but its own backup-policy and dr-target
zfs get -r -s local,inherited com.kldload:backup-policy prd-caw1-db-gold-nvme/postgres
NAME PROPERTY VALUE SOURCE
prd-caw1-db-gold-nvme/postgres com.kldload:backup-policy hourly local
prd-caw1-db-gold-nvme/postgres/main com.kldload:backup-policy hourly inherited
prd-caw1-db-gold-nvme/postgres/replica com.kldload:backup-policy hourly inherited
prd-caw1-db-gold-nvme/postgres/analytics com.kldload:backup-policy daily local
When to use datasets vs directories
The answer is almost always datasets. The cost of a dataset is a few kilobytes of metadata. The benefit is independent snapshots, independent quotas, independent compression settings, independent replication, and independent tagging. A directory inside a dataset cannot be snapshotted independently. A dataset can. If in doubt, create a dataset.
The exceptions: files that change together and always need to be snapshotted together
(the WAL directory and data directory of a database should either share a dataset or sit
in sibling datasets that are always captured by one atomic zfs snapshot -r of their
parent, so snapshots are consistent), and temporary data that should explicitly not be
snapshotted (put it in a directory under a com.sun:auto-snapshot=false dataset).
Workload-specific hierarchies
# KVM host: one dataset per VM, one volume per virtual disk
prd-caw1-vm-gold-nvme/vms/web-1/ # dataset: VM config, logs
prd-caw1-vm-gold-nvme/vms/web-1/disk0 # zvol: primary virtual disk (20G)
prd-caw1-vm-gold-nvme/vms/web-1/disk1 # zvol: data virtual disk (100G)
# Kubernetes cluster: one dataset per namespace
prd-caw1-vm-gold-nvme/k8s/
default/ # default namespace PVCs
monitoring/ # Prometheus/Grafana PVCs
databases/ # Database PVCs
# PostgreSQL: data and WAL in separate datasets (different recordsize)
prd-caw1-db-gold-nvme/postgres/main/
data/ # recordsize=8k (matches PostgreSQL page size)
wal/ # recordsize=32k (sequential WAL writes; WAL segments default to 16 MB)
temp/ # no snapshots, no backup
# NAS: shares at the dataset level, not the directory level
tank/shares/
engineering/ # share: Engineering team files
finance/ # share: Finance team files (separate quota, separate encryption key)
public/ # share: Public read-only content
7. Fleet Inventory Automation
The inventory is not a spreadsheet. It is not a wiki. It is a query against ZFS properties, SMART data, and pool status — run on demand, always current, always accurate. The following scripts build a complete fleet inventory from live system data.
Complete fleet inventory script
#!/bin/bash
# kldload-inventory — complete fleet inventory from ZFS and SMART data
# Output: JSON to stdout, suitable for CMDB import, Prometheus push, or HTML report
HOSTNAME=$(hostname -f)
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ)
# --- Pool inventory ---
pool_inventory() {
for pool in $(zpool list -H -o name); do
health=$(zpool list -H -o health "$pool")
size=$(zpool list -H -o size "$pool")
alloc=$(zpool list -H -o alloc "$pool")
free=$(zpool list -H -o free "$pool")
cap=$(zpool list -H -o cap "$pool")
frag=$(zpool list -H -o frag "$pool")
# Gather all custom tags for this pool
tags=$(zfs get -H -o property,value all "$pool" | \
grep "^com.kldload:" | \
awk -F'\t' '{
key=$1; val=$2
sub(/^com.kldload:/, "", key)
printf " \"%s\": \"%s\",\n", key, val
}')
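# Key names are assumed. Tags go first so their trailing commas stay valid JSON.
cat <<EOF
{
$tags
 "pool": "$pool",
 "health": "$health",
 "size": "$size",
 "alloc": "$alloc",
 "free": "$free",
 "cap": "$cap",
 "frag": "$frag"
}
EOF
done
}
# --- Disk inventory ---
disk_inventory() {
# Enumerating whole disks with lsblk is an assumption; adapt to your hardware
for realdev in $(lsblk -dpn -o NAME); do
smart=$(smartctl -j -a "$realdev" 2>/dev/null)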
[[ -z "$smart" ]] && continue
model=$(echo "$smart" | jq -r '.model_name // "unknown"')
serial=$(echo "$smart" | jq -r '.serial_number // "unknown"')
firmware=$(echo "$smart" | jq -r '.firmware_version // "unknown"')
cap_bytes=$(echo "$smart" | jq -r '.user_capacity.bytes // 0')
cap_tb=$(echo "$cap_bytes" | awk '{printf "%.2f", $1/1e12}')
hours=$(echo "$smart" | jq -r '.power_on_time.hours // 0')
temp=$(echo "$smart" | jq -r '.temperature.current // 0')
health=$(echo "$smart" | jq -r '.smart_status.passed // false')
# Which pool is this disk in?
devname=$(basename "$realdev")
pool=$(zpool status 2>/dev/null | awk -v d="$devname" '
/^ pool:/ { pool=$2 }
$0 ~ d { print pool; exit }
')
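# Key names in the disk record are assumed
cat <<EOF
{
 "device": "$realdev",
 "model": "$model",
 "serial": "$serial",
 "firmware": "$firmware",
 "capacity_tb": $cap_tb,
 "power_on_hours": $hours,
 "temperature_c": $temp,
 "smart_passed": $health,
 "pool": "${pool:-none}"
}
EOF
done
}
# --- Assemble the report (jq -s collects each stream of objects into an array) ---
jq -n --arg host "$HOSTNAME" --arg ts "$TIMESTAMP" \
--argjson pools "$(pool_inventory | jq -s '.')" \
--argjson disks "$(disk_inventory | jq -s '.')" \
'{host: $host, timestamp: $ts, pools: $pools, disks: $disks}'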
Prometheus metrics from ZFS properties
#!/bin/bash
# zfs-property-exporter — expose ZFS custom properties as Prometheus metrics
# Run from a systemd timer, output to node_exporter textfile directory
OUTFILE="/var/lib/node_exporter/textfile_collector/zfs_properties.prom"
TMPFILE="${OUTFILE}.tmp"
{
echo "# HELP zfs_dataset_used_bytes ZFS dataset used bytes"
echo "# TYPE zfs_dataset_used_bytes gauge"
echo "# HELP zfs_dataset_available_bytes ZFS dataset available bytes"
echo "# TYPE zfs_dataset_available_bytes gauge"
echo "# HELP zfs_pool_capacity_percent ZFS pool capacity percentage"
echo "# TYPE zfs_pool_capacity_percent gauge"
for pool in $(zpool list -H -o name); do
cap=$(zpool list -H -o cap "$pool" | tr -d '%')
health=$(zpool list -H -o health "$pool")
# Gather tags for labels
region=$(zfs get -H -o value com.kldload:region "$pool" 2>/dev/null || echo "unknown")
tier=$(zfs get -H -o value com.kldload:tier "$pool" 2>/dev/null || echo "unknown")
sla=$(zfs get -H -o value com.kldload:sla "$pool" 2>/dev/null || echo "unknown")
echo "zfs_pool_capacity_percent{pool=\"$pool\",region=\"$region\",tier=\"$tier\",sla=\"$sla\",health=\"$health\"} $cap"
# Per-dataset metrics
# -p prints exact byte counts, so no suffix conversion (and no rounding) is needed
zfs list -r -Hp -o name,used,avail "$pool" | \
while IFS=$'\t' read -r name used_bytes avail_bytes; do
owner=$(zfs get -H -o value com.kldload:owner "$name" 2>/dev/null || echo "unknown")
app=$(zfs get -H -o value com.kldload:app "$name" 2>/dev/null || echo "unknown")
cc=$(zfs get -H -o value com.kldload:cost-center "$name" 2>/dev/null || echo "unknown")
labels="dataset=\"$name\",region=\"$region\",tier=\"$tier\",owner=\"$owner\",app=\"$app\",cost_center=\"$cc\""
echo "zfs_dataset_used_bytes{$labels} $used_bytes"
echo "zfs_dataset_available_bytes{$labels} $avail_bytes"
done
done
} > "$TMPFILE" && mv "$TMPFILE" "$OUTFILE"
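ZFS property values are free-form strings, while the Prometheus text format requires backslashes, double quotes, and newlines inside label values to be escaped. A stray quote in an owner tag would otherwise make the textfile collector fail to parse the file. The helper below is a minimal sketch of that escaping; the name `prom_escape` is hypothetical.

```shell
#!/bin/bash
# prom_escape STRING
# Escapes STRING for use inside a Prometheus label value:
#   \  -> \\      "  -> \"      newline -> \n
prom_escape() {
  printf '%s' "$1" |
    sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' |
    awk 'NR > 1 { printf "\\n" } { printf "%s", $0 }'
}

# Example: a tag value containing double quotes
owner='team "db"'
printf 'zfs_dataset_used_bytes{owner="%s"} 0\n' "$(prom_escape "$owner")"
```

In the exporter above, each tag value would pass through such a helper before being interpolated into the labels string.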
Warranty and lifecycle alerts
#!/bin/bash
# warranty-check.sh — alert on drives approaching warranty expiry
# Read asset metadata from /etc/kldload/assets/*.json
# Run weekly from cron
WARN_DAYS=90 # alert 90 days before expiry
TODAY=$(date +%s)
for asset_file in /etc/kldload/assets/*.json; do
[[ -f "$asset_file" ]] || continue
asset_id=$(jq -r '.asset_id' "$asset_file")
serial=$(jq -r '.serial' "$asset_file")
warranty=$(jq -r '.lifecycle.warranty_expiry' "$asset_file")
supplier=$(jq -r '.lifecycle.supplier' "$asset_file")
rma_url=$(jq -r '.lifecycle.rma_url // "N/A"' "$asset_file")
[[ "$warranty" == "null" || "$warranty" == "" ]] && continue
warranty_epoch=$(date -d "$warranty" +%s 2>/dev/null) || continue
days_left=$(( (warranty_epoch - TODAY) / 86400 ))
if [[ "$days_left" -lt 0 ]]; then
echo "EXPIRED $asset_id serial=$serial expired=$((-days_left))d ago supplier=$supplier rma=$rma_url"
elif [[ "$days_left" -lt "$WARN_DAYS" ]]; then
echo "WARNING $asset_id serial=$serial expires=${days_left}d supplier=$supplier rma=$rma_url"
fi
done
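The date arithmetic in warranty-check.sh can be isolated into a small function, which makes the thresholds easy to exercise with fixed dates. The sketch below is an illustration only: the function name `warranty_state` and its output format are assumptions, and it relies on GNU `date -d` just as the script above does.

```bash
#!/bin/bash
# warranty_state EXPIRY [TODAY] [WARN_DAYS]
# Prints "EXPIRED n", "WARNING n" or "OK n", where n is whole days until expiry.
# Dates are parsed as UTC so a daylight-saving change cannot shift the day count.
# (Hypothetical helper; requires GNU date.)
warranty_state() {
    local expiry="$1" today="${2:-$(date -u +%Y-%m-%d)}" warn="${3:-90}"
    local e t days
    e=$(date -u -d "$expiry" +%s) || return 1
    t=$(date -u -d "$today" +%s) || return 1
    days=$(( (e - t) / 86400 ))
    if (( days < 0 )); then
        echo "EXPIRED $days"
    elif (( days < warn )); then
        echo "WARNING $days"
    else
        echo "OK $days"
    fi
}
```

Passing the reference date as the second argument lets the same function back both the weekly cron run and a dry run for a future procurement review.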
8. Disk Lifecycle Management
Disks follow a predictable lifecycle: procurement, installation, monitoring, replacement, and decommission. Every phase has a labeling and inventory step. Missing any step means the next person to touch the drive is missing information.
Phase 1: Procurement
#!/bin/bash
# new-asset.sh — register a new disk asset at procurement time
# Usage: new-asset.sh --region CAW1 --model "Samsung PM9A3" --capacity 3.84T \
# --serial S6ZUNX0R123456A --supplier "CDW Canada" \
# --warranty 2028-02-12 --rma https://cdw.ca/rma/...
# Parse arguments
while [[ "$#" -gt 0 ]]; do
case $1 in
--region) REGION="$2"; shift ;;
--model) MODEL="$2"; shift ;;
--capacity) CAPACITY="$2"; shift ;;
--serial) SERIAL="$2"; shift ;;
--supplier) SUPPLIER="$2"; shift ;;
--warranty) WARRANTY="$2"; shift ;;
--rma) RMA_URL="$2"; shift ;;
--reorder) REORDER="$2"; shift ;;
esac
shift
done
# Generate asset ID: next sequential number for this region+class
SEQ=$(ls /etc/kldload/assets/UB-DSK-${REGION}-*.json 2>/dev/null | \
grep -oP '\d+(?=\.json)' | sort -n | tail -1)
SEQ=$(( 10#${SEQ:-0} + 1 )) # 10# forces base 10; a zero-padded ID such as 00008 would otherwise be parsed as octal
ASSET_ID="UB-DSK-${REGION}-$(printf '%05d' $SEQ)"
# Write asset record
mkdir -p /etc/kldload/assets
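# The record body is cut off in this copy of the guide. The version below is a
# reconstruction: .asset_id, .serial, .status and .lifecycle.* are the keys the other
# scripts read; .region, .hardware.model, .hardware.capacity and the "procured"
# status value are assumed names.
jq -n \
--arg asset_id "$ASSET_ID" --arg region "$REGION" --arg serial "$SERIAL" \
--arg model "$MODEL" --arg capacity "$CAPACITY" --arg supplier "$SUPPLIER" \
--arg warranty "$WARRANTY" --arg rma "$RMA_URL" --arg reorder "$REORDER" \
'{asset_id: $asset_id, region: $region, serial: $serial, status: "procured",
hardware: {model: $model, capacity: $capacity},
lifecycle: {supplier: $supplier, warranty_expiry: $warranty, rma_url: $rma, reorder_url: $reorder}}' \
> "/etc/kldload/assets/${ASSET_ID}.json"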
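The sequential-ID step is easy to get subtly wrong because the numeric suffix is zero-padded. A minimal sketch of the same idea as a standalone function follows, assuming the `UB-DSK-<REGION>-<NNNNN>.json` file naming used above; the function name `next_asset_id` is hypothetical.

```bash
#!/bin/bash
# next_asset_id DIR REGION: print the next free asset ID for REGION by scanning
# DIR for existing UB-DSK-<REGION>-<NNNNN>.json records.
# (Hypothetical helper; file naming follows the convention used in this guide.)
next_asset_id() {
    local dir="$1" region="$2" max=0 f n
    for f in "$dir"/UB-DSK-"$region"-*.json; do
        [[ -e "$f" ]] || continue              # glob matched nothing
        n=${f##*-}; n=${n%.json}               # keep only the numeric suffix
        [[ "$n" =~ ^[0-9]+$ ]] || continue
        # 10# forces decimal so 00008 is not rejected as invalid octal
        (( 10#$n > max )) && max=$(( 10#$n ))
    done
    printf 'UB-DSK-%s-%05d\n' "$region" $(( max + 1 ))
}
```

Two operators registering assets at the same moment could still pick the same number, so a shared lock or a single registration host is worth considering.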
Phase 2: Installation
#!/bin/bash
# install-asset.sh — record disk installation into a slot
# Usage: install-asset.sh UB-DSK-CAW1-88322 /dev/disk/by-id/nvme-Samsung_PM9A3_...
# Location fields are read from the environment: DC, BUILDING, ROW, RACK, CHASSIS, SLOT
ASSET_ID="$1"
DISK="$2"
ASSET_FILE="/etc/kldload/assets/${ASSET_ID}.json"
[[ -f "$ASSET_FILE" ]] || { echo "Asset not found: $ASSET_ID"; exit 1; }
[[ -e "$DISK" ]] || { echo "Disk not found: $DISK"; exit 1; }
# Pull SMART data once; the same JSON doubles as the baseline
SMART=$(smartctl -j -a "$(realpath "$DISK")")
SERIAL=$(echo "$SMART" | jq -r '.serial_number')
FIRMWARE=$(echo "$SMART" | jq -r '.firmware_version')
# Save the baseline SMART attributes
mkdir -p /etc/kldload/smart-baseline
printf '%s\n' "$SMART" > "/etc/kldload/smart-baseline/${ASSET_ID}.json"
# Update asset record with installation details
jq --arg installed "$(date +%Y-%m-%d)" \
--arg firmware "$FIRMWARE" \
--arg serial "$SERIAL" \
--arg dc "${DC}" \
--arg building "${BUILDING}" \
--arg row "${ROW}" \
--arg rack "${RACK}" \
--arg chassis "${CHASSIS}" \
--arg slot "${SLOT}" \
'.lifecycle.installed = $installed |
.hardware.firmware = $firmware |
.serial = $serial |
.status = "installed" |
.location.datacenter = $dc |
.location.building = $building |
.location.row = $row |
.location.rack = $rack |
.location.chassis = $chassis |
.location.slot = $slot' \
"$ASSET_FILE" > "${ASSET_FILE}.tmp" && mv "${ASSET_FILE}.tmp" "$ASSET_FILE"
echo "Asset $ASSET_ID installed at ${DC}/${BUILDING}/${ROW}/${RACK}/${CHASSIS}/${SLOT}"
echo "SMART baseline saved to /etc/kldload/smart-baseline/${ASSET_ID}.json"
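install-asset.sh writes six location fields into the asset record, and an empty value there would defeat the purpose of the label. A minimal sketch of a guard that could run before the `jq` update follows; the helper name `require_vars` is hypothetical, and it assumes the location values arrive as shell variables.

```bash
#!/bin/bash
# require_vars NAME...: return non-zero and list every named variable that is
# unset or empty. (Hypothetical helper, not part of install-asset.sh.)
require_vars() {
    local name missing=()
    for name in "$@"; do
        # ${!name} is bash indirect expansion: the value of the variable named by $name
        [[ -n "${!name:-}" ]] || missing+=("$name")
    done
    if (( ${#missing[@]} > 0 )); then
        echo "Missing location fields: ${missing[*]}" >&2
        return 1
    fi
}
```

A call such as `require_vars DC BUILDING ROW RACK CHASSIS SLOT || exit 1` near the top of the script would stop a half-filled record from being written.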
Phase 3: Monitoring
#!/bin/bash
# smart-check.sh — compare current SMART data against baseline
# Run daily from cron
BASELINE_DIR="/etc/kldload/smart-baseline"
ALERT_THRESHOLD=10 # alert if normalized value drops more than 10 points
for baseline_file in "$BASELINE_DIR"/*.json; do
[[ -f "$baseline_file" ]] || continue
asset_id=$(basename "$baseline_file" .json)
# Find the disk by asset ID
asset_file="/etc/kldload/assets/${asset_id}.json"
[[ -f "$asset_file" ]] || continue
serial=$(jq -r '.serial' "$asset_file")
# Find current device by serial
realdev=$(smartctl --scan-open | while read -r dev opts; do
s=$(smartctl -i "$dev" 2>/dev/null | awk '/Serial/{print $NF}')
[[ "$s" == "$serial" ]] && echo "$dev" && break
done)
[[ -z "$realdev" ]] && continue
# Check SMART health
passed=$(smartctl -j -H "$realdev" | jq -r '.smart_status.passed')
if [[ "$passed" != "true" ]]; then
echo "SMART FAIL: $asset_id serial=$serial device=$realdev"
fi
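# Check critical ATA attributes (5 Reallocated, 197 Pending, 198 Offline Uncorrectable).
# NVMe drives report no ata_smart_attributes table, so this step prints nothing for them.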
smartctl -j -A "$realdev" | jq -r '
.ata_smart_attributes.table[]? |
select(.id == 5 or .id == 197 or .id == 198) |
"\(.name) raw=\(.raw.value) normalized=\(.value)"
' | while read -r line; do
raw=$(echo "$line" | grep -oP 'raw=\K\d+')
[[ "$raw" -gt 0 ]] && echo "WARNING: $asset_id $line"
done
done
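smart-check.sh defines `ALERT_THRESHOLD` for drops in normalized values, but the loop above only inspects raw counters. One way the baseline comparison might look is sketched below. It assumes both files are `smartctl -j` output containing an `ata_smart_attributes.table` array, as the baseline captured at install time would for an ATA drive; the function name `compare_baseline` is hypothetical.

```bash
#!/bin/bash
# compare_baseline BASELINE.json CURRENT.json [THRESHOLD]
# Prints one line per ATA attribute whose normalized value has dropped by more
# than THRESHOLD points since the baseline. (Hypothetical helper; requires jq.)
compare_baseline() {
    local baseline="$1" current="$2" threshold="${3:-10}"
    jq -r --slurpfile base "$baseline" --argjson t "$threshold" '
        # Build a lookup of attribute id -> normalized value from the baseline
        ($base[0].ata_smart_attributes.table // []
            | map({key: (.id | tostring), value: .value})
            | from_entries) as $b
        | .ata_smart_attributes.table[]?
        | select($b[.id | tostring] != null)
        | ($b[.id | tostring] - .value) as $drop
        | select($drop > $t)
        | "\(.name) baseline=\($b[.id | tostring]) current=\(.value) drop=\($drop)"
    ' "$current"
}
```

In the daily loop this could be fed with the stored baseline and a fresh `smartctl -j -A` capture written to a temporary file, passing `$ALERT_THRESHOLD` as the third argument.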
Phase 4: Replacement
#!/bin/bash
# replace-disk.sh — guided disk replacement procedure
# Usage: replace-disk.sh UB-DSK-CAW1-88322
ASSET_ID="$1"
ASSET_FILE="/etc/kldload/assets/${ASSET_ID}.json"
[[ -f "$ASSET_FILE" ]] || { echo "Asset not found: $ASSET_ID"; exit 1; }
POOL=$(jq -r '.zfs.pool' "$ASSET_FILE")
VDEV=$(jq -r '.zfs.vdev' "$ASSET_FILE")
SLOT=$(jq -r '.location.slot' "$ASSET_FILE")
RMA_URL=$(jq -r '.lifecycle.rma_url' "$ASSET_FILE")
REORDER=$(jq -r '.lifecycle.reorder_url // "N/A"' "$ASSET_FILE")
echo "=== Disk Replacement Procedure ==="
echo "Asset: $ASSET_ID"
echo "Pool: $POOL"
echo "VDEV: $VDEV"
echo "Slot: $SLOT"
echo ""
echo "Step 1: Start RMA and order replacement"
echo " RMA URL: $RMA_URL"
echo " Reorder URL: $REORDER"
echo ""
echo "Step 2: Wait for replacement to arrive and get its asset ID"
echo ""
echo "Step 3: Offline the failed vdev (if not already FAULTED)"
echo " zpool offline $POOL $VDEV"
echo ""
echo "Step 4: Physically remove drive from $SLOT"
echo ""
echo "Step 5: Install replacement in $SLOT"
echo ""
echo "Step 6: Find new disk device path"
echo " ls -la /dev/disk/by-slot/$SLOT"
echo ""
echo "Step 7: Replace in ZFS"
echo " zpool replace $POOL /dev/disk/by-slot/$SLOT /dev/disk/by-slot/$SLOT"
echo ""
echo "Step 8: Monitor resilver"
echo " watch zpool status $POOL"
echo ""
echo "Step 9: Update asset record"
echo " install-asset.sh NEW-ASSET-ID /dev/disk/by-slot/$SLOT"
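Step 8 asks the operator to watch the resilver by hand. For an unattended replacement, a polling loop is one option. The sketch below is an assumption-laden illustration: both function names are hypothetical, and it keys on the phrase "resilver in progress" that `zpool status` prints on its scan line while a resilver runs. Recent OpenZFS releases also provide `zpool wait -t resilver`, which may be preferable where available.

```bash
#!/bin/bash
# resilver_active: succeeds if the `zpool status` text on stdin reports a
# running resilver. (Hypothetical helper.)
resilver_active() {
    grep -q 'resilver in progress'
}

# wait_for_resilver POOL [INTERVAL_SECONDS]: poll until the resilver finishes,
# then print the pool's health summary. (Hypothetical helper.)
wait_for_resilver() {
    local pool="$1" interval="${2:-60}"
    while zpool status "$pool" | resilver_active; do
        sleep "$interval"
    done
    zpool status -x "$pool"
}
```

Only after the loop returns would it make sense to run the asset-record update in step 9, so the record never claims a healthy vdev that is still rebuilding.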
Phase 5: Decommission
#!/bin/bash
# decommission-asset.sh — remove a disk from service
# Usage: decommission-asset.sh UB-DSK-CAW1-88322
ASSET_ID="$1"
ASSET_FILE="/etc/kldload/assets/${ASSET_ID}.json"
[[ -f "$ASSET_FILE" ]] || { echo "Asset not found: $ASSET_ID"; exit 1; }
POOL=$(jq -r '.zfs.pool' "$ASSET_FILE")
VDEV=$(jq -r '.zfs.vdev' "$ASSET_FILE")
echo "Decommissioning $ASSET_ID from pool $POOL vdev $VDEV"
echo ""
echo "Manual steps required before this script proceeds:"
echo " 1. Remove from ZFS pool: zpool remove $POOL $VDEV (use zpool detach for a mirror member; a raidz member cannot be removed, only replaced)"
echo " 2. Physical removal from rack"
echo " 3. Secure erase: nvme format --ses=1 /dev/..."
echo ""
read -r -p "Confirm decommission of $ASSET_ID? Type 'yes' to proceed: " CONFIRM
[[ "$CONFIRM" != "yes" ]] && { echo "Aborted."; exit 0; }
# Update status in asset record
jq --arg date "$(date +%Y-%m-%d)" \
'.status = "decommissioned" | .lifecycle.decommissioned = $date' \
"$ASSET_FILE" > "${ASSET_FILE}.tmp" && mv "${ASSET_FILE}.tmp" "$ASSET_FILE"
# Archive the asset record
mkdir -p /etc/kldload/assets/archive
mv "$ASSET_FILE" "/etc/kldload/assets/archive/${ASSET_ID}.json"
echo "Asset $ASSET_ID archived to /etc/kldload/assets/archive/"
9. Multi-Site Labeling
Multi-site deployments require labeling that is consistent across sites. The same conventions must apply everywhere, and the region codes in labels, pool names, and ZFS properties must all derive from the same region name. If your WireGuard mesh uses ca-west-1, your pool names use caw1, and your ZFS properties use CA-WEST-1, you can correlate across all three layers without a lookup table.
Region code mapping
# Region codes — consistent across all labeling layers
#
# Full name Pool prefix Property value WireGuard zone
# --------------- ----------- --------------- ---------------
# CA-West-1 caw1 CA-WEST-1 ca-west-1
# US-East-1 use1 US-EAST-1 us-east-1
# US-West-2 usw2 US-WEST-2 us-west-2
# EU-West-1 euw1 EU-WEST-1 eu-west-1
# EU-Central-1 euc1 EU-CENTRAL-1 eu-central-1
# AP-South-1 aps1 AP-SOUTH-1 ap-south-1
# AP-East-1 ape1 AP-EAST-1 ap-east-1
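Every row in the mapping above follows the same rule, so the three codes can be derived from the full region name rather than typed by hand. A minimal sketch follows, assuming the pattern seen in the table (lowercased geography, first letter of the area, then the number); the function name `region_codes` is hypothetical and it needs bash 4 or later for case conversion.

```bash
#!/bin/bash
# region_codes "CA-West-1": print "<pool prefix> <property value> <WireGuard zone>".
# (Hypothetical helper; the derivation rule is inferred from the mapping table.)
region_codes() {
    local full="$1" geo area num
    IFS=- read -r geo area num <<< "$full"
    if [[ -z "$geo" || -z "$area" || ! "$num" =~ ^[0-9]+$ ]]; then
        echo "bad region name: $full" >&2
        return 1
    fi
    local prefix="${geo}${area:0:1}${num}"
    printf '%s %s %s\n' "${prefix,,}" "${full^^}" "${full,,}"
}
```

The first-letter rule would collide if two areas in one geography shared an initial, so a new region is worth checking against the table before its prefix is adopted.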
Three-site example: production + DR + dev
# CA-West-1 — primary production site
prd-caw1-db-gold-nvme # production database pool
prd-caw1-vm-gold-nvme # production VM pool
prd-caw1-stor-silver-hdd # production object storage pool
# EU-West-1 — DR site (replication destination)
dr-euw1-db-gold-nvme # DR database pool (receives from prd-caw1-db-gold-nvme)
dr-euw1-vm-silver-ssd # DR VM pool
# US-East-1 — development site
dev-use1-db-bronze-ssd # dev database pool
dev-use1-vm-bronze-ssd # dev VM pool
# Replication topology is encoded in ZFS properties — no separate config
# prd-caw1-db-gold-nvme/postgres: com.kldload:dr-target = dr-euw1-db-gold-nvme/postgres
# prd-caw1-vm-gold-nvme/vms: com.kldload:dr-target = dr-euw1-vm-silver-ssd/vms
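Because every pool name above has the same five hyphen-separated parts (environment, region, role, tier, media), a script can recover those fields from the name alone. A minimal sketch follows; the function name `parse_pool_name` and the key=value output format are assumptions made for illustration.

```bash
#!/bin/bash
# parse_pool_name NAME: split an env-region-role-tier-media pool name into
# key=value lines, or fail if the name does not have exactly five parts.
# (Hypothetical helper.)
parse_pool_name() {
    local name="$1" env region role tier media extra
    IFS=- read -r env region role tier media extra <<< "$name"
    if [[ -z "$media" || -n "$extra" ]]; then
        echo "not a five-part pool name: $name" >&2
        return 1
    fi
    printf 'env=%s\nregion=%s\nrole=%s\ntier=%s\nmedia=%s\n' \
        "$env" "$region" "$role" "$tier" "$media"
}
```

Run against `zpool list -H -o name`, the same check doubles as a lint step that flags pools such as `tank` that predate the convention.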
Global inventory across all sites
#!/bin/bash
# global-inventory.sh — collect inventory from all sites via SSH
SITES=(
"caw1-stor-01.prd.caw1.internal"
"euw1-stor-01.dr.euw1.internal"
"use1-stor-01.dev.use1.internal"
)
for host in "${SITES[@]}"; do
echo "=== $host ==="
ssh "$host" '
zpool list -H -o name,health,cap,alloc,free | \
awk "{printf \"%-30s %-8s %5s %10s %10s\n\", \$1, \$2, \$3, \$4, \$5}"
'
echo ""
done
Replication topology from tags
#!/bin/bash
# show-replication-topology.sh — visualize DR targets from ZFS properties
echo "DATASET -> DR TARGET"
echo "--------------------------------------"
for pool in $(zpool list -H -o name); do
# Unset properties print "-" in the value column, so filter on that column
zfs get -r -H -t filesystem,volume -o name,value com.kldload:dr-target "$pool" | \
awk -F'\t' '$2 != "-" && $2 != "none" {printf "%-50s -> %s\n", $1, $2}'
done
10. Capacity Planning from Labels
Cloud providers give you cost allocation by tag. OpenZFS gives you capacity allocation by tag — same concept, filesystem-native. No billing API, no cost explorer. One query against ZFS properties gives you capacity per team, per region, per application, per SLA tier.
Aggregate capacity by tag
#!/bin/bash
# capacity-by-tag.sh — aggregate used/available space by any ZFS property
# Usage: capacity-by-tag.sh com.kldload:cost-center
# capacity-by-tag.sh com.kldload:app
# capacity-by-tag.sh com.kldload:owner
TAG="${1:-com.kldload:cost-center}"
echo "=== Capacity by $TAG ==="
printf "%-20s %12s %12s %10s\n" "VALUE" "USED_GiB" "AVAIL_GiB" "DATASETS"
echo "------------------------------------------------------------"
# -p prints exact byte counts. Each dataset contributes used minus usedbychildren,
# so space is not counted again in every tagged parent. AVAIL is the largest
# "available" value among the tagged datasets: free space is shared pool-wide,
# so it cannot be summed.
for pool in $(zpool list -H -o name); do
zfs get -r -H -p -t filesystem,volume -o name,property,value \
"$TAG",used,usedbychildren,available "$pool"
done | awk -F'\t' -v tag="$TAG" '
$2 == tag { val[$1] = $3 }
$2 == "used" { used[$1] = $3 }
$2 == "usedbychildren" { kids[$1] = $3 }
$2 == "available" { avail[$1] = $3 }
END {
for (d in val) {
t = val[d]
if (t == "-" || t == "") continue
count[t]++
total[t] += used[d] - kids[d]
if (avail[d] + 0 > free[t] + 0) free[t] = avail[d] + 0
}
for (t in count)
printf "%-20s %12.1f %12.1f %10d\n", t, total[t] / 1073741824, free[t] / 1073741824, count[t]
}
' | sort
Growth tracking for procurement planning
#!/bin/bash
# track-growth.sh — record daily used space per tag for growth forecasting
# Run from cron daily: 0 6 * * * /usr/local/bin/track-growth.sh >> /var/log/zfs-growth.log
DATE=$(date +%Y-%m-%d)
TAG="${1:-com.kldload:cost-center}"
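for pool in $(zpool list -H -o name); do
zfs get -r -H -t filesystem,volume -o name,value "$TAG" "$pool" | \
while IFS=$'\t' read -r dataset tagval; do
[[ "$tagval" == "-" || -z "$tagval" ]] && continue
# -p logs exact bytes so growth deltas are plain arithmetic
used=$(zfs get -H -p -o value used "$dataset")
echo "$DATE $pool $dataset $tagval $used"
done
done
# Analyze growth between two logged days (requires at least 30 days of history).
# Values are bytes. A parent's "used" already includes its children, so nested
# datasets that share a tag are counted more than once in these sums.
# awk -v now=2026-03-01 -v prev=2026-02-01 '
# $1 == now { used[$4] += $5 }
# $1 == prev { old[$4] += $5 }
# END { for (t in used) printf "%s: now=%.0f prev=%.0f delta=%.0f\n", t, used[t], old[t], used[t]-old[t] }
# ' /var/log/zfs-growth.log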
Grafana dashboard from ZFS property metrics
With the Prometheus exporter from section 7 running, Grafana can display capacity by any tag combination. Key panels:
- Used bytes by com.kldload:cost-center (chargeback view, one bar per team)
- Pool capacity % by com.kldload:sla (shows gold vs silver vs bronze pools)
- Used bytes by com.kldload:app (which applications consume the most storage)
- Dataset count by com.kldload:owner (how many datasets each team owns)
- Growth rate by com.kldload:region (which sites are growing fastest)
- Days to full by pool (calculated from current capacity % and 30-day growth rate)
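The "days to full" panel rests on simple linear extrapolation: if a pool gained some number of percentage points over the last 30 days, the remaining headroom divided by that daily rate gives the estimate. A minimal sketch of the arithmetic follows; the function name `days_to_full` is hypothetical, and the linear-growth assumption is a simplification that real workloads may not follow.

```bash
#!/bin/bash
# days_to_full CAP_NOW CAP_30D_AGO: estimate whole days until a pool reaches
# 100%, given two capacity percentages taken 30 days apart and assuming
# growth stays linear. Prints "inf" when usage is flat or shrinking.
# (Hypothetical helper.)
days_to_full() {
    awk -v now="$1" -v old="$2" 'BEGIN {
        growth = now - old                      # percentage points gained in 30 days
        if (growth <= 0) { print "inf"; exit }
        printf "%d\n", (100 - now) * 30 / growth
    }'
}
```

For example, a pool at 70% that was at 64% a month earlier gained 6 points in 30 days, so the remaining 30 points would last roughly 150 days at that pace.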
# Example Prometheus query: used bytes by cost-center (for Grafana bar chart)
sum by (cost_center) (zfs_dataset_used_bytes{tier="production"})
# Example: capacity % by SLA tier
avg by (sla) (zfs_pool_capacity_percent)
# Example: alert when any gold pool exceeds 85%
zfs_pool_capacity_percent{sla="gold"} > 85
zfs get -r com.kldload:cost-center rpool | aggregate. The compression ratio is particularly powerful for capacity planning: if your pool has 2.3x compression and you are adding a new workload, you know from the first 24 hours of data how much real space it will consume. The combination of ZFS properties and Prometheus means your capacity dashboard is always live: not a spreadsheet updated quarterly, but a real-time view of every byte in every dataset, attributed to the right team, region, and application.
11. The Labeling Checklist
Use this checklist for every infrastructure change. Every new asset, every new pool, every new dataset, every new site must pass every applicable item before it is considered complete.
New disk
- Asset ID assigned at procurement, recorded in /etc/kldload/assets/
- Physical label printed with all fields complete (location, ZFS, hardware, lifecycle)
- QR code with JSON metadata on the label
- Installed using /dev/disk/by-id/ path, not raw device name
- udev slot rule created if applicable
- Asset record updated with install date, slot, rack, chassis, datacenter
- SMART baseline captured to /etc/kldload/smart-baseline/
- ZFS vdev name matches physical slot label
- ZFS custom properties set on pool and datasets
- Drive appears in weekly warranty-check output
New pool
- Name follows {env}-{region}-{role}-{tier}-{media} convention
- Name reviewed: renaming later requires exporting and re-importing the pool (zpool import old-name new-name), which takes it offline and breaks every script, mount reference, and dr-target tag that uses the old name
- All standard tags set: region, tier, sla, owner, cost-center
- Tags set at pool level so datasets inherit
- Monitoring alerts configured: capacity > 80%, health != ONLINE, scrub errors
- Scrub schedule configured in systemd timers or cron
- Appears in zpool list with expected name and health
- Prometheus metrics visible in Grafana within 5 minutes of creation
New dataset
- Created in the correct hierarchy level (pool/category/application/instance)
- Inherits appropriate tags from parent, or has explicit tags set
- com.kldload:backup-policy set (explicitly or inherited)
- com.kldload:dr-target set if replication is required
- com.kldload:owner set for quota and alert routing
- Workload-specific properties set: recordsize, compression, sync, atime
- Quota set if owner has a capacity allocation
- Appears in inventory script output with correct tags
- Replication test: verify dataset replicates to DR target correctly
New site
- Region code assigned, documented in region code mapping table
- Region code consistent with WireGuard mesh zone name
- Pool names follow convention with new region code
- ZFS properties use consistent region value (US-EAST-1 not us-east-1 or USE1)
- DR target tags on production datasets point to the new site's pools
- Global inventory script includes new site's hosts
- Warranty check script has access to new site's asset records
- Prometheus scrapes include new site's node_exporter endpoints
Monthly audit
- Every disk in every rack has a readable physical label
- Every pool has all required custom tags set
- Every dataset with data has com.kldload:backup-policy and com.kldload:owner
- No dataset with a com.kldload:dr-target other than none is failing replication
- No drives within 90 days of warranty expiry without replacement ordered
- No drives with SMART warnings without investigation open
- Inventory script output matches physical rack counts
- Pool capacity % for gold pools all below 80%
- New operators can pass the "walk up to any rack" test in 60 seconds
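Several audit items above, such as "every pool has all required custom tags set", can be checked mechanically. The sketch below reads `zfs get -H -o name,property,value` output on stdin and reports every required property that is unset. The function name `missing_tags` is hypothetical, and the list of required tags is whatever the caller passes in.

```bash
#!/bin/bash
# missing_tags REQUIRED...: read tab-separated "name property value" lines
# (as printed by `zfs get -H -o name,property,value`) on stdin and print
# "name property" for each required property that is unset ("-") or empty.
# (Hypothetical helper.)
missing_tags() {
    local required=" $* "
    local name prop value
    while IFS=$'\t' read -r name prop value; do
        [[ "$required" == *" $prop "* ]] || continue     # not a required tag
        if [[ "$value" == "-" || -z "$value" ]]; then
            echo "$name $prop"
        fi
    done
    return 0
}
```

A monthly audit could pipe `zfs get -r -H -t filesystem,volume -o name,property,value com.kldload:owner,com.kldload:backup-policy "$pool"` into `missing_tags com.kldload:owner com.kldload:backup-policy` and treat any output as a failed checklist item.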