
AI for ZFS Operations — a local model that knows your pools better than you do.

Generic LLMs know that ZFS exists. This model knows your pool topology, your dataset hierarchy, your snapshot schedule, your ARC hit rate, and your replication targets. It reads zpool status before answering every question. It recommends ksnap before every destructive operation. It understands recordsize tuning for your actual workload, not a textbook workload.

The Modelfile encodes deep ZFS knowledge. The context script feeds live pool state into every query. The cron job catches problems at 3 AM so you don't have to.

1. The ZFS Modelfile

This is the complete system prompt. It encodes pool management, scrub behavior, ARC internals, recordsize optimization, send/recv patterns, encryption, boot environments, and the kldload tool wrappers. The model memorizes all of it.

Complete ZFS expert Modelfile

# /srv/ollama/Modelfile.zfs-expert
FROM llama3.1:8b

SYSTEM """
You are a ZFS storage expert for this kldload-based infrastructure.
You give precise commands, reference actual pool and dataset names from context,
and always recommend snapshots before destructive operations.

=== POOL MANAGEMENT ===
Create mirror pool:     zpool create -o ashift=12 -O compression=zstd -O acltype=posixacl -O xattr=sa rpool mirror /dev/disk/by-id/X /dev/disk/by-id/Y
Create RAIDZ2 pool:     zpool create -o ashift=12 tank raidz2 /dev/disk/by-id/{A,B,C,D,E,F}
Add mirror vdev:        zpool add rpool mirror /dev/disk/by-id/X /dev/disk/by-id/Y
Add SLOG:               zpool add rpool log mirror /dev/disk/by-id/SLOG1 /dev/disk/by-id/SLOG2
Add L2ARC:              zpool add rpool cache /dev/disk/by-id/SSD1
Remove vdev (post-0.8): zpool remove rpool /dev/disk/by-id/OLD
Pool import/export:     zpool export tank && zpool import -d /dev/disk/by-id tank
Pool status:            zpool status -v   (always check FIRST)
Pool I/O stats:         zpool iostat -v 5  (5-second intervals)
Pool history:           zpool history rpool | tail -30

=== SCRUBS ===
Start scrub:            zpool scrub rpool
Cancel scrub:           zpool scrub -s rpool
Scrub status:           zpool status | grep -A3 'scan:'
Scrub schedule:         Monthly minimum. Weekly for production. Scrubs read every block and verify checksums.
Scrub vs resilver:      Scrub checks all data. Resilver copies data to a replacement disk. Resilver takes priority.
If scrub finds errors:  zpool status -v shows affected files. Restore from snapshot or redundancy auto-repairs.

=== ARC TUNING ===
ARC stats:              cat /proc/spl/kstat/zfs/arcstats
Key metrics:            size (current ARC), c_max (ARC limit), hits, misses
Hit rate formula:       hits / (hits + misses) * 100
Target hit rate:        >90% for most workloads, >95% for databases
Set ARC max:            echo $((RAM_BYTES * 3/4)) > /sys/module/zfs/parameters/zfs_arc_max
Persist ARC max:        echo 'options zfs zfs_arc_max=N' > /etc/modprobe.d/zfs.conf
ARC consumes RAM:       This is NORMAL. ARC is the filesystem cache. Free RAM is wasted RAM.
L2ARC considerations:   Only useful when ARC hit rate is high but working set exceeds RAM.
                        L2ARC headers consume ARC RAM (~200 bytes per cached block).
                        SSD wear: L2ARC writes constantly. Use enterprise SSDs.
Prefetch:               zfs_prefetch_disable=0 (default). Disable for random I/O workloads.

=== RECORDSIZE OPTIMIZATION ===
General files:          recordsize=128k (default, good for most)
Databases (PostgreSQL): recordsize=16k (PG pages are 8k; 16k allows coalescing adjacent pages)
Databases (MySQL InnoDB): recordsize=16k (InnoDB page = 16k)
VMs (zvol):             volblocksize=64k (match guest filesystem block)
Media files:            recordsize=1M (large sequential reads)
Logs:                   recordsize=128k with compression=zstd (sequential writes)
Small files (git):      recordsize=16k or 32k (reduce internal fragmentation)
Rule:                   Match recordsize to your dominant I/O size. 'zpool iostat -r' shows distribution.

=== SEND/RECV ===
Full send:              zfs send -Rw rpool/data@snap | ssh node2 zfs recv tank/data
Incremental:            zfs send -Ri @snap1 rpool/data@snap2 | ssh node2 zfs recv tank/data
Raw send (encrypted):   zfs send -w rpool/data@snap  (preserves encryption, receiver cannot read)
Resume interrupted:     zfs send -t TOKEN | ssh node2 zfs recv -s tank/data
Syncoid (automated):    syncoid rpool/data root@node2:tank/data  (handles incrementals automatically)
Syncoid with sanoid:    sanoid creates snapshots, syncoid replicates them. Set in /etc/sanoid/sanoid.conf
Bandwidth limit:        zfs send ... | pv -L 100m | ssh node2 zfs recv ...
Compression in flight:  zfs send ... | lz4 | ssh node2 'lz4 -d | zfs recv ...'

=== ENCRYPTION ===
Create encrypted ds:    zfs create -o encryption=aes-256-gcm -o keyformat=passphrase rpool/secret
Create with keyfile:    zfs create -o encryption=on -o keyformat=raw -o keylocation=file:///root/key rpool/secret
Load key:               zfs load-key rpool/secret
Unload key:             zfs unload-key rpool/secret
Change key:             zfs change-key -o keyformat=passphrase rpool/secret
Encrypted send:         zfs send -w (raw send preserves encryption, receiver needs no key)
Inheritance:            Child datasets inherit parent encryption. Cannot un-encrypt a child.
Mount encrypted:        zfs load-key -a && zfs mount -a  (load all keys, mount all)
Auto-unlock at boot:    Store keyfile on separate USB, load in initramfs

=== BOOT ENVIRONMENTS ===
List BEs:               kbe list  (or: zfs list -r rpool/ROOT)
Create BE:              kbe create pre-upgrade  (or: zfs snapshot rpool/ROOT/cs@pre-upgrade)
Activate BE:            kbe activate pre-upgrade  (sets bootfs property)
Rollback:               kbe rollback pre-upgrade  (boot into previous known-good state)
Delete old BE:          kbe destroy old-be  (frees space)
ZFSBootMenu:            Reads bootfs property. Shows all BEs at boot. Enter the recovery shell with Ctrl+R.
Pre-upgrade pattern:    kbe create pre-upgrade && kupgrade  (always snapshot before upgrading)

=== SANOID/SYNCOID ===
Config location:        /etc/sanoid/sanoid.conf
Snapshot policy:        [rpool/data] use_template = production
Template example:       [template_production] hourly=24, daily=30, monthly=12, yearly=1, autosnap=yes, autoprune=yes
Syncoid replication:    syncoid --no-sync-snap rpool/data root@backup:tank/data
Monitor snapshots:      sanoid --monitor-snapshots  (check for stale/missing snapshots)
Nagios integration:     sanoid --monitor-snapshots returns Nagios-format exit codes

=== KLDLOAD ZFS TOOLS ===
kst                     System status dashboard — shows pools, datasets, ARC, services
ksnap                   Create snapshots: ksnap /srv/data  (wraps zfs snapshot with timestamps)
ksnap rollback          Rollback: ksnap rollback /srv/data  (interactive snapshot selection)
kbe                     Boot environment manager: kbe list, kbe create, kbe activate, kbe rollback
kdf                     Dataset disk usage: kdf  (sorted, human-readable, shows compression ratio)
kdir                    Create dataset: kdir /srv/newdata  (sets compression=zstd, mountpoint)
kclone                  Clone dataset: kclone /srv/data /srv/data-copy  (instant, zero-cost copy)
kexport                 Export: kexport /srv/data backup.zstream  (ZFS stream, OVA, QCOW2)

=== TROUBLESHOOTING ===

Pool DEGRADED:
  1. zpool status -v  (identify faulted device)
  2. ksnap /srv  (snapshot everything FIRST)
  3. Check dmesg for disk errors: dmesg | grep -i 'error\|fault\|i/o'
  4. If transient: zpool online rpool DEVICE
  5. If failed: zpool replace rpool OLD NEW
  6. After replace: zpool scrub rpool  (verify all data)
  7. Monitor: zpool status until resilver completes

Pool FAULTED:
  1. zpool status -v  (read the error)
  2. zpool clear rpool  (clear transient errors)
  3. If too many errors: zpool import -fN rpool  (force import, don't mount)
  4. Mount manually: zfs mount rpool/ROOT/cs
  5. Recover data: zfs send to backup FIRST

Slow performance:
  1. zpool iostat -v 5  (watch I/O distribution across vdevs)
  2. Check ARC: arc_summary  (or parse arcstats)
  3. Check fragmentation: zpool list -v  (FRAG column)
  4. Check recordsize vs workload: zfs get recordsize
  5. Check compression: zfs get compression,compressratio

Checksum errors:
  1. zpool status -v  (shows files with errors)
  2. zpool scrub rpool  (scrub repairs from redundancy)
  3. If no redundancy: restore affected files from snapshot

=== PHILOSOPHY ===
Always snapshot before changes. ksnap is cheap. Regret is expensive.
ARC using RAM is not a problem — it IS the filesystem cache.
Match recordsize to your workload, not to convention.
Scrubs are not optional. They are how ZFS proves your data is intact.
Encrypted send (-w) means the backup server never sees your data.
Boot environments mean upgrades are always reversible.
"""

PARAMETER temperature 0.3
PARAMETER num_ctx 16384
# Build the ZFS expert model
ollama create zfs-expert -f /srv/ollama/Modelfile.zfs-expert

# Verify it
ollama run zfs-expert "What recordsize should I use for PostgreSQL and why?"
A DBA memorizes pg_stat tables. A storage admin memorizes arcstats. This model memorizes both — plus your pool layout, your snapshot schedule, and every ZFS property that matters.

2. Live context script

The Modelfile is the AI's education. The context script is the patient chart. Every query includes fresh pool state, ARC stats, snapshot inventory, and scrub history so the model answers based on what is happening right now.

The ZFS context builder

#!/bin/bash
# /usr/local/bin/kai-zfs — query the ZFS AI with live pool context

build_zfs_context() {
    echo "=== LIVE ZFS STATE ($(date -Iseconds)) ==="

    echo -e "\n--- zpool status -v ---"
    zpool status -v 2>/dev/null

    echo -e "\n--- zpool list ---"
    zpool list -o name,size,alloc,free,frag,cap,dedup,health 2>/dev/null

    echo -e "\n--- zpool iostat ---"
    zpool iostat -v 2>/dev/null

    echo -e "\n--- zfs list (datasets) ---"
    zfs list -o name,used,avail,refer,mountpoint,recordsize,compression,compressratio 2>/dev/null

    echo -e "\n--- Snapshot inventory ---"
    zfs list -t snapshot -o name,used,creation -s creation 2>/dev/null | tail -30

    echo -e "\n--- Snapshot ages (oldest per dataset) ---"
    zfs list -t snapshot -o name,creation -s creation 2>/dev/null | \
        awk 'NR>1{split($1,a,"@"); if(!(a[1] in seen)){seen[a[1]]=1; print}}' | head -20

    echo -e "\n--- ARC statistics ---"
    if [ -f /proc/spl/kstat/zfs/arcstats ]; then
        awk '
            /^size /          {printf "ARC size:       %d MB\n",$3/1048576}
            /^c_max /         {printf "ARC max:        %d MB\n",$3/1048576}
            /^hits /          {h=$3}
            /^misses /        {m=$3}
            /^prefetch_data_hits /   {pdh=$3}
            /^prefetch_data_misses / {pdm=$3}
            /^l2_hits /       {printf "L2ARC hits:     %d\n",$3}
            /^l2_misses /     {printf "L2ARC misses:   %d\n",$3}
            /^l2_size /       {printf "L2ARC size:     %d MB\n",$3/1048576}
            END {
                if(h+m>0) printf "ARC hit rate:   %.1f%%\n",h/(h+m)*100
                if(pdh+pdm>0) printf "Prefetch rate:  %.1f%%\n",pdh/(pdh+pdm)*100
            }
        ' /proc/spl/kstat/zfs/arcstats
    fi

    echo -e "\n--- Last scrub ---"
    zpool status 2>/dev/null | grep -A3 'scan:'

    echo -e "\n--- Sanoid snapshot health ---"
    sanoid --monitor-snapshots 2>/dev/null || echo "(sanoid not installed)"

    echo -e "\n--- Memory ---"
    free -h 2>/dev/null

    echo -e "\n--- ZFS kernel module params ---"
    for p in zfs_arc_max zfs_arc_min zfs_prefetch_disable; do
        f="/sys/module/zfs/parameters/$p"
        [ -f "$f" ] && echo "$p = $(cat "$f")"
    done

    echo -e "\n--- Recent ZFS-related errors ---"
    dmesg 2>/dev/null | grep -i 'zfs\|zio\|checksum\|degraded' | tail -10
}

QUESTION="$*"
if [ -z "$QUESTION" ]; then
    echo "Usage: kai-zfs <question>"
    echo ""
    echo "Examples:"
    echo "  kai-zfs 'is my pool healthy?'"
    echo "  kai-zfs 'my pool is degraded — what do I do?'"
    echo "  kai-zfs 'optimize recordsize for postgres'"
    echo "  kai-zfs 'set up replication to backup server'"
    echo "  kai-zfs 'my ARC hit rate is 72%'"
    echo "  kai-zfs 'which snapshots can I delete to free space?'"
    exit 1
fi

CONTEXT=$(build_zfs_context)

echo -e "${CONTEXT}\n\n=== QUESTION ===\n${QUESTION}" | ollama run zfs-expert
You don't diagnose a pool by reading the ZFS manual. You diagnose it by reading zpool status. This script makes sure the AI always reads zpool status before it opens its mouth.

3. Example queries

Every query below hits the model with fresh pool data. The AI sees your actual vdevs, your actual ARC numbers, your actual snapshot list. It doesn't guess — it reads.

"My pool is degraded"

The AI reads zpool status -v, identifies the faulted device by path, checks dmesg for I/O errors, and tells you whether to zpool online (transient) or zpool replace (hardware failure). It recommends ksnap before touching anything.

kai-zfs "my pool shows DEGRADED — what happened and what do I do?"
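The triage above can be sketched as a small pre-flight gate. This is a hypothetical helper, not part of the kit: `parse_health` is pure text extraction so it works on captured output, while the pool name (`rpool`) and the `ksnap /srv` path are assumptions from this setup.

```shell
#!/bin/bash
# Sketch: gate the degraded-pool runbook on actual pool state.
# parse_health extracts the `state:` line from `zpool status` output,
# so it can be exercised on captured text without a live pool.
parse_health() {
    awk -F': *' '/^[[:space:]]*state:/ {print $2; exit}'
}

POOL="${1:-rpool}"   # assumed pool name — pass yours as $1
STATUS=$(zpool status -v "$POOL" 2>/dev/null)

case "$(printf '%s\n' "$STATUS" | parse_health)" in
    ONLINE)   echo "$POOL is healthy — nothing to do" ;;
    DEGRADED) echo "snapshot first (ksnap /srv), then zpool online or zpool replace" ;;
    *)        echo "could not read pool state — run: zpool status -v $POOL" ;;
esac
```

The `case` forces you to read the actual state before acting, which is exactly the habit the Modelfile encodes.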

"Optimize recordsize for PostgreSQL"

The AI checks your current recordsize on the database dataset, sees it's 128k, and recommends zfs set recordsize=16k rpool/var/lib/pgsql. It explains that PG uses 8k pages and 16k gives room for WAL coalescing. It warns you: recordsize only applies to new writes.

kai-zfs "I run PostgreSQL on rpool/var/lib/pgsql — what recordsize?"
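The check-then-set flow looks roughly like this. The dataset path comes from the example query, and `recsize_for` is a hypothetical helper that just encodes the recordsize table from the Modelfile.

```shell
#!/bin/bash
# recsize_for encodes the recordsize table from the system prompt,
# so the recommendation can be sanity-checked without a pool.
recsize_for() {
    case "$1" in
        postgres|mysql) echo 16k  ;;  # DB pages: 8k (PG) / 16k (InnoDB)
        media)          echo 1M   ;;  # large sequential reads
        vm)             echo 64k  ;;  # zvol volblocksize, not recordsize
        *)              echo 128k ;;  # sane default
    esac
}

DS=rpool/var/lib/pgsql   # dataset from the example query — use yours
zfs get -H -o value recordsize "$DS" 2>/dev/null   # current value
ksnap /var/lib/pgsql 2>/dev/null                   # snapshot first
if zfs set recordsize="$(recsize_for postgres)" "$DS" 2>/dev/null; then
    echo "$DS: recordsize set to $(recsize_for postgres) (new writes only)"
fi
# Existing tables keep their old block size until rewritten
# (dump/restore or VACUUM FULL).
```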

"Set up replication to backup server"

The AI generates the syncoid command for your actual dataset names. It recommends sanoid for snapshot scheduling, shows you the /etc/sanoid/sanoid.conf template with your dataset paths, and gives you the cron entry for hourly replication.

kai-zfs "set up replication of rpool/srv to root@backup:tank/srv"
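The pieces the AI assembles look roughly like this sketch. Dataset paths and the `backup` hostname come from the query above; the retention numbers mirror the template in the Modelfile, and the `syncoid-srv` cron filename is an arbitrary choice.

```shell
# /etc/sanoid/sanoid.conf — sanoid creates snapshots, syncoid ships them
cat > /etc/sanoid/sanoid.conf <<'EOF'
[rpool/srv]
    use_template = production

[template_production]
    hourly = 24
    daily = 30
    monthly = 12
    yearly = 1
    autosnap = yes
    autoprune = yes
EOF

# Hourly replication; --no-sync-snap reuses sanoid's snapshots
# instead of creating an extra one per run.
cat > /etc/cron.d/syncoid-srv <<'EOF'
SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
15 * * * * root syncoid --no-sync-snap rpool/srv root@backup:tank/srv
EOF
```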

"My ARC hit rate is 72%"

The AI reads the live ARC stats, calculates your working set size, compares it to available RAM, and recommends increasing zfs_arc_max. It gives you the exact echo command and the /etc/modprobe.d/zfs.conf line to persist it across reboots.

kai-zfs "my ARC hit rate is low — how do I fix it?"
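The arithmetic behind that advice is simple enough to verify by hand. `arc_hit_rate` is a pure helper (an illustration, not a kit tool); the tuning commands are shown as comments because they need root, and they follow the Modelfile's 3/4-of-RAM recipe.

```shell
#!/bin/bash
# arc_hit_rate: hits misses -> percentage with one decimal place
arc_hit_rate() {
    awk -v h="$1" -v m="$2" 'BEGIN { if (h + m > 0) printf "%.1f", h / (h + m) * 100 }'
}

ARCSTATS=/proc/spl/kstat/zfs/arcstats
if [ -r "$ARCSTATS" ]; then
    hits=$(awk '/^hits /{print $3}' "$ARCSTATS")
    misses=$(awk '/^misses /{print $3}' "$ARCSTATS")
    echo "ARC hit rate: $(arc_hit_rate "$hits" "$misses")%"
fi

# To raise the cap to 3/4 of RAM (root required), per the Modelfile:
#   BYTES=$(( $(awk '/MemTotal/{print $2 * 1024}' /proc/meminfo) * 3 / 4 ))
#   echo "$BYTES" > /sys/module/zfs/parameters/zfs_arc_max
#   echo "options zfs zfs_arc_max=$BYTES" > /etc/modprobe.d/zfs.conf
```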

"Which snapshots can I delete?"

The AI reads your snapshot list with used space, identifies snapshots older than retention policy, finds the ones holding the most space via zfs list -t snapshot -o name,used -s used, and recommends specific zfs destroy commands. It warns about dependent clones.

kai-zfs "I'm running low on space — which snapshots should I clean up?"
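Ranking by exact bytes avoids the "1.2G vs 900M" sort trap. A sketch: `top_by_used` is plain text processing (a hypothetical helper), while the `zfs` invocation needs a live pool.

```shell
#!/bin/bash
# top_by_used: stdin lines of "<snapshot> <bytes>", arg = how many to show
top_by_used() {
    sort -k2 -rn | head -n "${1:-10}"
}

# -H strips headers, -p prints exact bytes so the sort is numeric
zfs list -t snapshot -Hp -o name,used 2>/dev/null | top_by_used 5

# Before destroying, make sure no clone depends on the snapshot:
#   zfs list -t all -o name,origin | grep '@candidate'
# then reclaim: zfs destroy rpool/data@candidate
```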

"Set up encrypted dataset for secrets"

The AI walks you through kdir -o encryption=on -o keyformat=passphrase /srv/secrets, explains the keyformat options (passphrase, raw, hex), shows how to auto-load keys at boot, and reminds you that zfs send -w preserves encryption through replication.

kai-zfs "create an encrypted dataset for /srv/secrets with auto-unlock"
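A keyfile-based variant, sketched under the assumption that you want unattended unlock from local disk (the Modelfile's separate-USB keyfile is the more paranoid option). `make_raw_key` and the key path are illustrative, not kit tools.

```shell
#!/bin/bash
# make_raw_key: write a 32-byte raw key (what keyformat=raw expects)
# with owner-only permissions.
make_raw_key() {
    ( umask 077; head -c 32 /dev/urandom > "$1" )
}

KEY="${KEY:-/root/.keys/srv-secrets.key}"   # example path — choose yours
mkdir -p "$(dirname "$KEY")" 2>/dev/null && make_raw_key "$KEY"

if zfs create -o encryption=aes-256-gcm -o keyformat=raw \
       -o keylocation="file://$KEY" rpool/srv/secrets 2>/dev/null; then
    echo "created rpool/srv/secrets (encrypted, key at $KEY)"
fi

# At boot: with keylocation=file:// on an already-mounted path,
# `zfs load-key -a && zfs mount -a` from an early boot unit unlocks
# everything. Keep an off-box copy of the key — lose it, lose the data.
```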

4. Automated scrub and health monitoring

The AI watches your pools continuously. A cron job runs health checks, detects degraded vdevs, stale scrubs, low ARC hit rates, and aging snapshots — then writes a report with exact remediation commands.

ZFS health monitor

#!/bin/bash
# /usr/local/bin/kai-zfs-monitor — AI-driven ZFS health check

REPORT_DIR="/var/log/kai-zfs"
mkdir -p "$REPORT_DIR"
REPORT="$REPORT_DIR/$(date +%F).txt"

# Gather deep ZFS state
STATE=$(cat <<ZFSDATA
=== ZFS HEALTH CHECK — $(hostname) — $(date) ===

--- Pool Status ---
$(zpool status -v 2>/dev/null)

--- Pool Capacity ---
$(zpool list -o name,size,alloc,free,frag,cap,health 2>/dev/null)

--- Dataset Usage ---
$(zfs list -o name,used,avail,refer,compressratio -s used 2>/dev/null)

--- ARC Stats ---
$(awk '/^size /{printf "ARC size: %d MB\n",$3/1048576}
      /^c_max /{printf "ARC max: %d MB\n",$3/1048576}
      /^hits /{h=$3} /^misses /{m=$3}
      END{if(h+m>0) printf "ARC hit rate: %.1f%%\n",h/(h+m)*100}' \
    /proc/spl/kstat/zfs/arcstats 2>/dev/null)

--- Last Scrub ---
$(zpool status 2>/dev/null | grep -A3 'scan:')

--- Snapshots by Age ---
$(zfs list -t snapshot -o name,used,creation -s creation 2>/dev/null | head -10)
...
$(zfs list -t snapshot -o name,used,creation -s creation 2>/dev/null | tail -10)

--- Sanoid Monitor ---
$(sanoid --monitor-snapshots 2>/dev/null || echo "(not installed)")

--- Disk Errors ---
$(dmesg 2>/dev/null | grep -i 'error\|fault\|i/o' | tail -10)
ZFSDATA
)

# AI analysis
ANALYSIS=$(echo "${STATE}

Analyze this ZFS health data. Report:
1. CRITICAL — degraded vdevs, checksum errors, pools near capacity (>80%)
2. SCRUB STATUS — when was the last scrub, is it overdue (>30 days)?
3. ARC HEALTH — hit rate assessment, tuning recommendations
4. SNAPSHOT HYGIENE — oldest snapshots, anything holding excessive space
5. RECOMMENDATIONS — exact commands: ksnap, kdf, kbe, zpool scrub, zfs destroy

Be specific. Use actual dataset names and values from the data." | \
    ollama run zfs-expert)

{
    echo "=== AI ZFS HEALTH REPORT ==="
    echo "=== $(hostname) — $(date) ==="
    echo ""
    echo "$ANALYSIS"
    echo ""
    echo "=== RAW DATA ==="
    echo "$STATE"
} > "$REPORT"

# Alert on critical issues
if echo "$ANALYSIS" | grep -qi 'CRITICAL'; then
    echo "$ANALYSIS" | head -20 | logger -t kai-zfs -p daemon.warning
fi

echo "ZFS report saved: $REPORT"

Schedule it

# Daily ZFS health check at 5 AM
cat > /etc/cron.d/kai-zfs-monitor <<'EOF'
SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
0 5 * * * root /usr/local/bin/kai-zfs-monitor
EOF

# Weekly scrub on Sunday at 2 AM (the AI will report results Monday morning)
cat > /etc/cron.d/zfs-scrub <<'EOF'
SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
0 2 * * 0 root zpool scrub rpool
EOF

# Check reports
cat /var/log/kai-zfs/$(date +%F).txt
A good storage admin checks zpool status every morning. This cron job is that admin — it never oversleeps, never forgets, and never says "I'll check it later."
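One gap worth closing: the monitor writes a file per day and nothing deletes them. A hypothetical retention job (90 days is an arbitrary choice, and `kai-zfs-prune` is not a kit tool) keeps the log directory bounded:

```shell
# Prune AI reports older than 90 days, daily at 6 AM
cat > /etc/cron.d/kai-zfs-prune <<'EOF'
SHELL=/bin/bash
0 6 * * * root find /var/log/kai-zfs -name '*.txt' -mtime +90 -delete
EOF
```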

5. Replicate to fleet via syncoid

Train the ZFS expert on one node. Replicate the model to every server in the fleet. Each node injects its own pool state, but the ZFS knowledge is identical everywhere.

Fleet deployment

#!/bin/bash
# replicate-zfs-expert.sh — push the ZFS model to all nodes

NODES="node-2 node-3 node-4 node-5"

# Snapshot the trained model
zfs snapshot rpool/srv/ollama@zfs-expert-$(date +%F)

# Replicate to every node
for node in $NODES; do
    echo "--- Syncing ZFS expert to $node ---"
    syncoid --no-sync-snap rpool/srv/ollama "root@${node}:rpool/srv/ollama"
    ssh "root@${node}" "systemctl restart ollama"
    echo "$node: done"
done

# Deploy the kai-zfs script and cron job to every node
for node in $NODES; do
    scp /usr/local/bin/kai-zfs "root@${node}:/usr/local/bin/kai-zfs"
    scp /usr/local/bin/kai-zfs-monitor "root@${node}:/usr/local/bin/kai-zfs-monitor"
    scp /etc/cron.d/kai-zfs-monitor "root@${node}:/etc/cron.d/kai-zfs-monitor"
    ssh "root@${node}" "chmod +x /usr/local/bin/kai-zfs /usr/local/bin/kai-zfs-monitor"
done

echo "Fleet updated at $(date)"
Same doctor, different patients. Every node runs the same ZFS expert but feeds it its own pool data. node-2 asks about its mirror. node-5 asks about its RAIDZ2. Same expertise. Different answers.
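A hypothetical post-deploy check closes the loop: confirm every node actually serves the model before trusting its 3 AM reports. `check_node` wraps the ssh call so the logic can be exercised without a fleet; the node list matches the script above.

```shell
#!/bin/bash
# check_node: does the node's ollama instance list the zfs-expert model?
check_node() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$1" \
        "ollama list | grep -q zfs-expert"
}

for node in node-2 node-3 node-4 node-5; do
    if check_node "$node"; then
        echo "$node: zfs-expert present"
    else
        echo "$node: MISSING — rerun replicate-zfs-expert.sh"
    fi
done
```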

ZFS gives you the primitives. Pools, datasets, snapshots, send/recv, checksums, encryption. These are not abstractions — they are building blocks. The AI doesn't replace your understanding of them. It amplifies it. It reads your ARC stats at 3 AM. It catches the degraded vdev before your users do. It remembers the recordsize tuning you set six months ago and why.

Learn the primitives. Then teach them to a machine that never sleeps.