
ZFS Zero to Hero

Complete operational guide — from empty disk to replicating datasets between nodes. Every command, every option, every config. No shortcuts.

Works on CentOS/RHEL and Debian. Commands are identical on both.


Part 1: Pools

Create a pool

# Single disk (no redundancy)
zpool create -o ashift=12 -O compression=lz4 -O acltype=posixacl -O xattr=sa -O relatime=on rpool /dev/sda

# Mirror (2 disks, survives 1 failure)
zpool create -o ashift=12 -O compression=lz4 -O acltype=posixacl -O xattr=sa rpool mirror /dev/sda /dev/sdb

# 3-way mirror (3 disks, survives 2 failures)
zpool create -o ashift=12 -O compression=lz4 rpool mirror /dev/sda /dev/sdb /dev/sdc

# RAIDZ1 (3+ disks, 1 parity, survives 1 failure)
zpool create -o ashift=12 -O compression=lz4 rpool raidz1 /dev/sda /dev/sdb /dev/sdc

# RAIDZ2 (4+ disks, 2 parity, survives 2 failures)
zpool create -o ashift=12 -O compression=lz4 rpool raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd

# RAIDZ3 (5+ disks, 3 parity, survives 3 failures)
zpool create -o ashift=12 -O compression=lz4 rpool raidz3 /dev/sd{a,b,c,d,e}

# Striped mirrors (4 disks, 2 mirror vdevs, fast + redundant)
zpool create -o ashift=12 -O compression=lz4 rpool \
  mirror /dev/sda /dev/sdb \
  mirror /dev/sdc /dev/sdd

Pool creation options explained

Option                   Value                      Why
ashift=12                4K sector alignment        Matches modern drives. Never use 9 (512b). Use 13 for some NVMe.
compression=lz4          Fast, ~1.5-2x ratio        Always on. Zero reason to disable.
acltype=posixacl         POSIX ACLs                 Required for systemd, containers, most apps.
xattr=sa                 Store xattrs in dnodes     Faster than directory-based xattrs.
relatime=on              Relaxed atime updates      Reduces write amplification.
normalization=formD      Unicode normalization      Consistent filename handling.
dnodesize=auto           Variable dnode size        Better metadata performance.
autotrim=on              Automatic TRIM             For SSDs. Omit for spinning rust.
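
Not sure which ashift to pick? Check what the drive reports before creating the pool (treat this as a sanity check, not gospel: some drives advertise 512-byte logical sectors while using 4K physically):

# Physical vs logical sector size — 4096 physical means ashift=12
lsblk -o NAME,PHY-SEC,LOG-SEC /dev/sda
cat /sys/block/sda/queue/physical_block_size

# Confirm what the pool actually got
zpool get ashift rpool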

Pool operations

# List pools
zpool list
zpool list -v                    # verbose — shows vdev layout

# Pool health + config
zpool status rpool
zpool status -v rpool            # verbose — shows individual disk status

# Pool I/O stats (live, 2 second interval)
zpool iostat rpool 2
zpool iostat -v rpool 2          # per-vdev breakdown

# Pool history (every command ever run on this pool)
zpool history rpool
zpool history rpool | tail -20   # last 20 commands

# Scrub (verify all checksums — run weekly)
zpool scrub rpool
zpool status rpool | grep scan   # check scrub progress

# Import / export
zpool export rpool               # detach pool (for migration or unmount)
zpool import                     # list available pools
zpool import rpool               # re-import
zpool import -d /dev/disk/by-id rpool   # import by disk ID (more reliable)

# Upgrade pool features
zpool upgrade rpool

# Destroy pool (DESTRUCTIVE)
zpool destroy rpool
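
The scrub shown above is a one-shot command. One way to actually run it weekly is a plain cron entry (adjust day and time to taste; some distros ship an equivalent systemd timer instead):

# Scrub every Sunday at 03:00 (path may be /sbin/zpool on some systems)
echo '0 3 * * 0 root /usr/sbin/zpool scrub rpool' > /etc/cron.d/zpool-scrub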

Add devices to existing pool

# Add a mirror vdev (expand capacity)
zpool add rpool mirror /dev/sde /dev/sdf

# Add a cache device (L2ARC — read cache on SSD)
zpool add rpool cache /dev/nvme0n1

# Add a log device (SLOG — synchronous write log)
zpool add rpool log mirror /dev/nvme1n1 /dev/nvme2n1

# Add a special vdev (metadata + small blocks on fast storage)
zpool add rpool special mirror /dev/nvme0n1p4 /dev/nvme1n1p4
zfs set special_small_blocks=64K rpool

# Replace a failed disk
zpool replace rpool /dev/sda /dev/sdg
zpool status rpool   # watch resilver progress

# Remove a device (works for cache, log, spare, and top-level mirror/special vdevs; not raidz members)
zpool remove rpool /dev/nvme0n1

# Take a device offline / online
zpool offline rpool /dev/sda
zpool online rpool /dev/sda
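
The examples above use /dev/sdX names for brevity. Those names can shuffle between boots, so for anything permanent it is worth finding the stable by-id paths (same idea as the by-id import shown earlier):

# Map a kernel name to its stable by-id path
ls -l /dev/disk/by-id/ | grep -w sda

# Show which real devices back the pool
zpool status -L rpool    # -L resolves symlinks to the underlying /dev names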

Part 2: Datasets

Create datasets

# Basic dataset
zfs create rpool/data

# With mountpoint
zfs create -o mountpoint=/srv/app rpool/srv/app

# With compression
zfs create -o mountpoint=/srv/logs -o compression=zstd rpool/srv/logs

# With quota (limit size)
zfs create -o mountpoint=/home/alice -o quota=50G rpool/home/alice

# With reservation (guaranteed space)
zfs create -o mountpoint=/srv/db -o reservation=100G rpool/srv/db

# With recordsize tuned for workload
zfs create -o mountpoint=/srv/postgres -o recordsize=8k rpool/srv/postgres      # PostgreSQL
zfs create -o mountpoint=/srv/mysql -o recordsize=16k rpool/srv/mysql           # MySQL
zfs create -o mountpoint=/srv/media -o recordsize=1M rpool/srv/media            # large files

# Non-mountable (container for child datasets)
zfs create -o canmount=off -o mountpoint=none rpool/ROOT

# Encrypted dataset
zfs create -o encryption=aes-256-gcm -o keyformat=passphrase rpool/srv/secrets

# All options at once
zfs create \
  -o mountpoint=/srv/production \
  -o compression=lz4 \
  -o quota=500G \
  -o reservation=200G \
  -o recordsize=128k \
  -o atime=off \
  -o logbias=throughput \
  rpool/srv/production
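
One thing to remember about the encrypted dataset above: after a reboot or a pool import, the key has to be loaded before the dataset will mount.

zfs get keystatus rpool/srv/secrets       # available / unavailable
zfs load-key rpool/srv/secrets            # prompts for the passphrase
zfs mount rpool/srv/secrets

# Or load every key at once after an import
zfs load-key -a
zfs mount -a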

Dataset properties

# List all datasets
zfs list
zfs list -r rpool                # recursive from rpool
zfs list -o name,used,avail,compress,mountpoint   # custom columns

# Get a property
zfs get compression rpool/data
zfs get all rpool/data           # all properties
zfs get compressratio rpool      # how much compression is saving

# Set a property
zfs set compression=zstd rpool/srv/archive
zfs set quota=100G rpool/home/alice
zfs set atime=off rpool/srv/database
zfs set recordsize=8k rpool/srv/postgres

# Inherit from parent
zfs inherit compression rpool/data

# Mount / unmount
zfs mount rpool/data
zfs unmount rpool/data
zfs mount -a                     # mount all datasets

Dataset properties reference

Property                Values                         Use case
compression             lz4, zstd, gzip-9, off         lz4 for general, zstd for archives, off for pre-compressed
recordsize              4k-1M                          8k=PostgreSQL, 16k=MySQL, 128k=general, 1M=media
quota                   size or none                   Limit dataset size
reservation             size or none                   Guarantee space for dataset
atime                   on, off                        off for databases and containers
logbias                 latency, throughput            throughput for sequential writes
sync                    standard, always, disabled     disabled only if you accept data loss
canmount                on, off, noauto                noauto for boot environments
mountpoint              path or none                   where the dataset mounts
encryption              aes-256-gcm, off               per-dataset encryption
dedup                   on, off, verify                WARNING: uses massive RAM. Usually not worth it.
snapdir                 hidden, visible                visible exposes .zfs/snapshot to users
special_small_blocks    0-1M                           route small blocks to special vdev
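
When a dataset does not behave the way the table suggests, check which properties were set locally versus inherited:

# Only properties set directly on datasets (not inherited or defaults)
zfs get -r -s local all rpool

# The SOURCE column shows local, inherited, or default for a single property
zfs get -r compression rpool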

Part 3: Snapshots

Create snapshots

# Single dataset
zfs snapshot rpool/data@mysnap

# With timestamp
zfs snapshot rpool/data@$(date +%Y%m%d-%H%M%S)

# Recursive (all child datasets)
zfs snapshot -r rpool@full-backup-$(date +%Y%m%d)

# Multiple datasets
zfs snapshot rpool/home@backup rpool/srv@backup rpool/var/log@backup

List snapshots

# All snapshots
zfs list -t snapshot

# With size and creation date
zfs list -t snapshot -o name,used,refer,creation -S creation

# Snapshots for a specific dataset
zfs list -t snapshot -r rpool/home

# Count snapshots
zfs list -t snapshot -H | wc -l

# Space used by snapshots
zfs get usedbysnapshots rpool

Access snapshot data

# Browse snapshot contents (without rollback)
ls /home/.zfs/snapshot/
ls /home/.zfs/snapshot/mysnap/alice/documents/

# Make .zfs directory visible
zfs set snapdir=visible rpool/home

# Copy a file from a snapshot
cp /home/.zfs/snapshot/mysnap/alice/important.txt /home/alice/important.txt

Rollback

# Rollback to most recent snapshot
zfs rollback rpool/data@mysnap

# Rollback destroying intermediate snapshots
zfs rollback -r rpool/data@old-snapshot

# Rollback destroying intermediate snapshots AND clones
zfs rollback -rR rpool/data@old-snapshot

Destroy snapshots

# Single snapshot
zfs destroy rpool/data@mysnap

# Range of snapshots
zfs destroy rpool/data@snap1%snap5

# All snapshots matching a pattern
zfs list -t snapshot -H -o name | grep "auto-" | xargs -n1 zfs destroy

# Destroy recursively
zfs destroy -r rpool@full-backup-20260322
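
The pattern-matching destroy above deletes everything it matches. A retention sketch that keeps only the newest snapshots looks like this (dataset name and "auto-" prefix are examples; sanoid does this properly if you want policy-driven retention):

# Keep the 7 newest auto- snapshots of rpool/data, destroy the rest
zfs list -t snapshot -H -o name -S creation -r rpool/data \
  | grep '@auto-' \
  | tail -n +8 \
  | xargs -r -n1 zfs destroy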

Part 4: Clones

Create clones

# Snapshot first (required — clones come from snapshots)
zfs snapshot rpool/srv/production@clone-src

# Clone
zfs clone rpool/srv/production@clone-src rpool/srv/staging

# Clone starts at near-zero space
zfs list rpool/srv/staging   # USED will be ~0

Clone properties

# Clone inherits parent properties but can be changed
zfs set mountpoint=/srv/staging rpool/srv/staging
zfs set quota=50G rpool/srv/staging

# Check clone origin
zfs get origin rpool/srv/staging

Promote a clone

# Make the clone independent (no longer depends on origin snapshot)
zfs promote rpool/srv/staging

# Now the original depends on the clone's snapshot
# The clone becomes the "real" dataset

Destroy a clone

# Must destroy the clone before the origin snapshot
zfs destroy rpool/srv/staging
zfs destroy rpool/srv/production@clone-src
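
Before destroying an origin snapshot, you can check whether anything still depends on it: the read-only clones property lists every clone of a snapshot.

zfs get -H -o value clones rpool/srv/production@clone-src
# empty output (or "-") means no clones depend on it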

Part 5: Boot Environments

How they work

Boot environments are ZFS datasets under rpool/ROOT/. ZFSBootMenu detects them and lets you choose which one to boot.

# Current boot environment
zpool get bootfs rpool

# List all boot environments
zfs list -r rpool/ROOT -o name,used,mountpoint,creation

# The active one has mountpoint=/
zfs get mountpoint rpool/ROOT/default

Create a boot environment

# Snapshot the current root
zfs snapshot rpool/ROOT/default@before-upgrade

# Clone it as a new BE
zfs clone rpool/ROOT/default@before-upgrade rpool/ROOT/safe-rollback

Switch boot environment

# Set which BE to boot next
zpool set bootfs=rpool/ROOT/safe-rollback rpool

# Reboot into it
reboot

# At the ZFSBootMenu screen, you can also select BEs interactively

Rollback a broken upgrade

# Option 1: from command line (if you can still boot)
# bootfs must name a dataset, not a snapshot: boot the pre-upgrade clone BE
zpool set bootfs=rpool/ROOT/safe-rollback rpool
reboot

# Option 2: from kldload live ISO
krecovery import rpool
krecovery list-be
krecovery activate rpool/ROOT/safe-rollback
reboot

Part 6: Replication

Local replication (to a backup disk)

# Create a backup pool on a second disk
zpool create backup /dev/sdb

# Full initial send
zfs snapshot -r rpool@backup-initial
zfs send -R rpool@backup-initial | zfs receive -F backup/rpool

# Incremental daily send
zfs snapshot -r rpool@backup-day2
zfs send -R -i rpool@backup-initial rpool@backup-day2 | zfs receive -F backup/rpool

# Verify
zfs list -r backup/rpool

Remote replication (over SSH)

# Full send to remote host
zfs snapshot -r rpool@replicate
zfs send -R rpool@replicate | ssh backup-server zfs receive -F tank/backup/rpool

# Incremental
zfs snapshot -r rpool@replicate-2
zfs send -R -i rpool@replicate rpool@replicate-2 | ssh backup-server zfs receive -F tank/backup/rpool

# Compressed transfer
zfs send -R rpool@replicate | zstd -3 | ssh backup-server "zstd -d | zfs receive -F tank/backup"

# With bandwidth limit (10MB/s)
zfs send -R rpool@replicate | pv -L 10m | ssh backup-server zfs receive -F tank/backup
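
Long transfers over flaky links can be made resumable: receive with -s so the target saves partial state, and if the connection drops, restart from the saved token instead of from zero. A sketch for a single dataset (names taken from the examples above):

# Receive with -s so the target keeps a resume token if the stream is cut
zfs send rpool/srv/data@replicate | ssh backup-server zfs receive -s tank/backup/data

# After an interruption: fetch the token and resume where it stopped
TOKEN=$(ssh backup-server zfs get -H -o value receive_resume_token tank/backup/data)
zfs send -t "$TOKEN" | ssh backup-server zfs receive -s tank/backup/data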

Replication over WireGuard

This is where kldload shines — two kldloadOS nodes with WireGuard form a private encrypted channel, so replication traffic never crosses the network in the clear.

# Setup: Node A (10.200.0.1) and Node B (10.200.0.2) connected via wg0

# On Node A: send to Node B over the WireGuard tunnel
zfs snapshot -r rpool@replicate
zfs send -R rpool@replicate | ssh 10.200.0.2 zfs receive -F rpool-backup

# Incremental replication (daily cron job)
zfs snapshot -r rpool@daily-$(date +%Y%m%d)
PREV=$(zfs list -t snapshot -H -o name -S creation | grep "rpool@daily-" | sed -n '2p')
zfs send -R -i "$PREV" rpool@daily-$(date +%Y%m%d) | \
  ssh 10.200.0.2 zfs receive -F rpool-backup

Automated replication with syncoid

# Install syncoid (part of sanoid, pre-installed on kldloadOS free)
# syncoid handles incremental tracking automatically

# Replicate a dataset
syncoid rpool/srv/data backup-server:tank/backup/data

# Replicate recursively
syncoid -r rpool backup-server:tank/backup/rpool

# Replicate over WireGuard
syncoid -r rpool 10.200.0.2:rpool-backup

# Dry run (show what would be sent)
syncoid -r --no-sync-snap --dryrun rpool backup-server:tank/backup

# Cron job — every hour
echo '0 * * * * root syncoid -r rpool 10.200.0.2:rpool-backup' >> /etc/crontab

Replication patterns

Pattern 1: Push backup (A → B)

Node A pushes snapshots to Node B.

# On Node A (cron)
syncoid -r rpool nodeB:tank/backup

Pattern 2: Pull backup (B pulls from A)

Node B pulls snapshots from Node A. Better for security — backup server initiates.

# On Node B (cron)
syncoid -r nodeA:rpool tank/backup

Pattern 3: Bidirectional (A ↔︎ B)

Both nodes replicate to each other. Different datasets in each direction.

# On Node A
syncoid rpool/srv/app nodeB:rpool/srv/app-replica

# On Node B
syncoid rpool/srv/db nodeA:rpool/srv/db-replica

Pattern 4: Fan-out (A → B, C, D)

One source replicates to multiple targets.

# On Node A
for target in nodeB nodeC nodeD; do
  syncoid -r rpool/srv/data ${target}:tank/backup/data &
done
wait

Pattern 5: Chain (A → B → C)

A replicates to B, B replicates to C. Geographic distribution.

# On Node A
syncoid -r rpool nodeB:tank/replica

# On Node B
syncoid -r tank/replica nodeC:tank/offsite

Part 7: Two-Node Setup (Complete Example)

Build two kldloadOS nodes, connect them with WireGuard, and replicate data between them.

Step 1: Install both nodes

Boot the kldload ISO on two machines. Install with the Server profile.

  • Node A: hostname node-a, IP 10.100.10.10
  • Node B: hostname node-b, IP 10.100.10.20

Step 2: Set up WireGuard

On Node A:

umask 077
wg genkey | tee /etc/wireguard/private.key | wg pubkey > /etc/wireguard/public.key
cat /etc/wireguard/public.key   # copy this

On Node B:

umask 077
wg genkey | tee /etc/wireguard/private.key | wg pubkey > /etc/wireguard/public.key
cat /etc/wireguard/public.key   # copy this

Node A — /etc/wireguard/wg0.conf:

[Interface]
Address = 10.200.0.1/24
ListenPort = 51820
PrivateKey = <node-a-private-key>

[Peer]
PublicKey = <node-b-public-key>
AllowedIPs = 10.200.0.2/32
Endpoint = 10.100.10.20:51820
PersistentKeepalive = 25

Node B — /etc/wireguard/wg0.conf:

[Interface]
Address = 10.200.0.2/24
ListenPort = 51820
PrivateKey = <node-b-private-key>

[Peer]
PublicKey = <node-a-public-key>
AllowedIPs = 10.200.0.1/32
Endpoint = 10.100.10.10:51820
PersistentKeepalive = 25

Both nodes:

systemctl enable --now wg-quick@wg0
ping 10.200.0.2   # from Node A
ping 10.200.0.1   # from Node B
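
Besides ping, wg show confirms the tunnel actually negotiated a session:

wg show wg0   # "latest handshake" appears once the peers have exchanged traffic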

Step 3: Set up SSH keys

# On Node A
ssh-keygen -t ed25519 -N "" -f ~/.ssh/id_ed25519
ssh-copy-id admin@10.200.0.2

# On Node B
ssh-keygen -t ed25519 -N "" -f ~/.ssh/id_ed25519
ssh-copy-id admin@10.200.0.1
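
Quick check that key-based login works in both directions before wiring up replication:

# From Node A
ssh admin@10.200.0.2 hostname   # should print node-b with no password prompt

# From Node B
ssh admin@10.200.0.1 hostname   # should print node-a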

Step 4: Create application datasets

On Node A:

zfs create -o mountpoint=/srv/app rpool/srv/app
zfs create -o mountpoint=/srv/db -o recordsize=8k rpool/srv/db
echo "production data" > /srv/app/config.txt

Step 5: Initial replication

# On Node A — full send to Node B over WireGuard
zfs snapshot -r rpool/srv@initial
# -u keeps the replica unmounted: the -R stream carries the /srv/app and /srv/db mountpoints,
# which would otherwise mount the replica over those paths on Node B
zfs send -R rpool/srv@initial | ssh 10.200.0.2 zfs receive -u -F rpool/srv-replica

Mount and verify on Node B:

zfs list -r rpool/srv-replica
zfs set mountpoint=/srv-replica/app rpool/srv-replica/app
zfs set mountpoint=/srv-replica/db rpool/srv-replica/db
zfs mount -a
cat /srv-replica/app/config.txt   # should show "production data"

Step 6: Incremental replication

# On Node A — make changes
echo "updated config" > /srv/app/config.txt
echo "new data" > /srv/db/records.csv

# Snapshot and send incremental
zfs snapshot -r rpool/srv@update1
zfs send -R -i rpool/srv@initial rpool/srv@update1 | \
  ssh 10.200.0.2 zfs receive -F rpool/srv-replica

Verify on Node B:

cat /srv-replica/app/config.txt   # should show "updated config"

Step 7: Automate with syncoid

# On Node A — set up hourly replication
cat > /etc/cron.d/zfs-replicate << 'EOF'
0 * * * * root syncoid -r rpool/srv 10.200.0.2:rpool/srv-replica 2>&1 | logger -t zfs-replicate
EOF

Step 8: Failover

If Node A dies, Node B has the replica:

# On Node B
zfs set mountpoint=/srv/app rpool/srv-replica/app
zfs set mountpoint=/srv/db rpool/srv-replica/db

# Node B is now serving production data
# When Node A recovers, reverse the replication direction
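
Reversing direction is the same send/receive running the other way. A failback sketch, assuming Node A's copy is now stale and can be overwritten:

# On Node B — push the replica's current state back to Node A (this overwrites Node A's copy)
zfs snapshot -r rpool/srv-replica@failback
zfs send -R rpool/srv-replica@failback | ssh 10.200.0.1 zfs receive -u -F rpool/srv

# On Node A — restore the original mountpoints, then resume normal replication A → B
zfs set mountpoint=/srv/app rpool/srv/app
zfs set mountpoint=/srv/db rpool/srv/db
zfs mount -a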

Part 8: Monitoring

# Pool health (add to monitoring)
zpool status -x   # only shows pools with problems

# Space usage
zfs list -o name,used,avail,refer,compressratio

# Snapshot space
zfs get usedbysnapshots rpool

# ARC stats
arc_summary   # if available
cat /proc/spl/kstat/zfs/arcstats | grep -E "^hits|^misses|^size|^c_max"

# ARC hit rate calculation
awk '/^hits/{h=$3} /^misses/{m=$3} END{printf "ARC hit rate: %.1f%%\n", h/(h+m)*100}' /proc/spl/kstat/zfs/arcstats

# I/O latency (with eBPF)
zfsslower 1        # operations slower than 1ms
biolatency         # block device latency histogram

# Prometheus node_exporter ZFS metrics
curl -s localhost:9100/metrics | grep zfs
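
For basic alerting without a full monitoring stack, a tiny cron check against zpool status -x works — it prints "all pools are healthy" when nothing is wrong. A sketch; point the failure branch at whatever alerting you already have:

cat > /usr/local/bin/zfs-health-check << 'EOF'
#!/bin/sh
# Log an error when any pool is degraded, faulted, or otherwise unhealthy
status=$(zpool status -x)
[ "$status" = "all pools are healthy" ] || echo "$status" | logger -t zfs-health -p daemon.err
EOF
chmod +x /usr/local/bin/zfs-health-check

# Check every 10 minutes
echo '*/10 * * * * root /usr/local/bin/zfs-health-check' > /etc/cron.d/zfs-health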

Quick Reference

I want to…               Command
Create a pool            zpool create -o ashift=12 -O compression=lz4 rpool mirror /dev/sda /dev/sdb
Create a dataset         zfs create -o mountpoint=/srv/app rpool/srv/app
Snapshot everything      zfs snapshot -r rpool@$(date +%Y%m%d-%H%M%S)
List snapshots           zfs list -t snapshot -o name,used,creation -S creation
Rollback                 zfs rollback rpool/srv/app@before-change
Clone                    zfs snapshot rpool/x@src && zfs clone rpool/x@src rpool/x-clone
Replicate to remote      zfs send -R rpool@snap | ssh remote zfs receive -F tank/backup
Incremental replicate    zfs send -R -i @snap1 rpool@snap2 | ssh remote zfs receive -F tank/backup
Automated replication    syncoid -r rpool remote:tank/backup
Pool health              zpool status rpool
Scrub                    zpool scrub rpool
Check compression        zfs get compressratio rpool
Boot environment         zfs snapshot rpool/ROOT/default@safe && zpool set bootfs=rpool/ROOT/default rpool