Appliance Recipes

NAS Server — The Best File Server You’ll Ever Build

Every NAS appliance — Synology, QNAP, TrueNAS, Unraid — is just ZFS (or a worse filesystem) with a web UI bolted on. They lock you into their hardware, their app ecosystem, their update schedule. When the vendor drops your model from support, your “appliance” becomes e-waste with your data still on it.

kldload gives you ZFS on root with the distro of your choice. You build the NAS you actually want — not the one someone decided to sell you. Real Samba with every feature. Real NFS. Real Docker. Real snapshots that show up as Windows Previous Versions. Real replication to a remote site. No artificial limits. No subscription. No “please buy the Pro model for more than 4 bays.”

This is what ZFS was made for.

This recipe builds a NAS that does everything a Synology or TrueNAS does — and several things they can't. Windows Previous Versions from ZFS snapshots (Samba shadow_copy2). Native Time Machine targets. Per-dataset recordsize tuned to the workload (1M for movies, 128K for documents, 8K for databases). Automatic hourly snapshots with sanoid. Incremental offsite replication with syncoid over WireGuard. Docker apps on ZFS datasets with independent snapshot policies. And you're running Debian (or CentOS, or Ubuntu, or any of 8 distros) — not a vendor-locked OS that stops getting updates when they release a new model. For the deep dive on ZFS pool design, recordsize tuning, send/receive, and ARC management, see the ZFS Masterclass.

Step 0: Install kldload

cat > /tmp/answers.env << 'EOF'
KLDLOAD_DISTRO=debian
KLDLOAD_DISK=/dev/sda
KLDLOAD_HOSTNAME=nas
KLDLOAD_USERNAME=admin
KLDLOAD_PASSWORD=changeme
KLDLOAD_PROFILE=server
KLDLOAD_NET_METHOD=static
KLDLOAD_NET_IP=192.168.1.10/24
KLDLOAD_NET_GW=192.168.1.1
KLDLOAD_NET_DNS=192.168.1.1
EOF
kldload-install-target --config /tmp/answers.env

Use static IP for a NAS — clients need a stable address for SMB/NFS mounts. The boot disk gets ZFS on root automatically. Your data pool uses separate disks.

Pool design is the most consequential decision for a NAS because it is hard to change later: you can add vdevs or attach disks to a mirror, but you cannot change a RAIDZ vdev's parity level or remove it without destroying and recreating the pool. Mirrors give you the best random I/O (important for Samba/NFS serving many small files). RAIDZ gives you the best capacity efficiency (important for media storage). Striped mirrors give you both but cost more drives. The special vdev is the secret weapon for HDD pools: an NVMe pair that handles metadata and small files at SSD speed while the HDDs handle bulk data. Read the ZFS Masterclass pool design section if you want the full analysis.

1. ZFS Pool Design

Why kldload uses a separate data pool

kldload installs your OS on rpool — a single-disk ZFS pool on your boot drive. Your NAS data goes on a separate pool called tank across your data disks. This separation is deliberate:

• The OS is disposable — if it breaks, export tank, reinstall kldload in 2 minutes, import tank. Done.

• The data pool has zero dependency on the OS. It's pure ZFS, portable between any system.

• You can use non-ZFS storage for data if the workload demands it — fast hardware RAID for a caching appliance, ZFS on root for recovery. Right tool for the job.

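The pool topology you choose depends on how many disks you have and what you value: speed, capacity, or redundancy. Every topology below uses ashift=12, which tells ZFS to align writes to 4K physical sectors. This is correct for practically every modern HDD and SSD. Never omit it: without it ZFS picks ashift from the sector size the drive reports, many 4K drives still report 512-byte sectors for compatibility, and a pool created with ashift=9 on such a drive pays a read-modify-write penalty on every small write. ashift is fixed when a vdev is created and cannot be changed afterwards.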
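As a rough sanity check, ashift is the base-2 logarithm of the sector size in bytes, so it can be compared with what a drive reports. The sketch below uses a hypothetical helper name, `ashift_for_sector_size`, which is not part of ZFS or kldload; it only does the arithmetic.

```shell
#!/bin/sh
# Hypothetical helper: convert a sector size in bytes to the matching ashift.
# ashift is log2(sector size), so 4096 bytes corresponds to 12 and 512 to 9.
ashift_for_sector_size() {
    size=$1
    shift_val=0
    while [ "$size" -gt 1 ]; do
        size=$((size / 2))
        shift_val=$((shift_val + 1))
    done
    echo "$shift_val"
}

# Possible use on a real system (assumes lsblk from util-linux is installed):
#   ashift_for_sector_size "$(lsblk -dno PHY-SEC /dev/sdb | tr -d ' ')"
```

If a drive reports 512 here but is known to be a 4K drive, that is the case where the explicit ashift=12 matters.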

Single disk — testing or budget

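# One disk, no redundancy. Fine for scratch space or testing.
# /dev/sdX names can change between boots; for a permanent pool,
# prefer the stable names under /dev/disk/by-id/.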
zpool create -o ashift=12 -O compression=lz4 -O atime=off \
    -O xattr=sa -O dnodesize=auto \
    tank /dev/sdb

Mirror — 2+ disks, best for small NAS (2–4 bay)

# Two disks, mirrored. Lose one disk, keep running.
# 50% usable capacity, best random I/O.
zpool create -o ashift=12 -O compression=lz4 -O atime=off \
    -O xattr=sa -O dnodesize=auto \
    tank mirror /dev/sdb /dev/sdc

3-way mirror — maximum safety

# Three disks, triple mirror. Survive two simultaneous failures.
zpool create -o ashift=12 -O compression=lz4 -O atime=off \
    -O xattr=sa -O dnodesize=auto \
    tank mirror /dev/sdb /dev/sdc /dev/sdd

RAIDZ1 — 3+ disks, one parity (like RAID5)

# Three disks, one parity. Lose one disk, keep running.
# Usable capacity: (N-1) * smallest disk.
# Good for media storage where sequential throughput matters more than IOPS.
zpool create -o ashift=12 -O compression=lz4 -O atime=off \
    -O xattr=sa -O dnodesize=auto \
    tank raidz1 /dev/sdb /dev/sdc /dev/sdd

RAIDZ2 — 4+ disks, two parity (like RAID6)

# Six disks, two parity. The recommendation for large arrays.
# Survives two simultaneous disk failures.
# Usable capacity: (N-2) * smallest disk.
zpool create -o ashift=12 -O compression=lz4 -O atime=off \
    -O xattr=sa -O dnodesize=auto \
    tank raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg

Striped mirrors — best IOPS for mixed workloads

# Four disks as two mirror pairs, striped. Best random I/O.
# Same redundancy as RAID10 — lose one disk per mirror.
# 50% usable capacity; up to 4x the read IOPS and roughly 2x the write IOPS of a single disk.
zpool create -o ashift=12 -O compression=lz4 -O atime=off \
    -O xattr=sa -O dnodesize=auto \
    tank mirror /dev/sdb /dev/sdc mirror /dev/sdd /dev/sde
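The capacity rules quoted in the comments above can be turned into a small calculator. This is a minimal sketch with hypothetical function names (`usable_disks`, `usable_tb`); it assumes equal-sized disks and two-way mirror pairs, and it reports raw capacity before ZFS metadata overhead and free-space headroom.

```shell
#!/bin/sh
# usable_disks LAYOUT N: how many disks' worth of space is usable.
usable_disks() {
    layout=$1
    n=$2
    case "$layout" in
        mirror)          echo 1 ;;             # every disk holds the same data
        raidz1)          echo $((n - 1)) ;;    # one disk of parity
        raidz2)          echo $((n - 2)) ;;    # two disks of parity
        striped-mirrors) echo $((n / 2)) ;;    # two-way mirror pairs
        *) echo "unknown layout: $layout" >&2; return 1 ;;
    esac
}

# usable_tb LAYOUT N DISK_TB: usable capacity in TB for N disks of DISK_TB each.
usable_tb() {
    d=$(usable_disks "$1" "$2") || return 1
    echo $((d * $3))
}

# Example: the six-disk RAIDZ2 layout above, with hypothetical 8 TB drives.
usable_tb raidz2 6 8
```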

Special vdev — NVMe metadata acceleration for HDD pools

# Add a mirrored NVMe special vdev to an HDD pool.
# Metadata and small blocks land on NVMe — directory listings,
# file lookups, and small-file I/O become SSD-fast.
# The HDD pool handles bulk data at full sequential speed.
zpool add tank special mirror /dev/nvme0n1 /dev/nvme1n1

# Set the small-block threshold (blocks <= this size go to special vdev)
zfs set special_small_blocks=64K tank

Always mirror your special vdev. If the special vdev dies and it was not mirrored, you lose the entire pool. It contains metadata for every file.

Dataset layout

Datasets are free — use one per workload so you can tune recordsize, compression, quota, and snapshot policy independently:

# Parent dataset (not mounted directly)
zfs create -o canmount=off tank/nas

# Media — large sequential files, big recordsize, compression helps subtitles/NFOs
zfs create -o mountpoint=/srv/nas/media -o recordsize=1M -o compression=lz4 tank/nas/media
zfs create -o mountpoint=/srv/nas/media/movies tank/nas/media/movies
zfs create -o mountpoint=/srv/nas/media/tv tank/nas/media/tv
zfs create -o mountpoint=/srv/nas/media/music tank/nas/media/music

# Photos — already compressed JPEGs/HEICs, compression wastes CPU
zfs create -o mountpoint=/srv/nas/photos -o recordsize=1M -o compression=off tank/nas/photos

# Documents — small files, high compressibility
zfs create -o mountpoint=/srv/nas/documents -o recordsize=128K -o compression=lz4 tank/nas/documents

# Backups — Time Machine, PC backups
zfs create -o canmount=off tank/nas/backups
zfs create -o mountpoint=/srv/nas/backups/timemachine -o recordsize=1M \
    -o compression=lz4 tank/nas/backups/timemachine
zfs create -o mountpoint=/srv/nas/backups/pc-backups -o recordsize=1M \
    -o compression=lz4 tank/nas/backups/pc-backups

# Docker application data
zfs create -o mountpoint=/srv/nas/docker -o recordsize=128K -o compression=lz4 tank/nas/docker

# Scratch space — temp files, no snapshots, sync disabled for speed
zfs create -o mountpoint=/srv/nas/scratch -o recordsize=1M \
    -o compression=off -o sync=disabled tank/nas/scratch

tank/nas/
├── media/          recordsize=1M, compression=lz4
│   ├── movies/
│   ├── tv/
│   └── music/
├── photos/         recordsize=1M, compression=off
├── documents/      recordsize=128K, compression=lz4
├── backups/        recordsize=1M, compression=lz4
│   ├── timemachine/
│   └── pc-backups/
├── docker/         recordsize=128K, compression=lz4
└── scratch/        recordsize=1M, sync=disabled, compression=off

The dataset layout above is the reason this NAS beats a commercial appliance. Each dataset has its own recordsize and compression setting (1M records for movies favor streaming throughput, 128K for documents is a balanced default, and compression is off for photos so no CPU is wasted on JPEGs). Each dataset snapshots independently (roll back documents without touching media), and each can carry its own quota. The scratch dataset has sync=disabled for maximum speed at the cost of crash safety: perfect for temp files, terrible for anything you care about. ext4 and XFS have no equivalent, and btrfs subvolumes cover only part of it. This granularity is the whole point of ZFS.

2. SMB/CIFS — Windows & Mac File Sharing

SMB is the file sharing protocol that every device on your network already speaks, and Samba is its implementation on Linux. Windows, macOS, Linux, smart TVs, game consoles, phones: they all talk SMB. This configuration gives you everything a commercial NAS does, plus features they charge extra for: recycle bins, shadow copies (Windows Previous Versions from ZFS snapshots), and native Time Machine support.

# Install Samba
apt install -y samba samba-vfs-modules

# Create NAS users (match your household)
useradd -M -s /usr/sbin/nologin nas-alice
useradd -M -s /usr/sbin/nologin nas-bob
useradd -M -s /usr/sbin/nologin nas-kid

# Set Samba passwords (these are separate from Linux passwords)
smbpasswd -a nas-alice
smbpasswd -a nas-bob
smbpasswd -a nas-kid

# Create per-user home directories
# (named after the login, because the [homes] share path uses %U)
mkdir -p /srv/nas/homes/{nas-alice,nas-bob,nas-kid}
chown nas-alice: /srv/nas/homes/nas-alice
chown nas-bob: /srv/nas/homes/nas-bob
chown nas-kid: /srv/nas/homes/nas-kid

Complete smb.conf

cat > /etc/samba/smb.conf << 'SMBEOF'
[global]
    workgroup = WORKGROUP
    server string = kldload NAS
    server role = standalone server

    # Security
    map to guest = Bad User
    guest account = nobody

    # Performance: sendfile for reads, async I/O for requests larger than 16 KiB
    use sendfile = yes
    aio read size = 16384
    aio write size = 16384

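    # macOS compatibility
    # (a per-share "vfs objects" line replaces this list, so any share used
    #  from Macs should list fruit and streams_xattr again)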
    vfs objects = fruit streams_xattr
    fruit:metadata = stream
    fruit:model = MacSamba
    fruit:posix_rename = yes
    fruit:veto_appledouble = no
    fruit:nfs_aces = no
    fruit:wipe_intentionally_left_blank_rfork = yes
    fruit:delete_empty_adfiles = yes

    # Logging
    log file = /var/log/samba/log.%m
    max log size = 1000
    log level = 1

    # Disable printing (this is a NAS, not a print server)
    load printers = no
    printing = bsd
    printcap name = /dev/null
    disable spoolss = yes

# ─── Per-User Home Directories ──────────────────────────────────
[homes]
    comment = Personal Home Directory
    path = /srv/nas/homes/%U
    browseable = no
    writable = yes
    valid users = %U
    create mask = 0600
    directory mask = 0700
    vfs objects = recycle
    recycle:repository = .recycle/%U
    recycle:keeptree = yes
    recycle:versions = yes
    recycle:touch = yes

# ─── Public Share (Guest OK) ────────────────────────────────────
[public]
    comment = Public Share — anyone on the network
    path = /srv/nas/scratch
    browseable = yes
    writable = yes
    guest ok = yes
    create mask = 0666
    directory mask = 0777
    force user = nobody
    force group = nogroup

# ─── Documents (Authenticated Users) ────────────────────────────
[documents]
    comment = Shared Documents
    path = /srv/nas/documents
    browseable = yes
    writable = yes
    valid users = nas-alice, nas-bob
    create mask = 0664
    directory mask = 0775
    force group = users
    vfs objects = recycle shadow_copy2
    recycle:repository = .recycle/%U
    recycle:keeptree = yes
    recycle:versions = yes
    shadow:snapdir = .zfs/snapshot
    shadow:sort = desc
    shadow:format = autosnap_%Y-%m-%d_%H:%M:%S_hourly
    shadow:localtime = no

# ─── Media (Read-Only for Streaming Devices) ────────────────────
[media]
    comment = Media Library — read-only
    path = /srv/nas/media
    browseable = yes
    read only = yes
    guest ok = yes
    create mask = 0644
    directory mask = 0755

# ─── Media Admin (Read-Write for Managers) ──────────────────────
[media-admin]
    comment = Media Library — admin access
    path = /srv/nas/media
    browseable = yes
    writable = yes
    valid users = nas-alice, nas-bob
    create mask = 0664
    directory mask = 0775
    force group = users
    vfs objects = recycle
    recycle:repository = .recycle/%U
    recycle:keeptree = yes
    recycle:versions = yes

# ─── Photos ─────────────────────────────────────────────────────
[photos]
    comment = Family Photos
    path = /srv/nas/photos
    browseable = yes
    writable = yes
    valid users = nas-alice, nas-bob
    create mask = 0664
    directory mask = 0775
    force group = users
    vfs objects = recycle shadow_copy2
    recycle:repository = .recycle/%U
    recycle:keeptree = yes
    shadow:snapdir = .zfs/snapshot
    shadow:sort = desc
    shadow:format = autosnap_%Y-%m-%d_%H:%M:%S_hourly
    shadow:localtime = no

# ─── Time Machine Backup Target ─────────────────────────────────
[timemachine]
    comment = macOS Time Machine Backup
    path = /srv/nas/backups/timemachine
    browseable = yes
    writable = yes
    valid users = nas-alice, nas-bob
    vfs objects = fruit streams_xattr
    fruit:time machine = yes
    fruit:time machine max size = 1T
    create mask = 0600
    directory mask = 0700

# ─── PC Backups ─────────────────────────────────────────────────
[backups]
    comment = PC Backup Target
    path = /srv/nas/backups/pc-backups
    browseable = yes
    writable = yes
    valid users = nas-alice, nas-bob
    create mask = 0600
    directory mask = 0700
    vfs objects = shadow_copy2
    shadow:snapdir = .zfs/snapshot
    shadow:sort = desc
    shadow:format = autosnap_%Y-%m-%d_%H:%M:%S_daily
    shadow:localtime = no
SMBEOF

# Test the config
testparm -s

# Enable and start Samba
systemctl enable --now smbd nmbd

The shadow_copy2 VFS module is the feature that makes users love this NAS. It exposes ZFS snapshots as "Previous Versions" in Windows Explorer. Right-click a file → Properties → Previous Versions, and you see every hourly snapshot. Restore a deleted file? Click it. Compare versions? Click it. No backup software, no IT ticket, no "ask the admin to restore from backup." Users self-serve their own file recovery directly from ZFS snapshots, which takes care of most "I accidentally deleted a file" requests. It does not replace offsite backups (see the replication section), because snapshots live on the same pool as the data. The shadow:format line must match your sanoid snapshot naming pattern.

3. NFS — Linux/Unix Clients

If you have Linux machines on the network (other servers, a Proxmox cluster, dev workstations), NFS is the native choice. Lower overhead than SMB, no Samba auth to deal with, and NFSv4 gives you a single unified namespace.

# Install NFS server
apt install -y nfs-kernel-server

# Export NAS datasets
cat > /etc/exports << 'NFSEOF'
# NFSv4 root (required for NFSv4 unified namespace)
/srv/nas        192.168.1.0/24(rw,fsid=0,no_subtree_check,crossmnt)

# Per-dataset exports
/srv/nas/media      192.168.1.0/24(ro,no_subtree_check,all_squash,anonuid=65534,anongid=65534)
/srv/nas/documents  192.168.1.0/24(rw,no_subtree_check,no_root_squash)
/srv/nas/photos     192.168.1.0/24(rw,no_subtree_check,no_root_squash)
/srv/nas/docker     192.168.1.0/24(rw,no_subtree_check,no_root_squash)
/srv/nas/backups    192.168.1.0/24(rw,no_subtree_check,no_root_squash)
NFSEOF

exportfs -ra
systemctl enable --now nfs-server

Client-side mounts

# On a Linux client — add to /etc/fstab
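# NFSv4 mounts (paths are relative to the NFSv4 root)
# Read-write mounts use "hard": a soft mount can silently lose writes on a timeout.
# The old "intr" option is ignored by current kernels, so it is left out.
nas:/media      /mnt/nas/media      nfs4  ro,soft,timeo=30   0 0
nas:/documents  /mnt/nas/documents  nfs4  rw,hard            0 0
nas:/photos     /mnt/nas/photos     nfs4  rw,hard            0 0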

# Or use autofs for on-demand mounting
apt install -y autofs

cat > /etc/auto.master.d/nas.autofs << 'EOF'
/mnt/nas  /etc/auto.nas  --timeout=300
EOF

cat > /etc/auto.nas << 'EOF'
media      -fstype=nfs4,ro,soft  nas:/media
documents  -fstype=nfs4,rw,hard  nas:/documents
photos     -fstype=nfs4,rw,hard  nas:/photos
EOF

systemctl enable --now autofs

Kerberos for secured environments

# For environments with Active Directory or FreeIPA:
# 1. Join the NAS to the Kerberos realm
# 2. Export with sec=krb5p (encrypted + integrity)
#
# /srv/nas/documents  192.168.1.0/24(rw,sec=krb5p,no_subtree_check)
#
# This encrypts NFS traffic on the wire and authenticates users
# via Kerberos tickets — no UID mapping hacks needed.

4. Windows Previous Versions (Shadow Copy)

This is the single most valuable feature of a ZFS NAS, and the one that nobody talks about. Every ZFS snapshot whose name matches shadow:format appears as a “Previous Version” in Windows Explorer. No backup agent. No cloud sync. No third-party software. The client side is built into Windows, and once shadow_copy2 is configured it reads ZFS snapshots directly.

How it works:

ZFS takes snapshots on a schedule (hourly, daily — you configure it)
Samba’s vfs_shadow_copy2 module exposes those snapshots to Windows clients
In Windows Explorer, right-click any file or folder → Properties → Previous Versions
Windows shows every snapshot as a point-in-time version
User clicks “Restore” — file is restored instantly from the snapshot

The user workflow:

Alice accidentally deletes Q4-Report.xlsx from \\nas\documents
She right-clicks the documents folder in Windows Explorer
Clicks Properties → Previous Versions tab
Sees a list of snapshots: “Today 3:00 PM”, “Today 2:00 PM”, “Today 1:00 PM”...
Opens the 2:00 PM version, finds Q4-Report.xlsx, drags it back to the share
Done. No IT ticket. No restore from tape. No “did you check the recycle bin?”

The Samba config in the shares above already includes shadow_copy2. The key settings:

# These lines in the share definition enable Previous Versions:
vfs objects = shadow_copy2
shadow:snapdir = .zfs/snapshot          # Where ZFS keeps snapshots
shadow:sort = desc                      # Newest first
shadow:format = autosnap_%Y-%m-%d_%H:%M:%S_hourly   # Must match your snapshot naming
shadow:localtime = no                   # Snapshot names are read as UTC; use yes if they are in local time

The shadow:format must match the snapshot names created by sanoid (configured below). Sanoid uses the format autosnap_YYYY-MM-DD_HH:MM:SS_hourly by default, which is exactly what we configure here.
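Because a snapshot whose name does not match shadow:format is simply not offered to Windows, it can help to check names against the pattern. The sketch below is a hypothetical helper (`matches_hourly_format` is not a Samba or sanoid command) that tests a name against the hourly format used in this recipe with a plain shell pattern.

```shell
#!/bin/sh
# Hypothetical check: does a snapshot name match
# autosnap_%Y-%m-%d_%H:%M:%S_hourly as used in shadow:format above?
matches_hourly_format() {
    case "$1" in
        autosnap_[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]_[0-9][0-9]:[0-9][0-9]:[0-9][0-9]_hourly)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Possible use on the NAS (assumes the documents dataset from this recipe):
#   for snap in /srv/nas/documents/.zfs/snapshot/*; do
#       name=${snap##*/}
#       matches_hourly_format "$name" || echo "not shown as Previous Version: $name"
#   done
```

Daily and monthly sanoid snapshots will be reported by this loop, which is expected: the share only lists the suffix named in its shadow:format line.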

Combined with the recycle bin: Even if a user empties the Windows Recycle Bin and the Samba recycle bin, the file still exists in every ZFS snapshot taken before the deletion. Recovery is always possible until the snapshot is pruned.

5. Time Machine over SMB

macOS Time Machine backs up to the NAS natively over SMB. No AFP needed (Apple deprecated it). The vfs_fruit module handles all the Apple-specific quirks — resource forks, Finder metadata, and the sparse bundle format that Time Machine uses.

# The [timemachine] share above already configures everything.
# Key settings:
#   fruit:time machine = yes          — advertise to macOS as TM target
#   fruit:time machine max size = 1T  (caps the disk size reported to Time Machine so backups cannot fill the pool)

# Per-user Time Machine datasets with quotas
zfs create -o mountpoint=/srv/nas/backups/timemachine/alice \
    -o quota=500G tank/nas/backups/timemachine/alice
zfs create -o mountpoint=/srv/nas/backups/timemachine/bob \
    -o quota=500G tank/nas/backups/timemachine/bob

chown nas-alice: /srv/nas/backups/timemachine/alice
chown nas-bob: /srv/nas/backups/timemachine/bob

On the Mac: Open System Settings → General → Time Machine → Add Backup Disk. The NAS appears automatically via mDNS. Select it, enter your Samba credentials, and Time Machine starts backing up. Each Mac gets its own sparse bundle inside its user directory, and the ZFS quota prevents any single machine from consuming all available storage.

# Install Avahi for mDNS/Bonjour advertisement (macOS auto-discovery)
apt install -y avahi-daemon

cat > /etc/avahi/services/smb.service << 'AVAHI'
<?xml version="1.0" standalone='no'?>
<!DOCTYPE service-group SYSTEM "avahi-service.dtd">
<service-group>
  <name replace-wildcards="yes">%h</name>
  <service>
    <type>_smb._tcp</type>
    <port>445</port>
  </service>
  <service>
    <type>_device-info._tcp</type>
    <port>9</port>
    <txt-record>model=TimeCapsule8,119</txt-record>
  </service>
  <service>
    <type>_adisk._tcp</type>
    <port>9</port>
    <txt-record>dk0=adVN=timemachine,adVF=0x82</txt-record>
    <txt-record>sys=adVF=0x100</txt-record>
  </service>
</service-group>
AVAHI

systemctl enable --now avahi-daemon

6. Snapshot Schedule

Sanoid manages snapshot creation and pruning automatically. Different datasets get different policies — documents need aggressive retention (you might not notice a deleted file for weeks), media needs minimal retention (large files, easy to re-download), and photos get the most aggressive retention of all (irreplaceable).

# Sanoid is included in kldload. Configure it:
cat > /etc/sanoid/sanoid.conf << 'SANOID'
[tank/nas/documents]
    use_template = critical
    recursive = yes

[tank/nas/photos]
    use_template = irreplaceable
    recursive = yes

[tank/nas/media]
    use_template = bulk
    recursive = yes

[tank/nas/backups]
    use_template = backups
    recursive = yes

[tank/nas/docker]
    use_template = critical
    recursive = yes

[tank/nas/scratch]
    autosnap = no
    autoprune = no

# ─── Templates ──────────────────────────────────────────────────

[template_critical]
    frequently = 4
    hourly = 24
    daily = 30
    monthly = 12
    yearly = 2
    autosnap = yes
    autoprune = yes

[template_irreplaceable]
    hourly = 24
    daily = 90
    monthly = 24
    yearly = 5
    autosnap = yes
    autoprune = yes

[template_bulk]
    hourly = 0
    daily = 7
    monthly = 3
    yearly = 0
    autosnap = yes
    autoprune = yes

[template_backups]
    hourly = 0
    daily = 30
    monthly = 12
    yearly = 2
    autosnap = yes
    autoprune = yes
SANOID

# Sanoid runs via systemd timer (already enabled on kldload)
systemctl enable --now sanoid.timer

What this gives you:

Documents: 4 frequent (15-min) + 24 hourly + 30 daily + 12 monthly + 2 yearly snapshots
Photos: 24 hourly + 90 daily + 24 monthly + 5 yearly — because photos are irreplaceable
Media: 7 daily + 3 monthly — light retention, easy to re-download
Backups: 30 daily + 12 monthly + 2 yearly — deep history for Time Machine data
Scratch: No snapshots — it is temporary by definition
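To get a feel for how many snapshots each policy keeps at steady state, the retention counts can simply be added up. This is a minimal sketch; `max_snapshots` is a hypothetical helper, and the result is per dataset, so with recursive = yes every child dataset keeps its own set.

```shell
#!/bin/sh
# Hypothetical helper: sum the retention counts of one sanoid template.
max_snapshots() {
    total=0
    for count in "$@"; do
        total=$((total + count))
    done
    echo "$total"
}

# critical template: frequently, hourly, daily, monthly, yearly
max_snapshots 4 24 30 12 2
# irreplaceable template: hourly, daily, monthly, yearly
max_snapshots 24 90 24 5
```

Snapshots only consume space for blocks that have changed since they were taken, so a large count is not by itself a capacity problem.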

7. Offsite Replication

The 3-2-1 backup rule: 3 copies of your data, on 2 different media types, with 1 copy offsite. ZFS covers most of it with built-in tools:

Copy 1: Live data on the NAS (tank pool)
Copy 2: ZFS snapshots on the same pool (instant recovery from accidental deletion, but they share the pool's disks, so they do not survive losing the pool)
Copy 3: Replicated to a remote NAS via syncoid over WireGuard (offsite, different location)

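# Set up WireGuard tunnel to the remote NAS
# (the heredoc is unquoted on purpose so the private key is read in;
#  keep the result root-only: chmod 600 /etc/wireguard/wg-backup.conf)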
cat > /etc/wireguard/wg-backup.conf << EOF
[Interface]
Address = 10.88.0.1/30
PrivateKey = $(cat /etc/wireguard/backup-private.key)
ListenPort = 51821

[Peer]
PublicKey = <REMOTE_NAS_PUBLIC_KEY>
AllowedIPs = 10.88.0.2/32
Endpoint = remote-nas.example.com:51821
PersistentKeepalive = 25
EOF

systemctl enable --now wg-quick@wg-backup

# Test connectivity
ping -c 3 10.88.0.2

# Set up SSH key for syncoid (passwordless replication)
ssh-keygen -t ed25519 -f /root/.ssh/syncoid-key -N ""
ssh-copy-id -i /root/.ssh/syncoid-key.pub root@10.88.0.2

# Replicate critical datasets to remote NAS
# syncoid sends only changed blocks — the first sync is a full copy,
# every subsequent sync sends only the delta.
cat > /etc/cron.d/offsite-replication << 'EOF'
# Documents and photos: replicate every hour
0 * * * * root /usr/sbin/syncoid -r --no-sync-snap --sshkey=/root/.ssh/syncoid-key tank/nas/documents 10.88.0.2:backup/nas/documents 2>&1 | logger -t offsite-sync
15 * * * * root /usr/sbin/syncoid -r --no-sync-snap --sshkey=/root/.ssh/syncoid-key tank/nas/photos 10.88.0.2:backup/nas/photos 2>&1 | logger -t offsite-sync

# Media: replicate daily (large files, less critical)
0 3 * * * root /usr/sbin/syncoid -r --no-sync-snap --sshkey=/root/.ssh/syncoid-key tank/nas/media 10.88.0.2:backup/nas/media 2>&1 | logger -t offsite-sync

# Backups (Time Machine etc.): replicate daily
30 3 * * * root /usr/sbin/syncoid -r --no-sync-snap --sshkey=/root/.ssh/syncoid-key tank/nas/backups 10.88.0.2:backup/nas/backups 2>&1 | logger -t offsite-sync
EOF

The WireGuard tunnel encrypts everything in transit. syncoid compresses the replication stream. A 2TB NAS that changes 10GB per day sends only 10GB over the wire — not 2TB.
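How long a sync takes depends mostly on the uplink. The sketch below is a back-of-the-envelope estimate with a hypothetical helper name (`transfer_minutes`); the 20 Mbit/s uplink is an assumed figure, and the calculation ignores stream compression and protocol overhead.

```shell
#!/bin/sh
# transfer_minutes GB MBPS: whole minutes needed to send GB gigabytes
# (decimal, 1 GB = 8000 megabits) over a link of MBPS megabits per second.
transfer_minutes() {
    gb=$1
    mbps=$2
    seconds=$((gb * 8000 / mbps))
    echo $((seconds / 60))
}

# Daily 10 GB delta versus the initial 2 TB full copy, on an assumed 20 Mbit/s uplink.
transfer_minutes 10 20
transfer_minutes 2000 20
```

The difference between the two numbers (about an hour versus more than nine days) is why the first full sync is often seeded over a local network before the remote box is moved offsite.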

8. Monitoring & Alerts

ZFS pool health monitoring

cat > /usr/local/bin/nas-healthcheck << 'SCRIPT'
#!/bin/bash
# NAS health check: run via cron, email on problems
# (needs a working mail command, e.g. from the bsd-mailx or mailutils package)

MAILTO="admin@example.com"
HOSTNAME=$(hostname)
PROBLEMS=""

# Check pool health
POOL_HEALTH=$(zpool status -x)
if [[ "$POOL_HEALTH" != "all pools are healthy" ]]; then
    PROBLEMS+="POOL DEGRADED:\n$POOL_HEALTH\n\n"
fi

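# Check for disk errors: vdev rows that are not ONLINE or have non-zero READ/WRITE/CKSUM counters
ERRORS=$(zpool status tank | awk 'NF >= 5 && $2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ && ($2 != "ONLINE" || $3 != "0" || $4 != "0" || $5 != "0")')
if [[ -n "$ERRORS" ]]; then
    PROBLEMS+="DISK ERRORS:\n$ERRORS\n\n"
fi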

# Check free space (warn at 80%, critical at 90%)
CAPACITY=$(zpool list -Hp -o capacity tank | tr -d '%')
if (( CAPACITY > 90 )); then
    PROBLEMS+="CRITICAL: Pool is ${CAPACITY}% full!\n\n"
elif (( CAPACITY > 80 )); then
    PROBLEMS+="WARNING: Pool is ${CAPACITY}% full.\n\n"
fi

# Check SMART health on all drives
for disk in $(lsblk -dno NAME | grep -E '^sd|^nvme'); do
    SMART=$(smartctl -H /dev/$disk 2>/dev/null | grep -i "result\|health")
    if echo "$SMART" | grep -qi "failed\|fail"; then
        PROBLEMS+="SMART FAILURE on /dev/$disk:\n$SMART\n\n"
    fi
done

# Check if scrub is overdue (last scrub > 30 days ago)
# \s+ tolerates the double space zpool prints before single-digit days ("Mar  3")
LAST_SCRUB=$(zpool status tank | grep "scan:" | grep -oP '\w+\s+\w+\s+\d+\s+\d+:\d+:\d+\s+\d+')
if [[ -n "$LAST_SCRUB" ]] && SCRUB_EPOCH=$(date -d "$LAST_SCRUB" +%s 2>/dev/null); then
    NOW=$(date +%s)
    DAYS_SINCE=$(( (NOW - SCRUB_EPOCH) / 86400 ))
    if (( DAYS_SINCE > 30 )); then
        PROBLEMS+="SCRUB OVERDUE: Last scrub was $DAYS_SINCE days ago.\n\n"
    fi
fi

# Send alert if problems found
if [[ -n "$PROBLEMS" ]]; then
    echo -e "NAS Health Alert from $HOSTNAME\n\n$PROBLEMS\nFull pool status:\n$(zpool status)" | \
        mail -s "[NAS ALERT] $HOSTNAME — issues detected" "$MAILTO"
fi
SCRIPT
chmod +x /usr/local/bin/nas-healthcheck

# Run health check every 30 minutes
cat > /etc/cron.d/nas-healthcheck << 'EOF'
*/30 * * * * root /usr/local/bin/nas-healthcheck
EOF

# Weekly scrub (Sunday at 2 AM)
cat > /etc/cron.d/zfs-scrub << 'EOF'
0 2 * * 0 root zpool scrub tank 2>&1 | logger -t zfs-scrub
EOF
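The error counters in zpool status are columns in the device table (NAME, STATE, READ, WRITE, CKSUM), not labels on each row, so a check has to look at the fields. The sketch below shows one way to filter that table with awk; `error_rows` is a hypothetical helper name, and it reads status text on standard input so it can be tried against saved output.

```shell
#!/bin/sh
# error_rows: from `zpool status` text on stdin, print device rows whose state
# is not ONLINE or whose READ/WRITE/CKSUM counters are not all "0".
error_rows() {
    awk 'NF >= 5 && $2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ {
             if ($2 != "ONLINE" || $3 != "0" || $4 != "0" || $5 != "0")
                 print $1, $2, $3, $4, $5
         }'
}

# Possible use on the NAS:
#   zpool status tank | error_rows
```

The header row and the "state:" summary line are skipped because they do not have a known state in the second of at least five columns.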

SMART monitoring

# Install and configure smartmontools
apt install -y smartmontools

# Enable SMART on all drives; short self-test daily at 02:00, long self-test every Saturday at 03:00
cat > /etc/smartd.conf << 'SMARTD'
# Monitor all drives, email on problems
DEVICESCAN -a -o on -S on -n standby,q \
    -s (S/../.././02|L/../../6/03) \
    -W 4,45,55 \
    -m admin@example.com \
    -M exec /usr/share/smartmontools/smartd_warning.sh
SMARTD

systemctl enable --now smartd

Grafana dashboard

# Install node_exporter for Prometheus metrics
apt install -y prometheus-node-exporter

# node_exporter's zfs collector exposes ARC statistics automatically:
# - node_zfs_arc_hits, node_zfs_arc_misses (ARC hit rate)
# - node_filesystem_avail_bytes (per-dataset free space)
# - node_disk_io_time_seconds_total (per-disk busy time)
# Pool-level metrics such as zfs_pool_health, zfs_pool_allocated_bytes and
# zfs_pool_size_bytes come from a separate ZFS exporter, not from node_exporter.

# If you run a Grafana instance, import dashboard ID 11209
# ("ZFS Pool Metrics") or build a custom one with these queries:
#
# ARC hit rate:
#   rate(node_zfs_arc_hits[5m]) / (rate(node_zfs_arc_hits[5m]) + rate(node_zfs_arc_misses[5m])) * 100
#
# Pool usage (needs the separate ZFS exporter):
#   zfs_pool_allocated_bytes / zfs_pool_size_bytes * 100
#
# Disk utilization (fraction of time the disk was busy):
#   rate(node_disk_io_time_seconds_total[5m])
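Without Prometheus, the same hit-rate arithmetic can be done by hand from two counters. This is a minimal sketch with a hypothetical helper name (`arc_hit_rate`); note that raw counters give the lifetime ratio since the ZFS module was loaded, whereas the PromQL query above looks at a 5-minute window.

```shell
#!/bin/sh
# arc_hit_rate HITS MISSES: hit percentage with one decimal place.
arc_hit_rate() {
    awk -v h="$1" -v m="$2" 'BEGIN {
        if (h + m == 0) print "0.0"
        else printf "%.1f\n", h * 100 / (h + m)
    }'
}

# Possible use on Linux OpenZFS, where the counters are in the kstat file:
#   hits=$(awk '$1 == "hits" { print $3 }' /proc/spl/kstat/zfs/arcstats)
#   misses=$(awk '$1 == "misses" { print $3 }' /proc/spl/kstat/zfs/arcstats)
#   arc_hit_rate "$hits" "$misses"
```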

eBPF for I/O latency analysis

# When performance drops, use bcc-tools for live analysis
apt install -y bpfcc-tools

# See which processes are issuing block I/O on one disk
biosnoop-bpfcc -d sdb | head -50

# Histogram of I/O latency per disk
biolatency-bpfcc -D 10 1

# Watch ZFS ARC in real time
arcstat 1

# Trace slow ZFS operations (>10ms)
zfsslower-bpfcc 10

9. Performance Tuning

ARC sizing — use most of RAM for cache

On a dedicated NAS, the ARC (Adaptive Replacement Cache) should consume most of the system RAM. Unlike a general-purpose server, a NAS does one thing: serve files. Let ZFS cache aggressively.

# Check current ARC size
arc_summary | head -30

# For a NAS with 32GB RAM, give ARC 24GB (leave ~8GB for OS + Samba + Docker)
echo "options zfs zfs_arc_max=25769803776" > /etc/modprobe.d/zfs-arc.conf

# For 64GB RAM, give ARC 56GB
# echo "options zfs zfs_arc_max=60129542144" > /etc/modprobe.d/zfs-arc.conf

# For 16GB RAM (minimum recommended for a NAS), give ARC 10GB
# echo "options zfs zfs_arc_max=10737418240" > /etc/modprobe.d/zfs-arc.conf

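# Apply without reboot (the modprobe.d file is only read when the module loads;
# on ZFS root, run update-initramfs -u so it is picked up at the next boot)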
echo 25769803776 > /sys/module/zfs/parameters/zfs_arc_max
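The byte values above are just GiB figures multiplied out. A small sketch of that arithmetic, with a hypothetical helper name (`arc_max_bytes`) and the reserve for the OS, Samba and Docker passed as an explicit assumption:

```shell
#!/bin/sh
# arc_max_bytes TOTAL_GIB RESERVE_GIB: ARC limit in bytes after leaving
# RESERVE_GIB of RAM for everything that is not ZFS cache.
arc_max_bytes() {
    total_gib=$1
    reserve_gib=$2
    echo $(( (total_gib - reserve_gib) * 1024 * 1024 * 1024 ))
}

# 32 GiB machine, 8 GiB reserved: matches the 24 GiB value used above.
arc_max_bytes 32 8
```

The result can be written to the modprobe.d file or to the zfs_arc_max parameter in the same way as the literal numbers above.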

Special vdev for metadata acceleration

If your NAS uses HDDs for bulk storage, a mirrored NVMe special vdev transforms the user experience. Directory listings that took 2–3 seconds on spinning rust complete instantly because all metadata lives on NVMe.

# Already shown in pool design above. Verify it's working:
zpool iostat -v tank 5

# You should see metadata ops hitting the special vdev (nvme),
# while bulk reads/writes hit the HDD vdev.

L2ARC for read-heavy workloads

# Add a dedicated NVMe as L2ARC (second-level read cache)
# Useful for: Plex scanning, photo browsing (Immich/PhotoPrism), search indexing
zpool add tank cache /dev/nvme2n1

# L2ARC is populated automatically with frequently-read blocks
# that get evicted from the in-memory ARC.
# Check L2ARC hit rate:
arc_summary | grep -A5 "L2ARC"

SLOG for sync writes

# If you run VMs or databases on the NAS, sync writes go through the ZIL.
# A dedicated SLOG (Separate Log) device accelerates this dramatically.
# Use high-endurance NVMe devices with capacitor-backed power loss protection.
# Mirror two separate devices; two partitions of one NVMe give no redundancy.
zpool add tank log mirror /dev/nvme3n1 /dev/nvme4n1

# SLOG only helps sync writes. For a pure file-sharing NAS, it's unnecessary.
# For NFS with sync=standard or database workloads, it's transformative.

recordsize tuning per dataset

# Already configured in the dataset layout above. The reasoning:
#
# recordsize=1M   — media, photos, backups
#   Large sequential files. Bigger records = fewer metadata ops = higher throughput.
#   A 4GB movie file is stored as ~4000 1MB records instead of ~32000 128KB records.
#
# recordsize=128K — documents, Docker
#   Mixed small files. Default recordsize is a good balance.
#
# recordsize=8K or 16K: databases (if you run PostgreSQL/MySQL on the NAS)
#   Match the database page size to avoid read-modify-write amplification:
#   8K for PostgreSQL, 16K for MySQL/InnoDB.
#   zfs create -o recordsize=8K -o logbias=latency tank/nas/postgres
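The record counts quoted for the 4GB movie follow from a ceiling division of file size by recordsize. A minimal sketch of that calculation, using a hypothetical helper name (`record_count`) and binary units:

```shell
#!/bin/sh
# record_count FILE_BYTES RECORD_BYTES: number of records needed for a file,
# rounding up so a partial final record still counts as one.
record_count() {
    file_bytes=$1
    record_bytes=$2
    echo $(( (file_bytes + record_bytes - 1) / record_bytes ))
}

gib=$((1024 * 1024 * 1024))
record_count $((4 * gib)) $((1024 * 1024))   # 4 GiB file, recordsize=1M
record_count $((4 * gib)) $((128 * 1024))    # 4 GiB file, recordsize=128K
```

Eight times fewer records means eight times fewer block pointers and checksums to walk when streaming the file.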

Compression ratios — real examples

# Check compression ratios across datasets
zfs get compressratio -r tank/nas

# Typical results on a real NAS:
#   tank/nas/documents    2.15x   — Office docs, PDFs, text files compress well
#   tank/nas/media        1.02x   — Already-compressed video, almost no gain
#   tank/nas/media/music  1.08x   — FLAC compresses slightly, MP3 does not
#   tank/nas/photos       1.00x   — JPEG/HEIC is already compressed
#   tank/nas/docker       1.65x   — Config files, logs, databases compress well
#   tank/nas/backups      1.45x   — Mixed content, moderate compression
#
# This is why we set compression=off on photos — it wastes CPU for zero benefit.
# LZ4 is so fast that the CPU cost is negligible on everything else,
# and the space savings on documents (2x) are substantial.

10. Docker on the NAS

A NAS is not just a file server — it is the natural home for every media and productivity container. Each container gets its own ZFS dataset with tuning appropriate to its workload.

# Install Docker
curl -fsSL https://get.docker.com | bash

# Configure Docker to use ZFS storage driver
mkdir -p /etc/docker
cat > /etc/docker/daemon.json << 'DOCKER'
{
    "storage-driver": "zfs",
    "data-root": "/srv/nas/docker",
    "log-driver": "json-file",
    "log-opts": {
        "max-size": "10m",
        "max-file": "3"
    }
}
DOCKER

systemctl enable --now docker

Docker Compose — NAS services stack

mkdir -p /srv/nas/docker/compose
cat > /srv/nas/docker/compose/docker-compose.yml << 'COMPOSE'

services:
  # ─── Jellyfin (open-source Plex alternative) ──────────────────
  jellyfin:
    image: jellyfin/jellyfin:latest
    container_name: jellyfin
    network_mode: host
    restart: unless-stopped
    volumes:
      - /srv/nas/docker/jellyfin/config:/config
      - /srv/nas/docker/jellyfin/cache:/cache
      - /srv/nas/media:/media:ro
    environment:
      - JELLYFIN_PublishedServerUrl=http://nas:8096
    # For hardware transcoding (Intel QuickSync):
    # devices:
    #   - /dev/dri:/dev/dri

  # ─── Immich (Google Photos replacement) ────────────────────────
  immich-server:
    image: ghcr.io/immich-app/immich-server:release
    container_name: immich-server
    restart: unless-stopped
    ports:
      - "2283:2283"
    volumes:
      - /srv/nas/photos:/usr/src/app/upload
      - /srv/nas/docker/immich:/usr/src/app/upload/library
    environment:
      - DB_HOSTNAME=immich-db
      - DB_USERNAME=immich
      - DB_PASSWORD=immich
      - DB_DATABASE_NAME=immich
      - REDIS_HOSTNAME=immich-redis
    depends_on:
      - immich-db
      - immich-redis

  immich-db:
    image: tensorchord/pgvecto-rs:pg16-v0.2.0
    container_name: immich-db
    restart: unless-stopped
    volumes:
      - /srv/nas/docker/immich-db:/var/lib/postgresql/data
    environment:
      - POSTGRES_USER=immich
      - POSTGRES_PASSWORD=immich
      - POSTGRES_DB=immich

  immich-redis:
    image: redis:7-alpine
    container_name: immich-redis
    restart: unless-stopped

  # ─── Nextcloud (self-hosted cloud storage) ─────────────────────
  nextcloud:
    image: nextcloud:latest
    container_name: nextcloud
    restart: unless-stopped
    ports:
      - "8080:80"
    volumes:
      - /srv/nas/docker/nextcloud:/var/www/html
      - /srv/nas/documents:/var/www/html/data/shared/files/Documents
      - /srv/nas/photos:/var/www/html/data/shared/files/Photos
    environment:
      - MYSQL_HOST=nextcloud-db
      - MYSQL_DATABASE=nextcloud
      - MYSQL_USER=nextcloud
      - MYSQL_PASSWORD=nextcloud
    depends_on:
      - nextcloud-db

  nextcloud-db:
    image: mariadb:11
    container_name: nextcloud-db
    restart: unless-stopped
    volumes:
      - /srv/nas/docker/nextcloud-db:/var/lib/mysql
    environment:
      - MYSQL_ROOT_PASSWORD=rootpass
      - MYSQL_DATABASE=nextcloud
      - MYSQL_USER=nextcloud
      - MYSQL_PASSWORD=nextcloud

  # ─── Syncthing (continuous file sync) ──────────────────────────
  syncthing:
    image: syncthing/syncthing:latest
    container_name: syncthing
    restart: unless-stopped
    ports:
      - "8384:8384"
      - "22000:22000/tcp"
      - "22000:22000/udp"
      - "21027:21027/udp"
    volumes:
      - /srv/nas/docker/syncthing:/var/syncthing/config
      - /srv/nas/documents:/var/syncthing/documents
      - /srv/nas/photos:/var/syncthing/photos

COMPOSE

cd /srv/nas/docker/compose && docker compose up -d

Every container’s persistent data lives on its own ZFS dataset (inside tank/nas/docker). Snapshots protect container state. If an update breaks Jellyfin, roll back its config dataset to the last snapshot.
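
A per-app rollback only works if each app directory is its own dataset rather than a plain directory. If the dataset layout earlier in the recipe did not already create them, something along these lines would, ideally before the containers first write data. The helper names are hypothetical and the functions only print commands so they can be reviewed first. It assumes tank/nas/docker is mounted at /srv/nas/docker so that child datasets inherit matching mountpoints.

```shell
# Hypothetical helpers that only PRINT commands; nothing is executed here.

# One child dataset per app under a parent dataset.
plan_app_datasets() {
    parent="$1"; shift
    for app in "$@"; do
        echo "zfs create -p ${parent}/${app}"
    done
}

# Stop a service, roll its dataset back, start it again.
# $1 = compose service, $2 = dataset, $3 = snapshot name
# Plain "zfs rollback" only accepts the most recent snapshot; going further
# back needs -r, which destroys the newer snapshots.
plan_rollback() {
    echo "docker compose stop $1"
    echo "zfs rollback $2@$3"
    echo "docker compose start $1"
}

plan_app_datasets tank/nas/docker jellyfin immich immich-db nextcloud nextcloud-db syncthing
plan_rollback jellyfin tank/nas/docker/jellyfin SNAPSHOT_NAME
```

Review the printed commands, then pipe them to `sh` to apply. `zfs list -t snapshot tank/nas/docker/jellyfin` shows the snapshot names to substitute for SNAPSHOT_NAME.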

11. Hardware Recommendations

Mini PC (2–4 bay)

Intel N100/N305 or AMD Ryzen 5600G. 16–32GB DDR4 or DDR5, depending on the platform (the 5600G is DDR4-only). 2–4 SATA/NVMe slots. Perfect for a home NAS. The Intel N-series boards can be fanless or near-silent and draw 15–35W idle. Pair with a 4-bay USB3 enclosure if the board lacks enough SATA ports.

Repurposed desktop

Any modern desktop with 4+ SATA ports. Add an HBA card for more drives. 32–64GB DDR4. Best value path — old workstations (Dell OptiPlex, HP ProDesk) cost almost nothing used and have plenty of PCIe slots for expansion.

Rack server (8+ bay)

Dell PowerEdge, HPE ProLiant, or Supermicro chassis. 64–256GB ECC DDR4/DDR5. Hot-swap drive bays. Redundant PSUs. For serious storage: 12–24 bays, RAIDZ2 or RAIDZ3, special vdev on NVMe. Loud — lives in a basement or closet.

ECC RAM

Recommended but not required. ZFS checksums every block, so it detects corruption that happens on disk with or without ECC. What it cannot catch is a bit-flip in RAM before the checksum is computed: ZFS then writes and checksums the already-bad data. ECC closes that gap. For a NAS holding irreplaceable photos and documents, ECC is cheap insurance. For a media server, non-ECC is fine because the data is replaceable.

HBA vs RAID card

Always use an HBA in IT mode (passthrough). ZFS needs direct access to the disks. Hardware RAID cards hide the disks behind a virtual volume: the OS cannot read SMART data directly, ZFS cannot tell which disk is returning errors, and you cannot replace individual disks from ZFS. Crossflash a Dell PERC H310/H710 to IT-mode firmware, or flash an LSI 9211-8i from IR to IT mode. These cards are widely available used for minimal cost.

Drives

CMR, not SMR. Shingled Magnetic Recording (SMR) drives rewrite entire zones on random writes, which cripples ZFS resilver performance and sustained write throughput. Always verify the drive uses Conventional Magnetic Recording (CMR). WD Red Plus, Seagate IronWolf, and Toshiba N300 are all CMR. WD Red (non-Plus) is often SMR, so avoid it.

NVMe roles

Special vdev: Mirrored NVMe pair. Stores metadata + small blocks. Transforms HDD directory listing performance. Any consumer NVMe works.
SLOG: Single high-endurance NVMe with power loss protection (Intel Optane, Samsung PM983). Only needed for sync writes (NFS with sync=standard, databases, VMs).
L2ARC: Single consumer NVMe. Read cache for data that does not fit in ARC. Useful for Plex scanning, photo browsing, search indexing.
Boot: The kldload boot disk. A small NVMe or SATA SSD works fine.

12. Security

Samba authentication

# Local users are configured above. For larger environments:
# Option 1: LDAP backend (FreeIPA, OpenLDAP)
# Option 2: Join Active Directory domain
#   apt install -y winbind libpam-winbind libnss-winbind krb5-user
#   Add "security = ADS" and "realm = EXAMPLE.COM" to smb.conf [global] first,
#   then join the domain:
#   net ads join -U administrator

nftables firewall — only expose services on LAN

cat > /etc/nftables.conf << 'NFTEOF'
#!/usr/sbin/nft -f

flush ruleset

table inet nas_firewall {
    chain input {
        type filter hook input priority 0; policy drop;

        # Loopback
        iif lo accept

        # Established connections
        ct state established,related accept

        # ICMP (ping)
        ip protocol icmp accept
        ip6 nexthdr icmpv6 accept

        # SSH — LAN only
        ip saddr 192.168.1.0/24 tcp dport 22 accept

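        # SMB — LAN and WireGuard tunnel peers (ports 139, 445).
        # Remote clients arrive with a tunnel source address (10.88.0.0/30),
        # so a LAN-only rule would drop them.
        ip saddr { 192.168.1.0/24, 10.88.0.0/30 } tcp dport { 139, 445 } accept

        # NFS — LAN only (NFSv4 on port 2049; NFSv3 also needs rpcbind on 111 and mountd)
        ip saddr 192.168.1.0/24 tcp dport 2049 accept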

        # mDNS/Avahi — LAN only
        ip saddr 192.168.1.0/24 udp dport 5353 accept

        # Docker services — LAN only
        ip saddr 192.168.1.0/24 tcp dport { 8080, 8096, 8384, 2283 } accept

        # WireGuard — from anywhere (for offsite replication + remote access)
        udp dport 51821 accept

        # Prometheus node_exporter — LAN only
        ip saddr 192.168.1.0/24 tcp dport 9100 accept

        # Log + drop everything else
        log prefix "nas-dropped: " limit rate 5/minute
        drop
    }

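    chain forward {
        type filter hook forward priority 0; policy drop;

        # Ports published by Docker (8080, 8384, 2283) are DNATed to the
        # containers and traverse forward, not input. Without these rules
        # the drop policy blocks them.
        ct state established,related accept
        ip saddr 192.168.1.0/24 ct status dnat accept

        # Traffic originating from containers (default bridge and compose networks)
        iifname "docker0" accept
        iifname "br-*" accept
    }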

    chain output {
        type filter hook output priority 0; policy accept;
    }
}
NFTEOF

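nft -f /etc/nftables.conf
systemctl enable nftables

# "flush ruleset" also removes the NAT rules Docker installed at startup.
# If Docker is already running, restart it so it recreates them.
systemctl restart docker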
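
Because the input policy is drop, a typo in the ruleset can lock you out of SSH. One cautious pattern is to let nft parse the file with its check flag (-c) before loading it. The wrapper below is a hypothetical sketch, not part of kldload. The NFT variable exists only so the function can be exercised without root.

```shell
# Hypothetical wrapper: load a ruleset only if it parses cleanly.
# NFT can be overridden for testing; it defaults to the real nft binary.
apply_ruleset() {
    conf="$1"
    nft_bin="${NFT:-nft}"
    # -c checks the commands without applying them
    "$nft_bin" -c -f "$conf" || { echo "ruleset check failed: $conf" >&2; return 1; }
    "$nft_bin" -f "$conf"
}

# On the NAS (as root):
#   apply_ruleset /etc/nftables.conf && nft list chain inet nas_firewall input
```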

WireGuard for remote NAS access

# Access your NAS from anywhere without exposing SMB to the internet
# The WireGuard tunnel from Section 7 already provides this.
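# On your laptop/phone, install WireGuard and add the peer config below.
# The NAS also needs a matching [Peer] entry with the client's public key.
# Caution: 10.88.0.0/30 has only two usable host addresses (.1 and .2), and .3 is
# the subnet's broadcast address. If the replication tunnel already uses both,
# widen the tunnel subnet on every peer (for example to /29) and give this
# client a free host address.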

# [Interface]
# Address = 10.88.0.3/30
# PrivateKey = <LAPTOP_PRIVATE_KEY>
# DNS = 192.168.1.1
#
# [Peer]
# PublicKey = <NAS_PUBLIC_KEY>
# AllowedIPs = 192.168.1.0/24, 10.88.0.0/30
# Endpoint = your-home-ip:51821
# PersistentKeepalive = 25

# Now \\nas\documents works from a coffee shop, encrypted end-to-end.
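
The tunnel addresses above come from a /30. As a quick sanity check on how many peers a given tunnel prefix can hold, the usual count is the subnet size minus the network and broadcast addresses. The helper name usable_hosts is hypothetical:

```shell
# Hypothetical helper: usable host addresses in an IPv4 subnet.
# $1 = prefix length (valid for /1 through /30).
usable_hosts() {
    echo $(( (1 << (32 - $1)) - 2 ))
}

usable_hosts 30    # -> 2   (NAS + one peer)
usable_hosts 29    # -> 6   (NAS + five peers)
usable_hosts 24    # -> 254
```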

ZFS encryption at rest

# For sensitive datasets, enable native ZFS encryption.
# With keyformat=passphrase the dataset stays locked after a reboot until you
# load the key interactively (at the console or over SSH).
zfs create -o encryption=on -o keyformat=passphrase \
    -o mountpoint=/srv/nas/private tank/nas/private

# To unlock after reboot:
zfs load-key tank/nas/private
zfs mount tank/nas/private

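# Or use a keyfile for automated unlock. This protects disks that leave the
# machine, not a stolen NAS whose boot disk still holds the key.
mkdir -p /root/.zfs-keys && chmod 700 /root/.zfs-keys
dd if=/dev/urandom of=/root/.zfs-keys/private.key bs=32 count=1
chmod 400 /root/.zfs-keys/private.key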
zfs create -o encryption=on -o keyformat=raw \
    -o keylocation=file:///root/.zfs-keys/private.key \
    -o mountpoint=/srv/nas/vault tank/nas/vault
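
After a reboot it helps to see which encrypted datasets are still locked. ZFS exposes this through the keystatus property (available, unavailable, or - for unencrypted datasets). The filter below is a small sketch; locked_datasets is a hypothetical helper name.

```shell
# Hypothetical helper: reads tab-separated "name keystatus" lines and prints
# the datasets whose encryption key is not loaded.
locked_datasets() {
    awk -F'\t' '$2 == "unavailable" { print $1 }'
}

# On the NAS (assumes the tank/nas layout from this recipe):
#   zfs get -H -r -o name,value keystatus tank/nas | locked_datasets
#   zfs load-key -r tank/nas     # load keys for everything under tank/nas

# Sample input to show the behaviour:
printf 'tank/nas/private\tunavailable\ntank/nas/vault\tavailable\ntank/nas/media\t-\n' | locked_datasets
```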

fail2ban for SSH

apt install -y fail2ban

cat > /etc/fail2ban/jail.local << 'F2B'
[sshd]
enabled = true
port = ssh
# Debian 12 logs sshd to the journal only (no /var/log/auth.log by default)
backend = systemd
maxretry = 3
bantime = 3600
findtime = 600
F2B

systemctl enable --now fail2ban
