
Replicate Your Data

Automatic backups to another machine — ZFS sends only what changed, not the whole dataset every time.

Before you start: this guide assumes your two machines can already reach each other. If you haven't set that up yet, do the Connect Two Machines guide first — it takes about 5 minutes and sets up a WireGuard tunnel between them.

What you'll end up with: your data on Machine A automatically replicated to Machine B every hour. If Machine A dies, you restore from Machine B in minutes.

What replication actually means

Not a file copy — a block stream

Normal backup tools copy files one by one. ZFS replication works at a lower level — it sends raw blocks of changed data. ZFS knows exactly what changed since the last backup, so it only sends the diff.

analogy: instead of re-photocopying the whole book, you only mail the pages you edited

Snapshots are the foundation

Replication works by comparing two snapshots. ZFS takes a snapshot on the source, compares it to the last snapshot it sent, and streams only what's between them. The remote side applies that stream and ends up with an identical copy.

analogy: snapshot = save point in a video game. replication = syncing your save file to a second console.

First run vs incremental

The first replication copies everything, which can take a while for large datasets. Every run after that sends only the changes. A 100 GB dataset might take 20 minutes the first time (about 85 MB/s, within reach of a gigabit link), then 30 seconds every hour after that.

first run: ships the full box set. incremental: mails only the new episodes.
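
The timing figures above are simple division: stream size over sustained throughput. A minimal sketch of that arithmetic follows. The function name estimate_seconds, the 85 MiB/s rate, and the 2550 MiB hourly change are illustrative assumptions, not measurements.

```shell
# Estimate transfer time in whole seconds.
# $1 = data to send in MiB, $2 = sustained throughput in MiB/s (both assumed inputs)
estimate_seconds() {
    echo $(( $1 / $2 ))
}

# First run: 100 GiB (102400 MiB) at an assumed 85 MiB/s
estimate_seconds 102400 85    # prints 1204, about 20 minutes
# Hourly incremental: an assumed 2550 MiB of changes at the same rate
estimate_seconds 2550 85      # prints 30
```

Your real numbers depend on disk speed, link speed, and how much data changes per hour.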

Step 1 — Set up SSH keys (so you don't need a password)

Replication needs to SSH from Machine A into Machine B automatically, without prompting for a password. We do this with SSH keys:

# On Machine A: generate an SSH key pair (-N "" sets an empty passphrase, so nothing prompts)
ssh-keygen -t ed25519 -f /root/.ssh/replicate_key -N ""

# Copy the public key to Machine B
# Replace 10.77.0.2 with Machine B's actual WireGuard address
ssh-copy-id -i /root/.ssh/replicate_key.pub root@10.77.0.2

Test that it works without a password:

ssh -i /root/.ssh/replicate_key root@10.77.0.2 hostname

You should see Machine B's hostname printed, with no password prompt.
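
A timer or cron job cannot answer a password prompt, so it helps to test the login the way an unattended job would. The sketch below assumes the key path and address used above. check_key_login is a hypothetical helper name, and BatchMode=yes is the OpenSSH option that makes ssh fail instead of prompting.

```shell
# Succeeds only if key-based login works with no interactive prompt.
# $1 = private key path, $2 = user@host of Machine B
check_key_login() {
    if remote_name=$(ssh -i "$1" -o BatchMode=yes -o ConnectTimeout=5 "$2" hostname); then
        echo "OK: key login works, remote hostname is $remote_name"
    else
        echo "FAIL: key login to $2 did not work" >&2
        return 1
    fi
}
```

Run it as: check_key_login /root/.ssh/replicate_key root@10.77.0.2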


Step 2 — Manual replication with native ZFS (understand what's happening)

Before automating anything, run a manual replication so you can see exactly what happens. This section uses raw zfs send / zfs recv so you understand the mechanics. The next section shows how syncoid automates all of this.

First, create the dataset on Machine A if it doesn't exist:

# On Machine A
zfs create rpool/data

Take a snapshot on Machine A:

# On Machine A — create a snapshot named "snap1"
# (syncoid does this automatically; here you do it manually to see the flow)
zfs snapshot rpool/data@snap1

Now send it to Machine B:

# The native ZFS way — pipe the snapshot stream over SSH into zfs recv on the remote
zfs send rpool/data@snap1 | ssh -i /root/.ssh/replicate_key root@10.77.0.2 zfs recv backup/data

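Expected output: plain zfs send prints nothing and simply returns when the transfer finishes. With verbose output and a progress meter such as pv in the pipeline, you'd see something like this: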

sending from @ to rpool/data@snap1
78.3 MiB  00:00:12 [6.32 MiB/s] [============================>] 100%

Verify on Machine B:

# On Machine B
zfs list backup/data
NAME          USED  AVAIL  REFER  MOUNTPOINT
backup/data  78.3M  412G  78.3M  /backup/data

The dataset is now on Machine B.
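
For larger datasets it can be useful to know how much data will cross the tunnel before sending it. zfs send accepts -n (dry run) and -P (parsable output). The sketch below assumes the dry run ends with a tab-separated "size" line giving the estimated byte count; check the output on your OpenZFS version before relying on it. estimate_send_bytes is a hypothetical helper name.

```shell
# Print the estimated stream size in bytes without sending anything.
# Arguments are passed straight to zfs send: a snapshot name for a full
# send, or -i <old> <new> for an incremental.
estimate_send_bytes() {
    zfs send -nP "$@" 2>&1 | awk '$1 == "size" { print $2 }'
}
```

Run it as: estimate_send_bytes rpool/data@snap1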

Incremental send — only sending the difference

After the first full send, you only need to send what changed. Make a new snapshot and use -i to send just the difference between two snapshots:

zfs snapshot rpool/data@snap2
zfs send -i rpool/data@snap1 rpool/data@snap2 | ssh -i /root/.ssh/replicate_key root@10.77.0.2 zfs recv backup/data

Syncoid does all of this automatically — it tracks which snapshots exist on both sides, picks the right -i pair, and runs the send/recv. You never have to remember snapshot names or which was the last one sent.
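
To make that bookkeeping concrete, here is a sketch of the core question any replication tool has to answer: which is the newest snapshot both sides share? This illustrates the idea and is not syncoid's actual code. latest_common and snapshot_names are hypothetical helper names.

```shell
# Print the newest snapshot name present in both lists (nothing if none).
# $1 = source snapshot names, oldest first; $2 = destination snapshot names.
# Names are whitespace-separated and given without the dataset@ prefix.
latest_common() {
    common=""
    for src in $1; do
        for dst in $2; do
            if [ "$src" = "$dst" ]; then
                common=$src
            fi
        done
    done
    if [ -n "$common" ]; then
        echo "$common"
    fi
}

# Collect the snapshot names of one dataset, oldest first.
# Run it locally for Machine A, or through ssh for Machine B.
snapshot_names() {
    zfs list -H -t snapshot -o name -s creation "$1" | sed 's/.*@//'
}
```

An empty result means there is no shared base and a full send is needed. Otherwise the result is the snapshot to pass to zfs send -i.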


Step 3 — Automatic replication with syncoid

Doing the snapshot + send + recv sequence manually every time is error-prone. syncoid handles all of it automatically — it figures out what snapshots exist on both sides, creates a new one, and sends only the difference. It is installed on kldload as part of the sanoid package.

# The syncoid way: one command replaces the entire snapshot/send/recv sequence
syncoid --sshkey /root/.ssh/replicate_key rpool/data root@10.77.0.2:backup/data

# Roughly what syncoid does on every run, in native ZFS terms
# (the snapshot names below are illustrative):
# 1. Take a new snapshot
zfs snapshot rpool/data@autosnap_2026-03-26_1300
# 2. Find the newest snapshot present on both sides (syncoid works this out for you)
# 3. Send only the incremental difference
zfs send -i rpool/data@autosnap_2026-03-26_1200 rpool/data@autosnap_2026-03-26_1300 \
  | ssh -i /root/.ssh/replicate_key root@10.77.0.2 zfs recv backup/data

Expected output from syncoid:

INFO: Sending oldest full snapshot rpool/data@autosnap_2026-03-26_12:00:00_hourly
  (snip)
INFO: Updating backup/data on root@10.77.0.2
Sending incremental rpool/data@autosnap_2026-03-26_12:00:00_hourly ... to rpool/data@autosnap_2026-03-26_13:00:00_hourly
 4.21 MiB  00:00:01 [3.80 MiB/s] [============>] 100%

Run it again — it's nearly instant because nothing changed:

syncoid --sshkey /root/.ssh/replicate_key rpool/data root@10.77.0.2:backup/data
INFO: No new snapshots to send.

What syncoid automates that zfs send/recv doesn't

Snapshot bookkeeping: syncoid works out the newest snapshot both sides share, so it can send only the next increment. With raw zfs send you must track snapshot names yourself.
Automatic snapshot creation: syncoid takes a fresh snapshot before each send. You don't have to run zfs snapshot first.
First-run detection: if no common snapshot exists, syncoid does a full send automatically. A raw zfs send -i stream is rejected by zfs recv if the base snapshot doesn't exist on the remote.
Progress display: shows transfer rate and percentage (via pv, when it is installed). Raw zfs send is silent by default.
Recursive replication: add -r to replicate a dataset and all its children in one command.


Step 4 — Schedule it automatically

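Create a systemd timer to run syncoid every hour. On systemd-based systems this has two advantages over cron: with Persistent=true, a run missed while the machine was off is started at the next boot, and the job's output lands in the journal (journalctl -u syncoid-data.service).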

Create the service file:

cat > /etc/systemd/system/syncoid-data.service << 'EOF'
[Unit]
Description=ZFS replication — rpool/data to remote backup
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/sbin/syncoid --sshkey /root/.ssh/replicate_key rpool/data root@10.77.0.2:backup/data
User=root
EOF

Create the timer:

cat > /etc/systemd/system/syncoid-data.timer << 'EOF'
[Unit]
Description=Run ZFS replication every hour

[Timer]
OnCalendar=hourly
Persistent=true

[Install]
WantedBy=timers.target
EOF

Enable and start the timer:

systemctl daemon-reload
systemctl enable --now syncoid-data.timer

Check that the timer is scheduled:

systemctl list-timers syncoid-data.timer
NEXT                        LEFT     LAST                        PASSED  UNIT                  ACTIVATES
Thu 2026-03-26 14:00:00 UTC 42min left Thu 2026-03-26 13:00:01 UTC 17min ago syncoid-data.timer   syncoid-data.service
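
The timer listing shows when the job ran, not whether it worked. systemd records the outcome of the last run in the service's Result property. The sketch below reads it; last_run_ok is a hypothetical helper name.

```shell
# Report whether the most recent run of a oneshot service succeeded.
# $1 = unit name, e.g. syncoid-data.service
last_run_ok() {
    result=$(systemctl show -p Result --value "$1")
    if [ "$result" = "success" ]; then
        echo "$1: last run succeeded"
    else
        echo "$1: last run result was '$result', see journalctl -u $1" >&2
        return 1
    fi
}
```

Run it as: last_run_ok syncoid-data.service. A unit that has not run yet typically also reports success, so read it together with the LAST column above.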

If you prefer cron, the equivalent is:

# Add to root's crontab: crontab -e
0 * * * * /usr/sbin/syncoid --sshkey /root/.ssh/replicate_key rpool/data root@10.77.0.2:backup/data >> /var/log/syncoid.log 2>&1
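
cron starts a job every hour whether or not the previous one has finished, so a slow transfer could overlap with the next run. One common guard is a lock. The sketch below uses an atomic mkdir as the lock; run_locked and the lock path are hypothetical names, and flock from util-linux is another option.

```shell
# Run a command only if no other run holds the lock directory.
# $1 = lock directory path, remaining arguments = command to run
run_locked() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "previous run still active, skipping" >&2
        return 0
    fi
    status=0
    "$@" || status=$?
    rmdir "$lockdir"
    return "$status"
}
```

In a wrapper script called from cron, this would look like: run_locked /run/syncoid-data.lock /usr/sbin/syncoid --sshkey /root/.ssh/replicate_key rpool/data root@10.77.0.2:backup/data. If the job is killed mid-run, the lock directory stays behind and has to be removed by hand.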

Step 5 — Verify both sides match

Check what's on each machine:

# On Machine A
zfs list -t snapshot rpool/data
NAME                                              USED  AVAIL  REFER  MOUNTPOINT
rpool/data@autosnap_2026-03-26_12:00:00_hourly     0B      -  78.3M  -
rpool/data@autosnap_2026-03-26_13:00:00_hourly     0B      -  78.3M  -
# On Machine B
zfs list -t snapshot backup/data
NAME                                              USED  AVAIL  REFER  MOUNTPOINT
backup/data@autosnap_2026-03-26_12:00:00_hourly    0B      -  78.3M  -
backup/data@autosnap_2026-03-26_13:00:00_hourly    0B      -  78.3M  -

Identical snapshots on both sides — the data is fully replicated.
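
Matching names are a good sign, but names can be reused. Every snapshot also has a guid property, and send/recv preserves it, so equal guids on both machines indicate the same snapshot. The sketch below compares them; same_snapshot is a hypothetical helper name.

```shell
# Succeeds if a local snapshot and a remote snapshot have the same guid.
# $1 = ssh key, $2 = user@host, $3 = local snapshot, $4 = remote snapshot
same_snapshot() {
    local_guid=$(zfs get -H -o value guid "$3") || return 2
    remote_guid=$(ssh -i "$1" "$2" zfs get -H -o value guid "$4") || return 2
    [ "$local_guid" = "$remote_guid" ]
}
```

Run it as: same_snapshot /root/.ssh/replicate_key root@10.77.0.2 rpool/data@autosnap_2026-03-26_13:00:00_hourly backup/data@autosnap_2026-03-26_13:00:00_hourly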


Step 6 — Rollback on the remote if you need to recover

If Machine A is lost or corrupted, you can roll back the remote copy to any snapshot:

# On Machine B — list what's available
zfs list -t snapshot backup/data

# Roll back to a specific snapshot
# (-r is required when newer snapshots exist, and it destroys them)
zfs rollback -r backup/data@autosnap_2026-03-26_12:00:00_hourly
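
Rolling back to anything other than the newest snapshot needs zfs rollback -r, and -r destroys every snapshot taken after the target. The sketch below lists what would be lost before you commit. snapshots_after is a hypothetical helper that filters a creation-ordered list.

```shell
# Read snapshot names (oldest first) on stdin and print those after the target.
# $1 = full name of the snapshot you intend to roll back to
snapshots_after() {
    awk -v target="$1" 'found { print } $0 == target { found = 1 }'
}
```

Run it as: zfs list -H -t snapshot -o name -s creation backup/data | snapshots_after backup/data@autosnap_2026-03-26_12:00:00_hourly. No output means the target is already the newest snapshot.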

Or, to use the backup as a new primary dataset after Machine A is gone:

# On Machine B: start using the replica as the live dataset
# First, check the mountpoint
zfs get mountpoint backup/data

# Set a mountpoint and mount it
zfs set mountpoint=/data backup/data
zfs mount backup/data

How much space does this use?

The first replication uses as much space as your dataset. After that, each incremental only stores the changed blocks. If you have hourly snapshots and change 10 MB per hour, each snapshot costs about 10 MB. Old snapshots can be pruned automatically by sanoid — it's already configured on kldload.
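
To see what the retained snapshots actually cost, ZFS exposes the usedbysnapshots property on each dataset. The sketch below reports it in MiB. snapshot_space_mib is a hypothetical helper name, and -p asks zfs get for exact byte counts.

```shell
# Print the space held by all snapshots of a dataset, in whole MiB.
# $1 = dataset name, e.g. backup/data
snapshot_space_mib() {
    bytes=$(zfs get -Hp -o value usedbysnapshots "$1") || return 1
    echo $(( bytes / 1024 / 1024 ))
}
```

Run it as: snapshot_space_mib backup/data. With 24 hourly snapshots at about 10 MB each, expect a figure near 240.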


syncoid vs native zfs send/recv — quick reference table

Task: Initial full replication
  syncoid: syncoid --sshkey key src user@host:dst
  Native ZFS: zfs snapshot src@s1 && zfs send src@s1 | ssh host zfs recv dst
  What syncoid adds: Auto-snapshots, single command, progress display

Task: Incremental update
  syncoid: (same command; syncoid detects the incremental case automatically)
  Native ZFS: zfs snapshot src@s2 && zfs send -i src@s1 src@s2 | ssh host zfs recv dst
  What syncoid adds: Works out the base snapshot, so you never need to specify -i manually

Task: Replicate all child datasets
  syncoid: syncoid -r src user@host:dst
  Native ZFS: zfs snapshot -r src@snap && zfs send -R src@snap | ssh host zfs recv dst
  What syncoid adds: Handles snapshots per child; native -R is an all-or-nothing stream

Task: List what's on each machine
  syncoid: (use native)
  Native ZFS: zfs list -t snapshot src / dst
  What syncoid adds: No wrapper needed for listing

Task: Roll back remote to a snapshot
  syncoid: (use native on remote)
  Native ZFS: zfs rollback backup/data@snap
  What syncoid adds: Recovery is a native ZFS operation

Task: Automate (run every hour)
  syncoid: systemd timer calling syncoid
  Native ZFS: cron calling zfs snapshot + zfs send + ssh
  What syncoid adds: Syncoid is idempotent and safe to run repeatedly; manual scripting is fragile