ZFS — The Last Word in Filesystems
ZFS is not a filesystem. Not in the way you think of ext4, XFS, or NTFS as filesystems. ZFS is an integrated storage platform — a volume manager, a RAID controller, a filesystem, a cache manager, a compression engine, a checksumming layer, and a replication system fused into a single, coherent whole. It replaces the entire traditional storage stack with one piece of software that manages everything from raw physical disks to mounted directories.
This is the landing page for the kldload ZFS Wiki — the most complete introduction to ZFS on the internet. It covers what ZFS is, where it came from, how it works under the hood, why it matters, and how to get started in five minutes. Every section links deeper into the wiki for hands-on details. If you read this page and nothing else, you'll understand ZFS better than 95% of the people who use it.
I've been running ZFS in production since 2014. On Solaris first, then illumos, then FreeBSD, now Linux. I've watched it eat every other storage solution alive. The reason I built kldload around ZFS isn't because it's trendy — it's because after a decade of running it, I can't imagine going back to anything else. This page is the honest, complete introduction I wish someone had given me ten years ago.
ZFS by the Numbers
zpool + zfs.
The Origin Story — Sun Microsystems, 2001–2005
In 2001, two engineers at Sun Microsystems — Jeff Bonwick and Matt Ahrens — started building a filesystem from scratch. Not an incremental improvement. Not a patch on UFS. A clean-sheet design that asked: what would storage look like if we threw away every assumption from the last 30 years and started over?
The context matters. By 2001, the Unix storage stack was a patchwork of tools from different
decades. fdisk from the 1980s created partitions. md (later
mdadm) assembled software RAID. LVM carved logical volumes.
mkfs formatted filesystems. Each tool had its own syntax, its own state files,
its own failure modes. None of them knew about each other. A RAID controller didn't know
which blocks belonged to which files. The filesystem didn't know that the data underneath
was mirrored. Every layer was flying blind.
Bonwick and Ahrens decided the problem wasn't any single tool — it was the layering itself. Their insight: if one system manages everything from physical disks to mounted directories, it can make guarantees that no stack of independent tools ever could. It can checksum data and verify the checksum at read time. It can self-heal corruption from a mirror copy. It can take instant snapshots with zero I/O. It can compress transparently. It can do all of this because it controls the entire path from application write to disk platter.
The name ZFS originally stood for "Zettabyte File System", a nod to its 128-bit design, which can theoretically address 2^128 bytes. That is 2^58 zebibytes, about 288 quadrillion (the figure is often quoted as "256 quadrillion zettabytes"). To put that in perspective: if you swapped every grain of sand on Earth for a hard drive, you still couldn't exhaust ZFS's address space. But Bonwick later said the name was really meant to be "the last word in filesystems", Z being the last letter of the alphabet. The ambition was to build something so complete that nobody would ever need to build another filesystem again.
ZFS was integrated into the Solaris development codebase at the end of October 2005 and released as open source under the CDDL (Common Development and Distribution License) as part of OpenSolaris that November. Its first commercial release followed in Solaris 10 6/06 (Update 2) in June 2006. It was immediately recognized as a generational leap. While the rest of the industry was bolting features onto 1990s-era designs, Sun shipped a filesystem that solved problems most people didn't even know they had.
Jeff Bonwick's original blog posts about ZFS are legendary reading.
He described the design philosophy as: "we wanted to make administering storage as easy as
filling a glass of water." That philosophy shows in everything — two commands
(zpool and zfs) manage everything. No fdisk,
no mdadm, no lvcreate, no mkfs.
Just pools and datasets. That simplicity is the result of incredible engineering depth.
If you ever get the chance to read Bonwick's 2005 blog post "ZFS: The Last Word in Filesystems,"
do it. It reads like a manifesto, and twenty years later, every claim in it has held up.
The Journey — Sun to Oracle to OpenZFS
ZFS has had one of the most dramatic histories in open-source software. It survived a corporate acquisition, a license war, a community fork, and a cross-platform unification. Understanding the history explains why ZFS is what it is today.
Development begins at Sun Microsystems. Jeff Bonwick and Matt Ahrens start the ZFS project. The team grows to include Mark Shellenbaum, Mark Maybee, Neil Perrin, and Bill Moore. They work in secret for four years, building the entire system before anyone outside Sun sees a line of code.
ZFS is open-sourced under the CDDL as part of OpenSolaris, then ships commercially in Solaris 10 Update 2. The storage world immediately recognizes it as a generational leap. Apple begins porting ZFS to Mac OS X (read-only support appears in Mac OS X 10.5, but the full read-write port never ships publicly; licensing concerns and internal politics kill it).
FreeBSD integrates ZFS. Pawel Jakub Dawidek ports ZFS to FreeBSD, and it ships, initially marked experimental, in FreeBSD 7.0. This makes FreeBSD the first non-Solaris platform with ZFS support. FreeBSD's ZFS integration remains the most mature on any non-illumos platform to this day.
ZFS dedup and encryption appear in Solaris. Sun continues adding major features. The community grows. FreeNAS (now TrueNAS) adopts ZFS as its storage backend. NetApp sues Sun over ZFS patents (the case is eventually settled).
Oracle acquires Sun for $7.4 billion. Within months, Oracle closes the OpenSolaris source code. The open-source community is cut off from future ZFS improvements. This is the near-death moment for open-source ZFS.
The illumos fork. The community forks the last open-source Solaris code into illumos. Garrett D'Amore, Bryan Cantrill, and others keep ZFS alive. Joyent (now Samsung) builds their SmartOS cloud platform on illumos + ZFS. Delphix builds database virtualization on it. Brian Behlendorf and LLNL (Lawrence Livermore National Laboratory) begin porting ZFS to Linux as a DKMS kernel module.
ZFS on Linux reaches production quality. The "ZoL" project (ZFS on Linux) achieves stability for production workloads. Adoption begins in HPC, scientific computing, and enterprise storage. Ubuntu becomes the first major distro to ship ZFS packages.
Ubuntu ships ZFS with the kernel. Canonical includes prebuilt ZFS modules in its kernel packages, starting with Ubuntu 16.04. Its legal team considers distributing the CDDL module alongside the GPL kernel permissible, arguing that zfs.ko is a self-contained work rather than a derivative work of the kernel. This dramatically accelerates ZFS adoption on Linux.
OpenZFS unification. With OpenZFS 2.0, the Linux and FreeBSD implementations merge into a single repository under the OpenZFS umbrella, while illumos keeps its own tree and exchanges changes with it. One codebase, multiple platforms. Feature development accelerates dramatically. The project is now the definitive open-source ZFS implementation, with contributions from iXsystems, Klara Inc., Delphix, LLNL, and dozens of independent developers.
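OpenZFS 2.1 ships dRAID. Distributed RAID spreads parity and spare capacity across all disks, so a rebuild onto the distributed spare engages every disk and finishes far sooner than a traditional resilver onto a single replacement drive. Also: the compatibility pool property makes it easier to keep a pool importable across platforms and versions. (Persistent L2ARC, which survives reboots, had already arrived in OpenZFS 2.0.)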
OpenZFS 2.2 ships block cloning. Copy a file in near-zero time by sharing block pointers. Linux 6.x kernel compatibility. Significant performance improvements across the board. RAIDZ expansion, which lets you add a disk to an existing RAIDZ vdev, is merged into the development branch shortly after this release.
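OpenZFS 2.3 ships RAIDZ expansion. Released in January 2025, it is the first release in which RAIDZ topology is mutable: a disk can be added to an existing RAIDZ vdev. Fast dedup reduces the memory overhead of deduplication dramatically. Continued platform improvements. The project is more active than at any point in its history, with over 200 contributors on GitHub.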
Oracle's acquisition of Sun nearly killed open-source ZFS. It's one of the great what-ifs of open source history. But the community response (illumos, ZoL, and eventually the OpenZFS unification) produced something better than what Sun alone would have built. The competition between illumos and Linux implementations pushed both forward. ZFS is stronger today because it survived Oracle. That said, Oracle still ships its own ZFS in Solaris 11. Their version has features that haven't made it to OpenZFS. But the OpenZFS community moves faster now, and the unification in 2020 means improvements land on Linux and FreeBSD together. The future belongs to OpenZFS, not Oracle's closed fork.
What ZFS Actually Is — Not Just a Filesystem
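The most common mistake people make about ZFS is calling it a "filesystem" and comparing it to ext4 or XFS. That's like comparing a smartphone to a calculator: they overlap, but they're not the same category. ZFS is seven things in one: a volume manager, a RAID controller, a filesystem, a cache manager, a compression engine, a checksumming layer, and a replication system.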
Traditional Linux storage is a layer cake. Physical disks are partitioned with fdisk
or parted. Partitions are assembled into RAID arrays with mdadm.
RAID arrays are carved into logical volumes with LVM. Logical volumes are
formatted with mkfs.ext4 or mkfs.xfs. Each layer is a separate tool,
a separate configuration, a separate failure domain.
The traditional Linux storage stack:
Physical Disks → Partition Table (GPT) → mdadm RAID → LVM → ext4/XFS
Five layers. Five tools. Five failure modes. Five places where a misconfiguration corrupts your data.
The ZFS storage stack:
Physical Disks → ZFS
One layer. Two commands. Zero ambiguity.
ZFS replaces all of it. No partition tables (ZFS manages raw disks or partitions directly). No separate RAID controller (ZFS has mirrors, RAIDZ1/2/3, and dRAID built in). No volume manager (ZFS pools dynamically allocate space to datasets). No separate filesystem format (ZFS is the filesystem). No separate cache layer (ARC, L2ARC, and SLOG are native ZFS features).
This integration isn't just convenient — it's architecturally superior. When the filesystem knows about the RAID layout, it can optimize writes to fill full stripes. When the RAID layer knows about the filesystem, it can verify checksums end-to-end. When the cache manager knows about both, it can make intelligent decisions about what to keep in memory. No bolted-together stack of independent tools can achieve this.
Copy-on-Write — The Paradigm That Changes Everything
Most traditional filesystems (ext4, XFS, NTFS) use in-place writes. When you modify a file, the new data overwrites the old data at the same location on disk. If power fails mid-write, you can be left with a partially written block: corruption. That's why journals exist: they write the change to a log first, then apply it. But in their default modes, journals protect only metadata. Data corruption from interrupted writes is still possible on ext4 (unless you mount with data=journal, which most people don't because writing every data block twice roughly halves write performance).
ZFS uses copy-on-write (COW). When you modify a block, ZFS doesn't touch the original. It writes the new block to a new location on disk, then writes a new copy of the parent that points to it, and so on up the tree to the root. Only once the new root is in place does the old location become free. The on-disk data is always in a consistent state. There is no window where a power failure can leave you with half-written garbage. You either have the old data or the new data. Always.
This single design choice enables everything else ZFS does:
Instant snapshots
A snapshot is just a saved set of block pointers. Since old blocks are never overwritten, creating a snapshot costs nothing — no data copy, no I/O, no delay. It's a metadata operation that completes in milliseconds regardless of dataset size.
Instant clones
A clone is a writable snapshot. It shares all blocks with the original and only allocates new space when blocks diverge. Clone a 500GB dataset in milliseconds, use zero additional space until you change something.
Atomic transactions
Every write is a transaction. The superblock (uberblock) is updated last, atomically. If power fails before the uberblock update, the write never happened. If it fails after, the write is complete. There is no inconsistent middle state.
No fsck. Ever.
Because the on-disk state is always consistent, there is no need for a filesystem check
after an unclean shutdown. The pool imports in seconds. No fsck.
No journal replay. No multi-hour scan of a 10TB filesystem.
# Traditional filesystem: modify in place, pray power doesn't fail
# ZFS: write new block, update pointer, free old block
#
# Visualized:
#
# ext4 write: [block A] --overwrite--> [block A'] (old data gone)
# ZFS write: [block A] (kept) + [block B] (new data at new location)
# pointer updated: parent now points to B
# block A freed (or kept if snapshot exists)
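The write-new-then-switch-the-pointer sequence can be imitated in user space. What follows is a file-level analogy only, not how ZFS is implemented: the new version goes into a fresh file, and a symlink standing in for the parent block pointer is swapped with an atomic rename. The function name `cow_update` and the directory layout are invented for this illustration, and `mv -T` assumes GNU coreutils.

```shell
#!/usr/bin/env bash
# File-level analogy for copy-on-write (illustrative only, not ZFS internals).
# "current" is a symlink that plays the role of the parent block pointer.
set -eu

cow_update() {
    # usage: cow_update <dir> <new-content>
    local dir="$1" content="$2" new
    # 1. Write the new version to a new location; the old file is untouched.
    new=$(mktemp "$dir/block.XXXXXX")
    printf '%s\n' "$content" > "$new"
    # 2. Build the new pointer next to the old one.
    ln -s "$(basename "$new")" "$dir/current.tmp"
    # 3. Switch atomically: the rename replaces "current" in a single step.
    mv -fT "$dir/current.tmp" "$dir/current"
}

demo=$(mktemp -d)
cow_update "$demo" "version 1"
old_target=$(readlink "$demo/current")
cow_update "$demo" "version 2"

cat "$demo/current"          # the pointer now leads to the new data
cat "$demo/$old_target"      # the old block still exists, as it would under a snapshot
```

A reader that opens `current` at any moment sees either the old version or the new one, never a mix, which is the property the diagram above describes.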
The ZFS Storage Stack — Disks to Datasets
Understanding ZFS means understanding four layers: disks, vdevs, pools, and datasets. Every ZFS deployment follows this hierarchy.
Layer 1: Physical Disks
Raw block devices. HDDs, SSDs, NVMe, even files (for testing). ZFS consumes them directly. You don't partition them first — though kldload does create a small EFI partition for booting, the rest of the disk is given to ZFS as a raw partition.
Layer 2: VDEVs (Virtual Devices)
Disks are grouped into vdevs. A vdev is the redundancy unit. A 2-disk mirror is one vdev. A 6-disk RAIDZ2 is one vdev. A single disk is one vdev (no redundancy). You can also have special-purpose vdevs: SLOG (write intent log), L2ARC (read cache), and special (metadata acceleration). A pool is made of one or more vdevs. Data is striped across vdevs. If any data or special vdev is lost (more of its disks fail than its redundancy level can absorb), the entire pool is lost. Losing an L2ARC device costs only cached data.
Layer 3: Pool (zpool)
A pool is a collection of vdevs that presents a single, unified storage space.
All vdevs in a pool contribute their capacity. Data is striped across vdevs for performance.
The pool is the top-level container. You interact with it via zpool commands:
zpool create, zpool status, zpool scrub,
zpool add, zpool iostat.
Layer 4: Datasets and Zvols
On top of a pool live datasets (POSIX filesystems you mount and use)
and zvols (block devices for VMs, iSCSI, swap). Datasets are the unit of
management: each has its own properties (compression, encryption, quota, mountpoint,
and many more). Datasets form a hierarchy and inherit properties from their parents.
You interact with them via zfs commands: zfs create,
zfs snapshot, zfs send, zfs get, zfs set.
# The full stack in one example:
#
# Physical: /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/nvme0n1 /dev/nvme1n1
# | | | | | |
# VDEVs: [ mirror-0 ] [ mirror-1 ] [ special mirror ]
# sda + sdb sdc + sdd nvme0n1 + nvme1n1
# | | |
# Pool: [=================== rpool ===================]
# |
# Datasets: rpool/ROOT/centos (mountpoint=/)
# rpool/home (mountpoint=/home)
# rpool/home/alice (mountpoint=/home/alice, encryption=on)
# rpool/var/log (mountpoint=/var/log, quota=10G)
# rpool/vms (zvol for VM disk images)
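To try this hierarchy without spare hardware, a throwaway pool can be built on sparse files. The sketch below is an assumption-laden example rather than a recommended layout: the pool name `demo`, the directory `/tmp/zdemo` and the dataset names are placeholders. It only prints the commands unless `RUN=1` is set, because actually running them needs root and a loaded ZFS module.

```shell
#!/usr/bin/env bash
# Sketch: four sparse files -> two mirror vdevs -> one pool -> two datasets.
# Pool name, paths and dataset names are placeholders for illustration.
set -eu

POOL=demo
DIR=/tmp/zdemo

run() {
    # Print the command; execute it only when RUN=1 (requires root + ZFS).
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
}

plan() {
    run mkdir -p "$DIR"
    for i in 0 1 2 3; do
        run truncate -s 1G "$DIR/disk$i.img"      # layer 1: the "disks"
    done
    run zpool create "$POOL" \
        mirror "$DIR/disk0.img" "$DIR/disk1.img" \
        mirror "$DIR/disk2.img" "$DIR/disk3.img"  # layers 2 and 3: two vdevs, one pool
    run zfs create "$POOL/home"                   # layer 4: datasets
    run zfs create -o quota=100M "$POOL/logs"
    run zpool status "$POOL"
}

plan
# Clean up a real run with: zpool destroy demo && rm -r /tmp/zdemo
```

Invoked as `RUN=1 ./script.sh` by root, the same plan is executed instead of printed.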
Key Concepts — The Complete Map
Here is every major ZFS concept, what it does, and where to learn more. This is your roadmap to the rest of the wiki.
Pools & VDEVs
The foundation. Pools aggregate vdevs. Vdevs provide redundancy. Mirror for IOPS. RAIDZ for capacity. dRAID for large arrays. Pool topology is hard to change after creation, so choose carefully.
Datasets
Mountable POSIX filesystems with independent properties. Compression, encryption, quotas, reservations, record size, ACLs — all per-dataset. Datasets inherit from their parent and form a hierarchy. This is the unit of management in ZFS.
Zvols
Block devices backed by ZFS. Used for VM disk images, swap, iSCSI targets. Get all ZFS
features (snapshots, replication, compression) but present as /dev/zvol/...
rather than a mounted filesystem.
Snapshots
Point-in-time, read-only captures of a dataset. Created instantly (metadata-only operation).
Cost zero space until blocks diverge. Accessible via the hidden .zfs/snapshot/
directory. The foundation of backup, rollback, and replication.
Clones
Writable copies of a snapshot. Instant creation, zero initial space. Share all unchanged blocks with the source. Perfect for testing, dev environments, and VM templates. Promote a clone to make it independent of the source.
Send & Receive
Stream a dataset (or incremental changes since a snapshot) to another pool, another machine,
or a file. Block-level, encrypted end-to-end if desired. This is ZFS-native replication.
No rsync. No file-level crawl. The entire dataset with all metadata in one stream.
Checksumming
Every block has a checksum stored in its parent block pointer.
Every read is verified. If the checksum doesn't match, ZFS knows the data is corrupt
before returning it to your application. Default: fletcher4 (fast, not cryptographic).
Cryptographic options: sha256, sha512, skein, edonr, blake3.
Self-Healing
On a redundant pool (mirror or RAIDZ), when ZFS detects a checksum mismatch during a read, it automatically fetches the correct copy from another disk and repairs the bad block in place. No admin intervention. No downtime. The corruption is fixed before your application even knows it happened.
Scrubbing
zpool scrub reads every block in the pool and verifies every checksum.
On a redundant pool, it repairs any corruption it finds. Run it weekly or monthly.
It's a proactive integrity check that catches problems before they become data loss.
Compression
Transparent, per-dataset compression. lz4 is the default — nearly free
CPU cost with ~2x compression on typical data. zstd offers better ratios for
archival workloads. Compression often improves performance because fewer blocks
means fewer disk I/Os. Always leave it on.
Encryption
Native, per-dataset encryption (AES-256-GCM). Each dataset can have its own key. Encrypted datasets can be sent/received without decryption (raw send). Keys can be passphrases, keyfiles, or external key management systems. Encryption is set at dataset creation — it cannot be added later.
ARC (Adaptive Replacement Cache)
ZFS's read cache lives in RAM. ARC is far smarter than the Linux page cache — it uses a combination of recency and frequency to decide what stays cached. ARC grows to fill available RAM and shrinks under memory pressure. On a 64GB server, you might see 50GB of ARC. That's not a memory leak — that's your data being served at RAM speed.
L2ARC (Level 2 ARC)
An SSD-backed extension of ARC for when RAM isn't enough. Sits between RAM and spinning disks. Useful when your working set exceeds RAM but you still need fast reads. Not a write cache. Loses its contents on reboot (persistent L2ARC available in OpenZFS 2.0+).
SLOG (Separate ZFS Intent Log)
Accelerates synchronous writes by moving the ZFS Intent Log to a fast, power-loss-protected device. Only helps sync-write workloads (databases, NFS, iSCSI). Not a general write cache. Must use enterprise NVMe with power loss protection — consumer SSDs defeat the purpose.
Special VDEV
An SSD-based vdev that stores pool metadata and optionally small files. Dramatically
accelerates ls, find, du, and metadata-heavy
operations on HDD-based pools. Must be mirrored — losing an
unmirrored special vdev loses the entire pool.
Properties & Inheritance
Every dataset has properties: compression, encryption, quota, reservation, recordsize,
atime, exec, setuid, and dozens more. Properties are inherited from parent datasets.
Set a property on a parent and all children inherit it. Override on a child to diverge.
zfs get all pool/dataset shows everything. zfs inherit resets to parent.
Data Integrity — Why ZFS Exists
Data integrity is the reason ZFS was built. Not performance. Not features. The guarantee that when you read data back, it's exactly what you wrote. Every other ZFS feature — snapshots, compression, encryption, caching — is secondary to this core mission.
The problem ZFS solves is called silent data corruption (also known as bit rot). Hard drives lie. SATA cables flip bits. RAID controllers have firmware bugs. RAM without ECC corrupts data in transit. None of these produce I/O errors. Your application gets back data that looks fine but is subtly wrong. A JPEG with a few corrupted pixels. A database row with a garbled field. A binary that segfaults randomly. You don't find out until weeks or months later, long after your backups have been overwritten with the corrupted version.
ZFS solves this with three mechanisms:
End-to-end checksumming
Every block has a checksum stored in its parent block's pointer, not alongside the data itself. This is critical — if the checksum lived next to the data (like btrfs metadata), a misdirected write could corrupt both the data and its checksum together, making the corruption undetectable. ZFS stores checksums in the parent, which stores its checksum in its parent, all the way up to the uberblock (the root of the Merkle tree). A single bit flip anywhere in the tree is detected.
Self-healing with redundancy
When ZFS detects a checksum mismatch on a read, and the pool has redundancy (mirror or RAIDZ), ZFS reads the block from another copy. If that copy's checksum is valid, ZFS returns the good data to your application and overwrites the bad copy with the good one. The corruption is repaired transparently, during normal operation, with no downtime and no human intervention.
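The detect-then-repair logic can be modelled with two files standing in for the two sides of a mirror. This is a toy, not ZFS code: `sha256sum` stands in for whatever checksum the pool uses, the expected value is held in a shell variable the way a parent block pointer would hold it, and `checked_read` is a name invented here.

```shell
#!/usr/bin/env bash
# Toy model of checksum-verified reads on a two-way mirror (not ZFS code).
# The checksum is kept apart from both copies, as a block pointer would keep it.
set -eu

work=$(mktemp -d)
printf 'important data\n' > "$work/copyA"
cp "$work/copyA" "$work/copyB"
expected=$(sha256sum < "$work/copyA" | cut -d' ' -f1)   # lives in the "parent"

checked_read() {
    # usage: checked_read <expected-sha256> <copy>...
    # Finds the first copy whose checksum matches, repairs the others, prints it.
    local want="$1" good='' f
    shift
    for f in "$@"; do
        if [ "$(sha256sum < "$f" | cut -d' ' -f1)" = "$want" ]; then
            good=$f
            break
        fi
    done
    [ -n "$good" ] || { echo "unrecoverable: no valid copy" >&2; return 1; }
    for f in "$@"; do
        if ! cmp -s "$good" "$f"; then
            cp "$good" "$f"                      # heal the damaged copy
            echo "repaired $(basename "$f")" >&2
        fi
    done
    cat "$good"
}

# Silently corrupt one copy: same length, no I/O error, wrong content.
printf 'importent data\n' > "$work/copyA"

checked_read "$expected" "$work/copyA" "$work/copyB"
```

The caller receives the good data on stdout while the repair is reported on stderr. If no copy matches, the function fails instead of returning bad data, which mirrors what happens on a pool without redundancy.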
# See how many blocks ZFS has repaired automatically
zpool status tank
# NAME STATE READ WRITE CKSUM
# tank ONLINE 0 0 0
# mirror-0 ONLINE 0 0 0
# sda ONLINE 0 0 2 <-- 2 checksum errors, auto-repaired
# sdb ONLINE 0 0 0
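For monitoring, the CKSUM column can be pulled out with a small filter. This is a sketch that assumes the five-column device rows shown above (NAME, STATE, READ, WRITE, CKSUM); the function name `cksum_errors` is made up, and real output has more sections than this sample.

```shell
#!/usr/bin/env bash
# Sketch: list devices whose CKSUM counter is not zero.
set -eu

cksum_errors() {
    # Reads `zpool status` text on stdin. Device rows have the columns
    # NAME STATE READ WRITE CKSUM, so CKSUM is field 5.
    awk 'NF >= 5 && $2 ~ /^(ONLINE|DEGRADED|FAULTED|UNAVAIL)$/ && $5 != "0" { print $1, $5 }'
}

# Sample input copied from the output above; on a live system use:
#   zpool status tank | cksum_errors
cksum_errors <<'EOF'
        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sda     ONLINE       0     0     2
            sdb     ONLINE       0     0     0
EOF
```

Empty output means no device reported checksum errors, so the filter slots into a cron job or alerting hook without extra parsing.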
Proactive scrubbing
zpool scrub reads every block on every disk and verifies every checksum.
On a redundant pool, it repairs any corruption it finds. Without scrubbing, a corrupted
block might sit undetected until someone reads it — by which time the redundant
copy might also be corrupted. Scrubbing is ZFS's proactive defense against accumulated
bit rot.
# Start a scrub
zpool scrub rpool
# Check scrub progress
zpool status rpool
# scan: scrub in progress since Sat Apr 4 02:00:01 2026
# 1.23T scanned at 456M/s, 892G issued at 334M/s, 1.88T total
# 0 repaired, 47.45% done, 00:52:14 to go
# Schedule weekly scrubs (kldload does this by default)
systemctl enable zfs-scrub-weekly@rpool.timer
I've caught real corruption with scrubs. Twice on SATA cables that were slightly loose, once on a drive that had a firmware bug that only manifested on certain LBA ranges. In all three cases, ZFS detected and repaired the corruption automatically. On ext4 with mdraid, those same failures would have been silent — the data would have been wrong and I'd never have known. This is not theoretical. This is why I won't run production on anything else.
Performance Features
ZFS is not just safe — it's fast. The same architectural choices that enable data integrity also enable performance optimizations that traditional storage stacks can't match.
ARC — the smartest cache in the building
The Adaptive Replacement Cache is ZFS's in-RAM read cache. Unlike a plain LRU cache, ARC tracks both recency and frequency of access. A file read once doesn't evict a file read a hundred times. ARC automatically tunes the balance between recently-accessed and frequently-accessed data. On a server with 64GB of RAM, ARC can grow to tens of gigabytes; on Linux the default ceiling has historically been half of physical RAM, and zfs_arc_max raises or lowers it. This is not a memory leak. It's your hot data being served at RAM speed. ARC shrinks when applications need memory, although on Linux the release is not always instantaneous.
# Check ARC statistics
arc_summary
# Key metrics to watch:
# ARC size: current cache size
# Target size (max): maximum cache will grow to
# ARC hit ratio: percentage of reads served from cache (aim for >90%)
# Demand data hits: cache hits for actual application reads
# Limit ARC to 8 GiB (useful for VM hosts or memory-constrained systems).
# Run as root. This changes the running system only and is lost on reboot.
echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max
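The byte value above is simply 8 × 1024³. A small helper can derive it for other sizes and emit the module option line that makes the limit persistent. Treat this as a sketch: the helper names are invented here, and `/etc/modprobe.d/zfs.conf` is the conventional location for the line rather than something this page specifies.

```shell
#!/usr/bin/env bash
# Sketch: derive the zfs_arc_max value and the modprobe line that persists it.
set -eu

gib_to_bytes() {
    # usage: gib_to_bytes <GiB>
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

arc_max_option() {
    # Prints a line suitable for a modprobe config file such as
    # /etc/modprobe.d/zfs.conf (conventional path, adjust for your distro).
    printf 'options zfs zfs_arc_max=%s\n' "$(gib_to_bytes "$1")"
}

arc_max_option 8
```

On systems that load the ZFS module from the initramfs (root on ZFS, for example), the initramfs usually has to be regenerated before the persisted value takes effect at boot.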
Compression — less I/O, more throughput
LZ4 compression is so fast that the CPU time to compress or decompress a block is usually less than the disk time saved by not reading or writing those bytes. On compressible data (config files, logs, source code, documents), LZ4 commonly achieves 2–3x compression. At 2.5x, a 1TB dataset uses about 400GB on disk, and disk-bound reads and writes finish sooner because fewer bytes have to move. Already-compressed data (video, archives) gains little. Compression improves performance. Leave it on. Always.
# Check compression ratio on a dataset
zfs get compressratio,used,logicalused rpool/home
# NAME PROPERTY VALUE SOURCE
# rpool/home compressratio 2.14x -
# rpool/home used 18.2G -
# rpool/home logicalused 38.9G -
# You're storing 38.9GB of data in 18.2GB of space.
# Change compression algorithm per-dataset
zfs set compression=zstd rpool/archive # better ratio, more CPU
zfs set compression=lz4 rpool/home # fast, good ratio (default)
zfs set compression=off rpool/vms/images # pre-compressed VM images
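The relationship between `used`, `logicalused` and the ratio is plain division. The sketch below reproduces the arithmetic for the sample figures above; ZFS maintains `compressratio` itself, so these helpers (whose names are invented here) are only a way to sanity-check or script around the numbers.

```shell
#!/usr/bin/env bash
# Sketch: the arithmetic behind compressratio, using used and logicalused.
set -eu

compress_ratio() {
    # usage: compress_ratio <used> <logicalused>   (same unit for both)
    awk -v u="$1" -v l="$2" 'BEGIN { printf "%.2fx\n", l / u }'
}

space_saved_pct() {
    # usage: space_saved_pct <used> <logicalused>
    awk -v u="$1" -v l="$2" 'BEGIN { printf "%.0f%%\n", (1 - u / l) * 100 }'
}

# Figures from the sample output above: 18.2G used, 38.9G logical.
compress_ratio 18.2 38.9
space_saved_pct 18.2 38.9

# On a live system, -p prints exact byte counts (values are assumed to print
# in the order requested):
#   compress_ratio $(zfs get -Hp -o value used,logicalused rpool/home)
```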
Prefetch & aggregation
ZFS detects sequential read patterns and prefetches upcoming blocks before your application asks for them. It also aggregates small writes into large transactions (transaction groups, or TXGs) and flushes them periodically. This transforms random I/O patterns into sequential disk writes, which is dramatically faster on spinning disks and reduces write amplification on SSDs.
Adaptive record size
The recordsize property sets the maximum block size per dataset. Databases
with 8KB pages get recordsize=8K. Media streaming gets recordsize=1M.
General workloads use the default 128K. Matching record size to workload
reduces read amplification (reading more data than needed) and write amplification
(rewriting more data than changed).
# Tune recordsize per workload
zfs set recordsize=8K rpool/srv/postgres # PostgreSQL 8K pages
zfs set recordsize=16K rpool/srv/mysql # MySQL/InnoDB 16K pages
zfs set recordsize=1M rpool/srv/media # large sequential reads
zfs set recordsize=128K rpool/home # general purpose (default)
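Why the mismatch hurts can be shown with one division. When a small random read misses the cache, the whole record has to be read so its checksum can be verified. The helper below is a worst-case back-of-the-envelope model under that assumption (power-of-two sizes, no compression, cold cache), and `read_amplification` is a name invented for this sketch.

```shell
#!/usr/bin/env bash
# Sketch: worst-case read amplification for small random reads (cold cache).
set -eu

read_amplification() {
    # usage: read_amplification <recordsize-KiB> <io-size-KiB>
    # A whole record is read and verified even if the application asked for
    # only <io-size-KiB> of it.
    local rec="$1" io="$2"
    if [ "$io" -ge "$rec" ]; then
        echo 1
    else
        echo $(( rec / io ))
    fi
}

read_amplification 128 8    # 8K database pages on the default 128K recordsize
read_amplification 8 8      # the same pages with recordsize=8K
```

With the default recordsize, an 8K page read can cost sixteen times the I/O the application asked for; with a matched recordsize, the factor drops to one.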
Administrative Simplicity — Two Commands
The entire ZFS administration surface is two commands:
zpool — manages the physical layer
Create pools. Add disks. Replace failed disks. Check pool health. Run scrubs. View I/O statistics. Import and export pools. Everything about the physical storage.
zpool create rpool mirror sda sdb # create a mirrored pool
zpool status # show pool health and disk status
zpool iostat -v 5 # live I/O statistics every 5 seconds
zpool scrub rpool # verify all checksums
zpool replace rpool sda sdc # hot-replace a failed disk
zpool add rpool mirror sde sdf # expand pool with a new mirror pair
zpool history # audit log of every pool operation
zfs — manages the logical layer
Create datasets. Set properties. Take snapshots. Send/receive replication streams. Mount, unmount, encrypt, decrypt. Everything about how data is organized and managed.
zfs create rpool/home/alice # create a dataset
zfs set quota=100G rpool/home/alice # limit to 100GB
zfs set compression=zstd rpool/archive # per-dataset compression
zfs snapshot rpool/home/alice@before-risky # point-in-time snapshot
zfs rollback rpool/home/alice@before-risky # undo everything since snapshot
zfs send rpool/home/alice@snap | \
ssh backup zfs recv tank/backup/alice # replicate to remote machine
zfs get all rpool/home/alice # show every property
zfs list -t all -r rpool # list everything in the pool
That's it. No fdisk. No mdadm. No pvcreate,
vgcreate, lvcreate. No mkfs. No resize2fs.
No fsck. Two commands replace an entire toolchain of legacy utilities,
each with its own syntax, its own configuration files, and its own failure modes.
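The `zfs send` line above ships a full stream. Follow-up runs normally send only the difference between two snapshots with `-i`. The helper below just prints such a pipeline so it can be reviewed before use. The dataset, snapshot names, host and target are placeholders, and `-u` (receive without mounting) is one reasonable choice for a backup target rather than a requirement.

```shell
#!/usr/bin/env bash
# Sketch: print the pipeline for one incremental replication step.
# Dataset, snapshot, host and target names are placeholders.
set -eu

incremental_send_cmd() {
    # usage: incremental_send_cmd <dataset> <prev-snap> <new-snap> <host> <target>
    local ds="$1" prev="$2" new="$3" host="$4" target="$5"
    printf 'zfs send -i %s@%s %s@%s | ssh %s zfs recv -u %s\n' \
        "$ds" "$prev" "$ds" "$new" "$host" "$target"
}

incremental_send_cmd rpool/home/alice monday tuesday backup tank/backup/alice
```

The receiving side must already hold the earlier snapshot (`monday` in this example), otherwise the incremental stream is rejected.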
This is the thing that hooks people. You go from needing to remember
mdadm --create --level=1 --raid-devices=2 and pvcreate and
vgcreate and lvcreate -L 50G and mkfs.ext4
and editing /etc/fstab and running resize2fs when you need more
space... to just zpool create and zfs create. Five tools become
two. Twelve steps become two. And the two are consistent, predictable, and well-documented.
Once you internalize zpool for physical and zfs for logical,
you never look back.
The Properties System — Inheritance and Override
Every ZFS dataset has dozens of properties that control its behavior. Properties follow an inheritance model: set a property on a parent dataset, and all children inherit it. Override on a specific child to diverge. Reset a child to re-inherit from its parent.
# Set compression on the pool — all datasets inherit it
zfs set compression=lz4 rpool
zfs get compression rpool/home # inherited from rpool: lz4
zfs get compression rpool/var/log # inherited from rpool: lz4
# Override on a specific dataset
zfs set compression=zstd rpool/archive
zfs get compression rpool/archive # local: zstd (overridden)
# Check where a property value came from
zfs get -H -o name,property,value,source compression rpool/home
# rpool/home compression lz4 inherited from rpool
# Reset to inherited value
zfs inherit compression rpool/archive
# Now rpool/archive inherits lz4 from rpool again
This is how you manage storage policy at scale. Set your defaults at the pool level.
Override per-dataset only where workloads demand it. Every property is visible via
zfs get all — nothing is hidden in config files or undocumented settings.
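The lookup rule behind inheritance is short enough to model directly: start at the dataset and walk up until some ancestor has a locally set value. The following is a toy with a two-line lookup table in a temp file, not how ZFS stores properties, and the function names are invented for this sketch.

```shell
#!/usr/bin/env bash
# Toy model of property inheritance (a lookup table in a temp file, not ZFS code).
set -eu

props=$(mktemp)
# Locally set values only, one per line: "<dataset> <value>"
printf '%s\n' 'rpool lz4' 'rpool/archive zstd' > "$props"

local_value() {
    # Print the value set directly on a dataset, or nothing.
    awk -v ds="$1" '$1 == ds { print $2 }' "$props"
}

effective() {
    # usage: effective <dataset>
    # Walk up the hierarchy until some ancestor has a locally set value.
    local ds="$1" val
    while :; do
        val=$(local_value "$ds")
        if [ -n "$val" ]; then
            if [ "$ds" = "$1" ]; then
                echo "$val local"
            else
                echo "$val inherited from $ds"
            fi
            return 0
        fi
        case $ds in
            */*) ds=${ds%/*} ;;              # step up to the parent dataset
            *)   echo "- default"; return 0 ;;
        esac
    done
}

effective rpool/home/alice
effective rpool/archive
# The equivalent of `zfs inherit compression rpool/archive`: drop the local value.
grep -v '^rpool/archive ' "$props" > "$props.new" && mv "$props.new" "$props"
effective rpool/archive
```

The three calls report an inherited value, a local override, and the inherited value again once the override is removed, matching the source column that `zfs get` prints.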
Properties divide into two categories: native properties (managed by ZFS itself:
compression, encryption, quota, recordsize)
and user properties (arbitrary key-value pairs you define:
zfs set com.company:backup=daily rpool/srv). User properties are useful for
automation — tag datasets with metadata and let scripts query them.
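A script can select datasets by tag with one filter over `zfs get -H` output, which is tab-separated and shows `-` for unset values. This sketch reuses the `com.company:backup` property from the text; the function name `datasets_tagged` and the sample rows are invented for illustration.

```shell
#!/usr/bin/env bash
# Sketch: pick out datasets carrying a given user-property value.
set -eu

datasets_tagged() {
    # usage: <zfs get -H -o name,value ...> | datasets_tagged <value>
    # Input is tab-separated "name<TAB>value"; unset properties show "-".
    awk -F '\t' -v want="$1" '$2 == want { print $1 }'
}

# Sample rows in the shape `zfs get -H -o name,value` produces. Live usage:
#   zfs get -H -r -t filesystem,volume -o name,value com.company:backup rpool \
#       | datasets_tagged daily
printf 'rpool/srv\tdaily\nrpool/home\t-\nrpool/archive\tweekly\n' | datasets_tagged daily
```

Because child datasets inherit user properties too, tagging a parent once is enough to bring its whole subtree into a backup job.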
ZFS in the Enterprise — Who Runs It
ZFS is not experimental. It's not hobbyist software. It runs in production at scale across every industry. Here's who trusts their data to ZFS:
Proxmox VE
The most popular open-source virtualization platform uses ZFS as a first-class storage backend. Proxmox + ZFS powers thousands of production hypervisors running VMs and containers. ZFS snapshots integrate with Proxmox backup, live migration, and replication. If you run Proxmox, you're already in the ZFS ecosystem.
iXsystems / TrueNAS
iXsystems builds TrueNAS (formerly FreeNAS), the most widely deployed ZFS-based NAS platform. TrueNAS CORE runs on FreeBSD + ZFS. TrueNAS SCALE runs on Linux + OpenZFS. iXsystems is also one of the largest contributors to the OpenZFS project and employs several core OpenZFS developers. They sell enterprise storage appliances built entirely on ZFS.
Netflix
Netflix's Open Connect CDN serves a significant fraction of global internet traffic from appliances running FreeBSD. By Netflix engineers' own public accounts, ZFS holds the appliances' root filesystem, while the video catalog itself is served from UFS so that sendfile can stay zero-copy. ZFS protects the operating system image on those machines; it is not the path your video takes.
Joyent (Samsung)
Joyent built their entire SmartOS cloud platform on illumos + ZFS. SmartOS uses ZFS for everything: OS boot (from ZFS), container storage (ZFS datasets), VM storage (ZFS zvols), and backup (ZFS send/receive). Samsung acquired Joyent in 2016, and SmartOS is still developed as open source.
Delphix
Delphix built a database virtualization platform on ZFS. They use ZFS clones to create instant, space-efficient copies of production databases for development and testing. A 10TB Oracle database cloned in seconds, using near-zero additional space. Delphix is one of the largest contributors to OpenZFS.
Klara Inc.
Klara (founded by former FreeBSD developers) provides commercial OpenZFS development, consulting, and support. They contribute significant features to OpenZFS including RAIDZ expansion, block cloning, and performance improvements. If a major OpenZFS feature landed in the last few years, Klara probably helped build it.
Lawrence Livermore National Laboratory
LLNL hosts the OpenZFS on Linux project and uses ZFS for high-performance computing storage. Their HPC clusters depend on ZFS for data integrity and performance on petabyte-scale datasets. The ZFS on Linux port exists because of LLNL.
The entire FreeBSD ecosystem
FreeBSD has shipped ZFS as a first-class, in-kernel filesystem since 2008. It's the default root filesystem recommendation. Every FreeBSD server, every pfSense firewall with ZFS, every FreeNAS/TrueNAS box, every FreeBSD jail host — they all run ZFS. FreeBSD's ZFS integration is the most mature on any platform.
The "who uses ZFS" question matters because of the licensing FUD. People hear "not in the mainline Linux kernel" and assume it's risky or unsupported. Meanwhile, Netflix boots the FreeBSD appliances behind its CDN from it. Proxmox runs thousands of production hypervisors on it. iXsystems sells enterprise storage appliances on it. The code is battle-tested at a scale most organizations will never reach. The licensing situation is a legal nuance, not a technical risk. If your legal team approves CDDL (and most do), there is no safer storage platform available.
ZFS on Linux — OpenZFS and DKMS
ZFS is not part of the Linux kernel. It ships as an out-of-tree kernel module built via DKMS (Dynamic Kernel Module Support). When you install a new kernel, DKMS recompiles the ZFS module against the new kernel headers. This usually works seamlessly. When it doesn't — missing headers, compiler mismatch, ABI change — the module fails to build and ZFS doesn't load on next boot.
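One way to catch a failed DKMS build before the reboot is to check that every installed kernel actually has a ZFS module on disk. The sketch below assumes modules live under /lib/modules/<kernel>/ and that the module file is named zfs.ko, possibly with a compression suffix; the function name is hypothetical.

```shell
# List kernels under a modules directory that have no zfs.ko built for them.
# $1 is the modules root (default /lib/modules); prints one kernel per line.
kernels_missing_zfs() {
    root="${1:-/lib/modules}"
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        # Module files may be compressed: zfs.ko, zfs.ko.xz, zfs.ko.zst, ...
        if [ -z "$(find "$dir" -name 'zfs.ko*' -print 2>/dev/null | head -n 1)" ]; then
            basename "$dir"
        fi
    done
}

# Usage after a kernel update, before rebooting:
#   kernels_missing_zfs && echo "(any kernel listed above has no ZFS module)"
```

An empty result means every kernel directory contains a module. It does not prove the module loads, only that the build produced one.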
The reason ZFS isn't in the mainline kernel is licensing. ZFS is licensed under Sun's CDDL (Common Development and Distribution License). The Linux kernel is GPLv2. The FSF and some kernel developers consider these licenses incompatible for linked distribution. Linus Torvalds has said he will not merge ZFS without a signed statement from Oracle's legal counsel that doing so is permissible. Ubuntu ships prebuilt ZFS modules in their kernel packages; Canonical's legal team considers it permissible. Debian ships ZFS as a DKMS package in its contrib section. Fedora and RHEL do not ship ZFS at all, so users install DKMS or kmod packages from the OpenZFS project's own repositories.
Current OpenZFS releases: OpenZFS 2.2.x is the long-running stable branch (first released in 2023); its headline features are block cloning and Linux 6.x compatibility. OpenZFS 2.3, released in January 2025, adds RAIDZ expansion, fast dedup, and Direct I/O. Each release supports a range of Linux kernels from ~5.x through 6.x; check the release notes for the exact bounds before a kernel upgrade.
# Check your OpenZFS version
zfs --version
# zfs-2.2.7-1
# zfs-kmod-2.2.7-1
# Check module is loaded
lsmod | grep zfs
# zfs 4308992 6
# spl 135168 1 zfs
# Show the module's metadata (path, version, license)
modinfo zfs | head -5
# filename: /lib/modules/6.8.0/extra/zfs/zfs.ko
# version: 2.2.7-1
# license: CDDL
# author: OpenZFS
# description: ZFS
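The two lines printed by zfs --version are the userland tools and the kernel module, and they can drift apart when a package update lands but the module was not rebuilt. A minimal sketch of a check, assuming the two-line output format shown above (the function name is hypothetical):

```shell
# Compare the userland and kernel-module versions reported by `zfs --version`.
# Reads the two-line output on stdin; exits 0 if they match, 1 otherwise.
zfs_versions_match() {
    awk '
        /^zfs-kmod-/ { kmod = substr($0, 10); next }
        /^zfs-/      { user = substr($0, 5) }
        END {
            if (user != "" && user == kmod) { print "match: " user; exit 0 }
            print "mismatch: userland=" user " kmod=" kmod
            exit 1
        }'
}

# Usage on a live system:
#   zfs --version | zfs_versions_match
```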
What Makes ZFS Different from Everything Else
Other filesystems have copied individual ZFS features. Btrfs has copy-on-write and snapshots. XFS has scalability. ext4 has stability. But none of them are ZFS, and here's why:
Integration, not aggregation
Btrfs bolted RAID onto a filesystem. mdraid + LVM + ext4 stacks independent tools. ZFS was designed as one integrated system from day one. The RAID layer knows about the filesystem. The filesystem knows about the cache. The cache knows about the checksums. Every layer cooperates. This is why ZFS can self-heal, why snapshots are instant, why compression actually speeds things up. You can't get these properties from independent tools that don't know about each other.
20 years of production hardening
ZFS shipped in 2005. It has been in continuous production use for two decades. An enormous range of edge cases, failure modes, and corruption scenarios has been found and fixed across millions of machines running billions of hours of I/O. Btrfs is younger, less deployed, and has had stability issues with RAID5/6 that persist to this day. mdadm is stable but dumb: it doesn't know about the data it's protecting. ZFS has the combined experience of Sun, Oracle, Netflix, iXsystems, and the entire FreeBSD ecosystem burned into its code.
The checksumming guarantee
ext4 checksums metadata and the journal but not data. XFS checksums metadata but not data. Btrfs checksums both, keeping data checksums in a separate checksum tree and each metadata block's checksum in that block's own header. ZFS checksums everything and stores every checksum in the parent block pointer, forming a Merkle tree rooted at the uberblock. Short of a checksum collision, which is astronomically unlikely, there is no arrangement of corrupted bits that ZFS cannot detect.
Operational simplicity at scale
A 100-server fleet with mdraid + LVM + ext4 requires managing hundreds of mdadm
configs, /etc/fstab entries, LVM metadata, and fsck schedules.
The same fleet with ZFS requires zpool status and zfs list.
Pools are self-describing. Datasets are self-documenting. Properties show where they came
from. History is built in. The operational overhead at scale is dramatically lower.
I'm not saying ZFS is perfect. The licensing situation is real. DKMS breaks sometimes. Pool design is permanent. Memory usage surprises people. These are real trade-offs and they're documented honestly in the section below. But the gap between ZFS and everything else is not close. It's not "ZFS is 10% better." It's "ZFS is a fundamentally different class of system." Once you've used it, going back to ext4 + mdraid + LVM feels like going back to horses after driving a car. The car has its own problems, but you're never going back to horses.
The Quick Comparison — ZFS vs Everything Else
For the full comparison, see ZFS vs Everything Else. Here's the snapshot:
| Feature | ZFS | ext4 + mdraid + LVM | btrfs | XFS + mdraid + LVM |
|---|---|---|---|---|
| Data checksumming | Every block, Merkle tree | Metadata and journal only | Yes, separate checksum tree | Metadata only |
| Self-healing | Automatic with redundancy | No | Partial (RAID1 only) | No |
| Snapshots | Instant, unlimited | LVM snapshots (slow, fragile) | Instant, COW-based | LVM snapshots (slow, fragile) |
| Compression | Per-dataset, transparent | No | Per-volume, transparent | No |
| Native encryption | Per-dataset, AES-256-GCM | LUKS (whole-volume only) | Not yet | LUKS (whole-volume only) |
| Block-level replication | zfs send/recv | rsync (file-level, slow) | btrfs send/recv | rsync (file-level, slow) |
| RAID5/6 stability | RAIDZ1/2/3 production-stable | mdraid stable | RAID5/6 write hole (unsafe) | mdraid stable |
| fsck required | Never | Yes (hours on large FS) | Rarely, but possible | Yes |
| In mainline kernel | No (DKMS / out-of-tree) | Yes | Yes | Yes |
| Production maturity | 20 years, massive scale | Decades, ubiquitous | Improving, RAID5/6 still risky | Decades, enterprise proven |
The one row ZFS loses is "in mainline kernel." That's real, and it has real consequences: DKMS can break, some distros don't package it, corporate legal teams get nervous. But look at the rest of the table. On every data-safety row, ZFS has the strongest entry. If you care about your data being correct (not just present, but correct), ZFS is the option that actually verifies it on every read. Everything else is trusting the hardware to not lie. Hardware lies.
How kldload Leverages ZFS
kldload exists because ZFS is the best storage platform available and installing it
on Linux is unreasonably hard. Every kldload install — across all eight supported
distros (CentOS Stream, Debian, Ubuntu, Fedora, RHEL, Rocky Linux, Arch Linux, Alpine Linux)
— boots on ZFS on root. Not a data partition on ZFS with ext4 for boot.
The entire OS, from / to /home to /var/log,
lives on ZFS datasets.
Pre-built ZFS module
kldload builds the OpenZFS kernel module at image creation time, matching the exact kernel version baked into the ISO. No DKMS compilation at install time. No missing headers. No compiler mismatches. The module is ready. The DKMS package is installed for future kernel updates, but the critical first boot doesn't depend on it.
ZFSBootMenu
kldload uses ZFSBootMenu instead of GRUB for ZFS-on-root systems. ZFSBootMenu understands ZFS natively — it can list boot environments, roll back to snapshots, boot into clones, and manage multiple OS installs on the same pool. A bad kernel update is a 15-second rollback, not a rescue USB adventure.
Sane defaults
Every pool kldload creates uses ashift=12, compression=lz4,
acltype=posixacl, xattr=sa, dnodesize=auto, and
autotrim=on. Datasets are split by function (/home,
/var/log, /tmp, /srv) with appropriate properties
per dataset. /tmp gets sync=disabled, exec=off,
setuid=off, and devices=off for security hardening.
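For readers building a pool by hand, the defaults above split into two kinds: pool properties, passed to zpool create with -o, and filesystem properties, passed with -O so every dataset inherits them. The sketch below only prints the command; it is not kldload's installer code, and the pool name and device paths are placeholders.

```shell
# Print (do not run) a zpool create command carrying the defaults listed above.
# Pool-level properties use -o; filesystem properties inherited by every
# dataset use -O. Pool name and devices are placeholders supplied by the caller.
print_pool_create() {
    pool="$1"; shift
    printf 'zpool create'
    printf ' -o %s' ashift=12 autotrim=on
    printf ' -O %s' compression=lz4 acltype=posixacl xattr=sa dnodesize=auto
    printf ' %s' "$pool" "$@"
    printf '\n'
}

print_pool_create tank mirror /dev/disk/by-id/DISK-A /dev/disk/by-id/DISK-B
```

Printing first makes it easy to review the exact command before committing to settings such as ashift, which cannot be changed later.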
Automated snapshots
Every kldload install takes a factory snapshot at install time — your known-good baseline. Hourly automatic snapshots are enabled by default (keeping 48 hours). kldload tools take pre-upgrade snapshots before package operations. You always have a rollback point.
Cross-distro consistency
Whether you install CentOS, Debian, Ubuntu, Fedora, RHEL, Rocky, Arch, or Alpine, the ZFS configuration is identical. Same pool properties. Same dataset layout. Same snapshot automation. Same boot chain. The distro is the userland; ZFS is the foundation. Move between distros without re-learning your storage.
Five-Minute Quick Start — See ZFS in Action
Reading about ZFS is one thing. Running it is another. Here's a complete hands-on walkthrough you can run right now, as root, on any Linux machine with the OpenZFS packages installed. It uses plain files as virtual disks (file-backed vdevs), so no real disks are required.
# 1. Create two 1GB files to simulate disks
truncate -s 1G /tmp/disk1.img /tmp/disk2.img
# 2. Create a mirrored pool (two-disk mirror, like RAID1)
zpool create -o ashift=12 testpool mirror /tmp/disk1.img /tmp/disk2.img
# 3. Enable compression (always do this)
zfs set compression=lz4 testpool
# 4. Create some datasets
zfs create testpool/data
zfs create testpool/data/important
zfs create testpool/scratch
# 5. Check what you've built
zpool status testpool
zfs list -r testpool
# 6. Write some data
cp /etc/hosts /testpool/data/important/
echo "Hello, ZFS" > /testpool/scratch/hello.txt
# 7. Take a snapshot (instant, regardless of data size)
zfs snapshot testpool/data/important@backup1
# 8. Delete the data
rm /testpool/data/important/hosts
# 9. Roll back to the snapshot (instant recovery)
zfs rollback testpool/data/important@backup1
cat /testpool/data/important/hosts # It's back.
# 10. Check compression
zfs get compressratio testpool
# 11. See the snapshot in the hidden .zfs directory
ls /testpool/data/important/.zfs/snapshot/backup1/
# 12. Clean up
zpool destroy testpool
rm /tmp/disk1.img /tmp/disk2.img
That's a mirrored pool, three datasets with inheritance, transparent compression, an instant
snapshot, instant rollback, and browseable snapshot history. In twelve commands. No partition
tables. No mdadm. No LVM. No mkfs. No fstab.
No fsck. This is what Jeff Bonwick meant by "as easy as filling a glass of water."
Every time I demo ZFS to someone who's been managing mdraid + LVM + ext4,
the snapshot rollback is the moment their expression changes. They're used to "I deleted the file,
it's gone, where's the backup tape?" With ZFS it's zfs rollback and the file is back.
Sub-second, regardless of dataset size. It's not magic — it's the copy-on-write architecture
making "undo" a first-class operation. But it feels like magic the first time you see it.
Under the Hood — How ZFS Manages Data
Understanding ZFS at the architectural level isn't necessary to use it, but it explains why ZFS behaves the way it does. Here's what happens when your application writes a file.
Transaction groups (TXGs)
ZFS batches writes into transaction groups. Instead of writing each block
immediately, ZFS accumulates writes in memory for up to zfs_txg_timeout seconds
(default: 5). When the TXG is full or the timeout fires, the entire group is committed to
disk as an atomic unit. This converts random application writes into large sequential disk writes
— dramatically better for both HDDs and SSDs.
# Watch pool I/O in real time; writes arrive in bursts as each TXG commits
zpool iostat -v 1
# The TXG timeout (default 5 seconds, rarely needs tuning)
cat /sys/module/zfs/parameters/zfs_txg_timeout
# 5
The Merkle tree
ZFS organizes all data in a Merkle tree (a hash tree). Every data block has a checksum. That checksum is stored in the block's parent pointer. The parent has its own checksum stored in its parent. This chain extends all the way up to the uberblock, the root of the entire pool. A single corrupted bit anywhere in the tree is caught the moment that block is read, because the block no longer matches the checksum its parent holds.
This is fundamentally different from ext4 (which checksums metadata and the journal but not data). Btrfs also checksums data, but keeps those checksums in a separate checksum tree rather than in the pointer that references each block. ZFS's Merkle tree ensures that a misdirected write, where the drive writes to the wrong location, is always detected: the block found at the expected address no longer matches the checksum its parent holds. If the checksum lived next to the data, a misdirected write could carry a matching checksum along with it, making the corruption invisible.
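The parent-held checksum idea can be modeled in a few lines of shell. This is a toy, not ZFS code: ordinary files stand in for blocks, sha256sum stands in for the block checksum, and a text file named parent plays the role of the block pointers.

```shell
# Toy model of parent-held checksums. Files stand in for blocks.
work=$(mktemp -d)
printf 'block A\n' > "$work/a"
printf 'block B\n' > "$work/b"

# The "parent" records each child's checksum, as a block pointer would.
checksum() { sha256sum < "$1" | cut -d' ' -f1; }
printf 'a %s\nb %s\n' "$(checksum "$work/a")" "$(checksum "$work/b")" > "$work/parent"

# Verify every child against the checksum stored in the parent.
verify() {
    status=0
    while read -r name sum; do
        [ "$(checksum "$work/$name")" = "$sum" ] || { echo "MISMATCH $name"; status=1; }
    done < "$work/parent"
    return "$status"
}

verify && echo "clean"

# Simulate a misdirected write: block B's contents land on top of block A.
# A is internally well-formed, but it is not the block the parent expects.
cp "$work/b" "$work/a"
verify || echo "corruption detected"
```

The overwritten block is perfectly valid data in its own right, which is exactly why a checksum stored alongside it would not help. Only the parent's record of what should be at that location exposes the problem.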
The uberblock — root of trust
The uberblock is ZFS's root block. It contains the transaction group number, a timestamp, and a pointer (with checksum) to the root of the block tree. ZFS keeps a ring of uberblock slots in every vdev label (128 slots on 512-byte-sector devices, 32 with ashift=12) and writes new ones round-robin. On pool import, ZFS finds the uberblock with the highest valid transaction group number, which is the most recent consistent state. Because copy-on-write means live data is never overwritten in place, the pool is always in a consistent state. There is no filesystem journal to replay for consistency, no fsck to run.
# View the current uberblock
zdb -u rpool
# Uberblock[37]
# magic = 0x00bab10c
# version = 5000
# txg = 1847293
# guid_sum = 8412736491827364918
# timestamp = 1775268247 UTC = Sat Apr 4 02:04:07 2026
# rootbp = [L0 DMU objset] ...
# checkpoint_txg = 0
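The import-time selection rule (highest transaction group among the slots whose checksum verifies) is simple enough to model. The sketch below is a toy under an assumed input format of "slot txg ok" per line, where ok is 1 if the slot verified; it does not read real labels.

```shell
# Toy model of import-time uberblock selection. Input lines: "slot txg ok",
# where ok is 1 if the slot's checksum verified. Prints the winning slot.
pick_uberblock() {
    awk '$3 == 1 && (!found || $2 > best) { best = $2; slot = $1; found = 1 }
         END { if (found) print "slot " slot " txg " best; else exit 1 }'
}

# Slot 38 holds the newest txg but was torn by a power cut, so slot 37 wins.
printf '36 1847292 1\n37 1847293 1\n38 1847294 0\n' | pick_uberblock
```

A half-written uberblock is simply skipped, and the pool comes up at the previous consistent transaction group.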
ZIL and SLOG — the write path
When an application requests a synchronous write (databases, NFS, anything using
O_SYNC or fsync()), the application expects the data to be on stable
storage before the write call returns. ZFS handles this via the ZIL (ZFS Intent Log).
The ZIL writes a compact record of the pending transaction to log blocks allocated from the pool. Once
the ZIL write completes, ZFS tells the application the write is safe. Later, the TXG flush writes
the full data blocks. If power fails between the ZIL write and the TXG flush, ZFS replays the
ZIL on import to recover the in-flight transactions.
A SLOG (Separate Log device) is simply the ZIL on a dedicated, fast device instead of the pool's main disks. An NVMe drive with power-loss protection is the ideal SLOG. The SLOG only accelerates synchronous writes. Asynchronous writes (which are the majority for most workloads) bypass the ZIL entirely and go straight into TXGs.
The SLOG is the most misunderstood component in ZFS. People add SSDs
as SLOGs thinking they'll speed up all writes. They won't. The SLOG only helps synchronous
writes: fsync() callers, NFS, iSCSI, databases, and datasets set to sync=always.
If your workload is mostly async (file servers, media streaming, general Linux usage), a SLOG
does nothing. I've watched people spend $400 on an Optane drive for a SLOG on a media server
and wonder why performance didn't change. Know your workload before you buy hardware.
The Honest Trade-offs
Read these before you commit. Every technology has trade-offs. ZFS tells you about them upfront instead of letting you discover them at 3 AM.
Licensing. ZFS is CDDL. Linux is GPLv2. The two are widely considered incompatible for linked distribution. This is why ZFS isn't in the mainline kernel and, barring a relicense by Oracle, won't be. It ships as an out-of-tree module. The code is production-ready: Oracle, Netflix, Joyent, and the entire FreeBSD ecosystem run it in production. But some organizations have legal teams that won't approve CDDL on GPL systems. Ask your legal team before you're deep into a project, not after.
Pool design is permanent. ashift cannot be changed after pool creation. On OpenZFS 2.2 and earlier, RAIDZ vdev width cannot be changed either; RAIDZ expansion in 2.3 can add disks to a RAIDZ vdev one at a time, but it cannot remove them or change the parity level. A mirror cannot become RAIDZ. Moving from RAIDZ1 to RAIDZ2 means creating a new pool and zfs send/recv everything over. This is the one decision you can't undo. Get it right the first time. Read the Pool Design page before you run zpool create.
Memory. ZFS uses RAM aggressively for caching (ARC). This is a feature — it's why ZFS is fast. But it surprises people whose monitoring alerts on "high memory usage." ARC releases memory under pressure, but tools like free and Grafana will show 80% used when the system is fine. Tune zfs_arc_max if you run memory-sensitive workloads alongside ZFS. See the Memory & ARC page.
Kernel updates. When the kernel updates, the ZFS module must be rebuilt. DKMS handles this automatically — unless it doesn't. Missing headers, ABI changes, gcc version mismatches — any of these silently break the build. The machine boots, ZFS doesn't load, monitoring says green. kldload mitigates this by pre-building the module at image time, but if you patch deployed machines in place, DKMS is still in the path. Best practice: treat machines as immutable. Rebuild the image, don't patch in place.
Not distributed. ZFS is local storage. It doesn't span machines like Ceph or GlusterFS. Replication (zfs send/recv) is asynchronous — the replica is always slightly behind the source. There is no automatic failover built in. If the primary dies, something has to promote the replica. For most workloads this is fine. If you need synchronous replication or sub-second failover, you need orchestration on top of ZFS.
Encryption key management is still yours. ZFS encrypts datasets beautifully. It does not manage the encryption keys. Where the passphrases or keyfiles are stored, how they're distributed, what happens if one is lost — that's your problem. ZFS shifts the question from "how do I encrypt" to "how do I manage the keys to the encryption." The data side is solved. The key side is your responsibility.
Scrub takes hours. zpool scrub reads every block to verify checksums. On a 10TB pool, that's 4–8 hours. During resilver (replacing a failed disk), performance degrades. This isn't unique to ZFS — mdraid has the same problem. But ZFS is honest about it. Use mirrors instead of RAIDZ if resilver speed matters to you.
ECC RAM recommendation. ZFS checksums data on disk but not in RAM. If your RAM flips a bit before ZFS writes the block, the corrupted data gets a valid checksum and the corruption is permanent. ECC RAM prevents this. ZFS doesn't require ECC — no filesystem does — but it's the one filesystem honest enough to make you think about it. Use ECC if you can. If you can't, ZFS is still better than ext4, which wouldn't have detected the corruption at all.
None of these are reasons not to use ZFS. They're reasons to use it correctly. Every filesystem has trade-offs. ZFS just tells you about them upfront.
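For the memory trade-off above, zfs_arc_max is specified in bytes, which is easy to get wrong by hand. A small sketch that prints the module option line for a cap given in GiB (the function name is hypothetical, and 4 GiB is only an example value):

```shell
# Print a modprobe options line capping the ARC at $1 GiB.
arc_max_option() {
    gib="$1"
    echo "options zfs zfs_arc_max=$((gib * 1024 * 1024 * 1024))"
}

arc_max_option 4
# Typical destination: /etc/modprobe.d/zfs.conf (read when the module loads).
# To apply to a running system, write the byte value to
# /sys/module/zfs/parameters/zfs_arc_max instead.
```

On a ZFS-on-root system the module loads from the initramfs, so the modprobe file usually only takes effect after the initramfs is regenerated.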
kldload Defaults — What's Set and Why
Every kldload install applies these defaults. Every default can be overridden.
Nothing is hidden. zfs get all rpool shows you everything.
Pool creation properties
ashift=12 4K sector alignment. Matches all modern drives. Permanent.
autotrim=on SSD TRIM. Free blocks returned to the drive automatically.
compression=lz4 Always on. Ratio depends on the data; CPU cost is negligible on modern hardware.
acltype=posixacl Required for systemd, containers, and most Linux applications.
xattr=sa Extended attributes stored in the dnode as system attributes, not in hidden directories. Faster.
dnodesize=auto Variable dnode size. Better metadata performance.
normalization=formD Unicode normalization. Consistent filename handling.
relatime=on Relaxed atime. Reduces write amplification vs full atime.
Dataset layout
rpool/ROOT/{hostname} mountpoint=/ Your OS. canmount=noauto (ZFSBootMenu controls it).
rpool/root mountpoint=/root Root home. Separate for snapshot isolation.
rpool/home mountpoint=/home User homes. Per-user child datasets.
rpool/srv mountpoint=/srv Application data.
rpool/opt mountpoint=/opt Optional packages.
rpool/usr/local mountpoint=/usr/local Local binaries.
rpool/var/cache mountpoint=/var/cache Package cache. Safe to destroy.
rpool/var/lib mountpoint=/var/lib State data (databases, containers).
rpool/var/log mountpoint=/var/log Logs. Separate so they can't fill root.
rpool/var/spool mountpoint=/var/spool Mail/print spools.
rpool/var/tmp mountpoint=/var/tmp Persistent temp.
rpool/tmp mountpoint=/tmp Temp. sync=disabled, setuid=off, exec=off, devices=off.
/tmp hardening
sync=disabled /tmp doesn't need write guarantees. Huge performance win.
setuid=off No SUID binaries in /tmp. Blocks privilege escalation.
exec=off No execution from /tmp. Blocks most tmp-based exploits.
devices=off No device nodes in /tmp. Blocks device spoofing.
Snapshot automation
Factory snapshot Taken at install time. Your known-good baseline.
Hourly auto-snapshots Enabled by default. Keep 48 (2 days). Systemd timer.
Pre-upgrade snapshots kldload tools snapshot before package operations.
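The "keep 48" retention rule amounts to destroying everything except the newest 48 snapshots. A minimal sketch of that selection step follows; it is not kldload's actual timer script, and the auto- snapshot prefix in the usage comment is a hypothetical naming scheme.

```shell
# Given snapshot names on stdin, oldest first, print those that fall outside
# the newest $1 (default 48). The caller decides what to do with the list.
snapshots_to_prune() {
    keep="${1:-48}"
    awk -v keep="$keep" '
        { name[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print name[i] }'
}

# Usage on a live system (the "auto-" prefix is a hypothetical naming scheme):
#   zfs list -H -d 1 -t snapshot -o name -s creation rpool/home \
#     | grep '@auto-' | snapshots_to_prune 48 | xargs -r -n 1 zfs destroy
```

Keeping the selection separate from the destroy makes it possible to inspect the list first, which is worth doing the first few times.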
Common Operations Cheat Sheet
These are the commands you'll use most often, grouped by task. Every one of these is covered in depth in the linked wiki pages. This section is your quick reference.
Pool health and monitoring
# The single most important ZFS command — run this daily
zpool status
# pool: rpool
# state: ONLINE
# scan: scrub repaired 0B in 01:23:45 with 0 errors on Tue Mar 31 02:24:12 2026
# config:
# NAME STATE READ WRITE CKSUM
# rpool ONLINE 0 0 0
# mirror-0 ONLINE 0 0 0
# sda2 ONLINE 0 0 0
# sdb2 ONLINE 0 0 0
# errors: No known data errors <-- This is what you want to see
# Live I/O statistics (like iostat for ZFS)
zpool iostat -v 5
# Space usage by dataset
zfs list -o name,used,avail,refer,compressratio -r rpool
# All pool events (including disk failures)
zpool events -v | tail -50
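For unattended monitoring, zpool status -x is the convenient form: it prints the single line "all pools are healthy" when nothing needs attention and the full status of any troubled pool otherwise. A sketch of a wrapper that turns that into an exit code (the function name and the notification step are hypothetical):

```shell
# Classify the output of `zpool status -x`. Returns 0 when it is exactly
# "all pools are healthy"; otherwise echoes the report to stderr and returns 1.
pool_health_exit_code() {
    if [ "$1" = "all pools are healthy" ]; then
        return 0
    fi
    printf 'ZFS needs attention:\n%s\n' "$1" >&2
    return 1
}

# Usage from cron or a systemd timer:
#   pool_health_exit_code "$(zpool status -x)" || send-your-alert-here
```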
Snapshot management
# Create a snapshot (instant, regardless of dataset size)
zfs snapshot rpool/home@2026-04-04_manual
# Create recursive snapshots (all child datasets at once)
zfs snapshot -r rpool/home@before-upgrade
# List all snapshots
zfs list -t snapshot -o name,used,creation -s creation
# Browse snapshot contents without rollback
ls /home/.zfs/snapshot/2026-04-04_manual/
# Restore a single file from a snapshot (no rollback needed)
cp /home/.zfs/snapshot/2026-04-04_manual/alice/important.doc /home/alice/
# Full rollback (destroys all changes since snapshot)
zfs rollback rpool/home@2026-04-04_manual
# Destroy old snapshots
zfs destroy rpool/home@old-snapshot
Replication and backup
# Full send to a remote machine
zfs send rpool/data@snap1 | ssh backup-host zfs recv tank/backup/data
# Incremental send (only changes since last snapshot — fast)
zfs send -i rpool/data@snap1 rpool/data@snap2 | ssh backup-host zfs recv tank/backup/data
# Encrypted raw send (data never decrypted in transit)
zfs send --raw rpool/secrets@snap1 | ssh backup-host zfs recv tank/backup/secrets
# Estimate send size before starting
zfs send -nv -i rpool/data@snap1 rpool/data@snap2
# estimated size is 142M
# Save to a file instead (for portable backup)
zfs send rpool/data@snap1 | gzip > /backup/data-snap1.zfs.gz
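An incremental send needs a base snapshot that exists on both sides. Scripts usually find it by comparing the two snapshot lists and taking the newest name they share. A sketch of that step, assuming each list is a file of bare snapshot names (the part after the @), with the source list ordered oldest first:

```shell
# Print the newest snapshot name present in both lists.
# $1: file of source snapshot names, oldest first; $2: file of destination names.
latest_common_snapshot() {
    awk 'NR == FNR { have[$0] = 1; next }   # first file read: destination names
         ($0 in have) { last = $0 }          # second file read: source, oldest first
         END { if (last != "") print last; else exit 1 }' "$2" "$1"
}

# Building the lists on a live system:
#   zfs list -H -d 1 -t snapshot -o name -s creation rpool/data | cut -d@ -f2 > src.txt
#   ssh backup-host zfs list -H -d 1 -t snapshot -o name tank/backup/data | cut -d@ -f2 > dst.txt
#   base=$(latest_common_snapshot src.txt dst.txt)
```

If the function fails, there is no shared snapshot and the next transfer has to be a full send.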
Disk replacement
# Replace a failed disk with a new one (hot spare or manual)
zpool replace rpool /dev/sda /dev/sdc
# Monitor resilver progress
zpool status rpool
# scan: resilver in progress since Sat Apr 4 10:15:32 2026
# 234G scanned at 456M/s, 123G resilvered at 234M/s, 52.5% done
# Bring a disk online that was temporarily removed
zpool online rpool /dev/sda
# Clear transient errors after replacing a cable
zpool clear rpool
The zpool replace command is one of those things that
makes you realize how much better ZFS is than mdraid. With mdraid, you mdadm --fail,
mdadm --remove, physically swap the disk, mdadm --add, then wait for
the rebuild while hoping your /etc/mdadm.conf is correct and the array name
matches and the partitions are right. With ZFS: zpool replace rpool old-disk new-disk.
One command. No config files. No partition matching. ZFS handles everything.
Wiki Roadmap — Where to Go From Here
This overview covers the foundations. Every topic links deeper into the wiki. Here's the recommended reading order:
If you've read this far, you understand why I built kldload on ZFS. It's not because it's the newest or the trendiest. It's because after 20 years and billions of hours of production runtime, it remains the only storage system that actually guarantees your data comes back the way you wrote it. Everything else is hoping the disk didn't lie. ZFS is the one system that checks. Every block. Every read. Every time. That's the foundation everything else should be built on.