
Docker & Podman on ZFS

kldload installs on ZFS by default. Docker and Podman both support a ZFS storage driver: every image layer and container filesystem becomes a ZFS dataset, giving you snapshots, copy-on-write clones, transparent compression, and checksummed I/O for free. Named volumes are plain directories inside the engine's storage root rather than datasets of their own. This guide covers both engines on one page because the ZFS integration works the same way in both.

Docker and Podman do the same thing with different architectures. Docker uses a daemon (dockerd) that runs as root. Podman is daemonless and runs rootless by default. Both use the same OCI image format, the same container runtime (runc/crun), and the same ZFS storage driver. If you can run one, you can run the other — the commands are nearly identical (podman run = docker run). kldload ships both in its darksites. Use Docker if you need Docker Compose or broad ecosystem compatibility. Use Podman if you want rootless containers, systemd integration, or no daemon. Use both if you want — they don't conflict. The ZFS configuration is the same either way.

Install Docker

CentOS/RHEL

# Add Docker CE repo (the config-manager subcommand comes from dnf-plugins-core)
dnf install -y dnf-plugins-core
dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

# Install
dnf install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin

# Enable and start
systemctl enable --now docker

Debian

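# Add Docker GPG key and repo
apt-get install -y ca-certificates curl gnupg
install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
chmod a+r /etc/apt/keyrings/docker.gpg
# Detect the architecture and release codename instead of hard-coding amd64/trixie
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian $(. /etc/os-release && echo "$VERSION_CODENAME") stable" \
  > /etc/apt/sources.list.d/docker.list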
apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin

systemctl enable --now docker

Configure Docker to use the ZFS storage driver

Docker auto-detects ZFS if /var/lib/docker is on a ZFS dataset. On kldload systems it usually is, but let’s make it explicit.

Auto-detection only means "it worked on this boot," so make it explicit. Create the dataset before Docker starts for the first time, and Docker will pick the ZFS driver. If Docker already started with overlay2 and created files in /var/lib/docker, you need to stop Docker, wipe the directory, create the ZFS dataset at that mountpoint, and restart. The dataset must be mounted at exactly /var/lib/docker, because Docker doesn't follow symlinks for storage driver detection.

Create a dedicated dataset for Docker

# Create a dataset with optimized settings for container layers
zfs create -o mountpoint=/var/lib/docker \
           -o compression=lz4 \
           -o atime=off \
           -o recordsize=128k \
           rpool/docker
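
Beyond relying on auto-detection, the driver can be pinned in Docker's daemon configuration. The following is a minimal sketch, assuming no /etc/docker/daemon.json exists yet. The function name pin_docker_zfs_driver is made up for illustration; the storage-driver key itself is Docker's own setting.

```shell
# Write a minimal daemon.json that pins Docker's storage driver to zfs.
# The optional first argument is a hypothetical override so the function
# can be tried against a scratch path instead of /etc/docker/daemon.json.
pin_docker_zfs_driver() {
    local target="${1:-/etc/docker/daemon.json}"
    if [ -e "$target" ]; then
        # Never clobber an existing config; other keys may live there.
        echo "refusing to overwrite existing $target; merge the key by hand" >&2
        return 1
    fi
    mkdir -p "$(dirname "$target")"
    printf '{\n  "storage-driver": "zfs"\n}\n' > "$target"
}

# Usage (as root):
#   pin_docker_zfs_driver && systemctl restart docker
```

With the driver pinned, Docker should refuse to start when /var/lib/docker is not on ZFS instead of quietly falling back to overlay2, which makes a missing mount visible right away.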

Verify Docker is using ZFS

docker info | grep "Storage Driver"
# Should show: Storage Driver: zfs

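If it shows overlay2 instead, Docker started before the dataset was created. Stop Docker, clear the old state, confirm the ZFS mount, and start again:

systemctl stop docker.socket docker
# WARNING: this deletes every existing image, container, and volume
rm -rf /var/lib/docker/*
# Create the rpool/docker dataset now if it does not exist yet (see above), then confirm the mount
findmnt -n -o FSTYPE -T /var/lib/docker   # must print: zfs
systemctl start docker
docker info | grep "Storage Driver"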

What this gives you

With the ZFS storage driver, every Docker image layer and container filesystem is a ZFS dataset:

# See Docker's ZFS datasets
zfs list -r rpool/docker
NAME                                                          USED  AVAIL  REFER  MOUNTPOINT
rpool/docker                                                  2.1G  35.0G    24K  /var/lib/docker
rpool/docker/c3f5...                                          156M  35.0G   156M  legacy
rpool/docker/a7e2...                                          89M   35.0G    89M  legacy

Benefits:

  • Instant snapshots of any container’s filesystem
  • CoW clones: spinning up 10 containers from the same image uses near-zero extra space
  • Compression: lz4 typically saves 30–50% on container layers
  • No overlay filesystem overhead: direct ZFS dataset access
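
The compression figure depends heavily on what your images contain, so it is worth measuring on your own pool. ZFS reports a compressratio property such as 1.45x; the sketch below converts that ratio into "percent of space saved" using saved = (1 - 1/ratio) × 100. Both function names are illustrative, and rpool/docker is assumed to be the dataset created earlier.

```shell
# Convert a ZFS compressratio such as "1.45x" into percent of space saved.
# Formula: saved = (1 - 1/ratio) * 100
ratio_to_saved_pct() {
    local ratio="${1%x}"    # strip the trailing "x"
    awk -v r="$ratio" 'BEGIN { printf "%.0f\n", (1 - 1 / r) * 100 }'
}

# Read the live ratio for a dataset (defaults to rpool/docker) and convert it.
docker_space_saved() {
    local dataset="${1:-rpool/docker}"
    ratio_to_saved_pct "$(zfs get -H -o value compressratio "$dataset")"
}

# Usage:
#   docker_space_saved rpool/docker
```

A ratio of 2.00x corresponds to 50% saved, and 1.00x means compression is not helping on that dataset.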


Podman on ZFS

Podman’s biggest advantage over Docker on a kldload server: no daemon. Docker’s daemon (dockerd) is a single process that manages all containers. If it crashes or is restarted, every container goes down with it unless live-restore is enabled. Podman runs each container under its own small monitor process (conmon), so if the podman command exits or dies, containers keep running. For servers that need maximum uptime, this matters. Podman also integrates with systemd natively: podman generate systemd creates unit files that manage containers exactly like any other service. Combined with ZFS datasets per service, you get containers that are managed by systemd, stored on ZFS, snapshotted by sanoid, and replicated by syncoid. No orchestrator needed for small-to-medium deployments.

Install

# CentOS/RHEL/Rocky/Fedora — already in the repos
dnf install -y podman

# Debian/Ubuntu
apt install -y podman

Rootful Podman on ZFS (system services)

# Create dataset for root Podman storage
zfs create -o mountpoint=/var/lib/containers \
           -o compression=lz4 \
           -o atime=off \
           rpool/containers

# Verify ZFS driver
podman info | grep graphDriverName
# Should show: zfs
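
If podman info reports overlay here even though the dataset is mounted, the driver may need to be selected explicitly, since many distributions ship a default storage.conf that names overlay. A minimal sketch, assuming no /etc/containers/storage.conf exists yet; the function name pin_podman_zfs_driver is hypothetical, while the [storage] table and driver key are part of the containers-storage.conf format.

```shell
# Write a minimal storage.conf that selects the zfs driver for rootful Podman.
# The optional first argument is a hypothetical override for trying this on a scratch path.
pin_podman_zfs_driver() {
    local target="${1:-/etc/containers/storage.conf}"
    if [ -e "$target" ]; then
        # An existing file may carry other settings; edit it by hand instead.
        echo "refusing to overwrite existing $target; set driver under [storage] by hand" >&2
        return 1
    fi
    mkdir -p "$(dirname "$target")"
    printf '[storage]\ndriver = "zfs"\n' > "$target"
}

# Usage (as root):
#   pin_podman_zfs_driver
#   podman info | grep graphDriverName
```

If Podman has already initialized its storage with a different driver, it will likely complain about the mismatch. podman system reset clears the storage (deleting all images and containers) so the new driver can take over.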

Rootless Podman on ZFS (user containers)

# As root, create a dataset under the user’s home
zfs create -o mountpoint=/home/admin/.local/share/containers \
           -o compression=lz4 \
           rpool/home/admin/containers
chown admin:admin /home/admin/.local/share/containers

# As the user:
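podman info | grep graphDriverName
# Rootless Podman normally reports overlay (or vfs) here: the zfs driver needs root to create datasets.
# The storage still sits on the ZFS dataset, so compression, checksums, and dataset-level snapshots apply.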

Podman + systemd (production containers without an orchestrator)

# Run a container
podman run -d --name my-nginx -p 8080:80 nginx:latest

# Generate a systemd unit file
podman generate systemd --new --name my-nginx > /etc/systemd/system/container-nginx.service

# Now systemd manages the container like any other service
systemctl daemon-reload
systemctl enable --now container-nginx.service

# Survives reboots, restarts on failure, logs to journald
systemctl status container-nginx
journalctl -u container-nginx
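
Recent Podman releases (4.4 and later) mark podman generate systemd as deprecated in favor of Quadlet, where you describe the container in a .container file and systemd generates the unit for you. The sketch below writes an equivalent of the my-nginx example as a Quadlet file. The function name write_nginx_quadlet and the file name my-nginx.container are illustrative choices; check that your Podman version ships Quadlet before relying on it.

```shell
# Write a Quadlet definition equivalent to:
#   podman run -d --name my-nginx -p 8080:80 nginx:latest
# The optional first argument overrides the target directory (useful for a dry run).
write_nginx_quadlet() {
    local dir="${1:-/etc/containers/systemd}"
    mkdir -p "$dir"
    cat > "$dir/my-nginx.container" <<'EOF'
[Unit]
Description=nginx container managed by Quadlet

[Container]
Image=docker.io/library/nginx:latest
ContainerName=my-nginx
PublishPort=8080:80

[Service]
Restart=always

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (as root):
#   write_nginx_quadlet
#   systemctl daemon-reload
#   systemctl start my-nginx.service
```

The generated service takes its name from the file (my-nginx.container becomes my-nginx.service), and the [Install] section is what makes it start at boot.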

Podman Compose

# Install podman-compose (pip or package)
pip3 install podman-compose
# or
dnf install -y podman-compose   # Fedora/CentOS

# Use the same docker-compose.yml files — podman-compose is compatible
podman-compose up -d
podman-compose ps

Snapshot a running container

Since Docker layers are ZFS datasets, you can snapshot at the ZFS level — faster and more flexible than docker commit:

# Find the container's ZFS dataset
DATASET=$(docker inspect --format '{{.GraphDriver.Data.Dataset}}' my-container)

# Snapshot it
zfs snapshot "${DATASET}@before-migration"

# Roll back if something breaks (stop the container first so nothing is writing to the dataset)
docker stop my-container
zfs rollback "${DATASET}@before-migration"
docker start my-container
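
The lookup-then-snapshot steps above can be wrapped in a small helper. This is a sketch, not a kldload tool: the name snapshot_container is invented here, and it assumes Docker is running with the zfs storage driver so that the inspect template returns a dataset name.

```shell
# Snapshot the ZFS dataset backing a Docker container.
#   $1 = container name or ID, $2 = snapshot tag
# Prints the full snapshot name on success.
snapshot_container() {
    local name="$1" tag="$2" dataset
    dataset=$(docker inspect --format '{{.GraphDriver.Data.Dataset}}' "$name") || return 1
    if [ -z "$dataset" ]; then
        echo "no ZFS dataset reported for $name (is the zfs driver active?)" >&2
        return 1
    fi
    zfs snapshot "${dataset}@${tag}" && echo "${dataset}@${tag}"
}

# Usage:
#   snapshot_container my-container before-migration
```

Printing the snapshot name makes it easy to capture in a variable and pass to zfs rollback later.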

Or use the kldload tools:

# Snapshot everything under /var/lib/docker
ksnap /var/lib/docker

# List snapshots
ksnap list

Compose stacks on ZFS datasets

This is the most important pattern on this page. Docker volumes and Podman volumes are opaque blobs managed by the container engine. You can't snapshot them independently, you can't set per-volume recordsize, you can't replicate them with zfs send, and you can't see how much space each one uses without docker system df. ZFS datasets as bind mounts give you all of that. Create a dataset per service (postgres, redis, app data), bind-mount it into the container, and now each service's data is a first-class ZFS citizen with its own snapshots, compression, quotas, and replication. This is how production container deployments should work on ZFS.

For persistent data (databases, file stores), create dedicated ZFS datasets instead of using Docker/Podman volumes:

# Create datasets for a PostgreSQL + Redis stack
zfs create -o mountpoint=/srv/myapp rpool/srv/myapp
zfs create -o mountpoint=/srv/myapp/postgres -o recordsize=8k rpool/srv/myapp/postgres
zfs create -o mountpoint=/srv/myapp/redis rpool/srv/myapp/redis

recordsize=8k matches PostgreSQL’s 8KB page size for optimal I/O.

Then bind-mount in your compose file:

# docker-compose.yml
services:
  postgres:
    image: postgres:16
    volumes:
      - /srv/myapp/postgres:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: changeme

  redis:
    image: redis:7
    volumes:
      - /srv/myapp/redis:/data

docker compose up -d

Backup the entire stack

# Recursive snapshot — catches postgres AND redis datasets
zfs snapshot -r rpool/srv/myapp@backup-$(date +%Y%m%d)

# List backups
zfs list -t snapshot -r rpool/srv/myapp

Recursive snapshots are the key. zfs snapshot -r rpool/srv/myapp@backup atomically snapshots the parent AND every child dataset (postgres and redis in this case) at the exact same point in time. This gives you a consistent cross-service snapshot. Rolling every dataset back to this snapshot restores both data stores to the same moment, which matters when services reference each other's data or share state. Note that zfs rollback works on one dataset at a time, so each child has to be rolled back separately. You can't do this with Docker volumes because they don't have a parent/child relationship. ZFS datasets do.
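
Restoring the stack takes one extra step compared with snapshotting it: zfs rollback acts on a single dataset, and its -r flag destroys newer snapshots rather than recursing into children. A minimal sketch of rolling back a parent and all of its children to the same tag follows; the function name rollback_stack is made up for this example.

```shell
# Roll back a dataset and all of its children to the same snapshot tag.
#   $1 = parent dataset (e.g. rpool/srv/myapp), $2 = snapshot tag (e.g. backup-20260321)
# Deliberately omits "zfs rollback -r": if newer snapshots exist, the rollback
# fails instead of silently destroying them.
rollback_stack() {
    local parent="$1" tag="$2" ds
    zfs list -H -o name -r "$parent" | while read -r ds; do
        zfs rollback "${ds}@${tag}" || return 1
    done
}

# Usage (stop the stack first so nothing is writing to the datasets):
#   docker compose down
#   rollback_stack rpool/srv/myapp backup-20260321
#   docker compose up -d
```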

Clone for dev/test

# Instant clone of production data for testing (-p creates the missing myapp-test parent dataset)
zfs clone -p rpool/srv/myapp/postgres@backup-20260321 rpool/srv/myapp-test/postgres

# Run a test instance on the clone
docker run -d --name pg-test \
  -v /srv/myapp-test/postgres:/var/lib/postgresql/data \
  -p 5433:5432 \
  postgres:16

The clone starts at near-zero space and only grows as the test instance writes new data. Delete it when done:

docker rm -f pg-test
zfs destroy rpool/srv/myapp-test/postgres

This is the dev/test workflow that changes how you work. Production has 200GB of PostgreSQL data. You need to test a migration. On ext4, you'd copy the whole database: 200GB, 30 minutes, 200GB of extra disk space. On ZFS, zfs clone is instant and uses zero space until the test writes diverge from production. You can spin up five test clones simultaneously and they all share the same blocks. Run your migration against each one with different parameters. Destroy the losers, keep the winner. Total extra space used: only the blocks that the tests actually changed. This is why ZFS changes how you do database testing, not just how you store data.
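
The "five clones" idea can be sketched as a loop. The function name clone_for_tests and the numbered-suffix naming scheme are assumptions for illustration; the snapshot and dataset names in the usage line follow the examples above.

```shell
# Create N numbered clones of one snapshot.
#   $1 = source snapshot, $2 = base name for the clones, $3 = how many
# Prints each clone's dataset name as it is created.
clone_for_tests() {
    local snap="$1" base="$2" count="$3" i
    for i in $(seq 1 "$count"); do
        # -p creates any missing parent datasets along the way
        zfs clone -p "$snap" "${base}-${i}" || return 1
        echo "${base}-${i}"
    done
}

# Usage:
#   clone_for_tests rpool/srv/myapp/postgres@backup-20260321 rpool/srv/myapp-test/postgres 5
#   # ...run each migration variant against its own clone, then clean up:
#   zfs destroy rpool/srv/myapp-test/postgres-3
```

Each clone inherits its mountpoint from the parent dataset, so point the test container's bind mount at wherever zfs list shows the clone mounted.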

| Property    | Value      | Why                                                     |
|-------------|------------|---------------------------------------------------------|
| compression | lz4        | Fast, saves 30–50% on container layers                  |
| atime       | off        | Containers don’t need access time tracking              |
| recordsize  | 128k       | Good default for mixed container I/O                    |
| recordsize  | 8k         | For PostgreSQL data directories                         |
| recordsize  | 16k        | For MySQL/MariaDB data directories                      |
| logbias     | throughput | For sequential write workloads (logs, streams)          |
| sync        | standard   | Keep standard unless you know you can afford data loss  |

Docker vs Podman: which one?

|                               | Docker                                          | Podman                                          |
|-------------------------------|-------------------------------------------------|-------------------------------------------------|
| Architecture                  | Client/server daemon (dockerd)                  | Daemonless, fork/exec model                     |
| Rootless                      | Supported but optional                          | Default: runs as regular user                   |
| Compose                       | docker compose (built-in plugin)                | podman-compose (compatible, separate install)   |
| systemd integration           | Manual unit files                               | podman generate systemd (native)                |
| ZFS storage driver            | Yes, auto-detects ZFS                           | Yes for rootful; rootless uses overlay or vfs   |
| Daemon crash impact           | Containers stop unless live-restore is enabled  | Containers keep running                         |
| Ecosystem                     | Broadest compatibility, most docs               | Growing, RHEL default                           |
| Available in kldload darksite | Yes (Docker CE repo)                            | Yes (distro repos)                              |

The honest answer: use Docker if you're already using Docker Compose files and don't want to change your workflow. Use Podman if you're starting fresh, care about security (rootless by default), or run RHEL/CentOS where Podman is the system default. For production services on kldload, Podman + systemd units is the more "infrastructure native" choice — each container is managed by systemd like any other service, logs go to journald, and there's no daemon that can take everything down when it restarts. For development and CI, Docker's broader tool compatibility still wins. Both use ZFS identically. You're not locked into either one.

Limits and gotchas

These are real gotchas that will bite you if you ignore them. The ARC memory one is the most common: ZFS's ARC cache is aggressive by default — it will use half your RAM for caching. On a container host where you want that RAM for containers, you need to cap it. The dataset count issue is cosmetic but confusing — zfs list will show hundreds of datasets with hash names. That's normal. The snapshot-before-prune habit will save you the day you accidentally docker system prune -af and realize you needed that custom image you built three weeks ago.
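  • ZFS memory usage: Docker/Podman on ZFS means the ARC cache competes with container memory. On memory-constrained systems, cap the ARC (4294967296 bytes = 4 GiB). The option is read when the zfs module loads, so on a root-on-ZFS system you will likely need to rebuild the initramfs (dracut -f or update-initramfs -u) and reboot before it applies:

    echo "options zfs zfs_arc_max=4294967296" > /etc/modprobe.d/zfs-arc.conf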
  • Dataset count: Each image layer creates a ZFS dataset. Pulling many images can create thousands of datasets. This is fine — ZFS handles it — but zfs list output gets long. Use zfs list -r rpool/docker -o name,used,refer -S used | head -20 to see the biggest consumers.

  • Snapshot before pruning: Before running docker system prune, take a snapshot so you can recover if you prune too aggressively:

    ksnap /var/lib/docker
    docker system prune -af
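
For the ARC cap in the first item, the byte value is just GiB × 1024³, and on Linux the same limit can also be applied to the running module through /sys without waiting for a reboot. A small sketch; the function names are illustrative, and the optional second argument to set_arc_max_gib exists only so the function can be pointed at a scratch file.

```shell
# Convert a whole number of GiB into bytes (4 -> 4294967296).
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Apply an ARC cap to the running zfs module.
#   $1 = size in GiB
#   $2 = parameter file (defaults to the live zfs_arc_max tunable)
set_arc_max_gib() {
    local bytes param="${2:-/sys/module/zfs/parameters/zfs_arc_max}"
    bytes=$(gib_to_bytes "$1")
    echo "$bytes" > "$param"
}

# Usage (as root):
#   set_arc_max_gib 4
#   cat /sys/module/zfs/parameters/zfs_arc_max
```

The runtime change does not survive a reboot, so keep the modprobe.d entry as the persistent setting and use this only to apply the cap immediately.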