Docker & Podman on ZFS
kldload installs on ZFS by default. Docker and Podman both support the ZFS storage driver: every image layer and container filesystem becomes a ZFS dataset, giving you snapshots, copy-on-write clones, transparent compression, and checksummed I/O for free. This guide covers both engines on one page because the ZFS integration works identically.
Podman's CLI mirrors Docker's (podman run = docker run). kldload ships both in its darksites. Use Docker if you need Docker Compose or broad ecosystem compatibility. Use Podman if you want rootless containers, systemd integration, or no daemon. Use both if you want; they don't conflict. The ZFS configuration is the same either way.
Install Docker
CentOS/RHEL
# Add Docker CE repo
dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# Install
dnf install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
# Enable and start
systemctl enable --now docker
Debian
# Add Docker GPG key and repo
apt-get install -y ca-certificates curl gnupg
install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian trixie stable" \
> /etc/apt/sources.list.d/docker.list
apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
systemctl enable --now docker
Configure Docker to use the ZFS storage driver
Docker auto-detects ZFS if /var/lib/docker is on a ZFS
dataset. On kldload systems it usually is, but let’s make it
explicit.
Auto-detection works when /var/lib/docker is on a ZFS dataset, but "auto-detect" means "it worked on this boot." Make it explicit. Create the dataset before Docker starts, and Docker will always use ZFS. If Docker started first with overlay2 and created files in /var/lib/docker, you need to stop Docker, wipe the directory, create the ZFS dataset at that mountpoint, and restart. The dataset must be mounted at exactly /var/lib/docker, because Docker doesn't follow symlinks for storage driver detection.
Create a dedicated dataset for Docker
# Create a dataset with optimized settings for container layers
zfs create -o mountpoint=/var/lib/docker \
-o compression=lz4 \
-o atime=off \
-o recordsize=128k \
rpool/docker
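Detection can also be taken out of the picture by pinning the driver: Docker reads a storage-driver key from /etc/docker/daemon.json. The helper below is a minimal sketch, not something kldload ships. The function name pin_zfs_driver is made up for illustration, and it refuses to touch an existing file because merging JSON by hand is safer than overwriting it.

```shell
# Hypothetical helper: write a daemon.json that pins the storage driver to zfs.
# Usage: pin_zfs_driver [path]   (path defaults to /etc/docker/daemon.json)
pin_zfs_driver() {
    target="${1:-/etc/docker/daemon.json}"
    # Do not clobber settings that are already there; merge the key by hand instead.
    if [ -e "$target" ]; then
        echo "refusing to overwrite existing $target" >&2
        return 1
    fi
    mkdir -p "$(dirname "$target")" || return 1
    printf '{\n  "storage-driver": "zfs"\n}\n' > "$target"
}
```

Run pin_zfs_driver as root before the first systemctl start docker. With the key set, Docker should report an error at startup instead of silently picking another driver if ZFS is not usable.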
Verify Docker is using ZFS
docker info | grep "Storage Driver"
# Should show: Storage Driver: zfs
If it shows overlay2 instead, Docker started before the
dataset was created. Restart:
systemctl stop docker
# Make sure /var/lib/docker is empty and mounted on ZFS
rm -rf /var/lib/docker/*
systemctl start docker
docker info | grep "Storage Driver"
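The rm -rf in the restart steps above is only safe when /var/lib/docker really is the mounted dataset. As a hedged sketch, the same steps can be wrapped so that nothing is deleted unless findmnt reports a zfs mount at that exact path. The function names are invented for this example.

```shell
# Succeed only if the path is itself a mountpoint whose filesystem type is zfs.
is_zfs_mountpoint() {
    fstype=$(findmnt -n -o FSTYPE --mountpoint "$1" 2>/dev/null) || return 1
    [ "$fstype" = "zfs" ]
}

# Hypothetical wrapper around the reset steps above: stop Docker, confirm the
# mount, and only then clear the directory and start Docker again.
reset_docker_storage() {
    dir="${1:-/var/lib/docker}"
    systemctl stop docker || return 1
    if ! is_zfs_mountpoint "$dir"; then
        echo "$dir is not a mounted ZFS dataset; nothing was deleted" >&2
        return 1
    fi
    # ${dir:?} aborts if dir is somehow empty, so this can never expand to /*
    rm -rf "${dir:?}"/*
    systemctl start docker
}
```

If the check fails, Docker is left stopped on purpose, so the dataset can be created and mounted before trying again.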
What this gives you
With the ZFS storage driver, every Docker image layer and container filesystem is a ZFS dataset:
# See Docker's ZFS datasets
zfs list -r rpool/docker
NAME USED AVAIL REFER MOUNTPOINT
rpool/docker 2.1G 35.0G 24K /var/lib/docker
rpool/docker/c3f5... 156M 35.0G 156M legacy
rpool/docker/a7e2... 89M 35.0G 89M legacy
Benefits:
- Instant snapshots of any container’s filesystem
- CoW clones: spinning up 10 containers from the same image uses near-zero extra space
- Compression: lz4 typically saves 30–50% on container layers
- No overlay filesystem overhead: direct ZFS dataset access
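The compression figure depends on what your images contain, so it is worth measuring on your own pool. ZFS exposes the achieved ratio as the compressratio property. The sketch below converts that multiplier into a percentage using saved = (1 - 1/ratio) × 100. Both function names are illustrative.

```shell
# Convert a ZFS compressratio such as "1.50x" into whole percent of space saved.
ratio_to_saved_pct() {
    awk -v r="${1%x}" 'BEGIN {
        if (r <= 0) exit 1
        printf "%d\n", (1 - 1 / r) * 100 + 0.5   # round to nearest integer
    }'
}

# Report the savings for a dataset (rpool/docker by default).
docker_compression_savings() {
    ratio=$(zfs get -H -o value compressratio "${1:-rpool/docker}") || return 1
    echo "compressratio ${ratio}: about $(ratio_to_saved_pct "$ratio")% saved"
}
```

A ratio of 1.50x works out to roughly a third saved, and 2.00x to half.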
Podman on ZFS
podman generate systemd creates unit files that manage containers exactly like any other service. Combined with ZFS datasets per service, you get containers that are managed by systemd, stored on ZFS, snapshotted by sanoid, and replicated by syncoid. No orchestrator needed for small-to-medium deployments.
Install
# CentOS/RHEL/Rocky/Fedora — already in the repos
dnf install -y podman
# Debian/Ubuntu
apt install -y podman
Rootful Podman on ZFS (system services)
# Create dataset for root Podman storage
zfs create -o mountpoint=/var/lib/containers \
-o compression=lz4 \
-o atime=off \
rpool/containers
# Verify ZFS driver
podman info | grep graphDriverName
# Should show: zfs
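Grepping the human-readable info output works, but both engines also accept a Go template through --format, which is easier to script. The sketch below wraps the two template fields in one check. The function names are made up for this example.

```shell
# Print the active storage driver for "docker" or "podman".
storage_driver() {
    case "$1" in
        docker) docker info --format '{{.Driver}}' ;;
        podman) podman info --format '{{.Store.GraphDriverName}}' ;;
        *) echo "unknown engine: $1" >&2; return 2 ;;
    esac
}

# Fail loudly when an engine is not on the expected driver (zfs by default).
require_driver() {
    actual=$(storage_driver "$1") || return 1
    if [ "$actual" != "${2:-zfs}" ]; then
        echo "$1 uses '$actual', expected '${2:-zfs}'" >&2
        return 1
    fi
}
```

If require_driver podman fails on a rootful setup, the driver can be pinned with driver = "zfs" under the [storage] section of /etc/containers/storage.conf.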
Rootless Podman on ZFS (user containers)
# As root, create a dataset under the user’s home
zfs create -o mountpoint=/home/admin/.local/share/containers \
-o compression=lz4 \
rpool/home/admin/containers
chown admin:admin /home/admin/.local/share/containers
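# As the user: check which driver rootless Podman picked
podman info | grep graphDriverName
# Rootless Podman does not support the zfs graph driver (creating and mounting
# datasets needs root), so expect overlay or vfs here. The storage directory
# still sits on the ZFS dataset and gets its compression and snapshots.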
Podman + systemd (production containers without an orchestrator)
# Run a container
podman run -d --name my-nginx -p 8080:80 nginx:latest
# Generate a systemd unit file
podman generate systemd --new --name my-nginx > /etc/systemd/system/container-nginx.service
# Now systemd manages the container like any other service
systemctl daemon-reload
systemctl enable --now container-nginx.service
# Survives reboots, restarts on failure, logs to journald
systemctl status container-nginx
journalctl -u container-nginx
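Podman 4.4 introduced Quadlet, and newer releases mark podman generate systemd as deprecated in its favor. With Quadlet, a small .container file is turned into a service by a systemd generator. As a rough equivalent of the unit above, assuming a Podman version with Quadlet and the rootful search path /etc/containers/systemd, something like the following could be used. The helper name write_nginx_quadlet is invented for this sketch.

```shell
# Hypothetical helper: write a Quadlet unit equivalent to the my-nginx example.
# Usage: write_nginx_quadlet [dir]   (dir defaults to /etc/containers/systemd)
write_nginx_quadlet() {
    dir="${1:-/etc/containers/systemd}"
    mkdir -p "$dir" || return 1
    cat > "$dir/my-nginx.container" <<'EOF'
[Unit]
Description=nginx container managed by Quadlet

[Container]
Image=docker.io/library/nginx:latest
ContainerName=my-nginx
PublishPort=8080:80

[Service]
Restart=always

[Install]
WantedBy=multi-user.target
EOF
}
```

After writing the file, systemctl daemon-reload generates my-nginx.service and systemctl start my-nginx.service runs it. The [Install] section takes the place of systemctl enable, since generated units are not enabled by hand.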
Podman Compose
# Install podman-compose (pip or package)
pip3 install podman-compose
# or
dnf install -y podman-compose # Fedora/CentOS
# Use the same docker-compose.yml files — podman-compose is compatible
podman-compose up -d
podman-compose ps
Snapshot a running container
Since Docker layers are ZFS datasets, you can snapshot at the ZFS
level — faster and more flexible than docker commit:
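# Find the container's ZFS dataset
DATASET=$(docker inspect --format '{{.GraphDriver.Data.Dataset}}' my-container)
# Snapshot it (works while the container is running)
zfs snapshot "${DATASET}@before-migration"
# Roll back if something breaks: stop the container first so nothing is
# writing to the dataset while it is reverted
docker stop my-container
zfs rollback "${DATASET}@before-migration"
docker start my-container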
Or use the kldload tools:
# Snapshot everything under /var/lib/docker
ksnap /var/lib/docker
# List snapshots
ksnap list
Compose stacks on ZFS datasets
Named volumes are plain directories inside the engine's storage dataset: you can't snapshot or replicate one on its own with zfs send, and you can't see how much space each one uses without docker system df. ZFS datasets as bind mounts give you all of that. Create a dataset per service (postgres, redis, app data), bind-mount it into the container, and now each service's data is a first-class ZFS citizen with its own snapshots, compression, quotas, and replication. This is how production container deployments should work on ZFS.
For persistent data (databases, file stores), create dedicated ZFS datasets instead of using Docker/Podman volumes:
# Create datasets for a PostgreSQL + Redis stack
zfs create -o mountpoint=/srv/myapp rpool/srv/myapp
zfs create -o mountpoint=/srv/myapp/postgres -o recordsize=8k rpool/srv/myapp/postgres
zfs create -o mountpoint=/srv/myapp/redis rpool/srv/myapp/redis
recordsize=8k matches PostgreSQL’s 8 KB page size for optimal I/O.
Then bind-mount in your compose file:
# docker-compose.yml
services:
  postgres:
    image: postgres:16
    volumes:
      - /srv/myapp/postgres:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: changeme
  redis:
    image: redis:7
    volumes:
      - /srv/myapp/redis:/data
docker compose up -d
Backup the entire stack
# Recursive snapshot — catches postgres AND redis datasets
zfs snapshot -r rpool/srv/myapp@backup-$(date +%Y%m%d)
# List backups
zfs list -t snapshot -r rpool/srv/myapp
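Dated snapshot names like these make retention easy to script. As a sketch, the filter below reads snapshot names on stdin and prints the ones whose backup-YYYYMMDD suffix is older than a cutoff date. The function name and the 14-day window in the usage comment are assumptions for illustration.

```shell
# Read snapshot names on stdin and print those named @backup-YYYYMMDD whose
# date is strictly older than the cutoff (also YYYYMMDD).
expired_backups() {
    awk -v cutoff="$1" '
        match($0, /@backup-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]$/) {
            stamp = substr($0, RSTART + 8, 8)   # skip the 8 characters of "@backup-"
            if (stamp + 0 < cutoff + 0) print
        }'
}

# Usage (lists only, destroys nothing; date -d is GNU date syntax):
# zfs list -H -t snapshot -o name -r rpool/srv/myapp \
#     | expired_backups "$(date -d '14 days ago' +%Y%m%d)"
```

Review the list before piping it into anything that calls zfs destroy. Snapshots with other names are ignored by the filter.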
zfs snapshot -r rpool/srv/myapp@backup atomically snapshots the parent and every child dataset (postgres and redis in this case) at the exact same point in time. This gives you a consistent cross-service snapshot. Rolling back to this snapshot restores both databases to the same moment, which matters when services have foreign key relationships or shared state. You can't do this with Docker volumes because they don't have a parent/child relationship. ZFS datasets do.
Clone for dev/test
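# Instant clone of production data for testing (-p creates the missing
# rpool/srv/myapp-test parent; the mountpoint is set to match the bind mount below)
zfs clone -p -o mountpoint=/srv/myapp-test/postgres \
rpool/srv/myapp/postgres@backup-20260321 rpool/srv/myapp-test/postgres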
# Run a test instance on the clone
docker run -d --name pg-test \
-v /srv/myapp-test/postgres:/var/lib/postgresql/data \
-p 5433:5432 \
postgres:16
The clone starts at near-zero space and only grows as the test instance writes new data. Delete it when done:
docker rm -f pg-test
zfs destroy rpool/srv/myapp-test/postgres
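Because clones are this cheap, one snapshot can back several test instances at once. The loop below is a sketch with a made-up helper name and naming scheme (myapp-test-1, myapp-test-2, and so on). It sets each mountpoint explicitly so the bind-mount paths are predictable.

```shell
# Hypothetical helper: create N clones of one snapshot, each with its own mountpoint.
# Usage: clone_for_tests <snapshot> <count>
clone_for_tests() {
    snap="$1"
    count="$2"
    i=1
    while [ "$i" -le "$count" ]; do
        # -p creates the missing parent dataset for each clone.
        zfs clone -p -o mountpoint="/srv/myapp-test-$i/postgres" \
            "$snap" "rpool/srv/myapp-test-$i/postgres" || return 1
        i=$((i + 1))
    done
}
```

Each clone can then be started as its own container on a separate host port (5433, 5434, and so on) and destroyed independently when the test is done.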
zfs clone is instant and uses zero space until the test writes diverge from production. You can spin up five test clones simultaneously and they all share the same blocks. Run your migration against each one with different parameters. Destroy the losers, keep the winner. Total extra space used: only the blocks that the tests actually changed. This is why ZFS changes how you do database testing, not just how you store data.
Recommended ZFS properties for Docker workloads
| Property | Value | Why |
|---|---|---|
| compression | lz4 | Fast, saves 30–50% on container layers |
| atime | off | Containers don’t need access time tracking |
| recordsize | 128k | Good default for mixed container I/O |
| recordsize | 8k | For PostgreSQL data directories |
| recordsize | 16k | For MySQL/MariaDB data directories |
| logbias | throughput | For sequential write workloads (logs, streams) |
| sync | standard | Keep standard unless you know you can afford data loss |
Docker vs Podman: which one?
| | Docker | Podman |
|---|---|---|
| Architecture | Client/server daemon (dockerd) | Daemonless, fork/exec model |
| Rootless | Supported but optional | Default — runs as regular user |
| Compose | docker compose (built-in plugin) | podman-compose (compatible, separate install) |
| systemd integration | Manual unit files | podman generate systemd — native |
| ZFS storage driver | Yes, auto-detects ZFS | Yes for rootful; rootless falls back to overlay or vfs |
| Daemon crash impact | All containers restart | Containers keep running |
| Ecosystem | Broadest compatibility, most docs | Growing, RHEL default |
| Available in kldload darksite | Yes (Docker CE repo) | Yes (distro repos) |
Limits and gotchas
zfs list will show hundreds of datasets with hash names. That's normal. The snapshot-before-prune habit will save you the day you accidentally docker system prune -af and realize you needed that custom image you built three weeks ago.
ZFS memory usage: Docker/Podman on ZFS means the ARC cache competes with container memory. On memory-constrained systems, cap the ARC (4 GiB here):
echo "options zfs zfs_arc_max=4294967296" > /etc/modprobe.d/zfs-arc.conf
Dataset count: Each image layer creates a ZFS dataset. Pulling many images can create thousands of datasets. This is fine, and ZFS handles it, but zfs list output gets long. Use zfs list -r rpool/docker -o name,used,refer -S used | head -20 to see the biggest consumers.
Snapshot before pruning: Before running docker system prune, take a snapshot so you can recover if you prune too aggressively:
ksnap /var/lib/docker
docker system prune -af
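The ARC value is given in bytes, and the modprobe option is only read when the zfs module loads. The sketch below computes the byte count from GiB and writes it both to the modprobe file and to the live module parameter under /sys/module/zfs/parameters. The function names are illustrative. On a root-on-ZFS system the module loads from the initramfs, so the initramfs may need regenerating for the file to take effect at boot.

```shell
# Convert whole GiB to bytes for zfs_arc_max.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Hypothetical helper: persist the cap for the next module load and apply it now.
# Usage: set_arc_max_gib <GiB> [modprobe-conf] [runtime-parameter-file]
set_arc_max_gib() {
    bytes=$(gib_to_bytes "$1")
    conf="${2:-/etc/modprobe.d/zfs-arc.conf}"
    param="${3:-/sys/module/zfs/parameters/zfs_arc_max}"
    echo "options zfs zfs_arc_max=$bytes" > "$conf" || return 1
    # Writing the parameter file changes the running system without a reboot.
    echo "$bytes" > "$param"
}
```

Running set_arc_max_gib 4 as root reproduces the 4294967296 value used above.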