kldload: your Linux re-packer. Infrastructure, your way, for free. Pick your distro, get ZFS on root.

Build Your Own

Build Server / CI Runner — snapshot before test, rollback after. Every build starts clean.

CI runners accumulate garbage. Old build artifacts, stale caches, leftover containers, broken dependencies from that one PR that installed a system library. Most teams deal with it by nuking the runner and reprovisioning from scratch. With ZFS, you snapshot before every build and rollback after. The runner is always pristine. Builds are reproducible. And if something goes wrong, you have the exact filesystem state that caused the failure — ready for forensics.

On ext4, "clean CI environment" means either nuking the entire VM and reprovisioning (slow), or Docker layer caching tricks that leak state between runs (unreliable). With ZFS, zfs clone gives you a near-instant test environment that shares all unchanged blocks with the golden image, so it costs almost no extra space until the test starts writing. Clone the golden snapshot, run the test in the clone, destroy the clone. Every CI run starts from a known-clean filesystem state, typically in under a second. No Docker-in-Docker. No layer caching tricks. The filesystem itself is the cache and the isolation boundary.

What CI on ZFS actually enables that CI on ext4 can't:

Forensic debugging. Build #4,721 failed. On ext4, the evidence is gone — the next build overwrote it. On ZFS, you kept the snapshot. Mount it read-only. Walk the exact filesystem state that caused the failure. Check which dependency was wrong, which config file was stale, which symlink was broken. The crime scene is preserved.

Per-PR environments. Developer opens a PR. Your CI clones the golden workspace (zfs clone rpool/srv/ci/golden@latest rpool/srv/ci/pr-4721). The PR gets its own complete environment (OS packages, database, configs) in a fraction of a second, because a clone copies no data up front. Tests run in isolation. PR merges or closes, clone is destroyed. No shared state between PRs. No "it passed on my PR but broke on yours because we shared the test database."

Build cache that actually works. rpool/srv/ci/cache is a persistent dataset that survives workspace rollbacks. Your node_modules, your .m2 repository, your pip cache — all on a dataset that persists while workspaces are cloned and destroyed around it. The cache is structural, not a Docker layer trick that invalidates when the base image changes.

Artifact storage with provenance. Build produces a binary? It goes to rpool/srv/ci/artifacts. Snapshot after the build. The artifact now has a timestamp, a snapshot name, and checksummed blocks. Six months later, an auditor asks "prove this binary was built on this date": you mount the snapshot and the artifact is there, byte-identical, verified by ZFS checksums. That's supply chain provenance from a filesystem feature.
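The artifact-provenance idea above can be sketched as a small helper. This is only an illustration under stated assumptions: the function names artifact_snapshot_name and seal_artifact, the build-ID-plus-UTC-timestamp naming scheme, and the ZFS override variable (for dry runs) are hypothetical and not part of kldload. The extra SHA-256 file is an addition that gives you a value to compare outside ZFS.

```shell
# Hypothetical helper: record a SHA-256 next to the artifact, then snapshot
# the artifacts dataset so the pair is frozen together.
ZFS="${ZFS:-zfs}"                        # assumption: set ZFS="echo zfs" for a dry run
ARTIFACTS_DS="rpool/srv/ci/artifacts"    # dataset name used on this page

# Print a snapshot name such as
#   rpool/srv/ci/artifacts@build-4721-20250101T120000Z
# The second argument (timestamp) is optional and defaults to the current UTC time.
artifact_snapshot_name() {
    build_id="$1"
    stamp="${2:-$(date -u +%Y%m%dT%H%M%SZ)}"
    printf '%s@build-%s-%s\n' "$ARTIFACTS_DS" "$build_id" "$stamp"
}

# Write <artifact>.sha256, then snapshot the dataset that holds it.
seal_artifact() {
    artifact="$1"
    build_id="$2"
    stamp="${3:-}"
    sha256sum "$artifact" > "$artifact.sha256" || return 1
    $ZFS snapshot "$(artifact_snapshot_name "$build_id" ${stamp:+"$stamp"})"
}
```

A call might look like seal_artifact /srv/ci/artifacts/app.bin 4721, where app.bin is a placeholder file name. The snapshot's creation property then carries the date, and the .sha256 file inside the snapshot can be checked against the binary you shipped.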

The recipe

Step 1: Set up the build server

# Install kldload server profile, then create build datasets
kdir /srv/ci
kdir /srv/ci/workspaces
kdir /srv/ci/cache
kdir /srv/ci/artifacts

# Set compression — build artifacts compress extremely well
zfs set compression=zstd rpool/srv/ci

# Set a quota so runaway builds can't fill the pool
zfs set quota=200G rpool/srv/ci/workspaces

Each concern gets its own dataset. Workspaces are ephemeral. Cache persists. Artifacts get shipped out. ZFS keeps them separate.
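Before wiring up a runner it can help to confirm that the layout looks the way Step 1 intends. The sketch below assumes the kdir calls produced datasets named rpool/srv/ci and children, as the zfs set commands above imply. The function name ci_layout_report and the ZFS override variable are hypothetical conveniences.

```shell
# Hypothetical helper: show every CI dataset with its space usage, quota,
# and compression settings in one table.
ZFS="${ZFS:-zfs}"    # assumption: set ZFS="echo zfs" to print the command instead of running it

ci_layout_report() {
    # -r walks rpool/srv/ci and all of its children
    $ZFS list -r -o name,used,avail,quota,compression,compressratio rpool/srv/ci
}
```

In the output you would expect the quota column to show 200G on the workspaces dataset and the compression column to show zstd on every child, since children inherit the property from rpool/srv/ci.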

Step 2: The snapshot-build-rollback pattern

# This is the core pattern. Run this before every CI job.

# Create a clean snapshot of the workspace
zfs snapshot rpool/srv/ci/workspaces@clean

# ... build runs here, installs deps, compiles, tests ...

# After the build: rollback to clean state
zfs rollback rpool/srv/ci/workspaces@clean

# Destroy the snapshot (optional — keeps things tidy)
zfs destroy rpool/srv/ci/workspaces@clean

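Taking a snapshot is a metadata operation and is effectively instant. Rollback only discards the blocks the build wrote, so no data is copied and nothing is reprovisioned. The workspace is typically clean again in about a second.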
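The pattern above is easy to get wrong when a build fails halfway, because the rollback must still happen and the build's exit status must still reach the CI system. One possible wrapper is sketched here. The function name run_clean and the ZFS override variable are hypothetical, and the snapshot name @clean is taken from the commands above.

```shell
# Hypothetical wrapper: snapshot, run a command, always roll back,
# and return the command's own exit status.
ZFS="${ZFS:-zfs}"                         # assumption: set ZFS="echo zfs" for a dry run
WORKSPACE_DS="rpool/srv/ci/workspaces"    # dataset name used on this page

run_clean() {
    snap="$WORKSPACE_DS@clean"
    $ZFS snapshot "$snap" || return 1

    # Run the build; remember its status instead of aborting on failure
    status=0
    "$@" || status=$?

    # Restore the clean state and drop the snapshot, pass or fail
    $ZFS rollback "$snap"
    $ZFS destroy "$snap"
    return "$status"
}
```

Usage would be along the lines of run_clean make -C /srv/ci/workspaces/myproject test, where myproject is a placeholder. Make sure no shell is left with its working directory inside the workspace when the rollback runs.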

Step 3: Parallel build workspaces with ZFS clones

# Create a golden image snapshot
zfs snapshot rpool/srv/ci/workspaces@golden

# Clone it for parallel builds: each clone is near-instant and uses almost no space until written to
zfs clone rpool/srv/ci/workspaces@golden rpool/srv/ci/workspaces/job-1234
zfs clone rpool/srv/ci/workspaces@golden rpool/srv/ci/workspaces/job-1235
zfs clone rpool/srv/ci/workspaces@golden rpool/srv/ci/workspaces/job-1236

# Each job gets /srv/ci/workspaces/job-XXXX as its working directory
# They share the base blocks — only writes allocate new space

# When done, destroy the clones
zfs destroy rpool/srv/ci/workspaces/job-1234
zfs destroy rpool/srv/ci/workspaces/job-1235
zfs destroy rpool/srv/ci/workspaces/job-1236

10 parallel builds, each with a 20GB workspace? That's not 200GB of disk. It's 20GB plus whatever each build writes. Copy-on-write means clones are free until they diverge.
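The arithmetic behind that claim can be written out. The sketch below compares full copies with clones. The figure of 2GB written per build is an assumption chosen for illustration, and both function names are hypothetical.

```shell
# Space needed (GB) if every job gets a full copy: base size times job count.
full_copy_usage_gb() {
    base_gb="$1"; jobs="$2"
    echo $(( base_gb * jobs ))
}

# Space needed (GB) with clones: one shared base plus what each job writes.
clone_usage_gb() {
    base_gb="$1"; jobs="$2"; written_gb="$3"
    echo $(( base_gb + jobs * written_gb ))
}

full_copy_usage_gb 20 10      # prints 200
clone_usage_gb 20 10 2        # prints 40
```

On a live system, zfs list -r -o name,used,refer,origin rpool/srv/ci/workspaces shows the real numbers: refer is what each clone can see, used is what it alone has written.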

Step 4: GitLab Runner with ZFS-backed workdirs

# Install GitLab Runner
curl -L "https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.rpm.sh" | bash
kpkg install gitlab-runner

# Create the runner wrapper script
cat > /usr/local/bin/ci-workspace.sh <<'SCRIPT'
#!/bin/bash
set -euo pipefail
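# Usage: ci-workspace.sh JOB_ID COMMAND [ARGS...]
JOB_ID="$1"
shift
DATASET="rpool/srv/ci/workspaces/job-${JOB_ID}"
WORKDIR="/srv/ci/workspaces/job-${JOB_ID}"

# Clone from golden snapshot
zfs clone rpool/srv/ci/workspaces@golden "${DATASET}"

# Trap: destroy clone on exit, whether the job passes or fails
trap 'zfs destroy "${DATASET}"' EXIT

# Run the job command inside the clone. The subshell keeps this script's own
# working directory outside the clone so the destroy is not blocked.
(cd "${WORKDIR}" && "$@")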
SCRIPT
chmod +x /usr/local/bin/ci-workspace.sh

# Register the runner (--token takes the runner authentication token created in the GitLab UI)
gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.corp.local/" \
  --token "YOUR_RUNNER_AUTHENTICATION_TOKEN" \
  --executor "shell" \
  --builds-dir "/srv/ci/workspaces"

Step 5: GitHub Actions self-hosted runner

# Create a dedicated user
useradd -m -d /srv/ci/runner github-runner

# Download and configure the runner
cd /srv/ci/runner
curl -o actions-runner-linux-x64.tar.gz -L \
  "https://github.com/actions/runner/releases/download/v2.321.0/actions-runner-linux-x64-2.321.0.tar.gz"
tar xzf actions-runner-linux-x64.tar.gz

# Configure with ZFS workspace
./config.sh --url https://github.com/YOUR_ORG --token YOUR_TOKEN \
  --work /srv/ci/workspaces

# Install and start as service
./svc.sh install github-runner
./svc.sh start

# Pre-build hook: snapshot the workspace
cat > /srv/ci/runner/pre-job.sh <<'HOOK'
#!/bin/bash
zfs snapshot rpool/srv/ci/workspaces@pre-$(date +%s)
HOOK

# Post-build hook: rollback
cat > /srv/ci/runner/post-job.sh <<'HOOK'
#!/bin/bash
LATEST=$(zfs list -t snapshot -o name -s creation -r rpool/srv/ci/workspaces | tail -1)
zfs rollback "${LATEST}"
zfs destroy "${LATEST}"
HOOK
chmod +x /srv/ci/runner/{pre,post}-job.sh

Every build starts from a known-good state. No stale node_modules, no leftover .pyc files, no "works on the runner but not locally" mysteries.
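The hook scripts only take effect once the runner knows about them. The GitHub Actions runner reads the ACTIONS_RUNNER_HOOK_JOB_STARTED and ACTIONS_RUNNER_HOOK_JOB_COMPLETED variables, which can be placed in the .env file in the runner directory. The sketch below writes them. The function name register_job_hooks is hypothetical, and it assumes the hooks live in the runner directory as created above.

```shell
# Hypothetical helper: point the runner at the pre-job and post-job hooks by
# appending to the .env file in the runner directory. Run it once; calling it
# again would append duplicate lines.
register_job_hooks() {
    runner_dir="$1"
    {
        echo "ACTIONS_RUNNER_HOOK_JOB_STARTED=$runner_dir/pre-job.sh"
        echo "ACTIONS_RUNNER_HOOK_JOB_COMPLETED=$runner_dir/post-job.sh"
    } >> "$runner_dir/.env"
}
```

After register_job_hooks /srv/ci/runner, restart the service (./svc.sh stop, then ./svc.sh start) so the runner picks up the new environment. The hooks run as the runner's service user, which therefore needs permission to run the zfs commands, for example through sudo or ZFS delegation. How that is granted is outside this sketch.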

Step 6: Build failure forensics

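# Build failed? Don't rollback yet. Snapshot the failure state.
zfs snapshot rpool/srv/ci/workspaces@failed-job-1234

# zfs rollback only returns to the most recent snapshot. With -r it destroys
# anything newer, including @failed-job-1234, so copy the failure state into
# its own dataset first. This copies data, so it is not instant.
zfs send rpool/srv/ci/workspaces@failed-job-1234 | zfs receive rpool/srv/ci/failed-job-1234

# Now rollback the workspace for the next build
zfs rollback -r rpool/srv/ci/workspaces@pre-job-1234

# Later, mount the failure snapshot read-only for investigation
mkdir -p /mnt/debug/job-1234
mount -t zfs -o ro rpool/srv/ci/failed-job-1234@failed-job-1234 /mnt/debug/job-1234

# Poke around: every file exactly as the build left it
ls /mnt/debug/job-1234/
cat /mnt/debug/job-1234/build.log

# Done investigating? Clean up
umount /mnt/debug/job-1234
zfs destroy -r rpool/srv/ci/failed-job-1234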
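Failure snapshots pile up if nobody remembers the clean-up step. A retention rule such as "keep the newest N" can be sketched as a filter that only prints the candidates for removal and destroys nothing. The function name older_than_newest and the retention count are hypothetical.

```shell
# Hypothetical filter: read names on stdin, oldest first, and print every
# name except the newest <keep> ones. It only lists candidates.
older_than_newest() {
    keep="$1"
    awk -v keep="$keep" '
        { lines[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print lines[i] }
    '
}
```

One way to feed it is zfs list -H -t snapshot -o name -s creation -r rpool/srv/ci | grep '@failed-job-' | older_than_newest 5. Review the printed names before passing any of them to zfs destroy.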

Why this matters

Reproducible builds

Every build starts from an identical filesystem state. No accumulated drift, no mystery failures from stale caches. Snapshot, build, rollback. Always clean.

Instant parallel workspaces

ZFS clones give each job a writable copy of the full workspace in well under a second, with almost no disk overhead until the copies diverge. Run 10 builds in parallel without 10x the storage.

Failure forensics

When a build breaks, snapshot the failure state before rolling back. Mount it read-only later and see exactly what went wrong. No more guessing from log files alone.

No reprovisioning

Traditional CI runners get rebuilt weekly or after failures. ZFS runners roll back in under a second. The runner is always available, always clean.
