Custom Postinstallers — from zero to 100 microVMs in seconds.
This is the advanced guide. We're going to build a complete deployment pipeline that starts
with a blank disk and ends with a running Kubernetes cluster — or 100 Firecracker microVMs —
or whatever you want. The secret is postinstall.sh: a hook that runs after
kldload finishes installing the base system. Everything after that is yours.
This isn't theory. This is a real production pattern used to deploy 15-node Kubernetes clusters with etcd, load balancers, control planes, workers, monitoring, and GitOps — all from sealed ISOs that work without internet. You can build the same thing.
What is a postinstaller?
The hook point
kldload installs the base system: kernel, ZFS, bootloader, tools. When it's done,
it looks for /root/darksite/postinstall.sh on the target system.
If it exists, it runs it. That's your entry point. Everything you put in that script
runs with root privileges on a freshly-installed system.
#!/bin/bash
# postinstall.sh — runs after kldload finishes the base install
# You have: root access, ZFS on root, network (if configured), all base packages
# You do: whatever you want
echo "My custom postinstaller is running!"
dnf install -y nginx
systemctl enable --now nginx
echo "<h1>Built by kldload</h1>" > /usr/share/nginx/html/index.html
# Signal completion
touch /root/.postinstall_done
poweroff
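One practical refinement: a bootstrap service can fire more than once, so it pays to guard the script with the same completion flag it writes. A minimal sketch, with the flag path taken from the example above (the `DONE_FLAG` override and `postinstall_main` name are illustrative, not part of kldload):

```shell
#!/bin/bash
# Idempotence guard for a postinstaller: do nothing if a previous
# run already dropped the completion flag.
postinstall_main() {
    local flag=${DONE_FLAG:-/root/.postinstall_done}
    if [ -e "$flag" ]; then
        echo "postinstall already completed, skipping"
        return 0
    fi
    # ... the real provisioning work goes here ...
    touch "$flag"
    echo "postinstall finished"
}
```

This also pairs naturally with a `ConditionPathExists=!/root/.postinstall_done` line on the systemd unit that invokes the script.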
Understanding chroots
What a chroot actually is
A chroot ("change root") makes a directory look like the root filesystem to a process.
When kldload installs to /target, it creates a complete Linux system there.
Then it does chroot /target to run commands inside that system
as if it were booted.
This is how kldload installs packages, builds DKMS modules, and rebuilds the initramfs without ever booting the target system. It's the same technique every Linux installer uses — from Debian's debootstrap to Arch's pacstrap to Red Hat's anaconda.
# The installer creates a complete system at /target
debootstrap trixie /target https://deb.debian.org/debian
# Mount system filesystems so chroot commands work
mount --bind /dev /target/dev
mount --bind /proc /target/proc
mount --bind /sys /target/sys
# Now run commands "inside" the target system
chroot /target apt-get install -y nginx
chroot /target systemctl enable nginx
# When done, unmount and the target is a complete, bootable system
umount /target/sys /target/proc /target/dev
The darksite pattern: baking everything in
What is a darksite?
A "darksite" is an air-gapped deployment — no internet, no upstream repos, no cloud APIs. Everything the system needs must be baked into the ISO or carried on the USB drive. This includes:
- APT/DNF packages — a complete local repository snapshot
- Container images — OCI tarballs loaded into containerd/Docker on first boot
- Ansible playbooks — the entire orchestration tree
- Helm charts — bundled for offline Kubernetes deployments
- TLS certificates — pre-generated PKI for etcd, API server, etc.
- WireGuard keys — hub keypairs for mesh networking
- Configuration files — per-node or per-role configs baked in
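For the package entries in that list, the postinstaller can point dnf at a repository carried inside the payload instead of the internet. A sketch of such a repo file (the repo id and payload path are assumptions; the repo metadata itself would be generated at ISO build time, e.g. with createrepo_c):

```ini
# /etc/yum.repos.d/darksite.repo — dropped into place by postinstall.sh
[darksite]
name=Darksite local repository
baseurl=file:///root/darksite/repo
enabled=1
gpgcheck=0
```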
Payload directory structure
payload/darksite/
├── postinstall.sh # Entry point — runs on first boot
├── apply.py # Cluster convergence orchestrator
├── cluster-seed/
│ └── peers.json # Node inventory (WG IPs, roles)
├── ansible/
│ ├── ansible.cfg # Ansible configuration
│ ├── site.yml # Main playbook (imports all roles)
│ ├── group_vars/ # Per-group configuration
│ ├── host_vars/ # Per-host configuration
│ ├── roles/ # Role implementations
│ │ ├── etcd_cluster/ # etcd setup + PKI
│ │ ├── k8s_pkgs/ # kubelet, kubeadm, kubectl
│ │ ├── kubeadm_init/ # Control plane initialization
│ │ ├── kubeadm_join_cp/ # Join additional control planes
│ │ ├── kubeadm_join_worker/ # Join workers
│ │ ├── lb_haproxy/ # Load balancer config
│ │ ├── prometheus_config/ # Monitoring
│ │ ├── helm/ # Helm 3 bootstrap
│ │ └── ingress_nginx/ # Ingress controller
│ └── artifacts/
│ ├── etcd-pki/ # Pre-generated etcd certificates
│ ├── join_cp.sh # Control plane join script
│ └── join_worker.sh # Worker join script
├── helm/
│ └── bootstrap.sh # Helm chart installation
└── systemd/
├── darksite-apply.service # Runs apply.py on first boot
└── darksite-wg-reflector.* # WireGuard peer sync
This entire tree gets embedded in the ISO. On first boot, postinstall.sh unpacks it
and the system bootstraps itself from the payload. No internet. No external dependencies.
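The schema of peers.json isn't fixed by the tool; a plausible shape for the inventory the orchestrator reads might be (all values illustrative):

```json
{
  "expected_count": 4,
  "nodes": [
    { "name": "master-01", "role": "control-plane", "wg_ip": "10.78.0.1" },
    { "name": "worker-01", "role": "worker", "wg_ip": "10.78.0.11" },
    { "name": "worker-02", "role": "worker", "wg_ip": "10.78.0.12" },
    { "name": "worker-03", "role": "worker", "wg_ip": "10.78.0.13" }
  ]
}
```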
Example: Building a Kubernetes cluster from postinstall.sh
Here's the real-world pattern. One master node, multiple workers. Each gets a customized ISO with a role-specific postinstaller. The master opens a WireGuard enrollment window, workers connect and register, then Ansible converges the cluster.
Step 1: The master postinstaller
postinstall-master.sh
#!/bin/bash
set -euo pipefail
# ── Phase 1: Base packages ──
dnf install -y python3 wireguard-tools nftables salt-master chrony
# ── Phase 2: WireGuard hub (star topology) ──
# Master is the center. Every worker connects back to us.
umask 077 # keep the WireGuard private key root-only
wg genkey | tee /etc/wireguard/wg1.key | wg pubkey > /etc/wireguard/wg1.pub
cat > /etc/wireguard/wg1.conf <<EOF
[Interface]
Address = 10.78.0.1/16
ListenPort = 51821
PrivateKey = $(cat /etc/wireguard/wg1.key)
# Peers added dynamically during enrollment
EOF
systemctl enable --now wg-quick@wg1
# ── Phase 3: Export hub metadata ──
# Workers fetch this to know how to reach us
cat > /srv/wg/hub.env <<EOF
HUB_LAN=$(hostname -I | awk '{print $1}')
WG1_PUB=$(cat /etc/wireguard/wg1.pub)
WG1_PORT=51821
WG1_NET=10.78.0.0/16
EOF
# ── Phase 4: Enrollment window ──
# Workers can add themselves as peers during this window
touch /srv/wg/ENROLL_ENABLED
# ── Phase 5: Salt master ──
systemctl enable --now salt-master
# ── Phase 6: Wait for workers, then converge ──
# This runs as a systemd service (darksite-apply.service)
# apply.py waits for all minions, then runs Ansible
touch /root/.postinstall_done
Step 2: The worker postinstaller
postinstall-worker.sh
#!/bin/bash
set -euo pipefail
# ── Phase 1: Base packages ──
dnf install -y wireguard-tools salt-minion prometheus-node-exporter
# ── Phase 2: Read hub metadata (baked into ISO) ──
# hub.env carries HUB_LAN, WG1_PUB, WG1_PORT, WG1_NET; MY_WG1_IP comes
# from the per-node seed written into this worker's ISO at build time
source /root/darksite/cluster-seed/hub.env
# ── Phase 3: WireGuard spoke (connect back to hub) ──
umask 077 # keep the WireGuard private key root-only
wg genkey | tee /etc/wireguard/wg1.key | wg pubkey > /etc/wireguard/wg1.pub
cat > /etc/wireguard/wg1.conf <<EOF
[Interface]
Address = ${MY_WG1_IP}/32
PrivateKey = $(cat /etc/wireguard/wg1.key)
[Peer]
PublicKey = ${WG1_PUB}
Endpoint = ${HUB_LAN}:${WG1_PORT}
AllowedIPs = ${WG1_NET}
PersistentKeepalive = 25
EOF
systemctl enable --now wg-quick@wg1
# ── Phase 4: Auto-enroll with hub ──
# SSH to master and register our WireGuard public key
ssh -o StrictHostKeyChecking=no -i /root/darksite/enroll_key \
root@${HUB_LAN} "wg-add-peer $(cat /etc/wireguard/wg1.pub) ${MY_WG1_IP} wg1"
# ── Phase 5: Salt minion (points to master) ──
echo "master: ${HUB_LAN}" > /etc/salt/minion.d/master.conf
systemctl enable --now salt-minion
touch /root/.postinstall_done
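The wg-add-peer helper the worker invokes over SSH isn't shown in the payload. A minimal sketch of what it might do on the master — the path overrides exist purely so the logic is testable, and a live version would also run `wg syncconf` so the peer takes effect without restarting the interface:

```shell
#!/bin/bash
# Hypothetical wg-add-peer: enroll a worker's public key as a peer,
# but only while the master's enrollment window is open.
wg_add_peer() {
    local pubkey=$1 wg_ip=$2 iface=${3:-wg1}
    local conf="${WG_CONF_DIR:-/etc/wireguard}/${iface}.conf"
    # Refuse enrollment once the master closes the window
    if [ ! -e "${ENROLL_FLAG:-/srv/wg/ENROLL_ENABLED}" ]; then
        echo "enrollment window closed" >&2
        return 1
    fi
    # Append the worker as a peer stanza
    cat >> "$conf" <<EOF

[Peer]
PublicKey = ${pubkey}
AllowedIPs = ${wg_ip}/32
EOF
    echo "enrolled ${wg_ip} on ${iface}"
}
```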
Step 3: The convergence orchestrator
apply.py — runs on the master after all workers are enrolled
# Simplified convergence flow (pseudocode):
# 1. Wait for all Salt minions to check in
while minion_count() < expected_count:
    run("salt '*' test.ping")
    sleep(3)
# 2. Push SSH keys for Ansible
run("salt '*' cmd.run 'mkdir -p /home/ansible/.ssh'")
# ... distribute ansible user's pubkey to all nodes
# 3. Run Ansible playbook
run("ansible-playbook /srv/ansible/site.yml")
# Ansible runs in order:
# 00_preflight.yml → verify connectivity
# 02_common.yml → kernel tuning, base packages
# 03_containerd.yml → container runtime
# 04_k8s_packages.yml → kubelet, kubeadm, kubectl
# 05_etcd_pki.yml → distribute pre-generated certs
# 06_etcd_cluster.yml → bootstrap 3-node etcd
# 07_loadbalancers.yml → HAProxy for API server VIP
# 08_cp_init.yml → kubeadm init (first control plane)
# 09_cp_join.yml → join remaining control planes
# 10_worker_join.yml → join workers
# 11_cilium.yml → CNI networking
# 12_monitoring.yml → Prometheus + Grafana
# 13_helm.yml → Helm 3
# 99_verify.yml → kubectl get nodes
The entire Ansible tree is baked into the ISO. No git clone. No downloading roles. The payload directory contains every playbook, role, template, and certificate. The cluster converges from local files.
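The wait loop is the only non-trivial control flow in that sketch. Pulled out as a testable shell helper — the `probe` command stands in for counting `salt '*' test.ping` responses, and all names here are illustrative:

```shell
#!/bin/bash
# wait_for_minions EXPECTED PROBE [INTERVAL] [TIMEOUT]
# Polls PROBE (a command that prints the current minion count) until it
# reports at least EXPECTED, or gives up after TIMEOUT seconds.
wait_for_minions() {
    local expected=$1 probe=$2 interval=${3:-3} timeout=${4:-600}
    local waited=0 count
    while :; do
        count=$("$probe")
        if [ "$count" -ge "$expected" ]; then
            echo "$count"
            return 0
        fi
        if [ "$waited" -ge "$timeout" ]; then
            echo "only ${count}/${expected} minions after ${timeout}s" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
}
```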
The two-poweroff pattern
Why the system powers off twice
Boot 1: ISO installer
├── Preseed-driven install (no prompts)
├── Late command copies darksite payload to /target
├── Enables bootstrap.service
└── POWEROFF ← first poweroff (installer done)
Boot 2: From disk (ISO ejected)
├── bootstrap.service runs postinstall.sh
├── Packages installed, WireGuard configured
├── Salt minion registered
└── POWEROFF ← second poweroff (postinstall done)
Boot 3: Production
├── All services running
├── WireGuard mesh active
├── Salt connected to master
└── Ready for Ansible convergence
This separation is deliberate. The first poweroff proves the base install worked. The second poweroff proves the postinstaller worked. The third boot is production. Each phase is independently verifiable. If any phase fails, you know exactly where.
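The glue between the first and second poweroff is the bootstrap unit the installer enables. The real unit file isn't shown here; a minimal sketch of what it might contain, using the completion flag the postinstall scripts already write:

```ini
# bootstrap.service — illustrative sketch; paths are assumptions
[Unit]
Description=Darksite first-boot postinstaller
After=network-online.target
Wants=network-online.target
ConditionPathExists=!/root/.postinstall_done

[Service]
Type=oneshot
ExecStart=/root/darksite/postinstall.sh
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
```

The ConditionPathExists guard means boot 3 silently skips the unit: the flag exists, so systemd never starts it again.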
Snapshot, clone, and replicate
The golden image pattern
Once you have a working system (post-postinstall), snapshot it. That snapshot becomes your golden image. Clone it for every new node. Each clone takes milliseconds and uses zero extra space.
# After postinstall completes, snapshot the golden state
zfs snapshot rpool/ROOT/kldload-node@golden
# Clone for each new node (instant, zero space)
zfs clone rpool/ROOT/kldload-node@golden rpool/ROOT/worker-01
zfs clone rpool/ROOT/kldload-node@golden rpool/ROOT/worker-02
zfs clone rpool/ROOT/kldload-node@golden rpool/ROOT/worker-03
# Or replicate to another machine
zfs send rpool/ROOT/kldload-node@golden | ssh kvm-host zfs recv tank/golden/worker
# Or receive a local copy under the VM tree for KVM use
# (a full send of a filesystem recreates a filesystem; start from a
# zvol golden image if the VM needs a block device)
zfs send rpool/ROOT/kldload-node@golden | zfs recv rpool/vms/worker-01
# Boot as a VM — instant deployment
Firecracker microVMs: 100 instances in seconds
From ZFS clone to microVM in 125ms
Firecracker is Amazon's microVM hypervisor. It can boot a minimal Linux kernel in around 125 milliseconds with less than 5 MiB of memory overhead per VM. Combined with ZFS clones, you can spray hundreds of isolated microVMs across a machine in seconds.
#!/bin/bash
# spray-microvms.sh — deploy 100 microVMs from a golden image
GOLDEN="rpool/vms/golden-microvm"
KERNEL="/srv/firecracker/vmlinux" # uncompressed ELF kernel — Firecracker on x86_64 cannot boot a compressed vmlinuz
COUNT=100
# Snapshot the golden image once
zfs snapshot "${GOLDEN}@base"
for i in $(seq 1 $COUNT); do
VM_NAME="micro-$(printf '%03d' $i)"
# Clone the golden image (instant, zero space)
zfs clone "${GOLDEN}@base" "rpool/vms/${VM_NAME}"
# Get the zvol device path
ROOTFS="/dev/zvol/rpool/vms/${VM_NAME}"
# Firecracker takes a JSON machine config, not rich CLI flags.
# Each VM also needs its own pre-created tap device (tap setup not shown).
cat > "/tmp/${VM_NAME}.json" <<EOF
{
  "boot-source": {
    "kernel_image_path": "${KERNEL}",
    "boot_args": "console=ttyS0 root=/dev/vda ro"
  },
  "drives": [{
    "drive_id": "rootfs",
    "path_on_host": "${ROOTFS}",
    "is_root_device": true,
    "is_read_only": false
  }],
  "machine-config": { "vcpu_count": 1, "mem_size_mib": 128 },
  "network-interfaces": [{
    "iface_id": "eth0",
    "guest_mac": "AA:FC:00:00:00:$(printf '%02x' $i)",
    "host_dev_name": "tap${i}"
  }]
}
EOF
# Launch the microVM from its config file
firecracker --no-api --config-file "/tmp/${VM_NAME}.json" &
echo "Launched ${VM_NAME}"
done
echo "Deployed ${COUNT} microVMs from golden image"
# Total time: ~15 seconds for 100 VMs
# Total extra disk: ~0 until VMs diverge (CoW)
Hardware as a Service: the cron job pattern
Sell your hardware by the hour
# crontab -e
# Customer A: 6am-2pm
0 6 * * * /usr/local/bin/deploy-customer-a.sh
0 14 * * * /usr/local/bin/teardown-customer-a.sh
# Customer B: 2pm-10pm
0 14 * * * /usr/local/bin/deploy-customer-b.sh
0 22 * * * /usr/local/bin/teardown-customer-b.sh
# Nightly maintenance: 10pm-6am
0 22 * * * /usr/local/bin/scrub-and-snapshot.sh
#!/bin/bash
# deploy-customer-a.sh
# Clone from golden image (instant)
for i in $(seq 1 20); do
zfs clone rpool/golden/customer-a@latest rpool/vms/cust-a-$(printf '%02d' $i)
done
# Boot all VMs
for vm in /dev/zvol/rpool/vms/cust-a-*; do
virsh start "$(basename $vm)" &
done
echo "Customer A environment live — 20 VMs deployed"
#!/bin/bash
# teardown-customer-a.sh
TS=$(date +%Y%m%d-%H%M)
# Snapshot each clone for billing/audit (zfs send the snapshots to an
# archive dataset if they must outlive the destroy below)
for ds in $(zfs list -H -o name | grep '^rpool/vms/cust-a-'); do
zfs snapshot "${ds}@teardown-${TS}"
done
# Destroy all VMs (instant — CoW means no disk cleanup)
for vm in $(virsh list --all --name | grep cust-a); do
virsh destroy "$vm" 2>/dev/null
virsh undefine "$vm" --nvram 2>/dev/null
done
# Destroy clones and their snapshots (instant — only divergent blocks freed)
for ds in $(zfs list -H -o name | grep '^rpool/vms/cust-a-'); do
zfs destroy -r "$ds"
done
echo "Customer A environment torn down"
Infrastructure as Code — baked in, not bolted on
The payload IS the infrastructure
Traditional IaC pulls code from git at deploy time. Darksite IaC bakes it into the artifact. The ISO contains the complete Ansible tree:
- Roles — etcd, containerd, kubeadm, haproxy, prometheus, helm, ingress
- Group vars — per-role configuration (k8s versions, network CIDRs, feature flags)
- Host vars — per-node IPs, roles, WireGuard addresses
- Artifacts — pre-generated PKI certificates, join scripts, helm charts
- Templates — Jinja2 templates for haproxy.cfg, prometheus.yml, etcd.conf, kubeadm-config
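A group_vars file in such a tree might pin the knobs Ansible needs, for example (all values illustrative, not taken from the real payload):

```yaml
# group_vars/all.yml — illustrative values only
kubernetes_version: "1.29.4"
pod_network_cidr: "10.244.0.0/16"
service_cidr: "10.96.0.0/12"
apiserver_vip: "10.78.0.100"
etcd_nodes:
  - 10.78.0.1
  - 10.78.0.2
  - 10.78.0.3
```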
Nothing is downloaded at deploy time. The playbook runs against local files. The certificates are pre-generated. The container images are pre-pulled. The artifact is the deployment.
The drop-off points
A postinstaller has natural "drop-off points" where you can stop and use the system as-is, or continue adding more layers. Each point is a valid, working system.
Want a plain configured server? Stop after the dnf install in postinstall.sh. Snapshot. Done. Want a cluster? Keep layering.
None of this is magic: postinstall.sh is bash. Ansible roles are YAML. Kubernetes is kubeadm init.
Firecracker is a single binary. ZFS clones are one command.
You can audit every step. You can modify every step. You can build every step yourself.
That's the point.