Serverless & MicroVMs — Firecracker on ZFS
Firecracker is the virtual machine monitor that powers AWS Lambda and Fargate. It is open source, it runs on KVM, and it boots a microVM in under 125 milliseconds. Each microVM is a real virtual machine with its own Linux kernel — not a container, not a namespace trick, not a shared-kernel sandbox. Hardware-level isolation at container speed.
The kldload KVM profile installs Firecracker, jailer, and firectl from the darksite automatically. This page shows you how to use them.
1. What Firecracker is
MicroVMs, not containers
A container shares the host kernel. A microVM boots its own kernel. Firecracker strips the virtual machine down to the absolute minimum — no BIOS, no PCI bus, no USB, no GPU passthrough. Just a kernel, a rootfs, a network interface, and a block device. That is why it boots in under 125ms and uses about 5MB of memory overhead per VM.
KVM underneath
Firecracker is a KVM virtual machine monitor written in Rust. It talks to /dev/kvm
directly. No QEMU, no libvirt, no management layer. The KVM profile in kldload enables
kvm_intel / kvm_amd and sets up the necessary device permissions. If your
hardware supports VT-x or AMD-V, Firecracker works.
The jailer
Firecracker ships with jailer — a wrapper that puts each microVM inside a
chroot with its own cgroup, UID, seccomp filter, and network namespace. Even if the guest kernel is
compromised, the attacker lands inside a cgroup jail with no access to the host filesystem, no
outbound network unless you explicitly route it, and no syscalls beyond a minimal whitelist.
2. Why ZFS makes it better
Every microVM needs a rootfs — a filesystem image it boots from. Without ZFS, you copy a multi-megabyte image for every VM. With ZFS, you clone it in milliseconds at near-zero space cost.
Snapshot before execution
Snapshot the rootfs dataset before the microVM boots. If the function mutates the filesystem,
roll back to the snapshot after execution. Every invocation starts from a clean state. No image rebuilds,
no re-downloads. One zfs rollback and you are back to byte-identical pristine.
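The full cycle is two commands. A sketch, using the golden dataset created later in this page (the @pre-invoke snapshot name is illustrative):

```shell
# Freeze the clean state before the VM boots
zfs snapshot rpool/firecracker/golden@pre-invoke
# ... boot the microVM, run the function, let it scribble on the filesystem ...
# One command back to byte-identical pristine (-r also discards newer snapshots)
zfs rollback -r rpool/firecracker/golden@pre-invoke
```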
Clone for parallel execution
zfs clone creates a writable copy of a snapshot in under a second regardless of the
dataset size. Need 50 VMs? Clone 50 datasets. They share every block with the original until a write
diverges. Disk usage is proportional to what each VM changes, not what it contains.
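A sketch of the fan-out (the clone names are illustrative; the @base snapshot is the one created in the setup below):

```shell
# 50 writable rootfs datasets from one snapshot; each clone is near-instant
for i in $(seq 1 50); do
    zfs clone rpool/firecracker/golden@base "rpool/firecracker/vm-${i}"
done
# USED stays near zero per clone until its VM starts writing
zfs list -r rpool/firecracker -o name,used,refer,origin
```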
Compress idle images
ZFS compression=zstd on the rootfs dataset means idle images take a fraction of their
raw size on disk. A 200MB Alpine rootfs compresses to roughly 70MB. The runtime cost is negligible:
zstd decompression is fast, and hot blocks are served straight from the ARC in RAM.
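You can check what compression is actually buying you on a live dataset:

```shell
# compressratio reports the achieved ratio; logicalused vs used shows the savings
zfs get compression,compressratio,logicalused,used rpool/firecracker
```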
Replicate golden images
zfs send / zfs recv replicates a rootfs snapshot to another node
in one command. Build the golden image once, ship it to every host. Incremental sends mean updates
transfer only the changed blocks.
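A sketch of the initial ship and an incremental update (node2 is a placeholder hostname):

```shell
# Initial replication: full stream of the golden snapshot
zfs send rpool/firecracker/golden@base | ssh node2 zfs recv rpool/firecracker/golden
# Update: snapshot the new state, send only the blocks changed since @base
zfs snapshot rpool/firecracker/golden@v2
zfs send -i @base rpool/firecracker/golden@v2 | ssh node2 zfs recv rpool/firecracker/golden
```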
3. Setup
The KVM profile handles the heavy lifting. kldload-firstboot calls setup_kvm(),
which installs firecracker, jailer, and firectl from the darksite
binaries in /root/darksite/binaries/. It also creates
/var/lib/firecracker/rootfs/ and /run/firecracker/.
Verify the install
# All three should return version info
firecracker --version
jailer --version
firectl --version
# Confirm KVM is available
ls -l /dev/kvm
# crw-rw---- 1 root kvm 10, 232 ... /dev/kvm
Create the rootfs dataset
# ZFS dataset for all microVM rootfs images
zfs create -o compression=zstd -o recordsize=64k \
-o mountpoint=/var/lib/firecracker/rootfs rpool/firecracker
# Create the golden Alpine rootfs
mkdir -p /tmp/alpine-rootfs
cd /tmp/alpine-rootfs
# Bootstrap a minimal Alpine root
curl -fsSL https://dl-cdn.alpinelinux.org/alpine/v3.20/releases/x86_64/alpine-minirootfs-3.20.0-x86_64.tar.gz \
| tar xzf -
# Add OpenRC init and a shell
chroot /tmp/alpine-rootfs /bin/sh -c '
apk add --no-cache openrc
ln -s agetty /etc/init.d/agetty.ttyS0
echo ttyS0 > /etc/securetty
rc-update add agetty.ttyS0 default
rc-update add devfs boot
rc-update add procfs boot
rc-update add sysfs boot
echo "root:firecracker" | chpasswd
'
# Create an ext4 rootfs image from the chroot
truncate -s 200M /var/lib/firecracker/rootfs/alpine.ext4
mkfs.ext4 -F /var/lib/firecracker/rootfs/alpine.ext4   # -F: skip the "not a block device" prompt
mount -o loop /var/lib/firecracker/rootfs/alpine.ext4 /mnt
cp -a /tmp/alpine-rootfs/* /mnt/
umount /mnt
rm -rf /tmp/alpine-rootfs
Get a kernel
Firecracker needs an uncompressed Linux kernel (vmlinux). You can extract one from the
host or download a prebuilt one.
# Option A: extract vmlinux from the host kernel
# (extract-vmlinux ships with the kernel source tree; the path varies by distro)
/usr/src/kernels/$(uname -r)/scripts/extract-vmlinux \
/boot/vmlinuz-$(uname -r) > /var/lib/firecracker/vmlinux
# Option B: use the Firecracker CI kernel (known-good, minimal)
curl -fsSL https://s3.amazonaws.com/spec.ccfc.min/ci-artifacts/kernels/x86_64/vmlinux-5.10.217 \
-o /var/lib/firecracker/vmlinux
Set up a tap device for networking
# Create a tap device for the microVM
ip tuntap add dev tap0 mode tap
ip addr add 172.16.0.1/24 dev tap0
ip link set tap0 up
# Enable NAT so the microVM can reach the outside
# (br0 is assumed to be the host's uplink interface; substitute yours, e.g. eth0)
iptables -t nat -A POSTROUTING -o br0 -j MASQUERADE
iptables -A FORWARD -i tap0 -o br0 -j ACCEPT
iptables -A FORWARD -i br0 -o tap0 -m state --state RELATED,ESTABLISHED -j ACCEPT
echo 1 > /proc/sys/net/ipv4/ip_forward
Launch a microVM
# Launch with firectl — the fastest way to get a VM running
firectl \
--kernel=/var/lib/firecracker/vmlinux \
--root-drive=/var/lib/firecracker/rootfs/alpine.ext4 \
--tap-device=tap0/aa:fc:00:00:00:01 \
--kernel-opts="console=ttyS0 reboot=k panic=1 pci=off ip=172.16.0.2::172.16.0.1:255.255.255.0::eth0:off"
# The VM boots in under 125ms. You get a serial console.
# Login: root / firecracker
4. Lambda-style function runner
This is the pattern that Lambda uses internally: create a microVM, run a function, capture the output, destroy the VM. Each invocation is completely isolated. The rootfs rolls back to the snapshot between invocations so every execution starts from byte-identical state.
The function runner script
#!/bin/bash
# /usr/local/bin/kfc-run — kldload Firecracker function runner
# Usage: kfc-run <function-name> [args...]
set -euo pipefail
FUNC_NAME="${1:?Usage: kfc-run <function-name> [args...]}"
shift
FUNC_ARGS="$*"
FC_ROOT="/var/lib/firecracker"
ROOTFS_BASE="${FC_ROOT}/rootfs"
KERNEL="${FC_ROOT}/vmlinux"
FUNC_DIR="/srv/functions/${FUNC_NAME}"
VM_ID="fc-$(date +%s%N | sha256sum | head -c 8)"
CLONE_DS="rpool/firecracker/${VM_ID}"
# ── Validate ──────────────────────────────────────────────────────────────
[[ -d "${FUNC_DIR}" ]] || { echo "ERROR: function dir not found: ${FUNC_DIR}"; exit 1; }
[[ -f "${FUNC_DIR}/handler.sh" ]] || { echo "ERROR: no handler.sh in ${FUNC_DIR}"; exit 1; }
# ── Clone the golden rootfs snapshot ──────────────────────────────────────
zfs clone rpool/firecracker/golden@base "${CLONE_DS}"
CLONE_MOUNT=$(zfs get -H -o value mountpoint "${CLONE_DS}")
# Copy the function into the clone
mkdir -p "${CLONE_MOUNT}/srv/function"
cp -a "${FUNC_DIR}/." "${CLONE_MOUNT}/srv/function/"
# Write the invocation script
cat > "${CLONE_MOUNT}/srv/function/invoke.sh" <<INVOKE
#!/bin/sh
cd /srv/function
exec /srv/function/handler.sh ${FUNC_ARGS} 2>&1
INVOKE
chmod +x "${CLONE_MOUNT}/srv/function/invoke.sh"
# Create the ext4 image from the clone
ROOTFS_IMG="/tmp/${VM_ID}.ext4"
truncate -s 200M "${ROOTFS_IMG}"
mkfs.ext4 -qF "${ROOTFS_IMG}"
mount -o loop "${ROOTFS_IMG}" /mnt
cp -a "${CLONE_MOUNT}/." /mnt/
umount /mnt
# ── Create tap device ─────────────────────────────────────────────────────
# NOTE: every run assigns 172.16.0.1/30, which is fine for one VM at a time;
# for parallel runs, give each tap its own subnet or network namespace (section 5)
TAP="tap-${VM_ID:0:8}"
ip tuntap add dev "${TAP}" mode tap
ip addr add 172.16.0.1/30 dev "${TAP}"
ip link set "${TAP}" up
# ── Launch the microVM ────────────────────────────────────────────────────
SOCKET="/tmp/${VM_ID}.sock"
# Start firecracker in the background and wait for its API socket to appear
firecracker --api-sock "${SOCKET}" &
FC_PID=$!
for _ in $(seq 1 50); do
    [[ -S "${SOCKET}" ]] && break
    sleep 0.1
done
# Configure the VM via the API
curl -s --unix-socket "${SOCKET}" -X PUT "http://localhost/boot-source" \
-H "Content-Type: application/json" \
-d "{
\"kernel_image_path\": \"${KERNEL}\",
\"boot_args\": \"console=ttyS0 reboot=k panic=1 pci=off init=/srv/function/invoke.sh ip=172.16.0.2::172.16.0.1:255.255.255.0::eth0:off\"
}"
curl -s --unix-socket "${SOCKET}" -X PUT "http://localhost/drives/rootfs" \
-H "Content-Type: application/json" \
-d "{
\"drive_id\": \"rootfs\",
\"path_on_host\": \"${ROOTFS_IMG}\",
\"is_root_device\": true,
\"is_read_only\": false
}"
curl -s --unix-socket "${SOCKET}" -X PUT "http://localhost/network-interfaces/eth0" \
-H "Content-Type: application/json" \
-d "{
\"iface_id\": \"eth0\",
\"guest_mac\": \"aa:fc:00:00:00:01\",
\"host_dev_name\": \"${TAP}\"
}"
curl -s --unix-socket "${SOCKET}" -X PUT "http://localhost/machine-config" \
-H "Content-Type: application/json" \
-d "{\"vcpu_count\": 1, \"mem_size_mib\": 128}"
# Start the VM
curl -s --unix-socket "${SOCKET}" -X PUT "http://localhost/actions" \
-H "Content-Type: application/json" \
-d "{\"action_type\": \"InstanceStart\"}"
# ── Wait for the function to finish (max 30 seconds) ─────────────────────
TIMEOUT=30
while kill -0 "${FC_PID}" 2>/dev/null && [[ ${TIMEOUT} -gt 0 ]]; do
sleep 1
((TIMEOUT--))
done
# ── Cleanup ───────────────────────────────────────────────────────────────
kill "${FC_PID}" 2>/dev/null || true
wait "${FC_PID}" 2>/dev/null || true
ip link del "${TAP}" 2>/dev/null || true
rm -f "${ROOTFS_IMG}" "${SOCKET}"
zfs destroy "${CLONE_DS}"
echo "--- ${FUNC_NAME} completed (VM: ${VM_ID}) ---"
Create the golden rootfs snapshot
# Create the base dataset and snapshot for cloning
zfs create -o compression=zstd -o recordsize=64k \
-o mountpoint=/var/lib/firecracker/rootfs/golden rpool/firecracker/golden
# Copy in your prepared Alpine rootfs (rebuild it as in the Setup section
# if /tmp/alpine-rootfs has already been cleaned up)
cp -a /tmp/alpine-rootfs/* /var/lib/firecracker/rootfs/golden/
# Snapshot it — this is the template every function clone uses
zfs snapshot rpool/firecracker/golden@base
# Every kfc-run invocation clones from this snapshot
# The clone is instant, regardless of rootfs size
Example: image resize function
# Create the function directory
mkdir -p /srv/functions/image-resize
# The handler — this runs inside the microVM
cat > /srv/functions/image-resize/handler.sh <<'EOF'
#!/bin/sh
# Resize an image to 800x600 — runs inside a disposable microVM
INPUT="/srv/function/input.jpg"
OUTPUT="/srv/function/output.jpg"
if [ ! -f "${INPUT}" ]; then
echo "ERROR: no input.jpg found"
exit 1
fi
# This example uses ffmpeg, so it must be present in the rootfs
# (bake it into the golden image: apk add --no-cache ffmpeg)
ffmpeg -i "${INPUT}" -vf scale=800:600 "${OUTPUT}" 2>/dev/null
echo "Resized to 800x600: $(stat -c%s "${OUTPUT}") bytes"
EOF
chmod +x /srv/functions/image-resize/handler.sh
# Copy an image into the function directory and run it
cp photo.jpg /srv/functions/image-resize/input.jpg
kfc-run image-resize
Example: API endpoint function
# A function that responds to an HTTP request
mkdir -p /srv/functions/api-hello
cat > /srv/functions/api-hello/handler.sh <<'EOF'
#!/bin/sh
# Minimal HTTP response — runs inside a disposable microVM
TIMESTAMP=$(date -Iseconds)
HOSTNAME=$(hostname)
cat <<RESPONSE
HTTP/1.1 200 OK
Content-Type: application/json
{"status":"ok","host":"${HOSTNAME}","time":"${TIMESTAMP}","message":"Hello from a microVM that booted just for you"}
RESPONSE
EOF
chmod +x /srv/functions/api-hello/handler.sh
kfc-run api-hello
5. Scaling — 100 microVMs on one host
Each Firecracker microVM uses about 5MB of host memory overhead beyond what you allocate to the guest. A VM with 128MB of guest RAM costs 133MB total. ZFS clones mean the rootfs disk cost per VM is effectively zero until the guest starts writing. On a 64GB host, you can run hundreds of concurrent microVMs.
Parallel launcher
#!/bin/bash
# /usr/local/bin/kfc-scale — launch N microVMs in parallel
# Usage: kfc-scale <count> <function-name>
COUNT="${1:?Usage: kfc-scale <count> <function-name>}"
FUNC="${2:?Usage: kfc-scale <count> <function-name>}"
PIDS=()
echo "Launching ${COUNT} microVMs running ${FUNC}..."
for i in $(seq 1 "${COUNT}"); do
kfc-run "${FUNC}" &
PIDS+=($!)
# Stagger launches slightly to avoid tap device collisions
sleep 0.05
done
echo "All ${COUNT} VMs launched. Waiting for completion..."
FAILED=0
for pid in "${PIDS[@]}"; do
wait "${pid}" || ((FAILED++))
done
echo "Completed: $((COUNT - FAILED)) succeeded, ${FAILED} failed"
Memory budget
128MB guest + 5MB overhead = 133MB per VM. A 64GB host with 8GB reserved for the host OS and ZFS ARC gives you ~420 concurrent microVMs. Reduce the guest RAM to 64MB for simple functions and you double that. The ARC is your friend here — it caches the shared rootfs blocks that every clone reads.
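The arithmetic is easy to script. A sketch; the 5MB overhead and the 8GB reservation are the estimates from this paragraph, not measured values:

```shell
#!/bin/sh
# Rough fleet-size estimate: RAM left after the host reservation, divided by
# per-VM cost (guest RAM + ~5MB Firecracker overhead). All figures in MB.
vm_capacity() {
    host_mb=$1; reserved_mb=$2; guest_mb=$3; overhead_mb=5
    echo $(( (host_mb - reserved_mb) / (guest_mb + overhead_mb) ))
}
vm_capacity 65536 8192 128    # 64GB host, 8GB reserved, 128MB guests -> 431
vm_capacity 65536 8192 64     # halve the guest RAM, roughly double the fleet -> 831
```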
Disk budget
ZFS clones share all unchanged blocks. If your golden rootfs is 200MB and each function writes 2MB of temp data, 100 VMs cost 200MB + (100 x 2MB) = 400MB total. Without ZFS, 100 copies of a 200MB image would cost 20GB. This is not a trick. This is how copy-on-write filesystems work.
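The same sums, scripted as a quick check (figures are the ones from the paragraph above):

```shell
#!/bin/sh
# Disk cost of a clone fleet: one golden image plus each VM's written delta (MB)
clone_fleet_mb() {
    golden_mb=$1; delta_mb=$2; vms=$3
    echo $(( golden_mb + vms * delta_mb ))
}
clone_fleet_mb 200 2 100      # ZFS clones: 200 + 100x2 = 400MB
echo $(( 100 * 200 ))         # naive full copies: 20000MB, i.e. 20GB
```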
Rate limiting with jailer
The jailer binary puts each microVM in its own cgroup. You can set CPU and memory limits
per VM so a runaway function cannot starve other VMs. Combined with seccomp filters, you get
defense-in-depth: the guest kernel is isolated, the process is cgroup-limited, and the syscalls are
filtered.
# Launch with jailer for production isolation
jailer --id "${VM_ID}" \
--exec-file /usr/local/bin/firecracker \
--uid 65534 --gid 65534 \
--chroot-base-dir /srv/jailer \
--cgroup-version 2 \
-- --api-sock /run/firecracker.sock
Network namespace per VM
Each microVM gets its own tap device and can be placed in its own network namespace. Outbound traffic routes through the host's bridge. VMs cannot see each other's traffic unless you explicitly route between their namespaces.
# Isolated network namespace per VM
ip netns add "ns-${VM_ID}"
ip tuntap add dev "tap-${VM_ID}" mode tap
ip link set "tap-${VM_ID}" netns "ns-${VM_ID}"
ip netns exec "ns-${VM_ID}" \
ip addr add 172.16.0.1/30 dev "tap-${VM_ID}"
ip netns exec "ns-${VM_ID}" \
ip link set "tap-${VM_ID}" up
6. Postinstaller integration
Bake the entire Firecracker function platform into a kldload postinstaller so it deploys automatically when you install a KVM-profile host.
Postinstaller script
#!/bin/bash
# /srv/postinstallers/firecracker-platform.sh
# Runs after kldload install — sets up the Firecracker function runtime
set -euo pipefail
LOG="/var/log/kldload-postinstall-firecracker.log"
exec >>"${LOG}" 2>&1
echo "=== Firecracker platform setup — $(date) ==="
# ── ZFS datasets ──────────────────────────────────────────────────────────
zfs create -o compression=zstd -o recordsize=64k \
-o mountpoint=/var/lib/firecracker rpool/firecracker
zfs create -o mountpoint=/var/lib/firecracker/rootfs/golden \
rpool/firecracker/golden
zfs create -p -o mountpoint=/srv/functions rpool/srv/functions   # -p creates rpool/srv if missing
# ── Build the golden Alpine rootfs ─────────────────────────────────────────
GOLDEN="/var/lib/firecracker/rootfs/golden"
curl -fsSL https://dl-cdn.alpinelinux.org/alpine/v3.20/releases/x86_64/alpine-minirootfs-3.20.0-x86_64.tar.gz \
| tar xzf - -C "${GOLDEN}"
chroot "${GOLDEN}" /bin/sh -c '
apk add --no-cache openrc util-linux
ln -s agetty /etc/init.d/agetty.ttyS0
echo ttyS0 > /etc/securetty
rc-update add agetty.ttyS0 default
rc-update add devfs boot
rc-update add procfs boot
rc-update add sysfs boot
echo "root:firecracker" | chpasswd
'
# Snapshot the golden rootfs — all clones derive from here
zfs snapshot rpool/firecracker/golden@base
# ── Download a known-good kernel ───────────────────────────────────────────
curl -fsSL https://s3.amazonaws.com/spec.ccfc.min/ci-artifacts/kernels/x86_64/vmlinux-5.10.217 \
-o /var/lib/firecracker/vmlinux
# ── Install the function runner scripts ────────────────────────────────────
install -m 0755 /srv/postinstallers/files/kfc-run /usr/local/bin/kfc-run
install -m 0755 /srv/postinstallers/files/kfc-scale /usr/local/bin/kfc-scale
# ── Create the bridge + NAT rules ──────────────────────────────────────────
cat > /etc/sysctl.d/99-firecracker.conf <<'SYSCTL'
net.ipv4.ip_forward = 1
SYSCTL
sysctl -p /etc/sysctl.d/99-firecracker.conf
# ── Snapshot the entire setup for recovery ────────────────────────────────
ksnap /var/lib/firecracker
echo "=== Firecracker platform ready — $(date) ==="
7. AI integration
The local AI assistant can manage the microVM fleet — launching functions, monitoring VM health, cleaning up stale clones, and reporting on resource usage. Give it a context script that feeds live Firecracker state into every query.
Firecracker context for the AI
#!/bin/bash
# /usr/local/bin/kai-firecracker — query the AI about the microVM fleet
build_fc_context() {
echo "=== FIRECRACKER STATE ($(date -Iseconds)) ==="
echo -e "\n--- Running microVMs ---"
ps aux | grep '[f]irecracker' | awk '{print $2, $11, $12}'
echo -e "\n--- ZFS clones (active VMs) ---"
zfs list -r rpool/firecracker -o name,used,refer,origin 2>/dev/null
echo -e "\n--- Golden rootfs snapshots ---"
zfs list -t snapshot -r rpool/firecracker/golden \
-o name,used,creation 2>/dev/null
echo -e "\n--- Function definitions ---"
ls -la /srv/functions/ 2>/dev/null
echo -e "\n--- Tap devices ---"
ip link show type tun 2>/dev/null
echo -e "\n--- Memory pressure ---"
free -h
echo ""
grep -E 'MemTotal|MemAvail|Committed_AS' /proc/meminfo
echo -e "\n--- Jailer cgroups ---"
find /sys/fs/cgroup -name "fc-*" -type d 2>/dev/null | head -20
}
QUESTION="$*"
if [ -z "$QUESTION" ]; then
echo "Usage: kai-firecracker <question>"
echo ""
echo "Examples:"
echo " kai-firecracker 'how many VMs are running?'"
echo " kai-firecracker 'clean up stale clones'"
echo " kai-firecracker 'can I launch 50 more VMs?'"
echo " kai-firecracker 'which functions ran today?'"
exit 1
fi
CONTEXT=$(build_fc_context)
echo -e "${CONTEXT}\n\n=== QUESTION ===\n${QUESTION}" | \
ollama run kldload-admin
"How many more VMs can I run?"
The AI reads free -h, counts running VMs, calculates per-VM overhead, and tells you
exactly how many 128MB VMs fit in the remaining memory. It factors in the ZFS ARC reservation.
kai-firecracker "how many more 128MB VMs can I launch?"
"Clean up stale clones"
The AI lists ZFS clones under rpool/firecracker, cross-references with running
firecracker processes, and identifies orphaned clones from crashed VMs. It gives you
the exact zfs destroy commands.
kai-firecracker "find and destroy any orphaned VM clones"
Firecracker gives you the isolation of VMs at the speed of containers. ZFS gives you instant clones, snapshots, compression, and replication underneath. The combination is a serverless runtime that runs on your hardware, boots in milliseconds, and leaves no trace when the function is done.
No orchestrator. No control plane. No billing API. Just a script that clones a dataset, boots a kernel, runs your code, and destroys the VM. That is serverless without the server bill.