kldload — your platform, your way, free

Proxmox & kldload

Both run on bare metal. Both support KVM/QEMU. Both have ZFS. Both target homelabs and self-hosted infrastructure. But they're solving different problems — and they can work together.

The honest take: If you want to spin up VMs through a polished GUI with minimal expertise, Proxmox is mature and it works. Use it.

If you want a proper foundation — ZFS all the way down, kernel-level observability, direct GPU access, no vendor costs, fully auditable — kldload on bare metal with the kvm profile gives you a hypervisor with instant ZFS clones, automated snapshots, and DR replication that Proxmox can't match.

Or run both: Proxmox as the management layer, kldload VMs with disk passthrough for workloads that need real ZFS.

This page is honest about both tools. Proxmox has a better GUI. kldload has better ZFS integration. The right choice depends on what you value. If your answer is "I want to click buttons and create VMs," Proxmox wins. If your answer is "I want ZFS zvol clones, per-VM snapshots, automated replication, eBPF observability, and 8 distro choices," kldload's kvm profile wins. If your answer is "both," this page shows you how to combine them without the double-ZFS penalty.

What they are

Proxmox VE

A hypervisor platform. Its entire identity is managing VMs and LXC containers through a web UI. The OS underneath is Debian, heavily modified and locked to Proxmox's tooling. ZFS is a storage option — not the foundation.

Product. Manages your VMs.

kldload

A base image factory and operating system. You pick your distro (8 options), root it in ZFS, and select a profile: desktop, server, kvm, storage, ai, or core. The kvm profile gives you a full hypervisor with ZFS zvols, instant clones, automated hourly snapshots, DR replication, and CLI tools (kvm-create, kvm-clone, kvm-snap, kvm-replicate). You own the entire stack.

Foundation. ZFS-native hypervisor. You control everything.

The comparison

| | Proxmox VE | kldload (kvm profile) |
|---|---|---|
| ZFS | Storage option. Bolted on. Boot environments not native. | The substrate. Boot environments, zvols, snapshots before every change. |
| VM cloning | Linked clone (qcow2 backing file chain). | kvm-clone: ZFS zvol clone, instant, no chain, no dependency. |
| VM snapshots | qcow2 snapshot chain (degrades with depth). | kvm-snap: ZFS snapshot, atomic, unlimited, zero overhead. |
| VM replication | Proxmox replication (cluster required). | kvm-replicate: incremental zfs send to any host over WireGuard. |
| Cost | Free tier exists. Enterprise ~€110/yr/socket. Nag screen. | BSD-3-Clause. Zero. Forever. No nag screen, no phone home. |
| Distro | Debian only (modified). | 8 distros. CentOS, Debian, Ubuntu, Fedora, RHEL, Rocky, Arch, Alpine. |
| GPU | PCI passthrough. IOMMU pain, single-VM binding. | Direct access. Drivers on first boot. Container GPU sharing. |
| Observability | No native eBPF. Need external tools. | eBPF + bcc + bpftrace on first boot. Kernel-level. No SaaS. |
| WireGuard | Supported. Manual setup. | Kernel module. 4-plane mesh. Backplane architecture documented. |
| Offline | Needs internet for most operations. | Full package mirrors baked in. No internet required. |
| Web UI | Excellent. Mature. The best thing about Proxmox. | Installer UI. CLI is the centerpiece: kvm-create/clone/snap/replicate. |

The comparison isn't "one is better." It's different philosophies. Proxmox is a product — install it, click the GUI, manage VMs. kldload's kvm profile is an OS template — install it, get ZFS-native tools, manage VMs from the terminal. Proxmox's replication requires a Proxmox cluster (multiple Proxmox nodes). kldload's replication is zfs send to any host that speaks ZFS — it doesn't need to be running kldload, doesn't need a cluster, doesn't need any coordination. The VM data is just ZFS — it goes wherever ZFS goes.

The recommendation

Three paths, pick one:

Path 1: kldload bare metal with the kvm profile. ZFS talks directly to disks. Best performance. Full kvm-create/clone/snap/replicate tools. This is what the KVM tutorial teaches.

Path 2: Proxmox host with PCI disk passthrough to kldload VMs. Guest ZFS talks to real hardware. No double-ZFS penalty.

Path 3: Proxmox host with kldload VMs on virtio-scsi (non-ZFS storage). No double-CoW. Guest gets ZFS features, host uses LVM-thin.

Why double-ZFS hurts

ARC cache thrashing — both pools cache the same blocks. RAM burned twice.

Write amplification — CoW twice for every write. Double I/O, double latency.

Unpredictable performance — two pools competing for disk I/O, flushing independently.

Confusing diagnostics: zpool iostat in the guest doesn't reflect reality.

What works

Option 1: kldload on bare metal with KVM. ZFS talks directly to the disks. Best performance.

Option 2: Proxmox host with PCI passthrough. Guest's ZFS talks to real hardware.

Option 3: Proxmox host with virtio-scsi on non-ZFS storage (LVM-thin, ext4). No double-CoW.


The double-ZFS problem is the single most common mistake people make when running kldload on Proxmox. Proxmox uses ZFS for its storage backend. kldload installs ZFS on root inside the VM. Now you have two ZFS instances on the same physical blocks — two ARC caches eating RAM, two copy-on-write layers doubling writes, two compression engines wasting CPU on already-compressed data. It works, but the overhead is real and measurable. This section explains exactly what happens and how to avoid it.

The double-ZFS problem

┌─────────────────────────────────────────────────┐
│  Proxmox host                                   │
│  zpool: rpool (ZFS on physical NVMe)            │
│    └── vm-100-disk-0.qcow2                      │
│         ┌──────────────────────────────────┐     │
│         │  kldload VM                       │     │
│         │  zpool: rpool (ZFS on /dev/vda)   │     │
│         │    ├── ROOT/default               │     │
│         │    ├── home                       │     │
│         │    └── srv                        │     │
│         └──────────────────────────────────┘     │
└─────────────────────────────────────────────────┘

Two ARC caches: Both ZFS layers maintain their own ARC (Adaptive Replacement Cache). The host caches the VM’s virtual disk blocks, and the guest caches its own filesystem blocks. This means the same data can sit in memory twice.

Double compression: If both layers use lz4, the guest compresses data, then the host tries to compress the already-compressed blocks (gaining nothing but burning CPU).

Double checksumming: Both layers verify block integrity. Redundant for data correctness, costs CPU.

Write amplification: The guest writes to its pool, which generates I/O to the virtual disk. The host’s pool then processes that I/O, potentially amplifying the number of physical writes due to COW on both layers.
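
One way to observe the duplicated cache is to read the ARC size on both layers and compare. The sketch below assumes OpenZFS on Linux, where ARC statistics are exposed in /proc/spl/kstat/zfs/arcstats as "name type data" lines. The helper name arc_size_mib is hypothetical and is not part of kldload or Proxmox.

```shell
#!/bin/sh
# arc_size_mib: hypothetical helper, not a kldload or Proxmox tool.
# Prints the current ARC size in MiB, read from an arcstats-format file.
# Assumes the OpenZFS-on-Linux kstat layout: "name type data" per line.
arc_size_mib() {
    stats_file="${1:-/proc/spl/kstat/zfs/arcstats}"
    # Match the "size" row exactly so rows such as "compressed_size" are ignored.
    awk '$1 == "size" { printf "%d\n", $3 / 1048576 }' "$stats_file"
}

# Run once on the Proxmox host and once inside the guest, then compare:
#   arc_size_mib
```

If both numbers are large while the guest is the only busy workload, the same blocks are probably being cached twice.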


When double-ZFS is fine

  • Development and testing — you want the real kldload ZFS experience in a disposable VM
  • Small workloads — the overhead is negligible when VM I/O is low
  • You need kldload’s ZFS features — boot environments, snapshots, ksnap, kbe, kupgrade — these only work with ZFS on root

When to avoid double-ZFS

  • Production storage nodes — if the VM is an NFS/iSCSI server or runs databases with heavy I/O
  • Memory-constrained hosts — two ARCs eating RAM is wasteful
  • High-throughput workloads — write amplification hurts

Alternative: bare-metal kldload + KVM

For production, install kldload directly on the hardware (bare metal) and run VMs on top using KVM/libvirt. You get one ZFS layer with full performance:

┌──────────────────────────────────────────────────┐
│  kldload bare metal (kvm profile)                │
│  zpool: rpool                                    │
│    ├── ROOT/default (host OS)                    │
│    ├── vms/web-1    → zvol → /dev/zvol/rpool/vms/web-1   │
│    ├── vms/web-2    → zvol → /dev/zvol/rpool/vms/web-2   │
│    ├── vms/db-1     → zvol → /dev/zvol/rpool/vms/db-1    │
│    └── vms/isos     → dataset (zstd compressed)  │
│                                                  │
│  kvm-create db-1 --ram 8192 --disk 200           │
│  kvm-clone db-1 db-test     (instant, zero-copy) │
│  kvm-snap db-1              (atomic ZFS snapshot) │
│  kvm-replicate rpool/vms/db-1 dr-host            │
└──────────────────────────────────────────────────┘

VMs use zvols — raw block devices on ZFS. No qcow2 layer. The host’s ZFS handles compression, snapshots, clones, and replication at the block level. VMs inside don’t need their own ZFS — the host provides all the ZFS benefits transparently.

This is the architecture the KVM tutorial teaches in detail. The kvm profile creates the datasets, tunes the ARC, installs the tools, and enables the snapshot timer. You select it during install and get a production hypervisor. The key advantage over Proxmox: every VM operation (clone, snapshot, replicate, rollback) is a ZFS operation on a zvol — no qcow2 chains, no Proxmox cluster requirement, no GUI dependency. It’s zfs send to any host, zfs rollback in seconds, zfs clone that’s instant regardless of disk size.
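
The source does not show what the kvm-* tools run internally, so the following is only a sketch of the plain ZFS operations the paragraph above names (snapshot, clone, incremental send). The functions print commands instead of executing them, so the plan can be reviewed first. The function names, snapshot names, and the dr-host target are illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical dry-run helpers: they only print ZFS commands.
# This is not the kvm-clone or kvm-replicate implementation.

# plan_clone SRC_ZVOL NEW_ZVOL SNAP_NAME
# A ZFS clone is always created from a snapshot, so take one first.
plan_clone() {
    printf 'zfs snapshot %s@%s\n' "$1" "$3"
    printf 'zfs clone %s@%s %s\n' "$1" "$3" "$2"
}

# plan_replicate ZVOL PREV_SNAP NEW_SNAP HOST
# Incremental send: only blocks changed between the two snapshots are sent.
plan_replicate() {
    printf 'zfs snapshot %s@%s\n' "$1" "$3"
    printf 'zfs send -i %s@%s %s@%s | ssh %s zfs receive -F %s\n' \
        "$1" "$2" "$1" "$3" "$4" "$1"
}

plan_clone rpool/vms/db-1 rpool/vms/db-test base
plan_replicate rpool/vms/db-1 hourly-01 hourly-02 dr-host
```

An incremental send only works if the previous snapshot (hourly-01 in this example) already exists on the receiving host, so the first replication has to be a full send.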

If you're running kldload VMs on a Proxmox host with ZFS storage, these tuning tips reduce the double-ZFS penalty from painful to tolerable. You can't eliminate it entirely — two CoW layers will always cost more than one — but you can minimize the damage. The single biggest win: disable compression on one layer so you're not burning CPU compressing already-compressed data.

If you do run on Proxmox: tuning tips

Disable compression on one layer

# Option A: disable on the Proxmox side (let kldload handle it)
# On the Proxmox host:
zfs set compression=off rpool/data/vm-100-disk-0

# Option B: disable on the kldload guest side (let Proxmox handle it)
# Inside the kldload VM:
zfs set compression=off rpool

Option A is usually better — let the guest control its own compression settings per-dataset.

Cap the guest ARC

Inside the kldload VM, reduce the ARC so the host has more memory for its own ARC:

# Inside the kldload VM: limit ARC to 1 GiB (the value is in bytes)
echo "options zfs zfs_arc_max=1073741824" > /etc/modprobe.d/zfs-arc.conf

# Rebuild the initramfs with the command for your distro, then reboot
dracut --force          # CentOS/RHEL
update-initramfs -u     # Debian
reboot
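
The zfs_arc_max value is given in bytes, which is easy to mistype. Here is a small sketch that derives the byte count from a size in GiB and prints the matching modprobe line. The helper names are hypothetical. The runtime path in the closing comment is the standard OpenZFS-on-Linux module parameter.

```shell
#!/bin/sh
# Hypothetical helpers for computing the ARC cap; not kldload tools.

# gib_to_bytes N: convert whole GiB to bytes (1 GiB = 1024^3 bytes).
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# arc_conf_line N: print the modprobe option line for an N GiB ARC cap.
arc_conf_line() {
    printf 'options zfs zfs_arc_max=%s\n' "$(gib_to_bytes "$1")"
}

arc_conf_line 1

# To apply a cap without rebooting (as root), the same byte count can be
# written to the live module parameter:
#   gib_to_bytes 1 > /sys/module/zfs/parameters/zfs_arc_max
```

The runtime write does not persist across reboots, so the modprobe file is still needed for a permanent cap.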

Use virtio-scsi with discard

In Proxmox VM settings:

  • Bus: SCSI (virtio-scsi-single)
  • Discard: On (enables TRIM passthrough)
  • IO Thread: On

# Proxmox CLI
qm set 100 --scsi0 local-zfs:vm-100-disk-0,discard=on,iothread=1,ssd=1

Inside the kldload VM, enable autotrim:

zpool set autotrim=on rpool

This lets the guest’s ZFS tell the host “I’m not using these blocks anymore,” which the host’s ZFS can then free.
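
To confirm the disk options took effect, the VM configuration can be inspected on the Proxmox host. The sketch below reads `qm config <vmid>` output on stdin and checks the scsi0 line. The helper name check_disk_opts is hypothetical, and the check assumes the disk is attached as scsi0, as in the command above.

```shell
#!/bin/sh
# check_disk_opts: hypothetical helper that reads `qm config <vmid>` output
# on stdin and reports whether the scsi0 line carries discard and iothread.
check_disk_opts() {
    line=$(grep '^scsi0:' || true)
    for opt in discard=on iothread=1; do
        # Wrap the line in commas so an option at the end still matches.
        case ",$line," in
            *",$opt,"*) echo "ok: $opt" ;;
            *)          echo "missing: $opt" ;;
        esac
    done
}

# On the Proxmox host:
#   qm config 100 | check_disk_opts
# Inside the guest, the pool side can be checked with:
#   zpool get autotrim rpool
```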

Allocate fixed RAM (no ballooning)

Ballooning and ZFS ARC don’t mix well — the ARC doesn’t shrink gracefully when the balloon inflates:

# Proxmox CLI — fixed 8GB, no balloon
qm set 100 --memory 8192 --balloon 0

Use raw disk format instead of qcow2

On Proxmox with ZFS storage, raw format avoids the qcow2 layer:

# Allocate a new 40 GB raw disk on ZFS storage for existing VM 100
qm set 100 --scsi0 local-zfs:40,format=raw

Proxmox’s ZFS storage backend uses zvols for raw disks — this means the guest’s ZFS writes go through a zvol, which is more efficient than going through a qcow2 file on top of a ZFS dataset.


ZFS special devices (Proxmox host)

If your Proxmox host has NVMe SSDs, consider using ZFS special vdevs to accelerate metadata operations:

# On the Proxmox host — add a special vdev (metadata + small blocks)
zpool add rpool special mirror /dev/nvme0n1p4 /dev/nvme1n1p4

# Set small block threshold (blocks ≤64K go to the special vdev)
zfs set special_small_blocks=64K rpool

This accelerates ls, find, zfs list, and any metadata-heavy operation, which benefits VMs because their virtual disk metadata is served from fast storage.


# Create the VM
qm create 100 \
  --name kldload-node \
  --machine q35 \
  --cpu host \
  --cores 4 \
  --memory 8192 \
  --balloon 0 \
  --bios ovmf \
  --efidisk0 local-zfs:1,efitype=4m,pre-enrolled-keys=0 \
  --tpmstate0 local-zfs:1,version=v2.0 \
  --scsi0 local-zfs:40,discard=on,iothread=1,ssd=1 \
  --scsihw virtio-scsi-single \
  --ide2 local:iso/kldload-free-latest.iso,media=cdrom \
  --net0 virtio,bridge=vmbr0 \
  --serial0 socket \
  --boot order="ide2;scsi0" \
  --ostype l26

These settings match the kldload deploy.sh proxmox-deploy defaults: q35 machine, host CPU passthrough, OVMF UEFI, TPM 2.0, virtio-scsi with discard, serial console.


Migrating from Proxmox VM to bare metal

This is the graduation path. You started with kldload on Proxmox because it was easy to test. Now you want full performance — one ZFS layer, direct disk access, the kvm profile tools. The migration is straightforward: export the VM image or zfs send the pool to bare metal, reinstall the bootloader, reboot. Your data, your ZFS snapshots, your entire configuration — all preserved. You're not reinstalling, you're relocating.

If you outgrow the double-ZFS setup:

# Inside the Proxmox kldload VM
kexport raw

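# Write the raw image to the bare-metal disk (this overwrites /dev/nvme0n1)
# Do not add conv=sparse here: on a block device it skips all-zero blocks and leaves stale data in place
dd if=kldload-export-*.raw of=/dev/nvme0n1 bs=4M status=progress oflag=sync
sync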

Or use zfs send/receive:

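# Inside the VM: snapshot the whole pool, then stream it over SSH to the bare-metal host
# (the target must already have a pool named rpool to receive into)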
zfs snapshot -r rpool@migrate
zfs send -R rpool@migrate | ssh bare-metal zfs receive -F rpool
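
Before touching the bootloader, it may be worth confirming that every snapshot arrived. This sketch compares two lists of snapshot names, one from each side. The helper name missing_snapshots and the /tmp file paths are hypothetical, and bare-metal is the SSH host name used in the command above.

```shell
#!/bin/sh
# missing_snapshots SRC_LIST DST_LIST
# Hypothetical helper: prints snapshot names present in SRC_LIST but absent
# from DST_LIST. Each file holds one snapshot name per line, for example from
#   zfs list -H -t snapshot -o name -r rpool
missing_snapshots() {
    src_sorted=$(mktemp)
    dst_sorted=$(mktemp)
    sort "$1" > "$src_sorted"
    sort "$2" > "$dst_sorted"
    # comm -23 keeps only lines unique to the first (source) file.
    comm -23 "$src_sorted" "$dst_sorted"
    rm -f "$src_sorted" "$dst_sorted"
}

# Example, run inside the VM:
#   zfs list -H -t snapshot -o name -r rpool > /tmp/src.txt
#   ssh bare-metal zfs list -H -t snapshot -o name -r rpool > /tmp/dst.txt
#   missing_snapshots /tmp/src.txt /tmp/dst.txt
```

Empty output means every source snapshot also exists on the target.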

Then reinstall the bootloader on the bare-metal disk:

# Boot from kldload ISO on bare metal
krecovery import rpool
krecovery reinstall-bootloader /dev/nvme0n1
reboot