Proxmox Performance Tuning — stop blaming ZFS, start tuning it.
Proxmox ships with ZFS support out of the box, but the defaults are not optimized for VM workloads. People install Proxmox, create VMs on ZFS, see terrible performance, and blame ZFS. The problem isn't ZFS. The problem is that nobody tuned it.
The 8K amplification problem
Why Proxmox VMs feel slow on default ZFS
ZFS datasets default to a 128K recordsize, while VM disk I/O happens in 4K-8K blocks. When a VM disk image lives as a file on a dataset and the guest writes 8K, ZFS has to:
- Read the full 128K record that contains the 8K block
- Decompress it (if compression is on)
- Modify the 8K portion
- Recompress the full 128K
- Write the new 128K record to a new location (CoW)
That's 16x write amplification for every VM I/O operation. Your VMs aren't slow because ZFS is slow. They're slow because ZFS is reading and writing 128K to change 8K.
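The amplification factor is just ZFS block size divided by guest write size — a quick sanity check:

```shell
# Write amplification ≈ ZFS block size / guest write size (KiB)
guest_io_k=8        # typical VM write
recordsize_k=128    # dataset default
volblocksize_k=16   # tuned zvol (see Fix 1 below)
echo "dataset: $(( recordsize_k / guest_io_k ))x"   # dataset: 16x
echo "zvol:    $(( volblocksize_k / guest_io_k ))x" # zvol:    2x
```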
The fixes
Fix 1: Use zvols with correct volblocksize
# DON'T store VM disks as files on a plain ZFS dataset (128K records):
# zfs create rpool/data/vm-100-disk-0
# DO this — a sparse zvol (-s) with 16K block size
zfs create -V 40G -s \
-o volblocksize=16K \
-o compression=lz4 \
rpool/data/vm-100-disk-0
# For database VMs, use 8K
zfs create -V 40G -s \
-o volblocksize=8K \
rpool/data/vm-100-disk-0
16K volblocksize = 2x amplification instead of 16x. That's an 8x improvement from changing one number.
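You don't have to create every zvol by hand. Proxmox's zfspool storage plugin creates zvols automatically and takes a `blocksize` option that applies to every disk it creates from then on — a storage.cfg fragment for illustration (the storage name `local-zfs` and pool path are assumptions; check your own `/etc/pve/storage.cfg`):

```
# /etc/pve/storage.cfg — zfspool entry (illustrative)
zfspool: local-zfs
        pool rpool/data
        content images,rootdir
        blocksize 16k
```

Existing disks keep their volblocksize — it is fixed at creation time, so already-created VM disks must be recreated (e.g., via move-disk) to pick up the new value.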
Fix 2: Tune ARC for VM workloads
# Recent Proxmox installers cap ARC at 10% of RAM (16GB max)
# Set ARC to use up to 50% of RAM (e.g., 16GB on a 32GB host)
echo "options zfs zfs_arc_max=17179869184" > /etc/modprobe.d/zfs.conf
# Set minimum ARC (don't let the kernel starve ZFS)
echo "options zfs zfs_arc_min=4294967296" >> /etc/modprobe.d/zfs.conf
# Apply without reboot
echo 17179869184 > /sys/module/zfs/parameters/zfs_arc_max
echo 4294967296 > /sys/module/zfs/parameters/zfs_arc_min
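Those magic numbers are just bytes. Deriving them for your own host avoids copy-paste mistakes (a 32 GiB host is assumed here):

```shell
ram_gib=32                              # adjust for your host
arc_max=$(( ram_gib * 1024 ** 3 / 2 ))  # 50% of RAM, in bytes
arc_min=$(( 4 * 1024 ** 3 ))            # 4 GiB floor
echo "zfs_arc_max=$arc_max"             # zfs_arc_max=17179869184
echo "zfs_arc_min=$arc_min"             # zfs_arc_min=4294967296
```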
Fix 3: Add a SLOG for sync writes
# VMs issue sync writes for data integrity
# Without a SLOG, every sync write commits to the in-pool ZIL on spinning rust
# Add an enterprise NVMe as SLOG — it MUST have power-loss protection,
# and a mirrored log vdev is safer than a single device
zpool add rpool log /dev/nvme1n1
# Verify
zpool status rpool | grep log
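To see why this matters, compare rough, illustrative sync-commit latencies — roughly 10 ms for a 7.2K HDD seek-plus-write versus roughly 20 µs for a power-loss-protected enterprise NVMe (both numbers are ballpark assumptions, not measurements):

```shell
hdd_us=10000   # ~10 ms sync commit on a 7.2K HDD (illustrative)
nvme_us=20     # ~20 µs on enterprise NVMe with PLP (illustrative)
echo "$(( hdd_us / nvme_us ))x lower sync-write latency"   # 500x
```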
Fix 4: Add a special vdev for metadata
# Metadata operations (directory listings, file lookups) are slow on HDDs
# A mirrored SSD special vdev stores metadata on fast storage
zpool add rpool special mirror /dev/sda /dev/sdb
# Blocks at or below this threshold are stored on the special vdev
# Caution: zvols inherit this property too — with 16K volblocksize, a
# 64K threshold sends ALL VM data to the special vdev. Either size the
# special vdev for that, or pick a threshold below your volblocksize
zfs set special_small_blocks=64K rpool
Fix 5: Use mirrors, not RAIDZ, for VMs
This is the most common mistake on Proxmox. RAIDZ has terrible random-write performance: every block spans all disks in the vdev, so a RAIDZ vdev delivers roughly the random IOPS of a single disk. VMs generate random I/O. Mirrors scale with vdev count — each mirror pair serves requests independently.
# BAD for VMs:
# zpool create rpool raidz2 /dev/sd{a,b,c,d,e,f}
# GOOD for VMs:
zpool create rpool \
mirror /dev/sda /dev/sdb \
mirror /dev/sdc /dev/sdd \
mirror /dev/sde /dev/sdf
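Back-of-the-envelope random-write IOPS for the two six-disk layouts above, assuming ~150 IOPS per 7.2K HDD (an illustrative figure, not a benchmark):

```shell
disk_iops=150   # rough random IOPS of one 7.2K HDD (assumption)
# A RAIDZ2 vdev writes every block across all member disks,
# so the whole 6-disk vdev delivers roughly one disk of IOPS:
echo "raidz2:  $(( disk_iops * 1 )) write IOPS"   # raidz2:  150 write IOPS
# Three mirror vdevs accept writes independently:
echo "mirrors: $(( disk_iops * 3 )) write IOPS"   # mirrors: 450 write IOPS
```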
Quick reference: Proxmox ZFS tuning
| Setting | Default | Recommended | Why |
|---|---|---|---|
| volblocksize | 8K (16K on newer releases) | 16K (VMs) / 8K (DBs) | Match guest I/O pattern, reduce amplification |
| recordsize | 128K | Don't use datasets for VMs | Use zvols instead |
| compression | on (lz4) | lz4 | Keep it — nearly free and saves I/O |
| zfs_arc_max | 10% RAM, 16GB cap (recent installers) | 50-75% RAM | Let ARC cache VM hot blocks |
| sync | standard | standard + SLOG | Never disable sync — add SLOG instead |
| VDEV layout | varies | Mirrors | RAIDZ kills VM I/O performance |
| ashift | 12 | 12 | Correct for 4K sector disks (all modern disks) |
| special vdev | none | Mirrored SSDs | Accelerates metadata for all VMs |