VDI Desktop — your desktop, streamed anywhere.
Run a full Linux desktop on a server and access it from any browser, any device, anywhere. Three streaming protocols — pick the one that fits your latency and compatibility needs. Add an NVIDIA GPU and you get hardware-accelerated encoding. The entire stack is open source.
What this replaces: Commercial VDI solutions that require per-seat licensing, dedicated connection brokers, and proprietary clients. This uses native Linux display capture, open streaming protocols, and a browser.
Architecture
How it works
A headless Wayland compositor (mutter --headless) renders a virtual desktop at any resolution.
wf-recorder captures the framebuffer and encodes it to H.264 (CPU or NVIDIA NVENC).
The encoded stream is published to mediamtx via SRT, which re-publishes it as HLS, WebRTC, or SRT — your choice.
nginx sits in front as a reverse proxy, serving HLS segments and proxying WebRTC connections. Users connect with a browser. No client software needed.
| Protocol | Port | Transport | Latency | Notes |
|---|---|---|---|---|
| HLS | 8888 | HTTP-based | 2–5s | Works everywhere |
| WebRTC | 8889 | UDP, peer-to-peer | <200ms | Browser-native |
| SRT | 8890 | UDP, reliable | <500ms | Professional broadcast |
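The three listeners above map onto mediamtx's per-protocol settings. The file that kldload-firstboot actually writes is not shown on this page, so the following is only a sketch: the helper name `render_mediamtx_config` is hypothetical, and the key names assume a recent mediamtx 1.x release (older releases name some options differently).

```shell
#!/bin/sh
# Hypothetical helper: prints a minimal mediamtx.yml matching the ports above.
# Key names assume mediamtx 1.x; check them against your installed version.
render_mediamtx_config() {
  cat <<'YAML'
# SRT: ingest from wf-recorder, optional playback for VLC/OBS
srt: yes
srtAddress: :8890
# HLS playback
hls: yes
hlsAddress: :8888
# WebRTC playback
webrtc: yes
webrtcAddress: :8889
# Control API, bound to loopback only
api: yes
apiAddress: 127.0.0.1:9997
# Accept any stream name (session1, session2, ...)
paths:
  all_others:
YAML
}
```

Calling `render_mediamtx_config` prints the YAML to stdout, so it can be reviewed or diffed against the installed config before anything is overwritten.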
Step 1: Install with VDI profile
Select the VDI profile in the web UI, or use an answers file:
Unattended install
# VDI server with NVIDIA GPU encoding
cat > /tmp/answers.env << 'EOF'
KLDLOAD_DISTRO=debian
KLDLOAD_DISK=/dev/sda
KLDLOAD_HOSTNAME=vdi-server
KLDLOAD_USERNAME=admin
KLDLOAD_PASSWORD=changeme
KLDLOAD_PROFILE=vdi
KLDLOAD_NVIDIA_DRIVERS=1
EOF
kldload-install-target --config /tmp/answers.env
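Because an unattended install wipes the target disk, it can be worth checking the answers file before handing it to the installer. The sketch below is not part of kldload: `validate_answers` is a hypothetical helper, and treating these six keys as mandatory is an assumption based only on the example above.

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeeds only if every key used in the
# example answers file is present with a non-empty value.
validate_answers() {
  local file="$1" key missing=0
  for key in KLDLOAD_DISTRO KLDLOAD_DISK KLDLOAD_HOSTNAME \
             KLDLOAD_USERNAME KLDLOAD_PASSWORD KLDLOAD_PROFILE; do
    if ! grep -Eq "^${key}=.+" "$file"; then
      echo "missing or empty: ${key}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   validate_answers /tmp/answers.env && \
#     kldload-install-target --config /tmp/answers.env
```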
Step 2: First boot (automatic)
kldload-firstboot detects the VDI profile and configures everything:
✓ mediamtx installed
Latest stable from GitHub. Configured for SRT ingest on :8890, HLS on :8888, WebRTC on :8889.
✓ nginx reverse proxy
HLS segments served with CORS headers. WebRTC proxied with WebSocket upgrade. Health endpoint at /health.
✓ Session launcher
kldload-vdi-session script starts a headless Wayland desktop and streams it via SRT to mediamtx.
✓ Systemd services
mediamtx and nginx enabled at boot. Sessions managed individually.
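For orientation, the reverse proxy described above could look roughly like the server block printed by this sketch. The generated file is not shown on this page, so the helper name `render_nginx_vdi_conf`, the `/hls/` and `/webrtc/` prefixes mapped to the mediamtx ports, and the wildcard CORS policy are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: prints an nginx server block for the layout described
# above. Only HTTP traffic passes through nginx; WebRTC media itself flows
# over UDP between the browser and mediamtx.
render_nginx_vdi_conf() {
  cat <<'NGINX'
server {
    listen 80;

    # HLS playlists and segments, with CORS for browser players
    location /hls/ {
        proxy_pass http://127.0.0.1:8888/;
        add_header Access-Control-Allow-Origin * always;
    }

    # WebRTC signalling, with upgrade headers passed through
    location /webrtc/ {
        proxy_pass http://127.0.0.1:8889/;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }

    # Liveness probe
    location = /health {
        default_type text/plain;
        return 200 "ok\n";
    }
}
NGINX
}
```

Printing the config instead of writing it makes it easy to pipe through `nginx -t` in a scratch directory before replacing a working setup.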
Step 3: Launch a desktop session
Start a session
# Launch session 1 (each session gets its own virtual desktop)
kldload-vdi-session 1 &
# Launch session 2 for a second user
kldload-vdi-session 2 &
# Each session is an independent Wayland desktop streaming to mediamtx
Connect from a browser
# HLS (works on any device, any browser)
http://vdi-server/hls/session1
# WebRTC (lowest latency, Chrome/Firefox/Edge)
http://vdi-server/webrtc/session1
# SRT (professional, use VLC or OBS)
srt://vdi-server:8890?streamid=read:session1
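When handing out links to several users, the three URL shapes above can be generated from a host and session name. A small sketch (the function name `vdi_urls` is hypothetical; the URL patterns are the ones listed above):

```shell
#!/bin/sh
# Hypothetical helper: prints the three access URLs for one session.
vdi_urls() {
  local host="$1" session="$2"
  printf 'hls=http://%s/hls/%s\n' "$host" "$session"
  printf 'webrtc=http://%s/webrtc/%s\n' "$host" "$session"
  printf 'srt=srt://%s:8890?streamid=read:%s\n' "$host" "$session"
}

# Usage:
#   vdi_urls vdi-server session1
```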
GPU-accelerated encoding
With NVIDIA GPU
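If NVIDIA drivers are installed, wf-recorder uses NVENC automatically. The GPU encodes video while the CPU stays free for user applications. One GPU can encode 10+ sessions simultaneously, although the real ceiling depends on the card and driver: consumer GeForce drivers cap the number of concurrent NVENC sessions, so check the limit for your model before sizing a server.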
# Manual session with NVENC (this is what kldload-vdi-session does internally)
mutter --wayland --headless --virtual-monitor 1920x1080 &
sleep 2
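# The container format cannot be inferred from an srt:// URL, so name the muxer explicitly
wf-recorder --audio --codec h264_nvenc --muxer mpegts \
--file "srt://127.0.0.1:8890?streamid=publish:session1&pkt_size=1316"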
# Check GPU utilization
nvidia-smi
# You'll see the NVENC encoder process using the GPU's video engine
Without GPU (CPU encoding)
No GPU? No problem. libx264 with ultrafast preset handles 1080p on any modern CPU. Quality is slightly lower and CPU usage is higher, but it works.
# Encoder options are passed one at a time with --codec-param
wf-recorder --audio --codec libx264 --muxer mpegts \
--codec-param preset=ultrafast --codec-param tune=zerolatency \
--file "srt://127.0.0.1:8890?streamid=publish:session1&pkt_size=1316"
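The choice between the two encoders can be made by inspecting the encoder list that `ffmpeg -encoders` prints. How kldload-vdi-session actually decides is not shown here, so treat this as one possible approach; `pick_codec` is a hypothetical name. Note that an encoder appearing in the list only means it was compiled in, not that a usable GPU is present.

```shell
#!/bin/sh
# Hypothetical helper: choose a codec from the text of `ffmpeg -encoders`.
# $1: encoder listing; prints h264_nvenc if listed, otherwise libx264.
pick_codec() {
  if printf '%s\n' "$1" | grep -qw 'h264_nvenc'; then
    echo h264_nvenc
  else
    echo libx264
  fi
}

# Usage:
#   codec=$(pick_codec "$(ffmpeg -hide_banner -encoders 2>/dev/null)")
#   echo "encoding with $codec"
```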
Scaling — multiple users, one server
Session management
# Launch sessions for 10 users
for i in $(seq 1 10); do
kldload-vdi-session "$i" &
echo "Session $i started — http://vdi-server/webrtc/session${i}"
done
# List active sessions (pgrep avoids matching the grep process itself)
pgrep -af kldload-vdi-session
# Kill a specific session (the trailing $ stops "3" from also matching 30, 31, ...)
pkill -f 'kldload-vdi-session 3$'
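Matching on command lines gets fragile as the session count grows. One alternative is to record each launcher's PID in a file. This is a sketch, not something kldload provides: the function names, the `VDI_RUN_DIR` location, and the `VDI_SESSION_CMD` override are all hypothetical, and it assumes the launcher stays in the foreground and cleans up its children when it receives SIGTERM.

```shell
#!/bin/sh
# Hypothetical PID-file based session tracking.
VDI_RUN_DIR="${VDI_RUN_DIR:-${XDG_RUNTIME_DIR:-/tmp}/vdi-sessions}"
VDI_SESSION_CMD="${VDI_SESSION_CMD:-kldload-vdi-session}"

# Start session <id> in the background and remember its PID.
start_session() {
  local id="$1"
  mkdir -p "$VDI_RUN_DIR"
  "$VDI_SESSION_CMD" "$id" >/dev/null 2>&1 &
  echo "$!" > "$VDI_RUN_DIR/session-$id.pid"
}

# Stop session <id>; returns 1 if no PID file exists for it.
stop_session() {
  local id="$1" pidfile="$VDI_RUN_DIR/session-$1.pid"
  [ -f "$pidfile" ] || return 1
  kill "$(cat "$pidfile")" 2>/dev/null
  rm -f "$pidfile"
}

# Print "<id> running" or "<id> stale" for every recorded session.
list_sessions() {
  local pidfile id
  for pidfile in "$VDI_RUN_DIR"/session-*.pid; do
    [ -e "$pidfile" ] || continue
    id="${pidfile##*/session-}"
    id="${id%.pid}"
    if kill -0 "$(cat "$pidfile")" 2>/dev/null; then
      echo "$id running"
    else
      echo "$id stale"
    fi
  done
}
```

With this in place, `stop_session 3` targets exactly one process regardless of how many other session numbers start with 3.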
Resource planning
ZFS integration
Per-user datasets
# Each VDI user gets their own ZFS dataset
# (adduser.local hook creates this automatically on user creation)
zfs list -r rpool/home
# NAME USED AVAIL REFER MOUNTPOINT
# rpool/home 1.2G 60G 96K /home
# rpool/home/alice 400M 60G 400M /home/alice
# rpool/home/bob 350M 60G 350M /home/bob
# Set per-user quotas
zfs set quota=10G rpool/home/alice
zfs set quota=10G rpool/home/bob
# Snapshot all user data before maintenance
ksnap /home
# User broke their desktop? Roll back their home only
ksnap rollback /home/alice
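The contents of the adduser.local hook are not shown on this page. As a rough idea of what per-user provisioning involves, the sketch below only prints the commands it would run, so it is safe to try on any machine; `user_dataset_cmds` is a hypothetical name, and the `rpool/home` parent and 10G default come from the listing above.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the commands that would create a
# per-user dataset with a quota. Nothing is executed.
user_dataset_cmds() {
  local user="$1" quota="${2:-10G}"
  # The mountpoint is inherited from rpool/home, giving /home/<user>
  echo "zfs create -o quota=${quota} rpool/home/${user}"
  echo "chown ${user}:${user} /home/${user}"
}

# Usage:
#   user_dataset_cmds carol 20G        # review the output
#   user_dataset_cmds carol 20G | sh   # then run it as root
```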
Input forwarding
Keyboard and mouse
WebRTC can carry input natively: keyboard and mouse events travel over a data channel on the same WebRTC connection that delivers the video. HLS and SRT are one-way media protocols, so they need a separate input channel:
# evemu-tools for input injection (installed by VDI profile)
# xdotool for keyboard/mouse simulation
# xclip for clipboard sync
# The kldload-webui provides a thin JavaScript layer that captures
# keyboard/mouse events and sends them to the server via WebSocket.
# The server injects them into the Wayland session via evemu.
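The server-side injection step can be pictured as translating each incoming message into an `evemu-event` call. The message format kldload-webui uses is not documented here, so the two-message format below (`key <code> <value>` and `rel <dx> <dy>`) is invented for illustration, as is the function name `event_to_evemu`. The sketch prints commands instead of running them, and the target device must be one that supports the event codes being sent.

```shell
#!/bin/sh
# Hypothetical translator: turns one input message into evemu-event commands.
# $1: input device node, $2: message kind, $3/$4: message arguments.
event_to_evemu() {
  local dev="$1" kind="$2"
  case "$kind" in
    key)  # key <code> <value>, e.g. key KEY_A 1 (press) or key KEY_A 0 (release)
      echo "evemu-event $dev --type EV_KEY --code $3 --value $4 --sync"
      ;;
    rel)  # rel <dx> <dy>, relative pointer motion; sync after both axes
      echo "evemu-event $dev --type EV_REL --code REL_X --value $3"
      echo "evemu-event $dev --type EV_REL --code REL_Y --value $4 --sync"
      ;;
    *)
      echo "unsupported event: $kind" >&2
      return 1
      ;;
  esac
}

# Usage:
#   event_to_evemu /dev/input/event5 key KEY_A 1
```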
Remote access over WireGuard
Secure VDI over the internet
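# On the VDI server: WireGuard is already installed (VDI profile)
# The heredoc delimiter is unquoted so that $(wg genkey) expands to a real key
umask 077  # keep the file holding the private key unreadable to other users
cat > /etc/wireguard/wg0.conf << WG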
[Interface]
Address = 10.99.0.1/24
ListenPort = 51820
PrivateKey = $(wg genkey)
[Peer]
PublicKey = CLIENT_PUBKEY
AllowedIPs = 10.99.0.2/32
WG
wg-quick up wg0
# Client connects via WireGuard, then opens browser to:
# http://10.99.0.1/webrtc/session1
#
# All traffic is encrypted. No VPN client beyond WireGuard.
# Only WireGuard's UDP port (51820) must be reachable from outside;
# the streaming ports are never exposed to the public internet.
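The client needs a matching configuration. A minimal sketch follows, assuming the addressing used above (server 10.99.0.1, client 10.99.0.2, port 51820); the helper name `render_wg_client_conf` is hypothetical, and the keepalive interval is a common choice for clients behind NAT, not something this page specifies.

```shell
#!/bin/sh
# Hypothetical helper: prints a client-side wg0.conf.
# $1: client private key, $2: server public key, $3: server public hostname/IP
render_wg_client_conf() {
  local client_privkey="$1" server_pubkey="$2" endpoint="$3"
  cat <<EOF
[Interface]
Address = 10.99.0.2/32
PrivateKey = ${client_privkey}

[Peer]
PublicKey = ${server_pubkey}
Endpoint = ${endpoint}:51820
# Route only the VDI server through the tunnel
AllowedIPs = 10.99.0.1/32
PersistentKeepalive = 25
EOF
}

# Usage (on the client, as root):
#   render_wg_client_conf "$(wg genkey)" SERVER_PUBKEY vdi.example.com \
#     > /etc/wireguard/wg0.conf
```

The public key derived from the client's private key (via `wg pubkey`) is what replaces CLIENT_PUBKEY in the server config.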
Audio, microphone & clipboard
Video streams over WebRTC/SRT, and desktop audio travels in the same stream when wf-recorder runs with --audio. Everything else (microphone input, clipboard sync, USB forwarding) rides the WireGuard back plane: encrypted, low latency, always on.
PipeWire audio over WireGuard
# Server side: PipeWire is already installed (VDI profile)
# It captures audio from the Wayland session natively
# wf-recorder captures audio alongside video when --audio is set
wf-recorder --audio --codec h264_nvenc --muxer mpegts \
--file "srt://127.0.0.1:8890?streamid=publish:session1&pkt_size=1316"
# For WebRTC: audio is included in the WebRTC stream automatically
# Nothing to configure — PipeWire → wf-recorder → mediamtx → browser
Microphone forwarding (client → server)
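# Client sends mic audio over WireGuard to the server's PulseAudio/PipeWire daemon
# On the VDI server: accept native-protocol connections from the WireGuard subnet
pactl load-module module-native-protocol-tcp auth-ip-acl=10.99.0.0/24
# On the client: record the local mic and play it into the server over WireGuard
# (PULSE_SERVER applies only to pacat, so recording stays local)
parecord --raw --format=s16le --rate=48000 --channels=1 | \
PULSE_SERVER=tcp:10.99.0.1 pacat --playback --raw --format=s16le --rate=48000 --channels=1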
# Or use PipeWire's native network streaming (simpler)
# Client and server discover each other via the WireGuard subnet
Clipboard sync
# xclip is installed by the VDI profile
# Clipboard data travels over the WebRTC data channel (WebRTC mode)
# or via a small WebSocket service over WireGuard (HLS/SRT mode)
# Simple clipboard relay over WireGuard:
# Server watches clipboard, sends changes to client
LAST=""
while true; do
  NEW=$(wl-paste 2>/dev/null)
  if [[ "$NEW" != "$LAST" ]]; then
    # printf avoids echo mangling text that starts with a dash or contains backslashes
    printf '%s' "$NEW" | socat - TCP:10.99.0.2:9999
    LAST="$NEW"
  fi
  sleep 0.5
done
The WireGuard back plane
Traffic outside the media stream rides the WireGuard tunnel between client and server.
Streaming protocols — when to use which
| Protocol | Latency | Transport | Input | Best for |
|---|---|---|---|---|
| WebRTC | <200ms | UDP | Native | Interactive desktop use |
| SRT | <500ms | UDP | Separate | Reliable streaming over bad networks |
| HLS | 2–5s | HTTP | Separate | View-only, maximum compatibility |
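The table reduces to a short decision rule. The sketch below encodes one reading of it; the function name `pick_protocol` and its two yes/no inputs are hypothetical simplifications, and real deployments may weigh firewall rules or client software differently.

```shell
#!/bin/sh
# Hypothetical helper encoding the table above.
# $1: "yes" if the user needs to interact with the desktop
# $2: "yes" if the viewer only has a browser (no VLC/OBS)
pick_protocol() {
  local interactive="$1" browser_only="$2"
  if [ "$interactive" = yes ]; then
    echo webrtc   # lowest latency, native input
  elif [ "$browser_only" = yes ]; then
    echo hls      # view-only, plays everywhere
  else
    echo srt      # view-only, tolerant of lossy links
  fi
}

# Usage:
#   pick_protocol yes yes
#   pick_protocol no no
```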
Encoding: H.264 vs H.265
H.264: Universal support. Every browser, every device. Lower compression efficiency but faster encoding. Use this for VDI, where latency matters more than file size.
H.265: Up to about 50% better compression at the same quality, but browser support is incomplete (no Firefox on Linux). Better for recording/archiving than live streaming.
Troubleshooting
# Check mediamtx is running
systemctl status mediamtx
# Check active streams
curl -s http://localhost:9997/v3/paths/list | jq .
# Check nginx proxy
curl -s http://localhost/health
# Check if wf-recorder is capturing
pgrep -a wf-recorder
# Check NVENC availability
ffmpeg -encoders 2>/dev/null | grep nvenc
# Test with VLC (SRT direct)
# vlc 'srt://vdi-server:8890?streamid=read:session1'