Load Balancing & HA Masterclass

This guide goes deep on load balancing and high availability — HAProxy, keepalived, Traefik, and Caddy — grounded in the kldload stack. If you have web servers, APIs, or any service that needs to stay up when a node dies, this is the masterclass for you. Zero-to-hero: from the first frontend block to a full two-node HA cluster with a floating IP, WireGuard backends, and ZFS config snapshots.

1. Every Production Service Needs a Load Balancer

A single server is a single point of failure. Load balancers distribute traffic across multiple backends, health-check them continuously, and remove failed ones automatically. When a backend goes down, the load balancer stops sending it traffic — in seconds, not after a human notices and reacts. When you deploy a new version of your application, the load balancer enables zero-downtime rolling updates: take one server out of rotation, upgrade it, put it back, repeat.

On kldload, the load balancer runs on ZFS: config changes are snapshottable, upgrades are rollback-safe with boot environments, and the entire load balancer state can be replicated to a standby node with zfs send | zfs recv. WireGuard carries the traffic to the backend pool: backends are reachable only via the WireGuard backplane and are invisible from the internet. Health checks travel over encrypted tunnels. The backend pool has no public IPs. Port scans see one IP with one service. From the outside, the load balancer is the only thing that exists.

This masterclass covers HAProxy (the industry standard — Layer 4 and Layer 7), keepalived (VRRP floating IPs for active/passive HA), Traefik (auto-discovery for container environments), and Caddy (automatic HTTPS with the simplest config format that exists).

Load balancing is not just for massive scale. Two web servers behind HAProxy means zero-downtime deploys and automatic failover. If one crashes at 3am, the other keeps serving. If you push a bad deploy, you roll back on one server before taking the other out of rotation. That is worth it even for a personal project — and HAProxy uses less RAM than most web applications.

2. HAProxy Fundamentals

HAProxy is one of the most widely deployed load balancers in the world. GitHub, Stack Overflow, Airbnb, and many other high-traffic sites run it. It sustains very high connection and request rates on commodity hardware; the exact ceiling depends on the hardware, TLS settings, and configuration. The entire configuration is one file.

Layer 4 load balancing (TCP)

HAProxy forwards TCP connections without reading the application protocol. Works for anything: HTTP, HTTPS (passthrough), MySQL, Redis, SMTP. Fast, simple, no protocol knowledge required. The load balancer sees source IP and destination port, picks a backend, and forwards.

// Layer 4: route by IP + port
// HAProxy sees the envelope, not the letter inside

Layer 7 load balancing (HTTP)

HAProxy parses HTTP requests before routing. This enables path-based routing, header inspection, cookie-based session affinity, rate limiting, and ACL-based access control. HAProxy reads the URL, Host header, cookies — then makes a routing decision based on what it finds.

// Layer 7: route by HTTP content
// HAProxy reads the letter before deciding where to send it

Frontend → Backend → Server

The HAProxy config model: a frontend listens on an IP and port. It evaluates ACLs and routes to a backend. A backend contains one or more server entries — the actual upstream hosts. One frontend can route to multiple backends based on rules.

// frontend: the door clients knock on
// backend: the team that handles the request
// server: an individual team member

Balance algorithms

roundrobin — send requests to backends in turn. leastconn — send to the backend with fewest active connections (best for long-lived connections). source — hash the client IP, always send to the same backend (sticky sessions). uri — hash the request URI, useful for caching tiers.

// roundrobin: take turns
// leastconn: give it to whoever is least busy
// source: same client, same server, always
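The three most common algorithms can be sketched in a few lines of Python. This is an illustration of the selection logic only, not HAProxy's implementation: the server names are hypothetical, and the CRC32 hash is an assumption standing in for whatever hash the load balancer actually uses.

```python
import itertools
import zlib

SERVERS = ["web1", "web2", "web3"]  # hypothetical backend names


def roundrobin(servers):
    """Return an iterator that yields servers in turn, forever."""
    return itertools.cycle(servers)


def leastconn(active):
    """Pick the server with the fewest active connections.

    `active` maps server name -> current connection count.
    """
    return min(active, key=active.get)


def source(client_ip, servers):
    """Hash the client IP so the same client lands on the same server."""
    digest = zlib.crc32(client_ip.encode("ascii"))
    return servers[digest % len(servers)]


rr = roundrobin(SERVERS)
first_four = [next(rr) for _ in range(4)]   # wraps around after web3

busy = {"web1": 12, "web2": 3, "web3": 7}
least_busy = leastconn(busy)                # the server with 3 connections
```

Note that a modulo-based source hash remaps many clients when a server is added or removed, which is one reason cookie-based persistence is often preferred for sticky sessions.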

Install HAProxy on kldload

# CentOS / Rocky / RHEL
dnf install -y haproxy

# Debian / Ubuntu
apt-get install -y haproxy

# Enable and start
systemctl enable --now haproxy

# The config file lives at /etc/haproxy/haproxy.cfg

# Check config syntax without restarting
haproxy -c -f /etc/haproxy/haproxy.cfg

Basic HTTP load balancer — 3 web servers

global
    log /dev/log local0
    maxconn 50000
    user haproxy
    group haproxy
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    retries 3

frontend http_front
    bind *:80
    default_backend web_servers

backend web_servers
    balance roundrobin
    option httpchk GET /health HTTP/1.1\r\nHost:\ example.com
    server web1 10.10.0.11:8080 check inter 2s fall 3 rise 2
    server web2 10.10.0.12:8080 check inter 2s fall 3 rise 2
    server web3 10.10.0.13:8080 check inter 2s fall 3 rise 2

Breaking down the server line: check enables health checking. inter 2s — check every 2 seconds. fall 3 — mark down after 3 consecutive failures. rise 2 — mark up after 2 consecutive successes. HAProxy will not send traffic to a server marked down.
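The fall and rise counters behave like a small state machine. The Python sketch below models that behaviour under the assumption that a server starts in the UP state; the class and method names are made up for illustration and are not part of HAProxy.

```python
class ServerHealth:
    """Track UP/DOWN state from consecutive check results (fall/rise)."""

    def __init__(self, fall=3, rise=2):
        self.fall = fall
        self.rise = rise
        self.up = True      # assumption: the server starts UP
        self.streak = 0     # consecutive results that contradict the current state

    def record(self, check_ok):
        """Feed one health check result and return the resulting state."""
        if check_ok == self.up:
            self.streak = 0             # result agrees with the state, reset
            return self.up
        self.streak += 1
        threshold = self.rise if check_ok else self.fall
        if self.streak >= threshold:
            self.up = check_ok          # enough evidence, flip the state
            self.streak = 0
        return self.up


web1 = ServerHealth(fall=3, rise=2)
# Three failed checks followed by two successful ones
states = [web1.record(ok) for ok in (False, False, False, True, True)]
```

With inter 2s and fall 3, a server that dies just after a passing check is marked down up to about six seconds later, plus any check timeout. Lower inter or fall to detect failures faster, at the cost of more false positives.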

Health check types

# TCP check — just verify the port is open
server web1 10.10.0.11:8080 check

# HTTP check — verify the app returns a 200
option httpchk GET /health HTTP/1.1\r\nHost:\ example.com
http-check expect status 200

# Custom HTTP check with expected string in body
http-check expect string "OK"

# Health check over a different port (separate management port)
server web1 10.10.0.11:8080 check port 9000 inter 2s fall 3 rise 2

HAProxy sustains very high connection rates on commodity hardware, and GitHub, Stack Overflow, and Airbnb run it. The entire config is one file. It has been in continuous production use since 2001. There is no dashboard to click through, no Helm chart to debug, no operator to install. You write a config file, check it with haproxy -c, and reload with systemctl reload haproxy. That reload is zero-downtime: HAProxy keeps existing connections alive while loading the new config.

3. HAProxy Layer 7 Features

Layer 7 means HAProxy reads the HTTP request before making a routing decision. This unlocks path-based routing, header inspection, rate limiting, and cookie-based session persistence — capabilities that a pure TCP load balancer cannot provide.

ACLs — route by path, header, or hostname

frontend http_front
    bind *:80
    bind *:443 ssl crt /etc/haproxy/certs/example.com.pem

    # Define ACLs
    acl is_api    path_beg /api/
    acl is_static path_beg /static/ /assets/ /img/
    acl is_www    hdr(host) -i www.example.com example.com
    acl is_api_host hdr(host) -i api.example.com

    # Route based on ACLs
    use_backend api_servers   if is_api_host
    use_backend api_servers   if is_api
    use_backend static_files  if is_static
    default_backend web_servers

backend api_servers
    balance leastconn
    option httpchk GET /api/health
    server api1 10.10.0.21:8000 check inter 2s fall 3 rise 2
    server api2 10.10.0.22:8000 check inter 2s fall 3 rise 2

backend static_files
    balance roundrobin
    server static1 10.10.0.31:80 check
    server static2 10.10.0.32:80 check

backend web_servers
    balance roundrobin
    server web1 10.10.0.11:8080 check inter 2s fall 3 rise 2
    server web2 10.10.0.12:8080 check inter 2s fall 3 rise 2

SSL/TLS termination

# Terminate TLS at HAProxy, forward plaintext to backends
frontend https_front
    bind *:443 ssl crt /etc/haproxy/certs/example.com.pem alpn h2,http/1.1
    # Redirect HTTP to HTTPS
    bind *:80
    redirect scheme https code 301 if !{ ssl_fc }

    default_backend web_servers

# Generate a PEM from cert + key (HAProxy wants both in one file)
cat fullchain.pem privkey.pem > /etc/haproxy/certs/example.com.pem
chmod 600 /etc/haproxy/certs/example.com.pem

Rate limiting with stick tables

frontend http_front
    bind *:80
    # Track source IPs in a stick table
    stick-table type ip size 100k expire 30s store conn_rate(10s),http_req_rate(10s)
    http-request track-sc0 src

    # Block if more than 100 requests in 10 seconds
    acl too_many_requests sc_http_req_rate(0) gt 100
    http-request deny deny_status 429 if too_many_requests

    default_backend web_servers
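To see what the stick table is doing conceptually, the following Python sketch keeps a per-IP sliding window of request timestamps and denies once the count exceeds the limit. It is a simplified model, not HAProxy's internal rate algorithm, and the class name is an assumption for illustration.

```python
from collections import defaultdict, deque

WINDOW = 10.0   # seconds, as in http_req_rate(10s)
LIMIT = 100     # deny above this many requests per window


class RateTracker:
    def __init__(self, window=WINDOW, limit=LIMIT):
        self.window = window
        self.limit = limit
        self.hits = defaultdict(deque)   # client IP -> request timestamps

    def allow(self, ip, now):
        """Record a request at time `now`; return False if it should get a 429."""
        q = self.hits[ip]
        q.append(now)
        while q and q[0] <= now - self.window:
            q.popleft()                  # forget requests older than the window
        return len(q) <= self.limit


tracker = RateTracker()
# 101 requests from one client within about one second
results = [tracker.allow("198.51.100.9", now=i * 0.01) for i in range(101)]
```

Denied requests still count toward the rate in this sketch, so a client that keeps hammering stays blocked until it slows down.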

Session persistence with cookies

backend web_servers
    balance roundrobin
    # Insert a cookie to pin clients to a backend
    cookie SERVERID insert indirect nocache
    server web1 10.10.0.11:8080 check cookie web1
    server web2 10.10.0.12:8080 check cookie web2
    server web3 10.10.0.13:8080 check cookie web3

Connection limits

backend web_servers
    # Never send more than 100 simultaneous connections to one server
    server web1 10.10.0.11:8080 check maxconn 100
    server web2 10.10.0.12:8080 check maxconn 100
    # Queue connections above the limit, not reject
    timeout queue 10s

WebSocket support

backend ws_servers
    balance source
    option http-server-close
    # WebSocket requires HTTP/1.1 and connection upgrade
    timeout tunnel 1h
    server ws1 10.10.0.41:9000 check
    server ws2 10.10.0.42:9000 check

Layer 7 means HAProxy reads the HTTP request before routing. This enables path-based routing, header inspection, and cookie-based session affinity — things a TCP load balancer cannot do. The ACL system is expressive: match on source IP, URL path, Host header, cookie value, HTTP method, query string — and combine conditions with boolean logic. If you have never used ACLs before, start simple: one ACL per routing rule, and build up from there.

4. keepalived — Floating IPs for HA

HAProxy on a single server is better than nothing, but it is still a single point of failure. keepalived solves this with VRRP (Virtual Router Redundancy Protocol): a virtual IP that floats between two servers. When the active node fails, the passive node takes the IP within a few seconds (about three to four with a one-second advertisement interval). Clients see at most a dropped connection and a reconnect. No DNS changes, no reconfiguration, no human intervention.

What VRRP does

VRRP creates a virtual IP address that is owned by one node at a time (the MASTER). The MASTER sends periodic advertisements to the BACKUP. If the BACKUP stops receiving advertisements, it takes ownership of the virtual IP. The new owner announces its MAC with gratuitous ARP, so clients on the segment reach it as soon as they reconnect.

// VRRP: one IP, two servers, one owner at a time
// MASTER dies → BACKUP promotes → IP moves → a few seconds
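How long the BACKUP waits before taking over follows from the VRRPv2 timers: three advertisement intervals plus a skew that depends on the BACKUP's priority. The sketch below computes that wait; the one-second interval and the priority of 100 are illustrative values, not measurements.

```python
def master_down_interval(advert_int, backup_priority):
    """Seconds a VRRPv2 backup waits without advertisements before taking over."""
    skew_time = (256 - backup_priority) / 256
    return 3 * advert_int + skew_time


# One-second advertisements, backup priority 100 (illustrative values)
wait = master_down_interval(advert_int=1, backup_priority=100)
```

A higher-priority BACKUP has a smaller skew and therefore wins the race when several backups are waiting.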

Active/passive vs active/active

Active/passive: one server handles all traffic, the other is a hot standby. Simple, with little split-brain risk as long as the two nodes can see each other's advertisements. Active/active: both servers handle traffic using different virtual IPs, and each is the BACKUP for the other's VIP. Requires DNS round-robin or a third-party DNS LB to distribute clients across the two VIPs. More complex, but it can roughly double throughput while both nodes are healthy.

// Active/passive: primary + spare
// Active/active: both working, each a spare for the other

Install keepalived

# CentOS / Rocky / RHEL
dnf install -y keepalived

# Debian / Ubuntu
apt-get install -y keepalived

systemctl enable keepalived

Two HAProxy nodes sharing a floating IP

Assume: lb1 at 10.10.0.1, lb2 at 10.10.0.2, floating VIP 10.10.0.10. Both nodes run HAProxy with identical configs. keepalived decides which one owns the VIP.

# /etc/keepalived/keepalived.conf on lb1 (MASTER)
global_defs {
    router_id lb1
    script_user root
    enable_script_security
}

vrrp_script check_haproxy {
    script "/usr/bin/pgrep haproxy"
    interval 2
    weight   -60              # penalty must exceed the 50-point gap between priorities 150 and 100
    fall     2
    rise     2
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 150              # higher wins
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass secretpass  # keepalived uses only the first 8 characters
    }
    virtual_ipaddress {
        10.10.0.10/24
    }
    track_script {
        check_haproxy
    }
}

# /etc/keepalived/keepalived.conf on lb2 (BACKUP)
global_defs {
    router_id lb2
    script_user root
    enable_script_security
}

vrrp_script check_haproxy {
    script "/usr/bin/pgrep haproxy"
    interval 2
    weight   -60              # same penalty as on lb1, applied if HAProxy fails here
    fall     2
    rise     2
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100              # lower than master
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass secretpass  # keepalived uses only the first 8 characters
    }
    virtual_ipaddress {
        10.10.0.10/24
    }
    track_script {
        check_haproxy
    }
}

The vrrp_script check_haproxy block is critical: it checks whether HAProxy is running, not just whether the server is reachable. If HAProxy crashes on lb1 but the host is alive, the script's negative weight is subtracted from lb1's priority. Failover happens only if the result falls below lb2's priority, so with priorities of 150 and 100 the weight must be lower than -50. Health check scripts are what make keepalived actually reliable: check the service, not just the server.
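The interaction between priority and script weight is easy to get wrong, so it helps to work it through. The Python sketch below models one tracked script with a negative weight; the helper names and the weight values passed in are illustrative assumptions, while the priorities 150 and 100 come from the configs above.

```python
def effective_priority(base, weight, script_ok):
    """Priority a node advertises, given one tracked script with a negative weight."""
    if weight < 0 and not script_ok:
        return base + weight        # a failing script subtracts the penalty
    return base


def owner(nodes):
    """Return the name of the node with the highest effective priority."""
    return max(nodes, key=nodes.get)


def vip_owner(lb1_haproxy_ok, weight):
    """Who holds the VIP when lb2 is healthy and lb1's HAProxy is up or down."""
    return owner({
        "lb1": effective_priority(150, weight, lb1_haproxy_ok),
        "lb2": effective_priority(100, weight, True),
    })
```

With a weight of -60, a failed HAProxy on lb1 gives 150 - 60 = 90, which is below 100, so lb2 takes the VIP. With a weight of -20 the result is 130, lb1 keeps the VIP, and traffic keeps arriving at a node with no HAProxy running.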

# Start keepalived on both nodes
systemctl start keepalived

# Verify VIP ownership on lb1
ip addr show eth0 | grep 10.10.0.10

# Simulate failure: stop HAProxy on lb1
systemctl stop haproxy

# Verify VIP moved to lb2 within a few seconds (script interval 2s x fall 2, plus the election)
ip addr show eth0    # on lb2 — should now show 10.10.0.10

# Check VRRP state
journalctl -u keepalived -f

keepalived plus HAProxy is the classic HA pattern. Two HAProxy nodes, one floating IP. If the active node dies, the passive node takes the IP within a few seconds. No DNS changes, no client reconfiguration: the IP just moves. This is how you build a load balancer tier that survives hardware failures, kernel panics, and HAProxy crashes without any human intervention. The VIP becomes the stable endpoint your DNS record points to. Everything behind it can fail and recover with little more than a brief blip for clients.

5. Traefik — Auto-Discovery Load Balancer

HAProxy is the right tool when you have a stable list of backends and want maximum control. Traefik is the right tool when backends are ephemeral — Docker containers that start and stop, Kubernetes pods that reschedule. Traefik discovers backends automatically from Docker labels, Kubernetes ingress resources, or config files. Add a label to a container, Traefik routes to it. No config reload, no manual backend list maintenance.

Install Traefik on kldload

# Download the binary
curl -L https://github.com/traefik/traefik/releases/download/v3.1.0/traefik_v3.1.0_linux_amd64.tar.gz | tar xz
mv traefik /usr/local/bin/

# Or run as a container
docker run -d \
  --name traefik \
  -p 80:80 -p 443:443 -p 8080:8080 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /etc/traefik:/etc/traefik \
  traefik:v3.1

Static config — traefik.yml

# /etc/traefik/traefik.yml
api:
  dashboard: true
  insecure: false    # true would serve the dashboard unauthenticated on :8080; keep false and expose it through a router

entryPoints:
  web:
    address: ":80"
    http:
      redirections:
        entryPoint:
          to: websecure
          scheme: https
  websecure:
    address: ":443"

providers:
  docker:
    endpoint: "unix:///var/run/docker.sock"
    exposedByDefault: false    # require explicit label to expose a container
  file:
    directory: /etc/traefik/dynamic
    watch: true

certificatesResolvers:
  letsencrypt:
    acme:
      email: you@example.com
      storage: /etc/traefik/acme.json
      httpChallenge:
        entryPoint: web

Docker Compose — auto-routing with labels

# docker-compose.yml
version: "3.8"
services:
  traefik:
    image: traefik:v3.1
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /etc/traefik:/etc/traefik
    restart: unless-stopped

  webapp:
    image: nginx:alpine
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.webapp.rule=Host(`www.example.com`)"
      - "traefik.http.routers.webapp.entrypoints=websecure"
      - "traefik.http.routers.webapp.tls.certresolver=letsencrypt"
      - "traefik.http.services.webapp.loadbalancer.server.port=80"
    restart: unless-stopped

  api:
    image: myapp:latest
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.api.rule=Host(`api.example.com`)"
      - "traefik.http.routers.api.entrypoints=websecure"
      - "traefik.http.routers.api.tls.certresolver=letsencrypt"
      - "traefik.http.services.api.loadbalancer.server.port=8000"
    restart: unless-stopped

Traefik discovers both containers, gets Let's Encrypt certificates for both hostnames, and starts routing. No Traefik restart, no config file edit. Bring up a new container with the right labels and it is routed. Stop it and Traefik removes the route.
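Conceptually, label-based discovery turns container metadata into a routing table. The Python sketch below shows that transformation for the labels used above. It is a toy model of the idea, not Traefik's code: it handles only simple Host rules, and the function name and the unlabeled db container are assumptions for illustration.

```python
import re

ROUTER_RULE = re.compile(r"^traefik\.http\.routers\.(?P<name>[^.]+)\.rule$")
HOST = re.compile(r"Host\(`(?P<host>[^`]+)`\)")


def routes_from_labels(containers):
    """Build a host -> (container, port) table from Docker labels.

    `containers` maps container name -> dict of labels.
    """
    table = {}
    for container, labels in containers.items():
        if labels.get("traefik.enable") != "true":
            continue                      # exposedByDefault: false
        for key, value in labels.items():
            m = ROUTER_RULE.match(key)
            if not m:
                continue
            host = HOST.search(value)
            port_key = f"traefik.http.services.{m['name']}.loadbalancer.server.port"
            if host and port_key in labels:
                table[host["host"]] = (container, int(labels[port_key]))
    return table


containers = {
    "webapp": {
        "traefik.enable": "true",
        "traefik.http.routers.webapp.rule": "Host(`www.example.com`)",
        "traefik.http.services.webapp.loadbalancer.server.port": "80",
    },
    "api": {
        "traefik.enable": "true",
        "traefik.http.routers.api.rule": "Host(`api.example.com`)",
        "traefik.http.services.api.loadbalancer.server.port": "8000",
    },
    "db": {},   # hypothetical container with no labels: never exposed
}
table = routes_from_labels(containers)
```

Starting or stopping a container changes the input dictionary, and the routing table follows; that is the whole appeal of the label-driven model.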

Kubernetes IngressRoute (Traefik CRD)

apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: webapp
  namespace: default
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`www.example.com`)
      kind: Rule
      services:
        - name: webapp-svc
          port: 80
    - match: Host(`api.example.com`) && PathPrefix(`/v2`)
      kind: Rule
      services:
        - name: api-svc
          port: 8000
  tls:
    certResolver: letsencrypt

Traefik is the set-it-and-forget-it load balancer. Add a Docker container with a label, Traefik discovers it and routes to it. No config reload, no manual backend list. For dynamic container environments — anything where containers start and stop frequently — this is the right tool. HAProxy is better when you need maximum performance and full control. Traefik is better when your infrastructure is dynamic and you want routing to be declarative rather than imperative.

6. Caddy — The Simplest HTTPS Server

Caddy is a web server and reverse proxy with one killer feature: automatic HTTPS. You write a hostname in the Caddyfile, Caddy gets a Let's Encrypt certificate, configures TLS, and handles renewal — forever. Zero certificate management. For small-to-medium deployments where you do not need HAProxy's full feature set, Caddy is the fastest path to production HTTPS.

Install Caddy on kldload

# CentOS / Rocky / RHEL
dnf install -y 'dnf-command(copr)'
dnf copr enable @caddy/caddy
dnf install -y caddy

# Debian / Ubuntu
apt-get install -y debian-keyring debian-archive-keyring apt-transport-https
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | tee /etc/apt/sources.list.d/caddy-stable.list
apt-get update && apt-get install -y caddy

systemctl enable --now caddy

Caddyfile — multi-site with automatic TLS in 10 lines

# /etc/caddy/Caddyfile

www.example.com {
    reverse_proxy 10.10.0.11:8080 10.10.0.12:8080
}

api.example.com {
    reverse_proxy 10.10.0.21:8000 10.10.0.22:8000
}

blog.example.com {
    reverse_proxy localhost:2368
}

static.example.com {
    root * /var/www/static
    file_server
}

That is the entire config. Caddy reads the hostnames, contacts Let's Encrypt, gets certificates for all four, configures TLS, and starts reverse-proxying. Multiple backends are load balanced automatically; Caddy's default policy picks an upstream at random, and lb_policy selects another policy such as round_robin. Add a health check:

www.example.com {
    reverse_proxy 10.10.0.11:8080 10.10.0.12:8080 {
        health_uri /health
        health_interval 10s
        health_timeout 5s
        health_status 200
        lb_policy round_robin
    }
}

Caddyfile — advanced features

api.example.com {
    # Rate limiting (requires caddy-ratelimit module)
    rate_limit {
        zone dynamic {
            key {remote_host}
            events 100
            window 10s
        }
    }

    # Add security headers
    header {
        Strict-Transport-Security "max-age=31536000; includeSubDomains"
        X-Content-Type-Options nosniff
        X-Frame-Options DENY
    }

    # Basic auth on a path
    handle /admin/* {
        basicauth {
            admin $2a$14$... # bcrypt hash
        }
        reverse_proxy localhost:8001
    }

    reverse_proxy 10.10.0.21:8000 10.10.0.22:8000
}

# Reload config without restart
caddy reload --config /etc/caddy/Caddyfile

Caddy's superpower is automatic HTTPS. Write a hostname in the Caddyfile, Caddy gets a Let's Encrypt cert, configures TLS, and handles renewal. Zero certificate management, zero cron jobs, zero certbot, zero manual renewal reminders. For small-to-medium deployments where you do not need HAProxy's power, Caddy is the answer. The Caddyfile syntax is so simple that it reads like documentation — you look at it and immediately understand what it does. That is a rare property in infrastructure software.

7. Health Checks Deep Dive

A load balancer without health checks is a traffic distributor, not a reliability tool. Health checks are what make failover automatic. Getting them right is the difference between a load balancer that removes failed backends within seconds and one that keeps sending traffic to a server that returns 500 errors.

TCP health checks

# HAProxy: basic TCP check — is the port open?
server web1 10.10.0.11:8080 check inter 2s fall 3 rise 2

# This only tells you the port is open.
# It does NOT tell you the application is working.
# A web server can accept TCP connections but return 500 for every request.

HTTP health checks

# HAProxy: HTTP check — does the app return 200?
backend web_servers
    option httpchk GET /health HTTP/1.1\r\nHost:\ example.com
    http-check expect status 200
    server web1 10.10.0.11:8080 check inter 2s fall 3 rise 2

# HAProxy: check for a specific string in the response body
http-check expect string "\"status\":\"ok\""

# HAProxy: check a JSON health endpoint (match any 2xx)
option httpchk GET /healthz HTTP/1.1\r\nHost:\ internal
http-check expect rstatus ^2[0-9][0-9]$

Application-specific health endpoint

# What a good /health endpoint checks:
# - Database connection: can we query the DB?
# - Cache connection: can we reach Redis?
# - Downstream APIs: are dependencies reachable?
# - Disk space: are we above the threshold?

# Example: Python Flask health endpoint
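import os

import psycopg2
import redis
from flask import Flask, jsonify

# Connection strings come from the environment (the variable names are an assumption)
DATABASE_URL = os.environ["DATABASE_URL"]
REDIS_URL = os.environ["REDIS_URL"]

app = Flask(__name__)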

@app.route('/health')
def health():
    checks = {}
    try:
        conn = psycopg2.connect(DATABASE_URL)
        conn.close()
        checks['database'] = 'ok'
    except Exception as e:
        checks['database'] = str(e)

    try:
        r = redis.Redis.from_url(REDIS_URL)
        r.ping()
        checks['cache'] = 'ok'
    except Exception as e:
        checks['cache'] = str(e)

    status = 'ok' if all(v == 'ok' for v in checks.values()) else 'degraded'
    http_status = 200 if status == 'ok' else 503
    return jsonify({'status': status, 'checks': checks}), http_status

Health check timing parameters

# HAProxy server line breakdown.
# In haproxy.cfg the directive is written on a single line:
server web1 10.10.0.11:8080 check inter 2s fastinter 500ms downinter 5s fall 3 rise 2 weight 10 slowstart 60s

# check            enable health checking
# inter 2s         check interval: every 2 seconds
# fastinter 500ms  interval when server is in transition state
# downinter 5s     interval for known-down servers
# fall 3           consecutive failures before marking DOWN
# rise 2           consecutive successes before marking UP
# weight 10        relative weight for roundrobin
# slowstart 60s    ramp up traffic over 60s after marking UP
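The slowstart ramp can be pictured as a weight that grows from almost nothing to the configured value. The Python sketch below uses a simple linear model with a floor of 1; it is an approximation for intuition, not HAProxy's exact internal arithmetic, and the function name is hypothetical.

```python
def slowstart_weight(configured_weight, seconds_since_up, slowstart):
    """Linear ramp from near zero to the configured weight over `slowstart` seconds."""
    if seconds_since_up >= slowstart:
        return configured_weight
    fraction = seconds_since_up / slowstart
    return max(1, round(configured_weight * fraction))
```

With weight 10 and slowstart 60s, this model gives the server half its share of traffic thirty seconds after it comes back, which protects cold caches and connection pools from a sudden full load.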

Monitoring health check state with Prometheus

# HAProxy exposes Prometheus metrics natively (HAProxy 2.0+)
frontend stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 10s

# For Prometheus scraping:
frontend prometheus
    bind *:8405
    http-request use-service prometheus-exporter if { path /metrics }
    no log

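# Or use the standalone haproxy_exporter, which scrapes the stats CSV.
# --network host lets "localhost" reach HAProxy on the host; the exporter then listens on host port 9101.
docker run -d \
  --name haproxy-exporter \
  --network host \
  prom/haproxy-exporter \
  --haproxy.scrape-uri="http://localhost:8404/stats;csv"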

# Key metrics to alert on:
# haproxy_backend_status — 0=DOWN, 1=UP per backend
# haproxy_backend_active_servers — number of active servers
# haproxy_backend_http_responses_total by code — 5xx rate
# haproxy_server_check_failures_total — cumulative health check failures

A load balancer without health checks is a traffic distributor, not a reliability tool. The health check is what makes failover automatic. Get it right: check the application, not just the port. A good health endpoint verifies the database connection, the cache, and any critical downstream dependencies. If the health check only opens a TCP socket, you will route traffic to a server that is accepting connections but returning 500 for every request — and you will not know until your monitoring alerts fire. Check what matters. Return 503 when the application is not ready to serve.

8. SSL/TLS Termination and Passthrough

There are three ways to handle TLS at a load balancer. Understanding the tradeoffs is the difference between a correct architecture and a security hole.

TLS termination

The load balancer decrypts TLS, forwards plaintext to backends. Backends need no certificates. The LB can inspect HTTP content for Layer 7 routing. Used by most deployments. Tradeoff: the load balancer sees all traffic in plaintext — it is a trusted component in your architecture.

// Client → [TLS] → LB → [plaintext] → backend
// LB decrypts. Backend gets HTTP, not HTTPS.

TLS passthrough

The load balancer forwards encrypted traffic without decrypting. Backend decrypts. The LB cannot inspect HTTP content — it can only route based on SNI (the hostname in the TLS ClientHello). Used when backends must hold the private key, or for end-to-end encryption compliance.

// Client → [TLS] → LB (SNI routing only) → [TLS] → backend
// LB never sees plaintext. Backend decrypts.

Re-encryption (mTLS)

The load balancer decrypts from the client and re-encrypts to the backend. Can use mutual TLS (mTLS) for the backend leg — the backend verifies the LB's client certificate. Full HTTP inspection at the LB, encrypted transit to backends. Best for zero-trust architectures.

// Client → [TLS] → LB → [mTLS] → backend
// LB decrypts client traffic, re-encrypts to backend.

HAProxy TLS passthrough (SNI routing)

frontend tls_passthrough
    bind *:443
    mode tcp
    option tcplog
    # Route by SNI without decrypting
    tcp-request inspect-delay 5s
    tcp-request content accept if { req_ssl_hello_type 1 }

    acl is_api   req_ssl_sni -i api.example.com
    acl is_www   req_ssl_sni -i www.example.com

    use_backend api_tls  if is_api
    use_backend www_tls  if is_www

backend api_tls
    mode tcp
    balance roundrobin
    server api1 10.10.0.21:443 check
    server api2 10.10.0.22:443 check

backend www_tls
    mode tcp
    balance roundrobin
    server web1 10.10.0.11:443 check
    server web2 10.10.0.12:443 check

Re-encryption to backend (HTTPS backend)

backend api_reencrypt
    balance roundrobin
    # Forward to backend over TLS
    server api1 10.10.0.21:8443 check ssl verify required ca-file /etc/haproxy/ca.pem
    server api2 10.10.0.22:8443 check ssl verify required ca-file /etc/haproxy/ca.pem

Integration with step-ca for internal TLS

# step-ca issues internal certificates for your infrastructure
# Install step-ca on a kldload node
curl -L https://dl.smallstep.com/gh-release/certificates/gh-release-header/v0.27.0/step-ca_linux_0.27.0_amd64.tar.gz | tar xz
mv step-ca /usr/local/bin/

# Initialize a CA (the step CLI is a separate download from step-ca and must be installed too)
step ca init --deployment-type=standalone

# Issue a certificate for HAProxy
step ca certificate haproxy.internal haproxy.crt haproxy.key

# Issue for backends
step ca certificate api1.internal api1.crt api1.key

# HAProxy uses the internal CA for backend verification
backend api_internal
    server api1 10.10.0.21:8443 check ssl verify required ca-file /root/.step/certs/root_ca.crt

TLS termination at the load balancer means backends do not need certificates. But it also means the LB sees all traffic in plaintext. For compliance or zero-trust, use passthrough or re-encryption. On kldload, the WireGuard backplane already encrypts all host-to-host traffic — if your backends are on WireGuard addresses, plain-HTTP backends behind a TLS-terminating LB are still encrypted in transit at the network layer. Whether that satisfies your compliance requirements depends on your threat model. If it does not, use re-encryption with step-ca for internal mTLS.

9. Load Balancing on WireGuard

This is the kldload pattern. The load balancer is the only server with a public IP. Everything behind it — web servers, APIs, databases — lives on the WireGuard backplane. The backend pool is invisible from the internet. A port scan of your public IP shows one IP with one or two open ports. The entire infrastructure is hidden.

The pattern: public load balancer (HAProxy + keepalived) with a public IP. All backends live on the WireGuard backplane — private addresses like 10.10.0.0/24. HAProxy health checks go to WireGuard addresses. Traffic from HAProxy to backends is encrypted by WireGuard at the network layer. The backends have no public IPs. There are no firewall rules to allow inbound connections to them — they are unreachable from the internet by design.

WireGuard peers authenticate by public key — a backend that does not have a valid key cannot receive traffic, even if it could somehow route to the backplane. This is a stronger guarantee than a firewall rule, which can be misconfigured. The backend pool is cryptographically isolated.
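Because the whole pattern depends on every backend address living on the backplane, it can be worth linting the server list before a reload. The Python sketch below flags any server whose address falls outside the WireGuard subnet. The 10.10.0.0/24 range comes from the text above; the function name and the idea of feeding it a name-to-address mapping are assumptions for illustration.

```python
import ipaddress

BACKPLANE = ipaddress.ip_network("10.10.0.0/24")


def off_backplane(servers):
    """Return the names of servers whose address is outside the WireGuard subnet.

    `servers` maps server name -> "ip:port" as written on a server line.
    """
    return [
        name
        for name, addr in servers.items()
        if ipaddress.ip_address(addr.rsplit(":", 1)[0]) not in BACKPLANE
    ]
```

An empty result means every backend is reachable only through the tunnel. A non-empty result means a server line points at a public or otherwise unexpected address.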

Full config: public LB with WireGuard backends

# WireGuard is already configured on all nodes
# LB backplane address: 10.10.0.1
# web1 backplane address: 10.10.0.11
# web2 backplane address: 10.10.0.12
# web3 backplane address: 10.10.0.13

# /etc/haproxy/haproxy.cfg on the load balancer

global
    log /dev/log local0
    maxconn 100000
    user haproxy
    group haproxy
    daemon
    # Stats socket for runtime API
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    option  forwardfor        # pass X-Forwarded-For to backends
    option  http-server-close
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    retries 3

# Public HTTPS frontend
frontend https_front
    bind *:443 ssl crt /etc/haproxy/certs/example.com.pem alpn h2,http/1.1
    bind *:80
    redirect scheme https code 301 if !{ ssl_fc }

    # Real client IP logging (HAProxy terminates TLS, add XFF header)
    http-request set-header X-Real-IP %[src]

    # Path-based routing
    acl is_api path_beg /api/
    use_backend api_wg  if is_api
    default_backend web_wg

# Web backends — all on WireGuard addresses
backend web_wg
    balance leastconn
    option httpchk GET /health HTTP/1.1\r\nHost:\ example.com
    http-check expect status 200
    # WireGuard addresses — invisible from internet
    server web1 10.10.0.11:8080 check inter 2s fall 3 rise 2
    server web2 10.10.0.12:8080 check inter 2s fall 3 rise 2
    server web3 10.10.0.13:8080 check inter 2s fall 3 rise 2

# API backends — all on WireGuard addresses
backend api_wg
    balance leastconn
    option httpchk GET /api/health HTTP/1.1\r\nHost:\ api.example.com
    http-check expect status 200
    server api1 10.10.0.21:8000 check inter 2s fall 3 rise 2
    server api2 10.10.0.22:8000 check inter 2s fall 3 rise 2

# Internal stats — bind to WireGuard address only, never public
frontend stats
    bind 10.10.0.1:8404
    stats enable
    stats uri /stats
    stats refresh 5s
    stats auth admin:changeme

nftables rules on the load balancer node

# Only allow inbound on 80/443 from internet
# WireGuard (51820) is already handled by the WireGuard interface
# Stats page accessible only from WireGuard backplane

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        iif lo accept
        # WireGuard
        udp dport 51820 accept
        # Public web traffic
        tcp dport { 80, 443 } accept
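        # SSH and the stats page only from the backplane
        iifname "wg0" tcp dport { 22, 8404 } accept
        # VRRP advertisements from the peer load balancer (needed when keepalived runs here)
        ip protocol 112 accept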
        # ICMP
        ip protocol icmp accept
        ip6 nexthdr icmpv6 accept
    }
}

This is the kldload pattern. The load balancer is the only thing with a public IP. Everything behind it — web servers, APIs, databases — lives on the WireGuard backplane. The backend pool is invisible. Port scans show one IP with one service. The entire infrastructure is hidden. WireGuard peers authenticate by public key, which is a stronger guarantee than a firewall rule. Combine this with keepalived for a two-node HA LB tier, and you have a production-grade infrastructure front end that can withstand node failures, DDoS probing, and network-level attacks — with a config that fits on one screen.

10. ZFS for Load Balancer Config

HAProxy config changes can take down your entire infrastructure if they are wrong. A misconfigured ACL, a missing ssl crt, a typo in a server address: any of these can make HAProxy fail to reload, or reload cleanly and then route traffic to the wrong place. On kldload, ZFS gives you an instant undo button: snapshot before every change, roll back in seconds if something breaks.

Snapshot before config changes

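# Store HAProxy config on a dedicated dataset
# Copy any existing /etc/haproxy files aside first: the new dataset mounts
# over that directory and hides what was there
zfs create rpool/etc/haproxy
zfs set mountpoint=/etc/haproxy rpool/etc/haproxy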

# Snapshot before every change
zfs snapshot rpool/etc/haproxy@before-acl-change-2026-04-02

# Make the change
vim /etc/haproxy/haproxy.cfg

# Test the config
haproxy -c -f /etc/haproxy/haproxy.cfg

# If the test fails, rollback immediately
zfs rollback rpool/etc/haproxy@before-acl-change-2026-04-02

# If the test passes, reload
systemctl reload haproxy
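
The snapshot, validate, reload-or-rollback steps above can be wrapped in two small functions so the sequence is never skipped. This is a minimal sketch, assuming `/etc/haproxy` lives on `rpool/etc/haproxy` as shown; the function names are hypothetical.

```shell
#!/bin/sh
# Snapshot before an edit, then validate and reload, or roll back on failure.
DATASET="rpool/etc/haproxy"
CFG="/etc/haproxy/haproxy.cfg"

haproxy_snapshot() {
    # $1: short label for the change, e.g. "acl-change"
    snap="${DATASET}@before-$1-$(date +%Y%m%d-%H%M%S)"
    zfs snapshot "$snap" && printf '%s\n' "$snap"
}

haproxy_apply_or_rollback() {
    # $1: snapshot name printed by haproxy_snapshot
    if haproxy -c -f "$CFG"; then
        systemctl reload haproxy
    else
        echo "validation failed, rolling back to $1" >&2
        zfs rollback "$1"
        return 1
    fi
}
```

Typical use: `snap=$(haproxy_snapshot acl-change)`, edit the config, then `haproxy_apply_or_rollback "$snap"`.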

Boot environments for HAProxy upgrades

# Create a boot environment before upgrading HAProxy
bectl create before-haproxy-upgrade
bectl mount before-haproxy-upgrade /mnt

# Upgrade HAProxy
dnf upgrade -y haproxy

# If the upgrade breaks something, boot back
bectl activate before-haproxy-upgrade
reboot

Replicate LB config to standby node

# On lb1: snapshot the config, then send it to lb2
zfs snapshot rpool/etc/haproxy@initial

# Initial replication (full stream)
zfs send rpool/etc/haproxy@initial | \
  ssh lb2 "zfs recv rpool/etc/haproxy"

# Incremental replication (after every change)
# Assumes the previous snapshot has already been received on lb2
zfs snapshot rpool/etc/haproxy@$(date +%Y%m%d-%H%M%S)
LAST=$(zfs list -t snapshot -H -o name -s creation rpool/etc/haproxy | tail -2 | head -1)
NOW=$(zfs list -t snapshot -H -o name -s creation rpool/etc/haproxy | tail -1)
# -F discards any local changes on the standby before applying the stream
zfs send -i "$LAST" "$NOW" | ssh lb2 "zfs recv -F rpool/etc/haproxy"

# lb2 has an up-to-date copy of the config after each run
# failover needs no config sync, so there is no config drift
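
The incremental step above can be turned into a function that refuses to run when there is no previous snapshot to diff against. This is a sketch under the same assumptions (dataset `rpool/etc/haproxy`, standby reachable as `lb2` over SSH); the function names are hypothetical.

```shell
#!/bin/sh
# Replicate the newest snapshot of a dataset to a standby host.
# Assumes the previous snapshot already exists on the standby.
DATASET="rpool/etc/haproxy"
STANDBY="lb2"

# Read snapshot names (oldest first) on stdin, print "<previous> <newest>".
last_two_snapshots() {
    tail -n 2 | paste -s -d ' ' -
}

replicate_latest() {
    pair=$(zfs list -t snapshot -H -o name -s creation "$DATASET" | last_two_snapshots)
    prev=${pair% *}
    newest=${pair#* }
    if [ "$prev" = "$newest" ]; then
        # Zero or one snapshot: nothing to send incrementally
        echo "need at least two snapshots; run a full send first" >&2
        return 1
    fi
    zfs send -i "$prev" "$newest" | ssh "$STANDBY" "zfs recv -F $DATASET"
}
```

Sorting by `creation` rather than relying on default output order keeps the pair correct even when snapshot names do not sort chronologically.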

A bad HAProxy config change can take down your entire infrastructure. Snapshot before every change, roll back in seconds. This is the same pattern as the firewall recipe: any config that can break connectivity gets a ZFS snapshot before you touch it. The snapshot costs almost nothing (a few KB of metadata). The rollback takes one command. There is no reason not to do this for every significant config change.

11. Cilium Load Balancing (Kubernetes)

Inside a Kubernetes cluster, Cilium replaces kube-proxy with eBPF load balancing. Every Kubernetes Service becomes an entry in an eBPF map. Lookups are O(1) hash operations in the kernel, not O(n) iptables chain traversals. At scale, this is not a minor optimization — it is the difference between a cluster that programs new services in milliseconds and one that stalls for 10-30 seconds.

Replace kube-proxy with Cilium

# Add the Cilium chart repository, then install with kube-proxy replacement
helm repo add cilium https://helm.cilium.io/
helm repo update
helm install cilium cilium/cilium \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=10.10.0.1 \
  --set k8sServicePort=6443

# Verify kube-proxy replacement is active
kubectl exec -n kube-system ds/cilium -- \
  cilium status | grep "KubeProxyReplacement"
# Should show: KubeProxyReplacement: True

DSR — Direct Server Return

# DSR makes backends respond directly to clients, bypassing the LB node
# Eliminates the return-path bottleneck for high-throughput services
# Requires native routing (or Geneve dispatch on recent Cilium releases)
helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  --reuse-values \
  --set loadBalancer.mode=dsr

BGP-announced LoadBalancer services

# A CiliumLoadBalancerIPPool hands out IPs to LoadBalancer services
# Announcing those IPs to your router also requires Cilium's BGP control plane
# (bgpControlPlane.enabled=true plus a CiliumBGPPeeringPolicy), not shown here
apiVersion: "cilium.io/v2alpha1"
kind: CiliumLoadBalancerIPPool
metadata:
  name: service-pool
spec:
  cidrs:
    - cidr: "10.20.0.0/24"    # your LAN routable range

---
# Any service of type LoadBalancer gets an IP from this pool
apiVersion: v1
kind: Service
metadata:
  name: webapp
spec:
  type: LoadBalancer
  selector:
    app: webapp
  ports:
    - port: 80
      targetPort: 8080

Cilium's eBPF load balancing is among the fastest in-kernel service routing options available. At 10,000 services, kube-proxy's iptables chains can hold hundreds of thousands of rules. Reprogramming them on a service change can take 10-30 seconds, during which new endpoints are unreachable. Cilium's eBPF maps are hash tables: O(1) lookup at any scale, with updates that touch individual entries instead of rewriting whole rule sets. If you are running Kubernetes at any meaningful scale, replacing kube-proxy with Cilium is one of the highest-leverage changes you can make. The performance improvement is measurable, and there is one fewer component to manage.
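
To see how large the iptables rule set is on an existing node before migrating, you can count the rules kube-proxy has programmed. The sketch below assumes kube-proxy is running in iptables mode, where its chains are named `KUBE-SERVICES`, `KUBE-SVC-*`, and `KUBE-SEP-*`; the function name is hypothetical.

```shell
#!/bin/sh
# Count kube-proxy service rules in iptables-save output read from stdin.
count_kube_rules() {
    # grep -c exits non-zero when the count is 0, so mask that exit status
    grep -c -E -e '^-A KUBE-(SVC|SEP|SERVICES)' || true
}
```

On a node, `iptables-save -t nat | count_kube_rules` gives a rough measure of how much state is rewritten on each service change.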

12. Global Load Balancing (Multi-Site)

When you have more than one datacenter or site, you need global load balancing — routing users to the nearest available site and failing over between sites when one goes down. Three patterns: DNS-based failover, anycast BGP, and GeoDNS.

Cloudflare DNS failover

# Cloudflare can health-check your origin and fail over DNS automatically
# Site 1: 203.0.113.10 (primary)
# Site 2: 203.0.113.20 (failover)

# Set up health checks in Cloudflare dashboard:
# Type: HTTP, URL: https://www.example.com/health, expected: 200

# DNS records with failover:
# A www.example.com 203.0.113.10 (primary, Proxied)
# A www-failover.example.com 203.0.113.20 (secondary, DNS Only)

# Load Balancing rules (Cloudflare Load Balancing product):
# Pool 1: 203.0.113.10 (primary)
# Pool 2: 203.0.113.20 (failover)
# Origin health check on /health every 60s
# Failover: if primary pool unhealthy, route to Pool 2

PowerDNS with health checks (self-hosted)

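# PowerDNS authoritative server with Lua records for health-aware DNS
# /etc/pdns/pdns.conf
launch=gsqlite3
gsqlite3-database=/var/lib/powerdns/pdns.db
enable-lua-records=yes

# Lua record for health-checked failover (stored in the zone like any other record)
# www IN LUA A "ifportup(80, {'203.0.113.10', '203.0.113.20'})"
# Returns only addresses whose port 80 is reachable; if both are up, either may be returned
# For strict primary/standby ordering, ifurlup takes ordered sets of addresses:
# www IN LUA A "ifurlup('https://www.example.com/health', {{'203.0.113.10'}, {'203.0.113.20'}})"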

Anycast BGP (same IP, multiple sites)

# Both sites announce the same IP prefix, normally from the same origin ASN
# BGP routing selects the nearest site for each client
# If a site fails, its BGP announcement withdraws, traffic routes to the other

# On each kldload site's border router (FRRouting):
router bgp 65001
  bgp router-id 203.0.113.10               # use a different router-id at each site
  neighbor 198.51.100.1 remote-as 65000    # upstream provider
  address-family ipv4 unicast
    network 203.0.113.0/24                  # announce your anycast prefix
    neighbor 198.51.100.1 activate

# When you withdraw the announcement (site goes down or maintenance):
vtysh -c "configure terminal" -c "router bgp 65001" \
  -c "address-family ipv4 unicast" \
  -c "no network 203.0.113.0/24"
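
A site only drops out of anycast if something withdraws the prefix when the service behind it fails. The sketch below ties the announcement to a local health check. It assumes the script runs on the FRRouting border router, reuses the prefix and ASN from the example above, and uses a hypothetical `HEALTH_URL`; the function names are also hypothetical.

```shell
#!/bin/sh
# Announce the anycast prefix while the local site is healthy, withdraw it otherwise.
PREFIX="203.0.113.0/24"
ASN="65001"
HEALTH_URL="http://127.0.0.1/health"   # assumption: local health endpoint

set_announcement() {
    # $1: "network" to announce, "no network" to withdraw
    vtysh -c "configure terminal" -c "router bgp $ASN" \
          -c "address-family ipv4 unicast" \
          -c "$1 $PREFIX"
}

check_and_update() {
    if curl -fsS --max-time 3 -o /dev/null "$HEALTH_URL"; then
        set_announcement "network"
    else
        set_announcement "no network"
    fi
}
```

Run `check_and_update` from a timer. A production version would likely require several consecutive failures before withdrawing, to avoid flapping the route on a single slow response.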

Two kldload sites with Cloudflare failover

# Site 1 (primary): HAProxy + keepalived, VIP at 203.0.113.10
# Site 2 (DR): HAProxy + keepalived, VIP at 203.0.113.20
# Both sites: identical WireGuard backplane configs
# Config replication: zfs send | zfs recv over WireGuard tunnel

# Cloudflare health monitor requests /health on both sites every 60s
# DNS TTL: 60 seconds (fast failover)
# If Site 1 /health returns non-200 for 2 consecutive checks:
# Cloudflare updates DNS to 203.0.113.20
# Detection takes up to 120 seconds and cached DNS answers live up to 60 more,
# so users start hitting Site 2 within roughly 2-3 minutes
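
The failover window can be estimated from the three numbers in this setup. This is a sketch under a simple model: the site fails just after a passing check, every required check then fails, and resolvers hold the old answer for a full TTL. The function name is hypothetical.

```shell
#!/bin/sh
# Worst-case DNS failover time in seconds.
# $1: health check interval, $2: consecutive failures required, $3: DNS TTL
failover_worst_case() {
    echo $(( $1 * $2 + $3 ))
}

failover_worst_case 60 2 60   # prints 180
```

Shortening the check interval reduces detection time, while the TTL term is a floor you cannot go below for clients that already cached the old record.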

13. Monitoring and Troubleshooting

HAProxy stats page

# Enable stats page (bind to backplane address, not public)
frontend stats
    bind 10.10.0.1:8404
    stats enable
    stats uri /stats
    stats refresh 5s
    stats show-legends
    stats show-node
    stats auth admin:changeme

# Access at http://10.10.0.1:8404/stats
# Shows: frontend/backend/server state, connection rates, error rates, health check status

HAProxy runtime API via socat

# HAProxy exposes a runtime API over a Unix socket
# Enable in global section:
# stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners

# Show backend status
echo "show servers state" | socat stdio /run/haproxy/admin.sock

# Drain a server (stop sending new connections, let existing finish)
echo "set server web_servers/web1 state drain" | socat stdio /run/haproxy/admin.sock

# Take a server offline for maintenance
echo "set server web_servers/web1 state maint" | socat stdio /run/haproxy/admin.sock

# Bring it back
echo "set server web_servers/web1 state ready" | socat stdio /run/haproxy/admin.sock

# Show current connections
echo "show info" | socat stdio /run/haproxy/admin.sock | grep "CurrConns"
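
For rolling deploys, the drain command is usually followed by a wait until existing sessions finish. The sketch below polls `show stat`, whose CSV output carries the current session count (`scur`) in the fifth field. It assumes the socket path from above; the function names are hypothetical.

```shell
#!/bin/sh
SOCK="/run/haproxy/admin.sock"

# Send one command to the HAProxy runtime API.
haproxy_cmd() {
    echo "$1" | socat stdio "$SOCK"
}

# Print the current session count (scur, 5th CSV field of "show stat")
# for one server in one backend.
server_sessions() {
    haproxy_cmd "show stat" | awk -F, -v be="$1" -v srv="$2" \
        '$1 == be && $2 == srv { print $5 }'
}

# Drain a server, then wait until its sessions finish or time runs out.
# $1: backend, $2: server, $3: seconds to wait (default 60)
drain_and_wait() {
    be="$1"; srv="$2"; remaining="${3:-60}"
    haproxy_cmd "set server $be/$srv state drain"
    while [ "$remaining" -gt 0 ]; do
        [ "$(server_sessions "$be" "$srv")" = "0" ] && return 0
        sleep 1
        remaining=$((remaining - 1))
    done
    return 1
}
```

A deploy script could call `drain_and_wait web_servers web1 120`, upgrade the node, then set the server back to `ready`.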

Common issues and fixes

Symptom	Likely cause	Fix
Connection refused on frontend port	HAProxy not running, or firewall blocking	`systemctl status haproxy`, check `haproxy -c` for config errors, check nftables rules
502 Bad Gateway	Backend accepted the connection but sent an invalid or empty response, or closed it early	Check backend application logs for crashes or resets, verify the health check endpoint returns 200
503 Service Unavailable	All backends are DOWN	Check stats page for backend health, `show servers state` via socat, verify health check config
Timeouts on long requests	HAProxy timeout too short	Increase `timeout server` and `timeout client` for long-lived connections (uploads, WebSockets)
VIP not floating after node failure	keepalived not running, or VRRP blocked	`systemctl status keepalived`, verify VRRP protocol (112) is not blocked by firewall
HAProxy reload fails	Config syntax error	`haproxy -c -f /etc/haproxy/haproxy.cfg` — always run this before reload; rollback ZFS snapshot if needed
Backends marked DOWN but they respond	Health check misconfigured	Verify health check URL, expected status code, and Host header; curl the health endpoint manually from the LB
Split-brain on keepalived	Both nodes become MASTER simultaneously	Check VRRP multicast reachability between nodes, verify identical `virtual_router_id` and `auth_pass`

Config validation and debugging workflow

# 1. Always validate before reload
haproxy -c -f /etc/haproxy/haproxy.cfg
# "Configuration file is valid" means the syntax is correct; it does not check your routing logic

# 2. Reload without downtime
systemctl reload haproxy
# HAProxy keeps existing connections alive, loads new config

# 3. Check logs
journalctl -u haproxy -f
# HAProxy logs to syslog; look for "Server backend/server is DOWN"

# 4. Test a specific backend from the LB node
curl -v http://10.10.0.11:8080/health
# If this fails, the backend is down or the health check path is wrong

# 5. Check keepalived state
ip addr show | grep -A2 "inet 10.10.0.10"
# Should appear on exactly one node

# 6. Verify WireGuard connectivity to backends
ping 10.10.0.11
wg show   # verify handshakes are recent

# 7. Check HAProxy stats for server state
echo "show servers state web_servers" | socat stdio /run/haproxy/admin.sock
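
Step 7 can be reduced to a one-line answer by filtering the `show servers state` output. In that output the sixth field is `srv_op_state`, where 0 means stopped and 2 means running. The function name below is hypothetical.

```shell
#!/bin/sh
# Print "backend/server" for every server whose operational state is 0 (stopped).
# Reads "show servers state" output on stdin; field 6 is srv_op_state.
list_down_servers() {
    # Skip the version line and the "#" header; data rows start with a numeric be_id
    awk '$1 ~ /^[0-9]+$/ && NF >= 6 && $6 == 0 { print $2 "/" $4 }'
}
```

Usage: `echo "show servers state" | socat stdio /run/haproxy/admin.sock | list_down_servers`. Empty output means no server is marked down.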

WireGuard Masterclass — the backplane that hides your backend pool
nftables Masterclass — firewall rules for the LB node
Cilium Masterclass — eBPF load balancing inside Kubernetes
BIRD & BGP Masterclass — BGP for multi-site anycast
Observability Masterclass — Prometheus, Grafana, HAProxy metrics
WireGuard Mesh & Multi-Site — building the backplane
Cluster & Blue/Green — zero-downtime deploy patterns
Monitoring Stack — haproxy_exporter, Prometheus, Grafana
