Security Hardening Masterclass
This guide goes deep on securing a kldload system — every layer, with real commands you can run today. If you have read the Security overview and the nftables Masterclass, this is the next step: you have a running system, now lock it down from the kernel to the application layer, with monitoring that tells you when something is wrong before the attacker does.
Security is not a product you install. It is layers. Encrypted transport (WireGuard), firewalled perimeters (nftables), hardened services (systemd sandboxing), runtime detection (eBPF/Falco), data integrity (ZFS checksums), and access control (RBAC/certificates). kldload provides the foundation for every layer. This masterclass teaches you to harden it.
Prerequisites: a running kldload system. No Kubernetes required — everything here applies to bare-metal servers, VMs, and containers. The sections build on each other but each stands alone. Skip to what you need.
1. The kldload Security Baseline
A freshly installed kldload system is already ahead of most Linux distributions. Here is what you get without doing anything extra, and what still needs your attention.
| Layer | What kldload provides | Status |
|---|---|---|
| Data integrity | ZFS checksums on every block — detects silent corruption and tampering | built in |
| Encrypted transport | WireGuard installed and configured — backplane encryption between nodes | built in |
| Firewall | nftables active — but default rules are permissive. You write the policy. | configure |
| Mandatory access control | SELinux enforcing (CentOS/RHEL/Rocky/Fedora) · AppArmor (Debian/Ubuntu/Alpine) | built in |
| SSH | Key auth configured, root login disabled, password auth off by default | built in |
| Service hardening | systemd units — not hardened by default. You add sandbox options. | configure |
| Runtime detection | eBPF available — Falco/Tetragon not installed by default | install |
| Kernel hardening | Default kernel settings. CIS-required sysctls not applied by default. | configure |
| Audit logging | journald active. auditd not installed by default. | install |
| Certificates | No internal CA. You bring your own or deploy step-ca. | bring your own |
The security audit command
Before hardening anything, run this to see where you stand:
# Score every service unit's sandbox exposure
systemd-analyze security
# Score a specific service
systemd-analyze security sshd.service
# Full security report (shows what each setting does)
systemd-analyze security --no-pager nginx.service
The output scores each service from 0 (fully sandboxed) to 10 (completely exposed). Most default units score 9.6 or higher. After applying the hardening in Section 5, expect scores below 3.0 for well-hardened services.
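To turn that report into a worklist, a small filter can pull out the units above a chosen exposure score. This is a minimal sketch: `exposed_units` is a hypothetical helper name, and it assumes the default table layout of `systemd-analyze security` (unit name in the first column, score in the second).

```shell
# exposed_units [THRESHOLD]
# Reads `systemd-analyze security --no-pager` output on stdin and prints
# "unit score" for every service at or above THRESHOLD (default 7.0).
# Hypothetical helper; assumes the unit is column 1 and the score column 2.
exposed_units() {
    awk -v t="${1:-7.0}" '$1 ~ /\.service$/ && ($2 + 0) >= (t + 0) { print $1, $2 }'
}
```

Typical use: `systemd-analyze security --no-pager | exposed_units 7.0`. Harden the units it lists first, starting with anything that listens on the network.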
2. SSH Hardening
SSH is the most common attack surface on Linux systems. On kldload, the correct strategy is to make SSH invisible from the public internet entirely — bind it only to the WireGuard backplane interface. Then harden the daemon itself for the cases where you need access from a non-WireGuard connection.
Key-only authentication
kldload disables password authentication by default. Verify it, and add the remaining hardening settings:
# /etc/ssh/sshd_config.d/99-hardened.conf
# Drop this file — it overrides the defaults
PasswordAuthentication no
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
# Disable less-used auth methods
ChallengeResponseAuthentication no
KerberosAuthentication no
GSSAPIAuthentication no
# Limit who can log in
AllowUsers deploy ops
# AllowGroups sshusers # alternative: group-based
# Harden the protocol
# (OpenSSH 7.6 and later speak only protocol 2, so the deprecated Protocol keyword is omitted)
LoginGraceTime 30
MaxAuthTries 3
MaxSessions 5
PermitRootLogin no
PermitEmptyPasswords no
# Restrict ciphers, MACs, and key exchange to modern algorithms
Ciphers chacha20-poly1305@openssh.com,aes256-gcm@openssh.com
MACs hmac-sha2-512-etm@openssh.com,hmac-sha2-256-etm@openssh.com
KexAlgorithms curve25519-sha256,curve25519-sha256@libssh.org
# Log authentication events
LogLevel VERBOSE
SyslogFacility AUTH
# Idle session timeout
ClientAliveInterval 300
ClientAliveCountMax 2
# Validate and reload
sshd -t && systemctl reload sshd
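`sshd -t` only checks syntax. To confirm that the values the daemon will actually use match the file above, you can inspect the effective configuration that `sshd -T` prints (lowercase keyword, then value). The sketch below assumes that output format; `check_sshd_settings` and the list of checked settings are illustrative, not part of kldload.

```shell
# check_sshd_settings
# Reads `sshd -T` output on stdin and reports every expected setting that is
# not present. Exit status is 0 only when all expected settings are found.
# Hypothetical helper; extend the list to match your own policy.
check_sshd_settings() (
    cfg="$(cat)"
    rc=0
    for want in "passwordauthentication no" "permitrootlogin no" \
                "permitemptypasswords no" "pubkeyauthentication yes"; do
        # -x: whole-line match, -i: ignore case
        if ! printf '%s\n' "$cfg" | grep -qix -- "$want"; then
            echo "MISSING: $want"
            rc=1
        fi
    done
    # The function body is a subshell, so exit only ends the function
    exit "$rc"
)
```

Run it as root with `sshd -T | check_sshd_settings`. A later drop-in or a `Match` block can override an earlier setting, which is exactly the case a syntax check will not catch.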
Bind SSH to the WireGuard interface only
This is the most effective SSH hardening step: the SSH port does not exist on the public interface. There is nothing to brute force.
# In /etc/ssh/sshd_config.d/99-hardened.conf, add (use your WireGuard IP):
ListenAddress 10.100.0.1
# Restart (not reload — ListenAddress requires restart)
systemctl restart sshd
# Verify it is bound only to the WireGuard interface
ss -tlnp | grep :22
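Reading `ss` output by eye works for one host. For a fleet, a check that prints only the unexpected listeners is easier to automate. The sketch below assumes the standard `ss -tln` column layout (local address and port in the fourth column); `unexpected_listeners` is a hypothetical helper name.

```shell
# unexpected_listeners PORT ALLOWED_ADDR
# Reads `ss -tlnH` output on stdin and prints every local address that is
# listening on PORT but is not ALLOWED_ADDR. No output means the port is
# bound only where you expect it.
# Hypothetical helper; assumes the local address:port is column 4.
unexpected_listeners() {
    awk -v p="$1" -v a="$2" '{
        n = split($4, parts, ":")          # port is the last colon-separated part
        lport = parts[n]
        addr = substr($4, 1, length($4) - length(lport) - 1)
        if (lport == p && addr != a) print addr
    }'
}
```

Example: `ss -tlnH | unexpected_listeners 22 10.100.0.1`. Any line of output (such as `0.0.0.0` or `[::]`) means SSH is still reachable outside the backplane.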
nftables rate limiting (better than fail2ban)
If you must expose SSH on a public interface, use nftables rate limiting instead of fail2ban. It operates in the kernel with no userspace scanning of log files.
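# /etc/nftables.d/ssh-ratelimit.nft
table inet filter {
    # Sources that exceeded the limit; entries expire after 1 hour
    set blocklist {
        type ipv4_addr
        flags dynamic, timeout
        timeout 1h
    }
    # Per-source rate state (a plain "limit rate" would be global, not per IP)
    set ssh_meter {
        type ipv4_addr
        size 65535
        flags dynamic, timeout
        timeout 5m
    }
    chain input {
        # Drop already-blocked sources first; keep this rule at the top of the chain
        ip saddr @blocklist drop
        # More than 5 new SSH connections per minute from one source:
        # add the source to the blocklist for 1 hour and drop the packet
        tcp dport 22 ct state new \
            add @ssh_meter { ip saddr limit rate over 5/minute } \
            add @blocklist { ip saddr timeout 1h } \
            drop
    }
}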
SSH certificates with step-ca
SSH authorized_keys management does not scale. With SSH certificates issued by an internal CA, you grant access by issuing a certificate, revoke access by expiring it, and eliminate the authorized_keys file entirely.
# Install step-ca on a dedicated CA host
dnf install step-ca step-cli # CentOS/RHEL
apt install step-ca step-cli # Debian/Ubuntu
# Initialize the CA
step ca init \
--name "kldload-internal" \
--dns "ca.internal" \
--address ":8443" \
--provisioner "ssh-provisioner" \
--ssh
# Start the CA
systemctl enable --now step-ca
# On target hosts: configure sshd to use its host certificate and trust user certificates from the CA
step ssh config --host --set Certificate=/etc/ssh/ssh_host_ecdsa_key-cert.pub \
--set Key=/etc/ssh/ssh_host_ecdsa_key
# Users get a short-lived certificate (step-ca's default user certificate lifetime is 16 hours unless the provisioner overrides it)
step ssh login user@example.com --ca-url https://ca.internal:8443
# /etc/ssh/sshd_config.d/99-cert-auth.conf
TrustedUserCAKeys /etc/ssh/trusted-user-ca-keys.pem
HostCertificate /etc/ssh/ssh_host_ecdsa_key-cert.pub
3. SELinux (CentOS / RHEL / Rocky / Fedora)
SELinux is mandatory access control at the kernel level. A process can only access what the SELinux policy explicitly permits, regardless of Unix permissions. A compromised web server running as root still cannot read /etc/shadow or write to /home if the SELinux policy does not allow it.
Enforcing
Policy violations are blocked and logged. This is the correct production mode. Never disable SELinux because something breaks — fix the policy.
Permissive
Policy violations are logged but not blocked. Use this temporarily when tuning a new policy. Never leave a production system in permissive mode.
Disabled
SELinux off. Requires a reboot to re-enable. All the protection is gone. If you disabled SELinux to fix a problem, you fixed the wrong thing.
Context labels
Every file, process, and port has a context label. Policy rules permit or deny operations between labels. ls -Z shows file contexts, ps -Z shows process contexts.
Check SELinux status
getenforce
sestatus
# Check what policy is loaded
semodule -l | head -20
Troubleshooting denials
# See recent denials
ausearch -m avc -ts recent
# Explain a denial in plain English
ausearch -m avc -ts recent | audit2why
# Watch denials in real time
tail -f /var/log/audit/audit.log | grep denied
# Generate a policy module to allow the denial (review mypolicy.te before installing it)
ausearch -m avc -ts recent | audit2allow -M mypolicy
semodule -i mypolicy.pp
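Before generating a policy module, it helps to see which processes are being denied and on what kind of object, so you do not allow more than you intend. The sketch below summarizes AVC records by command and object class. It assumes the usual audit record layout with `comm="..."` and `tclass=...` fields; `summarize_avc` is a hypothetical helper name, and commands whose names contain spaces are not handled.

```shell
# summarize_avc
# Reads audit log lines (or `ausearch -m avc` output) on stdin and prints
# "count command object-class", most frequent first.
# Hypothetical helper; assumes comm="..." and tclass=... fields are present.
summarize_avc() {
    awk '/avc: +denied/ {
        comm = ""; tclass = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^comm=/)   { comm = $i;   gsub(/comm=|"/, "", comm) }
            if ($i ~ /^tclass=/) { tclass = $i; sub(/^tclass=/, "", tclass) }
        }
        count[comm " " tclass]++
    }
    END { for (k in count) print count[k], k }' | sort -rn
}
```

Example: `ausearch -m avc -ts today | summarize_avc`. A single service with many `file` denials usually points to a mislabeled path (fix with `restorecon`), not to a missing policy rule.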
Fix file contexts (don't disable SELinux)
# Wrong context on a web file? Fix it:
restorecon -Rv /var/www/html/
# Check what context a path should have
matchpathcon /var/www/html/index.php
# Set a custom context for a non-standard path
semanage fcontext -a -t httpd_sys_content_t "/opt/myapp/public(/.*)?"
restorecon -Rv /opt/myapp/public/
Open a custom port for a service
# Allow nginx to listen on port 8443
semanage port -a -t http_port_t -p tcp 8443
# List all SELinux port labels
semanage port -l | grep http
Hardening booleans for common services
# Allow httpd to connect to the network (needed for proxying)
setsebool -P httpd_can_network_connect on
# Allow containers to manage cgroups (needed to run systemd inside a container)
setsebool -P container_manage_cgroup on
# List all available booleans with descriptions
semanage boolean -l | grep httpd
Custom policy for kldload services
# Create a policy module for a custom service
# 1. Run the service in permissive mode for a day to collect denials
semanage permissive -a myservice_t
# 2. Generate the policy from the collected denials
ausearch -m avc -ts today -c myservice | audit2allow -M myservice-policy
semodule -i myservice-policy.pp
# 3. Remove permissive override — enforcing now applies
semanage permissive -d myservice_t
audit2why tells you exactly what to fix in plain English. Use it.
4. AppArmor (Debian / Ubuntu / Alpine)
AppArmor is path-based mandatory access control. Where SELinux labels every object, AppArmor attaches a profile to each process and defines what filesystem paths and capabilities that process is allowed to use. Simpler to write, less granular than SELinux, but very effective for confining services.
Check AppArmor status
aa-status
# Check whether AppArmor is enabled (sets the exit status only, prints nothing)
aa-status --enabled
# Show profiles in complain (permissive) mode
aa-status | grep complain
Enforce, complain, and unconfined modes
# Put a profile into enforce mode
aa-enforce /etc/apparmor.d/usr.sbin.nginx
# Put into complain mode (logs but does not block — like SELinux permissive)
aa-complain /etc/apparmor.d/usr.sbin.nginx
# Reload all profiles
systemctl reload apparmor
# Reload a specific profile
apparmor_parser -r /etc/apparmor.d/usr.sbin.nginx
Writing an AppArmor profile
# /etc/apparmor.d/usr.local.bin.myservice
#include <tunables/global>
/usr/local/bin/myservice {
#include <abstractions/base>
# Capabilities
capability net_bind_service,
capability setuid,
capability setgid,
# Read-only access to config
/etc/myservice/** r,
# Read-write access to data directory
/var/lib/myservice/** rw,
# Write logs
/var/log/myservice.log w,
# Network: allow TCP sockets (AppArmor network rules do not distinguish inbound from outbound)
network tcp,
# Deny everything else (implicit)
}
# Load and enforce the new profile
apparmor_parser -r -W /etc/apparmor.d/usr.local.bin.myservice
Interactive profile generation with aa-genprof
# Run aa-genprof while exercising the application
aa-genprof /usr/local/bin/myservice
# In another terminal, run the application through its normal operations
# aa-genprof captures what it does and asks you to allow or deny each action
# After finishing: save, enforce
# Check denials in /var/log/kern.log or /var/log/syslog
Check AppArmor denials
# Denials appear in kernel log
journalctl -k | grep apparmor | grep DENIED
# Or audit log if auditd is running
ausearch -m AVC | grep apparmor
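Raw denial lines are long. When you are tuning a profile, what you need from each one is the profile, the path, and the access that was refused. The sketch below extracts those three fields. It assumes the kernel's usual `apparmor="DENIED"` record layout with `profile=`, `name=`, and `denied_mask=` fields; `apparmor_denials` is a hypothetical helper name, and records without a `name=` field (capability denials, for example) are skipped.

```shell
# apparmor_denials
# Reads kernel or audit log lines on stdin and prints one unique
# "profile path denied_mask" line per distinct file denial.
# Hypothetical helper; assumes profile="..", name="..", denied_mask=".." fields.
apparmor_denials() {
    grep 'apparmor="DENIED"' |
    sed -n 's/.*profile="\([^"]*\)".* name="\([^"]*\)".*denied_mask="\([^"]*\)".*/\1 \2 \3/p' |
    sort -u
}
```

Example: `journalctl -k | apparmor_denials`. Each output line maps directly to a candidate profile rule: the path, followed by the permission letters that were denied.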
The aa-genprof workflow is genuinely good: run the application under monitoring for an hour of normal operation, then review what it did. The generated profile is usually close to correct and needs only minor editing.
5. systemd Security Hardening
systemd's sandbox directives are the lowest-effort, highest-impact security hardening available on any Linux system. Adding a handful of lines to a unit file restricts filesystem access, drops capabilities, filters syscalls, and isolates the process from the rest of the system. See the systemd Masterclass for a full treatment of unit files; this section focuses specifically on security options.
The key sandbox directives
| Directive | What it does | Score impact |
|---|---|---|
| ProtectSystem=strict | Mounts the entire file system hierarchy read-only, except /dev, /proc, and /sys. Writable paths must be granted with ReadWritePaths=. | -2.0 |
| ProtectHome=true | Makes /home, /root, /run/user inaccessible. Service cannot touch user data. | -1.0 |
| PrivateTmp=true | Service gets its own private /tmp. Cannot read other services' temp files. | -0.5 |
| NoNewPrivileges=true | Service cannot gain privileges via setuid binaries or capabilities. | -1.5 |
| CapabilityBoundingSet= | Drops all capabilities the service does not need. Empty = no capabilities. | -1.5 |
| SystemCallFilter= | Allowlist of syscalls the service may call. Any other call terminates the process with SIGSYS (or returns an error if SystemCallErrorNumber= is set). | -1.0 |
| PrivateNetwork=true | Service gets its own network namespace. Cannot touch host network at all. | -1.5 |
| PrivateUsers=true | Service runs in a user namespace — root inside is not root outside. | -1.0 |
| RestrictAddressFamilies= | Limits which socket address families the service can use (AF_INET, AF_UNIX, etc.) | -0.5 |
| MemoryDenyWriteExecute=true | Prevents creating writable+executable memory mappings. Defeats many exploits. | -0.5 |
Hardened unit file template
# /etc/systemd/system/myservice.service
[Unit]
Description=My Hardened Service
After=network.target
[Service]
Type=simple
User=myservice
Group=myservice
ExecStart=/usr/local/bin/myservice --config /etc/myservice/config.yaml
# Security hardening
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
PrivateDevices=true
ProtectControlGroups=true
ProtectKernelModules=true
ProtectKernelTunables=true
ProtectHostname=true
ProtectClock=true
RestrictSUIDSGID=true
MemoryDenyWriteExecute=true
RestrictRealtime=true
LockPersonality=true
# Drop all capabilities (add only what is needed)
CapabilityBoundingSet=
AmbientCapabilities=
# Network restrictions
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
# Syscall filtering — use systemd's predefined sets
SystemCallFilter=@system-service
SystemCallFilter=~@debug @mount @cpu-emulation @obsolete @raw-io @reboot @swap
# Filesystem access
ReadWritePaths=/var/lib/myservice
ReadOnlyPaths=/etc/myservice
[Install]
WantedBy=multi-user.target
# After editing: reload and check score
systemctl daemon-reload
systemctl restart myservice
systemd-analyze security myservice.service
Hardened nginx example
# /etc/systemd/system/nginx.service.d/hardening.conf
[Service]
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
PrivateDevices=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
SystemCallFilter=@system-service
CapabilityBoundingSet=CAP_NET_BIND_SERVICE CAP_SETUID CAP_SETGID
ReadWritePaths=/var/log/nginx /run/nginx
ReadOnlyPaths=/etc/nginx /usr/share/nginx
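Packaged units are best hardened with a drop-in like the one above, because the vendor unit file is replaced on upgrade and the drop-in is not. The sketch below writes a conservative starting drop-in for any unit. `write_hardening_dropin` is a hypothetical helper name, and the directive list is an assumed baseline taken from the template in this section, not a tested profile for any particular service.

```shell
# write_hardening_dropin UNIT [ROOT]
# Creates ROOT/UNIT.d/hardening.conf (ROOT defaults to /etc/systemd/system)
# with a baseline set of sandbox directives and prints the file path.
# Hypothetical helper; review the directives against what the service needs.
write_hardening_dropin() (
    unit="$1"
    root="${2:-/etc/systemd/system}"
    dir="$root/$unit.d"
    mkdir -p "$dir"
    cat > "$dir/hardening.conf" <<'EOF'
[Service]
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
PrivateDevices=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
EOF
    echo "$dir/hardening.conf"
)
```

After writing the file, run `systemctl daemon-reload`, restart the unit, and watch its journal. With `ProtectSystem=strict`, a service that writes anywhere outside its `ReadWritePaths=` fails at that point, which tells you which path to add.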
systemd-analyze security tells you exactly what to add and how much each setting improves the score. Work through your services one at a time, starting with the ones exposed to the network. See the systemd Masterclass for the full deep dive.
6. Kernel Hardening (sysctl)
Linux ships with insecure defaults for historical compatibility reasons. These sysctl settings apply CIS benchmark controls and close known attack vectors. Drop one file into /etc/sysctl.d/ and they apply on every boot.
# /etc/sysctl.d/99-hardened.conf
# CIS Benchmark Level 1 + 2 kernel hardening for kldload
###############################################
# Network hardening
###############################################
# Disable IP forwarding (enable only on routers/VPN gateways)
net.ipv4.ip_forward = 0
net.ipv6.conf.all.forwarding = 0
# Disable ICMP redirects (prevent route hijacking)
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv6.conf.all.accept_redirects = 0
net.ipv6.conf.default.accept_redirects = 0
# Disable source routing
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv6.conf.all.accept_source_route = 0
# Enable SYN cookies (defend against SYN flood attacks)
net.ipv4.tcp_syncookies = 1
# Enable reverse path filtering (defeat IP spoofing)
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
# Log martian packets (packets with impossible source addresses)
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.default.log_martians = 1
# Ignore ICMP broadcast requests (prevents Smurf attacks)
net.ipv4.icmp_echo_ignore_broadcasts = 1
# Ignore bogus ICMP error responses
net.ipv4.icmp_ignore_bogus_error_responses = 1
# Disable IPv6 if not in use
# net.ipv6.conf.all.disable_ipv6 = 1
###############################################
# Memory hardening
###############################################
# Enable full ASLR (address space layout randomization)
kernel.randomize_va_space = 2
# Restrict access to kernel pointers in /proc (defeats info leaks)
kernel.kptr_restrict = 2
# Restrict dmesg access to root
kernel.dmesg_restrict = 1
# Restrict unprivileged use of bpf()
kernel.unprivileged_bpf_disabled = 1
# Restrict perf_event_open (prevents side-channel attacks); level 3 is honored only by kernels carrying the Debian/Ubuntu patch, others treat it like 2
kernel.perf_event_paranoid = 3
###############################################
# Filesystem hardening
###############################################
# Restrict ptrace to a process's own descendants (set 2 to require CAP_SYS_PTRACE)
kernel.yama.ptrace_scope = 1
# Protect hardlinks (users can only link to files they own or can read and write)
fs.protected_hardlinks = 1
# Protect symlinks (in sticky world-writable dirs, follow only when the link owner matches the follower or the directory owner)
fs.protected_symlinks = 1
# Restrict /proc/PID visibility to process owner
# (requires hidepid mount option — see below)
###############################################
# Core dump hardening
###############################################
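# Disable core dumps of setuid and other privileged processes (prevents credential leakage)
fs.suid_dumpable = 0
# Append the PID to core file names so dumps do not overwrite each other
kernel.core_uses_pid = 1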
# Apply immediately without rebooting
sysctl -p /etc/sysctl.d/99-hardened.conf
# Verify a specific setting
sysctl kernel.randomize_va_space
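Checking one key at a time does not scale to a file this long, and a later file in /etc/sysctl.d/ or a running service can silently override a value. The sketch below compares every `key = value` line in a sysctl file against the live value. `sysctl_drift` is a hypothetical helper name; the second argument exists only so the reader command can be swapped out, and it defaults to `sysctl -n`.

```shell
# sysctl_drift CONF [GETTER]
# Prints "key: want=X have=Y" for every setting in CONF whose live value
# differs. GETTER is the command used to read a key (default: sysctl -n).
# Hypothetical helper; handles simple single-value "key = value" lines only.
sysctl_drift() (
    conf="$1"
    getter="${2:-sysctl -n}"
    # Drop comments and blank lines, then normalize "key = value" to "key=value"
    sed -e 's/#.*//' -e '/^[[:space:]]*$/d' \
        -e 's/[[:space:]]*=[[:space:]]*/=/' \
        -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' "$conf" |
    while IFS='=' read -r key want; do
        # $getter is intentionally unquoted so "sysctl -n" splits into two words
        have="$($getter "$key" 2>/dev/null)" || have="(unavailable)"
        [ "$have" = "$want" ] || echo "$key: want=$want have=$have"
    done
)
```

Example: `sysctl_drift /etc/sysctl.d/99-hardened.conf`. No output means the running kernel matches the file. A key reported as `(unavailable)` usually means the kernel was built without that feature.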
Restrict /proc with hidepid
# Mount /proc so users can only see their own processes
# Create the group that keeps full /proc visibility first
groupadd -f proc
# /etc/fstab
proc /proc proc defaults,hidepid=2,gid=proc 0 0
# Remount immediately
mount -o remount,hidepid=2,gid=proc /proc
# Add services that need /proc access to the proc group
usermod -aG proc www-data
7. Runtime Security with eBPF
Traditional security tools monitor log files. eBPF security tools monitor the kernel directly: they see syscalls, socket operations, and file accesses as they happen, before a log line is even written. When an attacker spawns a shell from a compromised web server, eBPF catches the execve() syscall in the kernel. Not the log entry that arrives seconds later. The kernel event itself.
Falco
Behavioral security monitoring. Watches system calls, container activity, and network connections against a ruleset. Generates alerts when behavior matches a known-bad pattern: unexpected shell in a container, privilege escalation, file access outside expected paths.
Tetragon
Kubernetes-aware eBPF enforcement from Cilium. Can block operations in the kernel (not just alert) — kill a process the moment it violates policy, before it can do damage. Identity-aware: policies attach to pods by label.
Custom eBPF programs
Write your own kernel monitors with bpftrace or libbpf. Trace specific syscalls, log file access patterns, monitor network connections by process. The eBPF Masterclass covers this in depth.
Install and configure Falco
# CentOS/RHEL/Rocky (add the Falco package repository first)
dnf install -y falco
# Debian/Ubuntu (add the Falco package repository first)
apt install -y falco
# Fedora (COPR)
dnf copr enable @falcosecurity/falco
dnf install -y falco
# Start Falco
systemctl enable --now falco
# Watch Falco alerts in real time
journalctl -fu falco
# Or write to a file
# /etc/falco/falco.yaml — set file_output: enabled: true
tail -f /var/log/falco.log
Key default Falco rules
# These fire out of the box with no configuration:
#
# Terminal shell in container
# Privilege escalation via sudo
# Write below /etc
# Read sensitive files (shadow, passwd, sudoers)
# Outbound connection to unexpected port
# New binary executed in container not in image
# Container started with --privileged
# Test that Falco is working
# Open a shell in a container — you should see an alert immediately
podman run --rm -it alpine sh
# → Falco: A shell was spawned in a container with an attached terminal
Custom Falco rules
# /etc/falco/rules.d/kldload-custom.yaml
# Alert when sshd spawns anything other than sshd or sftp-server.
# Login shells match too, so expect one alert per interactive session
# and extend the exclusion list to fit your environment.
- rule: SSHd spawns child
  desc: SSH daemon spawned a child process other than sshd or sftp-server
  condition: >
    spawned_process and
    proc.pname = sshd and
    not proc.name in (sshd, sftp-server)
  output: >
    SSH daemon spawned unexpected child (user=%user.name
    command=%proc.cmdline parent=%proc.pname)
  priority: WARNING
  tags: [ssh, lateral_movement]
# Alert if any process writes to /etc
- rule: Write to /etc
  desc: A process wrote to /etc outside of package management
  condition: >
    open_write and
    fd.name startswith /etc and
    not proc.name in (rpm, dnf, apt, dpkg, chef-client, puppet)
  output: >
    File written in /etc (user=%user.name command=%proc.cmdline
    file=%fd.name)
  priority: ERROR
  tags: [filesystem, tampering]
Tetragon enforcement policies
# Kill any process that tries to open /etc/shadow
# (as written, the policy is not scoped to a pod label; it applies to every process Tetragon observes)
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: block-shadow-access
spec:
  kprobes:
  - call: "security_file_open"
    syscall: false
    args:
    - index: 0
      type: "file"
    selectors:
    - matchArgs:
      - index: 0
        operator: "Postfix"
        values:
        - "/etc/shadow"
      matchActions:
      - action: Sigkill
bpftrace one-liners for security monitoring
# Trace all execve() calls (every process launch)
bpftrace -e 'tracepoint:syscalls:sys_enter_execve { printf("%s → %s\n", comm, str(args->filename)); }'
# Watch all outbound TCP connections
bpftrace -e 'kprobe:tcp_connect { printf("%s connecting to %s\n", comm, ntop(((struct sock *)arg0)->__sk_common.skc_daddr)); }'
# Monitor privileged file opens
bpftrace -e 'tracepoint:syscalls:sys_enter_openat /uid == 0/ { printf("root opened: %s\n", str(args->filename)); }'
# Watch for setuid() calls (privilege escalation)
bpftrace -e 'tracepoint:syscalls:sys_enter_setuid { printf("setuid(%d) by %s[%d]\n", args->uid, comm, pid); }'
8. Container Security
A container is not a security boundary by default. A container running as root with --privileged has full host access. The security comes from combining multiple restrictions: rootless execution, seccomp syscall filtering, read-only filesystem, and no-new-privileges. Each layer stops a different class of attack.
Rootless containers (Podman default)
# On kldload, Podman runs rootless by default — no daemon, no root
# Containers run as your user. Even "root" inside the container is your user outside.
# Verify: run a container and check the host process owner
podman run -d --name test nginx
# In another terminal:
ps aux | grep nginx # should show your username, not root
# Run containers explicitly as a non-root user
podman run --user 1000:1000 nginx
# For Docker (not rootless by default)
# Install docker-ce-rootless-extras and configure per Docker docs
seccomp profiles
# Use a custom seccomp profile to restrict syscalls available inside the container
# Start with Docker's default (blocks ~40 dangerous syscalls)
podman run --security-opt seccomp=/etc/podman/seccomp.json nginx
# Generate a custom profile by running the container with strace tracing
# Then allow only what was observed
# For Kubernetes: attach seccomp profile via pod spec
apiVersion: v1
kind: Pod
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault # use the container runtime's default profile
Read-only root filesystem
# Prevent the container from modifying its own filesystem
podman run --read-only \
--tmpfs /tmp \
--tmpfs /var/run \
nginx
# For Kubernetes:
containers:
- name: nginx
  securityContext:
    readOnlyRootFilesystem: true
Full hardened container run command
podman run \
--read-only \
--tmpfs /tmp \
--tmpfs /var/run \
--security-opt no-new-privileges \
--security-opt seccomp=/etc/podman/seccomp.json \
--cap-drop ALL \
--cap-add NET_BIND_SERVICE \
--user 1000:1000 \
--network slirp4netns \
nginx
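Long run commands drift: someone copies the line, drops a flag to debug something, and the shortened version ends up in a unit file. A small lint can catch that. The sketch below checks a podman command line for four of the flags used above. `missing_hardening_flags` is a hypothetical helper name, and it only recognizes the space-separated flag forms shown in this section (not `--cap-drop=ALL`).

```shell
# missing_hardening_flags COMMAND...
# Prints "missing: FLAG" for each hardening flag absent from the given
# command line. No output means all checked flags are present.
# Hypothetical helper; matches only the space-separated flag spellings.
missing_hardening_flags() (
    cmdline=" $* "
    for flag in "--read-only" "--cap-drop ALL" \
                "--security-opt no-new-privileges" "--user"; do
        case "$cmdline" in
            *" $flag "*) ;;                    # flag present
            *) echo "missing: $flag" ;;
        esac
    done
)
```

Example: `missing_hardening_flags podman run --read-only --cap-drop ALL nginx` reports the two flags that were left out. The check is textual, so treat a clean result as a reminder list, not as proof that the container is confined.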
Image scanning
# Scan images for CVEs before running them
# Trivy
dnf install trivy # or download from GitHub releases
trivy image nginx:latest
trivy image --severity HIGH,CRITICAL nginx:latest
# Grype
curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh
grype nginx:latest
# Scan images in a registry (trivy image pulls from the registry directly)
trivy image myregistry.internal/myapp:1.0
# Scan the running container filesystem
trivy rootfs /var/lib/containers/storage/overlay/.../merged
Kubernetes pod security standards
# Apply Pod Security Standards at the namespace level
# restricted: most secure, blocks root containers, host access, etc.
kubectl label namespace production \
pod-security.kubernetes.io/enforce=restricted \
pod-security.kubernetes.io/enforce-version=latest
# baseline: blocks known privilege escalations
kubectl label namespace staging \
pod-security.kubernetes.io/enforce=baseline
A container run with --privileged has full host access: it can read /etc/shadow, mount the host filesystem, and load kernel modules. Rootless plus seccomp plus read-only plus no-new-privileges makes it a real security boundary. Apply all four. Drop all capabilities and add back only what the container actually needs. Scan images before running them: trivy image is quick to run and finds CVEs that your base image maintainer has not patched yet.
9. Network Security
kldload provides three independent network security layers: nftables at the host, WireGuard for encrypted transport, and Cilium for Kubernetes pod policies. Each is independent. Bypassing one still leaves two more. This is defense in depth at the network layer.
nftables zone isolation
# /etc/nftables.conf — production hardened baseline
# Three zones: external (internet), internal (WireGuard backplane), loopback
table inet filter {
# Blocklist: populated dynamically by rate-limit rules
set blocklist {
type ipv4_addr
flags dynamic, timeout
timeout 1h
}
chain input {
type filter hook input priority 0; policy drop;
# Drop blocked IPs immediately
ip saddr @blocklist drop
# Allow loopback
iifname lo accept
# Allow established/related
ct state established,related accept
# Allow ICMP (limited rate)
ip protocol icmp limit rate 10/second accept
ip6 nexthdr icmpv6 limit rate 10/second accept
# Allow WireGuard from anywhere
udp dport 51820 accept
# Allow SSH only from WireGuard interface
iifname wg0 tcp dport 22 accept
# Allow web traffic on public interface
iifname eth0 tcp dport { 80, 443 } accept
# Log and drop everything else
log prefix "nft-drop: " flags all
drop
}
chain forward {
type filter hook forward priority 0; policy drop;
ct state established,related accept
# Add forward rules only on gateway/router systems
}
chain output {
type filter hook output priority 0; policy accept;
# Restrictive output filtering (optional, high maintenance)
}
}
table inet mangle {
chain prerouting {
type filter hook prerouting priority -150;
# Drop invalid TCP flag combinations (OS fingerprinting, attacks)
tcp flags & (fin|syn) == fin|syn drop
tcp flags & (syn|rst) == syn|rst drop
tcp flags == 0x0 drop
}
}
WireGuard backplane (services invisible from internet)
# Bind services to the WireGuard interface only
# Example: restrict PostgreSQL to backplane
# /etc/postgresql/16/main/postgresql.conf
listen_addresses = '10.100.0.1' # WireGuard IP only
# Redis (redis.conf does not allow trailing comments): backplane + loopback only
bind 10.100.0.1 127.0.0.1
# Verify no public exposure
ss -tlnp | grep 5432 # should show 10.100.0.1:5432, not 0.0.0.0
DNS sinkhole for malware domains
# Redirect known-malicious domains to a dead end
# Using systemd-resolved or Unbound
# With Unbound: add a local zone that returns NXDOMAIN for bad domains
# /etc/unbound/conf.d/sinkhole.conf
local-zone: "malware-c2.example.com." always_nxdomain
# Automate with a blocklist feed (hosts file format)
# Pi-hole or AdGuard Home for managed blocking with dashboards
# Block DNS-over-HTTPS providers to prevent bypass
# (nftables: block port 443 to known DoH IPs)
table inet filter {
set doh_servers {
type ipv4_addr
elements = {
1.1.1.1, # Cloudflare DoH
8.8.8.8, # Google DoH
9.9.9.9 # Quad9 DoH
}
}
chain output {
ip daddr @doh_servers tcp dport 443 drop
}
}
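The blocklist feeds mentioned above are usually published in hosts-file format (`0.0.0.0 bad.example.com`). Converting one to Unbound configuration is a short text transformation. The sketch below assumes that input format and emits Unbound `always_nxdomain` local zones; `hosts_to_unbound` is a hypothetical helper name, and the feed itself is something you must choose and trust.

```shell
# hosts_to_unbound
# Reads a hosts-format blocklist on stdin and prints one Unbound local-zone
# line per unique blocked domain. Comment lines and localhost entries are skipped.
# Hypothetical helper; assumes "0.0.0.0 domain" or "127.0.0.1 domain" lines.
hosts_to_unbound() {
    awk '($1 == "0.0.0.0" || $1 == "127.0.0.1") &&
         $2 != "" && $2 != "localhost" && $2 != "0.0.0.0" && !seen[$2]++ {
        printf "local-zone: \"%s.\" always_nxdomain\n", $2
    }'
}
```

Write the output to a file under your Unbound include directory, validate with `unbound-checkconf`, then reload Unbound. Regenerate the file on a timer so the sinkhole follows the feed.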
Cilium L7 network policies
# Allow only GET /api/health from frontend to backend
# Deny all other HTTP methods and paths
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: api-policy
spec:
  endpointSelector:
    matchLabels:
      app: backend
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: frontend
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/api/health"
10. Data Security with ZFS
ZFS is not just a filesystem — it is a data integrity and security layer. Checksums detect corruption and tampering. Snapshots provide immutable recovery points. Encryption protects data at rest. Replication sends incremental encrypted streams to a DR site.
ZFS checksums detect tampering
# Every block on ZFS has a checksum (sha256 or blake3 recommended for security)
zfs get checksum tank/data
# NAME PROPERTY VALUE SOURCE
# tank/data checksum sha256 local
# Set blake3 (faster than sha256; requires OpenZFS 2.2 or later)
zfs set checksum=blake3 tank/data
# Verify all data integrity (reports any silent corruption or tampering)
zpool scrub tank
# Check scrub results
zpool status tank | grep scan
ZFS snapshots as ransomware defense
# Snapshots are read-only after creation
# Ransomware running as an unprivileged user cannot modify or delete them;
# destroying a snapshot requires root or a delegated "destroy" permission
# (this protects the snapshot itself; the live filesystem can still be affected)
# Create automated snapshots every hour
# Using sanoid (installed by kldload)
# /etc/sanoid/sanoid.conf
[tank/data]
use_template = production
recursive = yes
[template_production]
frequently = 0
hourly = 24
daily = 30
monthly = 3
yearly = 0
autosnap = yes
autoprune = yes
systemctl enable --now sanoid.timer
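# After a ransomware attack: roll back to last clean snapshot
# (sanoid names its snapshots autosnap_<date>_<time>_<period>)
zfs list -t snapshot tank/data
# -r is required when newer snapshots exist, and it destroys them
zfs rollback -r tank/data@autosnap_2026-04-02_03:00:00_hourly
# Or clone the snapshot read-only to recover specific files
zfs clone -o readonly=on -o mountpoint=/mnt/recovery \
tank/data@autosnap_2026-04-02_03:00:00_hourly tank/recovery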
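During an incident you usually know roughly when the damage started, not which snapshot name to use. The sketch below picks the newest snapshot taken before a cutoff time. It assumes sanoid's `autosnap_YYYY-MM-DD_HH:MM:SS_period` naming; `last_snapshot_before` is a hypothetical helper name, and snapshots with other names are ignored.

```shell
# last_snapshot_before CUTOFF
# Reads snapshot names on stdin and prints the newest sanoid snapshot whose
# timestamp is earlier than CUTOFF (format: YYYY-MM-DD_HH:MM:SS).
# Hypothetical helper; relies on the timestamp sorting correctly as text.
last_snapshot_before() {
    awk -v c="$1" '
        match($0, /@autosnap_[0-9-]+_[0-9:]+/) {
            # skip the 10 characters of "@autosnap_" to get the timestamp
            ts = substr($0, RSTART + 10, RLENGTH - 10)
            if (ts < c && ts > best_ts) { best_ts = ts; best = $0 }
        }
        END { if (best != "") print best }'
}
```

Example: `zfs list -H -t snapshot -o name tank/data | last_snapshot_before 2026-04-02_03:30:00`. Inspect the result (clone it or browse `.zfs/snapshot/`) before rolling back, since a rollback discards everything written after that point.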
ZFS encryption per dataset
# Encrypt a dataset (does not encrypt the pool itself — per-dataset is more flexible)
zfs create \
-o encryption=aes-256-gcm \
-o keylocation=prompt \
-o keyformat=passphrase \
tank/secrets
# Or use a key file
zfs create \
-o encryption=aes-256-gcm \
-o keylocation=file:///etc/zfs/tank-secrets.key \
-o keyformat=hex \
tank/secrets
# Load the key and mount
zfs load-key tank/secrets
zfs mount tank/secrets
# Unload the key (data becomes inaccessible without re-loading)
zfs umount tank/secrets
zfs unload-key tank/secrets
Encrypted replication to DR site
# Send encrypted snapshots to a remote site
# The remote site cannot decrypt — they store ciphertext only
zfs send -R --raw tank/data@snapshot | \
ssh backup-server zfs recv -F backup/data
# Incremental (only changes since last snapshot); keep --raw so the stream stays encrypted
zfs send --raw -i tank/data@prev-snapshot tank/data@current | \
ssh backup-server zfs recv -F backup/data
# Automate with syncoid (sanoid companion)
syncoid --recursive --sendoptions=w tank/data backup-server:backup/data
# --sendoptions=w sends raw (encrypted) stream
11. Compliance Frameworks
Compliance is not security, but security enables compliance. Most CIS benchmark controls map directly to features kldload provides. This section maps the major framework controls to kldload configuration.
CIS Benchmark controls mapped to kldload
| CIS Control | kldload feature | Where to configure |
|---|---|---|
| 1.1 — Filesystem integrity | ZFS checksums on every block | Built in — set checksum=sha256 |
| 2.1 — Encrypted storage | ZFS per-dataset encryption (AES-256-GCM) | Section 10 above |
| 3.1 — Network packet filtering | nftables host firewall | Section 9 + nftables Masterclass |
| 3.2 — Network parameter hardening | sysctl hardening (ICMP, SYN cookies, source routing) | Section 6 above |
| 4.1 — Encrypted transport | WireGuard backplane encryption | WireGuard Masterclass |
| 5.1 — Access control | SELinux/AppArmor mandatory access control | Sections 3 and 4 above |
| 5.2 — SSH hardening | Key-only auth, WireGuard-bound SSH | Section 2 above |
| 6.1 — Audit logging | auditd + journald | Install auditd, configure below |
| 6.2 — Log integrity | Remote logging to syslog over WireGuard | journald + rsyslog remote |
| 7.1 — Malware detection | Falco/eBPF behavioral monitoring | Section 7 above |
| 8.1 — Incident response | ZFS snapshots for pre-incident state, Falco for detection | Section 12 below |
Install and configure auditd
# Install
dnf install -y audit audit-libs # CentOS/RHEL
apt install -y auditd audispd-plugins # Debian/Ubuntu
systemctl enable --now auditd
# /etc/audit/rules.d/99-kldload-hardened.rules
# CIS Benchmark audit rules
# Identity changes
-w /etc/passwd -p wa -k identity
-w /etc/group -p wa -k identity
-w /etc/shadow -p wa -k identity
-w /etc/sudoers -p wa -k identity
-w /etc/sudoers.d/ -p wa -k identity
# Authentication
-w /var/log/faillog -p wa -k logins
-w /var/log/lastlog -p wa -k logins
-w /var/log/wtmp -p wa -k logins
# Network configuration changes
-w /etc/hosts -p wa -k system-locale
-w /etc/hostname -p wa -k system-locale
-w /etc/sysconfig/network -p wa -k system-locale
# Privilege escalation
-a always,exit -F arch=b64 -S setuid -F a0=0 -F exe=/usr/bin/su -k special-priv
-a always,exit -F arch=b64 -S setresuid -F a0=0 -F exe=/usr/bin/sudo -k special-priv
# Time changes
-a always,exit -F arch=b64 -S adjtimex -S settimeofday -k time-change
-w /etc/localtime -p wa -k time-change
# Kernel modules
-w /sbin/insmod -p x -k modules
-w /sbin/rmmod -p x -k modules
-w /sbin/modprobe -p x -k modules
-a always,exit -F arch=b64 -S init_module -S delete_module -k modules
# Make audit config immutable (requires reboot to change)
-e 2
# Reload rules
augenrules --load
# Search audit log
ausearch -k identity -ts today
ausearch -k special-priv -ts recent --raw | aureport -u
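Once `-e 2` is processed, later rule changes are rejected until reboot, so it is expected to be the final effective line of the rules. A minimal lint sketch, assuming you run it against the rules file before `augenrules --load`; the function name `immutable_flag_is_last` is hypothetical:

```shell
# Return 0 when the last non-blank, non-comment line of the file is "-e 2".
immutable_flag_is_last() {
    awk 'NF && $1 !~ /^#/ { last = $0 }
         END {
             gsub(/^[ \t]+|[ \t]+$/, "", last)   # trim surrounding whitespace
             if (last == "-e 2") exit 0
             exit 1
         }' "$1"
}

# Usage:
#   immutable_flag_is_last /etc/audit/rules.d/99-kldload-hardened.rules \
#       || echo "WARNING: -e 2 is missing or not the last rule"
```

This checks one file only. If other files in /etc/audit/rules.d/ sort after it, check the merged /etc/audit/audit.rules instead.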
Remote audit logging
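# Hand audit events to the local syslog daemon, which forwards them to a remote syslog server over WireGuard
# /etc/audisp/plugins.d/syslog.conf (audit 3.x and later: /etc/audit/plugins.d/syslog.conf)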
active = yes
direction = out
path = builtin_syslog
type = builtin
args = LOG_INFO LOG_DAEMON
format = string
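The plugin only delivers audit events to the local syslog daemon. Shipping them off the host still needs a forwarding rule in that daemon. A minimal sketch that renders such a rule in rsyslog's legacy selector syntax. The collector address 10.99.0.1, the port, and the drop-in file name are placeholders, not kldload defaults:

```shell
# Print an rsyslog forwarding rule that sends the daemon facility to a
# remote collector over TCP ("@@" means TCP, a single "@" means UDP).
render_forward_rule() {
    collector=$1
    port=${2:-514}
    printf 'daemon.* @@%s:%s\n' "$collector" "$port"
}

# Usage (hypothetical WireGuard address of the log collector):
#   render_forward_rule 10.99.0.1 514 > /etc/rsyslog.d/60-audit-forward.conf
#   systemctl restart rsyslog
```

Pointing the rule at the collector's WireGuard address keeps the log stream on the encrypted backplane.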
12. Incident Response
When a compromise happens, the first 10 minutes determine whether you contain it or lose the environment. The kldload stack gives you tools for every phase: detection, containment, investigation, and recovery. Know the playbook before you need it.
Detection
# Falco alerts appearing? Check the journal immediately
journalctl -fu falco --since "10 minutes ago"
# Failed SSH logins (possible brute force or credential stuffing)
journalctl -u sshd | grep "Failed password\|Invalid user" | tail -50
# ZFS checksum errors (possible disk failure or tampering)
zpool status | grep errors
zpool status -v tank   # read the CKSUM column per device and any files listed with errors
# Unusual network connections
ss -tapn | grep ESTABLISHED
# Look for unexpected outbound connections (C2 beaconing)
# Unexpected privileged processes (root-owned, most recently started first)
ps -U root -u root -o pid,ppid,lstart,args --sort=-start_time | head -20
# Recently modified files in critical directories
find /etc /usr/bin /usr/sbin -newer /var/log/dnf.log -ls 2>/dev/null | head -30
# Check for new SUID/SGID binaries
find / -perm /6000 -type f 2>/dev/null | sort > /tmp/suid-now.txt
# Compare against a baseline you took when the system was clean
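The baseline comparison can be sketched as follows, assuming you saved an earlier scan from a clean system. The function name `new_suid_entries` and the baseline path in the usage comment are hypothetical:

```shell
# Print paths that appear in the current scan but not in the clean baseline.
# Usage: new_suid_entries BASELINE_FILE CURRENT_FILE
new_suid_entries() {
    base_sorted=$(mktemp) || return 1
    now_sorted=$(mktemp) || { rm -f "$base_sorted"; return 1; }
    # comm needs both inputs sorted with the same collation, so sort under LC_ALL=C
    LC_ALL=C sort -u "$1" > "$base_sorted"
    LC_ALL=C sort -u "$2" > "$now_sorted"
    LC_ALL=C comm -13 "$base_sorted" "$now_sorted"
    rm -f "$base_sorted" "$now_sorted"
}

# Usage:
#   new_suid_entries /var/lib/suid-baseline.txt /tmp/suid-now.txt
```

Any path it prints is a SUID/SGID binary that did not exist when the baseline was taken and deserves a closer look.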
Containment
# 1. Take a ZFS snapshot immediately — before investigation changes anything
zfs snapshot -r tank@incident-$(date +%Y%m%d-%H%M%S)
# 2. Block suspicious IP in nftables (immediate, no service restart)
nft add element inet filter blocklist { 1.2.3.4 }
# 3. Isolate a compromised service without killing it
# Insert a drop rule for the specific service port at the top of the chain
# (a rule appended with "add" would sit behind any earlier accept rule)
nft insert rule inet filter input tcp dport 8080 drop
# 4. Pause a container under investigation (preserve state for forensics)
podman pause compromised-container
# 5. Disable a compromised user account
# (-L only locks the password; -e 1 expires the account so key-based logins fail too)
usermod -L -e 1 compromised-user
# Immediately revoke their SSH keys
# Remove from authorized_keys on all hosts
# Revoke their SSH certificate at the step-ca
# 6. Revoke a WireGuard peer
# Remove the peer from the running interface (takes effect immediately)
wg set wg0 peer <compromised-pubkey> remove
# Then delete its [Peer] block from /etc/wireguard/wg0.conf so it does not return on restart
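Removing the peer from /etc/wireguard/wg0.conf by hand is error-prone under pressure. A minimal sketch that drops the matching [Peer] section from a config read on stdin, assuming the usual wg-quick layout of `[Section]` headers followed by `Key = Value` lines. The function name `strip_wg_peer` is hypothetical:

```shell
# Copy a WireGuard config from stdin to stdout, omitting the section whose
# PublicKey equals the first argument. Other sections pass through unchanged.
strip_wg_peer() {
    awk -v key="$1" '
        function emit() { if (!drop) printf "%s", buf; buf = ""; drop = 0 }
        /^\[/ { emit() }                      # a new section starts: flush the previous one
        {
            buf = buf $0 "\n"
            line = $0
            gsub(/[ \t]/, "", line)           # ignore spacing around "="
            if (line == "PublicKey=" key) drop = 1
        }
        END { emit() }'
}

# Usage (write to a new file and review it before replacing the original):
#   strip_wg_peer '<compromised-pubkey>' < /etc/wireguard/wg0.conf > /etc/wireguard/wg0.conf.new
```

Writing to a separate file first lets you diff the result against the original before swapping it in.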
Investigation
# Review journal logs around the incident time
journalctl --since "2026-04-02 14:00" --until "2026-04-02 15:00" -o verbose
# Check audit log for identity changes
# (ausearch parses -ts dates in the system locale's format; adjust the date to match)
ausearch -k identity -ts 2026-04-02 --raw | aureport -i
# Check what changed in /etc since the incident snapshot (zfs diff output is tab-separated)
zfs diff tank@incident-2026... tank | awk -F'\t' '$1 == "M" && $2 ~ "^/etc"'
# List files accessed by a specific process (if still running)
ls -la /proc/PID/fd
# Check crontabs and systemd timers for persistence
crontab -l -u root
systemctl list-timers --all | grep -v systemd
# Check for new user accounts or sudoers entries
grep -v "^#" /etc/sudoers /etc/sudoers.d/* 2>/dev/null
awk -F: '$3 == 0 || $3 >= 1000 {print}' /etc/passwd   # UID 0 duplicates and regular accounts
# Review recent package installs
rpm -qa --last | head -20 # CentOS/RHEL
grep "install" /var/log/dpkg.log | tail -20 # Debian/Ubuntu
# Network forensics: capture traffic from the backplane
tcpdump -i wg0 -w /tmp/incident-capture.pcap &
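Before reading a long `zfs diff` line by line, it helps to see how much changed. A minimal sketch that tallies the output by change type, assuming the default tab-separated format where the first column is M (modified), + (created), - (removed) or R (renamed). The function name `summarize_zfs_diff` is hypothetical:

```shell
# Read `zfs diff` output on stdin and print one summary line with the
# number of entries per change type.
summarize_zfs_diff() {
    awk -F '\t' '
        $1 == "M" { m++ }
        $1 == "+" { a++ }
        $1 == "-" { d++ }
        $1 == "R" { r++ }
        END { printf "modified=%d created=%d removed=%d renamed=%d\n", m, a, d, r }'
}

# Usage:
#   zfs diff SNAPSHOT DATASET | summarize_zfs_diff
```

A large count of created files on a host that should be static is a quick signal of where to dig first.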
Recovery
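# Roll back to the pre-incident ZFS snapshot
# WARNING: this destroys all changes since the snapshot, and -r also destroys every
# later snapshot, including the incident snapshot. Copy that evidence elsewhere first.
zfs rollback -r tank@clean-snapshot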
# Or: mount the snapshot and recover specific files only
zfs clone tank@clean-snapshot tank/recovery
# Copy individual files from /tank/recovery/ to production
zfs destroy tank/recovery
# Rotate all credentials
# SSH: generate new host keys
rm /etc/ssh/ssh_host_*
ssh-keygen -A
# WireGuard: regenerate all keypairs (umask 077 keeps the private key unreadable to other users)
(umask 077; wg genkey | tee /etc/wireguard/privatekey | wg pubkey > /etc/wireguard/publickey)
# Certificates: revoke and reissue at step-ca
step ca revoke <cert-serial> --ca-url https://ca.internal:8443
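# Rebuild compromised services from a known-good image
# (pin a digest you have verified; a mutable tag such as :latest can change underneath you)
podman pull nginx@sha256:<verified-digest>
podman stop compromised-container
podman rm compromised-container
podman run ... nginx@sha256:<verified-digest> # re-run from verified image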
Post-mortem checklist
# Document:
# 1. Timeline: when did the attacker gain access? How?
# 2. What did they access or modify? (ZFS diff shows filesystem changes)
# 3. Did they persist? (crontabs, new users, modified binaries)
# 4. What monitoring would have caught this earlier?
# 5. What hardening would have prevented it?
# Add the attack pattern to Falco rules
# Add the attacker's source addresses to your nftables blocklist
# File a CVE report if a 0-day was used
13. Troubleshooting
SELinux/AppArmor denials blocking legitimate traffic
# SELinux: find and explain the denial
ausearch -m avc -ts recent | audit2why
# Fix: restorecon for file context issues, semanage for port/boolean issues
# Temporarily put the service in permissive mode while debugging
semanage permissive -a httpd_t
# ... test the service, gather denials ...
semanage permissive -d httpd_t
# AppArmor: check complain mode output
aa-complain /etc/apparmor.d/usr.sbin.myservice
journalctl -k | grep apparmor
# Fix the profile, then re-enforce
aa-enforce /etc/apparmor.d/usr.sbin.myservice
Firewall blocking legitimate traffic
# Review the active ruleset
nft list ruleset
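# Insert a trace rule at the top of the chain to see which rule is matching
# (appended with "add", it would never run for packets an earlier rule already accepted or dropped)
nft insert rule inet filter input ip saddr 10.0.0.5 meta nftrace set 1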
nft monitor trace
# Check connection tracking state
conntrack -L | grep 10.0.0.5
# Temporarily allow traffic for debugging (remove after); insert places the rule first in the chain
nft insert rule inet filter input tcp dport 8080 accept
# View nftables counters to see which rules are matching
nft list ruleset | grep packets
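Grepping for "packets" also returns every rule whose counter is still zero. A minimal sketch that keeps only rules that have matched traffic, assuming the rules carry a `counter` statement so that `nft list ruleset` prints `packets N bytes M`. The function name `rules_with_hits` is hypothetical:

```shell
# Read `nft list ruleset` output on stdin and print only the lines whose
# packet counter is greater than zero.
rules_with_hits() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "packets" && $(i + 1) + 0 > 0) { print; break }
    }'
}

# Usage:
#   nft list ruleset | rules_with_hits
```

Rules without a `counter` statement never appear here, so add one to any rule you want to observe.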
Certificate issues
# Debug a TLS connection
openssl s_client -connect host:443 -showcerts
openssl s_client -connect host:443 -CAfile /etc/pki/ca-trust/source/anchors/internal-ca.crt
# Check certificate validity
openssl x509 -in /path/to/cert.pem -text -noout | grep -A2 Validity
# Verify certificate chain
openssl verify -CAfile ca.crt -untrusted intermediate.crt service.crt
# Check if step-ca is reachable
step ca health --ca-url https://ca.internal:8443
# Renew a certificate
step ca renew /etc/ssl/service.crt /etc/ssl/service.key \
--ca-url https://ca.internal:8443 --force
eBPF program failures
# Check if a Falco rule is causing issues
falco --validate /etc/falco/rules.d/custom.yaml
# List loaded eBPF programs
bpftool prog list
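# Check Tetragon status (standalone host, assuming it runs as a systemd service)
systemctl status tetragon
# Check Tetragon status (Kubernetes)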
kubectl -n kube-system get pods -l app.kubernetes.io/name=tetragon
kubectl -n kube-system logs -l app.kubernetes.io/name=tetragon
# Check whether unprivileged users may load eBPF programs
cat /proc/sys/kernel/unprivileged_bpf_disabled
# 0 = unprivileged loading allowed; 1 = disabled until reboot; 2 = disabled, but an admin can re-enable it
# Root can load programs in all three cases
# Check that the BPF JIT compiler is enabled
cat /proc/sys/net/core/bpf_jit_enable # should be 1
systemd sandbox breaking a service
# Service failing after adding sandbox options?
# Check the unit's status for the specific error
systemctl status myservice
journalctl -u myservice --since "5 minutes ago"
# Common issues:
# ProtectSystem=strict: service tries to write outside /dev, /proc and /sys (for example to /var, /usr or /etc)
# Fix: add ReadWritePaths= for the specific paths it needs
# PrivateTmp: service stores state in /tmp that another process reads
# Fix: use a non-tmp directory for shared state
# NoNewPrivileges: service executes a setuid/setgid binary (such as sudo) to gain privileges
# Fix: restructure service to separate privileged and unprivileged parts
# CapabilityBoundingSet=: service needs a capability you dropped
# Fix: trace capability checks with the bcc "capable" tool (capable-bpfcc on Debian/Ubuntu) to identify which one
# Temporarily relax the sandbox to confirm it is the cause
systemctl edit myservice
# Add a [Service] section with one directive per line:
#   NoNewPrivileges=false
#   ProtectSystem=false
systemctl restart myservice
# Remove the override afterwards: systemctl revert myservice
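When ProtectSystem=strict is the culprit, the usual fix is a drop-in that grants write access to specific paths rather than switching the sandbox off. A minimal sketch that renders such a drop-in. The function name `render_rw_dropin`, the paths, and the drop-in file name are placeholders:

```shell
# Print a systemd drop-in that adds one ReadWritePaths= line per argument.
# ProtectSystem=strict stays in force; only the listed paths become writable.
render_rw_dropin() {
    printf '[Service]\n'
    for path in "$@"; do
        printf 'ReadWritePaths=%s\n' "$path"
    done
}

# Usage:
#   mkdir -p /etc/systemd/system/myservice.service.d
#   render_rw_dropin /var/lib/myservice /var/log/myservice \
#       > /etc/systemd/system/myservice.service.d/10-rw-paths.conf
#   systemctl daemon-reload && systemctl restart myservice
```

Keeping the exception in its own drop-in file makes it easy to audit which paths each service can write to.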