DNS Masterclass
This guide goes deep on DNS — the infrastructure layer that every network request depends on before anything else happens. Whether you are running a single kldload node at home, a multi-site fleet behind WireGuard, or a Kubernetes cluster with CoreDNS, DNS determines whether your services find each other. This page covers the full stack: how DNS actually works, recursive resolvers, authoritative servers, split-horizon, service discovery, DNSSEC, WireGuard name resolution, and production fleet architecture.
What this page covers: the DNS resolution chain from stub resolver to root servers, Unbound as a recursive resolver with DNS-over-TLS, BIND9 for authoritative internal zones, split-horizon views for internal/external routing, SRV-based service discovery, CoreDNS for Kubernetes, DNSSEC validation and zone signing, DNS for WireGuard meshes, Pi-hole and blocklist DNS, dig debugging, and a complete production DNS architecture for a kldload fleet.
1. DNS Is the Foundation of Everything
Every network request starts with a DNS query. Before your browser loads a page, before your API calls a backend, before your container pulls an image — DNS resolves a name to an IP. If DNS breaks, everything breaks. Understanding DNS means understanding the first thing that happens in every network interaction.
DNS is not just a lookup table. It is a distributed, hierarchical, cached, delegated naming system that spans the entire internet. It has been running since 1983. Every protocol you use sits on top of it. HTTP, HTTPS, SMTP, SSH, gRPC, every service mesh, every container registry — all of them start with a DNS query.
2. How DNS Actually Works
DNS resolution is a chain. When your application asks "what is the IP of api.example.com?", it does not go directly to the authoritative server for example.com. It goes through a hierarchy designed so that no single server needs to know everything.
The resolution chain
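The hierarchy can be made concrete by listing the zones a recursive resolver may have to ask about, from the root down. The sketch below is pure string handling and sends no queries; `delegation_chain` is a hypothetical helper name, and not every label is necessarily a real delegation point.

```shell
#!/usr/bin/env bash
# List candidate zones for a name, from the root down.
# No DNS traffic is generated; this only splits the name into suffixes.
delegation_chain() {
  local name="${1%.}"            # drop a trailing dot if present
  local -a labels
  IFS='.' read -r -a labels <<< "$name"
  local suffix="" i
  echo "."                       # the root zone comes first
  for (( i=${#labels[@]}-1; i>=0; i-- )); do
    suffix="${labels[i]}.${suffix}"
    echo "$suffix"
  done
}

delegation_chain api.example.com
```

For api.example.com this prints the root, then com., then example.com., then the full name: the same order in which a cold-cache resolver walks the referrals.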
Caching at every level
Every layer caches. The recursive resolver caches the answer for the duration of the TTL (Time To Live) in the DNS record. If the TTL is 300 seconds, every query for that name in the next five minutes gets an instant answer from cache — no upstream queries at all. Your operating system has a stub resolver that may cache too. Your browser has its own DNS cache. TTL is the knob that controls the tradeoff between freshness and performance.
Low TTL (30–60s): changes propagate fast. Useful when you need to cut over quickly — before a migration, or when using DNS for failover. Cost: more queries, slightly higher latency on cache misses.
High TTL (3600s, 86400s): changes are slow to propagate. Useful for stable records — MX, NS, SPF. Benefit: almost every query hits cache. Cost: if you need to change the IP, clients will keep hitting the old one for up to 24 hours.
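To see the TTL a resolver is currently holding, read the second column of the answer section. The sketch below assumes input in the shape printed by `dig +noall +answer`; `answer_ttls` is a hypothetical helper name and the sample record is illustrative.

```shell
#!/usr/bin/env bash
# Print the TTL column of each answer record read from stdin.
# Expected input shape: name TTL class type rdata
answer_ttls() {
  awk 'NF >= 5 && $1 !~ /^;/ { print $2 }'
}

# Illustrative record, not live data
printf '%s\n' 'api.example.com. 300 IN A 203.0.113.50' | answer_ttls
```

Against a caching resolver, running `dig +noall +answer api.example.com | answer_ttls` twice a few seconds apart should show the value counting down, which is the cache at work.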
Record types
A / AAAA
The most common records. A maps a name to an IPv4 address. AAAA maps to IPv6. A single name can have multiple A records; resolvers commonly rotate their order (round-robin), and clients usually try the first address and fall back to the others.
CNAME
Canonical name — an alias. www.example.com CNAME example.com means "resolve www by resolving example.com." CNAMEs chain. They cannot be used at the zone apex (you cannot CNAME the root domain itself). MX and NS records cannot point to CNAMEs.
MX
Mail exchange: where to deliver email for a domain. Each record has a preference number; the lowest number is tried first. Multiple MX records provide redundancy. The target must be a name with A/AAAA records, never a CNAME.
TXT
Free-form text. Used for SPF (email sender policy), DKIM (email signing key), DMARC, domain verification (Let's Encrypt DNS-01, Google Search Console), and any custom metadata you want to attach to a domain.
SRV
Service record — maps a service name to a host and port. Format: priority, weight, port, target. Used for service discovery without a service mesh. _http._tcp.web.infra.local tells clients where the HTTP service lives.
NS / SOA
NS records delegate a zone to specific nameservers. SOA (Start of Authority) contains zone metadata: primary nameserver, admin email, serial number, refresh/retry/expire timers, and minimum TTL. Every zone has exactly one SOA.
PTR
Pointer record — reverse DNS. Maps an IP address back to a hostname. Stored under the special in-addr.arpa (IPv4) or ip6.arpa (IPv6) domains. Used by mail servers, logging systems, and security tools to resolve IPs to names.
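The reverse name is just the IPv4 octets in reverse order under in-addr.arpa. A minimal sketch, assuming a well-formed dotted-quad input; `ptr_name` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Build the in-addr.arpa owner name for an IPv4 address.
ptr_name() {
  local a b c d
  IFS='.' read -r a b c d <<< "$1"
  echo "${d}.${c}.${b}.${a}.in-addr.arpa."
}

ptr_name 10.100.10.10
```

This is the name that `dig -x 10.100.10.10` queries for you behind the scenes.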
3. Recursive Resolvers on kldload
Unbound is the standard recursive resolver for kldload fleets. It is fast, secure, supports DNS-over-TLS, DNSSEC validation, and local caching. One Unbound instance on your network serves all nodes — they send queries to it, it caches results and queries upstream on cache misses.
Install Unbound
# CentOS / RHEL / Rocky
dnf install -y unbound
# Debian / Ubuntu
apt install -y unbound
# Enable and start
systemctl enable --now unbound
# Verify
dig @127.0.0.1 example.com +short
Basic configuration
# /etc/unbound/unbound.conf
server:
# Listen on all interfaces (restrict with interface: if needed)
interface: 0.0.0.0
port: 53
# Allow queries from your network
access-control: 127.0.0.0/8 allow
access-control: 10.0.0.0/8 allow
access-control: 172.16.0.0/12 allow
access-control: 192.168.0.0/16 allow
# Cache settings
cache-min-ttl: 60
cache-max-ttl: 86400
msg-cache-size: 64m
rrset-cache-size: 128m
# Privacy: hide server identity and version
hide-identity: yes
hide-version: yes
# DNSSEC validation (see section 8)
auto-trust-anchor-file: "/var/lib/unbound/root.key"
# Log level (0=minimal, 2=verbose, 5=debug)
verbosity: 1
# Prefetch popular records before TTL expires
prefetch: yes
prefetch-key: yes
# Use 0x20 bit randomization to defeat cache poisoning
use-caps-for-id: yes
Forwarding mode vs full recursive
Unbound can operate in two modes. Full recursive (default with no forward-zone) queries root servers directly for every cache miss — completely independent, no third-party resolver in the path. Forwarding mode sends cache misses to an upstream resolver (1.1.1.1, 8.8.8.8, your ISP's resolver) and caches the results.
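# Forwarding mode: send cache misses to Cloudflare over TLS
# Add to unbound.conf:
server:
# CA bundle used to verify the upstream certificate
# (RHEL-family path; Debian/Ubuntu use /etc/ssl/certs/ca-certificates.crt)
tls-cert-bundle: "/etc/pki/tls/certs/ca-bundle.crt"
forward-zone:
name: "."
forward-addr: 1.1.1.1@853#cloudflare-dns.com # DNS over TLS
forward-addr: 1.0.0.1@853#cloudflare-dns.com
forward-tls-upstream: yes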
# Full recursive (no forward-zone block)
# Unbound queries root servers directly.
# Get the root hints file:
curl -o /etc/unbound/root.hints https://www.internic.net/domain/named.root
# Reference it in unbound.conf:
server:
root-hints: "/etc/unbound/root.hints"
DNS over TLS upstream
# Verify DoT is working — Unbound logs the TLS handshake at verbosity: 2
# Check with:
dig @127.0.0.1 cloudflare.com +short
# If using systemd-resolved as stub on the local machine:
# Point it at Unbound instead of the default
# /etc/systemd/resolved.conf:
[Resolve]
DNS=127.0.0.1
DNSStubListener=no
# Then symlink resolv.conf:
ln -sf /run/systemd/resolve/resolv.conf /etc/resolv.conf
4. Authoritative DNS with BIND or NSD
A recursive resolver answers questions by asking other servers. An authoritative
server answers questions from its own data — it is the source of truth for a zone.
For your internal domain (infra.local, cluster.home, kldload.internal),
you need an authoritative server so that nodes can resolve each other by name.
BIND9
The most widely deployed DNS server. Does everything: recursion, authoritative serving, dynamic updates (DDNS), DNSSEC signing, views (split-horizon), zone transfers, and catalog zones. The right choice for internal infrastructure where you need DHCP integration, dynamic registration, and split-horizon views.
NSD
Authoritative-only. NSD does not do recursion — it only serves zones from disk. This makes it simpler, faster for pure authoritative serving, and harder to misconfigure as an open resolver. The right choice when you want a dedicated authoritative server and use Unbound separately for recursion.
Install BIND9
# CentOS / RHEL / Rocky
dnf install -y bind bind-utils
# Debian / Ubuntu
apt install -y bind9 bind9-utils dnsutils
# Enable
systemctl enable --now named
Configure BIND9 as authoritative for infra.local
# /etc/named.conf (CentOS path — Debian uses /etc/bind/named.conf)
options {
directory "/var/named";
listen-on { 127.0.0.1; 10.100.10.1; }; # your DNS server's IPs
listen-on-v6 { none; };
# This server is authoritative only — no recursion for external queries
recursion no;
allow-query { 10.0.0.0/8; 172.16.0.0/12; 192.168.0.0/16; 127.0.0.0/8; };
# Allow dynamic updates from localhost (for nsupdate / DHCP)
allow-update { 127.0.0.1; };
# Disable zone transfers except to secondary
allow-transfer { none; };
# DNSSEC (section 8)
dnssec-validation auto;
};
# Forward zone — infra.local
zone "infra.local" IN {
type master;
file "/var/named/infra.local.zone";
allow-update { 127.0.0.1; 10.100.10.0/24; };
};
# Reverse zone — 10.100.10.x
zone "10.100.10.in-addr.arpa" IN {
type master;
file "/var/named/10.100.10.rev";
allow-update { 127.0.0.1; 10.100.10.0/24; };
};
Zone file: infra.local with all kldload nodes
# /var/named/infra.local.zone
$TTL 300
@ IN SOA ns1.infra.local. admin.infra.local. (
2026040101 ; serial (YYYYMMDDnn — increment on every change)
3600 ; refresh
900 ; retry
604800 ; expire
300 ) ; minimum TTL
; Nameservers
@ IN NS ns1.infra.local.
; DNS server itself
ns1 IN A 10.100.10.1
; kldload nodes
node1 IN A 10.100.10.10
node2 IN A 10.100.10.11
node3 IN A 10.100.10.12
node4 IN A 10.100.10.13
node5 IN A 10.100.10.14
; Service aliases
monitor IN A 10.100.10.10 ; Grafana/Prometheus lives on node1
api IN A 10.100.10.50 ; load balancer VIP
db IN A 10.100.10.20 ; database primary
db-replica IN A 10.100.10.21 ; database replica
; WireGuard interface names (optional)
node1-wg IN A 10.200.0.1
node2-wg IN A 10.200.0.2
node3-wg IN A 10.200.0.3
# /var/named/10.100.10.rev (reverse zone)
$TTL 300
@ IN SOA ns1.infra.local. admin.infra.local. (
2026040101 3600 900 604800 300 )
@ IN NS ns1.infra.local.
; PTR records — IP last-octet → hostname
1 IN PTR ns1.infra.local.
10 IN PTR node1.infra.local.
11 IN PTR node2.infra.local.
12 IN PTR node3.infra.local.
20 IN PTR db.infra.local.
50 IN PTR api.infra.local.
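# Check syntax before reloading
named-checkconf /etc/named.conf
named-checkzone infra.local /var/named/infra.local.zone
# Reload zones after editing
# (for a zone that accepts dynamic updates, run "rndc freeze infra.local" before
# editing the file by hand and "rndc thaw infra.local" afterwards)
rndc reload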
# Test
dig @10.100.10.1 node1.infra.local +short
dig @10.100.10.1 -x 10.100.10.10 +short
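The YYYYMMDDnn serial convention is easy to get wrong by hand, and secondaries ignore a zone whose serial did not increase. The sketch below computes the next serial; `next_serial` is a hypothetical helper, and it assumes fewer than 100 changes per day.

```shell
#!/usr/bin/env bash
# next_serial CURRENT [TODAY]
# Prints the next YYYYMMDDnn serial. TODAY defaults to the current date.
next_serial() {
  local current="$1" today="${2:-$(date +%Y%m%d)}"
  local candidate="${today}01"
  if (( current >= candidate )); then
    echo $(( current + 1 ))      # same day (or a serial already ahead): just increment
  else
    echo "$candidate"            # first change of a new day
  fi
}

next_serial 2026040101 20260401
next_serial 2026040101 20260402
```

The serial never moves backwards, even if the existing value is already ahead of today's date.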
For internal zones (infra.local, cluster.home), BIND9 is the right choice because you will want dynamic updates from DHCP and split-horizon views. You can also run Unbound for recursion and BIND for authoritative serving side by side: configure Unbound with a stub-zone that sends queries for infra.local to BIND, so BIND handles everything authoritative while Unbound handles everything else. The two work well together.
5. Split-Horizon DNS
Split-horizon (also called split-brain DNS) means the same domain name resolves differently depending on who asks. External clients get a public IP. Internal clients get a private IP. One domain, two answers, controlled by the DNS server based on the source address of the query.
This is how many production deployments work. Your API load balancer has a public IP for the internet and a private IP for internal services. Without split-horizon, internal services hairpin through the public IP: their packets leave the server, hit your firewall, and come back in. With split-horizon, internal traffic stays on the LAN.
BIND9 views configuration
# /etc/named.conf — split-horizon with views
acl "internal" {
10.0.0.0/8;
172.16.0.0/12;
192.168.0.0/16;
127.0.0.0/8;
};
acl "external" {
any;
};
# INTERNAL VIEW — seen by LAN clients
view "internal" {
match-clients { internal; };
recursion yes; # allow recursion for internal clients
zone "example.com" IN {
type master;
file "/var/named/example.com.internal.zone";
};
# Forward everything else to Unbound for recursion
zone "." IN {
type forward;
forwarders { 10.100.10.1 port 5300; }; # Unbound on alt port
};
};
# EXTERNAL VIEW — seen by everyone else
view "external" {
match-clients { external; };
recursion no; # no recursion for external clients
zone "example.com" IN {
type master;
file "/var/named/example.com.external.zone";
};
};
Zone files with different answers
# /var/named/example.com.internal.zone
$TTL 300
@ IN SOA ns1.example.com. admin.example.com. (2026040101 3600 900 604800 300)
@ IN NS ns1.example.com.
; Internal clients get private IPs — traffic stays on LAN
api IN A 10.100.10.50 ; private load balancer VIP
www IN A 10.100.10.51 ; private web server
db IN A 10.100.10.20 ; internal DB (never exposed externally)
# /var/named/example.com.external.zone
$TTL 300
@ IN SOA ns1.example.com. admin.example.com. (2026040101 3600 900 604800 300)
@ IN NS ns1.example.com.
; External clients get the public IP
api IN A 203.0.113.10 ; public load balancer
www IN A 203.0.113.10 ; same public IP, different vhost
; db has no external record — it simply does not exist outside
# Verify split-horizon is working:
# From inside the network (should get 10.100.10.50)
dig @10.100.10.1 api.example.com +short
# From outside (or simulate with a public resolver)
dig @8.8.8.8 api.example.com +short
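The view selection itself is a source-address match against the ACLs. The sketch below mimics that decision for the four internal ranges used above; `ip_to_int`, `in_cidr`, and `view_for` are hypothetical helpers, and the code assumes IPv4 addresses without leading zeros.

```shell
#!/usr/bin/env bash
# Convert a dotted quad to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS='.' read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# in_cidr IP NETWORK/PREFIX : succeeds when IP falls inside the network.
in_cidr() {
  local ip net prefix mask
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  prefix="${2#*/}"
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  (( (ip & mask) == (net & mask) ))
}

# Report which view a client address would land in.
view_for() {
  local cidr
  for cidr in 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 127.0.0.0/8; do
    if in_cidr "$1" "$cidr"; then echo internal; return; fi
  done
  echo external
}

view_for 10.100.10.77
view_for 198.51.100.7
```

BIND evaluates views in the order they appear and uses the first whose match-clients matches, which is why the catch-all external view must come last.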
Without split-horizon, internal clients that resolve api.example.com send traffic to the public IP, the firewall translates it back to the private IP (NAT hairpin), and it arrives at the server, adding a round trip through the firewall for every internal call. With split-horizon, the resolver returns the private IP directly and the traffic never leaves the LAN. The performance improvement is real. The security improvement is also real: internal services that should never be reachable externally simply have no external DNS record. They are not merely firewalled out; they do not exist in the external view. An attacker scanning your public IP range gets no hostname resolution and no indication those services exist.
6. Service Discovery with DNS
DNS-based service discovery predates Consul, Kubernetes, and service meshes. SRV records encode both the host and the port for a service. A client queries _http._tcp.api.infra.local and gets back a target hostname and a port, then resolves the target to an IP: no configuration files, no hardcoded ports, no service registry daemon required.
SRV records for service discovery
# In /var/named/infra.local.zone, add SRV records:
; _service._proto.name TTL IN SRV priority weight port target
_http._tcp.api IN SRV 0 10 8080 api01.infra.local.
_http._tcp.api IN SRV 0 10 8080 api02.infra.local. ; second instance
_grpc._tcp.api IN SRV 0 10 9090 api01.infra.local.
_https._tcp.grafana IN SRV 0 10 3000 monitor.infra.local.
# Query SRV records
dig _http._tcp.api.infra.local SRV
# Output:
# _http._tcp.api.infra.local. 300 IN SRV 0 10 8080 api01.infra.local.
# _http._tcp.api.infra.local. 300 IN SRV 0 10 8080 api02.infra.local.
# The client reads the SRV records, picks one (weight-based), resolves the
# target A record, and connects to host:port. No service registry needed.
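The client-side step can be sketched as a small filter over the SRV answer lines. This is a simplification: it orders targets by priority (lowest first) and then by weight (highest first), whereas a full client picks randomly in proportion to weight within a priority. `srv_targets` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Read SRV answer lines (name TTL IN SRV priority weight port target) on stdin
# and print host:port candidates, best first.
srv_targets() {
  awk '$4 == "SRV" { sub(/\.$/, "", $8); print $5, $6, $8 ":" $7 }' |
    sort -k1,1n -k2,2nr |
    awk '{ print $3 }'
}

# Illustrative records with different priorities
printf '%s\n' \
  '_http._tcp.api.infra.local. 300 IN SRV 10 5 8080 api02.infra.local.' \
  '_http._tcp.api.infra.local. 300 IN SRV 0 10 8080 api01.infra.local.' |
  srv_targets
```

In practice the input would come from `dig +noall +answer _http._tcp.api.infra.local SRV`.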
Dynamic DNS updates with nsupdate
# Register a service on boot using nsupdate
# This sends a dynamic update to BIND9
nsupdate << EOF
server 10.100.10.1
zone infra.local
update add api03.infra.local 300 A 10.100.10.53
update add _http._tcp.api.infra.local 300 SRV 0 10 8080 api03.infra.local.
send
EOF
kldload firstboot integration
# /etc/kldload/firstboot.d/50-register-dns.sh
# Runs once on first boot after install. Registers this node in DNS.
HOSTNAME=$(hostname -s)
IP=$(ip -4 addr show eth0 | awk '/inet /{print $2}' | cut -d/ -f1)
DNS_SERVER="10.100.10.1"
ZONE="infra.local"
nsupdate << EOF
server ${DNS_SERVER}
zone ${ZONE}
update delete ${HOSTNAME}.${ZONE} A
update add ${HOSTNAME}.${ZONE} 300 A ${IP}
send
EOF
echo "Registered ${HOSTNAME}.${ZONE} → ${IP}"
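If the interface is not named eth0 or has no address yet, the IP lookup in the script above yields an empty string and the update would be malformed. A minimal sketch of a guard, assuming hypothetical helpers `is_ipv4` and `build_update` that validate the address and emit the nsupdate batch as text:

```shell
#!/usr/bin/env bash
# Succeeds only for a dotted quad with every octet in 0..255.
is_ipv4() {
  local ip="$1" octet
  [[ "$ip" =~ ^([0-9]{1,3}\.){3}[0-9]{1,3}$ ]] || return 1
  local IFS='.'
  for octet in $ip; do
    (( 10#$octet <= 255 )) || return 1
  done
}

# build_update HOST ZONE IP SERVER : print an nsupdate batch, or fail on a bad IP.
build_update() {
  local host="$1" zone="$2" ip="$3" server="$4"
  is_ipv4 "$ip" || { echo "refusing to register invalid address: '$ip'" >&2; return 1; }
  printf 'server %s\nzone %s\nupdate delete %s.%s A\nupdate add %s.%s 300 A %s\nsend\n' \
    "$server" "$zone" "$host" "$zone" "$host" "$zone" "$ip"
}

# node6 and 10.100.10.15 are illustrative values
build_update node6 infra.local 10.100.10.15 10.100.10.1
```

Piping the output into `nsupdate` sends the update; keeping the batch as text also makes it easy to log exactly what was registered.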
Consul DNS interface
# If using Consul for service discovery, Consul serves DNS on port 8600
# Configure Unbound to forward *.consul queries to Consul:
# /etc/unbound/unbound.conf
server:
do-not-query-localhost: no # Unbound refuses to query 127.0.0.1 by default
domain-insecure: "consul" # the consul zone is unsigned; skip DNSSEC validation
stub-zone:
name: "consul"
stub-addr: 127.0.0.1@8600
# Now "dig web.service.consul" resolves to all healthy web instances
# Consul returns only healthy instances; unhealthy ones are removed from DNS
dig web.service.consul
dig _http._tcp.web.service.consul SRV
7. CoreDNS and Kubernetes DNS
Most Kubernetes clusters run CoreDNS as the in-cluster DNS server. Every pod gets /etc/resolv.conf configured to point at the CoreDNS ClusterIP. Every service gets a DNS name automatically. Understanding how CoreDNS works lets you customize it: forward external queries through your Unbound resolver, integrate with your internal BIND server, and tune caching.
How pod DNS resolution works
# Inside a pod, /etc/resolv.conf looks like:
nameserver 10.96.0.10 # CoreDNS ClusterIP
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
# DNS names for services follow this pattern:
# <service>.<namespace>.svc.cluster.local
# Examples:
# Service "web" in namespace "production":
dig web.production.svc.cluster.local
# Same namespace: the short name works through the search domains.
# dig ignores the search list unless told to use it:
dig +search web
# Pod-to-pod (less common, usually use service DNS):
# <pod-ip-dashes>.<namespace>.pod.cluster.local
dig 10-100-1-5.production.pod.cluster.local
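The interaction between `ndots:5` and the search list explains why a short name produces several queries. The sketch below lists the candidate names in the order a glibc-style stub resolver would try them; `expand_search` is a hypothetical helper, and other resolver libraries (musl, for example) behave differently.

```shell
#!/usr/bin/env bash
# expand_search NAME NDOTS SEARCH_DOMAIN...
# Prints the fully qualified names tried, in order (glibc-style behavior).
expand_search() {
  local name="$1" ndots="$2"; shift 2
  if [[ "$name" == *. ]]; then     # trailing dot: already absolute, no search list
    echo "$name"; return
  fi
  local only_dots="${name//[^.]/}"
  local dots=${#only_dots} domain
  (( dots >= ndots )) && echo "${name}."   # enough dots: try the name as-is first
  for domain in "$@"; do echo "${name}.${domain}."; done
  (( dots < ndots )) && echo "${name}."    # too few dots: as-is is the last resort
  return 0
}

expand_search web 5 default.svc.cluster.local svc.cluster.local cluster.local
```

An external name such as google.com has fewer than five dots, so it is tried against every search domain before the real lookup. Appending a trailing dot (google.com.) skips the search list.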
Inspect CoreDNS configuration
# CoreDNS config lives in a ConfigMap
kubectl -n kube-system get configmap coredns -o yaml
# Default Corefile looks like:
# .:53 {
# errors
# health
# ready
# kubernetes cluster.local in-addr.arpa ip6.arpa {
# pods insecure
# fallthrough in-addr.arpa ip6.arpa
# }
# prometheus :9153
# forward . /etc/resolv.conf ← external queries go HERE
# cache 30
# loop
# reload
# loadbalance
# }
Forward external queries to your Unbound instance
# Edit the CoreDNS ConfigMap
kubectl -n kube-system edit configmap coredns
# Change the forward line to point at your Unbound:
# forward . 10.100.10.1 {
# prefer_udp
# }
# Or use the kubectl patch approach:
kubectl -n kube-system patch configmap coredns --type merge -p '
{
"data": {
"Corefile": ".:53 {\n errors\n health\n ready\n kubernetes cluster.local in-addr.arpa ip6.arpa {\n pods insecure\n fallthrough in-addr.arpa ip6.arpa\n }\n prometheus :9153\n forward . 10.100.10.1\n cache 30\n loop\n reload\n loadbalance\n}\n"
}
}'
# CoreDNS reloads automatically when the ConfigMap changes
Stub domains — forward *.infra.local to your BIND server
# Add a server block for the internal domain so pods can resolve it
# Edit the CoreDNS ConfigMap and add a new server block:
# .:53 {
# ... existing config ...
# }
#
# infra.local:53 {
# errors
# cache 30
# forward . 10.100.10.1 ← forward infra.local queries to BIND
# }
# After this, from any pod:
dig node1.infra.local # resolves via BIND
dig api.infra.local # resolves via BIND
dig google.com # resolves via Unbound → internet
By default, CoreDNS forwards external queries to whatever /etc/resolv.conf says on the node, often a cloud provider's resolver or the host's systemd-resolved. On a kldload Kubernetes cluster, configure CoreDNS to forward to your Unbound instance so all DNS goes through your resolver, cached and private. Your nodes already use Unbound. Your pods should too. The stub-domain trick is particularly powerful: pods can resolve your internal infra.local names and external internet names through a single DNS infrastructure, and your BIND server is the single source of truth for internal names regardless of whether the query came from a bare-metal service, a VM, or a Kubernetes pod.
8. DNSSEC: Signed DNS
DNSSEC adds cryptographic signatures to DNS records. A resolver that validates DNSSEC can prove that the answer it received is authentic — it came from the authoritative server and was not modified in transit. DNS spoofing, cache poisoning, and BGP hijacking attacks that redirect DNS cannot forge a valid DNSSEC signature.
Enable DNSSEC validation in Unbound
# Unbound validates DNSSEC by default when auto-trust-anchor-file is set
# /etc/unbound/unbound.conf:
server:
auto-trust-anchor-file: "/var/lib/unbound/root.key"
val-log-level: 2 # log DNSSEC validation failures
# Initialize the trust anchor (done automatically by unbound-anchor on install)
unbound-anchor -a /var/lib/unbound/root.key
# Test: query a DNSSEC-signed domain
dig @127.0.0.1 cloudflare.com +dnssec
# Look for "ad" flag in the flags line:
# ;; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
# "ad" = Authenticated Data — DNSSEC validation passed
# Test validation failure (should get SERVFAIL):
dig @127.0.0.1 dnssec-failed.org +dnssec
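Checking for the ad flag by eye is error-prone in scripts and health checks. A minimal sketch that reads dig output on stdin and succeeds only when the header flags include ad; `has_ad_flag` is a hypothetical helper name and the sample header is illustrative.

```shell
#!/usr/bin/env bash
# Succeeds (exit 0) when the ";; flags:" header line contains the "ad" flag.
has_ad_flag() {
  awk '/^;; flags:/ {
         sub(/^;; flags:/, ""); sub(/;.*/, "")
         for (i = 1; i <= NF; i++) if ($i == "ad") found = 1
       }
       END { exit(found ? 0 : 1) }'
}

# Illustrative header line, not live output
if printf ';; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1\n' | has_ad_flag; then
  echo "validated"
else
  echo "not validated"
fi
```

In use, `dig @127.0.0.1 cloudflare.com +dnssec | has_ad_flag` gives a clean exit status for monitoring.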
Sign your own zones with BIND9
# Step 1: Generate zone signing keys (ZSK and KSK)
cd /var/named
# Key Signing Key (KSK) — signs the DNSKEY records themselves
dnssec-keygen -a ECDSAP256SHA256 -f KSK infra.local
# Produces Kinfra.local.+013+XXXXX.key and .private
# Zone Signing Key (ZSK) — signs all other records
dnssec-keygen -a ECDSAP256SHA256 infra.local
# Step 2: Include keys in the zone file
# Add to /var/named/infra.local.zone:
$INCLUDE Kinfra.local.+013+YYYYY.key ; KSK public key
$INCLUDE Kinfra.local.+013+ZZZZZ.key ; ZSK public key
# Step 3: Sign the zone
dnssec-signzone -A -3 $(head -c 1000 /dev/random | sha1sum | cut -b 1-16) \
-N INCREMENT -o infra.local -t \
/var/named/infra.local.zone
# Produces infra.local.zone.signed
# Update named.conf to use the signed file:
zone "infra.local" IN {
type master;
file "/var/named/infra.local.zone.signed";
};
Inline signing (easier — BIND manages keys automatically)
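# Inline signing (available since BIND 9.9) keeps the unsigned zone file as-is
# named.conf:
zone "infra.local" IN {
type master;
file "/var/named/infra.local.zone";
inline-signing yes;
auto-dnssec maintain; # BIND signs and re-signs with the keys in key-directory
key-directory "/var/named/keys/";
};
# auto-dnssec does not create keys: generate them into the key directory first
mkdir -p /var/named/keys
dnssec-keygen -K /var/named/keys -a ECDSAP256SHA256 -f KSK infra.local
dnssec-keygen -K /var/named/keys -a ECDSAP256SHA256 infra.local
rndc loadkeys infra.local
rndc sign infra.local
# Note: auto-dnssec is deprecated in BIND 9.18 and removed in 9.20.
# On those versions use "dnssec-policy default;" instead, which also generates and rolls keys.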
Key rollover
# With auto-dnssec maintain, BIND applies rollovers according to the timing metadata in each key file.
# For KSK rollover (requires DS record update at parent):
# Generate new KSK
dnssec-keygen -a ECDSAP256SHA256 -f KSK infra.local
# Copy to key directory
cp Kinfra.local.+013+NEWKEY.key /var/named/keys/
cp Kinfra.local.+013+NEWKEY.private /var/named/keys/
# Tell BIND to use it (it will pre-publish, then activate after a delay)
rndc loadkeys infra.local
9. DNS for WireGuard Networks
WireGuard gives you encrypted IP connectivity between nodes. It does not give you name resolution. After building a WireGuard mesh, every operator immediately hits the same problem: "I have to remember IP addresses." DNS fixes this. The solution is straightforward: run a DNS server on one node in the mesh, configure all peers to use it, and give every node a hostname.
The problem
# After WireGuard setup, you have IPs like:
# node1: 10.200.0.1
# node2: 10.200.0.2
# db: 10.200.0.10
# You SSH to: ssh todd@10.200.0.10 (who is this?)
# You want: ssh todd@db.wg (obvious)
Solution 1: dnsmasq on the WireGuard gateway
# Install dnsmasq on the WG hub node
dnf install -y dnsmasq # CentOS/Rocky/RHEL
apt install -y dnsmasq # Debian/Ubuntu
# /etc/dnsmasq.conf
interface=wg0 # listen on WG interface
bind-interfaces
no-dhcp-interface=wg0 # no DHCP, DNS only
domain=wg # short-name domain: "node1.wg"
local=/wg/ # serve wg zone locally
# Static assignments
address=/node1.wg/10.200.0.1
address=/node2.wg/10.200.0.2
address=/node3.wg/10.200.0.3
address=/db.wg/10.200.0.10
address=/monitor.wg/10.200.0.20
# Reverse DNS
ptr-record=1.0.200.10.in-addr.arpa,node1.wg
ptr-record=10.0.200.10.in-addr.arpa,db.wg
systemctl enable --now dnsmasq
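Writing the address and ptr-record lines by hand invites mismatched forward and reverse entries. The sketch below generates both from a "name ip" list on stdin; `gen_dnsmasq_records` is a hypothetical helper, and the .wg suffix follows the config above.

```shell
#!/usr/bin/env bash
# Read "name ip" pairs on stdin; print matching address= and ptr-record= lines.
gen_dnsmasq_records() {
  local name ip a b c d
  while read -r name ip; do
    [ -z "$name" ] && continue               # skip blank lines
    IFS='.' read -r a b c d <<< "$ip"
    echo "address=/${name}.wg/${ip}"
    echo "ptr-record=${d}.${c}.${b}.${a}.in-addr.arpa,${name}.wg"
  done
}

printf '%s\n' 'node1 10.200.0.1' 'db 10.200.0.10' | gen_dnsmasq_records
```

The output can be redirected to a file under /etc/dnsmasq.d/ so the host list stays the single source for both directions.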
Configure all WireGuard peers to use the DNS server
# In each peer's /etc/wireguard/wg0.conf:
[Interface]
Address = 10.200.0.2/24
PrivateKey = ...
DNS = 10.200.0.1 # hub node runs dnsmasq
# this sets DNS for the wg0 interface when it comes up
# The DNS line in wg-quick configs sets resolv.conf when the tunnel comes up.
# It is removed when the tunnel goes down.
Solution 2: Unbound with WireGuard-specific zones
# On your existing Unbound resolver, add a local-zone for the WG subnet:
# /etc/unbound/unbound.conf
local-zone: "wg." static
local-data: "node1.wg. A 10.200.0.1"
local-data: "node2.wg. A 10.200.0.2"
local-data: "node3.wg. A 10.200.0.3"
local-data: "db.wg. A 10.200.0.10"
# Reverse DNS
local-zone: "0.200.10.in-addr.arpa." static
local-data-ptr: "10.200.0.1 node1.wg"
local-data-ptr: "10.200.0.10 db.wg"
systemctl restart unbound
Dynamic registration on WireGuard interface up
# /etc/wireguard/wg0-up.sh — called by PostUp in wg0.conf
# Registers this node in BIND when the WG interface comes up
WG_IP=$(ip addr show wg0 | awk '/inet /{print $2}' | cut -d/ -f1)
HOSTNAME=$(hostname -s)
DNS_SERVER="10.100.10.1"
nsupdate << EOF
server ${DNS_SERVER}
zone infra.local
update delete ${HOSTNAME}-wg.infra.local A
update add ${HOSTNAME}-wg.infra.local 300 A ${WG_IP}
send
EOF
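# In /etc/wireguard/wg0.conf:
[Interface]
PostUp = /etc/wireguard/wg0-up.sh
# Remove the record on shutdown. Send the update to the same BIND server:
# "nsupdate -l" would target a named running on this node, which peers do not have.
PreDown = printf 'server 10.100.10.1\nzone infra.local\nupdate delete %s-wg.infra.local A\nsend\n' "$(hostname -s)" | nsupdate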
With DNS on the mesh, you connect to db.wg instead of 10.200.0.47. For a static fleet (nodes do not change often), dnsmasq with static address lines is the simplest possible solution: a line or two of config per node. For a dynamic fleet where nodes join and leave regularly, dynamic DNS updates via nsupdate on WireGuard PostUp are more appropriate. The WireGuard DNS= line in wg-quick configs is underused: it sets the resolver for the tunnel interface automatically when you bring the tunnel up. Use it.
10. Pi-hole and Blocklist DNS
DNS-level ad and tracker blocking works by resolving known ad/tracker domains to
0.0.0.0 or NXDOMAIN instead of their real IPs. The client tries to connect,
gets nothing, and the ad never loads. This works for every device and every
application on your network — phones, TVs, IoT devices, anything that uses DNS.
Install Pi-hole
# Pi-hole is a DNS sinkhole — it runs a modified dnsmasq with blocklists
# Install on a dedicated node or alongside other services
# Quick install (requires curl)
curl -sSL https://install.pi-hole.net | bash
# Or use the containerized version:
podman run -d --name pihole \
-p 53:53/udp -p 53:53/tcp \
-p 8080:80 \
-e TZ="America/Toronto" \
-e WEBPASSWORD="changeme" \
-v pihole_data:/etc/pihole \
-v dnsmasq_data:/etc/dnsmasq.d \
--restart=unless-stopped \
pihole/pihole:latest
Point all kldload nodes at Pi-hole
# Via NetworkManager (persistent across reboots)
nmcli con mod "Wired connection 1" ipv4.dns "10.100.10.5"
nmcli con mod "Wired connection 1" ipv4.ignore-auto-dns yes
nmcli con up "Wired connection 1"
# Verify
cat /etc/resolv.conf
# nameserver 10.100.10.5
# Or set fleet-wide via DHCP server (dnsmasq/ISC-DHCP)
# dhcp-option=6,10.100.10.5 # option 6 = DNS server
Unbound with blocklists (no Pi-hole required)
# Download a blocklist in Unbound format
# (rpz-zone or local-zone: entries)
curl -o /etc/unbound/blocklist.conf \
https://raw.githubusercontent.com/nicehash/unbound-blocklist/main/blocklist.conf
# /etc/unbound/unbound.conf:
include: "/etc/unbound/blocklist.conf"
# Blocklist entries look like:
# local-zone: "ads.example.com" always_nxdomain
# local-zone: "tracker.example.net" always_nxdomain
# Auto-update the blocklist weekly:
cat > /etc/cron.weekly/update-unbound-blocklist << 'EOF'
#!/bin/bash
curl -s -o /etc/unbound/blocklist.conf \
https://raw.githubusercontent.com/nicehash/unbound-blocklist/main/blocklist.conf
systemctl reload unbound
EOF
chmod +x /etc/cron.weekly/update-unbound-blocklist
11. DNS Debugging
dig is the essential DNS debugging tool. Learn it. The alternatives (nslookup, host) present a simplified view that hides information you need. dig shows you the full DNS response, flags, TTL, and answer section exactly as returned by the server.
Essential dig commands
# Basic lookup
dig example.com
# Short answer only
dig example.com +short
# Query a specific server
dig @10.100.10.1 node1.infra.local
# Query for a specific record type
dig example.com MX
dig example.com TXT
dig example.com AAAA
dig example.com NS
dig example.com SOA
# Reverse DNS lookup
dig -x 10.100.10.10
# Show query time and server used
dig example.com +stats
# Disable recursion (ask the server what it knows directly — useful for auth servers)
dig @ns1.example.com example.com +norec
dig +trace — the most powerful debugging command
# +trace follows the entire resolution chain from root servers down
# If any step fails, you see exactly where
dig api.example.com +trace
# Example output:
# . 518359 IN NS a.root-servers.net.
# a.root-servers.net. 1234 IN A 198.41.0.4
#
# com. 172800 IN NS a.gtld-servers.net.
# [Received 1174 bytes from 198.41.0.4 in 12 ms]
#
# example.com. 172800 IN NS ns1.example.com.
# [Received 512 bytes from 192.5.6.30 in 8 ms]
#
# api.example.com. 300 IN A 203.0.113.50
# [Received 68 bytes from 205.251.196.1 in 2 ms]
#
# Three hops: root → .com TLD → authoritative. Total: 22ms.
# If any step returned SERVFAIL or no response, you see exactly which hop failed.
DNSSEC debugging
# Check DNSSEC validation
dig cloudflare.com +dnssec
# Look for "ad" flag — Authenticated Data
# Use drill for DNSSEC chain verification (install ldns-utils)
drill -D -k /var/lib/unbound/root.key cloudflare.com
# Test a known-bad DNSSEC domain
dig @127.0.0.1 dnssec-failed.org
# Should return SERVFAIL because signatures are intentionally broken
tcpdump for DNS
# Capture all DNS traffic (port 53)
tcpdump -n port 53
# More readable — decode DNS queries and responses
tcpdump -n -i eth0 'udp port 53' -v
# Capture DNS to a file for analysis
tcpdump -n -i any port 53 -w /tmp/dns.pcap
# Count DNS packets per second (useful during debug)
# -c stops after 2000 packets so the pipeline can finish; -l line-buffers the output
tcpdump -n -l -c 2000 -i any port 53 2>/dev/null | \
awk '{print $1}' | cut -d. -f1 | uniq -c | sort -rn | head -20
Common DNS error codes
| Status | Meaning | Likely cause |
|---|---|---|
| NOERROR | Query succeeded | Normal. Check the ANSWER section for the actual records. |
| NXDOMAIN | Name does not exist | Typo in the name, missing DNS record, wrong zone, split-horizon not configured. |
| SERVFAIL | Server failed to resolve | Upstream resolver unreachable, DNSSEC validation failure, broken delegation, expired zone. |
| REFUSED | Server refused the query | ACL blocked the source IP. Check access-control in Unbound or allow-query in BIND. |
| NODATA (NOERROR with an empty ANSWER section) | Name exists but no records of that type | Querying for AAAA on a v4-only host, or MX for a domain with no mail config. |
| Timeout (no response, so no RCODE) | No response | Firewall blocking UDP/53, DNS server down, wrong IP, network unreachable. |
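The table can be applied mechanically to dig output. The sketch below reads a dig response on stdin and prints one of the labels above; `classify_reply` is a hypothetical helper, and it treats a missing header as "no reply" because dig prints no status line when nothing comes back.

```shell
#!/usr/bin/env bash
# Classify dig output read from stdin using the header status and answer count.
classify_reply() {
  local out status
  out="$(cat)"
  status="$(sed -n 's/.*status: \([A-Z]*\),.*/\1/p' <<< "$out" | head -1)"
  if [ -z "$status" ]; then
    echo "NO-REPLY"                           # timeout or unreachable server
  elif [ "$status" = "NOERROR" ] && grep -q 'ANSWER: 0,' <<< "$out"; then
    echo "NODATA"                             # name exists, no records of this type
  else
    echo "$status"
  fi
}

# Illustrative header lines, not live output
sample=';; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4242
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1'
classify_reply <<< "$sample"
```

Typical use would be `dig node1.infra.local AAAA | classify_reply`.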
dig +trace is the most powerful DNS debugging command. It shows the full resolution chain from root servers to authoritative, including every referral and every response time. "My DNS does not work": run dig +trace api.example.com and the answer is in the output. If the root hop succeeds but the TLD hop fails, the TLD servers are unreachable from your network. If TLD succeeds but the authoritative hop fails, the NS records are wrong or your authoritative server is down. If the authoritative hop succeeds but returns the wrong IP, you have a split-horizon misconfiguration or a stale record. The debug loop is: dig +trace → find the failing hop → fix that hop → repeat. It takes two minutes once you know the tool.
12. Production DNS Architecture for a kldload Fleet
Putting it all together. A kldload fleet needs: fast recursive resolution with caching, authoritative service for internal zones, split-horizon for external domains, DNSSEC validation, and fallback for resolver failure. Here is the complete architecture and the concrete configs to build it.
Architecture overview
Unbound configuration for the fleet resolver
# /etc/unbound/unbound.conf on node1 and node2 (identical config)
server:
interface: 0.0.0.0
port: 53
access-control: 10.0.0.0/8 allow
access-control: 172.16.0.0/12 allow
access-control: 192.168.0.0/16 allow
access-control: 127.0.0.0/8 allow
access-control: 0.0.0.0/0 refuse
# Cache
cache-min-ttl: 60
cache-max-ttl: 86400
msg-cache-size: 128m
rrset-cache-size: 256m
prefetch: yes
prefetch-key: yes
# Privacy and security
hide-identity: yes
hide-version: yes
use-caps-for-id: yes
harden-glue: yes
harden-dnssec-stripped: yes
# DNSSEC validation
auto-trust-anchor-file: "/var/lib/unbound/root.key"
val-log-level: 2
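# Local zone overrides (blocklist entries go here if not using Pi-hole)
# local-zone: "ads.example.com" always_nxdomain
# Internal zones are unsigned, so exempt them from DNSSEC validation
domain-insecure: "infra.local"
domain-insecure: "consul"
# Unbound answers RFC 1918 reverse zones itself by default; let them reach BIND9
local-zone: "10.in-addr.arpa." nodefault
# Allow the Consul stub on 127.0.0.1 (Unbound does not query localhost by default)
do-not-query-localhost: no
# Delegate internal domains to BIND9 (authoritative)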
stub-zone:
name: "infra.local"
stub-addr: 10.100.10.1@53
stub-zone:
name: "10.100.10.in-addr.arpa"
stub-addr: 10.100.10.1@53
stub-zone:
name: "0.200.10.in-addr.arpa"
stub-addr: 10.100.10.1@53
# Forward Consul service discovery queries
stub-zone:
name: "consul"
stub-addr: 127.0.0.1@8600
# Forward everything else to upstream via DoT
forward-zone:
name: "."
forward-addr: 1.1.1.1@853#cloudflare-dns.com
forward-addr: 1.0.0.1@853#cloudflare-dns.com
forward-tls-upstream: yes
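One way to confirm that validation is active on a resolver configured like this is to look for the `ad` (authenticated data) flag, which a validating resolver sets on answers it has verified. The sketch below assumes dig's usual `;; flags:` line format; the function name `has_ad_flag` is hypothetical.

```shell
#!/bin/sh
# has_ad_flag: read dig output on stdin; succeed if the response carries
# the "ad" flag. Hypothetical helper for illustration only.
has_ad_flag() {
    # Flags line, e.g.: ";; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, ..."
    flags=$(sed -n 's/^;; flags: \([a-z ]*\);.*/\1/p' | head -n 1)
    case " $flags " in
        *" ad "*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

For example, `dig @127.0.0.1 isc.org | has_ad_flag && echo validated` should print `validated` for a signed domain when validation is working. Unsigned names, including private internal zones, never carry the flag, so its absence there is expected.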
BIND9 named.conf for internal zones
# /etc/named.conf on the authoritative node
acl "internal" {
10.0.0.0/8; 172.16.0.0/12; 192.168.0.0/16; 127.0.0.0/8;
};
options {
directory "/var/named";
listen-on { 10.100.10.1; 127.0.0.1; };
recursion no;
allow-query { internal; };
allow-transfer { none; };
dnssec-validation auto;
};
# BIND requires every zone to sit inside a view once any view is defined,
# so the internal-only zones live in the "internal" view alongside the
# internal side of the split-horizon domain.
view "internal" {
match-clients { internal; };
zone "infra.local" IN {
type master;
file "/var/named/infra.local.zone";
allow-update { 127.0.0.1; 10.100.10.0/24; };
};
zone "10.100.10.in-addr.arpa" IN {
type master;
file "/var/named/10.100.10.rev";
allow-update { 127.0.0.1; 10.100.10.0/24; };
};
zone "0.200.10.in-addr.arpa" IN {
type master;
file "/var/named/10.200.0.rev"; # WireGuard reverse zone
};
# Split-horizon: internal clients get the private view of example.com
zone "example.com" IN {
type master;
file "/var/named/example.com.internal.zone";
};
};
view "external" {
match-clients { any; };
recursion no;
allow-query { any; }; # override the global internal-only ACL for public clients
zone "example.com" IN {
type master;
file "/var/named/example.com.external.zone";
};
};
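The reverse zone names used in these configs come from reversing the octets of the network prefix, which is easy to get wrong by hand (10.100.10.0/24 happens to read the same in both directions, while 10.200.0.0/24 does not). A small sketch of the rule for /24 networks follows; the function names are hypothetical.

```shell
#!/bin/sh
# reverse_zone: print the in-addr.arpa zone for the /24 containing an IPv4 address.
# The body runs in a subshell so the IFS change does not leak to the caller.
reverse_zone() (
    IFS=.
    set -- $1            # split "a.b.c.d" into four positional parameters
    echo "$3.$2.$1.in-addr.arpa"
)

# ptr_name: print the full PTR owner name for an IPv4 address.
ptr_name() (
    IFS=.
    set -- $1
    echo "$4.$3.$2.$1.in-addr.arpa"
)

# reverse_zone 10.200.0.5   -> 0.200.10.in-addr.arpa
# reverse_zone 10.100.10.1  -> 10.100.10.in-addr.arpa
# ptr_name 10.200.0.5       -> 5.0.200.10.in-addr.arpa
```

The zone name goes in the `zone` statement and the Unbound `stub-zone`; the PTR name is what `dig -x` queries for a given address.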
Point all nodes at the resolvers via NetworkManager
# Run this in a kldload postinstaller or firstboot script on every node
# Primary resolver: node1, secondary: node2 (failover)
CON="$(nmcli -t -f NAME con show --active | head -1)"
nmcli con mod "$CON" \
ipv4.dns "10.100.10.10 10.100.10.11" \
ipv4.ignore-auto-dns yes
nmcli con up "$CON"
# Verify
resolvectl status
# DNS Servers: 10.100.10.10
# 10.100.10.11
High availability: keepalived for the resolver VIP
# Run keepalived on node1 and node2 to provide a single VIP for DNS
# All nodes point at 10.100.10.100 (the VIP)
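# /etc/keepalived/keepalived.conf on node1 (MASTER):
# Define the check before the instance that tracks it.
vrrp_script chk_unbound {
script "dig @127.0.0.1 cloudflare.com +short > /dev/null 2>&1"
interval 5
weight -100
}
vrrp_instance DNS_VIP {
state MASTER
interface eth0
virtual_router_id 53
# A failed check drops this to 200 - 100 = 100, so node2 (state BACKUP)
# needs a priority between 100 and 200, for example 150.
priority 200
advert_int 1
virtual_ipaddress {
10.100.10.100/24
}
track_script {
chk_unbound
}
}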
# If node1's Unbound fails, the VIP moves to node2 automatically.
# All nodes still resolve DNS — they just hit node2 instead.
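The inline check above relies on dig's exit status, which is 0 whenever any reply arrives, including SERVFAIL. A stricter check can require at least one real answer line. The following is a minimal sketch under that assumption; the function name `unbound_healthy`, the default probe name, and the script path mentioned afterwards are hypothetical.

```shell
#!/bin/sh
# unbound_healthy [NAME]: succeed only if the local resolver returns an answer.
# Hypothetical health check for illustration only.
unbound_healthy() {
    probe=${1:-cloudflare.com}
    # dig exits 0 on any reply (even SERVFAIL), so test for answer data instead.
    # Lines starting with ';' are dig diagnostics such as timeout messages.
    answer=$(dig @127.0.0.1 "$probe" +short +time=2 +tries=1 2>/dev/null | grep -v '^;')
    [ -n "$answer" ]
}
```

To use it with keepalived, one option is to save this as an executable script such as /usr/local/bin/chk_unbound.sh with a final line that calls `unbound_healthy`, and point the `script` setting in `vrrp_script` at that path. Probing an external name also tests the upstream path; probing a name in infra.local instead would test only Unbound and BIND9.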
The complete picture: every kldload node sends DNS queries to the VIP 10.100.10.100, which is owned by node1 (or by node2 after a failover). Unbound checks its cache first, and most queries return from cache in under 1ms. Cache misses for internal names (infra.local) go to BIND9 on the authoritative node at 10.100.10.1. Cache misses for external names go to Cloudflare over DNS-over-TLS, so they are encrypted in transit and cached for future queries. DNSSEC validation runs on every external response. Split-horizon means internal calls to api.example.com get the private LAN IP, not the public IP. Kubernetes pods forward through CoreDNS to this same Unbound, so cluster DNS and node DNS share the same cache and the same private resolver. WireGuard peers get DNS names via stub zones in Unbound that delegate to BIND. The whole fleet (bare-metal nodes, VMs, Kubernetes pods, WireGuard peers) resolves names through one consistent, cached, validated DNS infrastructure.
Related pages
- Networking tutorial — VLANs, bonding, BGP, and the network stack that DNS sits on
- WireGuard Masterclass — the mesh that needs DNS names
- WireGuard Mesh & Multi-Site — multi-site WireGuard with per-site DNS
- Kubernetes on KVM — the cluster where CoreDNS runs
- Cilium Masterclass — L7 DNS policy enforcement at the eBPF layer
- Monitoring Stack — Unbound metrics in Prometheus and Grafana
- Security — DNS security: DNSSEC, response policy zones, DNS-over-TLS