Vault & Secrets Masterclass

Every infrastructure has secrets. Database passwords. API keys. TLS private keys. WireGuard pre-shared keys. SSH host keys. Encryption passphrases. Service tokens. The question is not whether you have secrets — it is how you manage them. Most infrastructure answers that question the same way: plaintext config files, environment variables, YAML that everyone has the decryption key to, and a prayer that nobody leaks them.

Vault answers differently. It is a centralized secrets engine that issues short-lived credentials, rotates them automatically, encrypts data for applications, and audit-logs every access. On OpenZFS, Vault's storage gets snapshots before every upgrade, an encrypted dataset for the backend, and incremental replication to DR. The most sensitive data in your infrastructure, protected by the most reliable filesystem.

The secrets problem: You have a config file with a database password. That password has been the same for two years. Six people have read it. It is in git history. It is in three backups on S3. You rotated it once and broke production for forty minutes because you missed an instance. You have no idea who accessed the database with it last Tuesday at 3 AM. This is normal. This is the state of most production infrastructure. Vault makes all of this different.

What Vault adds to kldload: kldload already has TLS/PKI via step-ca, encrypted transport via WireGuard, data at rest encryption via OpenZFS native encryption, and network access control via nftables. Vault adds the missing piece: dynamic secrets management, application credential lifecycle, and encryption as a service. A credential that is dynamically generated, scoped to one application, and automatically revoked after one hour cannot be leaked in the traditional sense. There is nothing to leak — it does not exist until it is needed, and it stops existing when it is no longer needed.

On OpenZFS: Vault's Raft storage backend lives on an encrypted OpenZFS dataset. That dataset is snapshotted by sanoid before every Vault upgrade. It is replicated to the DR site by syncoid over WireGuard. Vault is already encrypted by its own seal mechanism. OpenZFS adds a second layer of encryption, plus checksums that detect any silent data corruption, plus snapshots that let you roll back Vault's state if a migration goes wrong. This is defense in depth as a file system property.

Vault is the tool that makes "zero trust" real for application secrets. Without it, you distribute long-lived credentials in config files and hope nobody leaks them. With it, credentials are dynamic (generated on demand, expire automatically), access is audited, and rotation is automatic. HashiCorp built it. The open source version does everything you need. Vault Enterprise adds DR replication and Sentinel policies — useful, but not required. This guide covers open source Vault throughout.

2. Vault Fundamentals

Before you install anything, understand what Vault actually does and how it thinks about secrets. Vault is not a password manager and it is not a key-value store. It is a secrets engine with an API, an auth system, a policy system, and a leasing system. These four things working together are what make Vault different from everything else.

Secrets storage

Store arbitrary key-value pairs at paths inside Vault. KV v2 versions every write so you can roll back. Access is controlled by policies. Every read and write is audit-logged. This replaces config files, environment variables, and secrets in git.

vault kv put secret/db password=hunter2
vault kv get secret/db
vault kv rollback -version=3 secret/db

Dynamic secrets

Vault generates credentials on demand. A database credential is created when the application requests it, scoped to the role, and automatically revoked when the TTL expires. No rotation. No long-lived passwords. No shared credentials.

vault read database/creds/my-role
# → username: v-app-my-role-abc123
# → password: A1B2C3D4 (expires in 1h, then Vault drops the user)

Encryption as a service

The transit engine encrypts and decrypts data without the application ever seeing the key. Applications send plaintext, get back ciphertext. Keys live in Vault. Key rotation happens in Vault. Applications are never exposed to cryptographic material.

vault write transit/encrypt/my-key plaintext="$(printf '%s' 'hello' | base64)"
# → ciphertext: vault:v1:abc123...
vault write transit/decrypt/my-key ciphertext=vault:v1:abc123...
# → plaintext: aGVsbG8= (base64; decode it to recover "hello")
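The transit engine takes base64-encoded plaintext and returns base64 on decrypt, so the exact bytes matter: a shell here-string appends a newline, and that newline would come back in the decrypted value. A minimal sketch of newline-free helpers follows; the function names `encode_plaintext` and `decode_plaintext` are illustrative and not part of Vault.

```shell
# Hypothetical helpers (names are illustrative, not Vault commands).

# Base64-encode a string exactly as given, with no trailing newline added.
encode_plaintext() {
  printf '%s' "$1" | base64
}

# Decode the base64 value that transit/decrypt returns in its "plaintext" field.
decode_plaintext() {
  printf '%s' "$1" | base64 -d
}

# Usage against a running Vault (not executed here):
#   vault write transit/encrypt/my-key plaintext="$(encode_plaintext 'hello')"
#   decode_plaintext "$(vault write -field=plaintext transit/decrypt/my-key ciphertext=...)"
```

For comparison, encoding `hello` followed by a newline yields `aGVsbG8K` instead of `aGVsbG8=`, which is an easy way to spot the difference in stored ciphertext inputs.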

Identity and access management

Vault has auth methods (how you prove who you are) and policies (what you are allowed to do). AppRole for services, Kubernetes auth for pods, TLS cert auth for nodes. Every entity gets exactly the access it needs — nothing more.

# entity: my-webapp
# policy: read database/creds/webapp-role, read secret/webapp/*
# cannot read: secret/billing/*, secret/infra/*

Architecture: how the pieces fit

Vault runs as a single binary. It has a listener (HTTPS API), a storage backend (Raft, or Consul, or a few others), auth method plugins, and secrets engine plugins. The core is the barrier: an AES-GCM encrypted wall between the API and the storage. When Vault starts, the barrier is closed; this is the "sealed" state. No data can leave or enter. Unsealing reconstructs the unseal key from key shares (Shamir's Secret Sharing) and uses it to decrypt the key that protects the barrier.

Seal / unseal

When Vault starts, it is sealed. All data is encrypted and inaccessible. To unseal, a threshold of key shares must be provided (default: 3 of 5). Each share-holder can be a different person. Vault cannot be accessed until the threshold is met.

vault operator init -key-shares=5 -key-threshold=3
vault operator unseal <key-share-1>
vault operator unseal <key-share-2>
vault operator unseal <key-share-3>
# → Vault is now unsealed

Tokens

Every Vault operation uses a token for authentication. Tokens have TTLs, policies, and optional use limits. The root token is created at init time and should be revoked immediately after creating the first admin token. Tokens can create child tokens.

vault token create -policy=my-policy -ttl=1h
vault token lookup s.abc123
vault token revoke s.abc123

Leases

Every secret Vault issues has a lease: a TTL and a lease ID. Before the TTL expires, the leaseholder can renew it. When the TTL expires, the secret is revoked. Dynamic secrets (database credentials, AWS keys) are automatically cleaned up at lease expiry.

vault lease renew database/creds/my-role/abc123
vault lease revoke database/creds/my-role/abc123
vault lease revoke -prefix database/creds/my-role/   # revoke all leases under this role

Policies

HCL files that define what paths a token can access and with what capabilities (create, read, update, delete, list, sudo). Tokens inherit the policies of their creator, bounded by the creator's policies. The root policy can do everything.

# KV v2 mounted at secret/: policy paths include the data/ segment
path "secret/data/webapp/*" {
  capabilities = ["read"]
}
path "database/creds/webapp-role" {
  capabilities = ["read"]
}
# Implicit deny on everything else
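The roles later in this guide attach a policy named webapp-policy without showing its contents. A minimal sketch of what such a policy might contain, assuming KV v2 is mounted at secret/ and that the application only needs the paths used in the examples; the temporary file location is arbitrary and the path names are assumptions.

```shell
# Write a hypothetical webapp-policy to a temp file. The path names are
# assumptions based on the examples in this guide.
policy_file="$(mktemp)"
cat > "$policy_file" <<'EOF'
# Dynamic database credentials for the web application
path "database/creds/webapp-role" {
  capabilities = ["read"]
}

# Static secrets in KV v2: reads go through the data/ segment
path "secret/data/webapp/*" {
  capabilities = ["read"]
}

# Listing keys in KV v2 goes through the metadata/ segment
path "secret/metadata/webapp/*" {
  capabilities = ["list"]
}
EOF

echo "Policy written to $policy_file"
# Load it into a running Vault (not executed here):
#   vault policy write webapp-policy "$policy_file"
```

Keeping the policy in a file rather than inline makes it easy to review in version control, since the policy itself contains no secret values.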

Audit devices

Every Vault request and response is written to the audit log. Secret values are HMAC'd rather than logged in plaintext, so the log can be used to correlate and verify values without becoming a credential dump. If audit devices are enabled and none of them can record a request, Vault refuses the request: audit is not optional.

vault audit enable file file_path=/vault/audit/audit.log
vault audit list
jq .auth.display_name audit.log | sort | uniq -c | sort -rn
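The jq one-liner counts entries per display name. A slightly richer sketch is shown here, assuming the file audit device writes one JSON object per line with `type`, `auth.display_name`, `request.operation`, and `request.path` fields; the function name `audit_summary` is hypothetical. It counts only request entries so each call appears once rather than twice.

```shell
# Summarize who did what from a Vault file audit log (JSON lines).
# Requires jq. Only "request" entries are counted, so each API call
# is counted once instead of once per request and once per response.
audit_summary() {
  jq -r 'select(.type == "request")
         | "\(.auth.display_name // "-") \(.request.operation) \(.request.path)"' "$1" \
    | sort | uniq -c | sort -rn
}

# Usage (not executed here):
#   audit_summary /vault/audit/audit.log | head -n 20
```

Unauthenticated requests such as logins have no display name, so they show up under `-` in this sketch.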

Dev mode vs. production mode

Dev mode (vault server -dev) starts Vault in memory, pre-unsealed, with a root token. Everything is lost on restart. Use it to learn the API and test policies. Never run it on a production system. Production mode requires a real storage backend, TLS, and an explicit initialization and unseal process.

Vault's seal/unseal mechanism is its most important security feature. When Vault starts, it is "sealed" — encrypted and inaccessible. Unsealing requires a threshold of key shares (Shamir's Secret Sharing). This means no single person can access Vault alone. The unseal keys should be distributed across different people and locations. On OpenZFS, the Vault storage is on an encrypted dataset — defense in depth: Vault's own encryption plus ZFS native encryption plus WireGuard transport encryption. Three cryptographic layers on the most sensitive data in your infrastructure.

3. Install Vault on kldload

kldload runs CentOS Stream 9 on the live ISO and supports CentOS, Debian, Ubuntu, Rocky, RHEL, and Fedora as install targets. The installation procedure differs by distro. The ZFS dataset layout is the same everywhere.

ZFS dataset layout

Before installing Vault, create the datasets. The Vault data needs to be on an encrypted dataset. The audit log benefits from compression (audit logs are highly compressible JSON).

# Create encrypted dataset for all Vault data
# Use a passphrase stored in your HSM or in another Vault (yes, this is recursive — see auto-unseal)
zfs create \
  -o encryption=aes-256-gcm \
  -o keylocation=prompt \
  -o keyformat=passphrase \
  -o compression=zstd \
  rpool/vault

# Vault Raft storage backend
zfs create rpool/vault/data

# Audit logs — compressed separately, easier to manage retention
zfs create \
  -o compression=zstd-9 \
  -o recordsize=128k \
  rpool/vault/audit

# Set mountpoints
zfs set mountpoint=/vault rpool/vault
zfs set mountpoint=/vault/data rpool/vault/data
zfs set mountpoint=/vault/audit rpool/vault/audit

# Verify
zfs list -r rpool/vault

Install Vault — CentOS / RHEL / Rocky / Fedora

# Add HashiCorp repo
# (on Fedora, use https://rpm.releases.hashicorp.com/fedora/hashicorp.repo instead)
dnf install -y dnf-plugins-core
dnf config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo
dnf install -y vault

# Verify
vault version

Install Vault — Debian / Ubuntu

# Add HashiCorp repo
apt-get update && apt-get install -y gpg curl lsb-release
curl -fsSL https://apt.releases.hashicorp.com/gpg | gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" \
  | tee /etc/apt/sources.list.d/hashicorp.list
apt-get update && apt-get install -y vault

# Verify
vault version

Vault configuration file

Vault reads /etc/vault.d/vault.hcl. This configuration puts Vault on the WireGuard management interface only (see section 4), uses Raft on the ZFS dataset, enables the UI, and locks Vault's memory to prevent swap leakage.

cat > /etc/vault.d/vault.hcl <<'EOF'
# Vault listens on the WireGuard management plane only
# Replace 10.201.0.1 with this node's WireGuard management IP
listener "tcp" {
  address       = "10.201.0.1:8200"
  tls_cert_file = "/etc/vault.d/tls/vault.crt"
  tls_key_file  = "/etc/vault.d/tls/vault.key"
  tls_min_version = "tls13"
}

# Raft integrated storage on ZFS dataset
storage "raft" {
  path    = "/vault/data"
  node_id = "vault-node-1"
}

# API address — used by cluster members to communicate
api_addr = "https://10.201.0.1:8200"

# Cluster address — Raft peer communication
cluster_addr = "https://10.201.0.1:8201"

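# Lock memory so secrets cannot be paged out to swap
# (HashiCorp's documentation suggests disable_mlock = true with Raft storage,
# with swap disabled or encrypted instead; weigh that against this setting)
disable_mlock = false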

# Enable the UI (accessible over WireGuard only)
ui = true

# Log level
log_level = "info"
EOF

Systemd service with security hardening

The package installs a systemd unit, but it needs hardening. Override it:

mkdir -p /etc/systemd/system/vault.service.d
cat > /etc/systemd/system/vault.service.d/hardening.conf <<'EOF'
[Service]
# Prevent privilege escalation
NoNewPrivileges=true
# Read-only filesystem except for Vault's own data
ProtectSystem=strict
ReadWritePaths=/vault/data /vault/audit /etc/vault.d
# Hide /home, /root, /run/user
ProtectHome=true
# Restrict capabilities — Vault only needs IPC_LOCK to mlock memory
CapabilityBoundingSet=CAP_IPC_LOCK
AmbientCapabilities=CAP_IPC_LOCK
# Private /tmp
PrivateTmp=true
# Restrict system calls
SystemCallFilter=@system-service
SystemCallErrorNumber=EPERM
# Restrict address families
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
EOF

systemctl daemon-reload

TLS certificate for Vault

Vault requires TLS. Use step-ca (already running on kldload) to issue a certificate for Vault's WireGuard address:

mkdir -p /etc/vault.d/tls

# Issue a cert from step-ca for the Vault WireGuard address
# Replace ca.internal.example.com and 10.201.0.1 with your values
step ca certificate \
  --ca-url https://ca.internal.example.com \
  --root /etc/step/certs/root_ca.crt \
  --san vault.internal.example.com \
  --san 10.201.0.1 \
  vault.internal.example.com \
  /etc/vault.d/tls/vault.crt \
  /etc/vault.d/tls/vault.key

chown -R vault:vault /etc/vault.d/tls
chmod 640 /etc/vault.d/tls/vault.key

Initialize and unseal Vault

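# Vault runs as the vault user and must own its data and audit directories
chown vault:vault /vault/data /vault/audit
chmod 750 /vault/data /vault/audit

# Start Vault
systemctl enable --now vault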

# Set the Vault address for CLI use
export VAULT_ADDR="https://10.201.0.1:8200"
export VAULT_CACERT="/etc/step/certs/root_ca.crt"

# Initialize — 5 key shares, threshold of 3
vault operator init \
  -key-shares=5 \
  -key-threshold=3 \
  -format=json > /root/vault-init.json

# CRITICAL: copy the unseal keys and root token out of vault-init.json
# Store each key with a different person or in a different secure location
# Delete vault-init.json after distributing the keys
cat /root/vault-init.json | jq .unseal_keys_b64
cat /root/vault-init.json | jq -r .root_token

# Unseal with 3 of the 5 key shares
vault operator unseal <key-share-1>
vault operator unseal <key-share-2>
vault operator unseal <key-share-3>

# Login with root token
vault login <root-token>

# Enable audit logging immediately
vault audit enable file file_path=/vault/audit/audit.log

# Create an admin policy and token, then revoke the root token
vault policy write admin - <<'EOF'
path "*" {
  capabilities = ["create", "read", "update", "delete", "list", "sudo"]
}
EOF

vault token create \
  -policy=admin \
  -ttl=0 \
  -display-name=admin \
  -format=json | jq -r .auth.client_token

# Revoke root token (use the admin token going forward)
vault token revoke <root-token>

Vault's data on an encrypted OpenZFS dataset means: encrypted at rest (ZFS AES-256-GCM), encrypted in transit (WireGuard, which uses ChaCha20-Poly1305), kept out of swap (mlock prevents Vault's memory from being paged to disk), snapshotted on a schedule and before upgrades (sanoid), and replicated to DR (syncoid over WireGuard). Five layers of protection on the most sensitive data in your infrastructure. A failure in any single layer does not compromise the others.
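The snapshot step itself is not shown in this section. A minimal sketch of a manual pre-upgrade snapshot, assuming the rpool/vault dataset from the layout above and a timestamped naming convention chosen purely for illustration (it is not a sanoid rule). The zfs command runs only when APPLY=1 is set; otherwise the command is printed.

```shell
# Build a snapshot name such as rpool/vault@pre-upgrade-20250101T120000.
# The "pre-upgrade-" prefix is an illustrative convention.
snapshot_name() {
  printf '%s@pre-upgrade-%s' "$1" "$2"
}

# Take (or, by default, just print) a recursive snapshot of the Vault datasets.
snapshot_vault() {
  name="$(snapshot_name "${1:-rpool/vault}" "$(date +%Y%m%dT%H%M%S)")"
  if [ "${APPLY:-0}" = "1" ]; then
    zfs snapshot -r "$name"
  else
    echo "zfs snapshot -r $name"
  fi
}

snapshot_vault rpool/vault
```

A recursive snapshot covers rpool/vault/data and rpool/vault/audit in one step. Rolling back is a separate, destructive operation and is probably best done with Vault stopped.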

4. Vault on the WireGuard Backplane

Vault should never be reachable from the public internet. It belongs on the WireGuard management plane — the same encrypted overlay that carries SSH, Prometheus, and internal APIs. This is not a firewall rule. It is an architectural decision: Vault does not listen on a public interface at all.

Binding Vault to the WireGuard interface

The listener in vault.hcl already specifies 10.201.0.1:8200 — the WireGuard management plane address. Vault will not respond to connections on eth0 because it does not listen there. No firewall rule required to block external access to Vault because there is nothing to block.

nftables: allow Vault ports from authorized WireGuard addresses only

Even inside the WireGuard plane, not every peer needs to reach Vault. Lock it down with nftables on the wg1 (management plane) interface:

cat > /etc/nftables.d/vault.nft <<'EOF'
# Vault access policy on the management backplane (wg1)
# Only allow Vault API and cluster ports from authorized management addresses

table inet vault {

  set vault_clients {
    type ipv4_addr
    flags interval
    # Add the WireGuard IPs of machines that need Vault access
    elements = {
      10.201.0.0/24,    # management plane — all nodes can reach Vault API
    }
  }

  set vault_peers {
    type ipv4_addr
    flags interval
    # Vault cluster peers only (for Raft)
    elements = {
      10.201.0.2,       # vault-node-2
      10.201.0.3,       # vault-node-3
    }
  }

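  chain vault_input {
    type filter hook input priority 0; policy accept;
    # policy accept: a drop policy here would discard all other inbound
    # traffic on the host, because a drop verdict in any base chain is final
    # Vault API: allow the local CLI, which reaches this node's own IP via lo
    iifname "lo" tcp dport 8200 accept
    # Vault API: allow from management plane
    iifname "wg1" ip saddr @vault_clients tcp dport 8200 accept
    # Vault cluster / Raft: allow only from peer nodes
    iifname "wg1" ip saddr @vault_peers tcp dport 8201 accept
    # Drop everything else to Vault ports
    tcp dport {8200, 8201} drop
  }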
}
EOF

nft -f /etc/nftables.d/vault.nft

Client access over WireGuard

Any machine with a WireGuard management plane connection can reach Vault. Configure the Vault CLI or SDK with the WireGuard address:

# On any management plane node
export VAULT_ADDR="https://10.201.0.1:8200"
export VAULT_CACERT="/etc/step/certs/root_ca.crt"
vault status

HA Vault cluster over WireGuard mesh

For a three-node HA cluster, each node needs to reach the others over the management plane. The cluster_addr in vault.hcl points to the WireGuard management IP. Raft replication runs on TCP port 8201 over WireGuard, so it is encrypted twice (WireGuard + TLS).

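# On vault-node-2 and vault-node-3, join the cluster after vault-node-1 is initialized
export VAULT_ADDR="https://10.201.0.2:8200"
export VAULT_CACERT="/etc/step/certs/root_ca.crt"
# The joining node must trust the leader's step-ca certificate
vault operator raft join \
  -leader-ca-cert="$(cat /etc/step/certs/root_ca.crt)" \
  https://10.201.0.1:8200
# Then unseal the new node with the same key shares as vault-node-1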

# Check cluster membership
vault operator raft list-peers

Vault on the backplane means the secrets engine is invisible from the internet. No public endpoint, no attack surface. Applications reach Vault through WireGuard. Even if the LAN is compromised, Vault is unreachable without WireGuard keys. An attacker who owns a server's public IP gets nothing — Vault is not listening on that interface. An attacker who compromises a WireGuard peer still cannot access secrets they do not have a Vault token for. The network isolation and the Vault access control system are independent layers.

5. Auth Methods — How Applications Prove Identity

Before Vault gives a secret to anything, it needs to know who is asking. Auth methods are the plugins that answer that question. Different auth methods suit different contexts: humans at a terminal, CI/CD pipelines, Kubernetes pods, and long-running services all have different ways to prove identity.

Token auth (simplest — for humans and scripts)

A token is the fundamental unit of Vault authentication. Everything else produces a token. Token auth lets you use a token directly — hand-create tokens for humans or scripts where other auth methods are impractical.

# Create a token with a specific policy and TTL
vault token create \
  -policy=read-only \
  -ttl=8h \
  -display-name="ops-session-$(date +%Y%m%d)" \
  -format=json | jq -r .auth.client_token

# Use the token
VAULT_TOKEN=s.abc123 vault kv get secret/myapp/config

# Revoke a token (and all its children)
vault token revoke s.abc123

AppRole — for applications and CI/CD

AppRole is the standard auth method for machine-to-machine authentication. An application gets two identifiers: a role_id (not secret, baked into the image or configuration) and a secret_id (short-lived, injected at deploy time). Together they authenticate and produce a token.

# Enable AppRole auth
vault auth enable approle

# Create a role for a web application
# secret_id_num_uses=1 makes each secret_id single-use
vault write auth/approle/role/webapp \
  secret_id_ttl=10m \
  secret_id_num_uses=1 \
  token_num_uses=10 \
  token_ttl=30m \
  token_max_ttl=60m \
  policies="webapp-policy"

# Get the role_id (not secret — bake this into your app config)
vault read auth/approle/role/webapp/role-id
# → role_id: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

# Generate a secret_id (secret — inject this at deploy time, use once)
vault write -f auth/approle/role/webapp/secret-id
# → secret_id: yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy
# → secret_id_ttl: 10m

# Application authenticates:
vault write auth/approle/login \
  role_id=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx \
  secret_id=yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy
# → token: s.abc123 (30m TTL, webapp-policy)
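Applications that do not ship the Vault CLI can perform the same login over the HTTP API by POSTing the two identifiers to auth/approle/login. A minimal sketch that builds the request body; the helper name `approle_login_payload` and the ROLE_ID and SECRET_ID variables are hypothetical, and the curl call is left as a comment because it needs a reachable Vault.

```shell
# Build the JSON body for POST /v1/auth/approle/login.
# role_id and secret_id are UUIDs, so no JSON escaping is attempted here.
approle_login_payload() {
  printf '{"role_id":"%s","secret_id":"%s"}' "$1" "$2"
}

# Usage (not executed here; VAULT_ADDR and VAULT_CACERT as exported earlier):
#   curl --silent --fail --cacert "$VAULT_CACERT" \
#     --request POST \
#     --data "$(approle_login_payload "$ROLE_ID" "$SECRET_ID")" \
#     "$VAULT_ADDR/v1/auth/approle/login" | jq -r .auth.client_token
```

The token comes back in the `auth.client_token` field of the response, which is the same field the CLI examples in this guide extract with jq.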

Kubernetes auth — pods authenticate automatically

Kubernetes pods authenticate to Vault using their service account JWT. The pod does not need any pre-shared credential. The Vault agent or the application uses the mounted service account token to authenticate.

# Enable Kubernetes auth
vault auth enable kubernetes

# Configure — point Vault at the Kubernetes API
# (Run this from inside the cluster or with kube credentials)
vault write auth/kubernetes/config \
  kubernetes_host=https://10.201.10.1:6443 \
  kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
  issuer="https://kubernetes.default.svc.cluster.local"

# Create a role binding a Kubernetes service account to a Vault policy
vault write auth/kubernetes/role/webapp \
  bound_service_account_names=webapp \
  bound_service_account_namespaces=production \
  policies=webapp-policy \
  ttl=1h

# Pod authenticates (the Vault agent does this automatically):
JWT=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
vault write auth/kubernetes/login \
  role=webapp \
  jwt="${JWT}"

TLS certificate auth — mTLS via step-ca

Machines with certificates from your step-ca can authenticate to Vault using the certificate as identity. The certificate's CN or SAN becomes the principal. No shared secrets. Rotation is certificate renewal.

# Enable TLS cert auth
vault auth enable cert

# Register your step-ca root as a trusted CA
vault write auth/cert/certs/kldload-nodes \
  display_name="kldload nodes" \
  policies=node-policy \
  certificate=@/etc/step/certs/root_ca.crt \
  ttl=1h

# Node authenticates with its TLS certificate
vault login \
  -method=cert \
  -client-cert=/etc/vault.d/tls/vault.crt \
  -client-key=/etc/vault.d/tls/vault.key

LDAP / OIDC — for humans with existing identity providers

# Enable OIDC auth (for Google Workspace, Okta, Keycloak, etc.)
vault auth enable oidc

vault write auth/oidc/config \
  oidc_discovery_url="https://accounts.google.com" \
  oidc_client_id="your-client-id" \
  oidc_client_secret="your-client-secret" \
  default_role="default"

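# The localhost URI is the callback used by "vault login -method=oidc" on the CLI
vault write auth/oidc/role/default \
  bound_audiences="your-client-id" \
  allowed_redirect_uris="https://vault.internal.example.com:8200/ui/vault/auth/oidc/oidc/callback" \
  allowed_redirect_uris="http://localhost:8250/oidc/callback" \
  user_claim="email" \
  policies="human-policy"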

# Human logs in via browser
vault login -method=oidc

AppRole is the auth method most production deployments use for applications. The app gets a role_id (baked into the image, not secret; knowing it alone gets you nothing) and a secret_id (injected at deploy time, short-lived, and single-use when secret_id_num_uses=1 is set on the role). Together they authenticate to Vault and produce a token. The token has a TTL. When it expires, the app re-authenticates. No long-lived credentials anywhere. If the secret_id leaks after deployment, it has typically already been used or has expired. Compare this to a database password that has been the same for two years and you understand why AppRole matters.

6. KV Secrets Engine — The Basics

KV v2 (key-value version 2) is the simplest secrets engine: a versioned key-value store at a path hierarchy. Every write creates a new version. Vault keeps N versions per key (configurable). You can roll back to any version. Metadata (who wrote it, when) is tracked separately from the secret data.

Enable and configure KV v2

# Enable KV v2 at the 'secret/' path (this is the default in dev mode)
vault secrets enable -path=secret kv-v2

# Configure how many versions to keep
vault write secret/config max_versions=10

Store and retrieve secrets

# Store a database credential
vault kv put secret/db/production \
  host=db.internal.example.com \
  port=5432 \
  database=myapp \
  username=myapp_user \
  password=hunter2

# Retrieve it
vault kv get secret/db/production

# Get just the password (for scripting)
vault kv get -field=password secret/db/production

# Get as JSON
vault kv get -format=json secret/db/production | jq .data.data

Version history and rollback

# List versions
vault kv metadata get secret/db/production

# Get a specific version
vault kv get -version=3 secret/db/production

# Roll back to version 3 (creates a new version that is a copy of v3)
vault kv rollback -version=3 secret/db/production

# Delete current version (soft delete — data retained, version is marked deleted)
vault kv delete secret/db/production

# Destroy a specific version permanently (no recovery)
vault kv destroy -versions=1 secret/db/production

Real-world KV usage: storing infra secrets

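# WireGuard peer keys for a new node
# (derive the public key from the same private key so the pair matches)
WG_PRIVATE="$(wg genkey)"
WG_PUBLIC="$(printf '%s\n' "$WG_PRIVATE" | wg pubkey)"
vault kv put secret/wireguard/nodes/node-7 \
  private_key="$WG_PRIVATE" \
  public_key="$WG_PUBLIC"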

# API keys (store with metadata about rotation schedule)
vault kv put secret/api-keys/stripe \
  secret_key="sk_live_..." \
  publishable_key="pk_live_..." \
  environment=production \
  rotate_after="2027-01-01"

# SSH host keys (for golden image workflow; restore at deploy time)
# @file stores the file contents byte for byte, including the trailing newline
vault kv put secret/ssh-host-keys/node-7 \
  ed25519_private=@/etc/ssh/ssh_host_ed25519_key \
  ed25519_public=@/etc/ssh/ssh_host_ed25519_key.pub

KV v2 versions your secrets the same way OpenZFS snapshots version your data. Rotated a password and the application broke? Roll back to the previous version in one command. Combined with Vault's audit log, you know exactly who changed what secret and when — not just the current value, but the full history of changes. This alone is worth running Vault: the diff between "who has the password" and "who changed the password, when, and what it was before" is the difference between security through obscurity and actual accountability.

7. Dynamic Secrets — The Game Changer

Dynamic secrets are Vault's most powerful feature. Instead of storing a credential and handing it out, Vault generates a fresh credential when asked, scoped to the requesting entity, with a configurable TTL. When the TTL expires, Vault runs the cleanup — drops the database user, revokes the cloud IAM role, expires the SSH certificate. There is nothing to rotate because the credential never persists.

PostgreSQL dynamic credentials

This is the canonical example. Vault connects to PostgreSQL as an admin user. When an application requests credentials, Vault creates a temporary database user with specific permissions. After one hour, Vault drops that user.

# Enable the database secrets engine
vault secrets enable database

# Configure the PostgreSQL connection
# Vault uses this admin connection to create and revoke dynamic users
vault write database/config/myapp-db \
  plugin_name=postgresql-database-plugin \
  allowed_roles="webapp-role,reporting-role" \
  connection_url="postgresql://{{username}}:{{password}}@db.internal.example.com:5432/myapp?sslmode=require" \
  username="vault_admin" \
  password="vault-admin-password"

# Create a role — defines the SQL to run when creating/revoking credentials
vault write database/roles/webapp-role \
  db_name=myapp-db \
  creation_statements="
    CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}';
    GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA public TO \"{{name}}\";
    GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO \"{{name}}\";
  " \
  revocation_statements="
    REVOKE ALL PRIVILEGES ON ALL TABLES IN SCHEMA public FROM \"{{name}}\";
    REVOKE ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public FROM \"{{name}}\";
    DROP ROLE IF EXISTS \"{{name}}\";
  " \
  default_ttl="1h" \
  max_ttl="24h"

# Request credentials — what an application does at startup
vault read database/creds/webapp-role
# → lease_id:       database/creds/webapp-role/abc123
# → lease_duration: 1h0m0s
# → username:       v-approle-webapp-role-def456
# → password:       A1B2C3D4E5F6G7H8

# The user exists in PostgreSQL until the lease expires, then Vault drops it
# You can verify:
# psql -c "\du" | grep v-approle
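Each read of database/creds/webapp-role creates a new database user, so an application should take the username and password from a single response rather than from two separate `-field` reads. A sketch of that pattern, assuming jq is available and that the application accepts libpq-style PGUSER and PGPASSWORD variables; those variable names, VAULT_LEASE_ID, and the function name `creds_to_env` are assumptions for illustration.

```shell
# Convert one `vault read -format=json database/creds/<role>` response
# (read from stdin) into shell-quoted export statements.
creds_to_env() {
  jq -r '@sh "export PGUSER=\(.data.username) PGPASSWORD=\(.data.password) VAULT_LEASE_ID=\(.lease_id)"'
}

# Usage (not executed here):
#   eval "$(vault read -format=json database/creds/webapp-role | creds_to_env)"
#   psql -h db.internal.example.com myapp
```

Keeping the lease ID alongside the credentials lets the application renew or revoke the lease later with `vault lease renew` or `vault lease revoke`.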

MySQL / MariaDB dynamic credentials

# The role gets its own name: database/roles/webapp-role already exists for
# PostgreSQL in the same mount, and writing it again would overwrite it
vault write database/config/myapp-mysql \
  plugin_name=mysql-database-plugin \
  allowed_roles="webapp-mysql-role" \
  connection_url="{{username}}:{{password}}@tcp(mysql.internal.example.com:3306)/" \
  username="vault_admin" \
  password="vault-admin-password"

vault write database/roles/webapp-mysql-role \
  db_name=myapp-mysql \
  creation_statements="
    CREATE USER '{{name}}'@'%' IDENTIFIED BY '{{password}}';
    GRANT SELECT, INSERT, UPDATE ON myapp.* TO '{{name}}'@'%';
  " \
  revocation_statements="DROP USER IF EXISTS '{{name}}'@'%';" \
  default_ttl="1h" \
  max_ttl="24h"

AWS dynamic credentials

Vault creates temporary AWS IAM users or assumes roles and returns short-lived access keys. No permanent IAM users. Credentials are automatically revoked when the lease expires.

# Enable AWS secrets engine
vault secrets enable aws

# Configure with an IAM user that has permission to create IAM credentials
vault write aws/config/root \
  access_key=AKIAIOSFODNN7EXAMPLE \
  secret_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY \
  region=us-east-1

# Create a role with an IAM policy
vault write aws/roles/deploy-role \
  credential_type=iam_user \
  policy_document=- <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["s3:GetObject", "s3:PutObject"],
    "Resource": "arn:aws:s3:::my-deployment-bucket/*"
  }]
}
EOF

# Request credentials
vault read aws/creds/deploy-role
# → access_key: AKIAI...
# → secret_key: wJalr...
# → lease_duration: 768h
# These credentials exist in AWS for the lease duration, then Vault deletes the IAM user

SSH signed certificates

Vault signs SSH certificates with a configurable TTL. No authorized_keys file. No permanent key distribution. The SSH certificate proves identity and expires automatically.

# Enable SSH secrets engine (CA mode)
vault secrets enable ssh

# Create a CA key pair inside Vault
vault write ssh/config/ca generate_signing_key=true

# Get the public key — add this to /etc/ssh/sshd_config on all servers
vault read -field=public_key ssh/config/ca > /etc/ssh/vault_ssh_ca.pub
echo "TrustedUserCAKeys /etc/ssh/vault_ssh_ca.pub" >> /etc/ssh/sshd_config

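# Create a role for operator access
# allow_user_certificates is required for signing user keys;
# default_extensions grants a PTY unless the signing request overrides it
vault write ssh/roles/operator - <<'EOF'
{
  "key_type": "ca",
  "allow_user_certificates": true,
  "allowed_users": "root,ops",
  "default_user": "ops",
  "ttl": "4h",
  "max_ttl": "8h",
  "allowed_extensions": "permit-pty,permit-port-forwarding",
  "default_extensions": { "permit-pty": "" }
}
EOF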

# Operator signs their public key (run on the operator's workstation)
# ($HOME is used because the shell does not expand ~ after the @)
vault write ssh/sign/operator \
  public_key=@"$HOME/.ssh/id_ed25519.pub" \
  valid_principals=ops \
  ttl=4h
# → signed_key: ssh-ed25519-cert-v01@openssh.com ...

# Save the signed certificate next to the private key, then SSH with the private key;
# ssh picks up id_ed25519-cert.pub automatically
# vault write -field=signed_key ssh/sign/operator public_key=@"$HOME/.ssh/id_ed25519.pub" > ~/.ssh/id_ed25519-cert.pub
# ssh -i ~/.ssh/id_ed25519 ops@10.201.0.5

Dynamic secrets eliminate the concept of password rotation. There is nothing to rotate because credentials do not persist. Every access gets fresh credentials that expire automatically. A leaked credential is useless after the TTL. Think about what this means for the database password example at the top of this guide: "six people have read it, it is in git history, you have no idea who used it at 3 AM" — none of that applies to a dynamic credential that is created for one application, expires in one hour, and is automatically revoked. The attacker who exfiltrates your config file gets credentials that stopped working before they finished reading the file.

8. PKI Secrets Engine — Certificate Authority

Vault's PKI engine makes Vault a certificate authority. You define a root CA, one or more intermediate CAs, and roles that describe what certificates can be issued. Applications request certificates via the Vault API and get short-lived certs back. No manual certificate management. No year-long certificate lifetimes.

Set up a root CA and intermediate CA

# Enable PKI engine for root CA
vault secrets enable -path=pki pki
vault secrets tune -max-lease-ttl=87600h pki  # 10 years for root

# Generate root CA (self-signed, stored inside Vault)
vault write -field=certificate pki/root/generate/internal \
  common_name="kldload Root CA" \
  ttl=87600h > /etc/ssl/certs/vault-root-ca.crt

# Configure URLs
vault write pki/config/urls \
  issuing_certificates="https://vault.internal.example.com:8200/v1/pki/ca" \
  crl_distribution_points="https://vault.internal.example.com:8200/v1/pki/crl"

# Enable intermediate CA engine
vault secrets enable -path=pki_int pki
vault secrets tune -max-lease-ttl=43800h pki_int  # 5 years for intermediate

# Generate intermediate CSR
vault write -format=json pki_int/intermediate/generate/internal \
  common_name="kldload Intermediate CA" \
  ttl=43800h | jq -r .data.csr > /tmp/pki_int.csr

# Sign with root CA
vault write -format=json pki/root/sign-intermediate \
  csr=@/tmp/pki_int.csr \
  format=pem_bundle \
  ttl=43800h | jq -r .data.certificate > /tmp/intermediate.cert.pem

# Set signed certificate on intermediate
vault write pki_int/intermediate/set-signed certificate=@/tmp/intermediate.cert.pem

Create roles and issue certificates

# Create a role for internal services (max_ttl=720h caps certificates at 30 days)
vault write pki_int/roles/internal-services \
  allowed_domains="internal.example.com" \
  allow_subdomains=true \
  allow_bare_domains=false \
  max_ttl=720h \
  generate_lease=true

# Issue a certificate for a service (IP addresses go in ip_sans, DNS names in alt_names)
vault write pki_int/issue/internal-services \
  common_name="vault.internal.example.com" \
  ip_sans="10.201.0.1" \
  ttl=720h

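# Automate renewal: run this via cron or systemd timer before expiry
# The certificate lives 7 days, so schedule the job more often than that
# (for example daily) to leave room for a missed run
vault write pki_int/issue/internal-services \
  common_name="$(hostname).internal.example.com" \
  ttl=168h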

Vault PKI vs. step-ca

Both solve the same problem differently. step-ca is a standalone CA focused on ACME protocol support and simple certificate issuance. Vault's PKI engine is embedded inside Vault and benefits from Vault's auth, audit, and policy system. If you already run Vault, use its PKI engine — every certificate issuance is audited, policy-controlled, and revocable. If you only need certificates and do not want to run Vault, step-ca is simpler.

Vault's PKI engine makes short-lived certificates practical, and the role's max_ttl enforces them. With a 30-day maximum TTL, a certificate whose private key is compromised stops being accepted within 30 days without any manual intervention. Most PKI deployments issue 1-year or 5-year certificates because manual renewal is painful. Vault makes renewal automatic, so short TTLs become practical. The same principle applies as with dynamic database credentials: make the credential lifetime short enough that leakage is self-limiting.
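One way to decide when a timer should reissue a certificate is to renew once a fixed fraction of its lifetime has elapsed. The sketch below is an illustration under assumptions, not something Vault prescribes: the function name cert_needs_renewal and the two-thirds threshold are hypothetical, and all timestamps are plain epoch seconds.

```shell
# Hypothetical helper: exit 0 when a certificate should be renewed.
# Arguments are epoch seconds: not_before, not_after, now.
# Assumed rule: renew once two thirds of the lifetime has elapsed.
cert_needs_renewal() {
  not_before=$1
  not_after=$2
  now=$3
  lifetime=$((not_after - not_before))
  renew_at=$((not_before + lifetime * 2 / 3))
  [ "$now" -ge "$renew_at" ]
}

# Example: a 7-day certificate (604800 s) issued at t=0 is due from t=403200 on.
if cert_needs_renewal 0 604800 500000; then
  echo "renew"
else
  echo "still fresh"
fi
```

In a real job the two dates would come from the certificate itself (for example via openssl x509 -noout -startdate -enddate) and be converted to epoch seconds before calling the helper.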

9. Transit Secrets Engine — Encryption as a Service

The transit engine is Vault's encryption service. Applications send plaintext to Vault, receive ciphertext back. Applications send ciphertext to Vault, receive plaintext back. The encryption key never leaves Vault. The application does not manage keys, does not handle key rotation, and does not need to understand cryptography.

Enable and configure transit

# Enable transit engine
vault secrets enable transit

# Create an encryption key
vault write -f transit/keys/webapp-data

# Key is created inside Vault — you cannot export it (by default)
# Inspect key metadata (not the key material itself)
vault read transit/keys/webapp-data

Encrypt and decrypt data

# Encrypt a value (plaintext must be base64-encoded)
vault write transit/encrypt/webapp-data \
  plaintext=$(echo -n "my-secret-database-password" | base64)
# → ciphertext: vault:v1:abc123def456...

# Decrypt
vault write transit/decrypt/webapp-data \
  ciphertext=vault:v1:abc123def456...
# → plaintext: bXktc2VjcmV0LWRhdGFiYXNlLXBhc3N3b3Jk
# Decode: echo "bXktc2VjcmV0LWRhdGFiYXNlLXBhc3N3b3Jk" | base64 -d
# → my-secret-database-password
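Because transit only accepts base64, a pair of small wrappers keeps the encoding step consistent. This is a minimal sketch assuming GNU coreutils base64; the names b64_encode and b64_decode are illustrative. The encoder strips newlines because GNU base64 wraps long output at 76 columns, which would otherwise split the plaintext= value.

```shell
# Encode stdin as single-line base64 (GNU base64 wraps at 76 columns by default).
b64_encode() {
  base64 | tr -d '\n'
}

# Decode base64 from stdin. "-d" is the GNU coreutils spelling.
b64_decode() {
  base64 -d
}

secret='my-secret-database-password'
encoded=$(printf '%s' "$secret" | b64_encode)
decoded=$(printf '%s' "$encoded" | b64_decode)

echo "$encoded"
echo "$decoded"
```

With these in place the encrypt call becomes plaintext="$(printf '%s' "$secret" | b64_encode)", and the decrypt output is piped through b64_decode.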

Key rotation without re-encryption

Vault's transit keys are versioned. When you rotate, Vault creates a new key version. New encryptions use the latest version. Old ciphertexts are still decryptable with the old key version. You can then rewrap old ciphertexts to the new version at your own pace.

# Rotate the key — new version created, old versions still valid for decrypt
vault write -f transit/keys/webapp-data/rotate

# Rewrap old ciphertext to the new key version (optional — old versions still work)
vault write transit/rewrap/webapp-data \
  ciphertext=vault:v1:abc123def456...
# → ciphertext: vault:v2:newciphertext...  (now uses v2 key)

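# Set a minimum version for decryption (forces all clients onto v2 or later)
# Ciphertext from older versions can no longer be decrypted after this,
# so rewrap all stored data first
vault write transit/keys/webapp-data/config \
  min_decryption_version=2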
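The key version is visible in the ciphertext prefix, so a batch job can tell which stored values still need a rewrap before min_decryption_version is raised. The helpers below are a sketch; transit_key_version and needs_rewrap are hypothetical names, and they only inspect the string without talking to Vault.

```shell
# Print the key version embedded in a transit ciphertext ("vault:v2:..." -> 2).
# Returns non-zero if the string does not look like transit ciphertext.
transit_key_version() {
  case $1 in
    vault:v[0-9]*:?*)
      rest=${1#vault:v}
      version=${rest%%:*}
      case $version in
        *[!0-9]*) return 1 ;;
      esac
      echo "$version"
      ;;
    *)
      return 1
      ;;
  esac
}

# Exit 0 when the ciphertext was produced by a key version older than $2.
# Exit 2 when the input is not transit ciphertext at all.
needs_rewrap() {
  version=$(transit_key_version "$1") || return 2
  [ "$version" -lt "$2" ]
}
```

A migration loop would call needs_rewrap on each stored value with the intended minimum version and send only the matches to transit/rewrap.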

Real-world use case: encrypting PII in a database

# Application flow for storing an SSN:
# 1. Application calls Vault transit before writing to database
ENCRYPTED=$(vault write -field=ciphertext transit/encrypt/pii-data \
  plaintext=$(echo -n "123-45-6789" | base64))

# 2. Application stores $ENCRYPTED in the database (not the SSN)
# psql -c "INSERT INTO users (ssn_encrypted) VALUES ('${ENCRYPTED}')"

# 3. Application retrieves and decrypts when needed
PLAINTEXT=$(vault write -field=plaintext transit/decrypt/pii-data \
  ciphertext="$ENCRYPTED" | base64 -d)

# If the application is compromised, the attacker gets vault:v1:abc123...
# That ciphertext is useless without access to Vault's transit key
# The SSN never exists in the database, in logs, or in application memory for long

The transit engine is how you add encryption to applications that do not natively support it. Your application does not manage keys, does not know the encryption algorithm, and does not handle key rotation. It sends data to Vault and gets back encrypted data. When Vault's transit key is rotated, the application does not change. When the application is compromised, the attacker gets ciphertext without the key. When a compliance auditor asks "how is the PII encrypted and where are the keys stored?" the answer is: "in a dedicated secrets engine, audited and policy-controlled, with keys that never leave Vault and a seal that can be backed by a KMS or HSM." (FIPS 140-2 validated builds and HSM-backed seals are Vault Enterprise features, so claim them only if you run them.) That is a much better answer than "we use AES in the application, the key is in the config file."

10. Vault Policies — Who Can Access What

Vault policies are the ACL layer that determines what a token can do. They are HCL files that map path patterns to capabilities. The default is deny — a token can only do what its policies explicitly allow. This is the mechanism that gives each application exactly the secrets it needs and nothing more.

Policy syntax

# Capabilities: create, read, update, delete, list, sudo, deny
# Path wildcards: a trailing * is a prefix match (any characters, any depth);
# + matches exactly one path segment. There are no named captures.

# Example: webapp policy
cat > /tmp/webapp-policy.hcl <<'EOF'
# Read database credentials for this app only
path "database/creds/webapp-role" {
  capabilities = ["read"]
}

# Read and write app configuration secrets (KV v2)
path "secret/data/webapp/*" {
  capabilities = ["create", "read", "update", "delete"]
}

# List secrets under the webapp path (to know what exists)
path "secret/metadata/webapp/*" {
  capabilities = ["list"]
}

# Encrypt/decrypt using the webapp transit key only
path "transit/encrypt/webapp-data" {
  capabilities = ["update"]
}
path "transit/decrypt/webapp-data" {
  capabilities = ["update"]
}

# Allow the app to renew its own token and leases
path "auth/token/renew-self" {
  capabilities = ["update"]
}
path "sys/leases/renew" {
  capabilities = ["update"]
}
EOF

vault policy write webapp /tmp/webapp-policy.hcl
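To build intuition for how policy paths match request paths, the function below emulates the two wildcard rules in plain shell. It is a simplified sketch, not Vault's implementation: policy_path_matches is a hypothetical name, it assumes * appears only as the last character and + stands for one whole non-empty segment, and it ignores Vault's rule for choosing between several matching policies.

```shell
# Exit 0 if request path $2 is covered by policy path pattern $1.
policy_path_matches() {
  _pat=$1
  _path=$2
  _glob=0
  case $_pat in
    *'*') _glob=1; _pat=${_pat%?} ;;   # trailing "*" turns this into a prefix match
  esac
  while :; do
    case $_pat in
      */*)
        # Non-final pattern segment: the path needs a matching full segment.
        case $_path in
          */*) ;;
          *) return 1 ;;
        esac
        _pseg=${_pat%%/*}
        _sseg=${_path%%/*}
        if [ "$_pseg" = "+" ]; then
          [ -n "$_sseg" ] || return 1
        elif [ "$_pseg" != "$_sseg" ]; then
          return 1
        fi
        _pat=${_pat#*/}
        _path=${_path#*/}
        ;;
      *)
        # Final pattern segment.
        if [ "$_glob" -eq 1 ]; then
          case $_path in
            "$_pat"*) return 0 ;;
            *) return 1 ;;
          esac
        elif [ "$_pat" = "+" ]; then
          case $_path in
            ''|*/*) return 1 ;;
            *) return 0 ;;
          esac
        else
          [ "$_pat" = "$_path" ]
          return
        fi
        ;;
    esac
  done
}
```

For real decisions, vault token capabilities <path> is the authoritative check; the emulation is only useful for reasoning about a policy file before it is written.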

Least-privilege policies for common roles

# CI/CD pipeline — can read deploy secrets, cannot read database or billing secrets
cat > /tmp/cicd-policy.hcl <<'EOF'
path "secret/data/deploy/*" {
  capabilities = ["read"]
}
path "aws/creds/deploy-role" {
  capabilities = ["read"]
}
EOF
vault policy write cicd /tmp/cicd-policy.hcl

# Database application — dynamic credentials only
cat > /tmp/db-app-policy.hcl <<'EOF'
path "database/creds/webapp-role" {
  capabilities = ["read"]
}
path "sys/leases/renew" {
  capabilities = ["update"]
}
EOF
vault policy write db-app /tmp/db-app-policy.hcl

# Infrastructure operators — read everything, write nothing in production secrets
cat > /tmp/ops-policy.hcl <<'EOF'
path "secret/data/*" {
  capabilities = ["read", "list"]
}
path "secret/metadata/*" {
  capabilities = ["list"]
}
path "sys/health" {
  capabilities = ["read"]
}
path "sys/metrics" {
  capabilities = ["read"]
}
EOF
vault policy write ops /tmp/ops-policy.hcl

Policy templating — identity-based dynamic policies

Vault can template policies with the identity of the caller. An operator's policy can grant access to their own namespace without creating one policy per operator.

# Template: each user gets their own secret namespace
cat > /tmp/user-policy.hcl <<'EOF'
path "secret/data/users/{{identity.entity.name}}/*" {
  capabilities = ["create", "read", "update", "delete"]
}
path "secret/metadata/users/{{identity.entity.name}}/*" {
  capabilities = ["list"]
}
EOF
vault policy write user-self-service /tmp/user-policy.hcl

# When alice authenticates, {{identity.entity.name}} becomes "alice"
# alice can read/write secret/data/users/alice/* but not secret/data/users/bob/*

Vault policies are the ACL that most infrastructure is missing. Without Vault, everyone who can read a config file can read every secret in it — all the database passwords, all the API keys, all the WireGuard keys. With Vault policies, the database team accesses database secrets, the platform team accesses infrastructure secrets, the CI/CD pipeline accesses deploy secrets, and none of them can see each other's secrets. The audit log records every attempt to access a path outside a policy — so you see when someone tries to read something they should not. The combination of explicit allow, implicit deny, and complete audit logging is how you actually implement least privilege rather than just claiming it.

11. Audit Logging — Every Secret Access Is Recorded

Vault's audit log records every request and response. Who asked. What they asked for. When. What Vault gave back (with secret values HMAC'd — not plaintext). The audit log is Vault's compliance story and its forensics story. It is also the mechanism that makes Vault's implicit deny meaningful: you know when access is attempted, not just when it succeeds.

Configure audit logging to the ZFS audit dataset

# Enable file audit to the ZFS compressed audit dataset
# log_raw=false (the default) HMACs secret values instead of logging plaintext
vault audit enable file \
  file_path=/vault/audit/vault-audit.log \
  log_raw=false \
  mode=0600

# Verify
vault audit list -detailed

# Vault blocks all requests if the audit device fails to write
# If /vault/audit fills up, Vault stops accepting requests
# Monitor disk space on rpool/vault/audit

# Set a ZFS quota to prevent the audit log from filling rpool
zfs set quota=50G rpool/vault/audit

# Set sanoid to keep 1 year of audit log snapshots
# Add to /etc/sanoid/sanoid.conf:
cat >> /etc/sanoid/sanoid.conf <<'EOF'

[rpool/vault/audit]
use_template=audit_logs

[template_audit_logs]
daily = 365
weekly = 52
monthly = 24
yearly = 5
autosnap = yes
autoprune = yes
EOF
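Since a full audit dataset blocks Vault, it helps to alert well before the quota is reached. The check below is a sketch: quota_over_threshold is a hypothetical helper that takes byte counts such as those printed by zfs get -Hp -o value used and zfs get -Hp -o value quota, and the 80 percent threshold in the note is an assumption.

```shell
# Exit 0 when usage is at or above the alert threshold.
# Arguments: used_bytes quota_bytes threshold_percent
quota_over_threshold() {
  used=$1
  quota=$2
  threshold=$3
  # A quota of 0 means "no quota set", so there is nothing to compare against.
  [ "$quota" -gt 0 ] || return 1
  percent=$((used * 100 / quota))
  [ "$percent" -ge "$threshold" ]
}
```

A monitoring job could read both values for rpool/vault/audit, call quota_over_threshold "$used" "$quota" 80, and page an operator when it succeeds.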

Querying the audit log

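# Vault audit logs are JSON, one object per line
# Who accessed what in the last hour?
# (assumes the RFC 3339 timestamps are in UTC, so string comparison is enough)
SINCE=$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S)
jq -r --arg since "$SINCE" \
  'select(.type == "response" and .time >= $since) | [.time, .auth.display_name, .request.path] | @tsv' \
  /vault/audit/vault-audit.log

# With jq: count accesses to database credentials per principal
# (filter on type, otherwise each access is counted as both request and response)
grep database/creds /vault/audit/vault-audit.log \
  | jq -r 'select(.type == "response") | .auth.display_name + " → " + .request.path' \
  | sort | uniq -c | sort -rn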

# All accesses by a specific entity
jq -r 'select(.auth.entity_id == "YOUR_ENTITY_ID") | .request.path' \
  /vault/audit/vault-audit.log

# Failed access attempts (failures are recorded in the entry's top-level "error" field)
jq -r 'select(.error != null) | "\(.auth.display_name) \(.request.path)"' \
  /vault/audit/vault-audit.log | head -50

# Access to production database credentials in the last 24h
grep 'database/creds/webapp-role' /vault/audit/vault-audit.log \
  | jq -r '[.time, .auth.display_name, .request.remote_address] | @tsv'
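The queries above can be wrapped in a reusable filter. The sketch below assumes jq is installed; audit_count_by_principal is a hypothetical name. It counts only entries of type "response" so that each access is tallied once, and it reads audit lines from stdin so it works on the live file, a rotated file, or a ZFS snapshot copy.

```shell
# Count accesses per principal for request paths starting with prefix $1.
# Reads Vault audit JSON lines on stdin.
audit_count_by_principal() {
  jq -r --arg prefix "$1" '
    select(.type == "response")
    | select((.request.path // "") | startswith($prefix))
    | .auth.display_name // "unauthenticated"
  ' | sort | uniq -c | sort -rn
}
```

Usage: audit_count_by_principal database/creds/ < /vault/audit/vault-audit.log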

Forward audit logs to Loki

# promtail config to ship Vault audit logs to Loki over WireGuard
cat > /etc/promtail/vault-audit.yaml <<'EOF'
server:
  http_listen_port: 9081
  grpc_listen_port: 0

clients:
  - url: http://10.202.0.5:3100/loki/api/v1/push  # Loki on monitoring plane

scrape_configs:
  - job_name: vault-audit
    static_configs:
      - targets:
          - localhost
        labels:
          job: vault-audit
          host: vault-node-1
          __path__: /vault/audit/vault-audit.log
    pipeline_stages:
      - json:
          expressions:
            time: time
            auth_name: auth.display_name
            request_path: request.path
            request_op: request.operation
            response_code: response.data.http_status_code
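      - labels:
          # Keep label cardinality low: request paths are unbounded, so leave
          # request_path in the log line and filter on it at query time
          auth_name:
          request_op: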
EOF

Vault's audit log is the compliance officer's best friend. "Who accessed the production database credentials at 3 AM last Tuesday?" One grep. "How many times was the Stripe API key read in the last 30 days, and by which services?" One jq query. "Did anyone try to access secrets they are not authorized to read?" jq on the error field. Every access, every secret path, every principal, every timestamp, every source IP. Secret values are HMAC'd, so the log never exposes them, yet you can still check whether a known value appears in it. The log is stored on OpenZFS, where checksums detect silent corruption and read-only snapshots preserve earlier states of the file, so a later edit can be spotted by comparing against a snapshot. This combination of Vault's HMAC'd audit log and ZFS checksums and snapshots is a forensics record you can trust.

12. Vault HA with Raft on OpenZFS

Vault HA with integrated Raft storage requires no external dependency. Three Vault nodes, each with local Raft storage on an encrypted ZFS dataset, connected over the WireGuard management plane. Raft handles leader election and log replication. ZFS handles storage snapshots and DR replication. One active node (the Raft leader) serves all requests. In open source Vault the standby nodes forward client requests to the active node; serving reads from standbys (performance standbys) is a Vault Enterprise feature.
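Why three nodes and not two or four follows from Raft's majority rule: a cluster of n voters needs floor(n/2)+1 of them to agree. The short sketch below works through the arithmetic; the function names are illustrative.

```shell
# Number of voters that must be reachable for the cluster to make progress.
raft_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of voters that can fail while the cluster keeps working.
raft_failure_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 3 5; do
  echo "nodes=$n quorum=$(raft_quorum "$n") tolerates=$(raft_failure_tolerance "$n")"
done
# nodes=1 quorum=1 tolerates=0
# nodes=3 quorum=2 tolerates=1
# nodes=5 quorum=3 tolerates=2
```

A fourth node raises the quorum to 3 but still tolerates only one failure, which is why odd cluster sizes are the usual choice.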

Three-node HA cluster on kldload

# vault.hcl for vault-node-1 (10.201.0.1)
# vault-node-2: 10.201.0.2, vault-node-3: 10.201.0.3

listener "tcp" {
  address     = "10.201.0.1:8200"
  tls_cert_file = "/etc/vault.d/tls/vault.crt"
  tls_key_file  = "/etc/vault.d/tls/vault.key"
  tls_min_version = "tls13"
}

storage "raft" {
  path    = "/vault/data"
  node_id = "vault-node-1"

  retry_join {
    leader_api_addr = "https://10.201.0.2:8200"
    leader_ca_cert_file = "/etc/step/certs/root_ca.crt"
  }
  retry_join {
    leader_api_addr = "https://10.201.0.3:8200"
    leader_ca_cert_file = "/etc/step/certs/root_ca.crt"
  }
}

api_addr     = "https://10.201.0.1:8200"
cluster_addr = "https://10.201.0.1:8201"
ui           = true
disable_mlock = false
log_level    = "info"

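# Start all three nodes
# Initialize on node-1 only
export VAULT_ADDR="https://10.201.0.1:8200"
vault operator init -key-shares=5 -key-threshold=3 -format=json > /root/vault-init.json
# vault-init.json holds every unseal key and the root token in plaintext:
# distribute the shares to their holders, then move the file off this host

# Unseal all three nodes with 3 of the 5 keys
# (node-2 and node-3 join the cluster via retry_join, then need the same unseal keys)
# "vault operator unseal" without an argument prompts for the key,
# which keeps key shares out of shell history
for NODE in 10.201.0.1 10.201.0.2 10.201.0.3; do
  export VAULT_ADDR="https://${NODE}:8200"
  vault operator unseal   # key share 1
  vault operator unseal   # key share 2
  vault operator unseal   # key share 3
done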

# Check cluster status
vault operator raft list-peers
# → Node            Address              State     Voter
# → vault-node-1    10.201.0.1:8201      leader    true
# → vault-node-2    10.201.0.2:8201      follower  true
# → vault-node-3    10.201.0.3:8201      follower  true

ZFS snapshots of Vault Raft data

# Snapshot before upgrades or major operations
zfs snapshot rpool/vault/data@before-upgrade-$(date +%Y%m%d)

# Vault also has a built-in Raft snapshot
# (captures Vault's logical state, not the raw Raft log)
vault operator raft snapshot save /tmp/vault-snapshot-$(date +%Y%m%d).snap

# Store the snapshot in Vault's own ZFS dataset for safekeeping
mv /tmp/vault-snapshot-*.snap /vault/snapshots/

Auto-unseal with transit seal

Manual unseal requires humans with key shares to be available when Vault starts. Auto-unseal delegates the unseal to another Vault (transit seal) or a cloud KMS. On a multi-node kldload cluster, one Vault (the transit Vault) can auto-unseal the production Vault cluster. This is not circular — the transit Vault uses a different storage backend and is itself manually unsealed.

# vault.hcl — auto-unseal using transit from a separate "seal" Vault instance
seal "transit" {
  address         = "https://10.201.0.10:8200"   # separate "seal vault" instance
  token           = "s.seal-vault-token"
  disable_renewal = "false"
  key_name        = "autounseal"
  mount_path      = "transit/"
  tls_ca_cert     = "/etc/step/certs/root_ca.crt"
}

Vault HA on Raft is simpler than the old Consul-backed HA. Three Vault nodes, each on a kldload server with OpenZFS, connected by WireGuard. Raft handles leader election and log replication. ZFS handles storage snapshots and DR replication. The Raft data is on an encrypted ZFS dataset that is snapshotted hourly by sanoid and replicated to the DR site by syncoid over WireGuard. Vault Enterprise has built-in DR replication. Open source Vault on OpenZFS gets a comparable cold-standby result: syncoid gives you incremental block-level replication of the Vault Raft data to a DR site. Recovery is: promote the replicated ZFS dataset, start Vault, unseal. RPO equals your replication interval.

13. Vault + Kubernetes

Kubernetes pods need secrets. The naive approach is Kubernetes Secrets (base64-encoded, not encrypted at rest by default, readable by anyone whose RBAC permissions allow reading Secrets in the namespace). The correct approach is Vault: pods authenticate using their service account, get tokens scoped to their role, and receive only the secrets their policy allows. Vault manages the lifecycle. The pod reads a file.

Vault Agent Injector — sidecar approach

The Vault Agent Injector is a Kubernetes mutating webhook. When a pod has the right annotations, the injector adds a Vault agent sidecar that authenticates to Vault, fetches secrets, writes them to a shared volume, and renews them before they expire. The application reads a file. No SDK. No code changes.

# Install Vault Agent Injector via Helm
helm repo add hashicorp https://helm.releases.hashicorp.com
helm install vault hashicorp/vault \
  --set "injector.enabled=true" \
  --set "server.enabled=false" \
  --set "injector.externalVaultAddr=https://10.201.0.1:8200"

# Pod spec with Vault injection annotations
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
  namespace: production
spec:
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
      annotations:
        # Tell the injector to inject a Vault agent
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "webapp"
        vault.hashicorp.com/agent-inject-secret-db-creds: "database/creds/webapp-role"
        # Template the secret into a config file format
        vault.hashicorp.com/agent-inject-template-db-creds: |
          {{- with secret "database/creds/webapp-role" -}}
          DB_HOST=db.internal.example.com
          DB_PORT=5432
          DB_USER={{ .Data.username }}
          DB_PASS={{ .Data.password }}
          {{- end }}
    spec:
      serviceAccountName: webapp
      containers:
      - name: webapp
        image: webapp:latest
        # Secret appears at /vault/secrets/db-creds
        # Application reads: source /vault/secrets/db-creds

Vault CSI Provider — mount secrets as volumes

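# Install Secrets Store CSI Driver and Vault provider
helm repo add secrets-store-csi-driver https://kubernetes-sigs.github.io/secrets-store-csi-driver/charts
helm install csi secrets-store-csi-driver/secrets-store-csi-driver
# If a release named "vault" already exists from the injector step,
# enable the CSI provider with "helm upgrade" instead of a second install
helm install vault hashicorp/vault --set "csi.enabled=true"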

# SecretProviderClass — tells the CSI driver what to fetch from Vault
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: webapp-vault-secrets
  namespace: production
spec:
  provider: vault
  parameters:
    vaultAddress: "https://10.201.0.1:8200"
    roleName: "webapp"
    objects: |
      - objectName: "db-password"
        secretPath: "database/creds/webapp-role"
        secretKey: "password"
      - objectName: "api-key"
        secretPath: "secret/data/webapp/api"
        secretKey: "key"

# Pod mounts the CSI volume
volumes:
- name: vault-secrets
  csi:
    driver: secrets-store.csi.k8s.io
    readOnly: true
    volumeAttributes:
      secretProviderClass: webapp-vault-secrets

The Vault Agent Injector is the bridge between Vault and Kubernetes. A pod annotation says "I need secret database/creds/my-role" and the injector sidecar fetches it from Vault, writes it to a shared volume, and renews it before it expires. The application reads a file. No Vault SDK. No code changes. The pod's service account is the identity — the same account used for RBAC, now also used for Vault authentication. One identity system, two access control layers. Kubernetes says what the pod can do in the cluster. Vault says what secrets the pod can read. Both are audited.

14. Secrets Workflow for kldload Deployments

Put it all together: a complete secrets lifecycle for a kldload deployment. The golden rule is simple: nothing sensitive in the image. Everything injected at runtime from Vault. The image is clean. The instance bootstraps its identity from something it knows at deploy time (a one-time secret_id, a service account JWT, or a TLS certificate from step-ca), and from that point forward it fetches everything it needs from Vault.

WireGuard keys

# Store generated WireGuard keys in Vault at deploy time
PRIVATE_KEY=$(wg genkey)
PUBLIC_KEY=$(echo "$PRIVATE_KEY" | wg pubkey)

vault kv put secret/wireguard/nodes/$(hostname) \
  private_key="$PRIVATE_KEY" \
  public_key="$PUBLIC_KEY" \
  assigned_ip="10.200.0.$(shuf -i 10-254 -n1)"   # a random pick can collide; use a real allocator in production

# At firstboot, fetch from Vault:
# PRIVATE_KEY=$(vault kv get -field=private_key secret/wireguard/nodes/$(hostname))
# Configure wg0.conf, bring up the tunnel

Database connections — no passwords in config files

#!/bin/bash
# Application startup script: fetch dynamic credentials from Vault
export VAULT_ADDR="https://10.201.0.1:8200"
export VAULT_CACERT="/etc/step/certs/root_ca.crt"

# Authenticate with AppRole (role_id is in /etc/app/role_id — not secret)
# secret_id was injected into /run/secrets/vault-secret-id at deploy time (tmpfs, not persisted)
VAULT_TOKEN=$(vault write -field=token auth/approle/login \
  role_id="$(cat /etc/app/role_id)" \
  secret_id="$(cat /run/secrets/vault-secret-id)")

export VAULT_TOKEN

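# Fetch dynamic database credentials
# Read once: every read of database/creds/... creates a NEW user, so two
# separate reads would pair one user's name with another user's password
DB_CREDS=$(vault read -format=json database/creds/webapp-role)
DB_USER=$(echo "$DB_CREDS" | jq -r .data.username)
DB_PASS=$(echo "$DB_CREDS" | jq -r .data.password)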

# Export for the application
export DATABASE_URL="postgresql://${DB_USER}:${DB_PASS}@db.internal.example.com:5432/myapp?sslmode=require"

# Start the application — it reads DATABASE_URL from the environment
exec /usr/bin/webapp

Complete golden image workflow with Vault

# kldload golden image deployment with Vault integration
# Everything in the image:
#   - role_id for AppRole authentication (not secret)
#   - Vault CA certificate (for TLS verification)
#   - Firstboot script that fetches secrets and configures services
#
# Nothing in the image:
#   - Passwords, tokens, keys, passphrases

# cloud-init user-data: inject the secret_id at deploy time
# (/run is tmpfs, so the file does not survive a reboot)
# ${vault_secret_id} is filled in by Terraform or your deploy tooling when the
# instance is created, never at image bake time
#cloud-config
write_files:
  - path: /run/secrets/vault-secret-id
    permissions: '0400'
    owner: root:root
    content: |
      ${vault_secret_id}

runcmd:
  - /usr/local/bin/firstboot-vault-init
  - rm -f /run/secrets/vault-secret-id   # consume and destroy
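The "consume and destroy" step can also live inside the firstboot script itself, so the secret_id is gone even if the later rm is never reached. The helper below is a sketch of that pattern; consume_secret_id is a hypothetical name and is not part of kldload or Vault.

```shell
# Print the one-time secret_id stored in file $1, then remove the file.
# Fails without output if the file is missing or empty.
consume_secret_id() {
  file=$1
  [ -s "$file" ] || return 1
  # Strip whitespace that a YAML block scalar may leave around the value.
  secret_id=$(tr -d ' \t\r\n' < "$file")
  rm -f -- "$file"
  printf '%s\n' "$secret_id"
}
```

In the firstboot script this would be used as SECRET_ID=$(consume_secret_id /run/secrets/vault-secret-id) right before the AppRole login.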

SSH access — no authorized_keys

# All SSH access via Vault-signed certificates
# No authorized_keys files. No permanent key distribution.
# An operator's certificate expires when their session should.

# Operator connects by signing their public key via Vault
vault write -field=signed_key ssh/sign/operator \
  public_key=@$HOME/.ssh/id_ed25519.pub \
  valid_principals=ops \
  ttl=4h \
  extensions='{"permit-pty":"","permit-port-forwarding":""}' \
  > ~/.ssh/id_ed25519-cert.pub

ssh -i ~/.ssh/id_ed25519 ops@10.201.0.5
# ssh loads id_ed25519-cert.pub automatically. New logins work for 4 hours;
# after that the certificate is rejected (sessions already open are not cut off)

This is the complete secrets lifecycle for a kldload deployment. The image is clean: no secrets baked in. At deploy time, the instance authenticates to Vault (AppRole, Kubernetes auth, or TLS cert), fetches its secrets, and starts services. If the image is leaked or exfiltrated, no secrets are exposed, because there are none in it. If a credential is compromised, Vault revokes it and the next authentication cycle gets fresh credentials. If an operator's session needs to be terminated, revoke their Vault token: every lease created with that token, such as dynamic database credentials, is revoked with it. SSH certificates that were already signed are not leases, so they stay valid until their TTL runs out, which is why the 4-hour TTL matters. One revocation plus short TTLs gives you complete access termination within a bounded window.

15. Backup and DR for Vault

Vault's state is the most important data in your infrastructure. If Vault goes down and cannot be recovered, every dynamic credential stops working, every application that depends on Vault-fetched secrets fails, and recovering requires either restoring from backup or rebuilding from scratch. Back up Vault like your infrastructure depends on it, because it does.

Vault Raft snapshots (built-in)

# Manual snapshot — captures Vault's logical state
vault operator raft snapshot save /vault/snapshots/vault-$(date +%Y%m%d-%H%M%S).snap

# Automated snapshot via systemd timer
cat > /etc/systemd/system/vault-snapshot.service <<'EOF'
[Unit]
Description=Vault Raft snapshot
After=vault.service

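[Service]
Type=oneshot
Environment=VAULT_ADDR=https://10.201.0.1:8200
Environment=VAULT_CACERT=/etc/step/certs/root_ca.crt
# snapshot-token must contain a line of the form VAULT_TOKEN=...
EnvironmentFile=/etc/vault.d/snapshot-token
# systemd does not expand strftime patterns, so build the timestamp with date(1);
# "$$" and "%%" are systemd escapes for a literal "$" and "%"
ExecStart=/bin/sh -c 'exec /usr/bin/vault operator raft snapshot save /vault/snapshots/vault-$$(date +%%Y%%m%%d-%%H%%M%%S).snap'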
EOF

cat > /etc/systemd/system/vault-snapshot.timer <<'EOF'
[Unit]
Description=Vault Raft snapshot — hourly

[Timer]
OnCalendar=hourly
RandomizedDelaySec=5m
Persistent=true

[Install]
WantedBy=timers.target
EOF

systemctl enable --now vault-snapshot.timer
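An hourly timer fills /vault/snapshots quickly, so some retention is needed. The function below is one possible approach, offered as a sketch: prune_raft_snapshots is a hypothetical helper that keeps the newest N files and relies on the vault-YYYYMMDD-HHMMSS.snap names sorting chronologically. It assumes a find that supports -maxdepth (GNU and BSD both do).

```shell
# Delete all but the newest $2 snapshot files in directory $1.
prune_raft_snapshots() {
  dir=$1
  keep=$2
  total=$(find "$dir" -maxdepth 1 -type f -name 'vault-*.snap' | wc -l | tr -d ' ')
  excess=$((total - keep))
  [ "$excess" -gt 0 ] || return 0
  # Oldest names sort first, so the first $excess entries are the ones to drop.
  find "$dir" -maxdepth 1 -type f -name 'vault-*.snap' | sort | head -n "$excess" |
    while IFS= read -r file; do
      rm -f -- "$file"
    done
}
```

Calling prune_raft_snapshots /vault/snapshots 48 after each save would keep roughly two days of hourly snapshots; older states remain available through the ZFS snapshots of the dataset.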

ZFS replication of Vault data to DR site

# Configure syncoid to replicate Vault's ZFS datasets to DR
# DR site Vault node is reachable over WireGuard at 10.200.100.1

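# syncoid takes its options on the command line; run it from cron or a systemd timer
# --recursive includes the child datasets (data, audit, snapshots)
# For natively encrypted datasets, consider --sendoptions=w to send raw (still encrypted) blocks
syncoid \
  --recursive \
  --no-privilege-elevation \
  --sshkey /etc/syncoid/id_ed25519 \
  rpool/vault \
  root@10.200.100.1:rpool/vault-dr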

# The DR site has a byte-identical copy of:
# /vault/data (Raft storage)
# /vault/audit (audit logs)
# /vault/snapshots (Raft logical snapshots)

# DR site sanoid keeps the replicated datasets:
cat >> /etc/sanoid/sanoid.conf <<'EOF'

[rpool/vault-dr]
use_template=vault_dr
[rpool/vault-dr/data]
use_template=vault_dr
[rpool/vault-dr/audit]
use_template=vault_dr

[template_vault_dr]
daily = 30
weekly = 12
monthly = 12
# snapshots come from primary via syncoid
autosnap = no
# prune old snapshots
autoprune = yes
EOF

Recovery procedure

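# DR recovery: primary Vault cluster is gone
# 1. Promote the replicated datasets on the DR node
#    (they already live in the DR node's pool; there is no "zfs import" command,
#    and "zpool import rpool" is only needed if the pool itself is not imported)
zfs rename rpool/vault-dr rpool/vault
#    If the datasets were sent raw-encrypted, load their keys: zfs load-key -r rpool/vault
#    Confirm the mountpoints resolve to /vault/data etc.: zfs get -r mountpoint rpool/vault
zfs mount -a

# 2. Start Vault on DR node
#    A lone survivor cannot reach Raft quorum against the old peer list. If the
#    other peers are gone, first follow Vault's lost-quorum recovery: place a
#    peers.json naming only this node in /vault/data/raft/
systemctl start vault

# 3. Unseal (requires the unseal keys; they are in your safe, right?)
vault operator unseal <key-share-1>
vault operator unseal <key-share-2>
vault operator unseal <key-share-3>

# 4. Verify state
vault status
vault operator raft list-peers

# If restoring from a Raft snapshot (not ZFS replication):
# initialize and unseal a fresh Vault first, then restore into it.
# -force is needed because the snapshot comes from a different cluster.
vault operator raft snapshot restore -force /vault/snapshots/vault-20260401-0800.snap
# After the restore, unseal with the ORIGINAL cluster's unseal keys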

RPO equals your replication interval. If syncoid runs every 15 minutes, you lose at most 15 minutes of Vault state in a catastrophic failure. For most operations — KV secret writes, policy changes, auth method configuration — this is acceptable. For dynamic secrets with very short TTLs, the applications will simply re-authenticate at startup and get fresh credentials. The Raft snapshot captures Vault's logical state (all secrets, all policies, all auth configuration). The ZFS dataset replication captures Vault's physical state (the Raft log, including in-progress transactions). Keep both. The ZFS replication is your primary DR mechanism. The Raft snapshot is your backup for when the Raft log is corrupted or you need to restore a known-good state from before a bad operation.

16. Troubleshooting

Vault sealed and will not unseal

# Check seal status
vault status

# If storage is the problem:
# Check ZFS pool health first
zpool status rpool
zfs list rpool/vault/data

# Check if /vault/data is mounted
mount | grep vault

# Check Vault logs
journalctl -u vault -n 100 --no-pager

# Check that the Raft storage directory is writable
ls -la /vault/data/
stat /vault/data/

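# If Raft is corrupted, restore from ZFS snapshot (stop Vault first)
zfs list -t snapshot rpool/vault/data
systemctl stop vault
# rollback only targets the newest snapshot; add -r to discard any newer ones
zfs rollback rpool/vault/data@before-upgrade-20260401
systemctl start vault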

Permission denied

# Which token is being used?
vault token lookup

# What policies does it have?
vault token lookup -format=json | jq .data.policies

# What does the policy actually allow? (the policy was written under the name "webapp")
vault policy read webapp

# Test a specific path
vault token capabilities database/creds/webapp-role
# → read   ← policy allows it
# → deny   ← no policy grants access (implicit deny)

# Check audit log for the denial
grep "permission denied" /vault/audit/vault-audit.log | tail -10 | jq .request.path

Lease expired — application did not renew

# Applications must renew leases before they expire
# Check current leases for a path
vault list sys/leases/lookup/database/creds/webapp-role/

# Renew a specific lease
vault lease renew database/creds/webapp-role/abc123

# If the lease has expired, the application must re-authenticate and get new credentials
# Increase the TTL for the role (if the application cannot renew fast enough)
vault write database/roles/webapp-role default_ttl=4h max_ttl=24h

# For AppRole tokens — increase TTL on the role
vault write auth/approle/role/webapp \
  token_ttl=2h \
  token_max_ttl=8h

Storage errors

# Check ZFS pool and dataset health
zpool status rpool
zfs list -r rpool/vault
zpool scrub rpool && zpool status rpool

# Check disk space — Vault stops if audit log is full
df -h /vault/audit /vault/data

# ZFS quota alert
zfs get quota,used rpool/vault/audit

# If audit dataset is full and blocking Vault:
# Temporarily increase quota (then fix the root cause)
zfs set quota=100G rpool/vault/audit

Common mistakes

Storing unseal keys in Vault

Circular dependency: Vault is sealed, unseal keys are in Vault. You cannot access Vault to get the unseal keys. Store unseal keys offline: printed paper in a safe, HSM, or distributed to key holders via PGP-encrypted email. Never in Vault itself or a system that depends on Vault.

// Wrong: vault kv put secret/unseal-keys key1=... key2=...
// Right: paper in safe + PGP-encrypted copies to 5 key holders

Running dev mode in production

Dev mode starts Vault in-memory with a root token printed to stdout. All data is lost on restart. TLS is disabled. Anyone who can see the console gets the root token. Dev mode is for learning on a developer's laptop, one session at a time, and nothing more.

// vault server -dev ← never in production
// vault server -config=/etc/vault.d/vault.hcl ← production

Not enabling audit logging

Vault without audit logging is a secrets store with no accountability. Enable audit logging before putting any secrets in Vault. If the audit device cannot write, Vault blocks all requests — this is a feature, not a bug. Monitor the audit log volume and quota.

// vault audit enable file file_path=/vault/audit/vault-audit.log
// do this immediately after initialization, before anything else

Long-lived root tokens

The root token is created at initialization and has unlimited access with no TTL. Revoke it immediately after creating admin tokens and policies. If you need a root token for emergency access, generate a new one with vault operator generate-root — it requires a threshold of unseal key shares.

// vault token revoke <root-token>
// # Emergency: vault operator generate-root -init (needs 3-of-5 key shares)

Baking secrets into images

A kldload golden image with secrets baked in (passwords in /etc/app/config, tokens in /etc/systemd, WireGuard keys in /etc/wireguard) is a credential dump in ISO format. Anyone who gets the image gets the secrets. Images are clean. Secrets are fetched at firstboot from Vault.

// Wrong: echo "DB_PASS=hunter2" > /etc/app/db.conf (in Packer build)
// Right: at firstboot, vault kv get -field=password secret/db/production
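A cheap guard against this mistake is to scan the image root for secret-looking strings before the image is published. The sketch below is illustrative only: scan_for_baked_secrets is a hypothetical helper, its three patterns (password-style assignments, WireGuard PrivateKey lines, and Vault tokens with the hvs. prefix) are assumptions rather than an exhaustive ruleset, and a dedicated secret scanner will catch far more.

```shell
# List files under directory $1 that contain strings resembling baked-in secrets.
# Exits 0 if at least one suspicious file is found, non-zero otherwise.
scan_for_baked_secrets() {
  grep -rlE \
    -e '(PASS|PASSWORD|SECRET|TOKEN)[A-Z_]*=[^[:space:]]+' \
    -e '^PrivateKey[[:space:]]*=' \
    -e 'hvs\.[A-Za-z0-9_-]{20,}' \
    "$1" 2>/dev/null
}
```

In an image build pipeline, a step such as "if scan_for_baked_secrets /mnt/image-root; then exit 1; fi" would fail the build when anything matches.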

Not snapshotting before upgrades

Vault upgrades sometimes require Raft log migration. If the migration fails, you need to roll back. Without a ZFS snapshot, rollback means restoring from a Raft snapshot (which may be hours old). Always: zfs snapshot rpool/vault/data@pre-upgrade before starting a Vault upgrade.

// Before every vault upgrade:
// zfs snapshot rpool/vault/data@before-vault-$(vault version | awk '{print $2}')
// vault operator raft snapshot save /vault/snapshots/pre-upgrade.snap

TLS & PKI Masterclass — step-ca, certificate lifecycle, mTLS
Security Hardening Masterclass — nftables, AppArmor, auditd, SELinux
Backplane Networks Masterclass — WireGuard mesh, the four-plane model
WireGuard Masterclass — encrypted transport layer for Vault traffic
Kubernetes Masterclass — Vault Agent Injector, CSI provider, Kubernetes auth
ZFS Encryption — OpenZFS native encryption for Vault's storage datasets
