AI for Docker & Container Management — a local model that knows your containers.
Generic LLMs know Docker syntax. This model knows your running containers,
your Compose stacks, your network topology, your volume mounts, and your resource consumption.
It reads docker ps and docker stats before every answer.
It sees which containers are restarting. It knows which images are stale.
It generates Compose files from a description and audits running containers for security issues.
On kldloadOS, Docker volumes live on ZFS datasets. That means every volume is snapshotable, cloneable, and replicable. The AI knows this and uses it.
1. The Docker Modelfile
This system prompt encodes Docker, Compose, networking, volumes, registries, security best practices, and the ZFS-backed storage layer that kldloadOS provides underneath.
Complete Docker expert Modelfile
# /srv/ollama/Modelfile.docker-expert
FROM llama3.1:8b
SYSTEM """
You are a Docker and container management expert for this kldload-based infrastructure.
You give precise commands, reference actual container names from context, and always
recommend snapshots (ksnap) before destructive operations on volumes.
=== CONTAINER LIFECYCLE ===
List running: docker ps
List all: docker ps -a
Start/stop/restart: docker start|stop|restart CONTAINER
Remove container: docker rm CONTAINER (add -f for running)
Remove all stopped: docker container prune
Logs: docker logs -f --tail=100 CONTAINER
Exec into container: docker exec -it CONTAINER /bin/sh
Inspect: docker inspect CONTAINER
Stats (live): docker stats --no-stream
Top (processes): docker top CONTAINER
=== IMAGES ===
Pull image: docker pull IMAGE:TAG
List images: docker images
Remove image: docker rmi IMAGE
Remove dangling: docker image prune
Remove all unused: docker image prune -a
Build from Dockerfile: docker build -t NAME:TAG .
Tag for registry: docker tag NAME:TAG registry.local:5000/NAME:TAG
Push to registry: docker push registry.local:5000/NAME:TAG
=== COMPOSE ===
Start stack: docker compose up -d
Stop stack: docker compose down
Rebuild and start: docker compose up -d --build
View logs: docker compose logs -f SERVICE
Scale service: docker compose up -d --scale web=3
Override file: docker compose -f compose.yml -f compose.prod.yml up -d
Environment: Use .env file or environment: section in compose.yml
Depends-on: Use depends_on with condition: service_healthy for startup order
Health checks: healthcheck: test, interval, timeout, retries in compose.yml
=== NETWORKING ===
List networks: docker network ls
Inspect network: docker network inspect NETWORK
Create network: docker network create --driver bridge mynet
Connect container: docker network connect mynet CONTAINER
Macvlan (LAN access): docker network create -d macvlan --subnet=192.168.1.0/24 --gateway=192.168.1.1 -o parent=eth0 lannet
DNS: Containers on same user-defined network resolve by name
Port mapping: -p HOST:CONTAINER or ports: in compose
No host networking: --network=host bypasses isolation. Avoid in production.
=== VOLUMES (ZFS-BACKED) ===
Docker on kldloadOS uses the ZFS storage driver. Each container layer is a ZFS dataset.
List volumes: docker volume ls
Create volume: docker volume create mydata
Inspect volume: docker volume inspect mydata
Remove volume: docker volume rm mydata (ksnap the dataset first!)
ZFS advantage: Volumes live at rpool/docker/volumes/VOLUME_NAME
Snapshot volume: zfs snapshot rpool/docker/volumes/mydata@before-upgrade
Clone volume: zfs clone rpool/docker/volumes/mydata@snap rpool/docker/volumes/mydata-test
Rollback volume: zfs rollback rpool/docker/volumes/mydata@before-upgrade
Replicate volume: syncoid rpool/docker/volumes/mydata root@backup:tank/docker/volumes/mydata
=== PRIVATE REGISTRY ===
Run local registry: docker run -d -p 5000:5000 -v registry:/var/lib/registry --name registry registry:2
Tag for local: docker tag myapp:latest localhost:5000/myapp:latest
Push to local: docker push localhost:5000/myapp:latest
Pull from local: docker pull localhost:5000/myapp:latest
Registry on ZFS: Volume is a dataset — snapshot the entire registry, clone for testing
TLS for registry: Mount certs, set REGISTRY_HTTP_TLS_CERTIFICATE and KEY
=== SECURITY ===
Never run as root: USER nonroot in Dockerfile, or user: "1000:1000" in compose
No privileged: --privileged gives full host access. Almost never needed.
Read-only rootfs: --read-only with tmpfs for /tmp
Drop capabilities: --cap-drop=ALL --cap-add=NET_BIND_SERVICE (add only what's needed)
No new privileges: --security-opt=no-new-privileges
Seccomp profile: --security-opt seccomp=profile.json
Scan images: trivy image myapp:latest (find vulnerabilities)
Limit resources: --memory=512m --cpus=1.0 (prevent resource exhaustion)
=== TROUBLESHOOTING ===
Container crash loop: docker logs CONTAINER | tail -50 (read the error)
docker inspect CONTAINER | grep -A5 State (check exit code)
OOM killed: docker inspect CONTAINER | grep OOMKilled
Increase --memory or optimize the application
Network issues: docker exec CONTAINER ping OTHER_CONTAINER
docker network inspect NETWORK (check subnet, gateway)
Disk full: docker system df (show disk usage by component)
docker system prune (remove unused data)
kdf (check ZFS dataset usage)
Slow container: docker stats CONTAINER (check CPU, memory, I/O)
docker top CONTAINER (check processes)
=== PHILOSOPHY ===
Snapshot before destructive operations. ksnap is instant. Rebuilding from scratch is not.
Use Compose for everything. Single docker run commands are not reproducible.
Pin image tags. :latest is not a deployment strategy.
One process per container. If you need cron + app, use two containers.
Logs to stdout. Let Docker handle log rotation and collection.
ZFS under Docker means every layer, volume, and image is checksummed and compressible.
"""
PARAMETER temperature 0.3
PARAMETER num_ctx 16384
# Build the Docker expert model
ollama create docker-expert -f /srv/ollama/Modelfile.docker-expert
# Verify it
ollama run docker-expert "How do I set up a private registry with TLS on ZFS?"
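The healthcheck, depends_on, and pinned-tag guidance in the Modelfile can be sketched as a minimal compose.yml. Service names, image tags, and the DB_PASSWORD variable here are illustrative assumptions, not part of any real stack:

```yaml
# Hypothetical minimal stack illustrating pinned tags, health checks,
# and startup ordering via condition: service_healthy.
services:
  db:
    image: postgres:16.2                    # pinned tag, never :latest
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}     # pulled from a .env file
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
  web:
    image: nginx:1.25.4                     # pinned tag
    ports:
      - "8080:80"
    depends_on:
      db:
        condition: service_healthy          # wait until db passes its healthcheck

volumes:
  pgdata:
```

With this in place, `docker compose up -d` starts db, waits for its healthcheck to pass, and only then starts web.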
2. Live context script
The Modelfile gives the AI Docker expertise. The context script gives it your Docker state. Every query includes running containers, resource usage, networks, volumes, and recent events.
kai-docker — query with live container state
#!/bin/bash
# /usr/local/bin/kai-docker — query the Docker AI with live container context
build_docker_context() {
  echo "=== LIVE DOCKER STATE ($(date -Iseconds)) ==="
  echo -e "\n--- Running containers ---"
  docker ps --format "table {{.Names}}\t{{.Image}}\t{{.Status}}\t{{.Ports}}" 2>/dev/null
  echo -e "\n--- All containers (including stopped) ---"
  docker ps -a --format "table {{.Names}}\t{{.Image}}\t{{.Status}}\t{{.Size}}" 2>/dev/null
  echo -e "\n--- Resource usage ---"
  docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}\t{{.BlockIO}}" 2>/dev/null
  echo -e "\n--- Images ---"
  docker images --format "table {{.Repository}}\t{{.Tag}}\t{{.Size}}\t{{.CreatedSince}}" 2>/dev/null
  echo -e "\n--- Networks ---"
  docker network ls --format "table {{.Name}}\t{{.Driver}}\t{{.Scope}}" 2>/dev/null
  echo -e "\n--- Volumes ---"
  docker volume ls --format "table {{.Name}}\t{{.Driver}}" 2>/dev/null
  echo -e "\n--- Docker disk usage ---"
  docker system df 2>/dev/null
  echo -e "\n--- Recent events (last 30 min) ---"
  docker events --since 30m --until 0s --format '{{.Time}} {{.Action}} {{.Actor.Attributes.name}}' 2>/dev/null | tail -20
  echo -e "\n--- ZFS Docker datasets ---"
  zfs list -r rpool/docker -o name,used,avail,compressratio 2>/dev/null | head -30
  echo -e "\n--- Restart counts (crash detection) ---"
  docker inspect $(docker ps -aq 2>/dev/null) --format '{{.Name}} restarts={{.RestartCount}} exitCode={{.State.ExitCode}}' 2>/dev/null | \
    grep -v 'restarts=0 exitCode=0'
}
QUESTION="$*"
if [ -z "$QUESTION" ]; then
echo "Usage: kai-docker <question>"
echo ""
echo "Examples:"
echo " kai-docker 'why is my nginx container restarting?'"
echo " kai-docker 'which container is using the most memory?'"
echo " kai-docker 'write a compose file for wordpress + mariadb + redis'"
echo " kai-docker 'audit my running containers for security issues'"
echo " kai-docker 'set up a private registry on ZFS'"
exit 1
fi
CONTEXT=$(build_docker_context)
echo -e "${CONTEXT}\n\n=== QUESTION ===\n${QUESTION}" | ollama run docker-expert
3. ZFS-backed Docker
On kldloadOS, Docker uses the ZFS storage driver. Every container layer is a ZFS dataset. Every volume is a dataset. That means you get snapshots, clones, compression, checksums, and send/recv on every piece of container data. For free.
ZFS + Docker workflow
# Snapshot a volume before upgrading the database
ksnap /var/lib/docker/volumes/postgres_data
docker compose pull db
docker compose up -d db
# Something went wrong? Rollback in seconds
docker compose stop db
zfs rollback rpool/docker/volumes/postgres_data@auto-2026-03-23-1400
docker compose start db
# Clone a production volume for testing (instant, zero disk cost)
zfs snapshot rpool/docker/volumes/postgres_data@test-clone
zfs clone rpool/docker/volumes/postgres_data@test-clone rpool/docker/volumes/postgres_data_test
# Now mount postgres_data_test as a volume in a test container
# Replicate container volumes to another node
syncoid rpool/docker/volumes/postgres_data root@node2:rpool/docker/volumes/postgres_data
# Check compression savings
zfs get compressratio rpool/docker
# NAME PROPERTY VALUE SOURCE
# rpool/docker compressratio 2.41x -
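One way to attach the cloned dataset to a test container is a bind mount of the clone's mountpoint. The path and image tag below are assumptions; a freshly created clone normally inherits a mountpoint matching its dataset name, but verify with `zfs get mountpoint` before relying on it:

```yaml
# Hypothetical test service bind-mounting the cloned dataset.
# Assumes the clone is mounted at /rpool/docker/volumes/postgres_data_test;
# check with: zfs get mountpoint rpool/docker/volumes/postgres_data_test
services:
  db-test:
    image: postgres:16.2     # match the production image version
    volumes:
      - /rpool/docker/volumes/postgres_data_test:/var/lib/postgresql/data
    ports:
      - "5433:5432"          # avoid colliding with the production instance
```

When testing is done, `docker compose down` the test stack and `zfs destroy` the clone; the production volume is untouched throughout.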
4. Auto-generate Compose files
Describe what you want in plain English. The AI writes the docker-compose.yml.
It knows your network topology, your ZFS volumes, your existing containers,
and it generates files that actually work on your machine.
Compose generation examples
# "Give me a web app stack with nginx, Node.js, PostgreSQL, and Redis"
kai-docker "write a docker-compose.yml for nginx reverse proxy, \
node.js app on port 3000, postgresql with ZFS volume, and redis for caching. \
Use health checks. Pin all image versions."
# "Set up Gitea with PostgreSQL"
kai-docker "compose file for Gitea git server with PostgreSQL backend. \
Store repos on a ZFS volume. Expose on port 3000 and SSH on 2222."
# "Private registry with authentication"
kai-docker "compose file for a Docker registry with htpasswd auth and TLS. \
Store images on a ZFS volume. Include the htpasswd generation commands."
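For comparison, here is a hand-written sketch of the kind of file that last request describes. The cert paths, auth directory, and volume name are assumptions; the REGISTRY_* variables are the standard registry:2 configuration settings:

```yaml
# Hypothetical registry stack with TLS and htpasswd auth.
# Assumes ./certs holds registry.crt/registry.key and ./auth holds htpasswd.
services:
  registry:
    image: registry:2
    ports:
      - "5000:5000"
    environment:
      REGISTRY_HTTP_TLS_CERTIFICATE: /certs/registry.crt
      REGISTRY_HTTP_TLS_KEY: /certs/registry.key
      REGISTRY_AUTH: htpasswd
      REGISTRY_AUTH_HTPASSWD_REALM: Registry
      REGISTRY_AUTH_HTPASSWD_PATH: /auth/htpasswd
    volumes:
      - registry_data:/var/lib/registry    # ZFS-backed named volume
      - ./certs:/certs:ro
      - ./auth:/auth:ro

volumes:
  registry_data:
```

registry:2 requires bcrypt entries, so generate the htpasswd file with something like `docker run --rm httpd:2 htpasswd -Bbn USER PASS > auth/htpasswd` before starting the stack.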
5. Container security audit
The AI inspects every running container and flags security issues: privileged mode, containers running as root, exposed ports, missing resource limits, host network access, writable rootfs. It gives you the exact fix for each finding.
Security audit script
#!/bin/bash
# /usr/local/bin/kai-docker-audit — AI security audit of running containers
AUDIT=$(cat <<AUDIT_DATA
=== DOCKER SECURITY AUDIT — $(hostname) — $(date) ===
--- Container Security Details ---
$(for c in $(docker ps -q 2>/dev/null); do
echo "=== $(docker inspect "$c" --format '{{.Name}}') ==="
docker inspect "$c" --format '
Image: {{.Config.Image}}
User: {{.Config.User}}
Privileged: {{.HostConfig.Privileged}}
ReadonlyRoot: {{.HostConfig.ReadonlyRootfs}}
NetworkMode: {{.HostConfig.NetworkMode}}
PidMode: {{.HostConfig.PidMode}}
CapAdd: {{.HostConfig.CapAdd}}
CapDrop: {{.HostConfig.CapDrop}}
SecurityOpt: {{.HostConfig.SecurityOpt}}
Memory: {{.HostConfig.Memory}}
CPUs: {{.HostConfig.NanoCpus}}
Ports: {{range $p, $b := .NetworkSettings.Ports}}{{$p}}->{{$b}} {{end}}
Mounts: {{range .Mounts}}{{.Type}}:{{.Source}}->{{.Destination}}({{.Mode}}) {{end}}
RestartPolicy:{{.HostConfig.RestartPolicy.Name}}'
echo ""
done)
AUDIT_DATA
)
echo "$AUDIT" | ollama run docker-expert \
"Analyze this security audit. For each container, report:
1. CRITICAL — privileged mode, host network/PID, running as root
2. WARNING — no memory limit, no CPU limit, writable rootfs, excessive capabilities
3. INFO — good practices already in place
4. FIX — exact docker run or compose changes to remediate each finding
Be specific. Reference container names."
6. Container log analysis
Feed container logs to the AI for pattern analysis. It finds errors you missed, correlates timestamps across containers, and identifies the root cause of cascading failures.
Log analysis
# Analyze a crashing container's logs
docker logs --tail 200 my-app 2>&1 | \
ollama run docker-expert "Analyze these container logs. \
Find errors, warnings, and patterns. Suggest fixes."
# Correlate logs across a Compose stack
docker compose logs --tail=100 --no-color 2>&1 | \
ollama run docker-expert "These are logs from multiple containers \
in a Compose stack. Correlate timestamps. Find the root cause \
of any errors. Which container failed first?"
# Monitor and alert (one-shot check; run from cron for continuous monitoring)
docker logs --tail 50 my-app 2>&1 | \
ollama run docker-expert "Any critical errors in these last 50 lines?"
7. Fleet replication via syncoid
Train the Docker expert on one node. Replicate the model — and your entire Docker volume state — to every node in the fleet. Same knowledge everywhere. Same container data replicated via ZFS send/recv.
Replicate Docker AI and volumes across fleet
#!/bin/bash
# replicate-docker-ai.sh — push Docker model + volumes to all nodes
NODES="node-2 node-3 node-4"
# Replicate the Ollama model
zfs snapshot rpool/srv/ollama@docker-expert-$(date +%F)
for node in $NODES; do
syncoid --no-sync-snap rpool/srv/ollama "root@${node}:rpool/srv/ollama"
done
# Replicate Docker volumes (e.g., shared registry)
zfs snapshot rpool/docker/volumes/registry@sync-$(date +%F)
for node in $NODES; do
syncoid --no-sync-snap rpool/docker/volumes/registry "root@${node}:rpool/docker/volumes/registry"
done
# Deploy the kai-docker script everywhere
for node in $NODES; do
scp /usr/local/bin/kai-docker "root@${node}:/usr/local/bin/kai-docker"
ssh "root@${node}" "chmod +x /usr/local/bin/kai-docker && systemctl restart ollama"
done
Containers are supposed to make things simpler. Then you have 47 of them,
3 Compose files, a private registry, 2 custom networks, and a volume that someone
mounted from the host 6 months ago and nobody remembers why. The AI does not judge.
It reads docker inspect, finds the answer, and tells you.
ZFS underneath means every container mistake is reversible. Snapshot before you pull. Clone before you test. Send/recv your volumes to backup. The containers are ephemeral. Your data doesn't have to be.