AI Admin Assistant — teach an LLM to run your infrastructure.
What if your server could diagnose its own problems? Not with hardcoded rules — with an actual language model that understands ZFS, systemd, networking, and your specific setup. A local LLM running on the same ZFS-backed storage it's monitoring, trained on your logs, your configs, and your runbooks.
This isn't science fiction. Ollama runs open-source LLMs locally. ZFS gives you the storage backend. A postinstaller bakes it all in. Twenty minutes from blank disk to an AI-powered admin assistant.
Quick start — Ollama in 5 minutes
Just want to chat with an LLM locally?
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Pull a model and start chatting
ollama pull llama3.1:8b
ollama run llama3.1:8b
# That's it. Local AI. No cloud. No API key. No data leaves your machine.
With NVIDIA GPU
# If NVIDIA drivers are installed (see NVIDIA tutorial), Ollama uses the GPU automatically
ollama run llama3.1:8b
# Watch GPU utilization — inference runs on CUDA cores
nvidia-smi
# Multiple models can share the GPU simultaneously
# Run Ollama API + Stable Diffusion + Whisper — all on one GPU
Ollama as an API server
# Ollama serves its native API on port 11434 (an OpenAI-compatible API lives under /v1)
curl http://localhost:11434/api/generate -d '{
"model": "llama3.1:8b",
"prompt": "Explain ZFS snapshots in one sentence"
}'
# Use it from any app that speaks the OpenAI API
# Just point OPENAI_BASE_URL to http://your-server:11434/v1
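As a concrete sketch of the OpenAI-compatible route: build a chat-completions payload and POST it to Ollama's /v1 endpoint. The model name and localhost address are assumptions; the quoting here is naive, so it only suits simple prompts without embedded quotes:

```shell
# Build an OpenAI-style chat payload; model and prompt are passed as arguments
# (naive quoting: prompts containing double quotes would break the JSON)
chat_payload() {
    printf '{"model":"%s","messages":[{"role":"user","content":"%s"}],"stream":false}' "$1" "$2"
}

# Pipe it to a running Ollama server (curl reads the body from stdin with -d @-)
# chat_payload llama3.1:8b "Explain ZFS snapshots in one sentence" | \
#     curl -s http://localhost:11434/v1/chat/completions \
#          -H "Content-Type: application/json" -d @-
```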
# Run as a Docker container with GPU sharing (see NVIDIA tutorial)
docker run -d --name ollama --gpus all \
-p 11434:11434 \
-v /srv/ollama:/root/.ollama \
ollama/ollama
Model size       RAM needed                Good for                          Speed
8B models        8GB RAM                   General chat, coding, analysis    Fast on CPU, instant on GPU
13B–34B models   16–32GB RAM               Complex reasoning, long context   GPU recommended
70B+ models      64GB+ RAM or 24GB+ VRAM   Near-GPT-4 quality                GPU required
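These tiers can be turned into a small helper that picks a sensible default model for the machine it runs on. The thresholds mirror the table above; the model names are illustrative defaults, not the only choices:

```shell
# Rough model-tier picker: takes installed RAM in GB, prints a suggested model
suggest_model() {
    local ram_gb=$1
    if   [ "$ram_gb" -ge 64 ]; then echo "llama3.1:70b"
    elif [ "$ram_gb" -ge 16 ]; then echo "codellama:13b"
    else                            echo "llama3.1:8b"
    fi
}

# On Linux, detect installed RAM from /proc/meminfo (MemTotal is in kB)
if [ -r /proc/meminfo ]; then
    suggest_model "$(( $(awk '/^MemTotal/ {print $2}' /proc/meminfo) / 1048576 ))"
fi
```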
The full recipe — AI infrastructure admin
Step 1: Install Ollama
#!/bin/bash
# postinstall-ai-admin.sh
# Create a ZFS dataset for models (compressed, snapshottable)
kdir -o compression=zstd -o recordsize=1M /srv/ollama
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Configure Ollama to use ZFS-backed storage
mkdir -p /etc/systemd/system/ollama.service.d
cat > /etc/systemd/system/ollama.service.d/override.conf <<EOF
[Service]
Environment="OLLAMA_MODELS=/srv/ollama/models"
Environment="OLLAMA_HOST=0.0.0.0:11434"
EOF
systemctl daemon-reload
systemctl enable --now ollama
Step 2: Pull a model
# Pull a capable model — llama3.1 8B is a good starting point
ollama pull llama3.1:8b
# For more capability (needs 64GB+ RAM or a 24GB+ GPU; see the sizing table)
ollama pull llama3.1:70b
# For code-focused tasks
ollama pull codellama:13b
# Snapshot the clean model state
ksnap /srv/ollama
Step 3: Create your infrastructure Modelfile
This is where it gets powerful. You create a custom model persona that knows your infrastructure:
# /srv/ollama/Modelfile.infra-admin
FROM llama3.1:8b
SYSTEM """
You are an infrastructure administration assistant for a kldload-based
ZFS-on-root Linux environment. You help diagnose issues, suggest tuning,
and automate common tasks.
You know:
- ZFS: pools, datasets, snapshots, replication, ARC tuning, scrubs
- systemd: services, timers, journal, unit files
- Networking: WireGuard, nftables, NetworkManager
- Storage: RAID-Z, mirrors, special vdevs, SLOG, L2ARC
- kldload tools: kst, ksnap, kbe, kdf, kdir, kpkg, kupgrade, krecovery
When diagnosing issues:
1. Ask for relevant output (zpool status, kst, journalctl)
2. Identify the root cause
3. Suggest the fix with exact commands
4. Warn about risks before destructive operations
5. Always recommend a snapshot before changes
Current environment:
- Distro: CentOS Stream 9
- Pool: rpool (ZFS on root)
- Boot: ZFSBootMenu
- Tools: kldload CLI suite
"""
PARAMETER temperature 0.3
PARAMETER num_ctx 8192
# Build the custom model
ollama create infra-admin -f /srv/ollama/Modelfile.infra-admin
# Test it
ollama run infra-admin "my zpool status shows a DEGRADED vdev, what should I do?"
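The Modelfile is plain text, so it can live in version control, and you only need to rebuild the model when it actually changes. A small sketch that keeps a checksum file next to the Modelfile (the `.sha256` sidecar convention is an assumption of this script, not an Ollama feature):

```shell
# Rebuild the custom model only when the Modelfile's checksum has changed
rebuild_if_changed() {
    local mf=$1 sumfile="$1.sha256"
    if ! sha256sum -c "$sumfile" >/dev/null 2>&1; then
        ollama create infra-admin -f "$mf" && sha256sum "$mf" > "$sumfile"
    fi
}

# rebuild_if_changed /srv/ollama/Modelfile.infra-admin
```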
Step 4: Feed it live system data
The real power: pipe your actual system state into the LLM and let it analyze:
#!/bin/bash
# ai-diagnose.sh — pipe system state to the AI assistant
CONTEXT=$(cat <<EOF
=== SYSTEM STATUS ===
$(kst 2>/dev/null)
=== POOL STATUS ===
$(zpool status 2>/dev/null)
=== RECENT ERRORS ===
$(journalctl -p err --since "1 hour ago" --no-pager 2>/dev/null | tail -20)
=== DISK HEALTH ===
$(smartctl -H /dev/vda 2>/dev/null)
=== MEMORY ===
$(free -h)
=== ARC STATS ===
$(cat /proc/spl/kstat/zfs/arcstats 2>/dev/null | grep -E "^size|^c |^hits|^misses")
EOF
)
echo "${CONTEXT}
Based on the above, are there any issues I should address? What optimizations would you recommend?" | \
ollama run infra-admin
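One caveat: the Modelfile sets num_ctx 8192, and anything past the context window is silently dropped, so a noisy journal can push the actual question out of view. A crude guard, assuming roughly 4 characters per token (a common rule of thumb, not an exact measure):

```shell
# Trim stdin to roughly fit a token budget (~4 characters/token heuristic)
trim_context() {
    local tokens=${1:-8192}
    head -c "$(( tokens * 4 ))"
}

# echo "${CONTEXT}" | trim_context 6000 | ollama run infra-admin
```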
Step 5: Automate with cron
# Daily health check — AI reviews your infrastructure every morning
# crontab -e
0 6 * * * /usr/local/bin/ai-diagnose.sh > /var/log/ai-health-report.txt 2>&1
# Weekly deep analysis
0 8 * * 1 /usr/local/bin/ai-deep-analysis.sh | mail -s "Weekly AI Infra Report" admin@example.com
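Since everything else in this setup is managed by systemd, a timer unit is the more idiomatic alternative to cron. The unit names below are illustrative; StandardOutput=append: needs systemd 240+, which CentOS Stream 9 ships:

```
# /etc/systemd/system/ai-health.service
[Unit]
Description=Daily AI infrastructure health check

[Service]
Type=oneshot
ExecStart=/usr/local/bin/ai-diagnose.sh
StandardOutput=append:/var/log/ai-health-report.txt

# /etc/systemd/system/ai-health.timer
[Unit]
Description=Run ai-health.service every morning at 06:00

[Timer]
OnCalendar=*-*-* 06:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

Enable with systemctl daemon-reload && systemctl enable --now ai-health.timer.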
Step 6: Interactive terminal assistant
# Add to .bashrc — type 'ai' to get help anytime
ai() {
local question="$*"
local context="$(kst 2>/dev/null; echo '---'; zpool status 2>/dev/null)"
echo -e "Current system state:\n${context}\n\nQuestion: ${question}" | \
ollama run infra-admin
}
# Usage:
ai "how do I add a mirror to my pool?"
ai "what recordsize should I use for postgres?"
ai "my ARC hit rate is low, what should I tune?"
ai "create a snapshot schedule for /srv/database"
Why ZFS makes this better
Snapshot before fine-tuning
Training a custom model? ksnap /srv/ollama first.
If the fine-tuned model is worse, ksnap rollback /srv/ollama. Instant.
Try that with ext4.
Clone models for testing
kclone /srv/ollama /srv/ollama-experiment.
Test a different system prompt. Compare outputs.
Zero extra disk space until the models diverge.
Replicate to other nodes
zfs send rpool/srv/ollama@trained | ssh node-2 zfs recv rpool/srv/ollama.
Your trained AI admin assistant, deployed to every node in your fleet.
Block-level replication. Only changed data transferred.
Compressed model storage
LLM model files are large but compress well.
compression=zstd on the dataset typically saves 15-25%.
A 7GB model takes 5.5GB on disk. Free performance.
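You can verify what compression actually buys you on your own pool: logicalused is the uncompressed size, used is what hits disk. A small parser over the machine-readable zfs output (the dataset name in the comment is an assumption):

```shell
# Percent saved = 1 - used/logicalused, parsed from `zfs get -Hp` output
model_savings() {
    awk '$1 == "used"        { u = $2 }
         $1 == "logicalused" { l = $2 }
         END { if (l > 0) printf "%.0f%% saved\n", (1 - u / l) * 100 }'
}

# zfs get -Hp -o property,value used,logicalused rpool/srv/ollama | model_savings
```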
Advanced: self-healing infrastructure
The AI that fixes things while you sleep
#!/bin/bash
# ai-auto-heal.sh — AI reviews and acts on critical issues
# Run via cron or systemd timer — WITH CAUTION
STATUS=$(zpool status -x 2>/dev/null)
if [[ "$STATUS" != "all pools are healthy" ]]; then
# Ask the AI what to do
RESPONSE=$(echo "zpool status output: ${STATUS}
Is this critical? What's the safest remediation?
Respond with ONLY a bash command if safe to run, or ALERT if human needed." | \
ollama run infra-admin)
if echo "$RESPONSE" | grep -q "^ALERT"; then
# AI says human needed — send notification
echo "$RESPONSE" | mail -s "ALERT: ZFS pool issue" admin@example.com
else
# AI suggests a safe command — log and execute
echo "$(date): AI auto-heal: $RESPONSE" >> /var/log/ai-actions.log
# Uncomment below to actually execute (use with extreme caution)
# eval "$RESPONSE"
fi
fi
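If you ever do uncomment that eval, never execute raw model output directly. Gate it through an allowlist of command prefixes you have pre-approved; the prefixes below are examples only, extend them deliberately:

```shell
# Return success only for commands matching a pre-approved prefix
is_safe() {
    case "$1" in
        "zpool scrub "* | "zpool clear "* | "ksnap "*) return 0 ;;
        *) return 1 ;;
    esac
}

# if is_safe "$RESPONSE"; then
#     eval "$RESPONSE"
# else
#     echo "$RESPONSE" | mail -s "ALERT: unapproved AI command" admin@example.com
# fi
```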