
AI Admin Assistant — teach an LLM to run your infrastructure.

What if your server could diagnose its own problems? Not with hardcoded rules — with an actual language model that understands ZFS, systemd, networking, and your specific setup. A local LLM running on the same ZFS-backed storage it's monitoring, trained on your logs, your configs, and your runbooks.

This isn't science fiction. Ollama runs open-source LLMs locally. ZFS gives you the storage backend. A postinstaller bakes it all in. Twenty minutes from blank disk to an AI-powered admin assistant.

Quick start — Ollama in 5 minutes

Just want to chat with an LLM locally?

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model and start chatting
ollama pull llama3.1:8b
ollama run llama3.1:8b

# That's it. Local AI. No cloud. No API key. No data leaves your machine.

With NVIDIA GPU

# If NVIDIA drivers are installed (see NVIDIA tutorial), Ollama uses the GPU automatically
ollama run llama3.1:8b
# Watch GPU utilization — inference runs on CUDA cores
nvidia-smi

# Multiple models can share the GPU simultaneously
# Run Ollama API + Stable Diffusion + Whisper — all on one GPU
CPU inference: ~10 tokens/sec. GPU inference: ~80+ tokens/sec. The difference between waiting and conversing.
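To see which side of that line a given box is on, ollama run has a --verbose flag that prints timing stats after each response. A small sketch to pull out the tokens/sec number; parse_eval_rate is our helper, not part of ollama:

```shell
# --verbose makes ollama print timing stats after the response.
# parse_eval_rate is a hypothetical helper that extracts the tokens/sec
# figure from a stats line like: "eval rate:            79.53 tokens/s"
parse_eval_rate() {
    awk -F': *' '/^eval rate:/ { print $2 + 0 }'
}

# Usage (requires ollama and the model pulled):
# ollama run llama3.1:8b "Explain ZFS in one sentence" --verbose 2>&1 | parse_eval_rate
```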

Ollama as an API server

# Ollama serves its native API on port 11434 (an OpenAI-compatible API lives under /v1)
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "prompt": "Explain ZFS snapshots in one sentence"
}'

# Use it from any app that supports the OpenAI API
# Just point OPENAI_BASE_URL to http://your-server:11434/v1

# Run as a Docker container with GPU sharing (see NVIDIA tutorial)
docker run -d --name ollama --gpus all \
  -p 11434:11434 \
  -v /srv/ollama:/root/.ollama \
  ollama/ollama
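For apps that speak the OpenAI chat format, the /v1 endpoint accepts the standard messages payload. A minimal sketch; chat_payload is a hypothetical helper of ours:

```shell
# chat_payload is a hypothetical helper; it does no JSON escaping,
# so keep prompts free of double quotes and backslashes
chat_payload() {
    local model="$1" prompt="$2"
    printf '{"model":"%s","messages":[{"role":"user","content":"%s"}]}' \
        "$model" "$prompt"
}

# Usage (requires the ollama service running):
# curl http://localhost:11434/v1/chat/completions \
#   -H "Content-Type: application/json" \
#   -d "$(chat_payload llama3.1:8b 'Explain ZFS snapshots briefly')"
```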

8B models: 8GB RAM. General chat, coding, analysis. Fast on CPU, instant on GPU.

13B–34B models: 16–32GB RAM. Complex reasoning, long context. GPU recommended.

70B+ models: 64GB+ RAM or 24GB+ VRAM. Near-GPT-4 quality. GPU required.
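As a quick sanity check before pulling, you can compare the machine's RAM against the tiers above. The model_tier helper and its cutoffs are our assumptions, taken from the table rather than from ollama:

```shell
# model_tier is a hypothetical helper; the cutoffs mirror the table above
# (RAM only; a 24GB+ GPU also unlocks the 70B tier)
model_tier() {
    local ram_gb="$1"
    if   [ "$ram_gb" -ge 64 ]; then echo "70B+"
    elif [ "$ram_gb" -ge 16 ]; then echo "13B-34B"
    elif [ "$ram_gb" -ge 8  ]; then echo "8B"
    else echo "too little RAM for local inference"
    fi
}

# Usage:
# model_tier "$(free -g | awk '/^Mem:/ { print $2 }')"
```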

The full recipe — AI infrastructure admin

Step 1: Install Ollama

#!/bin/bash
# postinstall-ai-admin.sh

# Create a ZFS dataset for models (compressed, snapshotable)
kdir -o compression=zstd -o recordsize=1M /srv/ollama

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Configure Ollama to use ZFS-backed storage
mkdir -p /etc/systemd/system/ollama.service.d
cat > /etc/systemd/system/ollama.service.d/override.conf <<EOF
[Service]
Environment="OLLAMA_MODELS=/srv/ollama/models"
Environment="OLLAMA_HOST=0.0.0.0:11434"
EOF

systemctl daemon-reload
systemctl enable --now ollama
Models stored on ZFS = snapshotable, compressible, replicable. Snapshot before fine-tuning. Roll back if the model degrades. Clone to test variants.

Step 2: Pull a model

# Pull a capable model — llama3.1 8B is a good starting point
ollama pull llama3.1:8b

# For more capability (needs 64GB+ RAM or 24GB+ VRAM)
ollama pull llama3.1:70b

# For code-focused tasks
ollama pull codellama:13b

# Snapshot the clean model state
ksnap /srv/ollama

Step 3: Create your infrastructure Modelfile

This is where it gets powerful. You create a custom model persona that knows your infrastructure:

# /srv/ollama/Modelfile.infra-admin
FROM llama3.1:8b

SYSTEM """
You are an infrastructure administration assistant for a kldload-based
ZFS-on-root Linux environment. You help diagnose issues, suggest tuning,
and automate common tasks.

You know:
- ZFS: pools, datasets, snapshots, replication, ARC tuning, scrubs
- systemd: services, timers, journal, unit files
- Networking: WireGuard, nftables, NetworkManager
- Storage: RAID-Z, mirrors, special vdevs, SLOG, L2ARC
- kldload tools: kst, ksnap, kbe, kdf, kdir, kpkg, kupgrade, krecovery

When diagnosing issues:
1. Ask for relevant output (zpool status, kst, journalctl)
2. Identify the root cause
3. Suggest the fix with exact commands
4. Warn about risks before destructive operations
5. Always recommend a snapshot before changes

Current environment:
- Distro: CentOS Stream 9
- Pool: rpool (ZFS on root)
- Boot: ZFSBootMenu
- Tools: kldload CLI suite
"""

PARAMETER temperature 0.3
PARAMETER num_ctx 8192

# Build the custom model
ollama create infra-admin -f /srv/ollama/Modelfile.infra-admin

# Test it
ollama run infra-admin "my zpool status shows a DEGRADED vdev, what should I do?"

Step 4: Feed it live system data

The real power: pipe your actual system state into the LLM and let it analyze:

#!/bin/bash
# ai-diagnose.sh — pipe system state to the AI assistant

CONTEXT=$(cat <<EOF
=== SYSTEM STATUS ===
$(kst 2>/dev/null)

=== POOL STATUS ===
$(zpool status 2>/dev/null)

=== RECENT ERRORS ===
$(journalctl -p err --since "1 hour ago" --no-pager 2>/dev/null | tail -20)

=== DISK HEALTH ===
$(smartctl -H /dev/vda 2>/dev/null)

=== MEMORY ===
$(free -h)

=== ARC STATS ===
$(cat /proc/spl/kstat/zfs/arcstats 2>/dev/null | grep -E "^size|^c |^hits|^misses")
EOF
)

echo "${CONTEXT}

Based on the above, are there any issues I should address? What optimizations would you recommend?" | \
    ollama run infra-admin
Instead of reading 50 lines of output yourself, the AI reads it and tells you what matters. "Your ARC hit rate is 72% — you should increase zfs_arc_max. Your last scrub was 45 days ago — schedule one. Disk vda has 3 reallocated sectors — monitor it."
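The ARC hit rate called out in that example can be computed directly from the hits and misses counters the script already greps. arc_hit_rate is a hypothetical helper:

```shell
# arc_hit_rate is a hypothetical helper: hit rate (%) from two counters
arc_hit_rate() {
    local hits="$1" misses="$2"
    echo $(( hits * 100 / (hits + misses) ))
}

# Usage against the live counters (arcstats rows are "name type data",
# so $NF picks the value column):
# hits=$(awk '$1 == "hits"     { print $NF }' /proc/spl/kstat/zfs/arcstats)
# misses=$(awk '$1 == "misses" { print $NF }' /proc/spl/kstat/zfs/arcstats)
# arc_hit_rate "$hits" "$misses"
```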

Step 5: Automate with cron

# Daily health check — AI reviews your infrastructure every morning
# crontab -e
0 6 * * * /usr/local/bin/ai-diagnose.sh > /var/log/ai-health-report.txt 2>&1

# Weekly deep analysis
0 8 * * 1 /usr/local/bin/ai-deep-analysis.sh | mail -s "Weekly AI Infra Report" admin@example.com
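Since the environment leans on systemd, the same daily check can also run as a systemd timer instead of cron. A sketch; the ai-health unit names are our invention, and the script path matches the cron example above:

```shell
# Hypothetical unit names; requires root to write under /etc
cat > /etc/systemd/system/ai-health.service <<'EOF'
[Unit]
Description=Daily AI infrastructure health report

[Service]
Type=oneshot
ExecStart=/usr/local/bin/ai-diagnose.sh
StandardOutput=append:/var/log/ai-health-report.txt
EOF

cat > /etc/systemd/system/ai-health.timer <<'EOF'
[Unit]
Description=Run the AI health report every morning

[Timer]
OnCalendar=*-*-* 06:00:00
Persistent=true

[Install]
WantedBy=timers.target
EOF

systemctl daemon-reload
systemctl enable --now ai-health.timer
```

Persistent=true means a missed run (machine asleep at 06:00) fires on the next boot, which cron will not do.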

Step 6: Interactive terminal assistant

# Add to .bashrc — type 'ai' to get help anytime
ai() {
    local question="$*"
    local context="$(kst 2>/dev/null; echo '---'; zpool status 2>/dev/null)"
    echo -e "Current system state:\n${context}\n\nQuestion: ${question}" | \
        ollama run infra-admin
}

# Usage:
ai "how do I add a mirror to my pool?"
ai "what recordsize should I use for postgres?"
ai "my ARC hit rate is low, what should I tune?"
ai "create a snapshot schedule for /srv/database"
Type 'ai' followed by any question. It sees your actual system state and gives answers specific to YOUR infrastructure. Not generic docs. YOUR pool. YOUR datasets. YOUR memory.

Why ZFS makes this better

Snapshot before fine-tuning

Training a custom model? ksnap /srv/ollama first. If the fine-tuned model is worse, ksnap rollback /srv/ollama. Instant. Try that with ext4.

Clone models for testing

kclone /srv/ollama /srv/ollama-experiment. Test a different system prompt. Compare outputs. Zero extra disk space until the models diverge.

Replicate to other nodes

zfs send rpool/srv/ollama@trained | ssh node-2 zfs recv rpool/srv/ollama. Your trained AI admin assistant, deployed to every node in your fleet. Block-level replication. Only changed data transferred.

Compressed model storage

LLM model files are large but compress well. compression=zstd on the dataset typically saves 15–25%. A 7GB model takes roughly 5.5GB on disk. Free performance.
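You can check the real savings on your own dataset: ZFS reports the achieved ratio directly, and converting it to percent saved is one line of awk. ratio_to_saved is our helper, and the dataset name assumes the Step 1 layout:

```shell
# ZFS reports the achieved compression ratio per dataset, e.g. "1.27x":
#   zfs get -H -o value compressratio rpool/srv/ollama

# ratio_to_saved is a hypothetical helper: convert "1.27x" to percent saved
ratio_to_saved() {
    awk -v r="${1%x}" 'BEGIN { printf "%.0f%% saved\n", (1 - 1/r) * 100 }'
}

ratio_to_saved 1.27x    # -> 21% saved
```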

Advanced: self-healing infrastructure

The AI that fixes things while you sleep

#!/bin/bash
# ai-auto-heal.sh — AI reviews and acts on critical issues
# Run via cron or systemd timer — WITH CAUTION

STATUS=$(zpool status -x 2>/dev/null)

if [[ "$STATUS" != "all pools are healthy" ]]; then
    # Ask the AI what to do
    RESPONSE=$(echo "zpool status output: ${STATUS}

    Is this critical? What's the safest remediation?
    Respond with ONLY a bash command if safe to run, or ALERT if human needed." | \
        ollama run infra-admin)

    if echo "$RESPONSE" | grep -q "^ALERT"; then
        # AI says human needed — send notification
        echo "$RESPONSE" | mail -s "ALERT: ZFS pool issue" admin@example.com
    else
        # AI suggests a safe command — log and execute
        echo "$(date): AI auto-heal: $RESPONSE" >> /var/log/ai-actions.log
        # Uncomment below to actually execute (use with extreme caution)
        # eval "$RESPONSE"
    fi
fi
Start with alerts only. Read the AI's suggestions for a few weeks. When you trust it, let it run the safe ones. Never let it run destructive commands without human approval. The AI is an assistant, not an operator — yet.
All of this runs locally. No cloud. No API keys. No data leaves your machine. Ollama runs the model on your hardware. Your logs, your configs, your infrastructure data — none of it touches the internet. The AI assistant is as air-gapped as your kldload install.