
NVIDIA on kldload

NVIDIA GPU support is available on CentOS/RHEL, Debian, Ubuntu, and Proxmox installs.


During install (CentOS/RHEL only)

Set KLDLOAD_NVIDIA_DRIVERS=1 in the answers file before starting the install. The installer will:

  • Add the NVIDIA CUDA repository for your RHEL version
  • Install nvidia-driver, nvidia-driver-libs, and nvidia-driver-cuda

Web UI

Select the NVIDIA option in the hardware section of the web UI before clicking Install.

Unattended install

cat > /tmp/answers.env << 'EOF'
KLDLOAD_DISTRO=centos
KLDLOAD_DISK=/dev/vda
KLDLOAD_HOSTNAME=gpu-node
KLDLOAD_USERNAME=admin
KLDLOAD_PASSWORD=changeme
KLDLOAD_PROFILE=desktop
KLDLOAD_NVIDIA_DRIVERS=1
EOF

kldload-install-target --config /tmp/answers.env

Post-install: CentOS / RHEL

If you didn’t enable NVIDIA during install, add it afterward. The CUDA repo URL is detected from your OS version automatically:

# Detect RHEL major version
RHEL_VER=$(. /etc/os-release && echo $VERSION_ID | cut -d. -f1)

# Add the CUDA repo
dnf install -y \
  https://developer.download.nvidia.com/compute/cuda/repos/rhel${RHEL_VER}/x86_64/cuda-repo-rhel${RHEL_VER}-12.9.0-1.x86_64.rpm

# Modern GPUs (Turing / RTX 20 series and newer) — prefer open module
dnf install -y nvidia-open nvidia-driver-libs nvidia-driver-cuda

# Legacy GPUs (pre-Turing) — proprietary driver
# dnf install -y nvidia-driver nvidia-driver-libs nvidia-driver-cuda

# Reboot to load the kernel module
reboot
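To sanity-check the version detection without touching a live system, you can run the same parsing against a sample os-release file. The values below are illustrative; the real commands source /etc/os-release:

```shell
# Sanity-check the VERSION_ID parsing against a sample os-release file.
# NAME and VERSION_ID here are illustrative values, not a real system's.
cat > /tmp/os-release.sample << 'EOF'
NAME="Rocky Linux"
VERSION_ID="9.5"
EOF

# Same parsing as above: keep only the major version
RHEL_VER=$(. /tmp/os-release.sample && echo $VERSION_ID | cut -d. -f1)
echo "repo path: rhel${RHEL_VER}/x86_64"
# prints: repo path: rhel9/x86_64
```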

Post-install: Debian

Debian installs need the non-free repo and the nvidia-driver package. The correct repo codename is detected automatically:

# Detect Debian codename (bookworm, trixie, etc.)
DEBIAN_CODENAME=$(. /etc/os-release && echo $VERSION_CODENAME)

# Add non-free to sources
cat > /etc/apt/sources.list.d/nvidia.list << EOF
deb http://deb.debian.org/debian ${DEBIAN_CODENAME} main contrib non-free non-free-firmware
EOF

apt update
apt install -y nvidia-driver firmware-misc-nonfree

reboot

This requires internet access — the NVIDIA driver is not included in the offline darksite.


Post-install: Ubuntu

Ubuntu is the simplest platform — the ubuntu-drivers tool detects and installs the correct driver automatically:

# Install the driver detection tool
apt install -y ubuntu-drivers-common

# Auto-install the recommended driver for your GPU
ubuntu-drivers autoinstall

reboot

To install a specific version instead:

# List available drivers
ubuntu-drivers devices

# Install a specific version
apt install -y nvidia-driver-570

reboot

Post-install: Proxmox

Proxmox uses a custom kernel (pve-kernel), which requires the matching pve-headers package, not the standard linux-headers. This is the most common failure point.

# Install the correct headers for the running Proxmox kernel
apt install -y pve-headers-$(uname -r) build-essential dkms

# Download and install the NVIDIA driver with DKMS support
NVIDIA_VERSION=570.144
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/${NVIDIA_VERSION}/NVIDIA-Linux-x86_64-${NVIDIA_VERSION}.run

chmod +x NVIDIA-Linux-x86_64-${NVIDIA_VERSION}.run
./NVIDIA-Linux-x86_64-${NVIDIA_VERSION}.run --dkms

reboot

After reboot, verify the driver loaded and /dev/nvidia* devices exist:

nvidia-smi
ls -al /dev/nvidia*

With drivers installed on the Proxmox host, every LXC container can share the GPU directly — no passthrough, no vGPU license. See GPU sharing below.
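For an unprivileged LXC container, GPU access typically means allowing the device cgroup and bind-mounting the device nodes in the container config. A minimal sketch (the `<CTID>` placeholder and the device list are illustrative; nvidia-uvm's major number is assigned dynamically, so check `ls -al /dev/nvidia*` on your host and adjust):

```
# /etc/pve/lxc/<CTID>.conf — illustrative sketch, adjust majors to your host
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
```

Inside the container you still need the matching user-space driver libraries (same version as the host, installed without the kernel module) for nvidia-smi and CUDA to work.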


Verify

nvidia-smi

Expected output shows your GPU model, driver version, CUDA version, temperature, and memory usage.


CUDA toolkit

For GPU computing (machine learning, rendering, etc.), install the full CUDA toolkit after the driver is working.

CentOS / RHEL

dnf install -y cuda-toolkit

Debian

# Detect Debian version for the correct repo
DEBIAN_VERSION_ID=$(. /etc/os-release && echo $VERSION_ID)

curl -fsSL https://developer.download.nvidia.com/compute/cuda/repos/debian${DEBIAN_VERSION_ID}/x86_64/cuda-keyring_1.1-1_all.deb \
  -o /tmp/cuda-keyring.deb
dpkg -i /tmp/cuda-keyring.deb
apt update
apt install -y cuda-toolkit

Ubuntu

# Ubuntu uses the same NVIDIA CUDA repo
UBUNTU_VERSION=$(. /etc/os-release && echo $VERSION_ID | tr -d '.')

curl -fsSL https://developer.download.nvidia.com/compute/cuda/repos/ubuntu${UBUNTU_VERSION}/x86_64/cuda-keyring_1.1-1_all.deb \
  -o /tmp/cuda-keyring.deb
dpkg -i /tmp/cuda-keyring.deb
apt update
apt install -y cuda-toolkit

Verify CUDA

nvcc --version

# Compile and run a sample
cat > /tmp/hello.cu << 'EOF'
#include <stdio.h>
__global__ void hello() { printf("Hello from GPU thread %d\n", threadIdx.x); }
int main() { hello<<<1, 8>>>(); cudaDeviceSynchronize(); }
EOF

nvcc /tmp/hello.cu -o /tmp/hello_cuda && /tmp/hello_cuda

GPU sharing — one GPU, many workloads

This is the part nobody tells you about. With NVIDIA drivers installed on the host and containers running on top, every container can share the same GPU simultaneously. No passthrough. No SR-IOV. The kernel handles time-slicing natively.

What this means in practice

You can run Jellyfin transcoding a 4K stream, an AI inference container running Ollama, and a monitoring container scraping GPU metrics — all at the same time, on one GPU, on one machine.

With PCIe passthrough, one VM locks the GPU. Nobody else can touch it. With containers on bare metal, every container gets a share. That’s the difference.

Install the NVIDIA Container Toolkit

# Add the NVIDIA container toolkit repo
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit.gpg

curl -fsSL https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
  | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit.gpg] https://#' \
  > /etc/apt/sources.list.d/nvidia-container-toolkit.list

apt update && apt install -y nvidia-container-toolkit

# Configure Docker to use the NVIDIA runtime
nvidia-ctk runtime configure --runtime=docker
systemctl restart docker
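After the configure step, /etc/docker/daemon.json should contain an entry along these lines (exact formatting varies by toolkit version — shown here as a reference, not something to paste):

```
{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}
```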

Run GPU-accelerated containers

# Jellyfin with GPU transcoding (NVENC/NVDEC need the "video" driver capability,
# which --gpus all does not enable by default)
docker run -d --name jellyfin --gpus all \
  -e NVIDIA_DRIVER_CAPABILITIES=all \
  -p 8096:8096 \
  -v /srv/media:/media \
  -v /srv/jellyfin/config:/config \
  jellyfin/jellyfin

# Ollama for local AI inference
docker run -d --name ollama --gpus all \
  -p 11434:11434 \
  -v /srv/ollama:/root/.ollama \
  ollama/ollama

# Pull a model and run it
docker exec ollama ollama pull llama3
docker exec ollama ollama run llama3 "Explain ZFS in one sentence"

Verify GPU sharing

# Watch GPU utilization across all containers
watch -n 1 nvidia-smi

# You’ll see multiple processes sharing the GPU:
#   jellyfin  — video transcode
#   ollama    — model inference
#   Each gets a slice of GPU time automatically

Why this works

NVIDIA's CUDA driver handles time-slicing at the kernel level. When multiple processes request GPU compute, the driver alternates their contexts on the hardware, so each gets the SMs (streaming multiprocessors) in turn. No configuration needed — it just works.

Enterprise “vGPU” solutions add a licensing layer on top of this same mechanism. On bare metal with containers, the GPU sharing is native and free.


Nouveau vs NVIDIA

kldload’s CentOS kernel ships with the open-source nouveau driver loaded by default. Installing the proprietary NVIDIA driver blacklists nouveau automatically. If you need to revert:

# CentOS / RHEL
dnf remove -y 'nvidia-driver*'
rm -f /etc/modprobe.d/nvidia.conf
dracut --force --kver $(uname -r)
reboot
# Debian / Ubuntu / Proxmox
apt purge -y 'nvidia-driver*' 'nvidia-open*'
rm -f /etc/modprobe.d/nvidia.conf /etc/modprobe.d/blacklist-nouveau.conf
update-initramfs -u -k $(uname -r)
reboot

ZFS and NVIDIA memory

Both ZFS ARC and NVIDIA drivers use large amounts of memory. On systems with GPUs, you may want to cap ZFS ARC to leave room:

# Check current ARC max
cat /proc/spl/kstat/zfs/arcstats | grep c_max

# Limit ARC to 4GB (persistent across reboots)
echo "options zfs zfs_arc_max=4294967296" > /etc/modprobe.d/zfs-arc.conf

# Apply on RHEL/CentOS
dracut --force --kver $(uname -r)

# Apply on Debian / Ubuntu / Proxmox
update-initramfs -u -k $(uname -r)

A reasonable rule of thumb: total RAM minus GPU VRAM minus 2GB for the OS, then give half of what remains to ARC.
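The rule of thumb above is easy to compute directly. A sketch with illustrative numbers (64 GB RAM, 12 GB VRAM — substitute your own figures):

```shell
# Illustrative ARC sizing per the rule of thumb; substitute your own numbers.
total_ram_gib=64
gpu_vram_gib=12
os_reserve_gib=2

# Half of what remains after VRAM and the OS reserve goes to ARC
arc_gib=$(( (total_ram_gib - gpu_vram_gib - os_reserve_gib) / 2 ))
arc_bytes=$(( arc_gib * 1024 * 1024 * 1024 ))

echo "zfs_arc_max = ${arc_gib} GiB (${arc_bytes} bytes)"
echo "options zfs zfs_arc_max=${arc_bytes}"
# prints: zfs_arc_max = 25 GiB (26843545600 bytes)
```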


Secure Boot

The proprietary NVIDIA kernel module is not signed for Secure Boot. If Secure Boot is enabled, you need to either:

Option 1 — Sign the module with your MOK key (kldload sets up MOK infrastructure during install):

# Find the MOK key kldload created
ls /var/lib/kldload/mok/

# Sign the NVIDIA module
/usr/src/kernels/$(uname -r)/scripts/sign-file sha256 \
  /var/lib/kldload/mok/MOK.priv \
  /var/lib/kldload/mok/MOK.der \
  $(modinfo -n nvidia)

reboot

Option 2 — Disable Secure Boot in UEFI firmware settings.


Troubleshooting

# Check if the module loaded
lsmod | grep nvidia

# If not, check for errors
dmesg | grep -i nvidia

# Kernel updated but DKMS didn’t rebuild the module
dkms status
dkms autoinstall -k $(uname -r)

# Check current desktop session type
echo $XDG_SESSION_TYPE

# Proxmox: verify /dev/nvidia* devices exist for LXC sharing
ls -al /dev/nvidia*