
# ai-platform -- Local AI Server Automation

Ansible automation for full lifecycle management of a server as a local AI inference platform. This project provisions, configures, benchmarks, and maintains every service required to run Ollama-based LLM inference behind an NGINX reverse proxy with SSO, vector search (RAG), DNS, secret management, and Telegram bot access -- all driven by a single `ansible-playbook deploy_ai.yml` command.

## Architecture

```
                         ┌──────────────┐
                         │   Internet   │
                         └──────┬───────┘
                                │
                       ┌────────▼────────┐
                       │  nginx_proxy    │
                       │  192.168.1.30   │
                       │  NGINX reverse  │
                       │  proxy + TLS    │
                       └──┬──────────┬───┘
                          │          │
          ┌───────────────▼┐    ┌────▼──────────────────────┐
          │ coredns_host   │    │ ai_server                 │
          │ 192.168.1.29   │    │ 192.168.1.100             │
          │                │    │                           │
          │ - CoreDNS      │    │ - Ollama (LLM inference)  │
          └────────────────┘    │ - Open WebUI              │
                                │ - Keycloak (SSO/OIDC)     │
                                │ - HashiCorp Vault         │
                                │ - Qdrant (vector DB)      │
                                │ - OpenClaw (Telegram bot) │
                                └───────────────────────────┘
```

## Infrastructure Map

| Host | IP Address | Purpose |
|------|------------|---------|
| `nginx_proxy` | 192.168.1.30 | NGINX reverse proxy, TLS termination |
| `coredns_host` | 192.168.1.29 | CoreDNS |
| `ai_server` | 192.168.1.100 | Ollama, Open WebUI, Keycloak, Vault, Qdrant, OpenClaw |

These are the default values in `inventory/group_vars/all.yml`. Override them for your environment -- see Configuration below.

## Service URLs

| Service | URL (default domain: `example.com`) |
|---------|-------------------------------------|
| Open WebUI | https://ollama-ui.example.com |
| Ollama API | https://ollama-api.example.com |
| Keycloak | https://idm.example.com |
| Vault | https://vault.example.com |

## Configuration

All environment-specific values are variables with generic defaults in `inventory/group_vars/all.yml`. Override them in `local.yml` (gitignored).

| Variable | Default | Description |
|----------|---------|-------------|
| `domain` | `example.com` | Base domain for all service URLs |
| `ai_server_ip` | `192.168.1.100` | IP of the AI inference server |
| `nginx_proxy_ip` | `192.168.1.30` | IP of the NGINX reverse proxy |
| `coredns_host_ip` | `192.168.1.29` | IP of the CoreDNS host |
| `ansible_user` | `admin` | SSH user on all managed hosts |
| `platform_name` | `"AI Platform"` | Display name used in WebUI, Keycloak, and summaries |
| `vault_project_slug` | `"ai-platform"` | Slug for the Keycloak realm name and Vault secret paths |
| `nginx_ssl_cert` | `/etc/nginx/ssl/{{ domain }}.crt` | Path to the TLS certificate on `nginx_proxy` |
| `nginx_ssl_key` | `/etc/nginx/ssl/{{ domain }}.key` | Path to the TLS private key on `nginx_proxy` |

If you use Let's Encrypt, override `nginx_ssl_cert` and `nginx_ssl_key` in `local.yml` to point to your certbot paths (e.g. `/etc/letsencrypt/live/your-domain/fullchain.pem`).

## Setup: two gitignored local files

Configuration is split across two gitignored files — create both before first run.

`inventory/local.yml` -- SSH connection details (host IPs and user):

```yaml
# inventory/local.yml
all:
  hosts:
    ai_server:
      ansible_host: 10.0.1.50
      ansible_user: myuser
    nginx_proxy:
      ansible_host: 10.0.1.10
      ansible_user: myuser
    coredns_host:
      ansible_host: 10.0.1.9
      ansible_user: myuser
```

Ansible reads the `inventory/` directory automatically (`ansible.cfg` sets `inventory = inventory/`), so `inventory/local.yml` is merged with `inventory/hosts.yml` on every run -- no extra flags needed.

The `inventory/` directory also contains `group_vars/` and `host_vars/`, so Ansible finds them regardless of which playbook you run directly.

`local.yml` -- play variables (domain, platform identity, SSL certs, etc.):

```yaml
# local.yml
domain: mylab.internal
ai_server_ip: 10.0.1.50
nginx_proxy_ip: 10.0.1.10
coredns_host_ip: 10.0.1.9
platform_name: "My AI Platform"
vault_project_slug: my-ai
nginx_ssl_cert: /etc/letsencrypt/live/mylab.internal/fullchain.pem
nginx_ssl_key: /etc/letsencrypt/live/mylab.internal/privkey.pem
```

`ai_server_ip`, `nginx_proxy_ip`, and `coredns_host_ip` appear in both files: `inventory/local.yml` controls where Ansible SSHes to; `local.yml` controls what gets rendered into config files and DNS records.

## Alternative: inline `-e` flags (no local.yml)

```shell
ansible-playbook deploy_ai.yml -K \
  -e "domain=mylab.internal" \
  -e "ai_server_ip=10.0.1.50" \
  -e "nginx_proxy_ip=10.0.1.10" \
  -e "coredns_host_ip=10.0.1.9" \
  -e "platform_name='My AI Platform'" \
  -e "vault_project_slug=my-ai" \
  -e "nginx_ssl_cert=/etc/letsencrypt/live/mylab.internal/fullchain.pem" \
  -e "nginx_ssl_key=/etc/letsencrypt/live/mylab.internal/privkey.pem"
```

`inventory/local.yml` must still exist for SSH to work -- inline `-e` flags cannot set per-host connection variables.

## Prerequisites

- Ansible 2.14+
- Python 3.9+
- SSH access to all 3 hosts
- sudo privileges on all 3 hosts
- Ansible Galaxy collections:

  ```shell
  ansible-galaxy collection install -r requirements.yml
  ```

## First-Run Quickstart

```shell
git clone <repo>
cd ai-platform
ansible-galaxy collection install -r requirements.yml

# 1. Create inventory/local.yml with your host IPs and SSH user (gitignored)
# 2. Create local.yml with your domain, platform name, SSL cert paths, etc. (gitignored)
#    See the Configuration section above for the contents of each file.

# 3. Deploy
ansible-playbook deploy_ai.yml -K -e @local.yml
```

`-K` prompts for the sudo (become) password on the remote hosts.

## Credential Management

All secrets (API keys, passwords, OIDC client secrets) are stored in HashiCorp Vault and written only once -- re-running any playbook will never overwrite an existing secret. This means `deploy_ai.yml` is safe to re-run at any time without breaking running services.
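The write-once guarantee boils down to a check-before-write guard. A minimal Python sketch of that pattern, using an in-memory dict as a stand-in for Vault's KV store (`ensure_secret` is illustrative, not a function from this project):

```python
import secrets

def ensure_secret(store: dict, path: str) -> str:
    """Return the secret at `path`, generating and storing it only if absent."""
    if path not in store:              # never overwrite an existing secret
        store[path] = secrets.token_urlsafe(32)
    return store[path]

vault = {}
first = ensure_secret(vault, "secret/ai-platform/keycloak")
second = ensure_secret(vault, "secret/ai-platform/keycloak")
assert first == second                 # idempotent: re-runs reuse the stored value
```

Deleting the key from the store (or, here, the real path from Vault) is what forces a fresh value on the next run -- which is exactly the rotation workflow below.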

### Credential rotation

To rotate a specific credential, delete it from Vault and re-run the full deploy:

```shell
# Example: rotate Keycloak credentials
vault kv delete secret/<vault_project_slug>/keycloak
ansible-playbook deploy_ai.yml -K -e @local.yml
```

New credentials will be generated, stored in Vault, and all dependent services (Keycloak, Open WebUI, Vault OIDC) will be redeployed in the correct order automatically.

### Vault login

The Vault UI supports two login methods:

- **Token** -- use the root token from `vault/.vault-init.json` (emergency/admin use only)
- **OIDC** -- select method **OIDC**, role `default`, click **Sign in with OIDC Provider**, and authenticate via Keycloak. Only users with the `ai-admin` Keycloak role can log in.

## User Roles

Users are created in Keycloak at `https://idm.<domain>/admin/`. Assign roles from the platform realm (not the `master` realm):

| Role | Open WebUI | Vault OIDC |
|------|------------|------------|
| `ai-user` | ✅ Standard access | ❌ Blocked |
| `ai-admin` | ✅ Admin access | ✅ Full access |
| (none) | ❌ Blocked | ❌ Blocked |

## Day-2 Operations

Full deploy / idempotent re-run:

```shell
ansible-playbook deploy_ai.yml -K -e @local.yml
```

Pre-flight checks only:

```shell
ansible-playbook deploy_ai.yml -K -e @local.yml --tags preflight
```

Skip benchmarking on re-runs (faster):

```shell
ansible-playbook deploy_ai.yml -K -e @local.yml --skip-tags benchmark
```

Vault only:

```shell
ansible-playbook playbooks/01_vault.yml -K -e @local.yml
```

Docker + Ollama only:

```shell
ansible-playbook playbooks/02_infrastructure.yml -K -e @local.yml
```

Re-benchmark all installed models:

```shell
ansible-playbook playbooks/03_benchmark.yml -K -e @local.yml
```

Benchmark specific models only:

```shell
ansible-playbook playbooks/03_benchmark.yml -K -e @local.yml \
  -e "benchmark_models=qwen2.5-coder:14b-instruct-q4_K_M,codestral:22b-v0.1-q4_K_M"
```

Pull recommended models if scores are below threshold:

```shell
ansible-playbook playbooks/03_benchmark.yml -K -e @local.yml -e "pull_if_better=true"
```

Update warm-up slots after a benchmark:

```shell
ansible-playbook playbooks/04_models.yml -K -e @local.yml
```

Rotate slot 4 to a specific model:

```shell
ansible-playbook playbooks/04_models.yml -K -e @local.yml -e "slot4_model=deepseek-r1:14b"
```

Redeploy Keycloak only:

```shell
ansible-playbook playbooks/05_keycloak.yml -K -e @local.yml
```

Redeploy Open WebUI only:

```shell
ansible-playbook playbooks/07_openwebui.yml -K -e @local.yml
```

Update NGINX configs only:

```shell
ansible-playbook playbooks/09_nginx.yml -K -e @local.yml
```

Update CoreDNS records only:

```shell
ansible-playbook playbooks/10_coredns.yml -K -e @local.yml
```

Configure Keycloak SSO login for the Vault UI:

```shell
ansible-playbook playbooks/11_vault_oidc.yml -K -e @local.yml
```

## Model Slot System

Four models are kept warm in RAM at all times (`OLLAMA_MAX_LOADED_MODELS=4`, `OLLAMA_KEEP_ALIVE=-1`). Slots are filled by the benchmark playbook -- no model names are hardcoded.

| Slot | Role | Selection | Rotation |
|------|------|-----------|----------|
| 1 | General-purpose primary | Top general composite score | Replaced if score < threshold |
| 2 | General-purpose secondary | 2nd general composite score | Replaced if score < threshold |
| 3 | Coding primary | Top coding composite score | Locked; replaced only by re-benchmark |
| 4 | Coding secondary | 2nd coding composite score | Rotatable: `-e slot4_model=<name>` |

Classification rule: a model is classified as coding if its coding composite score exceeds its general composite score by ≥ 0.15; otherwise it is general.
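The classification rule and the slot selection above can be sketched as follows. This is a minimal Python illustration, assuming each model carries a `(general, coding)` composite-score pair; the function names and example scores are hypothetical, not taken from the benchmark role:

```python
CODING_MARGIN = 0.15  # threshold from the classification rule

def classify(general_score: float, coding_score: float) -> str:
    """A model is 'coding' if its coding composite beats its general composite by >= 0.15."""
    return "coding" if coding_score - general_score >= CODING_MARGIN else "general"

def fill_slots(results: dict) -> dict:
    """Assign slots 1-4 from {model_name: (general_score, coding_score)}.

    Assumes at least two models land in each category.
    """
    general = sorted((g, name) for name, (g, c) in results.items()
                     if classify(g, c) == "general")
    coding = sorted((c, name) for name, (g, c) in results.items()
                    if classify(g, c) == "coding")
    return {
        1: general[-1][1],  # top general composite
        2: general[-2][1],  # second general composite
        3: coding[-1][1],   # top coding composite
        4: coding[-2][1],   # second coding composite
    }

# Hypothetical benchmark output: (general, coding) composites per model
results = {
    "llama3.1:8b": (0.82, 0.70),        # coding - general = -0.12 -> general
    "qwen2.5:14b": (0.78, 0.72),        # -0.06 -> general
    "qwen2.5-coder:14b": (0.60, 0.85),  # +0.25 -> coding
    "codestral:22b": (0.55, 0.78),      # +0.23 -> coding
}
print(fill_slots(results))
```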

## Verification Steps

After a full `deploy_ai.yml` run, verify the deployment (substitute your actual domain and IPs):

1. **Vault health** -- `curl -s https://vault.example.com/v1/sys/health` returns `initialized: true`, `sealed: false`
2. **Vault OIDC login** -- select the OIDC method, role `default`, authenticate with an `ai-admin` Keycloak user
3. **Ollama API** -- `curl -s https://ollama-api.example.com/api/tags` returns the model list
4. **Open WebUI** -- browse to https://ollama-ui.example.com; SSO login works with `ai-user` or `ai-admin`
5. **Keycloak admin** -- browse to https://idm.example.com/admin/, log in with the admin credentials from Vault
6. **Qdrant health** -- `curl -s http://<ai_server_ip>:6333/healthz` returns OK
7. **CoreDNS resolution** -- `dig @<coredns_host_ip> vault.example.com` returns `<nginx_proxy_ip>`
8. **NGINX configs** -- `ssh <nginx_proxy_ip> 'sudo nginx -t'` passes
9. **OpenClaw** -- send a message to the Telegram bot and confirm a response
10. **Benchmark report** -- check `benchmarks/results/benchmark_<timestamp>.md` for the latest results
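For step 1, the pass criteria can be checked programmatically. `initialized` and `sealed` are standard fields in the JSON body Vault returns from `GET /v1/sys/health`; the helper below is an illustration of the check, not part of this project:

```python
import json

def vault_healthy(health_json: str) -> bool:
    """Apply step 1's criteria to the JSON body of GET /v1/sys/health."""
    body = json.loads(health_json)
    return body.get("initialized") is True and body.get("sealed") is False

# Abridged example bodies:
assert vault_healthy('{"initialized": true, "sealed": false, "standby": false}')
assert not vault_healthy('{"initialized": true, "sealed": true}')  # sealed Vault fails
```

Feed it the output of the `curl` command from step 1, e.g. `vault_healthy(subprocess.run(...).stdout)`.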

## Role Reference

| Role | README | Purpose |
|------|--------|---------|
| preflight | `roles/preflight/README.md` | Pre-flight validation |
| hashi_vault | `roles/hashi_vault/README.md` | HashiCorp Vault deployment |
| docker | `roles/docker/README.md` | Docker CE installation |
| ollama | `roles/ollama/README.md` | Ollama inference server |
| benchmark | `roles/benchmark/README.md` | Model benchmarking |
| models | `roles/models/README.md` | Model lifecycle management |
| keycloak | `roles/keycloak/README.md` | Keycloak SSO/OIDC |
| qdrant | `roles/qdrant/README.md` | Qdrant vector database |
| openwebui | `roles/openwebui/README.md` | Open WebUI deployment |
| openclaw | `roles/openclaw/README.md` | OpenClaw Telegram bot |
| nginx | `roles/nginx/README.md` | NGINX reverse proxy |
| coredns | `roles/coredns/README.md` | CoreDNS zone management |

## Security Notes

- `vault/.vault-init.json` and `vault/.vault-token` are gitignored -- they contain Vault unseal keys and root tokens. Never commit these files.
- `local.yml` and `inventory/local.yml` are gitignored -- they contain your environment-specific IPs, usernames, and cert paths. Never commit these files.
- All service secrets (database passwords, API keys, OIDC client secrets) are stored in HashiCorp Vault and injected at deploy time. Secrets are never regenerated unless explicitly deleted from Vault.
- The Ollama API is protected by `OLLAMA_API_KEY` to prevent unauthenticated access.
- TLS termination happens at the NGINX reverse proxy layer.
- Open WebUI and the Vault UI both require a valid Keycloak role for SSO access.