diff --git a/docs/SERVER_OPERATIONS.md b/docs/SERVER_OPERATIONS.md
new file mode 100644
index 0000000000..ccf3a79192
--- /dev/null
+++ b/docs/SERVER_OPERATIONS.md
@@ -0,0 +1,443 @@

# GLAM Server Operations Guide

This document covers server architecture, disk management, troubleshooting, and recovery procedures for the GLAM production server.

## Server Overview

| Property | Value |
|----------|-------|
| **Provider** | Hetzner Cloud |
| **Server Name** | `glam-sparql` |
| **IP Address** | `91.98.224.44` |
| **Instance Type** | cx32 (4 vCPU, 8GB RAM) |
| **Location** | nbg1-dc3 (Nuremberg, Germany) |
| **SSH User** | `root` |

---

## Disk Architecture

The server has two storage volumes with distinct purposes:

### Root Volume (`/dev/sda1`)

| Property | Value |
|----------|-------|
| **Mount Point** | `/` |
| **Size** | 75GB |
| **Purpose** | Operating system, applications, virtual environments |

**Contents:**
- `/var/lib/glam/api/` - GLAM API application and Python venv
- `/var/lib/glam/frontend/` - Frontend static files
- `/var/lib/glam/ducklake/` - DuckLake database files
- `/var/www/` - Web roots for Caddy
- `/usr/` - System binaries
- `/var/log/` - System logs

### Data Volume (`/dev/sdb`)

| Property | Value |
|----------|-------|
| **Mount Point** | `/mnt/data` |
| **Size** | 49GB |
| **Purpose** | Large datasets, Oxigraph triplestore, custodian YAML files |

**Contents:**
- `/mnt/data/oxigraph/` - SPARQL triplestore data
- `/mnt/data/custodian/` - Heritage custodian YAML files (27,459 files, ~17GB)
- `/mnt/data/ontologies/` - Base ontology files
- `/mnt/data/rdf/` - Generated RDF schemas

---

## Directory Layout

### Symbolic Links (Important!)

The following symlinks redirect data to the data volume:

```
/var/lib/glam/api/data/custodian → /mnt/data/custodian
```

**Rationale:** The custodian data (~17GB, 27,459 YAML files) is too large for the root volume.
Moving it to `/mnt/data` with a symlink allows the API to find files at the expected path while storing them on the larger volume. + +### Full Directory Structure + +``` +/var/lib/glam/ +├── api/ # GLAM API (24GB total) +│ ├── venv/ # Python virtual environment (~2GB after optimization) +│ │ └── lib/python3.12/site-packages/ +│ │ ├── torch/ # PyTorch CPU-only (~184MB) +│ │ ├── transformers/ # HuggingFace (~115MB) +│ │ ├── sentence_transformers/ +│ │ └── ... +│ ├── src/ # API source code +│ ├── data/ +│ │ ├── custodian → /mnt/data/custodian # SYMLINK +│ │ ├── validation/ +│ │ └── sparql_templates.yaml +│ ├── schemas/ # LinkML schemas for API +│ ├── backend/ # RAG backend code +│ └── requirements.txt +├── frontend/ # Frontend build (~200MB) +├── ducklake/ # DuckLake database (~3.2GB) +├── valkey/ # Valkey cache config +└── scripts/ # Deployment scripts + +/mnt/data/ +├── oxigraph/ # SPARQL triplestore data +├── custodian/ # Heritage custodian YAML files (17GB) +│ ├── *.yaml # Individual custodian files +│ ├── person/ # Person entity profiles +│ └── web/ # Archived web content +├── ontologies/ # Base ontology files +└── rdf/ # Generated RDF schemas +``` + +--- + +## Services + +### Systemd Services + +| Service | Port | Description | Status Command | +|---------|------|-------------|----------------| +| `oxigraph` | 7878 (internal) | SPARQL triplestore | `systemctl status oxigraph` | +| `glam-api` | 8000 (internal) | FastAPI application | `systemctl status glam-api` | +| `caddy` | 80, 443 | Reverse proxy with TLS | `systemctl status caddy` | + +### Docker Containers + +| Container | Port | Description | +|-----------|------|-------------| +| `qdrant` | 6333-6334 (internal) | Vector database | +| `glam-valkey-api` | 8090 (internal) | Valkey cache API | +| `forgejo` | 3000 (internal), 2222 | Git server | + +### Service Endpoints + +| Domain | Backend | +|--------|---------| +| `bronhouder.nl` | Static frontend | +| `archief.support` | Archief Assistent app | +| 
`sparql.bronhouder.nl` | Oxigraph SPARQL proxy | +| `api.bronhouder.nl` | GLAM API (if configured) | + +--- + +## PyTorch CPU-Only Configuration + +**Critical:** The server does NOT have a GPU. PyTorch must be installed in CPU-only mode. + +### Why CPU-Only? + +- The cx32 instance has no GPU +- CUDA libraries add ~6GB of unnecessary packages +- CPU inference is sufficient for embedding generation + +### Installation + +If PyTorch needs to be reinstalled: + +```bash +# SSH to server +ssh root@91.98.224.44 + +# Activate venv +cd /var/lib/glam/api +source venv/bin/activate + +# Uninstall GPU version +pip uninstall torch -y + +# Install CPU-only version +pip install torch --index-url https://download.pytorch.org/whl/cpu +``` + +### Verification + +```bash +ssh root@91.98.224.44 "source /var/lib/glam/api/venv/bin/activate && python -c 'import torch; print(f\"PyTorch {torch.__version__}, CUDA available: {torch.cuda.is_available()}\")'" +# Should output: PyTorch 2.9.1+cpu, CUDA available: False +``` + +### Package Sizes (Reference) + +| Package | GPU Version | CPU Version | +|---------|-------------|-------------| +| `torch` | ~1.7GB | ~184MB | +| `nvidia-*` | ~4.3GB | Not installed | +| `triton` | ~594MB | Not installed | + +--- + +## Disk Space Management + +### Monitoring + +```bash +# Quick check +ssh root@91.98.224.44 "df -h / /mnt/data" + +# Detailed breakdown +ssh root@91.98.224.44 "du -sh /var/lib/glam/* /mnt/data/* 2>/dev/null | sort -rh" +``` + +### Target Usage + +| Volume | Target | Warning | Critical | +|--------|--------|---------|----------| +| Root (`/`) | <70% | >80% | >90% | +| Data (`/mnt/data`) | <80% | >85% | >95% | + +### Cleanup Procedures + +#### 1. System Logs + +```bash +ssh root@91.98.224.44 " + journalctl --vacuum-size=50M + rm -f /var/log/*.gz /var/log/*.1 /var/log/*.old + truncate -s 0 /var/log/syslog /var/log/auth.log +" +``` + +#### 2. APT Cache + +```bash +ssh root@91.98.224.44 "apt-get clean && apt-get autoremove -y" +``` + +#### 3. 
Docker Cleanup + +```bash +ssh root@91.98.224.44 "docker system prune -af --volumes" +``` + +#### 4. Python Cache + +```bash +ssh root@91.98.224.44 " + find /var/lib/glam/api/venv -type d -name '__pycache__' -exec rm -rf {} + 2>/dev/null + find /var/lib/glam/api/venv -type f -name '*.pyc' -delete 2>/dev/null +" +``` + +--- + +## Troubleshooting + +### Disk Full Recovery + +**Symptoms:** +- Services crash-looping +- "No space left on device" errors +- Deployment fails + +**Recovery Steps:** + +1. **Identify largest directories:** + ```bash + ssh root@91.98.224.44 "du -sh /* 2>/dev/null | sort -rh | head -20" + ``` + +2. **Clean logs (immediate relief):** + ```bash + ssh root@91.98.224.44 "journalctl --vacuum-size=50M && truncate -s 0 /var/log/syslog" + ``` + +3. **Check for unnecessary CUDA packages:** + ```bash + ssh root@91.98.224.44 "du -sh /var/lib/glam/api/venv/lib/python*/site-packages/nvidia" + # If >100MB, reinstall PyTorch CPU-only (see above) + ``` + +4. **Move large data to data volume:** + ```bash + # Example: Move custodian data + ssh root@91.98.224.44 " + mv /var/lib/glam/api/data/custodian /mnt/data/custodian + ln -s /mnt/data/custodian /var/lib/glam/api/data/custodian + " + ``` + +5. **Restart services:** + ```bash + ssh root@91.98.224.44 "systemctl restart glam-api oxigraph caddy" + ``` + +### GLAM API Crash Loop + +**Symptoms:** +- `systemctl status glam-api` shows "activating (auto-restart)" +- High restart counter in logs + +**Diagnosis:** +```bash +ssh root@91.98.224.44 "journalctl -u glam-api -n 100 --no-pager" +``` + +**Common Causes:** + +1. **Port already in use:** + ```bash + ssh root@91.98.224.44 "fuser -k 8000/tcp; systemctl restart glam-api" + ``` + +2. **Missing CUDA libraries (after cleanup):** + ```bash + # Look for: "libcublas.so not found" + # Fix: Reinstall PyTorch CPU-only (see above) + ``` + +3. **Out of memory:** + ```bash + ssh root@91.98.224.44 "free -h && dmesg | grep -i 'out of memory' | tail -5" + ``` + +4. 
**Missing custodian data symlink:** + ```bash + ssh root@91.98.224.44 "ls -la /var/lib/glam/api/data/custodian" + # Should be symlink to /mnt/data/custodian + ``` + +### Oxigraph Issues + +**Check status:** +```bash +ssh root@91.98.224.44 "systemctl status oxigraph" +``` + +**Check triple count:** +```bash +curl -s "https://sparql.bronhouder.nl/query" \ + -H "Accept: application/sparql-results+json" \ + --data-urlencode "query=SELECT (COUNT(*) as ?count) WHERE { ?s ?p ?o }" +``` + +**Restart:** +```bash +ssh root@91.98.224.44 "systemctl restart oxigraph" +``` + +--- + +## Recovery Procedures + +### Full Disk Recovery (Documented 2026-01-10) + +On January 10, 2026, both disks reached 100% capacity. Recovery steps: + +1. **Root cause:** NVIDIA CUDA packages (~6GB) + custodian data (~17GB) on root volume + +2. **Resolution:** + - Removed NVIDIA packages: `rm -rf .../site-packages/nvidia .../triton` + - Reinstalled PyTorch CPU-only: `pip install torch --index-url https://download.pytorch.org/whl/cpu` + - Moved custodian data: `/var/lib/glam/api/data/custodian` → `/mnt/data/custodian` (symlink) + +3. **Result:** + - Root: 100% → 43% used (42GB free) + - Data: 100% → 69% used (15GB free) + +### Service Recovery Order + +If multiple services are down, restart in this order: + +1. `oxigraph` - SPARQL store (no dependencies) +2. `qdrant` (docker) - Vector database +3. `glam-valkey-api` (docker) - Cache +4. `glam-api` - API (depends on Qdrant, Oxigraph) +5. 
`caddy` - Reverse proxy (depends on backends)

```bash
ssh root@91.98.224.44 "
  systemctl restart oxigraph
  docker restart qdrant glam-valkey-api
  sleep 5
  systemctl restart glam-api
  sleep 10
  systemctl restart caddy
"
```

---

## Monitoring Commands

### Quick Health Check

```bash
./infrastructure/deploy.sh --status
```

### Detailed Status

```bash
# All services
ssh root@91.98.224.44 "
  echo '=== Disk ===' && df -h / /mnt/data
  echo '=== Memory ===' && free -h
  echo '=== Services ===' && systemctl status oxigraph glam-api caddy --no-pager | grep -E '(●|Active:)'
  echo '=== Docker ===' && docker ps --format 'table {{.Names}}\t{{.Status}}'
  echo '=== Triples ===' && curl -s 'http://localhost:7878/query' -H 'Accept: application/sparql-results+json' --data-urlencode 'query=SELECT (COUNT(*) as ?c) WHERE { ?s ?p ?o }' | jq -r '.results.bindings[0].c.value'
"
```

### Website Verification

```bash
# Check all endpoints
curl -s "https://bronhouder.nl" -o /dev/null -w "bronhouder.nl: %{http_code}\n"
curl -s "https://archief.support" -o /dev/null -w "archief.support: %{http_code}\n"
curl -s "https://sparql.bronhouder.nl/query" -H "Accept: application/sparql-results+json" \
  --data-urlencode "query=SELECT (COUNT(*) as ?c) WHERE { ?s ?p ?o }" | jq -r '"Oxigraph: \(.results.bindings[0].c.value) triples"'
```

---

## Maintenance Schedule

### Daily (Automated)

- Log rotation via systemd journald
- Certificate renewal via Caddy (automatic)

### Weekly (Manual)

```bash
# Check disk usage
ssh root@91.98.224.44 "df -h"

# Check for failed services
ssh root@91.98.224.44 "systemctl --failed"
```

### Monthly (Manual)

```bash
# System updates
ssh root@91.98.224.44 "apt update && apt upgrade -y"

# Docker image updates. Note: `docker restart` reuses the old image, so after
# pulling, recreate the container (docker rm + docker run with its original
# options) for the new image to take effect.
ssh root@91.98.224.44 "docker pull qdrant/qdrant:latest"

# Clean old logs
ssh root@91.98.224.44 "journalctl --vacuum-time=30d"
```

---

## Related Documentation

- `docs/DEPLOYMENT_GUIDE.md` - Deployment procedures
- `.opencode/DEPLOYMENT_RULES.md` - AI agent deployment rules
- `infrastructure/deploy.sh` - Deployment script source
- `AGENTS.md` - Rule 7 covers deployment guidelines

---

**Last Updated:** 2026-01-10
**Maintainer:** GLAM Project Team
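
---

## Appendix: Disk Threshold Check (Sketch)

The weekly disk check from the Maintenance Schedule can be wrapped in a small helper that applies the warning/critical thresholds from the Target Usage table in the Disk Space Management section. This is an illustrative sketch, not part of the deployed tooling; the `check_usage` function name and threshold wiring are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: classify a mount's disk usage against warning/critical thresholds.
# check_usage is a hypothetical helper, not part of the deployed scripts.

# check_usage <mount> <warn%> <crit%> — prints OK/WARNING/CRITICAL for a mount
check_usage() {
  local mount="$1" warn="$2" crit="$3"
  # df -P prints the use percentage in column 5 (e.g. "43%"); strip the "%"
  local pct
  pct=$(df -P "$mount" | awk 'NR==2 { gsub("%", "", $5); print $5 }')
  if [ "$pct" -ge "$crit" ]; then
    echo "CRITICAL: $mount at ${pct}% (>= ${crit}%)"
  elif [ "$pct" -ge "$warn" ]; then
    echo "WARNING: $mount at ${pct}% (>= ${warn}%)"
  else
    echo "OK: $mount at ${pct}%"
  fi
}

# Root volume thresholds from the Target Usage table; on the server one
# would also run: check_usage /mnt/data 85 95
check_usage / 80 90
```

Run weekly (or from cron) over SSH, this gives a one-line status per volume instead of raw `df -h` output to eyeball.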