glam/.opencode/rules/podman-containerization-rule.md

Rule 54: RAG API Podman Containerization

🚨 CRITICAL: The GLAM RAG API MUST be deployed as a Podman container, NOT as a server-side venv kept in sync with rsync. Containerization removes the Python import inconsistencies between local development and production.


Why Podman (Not venv)

The Problem

Python import behavior differs between local development and gunicorn:

| Context | Import Style | Works? |
|---|---|---|
| Local development (uvicorn main:app) | from .provenance import | Yes |
| Production (gunicorn main:app) | from .provenance import | No |
| Production (gunicorn main:app) | from provenance import | Yes |

When syncing code via rsync to a server venv, the import style that works locally may fail in production with gunicorn.

The Solution

Containerization with Podman ensures:

  1. Consistent import resolution - Same Python environment locally and in production
  2. Isolation - RAG API dependencies don't conflict with other services
  3. Reproducibility - Dockerfile defines exact environment
  4. Non-root execution - The application runs as the non-root user glam (UID 1000) inside the container

Deployment Architecture

┌─────────────────────────────────────────────────────────────┐
│  Server: 91.98.224.44 (bronhouder.nl)                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  Podman Container: glam-rag-api                     │   │
│  │                                                     │   │
│  │  - python:3.11-slim base                            │   │
│  │  - gunicorn + uvicorn workers                       │   │
│  │  - Port 8010 (host network mode)                    │   │
│  │  - Non-root user (glam:1000)                        │   │
│  │                                                     │   │
│  │  Connects to (all on localhost):                    │   │
│  │  - Qdrant :6333                                     │   │
│  │  - Oxigraph SPARQL :7878                            │   │
│  │  - TypeDB :1729                                     │   │
│  │  - PostGIS :5432                                    │   │
│  │  - Valkey :8090                                     │   │
│  └─────────────────────────────────────────────────────┘   │
│                                                             │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  Caddy Reverse Proxy                                │   │
│  │  https://bronhouder.nl/api/rag/* → :8010            │   │
│  └─────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Key Files

| File | Purpose |
|---|---|
| backend/rag/Dockerfile | Container image definition |
| backend/rag/requirements.txt | Python dependencies (includes gunicorn) |
| backend/rag/main.py | FastAPI application |
| infrastructure/deploy.sh | Deployment script (--rag flag) |

Deployment Commands

Deploy RAG API

# From project root - deploys via Podman
./infrastructure/deploy.sh --rag

This will:

  1. Sync backend/rag/ to /opt/glam-backend/rag/ on server
  2. Build Podman image glam-rag-api:latest
  3. Create/update systemd service glam-rag-api.service
  4. Start the container
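
The sync, build, and restart steps above can be sketched as a list of commands. This is not the contents of deploy.sh: the SSH user, the rsync flags, and the function name are assumptions chosen for illustration, and writing the systemd unit file (step 3) is left out.

```python
import shlex

# Assumptions for illustration only; none of these values are read from deploy.sh.
SSH_TARGET = "deploy@bronhouder.nl"  # hypothetical SSH user
LOCAL_DIR = "backend/rag/"
REMOTE_DIR = "/opt/glam-backend/rag/"
IMAGE = "glam-rag-api:latest"
SERVICE = "glam-rag-api.service"


def rag_deploy_commands(ssh_target: str = SSH_TARGET) -> list[list[str]]:
    """Return the sync, build, and restart commands in the order they would run."""
    build = f"cd {shlex.quote(REMOTE_DIR)} && podman build -t {IMAGE} ."
    restart = f"systemctl daemon-reload && systemctl restart {SERVICE}"
    return [
        # 1. Sync the RAG API sources to the server
        ["rsync", "-az", "--delete", LOCAL_DIR, f"{ssh_target}:{REMOTE_DIR}"],
        # 2. Build the image on the server
        ["ssh", ssh_target, build],
        # 4. Reload unit files and (re)start the container
        ["ssh", ssh_target, restart],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them
    for command in rag_deploy_commands():
        print(shlex.join(command))
```

Printing the commands first makes it easy to review what would run on the server before executing anything.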

Manual Operations (on server)

# Check service status
systemctl status glam-rag-api

# View container logs
podman logs glam-rag-api

# Restart service
systemctl restart glam-rag-api

# Rebuild image manually
cd /opt/glam-backend/rag
podman build -t glam-rag-api:latest .

# Clean up old images
podman image prune -f

Systemd Service Configuration

The service file is created by deploy.sh at /etc/systemd/system/glam-rag-api.service:

[Unit]
Description=GLAM Heritage RAG API (Podman)
After=network.target qdrant.service
Wants=qdrant.service

[Service]
Type=simple
Restart=always
RestartSec=10
EnvironmentFile=/var/lib/glam/.env

ExecStart=/usr/bin/podman run --rm --name glam-rag-api \
  --network host \
  -e OPENAI_API_KEY \
  -e ZAI_API_TOKEN \
  -e QDRANT_HOST=localhost \
  -e QDRANT_PORT=6333 \
  -e QDRANT_COLLECTION=heritage_custodians_minilm \
  -e EMBEDDING_MODEL=all-MiniLM-L6-v2 \
  -e EMBEDDING_DIM=384 \
  -e TYPEDB_HOST=localhost \
  -e TYPEDB_PORT=1729 \
  -e TYPEDB_DATABASE=glam \
  -e SPARQL_ENDPOINT=http://localhost:7878/query \
  -e VALKEY_CACHE_URL=http://localhost:8090 \
  -e POSTGIS_HOST=localhost \
  -e POSTGIS_PORT=5432 \
  -e POSTGIS_DATABASE=glam \
  -e LLM_PROVIDER=openai \
  -e LLM_MODEL=gpt-4.1-mini \
  -v glam-rag-optimized-models:/app/optimized_models:z \
  glam-rag-api:latest

ExecStop=/usr/bin/podman stop glam-rag-api

[Install]
WantedBy=multi-user.target

Key Configuration:

  • --network host: Container uses host networking (accesses localhost services directly)
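  • EnvironmentFile: Loads API keys from /var/lib/glam/.env into the service environment; the value-less -e OPENAI_API_KEY and -e ZAI_API_TOKEN flags forward them into the container
  • -v ...:/app/optimized_models:z: Persistent named volume for DSPy optimized models (:z applies a shared SELinux label)
  • Restart=always: Restarts the container whenever it exits, after a 10-second delay (RestartSec=10)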
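
On the application side, the variables set in the unit file arrive as ordinary environment variables. The sketch below shows one way they might be read. RagSettings and load_settings are hypothetical names, not taken from main.py, and the defaults simply mirror the values in the unit file above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RagSettings:
    """Hypothetical settings object; the field names are assumptions."""
    qdrant_host: str
    qdrant_port: int
    typedb_host: str
    typedb_port: int
    sparql_endpoint: str
    openai_api_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> RagSettings:
    """Read backend settings from `env`, falling back to the unit-file values."""
    return RagSettings(
        qdrant_host=env.get("QDRANT_HOST", "localhost"),
        qdrant_port=int(env.get("QDRANT_PORT", "6333")),
        typedb_host=env.get("TYPEDB_HOST", "localhost"),
        typedb_port=int(env.get("TYPEDB_PORT", "1729")),
        sparql_endpoint=env.get("SPARQL_ENDPOINT", "http://localhost:7878/query"),
        # Secrets have no default; None means the key was not forwarded
        openai_api_key=env.get("OPENAI_API_KEY"),
    )
```

Passing the environment as a parameter keeps the function testable without touching the real process environment.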

Dockerfile Structure

The Dockerfile at backend/rag/Dockerfile:

FROM python:3.11-slim

# Non-root user for security
RUN useradd -m -u 1000 -s /bin/bash glam

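# Working directory (assumed to be /app, matching the /app/optimized_models volume mount)
WORKDIR /app

# Install dependencies first (layer caching)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt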

# Copy application code
COPY --chown=glam:glam . .

USER glam

# Gunicorn with uvicorn workers for async
CMD ["gunicorn", "main:app", \
     "--bind", "0.0.0.0:8010", \
     "--workers", "2", \
     "--worker-class", "uvicorn.workers.UvicornWorker", \
     "--timeout", "120"]

Import Style for Container Deployment

CRITICAL: Use absolute imports in RAG API Python files:

# CORRECT - Works in container with gunicorn
from provenance import ProvenanceTracker, format_provenance_chain

# WRONG - Fails with gunicorn
from .provenance import ProvenanceTracker, format_provenance_chain

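This is because gunicorn main:app imports main.py as a top-level module, not as part of a package. A relative import then has no parent package to resolve against and raises ImportError ("attempted relative import with no known parent package").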
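
The failure can be reproduced without gunicorn. The sketch below writes a throwaway main.py and provenance.py to a temporary directory and imports main as a top-level module, which is how a main:app target is loaded. The helper name try_import and the stub ProvenanceTracker class are illustrative only.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def try_import(import_line: str) -> str:
    """Import a throwaway main.py containing `import_line` as a top-level module."""
    with tempfile.TemporaryDirectory() as tmp:
        app_dir = Path(tmp)
        # Stub standing in for the real provenance module
        (app_dir / "provenance.py").write_text("class ProvenanceTracker:\n    pass\n")
        (app_dir / "main.py").write_text(import_line + "\n")
        sys.path.insert(0, str(app_dir))
        try:
            # Drop any cached copies so the temporary files are the ones loaded
            for name in ("main", "provenance"):
                sys.modules.pop(name, None)
            importlib.invalidate_caches()
            importlib.import_module("main")
            return "ok"
        except ImportError as exc:
            return f"ImportError: {exc}"
        finally:
            sys.path.remove(str(app_dir))
            for name in ("main", "provenance"):
                sys.modules.pop(name, None)


if __name__ == "__main__":
    print(try_import("from provenance import ProvenanceTracker"))
    print(try_import("from .provenance import ProvenanceTracker"))
```

The absolute form succeeds because the application directory is on sys.path; the relative form fails because main has no parent package.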


Health Check Verification

# Local test (from server)
curl http://localhost:8010/health

# External test
curl https://bronhouder.nl/api/rag/health

Expected response:

{
  "status": "healthy",
  "backends": {
    "qdrant": {"status": "connected", "collections": {...}},
    "sparql": {"status": "connected", "triples": 30421},
    "typedb": {"status": "connected", "observations": 27741}
  }
}
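
For scripted checks, the response can be validated rather than read by eye. This is a minimal sketch that assumes the response shape shown above; the function names are illustrative, and fetch_health needs the API to be running.

```python
import json
from urllib.request import urlopen


def disconnected_backends(payload: dict) -> list[str]:
    """Return the names of backends whose status is not "connected"."""
    backends = payload.get("backends", {})
    return sorted(
        name for name, info in backends.items()
        if info.get("status") != "connected"
    )


def is_healthy(payload: dict) -> bool:
    """True when the overall status is healthy and every backend is connected."""
    return payload.get("status") == "healthy" and not disconnected_backends(payload)


def fetch_health(url: str = "http://localhost:8010/health", timeout: float = 5.0) -> dict:
    """Fetch and decode the health endpoint (requires a running API)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

A deployment script could call is_healthy(fetch_health()) after the restart and exit non-zero when it returns False.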

Troubleshooting

Container Won't Start

# Check systemd logs
journalctl -u glam-rag-api -n 50

# Check container logs directly
podman logs glam-rag-api

Import Errors

If you see ModuleNotFoundError or ImportError:

  1. Check imports use absolute style (not relative)
  2. Verify all dependencies in requirements.txt
  3. Rebuild image: podman build -t glam-rag-api:latest .
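
Step 1 can be automated with the standard-library ast module. The sketch below reports every relative import in a piece of Python source; find_relative_imports is an illustrative helper, not part of the repository.

```python
import ast


def find_relative_imports(source: str) -> list[tuple[int, str]]:
    """Return (line number, module) for every relative import in `source`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # level > 0 marks a relative import: one dot per level
        if isinstance(node, ast.ImportFrom) and node.level > 0:
            hits.append((node.lineno, "." * node.level + (node.module or "")))
    return sorted(hits)
```

Running it over the text of each .py file under backend/rag/ before a deploy would flag any relative import that needs converting.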

Backend Connection Issues

The container uses --network host, so every backend must be reachable on localhost:

  • Qdrant: localhost:6333
  • TypeDB: localhost:1729
  • Oxigraph: localhost:7878

Check backend services:

systemctl status qdrant typedb oxigraph
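
A service can be active in systemd and still not be listening. As a complementary check, the sketch below probes the backend ports listed above with plain TCP connections. The function names are illustrative, and a successful connect only shows that something is listening, not that the backend is healthy.

```python
import socket

# Ports taken from the list above
BACKEND_PORTS = {"qdrant": 6333, "typedb": 1729, "oxigraph": 7878}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe_backends(ports: dict = BACKEND_PORTS, host: str = "localhost") -> dict:
    """Map each backend name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, reachable in probe_backends().items():
        print(f"{name}: {'reachable' if reachable else 'NOT reachable'}")
```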

Migration from venv (Historical)

The old venv-based deployment has been deprecated:

| Old (Deprecated) | New (Current) |
|---|---|
| /var/lib/glam/api/backend/rag/ | /opt/glam-backend/rag/ |
| glam-rag-api.service (venv) | glam-rag-api.service (Podman) |
| Manual pip install | Dockerfile-based |
| rsync + systemd restart | rsync + podman build + restart |

The old service and venv have already been removed. The commands are kept for reference:

# Already done - for reference only
systemctl stop glam-rag-api-venv  # if exists
systemctl disable glam-rag-api-venv
rm /etc/systemd/system/glam-rag-api-venv.service
systemctl daemon-reload  # make systemd forget the deleted unit
rm -rf /var/lib/glam/api/backend/rag/venv

See Also

  • Rule 7: Deployment is LOCAL via SSH/rsync (NO CI/CD)
  • backend/rag/README.md - RAG API documentation
  • infrastructure/deploy.sh - Full deployment script