Rule 54: RAG API Podman Containerization
🚨 CRITICAL: The GLAM RAG API MUST be deployed via a Podman container, NOT via venv/rsync. Containerization avoids Python import inconsistencies between local development and production.
Why Podman (Not venv)
The Problem
Python import behavior differs between local development and gunicorn:
| Context | Import Style | Works? |
|---|---|---|
| Local development (uvicorn main:app) | from .provenance import | ✅ |
| Production (gunicorn main:app) | from .provenance import | ❌ |
| Production (gunicorn main:app) | from provenance import | ✅ |
When syncing code via rsync to a server venv, the import style that works locally may fail in production with gunicorn.
The Solution
Containerization with Podman ensures:
- Consistent import resolution - Same Python environment locally and in production
- Isolation - RAG API dependencies don't conflict with other services
- Reproducibility - Dockerfile defines exact environment
- Non-root execution - The application runs as a non-root user (glam, UID 1000) inside the container
Deployment Architecture
┌─────────────────────────────────────────────────────────────┐
│ Server: 91.98.224.44 (bronhouder.nl) │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Podman Container: glam-rag-api │ │
│ │ │ │
│ │ - python:3.11-slim base │ │
│ │ - gunicorn + uvicorn workers │ │
│ │ - Port 8010 (host network mode) │ │
│ │ - Non-root user (glam:1000) │ │
│ │ │ │
│ │ Connects to (all on localhost): │ │
│ │ - Qdrant :6333 │ │
│ │ - Oxigraph SPARQL :7878 │ │
│ │ - TypeDB :1729 │ │
│ │ - PostGIS :5432 │ │
│ │ - Valkey :8090 │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Caddy Reverse Proxy │ │
│ │ https://bronhouder.nl/api/rag/* → :8010 │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Key Files
| File | Purpose |
|---|---|
| backend/rag/Dockerfile | Container image definition |
| backend/rag/requirements.txt | Python dependencies (includes gunicorn) |
| backend/rag/main.py | FastAPI application |
| infrastructure/deploy.sh | Deployment script (--rag flag) |
Deployment Commands
Deploy RAG API
# From project root - deploys via Podman
./infrastructure/deploy.sh --rag
This will:
- Sync backend/rag/ to /opt/glam-backend/rag/ on the server
- Build Podman image glam-rag-api:latest
- Create/update systemd service glam-rag-api.service
- Start the container
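The steps above can be sketched as a dry-run plan. This is not the contents of infrastructure/deploy.sh; the rsync flags, the SSH target `user@example-server`, and the function name `deploy_plan` are assumptions made for illustration, and the step that writes the unit file is left out.

```python
import shlex


def deploy_plan(
    server: str,
    local_dir: str = "backend/rag/",
    remote_dir: str = "/opt/glam-backend/rag/",
    image: str = "glam-rag-api:latest",
    service: str = "glam-rag-api",
) -> list[list[str]]:
    """Return the commands a deployment would run, without executing them.

    Hypothetical helper: the real deploy.sh may use different flags and
    also writes /etc/systemd/system/glam-rag-api.service, which is omitted here.
    """
    return [
        # 1. Sync the RAG API sources to the server (flags are an assumption).
        ["rsync", "-az", local_dir, f"{server}:{remote_dir}"],
        # 2. Build the Podman image on the server.
        ["ssh", server, f"podman build -t {image} {remote_dir}"],
        # 3. Reload systemd so an updated unit file is picked up.
        ["ssh", server, "systemctl daemon-reload"],
        # 4. (Re)start the container through its systemd service.
        ["ssh", server, f"systemctl restart {service}"],
    ]


if __name__ == "__main__":
    # Print the plan only; nothing is executed.
    for cmd in deploy_plan("user@example-server"):
        print(shlex.join(cmd))
```

Printing the plan before running anything makes it easy to compare against what deploy.sh actually does.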
Manual Operations (on server)
# Check service status
systemctl status glam-rag-api
# View container logs
podman logs glam-rag-api
# Restart service
systemctl restart glam-rag-api
# Rebuild image manually
cd /opt/glam-backend/rag
podman build -t glam-rag-api:latest .
# Clean up old images
podman image prune -f
Systemd Service Configuration
The service file is created by deploy.sh at /etc/systemd/system/glam-rag-api.service:
[Unit]
Description=GLAM Heritage RAG API (Podman)
After=network.target qdrant.service
Wants=qdrant.service
[Service]
Type=simple
Restart=always
RestartSec=10
EnvironmentFile=/var/lib/glam/.env
ExecStart=/usr/bin/podman run --rm --name glam-rag-api \
--network host \
-e OPENAI_API_KEY \
-e ZAI_API_TOKEN \
-e QDRANT_HOST=localhost \
-e QDRANT_PORT=6333 \
-e QDRANT_COLLECTION=heritage_custodians_minilm \
-e EMBEDDING_MODEL=all-MiniLM-L6-v2 \
-e EMBEDDING_DIM=384 \
-e TYPEDB_HOST=localhost \
-e TYPEDB_PORT=1729 \
-e TYPEDB_DATABASE=glam \
-e SPARQL_ENDPOINT=http://localhost:7878/query \
-e VALKEY_CACHE_URL=http://localhost:8090 \
-e POSTGIS_HOST=localhost \
-e POSTGIS_PORT=5432 \
-e POSTGIS_DATABASE=glam \
-e LLM_PROVIDER=openai \
-e LLM_MODEL=gpt-4.1-mini \
-v glam-rag-optimized-models:/app/optimized_models:z \
glam-rag-api:latest
ExecStop=/usr/bin/podman stop glam-rag-api
[Install]
WantedBy=multi-user.target
Key Configuration:
- --network host: Container uses host networking (accesses localhost services directly)
- EnvironmentFile: Loads API keys from /var/lib/glam/.env
- -v ...:/app/optimized_models:z: Persistent volume for DSPy optimized models
- Restart=always: Restarts the service whenever it exits, including after failures
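On the application side, the variables passed with -e arrive as ordinary process environment variables. The sketch below shows one way they could be read into a typed settings object. The class name `RagSettings` and its fields are hypothetical, and how main.py actually reads its configuration is not shown in this document. The defaults mirror the values in the unit file.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RagSettings:
    """Hypothetical settings holder; not the actual main.py implementation."""

    qdrant_host: str
    qdrant_port: int
    qdrant_collection: str
    embedding_model: str
    embedding_dim: int
    sparql_endpoint: str
    llm_provider: str
    llm_model: str

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "RagSettings":
        # Fall back to the values from the systemd unit when a variable is unset.
        return cls(
            qdrant_host=env.get("QDRANT_HOST", "localhost"),
            qdrant_port=int(env.get("QDRANT_PORT", "6333")),
            qdrant_collection=env.get("QDRANT_COLLECTION", "heritage_custodians_minilm"),
            embedding_model=env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
            embedding_dim=int(env.get("EMBEDDING_DIM", "384")),
            sparql_endpoint=env.get("SPARQL_ENDPOINT", "http://localhost:7878/query"),
            llm_provider=env.get("LLM_PROVIDER", "openai"),
            llm_model=env.get("LLM_MODEL", "gpt-4.1-mini"),
        )


if __name__ == "__main__":
    print(RagSettings.from_env())
```

Converting ports and dimensions to int at startup surfaces a malformed value immediately instead of at the first backend call.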
Dockerfile Structure
The Dockerfile at backend/rag/Dockerfile:
FROM python:3.11-slim
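# Non-root user for security
RUN useradd -m -u 1000 -s /bin/bash glam
# Working directory (assumed to be /app, matching the /app/optimized_models volume mount)
WORKDIR /app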
# Install dependencies first (layer caching)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY --chown=glam:glam . .
USER glam
# Gunicorn with uvicorn workers for async
CMD ["gunicorn", "main:app", \
"--bind", "0.0.0.0:8010", \
"--workers", "2", \
"--worker-class", "uvicorn.workers.UvicornWorker", \
"--timeout", "120"]
Import Style for Container Deployment
CRITICAL: Use absolute imports in RAG API Python files:
# CORRECT - Works in container with gunicorn
from provenance import ProvenanceTracker, format_provenance_chain
# WRONG - Fails with gunicorn
from .provenance import ProvenanceTracker, format_provenance_chain
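This is because gunicorn main:app imports main.py as a top-level module with no parent package, so a relative import such as from .provenance fails with "ImportError: attempted relative import with no known parent package".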
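The failure can be reproduced without gunicorn. The sketch below writes two throwaway modules into a temporary directory and imports one of them as a top-level module, which is the situation described above. The module names `demo_main` and `demo_provenance` and the helper `try_import` are hypothetical stand-ins for main.py and provenance.py.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def try_import(import_line: str) -> str:
    """Import a throwaway top-level module whose only content is import_line."""
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-ins for provenance.py and main.py in the same flat directory.
        Path(tmp, "demo_provenance.py").write_text("class ProvenanceTracker:\n    pass\n")
        Path(tmp, "demo_main.py").write_text(import_line + "\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            importlib.import_module("demo_main")  # loaded with no parent package
            return "ok"
        except ImportError as exc:
            return f"ImportError: {exc}"
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            for name in ("demo_main", "demo_provenance"):
                sys.modules.pop(name, None)


if __name__ == "__main__":
    print(try_import("from demo_provenance import ProvenanceTracker"))
    print(try_import("from .demo_provenance import ProvenanceTracker"))
```

The absolute form succeeds, while the relative form raises ImportError because the module has no package to resolve the leading dot against.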
Health Check Verification
# Local test (from server)
curl http://localhost:8010/health
# External test
curl https://bronhouder.nl/api/rag/health
Expected response:
{
"status": "healthy",
"backends": {
"qdrant": {"status": "connected", "collections": {...}},
"sparql": {"status": "connected", "triples": 30421},
"typedb": {"status": "connected", "observations": 27741}
}
}
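For scripted checks, the same verification can be done in Python. This is a minimal sketch that assumes the response has the shape shown above (a top-level "status" plus per-backend "status" fields). The function names `fetch_health` and `unhealthy_backends` are illustrative and not part of the RAG API.

```python
import json
from urllib.request import urlopen


def unhealthy_backends(payload: dict) -> list[str]:
    """Return the names of components that do not report a good status."""
    problems = []
    if payload.get("status") != "healthy":
        problems.append("api")
    for name, info in payload.get("backends", {}).items():
        if info.get("status") != "connected":
            problems.append(name)
    return problems


def fetch_health(url: str = "http://localhost:8010/health", timeout: float = 5.0) -> dict:
    """Fetch and decode the health endpoint (requires the service to be running)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    failing = unhealthy_backends(fetch_health())
    print("all backends connected" if not failing else f"problems: {failing}")
```

Keeping the parsing separate from the HTTP call lets the check be exercised against a saved response as well as the live endpoint.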
Troubleshooting
Container Won't Start
# Check systemd logs
journalctl -u glam-rag-api -n 50
# Check container logs directly
podman logs glam-rag-api
Import Errors
If you see ModuleNotFoundError or ImportError:
- Check imports use absolute style (not relative)
- Verify all dependencies in requirements.txt
- Rebuild image: podman build -t glam-rag-api:latest .
Backend Connection Issues
Container uses --network host, so backends must be on localhost:
- Qdrant: localhost:6333
- TypeDB: localhost:1729
- Oxigraph: localhost:7878
Check backend services:
systemctl status qdrant typedb oxigraph
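A service can be active in systemd and still not accept connections, so a plain TCP probe of the ports listed above is a useful second check. The sketch below uses only the standard library. The helper names `port_open` and `check_backends` are hypothetical, and the probe only confirms that something is listening, not that the backend is healthy.

```python
import socket

# Ports taken from the backend list above.
BACKENDS = {"qdrant": 6333, "typedb": 1729, "oxigraph": 7878}


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_backends(host: str = "localhost") -> dict[str, bool]:
    """Probe every known backend port on the given host."""
    return {name: port_open(host, port) for name, port in BACKENDS.items()}


if __name__ == "__main__":
    for name, reachable in check_backends().items():
        print(f"{name}: {'reachable' if reachable else 'NOT reachable'}")
```

Run it on the server itself, since the container shares the host network and reaches the backends on localhost.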
Migration from venv (Historical)
The old venv-based deployment has been deprecated:
| Old (Deprecated) | New (Current) |
|---|---|
| /var/lib/glam/api/backend/rag/ | /opt/glam-backend/rag/ |
| glam-rag-api.service (venv) | glam-rag-api.service (Podman) |
| Manual pip install | Dockerfile-based |
| rsync + systemd restart | rsync + podman build + restart |
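The old service and venv have already been removed; the commands are kept for reference only:
# Already done - for reference only
systemctl stop glam-rag-api-venv # if exists
systemctl disable glam-rag-api-venv # if exists
rm /etc/systemd/system/glam-rag-api-venv.service
systemctl daemon-reload # make systemd drop the removed unit
rm -rf /var/lib/glam/api/backend/rag/venv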
See Also
- Rule 7: Deployment is LOCAL via SSH/rsync (NO CI/CD)
- backend/rag/README.md - RAG API documentation
- infrastructure/deploy.sh - Full deployment script