# Rule 54: RAG API Podman Containerization

🚨 **CRITICAL**: The GLAM RAG API MUST be deployed via Podman container, NOT via venv/rsync. This solves Python import consistency issues between local development and production.

---

## Why Podman (Not venv)

### The Problem

Python import behavior differs between local development and gunicorn:

| Context | Import Style | Works? |
|---------|--------------|--------|
| Local development (`uvicorn main:app`) | `from .provenance import` | ✅ |
| Production (`gunicorn main:app`) | `from .provenance import` | ❌ |
| Production (`gunicorn main:app`) | `from provenance import` | ✅ |

When syncing code via rsync to a server venv, the import style that works locally may fail in production with gunicorn.

### The Solution

Containerization with Podman ensures:

1. **Consistent import resolution** - Same Python environment locally and in production
2. **Isolation** - RAG API dependencies don't conflict with other services
3. **Reproducibility** - Dockerfile defines exact environment
4. **Rootless security** - Podman runs as non-root user inside container

---

## Deployment Architecture

```
┌─────────────────────────────────────────────────────────────┐
│  Server: 91.98.224.44 (bronhouder.nl)                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   ┌─────────────────────────────────────────────────────┐   │
│   │  Podman Container: glam-rag-api                     │   │
│   │                                                     │   │
│   │  - python:3.11-slim base                            │   │
│   │  - gunicorn + uvicorn workers                       │   │
│   │  - Port 8010 (host network mode)                    │   │
│   │  - Non-root user (glam:1000)                        │   │
│   │                                                     │   │
│   │  Connects to (all on localhost):                    │   │
│   │  - Qdrant :6333                                     │   │
│   │  - Oxigraph SPARQL :7878                            │   │
│   │  - TypeDB :1729                                     │   │
│   │  - PostGIS :5432                                    │   │
│   │  - Valkey :8090                                     │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│   ┌─────────────────────────────────────────────────────┐   │
│   │  Caddy Reverse Proxy                                │   │
│   │  https://bronhouder.nl/api/rag/* → :8010            │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

---

## Key Files

| File | Purpose |
|------|---------|
| `backend/rag/Dockerfile` | Container image definition |
| `backend/rag/requirements.txt` | Python dependencies (includes gunicorn) |
| `backend/rag/main.py` | FastAPI application |
| `infrastructure/deploy.sh` | Deployment script (`--rag` flag) |

---

## Deployment Commands

### Deploy RAG API

```bash
# From project root - deploys via Podman
./infrastructure/deploy.sh --rag
```

This will:

1. Sync `backend/rag/` to `/opt/glam-backend/rag/` on server
2. Build Podman image `glam-rag-api:latest`
3. Create/update systemd service `glam-rag-api.service`
4. Start the container

### Manual Operations (on server)

```bash
# Check service status
systemctl status glam-rag-api

# View container logs
podman logs glam-rag-api

# Restart service
systemctl restart glam-rag-api

# Rebuild image manually
cd /opt/glam-backend/rag
podman build -t glam-rag-api:latest .

# Clean up old images
podman image prune -f
```

---

## Systemd Service Configuration

The service file is created by `deploy.sh` at `/etc/systemd/system/glam-rag-api.service`:

```ini
[Unit]
Description=GLAM Heritage RAG API (Podman)
After=network.target qdrant.service
Wants=qdrant.service

[Service]
Type=simple
Restart=always
RestartSec=10
EnvironmentFile=/var/lib/glam/.env
ExecStart=/usr/bin/podman run --rm --name glam-rag-api \
    --network host \
    -e OPENAI_API_KEY \
    -e ZAI_API_TOKEN \
    -e QDRANT_HOST=localhost \
    -e QDRANT_PORT=6333 \
    -e QDRANT_COLLECTION=heritage_custodians_minilm \
    -e EMBEDDING_MODEL=all-MiniLM-L6-v2 \
    -e EMBEDDING_DIM=384 \
    -e TYPEDB_HOST=localhost \
    -e TYPEDB_PORT=1729 \
    -e TYPEDB_DATABASE=glam \
    -e SPARQL_ENDPOINT=http://localhost:7878/query \
    -e VALKEY_CACHE_URL=http://localhost:8090 \
    -e POSTGIS_HOST=localhost \
    -e POSTGIS_PORT=5432 \
    -e POSTGIS_DATABASE=glam \
    -e LLM_PROVIDER=openai \
    -e LLM_MODEL=gpt-4.1-mini \
    -v glam-rag-optimized-models:/app/optimized_models:z \
    glam-rag-api:latest
ExecStop=/usr/bin/podman stop glam-rag-api

[Install]
WantedBy=multi-user.target
```

**Key Configuration**:

- `--network host`: Container uses host networking (accesses localhost services directly)
- `EnvironmentFile`: Loads API keys from `/var/lib/glam/.env`
- `-v ...:/app/optimized_models:z`: Persistent volume for DSPy optimized models
- `Restart=always`: Auto-restart whenever the container process exits, whether cleanly or on failure

---

## Dockerfile Structure

The Dockerfile at `backend/rag/Dockerfile`:

```dockerfile
FROM python:3.11-slim

# Non-root user for security
RUN useradd -m -u 1000 -s /bin/bash glam

# Install dependencies first (layer caching)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY --chown=glam:glam . .

USER glam

# Gunicorn with uvicorn workers for async
CMD ["gunicorn", "main:app", \
     "--bind", "0.0.0.0:8010", \
     "--workers", "2", \
     "--worker-class", "uvicorn.workers.UvicornWorker", \
     "--timeout", "120"]
```

---

## Import Style for Container Deployment

**CRITICAL**: Use absolute imports in RAG API Python files:

```python
# CORRECT - Works in container with gunicorn
from provenance import ProvenanceTracker, format_provenance_chain

# WRONG - Fails with gunicorn
from .provenance import ProvenanceTracker, format_provenance_chain
```

This is because gunicorn imports `main` as a top-level module rather than as part of a package, so relative imports have no parent package to resolve against and fail.

---

## Health Check Verification

```bash
# Local test (from server)
curl http://localhost:8010/health

# External test
curl https://bronhouder.nl/api/rag/health
```

Expected response:

```json
{
  "status": "healthy",
  "backends": {
    "qdrant": {"status": "connected", "collections": {...}},
    "sparql": {"status": "connected", "triples": 30421},
    "typedb": {"status": "connected", "observations": 27741}
  }
}
```

---

## Troubleshooting

### Container Won't Start

```bash
# Check systemd logs
journalctl -u glam-rag-api -n 50

# Check container logs directly
podman logs glam-rag-api
```

### Import Errors

If you see `ModuleNotFoundError` or `ImportError`:

1. Check imports use absolute style (not relative)
2. Verify all dependencies in `requirements.txt`
3. Rebuild image: `podman build -t glam-rag-api:latest .`

### Backend Connection Issues

Container uses `--network host`, so backends must be on localhost:

- Qdrant: `localhost:6333`
- TypeDB: `localhost:1729`
- Oxigraph: `localhost:7878`

Check backend services:

```bash
systemctl status qdrant typedb oxigraph
```

---

## Migration from venv (Historical)

The old venv-based deployment has been deprecated:

| Old (Deprecated) | New (Current) |
|------------------|---------------|
| `/var/lib/glam/api/backend/rag/` | `/opt/glam-backend/rag/` |
| `glam-rag-api.service` (venv) | `glam-rag-api.service` (Podman) |
| Manual pip install | Dockerfile-based |
| rsync + systemd restart | rsync + podman build + restart |

The old service and venv can be removed:

```bash
# Already done - for reference only
systemctl stop glam-rag-api-venv  # if exists
systemctl disable glam-rag-api-venv
rm /etc/systemd/system/glam-rag-api-venv.service
rm -rf /var/lib/glam/api/backend/rag/venv
```

---

## See Also

- Rule 7: Deployment is LOCAL via SSH/rsync (NO CI/CD)
- `backend/rag/README.md` - RAG API documentation
- `infrastructure/deploy.sh` - Full deployment script
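
---

## Appendix: Reproducing the Import Failure

The import rule in "Import Style for Container Deployment" can be reproduced without a container. The sketch below is illustrative only and is not part of the RAG API: it assumes gunicorn resolves `main:app` by importing the named module as a top-level module, and it uses throwaway files (`demo_main.py`, a stub `provenance.py`) and a hypothetical helper `try_import` to mimic that with `importlib`.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def try_import(style: str) -> str:
    """Import a throwaway top-level module whose only line uses `style`.

    `style` is the import prefix under test, e.g. "from provenance"
    (absolute) or "from .provenance" (relative).
    """
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Stub standing in for the real provenance module (hypothetical content).
        (root / "provenance.py").write_text("class ProvenanceTracker:\n    pass\n")
        (root / "demo_main.py").write_text(f"{style} import ProvenanceTracker\n")

        # Put the directory on sys.path so demo_main is importable by bare name,
        # i.e. as a top-level module with no parent package.
        sys.path.insert(0, tmp)
        try:
            importlib.import_module("demo_main")
            return "ok"
        except ImportError as exc:
            return f"ImportError: {exc}"
        finally:
            # Undo all global state so repeated calls start clean.
            sys.path.remove(tmp)
            sys.modules.pop("demo_main", None)
            sys.modules.pop("provenance", None)
            importlib.invalidate_caches()


if __name__ == "__main__":
    for style in ("from provenance", "from .provenance"):
        print(f"{style!r}: {try_import(style)}")
```

Under these assumptions the absolute form imports cleanly, while the relative form raises an `ImportError` because a top-level module has no parent package to resolve `.provenance` against.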