# Rule 54: RAG API Podman Containerization

🚨 **CRITICAL**: The GLAM RAG API MUST be deployed as a Podman container, NOT by rsyncing code into a server-side venv. Containerization keeps Python import behavior consistent between local development and production.

---

## Why Podman (Not venv)

### The Problem

Python import behavior differs between local development and gunicorn:

| Context | Import Style | Works? |
|---------|--------------|--------|
| Local development (`uvicorn main:app`) | `from .provenance import` | ✅ |
| Production (`gunicorn main:app`) | `from .provenance import` | ❌ |
| Production (`gunicorn main:app`) | `from provenance import` | ✅ |

When syncing code via rsync to a server venv, the import style that works locally may fail in production with gunicorn.

### The Solution

Containerization with Podman ensures:
1. **Consistent import resolution** - Same Python environment locally and in production
2. **Isolation** - RAG API dependencies don't conflict with other services
3. **Reproducibility** - Dockerfile defines exact environment
4. **Non-root process** - The application runs as the non-root user `glam` inside the container

---

## Deployment Architecture

```
┌─────────────────────────────────────────────────────────────┐
│  Server: 91.98.224.44 (bronhouder.nl)                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  Podman Container: glam-rag-api                     │   │
│  │                                                     │   │
│  │  - python:3.11-slim base                            │   │
│  │  - gunicorn + uvicorn workers                       │   │
│  │  - Port 8010 (host network mode)                    │   │
│  │  - Non-root user (glam:1000)                        │   │
│  │                                                     │   │
│  │  Connects to (all on localhost):                    │   │
│  │  - Qdrant :6333                                     │   │
│  │  - Oxigraph SPARQL :7878                            │   │
│  │  - TypeDB :1729                                     │   │
│  │  - PostGIS :5432                                    │   │
│  │  - Valkey :8090                                     │   │
│  └─────────────────────────────────────────────────────┘   │
│                                                             │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  Caddy Reverse Proxy                                │   │
│  │  https://bronhouder.nl/api/rag/* → :8010            │   │
│  └─────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

---

## Key Files

| File | Purpose |
|------|---------|
| `backend/rag/Dockerfile` | Container image definition |
| `backend/rag/requirements.txt` | Python dependencies (includes gunicorn) |
| `backend/rag/main.py` | FastAPI application |
| `infrastructure/deploy.sh` | Deployment script (`--rag` flag) |

---

## Deployment Commands

### Deploy RAG API

```bash
# From project root - deploys via Podman
./infrastructure/deploy.sh --rag
```

This will:
1. Sync `backend/rag/` to `/opt/glam-backend/rag/` on server
2. Build Podman image `glam-rag-api:latest`
3. Create/update systemd service `glam-rag-api.service`
4. Start the container
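
As a rough illustration of these four steps, the Python sketch below builds the commands such a deployment could run, without executing any of them. The real `deploy.sh` is not reproduced here: the function name `plan_rag_deploy`, the SSH target, and the rsync flags are assumptions, and writing the unit file itself is left out.

```python
import shlex

IMAGE = "glam-rag-api:latest"
SERVICE = "glam-rag-api"
REMOTE_DIR = "/opt/glam-backend/rag/"


def plan_rag_deploy(ssh_target: str) -> list[list[str]]:
    """Return the deploy commands as argv lists (nothing is executed)."""

    def remote(command: str) -> list[str]:
        # Run a shell command on the server over SSH.
        return ["ssh", ssh_target, command]

    return [
        # 1. Sync the source tree to the server (flags are an assumption).
        ["rsync", "-az", "--delete", "backend/rag/", f"{ssh_target}:{REMOTE_DIR}"],
        # 2. Build the image from the synced Dockerfile.
        remote(f"cd {shlex.quote(REMOTE_DIR)} && podman build -t {IMAGE} ."),
        # 3. Make systemd pick up a created or changed unit file.
        remote("systemctl daemon-reload"),
        # 4. Start (or restart) the container through systemd.
        remote(f"systemctl restart {SERVICE}"),
    ]


if __name__ == "__main__":
    # Hypothetical SSH target; prints the plan instead of running it.
    for argv in plan_rag_deploy("user@bronhouder.nl"):
        print(shlex.join(argv))
```

Printing the plan first makes it easy to compare against what `deploy.sh --rag` actually does before trusting either.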

### Manual Operations (on server)

```bash
# Check service status
systemctl status glam-rag-api

# View container logs
podman logs glam-rag-api

# Restart service
systemctl restart glam-rag-api

# Rebuild image manually
cd /opt/glam-backend/rag
podman build -t glam-rag-api:latest .

# Clean up old images
podman image prune -f
```

---

## Systemd Service Configuration

The service file is created by `deploy.sh` at `/etc/systemd/system/glam-rag-api.service`:

```ini
[Unit]
Description=GLAM Heritage RAG API (Podman)
After=network.target qdrant.service
Wants=qdrant.service

[Service]
Type=simple
Restart=always
RestartSec=10
EnvironmentFile=/var/lib/glam/.env

ExecStart=/usr/bin/podman run --rm --name glam-rag-api \
  --network host \
  -e OPENAI_API_KEY \
  -e ZAI_API_TOKEN \
  -e QDRANT_HOST=localhost \
  -e QDRANT_PORT=6333 \
  -e QDRANT_COLLECTION=heritage_custodians_minilm \
  -e EMBEDDING_MODEL=all-MiniLM-L6-v2 \
  -e EMBEDDING_DIM=384 \
  -e TYPEDB_HOST=localhost \
  -e TYPEDB_PORT=1729 \
  -e TYPEDB_DATABASE=glam \
  -e SPARQL_ENDPOINT=http://localhost:7878/query \
  -e VALKEY_CACHE_URL=http://localhost:8090 \
  -e POSTGIS_HOST=localhost \
  -e POSTGIS_PORT=5432 \
  -e POSTGIS_DATABASE=glam \
  -e LLM_PROVIDER=openai \
  -e LLM_MODEL=gpt-4.1-mini \
  -v glam-rag-optimized-models:/app/optimized_models:z \
  glam-rag-api:latest

ExecStop=/usr/bin/podman stop glam-rag-api

[Install]
WantedBy=multi-user.target
```

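**Key Configuration**:
- `--network host`: The container shares the host's network stack, so it reaches the backends on localhost and gunicorn's port 8010 is bound directly on the host
- `EnvironmentFile`: Loads API keys from `/var/lib/glam/.env` into the service environment; `-e OPENAI_API_KEY` and `-e ZAI_API_TOKEN` (given without a value) forward them into the container
- `-v ...:/app/optimized_models:z`: Named volume that keeps DSPy optimized models across container restarts and image rebuilds; `:z` applies a shared SELinux label
- `Restart=always`: systemd restarts the service whenever it exits, not only on failure, after a 10-second delay (`RestartSec=10`)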
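
Because the API keys only reach the container if they are present in the environment file, a quick pre-flight check can save a restart loop. The sketch below is a simplified illustration: the helper names `parse_env_file` and `missing_keys` are hypothetical, and the parser handles plain `KEY=VALUE` lines only, not the full systemd quoting rules.

```python
def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of simple quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str, required=("OPENAI_API_KEY", "ZAI_API_TOKEN")) -> list[str]:
    """Return the required keys that are absent or empty in the file text."""
    env = parse_env_file(text)
    return [key for key in required if not env.get(key)]
```

On the server this could be called as `missing_keys(open("/var/lib/glam/.env").read())`; an empty list means both keys are set.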

---

## Dockerfile Structure

The Dockerfile at `backend/rag/Dockerfile`:

```dockerfile
FROM python:3.11-slim

# Non-root user for security
RUN useradd -m -u 1000 -s /bin/bash glam

# Work in /app so the code sits next to the /app/optimized_models volume
WORKDIR /app

# Install dependencies first (layer caching)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY --chown=glam:glam . .

USER glam

# Gunicorn with uvicorn workers for async
CMD ["gunicorn", "main:app", \
     "--bind", "0.0.0.0:8010", \
     "--workers", "2", \
     "--worker-class", "uvicorn.workers.UvicornWorker", \
     "--timeout", "120"]
```

---

## Import Style for Container Deployment

**CRITICAL**: Use absolute imports in RAG API Python files:

```python
# CORRECT - Works in container with gunicorn
from provenance import ProvenanceTracker, format_provenance_chain

# WRONG - Fails with gunicorn
from .provenance import ProvenanceTracker, format_provenance_chain
```

Relative imports fail because `gunicorn main:app` imports `main.py` as a top-level module rather than as part of a package, so Python has no parent package to resolve `.provenance` against.
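
The failure can be reproduced without gunicorn. The following sketch creates a throwaway directory and imports the same file twice, once as a top-level module and once as part of a package. All names in it (`rag_demo`, `main_demo`, `provenance_demo`, `try_import`, `demo`) are made up for the illustration and are not part of the RAG API.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def try_import(module_name: str, search_path: Path) -> str:
    """Import module_name with search_path on sys.path and report the outcome."""
    sys.path.insert(0, str(search_path))
    importlib.invalidate_caches()
    try:
        importlib.import_module(module_name)
        return "ok"
    except ImportError as exc:
        return f"ImportError: {exc}"
    finally:
        sys.path.remove(str(search_path))
        # Forget the demo modules so each attempt starts clean.
        root = module_name.split(".")[0]
        for name in [n for n in sys.modules if n == root or n.startswith(root + ".")]:
            del sys.modules[name]


def demo() -> dict[str, str]:
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "rag_demo"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "provenance_demo.py").write_text("VALUE = 1\n")
        (pkg / "main_demo.py").write_text("from .provenance_demo import VALUE\n")
        return {
            # Comparable to `gunicorn main:app` started inside the directory.
            "top_level": try_import("main_demo", pkg),
            # Comparable to loading the file through its parent package.
            "as_package": try_import("rag_demo.main_demo", Path(tmp)),
        }


if __name__ == "__main__":
    for mode, outcome in demo().items():
        print(f"{mode}: {outcome}")
```

The top-level import raises `ImportError`, while the package import succeeds. This is why the absolute form is required when the app is started as `main:app`.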

---

## Health Check Verification

```bash
# Local test (from server)
curl http://localhost:8010/health

# External test
curl https://bronhouder.nl/api/rag/health
```

Expected response:
```json
{
  "status": "healthy",
  "backends": {
    "qdrant": {"status": "connected", "collections": {...}},
    "sparql": {"status": "connected", "triples": 30421},
    "typedb": {"status": "connected", "observations": 27741}
  }
}
```
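
For scripted checks, the response can be evaluated in a few lines of Python. This is a minimal sketch that assumes the payload shape shown above; the helper names are hypothetical, and treating any status other than `"connected"` as a failure is an assumption about the API.

```python
import json
from urllib.request import urlopen


def disconnected_backends(payload: dict) -> list[str]:
    """Names of backends whose status is anything other than "connected"."""
    backends = payload.get("backends", {})
    return sorted(
        name for name, info in backends.items()
        if info.get("status") != "connected"
    )


def is_healthy(payload: dict) -> bool:
    """True if the API reports healthy and every listed backend is connected."""
    return payload.get("status") == "healthy" and not disconnected_backends(payload)


def fetch_health(url: str = "http://localhost:8010/health", timeout: float = 5.0) -> dict:
    """Fetch and decode the health endpoint (requires the service to be running)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

On the server, `is_healthy(fetch_health())` gives a single boolean, and `disconnected_backends(fetch_health())` names the backends to investigate.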

---

## Troubleshooting

### Container Won't Start

```bash
# Check systemd logs
journalctl -u glam-rag-api -n 50

# Check container logs directly (only while the container exists;
# --rm deletes it on exit, so prefer journalctl after a crash)
podman logs glam-rag-api
```

### Import Errors

If you see `ModuleNotFoundError` or `ImportError`:
1. Check that imports use the absolute style (not relative)
2. Verify that all dependencies are listed in `requirements.txt`
3. Rebuild the image with `podman build -t glam-rag-api:latest .`, then restart the service with `systemctl restart glam-rag-api`

### Backend Connection Issues

Because the container uses `--network host`, every backend must be reachable on localhost:
- Qdrant: `localhost:6333`
- TypeDB: `localhost:1729`
- Oxigraph: `localhost:7878`
- PostGIS: `localhost:5432`
- Valkey: `localhost:8090`

Check backend services:
```bash
systemctl status qdrant typedb oxigraph
```
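
A service can be active in systemd without accepting connections yet, so probing the ports directly is a useful second check. The sketch below uses a plain TCP connect; the port numbers come from the architecture diagram, while the function names and the one-second timeout are assumptions.

```python
import socket

# Backend ports as listed in the deployment architecture.
BACKEND_PORTS = {
    "qdrant": 6333,
    "oxigraph": 7878,
    "typedb": 1729,
    "postgis": 5432,
    "valkey": 8090,
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_backends(host: str = "localhost", ports: dict[str, int] = BACKEND_PORTS) -> dict[str, bool]:
    """Map each backend name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, reachable in check_backends().items():
        print(f"{name}: {'reachable' if reachable else 'NOT reachable'}")
```

A successful connect only shows that something is listening on the port; it does not verify that the backend answers queries correctly.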

---

## Migration from venv (Historical)

The old venv-based deployment has been deprecated:

| Old (Deprecated) | New (Current) |
|------------------|---------------|
| `/var/lib/glam/api/backend/rag/` | `/opt/glam-backend/rag/` |
| `glam-rag-api.service` (venv) | `glam-rag-api.service` (Podman) |
| Manual pip install | Dockerfile-based |
| rsync + systemd restart | rsync + podman build + restart |

The old service and venv have already been removed. The commands are kept here for reference:
```bash
# Already done - for reference only
systemctl stop glam-rag-api-venv  # if exists
systemctl disable glam-rag-api-venv
rm /etc/systemd/system/glam-rag-api-venv.service
rm -rf /var/lib/glam/api/backend/rag/venv
```

---

## See Also

- Rule 7: Deployment is LOCAL via SSH/rsync (NO CI/CD)
- `backend/rag/README.md` - RAG API documentation
- `infrastructure/deploy.sh` - Full deployment script