# Rule 54: RAG API Podman Containerization

🚨 **CRITICAL**: The GLAM RAG API MUST be deployed as a Podman container, NOT run from a server-side venv synced with rsync. This avoids Python import inconsistencies between local development and production.

---

## Why Podman (Not venv)

### The Problem

Python import behavior differs between local development and gunicorn:

| Context | Import Style | Works? |
|---------|--------------|--------|
| Local development (`uvicorn main:app`) | `from .provenance import` | ✅ |
| Production (`gunicorn main:app`) | `from .provenance import` | ❌ |
| Production (`gunicorn main:app`) | `from provenance import` | ✅ |

When code is synced via rsync to a server venv, the import style that works locally may fail in production under gunicorn.

### The Solution

Containerization with Podman ensures:

1. **Consistent import resolution** - Same Python environment locally and in production
2. **Isolation** - RAG API dependencies don't conflict with other services
3. **Reproducibility** - Dockerfile defines the exact environment
4. **Non-root execution** - The API process runs as a non-root user inside the container

---

## Deployment Architecture

```
┌─────────────────────────────────────────────────────────────┐
│  Server: 91.98.224.44 (bronhouder.nl)                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   ┌─────────────────────────────────────────────────────┐   │
│   │  Podman Container: glam-rag-api                     │   │
│   │                                                     │   │
│   │  - python:3.11-slim base                            │   │
│   │  - gunicorn + uvicorn workers                       │   │
│   │  - Port 8010 (host network mode)                    │   │
│   │  - Non-root user (glam:1000)                        │   │
│   │                                                     │   │
│   │  Connects to (all on localhost):                    │   │
│   │  - Qdrant          :6333                            │   │
│   │  - Oxigraph SPARQL :7878                            │   │
│   │  - TypeDB          :1729                            │   │
│   │  - PostGIS         :5432                            │   │
│   │  - Valkey          :8090                            │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│   ┌─────────────────────────────────────────────────────┐   │
│   │  Caddy Reverse Proxy                                │   │
│   │  https://bronhouder.nl/api/rag/* → :8010            │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

---

## Key Files

| File | Purpose |
|------|---------|
| `backend/rag/Dockerfile` | Container image definition |
| `backend/rag/requirements.txt` | Python dependencies (includes gunicorn) |
| `backend/rag/main.py` | FastAPI application |
| `infrastructure/deploy.sh` | Deployment script (`--rag` flag) |

---

## Deployment Commands

### Deploy RAG API

```bash
# From project root - deploys via Podman
./infrastructure/deploy.sh --rag
```

This will:

1. Sync `backend/rag/` to `/opt/glam-backend/rag/` on server
2. Build Podman image `glam-rag-api:latest`
3. Create/update systemd service `glam-rag-api.service`
4. Start the container

### Manual Operations (on server)

```bash
# Check service status
systemctl status glam-rag-api

# View container logs
podman logs glam-rag-api

# Restart service
systemctl restart glam-rag-api

# Rebuild image manually
cd /opt/glam-backend/rag
podman build -t glam-rag-api:latest .

# Clean up old images
podman image prune -f
```

---

## Systemd Service Configuration

The service file is created by `deploy.sh` at `/etc/systemd/system/glam-rag-api.service`:

```ini
[Unit]
Description=GLAM Heritage RAG API (Podman)
After=network.target qdrant.service
Wants=qdrant.service

[Service]
Type=simple
Restart=always
RestartSec=10
EnvironmentFile=/var/lib/glam/.env

ExecStart=/usr/bin/podman run --rm --name glam-rag-api \
    --network host \
    -e OPENAI_API_KEY \
    -e ZAI_API_TOKEN \
    -e QDRANT_HOST=localhost \
    -e QDRANT_PORT=6333 \
    -e QDRANT_COLLECTION=heritage_custodians_minilm \
    -e EMBEDDING_MODEL=all-MiniLM-L6-v2 \
    -e EMBEDDING_DIM=384 \
    -e TYPEDB_HOST=localhost \
    -e TYPEDB_PORT=1729 \
    -e TYPEDB_DATABASE=glam \
    -e SPARQL_ENDPOINT=http://localhost:7878/query \
    -e VALKEY_CACHE_URL=http://localhost:8090 \
    -e POSTGIS_HOST=localhost \
    -e POSTGIS_PORT=5432 \
    -e POSTGIS_DATABASE=glam \
    -e LLM_PROVIDER=openai \
    -e LLM_MODEL=gpt-4.1-mini \
    -v glam-rag-optimized-models:/app/optimized_models:z \
    glam-rag-api:latest

ExecStop=/usr/bin/podman stop glam-rag-api

[Install]
WantedBy=multi-user.target
```
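
The unit forwards `OPENAI_API_KEY` and `ZAI_API_TOKEN` with value-less `-e NAME` flags, which makes Podman copy them from the service's own environment, so both must be defined in the `EnvironmentFile`. A minimal Python sketch of a pre-flight check, assuming the file uses plain `KEY=value` lines; the helper names `parse_env_text` and `missing_keys` are hypothetical and not part of `deploy.sh`:

```python
from pathlib import Path

# Keys the unit forwards via value-less -e flags
REQUIRED_KEYS = ("OPENAI_API_KEY", "ZAI_API_TOKEN")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and '#' comments."""
    env: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def missing_keys(env: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env_path = Path("/var/lib/glam/.env")  # path taken from the unit file
    if env_path.exists():
        print("missing:", missing_keys(parse_env_text(env_path.read_text())))
```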

**Key Configuration**:

- `--network host`: Container uses host networking (accesses localhost services directly)
- `EnvironmentFile`: Loads API keys from `/var/lib/glam/.env`; the value-less `-e OPENAI_API_KEY` and `-e ZAI_API_TOKEN` flags forward them into the container
- `-v ...:/app/optimized_models:z`: Persistent volume for DSPy optimized models (`:z` applies a shared SELinux label)
- `Restart=always`: systemd restarts the container whenever it exits, after waiting `RestartSec=10` seconds

---

## Dockerfile Structure

The Dockerfile at `backend/rag/Dockerfile`:

```dockerfile
FROM python:3.11-slim

# Non-root user for security
RUN useradd -m -u 1000 -s /bin/bash glam

# Work in /app so the code sits next to the /app/optimized_models volume mount
WORKDIR /app

# Install dependencies first (layer caching)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY --chown=glam:glam . .

USER glam

# Gunicorn with uvicorn workers for async
CMD ["gunicorn", "main:app", \
     "--bind", "0.0.0.0:8010", \
     "--workers", "2", \
     "--worker-class", "uvicorn.workers.UvicornWorker", \
     "--timeout", "120"]
```
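
Gunicorn can also read these settings from a Python config file; it picks up `gunicorn.conf.py` from the working directory by default, which would shorten the `CMD` to `["gunicorn", "main:app"]`. A sketch of an equivalent file, assuming it sits next to `main.py`. This file is hypothetical and not part of the repository layout described here; the values simply mirror the `CMD` flags:

```python
# gunicorn.conf.py (hypothetical file; values mirror the CMD flags)

# Listen on all interfaces, port 8010 (reached by Caddy via host networking)
bind = "0.0.0.0:8010"

# Two worker processes, each running the ASGI app through uvicorn
workers = 2
worker_class = "uvicorn.workers.UvicornWorker"

# Seconds a worker may stay silent before gunicorn restarts it
timeout = 120
```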

---

## Import Style for Container Deployment

**CRITICAL**: Use absolute imports in RAG API Python files:

```python
# CORRECT - Works in container with gunicorn
from provenance import ProvenanceTracker, format_provenance_chain

# WRONG - Fails with gunicorn
from .provenance import ProvenanceTracker, format_provenance_chain
```
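
The failure can be reproduced without gunicorn. A minimal sketch, assuming a throwaway directory with hypothetical stand-in modules (`app_relative`, `app_absolute`, and a stub `provenance.py`): importing a module by its top-level name, as `gunicorn main:app` does for `main`, leaves it without a parent package, so the relative form raises `ImportError` while the absolute form resolves through `sys.path`:

```python
import importlib
import sys
import tempfile
from pathlib import Path


def try_import(module_name: str, source: str) -> str:
    """Write `source` as <module_name>.py beside a stub provenance.py and import it."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "provenance.py").write_text("class ProvenanceTracker: ...\n")
        Path(tmp, f"{module_name}.py").write_text(source)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            # Top-level import by name, like gunicorn resolving "main:app"
            importlib.import_module(module_name)
            return "ok"
        except ImportError as exc:
            return f"ImportError: {exc}"
        finally:
            # Leave the interpreter state as we found it
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
            sys.modules.pop("provenance", None)


relative_result = try_import("app_relative", "from .provenance import ProvenanceTracker\n")
absolute_result = try_import("app_absolute", "from provenance import ProvenanceTracker\n")
print(relative_result)
print(absolute_result)
```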

Relative imports fail under `gunicorn main:app` because gunicorn imports `main` as a top-level module rather than as part of a package, so `from .provenance import ...` has no parent package to resolve against.

---

## Health Check Verification

```bash
# Local test (from server)
curl http://localhost:8010/health

# External test
curl https://bronhouder.nl/api/rag/health
```

Expected response:

```json
{
  "status": "healthy",
  "backends": {
    "qdrant": {"status": "connected", "collections": {...}},
    "sparql": {"status": "connected", "triples": 30421},
    "typedb": {"status": "connected", "observations": 27741}
  }
}
```
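
A deployment script can check this payload automatically instead of reading it by eye. A hedged sketch using only the standard library; the helper names are hypothetical, and it assumes every entry under `backends` carries a `status` field whose value is `connected` when healthy, as in the response above:

```python
import json
import urllib.request


def unhealthy_backends(payload: dict) -> list[str]:
    """Return the names of backends whose status is not 'connected'."""
    backends = payload.get("backends", {})
    return sorted(
        name for name, info in backends.items()
        if info.get("status") != "connected"
    )


def fetch_health(url: str = "http://localhost:8010/health", timeout: float = 10.0) -> dict:
    """GET the health endpoint and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Offline example; on the server, pass fetch_health() instead of `sample`
sample = {
    "status": "healthy",
    "backends": {
        "qdrant": {"status": "connected"},
        "sparql": {"status": "connected", "triples": 30421},
        "typedb": {"status": "unreachable"},  # made-up failure value for illustration
    },
}
print(unhealthy_backends(sample))
```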

---

## Troubleshooting

### Container Won't Start

```bash
# Check systemd logs (these persist after the container exits)
journalctl -u glam-rag-api -n 50

# Check container logs directly (only while the container exists; the unit runs it with --rm)
podman logs glam-rag-api
```

### Import Errors

If you see `ModuleNotFoundError` or `ImportError`:

1. Check that imports use the absolute style (not relative)
2. Verify that all dependencies are listed in `requirements.txt`
3. Rebuild the image with `podman build -t glam-rag-api:latest .`, then restart the service with `systemctl restart glam-rag-api`

### Backend Connection Issues

The container uses `--network host`, so backends must be reachable on localhost:

- Qdrant: `localhost:6333`
- TypeDB: `localhost:1729`
- Oxigraph: `localhost:7878`
- PostGIS: `localhost:5432`
- Valkey: `localhost:8090`

Check backend services:

```bash
systemctl status qdrant typedb oxigraph
```
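
When `systemctl status` reports a service as active but the API still cannot connect, probing the ports directly narrows things down. A small sketch, assuming the ports listed in the deployment architecture section; `port_open` is a hypothetical helper:

```python
import socket

# Ports as listed in the deployment architecture section
BACKEND_PORTS = {
    "qdrant": 6333,
    "oxigraph": 7878,
    "typedb": 1729,
    "postgis": 5432,
    "valkey": 8090,
}


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False


if __name__ == "__main__":
    for name, port in BACKEND_PORTS.items():
        state = "open" if port_open("localhost", port) else "CLOSED"
        print(f"{name:<10} :{port:<5} {state}")
```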

---

## Migration from venv (Historical)

The old venv-based deployment has been deprecated:

| Old (Deprecated) | New (Current) |
|------------------|---------------|
| `/var/lib/glam/api/backend/rag/` | `/opt/glam-backend/rag/` |
| `glam-rag-api.service` (venv) | `glam-rag-api.service` (Podman) |
| Manual pip install | Dockerfile-based |
| rsync + systemd restart | rsync + podman build + restart |

The old service and venv can be removed:

```bash
# Already done - for reference only
systemctl stop glam-rag-api-venv  # if exists
systemctl disable glam-rag-api-venv
rm /etc/systemd/system/glam-rag-api-venv.service
systemctl daemon-reload  # make systemd forget the removed unit
rm -rf /var/lib/glam/api/backend/rag/venv
```

---

## See Also

- Rule 7: Deployment is LOCAL via SSH/rsync (NO CI/CD)
- `backend/rag/README.md` - RAG API documentation
- `infrastructure/deploy.sh` - Full deployment script