# Multi-Database Architecture for bronhouder.nl

This document describes the multi-database architecture implemented for the GLAM data platform at bronhouder.nl.

## Overview

The Database page (`/database`) provides a unified interface for exploring heritage custodian data across four database systems, each optimized for different use cases:

```
┌─────────────────────────────────────────────────────────────────┐
│                     bronhouder.nl/database                      │
├─────────────────────────────────────────────────────────────────┤
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐         │
│  │ 🦆 DuckDB│  │🐘Postgres│  │ 🧠 TypeDB│  │🔗Oxigraph│         │
│  │ (Browser)│  │ (Server) │  │ (Server) │  │ (Server) │         │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘         │
│       │             │             │             │               │
│   In-Browser    REST API      REST API       SPARQL             │
│     WASM          Proxy         Proxy       Endpoint            │
└───────┼─────────────┼─────────────┼─────────────┼───────────────┘
        │             │             │             │
        ▼             ▼             ▼             ▼
   ┌─────────┐   ┌──────────┐  ┌──────────┐  ┌──────────┐
   │ Browser │   │PostgreSQL│  │  TypeDB  │  │ Oxigraph │
   │ Memory  │   │ Database │  │ Database │  │  Store   │
   └─────────┘   └──────────┘  └──────────┘  └──────────┘
```

## Database Systems

### 1. DuckDB (In-Browser OLAP)

**Status**: ✅ Fully Operational

**Technology**: DuckDB-WASM running entirely in the browser

**Use Cases**:
- Ad-hoc SQL analytics on heritage institution data
- Fast aggregations and filtering
- Data exploration without server round-trips

**Data Source**: `/data/nde_institutions.json` (10.8 MB, 1,863 institutions)

**Features**:
- Upload JSON/CSV/Parquet files directly
- Run SQL queries in-browser
- No server dependency
- Export query results

**Hook**: `frontend/src/hooks/useDuckDB.ts`

**Panel**: `frontend/src/components/database/DuckDBPanel.tsx`

### 2. PostgreSQL (Relational)

**Status**: ✅ Fully Operational (as of 2025-12-06)

**Technology**: PostgreSQL 16.11 with FastAPI REST proxy

**Endpoint**: `https://bronhouder.nl/api/postgres`

**Use Cases**:
- Complex relational queries
- Full-text search on institution names
- Transactional operations
- Integration with existing tools

**Data**: 1,838 NDE heritage institutions with:
- 32 columns including coordinates, ratings, reviews
- GHCID identifiers (text, UUID, numeric)
- JSONB fields for wikidata_types, reviews, identifiers, genealogiewerkbalk

**API Endpoints**:
- `GET /` - Health check and statistics
- `POST /query` - Execute SQL query (read-only SELECT/WITH)
- `GET /tables` - List all tables with metadata
- `GET /schema/{table}` - Get table schema
- `GET /stats` - Detailed database statistics

**Backend**: `/opt/glam-backend/postgres/` on server
- `main.py` - FastAPI application
- `load_nde_data.py` - Data loading script
- Systemd service: `glam-postgres-api.service`

**Frontend Configuration**: The hook expects the REST API at `/api/postgres` or at the URL set in `VITE_POSTGRES_API_URL`

**Planned Features**:
- Table/schema browser
- SQL query interface
- Query history
- Export to CSV

**Hook**: `frontend/src/hooks/usePostgreSQL.ts`

**Panel**: `frontend/src/components/database/PostgreSQLPanel.tsx`

### 3. Oxigraph (RDF/SPARQL)

**Status**: ✅ Fully Operational

**Technology**: Oxigraph SPARQL endpoint on server

**Use Cases**:
- Linked Data queries
- Ontology exploration
- Cross-referencing with Wikidata, Schema.org
- Semantic reasoning

**Endpoint**: `https://bronhouder.nl/sparql` (proxied to 91.98.224.44:7878)

**Triple Count**: 426,243 triples

**Features**:
- SPARQL 1.1 query interface
- Graph exploration
- Namespace prefix management
- RDF upload (Turtle, N-Triples, JSON-LD)

**Hook**: `frontend/src/hooks/useOxigraph.ts`

**Panel**: `frontend/src/components/database/OxigraphPanel.tsx`

### 4. TypeDB (Knowledge Graph)

**Status**: ⏳ Deferred - Server has only 3.7GB RAM, TypeDB requires 4GB+

**Technology**: TypeDB with REST API proxy

**Use Cases**:
- Complex knowledge graph queries
- Multi-hop relationship traversal
- Temporal reasoning (organizational changes)
- Entity resolution

**Note**: To enable TypeDB, upgrade server to cx32 (8GB RAM) or higher.

**Hook**: `frontend/src/hooks/useTypeDB.ts`

**Panel**: `frontend/src/components/database/TypeDBPanel.tsx`

## Frontend Components

### Database Page (`/database`)

The main Database page provides:

1. **Tab Navigation**: Switch between database views
2. **All Databases Overview**: Comparison grid with status indicators
3. **Individual Database Panels**: Full-featured interface for each system

### Component Structure

```
frontend/src/
├── pages/
│   ├── Database.tsx           # Main page with tab navigation
│   └── Database.css           # Styles for all database components
├── hooks/
│   ├── useDuckDB.ts           # DuckDB-WASM hook
│   ├── useOxigraph.ts         # Oxigraph SPARQL hook
│   ├── usePostgreSQL.ts       # PostgreSQL REST hook
│   └── useTypeDB.ts           # TypeDB REST hook
└── components/database/
    ├── DuckDBPanel.tsx        # DuckDB interface
    ├── OxigraphPanel.tsx      # Oxigraph interface
    ├── PostgreSQLPanel.tsx    # PostgreSQL interface
    ├── TypeDBPanel.tsx        # TypeDB interface
    └── index.ts               # Exports
```

## Data Flow

### NDE Institution Data

```
YAML Files (data/nde/enriched/entries/)
    │
    ├── scripts/export_nde_for_duckdb.py
    │   └── frontend/public/data/nde_institutions.json (DuckDB)
    │
    ├── scripts/nde_to_hc_rdf.py
    │   └── data/nde/rdf/*.ttl → Oxigraph
    │
    └── [Future] scripts/nde_to_typedb.py
        └── TypeDB
```

### LinkML Schema

```
schemas/20251121/linkml/
├── 01_custodian_name_modular.yaml   # Main ontology schema
├── nde_enriched_entry.yaml          # NDE entry schema
└── modules/                         # Modular components
```

## API Contracts

### PostgreSQL REST API (Required)

```typescript
// Expected endpoints
POST /api/postgres/query
{ "sql": "SELECT * FROM institutions LIMIT 10" }
// Response: { rows: [...], columns: [...] }

GET /api/postgres/tables
// Response: { tables: [...] }

GET /api/postgres/schema/:table
// Response: { columns: [...] }
```

### TypeDB REST API (Required)

```typescript
// Expected endpoints
POST /api/typedb/query
{ "query": "match $x isa institution; get $x; limit 10;" }
// Response: { results: [...] }

GET /api/typedb/schema
// Response: { entity_types: [...], relation_types: [...] }

GET /api/typedb/entity-types
// Response: { types: [...] }
```

## Environment Variables

```bash
# Frontend (.env)
VITE_SPARQL_ENDPOINT=https://bronhouder.nl/sparql
VITE_POSTGRES_API_URL=https://bronhouder.nl/api/postgres
VITE_TYPEDB_API_URL=https://bronhouder.nl/api/typedb
```

## Server Infrastructure

```
Server: 91.98.224.44 (Hetzner cx22)
├── Caddy (reverse proxy)
│   ├── /        → /var/www/glam-frontend/
│   ├── /sparql  → localhost:7878 (Oxigraph)
│   └── /api     → localhost:8000 (FastAPI)
├── Oxigraph (port 7878)
├── GLAM API (port 8000)
├── PostgreSQL 16.11 with FastAPI proxy (glam-postgres-api.service)
└── [Future] TypeDB
```

## Deployment

```bash
# Deploy frontend only
./infrastructure/deploy.sh --frontend

# Check status
./infrastructure/deploy.sh --status

# Deploy everything
./infrastructure/deploy.sh --all
```

## Next Steps

1. **TypeDB Backend**: Create FastAPI endpoints for TypeDB queries (blocked until the server upgrade noted in the TypeDB section)
2. **Data Sync**: Implement the data loading script for TypeDB (`scripts/nde_to_typedb.py`); PostgreSQL is already loaded by `load_nde_data.py`
3. **Query Builder**: Add visual query builder for non-technical users
4. **Export**: Enable data export in multiple formats

## Related Documentation

- [AGENTS.md](../AGENTS.md) - AI agent instructions
- [DEPLOYMENT_GUIDE.md](./DEPLOYMENT_GUIDE.md) - Deployment procedures
- [SCHEMA_MODULES.md](./SCHEMA_MODULES.md) - LinkML schema architecture
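
## Appendix: Client-Side Query Sketch

The PostgreSQL contract under API Contracts accepts `{ "sql": ... }` on `POST /api/postgres/query` and is described as read-only (SELECT/WITH). The sketch below shows one way a frontend caller might pre-check a statement and assemble the request. It is not the actual `usePostgreSQL.ts` implementation: the names `isReadOnlyQuery`, `buildQueryRequest`, and `QueryRequest` are hypothetical, and the base URL is hard-coded here where the real frontend would presumably read `VITE_POSTGRES_API_URL`.

```typescript
// Hypothetical helpers; only the endpoint path and the { sql } body shape
// come from the API contract documented above.
const POSTGRES_API_URL = "https://bronhouder.nl/api/postgres";

interface QueryRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// True when the statement, after leading SQL comments and whitespace are
// stripped, begins with SELECT or WITH. This is a convenience pre-check only:
// the server must still enforce read-only access itself.
function isReadOnlyQuery(sql: string): boolean {
  const stripped = sql
    .replace(/\/\*[\s\S]*?\*\//g, " ") // block comments
    .replace(/--[^\n]*/g, " ")         // line comments
    .trim();
  return /^(select|with)\b/i.test(stripped);
}

// Builds the URL and fetch-style init object for POST /query.
function buildQueryRequest(sql: string, baseUrl: string = POSTGRES_API_URL): QueryRequest {
  if (!isReadOnlyQuery(sql)) {
    throw new Error("Only SELECT/WITH statements are accepted");
  }
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/query`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ sql }),
    },
  };
}
```

A caller would pass the result to `fetch(request.url, request.init)` and read `rows` and `columns` from the JSON response. A prefix check like this cannot catch every write (a `WITH` statement can wrap a data-modifying CTE in PostgreSQL), so it only saves a round-trip for obviously rejected statements.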