# Multi-Database Architecture for bronhouder.nl

This document describes the multi-database architecture implemented for the GLAM data platform at bronhouder.nl.

## Overview

The Database page (`/database`) provides a unified interface for exploring heritage custodian data across four different database systems, each optimized for different use cases:

```
┌─────────────────────────────────────────────────────────────────┐
│                     bronhouder.nl/database                      │
├─────────────────────────────────────────────────────────────────┤
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐         │
│  │ 🦆 DuckDB│  │🐘Postgres│  │ 🧠 TypeDB│  │🔗Oxigraph│         │
│  │ (Browser)│  │ (Server) │  │ (Server) │  │ (Server) │         │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘         │
│       │             │             │             │               │
│   In-Browser    REST API      REST API       SPARQL             │
│     WASM          Proxy         Proxy       Endpoint            │
└───────┼─────────────┼─────────────┼─────────────┼───────────────┘
        │             │             │             │
        ▼             ▼             ▼             ▼
   ┌─────────┐   ┌──────────┐  ┌──────────┐  ┌──────────┐
   │ Browser │   │PostgreSQL│  │  TypeDB  │  │ Oxigraph │
   │ Memory  │   │ Database │  │ Database │  │  Store   │
   └─────────┘   └──────────┘  └──────────┘  └──────────┘
```

## Database Systems

### 1. DuckDB (In-Browser OLAP)

**Status**: ✅ Fully Operational

**Technology**: DuckDB-WASM running entirely in the browser

**Use Cases**:
- Ad-hoc SQL analytics on heritage institution data
- Fast aggregations and filtering
- Data exploration without server round-trips

**Data Source**: `/data/nde_institutions.json` (10.8 MB, 1,863 institutions)

**Features**:
- Upload JSON/CSV/Parquet files directly
- Run SQL queries in-browser
- No server dependency
- Export query results

**Hook**: `frontend/src/hooks/useDuckDB.ts`

**Panel**: `frontend/src/components/database/DuckDBPanel.tsx`

### 2. PostgreSQL (Relational)

**Status**: ✅ Fully Operational (as of 2025-12-06)

**Technology**: PostgreSQL 16.11 with FastAPI REST proxy

**Endpoint**: `https://bronhouder.nl/api/postgres`

**Use Cases**:
- Complex relational queries
- Full-text search on institution names
- Transactional operations
- Integration with existing tools

**Data**: 1,838 NDE heritage institutions with:
- 32 columns including coordinates, ratings, reviews
- GHCID identifiers (text, UUID, numeric)
- JSONB fields for wikidata_types, reviews, identifiers, genealogiewerkbalk

**API Endpoints**:
- `GET /` - Health check and statistics
- `POST /query` - Execute SQL query (read-only SELECT/WITH)
- `GET /tables` - List all tables with metadata
- `GET /schema/{table}` - Get table schema
- `GET /stats` - Detailed database statistics

**Backend**: `/opt/glam-backend/postgres/` on server
- `main.py` - FastAPI application
- `load_nde_data.py` - Data loading script
- Systemd service: `glam-postgres-api.service`

**Hook**: `frontend/src/hooks/usePostgreSQL.ts`

**Panel**: `frontend/src/components/database/PostgreSQLPanel.tsx`

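The `POST /query` endpoint is described as read-only (SELECT/WITH). How `main.py` enforces this is not shown in this document, so the following is only a minimal sketch of one possible guard. The helper name `is_read_only_sql` is hypothetical and not taken from the backend.

```python
import re

# Hypothetical helper (an assumption, not code from main.py): accept a single
# statement that starts with SELECT or WITH and reject everything else.
_LEADING_NOISE = re.compile(r"\A(?:\s+|--[^\n]*(?:\n|\Z)|/\*.*?\*/)*", re.DOTALL)
_ALLOWED_START = re.compile(r"(?:SELECT|WITH)\b", re.IGNORECASE)


def is_read_only_sql(sql: str) -> bool:
    """Return True if `sql` looks like one read-only SELECT/WITH statement."""
    # Skip leading whitespace and SQL comments before the first keyword.
    body = sql[_LEADING_NOISE.match(sql).end():].rstrip()
    # Allow one trailing semicolon; any other semicolon is treated as a
    # second statement. This is deliberately strict and also rejects
    # semicolons inside string literals.
    if body.endswith(";"):
        body = body[:-1]
    if ";" in body:
        return False
    return _ALLOWED_START.match(body) is not None
```

A keyword check like this is only a first line of defence: in PostgreSQL a `WITH` clause can contain data-modifying statements, so a query such as `WITH d AS (DELETE ... RETURNING *) SELECT ...` would pass it. Connecting with a database role that has only `SELECT` privileges is the more robust safeguard.
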
### 3. Oxigraph (RDF/SPARQL)

**Status**: ✅ Fully Operational

**Technology**: Oxigraph SPARQL endpoint on server

**Use Cases**:
- Linked Data queries
- Ontology exploration
- Cross-referencing with Wikidata, Schema.org
- Semantic reasoning

**Endpoint**: `https://bronhouder.nl/sparql` (proxied to 91.98.224.44:7878)

**Triple Count**: 426,243 triples

**Features**:
- SPARQL 1.1 query interface
- Graph exploration
- Namespace prefix management
- RDF upload (Turtle, N-Triples, JSON-LD)

**Hook**: `frontend/src/hooks/useOxigraph.ts`

**Panel**: `frontend/src/components/database/OxigraphPanel.tsx`

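To illustrate how a client outside the frontend might talk to the SPARQL endpoint, here is a minimal sketch using only the Python standard library. It assumes the endpoint accepts SPARQL 1.1 Protocol GET requests with a `query` parameter and can return the standard JSON results format. The function names are illustrative and do not mirror `useOxigraph.ts`.

```python
from __future__ import annotations

import json
import urllib.parse
import urllib.request

SPARQL_ENDPOINT = "https://bronhouder.nl/sparql"  # endpoint listed in this document


def build_sparql_request(query: str, endpoint: str = SPARQL_ENDPOINT) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def bindings_to_rows(results_json: str) -> list[dict[str, str]]:
    """Flatten a SPARQL JSON results document into {variable: value} rows."""
    doc = json.loads(results_json)
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in doc["results"]["bindings"]
    ]
```

Passing the request to `urllib.request.urlopen` and feeding the decoded response body to `bindings_to_rows` would complete the round trip. For example, `SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }` could be used to check the triple count.
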
### 4. TypeDB (Knowledge Graph)

**Status**: ⏳ Deferred - Server has only 3.7GB RAM, TypeDB requires 4GB+

**Technology**: TypeDB with REST API proxy

**Use Cases**:
- Complex knowledge graph queries
- Multi-hop relationship traversal
- Temporal reasoning (organizational changes)
- Entity resolution

**Note**: To enable TypeDB, upgrade server to cx32 (8GB RAM) or higher.

**Hook**: `frontend/src/hooks/useTypeDB.ts`

**Panel**: `frontend/src/components/database/TypeDBPanel.tsx`

## Frontend Components

### Database Page (`/database`)

The main Database page provides:

1. **Tab Navigation**: Switch between database views
2. **All Databases Overview**: Comparison grid with status indicators
3. **Individual Database Panels**: Full-featured interface for each system

### Component Structure

```
frontend/src/
├── pages/
│   ├── Database.tsx           # Main page with tab navigation
│   └── Database.css           # Styles for all database components
├── hooks/
│   ├── useDuckDB.ts           # DuckDB-WASM hook
│   ├── useOxigraph.ts         # Oxigraph SPARQL hook
│   ├── usePostgreSQL.ts       # PostgreSQL REST hook
│   └── useTypeDB.ts           # TypeDB REST hook
└── components/database/
    ├── DuckDBPanel.tsx        # DuckDB interface
    ├── OxigraphPanel.tsx      # Oxigraph interface
    ├── PostgreSQLPanel.tsx    # PostgreSQL interface
    ├── TypeDBPanel.tsx        # TypeDB interface
    └── index.ts               # Exports
```

## Data Flow

### NDE Institution Data

```
YAML Files (data/nde/enriched/entries/)
    │
    ├── scripts/export_nde_for_duckdb.py
    │       └── frontend/public/data/nde_institutions.json (DuckDB)
    │
    ├── scripts/nde_to_hc_rdf.py
    │       └── data/nde/rdf/*.ttl → Oxigraph
    │
    └── [Future] scripts/nde_to_typedb.py
            └── TypeDB
```

### LinkML Schema

```
schemas/20251121/linkml/
├── 01_custodian_name_modular.yaml    # Main ontology schema
├── nde_enriched_entry.yaml           # NDE entry schema
└── modules/                          # Modular components
```

## API Contracts

### PostgreSQL REST API (Required)

```typescript
// Expected endpoints
POST /api/postgres/query
{
  "sql": "SELECT * FROM institutions LIMIT 10"
}
// Response: { rows: [...], columns: [...] }

GET /api/postgres/tables
// Response: { tables: [...] }

GET /api/postgres/schema/:table
// Response: { columns: [...] }
```

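As a sketch of how a script might exercise this contract, the helpers below build the `POST /query` request and reshape the response. Two things are assumptions rather than facts from this document: that `columns` is a list of column names and `rows` a list of equally long value lists, and the helper names themselves, which are illustrative only.

```python
from __future__ import annotations

import json
import urllib.request

POSTGRES_API_URL = "https://bronhouder.nl/api/postgres"  # value of VITE_POSTGRES_API_URL


def build_query_request(sql: str, base_url: str = POSTGRES_API_URL) -> urllib.request.Request:
    """Build the POST /query request with a JSON body of the form {"sql": ...}."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def rows_as_dicts(response: dict) -> list[dict]:
    """Pair each row with the column names.

    Assumed response shape: {"columns": ["a", "b"], "rows": [[1, 2], ...]}.
    The contract only states that both keys exist.
    """
    columns = response["columns"]
    return [dict(zip(columns, row)) for row in response["rows"]]
```

If the API instead returns each row as an object keyed by column name, `rows_as_dicts` is unnecessary and `response["rows"]` can be used directly.
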
### TypeDB REST API (Required)

```typescript
// Expected endpoints
POST /api/typedb/query
{
  "query": "match $x isa institution; get $x; limit 10;"
}
// Response: { results: [...] }

GET /api/typedb/schema
// Response: { entity_types: [...], relation_types: [...] }

GET /api/typedb/entity-types
// Response: { types: [...] }
```

## Environment Variables

```bash
# Frontend (.env)
VITE_SPARQL_ENDPOINT=https://bronhouder.nl/sparql
VITE_POSTGRES_API_URL=https://bronhouder.nl/api/postgres
VITE_TYPEDB_API_URL=https://bronhouder.nl/api/typedb
```

## Server Infrastructure

```
Server: 91.98.224.44 (Hetzner cx22)
├── Caddy (reverse proxy)
│   ├── /        → /var/www/glam-frontend/
│   ├── /sparql  → localhost:7878 (Oxigraph)
│   └── /api     → localhost:8000 (FastAPI)
├── Oxigraph (port 7878)
├── GLAM API (port 8000)
├── PostgreSQL (glam-postgres-api.service)
└── [Future] TypeDB
```

## Deployment

```bash
# Deploy frontend only
./infrastructure/deploy.sh --frontend

# Check status
./infrastructure/deploy.sh --status

# Deploy everything
./infrastructure/deploy.sh --all
```

## Next Steps

1. **TypeDB Backend**: Create FastAPI endpoints for TypeDB queries
2. **Data Sync**: Implement the data loading script for TypeDB
3. **Query Builder**: Add visual query builder for non-technical users
4. **Export**: Enable data export in multiple formats

## Related Documentation

- [AGENTS.md](../AGENTS.md) - AI agent instructions
- [DEPLOYMENT_GUIDE.md](./DEPLOYMENT_GUIDE.md) - Deployment procedures
- [SCHEMA_MODULES.md](./SCHEMA_MODULES.md) - LinkML schema architecture