Add schema and tooling for storing administrative boundaries in PostGIS:
- 002_postgis_boundaries.sql: Complete PostGIS schema with:
  - boundary_countries (ISO 3166-1)
  - boundary_admin1 (states/provinces/regions)
  - boundary_admin2 (municipalities/districts)
  - boundary_historical (HALC pre-modern territories)
  - custodian_service_areas (computed werkgebied geometries)
  - geonames_settlements (reverse geocoding)
  - Spatial functions: find_admin_for_point, find_nearest_settlement
  - Views for API access
- load_boundaries_postgis.py: Python loader supporting:
  - GADM (Global Administrative Areas), the primary global source
  - CBS (Dutch municipality boundaries)
  - GeoNames settlements for reverse geocoding
  - Cached downloads and upsert logic
- POSTGIS_BOUNDARY_ARCHITECTURE.md: Design documentation

This replaces the static GeoJSON approach for international coverage.
PostGIS International Boundary Architecture
Overview
This document describes the PostGIS-based architecture for storing and querying international administrative boundaries to compute heritage custodian service areas ("werkgebied").
Problem Statement
The current implementation uses static GeoJSON files for Netherlands-only boundaries:
- netherlands_provinces.geojson - 12 provinces
- netherlands_municipalities.geojson - ~350 municipalities
- netherlands_historical_1500.geojson - HALC historical territories
This approach does not scale for international coverage (Japan, Czechia, Germany, Belgium, Brazil, etc.) because:
- GeoJSON files are too large for client-side loading (Germany has 400+ districts)
- No server-side spatial queries (point-in-polygon, intersection)
- No temporal versioning for boundary changes
- No consistent administrative hierarchy across countries
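To make the point-in-polygon item above concrete, the following is a minimal sketch of the test that a spatial database would run server-side (in PostGIS, typically via a function such as ST_Contains backed by a spatial index). The function name `point_in_ring` and the square used in the usage note are illustrative assumptions, not part of the schema.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (lon, lat)


def point_in_ring(point: Point, ring: List[Point]) -> bool:
    """Even-odd ray casting: count how many polygon edges a horizontal
    ray starting at `point` and heading east would cross."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside
```

For a square ring `[(0, 0), (4, 0), (4, 4), (0, 4)]`, the point `(2, 2)` is reported as inside and `(5, 2)` as outside. Running this per boundary in the browser is what becomes impractical at international scale; the database does the same work against indexed geometries.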
Solution Architecture
Database Schema
┌─────────────────────────────────────────────────────────────────┐
│ PostGIS Database │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │ boundary_countries │────▶│ boundary_admin1 │ │
│ │ (ISO 3166-1) │ │ (States/Provinces) │ │
│ │ - iso_a2: NL, DE... │ │ - iso_3166_2 │ │
│ │ - geom: POLYGON │ │ - geom: POLYGON │ │
│ └─────────────────────┘ └──────────┬──────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────┐ │
│ │ boundary_admin2 │ │
│ │ (Municipalities) │ │
│ │ - geonames_id │ │
│ │ - cbs_gemeente_code│ │
│ │ - geom: POLYGON │ │
│ └──────────┬──────────┘ │
│ │ │
│ ┌─────────────────────┐ │ │
│ │ boundary_historical │ │ │
│ │ (HALC, pre-modern) │ │ │
│ │ - halc_adm1_code │ │ │
│ │ - period_start/end │ │ │
│ └─────────────────────┘ │ │
│ ▼ │
│ ┌───────────────────────────┐ │
│ │ custodian_service_areas │ │
│ │ (Computed werkgebied) │ │
│ │ - ghcid: NL-NH-HAA-A-NHA │ │
│ │ - admin2_ids: [1,2,3...] │ │
│ │ - geom: MULTIPOLYGON │ │
│ └───────────────────────────┘ │
│ │
│ ┌─────────────────────┐ │
│ │ geonames_settlements│ (For reverse geocoding) │
│ │ - geonames_id │ │
│ │ - name, ascii_name │ │
│ │ - geom: POINT │ │
│ └─────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
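As a reading aid for the diagram, the tables can be sketched as plain Python types. This is only an approximation under stated assumptions: the linking fields (`country_iso_a2`, `admin1_iso_3166_2`), the `geom_wkt` text representation, and the helper `units_for_service_area` are hypothetical and are not taken from 002_postgis_boundaries.sql.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BoundaryAdmin1:
    iso_3166_2: str
    country_iso_a2: str          # assumed link to boundary_countries
    geom_wkt: str                # geometry as WKT text, for illustration only


@dataclass
class BoundaryAdmin2:
    id: int
    admin1_iso_3166_2: str       # assumed link to boundary_admin1
    geonames_id: Optional[int] = None
    cbs_gemeente_code: Optional[str] = None
    geom_wkt: Optional[str] = None


@dataclass
class CustodianServiceArea:
    ghcid: str
    admin2_ids: List[int] = field(default_factory=list)
    geom_wkt: Optional[str] = None   # MULTIPOLYGON in the real table


def units_for_service_area(
    area: CustodianServiceArea, units: List[BoundaryAdmin2]
) -> List[BoundaryAdmin2]:
    """Return the admin2 units whose ids are listed in the service area."""
    wanted = set(area.admin2_ids)
    return [u for u in units if u.id in wanted]
```

The point of the sketch is the direction of the references: a service area does not own geometry of its own accord but points at admin2 units through `admin2_ids`, and its stored geometry is derived from them.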
Data Sources
| Source | Coverage | License | Admin Levels | Use Case |
|---|---|---|---|---|
| GADM | Global | Non-commercial use only (GADM license) | 0, 1, 2, 3+ | Primary global boundaries |
| Natural Earth | Global | Public Domain | 0, 1 | Simplified country shapes |
| CBS | Netherlands | CC-BY-4.0 | 2 (gemeente) | Official NL municipalities |
| HALC | Low Countries | Academic | Historical | Pre-1800 territories |
| OSM | Global | ODbL | Variable | Crowdsourced, current |
| Eurostat | Europe | Eurostat | NUTS/LAU | EU statistical regions |
API Endpoints
The PostGIS database will be exposed via REST API:
GET /api/boundaries/countries
GET /api/boundaries/countries/{iso_a2}
GET /api/boundaries/admin1/{iso_a2}
GET /api/boundaries/admin2/{iso_a2}/{admin1_code}
GET /api/boundaries/point?lon={lon}&lat={lat}
GET /api/boundaries/service-area/{ghcid}
GET /api/boundaries/service-area/{ghcid}/geojson
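A small client-side helper can keep these paths in one place. The sketch below only builds URLs for the endpoints listed above; the helper names, the coordinate range check, and the example `admin1_code` value are assumptions, since the API is still planned.

```python
from urllib.parse import quote, urlencode

API_ROOT = "/api/boundaries"


def admin2_url(iso_a2: str, admin1_code: str) -> str:
    """Path for the admin2 units of one admin1 region."""
    return f"{API_ROOT}/admin2/{quote(iso_a2, safe='')}/{quote(admin1_code, safe='')}"


def point_lookup_url(lon: float, lat: float) -> str:
    """Path for the point lookup endpoint, with basic range validation."""
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError(f"coordinates out of range: lon={lon}, lat={lat}")
    return f"{API_ROOT}/point?{urlencode({'lon': lon, 'lat': lat})}"


def service_area_url(ghcid: str, as_geojson: bool = False) -> str:
    """Path for a custodian's service area, optionally as GeoJSON."""
    url = f"{API_ROOT}/service-area/{quote(ghcid, safe='')}"
    return f"{url}/geojson" if as_geojson else url
```

For example, `service_area_url("NL-NH-HAA-A-NHA", as_geojson=True)` yields `/api/boundaries/service-area/NL-NH-HAA-A-NHA/geojson`. Percent-encoding each path segment guards against identifiers that contain slashes or spaces.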
Frontend Integration
The frontend will:
- Fetch service area GeoJSON via API (not static files)
- Use MapLibre vector tiles for admin boundaries (optional optimization)
- Cache frequently-accessed boundaries in browser
- Support historical boundary display with temporal filtering
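// Example: Fetch service area for a custodian
const response = await fetch(`/api/boundaries/service-area/${encodeURIComponent(ghcid)}/geojson`);
if (!response.ok) throw new Error(`Service area request failed: ${response.status}`);
const geojson = await response.json();
map.getSource('werkgebied').setData(geojson);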
Temporal Versioning
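Boundaries change over time (municipal mergers, etc.). Each boundary version carries a valid_to date, which is NULL for the version currently in force, so both current and superseded boundaries can be queried: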
-- Current boundary (valid_to IS NULL)
SELECT * FROM boundary_admin2 WHERE cbs_gemeente_code = 'GM0363' AND valid_to IS NULL;
-- Historical boundary (before merger)
SELECT * FROM boundary_admin2 WHERE cbs_gemeente_code = 'GM0363' AND valid_to = '2001-01-01';
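The same lookup can be expressed as "which version was in force on a given date". The sketch below does this over in-memory records. It assumes a `valid_from` column alongside `valid_to` and treats validity as a half-open interval (`valid_from` inclusive, `valid_to` exclusive); both are assumptions, as only `valid_to` appears in the queries above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class BoundaryVersion:
    cbs_gemeente_code: str
    valid_from: Optional[date]   # assumed column; None = unknown start
    valid_to: Optional[date]     # None = still current


def version_at(
    versions: Iterable[BoundaryVersion], code: str, on: date
) -> Optional[BoundaryVersion]:
    """Return the version of `code` in force on date `on`, if any."""
    for v in versions:
        if v.cbs_gemeente_code != code:
            continue
        starts_ok = v.valid_from is None or v.valid_from <= on
        ends_ok = v.valid_to is None or on < v.valid_to
        if starts_ok and ends_ok:
            return v
    return None
```

With one version ending on 2001-01-01 and a successor starting that day, a lookup for 2000-06-01 returns the older version and a lookup for 2001-01-01 returns the successor. A range comparison of this kind is usually more robust than matching `valid_to` against one exact date.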
Service Area Computation
Service areas are computed from admin units:
-- Compute service area for Noord-Hollands Archief (serves Haarlem + region)
SELECT ST_Union(geom) AS service_area_geom
FROM boundary_admin2
WHERE id IN (SELECT unnest(admin2_ids) FROM custodian_service_areas WHERE ghcid = 'NL-NH-HAA-A-NHA');
Or pre-computed and stored:
-- Pre-compute and cache
INSERT INTO custodian_service_areas (ghcid, service_area_name, geom, admin2_ids)
VALUES (
'NL-NH-HAA-A-NHA',
'Noord-Hollands Archief Werkgebied',
compute_service_area_geometry(ARRAY[123, 124, 125, 126]), -- admin2 IDs
ARRAY[123, 124, 125, 126]
);
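Because service areas are recomputed whenever boundaries or assignments change, the pre-compute step is easier to re-run if it is written as an upsert. The sketch below shows the pattern with Python's built-in sqlite3 standing in for PostgreSQL so that it runs without a database server. The unique key on `ghcid`, the comma-separated text encoding of `admin2_ids`, and the omission of the geometry column are simplifications assumed for illustration.

```python
import sqlite3
from typing import Iterable

UPSERT_SQL = """
INSERT INTO custodian_service_areas (ghcid, service_area_name, admin2_ids)
VALUES (?, ?, ?)
ON CONFLICT (ghcid) DO UPDATE SET
    service_area_name = excluded.service_area_name,
    admin2_ids = excluded.admin2_ids
"""


def make_demo_db() -> sqlite3.Connection:
    """In-memory stand-in for the real table (no geometry column)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE custodian_service_areas ("
        "ghcid TEXT PRIMARY KEY, service_area_name TEXT, admin2_ids TEXT)"
    )
    return conn


def upsert_service_area(
    conn: sqlite3.Connection, ghcid: str, name: str, admin2_ids: Iterable[int]
) -> None:
    """Insert the service area, or overwrite it if the ghcid already exists."""
    ids_text = ",".join(str(i) for i in sorted(admin2_ids))
    conn.execute(UPSERT_SQL, (ghcid, name, ids_text))
    conn.commit()
```

Calling `upsert_service_area` twice for the same `ghcid` leaves a single row holding the latest values. In PostgreSQL the same `ON CONFLICT ... DO UPDATE` clause applies, and the `geom` column would be refreshed in the same statement from `compute_service_area_geometry`.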
Implementation Plan
Phase 1: Schema & Initial Data (Current Sprint)
- Create PostGIS schema (002_postgis_boundaries.sql)
- Create boundary loading script (load_boundaries_postgis.py)
- Load Netherlands boundaries (CBS + provinces)
- Load HALC historical boundaries
- Migrate existing GeoJSON data
Phase 2: International Expansion
- Load GADM for priority countries: JP, CZ, DE, BE, CH, AT
- Load GeoNames settlements for reverse geocoding
- Create API endpoints for boundary queries
- Update frontend to use API instead of static files
Phase 3: Service Area Management
- Compute service areas for existing custodians
- Create admin UI for service area editing
- Implement temporal boundary display
- Add vector tile generation (optional optimization)
Files Created
| File | Description |
|---|---|
| infrastructure/sql/002_postgis_boundaries.sql | PostGIS schema for boundaries |
| scripts/load_boundaries_postgis.py | Python script to load boundary data |
| docs/POSTGIS_BOUNDARY_ARCHITECTURE.md | This document |
Dependencies
- PostgreSQL 14+ with PostGIS 3.3+
- Python: psycopg2, geopandas, shapely
- GADM data (downloaded on demand)
- CBS GeoJSON (existing in frontend/public/data/)
Migration from Static GeoJSON
The current static GeoJSON approach will be deprecated but not immediately removed:
- PostGIS becomes the source of truth for boundaries
- API serves boundary GeoJSON on demand
- Static files remain as fallback for development
- Frontend gradually migrates to API-based loading