Commit graph

19 commits

Author SHA1 Message Date
kempersc
a4184cb805 feat(infra): add webhook-based schema deployment pipeline
- Add FastAPI webhook receiver for Forgejo push events
- Add setup script for server deployment
- Add Caddy snippet for webhook endpoint
- Add local sync-schemas.sh helper script
- Sync frontend schemas with source (archived deprecated slots)

Infrastructure scripts staged for optional webhook deployment.
Current deployment uses: ./infrastructure/deploy.sh --frontend
2026-01-10 21:45:02 +01:00
kempersc
d64f857aa9 add sparql validator and RAG injector 2025-12-30 03:43:31 +01:00
kempersc
0c1d19e98b enrich entries 2025-12-23 13:27:35 +01:00
kempersc
aca68ea47f remove a,bihguous web-claims 2025-12-21 00:01:54 +01:00
kempersc
3820f2fc92 chore: Add data reports, infra scripts, and API updates
- Data quality reports for Dutch custodians
- Name mismatch detection reports
- Failed crawl URL tracking
- Caddy configuration updates
- Monitor script for chunk 404 errors
- API endpoint improvements
2025-12-15 01:48:08 +01:00
kempersc
c50c35fd3a enrich person custodian 2025-12-14 17:09:55 +01:00
kempersc
1b1cfbfca0 enrich custodians 2025-12-11 22:32:09 +01:00
kempersc
d4906abae4 update postgis data 2025-12-10 23:51:51 +01:00
kempersc
41959f0766 correct HCID! 2025-12-10 13:01:13 +01:00
kempersc
131e3ca259 normalise custodian entries 2025-12-09 07:56:35 +01:00
kempersc
7e3559f7e5 add new entries 2025-12-07 23:08:02 +01:00
kempersc
6d66e67bf4 fix(deploy): route Oxigraph requests through SSH tunnel
Oxigraph only listens on localhost, so deploy script now executes
curl commands via SSH instead of trying to reach it directly.
2025-12-07 19:20:56 +01:00
kempersc
90a1f20271 chore: add YAML history fix scripts and update ducklake/deploy tooling
- Add fix_yaml_history.py and fix_yaml_history_v2.py for cleaning up
  malformed ghcid_history entries with duplicate/redundant data
- Update load_custodians_to_ducklake.py for DuckDB lakehouse loading
- Update migrate_web_archives.py for web archive management
- Update deploy.sh with improvements
- Ignore entire data/ducklake/ directory (generated databases)
2025-12-07 18:45:52 +01:00
kempersc
83ab098cf7 feat: add PostGIS international boundary architecture
Add schema and tooling for storing administrative boundaries in PostGIS:
- 002_postgis_boundaries.sql: Complete PostGIS schema with:
  - boundary_countries (ISO 3166-1)
  - boundary_admin1 (states/provinces/regions)
  - boundary_admin2 (municipalities/districts)
  - boundary_historical (HALC pre-modern territories)
  - custodian_service_areas (computed werkgebied geometries)
  - geonames_settlements (reverse geocoding)
  - Spatial functions: find_admin_for_point, find_nearest_settlement
  - Views for API access

- load_boundaries_postgis.py: Python loader supporting:
  - GADM (Global Administrative Areas) - primary global source
  - CBS (Dutch municipality boundaries)
  - GeoNames settlements for reverse geocoding
  - Cached downloads and upsert logic

- POSTGIS_BOUNDARY_ARCHITECTURE.md: Design documentation

This replaces the static GeoJSON approach for international coverage.
2025-12-07 14:34:39 +01:00
kempersc
ee4e57bc75 add new entries 2025-12-07 00:26:01 +01:00
kempersc
1635625032 added web annotations 2025-12-06 19:50:04 +01:00
kempersc
ef89b1213a validate enrichments 2025-12-02 14:36:01 +01:00
kempersc
097d116b72 enrich entries 2025-12-01 16:06:34 +01:00
kempersc
f3c149b1bb update entries 2025-11-30 23:30:29 +01:00