kempersc
d64f857aa9
add sparql validator and RAG injector
2025-12-30 03:43:31 +01:00
kempersc
0c1d19e98b
enrich entries
2025-12-23 13:27:35 +01:00
kempersc
aca68ea47f
remove ambiguous web-claims
2025-12-21 00:01:54 +01:00
kempersc
3820f2fc92
chore: Add data reports, infra scripts, and API updates
- Data quality reports for Dutch custodians
- Name mismatch detection reports
- Failed crawl URL tracking
- Caddy configuration updates
- Monitor script for chunk 404 errors
- API endpoint improvements
2025-12-15 01:48:08 +01:00
kempersc
c50c35fd3a
enrich person custodian
2025-12-14 17:09:55 +01:00
kempersc
1b1cfbfca0
enrich custodians
2025-12-11 22:32:09 +01:00
kempersc
d4906abae4
update postgis data
2025-12-10 23:51:51 +01:00
kempersc
41959f0766
correct HCID!
2025-12-10 13:01:13 +01:00
kempersc
131e3ca259
normalise custodian entries
2025-12-09 07:56:35 +01:00
kempersc
7e3559f7e5
add new entries
2025-12-07 23:08:02 +01:00
kempersc
6d66e67bf4
fix(deploy): route Oxigraph requests through SSH tunnel
Oxigraph only listens on localhost, so deploy script now executes
curl commands via SSH instead of trying to reach it directly.
2025-12-07 19:20:56 +01:00
kempersc
90a1f20271
chore: add YAML history fix scripts and update ducklake/deploy tooling
- Add fix_yaml_history.py and fix_yaml_history_v2.py for cleaning up
  malformed ghcid_history entries with duplicate/redundant data
- Update load_custodians_to_ducklake.py for DuckDB lakehouse loading
- Update migrate_web_archives.py for web archive management
- Update deploy.sh with improvements
- Ignore entire data/ducklake/ directory (generated databases)
2025-12-07 18:45:52 +01:00
kempersc
83ab098cf7
feat: add PostGIS international boundary architecture
Add schema and tooling for storing administrative boundaries in PostGIS:
- 002_postgis_boundaries.sql: Complete PostGIS schema with:
  - boundary_countries (ISO 3166-1)
  - boundary_admin1 (states/provinces/regions)
  - boundary_admin2 (municipalities/districts)
  - boundary_historical (HALC pre-modern territories)
  - custodian_service_areas (computed werkgebied geometries)
  - geonames_settlements (reverse geocoding)
  - Spatial functions: find_admin_for_point, find_nearest_settlement
  - Views for API access
- load_boundaries_postgis.py: Python loader supporting:
  - GADM (Global Administrative Areas) - primary global source
  - CBS (Dutch municipality boundaries)
  - GeoNames settlements for reverse geocoding
  - Cached downloads and upsert logic
- POSTGIS_BOUNDARY_ARCHITECTURE.md: Design documentation
This replaces the static GeoJSON approach for international coverage.
2025-12-07 14:34:39 +01:00
kempersc
ee4e57bc75
add new entries
2025-12-07 00:26:01 +01:00
kempersc
1635625032
added web annotations
2025-12-06 19:50:04 +01:00
kempersc
ef89b1213a
validate enrichments
2025-12-02 14:36:01 +01:00
kempersc
097d116b72
enrich entries
2025-12-01 16:06:34 +01:00
kempersc
f3c149b1bb
update entries
2025-11-30 23:30:29 +01:00