Commit graph

6 commits

Author SHA1 Message Date
kempersc
90b402dba6 enrich AR en Czech files 2025-12-30 23:01:01 +01:00
kempersc
d64f857aa9 add sparql validator and RAG injector 2025-12-30 03:43:31 +01:00
kempersc
84904e344b Make AGENTS more succint by referring to opencode rules & enrich custodians 2025-12-28 14:56:35 +01:00
kempersc
131e3ca259 normalise custodian entries 2025-12-09 07:56:35 +01:00
kempersc
40bd3cb8f5 data(custodian): add emic_name fields and remove duplicate files with name suffixes
- Add emic_name, name_language, and standardized_name to 1,781 custodian files
- Remove 2,239 duplicate files that had name suffixes in filename
- Consolidate data into base GHCID files per PID stability rules
- Part of UNESCO Memory of the World custodian enrichment
2025-12-08 14:57:34 +01:00
kempersc
1e01639c56 fix(data): resolve XXX placeholder city codes to actual GeoNames settlements
Rename 144 custodian files from XXX placeholders to resolved city codes:
- BR (65): ASA, RBR, MAN, FAZ, MAC, VIT, FOR, GUA, BRE, GOI, SLU, BHO, XAN, CGR, CUI, etc.
- CH (24): ZUR, BER, GEN, BAL, LUC, etc.
- MX (23): MEX, GDL, MTY, PUE, etc.
- CL (9): SCL, VAL, etc.
- CZ (5): PRG, BRN, MSV, etc.
- KR (4): SEL, etc.
- GB (4): LON, etc.
- FR (3): PAR, etc.
- IN (2): DEL, etc.
- PH, JP, EE (1 each)

City codes derived from GeoNames reverse geocoding using institution coordinates.
GHCID format: {COUNTRY}-{REGION}-{CITY}-{TYPE}-{ABBREV}
2025-12-07 17:47:09 +01:00
Renamed from data/custodian/CL-CH-XXX-R-FIP.yaml (Browse further)