Commit graph

12 commits

Author SHA1 Message Date
kempersc
4f0cafe98a enrich HC profiles 2026-01-02 02:11:04 +01:00
kempersc
38dcd2ce9c Restore YAML files for Museum Dokkum and Gemeente Smallingerland with enriched data and provenance tracking 2025-12-30 23:58:21 +01:00
kempersc
d64f857aa9 add sparql validator and RAG injector 2025-12-30 03:43:31 +01:00
kempersc
7a056fa746 enrich entries 2025-12-21 22:12:34 +01:00
kempersc
e0dd847491 extend ontology 2025-12-16 20:27:39 +01:00
kempersc
181b1cf705 data: enrich Dutch heritage custodians (DR, FL, FR, GE, GR, LI provinces)
- Add digital platform discovery data with provenance
- Cleanup duplicate/incorrect custodian entries
- Add GHCID collision resolution suffixes where needed
- Update person entity profiles with career history
2025-12-15 01:34:38 +01:00
kempersc
5b3d4d1ed5 normalize: add canonical location blocks (batch 3) 2025-12-09 14:14:13 +01:00
kempersc
bb41287730 normalize: add canonical location blocks (batch 1) 2025-12-09 13:17:11 +01:00
kempersc
c283daa1a2 normalise dutch entries 2025-12-09 08:02:27 +01:00
kempersc
131e3ca259 normalise custodian entries 2025-12-09 07:56:35 +01:00
kempersc
40bd3cb8f5 data(custodian): add emic_name fields and remove duplicate files with name suffixes
- Add emic_name, name_language, and standardized_name to 1,781 custodian files
- Remove 2,239 duplicate files that had name suffixes in filename
- Consolidate data into base GHCID files per PID stability rules
- Part of UNESCO Memory of the World custodian enrichment
2025-12-08 14:57:34 +01:00
kempersc
f284e87d13 feat: add 24,963 heritage custodian records from global extraction
Major batch addition of heritage institution data:
- Japan: 12,077 institutions (libraries, museums, archives)
- Czechia: 6,760 institutions
- Switzerland: 2,390 institutions
- Belgium: 448 institutions
- Belarus: 257 institutions
- Austria: 249 institutions (with corrected GHCIDs)
- Argentina: 235 institutions (bibliotecas populares)
- Brazil: 155 institutions
- Mexico: 110 institutions
- Bulgaria: 98 institutions
- Chile: 83 institutions
- Egypt: 50 institutions
- And additional records from VN, NL, GE, KR, GB, FR, US, IN, etc.

All records include:
- Standardized GHCID identifiers (alphabetic-only abbreviations)
- GeoNames-resolved location data
- ISO 3166-2 region codes
- Provenance metadata with extraction timestamps
2025-12-07 14:24:48 +01:00
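
The per-country figures in the message of commit f284e87d13 can be checked against its stated total of 24,963 records. The following is a minimal sketch of that arithmetic only. The variable names are illustrative and not taken from the repository, and the counts are copied from the commit message rather than from the data files.

```python
# Per-country counts exactly as listed in the commit message of f284e87d13.
# Variable names here are hypothetical; nothing is read from the repository.
listed_counts = {
    "Japan": 12077,
    "Czechia": 6760,
    "Switzerland": 2390,
    "Belgium": 448,
    "Belarus": 257,
    "Austria": 249,
    "Argentina": 235,
    "Brazil": 155,
    "Mexico": 110,
    "Bulgaria": 98,
    "Chile": 83,
    "Egypt": 50,
}

# Total stated in the commit subject line.
stated_total = 24963

# Sum of the countries listed individually.
listed_total = sum(listed_counts.values())

# Records left over for the countries the message does not itemize
# (VN, NL, GE, KR, GB, FR, US, IN, etc.).
remainder = stated_total - listed_total

print(f"listed: {listed_total}, not itemized: {remainder}")
```

Under these figures, the twelve itemized countries account for 22,912 records, leaving 2,051 for the countries that the message does not itemize.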