Commit graph

16 commits

Author SHA1 Message Date
kempersc
1f723fd5d7 feat(data): merge staff data from 35 PENDING files into enriched custodians
Merged LinkedIn-extracted staff sections from PENDING files into their
corresponding proper GHCID custodian files. This consolidates data from
two extraction sources:
- Existing enriched files: Google Maps, Museum Register, YouTube, etc.
- PENDING files: LinkedIn staff data extraction

Files modified:
- 28 custodian files enriched with staff data
- 35 PENDING files deleted (merged into proper locations)
- Originals archived to archive/pending_duplicates_20250109/

Key institutions enriched:
- Rijksmuseum (NL-NH-AMS-M-RM)
- Stedelijk Museum Amsterdam (NL-NH-AMS-M-SMA)
- Amsterdam Museum (NL-NH-AMS-M-AM)
- Regionaal Archief Alkmaar (NL-NH-ALK-A-RAA)
- Maritiem Museum Rotterdam (NL-ZH-ROT-M-MMR)
- And 23 more museums/archives across NL

New scripts:
- scripts/merge_staff_data.py: Automated staff data merger
- scripts/categorize_pending_files.py: PENDING file analysis utility
2026-01-09 14:51:17 +01:00
kempersc
38dcd2ce9c Restore YAML files for Museum Dokkum and Gemeente Smallingerland with enriched data and provenance tracking 2025-12-30 23:58:21 +01:00
kempersc
d64f857aa9 add sparql validator and RAG injector 2025-12-30 03:43:31 +01:00
kempersc
0c1d19e98b enrich entries 2025-12-23 13:27:35 +01:00
kempersc
aca68ea47f remove a,bihguous web-claims 2025-12-21 00:01:54 +01:00
kempersc
e0dd847491 extend ontology 2025-12-16 20:27:39 +01:00
kempersc
cb56aa7e40 enrich all custodian timespan 2025-12-15 22:31:41 +01:00
kempersc
181b1cf705 data: enrich Dutch heritage custodians (DR, FL, FR, GE, GR, LI provinces)
- Add digital platform discovery data with provenance
- Cleanup duplicate/incorrect custodian entries
- Add GHCID collision resolution suffixes where needed
- Update person entity profiles with career history
2025-12-15 01:34:38 +01:00
kempersc
1b1cfbfca0 enrich custodians 2025-12-11 22:32:09 +01:00
kempersc
be3fbac601 enrich entries and persons 2025-12-10 18:04:25 +01:00
kempersc
5b3d4d1ed5 normalize: add canonical location blocks (batch 3) 2025-12-09 14:14:13 +01:00
kempersc
bb41287730 normalize: add canonical location blocks (batch 1) 2025-12-09 13:17:11 +01:00
kempersc
bf7c773955 edit Japanese entries 2025-12-09 09:16:19 +01:00
kempersc
131e3ca259 normalise custodian entries 2025-12-09 07:56:35 +01:00
kempersc
ee4e57bc75 add new entries 2025-12-07 00:26:01 +01:00
kempersc
1635625032 added web annotations 2025-12-06 19:50:04 +01:00