Commit graph

16 commits

Author SHA1 Message Date
kempersc
fd792fce2c Refactor code structure for improved readability and maintainability
Some checks failed
Deploy Frontend / build-and-deploy (push) Has been cancelled
2026-01-11 15:27:14 +01:00
kempersc
7d09e4179c Add US surnames dataset from 2010 Census with metadata and surname counts 2026-01-11 12:28:58 +01:00
kempersc
b2b21abe2b fix: mark 39 Google Maps false matches for Type I intangible heritage custodians
Per Rule 40 (KIEN authoritative source), Google Maps frequently returns
false matches for intangible heritage organizations. These are virtual
networks without commercial storefronts.

Changes:
- Mark google_maps_enrichment.status as FALSE_MATCH
- Preserve original data in original_false_match for audit trail
- Add correction_timestamp and correction_agent provenance
- Special handling for NL-GE-TIE-I-M (Stichting MOZA): also fixed
  YouTube false match (Mozart channel) and removed ~1750 lines of
  irrelevant video data

Detection method: Domain mismatch between Google Maps website field
and official KIEN registry website.
2026-01-08 12:16:39 +01:00
kempersc
4f0cafe98a enrich HC profiles 2026-01-02 02:11:04 +01:00
kempersc
38dcd2ce9c Restore YAML files for Museum Dokkum and Gemeente Smallingerland with enriched data and provenance tracking 2025-12-30 23:58:21 +01:00
kempersc
d64f857aa9 add sparql validator and RAG injector 2025-12-30 03:43:31 +01:00
kempersc
7a056fa746 enrich entries 2025-12-21 22:12:34 +01:00
kempersc
99430c2a70 add new entries and semantic routing 2025-12-17 10:11:56 +01:00
kempersc
cb56aa7e40 enrich all custodian timespan 2025-12-15 22:31:41 +01:00
kempersc
181b1cf705 data: enrich Dutch heritage custodians (DR, FL, FR, GE, GR, LI provinces)
- Add digital platform discovery data with provenance
- Cleanup duplicate/incorrect custodian entries
- Add GHCID collision resolution suffixes where needed
- Update person entity profiles with career history
2025-12-15 01:34:38 +01:00
kempersc
c6aee998db correct person labels 2025-12-14 17:29:39 +01:00
kempersc
c283daa1a2 normalise dutch entries 2025-12-09 08:02:27 +01:00
kempersc
131e3ca259 normalise custodian entries 2025-12-09 07:56:35 +01:00
kempersc
7e3559f7e5 add new entries 2025-12-07 23:08:02 +01:00
kempersc
ee4e57bc75 add new entries 2025-12-07 00:26:01 +01:00
kempersc
1635625032 added web annotations 2025-12-06 19:50:04 +01:00
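
Note on commit b2b21abe2b: the detection method it describes (domain mismatch between the Google Maps website field and the official KIEN registry website) could be sketched roughly as below. This is only an illustration, not the repository's actual script. The function names `site_host` and `mark_false_match`, the `website` key, the `example-agent` default, and the nesting of `original_false_match` inside `google_maps_enrichment` are all assumptions; only the field names `status`, `original_false_match`, `correction_timestamp` and `correction_agent` come from the commit message.

```python
from datetime import datetime, timezone
from urllib.parse import urlparse


def site_host(url):
    """Return the lowercased host of a URL without a leading 'www.' (hypothetical helper)."""
    if not url:
        return ""
    # urlparse only fills in the host when a scheme is present, so add one if missing.
    if "://" not in url:
        url = "https://" + url
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def mark_false_match(entry, kien_website, agent="example-agent"):
    """Flag the Google Maps enrichment as FALSE_MATCH when its website host
    differs from the KIEN registry website host. Returns True if flagged.

    The entry layout is assumed, not taken from the repository.
    """
    enrichment = entry.get("google_maps_enrichment")
    if not enrichment:
        return False

    maps_host = site_host(enrichment.get("website"))
    kien_host = site_host(kien_website)
    # Without both hosts there is nothing to compare; identical hosts are a match.
    if not maps_host or not kien_host or maps_host == kien_host:
        return False

    entry["google_maps_enrichment"] = {
        "status": "FALSE_MATCH",
        # Keep the untouched original data for the audit trail.
        "original_false_match": dict(enrichment),
        "correction_timestamp": datetime.now(timezone.utc).isoformat(),
        "correction_agent": agent,
    }
    return True
```

A plain host comparison like this treats subdomains as different sites, so a real check would probably need a more tolerant comparison or a manual review step.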