Commit graph

5 commits

Author SHA1 Message Date
kempersc
b30711fcfb update slots 2026-01-14 09:05:54 +01:00
kempersc
d51bba5003 data: update entity resolution confidence scores
Regenerated confidence scores with updated scoring algorithm:
- Total candidates: 78,746
- Adjusted: 2,832 (was 3,869)
- Boosted: 2,499 (was 3,192)
- Penalized: 333 (was 677)
- Likely wrong person: 533
- Reviews preserved: 57

Confidence scoring version: 2.0
2026-01-13 21:54:18 +01:00
kempersc
92b490d690 edit slots 2026-01-13 20:35:11 +01:00
kempersc
f74513e8ef feat: Enhance entity resolution with email semantics and review merging
- Updated `entity_review.py` to map email semantic fields from JSON.
- Expanded `email_semantics.py` with additional museum mappings.
- Introduced a new rule in `.opencode/rules/no-duplicate-ontology-mappings.md` to prevent duplicate ontology mappings.
- Added a backup JSON file for entity resolution candidates.
- Created `enrich_email_semantics.py` to enrich candidates with email semantic signals.
- Developed `merge_entity_reviews.py` to merge reviewed decisions from a backup into new candidates.
2026-01-13 16:43:56 +01:00
kempersc
355d8be51d centralise slots 2026-01-12 14:33:56 +01:00
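
Commit f74513e8ef describes `merge_entity_reviews.py` as merging reviewed decisions from a backup into new candidates. The script itself is not shown in this log, so the following is only a minimal sketch of that idea under stated assumptions: candidates are JSON-style dicts, the key `candidate_id` and the review fields `review_decision`, `reviewed_by`, and `reviewed_at` are hypothetical names, and the function `merge_reviews` is illustrative rather than the repository's actual code.

```python
from typing import Any

# Hypothetical field names; the real candidate schema is not shown in the commit log.
KEY_FIELD = "candidate_id"
REVIEW_FIELDS = ("review_decision", "reviewed_by", "reviewed_at")


def merge_reviews(
    backup: list[dict[str, Any]],
    fresh: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], int]:
    """Copy review fields from reviewed backup candidates onto matching fresh ones.

    Returns the merged candidates and the number of reviews carried over.
    """
    # Index only backup candidates that have an identifier and a recorded decision.
    reviewed = {
        c[KEY_FIELD]: c
        for c in backup
        if KEY_FIELD in c and c.get("review_decision") is not None
    }

    merged: list[dict[str, Any]] = []
    preserved = 0
    for cand in fresh:
        out = dict(cand)  # shallow copy so the input list is left untouched
        old = reviewed.get(cand.get(KEY_FIELD))
        if old is not None:
            # Carry over only the review fields; scores stay as regenerated.
            for field in REVIEW_FIELDS:
                if field in old:
                    out[field] = old[field]
            preserved += 1
        merged.append(out)
    return merged, preserved


if __name__ == "__main__":
    backup = [
        {"candidate_id": "a", "score": 0.4, "review_decision": "match", "reviewed_by": "r1"},
        {"candidate_id": "b", "score": 0.7},
    ]
    fresh = [
        {"candidate_id": "a", "score": 0.9},
        {"candidate_id": "c", "score": 0.2},
    ]
    merged, preserved = merge_reviews(backup, fresh)
    print(preserved, merged)
```

In this sketch the regenerated score always wins and only the human review fields are copied, so a rescoring run does not discard earlier review work. The returned count plays a role similar to the "Reviews preserved" figure reported in d51bba5003, although whether the real script computes it this way is an assumption.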