glam/reports/CH_ANNOTATOR_INTEGRATION_FINAL.md
2025-12-07 00:26:01 +01:00

# CH-Annotator to Custodian Integration Report
**Generated**: 2025-12-06T23:23:36.329132+00:00
## Summary
| Metric | Count |
|--------|-------|
| Total custodian files | 3244 |
| Files with CH-Annotator | 170 |
| Files without CH-Annotator | 3074 |
| Integration rate | 5.2% |
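
The derived rows in the table follow from the two raw counts. As a minimal sketch of that arithmetic (the variable names are illustrative and not taken from the integration script):

```python
# Counts copied from the Summary table above.
total_files = 3244
files_with_annotator = 170

# Derived values: the remainder and the share of integrated files.
files_without_annotator = total_files - files_with_annotator
integration_rate = 100 * files_with_annotator / total_files

print(files_without_annotator)      # 3074
print(f"{integration_rate:.1f}%")   # 5.2%
```
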
## Integrated by Source File
| Source File | Custodian Files Updated |
|-------------|------------------------|
| netherlands_complete_ch_annotator.yaml | 147 |
| austria_complete_ch_annotator.yaml | 6 |
| czech_unified_ch_annotator.yaml | 5 |
| latin_american_institutions_AUTHORITATIVE_ch_annotator.yaml | 4 |
| japan_complete_ch_annotator.yaml | 3 |
| georgia_glam_institutions_enriched_ch_annotator.yaml | 2 |
| belgium_complete_ch_annotator.yaml | 1 |
| bulgaria_complete_ch_annotator.yaml | 1 |
| egypt_institutions_ch_annotator.yaml | 1 |
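
As a quick consistency check, the per-source counts should add up to the 170 integrated files reported in the summary. The sketch below assumes each custodian file was updated from exactly one source file; the dictionary is simply the table above typed in by hand.

```python
# Per-source counts copied from the table above.
updated_by_source = {
    "netherlands_complete_ch_annotator.yaml": 147,
    "austria_complete_ch_annotator.yaml": 6,
    "czech_unified_ch_annotator.yaml": 5,
    "latin_american_institutions_AUTHORITATIVE_ch_annotator.yaml": 4,
    "japan_complete_ch_annotator.yaml": 3,
    "georgia_glam_institutions_enriched_ch_annotator.yaml": 2,
    "belgium_complete_ch_annotator.yaml": 1,
    "bulgaria_complete_ch_annotator.yaml": 1,
    "egypt_institutions_ch_annotator.yaml": 1,
}

total_updated = sum(updated_by_source.values())

# Source file contributing the most updates, and its share of the total.
largest_source = max(updated_by_source, key=updated_by_source.get)
largest_share = 100 * updated_by_source[largest_source] / total_updated

print(total_updated)                              # 170
print(largest_source, f"{largest_share:.1f}%")    # netherlands_... 86.5%
```
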
## What Was Added
Each integrated custodian file now has a `ch_annotator` section containing:
- **convention_id**: `ch_annotator-v1_7_0`
- **entity_classification**: Hypernym and subtype codes (GRP.HER.LIB, GRP.HER.MUS, etc.)
- **ontology_class**: Schema.org/W3C ORG mappings
- **extraction_provenance**: Source file, timestamp, agent
- **annotation_provenance**: When the annotation was applied
- **entity_claims**: Extracted claims (name, type, location, identifiers)
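
One way to spot-check an integrated file is to confirm that the fields listed above are present. The following is a rough sketch only: `missing_ch_annotator_fields` is a hypothetical helper, the record is assumed to be already loaded into a Python dict, and `ontology_class` is accepted either directly under `ch_annotator` or nested under `entity_classification`, since this sketch does not assume one placement.

```python
# Field names taken from the list above, in the same order.
FIELDS = (
    "convention_id",
    "entity_classification",
    "ontology_class",
    "extraction_provenance",
    "annotation_provenance",
    "entity_claims",
)


def missing_ch_annotator_fields(custodian):
    """Return the listed fields that are absent from a custodian record."""
    section = custodian.get("ch_annotator")
    if not isinstance(section, dict):
        return list(FIELDS)

    classification = section.get("entity_classification")
    nested = classification if isinstance(classification, dict) else {}

    missing = []
    for field in FIELDS:
        present = field in section
        # Assumption: ontology_class may also sit inside entity_classification.
        if field == "ontology_class":
            present = present or field in nested
        if not present:
            missing.append(field)
    return missing


# Hypothetical, partially filled record used only for illustration.
example = {
    "ch_annotator": {
        "convention_id": "ch_annotator-v1_7_0",
        "entity_classification": {
            "hypernym": "GRP",
            "subtype": "GRP.HER.LIB",
            "ontology_class": "schema:Library",
        },
        "entity_claims": [],
    }
}

print(missing_ch_annotator_fields(example))
# ['extraction_provenance', 'annotation_provenance']
```

This only checks for key presence; it is not a substitute for schema validation.
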
## Example Integration
```yaml
ch_annotator:
  convention_id: ch_annotator-v1_7_0
  entity_classification:
    hypernym: GRP
    subtype: GRP.HER.LIB
    ontology_class: schema:Library
  entity_claims:
    - claim_type: full_name
      claim_value: KB, nationale bibliotheek
      property_uri: skos:prefLabel
```
```
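
Once such a section is loaded, individual claims can be looked up by `claim_type`. A minimal sketch, assuming the section is available as a plain Python dict (trimmed here to the keys the lookup uses); `claim_values` is a hypothetical helper name:

```python
# The example section as a Python dict, trimmed to the keys used below.
annotation = {
    "convention_id": "ch_annotator-v1_7_0",
    "entity_claims": [
        {
            "claim_type": "full_name",
            "claim_value": "KB, nationale bibliotheek",
            "property_uri": "skos:prefLabel",
        },
    ],
}


def claim_values(annotation, claim_type):
    """Collect the values of all claims with the given claim_type."""
    return [
        claim["claim_value"]
        for claim in annotation.get("entity_claims", [])
        if claim.get("claim_type") == claim_type
    ]


print(claim_values(annotation, "full_name"))  # ['KB, nationale bibliotheek']
print(claim_values(annotation, "location"))   # []
```
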
## Next Steps
1. **Create custodian files** for unmatched CH-Annotator institutions
2. **Validate** integrated files against LinkML schema
3. **Export** to RDF/JSON-LD with CH-Annotator provenance
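
For the export step, each claim's `property_uri` could serve directly as a JSON-LD property key. The sketch below illustrates the idea under several assumptions: `to_jsonld`, the `@context` prefixes, and the `urn:example:kb` identifier are hypothetical, provenance fields are left out, and `ontology_class` is read from either the top level or `entity_classification`.

```python
import json

# Assumed prefix mappings for the compact IRIs used in the report's example.
CONTEXT = {
    "schema": "https://schema.org/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def to_jsonld(annotation, node_id):
    """Build a minimal JSON-LD node from a ch_annotator section (sketch)."""
    classification = annotation.get("entity_classification", {})
    ontology_class = (
        annotation.get("ontology_class") or classification.get("ontology_class")
    )

    node = {"@context": CONTEXT, "@id": node_id}
    if ontology_class:
        node["@type"] = ontology_class

    # Note: a later claim with the same property_uri overwrites an earlier one.
    for claim in annotation.get("entity_claims", []):
        node[claim["property_uri"]] = claim["claim_value"]
    return node


annotation = {
    "convention_id": "ch_annotator-v1_7_0",
    "entity_classification": {
        "hypernym": "GRP",
        "subtype": "GRP.HER.LIB",
        "ontology_class": "schema:Library",
    },
    "entity_claims": [
        {
            "claim_type": "full_name",
            "claim_value": "KB, nationale bibliotheek",
            "property_uri": "skos:prefLabel",
        },
    ],
}

node = to_jsonld(annotation, "urn:example:kb")  # placeholder identifier
print(json.dumps(node, indent=2, ensure_ascii=False))
```

A real export would also need to carry the extraction and annotation provenance, which this sketch does not attempt.
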