kempersc
dfa667c90f
Fix LinkML schema for valid RDF generation with proper slot_uri
...
Summary:
- Create 46 missing slot definition files with proper slot_uri values
- Add slot imports to main schema (01_custodian_name_modular.yaml)
- Fix YAML examples sections in 116+ class and slot files
- Fix PersonObservation.yaml examples section (nested objects → string literals)
Technical changes:
- All slots now have explicit slot_uri mapping to base ontologies (RiC-O, Schema.org, SKOS)
- Eliminates malformed URIs like 'custodian/:slot_name' in generated RDF
- gen-owl now produces valid Turtle with 153,166 triples
New slot files (46):
- RiC-O slots: rico_note, rico_organizational_principle, rico_has_or_had_holder, etc.
- Scope slots: scope_includes, scope_excludes, archive_scope
- Organization slots: organization_type, governance_authority, area_served
- Platform slots: platform_type_category, portal_type_category
- Social media slots: social_media_platform_category, post_type_*
- Type hierarchy slots: broader_type, narrower_types, custodian_type_broader
- Wikidata slots: wikidata_equivalent, wikidata_mapping
Generated output:
- schemas/20251121/rdf/01_custodian_name_modular_20260107_134534_clean.owl.ttl (6.9MB)
- Validated with rdflib: 153,166 triples, no malformed URIs
2026-01-07 13:48:03 +01:00
kempersc
98c42bf272
Fix LinkML URI conflicts and generate RDF outputs
...
- Fix scope_note → finding_aid_scope_note in FindingAid.yaml
- Remove duplicate wikidata_entity slot from CustodianType.yaml (import instead)
- Remove duplicate rico_record_set_type from class_metadata_slots.yaml
- Fix range types for equals_string compatibility (uriorcurie → string)
- Move class names from close_mappings to see_also in 10 RecordSetTypes files
- Generate all RDF formats: OWL, N-Triples, RDF/XML, N3, JSON-LD context
- Sync schemas to frontend/public/schemas/
Files: 1,151 changed (includes prior CustodianType migration)
2026-01-07 12:32:59 +01:00
kempersc
b34992b1d3
Migrate all 293 class files to ontology-aligned slots
...
Extends migration to all class types (museums, libraries, galleries, etc.)
New slots added to class_metadata_slots.yaml:
- RiC-O: rico_record_set_type, rico_organizational_principle,
rico_has_or_had_holder, rico_note
- Multilingual: label_de, label_es, label_fr, label_nl, label_it, label_pt
- Scope: scope_includes, scope_excludes, custodian_only,
organizational_level, geographic_restriction
- Notes: privacy_note, preservation_note, legal_note
Migration script now handles 30+ annotation types.
All migrated schemas pass linkml-validate.
Total: 387 class files now use proper slots instead of annotations.
2026-01-06 12:24:54 +01:00
kempersc
aa763dab25
Migrate 94 archive class annotations to ontology-aligned slots
...
- Add migration script: scripts/migrate_annotations_to_slots.py
- Convert custodian_types, wikidata, skos_broader, specificity_* annotations
- Replace with proper slots mapped to SKOS, PROV-O, RiC-O predicates
- Add ../slots/class_metadata_slots import to all migrated files
- Remove AcademicArchive_refactored.yaml (main file now migrated)
- Sync changes to frontend/public/schemas/
Migration converts:
- custodian_types → hc:custodianTypes slot
- wikidata/wikidata_label → wikidata_alignment structured slot
- skos_broader → skos:broader slot
- specificity_* → specificity_annotation structured slot
- dual_class_pattern → dual_class_link structured slot
- template_specificity → template_specificity slot
All 94 migrated schemas pass linkml-validate.
2026-01-06 11:25:37 +01:00
kempersc
11983014bb
Enhance specificity scoring system integration with existing infrastructure
...
- Updated documentation to clarify integration points with existing components in the RAG pipeline and DSPy framework.
- Added detailed mapping of SPARQL templates to context templates for improved specificity filtering.
- Implemented wrapper patterns around existing classifiers to extend functionality without duplication.
- Introduced new tests for the SpecificityAwareClassifier and SPARQLToContextMapper to ensure proper integration and functionality.
- Enhanced the CustodianRDFConverter to include ISO country and subregion codes from GHCID for better geospatial data handling.
2026-01-05 17:37:49 +01:00
kempersc
41959f0766
correct HCID!
2025-12-10 13:01:13 +01:00
kempersc
131e3ca259
normalise custodian entries
2025-12-09 07:56:35 +01:00