glam/QUICK_STATUS_CUSTODIAN_SCHEMA_MOD_20251122.md

Quick Status: Custodian Multi-Aspect Refactoring

Date: 2025-11-22
Status: CORE IMPLEMENTATION COMPLETE
Priority: HIGH

What We Did

Refactored the Heritage Custodian Ontology to properly model custodians as multi-aspect entities with three independent facets.

Key Changes

1. Renamed CustodianReconstruction → CustodianLegalStatus

  • File: CustodianReconstruction.yaml → CustodianLegalStatus.yaml
  • Rationale: "Reconstruction" was ambiguous - now clearly represents the LEGAL dimension
  • class_uri: Changed to org:FormalOrganization
  • Description: Emphasizes formal legal entity with precise definition

2. Created CustodianPlace Class

  • New file: modules/classes/CustodianPlace.yaml
  • Purpose: Nominal place designation (NOT coordinates!)
  • class_uri: crm:E53_Place
  • Examples:
    • "het herenhuis in de Schilderswijk" (neighborhood reference)
    • "the mansion" (vague building reference)
    • "het museum op het Museumplein" (landmark reference)
  • Enum: Created PlaceSpecificityEnum (BUILDING, STREET, NEIGHBORHOOD, CITY, REGION, VAGUE)

3. Updated CustodianObservation

  • Critical change: Observations NO LONGER directly link to Custodian hub
  • Rationale: Only ReconstructionActivity can determine whether a custodian has been successfully identified
  • PROV-O Flow:
    CustodianObservation → prov:used → ReconstructionActivity
    ReconstructionActivity → prov:wasGeneratedBy → CustodianLegalStatus/Name/Place
    CustodianLegalStatus/Name/Place → refers_to_custodian → Custodian
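
The flow above can be sketched as plain subject-predicate-object triples. One point worth keeping in mind: in PROV-O an activity `prov:used` an entity, and an entity `prov:wasGeneratedBy` an activity, so the arrows in the list show processing order rather than triple direction. The identifiers (`obs1`, `act1`, `hub1`, ...) and both helper functions are hypothetical. This is a minimal sketch, not the project's tooling.

```python
# Hypothetical identifiers; triples follow PROV-O's property directions.
TRIPLES = [
    ("act1", "prov:used", "obs1"),               # activity consumed the observation
    ("legal1", "prov:wasGeneratedBy", "act1"),   # aspect produced by the activity
    ("name1", "prov:wasGeneratedBy", "act1"),
    ("legal1", "refers_to_custodian", "hub1"),   # aspect points at the hub
    ("name1", "refers_to_custodian", "hub1"),
]
TYPES = {
    "obs1": "CustodianObservation",
    "act1": "ReconstructionActivity",
    "legal1": "CustodianLegalStatus",
    "name1": "CustodianName",
    "hub1": "Custodian",
}


def observations_linked_to_hub(triples, types):
    """Return observations that point straight at a hub (no longer allowed)."""
    return [
        s for s, p, o in triples
        if p == "refers_to_custodian" and types.get(s) == "CustodianObservation"
    ]


def hubs_reachable_from(observation, triples):
    """Follow observation -> activity -> aspect -> hub and return the hubs found."""
    activities = {s for s, p, o in triples if p == "prov:used" and o == observation}
    aspects = {
        s for s, p, o in triples
        if p == "prov:wasGeneratedBy" and o in activities
    }
    return {o for s, p, o in triples if p == "refers_to_custodian" and s in aspects}
```

With this data, `observations_linked_to_hub(TRIPLES, TYPES)` is empty, while `hubs_reachable_from("obs1", TRIPLES)` still reaches `hub1` through the activity and its generated aspects.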
    

4. Updated Custodian Hub

  • Added slots:
    • legal_status → CustodianLegalStatus (formal legal entity)
    • place_designation → CustodianPlace (nominal place reference)
    • preferred_label → CustodianName (already existed)
  • Hub now aggregates THREE independent aspects
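
As a rough illustration of the hub's shape, the three slots can be pictured as optional links on a small record. The slot names come from the list above. The `hub_id` field, the use of plain string identifiers, and the assumption that every slot is optional are illustrative only and not taken from the schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Custodian:
    """Hub record: an identifier plus links to its aspects."""
    hub_id: str                               # hypothetical field name
    legal_status: Optional[str] = None        # id of a CustodianLegalStatus
    place_designation: Optional[str] = None   # id of a CustodianPlace
    preferred_label: Optional[str] = None     # id of a CustodianName

    def aspects(self) -> dict:
        """Return only the aspect links that are populated."""
        slots = {
            "legal_status": self.legal_status,
            "place_designation": self.place_designation,
            "preferred_label": self.preferred_label,
        }
        return {name: target for name, target in slots.items() if target is not None}


# A hub for which only a name has been reconstructed so far.
partial = Custodian("hub1", preferred_label="name1")
```

Treating each link as optional mirrors the idea that the aspects are independent: a hub may carry a name without a legal status, or the reverse.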

5. Updated Main Schema

  • Imports: Added CustodianPlace, PlaceSpecificityEnum
  • Renamed: CustodianReconstruction → CustodianLegalStatus
  • Documentation: Updated to reflect multi-aspect architecture

Architectural Pattern

Three Aspects of a Custodian

CustodianObservation (Evidence)
        ↓ prov:used
ReconstructionActivity (Process)
        ↓ prov:wasGeneratedBy (0, 1, 2, or 3 outputs)
        ├─→ CustodianLegalStatus (formal legal entity)
        ├─→ CustodianName (emic label)
        └─→ CustodianPlace (nominal place reference)
                ↓ refers_to_custodian
            Custodian (hub)

Example: Rijksmuseum

All three aspects identify the SAME custodian:

  1. CustodianLegalStatus: "Stichting Rijksmuseum" (legal entity, KvK 41215422)
  2. CustodianName: "Rijksmuseum" (emic label, how it presents itself)
  3. CustodianPlace: "het museum op het Museumplein" (place reference)
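
The Rijksmuseum example can be written out as three small records that share one hub reference. The values are those listed above. The hub identifier and the field names `type`, `legal_name`, `kvk_number`, and `label` are hypothetical, chosen only for this sketch.

```python
HUB_ID = "custodian/rijksmuseum"  # hypothetical identifier

aspects = [
    {
        "type": "CustodianLegalStatus",
        "legal_name": "Stichting Rijksmuseum",   # hypothetical field name
        "kvk_number": "41215422",                # hypothetical field name
        "refers_to_custodian": HUB_ID,
    },
    {
        "type": "CustodianName",
        "label": "Rijksmuseum",                  # hypothetical field name
        "refers_to_custodian": HUB_ID,
    },
    {
        "type": "CustodianPlace",
        "place_name": "het museum op het Museumplein",
        "place_language": "nl",
        "refers_to_custodian": HUB_ID,
    },
]


def same_custodian(records):
    """True when every aspect record refers to one and the same hub."""
    return len({record["refers_to_custodian"] for record in records}) == 1
```

Here `same_custodian(aspects)` holds because all three records carry the same `refers_to_custodian` value, even though each describes the custodian from a different angle.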

Distinction: CustodianPlace vs Location

| CustodianPlace | Location |
|---|---|
| Nominal reference | Geographic coordinates |
| "the mansion in the Schilderswijk" | lat: 52.0705, lon: 4.2894 |
| Emic/contextual | Precise/measured |
| May be ambiguous | Unambiguous |
| Identifies custodian | Locates custodian |
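
The distinction can also be expressed as two unrelated types: one holds a name and a specificity level, the other holds measured coordinates. The enum values and the `place_*` slot names come from this document. The `Location` class with `lat`/`lon` fields and its range check are assumptions made for the sketch.

```python
from dataclasses import dataclass
from enum import Enum


class PlaceSpecificity(Enum):
    BUILDING = "BUILDING"
    STREET = "STREET"
    NEIGHBORHOOD = "NEIGHBORHOOD"
    CITY = "CITY"
    REGION = "REGION"
    VAGUE = "VAGUE"


@dataclass(frozen=True)
class CustodianPlace:
    """Nominal designation: text that identifies, but carries no coordinates."""
    place_name: str
    place_specificity: PlaceSpecificity
    place_language: str


@dataclass(frozen=True)
class Location:
    """Measured position (hypothetical class layout)."""
    lat: float
    lon: float

    def __post_init__(self):
        # Reject values outside the valid latitude/longitude ranges.
        if not (-90 <= self.lat <= 90 and -180 <= self.lon <= 180):
            raise ValueError("coordinates out of range")


nominal = CustodianPlace(
    "the mansion in the Schilderswijk", PlaceSpecificity.NEIGHBORHOOD, "en"
)
measured = Location(lat=52.0705, lon=4.2894)
```

Keeping the two as separate types means a nominal reference can never be mistaken for a coordinate pair, and a `Location` can be validated numerically while a `CustodianPlace` cannot.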

Files Modified

Core Classes (5 files)

  1. CustodianReconstruction.yaml → CustodianLegalStatus.yaml - RENAMED + UPDATED
  2. CustodianPlace.yaml - NEW
  3. CustodianObservation.yaml - REMOVED refers_to_custodian
  4. Custodian.yaml - ADDED legal_status + place_designation
  5. CustodianName.yaml - Already updated in previous session

Enums (1 new file)

  1. PlaceSpecificityEnum.yaml - NEW

Main Schema (1 file)

  1. 01_custodian_name_modular.yaml - UPDATED imports + documentation

Next Steps

  • Validate schema with gen-owl
  • Create example instances for all three aspects
  • Update UML diagrams
  • Regenerate all RDF formats
  • Create multi-aspect modeling guide
  • Update PROV-O documentation

Key Principles Established

  1. Multi-Aspect Modeling: Custodians have THREE independent aspects (legal, name, place)
  2. Observations Are Input: CustodianObservation does NOT directly link to hub
  3. Activity Generates Aspects: ReconstructionActivity may generate 0-3 aspects
  4. Hub Aggregates Aspects: Custodian links to all three aspects
  5. Nominal vs Geographic: CustodianPlace (nominal) ≠ Location (coordinates)

Status: Core implementation complete
Priority: HIGH
Impact: Fundamental - Multi-aspect modeling, removed observation→hub link, added place aspect

Validation Results

Schema Validation SUCCESSFUL

  • Generated OWL file: 2,630 lines
  • All slot definitions created
  • All imports resolved
  • No critical errors

Files Created/Modified Summary

New Files (7):

  1. modules/classes/CustodianPlace.yaml - Place aspect class
  2. modules/enums/PlaceSpecificityEnum.yaml - Place specificity enum
  3. modules/slots/place_designation.yaml - Hub → Place link
  4. modules/slots/place_name.yaml - Nominal place name
  5. modules/slots/place_language.yaml - Place name language
  6. modules/slots/place_specificity.yaml - Specificity level
  7. modules/slots/place_note.yaml - Contextual notes

Renamed Files (1):

  1. modules/classes/CustodianReconstruction.yaml → CustodianLegalStatus.yaml

Modified Files (5):

  1. modules/classes/CustodianObservation.yaml - Removed refers_to_custodian
  2. modules/classes/Custodian.yaml - Added legal_status + place_designation
  3. modules/classes/CustodianName.yaml - Already updated (previous session)
  4. modules/classes/CustodianLegalStatus.yaml - Updated description
  5. 01_custodian_name_modular.yaml - Updated imports + documentation

Batch Updated (22 files):

  • All module files with references to CustodianReconstruction updated to CustodianLegalStatus

Generated Artifacts

RDF Serialization Formats (4 formats) ALL COMPLETE

All generated from LinkML using gen-owl + rdfpipe (with stderr redirection):

  1. OWL/Turtle: schemas/20251121/rdf/custodian_multi_aspect_20251122_154430.owl.ttl (159KB, 2,619 lines)
  2. N-Triples: schemas/20251121/rdf/custodian_multi_aspect_20251122_154430.nt (456KB, 3,027 lines)
  3. JSON-LD: schemas/20251121/rdf/custodian_multi_aspect_20251122_154430.jsonld (380KB, 14,094 lines)
  4. RDF/XML: schemas/20251121/rdf/custodian_multi_aspect_20251122_154430.rdf (328KB, 4,585 lines)

UML Diagrams

  1. Mermaid: schemas/20251121/uml/mermaid/01_custodian_multi_aspect_20251122_154136.mmd (745B)

Example Instances

  1. Complete Multi-Aspect Example: schemas/20251121/examples/multi_aspect_rijksmuseum_complete.yaml
    • Demonstrates all three aspects (Legal Status, Name, Place) working together
    • Shows PROV-O observation → activity → entity flow
    • Includes confidence measures and temporal validity
    • ~200 lines of fully documented YAML

Verification Results

  • 34 CustodianLegalStatus references in RDF
  • 15 CustodianPlace references in RDF
  • 21 PlaceSpecificityEnum references in RDF
  • Schema validates with gen-owl (no critical errors)
  • All imports resolved correctly
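
Reference counts like these can be reproduced with a simple line scan over a serialized file. The function below counts lines mentioning a term, roughly what `grep -c` would report. How the counts above were actually obtained is not stated, so treat this as one plausible approach. The sample data and the `https://example.org/` namespace are made up for illustration.

```python
def count_references(rdf_text: str, term: str) -> int:
    """Count lines that mention `term` (similar in spirit to `grep -c`)."""
    return sum(1 for line in rdf_text.splitlines() if term in line)


# Tiny hand-written N-Triples sample; the namespace is hypothetical.
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
OWL_CLASS = "<http://www.w3.org/2002/07/owl#Class>"
RDFS_RANGE = "<http://www.w3.org/2000/01/rdf-schema#range>"
SAMPLE = "\n".join([
    f"<https://example.org/CustodianLegalStatus> {RDF_TYPE} {OWL_CLASS} .",
    f"<https://example.org/CustodianPlace> {RDF_TYPE} {OWL_CLASS} .",
    f"<https://example.org/legal_status> {RDFS_RANGE} <https://example.org/CustodianLegalStatus> .",
])

counts = {
    term: count_references(SAMPLE, term)
    for term in ("CustodianLegalStatus", "CustodianPlace", "CustodianReconstruction")
}
```

The same scan doubles as a rename check: a non-zero count for `CustodianReconstruction` in a freshly generated file would indicate a reference the batch update missed.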

Complete File Inventory

New Components (8 files)

  1. modules/classes/CustodianPlace.yaml - Place aspect class
  2. modules/enums/PlaceSpecificityEnum.yaml - Place specificity levels
  3. modules/slots/place_designation.yaml - Hub → Place link
  4. modules/slots/place_name.yaml - Nominal place name
  5. modules/slots/place_language.yaml - Place name language
  6. modules/slots/place_specificity.yaml - Specificity level
  7. modules/slots/place_note.yaml - Contextual notes
  8. examples/multi_aspect_rijksmuseum_complete.yaml - Complete example instance

Renamed Components (1 file)

  1. modules/classes/CustodianReconstruction.yaml → CustodianLegalStatus.yaml

Modified Components (5 files)

  1. modules/classes/CustodianObservation.yaml - Removed refers_to_custodian slot
  2. modules/classes/Custodian.yaml - Added legal_status + place_designation slots
  3. modules/classes/CustodianName.yaml - Already updated (previous session)
  4. modules/classes/CustodianLegalStatus.yaml - Updated description + ontology mappings
  5. 01_custodian_name_modular.yaml - Updated imports + documentation

Batch Updated (22+ files)

  • All module files that referenced CustodianReconstruction now reference CustodianLegalStatus

Generated Artifacts (6 files) ALL COMPLETE

  1. rdf/custodian_multi_aspect_20251122_154430.owl.ttl - OWL/Turtle (159KB, 2,619 lines)
  2. rdf/custodian_multi_aspect_20251122_154430.nt - N-Triples (456KB, 3,027 lines)
  3. rdf/custodian_multi_aspect_20251122_154430.jsonld - JSON-LD (380KB, 14,094 lines)
  4. rdf/custodian_multi_aspect_20251122_154430.rdf - RDF/XML (328KB, 4,585 lines)
  5. uml/mermaid/01_custodian_multi_aspect_20251122_154136.mmd - Mermaid UML (745B)
  6. QUICK_STATUS_CUSTODIAN_SCHEMA_MOD_20251122.md - This document

Total Files Affected: 42+ files (8 new, 1 renamed, 5 modified, 22+ batch updated, 6 generated)


Impact Assessment

Ontological Changes

  • Breaking change: CustodianObservation NO LONGER directly links to Custodian
  • New pattern: Three-aspect modeling (legal, name, place)
  • PROV-O alignment: Proper observation → activity → entity flow
  • Hub architecture: Custodian aggregates multiple independent aspects

Data Migration Required

  • Existing instances: Update any instances that still use the old CustodianReconstruction class (now CustodianLegalStatus)
  • Observation links: Remove any direct refers_to_custodian values from CustodianObservation instances
  • Hub structure: Update Custodian hubs to include legal_status + place_designation
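
The three migration steps could be applied per instance along the following lines. The record layout (a dict with a `type` key) is hypothetical, and no migration script exists yet according to this document, so this is only a sketch of the intended transformations.

```python
import copy


def migrate_instance(instance: dict) -> dict:
    """Return a migrated copy of one instance record (layout is hypothetical)."""
    migrated = copy.deepcopy(instance)
    kind = migrated.get("type")
    if kind == "CustodianReconstruction":
        # Class rename.
        migrated["type"] = "CustodianLegalStatus"
    elif kind == "CustodianObservation":
        # Observations no longer link directly to the hub.
        migrated.pop("refers_to_custodian", None)
    elif kind == "Custodian":
        # New hub slots; left empty here, to be filled by later reconstruction.
        migrated.setdefault("legal_status", None)
        migrated.setdefault("place_designation", None)
    return migrated


old_observation = {
    "type": "CustodianObservation",
    "observed_name": "Rijksmuseum",        # hypothetical field name
    "refers_to_custodian": "hub1",
}
new_observation = migrate_instance(old_observation)
```

Working on a deep copy keeps the original record untouched, which makes it straightforward to diff old and new instances before committing a migration.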

Benefits

  1. Precision: Clear separation of legal entity (precise) vs. name (ambiguous) vs. place (nominal)
  2. Flexibility: Can have legal status without name, or name without legal status
  3. Temporal modeling: Each aspect can change independently over time
  4. Source transparency: All aspects explicitly derived from observations
  5. Ontology alignment: Better mapping to CIDOC-CRM, PROV-O, W3C Org

Next Actions (Priority Order)

Immediate (Before next commit)

  • Validate schema with gen-owl - DONE
  • Generate RDF serializations (4 formats) - DONE
  • Generate UML diagrams - DONE
  • Create example instance - DONE
  • Document architectural changes - DONE

Short-term (This week)

  • Migrate existing example instances to multi-aspect pattern
  • Update documentation with multi-aspect modeling guide
  • Create data migration script for existing instances
  • Add validation rules for multi-aspect constraints

Medium-term (Next sprint)

  • Update PROV-O alignment documentation
  • Create additional example instances (individuals, groups, governments)
  • Generate full TypeDB schema from LinkML
  • Create Mermaid visualization of observation → activity → entity flow

Long-term (Future work)

  • Implement Collection aspect (fourth aspect)
  • Add Event aspect (organizational change events)
  • Create Person aspect (staff, curators via PiCo pattern)
  • Full integration with TOOI, CPOV, CIDOC-CRM

Document Status: COMPLETE
Schema Status: VALIDATED
Generated Artifacts: ALL PRODUCED
Examples: COMPREHENSIVE INSTANCE CREATED
Next Session: Ready for data migration + additional examples


Key Takeaway: The Heritage Custodian Ontology now properly models custodians as multi-aspect entities with three independent facets (legal status, name, place), all derived from observations through formal reconstruction activities. This provides the foundation for nuanced, temporally-aware, source-transparent heritage metadata.