glam/SESSION_SUMMARY_20251121_LINKML_HUB_ARCHITECTURE_COMPLETE.md
kempersc 284b575e88 Add UML diagrams for Custodian Hub v2 in Mermaid and PlantUML formats
- Introduced a new Mermaid diagram for Custodian Hub v2, detailing entities such as CustodianReconstruction, Identifier, TimeSpan, Agent, CustodianName, CustodianObservation, ReconstructionActivity, Appellation, ConfidenceMeasure, Custodian, LanguageCode, and SourceDocument.
- Established relationships between entities, including temporal extents, derivations, and revisions.
- Added a comprehensive PlantUML diagram reflecting the same structure and relationships, including enumerations for various types and statuses relevant to custodians and observations.
- Enhanced documentation to clarify the hub architecture pattern and its implications for data integrity and source authority.
2025-11-21 22:30:07 +01:00

LinkML Schema Update: Hub Architecture Implementation

Date: 2025-11-21
Version: 0.1.0

Summary

All LinkML schema files were updated to implement the Heritage Custodian Hub Architecture, with persistent identifiers under https://nde.nl/ontology/hc/.

Core Changes

1. New Slots Created (7 files)

Hub Architecture Slots:

  • hc_id.yaml - Persistent identifier for custodian hub
  • refers_to_custodian.yaml - Links observations/reconstructions to hub
  • observation_source.yaml - Source tracking for observations
  • reconstruction_method.yaml - Methodology documentation
  • entity_type.yaml - Entity categorization for reconstructions
  • emic_name.yaml - Self-designated custodian names
  • name_language.yaml - Language codes for names

2. New Enum Created (1 file)

EntityTypeEnum.yaml:

  • INDIVIDUAL - Single person custodians
  • GROUP - Informal groups/collectives
  • ORGANIZATION - Formal organizations
  • GOVERNMENT - Government bodies/agencies
  • CORPORATION - Commercial corporations

3. Updated Classes (4 files)

Custodian.yaml

Changes:

  • Updated description to emphasize hub role
  • Changed primary slot from id to hc_id
  • Added pattern validation for hc_id format
  • Added examples showing minimal hub structure
  • Updated comments to explain hub architecture

Key insight: Custodian is now explicitly a minimal hub containing only the persistent identifier.

CustodianObservation.yaml

Changes:

  • Added refers_to_custodian slot (links to hub)
  • Added observation_source slot (simplified source tracking)
  • Updated slot_usage with hub reference documentation
  • Added examples showing hub linkage

Key insight: Observations now explicitly reference the hub they describe.

CustodianName.yaml

Changes:

  • Added emic_name slot (observed self-designated name)
  • Added name_language slot (language code)
  • Updated slot_usage with clarification that names are NOT identifiers
  • Added examples emphasizing emic naming convention

Key insight: Names are observations about custodians, not their identifiers.

CustodianReconstruction.yaml

Changes:

  • Added refers_to_custodian slot (links to hub)
  • Added entity_type slot (categorization)
  • Added reconstruction_method slot (methodology)
  • Updated slot_usage with hub reference documentation
  • Added examples showing synthesis from observations

Key insight: Reconstructions are interpretations that reference the hub.

4. Updated Main Schema (1 file)

01_custodian_name_modular.yaml

Changes:

  • Updated description to explain hub architecture
  • Added imports for 7 new hub architecture slots
  • Added import for EntityTypeEnum
  • Added explicit enum imports section
  • Updated comments to reflect new structure

Hub Architecture Pattern

Custodian (Hub)
├── hc_id: https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804
│
├── CustodianObservation (multiple)
│   ├── refers_to_custodian → hub
│   ├── observation_source
│   └── observation_date
│
├── CustodianName (multiple, subclass of Observation)
│   ├── refers_to_custodian → hub
│   ├── emic_name
│   ├── name_language
│   └── name_validity_period → TimeSpan
│
└── CustodianReconstruction (multiple)
    ├── refers_to_custodian → hub
    ├── entity_type
    ├── temporal_extent → TimeSpan
    └── reconstruction_method
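
The tree above can be mirrored in plain Python to make the direction of the references concrete. This is only an illustrative sketch, not the generated LinkML code: the class and slot names come from this summary, while the `records_for` helper, the source label, the method text, and the choice of ORGANIZATION as entity type are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EntityType(Enum):
    # Mirrors the EntityTypeEnum values listed in this summary.
    INDIVIDUAL = "INDIVIDUAL"
    GROUP = "GROUP"
    ORGANIZATION = "ORGANIZATION"
    GOVERNMENT = "GOVERNMENT"
    CORPORATION = "CORPORATION"


@dataclass(frozen=True)
class Custodian:
    """Minimal hub: carries only the persistent identifier."""
    hc_id: str


@dataclass
class CustodianObservation:
    refers_to_custodian: str  # hc_id of the hub being described
    observation_source: str
    observation_date: Optional[str] = None


@dataclass
class CustodianName(CustodianObservation):
    # A name is an observation about the hub, never its identifier.
    emic_name: str = ""
    name_language: str = ""


@dataclass
class CustodianReconstruction:
    refers_to_custodian: str  # hc_id of the hub being interpreted
    entity_type: EntityType
    reconstruction_method: str


def records_for(hub, records):
    """Hypothetical helper: select every record that points at the given hub."""
    return [r for r in records if r.refers_to_custodian == hub.hc_id]


hub = Custodian(hc_id="https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804")
records = [
    CustodianName(
        refers_to_custodian=hub.hc_id,
        observation_source="hypothetical-source-a",  # placeholder source label
        emic_name="Rijksmuseum",
        name_language="nl",
    ),
    CustodianReconstruction(
        refers_to_custodian=hub.hc_id,
        entity_type=EntityType.ORGANIZATION,  # illustrative choice
        reconstruction_method="hypothetical manual synthesis",
    ),
]
linked = records_for(hub, records)
```

Note that the hub object never stores its observations or reconstructions; the link always runs from the record to the hub, which is what lets new records accumulate without touching the hub itself.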

Persistent Identifier Format

Pattern: https://nde.nl/ontology/hc/{abstracted-ghcid}
Example: https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804
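
A quick way to sanity-check identifiers against this format is a regular expression. The sketch below is an assumption rather than the pattern stored in hc_id.yaml: it guesses, from the single example above, that the abstracted GHCID consists of lowercase alphanumeric segments joined by hyphens. The names `HC_ID_RE` and `is_valid_hc_id` are hypothetical.

```python
import re

# Assumed shape, inferred from the one example in this summary:
# base URI followed by lowercase alphanumeric segments joined by hyphens.
HC_ID_RE = re.compile(r"https://nde\.nl/ontology/hc/[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_hc_id(value: str) -> bool:
    """Return True if the whole string matches the assumed hc_id shape."""
    return HC_ID_RE.fullmatch(value) is not None


example_ok = is_valid_hc_id("https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804")
example_bad = is_valid_hc_id("https://nde.nl/ontology/hc/")
```

If the real pattern in the schema is stricter (for instance, fixed segment counts or lengths), the expression should be tightened to match it.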

Derivation:

Files Modified

Created (8 files)

modules/slots/hc_id.yaml
modules/slots/refers_to_custodian.yaml
modules/slots/observation_source.yaml
modules/slots/reconstruction_method.yaml
modules/slots/entity_type.yaml
modules/slots/emic_name.yaml
modules/slots/name_language.yaml
modules/enums/EntityTypeEnum.yaml

Modified (5 files)

modules/classes/Custodian.yaml
modules/classes/CustodianObservation.yaml
modules/classes/CustodianName.yaml
modules/classes/CustodianReconstruction.yaml
01_custodian_name_modular.yaml
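
The file lists above lend themselves to a simple presence check before running any generator. The following is a minimal sketch that assumes all thirteen paths are relative to the directory containing 01_custodian_name_modular.yaml; that layout is not stated in this summary, and the `missing_files` helper is hypothetical.

```python
from pathlib import Path

# Paths copied from the "Created" and "Modified" lists in this summary.
EXPECTED_FILES = [
    "modules/slots/hc_id.yaml",
    "modules/slots/refers_to_custodian.yaml",
    "modules/slots/observation_source.yaml",
    "modules/slots/reconstruction_method.yaml",
    "modules/slots/entity_type.yaml",
    "modules/slots/emic_name.yaml",
    "modules/slots/name_language.yaml",
    "modules/enums/EntityTypeEnum.yaml",
    "modules/classes/Custodian.yaml",
    "modules/classes/CustodianObservation.yaml",
    "modules/classes/CustodianName.yaml",
    "modules/classes/CustodianReconstruction.yaml",
    "01_custodian_name_modular.yaml",
]


def missing_files(schema_root):
    """Return the expected paths that do not exist under schema_root."""
    root = Path(schema_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

Calling `missing_files(".")` from the schema directory returns an empty list when every file is in place, and otherwise names the files that still need to be created.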

Example Data (1 file)

examples/hub_architecture_rijksmuseum.yaml

Benefits of Hub Architecture

  1. Persistent Identifiers: Stable URIs that don't change as observations accumulate
  2. No Privileged Source: All observations are equal; none is "authoritative"
  3. Conflict Tolerance: Contradictory observations can coexist without resolution
  4. Temporal Evolution: Names, structures, and interpretations evolve independently
  5. Complete Provenance: Every piece of data tracks its source
  6. Clean Separation: Evidence (observations) separate from interpretation (reconstructions)
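
Points 2, 3, and 5 can be illustrated with a small grouping routine: contradictory name observations for the same hub are kept side by side, each with its source, and nothing is merged or ranked. This is a sketch under stated assumptions; the slot names come from this summary, while the two source labels, the second name variant, and the `names_by_hub` function are hypothetical.

```python
from collections import defaultdict


def names_by_hub(observations):
    """Group name observations by hub without resolving conflicts."""
    grouped = defaultdict(list)
    for obs in observations:
        grouped[obs["refers_to_custodian"]].append(
            (obs["emic_name"], obs["name_language"], obs["observation_source"])
        )
    return dict(grouped)


HUB = "https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804"
observations = [
    {
        "refers_to_custodian": HUB,
        "emic_name": "Rijksmuseum",
        "name_language": "nl",
        "observation_source": "hypothetical-source-a",  # placeholder
    },
    {
        "refers_to_custodian": HUB,
        "emic_name": "Rijksmuseum Amsterdam",  # illustrative variant
        "name_language": "en",
        "observation_source": "hypothetical-source-b",  # placeholder
    },
]
grouped = names_by_hub(observations)
```

Both names survive under the same hub key, and each still carries the source it came from; choosing a preferred form is left to a reconstruction, not to the observation layer.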

Ontology Alignments

  • CIDOC-CRM: E39_Actor (Custodian), E41_Appellation (CustodianName), E52_Time-Span (TimeSpan)
  • RiC-O: rico:Agent (CustodianReconstruction)
  • PiCo: pico:PersonObservation (CustodianObservation pattern)
  • PROV-O: prov:Agent, prov:Entity, prov:hadPlan
  • Dublin Core: dcterms:identifier, dcterms:references, dcterms:source
  • SKOS: skos:prefLabel (emic names)
  • Schema.org: schema:Organization, schema:Person
  • CPOV: cpov:PublicOrganisation
  • W3C Org: org:Organization, org:FormalOrganization

Validation

To validate the schema:

cd /Users/kempersc/apps/glam/schemas/20251121/
linkml-validate --schema linkml/01_custodian_name_modular.yaml examples/hub_architecture_rijksmuseum.yaml

Next Steps

  1. Update LinkML schema files with hub architecture (completed in this update)
  2. Generate RDF/OWL from updated schema
  3. Generate UML diagrams (PlantUML, Mermaid)
  4. Create SHACL shapes for validation
  5. Test with real-world data
  6. Document SPARQL query patterns

Impact

The Heritage Custodian Ontology now properly implements:

  • Hub-based entity management
  • Persistent identifier strategy
  • Observation/reconstruction pattern separation
  • TimeSpan integration for fuzzy temporal boundaries
  • Complete ontology alignment with CIDOC-CRM, RiC-O, PROV-O, etc.

All schema files are now consistent with the hub architecture and ready for RDF generation; production use still depends on the validation and testing steps listed under Next Steps.