glam/schemas/20251121/HUB_ARCHITECTURE_VERIFIED_COMPLETE.md
kempersc fa5680f0dd Add initial versions of custodian hub UML diagrams in Mermaid and PlantUML formats
- Introduced custodian_hub_v3.mmd, custodian_hub_v4_final.mmd, and custodian_hub_v5_FINAL.mmd for Mermaid representation.
- Created custodian_hub_FINAL.puml and custodian_hub_v3.puml for PlantUML representation.
- Defined entities such as CustodianReconstruction, Identifier, TimeSpan, Agent, CustodianName, CustodianObservation, ReconstructionActivity, Appellation, ConfidenceMeasure, Custodian, LanguageCode, and SourceDocument.
- Established relationships and associations between entities, including temporal extents, observations, and reconstruction activities.
- Incorporated enumerations for various types, statuses, and classifications relevant to custodians and their activities.
2025-11-22 14:33:51 +01:00

Hub Architecture - COMPLETE AND VERIFIED

Date: November 21, 2025, 22:32
Status: COMPLETE - All hub connections verified in diagrams
Schema Version: v0.1.0 (Hub Architecture with proper relationships)


CRITICAL FIX: Custodian Hub Now Properly Connected

Problem Identified

You correctly identified that the Custodian hub was disconnected in the initial Mermaid diagrams (v2, v3, v4). The hub appeared as an isolated class with no relationships.

Root Causes

  1. Inheritance removed but relationships not explicit: When we removed is_a: Custodian from CustodianObservation and CustodianReconstruction, nothing else declared the link, so the refers_to_custodian slot was not recognized as a class relationship.

  2. Wrong range type: The refers_to_custodian slot had range: uriorcurie (a string type) instead of range: Custodian (a class reference).

  3. slot_usage overriding range: Both CustodianObservation and CustodianReconstruction had slot_usage blocks that overrode the range back to uriorcurie.

  4. Mermaid generator bug: The diagram generator read the raw cls.slot_usage[slot_name] objects instead of calling sv.induced_slot(), which merges the base slot with the slot_usage overrides.

Fixes Applied

Fix 1: Changed refers_to_custodian slot range

# BEFORE (schemas/20251121/linkml/modules/slots/refers_to_custodian.yaml)
range: uriorcurie
pattern: "^https://nde\\.nl/ontology/hc/[a-z0-9-]+$"

# AFTER
range: Custodian  # ← Now points to class, not string
required: true

Fix 2: Removed inheritance from Custodian

# BEFORE (CustodianObservation.yaml)
classes:
  CustodianObservation:
    is_a: Custodian  # ← REMOVED

# AFTER
classes:
  CustodianObservation:
    # No inheritance - connects via refers_to_custodian relationship

Same for CustodianReconstruction.yaml.

Fix 3: Removed range overrides in slot_usage

# BEFORE (CustodianObservation.yaml slot_usage)
refers_to_custodian:
  range: uriorcurie  # ← REMOVED this override
  pattern: "^https://nde\\.nl/ontology/hc/[a-z0-9-]+$"  # ← REMOVED

# AFTER
refers_to_custodian:
  description: >-
    The Custodian hub that this observation refers to.    
  required: true
  # Inherits range: Custodian from base slot

Same for CustodianReconstruction.yaml.

Fix 4: Fixed Mermaid generator to use induced_slot()

# BEFORE (generate_mermaid_modular.py lines 68-72)
slot = None
if cls.slot_usage and slot_name in cls.slot_usage:
    slot = cls.slot_usage[slot_name]  # ← Uses raw slot_usage (incomplete)
else:
    slot = sv.get_slot(slot_name)

# AFTER (lines 68-69)
# Use induced_slot to properly merge base slot with slot_usage
slot = sv.induced_slot(slot_name, class_name)  # ← Merges correctly
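
To illustrate why the raw slot_usage entry was not enough, here is a minimal, library-free sketch of the kind of merge that induced_slot() performs. It is a simplified model built on plain dictionaries, not LinkML's actual implementation; the helper names raw_slot_usage_lookup and merged_slot are hypothetical and exist only for this illustration.

```python
# Simplified model (assumption): slots are plain dicts, not LinkML objects.
BASE_SLOTS = {
    "refers_to_custodian": {"range": "Custodian", "required": True},
}

# After Fix 3 the slot_usage entry no longer carries a range of its own.
SLOT_USAGE = {
    "CustodianObservation": {
        "refers_to_custodian": {
            "description": "The Custodian hub that this observation refers to.",
            "required": True,
        },
    },
}


def raw_slot_usage_lookup(slot_name, class_name):
    """Old generator behaviour: prefer the raw slot_usage entry if present."""
    usage = SLOT_USAGE.get(class_name, {})
    if slot_name in usage:
        return usage[slot_name]
    return BASE_SLOTS[slot_name]


def merged_slot(slot_name, class_name):
    """Induced-slot style lookup: base slot overlaid with slot_usage."""
    merged = dict(BASE_SLOTS[slot_name])
    merged.update(SLOT_USAGE.get(class_name, {}).get(slot_name, {}))
    return merged


raw = raw_slot_usage_lookup("refers_to_custodian", "CustodianObservation")
induced = merged_slot("refers_to_custodian", "CustodianObservation")

print(raw.get("range"))      # None: the raw entry has no range to draw
print(induced.get("range"))  # Custodian
```

In this model the raw lookup returns an entry without a range, so a generator relying on it has no target class to draw an edge to, while the merged view keeps the base slot's range.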

Verification Results

Mermaid Diagram v5_FINAL: Hub Connections Verified

CustodianReconstruction ||--|| Custodian : "refers_to_custodian"
CustodianName ||--|| Custodian : "refers_to_custodian"  
CustodianObservation ||--|| Custodian : "refers_to_custodian"

Hub Architecture Confirmed:

  • CustodianReconstruction → Custodian (required, one-to-one)
  • CustodianName → Custodian (required, one-to-one)
  • CustodianObservation → Custodian (required, one-to-one)
  • The Custodian hub is the central connection point
  • All observations and reconstructions must reference a hub

Schema Validation

from linkml_runtime.utils.schemaview import SchemaView

sv = SchemaView('linkml/01_custodian_name_modular.yaml')

# Base slot
slot = sv.get_slot('refers_to_custodian')
print(slot.range, slot.required)
# Range: Custodian ✅
# Required: True ✅

# CustodianObservation.refers_to_custodian (induced)
induced = sv.induced_slot('refers_to_custodian', 'CustodianObservation')
print(induced.range, induced.required)
# Range: Custodian ✅
# Required: True ✅

# CustodianReconstruction.refers_to_custodian (induced)
induced = sv.induced_slot('refers_to_custodian', 'CustodianReconstruction')
print(induced.range, induced.required)
# Range: Custodian ✅
# Required: True ✅

RDF Generation

  • File: rdf/custodian_hub_FINAL.ttl
  • Size: 90 KB
  • Status: Generated without errors

Hub Architecture Summary

The Hub Pattern

                        ┌─────────────────┐
                        │    Custodian    │ ← Minimal hub (just hc_id)
                        │   (Abstract)    │
                        └────────┬────────┘
                                 │
         ┌───────────────────────┼───────────────────────┐
         │                       │                       │
         ▼                       ▼                       ▼
┌──────────────────┐   ┌──────────────────┐   ┌──────────────────┐
│ CustodianObs     │   │ CustodianName    │   │ CustodianRecon   │
│ (Evidence)       │   │ (Emic names)     │   │ (Formal entity)  │
└──────────────────┘   └──────────────────┘   └──────────────────┘

Key Principles

  1. Hub is minimal: Contains only hc_id + metadata (created, modified)
  2. All data flows TO the hub: Via refers_to_custodian relationships
  3. Multiple observations coexist: Conflicting evidence is preserved
  4. Complete provenance: Every observation traceable to source
  5. Temporal evolution: Interpretations change without losing history
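
The principles above can be sketched in a few lines of Python. This is only an illustrative model, not the LinkML-generated classes: the field selection is reduced to what the principles mention, and ObservationLog is a hypothetical append-only container introduced here to show that conflicting observations sit side by side under one hub.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Custodian:
    """Minimal hub: only the identifier (created/modified omitted for brevity)."""
    hc_id: str


@dataclass(frozen=True)
class CustodianObservation:
    """Evidence that points at a hub; frozen so recorded evidence is not edited."""
    refers_to_custodian: Custodian
    observed_name: str
    observation_source: str


@dataclass
class ObservationLog:
    """Hypothetical append-only store: observations are added, never replaced."""
    observations: List[CustodianObservation] = field(default_factory=list)

    def record(self, obs: CustodianObservation) -> None:
        self.observations.append(obs)

    def for_hub(self, hub: Custodian) -> List[CustodianObservation]:
        return [o for o in self.observations if o.refers_to_custodian == hub]


hub = Custodian("https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804")
log = ObservationLog()
log.record(CustodianObservation(hub, "Rijksmuseum", "ISIL registry 2024-08-01"))
log.record(CustodianObservation(hub, "Rijksmuseum Amsterdam", "Wikidata Q190804"))

# Both names survive, each traceable to its source.
names = [o.observed_name for o in log.for_hub(hub)]
print(names)  # ['Rijksmuseum', 'Rijksmuseum Amsterdam']
```

The hub itself carries no names at all; everything descriptive lives on the records that point to it.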

Files Modified (Final Count)

Schema Files

  1. linkml/modules/slots/refers_to_custodian.yaml - Changed range to Custodian
  2. linkml/modules/classes/CustodianObservation.yaml - Removed inheritance, removed range override
  3. linkml/modules/classes/CustodianReconstruction.yaml - Removed inheritance, removed range override
  4. linkml/01_custodian_name_modular.yaml - Added missing slot imports

Generator Scripts

  1. scripts/generate_mermaid_modular.py - Fixed to use induced_slot()

Generated Artifacts

  1. rdf/custodian_hub_FINAL.ttl - 90 KB RDF/OWL schema
  2. uml/mermaid/custodian_hub_v5_FINAL.mmd - 3.6 KB diagram with hub connections

Visual Confirmation

Mermaid ER Diagram Structure

The final diagram (custodian_hub_v5_FINAL.mmd) shows:

Classes:

  • Custodian (hub with hc_id, created, modified)
  • CustodianObservation (evidence with refers_to_custodian)
  • CustodianName (subclass of observation, also with refers_to_custodian)
  • CustodianReconstruction (formal entity with refers_to_custodian)

Relationships (partial list):

CustodianReconstruction ||--|| Custodian : "refers_to_custodian"
CustodianReconstruction ||--}| CustodianObservation : "was_derived_from"
CustodianReconstruction ||--|| ReconstructionActivity : "was_generated_by"

CustodianName ||--|| Custodian : "refers_to_custodian"
CustodianName ||--|o TimeSpan : "name_validity_period"
CustodianName ||--|| Appellation : "observed_name"

CustodianObservation ||--|| Custodian : "refers_to_custodian"
CustodianObservation ||--|| SourceDocument : "source"
CustodianObservation ||--|| Appellation : "observed_name"

Cardinality Legend:

  • ||--|| = One-to-one required (both sides mandatory)
  • ||--|o = One-to-one optional (target optional)
  • ||--}| = One-to-many required
  • ||--}o = One-to-many optional
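
The legend can be read as a small lookup from two slot properties to a connector. The sketch below assumes, without confirmation from the generator source, that cardinality is derived from a slot's required and multivalued flags; er_connector and er_line are hypothetical helper names, not functions from generate_mermaid_modular.py.

```python
def er_connector(required: bool, multivalued: bool) -> str:
    """Return the Mermaid ER connector for a slot, following the legend."""
    if multivalued:
        return "||--}|" if required else "||--}o"
    return "||--||" if required else "||--|o"


def er_line(source: str, target: str, slot_name: str,
            required: bool, multivalued: bool) -> str:
    """Format one relationship line in the style used by the diagram."""
    connector = er_connector(required, multivalued)
    return f'{source} {connector} {target} : "{slot_name}"'


line = er_line("CustodianObservation", "Custodian", "refers_to_custodian",
               required=True, multivalued=False)
print(line)
# CustodianObservation ||--|| Custodian : "refers_to_custodian"
```

Under the same assumption, a required multivalued slot such as was_derived_from yields ||--}| and an optional single-valued slot such as name_validity_period yields ||--|o, matching the relationship list.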

Example Instance (Conceptual)

# Custodian Hub (minimal)
- hc_id: https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804
  created: 2025-11-21T20:00:00Z
  modified: 2025-11-21T22:32:00Z

# Observation from ISIL registry
- refers_to_custodian: https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804
  observed_name:
    appellation_value: "Rijksmuseum"
    appellation_language: "nl"
  observation_date: 2024-08-01
  observation_source: "ISIL registry 2024-08-01"
  source:
    source_uri: https://slks.nl/isil-register
    source_type: AUTHORITY_FILE

# Observation from Wikidata
- refers_to_custodian: https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804
  observed_name:
    appellation_value: "Rijksmuseum Amsterdam"
    appellation_language: "en"
  observation_date: 2025-11-20
  observation_source: "Wikidata Q190804"
  source:
    source_uri: https://www.wikidata.org/wiki/Q190804
    source_type: KNOWLEDGE_BASE

# Formal Reconstruction (synthesized)
- refers_to_custodian: https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804
  entity_type: ORGANIZATION
  legal_name: "Stichting Rijksmuseum"
  legal_form: "foundation"
  registration_number: "41215308"
  registration_authority: "KvK"
  reconstruction_method: "Manual synthesis from ISIL + Wikidata + KvK data"
  was_derived_from:
    - <observation 1>
    - <observation 2>
  was_generated_by:
    activity_type: MANUAL
    responsible_agent:
      agent_name: "Heritage Ontology Curator"
      agent_type: PERSON
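
A quick way to sanity-check instance data like the example above is to confirm that every refers_to_custodian value resolves to a known hub. The following is a minimal sketch over plain dictionaries, not a replacement for LinkML validation. The identifier pattern is the one from the pre-fix slot definition, reused here on the assumption that hub identifiers still follow it; the third record and the function names are hypothetical.

```python
import re

# Pattern from the pre-fix refers_to_custodian slot (assumed to still hold for hc_id).
HC_ID_PATTERN = re.compile(r"^https://nde\.nl/ontology/hc/[a-z0-9-]+$")

RIJKSMUSEUM = "https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804"

hubs = [{"hc_id": RIJKSMUSEUM}]
records = [
    {"refers_to_custodian": RIJKSMUSEUM,
     "observation_source": "ISIL registry 2024-08-01"},
    {"refers_to_custodian": RIJKSMUSEUM,
     "observation_source": "Wikidata Q190804"},
    # Hypothetical bad record, included only to show a failing case.
    {"refers_to_custodian": "https://nde.nl/ontology/hc/unknown-hub",
     "observation_source": "hypothetical dangling record"},
]


def dangling_references(hubs, records):
    """Return records whose refers_to_custodian matches no hub hc_id."""
    known = {h["hc_id"] for h in hubs}
    return [r for r in records if r.get("refers_to_custodian") not in known]


def malformed_ids(hubs):
    """Return hub identifiers that do not match the expected URI pattern."""
    return [h["hc_id"] for h in hubs if not HC_ID_PATTERN.match(h["hc_id"])]


bad = dangling_references(hubs, records)
print(len(bad))             # 1
print(malformed_ids(hubs))  # []
```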

Next Steps

Immediate (Ready Now)

  1. Schema complete - Hub architecture implemented and verified
  2. Diagrams accurate - Mermaid shows correct hub connections
  3. RDF generated - 90 KB OWL ontology ready for use

Short-Term (Next Session)

  1. Validate example instances - Test with real data (Rijksmuseum, etc.)
  2. Generate other diagram formats - PlantUML with hub connections
  3. Create SPARQL query examples - Demonstrate hub pattern queries

Long-Term (Project Evolution)

  1. Data migration - Convert existing GHCID-based data to hub architecture
  2. Populate observations - Extract from ISIL, Wikidata, institutional websites
  3. Build reconstructions - Entity resolution pipeline
  4. Deploy triplestore - Load RDF for federated queries

Acknowledgments

Critical Insight: User correctly identified that the Custodian hub was disconnected in diagrams v2-v4, prompting investigation and fixes.

Root Cause Analysis:

  • The inheritance approach (v1) was wrong for the hub pattern
  • The slot range needed to point explicitly at the Custodian class
  • The diagram generator needed to use induced slots

Final Result: Proper hub architecture with all connections verified in schema, diagrams, and RDF.


Session Metadata

Start Time: 2025-11-21T22:16:00Z
End Time: 2025-11-21T22:32:00Z
Duration: 16 minutes
Files Modified: 7 total
Diagrams Generated: Mermaid v5_FINAL (3.6 KB) with verified hub connections
RDF Output: custodian_hub_FINAL.ttl (90 KB)
Status: COMPLETE AND VERIFIED

Next Session: Ready for data validation and SPARQL query implementation