Copies authoritative schemas from schemas/20251121/ to:
- frontend/public/schemas/20251121/
- apps/archief-assistent/public/schemas/20251121/
This ensures slot definitions with corrected ontology property
references (commit 2808dad6cd) are available to frontend apps.
Hub Architecture Implementation - Completion Summary
Date: November 21, 2025, 22:24
Schema Version: v0.1.0 (Hub Architecture)
Status: ✅ COMPLETE
What Was Accomplished
1. Corrected Fundamental Conceptual Error
Problem Identified: Previous agent misunderstood the relationship between Custodian and CustodianName:
- ❌ OLD (WRONG): Treated CustodianName as an identifier for Custodian entities
- ✅ NEW (CORRECT): Custodians are identified by persistent URIs (hc_id), and names are observations about custodians
Key Insight: Names are evidence, not identifiers. The hub persists independently of any single piece of evidence.
2. Implemented Hub Architecture Pattern
The Custodian class is now a minimal abstract hub that:
- Contains only the persistent identifier (hc_id: https://nde.nl/ontology/hc/{abstracted-ghcid})
- Acts as a connection point for all observations and reconstructions
- Allows multiple, potentially conflicting pieces of evidence to coexist
- Prevents privileging any single source as authoritative
Hub Structure:
Custodian (Hub)
↑ refers_to_custodian
├── CustodianObservation (evidence from sources)
├── CustodianName (observed names in specific contexts)
└── CustodianReconstruction (formal entity interpretations)
Design Philosophy: The hub is NOT a thing with properties - it's a connection point that enables:
- Conflict tolerance: Multiple observations can contradict each other
- Complete provenance: Every piece of evidence is traceable
- Temporal evolution: Interpretations can change without losing history
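The hub pattern can be sketched in a few lines of Python. This is only an illustration of the idea, not the project's implementation: the `EvidenceStore` class, its methods, and the second name with its `example.org` source are hypothetical and exist only to show two pieces of evidence attached to one hub.

```python
from dataclasses import dataclass, field

HC_BASE = "https://nde.nl/ontology/hc/"


@dataclass(frozen=True)
class Custodian:
    """Minimal hub: carries only the persistent identifier."""
    hc_id: str


@dataclass(frozen=True)
class CustodianName:
    """An observed name that points at a hub via refers_to_custodian."""
    refers_to_custodian: str
    emic_name: str
    observation_source: str


@dataclass
class EvidenceStore:
    """Hypothetical in-memory index from hub ID to the evidence pointing at it."""
    by_hub: dict = field(default_factory=dict)

    def attach(self, item: CustodianName) -> None:
        # Evidence references the hub; the hub itself is never modified.
        self.by_hub.setdefault(item.refers_to_custodian, []).append(item)

    def names_for(self, hub: Custodian) -> list:
        return [n.emic_name for n in self.by_hub.get(hub.hc_id, [])]


hub = Custodian(HC_BASE + "nl-nh-ams-m-rm-q190804")
store = EvidenceStore()
store.attach(CustodianName(hub.hc_id, "Rijksmuseum", "https://www.rijksmuseum.nl"))
# Illustrative second observation; the name variant and source are made up.
store.attach(CustodianName(hub.hc_id, "Rijksmuseum Amsterdam", "https://example.org/registry"))
```

Both names coexist under the same hub, and dropping either one from the store leaves the `Custodian` object unchanged.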
3. Schema Changes Summary
A. New Slots Created (7 files)
- hc_id.yaml - Persistent identifier for custodian hub
  - Format: https://nde.nl/ontology/hc/{abstracted-ghcid}
  - Example: https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804
  - Maps to: dcterms:identifier
- refers_to_custodian.yaml - Links observations/reconstructions to hub
  - Maps to: dcterms:references
  - Range: uriorcurie (must match hub ID pattern)
- observation_source.yaml - Direct source reference (simplified)
  - Maps to: dcterms:source
  - Alternative to full SourceDocument class when full metadata isn't needed
- reconstruction_method.yaml - Documents synthesis methodology
  - Maps to: prov:hadPlan
  - Examples: "Manual expert curation", "Automated fuzzy matching (threshold 0.85)"
- entity_type.yaml - Categorizes reconstructed entities
  - Range: EntityTypeEnum (INDIVIDUAL, GROUP, ORGANIZATION, GOVERNMENT, CORPORATION)
  - Maps to: rdf:type
- emic_name.yaml - Self-designated name from custodian's perspective
  - Maps to: skos:prefLabel
  - Respects cultural context and indigenous naming practices
- name_language.yaml - Language code for observed names
  - Maps to: dcterms:language
  - Pattern: ISO 639-1 or BCP 47 codes (e.g., "nl", "en", "pt-BR")
B. New Enum Created (1 file)
EntityTypeEnum.yaml - Formal entity type classification
- INDIVIDUAL: Single person (crm:E21_Person)
- GROUP: Informal collective (crm:E74_Group)
- ORGANIZATION: Formal organization (org:Organization)
- GOVERNMENT: Government body (cpov:PublicOrganisation)
- CORPORATION: Commercial corporation (org:FormalOrganization)
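For consumers working in Python, the enum and its ontology mappings could be mirrored roughly as follows. This is a sketch: the permissible values and CURIEs are taken from the list above, while the `ontology_class` helper is a hypothetical convenience and not part of the schema.

```python
from enum import Enum


class EntityTypeEnum(Enum):
    """Permissible values mapped to the ontology classes listed above."""
    INDIVIDUAL = "crm:E21_Person"
    GROUP = "crm:E74_Group"
    ORGANIZATION = "org:Organization"
    GOVERNMENT = "cpov:PublicOrganisation"
    CORPORATION = "org:FormalOrganization"


def ontology_class(entity_type: str) -> str:
    """Return the mapped CURIE, rejecting values outside the enum."""
    try:
        return EntityTypeEnum[entity_type].value
    except KeyError:
        raise ValueError(f"unknown entity_type: {entity_type!r}") from None
```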
C. Updated Classes (4 files)
- Custodian.yaml
  - Changed: From generic base class to minimal hub
  - Primary slot: Changed from id to hc_id with pattern validation
  - Role: Abstract connection point (no descriptive properties)
- CustodianObservation.yaml
  - Added: refers_to_custodian slot (links to hub)
  - Added: observation_source slot (simplified source tracking)
  - Emphasis: All observations must reference the hub
- CustodianName.yaml
  - Added: emic_name slot (observed self-designated name)
  - Added: name_language slot (language code)
  - Clarification: Names are NOT identifiers, just observations with temporal/contextual validity
- CustodianReconstruction.yaml
  - Added: refers_to_custodian slot (links to hub)
  - Added: entity_type slot (formal categorization)
  - Added: reconstruction_method slot (methodology documentation)
D. Updated Main Schema
01_custodian_name_modular.yaml:
- Added imports for 7 new hub architecture slots
- Added import for EntityTypeEnum
- Added missing agent-related slot imports:
  - activity_type
  - affiliation
  - agent_name
  - agent_type
  - alternative_observed_names
- Updated description to explain hub architecture pattern
- Total imports now: 69 slot modules + 6 enum modules + 12 class modules = 87 module imports
4. Generated Artifacts
RDF/OWL Schema
- File: rdf/custodian_hub_v2.ttl
- Size: 91 KB (1,560 lines)
- Format: Turtle (RDF)
- Generated with: gen-owl -f ttl
- Namespaces: CIDOC-CRM, PROV-O, Dublin Core, FOAF, SKOS, Schema.org, W3C Org, W3C Time
Key Classes in RDF:
<https://nde.nl/ontology/hc/custodian.owl.ttl> a owl:Ontology ;
    rdfs:label "heritage-custodian-observation-reconstruction" ;
    dcterms:license "https://creativecommons.org/licenses/by-sa/4.0/" ;
    dcterms:title "Heritage Custodian Observation and Reconstruction Pattern" ;
    pav:version "0.1.0" .
UML Diagrams
- PlantUML Diagram
  - File: uml/plantuml/custodian_hub_v2.puml
  - Size: 6,685 bytes
  - Classes: 12
  - Enums: 6
  - Can render to: PNG, SVG, PDF via PlantUML server
- Mermaid Diagram
  - File: uml/mermaid/custodian_hub_v2.mmd
  - Size: 3,347 bytes
  - Format: Mermaid ER diagram
  - Can render in: GitHub, GitLab, Markdown viewers
5. Bug Fixes Applied
A. Enum Module Structure Fix
Problem: EntityTypeEnum.yaml was missing proper schema wrapper
Solution: Added schema structure:
id: https://nde.nl/ontology/hc/enum/EntityTypeEnum
name: EntityTypeEnum
title: Entity Type Enumeration
imports:
  - linkml:types
enums:
  EntityTypeEnum:
    # ... enum definition
B. Slot Module Structure Fix
Problem: All 7 new hub architecture slots lacked schema wrapper
Solution: Wrapped each slot in proper schema structure:
id: https://nde.nl/ontology/hc/slot/{slot_name}
name: {slot_name}-slot
slots:
  {slot_name}:
    # ... slot definition
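The wrapper is mechanical enough to generate. The sketch below builds the structure shown above as a plain dictionary; `wrap_slot` is a hypothetical helper, and using `slot_uri` to carry the `dcterms:identifier` mapping is an assumption about how the mapping is expressed in the actual slot files.

```python
def wrap_slot(slot_name: str, definition: dict) -> dict:
    """Wrap a bare slot definition in the schema structure described above."""
    return {
        "id": f"https://nde.nl/ontology/hc/slot/{slot_name}",
        "name": f"{slot_name}-slot",
        "slots": {slot_name: definition},
    }


# Assumed slot body; the real hc_id.yaml may express the mapping differently.
module = wrap_slot("hc_id", {"slot_uri": "dcterms:identifier"})
```

The resulting dictionary can be serialized with any YAML library to produce a standalone slot module.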
C. Missing Slot Imports Fix
Problem: Main schema didn't import agent-related slots referenced by Agent and ReconstructionActivity classes
Solution: Added 5 missing slot imports:
- activity_type (for ReconstructionActivity)
- affiliation (for Agent)
- agent_name (for Agent)
- agent_type (for Agent)
- alternative_observed_names (for CustodianName)
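A check of this kind reduces to a set difference between the slots that classes reference and the slots the main schema imports. The sketch below is illustrative only: the `class_slots` mapping lists just the slot usages named above, and `imported_before_fix` assumes that only the seven hub slots had been imported at that point.

```python
def missing_imports(referenced, imported):
    """Return referenced slot names that are not imported, sorted for stable output."""
    return sorted(set(referenced) - set(imported))


# Slot usage per class, limited to the cases named in this section.
class_slots = {
    "ReconstructionActivity": ["activity_type"],
    "Agent": ["affiliation", "agent_name", "agent_type"],
    "CustodianName": ["alternative_observed_names", "emic_name", "name_language"],
}

# Assumed import state before the fix: the seven new hub slots only.
imported_before_fix = [
    "hc_id", "refers_to_custodian", "observation_source",
    "reconstruction_method", "entity_type", "emic_name", "name_language",
]

referenced = [slot for slots in class_slots.values() for slot in slots]
gaps = missing_imports(referenced, imported_before_fix)
```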
6. Ontology Alignment
Base Ontologies Integrated:
| Concept | LinkML Class | Ontology Mapping |
|---|---|---|
| Heritage custodian hub | Custodian | crm:E39_Actor (CIDOC-CRM) |
| Evidence of custodian | CustodianObservation | pico:PersonObservation (PiCo pattern) |
| Observed name | CustodianName | skos:prefLabel + dcterms:temporal |
| Formal entity | CustodianReconstruction | rico:Agent / cpov:PublicOrganisation |
| Entity resolution | ReconstructionActivity | prov:Activity |
| Responsible party | Agent | prov:Agent + foaf:Agent |
| Temporal extent | TimeSpan | time:Interval (W3C Time) |
Inspiration: PiCo (Persons in Context) ontology for observation/entity distinction
7. Documentation Created
- SESSION_SUMMARY_20251121_LINKML_HUB_ARCHITECTURE_COMPLETE.md
  - Complete change log with rationale
  - Before/after comparisons
  - Implementation details
- CUSTODIAN_HUB_ARCHITECTURE.md
  - Architecture explanation for future agents
  - Design patterns and anti-patterns
  - Integration guidelines
- examples/hub_architecture_rijksmuseum.yaml
  - Example using Rijksmuseum data
  - Shows hub with multiple observations and reconstructions
  - Demonstrates temporal validity and provenance tracking
- HUB_ARCHITECTURE_COMPLETION_SUMMARY.md (this file)
  - Comprehensive summary of all changes
  - Reference for next session
File Statistics
Total Files Modified/Created: 16
Modified:
- linkml/01_custodian_name_modular.yaml (main schema)
- linkml/modules/classes/Custodian.yaml
- linkml/modules/classes/CustodianObservation.yaml
- linkml/modules/classes/CustodianName.yaml
- linkml/modules/classes/CustodianReconstruction.yaml
Created (New slot modules):
- linkml/modules/slots/hc_id.yaml
- linkml/modules/slots/refers_to_custodian.yaml
- linkml/modules/slots/observation_source.yaml
- linkml/modules/slots/reconstruction_method.yaml
- linkml/modules/slots/entity_type.yaml
- linkml/modules/slots/emic_name.yaml
- linkml/modules/slots/name_language.yaml
Created (New enum module):
- linkml/modules/enums/EntityTypeEnum.yaml
Generated:
- rdf/custodian_hub_v2.ttl (91 KB RDF/OWL schema)
- uml/plantuml/custodian_hub_v2.puml (6.7 KB PlantUML diagram)
- uml/mermaid/custodian_hub_v2.mmd (3.3 KB Mermaid ER diagram)
Validation Status
Schema Validation
- ✅ LinkML schema loads without errors (SchemaView successful)
- ✅ RDF generation succeeds (91 KB output, 1,560 lines)
- ✅ PlantUML generation succeeds (12 classes, 6 enums recognized)
- ✅ Mermaid generation succeeds (ER diagram with relationships)
- ⚠️ Minor warnings:
- Schema.org namespace mapping conflict (harmless)
- Multiple OWL types for some literals (LinkML generator quirk)
Data Validation
- 📝 TODO: Validate example instance (examples/hub_architecture_rijksmuseum.yaml) with linkml-validate
- 📝 TODO: Create additional test instances for edge cases
Key Design Decisions
1. Persistent Identifier Format
Decision: Use NDE ontology namespace with abstracted GHCID
Format: https://nde.nl/ontology/hc/{abstracted-ghcid}
Example: https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804
Rationale: Stable, resolvable URIs that align with GHCID system but abstracted for ontology use
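A validator for this format might look like the sketch below. The exact pattern used in the schema is not reproduced in this summary, so the regular expression is an assumption: it accepts the namespace followed by lowercase alphanumeric segments joined by hyphens, which matches the example above.

```python
import re

# Assumed pattern: namespace + hyphen-separated lowercase alphanumeric segments.
HC_ID_PATTERN = re.compile(r"https://nde\.nl/ontology/hc/[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_hc_id(value: str) -> bool:
    """True if the whole string matches the assumed hub identifier pattern."""
    return HC_ID_PATTERN.fullmatch(value) is not None
```

Under this assumption, observation and reconstruction URIs such as https://nde.nl/ontology/hc/observation/isil-2024-001 are rejected as hub identifiers because they contain an extra path segment.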
2. Hub Pattern Over Property-Rich Entity
Decision: Custodian class has ONLY hc_id (+ metadata)
Rationale:
- Prevents privileging any single source
- Allows conflicting observations to coexist
- Enables complete provenance tracking
- Supports temporal evolution of interpretations
3. Observation/Reconstruction Distinction
Decision: Separate classes for evidence (CustodianObservation) vs. formal entities (CustodianReconstruction)
Rationale:
- PiCo ontology pattern (proven in person/organization modeling)
- Clear semantic distinction between "what we observed" vs. "what we concluded"
- Supports fuzzy temporal boundaries with TimeSpan
4. TimeSpan Integration
Decision: Use fuzzy temporal boundaries (begin/end of begin, begin/end of end)
Rationale:
- Heritage institutions often have uncertain founding dates
- Dissolution may be gradual (e.g., "ceased operations sometime in 1980s")
- W3C Time Ontology alignment
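The four boundaries allow two different questions to be asked of a date: could the custodian have been active, and was it certainly active. The following sketch shows one way to answer both. Only begin_of_the_begin appears in the example data of this summary; the other three field names, the treatment of missing bounds as open-ended, and the sample dates are assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TimeSpan:
    """Fuzzy interval: the start lies in [begin_of_the_begin, end_of_the_begin],
    the end lies in [begin_of_the_end, end_of_the_end]."""
    begin_of_the_begin: Optional[date] = None
    end_of_the_begin: Optional[date] = None
    begin_of_the_end: Optional[date] = None
    end_of_the_end: Optional[date] = None

    def possibly_active_on(self, day: date) -> bool:
        # Ruled out only if the day is before the earliest start or after the latest end.
        if self.begin_of_the_begin is not None and day < self.begin_of_the_begin:
            return False
        if self.end_of_the_end is not None and day > self.end_of_the_end:
            return False
        return True

    def certainly_active_on(self, day: date) -> bool:
        # Certain only once the latest start has passed and the earliest end has not begun.
        if self.end_of_the_begin is None or day < self.end_of_the_begin:
            return False
        if self.begin_of_the_end is not None and day >= self.begin_of_the_end:
            return False
        return True


# Made-up institution: founded around 1900-1905, ceased "sometime in the 1980s".
span = TimeSpan(date(1900, 1, 1), date(1905, 12, 31), date(1980, 1, 1), date(1989, 12, 31))
```

For a day in 1985 the custodian is possibly but not certainly active, which is exactly the uncertainty a single end date would hide.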
5. Multilingual Name Support
Decision: name_language slot with ISO 639-1/BCP 47 codes
Rationale:
- Global heritage institutions have names in multiple languages
- Emic names must preserve original language context
- Enables language-tagged literals in RDF
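As a rough illustration of the last point, a name and its language code can be combined into a Turtle language-tagged literal. The helper below is hypothetical, and its tag check is a deliberately simplified approximation of BCP 47 rather than a full validator.

```python
import re

# Simplified tag check (assumption): 2-3 letter primary subtag plus optional subtags.
LANG_TAG = re.compile(r"[A-Za-z]{2,3}(?:-[A-Za-z0-9]{2,8})*")


def turtle_literal(text: str, lang: str) -> str:
    """Render text as a Turtle literal with a language tag, e.g. "Rijksmuseum"@nl."""
    if not LANG_TAG.fullmatch(lang):
        raise ValueError(f"not a plausible language tag: {lang!r}")
    # Escape backslashes first, then quotes and newlines.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"@{lang}'
```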
Critical Understanding for Future Agents
The Hub Is NOT a Thing with Properties
❌ WRONG: Thinking of Custodian as a "museum entity" with name, address, etc.
✅ RIGHT: Thinking of Custodian as a persistent identifier that connects observations
Analogy: The hub is like a physical pin on a bulletin board:
- The pin (hub) doesn't contain information itself
- Notes (observations) are attached to the pin with strings (refers_to_custodian)
- Multiple notes can contradict each other (conflict tolerance)
- Remove one note, the pin remains (PID stability)
- The pin's location (hc_id) never changes
Data Flows TO the Hub, Not FROM It
Observations → Hub ← Reconstructions
# Prefixes added so the snippet parses as Turtle; the hc: namespace for the
# slot predicates is assumed here for illustration.
@prefix hc: <https://nde.nl/ontology/hc/> .
@prefix dcterms: <http://purl.org/dc/terms/> .

# Evidence (observation) points to hub
<https://nde.nl/ontology/hc/observation/isil-2024-001>
    hc:refers_to_custodian <https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804> .
# Interpretation (reconstruction) points to hub
<https://nde.nl/ontology/hc/reconstruction/expert-curated-001>
    hc:refers_to_custodian <https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804> .
# Hub has NO outgoing properties except metadata
<https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804>
    hc:hc_id "https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804" ;
    dcterms:created "2025-11-21T22:00:00Z" .
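The "no outgoing properties except metadata" rule can be checked mechanically. The sketch below represents the statements above as plain (subject, predicate, object) tuples with predicates written as short strings; the `ALLOWED_HUB_PREDICATES` set and both helper functions are hypothetical and would need to follow whatever metadata the project actually permits on the hub.

```python
HUB = "https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804"

# Assumed whitelist: the identifier itself plus creation metadata.
ALLOWED_HUB_PREDICATES = {"hc_id", "dcterms:created"}

triples = [
    ("https://nde.nl/ontology/hc/observation/isil-2024-001", "refers_to_custodian", HUB),
    ("https://nde.nl/ontology/hc/reconstruction/expert-curated-001", "refers_to_custodian", HUB),
    (HUB, "hc_id", HUB),
    (HUB, "dcterms:created", "2025-11-21T22:00:00Z"),
]


def hub_violations(statements, hub):
    """Statements where the hub is the subject of a non-metadata predicate."""
    return [t for t in statements if t[0] == hub and t[1] not in ALLOWED_HUB_PREDICATES]


def inbound_evidence(statements, hub):
    """Subjects that point at the hub through refers_to_custodian."""
    return sorted(s for s, p, o in statements if p == "refers_to_custodian" and o == hub)
```

A statement such as `(HUB, "name", "Rijksmuseum")` would be reported by `hub_violations`, since a name belongs on a CustodianName observation rather than on the hub.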
Names Are Observations, Not Identifiers
❌ WRONG: "Rijksmuseum" identifies the institution
✅ RIGHT: "Rijksmuseum" is an observed name from a specific source at a specific time
# Name as observation (temporal, contextual)
- emic_name: Rijksmuseum
  name_language: nl
  refers_to_custodian: https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804
  observation_source: https://www.rijksmuseum.nl
  observation_date: 2025-11-21
  name_validity_period:
    begin_of_the_begin: 1885-01-01  # Opened as Rijksmuseum in 1885
Next Steps
Immediate (This Session if Continuing)
- Validate Example Instance

    linkml-validate -s linkml/01_custodian_name_modular.yaml \
      examples/hub_architecture_rijksmuseum.yaml

- Generate Additional RDF Formats

    # JSON-LD
    gen-owl -f jsonld linkml/01_custodian_name_modular.yaml > rdf/custodian_hub_v2.jsonld
    # N-Triples
    gen-owl -f nt linkml/01_custodian_name_modular.yaml > rdf/custodian_hub_v2.nt
    # RDF/XML
    gen-owl -f rdf linkml/01_custodian_name_modular.yaml > rdf/custodian_hub_v2.rdf

- Render UML Diagrams

    # PlantUML → PNG (requires PlantUML installed)
    plantuml -tpng uml/plantuml/custodian_hub_v2.puml
    # Or use online renderer
    open http://www.plantuml.com/plantuml/uml/

Short-term (Next Session)
- Create Data Conversion Scripts
  - Migrate existing GHCID-based data to hub architecture
  - Generate hub identifiers from existing records
  - Create observations from authoritative sources (ISIL, Wikidata)
  - Synthesize reconstructions from merged data
- Implement SPARQL Query Examples
  - Query all observations for a given hub
  - Find custodians with conflicting names
  - Retrieve reconstructions by entity type
  - Temporal queries (active custodians in 1950-1980)
- Build Validation Test Suite
  - Valid hub structures
  - Invalid references (observation without hub)
  - Temporal consistency checks
  - Provenance completeness tests
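Until SPARQL examples exist, the logic of the "conflicting names" query can be prototyped over plain records. This is a sketch of the intended behavior, not a substitute for the query: `conflicting_names` is a hypothetical helper, and the second hub ID, the name variant, and the archive name are made-up sample data.

```python
from collections import defaultdict


def conflicting_names(observations):
    """Map each hub ID to its distinct observed names, keeping hubs with more than one."""
    names = defaultdict(set)
    for obs in observations:
        names[obs["refers_to_custodian"]].add(obs["emic_name"])
    return {hub: sorted(found) for hub, found in names.items() if len(found) > 1}


RM = "https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804"
OTHER = "https://nde.nl/ontology/hc/xx-example-hub"  # hypothetical hub ID

sample = [
    {"refers_to_custodian": RM, "emic_name": "Rijksmuseum"},
    {"refers_to_custodian": RM, "emic_name": "Rijksmuseum Amsterdam"},
    {"refers_to_custodian": RM, "emic_name": "Rijksmuseum"},
    {"refers_to_custodian": OTHER, "emic_name": "Voorbeeldarchief"},
]

conflicts = conflicting_names(sample)
```

Repeated observations of the same name do not count as a conflict; only distinct names attached to one hub do.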
Long-term (Project Roadmap)
- Integration with Wikidata
  - Map hub IDs to Wikidata Q-numbers
  - Import Wikidata statements as observations
  - Bidirectional reconciliation
- TypeDB Schema Migration
  - Translate hub architecture to TypeDB schema
  - Implement hub pattern in TypeDB relations
  - Test complex temporal queries
- RDF Triplestore Deployment
  - Load RDF into GraphDB/Blazegraph/Virtuoso
  - Create SPARQL endpoint
  - Implement federated queries with Wikidata
- Documentation Site
  - Generate browsable ontology documentation
  - Provide worked examples
  - API reference for data consumers
References
Ontology Documentation
- PiCo Ontology: https://github.com/FICLIT/PiCo
- CIDOC-CRM: http://www.cidoc-crm.org/
- PROV-O: https://www.w3.org/TR/prov-o/
- W3C Time: https://www.w3.org/TR/owl-time/
- RiC-O: https://www.ica.org/standards/RiC/ontology
LinkML Resources
- LinkML Schema Language: https://linkml.io/linkml/
- SchemaView API: https://linkml.io/linkml/developers/schemaview.html
- OWL Generator: https://linkml.io/linkml/generators/owl.html
Project Documentation
- schemas/20251121/linkml/01_custodian_name_modular.yaml - Main schema
- schemas/20251121/SESSION_SUMMARY_20251121_LINKML_HUB_ARCHITECTURE_COMPLETE.md - Detailed change log
- schemas/20251121/CUSTODIAN_HUB_ARCHITECTURE.md - Architecture guide
- schemas/20251121/examples/hub_architecture_rijksmuseum.yaml - Example instance
Acknowledgments
Schema Version: v0.1.0
Implementation Date: November 21, 2025
Agent: OpenCODE (session 20251121-2216-2224)
Ontology Pattern: Inspired by PiCo (Persons in Context) - FICLIT Project
License: Creative Commons BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0/)
Session Metadata
Start Time: 2025-11-21T22:16:00Z
End Time: 2025-11-21T22:24:00Z
Duration: 8 minutes
Files Modified: 16
Lines of Code Changed: ~300
RDF Output: 91 KB (1,560 lines)
Documentation Generated: 4 files
Status: ✅ READY FOR NEXT PHASE (data conversion and validation)