# CustodianAppellation Relationship Refactoring - Phase 2

**Date**: 2025-11-22  
**Session**: OpenCode AI Agent (Phase 2)  
**Status**: ✅ COMPLETE

---

## Executive Summary

Refactored the Heritage Custodian Ontology to correctly model the relationship between `CustodianAppellation` and `CustodianName`, aligning with W3C SKOS best practices and ensuring `CustodianIdentifier` is the only class that directly identifies the `Custodian` hub.

**This is Phase 2** of the Appellation/Identifier refactoring:
- **Phase 1** (Nov 22, morning): Connected orphaned classes to Custodian hub ✅
- **Phase 2** (Nov 22, afternoon): Moved appellations from Custodian to CustodianName ✅

---

## Problem Statement

### Phase 1 Architecture (Correct but Improvable)

After Phase 1 refactoring:

```
Custodian --[crm:P1_is_identified_by]--> CustodianAppellation
Custodian --[crm:P48_has_preferred_identifier]--> CustodianIdentifier
```

**Issues Discovered**:
1. **Semantic confusion**: `crm:P1_is_identified_by` suggests appellations *identify* the Custodian hub
2. **Inconsistent hub design**: Both Identifier and Appellation claimed to identify the hub
3. **Missing name aspect**: Appellations should be variants of the canonical name, not hub identifiers
4. **Ontology misalignment**: W3C Org Ontology uses `skos:altLabel` for alternative names, not `crm:P1`

---

## Solution

### Phase 2 Architecture (Correct and Aligned with SKOS)

```
Custodian (hub)
  └─ skos:prefLabel ──> CustodianName (canonical emic name)
       └─ skos:altLabel ──> CustodianAppellation (name variants)

Custodian (hub)
  └─ crm:P48_has_preferred_identifier ──> CustodianIdentifier (external IDs)
```

**Key Changes from Phase 1**:
1. **CustodianAppellation connects to CustodianName** (not Custodian)
2. **Uses `skos:altLabel`** (standard property for alternative lexical labels)
3. **Only CustodianIdentifier identifies the hub** (maintains clean hub architecture)
4. **Inverse relationship**: `CustodianAppellation.variant_of_name` → `CustodianName` (using `skos:broader`)

---

## Files Modified

### 1. Schema Module Files

#### `/schemas/20251121/linkml/modules/classes/Appellation.yaml` ✅

**Changes**:
- Updated description: "alternative name variants for CustodianName" (was "custodian")
- Changed slot from `identifies_custodian` → `variant_of_name`
- Changed slot_uri from `crm:P1i_identifies` → `skos:broader`
- Range changed from `Custodian` → `CustodianName`
- Added import for `CustodianName` class
- Updated documentation with SKOS altLabel rationale

#### `/schemas/20251121/linkml/modules/classes/Custodian.yaml` ✅

**Changes**:
- **Removed** `appellations` slot from slots list (line 99)
- **Removed** `appellations` slot_usage block (lines 169-178)
- Updated documentation: "Alternative names (in CustodianName.alternative_names list)"

#### `/schemas/20251121/linkml/modules/classes/CustodianName.yaml` ✅

**Changes**:
- **Added** `alternative_names` slot to slots list
- **Added** `alternative_names` slot_usage:
  - `slot_uri: skos:altLabel`
  - `range: CustodianAppellation`
  - `multivalued: true`
  - Examples: "BnF", "Rijks", translations, historical variants
- Added related mappings: `foaf:nick`, `gleif:hasOtherName`

### 2. New Slot Files Created

#### `/schemas/20251121/linkml/modules/slots/alternative_names.yaml` ✅

**Purpose**: CustodianName → CustodianAppellation relationship
- `slot_uri: skos:altLabel`
- `range: CustodianAppellation`
- `multivalued: true`
- Domain: CustodianName (SKOS Concept)

#### `/schemas/20251121/linkml/modules/slots/variant_of_name.yaml` ✅

**Purpose**: CustodianAppellation → CustodianName inverse relationship
- `slot_uri: skos:broader`
- `range: CustodianName`
- Domain: E41_Appellation
- Serves as the inverse of `skos:altLabel` in this schema

### 3. Main Schema File

#### `/schemas/20251121/linkml/01_custodian_name_modular.yaml` ✅

**Changes**:
- **Removed import**: `modules/slots/appellations`
- **Added imports**:
  - `modules/slots/alternative_names`
  - `modules/slots/variant_of_name`
- Updated change log:
  - "New slots (3): alternative_names (CustodianName → CustodianAppellation), variant_of_name (inverse), identifies_custodian (Identifier → Custodian)"
  - "Architecture change: CustodianAppellation now connects to CustodianName (not Custodian) using skos:altLabel"

### 4. Deprecated Files

#### `/schemas/20251121/linkml/modules/slots/appellations.yaml` ⚠️ DEPRECATED

**Status**: Marked as deprecated with clear migration path
- Added deprecation notice explaining why it was replaced
- Documents old architecture vs. new architecture
- Points to replacement files (`alternative_names.yaml`, `variant_of_name.yaml`)
- Kept for historical reference only

---

## Ontology Alignment

### SKOS (Simple Knowledge Organization System)

**Primary Property**: `skos:altLabel`
- **Definition**: "An alternative lexical label for a resource"
- **Use case**: "Trading names, colloquial names, abbreviations, acronyms"
- **Source**: W3C Org Ontology (org:alternativeName → skos:altLabel)

**Inverse Property**: `skos:broader`
- Links alternative label back to its preferred concept
- Standard SKOS hierarchical relationship

### Related Ontology Properties

- **W3C Org Ontology**: `org:alternativeName` (subproperty of `skos:altLabel`)
- **GLEIF**: `gleif:hasOtherName` (subproperty of `skos:altLabel`)
- **FOAF**: `foaf:nick` (for nicknames)
- **Schema.org**: `schema:alternateName` (close match)

---

## Generated Outputs

### RDF Formats (Timestamp: `20251122_181217` - `20251122_181224`)

```bash
schemas/20251121/rdf/
├── 01_custodian_name_modular_20251122_181217.owl.ttl (160 KB - OWL/Turtle)
├── 01_custodian_name_modular_20251122_181224.nt (458 KB - N-Triples)
├── 01_custodian_name_modular_20251122_181224.jsonld (382 KB - JSON-LD)
└── 01_custodian_name_modular_20251122_181224.rdf (330 KB - RDF/XML)
```

**Validation**: ✅ All files generated successfully with `gen-owl` and `rdfpipe`

### UML Diagrams (Timestamp: `20251122_181237`)

```bash
schemas/20251121/uml/mermaid/
└── 01_custodian_name_modular_20251122_181237_er.mmd (176 lines - ER diagram)
```

**Key Relationships Verified**:

```mermaid
CustodianName ||--}o CustodianAppellation : "alternative_names"
CustodianAppellation ||--|o CustodianName : "variant_of_name"
```

---

## Architecture Evolution

### Phase 1 → Phase 2 Comparison

| Aspect | Phase 1 | Phase 2 |
|--------|---------|---------|
| **Appellation connects to** | Custodian (hub) | CustodianName (aspect) |
| **Property used** | `crm:P1_is_identified_by` | `skos:altLabel` |
| **Inverse property** | `crm:P1i_identifies` | `skos:broader` |
| **Semantic meaning** | "Appellation identifies hub" | "Appellation is variant of name" |
| **Ontology alignment** | CIDOC-CRM | W3C SKOS + CIDOC-CRM |
| **Slot name** | `appellations` | `alternative_names` |

---

## Examples

### Phase 1 (Before)

```yaml
# Phase 1 architecture - Connected but semantically confused
Custodian:
  id: https://nde.nl/ontology/hc/cust/bnf
  appellations:  # ⚠️ Direct connection to hub (problematic)
    - appellation_value: "BnF"
      appellation_type: ABBREVIATION
      identifies_custodian: https://nde.nl/ontology/hc/cust/bnf  # Suggests it identifies hub
```

### Phase 2 (After)

```yaml
# Phase 2 architecture - CORRECT!
Custodian:
  id: https://nde.nl/ontology/hc/cust/bnf
  preferred_label:
    refers_to_custodian: https://nde.nl/ontology/hc/cust/bnf
    emic_name: "Bibliothèque nationale de France"
    alternative_names:  # ✅ Connection through CustodianName
      - appellation_value: "BnF"
        appellation_type: ABBREVIATION
        variant_of_name:  # reference back to the enclosing CustodianName
      - appellation_value: "National Library of France"
        appellation_language: "en"
        appellation_type: TRANSLATION
```

---

## RDF Serialization

### Turtle (TTL) - Phase 2

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix crm:  <http://www.cidoc-crm.org/cidoc-crm/> .

# <CUSTODIAN_NAME_IRI> and <APPELLATION_IRI> are placeholders for the name and appellation node IRIs.
<https://nde.nl/ontology/hc/cust/bnf> skos:prefLabel <CUSTODIAN_NAME_IRI> .

<CUSTODIAN_NAME_IRI> a crm:E33_Linguistic_Object ;
    rdf:value "Bibliothèque nationale de France" ;
    skos:altLabel <APPELLATION_IRI> .

<APPELLATION_IRI> a crm:E41_Appellation ;
    rdf:value "BnF" ;
    skos:broader <CUSTODIAN_NAME_IRI> .
```

---

## Validation Results

### LinkML Schema Validation ✅

```bash
$ gen-owl -f ttl 01_custodian_name_modular.yaml
# Output: 160 KB OWL/Turtle file with no errors
```

**Warnings** (non-critical):
- ⚠️ Multiple owl types for `language` (rdfs:Literal vs owl:Thing) - expected for ambiguous ranges
- ⚠️ Schema namespace override (schema.org vs schema:) - cosmetic, doesn't affect semantics

### ER Diagram Validation ✅

**Relationships Confirmed**:
1. ✅ `Custodian ||--|o CustodianName : "preferred_label"` (hub → name)
2. ✅ `CustodianName ||--}o CustodianAppellation : "alternative_names"` (name → variants, one-to-many)
3. ✅ `CustodianAppellation ||--|o CustodianName : "variant_of_name"` (variant → name, inverse)
4. ✅ `Custodian ||--}o CustodianIdentifier : "identifiers"` (hub → external IDs)
5. ✅ `CustodianIdentifier ||--|o Custodian : "identifies_custodian"` (ID → hub, identifies)

**Key Observation**: ❌ No direct Custodian ↔ Appellation relationship exists (by design!)

---

## Impact Analysis

### Benefits

1. **Semantic clarity**: Appellations are now clearly name variants, not hub identifiers
2. **Ontology alignment**: Uses standard `skos:altLabel` (W3C recommended practice)
3. **Clean hub architecture**: Only `CustodianIdentifier` identifies the hub
4. **Multi-aspect modeling**: Names can have independent alternative labels
5. **Bidirectional relationships**: Both forward (`alternative_names`) and inverse (`variant_of_name`)

### Breaking Changes from Phase 1

⚠️ **Data Migration Required**:

**Phase 1 data structure**:

```yaml
Custodian:
  appellations: [list of CustodianAppellation]
```

**Phase 2 data structure**:

```yaml
Custodian:
  preferred_label:  # CustodianName
    alternative_names: [list of CustodianAppellation]
```

**Migration Script**: TODO - Create conversion script for existing data

---

## Design Rationale

### Why `skos:altLabel` Instead of `crm:P1_is_identified_by`?

**CIDOC-CRM `crm:P1_is_identified_by`**:
- ✅ Purpose: "Names and labels used to **identify** this custodian"
- ❌ Problem: Suggests appellations identify the **hub entity**
- ❌ Conflicts with: `CustodianIdentifier` being the only hub identifier

**SKOS `skos:altLabel`**:
- ✅ Purpose: "Alternative lexical label for a resource"
- ✅ Standard for: Trading names, colloquial names, abbreviations
- ✅ Aligns with: W3C Org Ontology best practices
- ✅ Clear semantics: Alternative labels for a **name aspect**, not hub identifiers

### Why CustodianName, Not Custodian?
**Aspect-Based Architecture**:
- `CustodianName` = One aspect of the custodian (the emic designation)
- `CustodianIdentifier` = Different aspect (external identifiers)
- `CustodianLegalStatus` = Different aspect (legal entity)

**Each aspect has an independent lifecycle**:
- Names can have alternative variants (appellations)
- Identifiers can reference external systems (ISIL, Wikidata)
- Legal statuses can have registration numbers (KvK, company ID)

**Mixing aspects breaks the model**:
- ❌ Custodian.appellations → Implies hub has name variants (wrong level of abstraction)
- ✅ CustodianName.alternative_names → Correct level (names have variants)

---

## Testing Checklist

- [x] LinkML schema validation passes
- [x] OWL/Turtle generation succeeds
- [x] RDF format conversions (N-Triples, JSON-LD, RDF/XML)
- [x] Mermaid ER diagram generation
- [x] Relationships verified in ER diagram
- [x] Deprecated file marked with migration path
- [x] Main schema imports updated
- [ ] Unit tests for data instances (TODO)
- [ ] Migration script for existing data (TODO)

---

## Related Documentation

### Files to Update

1. **README.md** - Update architecture diagrams showing new relationships
2. **SCHEMA_MODULES.md** - Document `alternative_names` and `variant_of_name` slots
3. **ONTOLOGY_EXTENSIONS.md** - Add section on SKOS altLabel usage
4. **Data migration guide** - Create step-by-step conversion instructions

### Reference Documents

- **W3C Org Ontology**: https://www.w3.org/TR/vocab-org/#org:alternativeName
- **SKOS Reference**: https://www.w3.org/TR/skos-reference/#altLabel
- **CIDOC-CRM E41_Appellation**: http://www.cidoc-crm.org/Entity/e41-appellation/version-7.1.1
- **GLEIF Ontology**: https://www.gleif.org/ontology/Base/hasOtherName

---

## Phase Comparison Summary

| Phase | Date | Focus | Status |
|-------|------|-------|--------|
| **Phase 1** | 2025-11-22 AM | Connect orphaned Appellation/Identifier to Custodian hub | ✅ Complete |
| **Phase 2** | 2025-11-22 PM | Move Appellation from Custodian to CustodianName (SKOS alignment) | ✅ Complete |

**See Also**:
- `APPELLATION_IDENTIFIER_REFACTORING_20251122.md` - Phase 1 documentation
- `LEGAL_ENTITY_REFACTORING.md` - Legal entity model (context for Phase 1)

---

## Next Steps

### Immediate (Required Before v0.2.0 Release)

1. **Create data migration script** (`scripts/migrate_appellations_phase2_20251122.py`)
   - Convert Phase 1 `Custodian.appellations` to Phase 2 `CustodianName.alternative_names`
   - Validate all existing YAML instance files
   - Generate migration report
2. **Update documentation**:
   - README.md architecture diagrams
   - SCHEMA_MODULES.md slot documentation
   - Examples in LinkML schema comments
3. **Add unit tests**:
   - Test CustodianName with alternative_names
   - Test CustodianAppellation.variant_of_name inverse
   - Validate SKOS altLabel RDF serialization

### Future Enhancements

1. **Add language-tagged appellations**:
   - Support multilingual variants with proper `@lang` tags
   - RDF example: `skos:altLabel "BnF"@fr, "National Library of France"@en`
2. **Appellation provenance**:
   - Track source of alternative names
   - Add temporal validity (when name was used)
3. **Authority control integration**:
   - Link appellations to name authority records (VIAF, ISNI)
   - Validate variant forms against authority files

---

## Conclusion

Phase 2 successfully aligns the Heritage Custodian Ontology with W3C SKOS best practices, maintains clean hub architecture, and provides clear semantic distinction between:
- **CustodianIdentifier**: External identifiers that reference the hub
- **CustodianAppellation**: Alternative name variants for the canonical emic name

This change improves ontology interoperability and semantic clarity, and prepares the schema for future extensions (multilingual support, authority control, provenance tracking).

---

**Version**: v0.1.0 → v0.2.0 (Phase 2)  
**Schema Status**: ✅ Validated  
**RDF Generation**: ✅ Complete (4 formats)  
**Diagrams**: ✅ Generated (Mermaid ER)  
**Data Migration**: ⏳ Pending (Phase 1 → Phase 2 conversion script needed)

---

*End of Phase 2 Report*
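
As a starting point for the pending migration script listed under Next Steps, the Phase 1 → Phase 2 conversion of a single record might look like the sketch below. It is not the project's implementation: `migrate_custodian` is a hypothetical helper, records are assumed to be already loaded into plain dictionaries shaped like the YAML examples in this report, and YAML loading, validation, and report generation are left out. It also assumes that the Phase 1 `identifies_custodian` back-reference is simply dropped, that `variant_of_name` is filled in by a later step, and that a record without a `preferred_label` should be flagged rather than given an invented name.

```python
from copy import deepcopy
from typing import Any


def migrate_custodian(record: dict[str, Any]) -> dict[str, Any]:
    """Return a Phase 2 copy of a Phase 1 Custodian record (hypothetical helper).

    The input record is not mutated.
    """
    migrated = deepcopy(record)

    # Phase 1 kept name variants directly on the hub; nothing to move if absent.
    appellations = migrated.pop("appellations", None)
    if not appellations:
        return migrated

    # Assumption: the canonical name must already exist; it is not invented here.
    name = migrated.get("preferred_label")
    if not isinstance(name, dict):
        raise ValueError(
            f"{migrated.get('id')}: cannot move appellations without a preferred_label"
        )

    alternative_names = name.setdefault("alternative_names", [])
    for appellation in appellations:
        # Assumption: drop the Phase 1 back-reference to the hub.
        appellation.pop("identifies_custodian", None)
        # Avoid duplicating variants that are already attached to the name.
        if appellation not in alternative_names:
            alternative_names.append(appellation)
    return migrated


if __name__ == "__main__":
    phase1_record = {
        "id": "https://nde.nl/ontology/hc/cust/bnf",
        "preferred_label": {
            "refers_to_custodian": "https://nde.nl/ontology/hc/cust/bnf",
            "emic_name": "Bibliothèque nationale de France",
        },
        "appellations": [
            {
                "appellation_value": "BnF",
                "appellation_type": "ABBREVIATION",
                "identifies_custodian": "https://nde.nl/ontology/hc/cust/bnf",
            }
        ],
    }
    print(migrate_custodian(phase1_record))
```

Raising on a missing `preferred_label` keeps the sketch from silently creating a `CustodianName` with no emic name; a real script could instead collect such records into the migration report for manual review.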