# CustodianAppellation Relationship Refactoring - Phase 2

**Date**: 2025-11-22
**Session**: OpenCode AI Agent (Phase 2)
**Status**: ✅ COMPLETE

---

## Executive Summary

Refactored the Heritage Custodian Ontology to correctly model the relationship between `CustodianAppellation` and `CustodianName`, aligning with W3C SKOS best practices and ensuring `CustodianIdentifier` is the only class that directly identifies the `Custodian` hub.

**This is Phase 2** of the Appellation/Identifier refactoring:
- **Phase 1** (Nov 22, morning): Connected orphaned classes to Custodian hub ✅
- **Phase 2** (Nov 22, afternoon): Moved appellations from Custodian to CustodianName ✅

---
## Problem Statement

### Phase 1 Architecture (Correct but Improvable)

After Phase 1 refactoring:
```
Custodian --[crm:P1_is_identified_by]--> CustodianAppellation
Custodian --[crm:P48_has_preferred_identifier]--> CustodianIdentifier
```

**Issues Discovered**:
1. **Semantic confusion**: `crm:P1_is_identified_by` suggests appellations *identify* the Custodian hub
2. **Inconsistent hub design**: Both Identifier and Appellation claimed to identify the hub
3. **Missing name aspect**: Appellations should be variants of the canonical name, not hub identifiers
4. **Ontology misalignment**: W3C Org Ontology uses `skos:altLabel` for alternative names, not `crm:P1`

---
## Solution

### Phase 2 Architecture (Correct and Aligned with SKOS)

```
Custodian (hub)
  └─ skos:prefLabel ──> CustodianName (canonical emic name)
       └─ skos:altLabel ──> CustodianAppellation (name variants)

Custodian (hub)
  └─ crm:P48_has_preferred_identifier ──> CustodianIdentifier (external IDs)
```

**Key Changes from Phase 1**:
1. **CustodianAppellation connects to CustodianName** (not Custodian)
2. **Uses `skos:altLabel`** (standard property for alternative lexical labels)
3. **Only CustodianIdentifier identifies the hub** (maintains clean hub architecture)
4. **Inverse relationship**: `CustodianAppellation.variant_of_name` → `CustodianName` (using `skos:broader`)

---
## Files Modified

### 1. Schema Module Files

#### `/schemas/20251121/linkml/modules/classes/Appellation.yaml` ✅
**Changes**:
- Updated description: "alternative name variants for CustodianName" (was "custodian")
- Changed slot from `identifies_custodian` → `variant_of_name`
- Changed slot_uri from `crm:P1i_identifies` → `skos:broader`
- Range changed from `Custodian` → `CustodianName`
- Added import for `CustodianName` class
- Updated documentation with SKOS altLabel rationale

#### `/schemas/20251121/linkml/modules/classes/Custodian.yaml` ✅
**Changes**:
- **Removed** `appellations` slot from slots list (line 99)
- **Removed** `appellations` slot_usage block (lines 169-178)
- Updated documentation: "Alternative names (in CustodianName.alternative_names list)"

#### `/schemas/20251121/linkml/modules/classes/CustodianName.yaml` ✅
**Changes**:
- **Added** `alternative_names` slot to slots list
- **Added** `alternative_names` slot_usage:
  - `slot_uri: skos:altLabel`
  - `range: CustodianAppellation`
  - `multivalued: true`
  - Examples: "BnF", "Rijks", translations, historical variants
- Added related mappings: `foaf:nick`, `gleif:hasOtherName`
### 2. New Slot Files Created

#### `/schemas/20251121/linkml/modules/slots/alternative_names.yaml` ✅
**Purpose**: CustodianName → CustodianAppellation relationship
- `slot_uri: skos:altLabel`
- `range: CustodianAppellation`
- `multivalued: true`
- Domain: CustodianName (SKOS Concept)

#### `/schemas/20251121/linkml/modules/slots/variant_of_name.yaml` ✅
**Purpose**: CustodianAppellation → CustodianName inverse relationship
- `slot_uri: skos:broader`
- `range: CustodianName`
- Domain: E41_Appellation
- Used as the inverse of `alternative_names` (`skos:altLabel`)
### 3. Main Schema File

#### `/schemas/20251121/linkml/01_custodian_name_modular.yaml` ✅
**Changes**:
- **Removed import**: `modules/slots/appellations`
- **Added imports**:
  - `modules/slots/alternative_names`
  - `modules/slots/variant_of_name`
- Updated change log:
  - "New slots (3): alternative_names (CustodianName → CustodianAppellation), variant_of_name (inverse), identifies_custodian (Identifier → Custodian)"
  - "Architecture change: CustodianAppellation now connects to CustodianName (not Custodian) using skos:altLabel"
### 4. Deprecated Files

#### `/schemas/20251121/linkml/modules/slots/appellations.yaml` ⚠️ DEPRECATED
**Status**: Marked as deprecated with clear migration path
- Added deprecation notice explaining why it was replaced
- Documents old architecture vs. new architecture
- Points to replacement files (`alternative_names.yaml`, `variant_of_name.yaml`)
- Kept for historical reference only

---
## Ontology Alignment

### SKOS (Simple Knowledge Organization System)

**Primary Property**: `skos:altLabel`
- **Definition**: "An alternative lexical label for a resource"
- **Use case**: "Trading names, colloquial names, abbreviations, acronyms"
- **Source**: W3C Org Ontology (org:alternativeName → skos:altLabel)

**Inverse Property**: `skos:broader`
- Links alternative label back to its preferred concept
- Standard SKOS hierarchical relationship

### Related Ontology Properties

- **W3C Org Ontology**: `org:alternativeName` (subproperty of `skos:altLabel`)
- **GLEIF**: `gleif:hasOtherName` (subproperty of `skos:altLabel`)
- **FOAF**: `foaf:nick` (for nicknames)
- **Schema.org**: `schema:alternateName` (close match)

---
## Generated Outputs

### RDF Formats (Timestamp: `20251122_181217` - `20251122_181224`)

```bash
schemas/20251121/rdf/
├── 01_custodian_name_modular_20251122_181217.owl.ttl  (160 KB - OWL/Turtle)
├── 01_custodian_name_modular_20251122_181224.nt       (458 KB - N-Triples)
├── 01_custodian_name_modular_20251122_181224.jsonld   (382 KB - JSON-LD)
└── 01_custodian_name_modular_20251122_181224.rdf      (330 KB - RDF/XML)
```

**Validation**: ✅ All files generated successfully with `gen-owl` and `rdfpipe`

### UML Diagrams (Timestamp: `20251122_181237`)

```bash
schemas/20251121/uml/mermaid/
└── 01_custodian_name_modular_20251122_181237_er.mmd  (176 lines - ER diagram)
```

**Key Relationships Verified**:
```mermaid
erDiagram
    CustodianName ||--}o CustodianAppellation : "alternative_names"
    CustodianAppellation ||--|o CustodianName : "variant_of_name"
```
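The relationship check above could also be automated against the generated `.mmd` file. The following is a minimal sketch, assuming relationship lines follow the `Source <cardinality> Target : "label"` shape shown in the excerpt; `parse_relationships` and `directly_linked` are hypothetical helper names, not part of LinkML or Mermaid, and the sample text stands in for the real file contents.

```python
import re

# Matches lines such as:
#   CustodianName ||--}o CustodianAppellation : "alternative_names"
REL = re.compile(r'^\s*(\w+)\s+(\S+--\S+)\s+(\w+)\s*:\s*"([^"]+)"\s*$')


def parse_relationships(mmd_text):
    """Return (source, target, label) tuples for every relationship line."""
    rels = []
    for line in mmd_text.splitlines():
        match = REL.match(line)
        if match:
            source, _cardinality, target, label = match.groups()
            rels.append((source, target, label))
    return rels


def directly_linked(rels, a, b):
    """True if any relationship connects a and b, in either direction."""
    return any({source, target} == {a, b} for source, target, _ in rels)


# Sample input copied from the excerpt above (not the full 176-line file).
sample = '''erDiagram
    CustodianName ||--}o CustodianAppellation : "alternative_names"
    CustodianAppellation ||--|o CustodianName : "variant_of_name"
'''

rels = parse_relationships(sample)
assert directly_linked(rels, "CustodianName", "CustodianAppellation")
# Phase 2 design goal: the hub has no direct link to appellations.
assert not directly_linked(rels, "Custodian", "CustodianAppellation")
```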
---

## Architecture Evolution

### Phase 1 → Phase 2 Comparison

| Aspect | Phase 1 | Phase 2 |
|--------|---------|---------|
| **Appellation connects to** | Custodian (hub) | CustodianName (aspect) |
| **Property used** | `crm:P1_is_identified_by` | `skos:altLabel` |
| **Inverse property** | `crm:P1i_identifies` | `skos:broader` |
| **Semantic meaning** | "Appellation identifies hub" | "Appellation is variant of name" |
| **Ontology alignment** | CIDOC-CRM | W3C SKOS + CIDOC-CRM |
| **Slot name** | `appellations` | `alternative_names` |

---
## Examples

### Phase 1 (Before Phase 2)

```yaml
# Phase 1 architecture - Connected but semantically confused
Custodian:
  id: https://nde.nl/ontology/hc/cust/bnf
  appellations:  # ⚠️ Direct connection to hub (problematic)
    - appellation_value: "BnF"
      appellation_type: ABBREVIATION
      identifies_custodian: https://nde.nl/ontology/hc/cust/bnf  # Suggests it identifies hub
```

### Phase 2 (After Refactoring)

```yaml
# Phase 2 architecture - CORRECT!
Custodian:
  id: https://nde.nl/ontology/hc/cust/bnf
  preferred_label:
    refers_to_custodian: https://nde.nl/ontology/hc/cust/bnf
    emic_name: "Bibliothèque nationale de France"
    alternative_names:  # ✅ Connection through CustodianName
      - appellation_value: "BnF"
        appellation_type: ABBREVIATION
        variant_of_name: <link back to CustodianName>
      - appellation_value: "National Library of France"
        appellation_language: "en"
        appellation_type: TRANSLATION
```

---
## RDF Serialization

### Turtle (TTL) - Phase 2

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix crm: <http://www.cidoc-crm.org/cidoc-crm/> .

<https://nde.nl/ontology/hc/cust/bnf>
    skos:prefLabel <https://nde.nl/ontology/hc/name/bnf-001> .

<https://nde.nl/ontology/hc/name/bnf-001>
    a crm:E33_Linguistic_Object ;
    rdf:value "Bibliothèque nationale de France" ;
    skos:altLabel <https://nde.nl/ontology/hc/appellation/bnf-abbrev> .

<https://nde.nl/ontology/hc/appellation/bnf-abbrev>
    a crm:E41_Appellation ;
    rdf:value "BnF" ;
    skos:broader <https://nde.nl/ontology/hc/name/bnf-001> .
```
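In the serialization above, every `skos:altLabel` link from a name is paired with a `skos:broader` link back from the appellation. A consistency check for that pairing might look like the sketch below. It deliberately uses plain `(subject, predicate, object)` string tuples rather than an RDF library, and `missing_back_links` is a hypothetical helper name; a real test would load the triples from the generated files.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
ALT_LABEL = SKOS + "altLabel"
BROADER = SKOS + "broader"


def missing_back_links(triples):
    """Return (name, appellation) pairs whose altLabel lacks a matching broader."""
    triple_set = set(triples)
    return [
        (subject, obj)
        for subject, predicate, obj in triples
        if predicate == ALT_LABEL and (obj, BROADER, subject) not in triple_set
    ]


# IRIs copied from the Turtle example above.
NAME = "https://nde.nl/ontology/hc/name/bnf-001"
ABBREV = "https://nde.nl/ontology/hc/appellation/bnf-abbrev"

triples = [
    (NAME, ALT_LABEL, ABBREV),
    (ABBREV, BROADER, NAME),
]

# Both directions present: nothing is reported.
assert missing_back_links(triples) == []
# Dropping the inverse triple is detected.
assert missing_back_links(triples[:1]) == [(NAME, ABBREV)]
```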
---

## Validation Results

### LinkML Schema Validation ✅

```bash
$ gen-owl -f ttl 01_custodian_name_modular.yaml
# Output: 160 KB OWL/Turtle file with no errors
```

**Warnings** (non-critical):
- ⚠️ Multiple owl types for `language` (rdfs:Literal vs owl:Thing) - expected for ambiguous ranges
- ⚠️ Schema namespace override (schema.org vs schema:) - cosmetic, doesn't affect semantics

### ER Diagram Validation ✅

**Relationships Confirmed**:
1. ✅ `Custodian ||--|o CustodianName : "preferred_label"` (hub → name)
2. ✅ `CustodianName ||--}o CustodianAppellation : "alternative_names"` (name → variants, one-to-many)
3. ✅ `CustodianAppellation ||--|o CustodianName : "variant_of_name"` (variant → name, inverse)
4. ✅ `Custodian ||--}o CustodianIdentifier : "identifiers"` (hub → external IDs)
5. ✅ `CustodianIdentifier ||--|o Custodian : "identifies_custodian"` (ID → hub, identifies)

**Key Observation**: ❌ No direct Custodian ↔ Appellation relationship exists (by design!)

---
## Impact Analysis

### Benefits

1. **Semantic clarity**: Appellations are now clearly name variants, not hub identifiers
2. **Ontology alignment**: Uses standard `skos:altLabel` (W3C recommended practice)
3. **Clean hub architecture**: Only `CustodianIdentifier` identifies the hub
4. **Multi-aspect modeling**: Names can have independent alternative labels
5. **Bidirectional relationships**: Both forward (`alternative_names`) and inverse (`variant_of_name`)

### Breaking Changes from Phase 1

⚠️ **Data Migration Required**:

**Phase 1 data structure**:
```yaml
Custodian:
  appellations: [list of CustodianAppellation]
```

**Phase 2 data structure**:
```yaml
Custodian:
  preferred_label:  # CustodianName
    alternative_names: [list of CustodianAppellation]
```
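The move between the two structures above can be illustrated on a single record. This is only a sketch under stated assumptions: records are already loaded as plain dicts, every record with appellations already has a `preferred_label`, and the Phase 1 `identifies_custodian` back-reference is simply dropped. `migrate_custodian` is a hypothetical name, and `variant_of_name` is left unset because the report does not specify how a `CustodianName` is referenced from instance data.

```python
import copy


def migrate_custodian(record):
    """Move Phase 1 `appellations` under `preferred_label.alternative_names`."""
    migrated = copy.deepcopy(record)  # never mutate the caller's data
    appellations = migrated.pop("appellations", [])
    if not appellations:
        return migrated

    name = migrated.get("preferred_label")
    if name is None:
        # Assumption: without a CustodianName there is nowhere to attach
        # the variants, so the record is reported for manual review.
        raise ValueError(f"{migrated.get('id')}: appellations without preferred_label")

    variants = name.setdefault("alternative_names", [])
    for appellation in appellations:
        # Phase 2 appellations no longer point at the Custodian hub.
        appellation.pop("identifies_custodian", None)
        variants.append(appellation)
    return migrated


phase1 = {
    "id": "https://nde.nl/ontology/hc/cust/bnf",
    "preferred_label": {"emic_name": "Bibliothèque nationale de France"},
    "appellations": [
        {
            "appellation_value": "BnF",
            "appellation_type": "ABBREVIATION",
            "identifies_custodian": "https://nde.nl/ontology/hc/cust/bnf",
        }
    ],
}
phase2 = migrate_custodian(phase1)
```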
**Migration Script**: TODO - Create conversion script for existing data

---
## Design Rationale

### Why `skos:altLabel` Instead of `crm:P1_is_identified_by`?

**CIDOC-CRM `crm:P1_is_identified_by`**:
- ✅ Purpose: "Names and labels used to **identify** this custodian"
- ❌ Problem: Suggests appellations identify the **hub entity**
- ❌ Conflicts with: `CustodianIdentifier` being the only hub identifier

**SKOS `skos:altLabel`**:
- ✅ Purpose: "Alternative lexical label for a resource"
- ✅ Standard for: Trading names, colloquial names, abbreviations
- ✅ Aligns with: W3C Org Ontology best practices
- ✅ Clear semantics: Alternative labels for a **name aspect**, not hub identifiers

### Why CustodianName, Not Custodian?

**Aspect-Based Architecture**:
- `CustodianName` = One aspect of the custodian (the emic designation)
- `CustodianIdentifier` = Different aspect (external identifiers)
- `CustodianLegalStatus` = Different aspect (legal entity)

**Each aspect has independent lifecycle**:
- Names can have alternative variants (appellations)
- Identifiers can reference external systems (ISIL, Wikidata)
- Legal statuses can have registration numbers (KvK, company ID)

**Mixing aspects breaks the model**:
- ❌ Custodian.appellations → Implies hub has name variants (wrong level of abstraction)
- ✅ CustodianName.alternative_names → Correct level (names have variants)
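The aspect separation described above can be pictured with plain Python dataclasses. This is an illustrative sketch, not the generated LinkML data model: the class and slot names follow the report, while the `scheme` and `value` fields on `CustodianIdentifier` are hypothetical placeholders. The point it shows is that name variants are reachable only through the name aspect, never from the hub directly.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CustodianAppellation:
    appellation_value: str
    appellation_type: Optional[str] = None
    appellation_language: Optional[str] = None


@dataclass
class CustodianName:
    emic_name: str
    # Name variants belong to the name aspect.
    alternative_names: List[CustodianAppellation] = field(default_factory=list)


@dataclass
class CustodianIdentifier:
    scheme: str  # hypothetical field, for illustration only
    value: str   # hypothetical field, for illustration only


@dataclass
class Custodian:
    id: str
    preferred_label: Optional[CustodianName] = None
    identifiers: List[CustodianIdentifier] = field(default_factory=list)
    # Deliberately no `appellations` attribute on the hub.


bnf = Custodian(
    id="https://nde.nl/ontology/hc/cust/bnf",
    preferred_label=CustodianName(
        emic_name="Bibliothèque nationale de France",
        alternative_names=[CustodianAppellation("BnF", "ABBREVIATION")],
    ),
)

# Variants are reached through the name aspect, not the hub.
variants = [a.appellation_value for a in bnf.preferred_label.alternative_names]
```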
---

## Testing Checklist

- [x] LinkML schema validation passes
- [x] OWL/Turtle generation succeeds
- [x] RDF format conversions (N-Triples, JSON-LD, RDF/XML)
- [x] Mermaid ER diagram generation
- [x] Relationships verified in ER diagram
- [x] Deprecated file marked with migration path
- [x] Main schema imports updated
- [ ] Unit tests for data instances (TODO)
- [ ] Migration script for existing data (TODO)

---
## Related Documentation

### Files to Update

1. **README.md** - Update architecture diagrams showing new relationships
2. **SCHEMA_MODULES.md** - Document `alternative_names` and `variant_of_name` slots
3. **ONTOLOGY_EXTENSIONS.md** - Add section on SKOS altLabel usage
4. **Data migration guide** - Create step-by-step conversion instructions

### Reference Documents

- **W3C Org Ontology**: https://www.w3.org/TR/vocab-org/#org:alternativeName
- **SKOS Reference**: https://www.w3.org/TR/skos-reference/#altLabel
- **CIDOC-CRM E41_Appellation**: http://www.cidoc-crm.org/Entity/e41-appellation/version-7.1.1
- **GLEIF Ontology**: https://www.gleif.org/ontology/Base/hasOtherName

---
## Phase Comparison Summary

| Phase | Date | Focus | Status |
|-------|------|-------|--------|
| **Phase 1** | 2025-11-22 AM | Connect orphaned Appellation/Identifier to Custodian hub | ✅ Complete |
| **Phase 2** | 2025-11-22 PM | Move Appellation from Custodian to CustodianName (SKOS alignment) | ✅ Complete |

**See Also**:
- `APPELLATION_IDENTIFIER_REFACTORING_20251122.md` - Phase 1 documentation
- `LEGAL_ENTITY_REFACTORING.md` - Legal entity model (context for Phase 1)

---
## Next Steps

### Immediate (Required Before v0.2.0 Release)

1. **Create data migration script** (`scripts/migrate_appellations_phase2_20251122.py`)
   - Convert Phase 1 `Custodian.appellations` to Phase 2 `CustodianName.alternative_names`
   - Validate all existing YAML instance files
   - Generate migration report

2. **Update documentation**:
   - README.md architecture diagrams
   - SCHEMA_MODULES.md slot documentation
   - Examples in LinkML schema comments

3. **Add unit tests**:
   - Test CustodianName with alternative_names
   - Test CustodianAppellation.variant_of_name inverse
   - Validate SKOS altLabel RDF serialization

### Future Enhancements

1. **Add language-tagged appellations**:
   - Support multilingual variants with proper `@lang` tags
   - RDF example: `skos:altLabel "BnF"@fr, "National Library of France"@en`

2. **Appellation provenance**:
   - Track source of alternative names
   - Add temporal validity (when name was used)

3. **Authority control integration**:
   - Link appellations to name authority records (VIAF, ISNI)
   - Validate variant forms against authority files
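For the first enhancement above (language-tagged appellations), the RDF example could be produced from appellation records roughly as follows. This is a sketch only: `turtle_literal` and `alt_label_statement` are hypothetical helpers, records are assumed to be dicts using the `appellation_value` and `appellation_language` slots, and escaping covers only backslashes and double quotes (not newlines or other control characters).

```python
def turtle_literal(value, language=None):
    """Render a Turtle string literal, with an optional language tag."""
    # Escape backslashes first so the quote escapes are not doubled.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{language}' if language else f'"{escaped}"'


def alt_label_statement(appellations):
    """Build a `skos:altLabel` predicate with a comma-separated object list."""
    literals = [
        turtle_literal(a["appellation_value"], a.get("appellation_language"))
        for a in appellations
    ]
    return "skos:altLabel " + ", ".join(literals)


appellations = [
    {"appellation_value": "BnF", "appellation_language": "fr"},
    {"appellation_value": "National Library of France", "appellation_language": "en"},
]
statement = alt_label_statement(appellations)
# statement == 'skos:altLabel "BnF"@fr, "National Library of France"@en'
```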
---

## Conclusion

Phase 2 successfully aligns the Heritage Custodian Ontology with W3C SKOS best practices, maintains clean hub architecture, and provides clear semantic distinction between:

- **CustodianIdentifier**: External identifiers that reference the hub
- **CustodianAppellation**: Alternative name variants for the canonical emic name

This change improves ontology interoperability, semantic clarity, and prepares the schema for future extensions (multilingual support, authority control, provenance tracking).

---

**Version**: v0.1.0 → v0.2.0 (Phase 2)
**Schema Status**: ✅ Validated
**RDF Generation**: ✅ Complete (4 formats)
**Diagrams**: ✅ Generated (Mermaid ER)
**Data Migration**: ⏳ Pending (Phase 1 → Phase 2 conversion script needed)

---

*End of Phase 2 Report*