# Appellation and Identifier Refactoring - Complete ✅
**Date**: 2025-11-22
**Status**: COMPLETE
**Schema Version**: 0.1.0
## Summary
Renamed the `Appellation` and `Identifier` classes to `CustodianAppellation` and `CustodianIdentifier`, and connected both to the `Custodian` hub through bidirectional CIDOC-CRM edge properties.
## Changes Made
### 1. Renamed Classes ✅
**Before**:
- `Appellation` (orphaned, no connection to Custodian hub)
- `Identifier` (orphaned, no connection to Custodian hub)
**After**:
- `CustodianAppellation` (connected via bidirectional CIDOC-CRM properties)
- `CustodianIdentifier` (connected via bidirectional CIDOC-CRM properties)
### 2. Added CIDOC-CRM Edge Properties ✅
#### CustodianAppellation Connection
**Forward Property** (Custodian → CustodianAppellation):
```yaml
# In Custodian class
appellations:
  slot_uri: crm:P1_is_identified_by
  range: CustodianAppellation
  multivalued: true
  description: "Names and labels used to identify this custodian"
```
**Inverse Property** (CustodianAppellation → Custodian):
```yaml
# In CustodianAppellation class
identifies_custodian:
  slot_uri: crm:P1i_identifies
  range: Custodian
  required: false
  description: "Links this appellation back to the Custodian hub it identifies"
```
**CIDOC-CRM Properties**:
- `crm:P1_is_identified_by` - Domain: E1_CRM_Entity (Custodian) → Range: E41_Appellation
- `crm:P1i_identifies` - Inverse property (E41_Appellation → E1_CRM_Entity)
#### CustodianIdentifier Connection
**Forward Property** (Custodian → CustodianIdentifier):
```yaml
# In Custodian class
identifiers:
  slot_uri: crm:P48_has_preferred_identifier
  range: CustodianIdentifier
  multivalued: true
  description: "External identifiers assigned to this custodian by authorities"
```
**Inverse Property** (CustodianIdentifier → Custodian):
```yaml
# In CustodianIdentifier class
identifies_custodian:
  slot_uri: crm:P48i_is_preferred_identifier_of
  range: Custodian
  required: false
  description: "Links this identifier back to the Custodian hub it identifies"
```
**CIDOC-CRM Properties**:
- `crm:P48_has_preferred_identifier` - Domain: E1_CRM_Entity (Custodian) → Range: E42_Identifier
- `crm:P48i_is_preferred_identifier_of` - Inverse property (E42_Identifier → E1_CRM_Entity)
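The pairing of forward and inverse slots can be pictured with a small Python sketch. The field names mirror the slots described above, but the dataclasses and the `attach` helper are hypothetical illustrations, not code generated from the schema.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class CustodianAppellation:
    appellation_value: str
    # Inverse slot (crm:P1i_identifies); optional, as in the schema.
    identifies_custodian: Optional[str] = None


@dataclass
class CustodianIdentifier:
    identifier_scheme: str
    identifier_value: str
    # Inverse slot (crm:P48i_is_preferred_identifier_of); optional.
    identifies_custodian: Optional[str] = None


@dataclass
class Custodian:
    hc_id: str
    # Forward slots (crm:P1_is_identified_by, crm:P48_has_preferred_identifier).
    appellations: List[CustodianAppellation] = field(default_factory=list)
    identifiers: List[CustodianIdentifier] = field(default_factory=list)


def attach(
    hub: Custodian,
    appellations: Iterable[CustodianAppellation] = (),
    identifiers: Iterable[CustodianIdentifier] = (),
) -> Custodian:
    """Hypothetical helper: add children to the hub and set their inverse link."""
    for appellation in appellations:
        appellation.identifies_custodian = hub.hc_id
        hub.appellations.append(appellation)
    for identifier in identifiers:
        identifier.identifies_custodian = hub.hc_id
        hub.identifiers.append(identifier)
    return hub
```

Setting both directions in one helper is only one possible design; it keeps the forward list and the inverse link from drifting apart when records are built by hand.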
### 3. Created New Slot Files ✅
**Created**:
- `modules/slots/appellations.yaml` - Forward property (Custodian → CustodianAppellation)
- `modules/slots/identifies_custodian.yaml` - Inverse property (both CustodianAppellation and CustodianIdentifier → Custodian)
**Updated**:
- `modules/slots/identifiers.yaml` - Updated to use `crm:P48_has_preferred_identifier` and `CustodianIdentifier` range
### 4. Updated Class Files ✅
**Updated Files**:
1. `modules/classes/Appellation.yaml` → Renamed to reference `CustodianAppellation`
   - Added `identifies_custodian` slot
   - Added CIDOC-CRM documentation
   - Updated class_uri: `crm:E41_Appellation`
2. `modules/classes/Identifier.yaml` → Renamed to reference `CustodianIdentifier`
   - Added `identifies_custodian` slot
   - Added CIDOC-CRM documentation
   - Updated class_uri: `crm:E42_Identifier`
3. `modules/classes/Custodian.yaml` → Added forward properties
   - Added `appellations` slot (multivalued)
   - Added `identifiers` slot (multivalued)
   - Both use proper CIDOC-CRM properties
4. `modules/classes/CustodianObservation.yaml` → Updated range references
   - `observed_name`: `Appellation` → `CustodianAppellation`
   - `alternative_observed_names`: `Appellation` → `CustodianAppellation`
5. `modules/classes/CustodianReconstruction.yaml` → Updated range references
   - `identifiers`: `Identifier` → `CustodianIdentifier`
   - Updated slot_uri: `dcterms:identifier` → `crm:P48_has_preferred_identifier`
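After a rename like this, leftover `range:` references to the old class names are easy to miss. The following is a minimal sketch of a text scan for such leftovers; the function name and the assumption that each range sits on its own `range: <Class>` line are illustrative, not part of the project tooling.

```python
import re
from typing import List, Tuple

# Old class name -> new class name, as listed in the changes above.
RENAMES = {
    "Appellation": "CustodianAppellation",
    "Identifier": "CustodianIdentifier",
}

# Matches a line that is exactly "range: Appellation" or "range: Identifier"
# (with optional indentation), so already-renamed ranges are not flagged.
STALE_RANGE = re.compile(r"^\s*range:\s*(Appellation|Identifier)\s*$")


def find_stale_ranges(yaml_text: str) -> List[Tuple[int, str, str]]:
    """Return (line_number, old_name, new_name) for every stale range line."""
    hits = []
    for line_number, line in enumerate(yaml_text.splitlines(), start=1):
        match = STALE_RANGE.match(line)
        if match:
            old_name = match.group(1)
            hits.append((line_number, old_name, RENAMES[old_name]))
    return hits
```

The scan works on raw text, so it needs no YAML parser, but it would not catch ranges written in other forms (for example inline mappings).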
### 5. Updated Main Schema ✅
**File**: `01_custodian_name_modular.yaml`
**Changes**:
- Added imports: `modules/slots/appellations`, `modules/slots/identifies_custodian`
- Updated file count: 84 → 86 total files (+2 new slots)
- Added comment about new bidirectional slots
## Hub Architecture Pattern
The refactoring implements proper bidirectional linking between the Custodian hub and its appellations/identifiers:
```
┌─────────────────┐
│    Custodian    │  (Hub - minimal, just hc_id)
│   (E39_Actor)   │
└────────┬────────┘
         ├─── crm:P1_is_identified_by ─────→ CustodianAppellation (E41_Appellation)
         │                                     │
         │                                     └─── crm:P1i_identifies ────→ (back to hub)
         └─── crm:P48_has_preferred_identifier ──→ CustodianIdentifier (E42_Identifier)
                                                     └─── crm:P48i_is_preferred_identifier_of ──→ (back to hub)
```
**Key Design Principles**:
1. **Bidirectional Links**: Both forward and inverse properties implemented
2. **CIDOC-CRM Compliance**: Uses standard cultural heritage ontology properties
3. **Multivalued**: A custodian can have multiple appellations and identifiers
4. **Optional Inverse**: The `identifies_custodian` slot is optional (not required)
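Because the inverse slot is optional, a consistency check has to treat a missing back-link as acceptable and only flag one that points at a different hub. A minimal sketch of such a check over plain dictionaries follows; the function name and the dictionary layout are assumptions for illustration.

```python
from typing import List


def inverse_link_errors(custodian: dict) -> List[str]:
    """Report children whose identifies_custodian points at another hub.

    A missing identifies_custodian is not an error, since the slot is optional.
    """
    hub_id = custodian["hc_id"]
    errors = []
    for slot in ("appellations", "identifiers"):
        for index, child in enumerate(custodian.get(slot, [])):
            back_link = child.get("identifies_custodian")
            if back_link is not None and back_link != hub_id:
                errors.append(
                    f"{slot}[{index}]: identifies_custodian {back_link!r} "
                    f"does not match hub {hub_id!r}"
                )
    return errors
```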
## CIDOC-CRM Ontology Alignment
### E41_Appellation (CustodianAppellation)
**CIDOC-CRM Definition**:
> "This class comprises any identifier expressed as text (names, titles, labels)."
**Properties Used**:
- **P1_is_identified_by** (E1 CRM Entity → E41 Appellation)
  - "This property describes the naming or identification of any real-world item by a name or any other identifier."
- **P1i_identifies** (E41 Appellation → E1 CRM Entity) - *inverse*
  - "This property identifies the entity that is named or identified."
**Use Cases**:
- Official names (emic names accepted by the custodian)
- Alternative names and translations
- Historical name variants
- Multilingual representations
### E42_Identifier (CustodianIdentifier)
**CIDOC-CRM Definition**:
> "This class comprises formal symbols or reference codes for unique identification."
**Properties Used**:
- **P48_has_preferred_identifier** (E1 CRM Entity → E42 Identifier)
  - "This property records the preferred E42 Identifier that was used to identify an instance of E1 CRM Entity at the time this property was recorded."
- **P48i_is_preferred_identifier_of** (E42 Identifier → E1 CRM Entity) - *inverse*
  - "This property identifies the E1 CRM Entity that this E42 Identifier is the preferred identifier for."
**Use Cases**:
- ISIL codes (International Standard Identifier for Libraries and Related Organizations)
- Wikidata Q-numbers
- VIAF identifiers (Virtual International Authority File)
- KvK numbers (Dutch Chamber of Commerce)
- ROR identifiers (Research Organization Registry)
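Identifier values from these schemes follow recognizable shapes, so a lightweight plausibility check can catch obvious data-entry mistakes. The sketch below covers three of the schemes; the patterns are approximations chosen for illustration (they are not taken from the schema or from the official scheme specifications), and the scheme label `KvK` is an assumed value for `identifier_scheme`.

```python
import re
from typing import Optional

# Approximate shapes only; assumptions for illustration, not official grammars.
PATTERNS = {
    # Wikidata item IDs: "Q" followed by a positive integer.
    "Wikidata": re.compile(r"Q[1-9][0-9]*"),
    # ISIL: short prefix, hyphen, then a unit identifier.
    "ISIL": re.compile(r"[A-Za-z0-9]{1,4}-[A-Za-z0-9:/\-]{1,11}"),
    # KvK (Dutch Chamber of Commerce) numbers: eight digits.
    "KvK": re.compile(r"[0-9]{8}"),
}


def is_plausible(scheme: str, value: str) -> Optional[bool]:
    """Return True/False for known schemes, or None when the scheme is unknown."""
    pattern = PATTERNS.get(scheme)
    if pattern is None:
        return None
    return pattern.fullmatch(value) is not None
```

Returning `None` for unknown schemes keeps the check from rejecting identifiers it simply has no pattern for.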
## Validation
**Schema Compilation**: ✅ PASS
```bash
$ gen-owl -f ttl schemas/20251121/linkml/01_custodian_name_modular.yaml
# Successfully compiled with expected warnings about namespace mappings
```
**Warnings** (expected and acceptable):
- Namespace mapping conflicts (heritage, schema, tooi) - resolved by import order
- Multiple owl types for language slot - acceptable for multilingual support
## File Count Summary
**Before Refactoring**:
- Total files: 84
**After Refactoring**:
- Total files: 86 (+2 new slot files)
- Breakdown:
  - 17 classes (no change, renamed existing)
  - 6 enums (no change)
  - 61 slots (+2: `appellations`, `identifies_custodian`)
  - 1 metadata file
  - 1 main schema
## Next Steps
### 1. Regenerate RDF Formats ⏳
```bash
cd /Users/kempersc/apps/glam/schemas/20251121/rdf
gen-owl -f ttl ../linkml/01_custodian_name_modular.yaml > 01_custodian_name.owl.ttl
rdfpipe 01_custodian_name.owl.ttl -o nt > 01_custodian_name.nt
rdfpipe 01_custodian_name.owl.ttl -o jsonld > 01_custodian_name.jsonld
# ... repeat for all 8 formats
```
### 2. Update UML Diagrams ⏳
- Regenerate Mermaid class diagram with new appellations/identifiers slots
- Regenerate PlantUML diagram showing bidirectional relationships
### 3. Create Example Instances ⏳
```yaml
# Example showing bidirectional linking
---
# Custodian hub
- hc_id: https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804
  appellations:
    - appellation_value: "Rijksmuseum"
      appellation_language: "nl"
      appellation_type: OFFICIAL
      identifies_custodian: https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804
  identifiers:
    - identifier_scheme: "ISIL"
      identifier_value: "NL-AmRMA"
      identifies_custodian: https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804
    - identifier_scheme: "Wikidata"
      identifier_value: "Q190804"
      identifies_custodian: https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804
```
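To see what the bidirectional links in such an instance amount to in graph terms, the record can be flattened into subject/predicate/object tuples. This is a rough sketch: the `to_triples` function and the `_:slotN` blank-node labels are hypothetical, and real RDF output would come from the LinkML tooling rather than from code like this.

```python
from typing import List, Tuple

HUB = "https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804"

# The example instance above, as a plain Python dictionary.
rijksmuseum = {
    "hc_id": HUB,
    "appellations": [
        {"appellation_value": "Rijksmuseum", "identifies_custodian": HUB},
    ],
    "identifiers": [
        {"identifier_scheme": "ISIL", "identifier_value": "NL-AmRMA",
         "identifies_custodian": HUB},
        {"identifier_scheme": "Wikidata", "identifier_value": "Q190804",
         "identifies_custodian": HUB},
    ],
}

# (slot name, forward property, inverse property)
SLOTS = (
    ("appellations", "crm:P1_is_identified_by", "crm:P1i_identifies"),
    ("identifiers", "crm:P48_has_preferred_identifier",
     "crm:P48i_is_preferred_identifier_of"),
)


def to_triples(custodian: dict) -> List[Tuple[str, str, str]]:
    """Emit a forward triple per child, plus an inverse triple when it links back."""
    hub = custodian["hc_id"]
    triples = []
    for slot, forward, inverse in SLOTS:
        for index, child in enumerate(custodian.get(slot, [])):
            node = f"_:{slot}{index}"  # hypothetical blank-node label
            triples.append((hub, forward, node))
            if child.get("identifies_custodian") == hub:
                triples.append((node, inverse, hub))
    return triples


for triple in to_triples(rijksmuseum):
    print(triple)
```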
### 4. Update Documentation ⏳
- Update `SCHEMA_ARCHITECTURE.md` with bidirectional linking patterns
- Document CIDOC-CRM property usage in `ONTOLOGY_ALIGNMENT.md`
- Add examples to `USAGE_GUIDE.md`
## References
**CIDOC-CRM**:
- [CIDOC-CRM v7.1.3 Specification](https://www.cidoc-crm.org/html/cidoc_crm_v7.1.3.html)
- E41_Appellation: http://www.cidoc-crm.org/Entity/e41-appellation/version-7.1.3
- E42_Identifier: http://www.cidoc-crm.org/Entity/e42-identifier/version-7.1.3
- P1_is_identified_by: http://www.cidoc-crm.org/Property/p1-is-identified-by/version-7.1.3
- P48_has_preferred_identifier: http://www.cidoc-crm.org/Property/p48-has-preferred-identifier/version-7.1.3
**Local Files**:
- Ontology: `/data/ontology/CIDOC_CRM_v7.1.3.rdf`
- Schema: `/schemas/20251121/linkml/01_custodian_name_modular.yaml`
- RDF Output: `/schemas/20251121/rdf/` (to be regenerated)
## Session Context
This refactoring is part of the broader **Legal Entity Refactoring** work (2025-11-22), which:
1. ✅ Added comprehensive legal entity model (8 new classes)
2. ✅ Generated RDF serializations (7 formats, 2,701 triples)
3. ✅ Created UML diagrams (Mermaid + PlantUML)
4. ✅ Connected orphaned Appellation/Identifier classes to Custodian hub (THIS DOCUMENT)
**See Also**:
- `LEGAL_ENTITY_REFACTORING.md` - Complete legal entity model documentation
- `RDF_UML_GENERATION_COMPLETE_20251122.md` - RDF generation guide
- `SESSION_SUMMARY_20251122_APPELLATION_IDENTIFIER_REFACTORING.md` - Session log
---
**Status**: ✅ COMPLETE
**Validated**: Schema compiles successfully
**Next Actions**: Regenerate RDF, update UML, create example instances