Appellation and Identifier Refactoring - Complete

Date: 2025-11-22
Status: COMPLETE
Schema Version: 0.1.0

Summary

Renamed the Appellation and Identifier classes to CustodianAppellation and CustodianIdentifier, and connected both to the Custodian hub through bidirectional CIDOC-CRM properties (a forward and an inverse property for each class).

Changes Made

1. Renamed Classes

Before:

  • Appellation (orphaned, no connection to Custodian hub)
  • Identifier (orphaned, no connection to Custodian hub)

After:

  • CustodianAppellation (connected via bidirectional CIDOC-CRM properties)
  • CustodianIdentifier (connected via bidirectional CIDOC-CRM properties)

2. Added CIDOC-CRM Edge Properties

CustodianAppellation Connection

Forward Property (Custodian → CustodianAppellation):

# In Custodian class
appellations:
  slot_uri: crm:P1_is_identified_by
  range: CustodianAppellation
  multivalued: true
  description: "Names and labels used to identify this custodian"

Inverse Property (CustodianAppellation → Custodian):

# In CustodianAppellation class
identifies_custodian:
  slot_uri: crm:P1i_identifies
  range: Custodian
  required: false
  description: "Links this appellation back to the Custodian hub it identifies"

CIDOC-CRM Properties:

  • crm:P1_is_identified_by - Domain: E1_CRM_Entity (Custodian) → Range: E41_Appellation
  • crm:P1i_identifies - Inverse property (E41_Appellation → E1_CRM_Entity)

CustodianIdentifier Connection

Forward Property (Custodian → CustodianIdentifier):

# In Custodian class
identifiers:
  slot_uri: crm:P48_has_preferred_identifier
  range: CustodianIdentifier
  multivalued: true
  description: "External identifiers assigned to this custodian by authorities"

Inverse Property (CustodianIdentifier → Custodian):

# In CustodianIdentifier class
identifies_custodian:
  slot_uri: crm:P48i_is_preferred_identifier_of
  range: Custodian
  required: false
  description: "Links this identifier back to the Custodian hub it identifies"

CIDOC-CRM Properties:

  • crm:P48_has_preferred_identifier - Domain: E1_CRM_Entity (Custodian) → Range: E42_Identifier
  • crm:P48i_is_preferred_identifier_of - Inverse property (E42_Identifier → E1_CRM_Entity)

3. Created New Slot Files

Created:

  • modules/slots/appellations.yaml - Forward property (Custodian → CustodianAppellation)
  • modules/slots/identifies_custodian.yaml - Inverse property (both CustodianAppellation and CustodianIdentifier → Custodian)

Updated:

  • modules/slots/identifiers.yaml - Updated to use crm:P48_has_preferred_identifier and CustodianIdentifier range

4. Updated Class Files

Updated Files:

  1. modules/classes/Appellation.yaml → Renamed to reference CustodianAppellation

    • Added identifies_custodian slot
    • Added CIDOC-CRM documentation
    • Updated class_uri: crm:E41_Appellation
  2. modules/classes/Identifier.yaml → Renamed to reference CustodianIdentifier

    • Added identifies_custodian slot
    • Added CIDOC-CRM documentation
    • Updated class_uri: crm:E42_Identifier
  3. modules/classes/Custodian.yaml → Added forward properties

    • Added appellations slot (multivalued)
    • Added identifiers slot (multivalued)
    • Both use proper CIDOC-CRM properties
  4. modules/classes/CustodianObservation.yaml → Updated range references

    • observed_name: Appellation → CustodianAppellation
    • alternative_observed_names: Appellation → CustodianAppellation
  5. modules/classes/CustodianReconstruction.yaml → Updated range references

    • identifiers: Identifier → CustodianIdentifier
    • Updated slot_uri: dcterms:identifier → crm:P48_has_preferred_identifier
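
The range updates listed above amount to a mechanical rename, so they could in principle be scripted. The following is a minimal sketch, not the procedure actually used for this refactoring: the helper name rename_ranges and the regular expression are illustrative assumptions, and the sketch only handles simple one-line `range:` entries.

```python
import re

# Old class name -> new class name, as listed in this document.
RENAMES = {
    "Appellation": "CustodianAppellation",
    "Identifier": "CustodianIdentifier",
}

# Match `range: <OldName>` only when the old name is the whole value, so a
# range that already says CustodianAppellation is left untouched.
RANGE_LINE = re.compile(
    r"^(?P<prefix>[ \t]*range:[ \t]*)"
    r"(?P<name>" + "|".join(RENAMES) + r")"
    r"(?P<suffix>[ \t]*(?:#.*)?)$",
    re.MULTILINE,
)


def rename_ranges(yaml_text: str) -> str:
    """Return yaml_text with old range targets replaced by the new class names.

    Hypothetical helper: works on raw text so that comments and key order in
    the schema files are preserved.
    """
    return RANGE_LINE.sub(
        lambda m: m.group("prefix") + RENAMES[m.group("name")] + m.group("suffix"),
        yaml_text,
    )
```

Because the pattern only matches the bare old names, running the function a second time over its own output changes nothing, which makes it safe to re-run after a partial migration.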

5. Updated Main Schema

File: 01_custodian_name_modular.yaml

Changes:

  • Added imports: modules/slots/appellations, modules/slots/identifies_custodian
  • Updated file count: 84 → 86 total files (+2 new slots)
  • Added comment about new bidirectional slots

Hub Architecture Pattern

The refactoring implements proper bidirectional linking between the Custodian hub and its appellations/identifiers:

┌─────────────────┐
│   Custodian     │ (Hub - minimal, just hc_id)
│   (E39_Actor)   │
└────────┬────────┘
         │
         ├─── crm:P1_is_identified_by ─────→ CustodianAppellation (E41_Appellation)
         │                                          │
         │                                          └─── crm:P1i_identifies ────→ (back to hub)
         │
         └─── crm:P48_has_preferred_identifier ──→ CustodianIdentifier (E42_Identifier)
                                                         │
                                                         └─── crm:P48i_is_preferred_identifier_of ──→ (back to hub)

Key Design Principles:

  1. Bidirectional Links: Both forward and inverse properties implemented
  2. CIDOC-CRM Compliance: Uses standard cultural heritage ontology properties
  3. Multivalued: A custodian can have multiple appellations and identifiers
  4. Optional Inverse: The identifies_custodian slot is declared with required: false, so an appellation or identifier may omit the back-link
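
Since the inverse link is optional, a data-preparation step may want to fill it in from the hub and reject records whose back-link points elsewhere. The sketch below illustrates that idea on plain dictionaries; the function name attach_inverse_links is a hypothetical helper, and the dictionary layout simply mirrors the slot names used in this document (hc_id, appellations, identifiers, identifies_custodian).

```python
from copy import deepcopy


def attach_inverse_links(custodian: dict) -> dict:
    """Return a copy of custodian in which every nested appellation and
    identifier carries an identifies_custodian value pointing at the hub.

    Hypothetical helper: raises ValueError if an existing back-link points
    at a different hub than the record it is nested in.
    """
    hub_id = custodian["hc_id"]
    result = deepcopy(custodian)  # leave the caller's record untouched
    for slot in ("appellations", "identifiers"):
        for item in result.get(slot, []):
            # setdefault fills a missing back-link and returns the value in place.
            existing = item.setdefault("identifies_custodian", hub_id)
            if existing != hub_id:
                raise ValueError(
                    f"{slot} entry points at {existing!r}, expected {hub_id!r}"
                )
    return result
```

Returning a copy rather than mutating the input keeps the original record available for comparison, which seems useful while the example instances are still being drafted.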

CIDOC-CRM Ontology Alignment

E41_Appellation (CustodianAppellation)

CIDOC-CRM Definition (paraphrased):

This class comprises any identifier expressed as text (names, titles, labels).

Properties Used:

  • P1_is_identified_by (E1 CRM Entity → E41 Appellation)

    • "This property describes the naming or identification of any real-world item by a name or any other identifier."
  • P1i_identifies (E41 Appellation → E1 CRM Entity) - inverse

    • "This property identifies the entity that is named or identified."

Use Cases:

  • Official names (emic names accepted by the custodian)
  • Alternative names and translations
  • Historical name variants
  • Multilingual representations

E42_Identifier (CustodianIdentifier)

CIDOC-CRM Definition (paraphrased):

This class comprises formal symbols or reference codes for unique identification.

Properties Used:

  • P48_has_preferred_identifier (E1 CRM Entity → E42 Identifier)

    • "This property records the preferred E42 Identifier that was used to identify an instance of E1 CRM Entity at the time this property was recorded."
  • P48i_is_preferred_identifier_of (E42 Identifier → E1 CRM Entity) - inverse

    • "This property identifies the E1 CRM Entity that this E42 Identifier is the preferred identifier for."

Use Cases:

  • ISIL codes (International Standard Identifier for Libraries and Related Organizations)
  • Wikidata Q-numbers
  • VIAF identifiers (Virtual International Authority File)
  • KvK numbers (Dutch Chamber of Commerce)
  • ROR identifiers (Research Organization Registry)
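
Each of the schemes listed above has a recognisable shape, which suggests a cheap sanity check before a value is stored as a CustodianIdentifier. The sketch below is an assumption-laden illustration, not part of the schema: the patterns are deliberately loose approximations rather than authoritative definitions of each scheme, and the scheme keys "VIAF", "KvK" and "ROR" as well as the function name looks_valid are hypothetical.

```python
import re
from typing import Optional

# Loose, illustrative shape checks only. None of these patterns is taken
# from the schema, and none replaces a lookup against the issuing registry.
SCHEME_PATTERNS = {
    # Country or agency prefix, a hyphen, then a short unit identifier.
    "ISIL": re.compile(r"[A-Za-z0-9]{1,4}-[A-Za-z0-9:/-]{1,11}"),
    # Q followed by a positive integer without leading zeros.
    "Wikidata": re.compile(r"Q[1-9][0-9]*"),
    # Purely numeric cluster identifier.
    "VIAF": re.compile(r"[1-9][0-9]*"),
    # Eight digits.
    "KvK": re.compile(r"[0-9]{8}"),
    # Leading zero, six base32-style characters, two check digits.
    "ROR": re.compile(r"0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}"),
}


def looks_valid(scheme: str, value: str) -> Optional[bool]:
    """Return True/False for a known scheme, or None if the scheme is unknown."""
    pattern = SCHEME_PATTERNS.get(scheme)
    if pattern is None:
        return None
    return pattern.fullmatch(value) is not None
```

Returning None for unknown schemes, instead of False, lets a caller distinguish "not checked" from "checked and rejected".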

Validation

Schema Compilation: PASS

$ gen-owl -f ttl schemas/20251121/linkml/01_custodian_name_modular.yaml
# Successfully compiled with expected warnings about namespace mappings

Warnings (expected and acceptable):

  • Namespace mapping conflicts (heritage, schema, tooi) - resolved by import order
  • Multiple owl types for language slot - acceptable for multilingual support

File Count Summary

Before Refactoring:

  • Total files: 84

After Refactoring:

  • Total files: 86 (+2 new slot files)
  • Breakdown:
    • 17 classes (no change, renamed existing)
    • 6 enums (no change)
    • 61 slots (+2: appellations, identifies_custodian)
    • 1 metadata file
    • 1 main schema

Next Steps

1. Regenerate RDF Formats

cd /Users/kempersc/apps/glam/schemas/20251121/rdf
gen-owl -f ttl ../linkml/01_custodian_name_modular.yaml > 01_custodian_name.owl.ttl
rdfpipe 01_custodian_name.owl.ttl -o nt > 01_custodian_name.nt
rdfpipe 01_custodian_name.owl.ttl -o jsonld > 01_custodian_name.jsonld
# ... repeat for the remaining serialization formats

2. Update UML Diagrams

  • Regenerate Mermaid class diagram with new appellations/identifiers slots
  • Regenerate PlantUML diagram showing bidirectional relationships

3. Create Example Instances

# Example showing bidirectional linking
---
# Custodian hub
- hc_id: https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804
  appellations:
    - appellation_value: "Rijksmuseum"
      appellation_language: "nl"
      appellation_type: OFFICIAL
      identifies_custodian: https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804
  identifiers:
    - identifier_scheme: "ISIL"
      identifier_value: "NL-AmRMA"
      identifies_custodian: https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804
    - identifier_scheme: "Wikidata"
      identifier_value: "Q190804"
      identifies_custodian: https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804
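
To make the bidirectional pattern in such an instance concrete, the following sketch expands a record shaped like the example above into forward and inverse statements. It is an illustration under stated assumptions rather than the output of any LinkML generator: the function name to_triples and the blank-node labels are hypothetical, and the properties are kept as prefixed strings instead of full IRIs.

```python
# Forward and inverse CIDOC-CRM properties per slot, as used in this document.
SLOT_PROPERTIES = {
    "appellations": ("crm:P1_is_identified_by", "crm:P1i_identifies"),
    "identifiers": (
        "crm:P48_has_preferred_identifier",
        "crm:P48i_is_preferred_identifier_of",
    ),
}


def to_triples(custodian: dict) -> list:
    """Expand one custodian record into (subject, predicate, object) tuples.

    Hypothetical helper: nested items get illustrative blank-node labels, and
    the inverse statement is emitted only when the item carries a back-link
    to this hub, since identifies_custodian is optional.
    """
    hub = custodian["hc_id"]
    triples = []
    for slot, (forward, inverse) in SLOT_PROPERTIES.items():
        for index, item in enumerate(custodian.get(slot, [])):
            node = f"_:{slot}{index}"  # illustrative blank-node label
            triples.append((hub, forward, node))
            if item.get("identifies_custodian") == hub:
                triples.append((node, inverse, hub))
    return triples
```

For a record with one appellation and two identifiers that all carry the back-link, this yields three forward and three inverse statements, one pair per nested item.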

4. Update Documentation

  • Update SCHEMA_ARCHITECTURE.md with bidirectional linking patterns
  • Document CIDOC-CRM property usage in ONTOLOGY_ALIGNMENT.md
  • Add examples to USAGE_GUIDE.md

References

CIDOC-CRM:

Local Files:

  • Ontology: /data/ontology/CIDOC_CRM_v7.1.3.rdf
  • Schema: /schemas/20251121/linkml/01_custodian_name_modular.yaml
  • RDF Output: /schemas/20251121/rdf/ (to be regenerated)

Session Context

This refactoring is part of the broader Legal Entity Refactoring work (2025-11-22), which:

  1. Added comprehensive legal entity model (8 new classes)
  2. Generated RDF serializations (7 formats, 2,701 triples)
  3. Created UML diagrams (Mermaid + PlantUML)
  4. Connected orphaned Appellation/Identifier classes to Custodian hub (THIS DOCUMENT)

See Also:

  • LEGAL_ENTITY_REFACTORING.md - Complete legal entity model documentation
  • RDF_UML_GENERATION_COMPLETE_20251122.md - RDF generation guide
  • SESSION_SUMMARY_20251122_APPELLATION_IDENTIFIER_REFACTORING.md - Session log

Status: COMPLETE
Validated: Schema compiles successfully
Next Actions: Regenerate RDF, update UML, create example instances