# Heritage Custodian Ontology Integration Design

**Version**: 0.2.0
**Date**: 2025-11-05
**Status**: DRAFT - Awaiting subagent analysis

## Executive Summary

This document outlines the design for integrating multiple established ontologies into the Heritage Custodian LinkML schema:

- **TOOI** (Dutch government organizational ontology)
- **CPOV** (Core Public Organization Vocabulary - EU)
- **Schema.org** (web semantics)
- **EDM** (Europeana Data Model)
- **PROV-O** (W3C Provenance Ontology)

## Key Ontology Patterns Identified

### 1. TOOI Temporal Model

**Source**: `data/ontology/tooiont.ttl`

TOOI tracks organizational change over time with PROV-O timestamps and typed change events:

#### Core Classes

```turtle
tooi:Overheidsorganisatie
    rdfs:subClassOf org:FormalOrganization ,
                    prov:Agent ,
                    prov:Entity ,
                    prov:Organization .
```

**Key Properties**:
- `prov:generatedAtTime` - when the organization was founded
- `prov:invalidatedAtTime` - when the organization ceased to exist
- `tooi:begindatum` - first calendar day the organization existed (date component of `prov:generatedAtTime`)
- `tooi:einddatum` - last calendar day the organization existed (date component of `prov:invalidatedAtTime`)
- `tooi:afkorting` - abbreviation
- `tooi:alternatieveNaam` - alternative names (multivalued)
- `dcterms:isPartOf` - parent organization (recursive)
- `tooi:organisatiecode` - unique organization code

#### Change Event Model

```turtle
# tooi:Wijzigingsgebeurtenis (ChangeEvent)
tooi:Wijzigingsgebeurtenis rdfs:subClassOf prov:Activity .
# Properties:
#   tooi:tijdstipWijziging         dateTime when change formally occurred
#   tooi:tijdstipRegistratie       dateTime when change was registered
#   tooi:redenWijziging            reason for change
#   tooi:heeftJuridischeGrondslag  legal basis for change

# tooi:ExistentieleWijziging (ExistentialChange)
tooi:ExistentieleWijziging rdfs:subClassOf tooi:Wijzigingsgebeurtenis .
# Subtypes:
tooi:Afsplitsing rdfs:subClassOf tooi:ExistentieleWijziging .  # Split/spinoff
tooi:Fusie       rdfs:subClassOf tooi:ExistentieleWijziging .  # Merger
tooi:Opheffing   rdfs:subClassOf tooi:ExistentieleWijziging .  # Dissolution
```

**Design Implication**: Our current `ghcid_history` is a simple list of history entries. We should consider adding a `ChangeEvent` class that follows TOOI's pattern of modeling each change as a PROV-O activity.

### 2. CPOV Public Organization Model

**Source**: `data/ontology/core-public-organisation-ap.ttl`

CPOV models public sector organizations for interoperability across the EU:

#### Core Classes

```turtle
# Represents government/public heritage organizations
cpov:PublicOrganisation rdfs:subClassOf org:Organization .

# cpov:ContactPoint properties:
#   cpov:email
#   cpov:telephone
#   cpov:contactPage (foaf:Document)
```

**Design Implication**: Our `ContactInfo` class aligns well with `cpov:ContactPoint`. We should add a `class_uri: cpov:ContactPoint` mapping.

### 3. PROV-O Provenance Model

Both TOOI and CPOV heavily use PROV-O for temporal and provenance tracking:

- `prov:Entity` - things with provenance
- `prov:Activity` - activities that affect entities
- `prov:Agent` - agents responsible for activities
- `prov:generatedAtTime` - when entity was created
- `prov:invalidatedAtTime` - when entity ceased to be valid
- `prov:wasGeneratedBy` - links entity to creating activity
- `prov:wasInvalidatedBy` - links entity to ending activity

**Design Implication**: We should make our `HeritageCustodian` a subclass of `prov:Entity` and use PROV-O properties for temporal tracking.

## Proposed Schema Extensions

### New Classes to Add

#### 1. ChangeEvent

Models organizational changes over time (inspired by TOOI):

```yaml
ChangeEvent:
  description: >-
    An event that changed the state of a heritage custodian organization
    (e.g., founding, closure, relocation, name change, merger, split).
    Based on tooi:Wijzigingsgebeurtenis pattern.
  class_uri: prov:Activity
  # NOTE: LinkML resolves mixins against classes defined in the schema, so this
  # external term needs a local mixin class or a *_mappings entry.
  mixins:
    - tooi:Wijzigingsgebeurtenis
  slots:
    - change_type            # founding, closure, relocation, rename, merger, split
    - effective_date         # when change formally occurred (tooi:tijdstipWijziging)
    - registration_date      # when change was recorded (tooi:tijdstipRegistratie)
    - reason                 # reason for change (tooi:redenWijziging)
    - legal_basis            # legal document/regulation (tooi:heeftJuridischeGrondslag)
    - affected_organization  # link to HeritageCustodian
    - resulting_ghcid        # new GHCID after this change
    - previous_ghcid         # GHCID before this change
```

#### 2. OrganizationalUnit

For departments/branches of larger institutions:

```yaml
OrganizationalUnit:
  description: >-
    A unit, department, or branch within a larger heritage custodian organization.
  class_uri: org:OrganizationalUnit
  is_a: HeritageCustodian
  slots:
    - unit_type    # department, branch, division, section
    - parent_unit  # recursive
```

### Properties to Enhance

#### Temporal Properties

Add PROV-O temporal tracking to `HeritageCustodian`:

```yaml
# In HeritageCustodian class
slots:
  - prov_generated_at    # maps to prov:generatedAtTime
  - prov_invalidated_at  # maps to prov:invalidatedAtTime
  - change_history       # list of ChangeEvent instances

# Slot definitions
prov_generated_at:
  description: Timestamp when organization was formally founded
  range: datetime
  slot_uri: prov:generatedAtTime

prov_invalidated_at:
  description: Timestamp when organization ceased to exist
  range: datetime
  slot_uri: prov:invalidatedAtTime

change_history:
  description: Historical record of changes to this organization
  range: ChangeEvent
  multivalued: true
  inlined: true
  inlined_as_list: true
```

#### Name Properties (TOOI-inspired)

```yaml
official_name:
  description: Official legal name of the organization
  range: string
  slot_uri: tooi:officieleNaamInclSoort

sorting_name:
  description: Name formatted for alphabetical sorting
  range: string
  slot_uri: tooi:officieleNaamSorteer

abbreviation:
  description: Official abbreviation or acronym
  range: string
  slot_uri: tooi:afkorting
```

### Ontology Mappings to Update

#### HeritageCustodian

```yaml
HeritageCustodian:
  class_uri: org:Organization
  # NOTE: LinkML resolves mixins against classes defined in the schema, so each
  # external term below needs a local mixin class or a *_mappings entry.
  mixins:
    - prov:Entity                # Add PROV-O provenance tracking
    - tooi:Overheidsorganisatie  # For Dutch institutions
    - cpov:PublicOrganisation    # For government institutions
    - schema:Organization        # For Schema.org compatibility
```

#### ContactInfo

```yaml
ContactInfo:
  class_uri: cpov:ContactPoint
  exact_mappings:
    - schema:ContactPoint
```

### Enumerations to Add

#### ChangeTypeEnum

```yaml
ChangeTypeEnum:
  description: Types of organizational changes
  permissible_values:
    FOUNDING:
      description: Organization was founded
      meaning: tooi:Oprichting
    CLOSURE:
      description: Organization ceased operations
      meaning: tooi:Opheffing
    MERGER:
      description: Organization merged with another
      meaning: tooi:Fusie
    SPLIT:
      description: Organization split into multiple entities
      meaning: tooi:Afsplitsing
    RELOCATION:
      description: Organization moved to new location
    NAME_CHANGE:
      description: Organization changed its name
    TYPE_CHANGE:
      description: Institution type changed
    STATUS_CHANGE:
      description: Operational status changed
```

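
To show how application code might consume this enumeration, the following is a minimal Python sketch, not part of the schema. The names `ChangeType`, `TOOI_MEANING`, and `meaning_uri` are hypothetical, and the namespace is assumed to be the `tooi` prefix used in this document. Only the four values that carry a `meaning` are mapped.

```python
from enum import Enum
from typing import Optional

# Assumed namespace: the `tooi` prefix as used in this document.
TOOI = "https://identifier.overheid.nl/tooi/def/ont/"


class ChangeType(Enum):
    """Hypothetical Python mirror of ChangeTypeEnum."""
    FOUNDING = "FOUNDING"
    CLOSURE = "CLOSURE"
    MERGER = "MERGER"
    SPLIT = "SPLIT"
    RELOCATION = "RELOCATION"
    NAME_CHANGE = "NAME_CHANGE"
    TYPE_CHANGE = "TYPE_CHANGE"
    STATUS_CHANGE = "STATUS_CHANGE"


# Local names copied from the `meaning` entries of the enum definition.
# Values without a `meaning` are intentionally absent.
TOOI_MEANING = {
    ChangeType.FOUNDING: "Oprichting",
    ChangeType.CLOSURE: "Opheffing",
    ChangeType.MERGER: "Fusie",
    ChangeType.SPLIT: "Afsplitsing",
}


def meaning_uri(change_type: ChangeType) -> Optional[str]:
    """Return the full TOOI URI for a change type, or None if it has no mapping."""
    local_name = TOOI_MEANING.get(change_type)
    return TOOI + local_name if local_name else None
```

Returning `None` for unmapped values keeps the gap visible: `RELOCATION`, `NAME_CHANGE`, `TYPE_CHANGE`, and `STATUS_CHANGE` have no TOOI counterpart in the enum as drafted.
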
## Integration with GHCID System

The GHCID system already tracks identifier changes via `ghcid_history`. We should:

1. **Keep `ghcid_history`** as-is (simple, functional)
2. **Add `change_history`** for richer semantic change tracking
3. **Link the two**: Each `GHCIDHistoryEntry` should reference a `ChangeEvent` if applicable

### Example Mapping

```yaml
# Simple GHCID history (current system)
ghcid_history:
  - ghcid: "NL-NH-AMS-M-RM"
    ghcid_numeric: 12345678901234567890
    valid_from: "2020-01-01T00:00:00Z"
    valid_to: null
    reason: "Initial identifier"
    institution_name: "Rijksmuseum"
    location_city: "Amsterdam"
    location_country: "NL"

# Rich semantic change history (new system)
change_history:
  - change_type: FOUNDING
    effective_date: "1800-11-19T00:00:00Z"
    registration_date: "2020-01-01T00:00:00Z"
    reason: "Founded as national art museum"
    resulting_ghcid: "NL-NH-AMS-M-RM"
    affected_organization: "https://example.org/custodian/12345"
```

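
One way the optional link between the two histories could work is sketched below in Python. This is an illustration under assumptions, not the planned implementation: the `change_event` field on `GHCIDHistoryEntry`, the `link_history` helper, and the rule of matching on `resulting_ghcid` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ChangeEvent:
    """Subset of the proposed ChangeEvent slots needed for linking."""
    change_type: str
    effective_date: str
    resulting_ghcid: Optional[str] = None
    previous_ghcid: Optional[str] = None


@dataclass
class GHCIDHistoryEntry:
    """Subset of a ghcid_history entry plus a hypothetical cross-reference."""
    ghcid: str
    valid_from: str
    reason: str
    change_event: Optional[ChangeEvent] = None  # assumed field, not in the current schema


def link_history(
    history: List[GHCIDHistoryEntry], events: List[ChangeEvent]
) -> List[GHCIDHistoryEntry]:
    """Attach to each entry the first event whose resulting_ghcid equals its ghcid."""
    by_ghcid: Dict[str, ChangeEvent] = {}
    for event in events:
        if event.resulting_ghcid is not None:
            # setdefault keeps the first event seen for a given GHCID.
            by_ghcid.setdefault(event.resulting_ghcid, event)
    for entry in history:
        entry.change_event = by_ghcid.get(entry.ghcid)
    return history


# Usage with the values from the example mapping above.
history = [GHCIDHistoryEntry("NL-NH-AMS-M-RM", "2020-01-01T00:00:00Z", "Initial identifier")]
events = [ChangeEvent("FOUNDING", "1800-11-19T00:00:00Z", resulting_ghcid="NL-NH-AMS-M-RM")]
link_history(history, events)
```

Entries with no matching event keep `change_event` as `None`, which matches the "if applicable" wording: `ghcid_history` stays usable on its own.
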
## EDM Aggregator/Provider Pattern

*(Awaiting subagent analysis of EDM conversations)*

Expected patterns:
- `edm:ProvidedCHO` (Cultural Heritage Object)
- `edm:WebResource` (digital representation)
- `edm:Agent` (provider organization)
- `ore:Aggregation` (metadata aggregation)

## Namespace Strategy

### Recommended Approach

1. **Create our own namespace**: `https://w3id.org/heritage/custodian/`
2. **Reuse existing properties** via `slot_uri` mappings
3. **Define custom properties** only when no suitable property exists

### Prefix Registry

```yaml
prefixes:
  heritage: https://w3id.org/heritage/custodian/
  tooi: https://identifier.overheid.nl/tooi/def/ont/
  cpov: http://data.europa.eu/m8g/
  org: http://www.w3.org/ns/org#
  prov: http://www.w3.org/ns/prov#
  schema: http://schema.org/
  rico: https://www.ica.org/standards/RiC/ontology#
  edm: http://www.europeana.eu/schemas/edm/
  ore: http://www.openarchives.org/ore/terms/
```

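
The registry determines how compact names such as `prov:generatedAtTime` expand to full URIs. The Python sketch below illustrates that expansion using the prefixes listed above; the `expand_curie` helper is hypothetical and stands in for what LinkML tooling does internally.

```python
from typing import Dict

# Prefix map copied from the registry above.
PREFIXES: Dict[str, str] = {
    "heritage": "https://w3id.org/heritage/custodian/",
    "tooi": "https://identifier.overheid.nl/tooi/def/ont/",
    "cpov": "http://data.europa.eu/m8g/",
    "org": "http://www.w3.org/ns/org#",
    "prov": "http://www.w3.org/ns/prov#",
    "schema": "http://schema.org/",
    "rico": "https://www.ica.org/standards/RiC/ontology#",
    "edm": "http://www.europeana.eu/schemas/edm/",
    "ore": "http://www.openarchives.org/ore/terms/",
}


def expand_curie(curie: str, prefixes: Dict[str, str] = PREFIXES) -> str:
    """Expand 'prefix:local' to a full URI; raise KeyError for unknown prefixes."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"Unknown prefix in CURIE: {curie!r}")
    return prefixes[prefix] + local
```

A check like this can flag prefixes that are used in the schema but missing from the registry. For example, `dcterms:isPartOf` appears in the TOOI section, yet `dcterms` is not registered above, so `expand_curie("dcterms:isPartOf")` raises `KeyError`.
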
## Validation Strategy

### SHACL Constraints

TOOI uses SHACL extensively for validation. We should:

1. **Generate SHACL from LinkML** using `gen-shacl`
2. **Define custom constraints** for:
   - GHCID format validation
   - Date consistency (founded_date < closed_date)
   - Identifier uniqueness
   - Geographic coordinate validation

### Example SHACL (conceptual)

```turtle
heritage:HeritageCustodianShape
    a sh:NodeShape ;
    sh:targetClass heritage:HeritageCustodian ;
    sh:property [
        sh:path heritage:ghcid_current ;
        sh:pattern "^[A-Z]{2}-[A-Z0-9]{1,3}-[A-Z]{3}-[A-Z]-[A-Z0-9]{1,10}(-Q[0-9]+)?$" ;
    ] ;
    sh:property [
        sh:path prov:generatedAtTime ;
        sh:maxCount 1 ;
        sh:datatype xsd:dateTime ;
    ] .
```

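
Two of the custom constraints, GHCID format and date consistency, can also be prototyped outside SHACL. The Python sketch below is a stand-in for such a custom validator, not generated SHACL. It reuses the pattern from the conceptual shape; the function name `validate_custodian` and its parameters are hypothetical.

```python
import re
from datetime import date
from typing import List, Optional

# Same pattern as sh:pattern in the conceptual shape above.
GHCID_PATTERN = re.compile(
    r"^[A-Z]{2}-[A-Z0-9]{1,3}-[A-Z]{3}-[A-Z]-[A-Z0-9]{1,10}(-Q[0-9]+)?$"
)


def validate_custodian(
    ghcid: str,
    founded_date: Optional[date] = None,
    closed_date: Optional[date] = None,
) -> List[str]:
    """Return a list of human-readable violations; an empty list means valid."""
    errors: List[str] = []
    # fullmatch rejects strings with a trailing newline, which `$` alone would allow.
    if not GHCID_PATTERN.fullmatch(ghcid):
        errors.append(f"GHCID {ghcid!r} does not match the expected format")
    # Date consistency is only checked when both dates are known.
    if founded_date is not None and closed_date is not None:
        if not founded_date < closed_date:
            errors.append("founded_date must be earlier than closed_date")
    return errors
```

Collecting violations in a list rather than raising on the first one mirrors how a SHACL validation report lists every failing constraint for a node.
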
## Implementation Roadmap

### Phase 1: Core Extensions (Current Priority)
- [ ] Add `ChangeEvent` class
- [ ] Add PROV-O temporal properties to `HeritageCustodian`
- [ ] Add `ChangeTypeEnum`
- [ ] Update `class_uri` and `slot_uri` mappings
- [ ] Create example instances

### Phase 2: TOOI Integration
- [ ] Add `OrganizationalUnit` class
- [ ] Add Dutch-specific TOOI properties
- [ ] Implement TOOI name variants (official, preferred, sorting)
- [ ] Add legal basis tracking

### Phase 3: EDM Integration
- [ ] Add aggregator/provider relationship model
- [ ] Add collection digitization tracking
- [ ] Add EDM-specific metadata

### Phase 4: Validation
- [ ] Generate SHACL constraints from LinkML
- [ ] Implement custom validators
- [ ] Create validation test suite

## Open Questions

1. **Class hierarchy**: Should `HeritageCustodian` use `is_a` or `mixins` for multiple ontology mappings?
   - **Recommendation**: Use `mixins` to avoid diamond inheritance issues

2. **Temporal model**: Should we track both `founded_date` and `prov:generatedAtTime`?
   - **Recommendation**: Keep both: `founded_date` for simple queries, PROV-O for semantic interoperability

3. **Change events**: Link to `ghcid_history` or keep separate?
   - **Recommendation**: Keep separate but allow optional cross-references

4. **Dutch-specific fields**: In base class or subclass?
   - **Current approach**: `DutchHeritageCustodian` subclass ✅

## References

- TOOI Ontology: `data/ontology/tooiont.ttl`
- CPOV Ontology: `data/ontology/core-public-organisation-ap.ttl`
- W3C PROV-O: https://www.w3.org/TR/prov-o/
- W3C Org Ontology: https://www.w3.org/TR/vocab-org/
- LinkML Docs: https://linkml.io/linkml/

---

**Next Steps**:
1. Wait for subagent analysis of ontology conversations
2. Refine design based on subagent findings
3. Implement `heritage_custodian_extended.yaml`
4. Create example instances
5. Validate with LinkML tools