# CustodianCollection Addition - Session Summary

**Date**: 2025-11-22  
**Time**: 18:23 UTC  
**Schema Version**: 0.1.0 → 0.3.0  
**Status**: ✅ COMPLETE - Validated, Generated, Documented


---

## Executive Summary

Added **CustodianCollection** as the fourth reconstruction output of the Heritage Custodian Ontology, completing the multi-aspect modeling of heritage institutions. Collections represent the heritage materials managed by custodians and are crucial for modeling metonymic discourse ("The Rijksmuseum has a Rembrandt" = the collection contains it).


---

## Architecture Evolution

### Before: Three Aspects
```
Custodian (hub)
├─ preferred_label → CustodianName (emic name)
├─ legal_status → CustodianLegalStatus (legal entity)
└─ place_designation → CustodianPlace (nominal place)
```

### After: Four Aspects ✅
```
Custodian (hub)
├─ preferred_label → CustodianName (emic name)
├─ legal_status → CustodianLegalStatus (legal entity)
├─ place_designation → CustodianPlace (nominal place)
└─ has_collection → CustodianCollection (heritage materials) ← NEW!
```

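The hub-and-aspects layout above can be pictured as a small data model. The following is only an illustrative sketch: the Python class layout is an assumption, the three existing aspects are reduced to plain strings, and only the slot names are taken from the diagram.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CustodianCollection:
    """Heritage materials managed by a custodian (the new fourth aspect)."""
    collection_name: str


@dataclass
class Custodian:
    """Hub entity; each field mirrors one slot from the diagram."""
    hc_id: str
    preferred_label: Optional[str] = None    # stands in for CustodianName
    legal_status: Optional[str] = None       # stands in for CustodianLegalStatus
    place_designation: Optional[str] = None  # stands in for CustodianPlace
    # Multivalued: one custodian may hold zero or more collections.
    has_collection: List[CustodianCollection] = field(default_factory=list)


rijksmuseum = Custodian(
    hc_id="https://nde.nl/ontology/hc/cust/rijksmuseum",
    preferred_label="Rijksmuseum",
)
rijksmuseum.has_collection.append(CustodianCollection("Rijksmuseum Collection"))
print(len(rijksmuseum.has_collection))
```

Keeping `has_collection` as a list on the hub reflects the multivalued slot; the other aspects stay optional because a custodian may be known only by some of them.
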

---

## Files Created

### 1. Class Definition
**`modules/classes/CustodianCollection.yaml`** (128 lines)
- `class_uri: crm:E78_Curated_Holding`
- Maps to CIDOC-CRM, RiC-O, BIBFRAME
- Represents aggregations of heritage materials
- Supports multiple collection types (archival, museum, library, etc.)

### 2. Collection-Specific Slots (9 files)

| File | Purpose | Property Mapping |
|------|---------|------------------|
| **`collection_name.yaml`** | Name of collection | `dcterms:title` |
| **`collection_description.yaml`** | Narrative description | `dcterms:description` |
| **`collection_type.yaml`** | Type(s) of materials | `dcterms:type` |
| **`collection_scope.yaml`** | Subject/thematic focus | `dcterms:coverage` |
| **`temporal_coverage.yaml`** | Time period of materials | `dcterms:temporal` |
| **`extent.yaml`** | Size/quantity | `dcterms:extent` |
| **`arrangement_system.yaml`** | Intellectual organization | `rico:hasRecordSetType` |
| **`provenance_note.yaml`** | Acquisition history | `crm:P24_transferred_title_of` |
| **`has_collection.yaml`** | Links Custodian to Collection | `crm:P46_is_composed_of` |


---

## Files Modified

### Custodian Class
**`modules/classes/Custodian.yaml`**

**Changes**:
- Added `has_collection` to slots list (line 99)
- Added `has_collection` slot_usage documentation:
  - `slot_uri: crm:P46_is_composed_of`
  - `range: CustodianCollection`
  - `multivalued: true`
  - Extensive documentation on metonymic relationships
- Updated comments: "Four aspects" (was "Three aspects")

### Main Schema
**`01_custodian_name_modular.yaml`**

**Changes**:
- Added CustodianCollection to class imports (line 133)
- Added 9 new slot imports:
  - `arrangement_system`
  - `collection_description`
  - `collection_name`
  - `collection_scope`
  - `collection_type`
  - `extent`
  - `has_collection`
  - `provenance_note`
  - `temporal_coverage`
- Updated schema description with collection aspect
- Updated file count: 19 classes + 7 enums + 70 slots = 96 definition files


---

## Ontology Alignment

### Primary Ontologies

| Ontology | Class | Use Case |
|----------|-------|----------|
| **CIDOC-CRM** | `crm:E78_Curated_Holding` | Museum collections, curated aggregations |
| **RiC-O** | `rico:RecordSet` | Archival fonds, series, file groups |
| **BIBFRAME** | `bf:Collection` | Library special collections |
| **Schema.org** | `schema:Collection` | General aggregations |

### Key Properties

| Slot | Ontology Property | Description |
|------|-------------------|-------------|
| `collection_name` | `dcterms:title` | Name of collection (may differ from custodian) |
| `collection_description` | `dcterms:description` | Narrative description |
| `collection_type` | `dcterms:type` | Material types (multivalued) |
| `collection_scope` | `dcterms:coverage` | Subject/thematic focus |
| `temporal_coverage` | `dcterms:temporal` | Time period covered by materials |
| `extent` | `dcterms:extent` | Size (linear meters, object counts) |
| `arrangement_system` | `rico:hasRecordSetType` | Intellectual organization |
| `provenance_note` | `crm:P24_transferred_title_of` | Acquisition history |
| `has_collection` | `crm:P46_is_composed_of` | Custodian-to-Collection link |

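To make the slot-to-property mappings in the table concrete, here is a minimal sketch that expands each CURIE into a full IRI. The prefix IRIs are assumptions (the namespaces these vocabularies are commonly published under), not values read from the schema's own prefix declarations, and `expand_curie` is a hypothetical helper.

```python
# Assumed namespace IRIs; check them against the schema's `prefixes` block.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "crm": "http://www.cidoc-crm.org/cidoc-crm/",
    "rico": "https://www.ica.org/standards/RiC/ontology#",
}

# Slot -> property CURIE, copied from the Key Properties table.
SLOT_URIS = {
    "collection_name": "dcterms:title",
    "collection_description": "dcterms:description",
    "collection_type": "dcterms:type",
    "collection_scope": "dcterms:coverage",
    "temporal_coverage": "dcterms:temporal",
    "extent": "dcterms:extent",
    "arrangement_system": "rico:hasRecordSetType",
    "provenance_note": "crm:P24_transferred_title_of",
    "has_collection": "crm:P46_is_composed_of",
}


def expand_curie(curie: str) -> str:
    """Expand 'prefix:local' into a full IRI using PREFIXES."""
    prefix, _, local = curie.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return PREFIXES[prefix] + local


for slot, curie in SLOT_URIS.items():
    print(f"{slot:24} {expand_curie(curie)}")
```
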
### Inverse Relationships
```turtle
# Forward (Custodian → Collection)
:custodian crm:P46_is_composed_of :collection .

# Inverse (Collection → Custodian)
:collection crm:P46i_forms_part_of :custodian .
```

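The forward/inverse pairing can be derived mechanically. The sketch below works on plain tuples rather than a real RDF library, and `add_inverses` is a hypothetical helper; only the two property names come from the Turtle above.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Property pair taken from the Turtle example; the table is symmetric.
INVERSE_OF = {
    "crm:P46_is_composed_of": "crm:P46i_forms_part_of",
    "crm:P46i_forms_part_of": "crm:P46_is_composed_of",
}


def add_inverses(triples: List[Triple]) -> List[Triple]:
    """Return the input triples plus any inverse statements not yet present."""
    result = list(triples)
    seen = set(triples)
    for subject, predicate, obj in triples:
        inverse = INVERSE_OF.get(predicate)
        if inverse is None:
            continue  # no known inverse for this property
        candidate = (obj, inverse, subject)
        if candidate not in seen:
            seen.add(candidate)
            result.append(candidate)
    return result


forward = [(":custodian", "crm:P46_is_composed_of", ":collection")]
for triple in add_inverses(forward):
    print(" ".join(triple), ".")
```

Running the function on its own output adds nothing further, so it is safe to apply repeatedly.
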

---

## Collection Types Supported

The `collection_type` slot supports multiple material types:

- **`archival_records`** - Historical documents, correspondence, records (RiC-O)
- **`museum_objects`** - Cultural artifacts, art objects (CIDOC-CRM)
- **`library_holdings`** - Books, serials, manuscripts (BIBFRAME)
- **`monuments`** - Built heritage, archaeological sites (CIDOC-CRM E27_Site)
- **`archaeological_materials`** - Excavation finds, archaeological assemblages
- **`natural_history_specimens`** - Biological specimens, geological samples
- **`digital_born`** - Born-digital collections (web archives, digital art)
- **`photographs`** - Photographic collections
- **`manuscripts`** - Handwritten documents, medieval codices

Collections can have **multiple types** (e.g., mixed archival + museum collections).

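A consumer of instance data may want to reject values outside the list above. The following is a hedged sketch: representing the values as a Python `Enum` is an assumption (the summary does not say the schema defines them as an enum), and `parse_collection_types` is a hypothetical helper.

```python
from enum import Enum
from typing import Iterable, List


class CollectionType(str, Enum):
    """The nine material types listed in this section."""
    ARCHIVAL_RECORDS = "archival_records"
    MUSEUM_OBJECTS = "museum_objects"
    LIBRARY_HOLDINGS = "library_holdings"
    MONUMENTS = "monuments"
    ARCHAEOLOGICAL_MATERIALS = "archaeological_materials"
    NATURAL_HISTORY_SPECIMENS = "natural_history_specimens"
    DIGITAL_BORN = "digital_born"
    PHOTOGRAPHS = "photographs"
    MANUSCRIPTS = "manuscripts"


def parse_collection_types(values: Iterable[str]) -> List[CollectionType]:
    """Convert raw strings to CollectionType, failing on unknown values."""
    parsed = []
    for value in values:
        try:
            parsed.append(CollectionType(value))
        except ValueError:
            raise ValueError(f"unsupported collection_type: {value!r}") from None
    return parsed


# A mixed collection carries several types at once.
mixed = parse_collection_types(["museum_objects", "archival_records", "photographs"])
print([t.value for t in mixed])
```
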

---

## ER Diagram Verification

### Generated Diagram
**File**: `schemas/20251121/uml/mermaid/01_custodian_name_modular_20251122_182317_er.mmd`

### Verified Relationships

✅ **Custodian → CustodianCollection**
```
Custodian ||--}o CustodianCollection : "has_collection"
```
- One custodian can have multiple collections (multivalued)
- Collections are optional (some custodians may have no collection data)

✅ **CustodianCollection → Custodian**
```
CustodianCollection ||--|| Custodian : "refers_to_custodian"
```
- Every collection must refer to exactly one custodian hub

✅ **CustodianCollection → ReconstructionActivity**
```
CustodianCollection ||--|o ReconstructionActivity : "was_generated_by"
```
- Documents scholarly reconstruction process (PiCo pattern)

✅ **CustodianCollection → CustodianObservation**
```
CustodianCollection ||--}| CustodianObservation : "was_derived_from"
```
- Links reconstructed collection to source observations (PROV-O)

✅ **CustodianCollection → TimeSpan**
```
CustodianCollection ||--|o TimeSpan : "temporal_coverage"
```
- Time period covered by materials (NOT collection creation date)

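The cardinality markers in the relationship lines above can be checked mechanically rather than by eye. This is a minimal sketch assuming each relationship sits on one line in the form shown; the regular expression and the `parse_relationship` helper are illustrative, not part of the generator's tooling.

```python
import re
from typing import NamedTuple

# Mermaid ER cardinality markers and their meaning.
CARDINALITY = {
    "||": "exactly one",
    "|o": "zero or one",
    "o|": "zero or one",
    "}o": "zero or more",
    "o{": "zero or more",
    "}|": "one or more",
    "|{": "one or more",
}


class Relationship(NamedTuple):
    source: str
    target: str
    label: str
    target_cardinality: str


_LINE = re.compile(r'^(\w+)\s+(\S{2})--(\S{2})\s+(\w+)\s*:\s*"([^"]*)"$')


def parse_relationship(line: str) -> Relationship:
    """Parse one 'A ||--}o B : "label"' line into its parts."""
    match = _LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a relationship line: {line!r}")
    source, _left, right, target, label = match.groups()
    return Relationship(source, target, label, CARDINALITY[right])


rel = parse_relationship('Custodian ||--}o CustodianCollection : "has_collection"')
print(rel.label, "->", rel.target_cardinality)
```

Read this way, `has_collection` is optional and multivalued, while `refers_to_custodian` requires exactly one hub, matching the bullets under each relationship.
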

---

## RDF Generation Results

### Generated Files (Timestamp: 20251122_182317)

```bash
schemas/20251121/rdf/
├── 01_custodian_name_modular_20251122_182317.owl.ttl (179 KB)
├── 01_custodian_name_modular_20251122_182317.nt (508 KB)
├── 01_custodian_name_modular_20251122_182317.jsonld (425 KB)
└── 01_custodian_name_modular_20251122_182317.rdf (367 KB)
```

### Validation Status
✅ **Schema compiles successfully** (no errors)

**Warnings** (non-critical, expected):
- ⚠️ Multiple owl types for `language` (rdfs:Literal vs owl:Thing) - cosmetic
- ⚠️ Schema namespace override - expected with modular design


---

## Example Use Cases

### Use Case 1: Museum Collection
```yaml
Custodian:
  hc_id: https://nde.nl/ontology/hc/cust/rijksmuseum
  preferred_label:
    emic_name: "Rijksmuseum"
  has_collection:
    - id: https://nde.nl/ontology/hc/collection/rijksmuseum-001
      collection_name: "Rijksmuseum Collection"
      collection_description: "Dutch art and history from 1100-2000"
      collection_type:
        - "museum_objects"
        - "library_holdings"  # Art library
      collection_scope: "Dutch Golden Age painting, Asian art, Delftware, prints"
      temporal_coverage:
        begin_of_the_begin: "1100-01-01T00:00:00Z"
        end_of_the_end: "2000-12-31T23:59:59Z"
      extent: "1 million objects, 35,000 artworks on display"
      arrangement_system: "Classified by medium, period, and geography"
      provenance_note: "Collection established 1800 as national art collection, nationalized 1808"
```

### Use Case 2: Archival Collection
```yaml
Custodian:
  hc_id: https://nde.nl/ontology/hc/cust/noord-hollands-archief
  preferred_label:
    emic_name: "Noord-Hollands Archief"
  has_collection:
    - id: https://nde.nl/ontology/hc/collection/nha-archives-001
      collection_name: "Provincial Archives of Noord-Holland"
      collection_description: "Government records, notarial archives, family papers"
      collection_type:
        - "archival_records"
      collection_scope: "Provincial government, municipalities, families, estates"
      temporal_coverage:
        begin_of_the_begin: "1289-01-01T00:00:00Z"  # Earliest document
        end_of_the_end: "2025-11-22T00:00:00Z"  # Ongoing accessions
      extent: "60 linear kilometers of archival materials"
      arrangement_system: "ISAD(G) hierarchical structure, respect des fonds"
      provenance_note: "Formed 2001 from merger of Gemeentearchief Haarlem (1910) and Rijksarchief in Noord-Holland (1802)"
```

### Use Case 3: Mixed Collection (Museum + Archive)
```yaml
Custodian:
  hc_id: https://nde.nl/ontology/hc/cust/verzetsmuseum
  preferred_label:
    emic_name: "Verzetsmuseum"
  has_collection:
    - id: https://nde.nl/ontology/hc/collection/verzetsmuseum-001
      collection_name: "Dutch Resistance Museum Collection"
      collection_type:
        - "museum_objects"  # Artifacts, uniforms, weapons
        - "archival_records"  # Personal papers, resistance documents
        - "photographs"  # Photo archive
      collection_scope: "Dutch resistance during WWII (1940-1945)"
      temporal_coverage:
        begin_of_the_begin: "1940-05-10T00:00:00Z"  # German invasion
        end_of_the_end: "1945-05-05T00:00:00Z"  # Liberation
      extent: "10,000 objects, 25,000 photographs, 500 linear meters archival materials"
```

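Each use case above carries a `temporal_coverage` block with two timestamps. A simple sanity check, sketched here under the assumption that both keys are always present and use the `...Z` UTC form shown, is that the period does not end before it begins. `check_temporal_coverage` is a hypothetical helper, not part of the schema tooling.

```python
from datetime import datetime


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp that ends in the 'Z' (UTC) designator."""
    # Older Python versions reject a bare 'Z', so rewrite it as an offset.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def check_temporal_coverage(coverage: dict) -> None:
    """Raise ValueError if the coverage period ends before it begins."""
    begin = parse_utc(coverage["begin_of_the_begin"])
    end = parse_utc(coverage["end_of_the_end"])
    if begin > end:
        raise ValueError(f"temporal_coverage ends ({end}) before it begins ({begin})")


# Values copied from Use Case 3.
check_temporal_coverage({
    "begin_of_the_begin": "1940-05-10T00:00:00Z",
    "end_of_the_end": "1945-05-05T00:00:00Z",
})
print("temporal_coverage is consistent")
```
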

---

## Metonymic Relationships Explained

### What is Metonymy?

**Metonymy** = Using one entity to refer to a related entity

In heritage discourse, people commonly say:
- "The Rijksmuseum has a Rembrandt" (= the collection contains it)
- "The British Library digitized its manuscripts" (= the collection was digitized)
- "The National Archives preserves colonial records" (= the records are held in the collection)

In each case the speaker is referring to the **collection**, **NOT** to the legal entity or the building.

### Why This Matters

Before CustodianCollection, the ontology had no way to model:
1. **Collection identity** - Collections have names distinct from custodians
2. **Multiple collections** - One custodian can manage multiple collections
3. **Custody transfers** - Collections move between custodians over time
4. **Joint custody** - Multiple custodians can share collection management
5. **Collection-level provenance** - Acquisition history, custody changes

### Modeling Strategy

```
Person says: "The Rijksmuseum has a Rembrandt"
    ↓
Observation: CustodianObservation (observed statement)
    ↓
Reconstruction: Parse as metonymic reference
    ↓
    ├─ Custodian: Rijksmuseum (legal entity)
    └─ CustodianCollection: Rijksmuseum Collection (contains Rembrandt)
```


---

## Key Design Decisions

### Decision 1: Fourth Aspect vs. Custodian Slot

**Why a separate class instead of a `Custodian.collections` slot?**

✅ **Separate class (chosen)**:
- Collections have an independent lifecycle (can be transferred, split, merged)
- Collections need extensive metadata (9 specialized slots)
- Collections are reconstructed outputs (require ReconstructionActivity link)
- Collections can have temporal validity independent of the custodian

❌ **Simple slot**:
- Would couple collection lifecycle to custodian
- Harder to model custody transfers
- Cannot link to observations/reconstructions separately

### Decision 2: CIDOC-CRM E78 vs. RiC-O RecordSet

**Why multiple ontology mappings?**

Different heritage domains use different ontologies:
- **Museums**: CIDOC-CRM E78_Curated_Holding (managed aggregations)
- **Archives**: RiC-O RecordSet (archival fonds, series)
- **Libraries**: BIBFRAME Collection (special collections)

**Solution**: Use `collection_type` to determine which ontology applies:
- `archival_records` → `rico:RecordSet`
- `museum_objects` → `crm:E78_Curated_Holding`
- `library_holdings` → `bf:Collection`

Collections can implement **multiple ontology classes** simultaneously.

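The type-to-class rule above can be sketched as a lookup. The three mappings are taken from the list; falling back to `crm:E78_Curated_Holding` for types without an explicit mapping is an assumption made here (it is the `class_uri` of the class), and `ontology_classes` is a hypothetical helper.

```python
from typing import Iterable, List

# Mappings stated under Decision 2.
TYPE_TO_CLASS = {
    "archival_records": "rico:RecordSet",
    "museum_objects": "crm:E78_Curated_Holding",
    "library_holdings": "bf:Collection",
}

# Assumed fallback for types with no explicit mapping.
DEFAULT_CLASS = "crm:E78_Curated_Holding"


def ontology_classes(collection_types: Iterable[str]) -> List[str]:
    """Return the distinct ontology classes implied by the given types."""
    classes: List[str] = []
    for collection_type in collection_types:
        cls = TYPE_TO_CLASS.get(collection_type, DEFAULT_CLASS)
        if cls not in classes:  # keep first-seen order, drop duplicates
            classes.append(cls)
    return classes


# A mixed museum + archive collection implements two classes at once.
print(ontology_classes(["museum_objects", "archival_records", "photographs"]))
```
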
### Decision 3: temporal_coverage vs. Dates

**Why TimeSpan for temporal_coverage?**

`temporal_coverage` = **Time period covered by collection materials** (NOT collection creation dates)

Examples:
- Rijksmuseum collection: 1100-2000 (artworks span 9 centuries)
- Medieval manuscripts collection: 800-1500 (manuscripts created in Middle Ages)
- WWII archive: 1940-1945 (documents from war period)

**CustodianCollection creation dates** tracked separately via `valid_from`/`valid_to` slots.


---

## File Count Summary

### Before CustodianCollection
- 18 classes + 7 enums + 61 slots = 86 files
- Grand total: 88 files (including metadata.yaml + main schema)

### After CustodianCollection
- **19 classes** (+1: CustodianCollection)
- **7 enums** (unchanged)
- **70 slots** (+9: collection slots + linkers)
- **= 96 definition files**
- **Grand total: 98 files** (including metadata.yaml + main schema)

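The totals above can be reproduced with a few lines of arithmetic. This is only a consistency check on the numbers quoted in this summary; `file_totals` is a hypothetical helper, and the two extra files are `metadata.yaml` and the main schema.

```python
from typing import Tuple


def file_totals(classes: int, enums: int, slots: int, extra_files: int = 2) -> Tuple[int, int]:
    """Return (definition files, grand total including the extra files)."""
    definitions = classes + enums + slots
    return definitions, definitions + extra_files


before = file_totals(classes=18, enums=7, slots=61)
after = file_totals(classes=19, enums=7, slots=70)
print(before, after)              # (86, 88) (96, 98)
print(after[0] - before[0])       # 10 new definition files: 1 class + 9 slots
```
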

---

## Testing & Validation

### Schema Validation ✅
```bash
$ cd schemas/20251121/linkml
$ gen-owl -f ttl 01_custodian_name_modular.yaml 2>&1 | head -20

# Result: SUCCESS
# - Output: 179 KB Turtle file
# - No schema errors
# - Expected warnings only (language type ambiguity)
```

### ER Diagram Generation ✅
```bash
$ gen-erdiagram 01_custodian_name_modular.yaml > \
    ../uml/mermaid/01_custodian_name_modular_20251122_182317_er.mmd

# Result: SUCCESS
# - 5.9 KB Mermaid ER diagram
# - All CustodianCollection relationships present
# - Verified cardinalities correct
```

### RDF Format Generation ✅
```bash
# All 4 RDF formats generated successfully
$ ls -lh schemas/20251121/rdf/*20251122_182317*
-rw-r--r-- 179K 01_custodian_name_modular_20251122_182317.owl.ttl
-rw-r--r-- 508K 01_custodian_name_modular_20251122_182317.nt
-rw-r--r-- 425K 01_custodian_name_modular_20251122_182317.jsonld
-rw-r--r-- 367K 01_custodian_name_modular_20251122_182317.rdf
```


---

## Session Context

### Phase 1 (Nov 22, 10:00-12:00 UTC)
**Connected Orphaned Classes to Custodian**
- Problem: CustodianAppellation and CustodianIdentifier had no path to Custodian hub
- Solution: Added `variant_of_name` and `identifies_custodian` slots
- Result: All classes reachable from Custodian hub

### Phase 2 (Nov 22, 14:00-16:00 UTC)
**Appellation Refactoring for SKOS Alignment**
- Problem: CustodianAppellation directly on Custodian violated SKOS semantics
- Solution: Moved alternative names to CustodianName (SKOS Concept)
- Result: `skos:prefLabel` (CustodianName) + `skos:altLabel` (CustodianAppellation)

### Phase 3 (Nov 22, 18:00-18:30 UTC) ← **THIS SESSION**
**Added CustodianCollection as Fourth Aspect**
- Problem: No way to model heritage materials or metonymic references
- Solution: Created CustodianCollection with 9 specialized slots
- Result: Complete four-aspect modeling (Name, LegalStatus, Place, Collection)


---

## Next Steps (Pending)

### Documentation
- [ ] Update `README.md` with four-aspect architecture
- [ ] Create `COLLECTION_EXAMPLES.md` with real-world examples
- [ ] Update ontology alignment documentation

### Testing
- [ ] Create test instances with CustodianCollection
  - Rijksmuseum (museum collection)
  - Noord-Hollands Archief (archival collection)
  - Koninklijke Bibliotheek (library holdings)
- [ ] Unit tests for collection aspect
- [ ] Validation tests for temporal_coverage TimeSpan

### Features
- [ ] Collection-level provenance events (custody transfers, acquisitions)
- [ ] Collection splits/mergers (track fonds reorganization)
- [ ] Digital surrogates (link physical collections to digitized versions)


---

## References

### Schema Files
- **Main schema**: `schemas/20251121/linkml/01_custodian_name_modular.yaml`
- **CustodianCollection class**: `schemas/20251121/linkml/modules/classes/CustodianCollection.yaml`
- **Collection slots**: `schemas/20251121/linkml/modules/slots/collection_*.yaml`

### Generated Outputs
- **RDF (Turtle)**: `schemas/20251121/rdf/01_custodian_name_modular_20251122_182317.owl.ttl`
- **ER Diagram**: `schemas/20251121/uml/mermaid/01_custodian_name_modular_20251122_182317_er.mmd`

### Ontology Documentation
- **CIDOC-CRM**: `data/ontology/CIDOC_CRM_v7.1.3.rdf` (E78_Curated_Holding)
- **RiC-O**: `data/ontology/RiC-O_1-1.rdf` (RecordSet)
- **BIBFRAME**: `data/ontology/bibframe_vocabulary.rdf` (Collection)


---

## Session Metadata

| Attribute | Value |
|-----------|-------|
| **Session Date** | 2025-11-22 |
| **Session Time** | 18:00-18:30 UTC (30 minutes) |
| **Agent** | Claude (OpenCode) |
| **User** | kempersc |
| **Schema Version Before** | 0.1.0 (18 classes, 61 slots) |
| **Schema Version After** | 0.3.0 (19 classes, 70 slots) |
| **Files Created** | 10 (1 class + 9 slots) |
| **Files Modified** | 2 (Custodian.yaml, main schema) |
| **Validation Status** | ✅ PASS (gen-owl, gen-erdiagram) |
| **RDF Formats Generated** | 4 (Turtle, N-Triples, JSON-LD, RDF/XML) |
| **Diagram Generated** | ER diagram (Mermaid) |
| **Documentation Created** | This file |


---

## Conclusion

The Heritage Custodian Ontology now models heritage institutions as **four-aspect entities**:

1. **CustodianName** (emic label) - SKOS Concept
2. **CustodianLegalStatus** (legal entity) - W3C ORG, TOOI, CPOV
3. **CustodianPlace** (nominal location) - CIDOC-CRM E53_Place
4. **CustodianCollection** (heritage materials) - CIDOC-CRM E78, RiC-O RecordSet, BIBFRAME Collection ← **NEW!**

Each aspect:
- Has independent temporal lifecycle
- Is reconstructed from CustodianObservation sources
- Links back to Custodian hub via `refers_to_custodian`
- Maps to established ontologies (CIDOC-CRM, RiC-O, BIBFRAME, SKOS, W3C ORG)

**Status**: ✅ **COMPLETE - Ready for instance creation and testing**


---

**Document Version**: 1.0  
**Generated**: 2025-11-22T18:30:00Z  
**Author**: AI Agent (Claude via OpenCode)