# Validation Rules for Heritage Custodian Ontology (v0.7.0)

**Date**: 2025-11-22
**Schema Version**: v0.7.0 (Phase 5: Validation Framework)
**Validator**: `scripts/validate_temporal_consistency.py`
**Test Suite**: `tests/test_temporal_validation.py`

---

## Overview

This document defines the validation rules for temporal consistency and bidirectional relationships in the Heritage Custodian Ontology. These rules ensure data quality across organizational structures, collections, and staff relationships.

**Validation Categories**:

1. **Collection-Unit Temporal Consistency** (Phase 4: Collection-Department Integration)
2. **Collection-Unit Bidirectional Relationships** (Phase 4)
3. **Custody Transfer Continuity** (Phase 4: Organizational changes)
4. **Staff-Unit Temporal Consistency** (Phase 3: Staff Role Tracking)
5. **Staff-Unit Bidirectional Relationships** (Phase 3)

---

## Rule 1: Collection-Unit Temporal Consistency

### Description

Collection custody dates must fit within the managing organizational unit's validity period. A collection cannot be managed by a unit that doesn't exist.

### Rule ID

`COLLECTION_UNIT_TEMPORAL`

### Constraints

**Constraint 1.1**: Collection custody start date must be **on or after** unit founding date

```
CustodianCollection.valid_from >= OrganizationalStructure.valid_from
```

**Constraint 1.2**: Collection custody end date must be **on or before** unit dissolution date (if unit dissolved)

```
IF OrganizationalStructure.valid_to IS NOT NULL
THEN CustodianCollection.valid_to <= OrganizationalStructure.valid_to
```

**Warning Condition**: Collection custody ongoing but unit dissolved

```
IF OrganizationalStructure.valid_to IS NOT NULL
   AND CustodianCollection.valid_to IS NULL
THEN WARN (missing custody transfer)
```

---

### Examples

#### Valid Example 1: Collection within unit lifetime

```yaml
---
id: "https://example.org/unit/dept-1"
unit_name: "Paintings Department"
unit_type: DEPARTMENT
valid_from: "1985-01-01"  # Unit founded
valid_to: null            # Still active
managed_collections:
  - "https://example.org/collection/dutch-paintings"

---
id: "https://example.org/collection/dutch-paintings"
collection_name: "Dutch Paintings Collection"
managing_unit: "https://example.org/unit/dept-1"
valid_from: "1995-01-01"  # ✅ After unit founding (1985)
valid_to: null            # Ongoing
```

**Result**: ✅ **PASS** - Collection starts after unit founded

---

#### Invalid Example 1: Collection before unit exists

```yaml
---
id: "https://example.org/unit/dept-1"
unit_name: "Special Collections Division"
unit_type: DIVISION
valid_from: "1982-01-01"  # Unit founded in 1982
managed_collections:
  - "https://example.org/collection/medieval-manuscripts"

---
id: "https://example.org/collection/medieval-manuscripts"
collection_name: "Medieval Manuscripts"
managing_unit: "https://example.org/unit/dept-1"
valid_from: "1798-01-01"  # ❌ Before unit exists (1982)!
```

**Error**:

```
[ERROR] COLLECTION_UNIT_TEMPORAL: Collection custody starts (1798-01-01) before managing unit exists (1982-01-01). Managing unit: Special Collections Division
```

**Fix**: Adjust collection `valid_from` to match or postdate unit founding:

```yaml
valid_from: "1982-01-01"  # ✅ Custody starts when unit founded
```

---

#### Invalid Example 2: Collection extends beyond unit dissolution

```yaml
---
id: "https://example.org/unit/dept-old"
unit_name: "Old Department"
unit_type: DEPARTMENT
valid_from: "1950-01-01"
valid_to: "2013-02-28"  # Unit dissolved in 2013
managed_collections:
  - "https://example.org/collection/coll-1"

---
id: "https://example.org/collection/coll-1"
collection_name: "Test Collection"
managing_unit: "https://example.org/unit/dept-old"
valid_from: "1960-01-01"
valid_to: "2020-12-31"  # ❌ Extends beyond unit dissolution (2013-02-28)!
```

**Error**:

```
[ERROR] COLLECTION_UNIT_TEMPORAL: Collection custody extends (2020-12-31) beyond managing unit validity (2013-02-28). Managing unit: Old Department
```

**Fix**: End collection custody when unit dissolves, create new version for new unit:

```yaml
---
id: "https://example.org/collection/coll-1-v1"
collection_name: "Test Collection"
managing_unit: "https://example.org/unit/dept-old"
valid_from: "1960-01-01"
valid_to: "2013-02-28"  # ✅ Custody ends with unit dissolution

---
id: "https://example.org/collection/coll-1-v2"
collection_name: "Test Collection"
managing_unit: "https://example.org/unit/dept-new"
valid_from: "2013-03-01"  # ✅ Custody transferred to new unit
valid_to: "2020-12-31"
```

---

#### Warning Example: Collection ongoing after unit dissolved

```yaml
---
id: "https://example.org/unit/dept-dissolved"
unit_name: "Dissolved Department"
unit_type: DEPARTMENT
valid_from: "1950-01-01"
valid_to: "2013-02-28"  # Unit dissolved
managed_collections:
  - "https://example.org/collection/coll-1"

---
id: "https://example.org/collection/coll-1"
collection_name: "Test Collection"
managing_unit: "https://example.org/unit/dept-dissolved"
valid_from: "1960-01-01"
valid_to: null  # ⚠️ Custody ongoing but unit dissolved!
```

**Warning**:

```
[WARNING] COLLECTION_UNIT_TEMPORAL: Collection custody ongoing but managing unit dissolved (2013-02-28). Missing custody transfer? Managing unit: Dissolved Department
```

**Recommended Fix**: Transfer custody to successor unit:

```yaml
---
id: "https://example.org/collection/coll-1-v1"
collection_name: "Test Collection"
managing_unit: "https://example.org/unit/dept-dissolved"
valid_from: "1960-01-01"
valid_to: "2013-02-28"  # End custody with unit dissolution
provenance_note: "Custody transferred to New Department during 2013 reorganization"

---
id: "https://example.org/collection/coll-1-v2"
collection_name: "Test Collection"
managing_unit: "https://example.org/unit/dept-new"
valid_from: "2013-03-01"  # Transfer custody to new unit
valid_to: null
provenance_note: "Custody assumed from Dissolved Department"
```

---

## Rule 2: Collection-Unit Bidirectional Consistency

### Description

Bidirectional collection-unit relationships must be consistent. If a collection references a unit, the unit must list the collection, and vice versa.

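The two-way requirement can be sketched in Python as follows. This is a minimal illustration, assuming units and collections have been loaded into plain dictionaries keyed by `id`; the function name `check_collection_unit_links` and the message wording are hypothetical and are not taken from `scripts/validate_temporal_consistency.py`. The check for a collection pointing at a missing unit is an added guard, not part of the rule text.

```python
from typing import Dict, List


def check_collection_unit_links(units: Dict[str, dict],
                                collections: Dict[str, dict]) -> List[str]:
    """Return error messages for inconsistent collection-unit links.

    Hypothetical helper: both arguments map an entity id to its record.
    """
    errors = []

    # Forward direction: every collection's managing_unit must list it.
    for coll_id, coll in collections.items():
        unit_id = coll.get("managing_unit")
        if unit_id is None:
            continue
        unit = units.get(unit_id)
        if unit is None:
            # Added guard (assumption): the referenced unit is not loaded.
            errors.append(f"{coll_id}: managing_unit {unit_id} does not exist")
        elif coll_id not in (unit.get("managed_collections") or []):
            errors.append(f"{coll_id}: not listed in managed_collections of {unit_id}")

    # Reverse direction: every listed collection must exist and point back.
    for unit_id, unit in units.items():
        for coll_id in unit.get("managed_collections") or []:
            coll = collections.get(coll_id)
            if coll is None:
                errors.append(f"{unit_id}: lists non-existent collection {coll_id}")
            elif coll.get("managing_unit") != unit_id:
                errors.append(f"{unit_id}: lists {coll_id}, which names another managing_unit")

    return errors


units = {"https://example.org/unit/dept-1": {"managed_collections": []}}
collections = {
    "https://example.org/collection/dutch-paintings": {
        "managing_unit": "https://example.org/unit/dept-1"
    }
}
# One finding: the unit does not list the collection that references it.
print(check_collection_unit_links(units, collections))
```

Running both directions in one pass means a single broken link is reported from whichever side detects it, which is usually enough to locate the record that needs editing.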
### Rule ID

`COLLECTION_UNIT_BIDIRECTIONAL`

### Constraints

**Constraint 2.1**: Forward consistency (collection → unit)

```
IF CustodianCollection.managing_unit = unit_id
THEN OrganizationalStructure[unit_id].managed_collections MUST INCLUDE collection_id
```

**Constraint 2.2**: Reverse consistency (unit → collection)

```
IF OrganizationalStructure.managed_collections INCLUDES collection_id
THEN CustodianCollection[collection_id].managing_unit MUST EQUAL unit_id
```

---

### Examples

#### Valid Example: Bidirectional relationship

```yaml
---
id: "https://example.org/unit/dept-1"
unit_name: "Paintings Department"
managed_collections:
  - "https://example.org/collection/dutch-paintings"  # ✅ Lists collection

---
id: "https://example.org/collection/dutch-paintings"
collection_name: "Dutch Paintings Collection"
managing_unit: "https://example.org/unit/dept-1"  # ✅ References unit
```

**Result**: ✅ **PASS** - Bidirectional relationship consistent

---

#### Invalid Example 1: Collection missing from unit.managed_collections

```yaml
---
id: "https://example.org/unit/dept-1"
unit_name: "Paintings Department"
managed_collections: []  # ❌ Empty list, doesn't include collection

---
id: "https://example.org/collection/dutch-paintings"
collection_name: "Dutch Paintings Collection"
managing_unit: "https://example.org/unit/dept-1"  # Collection references unit
```

**Error**:

```
[ERROR] COLLECTION_UNIT_BIDIRECTIONAL: Collection references unit 'Paintings Department' as managing_unit, but unit does not list collection in managed_collections. Add collection to unit.managed_collections.
```

**Fix**: Add collection to unit's `managed_collections`:

```yaml
---
id: "https://example.org/unit/dept-1"
unit_name: "Paintings Department"
managed_collections:
  - "https://example.org/collection/dutch-paintings"  # ✅ Added
```

---

#### Invalid Example 2: Unit references non-existent collection

```yaml
---
id: "https://example.org/unit/dept-1"
unit_name: "Paintings Department"
managed_collections:
  - "https://example.org/collection/nonexistent"  # ❌ Collection doesn't exist
```

**Error**:

```
[ERROR] COLLECTION_UNIT_BIDIRECTIONAL: Unit references non-existent collection: https://example.org/collection/nonexistent. Remove from unit.managed_collections or create collection.
```

**Fix**: Either create the collection or remove the reference:

```yaml
# Option 1: Create collection
---
id: "https://example.org/collection/nonexistent"
collection_name: "New Collection"
managing_unit: "https://example.org/unit/dept-1"

# Option 2: Remove reference
---
id: "https://example.org/unit/dept-1"
unit_name: "Paintings Department"
managed_collections: []  # Removed non-existent reference
```

---

## Rule 3: Custody Transfer Continuity

### Description

Collection custody transfers must be continuous: no gaps or overlaps between versions. Collections don't disappear; custody must transfer during organizational changes.

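As a rough Python sketch of this continuity requirement, the hand-over between two consecutive versions can be classified from the difference between the later version's `valid_from` and the earlier version's `valid_to`. The sketch treats a start on the same day or the following day as continuous, a later start as a gap, and an earlier start as an overlap. The helper name `classify_transfer` and its return shape are assumptions made for illustration; the actual validator may structure this differently.

```python
from datetime import date
from typing import Tuple


def classify_transfer(prev_valid_to: str, next_valid_from: str) -> Tuple[str, int]:
    """Classify the hand-over between two consecutive custody versions.

    Hypothetical helper. Dates are ISO 8601 strings (YYYY-MM-DD).
    Returns (status, days) where days = next_valid_from - prev_valid_to.
    """
    delta = (date.fromisoformat(next_valid_from)
             - date.fromisoformat(prev_valid_to)).days
    if delta < 0:
        return ("ERROR", delta)    # next version starts before the previous one ends
    if delta > 1:
        return ("WARNING", delta)  # more than one day is unaccounted for
    return ("OK", delta)           # same day or next day counts as continuous


print(classify_transfer("2013-02-28", "2013-03-01"))  # ('OK', 1)
```

Versions would first need to be grouped (for example by `collection_name`) and sorted by `valid_from` before adjacent pairs are passed to a helper like this one.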
### Rule ID

`CUSTODY_CONTINUITY`

### Constraints

**Constraint 3.1**: Continuous custody (no gaps > 1 day)

```
IF CustodianCollection version 1 ends (valid_to = T1)
   AND CustodianCollection version 2 exists with same collection_name
THEN version 2 must start at T1 OR T1+1 day

Gap = version2.valid_from - version1.valid_to
IF Gap > 1 day THEN WARN
```

**Constraint 3.2**: No overlapping custody

```
IF CustodianCollection version 1 ends (valid_to = T1)
   AND CustodianCollection version 2 starts (valid_from = T2)
   AND T2 < T1
THEN ERROR (overlapping custody)
```

---

### Examples

#### Valid Example: Continuous custody transfer

```yaml
# Version 1: Before merger
---
id: "https://example.org/collection/paintings-v1"
collection_name: "Paintings Collection"
managing_unit: "https://example.org/unit/old-dept"
valid_from: "1995-01-01"
valid_to: "2013-02-28"  # Custody ends

# Version 2: After merger (next day)
---
id: "https://example.org/collection/paintings-v2"
collection_name: "Paintings Collection"
managing_unit: "https://example.org/unit/new-dept"
valid_from: "2013-03-01"  # ✅ Custody starts next day (1 day gap OK)
valid_to: null
```

**Result**: ✅ **PASS** - Continuous custody (1-day gap acceptable)

---

#### Warning Example: Custody gap

```yaml
# Version 1: Ends 2013-02-28
---
id: "https://example.org/collection/paintings-v1"
collection_name: "Paintings Collection"
managing_unit: "https://example.org/unit/old-dept"
valid_from: "1995-01-01"
valid_to: "2013-02-28"

# Version 2: Starts 2013-05-01 (62-day gap!)
---
id: "https://example.org/collection/paintings-v2"
collection_name: "Paintings Collection"
managing_unit: "https://example.org/unit/new-dept"
valid_from: "2013-05-01"  # ⚠️ 62-day gap!
valid_to: null
```

**Warning**:

```
[WARNING] CUSTODY_CONTINUITY: Collection 'Paintings Collection' has custody gap: version ending 2013-02-28, next version starting 2013-05-01 (gap: 62 days). Expected continuous custody transfer.
```

**Fix**: Adjust dates to eliminate gap:

```yaml
valid_from: "2013-03-01"  # ✅ Next day after previous version
```

---

#### Error Example: Overlapping custody

```yaml
# Version 1: Ends 2013-12-31
---
id: "https://example.org/collection/paintings-v1"
collection_name: "Paintings Collection"
managing_unit: "https://example.org/unit/old-dept"
valid_from: "1995-01-01"
valid_to: "2013-12-31"

# Version 2: Starts 2013-06-01 (overlaps by 7 months!)
---
id: "https://example.org/collection/paintings-v2"
collection_name: "Paintings Collection"
managing_unit: "https://example.org/unit/new-dept"
valid_from: "2013-06-01"  # ❌ Overlapping custody!
valid_to: null
```

**Error**:

```
[ERROR] CUSTODY_CONTINUITY: Collection 'Paintings Collection' has overlapping custody periods: version ending 2013-12-31 overlaps with version starting 2013-06-01 (overlap: 214 days).
```

**Fix**: Align dates so custody ends before new version starts:

```yaml
# Version 1: End on merger date
valid_to: "2013-05-31"

# Version 2: Start day after merger
valid_from: "2013-06-01"  # ✅ Continuous, no overlap
```

---

## Rule 4: Staff-Unit Temporal Consistency

### Description

Staff role dates must fit within the organizational unit's validity period. A person cannot work for a unit that doesn't exist.

### Rule ID

`STAFF_UNIT_TEMPORAL`

### Constraints

**Constraint 4.1**: Role start date must be **on or after** unit founding date

```
PersonObservation.role_start_date >= OrganizationalStructure.valid_from
```

**Constraint 4.2**: Role end date must be **on or before** unit dissolution date (if unit dissolved)

```
IF OrganizationalStructure.valid_to IS NOT NULL
THEN PersonObservation.role_end_date <= OrganizationalStructure.valid_to
```

**Warning Condition**: Role ongoing but unit dissolved

```
IF OrganizationalStructure.valid_to IS NOT NULL
   AND PersonObservation.role_end_date IS NULL
THEN WARN (missing staff reassignment)
```

---

### Examples

#### Valid Example: Staff role within unit lifetime

```yaml
---
id: "https://example.org/unit/dept-1"
unit_name: "Paintings Department"
unit_type: DEPARTMENT
valid_from: "1985-01-01"
valid_to: null
staff_members:
  - "https://example.org/person/curator-001"

---
id: "https://example.org/person/curator-001"
person_name: "Dr. Jan Vermeer"
staff_role: CURATOR
unit_affiliation: "https://example.org/unit/dept-1"
role_start_date: "2010-01-01"  # ✅ After unit founding (1985)
role_end_date: null
```

**Result**: ✅ **PASS** - Role starts after unit founded

---

#### Invalid Example: Role before unit exists

```yaml
---
id: "https://example.org/unit/dept-1"
unit_name: "Special Collections"
unit_type: DIVISION
valid_from: "1982-01-01"
staff_members:
  - "https://example.org/person/curator-001"

---
id: "https://example.org/person/curator-001"
person_name: "Dr. Smith"
staff_role: CURATOR
unit_affiliation: "https://example.org/unit/dept-1"
role_start_date: "1975-01-01"  # ❌ Before unit exists (1982)!
role_end_date: null
```

**Error**:

```
[ERROR] STAFF_UNIT_TEMPORAL: Staff role starts (1975-01-01) before unit exists (1982-01-01). Unit: Special Collections, Person: Dr. Smith
```

**Fix**: Adjust role start date or unit founding date.

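Constraints 4.1 and 4.2 and the warning condition amount to an interval-containment test, the same shape of check Rule 1 applies to collection custody. The following Python sketch shows one way to express it, assuming dates arrive as ISO 8601 strings and `None` stands for an open-ended period. The function name `check_within_unit` and the finding strings are hypothetical, not the validator's actual API or output.

```python
from datetime import date
from typing import List, Optional


def check_within_unit(start: str, end: Optional[str],
                      unit_from: str, unit_to: Optional[str]) -> List[str]:
    """Compare a role (or custody) period against a unit's validity period.

    Hypothetical helper. Returns "ERROR: ..." / "WARNING: ..." strings;
    an empty list means the period fits inside the unit's lifetime.
    """
    findings = []

    # Start must be on or after the unit's founding date.
    if date.fromisoformat(start) < date.fromisoformat(unit_from):
        findings.append(f"ERROR: starts ({start}) before unit exists ({unit_from})")

    # The end-date checks only apply when the unit has been dissolved.
    if unit_to is not None:
        if end is None:
            findings.append(f"WARNING: ongoing but unit dissolved ({unit_to})")
        elif date.fromisoformat(end) > date.fromisoformat(unit_to):
            findings.append(f"ERROR: extends ({end}) beyond unit validity ({unit_to})")

    return findings


# Valid example above: role starts in 2010, unit founded in 1985 and still active.
print(check_within_unit("2010-01-01", None, "1985-01-01", None))  # []
# Invalid example above: role starts in 1975, unit founded in 1982.
print(check_within_unit("1975-01-01", None, "1982-01-01", None))
```

Returning findings instead of raising lets a caller collect every problem in a file and decide afterwards whether errors are present.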
---

## Rule 5: Staff-Unit Bidirectional Consistency

### Description

Bidirectional staff-unit relationships must be consistent. If a person references a unit, the unit must list the person, and vice versa.

### Rule ID

`STAFF_UNIT_BIDIRECTIONAL`

### Constraints

**Constraint 5.1**: Forward consistency (person → unit)

```
IF PersonObservation.unit_affiliation = unit_id
THEN OrganizationalStructure[unit_id].staff_members MUST INCLUDE person_id
```

**Constraint 5.2**: Reverse consistency (unit → person)

```
IF OrganizationalStructure.staff_members INCLUDES person_id
THEN PersonObservation[person_id].unit_affiliation MUST EQUAL unit_id
```

---

### Examples

#### Valid Example: Bidirectional staff-unit relationship

```yaml
---
id: "https://example.org/unit/dept-1"
unit_name: "Paintings Department"
staff_members:
  - "https://example.org/person/curator-001"  # ✅ Lists person

---
id: "https://example.org/person/curator-001"
person_name: "Dr. Jan Vermeer"
staff_role: CURATOR
unit_affiliation: "https://example.org/unit/dept-1"  # ✅ References unit
```

**Result**: ✅ **PASS** - Bidirectional relationship consistent

---

#### Invalid Example: Person missing from unit.staff_members

```yaml
---
id: "https://example.org/unit/dept-1"
unit_name: "Paintings Department"
staff_members: []  # ❌ Empty, doesn't include person

---
id: "https://example.org/person/curator-001"
person_name: "Dr. Jan Vermeer"
staff_role: CURATOR
unit_affiliation: "https://example.org/unit/dept-1"  # Person references unit
```

**Error**:

```
[ERROR] STAFF_UNIT_BIDIRECTIONAL: Person references unit 'Paintings Department' as unit_affiliation, but unit does not list person in staff_members. Add person to unit.staff_members. Person: Dr. Jan Vermeer
```

**Fix**: Add person to unit's `staff_members`:

```yaml
---
id: "https://example.org/unit/dept-1"
unit_name: "Paintings Department"
staff_members:
  - "https://example.org/person/curator-001"  # ✅ Added
```

---

## Validation Workflow

### Using the Validator

**Command**:

```bash
python scripts/validate_temporal_consistency.py
```

**Example**:

```bash
python scripts/validate_temporal_consistency.py \
  schemas/20251121/examples/collection_department_integration_examples.yaml
```

**Output**:

```
================================================================================
HERITAGE CUSTODIAN ONTOLOGY - TEMPORAL CONSISTENCY VALIDATOR
Schema Version: v0.7.0 (Phase 5)
================================================================================

🔍 Validating collection_department_integration_examples.yaml...
  - Organizational units: 5
  - Collections: 10
  - Person observations: 0
  - Change events: 0

================================================================================
VALIDATION SUMMARY
================================================================================
Entities validated: 15
Rules checked: 5
Errors: 0
Warnings: 0
Status: ✅ PASS
================================================================================

✅ All validation rules passed!
```

---

### Exit Codes

- **0**: All validation rules passed (may have warnings)
- **1**: Validation failed (errors present)

---

### Interpreting Results

**Errors (🔴)**:

- **Severity**: High (must fix)
- **Impact**: Data integrity violation
- **Action**: Fix immediately before using data

**Warnings (🟡)**:

- **Severity**: Medium (should fix)
- **Impact**: Potential data quality issue
- **Action**: Review and fix if appropriate

---

## SHACL Shapes (RDF Validation)

Future work will include SHACL shapes for RDF triple store validation.
Preview:

```turtle
# Collection-Unit Temporal Constraint (SHACL)
:CollectionUnitTemporalConstraint
    a sh:NodeShape ;
    sh:targetClass custodian:CustodianCollection ;
    sh:sparql [
        sh:message "Collection custody starts before managing unit exists" ;
        sh:select """
            PREFIX custodian:
            PREFIX schema:
            SELECT $this
            WHERE {
                $this custodian:managing_unit ?unit ;
                      schema:startDate ?coll_start .
                ?unit schema:startDate ?unit_start .
                FILTER (?coll_start < ?unit_start)
            }
        """ ;
    ] .
```

---

## Integration with LinkML Schema

Validation rules are implemented in Python (runtime) but future versions will include LinkML schema constraints:

```yaml
slots:
  managing_unit:
    range: OrganizationalStructure
    # Future: Add LinkML validation expression
    # validation:
    #   rule: "valid_from >= managing_unit.valid_from"
```

---

## Testing

**Test Suite**: `tests/test_temporal_validation.py`

**Coverage**:

- 19 test cases
- 100% rule coverage (all 5 rules tested)
- Valid cases (should pass)
- Invalid cases (should fail with specific errors)
- Warning cases (should generate warnings)
- Integration tests (multiple rules together)

**Run Tests**:

```bash
python -m pytest tests/test_temporal_validation.py -v
```

**Expected Output**:

```
tests/test_temporal_validation.py::TestDateUtilities::test_parse_date_iso_string PASSED
tests/test_temporal_validation.py::TestDateUtilities::test_parse_date_iso_with_time PASSED
...
(17 more tests)
============================== 19 passed in 0.20s ==============================
```

---

## References

### Schema Files

- Main schema: `schemas/20251121/linkml/01_custodian_name_modular.yaml` (v0.7.0)
- CustodianCollection: `schemas/20251121/linkml/modules/classes/CustodianCollection.yaml`
- OrganizationalStructure: `schemas/20251121/linkml/modules/classes/OrganizationalStructure.yaml`
- PersonObservation: `schemas/20251121/linkml/modules/classes/PersonObservation.yaml`

### Implementation

- Validator: `scripts/validate_temporal_consistency.py` (534 lines)
- Test suite: `tests/test_temporal_validation.py` (19 tests)
- Examples: `schemas/20251121/examples/collection_department_integration_examples.yaml`

### Documentation

- Phase 4 Completion: `COLLECTION_DEPARTMENT_INTEGRATION_COMPLETE_20251122.md`
- Phase 3 Completion: `PICO_STAFF_ROLES_COMPLETE_20251122.md`
- Phase 5 Completion: (to be created)

---

**Version**: 1.0
**Date**: 2025-11-22
**Schema Version**: v0.7.0
**Validator Version**: 1.0
**Status**: ✅ Complete