# Hyper-Modular Schema Structure

**Version**: 0.1.0
**Date**: 2025-11-21
**Schema**: Heritage Custodian Observation and Reconstruction Pattern

---

## Overview

The Heritage Custodian schema uses a **hyper-modular architecture** in which every class, enum, and slot is defined in its own file. This gives each component its own version-control history, lets contributors work in parallel, and keeps individual definitions easy to maintain.

**Total Files**: 78 YAML files
- **Classes**: 12 files (`modules/classes/`)
- **Enums**: 5 files (`modules/enums/`)
- **Slots**: 59 files (`modules/slots/`)
- **Core Modules**: 1 file (`modules/metadata.yaml`)
- **Main Schema**: 1 file (`01_custodian_name_modular.yaml`)

**Note**: The aggregator files (`enums_all.yaml`, `slots_all.yaml`, `classes_all.yaml`) still exist but are not used by the main schema and are not counted in the total above. They remain available for backward compatibility.
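
The per-directory counts listed above can be checked mechanically. The following is a minimal sketch, not part of the project: the helper names `count_module_files` and `count_mismatches` are hypothetical, and it assumes every component is a `*.yaml` file directly inside `classes/`, `enums/`, or `slots/` under the `modules/` directory.

```python
from pathlib import Path

# Expected counts, copied from the totals listed in this document.
EXPECTED_COUNTS = {"classes": 12, "enums": 5, "slots": 59}


def count_module_files(modules_dir):
    """Count the *.yaml files directly inside each component directory."""
    root = Path(modules_dir)
    return {kind: len(list((root / kind).glob("*.yaml"))) for kind in EXPECTED_COUNTS}


def count_mismatches(modules_dir):
    """Return {kind: (expected, actual)} for every directory whose count differs."""
    actual = count_module_files(modules_dir)
    return {
        kind: (expected, actual[kind])
        for kind, expected in EXPECTED_COUNTS.items()
        if actual[kind] != expected
    }
```

Run from `schemas/20251121/linkml/`, `count_mismatches("modules")` should return an empty dict when the tree matches the totals above.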

---

## Directory Structure

```
schemas/20251121/linkml/
├── 01_custodian_name_modular.yaml   # Main schema (directly imports all 76 modules)
├── HYPER_MODULAR_STRUCTURE.md       # This file
├── SLOT_NAMING_CONVENTIONS.md       # Slot naming rules for range variants
│
└── modules/
    ├── metadata.yaml                # Schema metadata & namespace prefixes
    │
    ├── enums/                       # 5 enum definitions (all imported directly)
    │   ├── AgentTypeEnum.yaml
    │   ├── AppellationTypeEnum.yaml
    │   ├── LegalStatusEnum.yaml
    │   ├── ReconstructionActivityTypeEnum.yaml
    │   └── SourceDocumentTypeEnum.yaml
    │
    ├── slots/                       # 59 slot definitions (all imported directly)
    │   ├── id.yaml
    │   ├── created.yaml
    │   ├── modified.yaml
    │   ├── observed_name.yaml
    │   ├── was_revision_of.yaml     # May contain multiple range variants
    │   └── ... (54 more slot files) # (see SLOT_NAMING_CONVENTIONS.md)
    │
    ├── classes/                     # 12 class definitions (all imported directly)
    │   ├── Custodian.yaml
    │   ├── CustodianObservation.yaml
    │   ├── CustodianName.yaml
    │   ├── CustodianReconstruction.yaml
    │   ├── ReconstructionActivity.yaml
    │   ├── Agent.yaml
    │   ├── Identifier.yaml
    │   ├── Appellation.yaml
    │   ├── SourceDocument.yaml
    │   ├── ConfidenceMeasure.yaml
    │   ├── LanguageCode.yaml
    │   └── TimeSpan.yaml
    │
    └── [Legacy aggregators - not used by main schema]
        ├── enums_all.yaml           # Aggregator for backward compatibility
        ├── slots_all.yaml           # Aggregator for backward compatibility
        └── classes_all.yaml         # Aggregator for backward compatibility
```

---

## Namespace Structure

All components use the `https://nde.nl/ontology/hc/` base namespace:

| Component | Namespace Pattern | Example |
|-----------|-------------------|---------|
| **Base** | `https://nde.nl/ontology/hc/` | Schema root |
| **Classes** | `https://nde.nl/ontology/hc/class/{ClassName}` | `https://nde.nl/ontology/hc/class/Custodian` |
| **Enums** | `https://nde.nl/ontology/hc/enum/{EnumName}` | `https://nde.nl/ontology/hc/enum/LegalStatusEnum` |
| **Slots** | `https://nde.nl/ontology/hc/slot/{slot_name}` | `https://nde.nl/ontology/hc/slot/was_revision_of` |
| **Metadata** | `https://nde.nl/ontology/hc/metadata` | Metadata module |

**Prefixes** (defined in `modules/metadata.yaml`):
```yaml
prefixes:
  hc: https://nde.nl/ontology/hc/
  hc_class: https://nde.nl/ontology/hc/class/
  hc_enum: https://nde.nl/ontology/hc/enum/
  hc_slot: https://nde.nl/ontology/hc/slot/
```
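
The namespace patterns in the table can be expressed as a pair of small helpers. This is an illustrative sketch only: `component_uri` and `split_component_uri` are hypothetical names, and the code covers just the three patterned kinds (`class`, `enum`, `slot`) from the table.

```python
BASE = "https://nde.nl/ontology/hc/"
KINDS = ("class", "enum", "slot")


def component_uri(kind, name):
    """Build the URI for a component, e.g. ("class", "Custodian")."""
    if kind not in KINDS:
        raise ValueError(f"unknown component kind: {kind!r}")
    return f"{BASE}{kind}/{name}"


def split_component_uri(uri):
    """Inverse of component_uri: return (kind, name) or raise ValueError."""
    if not uri.startswith(BASE):
        raise ValueError(f"not in the hc namespace: {uri!r}")
    kind, sep, name = uri[len(BASE):].partition("/")
    if kind not in KINDS or not sep or not name:
        raise ValueError(f"not a class/enum/slot URI: {uri!r}")
    return kind, name
```

For example, `component_uri("slot", "was_revision_of")` yields the slot example from the table, while `split_component_uri` rejects the metadata URI because it has no `class/`, `enum/`, or `slot/` segment.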

---

## Import Strategy

### Direct Import Pattern

The main schema **directly imports all 76 individual component files** (in addition to `linkml:types` and `modules/metadata`), so every dependency is visible in one place:

```yaml
# 01_custodian_name_modular.yaml

imports:
  - linkml:types
  - modules/metadata

  # Enums (5 files)
  - modules/enums/AgentTypeEnum
  - modules/enums/AppellationTypeEnum
  - modules/enums/LegalStatusEnum
  - modules/enums/ReconstructionActivityTypeEnum
  - modules/enums/SourceDocumentTypeEnum

  # Slots (59 files)
  - modules/slots/activity_type
  - modules/slots/affiliation
  # ... (57 more slot imports)

  # Classes (12 files)
  - modules/classes/Agent
  - modules/classes/Appellation
  # ... (10 more class imports)
```

**Benefits of Direct Imports**:
- ✅ **Complete Transparency**: Immediately see all schema dependencies
- ✅ **Explicit Dependencies**: No hidden imports through aggregators
- ✅ **Selective Imports**: Easy to comment out individual components for custom schemas
- ✅ **Better IDE Support**: Direct file references for navigation
- ✅ **Clear Audit Trail**: Git diffs show exactly which components changed

**Note**: Aggregator modules (`enums_all.yaml`, `slots_all.yaml`, `classes_all.yaml`) still exist for backward compatibility and can be used by downstream projects that prefer a simpler import structure.
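
Because the import list mirrors the directory tree, it can be regenerated from the files on disk. The sketch below is an assumption-laden illustration rather than the project's tooling: `build_import_list` is a hypothetical helper, it assumes the enums, slots, classes order shown above, and it sorts file names alphabetically within each group.

```python
from pathlib import Path


def build_import_list(schema_dir):
    """Return import entries (paths without the .yaml suffix) for the main schema."""
    modules = Path(schema_dir) / "modules"
    imports = ["linkml:types", "modules/metadata"]
    for kind in ("enums", "slots", "classes"):
        # One entry per component file, in a stable alphabetical order.
        for path in sorted((modules / kind).glob("*.yaml")):
            imports.append(f"modules/{kind}/{path.stem}")
    return imports
```

Printing each entry as `  - {entry}` gives lines that can be compared against the `imports:` block of `01_custodian_name_modular.yaml`.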

---

## File Naming Conventions

### Class Files

**Pattern**: `{ClassName}.yaml` (PascalCase)

Examples:
- `Custodian.yaml`
- `CustodianObservation.yaml`
- `CustodianReconstruction.yaml`

**File structure**:
```yaml
id: https://nde.nl/ontology/hc/class/ClassName
name: ClassName
title: ClassName Class

imports:
  - linkml:types
  - OtherClass  # If needed

classes:
  ClassName:
    class_uri: ontology:Class
    description: "..."
    slots:
      - slot1
      - slot2
```

---

### Enum Files

**Pattern**: `{EnumName}.yaml` (PascalCase with "Enum" suffix)

Examples:
- `LegalStatusEnum.yaml`
- `AgentTypeEnum.yaml`

**File structure**:
```yaml
id: https://nde.nl/ontology/hc/enum/EnumName
name: EnumName

enums:
  EnumName:
    description: "..."
    permissible_values:
      VALUE1:
        description: "..."
      VALUE2:
        description: "..."
```

---

### Slot Files

**Pattern**: `{slot_name}.yaml` (snake_case)

Examples:
- `legal_name.yaml`
- `was_revision_of.yaml`
- `observed_name.yaml`

**Special Case - Range Variants**: See `SLOT_NAMING_CONVENTIONS.md` for handling multiple slots with the same ontological property but different ranges.

**File structure**:
```yaml
id: https://nde.nl/ontology/hc/slot/slot_name
name: slot-name-slot

imports:
  - ../classes/RangeClass  # If range is a class

slots:
  slot_name:
    slot_uri: ontology:property
    range: RangeType
    description: "..."

  # Optional: Range variants (same slot_uri, different range)
  slot_name-variant:
    slot_uri: ontology:property  # SAME as base
    range: DifferentRangeType
    description: "..."
```
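
The three file-name patterns can be checked with regular expressions. This is a rough sketch under stated assumptions: the regexes are one possible reading of "PascalCase" and "snake_case", the `PATTERNS` table and `follows_convention` helper are hypothetical, and the rules for range-variant slot names in `SLOT_NAMING_CONVENTIONS.md` are not covered.

```python
import re

# One regex per component directory; these approximate the conventions above.
PATTERNS = {
    "classes": re.compile(r"^[A-Z][A-Za-z0-9]*\.yaml$"),              # PascalCase
    "enums": re.compile(r"^[A-Z][A-Za-z0-9]*Enum\.yaml$"),            # PascalCase + "Enum"
    "slots": re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*\.yaml$"),      # snake_case
}


def follows_convention(kind, filename):
    """Return True if filename matches the naming pattern for its directory."""
    return PATTERNS[kind].match(filename) is not None
```

Note that the class pattern also accepts names ending in `Enum`, so the check confirms casing only, not that a file sits in the right directory.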

---

## Maintenance Guidelines

### Adding a New Class

1. Create file: `modules/classes/{ClassName}.yaml`
2. Define class with namespace: `https://nde.nl/ontology/hc/class/{ClassName}`
3. Add the import to `01_custodian_name_modular.yaml`, which imports every component directly. Also add it to `modules/classes_all.yaml` if the legacy aggregator should stay in sync.
4. Test schema generation: `gen-owl 01_custodian_name_modular.yaml`

### Adding a New Enum

1. Create file: `modules/enums/{EnumName}.yaml`
2. Define enum with namespace: `https://nde.nl/ontology/hc/enum/{EnumName}`
3. Add the import to `01_custodian_name_modular.yaml`. Also add it to `modules/enums_all.yaml` if the legacy aggregator should stay in sync.
4. Test schema generation

### Adding a New Slot

1. **Check whether an ontologically related slot exists**: Look for existing slots with the same `slot_uri`
2. **If EXISTS**: Add range variant to existing file (see `SLOT_NAMING_CONVENTIONS.md`)
3. **If NEW**: Create file `modules/slots/{slot_name}.yaml`
4. Define slot with namespace: `https://nde.nl/ontology/hc/slot/{slot_name}`
5. Add the import to `01_custodian_name_modular.yaml` (only if new file). Also add it to `modules/slots_all.yaml` if the legacy aggregator should stay in sync.
6. Test schema generation
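
Step 1 of the slot procedure, looking for an existing slot with the same `slot_uri`, can be done with a line scan over the slot files. The following is a minimal sketch, assuming unquoted `slot_uri:` values on their own line as in the file structure shown earlier; `slot_files_by_uri` and `files_declaring` are hypothetical helper names, and no YAML parser is used.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches lines such as "    slot_uri: prov:wasRevisionOf  # comment".
SLOT_URI_LINE = re.compile(r"^\s*slot_uri:\s*(\S+)")


def slot_files_by_uri(slots_dir):
    """Map each slot_uri value to the sorted file names that declare it."""
    found = defaultdict(set)
    for path in Path(slots_dir).glob("*.yaml"):
        for line in path.read_text(encoding="utf-8").splitlines():
            match = SLOT_URI_LINE.match(line)
            if match:
                found[match.group(1)].add(path.name)
    return {uri: sorted(names) for uri, names in found.items()}


def files_declaring(slots_dir, slot_uri):
    """Return the slot files that already use slot_uri (empty list if none)."""
    return slot_files_by_uri(slots_dir).get(slot_uri, [])
```

If `files_declaring("modules/slots", "prov:wasRevisionOf")` returns a file name, the new slot belongs in that file as a range variant; an empty list means a new file is appropriate.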

### Adding a Range Variant to Existing Slot

**Example**: Adding `was_revision_of` for `Record` class

1. Open existing file: `modules/slots/was_revision_of.yaml`
2. Add import for new range class:
   ```yaml
   imports:
     - ../classes/CustodianReconstruction  # Existing
     - ../classes/Record  # NEW
   ```
3. Add new slot variant:
   ```yaml
   slots:
     was_revision_of:
       slot_uri: prov:wasRevisionOf
       range: CustodianReconstruction
       description: "..."

     was_revision_of-record:  # NEW
       slot_uri: prov:wasRevisionOf
       range: Record
       description: "..."
   ```
4. **No import changes needed** in the main schema or the aggregator (the slot file is already imported)
5. Test schema generation

---

## Validation and Testing

### Validate Schema Structure

```bash
# Adjust this path to the location of your own checkout
cd /Users/kempersc/apps/glam/schemas/20251121/linkml

# Test OWL generation (validates imports and structure)
gen-owl 01_custodian_name_modular.yaml > /dev/null

# Test JSON Schema generation
gen-json-schema 01_custodian_name_modular.yaml > /dev/null

# Test Python dataclasses generation
gen-python 01_custodian_name_modular.yaml > /dev/null
```
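
The three generator commands can also be driven from a script, for example in CI. This is only a sketch: `run_generators` and `failed_generators` are hypothetical helpers, the `run` parameter exists purely so the subprocess call can be stubbed out, and the code assumes the LinkML command-line tools are on `PATH` (otherwise `subprocess.run` raises `FileNotFoundError`).

```python
import subprocess

# The generator commands used in the shell snippet above.
GENERATORS = ("gen-owl", "gen-json-schema", "gen-python")


def run_generators(schema_path, run=subprocess.run):
    """Run each generator on schema_path and return {command: exit_code}."""
    results = {}
    for command in GENERATORS:
        # Discard generated output, like "> /dev/null" in the shell version.
        completed = run([command, str(schema_path)], stdout=subprocess.DEVNULL)
        results[command] = completed.returncode
    return results


def failed_generators(results):
    """Return the sorted commands that exited with a non-zero status."""
    return sorted(command for command, code in results.items() if code != 0)
```

A wrapper script could call `failed_generators(run_generators("01_custodian_name_modular.yaml"))` and exit non-zero when the returned list is not empty.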

### Check Namespace Consistency

```bash
# All class files should have hc/class/ namespace
grep -h "^id:" modules/classes/*.yaml | sort -u

# All enum files should have hc/enum/ namespace
grep -h "^id:" modules/enums/*.yaml | sort -u

# All slot files should have hc/slot/ namespace
grep -h "^id:" modules/slots/*.yaml | sort -u
```

Expected output, grouped by command (the `#` headings are annotations, not part of the output):
```
# Classes
id: https://nde.nl/ontology/hc/class/Agent
id: https://nde.nl/ontology/hc/class/Appellation
...

# Enums
id: https://nde.nl/ontology/hc/enum/AgentTypeEnum
id: https://nde.nl/ontology/hc/enum/AppellationTypeEnum
...

# Slots
id: https://nde.nl/ontology/hc/slot/activity_type
id: https://nde.nl/ontology/hc/slot/affiliation
...
```
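
The `grep` check still needs a human to read the output. A scripted variant can report only the files that break the rule. The sketch below is a hypothetical helper, not project code: it assumes each module has a top-level `id:` line with an unquoted URI and reads it with a line scan rather than a YAML parser.

```python
import re
from pathlib import Path

BASE = "https://nde.nl/ontology/hc/"
# Expected id prefix for the files in each component directory.
NAMESPACES = {
    "classes": BASE + "class/",
    "enums": BASE + "enum/",
    "slots": BASE + "slot/",
}
ID_LINE = re.compile(r"^id:\s*(\S+)")


def read_schema_id(path):
    """Return the value of the first top-level `id:` line, or None."""
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        match = ID_LINE.match(line)
        if match:
            return match.group(1)
    return None


def namespace_violations(modules_dir):
    """List (relative path, id) pairs whose id is missing or in the wrong namespace."""
    root = Path(modules_dir)
    bad = []
    for kind, prefix in NAMESPACES.items():
        for path in sorted((root / kind).glob("*.yaml")):
            schema_id = read_schema_id(path)
            if schema_id is None or not schema_id.startswith(prefix):
                bad.append((f"{kind}/{path.name}", schema_id))
    return bad
```

An empty list from `namespace_violations("modules")` corresponds to the expected `grep` output above.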

---

## Benefits of Hyper-Modular Structure

### 1. Granular Version Control

Each component file has its own git history:
```bash
git log modules/classes/Custodian.yaml
git blame modules/slots/legal_form.yaml
```

### 2. Parallel Development

Multiple developers can work simultaneously without merge conflicts, provided they edit different files:
- Developer A edits `CustodianObservation.yaml`
- Developer B edits `ReconstructionActivity.yaml`
- No conflicts, both changes merge cleanly

### 3. Selective Imports

You can create specialized schemas that import only the components they need:
```yaml
# Mini schema for observations only
imports:
  - modules/metadata
  - modules/classes/CustodianObservation
  - modules/slots/observed_name
  - modules/slots/source
```

### 4. Clear Ownership

One file = one concept = one maintainer:
- `Custodian.yaml` → CIDOC-CRM expert
- `LegalStatusEnum.yaml` → GLEIF ontology expert
- `was_revision_of.yaml` → PROV-O expert

### 5. Easier Code Review

Small, focused pull requests:
- ❌ "Update schema with 5 new classes" (monolithic, 500 lines)
- ✅ "Add TimeSpan class" (one file, 96 lines)

### 6. Better Documentation

Each file can have extensive inline documentation without cluttering others:
```yaml
# CustodianReconstruction.yaml can have 200 lines of comments
# without making Identifier.yaml harder to read
```

### 7. IDE-Friendly

File tree navigation:
```
modules/classes/
├── Agent.yaml                  ← Easy to find
├── Custodian.yaml              ← Alphabetically sorted
└── CustodianObservation.yaml
```

vs. monolithic:
```
heritage_custodian.yaml:3458    ← Where is CustodianObservation?
```

---

## Migration from Consolidated Structure

### Phase 1: Module Consolidation (Completed)

- ✅ Split monolithic schema into 9 modules
- ✅ Classes grouped by function (base, observation, reconstruction, etc.)

### Phase 2: Hyper-Modularization (Completed 2025-11-21)

- ✅ Split all 12 classes into individual files
- ✅ Split all 5 enums into individual files
- ✅ Split all 59 slots into individual files
- ✅ Created 3 aggregator modules
- ✅ Updated all namespace URIs to `nde.nl/ontology/hc/`
- ✅ Validated OWL generation

### Legacy Files (Can Be Deleted)

These consolidated module files are now obsolete:
- `modules/base_classes.yaml` → Replaced by `classes/Custodian.yaml`
- `modules/observation_classes.yaml` → Replaced by `classes/CustodianObservation.yaml`, `classes/CustodianName.yaml`
- `modules/reconstruction_classes.yaml` → Replaced by `classes/CustodianReconstruction.yaml`
- `modules/provenance_classes.yaml` → Replaced by `classes/ReconstructionActivity.yaml`, `classes/Agent.yaml`
- `modules/supporting_classes.yaml` → Replaced by 6 individual class files
- `modules/enums.yaml` → Replaced by `enums_all.yaml` + 5 individual files
- `modules/slots.yaml` → Replaced by `slots_all.yaml` + 59 individual files

---

## Troubleshooting

### Error: "Cannot find module X"

**Cause**: Import path incorrect or file missing

**Solution**:
1. Check that the importing file (the main schema, or an aggregator if you use one) references the correct file name
2. Verify file exists: `ls modules/classes/X.yaml`
3. Check that the `id:` in the file ends with the same name as the import path
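
The first two checks can be automated by resolving every import entry against the file system. This is a rough sketch with hypothetical helper names (`read_imports`, `unresolved_imports`). It assumes a single top-level `imports:` list written one entry per line, treats entries containing `:` (such as `linkml:types`) as non-local, and assumes local entries resolve relative to the importing file with a `.yaml` suffix.

```python
import re
from pathlib import Path

# A list item such as "  - modules/classes/Agent  # comment".
IMPORT_ITEM = re.compile(r"^\s*-\s+([^\s#]+)")


def read_imports(schema_file):
    """Collect the entries of the top-level imports list with a line scan."""
    entries, inside = [], False
    for line in Path(schema_file).read_text(encoding="utf-8").splitlines():
        stripped = line.strip()
        if line.rstrip() == "imports:":
            inside = True
            continue
        # Skip everything before the block, plus blank and comment lines in it.
        if not inside or not stripped or stripped.startswith("#"):
            continue
        match = IMPORT_ITEM.match(line)
        if match is None:
            break  # first non-list line ends the imports block
        entries.append(match.group(1))
    return entries


def unresolved_imports(schema_file):
    """Return local import entries that have no matching .yaml file."""
    base = Path(schema_file).parent
    return [
        entry
        for entry in read_imports(schema_file)
        if ":" not in entry and not (base / f"{entry}.yaml").is_file()
    ]
```

Each name returned by `unresolved_imports("01_custodian_name_modular.yaml")` is either a typo in the import path or a missing file.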

### Error: "Duplicate class definition"

**Cause**: Class defined in multiple files and both imported

**Solution**:
1. Remove class from old consolidated module
2. Ensure the main schema (or the aggregator, if used) imports only the new individual file
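
To find which files define the same class, the module files can be scanned for class names. The sketch below is a heuristic with hypothetical helper names (`classes_defined_in`, `duplicate_classes`): it assumes two-space indentation under a top-level `classes:` key, as in the class file structure shown earlier, and does not parse YAML properly.

```python
import re
from collections import defaultdict
from pathlib import Path

# A class name is a key indented by exactly two spaces under "classes:".
CLASS_NAME = re.compile(r"^  ([A-Za-z_][A-Za-z0-9_]*):\s*(#.*)?$")
TOP_LEVEL_KEY = re.compile(r"^[A-Za-z_]\w*:")


def classes_defined_in(path):
    """Return the class names declared in one module file."""
    names, inside = [], False
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if TOP_LEVEL_KEY.match(line):
            # Entering or leaving the top-level "classes:" block.
            inside = line.startswith("classes:")
            continue
        if inside:
            match = CLASS_NAME.match(line)
            if match:
                names.append(match.group(1))
    return names


def duplicate_classes(paths):
    """Map each class name defined in more than one file to those file names."""
    owners = defaultdict(list)
    for path in paths:
        for name in classes_defined_in(path):
            owners[name].append(Path(path).name)
    return {name: files for name, files in owners.items() if len(files) > 1}
```

Passing the old consolidated modules together with the files in `modules/classes/` shows which definitions still need to be removed from the consolidated side.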

### Warning: "Multiple owl types"

**Cause**: Range conflicts (e.g., slot used as both object property and datatype property)

**Solution**: Expected for polymorphic slots with `any_of`. Can be ignored if intentional.

---

## References

- **Slot Naming Conventions**: `SLOT_NAMING_CONVENTIONS.md`
- **LinkML Documentation**: https://linkml.io/
- **Schema Validation**: `gen-owl`, `gen-json-schema`, `gen-python`
- **Main Schema**: `01_custodian_name_modular.yaml`

---

**Last Updated**: 2025-11-21
**Maintainer**: GLAM Data Extraction Project