Hyper-Modular Schema Structure
Version: 0.1.0
Date: 2025-11-21
Schema: Heritage Custodian Observation and Reconstruction Pattern
Overview
The Heritage Custodian schema uses a hyper-modular architecture in which every class, enum, and slot is defined in its own file. This gives fine-grained version control, lets developers work in parallel, and keeps each component small enough to maintain on its own.
Total Files: 78 YAML files
- Classes: 12 files (modules/classes/)
- Enums: 5 files (modules/enums/)
- Slots: 59 files (modules/slots/)
- Core Modules: 1 file (metadata.yaml)
- Main Schema: 1 file (01_custodian_name_modular.yaml)
Note: Aggregator files (enums_all.yaml, slots_all.yaml, classes_all.yaml) still exist but are not used by the main schema. They remain available for backward compatibility.
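The per-directory totals above can be re-checked from the command line. The following is a minimal sketch, assuming it is run against the schema root (the directory that contains modules/); the function name count_modules is hypothetical and not part of the project.

```shell
# Count the YAML files in each component directory under <root>/modules.
# Assumption: the legacy *_all.yaml aggregators sit directly in modules/,
# so they are not included in any of these counts.
count_modules() (
  root="${1:-.}"
  for kind in classes enums slots; do
    # tr strips the padding that some wc implementations add
    n=$(find "$root/modules/$kind" -maxdepth 1 -type f -name '*.yaml' | wc -l | tr -d ' ')
    printf '%s: %s\n' "$kind" "$n"
  done
)
# Usage: count_modules schemas/20251121/linkml
```

If the totals listed above are current, this should report 12 classes, 5 enums, and 59 slots.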
Directory Structure
schemas/20251121/linkml/
├── 01_custodian_name_modular.yaml   # Main schema (directly imports all 76 modules)
├── HYPER_MODULAR_STRUCTURE.md       # This file
├── SLOT_NAMING_CONVENTIONS.md       # Slot naming rules for range variants
│
└── modules/
    ├── metadata.yaml                # Schema metadata & namespace prefixes
    │
    ├── enums/                       # 5 enum definitions (all imported directly)
    │   ├── AgentTypeEnum.yaml
    │   ├── AppellationTypeEnum.yaml
    │   ├── LegalStatusEnum.yaml
    │   ├── ReconstructionActivityTypeEnum.yaml
    │   └── SourceDocumentTypeEnum.yaml
    │
    ├── slots/                       # 59 slot definitions (all imported directly)
    │   ├── id.yaml
    │   ├── created.yaml
    │   ├── modified.yaml
    │   ├── observed_name.yaml
    │   ├── was_revision_of.yaml     # May contain multiple range variants
    │   └── ... (54 more slot files) # (see SLOT_NAMING_CONVENTIONS.md)
    │
    ├── classes/                     # 12 class definitions (all imported directly)
    │   ├── Custodian.yaml
    │   ├── CustodianObservation.yaml
    │   ├── CustodianName.yaml
    │   ├── CustodianReconstruction.yaml
    │   ├── ReconstructionActivity.yaml
    │   ├── Agent.yaml
    │   ├── Identifier.yaml
    │   ├── Appellation.yaml
    │   ├── SourceDocument.yaml
    │   ├── ConfidenceMeasure.yaml
    │   ├── LanguageCode.yaml
    │   └── TimeSpan.yaml
    │
    └── [Legacy aggregators - not used by main schema]
        ├── enums_all.yaml           # Aggregator for backward compatibility
        ├── slots_all.yaml           # Aggregator for backward compatibility
        └── classes_all.yaml         # Aggregator for backward compatibility
Namespace Structure
All components use the https://nde.nl/ontology/hc/ base namespace:
| Component | Namespace Pattern | Example |
|---|---|---|
| Base | https://nde.nl/ontology/hc/ | Schema root |
| Classes | https://nde.nl/ontology/hc/class/{ClassName} | https://nde.nl/ontology/hc/class/Custodian |
| Enums | https://nde.nl/ontology/hc/enum/{EnumName} | https://nde.nl/ontology/hc/enum/LegalStatusEnum |
| Slots | https://nde.nl/ontology/hc/slot/{slot_name} | https://nde.nl/ontology/hc/slot/was_revision_of |
| Metadata | https://nde.nl/ontology/hc/metadata | Metadata module |
Prefixes (defined in modules/metadata.yaml):
prefixes:
  hc: https://nde.nl/ontology/hc/
  hc_class: https://nde.nl/ontology/hc/class/
  hc_enum: https://nde.nl/ontology/hc/enum/
  hc_slot: https://nde.nl/ontology/hc/slot/
Import Strategy
Direct Import Pattern
The main schema directly imports all 76 individual component files for maximum transparency and granularity:
# 01_custodian_name_modular.yaml
imports:
- linkml:types
- modules/metadata
# Enums (5 files)
- modules/enums/AgentTypeEnum
- modules/enums/AppellationTypeEnum
- modules/enums/LegalStatusEnum
- modules/enums/ReconstructionActivityTypeEnum
- modules/enums/SourceDocumentTypeEnum
# Slots (59 files)
- modules/slots/activity_type
- modules/slots/affiliation
# ... (57 more slot imports)
# Classes (12 files)
- modules/classes/Agent
- modules/classes/Appellation
# ... (10 more class imports)
Benefits of Direct Imports:
- ✅ Complete Transparency: Immediately see all schema dependencies
- ✅ Explicit Dependencies: No hidden imports through aggregators
- ✅ Selective Imports: Easy to comment out individual components for custom schemas
- ✅ Better IDE Support: Direct file references for navigation
- ✅ Clear Audit Trail: Git diffs show exactly which components changed
Note: Aggregator modules (enums_all.yaml, slots_all.yaml, classes_all.yaml) still exist for backward compatibility and can be used by downstream projects that prefer a simpler import structure.
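Because every component is imported by name, a file added to modules/ without a matching import line is silently left out of the schema. The sketch below lists such files. It assumes each import is written on its own line as "- modules/<kind>/<Name>", as in the excerpt above; the function name unimported_modules is hypothetical.

```shell
# Print module files that exist on disk but are not imported by the main schema.
# $1 = schema root (default: current directory)
# $2 = main schema file name (default: 01_custodian_name_modular.yaml)
unimported_modules() (
  cd "${1:-.}" || exit 1
  schema="${2:-01_custodian_name_modular.yaml}"
  # Collect the "modules/..." import paths, dropping list markers and comments
  imports=$(sed -n 's|^[[:space:]]*-[[:space:]]*\(modules/[^[:space:]#]*\).*$|\1|p' "$schema")
  for f in modules/classes/*.yaml modules/enums/*.yaml modules/slots/*.yaml; do
    [ -e "$f" ] || continue   # glob matched nothing
    # Imports omit the .yaml suffix, so compare against the stripped path
    printf '%s\n' "$imports" | grep -qxF "${f%.yaml}" || printf '%s\n' "$f"
  done
)
# Usage: unimported_modules schemas/20251121/linkml
```

No output means every class, enum, and slot file is referenced by the main schema.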
File Naming Conventions
Class Files
Pattern: {ClassName}.yaml (PascalCase)
Examples:
- Custodian.yaml
- CustodianObservation.yaml
- CustodianReconstruction.yaml
File structure:
id: https://nde.nl/ontology/hc/class/ClassName
name: ClassName
title: ClassName Class
imports:
  - linkml:types
  - OtherClass # If needed
classes:
  ClassName:
    class_uri: ontology:Class
    description: "..."
    slots:
      - slot1
      - slot2
Enum Files
Pattern: {EnumName}.yaml (PascalCase with "Enum" suffix)
Examples:
- LegalStatusEnum.yaml
- AgentTypeEnum.yaml
File structure:
id: https://nde.nl/ontology/hc/enum/EnumName
name: EnumName
enums:
  EnumName:
    description: "..."
    permissible_values:
      VALUE1:
        description: "..."
      VALUE2:
        description: "..."
Slot Files
Pattern: {slot_name}.yaml (snake_case)
Examples:
- legal_name.yaml
- was_revision_of.yaml
- observed_name.yaml
Special Case - Range Variants: See SLOT_NAMING_CONVENTIONS.md for handling multiple slots with the same ontological property but different ranges.
File structure:
id: https://nde.nl/ontology/hc/slot/slot_name
name: slot-name-slot
imports:
  - ../classes/RangeClass # If range is a class
slots:
  slot_name:
    slot_uri: ontology:property
    range: RangeType
    description: "..."
  # Optional: Range variants (same slot_uri, different range)
  slot_name-variant:
    slot_uri: ontology:property # SAME as base
    range: DifferentRangeType
    description: "..."
Maintenance Guidelines
Adding a New Class
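- Create file: modules/classes/{ClassName}.yaml
- Define class with namespace: https://nde.nl/ontology/hc/class/{ClassName}
- Add import to 01_custodian_name_modular.yaml, which imports every component directly (also add it to modules/classes_all.yaml if the legacy aggregator is kept in sync)
- Test schema generation: gen-owl 01_custodian_name_modular.yaml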
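The first two steps can be scripted. This is a minimal sketch, assuming the class file structure shown under File Naming Conventions; the helper name new_class_stub is hypothetical, and class_uri is left out on purpose because it depends on the ontology the class maps to.

```shell
# Write a skeleton class module and refuse to overwrite an existing one.
# $1 = ClassName (PascalCase), $2 = schema root (default: current directory)
new_class_stub() (
  name="$1"
  root="${2:-.}"
  [ -n "$name" ] || { echo "usage: new_class_stub ClassName [schema_root]" >&2; exit 2; }
  file="$root/modules/classes/$name.yaml"
  [ ! -e "$file" ] || { echo "refusing to overwrite $file" >&2; exit 1; }
  # The description is a placeholder to be filled in by the author
  cat > "$file" <<EOF
id: https://nde.nl/ontology/hc/class/$name
name: $name
title: $name Class
imports:
  - linkml:types
classes:
  $name:
    description: "TODO: describe $name"
EOF
  echo "created $file; remember to add its import and re-run gen-owl"
)
# Usage: new_class_stub Record schemas/20251121/linkml
```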
Adding a New Enum
- Create file: modules/enums/{EnumName}.yaml
- Define enum with namespace: https://nde.nl/ontology/hc/enum/{EnumName}
- Add import to 01_custodian_name_modular.yaml (also add it to modules/enums_all.yaml if the legacy aggregator is kept in sync)
- Test schema generation
Adding a New Slot
- Check whether an ontologically related slot exists: look for existing slots with the same slot_uri
- If EXISTS: Add range variant to existing file (see SLOT_NAMING_CONVENTIONS.md)
- If NEW: Create file modules/slots/{slot_name}.yaml
- Define slot with namespace: https://nde.nl/ontology/hc/slot/{slot_name}
- Add import to 01_custodian_name_modular.yaml, and to modules/slots_all.yaml if the legacy aggregator is kept in sync (only if new file)
- Test schema generation
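The first step, finding slots that already use a given slot_uri, is a plain text search. A minimal sketch, assuming each slot file writes slot_uri on its own line as in the template above; the function name slots_with_uri is hypothetical.

```shell
# List slot files that already declare the given slot_uri.
# $1 = CURIE to look for (for example prov:wasRevisionOf)
# $2 = schema root (default: current directory)
# Exit status is non-zero when no file matches.
slots_with_uri() (
  uri="$1"
  root="${2:-.}"
  # Require whitespace, a comment, or end of line after the CURIE so that
  # a longer property name sharing the same prefix is not reported
  grep -lE "^[[:space:]]*slot_uri:[[:space:]]*${uri}([[:space:]]|#|\$)" "$root"/modules/slots/*.yaml
)
# Usage: slots_with_uri prov:wasRevisionOf schemas/20251121/linkml
```

A match means the new range belongs in the file that is printed; no match means a new slot file is needed.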
Adding a Range Variant to Existing Slot
Example: Adding was_revision_of for Record class
- Open existing file: modules/slots/was_revision_of.yaml
- Add import for new range class:
    imports:
      - ../classes/CustodianReconstruction # Existing
      - ../classes/Record # NEW
- Add new slot variant:
    slots:
      was_revision_of:
        slot_uri: prov:wasRevisionOf
        range: CustodianReconstruction
        description: "..."
      was_revision_of-record: # NEW
        slot_uri: prov:wasRevisionOf
        range: Record
        description: "..."
- No change to aggregator needed (file already imported)
- Test schema generation
Validation and Testing
Validate Schema Structure
cd schemas/20251121/linkml  # run from the repository root
# Test OWL generation (validates imports and structure)
gen-owl 01_custodian_name_modular.yaml > /dev/null
# Test JSON Schema generation
gen-json-schema 01_custodian_name_modular.yaml > /dev/null
# Test Python dataclasses generation
gen-python 01_custodian_name_modular.yaml > /dev/null
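The three commands above can be wrapped in one loop that reports which generator failed. This is a sketch under the assumption that each generator takes the schema file as its only argument, as in the commands above; the function name run_generators is hypothetical.

```shell
# Run each generator against the schema and report per-generator status.
# $1 = schema file, remaining arguments = generator commands to run
# Generated output is discarded; error messages stay visible on stderr.
run_generators() (
  schema="$1"
  shift
  status=0
  for gen in "$@"; do
    if "$gen" "$schema" > /dev/null; then
      echo "ok   $gen"
    else
      echo "FAIL $gen"
      status=1
    fi
  done
  exit "$status"
)
# Usage: run_generators 01_custodian_name_modular.yaml gen-owl gen-json-schema gen-python
```

The non-zero exit status on any failure makes the function usable as a pre-commit or CI check.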
Check Namespace Consistency
# All class files should have hc/class/ namespace
grep -h "^id:" modules/classes/*.yaml | sort -u
# All enum files should have hc/enum/ namespace
grep -h "^id:" modules/enums/*.yaml | sort -u
# All slot files should have hc/slot/ namespace
grep -h "^id:" modules/slots/*.yaml | sort -u
Expected output (grep prints the whole matching line, so each URI keeps its id: prefix):
# Classes
id: https://nde.nl/ontology/hc/class/Agent
id: https://nde.nl/ontology/hc/class/Appellation
...
# Enums
id: https://nde.nl/ontology/hc/enum/AgentTypeEnum
id: https://nde.nl/ontology/hc/enum/AppellationTypeEnum
...
# Slots
id: https://nde.nl/ontology/hc/slot/activity_type
id: https://nde.nl/ontology/hc/slot/affiliation
...
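Reading the grep output by eye does not catch a file whose id points at the wrong component. A stricter check can compare each id with the URI implied by the file's directory and name. This is a minimal sketch, assuming the namespace patterns in the Namespace Structure table and an id line with no trailing comment; the function name check_ids is hypothetical.

```shell
# Report module files whose id does not match their directory and file name.
# $1 = schema root (default: current directory)
# Exit status is non-zero if any mismatch is found.
check_ids() (
  cd "${1:-.}" || exit 1
  base="https://nde.nl/ontology/hc"
  status=0
  # Each pair maps a namespace segment to its directory name
  for pair in class:classes enum:enums slot:slots; do
    ns="${pair%%:*}"
    dir="${pair#*:}"
    for f in modules/"$dir"/*.yaml; do
      [ -e "$f" ] || continue   # glob matched nothing
      name=$(basename "$f" .yaml)
      actual=$(sed -n 's/^id:[[:space:]]*//p' "$f" | head -n 1)
      expected="$base/$ns/$name"
      if [ "$actual" != "$expected" ]; then
        echo "$f: id is '$actual', expected '$expected'"
        status=1
      fi
    done
  done
  exit "$status"
)
# Usage: check_ids schemas/20251121/linkml
```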
Benefits of Hyper-Modular Structure
1. Granular Version Control
Each component has independent git history:
git log modules/classes/Custodian.yaml
git blame modules/slots/legal_form.yaml
2. Parallel Development
Multiple developers can work simultaneously without merge conflicts:
- Developer A edits CustodianObservation.yaml
- Developer B edits ReconstructionActivity.yaml
- No conflicts, both changes merge cleanly
3. Selective Imports
Specialized schemas can import only the components they need:
# Mini schema for observations only
imports:
- modules/metadata
- modules/classes/CustodianObservation
- modules/slots/observed_name
- modules/slots/source
4. Clear Ownership
One file = one concept = one maintainer:
- Custodian.yaml → CIDOC-CRM expert
- LegalStatusEnum.yaml → GLEIF ontology expert
- was_revision_of.yaml → PROV-O expert
5. Easier Code Review
Small, focused pull requests:
- ❌ "Update schema with 5 new classes" (monolithic, 500 lines)
- ✅ "Add TimeSpan class" (one file, 96 lines)
6. Better Documentation
Each file can have extensive inline documentation without cluttering others:
# CustodianReconstruction.yaml can have 200 lines of comments
# without making Identifier.yaml harder to read
7. IDE-Friendly
File tree navigation:
modules/classes/
├── Agent.yaml ← Easy to find
├── Custodian.yaml ← Alphabetically sorted
└── CustodianObservation.yaml
vs. monolithic:
heritage_custodian.yaml:3458 ← Where is CustodianObservation?
Migration from Consolidated Structure
Phase 1: Module Consolidation (Completed)
- ✅ Split monolithic schema into 9 modules
- ✅ Classes grouped by function (base, observation, reconstruction, etc.)
Phase 2: Hyper-Modularization (Completed 2025-11-21)
- ✅ Split all 12 classes into individual files
- ✅ Split all 5 enums into individual files
- ✅ Split all 59 slots into individual files
- ✅ Created 3 aggregator modules
- ✅ Updated all namespace URIs to nde.nl/ontology/hc/
- ✅ Validated OWL generation
Legacy Files (Can Be Deleted)
These consolidated module files are now obsolete:
- modules/base_classes.yaml → Replaced by classes/Custodian.yaml
- modules/observation_classes.yaml → Replaced by classes/CustodianObservation.yaml, classes/CustodianName.yaml
- modules/reconstruction_classes.yaml → Replaced by classes/CustodianReconstruction.yaml
- modules/provenance_classes.yaml → Replaced by classes/ReconstructionActivity.yaml, classes/Agent.yaml
- modules/supporting_classes.yaml → Replaced by 6 individual class files
- modules/enums.yaml → Replaced by enums_all.yaml + 5 individual files
- modules/slots.yaml → Replaced by slots_all.yaml + 59 individual files
Troubleshooting
Error: "Cannot find module X"
Cause: Import path incorrect or file missing
Solution:
- Check that the import line in the main schema (or in the aggregator, if one is used) has the correct file name
- Verify file exists: ls modules/classes/X.yaml
- Check id: in file matches import path
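To find every broken import at once rather than one error at a time, each import path can be resolved to a file. A minimal sketch, assuming imports are written one per line as "- modules/..." and resolve relative to the schema's directory with a .yaml suffix; the function name missing_imports is hypothetical.

```shell
# Print the imports in the main schema that do not resolve to a file on disk.
# $1 = schema root (default: current directory)
# $2 = main schema file name (default: 01_custodian_name_modular.yaml)
missing_imports() (
  cd "${1:-.}" || exit 1
  schema="${2:-01_custodian_name_modular.yaml}"
  # Only "modules/..." imports are checked; linkml:types is not a local file
  sed -n 's|^[[:space:]]*-[[:space:]]*\(modules/[^[:space:]#]*\).*$|\1|p' "$schema" |
  while IFS= read -r imp; do
    [ -f "$imp.yaml" ] || echo "missing: $imp.yaml"
  done
)
# Usage: missing_imports schemas/20251121/linkml
```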
Error: "Duplicate class definition"
Cause: Class defined in multiple files and both imported
Solution:
- Remove class from old consolidated module
- Ensure the main schema (or the aggregator, if one is used) imports only the new individual file
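Locating the duplicate is a matter of finding every file that defines the name. The sketch below is a text search, not a YAML parser: it assumes definitions sit at two-space indentation directly under classes:, enums:, or slots:, as in the file templates above. The function name defining_files is hypothetical.

```shell
# List every YAML file under modules/ that defines the given name as a
# top-level entry (two-space indent, name followed by a colon).
# $1 = class, enum, or slot name, $2 = schema root (default: current directory)
defining_files() (
  name="$1"
  root="${2:-.}"
  grep -rlE --include='*.yaml' "^  ${name}:[[:space:]]*(#.*)?\$" "$root/modules"
)
# Usage: defining_files Custodian schemas/20251121/linkml
```

More than one line of output means the name is defined in several files; whichever of them are imported together will trigger the error.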
Warning: "Multiple owl types"
Cause: Range conflicts (e.g., slot used as both object property and datatype property)
Solution: Expected for polymorphic slots with any_of. Can be ignored if intentional.
References
- Slot Naming Conventions: SLOT_NAMING_CONVENTIONS.md
- LinkML Documentation: https://linkml.io/
- Schema Validation: gen-owl, gen-json-schema, gen-python
- Main Schema: 01_custodian_name_modular.yaml
Last Updated: 2025-11-21
Maintainer: GLAM Data Extraction Project