glam/schemas/20251121/IMPORT_PATTERN_MIGRATION.md
2025-11-21 22:12:33 +01:00

Migration Guide: Aggregator Pattern → Direct Imports

Date: 2025-11-21
Schema Version: 0.1.0
Affected File: 01_custodian_name_modular.yaml


Overview

The Heritage Custodian LinkML schema now imports each module directly instead of going through aggregator files. The change makes every dependency visible in the main schema, which simplifies maintenance, debugging, and IDE navigation.


What Changed

Before: Aggregator Pattern (Pre-2025-11-21)

# 01_custodian_name_modular.yaml (old)

imports:
  - linkml:types
  - modules/metadata
  - modules/enums_all      # Aggregator importing 5 enums
  - modules/slots_all      # Aggregator importing 59 slots
  - modules/classes_all    # Aggregator importing 12 classes

Total imports: 5 (linkml:types + metadata + 3 aggregators)
Components loaded: 76 (5 enums + 59 slots + 12 classes) - loaded indirectly

After: Direct Import Pattern (2025-11-21+)

# 01_custodian_name_modular.yaml (new)

imports:
  - linkml:types
  - modules/metadata
  
  # Enums (5 files)
  - modules/enums/AgentTypeEnum
  - modules/enums/AppellationTypeEnum
  - modules/enums/LegalStatusEnum
  - modules/enums/ReconstructionActivityTypeEnum
  - modules/enums/SourceDocumentTypeEnum
  
  # Slots (59 files)
  - modules/slots/activity_type
  - modules/slots/affiliation
  # ... (57 more slot imports)
  
  # Classes (12 files)
  - modules/classes/Agent
  - modules/classes/Appellation
  # ... (10 more class imports)

Total imports: 78 (linkml:types + metadata + 76 individual modules)
Components loaded: 76 - loaded directly and explicitly
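The totals above can be checked mechanically. The following is a minimal sketch in standard-library Python, not part of the schema tooling: `parse_imports` and `count_by_kind` are hypothetical helpers, and the parsing is line-based (one `- entry` per line) rather than a full YAML parse. The sample text is abridged, so its counts are smaller than the real 78.

```python
from collections import Counter

KINDS = ("enums", "slots", "classes")


def parse_imports(schema_text: str) -> list[str]:
    """Return the entries of the `imports:` list (simplified, line-based)."""
    entries, in_imports = [], False
    for raw in schema_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and indentation
        if not line:
            continue
        if line.startswith("- "):
            if in_imports:
                entries.append(line[2:].strip())
        else:
            # Any other non-empty line is treated as a new YAML key.
            in_imports = line == "imports:"
    return entries


def count_by_kind(imports: list[str]) -> Counter:
    """Count imports per module directory; everything else is 'other'."""
    counts = Counter()
    for entry in imports:
        kind = next((k for k in KINDS if f"modules/{k}/" in entry), "other")
        counts[kind] += 1
    return counts


# Abridged sample in the layout of the new main schema.
SAMPLE = """\
imports:
  - linkml:types
  - modules/metadata
  # Enums
  - modules/enums/AgentTypeEnum
  # Slots
  - modules/slots/activity_type
  - modules/slots/affiliation
  # Classes
  - modules/classes/Agent
"""

counts = count_by_kind(parse_imports(SAMPLE))
total = sum(counts.values())
```

Run against the full main schema, the same function would be expected to report 2 other, 5 enum, 59 slot, and 12 class imports, provided the file matches the counts listed above.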


Why We Changed

Problems with Aggregators

  1. Hidden Dependencies: Importing enums_all didn't show which specific enums were being used
  2. No Selective Imports: Couldn't easily exclude specific components for custom schemas
  3. Debugging Difficulty: Errors referenced aggregator files, not the actual problem module
  4. Poor IDE Support: IDEs couldn't navigate directly to component files
  5. Unclear Audit Trail: Git diffs showed aggregator changes, not individual component additions/removals

Benefits of Direct Imports

  1. Complete Transparency: Every dependency is immediately visible in the main schema
  2. Explicit Dependencies: The main schema names each enum, slot, and class it loads, with no indirection through aggregator files
  3. Selective Customization: Easy to comment out unwanted components for specialized schemas
  4. Better IDE Navigation: Direct file references enable "Go to Definition" in IDEs
  5. Clear Git Diffs: Addition/removal of components is explicitly shown in version control
  6. Easier Debugging: Error messages point directly to problematic module files

Migration Steps

For Schema Consumers (Using the Schema)

No action required! The schema semantics haven't changed - only the import structure.

If you're importing the schema into your own LinkML schemas:

# Your custom schema

imports:
  - path/to/01_custodian_name_modular  # ← No change needed (LinkML imports omit the .yaml suffix)

For Schema Maintainers (Modifying the Schema)

Adding a New Class

Old workflow (aggregator pattern):

  1. Create modules/classes/NewClass.yaml
  2. Edit modules/classes_all.yaml to add - classes/NewClass
  3. Main schema automatically picks it up

New workflow (direct imports):

  1. Create modules/classes/NewClass.yaml
  2. Edit 01_custodian_name_modular.yaml to add - modules/classes/NewClass under # Classes section

Example:

# 01_custodian_name_modular.yaml

imports:
  # ... existing imports ...
  
  # Classes (13 files)  ← Update count
  - modules/classes/Agent
  - modules/classes/Appellation
  - modules/classes/NewClass  # ← Add here
  - modules/classes/Custodian
  # ...
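The two manual steps (add the import line, update the count in the section comment) are easy to get out of sync. A minimal sketch of a helper that does both at once is shown below. `add_import` is a hypothetical function, not part of LinkML; it assumes the section comment has the form `# Classes (N files)`, that the imports of a section follow it as one contiguous run of `- entry` lines, and that entries are kept in alphabetical order (an assumption based on the listings in this guide).

```python
import re

# Matches section comments such as "  # Classes (12 files)".
HEADER = re.compile(r"^(\s*)# (\w+) \((\d+) files\)")


def add_import(schema_text: str, section: str, entry: str) -> str:
    """Insert `entry` under `# <section> (N files)` and refresh the count."""
    lines = schema_text.splitlines()
    start, indent = None, ""
    for i, line in enumerate(lines):
        match = HEADER.match(line)
        if match and match.group(2) == section:
            start, indent = i, match.group(1)
            break
    if start is None:
        raise ValueError(f"no '# {section} (N files)' comment found")

    # Collect the contiguous run of "- entry" lines below the header.
    end = start + 1
    while end < len(lines) and lines[end].strip().startswith("- "):
        end += 1
    entries = [line.strip()[2:].strip() for line in lines[start + 1:end]]
    if entry in entries:
        raise ValueError(f"{entry} is already imported")

    entries = sorted(entries + [entry])  # assumption: alphabetical order
    header = f"{indent}# {section} ({len(entries)} files)"
    body = [f"{indent}- {e}" for e in entries]
    return "\n".join(lines[:start] + [header] + body + lines[end:]) + "\n"


SCHEMA = """\
imports:
  - linkml:types
  # Classes (2 files)
  - modules/classes/Agent
  - modules/classes/Custodian
"""

updated = add_import(SCHEMA, "Classes", "modules/classes/Appellation")
```

The helper refuses to add an entry twice, which also guards against the duplicate-import situation described under Troubleshooting.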

Adding a New Enum

New workflow:

  1. Create modules/enums/NewEnum.yaml
  2. Edit 01_custodian_name_modular.yaml to add - modules/enums/NewEnum under # Enums section
  3. Update comment count: # Enums (6 files) ← increment

Adding a New Slot

New workflow:

  1. Create modules/slots/new_slot.yaml
  2. Edit 01_custodian_name_modular.yaml to add - modules/slots/new_slot under # Slots section
  3. Update comment count: # Slots (60 files) ← increment

Removing a Component

New workflow:

  1. Comment out or delete the import line in 01_custodian_name_modular.yaml
  2. Optionally delete the module file
  3. Update comment count

Example:

# Deprecating a slot

# Slots (58 files)  ← Update count
- modules/slots/activity_type
# - modules/slots/deprecated_slot  ← Comment out or delete
- modules/slots/affiliation
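Because the counts live in comments, nothing enforces them. The sketch below compares each `# Name (N files)` comment with the number of active import lines beneath it. `section_counts` and `stale_counts` are hypothetical helpers, and the comment format is assumed to match the listings in this guide; commented-out imports are not counted.

```python
import re

HEADER = re.compile(r"^\s*# (\w+) \((\d+) files\)")


def section_counts(schema_text: str) -> dict[str, tuple[int, int]]:
    """Map each section to (count declared in comment, imports actually listed)."""
    result: dict[str, list[int]] = {}
    current = None
    for line in schema_text.splitlines():
        stripped = line.strip()
        match = HEADER.match(line)
        if match:
            current = match.group(1)
            result[current] = [int(match.group(2)), 0]
        elif current and stripped.startswith("- "):
            result[current][1] += 1
        elif current and stripped and not stripped.startswith("#"):
            current = None  # a new YAML key ends the imports list
    return {name: (declared, actual) for name, (declared, actual) in result.items()}


def stale_counts(schema_text: str) -> dict[str, tuple[int, int]]:
    """Return only the sections whose declared and actual counts differ."""
    return {
        name: pair
        for name, pair in section_counts(schema_text).items()
        if pair[0] != pair[1]
    }


# The slot was commented out, but the comment still says 3.
SAMPLE = """\
imports:
  # Slots (3 files)
  - modules/slots/activity_type
  # - modules/slots/deprecated_slot
  - modules/slots/affiliation
  # Classes (2 files)
  - modules/classes/Agent
  - modules/classes/Appellation
"""

problems = stale_counts(SAMPLE)
```

An empty result means every section comment agrees with the imports listed under it.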

Backward Compatibility

Aggregator Files Status

The aggregator modules still exist and are fully functional:

  • modules/enums_all.yaml: still maintained
  • modules/slots_all.yaml: still maintained
  • modules/classes_all.yaml: still maintained

Use aggregators if:

  • You prefer a simpler import structure (5 imports vs 78)
  • You're creating quick prototypes
  • You want all components without listing them

Example - Using aggregators in a custom schema:

# my_custom_schema.yaml

id: https://example.org/my_schema
name: my-custom-schema

imports:
  - linkml:types
  - ../heritage_custodian/modules/metadata
  - ../heritage_custodian/modules/enums_all    # ← Still works!
  - ../heritage_custodian/modules/slots_all
  - ../heritage_custodian/modules/classes_all
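If the aggregators are kept alongside the direct imports, the two lists can drift apart. The sketch below compares one aggregator's entries with the main schema's imports of the same kind. `normalise` and `parity_report` are hypothetical helpers; the aggregator entry form (`classes/NewClass`) is taken from the old workflow described earlier, and the input lists here are illustrative rather than copied from the real files.

```python
def normalise(entry: str) -> str:
    """Reduce an import path to its last two segments, e.g. 'classes/Agent'."""
    return "/".join(entry.split("/")[-2:])


def parity_report(aggregator: list[str], direct: list[str], kind: str) -> dict[str, list[str]]:
    """List modules of `kind` that appear on only one side."""
    agg = {normalise(e) for e in aggregator}
    main = {normalise(e) for e in direct if normalise(e).startswith(f"{kind}/")}
    return {
        "only_in_aggregator": sorted(agg - main),
        "only_in_main": sorted(main - agg),
    }


# Illustrative inputs: classes_all lists two classes, the main schema
# imports one of them plus a class that was never added to the aggregator.
classes_all = ["classes/Agent", "classes/Appellation"]
main_imports = [
    "linkml:types",
    "modules/metadata",
    "modules/classes/Agent",
    "modules/classes/StaffRole",
]

report = parity_report(classes_all, main_imports, "classes")
```

Both lists in the report are empty when the aggregator and the main schema load the same set of modules.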

Main Schema Uses Direct Imports

The main schema (01_custodian_name_modular.yaml) no longer uses aggregators, but you can still use them in your own custom schemas.


Validation

Schema Still Validates

Both old and new import patterns produce identical OWL output:

# Validate schema
$ cd schemas/20251121/linkml
$ gen-owl 01_custodian_name_modular.yaml

# ✅ Generates OWL successfully
# Only benign warnings (namespace mapping)

RDF Generation

All 8 RDF formats regenerated successfully:

$ cd schemas/20251121/linkml
$ gen-owl -f ttl 01_custodian_name_modular.yaml > ../rdf/01_custodian_name.owl.ttl
$ cd ../rdf
$ rdfpipe 01_custodian_name.owl.ttl -o nt > 01_custodian_name.nt
$ rdfpipe 01_custodian_name.owl.ttl -o json-ld > 01_custodian_name.jsonld
# ... (5 more formats)

Result: 8 RDF files (Turtle, N-Triples, JSON-LD, RDF/XML, N3, TriG, TriX, N-Quads) - all formats regenerated from new schema.


Examples

Example 1: Selective Import for Custom Schema

Scenario: You want a minimal schema with only CustodianObservation and SourceDocument classes.

Using direct imports (easy):

# minimal_schema.yaml

id: https://example.org/minimal_custodian
name: minimal-custodian-schema

imports:
  - linkml:types
  - ../heritage_custodian/modules/metadata
  
  # Only import what we need
  - ../heritage_custodian/modules/classes/CustodianObservation
  - ../heritage_custodian/modules/classes/SourceDocument
  - ../heritage_custodian/modules/slots/observed_name
  - ../heritage_custodian/modules/slots/source
  - ../heritage_custodian/modules/enums/SourceDocumentTypeEnum

Using aggregators (imports everything):

# You'd get all 76 components even if you only need 5
imports:
  - ../heritage_custodian/modules/enums_all    # Gets all 5 enums
  - ../heritage_custodian/modules/slots_all    # Gets all 59 slots
  - ../heritage_custodian/modules/classes_all  # Gets all 12 classes

Example 2: Extending the Schema

Scenario: Add a new StaffRole class for tracking personnel.

Step 1: Create the class module

# modules/classes/StaffRole.yaml

id: https://nde.nl/ontology/hc/class/StaffRole
name: StaffRole
title: Staff Role Class

imports:
  - linkml:types

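classes:
  StaffRole:
    # pico: must resolve to a declared prefix
    # (in this module or in the schema that imports it)
    class_uri: pico:PersonObservation
    description: "Staff member role at a heritage custodian institution"
    # Every slot listed here needs its own module under modules/slots/,
    # imported by the main schema
    slots:
      - agent_name
      - affiliation
      - role_title
      - employment_period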

Step 2: Update main schema

# 01_custodian_name_modular.yaml

imports:
  # ... existing imports ...
  
  # Classes (13 files)  ← Increment count
  - modules/classes/Agent
  - modules/classes/Appellation
  - modules/classes/ConfidenceMeasure
  - modules/classes/Custodian
  - modules/classes/CustodianName
  - modules/classes/CustodianObservation
  - modules/classes/CustodianReconstruction
  - modules/classes/Identifier
  - modules/classes/LanguageCode
  - modules/classes/ReconstructionActivity
  - modules/classes/SourceDocument
  - modules/classes/StaffRole      # ← Add here
  - modules/classes/TimeSpan

Step 3: Regenerate RDF

$ gen-owl -f ttl 01_custodian_name_modular.yaml > ../rdf/01_custodian_name.owl.ttl

Troubleshooting

"Module not found" error

Error:

FileNotFoundError: [Errno 2] No such file or directory: 'modules/classes/NewClass.yaml'

Solution: Check spelling and file path. Module files must exist before importing.
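A quick pre-flight check can catch this before running a generator. The sketch below assumes, in line with the error message above, that a local import `modules/classes/NewClass` resolves to `modules/classes/NewClass.yaml` relative to the schema directory. `missing_modules` is a hypothetical helper, and prefixed imports such as `linkml:types` are skipped because they are not local files.

```python
import tempfile
from pathlib import Path


def missing_modules(imports: list[str], base_dir: str) -> list[str]:
    """Return local imports that have no matching .yaml file under base_dir."""
    base = Path(base_dir)
    missing = []
    for entry in imports:
        if ":" in entry:  # e.g. linkml:types, resolved by LinkML itself
            continue
        if not (base / f"{entry}.yaml").is_file():
            missing.append(entry)
    return missing


# Demonstration in a throwaway directory with a single module file.
with tempfile.TemporaryDirectory() as tmp:
    classes_dir = Path(tmp) / "modules" / "classes"
    classes_dir.mkdir(parents=True)
    (classes_dir / "Agent.yaml").write_text("classes:\n  Agent: {}\n")

    result = missing_modules(
        ["linkml:types", "modules/classes/Agent", "modules/classes/NewClass"],
        tmp,
    )
```

In this demonstration only `modules/classes/NewClass` is reported, since its file was never created.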

"Duplicate import" warning

Warning:

WARNING: Module 'modules/classes/Agent' imported multiple times

Solution: Check for duplicate import lines in 01_custodian_name_modular.yaml. Each module should be imported only once.
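With 78 import lines, duplicates are easier to find with a script than by eye. A minimal sketch, assuming the import entries have already been read into a list (`duplicate_imports` is a hypothetical helper):

```python
from collections import Counter


def duplicate_imports(imports: list[str]) -> list[str]:
    """Return every import entry that appears more than once, sorted."""
    counts = Counter(imports)
    return sorted(entry for entry, n in counts.items() if n > 1)


imports = [
    "linkml:types",
    "modules/classes/Agent",
    "modules/classes/Appellation",
    "modules/classes/Agent",  # accidental repeat
]

dupes = duplicate_imports(imports)
```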

Schema validation fails after adding import

Error:

linkml.utils.schemaview.SchemaView: Unable to find class 'NewClass'

Solution:

  1. Verify the module file exists at the correct path
  2. Check that the class is defined in the classes: section of the module
  3. Ensure the class name matches exactly (case-sensitive)
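Checks 2 and 3 can be sketched as a small script. It relies on the convention used throughout this guide that `modules/classes/NewClass.yaml` defines a class named `NewClass`. `classes_defined` and `defines_expected_class` are hypothetical helpers, and the parsing is line-based: it only recognises class names written as a bare `Name:` key directly under `classes:`.

```python
def classes_defined(module_text: str) -> list[str]:
    """Names declared directly under the top-level `classes:` key."""
    names, in_classes, child_indent = [], False, None
    for raw in module_text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop trailing comments
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        if indent == 0:
            # A new top-level key starts; remember whether it is `classes:`.
            in_classes = line.strip() == "classes:"
            child_indent = None
            continue
        if in_classes:
            if child_indent is None:
                child_indent = indent  # indentation of the class names
            if indent == child_indent and line.endswith(":"):
                names.append(line.strip()[:-1])
    return names


def defines_expected_class(import_entry: str, module_text: str) -> bool:
    """True if the module defines a class named like its file (case-sensitive)."""
    expected = import_entry.rsplit("/", 1)[-1]
    return expected in classes_defined(module_text)


MODULE = """\
id: https://nde.nl/ontology/hc/class/StaffRole
name: StaffRole
imports:
  - linkml:types
classes:
  StaffRole:
    class_uri: pico:PersonObservation
    slots:
      - agent_name
"""

ok = defines_expected_class("modules/classes/StaffRole", MODULE)
wrong_case = defines_expected_class("modules/classes/Staffrole", MODULE)
```

The second call fails because the comparison is case-sensitive, which mirrors the third check in the list above.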

Changes not reflected in generated OWL

Issue: Modified a module but OWL output unchanged.

Solution: Regenerate OWL explicitly:

$ gen-owl -f ttl 01_custodian_name_modular.yaml > ../rdf/01_custodian_name.owl.ttl

LinkML generators run only when invoked, so the generated files stay unchanged until you rerun gen-owl after editing an imported module.
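One way to notice a stale output is to compare file modification times. The sketch below is a generic staleness check, not a LinkML feature: `is_stale` is a hypothetical helper, and the demonstration sets timestamps by hand in a temporary directory instead of running `gen-owl`.

```python
import os
import tempfile
from pathlib import Path


def is_stale(output_path, source_paths) -> bool:
    """True if the output is missing or older than any of its source files."""
    output = Path(output_path)
    if not output.exists():
        return True
    output_mtime = output.stat().st_mtime
    return any(Path(src).stat().st_mtime > output_mtime for src in source_paths)


with tempfile.TemporaryDirectory() as tmp:
    module = Path(tmp) / "Agent.yaml"
    owl = Path(tmp) / "01_custodian_name.owl.ttl"
    module.write_text("classes:\n  Agent: {}\n")
    owl.write_text("# generated\n")

    os.utime(owl, (1_000, 1_000))     # pretend the OWL file was generated first
    os.utime(module, (2_000, 2_000))  # ...and the module was edited afterwards
    stale_after_edit = is_stale(owl, [module])

    os.utime(owl, (3_000, 3_000))     # pretend the generator was rerun
    stale_after_regen = is_stale(owl, [module])
```

In practice the source list would cover the main schema plus every file under modules/, and a stale result would be the cue to rerun the gen-owl command shown above.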


Timeline

| Date | Event |
|------|-------|
| 2025-11-21 | Migration to direct imports completed |
| 2025-11-21 | All 8 RDF formats regenerated |
| 2025-11-21 | Documentation updated (README, HYPER_MODULAR_STRUCTURE.md) |
| 2025-11-21 | Legacy aggregators archived (still available for backward compatibility) |

References

  • Main Schema: schemas/20251121/linkml/01_custodian_name_modular.yaml
  • Architecture Guide: .opencode/HYPER_MODULAR_STRUCTURE.md
  • Slot Naming Rules: .opencode/SLOT_NAMING_CONVENTIONS.md
  • RDF Formats: schemas/20251121/rdf/ (8 serialization formats)

Questions?

For questions about this migration, see:

  1. Technical Details: .opencode/HYPER_MODULAR_STRUCTURE.md
  2. Module Organization: Browse modules/classes/, modules/enums/, modules/slots/
  3. Examples: schemas/20251121/examples/ (LinkML instance files)

Migration Status: COMPLETE - Production ready as of 2025-11-21