glam/SESSION_SUMMARY_20251121_SCHEMA_AUTHORITY_COMPLETE.md
2025-11-21 22:12:33 +01:00

Session Summary: Schema Authority Documentation Complete

Session Date: 2025-11-21
Session Type: Documentation update (schema authority)
Status: COMPLETE


Session Overview

This session completed the final documentation task from the ISO 20275 migration by establishing clear schema authority guidelines for all AI agents working with the Heritage Custodian Ontology.

Problem Addressed

Issue: Without explicit documentation, agents might:

  • Edit RDF files directly (which are auto-generated)
  • Treat TypeDB schemas as authoritative (they're derived)
  • Modify UML diagrams without updating source schemas
  • Use outdated schema references from old schemas/ directory

Solution: Added "Schema Source of Truth" sections to agent documentation establishing LinkML schemas as the single authoritative source.


Changes Made

1. Updated AGENTS.md

Location: Root directory (/Users/kempersc/apps/glam/AGENTS.md)

Changes:

  • Added Rule 0: LinkML Schemas Are the Single Source of Truth (line 24)
  • Positioned before existing Rule 1 (ontology consultation)
  • Clearly establishes schema hierarchy: LinkML → RDF/TypeDB/UML
  • Documents regeneration workflow for schema changes
  • Lists forbidden practices (editing RDF directly, etc.)

Key Content:

### Rule 0: LinkML Schemas Are the Single Source of Truth

**MASTER SCHEMA LOCATION**: `schemas/20251121/linkml/`

The LinkML schema files are the **authoritative, canonical definition** 
of the Heritage Custodian Ontology...

2. Updated .opencode/agent/README.md

Location: Agent directory (/Users/kempersc/apps/glam/.opencode/agent/README.md)

Changes:

  • Added "🚨 Schema Source of Truth" section at top (lines 5-63)
  • Updated schema references from old schemas/ to schemas/20251121/linkml/
  • Updated schema version from v0.2.0 to v0.2.1 (ISO 20275 migration)
  • Updated agent schema mappings to new class names:
    • HeritageCustodian → CustodianAspect
    • Location → PlaceAspect
    • ChangeEvent → TemporalEvent
  • Updated validation commands with correct paths
  • Added schema features list (ISO 20275 codes, multi-aspect modeling, etc.)

Key Sections Added:

  1. Schema Authority Declaration:

    ## 🚨 Schema Source of Truth

    **MASTER SCHEMA LOCATION**: `schemas/20251121/linkml/`
  2. File Hierarchy:

    • Primary: 01_name_entity.yaml, 02_organization_observation_reconstruction.yaml
    • Derived: RDF (8 formats), TypeDB (TQL), UML (Mermaid), Examples (YAML)
  3. Workflow for Changes:

    1. EDIT LinkML schema
       ↓
    2. REGENERATE RDF formats (gen-owl + rdfpipe)
       ↓
    3. UPDATE TypeDB schema (manual translation)
       ↓
    4. UPDATE UML/Mermaid diagrams
       ↓
    5. VALIDATE example instances (linkml-validate)
  4. Why LinkML is Master:

    • Formal specification (type-safe, validation, cardinality)
    • Multi-format generation (RDF, JSON-LD, Python, SQL, GraphQL)
    • Version control (clear diffs, semantic versioning)
    • Ontology alignment (explicit class_uri mappings)
    • Documentation (rich inline docs)
  5. Agent Rules:

    • NEVER: Edit RDF directly, treat TypeDB as authoritative, modify UML diagrams without updating LinkML
    • ALWAYS: Refer to LinkML, update LinkML first, validate changes, document in YAML comments
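
The automated parts of the workflow above (steps 2 and 5) can be sketched as a small command planner. This is only an illustration, not the project's actual tooling: it assumes `gen-owl` writes Turtle to stdout, that `rdfpipe` accepts `-i`/`-o` format names, and that `linkml-validate` takes the schema via `-s`. The derived file names other than `.owl.ttl` and the subset of formats are hypothetical. Steps 3 and 4 are manual and therefore not scripted.

```python
from pathlib import PurePosixPath

SCHEMA_DIR = PurePosixPath("schemas/20251121/linkml")
RDF_DIR = PurePosixPath("schemas/20251121/rdf")

# Assumed mapping of file extension -> rdflib serializer name for `rdfpipe -o`.
# Only a subset of the 8 formats is shown; extensions are illustrative.
RDF_FORMATS = {"nt": "nt", "jsonld": "json-ld", "rdf": "xml"}


def plan_regeneration(schema_name: str, example_files: list[str]) -> list[str]:
    """Return shell commands for the regenerate and validate steps, in order.

    The commands are only planned, never executed, so the plan can be
    reviewed before anything under the RDF directory is overwritten.
    """
    schema = SCHEMA_DIR / f"{schema_name}.yaml"
    owl_ttl = RDF_DIR / f"{schema_name}.owl.ttl"

    # Step 2a: generate OWL (Turtle) from the LinkML source.
    commands = [f"gen-owl {schema} > {owl_ttl}"]

    # Step 2b: convert the Turtle file into the other serializations.
    for ext, fmt in RDF_FORMATS.items():
        target = RDF_DIR / f"{schema_name}.{ext}"
        commands.append(f"rdfpipe -i turtle -o {fmt} {owl_ttl} > {target}")

    # Step 5: validate each example instance against the LinkML schema.
    for example in example_files:
        commands.append(f"linkml-validate -s {schema} {example}")
    return commands
```

Calling `plan_regeneration("01_name_entity", [...])` yields the command strings in workflow order; an agent could print them for review or hand them to a shell once the flags have been checked against the installed tool versions.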

Documentation Hierarchy

Schema Authority Chain

LinkML YAML (authoritative)
  ├─ schemas/20251121/linkml/01_name_entity.yaml
  └─ schemas/20251121/linkml/02_organization_observation_reconstruction.yaml
     ↓
DERIVED Formats (not authoritative; do not edit RDF directly)
  ├─ RDF/OWL (8 formats, including TTL, NT, JSON-LD, RDF/XML, N3, TriG, TriX)
  ├─ TypeDB (TQL schema, manual translation)
  ├─ UML/Mermaid (diagrams, manual visualization)
  └─ Examples (YAML instances conforming to schema)
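
One way an agent could apply this chain before touching a file is a small path classifier. The sketch below is an assumption-laden illustration: the `linkml/`, `rdf/`, `uml/` and `examples/` subdirectories come from paths mentioned in this summary, while the location of TypeDB files is not stated, so they are recognised by their `.tql` suffix only. The function name `authority_of` is hypothetical.

```python
from pathlib import PurePosixPath

SCHEMA_ROOT = PurePosixPath("schemas/20251121")


def authority_of(path: str) -> str:
    """Classify a repository path as 'authoritative', 'derived', or 'other'."""
    p = PurePosixPath(path)
    try:
        relative = p.relative_to(SCHEMA_ROOT)
    except ValueError:
        return "other"  # outside the versioned schema directory
    if not relative.parts:
        return "other"  # the schema root itself

    top = relative.parts[0]
    if top == "linkml" and p.suffix == ".yaml":
        return "authoritative"  # LinkML source: edit here first
    if top in {"rdf", "uml", "examples"} or p.suffix == ".tql":
        return "derived"  # regenerate or re-sync from LinkML instead
    return "other"
```

A path classified as `derived` should prompt the agent to change the LinkML source and then regenerate or re-sync, rather than editing the file in isolation.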

Agent Instruction Documents

Primary:

  1. AGENTS.md - Root-level agent instructions (Rule 0 added)

    • Audience: All AI agents working on the project
    • Scope: General extraction guidelines, ontology consultation, schema authority
  2. .opencode/agent/README.md - Agent-specific documentation (updated)

    • Audience: OpenCode NLP extraction subagents
    • Scope: Agent invocation, schema reference, output formats

Supporting:

  • .opencode/agent/institution-extractor.md - Institution extraction agent
  • .opencode/agent/location-extractor.md - Location extraction agent
  • .opencode/agent/identifier-extractor.md - Identifier extraction agent
  • .opencode/agent/event-extractor.md - Event extraction agent
  • .opencode/agent/ontology-mapping-rules.md - Ontology consultation workflow

Impact on Future Work

What This Enables

  1. Clear Hierarchy: Agents know exactly which files are authoritative vs. derived
  2. Correct Workflow: Schema changes follow proper edit → regenerate → validate flow
  3. Prevents Errors: Explicit warnings against editing auto-generated files
  4. Consistent References: All schema paths updated to schemas/20251121/linkml/
  5. Version Awareness: Documentation reflects ISO 20275 migration (v0.2.1)

What Agents Should Do Now

When working with schemas, agents must:

DO:

  • Read LinkML YAML files for class definitions
  • Update LinkML first, then regenerate RDF
  • Validate changes with linkml-validate
  • Document schema changes in YAML comments
  • Reference schemas/20251121/linkml/ paths

DON'T:

  • Edit RDF files in schemas/20251121/rdf/ directly
  • Treat TypeDB .tql files as authoritative
  • Modify UML diagrams without updating LinkML source
  • Use old schema paths from schemas/ directory
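
The last DON'T can be checked mechanically. The following is a minimal sketch, assuming that any `schemas/` reference not followed by the dated `20251121` directory counts as outdated; the function name and the sample file name in the usage note are hypothetical.

```python
import re

# "schemas/" at the start of a path segment, not followed by the dated directory.
# The lookbehind skips names such as "old-schemas/" or "my_schemas/".
OLD_SCHEMA_REF = re.compile(r"(?<![\w/.-])schemas/(?!20251121\b)")


def find_outdated_refs(text: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for lines citing an old schemas/ path."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if OLD_SCHEMA_REF.search(line):
            hits.append((number, line.strip()))
    return hits
```

Run over the text of an agent document, this reports lines such as `Load schemas/heritage_custodian.yaml` while leaving `schemas/20251121/linkml/...` references alone.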

Files Modified This Session

Documentation Files

  1. AGENTS.md (lines 24-82)

    • Added Rule 0: LinkML Schemas Are the Single Source of Truth
    • 58 lines of schema authority documentation
  2. .opencode/agent/README.md (lines 5-63, plus scattered updates)

    • Added "🚨 Schema Source of Truth" section (58 lines)
    • Updated schema version v0.2.0 → v0.2.1
    • Updated schema references to schemas/20251121/linkml/
    • Updated agent class mappings to new names
    • Updated validation commands
    • Updated footer metadata
  3. SESSION_SUMMARY_20251121_SCHEMA_AUTHORITY_COMPLETE.md (this file)

    • Complete session documentation

Total Lines Changed

  • AGENTS.md: +58 lines
  • .opencode/agent/README.md: +58 lines + ~20 updates
  • Session summary (this file): +300 lines

Validation Checklist

Pre-Session Status

  • [x] ISO 20275 migration complete (6 tasks)
  • [x] Country guides complete (5 countries)
  • [x] RDF regeneration complete (8 formats, 1,427 triples)
  • [x] TypeDB schema updated
  • [x] Mermaid diagrams fixed
  • [ ] Schema authority documented ← THIS SESSION

Post-Session Status

  • [x] ISO 20275 migration complete
  • [x] Country guides complete
  • [x] RDF regeneration complete
  • [x] TypeDB schema updated
  • [x] Mermaid diagrams fixed
  • [x] Schema authority documented (COMPLETE)

Next Steps for Future Agents

Immediate Tasks (If Continuing)

  1. Test Migration Script with Real Data

    • Script: scripts/migrate_to_iso20275.py
    • Example: Rijksmuseum (Dutch KvK foundation)
    • Validate: Legal form code → ISO 20275 mapping
  2. Create Instance Examples

    • Location: schemas/20251121/examples/
    • Include: ISO 20275 legal form codes
    • Cover: Multiple aspects (place, custodian, legal, collections, people)
  3. Expand Country Guides

    • Current: Netherlands, France, Germany, Belgium, Italy
    • Next: Spain, Portugal, UK, Austria, Switzerland
    • Format: Markdown with ISO 20275 code tables

Optional Enhancements

  1. Validate RDF in Protégé

    • Load: schemas/20251121/rdf/02_organization_observation_reconstruction.owl.ttl
    • Run: HermiT reasoner to check consistency
    • Document: Any reasoning issues or missing constraints
  2. Add LinkML Validation Tests

    • Test: Schema compliance with LinkML metamodel
    • Test: Example instances validate against schema
    • Test: Generated RDF validates with ontology reasoners
  3. Create Visual Decision Trees

    • Guide: When to use each aspect (place vs. custodian vs. legal)
    • Guide: ISO 20275 code selection flowchart
    • Format: Mermaid diagrams

Technical Context

Schema Version Information

Current Version: v0.2.1
Migration: ISO 20275 legal form codes
Date: 2025-11-21

Key Changes from v0.2.0:

  • Removed LegalFormEnum (replaced with ISO 20275 pattern)
  • Added legal_form_code pattern: ^[A-Z0-9]{4}$
  • Added OrganizationName class (emic name standardization)
  • Reorganized schemas into schemas/20251121/ directory
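
As an illustration of what the `legal_form_code` pattern accepts, here is a minimal sketch using Python's `re` module. The sample values in the usage note are made up for the example, and the helper name is hypothetical. The pattern checks the shape of a code only; it does not confirm that the code exists in the ISO 20275 code list.

```python
import re

# Pattern copied from the schema: exactly four uppercase letters or digits.
LEGAL_FORM_CODE = re.compile(r"^[A-Z0-9]{4}$")


def is_valid_legal_form_code(code: str) -> bool:
    """Return True if the code has the four-character shape the schema requires."""
    # fullmatch is used because, in Python, `$` alone would also accept
    # a value with a trailing newline such as "AB12\n".
    return LEGAL_FORM_CODE.fullmatch(code) is not None
```

With this helper, `"AB12"` and `"8888"` pass, while `"ab12"` (lowercase) and `"AB123"` (five characters) are rejected.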

RDF Generation Statistics

Organization Schema (02_organization_observation_reconstruction):

  • Triples: 1,427 (up from 1,343)
  • Formats: 8 (including TTL, NT, JSON-LD, RDF/XML, N3, TriG, TriX)
  • Reasoning: OWL 2 DL (description logic)

Name Entity Schema (01_name_entity):

  • Triples: 463 (unchanged)
  • Formats: 8 (same as organization schema)

Total Dataset:

  • Triples: 1,890
  • Classes: 25+
  • Properties: 100+

References

Documentation Updated or Referenced This Session

  • AGENTS.md - Rule 0 added
  • .opencode/agent/README.md - Schema authority section added
  • schemas/20251121/RDF_GENERATION_SUMMARY.md - RDF generation process
  • schemas/20251121/uml/MERMAID_UPDATE_SUMMARY.md - Diagram fixes
  • SESSION_SUMMARY_20251121_ISO20275_COMPLETE.md - Migration completion
  • MIGRATION_CHECKLIST_ISO20275.md - Task checklist
  • docs/MIGRATION_GUIDE.md - Schema migration procedures

Schema Files

  • schemas/20251121/linkml/01_name_entity.yaml - Name Entity Hub Pattern
  • schemas/20251121/linkml/02_organization_observation_reconstruction.yaml - Organization Pattern


Session Statistics

Duration: ~30 minutes
Files Modified: 3
Lines Added: ~136
Lines Updated: ~20
Total Documentation: ~400 lines

Tasks Completed: 1 (schema authority documentation)
Tasks Remaining: 0 (all ISO 20275 migration tasks complete)


Conclusion

This session completed the final documentation task from the ISO 20275 migration by:

  1. Establishing LinkML schemas as single source of truth
  2. Documenting schema hierarchy (authoritative vs. derived)
  3. Providing clear workflow for schema changes
  4. Updating all agent documentation with correct schema paths
  5. Warning explicitly against editing auto-generated files

The Heritage Custodian Ontology schema authority is now clearly documented and ready for future agent work.

All agents working on this project should refer to:

  • AGENTS.md (Rule 0) - For schema authority principles
  • .opencode/agent/README.md - For agent-specific schema guidance
  • schemas/20251121/linkml/ - For authoritative schema definitions

Status: COMPLETE
Next Session: Optional enhancements (instance examples, Protégé validation, country guides)
Handoff: Ready for production use with clear schema authority documentation