glam/MIGRATION_CHECKLIST_ISO20275.md
2025-11-21 22:12:33 +01:00

ISO 20275 Migration Checklist

Status: COMPLETE (2025-11-21)

Pre-Migration Phase

  • Research ISO 20275 standard and GLEIF code list
  • Identify schema files requiring updates
  • Plan migration strategy (enum → pattern validation)
  • Create project timeline and task breakdown

Phase 1: Schema Updates

  • Remove LegalFormEnum from LinkML schema
  • Add legal_form slot with pattern: "^[A-Z0-9]{4}$"
  • Add rich documentation and editorial notes
  • Cross-reference GLEIF ELF code list
  • Validate LinkML schema syntax
  • Fix remaining enum references (line 244 slot_usage bug)
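
As a rough illustration of what the `legal_form` pattern accepts and rejects, the check can be reproduced in Python. This is a minimal sketch, not the project's validator: the helper name `is_valid_elf_code` and the sample values are hypothetical, and the pattern only tests the shape of a code, not whether it appears in the GLEIF ELF code list.

```python
import re

# Pattern copied from the legal_form slot definition in the checklist.
ELF_CODE_PATTERN = re.compile(r"^[A-Z0-9]{4}$")


def is_valid_elf_code(value) -> bool:
    """Return True if value is a string shaped like an ISO 20275 ELF code.

    fullmatch() is used so that a trailing newline does not slip through,
    which plain match() with '$' would allow.
    """
    return isinstance(value, str) and ELF_CODE_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    # Arbitrary sample values chosen only to exercise the pattern.
    for candidate in ["AB12", "ab12", "ABCDE", "AB1", "AB12\n", None]:
        print(repr(candidate), is_valid_elf_code(candidate))
```

A value that passes this check may still be an unassigned code, so a lookup against the published ELF list would be a separate step.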

Phase 2: Documentation

Country-Specific Guides

  • Netherlands (NL_LEGAL_FORMS.md) - 340 codes
  • France (FR_LEGAL_FORMS.md) - 320 codes
  • Germany (DE_LEGAL_FORMS.md) - 280 codes
  • United Kingdom (GB_LEGAL_FORMS.md) - 260 codes
  • United States (US_LEGAL_FORMS.md) - 150 codes

Migration Documentation

  • MIGRATION_GUIDE.md - Complete step-by-step guide
  • MIGRATION_QUICK_REFERENCE.md - One-page cheat sheet
  • enum_to_iso20275_mapping.csv - Conversion table
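
A conversion table like `enum_to_iso20275_mapping.csv` could be loaded into a dictionary along the following lines. The column names `enum_value` and `elf_code`, the function name `load_mapping`, and the sample rows are assumptions made for illustration; the real file's layout and codes may differ.

```python
import csv
import io

# Hypothetical sample content. The codes are placeholders, not real ELF codes.
SAMPLE_CSV = """enum_value,elf_code
STICHTING,AAA1
VERENIGING,AAA2
"""


def load_mapping(handle) -> dict:
    """Read rows of (old enum value, ISO 20275 code) into a lookup dict."""
    reader = csv.DictReader(handle)
    return {row["enum_value"].strip(): row["elf_code"].strip() for row in reader}


if __name__ == "__main__":
    mapping = load_mapping(io.StringIO(SAMPLE_CSV))
    print(mapping)
```

For a file on disk, the same function can be called with `open(path, newline="", encoding="utf-8")` as the handle.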

Phase 3: TypeDB Schema

  • Add OrganizationName entity (subclass)
  • Add name-succession relation
  • Add current-name-inference rule
  • Update attribute definitions for legal_form
  • Validate TypeQL syntax

Phase 4: Migration Infrastructure

Migration Script

  • Create migrate_legal_form_to_iso20275.py
  • Implement YAML/JSON parsing
  • Add ISO 20275 validation
  • Add dry-run mode
  • Add country-specific mapping support
  • Add comprehensive error handling
  • Add progress logging
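
The sketch below shows one way the listed features (mapping, validation, dry-run mode, error collection) can fit together. It is not the contents of `migrate_legal_form_to_iso20275.py`: the function name `migrate_records`, the record layout (a list of dicts with a `legal_form` key), and the sample mapping are all assumptions.

```python
import re

ELF_CODE = re.compile(r"[A-Z0-9]{4}")


def migrate_records(records, mapping, dry_run=True):
    """Plan (and optionally apply) enum -> ISO 20275 replacements.

    Returns (changes, errors). With dry_run=True the records are left
    untouched and only the planned changes are reported.
    """
    changes, errors = [], []
    for index, record in enumerate(records):
        old = record.get("legal_form")
        if not isinstance(old, str):
            errors.append((index, "missing or non-string legal_form"))
            continue
        if old in mapping:
            new = mapping[old]
            if not ELF_CODE.fullmatch(new):
                errors.append((index, f"mapping target {new!r} is not a valid code"))
                continue
            changes.append((index, old, new))
            if not dry_run:
                record["legal_form"] = new
        elif ELF_CODE.fullmatch(old):
            continue  # already looks like an ISO 20275 code
        else:
            errors.append((index, f"no mapping for {old!r}"))
    return changes, errors


if __name__ == "__main__":
    sample = [{"legal_form": "STICHTING"}, {"legal_form": "AB12"}, {"name": "x"}]
    print(migrate_records(sample, {"STICHTING": "AAA1"}, dry_run=True))
```

The mapping is consulted before the pattern check so that an old enum value that happens to be four uppercase characters is still converted rather than mistaken for a migrated code.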

Test Suite

  • Create test_legal_form_migration.py
  • Unit tests: enum → ISO 20275 mapping (5 tests)
  • Integration tests: full file migration (8 tests)
  • Validation tests: pattern compliance (4 tests)
  • Edge case tests: invalid codes, missing fields (3 tests)
  • Run all tests and verify passing

Phase 5: RDF Regeneration

  • Regenerate OWL/Turtle format (gen-owl)
  • Convert to 7 additional formats (rdfpipe)
    • Turtle (.ttl)
    • N-Triples (.nt)
    • JSON-LD (.jsonld)
    • RDF/XML (.rdf)
    • N3 (.n3)
    • TriG (.trig)
    • TriX (.trix)
  • Validate triple counts (1,427 across all formats)
  • Verify pattern restrictions in OWL output
  • Verify OrganizationName class in RDF
  • Update RDF_GENERATION_SUMMARY.md
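
The triple-count check across serializations could be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, `rdflib` must be installed for `count_triples` to run, and the file names in the usage example are placeholders. The expected count of 1,427 comes from the checklist item above.

```python
EXPECTED_TRIPLES = 1427  # figure taken from the checklist


def count_triples(path: str) -> int:
    """Parse one RDF file with rdflib and return its triple count."""
    from rdflib import Graph  # third-party; imported lazily on first use

    graph = Graph()
    graph.parse(path)  # rdflib infers the format from the file extension
    # Note: quad formats such as TriG or TriX may need a Dataset rather
    # than a plain Graph for the count to include named graphs.
    return len(graph)


def find_count_mismatches(counts: dict, expected: int = EXPECTED_TRIPLES) -> dict:
    """Return only the entries whose triple count differs from expected."""
    return {name: n for name, n in counts.items() if n != expected}


if __name__ == "__main__":
    # Placeholder counts; in practice they would come from count_triples().
    observed = {"schema.ttl": 1427, "schema.nt": 1427, "schema.rdf": 1427}
    print(find_count_mismatches(observed) or "all formats agree")
```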

Phase 6: UML Diagrams

  • Fix \n escape sequences → <br/> tags
  • Remove LegalFormEnum from class diagram
  • Add OrganizationName subclass
  • Update legal_form type annotation to [ISO 20275]
  • Add ISO 20275 code examples in notes
  • Update version header
  • Validate rendering in Mermaid Live Editor
  • Create MERMAID_UPDATE_SUMMARY.md
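
The `\n` to `<br/>` fix can be expressed as a single string replacement. The snippet is a hedged sketch, not the command that was actually run during the migration; the function name and the sample diagram line are made up for illustration.

```python
def replace_newline_escapes(mermaid_source: str) -> str:
    """Replace the literal two-character sequence backslash + n with <br/>.

    Real line breaks in the source are left untouched.
    """
    return mermaid_source.replace("\\n", "<br/>")


if __name__ == "__main__":
    # Raw string so that \n stays as backslash + n, as it would in a .mmd file.
    before = r'note for Organization "legal_form\nISO 20275 code"'
    print(replace_newline_escapes(before))
```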

Phase 7: Quality Assurance

Schema Validation

  • LinkML schema validates against metamodel
  • OWL generation successful
  • RDF parsing successful (all formats)
  • Pattern validation enforced

Code Quality

  • Type hints added to migration script
  • Error handling comprehensive
  • Logging detailed and clear
  • Tests cover main scenarios

Documentation Quality

  • All major decisions documented
  • Real-world examples provided
  • Cross-references between docs
  • Plain language explanations

Phase 8: Final Deliverables

  • Create session summary document
  • Create migration checklist (this file)
  • Update project README (optional)
  • Tag release (optional)
  • Archive session logs

Post-Migration Tasks (Optional)

Testing with Real Data

  • Run migration script on Dutch ISIL registry
  • Test Rijksmuseum example conversion
  • Validate edge cases with real institutions

Advanced Validation

  • Load RDF in Protégé 5.6+
  • Run HermiT reasoner
  • Create SPARQL validation queries
  • Test TypeDB inference rules

Documentation Expansion

  • Add Belgium, Italy, Spain country guides
  • Create visual migration flowchart
  • Add video walkthrough (optional)

Automation

  • Script to fetch latest GLEIF ELF list
  • Auto-generate country guide updates
  • CI/CD integration for quarterly updates

Web API

  • Create RESTful endpoint for code lookup
  • Add autocomplete functionality
  • Deploy documentation site

Verification Commands

Check Schema Syntax

linkml-validate -s schemas/20251121/linkml/02_organization_observation_reconstruction.yaml

Count RDF Triples

python3 -c "from rdflib import Graph; g=Graph(); g.parse('file.ttl'); print(len(g))"

Run Tests

pytest tests/test_legal_form_migration.py -v

Validate Mermaid Diagrams

# Print the diagram source, then paste it into https://mermaid.live/ to check rendering
cat schemas/20251121/uml/mermaid/02_observation_reconstruction_pattern.mmd

Check for \n in Mermaid

grep '\\n' schemas/20251121/uml/mermaid/*.mmd
# Expect no output (grep exits with status 1 when nothing matches)

Sign-Off

Completed by: OpenCODE AI Assistant
Date: 2025-11-21
Status: ALL CORE TASKS COMPLETE

Core deliverables: 12 files modified, 8 files created, 1,000+ legal forms documented

Optional tasks: Ready for implementation when needed


Next Session: Test migration script with real data from Dutch ISIL registry