glam/COMPLETE_SCHEMA_DIAGRAM_SESSION_SUMMARY.md
2025-11-25 12:48:07 +01:00


Session Complete: Complete Schema Mermaid Diagram Extension

Date: 2025-11-24
Status: COMPLETE


What We Accomplished

1. Extended LinkML to Generate Complete Schema Diagrams

Problem: LinkML's built-in MermaidClassDiagramGenerator generates only individual per-class diagrams (53 separate files), so there is no holistic view of the entire schema architecture.

Solution: Created custom script scripts/generate_complete_mermaid_diagram.py that generates a single comprehensive diagram showing:

  • All 53 classes
  • All 149 relationships (inheritance + associations)
  • Abstract class annotations
  • Relationship cardinalities
  • Key attributes per class (limited to 10 for readability)

Generated Outputs

Complete Schema Diagram

File: schemas/20251121/uml/mermaid/complete_schema_20251124_004329.mmd

  • 53 classes
  • 149 relationships
  • 701 lines
  • 31 KB
  • Syntax verified (fixed abstract class annotation placement)
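The output filename above appears to follow a `complete_schema_YYYYMMDD_HHMMSS.mmd` pattern. A minimal sketch of producing such a name follows, assuming that pattern; the helper name `timestamped_output` is hypothetical and not taken from the actual script.

```python
from datetime import datetime
from pathlib import Path


def timestamped_output(directory: str, when: datetime) -> Path:
    """Build an output path like complete_schema_20251124_004329.mmd.

    Hypothetical helper: the real script may name its files differently.
    """
    stamp = when.strftime("%Y%m%d_%H%M%S")
    return Path(directory) / f"complete_schema_{stamp}.mmd"


path = timestamped_output("schemas/20251121/uml/mermaid",
                          datetime(2025, 11, 24, 0, 43, 29))
print(path.as_posix())
```

Passing the timestamp in as an argument, rather than calling `datetime.now()` inside the helper, keeps the naming logic easy to test.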

Documentation

  1. Comprehensive guide: COMPLETE_SCHEMA_MERMAID_GENERATION.md

    • Usage instructions
    • Architecture overview
    • Extension patterns
    • Troubleshooting
  2. Quick reference: QUICK_STATUS_COMPLETE_MERMAID_GENERATION.md

    • TL;DR summary
    • Key visualizations
    • When to use what

Technical Implementation

Key Design Decisions

  1. Two-pass generation:

    %% Pass 1: define classes with attributes
    class MyClass
    MyClass : attribute_name type
    %% The annotation is a separate statement, after the class declaration
    <<abstract>> MyClass
    
    %% Pass 2: define relationships
    ParentClass <|-- ChildClass : inherits
    ClassA --> "1..*" ClassB : association
    
  2. Abstract class syntax fix:

    • Wrong: class MyClass <<abstract>> (causes parse error with attributes)
    • Correct: Declare class first, add attributes, then add <<abstract>> annotation
  3. Attribute limiting:

    • Limited to 10 slots per class (prevents diagram explosion)
    • Prioritizes understanding over completeness
  4. Relationship filtering:

    • Only includes class-to-class relationships
    • Filters out primitive types (string, integer, date)
    • Avoids noise from URI references
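The four decisions above can be sketched in a few lines of Python. This is an illustrative approximation, not the actual script: it works on a plain dictionary as a stand-in for the LinkML schema, and the function name `render_complete_diagram`, the dictionary layout, and the `member` slot in the example are all assumptions.

```python
from typing import Dict, List

# Hypothetical stand-in for schema data:
# class name -> {"abstract": bool, "is_a": str, "slots": {slot_name: range}}
Schema = Dict[str, dict]


def render_complete_diagram(schema: Schema, max_attrs: int = 10) -> str:
    lines: List[str] = ["classDiagram"]

    # Pass 1: declare each class, then its capped attribute list, and only
    # then the <<abstract>> annotation as a separate statement.
    for name, spec in schema.items():
        lines.append(f"    class {name}")
        slots = list(spec.get("slots", {}).items())[:max_attrs]
        for attr, rng in slots:
            lines.append(f"    {name} : {attr} {rng}")
        if spec.get("abstract"):
            lines.append(f"    <<abstract>> {name}")

    # Pass 2: emit relationships once every class has been declared.
    for name, spec in schema.items():
        parent = spec.get("is_a")
        if parent in schema:
            lines.append(f"    {parent} <|-- {name} : inherits")
        for attr, rng in spec.get("slots", {}).items():
            # Only ranges that are themselves classes become associations;
            # primitive ranges such as string or date are skipped.
            if rng in schema:
                lines.append(f"    {name} --> {rng} : {attr}")
    return "\n".join(lines)


example = {
    "EncompassingBody": {"abstract": True, "slots": {"name": "string"}},
    "Consortium": {"is_a": "EncompassingBody", "slots": {"member": "Custodian"}},
    "Custodian": {"slots": {"identifier": "string"}},
}
print(render_complete_diagram(example))
```

Keeping relationships in a second pass means every class referenced by an arrow has already been declared, which makes the output order predictable.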

Key Visualizations Captured

EncompassingBody Hierarchy (NEW!)

EncompassingBody (abstract)
├── UmbrellaOrganisation (legal parent organizations)
├── NetworkOrganisation (service providers)
└── Consortium (peer-to-peer collaborations)

Hub Architecture Pattern

Custodian (hub) ←─── CustodianObservation (sources)
                ├─── CustodianLegalStatus (legal aspect)
                ├─── CustodianName (emic name aspect)
                ├─── CustodianPlace (place aspect)
                ├─── CustodianCollection (collection aspect)
                └─── OrganizationalStructure (internal structure)

19-Type CustodianType Taxonomy

All 19 GLAMORCUBESFIXPHDNT types with inheritance relationships visualized.


How to Use

Generate New Diagram

cd /Users/kempersc/apps/glam
python3 scripts/generate_complete_mermaid_diagram.py

View Diagram

  1. Online: Copy .mmd contents to https://mermaid.live/
  2. VS Code: Install Mermaid extension, open preview
  3. GitHub: Push to repo (auto-renders)

Export to Image

npm install -g @mermaid-js/mermaid-cli
mmdc -i complete_schema_20251124_004329.mmd -o schema_overview.svg

Dependencies Installed

pip3 install linkml-renderer  # Version 0.3.1

Provides:

  • linkml_renderer.renderers.mermaid_renderer.MermaidRenderer
  • HTML, Markdown, and Mermaid rendering capabilities

Comparison: Before vs After

| Aspect | Before (Per-Class) | After (Complete) |
|---|---|---|
| Files | 53 separate .mmd files | 1 unified diagram |
| Total size | 212 KB | 31 KB |
| Holistic view | Requires viewing 53 files | Single comprehensive view |
| Use case | Detailed class docs | Architecture overview |
| Relationships | Immediate neighbors only | All schema relationships |
| Presentation-ready | Too fragmented | Perfect for presentations |

Recommendation: Use both

  • Per-class: Developer reference
  • Complete: Presentations, onboarding, ontology consultations

Files Changed/Created

New Files

  1. scripts/generate_complete_mermaid_diagram.py (executable script)
  2. schemas/20251121/uml/mermaid/complete_schema_20251124_004329.mmd (generated diagram)
  3. COMPLETE_SCHEMA_MERMAID_GENERATION.md (comprehensive documentation)
  4. QUICK_STATUS_COMPLETE_MERMAID_GENERATION.md (quick reference)
  5. COMPLETE_SCHEMA_DIAGRAM_SESSION_SUMMARY.md (this file)

Modified Files

None (pure addition, no schema changes)


Next Steps (Optional)

1. Generate Multiple Focused Diagrams

Create variants for different audiences:

# Core hub architecture only
python3 scripts/generate_hub_architecture_diagram.py

# CustodianType hierarchy only
python3 scripts/generate_custodian_type_hierarchy.py

# EncompassingBody focus
python3 scripts/generate_encompassing_body_diagram.py
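The focused scripts listed above do not exist yet. One way they might select their classes is to keep a root class and everything that inherits from it. The sketch below assumes the inheritance data is available as a simple child-to-parent mapping; the function name `descendants` is hypothetical.

```python
from typing import Dict, Optional, Set


def descendants(is_a: Dict[str, Optional[str]], root: str) -> Set[str]:
    """Return root plus every class that inherits from it, directly or not."""
    selected = {root}
    changed = True
    while changed:  # repeat until no new subclass is discovered
        changed = False
        for child, parent in is_a.items():
            if parent in selected and child not in selected:
                selected.add(child)
                changed = True
    return selected


# Child -> parent mapping for the EncompassingBody hierarchy described earlier.
is_a = {
    "EncompassingBody": None,
    "UmbrellaOrganisation": "EncompassingBody",
    "NetworkOrganisation": "EncompassingBody",
    "Consortium": "EncompassingBody",
    "Custodian": None,
}
print(sorted(descendants(is_a, "EncompassingBody")))
```

The resulting set could then be used to filter both the class declarations and the relationships before rendering.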

2. Add to CI/CD Pipeline

Auto-generate on schema changes:

# .github/workflows/schema-docs.yml
- name: Generate Complete Diagram
  run: python3 scripts/generate_complete_mermaid_diagram.py

3. Create Interactive Web Viewer

Embed in documentation website with zoom/pan controls.

4. Export High-Resolution Images

For academic papers and presentations:

mmdc -i complete_schema.mmd -o schema.png -w 4096 -H 4096
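If the export is driven from Python rather than the shell, the same command can be assembled as an argument list. This is a minimal sketch using only the `-i`, `-o`, `-w`, and `-H` flags shown above; the helper names are hypothetical, and the export is skipped when `mmdc` is not on the PATH.

```python
import shutil
import subprocess
from typing import List, Optional


def mmdc_command(src: str, dst: str,
                 width: Optional[int] = None,
                 height: Optional[int] = None) -> List[str]:
    """Build the mmdc argument list without running it."""
    cmd = ["mmdc", "-i", src, "-o", dst]
    if width is not None:
        cmd += ["-w", str(width)]
    if height is not None:
        cmd += ["-H", str(height)]
    return cmd


def export_diagram(src: str, dst: str,
                   width: Optional[int] = None,
                   height: Optional[int] = None) -> bool:
    """Run mmdc if it is installed; return False when it is missing."""
    if shutil.which("mmdc") is None:
        return False
    subprocess.run(mmdc_command(src, dst, width, height), check=True)
    return True


print(mmdc_command("complete_schema.mmd", "schema.png", 4096, 4096))
```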

Lessons Learned

Mermaid Syntax Quirks

  • Abstract classes require annotation after attributes, not inline with class declaration
  • Attribute syntax: ClassName : attribute_name type (not ClassName.attribute_name)
  • Cardinality labels on relationships must be quoted: "1..*", not 1..*
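A small helper can keep the cardinality quoting consistent. The sketch below assumes a mapping from a slot's `required` and `multivalued` flags to a cardinality string; that mapping and both function names are illustrative assumptions, not the actual script's logic.

```python
def cardinality(required: bool, multivalued: bool) -> str:
    """Map slot flags to a UML-style cardinality (assumed mapping)."""
    if multivalued:
        return "1..*" if required else "0..*"
    return "1" if required else "0..1"


def association_line(source: str, target: str, label: str,
                     required: bool = False,
                     multivalued: bool = False) -> str:
    """Format a Mermaid association with the cardinality in double quotes."""
    card = cardinality(required, multivalued)
    return f'{source} --> "{card}" {target} : {label}'


print(association_line("ClassA", "ClassB", "association",
                       required=True, multivalued=True))
```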

LinkML SchemaView API

  • schemaview.all_classes() returns a dict keyed by class name; iterating over it yields names (strings), not objects
  • schemaview.get_class(name) returns a ClassDefinition object
  • schemaview.class_slots(name) returns an ordered list of slot names
  • Inheritance via cls.is_a, mixins via cls.mixins
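The calls listed above combine roughly as follows. This sketch assumes the `linkml-runtime` package for `SchemaView`; the helper names `load_schema`, `inheritance_edges`, and `first_slots` are hypothetical, and the import is done lazily so the traversal functions can be exercised with any object exposing the same methods.

```python
from typing import List, Tuple


def load_schema(path: str):
    """Load a LinkML schema (requires the linkml-runtime package)."""
    from linkml_runtime.utils.schemaview import SchemaView  # lazy import
    return SchemaView(path)


def inheritance_edges(sv) -> List[Tuple[str, str, str]]:
    """Return (parent, child, kind) triples; kind is 'is_a' or 'mixin'."""
    edges = []
    for name in sv.all_classes():      # iterating yields class names
        cls = sv.get_class(name)       # ClassDefinition object
        if cls.is_a:
            edges.append((str(cls.is_a), str(name), "is_a"))
        for mixin in cls.mixins or []:
            edges.append((str(mixin), str(name), "mixin"))
    return edges


def first_slots(sv, class_name: str, limit: int = 10) -> List[str]:
    """Return at most `limit` slot names for a class, in order."""
    return [str(s) for s in sv.class_slots(class_name)[:limit]]
```

A call such as `inheritance_edges(load_schema("schema.yaml"))` would then list the parent/child pairs, where `schema.yaml` is a placeholder path.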

Performance Considerations

  • 53 classes × 10 attributes = at most 530 attribute lines
  • 149 relationships are manageable in Mermaid.live
  • Beyond ~100 classes, consider splitting into multiple diagrams
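The arithmetic above extends to a rough ceiling on diagram length. The sketch below is an estimate under stated assumptions: one header line, one declaration line per class, at most `max_attrs` attribute lines per class, and one line per relationship, ignoring blank lines, comments, and annotations. The function names and the split threshold default are illustrative.

```python
def estimate_max_lines(n_classes: int, max_attrs: int,
                       n_relationships: int) -> int:
    """Rough upper bound on the number of non-blank diagram lines."""
    header = 1
    declarations = n_classes
    attributes = n_classes * max_attrs
    return header + declarations + attributes + n_relationships


def should_split(n_classes: int, threshold: int = 100) -> bool:
    """Apply the ~100-class rule of thumb for splitting a diagram."""
    return n_classes > threshold


print(estimate_max_lines(53, 10, 149))  # 1 + 53 + 530 + 149 = 733
print(should_split(53))                 # False
```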

Testing Checklist

  • Script runs without errors
  • Output file generated with correct timestamp
  • Mermaid syntax valid (no parse errors)
  • Abstract classes correctly annotated
  • All 53 classes included
  • All 149 relationships captured
  • Inheritance relationships correct
  • Association cardinalities present
  • File size reasonable (31 KB)
  • Documentation complete
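The class and relationship counts in this checklist could be verified automatically. A minimal sketch follows, assuming the generated file uses `class Name` declarations and only the two arrow forms shown in this document (`<|--` and `-->`); the function name is hypothetical.

```python
import re
from typing import Tuple

CLASS_DECL = re.compile(r"class\s+(\w+)")
ARROW = re.compile(r"<\|--|-->")


def count_classes_and_relationships(text: str) -> Tuple[int, int]:
    """Count distinct declared classes and relationship lines."""
    classes = set()
    relationships = 0
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%%"):  # skip blanks and comments
            continue
        match = CLASS_DECL.match(line)
        if match:
            classes.add(match.group(1))
        elif ARROW.search(line):
            relationships += 1
    return len(classes), relationships


sample = """classDiagram
    class ParentClass
    class ChildClass
    ChildClass : name string
    <<abstract>> ParentClass
    ParentClass <|-- ChildClass : inherits
    ChildClass --> "1..*" ParentClass : association
"""
print(count_classes_and_relationships(sample))
```

Run against the generated diagram, the expected result under these assumptions would be 53 classes and 149 relationships.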

Success Metrics

  • Extended LinkML's capabilities: Added complete schema diagram generation
  • Improved developer experience: Single file shows entire architecture
  • Presentation-ready output: Suitable for talks, papers, consultations
  • Maintainable solution: Script is simple, well-documented, easy to extend
  • Zero breaking changes: Pure addition, no schema modifications



Session Timeline

  1. Initial request: "Generate complete schema diagram, not just per-class"
  2. Investigation: Discovered LinkML only generates per-class diagrams
  3. Solution design: Custom script extending MermaidRenderer
  4. Implementation: Created generate_complete_mermaid_diagram.py
  5. Bug fix: Corrected abstract class annotation syntax
  6. Documentation: Created comprehensive and quick-reference guides
  7. Validation: Verified output with 53 classes, 149 relationships

Total time: ~1 hour
Lines of code: ~150 (script + docs)
Impact: High - Enables holistic schema visualization