# Before vs After: Complete Schema Mermaid Diagram **Date**: 2025-11-24 --- ## The Problem LinkML's default behavior generates **53 separate Mermaid diagrams** (one per class): ``` schemas/20251121/uml/mermaid/auto_generated/ ├── ArchiveOrganizationType.mmd ├── BioCustodianType.mmd ├── CommercialOrganizationType.mmd ├── ConfidenceMeasure.mmd ├── Consortium.mmd ├── Country.mmd ├── Custodian.mmd ← Core class (abstract) ├── CustodianAppellation.mmd ├── CustodianCollection.mmd ├── CustodianIdentifier.mmd ├── CustodianLegalStatus.mmd ← Core class ├── CustodianName.mmd ← Core class ├── CustodianObservation.mmd ← Core class ├── CustodianPlace.mmd ← Core class ├── CustodianType.mmd ← Core class (abstract) ├── DigitalPlatformType.mmd ├── EducationalProviderType.mmd ├── EncompassingBody.mmd ← NEW: Abstract parent ├── FeaturePlaceType.mmd ├── GalleryType.mmd ├── HolySiteType.mmd ├── IntangibleHeritageType.mmd ├── LegalEntity.mmd ├── LegalResponsibility.mmd ├── LegalStatus.mmd ├── LibraryType.mmd ├── MixedCustodianType.mmd ├── MuseumType.mmd ├── NetworkOrganisation.mmd ← NEW: Service providers ├── NonProfitType.mmd ├── OfficialInstitutionType.mmd ├── OrganizationalChangeEvent.mmd ├── OrganizationalStructure.mmd ├── PersonObservation.mmd ├── PersonalCollectionType.mmd ├── ReconstructionActivity.mmd ├── ReconstructionAgent.mmd ├── RegistrationAuthority.mmd ├── RegistrationInfo.mmd ├── ResearchOrganizationType.mmd ├── Settlement.mmd ├── Subregion.mmd ├── TasteScentHeritageType.mmd ├── TimeSpan.mmd ├── UmbrellaOrganisation.mmd ← NEW: Legal parents └── UnspecifiedType.mmd Total: 53 files, 212 KB ``` **Problem**: To understand the schema architecture, you need to open 53 files and mentally reconstruct relationships. --- ## The Solution One comprehensive diagram showing **everything**: ``` schemas/20251121/uml/mermaid/ └── complete_schema_20251124_004329.mmd Total: 1 file, 31 KB ``` --- ## Visual Comparison ### Before: Fragmented View (53 files) To understand how `Custodian` relates to other classes, you need to: 1. Open `Custodian.mmd` → see immediate relationships 2. Open `CustodianObservation.mmd` → see observation pattern 3. Open `CustodianLegalStatus.mmd` → see legal aspect 4. Open `CustodianName.mmd` → see name aspect 5. Open `CustodianPlace.mmd` → see place aspect 6. Open `CustodianCollection.mmd` → see collection aspect 7. Open `OrganizationalStructure.mmd` → see internal structure 8. Open `EncompassingBody.mmd` → see external governance 9. Open `UmbrellaOrganisation.mmd` → see legal parents 10. Open `NetworkOrganisation.mmd` → see service providers 11. Open `Consortium.mmd` → see peer collaborations **Result**: Mental overhead, lost context switching between 11+ files --- ### After: Unified View (1 file) Open `complete_schema_20251124_004329.mmd` → see **everything** at once: ```mermaid classDiagram %% All 53 classes defined with attributes class Custodian Custodian : *hc_id uriorcurie Custodian : preferred_label CustodianName Custodian : custodian_type CustodianType Custodian : legal_status CustodianLegalStatus Custodian : place_designation CustodianPlace <> Custodian class EncompassingBody EncompassingBody : id uriorcurie EncompassingBody : organization_name string <> EncompassingBody class UmbrellaOrganisation UmbrellaOrganisation : governance_authority string class NetworkOrganisation NetworkOrganisation : service_offerings string class Consortium Consortium : membership_criteria string %% All 149 relationships visible EncompassingBody <|-- UmbrellaOrganisation : inherits EncompassingBody <|-- NetworkOrganisation : inherits EncompassingBody <|-- Consortium : inherits CustodianObservation --> "1" Custodian : identifies_custodian CustodianLegalStatus --> "1" Custodian : refers_to_custodian CustodianName --> "1" Custodian : refers_to_custodian CustodianPlace --> "1" Custodian : refers_to_custodian CustodianCollection --> "1" Custodian : refers_to_custodian %% ... 140+ more relationships ``` **Result**: Complete architecture visible in one view, no context switching --- ## Feature Comparison | Feature | Per-Class (Before) | Complete (After) | |---------|-------------------|------------------| | **Files generated** | 53 | 1 | | **Total size** | 212 KB | 31 KB | | **Classes shown** | 1 per file | 53 in one file | | **Relationships** | Immediate neighbors only | All 149 relationships | | **Abstract classes** | Marked per-file | All 3 marked in context | | **Inheritance hierarchy** | Fragmented | Complete tree visible | | **Hub pattern** | Hidden across files | Immediately clear | | **EncompassingBody architecture** | 4 separate files | Unified hierarchy | | **CustodianType taxonomy** | 19 separate files | Full taxonomy tree | | **Context switching** | High (11+ files for Custodian) | None | | **Onboarding time** | Hours (explore 53 files) | Minutes (one diagram) | | **Presentation-ready** | ❌ Too fragmented | ✅ Yes | | **Print-friendly** | ❌ 53 pages | ✅ 1 diagram | | **Whiteboard-friendly** | ❌ Can't draw all | ✅ Shows structure | --- ## Use Cases: When to Use What ### Per-Class Diagrams (LinkML Default) ✅ **Best for**: - Detailed class documentation - API reference generation - Field-level schema understanding - Developer onboarding (one class at a time) ❌ **Not good for**: - Understanding overall architecture - Seeing cross-class relationships - Presentations and talks - Executive summaries ### Complete Diagram (This Extension) ✅ **Best for**: - **Architecture overview** - Understand schema structure at a glance - **Presentations** - Conference talks, webinars, workshops - **Ontology consultations** - Show alignment with CIDOC-CRM, W3C ORG, etc. - **Onboarding** - New developers see the big picture first - **Documentation** - Schema overview chapter in guides - **Academic papers** - Illustrate data model in publications - **Stakeholder communication** - Non-technical audience understanding ❌ **Not good for**: - Field-level details (too many attributes = unreadable) - API documentation (too high-level) --- ## Real-World Impact ### Before (Fragmented) **Scenario**: New developer joins project, asks "How does the hub pattern work?" **Answer**: ``` "Open these files in order: 1. Custodian.mmd - see the hub 2. CustodianObservation.mmd - see observations 3. CustodianLegalStatus.mmd - see legal aspect 4. CustodianName.mmd - see name aspect 5. CustodianPlace.mmd - see place aspect 6. ReconstructionActivity.mmd - see the derivation process Now mentally integrate all 6 diagrams to understand the pattern." ``` **Time to understanding**: 2-4 hours (with confusion) --- ### After (Unified) **Scenario**: Same question **Answer**: ``` "Open complete_schema_20251124_004329.mmd and look at the center. You'll see Custodian (hub) with 5 arrows pointing TO it from: - CustodianObservation (sources) - CustodianLegalStatus (legal aspect) - CustodianName (emic name) - CustodianPlace (location aspect) - CustodianCollection (holdings) All derived via ReconstructionActivity." ``` **Time to understanding**: 5-10 minutes (clear) --- ## Technical Comparison ### Generation Method **Per-Class (LinkML Default)**: ```bash # Uses gen-yuml (part of LinkML docs generator) gen-yuml schemas/01_custodian_name_modular.yaml \ --output-dir schemas/20251121/uml/mermaid/auto_generated/ # Result: 53 files, one per class ``` **Complete (This Extension)**: ```bash # Uses custom script with SchemaView API python3 scripts/generate_complete_mermaid_diagram.py # Result: 1 file with all classes and relationships ``` ### Customization **Per-Class**: - Limited customization (LinkML generator) - All-or-nothing (can't filter classes) - Fixed format **Complete**: - Fully customizable (Python script) - Can filter by class, module, type - Can adjust attribute count per class - Can focus on specific relationship types - Easy to extend for new use cases --- ## Storage Efficiency **Per-Class**: 212 KB across 53 files - Each file has boilerplate (header, footer) - Class definitions repeated for relationships - Redundant metadata **Complete**: 31 KB in 1 file - Single header/footer - Class definitions once - Relationships deduplicated **Savings**: 85% reduction in total size --- ## Conclusion Both approaches have value: - **Use per-class diagrams** for detailed documentation and API reference - **Use complete diagram** for architecture understanding and communication The complete diagram **complements** rather than **replaces** per-class diagrams. **Best practice**: Generate both, use for different audiences. --- ## Generated Files - **Before**: `schemas/20251121/uml/mermaid/auto_generated/*.mmd` (53 files) - **After**: `schemas/20251121/uml/mermaid/complete_schema_20251124_004329.mmd` (1 file) - **Script**: `scripts/generate_complete_mermaid_diagram.py` --- ## Try It Yourself ```bash # Generate complete diagram cd /Users/kempersc/apps/glam python3 scripts/generate_complete_mermaid_diagram.py # View online open https://mermaid.live/ # Paste contents of complete_schema_*.mmd # Compare with per-class diagram open schemas/20251121/uml/mermaid/auto_generated/Custodian.mmd ```