glam/COMPLETE_SCHEMA_MERMAID_GENERATION.md
2025-11-25 12:48:07 +01:00


# Complete Schema Mermaid Diagram Generation
**Date**: 2025-11-24
**Feature**: Extended LinkML's MermaidRenderer to generate complete schema diagrams
---
## Overview
LinkML's built-in `MermaidClassDiagramGenerator` generates **individual diagrams per class** (53 separate `.mmd` files). While this is useful for detailed class-by-class documentation, it doesn't provide a **holistic view of the entire schema**.
We've created a custom script that generates a **single comprehensive Mermaid diagram** showing:
- ✅ All 53 classes
- ✅ All 149 relationships (inheritance, associations)
- ✅ Key attributes for each class (limited to 10 per class for readability)
- ✅ Abstract classes marked with `<<abstract>>`
- ✅ Relationship cardinalities (`1`, `1..*`)
---
## Usage
### Generate Complete Diagram
```bash
cd /Users/kempersc/apps/glam  # repository root; adjust to your local checkout
python3 scripts/generate_complete_mermaid_diagram.py
```
**Output**:
```
schemas/20251121/uml/mermaid/complete_schema_YYYYMMDD_HHMMSS.mmd
```
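The `YYYYMMDD_HHMMSS` suffix is an ordinary timestamp. As a minimal sketch of how such a path could be built, assuming standard `datetime` formatting (the helper name `timestamped_output_path` is hypothetical and not taken from the script):

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def timestamped_output_path(output_dir: str, now: Optional[datetime] = None) -> Path:
    """Return output_dir/complete_schema_YYYYMMDD_HHMMSS.mmd (hypothetical helper)."""
    now = now or datetime.now()
    return Path(output_dir) / f"complete_schema_{now.strftime('%Y%m%d_%H%M%S')}.mmd"


# Fixed timestamp so the result is reproducible
path = timestamped_output_path(
    "schemas/20251121/uml/mermaid", datetime(2025, 11, 24, 0, 41, 43)
)
print(path.name)  # complete_schema_20251124_004143.mmd
```

Because the timestamp is zero-padded and ordered from year to second, sorting the file names alphabetically also sorts them chronologically.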
### View the Diagram
**Option 1: Mermaid Live Editor**
1. Open https://mermaid.live/
2. Copy contents of `complete_schema_YYYYMMDD_HHMMSS.mmd`
3. Paste into editor
4. Explore interactively (zoom, pan, export to SVG/PNG)
**Option 2: VS Code (with Mermaid extension)**
1. Install "Markdown Preview Mermaid Support" extension
2. Open `.mmd` file in VS Code
3. Right-click → "Open Preview"
**Option 3: GitHub (automatic rendering)**
1. Push `.mmd` file to GitHub repository
2. View file on GitHub (automatic Mermaid rendering)
---
## Generated Diagram Statistics
**Latest Generation** (2025-11-24 00:41:43):
- **Classes**: 53
- **Relationships**: 149
  - Inheritance (`<|--`): 42
  - Associations (`-->`): 107
- **Lines**: 697
- **File Size**: 31 KB
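These counts can be re-derived from any generated `.mmd` file. The following is a rough sketch, assuming one statement per line and that classes are declared with a `class Name` line; the function name `diagram_stats` is hypothetical and not part of the script:

```python
import re


def diagram_stats(mermaid_text: str) -> dict:
    """Count declared classes and relationship arrows in classDiagram source."""
    # Drop Mermaid comment lines (they start with %%) so arrows in comments are ignored
    lines = [ln.strip() for ln in mermaid_text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("%%")]
    classes = {m.group(1) for ln in lines if (m := re.match(r"class\s+(\w+)", ln))}
    return {
        "classes": len(classes),
        "inheritance": sum(1 for ln in lines if "<|--" in ln),
        "associations": sum(1 for ln in lines if "-->" in ln),
    }


sample = """
classDiagram
class EncompassingBody
EncompassingBody <|-- UmbrellaOrganisation : inherits
EncompassingBody <|-- NetworkOrganisation : inherits
EncompassingBody <|-- Consortium : inherits
Custodian --> EncompassingBody : encompassing_body
"""
print(diagram_stats(sample))
# {'classes': 1, 'inheritance': 3, 'associations': 1}
```

Only classes with an explicit `class` declaration are counted, so names that appear solely in relationship lines (such as `Custodian` in the sample) are not included.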
---
## Key Relationships Captured
### EncompassingBody Hierarchy
```mermaid
classDiagram
class EncompassingBody
<<abstract>> EncompassingBody
EncompassingBody <|-- UmbrellaOrganisation : inherits
EncompassingBody <|-- NetworkOrganisation : inherits
EncompassingBody <|-- Consortium : inherits
Custodian --> EncompassingBody : encompassing_body
```
### CustodianType Hierarchy (19 types)
```mermaid
classDiagram
class CustodianType
CustodianType <|-- MuseumType
CustodianType <|-- LibraryType
CustodianType <|-- ArchiveOrganizationType
CustodianType <|-- GalleryType
CustodianType <|-- OfficialInstitutionType
CustodianType <|-- ResearchOrganizationType
CustodianType <|-- CommercialOrganizationType
CustodianType <|-- UnspecifiedType
CustodianType <|-- BioCustodianType
CustodianType <|-- EducationalProviderType
CustodianType <|-- CollectingSocietyType
CustodianType <|-- FeaturePlaceType
CustodianType <|-- IntangibleHeritageType
CustodianType <|-- MixedCustodianType
CustodianType <|-- PersonalCollectionType
CustodianType <|-- HolySiteType
CustodianType <|-- DigitalPlatformType
CustodianType <|-- NonProfitType
CustodianType <|-- TasteScentHeritageType
```
### Hub Architecture (Custodian as central reference)
```mermaid
classDiagram
class Custodian
CustodianObservation --> Custodian : identifies_custodian
CustodianLegalStatus --> Custodian : refers_to_custodian
CustodianName --> Custodian : refers_to_custodian
CustodianPlace --> Custodian : refers_to_custodian
CustodianCollection --> Custodian : refers_to_custodian
OrganizationalStructure --> Custodian : refers_to_custodian
```
---
## Script Details
**Location**: `scripts/generate_complete_mermaid_diagram.py`
**Key Features**:
1. **Two-pass generation**:
   - Pass 1: Define all classes with attributes
   - Pass 2: Define all relationships (inheritance + associations)
2. **Smart filtering**:
   - Limits attributes to 10 per class (prevents diagram explosion)
   - Only includes class-to-class relationships (skips primitive types)
3. **Metadata embedded**:
   - Schema name and version
   - Generation timestamp
   - Class/relationship counts
4. **Timestamped output**:
   - Format: `complete_schema_YYYYMMDD_HHMMSS.mmd`
   - Allows multiple generations without conflicts
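The two-pass approach and the filtering rules can be illustrated without LinkML. The sketch below is not the actual script: `ClassInfo` and `render_diagram` are hypothetical stand-ins that use plain Python data instead of a `SchemaView`, and the class data is only an example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ClassInfo:
    """Simplified stand-in for a schema class (hypothetical, not a LinkML type)."""
    name: str
    parent: Optional[str] = None
    abstract: bool = False
    slots: Dict[str, str] = field(default_factory=dict)  # slot name -> range


def render_diagram(classes: List[ClassInfo], max_attrs: int = 10) -> str:
    names = {c.name for c in classes}
    lines = ["classDiagram"]
    # Pass 1: declare every class, capped at max_attrs attributes each
    for c in classes:
        lines.append(f"    class {c.name}")
        if c.abstract:
            lines.append(f"    <<abstract>> {c.name}")
        for slot, rng in list(c.slots.items())[:max_attrs]:
            lines.append(f"    {c.name} : {rng} {slot}")
    # Pass 2: emit relationships, but only between known classes
    for c in classes:
        if c.parent in names:
            lines.append(f"    {c.parent} <|-- {c.name} : inherits")
        for slot, rng in c.slots.items():
            if rng in names:  # primitive ranges such as "string" are skipped
                lines.append(f"    {c.name} --> {rng} : {slot}")
    return "\n".join(lines)


example_classes = [
    ClassInfo("Custodian", slots={"encompassing_body": "EncompassingBody", "name": "string"}),
    ClassInfo("EncompassingBody", abstract=True),
    ClassInfo("Consortium", parent="EncompassingBody"),
]
print(render_diagram(example_classes))
```

Declaring all classes before any relationship keeps the output easy to scan and means the attribute cap in pass 1 never hides a relationship, since pass 2 iterates over every slot.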
---
## Comparison: Per-Class vs Complete Diagrams
| Feature | Per-Class Diagrams (LinkML built-in) | Complete Diagram (this script) |
|---------|-------------------------------|--------------------------------|
| **Files generated** | 53 (one per class) | 1 (entire schema) |
| **Total size** | 212 KB (53 files) | 31 KB (1 file) |
| **Use case** | Detailed class documentation | Schema overview and architecture |
| **Relationships shown** | Immediate class relationships | All schema relationships |
| **Holistic view** | ❌ Requires viewing 53 files | ✅ Single comprehensive view |
| **Detail level** | High (all slots, all docs) | Medium (10 slots per class) |
| **Maintainability** | Auto-generated by LinkML | Custom script (easy to extend) |
**Recommendation**: Use **both**:
- Per-class diagrams: Detailed reference for developers
- Complete diagram: Architecture overview for onboarding, presentations, ontology consultations
---
## Extending the Script
### Add More Attributes Per Class
Edit `scripts/generate_complete_mermaid_diagram.py` line 72:
```python
for slot_name in class_slots[:10]: # Change 10 to 20, 30, etc.
```
### Include Enum Values
Add after line 86:
```python
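# Add enums, each declared as a class with an <<enumeration>> annotation
for enum_name in sorted(schemaview.all_enums()):
    enum = schemaview.get_enum(enum_name)
    mermaid_lines.append(f"    class {enum_name}")
    mermaid_lines.append(f"    <<enumeration>> {enum_name}")
    # One member line per permissible value
    for pv in enum.permissible_values.values():
        mermaid_lines.append(f"    {enum_name} : {pv.text}")
    mermaid_lines.append("")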
```
### Filter to Specific Classes
Add after line 69:
```python
# Only include core classes
core_classes = ['Custodian', 'CustodianObservation', 'CustodianLegalStatus',
                'CustodianName', 'CustodianPlace', 'EncompassingBody']
if class_name not in core_classes:
    continue
```
---
## Troubleshooting
### Diagram Too Large (Mermaid.live won't render)
**Solution 1**: Reduce attributes per class
```python
for slot_name in class_slots[:5]: # Fewer attributes
```
**Solution 2**: Filter to core classes only (see "Extending" above)
**Solution 3**: Generate multiple diagrams by module
```python
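# Filter by module: keep only CustodianType and its subclasses.
# Subclass names such as MuseumType do not contain "CustodianType",
# so test the ancestors rather than the class name itself.
if 'CustodianType' not in schemaview.class_ancestors(class_name):
    continue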
```
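One way to decide which classes belong to which per-module diagram is to group them by their top-most ancestor. This is only a sketch under the assumption that the `is_a` parents are available as a plain mapping; `group_by_root` is a hypothetical helper and not part of the script.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def group_by_root(parents: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Group class names under their top-most ancestor (one group per diagram)."""

    def root_of(name: str) -> str:
        seen = set()
        # Walk up the parent chain; the seen set guards against cyclic data
        while parents.get(name) is not None and name not in seen:
            seen.add(name)
            name = parents[name]
        return name

    groups: Dict[str, List[str]] = defaultdict(list)
    for name in sorted(parents):
        groups[root_of(name)].append(name)
    return dict(groups)


# Example parent mapping using class names from this schema
parents = {
    "CustodianType": None,
    "MuseumType": "CustodianType",
    "LibraryType": "CustodianType",
    "EncompassingBody": None,
    "Consortium": "EncompassingBody",
}
for root, members in group_by_root(parents).items():
    print(root, members)
```

Each resulting group can then be passed through the same rendering logic to produce one smaller `.mmd` file per hierarchy.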
### Missing Relationships
**Check**: Ensure slot `range` is a class name:
```python
if slot.range and slot.range in schemaview.all_classes():
    # A relationship is emitted only when the range is a class
    ...
```
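**Cause**: Some relationships use `uriorcurie` (a string reference) instead of a class range. The script currently filters these out to avoid noise, so they never appear as arrows. To show such a relationship in the diagram, the slot's `range` needs to be a class name.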
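To find out which slots are being skipped, it can help to list every slot whose range is not a class. The sketch below assumes the slot ranges have been collected into a plain dictionary; `skipped_slots` and the example data are hypothetical and for illustration only.

```python
from typing import Dict, List, Set, Tuple


def skipped_slots(
    class_slots: Dict[str, Dict[str, str]], class_names: Set[str]
) -> List[Tuple[str, str, str]]:
    """Return (class, slot, range) triples whose range is not a class."""
    skipped = []
    for cls in sorted(class_slots):
        for slot, rng in sorted(class_slots[cls].items()):
            if rng not in class_names:
                skipped.append((cls, slot, rng))
    return skipped


# Hypothetical slot ranges, not taken from the real schema
example = {
    "CustodianName": {"refers_to_custodian": "Custodian", "label": "string"},
    "CustodianPlace": {"refers_to_custodian": "uriorcurie"},
}
known_classes = {"Custodian", "CustodianName", "CustodianPlace"}
for cls, slot, rng in skipped_slots(example, known_classes):
    print(f"{cls}.{slot} -> {rng}")
```

Slots reported with a `uriorcurie` range are the likely candidates for relationships that are missing from the diagram.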
---
## Next Steps
### Generate SVG/PNG Exports
Install Mermaid CLI:
```bash
npm install -g @mermaid-js/mermaid-cli
```
Convert to SVG:
```bash
mmdc -i complete_schema_20251124_004143.mmd -o complete_schema.svg
```
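Since every run writes a new timestamped file, the conversion can be pointed at the most recent one automatically. The following sketch assumes `mmdc` is installed and on the `PATH`; the helper names `latest_diagram` and `mmdc_command` are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def latest_diagram(directory: str) -> Optional[Path]:
    """Return the newest complete_schema_*.mmd, relying on the sortable timestamp."""
    candidates = sorted(Path(directory).glob("complete_schema_*.mmd"))
    return candidates[-1] if candidates else None


def mmdc_command(source: Path, fmt: str = "svg") -> List[str]:
    """Build the mmdc invocation that writes the export next to the source file."""
    return ["mmdc", "-i", str(source), "-o", str(source.with_suffix(f".{fmt}"))]


if __name__ == "__main__":
    newest = latest_diagram("schemas/20251121/uml/mermaid")
    if newest is not None:
        # Raises CalledProcessError if mmdc exits with a non-zero status
        subprocess.run(mmdc_command(newest), check=True)
```

Passing `fmt="png"` to `mmdc_command` produces a PNG instead, because `mmdc` infers the output format from the file extension.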
### Add to Documentation Website
If building with Sphinx/MkDocs (the `include` call below assumes a macro or include plugin is configured for the site):
````markdown
## Schema Overview
```mermaid
{{ include('schemas/20251121/uml/mermaid/complete_schema_20251124_004143.mmd') }}
```
````
### Automate Generation in CI/CD
Add to `.github/workflows/schema-docs.yml`:
```yaml
- name: Generate Complete Mermaid Diagram
  run: python3 scripts/generate_complete_mermaid_diagram.py
```
---
## References
- **LinkML Documentation**: https://linkml.io/linkml/generators/mermaid.html
- **Mermaid Class Diagrams**: https://mermaid.js.org/syntax/classDiagram.html
- **Mermaid Live Editor**: https://mermaid.live/
- **Script Location**: `scripts/generate_complete_mermaid_diagram.py`
- **Output Directory**: `schemas/20251121/uml/mermaid/`
---
## Changelog
**2025-11-24**: Initial implementation
- Created custom script extending LinkML's MermaidRenderer
- Generated complete schema diagram (53 classes, 149 relationships)
- Documented usage and extension patterns