LinkML Visualization Generators - Session Complete

Date: 2025-11-22
Final Timestamp: 20251122_171316 (Mermaid), 20251122_171249 (ER)
Status: COMPLETE - All Generators Working


Executive Summary

We resolved the namespace conflicts in the modular LinkML schema and found that LinkML's native generators work well for visualization. The initial complaint about "not proper" UML diagrams was due to using the deprecated gen-yuml generator, which has a path resolution bug.

Bottom Line: Modern LinkML generators (gen-mermaid-class-diagram, gen-erdiagram) handle modular schemas correctly and produce high-quality outputs. Among the non-deprecated generators, only gen-plantuml has the path bug, which requires a custom workaround script.


What We Discovered

Generator Status Matrix

| Generator | Status | Works with Modular Schemas? | Output Quality |
|---|---|---|---|
| gen-owl | Active | YES | Excellent (RDF/OWL) |
| gen-mermaid-class-diagram | Active | YES | Excellent (per-class diagrams) |
| gen-erdiagram | Active | YES | Excellent (comprehensive ER) |
| gen-plantuml | ⚠️ Active | NO (path bug) | Good (with workaround) |
| gen-yuml | Deprecated | NO (will be removed) | N/A |

Key Finding: The complaint about "not proper" diagrams arose because we initially used gen-yuml (deprecated, buggy). The current Mermaid and ER generators handle the modular schema correctly.


Deliverables

1. RDF Files (Timestamp: 20251122_155319)

| Format | Size | Lines/Triples | Use Case |
|---|---|---|---|
| OWL/Turtle | 159KB | 2,619 lines | Human-readable, primary format |
| N-Triples | 456KB | 3,027 triples | Bulk loading, streaming |
| JSON-LD | 380KB | 14,094 lines | Web APIs, JavaScript |
| RDF/XML | 328KB | 4,585 lines | Legacy systems |

Location: schemas/20251121/rdf/custodian_multi_aspect_20251122_155319.*

Validation: Zero namespace warnings, clean generation
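
The N-Triples figure can be cross-checked independently, because that format stores one triple per line. The following is a minimal sketch rather than part of the session's tooling; the function name `count_ntriples` is hypothetical, and it assumes the file contains only triples, blank lines and `#` comment lines.

```python
from typing import Iterable


def count_ntriples(lines: Iterable[str]) -> int:
    """Count triples in N-Triples text.

    N-Triples puts one statement on each line, so every line that is
    neither blank nor a comment counts as one triple.
    """
    count = 0
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
    return count


# Small in-memory example (hypothetical data, not taken from the real file).
sample = [
    "<https://example.org/a> <https://example.org/p> <https://example.org/b> .\n",
    "\n",
    "# a comment line\n",
    '<https://example.org/a> <https://example.org/q> "label" .\n',
]
sample_count = count_ntriples(sample)
```

An open file handle can be passed in place of the list, since iterating over a text file yields its lines.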


2. Mermaid Class Diagrams (Generated: 20251122_171316)

Generator: gen-mermaid-class-diagram (LinkML native)

Output: 21 individual Markdown files, one per class

| File | Size | Description |
|---|---|---|
| Custodian.md | 1.2KB | Core hub class |
| CustodianLegalStatus.md | 3.5KB | Legal entity aspect |
| CustodianName.md | 1.7KB | Emic name aspect |
| CustodianObservation.md | 1.7KB | Source observation |
| ReconstructionActivity.md | 1.5KB | Entity resolution |
| ... (16 more classes) | | |

Features:

  • Auto-renders on GitHub/GitLab
  • Clickable links between related classes
  • Shows properties with types
  • Shows relationships with cardinality
  • Native Mermaid syntax (no conversion needed)

Location: schemas/20251121/uml/mermaid/*.md

Index: schemas/20251121/uml/mermaid/index_20251122_171316.md


3. Entity-Relationship Diagram (Generated: 20251122_171249)

Generator: gen-erdiagram -f mermaid (LinkML native)

Output: Single comprehensive ER diagram

| File | Size | Lines | Classes | Format |
|---|---|---|---|---|
| custodian_multi_aspect_20251122_171249.mmd | 8KB | 173 | 35 | Mermaid erDiagram |

Features:

  • All classes in one view
  • Shows all properties per class
  • Entity-relationship syntax (not class diagram)
  • Compact, comprehensive overview

Location: schemas/20251121/uml/erdiagram/custodian_multi_aspect_20251122_171249.mmd


4. PlantUML Diagrams (Workaround for gen-plantuml bug)

Generator: Custom scripts/owl_to_plantuml.py (153 lines)

| File | Size | Description |
|---|---|---|
| .puml | 1.5KB | PlantUML source (editable) |
| .png | 47KB | Raster image |
| .svg | 51KB | Vector graphic (scalable) |

Location: schemas/20251121/uml/plantuml/custodian_multi_aspect_20251122_155319.*

Note: This script is needed only because gen-plantuml has a path resolution bug with modular imports.
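
The script itself is not reproduced in this report. As a rough illustration of the final step such a converter needs, the sketch below turns class names and subclass pairs into PlantUML text. It is not the code in scripts/owl_to_plantuml.py: the function `to_plantuml`, its parameters, and the class name `Agent` are hypothetical, and the step that reads classes out of the Turtle file is left out.

```python
from typing import Iterable, Tuple


def to_plantuml(classes: Iterable[str],
                subclass_of: Iterable[Tuple[str, str]]) -> str:
    """Render class names and (child, parent) pairs as a PlantUML class diagram."""
    lines = ["@startuml"]
    # Declare every class once, in a stable order so output diffs stay small.
    for name in sorted(set(classes)):
        lines.append(f"class {name}")
    # PlantUML draws inheritance as `Parent <|-- Child`.
    for child, parent in sorted(set(subclass_of)):
        lines.append(f"{parent} <|-- {child}")
    lines.append("@enduml")
    return "\n".join(lines) + "\n"


# Hypothetical input: `Agent` is an invented parent class for illustration.
diagram = to_plantuml(
    {"Custodian", "Agent", "ReconstructionAgent"},
    [("ReconstructionAgent", "Agent")],
)
```

Sorting both inputs keeps the generated .puml file deterministic, which makes regenerated diagrams easier to compare across timestamps.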


Namespace Conflict Resolution

Problem

Multiple class modules had duplicate prefix definitions conflicting with modules/metadata.yaml:

# Before (5 files with this issue):
prefixes:
  heritage: https://nde.nl/ontology/hc/  # ❌ Duplicate
  schema: https://schema.org/             # ❌ Duplicate
  org: http://www.w3.org/ns/org#         # ❌ Duplicate

Solution

Removed duplicates, kept only unique prefixes:

# After:
imports:
  - ../metadata  # ✅ Import shared prefixes

prefixes:
  linkml: https://w3id.org/linkml/
  rov: http://www.w3.org/ns/regorg#  # ✅ Only declare unique ones

Files Fixed

  1. LegalEntityType.yaml - Removed 8 duplicates, kept rov
  2. LegalForm.yaml - Removed 4 duplicates, kept rov, gleif, iso20275
  3. RegistrationInfo.yaml - Removed 4 duplicates, kept rov
  4. LegalName.yaml - Removed 3 duplicates, kept rov
  5. ISO20275_mapping.yaml - Removed 3 duplicates, kept iso20275, wd

Result: Clean RDF generation with zero warnings
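
A check like the following could catch the same problem before it reaches gen-owl. It is a minimal sketch, not a script from this session: the function names are hypothetical, the parser is deliberately line-based (no YAML library) and assumes one `name: uri` pair per indented line under a top-level `prefixes:` key, and the contents of modules/metadata.yaml used in the example are assumed.

```python
from typing import Dict, List


def parse_prefixes(yaml_text: str) -> Dict[str, str]:
    """Read the top-level `prefixes:` mapping from simple LinkML YAML text."""
    prefixes: Dict[str, str] = {}
    in_block = False
    for raw in yaml_text.splitlines():
        # Cut trailing comments at " #" so URIs ending in "#" survive.
        line = raw.split(" #", 1)[0].rstrip()
        if not line.strip() or line.strip().startswith("#"):
            continue
        if not raw[0].isspace():
            # A new top-level key starts or ends the prefixes block.
            in_block = line.strip() == "prefixes:"
            continue
        if in_block and ":" in line:
            name, uri = line.strip().split(":", 1)
            prefixes[name.strip()] = uri.strip()
    return prefixes


def duplicate_prefixes(shared_text: str, module_text: str) -> List[str]:
    """Return the prefix names a module redeclares from the shared file, sorted."""
    shared = parse_prefixes(shared_text)
    module = parse_prefixes(module_text)
    return sorted(name for name in module if name in shared)


# Assumed contents, modelled on the snippets above.
shared_yaml = (
    "prefixes:\n"
    "  heritage: https://nde.nl/ontology/hc/\n"
    "  schema: https://schema.org/\n"
    "  org: http://www.w3.org/ns/org#\n"
)
module_yaml = (
    "imports:\n"
    "  - ../metadata\n"
    "\n"
    "prefixes:\n"
    "  org: http://www.w3.org/ns/org#  # redeclared\n"
    "  rov: http://www.w3.org/ns/regorg#\n"
)
redeclared = duplicate_prefixes(shared_yaml, module_yaml)
```

For anything beyond flat prefix maps, a real YAML parser would be the safer choice; the line-based approach is used here only to keep the sketch dependency-free.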


Full Regeneration (RDF + UML)

# Set timestamp
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
cd schemas/20251121/linkml

# 1. Generate RDF (gen-owl - WORKS)
gen-owl -f ttl 01_custodian_name_modular.yaml 2>/dev/null \
  > ../rdf/custodian_multi_aspect_${TIMESTAMP}.owl.ttl

# 2. Convert to other RDF formats (rdfpipe - WORKS)
cd ../rdf
rdfpipe custodian_multi_aspect_${TIMESTAMP}.owl.ttl -o nt 2>/dev/null \
  > custodian_multi_aspect_${TIMESTAMP}.nt
rdfpipe custodian_multi_aspect_${TIMESTAMP}.owl.ttl -o json-ld 2>/dev/null \
  > custodian_multi_aspect_${TIMESTAMP}.jsonld
rdfpipe custodian_multi_aspect_${TIMESTAMP}.owl.ttl -o xml 2>/dev/null \
  > custodian_multi_aspect_${TIMESTAMP}.rdf

# 3. Generate Mermaid class diagrams (LinkML native - WORKS!)
cd ../linkml
gen-mermaid-class-diagram -d ../uml/mermaid 01_custodian_name_modular.yaml

# 4. Generate ER diagram (LinkML native - WORKS!)
gen-erdiagram -f mermaid 01_custodian_name_modular.yaml \
  > ../uml/erdiagram/custodian_multi_aspect_${TIMESTAMP}.mmd

# 5. Generate PlantUML (custom script - workaround for gen-plantuml bug)
cd ../../..
python3 scripts/owl_to_plantuml.py \
  schemas/20251121/rdf/custodian_multi_aspect_${TIMESTAMP}.owl.ttl \
  schemas/20251121/uml/plantuml/custodian_multi_aspect_${TIMESTAMP}.puml

cd schemas/20251121/uml/plantuml
plantuml custodian_multi_aspect_${TIMESTAMP}.puml
plantuml -tsvg custodian_multi_aspect_${TIMESTAMP}.puml

Total Time: ~30 seconds for complete regeneration


Why Initial Diagrams Were "Not Proper"

The Problem

The initial attempt used gen-yuml, which:

  1. Is deprecated (scheduled for removal in LinkML 1.10.0)
  2. Has a path resolution bug (it looks for ReconstructionAgent.yaml at the schema root instead of modules/classes/ReconstructionAgent.yaml)
  3. Produces empty or tiny files when it fails
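
Because a failed generator run leaves an empty or tiny file rather than an error, a size check after generation can surface the failure early. The sketch below is one way to do that; the function name `suspicious_outputs` and the 64-byte threshold are assumptions, not values from this session.

```python
from pathlib import Path
from typing import Iterable, List


def suspicious_outputs(paths: Iterable[str], min_bytes: int = 64) -> List[str]:
    """Return the paths that are missing or smaller than `min_bytes`.

    The default threshold is an arbitrary guess; a real diagram file is
    normally far larger, so tune it to the smallest output you expect.
    """
    flagged = []
    for path in map(Path, paths):
        if not path.is_file() or path.stat().st_size < min_bytes:
            flagged.append(str(path))
    return flagged


# A file that does not exist is reported just like an empty one.
missing = suspicious_outputs(["definitely_missing_output.puml"])
```

Running this over the generated files after each step, and stopping when the returned list is non-empty, avoids carrying a silently broken diagram into later steps.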

The Fix

Use modern LinkML generators that work correctly:

| Instead of... | Use... | Result |
|---|---|---|
| gen-yuml | gen-mermaid-class-diagram | 21 per-class diagrams |
| gen-yuml | gen-erdiagram | Comprehensive ER diagram |
| gen-plantuml (buggy) | Custom owl_to_plantuml.py | PlantUML from RDF |

Lesson: Don't use deprecated generators! Check LinkML release notes for current best practices.


Validation Checklist

  • All namespace conflicts resolved (zero warnings)
  • RDF generation produces non-empty files (1.3MB total)
  • Mermaid class diagrams render on GitHub
  • ER diagram shows all 35 classes
  • PlantUML PNG/SVG generated successfully
  • All files use proper timestamps (YYYYMMDD_HHMMSS)
  • Index file created for navigation
  • Documentation updated
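
The timestamp item can be checked mechanically. The following sketch extracts a YYYYMMDD_HHMMSS stamp from a filename and rejects values that are not real dates or times; the function name `extract_timestamp` is hypothetical.

```python
import re
from datetime import datetime
from typing import Optional

# Eight digits, an underscore, six digits: the YYYYMMDD_HHMMSS shape.
_TIMESTAMP = re.compile(r"(\d{8}_\d{6})")


def extract_timestamp(filename: str) -> Optional[datetime]:
    """Return the datetime encoded in a filename, or None if absent or invalid."""
    match = _TIMESTAMP.search(filename)
    if match is None:
        return None
    try:
        # strptime rejects impossible values such as month 13 or hour 25.
        return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
    except ValueError:
        return None


stamp = extract_timestamp("custodian_multi_aspect_20251122_155319.owl.ttl")
```

Parsing with `strptime` instead of trusting the regular expression alone means a stamp with the right shape but an impossible date is still rejected.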

File Inventory

Generated Artifacts (30 files)

RDF (4 files, 1.3MB):

schemas/20251121/rdf/custodian_multi_aspect_20251122_155319.owl.ttl
schemas/20251121/rdf/custodian_multi_aspect_20251122_155319.nt
schemas/20251121/rdf/custodian_multi_aspect_20251122_155319.jsonld
schemas/20251121/rdf/custodian_multi_aspect_20251122_155319.rdf

Mermaid Class Diagrams (21 files, ~30KB):

schemas/20251121/uml/mermaid/Custodian.md
schemas/20251121/uml/mermaid/CustodianLegalStatus.md
schemas/20251121/uml/mermaid/CustodianName.md
... (18 more classes)

ER Diagram (1 file, 8KB):

schemas/20251121/uml/erdiagram/custodian_multi_aspect_20251122_171249.mmd

PlantUML (3 files, 99KB):

schemas/20251121/uml/plantuml/custodian_multi_aspect_20251122_155319.puml
schemas/20251121/uml/plantuml/custodian_multi_aspect_20251122_155319.png
schemas/20251121/uml/plantuml/custodian_multi_aspect_20251122_155319.svg

Navigation (1 file, 3KB):

schemas/20251121/uml/mermaid/index_20251122_171316.md

Total: 30 files (4 RDF + 21 Mermaid + 1 ER + 3 PlantUML + 1 index)


Documentation Created

  1. RDF_UML_GENERATION_COMPLETE_20251122_155319.md - Detailed session report
  2. QUICK_REFERENCE_REGENERATION.md - One-command regeneration guide
  3. schemas/20251121/uml/mermaid/index_20251122_171316.md - Mermaid navigation index
  4. LINKML_VISUALIZATION_SESSION_COMPLETE_20251122.md (this file) - Final handoff

Next Steps (Optional)

Immediate Use Cases

View Diagrams:

# GitHub web (auto-renders Mermaid)
open "https://github.com/[org]/[repo]/blob/main/schemas/20251121/uml/mermaid/Custodian.md"

# Local VS Code (install Mermaid extension)
code schemas/20251121/uml/mermaid/

Load RDF into SPARQL Endpoint:

# Example: Apache Jena Fuseki
curl -X POST http://localhost:3030/dataset/data \
  -H "Content-Type: text/turtle" \
  --data-binary @schemas/20251121/rdf/custodian_multi_aspect_20251122_155319.owl.ttl

Validate OWL Reasoning:

# Open in Protégé
open -a Protégé schemas/20251121/rdf/custodian_multi_aspect_20251122_155319.owl.ttl

Future Improvements

  • Report gen-plantuml path resolution bug to LinkML GitHub
  • Create combined "overview" Mermaid diagram (all classes in one view)
  • Add interactive navigation between class diagram files
  • Generate HTML documentation with gen-doc
  • Create SHACL shapes for validation
  • Implement OWL reasoning rules

Key Insights for Future Sessions

  1. Check Generator Status First: Always verify which LinkML generators are current/deprecated before using
  2. Native Generators Preferred: Use LinkML's built-in generators when possible (better maintenance)
  3. Modular Schemas Work: Modern LinkML generators correctly handle modular imports
  4. Custom Scripts for Bugs: Acceptable to work around bugs with custom converters (e.g., OWL → PlantUML)
  5. Document Workarounds: Always explain WHY custom scripts exist (so they can be removed once the bug is fixed)

Success Metrics

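| Metric | Target | Actual | Status |
|---|---|---|---|
| Namespace warnings | 0 | 0 | Met |
| RDF file size | >1MB | 1.3MB | Met |
| RDF triple count | >2,000 | 3,027 | Met |
| Class diagrams | 20+ | 21 | Met |
| ER diagram classes | 35 | 35 | Met |
| PlantUML output | 3 files | 3 files | Met |
| Documentation | Complete | Complete | Met |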

Conclusion

Status: COMPLETE

The Heritage Custodian Ontology now has:

  • Clean RDF output (4 formats, 3,027 triples)
  • Modern Mermaid visualizations (21 class diagrams + 1 ER diagram)
  • PlantUML diagrams (PNG + SVG)
  • Complete navigation index
  • Comprehensive documentation

Bottom Line: Modern LinkML generators (gen-mermaid-class-diagram, gen-erdiagram) work correctly with modular schemas. The initial "not proper" complaint was due to using the deprecated gen-yuml. All visualization needs are now met with native LinkML tools plus one custom PlantUML workaround script.

Ready for: SPARQL querying, semantic web integration, ontology reasoning, knowledge graph construction.


Session Completed: 2025-11-22 17:13:16
Primary Artifacts: RDF (20251122_155319), Mermaid (20251122_171316), ER (20251122_171249)
Documentation: LINKML_VISUALIZATION_SESSION_COMPLETE_20251122.md