# UML Diagrams Generated from LinkML Schema
**Date**: 2025-11-23
**Timestamp**: 20251123_174151
**Schema Version**: 0.7.1
**Generator**: Custom Python script (`scripts/generate_uml_diagrams.py`)
---
## Summary
Generated **10 UML diagrams** (5 Mermaid + 5 PlantUML) from the modular LinkML schema, including diagrams of the newly implemented **CustodianType** classification system.
---
## Generated Diagrams
### 1. Full Schema Diagram
**Purpose**: Complete visualization of all 29 classes in the Heritage Custodian schema
**Files**:
- Mermaid: `schemas/20251121/uml/mermaid/full_schema_20251123_174151.mmd` (12 KB)
- PlantUML: `schemas/20251121/uml/plantuml/full_schema_20251123_174151.puml`
**Includes**:
- All 29 classes: Custodian, CustodianObservation, CustodianName, CustodianType, CustodianLegalStatus, CustodianPlace, CustodianCollection, ReconstructionActivity, OrganizationalStructure, OrganizationalChangeEvent, PersonObservation, and 18 supporting classes
- All inheritance relationships (is_a)
- All composition relationships (class-to-class slots)
- 127 total slots visualized
**Use Cases**:
- System architecture overview
- Schema documentation
- Onboarding new developers
---
### 2. Core Classes Diagram
**Purpose**: Focus on the 8 main entities in the hub architecture
**Files**:
- Mermaid: `schemas/20251121/uml/mermaid/core_classes_20251123_174151.mmd` (5.6 KB)
- PlantUML: `schemas/20251121/uml/plantuml/core_classes_20251123_174151.puml`
**Classes Included**:
1. **Custodian** (Hub) - Abstract entity with persistent identifier
2. **CustodianObservation** - Evidence of custodians in sources
3. **CustodianName** - Standardized emic name aspect
4. **CustodianType** - NEW: GLAMORCUBESFIXPHDNT classification (19 types)
5. **CustodianLegalStatus** - Formal legal entity aspect
6. **CustodianPlace** - Nominal place designation aspect
7. **CustodianCollection** - Heritage collection aspect
8. **ReconstructionActivity** - Process generating aspects from observations
**Relationships Visualized**:
- Hub pattern: All aspects link to Custodian via `refers_to_custodian`
- Observation → Reconstruction: CustodianObservation feeds ReconstructionActivity
- Custodian → Type: NEW custodian_type slot (org:classification)
**Use Cases**:
- Understanding the hub architecture
- Multi-aspect modeling explanation
- PiCo-inspired observation/reconstruction pattern
---
### 3. CustodianType Diagram (NEW)
**Purpose**: Visualize the newly implemented type classification system
**Files**:
- Mermaid: `schemas/20251121/uml/mermaid/custodian_type_20251123_174151.mmd` (1.1 KB)
- PlantUML: `schemas/20251121/uml/plantuml/custodian_type_20251123_174151.puml`
**Classes Included**:
1. **Custodian** (Hub)
2. **CustodianType** (Base class - abstract)
**Slots Visualized**:
**Custodian**:
- `hc_id` (required) - Persistent identifier
- `preferred_label` (optional) - Link to CustodianName
- `custodian_type` (optional) - NEW: Link to CustodianType
- `legal_status` (optional) - Link to CustodianLegalStatus
- `place_designation` (optional) - Link to CustodianPlace
- `has_collection` (multivalued) - Links to CustodianCollection
- `organizational_structure` (multivalued) - Operational units
- `organizational_change_events` (multivalued) - History
- `identifiers` (multivalued) - External IDs
- `created`, `modified` - Timestamps
**CustodianType**:
- `type_id` - Persistent identifier (e.g., `https://nde.nl/ontology/hc/type/museum/Q207694`)
- `primary_type` - One of 19 GLAMORCUBESFIXPHDNT categories
- `wikidata_entity` - Wikidata Q-number (e.g., `Q207694`)
- `type_label` - Multilingual labels
- `type_description` - SKOS definition
- `broader_type` - Hierarchical parent
- `narrower_types` - Hierarchical children
- `related_types` - Associative relationships
- `applicable_countries` - Geographic restrictions (ISO 3166-1 alpha-2)
- `created`, `modified` - Timestamps
**Relationship**:
```
Custodian --> "0..1" CustodianType : custodian_type
```
**Cardinality**: 0..1 (optional, single-valued)
**Ontology Alignment** (documented in schema, not visible in UML):
- `org:classification` (W3C Organization Ontology) - PRIMARY
- `crm:P2_has_type` (CIDOC-CRM) - SECONDARY
- `schema:additionalType` (Schema.org) - TERTIARY (Wikidata linking)
**Use Cases**:
- Understanding GLAMORCUBESFIXPHDNT taxonomy integration
- Explaining type classification vs. legal form distinction
- Demonstrating SKOS-based concept scheme
---
### 4. Legal Status Diagram
**Purpose**: Visualize the formal legal entity reconstruction
**Files**:
- Mermaid: `schemas/20251121/uml/mermaid/legal_status_20251123_174151.mmd` (1.8 KB)
- PlantUML: `schemas/20251121/uml/plantuml/legal_status_20251123_174151.puml`
**Classes Included**:
1. **Custodian** (Hub)
2. **CustodianLegalStatus** - Formal legal entity aspect
3. **LegalEntityType** - Type of legal entity (individual, group, organization, government, corporation)
4. **LegalForm** - ISO 20275 legal form codes (foundation, association, etc.)
5. **LegalName** - Full legal name as registered
6. **RegistrationInfo** - Registration details (KvK, company number, etc.)
**Relationships**:
```
Custodian --> "0..1" CustodianLegalStatus : legal_status
CustodianLegalStatus --> "1..1" LegalEntityType : legal_entity_type
CustodianLegalStatus --> "1..1" LegalName : legal_name
CustodianLegalStatus --> "0..1" LegalForm : legal_form
CustodianLegalStatus --> "0..*" RegistrationInfo : registration_numbers
```
**Key Distinction**:
- **CustodianType** (operational classification) - "How does it function?" (museum, library, archive)
- **LegalForm** (legal registration) - "What is its legal structure?" (foundation, association, corporation)
**Example**:
- **Rijksmuseum**:
  - `custodian_type`: Art Museum (Q207694) - MUSEUM
  - `legal_form`: Stichting (ISO 20275 code 8102) - Foundation
**Use Cases**:
- Understanding legal entity reconstruction
- Explaining ISO 20275 legal form codes
- Distinguishing operational type from legal form
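The two axes can be sketched as separate fields on separate objects. This is a minimal illustration, not the schema's generated classes: the `LegalForm` field names (`code`, `label`), the `describe` helper, and the `hc_id` value are hypothetical, and the required `legal_entity_type` and `legal_name` slots are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CustodianType:
    primary_type: str      # one of the GLAMORCUBESFIXPHDNT categories
    wikidata_entity: str   # Wikidata Q-number
    type_label: str


@dataclass(frozen=True)
class LegalForm:
    code: str              # hypothetical field name: ISO 20275 code
    label: str             # hypothetical field name: legal form name


@dataclass(frozen=True)
class CustodianLegalStatus:
    legal_form: Optional[LegalForm] = None  # 0..1; other slots omitted here


@dataclass(frozen=True)
class Custodian:
    hc_id: str
    custodian_type: Optional[CustodianType] = None       # how it functions
    legal_status: Optional[CustodianLegalStatus] = None  # how it is registered


def describe(custodian: Custodian) -> str:
    """Report the operational type and the legal form side by side."""
    ctype = custodian.custodian_type
    status = custodian.legal_status
    form = status.legal_form if status else None
    function = ctype.primary_type if ctype else "unspecified"
    structure = form.label if form else "unspecified"
    return f"functions as {function}; legal structure: {structure}"


rijksmuseum = Custodian(
    hc_id="hc:example-rijksmuseum",  # hypothetical identifier
    custodian_type=CustodianType("MUSEUM", "Q207694", "Art Museum"),
    legal_status=CustodianLegalStatus(legal_form=LegalForm("8102", "Stichting")),
)
print(describe(rijksmuseum))
```

Because the two values live on different objects, either can change (or be absent) without touching the other.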
---
### 5. Organizational Structure Diagram
**Purpose**: Visualize operational units, staff roles, and organizational history
**Files**:
- Mermaid: `schemas/20251121/uml/mermaid/organizational_structure_20251123_174151.mmd` (3.0 KB)
- PlantUML: `schemas/20251121/uml/plantuml/organizational_structure_20251123_174151.puml`
**Classes Included**:
1. **Custodian** (Hub)
2. **OrganizationalStructure** - Operational units (departments, teams, divisions)
3. **OrganizationalChangeEvent** - Historical restructuring events (mergers, splits, dissolutions)
4. **PersonObservation** - Staff roles and affiliations (PiCo pattern)
**Relationships**:
```
Custodian --> "0..*" OrganizationalStructure : organizational_structure
Custodian --> "0..*" OrganizationalChangeEvent : organizational_change_events
OrganizationalStructure --> "0..*" PersonObservation : staff_members
OrganizationalChangeEvent --> "0..*" OrganizationalStructure : affected_units
OrganizationalChangeEvent --> "0..*" OrganizationalStructure : resulting_units
```
**Key Concepts**:
**OrganizationalStructure** (on Custodian):
- **INFORMAL** operational units (not legally registered)
- Examples: "Digital Preservation Department", "Public Services Team"
- Can change frequently without legal reorganization
- Temporal validity: `valid_from`, `valid_to`
**OrganizationalChangeEvent**:
- Documents organizational history (mergers, splits, dissolutions, reorganizations)
- Links to affected units (dissolved) and resulting units (created)
- Temporal alignment: `event_date` marks when resulting structures become valid and affected structures cease
- 9 event types: MERGER, SPLIT, DISSOLUTION, REORGANIZATION, RENAMING, TRANSFER, FOUNDING, EXPANSION, REDUCTION
**PersonObservation**:
- PiCo-inspired pattern for staff roles
- Links person to organizational unit via `unit_affiliation`
- Temporal role: `role_start_date`, `role_end_date`
- Role types: CURATOR, ARCHIVIST, DIRECTOR, CONSERVATOR, etc.
**Distinction from CustodianLegalStatus**:
- **GovernanceStructure** (on CustodianLegalStatus): FORMAL structure from legal registration
- **OrganizationalStructure** (on Custodian): INFORMAL operational structure
**Use Cases**:
- Understanding organizational hierarchy
- Tracking organizational change history
- Staff role management through restructuring
- Temporal consistency validation
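A temporal consistency check along these lines could look like the sketch below. It assumes a strict rule that may be stronger than what the schema enforces: affected units must end exactly on `event_date` and resulting units must start exactly on it. The `Unit` and `ChangeEvent` dataclasses, the `temporal_issues` helper, and the sample data are hypothetical stand-ins for the schema classes.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Unit:
    name: str
    valid_from: Optional[date] = None
    valid_to: Optional[date] = None


@dataclass
class ChangeEvent:
    event_type: str
    event_date: date
    affected_units: List[Unit] = field(default_factory=list)
    resulting_units: List[Unit] = field(default_factory=list)


def temporal_issues(event: ChangeEvent) -> List[str]:
    """Return one message per unit whose validity does not align with the event."""
    issues = []
    for unit in event.affected_units:
        # Assumed rule: an affected unit ceases on the event date.
        if unit.valid_to != event.event_date:
            issues.append(
                f"{unit.name}: valid_to {unit.valid_to} != event_date {event.event_date}"
            )
    for unit in event.resulting_units:
        # Assumed rule: a resulting unit becomes valid on the event date.
        if unit.valid_from != event.event_date:
            issues.append(
                f"{unit.name}: valid_from {unit.valid_from} != event_date {event.event_date}"
            )
    return issues


# Hypothetical data: two teams merged into one department.
merger = ChangeEvent(
    event_type="MERGER",
    event_date=date(2020, 1, 1),
    affected_units=[
        Unit("Public Services Team", valid_to=date(2020, 1, 1)),
        Unit("Reading Room Team", valid_to=date(2019, 12, 31)),
    ],
    resulting_units=[Unit("Public Services Department", valid_from=date(2020, 1, 1))],
)
for issue in temporal_issues(merger):
    print(issue)
```

Event types such as RENAMING or EXPANSION may not end the affected unit at all, so a real check would probably vary the rule by event type.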
---
## Diagram Formats
### Mermaid Class Diagrams (`.mmd`)
**Advantages**:
- ✅ Text-based, version control friendly
- ✅ Can be embedded in Markdown (GitHub, GitLab, VS Code)
- ✅ Live preview in many editors
- ✅ Lightweight syntax
**Viewing**:
```bash
# In VS Code with Mermaid extension
code schemas/20251121/uml/mermaid/core_classes_20251123_174151.mmd
# In browser with Mermaid Live Editor
open https://mermaid.live/
# Paste contents of .mmd file
```
**Example Syntax**:
```mermaid
classDiagram
class Custodian
<<abstract>> Custodian
Custodian : *hc_id uriorcurie
Custodian : +custodian_type CustodianType
Custodian --> "0..1" CustodianType : custodian_type
```
---
### PlantUML Diagrams (`.puml`)
**Advantages**:
- ✅ More mature tooling ecosystem
- ✅ Rich styling options
- ✅ Better for complex diagrams
- ✅ Can generate PNG/SVG/PDF
**Viewing**:
```bash
# Generate PNG from PlantUML
plantuml schemas/20251121/uml/plantuml/core_classes_20251123_174151.puml
# Or use online server
open http://www.plantuml.com/plantuml/uml/
# Paste contents of .puml file
```
**Example Syntax**:
```plantuml
@startuml
abstract class Custodian {
+ hc_id: uriorcurie
- custodian_type: CustodianType
}
Custodian --> "0..1" CustodianType : custodian_type
@enduml
```
---
## Generation Process
### Custom Script Required
**Reason**: The standard LinkML generators (`gen-yuml`, `gen-mermaid-class-diagram`, `gen-plantuml`) failed on the modular schema structure, raising the error shown under **Problem**.
**Problem**:
```
TypeError: SchemaDefinition.__init__() got an unexpected keyword argument 'slot_uri'
```
**Solution**: Created custom script `scripts/generate_uml_diagrams.py` that:
1. Loads and merges all modular schema components
2. Manually constructs Mermaid and PlantUML syntax
3. Generates focused diagrams (core classes, type hierarchy, legal status, etc.)
4. Handles abstract classes, inheritance, and composition relationships
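As a rough illustration of step 2, the sketch below builds Mermaid text by hand from already-merged class and slot dictionaries. It is not the code in `scripts/generate_uml_diagrams.py`: the function name `mermaid_class_diagram`, the dictionary layout, and the `string` default range are assumptions made for the example.

```python
def mermaid_class_diagram(classes: dict, slots: dict) -> str:
    """Render merged class and slot definitions as Mermaid classDiagram text."""
    lines = ["classDiagram"]
    relations = []
    for name, cls in classes.items():
        lines.append(f"class {name}")
        if cls.get("abstract"):
            lines.append(f"<<abstract>> {name}")
        if cls.get("is_a"):
            relations.append(f"{cls['is_a']} <|-- {name}")  # inheritance arrow
        for slot_name in cls.get("slots", []):
            slot = slots.get(slot_name, {})
            marker = "*" if slot.get("required") else "+"   # required vs optional
            suffix = "[]" if slot.get("multivalued") else ""
            slot_range = slot.get("range", "string")
            lines.append(f"{name} : {marker}{slot_name}{suffix} {slot_range}")
            if slot_range in classes:  # class-to-class slot also becomes an arrow
                low = "1" if slot.get("required") else "0"
                high = "*" if slot.get("multivalued") else "1"
                relations.append(
                    f'{name} --> "{low}..{high}" {slot_range} : {slot_name}'
                )
    return "\n".join(lines + relations)


classes = {
    "Custodian": {"abstract": True, "slots": ["hc_id", "custodian_type"]},
    "CustodianType": {"abstract": True},
}
slots = {
    "hc_id": {"required": True, "range": "uriorcurie"},
    "custodian_type": {"range": "CustodianType"},
}
diagram = mermaid_class_diagram(classes, slots)
print(diagram)
```

Collecting relationship lines separately and appending them after the class declarations keeps every class defined before an arrow refers to it.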
### Script Features
**Input**: Main schema file (`01_custodian_name_modular.yaml`)
**Process**:
1. Parse main schema and recursively load all imported modules
2. Merge classes, slots, enums, types into single schema object
3. Generate 5 specialized diagrams per format (10 total)
4. Apply consistent timestamp to all outputs
**Output**: 10 UML diagram files with timestamp `20251123_174151`
**Statistics**:
- **Classes loaded**: 29
- **Slots loaded**: 127
- **Enums loaded**: 11
- **Diagrams generated**: 10 (5 Mermaid + 5 PlantUML)
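The load-and-merge steps can be sketched as a small recursive function. This is an assumption-laden outline rather than the script itself: `merge_schema`, the `load` callable, and the in-memory `MODULES` dictionary are hypothetical, and in practice `load` would read each imported YAML file (for example with PyYAML's `yaml.safe_load`) and skip built-in imports such as `linkml:types`.

```python
from typing import Callable, Optional

SECTIONS = ("classes", "slots", "enums", "types")


def merge_schema(name: str, load: Callable[[str], dict],
                 seen: Optional[set] = None) -> dict:
    """Recursively merge a module and everything it imports into one dict."""
    seen = set() if seen is None else seen
    merged = {section: {} for section in SECTIONS}
    if name in seen:  # guard against circular or repeated imports
        return merged
    seen.add(name)
    module = load(name)
    # Merge imports first so the importing module can override their entries.
    for imported in module.get("imports", []):
        child = merge_schema(imported, load, seen)
        for section in SECTIONS:
            merged[section].update(child[section])
    for section in SECTIONS:
        merged[section].update(module.get(section) or {})
    return merged


# Hypothetical in-memory modules standing in for parsed YAML files.
MODULES = {
    "main": {"imports": ["modules/classes/Custodian", "modules/classes/CustodianType"]},
    "modules/classes/Custodian": {
        "imports": ["modules/classes/CustodianType"],
        "classes": {"Custodian": {"abstract": True, "slots": ["hc_id", "custodian_type"]}},
        "slots": {
            "hc_id": {"required": True, "range": "uriorcurie"},
            "custodian_type": {"range": "CustodianType"},
        },
    },
    "modules/classes/CustodianType": {
        "classes": {"CustodianType": {"abstract": True}},
    },
}

schema = merge_schema("main", MODULES.__getitem__)
print(sorted(schema["classes"]))
print(len(schema["slots"]))
```

The `seen` set means a module imported from several places is loaded only once, which matters when many class modules import the same shared slots or enums.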
---
## File Locations
### Mermaid Diagrams
```
schemas/20251121/uml/mermaid/
├── full_schema_20251123_174151.mmd (12 KB - all 29 classes)
├── core_classes_20251123_174151.mmd (5.6 KB - 8 main entities)
├── custodian_type_20251123_174151.mmd (1.1 KB - NEW type classification)
├── legal_status_20251123_174151.mmd (1.8 KB - legal entity)
└── organizational_structure_20251123_174151.mmd (3.0 KB - operational units)
```
### PlantUML Diagrams
```
schemas/20251121/uml/plantuml/
├── full_schema_20251123_174151.puml
├── core_classes_20251123_174151.puml
├── custodian_type_20251123_174151.puml
├── legal_status_20251123_174151.puml
└── organizational_structure_20251123_174151.puml
```
---
## Diagram Interpretation Guide
### Notation
**Mermaid**:
- `<<abstract>>` - Abstract class (cannot be instantiated)
- `*field` - Required field
- `+field` - Optional field
- `field[]` - Multivalued field (array)
- `-->` - Association/composition relationship
- `<|--` - Inheritance relationship
**PlantUML**:
- `abstract class` - Abstract class
- `+` prefix - Required field
- `-` prefix - Optional field
- `[]` suffix - Multivalued field
- `-->` - Association/composition relationship
- `<|--` - Inheritance relationship
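To make the PlantUML conventions concrete, here is a minimal sketch that emits a class block using them. The function name `plantuml_diagram` and the input dictionaries are hypothetical, and placing the `[]` suffix after the type rather than the field name is an assumption, since the notation above does not say which it follows.

```python
def plantuml_diagram(classes: dict, slots: dict) -> str:
    """Render class blocks using the PlantUML notation listed above."""
    lines = ["@startuml"]
    for name, cls in classes.items():
        keyword = "abstract class" if cls.get("abstract") else "class"
        lines.append(f"{keyword} {name} {{")
        for slot_name in cls.get("slots", []):
            slot = slots.get(slot_name, {})
            prefix = "+" if slot.get("required") else "-"    # required vs optional
            suffix = "[]" if slot.get("multivalued") else ""  # multivalued marker
            lines.append(f"  {prefix} {slot_name}: {slot.get('range', 'string')}{suffix}")
        lines.append("}")
    lines.append("@enduml")
    return "\n".join(lines)


classes = {
    "Custodian": {
        "abstract": True,
        "slots": ["hc_id", "custodian_type", "has_collection"],
    },
}
slots = {
    "hc_id": {"required": True, "range": "uriorcurie"},
    "custodian_type": {"range": "CustodianType"},
    "has_collection": {"range": "CustodianCollection", "multivalued": True},
}
puml = plantuml_diagram(classes, slots)
print(puml)
```

Note that PlantUML itself reads `+` and `-` as public and private visibility; the diagrams reuse those markers to mean required and optional.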
### Cardinality
- `"0..1"` - Optional, single-valued
- `"1..1"` - Required, single-valued
- `"0..*"` - Optional, multivalued
- `"1..*"` - Required, multivalued
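Reading a relationship line back into its meaning can be sketched as below. The regular expression covers only the simple `Source --> "card" Target : slot` form used in this document, and `parse_relationship` is a hypothetical helper, not part of the generator script.

```python
import re

# Cardinality string -> (required, multivalued), per the table above.
CARDINALITY_MEANING = {
    "0..1": (False, False),
    "1..1": (True, False),
    "0..*": (False, True),
    "1..*": (True, True),
}

RELATION = re.compile(r'^(\w+) --> "([01]\.\.[1*])" (\w+) : (\w+)$')


def parse_relationship(line: str) -> dict:
    """Split one association line into its parts and decode the cardinality."""
    match = RELATION.match(line.strip())
    if match is None:
        raise ValueError(f"not a relationship line: {line!r}")
    source, cardinality, target, slot = match.groups()
    required, multivalued = CARDINALITY_MEANING[cardinality]
    return {
        "source": source,
        "target": target,
        "slot": slot,
        "required": required,
        "multivalued": multivalued,
    }


print(parse_relationship('Custodian --> "0..1" CustodianType : custodian_type'))
print(parse_relationship(
    'CustodianLegalStatus --> "0..*" RegistrationInfo : registration_numbers'
))
```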
---
## Use Cases
### Documentation
**Embed in Markdown**:
````markdown
## Custodian Hub Architecture
```mermaid
classDiagram
class Custodian
<<abstract>> Custodian
Custodian : *hc_id uriorcurie
Custodian : +custodian_type CustodianType
```
````
### Schema Validation
Use diagrams to:
- ✅ Verify all classes are present
- ✅ Check inheritance hierarchies
- ✅ Validate composition relationships
- ✅ Ensure required fields are marked
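The first check can be automated with a few lines of Python. This sketch assumes every class appears in the `.mmd` file on its own `class Name` line, which may not hold for classes that Mermaid only infers from a relationship; `declared_classes` and `missing_classes` are hypothetical helper names.

```python
import re
from typing import Iterable, Set


def declared_classes(mermaid_text: str) -> Set[str]:
    """Collect names from lines of the form 'class Name'."""
    # 'classDiagram' is not matched because whitespace must follow 'class'.
    return set(re.findall(r"^\s*class\s+(\w+)", mermaid_text, flags=re.MULTILINE))


def missing_classes(expected: Iterable[str], mermaid_text: str) -> Set[str]:
    """Return expected class names that the diagram never declares."""
    return set(expected) - declared_classes(mermaid_text)


diagram = """classDiagram
class Custodian
<<abstract>> Custodian
class CustodianType
Custodian --> "0..1" CustodianType : custodian_type
"""

print(missing_classes(["Custodian", "CustodianType", "CustodianPlace"], diagram))
```

Against a real file, `diagram` would be the text of the `.mmd` file and `expected` the class names from the merged schema.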
### Ontology Mapping
Compare UML diagrams with ontology documentation:
- W3C Organization Ontology (`org:classification`)
- CIDOC-CRM (`crm:E55_Type`, `crm:P2_has_type`)
- Schema.org (`schema:additionalType`)
### Communication
Share diagrams with:
- Stakeholders (explain data model)
- Developers (implementation guide)
- Data curators (understand structure)
- Ontology engineers (alignment verification)
---
## Next Steps
### Phase 2: Specialized Type Diagrams
When specialized type classes are created (e.g., `ArchiveOrganizationType`, `MuseumType`), generate additional diagrams:
**New Diagrams to Generate**:
1. **Archive Type Hierarchy** - ArchiveOrganizationType + 50+ archive subtypes
2. **Museum Type Hierarchy** - MuseumType + 40+ museum subtypes
3. **Library Type Hierarchy** - LibraryType + 30+ library subtypes
4. **Full Type Taxonomy** - All 19 GLAMORCUBESFIXPHDNT categories with hierarchies
**Command**:
```bash
python scripts/generate_uml_diagrams.py
```
The script will automatically detect new classes and generate updated diagrams.
---
### SVG/PNG Generation (Optional)
Convert PlantUML diagrams to images for presentations:
```bash
# Install PlantUML
brew install plantuml # macOS
# or download from https://plantuml.com/
# Generate PNG
plantuml schemas/20251121/uml/plantuml/core_classes_20251123_174151.puml
# Generate SVG (vector, scalable)
plantuml -tsvg schemas/20251121/uml/plantuml/core_classes_20251123_174151.puml
# Generate PDF (PDF export may need extra libraries installed; see the PlantUML documentation)
plantuml -tpdf schemas/20251121/uml/plantuml/core_classes_20251123_174151.puml
```
**Output**: Images in same directory as `.puml` files
---
### Mermaid Live Preview
For interactive editing and preview:
1. **VS Code Extension**: Install "Markdown Preview Mermaid Support"
2. **GitHub**: Native Mermaid rendering in `.md` files
3. **Mermaid Live**: https://mermaid.live/ (online editor)
---
## Regeneration Instructions
**When to Regenerate**:
- After adding new classes to schema
- After modifying class slots or relationships
- After schema version updates
**Command**:
```bash
cd /Users/kempersc/apps/glam  # repository root; adjust to your local checkout
python scripts/generate_uml_diagrams.py
```
**Output**: New diagrams with updated timestamp
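The shared timestamp in the file names can be reproduced as in the sketch below, which formats one `datetime` once and reuses it for all ten paths. The `output_paths` helper and the constant names are assumptions; only the directory layout, diagram names, and timestamp format are taken from the file listing above.

```python
from datetime import datetime
from pathlib import Path
from typing import List

DIAGRAMS = (
    "full_schema",
    "core_classes",
    "custodian_type",
    "legal_status",
    "organizational_structure",
)
FORMATS = {"mermaid": "mmd", "plantuml": "puml"}  # subdirectory -> extension


def output_paths(uml_dir: Path, now: datetime) -> List[Path]:
    """Build every output path with a single timestamp shared by the whole run."""
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return [
        uml_dir / subdir / f"{name}_{stamp}.{ext}"
        for subdir, ext in FORMATS.items()
        for name in DIAGRAMS
    ]


paths = output_paths(Path("schemas/20251121/uml"), datetime(2025, 11, 23, 17, 41, 51))
for path in paths:
    print(path.as_posix())
```

Computing the stamp once, rather than per file, is what keeps a Mermaid file and its PlantUML counterpart paired by name.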
---
## Quality Assurance
### Diagram Validation Checklist
**CustodianType Diagram**:
- [x] CustodianType class is abstract
- [x] custodian_type slot present on Custodian
- [x] Cardinality is "0..1" (optional, single-valued)
- [x] 11 slots visible on CustodianType
- [x] wikidata_entity field present
- [x] primary_type field present
**Core Classes Diagram**:
- [x] All 8 main entities present
- [x] Hub pattern visible (Custodian → aspects)
- [x] CustodianType included (NEW)
- [x] Inheritance relationships shown
- [x] Composition relationships shown
**Full Schema Diagram**:
- [x] 29 classes visualized
- [x] 127 slots represented
- [x] No duplicate classes
- [x] All relationships valid
---
## References
### Schema Files
- Main schema: `schemas/20251121/linkml/01_custodian_name_modular.yaml`
- CustodianType class: `schemas/20251121/linkml/modules/classes/CustodianType.yaml`
- Custodian class: `schemas/20251121/linkml/modules/classes/Custodian.yaml`
### Generation Script
- Script: `scripts/generate_uml_diagrams.py`
- Language: Python 3.12
- Dependencies: PyYAML (third-party); `pathlib` and `datetime` (Python standard library)
### Documentation
- Implementation summary: `CUSTODIAN_TYPE_ONTOLOGY_ALIGNMENT_COMPLETE.md`
- Quick status: `QUICK_STATUS_CUSTODIAN_TYPE_20251123.md`
- Ontology consultation: `ONTOLOGY_CONSULTATION_REPORT_CUSTODIAN_TYPE.md`
### External Tools
- LinkML: https://linkml.io/
- Mermaid: https://mermaid.js.org/
- PlantUML: https://plantuml.com/
---
**Document Version**: 1.0
**Created**: 2025-11-23
**Timestamp**: 20251123_174151
**Diagrams Generated**: 10 (5 Mermaid + 5 PlantUML)
**Schema Version**: 0.7.1