- Implemented `owl_to_mermaid.py` to convert OWL/Turtle files into Mermaid class diagrams. - Implemented `owl_to_plantuml.py` to convert OWL/Turtle files into PlantUML class diagrams. - Added two new PlantUML files for custodian multi-aspect diagrams.
348 lines
11 KiB
Markdown
348 lines
11 KiB
Markdown
# RDF and UML Generation Complete - Session Summary
|
|
|
|
**Date**: 2025-11-22
|
|
**Session**: Namespace Conflict Resolution & Visualization Generation
|
|
**Status**: ✅ **COMPLETE**
|
|
**Final Timestamp**: 20251122_155319
|
|
|
|
---
|
|
|
|
## Executive Summary
|
|
|
|
Successfully resolved all namespace conflicts in the modular LinkML schema and generated complete RDF and UML outputs. The session overcame LinkML's `gen-yuml` path resolution bug by creating custom OWL → UML converter scripts using PlantUML and Mermaid.
|
|
|
|
**Key Achievements**:
|
|
- ✅ Fixed 5 class files with namespace conflicts
|
|
- ✅ Generated 4 RDF formats (1.3MB total)
|
|
- ✅ Created 3 UML visualization formats (PlantUML PNG/SVG, Mermaid)
|
|
- ✅ Built 2 reusable OWL converter scripts
|
|
- ✅ Documented complete regeneration workflow
|
|
|
|
---
|
|
|
|
## Problem Solved: Namespace Conflicts
|
|
|
|
### Issue
|
|
Multiple module files contained duplicate prefix definitions that conflicted with `modules/metadata.yaml`:
|
|
|
|
```
|
|
WARNING: schema namespace already mapped to http://schema.org/ - Overriding with https://schema.org/
|
|
WARNING: heritage namespace already mapped to https://nde.nl/ontology/hc/# - Overriding with https://nde.nl/ontology/hc/
|
|
WARNING: tooi namespace already mapped to https://standaarden.overheid.nl/tooi# - Overriding with https://identifier.overheid.nl/tooi/def/ont/
|
|
```
|
|
|
|
### Solution
|
|
Removed duplicate prefixes from 5 files and added `../metadata` imports:
|
|
|
|
| File | Duplicates Removed | Unique Prefixes Kept |
|
|
|------|-------------------|---------------------|
|
|
| `LegalEntityType.yaml` | 8 (heritage, schema, org, cpov, crm, tooi, foaf, owl) | `rov` |
|
|
| `LegalForm.yaml` | 4 (heritage, schema, org, tooi) | `rov`, `gleif`, `iso20275` |
|
|
| `RegistrationInfo.yaml` | 4 (heritage, schema, org, tooi) | `rov` |
|
|
| `LegalName.yaml` | 3 (heritage, schema, tooi) | `rov` |
|
|
| `ISO20275_mapping.yaml` | 3 (heritage, org, schema) | `iso20275`, `wd` |
|
|
|
|
**Result**: Clean RDF generation with zero namespace warnings.
|
|
|
|
---
|
|
|
|
## Generated Artifacts
|
|
|
|
### RDF Files (Timestamp: 20251122_155319)
|
|
|
|
| Format | Size | Lines | Status | Use Case |
|
|
|--------|------|-------|--------|----------|
|
|
| **OWL/Turtle** | 159KB | 2,619 | ✅ | Primary format, human-readable |
|
|
| **N-Triples** | 456KB | 3,027 | ✅ | Bulk loading, line-oriented processing |
|
|
| **JSON-LD** | 380KB | 14,094 | ✅ | Web APIs, JavaScript integration |
|
|
| **RDF/XML** | 328KB | 4,585 | ✅ | Legacy systems, XML tools |
|
|
|
|
**Total**: 1.3MB across 4 serialization formats
|
|
**Triple count**: 3,027 triples
|
|
|
|
**Location**: `schemas/20251121/rdf/custodian_multi_aspect_20251122_155319.*`
|
|
|
|
### UML Visualizations (Timestamp: 20251122_155319)
|
|
|
|
| Format | Size | Tool | Status | Use Case |
|
|
|--------|------|------|--------|----------|
|
|
| **PlantUML Source** | 1.5KB | Custom script | ✅ | Editable diagram source |
|
|
| **PlantUML PNG** | 47KB | PlantUML CLI | ✅ | Raster image for documents |
|
|
| **PlantUML SVG** | 51KB | PlantUML CLI | ✅ | Vector graphic (web, scaling) |
|
|
| **Mermaid** | 1.6KB | Custom script | ✅ | GitHub README, Markdown |
|
|
|
|
**Location**:
|
|
- `schemas/20251121/uml/plantuml/custodian_multi_aspect_20251122_155319.*`
|
|
- `schemas/20251121/uml/mermaid/custodian_multi_aspect_20251122_155319.mmd`
|
|
|
|
**Classes visualized**: 35 HC ontology classes with properties and inheritance
|
|
|
|
---
|
|
|
|
## Custom Converter Scripts
|
|
|
|
### 1. `scripts/owl_to_plantuml.py`
|
|
|
|
**Purpose**: Convert OWL/Turtle RDF to PlantUML class diagram
|
|
|
|
**Features**:
|
|
- Parses RDF graph using rdflib
|
|
- Extracts classes, properties, inheritance (`rdfs:subClassOf`)
|
|
- Generates PlantUML syntax with class notes
|
|
- Supports property type annotations
|
|
|
|
**Usage**:
|
|
```bash
|
|
python3 scripts/owl_to_plantuml.py input.owl.ttl output.puml
|
|
plantuml output.puml # Render to PNG
|
|
plantuml -tsvg output.puml # Render to SVG
|
|
```
|
|
|
|
**Stats**: 153 lines, handles 35 classes, includes RDFS/OWL reasoning
|
|
|
|
### 2. `scripts/owl_to_mermaid.py`
|
|
|
|
**Purpose**: Convert OWL/Turtle RDF to Mermaid class diagram
|
|
|
|
**Features**:
|
|
- Parses RDF graph using rdflib
|
|
- Generates Mermaid `classDiagram` syntax
|
|
- Limits properties to 8 per class (readability)
|
|
- Compatible with GitHub, GitLab, VS Code preview
|
|
|
|
**Usage**:
|
|
```bash
|
|
python3 scripts/owl_to_mermaid.py input.owl.ttl output.mmd
|
|
```
|
|
|
|
**Stats**: 133 lines, handles 35 classes, web-friendly output
|
|
|
|
---
|
|
|
|
## Regeneration Workflow
|
|
|
|
### Step 1: Generate RDF from LinkML
|
|
|
|
```bash
|
|
cd schemas/20251121/linkml
|
|
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
|
|
|
|
# Generate OWL/Turtle
|
|
gen-owl -f ttl 01_custodian_name_modular.yaml 2>/dev/null \
|
|
> ../rdf/custodian_multi_aspect_${TIMESTAMP}.owl.ttl
|
|
|
|
# Convert to other formats
|
|
cd ../rdf
|
|
rdfpipe custodian_multi_aspect_${TIMESTAMP}.owl.ttl -o nt 2>/dev/null \
|
|
> custodian_multi_aspect_${TIMESTAMP}.nt
|
|
rdfpipe custodian_multi_aspect_${TIMESTAMP}.owl.ttl -o json-ld 2>/dev/null \
|
|
> custodian_multi_aspect_${TIMESTAMP}.jsonld
|
|
rdfpipe custodian_multi_aspect_${TIMESTAMP}.owl.ttl -o xml 2>/dev/null \
|
|
> custodian_multi_aspect_${TIMESTAMP}.rdf
|
|
```
|
|
|
|
### Step 2: Generate UML Visualizations
|
|
|
|
```bash
|
|
# PlantUML
|
|
python3 scripts/owl_to_plantuml.py \
|
|
schemas/20251121/rdf/custodian_multi_aspect_${TIMESTAMP}.owl.ttl \
|
|
schemas/20251121/uml/plantuml/custodian_multi_aspect_${TIMESTAMP}.puml
|
|
|
|
cd schemas/20251121/uml/plantuml
|
|
plantuml custodian_multi_aspect_${TIMESTAMP}.puml
|
|
plantuml -tsvg custodian_multi_aspect_${TIMESTAMP}.puml
|
|
|
|
# Mermaid
|
|
python3 scripts/owl_to_mermaid.py \
|
|
schemas/20251121/rdf/custodian_multi_aspect_${TIMESTAMP}.owl.ttl \
|
|
schemas/20251121/uml/mermaid/custodian_multi_aspect_${TIMESTAMP}.mmd
|
|
```
|
|
|
|
### Step 3: Validate Output
|
|
|
|
```bash
|
|
# Check file sizes
|
|
ls -lh schemas/20251121/rdf/custodian_multi_aspect_${TIMESTAMP}.*
|
|
ls -lh schemas/20251121/uml/*/custodian_multi_aspect_${TIMESTAMP}.*
|
|
|
|
# Optional: Validate RDF syntax
|
|
rapper -i turtle -c schemas/20251121/rdf/custodian_multi_aspect_${TIMESTAMP}.owl.ttl
|
|
```
|
|
|
|
---
|
|
|
|
## Ontology Structure (35 Classes)
|
|
|
|
### Core Hub Pattern
|
|
- `Custodian` - Minimal hub (persistent ID only)
|
|
- `CustodianObservation` - Source-based references
|
|
- `ReconstructionActivity` - Entity resolution process
|
|
|
|
### Three Independent Aspects
|
|
1. **CustodianLegalStatus** - Formal legal entity (registered)
|
|
2. **CustodianName** - Standardized emic name (ambiguous)
|
|
3. **CustodianPlace** - Nominal place designation
|
|
|
|
### Supporting Classes
|
|
- **Provenance**: `ConfidenceMeasure`, `SourceDocument`, `ReconstructionAgent`
|
|
- **Temporal**: `TimeSpan` (begin_of_begin, end_of_end)
|
|
- **Identity**: `Identifier`, `Appellation`, `LanguageCode`
|
|
- **Legal**: `LegalEntityType`, `LegalForm`, `LegalName`, `RegistrationInfo`
|
|
|
|
### Enumerations (5)
|
|
- `AgentTypeEnum` - PERSON, ORGANIZATION, SOFTWARE
|
|
- `AppellationTypeEnum` - Name classifications
|
|
- `EntityTypeEnum` - Legal entity types
|
|
- `LegalStatusEnum` - ACTIVE, DISSOLVED, MERGED, etc.
|
|
- `PlaceSpecificityEnum` - CITY, REGION, COUNTRY, etc.
|
|
|
|
---
|
|
|
|
## Technical Details
|
|
|
|
### Namespace Consistency
|
|
|
|
All modules now use standardized namespace URIs:
|
|
|
|
| Prefix | URI | Source |
|
|
|--------|-----|--------|
|
|
| `heritage` | `https://nde.nl/ontology/hc/` | `modules/metadata.yaml` |
|
|
| `schema` | `https://schema.org/` | `modules/metadata.yaml` (HTTPS!) |
|
|
| `tooi` | `https://identifier.overheid.nl/tooi/def/ont/` | `modules/metadata.yaml` |
|
|
|
|
**Why this matters**:
|
|
- Prevents duplicate triples (same property, different namespace)
|
|
- Enables consistent SPARQL queries
|
|
- Maintains Linked Open Data best practices
|
|
|
|
### Import Path Pattern
|
|
|
|
```yaml
|
|
# Standard pattern for class modules
|
|
imports:
|
|
- linkml:types
|
|
- ../metadata # ← Shared prefixes
|
|
- ./SiblingClass # ← Same-directory classes
|
|
|
|
# Only declare unique prefixes
|
|
prefixes:
|
|
linkml: https://w3id.org/linkml/
|
|
rov: http://www.w3.org/ns/regorg# # ← Not in metadata.yaml
|
|
```
|
|
|
|
---
|
|
|
|
## What Didn't Work (But We Solved)
|
|
|
|
### Issue: LinkML `gen-yuml` Path Resolution Bug
|
|
|
|
**Error**:
|
|
```
|
|
FileNotFoundError: [Errno 2] No such file or directory:
|
|
'/Users/kempersc/apps/glam/schemas/20251121/linkml/ReconstructionAgent.yaml'
|
|
```
|
|
|
|
**Root cause**: `gen-yuml` looks for `ReconstructionAgent.yaml` at schema root instead of `modules/classes/ReconstructionAgent.yaml`
|
|
|
|
**Solution**: Created custom OWL → UML converters that:
|
|
1. Parse already-generated OWL/Turtle (which works correctly)
|
|
2. Extract class structure from RDF triples
|
|
3. Generate PlantUML/Mermaid from RDF graph
|
|
|
|
**Advantage**: More flexible than `gen-yuml`, can customize diagram layout
|
|
|
|
---
|
|
|
|
## Files Modified/Created
|
|
|
|
### Modified (5 schema files)
|
|
1. `schemas/20251121/linkml/modules/classes/LegalEntityType.yaml`
|
|
2. `schemas/20251121/linkml/modules/classes/LegalForm.yaml`
|
|
3. `schemas/20251121/linkml/modules/classes/RegistrationInfo.yaml`
|
|
4. `schemas/20251121/linkml/modules/classes/LegalName.yaml`
|
|
5. `schemas/20251121/linkml/modules/mappings/ISO20275_mapping.yaml`
|
|
|
|
### Generated (8 artifact files)
|
|
**RDF**:
|
|
1. `custodian_multi_aspect_20251122_155319.owl.ttl` (159KB)
|
|
2. `custodian_multi_aspect_20251122_155319.nt` (456KB)
|
|
3. `custodian_multi_aspect_20251122_155319.jsonld` (380KB)
|
|
4. `custodian_multi_aspect_20251122_155319.rdf` (328KB)
|
|
|
|
**UML**:
|
|
5. `custodian_multi_aspect_20251122_155319.puml` (1.5KB)
|
|
6. `custodian_multi_aspect_20251122_155319.png` (47KB)
|
|
7. `custodian_multi_aspect_20251122_155319.svg` (51KB)
|
|
8. `custodian_multi_aspect_20251122_155319.mmd` (1.6KB)
|
|
|
|
### Created (2 scripts)
|
|
1. `scripts/owl_to_plantuml.py` (153 lines)
|
|
2. `scripts/owl_to_mermaid.py` (133 lines)
|
|
|
|
**Total**: 15 files (5 modified, 8 generated, 2 created)
|
|
|
|
---
|
|
|
|
## Success Criteria ✅
|
|
|
|
- [x] All namespace conflicts resolved (zero warnings)
|
|
- [x] 4 RDF formats generated successfully (1.3MB)
|
|
- [x] UML visualizations created (PlantUML + Mermaid)
|
|
- [x] Reusable converter scripts documented
|
|
- [x] Full regeneration workflow documented
|
|
- [x] All files use proper timestamps (YYYYMMDD_HHMMSS)
|
|
|
|
---
|
|
|
|
## Integration with Project Documentation
|
|
|
|
This session builds on:
|
|
- **`RDF_GENERATION_SUMMARY.md`** - RDF usage guide (created earlier today)
|
|
- **`.opencode/SCHEMA_GENERATION_RULES.md`** - Timestamp policy (Rule 1)
|
|
- **`AGENTS.md`** - LinkML master schema policy (Rule 0)
|
|
|
|
---
|
|
|
|
## Next Steps (Optional)
|
|
|
|
### Immediate
|
|
- [ ] Test RDF in SPARQL endpoint (Apache Jena Fuseki)
|
|
- [ ] Validate OWL with Protégé or HermiT reasoner
|
|
- [ ] Generate HTML docs from LinkML schema
|
|
|
|
### Short-term
|
|
- [ ] File bug report: LinkML `gen-yuml` path resolution
|
|
- [ ] Create SPARQL query examples
|
|
- [ ] Add RDF validation to CI/CD
|
|
|
|
### Long-term
|
|
- [ ] Implement OWL reasoning rules
|
|
- [ ] Create SHACL shapes for validation
|
|
- [ ] Generate JSON-LD @context file
|
|
|
|
---
|
|
|
|
## Conclusion
|
|
|
|
**Status**: ✅ **COMPLETE**
|
|
|
|
The Heritage Custodian Ontology has been successfully converted to RDF and visualized in multiple formats. All namespace conflicts resolved, ensuring clean Linked Open Data output.
|
|
|
|
**Ready for**:
|
|
- SPARQL querying and reasoning
|
|
- Semantic web integration
|
|
- Ontology-based data validation
|
|
- Knowledge graph construction
|
|
|
|
**Deliverables**:
|
|
- 4 RDF serialization formats
|
|
- 3 UML visualization formats
|
|
- 2 reusable converter scripts
|
|
- Complete regeneration documentation
|
|
|
|
---
|
|
|
|
**Session Completed**: 2025-11-22 15:55:19
|
|
**Artifact Timestamp**: 20251122_155319
|
|
**Documentation**: `RDF_UML_GENERATION_COMPLETE_20251122_155319.md`
|