RDF and UML Generation Complete
Date: 2025-11-22
Schema Version: 20251121
Status: ✅ COMPLETE
Summary
Successfully generated all RDF serializations and UML diagrams for the Heritage Custodian Ontology with the new legal entity model (v0.2.2).
Generated Files
RDF Formats (7 serializations)
All generated from: schemas/20251121/linkml/01_custodian_name_modular.yaml
| Format | File | Size | Lines | Triples | Description |
|---|---|---|---|---|---|
| Turtle | 01_custodian_name_modular.owl.ttl | 140K | 2,328 | 2,701 | Primary OWL ontology (human-readable) |
| N-Triples | 01_custodian_name_modular.nt | 452K | 2,701 | 2,701 | Line-based triple format (machine-readable) |
| JSON-LD | 01_custodian_name_modular.jsonld | 336K | 7,451 | 2,701 | JSON Linked Data (web-friendly) |
| RDF/XML | 01_custodian_name_modular.rdf | 324K | 10,810 | 2,701 | XML serialization (legacy compatibility) |
| N3 | 01_custodian_name_modular.n3 | 196K | 5,144 | 2,701 | Notation3 (Turtle superset) |
| TriG | 01_custodian_name_modular.trig | 196K | 5,144 | 2,701 | Named graphs extension |
| TriX | 01_custodian_name_modular.trix | 644K | 21,377 | 2,701 | XML with named graphs |
Total RDF Size: ~2.3 MB
Total RDF Lines: 54,955 (sum of the per-file line counts above)
UML Diagrams (2 formats)
| Format | File | Size | Description |
|---|---|---|---|
| Mermaid | uml/mermaid/01_custodian_name_modular.mmd | 6.0K | Markdown-based class diagram (GitHub-friendly) |
| PlantUML | uml/plantuml/01_custodian_name_modular.puml | 7.5K | UML class diagram with color-coded packages |
Validation Results
RDF Validation ✅
Using rdflib Python library:
✅ Turtle validation: SUCCESS
Triples: 2,701
Subjects: 652
Predicates: 36
Objects: 1,325
Key Statistics:
- 2,701 triples - All class/slot/enum definitions and mappings
- 652 unique subjects - Classes, slots, enums, and their components
- 36 unique predicates - RDF/RDFS/OWL properties
- 1,325 unique objects - Property values and types
Ontology Coverage
The generated RDF includes:
Classes (17):
- Custodian (hub)
- CustodianObservation, CustodianName (observation pattern)
- CustodianReconstruction (reconstruction pattern)
- LegalEntityType (NEW)
- LegalForm (NEW)
- LegalName (NEW)
- RegistrationNumber (NEW, within RegistrationInfo)
- RegistrationAuthority (NEW, within RegistrationInfo)
- GovernanceStructure (NEW, within RegistrationInfo)
- LegalStatus (NEW, within RegistrationInfo)
- SourceDocument, TimeSpan, ConfidenceMeasure
- ReconstructionActivity, ReconstructionAgent
- Identifier, LanguageCode, Appellation
Enums (6):
- AppellationTypeEnum
- AgentTypeEnum
- EntityTypeEnum (DEPRECATED, use LegalEntityType)
- LegalStatusEnum (DEPRECATED, use LegalStatus class)
- ReconstructionActivityTypeEnum
- SourceDocumentTypeEnum
Slots (59+):
- All 59 modular slot definitions
- Including new legal entity slots: legal_entity_type, registration_numbers
UML Diagram Features
Mermaid Diagram
Features:
- Class diagram with all 17 classes
- Hub-Observation-Reconstruction pattern visualization
- Legal entity model highlighted (8 new classes)
- Relationship arrows with cardinality
- Inline notes for key classes
- GitHub-renderable (displays directly in markdown files)
Sections:
- Hub Pattern (Custodian)
- Observation Pattern (CustodianObservation, CustodianName)
- Reconstruction Pattern (CustodianReconstruction)
- Legal Entity Model (8 classes, highlighted)
- Supporting Classes (9 classes)
PlantUML Diagram
Features:
- Color-coded packages:
  - 🔵 Light Blue: Hub (Custodian)
  - 🟢 Light Green: Observations
  - 🔴 Light Coral: Reconstructions
  - 🟡 Gold: Legal Entity classes
  - ⚪ Light Gray: Supporting classes
- Detailed class attributes with types
- Relationship arrows with labels
- Comprehensive notes explaining:
  - Hub pattern (minimal entity)
  - Observation pattern (source evidence)
  - Reconstruction pattern (formal entity)
  - Legal entity classes (NEW in v0.2.2)
  - ISO 20275 and TOOI references
Rendering:
- Use PlantUML server: https://www.plantuml.com/plantuml/
- Or local PlantUML CLI:
plantuml 01_custodian_name_modular.puml
Generation Process
Step 1: Generate OWL/Turtle
gen-owl -f ttl schemas/20251121/linkml/01_custodian_name_modular.yaml 2>/dev/null \
> schemas/20251121/rdf/01_custodian_name_modular.owl.ttl
Output: 138K Turtle file with 2,328 lines
Step 2: Convert to Other RDF Formats
cd schemas/20251121/rdf
rdfpipe -i turtle -o nt 01_custodian_name_modular.owl.ttl > 01_custodian_name_modular.nt
rdfpipe -i turtle -o json-ld 01_custodian_name_modular.owl.ttl > 01_custodian_name_modular.jsonld
rdfpipe -i turtle -o xml 01_custodian_name_modular.owl.ttl > 01_custodian_name_modular.rdf
rdfpipe -i turtle -o n3 01_custodian_name_modular.owl.ttl > 01_custodian_name_modular.n3
rdfpipe -i turtle -o trig 01_custodian_name_modular.owl.ttl > 01_custodian_name_modular.trig
rdfpipe -i turtle -o trix 01_custodian_name_modular.owl.ttl > 01_custodian_name_modular.trix
Tool: rdfpipe from rdflib package
Step 3: Create UML Diagrams (Manual)
LinkML's auto-generators (gen-plantuml, gen-yuml) do not handle modular schemas properly, so both diagrams were authored manually from the schema structure.
Mermaid: Manually authored class diagram with all relationships
PlantUML: Manually authored with color-coded packages and detailed notes
Step 4: Validate
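from rdflib import Graph
# Parsing raises an exception if the Turtle syntax is invalid.
g = Graph()
g.parse('01_custodian_name_modular.owl.ttl', format='turtle')
print(f"SUCCESS: {len(g):,} triples")  # reported for this run: 2,701 triples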
Ontology Mappings in RDF
The generated RDF includes mappings to:
W3C/DCMI Vocabularies
- OWL: Class/property definitions
- RDFS: Labels, comments, subclass relationships
- RDF: Type assertions
- DCTERMS: Title, license, version
- SKOS: Definitions, notes, exact/close mappings
- PAV: Provenance (version, license)
- FOAF: Agent information
- PROV-O: Activity tracking
- TIME: Temporal expressions
Domain Ontologies
- W3C Org Ontology (org:): Organization structure
  - org:classification (LegalEntityType)
  - org:hasUnit (GovernanceStructure)
- ROV (rov:): Registered organizations
  - rov:legalName (LegalName)
  - rov:orgType (LegalForm)
  - rov:registration (RegistrationNumber)
  - rov:hasRegisteredOrganization (RegistrationAuthority)
- TOOI (tooi:): Dutch government
  - tooi:rechtsvorm (legal form)
  - tooi:organisatieIdentificatie (registration)
  - tooi:officieleNaamInclSoort (legal name)
- GLEIF (gleif:): Legal entity identifiers
  - gleif:hasLegalForm (LegalForm)
  - gleif-base:hasEntityStatus (LegalStatus)
- Schema.org (schema:): Web semantics
  - schema:status (LegalStatus)
  - schema:identifier (identifiers)
  - schema:legalName (legal name)
RDF Format Comparison
| Format | Human-Readable | Machine-Readable | Web-Friendly | Compression | Use Case |
|---|---|---|---|---|---|
| Turtle | ✅ Excellent | ✅ Good | 🟡 Fair | Best | Editing, documentation |
| N-Triples | 🟡 Fair | ✅ Excellent | 🟡 Fair | None | Streaming, line-by-line processing |
| JSON-LD | 🟡 Fair | ✅ Excellent | ✅ Excellent | Good | Web APIs, JavaScript |
| RDF/XML | ❌ Poor | ✅ Good | 🟡 Fair | Fair | Legacy systems, XML tools |
| N3 | ✅ Excellent | ✅ Good | 🟡 Fair | Best | Advanced logic, rules |
| TriG | ✅ Good | ✅ Good | 🟡 Fair | Best | Named graphs, datasets |
| TriX | ❌ Poor | ✅ Good | 🟡 Fair | Poor | XML + named graphs |
Recommendations:
- Development/Documentation: Use Turtle (most readable)
- Web APIs: Use JSON-LD (web-native)
- Bulk Processing: Use N-Triples (line-based, streaming)
- SPARQL Queries: Load Turtle or TriG into a triplestore
- Legacy Integration: Use RDF/XML if required
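The bulk-processing recommendation rests on N-Triples holding exactly one triple per line, which allows a file to be scanned without building a graph in memory. The sketch below uses only the standard library; `count_predicates` is a hypothetical helper and deliberately not a full N-Triples parser. It relies on the fact that subjects and predicates never contain whitespace.

```python
from collections import Counter

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
RDFS_LABEL = "<http://www.w3.org/2000/01/rdf-schema#label>"

# Illustrative N-Triples lines; a real run would iterate over an open file.
SAMPLE_NT = [
    "<http://example.org/Custodian> " + RDF_TYPE + " <http://www.w3.org/2002/07/owl#Class> .",
    "<http://example.org/Custodian> " + RDFS_LABEL + ' "Custodian hub class" .',
    "",
    "# comment lines and blank lines are skipped",
    "_:b0 " + RDF_TYPE + " <http://www.w3.org/2002/07/owl#Restriction> .",
]


def count_predicates(lines):
    """Stream over N-Triples lines and tally how often each predicate occurs."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split off subject and predicate; the object may contain spaces.
        _subject, predicate, _rest = line.split(None, 2)
        counts[predicate] += 1
    return counts


predicate_counts = count_predicates(SAMPLE_NT)
for predicate, n in predicate_counts.most_common():
    print(n, predicate)
```

For the generated file, the same function can be fed an open file handle, for example `count_predicates(open("01_custodian_name_modular.nt", encoding="utf-8"))`, so memory use stays flat regardless of file size.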
SPARQL Query Examples
Query 1: Find All Legal Entity Types
PREFIX heritage: <https://nde.nl/ontology/hc/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?type ?label ?description
WHERE {
?type a heritage:LegalEntityType .
OPTIONAL { ?type rdfs:label ?label }
OPTIONAL { ?type heritage:description ?description }
}
Query 2: Find All Classes with Legal Form
PREFIX heritage: <https://nde.nl/ontology/hc/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?class ?label
WHERE {
?class rdfs:subClassOf* heritage:CustodianReconstruction .
?class rdfs:label ?label .
FILTER EXISTS { ?class heritage:legal_form ?form }
}
Query 3: List All Slots with a ROV (regorg) Mapping
PREFIX heritage: <https://nde.nl/ontology/hc/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rov: <http://www.w3.org/ns/regorg#>
SELECT ?slot ?label ?mapping
WHERE {
?slot a heritage:Slot .
?slot rdfs:label ?label .
?slot skos:exactMatch|skos:closeMatch ?mapping .
FILTER (CONTAINS(STR(?mapping), "regorg"))
}
File Locations
schemas/20251121/
├── linkml/
│ └── 01_custodian_name_modular.yaml # Source LinkML schema
│
├── rdf/
│ ├── 01_custodian_name_modular.owl.ttl # Turtle (primary)
│ ├── 01_custodian_name_modular.nt # N-Triples
│ ├── 01_custodian_name_modular.jsonld # JSON-LD
│ ├── 01_custodian_name_modular.rdf # RDF/XML
│ ├── 01_custodian_name_modular.n3 # N3
│ ├── 01_custodian_name_modular.trig # TriG
│ └── 01_custodian_name_modular.trix # TriX
│
└── uml/
├── mermaid/
│ └── 01_custodian_name_modular.mmd # Mermaid class diagram
└── plantuml/
└── 01_custodian_name_modular.puml # PlantUML class diagram
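A quick way to confirm that a checkout contains every file in the tree above is to test each expected path. This is a minimal sketch using only the standard library; `missing_outputs` is a hypothetical helper, and the demo builds a temporary directory rather than touching the real `schemas/20251121/` folder.

```python
import tempfile
from pathlib import Path

BASENAME = "01_custodian_name_modular"

# Paths relative to schemas/20251121/, taken from the tree above.
EXPECTED = [
    f"linkml/{BASENAME}.yaml",
    f"rdf/{BASENAME}.owl.ttl",
    f"rdf/{BASENAME}.nt",
    f"rdf/{BASENAME}.jsonld",
    f"rdf/{BASENAME}.rdf",
    f"rdf/{BASENAME}.n3",
    f"rdf/{BASENAME}.trig",
    f"rdf/{BASENAME}.trix",
    f"uml/mermaid/{BASENAME}.mmd",
    f"uml/plantuml/{BASENAME}.puml",
]


def missing_outputs(root):
    """Return the expected relative paths that are not regular files under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]


# Demo: create every file except the last one, then report what is missing.
with tempfile.TemporaryDirectory() as tmp:
    for rel in EXPECTED[:-1]:
        target = Path(tmp) / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    missing = missing_outputs(tmp)

print(missing)
```

Against the repository, the call would be `missing_outputs("schemas/20251121")`, and an empty list means all ten files are present.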
Next Steps
Immediate
- ✅ RDF generation - COMPLETE
- ✅ UML generation - COMPLETE
- ✅ Validation - COMPLETE
- ⏳ Load into triplestore - TODO (optional)
- ⏳ Render PlantUML diagram - TODO (optional)
Short-term
- ⏳ Create SPARQL queries - TODO (example queries provided above)
- ⏳ Generate documentation - TODO (using gen-doc)
- ⏳ Create example instances - TODO (validate against RDF schema)
Medium-term
- ⏳ Publish to ontology registry - TODO (LOV, BioPortal, etc.)
- ⏳ Create persistent URIs - TODO (w3id.org or purl.org)
- ⏳ Deploy SPARQL endpoint - TODO (public query interface)
Tools Used
| Tool | Version | Purpose |
|---|---|---|
| gen-owl | linkml 1.9.5 | Generate OWL from LinkML |
| rdfpipe | rdflib (Python) | Convert RDF formats |
| rdflib | Python package | Validate RDF syntax |
| Manual authoring | - | Create UML diagrams |
Troubleshooting
Issue: gen-owl warnings in output
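Problem: gen-owl emits warnings on stderr; when stderr is captured together with stdout, the warnings end up in the Turtle file and corrupt it
Solution: Keep stderr out of the output file, for example by redirecting it to /dev/null: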
gen-owl -f ttl schema.yaml 2>/dev/null > output.ttl
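Sending stderr to /dev/null also hides genuine errors. An alternative, sketched here with Python's `subprocess` module, writes stdout to the output file while capturing stderr separately so warnings can still be inspected. `run_generator` is a hypothetical helper, and the demo uses a small Python one-liner as a stand-in for gen-owl so that it runs anywhere.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_generator(cmd, out_path):
    """Run cmd, write its stdout to out_path, and return its stderr text."""
    with open(out_path, "w", encoding="utf-8") as out:
        result = subprocess.run(
            cmd,
            stdout=out,              # only stdout reaches the output file
            stderr=subprocess.PIPE,  # warnings are kept apart, not discarded
            text=True,
            check=True,              # raise if the command exits non-zero
        )
    return result.stderr


# Stand-in command: prints one line to stdout and one warning to stderr.
fake_cmd = [
    sys.executable,
    "-c",
    "import sys; print('# turtle output'); print('WARNING: example', file=sys.stderr)",
]

with tempfile.TemporaryDirectory() as tmp:
    out_file = Path(tmp) / "output.ttl"
    warnings_text = run_generator(fake_cmd, out_file)
    content = out_file.read_text(encoding="utf-8")

print("file:", content.strip())
print("stderr:", warnings_text.strip())
```

For the real generator, the command list would be `["gen-owl", "-f", "ttl", "schema.yaml"]`, matching the shell invocation above.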
Issue: gen-plantuml/gen-yuml fail with modular schema
Problem: LinkML generators don't support modular imports properly
Solution: Manually author UML diagrams based on schema structure
Issue: rdfpipe parsing errors
Problem: Turtle file contains non-RDF content (warnings)
Solution: Regenerate Turtle cleanly with stderr suppressed
Version Control
Generated from:
- Schema: schemas/20251121/linkml/01_custodian_name_modular.yaml
- Version: 0.1.0 (schema version in LinkML)
- Legal Entity Model: v0.2.2 (project version)
- Generation Date: 2025-11-22
Git Status:
- All generated files should be committed to version control
- RDF files are derived but worth tracking (transparency)
- UML diagrams should be committed (manual authoring)
References
- LinkML Documentation: https://linkml.io/
- RDF 1.1 Primer: https://www.w3.org/TR/rdf11-primer/
- OWL 2 Primer: https://www.w3.org/TR/owl2-primer/
- SPARQL 1.1 Query: https://www.w3.org/TR/sparql11-query/
- Mermaid Docs: https://mermaid.js.org/
- PlantUML Docs: https://plantuml.com/class-diagram
Status: ✅ ALL GENERATION COMPLETE
Next Session: Data instance creation and validation