Schema Generation Rules for AI Agents
Date: 2025-11-22
Purpose: Standard rules for generating derived artifacts from LinkML schemas
Rule 1: Always Use Full Timestamps in Generated File Names
MANDATORY: When generating derived artifacts (RDF, UML, etc.) from LinkML schemas, ALWAYS include a full timestamp (date AND time) in the filename.
Format
{base_name}_{YYYYMMDD}_{HHMMSS}.{extension}
Examples
# ✅ CORRECT - Full timestamp (date + time)
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
gen-yuml schemas/linkml/schema.yaml > schemas/uml/mermaid/schema_${TIMESTAMP}.mmd
gen-owl -f ttl schemas/linkml/schema.yaml > schemas/rdf/schema_${TIMESTAMP}.owl.ttl
# Examples of correct filenames:
custodian_multi_aspect_20251122_154136.mmd
custodian_multi_aspect_20251122_154430.owl.ttl
custodian_multi_aspect_20251122_154430.nt
custodian_multi_aspect_20251122_154430.jsonld
custodian_multi_aspect_20251122_154430.rdf
# ❌ WRONG - No timestamp
schema.mmd
01_custodian_name.owl.ttl
# ❌ WRONG - Date only (MISSING TIME!)
schema_20251122.mmd
custodian_multi_aspect_20251122.owl.ttl
# ❌ WRONG - Time only (missing date)
schema_154430.mmd
Rationale
- Version tracking: Full timestamps enable precise version identification
- No overwrites: Multiple generations on the same day (for example, several team sessions) never conflict, so previous versions are never lost
- Debugging: Can identify the exact time when changes were made
- Rollback: Easy to revert to specific versions
- Audit trail: Documents schema evolution with chronological precision
- Git-friendly: Easy to diff between versions
- Reproducibility: Can correlate generated artifacts with git commits
Critical Note
The timestamp must include BOTH date and time (YYYYMMDD_HHMMSS), not just date. This allows multiple generation runs per day without filename conflicts.
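A naming check along the following lines could catch date-only or missing timestamps before a file is written. This is a minimal sketch: the function name `has_full_timestamp` is hypothetical, and it only checks the shape of the name (8 digits, underscore, 6 digits, then an extension), not whether the digits form a real date.

```shell
# has_full_timestamp FILENAME
# Succeeds only if FILENAME ends in _YYYYMMDD_HHMMSS.<extension>.
# Hypothetical helper, not part of LinkML or any project script.
has_full_timestamp() {
  printf '%s\n' "$1" | grep -Eq '_[0-9]{8}_[0-9]{6}\.[A-Za-z0-9.]+$'
}

# Usage: report every name passed on the command line that breaks the rule.
for name in "$@"; do
  has_full_timestamp "$name" || echo "Missing full timestamp: $name" >&2
done
```

Multi-part extensions such as `.owl.ttl` are accepted because the extension pattern allows dots.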
Rule 2: LinkML is the Single Source of Truth
NEVER manually create or edit derived files. Always generate from LinkML.
Correct Workflow ✅
1. Edit LinkML schema (.yaml)
2. Generate RDF formats (gen-owl + rdfpipe)
3. Generate UML diagrams (gen-yuml)
4. Generate TypeDB schema (manual translation, but documented)
5. Validate examples (linkml-validate)
Incorrect Workflow ❌
❌ Editing .ttl files directly
❌ Creating .jsonld manually
❌ Drawing UML diagrams by hand
❌ Modifying TypeDB schema without updating LinkML
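One way to enforce this rule mechanically is a pre-commit style check that flags derived files changed without a matching LinkML change. The sketch below is a heuristic under stated assumptions: the function name `check_single_source` is hypothetical, schemas are assumed to live under a `linkml/` directory, and a legitimate regeneration with no schema edit (for example after a tool upgrade) would also be flagged.

```shell
# check_single_source
# Reads changed paths on stdin, one per line.
# Returns 1 when derived artifacts changed but no LinkML schema did.
check_single_source() {
  changed=$(cat)
  # Derived artifacts, recognized by extension (assumed list).
  derived=$(printf '%s\n' "$changed" | grep -E '\.(owl\.ttl|nt|jsonld|rdf|mmd)$' || true)
  # LinkML sources, assumed to sit somewhere under a linkml/ directory.
  schema=$(printf '%s\n' "$changed" | grep -E '(^|/)linkml/.*\.yaml$' || true)
  if [ -n "$derived" ] && [ -z "$schema" ]; then
    echo "Derived files changed without a LinkML change:" >&2
    printf '%s\n' "$derived" >&2
    return 1
  fi
  return 0
}
```

It could be fed from git, for example `git diff --cached --name-only | check_single_source`.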
Rule 3: Generate All RDF Serialization Formats
When generating RDF from LinkML, produce all standard serialization formats:
Required Formats
- OWL/Turtle (.owl.ttl) - Primary, human-readable
- N-Triples (.nt) - Simple, line-based
- JSON-LD (.jsonld) - Web-friendly, JSON-based
- RDF/XML (.rdf) - XML-based, traditional
Generation Commands
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BASE_NAME="schema_${TIMESTAMP}"
# 1. Generate OWL/Turtle (primary)
gen-owl -f ttl schemas/linkml/schema.yaml > schemas/rdf/${BASE_NAME}.owl.ttl
# 2. Convert to other formats using rdfpipe
rdfpipe --input-format turtle --output-format nt schemas/rdf/${BASE_NAME}.owl.ttl > schemas/rdf/${BASE_NAME}.nt
rdfpipe --input-format turtle --output-format json-ld schemas/rdf/${BASE_NAME}.owl.ttl > schemas/rdf/${BASE_NAME}.jsonld
rdfpipe --input-format turtle --output-format xml schemas/rdf/${BASE_NAME}.owl.ttl > schemas/rdf/${BASE_NAME}.rdf
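After a generation run, a quick completeness check can confirm that all four required serializations exist and are non-empty. This is a sketch only; `missing_formats` is a hypothetical helper name, and the extension list simply mirrors the required formats above.

```shell
# missing_formats BASE
# BASE is the path without extension, e.g. schemas/rdf/schema_20251122_154430.
# Prints one extension per line for every required file that is absent or empty.
missing_formats() {
  base="$1"
  for ext in owl.ttl nt jsonld rdf; do
    [ -s "${base}.${ext}" ] || echo "${ext}"
  done
}
```

An empty result means the set is complete, so a caller might write `[ -z "$(missing_formats "schemas/rdf/${BASE_NAME}")" ]`.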
Rule 4: Validate Before Committing
Before committing schema changes, ALWAYS:
- Validate LinkML schema:
gen-owl -f ttl schemas/linkml/schema.yaml > /tmp/test_validation.ttl
# Check for errors in output
- Validate example instances:
linkml-validate -s schemas/linkml/schema.yaml schemas/examples/instance.yaml
- Check RDF triples count:
wc -l schemas/rdf/*.nt # N-Triples are easy to count
- Verify class presence:
grep -c "ClassName" schemas/rdf/*.owl.ttl
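Counting lines with `wc -l` works because N-Triples puts one triple on each line, but it also counts blank lines and `#` comment lines if a file contains any. A slightly stricter count might look like the sketch below; `count_triples` is a hypothetical helper name.

```shell
# count_triples FILE
# Prints the number of lines in an N-Triples file that are neither
# blank nor comment-only.
count_triples() {
  # grep -c prints the count but exits 1 when the count is 0,
  # so that status is masked to keep `set -e` scripts running.
  grep -cvE '^[[:space:]]*(#.*)?$' "$1" || true
}
```

Comparing the count before and after a schema change gives a rough signal of how much the generated ontology grew or shrank.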
Rule 5: Document Schema Changes
Every schema change requires:
- Quick status document: QUICK_STATUS_{TOPIC}_{YYYYMMDD}.md
- Session summary: SESSION_SUMMARY_{YYYYMMDD}_{TOPIC}.md
- Updated examples: Add/update instance files demonstrating changes
- Commit message: Reference quick status document
Template: Quick Status Document
# Quick Status: {Topic}
Date: YYYY-MM-DD
Status: ✅ COMPLETE / ⏳ IN PROGRESS
Priority: HIGH / MEDIUM / LOW
## What We Did
...
## Key Changes
...
## Files Modified
...
## Validation Results
...
## Next Steps
...
Rule 6: Example Instances Are Required
For every new class or major schema change:
- Create at least ONE complete example instance
- Place in schemas/{version}/examples/
- Use descriptive filenames: {class_name}_{use_case}_{timestamp}.yaml
- Include all required slots and at least 2-3 optional slots
- Add inline comments explaining non-obvious fields
Example Instance Template
---
# Complete Example: {ClassName}
# Date: YYYY-MM-DD
# Use Case: {Description}
# Status: Valid instance conforming to schema version {X.Y.Z}
instances:
  - id: https://example.org/id
    required_field_1: "value"
    required_field_2: "value"
    optional_field: "value" # Explanation of when to use this field
    # ... more fields
Rule 7: UML Diagram Conventions
When generating UML diagrams:
File Naming
{schema_name}_{diagram_type}_{YYYYMMDD}_{HHMMSS}.mmd
Examples:
- custodian_class_diagram_20251122_154136.mmd
- prov_flow_sequence_20251122_154200.mmd
Diagram Types
- class_diagram - Class hierarchies and relationships
- sequence - PROV-O temporal flows
- state - State transitions (e.g., organizational change events)
- er - Entity-relationship (database perspective)
Storage Location
schemas/{version}/uml/mermaid/{timestamp_files}.mmd
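The naming and storage conventions above can be combined into one path builder that rejects unknown diagram types. This is a minimal sketch; `diagram_path` and its argument order are assumptions made for illustration.

```shell
# diagram_path VERSION SCHEMA_NAME DIAGRAM_TYPE [TIMESTAMP]
# Prints the target path for a Mermaid diagram, or fails on an unknown type.
# TIMESTAMP defaults to the current time in YYYYMMDD_HHMMSS form.
diagram_path() {
  version="$1"
  schema="$2"
  dtype="$3"
  ts="${4:-$(date +%Y%m%d_%H%M%S)}"
  case "$dtype" in
    class_diagram|sequence|state|er) ;;
    *) echo "Unknown diagram type: $dtype" >&2; return 1 ;;
  esac
  printf 'schemas/%s/uml/mermaid/%s_%s_%s.mmd\n' "$version" "$schema" "$dtype" "$ts"
}
```

For example, `diagram_path 20251121 custodian class_diagram 20251122_154136` prints `schemas/20251121/uml/mermaid/custodian_class_diagram_20251122_154136.mmd`.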
Rule 8: TypeDB Schema Updates
TypeDB schemas are manually translated from LinkML (not auto-generated).
Required Steps
- Update LinkML schema first
- Regenerate RDF to verify OWL alignment
- Manually update TypeDB schema (.tql)
- Document translation decisions
- Test TypeDB queries
Translation Documentation
Create TYPEDB_TRANSLATION_NOTES.md documenting:
- LinkML class → TypeDB entity/relation mapping
- Slot → attribute mapping
- Constraints and rules
- Query examples
Rule 9: Version Control for Generated Files
What to Commit
✅ DO commit:
- LinkML schema files (.yaml)
- Example instances (.yaml)
- Documentation (.md)
- Latest timestamped RDF (keep last 3 versions)
- Latest timestamped UML (keep last 3 versions)
❌ DO NOT commit:
- Temporary validation files (/tmp/*)
- Old versions (>3 generations old)
- Duplicate non-timestamped files
Cleanup Script
# Keep only last 3 timestamped versions of each schema
cd schemas/rdf
ls -t schema_*.owl.ttl | tail -n +4 | xargs rm -f
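The one-liner above only prunes `.owl.ttl` files and relies on file modification times. Because `YYYYMMDD_HHMMSS` timestamps sort chronologically as plain strings, the same cleanup can be driven by the filenames themselves and repeated for each extension. The sketch below assumes a hypothetical helper named `prune_old_versions` and filenames without newlines.

```shell
# prune_old_versions DIR PREFIX EXT [KEEP]
# Deletes all but the KEEP newest (default 3) files named
# DIR/PREFIX_<timestamp>.EXT, judging age by the timestamp in the name.
prune_old_versions() {
  dir="$1"
  prefix="$2"
  ext="$3"
  keep="${4:-3}"
  for f in "$dir"/"$prefix"_*."$ext"; do
    # Skip the literal pattern when nothing matches the glob.
    [ -e "$f" ] && printf '%s\n' "$f"
  done | sort -r | tail -n +"$((keep + 1))" | while IFS= read -r old; do
    rm -f -- "$old"
  done
}

# Usage: apply the same retention to every RDF serialization.
# for ext in owl.ttl nt jsonld rdf; do
#   prune_old_versions schemas/rdf schema "$ext" 3
# done
```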
Rule 10: Generation Workflow Template
Standard workflow for schema changes:
#!/bin/bash
# Schema Generation Workflow
# Usage: ./generate_schema_artifacts.sh
set -e # Exit on error
SCHEMA_FILE="schemas/20251121/linkml/01_custodian_name_modular.yaml"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BASE_NAME="custodian_${TIMESTAMP}"
echo "=== Schema Generation Workflow ==="
echo "Timestamp: $TIMESTAMP"
echo ""
# Step 1: Validate LinkML
echo "Step 1: Validating LinkML schema..."
gen-owl -f ttl "$SCHEMA_FILE" > /tmp/validation_test.ttl 2>&1
echo "✅ Schema valid"
# Step 2: Generate RDF formats
echo "Step 2: Generating RDF formats..."
gen-owl -f ttl "$SCHEMA_FILE" > "schemas/20251121/rdf/${BASE_NAME}.owl.ttl"
rdfpipe --input-format turtle --output-format nt "schemas/20251121/rdf/${BASE_NAME}.owl.ttl" > "schemas/20251121/rdf/${BASE_NAME}.nt"
rdfpipe --input-format turtle --output-format json-ld "schemas/20251121/rdf/${BASE_NAME}.owl.ttl" > "schemas/20251121/rdf/${BASE_NAME}.jsonld"
rdfpipe --input-format turtle --output-format xml "schemas/20251121/rdf/${BASE_NAME}.owl.ttl" > "schemas/20251121/rdf/${BASE_NAME}.rdf"
echo "✅ RDF formats generated"
# Step 3: Generate UML
echo "Step 3: Generating UML diagrams..."
gen-yuml "$SCHEMA_FILE" > "schemas/20251121/uml/mermaid/${BASE_NAME}.mmd"
echo "✅ UML diagram generated"
# Step 4: Validate examples
echo "Step 4: Validating example instances..."
for example in schemas/20251121/examples/*.yaml; do
  linkml-validate -s "$SCHEMA_FILE" "$example" || echo "⚠️ Warning: $example failed validation"
done
echo "✅ Examples validated"
# Step 5: Report
echo ""
echo "=== Generation Complete ==="
ls -lh "schemas/20251121/rdf/${BASE_NAME}".* | awk '{print $9, "("$5")"}'
ls -lh "schemas/20251121/uml/mermaid/${BASE_NAME}.mmd" | awk '{print $9, "("$5")"}'
echo ""
echo "Next: Update documentation and commit"
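Step 4 in the workflow prints a warning for each failing example but still reports success afterwards. Where a failing example should stop the run, a stricter variant could count failures and return a non-zero status. In this sketch the validator command is passed in as the first argument so the function can be exercised without LinkML installed; `validate_examples` is a hypothetical name.

```shell
# validate_examples VALIDATOR FILE...
# Runs VALIDATOR on each FILE and returns non-zero if any run fails.
validate_examples() {
  validator="$1"
  shift
  failed=0
  for example in "$@"; do
    # $validator is left unquoted on purpose so it can carry arguments,
    # e.g. 'linkml-validate -s schema.yaml'.
    if ! $validator "$example" >/dev/null 2>&1; then
      echo "FAILED: $example" >&2
      failed=$((failed + 1))
    fi
  done
  echo "$failed example(s) failed validation"
  [ "$failed" -eq 0 ]
}
```

In the workflow script this might be called as `validate_examples "linkml-validate -s $SCHEMA_FILE" schemas/20251121/examples/*.yaml`, which under `set -e` would abort the run on the first batch containing a failure.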
Quick Reference Commands
Generate All Artifacts
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
gen-owl -f ttl schema.yaml > schema_${TIMESTAMP}.owl.ttl
gen-yuml schema.yaml > schema_${TIMESTAMP}.mmd
Validate
gen-owl -f ttl schema.yaml > /tmp/test.ttl # Check for errors
linkml-validate -s schema.yaml instance.yaml
Convert RDF Formats
rdfpipe -i turtle -o nt file.ttl > file.nt
rdfpipe -i turtle -o json-ld file.ttl > file.jsonld
rdfpipe -i turtle -o xml file.ttl > file.rdf
Check RDF Content
grep -c "ClassName" file.owl.ttl # Count class references
wc -l file.nt # Count triples
Status: ✅ ACTIVE RULES
Version: 1.0
Last Updated: 2025-11-22
Applies To: All LinkML schema work in this project
See Also:
- .opencode/HYPER_MODULAR_STRUCTURE.md - Module organization
- .opencode/SLOT_NAMING_CONVENTIONS.md - Slot naming patterns
- AGENTS.md - AI agent instructions