# Schema Generation Rules for AI Agents

**Date**: 2025-11-22
**Purpose**: Standard rules for generating derived artifacts from LinkML schemas

---

## Rule 1: Always Use Full Timestamps in Generated File Names

**MANDATORY**: When generating derived artifacts (RDF, UML, etc.) from LinkML schemas, **ALWAYS** include a full timestamp (date AND time) in the filename.

### Format

```
{base_name}_{YYYYMMDD}_{HHMMSS}.{extension}
```
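
The format above can be wrapped in a small shell function so that every artifact of one run shares the same stamp. This is a minimal sketch: `timestamped_name` is a hypothetical helper (not part of LinkML), and the optional third argument is an assumption added so one timestamp can be reused across files.

```bash
# Build a name of the form {base_name}_{YYYYMMDD}_{HHMMSS}.{extension}.
# `timestamped_name` is a hypothetical helper, not a LinkML command.
timestamped_name() {
    local base_name="$1" extension="$2"
    # Optional third argument: reuse a timestamp captured earlier in the run.
    local timestamp="${3:-$(date +%Y%m%d_%H%M%S)}"
    printf '%s_%s.%s\n' "$base_name" "$timestamp" "$extension"
}

# Capture the timestamp once so all files from this run match.
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
timestamped_name custodian_multi_aspect owl.ttl "$TIMESTAMP"
timestamped_name custodian_multi_aspect nt "$TIMESTAMP"
```

Capturing `TIMESTAMP` once, rather than calling `date` per file, keeps the serializations of a single generation run grouped under one stamp.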

### Examples

```bash
# ✅ CORRECT - Full timestamp (date + time)
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
gen-yuml schemas/linkml/schema.yaml > schemas/uml/mermaid/schema_${TIMESTAMP}.mmd
gen-owl -f ttl schemas/linkml/schema.yaml > schemas/rdf/schema_${TIMESTAMP}.owl.ttl

# Examples of correct filenames:
custodian_multi_aspect_20251122_154136.mmd
custodian_multi_aspect_20251122_154430.owl.ttl
custodian_multi_aspect_20251122_154430.nt
custodian_multi_aspect_20251122_154430.jsonld
custodian_multi_aspect_20251122_154430.rdf

# ❌ WRONG - No timestamp
schema.mmd
01_custodian_name.owl.ttl

# ❌ WRONG - Date only (MISSING TIME!)
schema_20251122.mmd
custodian_multi_aspect_20251122.owl.ttl

# ❌ WRONG - Time only (missing date)
schema_154430.mmd
```
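
The correct and wrong names above can be told apart mechanically. The following is a sketch only: `has_full_timestamp` is a hypothetical helper, and its pattern checks the shape of the stamp (8 digits, underscore, 6 digits, then the extension), not whether the digits form a real date or time.

```bash
# Succeed only when the name carries a full _YYYYMMDD_HHMMSS stamp
# immediately before the extension. Hypothetical helper for illustration.
has_full_timestamp() {
    printf '%s\n' "$1" | grep -Eq '_[0-9]{8}_[0-9]{6}\.[A-Za-z0-9.]+$'
}

for name in \
    custodian_multi_aspect_20251122_154430.owl.ttl \
    schema_20251122.mmd \
    schema_154430.mmd \
    schema.mmd
do
    if has_full_timestamp "$name"; then
        echo "OK      $name"
    else
        echo "REJECT  $name"
    fi
done
```

Only the first name in the loop passes; the date-only, time-only, and unstamped names are rejected.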

### Rationale

1. **Version tracking**: Full timestamps enable precise version identification
2. **No overwrites**: Multiple generations on the same day (common when teams run several sessions daily) don't conflict, so previous versions are never lost
3. **Debugging**: Can identify the exact time when changes were made
4. **Rollback**: Easy to revert to specific versions
5. **Audit trail**: Documents schema evolution with chronological precision
6. **Git-friendly**: Easy to diff between versions
7. **Reproducibility**: Can correlate generated artifacts with git commits

### Critical Note

The timestamp must include BOTH date and time (YYYYMMDD_HHMMSS), not just the date. This allows multiple generation runs per day without filename conflicts.

---

## Rule 2: LinkML is the Single Source of Truth

**NEVER** manually create or edit derived files. Always generate from LinkML.

### Correct Workflow ✅

```
1. Edit LinkML schema (.yaml)
2. Generate RDF formats (gen-owl + rdfpipe)
3. Generate UML diagrams (gen-yuml)
4. Generate TypeDB schema (manual translation, but documented)
5. Validate examples (linkml-validate)
```

### Incorrect Workflow ❌

```
❌ Editing .ttl files directly
❌ Creating .jsonld manually
❌ Drawing UML diagrams by hand
❌ Modifying TypeDB schema without updating LinkML
```

---

## Rule 3: Generate All RDF Serialization Formats

When generating RDF from LinkML, produce all standard serialization formats:

### Required Formats

1. **OWL/Turtle** (.owl.ttl) - Primary, human-readable
2. **N-Triples** (.nt) - Simple, line-based
3. **JSON-LD** (.jsonld) - Web-friendly, JSON-based
4. **RDF/XML** (.rdf) - XML-based, traditional

### Generation Commands

```bash
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BASE_NAME="schema_${TIMESTAMP}"

# 1. Generate OWL/Turtle (primary)
gen-owl -f ttl schemas/linkml/schema.yaml > schemas/rdf/${BASE_NAME}.owl.ttl

# 2. Convert to other formats using rdfpipe
rdfpipe --input-format turtle --output-format nt schemas/rdf/${BASE_NAME}.owl.ttl > schemas/rdf/${BASE_NAME}.nt
rdfpipe --input-format turtle --output-format json-ld schemas/rdf/${BASE_NAME}.owl.ttl > schemas/rdf/${BASE_NAME}.jsonld
rdfpipe --input-format turtle --output-format xml schemas/rdf/${BASE_NAME}.owl.ttl > schemas/rdf/${BASE_NAME}.rdf
```
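
Because shell redirection creates the output file even when a conversion fails, it may help to confirm afterwards that all four required formats exist and are non-empty. A minimal sketch, assuming a hypothetical helper `check_rdf_formats` that takes the path prefix without extension, for example `check_rdf_formats "schemas/rdf/${BASE_NAME}"`:

```bash
# Report any of the four required serializations that is missing or empty.
# `check_rdf_formats` is a hypothetical helper; the extensions are the ones
# listed under "Required Formats".
check_rdf_formats() {
    local prefix="$1"   # e.g. schemas/rdf/schema_20251122_154430
    local missing=0 ext
    for ext in owl.ttl nt jsonld rdf; do
        # -s is true only for files that exist and have a size above zero.
        if [ ! -s "${prefix}.${ext}" ]; then
            echo "missing or empty: ${prefix}.${ext}" >&2
            missing=1
        fi
    done
    return "$missing"
}
```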

---

## Rule 4: Validate Before Committing

Before committing schema changes, **ALWAYS**:

1. **Validate LinkML schema**:
   ```bash
   gen-owl -f ttl schemas/linkml/schema.yaml > /tmp/test_validation.ttl
   # Check for errors in output
   ```

2. **Validate example instances**:
   ```bash
   linkml-validate -s schemas/linkml/schema.yaml schemas/examples/instance.yaml
   ```

3. **Check RDF triples count**:
   ```bash
   wc -l schemas/rdf/*.nt  # N-Triples are easy to count
   ```

4. **Verify class presence**:
   ```bash
   grep -c "ClassName" schemas/rdf/*.owl.ttl
   ```
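
Steps 3 and 4 can be made slightly stricter. `wc -l` also counts blank lines and comments, and a plain `grep` for a class name also matches longer names that contain it. The sketch below uses two hypothetical helpers, `count_triples` and `class_present`; it assumes one triple per line, which holds for N-Triples, and a `grep` that supports `-w` (GNU and BSD do).

```bash
# Count triples in an N-Triples file, skipping blank lines and # comments.
count_triples() {
    # grep -c exits non-zero when the count is 0, so neutralize that status.
    grep -Ecv '^[[:space:]]*(#|$)' "$1" || true
}

# Succeed only if the file mentions the class name as a whole word,
# so "Custodian" does not match "CustodianName".
class_present() {
    local file="$1" class_name="$2"
    grep -qwF -- "$class_name" "$file"
}
```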

---

## Rule 5: Document Schema Changes

Every schema change requires:

1. **Quick status document**: `QUICK_STATUS_{TOPIC}_{YYYYMMDD}.md`
2. **Session summary**: `SESSION_SUMMARY_{YYYYMMDD}_{TOPIC}.md`
3. **Updated examples**: Add/update instance files demonstrating changes
4. **Commit message**: Reference quick status document

### Template: Quick Status Document

```markdown
# Quick Status: {Topic}
Date: YYYY-MM-DD
Status: ✅ COMPLETE / ⏳ IN PROGRESS
Priority: HIGH / MEDIUM / LOW

## What We Did
...

## Key Changes
...

## Files Modified
...

## Validation Results
...

## Next Steps
...
```
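
The two file-name patterns at the top of this rule place the topic and date in opposite orders, which is easy to get wrong by hand. A small sketch that builds both names for a topic follows; `doc_names` is a hypothetical helper, and upper-casing the topic and replacing spaces with underscores is an assumption, since the rule only gives the patterns.

```bash
# Print the quick-status and session-summary file names for a topic.
# Hypothetical helper; the topic normalization is an assumption.
doc_names() {
    local topic day
    topic=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr ' ' '_')
    day="${2:-$(date +%Y%m%d)}"   # optional second argument: YYYYMMDD
    printf 'QUICK_STATUS_%s_%s.md\n' "$topic" "$day"
    printf 'SESSION_SUMMARY_%s_%s.md\n' "$day" "$topic"
}
```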

---

## Rule 6: Example Instances Are Required

For every new class or major schema change:

1. Create at least ONE complete example instance
2. Place in `schemas/{version}/examples/`
3. Use descriptive filenames: `{class_name}_{use_case}_{timestamp}.yaml`
4. Include all required slots and at least 2-3 optional slots
5. Add inline comments explaining non-obvious fields

### Example Instance Template

```yaml
---
# Complete Example: {ClassName}
# Date: YYYY-MM-DD
# Use Case: {Description}
# Status: Valid instance conforming to schema version {X.Y.Z}

instances:
  - id: https://example.org/id
    required_field_1: "value"
    required_field_2: "value"
    optional_field: "value"  # Explanation of when to use this field
    # ... more fields
```

---

## Rule 7: UML Diagram Conventions

When generating UML diagrams:

### File Naming

```
{schema_name}_{diagram_type}_{YYYYMMDD}_{HHMMSS}.mmd
```

Examples:
- `custodian_class_diagram_20251122_154136.mmd`
- `prov_flow_sequence_20251122_154200.mmd`

### Diagram Types

- `class_diagram` - Class hierarchies and relationships
- `sequence` - PROV-O temporal flows
- `state` - State transitions (e.g., organizational change events)
- `er` - Entity-relationship (database perspective)

### Storage Location

```
schemas/{version}/uml/mermaid/{timestamp_files}.mmd
```
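
The naming pattern and the four diagram types above can be checked together with one regular expression. This is a sketch under stated assumptions: `valid_uml_name` is a hypothetical helper, and it assumes schema names use only lowercase letters, digits, and underscores, as in the two examples.

```bash
# Check a Mermaid file name against
#   {schema_name}_{diagram_type}_{YYYYMMDD}_{HHMMSS}.mmd
# where diagram_type is class_diagram, sequence, state, or er.
# Hypothetical helper; lowercase schema names are an assumption.
valid_uml_name() {
    printf '%s\n' "$1" |
        grep -Eq '^[a-z0-9_]+_(class_diagram|sequence|state|er)_[0-9]{8}_[0-9]{6}\.mmd$'
}
```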

---

## Rule 8: TypeDB Schema Updates

TypeDB schemas are **manually translated** from LinkML (not auto-generated).

### Required Steps

1. Update LinkML schema first
2. Regenerate RDF to verify OWL alignment
3. Manually update TypeDB schema (.tql)
4. Document translation decisions
5. Test TypeDB queries

### Translation Documentation

Create `TYPEDB_TRANSLATION_NOTES.md` documenting:
- LinkML class → TypeDB entity/relation mapping
- Slot → attribute mapping
- Constraints and rules
- Query examples

---

## Rule 9: Version Control for Generated Files

### What to Commit

✅ **DO commit**:
- LinkML schema files (.yaml)
- Example instances (.yaml)
- Documentation (.md)
- Latest timestamped RDF (keep last 3 versions)
- Latest timestamped UML (keep last 3 versions)

❌ **DO NOT commit**:
- Temporary validation files (/tmp/*)
- Old versions (>3 generations old)
- Duplicate non-timestamped files

### Cleanup Script

```bash
# Keep only last 3 timestamped versions of each schema
cd schemas/rdf
ls -t schema_*.owl.ttl | tail -n +4 | xargs rm -f
```
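
The script above orders files by modification time, which a fresh checkout or a copy can reset. Since the `YYYYMMDD_HHMMSS` stamp sorts chronologically as plain text, ordering by name is an alternative. The sketch below is one way to do that: `prune_old` is a hypothetical helper that only lists the files it would remove unless `delete` is passed as the fourth argument, and it assumes file names contain no whitespace. For example, `prune_old schemas/rdf 'schema_*.owl.ttl' 3` previews the files older than the newest three.

```bash
# List (or delete) all but the newest $keep files matching a glob, ordered by
# the timestamp embedded in the file name rather than by modification time.
# Hypothetical helper; assumes names without whitespace.
prune_old() {
    local dir="$1" pattern="$2" keep="${3:-3}" mode="${4:-list}"
    # Reverse name order puts the newest stamp first; tail skips the first $keep.
    ( cd "$dir" && ls -1 $pattern 2>/dev/null | sort -r | tail -n +"$((keep + 1))" ) |
    while IFS= read -r old; do
        if [ "$mode" = "delete" ]; then
            rm -f -- "$dir/$old"
        else
            echo "$old"    # dry run: only report what would be removed
        fi
    done
}
```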

---

## Rule 10: Generation Workflow Template

Standard workflow for schema changes:

```bash
#!/bin/bash
# Schema Generation Workflow
# Usage: ./generate_schema_artifacts.sh

set -e  # Exit on error

SCHEMA_FILE="schemas/20251121/linkml/01_custodian_name_modular.yaml"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BASE_NAME="custodian_${TIMESTAMP}"

echo "=== Schema Generation Workflow ==="
echo "Timestamp: $TIMESTAMP"
echo ""

# Step 1: Validate LinkML
echo "Step 1: Validating LinkML schema..."
gen-owl -f ttl "$SCHEMA_FILE" > /tmp/validation_test.ttl 2>&1
echo "✅ Schema valid"

# Step 2: Generate RDF formats
echo "Step 2: Generating RDF formats..."
gen-owl -f ttl "$SCHEMA_FILE" > "schemas/20251121/rdf/${BASE_NAME}.owl.ttl"
rdfpipe --input-format turtle --output-format nt "schemas/20251121/rdf/${BASE_NAME}.owl.ttl" > "schemas/20251121/rdf/${BASE_NAME}.nt"
rdfpipe --input-format turtle --output-format json-ld "schemas/20251121/rdf/${BASE_NAME}.owl.ttl" > "schemas/20251121/rdf/${BASE_NAME}.jsonld"
rdfpipe --input-format turtle --output-format xml "schemas/20251121/rdf/${BASE_NAME}.owl.ttl" > "schemas/20251121/rdf/${BASE_NAME}.rdf"
echo "✅ RDF formats generated"

# Step 3: Generate UML
echo "Step 3: Generating UML diagrams..."
gen-yuml "$SCHEMA_FILE" > "schemas/20251121/uml/mermaid/${BASE_NAME}.mmd"
echo "✅ UML diagram generated"

# Step 4: Validate examples
echo "Step 4: Validating example instances..."
for example in schemas/20251121/examples/*.yaml; do
  linkml-validate -s "$SCHEMA_FILE" "$example" || echo "⚠️  Warning: $example failed validation"
done
echo "✅ Examples validated"

# Step 5: Report
echo ""
echo "=== Generation Complete ==="
ls -lh "schemas/20251121/rdf/${BASE_NAME}".* | awk '{print $9, "("$5")"}'
ls -lh "schemas/20251121/uml/mermaid/${BASE_NAME}.mmd" | awk '{print $9, "("$5")"}'
echo ""
echo "Next: Update documentation and commit"
```
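
In Step 4 of the script above, a failing example only prints a warning, and "Examples validated" is reported either way. A stricter variant might count failures and return a non-zero status, so that `set -e` stops the workflow. The sketch below assumes a hypothetical helper `validate_examples`; the validator command is passed as arguments so the loop can be exercised without LinkML installed. In the workflow it could be called as `validate_examples schemas/20251121/examples linkml-validate -s "$SCHEMA_FILE"`.

```bash
# Run a validator over every .yaml example in a directory and fail if any
# example is invalid. Hypothetical helper: the first argument is the
# directory, the remaining arguments are the validator command.
validate_examples() {
    local dir="$1"; shift
    local failed=0 example
    for example in "$dir"/*.yaml; do
        [ -e "$example" ] || continue    # glob matched nothing
        if ! "$@" "$example"; then
            echo "FAILED: $example" >&2
            failed=$((failed + 1))
        fi
    done
    echo "$failed example(s) failed validation"
    [ "$failed" -eq 0 ]                  # non-zero status if anything failed
}
```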

---

## Quick Reference Commands

### Generate All Artifacts

```bash
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
gen-owl -f ttl schema.yaml > schema_${TIMESTAMP}.owl.ttl
gen-yuml schema.yaml > schema_${TIMESTAMP}.mmd
```

### Validate

```bash
gen-owl -f ttl schema.yaml > /tmp/test.ttl  # Check for errors
linkml-validate -s schema.yaml instance.yaml
```

### Convert RDF Formats

```bash
rdfpipe -i turtle -o nt file.ttl > file.nt
rdfpipe -i turtle -o json-ld file.ttl > file.jsonld
rdfpipe -i turtle -o xml file.ttl > file.rdf
```

### Check RDF Content

```bash
grep -c "ClassName" file.owl.ttl  # Count class references
wc -l file.nt  # Count triples
```

---

**Status**: ✅ ACTIVE RULES
**Version**: 1.0
**Last Updated**: 2025-11-22
**Applies To**: All LinkML schema work in this project

**See Also**:
- `.opencode/HYPER_MODULAR_STRUCTURE.md` - Module organization
- `.opencode/SLOT_NAMING_CONVENTIONS.md` - Slot naming patterns
- `AGENTS.md` - AI agent instructions