Rule: No Rough Edits in Schema Files
Identifier: no-rough-edits-in-schema
Severity: CRITICAL
Core Directive
DO NOT perform rough, imprecise, or bulk text substitutions (such as sed -i or regex-based Python scripts) on LinkML schema files (schemas/*/linkml/) without guaranteeing structural integrity.
YOU MUST:
- ✅ Use proper YAML parsers/dumpers if modifying structure programmatically.
- ✅ Manually verify edits if using text replacement.
- ✅ Ensure indentation and nesting are preserved exactly.
- ✅ Respect comments and ordering (which parsers often destroy, so careful text editing is sometimes necessary, but it must be PRECISE).
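One way to make the "manually verify" step concrete is to review a unified diff of every proposed text edit before writing it back. The following is a minimal sketch using only the standard library; `preview_edit` and the default `path` value are hypothetical names chosen for illustration, not part of any existing tooling in this repository.

```python
import difflib


def preview_edit(original, edited, path="schema.yaml"):
    """Return a unified diff of a proposed text edit for manual review."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        edited.splitlines(keepends=True),
        fromfile=f"{path} (original)",
        tofile=f"{path} (edited)",
    )
    return "".join(diff)


original = "slots:\n  some_slot:\n    range: string\n"
# A blind line filter removed only the key and left its child behind.
edited = "slots:\n    range: string\n"

# The diff shows one removed line while the nested "range" line survives,
# which is the cue to stop and fix the edit by hand.
print(preview_edit(original, edited))
```

An empty diff means the edit changed nothing; a diff that removes a key line but keeps its indented children is a sign of a rough edit.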
Rationale
LinkML schemas are highly structured YAML files where indentation and nesting semantics are critical. Rough edits often cause:
- Duplicate keys (e.g., leaving a property behind after deleting its parent key).
- Invalid indentation (breaking the parent-child relationship).
- Silent corruption (valid YAML but wrong semantics).
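The duplicate-key failure mode can be caught with a quick scan after a text edit. The sketch below is a simplified, standard-library-only check and not a YAML parser: it assumes block-style mappings indented with spaces and may misreport on flow style, multi-line scalars, or keys inside list items. The name `find_duplicate_keys` is hypothetical.

```python
import re

# Matches "  key:" or "  key: value" on a block-style mapping line.
KEY_RE = re.compile(r'^( *)([A-Za-z_][\w-]*)\s*:(\s|$)')


def find_duplicate_keys(text):
    """Return (line_number, key) pairs for keys repeated within one mapping."""
    duplicates = []
    stack = []  # (indent, keys seen at that indent) for each open mapping
    for number, line in enumerate(text.splitlines(), start=1):
        match = KEY_RE.match(line)
        if not match:
            continue
        indent, key = len(match.group(1)), match.group(2)
        # Close mappings nested deeper than the current line.
        while stack and stack[-1][0] > indent:
            stack.pop()
        # Open a new mapping when this line is nested deeper than the last one.
        if not stack or stack[-1][0] < indent:
            stack.append((indent, set()))
        seen = stack[-1][1]
        if key in seen:
            duplicates.append((number, key))
        seen.add(key)
    return duplicates


# "other_slot:" was deleted, so its "range" now collides with some_slot's.
corrupted = "slots:\n  some_slot:\n    range: string\n    range: integer\n"
print(find_duplicate_keys(corrupted))
```

A non-empty result is a reason to inspect the file by hand; an empty result does not prove the file is correct, since silent semantic corruption can still parse cleanly.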
Examples
❌ Anti-Pattern: Rough Deletion
Deleting lines containing a string without checking context:
# WRONG: Deleting lines blindly
new_lines = []
for line in lines:
    if "some_slot" in line:
        continue  # Deletes the line, but might leave children orphaned!
    new_lines.append(line)
Resulting Corruption:
# Original
slots:
  some_slot:
    range: string
# Corrupted (orphaned child)
slots:
    range: string  # INVALID! Now parsed as a slot named "range"
✅ Correct Pattern: Structural Awareness
If removing a slot reference, ensure you remove the entire list item or key-value block.
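import re
for line in lines:
    # BETTER: Skip only the exact list item "- some_slot", not every line that mentions it
    if re.match(r'^\s*-\s*some_slot\s*$', line):
        continue
    new_lines.append(line)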
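For the key-value block case, removal has to cover the key and everything nested beneath it. The following is a rough sketch of an indentation-aware removal, assuming space-indented block-style YAML where children are always indented deeper than their parent key; `remove_key_block` is a hypothetical helper, and its output should still be checked by hand.

```python
import re


def remove_key_block(lines, key):
    """Drop `key:` and every line nested beneath it."""
    key_re = re.compile(r'^( *)' + re.escape(key) + r'\s*:')
    result = []
    skip_indent = None  # indentation of the key whose block is being removed
    for line in lines:
        if skip_indent is not None:
            indent = len(line) - len(line.lstrip(' '))
            # Blank lines and deeper-indented lines still belong to the block.
            if not line.strip() or indent > skip_indent:
                continue
            skip_indent = None  # block ended; handle this line normally
        match = key_re.match(line)
        if match:
            skip_indent = len(match.group(1))
            continue
        result.append(line)
    return result


schema = [
    "slots:",
    "  some_slot:",
    "    range: string",
    "  other_slot:",
    "    range: integer",
]
cleaned = remove_key_block(schema, "some_slot")
print("\n".join(cleaned))
```

Unlike the blind filter in the anti-pattern, this leaves no orphaned `range` line behind. Note that it also drops blank lines directly following the removed block, and it does not handle sequences written at the same indentation as their parent key.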
Application
This rule applies to ALL files in schemas/20251121/linkml/ and future versions.