# Rule: No Rough Edits in Schema Files

**Identifier**: `no-rough-edits-in-schema`
**Severity**: **CRITICAL**

## Core Directive

**DO NOT** perform rough, imprecise, or bulk text substitutions (such as `sed -i` or regex-based Python scripts) on LinkML schema files (`schemas/*/linkml/`) without guaranteeing structural integrity.

**YOU MUST**:
* ✅ Use proper YAML parsers/dumpers if modifying structure programmatically.
* ✅ Manually verify edits if using text replacement.
* ✅ Ensure indentation and nesting are preserved exactly.
* ✅ Respect comments and key ordering. Parsers often discard both, so careful text editing is sometimes necessary, but it must be PRECISE.

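One way to make the "manually verify" step concrete is to print a unified diff of a proposed text replacement before writing anything back to disk. The sketch below uses only Python's standard library. `preview_edit` and the sample schema fragment are hypothetical names made up for illustration; they are not part of this repository's tooling.

```python
import difflib


def preview_edit(original: str, edited: str, name: str = "schema.yaml") -> str:
    """Return a unified diff of a proposed edit so it can be reviewed by eye."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        edited.splitlines(keepends=True),
        fromfile=f"{name} (before)",
        tofile=f"{name} (after)",
    )
    return "".join(diff)


# Hypothetical schema fragment, used only to exercise the helper.
original = (
    "slots:\n"
    "  some_slot:\n"
    "    range: string\n"
    "  other_slot:\n"
    "    range: integer\n"
)
edited = original.replace("range: integer", "range: string")

# Review this output before saving `edited` over the original file.
print(preview_edit(original, edited))
```

An empty diff means the replacement matched nothing, which is itself worth checking before moving on.
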
## Rationale

LinkML schemas are highly structured YAML files where indentation and nesting semantics are critical. Rough edits often cause:
* **Duplicate keys** (e.g., leaving a property behind after deleting its parent key).
* **Invalid indentation** (breaking the parent-child relationship).
* **Silent corruption** (the file is still valid YAML, but the semantics are wrong).

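The duplicate-key failure can be caught with a quick check after a text edit. The following is a minimal sketch, not a YAML parser: it assumes plain `key: value` block mappings indented with spaces and ignores flow style, multi-line scalars, anchors, and mappings nested inside list items. `find_duplicate_keys` and the sample text are hypothetical.

```python
def find_duplicate_keys(text: str) -> list[tuple[int, str]]:
    """Report (line_number, key) for keys repeated within one block mapping."""
    duplicates = []
    stack = []  # one (indent, keys_seen) entry per open mapping level
    for number, raw in enumerate(text.splitlines(), start=1):
        stripped = raw.strip()
        # Skip blanks, comments, list items, and lines that are not `key: ...`.
        if not stripped or stripped.startswith(("#", "-")) or ":" not in stripped:
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        key = stripped.split(":", 1)[0].strip()
        # Close every mapping that is indented deeper than this line.
        while stack and stack[-1][0] > indent:
            stack.pop()
        if stack and stack[-1][0] == indent:
            seen = stack[-1][1]
        else:
            seen = set()
            stack.append((indent, seen))
        if key in seen:
            duplicates.append((number, key))
        seen.add(key)
    return duplicates


# Hypothetical result of deleting a `second_slot:` line but not its child.
corrupted = (
    "slots:\n"
    "  first_slot:\n"
    "    range: string\n"
    "    range: integer\n"
)
print(find_duplicate_keys(corrupted))  # [(4, 'range')]
```

A check like this only flags one class of damage. It does not replace validating the schema with the project's usual LinkML tooling.
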
## Examples

### ❌ Anti-Pattern: Rough Deletion

Deleting lines containing a string without checking context:

```python
# WRONG: Deleting lines blindly
lines = ["slots:", "  some_slot:", "    range: string"]  # sample input
new_lines = []
for line in lines:
    if "some_slot" in line:
        continue  # Deletes the line, but might leave children orphaned!
    new_lines.append(line)
```

**Resulting Corruption**:
```yaml
# Original
slots:
  some_slot:
    range: string

# Corrupted (orphaned child)
slots:
    range: string # INVALID! `range` now sits directly under `slots`
```

### ✅ Correct Pattern: Structural Awareness

If removing a slot reference, ensure you remove the entire list item or key-value block.

```python
import re

# BETTER: Check for list item syntax
item = re.compile(r'^\s*-\s*some_slot\s*$')
new_lines = [line for line in lines if not item.match(line)]
```

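The list-item check covers slot references. For the other case named above, removing a whole key-value block, the key line and everything indented beneath it have to go together. A minimal sketch under stated assumptions: the file uses space indentation, and the key appears on its own line as `key:`. `remove_key_block` is a hypothetical helper, not an existing function in this repository.

```python
def remove_key_block(lines: list[str], key: str) -> list[str]:
    """Drop the `key:` line and every more-deeply-indented line that follows it."""
    kept = []
    skip_indent = None  # indentation of the key being removed, while skipping
    for line in lines:
        indent = len(line) - len(line.lstrip(" "))
        if skip_indent is not None:
            # Blank lines and deeper-indented lines are treated as part of
            # the removed block (this also drops trailing blank separators).
            if not line.strip() or indent > skip_indent:
                continue
            skip_indent = None
        if line.strip() == f"{key}:":
            skip_indent = indent
            continue
        kept.append(line)
    return kept


# Hypothetical schema fragment.
lines = [
    "slots:",
    "  some_slot:",
    "    range: string",
    "  other_slot:",
    "    range: integer",
]
print("\n".join(remove_key_block(lines, "some_slot")))
```

Because the helper matches the key by name at any nesting level, the resulting diff should still be reviewed before it is committed.
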
## Application

This rule applies to ALL files in `schemas/20251121/linkml/` and to the equivalent directory in future schema versions.