- Create .opencode/rules/no-duplicate-ontology-mappings.md with detection script
- Add Rule 52 to AGENTS.md (after Rule 51)
- Fix 29 duplicate mappings: same URI in multiple mapping categories
  - 26 slot files: remove duplicates keeping most precise mapping
  - 3 class files: ExhibitionSpace, Custodian, DigitalPlatform
- Mapping precedence: exact > close > narrow/broad > related

Each ontology URI must appear in only ONE mapping category per schema element, following SKOS semantics where mapping properties are mutually exclusive.
Rule 52: No Duplicate Ontology Mappings
Summary
Each ontology URI MUST appear in only ONE mapping category per schema element. A URI cannot simultaneously have multiple semantic relationships to the same class or slot.
The Problem
LinkML provides five mapping annotation types based on SKOS vocabulary alignment:
| Property | SKOS Predicate | Meaning |
|---|---|---|
| exact_mappings | skos:exactMatch | "This IS that" (equivalent) |
| close_mappings | skos:closeMatch | "This is very similar to that" |
| related_mappings | skos:relatedMatch | "This is conceptually related to that" |
| narrow_mappings | skos:narrowMatch | "That is MORE SPECIFIC than this" (the mapped URI is narrower) |
| broad_mappings | skos:broadMatch | "That is MORE GENERAL than this" (the mapped URI is broader) |
These relationships are mutually exclusive. A URI cannot simultaneously:
- BE the element (exact_mappings) AND be broader than it (broad_mappings)
- Be closely similar (close_mappings) AND be more general (broad_mappings)
Anti-Pattern (WRONG)
# WRONG - schema:url appears in TWO mapping types
slots:
  source_url:
    slot_uri: prov:atLocation
    exact_mappings:
      - schema:url  # Says "source_url IS schema:url"
    broad_mappings:
      - schema:url  # Says "schema:url is MORE GENERAL than source_url"
This is a logical contradiction: source_url cannot simultaneously BE schema:url AND be more specific than schema:url.
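The contradiction can also be spotted mechanically. The following is a minimal sketch, assuming the slot definition has already been parsed into a plain Python dict; the helper name `conflicting_pairs` is hypothetical and not part of LinkML.

```python
from itertools import combinations

MAPPING_TYPES = (
    "exact_mappings", "close_mappings", "related_mappings",
    "narrow_mappings", "broad_mappings",
)


def conflicting_pairs(definition):
    """Yield (type_a, type_b, shared URIs) for each pair of mapping types that overlap."""
    for type_a, type_b in combinations(MAPPING_TYPES, 2):
        shared = set(definition.get(type_a) or []) & set(definition.get(type_b) or [])
        if shared:
            yield type_a, type_b, sorted(shared)


# The anti-pattern above, written as the dict a YAML parser would produce.
source_url = {
    "slot_uri": "prov:atLocation",
    "exact_mappings": ["schema:url"],
    "broad_mappings": ["schema:url"],
}

print(list(conflicting_pairs(source_url)))
# → [('exact_mappings', 'broad_mappings', ['schema:url'])]
```

An empty result means every URI in the element sits in exactly one mapping category.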
Correct Pattern
# CORRECT - each URI appears in only ONE mapping type
slots:
  source_url:
    slot_uri: prov:atLocation
    exact_mappings:
      - schema:url  # source_url IS schema:url
    close_mappings:
      - dcterms:source  # Similar but not identical
Decision Guide: Which Mapping to Keep
When a URI appears in multiple categories, keep the most precise one:
Precedence Order (keep the first match)
- exact_mappings - Strongest claim: semantic equivalence
- close_mappings - Strong claim: nearly equivalent
- narrow_mappings / broad_mappings - Hierarchical relationship
- related_mappings - Weakest claim: conceptual association
Decision Matrix
| If URI appears in... | Keep | Remove |
|---|---|---|
| exact + broad | exact | broad |
| exact + close | exact | close |
| exact + related | exact | related |
| close + broad | close | broad |
| close + related | close | related |
| related + broad | broad | related |
| narrow + broad | the one the ontology definition supports | the other (contradictory!) |
Special Case: narrow + broad
If a URI appears in BOTH narrow_mappings AND broad_mappings, this is a data error - the same URI cannot be both more specific AND more general. Investigate which is correct based on the ontology definition.
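The precedence order can be expressed as a small lookup. This is a sketch under the assumption that narrow_mappings and broad_mappings share one rank, so a narrow + broad conflict is handed back for manual review instead of being resolved automatically; `PRECEDENCE` and `resolve_duplicate` are illustrative names, not part of any tool.

```python
# Lower rank = stronger claim. narrow and broad share a rank because
# neither outranks the other in the precedence order.
PRECEDENCE = {
    "exact_mappings": 0,
    "close_mappings": 1,
    "narrow_mappings": 2,
    "broad_mappings": 2,
    "related_mappings": 3,
}


def resolve_duplicate(categories):
    """Return (keep, remove) for one URI found in several mapping categories.

    keep is None when the strongest categories tie (narrow + broad); in that
    case all categories are returned so a person can check the ontology.
    """
    ranked = sorted(categories, key=PRECEDENCE.__getitem__)
    best = ranked[0]
    tied = [c for c in ranked if PRECEDENCE[c] == PRECEDENCE[best]]
    if len(tied) > 1:
        return None, ranked
    return best, ranked[1:]


print(resolve_duplicate(["exact_mappings", "broad_mappings"]))
# → ('exact_mappings', ['broad_mappings'])
print(resolve_duplicate(["narrow_mappings", "broad_mappings"]))
# → (None, ['narrow_mappings', 'broad_mappings'])
```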
Real Examples Fixed
Example 1: source_url
# BEFORE (wrong)
slots:
  source_url:
    exact_mappings:
      - schema:url
    broad_mappings:
      - schema:url  # Duplicate!

# AFTER (correct)
slots:
  source_url:
    exact_mappings:
      - schema:url  # Keep exact (strongest)
    # broad_mappings removed
Example 2: Custodian class
# BEFORE (wrong)
classes:
  Custodian:
    close_mappings:
      - cpov:PublicOrganisation
    narrow_mappings:
      - cpov:PublicOrganisation  # Duplicate!

# AFTER (correct)
classes:
  Custodian:
    close_mappings:
      - cpov:PublicOrganisation  # Keep close (Custodian ≈ PublicOrganisation)
    # narrow_mappings: use for URIs that are MORE SPECIFIC than Custodian
Example 3: geonames_id (narrow + broad conflict)
# BEFORE (wrong - logical contradiction!)
slots:
  geonames_id:
    narrow_mappings:
      - dcterms:identifier  # Says dcterms:identifier is MORE SPECIFIC than geonames_id
    broad_mappings:
      - dcterms:identifier  # Says dcterms:identifier is MORE GENERAL than geonames_id

# AFTER (correct)
slots:
  geonames_id:
    broad_mappings:
      - dcterms:identifier  # geonames_id IS a specific type of identifier
    # narrow_mappings removed (was contradictory)
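Each fix above amounts to removing the URI from every mapping category except the one that is kept. A minimal sketch of that edit on a parsed element definition follows; `keep_only` is a hypothetical helper, and the category to keep is supplied by the caller, not inferred.

```python
MAPPING_TYPES = (
    "exact_mappings", "close_mappings", "related_mappings",
    "narrow_mappings", "broad_mappings",
)


def keep_only(definition, uri, keep):
    """Return a new dict in which uri is listed only under the `keep` category."""
    fixed = {}
    for key, value in definition.items():
        if key in MAPPING_TYPES and key != keep:
            value = [u for u in (value or []) if u != uri]
            if not value:
                continue  # drop a mapping list that is empty after the removal
        fixed[key] = value
    return fixed


# Example 1 as a parsed dict: schema:url is listed under exact and broad.
before = {
    "exact_mappings": ["schema:url"],
    "broad_mappings": ["schema:url"],
}
print(keep_only(before, "schema:url", "exact_mappings"))
# → {'exact_mappings': ['schema:url']}
```

Other URIs in the trimmed categories are left in place, so only the duplicated entry is affected.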
Detection Script
Run this to find duplicate mappings in the schema:
import sys
from collections import defaultdict
from pathlib import Path

import yaml  # PyYAML

mapping_types = ['exact_mappings', 'close_mappings', 'related_mappings',
                 'narrow_mappings', 'broad_mappings']
dirs = [
    Path('schemas/20251121/linkml/modules/slots'),
    Path('schemas/20251121/linkml/modules/classes'),
]

found = False
for d in dirs:
    for yaml_file in sorted(d.glob('*.yaml')):
        try:
            with open(yaml_file) as f:
                content = yaml.safe_load(f)
        except (OSError, yaml.YAMLError) as exc:
            # Report unreadable files instead of skipping them silently
            print(f"{yaml_file}: skipped ({exc})", file=sys.stderr)
            continue
        if not isinstance(content, dict):
            continue
        for section in ['classes', 'slots']:
            items = content.get(section, {})
            if not isinstance(items, dict):
                continue
            for name, defn in items.items():
                if not isinstance(defn, dict):
                    continue
                # Record every mapping type in which each URI appears
                uri_to_types = defaultdict(list)
                for mt in mapping_types:
                    for uri in defn.get(mt, []) or []:
                        uri_to_types[uri].append(mt)
                for uri, types in uri_to_types.items():
                    if len(types) > 1:
                        found = True
                        print(f"{yaml_file}: {name} - {uri} in {types}")

# A non-zero exit status lets a pre-commit check fail the commit
sys.exit(1 if found else 0)
Validation Rule
Pre-commit check: Before committing LinkML schema changes, run the detection script. If any duplicates are found, the commit should fail.
References
- LinkML Mappings Documentation
- SKOS Mapping Properties
- Rule 50: Ontology-to-LinkML Mapping Convention (parent rule)
- Rule 51: No Hallucinated Ontology References