docs: add Rule 52 prohibiting duplicate ontology mappings

- Create .opencode/rules/no-duplicate-ontology-mappings.md with detection script
- Add Rule 52 to AGENTS.md (after Rule 51)
- Fix 29 duplicate mappings: same URI in multiple mapping categories
  - 26 slot files: remove duplicates keeping most precise mapping
  - 3 class files: ExhibitionSpace, Custodian, DigitalPlatform
- Mapping precedence: exact > close > narrow/broad > related

Each ontology URI must appear in only ONE mapping category per schema
element, following SKOS semantics where mapping properties are mutually
exclusive.
kempersc 2026-01-13 15:57:26 +01:00
parent 6781073d06
commit 73b3b21017
32 changed files with 243 additions and 58 deletions

View file

@ -0,0 +1,189 @@
# Rule 52: No Duplicate Ontology Mappings
## Summary
Each ontology URI MUST appear in only ONE mapping category per schema element. A URI cannot simultaneously have multiple semantic relationships to the same class or slot.
## The Problem
LinkML provides five mapping annotation types based on SKOS vocabulary alignment:
| Property | SKOS Predicate | Meaning |
|----------|---------------|---------|
| `exact_mappings` | `skos:exactMatch` | "This IS that" (equivalent) |
| `close_mappings` | `skos:closeMatch` | "This is very similar to that" |
| `related_mappings` | `skos:relatedMatch` | "This is conceptually related to that" |
| `narrow_mappings` | `skos:narrowMatch` | "That is MORE SPECIFIC than this" (the mapped URI is narrower) |
| `broad_mappings` | `skos:broadMatch` | "That is MORE GENERAL than this" (the mapped URI is broader) |
These relationships are **mutually exclusive**. A URI cannot simultaneously:
- BE the element (`exact_mappings`) AND be broader than it (`broad_mappings`)
- Be closely similar (`close_mappings`) AND be more general (`broad_mappings`)
## Anti-Pattern (WRONG)
```yaml
# WRONG - schema:url appears in TWO mapping types
slots:
  source_url:
    slot_uri: prov:atLocation
    exact_mappings:
      - schema:url  # Says "source_url IS schema:url"
    broad_mappings:
      - schema:url  # Says "schema:url is MORE GENERAL than source_url"
```
This is a **logical contradiction**: `source_url` cannot simultaneously BE `schema:url` AND be more specific than `schema:url`.
## Correct Pattern
```yaml
# CORRECT - each URI appears in only ONE mapping type
slots:
  source_url:
    slot_uri: prov:atLocation
    exact_mappings:
      - schema:url  # source_url IS schema:url
    close_mappings:
      - dcterms:source  # Similar but not identical
```
## Decision Guide: Which Mapping to Keep
When a URI appears in multiple categories, keep the **most precise** one:
### Precedence Order (keep the first match)
1. **exact_mappings** - Strongest claim: semantic equivalence
2. **close_mappings** - Strong claim: nearly equivalent
3. **narrow_mappings** / **broad_mappings** - Hierarchical relationship
4. **related_mappings** - Weakest claim: conceptual association
### Decision Matrix
| If URI appears in... | Keep | Remove |
|---------------------|------|--------|
| exact + broad | exact | broad |
| exact + close | exact | close |
| exact + related | exact | related |
| close + broad | close | broad |
| close + related | close | related |
| related + broad | related | broad |
| narrow + broad | narrow | broad (contradictory!) |
### Special Case: narrow + broad
If a URI appears in BOTH `narrow_mappings` AND `broad_mappings`, this is a **data error** - the same URI cannot be both more specific AND more general. Investigate which is correct based on the ontology definition.
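The Decision Matrix above can be sketched as a lookup table. This is a minimal illustration, not code from the repository: the names `KEEP`, `NEEDS_REVIEW`, and `category_to_keep` are hypothetical, and the sketch assumes a URI appears in exactly two categories. Following the special case, narrow + broad is reported as unresolved rather than fixed automatically, and pairs the matrix does not list are also left unresolved.

```python
# Decision Matrix rows: (pair of categories) -> category to keep.
# Pairs that the matrix does not list are deliberately absent.
KEEP = {
    frozenset({"exact_mappings", "broad_mappings"}): "exact_mappings",
    frozenset({"exact_mappings", "close_mappings"}): "exact_mappings",
    frozenset({"exact_mappings", "related_mappings"}): "exact_mappings",
    frozenset({"close_mappings", "broad_mappings"}): "close_mappings",
    frozenset({"close_mappings", "related_mappings"}): "close_mappings",
    frozenset({"related_mappings", "broad_mappings"}): "related_mappings",
}

# narrow + broad is a data error that needs a human decision.
NEEDS_REVIEW = frozenset({"narrow_mappings", "broad_mappings"})


def category_to_keep(categories):
    """Return the category to keep for a URI found in two categories.

    Returns None when the pair must be investigated manually: either
    narrow + broad, or a pair the Decision Matrix does not cover.
    """
    pair = frozenset(categories)
    if pair == NEEDS_REVIEW:
        return None
    return KEEP.get(pair)


print(category_to_keep(["exact_mappings", "broad_mappings"]))
print(category_to_keep(["narrow_mappings", "broad_mappings"]))
```

Returning `None` instead of guessing keeps the contradictory case visible to the person doing the cleanup.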
## Real Examples Fixed
### Example 1: source_url
```yaml
# BEFORE (wrong)
slots:
  source_url:
    exact_mappings:
      - schema:url
    broad_mappings:
      - schema:url  # Duplicate!

# AFTER (correct)
slots:
  source_url:
    exact_mappings:
      - schema:url  # Keep exact (strongest)
    # broad_mappings removed
```
### Example 2: Custodian class
```yaml
# BEFORE (wrong)
classes:
  Custodian:
    close_mappings:
      - cpov:PublicOrganisation
    narrow_mappings:
      - cpov:PublicOrganisation  # Duplicate!

# AFTER (correct)
classes:
  Custodian:
    close_mappings:
      - cpov:PublicOrganisation  # Keep close (Custodian ≈ PublicOrganisation)
    # narrow_mappings: use for URIs that are MORE SPECIFIC than Custodian
```
### Example 3: geonames_id (narrow + broad conflict)
```yaml
# BEFORE (wrong - logical contradiction!)
slots:
  geonames_id:
    narrow_mappings:
      - dcterms:identifier  # Says geonames_id is MORE SPECIFIC
    broad_mappings:
      - dcterms:identifier  # Says geonames_id is MORE GENERAL

# AFTER (correct)
slots:
  geonames_id:
    narrow_mappings:
      - dcterms:identifier  # geonames_id IS a specific type of identifier
    # broad_mappings removed (was contradictory)
```
## Detection Script
Run this to find duplicate mappings in the schema:
```python
import yaml
from pathlib import Path
from collections import defaultdict

mapping_types = ['exact_mappings', 'close_mappings', 'related_mappings',
                 'narrow_mappings', 'broad_mappings']
dirs = [
    Path('schemas/20251121/linkml/modules/slots'),
    Path('schemas/20251121/linkml/modules/classes'),
]

for d in dirs:
    for yaml_file in d.glob('*.yaml'):
        try:
            with open(yaml_file) as f:
                content = yaml.safe_load(f)
        except Exception:
            continue  # skip files that cannot be read or parsed
        # Skip empty files and files whose top level is not a mapping
        if not isinstance(content, dict):
            continue
        for section in ['classes', 'slots']:
            items = content.get(section, {})
            if not isinstance(items, dict):
                continue
            for name, defn in items.items():
                if not isinstance(defn, dict):
                    continue
                # Record every mapping category in which each URI appears
                uri_to_types = defaultdict(list)
                for mt in mapping_types:
                    for uri in defn.get(mt, []) or []:
                        uri_to_types[uri].append(mt)
                # Report URIs listed in more than one category
                for uri, types in uri_to_types.items():
                    if len(types) > 1:
                        print(f"{yaml_file}: {name} - {uri} in {types}")
```
## Validation Rule
**Pre-commit check**: Before committing LinkML schema changes, run the detection script. If any duplicates are found, the commit should fail.
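The detection script above only prints its findings, so a hook would also need a non-zero exit status to block the commit. A minimal sketch of that wiring, assuming the class or slot definitions have already been loaded into a dict; `find_duplicates` and `exit_code` are hypothetical names, not existing repository code:

```python
import sys
from collections import defaultdict

MAPPING_TYPES = ['exact_mappings', 'close_mappings', 'related_mappings',
                 'narrow_mappings', 'broad_mappings']


def find_duplicates(definition):
    """Return {uri: [categories]} for URIs listed in more than one category."""
    uri_to_types = defaultdict(list)
    for mapping_type in MAPPING_TYPES:
        for uri in definition.get(mapping_type) or []:
            uri_to_types[uri].append(mapping_type)
    return {uri: types for uri, types in uri_to_types.items() if len(types) > 1}


def exit_code(definitions):
    """Return 1 if any element has a duplicate mapping, otherwise 0."""
    failed = False
    for name, definition in definitions.items():
        for uri, types in find_duplicates(definition).items():
            print(f"{name}: {uri} in {types}", file=sys.stderr)
            failed = True
    return 1 if failed else 0


# Illustrative input mirroring the source_url anti-pattern.
example = {
    "source_url": {
        "exact_mappings": ["schema:url"],
        "broad_mappings": ["schema:url"],
    }
}
print(exit_code(example))
```

A hook script would pass the returned value to `sys.exit()` so that a result of 1 aborts the commit.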
## References
- [LinkML Mappings Documentation](https://linkml.io/linkml-model/latest/docs/mappings/)
- [SKOS Mapping Properties](https://www.w3.org/TR/skos-reference/#mapping)
- Rule 50: Ontology-to-LinkML Mapping Convention (parent rule)
- Rule 51: No Hallucinated Ontology References

View file

@ -1528,6 +1528,56 @@ slots:
---
### Rule 52: No Duplicate Ontology Mappings
🚨 **CRITICAL**: Each ontology URI MUST appear in only ONE mapping category per schema element. A URI cannot have multiple semantic relationships to the same class or slot.
**The Problem**: LinkML mapping properties (`exact_mappings`, `close_mappings`, `related_mappings`, `narrow_mappings`, `broad_mappings`) are mutually exclusive based on SKOS semantics. The same URI appearing in multiple categories creates logical contradictions.
**Anti-Pattern (WRONG)**:
```yaml
slots:
  source_url:
    exact_mappings:
      - schema:url  # Says "source_url IS schema:url"
    broad_mappings:
      - schema:url  # Says "schema:url is MORE GENERAL than source_url"
# CONTRADICTION: source_url cannot both BE schema:url AND be more specific than it
```
**Correct Pattern**:
```yaml
slots:
  source_url:
    exact_mappings:
      - schema:url  # Keep only the most precise mapping
```
**Decision Guide** - When duplicates found, keep the MOST PRECISE:
| Precedence | Mapping Type | Meaning |
|------------|--------------|---------|
| 1st (keep) | `exact_mappings` | Semantic equivalence |
| 2nd | `close_mappings` | Nearly equivalent |
| 3rd | `narrow_mappings` | Mapped URI is more specific |
| 4th | `broad_mappings` | Mapped URI is more general |
| 5th | `related_mappings` | Conceptual association |
**Quick Reference**:
| If URI in... | Action |
|--------------|--------|
| exact + broad | Keep exact, remove broad |
| close + broad | Keep close, remove broad |
| related + broad | Keep related, remove broad |
| narrow + broad | ERROR - investigate (contradictory) |
**See**: `.opencode/rules/no-duplicate-ontology-mappings.md` for complete documentation
---
## Appendix: Full Rule Content (No .opencode Equivalent)
The following rules have no separate .opencode file and are preserved in full:

View file

@ -1,5 +1,5 @@
  {
- "generated": "2026-01-13T12:50:30.701Z",
+ "generated": "2026-01-13T14:57:26.896Z",
  "schemaRoot": "/schemas/20251121/linkml",
  "totalFiles": 2886,
  "categoryCounts": {

View file

@ -146,7 +146,6 @@ classes:
  - pico:Person
  - schema:Person
  - schema:Organization
- - cpov:PublicOrganisation
  - rico:CorporateBody
  - org:Organization
  - foaf:Person

View file

@ -91,7 +91,6 @@ classes:
  - foaf:homepage
  close_mappings:
  - schema:WebApplication
- - schema:SoftwareApplication
  - dcat:Catalog
  - dcat:DataService
  - crm:E73_Information_Object

View file

@ -76,7 +76,6 @@ classes:
  related_mappings:
  - schema:Museum
  - schema:ArtGallery
- - aat:300005768
  slots:
  - has_or_had_admission_fee
  - current_exhibition

View file

@ -28,5 +28,3 @@ slots:
  description: Natural history specimen data standard
  related_mappings:
  - dcterms:conformsTo
- broad_mappings:
- - dcterms:conformsTo

View file

@ -29,5 +29,3 @@ slots:
  '
  close_mappings:
  - dcterms:type
- broad_mappings:
- - dcterms:type

View file

@ -14,5 +14,3 @@ slots:
  description: The extracted value from the web source. This is the actual content claimed to exist at the XPath location.
  close_mappings:
  - rdf:value
- broad_mappings:
- - rdf:value

View file

@ -14,5 +14,3 @@ slots:
  - schema:description
  related_mappings:
  - dcterms:description
- broad_mappings:
- - dcterms:description

View file

@ -21,5 +21,3 @@ slots:
  - dcterms:coverage
  close_mappings:
  - schema:about
- broad_mappings:
- - dcterms:coverage

View file

@ -71,5 +71,3 @@ slots:
  - schema:numberOfItems
  close_mappings:
  - dcterms:extent
- broad_mappings:
- - dcterms:extent

View file

@ -41,5 +41,3 @@ slots:
  '
  close_mappings:
  - dcterms:type
- broad_mappings:
- - dcterms:type

View file

@ -11,5 +11,3 @@ slots:
  '
  close_mappings:
  - prov:wasGeneratedBy
- broad_mappings:
- - prov:wasGeneratedBy

View file

@ -39,7 +39,7 @@ slots:
  conflict_status:
  status: destroyed
  date: "2023-12-08"
- description: "Destroyed by Israeli airstrike\ \ during Gaza conflict"
+ description: "Destroyed by Israeli airstrike during Gaza conflict"
  sources:
  - "LAP Gaza Report 2024"
@ -49,7 +49,7 @@ slots:
  status: damaged
  date: "2022-03-01"
  is_rebuilding: true
- description: "Damaged\ \ by shelling, currently under restoration"
+ description: "Damaged by shelling, currently under restoration"
  sources:
  - "UNESCO Ukraine heritage monitoring"
@ -62,5 +62,3 @@ slots:
  - hc:time_of_destruction
  - hc:ConflictStatus
  - hc:ConflictStatusEnum
- broad_mappings:
- - adms:status

View file

@ -26,5 +26,3 @@ slots:
  exact_mappings:
  - schema:email
  - vcard:hasEmail
- broad_mappings:
- - schema:email

View file

@ -19,5 +19,3 @@ slots:
  - schema:startDate
  related_mappings:
  - dcterms:date
- broad_mappings:
- - dcterms:date

View file

@ -38,5 +38,3 @@ slots:
  - CustodianTimelineEvent overrides range to TimelineExtractionMethodEnum
  close_mappings:
  - prov:wasGeneratedBy
- broad_mappings:
- - prov:wasGeneratedBy

View file

@ -16,5 +16,3 @@ slots:
  '
  close_mappings:
  - skos:note
- broad_mappings:
- - skos:note

View file

@ -38,8 +38,6 @@ slots:
  - gn:geonamesID
  narrow_mappings:
  - dcterms:identifier
- broad_mappings:
- - dcterms:identifier
  comments:
  - Used by Settlement, AuxiliaryPlace, and GeoSpatialPlace classes
  - 'Lookup URL: https://www.geonames.org/{geonames_id}/'

View file

@ -32,5 +32,3 @@ slots:
  \n iiif_image_api_version: \"3.0\"\n```\n"
  close_mappings:
  - dcat:endpointURL
- broad_mappings:
- - dcat:endpointURL

View file

@ -29,5 +29,3 @@ slots:
  range: string
  related_mappings:
  - dcterms:source
- broad_mappings:
- - dcterms:source

View file

@ -9,5 +9,3 @@ slots:
  \ (ISIL)\n- 'Q190804' (Wikidata)\n- '148691498' (VIAF)\n- '0000 0001 2146 5765' (ISNI with spaces)\n"
  exact_mappings:
  - rdf:value
- broad_mappings:
- - rdf:value

View file

@ -54,9 +54,8 @@ slots:
  close_mappings:
  - schema:parentOrganization
- - schema:memberOf
  - rico:isOrWasSubordinateTo
  broad_mappings:
  - schema:memberOf

View file

@ -32,5 +32,3 @@ slots:
  - org:linkedTo
  close_mappings:
  - prov:wasAttributedTo
- broad_mappings:
- - prov:wasAttributedTo

View file

@ -37,5 +37,3 @@ slots:
  '
  close_mappings:
  - dcterms:conformsTo
- broad_mappings:
- - dcterms:conformsTo

View file

@ -43,5 +43,3 @@ slots:
  range: string
  close_mappings:
  - dcterms:source
- broad_mappings:
- - dcterms:source

View file

@ -19,5 +19,3 @@ slots:
  \ types (e.g., digital archive + aggregator).\n"
  close_mappings:
  - dcterms:type
- broad_mappings:
- - dcterms:type

View file

@ -27,5 +27,3 @@ slots:
  - PersonWebClaim overrides range to RetrievalAgentEnum
  close_mappings:
  - prov:wasAttributedTo
- broad_mappings:
- - prov:wasAttributedTo

View file

@ -13,5 +13,3 @@ slots:
  '
  related_mappings:
  - dcterms:type
- broad_mappings:
- - dcterms:type

View file

@ -46,8 +46,6 @@ slots:
  exact_mappings:
  - schema:url
  - dcterms:source
- broad_mappings:
- - schema:url
  comments:
  - Maps to pav:retrievedFrom for provenance tracking
  - Essential for web claim verification workflows

View file

@ -40,5 +40,3 @@ slots:
  - schema:sameAs
  narrow_mappings:
  - dcterms:identifier
- broad_mappings:
- - dcterms:identifier