glam/.opencode/rules/verified-ontology-mapping-requirements.md
2026-02-04 00:24:46 +01:00

# Rule: Verified Ontology Mapping Requirements
## Overview
All LinkML slot files MUST include ontology mappings that are **verified against the actual ontology files** in `data/ontology/`. Never use hallucinated or assumed ontology terms.
---
## 1. Source Ontology Files
The following ontology files are available for verification:
| Prefix | Namespace | File | Key Properties |
|--------|-----------|------|----------------|
| `crm:` | `http://www.cidoc-crm.org/cidoc-crm/` | `CIDOC_CRM_v7.1.3.rdf` | P1, P2, P22, P23, P70, P82, etc. |
| `rico:` | `https://www.ica.org/standards/RiC/ontology#` | `RiC-O_1-1.rdf` | hasOrHadHolder, isOrWasPartOf, etc. |
| `prov:` | `http://www.w3.org/ns/prov#` | `prov.ttl` | wasInfluencedBy, wasDerivedFrom, used, etc. |
| `schema:` | `http://schema.org/` | `schemaorg.owl` | url, name, description, etc. |
| `dcterms:` | `http://purl.org/dc/terms/` | `dcterms.rdf` | format, rights, source, etc. |
| `skos:` | `http://www.w3.org/2004/02/skos/core#` | `skos.rdf` | prefLabel, notation, inScheme, etc. |
| `foaf:` | `http://xmlns.com/foaf/0.1/` | `foaf.ttl` | page, homepage, name, etc. |
| `dcat:` | `http://www.w3.org/ns/dcat#` | `dcat3.ttl` | mediaType, downloadURL, etc. |
| `time:` | `http://www.w3.org/2006/time#` | `time.ttl` | hasBeginning, hasEnd, etc. |
| `org:` | `http://www.w3.org/ns/org#` | `org.rdf` | siteOf, hasSite, subOrganizationOf, etc. |
| `sosa:` | `http://www.w3.org/ns/sosa/` | `sosa.ttl` | madeBySensor, observes, etc. |
---
## 2. Required Header Documentation
Every slot file MUST include a header comment block with an ontology alignment table:
```yaml
# ==============================================================================
# LinkML Slot Definition: {slot_name}
# ==============================================================================
# {Brief description - one line}
#
# ONTOLOGY ALIGNMENT (verified against data/ontology/):
#
# | Ontology | Property | File/Line | Mapping | Notes |
# |---------------|-----------------------|----------------------|---------|------------------------------------|
# | **PROV-O** | `prov:used` | prov.ttl:1046-1057 | exact | Entity used by activity |
# | **PROV-O** | `prov:wasInfluencedBy`| prov.ttl:1099-1121 | broad | Parent property (subPropertyOf) |
#
# HIERARCHY: prov:used rdfs:subPropertyOf prov:wasInfluencedBy (line 1046)
#
# CREATED: YYYY-MM-DD
# UPDATED: YYYY-MM-DD - Description of changes
# ==============================================================================
```
---
## 3. Mapping Types
Use the correct mapping type based on semantic relationship:
| Mapping Type | Usage | Example |
|--------------|-------|---------|
| `slot_uri` | Primary RDF predicate for this slot | `slot_uri: prov:used` |
| `exact_mappings` | Semantically equivalent properties | `- prov:used` |
| `close_mappings` | Very similar but slightly different semantics | `- prov:wasGeneratedBy` |
| `broad_mappings` | Parent/broader properties (slot is subPropertyOf these) | `- prov:wasInfluencedBy` |
| `narrow_mappings` | Child/narrower properties (these are subPropertyOf slot) | `- prov:qualifiedUsage` |
| `related_mappings` | Conceptually related but different scope | `- dcterms:source` |
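
The broad/narrow distinction follows mechanically from `rdfs:subPropertyOf` edges, so it can be suggested by a small helper. The following is a minimal sketch, not part of LinkML: `SUB_PROPERTY_OF`, `ancestors`, and `classify_mapping` are hypothetical names, and the three edges are copied from the PROV-O hierarchy listed in this rule rather than read from `prov.ttl`.

```python
# Hypothetical child -> parents map; in practice, fill it from data/ontology/.
SUB_PROPERTY_OF = {
    "prov:used": {"prov:wasInfluencedBy"},
    "prov:wasDerivedFrom": {"prov:wasInfluencedBy"},
    "prov:hadPrimarySource": {"prov:wasDerivedFrom"},
}


def ancestors(prop, edges):
    """Return every property reachable from `prop` via subPropertyOf."""
    seen = set()
    stack = [prop]
    while stack:
        for parent in edges.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def classify_mapping(slot_prop, candidate, edges):
    """Suggest a mapping field for `candidate` relative to `slot_prop`."""
    if candidate == slot_prop:
        return "exact_mappings"
    if candidate in ancestors(slot_prop, edges):
        return "broad_mappings"   # slot is subPropertyOf candidate
    if slot_prop in ancestors(candidate, edges):
        return "narrow_mappings"  # candidate is subPropertyOf slot
    return None  # no hierarchy evidence: close/related needs human judgment


print(classify_mapping("prov:used", "prov:wasInfluencedBy", SUB_PROPERTY_OF))
# prints: broad_mappings
```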
---
## 4. Hierarchy Discovery Process
### Step 1: Search for subPropertyOf relationships
```bash
# Find what our property is subPropertyOf (-> broad_mappings).
# The declaration and its rdfs:subPropertyOf often sit on different lines, so print context.
grep -n -A10 "OUR_PROPERTY" data/ontology/*.{ttl,rdf,owl} | grep "subPropertyOf"
# Find properties that are subPropertyOf our property (-> narrow_mappings)
grep -n "subPropertyOf.*OUR_PROPERTY" data/ontology/*.{ttl,rdf,owl}
```
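
When the subject and the line number are both needed for the File/Line column, the same search can be sketched in Python. This is a line-based heuristic, not a Turtle parser: it assumes subjects start at column 0, predicates are indented, and terms are written as prefixed names. `find_sub_property_of` and `SAMPLE` are hypothetical; the sample is a hand-written fragment, so its line numbers do not correspond to the real `prov.ttl`.

```python
import re

# A prefixed name such as "prov:used" or ":used" (default prefix).
NAME_RE = re.compile(r"(?:[A-Za-z][\w-]*)?:[\w-]+")


def find_sub_property_of(turtle_text):
    """Return (line_no, subject, parent) for each rdfs:subPropertyOf found."""
    results = []
    subject = None
    for line_no, line in enumerate(turtle_text.splitlines(), start=1):
        # Assumption: a prefixed name at column 0 opens a new subject block.
        match = NAME_RE.match(line)
        if match:
            subject = match.group(0)
        _, found, rest = line.partition("rdfs:subPropertyOf")
        if found and subject:
            for parent in NAME_RE.findall(rest):
                results.append((line_no, subject, parent))
    return results


# Hand-written fragment for illustration only.
SAMPLE = """\
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

prov:used
    a owl:ObjectProperty ;
    rdfs:subPropertyOf prov:wasInfluencedBy .

prov:hadPrimarySource
    rdfs:subPropertyOf prov:wasDerivedFrom .
"""

for line_no, child, parent in find_sub_property_of(SAMPLE):
    print(f"line {line_no}: {child} subPropertyOf {parent}")
```

Each tuple gives the child, its parent, and the line to cite: the parent is a `broad_mappings` candidate for the child, and the child is a `narrow_mappings` candidate for the parent.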
### Step 2: Document the hierarchy
When you find a hierarchy, document it in:
1. The header comment block (HIERARCHY line)
2. The appropriate mapping field (`broad_mappings` or `narrow_mappings`)
3. Inline comments with file/line references
---
## 5. Key Ontology Hierarchies Reference
### PROV-O (`prov.ttl`)
```
prov:wasInfluencedBy (parent of many)
├── prov:wasDerivedFrom
│   ├── prov:hadPrimarySource
│   ├── prov:wasQuotedFrom
│   └── prov:wasRevisionOf
├── prov:wasGeneratedBy
├── prov:used
├── prov:wasAssociatedWith
├── prov:wasAttributedTo
└── prov:wasInformedBy
prov:influenced (inverse direction)
├── prov:generated
└── prov:invalidated
```
### CIDOC-CRM (`CIDOC_CRM_v7.1.3.rdf`)
```
crm:P1_is_identified_by
├── crm:P48_has_preferred_identifier
└── crm:P168_place_is_defined_by
crm:P82_at_some_time_within
├── crm:P82a_begin_of_the_begin
└── crm:P82b_end_of_the_end
crm:P81_ongoing_throughout
├── crm:P81a_end_of_the_begin
└── crm:P81b_begin_of_the_end
crm:P67_refers_to
└── crm:P70_documents
```
### RiC-O (`RiC-O_1-1.rdf`)
```
rico:isOrWasUnderAuthorityOf
├── rico:hasOrHadManager
│   └── rico:hasOrHadHolder
└── (other authority relationships)
rico:hasOrHadPart
└── rico:containsOrContained
    └── rico:containsTransitive
rico:isSuccessorOf
├── rico:hasAncestor
├── rico:resultedFromTheMergerOf
└── rico:resultedFromTheSplitOf
```
### Dublin Core Terms (`dcterms.rdf`)
```
dcterms:rights
└── dcterms:accessRights
```
### DCAT (`dcat3.ttl`)
```
dcterms:format
├── dcat:mediaType
├── dcat:compressFormat
└── dcat:packageFormat
```
### FOAF (`foaf.ttl`)
```
foaf:page
├── foaf:homepage
├── foaf:weblog
├── foaf:interest
├── foaf:workplaceHomepage
└── foaf:schoolHomepage
```
### Schema.org (`schemaorg.owl`)
```
schema:workFeatured
├── schema:workPerformed
└── schema:workPresented
```
---
## 6. Verification Commands
### Check if a property exists
```bash
grep -n "PROPERTY_NAME" data/ontology/FILE.ttl
```
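
A plain `grep` for a short name such as `used` also matches prose inside `rdfs:comment` strings. The sketch below narrows the match to occurrences that directly follow `:`, `#`, or `/`, which covers prefixed names in Turtle and full IRIs in RDF/XML. `term_lines` and `SAMPLE` are hypothetical, and the sample is a hand-written fragment rather than an excerpt from `prov.ttl`.

```python
import re


def term_lines(ontology_text, local_name):
    """Return 1-based line numbers where `local_name` appears as a term."""
    # Require a preceding ':', '#' or '/' and no trailing name character,
    # so "used" does not match prose or a longer name such as "usedBy".
    pattern = re.compile(r"(?<=[:#/])" + re.escape(local_name) + r"(?![\w-])")
    return [
        line_no
        for line_no, line in enumerate(ontology_text.splitlines(), start=1)
        if pattern.search(line)
    ]


# Hand-written fragment for illustration only.
SAMPLE = """\
prov:used a owl:ObjectProperty ;
    rdfs:comment "An entity that was used by an activity." ;
    rdfs:subPropertyOf prov:wasInfluencedBy .
prov:qualifiedUsage a owl:ObjectProperty .
"""

print(term_lines(SAMPLE, "used"))           # [1]
print(term_lines(SAMPLE, "retrievedWith"))  # []
```

An empty result means the term was not found in that file and should not be used in a mapping until it is located elsewhere.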
### Find all subPropertyOf for a property
```bash
grep -B5 -A5 "subPropertyOf" data/ontology/FILE.ttl | grep -A5 -B5 "PROPERTY_NAME"
```
### Validate YAML after editing
```bash
python3 -c "import yaml; yaml.safe_load(open('FILENAME.yaml')); print('✅ valid')"
```
---
## 7. Complete Slot File Example
```yaml
# ==============================================================================
# LinkML Slot Definition: retrieved_through
# ==============================================================================
# To denote the specific method, protocol, or mechanism by which a resource
# or data was accessed, fetched, or collected.
#
# ONTOLOGY ALIGNMENT (verified against data/ontology/):
#
# | Ontology | Property | File/Line | Mapping | Notes |
# |------------|--------------------------|--------------------|---------|------------------------------------|
# | **PROV-O** | `prov:used` | prov.ttl:1046-1057 | exact | Entity used by activity |
# | **PROV-O** | `prov:wasInfluencedBy` | prov.ttl:1099-1121 | broad | Parent property (subPropertyOf) |
# | **PROV-O** | `prov:qualifiedUsage` | prov.ttl:788-798 | narrow | Qualified usage with details |
#
# HIERARCHY: prov:used rdfs:subPropertyOf prov:wasInfluencedBy (line 1046)
#
# CREATED: 2026-01-26
# UPDATED: 2026-02-03 - Added broad/narrow mappings, header documentation
# ==============================================================================
id: https://nde.nl/ontology/hc/slot/retrieved_through
name: retrieved_through
title: Retrieved Through
prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  prov: http://www.w3.org/ns/prov#
  schema: http://schema.org/
imports:
  - linkml:types
default_prefix: hc
slots:
  retrieved_through:
    slot_uri: prov:used
    description: |
      To denote the specific method, protocol, or mechanism by which a resource or data was accessed, fetched, or collected.
    range: string
    exact_mappings:
      - prov:used # prov.ttl:1046-1057
    broad_mappings:
      - prov:wasInfluencedBy # prov.ttl:1099-1121 - parent (used subPropertyOf wasInfluencedBy)
    narrow_mappings:
      - prov:qualifiedUsage # prov.ttl:788-798 - qualified form with details
    comments:
      - |
        **ONTOLOGY ALIGNMENT** (verified against data/ontology/):
        | Ontology | Property | Line | Mapping | Notes |
        |----------|----------|------|---------|-------|
        | PROV-O | prov:used | 1046-1057 | exact | Entity used by activity |
        | PROV-O | prov:wasInfluencedBy | 1099-1121 | broad | Parent property |
        | PROV-O | prov:qualifiedUsage | 788-798 | narrow | Qualified usage |
```
---
## 8. Anti-Patterns
### ❌ WRONG: Hallucinated ontology terms
```yaml
exact_mappings:
- prov:retrievedWith # ❌ Does not exist in PROV-O!
- rico:wasObtainedBy # ❌ Not a real RiC-O property!
```
### ❌ WRONG: No verification references
```yaml
exact_mappings:
- prov:used # No file/line reference - how do we know this is correct?
```
### ✅ CORRECT: Verified with references
```yaml
exact_mappings:
- prov:used # prov.ttl:1046-1057 - "Entity used by activity"
broad_mappings:
- prov:wasInfluencedBy # prov.ttl:1099-1121 - parent property (verified subPropertyOf)
```
---
## 9. Validation Checklist
Before completing a slot file, verify:
- [ ] Header comment block includes ontology alignment table
- [ ] All mappings verified against actual ontology files in `data/ontology/`
- [ ] File/line references provided for each mapping
- [ ] `rdfs:subPropertyOf` relationships checked for broad/narrow mappings
- [ ] HIERARCHY line documents any property hierarchies
- [ ] No hallucinated or assumed ontology terms
- [ ] YAML validates correctly
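
The "file/line references" item can be checked mechanically. The sketch below is a text-level lint, not a YAML parser: it assumes each mapping term sits on its own `- term` line with the reference in a trailing comment such as `# prov.ttl:1046-1057`. `unreferenced_mappings` and `SAMPLE` are hypothetical names, and the check only confirms that a reference is present, not that the cited lines are correct.

```python
import re

MAPPING_KEYS = (
    "exact_mappings",
    "close_mappings",
    "broad_mappings",
    "narrow_mappings",
    "related_mappings",
)
# A trailing comment containing e.g. "prov.ttl:1046" or "prov.ttl:1046-1057".
REF_RE = re.compile(r"#.*\.(?:ttl|rdf|owl):\d+")
ITEM_RE = re.compile(r"^\s*-\s+(\S+)")


def unreferenced_mappings(slot_yaml_text):
    """Return (line_no, term) for mapping items without a file:line comment."""
    problems = []
    in_mapping = False
    for line_no, line in enumerate(slot_yaml_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.endswith(":") and stripped[:-1] in MAPPING_KEYS:
            in_mapping = True
            continue
        item = ITEM_RE.match(line)
        if in_mapping and item:
            if not REF_RE.search(line):
                problems.append((line_no, item.group(1)))
        elif stripped:
            in_mapping = False  # any other non-blank line ends the list
    return problems


# Hand-written fragment for illustration only.
SAMPLE = """\
exact_mappings:
  - prov:used  # prov.ttl:1046-1057
broad_mappings:
  - prov:wasInfluencedBy
"""

print(unreferenced_mappings(SAMPLE))  # [(4, 'prov:wasInfluencedBy')]
```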
---
## See Also
- Rule 1: Ontology Files Are Your Primary Reference (`no-hallucinated-ontology-references.md`)
- Rule: Verified Ontology Terms (`verified-ontology-terms.md`)
- Ontology files: `data/ontology/`
---
**Version**: 1.0.0
**Created**: 2026-02-03
**Author**: OpenCODE