Refactor code structure for improved readability and maintainability

kempersc 2026-01-12 19:13:35 +01:00
parent 3b35f4aea5
commit 8d7aca0f98
3 changed files with 335 additions and 2 deletions

@@ -0,0 +1,274 @@
# Rule 50: Ontology-to-LinkML Mapping Convention
🚨 **CRITICAL**: When mapping base ontology classes and predicates to LinkML schema elements, use LinkML's dedicated mapping properties as documented at https://linkml.io/linkml-model/latest/docs/mappings/
---
## 1. What "LinkML Mapping" Means in This Project
**"LinkML mapping"** refers specifically to:
1. Connecting LinkML schema elements (classes, slots, enums) to external ontology URIs
2. Using LinkML's built-in mapping properties (`class_uri`, `slot_uri`, `*_mappings`)
3. Following SKOS-based vocabulary alignment standards
**LinkML mapping does NOT mean**:
- Creating arbitrary crosswalks in spreadsheets
- Writing prose descriptions of how concepts relate
- Inventing custom `@context` JSON-LD mappings outside the schema
---
## 2. LinkML Mapping Property Reference
### Primary Identity Properties
| Property | Applies To | Purpose | Example |
|----------|-----------|---------|---------|
| `class_uri` | Classes | Primary RDF class URI | `class_uri: ore:Aggregation` |
| `slot_uri` | Slots | Primary RDF predicate URI | `slot_uri: rico:hasOrHadHolder` |
| `enum_uri` | Enums | Primary RDF URI for the enum | `enum_uri: hc:PlatformTypeEnum` |
### SKOS-Based Mapping Properties
These properties express **semantic relationships** to external ontology terms:
| Property | SKOS Predicate | Meaning | Use When |
|----------|---------------|---------|----------|
| `exact_mappings` | `skos:exactMatch` | Identical meaning | Different ontology, same semantics |
| `close_mappings` | `skos:closeMatch` | Very similar meaning | Similar but not interchangeable |
| `related_mappings` | `skos:relatedMatch` | Semantically related | Associated concept, neither equivalent nor hierarchical |
| `narrow_mappings` | `skos:narrowMatch` | External term is more specific | Our element is broader than the external term |
| `broad_mappings` | `skos:broadMatch` | External term is more general | Our element is narrower than the external term |
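
To make the property-to-predicate correspondence concrete, here is a minimal Python sketch that turns the mapping lists of one schema element into SKOS statements. `SKOS_PREDICATES` and `mapping_triples` are hypothetical names, not part of LinkML, and using the element's primary URI as the subject is an assumption of this sketch.

```python
# Hypothetical helper: the pairs below are copied from the table above.
SKOS_PREDICATES = {
    "exact_mappings": "skos:exactMatch",
    "close_mappings": "skos:closeMatch",
    "related_mappings": "skos:relatedMatch",
    "narrow_mappings": "skos:narrowMatch",
    "broad_mappings": "skos:broadMatch",
}


def mapping_triples(subject: str, element: dict) -> list[tuple[str, str, str]]:
    """Return (subject, SKOS predicate, external term) for each declared mapping."""
    triples = []
    for prop, predicate in SKOS_PREDICATES.items():
        for target in element.get(prop, []):
            triples.append((subject, predicate, target))
    return triples


# Element already parsed from YAML into a plain dict.
data_aggregator = {
    "class_uri": "ore:Aggregation",
    "exact_mappings": ["edm:EuropeanaAggregation"],
    "close_mappings": ["dcat:Catalog"],
}

for triple in mapping_triples(data_aggregator["class_uri"], data_aggregator):
    print(" ".join(triple))
```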
---
## 3. Mapping Workflow: Ontology → LinkML
### Step 1: Identify External Ontology Class/Predicate
Search the base ontology files in `data/ontology/` (paths relative to the repository root):
```bash
# Find aggregation-related classes
rg -i "aggregation|aggregate" data/ontology/*.ttl data/ontology/*.rdf data/ontology/*.owl
# Check specific ontology
rg "rdfs:Class|owl:Class" data/ontology/ore.rdf | grep -i "aggregation"
```
### Step 2: Determine Mapping Strength
| Scenario | Mapping Property |
|----------|------------------|
| **This IS that ontology class** (identity) | `class_uri` |
| **Equivalent in another vocabulary** | `exact_mappings` |
| **Similar concept, different scope** | `close_mappings` |
| **Related but different granularity** | `narrow_mappings` / `broad_mappings` |
| **Conceptually related** | `related_mappings` |
### Step 3: Document Mapping in LinkML Schema
#### For Classes
```yaml
classes:
  DataAggregator:
    class_uri: ore:Aggregation  # Primary identity - THIS IS an ORE Aggregation
    description: |
      A platform that harvests and STORES copies of metadata/content, causing data duplication.
      ore:Aggregation - "A set of related resources grouped together."
      Mapped to ORE because aggregators create aggregations of harvested metadata.
    exact_mappings:
      - edm:EuropeanaAggregation  # Europeana's specialization
    close_mappings:
      - dcat:Catalog  # Similar (collects datasets) but broader scope
    narrow_mappings:
      - edm:ProvidedCHO  # More specific (single cultural object)
```
#### For Slots
```yaml
slots:
  aggregates_from:
    slot_uri: ore:aggregates  # Primary predicate
    description: |
      Institutions whose data is aggregated (harvested and stored) by this platform.
      ore:aggregates - "Aggregations assert ore:aggregates relationships."
    exact_mappings:
      - edm:aggregatedCHO  # Europeana equivalent
    range: HeritageCustodian
    multivalued: true
```
---
## 4. Aggregation vs. Linking: A Mapping Example
This project requires **semantic precision** in distinguishing:
| Concept | Primary Mapping | Semantic Pattern |
|---------|-----------------|------------------|
| **Data Aggregation** | `ore:Aggregation` | Data is COPIED to aggregator's server |
| **Linking/Federation** | `dcat:DataService` | Data REMAINS at source; only links provided |
### Aggregation Pattern (Data Duplication)
```yaml
classes:
  DataAggregator:
    class_uri: ore:Aggregation
    description: |
      Harvests and stores copies of metadata from partner institutions.
      Key semantic: Data DUPLICATION occurs - the aggregator maintains its own copy.
      Examples: Europeana, DPLA, Archives Portal Europe
    exact_mappings:
      - edm:EuropeanaAggregation
    annotations:
      data_storage_pattern: AGGREGATION
      causes_data_duplication: true
```
### Linking Pattern (Single Source of Truth)
```yaml
classes:
  FederatedDiscoveryPortal:
    class_uri: dcat:DataService
    description: |
      Provides unified search across multiple institutions but LINKS to original sources.
      Key semantic: NO data duplication - users are redirected to source institutions.
      Data remains at partner institutions' platforms (single source of truth).
    close_mappings:
      - schema:SearchAction  # The search functionality
    related_mappings:
      - ore:Aggregation  # Related but crucially different
    annotations:
      data_storage_pattern: LINKING
      causes_data_duplication: false
```
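
The two annotations in these patterns are meant to agree with each other: `AGGREGATION` implies duplication and `LINKING` implies none. A minimal sketch of a consistency check follows. It assumes the class definition has already been parsed from YAML into a dict; `EXPECTED_DUPLICATION` and `storage_pattern_issue` are hypothetical names introduced only for illustration.

```python
from typing import Optional

# Expected pairing, taken from the two patterns above.
EXPECTED_DUPLICATION = {"AGGREGATION": True, "LINKING": False}


def storage_pattern_issue(class_name: str, class_def: dict) -> Optional[str]:
    """Return a message if the two annotations disagree, else None."""
    annotations = class_def.get("annotations") or {}
    pattern = annotations.get("data_storage_pattern")
    if pattern not in EXPECTED_DUPLICATION:
        return f"{class_name}: unknown or missing data_storage_pattern {pattern!r}"
    if annotations.get("causes_data_duplication") is not EXPECTED_DUPLICATION[pattern]:
        return f"{class_name}: causes_data_duplication contradicts {pattern}"
    return None


portal = {
    "class_uri": "dcat:DataService",
    "annotations": {"data_storage_pattern": "LINKING", "causes_data_duplication": True},
}
# Reports a contradiction, because a LINKING portal must not duplicate data.
print(storage_pattern_issue("FederatedDiscoveryPortal", portal))
```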
### Linking Properties from EDM
Use `edm:isShownAt` and `edm:isShownBy` to express links to source:
```yaml
slots:
  is_shown_at:
    slot_uri: edm:isShownAt
    description: |
      Unambiguous URL to the digital object on the provider's web site
      in its full information context.
      edm:isShownAt - "The URL of a web view of the object in full context."
      This property LINKS to the source institution - no data duplication.
    range: uri
  is_shown_by:
    slot_uri: edm:isShownBy
    description: |
      Direct URL to the object in best available resolution on provider's site.
      edm:isShownBy - "The URL of the object itself (not the context page)."
    range: uri
```
---
## 5. Complete Mapping Documentation Template
When creating or updating a class with ontology mappings:
```yaml
classes:
  MyNewClass:
    # === PRIMARY IDENTITY ===
    class_uri: {prefix}:{ClassName}  # The ontology class this IS
    # === DESCRIPTION WITH ONTOLOGY REFERENCE ===
    description: |
      {Human-readable description of what this class represents}
      {Ontology}: {class} - "{Definition from ontology documentation}"
      Mapping rationale:
      - Chosen because: {why this ontology class fits}
      - Not using X because: {why alternatives were rejected}
    # === SKOS-BASED MAPPINGS ===
    exact_mappings:
      - {prefix}:{EquivalentClass}  # Same meaning, different vocabulary
    close_mappings:
      - {prefix}:{SimilarClass}  # Very similar but not identical
    narrow_mappings:
      - {prefix}:{MoreSpecificClass}  # External is narrower (more specific) than ours
    broad_mappings:
      - {prefix}:{MoreGeneralClass}  # External is broader (more general) than ours
    related_mappings:
      - {prefix}:{RelatedClass}  # Conceptually related
    # === OPTIONAL ANNOTATIONS ===
    annotations:
      ontology_source: "{Full name of source ontology}"
      ontology_version: "{Version if applicable}"
      mapping_confidence: "high|medium|low"
      mapping_notes: "{Additional context}"
```
---
## 6. Validation Checklist
Before committing ontology mappings:
- [ ] `class_uri` / `slot_uri` points to a real URI in `data/ontology/` files
- [ ] Description includes ontology definition (quoted from source)
- [ ] Mapping rationale documented for non-obvious choices
- [ ] `exact_mappings` used ONLY for truly equivalent terms
- [ ] Each `close_mappings` entry is documented with an explanation of how it differs from our term
- [ ] All prefixes declared in schema's `prefixes:` block
- [ ] Prefixes resolve to valid ontology namespaces
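
One item on this checklist, that all prefixes are declared, lends itself to a mechanical check. The sketch below assumes the schema is already loaded into a plain dict and that mapping values are CURIEs. `undeclared_prefixes` and `curies_in` are hypothetical helpers, and the sketch does not descend into `slot_usage` or inline attributes.

```python
URI_KEYS = ("class_uri", "slot_uri", "enum_uri")
MAPPING_KEYS = (
    "exact_mappings",
    "close_mappings",
    "related_mappings",
    "narrow_mappings",
    "broad_mappings",
)


def curies_in(element: dict) -> list[str]:
    """Collect every URI or mapping value declared on one schema element."""
    found = [element[key] for key in URI_KEYS if key in element]
    for key in MAPPING_KEYS:
        found.extend(element.get(key) or [])
    return found


def undeclared_prefixes(schema: dict) -> set[str]:
    """Return prefixes used in URIs or mappings but missing from `prefixes:`."""
    declared = set(schema.get("prefixes") or {})
    used = set()
    for section in ("classes", "slots", "enums"):
        for element in (schema.get(section) or {}).values():
            for curie in curies_in(element or {}):
                if curie.startswith(("http://", "https://")):
                    continue  # full URI, nothing to resolve
                prefix, sep, _ = curie.partition(":")
                if sep:
                    used.add(prefix)
    return used - declared


schema = {
    "prefixes": {"ore": "http://www.openarchives.org/ore/terms/"},
    "classes": {
        "DataAggregator": {
            "class_uri": "ore:Aggregation",
            "exact_mappings": ["edm:EuropeanaAggregation"],
            "close_mappings": ["dcat:Catalog"],
        }
    },
}
# edm and dcat are used above but not declared.
print(sorted(undeclared_prefixes(schema)))
```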
---
## 7. Common Ontology Prefixes for Mappings
| Prefix | Namespace | Ontology | Use For |
|--------|-----------|----------|---------|
| `ore:` | `http://www.openarchives.org/ore/terms/` | OAI-ORE | Aggregation patterns |
| `edm:` | `http://www.europeana.eu/schemas/edm/` | Europeana Data Model | Cultural heritage aggregation |
| `dcat:` | `http://www.w3.org/ns/dcat#` | DCAT | Data catalogs, services |
| `rico:` | `https://www.ica.org/standards/RiC/ontology#` | Records in Contexts | Archival description |
| `crm:` | `http://www.cidoc-crm.org/cidoc-crm/` | CIDOC-CRM | Cultural heritage events |
| `schema:` | `http://schema.org/` | Schema.org | Web semantics |
| `skos:` | `http://www.w3.org/2004/02/skos/core#` | SKOS | Concepts, labels |
| `dcterms:` | `http://purl.org/dc/terms/` | Dublin Core | Metadata properties |
| `prov:` | `http://www.w3.org/ns/prov#` | PROV-O | Provenance |
| `org:` | `http://www.w3.org/ns/org#` | W3C Organization | Organizations |
| `foaf:` | `http://xmlns.com/foaf/0.1/` | FOAF | People, agents |
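
To show how these prefixes resolve, here is a minimal Python sketch of CURIE expansion using the namespaces from the table. `PREFIXES` and `expand_curie` are hypothetical names for illustration. In practice the prefix map would come from the schema's `prefixes:` block rather than a hard-coded dict.

```python
# Namespaces copied from the table above.
PREFIXES = {
    "ore": "http://www.openarchives.org/ore/terms/",
    "edm": "http://www.europeana.eu/schemas/edm/",
    "dcat": "http://www.w3.org/ns/dcat#",
    "rico": "https://www.ica.org/standards/RiC/ontology#",
    "crm": "http://www.cidoc-crm.org/cidoc-crm/",
    "schema": "http://schema.org/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "dcterms": "http://purl.org/dc/terms/",
    "prov": "http://www.w3.org/ns/prov#",
    "org": "http://www.w3.org/ns/org#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand_curie(curie: str, prefixes: dict[str, str] = PREFIXES) -> str:
    """Expand `prefix:local` to a full URI, failing loudly on unknown prefixes."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"undeclared prefix in {curie!r}")
    return prefixes[prefix] + local


print(expand_curie("ore:Aggregation"))
print(expand_curie("dcat:DataService"))
```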
---
## See Also
- [LinkML Mappings Documentation](https://linkml.io/linkml-model/latest/docs/mappings/)
- [LinkML URIs and Mappings Guide](https://linkml.io/linkml/schemas/uris-and-mappings.html)
- [LinkML class_uri Reference](https://linkml.io/linkml-model/latest/docs/class_uri/)
- [LinkML slot_uri Reference](https://linkml.io/linkml-model/latest/docs/slot_uri/)
- Rule 1: Ontology Files Are Your Primary Reference
- Rule 38: Slot Centralization and Semantic URI Requirements
- Rule 42: No Ontology Prefixes in Slot Names
---
**Version**: 1.0.0
**Created**: 2026-01-12
**Author**: OpenCODE

@@ -47,7 +47,7 @@ This is NOT a simple data extraction project. This is an **ontology engineering
---
This section summarizes 49 critical rules. Each rule has complete documentation in `.opencode/` files.
This section summarizes 50 critical rules. Each rule has complete documentation in `.opencode/` files.
### Rule 0: LinkML Schemas Are the Single Source of Truth
@@ -1426,6 +1426,65 @@ slot_usage:
---
### Rule 50: Ontology-to-LinkML Mapping Convention
🚨 **CRITICAL**: When mapping base ontology classes and predicates to LinkML schema elements, use LinkML's dedicated mapping properties as documented at https://linkml.io/linkml-model/latest/docs/mappings/
**What "LinkML mapping" means**:
- Connecting LinkML schema elements (classes, slots, enums) to external ontology URIs
- Using LinkML's built-in mapping properties (`class_uri`, `slot_uri`, `*_mappings`)
- Following SKOS-based vocabulary alignment standards
**Primary Identity Properties**:
| Property | Applies To | Purpose |
|----------|-----------|---------|
| `class_uri` | Classes | Primary RDF class URI |
| `slot_uri` | Slots | Primary RDF predicate URI |
| `enum_uri` | Enums | Primary RDF URI for the enum |
**SKOS-Based Mapping Properties**:
| Property | SKOS Predicate | Use When |
|----------|---------------|----------|
| `exact_mappings` | `skos:exactMatch` | Different ontology, same semantics |
| `close_mappings` | `skos:closeMatch` | Similar but not identical |
| `related_mappings` | `skos:relatedMatch` | Associated concept, neither equivalent nor hierarchical |
| `narrow_mappings` | `skos:narrowMatch` | External term is narrower (more specific) |
| `broad_mappings` | `skos:broadMatch` | External term is broader (more general) |
**Example - Aggregation vs. Linking Distinction**:
```yaml
classes:
  # Aggregation (data duplication)
  DataAggregator:
    class_uri: ore:Aggregation  # Primary identity
    exact_mappings:
      - edm:EuropeanaAggregation
    annotations:
      data_storage_pattern: AGGREGATION
  # Linking (single source of truth)
  FederatedDiscoveryPortal:
    class_uri: dcat:DataService  # Links, doesn't store
    close_mappings:
      - schema:SearchAction
    annotations:
      data_storage_pattern: LINKING
```
**Validation Checklist**:
- [ ] `class_uri` / `slot_uri` points to a real URI in `data/ontology/` files
- [ ] Description includes ontology definition
- [ ] `exact_mappings` used ONLY for truly equivalent terms
- [ ] All prefixes declared in schema's `prefixes:` block
**See**: `.opencode/rules/ontology-to-linkml-mapping-convention.md` for complete documentation
---
## Appendix: Full Rule Content (No .opencode Equivalent)
The following rules have no separate .opencode file and are preserved in full:

@@ -1,5 +1,5 @@
{
"generated": "2026-01-12T17:27:48.486Z",
"generated": "2026-01-12T17:31:31.920Z",
"schemaRoot": "/schemas/20251121/linkml",
"totalFiles": 2886,
"categoryCounts": {