glam/.opencode/rules/ontology-to-linkml-mapping-convention.md

Rule 50: Ontology-to-LinkML Mapping Convention

🚨 CRITICAL: When mapping base ontology classes and predicates to LinkML schema elements, use LinkML's dedicated mapping properties as documented at https://linkml.io/linkml-model/latest/docs/mappings/


1. What "LinkML Mapping" Means in This Project

"LinkML mapping" refers specifically to:

  1. Connecting LinkML schema elements (classes, slots, enums) to external ontology URIs
  2. Using LinkML's built-in mapping properties (class_uri, slot_uri, *_mappings)
  3. Following SKOS-based vocabulary alignment standards

LinkML mapping does NOT mean:

  • Creating arbitrary crosswalks in spreadsheets
  • Writing prose descriptions of how concepts relate
  • Inventing custom @context JSON-LD mappings outside the schema

2. LinkML Mapping Property Reference

Primary Identity Properties

| Property | Applies To | Purpose | Example |
| --- | --- | --- | --- |
| class_uri | Classes | Primary RDF class URI | class_uri: ore:Aggregation |
| slot_uri | Slots | Primary RDF predicate URI | slot_uri: rico:hasOrHadHolder |
| enum_uri | Enums | Enum namespace URI | enum_uri: hc:PlatformTypeEnum |
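
The class and slot properties are illustrated later in this rule, but enum_uri is not, so the following is a minimal sketch of how it might be used. The permissible values and their meaning terms are hypothetical, chosen only to echo the aggregation/linking distinction in Section 4. The hc: prefix is taken from the table above; its namespace is not specified in this rule.

```yaml
enums:
  PlatformTypeEnum:
    enum_uri: hc:PlatformTypeEnum       # Enum URI, as in the table above
    description: Types of heritage data platform (illustrative values only).
    permissible_values:
      AGGREGATOR:                       # hypothetical value
        description: Platform that stores harvested copies of metadata.
        meaning: ore:Aggregation        # ontology term for this single value
      FEDERATED_PORTAL:                 # hypothetical value
        description: Platform that links to sources without copying data.
        meaning: dcat:DataService
```

Here enum_uri identifies the enum as a whole, while meaning attaches an ontology term to each individual value.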

SKOS-Based Mapping Properties

These properties express semantic relationships to external ontology terms:

| Property | SKOS Predicate | Meaning | Use When |
| --- | --- | --- | --- |
| exact_mappings | skos:exactMatch | IDENTICAL meaning | Different ontology, SAME semantics (interchangeable) |
| close_mappings | skos:closeMatch | Very similar meaning | Similar but NOT interchangeable |
| related_mappings | skos:relatedMatch | Semantically related | Associated concepts that are neither equivalent nor hierarchical |
| narrow_mappings | skos:narrowMatch | External term is more specific | This element is broader than the external term |
| broad_mappings | skos:broadMatch | External term is more general | This element is narrower than the external term |
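
The direction of narrow_mappings and broad_mappings is easy to invert. In LinkML, narrow_mappings lists external terms whose meaning is narrower than the schema element, and broad_mappings lists external terms whose meaning is broader. A minimal sketch, using a hypothetical class that is not part of this project's schema:

```yaml
classes:
  HeritageOrganizationExample:          # hypothetical class, for illustration only
    description: An organization that holds heritage collections.
    broad_mappings:
      - schema:Organization             # external term is MORE GENERAL than this class
    narrow_mappings:
      - schema:ArchiveOrganization      # external term is MORE SPECIFIC than this class
```

A quick sanity check is to read each entry as "this class has a broader (or narrower) match in the listed term".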

⚠️ CRITICAL: exact_mappings Requires PRECISE Semantic Equivalence

exact_mappings means the terms are INTERCHANGEABLE - you could substitute one for the other in any context without changing meaning.

Requirements for exact_mappings:

  1. Same definition: Both terms must have equivalent definitions
  2. Same scope: Both terms cover the same set of instances
  3. Same constraints: Same domain/range restrictions apply
  4. Bidirectional: If A exactMatch B, then B exactMatch A

DO NOT use exact_mappings when:

  • One term is a subset of the other (use narrow_mappings/broad_mappings)
  • Terms are similar but have different scopes (use close_mappings)
  • Terms are related but not equivalent (use related_mappings)
  • You're uncertain about equivalence (default to close_mappings)

Example - WRONG:

# PersonProfile is NOT equivalent to foaf:Person
# PersonProfile is a structured document ABOUT a person, not the person themselves
exact_mappings:
  - foaf:Person  # ❌ WRONG - different semantics!

Example - CORRECT:

# foaf:Person and schema:Person ARE equivalent
# Both define "a person" with the same scope
exact_mappings:
  - schema:Person  # ✅ CORRECT - truly equivalent
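
One way the WRONG example could be repaired is sketched below. The choice of foaf:PersonalProfileDocument as a close mapping is an assumption made for illustration, not a mapping this project has confirmed; the point is that the person is related to the profile rather than equivalent to it.

```yaml
classes:
  PersonProfile:
    description: |
      A structured document ABOUT a person, not the person themselves.
    close_mappings:
      - foaf:PersonalProfileDocument    # assumption: also a document about a person; scope may differ
    related_mappings:
      - foaf:Person                     # the subject of the profile, not the profile itself
```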

3. Mapping Workflow: Ontology → LinkML

Step 1: Identify External Ontology Class/Predicate

Search the base ontology files in data/ontology/ (paths are relative to the repository root):

# Find aggregation-related classes
rg -i "aggregation|aggregate" data/ontology/*.ttl data/ontology/*.rdf data/ontology/*.owl

# Check specific ontology
rg "rdfs:Class|owl:Class" data/ontology/ore.rdf | grep -i "aggregation"

Step 2: Determine Mapping Strength

| Scenario | Mapping Property |
| --- | --- |
| This IS that ontology class (identity) | class_uri |
| Equivalent in another vocabulary | exact_mappings |
| Similar concept, different scope | close_mappings |
| Related but different granularity | narrow_mappings / broad_mappings |
| Conceptually related | related_mappings |

Step 3: Document Mapping in LinkML Schema

For Classes

classes:
  DataAggregator:
    class_uri: ore:Aggregation          # Primary identity - THIS IS an ORE Aggregation
    description: |
      A platform that harvests and STORES copies of metadata/content, causing data duplication.
      
      ore:Aggregation - "A set of related resources grouped together."
      
      Mapped to ORE because aggregators create aggregations of harvested metadata.      
    close_mappings:
      - dcat:Catalog                    # Similar (collects datasets) but broader scope
    narrow_mappings:
      - edm:EuropeanaAggregation        # Europeana's specialization, so narrower rather than exact
      - edm:ProvidedCHO                 # More specific (single cultural object)

For Slots

slots:
  aggregates_from:
    slot_uri: ore:aggregates            # Primary predicate
    description: |
      Institutions whose data is aggregated (harvested and stored) by this platform.
      
      ore:aggregates - "Aggregations assert ore:aggregates relationships."      
    exact_mappings:
      - edm:aggregatedCHO               # Europeana equivalent
    range: HeritageCustodian
    multivalued: true

4. Aggregation vs. Linking: A Mapping Example

This project requires semantic precision in distinguishing:

| Concept | Primary Mapping | Semantic Pattern |
| --- | --- | --- |
| Data Aggregation | ore:Aggregation | Data is COPIED to aggregator's server |
| Linking/Federation | dcat:DataService | Data REMAINS at source; only links provided |

Aggregation Pattern (Data Duplication)

classes:
  DataAggregator:
    class_uri: ore:Aggregation
    description: |
      Harvests and stores copies of metadata from partner institutions.
      
      Key semantic: Data DUPLICATION occurs - the aggregator maintains its own copy.
      
      Examples: Europeana, DPLA, Archives Portal Europe      
    narrow_mappings:
      - edm:EuropeanaAggregation        # Specialization of ore:Aggregation, not an equivalent
    annotations:
      data_storage_pattern: AGGREGATION
      causes_data_duplication: true

Linking Pattern (Single Source of Truth)

classes:
  FederatedDiscoveryPortal:
    class_uri: dcat:DataService
    description: |
      Provides unified search across multiple institutions but LINKS to original sources.
      
      Key semantic: NO data duplication - users are redirected to source institutions.
      
      Data remains at partner institutions' platforms (single source of truth).      
    close_mappings:
      - schema:SearchAction              # The search functionality
    related_mappings:
      - ore:Aggregation                  # Related but crucially different
    annotations:
      data_storage_pattern: LINKING
      causes_data_duplication: false

Linking Properties from EDM

Use edm:isShownAt and edm:isShownBy to express links back to the source institution:

slots:
  is_shown_at:
    slot_uri: edm:isShownAt
    description: |
      Unambiguous URL to the digital object on the provider's web site 
      in its full information context.
      
      edm:isShownAt - "The URL of a web view of the object in full context."
      
      This property LINKS to the source institution - no data duplication.      
    range: uri
    
  is_shown_by:
    slot_uri: edm:isShownBy
    description: |
      Direct URL to the object in best available resolution on provider's site.
      
      edm:isShownBy - "The URL of the object itself (not the context page)."      
    range: uri

5. Complete Mapping Documentation Template

When creating or updating a class with ontology mappings:

classes:
  MyNewClass:
    # === PRIMARY IDENTITY ===
    class_uri: {prefix}:{ClassName}     # The ontology class this IS
    
    # === DESCRIPTION WITH ONTOLOGY REFERENCE ===
    description: |
      {Human-readable description of what this class represents}
      
      {Ontology}: {class} - "{Definition from ontology documentation}"
      
      Mapping rationale:
      - Chosen because: {why this ontology class fits}
      - Not using X because: {why alternatives were rejected}      
    
    # === SKOS-BASED MAPPINGS ===
    exact_mappings:
      - {prefix}:{EquivalentClass}      # Same meaning, different vocabulary
    close_mappings:
      - {prefix}:{SimilarClass}         # Very similar but not identical
    narrow_mappings:
      - {prefix}:{MoreSpecificClass}    # External is narrower than ours
    broad_mappings:
      - {prefix}:{MoreGeneralClass}     # External is broader than ours
    related_mappings:
      - {prefix}:{RelatedClass}         # Conceptually related
    
    # === OPTIONAL ANNOTATIONS ===
    annotations:
      ontology_source: "{Full name of source ontology}"
      ontology_version: "{Version if applicable}"
      mapping_confidence: "high|medium|low"
      mapping_notes: "{Additional context}"

6. Validation Checklist

Before committing ontology mappings:

  • class_uri / slot_uri points to a real URI in data/ontology/ files
  • Description includes ontology definition (quoted from source)
  • Mapping rationale documented for non-obvious choices
  • exact_mappings used ONLY for truly equivalent terms
  • close_mappings documented with difference explanation
  • All prefixes declared in schema's prefixes: block
  • Prefixes resolve to valid ontology namespaces

7. Common Ontology Prefixes for Mappings

| Prefix | Namespace | Ontology | Use For |
| --- | --- | --- | --- |
| ore: | http://www.openarchives.org/ore/terms/ | OAI-ORE | Aggregation patterns |
| edm: | http://www.europeana.eu/schemas/edm/ | Europeana Data Model | Cultural heritage aggregation |
| dcat: | http://www.w3.org/ns/dcat# | DCAT | Data catalogs, services |
| rico: | https://www.ica.org/standards/RiC/ontology# | Records in Contexts | Archival description |
| crm: | http://www.cidoc-crm.org/cidoc-crm/ | CIDOC-CRM | Cultural heritage events |
| schema: | http://schema.org/ | Schema.org | Web semantics |
| skos: | http://www.w3.org/2004/02/skos/core# | SKOS | Concepts, labels |
| dcterms: | http://purl.org/dc/terms/ | Dublin Core | Metadata properties |
| prov: | http://www.w3.org/ns/prov# | PROV-O | Provenance |
| org: | http://www.w3.org/ns/org# | W3C Organization | Organizations |
| foaf: | http://xmlns.com/foaf/0.1/ | FOAF | People, agents |
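
For reference, the prefixes listed above could be declared in a schema's prefixes: block roughly as follows. This is a sketch that copies the namespaces from the table; a real schema would declare only the prefixes it uses. The hc: prefix that appears in the enum_uri example is omitted because this rule does not state its namespace.

```yaml
prefixes:
  ore: http://www.openarchives.org/ore/terms/
  edm: http://www.europeana.eu/schemas/edm/
  dcat: http://www.w3.org/ns/dcat#
  rico: https://www.ica.org/standards/RiC/ontology#
  crm: http://www.cidoc-crm.org/cidoc-crm/
  schema: http://schema.org/
  skos: http://www.w3.org/2004/02/skos/core#
  dcterms: http://purl.org/dc/terms/
  prov: http://www.w3.org/ns/prov#
  org: http://www.w3.org/ns/org#
  foaf: http://xmlns.com/foaf/0.1/
```

Every prefix used in a class_uri, slot_uri, or *_mappings entry needs a matching key here, which is what the last two items of the validation checklist verify.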

Version: 1.0.0 Created: 2026-01-12 Author: OpenCODE