glam/ONTOLOGY_RULES_SUMMARY.md
kempersc 176a7479f9 Add comprehensive ontology mapping rules and update project mission
- Update AGENTS.md with PROJECT CORE MISSION section emphasizing ontology engineering focus
- Create .opencode/agent/ontology-mapping-rules.md (665 lines) with detailed guidelines:
  * Ontology consultation workflows (Rule 1)
  * Wikidata entity mapping procedures (Rule 2)
  * Multi-aspect modeling requirements (Rule 3)
  * Temporal independence documentation (Rule 4)
  * Property research workflows (Rule 5)
  * Decision trees for ontology selection (Rule 6-7)
  * Quality assurance checklists (Rule 8-9)
  * Agent collaboration protocols (Rule 10)
- Create ONTOLOGY_RULES_SUMMARY.md as quick reference guide

Key principles established:
1. Wikidata Q-numbers are NOT ontology classes (must be mapped)
2. Every heritage entity has multiple aspects with independent temporal lifecycles
3. Base ontologies (CPOV, TOOI, CIDOC-CRM, RiC-O, Schema.org, PiCo) are source of truth
4. Custom properties forbidden when ontology equivalents exist

Example: 'Mansion' (Q1802963) requires modeling as:
- Place aspect (crm:E27_Site, construction→present)
- Custodian aspect (cpov:PublicOrganisation OR schema:Museum, founding→present)
- Legal form aspect (org:FormalOrganization, registration→present)
- Collections aspect (crm:E78_Curated_Holding, accession→present)
- People aspect (picom:PersonObservation, employment periods)
- Temporal events (crm:E10_Transfer_of_Custody for custody changes)

All agents MUST read ontology files before schema design.
2025-11-20 23:09:02 +01:00

Ontology Mapping Rules - Quick Reference

Created: 2025-11-20
Purpose: Summary of critical ontology engineering rules for heritage custodian project


Key Changes Made

1. Updated AGENTS.md

Added PROJECT CORE MISSION section at top emphasizing:

  • This is an ontology engineering project, not simple data extraction
  • Multi-aspect temporal modeling is required
  • Multiple base ontologies must be integrated
  • Wikidata entities are NOT ontology classes

2. Created .opencode/agent/ontology-mapping-rules.md

Comprehensive guide (665 lines) covering:

  • Ontology consultation workflows
  • Wikidata entity mapping procedures
  • Multi-aspect modeling requirements
  • Temporal independence documentation
  • Property research workflows
  • Decision trees for ontology selection
  • Quality assurance checklists

Core Principles

Principle 1: Ontology Files Are Source of Truth

ALWAYS read base ontologies before designing:

# Example: Research CIDOC-CRM for heritage sites (run from the repository root)
rg "E27_Site|E53_Place" data/ontology/CIDOC_CRM_v7.1.3.rdf

Principle 2: Wikidata ≠ Ontology

NEVER use Wikidata Q-numbers as class_uri:

❌ WRONG: class_uri: wd:Q1802963
✅ RIGHT: class_uri: crm:E27_Site  # After mapping Q1802963 to ontology
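
This rule lends itself to an automated check. The following is a minimal sketch, assuming the schema has already been loaded into a plain Python dict; the function name `find_wikidata_class_uris` and the prefix list are illustrative assumptions, not part of the project's tooling.

```python
from typing import Any

# Assumption: Wikidata references appear either as a "wd:" CURIE or as a full entity IRI.
WIKIDATA_PREFIXES = ("wd:", "http://www.wikidata.org/entity/")


def find_wikidata_class_uris(node: Any, path: str = "") -> list[str]:
    """Return the dotted path of every *class_uri key whose value points at Wikidata."""
    hits: list[str] = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            if str(key).endswith("class_uri") and isinstance(value, str):
                if value.startswith(WIKIDATA_PREFIXES):
                    hits.append(child)
            else:
                # Descend into nested aspects such as place_aspect or custodian_aspect.
                hits.extend(find_wikidata_class_uris(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_wikidata_class_uris(item, f"{path}[{index}]"))
    return hits


wrong = {"Mansion": {"class_uri": "wd:Q1802963"}}
right = {
    "Mansion": {
        "wikidata_source": "Q1802963",  # provenance only, not a class
        "place_aspect": {"class_uri": "crm:E27_Site"},
    }
}
print(find_wikidata_class_uris(wrong))  # -> ['Mansion.class_uri']
print(find_wikidata_class_uris(right))  # -> []
```

Because the check matches any key ending in `class_uri`, it also covers keys such as `secondary_class_uri` and `alt_class_uri`, while leaving `wikidata_source` untouched.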

Principle 3: Multi-Aspect Modeling

EVERY heritage entity has multiple aspects:

  • Place (construction → present)
  • Custodian (founding → present)
  • Legal form (registration → present)
  • Collections (accession → present)
  • People (employment periods)
  • Events (custody transfers, mergers)

Principle 4: Temporal Independence

Each aspect has its OWN timeline:

# Building exists 1880-present (145 years as of 2025)
place_aspect:
  temporal_extent:
    start_date: "1880-01-01"
    end_date: null

# Museum organization founded 1994-present (31 years as of 2025)
custodian_aspect:
  temporal_extent:
    start_date: "1994-05-12"
    end_date: null
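
To make the independence concrete, here is a small sketch that models each aspect's timeline separately and asks which aspects were active on a given day. The `TemporalExtent` class and `active_aspects` helper are hypothetical names introduced for illustration; only the dates come from the example above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TemporalExtent:
    start_date: date
    end_date: Optional[date] = None  # None means the aspect is still ongoing

    def covers(self, day: date) -> bool:
        """True if the aspect existed on the given day."""
        return self.start_date <= day and (self.end_date is None or day <= self.end_date)

    def years_elapsed(self, as_of: date) -> int:
        """Whole years from start_date to end_date, or to as_of if ongoing."""
        end = self.end_date or as_of
        years = end.year - self.start_date.year
        if (end.month, end.day) < (self.start_date.month, self.start_date.day):
            years -= 1  # the anniversary has not been reached yet in the final year
        return years


def active_aspects(aspects: dict[str, TemporalExtent], on: date) -> list[str]:
    """Names of the aspects whose own timeline covers the given day."""
    return [name for name, extent in aspects.items() if extent.covers(on)]


aspects = {
    "place": TemporalExtent(date(1880, 1, 1)),
    "custodian": TemporalExtent(date(1994, 5, 12)),
}
print(active_aspects(aspects, date(1950, 1, 1)))  # -> ['place']
print(active_aspects(aspects, date(2000, 1, 1)))  # -> ['place', 'custodian']
```

In 1950 the building exists but the museum organization does not, which is exactly the situation a single shared timeline cannot represent.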

Available Ontologies

| Ontology | File | Use For |
|---|---|---|
| CPOV | core-public-organisation-ap.ttl | EU public sector heritage |
| TOOI | tooiont.ttl | Dutch government organizations |
| Schema.org | schemaorg.owl | Web semantics, private sector |
| CIDOC-CRM | CIDOC_CRM_v7.1.3.rdf | Cultural heritage domain |
| RiC-O | RiC-O_1-1.rdf | Archival description |
| BIBFRAME | bibframe_vocabulary.rdf | Library collections |
| PiCo | pico.ttl | Person observations, staff roles |

Required Workflow

1. Read hyponyms_curated.yaml (Wikidata entities)
       ↓
2. Analyze hypernym + semantic properties
       ↓
3. Search base ontologies for matching classes
       ↓
4. Map Wikidata entity → Ontology class(es)
       ↓
5. Extract relevant properties from ontologies
       ↓
6. Document rationale and temporal model
       ↓
7. Create LinkML schema with class_uri
       ↓
8. Human review if complexity ≥ 7/10

Example: Mansion (Q1802963)

Wrong Approach

Mansion:
  class_uri: wd:Q1802963  # Wikidata entity used directly

Correct Approach

Mansion:
  wikidata_source: Q1802963
  
  # PLACE ASPECT
  place_aspect:
    class_uri: crm:E27_Site  # CIDOC-CRM
    secondary_class_uri: schema:LandmarksOrHistoricalBuildings
    temporal_extent:
      start_date: "1880-01-01"  # Construction
  
  # CUSTODIAN ASPECT (if operates as museum)
  custodian_aspect:
    class_uri: cpov:PublicOrganisation  # If public
    alt_class_uri: schema:Museum  # If private
    temporal_extent:
      start_date: "1994-05-12"  # Foundation established
  
  # COLLECTIONS ASPECT
  collections_aspect:
    class_uri: crm:E78_Curated_Holding
    temporal_extent:
      start_date: "1994-01-01"  # Accessions begin

Decision Tree: Ontology Selection

Is it Dutch government?
  ├─ YES → tooiont:Overheidsorganisatie + cpov:PublicOrganisation
  └─ NO → Is it public sector?
           ├─ YES → cpov:PublicOrganisation
           └─ NO → schema:Organization
                    ├─ Museum → schema:Museum
                    ├─ Archive → schema:ArchiveOrganization
                    ├─ Library → schema:Library
                    └─ NGO → schema:NGO

Is it a physical site?
  ├─ YES → crm:E27_Site + schema:Place
  └─ NO → Continue with organizational classes

Does it hold collections?
  ├─ Archival → rico:RecordSet
  ├─ Museum → crm:E78_Curated_Holding
  └─ Library → bf:Collection

Does it have staff?
  └─ YES → picom:PersonObservation + crm:E21_Person
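
The decision tree above can be sketched as a small function. This is an illustrative reading of the tree, not the project's implementation: `EntityProfile`, `select_classes`, and the aspect keys in the returned dict are assumed names, while the class URIs are taken directly from the tree.

```python
from dataclasses import dataclass
from typing import Optional

# Subclasses of schema:Organization used when the entity is not public sector.
PRIVATE_SECTOR_CLASSES = {
    "museum": "schema:Museum",
    "archive": "schema:ArchiveOrganization",
    "library": "schema:Library",
    "ngo": "schema:NGO",
}
COLLECTION_CLASSES = {
    "archival": "rico:RecordSet",
    "museum": "crm:E78_Curated_Holding",
    "library": "bf:Collection",
}


@dataclass
class EntityProfile:
    """Hypothetical answers to the questions asked by the decision tree."""
    is_dutch_government: bool = False
    is_public_sector: bool = False
    organization_kind: Optional[str] = None  # "museum", "archive", "library", "ngo"
    is_physical_site: bool = False
    collection_kind: Optional[str] = None    # "archival", "museum", "library"
    has_staff: bool = False


def select_classes(profile: EntityProfile) -> dict[str, list[str]]:
    """Walk the four questions in order and collect class URIs per aspect."""
    aspects: dict[str, list[str]] = {}
    if profile.is_dutch_government:
        aspects["custodian"] = ["tooiont:Overheidsorganisatie", "cpov:PublicOrganisation"]
    elif profile.is_public_sector:
        aspects["custodian"] = ["cpov:PublicOrganisation"]
    else:
        # Fall back to the generic class when no more specific kind is known.
        aspects["custodian"] = [
            PRIVATE_SECTOR_CLASSES.get(profile.organization_kind, "schema:Organization")
        ]
    if profile.is_physical_site:
        aspects["place"] = ["crm:E27_Site", "schema:Place"]
    if profile.collection_kind in COLLECTION_CLASSES:
        aspects["collections"] = [COLLECTION_CLASSES[profile.collection_kind]]
    if profile.has_staff:
        aspects["people"] = ["picom:PersonObservation", "crm:E21_Person"]
    return aspects


private_museum = EntityProfile(
    organization_kind="museum",
    is_physical_site=True,
    collection_kind="museum",
    has_staff=True,
)
for aspect, classes in select_classes(private_museum).items():
    print(aspect, classes)
```

The output of such a function is only a starting point for the mapping; the rationale still has to be checked against the ontology files themselves.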

Quality Checklist

Before submitting ontology design:

  • Base ontologies consulted (data/ontology/ files read)
  • Wikidata entities mapped (not used directly as classes)
  • Multi-aspect modeling applied
  • Temporal independence documented
  • Properties sourced from ontologies
  • Rationale documented
  • Examples provided
  • Complexity score assigned (1-10)
  • Human review requested if complexity ≥ 7
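
The checklist can also be expressed as a simple gate before submission. The sketch below is an assumption about how such a gate might look; the item identifiers and the `unmet_items` function are invented for illustration, and only the items themselves and the review threshold of 7 come from the list above.

```python
from typing import Optional

# One identifier per unconditional checklist item (names are illustrative).
CHECKLIST = (
    "base_ontologies_consulted",
    "wikidata_entities_mapped",
    "multi_aspect_modeling_applied",
    "temporal_independence_documented",
    "properties_sourced_from_ontologies",
    "rationale_documented",
    "examples_provided",
)
REVIEW_THRESHOLD = 7  # human review is required at or above this complexity


def unmet_items(done: set[str], complexity: Optional[int], review_requested: bool) -> list[str]:
    """Return the checklist items that still block submission."""
    missing = [item for item in CHECKLIST if item not in done]
    if complexity is None or not 1 <= complexity <= 10:
        missing.append("complexity_score_assigned")
    elif complexity >= REVIEW_THRESHOLD and not review_requested:
        missing.append("human_review_requested")
    return missing


print(unmet_items(set(CHECKLIST), complexity=8, review_requested=False))
# -> ['human_review_requested']
print(unmet_items(set(CHECKLIST), complexity=5, review_requested=False))
# -> []
```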

Files Updated

  1. AGENTS.md - Added PROJECT CORE MISSION section (lines 1-100)
  2. .opencode/agent/ontology-mapping-rules.md - NEW comprehensive guide
  3. This file (ONTOLOGY_RULES_SUMMARY.md) - Quick reference

Next Steps

  1. Continue manual ontology mapping for hyponyms_curated.yaml entries
  2. Document each mapping with full rationale
  3. Build aspect-based LinkML schema modules
  4. Create temporal modeling examples for common patterns

Key Resources

  • Full Rules: .opencode/agent/ontology-mapping-rules.md
  • Agent Instructions: AGENTS.md
  • Ontology Files: data/ontology/
  • Wikidata Sources: data/wikidata/GLAMORCUBEPSXHFN/

Remember: This is ontology engineering, not data extraction. Precision matters more than speed.