- Introduced custodian_hub_v3.mmd, custodian_hub_v4_final.mmd, and custodian_hub_v5_FINAL.mmd for Mermaid representation.
- Created custodian_hub_FINAL.puml and custodian_hub_v3.puml for PlantUML representation.
- Defined entities such as CustodianReconstruction, Identifier, TimeSpan, Agent, CustodianName, CustodianObservation, ReconstructionActivity, Appellation, ConfidenceMeasure, Custodian, LanguageCode, and SourceDocument.
- Established relationships and associations between entities, including temporal extents, observations, and reconstruction activities.
- Incorporated enumerations for various types, statuses, and classifications relevant to custodians and their activities.
Ontology Mapping Rules - Quick Reference
Created: 2025-11-20
Purpose: Summary of critical ontology engineering rules for heritage custodian project
Key Changes Made
1. Updated AGENTS.md
Added PROJECT CORE MISSION section at top emphasizing:
- This is an ontology engineering project, not simple data extraction
- Multi-aspect temporal modeling is required
- Multiple base ontologies must be integrated
- Wikidata entities are NOT ontology classes
2. Created .opencode/agent/ontology-mapping-rules.md
Comprehensive 30-page guide covering:
- Ontology consultation workflows
- Wikidata entity mapping procedures
- Multi-aspect modeling requirements
- Temporal independence documentation
- Property research workflows
- Decision trees for ontology selection
- Quality assurance checklists
Core Principles
Principle 1: Ontology Files Are Source of Truth
ALWAYS read base ontologies before designing:
# Example: Research CIDOC-CRM for heritage sites
rg "E27_Site|E53_Place" /Users/kempersc/apps/glam/data/ontology/CIDOC_CRM_v7.1.3.rdf
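Where ripgrep is not available, the same lookup can be sketched in Python. This is a minimal illustration, not project tooling: the function name `find_terms` is hypothetical, and the pattern is treated as a regular expression just as `rg` treats it.

```python
import re
from typing import List, Tuple


def find_terms(ontology_path: str, pattern: str) -> List[Tuple[int, str]]:
    """Return (line number, line text) for every line matching the pattern.

    Hypothetical helper: mirrors `rg PATTERN FILE` for a single ontology file.
    """
    regex = re.compile(pattern)
    hits = []
    with open(ontology_path, encoding="utf-8") as handle:
        for lineno, line in enumerate(handle, start=1):
            if regex.search(line):
                hits.append((lineno, line.rstrip("\n")))
    return hits
```

Calling `find_terms(path, r"E27_Site|E53_Place")` on the CIDOC-CRM file would list the lines that mention either class, which is the starting point for reading their definitions.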
Principle 2: Wikidata ≠ Ontology
NEVER use Wikidata Q-numbers as class_uri:
❌ WRONG: class_uri: wd:Q1802963
✅ RIGHT: class_uri: crm:E27_Site # After mapping Q1802963 to ontology
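This rule is easy to check mechanically. The following is a minimal sketch, assuming `class_uri` values are plain strings in either prefixed (`wd:`) or full-IRI form; the function name `is_valid_class_uri` is hypothetical and not part of the project.

```python
# Prefixes that identify a Wikidata entity rather than an ontology class.
WIKIDATA_PREFIXES = (
    "wd:",
    "http://www.wikidata.org/entity/",
    "https://www.wikidata.org/entity/",
)


def is_valid_class_uri(class_uri: str) -> bool:
    """Return False when class_uri points at a Wikidata entity (hypothetical check)."""
    return not class_uri.strip().startswith(WIKIDATA_PREFIXES)
```

A check like this only catches direct use of Q-numbers; it does not confirm that the chosen ontology class is the right one, which still requires reading the ontology files.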
Principle 3: Multi-Aspect Modeling
EVERY heritage entity has multiple aspects:
- Place (construction → present)
- Custodian (founding → present)
- Legal form (registration → present)
- Collections (accession → present)
- People (employment periods)
- Events (custody transfers, mergers)
Principle 4: Temporal Independence
Each aspect has its OWN timeline:
# Building exists 1880-present (145 years as of 2025)
place_aspect:
  temporal_extent:
    start_date: "1880-01-01"
    end_date: null
# Museum organization founded 1994-present (31 years as of 2025)
custodian_aspect:
  temporal_extent:
    start_date: "1994-05-12"
    end_date: null
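To make the independence concrete, each aspect's duration can be computed from its own extent alone. The sketch below is illustrative only: `extent_years` is a hypothetical helper, a `null` end date is read as "still ongoing", and the reference date is an assumption (here the document's creation date).

```python
from datetime import date
from typing import Optional


def extent_years(start_date: str, end_date: Optional[str], reference: date) -> int:
    """Whole years covered by one aspect's temporal extent (hypothetical helper)."""
    start = date.fromisoformat(start_date)
    # An open-ended extent (end_date is None) runs up to the reference date.
    end = date.fromisoformat(end_date) if end_date else reference
    years = end.year - start.year
    # Subtract one if the anniversary has not yet been reached in the end year.
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years


aspects = {
    "place_aspect": {"start_date": "1880-01-01", "end_date": None},
    "custodian_aspect": {"start_date": "1994-05-12", "end_date": None},
}

reference = date(2025, 11, 20)  # assumption: the "Created" date of this summary
durations = {
    name: extent_years(extent["start_date"], extent["end_date"], reference)
    for name, extent in aspects.items()
}
```

Because no aspect reads another aspect's dates, closing the custodian's extent (for example after a merger) leaves the place's timeline untouched.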
Available Ontologies
| Ontology | File | Use For |
|---|---|---|
| CPOV | core-public-organisation-ap.ttl | EU public sector heritage |
| TOOI | tooiont.ttl | Dutch government organizations |
| Schema.org | schemaorg.owl | Web semantics, private sector |
| CIDOC-CRM | CIDOC_CRM_v7.1.3.rdf | Cultural heritage domain |
| RiC-O | RiC-O_1-1.rdf | Archival description |
| BIBFRAME | bibframe_vocabulary.rdf | Library collections |
| PiCo | pico.ttl | Person observations, staff roles |
Required Workflow
1. Read hyponyms_curated.yaml (Wikidata entities)
↓
2. Analyze hypernym + semantic properties
↓
3. Search base ontologies for matching classes
↓
4. Map Wikidata entity → Ontology class(es)
↓
5. Extract relevant properties from ontologies
↓
6. Document rationale and temporal model
↓
7. Create LinkML schema with class_uri
↓
8. Human review if complexity ≥ 7/10
Example: Mansion (Q1802963)
❌ Wrong Approach
Mansion:
  class_uri: wd:Q1802963  # Wikidata entity used directly
✅ Correct Approach
Mansion:
  wikidata_source: Q1802963
  # PLACE ASPECT
  place_aspect:
    class_uri: crm:E27_Site  # CIDOC-CRM
    secondary_class_uri: schema:LandmarksOrHistoricalBuildings
    temporal_extent:
      start_date: "1880-01-01"  # Construction
  # CUSTODIAN ASPECT (if operates as museum)
  custodian_aspect:
    class_uri: cpov:PublicOrganisation  # If public
    alt_class_uri: schema:Museum  # If private
    temporal_extent:
      start_date: "1994-05-12"  # Foundation established
  # COLLECTIONS ASPECT
  collections_aspect:
    class_uri: crm:E78_Curated_Holding
    temporal_extent:
      start_date: "1994-01-01"  # Accessions begin
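A mapping shaped like the correct approach can be checked for the basics before review. The sketch below is an assumption-laden illustration: it takes the mapping as an already-parsed Python dict, treats any key ending in `_aspect` as an aspect, and uses a hypothetical function name `validate_mapping`.

```python
from datetime import date
from typing import List


def validate_mapping(mapping: dict) -> List[str]:
    """List structural problems in one entity mapping (hypothetical validator)."""
    problems = []
    aspects = {key: value for key, value in mapping.items() if key.endswith("_aspect")}
    if not aspects:
        problems.append("no aspects defined")
    for name, aspect in aspects.items():
        uri = aspect.get("class_uri", "")
        if not uri:
            problems.append(f"{name}: missing class_uri")
        elif uri.startswith("wd:"):
            problems.append(f"{name}: Wikidata entity used as class_uri")
        # Every aspect needs its own timeline with a parseable start date.
        start = (aspect.get("temporal_extent") or {}).get("start_date")
        try:
            date.fromisoformat(start)
        except (TypeError, ValueError):
            problems.append(f"{name}: missing or invalid start_date")
    return problems


mansion = {
    "wikidata_source": "Q1802963",
    "place_aspect": {
        "class_uri": "crm:E27_Site",
        "temporal_extent": {"start_date": "1880-01-01"},
    },
    "custodian_aspect": {
        "class_uri": "cpov:PublicOrganisation",
        "temporal_extent": {"start_date": "1994-05-12"},
    },
    "collections_aspect": {
        "class_uri": "crm:E78_Curated_Holding",
        "temporal_extent": {"start_date": "1994-01-01"},
    },
}
```

Here `validate_mapping(mansion)` returns an empty list, while the wrong approach (a single top-level `class_uri: wd:Q1802963`) is reported as having no aspects at all.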
Decision Tree: Ontology Selection
Is it Dutch government?
├─ YES → tooi:Overheidsorganisatie + cpov:PublicOrganisation
└─ NO → Is it public sector?
    ├─ YES → cpov:PublicOrganisation
    └─ NO → schema:Organization
        ├─ Museum → schema:Museum
        ├─ Archive → schema:ArchiveOrganization
        ├─ Library → schema:Library
        └─ NGO → schema:NGO
Is it a physical site?
├─ YES → crm:E27_Site + schema:Place
└─ NO → Continue with organizational classes
Does it hold collections?
├─ Archival → rico:RecordSet
├─ Museum → crm:E78_Curated_Holding
└─ Library → bf:Collection
Does it have staff?
└─ YES → pico:PersonObservation + crm:E21_Person
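The decision tree can be read as a small function from facts about an entity to candidate classes per aspect. This is a sketch only: the `EntityFacts` fields and the `select_classes` name are hypothetical, and the output is a starting point for research in the ontology files, not a final mapping.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Private-sector subtypes and collection classes, as listed in the tree.
ORG_SUBTYPES = {
    "museum": "schema:Museum",
    "archive": "schema:ArchiveOrganization",
    "library": "schema:Library",
    "ngo": "schema:NGO",
}
COLLECTION_CLASSES = {
    "archival": "rico:RecordSet",
    "museum": "crm:E78_Curated_Holding",
    "library": "bf:Collection",
}


@dataclass
class EntityFacts:
    """Hypothetical answers to the decision-tree questions."""
    is_dutch_government: bool = False
    is_public_sector: bool = False
    org_type: Optional[str] = None         # "museum", "archive", "library", "ngo"
    is_physical_site: bool = False
    collection_type: Optional[str] = None  # "archival", "museum", "library"
    has_staff: bool = False


def select_classes(facts: EntityFacts) -> Dict[str, List[str]]:
    """Return candidate class URIs per aspect, following the tree top to bottom."""
    selected: Dict[str, List[str]] = {}
    if facts.is_dutch_government:
        selected["custodian"] = ["tooi:Overheidsorganisatie", "cpov:PublicOrganisation"]
    elif facts.is_public_sector:
        selected["custodian"] = ["cpov:PublicOrganisation"]
    else:
        # Fall back to the generic class when no subtype is known.
        selected["custodian"] = [ORG_SUBTYPES.get(facts.org_type, "schema:Organization")]
    if facts.is_physical_site:
        selected["place"] = ["crm:E27_Site", "schema:Place"]
    if facts.collection_type in COLLECTION_CLASSES:
        selected["collections"] = [COLLECTION_CLASSES[facts.collection_type]]
    if facts.has_staff:
        selected["staff"] = ["pico:PersonObservation", "crm:E21_Person"]
    return selected
```

Each question feeds a separate aspect, so an entity can pick up place, custodian, collections, and staff classes at once.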
Quality Checklist
Before submitting ontology design:
- Base ontologies consulted (/data/ontology/ files read)
- Wikidata entities mapped (not used directly as classes)
- Multi-aspect modeling applied
- Temporal independence documented
- Properties sourced from ontologies
- Rationale documented
- Examples provided
- Complexity score assigned (1-10)
- Human review requested if complexity ≥ 7
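The checklist lends itself to a simple pre-submission gate. The sketch below assumes the first seven items are recorded as booleans under hypothetical keys and that the complexity score is an integer from 1 to 10; `review_design` is an illustrative name, not existing project code.

```python
from typing import Dict, List

REVIEW_THRESHOLD = 7  # human review is required at complexity >= 7

# Hypothetical keys, one per checklist item that can be ticked off.
CHECKLIST = [
    "base_ontologies_consulted",
    "wikidata_entities_mapped",
    "multi_aspect_modeling_applied",
    "temporal_independence_documented",
    "properties_sourced_from_ontologies",
    "rationale_documented",
    "examples_provided",
]


def review_design(checks: Dict[str, bool], complexity: int) -> Dict[str, object]:
    """Report unticked checklist items and whether human review is needed."""
    if not 1 <= complexity <= 10:
        raise ValueError("complexity must be between 1 and 10")
    missing: List[str] = [item for item in CHECKLIST if not checks.get(item, False)]
    return {
        "missing": missing,
        "needs_human_review": complexity >= REVIEW_THRESHOLD,
    }
```

A design with every item ticked and a complexity of 8 would come back with nothing missing but still flagged for human review.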
Files Updated
- AGENTS.md - Added PROJECT CORE MISSION section (lines 1-100)
- .opencode/agent/ontology-mapping-rules.md - NEW comprehensive guide
- This file (ONTOLOGY_RULES_SUMMARY.md) - Quick reference
Next Steps
- Continue manual ontology mapping for hyponyms_curated.yaml entries
- Document each mapping with full rationale
- Build aspect-based LinkML schema modules
- Create temporal modeling examples for common patterns
Key Resources
- Full Rules: .opencode/agent/ontology-mapping-rules.md
- Agent Instructions: AGENTS.md
- Ontology Files: data/ontology/
- Wikidata Sources: data/wikidata/GLAMORCUBEPSXHFN/
Remember: This is ontology engineering, not data extraction. Precision matters more than speed.