# Ontology Mapping Rules - Quick Reference
**Created**: 2025-11-20
**Purpose**: Summary of critical ontology engineering rules for heritage custodian project
---
## Key Changes Made
### 1. Updated AGENTS.md
Added **PROJECT CORE MISSION** section at top emphasizing:
- This is an **ontology engineering project**, not simple data extraction
- Multi-aspect temporal modeling is required
- Multiple base ontologies must be integrated
- Wikidata entities are NOT ontology classes
### 2. Created .opencode/agent/ontology-mapping-rules.md
Comprehensive 30-page guide covering:
- Ontology consultation workflows
- Wikidata entity mapping procedures
- Multi-aspect modeling requirements
- Temporal independence documentation
- Property research workflows
- Decision trees for ontology selection
- Quality assurance checklists
---
## Core Principles
### Principle 1: Ontology Files Are Source of Truth
**ALWAYS** read base ontologies before designing:
```bash
# Example: Research CIDOC-CRM for heritage sites
rg "E27_Site|E53_Place" /Users/kempersc/apps/glam/data/ontology/CIDOC_CRM_v7.1.3.rdf
```
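Where `rg` is not installed, the same lookup can be approximated in plain Python. This is a minimal sketch, not project tooling: the function names `find_class_mentions_in_text` and `find_class_mentions` are hypothetical, and the search is a simple substring match on class names rather than an RDF-aware query.

```python
import re
from pathlib import Path


def find_class_mentions_in_text(text, class_names):
    """Return {class_name: [1-based line numbers]} for names found in text."""
    if not class_names:
        return {}
    # Longest names first so a longer name wins over a name that is its prefix.
    ordered = sorted(class_names, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in ordered))
    hits = {name: [] for name in class_names}
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits[match.group(0)].append(lineno)
    return hits


def find_class_mentions(ontology_path, class_names):
    """Hypothetical helper: same search, reading the ontology file from disk."""
    text = Path(ontology_path).read_text(encoding="utf-8")
    return find_class_mentions_in_text(text, class_names)
```

A call such as `find_class_mentions("data/ontology/CIDOC_CRM_v7.1.3.rdf", ["E27_Site", "E53_Place"])` would then list the lines to read before choosing a class.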
### Principle 2: Wikidata ≠ Ontology
**NEVER** use Wikidata Q-numbers as `class_uri`:
```yaml
❌ WRONG: class_uri: wd:Q1802963
✅ RIGHT: class_uri: crm:E27_Site # After mapping Q1802963 to ontology
```
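This rule is easy to check mechanically. The sketch below is illustrative only: `check_class_uri` is a hypothetical helper, and the list of rejected prefixes (the `wd:` CURIE prefix and the Wikidata entity URI base) is an assumption about how Q-numbers would appear in a schema.

```python
# Assumed spellings of a Wikidata entity reference (CURIE and full URI forms).
WIKIDATA_PREFIXES = (
    "wd:",
    "http://www.wikidata.org/entity/",
    "https://www.wikidata.org/entity/",
)


def check_class_uri(class_uri):
    """Return class_uri unchanged, or raise ValueError if it is a Wikidata entity."""
    if class_uri.startswith(WIKIDATA_PREFIXES):
        raise ValueError(
            f"{class_uri!r} is a Wikidata entity; map it to an ontology class first"
        )
    return class_uri
```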
### Principle 3: Multi-Aspect Modeling
**EVERY** heritage entity has multiple aspects:
- **Place** (construction → present)
- **Custodian** (founding → present)
- **Legal form** (registration → present)
- **Collections** (accession → present)
- **People** (employment periods)
- **Events** (custody transfers, mergers)
### Principle 4: Temporal Independence
**Each aspect has its OWN timeline:**
```yaml
# Building exists 1880-present (145 years as of 2025)
place_aspect:
  temporal_extent:
    start_date: "1880-01-01"
    end_date: null

# Museum organization founded 1994-present (31 years as of 2025)
custodian_aspect:
  temporal_extent:
    start_date: "1994-05-12"
    end_date: null
```
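To make the independence concrete, each aspect can carry its own extent object and compute its own duration. This is a minimal sketch under stated assumptions: the `TemporalExtent` class and its `years_elapsed` method are hypothetical names, and an `end_date` of `None` is taken to mean "still ongoing", mirroring `end_date: null` in the YAML.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class TemporalExtent:
    start_date: date
    end_date: Optional[date] = None  # None means the aspect is still ongoing

    def years_elapsed(self, as_of: date) -> int:
        """Whole years between start_date and end_date (or as_of if ongoing)."""
        end = self.end_date or as_of
        years = end.year - self.start_date.year
        # Subtract one if the anniversary has not yet been reached in the end year.
        if (end.month, end.day) < (self.start_date.month, self.start_date.day):
            years -= 1
        return years


# Two aspects of the same heritage entity, each with its own timeline.
place_aspect = TemporalExtent(start_date=date(1880, 1, 1))
custodian_aspect = TemporalExtent(start_date=date(1994, 5, 12))

reference_day = date(2025, 11, 20)  # the creation date of this summary
print(place_aspect.years_elapsed(reference_day))
print(custodian_aspect.years_elapsed(reference_day))
```

Passing the reference date explicitly keeps the durations reproducible instead of tying them to the day the code happens to run.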
---
## Available Ontologies
| Ontology | File | Use For |
|----------|------|---------|
| **CPOV** | `core-public-organisation-ap.ttl` | EU public sector heritage |
| **TOOI** | `tooiont.ttl` | Dutch government organizations |
| **Schema.org** | `schemaorg.owl` | Web semantics, private sector |
| **CIDOC-CRM** | `CIDOC_CRM_v7.1.3.rdf` | Cultural heritage domain |
| **RiC-O** | `RiC-O_1-1.rdf` | Archival description |
| **BIBFRAME** | `bibframe_vocabulary.rdf` | Library collections |
| **PiCo** | `pico.ttl` | Person observations, staff roles |
---
## Required Workflow
```
1. Read hyponyms_curated.yaml (Wikidata entities)
2. Analyze hypernym + semantic properties
3. Search base ontologies for matching classes
4. Map Wikidata entity → Ontology class(es)
5. Extract relevant properties from ontologies
6. Document rationale and temporal model
7. Create LinkML schema with class_uri
8. Human review if complexity ≥ 7/10
```
---
## Example: Mansion (Q1802963)
### ❌ Wrong Approach
```yaml
Mansion:
  class_uri: wd:Q1802963  # Wikidata entity used directly
```
### ✅ Correct Approach
```yaml
Mansion:
  wikidata_source: Q1802963

  # PLACE ASPECT
  place_aspect:
    class_uri: crm:E27_Site  # CIDOC-CRM
    secondary_class_uri: schema:LandmarksOrHistoricalBuildings
    temporal_extent:
      start_date: "1880-01-01"  # Construction

  # CUSTODIAN ASPECT (if operates as museum)
  custodian_aspect:
    class_uri: cpov:PublicOrganisation  # If public
    alt_class_uri: schema:Museum  # If private
    temporal_extent:
      start_date: "1994-05-12"  # Foundation established

  # COLLECTIONS ASPECT
  collections_aspect:
    class_uri: crm:E78_Curated_Holding
    temporal_extent:
      start_date: "1994-01-01"  # Accessions begin
```
---
## Decision Tree: Ontology Selection
```
Is it Dutch government?
├─ YES → tooi:Overheidsorganisatie + cpov:PublicOrganisation
└─ NO → Is it public sector?
    ├─ YES → cpov:PublicOrganisation
    └─ NO → schema:Organization
        ├─ Museum → schema:Museum
        ├─ Archive → schema:ArchiveOrganization
        ├─ Library → schema:Library
        └─ NGO → schema:NGO

Is it a physical site?
├─ YES → crm:E27_Site + schema:Place
└─ NO → Continue with organizational classes

Does it hold collections?
├─ Archival → rico:RecordSet
├─ Museum → crm:E78_Curated_Holding
└─ Library → bf:Collection

Does it have staff?
└─ YES → pico:PersonObservation + crm:E21_Person
```
```
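The tree above can be read as four independent questions whose answers accumulate. The sketch below encodes that reading in Python; it is an interpretation, not project code. The function name `select_classes`, its parameters, and the lowercase keys (`"museum"`, `"archival"`, and so on) are all hypothetical, and the output is a candidate list that still needs checking against the ontology files.

```python
# Private-sector subtypes from the tree; anything else falls back to schema:Organization.
PRIVATE_ORG_CLASSES = {
    "museum": "schema:Museum",
    "archive": "schema:ArchiveOrganization",
    "library": "schema:Library",
    "ngo": "schema:NGO",
}

COLLECTION_CLASSES = {
    "archival": "rico:RecordSet",
    "museum": "crm:E78_Curated_Holding",
    "library": "bf:Collection",
}


def select_classes(is_dutch_government=False, is_public_sector=False,
                   org_type=None, is_physical_site=False,
                   collection_types=(), has_staff=False):
    """Return candidate class URIs by answering the four questions in order."""
    classes = []
    # Question 1: organizational classes.
    if is_dutch_government:
        classes += ["tooi:Overheidsorganisatie", "cpov:PublicOrganisation"]
    elif is_public_sector:
        classes.append("cpov:PublicOrganisation")
    else:
        classes.append(PRIVATE_ORG_CLASSES.get(org_type, "schema:Organization"))
    # Question 2: physical site.
    if is_physical_site:
        classes += ["crm:E27_Site", "schema:Place"]
    # Question 3: collections (an entity may hold more than one kind).
    for kind in collection_types:
        classes.append(COLLECTION_CLASSES[kind])
    # Question 4: staff.
    if has_staff:
        classes += ["pico:PersonObservation", "crm:E21_Person"]
    return classes
```

For a privately run mansion museum, `select_classes(org_type="museum", is_physical_site=True, collection_types=["museum"])` yields `schema:Museum`, `crm:E27_Site`, `schema:Place`, and `crm:E78_Curated_Holding`.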
---
## Quality Checklist
Before submitting ontology design:
- [ ] Base ontologies consulted (`/data/ontology/` files read)
- [ ] Wikidata entities mapped (not used directly as classes)
- [ ] Multi-aspect modeling applied
- [ ] Temporal independence documented
- [ ] Properties sourced from ontologies
- [ ] Rationale documented
- [ ] Examples provided
- [ ] Complexity score assigned (1-10)
- [ ] Human review requested if complexity ≥ 7
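The checklist and the review threshold can also be tracked in code. The following is a rough sketch with hypothetical names (`CHECKLIST`, `REVIEW_THRESHOLD`, `submission_report`); the item labels are shortened from the list above, and the only rules it encodes are the 1-10 score range and the "complexity ≥ 7" review trigger.

```python
CHECKLIST = (
    "Base ontologies consulted",
    "Wikidata entities mapped",
    "Multi-aspect modeling applied",
    "Temporal independence documented",
    "Properties sourced from ontologies",
    "Rationale documented",
    "Examples provided",
)

REVIEW_THRESHOLD = 7  # human review is required at or above this complexity


def submission_report(completed, complexity):
    """Summarize which checklist items are missing and whether review is needed."""
    if not 1 <= complexity <= 10:
        raise ValueError("complexity score must be between 1 and 10")
    missing = [item for item in CHECKLIST if item not in completed]
    return {
        "missing": missing,
        "needs_human_review": complexity >= REVIEW_THRESHOLD,
        "ready": not missing,
    }
```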
---
## Files Updated
1. **AGENTS.md** - Added PROJECT CORE MISSION section (lines 1-100)
2. **.opencode/agent/ontology-mapping-rules.md** - NEW comprehensive guide
3. **This file** (ONTOLOGY_RULES_SUMMARY.md) - Quick reference
---
## Next Steps
1. Continue manual ontology mapping for hyponyms_curated.yaml entries
2. Document each mapping with full rationale
3. Build aspect-based LinkML schema modules
4. Create temporal modeling examples for common patterns
---
## Key Resources
- **Full Rules**: `.opencode/agent/ontology-mapping-rules.md`
- **Agent Instructions**: `AGENTS.md`
- **Ontology Files**: `data/ontology/`
- **Wikidata Sources**: `data/wikidata/GLAMORCUBEPSXHFN/`

**Remember**: This is ontology engineering, not data extraction. Precision matters more than speed.