# Ontology Mapping Rules for Heritage Custodian Project

**Version**: 1.0
**Last Updated**: 2025-11-20
**Purpose**: Define rigorous ontological mapping procedures for AI agents working on the GLAM heritage custodian data project

---

## Core Principle: Ontology-First Design

**CRITICAL**: The primary objective of this project is to create a **comprehensive, nuanced ontology** that can accurately represent the complex, temporal, multi-faceted nature of heritage custodian institutions worldwide.

### What This Means

- ✅ **DO**: Study ontology files deeply before creating classes or properties
- ✅ **DO**: Map Wikidata entities to formal ontology classes with explicit rationale
- ✅ **DO**: Model temporal independence of different aspects (place, custodian, legal form, collections, people)
- ✅ **DO**: Support multiple ontology classes for the same entity (CPOV + TOOI + Schema.org + CIDOC-CRM)
- ❌ **DON'T**: Use Wikidata Q-numbers directly as ontology classes
- ❌ **DON'T**: Create generic "HeritageCustodian" mappings without considering semantic aspects
- ❌ **DON'T**: Ignore temporal dimensions (everything changes over time!)

---

## Heritage-First Framing Principle

**CRITICAL FRAMING**: This project exclusively focuses on entities with **heritage significance**. All Wikidata entities in our taxonomy are evaluated through a heritage lens.

### Heritage Significance Default

When mapping Wikidata entities to ontology classes:

- ✅ **ALWAYS assume heritage significance** - We only include entities that are or could become heritage custodians
- ✅ **ALWAYS use heritage-focused ontology classes** - Prefer crm:E27_Site over generic schema:Place, and schema:LandmarksOrHistoricalBuildings over schema:Building
- ✅ **ALWAYS model place aspect for physical sites** - Buildings, monuments, landscapes in our taxonomy have heritage value
- ❌ **DON'T use generic real estate classes** - schema:Accommodation, schema:Residence are TOO GENERIC for our heritage focus
- ❌ **DON'T require "proof of heritage status"** - If an entity type is in our Wikidata extraction, it has heritage potential

### Examples

**Vacation Properties (Q3694)**
- ❌ WRONG: "Use schema:Accommodation as primary class because most vacation properties are commercial rentals"
- ✅ CORRECT: "Use crm:E27_Site as primary class because vacation properties in our taxonomy are HISTORIC VACATION PROPERTIES (royal summer palaces, historic villas) with documented heritage significance"

**Mansions (Q1802963)**
- ❌ WRONG: "Use schema:Residence because mansions are residential buildings"
- ✅ CORRECT: "Use crm:E27_Site + schema:LandmarksOrHistoricalBuildings because mansions in our taxonomy are HERITAGE BUILDINGS with architectural significance"

**Buitenplaatsen (Q2927789)**
- ❌ WRONG: "Use schema:House because buitenplaatsen are country houses"
- ✅ CORRECT: "Use crm:E27_Site because buitenplaatsen are HISTORIC ESTATES, many with Rijksmonument status and heritage protection"

### Ontology Selection Decision Tree for Physical Sites

```
Is the entity a physical place/building/site?
    ↓ YES
Is it in our GLAMORCUBESFIXPHDNT taxonomy?
    ↓ YES
THEN it has heritage significance
    ↓
PRIMARY CLASS: crm:E27_Site (CIDOC-CRM heritage site)
SECONDARY CLASS: schema:LandmarksOrHistoricalBuildings (Schema.org)
TERTIARY CLASS: dbo:HistoricPlace OR dbo:HistoricBuilding (DBpedia)
    ↓
NEVER USE: schema:Accommodation, schema:Residence, schema:Building (too generic)
```

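
The physical-site tree can also be read as a small lookup. The sketch below is illustrative only: the function names are hypothetical, and the class URIs are copied from the tree rather than checked against the ontology files.

```python
from typing import Dict, List

# URIs the tree marks as NEVER USE for heritage sites.
FORBIDDEN_SITE_CLASSES = {"schema:Accommodation", "schema:Residence", "schema:Building"}


def select_site_classes(is_physical_site: bool, in_taxonomy: bool) -> Dict[str, List[str]]:
    """Follow the tree: a physical site in the taxonomy gets heritage classes."""
    if not (is_physical_site and in_taxonomy):
        return {}  # the tree does not cover this case
    return {
        "primary": ["crm:E27_Site"],
        "secondary": ["schema:LandmarksOrHistoricalBuildings"],
        # The tree offers two DBpedia options; the caller picks one.
        "tertiary": ["dbo:HistoricPlace", "dbo:HistoricBuilding"],
    }


def forbidden_site_classes(class_uris: List[str]) -> List[str]:
    """Return the URIs from class_uris that the tree forbids."""
    return [uri for uri in class_uris if uri in FORBIDDEN_SITE_CLASSES]


print(select_site_classes(True, True)["primary"])
print(forbidden_site_classes(["crm:E27_Site", "schema:Residence"]))
```

A check like `forbidden_site_classes` could run over a draft mapping before review, so that a generic class is caught early.
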
### Rationale

1. **Taxonomy Scope**: Our Wikidata extraction targets GLAM entities - by definition, these have heritage significance
2. **Project Mission**: We model heritage custodians, not generic real estate
3. **Ontology Precision**: Heritage-specific classes (crm:E27_Site) provide richer semantics than generic classes
4. **Data Quality**: Using heritage classes signals to consumers that these are culturally significant entities
5. **Interoperability**: CIDOC-CRM is the STANDARD for cultural heritage - we must use it for heritage sites

---

## Rule 1: Ontology Files Are Source of Truth

**All ontology design MUST reference base ontologies in `/data/ontology/`.**

### Available Ontologies

| Ontology | File | Scope | When to Use |
|----------|------|-------|-------------|
| **CPOV** | `core-public-organisation-ap.ttl` | EU public sector | Government archives, state museums, public cultural institutions |
| **TOOI** | `tooiont.ttl` | Dutch government | Netherlands government heritage organizations |
| **Schema.org** | `schemaorg.owl` | Web semantics | Private collections, web discoverability, general fallback |
| **CIDOC-CRM** | `CIDOC_CRM_v7.1.3.rdf` | Cultural heritage domain | Museums, sites, curated holdings, provenance |
| **RiC-O** | `RiC-O_1-1.rdf` | Archival description | Archives, record sets, corporate bodies |
| **BIBFRAME** | `bibframe_vocabulary.rdf` | Bibliographic resources | Libraries, bibliographic collections |
| **PiCo** | `pico.ttl` | Person observations | Staff, curators, archivists, directors |
| **W3C Org** | (embedded in CPOV) | Organizational structure | Legal forms, organizational units |

### Mandatory Ontology Consultation Workflow

**Before designing any LinkML class, agents MUST:**

1. **Identify the semantic domain** (cultural, archival, educational, legal, etc.)
2. **Read relevant ontology files** using `read` or `grep` tools
3. **Extract applicable classes and properties**
4. **Document ontology alignment** in design notes
5. **Map Wikidata hypernyms to ontology classes** (not vice versa!)

**Example Workflow**:

```bash
# Step 1: Identify domain
# Entity: "mansion" (building + potential heritage custodian)

# Step 2: Search CIDOC-CRM for site/building classes
rg "E27_Site|E53_Place" /Users/kempersc/apps/glam/data/ontology/CIDOC_CRM_v7.1.3.rdf

# Step 3: Search Schema.org for building types
rg "LandmarksOrHistoricalBuildings|TouristAttraction" /Users/kempersc/apps/glam/data/ontology/schemaorg.owl

# Step 4: Search CPOV for organization classes (if mansion operates as museum)
rg "PublicOrganisation|classification" /Users/kempersc/apps/glam/data/ontology/core-public-organisation-ap.ttl

# Step 5: Document findings in design notes
# "Mansion should map to crm:E27_Site (place aspect) AND
#  cpov:PublicOrganisation (custodian aspect if operates as museum)"
```

---

## Rule 2: Never Use Wikidata Entities Directly

**Wikidata Q-numbers are NOT ontology classes. They are ENTITY IDENTIFIERS.**

### Incorrect Approach ❌

```yaml
# BAD - Wikidata Q-number used as class
HeritageCustodian:
  class_uri: wd:Q1802963  # ← This is an ENTITY IDENTIFIER (mansion), not an ontology CLASS!
```

### Correct Approach ✅

```yaml
# GOOD - Wikidata entity mapped to formal ontology classes
Mansion:
  description: >-
    Large residential building, often with heritage significance.
    Wikidata reference: Q1802963

  # Place aspect
  place_class_uri: crm:E27_Site
  place_secondary_uri: schema:LandmarksOrHistoricalBuildings

  # Custodian aspect (if operates as heritage institution)
  custodian_class_uri: cpov:PublicOrganisation  # If public
  custodian_alt_uri: schema:Museum  # If private

  # Collections aspect
  collections_class_uri: crm:E78_Curated_Holding
```

### Wikidata Hypernym Files Purpose

The files `/schemas/hyponyms_curated.yaml` and `/schemas/hyponyms_curated_full.yaml` are:

- ✅ **Source data** for identifying heritage entity TYPES
- ✅ **Analysis input** for understanding domain taxonomy
- ✅ **Reference** for multilingual labels and descriptions
- ❌ **NOT** direct ontology class definitions

**Required Mapping Workflow**:

```
hyponyms_curated.yaml (Wikidata entities)
    ↓
ANALYZE semantic properties
    ↓
SEARCH base ontologies for appropriate classes
    ↓
MAP Wikidata entity to ontology class(es)
    ↓
DOCUMENT rationale and properties
    ↓
CREATE LinkML schema with ontology class_uri
```

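
Rule 2 lends itself to an automated check. The following is a minimal sketch, assuming schema classes have already been loaded into plain dictionaries and that every field whose name ends in `_uri` holds a class URI; the function name and that field convention are assumptions made for illustration, not part of the project tooling.

```python
import re
from typing import Dict, List

# Matches a Wikidata Q-number written as a CURIE (wd:Q123) or as a full URL.
WIKIDATA_ID = re.compile(r"^(wd:|https?://www\.wikidata\.org/(entity|wiki)/)Q\d+$")


def wikidata_class_uris(schema_classes: Dict[str, dict]) -> List[str]:
    """Return 'Class.field' for every *_uri field that holds a Wikidata Q-number."""
    offenders = []
    for class_name, definition in schema_classes.items():
        for field, value in definition.items():
            if field.endswith("_uri") and isinstance(value, str) and WIKIDATA_ID.match(value):
                offenders.append(f"{class_name}.{field}")
    return offenders


bad = {"HeritageCustodian": {"class_uri": "wd:Q1802963"}}
good = {
    "Mansion": {
        "description": "Wikidata reference: Q1802963",  # allowed: plain reference
        "place_class_uri": "crm:E27_Site",
    }
}
print(wikidata_class_uris(bad))
print(wikidata_class_uris(good))
```

The Q-number in a description stays legal because only `_uri` fields are inspected, which matches the distinction the rule draws between referencing an entity and using it as a class.
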
---

## Rule 3: Multi-Aspect Modeling is Mandatory

**Every heritage entity has MULTIPLE ontological aspects with INDEPENDENT temporal lifecycles.**

### Required Aspects

All heritage custodian entities MUST model these aspects:

1. **Place Aspect** (physical location/site)
   - Ontology: CIDOC-CRM (E27_Site, E53_Place) + Schema.org (Place)
   - Temporal: Construction → Demolition/Present
   - Properties: Address, coordinates, building type, heritage designation

2. **Custodian Aspect** (organization managing heritage)
   - Ontology: CPOV (public) OR Schema.org (private) + CIDOC-CRM (E39_Actor)
   - Temporal: Founding → Dissolution/Present
   - Properties: Legal identifiers, organizational structure, mission

3. **Legal Form Aspect** (legal entity registration)
   - Ontology: W3C Org (FormalOrganization) + TOOI (Dutch)
   - Temporal: Registration → Deregistration/Present
   - Properties: KvK number, legal classification, registered address

4. **Collections Aspect** (heritage materials preserved)
   - Ontology: RiC-O (archival) OR CIDOC-CRM (museum) OR BIBFRAME (library)
   - Temporal: Accession → Deaccession (per item/collection)
   - Properties: Provenance, extent, access restrictions

5. **People Aspect** (staff/curators)
   - Ontology: PiCo (PersonObservation) + CIDOC-CRM (E21_Person)
   - Temporal: Employment start → Employment end (per person)
   - Properties: Roles, activities, employment records

6. **Temporal Events** (organizational changes)
   - Ontology: CIDOC-CRM (E10_Transfer_of_Custody, E8_Acquisition) + RiC-O (Event)
   - Properties: Custody transfers, mergers, relocations, transformations

### Example: Modeling a Historic Mansion Operating as Museum

```yaml
# Entity: Villa Mondriaan (Winterswijk, Netherlands)

# PLACE ASPECT
villa_mondriaan_place:
  aspect_type: place
  class_uri: crm:E27_Site
  secondary_class_uri: schema:LandmarksOrHistoricalBuildings
  temporal_extent:
    construction_date: "1880-01-01"
    current_status: standing
  properties:
    address: "Zonnebrink 4, 7101 NP Winterswijk"
    coordinates: [51.9711, 6.7197]
    heritage_designation: "Rijksmonument"

# CUSTODIAN ASPECT
stichting_villa_mondriaan:
  aspect_type: custodian
  class_uri: cpov:PublicOrganisation  # Dutch foundation with public benefit
  secondary_class_uri: schema:Museum
  temporal_extent:
    founding_date: "1994-05-12"
    current_status: active
  properties:
    legal_name: "Stichting Villa Mondriaan"
    isil_code: "NL-WtVM"
    manages: [villa_mondriaan_collections]

# LEGAL FORM ASPECT
stichting_legal_entity:
  aspect_type: legal_form
  class_uri: org:FormalOrganization
  mixin_class_uri: tooi:Overheidsorganisatie  # Dutch government org
  temporal_extent:
    registration_date: "1994-05-12"
    current_status: registered
  properties:
    kvk_number: "12345678"
    legal_form: "stichting"  # Dutch foundation

# COLLECTIONS ASPECT
villa_mondriaan_collections:
  aspect_type: collections
  class_uri: crm:E78_Curated_Holding
  archival_class_uri: rico:RecordSet
  temporal_extent:
    accession_start: "1994-01-01"
    current_status: growing
  properties:
    provenance: "Mondriaan family"
    extent: "500 objects, 200 archival documents"

# PEOPLE ASPECT
curator_maria_van_der_berg:
  aspect_type: person
  class_uri: pico:PersonObservation
  secondary_class_uri: crm:E21_Person
  temporal_extent:
    employment_start: "2020-01-01"
    current_status: employed
  properties:
    role: picot_roles:curator
    works_for: stichting_villa_mondriaan
```

---

## Rule 4: Temporal Independence Documentation

**All aspects have SEPARATE temporal lifecycles. Document this explicitly.**

### Required Temporal Properties

Every aspect MUST include:

```yaml
temporal_extent:
  start_date: "YYYY-MM-DD"  # When this aspect began
  end_date: null            # "YYYY-MM-DD" when the aspect ended; null = ongoing
  certainty: certain        # One of: certain | approximate | inferred
  source: archival_record   # e.g. archival_record | legal_registration | oral_history
```

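
The four required temporal properties can be validated mechanically. This is a sketch under stated assumptions: the aspect's `temporal_extent` is already loaded as a dictionary, dates are ISO `YYYY-MM-DD` strings, and `temporal_extent_errors` is a hypothetical helper rather than existing project code.

```python
from datetime import date
from typing import List, Optional

ALLOWED_CERTAINTY = {"certain", "approximate", "inferred"}


def _parse(value, field: str, errors: List[str]) -> Optional[date]:
    """Parse an ISO date string, recording an error instead of raising."""
    try:
        return date.fromisoformat(value)
    except (TypeError, ValueError):
        errors.append(f"{field} is not a YYYY-MM-DD date: {value!r}")
        return None


def temporal_extent_errors(extent: dict) -> List[str]:
    """Return a list of problems; an empty list means the extent is acceptable."""
    errors: List[str] = []
    start = _parse(extent.get("start_date"), "start_date", errors)
    end_raw = extent.get("end_date")  # None means the aspect is ongoing
    end = _parse(end_raw, "end_date", errors) if end_raw is not None else None
    if start and end and end < start:
        errors.append("end_date precedes start_date")
    if extent.get("certainty") not in ALLOWED_CERTAINTY:
        errors.append("certainty must be one of: " + ", ".join(sorted(ALLOWED_CERTAINTY)))
    if not extent.get("source"):
        errors.append("source is missing")
    return errors


ongoing = {"start_date": "1864-01-01", "end_date": None,
           "certainty": "certain", "source": "archival_record"}
print(temporal_extent_errors(ongoing))
```

Returning a list of messages rather than raising on the first problem lets a reviewer see every gap in one pass.
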
### Example: Temporal Independence in Custody Transfer

```yaml
# Heineken corporate archive custody transfer (2005)

# BEFORE TRANSFER (1864-2005)
heineken_corporate_archive:
  custodian_aspect:
    custodian_id: heineken_nv
    class_uri: schema:Corporation
    temporal_extent:
      start_date: "1864-01-01"  # Heineken founded
      end_date: "2005-06-15"    # Custody transferred

  collections_aspect:
    class_uri: rico:RecordSet
    provenance: "Heineken N.V."
    temporal_extent:
      start_date: "1864-01-01"
      end_date: null  # Collection still exists (just moved)

# AFTER TRANSFER (2005-present)
heineken_archive_at_stadsarchief:
  custodian_aspect:
    custodian_id: stadsarchief_amsterdam
    class_uri: cpov:PublicOrganisation
    temporal_extent:
      start_date: "2005-06-15"  # Custody received
      end_date: null            # Ongoing

  collections_aspect:
    class_uri: rico:RecordSet
    provenance: "Heineken N.V."  # ← Provenance unchanged!
    temporal_extent:
      start_date: "1864-01-01"  # ← Collection dates unchanged!
      end_date: null

# CUSTODY TRANSFER EVENT
custody_transfer_event:
  event_type: crm:E10_Transfer_of_Custody
  class_uri: rico:Event
  temporal_extent:
    event_date: "2005-06-15"
  properties:
    surrendered_by: heineken_nv
    received_by: stadsarchief_amsterdam
    transferred_object: heineken_corporate_archive
```

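
The invariants in the custody-transfer example (custody changes hands on the event date, while provenance and collection dates stay fixed) can be expressed as a consistency check. The sketch below assumes the three records are loaded as dictionaries with the same keys as the YAML example; `custody_handover_errors` is a hypothetical name.

```python
from typing import List


def custody_handover_errors(before: dict, after: dict, event: dict) -> List[str]:
    """Check that a custody transfer leaves the collection aspect untouched."""
    errors: List[str] = []
    event_date = event["temporal_extent"]["event_date"]
    old_custody = before["custodian_aspect"]["temporal_extent"]
    new_custody = after["custodian_aspect"]["temporal_extent"]
    if old_custody["end_date"] != event_date:
        errors.append("previous custody does not end on the transfer date")
    if new_custody["start_date"] != event_date:
        errors.append("new custody does not start on the transfer date")
    old_coll, new_coll = before["collections_aspect"], after["collections_aspect"]
    if old_coll["provenance"] != new_coll["provenance"]:
        errors.append("provenance changed across the transfer")
    if old_coll["temporal_extent"] != new_coll["temporal_extent"]:
        errors.append("collection dates changed across the transfer")
    return errors


# Values taken from the Heineken example.
collection = {"provenance": "Heineken N.V.",
              "temporal_extent": {"start_date": "1864-01-01", "end_date": None}}
before = {"custodian_aspect": {"temporal_extent": {"start_date": "1864-01-01",
                                                   "end_date": "2005-06-15"}},
          "collections_aspect": collection}
after = {"custodian_aspect": {"temporal_extent": {"start_date": "2005-06-15",
                                                  "end_date": None}},
         "collections_aspect": dict(collection)}
event = {"temporal_extent": {"event_date": "2005-06-15"}}
print(custody_handover_errors(before, after, event))
```
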
---

## Rule 5: Ontology Properties Must Be Researched

**Never invent custom properties when ontology equivalents exist.**

### Property Research Workflow

1. **Identify the relationship** you need to express
2. **Search base ontologies** for existing properties
3. **Use ontology property** with proper namespace
4. **Document property source** in comments

**Example**:

```yaml
# ❌ WRONG - Custom property invented
institution:
  official_name: "Rijksarchief in Noord-Holland"

# ✅ CORRECT - CPOV ontology property used
institution:
  skos:prefLabel: "Rijksarchief in Noord-Holland"  # language tag: nl
  # Source: CPOV uses SKOS for preferred labels
```

### Common Property Mappings

| Need | Ontology Property | Namespace |
|------|-------------------|-----------|
| Preferred name | `skos:prefLabel` | SKOS (used by CPOV) |
| Alternative names | `skos:altLabel` | SKOS |
| Identifiers | `dct:identifier` | Dublin Core Terms |
| Address | `locn:address` | W3C Location Core |
| Coordinates | `schema:geo` | Schema.org |
| Founding date | `schema:foundingDate` OR `tooi:begindatum` | Schema.org / TOOI |
| Organizational unit | `cpov:hasUnit` OR `org:hasUnit` | CPOV / W3C Org |
| Curated collection | `crm:P147_curated` | CIDOC-CRM |
| Archival holdings | `rico:isOrWasHolderOf` | RiC-O |
| Person role | `pico:hasRole` | PiCo |
| Provenance | `rico:hasProvenance` OR `prov:hadPrimarySource` | RiC-O / PROV-O |

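
One way to make the table usable during schema drafting is a plain lookup keyed by the need being expressed. This is only a sketch: the dictionary keys are hypothetical labels invented here, and the property names are copied from the table without re-checking them against the ontology files.

```python
from typing import Dict, List

# Candidate ontology properties per modeling need, as listed in the table.
PROPERTY_MAPPINGS: Dict[str, List[str]] = {
    "preferred_name": ["skos:prefLabel"],
    "alternative_names": ["skos:altLabel"],
    "identifiers": ["dct:identifier"],
    "address": ["locn:address"],
    "coordinates": ["schema:geo"],
    "founding_date": ["schema:foundingDate", "tooi:begindatum"],
    "organizational_unit": ["cpov:hasUnit", "org:hasUnit"],
    "curated_collection": ["crm:P147_curated"],
    "archival_holdings": ["rico:isOrWasHolderOf"],
    "person_role": ["pico:hasRole"],
    "provenance": ["rico:hasProvenance", "prov:hadPrimarySource"],
}


def ontology_properties_for(need: str) -> List[str]:
    """Return candidate properties, or an empty list if research is still needed."""
    return PROPERTY_MAPPINGS.get(need, [])


print(ontology_properties_for("founding_date"))
print(ontology_properties_for("opening_hours"))  # not in the table
```

An empty result does not license a custom property; it signals that the base ontologies still have to be searched as described in the workflow.
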
---

## Rule 6: Decision Trees for Ontology Selection

**Use structured decision trees to select appropriate ontologies.**

### Decision Tree: Primary Ontology Class

```
START: Heritage entity identified
    ↓
Is it a physical place/site?
    ├─ YES → PRIMARY: crm:E27_Site + schema:Place
    │         Continue to check if also a custodian organization ↓
    │
    └─ NO → Is it an organization?
              ├─ YES → Is it public sector?
              │         ├─ YES → cpov:PublicOrganisation
              │         │         Is it Dutch government?
              │         │         ├─ YES → ADD MIXIN: tooi:Overheidsorganisatie
              │         │         └─ NO → CPOV only
              │         │
              │         └─ NO → schema:Organization
              │                   What type?
              │                   ├─ Museum → schema:Museum
              │                   ├─ Library → schema:Library
              │                   ├─ Archive → schema:ArchiveOrganization
              │                   ├─ Education → schema:EducationalOrganization
              │                   └─ NGO → schema:NGO
              │
              └─ NO → Is it a collection?
                        ├─ Archival → rico:RecordSet
                        ├─ Museum → crm:E78_Curated_Holding
                        ├─ Library → bf:Collection
                        └─ Mixed → Use multiple classes
```

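
For illustration, the primary-class tree can be written as a function. The parameter names are hypothetical, and two readings of the tree are assumptions: a private organization receives its specific Schema.org subtype (falling back to `schema:Organization`), and the collection branch is reached whenever the entity is not an organization.

```python
from typing import List, Optional

PRIVATE_ORG_TYPES = {
    "museum": "schema:Museum",
    "library": "schema:Library",
    "archive": "schema:ArchiveOrganization",
    "education": "schema:EducationalOrganization",
    "ngo": "schema:NGO",
}
COLLECTION_TYPES = {
    "archival": "rico:RecordSet",
    "museum": "crm:E78_Curated_Holding",
    "library": "bf:Collection",
}


def primary_classes(is_place: bool,
                    is_organization: bool = False,
                    is_public: bool = False,
                    is_dutch_government: bool = False,
                    org_type: Optional[str] = None,
                    collection_type: Optional[str] = None) -> List[str]:
    """Walk the tree and collect class URIs in the order they are reached."""
    classes: List[str] = []
    if is_place:
        classes += ["crm:E27_Site", "schema:Place"]
        # The tree continues: a place may also be a custodian organization.
    if is_organization:
        if is_public:
            classes.append("cpov:PublicOrganisation")
            if is_dutch_government:
                classes.append("tooi:Overheidsorganisatie")  # mixin
        else:
            classes.append(PRIVATE_ORG_TYPES.get(org_type, "schema:Organization"))
    elif collection_type == "mixed":
        classes += list(COLLECTION_TYPES.values())  # use multiple classes
    elif collection_type in COLLECTION_TYPES:
        classes.append(COLLECTION_TYPES[collection_type])
    return classes


print(primary_classes(is_place=True, is_organization=True,
                      is_public=True, is_dutch_government=True))
```
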
### Decision Tree: Dutch vs. EU vs. Global

```
START: Determine geographic/legal scope
    ↓
Country == "Netherlands"?
    ├─ YES → Legal status == "public"?
    │         ├─ YES → USE: tooi:Overheidsorganisatie (Dutch government)
    │         │         ALSO ADD: cpov:PublicOrganisation (EU compliance)
    │         │
    │         └─ NO → USE: schema:Organization (private)
    │                   ADD: DutchLegalEntityMixin (KvK numbers)
    │
    └─ NO → In Europe?
              ├─ YES → Legal status == "public"?
              │         ├─ YES → USE: cpov:PublicOrganisation
              │         └─ NO → USE: schema:Organization
              │
              └─ NO → USE: schema:Organization (global)
                        ADD domain-specific class:
                        - schema:Museum
                        - schema:ArchiveOrganization
                        - schema:Library
```

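
The geographic tree reduces to a few nested conditions. The sketch below uses hypothetical parameter names and leaves out the domain-specific Schema.org class (`schema:Museum` and similar), which the caller would add separately.

```python
from typing import List


def scope_classes(country: str, is_public: bool, in_europe: bool = False) -> List[str]:
    """Return organization classes and mixins for a geographic/legal scope."""
    if country == "Netherlands":
        if is_public:
            # Dutch government class first, CPOV added for EU compliance.
            return ["tooi:Overheidsorganisatie", "cpov:PublicOrganisation"]
        # Private Dutch entity: generic class plus the KvK mixin.
        return ["schema:Organization", "DutchLegalEntityMixin"]
    if in_europe and is_public:
        return ["cpov:PublicOrganisation"]
    # Private European and all non-European entities.
    return ["schema:Organization"]


print(scope_classes("Netherlands", is_public=False))
print(scope_classes("France", is_public=True, in_europe=True))
```

Applied to the heemkamer scenario (Netherlands, usually a private foundation or association), this yields `schema:Organization` with `DutchLegalEntityMixin`, after which a domain class such as `schema:NGO` is chosen.
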
---

## Rule 7: Documentation Requirements

**All ontology mappings MUST be documented with rationale.**

### Required Documentation Fields

```yaml
ontology_mapping:
  wikidata_source: Q1802963  # Wikidata entity being mapped
  wikidata_label: mansion

  primary_class:
    uri: crm:E27_Site
    namespace: http://www.cidoc-crm.org/cidoc-crm/
    rationale: >-
      CIDOC-CRM E27_Site for physical heritage buildings with
      archaeological/architectural significance.
    ontology_file: data/ontology/CIDOC_CRM_v7.1.3.rdf
    ontology_section: "Lines 1234-1267"  # Optional

  secondary_class:
    uri: schema:LandmarksOrHistoricalBuildings
    namespace: http://schema.org/
    rationale: Web discoverability for historic landmarks
    ontology_file: data/ontology/schemaorg.owl

  properties:
    - uri: crm:P1_is_identified_by
      range: crm:E41_Appellation
      usage: Building name identification
      example: "Buitenplaats Beeckestijn"

    - uri: schema:geo
      range: schema:GeoCoordinates
      usage: Geographic coordinates
      example: "{latitude: 51.9711, longitude: 6.7197}"

  temporal_model:
    aspects:
      - place        # Physical site
      - custodian    # If operates as heritage institution
      - collections  # If holds curated materials

    temporal_independence_note: >-
      Place existence (construction → present) is independent from
      custodian organization lifecycle (founding → present).

  complexity_score: 9  # 1-10 scale
  reviewed_by: human_expert
  review_date: "2025-11-20"
```

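
A documentation record of this shape can be checked for completeness before submission. This is a sketch, not project tooling: which fields count as mandatory is an assumption read off the example (only `ontology_section` is marked optional there), and the review threshold of 7 is taken from the quality checklist in Rule 9.

```python
from typing import List

REQUIRED_TOP_LEVEL = ("wikidata_source", "wikidata_label", "primary_class",
                      "properties", "temporal_model", "complexity_score")
REQUIRED_CLASS_FIELDS = ("uri", "namespace", "rationale", "ontology_file")
HUMAN_REVIEW_THRESHOLD = 7  # complexity score at or above which review is requested


def missing_documentation(mapping: dict) -> List[str]:
    """Return dotted paths of required fields absent from an ontology_mapping."""
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in mapping]
    for class_key in ("primary_class", "secondary_class"):
        for field in REQUIRED_CLASS_FIELDS:
            if class_key in mapping and field not in mapping[class_key]:
                missing.append(f"{class_key}.{field}")
    return missing


def needs_human_review(mapping: dict) -> bool:
    """True when the recorded complexity score reaches the review threshold."""
    return mapping.get("complexity_score", 0) >= HUMAN_REVIEW_THRESHOLD


draft = {
    "wikidata_source": "Q1802963",
    "wikidata_label": "mansion",
    "primary_class": {"uri": "crm:E27_Site"},  # rationale etc. not yet written
    "complexity_score": 9,
}
print(missing_documentation(draft))
print(needs_human_review(draft))
```
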
---

## Rule 8: Prohibited Practices

**The following practices are STRICTLY FORBIDDEN:**

### ❌ Prohibited

1. **Using Wikidata Q-numbers as class URIs**
   ```yaml
   # FORBIDDEN
   class_uri: wd:Q33506  # This is an entity, not a class!
   ```

2. **Creating custom properties without ontology research**
   ```yaml
   # FORBIDDEN
   slots:
     institution_official_name:  # Use skos:prefLabel instead!
   ```

3. **Single-ontology mappings for complex entities**
   ```yaml
   # FORBIDDEN - Mansion is BOTH place AND potential custodian
   Mansion:
     class_uri: schema:Place  # ← Missing custodian aspect!
   ```

4. **Ignoring temporal dimensions**
   ```yaml
   # FORBIDDEN - No temporal tracking
   custodian:
     name: "Heineken Archive"
     location: "Amsterdam"
     # ← Where are the dates? Which period does this describe?
   ```

5. **Binary public/private classifications**
   ```yaml
   # FORBIDDEN - Too simplistic
   PublicHeritageCustodian:   # What about NGOs? Foundations? Mixed?
   PrivateHeritageCustodian:  # What about government corporations?
   ```

---

## Rule 9: Quality Assurance Checklist

**Before submitting any ontology design, verify:**

- [ ] All base ontologies consulted (`/data/ontology/` files read)
- [ ] Wikidata entities mapped to formal ontology classes (not used directly)
- [ ] Multi-aspect modeling applied (place, custodian, legal, collections, people)
- [ ] Temporal independence documented for each aspect
- [ ] Properties sourced from ontologies (not custom inventions)
- [ ] Decision trees applied for ontology selection
- [ ] Rationale documented for all class/property choices
- [ ] Examples provided with real-world entities
- [ ] Complexity score assigned (1-10 scale)
- [ ] Human review requested for complexity ≥ 7

---

## Rule 10: Agent Collaboration Protocol

**When working with other agents or humans:**

1. **Always cite ontology files** in design discussions
   - "According to CIDOC-CRM (lines 1234-1267 in CIDOC_CRM_v7.1.3.rdf)..."

2. **Share ontology search commands** for reproducibility
   ```bash
   rg "E27_Site" /Users/kempersc/apps/glam/data/ontology/CIDOC_CRM_v7.1.3.rdf
   ```

3. **Document disagreements** with explicit rationale
   - "Agent A suggests schema:Museum, but I recommend cpov:PublicOrganisation
     because the institution is government-operated (see TOOI classification rules)."

4. **Request human review** for:
   - Complexity score ≥ 7
   - Conflicting ontology recommendations
   - Temporal modeling ambiguities
   - Novel aspect combinations

---

## Example: Complete Ontology Mapping Workflow

**Scenario**: Map Wikidata Q3437789 (heemkamer - local history room)

### Step 1: Research Entity

```bash
# Read Wikidata metadata from hyponyms_curated_full.yaml
grep -A 100 "Q3437789" /Users/kempersc/apps/glam/data/wikidata/GLAMORCUBEPSXHFN/hyponyms_curated_full.yaml
```

**Findings**:
- Dutch concept: "Local history room/museum"
- Usually operated by volunteers/heritage societies
- Mix of museum, archive, library functions
- Often in small municipalities

### Step 2: Search Base Ontologies

```bash
# Search CPOV for organizational types
rg "classification|OrganisationType" /Users/kempersc/apps/glam/data/ontology/core-public-organisation-ap.ttl

# Search Schema.org for community organizations
rg "NGO|CivicStructure|LocalBusiness" /Users/kempersc/apps/glam/data/ontology/schemaorg.owl

# Search CIDOC-CRM for community groups
rg "E74_Group|E40_Legal_Body" /Users/kempersc/apps/glam/data/ontology/CIDOC_CRM_v7.1.3.rdf
```

### Step 3: Apply Decision Trees

- **Geographic scope**: Netherlands → Check TOOI
- **Legal status**: Usually private foundation (stichting) or association (vereniging)
- **Function**: Collects + Preserves + Exhibits local heritage → Multi-functional

**Decision**:
- PRIMARY: `schema:NGO` (non-governmental heritage organization)
- SECONDARY: `crm:E74_Group` (community heritage group)
- DUTCH MIXIN: `DutchLegalEntityMixin` (KvK registration)

### Step 4: Model Aspects

```yaml
heemkamer:
  wikidata_id: Q3437789
  ontology_mapping:

    # CUSTODIAN ASPECT
    custodian_class: schema:NGO
    custodian_secondary: crm:E74_Group
    rationale: >-
      Non-governmental community heritage organization.
      Not public sector (excludes CPOV). Uses Schema.org NGO.

    # PLACE ASPECT (often operates in specific building)
    place_class: schema:CivicStructure
    place_secondary: crm:E27_Site

    # LEGAL FORM ASPECT (Dutch foundation/association)
    legal_class: org:FormalOrganization
    legal_dutch_mixin: DutchLegalEntityMixin
    properties:
      kvk_number: required
      legal_form: "stichting OR vereniging"

    # COLLECTIONS ASPECT (multi-functional)
    collections_classes:
      - rico:RecordSet           # Local archival materials
      - crm:E78_Curated_Holding  # Museum objects
      - bf:Collection            # Local history books

    # PEOPLE ASPECT (volunteers)
    people_class: pico:PersonObservation
    people_roles:
      - picot_roles:curator
      - picot_roles:volunteer_archivist
      - picot_roles:educator

  temporal_model:
    aspects:
      - custodian    # Founding → present/closure
      - place        # Building occupancy (may change)
      - collections  # Accessions over time
      - people       # Volunteer participation periods
```

### Step 5: Document and Review

```yaml
ontology_enrichment:
  complexity_score: 8  # Multi-functional, temporal complexity
  requires_human_review: true
  review_notes: >-
    Heemkamer concept is Dutch-specific with no direct
    international equivalent. Multi-functional nature
    (museum + archive + library) requires careful aspect modeling.
```

---

## Summary: Key Takeaways for Agents

1. **Ontology files are your bible** - Read them first, always
2. **Wikidata is data, not ontology** - Map Q-numbers to formal classes
3. **Everything has multiple aspects** - Place, custodian, legal, collections, people
4. **Time is always a factor** - Model temporal independence
5. **Properties must be justified** - Use ontology properties, document rationale
6. **Complexity is reality** - Don't oversimplify, embrace nuance
7. **Document everything** - Future agents/humans need your reasoning
8. **Ask for help** - Complex cases require human review

**When in doubt**: Read the ontology files, consult AGENTS.md, request human guidance.

---

**End of Ontology Mapping Rules v1.0**