# FeaturePlace Ontology Mapping Strategy

**Date**: 2025-11-22

**Task**: Map 298 Wikidata feature types to ontology classes from `/data/ontology/`

---

## Ontology Sources Available

### Primary Ontologies

1. **CIDOC-CRM** (`CIDOC_CRM_v7.1.3.rdf`) - Cultural heritage domain standard
   - Key classes: `E27_Site`, `E22_Human-Made_Object`, `E25_Human-Made_Feature`, `E26_Physical_Feature`

2. **Schema.org** (`schemaorg.owl`) - Web semantics, general-purpose
   - Key classes: `schema:Place`, `schema:LandmarksOrHistoricalBuildings`, `schema:Museum`, `schema:Church`, `schema:PlaceOfWorship`

3. **DBpedia Ontology** (`dbpedia_heritage_classes.ttl`, `dbpedia_ontology.owl`) - Linked data from Wikipedia
   - Key classes: `dbo:Building`, `dbo:HistoricBuilding`, `dbo:Museum`, `dbo:Library`, `dbo:Archive`
   - **Mappings**: 804-line `dbpedia_wikidata_mappings.ttl` provides `dbo:Class ↔ wd:Q*` equivalences

4. **W3C Org Ontology** (`org.rdf`) - Organizational structures
   - Key classes: `org:Organization`, `org:FormalOrganization`

5. **GeoSPARQL** (`geo.ttl`) - Spatial features
   - Key classes: `geo:Feature`, `geo:Geometry`

### Supporting Ontologies

- **PROV-O** (`prov.ttl`, `prov-o.rdf`) - Provenance
- **Dublin Core** (`dublin_core_elements.rdf`) - Metadata
- **SKOS** (`skos.rdf`) - Knowledge organization
- **FOAF** (`foaf.ttl`) - Social networks
- **VCARD** (`vcard.rdf`) - Contact information

---

## Mapping Strategy by Hypernym Category

### 1. Buildings (33 entries, 11.1%)

**Wikidata Examples**: Q1802963 (mansion), Q317557 (parish church), Q1021645 (office building)

**Ontology Mappings**:
- **Primary**: `crm:E22_Human-Made_Object` (CIDOC-CRM)
- **Secondary**: `dbo:Building` (DBpedia)
- **Web**: `schema:LandmarksOrHistoricalBuildings` (Schema.org for heritage buildings)
- **Specific types**:
  - Churches → `schema:Church`, `schema:PlaceOfWorship`
  - Museums → `schema:Museum`, `dbo:Museum`
  - Historic buildings → `dbo:HistoricBuilding`

**Mapping Pattern**:
```yaml
MANSION:
  meaning: wd:Q1802963
  exact_mappings:
    - crm:E22_Human-Made_Object
    - dbo:Building
  close_mappings:
    - schema:LandmarksOrHistoricalBuildings
    - dbo:HistoricBuilding
```

---

### 2. Heritage Sites (144 entries, 48.3%)

**Wikidata Examples**: Q3694 (vacation property), Q2927789 (buitenplaats)

**Ontology Mappings**:
- **Primary**: `crm:E27_Site` (CIDOC-CRM physical site)
- **Secondary**: `dbo:HistoricPlace` (DBpedia)
- **Web**: `schema:LandmarksOrHistoricalBuildings`, `schema:TouristAttraction`

**Mapping Pattern**:
```yaml
HERITAGE_SITE:
  meaning: wd:Q???
  exact_mappings:
    - crm:E27_Site
  close_mappings:
    - dbo:HistoricPlace
    - schema:LandmarksOrHistoricalBuildings
```

---

### 3. Protected Areas (23 entries, 7.7%)

**Wikidata Examples**: National parks, nature reserves, conservation areas

**Ontology Mappings**:
- **Primary**: `crm:E27_Site` (CIDOC-CRM)
- **Web**: `schema:Park`, `schema:Place`
- **Geo**: `geo:Feature` (GeoSPARQL)

**Mapping Pattern**:
```yaml
PROTECTED_AREA:
  meaning: wd:Q???
  exact_mappings:
    - crm:E27_Site
    - geo:Feature
  close_mappings:
    - schema:Park
```

---

### 4. Structures (12 entries, 4.0%)

**Wikidata Examples**: Q336164 (sewerage pumping station), Q15710813 (physical structure)

**Ontology Mappings**:
- **Primary**: `crm:E25_Human-Made_Feature` (CIDOC-CRM)
- **Secondary**: `crm:E26_Physical_Feature` (broader)
- **Web**: `schema:Place`

**Mapping Pattern**:
```yaml
STRUCTURE:
  meaning: wd:Q???
  exact_mappings:
    - crm:E25_Human-Made_Feature
  close_mappings:
    - crm:E26_Physical_Feature
```

---

### 5. Museums (8 entries, 2.7%)

**Wikidata Examples**: Military museums, art museums, historical museums

**Ontology Mappings**:
- **Primary**: `schema:Museum` (Schema.org)
- **Secondary**: `dbo:Museum` (DBpedia)
- **Heritage**: `crm:E22_Human-Made_Object` (building as object)

**Mapping Pattern**:
```yaml
MUSEUM:
  meaning: wd:Q33506
  exact_mappings:
    - schema:Museum
    - dbo:Museum
  close_mappings:
    - crm:E22_Human-Made_Object
```

---

### 6. Infrastructure (6 entries, 2.0%)

**Wikidata Examples**: Q376799 (transport infrastructure), Q1311670 (rail infrastructure)

**Ontology Mappings**:
- **Primary**: `crm:E25_Human-Made_Feature` (CIDOC-CRM)
- **Web**: `schema:Place`
- **Note**: Infrastructure is underrepresented in cultural heritage ontologies

**Mapping Pattern**:
```yaml
INFRASTRUCTURE:
  meaning: wd:Q???
  exact_mappings:
    - crm:E25_Human-Made_Feature
  close_mappings:
    - schema:Place
  related_mappings:
    - crm:E26_Physical_Feature
```

---

### 7. Organizations (monasteries, etc.)

**Wikidata Examples**: Q44613 (monastery)

**Ontology Mappings**:
- **Primary**: `org:Organization` (W3C Org)
- **Secondary**: `dbo:Organisation` (DBpedia)
- **But also**: `crm:E22_Human-Made_Object` (monastery as building)

**Note**: Monasteries are BOTH organizations AND buildings, so map both aspects (see Rule 1).

**Mapping Pattern**:
```yaml
MONASTERY:
  meaning: wd:Q44613
  exact_mappings:
    - org:Organization           # Organizational aspect
    - crm:E22_Human-Made_Object  # Building aspect
  close_mappings:
    - dbo:Organisation
    - schema:PlaceOfWorship
```

---

## General Mapping Rules

### Rule 1: Multiple Mappings (Multi-Aspect Entities)

Many heritage features have MULTIPLE ontological aspects:

```yaml
CASTLE:
  exact_mappings:
    - crm:E22_Human-Made_Object  # Physical building
    - crm:E27_Site               # Historic site
    - dbo:Building               # DBpedia building class
  close_mappings:
    - schema:LandmarksOrHistoricalBuildings
```

**Rationale**: A castle is simultaneously:
- A physical building (E22)
- A historic site (E27)
- A landmark (Schema.org)

### Rule 2: Hierarchy (Exact → Close → Related)

```yaml
exact_mappings:
  # Direct equivalence (this IS that class)
  - crm:E27_Site
close_mappings:
  # Close semantic match (this is SIMILAR to that class)
  - dbo:HistoricPlace
  - schema:LandmarksOrHistoricalBuildings
related_mappings:
  # Related but not equivalent (this RELATES to that class)
  - geo:Feature
  - dcterms:Location
```

### Rule 3: Prefer Heritage-Specific Ontologies

**Priority order**:
1. **CIDOC-CRM** (cultural heritage domain standard)
2. **DBpedia** (linked data with Wikidata mappings)
3. **Schema.org** (web semantics, broad coverage)
4. **Domain-specific** (GeoSPARQL for geographic, Org for organizations)

### Rule 4: Use DBpedia Wikidata Mappings When Available

**Check first**: `dbpedia_wikidata_mappings.ttl`

```bash
# Example: Look up DBpedia class for Wikidata Q33506 (museum)
grep "wikidata:Q33506" /Users/kempersc/apps/glam/data/ontology/dbpedia_wikidata_mappings.ttl
# Returns: dbo:Museum owl:equivalentClass wikidata:Q33506
```

- **If found**: Use the matching `dbo:Class` as an exact mapping
- **If not found**: Use a semantic approximation and document it in `mapping_note`

---

## Implementation Workflow

### Step 1: Automated Mapping (High Confidence)

Use `dbpedia_wikidata_mappings.ttl` to automatically map entries with direct DBpedia equivalents:

```python
# Load mappings.
# parse_ttl is a placeholder helper (not defined here). It is assumed to return
# a dict keyed in the same form as `meaning`, e.g. {'wd:Q33506': 'dbo:Museum'};
# note that the mapping file itself uses the `wikidata:` prefix.
dbpedia_wd_mappings = parse_ttl('dbpedia_wikidata_mappings.ttl')

# For each feature type
for feature in feature_types:
    q_number = feature['meaning']  # e.g., wd:Q33506

    # Check for DBpedia mapping
    if q_number in dbpedia_wd_mappings:
        dbo_class = dbpedia_wd_mappings[q_number]
        feature.setdefault('exact_mappings', []).append(dbo_class)
        feature['mapping_confidence'] = 'high'
```

**Coverage estimate**: ~60-70% of entries (based on DBpedia's GLAM coverage)

---

### Step 2: Semantic Rule-Based Mapping (Medium Confidence)

Use hypernym categories to apply ontology mapping rules:

```python
# Mapping rules by hypernym
hypernym_rules = {
    'building': ['crm:E22_Human-Made_Object', 'dbo:Building'],
    'heritage site': ['crm:E27_Site', 'dbo:HistoricPlace'],
    'museum': ['schema:Museum', 'dbo:Museum'],
    'park': ['crm:E27_Site', 'schema:Park'],
    'structure': ['crm:E25_Human-Made_Feature'],
    'infrastructure': ['crm:E25_Human-Made_Feature'],
    # ... etc.
}

# Apply rules (Step 1 results are kept; confidence is never downgraded)
for feature in feature_types:
    for hypernym in feature['hypernyms']:
        if hypernym in hypernym_rules:
            exact = feature.setdefault('exact_mappings', [])
            for onto_class in hypernym_rules[hypernym]:
                if onto_class not in exact:  # skip duplicates from Step 1
                    exact.append(onto_class)
            feature.setdefault('mapping_confidence', 'medium')
```

**Coverage estimate**: ~25-30% additional entries

---

### Step 3: Manual Review (Low Confidence)

Remaining entries (~5-10%) require manual ontology consultation:
- Read Wikidata descriptions
- Search ontology files for semantic matches
- Document mapping rationale

```yaml
ESOTERIC_FEATURE_TYPE:
  meaning: wd:Q???
  exact_mappings:
    - crm:E27_Site  # Default fallback
  mapping_note: "No specific ontology class found. Using general site class."
  mapping_confidence: low
```

---

## Default Fallback Mappings

When no specific mapping is found, use these defaults:

```yaml
# Physical features (default)
exact_mappings:
  - crm:E27_Site  # CIDOC-CRM site (broadest physical feature)
close_mappings:
  - schema:Place  # Schema.org generic place
related_mappings:
  - geo:Feature   # GeoSPARQL spatial feature
```

**Rationale**: Every feature type is AT LEAST:
- A site (E27)
- A place (Schema.org)
- A geographic feature (GeoSPARQL)

---

## Quality Assurance

### Validation Checks

1. **Every entry has at least one exact_mapping**: No orphaned entries
2. **CIDOC-CRM class present**: Cultural heritage standard compliance
3. **Mapping confidence documented**: Transparency about mapping quality
4. **Wikidata Q-number valid**: All `wd:Q*` references resolve

### Confidence Levels

```yaml
mapping_confidence:
  high:    # DBpedia direct equivalence or clear 1:1 match
  medium:  # Semantic rule-based mapping
  low:     # Manual approximation or fallback to general class
```

### Mapping Notes

Document rationale for non-obvious mappings:

```yaml
SCIENTIFIC_FACILITY:
  meaning: wd:Q119459808
  exact_mappings:
    - org:Organization  # Organizational aspect
    - crm:E27_Site      # Physical site aspect
  mapping_note: >-
    DBpedia lacks specific 'scientific facility' class.
    Mapped to Organization (function) + Site (physical).
  mapping_confidence: medium
```

---

## Expected Output Format

```yaml
enums:
  FeatureTypeEnum:
    permissible_values:
      MANSION:
        title: mansion
        description: very large and imposing dwelling house
        meaning: wd:Q1802963

        # NEW: Ontology mappings
        exact_mappings:
          - crm:E22_Human-Made_Object
          - dbo:Building
        close_mappings:
          - schema:LandmarksOrHistoricalBuildings
          - dbo:HistoricBuilding
        related_mappings:
          - geo:Feature

        # NEW: Mapping metadata
        annotations:
          wikidata_id: Q1802963
          wikidata_url: https://www.wikidata.org/wiki/Q1802963
          hypernyms: building
          dbpedia_class: dbo:Building
          cidoc_crm_class: crm:E22_Human-Made_Object
          schema_org_class: schema:LandmarksOrHistoricalBuildings
          mapping_confidence: high
          mapping_date: 2025-11-22
```

---

## Implementation Plan

### Phase 1: Automated Mapping (2 hours)

1. Parse `dbpedia_wikidata_mappings.ttl`
2. Create hypernym → ontology class rules
3. Apply automated mapping to all 298 entries
4. Generate updated `FeatureTypeEnum.yaml`

### Phase 2: Manual Review (3 hours)

1. Review entries with `mapping_confidence: low`
2. Search ontology files for better matches
3. Document mapping rationale
4. Update entries with improved mappings

### Phase 3: Validation (1 hour)

1. Check all entries have exact_mappings
2. Verify CIDOC-CRM coverage
3. Validate Wikidata Q-numbers
4. Generate mapping quality report

### Phase 4: Documentation (1 hour)

1. Update AGENTS.md with mapping workflow
2. Create ontology mapping reference guide
3. Generate mapping statistics report
4. Update FeaturePlace.yaml with ontology references

**Total estimated time**: 7 hours

---

## References

- **CIDOC-CRM Specification**: http://www.cidoc-crm.org/html/cidoc_crm_v7.1.3.html
- **Schema.org**: https://schema.org/
- **DBpedia Ontology**: https://dbpedia.org/ontology/
- **DBpedia Wikidata Mappings**: `/data/ontology/dbpedia_wikidata_mappings.ttl`
- **DBpedia Heritage Classes**: `/data/ontology/dbpedia_heritage_classes.ttl`
- **GeoSPARQL**: https://www.ogc.org/standards/geosparql

---

**Next Step**: Implement Phase 1 automated mapping script
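
A minimal, standard-library-only sketch of how the Phase 1 script could combine Step 1, Step 2, and the default fallback. It rests on several assumptions: the mapping file is taken to contain one `dbo:… owl:equivalentClass wikidata:Q…` triple per line (the shape of the `grep` output under Rule 4), only a subset of the hypernym rules is included, and the names `parse_equivalences`, `add_unique`, and `map_feature` are hypothetical helpers introduced for illustration. A real Turtle file may use full IRIs or multi-line statements, in which case a proper RDF parser would be needed instead of the regular expression.

```python
import re

# Assumed line shape (from the grep output under Rule 4):
#   dbo:Museum owl:equivalentClass wikidata:Q33506
EQUIV_RE = re.compile(r"(dbo:\w+)\s+owl:equivalentClass\s+wikidata:(Q\d+)")

# Subset of the Step 2 hypernym rules
HYPERNYM_RULES = {
    "building": ["crm:E22_Human-Made_Object", "dbo:Building"],
    "heritage site": ["crm:E27_Site", "dbo:HistoricPlace"],
    "museum": ["schema:Museum", "dbo:Museum"],
}

# Default fallback class from "Default Fallback Mappings"
FALLBACK_CLASS = "crm:E27_Site"


def parse_equivalences(ttl_text):
    """Return {'wd:Q…': 'dbo:…'}, normalising the wikidata: prefix to wd:."""
    return {f"wd:{q}": dbo for dbo, q in EQUIV_RE.findall(ttl_text)}


def add_unique(feature, onto_class):
    """Append onto_class to exact_mappings unless it is already present."""
    exact = feature.setdefault("exact_mappings", [])
    if onto_class not in exact:
        exact.append(onto_class)


def map_feature(feature, equivalences):
    """Apply Step 1, Step 2, then the fallback; mutates and returns feature."""
    # Step 1: direct DBpedia equivalence (high confidence)
    dbo_class = equivalences.get(feature["meaning"])
    if dbo_class:
        add_unique(feature, dbo_class)
        feature["mapping_confidence"] = "high"

    # Step 2: hypernym rules (medium confidence unless already high)
    for hypernym in feature.get("hypernyms", []):
        for onto_class in HYPERNYM_RULES.get(hypernym, []):
            add_unique(feature, onto_class)
            feature.setdefault("mapping_confidence", "medium")

    # Fallback: nothing matched, so use the general site class (low confidence)
    if not feature.get("exact_mappings"):
        add_unique(feature, FALLBACK_CLASS)
        feature["mapping_confidence"] = "low"
        feature["mapping_note"] = (
            "No specific ontology class found. Using general site class."
        )
    return feature


# Illustrative input only; the real script would read the .ttl and enum files.
ttl = "dbo:Museum owl:equivalentClass wikidata:Q33506 .\n"
features = [
    {"meaning": "wd:Q33506", "hypernyms": ["museum"]},
    {"meaning": "wd:Q1802963", "hypernyms": ["building"]},
    {"meaning": "wd:Q???", "hypernyms": []},
]
equivalences = parse_equivalences(ttl)
mapped = [map_feature(f, equivalences) for f in features]
for f in mapped:
    print(f["meaning"], f["exact_mappings"], f["mapping_confidence"])
```

Keeping the three stages in one function makes the confidence ordering explicit: a DBpedia equivalence always wins, hypernym rules only add classes, and the fallback fires only when both earlier stages produced nothing.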