DBpedia Ontology Integration for Heritage Custodian Project

Date: 2025-11-20
Purpose: Document DBpedia Ontology (DBO) conventions for mapping Wikidata entities to specialized heritage ontologies


Executive Summary

DBpedia Ontology (DBO) provides a critical bridge between Wikidata entities and formal ontology classes. This document establishes conventions for integrating DBO mappings into the heritage custodian ontology enrichment workflow.

Key Finding: DBpedia already maps many Wikidata GLAM entities to ontology classes via owl:equivalentClass assertions. We should leverage these existing mappings instead of creating them from scratch.


DBpedia Ontology Overview

Namespace: http://dbpedia.org/ontology/
Prefix: dbo:
Coverage: 768 classes, 3000 properties, ~4.2M instances
Scope: Cross-domain ontology (shallow but broad coverage)

Why DBpedia Matters for This Project

1. Pre-existing Wikidata Mappings

DBpedia already maps many heritage institution Wikidata entities to ontology classes:

dbo:Museum owl:equivalentClass wd:Q33506 .
dbo:Library owl:equivalentClass wd:Q7075 .
dbo:Archive owl:equivalentClass wd:Q166118 .

Benefit: We can use DBpedia as an intermediary to discover ontology mappings for Wikidata entities.

2. Schema.org Alignment

DBpedia classes map to Schema.org (which we already use):

dbo:Library owl:equivalentClass schema:Library .

Benefit: DBpedia validates our existing Schema.org mappings.

3. Domain-Specific Properties

DBpedia defines heritage-specific properties:

  • dbo:collection - Museum collections
  • dbo:curator - Museum curator
  • dbo:museumType - Museum specialization
  • dbo:isil - ISIL code (for libraries)
  • dbo:numberOfCollectionItems - Collection size

Benefit: We can reference DBpedia properties in our mappings instead of inventing custom ones.


DBpedia Heritage Classes

Museums

Class: dbo:Museum
Wikidata: wd:Q33506
Subclass of: dbo:Building
Properties:

  • dbo:collection (museum collections)
  • dbo:curator (curator name)
  • dbo:museumType (specialization: art, history, science, etc.)

Example RDF:

<http://dbpedia.org/resource/Rijksmuseum>
  rdf:type dbo:Museum ;
  dbo:collection "Dutch Golden Age paintings" ;
  dbo:curator "Taco Dibbits" ;
  dbo:museumType "Art museum" .

Libraries

Class: dbo:Library
Wikidata: wd:Q7075
Schema.org: schema:Library
Subclass of: dbo:EducationalInstitution
Properties:

  • dbo:isil (ISIL code)
  • dbo:numberOfCollectionItems (collection size)

Example RDF:

<http://dbpedia.org/resource/Library_of_Congress>
  rdf:type dbo:Library ;
  dbo:isil "US-DLC" ;
  dbo:numberOfCollectionItems 17000000 .

Archives

Class: dbo:Archive
Wikidata: wd:Q166118
Subclass of: dbo:CollectionOfValuables
Properties: (fewer specialized properties than Museum/Library)

Example RDF:

<http://dbpedia.org/resource/National_Archives_and_Records_Administration>
  rdf:type dbo:Archive ;
  rdfs:label "National Archives and Records Administration"@en .

Integration Workflow for Ontology Enrichment

Step 1: Check DBpedia for Wikidata Mapping

When enriching a Wikidata entity (e.g., Q2772772 - military museum):

# Query DBpedia SPARQL endpoint
SELECT ?dboClass WHERE {
  ?dboClass owl:equivalentClass <http://www.wikidata.org/entity/Q2772772> .
}

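If a match is found: use the DBpedia class as the secondary/tertiary ontology reference. If no match is found, which is likely for narrow classes, repeat the query for a broader Wikidata class; the military museum example later in this document maps Q2772772 to dbo:Museum via wd:Q33506 (museum).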
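
The lookup order for a narrow class such as Q2772772 can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name, the `broader` argument, and the in-memory table are hypothetical, the table holds only the three mappings listed earlier in this document, and how the broader Wikidata classes are collected is left out.

```python
from typing import Dict, Optional, Sequence, Tuple

# The three class mappings listed in this document. A real run would
# query the DBpedia endpoint instead of this in-memory table.
KNOWN_EQUIVALENTS: Dict[str, str] = {
    "Q33506": "dbo:Museum",
    "Q7075": "dbo:Library",
    "Q166118": "dbo:Archive",
}


def resolve_dbo_class(
    qid: str,
    broader: Sequence[str] = (),
    known: Dict[str, str] = KNOWN_EQUIVALENTS,
) -> Optional[Tuple[str, str]]:
    """Return (dbo_class, matched_qid) for qid or its first mapped broader class.

    `broader` is an ordered list of more general Wikidata classes,
    nearest first (an assumption of this sketch).
    """
    for candidate in (qid, *broader):
        if candidate in known:
            return known[candidate], candidate
    return None


# Military museum falls back to the general museum class.
print(resolve_dbo_class("Q2772772", broader=["Q33506"]))
# → ('dbo:Museum', 'Q33506')
```

Returning the matched QID alongside the class keeps enough information to fill both dbpedia_class and dbpedia_equivalent_wikidata in the mapping.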

Step 2: Discover DBpedia Subclass Hierarchy

# Find superclasses
SELECT ?superclass WHERE {
  dbo:Museum rdfs:subClassOf ?superclass .
}
# Result: dbo:Building

Use this to understand where DBpedia places the entity in the ontology hierarchy.
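
Once the direct superclass edges have been fetched, walking them upward is a small loop. The sketch below assumes a single parent per class and uses only the three subclass relations stated in this document; the `PARENTS` table and the function name are illustrative.

```python
from typing import Dict, List

# Direct rdfs:subClassOf edges stated in this document. A full run
# would load these from the DBpedia endpoint.
PARENTS: Dict[str, str] = {
    "dbo:Museum": "dbo:Building",
    "dbo:Library": "dbo:EducationalInstitution",
    "dbo:Archive": "dbo:CollectionOfValuables",
}


def superclass_chain(cls: str, parents: Dict[str, str] = PARENTS) -> List[str]:
    """Follow single-parent subClassOf edges upward until none is known."""
    chain: List[str] = []
    seen = {cls}
    while cls in parents:
        cls = parents[cls]
        if cls in seen:  # guard against cyclic input data
            break
        seen.add(cls)
        chain.append(cls)
    return chain


print(superclass_chain("dbo:Museum"))
# → ['dbo:Building']
```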

Step 3: Extract DBpedia Properties

# Find properties applicable to Museum class
SELECT DISTINCT ?property WHERE {
  ?property rdfs:domain dbo:Museum .
}

Result:

  • dbo:collection
  • dbo:curator
  • dbo:museumType

Action: Reference these properties in the properties: section of our ontology mapping.

Step 4: Document DBpedia Mapping in YAML

ontology_mapping:
  wikidata_source: Q2772772
  dbpedia_class: dbo:Museum  # ← ADD THIS
  dbpedia_equivalent_wikidata: wd:Q33506  # ← ADD THIS
  
  custodian_ontology:
    public_sector:
      class: cpov:PublicOrganisation
      secondary_class: schema:Museum
      tertiary_class: dbo:Museum  # ← REFERENCE DBpedia
      quaternary_class: crm:E39_Actor
      
      properties:
      - label: dbo:collection  # ← USE DBpedia property
        value:
        - label: Museum collections
      - label: dbo:curator  # ← USE DBpedia property
        value:
        - label: Curator name
      - label: dbo:museumType  # ← USE DBpedia property
        value:
        - label: Museum specialization (military, art, history, etc.)

DBpedia Advantages Over Wikidata

| Feature | Wikidata | DBpedia |
| --- | --- | --- |
| Ontology Structure | Flat entity graph | Hierarchical class ontology |
| Property Definitions | No formal domains/ranges | Typed properties with domain/range |
| OWL Semantics | Limited OWL support | Full OWL ontology |
| Reasoning Support | Manual queries | OWL reasoning possible |
| Multilingual Labels | Excellent | Good (40+ languages) |
| Heritage Coverage | Comprehensive instances | Structured classes + properties |

Use Case: Wikidata provides entity instances; DBpedia provides ontology structure.


Updated Ontology Mapping Template

New Fields to Add

ontology_mapping:
  wikidata_source: Q[number]
  
  # NEW: DBpedia integration
  dbpedia_mapping:
    dbpedia_class: dbo:[ClassName]  # If DBpedia has equivalent class
    dbpedia_equivalent_wikidata: wd:Q[number]  # Wikidata entity DBpedia maps to
    dbpedia_properties:  # DBpedia-specific properties to use
    - dbo:collection
    - dbo:curator
    - dbo:isil
    sparql_query: |  # SPARQL query used to discover mapping
      SELECT ?dboClass WHERE {
        ?dboClass owl:equivalentClass <http://www.wikidata.org/entity/Q[number]> .
      }
  
  semantic_aspects: [...]
  complexity_score: N
  
  custodian_ontology:
    public_sector:
      class: cpov:PublicOrganisation
      secondary_class: schema:Museum
      tertiary_class: dbo:Museum  # ← REFERENCE DBpedia class
      quaternary_class: crm:E39_Actor
      
      properties:
      - label: dbo:collection  # ← USE DBpedia properties
        value:
        - label: Collection description
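
Filling the new dbpedia_mapping fields by hand is error-prone, so a small helper may be useful. The sketch below builds the structure as a plain dictionary ready for YAML serialization. The function name and the QID check are assumptions, as is the choice to embed the QID that DBpedia actually maps to (rather than the source QID) in the recorded query.

```python
import re
from typing import Dict, Iterable

# Wikidata item identifiers: "Q" followed by a positive integer.
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def build_dbpedia_mapping(
    source_qid: str,
    dbo_class: str,
    equivalent_qid: str,
    properties: Iterable[str],
) -> Dict[str, object]:
    """Return the wikidata_source and dbpedia_mapping fields of the template."""
    for qid in (source_qid, equivalent_qid):
        if not QID_PATTERN.fullmatch(qid):
            raise ValueError(f"not a Wikidata QID: {qid!r}")
    # Record the discovery query so the mapping can be re-checked later.
    query = (
        "SELECT ?dboClass WHERE {\n"
        f"  ?dboClass owl:equivalentClass <http://www.wikidata.org/entity/{equivalent_qid}> .\n"
        "}\n"
    )
    return {
        "wikidata_source": source_qid,
        "dbpedia_mapping": {
            "dbpedia_class": dbo_class,
            "dbpedia_equivalent_wikidata": f"wd:{equivalent_qid}",
            "dbpedia_properties": list(properties),
            "sparql_query": query,
        },
    }


mapping = build_dbpedia_mapping(
    "Q2772772", "dbo:Museum", "Q33506", ["dbo:collection", "dbo:curator"]
)
print(mapping["dbpedia_mapping"]["dbpedia_equivalent_wikidata"])
# → wd:Q33506
```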

DBpedia Properties for Heritage Institutions

Museum Properties

| Property | Domain | Range | Description |
| --- | --- | --- | --- |
| dbo:collection | dbo:Museum | xsd:string | Collections held by museum |
| dbo:curator | dbo:Museum | dbo:Person | Museum curator |
| dbo:museumType | dbo:Museum | xsd:string | Museum specialization |

Library Properties

| Property | Domain | Range | Description |
| --- | --- | --- | --- |
| dbo:isil | dbo:Library | xsd:string | ISIL code |
| dbo:numberOfCollectionItems | dbo:Library | xsd:integer | Collection size |

General Organizational Properties

| Property | Domain | Range | Description |
| --- | --- | --- | --- |
| dbo:foundingDate | dbo:Organisation | xsd:date | Founding date |
| dbo:dissolutionDate | dbo:Organisation | xsd:date | Closure date |
| dbo:location | dbo:Organisation | dbo:Place | Physical location |
| dbo:affiliation | dbo:Organisation | dbo:Organisation | Parent organization |
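
The domain column can double as a lint check when properties are attached to a mapping. The sketch below copies the domain and range values from the tables above and reports properties whose declared domain differs from the class being mapped. The function name is hypothetical, and the check deliberately ignores superclass domains.

```python
from typing import Dict, Iterable, List, Tuple

# (domain, range) pairs copied from the property tables in this document.
PROPERTY_SIGNATURES: Dict[str, Tuple[str, str]] = {
    "dbo:collection": ("dbo:Museum", "xsd:string"),
    "dbo:curator": ("dbo:Museum", "dbo:Person"),
    "dbo:museumType": ("dbo:Museum", "xsd:string"),
    "dbo:isil": ("dbo:Library", "xsd:string"),
    "dbo:numberOfCollectionItems": ("dbo:Library", "xsd:integer"),
    "dbo:foundingDate": ("dbo:Organisation", "xsd:date"),
    "dbo:dissolutionDate": ("dbo:Organisation", "xsd:date"),
    "dbo:location": ("dbo:Organisation", "dbo:Place"),
    "dbo:affiliation": ("dbo:Organisation", "dbo:Organisation"),
}


def domain_mismatches(
    cls: str,
    props: Iterable[str],
    signatures: Dict[str, Tuple[str, str]] = PROPERTY_SIGNATURES,
) -> List[Tuple[str, str]]:
    """Return (property, declared_domain) pairs whose domain is not cls.

    Properties missing from `signatures` are skipped.
    """
    return [
        (prop, signatures[prop][0])
        for prop in props
        if prop in signatures and signatures[prop][0] != cls
    ]


print(domain_mismatches("dbo:Museum", ["dbo:collection", "dbo:isil"]))
# → [('dbo:isil', 'dbo:Library')]
```

A mismatch is not an error under RDFS semantics: a reasoner would instead infer that the subject also belongs to the declared domain class. The check is only a prompt to confirm that the inference is intended.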

Comparison: CPOV vs Schema.org vs DBpedia

Museum Example

# CPOV (EU Public Sector)
<http://example.org/rijksmuseum>
  rdf:type cpov:PublicOrganisation ;
  skos:prefLabel "Rijksmuseum" ;
  dct:identifier "NL-AmRMA" .  # ISIL

# Schema.org (Web Semantics)
<http://example.org/rijksmuseum>
  rdf:type schema:Museum ;
  schema:name "Rijksmuseum" ;
  schema:identifier "NL-AmRMA" .

# DBpedia (Cross-Domain Ontology)
<http://example.org/rijksmuseum>
  rdf:type dbo:Museum ;
  rdfs:label "Rijksmuseum" ;
  dbo:isil "NL-AmRMA" ;
  dbo:collection "Dutch Golden Age paintings" ;
  dbo:curator "Taco Dibbits" .

Decision: Use ALL THREE in multi-typed assertions for maximum interoperability.
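
A multi-typed assertion simply lists all three classes on one subject. The following sketch renders such a statement as Turtle text. The function name is illustrative, prefix declarations are omitted in the same way as in the snippets above, and a production pipeline would more likely use an RDF library than string formatting.

```python
from typing import Sequence


def multi_typed_turtle(subject_iri: str, classes: Sequence[str], label: str) -> str:
    """Render one subject with several rdf:type values as a Turtle snippet."""
    if not classes:
        raise ValueError("at least one class is required")
    # Turtle separates multiple objects of the same predicate with commas.
    types = " ,\n    ".join(classes)
    # Escape backslashes and double quotes for a Turtle string literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"<{subject_iri}>\n"
        f"  rdf:type {types} ;\n"
        f'  rdfs:label "{escaped}" .\n'
    )


print(
    multi_typed_turtle(
        "http://example.org/rijksmuseum",
        ["cpov:PublicOrganisation", "schema:Museum", "dbo:Museum"],
        "Rijksmuseum",
    )
)
```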


SPARQL Queries for DBpedia Integration

Query 1: Find DBpedia Class for Wikidata Entity

PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>

SELECT ?dboClass ?label WHERE {
  ?dboClass owl:equivalentClass wd:Q2772772 .
  ?dboClass rdfs:label ?label .
  FILTER(LANG(?label) = "en")
}

Query 2: Find Subclasses of a DBpedia Class

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?class ?label WHERE {
  ?class rdfs:subClassOf* dbo:Museum .
  ?class rdfs:label ?label .
  FILTER(LANG(?label) = "en")
}

Query 3: Find DBpedia Properties for a Class

PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?property ?label WHERE {
  ?property rdfs:domain dbo:Museum .
  ?property rdfs:label ?label .
  FILTER(LANG(?label) = "en")
}
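
To run these queries from a script, the query text has to be URL-encoded and the JSON response flattened. The sketch below uses only the standard library. The helper names are hypothetical, the `format=json` parameter follows the cache script in this document, and the sample payload is hand-written in the W3C SPARQL JSON results layout rather than captured from the endpoint.

```python
import json
from typing import Dict, List
from urllib.parse import urlencode

DBPEDIA_SPARQL = "http://dbpedia.org/sparql"


def sparql_get_url(query: str, endpoint: str = DBPEDIA_SPARQL) -> str:
    """Build a GET URL that submits `query` and asks for JSON results."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


def rows_from_results(payload: str) -> List[Dict[str, str]]:
    """Flatten a SPARQL JSON results document into one dict per row."""
    bindings = json.loads(payload)["results"]["bindings"]
    return [{var: cell["value"] for var, cell in row.items()} for row in bindings]


# Hand-written sample in the SPARQL JSON results layout (not real output).
SAMPLE = """
{"head": {"vars": ["property", "label"]},
 "results": {"bindings": [
   {"property": {"type": "uri", "value": "http://dbpedia.org/ontology/curator"},
    "label": {"type": "literal", "xml:lang": "en", "value": "curator"}}
 ]}}
"""

for row in rows_from_results(SAMPLE):
    print(row["property"], row["label"])
```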

Implementation Recommendations

Recommendation 1: Add DBpedia as Fourth Ontology Layer

Current: CPOV (primary) + Schema.org (secondary) + CIDOC-CRM (tertiary)
Proposed: CPOV (primary) + Schema.org (secondary) + DBpedia (tertiary) + CIDOC-CRM (quaternary)

Rationale: DBpedia bridges Wikidata entities to formal ontologies.

Recommendation 2: Use DBpedia Properties in LinkML Schema

Current: Custom properties or Schema.org properties
Proposed: Reference DBpedia properties when available

Example (LinkML schema):

HeritageCustodian:
  slots:
    collection_description:
      slot_uri: dbo:collection  # ← Map to DBpedia property
    curator_name:
      slot_uri: dbo:curator
    museum_type:
      slot_uri: dbo:museumType
    isil_code:
      slot_uri: dbo:isil
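
Each slot_uri above is a CURIE that resolves to a full IRI through the schema's prefix map. A minimal sketch of that expansion, using only the two namespaces given in this document (the function name and the `PREFIXES` table are illustrative, and LinkML performs this expansion itself from the schema's declared prefixes):

```python
from typing import Dict

# Namespaces stated in this document.
PREFIXES: Dict[str, str] = {
    "dbo": "http://dbpedia.org/ontology/",
    "cpov": "http://data.europa.eu/m8g/",
}


def expand_curie(curie: str, prefixes: Dict[str, str] = PREFIXES) -> str:
    """Expand 'prefix:local' to a full IRI using the prefix map."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + local


print(expand_curie("dbo:isil"))
# → http://dbpedia.org/ontology/isil
```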

Recommendation 3: Create DBpedia Mapping Cache

Problem: Querying DBpedia SPARQL endpoint for every entity is slow.
Solution: Pre-cache Wikidata → DBpedia mappings for common heritage classes.

Script (scripts/cache_dbpedia_mappings.py):

import requests
import yaml  # PyYAML; required for yaml.dump below

DBPEDIA_SPARQL = "http://dbpedia.org/sparql"

def get_dbpedia_class(wikidata_id):
    """Query DBpedia for equivalent class of Wikidata entity.

    Returns the class IRI as a string, or None if DBpedia has no mapping.
    """
    query = f"""
    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    PREFIX wd: <http://www.wikidata.org/entity/>

    SELECT ?dboClass WHERE {{
      ?dboClass owl:equivalentClass wd:{wikidata_id} .
    }}
    """
    response = requests.get(
        DBPEDIA_SPARQL,
        params={'query': query, 'format': 'json'},
        timeout=30,  # do not hang indefinitely on a slow endpoint
    )
    response.raise_for_status()  # surface HTTP errors instead of parsing an error page
    results = response.json()['results']['bindings']
    if results:
        return results[0]['dboClass']['value']
    return None

# Cache mappings for GLAM entities.
# The source list is elided after Q2772772; extend it with further QIDs as needed.
cache = {}
for qid in ['Q33506', 'Q7075', 'Q166118', 'Q2772772']:
    cache[qid] = get_dbpedia_class(qid)

# Save to YAML
with open('data/ontology/dbpedia_wikidata_mappings.yaml', 'w') as f:
    yaml.dump(cache, f)
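
At enrichment time, the saved mappings can be consulted before the endpoint is contacted. The sketch below shows that cache-first lookup with the fetch function passed in as a parameter. The name `cached_lookup` and the decision to store negative results (None) are assumptions, not part of the script above.

```python
from typing import Callable, Dict, Optional


def cached_lookup(
    qid: str,
    cache: Dict[str, Optional[str]],
    fetch: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the cached class for qid, calling fetch only on a miss.

    A None result is stored as well, so an entity without a DBpedia
    mapping is not queried again.
    """
    if qid not in cache:
        cache[qid] = fetch(qid)
    return cache[qid]


# Usage with a stand-in fetch function that records its calls.
calls = []

def fake_fetch(qid: str) -> Optional[str]:
    calls.append(qid)
    return "http://dbpedia.org/ontology/Museum" if qid == "Q33506" else None

cache: Dict[str, Optional[str]] = {}
cached_lookup("Q33506", cache, fake_fetch)
cached_lookup("Q33506", cache, fake_fetch)  # served from the cache
print(len(calls))
# → 1
```

Passing `fetch` as an argument keeps the lookup testable without network access; in the script above it would be get_dbpedia_class.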

Recommendation 4: Enrich Ontology Mapping Workflow

Updated workflow (.opencode/agent/ontology-mapping-rules.md):

  1. Read Wikidata entity metadata
  2. Query DBpedia for equivalent class ← NEW STEP
  3. Search base ontologies (CPOV, TOOI, Schema.org, CIDOC-CRM)
  4. Reference DBpedia properties ← NEW STEP
  5. Map to ontology classes
  6. Document rationale
  7. Write ontology_mapping YAML

Example: Military Museum with DBpedia Integration

- label: Q2772772
  hypernym:
  - museum
  type:
  - M
  ontology_mapping:
    wikidata_source: Q2772772
    
    # DBpedia integration
    dbpedia_mapping:
      dbpedia_class: dbo:Museum
      dbpedia_equivalent_wikidata: wd:Q33506
      dbpedia_subclass_of: dbo:Building
      dbpedia_properties:
      - dbo:collection
      - dbo:curator
      - dbo:museumType
      sparql_discovery_date: "2025-11-20"
    
    semantic_aspects:
    - custodian
    - collections
    
    complexity_score: 4
    
    custodian_ontology:
      public_sector:
        class: cpov:PublicOrganisation
        namespace: http://data.europa.eu/m8g/
        secondary_class: schema:Museum
        tertiary_class: dbo:Museum  # ← DBpedia class
        quaternary_class: crm:E39_Actor
        
        properties:
        - label: dbo:collection
          value:
          - label: Military artifacts and archival records
        - label: dbo:curator
          value:
          - label: Museum curator name
        - label: dbo:museumType
          value:
          - label: Military history specialization
        - label: dct:identifier
          value:
          - label: ISIL code
        - label: schema:url
          value:
          - label: Official website
    
    collections_ontology:
      museum_collections:
        class: crm:E78_Curated_Holding
        properties:
        - label: dbo:collection  # ← Reference DBpedia property
          value:
          - label: Military artifacts description

Next Steps

  1. Document DBpedia integration conventions (THIS DOCUMENT)
  2. Create Wikidata → DBpedia mapping cache script
  3. Update .opencode/agent/ontology-mapping-rules.md with DBpedia step
  4. Retrofit existing ontology mappings (Q1802963, Q3694, Q2927789) with DBpedia references
  5. Add dbpedia_class field to LinkML schema
  6. Continue ontology enrichment with DBpedia integration

Version: 1.0
Last Updated: 2025-11-20
Maintained By: Heritage Custodian Ontology Project