# DBpedia Ontology Integration for Heritage Custodian Project

**Date**: 2025-11-20
**Purpose**: Document DBpedia Ontology (DBO) conventions for mapping Wikidata entities to specialized heritage ontologies

---

## Executive Summary

**DBpedia Ontology (DBO)** provides a critical bridge between Wikidata entities and formal ontology classes. This document establishes conventions for integrating DBO mappings into the heritage custodian ontology enrichment workflow.

**Key Finding**: DBpedia already maps many Wikidata GLAM entities to ontology classes via `owl:equivalentClass` assertions. We should leverage these existing mappings instead of creating them from scratch.

---

## DBpedia Ontology Overview

**Namespace**: `http://dbpedia.org/ontology/`
**Prefix**: `dbo:`
**Coverage**: 768 classes, 3000 properties, ~4.2M instances
**Scope**: Cross-domain ontology (shallow but broad coverage)

### Key Resources

- **Ontology Browser**: http://dbpedia.org/ontology/
- **SPARQL Endpoint**: http://dbpedia.org/sparql
- **Mappings Wiki**: http://mappings.dbpedia.org
- **Archivo (Ontology Archive)**: https://archivo.dbpedia.org/info?o=http://dbpedia.org/ontology/
- **Development Version**: https://databus.dbpedia.org/ontologies/dbpedia.org/ontology--DEV

---

## Why DBpedia Matters for This Project

### 1. **Pre-existing Wikidata Mappings**

DBpedia already maps many heritage institution Wikidata entities to ontology classes:

```turtle
dbo:Museum owl:equivalentClass wd:Q33506 .
dbo:Library owl:equivalentClass wd:Q7075 .
dbo:Archive owl:equivalentClass wd:Q166118 .
```

**Benefit**: We can use DBpedia as an intermediary to discover ontology mappings for Wikidata entities.

### 2. **Schema.org Alignment**

DBpedia classes map to Schema.org (which we already use):

```turtle
dbo:Library owl:equivalentClass schema:Library .
```

**Benefit**: DBpedia validates our existing Schema.org mappings.

### 3. **Domain-Specific Properties**

DBpedia defines heritage-specific properties:

- `dbo:collection` - Museum collections
- `dbo:curator` - Museum curator
- `dbo:museumType` - Museum specialization
- `dbo:isil` - ISIL code (for libraries)
- `dbo:numberOfCollectionItems` - Collection size

**Benefit**: We can reference DBpedia properties in our mappings instead of inventing custom ones.

---

## DBpedia Heritage Classes

### Museums

**Class**: `dbo:Museum`
**Wikidata**: `wd:Q33506`
**Subclass of**: `dbo:Building`
**Properties**:
- `dbo:collection` (museum collections)
- `dbo:curator` (curator name)
- `dbo:museumType` (specialization: art, history, science, etc.)

**Example RDF**:
```turtle
<http://dbpedia.org/resource/Rijksmuseum>
    rdf:type dbo:Museum ;
    dbo:collection "Dutch Golden Age paintings" ;
    dbo:curator "Taco Dibbits" ;
    dbo:museumType "Art museum" .
```

### Libraries

**Class**: `dbo:Library`
**Wikidata**: `wd:Q7075`
**Schema.org**: `schema:Library`
**Subclass of**: `dbo:EducationalInstitution`
**Properties**:
- `dbo:isil` (ISIL code)
- `dbo:numberOfCollectionItems` (collection size)

**Example RDF**:
```turtle
<http://dbpedia.org/resource/Library_of_Congress>
    rdf:type dbo:Library ;
    dbo:isil "US-DLC" ;
    dbo:numberOfCollectionItems 17000000 .
```

### Archives

**Class**: `dbo:Archive`
**Wikidata**: `wd:Q166118`
**Subclass of**: `dbo:CollectionOfValuables`
**Properties**: (fewer specialized properties than Museum/Library)

**Example RDF**:
```turtle
<http://dbpedia.org/resource/National_Archives_and_Records_Administration>
    rdf:type dbo:Archive ;
    rdfs:label "National Archives and Records Administration"@en .
```

---

## Integration Workflow for Ontology Enrichment

### Step 1: Check DBpedia for Wikidata Mapping

When enriching a Wikidata entity (e.g., Q2772772 - military museum):

```sparql
# Query DBpedia SPARQL endpoint
SELECT ?dboClass WHERE {
  ?dboClass owl:equivalentClass <http://www.wikidata.org/entity/Q2772772> .
}
```

**If match found**: Use the DBpedia class as a secondary/tertiary ontology reference.

**If no match is found**: Fall back to the closest broader Wikidata class that DBpedia does map. The military museum example at the end of this document does this: it records `dbo:Museum` as `dbpedia_class` and `wd:Q33506` (museum) as `dbpedia_equivalent_wikidata`.
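
The lookup for a specialized entity can be sketched as a walk up the class hierarchy until a mapped class is reached. This is a minimal offline sketch, not the project's implementation: `EQUIVALENTS` copies the three `owl:equivalentClass` assertions quoted earlier, while `PARENTS`, `resolve_dbo_class`, and `max_hops` are hypothetical names, and the single parent link is an assumed excerpt of Wikidata's "subclass of" data.

```python
from typing import Dict, Optional, Tuple

# Mirrors the owl:equivalentClass assertions quoted in this document.
EQUIVALENTS: Dict[str, str] = {
    "Q33506": "dbo:Museum",
    "Q7075": "dbo:Library",
    "Q166118": "dbo:Archive",
}

# Assumption: a hand-written excerpt of Wikidata "subclass of" links.
PARENTS: Dict[str, str] = {"Q2772772": "Q33506"}


def resolve_dbo_class(qid: str, max_hops: int = 5) -> Optional[Tuple[str, str]]:
    """Return (dbo_class, matched_qid), walking up PARENTS until a mapping is found."""
    current = qid
    for _ in range(max_hops + 1):
        if current in EQUIVALENTS:
            return EQUIVALENTS[current], current
        if current not in PARENTS:
            return None  # no mapped ancestor known
        current = PARENTS[current]
    return None  # gave up after max_hops


print(resolve_dbo_class("Q2772772"))  # ('dbo:Museum', 'Q33506')
```

The second element of the result is the value that belongs in `dbpedia_equivalent_wikidata`, which keeps the indirect nature of the mapping visible.
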

### Step 2: Discover DBpedia Subclass Hierarchy

```sparql
# Find superclasses
SELECT ?superclass WHERE {
  dbo:Museum rdfs:subClassOf ?superclass .
}
# Result: dbo:Building
```

**Use this to understand** where DBpedia places the entity in the ontology hierarchy.

### Step 3: Extract DBpedia Properties

```sparql
# Find properties applicable to Museum class
SELECT DISTINCT ?property WHERE {
  ?property rdfs:domain dbo:Museum .
}
```

**Result**:
- `dbo:collection`
- `dbo:curator`
- `dbo:museumType`

**Action**: Reference these properties in our ontology mapping `properties:` section.
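
Turning the endpoint's answer into the `dbo:` names used in the mapping files is a small transformation. The sketch below assumes the response follows the W3C SPARQL JSON results layout (`results` → `bindings`); `bindings_to_curies` is a hypothetical helper and the sample payload is hand-written, not real endpoint output.

```python
from typing import Any, Dict, List

DBO_NS = "http://dbpedia.org/ontology/"


def bindings_to_curies(results: Dict[str, Any], var: str = "property") -> List[str]:
    """Convert SPARQL JSON bindings for `var` into sorted, de-duplicated dbo: CURIEs."""
    curies = set()
    for row in results.get("results", {}).get("bindings", []):
        uri = row.get(var, {}).get("value", "")
        if uri.startswith(DBO_NS):  # ignore anything outside the dbo: namespace
            curies.add("dbo:" + uri[len(DBO_NS):])
    return sorted(curies)


# Hand-written sample in the SPARQL JSON results layout (illustrative only).
sample = {
    "head": {"vars": ["property"]},
    "results": {"bindings": [
        {"property": {"type": "uri", "value": DBO_NS + "curator"}},
        {"property": {"type": "uri", "value": DBO_NS + "collection"}},
        {"property": {"type": "uri", "value": DBO_NS + "curator"}},
    ]},
}

print(bindings_to_curies(sample))  # ['dbo:collection', 'dbo:curator']
```
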

### Step 4: Document DBpedia Mapping in YAML

```yaml
ontology_mapping:
  wikidata_source: Q2772772
  dbpedia_class: dbo:Museum  # ← ADD THIS
  dbpedia_equivalent_wikidata: wd:Q33506  # ← ADD THIS

  custodian_ontology:
    public_sector:
      class: cpov:PublicOrganisation
      secondary_class: schema:Museum
      tertiary_class: dbo:Museum  # ← REFERENCE DBpedia
      quaternary_class: crm:E39_Actor

      properties:
        - label: dbo:collection  # ← USE DBpedia property
          value:
            - label: Museum collections
        - label: dbo:curator  # ← USE DBpedia property
          value:
            - label: Curator name
        - label: dbo:museumType  # ← USE DBpedia property
          value:
            - label: Museum specialization (military, art, history, etc.)
```

---

## DBpedia Advantages Over Wikidata

| Feature | Wikidata | DBpedia |
|---------|----------|---------|
| **Ontology Structure** | Flat entity graph | Hierarchical class ontology |
| **Property Definitions** | No formal domains/ranges | Typed properties with domain/range |
| **OWL Semantics** | Limited OWL support | Full OWL ontology |
| **Reasoning Support** | Manual queries | OWL reasoning possible |
| **Multilingual Labels** | Excellent | Good (40+ languages) |
| **Heritage Coverage** | Comprehensive instances | Structured classes + properties |

**Use Case**: Wikidata provides entity instances; DBpedia provides ontology structure.

---

## Updated Ontology Mapping Template

### New Fields to Add

```yaml
ontology_mapping:
  wikidata_source: Q[number]

  # NEW: DBpedia integration
  dbpedia_mapping:
    dbpedia_class: dbo:[ClassName]  # If DBpedia has equivalent class
    dbpedia_equivalent_wikidata: wd:Q[number]  # Wikidata entity DBpedia maps to
    dbpedia_properties:  # DBpedia-specific properties to use
      - dbo:collection
      - dbo:curator
      - dbo:isil
    sparql_query: |  # SPARQL query used to discover mapping
      SELECT ?dboClass WHERE {
        ?dboClass owl:equivalentClass <http://www.wikidata.org/entity/Q[number]> .
      }

  semantic_aspects: [...]
  complexity_score: N

  custodian_ontology:
    public_sector:
      class: cpov:PublicOrganisation
      secondary_class: schema:Museum
      tertiary_class: dbo:Museum  # ← REFERENCE DBpedia class
      quaternary_class: crm:E39_Actor

      properties:
        - label: dbo:collection  # ← USE DBpedia properties
          value:
            - label: Collection description
```
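
Filling in the `dbpedia_mapping` block by hand invites typos in QIDs and prefixes, so it may help to generate it. The following is a sketch under assumptions: `build_dbpedia_mapping` and `QID_PATTERN` are hypothetical names, and the returned keys simply copy the field names from the template above.

```python
import re
from typing import Any, Dict, List

QID_PATTERN = re.compile(r"^Q[1-9][0-9]*$")


def build_dbpedia_mapping(dbo_class: str, equivalent_qid: str,
                          properties: List[str]) -> Dict[str, Any]:
    """Build a dict shaped like the `dbpedia_mapping` block of the template."""
    if not QID_PATTERN.match(equivalent_qid):
        raise ValueError(f"not a Wikidata QID: {equivalent_qid!r}")
    if not dbo_class.startswith("dbo:"):
        raise ValueError(f"expected a dbo: CURIE, got {dbo_class!r}")
    # Same discovery query as in the template, with the QID filled in.
    query = (
        "SELECT ?dboClass WHERE {\n"
        f"  ?dboClass owl:equivalentClass <http://www.wikidata.org/entity/{equivalent_qid}> .\n"
        "}\n"
    )
    return {
        "dbpedia_class": dbo_class,
        "dbpedia_equivalent_wikidata": f"wd:{equivalent_qid}",
        "dbpedia_properties": list(properties),
        "sparql_query": query,
    }


mapping = build_dbpedia_mapping(
    "dbo:Museum", "Q33506", ["dbo:collection", "dbo:curator", "dbo:museumType"]
)
print(mapping["dbpedia_equivalent_wikidata"])  # wd:Q33506
```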

---

## DBpedia Properties for Heritage Institutions

### Museum Properties

| Property | Domain | Range | Description |
|----------|--------|-------|-------------|
| `dbo:collection` | `dbo:Museum` | `xsd:string` | Collections held by museum |
| `dbo:curator` | `dbo:Museum` | `dbo:Person` | Museum curator |
| `dbo:museumType` | `dbo:Museum` | `xsd:string` | Museum specialization |

### Library Properties

| Property | Domain | Range | Description |
|----------|--------|-------|-------------|
| `dbo:isil` | `dbo:Library` | `xsd:string` | ISIL code |
| `dbo:numberOfCollectionItems` | `dbo:Library` | `xsd:integer` | Collection size |

### General Organizational Properties

| Property | Domain | Range | Description |
|----------|--------|-------|-------------|
| `dbo:foundingDate` | `dbo:Organisation` | `xsd:date` | Founding date |
| `dbo:dissolutionDate` | `dbo:Organisation` | `xsd:date` | Closure date |
| `dbo:location` | `dbo:Organisation` | `dbo:Place` | Physical location |
| `dbo:affiliation` | `dbo:Organisation` | `dbo:Organisation` | Parent organization |

---

## Comparison: CPOV vs Schema.org vs DBpedia

### Museum Example

```turtle
# CPOV (EU Public Sector)
<http://example.org/rijksmuseum>
    rdf:type cpov:PublicOrganisation ;
    skos:prefLabel "Rijksmuseum" ;
    dct:identifier "NL-AmRMA" .  # ISIL

# Schema.org (Web Semantics)
<http://example.org/rijksmuseum>
    rdf:type schema:Museum ;
    schema:name "Rijksmuseum" ;
    schema:identifier "NL-AmRMA" .

# DBpedia (Cross-Domain Ontology)
<http://example.org/rijksmuseum>
    rdf:type dbo:Museum ;
    rdfs:label "Rijksmuseum" ;
    dbo:isil "NL-AmRMA" ;
    dbo:collection "Dutch Golden Age paintings" ;
    dbo:curator "Taco Dibbits" .
```

**Decision**: Use ALL THREE in multi-typed assertions for maximum interoperability.
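
A multi-typed assertion means one subject carrying several `rdf:type` values. As a rough illustration using only the standard library, the hypothetical helper `multi_typed_turtle` below emits such a statement as a Turtle snippet; prefix declarations are deliberately omitted, as in the examples above, and a production pipeline would more plausibly use an RDF library.

```python
from typing import Iterable


def multi_typed_turtle(subject_iri: str, classes: Iterable[str], label: str) -> str:
    """Serialize one subject with several rdf:type values as a Turtle snippet."""
    types = " ,\n        ".join(classes)
    # Escape backslashes and double quotes so the label stays a valid literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"<{subject_iri}>\n"
        f"    rdf:type {types} ;\n"
        f'    rdfs:label "{escaped}" .\n'
    )


snippet = multi_typed_turtle(
    "http://example.org/rijksmuseum",
    ["cpov:PublicOrganisation", "schema:Museum", "dbo:Museum"],
    "Rijksmuseum",
)
print(snippet)
```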

---

## SPARQL Queries for DBpedia Integration

### Query 1: Find DBpedia Class for Wikidata Entity

```sparql
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wd: <http://www.wikidata.org/entity/>

SELECT ?dboClass ?label WHERE {
  ?dboClass owl:equivalentClass wd:Q2772772 .
  ?dboClass rdfs:label ?label .
  FILTER(LANG(?label) = "en")
}
```

### Query 2: Find All Museum-Related DBpedia Classes

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?class ?label WHERE {
  ?class rdfs:subClassOf* dbo:Museum .
  ?class rdfs:label ?label .
  FILTER(LANG(?label) = "en")
}
```

### Query 3: Find DBpedia Properties for a Class

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?property ?label WHERE {
  ?property rdfs:domain dbo:Museum .
  ?property rdfs:label ?label .
  FILTER(LANG(?label) = "en")
}
```

---

## Implementation Recommendations

### Recommendation 1: Add DBpedia as Fourth Ontology Layer

**Current**: CPOV (primary) + Schema.org (secondary) + CIDOC-CRM (tertiary)
**Proposed**: CPOV + Schema.org + **DBpedia** + CIDOC-CRM

**Rationale**: DBpedia bridges Wikidata entities to formal ontologies.

### Recommendation 2: Use DBpedia Properties in LinkML Schema

**Current**: Custom properties or Schema.org properties
**Proposed**: Reference DBpedia properties when available

**Example** (LinkML schema):
```yaml
HeritageCustodian:
  slots:
    collection_description:
      slot_uri: dbo:collection  # ← Map to DBpedia property
    curator_name:
      slot_uri: dbo:curator
    museum_type:
      slot_uri: dbo:museumType
    isil_code:
      slot_uri: dbo:isil
```

### Recommendation 3: Create DBpedia Mapping Cache

**Problem**: Querying the DBpedia SPARQL endpoint for every entity is slow.
**Solution**: Pre-cache Wikidata → DBpedia mappings for common heritage classes.

**Script** (`scripts/cache_dbpedia_mappings.py`):
```python
import requests
import yaml  # PyYAML; needed for the dump at the end

DBPEDIA_SPARQL = "http://dbpedia.org/sparql"


def get_dbpedia_class(wikidata_id):
    """Query DBpedia for equivalent class of Wikidata entity."""
    query = f"""
    PREFIX owl: <http://www.w3.org/2002/07/owl#>
    PREFIX wd: <http://www.wikidata.org/entity/>

    SELECT ?dboClass WHERE {{
      ?dboClass owl:equivalentClass wd:{wikidata_id} .
    }}
    """
    response = requests.get(
        DBPEDIA_SPARQL,
        params={'query': query, 'format': 'json'},
        timeout=30,  # do not hang forever on a slow endpoint
    )
    response.raise_for_status()  # surface HTTP errors instead of parsing an error page
    results = response.json()['results']['bindings']
    if results:
        return results[0]['dboClass']['value']
    return None  # DBpedia has no equivalent class for this entity


# Cache mappings for GLAM entities
cache = {}
for qid in ['Q33506', 'Q7075', 'Q166118', 'Q2772772']:  # ... extend with further QIDs
    cache[qid] = get_dbpedia_class(qid)

# Save to YAML
with open('data/ontology/dbpedia_wikidata_mappings.yaml', 'w') as f:
    yaml.safe_dump(cache, f)
```
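
At enrichment time the cache is only useful if lookups consult it before going to the network. One possible shape for that, sketched with hypothetical names (`make_cached_lookup`, `fake_fetch`): the wrapper takes any fetch function with the same signature as `get_dbpedia_class`, and a stand-in is used here so the example runs offline.

```python
from typing import Callable, Dict, List, Optional


def make_cached_lookup(
    cache: Dict[str, Optional[str]],
    fetch: Callable[[str], Optional[str]],
) -> Callable[[str], Optional[str]]:
    """Wrap `fetch` so that each QID is fetched at most once."""
    def lookup(qid: str) -> Optional[str]:
        if qid not in cache:  # a stored None ("no mapping") also counts as cached
            cache[qid] = fetch(qid)
        return cache[qid]
    return lookup


# Stand-in for get_dbpedia_class; records which QIDs reach the "endpoint".
calls: List[str] = []


def fake_fetch(qid: str) -> Optional[str]:
    calls.append(qid)
    return {"Q33506": "http://dbpedia.org/ontology/Museum"}.get(qid)


lookup = make_cached_lookup({}, fake_fetch)
for qid in ["Q33506", "Q33506", "Q2772772", "Q2772772"]:
    lookup(qid)

print(calls)  # ['Q33506', 'Q2772772']
```

Caching negative results matters here, since specialized classes without a direct DBpedia equivalent would otherwise be queried again on every run.
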

### Recommendation 4: Enrich Ontology Mapping Workflow

**Updated workflow** (`.opencode/agent/ontology-mapping-rules.md`):

1. Read Wikidata entity metadata
2. **Query DBpedia for equivalent class** ← NEW STEP
3. Search base ontologies (CPOV, TOOI, Schema.org, CIDOC-CRM)
4. **Reference DBpedia properties** ← NEW STEP
5. Map to ontology classes
6. Document rationale
7. Write ontology_mapping YAML

---

## Example: Military Museum with DBpedia Integration

```yaml
- label: Q2772772
  hypernym:
    - museum
  type:
    - M
  ontology_mapping:
    wikidata_source: Q2772772

    # DBpedia integration
    dbpedia_mapping:
      dbpedia_class: dbo:Museum
      dbpedia_equivalent_wikidata: wd:Q33506
      dbpedia_subclass_of: dbo:Building
      dbpedia_properties:
        - dbo:collection
        - dbo:curator
        - dbo:museumType
      sparql_discovery_date: "2025-11-20"

    semantic_aspects:
      - custodian
      - collections

    complexity_score: 4

    custodian_ontology:
      public_sector:
        class: cpov:PublicOrganisation
        namespace: http://data.europa.eu/m8g/
        secondary_class: schema:Museum
        tertiary_class: dbo:Museum  # ← DBpedia class
        quaternary_class: crm:E39_Actor

        properties:
          - label: dbo:collection
            value:
              - label: Military artifacts and archival records
          - label: dbo:curator
            value:
              - label: Museum curator name
          - label: dbo:museumType
            value:
              - label: Military history specialization
          - label: dct:identifier
            value:
              - label: ISIL code
          - label: schema:url
            value:
              - label: Official website

    collections_ontology:
      museum_collections:
        class: crm:E78_Curated_Holding
        properties:
          - label: dbo:collection  # ← Reference DBpedia property
            value:
              - label: Military artifacts description
```

---

## References

- **DBpedia Homepage**: https://www.dbpedia.org/
- **DBpedia Ontology**: http://dbpedia.org/ontology/
- **DBpedia SPARQL Endpoint**: http://dbpedia.org/sparql
- **DBpedia Mappings Wiki**: http://mappings.dbpedia.org
- **Archivo (DBpedia Ontology Archive)**: https://archivo.dbpedia.org/
- **DBpedia Databus**: https://databus.dbpedia.org/ontologies/dbpedia.org/ontology--DEV

---

## Next Steps

1. ✅ Document DBpedia integration conventions (THIS DOCUMENT)
2. ⏳ Create Wikidata → DBpedia mapping cache script
3. ⏳ Update `.opencode/agent/ontology-mapping-rules.md` with DBpedia step
4. ⏳ Retrofit existing ontology mappings (Q1802963, Q3694, Q2927789) with DBpedia references
5. ⏳ Add `dbpedia_class` field to LinkML schema
6. ⏳ Continue ontology enrichment with DBpedia integration

---

**Version**: 1.0
**Last Updated**: 2025-11-20
**Maintained By**: Heritage Custodian Ontology Project