glam/docs/ONTOLOGY_INTEGRATION.md
2025-11-19 23:25:22 +01:00

GHCID Ontology Integration Guide

Version: 1.0
Date: 2025-11-06
Schema Version: v0.2.0 (modular)


Table of Contents

  1. Introduction
  2. Ontology Landscape
  3. CIDOC-CRM Integration
  4. RiC-O Integration
  5. Schema.org Integration
  6. EU Core Public Organization Vocabulary (CPOV)
  7. Dutch TOOI Ontology
  8. W3C Organization Ontology
  9. PROV-O Integration
  10. Comprehensive Mapping Table
  11. RDF Serialization Examples
  12. Gap Analysis
  13. Implementation Recommendations

Introduction

Purpose

This document provides a comprehensive guide to integrating the Global Heritage Custodian Identifier (GHCID) schema with established heritage, archival, and organizational ontologies. The GHCID schema is designed to be interoperable with existing Linked Data standards while providing unique capabilities for persistent identification and change tracking across heritage institutions worldwide.

Scope

The GHCID project integrates with seven major ontologies:

  1. CIDOC Conceptual Reference Model (CIDOC-CRM) v7.1.3 - Museum and cultural heritage domain
  2. Records in Contexts Ontology (RiC-O) v1.1 - International archival standard
  3. Schema.org - Web discoverability and SEO
  4. EU Core Public Organization Vocabulary (CPOV) - European public sector organizations
  5. Dutch TOOI Ontology - Netherlands government organizations
  6. W3C Organization Ontology (ORG) - Generic organizational structures
  7. W3C PROV Ontology (PROV-O) - Provenance and data lineage

Integration Philosophy

The GHCID schema follows these principles:

  • Alignment over Replacement: Map GHCID concepts to ontology equivalents using owl:equivalentClass and rdfs:subClassOf
  • Selective Extension: Add GHCID-specific capabilities (persistent IDs, change history) as extensions
  • Multi-Ontology Compatibility: Support multiple ontologies simultaneously (e.g., a museum can be both cidoc:E74_Group and schema:Museum)
  • Namespace Preservation: Preserve original ontology namespaces in RDF serialization
  • Practical Interoperability: Focus on real-world use cases (data exchange, aggregation, discovery)

Ontology Landscape

File Locations

All ontology files are stored in data/ontology/:

data/ontology/
├── CIDOC_CRM_v7.1.3.rdf          # CIDOC-CRM OWL/RDF (948 lines)
├── RiC-O_1-1.rdf                  # RiC-O OWL/RDF (16,795 lines)
├── schemaorg.owl                  # Schema.org OWL (2.7 MB)
├── core-public-organisation-ap.ttl # CPOV Turtle (600+ lines)
├── tooiont.ttl                    # TOOI Turtle (Dutch gov)
├── org.ttl                        # W3C ORG Ontology
└── prov-o.ttl                     # W3C PROV-O

Namespace Prefixes

@prefix ghcid:   <https://w3id.org/heritage/custodian/> .
@prefix cidoc:   <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix rico:    <https://www.ica.org/standards/RiC/ontology#> .
@prefix schema:  <http://schema.org/> .
@prefix cpov:    <http://data.europa.eu/m8g/> .
@prefix tooiont: <https://identifier.overheid.nl/tooi/def/ont/> .
@prefix org:     <http://www.w3.org/ns/org#> .
@prefix prov:    <http://www.w3.org/ns/prov#> .
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .
@prefix dct:     <http://purl.org/dc/terms/> .
@prefix locn:    <http://www.w3.org/ns/locn#> .
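
The prefix declarations can also be mirrored in application code when compact names such as rico:CorporateBody need to be turned into full IRIs and back. The following is a minimal sketch in Python using only the standard library. The helper names expand_curie and compact_iri are hypothetical and not part of the GHCID codebase; the prefix map is taken from the listing above.

```python
# Prefix map taken from the Namespace Prefixes listing in this guide.
PREFIXES = {
    "ghcid": "https://w3id.org/heritage/custodian/",
    "cidoc": "http://www.cidoc-crm.org/cidoc-crm/",
    "rico": "https://www.ica.org/standards/RiC/ontology#",
    "schema": "http://schema.org/",
    "cpov": "http://data.europa.eu/m8g/",
    "tooiont": "https://identifier.overheid.nl/tooi/def/ont/",
    "org": "http://www.w3.org/ns/org#",
    "prov": "http://www.w3.org/ns/prov#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def expand_curie(curie):
    """Expand 'prefix:local' to a full IRI (hypothetical helper)."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return PREFIXES[prefix] + local


def compact_iri(iri):
    """Compact a full IRI to a CURIE, preferring the longest matching namespace.

    IRIs that match no known namespace are returned unchanged.
    """
    matches = [(ns, prefix) for prefix, ns in PREFIXES.items() if iri.startswith(ns)]
    if not matches:
        return iri
    ns, prefix = max(matches, key=lambda pair: len(pair[0]))
    return f"{prefix}:{iri[len(ns):]}"
```

Keeping the map in one place makes it harder for Turtle output and JSON-LD contexts to drift apart on a namespace.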

CIDOC-CRM Integration

Overview

CIDOC Conceptual Reference Model (CIDOC-CRM) is the basis of the ISO 21127 standard for cultural heritage information (the 2014 edition has since been revised as ISO 21127:2023). It provides a formal ontology for museum documentation, collection management, and heritage data integration.

Specification: data/ontology/CIDOC_CRM_v7.1.3.rdf
Namespace: http://www.cidoc-crm.org/cidoc-crm/
Relevance: Essential for museums, galleries, and cultural heritage institutions.

Key CIDOC-CRM Classes

| CIDOC-CRM Class | Description | GHCID Mapping |
|---|---|---|
| E39_Actor | Agents (people or groups) capable of actions | HeritageCustodian (superclass) |
| E74_Group | Organized collectives with identity | HeritageCustodian (primary mapping) |
| E5_Event | Changes of state in physical/conceptual items | ChangeEvent |
| E78_Curated_Holding | Collections assembled and managed | Collection |
| E53_Place | Geographic locations | Location |
| E42_Identifier | Identifiers assigned to entities | Identifier |

Class Alignments

HeritageCustodian → E74_Group

ghcid:HeritageCustodian
  rdfs:subClassOf cidoc:E74_Group ;
  owl:equivalentClass cidoc:E74_Group ;
  rdfs:label "Heritage Custodian Organization"@en ;
  rdfs:comment "A heritage institution (museum, library, archive, etc.) that holds and manages cultural heritage collections. Modeled as a CIDOC-CRM Group (E74) to enable integration with museum documentation standards."@en .

Rationale:

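  • E74_Group represents organizations with collective identity and purpose
  • E74_Group is a subclass of E39_Actor, enabling participation in events (E5_Event)
  • Museums, libraries, and archives fit the definition of "social entities" in CIDOC-CRM
  • Caution: this guide also declares ghcid:HeritageCustodian owl:equivalentClass rico:CorporateBody. Under OWL semantics the two equivalences together entail that cidoc:E74_Group and rico:CorporateBody are equivalent to each other. If that is not intended, keep only the rdfs:subClassOf axioms.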

ChangeEvent → E5_Event

ghcid:ChangeEvent
  rdfs:subClassOf cidoc:E5_Event ;
  rdfs:label "Organizational Change Event"@en ;
  rdfs:comment "Significant organizational changes (mergers, relocations, closures, etc.) modeled as CIDOC-CRM Events (E5) to track temporal evolution of heritage institutions."@en .

Properties:

  • cidoc:P4_has_time-span → ChangeEvent.event_date
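  • cidoc:P14_carried_out_by → Participating organizations (its domain is E7_Activity, a subclass of E5_Event)
  • cidoc:P11_had_participant → Affected organizations (P16_used_specific_object expects an E70_Thing, not an actor)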

Collection → E78_Curated_Holding

ghcid:Collection
  rdfs:subClassOf cidoc:E78_Curated_Holding ;
  rdfs:label "Heritage Collection"@en .

Property Mappings

| GHCID Property | CIDOC-CRM Property | Range |
|---|---|---|
| name | P1_is_identified_by + E41_Appellation | String |
| description | P3_has_note | String |
| locations | P74_has_current_or_former_residence | E53_Place |
| identifiers | P1_is_identified_by | E42_Identifier |
| change_history | P11i_participated_in | E5_Event |

CIDOC-CRM Benefits

  1. ISO Standard Compliance: Enables interoperability with museum sector systems
  2. Event Modeling: Rich temporal modeling for organizational changes
  3. Collection Integration: Direct mapping to collection management standards
  4. Property Graph: Detailed relationship tracking via CIDOC properties
  5. Time-Span Support: E52_Time-Span can express date ranges and imprecise bounds that a single ISO 8601 date value cannot

Limitations

  • Complexity: CIDOC-CRM's 90+ classes can be overwhelming
  • Museum-Centric: Less tailored for libraries and archives (use RiC-O for those)
  • Learning Curve: Requires expertise in cultural heritage documentation

RiC-O Integration

Overview

Records in Contexts Ontology (RiC-O) is the official ontology of the International Council on Archives (ICA). It is intended to supersede the older ICA descriptive standards (ISAD(G), ISAAR(CPF), ISDF, and ISDIAH) with a modern Linked Data approach.

Specification: data/ontology/RiC-O_1-1.rdf (16,795 lines)
Namespace: https://www.ica.org/standards/RiC/ontology#
Relevance: CRITICAL for archives and archival institutions.

Why RiC-O Matters for GHCID

RiC-O is the most important ontology for the archival sector:

  1. International Standard: Endorsed by ICA, used globally
  2. Rich Provenance: Extensive provenance modeling aligned with archival principles
  3. Relationship-Centric: Models complex organizational relationships (predecessors, successors, hierarchies)
  4. Archival Description: Purpose-built for describing archival records and their custodians
  5. Change Tracking: Native support for organizational change events

Key RiC-O Classes

| RiC-O Class | Description | GHCID Mapping |
|---|---|---|
| rico:Agent | Entities capable of actions | HeritageCustodian (superclass) |
| rico:CorporateBody | Organizations and institutions | HeritageCustodian (primary) |
| rico:RecordResource | Records and record sets | Collection |
| rico:Record | Individual records | DigitalObject |
| rico:RecordSet | Groups of records | Collection |
| rico:Activity | Actions performed by agents | ChangeEvent |
| rico:Place | Geographic locations | Location |
| rico:Identifier | Identifiers | Identifier |

Class Alignments

HeritageCustodian → rico:CorporateBody

ghcid:HeritageCustodian
  rdfs:subClassOf rico:CorporateBody ;
  owl:equivalentClass rico:CorporateBody ;
  rdfs:label "Heritage Custodian Organization"@en ;
  rdfs:comment "An archival institution, library, or museum modeled as a RiC-O Corporate Body to align with international archival standards (ICA RiC-O)."@en .

Rationale:

  • Archives ARE corporate bodies in RiC-O terminology
  • rico:CorporateBody supports organizational change tracking
  • Aligns with archival description practices worldwide

ChangeEvent → rico:Activity

ghcid:ChangeEvent
  rdfs:subClassOf rico:Activity ;
  rdfs:label "Organizational Change Activity"@en ;
  rdfs:comment "Organizational changes (mergers, splits, relocations) modeled as RiC-O Activities to track archival institution evolution over time."@en .

Properties:

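  • rico:hasBeginningDate → ChangeEvent.event_date (an object property pointing to a rico:Date; for a plain date literal use the datatype property rico:beginningDate)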
  • rico:hasEndDate → End of event (optional)
  • rico:hasActivityType → Maps to ChangeTypeEnum
  • rico:agentAssociatedWithActivity → Participating organizations

Property Mappings

| GHCID Property | RiC-O Property | Notes |
|---|---|---|
| name | rico:hasOrHadName + rico:Name | Use rico:Name class for structured names |
| alternative_names | rico:hasOrHadOtherName | Multiple alternative names |
| description | rico:scopeAndContent | Archival description field |
| institution_type | rico:hasOrganizationType | Maps to institution type taxonomy |
| locations | rico:hasOrHadLocation | Links to rico:Place |
| identifiers | rico:hasOrHadIdentifier | Links to rico:Identifier |
| change_history | rico:history | Narrative history field |
| change_history (events) | rico:agentIsTargetOfActivity | Link to rico:Activity |

Advanced RiC-O Features

Predecessor/Successor Relationships

RiC-O excels at modeling organizational lineage:

# Example: Merger creating Noord-Hollands Archief (2001)
ghcid:nl-haarhaarlem rdfs:label "Gemeentearchief Haarlem" ;
  rico:hasSuccessor ghcid:nl-haarnha ;
  rico:agentIsTargetOfActivity ghcid:event-nha-merger-2001 .

ghcid:nl-haarrijks rdfs:label "Rijksarchief in Noord-Holland" ;
  rico:hasSuccessor ghcid:nl-haarnha ;
  rico:agentIsTargetOfActivity ghcid:event-nha-merger-2001 .

ghcid:nl-haarnha rdfs:label "Noord-Hollands Archief" ;
  rico:hasPredecessor ghcid:nl-haarhaarlem, ghcid:nl-haarrijks ;
  rico:beginningDate "2001-01-01"^^xsd:date .

ghcid:event-nha-merger-2001 a rico:Activity, ghcid:ChangeEvent ;
  rico:hasActivityType ghcid:MERGER ;
  rico:beginningDate "2001-01-01"^^xsd:date ;
  rico:agentAssociatedWithActivity ghcid:nl-haarhaarlem, ghcid:nl-haarrijks, ghcid:nl-haarnha .
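
Once lineage is recorded this way, the full set of predecessor institutions can be derived by walking the graph. The sketch below is an illustration in plain Python rather than a GHCID API: triples are modelled as simple tuples, the predicate names are the ones used in this guide, and the function names are hypothetical.

```python
from collections import defaultdict

# (subject, predicate, object) tuples mirroring the merger example above.
TRIPLES = [
    ("ghcid:nl-haarhaarlem", "rico:hasSuccessor", "ghcid:nl-haarnha"),
    ("ghcid:nl-haarrijks", "rico:hasSuccessor", "ghcid:nl-haarnha"),
]


def build_predecessor_index(triples):
    """Map each custodian to its direct predecessors.

    Both directions are accepted, since a dataset may state only
    hasSuccessor or only hasPredecessor for a given pair.
    """
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "rico:hasSuccessor":
            index[obj].add(subject)
        elif predicate == "rico:hasPredecessor":
            index[subject].add(obj)
    return index


def all_predecessors(custodian, index):
    """Return direct and indirect predecessors; tolerates cycles in the data."""
    seen = set()
    stack = [custodian]
    while stack:
        current = stack.pop()
        for predecessor in index.get(current, ()):
            if predecessor not in seen:
                seen.add(predecessor)
                stack.append(predecessor)
    return seen
```

For the Noord-Hollands Archief example, all_predecessors("ghcid:nl-haarnha", build_predecessor_index(TRIPLES)) yields the two merged archives.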

Archival Holdings

Link heritage custodians to their collections:

ghcid:nl-haarnha
  rico:hasOrHadCustody ghcid:collection-haarlem-city-records ;
  rico:isOrWasAuthorityOf ghcid:collection-haarlem-city-records .

ghcid:collection-haarlem-city-records a rico:RecordSet, ghcid:Collection ;
  rico:scopeAndContent "Municipal records of Haarlem, 1245-present" ;
  rico:hasProvenance ghcid:nl-haarhaarlem .

RiC-O Benefits for GHCID

  1. Archival Authority: ICA-endorsed standard, widely adopted
  2. Change Tracking: Native support for organizational evolution
  3. Relationship Modeling: Predecessor/successor, parent/child, partner relationships
  4. Provenance Integration: Aligns with archival provenance principles
  5. Global Adoption: Used by national archives worldwide (France, UK, Netherlands, etc.)

Implementation Priority

HIGH PRIORITY: For any GHCID instance involving archives, RiC-O alignment is mandatory.


Schema.org Integration

Overview

Schema.org is the de facto standard for structured data markup on the web. It enables search engines (Google, Bing) to understand and display rich results for heritage institutions.

Specification: data/ontology/schemaorg.owl (2.7 MB)
Namespace: http://schema.org/
Relevance: Essential for web discoverability and SEO.

GLAM-Specific Schema.org Classes

Schema.org has dedicated classes for GLAM institutions:

| Schema.org Class | Description | GHCID Mapping |
|---|---|---|
| schema:Museum | Museum organizations | InstitutionTypeEnum.MUSEUM |
| schema:Library | Library organizations | InstitutionTypeEnum.LIBRARY |
| schema:ArchiveOrganization | Archival institutions | InstitutionTypeEnum.ARCHIVE |
| schema:ArchiveComponent | Archival collections | Collection |
| schema:LibrarySystem | Multi-branch library systems | Parent organizations |

Class Alignments

Institution Type Mapping

# When institution_type = MUSEUM
ghcid:rijksmuseum a schema:Museum, ghcid:HeritageCustodian ;
  schema:name "Rijksmuseum" ;
  schema:location [ a schema:Place ; schema:addressCountry "NL" ] ;
  schema:url "https://www.rijksmuseum.nl" .

# When institution_type = LIBRARY
ghcid:kb-nl a schema:Library, ghcid:HeritageCustodian ;
  schema:name "Koninklijke Bibliotheek" ;
  schema:description "National Library of the Netherlands" .

# When institution_type = ARCHIVE
ghcid:nl-haarnha a schema:ArchiveOrganization, ghcid:HeritageCustodian ;
  schema:name "Noord-Hollands Archief" ;
  schema:archiveHeld ghcid:collection-haarlem-city-records .
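
The conditional typing shown above amounts to a lookup from institution_type to a Schema.org class. A minimal Python sketch follows. The function name rdf_types_for is hypothetical, and falling back to schema:Organization for types without a dedicated Schema.org class is an assumption, not something the GHCID schema prescribes.

```python
# Mapping taken from the GLAM-specific Schema.org classes listed in this guide.
SCHEMA_CLASS_BY_TYPE = {
    "MUSEUM": "schema:Museum",
    "LIBRARY": "schema:Library",
    "ARCHIVE": "schema:ArchiveOrganization",
}

# Assumption: institution types without a dedicated class use the generic one.
FALLBACK_CLASS = "schema:Organization"


def rdf_types_for(institution_type):
    """Return the rdf:type values for a custodian of the given institution type."""
    schema_class = SCHEMA_CLASS_BY_TYPE.get(institution_type.upper(), FALLBACK_CLASS)
    return [schema_class, "ghcid:HeritageCustodian"]
```

Every custodian keeps ghcid:HeritageCustodian as a type, so GHCID-aware consumers do not depend on the Schema.org class being present.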

Collection Mapping

ghcid:collection-haarlem-city-records a schema:ArchiveComponent, ghcid:Collection ;
  schema:name "Municipal Records of Haarlem" ;
  schema:dateCreated "1245" ;
  schema:temporalCoverage "1245/2025" ;
  schema:about schema:HistoricalEvent ;
  schema:holdingArchive ghcid:nl-haarnha .

Property Mappings

| GHCID Property | Schema.org Property | Notes |
|---|---|---|
| name | schema:name | Plain string |
| alternative_names | schema:alternateName | Multiple values allowed |
| description | schema:description | SEO-optimized description |
| locations.city | schema:location + schema:address | Use schema:PostalAddress |
| identifiers (URL) | schema:url | Official website |
| identifiers (Wikidata) | schema:sameAs | Link to Wikidata entity |
| digital_platforms.platform_url | schema:url or schema:mainEntityOfPage | Digital presence |
| collections | schema:archiveHeld (archives) | For archival holdings |

SEO Benefits

Schema.org markup helps search engines interpret an institution's page and can make it eligible for rich results and knowledge panels:

<!-- JSON-LD for Rijksmuseum webpage -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Museum",
  "@id": "https://w3id.org/heritage/custodian/nl/amsnl-rms",
  "name": "Rijksmuseum",
  "alternateName": "Rijksmuseum Amsterdam",
  "description": "The Rijksmuseum is the national museum of the Netherlands, dedicated to Dutch arts and history.",
  "url": "https://www.rijksmuseum.nl",
  "image": "https://www.rijksmuseum.nl/assets/logo.jpg",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "Museumstraat 1",
    "addressLocality": "Amsterdam",
    "postalCode": "1071 XX",
    "addressCountry": "NL"
  },
  "sameAs": [
    "https://www.wikidata.org/wiki/Q190804",
    "https://viaf.org/viaf/131511535"
  ],
  "openingHoursSpecification": { ... },
  "hasOfferCatalog": { ... }
}
</script>

Implementation Recommendations

  1. Export JSON-LD: Generate Schema.org JSON-LD for each institution
  2. Website Embedding: Provide embeddable snippets for institutional websites
  3. Aggregator Support: Use Schema.org in aggregation platforms (Europeana, DPLA)
  4. Discovery APIs: Expose Schema.org via JSON-LD API endpoint
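
The first two recommendations can be prototyped with the standard library alone. The sketch below assumes a flat record dictionary whose keys (schema_type, id, name, alternative_names, description, url, same_as, city, country) are hypothetical and chosen for illustration; the real GHCID record layout may differ.

```python
import json


def to_schema_jsonld(record):
    """Build a Schema.org JSON-LD document from a flat record dict (hypothetical layout)."""
    doc = {
        "@context": "https://schema.org",
        "@type": record["schema_type"],
        "@id": record["id"],
        "name": record["name"],
    }
    optional = {
        "alternateName": record.get("alternative_names"),
        "description": record.get("description"),
        "url": record.get("url"),
        "sameAs": record.get("same_as"),
    }
    # Leave out empty values instead of emitting null or empty lists.
    for key, value in optional.items():
        if value:
            doc[key] = value
    city, country = record.get("city"), record.get("country")
    if city or country:
        address = {"@type": "PostalAddress"}
        if city:
            address["addressLocality"] = city
        if country:
            address["addressCountry"] = country
        doc["address"] = address
    return doc


def to_script_tag(record):
    """Wrap the JSON-LD in an embeddable <script> element."""
    body = json.dumps(to_schema_jsonld(record), indent=2, ensure_ascii=False)
    # Escape "</" so a string value cannot close the script element early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'
```

The same to_schema_jsonld output could back a JSON-LD API endpoint, with to_script_tag reserved for the website-embedding case.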

EU Core Public Organization Vocabulary (CPOV)

Overview

EU Core Public Organization Vocabulary (CPOV) is the European standard for describing public sector organizations and services.

Specification: data/ontology/core-public-organisation-ap.ttl
Namespace: http://data.europa.eu/m8g/
Relevance: Important for government-funded heritage institutions in EU member states.

Integration Status

ALREADY INTEGRATED in schemas/core.yaml:106:

# From schemas/core.yaml
class_uri: cpov:ContactPoint  # EU Core Public Organization Vocabulary

Key CPOV Classes

| CPOV Class | Description | GHCID Usage |
|---|---|---|
| cpov:PublicOrganisation | Government or public sector organization | Many heritage institutions |
| cpov:ContactPoint | Contact information | HeritageCustodian contact details |
| cpov:ReferenceFramework | Legal/regulatory frameworks | Governance metadata |

Property Mappings

| GHCID Property | CPOV Property | Notes |
|---|---|---|
| name | skos:prefLabel | Preferred label |
| alternative_names | skos:altLabel | Alternative labels |
| description | dct:description | Dublin Core Terms |
| locations | locn:location | ISA Core Location Vocabulary |
| identifiers | dct:identifier | Dublin Core identifier |

Use Cases

  1. EU Aggregation: Interoperability with EU data portals (data.europa.eu)
  2. Public Sector Reporting: Compliance with EU data standards
  3. Cross-Border Discovery: Facilitate EU-wide heritage discovery
  4. Funding Compliance: Align with EU funding requirements (Horizon Europe, etc.)

Dutch TOOI Ontology

Overview

TOOI (Thesaurus en Ontologie Overheidsinformatie) is the Dutch government's ontology for public sector organizations and administrative structures.

Specification: data/ontology/tooiont.ttl
Namespace: https://identifier.overheid.nl/tooi/def/ont/
Relevance: Essential for Dutch heritage institutions, especially government archives.

Integration Status

ALREADY INTEGRATED in schemas/provenance.yaml:91:

# From schemas/provenance.yaml
ChangeEvent:
  class_uri: tooiont:Wijzigingsgebeurtenis  # Dutch government change event ontology

Key TOOI Classes

| TOOI Class | Description | GHCID Mapping |
|---|---|---|
| tooiont:Wijzigingsgebeurtenis | Organizational change event | ChangeEvent |
| tooiont:Afsplitsing | Organizational split | ChangeTypeEnum.SPLIT |
| tooiont:ExistentieleWijziging | Existential change (founding, closure) | ChangeTypeEnum.FOUNDING, CLOSURE |
| tooiont:BestuurlijkeRuimte | Administrative territory | Location (Dutch regions) |

Change Event Alignment

TOOI provides precise Dutch government terminology for organizational changes:

ghcid:event-nha-merger-2001 a tooiont:Wijzigingsgebeurtenis, ghcid:ChangeEvent ;
  tooiont:wijzigingsdatum "2001-01-01"^^xsd:date ;
  tooiont:wijzigingstype ghcid:MERGER ;
  tooiont:betrokkenOrganisatie ghcid:nl-haarhaarlem, ghcid:nl-haarrijks, ghcid:nl-haarnha .
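
The Key TOOI Classes mapping can be applied when choosing the TOOI type for a change event. The following Python sketch covers only the change types named in that mapping. Using the generic tooiont:Wijzigingsgebeurtenis for every other change type (such as MERGER in the example above) is an assumption, and tooi_class_for is a hypothetical name.

```python
# Change types that this guide maps to a specific TOOI class.
TOOI_CLASS_BY_CHANGE_TYPE = {
    "SPLIT": "tooiont:Afsplitsing",
    "FOUNDING": "tooiont:ExistentieleWijziging",
    "CLOSURE": "tooiont:ExistentieleWijziging",
}

# Assumption: any other change type is typed with the generic change event class.
GENERIC_TOOI_CLASS = "tooiont:Wijzigingsgebeurtenis"


def tooi_class_for(change_type):
    """Pick the TOOI class for a ChangeTypeEnum value, falling back to the generic class."""
    return TOOI_CLASS_BY_CHANGE_TYPE.get(change_type.upper(), GENERIC_TOOI_CLASS)
```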

Benefits for Dutch Institutions

  1. National Standard: Official Dutch government ontology
  2. Legal Compliance: Aligns with Dutch administrative law
  3. Provincial Archives: Essential for provincial and municipal archive integration
  4. Change Tracking: Precise terminology for Dutch government reorganizations
  5. Linked Open Data: Integration with Dutch LOD infrastructure (data.overheid.nl)

Implementation

Use TOOI for all Dutch government heritage institutions:

  • Provincial archives (e.g., Noord-Hollands Archief, Gelders Archief)
  • Municipal archives (e.g., Stadsarchief Amsterdam)
  • National archives (Nationaal Archief)
  • Government-funded museums and libraries

W3C Organization Ontology

Overview

W3C Organization Ontology (ORG) is a generic ontology for organizational structures, hierarchies, and relationships.

Specification: data/ontology/org.ttl
Namespace: http://www.w3.org/ns/org#
Relevance: Foundation for organizational modeling.

Integration Status

ALREADY INTEGRATED in schemas/core.yaml:

# From schemas/core.yaml
HeritageCustodian:
  class_uri: org:Organization  # W3C Organization Ontology
  mixins:
    - org:OrganizationalUnit  # For sub-units and departments

Key ORG Classes

| ORG Class | Description | GHCID Mapping |
|---|---|---|
| org:Organization | Generic organization | HeritageCustodian |
| org:OrganizationalUnit | Department or sub-unit | Branch libraries, museum departments |
| org:Site | Physical location | Location |
| org:Post | Job or role | Staff roles (optional) |

Property Mappings

| GHCID Property | ORG Property | Notes |
|---|---|---|
| name | skos:prefLabel | Organization name |
| description | dct:description | Organization description |
| locations | org:hasSite | Links to org:Site |
| Parent organization | org:subOrganizationOf | Hierarchical relationships |
| Sub-units | org:hasSubOrganization | Departments, branches |

Use Cases

  1. Hierarchical Structures: Model multi-branch library systems
  2. Museum Departments: Represent departments within large museums
  3. Archive Networks: Model provincial/municipal archive relationships
  4. Generic Interoperability: Broadest possible compatibility
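
Hierarchies such as multi-branch library systems reduce to a child-to-parent mapping from which both ORG properties can be generated. The Python sketch below is illustrative only: the identifiers in the usage note are made up, and the function names are hypothetical.

```python
def hierarchy_triples(parent_of):
    """Emit org:subOrganizationOf and its inverse for each child/parent pair."""
    triples = []
    for child, parent in sorted(parent_of.items()):
        triples.append((child, "org:subOrganizationOf", parent))
        triples.append((parent, "org:hasSubOrganization", child))
    return triples


def top_level(unit, parent_of):
    """Follow parent links upward until a unit without a parent is reached."""
    seen = {unit}
    while unit in parent_of:
        unit = parent_of[unit]
        if unit in seen:
            raise ValueError(f"cycle detected in hierarchy at {unit!r}")
        seen.add(unit)
    return unit
```

With parent_of = {"ghcid:example-branch": "ghcid:example-system"}, hierarchy_triples produces one statement in each direction, which keeps consumers that only query one of the two properties working.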

PROV-O Integration

Overview

W3C PROV Ontology (PROV-O) is the standard for provenance and data lineage tracking.

Specification: data/ontology/prov-o.ttl
Namespace: http://www.w3.org/ns/prov#
Relevance: Critical for data quality and provenance tracking.

Integration Status

ALREADY INTEGRATED across multiple schema modules.

Key PROV-O Classes

| PROV-O Class | Description | GHCID Mapping |
|---|---|---|
| prov:Entity | Data entities | HeritageCustodian, Collection, etc. |
| prov:Activity | Activities that generate/modify entities | ChangeEvent, extraction processes |
| prov:Agent | Agents responsible for activities | Data creators, extractors |

Property Mappings

| GHCID Property | PROV-O Property | Notes |
|---|---|---|
| provenance.data_source | prov:hadPrimarySource | Source of data |
| provenance.extraction_date | prov:generatedAtTime | When data was extracted |
| provenance.extraction_method | prov:wasGeneratedBy | Extraction process |
| change_history | prov:wasInfluencedBy | Organizational changes |

Provenance Graph Example

# GHCID record as PROV Entity
ghcid:nl-haarnha a prov:Entity, ghcid:HeritageCustodian ;
  prov:wasGeneratedBy ghcid:extraction-activity-20251106 ;
  prov:hadPrimarySource <https://www.noord-hollandsarchief.nl> ;
  prov:generatedAtTime "2025-11-06T10:30:00Z"^^xsd:dateTime .

# Extraction activity
ghcid:extraction-activity-20251106 a prov:Activity ;
  prov:wasAssociatedWith ghcid:agent-opencode ;
  prov:used <file:///data/ISIL-codes_2025-08-01.csv> ;
  prov:endedAtTime "2025-11-06T10:30:00Z"^^xsd:dateTime .

# Data source
<file:///data/ISIL-codes_2025-08-01.csv> a prov:Entity ;
  prov:wasAttributedTo <https://www.kb.nl/isil> ;
  prov:generatedAtTime "2025-08-01T00:00:00Z"^^xsd:dateTime .
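
Provenance statements of this shape are easy to generate at extraction time. The sketch below uses only the Python standard library and plain string formatting, so it performs no IRI validation or escaping; the function names are hypothetical, and a production exporter would more likely rely on an RDF library.

```python
from datetime import datetime, timezone


def xsd_datetime(moment):
    """Format a timezone-aware datetime as an xsd:dateTime literal in UTC."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    text = moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f'"{text}"^^xsd:dateTime'


def provenance_turtle(record, activity, source, extracted_at):
    """Render the entity and activity statements as Turtle (prefixes declared elsewhere)."""
    stamp = xsd_datetime(extracted_at)
    lines = [
        f"{record} a prov:Entity ;",
        f"  prov:wasGeneratedBy {activity} ;",
        f"  prov:hadPrimarySource <{source}> ;",
        f"  prov:generatedAtTime {stamp} .",
        "",
        f"{activity} a prov:Activity ;",
        f"  prov:used <{source}> ;",
        f"  prov:endedAtTime {stamp} .",
    ]
    return "\n".join(lines)
```

Rejecting naive datetimes avoids silently stamping records with the local time of whichever machine ran the extraction.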

Benefits

  1. Data Quality: Track data lineage from source to publication
  2. Reproducibility: Document extraction methods and tools
  3. Trust: Demonstrate data provenance to users
  4. Versioning: Track changes to GHCID records over time

Comprehensive Mapping Table

HeritageCustodian Class Mappings

| Ontology | Class | Relationship | Notes |
|---|---|---|---|
| CIDOC-CRM | cidoc:E74_Group | owl:equivalentClass | Primary mapping for museums |
| RiC-O | rico:CorporateBody | owl:equivalentClass | Primary mapping for archives |
| Schema.org | schema:Museum / Library / ArchiveOrganization | rdf:type (conditional) | Based on institution_type |
| CPOV | cpov:PublicOrganisation | rdf:type (conditional) | For public sector institutions |
| W3C ORG | org:Organization | rdfs:subClassOf | Generic superclass |
| PROV-O | prov:Entity | rdf:type | For provenance tracking |

ChangeEvent Class Mappings

| Ontology | Class | Relationship | Notes |
|---|---|---|---|
| CIDOC-CRM | cidoc:E5_Event | rdfs:subClassOf | General events |
| RiC-O | rico:Activity | rdfs:subClassOf | Archival events |
| TOOI | tooiont:Wijzigingsgebeurtenis | owl:equivalentClass | Dutch government events |
| PROV-O | prov:Activity | rdfs:subClassOf | Provenance activities |

Collection Class Mappings

| Ontology | Class | Relationship | Notes |
|---|---|---|---|
| CIDOC-CRM | cidoc:E78_Curated_Holding | rdfs:subClassOf | Museum collections |
| RiC-O | rico:RecordSet | owl:equivalentClass | Archival record sets |
| Schema.org | schema:ArchiveComponent | rdf:type | Archival collections |

Location Class Mappings

| Ontology | Class | Relationship | Notes |
|---|---|---|---|
| CIDOC-CRM | cidoc:E53_Place | rdfs:subClassOf | Geographic places |
| RiC-O | rico:Place | rdfs:subClassOf | Archival places |
| Schema.org | schema:Place | rdfs:subClassOf | Generic places |
| W3C ORG | org:Site | rdf:type | Organizational sites |

Identifier Class Mappings

| Ontology | Class | Relationship | Notes |
|---|---|---|---|
| CIDOC-CRM | cidoc:E42_Identifier | rdfs:subClassOf | Generic identifiers |
| RiC-O | rico:Identifier | rdfs:subClassOf | Archival identifiers |

RDF Serialization Examples

Example 1: Dutch Archive with Multiple Ontologies

@prefix ghcid:   <https://w3id.org/heritage/custodian/> .
@prefix cidoc:   <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix rico:    <https://www.ica.org/standards/RiC/ontology#> .
@prefix schema:  <http://schema.org/> .
@prefix tooiont: <https://identifier.overheid.nl/tooi/def/ont/> .
@prefix org:     <http://www.w3.org/ns/org#> .
@prefix prov:    <http://www.w3.org/ns/prov#> .
@prefix xsd:     <http://www.w3.org/2001/XMLSchema#> .

# Noord-Hollands Archief - Multi-ontology representation
ghcid:nl-haarnha
  # Type declarations
  a ghcid:HeritageCustodian,
    rico:CorporateBody,           # RiC-O: Archival institution
    schema:ArchiveOrganization,   # Schema.org: Web discoverability
    org:Organization,             # W3C ORG: Generic organization
    prov:Entity ;                 # PROV-O: Provenance entity
  
  # Basic metadata
  ghcid:name "Noord-Hollands Archief" ;
  schema:name "Noord-Hollands Archief" ;
  rico:hasOrHadName [
    a rico:Name ;
    rico:textualValue "Noord-Hollands Archief"@nl
  ] ;
  
  # Alternative names
  ghcid:alternative_names "NHA", "North Holland Archives" ;
  schema:alternateName "NHA"@en ;
  rico:hasOrHadOtherName [
    a rico:Name ;
    rico:textualValue "NHA"@en
  ] ;
  
  # Institution type
  ghcid:institution_type "ARCHIVE" ;
  rico:hasOrganizationType ghcid:ARCHIVE ;
  
  # Description
  ghcid:description "Provincial archive of North Holland, formed in 2001 through merger of Gemeentearchief Haarlem and Rijksarchief in Noord-Holland." ;
  schema:description "Provincial archive of North Holland, formed in 2001 through merger of Gemeentearchief Haarlem and Rijksarchief in Noord-Holland."@en ;
  rico:scopeAndContent "Provincial archive of North Holland, formed in 2001 through merger of Gemeentearchief Haarlem and Rijksarchief in Noord-Holland."@en ;
  
  # Location
  ghcid:location [
    a ghcid:Location, rico:Place, schema:Place ;
    ghcid:city "Haarlem" ;
    ghcid:country "NL" ;
    schema:addressLocality "Haarlem" ;
    schema:addressCountry "NL" ;
    rico:hasOrHadName [ rico:textualValue "Haarlem"@nl ]
  ] ;
  
  # Identifiers
  ghcid:identifier [
    a ghcid:Identifier, rico:Identifier ;
    ghcid:identifier_scheme "ISIL" ;
    ghcid:identifier_value "NL-HaarNHA" ;
    rico:identifierType "ISIL" ;
    rico:textualValue "NL-HaarNHA"
  ] ;
  
  schema:url "https://www.noord-hollandsarchief.nl" ;
  schema:sameAs <https://www.wikidata.org/wiki/Q2725652> ;
  
  # Change history (merger event)
  ghcid:change_history ghcid:event-nha-merger-2001 ;
  rico:agentIsTargetOfActivity ghcid:event-nha-merger-2001 ;
  
  # Predecessors (RiC-O)
  rico:hasPredecessor ghcid:nl-haarhaarlem, ghcid:nl-haarrijks ;
  
  # Collections
  schema:archiveHeld ghcid:collection-haarlem-city-records ;
  rico:hasOrHadCustody ghcid:collection-haarlem-city-records ;
  
  # Provenance
  prov:wasGeneratedBy ghcid:extraction-isil-registry-20251106 ;
  prov:hadPrimarySource <file:///data/ISIL-codes_2025-08-01.csv> ;
  prov:generatedAtTime "2025-11-06T10:30:00Z"^^xsd:dateTime .

# Merger event
ghcid:event-nha-merger-2001
  a ghcid:ChangeEvent,
    rico:Activity,              # RiC-O: Activity
    cidoc:E5_Event,             # CIDOC-CRM: Event
    tooiont:Wijzigingsgebeurtenis, # TOOI: Dutch gov change event
    prov:Activity ;             # PROV-O: Activity
  
  ghcid:change_type "MERGER" ;
  ghcid:event_date "2001-01-01"^^xsd:date ;
  ghcid:event_description "Merger of Gemeentearchief Haarlem and Rijksarchief in Noord-Holland to form Noord-Hollands Archief." ;
  
  # RiC-O properties
  rico:hasActivityType ghcid:MERGER ;
  rico:beginningDate "2001-01-01"^^xsd:date ;
  rico:agentAssociatedWithActivity ghcid:nl-haarhaarlem, ghcid:nl-haarrijks, ghcid:nl-haarnha ;
  
  # CIDOC-CRM properties
  cidoc:P4_has_time-span [
    a cidoc:E52_Time-Span ;
    cidoc:P82_at_some_time_within "2001-01-01"^^xsd:date
  ] ;
  cidoc:P14_carried_out_by ghcid:nl-haarnha ;
  
  # TOOI properties
  tooiont:wijzigingsdatum "2001-01-01"^^xsd:date ;
  tooiont:wijzigingstype ghcid:MERGER ;
  
  # PROV-O properties
  prov:generated ghcid:nl-haarnha ;
  prov:used ghcid:nl-haarhaarlem, ghcid:nl-haarrijks ;
  prov:startedAtTime "2001-01-01T00:00:00Z"^^xsd:dateTime .

Example 2: Museum with CIDOC-CRM Focus

# Rijksmuseum - Museum-centric representation
ghcid:nl-amsnl-rms
  a ghcid:HeritageCustodian,
    cidoc:E74_Group,           # CIDOC-CRM: Museum as group
    schema:Museum,             # Schema.org: Museum type
    org:Organization ;
  
  ghcid:name "Rijksmuseum" ;
  cidoc:P1_is_identified_by [
    a cidoc:E41_Appellation ;
    cidoc:P190_has_symbolic_content "Rijksmuseum"
  ] ;
  
  ghcid:institution_type "MUSEUM" ;
  cidoc:P2_has_type ghcid:MUSEUM ;
  
  # Location (CIDOC-CRM E53_Place)
  cidoc:P74_has_current_or_former_residence [
    a cidoc:E53_Place ;
    cidoc:P1_is_identified_by [
      a cidoc:E41_Appellation ;
      cidoc:P190_has_symbolic_content "Amsterdam, Netherlands"
    ]
  ] ;
  
  # Collections (CIDOC-CRM E78_Curated_Holding)
  cidoc:P109i_is_current_or_former_curator_of ghcid:collection-rijksmuseum-dutch-masters ;
  
  schema:url "https://www.rijksmuseum.nl" ;
  schema:sameAs <https://www.wikidata.org/wiki/Q190804> .

# Collection
ghcid:collection-rijksmuseum-dutch-masters
  a ghcid:Collection,
    cidoc:E78_Curated_Holding ;
  
  ghcid:collection_name "Dutch Masters Collection" ;
  cidoc:P3_has_note "Collection of Dutch Golden Age paintings including works by Rembrandt, Vermeer, and Hals." ;
  
  cidoc:P109_has_current_or_former_curator ghcid:nl-amsnl-rms .

Example 3: JSON-LD for Schema.org

{
  "@context": "https://schema.org",
  "@type": "ArchiveOrganization",
  "@id": "https://w3id.org/heritage/custodian/nl-haarnha",
  "name": "Noord-Hollands Archief",
  "alternateName": ["NHA", "North Holland Archives"],
  "description": "Provincial archive of North Holland, formed in 2001 through merger of Gemeentearchief Haarlem and Rijksarchief in Noord-Holland.",
  "url": "https://www.noord-hollandsarchief.nl",
  "sameAs": [
    "https://www.wikidata.org/wiki/Q2725652",
    "https://viaf.org/viaf/..."
  ],
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Haarlem",
    "addressCountry": "NL"
  },
  "archiveHeld": {
    "@type": "ArchiveComponent",
    "@id": "https://w3id.org/heritage/custodian/collection-haarlem-city-records",
    "name": "Municipal Records of Haarlem",
    "temporalCoverage": "1245/2025",
    "description": "Historical records of the city of Haarlem spanning 780 years."
  },
  "foundingDate": "2001-01-01",
  "parentOrganization": null,
  "subOrganization": []
}

Gap Analysis

What GHCID Provides Beyond Standard Ontologies

  1. Persistent Identifiers (GHCID)

    • Unique: Globally unique, persistent IDs for heritage institutions
    • Versioned: Change history tracking via GHCID history entries
    • Resolvable: Designed for HTTP resolution (future)
    • Gap: The ontologies above model identifiers (for example cidoc:E42_Identifier and rico:Identifier), but none of them mints or governs persistent IDs for organizations
  2. Change History Tracking

    • Comprehensive: Dedicated ChangeEvent model with 11 event types
    • Temporal: Full history from founding to present
    • Linked: Events linked to affected organizations
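    • Gap: RiC-O, TOOI, and CIDOC-CRM can each express change events, but none offers a single enumerated change-type model for heritage institutions across sectors and countries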
  3. Multi-Tier Provenance

    • Data Quality: 4-tier data quality system (TIER_1 through TIER_4)
    • Confidence Scoring: Numeric confidence scores for extracted data
    • Source Tracking: Precise provenance metadata
    • Gap: PROV-O lacks data quality tiers
  4. Institution Type Taxonomy

    • GLAM-Specific: 13 institution types tailored for heritage sector
    • Extensible: Can add new types as needed
    • Gap: Schema.org offers only a few GLAM organization types (Museum, Library, ArchiveOrganization, LibrarySystem)

What Standard Ontologies Provide That GHCID Could Use

  1. CIDOC-CRM Time-Spans

    • Feature: E52_Time-Span for precise temporal modeling
    • Benefit: Expresses start/end bounds and imprecise dates that a single ISO 8601 date value cannot
    • Recommendation: Extend ChangeEvent to support CIDOC-CRM time-spans
  2. RiC-O Relationship Types

    • Feature: 50+ relationship types (hierarchical, temporal, associative)
    • Benefit: Rich organizational relationship modeling
    • Recommendation: Add relationship modeling to GHCID schema (future)
  3. Schema.org Events

    • Feature: schema:Event for public-facing event information
    • Benefit: Exhibition openings, closures, relocations for SEO
    • Recommendation: Link ChangeEvents to schema:Event for web discoverability
  4. CPOV Contact Points

    • Feature: Structured contact information (already integrated!)
    • Benefit: Standardized contact metadata
    • Status: Already using cpov:ContactPoint

Areas for Future Development

  1. Collection Metadata Enhancement

    • Adopt CIDOC-CRM E78_Curated_Holding properties
    • Integrate RiC-O RecordSet relationships
    • Add Schema.org CreativeWork linkage
  2. Relationship Modeling

    • Add parent/child organization relationships (org:subOrganizationOf)
    • Add partnership relationships (RiC-O rico:isAssociatedWith)
    • Add network membership (RiC-O rico:isMemberOf)
  3. Temporal Modeling

    • Support fuzzy dates ("circa 1850", "early 20th century")
    • Use CIDOC-CRM E52_Time-Span for date ranges
    • Add temporal qualifiers (RiC-O rico:certainty)
  4. Rights and Licensing

    • Add Dublin Core Terms rights metadata
    • Integrate Creative Commons licenses
    • Support ODRL for data usage rights
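
To make the temporal modeling item above concrete, the sketch below turns fuzzy date strings into explicit earliest/latest year bounds, which is the shape of information an E52_Time-Span carries. The `TimeSpan` class, the five-year "circa" margin, the split of a century into thirds, and the convention that the 20th century starts in 1900 are all assumptions chosen for illustration; none of them comes from the GHCID schema.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeSpan:
    """Earliest and latest year bounds, loosely modelled on E52_Time-Span
    (compare P82a_begin_of_the_begin and P82b_end_of_the_end)."""
    begin_year: int
    end_year: int


# Hypothetical tolerance for "circa" dates; tune per dataset.
CIRCA_MARGIN_YEARS = 5

# Hypothetical split of a century into early/mid/late offsets.
CENTURY_PARTS = {"early": (0, 33), "mid": (34, 66), "late": (67, 99)}


def parse_fuzzy_date(text: str) -> TimeSpan:
    """Convert a fuzzy date expression into explicit year bounds."""
    cleaned = text.strip().lower()

    # "circa 1850" or "c. 1850": widen the year by the circa margin.
    match = re.fullmatch(r"(?:circa|c\.)\s*(\d{4})", cleaned)
    if match:
        year = int(match.group(1))
        return TimeSpan(year - CIRCA_MARGIN_YEARS, year + CIRCA_MARGIN_YEARS)

    # "early 20th century": map the century part to a year range.
    match = re.fullmatch(
        r"(early|mid|late)\s+(\d{1,2})(?:st|nd|rd|th)\s+century", cleaned
    )
    if match:
        part, century = match.group(1), int(match.group(2))
        base = (century - 1) * 100
        start, end = CENTURY_PARTS[part]
        return TimeSpan(base + start, base + end)

    # A plain four-digit year is an exact span.
    if re.fullmatch(r"\d{4}", cleaned):
        return TimeSpan(int(cleaned), int(cleaned))

    raise ValueError(f"Unrecognised date expression: {text!r}")


print(parse_fuzzy_date("circa 1850"))          # TimeSpan(begin_year=1845, end_year=1855)
print(parse_fuzzy_date("early 20th century"))  # TimeSpan(begin_year=1900, end_year=1933)
```

Raising on unrecognised input, rather than guessing, keeps uncertain dates from silently becoming precise ones.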

Implementation Recommendations

1. RDF Export Priority

When exporting GHCID data to RDF, include:

REQUIRED (all institutions):

  • W3C ORG: org:Organization (generic compatibility)
  • PROV-O: prov:Entity (provenance)
  • Schema.org: Institution-type-specific class (SEO)

RECOMMENDED (based on institution type):

  • Archives → RiC-O: rico:CorporateBody (archival standard)
  • Museums → CIDOC-CRM: cidoc:E74_Group (museum standard)
  • Dutch institutions → TOOI: tooiont:Organisatie (Dutch gov)
  • EU institutions → CPOV: cpov:PublicOrganisation (EU public sector)
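
The REQUIRED and RECOMMENDED rules above amount to a conditional type assignment, which might look like the sketch below. The institution type codes (`MUSEUM`, `LIBRARY`, `ARCHIVE`), the function signature, and the `schema:Organization` fallback are assumptions for illustration; the class names themselves are the ones listed above.

```python
# Classes every exported institution receives (the REQUIRED list).
REQUIRED_TYPES = ["org:Organization", "prov:Entity"]

# Hypothetical institution type codes mapped to Schema.org classes.
SCHEMA_ORG_TYPES = {
    "MUSEUM": "schema:Museum",
    "LIBRARY": "schema:Library",
    "ARCHIVE": "schema:ArchiveOrganization",
}

# Domain ontologies from the RECOMMENDED list, keyed by institution type.
DOMAIN_TYPES = {
    "ARCHIVE": "rico:CorporateBody",
    "MUSEUM": "cidoc:E74_Group",
}


def rdf_types(institution_type, country=None, is_eu_public_body=False):
    """Return the list of RDF classes to assert for one institution."""
    types = list(REQUIRED_TYPES)
    # Assumed fallback: use the generic Schema.org class for unmapped types.
    types.append(SCHEMA_ORG_TYPES.get(institution_type, "schema:Organization"))
    if institution_type in DOMAIN_TYPES:
        types.append(DOMAIN_TYPES[institution_type])
    if country == "NL":
        types.append("tooiont:Organisatie")
    if is_eu_public_body:
        types.append("cpov:PublicOrganisation")
    return types


print(rdf_types("MUSEUM", country="NL"))
# ['org:Organization', 'prov:Entity', 'schema:Museum', 'cidoc:E74_Group', 'tooiont:Organisatie']
```

Keeping the mappings in plain dictionaries means new institution types or regional ontologies can be added without touching the function body.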

2. JSON-LD Context Design

Create a comprehensive JSON-LD context file:

{
  "@context": {
    "ghcid": "https://w3id.org/heritage/custodian/",
    "cidoc": "http://www.cidoc-crm.org/cidoc-crm/",
    "rico": "https://www.ica.org/standards/RiC/ontology#",
    "schema": "https://schema.org/",
    "org": "http://www.w3.org/ns/org#",
    "prov": "http://www.w3.org/ns/prov#",
    "cpov": "http://data.europa.eu/m8g/",
    "tooiont": "https://identifier.overheid.nl/tooi/def/ont/",
    
    "HeritageCustodian": {
      "@id": "ghcid:HeritageCustodian",
      "@type": "@id"
    },
    "name": {
      "@id": "schema:name",
      "@container": "@language"
    },
    "institution_type": {
      "@id": "ghcid:institution_type",
      "@type": "@vocab"
    },
    ...
  }
}
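
To show what the context does, the sketch below resolves terms and compact IRIs to full IRIs using a trimmed copy of the context above (the elided entries are simply left out). The `expand_term` helper is hypothetical and covers only prefix expansion; a real export would normally rely on a full JSON-LD processor rather than hand-rolled logic like this.

```python
import json

# Trimmed copy of the context; entries elided in the original are omitted.
CONTEXT_DOC = json.loads("""
{
  "@context": {
    "ghcid": "https://w3id.org/heritage/custodian/",
    "schema": "https://schema.org/",
    "org": "http://www.w3.org/ns/org#",
    "name": {"@id": "schema:name", "@container": "@language"},
    "institution_type": {"@id": "ghcid:institution_type", "@type": "@vocab"}
  }
}
""")


def expand_term(term: str, context: dict) -> str:
    """Resolve a term or compact IRI to a full IRI using the context."""
    definition = context.get(term, term)
    # Expanded term definitions keep their IRI under "@id".
    if isinstance(definition, dict):
        definition = definition["@id"]
    prefix, sep, suffix = definition.partition(":")
    namespace = context.get(prefix)
    # Expand only declared prefixes; leave absolute IRIs ("https://...") alone.
    if sep and isinstance(namespace, str) and not suffix.startswith("//"):
        return namespace + suffix
    return definition


context = CONTEXT_DOC["@context"]
print(expand_term("name", context))              # https://schema.org/name
print(expand_term("institution_type", context))  # https://w3id.org/heritage/custodian/institution_type
print(expand_term("org:Organization", context))  # http://www.w3.org/ns/org#Organization
```

Because `name` uses `"@container": "@language"`, a record can carry its names as a language map such as `{"nl": "...", "en": "..."}` instead of repeating the property once per language.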

3. Validation Strategy

Validate RDF exports against ontology specifications:

  1. Schema.org: Use the Schema Markup Validator (validator.schema.org) or Google's Rich Results Test; Google's Structured Data Testing Tool has been retired
  2. CIDOC-CRM: Validate with CIDOC-CRM RDFS/OWL definitions
  3. RiC-O: Validate with RiC-O v1.1 SHACL shapes (if available)
  4. W3C standards (ORG, PROV-O): Check RDF syntax with the W3C RDF Validation Service
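
Before handing exports to the external validators above, a cheap local check can catch one common serialization mistake: compact IRIs whose prefix was never declared in the JSON-LD context. The sketch below is an illustrative addition, not part of the validation tooling; the `undeclared_prefixes` helper and the example record are hypothetical, and only property keys and `@type` values are inspected.

```python
def undeclared_prefixes(record: dict, declared: set) -> set:
    """Return prefixes used in property keys or @type values that are not declared."""
    terms = [key for key in record if not key.startswith("@")]
    types = record.get("@type", [])
    terms += [types] if isinstance(types, str) else list(types)

    missing = set()
    for term in terms:
        prefix, sep, rest = term.partition(":")
        # Skip plain terms ("name") and absolute IRIs ("https://...").
        if sep and not rest.startswith("//") and prefix not in declared:
            missing.add(prefix)
    return missing


# Prefixes declared in the JSON-LD context of this guide.
DECLARED = {"ghcid", "cidoc", "rico", "schema", "org", "prov", "cpov", "tooiont"}

# Hypothetical export record using two prefixes the context does not declare.
record = {
    "@id": "https://example.org/institution/1",
    "@type": ["org:Organization", "foaf:Agent"],
    "schema:name": "Example Museum",
    "dcterms:rights": "placeholder",
}

print(sorted(undeclared_prefixes(record, DECLARED)))  # ['dcterms', 'foaf']
```

A check like this complements the ontology-level validators rather than replacing them: it verifies only that the export is internally consistent, not that the classes and properties are used correctly.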

4. Incremental Adoption

Phase 1 (Current): Basic ontology alignment

  • Export as W3C ORG + Schema.org
  • Include PROV-O provenance
  • Minimal RDF serialization

Phase 2 (Next 6 months): Domain-specific ontologies

  • Add RiC-O for archives
  • Add CIDOC-CRM for museums
  • Add TOOI for Dutch institutions

Phase 3 (Future): Advanced features

  • Full relationship modeling
  • CIDOC-CRM time-spans
  • Rights and licensing metadata

5. Documentation and Examples

Provide clear examples for data consumers:

  • GitHub repo: /examples/rdf/ directory with Turtle, JSON-LD, RDF/XML
  • API documentation: Ontology alignment tables in API docs
  • SPARQL queries: Example queries for each ontology
  • Conversion scripts: LinkML → RDF conversion utilities

Conclusion

The GHCID schema aligns with seven major ontologies while adding capabilities they lack: persistent identification and change tracking. Because it maps to international standards (CIDOC-CRM, RiC-O, Schema.org) and regional standards (CPOV, TOOI), GHCID data can be exchanged with heritage-sector systems that already consume those vocabularies.

Key Takeaways:

  1. RiC-O is critical for archival institutions (highest priority)
  2. Schema.org enables web discoverability (required for SEO)
  3. CIDOC-CRM supports museum integration (recommended for museums)
  4. Multiple ontologies can coexist (use conditional type assignments)
  5. GHCID adds unique value (persistent IDs, change history, data quality tiers)

Next Steps:

  1. Complete ONTOLOGY_INTEGRATION.md (this document)
  2. Create GHCID_PID_SCHEME.md (persistent identifier specification)
  3. Implement RDF export utilities in Python
  4. Generate JSON-LD context file
  5. Create example RDF serializations for each ontology

Revision History:

  • v1.0 (2025-11-06): Initial comprehensive ontology integration guide

Maintained By: GLAM Data Extraction Project
Contact: Project repository