# RDF Partnership Export Implementation

**Status**: ✅ COMPLETE
**Date**: 2025-11-07
**Version**: 1.0

## Overview

Successfully implemented RDF/JSON-LD serialization of Partnership data using W3C Organization Ontology (ORG) patterns. The implementation integrates multiple heritage ontologies including CIDOC-CRM, RiC-O, Schema.org, PROV-O, and W3C ORG.

## Implementation

### Files Created/Modified

1. **`src/glam_extractor/exporters/rdf_exporter.py`** (343 lines) - NEW
   - Full RDF exporter with multi-ontology support
   - Partnership serialization using `org:Membership` pattern
   - Supports Turtle, RDF/XML, JSON-LD, N-Triples formats

2. **`src/glam_extractor/exporters/__init__.py`** - UPDATED
   - Exported `RDFExporter` class for public API

3. **`tests/exporters/test_rdf_exporter.py`** (292 lines) - NEW
   - 5 comprehensive tests covering:
     - Single partnership export
     - Multiple partnerships export
     - Partnerships with temporal scope (start/end dates)
     - Full Turtle serialization
     - Complete custodian with all fields

### Test Results

```
tests/exporters/test_rdf_exporter.py::TestRDFExporterPartnership::test_single_partnership_export PASSED
tests/exporters/test_rdf_exporter.py::TestRDFExporterPartnership::test_multiple_partnerships_export PASSED
tests/exporters/test_rdf_exporter.py::TestRDFExporterPartnership::test_partnership_with_temporal_scope PASSED
tests/exporters/test_rdf_exporter.py::TestRDFExporterPartnership::test_export_to_turtle PASSED
tests/exporters/test_rdf_exporter.py::TestRDFExporterCompleteness::test_full_custodian_export PASSED

5 passed in 1.00s
Coverage: 89% for rdf_exporter.py
```

## RDF Partnership Pattern

### W3C Organization Ontology Pattern

Partnerships are serialized using the `org:Membership` class with the following structure:

```turtle
org:hasMembership [
    a org:Membership, ghcid:Partnership ;
    org:organization ;
    org:member [
        a org:Organization ;
        schema:name "Partner Name"
    ] ;
    org:role "partnership_type" ;
    ghcid:partner_name "Partner Name" ;
    ghcid:partnership_type "partnership_type" ;
    schema:startDate "2022-01-01"^^xsd:date ;
    schema:endDate "2025-12-31"^^xsd:date ;
    schema:description "Partnership description" ;
] .
```

### Ontology Integration

**Primary Classes**:
- `org:Membership` - W3C Organization Ontology (standardized pattern)
- `ghcid:Partnership` - GHCID-specific type for domain queries

**Properties**:
- `org:organization` - Links membership to custodian
- `org:member` - Partner organization (blank node or URI)
- `org:role` - Partnership type (string literal)
- `schema:startDate` / `schema:endDate` - Temporal scope (XSD dates)
- `schema:description` - Partnership description
- `ghcid:partner_name` - Partner organization name (string)
- `ghcid:partnership_type` - Partnership classification

### Partner Organization Representation

Partners are represented as blank nodes with:
- `rdf:type org:Organization`
- `schema:name` - Organization name

**Future Enhancement**: When partner organizations have resolvable URIs in the GHCID dataset, replace blank nodes with URI references.
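The choice between a blank node and a URI reference for `org:member` can be sketched as a small helper. This is a minimal illustration using only the standard library, not the code in `rdf_exporter.py`: the function names, the `known_uris` lookup table, and the `example.org` URI are all hypothetical.

```python
from typing import Mapping


def escape_turtle_literal(text: str) -> str:
    """Escape backslashes, double quotes and newlines for a short Turtle string."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def partner_member_turtle(partner_name: str, known_uris: Mapping[str, str]) -> str:
    """Render the org:member statement for one partner (hypothetical helper).

    Emits a URI reference when the partner has a known URI, and falls back
    to the blank-node form (org:Organization plus schema:name) otherwise.
    """
    uri = known_uris.get(partner_name)
    if uri is not None:
        return f"org:member <{uri}>"
    name = escape_turtle_literal(partner_name)
    return f'org:member [ a org:Organization ; schema:name "{name}" ]'


# Hypothetical lookup table; the URI is a made-up placeholder, not a real GHCID.
known = {"Archieven.nl": "https://example.org/partner/archieven-nl"}

print(partner_member_turtle("Archieven.nl", known))
print(partner_member_turtle("WO2Net", known))
```

With a lookup like this, moving a partner from blank node to URI only requires adding an entry to the table; the serialized membership structure stays the same.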
## Real-World Example

### Input Data

From Dutch Organizations CSV (`data/voorbeeld_lijst_organisaties_en_diensten-totaallijst_nederland.csv`):

```
Regionaal Historisch Centrum (RHC) Drents Archief
- ISIL: NL-AsnDA
- City: Assen
- Partnerships:
  - Archieven.nl (aggregator_participation)
  - Archives Portal Europe (international_aggregator)
  - WO2Net (thematic_network)
  - OODE24 (Mondriaan) (thematic_network)
```

### RDF Output (Turtle)

```turtle
a schema:ArchiveOrganization, schema:Organization, org:Organization, prov:Entity, ghcid:HeritageCustodian, rico:CorporateBody ;
    schema:name "Regionaal Historisch Centrum (RHC) Drents Archief" ;
    org:hasMembership [
        a org:Membership, ghcid:Partnership ;
        org:organization ;
        org:member [ a org:Organization ; schema:name "Archieven.nl" ] ;
        org:role "aggregator_participation" ;
        schema:description "Dutch national archive portal" ;
    ] , [
        a org:Membership, ghcid:Partnership ;
        org:organization ;
        org:member [ a org:Organization ; schema:name "Archives Portal Europe" ] ;
        org:role "international_aggregator" ;
        schema:description "European archive aggregation network" ;
    ] , [
        a org:Membership, ghcid:Partnership ;
        org:organization ;
        org:member [ a org:Organization ; schema:name "WO2Net" ] ;
        org:role "thematic_network" ;
        schema:description "WWII heritage network" ;
    ] , [
        a org:Membership, ghcid:Partnership ;
        org:organization ;
        org:member [ a org:Organization ; schema:name "OODE24 (Mondriaan)" ] ;
        org:role "thematic_network" ;
        schema:description "Mondriaan art project" ;
    ] .
```

## Export Formats Supported

### 1. Turtle (RDF/Turtle)

```python
exporter = RDFExporter()
turtle = exporter.export([custodian], format="turtle")
```

**Features**:
- Human-readable RDF serialization
- Prefix declarations for all ontologies
- Blank node lists for partnerships

### 2. JSON-LD

```python
jsonld = exporter.export([custodian], format="json-ld")
```

**Features**:
- JSON structure with `@context`, `@type`, `@id`
- Machine-parseable linked data
- Interoperable with IIIF, Web Annotations, Activity Streams

### 3. RDF/XML

```python
rdfxml = exporter.export([custodian], format="xml")
```

**Features**:
- XML serialization for OAI-PMH, SWORD
- Traditional Semantic Web format

### 4. N-Triples

```python
ntriples = exporter.export([custodian], format="nt")
```

**Features**:
- Simple triple format (subject, predicate, object per line)
- Easy to parse with Unix tools

## Usage Examples

### Export Single Custodian

```python
from glam_extractor.exporters.rdf_exporter import RDFExporter
# InstitutionType and Provenance are assumed to live in glam_extractor.models too
from glam_extractor.models import (
    HeritageCustodian,
    InstitutionType,
    Partnership,
    Provenance,
)

custodian = HeritageCustodian(
    id="https://w3id.org/heritage/custodian/nl/test",
    name="Test Museum",
    institution_type=InstitutionType.MUSEUM,
    partnerships=[
        Partnership(
            partner_name="Museum Register",
            partnership_type="national_museum_certification"
        )
    ],
    provenance=Provenance(...)
)

exporter = RDFExporter()
turtle = exporter.export([custodian], format="turtle")
print(turtle)
```

### Export Multiple Custodians

```python
exporter = RDFExporter()
for custodian in custodians:
    exporter.add_custodian(custodian)

# Export all at once
turtle = exporter.export(custodians, format="turtle")
```

### Export to File

```python
exporter = RDFExporter()
turtle = exporter.export(custodians, format="turtle")

with open("output.ttl", "w", encoding="utf-8") as f:
    f.write(turtle)
```

## Ontology Namespaces

The RDF exporter integrates the following ontologies:

| Prefix | Namespace | Purpose |
|--------|-----------|---------|
| `ghcid` | `https://w3id.org/heritage/custodian/` | GHCID domain classes and properties |
| `cidoc` | `http://www.cidoc-crm.org/cidoc-crm/` | CIDOC Conceptual Reference Model (cultural heritage) |
| `rico` | `https://www.ica.org/standards/RiC/ontology#` | Records in Contexts (archival description) |
| `schema` | `http://schema.org/` | Schema.org vocabulary (web search, IIIF) |
| `org` | `http://www.w3.org/ns/org#` | W3C Organization Ontology (partnerships, hierarchy) |
| `prov` | `http://www.w3.org/ns/prov#` | W3C PROV Ontology (provenance tracking) |
| `foaf` | `http://xmlns.com/foaf/0.1/` | Friend of a Friend (agents, names) |
| `dcterms` | `http://purl.org/dc/terms/` | Dublin Core metadata terms |

## Design Decisions

### Why org:Membership?

The W3C Organization Ontology provides `org:Membership` specifically for representing "membership or affiliation of agents to organizations." This fits heritage institution partnerships well:

- **Standardized pattern** - Established W3C recommendation
- **Flexible scope** - Supports temporal bounds, roles, descriptions
- **Interoperable** - Used by government data portals (UK, EU)
- **Extensible** - Can add GHCID-specific properties via `ghcid:Partnership`

### Blank Nodes vs. URIs

- **Current**: Partner organizations are blank nodes
- **Rationale**: Most partners don't have GHCIDs (yet)
- **Future**: Replace blank nodes with URIs when partners are in GHCID dataset

Example migration:

```turtle
# Current (blank node)
org:member [ a org:Organization ; schema:name "Museum Register" ]

# Future (URI reference)
org:member
```

### Dual Typing (org:Membership + ghcid:Partnership)

Memberships are typed as **both** `org:Membership` and `ghcid:Partnership`:

```turtle
[ a org:Membership, ghcid:Partnership ; ... ]
```

**Rationale**:
- `org:Membership` - Standard interoperability with non-GLAM systems
- `ghcid:Partnership` - Domain-specific queries (e.g., SPARQL: `?s org:hasMembership ?m . ?m a ghcid:Partnership`)

## SPARQL Query Examples

### Find All Partnerships of an Institution

```sparql
PREFIX org: <http://www.w3.org/ns/org#>
PREFIX ghcid: <https://w3id.org/heritage/custodian/>

SELECT ?partner ?type
WHERE {
  org:hasMembership ?membership .
  ?membership a ghcid:Partnership ;
              ghcid:partner_name ?partner ;
              ghcid:partnership_type ?type .
}
```

### Find All Institutions in a Network

```sparql
PREFIX org: <http://www.w3.org/ns/org#>
PREFIX ghcid: <https://w3id.org/heritage/custodian/>
PREFIX schema: <http://schema.org/>

SELECT ?institution ?name
WHERE {
  ?institution org:hasMembership ?membership .
  ?membership org:role "thematic_network" ;
              ghcid:partner_name "WO2Net" .
  ?institution schema:name ?name .
}
```

### Find Partnerships with Temporal Scope

```sparql
PREFIX schema: <http://schema.org/>
PREFIX org: <http://www.w3.org/ns/org#>
PREFIX ghcid: <https://w3id.org/heritage/custodian/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?institution ?partner ?start ?end
WHERE {
  ?institution org:hasMembership ?membership .
  ?membership ghcid:partner_name ?partner ;
              schema:startDate ?start ;
              schema:endDate ?end .
  FILTER(?end > "2025-01-01"^^xsd:date)
}
```

## Next Steps

### Task 3: Conversation JSON Parser Enhancement

Add Partnership extraction to `src/glam_extractor/parsers/conversation.py`:

1. Pattern detection for partnership mentions
2. Classify partnership types from context
3. Extract temporal scope when mentioned
4. Link to partner organizations if identifiable

### Task 4: Global Partnership Taxonomy Documentation

Document the partnership type taxonomy in `docs/PARTNERSHIP_TAXONOMY.md`:

1. **Dutch Partnership Types** (18 types observed):
   - `national_museum_certification` - Museum Register
   - `aggregator_participation` - Collectie Nederland, Archieven.nl
   - `digitization_program` - Versnellen, DC4EU
   - `thematic_network` - WO2Net, Mondriaan, Van Gogh Worldwide
   - (and 14 more types)

2. **Global Partnership Categories**:
   - National certifications/registers
   - Aggregation platforms
   - Digitization programs
   - Thematic networks
   - International collaborations
   - Funding partnerships
   - Technical infrastructure

3. **Mapping to Controlled Vocabularies**:
   - AAT (Art & Architecture Thesaurus)
   - PROV-O activity types
   - EU corporate vocabularies (CPOV)

## References

- **W3C Organization Ontology**: https://www.w3.org/TR/vocab-org/
- **CIDOC-CRM**: https://www.cidoc-crm.org/
- **RiC-O**: https://www.ica.org/standards/RiC/ontology
- **PROV-O**: https://www.w3.org/TR/prov-o/
- **Schema.org**: https://schema.org/
- **LinkML Schema**: `schemas/collections.yaml` (Partnership class definition)

---

**Contributors**: OpenCODE AI Agent
**License**: CC0 1.0 Universal (Public Domain)
**Project**: GLAM Data Extractor - Global Heritage Custodian Identifier (GHCID) System