glam/docs/PARTNERSHIP_TAXONOMY.md
2025-11-19 23:25:22 +01:00

Partnership Taxonomy for Heritage Institutions

Version: 1.0
Last Updated: 2025-11-07
Status: Draft

Overview

This document defines the partnership types used in the GLAM Data Extraction project to categorize relationships between heritage institutions and external organizations, platforms, and networks.

Purpose

Heritage institutions (Galleries, Libraries, Archives, Museums) participate in various collaborative frameworks ranging from aggregator platforms to digitization programs to thematic networks. Structured partnership metadata enables:

  • Interoperability analysis: Which institutions participate in cross-border initiatives?
  • Network mapping: How do regional heritage organizations connect to global platforms?
  • Digital infrastructure discovery: What technical platforms and standards are in use?
  • Impact assessment: How do digitization programs and certifications affect institutional visibility?

Partnership Type Definitions

1. Aggregator Participation

Type Code: aggregator_participation

Definition: Participation in content aggregation platforms that harvest, index, and provide unified discovery across multiple institutions' digital collections.

Characteristics:

  • Institution provides metadata/content to aggregator
  • Aggregator provides discovery portal and API access
  • Often requires metadata standardization (OAI-PMH, EDM, Dublin Core)
  • May include content hosting or reference linking

Examples:

  • Europeana (European aggregator, 4,000+ institutions)
  • DPLA (Digital Public Library of America, 4,500+ institutions)
  • Archieven.nl (Dutch archives aggregator, 80+ institutions)
  • Archives Portal Europe (European archival aggregator)
  • Collectie Nederland (Dutch heritage aggregator)
  • Rijkscollectie (Dutch national collection platform)
  • Trove (Australian aggregator)
  • DigitalNZ (New Zealand aggregator)

Typical Temporal Scope: Ongoing (institutions may join/leave)

AAT Concept Mapping:

  • aat:300025976 - cooperation (organizations)
  • aat:300379842 - digital repositories

2. Digitization Program

Type Code: digitization_program

Definition: Participation in funded or coordinated initiatives to digitize physical heritage materials, including equipment provision, training, metadata standards, and technical infrastructure.

Characteristics:

  • Time-bounded projects (typically 1-5 years)
  • Often government or foundation funded
  • May include equipment grants (scanners, cameras)
  • Training and capacity building components
  • Metadata standardization requirements

Examples:

  • DC4EU (Digital Culture for Europe, EU-funded program)
  • Versnellen (Dutch accelerated digitization program)
  • Metamorfoze (Dutch preservation/digitization program, National Library)
  • Google Arts & Culture (partnership program with Art Camera equipment)
  • Internet Archive (book digitization partnerships)
  • Endangered Archives Programme (British Library)
  • National Digital Stewardship Alliance (US coordination)

Typical Temporal Scope: 1-5 years (may have multiple phases)

AAT Concept Mapping:

  • aat:300054608 - digitization
  • aat:300054698 - preservation programs

3. Thematic Network

Type Code: thematic_network

Definition: Membership in subject-specific or technology-specific collaborative networks focused on specialized collections, standards development, or community of practice.

Characteristics:

  • Subject or technology focus (WW2 collections, IIIF, fashion heritage)
  • Often voluntary membership
  • Knowledge sharing and best practices
  • May develop standards or specifications
  • Community-driven governance

Examples:

  • IIIF Consortium (International Image Interoperability Framework)
  • WO2Net (Dutch WW2 heritage network)
  • ModeMuze (Dutch fashion heritage network)
  • NIOD (Netherlands Institute for War Documentation network)
  • DBNL (Digital Library of Dutch Literature)
  • Delpher (Dutch newspapers and books platform)
  • Linked Art (museum data community)
  • CIDOC CRM (conceptual reference model community)

Typical Temporal Scope: Ongoing (voluntary membership)

AAT Concept Mapping:

  • aat:300025976 - cooperation (organizations)
  • aat:300025948 - professional associations

4. International Aggregator

Type Code: international_aggregator

Definition: Participation in global-scale aggregators, authority files, or linked data platforms that provide cross-border identifier harmonization and discovery.

Characteristics:

  • Global scope (50+ countries)
  • Authority control or identifier management
  • Often library-focused (but expanding)
  • Linked data / semantic web technologies
  • Persistent identifier assignment

Examples:

  • WorldCat (OCLC global library catalog, 10,000+ libraries)
  • VIAF (Virtual International Authority File, name authorities)
  • Getty Vocabularies (AAT, TGN, ULAN authority vocabularies)
  • Wikidata (global knowledge graph, cultural heritage focus growing)
  • OpenStreetMap (geographic data, institutions mapped)

Typical Temporal Scope: Ongoing (continuous participation)

AAT Concept Mapping:

  • aat:300379842 - digital repositories
  • aat:300026061 - authority files

5. National Certification

Type Code: national_certification

Definition: Official recognition, registration, or certification by national heritage authorities, UNESCO, or government-designated collection programs.

Characteristics:

  • Government or international body authority
  • Often includes quality standards
  • May provide funding or legal status
  • Public accountability and reporting
  • Status symbol for prestige and trust

Examples:

  • Museum Register (Museumregister, Dutch national certification, ~420 museums)
  • National Collection Designation (UK, ~140 collections)
  • UNESCO World Heritage (cultural heritage sites)
  • UNESCO Memory of the World (documentary heritage)
  • Smithsonian Affiliations (US museum network, 200+ affiliates)

Typical Temporal Scope: Ongoing (periodic re-certification may be required)

AAT Concept Mapping:

  • aat:300026392 - accreditation (validation)
  • aat:300055932 - registration (official)

6. Academic Network

Type Code: academic_network

Definition: Participation in university consortia, research library networks, or scholarly communication platforms, often focused on academic resource sharing and open access.

Characteristics:

  • University or research institution membership
  • Resource sharing (ILL, document delivery)
  • Subscription consortia for journals/databases
  • Often national or regional scope
  • Library-focused governance

Examples:

  • CONRICYT (Mexican academic library consortium, 500+ institutions)
  • OCLC (global library cooperative, cataloging/ILL)
  • ALDU (Latin American university libraries)
  • DSpace (institutional repository software community)
  • OAI-PMH (Open Archives Initiative, metadata harvesting)
  • COAR (Confederation of Open Access Repositories)

Typical Temporal Scope: Ongoing (membership-based)

AAT Concept Mapping:

  • aat:300025976 - cooperation (organizations)
  • aat:300266386 - academic institutions

7. Linked Data Platform

Type Code: linked_data_platform

Definition: Participation in semantic web initiatives, SPARQL endpoints, RDF data publishing, or linked open data (LOD) platforms for machine-readable heritage data.

Characteristics:

  • Technical infrastructure (SPARQL, RDF, JSON-LD)
  • Linked open data principles (URIs, HTTP, standards)
  • Semantic interoperability (ontologies, vocabularies)
  • API access and machine-readable data
  • Often experimental/pilot projects

Examples:

  • SPARQL endpoints (institution-specific)
  • Linked Data initiatives (various)
  • RDF data dumps (Wikidata, DBpedia)
  • Semantic Web platforms (Europeana LOD, DPLA LOD)
  • Tainacan (Brazilian digital repository with linked data)

Typical Temporal Scope: Ongoing (technical implementation)

AAT Concept Mapping:

  • aat:300054162 - computer networks
  • aat:300379842 - digital repositories

8. Dataset Registry

Type Code: dataset_registry

Definition: Publication and registration of research datasets, cultural heritage data exports, or institutional metadata in data repositories with DOI assignment and preservation.

Characteristics:

  • DOI assignment for datasets
  • Long-term preservation commitment
  • Metadata describing datasets
  • Often open access
  • Versioning and citation tracking

Examples:

  • DANS (Data Archiving and Networked Services, Netherlands)
  • Dataverse (Harvard, multi-institutional research data repository)
  • Zenodo (CERN-hosted open research repository)
  • Figshare (research data repository)
  • Dryad (research data publication)

Typical Temporal Scope: Ongoing (dataset publication)

AAT Concept Mapping:

  • aat:300379842 - digital repositories
  • aat:300028051 - databases (structured data)

9. Generic Partnership

Type Code: partnership

Definition: Catch-all category for partnerships that don't fit other types, or when the nature of the partnership is unclear from context.

Characteristics:

  • Used when specific type cannot be determined
  • May be refined later with additional context
  • Includes informal collaborations
  • Ad-hoc project partnerships

Examples: Any unclassified partnership mention

AAT Concept Mapping:

  • aat:300025976 - cooperation (organizations)

Dutch Heritage Sector Partnership Patterns

Analysis of Dutch Organizations CSV (1,351 institutions)

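From data/voorbeeld_lijst_organisaties_en_diensten-totaallijst_nederland.csv, we observe 18 distinct platforms and systems in use among Dutch heritage institutions. The 16 tabulated below are mapped to partnership types where applicable; Atlantis and MAIS are collection management systems, not partnerships: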

Platform Participation Frequencies

| Platform | Count | Type | Notes |
|---|---|---|---|
| Collectie Nederland | 987 | aggregator_participation | National aggregator, 73% of institutions |
| Archieven.nl | 502 | aggregator_participation | Archives aggregator, 37% |
| Rijkscollectie | 178 | aggregator_participation | National collection, 13% |
| Atlantis | 156 | System (not partnership) | CMS for libraries |
| Delpher | 134 | thematic_network | Historical newspapers, 10% |
| MAIS | 89 | System (not partnership) | CMS for archives |
| Museum register | 62 | national_certification | National museum accreditation |
| DBNL | 43 | thematic_network | Digital Library of Dutch Literature |
| NIOD | 21 | thematic_network | WW2 documentation |
| Europeana | 18 | aggregator_participation | European aggregator |
| WO2Net | 12 | thematic_network | WW2 heritage network |
| IIIF | 8 | thematic_network | Image interoperability |
| ModeMuze | 6 | thematic_network | Fashion heritage |
| Google Arts & Culture | 5 | digitization_program | Art Camera program |
| Metamorfoze | 4 | digitization_program | Preservation program |
| Versnellen | 2 | digitization_program | Digitization acceleration |

Key Findings:

  • Aggregator dominance: 987 institutions (73%) participate in Collectie Nederland
  • Multi-platform strategy: Most institutions participate in 2-4 platforms
  • Regional variation: Urban institutions have higher platform participation
  • Type correlation: Museums favor Rijkscollectie (art museums), archives favor Archieven.nl

Global Partnership Patterns (Preliminary)

Evidence from 139 Conversation Files

Based on preliminary analysis of conversation text (60+ countries covered):

Regional Aggregators by Continent

Europe:

  • Europeana (4,000+ institutions, 40+ countries)
  • Archives Portal Europe (archival focus)
  • Judaica Europeana (Jewish heritage)

Americas:

  • DPLA (Digital Public Library of America, 4,500+ institutions)
  • Canadiana (Canadian aggregator)
  • Red de Museos Virtuales de América Latina

Asia-Pacific:

  • Trove (Australia)
  • DigitalNZ (New Zealand)
  • National Digital Heritage (Singapore)

Latin America:

  • Biblioteca Digital Hispánica (Spain-Latin America)
  • Tainacan (Brazilian repositories)

Academic Consortia

  • OCLC WorldCat: 10,000+ libraries globally
  • CONRICYT: 500+ Mexican academic institutions
  • ALDU: Latin American university libraries
  • Ex Libris/ProQuest networks: Global library automation

Digitization Programs

  • Google Arts & Culture: 2,000+ museums worldwide (Art Camera partnerships)
  • Internet Archive: Book digitization, 1,000+ library partners
  • Endangered Archives Programme: 600+ projects (British Library)
  • HathiTrust: 150+ research libraries (US)

Controlled Vocabulary Mappings

AAT (Art & Architecture Thesaurus) Mappings

| Partnership Type | AAT Concept ID | AAT Term |
|---|---|---|
| Aggregator Participation | aat:300379842 | digital repositories |
| Digitization Program | aat:300054608 | digitization |
| Thematic Network | aat:300025976 | cooperation (organizations) |
| National Certification | aat:300026392 | accreditation (validation) |
| Academic Network | aat:300266386 | academic institutions |
| Linked Data Platform | aat:300054162 | computer networks |

PROV-O (Provenance Ontology) Mappings

Partnerships can be modeled as PROV activities:

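# Example: Institution joins Europeana
# The glam: namespace IRI below is a placeholder; substitute the project's own.
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix glam: <https://example.org/glam#> .

<institution/rijksmuseum> prov:wasInfluencedBy <partnership/europeana-2015> .

<partnership/europeana-2015> a prov:Activity ;
    rdfs:label "Rijksmuseum joined Europeana"@en ;
    prov:startedAtTime "2015-01-01T00:00:00Z"^^xsd:dateTime ;
    prov:wasAssociatedWith <aggregator/europeana> ;
    glam:partnershipType "aggregator_participation" .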

EU CPOV (Core Public Organization Vocabulary)

For government-mandated partnerships:

  • cpov:formalFramework - Legal/regulatory framework
  • org:Organization - Partner organization
  • org:memberOf - Membership relationships

Data Model: Partnership Schema

LinkML Schema Extension

Add to schemas/core.yaml or create new schemas/partnerships.yaml:

classes:
  Partnership:
    description: >-
      A collaborative relationship between a heritage institution and an 
      external organization, platform, or network.      
    slots:
      - partner_name
      - partnership_type
      - start_date
      - end_date
      - partnership_description
      - partner_url
      - contact_point
      - membership_status
      - funding_source

slots:
  partner_name:
    description: Name of partner organization or platform
    range: string
    required: true

  partnership_type:
    description: Type of partnership from controlled vocabulary
    range: PartnershipTypeEnum
    required: true

  start_date:
    description: Partnership start date (ISO 8601)
    range: date

  end_date:
    description: Partnership end date (ISO 8601, omit if ongoing)
    range: date

  partnership_description:
    description: Context and details of partnership
    range: string

  partner_url:
    description: URL of partner organization or platform
    range: uri

  contact_point:
    description: Contact person or department for partnership
    range: string

  membership_status:
    description: Current status (active, inactive, pending, pilot)
    range: MembershipStatusEnum

  funding_source:
    description: Funding source if applicable (e.g., EU Horizon, government grant)
    range: string

enums:
  PartnershipTypeEnum:
    permissible_values:
      aggregator_participation:
        description: Participation in content aggregation platforms
      digitization_program:
        description: Participation in digitization initiatives
      thematic_network:
        description: Subject or technology-specific networks
      international_aggregator:
        description: Global aggregators and authority files
      national_certification:
        description: Official government or UNESCO recognition
      academic_network:
        description: University consortia and research networks
      linked_data_platform:
        description: Semantic web and linked open data platforms
      dataset_registry:
        description: Research data repository participation
      partnership:
        description: Generic or unclassified partnership

  MembershipStatusEnum:
    permissible_values:
      active:
        description: Currently active partnership
      inactive:
        description: Partnership ended or suspended
      pending:
        description: Partnership under negotiation
      pilot:
        description: Experimental or trial phase
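
A record shaped like this schema could be mirrored in Python roughly as follows. This is a minimal sketch for illustration only: the class names PartnershipRecord, PartnershipType, and MembershipStatus are hypothetical and are not generated from the LinkML schema, and only a subset of the slots is shown.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class PartnershipType(str, Enum):
    """Mirrors the permissible values of PartnershipTypeEnum."""
    AGGREGATOR_PARTICIPATION = "aggregator_participation"
    DIGITIZATION_PROGRAM = "digitization_program"
    THEMATIC_NETWORK = "thematic_network"
    INTERNATIONAL_AGGREGATOR = "international_aggregator"
    NATIONAL_CERTIFICATION = "national_certification"
    ACADEMIC_NETWORK = "academic_network"
    LINKED_DATA_PLATFORM = "linked_data_platform"
    DATASET_REGISTRY = "dataset_registry"
    PARTNERSHIP = "partnership"


class MembershipStatus(str, Enum):
    """Mirrors the permissible values of MembershipStatusEnum."""
    ACTIVE = "active"
    INACTIVE = "inactive"
    PENDING = "pending"
    PILOT = "pilot"


@dataclass
class PartnershipRecord:
    # Hypothetical in-memory counterpart of the Partnership class (subset of slots).
    partner_name: str
    partnership_type: PartnershipType
    start_date: Optional[date] = None
    end_date: Optional[date] = None  # left as None while the partnership is ongoing
    membership_status: Optional[MembershipStatus] = None

    def __post_init__(self):
        if not self.partner_name:
            raise ValueError("partner_name is required")
        # Enum lookup raises ValueError for codes outside the controlled vocabulary.
        self.partnership_type = PartnershipType(self.partnership_type)
        if self.membership_status is not None:
            self.membership_status = MembershipStatus(self.membership_status)


record = PartnershipRecord(
    "Europeana",
    "aggregator_participation",
    start_date=date(2015, 1, 1),
    membership_status="active",
)
```

Passing plain strings keeps the constructor convenient for extraction code, while the enum conversion rejects any type code that is not in the taxonomy.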

Extraction Strategies

Pattern-Based Extraction (Implemented)

The ConversationParser.extract_partnerships() method uses:

  1. Named entity patterns: Regex matching known partnership names

    • Europeana, DPLA, Archieven.nl, Collectie Nederland, etc.
    • 8 partnership types with 30+ predefined patterns
  2. Generic phrase patterns:

    • "collaborates with [Organization]"
    • "member of the [Network/Consortium]"
    • "part of [Platform]"
    • "participated in [Program]"
  3. Temporal extraction:

    • "from 2020 to 2025" → start_date, end_date
    • "since 2018" → start_date
    • "until 2023" → end_date
    • "in 2022" → single-year partnership
  4. Context extraction:

    • Extracts sentence containing partnership mention
    • Max 300 characters
    • Provides description field for validation
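
The four steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual ConversationParser.extract_partnerships() implementation: the function name extract_partnerships_sketch, the small KNOWN_PARTNERS table, and the exact regular expressions are all hypothetical.

```python
import re

# Hypothetical subset of known partners -> partnership type code.
KNOWN_PARTNERS = {
    "Europeana": "aggregator_participation",
    "DPLA": "aggregator_participation",
    "Archieven.nl": "aggregator_participation",
    "Metamorfoze": "digitization_program",
    "IIIF": "thematic_network",
}

# Step 2: a cue phrase followed by one or more capitalized words.
GENERIC = re.compile(
    r"\b(?:collaborates with|member of(?: the)?|part of|participated in)\s+"
    r"([A-Z][\w&-]*(?:\s+[A-Z][\w&-]*)*)"
)

# Step 3: temporal expressions.
RANGE = re.compile(r"\bfrom (\d{4}) to (\d{4})\b")
SINCE = re.compile(r"\bsince (\d{4})\b")
UNTIL = re.compile(r"\buntil (\d{4})\b")
IN_YEAR = re.compile(r"\bin (\d{4})\b")


def extract_dates(sentence):
    """Return (start_year, end_year) as strings; either may be None."""
    if m := RANGE.search(sentence):
        return m.group(1), m.group(2)
    start = end = None
    if m := SINCE.search(sentence):
        start = m.group(1)
    if m := UNTIL.search(sentence):
        end = m.group(1)
    if start is None and end is None and (m := IN_YEAR.search(sentence)):
        start = end = m.group(1)  # "in 2022" is treated as a single-year partnership
    return start, end


def extract_partnerships_sketch(text):
    """Return one dict per partnership mention, processed sentence by sentence."""
    results = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # Step 1: known partner names.
        found = {
            name: ptype
            for name, ptype in KNOWN_PARTNERS.items()
            if re.search(rf"(?<!\w){re.escape(name)}(?!\w)", sentence)
        }
        # Step 2: generic phrases, skipping names already covered by step 1.
        for match in GENERIC.finditer(sentence):
            name = match.group(1)
            if not any(k in name or name in k for k in found):
                found[name] = "partnership"
        start, end = extract_dates(sentence)
        for name, ptype in found.items():
            results.append({
                "partner_name": name,
                "partnership_type": ptype,
                "start_date": start,
                "end_date": end,
                # Step 4: the sentence itself, capped at 300 characters.
                "partnership_description": sentence[:300],
            })
    return results
```

In this sketch, names found only by a generic phrase fall back to the catch-all partnership type, so they can be refined later once more context is available.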

CSV Parsing (Dutch Organizations)

Direct field mapping from CSV columns:

  • Europeana (J/N) → If "J", add {"partner_name": "Europeana", "partnership_type": "aggregator_participation"}
  • Collectie Nederland (J/N) → If "J", add partnership
  • Similar for Archieven.nl, Rijkscollectie, Museum register, etc.
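
A minimal sketch of this column mapping, using only the standard library. The exact column headers in the CSV are an assumption here (the real headers may differ, for example by including the "(J/N)" suffix), as are the Naam column and the names COLUMN_MAP and partnerships_from_row.

```python
import csv
import io

# Hypothetical mapping: CSV column header -> (partner_name, partnership_type).
COLUMN_MAP = {
    "Europeana": ("Europeana", "aggregator_participation"),
    "Collectie Nederland": ("Collectie Nederland", "aggregator_participation"),
    "Archieven.nl": ("Archieven.nl", "aggregator_participation"),
    "Rijkscollectie": ("Rijkscollectie", "aggregator_participation"),
    "Museum register": ("Museum register", "national_certification"),
}


def partnerships_from_row(row):
    """Return a partnership dict for every mapped column whose value is 'J' (ja)."""
    return [
        {"partner_name": name, "partnership_type": ptype}
        for column, (name, ptype) in COLUMN_MAP.items()
        # Missing or empty cells count as "N".
        if (row.get(column) or "").strip().upper() == "J"
    ]


# In-memory sample; a real file would be opened with open(path, newline="").
sample = io.StringIO(
    "Naam,Europeana,Collectie Nederland,Museum register\n"
    "Voorbeeldmuseum,J,N,J\n"
)
rows = list(csv.DictReader(sample))
partnerships = partnerships_from_row(rows[0])
```

Keeping the mapping in a single dictionary means adding another J/N column is a one-line change.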

NLP-Based Extraction (Future)

For conversations mentioning partnerships without explicit names:

  • Named Entity Recognition: Extract ORG entities near partnership keywords
  • Dependency parsing: Identify "member of X" relationships
  • Coreference resolution: Link "the consortium" to previously mentioned organization
  • Entity linking: Match organization names to Wikidata entities

Validation and Quality Control

Confidence Scoring

Assign confidence scores based on extraction method:

  • 0.9-1.0: Explicit mention with temporal info (e.g., "joined Europeana in 2018")
  • 0.7-0.9: Clear mention without dates (e.g., "participates in DPLA")
  • 0.5-0.7: Generic pattern match (e.g., "member of Digital Heritage Network")
  • 0.3-0.5: Inferred from context (e.g., "portal" → likely aggregator)
  • 0.0-0.3: Uncertain, needs manual verification
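
The bands above could be assigned with a simple rule cascade. This is a sketch only: the function name confidence_band and its boolean inputs are hypothetical, and how a single score is chosen within a band is left open, as in the list above.

```python
def confidence_band(known_partner, has_dates, generic_match, inferred):
    """Map extraction evidence to a (low, high) confidence band.

    known_partner: the mention matched a predefined partner name
    has_dates:     temporal information was extracted from the same sentence
    generic_match: the mention came from a generic phrase pattern
    inferred:      the partnership was only inferred from context
    """
    if known_partner and has_dates:
        return (0.9, 1.0)  # e.g. "joined Europeana in 2018"
    if known_partner:
        return (0.7, 0.9)  # e.g. "participates in DPLA"
    if generic_match:
        return (0.5, 0.7)  # e.g. "member of Digital Heritage Network"
    if inferred:
        return (0.3, 0.5)  # e.g. "portal" suggests an aggregator
    return (0.0, 0.3)      # uncertain, needs manual verification


def needs_manual_review(band):
    """Flag mentions that fall in the lowest band."""
    return band == (0.0, 0.3)
```

Note that in this sketch a generic pattern match stays in the 0.5-0.7 band even when dates are present, since the listed bands only reward dates for explicit mentions.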

Cross-Referencing

Validate extracted partnerships against:

  1. Authoritative sources:

  2. Institutional websites:

    • Check website footer for "partner logos"
    • Check "about" or "partnerships" pages
    • Verify membership claims
  3. Deduplication:

    • Normalize partner names (e.g., "OCLC WorldCat" vs "WorldCat")
    • Handle abbreviations (e.g., "DPLA" vs "Digital Public Library of America")
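
The deduplication step might look like the following sketch. The ALIASES table and both function names are hypothetical; a real alias list would need to be curated from the extracted data.

```python
# Hypothetical alias table: normalized spelling -> canonical partner name.
ALIASES = {
    "oclc worldcat": "WorldCat",
    "worldcat": "WorldCat",
    "dpla": "DPLA",
    "digital public library of america": "DPLA",
}


def normalize_partner_name(name):
    """Collapse whitespace and case, then map known aliases to one canonical name."""
    key = " ".join(name.casefold().split())
    return ALIASES.get(key, name.strip())


def deduplicate(partnerships):
    """Keep the first record for each (canonical name, type) pair, preserving order."""
    seen = set()
    unique = []
    for p in partnerships:
        canonical = normalize_partner_name(p["partner_name"])
        key = (canonical, p["partnership_type"])
        if key not in seen:
            seen.add(key)
            unique.append({**p, "partner_name": canonical})
    return unique


records = [
    {"partner_name": "OCLC WorldCat", "partnership_type": "international_aggregator"},
    {"partner_name": "WorldCat", "partnership_type": "international_aggregator"},
    {"partner_name": "Digital Public Library of America", "partnership_type": "aggregator_participation"},
    {"partner_name": "DPLA", "partnership_type": "aggregator_participation"},
]
unique_records = deduplicate(records)
```

Including the partnership type in the key keeps two differently typed relationships with the same partner as separate records.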

Use Cases

1. Network Analysis

Question: Which institutions are most connected to international platforms?

Query (SPARQL):

# The glam: namespace IRI is a placeholder; substitute the project's own.
PREFIX glam: <https://example.org/glam#>

SELECT ?institution (COUNT(?partnership) AS ?partnershipCount)
WHERE {
  ?institution a glam:HeritageCustodian ;
               glam:partnership ?partnership .
  ?partnership glam:partnershipType ?type .
  FILTER(?type IN ("aggregator_participation", "international_aggregator"))
}
GROUP BY ?institution
ORDER BY DESC(?partnershipCount)

2. Digitization Impact Assessment

Question: Which institutions participated in digitization programs 2020-2025?

Query:

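# The glam: namespace IRI is a placeholder; substitute the project's own.
PREFIX glam: <https://example.org/glam#>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

SELECT ?institution ?program ?startDate
WHERE {
  ?institution glam:partnership ?partnership .
  ?partnership glam:partnerName ?program ;
               glam:partnershipType "digitization_program" ;
               glam:startDate ?startDate .
  # Restrict to programs that started within 2020-2025.
  FILTER(?startDate >= "2020-01-01"^^xsd:date &&
         ?startDate <= "2025-12-31"^^xsd:date)
}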

3. Regional Platform Coverage

Question: What percentage of Dutch heritage institutions participate in Collectie Nederland?

Analysis: 987 / 1,351 Dutch institutions = 73% participation
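
The same arithmetic reproduces the other percentages quoted in the platform frequency table. A short sketch, with counts copied from that table and variable names chosen only for illustration:

```python
TOTAL_INSTITUTIONS = 1351

# Counts taken from the platform participation table.
platform_counts = {
    "Collectie Nederland": 987,
    "Archieven.nl": 502,
    "Rijkscollectie": 178,
    "Delpher": 134,
}

# Share of all institutions, rounded to the nearest whole percent.
shares = {
    platform: round(100 * count / TOTAL_INSTITUTIONS)
    for platform, count in platform_counts.items()
}
print(shares)
# {'Collectie Nederland': 73, 'Archieven.nl': 37, 'Rijkscollectie': 13, 'Delpher': 10}
```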

4. Temporal Network Evolution

Question: How has Europeana membership grown over time?

Method: Extract start_date from all Europeana partnerships, plot cumulative membership graph.
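
The cumulative count behind such a graph could be computed as sketched below. The function name and the sample dates are hypothetical, and the sketch ignores end_date, so it counts institutions that ever joined, not current members.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def cumulative_membership(start_dates):
    """Return [(year, total joined up to and including that year), ...] sorted by year."""
    per_year = Counter(date.fromisoformat(d).year for d in start_dates)
    years = sorted(per_year)
    totals = accumulate(per_year[y] for y in years)
    return list(zip(years, totals))


# Hypothetical ISO 8601 start dates, as stored in the start_date slot.
sample_dates = [
    "2015-01-01", "2015-06-01",
    "2017-03-01",
    "2018-01-01", "2018-05-05", "2018-09-09",
]
growth = cumulative_membership(sample_dates)
print(growth)  # [(2015, 2), (2017, 3), (2018, 6)]
```

The resulting pairs can be passed directly to any plotting library as x and y values.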


Future Work

1. Expand Partnership Detection

  • Social media monitoring: Detect partnership announcements on Twitter, LinkedIn
  • Press releases: Scrape institutional news pages for partnership announcements
  • Grant databases: Parse EU Horizon, NEH, IMLS grant databases for partnership projects

2. Relationship Modeling

Beyond binary partnerships, model:

  • Hierarchies: Sub-networks within larger networks (e.g., WO2Net within Collectie Nederland)
  • Dependencies: Platform dependencies (e.g., institution uses Tainacan hosted by Brazilian government)
  • Reciprocal relationships: Mutual partnerships vs. service provider relationships

3. Impact Metrics

Quantify partnership impact:

  • Discoverability: Institutions in Europeana see X% increase in website traffic
  • Funding: Digitization program participants receive Y average funding
  • Visibility: Certified museums (Museum Register) have Z% higher visitor counts

4. Temporal Dynamics

Track partnership lifecycle:

  • Formation: When do institutions join networks? (e.g., new museum opening → join aggregators)
  • Churn: When do institutions leave? (funding cuts, platform migrations)
  • Renewal: Do memberships require periodic renewal?

Document Status: Draft v1.0
Feedback: Please submit issues or pull requests to improve this taxonomy.