glam/docs/PARTNERSHIP_TAXONOMY.md
2025-11-19 23:25:22 +01:00

# Partnership Taxonomy for Heritage Institutions
**Version**: 1.0
**Last Updated**: 2025-11-07
**Status**: Draft
## Overview
This document defines the partnership types used in the GLAM Data Extraction project to categorize relationships between heritage institutions and external organizations, platforms, and networks.
## Purpose
Heritage institutions (Galleries, Libraries, Archives, Museums) participate in various collaborative frameworks ranging from aggregator platforms to digitization programs to thematic networks. Structured partnership metadata enables:
- **Interoperability analysis**: Which institutions participate in cross-border initiatives?
- **Network mapping**: How do regional heritage organizations connect to global platforms?
- **Digital infrastructure discovery**: What technical platforms and standards are in use?
- **Impact assessment**: How do digitization programs and certifications affect institutional visibility?
## Partnership Type Definitions
### 1. Aggregator Participation
**Type Code**: `aggregator_participation`
**Definition**: Participation in content aggregation platforms that harvest, index, and provide unified discovery across multiple institutions' digital collections.
**Characteristics**:
- Institution provides metadata/content to aggregator
- Aggregator provides discovery portal and API access
- Often requires metadata standardization (OAI-PMH, EDM, Dublin Core)
- May include content hosting or reference linking
**Examples**:
- **Europeana** (European aggregator, 4,000+ institutions)
- **DPLA** (Digital Public Library of America, 4,500+ institutions)
- **Archieven.nl** (Dutch archives aggregator, 80+ institutions)
- **Archives Portal Europe** (European archival aggregator)
- **Collectie Nederland** (Dutch heritage aggregator)
- **Rijkscollectie** (Dutch national collection platform)
- **Trove** (Australian aggregator)
- **DigitalNZ** (New Zealand aggregator)
**Typical Temporal Scope**: Ongoing (institutions may join/leave)
**AAT Concept Mapping**:
- `aat:300025976` - cooperation (organizations)
- `aat:300379842` - digital repositories
---
### 2. Digitization Program
**Type Code**: `digitization_program`
**Definition**: Participation in funded or coordinated initiatives to digitize physical heritage materials, including equipment provision, training, metadata standards, and technical infrastructure.
**Characteristics**:
- Time-bounded projects (typically 1-5 years)
- Often government or foundation funded
- May include equipment grants (scanners, cameras)
- Training and capacity building components
- Metadata standardization requirements
**Examples**:
- **DC4EU** (Digital Culture for Europe, EU-funded program)
- **Versnellen** (Dutch accelerated digitization program)
- **Metamorfoze** (Dutch preservation/digitization program, National Library)
- **Google Arts & Culture** (partnership program with Art Camera equipment)
- **Internet Archive** (book digitization partnerships)
- **Endangered Archives Programme** (British Library)
- **National Digital Stewardship Alliance** (US coordination)
**Typical Temporal Scope**: 1-5 years (may have multiple phases)
**AAT Concept Mapping**:
- `aat:300054608` - digitization
- `aat:300054698` - preservation programs
---
### 3. Thematic Network
**Type Code**: `thematic_network`
**Definition**: Membership in subject-specific or technology-specific collaborative networks focused on specialized collections, standards development, or community of practice.
**Characteristics**:
- Subject or technology focus (WW2 collections, IIIF, fashion heritage)
- Often voluntary membership
- Knowledge sharing and best practices
- May develop standards or specifications
- Community-driven governance
**Examples**:
- **IIIF Consortium** (International Image Interoperability Framework)
- **WO2Net** (Dutch WW2 heritage network)
- **ModeMuze** (Dutch fashion heritage network)
- **NIOD** (Netherlands Institute for War Documentation network)
- **DBNL** (Digital Library of Dutch Literature)
- **Delpher** (Dutch newspapers and books platform)
- **Linked Art** (museum data community)
- **CIDOC CRM** (conceptual reference model community)
**Typical Temporal Scope**: Ongoing (voluntary membership)
**AAT Concept Mapping**:
- `aat:300025976` - cooperation (organizations)
- `aat:300025948` - professional associations
---
### 4. International Aggregator
**Type Code**: `international_aggregator`
**Definition**: Participation in global-scale aggregators, authority files, or linked data platforms that provide cross-border identifier harmonization and discovery.
**Characteristics**:
- Global scope (50+ countries)
- Authority control or identifier management
- Often library-focused (but expanding)
- Linked data / semantic web technologies
- Persistent identifier assignment
**Examples**:
- **WorldCat** (OCLC global library catalog, 10,000+ libraries)
- **VIAF** (Virtual International Authority File, name authorities)
- **Getty Vocabularies** (AAT, TGN, ULAN authority vocabularies)
- **Wikidata** (global knowledge graph, cultural heritage focus growing)
- **OpenStreetMap** (geographic data, institutions mapped)
**Typical Temporal Scope**: Ongoing (continuous participation)
**AAT Concept Mapping**:
- `aat:300379842` - digital repositories
- `aat:300026061` - authority files
---
### 5. National Certification
**Type Code**: `national_certification`
**Definition**: Official recognition, registration, or certification by national heritage authorities, UNESCO, or government-designated collection programs.
**Characteristics**:
- Government or international body authority
- Often includes quality standards
- May provide funding or legal status
- Public accountability and reporting
- Signals prestige and trustworthiness
**Examples**:
- **Museum Register** (Museumregister, Dutch national certification, ~420 museums)
- **National Collection Designation** (UK, ~140 collections)
- **UNESCO World Heritage** (cultural heritage sites)
- **UNESCO Memory of the World** (documentary heritage)
- **Smithsonian Affiliations** (US museum network, 200+ affiliates)
**Typical Temporal Scope**: Ongoing (periodic re-certification may be required)
**AAT Concept Mapping**:
- `aat:300026392` - accreditation (validation)
- `aat:300055932` - registration (official)
---
### 6. Academic Network
**Type Code**: `academic_network`
**Definition**: Participation in university consortia, research library networks, or scholarly communication platforms, often focused on academic resource sharing and open access.
**Characteristics**:
- University or research institution membership
- Resource sharing (ILL, document delivery)
- Subscription consortia for journals/databases
- Often national or regional scope
- Library-focused governance
**Examples**:
- **CONRICYT** (Mexican academic library consortium, 500+ institutions)
- **OCLC** (global library cooperative, cataloging/ILL)
- **ALDU** (Latin American university libraries)
- **DSpace** (institutional repository software community)
- **OAI-PMH** (Open Archives Initiative, metadata harvesting)
- **COAR** (Confederation of Open Access Repositories)
**Typical Temporal Scope**: Ongoing (membership-based)
**AAT Concept Mapping**:
- `aat:300025976` - cooperation (organizations)
- `aat:300266386` - academic institutions
---
### 7. Linked Data Platform
**Type Code**: `linked_data_platform`
**Definition**: Participation in semantic web initiatives, SPARQL endpoints, RDF data publishing, or linked open data (LOD) platforms for machine-readable heritage data.
**Characteristics**:
- Technical infrastructure (SPARQL, RDF, JSON-LD)
- Linked open data principles (URIs, HTTP, standards)
- Semantic interoperability (ontologies, vocabularies)
- API access and machine-readable data
- Often experimental/pilot projects
**Examples**:
- **SPARQL endpoints** (institution-specific)
- **Linked Data initiatives** (various)
- **RDF data dumps** (Wikidata, DBpedia)
- **Semantic Web platforms** (Europeana LOD, DPLA LOD)
- **Tainacan** (Brazilian digital repository with linked data)
**Typical Temporal Scope**: Ongoing (technical implementation)
**AAT Concept Mapping**:
- `aat:300054162` - computer networks
- `aat:300379842` - digital repositories
---
### 8. Dataset Registry
**Type Code**: `dataset_registry`
**Definition**: Publication and registration of research datasets, cultural heritage data exports, or institutional metadata in data repositories with DOI assignment and preservation.
**Characteristics**:
- DOI assignment for datasets
- Long-term preservation commitment
- Metadata describing datasets
- Often open access
- Versioning and citation tracking
**Examples**:
- **DANS** (Data Archiving and Networked Services, Netherlands)
- **Dataverse** (Harvard, multi-institutional research data repository)
- **Zenodo** (CERN-hosted open research repository)
- **Figshare** (research data repository)
- **Dryad** (research data publication)
**Typical Temporal Scope**: Ongoing (dataset publication)
**AAT Concept Mapping**:
- `aat:300379842` - digital repositories
- `aat:300028051` - databases (structured data)
---
### 9. Generic Partnership
**Type Code**: `partnership`
**Definition**: Catch-all category for partnerships that don't fit other types, or when the nature of the partnership is unclear from context.
**Characteristics**:
- Used when specific type cannot be determined
- May be refined later with additional context
- Includes informal collaborations
- Ad-hoc project partnerships
**Examples**: Any unclassified partnership mention
**AAT Concept Mapping**:
- `aat:300025976` - cooperation (organizations)
---
## Dutch Heritage Sector Partnership Patterns
### Analysis of Dutch Organizations CSV (1,351 institutions)
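From `data/voorbeeld_lijst_organisaties_en_diensten-totaallijst_nederland.csv`, we observe 18 distinct platforms and systems among Dutch heritage institutions. The table below lists 16 of them and maps each to a partnership type from this taxonomy where one applies: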
#### Platform Participation Frequencies
| Platform | Count | Type | Notes |
|----------|-------|------|-------|
| **Collectie Nederland** | 987 | `aggregator_participation` | National aggregator, 73% of institutions |
| **Archieven.nl** | 502 | `aggregator_participation` | Archives aggregator, 37% |
| **Rijkscollectie** | 178 | `aggregator_participation` | National collection, 13% |
| **Atlantis** | 156 | System (not partnership) | CMS for libraries |
| **Delpher** | 134 | `thematic_network` | Historical newspapers, 10% |
| **MAIS** | 89 | System (not partnership) | CMS for archives |
| **Museum register** | 62 | `national_certification` | National museum accreditation |
| **DBNL** | 43 | `thematic_network` | Digital Library of Dutch Literature |
| **NIOD** | 21 | `thematic_network` | WW2 documentation |
| **Europeana** | 18 | `aggregator_participation` | European aggregator |
| **WO2Net** | 12 | `thematic_network` | WW2 heritage network |
| **IIIF** | 8 | `thematic_network` | Image interoperability |
| **ModeMuze** | 6 | `thematic_network` | Fashion heritage |
| **Google Arts & Culture** | 5 | `digitization_program` | Art Camera program |
| **Metamorfoze** | 4 | `digitization_program` | Preservation program |
| **Versnellen** | 2 | `digitization_program` | Digitization acceleration |
**Key Findings**:
- **Aggregator dominance**: 987 institutions (73%) participate in Collectie Nederland
- **Multi-platform strategy**: Most institutions participate in 2-4 platforms
- **Regional variation**: Urban institutions have higher platform participation
- **Type correlation**: Museums (art museums in particular) favor Rijkscollectie; archives favor Archieven.nl
---
## Global Partnership Patterns (Preliminary)
### Evidence from 139 Conversation Files
Based on preliminary analysis of conversation text (60+ countries covered):
#### Regional Aggregators by Continent
**Europe**:
- Europeana (4,000+ institutions, 40+ countries)
- Archives Portal Europe (archival focus)
- Judaica Europeana (Jewish heritage)
**Americas**:
- DPLA (Digital Public Library of America, 4,500+ institutions)
- Canadiana (Canadian aggregator)
- Red de Museos Virtuales de América Latina
**Asia-Pacific**:
- Trove (Australia)
- DigitalNZ (New Zealand)
- National Digital Heritage (Singapore)
**Latin America**:
- Biblioteca Digital Hispánica (Spain-Latin America)
- Tainacan (Brazilian repositories)
#### Academic Consortia
- **OCLC WorldCat**: 10,000+ libraries globally
- **CONRICYT**: 500+ Mexican academic institutions
- **ALDU**: Latin American university libraries
- **Ex Libris/ProQuest** networks: Global library automation
#### Digitization Programs
- **Google Arts & Culture**: 2,000+ museums worldwide (Art Camera partnerships)
- **Internet Archive**: Book digitization, 1,000+ library partners
- **Endangered Archives Programme**: 600+ projects (British Library)
- **HathiTrust**: 150+ research libraries (US)
---
## Controlled Vocabulary Mappings
### AAT (Art & Architecture Thesaurus) Mappings
| Partnership Type | AAT Concept ID | AAT Term |
|------------------|----------------|----------|
| Aggregator Participation | `aat:300379842` | digital repositories |
| Digitization Program | `aat:300054608` | digitization |
| Thematic Network | `aat:300025976` | cooperation (organizations) |
| National Certification | `aat:300026392` | accreditation (validation) |
| Academic Network | `aat:300266386` | academic institutions |
| Linked Data Platform | `aat:300054162` | computer networks |
### PROV-O (Provenance Ontology) Mappings
Partnerships can be modeled as PROV activities:
```turtle
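# The @base and glam: IRIs are placeholders; substitute the project's own.
@base <https://example.org/glam/> .
@prefix glam: <https://example.org/glam/ontology#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# Example: Institution joins Europeana
<institution/rijksmuseum> prov:wasInfluencedBy <partnership/europeana-2015> .
<partnership/europeana-2015> a prov:Activity ;
    rdfs:label "Rijksmuseum joined Europeana"@en ;
    prov:startedAtTime "2015-01-01T00:00:00Z"^^xsd:dateTime ;
    prov:wasAssociatedWith <aggregator/europeana> ;
    glam:partnershipType "aggregator_participation" .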
```
### EU CPOV (Core Public Organisation Vocabulary)
For government-mandated partnerships:
- `cpov:formalFramework` - Legal/regulatory framework
- `org:Organization` - Partner organization (from the W3C Organization Ontology, which CPOV reuses)
- `org:memberOf` - Membership relationships (also from the W3C Organization Ontology)
---
## Data Model: Partnership Schema
### LinkML Schema Extension
Add to `schemas/core.yaml` or create new `schemas/partnerships.yaml`:
```yaml
classes:
  Partnership:
    description: >-
      A collaborative relationship between a heritage institution and an
      external organization, platform, or network.
    slots:
      - partner_name
      - partnership_type
      - start_date
      - end_date
      - partnership_description
      - partner_url
      - contact_point
      - membership_status
      - funding_source
slots:
  partner_name:
    description: Name of partner organization or platform
    range: string
    required: true
  partnership_type:
    description: Type of partnership from controlled vocabulary
    range: PartnershipTypeEnum
    required: true
  start_date:
    description: Partnership start date (ISO 8601)
    range: date
  end_date:
    description: Partnership end date (ISO 8601, omit if ongoing)
    range: date
  partnership_description:
    description: Context and details of partnership
    range: string
  partner_url:
    description: URL of partner organization or platform
    range: uri
  contact_point:
    description: Contact person or department for partnership
    range: string
  membership_status:
    description: Current status (active, inactive, pending, pilot)
    range: MembershipStatusEnum
  funding_source:
    description: Funding source if applicable (e.g., EU Horizon, government grant)
    range: string
enums:
  PartnershipTypeEnum:
    permissible_values:
      aggregator_participation:
        description: Participation in content aggregation platforms
      digitization_program:
        description: Participation in digitization initiatives
      thematic_network:
        description: Subject or technology-specific networks
      international_aggregator:
        description: Global aggregators and authority files
      national_certification:
        description: Official government or UNESCO recognition
      academic_network:
        description: University consortia and research networks
      linked_data_platform:
        description: Semantic web and linked open data platforms
      dataset_registry:
        description: Research data repository participation
      partnership:
        description: Generic or unclassified partnership
  MembershipStatusEnum:
    permissible_values:
      active:
        description: Currently active partnership
      inactive:
        description: Partnership ended or suspended
      pending:
        description: Partnership under negotiation
      pilot:
        description: Experimental or trial phase
```
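For code that handles partnership records before they are serialized, the schema above could be mirrored in plain Python. The following is only a sketch under that assumption: the class names `PartnershipType`, `MembershipStatus`, and `Partnership` are illustrative and are not taken from the project's codebase, and the date-order check is an extra sanity rule that the LinkML schema itself does not state.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class PartnershipType(str, Enum):
    """Mirrors PartnershipTypeEnum from the schema."""
    AGGREGATOR_PARTICIPATION = "aggregator_participation"
    DIGITIZATION_PROGRAM = "digitization_program"
    THEMATIC_NETWORK = "thematic_network"
    INTERNATIONAL_AGGREGATOR = "international_aggregator"
    NATIONAL_CERTIFICATION = "national_certification"
    ACADEMIC_NETWORK = "academic_network"
    LINKED_DATA_PLATFORM = "linked_data_platform"
    DATASET_REGISTRY = "dataset_registry"
    PARTNERSHIP = "partnership"


class MembershipStatus(str, Enum):
    """Mirrors MembershipStatusEnum from the schema."""
    ACTIVE = "active"
    INACTIVE = "inactive"
    PENDING = "pending"
    PILOT = "pilot"


@dataclass
class Partnership:
    # Required slots
    partner_name: str
    partnership_type: PartnershipType
    # Optional slots
    start_date: Optional[date] = None
    end_date: Optional[date] = None  # omit if ongoing
    partnership_description: Optional[str] = None
    partner_url: Optional[str] = None
    contact_point: Optional[str] = None
    membership_status: Optional[MembershipStatus] = None
    funding_source: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.partner_name:
            raise ValueError("partner_name is required")
        # Accept plain strings; unknown values raise ValueError
        self.partnership_type = PartnershipType(self.partnership_type)
        if self.membership_status is not None:
            self.membership_status = MembershipStatus(self.membership_status)
        # Assumption (not in the schema): an end date cannot precede the start date
        if self.start_date and self.end_date and self.end_date < self.start_date:
            raise ValueError("end_date precedes start_date")
```

A record such as `Partnership("Europeana", "aggregator_participation", start_date=date(2015, 1, 1))` is accepted, while an unknown type string is rejected at construction time.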
---
## Extraction Strategies
### Pattern-Based Extraction (Implemented)
The `ConversationParser.extract_partnerships()` method uses:
1. **Named entity patterns**: Regex matching known partnership names
   - Europeana, DPLA, Archieven.nl, Collectie Nederland, etc.
   - 8 partnership types with 30+ predefined patterns
2. **Generic phrase patterns**:
   - "collaborates with [Organization]"
   - "member of the [Network/Consortium]"
   - "part of [Platform]"
   - "participated in [Program]"
3. **Temporal extraction**:
   - "from 2020 to 2025" → `start_date`, `end_date`
   - "since 2018" → `start_date`
   - "until 2023" → `end_date`
   - "in 2022" → single-year partnership
4. **Context extraction**:
   - Extracts sentence containing partnership mention
   - Max 300 characters
   - Provides description field for validation
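The temporal and context steps listed above can be sketched with the standard `re` module. This is not the actual `ConversationParser.extract_partnerships()` code, which this document does not show; the function names `extract_temporal` and `extract_context` are hypothetical, and mapping a bare year to 1 January (start) or 31 December (end) is an assumption made here for illustration.

```python
import re
from typing import Dict, Optional

YEAR = r"((?:19|20)\d{2})"  # four-digit year, 1900-2099

RANGE = re.compile(rf"\bfrom {YEAR} to {YEAR}\b", re.IGNORECASE)
SINCE = re.compile(rf"\bsince {YEAR}\b", re.IGNORECASE)
UNTIL = re.compile(rf"\buntil {YEAR}\b", re.IGNORECASE)
IN_YEAR = re.compile(rf"\bin {YEAR}\b", re.IGNORECASE)


def extract_temporal(sentence: str) -> Dict[str, str]:
    """Return start_date/end_date (ISO 8601) found in a sentence, if any."""
    match = RANGE.search(sentence)
    if match:  # "from 2020 to 2025"
        return {"start_date": f"{match.group(1)}-01-01",
                "end_date": f"{match.group(2)}-12-31"}
    result: Dict[str, str] = {}
    match = SINCE.search(sentence)
    if match:  # "since 2018"
        result["start_date"] = f"{match.group(1)}-01-01"
    match = UNTIL.search(sentence)
    if match:  # "until 2023"
        result["end_date"] = f"{match.group(1)}-12-31"
    if result:
        return result
    match = IN_YEAR.search(sentence)
    if match:  # "in 2022": treated as a single-year partnership
        return {"start_date": f"{match.group(1)}-01-01",
                "end_date": f"{match.group(1)}-12-31"}
    return {}


def extract_context(text: str, mention: str, max_chars: int = 300) -> Optional[str]:
    """Return the first sentence containing the mention, cut to max_chars."""
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if mention.lower() in sentence.lower():
            return sentence.strip()[:max_chars]
    return None
```

The range pattern is tried first so that "from 2020 to 2025" is not partially consumed by the single-year patterns. The sentence splitter is deliberately naive and will mis-split on abbreviations such as "e.g. ".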
### CSV Parsing (Dutch Organizations)
Direct field mapping from CSV columns:
- `Europeana (J/N)` → If "J", add `{"partner_name": "Europeana", "partnership_type": "aggregator_participation"}`
- `Collectie Nederland (J/N)` → If "J", add partnership
- Similar for Archieven.nl, Rijkscollectie, Museum register, etc.
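A minimal sketch of this column mapping, using the standard `csv` module, might look as follows. Only the `Europeana (J/N)` and `Collectie Nederland (J/N)` headers appear in this document; the other three headers are assumed to follow the same pattern, and the function names and sample row are hypothetical.

```python
import csv
import io
from typing import Dict, Iterable, List

# CSV column -> (partner_name, partnership_type).
# The last three header names are assumptions, not confirmed against the CSV.
COLUMN_MAP = {
    "Europeana (J/N)": ("Europeana", "aggregator_participation"),
    "Collectie Nederland (J/N)": ("Collectie Nederland", "aggregator_participation"),
    "Archieven.nl (J/N)": ("Archieven.nl", "aggregator_participation"),
    "Rijkscollectie (J/N)": ("Rijkscollectie", "aggregator_participation"),
    "Museum register (J/N)": ("Museum register", "national_certification"),
}


def partnerships_from_row(row: Dict[str, str]) -> List[Dict[str, str]]:
    """Build partnership records for every mapped column whose value is 'J' (ja)."""
    partnerships = []
    for column, (name, ptype) in COLUMN_MAP.items():
        value = (row.get(column) or "").strip().upper()
        if value == "J":
            partnerships.append({"partner_name": name, "partnership_type": ptype})
    return partnerships


def read_partnerships(handle: Iterable[str]) -> List[List[Dict[str, str]]]:
    """Return one list of partnership records per CSV row."""
    return [partnerships_from_row(row) for row in csv.DictReader(handle)]


# Usage with made-up sample data (the "naam" column is hypothetical)
sample = io.StringIO(
    "naam,Europeana (J/N),Collectie Nederland (J/N)\n"
    "Voorbeeldmuseum,N,J\n"
)
print(read_partnerships(sample))
```

Missing columns and empty cells are treated the same as "N", so rows from a CSV with fewer platform columns still parse.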
### NLP-Based Extraction (Future)
For conversations mentioning partnerships without explicit names:
- **Named Entity Recognition**: Extract ORG entities near partnership keywords
- **Dependency parsing**: Identify "member of X" relationships
- **Coreference resolution**: Link "the consortium" to previously mentioned organization
- **Entity linking**: Match organization names to Wikidata entities
---
## Validation and Quality Control
### Confidence Scoring
Assign confidence scores based on extraction method:
- **0.9-1.0**: Explicit mention with temporal info (e.g., "joined Europeana in 2018")
- **0.7-0.9**: Clear mention without dates (e.g., "participates in DPLA")
- **0.5-0.7**: Generic pattern match (e.g., "member of Digital Heritage Network")
- **0.3-0.5**: Inferred from context (e.g., "portal" → likely aggregator)
- **0.0-0.3**: Uncertain, needs manual verification
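One way to apply these bands is to classify the kind of evidence first and then look up its score range. The sketch below assumes four boolean signals are available from the extraction step; the `Evidence` enum and both function names are illustrative and not part of the project's code.

```python
from enum import Enum
from typing import Tuple


class Evidence(Enum):
    EXPLICIT_WITH_DATES = "explicit_with_dates"  # "joined Europeana in 2018"
    EXPLICIT_NO_DATES = "explicit_no_dates"      # "participates in DPLA"
    GENERIC_PATTERN = "generic_pattern"          # "member of Digital Heritage Network"
    INFERRED = "inferred"                        # "portal" -> likely aggregator
    UNCERTAIN = "uncertain"                      # needs manual verification


# Score ranges copied from the list above
CONFIDENCE_BANDS = {
    Evidence.EXPLICIT_WITH_DATES: (0.9, 1.0),
    Evidence.EXPLICIT_NO_DATES: (0.7, 0.9),
    Evidence.GENERIC_PATTERN: (0.5, 0.7),
    Evidence.INFERRED: (0.3, 0.5),
    Evidence.UNCERTAIN: (0.0, 0.3),
}


def classify_evidence(known_partner: bool, has_dates: bool,
                      generic_match: bool = False,
                      inferred: bool = False) -> Evidence:
    """Pick the strongest evidence category supported by the signals."""
    if known_partner:
        return Evidence.EXPLICIT_WITH_DATES if has_dates else Evidence.EXPLICIT_NO_DATES
    if generic_match:
        return Evidence.GENERIC_PATTERN
    if inferred:
        return Evidence.INFERRED
    return Evidence.UNCERTAIN


def confidence_band(evidence: Evidence) -> Tuple[float, float]:
    """Return the (low, high) score range for an evidence category."""
    return CONFIDENCE_BANDS[evidence]
```

Returning the band rather than a single number leaves the choice of a point value within each range to the caller, since the list above does not specify one.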
### Cross-Referencing
Validate extracted partnerships against:
1. **Authoritative sources**:
   - Europeana member list (https://www.europeana.eu/en/about-us/members)
   - DPLA member list
   - Dutch Museum Register
2. **Institutional websites**:
   - Check website footer for "partner logos"
   - Check "about" or "partnerships" pages
   - Verify membership claims
3. **Deduplication**:
   - Normalize partner names (e.g., "OCLC WorldCat" vs "WorldCat")
   - Handle abbreviations (e.g., "DPLA" vs "Digital Public Library of America")
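The deduplication step could be sketched with a small alias table, as below. The alias entries cover only the two examples given above, the choice of canonical form (short name for WorldCat, full name for DPLA) is an assumption, and the function names are hypothetical.

```python
import re
from typing import Dict, List

# Lowercased variant -> canonical name (illustrative entries only)
ALIASES = {
    "oclc worldcat": "WorldCat",
    "worldcat": "WorldCat",
    "dpla": "Digital Public Library of America",
    "digital public library of america": "Digital Public Library of America",
}


def normalize_partner_name(name: str) -> str:
    """Collapse whitespace and map known variants to one canonical name."""
    cleaned = re.sub(r"\s+", " ", name).strip()
    return ALIASES.get(cleaned.lower(), cleaned)


def deduplicate(partnerships: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first record per (canonical name, partnership type) pair."""
    seen = set()
    unique = []
    for record in partnerships:
        canonical = normalize_partner_name(record["partner_name"])
        key = (canonical, record["partnership_type"])
        if key not in seen:
            seen.add(key)
            unique.append({**record, "partner_name": canonical})
    return unique
```

Names without an alias entry pass through with only whitespace cleanup, so the table can grow incrementally as new variants are found.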
---
## Use Cases
### 1. Network Analysis
**Question**: Which institutions are most connected to international platforms?
**Query** (SPARQL):
```sparql
# The glam: namespace IRI is a placeholder; substitute the project's own.
PREFIX glam: <https://example.org/glam/ontology#>

SELECT ?institution (COUNT(?partnership) AS ?partnershipCount)
WHERE {
  ?institution a glam:HeritageCustodian ;
               glam:partnership ?partnership .
  ?partnership glam:partnershipType ?type .
  FILTER(?type IN ("aggregator_participation", "international_aggregator"))
}
GROUP BY ?institution
ORDER BY DESC(?partnershipCount)
```
### 2. Digitization Impact Assessment
**Question**: Which institutions participated in digitization programs 2020-2025?
**Query**:
```sparql
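# The glam: namespace IRI is a placeholder; substitute the project's own.
PREFIX glam: <https://example.org/glam/ontology#>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

SELECT ?institution ?program ?startDate
WHERE {
  ?institution glam:partnership ?partnership .
  ?partnership glam:partnerName ?program ;
               glam:partnershipType "digitization_program" ;
               glam:startDate ?startDate .
  # Keep only partnerships that started within 2020-2025
  FILTER(?startDate >= "2020-01-01"^^xsd:date && ?startDate <= "2025-12-31"^^xsd:date)
}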
```
### 3. Regional Platform Coverage
**Question**: What percentage of Dutch heritage institutions participate in Collectie Nederland?
**Analysis**: 987 / 1,351 Dutch institutions ≈ **73% participation**
### 4. Temporal Network Evolution
**Question**: How has Europeana membership grown over time?
**Method**: Extract `start_date` from all Europeana partnerships, plot cumulative membership graph.
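The counting step of this method can be sketched with the standard library alone; plotting is left out. The function name `cumulative_membership` is hypothetical, the input is assumed to be ISO 8601 `start_date` strings, and departures (`end_date`) are ignored, so the result counts institutions that have ever joined rather than current members.

```python
from collections import Counter
from itertools import accumulate
from typing import Iterable, List, Tuple


def cumulative_membership(start_dates: Iterable[str]) -> List[Tuple[int, int]]:
    """Return (year, cumulative number of joins) pairs in ascending year order."""
    # Count joins per year from the leading YYYY of each ISO date
    joins_per_year = Counter(int(value[:4]) for value in start_dates)
    years = sorted(joins_per_year)
    # Running total over the sorted years
    totals = accumulate(joins_per_year[year] for year in years)
    return list(zip(years, totals))
```

Years with no new members do not appear in the output, so a plotting step would need to fill those gaps if a continuous time axis is wanted.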
---
## Future Work
### 1. Expand Partnership Detection
- **Social media monitoring**: Detect partnership announcements on Twitter, LinkedIn
- **Press releases**: Scrape institutional news pages for partnership announcements
- **Grant databases**: Parse EU Horizon, NEH, IMLS grant databases for partnership projects
### 2. Relationship Modeling
Beyond binary partnerships, model:
- **Hierarchies**: Sub-networks within larger networks (e.g., WO2Net within Collectie Nederland)
- **Dependencies**: Platform dependencies (e.g., institution uses Tainacan hosted by Brazilian government)
- **Reciprocal relationships**: Mutual partnerships vs. service provider relationships
### 3. Impact Metrics
Quantify partnership impact:
- **Discoverability**: Institutions in Europeana see X% increase in website traffic
- **Funding**: Digitization program participants receive Y average funding
- **Visibility**: Certified museums (Museum Register) have Z% higher visitor counts
### 4. Temporal Dynamics
Track partnership lifecycle:
- **Formation**: When do institutions join networks? (e.g., new museum opening → join aggregators)
- **Churn**: When do institutions leave? (funding cuts, platform migrations)
- **Renewal**: Do memberships require periodic renewal?
---
## References
### Standards and Vocabularies
- **AAT** (Getty): https://www.getty.edu/research/tools/vocabularies/aat/
- **PROV-O**: https://www.w3.org/TR/prov-o/
- **CPOV** (EU): https://joinup.ec.europa.eu/collection/semantic-interoperability-community-semic/solution/core-public-organisation-vocabulary
- **Schema.org**: https://schema.org/Organization
### Platform Documentation
- **Europeana Aggregation**: https://pro.europeana.eu/page/aggregation
- **DPLA Metadata Application Profile**: https://pro.dp.la/hubs/metadata-application-profile
- **OAI-PMH**: http://www.openarchives.org/pmh/
- **IIIF**: https://iiif.io/
### Related Research
- Terras, M. (2015). *Opening Access to Collections: The Making and Using of Open Digitised Cultural Content*. Online Information Review, 39(5), 733-752.
- Navarrete, T., & Borowiecki, K. J. (2016). *Changes in Cultural Heritage Institutions in the Digital Era*. Economics of Art and Culture, 295-315.
---
**Document Status**: Draft v1.0
**Feedback**: Please submit issues or pull requests to improve this taxonomy.