glam/docs/plan/global_glam/06-consumers-use-cases.md
2025-11-19 23:25:22 +01:00

Global GLAM Dataset: Consumers & Use Cases

Overview

This document describes the intended consumers, use cases, and applications for the Global GLAM Dataset. Understanding who will use the data and how they will use it informs design decisions throughout the project.

Primary Consumer Segments

1. Heritage Sector Professionals

1.1 Archivists & Records Managers

Needs:

  • Discover similar institutions for collaboration
  • Identify common preservation standards
  • Find institutions using specific collection management systems
  • Map archival networks and consortia

Use Cases:

# Find all archives using the RiC-O ontology
PREFIX hc: <https://w3id.org/heritage-custodian/>
PREFIX schema: <http://schema.org/>

SELECT ?archive ?name ?url WHERE {
  ?archive a hc:HeritageCustodian ;
           hc:institution_type "archive" ;
           schema:name ?name ;
           hc:metadata_standard "RiC-O" ;
           schema:url ?url .
}

Query Examples:

  • "Which archives in Southeast Asia have digital repositories?"
  • "What collection management systems are used by national archives?"
  • "Find archives participating in Archives Portal Europe"

1.2 Librarians & Information Professionals

Needs:

  • Identify libraries by ISIL code
  • Discover special collections
  • Map interlibrary loan networks
  • Find institutions using specific discovery systems

Use Cases:

# Find libraries with specific metadata standards
import duckdb

conn = duckdb.connect('glam_dataset.duckdb')
libraries = conn.execute("""
    SELECT name, country, website_url
    FROM institutions
    WHERE 'library' IN institution_types
      AND 'BIBFRAME' IN metadata_standards
    ORDER BY country, name
""").fetchall()

Query Examples:

  • "Which libraries in Africa have digitized collections?"
  • "Find research libraries using Alma/Ex Libris"
  • "Map libraries with IIIF image servers"

1.3 Museum Professionals

Needs:

  • Discover museums by collection type
  • Find institutions using LIDO/SPECTRUM
  • Identify museum networks and consortia
  • Locate museums with 3D digitization projects

Use Cases:

# Museums with natural history collections
museums = dataset.query(
    institution_type="museum",
    collection_subjects__contains="natural history",
    has_digital_repository=True
)

for museum in museums:
    print(f"{museum.name} ({museum.country}): {museum.website_url}")

2. Researchers & Academics

2.1 Digital Humanities Researchers

Needs:

  • Analyze global digitization patterns
  • Study metadata standards adoption
  • Research cultural heritage policies
  • Map institutional networks

Use Cases:

  • Network Analysis: Build graphs of institutional relationships
  • Temporal Analysis: Track digitization project growth over time
  • Geographic Analysis: Map heritage infrastructure by region
  • Standards Analysis: Study metadata standard adoption rates

Example Research Questions:

  • "How has museum digitization evolved across continents?"
  • "What factors predict digital repository adoption?"
  • "Which countries lead in heritage sector linked data adoption?"
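
The standards analysis use case listed above can be sketched with the standard library alone. This is a minimal illustration, not the project's analysis code: the sample records and the field names `region` and `metadata_standards` are assumptions made for the example and may not match the final schema.

```python
from collections import Counter, defaultdict

# Hypothetical sample records; names and field names are illustrative only.
institutions = [
    {"name": "Archive A", "region": "Europe", "metadata_standards": ["EAD", "RiC-O"]},
    {"name": "Library B", "region": "Europe", "metadata_standards": ["MARC21"]},
    {"name": "Museum C", "region": "Africa", "metadata_standards": ["LIDO"]},
    {"name": "Archive D", "region": "Africa", "metadata_standards": []},
]


def adoption_rates(records):
    """Return {region: {standard: share of institutions in the region using it}}."""
    totals = Counter()
    users = defaultdict(Counter)
    for rec in records:
        totals[rec["region"]] += 1
        # set() guards against a standard listed twice for one institution
        for standard in set(rec["metadata_standards"]):
            users[rec["region"]][standard] += 1
    return {
        region: {std: count / totals[region] for std, count in users[region].items()}
        for region in totals
    }


rates = adoption_rates(institutions)
print(rates["Europe"])
```

Dividing by the number of institutions per region, rather than by the number of standard mentions, keeps institutions with no recorded standard in the denominator.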

2.2 Library & Information Science Scholars

Needs:

  • Study collection management practices globally
  • Analyze institutional collaboration patterns
  • Research digital preservation strategies
  • Compare national approaches to heritage documentation

Datasets Derived:

# Generate dataset for research
df = dataset.to_dataframe()

# Analysis: CMS adoption by institution type
cms_by_type = df.groupby(['institution_type', 'collection_management_system']) \
                .size() \
                .unstack(fill_value=0)

# Export for statistical analysis (the pandas row index is omitted)
df.to_csv('glam_research_dataset.csv', index=False)
df.to_stata('glam_research_dataset.dta', write_index=False)  # For Stata users

2.3 Heritage Studies Researchers

Needs:

  • Map colonial heritage institutions
  • Study repatriation networks
  • Analyze indigenous heritage representation
  • Research post-conflict heritage reconstruction

Use Cases:

  • Identify institutions holding colonial-era collections
  • Map indigenous cultural heritage repositories
  • Track repatriation initiatives
  • Study war-affected heritage institutions

3. Software Developers & Data Engineers

3.1 Heritage Platform Developers

Needs:

  • Institution lookup APIs
  • ISIL code resolution
  • Geographic search capabilities
  • Aggregation endpoints for portals

Integration Examples:

# REST API integration (future)
import requests

# Find institutions near coordinates
response = requests.get('https://api.glam-dataset.org/institutions', params={
    'lat': 52.3676,
    'lon': 4.9041,
    'radius': 50,  # km
    'type': 'museum'
}, timeout=10)
response.raise_for_status()  # fail loudly on 4xx/5xx responses

institutions = response.json()
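
How the radius filter works is not specified in this document. One plausible approach, sketched here under that assumption, is a great-circle (haversine) distance check against each institution's coordinates. The record fields `lat` and `lon` and the sample entries are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(records, lat, lon, radius_km):
    """Keep records whose coordinates fall inside the search radius."""
    return [
        rec for rec in records
        if haversine_km(lat, lon, rec["lat"], rec["lon"]) <= radius_km
    ]


# Hypothetical records; coordinates are approximate city centres.
sample = [
    {"name": "Museum in Amsterdam", "lat": 52.37, "lon": 4.90},
    {"name": "Museum in Rotterdam", "lat": 51.92, "lon": 4.48},
    {"name": "Museum in Berlin", "lat": 52.52, "lon": 13.40},
]
nearby = within_radius(sample, 52.3676, 4.9041, 50)
```

A production service would more likely push this filter into a spatial index in the database; the linear scan here only illustrates the distance test.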

Applications:

  • Heritage aggregation portals (like Europeana)
  • Collection discovery platforms
  • Digital preservation tools
  • Institutional registries

3.2 Linked Data Application Developers

Needs:

  • SPARQL endpoint access
  • RDF data dumps
  • Authority linking (Wikidata, VIAF)
  • Schema.org compatible data

Integration Examples:

from SPARQLWrapper import SPARQLWrapper, JSON

# Query SPARQL endpoint
sparql = SPARQLWrapper("https://sparql.glam-dataset.org/query")
sparql.setReturnFormat(JSON)  # request JSON so convert() returns a dict
sparql.setQuery("""
    PREFIX hc: <https://w3id.org/heritage-custodian/>
    PREFIX schema: <http://schema.org/>

    SELECT ?name ?wikidata WHERE {
      ?inst a hc:HeritageCustodian ;
            schema:name ?name ;
            hc:wikidata_id ?wikidata ;
            schema:address/schema:addressCountry "NL" .
    }
""")

results = sparql.query().convert()

# Each binding maps a variable name to a {"type": ..., "value": ...} object
for row in results["results"]["bindings"]:
    print(row["name"]["value"], row["wikidata"]["value"])
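
Assuming the endpoint answers in the standard SPARQL 1.1 Query Results JSON Format (for example when a JSON return format is requested), the nested bindings can be flattened into plain dictionaries. The response below is hand-written for illustration, and `flatten_bindings` is a hypothetical helper, not part of any library.

```python
import json

# A hand-written response in the SPARQL 1.1 Query Results JSON Format;
# the values are illustrative, not real endpoint output.
raw = """
{
  "head": {"vars": ["name", "wikidata"]},
  "results": {"bindings": [
    {"name": {"type": "literal", "value": "Rijksmuseum"},
     "wikidata": {"type": "literal", "value": "Q190804"}}
  ]}
}
"""


def flatten_bindings(result):
    """Turn SPARQL JSON bindings into plain dicts; unbound variables become None."""
    variables = result["head"]["vars"]
    return [
        {var: (row[var]["value"] if var in row else None) for var in variables}
        for row in result["results"]["bindings"]
    ]


rows = flatten_bindings(json.loads(raw))
```

Iterating over `head.vars` rather than over each binding's keys keeps the output columns stable when a variable is unbound in some rows.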

4. Heritage Funding Organizations

4.1 Grant-Making Foundations

Needs:

  • Identify underserved regions
  • Assess digitization infrastructure gaps
  • Find institutions for partnership programs
  • Measure impact of funded initiatives

Analysis Examples:

# Identify regions with low digitization rates
analysis = dataset.analyze_coverage()

# Keyword lookups (field__lt, field__gt) follow the same convention as
# collection_subjects__contains in the museum query example
underserved = analysis.filter(
    digital_repository_rate__lt=0.3,  # <30% have repositories
    population__gt=1_000_000  # Significant population
)

# Generate report for funding priorities
report = underserved.to_report(
    metrics=['institution_count', 'digital_readiness_score'],
    format='pdf'
)
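
The filtering step above relies on a coverage API that this document does not define. As a rough sketch of the underlying computation, the following uses only the standard library; the field names `region` and `has_digital_repository`, the region labels, and the population figures are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical records and population figures, for illustration only.
institutions = (
    [{"region": "Region A", "has_digital_repository": flag}
     for flag in (True, False, False, False)]
    + [{"region": "Region B", "has_digital_repository": False} for _ in range(2)]
    + [{"region": "Region C", "has_digital_repository": flag}
       for flag in (True, False)]
)
region_population = {"Region A": 5_000_000, "Region B": 800_000, "Region C": 12_000_000}


def repository_rates(records):
    """Return {region: share of institutions that have a digital repository}."""
    counts = defaultdict(lambda: [0, 0])  # region -> [with_repository, total]
    for rec in records:
        counts[rec["region"]][1] += 1
        if rec["has_digital_repository"]:
            counts[rec["region"]][0] += 1
    return {region: with_repo / total for region, (with_repo, total) in counts.items()}


def underserved_regions(records, populations, max_rate=0.3, min_population=1_000_000):
    """Regions below the repository-rate threshold and above the population threshold."""
    rates = repository_rates(records)
    return sorted(
        region for region, rate in rates.items()
        if rate < max_rate and populations.get(region, 0) > min_population
    )


underserved = underserved_regions(institutions, region_population)
```

Regions with no population figure are treated as below the population threshold, which is a conservative choice for this sketch rather than a documented rule.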

4.2 International Development Organizations

Needs:

  • Map heritage infrastructure in developing regions
  • Identify capacity-building opportunities
  • Track sustainable development goal alignment
  • Assess disaster risk to cultural heritage

Use Cases:

  • UNESCO heritage site documentation
  • Cultural heritage emergency preparedness
  • Digital literacy program planning
  • Infrastructure investment prioritization

5. Government & Policy Makers

5.1 National Heritage Authorities

Needs:

  • National heritage institution inventories
  • Compliance monitoring (standards, regulations)
  • Strategic planning data
  • International comparison benchmarks

Reports Generated:

  • National heritage infrastructure status reports
  • Digital transformation progress tracking
  • Compliance dashboards (GDPR, accessibility, preservation standards)
  • Budget allocation recommendations

5.2 Cultural Policy Researchers

Needs:

  • Cross-national policy comparison
  • Impact assessment of heritage initiatives
  • Evidence for policy recommendations
  • Public access statistics

Analysis Examples:

# R analysis for policy research
library(tidyverse)

glam_data <- read_csv("glam_dataset.csv")

# Compare digital access policies by country
digital_access <- glam_data %>%
  group_by(country) %>%
  summarise(
    open_access_rate = mean(access_type == "open", na.rm = TRUE),
    avg_digitization_year = mean(digitization_start_year, na.rm = TRUE),
    linked_data_adoption = mean(linked_data_participation, na.rm = TRUE)
  )

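# Correlation with cultural policy funding
# policy_funding is assumed to be a separate data frame with `country` and
# `cultural_budget` columns; joining on country keeps the two vectors aligned
funding <- inner_join(digital_access, policy_funding, by = "country")
cor.test(funding$open_access_rate, funding$cultural_budget)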

6. Public & Community Users

6.1 Genealogists & Family Historians

Needs:

  • Find archives with vital records
  • Locate regional historical societies
  • Identify digitized collections by location
  • Access information for remote research

Discovery Interface:

Map-based search:
- Select region
- Filter by: "genealogy", "vital records", "local history"
- View: institutions with online access
- Links: Direct to collection catalogs

6.2 Cultural Heritage Enthusiasts

Needs:

  • Discover museums by interest area
  • Plan cultural heritage tourism
  • Find specialized collections
  • Access digital exhibitions

Use Cases:

  • "Find maritime museums in Scandinavia"
  • "Museums with Egyptian antiquities worldwide"
  • "Archives with medieval manuscripts"
  • "Libraries with rare book collections"

6.3 Citizen Scholars & Local Historians

Needs:

  • Community archive directories
  • Local historical society contacts
  • Volunteer digitization opportunities
  • Collection donation information

Consumption Patterns

Pattern 1: Bulk Dataset Download

  • Who: Researchers, data analysts
  • What: Complete dataset dumps
  • Format: CSV, Parquet, RDF/Turtle dumps
  • Frequency: Quarterly releases

Distribution:

https://glam-dataset.org/downloads/
├── glam-dataset-v1.0.0-full.zip
│   ├── institutions.csv
│   ├── institutions.parquet
│   ├── institutions.ttl
│   ├── collections.csv
│   ├── digital_platforms.csv
│   └── README.md
├── glam-dataset-v1.0.0-by-country.zip
│   ├── netherlands.csv
│   ├── brazil.csv
│   └── ...
└── glam-dataset-v1.0.0-rdf.tar.gz
    └── data/*.ttl
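
Once a release archive is unpacked, the CSV files can be read with the standard library. The sketch below uses an in-memory stand-in for `institutions.csv`; the column names and the `;` separator for multi-valued fields are assumptions, since the published column layout is not specified here.

```python
import csv
import io

# Stand-in for institutions.csv; columns and the ";" separator are assumed.
SAMPLE_CSV = """name,country,institution_types,website_url
Example Museum,NL,museum,https://example.org/museum
Example Archive,BR,archive;library,https://example.org/archive
"""


def load_institutions(handle):
    """Read institution rows, splitting the multi-valued type column into a list."""
    rows = []
    for row in csv.DictReader(handle):
        row["institution_types"] = [
            t for t in row["institution_types"].split(";") if t
        ]
        rows.append(row)
    return rows


institutions = load_institutions(io.StringIO(SAMPLE_CSV))
archives = [r["name"] for r in institutions if "archive" in r["institution_types"]]
```

For a real file, the same function would take `open("institutions.csv", newline="", encoding="utf-8")` in place of the `StringIO` object.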

Pattern 2: API Access

  • Who: Application developers, integrators
  • What: RESTful API queries
  • Format: JSON, JSON-LD
  • Frequency: Real-time

API Endpoints (future):

GET /api/v1/institutions
  ?country=NL
  &type=museum
  &has_digital_repository=true
  &limit=50

GET /api/v1/institutions/{id}

GET /api/v1/institutions/search
  ?q=maritime
  &lat=52.36&lon=4.88&radius=25

POST /api/v1/institutions/batch
  {"ids": ["inst-001", "inst-002", ...]}
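
Since the API is marked as future work, the following is only a sketch of how a client might assemble requests for the endpoints above without sending them. The host name is taken from the integration example earlier in this document and, like the helper names, should be treated as an assumption.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://api.glam-dataset.org"  # assumed host; the API is not yet live


def build_list_url(base, **filters):
    """Build a list-endpoint URL, dropping filters whose value is None."""
    params = {}
    for key, value in sorted(filters.items()):
        if value is None:
            continue
        # Booleans are serialised as lowercase "true"/"false", as in the endpoint sketch
        params[key] = str(value).lower() if isinstance(value, bool) else value
    return f"{base}/api/v1/institutions?{urlencode(params)}"


def build_batch_body(ids):
    """Serialise the JSON body for the batch endpoint."""
    return json.dumps({"ids": list(ids)})


url = build_list_url(
    BASE_URL, country="NL", type="museum", has_digital_repository=True, limit=50
)
body = build_batch_body(["inst-001", "inst-002"])
```

Sorting the filter keys makes the generated URLs deterministic, which helps with caching and with tests.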

Pattern 3: SPARQL Queries

  • Who: Linked data developers, semantic web researchers
  • What: SPARQL endpoint access
  • Format: SPARQL query results (XML, JSON) or RDF (Turtle)
  • Frequency: Real-time

Endpoint: https://sparql.glam-dataset.org/query

Example Queries:

# Complex relationship query
PREFIX hc: <https://w3id.org/heritage-custodian/>
PREFIX schema: <http://schema.org/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?inst ?name ?consortium ?cms WHERE {
  ?inst a hc:HeritageCustodian ;
        schema:name ?name ;
        hc:consortium_membership ?consortium ;
        hc:collection_management_system ?cms .
  
  FILTER(CONTAINS(LCASE(?name), "national"))
  FILTER(?consortium != "")
}
ORDER BY ?consortium ?name

Pattern 4: Data Visualization

  • Who: Journalists, public communicators, educators
  • What: Interactive visualizations, dashboards
  • Format: Web interfaces
  • Frequency: On-demand

Visualization Types:

  • Interactive maps of heritage institutions
  • Network graphs of institutional relationships
  • Timeline views of digitization projects
  • Statistical dashboards by country/region
  • Comparative charts (standards adoption, digital readiness)

Pattern 5: Embedded Data

  • Who: Website builders, CMS administrators
  • What: Embeddable widgets
  • Format: JavaScript embeds, iframes
  • Frequency: Real-time

Examples:

<!-- Embed institution finder widget -->
<div class="glam-finder" 
     data-country="NL" 
     data-type="archive">
</div>
<script src="https://cdn.glam-dataset.org/widget.js"></script>

<!-- Embed institution profile card -->
<iframe src="https://glam-dataset.org/embed/NL-AsdRM" 
        width="400" height="300"></iframe>

Application Examples

Application 1: Heritage Discovery Portal

Name: "Global Heritage Explorer"

Features:

  • Map-based browsing of institutions
  • Advanced search (type, collections, standards)
  • Institution profiles with enriched metadata
  • Links to digital repositories
  • Network visualization of partnerships

Tech Stack:

  • Frontend: React + Leaflet maps
  • Backend: FastAPI + PostgreSQL
  • Data: GLAM dataset (weekly sync)

Application 2: ISIL Code Resolver

Name: "ISIL Lookup Service"

Features:

  • REST API for ISIL code resolution
  • Returns institution metadata + URLs
  • Authority linking (Wikidata, VIAF)
  • OpenAPI specification

Example:

curl https://isil.glam-dataset.org/resolve/NL-AsdRM

{
  "isil": "NL-AsdRM",
  "name": "Rijksmuseum",
  "type": ["museum"],
  "country": "Netherlands",
  "city": "Amsterdam",
  "website": "https://www.rijksmuseum.nl",
  "wikidata": "Q190804",
  "collections_url": "https://www.rijksmuseum.nl/en/search"
}
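
A resolver would probably reject malformed codes before doing a lookup. The check below is a loose syntactic sketch based on the commonly described ISIL structure (a short prefix, a hyphen, an identifier, at most 16 characters in total); the exact limits should be verified against ISO 15511, and `looks_like_isil` is a hypothetical helper.

```python
import re

# Loose pattern: 1-4 alphanumeric prefix characters, a hyphen, then letters,
# digits, "/", "-" or ":". The exact limits are assumptions, not the standard's text.
ISIL_PATTERN = re.compile(r"[A-Za-z0-9]{1,4}-[A-Za-z0-9/:\-]+")
MAX_ISIL_LENGTH = 16


def looks_like_isil(code):
    """Return True if the string is plausibly an ISIL; does not check registration."""
    return len(code) <= MAX_ISIL_LENGTH and ISIL_PATTERN.fullmatch(code) is not None


print(looks_like_isil("NL-AsdRM"))
```

A syntactic pass like this only filters obvious typos; whether a code is actually assigned still requires a lookup against the registry data.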

Application 3: Research Data Package

Name: "GLAM Analytics Toolkit"

Features:

  • R package: install.packages("glamdata")
  • Python package: pip install glam-dataset
  • Pre-built analysis functions
  • Visualization helpers
  • Statistical model templates

Usage:

library(glamdata)

# Load dataset
data <- load_glam_dataset(version = "1.0.0")

# Built-in analysis
coverage_stats <- analyze_coverage(data, by = "country")

# Visualizations
plot_digital_readiness(data, region = "Europe")
plot_standards_adoption(data)

Application 4: Heritage Network Graph

Name: "Heritage Connections"

Features:

  • Graph database (Neo4j) of institutions
  • Relationship visualization
  • Community detection algorithms
  • Collaboration opportunity finder

Queries:

// Find institutions within 2 hops of the Rijksmuseum
MATCH (rijks:Institution {isil: "NL-AsdRM"})-[r*1..2]-(connected)
WHERE connected.country IN ["Netherlands", "Belgium", "Germany"]
RETURN rijks, r, connected
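
For readers without a graph database at hand, the traversal in the query above roughly corresponds to a breadth-first search with a hop limit. This sketch uses a hypothetical edge list and ignores the country filter; it is not a substitute for the Neo4j query.

```python
from collections import deque

# Hypothetical undirected partnership edges between institution identifiers.
EDGES = [
    ("NL-AsdRM", "inst-001"),
    ("inst-001", "inst-002"),
    ("inst-002", "inst-003"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map from (a, b) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def within_hops(adjacency, start, max_hops):
    """Breadth-first search returning {node: hop distance} for nodes 1..max_hops away."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    del distances[start]  # report only the connected nodes, not the start itself
    return distances


reachable = within_hops(build_adjacency(EDGES), "NL-AsdRM", 2)
```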

Application 5: Monitoring Dashboard

Name: "Digital Heritage Monitor"

Features:

  • Track digitization progress over time
  • URL health monitoring (link rot detection)
  • Standards adoption trends
  • Regional comparison charts
  • Alert system for dataset updates

Metrics Tracked:

  • Percentage with digital repositories
  • Metadata standards adoption rates
  • Geographic coverage completeness
  • Data quality scores
  • Growth trends
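
URL health monitoring could start from a coarse classification of HTTP outcomes. The labels and thresholds below are assumptions chosen for illustration, and `check_url` performs real network I/O, so it is defined here but not called.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map an HTTP status code (or None for no response) to a coarse health label."""
    if status is None:
        return "unreachable"
    if 200 <= status < 300:
        return "ok"
    if 300 <= status < 400:
        return "redirected"
    if status in (404, 410):
        return "gone"
    return "error"


def check_url(url, timeout=10):
    """Issue a HEAD request and classify the outcome.

    urlopen follows redirects itself, so "redirected" is only reported when a
    redirect could not be followed and surfaces as an HTTPError.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as exc:
        return classify_status(exc.code)
    except OSError:
        # URLError, timeouts, and connection failures are all OSError subclasses
        return classify_status(None)


print(classify_status(404))
```

Some servers reject HEAD requests, so a monitoring job would likely retry with GET before flagging a link as broken.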

Value Propositions

For Heritage Professionals

Value: "Discover, connect, and learn from peer institutions worldwide"

  • Find collaboration partners
  • Identify best practices
  • Benchmark digital maturity
  • Access a comprehensive institutional directory

For Researchers

Value: "Authoritative, structured data for heritage sector research"

  • Ready-to-use research datasets
  • Longitudinal tracking capability
  • Cross-national comparisons
  • Reproducible research support

For Developers

Value: "Structured, linked, API-accessible heritage institution data"

  • Build applications faster
  • Reliable authority linking
  • Standardized formats (RDF, CSV, Parquet)
  • Active maintenance and updates

For Funders

Value: "Evidence-based insights for strategic heritage investment"

  • Identify funding gaps
  • Measure initiative impact
  • Prioritize underserved regions
  • Track sector development

For the Public

Value: "Find and access cultural heritage collections globally"

  • Discover collections of interest
  • Plan heritage visits
  • Access digital resources
  • Support genealogy and research

Success Metrics (Usage)

Quantitative Metrics

  • Downloads: Target 10,000+ unique downloads in first year
  • API calls: Target 100,000+ requests/month by year 2
  • Citations: Target 50+ research papers citing dataset in 3 years
  • Applications: Target 20+ applications built on dataset
  • SPARQL queries: Target 1,000+ unique queries/month

Qualitative Metrics

  • Featured in heritage sector publications
  • Adopted by national heritage authorities
  • Integrated into major aggregation platforms (Europeana, DPLA)
  • Cited in policy documents
  • Community contributions (corrections, additions)

Sustainability Model

Open Data Commitment

  • License: CC0 or CC-BY 4.0
  • Access: Free, no registration required for bulk downloads
  • Formats: Open, widely supported formats
  • Code: Open source on GitHub

Community Governance

  • Accept contributions from institutions
  • Community review process for additions
  • Transparent data quality standards
  • Public issue tracker

Funding Options

  • Grant funding: Initial development (foundations, government)
  • Institutional support: Hosting by heritage organizations
  • API tier pricing (future): Free tier + paid high-volume tier
  • Value-added services: Custom data packages, consulting

Ethical Considerations

Data Quality & Accuracy

  • Clear provenance for all data
  • Confidence scores for derived data
  • Correction mechanism for errors
  • Regular validation against authoritative sources

Representation & Bias

  • Acknowledge data gaps in underrepresented regions
  • Actively seek data from Global South institutions
  • Document extraction methodology limitations
  • Community input on missing institutions

Privacy & Consent

  • Only public institutional data (no personal data)
  • Respect institutional preferences for inclusion
  • Opt-out mechanism for institutions
  • Clear data retention policies

Decolonization

  • Acknowledge colonial heritage in collection descriptions
  • Support indigenous heritage institution visibility
  • Be transparent about data sources and limitations
  • Consult with affected communities

Future Expansion

Phase 2 Features

  • User accounts for personalized searches
  • Saved searches and alerts
  • Community ratings and reviews
  • Multi-language interface
  • Mobile application

Phase 3 Features

  • Machine learning for institution classification
  • Automatic link rot detection and correction
  • Crowdsourced data validation
  • Integration with collection-level metadata
  • Real-time data streaming APIs

This overview of consumers and use cases is intended to keep the dataset design aligned with real-world needs across diverse user communities.