Global GLAM Dataset: Consumers & Use Cases
Overview
This document describes the intended consumers, use cases, and applications for the Global GLAM Dataset. Understanding who will use the data and how they will use it informs design decisions throughout the project.
Primary Consumer Segments
1. Heritage Sector Professionals
1.1 Archivists & Records Managers
Needs:
- Discover similar institutions for collaboration
- Identify common preservation standards
- Find institutions using specific collection management systems
- Map archival networks and consortia
Use Cases:
```sparql
# Find all archives using the RiC-O ontology
PREFIX hc: <https://w3id.org/heritage-custodian/>
PREFIX schema: <http://schema.org/>

SELECT ?archive ?name ?url WHERE {
  ?archive a hc:HeritageCustodian ;
           hc:institution_type "archive" ;
           schema:name ?name ;
           hc:metadata_standard "RiC-O" ;
           schema:url ?url .
}
```
Query Examples:
- "Which archives in Southeast Asia have digital repositories?"
- "What collection management systems are used by national archives?"
- "Find archives participating in Archives Portal Europe"
1.2 Librarians & Information Professionals
Needs:
- Identify libraries by ISIL code
- Discover special collections
- Map interlibrary loan networks
- Find institutions using specific discovery systems
Use Cases:
```python
# Find libraries with specific metadata standards
import duckdb

conn = duckdb.connect('glam_dataset.duckdb')
libraries = conn.execute("""
    SELECT name, country, website_url
    FROM institutions
    WHERE list_contains(institution_types, 'library')
      AND list_contains(metadata_standards, 'BIBFRAME')
    ORDER BY country, name
""").fetchall()
```
Query Examples:
- "Which libraries in Africa have digitized collections?"
- "Find research libraries using Alma/Ex Libris"
- "Map libraries with IIIF image servers"
1.3 Museum Professionals
Needs:
- Discover museums by collection type
- Find institutions using LIDO/SPECTRUM
- Identify museum networks and consortia
- Locate museums with 3D digitization projects
Use Cases:
```python
# Museums with natural history collections
museums = dataset.query(
    institution_type="museum",
    collection_subjects__contains="natural history",
    has_digital_repository=True,
)
for museum in museums:
    print(f"{museum.name} ({museum.country}): {museum.website_url}")
```
2. Researchers & Academics
2.1 Digital Humanities Researchers
Needs:
- Analyze global digitization patterns
- Study metadata standards adoption
- Research cultural heritage policies
- Map institutional networks
Use Cases:
- Network Analysis: Build graphs of institutional relationships
- Temporal Analysis: Track digitization project growth over time
- Geographic Analysis: Map heritage infrastructure by region
- Standards Analysis: Study metadata standard adoption rates
Example Research Questions:
- "How has museum digitization evolved across continents?"
- "What factors predict digital repository adoption?"
- "Which countries lead in heritage sector linked data adoption?"
2.2 Library & Information Science Scholars
Needs:
- Study collection management practices globally
- Analyze institutional collaboration patterns
- Research digital preservation strategies
- Compare national approaches to heritage documentation
Datasets Derived:
```python
# Generate dataset for research
df = dataset.to_dataframe()

# Analysis: CMS adoption by institution type
cms_by_type = (
    df.groupby(['institution_type', 'collection_management_system'])
      .size()
      .unstack(fill_value=0)
)

# Export for statistical analysis
df.to_csv('glam_research_dataset.csv')
df.to_stata('glam_research_dataset.dta')  # For Stata users
```
2.3 Heritage Studies Researchers
Needs:
- Map colonial heritage institutions
- Study repatriation networks
- Analyze indigenous heritage representation
- Research post-conflict heritage reconstruction
Use Cases:
- Identify institutions holding colonial-era collections
- Map indigenous cultural heritage repositories
- Track repatriation initiatives
- Study war-affected heritage institutions
3. Software Developers & Data Engineers
3.1 Heritage Platform Developers
Needs:
- Institution lookup APIs
- ISIL code resolution
- Geographic search capabilities
- Aggregation endpoints for portals
Integration Examples:
```python
# REST API integration (future)
import requests

# Find institutions near coordinates
response = requests.get('https://api.glam-dataset.org/institutions', params={
    'lat': 52.3676,
    'lon': 4.9041,
    'radius': 50,  # km
    'type': 'museum',
})
institutions = response.json()
```
Applications:
- Heritage aggregation portals (like Europeana)
- Collection discovery platforms
- Digital preservation tools
- Institutional registries
3.2 Linked Data Application Developers
Needs:
- SPARQL endpoint access
- RDF data dumps
- Authority linking (Wikidata, VIAF)
- Schema.org compatible data
Integration Examples:
```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Query the SPARQL endpoint
sparql = SPARQLWrapper("https://sparql.glam-dataset.org/query")
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX hc: <https://w3id.org/heritage-custodian/>
    PREFIX schema: <http://schema.org/>

    SELECT ?name ?wikidata WHERE {
      ?inst a hc:HeritageCustodian ;
            schema:name ?name ;
            hc:wikidata_id ?wikidata ;
            schema:address/schema:addressCountry "NL" .
    }
""")
results = sparql.query().convert()
```
4. Heritage Funding Organizations
4.1 Grant-Making Foundations
Needs:
- Identify underserved regions
- Assess digitization infrastructure gaps
- Find institutions for partnership programs
- Measure impact of funded initiatives
Analysis Examples:
```python
# Identify regions with low digitization rates
analysis = dataset.analyze_coverage()
underserved = analysis.filter(
    digital_repository_rate__lt=0.3,  # <30% have repositories
    population__gt=1_000_000,         # Significant population
)

# Generate report for funding priorities
report = underserved.to_report(
    metrics=['institution_count', 'digital_readiness_score'],
    format='pdf',
)
```
4.2 International Development Organizations
Needs:
- Map heritage infrastructure in developing regions
- Identify capacity-building opportunities
- Track sustainable development goal alignment
- Assess disaster risk to cultural heritage
Use Cases:
- UNESCO heritage site documentation
- Cultural heritage emergency preparedness
- Digital literacy program planning
- Infrastructure investment prioritization
5. Government & Policy Makers
5.1 National Heritage Authorities
Needs:
- National heritage institution inventories
- Compliance monitoring (standards, regulations)
- Strategic planning data
- International comparison benchmarks
Reports Generated:
- National heritage infrastructure status reports
- Digital transformation progress tracking
- Compliance dashboards (GDPR, accessibility, preservation standards)
- Budget allocation recommendations
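As a sketch of how one status-report metric might be computed, the snippet below derives per-country digital repository rates from institution records. The `country` and `has_digital_repository` field names and the sample rows are assumptions for illustration.

```python
from collections import Counter

# Hypothetical institution records for a national status report.
records = [
    {"country": "NL", "has_digital_repository": True},
    {"country": "NL", "has_digital_repository": False},
    {"country": "BR", "has_digital_repository": True},
]

totals, with_repo = Counter(), Counter()
for r in records:
    totals[r["country"]] += 1
    if r["has_digital_repository"]:
        with_repo[r["country"]] += 1

# Share of institutions per country that operate a digital repository.
rates = {c: with_repo[c] / totals[c] for c in totals}
print(rates)
```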
5.2 Cultural Policy Researchers
Needs:
- Cross-national policy comparison
- Impact assessment of heritage initiatives
- Evidence for policy recommendations
- Public access statistics
Analysis Examples:
```r
# R analysis for policy research
library(tidyverse)

glam_data <- read_csv("glam_dataset.csv")

# Compare digital access policies by country
digital_access <- glam_data %>%
  group_by(country) %>%
  summarise(
    open_access_rate = mean(access_type == "open", na.rm = TRUE),
    avg_digitization_year = mean(digitization_start_year, na.rm = TRUE),
    linked_data_adoption = mean(linked_data_participation, na.rm = TRUE)
  )

# Correlation with cultural policy funding
# (policy_funding is an external table, assumed aligned by country)
cor.test(digital_access$open_access_rate, policy_funding$cultural_budget)
```
6. Public & Community Users
6.1 Genealogists & Family Historians
Needs:
- Find archives with vital records
- Locate regional historical societies
- Identify digitized collections by location
- Access information for remote research
Discovery Interface:
Map-based search:
- Select region
- Filter by: "genealogy", "vital records", "local history"
- View: institutions with online access
- Links: Direct to collection catalogs
6.2 Cultural Heritage Enthusiasts
Needs:
- Discover museums by interest area
- Plan cultural heritage tourism
- Find specialized collections
- Access digital exhibitions
Use Cases:
- "Find maritime museums in Scandinavia"
- "Museums with Egyptian antiquities worldwide"
- "Archives with medieval manuscripts"
- "Libraries with rare book collections"
6.3 Citizen Scholars & Local Historians
Needs:
- Community archive directories
- Local historical society contacts
- Volunteer digitization opportunities
- Collection donation information
Consumption Patterns
Pattern 1: Bulk Dataset Download
Who: Researchers, data analysts
What: Complete dataset dumps
Format: CSV, Parquet, RDF/Turtle dumps
Frequency: Quarterly releases
Distribution:
```
https://glam-dataset.org/downloads/
├── glam-dataset-v1.0.0-full.zip
│   ├── institutions.csv
│   ├── institutions.parquet
│   ├── institutions.ttl
│   ├── collections.csv
│   ├── digital_platforms.csv
│   └── README.md
├── glam-dataset-v1.0.0-by-country.zip
│   ├── netherlands.csv
│   ├── brazil.csv
│   └── ...
└── glam-dataset-v1.0.0-rdf.tar.gz
    └── data/*.ttl
```
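As a sketch of working with the bulk dump, the snippet below parses an institutions CSV with only the standard library; the inline sample stands in for `institutions.csv`, and its column names are assumptions, not the published schema.

```python
import csv
import io

# Inline stand-in for institutions.csv from the bulk dump (columns assumed).
sample = """name,country,institution_types,has_digital_repository
Rijksmuseum,NL,museum,true
Biblioteca Nacional,BR,library,false
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# Filter to museums, as a downstream analysis might.
museums = [r["name"] for r in rows if r["institution_types"] == "museum"]
print(museums)
```

In practice the Parquet file would be the better choice for analysis at scale, loaded with a columnar engine rather than row-by-row CSV parsing.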
Pattern 2: API Access
Who: Application developers, integrators
What: RESTful API queries
Format: JSON, JSON-LD
Frequency: Real-time
API Endpoints (future):
```
GET /api/v1/institutions
    ?country=NL
    &type=museum
    &has_digital_repository=true
    &limit=50

GET /api/v1/institutions/{id}

GET /api/v1/institutions/search
    ?q=maritime
    &lat=52.36&lon=4.88&radius=25

POST /api/v1/institutions/batch
    {"ids": ["inst-001", "inst-002", ...]}
```
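Since the API is a future design, the snippet below only assembles request components offline with the standard library: the query string for the filtered listing and the JSON body for the batch endpoint. The base URL and parameter names follow the listing above; everything else is illustrative.

```python
import json
from urllib.parse import urlencode

BASE = "https://api.glam-dataset.org/api/v1"  # hypothetical base URL

# Query string for the filtered institutions listing.
params = {
    "country": "NL",
    "type": "museum",
    "has_digital_repository": "true",
    "limit": 50,
}
url = f"{BASE}/institutions?{urlencode(params)}"

# JSON body for the batch endpoint (keys must be quoted in real JSON).
body = json.dumps({"ids": ["inst-001", "inst-002"]})

print(url)
print(body)
```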
Pattern 3: SPARQL Queries
Who: Linked data developers, semantic web researchers
What: SPARQL endpoint access
Format: RDF results (XML, JSON, Turtle)
Frequency: Real-time
Endpoint: https://sparql.glam-dataset.org/query
Example Queries:
```sparql
# Complex relationship query
PREFIX hc: <https://w3id.org/heritage-custodian/>
PREFIX schema: <http://schema.org/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?inst ?name ?consortium ?cms WHERE {
  ?inst a hc:HeritageCustodian ;
        schema:name ?name ;
        hc:consortium_membership ?consortium ;
        hc:collection_management_system ?cms .
  FILTER(CONTAINS(LCASE(?name), "national"))
  FILTER(?consortium != "")
}
ORDER BY ?consortium ?name
```
Pattern 4: Data Visualization
Who: Journalists, public communicators, educators
What: Interactive visualizations, dashboards
Format: Web interfaces
Frequency: On-demand
Visualization Types:
- Interactive maps of heritage institutions
- Network graphs of institutional relationships
- Timeline views of digitization projects
- Statistical dashboards by country/region
- Comparative charts (standards adoption, digital readiness)
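As a sketch of the standards-adoption chart, the snippet below tallies hypothetical (institution, standard) pairs and renders a text bar chart; a real dashboard would feed the same counts into a charting library instead.

```python
from collections import Counter

# Hypothetical flattened (institution, standard) pairs from the dataset.
adoptions = [
    ("inst-001", "Dublin Core"),
    ("inst-001", "BIBFRAME"),
    ("inst-002", "Dublin Core"),
    ("inst-003", "LIDO"),
]

# Count how many institutions declare each standard.
counts = Counter(std for _, std in adoptions)

# Simple text bar chart; swap for matplotlib/plotly in a real dashboard.
for std, n in counts.most_common():
    print(f"{std:12s} {'#' * n} ({n})")
```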
Pattern 5: Embedded Data
Who: Website builders, CMS administrators
What: Embeddable widgets
Format: JavaScript embeds, iframes
Frequency: Real-time
Examples:
```html
<!-- Embed institution finder widget -->
<div class="glam-finder"
     data-country="NL"
     data-type="archive">
</div>
<script src="https://cdn.glam-dataset.org/widget.js"></script>

<!-- Embed institution profile card -->
<iframe src="https://glam-dataset.org/embed/NL-AsdRM"
        width="400" height="300"></iframe>
```
Application Examples
Application 1: Heritage Discovery Portal
Name: "Global Heritage Explorer"
Features:
- Map-based browsing of institutions
- Advanced search (type, collections, standards)
- Institution profiles with enriched metadata
- Links to digital repositories
- Network visualization of partnerships
Tech Stack:
- Frontend: React + Leaflet maps
- Backend: FastAPI + PostgreSQL
- Data: GLAM dataset (weekly sync)
Application 2: ISIL Code Resolver
Name: "ISIL Lookup Service"
Features:
- REST API for ISIL code resolution
- Returns institution metadata + URLs
- Authority linking (Wikidata, VIAF)
- OpenAPI specification
Example:
```
curl https://isil.glam-dataset.org/resolve/NL-AsdRM
```

```json
{
  "isil": "NL-AsdRM",
  "name": "Rijksmuseum",
  "type": ["museum"],
  "country": "Netherlands",
  "city": "Amsterdam",
  "website": "https://www.rijksmuseum.nl",
  "wikidata": "Q190804",
  "collections_url": "https://www.rijksmuseum.nl/en/search"
}
```
Application 3: Research Data Package
Name: "GLAM Analytics Toolkit"
Features:
- R package: install.packages("glamdata")
- Python package: pip install glam-dataset
- Pre-built analysis functions
- Visualization helpers
- Statistical model templates
Usage:
```r
library(glamdata)

# Load dataset
data <- load_glam_dataset(version = "1.0.0")

# Built-in analysis
coverage_stats <- analyze_coverage(data, by = "country")

# Visualizations
plot_digital_readiness(data, region = "Europe")
plot_standards_adoption(data)
```
Application 4: Heritage Network Graph
Name: "Heritage Connections"
Features:
- Graph database (Neo4j) of institutions
- Relationship visualization
- Community detection algorithms
- Collaboration opportunity finder
Queries:
```cypher
// Find institutions within 2 hops of the Rijksmuseum
MATCH (rijks:Institution {isil: "NL-AsdRM"})-[r*1..2]-(connected)
WHERE connected.country IN ["Netherlands", "Belgium", "Germany"]
RETURN rijks, r, connected
```
Application 5: Monitoring Dashboard
Name: "Digital Heritage Monitor"
Features:
- Track digitization progress over time
- URL health monitoring (link rot detection)
- Standards adoption trends
- Regional comparison charts
- Alert system for dataset updates
Metrics Tracked:
- Percentage with digital repositories
- Metadata standards adoption rates
- Geographic coverage completeness
- Data quality scores
- Growth trends
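A minimal sketch of the link-rot check, using only the standard library: classify_status maps an HTTP status code to a dashboard label, and check_url issues a HEAD request. Note that urllib follows redirects automatically, so the "redirect" label is mostly reached via raised HTTPError; the function names are illustrative.

```python
from urllib import error, request


def classify_status(code: int) -> str:
    # Map an HTTP status code to a simple health label for the dashboard.
    if 200 <= code < 300:
        return "ok"
    if 300 <= code < 400:
        return "redirect"
    return "broken"


def check_url(url: str, timeout: float = 10.0) -> str:
    # Issue a HEAD request; network failures count as broken links.
    req = request.Request(url, method="HEAD")
    try:
        with request.urlopen(req, timeout=timeout) as resp:
            return classify_status(resp.status)
    except error.HTTPError as e:
        return classify_status(e.code)
    except (error.URLError, TimeoutError):
        return "broken"
```

A production monitor would additionally throttle per-host requests and retry with GET, since some servers reject HEAD.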
Value Propositions
For Heritage Professionals
Value: "Discover, connect, and learn from peer institutions worldwide"
- Find collaboration partners
- Identify best practices
- Benchmark digital maturity
- Access to comprehensive institutional directory
For Researchers
Value: "Authoritative, structured data for heritage sector research"
- Ready-to-use research datasets
- Longitudinal tracking capability
- Cross-national comparisons
- Reproducible research support
For Developers
Value: "Structured, linked, API-accessible heritage institution data"
- Build applications faster
- Reliable authority linking
- Standardized formats (RDF, CSV, Parquet)
- Active maintenance and updates
For Funders
Value: "Evidence-based insights for strategic heritage investment"
- Identify funding gaps
- Measure initiative impact
- Prioritize underserved regions
- Track sector development
For the Public
Value: "Find and access cultural heritage collections globally"
- Discover collections of interest
- Plan heritage visits
- Access digital resources
- Support for genealogy and research
Success Metrics (Usage)
Quantitative Metrics
- Downloads: Target 10,000+ unique downloads in first year
- API calls: Target 100,000+ requests/month by year 2
- Citations: Target 50+ research papers citing dataset in 3 years
- Applications: Target 20+ applications built on dataset
- SPARQL queries: Target 1,000+ unique queries/month
Qualitative Metrics
- Featured in heritage sector publications
- Adopted by national heritage authorities
- Integrated into major aggregation platforms (Europeana, DPLA)
- Cited in policy documents
- Community contributions (corrections, additions)
Sustainability Model
Open Data Commitment
- License: CC0 or CC-BY 4.0
- Access: Free, no registration required for bulk downloads
- Formats: Open, widely-supported formats
- Code: Open source on GitHub
Community Governance
- Accept contributions from institutions
- Community review process for additions
- Transparent data quality standards
- Public issue tracker
Funding Options
- Grant funding: Initial development (foundations, government)
- Institutional support: Hosting by heritage organizations
- API tier pricing (future): Free tier + paid high-volume tier
- Value-added services: Custom data packages, consulting
Ethical Considerations
Data Quality & Accuracy
- Clear provenance for all data
- Confidence scores for derived data
- Correction mechanism for errors
- Regular validation against authoritative sources
Representation & Bias
- Acknowledge data gaps in underrepresented regions
- Actively seek data from Global South institutions
- Document extraction methodology limitations
- Community input on missing institutions
Privacy & Consent
- Only public institutional data (no personal data)
- Respect institutional preferences for inclusion
- Opt-out mechanism for institutions
- Clear data retention policies
Decolonization
- Acknowledge colonial heritage in collection descriptions
- Support indigenous heritage institution visibility
- Transparent about data sources and limitations
- Consult with affected communities
Future Expansion
Phase 2 Features
- User accounts for personalized searches
- Saved searches and alerts
- Community ratings and reviews
- Multi-language interface
- Mobile application
Phase 3 Features
- Machine learning for institution classification
- Automatic link rot detection and correction
- Crowdsourced data validation
- Integration with collection-level metadata
- Real-time data streaming APIs
This comprehensive view of consumers and use cases ensures the dataset design serves real-world needs across diverse user communities.