# Global GLAM Dataset: Consumers & Use Cases

## Overview
This document describes the intended consumers, use cases, and applications for the Global GLAM Dataset. Understanding who will use the data and how they will use it informs design decisions throughout the project.

## Primary Consumer Segments

### 1. Heritage Sector Professionals

#### 1.1 Archivists & Records Managers
**Needs**:
- Discover similar institutions for collaboration
- Identify common preservation standards
- Find institutions using specific collection management systems
- Map archival networks and consortia

**Use Cases**:
```sparql
# Find all archives using the RiC-O ontology
PREFIX hc: <https://w3id.org/heritage-custodian/>
PREFIX schema: <http://schema.org/>

SELECT ?archive ?name ?url WHERE {
  ?archive a hc:HeritageCustodian ;
           hc:institution_type "archive" ;
           schema:name ?name ;
           hc:metadata_standard "RiC-O" ;
           schema:url ?url .
}
```

**Query Examples**:
- "Which archives in Southeast Asia have digital repositories?"
- "What collection management systems are used by national archives?"
- "Find archives participating in Archives Portal Europe"

#### 1.2 Librarians & Information Professionals
**Needs**:
- Identify libraries by ISIL code
- Discover special collections
- Map interlibrary loan networks
- Find institutions using specific discovery systems

**Use Cases**:
```python
# Find libraries with specific metadata standards
import duckdb

conn = duckdb.connect('glam_dataset.duckdb')
# institution_types and metadata_standards are assumed to be LIST columns
libraries = conn.execute("""
    SELECT name, country, website_url
    FROM institutions
    WHERE list_contains(institution_types, 'library')
      AND list_contains(metadata_standards, 'BIBFRAME')
    ORDER BY country, name
""").fetchall()
```

**Query Examples**:
- "Which libraries in Africa have digitized collections?"
- "Find research libraries using Ex Libris Alma"
- "Map libraries with IIIF image servers"

#### 1.3 Museum Professionals
**Needs**:
- Discover museums by collection type
- Find institutions using LIDO/SPECTRUM
- Identify museum networks and consortia
- Locate museums with 3D digitization projects

**Use Cases**:
```python
# Museums with natural history collections
# (`dataset` is assumed to be an already loaded GLAM dataset object)
museums = dataset.query(
    institution_type="museum",
    collection_subjects__contains="natural history",
    has_digital_repository=True
)

for museum in museums:
    print(f"{museum.name} ({museum.country}): {museum.website_url}")
```

### 2. Researchers & Academics

#### 2.1 Digital Humanities Researchers
**Needs**:
- Analyze global digitization patterns
- Study metadata standards adoption
- Research cultural heritage policies
- Map institutional networks

**Use Cases**:
- **Network Analysis**: Build graphs of institutional relationships
- **Temporal Analysis**: Track digitization project growth over time
- **Geographic Analysis**: Map heritage infrastructure by region
- **Standards Analysis**: Study metadata standard adoption rates

**Example Research Questions**:
- "How has museum digitization evolved across continents?"
- "What factors predict digital repository adoption?"
- "Which countries lead in heritage sector linked data adoption?"

#### 2.2 Library & Information Science Scholars
**Needs**:
- Study collection management practices globally
- Analyze institutional collaboration patterns
- Research digital preservation strategies
- Compare national approaches to heritage documentation

**Derived Datasets**:
```python
# Generate dataset for research
# (`dataset` is assumed to be an already loaded GLAM dataset object)
df = dataset.to_dataframe()

# Analysis: CMS adoption by institution type
cms_by_type = (
    df.groupby(['institution_type', 'collection_management_system'])
      .size()
      .unstack(fill_value=0)
)

# Export for statistical analysis (without the pandas row index)
df.to_csv('glam_research_dataset.csv', index=False)
df.to_stata('glam_research_dataset.dta', write_index=False)  # For Stata users
```

#### 2.3 Heritage Studies Researchers
**Needs**:
- Map colonial heritage institutions
- Study repatriation networks
- Analyze indigenous heritage representation
- Research post-conflict heritage reconstruction

**Use Cases**:
- Identify institutions holding colonial-era collections
- Map indigenous cultural heritage repositories
- Track repatriation initiatives
- Study war-affected heritage institutions

### 3. Software Developers & Data Engineers

#### 3.1 Heritage Platform Developers
**Needs**:
- Institution lookup APIs
- ISIL code resolution
- Geographic search capabilities
- Aggregation endpoints for portals

**Integration Examples**:
```python
# REST API integration (future)
import requests

# Find institutions near coordinates
response = requests.get(
    'https://api.glam-dataset.org/institutions',
    params={
        'lat': 52.3676,
        'lon': 4.9041,
        'radius': 50,  # km
        'type': 'museum',
    },
    timeout=30,
)
response.raise_for_status()  # Fail loudly on HTTP errors

institutions = response.json()
```
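
Until such an endpoint exists, a similar radius search could be approximated client-side against a bulk download. The following is a minimal sketch, assuming each record carries `lat` and `lon` fields in decimal degrees; the field names and the sample records are illustrative and not part of the dataset schema. It uses the haversine great-circle formula.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(records, lat, lon, radius_km):
    """Keep records whose (lat, lon) lies within radius_km of the given point."""
    return [r for r in records if haversine_km(lat, lon, r["lat"], r["lon"]) <= radius_km]


# Illustrative records only (hypothetical institutions)
sample = [
    {"name": "Example museum in Haarlem", "lat": 52.3874, "lon": 4.6462},
    {"name": "Example museum in Rotterdam", "lat": 51.9244, "lon": 4.4777},
]
nearby = within_radius(sample, 52.3676, 4.9041, 50)
print([r["name"] for r in nearby])
```

A linear scan like this is adequate for a few thousand records; a server-side implementation would more likely rely on a spatial index.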

**Applications**:
- Heritage aggregation portals (like Europeana)
- Collection discovery platforms
- Digital preservation tools
- Institutional registries

#### 3.2 Linked Data Application Developers
**Needs**:
- SPARQL endpoint access
- RDF data dumps
- Authority linking (Wikidata, VIAF)
- Schema.org compatible data

**Integration Examples**:
```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Query SPARQL endpoint
sparql = SPARQLWrapper("https://sparql.glam-dataset.org/query")
sparql.setReturnFormat(JSON)  # convert() then returns a Python dict
sparql.setQuery("""
    PREFIX hc: <https://w3id.org/heritage-custodian/>
    PREFIX schema: <http://schema.org/>

    SELECT ?name ?wikidata WHERE {
      ?inst a hc:HeritageCustodian ;
            schema:name ?name ;
            hc:wikidata_id ?wikidata ;
            schema:address/schema:addressCountry "NL" .
    }
""")

results = sparql.query().convert()
```
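
Downstream code usually wants plain rows, not the nested result document. The sketch below assumes the endpoint returns the standard SPARQL 1.1 JSON results shape (`head.vars` plus `results.bindings`); `flatten_bindings` is a hypothetical helper name, and the sample data is hand-written for illustration.

```python
def flatten_bindings(results):
    """Turn SPARQL JSON results into a list of {variable: value} dicts."""
    variables = results["head"]["vars"]
    rows = []
    for binding in results["results"]["bindings"]:
        # Unbound variables are simply absent from a binding; map them to None
        rows.append({var: binding.get(var, {}).get("value") for var in variables})
    return rows


# Hand-written sample in the standard result shape (illustrative values)
sample_results = {
    "head": {"vars": ["name", "wikidata"]},
    "results": {"bindings": [
        {"name": {"type": "literal", "value": "Rijksmuseum"},
         "wikidata": {"type": "literal", "value": "Q190804"}},
        {"name": {"type": "literal", "value": "Example Archive"}},
    ]},
}
rows = flatten_bindings(sample_results)
print(rows)
```

The second sample binding omits `wikidata` only to show how unbound variables are handled; this matters for queries that use `OPTIONAL`.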

### 4. Heritage Funding Organizations

#### 4.1 Grant-Making Foundations
**Needs**:
- Identify underserved regions
- Assess digitization infrastructure gaps
- Find institutions for partnership programs
- Measure impact of funded initiatives

**Analysis Examples**:
```python
# Identify regions with low digitization rates
# (`analyze_coverage`, `filter`, and `to_report` illustrate a planned API)
analysis = dataset.analyze_coverage()

underserved = analysis.filter(
    digital_repository_rate__lt=0.3,  # <30% have repositories
    population__gt=1_000_000          # Significant population
)

# Generate report for funding priorities
report = underserved.to_report(
    metrics=['institution_count', 'digital_readiness_score'],
    format='pdf'
)
```
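
The same selection can be expressed without any dataset-specific API. This is a minimal sketch in plain Python, assuming per-region aggregates with `institution_count`, `with_repository`, and `population` fields; the field names, the function name, and the figures are all made up for illustration.

```python
def underserved_regions(regions, max_rate=0.3, min_population=1_000_000):
    """Select regions with a low digital repository rate and a large population."""
    selected = []
    for region in regions:
        if region["institution_count"] == 0:
            # Rate is undefined without recorded institutions; review these separately
            continue
        rate = region["with_repository"] / region["institution_count"]
        if rate < max_rate and region["population"] > min_population:
            selected.append({**region, "digital_repository_rate": rate})
    # Lowest rate first
    return sorted(selected, key=lambda r: r["digital_repository_rate"])


# Made-up figures for illustration
sample = [
    {"region": "Region A", "institution_count": 40, "with_repository": 8, "population": 5_000_000},
    {"region": "Region B", "institution_count": 50, "with_repository": 30, "population": 9_000_000},
    {"region": "Region C", "institution_count": 10, "with_repository": 1, "population": 400_000},
]
result = underserved_regions(sample)
print([r["region"] for r in result])
```

Regions with no recorded institutions are skipped here instead of being treated as a rate of zero, since an empty record may reflect a data gap and not an absence of institutions.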

#### 4.2 International Development Organizations
**Needs**:
- Map heritage infrastructure in developing regions
- Identify capacity-building opportunities
- Track sustainable development goal alignment
- Assess disaster risk to cultural heritage

**Use Cases**:
- UNESCO heritage site documentation
- Cultural heritage emergency preparedness
- Digital literacy program planning
- Infrastructure investment prioritization

### 5. Government & Policy Makers

#### 5.1 National Heritage Authorities
**Needs**:
- National heritage institution inventories
- Compliance monitoring (standards, regulations)
- Strategic planning data
- International comparison benchmarks

**Reports Generated**:
- National heritage infrastructure status reports
- Digital transformation progress tracking
- Compliance dashboards (GDPR, accessibility, preservation standards)
- Budget allocation recommendations

#### 5.2 Cultural Policy Researchers
**Needs**:
- Cross-national policy comparison
- Impact assessment of heritage initiatives
- Evidence for policy recommendations
- Public access statistics

**Analysis Examples**:
```r
# R analysis for policy research
library(tidyverse)

glam_data <- read_csv("glam_dataset.csv")

# Compare digital access policies by country
digital_access <- glam_data %>%
  group_by(country) %>%
  summarise(
    open_access_rate = mean(access_type == "open", na.rm = TRUE),
    avg_digitization_year = mean(digitization_start_year, na.rm = TRUE),
    linked_data_adoption = mean(linked_data_participation, na.rm = TRUE)
  )

# Correlation with cultural policy funding
# policy_funding: external data frame (country, cultural_budget), loaded separately
funding_access <- inner_join(digital_access, policy_funding, by = "country")
cor.test(funding_access$open_access_rate, funding_access$cultural_budget)
```

### 6. Public & Community Users

#### 6.1 Genealogists & Family Historians
**Needs**:
- Find archives with vital records
- Locate regional historical societies
- Identify digitized collections by location
- Access information for remote research

**Discovery Interface**:
```
Map-based search:
- Select region
- Filter by: "genealogy", "vital records", "local history"
- View: institutions with online access
- Links: Direct to collection catalogs
```

#### 6.2 Cultural Heritage Enthusiasts
**Needs**:
- Discover museums by interest area
- Plan cultural heritage tourism
- Find specialized collections
- Access digital exhibitions

**Use Cases**:
- "Find maritime museums in Scandinavia"
- "Museums with Egyptian antiquities worldwide"
- "Archives with medieval manuscripts"
- "Libraries with rare book collections"

#### 6.3 Citizen Scholars & Local Historians
**Needs**:
- Community archive directories
- Local historical society contacts
- Volunteer digitization opportunities
- Collection donation information

## Consumption Patterns

### Pattern 1: Bulk Dataset Download
**Who**: Researchers, data analysts
**What**: Complete dataset dumps
**Format**: CSV, Parquet, RDF/Turtle dumps
**Frequency**: Quarterly releases

**Distribution**:
```
https://glam-dataset.org/downloads/
├── glam-dataset-v1.0.0-full.zip
│   ├── institutions.csv
│   ├── institutions.parquet
│   ├── institutions.ttl
│   ├── collections.csv
│   ├── digital_platforms.csv
│   └── README.md
├── glam-dataset-v1.0.0-by-country.zip
│   ├── netherlands.csv
│   ├── brazil.csv
│   └── ...
└── glam-dataset-v1.0.0-rdf.tar.gz
    └── data/*.ttl
```
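
A consumer could read a single CSV straight out of a release archive without unpacking it. The sketch below uses only the Python standard library and assumes the archive is UTF-8 encoded with a header row; the `name` and `country` columns and the `read_institutions` helper are illustrative, and an in-memory archive stands in for the real download.

```python
import csv
import io
import zipfile


def read_institutions(archive, member="institutions.csv"):
    """Read one CSV member of a release archive into a list of dicts."""
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as raw:
            # zf.open yields bytes; wrap it so csv sees decoded text
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


# Build a tiny in-memory stand-in for the release archive
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("institutions.csv", "name,country\nRijksmuseum,NL\n")
buffer.seek(0)

rows = read_institutions(buffer)
print(rows)
```

For analysis at scale, the Parquet file in the same archive is likely the better entry point, since it preserves column types.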

### Pattern 2: API Access
**Who**: Application developers, integrators
**What**: RESTful API queries
**Format**: JSON, JSON-LD
**Frequency**: Real-time

**API Endpoints** (future):
```
GET /api/v1/institutions
  ?country=NL
  &type=museum
  &has_digital_repository=true
  &limit=50

GET /api/v1/institutions/{id}

GET /api/v1/institutions/search
  ?q=maritime
  &lat=52.36&lon=4.88&radius=25

POST /api/v1/institutions/batch
  {"ids": ["inst-001", "inst-002", ...]}
```
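
Client code might assemble these query strings with a small helper so that values are encoded consistently. This is a sketch only: the API is not yet available, the host is borrowed from the developer integration example in this document, and `institutions_url` is a hypothetical function name.

```python
from urllib.parse import urlencode

# Assumed base URL; the API is planned, not live
BASE_URL = "https://api.glam-dataset.org"


def institutions_url(**filters):
    """Build a GET /api/v1/institutions URL from keyword filters."""
    params = {}
    for key, value in filters.items():
        if value is None:
            continue  # skip filters that were not set
        # Render booleans as lowercase, matching has_digital_repository=true
        params[key] = str(value).lower() if isinstance(value, bool) else value
    return f"{BASE_URL}/api/v1/institutions?{urlencode(params)}"


url = institutions_url(country="NL", type="museum", has_digital_repository=True, limit=50)
print(url)
```

`urlencode` also percent-encodes free-text values, which matters once a search term such as `q` contains spaces or non-ASCII characters.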

### Pattern 3: SPARQL Queries
**Who**: Linked data developers, semantic web researchers
**What**: SPARQL endpoint access
**Format**: RDF results (XML, JSON, Turtle)
**Frequency**: Real-time

**Endpoint**: `https://sparql.glam-dataset.org/query`

**Example Queries**:
```sparql
# Complex relationship query
PREFIX hc: <https://w3id.org/heritage-custodian/>
PREFIX schema: <http://schema.org/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?inst ?name ?consortium ?cms WHERE {
  ?inst a hc:HeritageCustodian ;
        schema:name ?name ;
        hc:consortium_membership ?consortium ;
        hc:collection_management_system ?cms .

  FILTER(CONTAINS(LCASE(?name), "national"))
  FILTER(?consortium != "")
}
ORDER BY ?consortium ?name
```

### Pattern 4: Data Visualization
**Who**: Journalists, public communicators, educators
**What**: Interactive visualizations, dashboards
**Format**: Web interfaces
**Frequency**: On-demand

**Visualization Types**:
- Interactive maps of heritage institutions
- Network graphs of institutional relationships
- Timeline views of digitization projects
- Statistical dashboards by country/region
- Comparative charts (standards adoption, digital readiness)

### Pattern 5: Embedded Data
**Who**: Website builders, CMS administrators
**What**: Embeddable widgets
**Format**: JavaScript embeds, iframes
**Frequency**: Real-time

**Examples**:
```html
<!-- Embed institution finder widget -->
<div class="glam-finder"
     data-country="NL"
     data-type="archive">
</div>
<script src="https://cdn.glam-dataset.org/widget.js"></script>

<!-- Embed institution profile card -->
<iframe src="https://glam-dataset.org/embed/NL-AsdRM"
        width="400" height="300"></iframe>
```

## Application Examples

### Application 1: Heritage Discovery Portal

**Name**: "Global Heritage Explorer"

**Features**:
- Map-based browsing of institutions
- Advanced search (type, collections, standards)
- Institution profiles with enriched metadata
- Links to digital repositories
- Network visualization of partnerships

**Tech Stack**:
- Frontend: React + Leaflet maps
- Backend: FastAPI + PostgreSQL
- Data: GLAM dataset (weekly sync)

### Application 2: ISIL Code Resolver

**Name**: "ISIL Lookup Service"

**Features**:
- REST API for ISIL code resolution
- Returns institution metadata + URLs
- Authority linking (Wikidata, VIAF)
- OpenAPI specification

**Example**:
```bash
curl https://isil.glam-dataset.org/resolve/NL-AsdRM

{
  "isil": "NL-AsdRM",
  "name": "Rijksmuseum",
  "type": ["museum"],
  "country": "Netherlands",
  "city": "Amsterdam",
  "website": "https://www.rijksmuseum.nl",
  "wikidata": "Q190804",
  "collections_url": "https://www.rijksmuseum.nl/en/search"
}
```
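
Before resolving a code, a service like this would probably check that the input is ISIL-shaped. The sketch below encodes a simplified reading of the ISIL structure (a short prefix, a hyphen, then a unit identifier of up to 11 characters drawn from letters, digits, `:`, `/`, and `-`); it is an approximation to be checked against ISO 15511, not a full validator, and `split_isil` is a hypothetical helper.

```python
import re

# Simplified shape: 1-4 character prefix, hyphen, unit identifier of 1-11 characters
ISIL_PATTERN = re.compile(r"^([A-Za-z0-9]{1,4})-([A-Za-z0-9:/-]{1,11})$")


def split_isil(code):
    """Return (prefix, unit identifier), or None if the code is not ISIL-shaped."""
    match = ISIL_PATTERN.match(code.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


print(split_isil("NL-AsdRM"))
print(split_isil("not an isil"))
```

A shape check only rejects obviously malformed input; whether a code is actually assigned still has to be looked up in the dataset.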

### Application 3: Research Data Package

**Name**: "GLAM Analytics Toolkit"

**Features**:
- R package: `install.packages("glamdata")`
- Python package: `pip install glam-dataset`
- Pre-built analysis functions
- Visualization helpers
- Statistical model templates

**Usage**:
```r
library(glamdata)

# Load dataset
data <- load_glam_dataset(version = "1.0.0")

# Built-in analysis
coverage_stats <- analyze_coverage(data, by = "country")

# Visualizations
plot_digital_readiness(data, region = "Europe")
plot_standards_adoption(data)
```

### Application 4: Heritage Network Graph

**Name**: "Heritage Connections"

**Features**:
- Graph database (Neo4j) of institutions
- Relationship visualization
- Community detection algorithms
- Collaboration opportunity finder

**Queries**:
```cypher
// Find institutions within 2 hops of the Rijksmuseum
MATCH (rijks:Institution {isil: "NL-AsdRM"})-[r*1..2]-(connected)
WHERE connected.country IN ["Netherlands", "Belgium", "Germany"]
RETURN rijks, r, connected
```
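
The "within two hops" idea does not depend on a graph database. As a rough sketch, a breadth-first search over adjacency lists gives a comparable neighbourhood; the graph below is a toy example, every identifier other than `NL-AsdRM` is made up, and the country filter from the Cypher query is left out for brevity.

```python
from collections import deque


def within_hops(graph, start, max_hops=2):
    """Return {node: hop distance} for nodes reachable within max_hops of start."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    del distances[start]  # report only the connected institutions
    return distances


# Undirected toy graph as adjacency lists (hypothetical relationships)
graph = {
    "NL-AsdRM": ["inst-001", "inst-002"],
    "inst-001": ["NL-AsdRM", "inst-003"],
    "inst-002": ["NL-AsdRM"],
    "inst-003": ["inst-001", "inst-004"],
    "inst-004": ["inst-003"],
}
neighbourhood = within_hops(graph, "NL-AsdRM")
print(neighbourhood)
```

Unlike the Cypher pattern, which returns every matching path, this returns each institution once with its shortest hop distance.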

### Application 5: Monitoring Dashboard

**Name**: "Digital Heritage Monitor"

**Features**:
- Track digitization progress over time
- URL health monitoring (link rot detection)
- Standards adoption trends
- Regional comparison charts
- Alert system for dataset updates

**Metrics Tracked**:
- Percentage of institutions with digital repositories
- Metadata standards adoption rates
- Geographic coverage completeness
- Data quality scores
- Growth trends

## Value Propositions

### For Heritage Professionals
**Value**: "Discover, connect, and learn from peer institutions worldwide"
- Find collaboration partners
- Identify best practices
- Benchmark digital maturity
- Access a comprehensive institutional directory

### For Researchers
**Value**: "Authoritative, structured data for heritage sector research"
- Ready-to-use research datasets
- Longitudinal tracking capability
- Cross-national comparisons
- Reproducible research support

### For Developers
**Value**: "Structured, linked, API-accessible heritage institution data"
- Build applications faster
- Reliable authority linking
- Standardized formats (RDF, CSV, Parquet)
- Active maintenance and updates

### For Funders
**Value**: "Evidence-based insights for strategic heritage investment"
- Identify funding gaps
- Measure initiative impact
- Prioritize underserved regions
- Track sector development

### For the Public
**Value**: "Find and access cultural heritage collections globally"
- Discover collections of interest
- Plan heritage visits
- Access digital resources
- Support genealogy and research

## Success Metrics (Usage)

### Quantitative Metrics
- **Downloads**: Target 10,000+ unique downloads in first year
- **API calls**: Target 100,000+ requests/month by year 2
- **Citations**: Target 50+ research papers citing the dataset within 3 years
- **Applications**: Target 20+ applications built on the dataset
- **SPARQL queries**: Target 1,000+ unique queries/month

### Qualitative Metrics
- Featured in heritage sector publications
- Adopted by national heritage authorities
- Integrated into major aggregation platforms (Europeana, DPLA)
- Cited in policy documents
- Community contributions (corrections, additions)

## Sustainability Model

### Open Data Commitment
- **License**: CC0 or CC BY 4.0
- **Access**: Free, no registration required for bulk downloads
- **Formats**: Open, widely supported formats
- **Code**: Open source on GitHub

### Community Governance
- Accept contributions from institutions
- Community review process for additions
- Transparent data quality standards
- Public issue tracker

### Funding Options
- **Grant funding**: Initial development (foundations, government)
- **Institutional support**: Hosting by heritage organizations
- **API tier pricing** (future): Free tier + paid high-volume tier
- **Value-added services**: Custom data packages, consulting

## Ethical Considerations

### Data Quality & Accuracy
- Clear provenance for all data
- Confidence scores for derived data
- Correction mechanism for errors
- Regular validation against authoritative sources

### Representation & Bias
- Acknowledge data gaps in underrepresented regions
- Actively seek data from Global South institutions
- Document extraction methodology limitations
- Invite community input on missing institutions

### Privacy & Consent
- Include only public institutional data (no personal data)
- Respect institutional preferences for inclusion
- Provide an opt-out mechanism for institutions
- Publish clear data retention policies

### Decolonization
- Acknowledge colonial heritage in collection descriptions
- Support indigenous heritage institution visibility
- Be transparent about data sources and limitations
- Consult with affected communities

## Future Expansion

### Phase 2 Features
- User accounts for personalized searches
- Saved searches and alerts
- Community ratings and reviews
- Multi-language interface
- Mobile application

### Phase 3 Features
- Machine learning for institution classification
- Automatic link rot detection and correction
- Crowdsourced data validation
- Integration with collection-level metadata
- Real-time data streaming APIs

Together, these consumer segments, consumption patterns, and example applications describe the real-world needs that the dataset design is meant to serve across diverse user communities.