Quick Start: Australian Heritage Institution Extraction
Time Required: 10-15 minutes
Expected Output: 200-500 Australian heritage institutions
Data Quality: TIER_1_AUTHORITATIVE (0.95 confidence)
Step 1: Get Trove API Key (5 minutes)
- Visit: https://trove.nla.gov.au/about/create-something/using-api
- Click "Sign up for an API key"
- Fill out form:
  - Name
  - Intended use: "Heritage institution research"
- Check email for API key (arrives immediately)
- Save key somewhere secure
Step 2: Run Extraction (2-5 minutes)
cd /Users/kempersc/apps/glam
python scripts/extract_trove_contributors.py --api-key YOUR_TROVE_API_KEY
What happens:
- Fetches all Trove contributors (200-500 institutions)
- Retrieves full details (respects 200 req/min rate limit)
- Classifies by GLAMORCUBESFIXPHDNT type
- Generates GHCID identifiers
- Exports to YAML, JSON, CSV
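The 200 req/min limit above works out to a pause of at least 0.3 seconds between requests. The following is a minimal sketch of that arithmetic and of one way a client could enforce it. The names min_delay_seconds and Throttle are hypothetical and are not taken from extract_trove_contributors.py, whose actual implementation may differ.

```python
import time


def min_delay_seconds(requests_per_minute: int) -> float:
    """Smallest pause between requests that stays within the given limit."""
    return 60.0 / requests_per_minute


class Throttle:
    """Hypothetical helper: keeps successive calls at least `delay` seconds apart."""

    def __init__(self, delay: float, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock      # injectable so the logic can be tested without waiting
        self._sleep = sleep
        self._last = None        # time of the previous call; None before the first

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return seconds slept."""
        now = self._clock()
        pause = 0.0
        if self._last is not None:
            pause = max(0.0, self.delay - (now - self._last))
            if pause > 0:
                self._sleep(pause)
        self._last = now + pause
        return pause
```

Calling Throttle(min_delay_seconds(200)).wait() before each request would keep a loop at or under the limit; a larger delay such as 0.5 seconds leaves more headroom (at most 120 requests per minute).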
Step 3: Check Results (1 minute)
# Count institutions (the line count includes the CSV header row, if there is one)
wc -l data/instances/trove_contributors_*.csv
# View sample
head -n 50 data/instances/trove_contributors_*.yaml
# Check types
grep "institution_type:" data/instances/trove_contributors_*.yaml | sort | uniq -c
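The same checks can be done from Python, which avoids counting a header row as an institution. This is a sketch only: it assumes the CSV export has a header row and a column named institution_type (matching the YAML field shown in this guide), neither of which is confirmed here. The function name count_types is hypothetical.

```python
import csv
from collections import Counter
from pathlib import Path


def count_types(csv_path, column="institution_type"):
    """Count rows per institution type; the column name is an assumption."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)  # treats the first row as the header
        return Counter(row[column] for row in reader)


def summarize(directory="data/instances"):
    """Print the record total and per-type counts for each exported CSV."""
    for path in sorted(Path(directory).glob("trove_contributors_*.csv")):
        counts = count_types(path)
        print(path.name, sum(counts.values()), dict(counts))
```

Running summarize() from the repository root prints one line per export file. If the column is named differently, count_types raises KeyError, which is a quick signal to check the header.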
Expected Output
data/instances/
├── trove_contributors_20251118_143000.yaml # Full records
├── trove_contributors_20251118_143000.json # JSON format
└── trove_contributors_20251118_143000.csv # Spreadsheet
Sample Record:
- name: National Library of Australia
  institution_type: L # Library
  ghcid_current: AU-ACT-CAN-L-NLA
  identifiers:
    - identifier_scheme: NUC
      identifier_value: NLA
    - identifier_scheme: ISIL
      identifier_value: AU-NLA
  homepage: https://www.nla.gov.au
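The sample GHCID, AU-ACT-CAN-L-NLA, reads as five hyphen-separated parts. The sketch below splits an identifier on that basis. The field names (country, region, city, institution type, abbreviation) are an interpretation of this one sample, not a statement of the GHCID specification, and split_ghcid is a hypothetical helper.

```python
from typing import NamedTuple


class GhcidParts(NamedTuple):
    """Field names are inferred from the sample record, not from a spec."""
    country: str
    region: str
    city: str
    institution_type: str
    abbreviation: str


def split_ghcid(ghcid: str) -> GhcidParts:
    """Split a GHCID on hyphens, assuming exactly five components."""
    parts = ghcid.split("-")
    if len(parts) != 5:
        raise ValueError(
            f"expected 5 hyphen-separated parts, got {len(parts)}: {ghcid!r}"
        )
    return GhcidParts(*parts)
```

For the sample record, split_ghcid("AU-ACT-CAN-L-NLA").institution_type gives "L", which matches the institution_type field. Identifiers with a different number of parts are rejected rather than guessed at.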
Troubleshooting
"API key required": Get a key as described in Step 1 (https://trove.nla.gov.au/about/create-something/using-api) and pass it with --api-key
Rate limit errors: Add --delay 0.5 to slow down requests
No results: Check internet connection and Trove status
Advanced Options
# Custom output directory
python scripts/extract_trove_contributors.py \
--api-key YOUR_KEY \
--output-dir data/australia
# Slower rate (safer)
python scripts/extract_trove_contributors.py \
--api-key YOUR_KEY \
--delay 0.5
# YAML only
python scripts/extract_trove_contributors.py \
--api-key YOUR_KEY \
--formats yaml
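To keep the key out of shell history, the command can be assembled in Python with the key read from an environment variable. This is a sketch under assumptions: the variable name TROVE_API_KEY and the helpers build_command and run_from_env are hypothetical, and it assumes the options shown above can be combined in one invocation, which this guide does not state.

```python
import os
import subprocess
import sys

SCRIPT = "scripts/extract_trove_contributors.py"


def build_command(api_key, output_dir=None, delay=None, formats=None):
    """Assemble the argument list using only the flags shown in this guide."""
    cmd = [sys.executable, SCRIPT, "--api-key", api_key]
    if output_dir is not None:
        cmd += ["--output-dir", output_dir]
    if delay is not None:
        cmd += ["--delay", str(delay)]
    if formats is not None:
        cmd += ["--formats", formats]
    return cmd


def run_from_env(**options):
    """Run the extraction with the key taken from TROVE_API_KEY (assumed name)."""
    key = os.environ["TROVE_API_KEY"]
    subprocess.run(build_command(key, **options), check=True)
```

For example, run_from_env(delay=0.5, formats="yaml") run from the repository root corresponds to the slower, YAML-only invocations above.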
Documentation
- Full Guide: docs/AUSTRALIA_TROVE_EXTRACTION.md
- Session Summary: SESSION_SUMMARY_20251118_AUSTRALIA_TROVE.md
- Next Steps: NEXT_STEPS.md
Status: ✅ Ready to run
Your Action: Get API key, then run the script