# Quick Start: Australian Heritage Institution Extraction

**Time Required**: 10-15 minutes
**Expected Output**: 200-500 Australian heritage institutions
**Data Quality**: TIER_1_AUTHORITATIVE (0.95 confidence)

---

## Step 1: Get Trove API Key (5 minutes)

1. Visit: https://trove.nla.gov.au/about/create-something/using-api
2. Click "Sign up for an API key"
3. Fill out form:
   - Name
   - Email
   - Intended use: "Heritage institution research"
4. Check email for API key (arrives immediately)
5. Save key somewhere secure
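
One way to keep the key out of shell history and source files is to store it in an environment variable and read it from there. The sketch below is only an illustration: the variable name `TROVE_API_KEY` and the helper `load_api_key` are assumptions made for this example and are not part of the extraction script.

```python
import os


def load_api_key(env=None, var="TROVE_API_KEY"):
    """Return the API key stored in the environment variable `var`.

    `TROVE_API_KEY` is a hypothetical name chosen for this sketch.
    """
    env = os.environ if env is None else env
    # Strip stray whitespace or newlines picked up when pasting the key.
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(f"Set {var} before running the extraction")
    return key
```

With the variable exported in your shell, the value can then be passed to the script through its `--api-key` option instead of being typed inline.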

---

## Step 2: Run Extraction (2-5 minutes)

```bash
# Adjust this path to your local checkout of the repository
cd /Users/kempersc/apps/glam

python scripts/extract_trove_contributors.py --api-key YOUR_TROVE_API_KEY
```

**What happens**:
- Fetches all Trove contributors (200-500 institutions)
- Retrieves full details (respects 200 req/min rate limit)
- Classifies by GLAMORCUBESFIXPHDNT type
- Generates GHCID identifiers
- Exports to YAML, JSON, CSV
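
To make the 200 requests per minute figure concrete: it works out to at least 60 / 200 = 0.3 seconds between requests. The following is a minimal sketch of a client-side throttle that enforces such a gap. It is not taken from the extraction script; the names `min_interval` and `Throttle` are hypothetical, and the script may pace its requests differently.

```python
import time


def min_interval(requests_per_minute):
    """Smallest gap in seconds between requests that stays within the limit."""
    return 60.0 / requests_per_minute


class Throttle:
    """Sleep just enough to keep consecutive calls `interval` seconds apart."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `Throttle(min_interval(200)).wait()` before each request would keep a loop at or below the stated limit. If the `--delay` option is the pause between consecutive requests (an assumption), a value of 0.5 would correspond to at most 120 requests per minute.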

---

## Step 3: Check Results (1 minute)

```bash
# Count institutions (the line count includes any CSV header row)
wc -l data/instances/trove_contributors_*.csv

# View sample
head -n 50 data/instances/trove_contributors_*.yaml

# Check types
grep "institution_type:" data/instances/trove_contributors_*.yaml | sort | uniq -c
```
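
The same type tally can be sketched in Python against the JSON export. This assumes the JSON file is a top-level list of records that each carry an `institution_type` field, as in the sample record; the actual layout of the export is not confirmed here, and both function names are hypothetical.

```python
import json
from collections import Counter


def load_records(path):
    """Load a JSON export, assumed here to be a top-level list of records."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def count_types(records):
    """Tally records by their `institution_type` code."""
    # Records without the field are grouped under "UNKNOWN" rather than dropped.
    return Counter(r.get("institution_type", "UNKNOWN") for r in records)
```

For example, `count_types(load_records(path)).most_common()` would list the type codes from the most to the least frequent.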

---

## Expected Output

```
data/instances/
├── trove_contributors_20251118_143000.yaml    # Full records
├── trove_contributors_20251118_143000.json    # JSON format
└── trove_contributors_20251118_143000.csv     # Spreadsheet
```
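
If the script is run more than once, several timestamped exports will sit side by side. The sketch below picks the most recent one, assuming the suffix in the example filenames follows a `YYYYMMDD_HHMMSS` pattern (inferred from `20251118_143000`, not documented). The helper names are hypothetical.

```python
import re
from datetime import datetime

# Pattern inferred from the example filenames above (an assumption).
PATTERN = re.compile(r"trove_contributors_(\d{8}_\d{6})\.(yaml|json|csv)$")


def export_timestamp(filename):
    """Parse the YYYYMMDD_HHMMSS stamp from an export filename, or return None."""
    match = PATTERN.search(filename)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")


def latest_export(filenames, extension="yaml"):
    """Return the most recent export with the given extension, or None."""
    candidates = [
        (export_timestamp(name), name)
        for name in filenames
        if name.endswith("." + extension) and export_timestamp(name) is not None
    ]
    return max(candidates)[1] if candidates else None
```

The list of filenames could come from `glob.glob("data/instances/trove_contributors_*")`.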

**Sample Record**:
```yaml
- name: National Library of Australia
  institution_type: L  # Library
  ghcid_current: AU-ACT-CAN-L-NLA
  identifiers:
    - identifier_scheme: NUC
      identifier_value: NLA
    - identifier_scheme: ISIL
      identifier_value: AU-NLA
  homepage: https://www.nla.gov.au
```
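
A quick consistency check on records like this one is to confirm that the type code embedded in the GHCID agrees with `institution_type`. The sketch below assumes a GHCID has five hyphen-separated components, read here as country, region, locality, type, and abbreviation. That reading is inferred from the single sample value `AU-ACT-CAN-L-NLA` and may not hold for every identifier; the field and function names are hypothetical.

```python
from typing import NamedTuple


class Ghcid(NamedTuple):
    # Component names are assumptions based on the sample value.
    country: str
    region: str
    locality: str
    institution_type: str
    abbreviation: str


def parse_ghcid(value):
    """Split a GHCID into its assumed five components."""
    parts = value.split("-")
    if len(parts) != 5 or not all(parts):
        raise ValueError(f"Unexpected GHCID shape: {value!r}")
    return Ghcid(*parts)


def type_matches(record):
    """True if the GHCID type component equals the record's institution_type."""
    ghcid = parse_ghcid(record["ghcid_current"])
    return ghcid.institution_type == record["institution_type"]
```

Records for which `type_matches` returns `False`, or for which `parse_ghcid` raises, are worth inspecting by hand before relying on the export.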

---

## Troubleshooting

**"API key required"**: Get key from https://trove.nla.gov.au/
**Rate limit errors**: Add `--delay 0.5` to slow down requests
**No results**: Check internet connection and Trove status

---

## Advanced Options

```bash
# Custom output directory
python scripts/extract_trove_contributors.py \
  --api-key YOUR_KEY \
  --output-dir data/australia

# Slower rate (safer)
python scripts/extract_trove_contributors.py \
  --api-key YOUR_KEY \
  --delay 0.5

# YAML only
python scripts/extract_trove_contributors.py \
  --api-key YOUR_KEY \
  --formats yaml
```
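
When the extraction is launched from another Python program, the options above can be assembled into an argument list. This is a minimal sketch using only the flags shown in this guide; `build_command` is a hypothetical helper, and the way each flag is interpreted is left to the script itself.

```python
import sys

SCRIPT = "scripts/extract_trove_contributors.py"


def build_command(api_key, output_dir=None, delay=None, formats=None):
    """Build the argument list for the extraction script.

    Only options that are explicitly given are added to the command.
    """
    cmd = [sys.executable, SCRIPT, "--api-key", api_key]
    if output_dir is not None:
        cmd += ["--output-dir", str(output_dir)]
    if delay is not None:
        cmd += ["--delay", str(delay)]
    if formats is not None:
        cmd += ["--formats", formats]
    return cmd
```

The result can be handed to `subprocess.run(cmd, check=True)` from the repository root. Passing a list rather than a shell string avoids quoting problems with the key.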

---

## Documentation

- **Full Guide**: `docs/AUSTRALIA_TROVE_EXTRACTION.md`
- **Session Summary**: `SESSION_SUMMARY_20251118_AUSTRALIA_TROVE.md`
- **Next Steps**: `NEXT_STEPS.md`

---

**Status**: ✅ Ready to run
**Your Action**: Get API key, then run the script