GLAM Project Quick Status - 2025-11-19
✅ Completed Today
1. German Data Unification
- 20,761 institutions (ISIL 16,979 + DDB 4,937 → unified)
- File: data/isil/germany/german_institutions_unified_20251119_181857.json (39.2 MB)
- 82% with ISIL codes, 71.3% geocoded
2. Austrian Data Consolidation
- 4,348 institutions (ISIL 1,928 + Wikidata 4,859 + OSM 627 → deduplicated)
- File: data/isil/austria/austrian_institutions_consolidated_20251119_181541.json (1.78 MB)
- 67.5% geocoded, 62.8% with Wikidata IDs
3. Scripts Created
- scripts/scrapers/harvest_ddb_institutions.py - DDB API harvester
- scripts/scrapers/consolidate_austrian_data.py - Austrian multi-source merger
- scripts/scrapers/crossreference_german_data.py - German ISIL+DDB cross-reference
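The German unification in item 1 combines 16,979 ISIL and 4,937 DDB records (21,916 inputs) into 20,761 institutions, which suggests that roughly 1,155 records were merged as overlaps. A minimal sketch of that kind of key-based merge follows. It assumes records are dicts carrying a hypothetical `isil` field; the function name `unify_by_isil` and the `sources` field are illustrative, and the actual matching logic in crossreference_german_data.py may differ.

```python
from typing import Iterable


def unify_by_isil(isil_records: Iterable[dict], ddb_records: Iterable[dict]) -> list[dict]:
    """Merge two record lists, collapsing entries that share an ISIL code.

    Assumption: every ISIL-registry record has an "isil" key, while DDB
    records may or may not have one.
    """
    unified: dict[str, dict] = {}
    unmatched: list[dict] = []

    # Seed the result with the ISIL registry, keyed by ISIL code.
    for rec in isil_records:
        unified[rec["isil"]] = {**rec, "sources": ["ISIL"]}

    for rec in ddb_records:
        code = rec.get("isil")
        if code and code in unified:
            # Institution present in both sources: keep the ISIL values
            # and only fill in fields the ISIL record lacks.
            merged = unified[code]
            for key, value in rec.items():
                merged.setdefault(key, value)
            merged["sources"].append("DDB")
        elif code:
            # DDB record with an ISIL code the registry did not contain.
            unified[code] = {**rec, "sources": ["DDB"]}
        else:
            # No ISIL code at all: keep as a DDB-only institution.
            unmatched.append({**rec, "sources": ["DDB"]})

    return list(unified.values()) + unmatched
```

Matching on the ISIL code alone is the simplest choice; records without a code would need a fallback such as name and address comparison, which this sketch leaves out.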
📊 Current Data Status
| Country | Institutions | Status |
|---|---|---|
| 🇩🇪 Germany | 20,761 | ✅ Unified |
| 🇨🇿 Czech Republic | 8,694 | ✅ Complete |
| 🇦🇹 Austria | 4,348 | ✅ Consolidated |
| 🇨🇭 Switzerland | 2,379 | ✅ Complete |
| 🇳🇱 Netherlands | ~1,400 | ✅ Complete |
| 🇧🇪 Belgium | 438 | ✅ Complete |
| Total | 37,582 | 38.7% of global target |
🚀 Next Steps
- Denmark ISIL harvest → Complete Phase 1
- Data quality audit → Review 100 random samples
- LinkML conversion → Export to HeritageCustodian schema
- Wikidata enrichment → Add Q-numbers to German institutions
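For the data quality audit step, a reproducible random sample keeps the review repeatable between sessions. The sketch below assumes the unified file is a top-level JSON array of records, which this note does not confirm; the helper names and the seed value are illustrative.

```python
import json
import random


def load_records(path: str) -> list:
    """Load records from a JSON file, assuming a top-level array."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array of records")
    return data


def sample_for_audit(records: list, k: int = 100, seed: int = 20251119) -> list:
    """Return up to k distinct records, chosen reproducibly for a given seed."""
    rng = random.Random(seed)  # local generator, does not touch global state
    return rng.sample(records, min(k, len(records)))
```

Calling `sample_for_audit(load_records(path))` with the same seed returns the same 100 records each time, so reviewers can compare notes on identical samples.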
🔑 Key Files
/data/isil/germany/
└── german_institutions_unified_20251119_181857.json (39.2 MB)
/data/isil/austria/
└── austrian_institutions_consolidated_20251119_181541.json (1.78 MB)
/scripts/scrapers/
├── harvest_ddb_institutions.py
├── consolidate_austrian_data.py
└── crossreference_german_data.py
Last Updated: 2025-11-19T18:30:00Z
Session: DDB Harvest & Unification Complete
Status: Ready for Phase 1 Completion