glam/QUICK_STATUS_20251119.md
2025-11-19 23:25:22 +01:00

# GLAM Project Quick Status - 2025-11-19
## ✅ Completed Today
### 1. German Data Unification
- **20,761 institutions** (ISIL 16,979 + DDB 4,937 → unified)
- File: `data/isil/germany/german_institutions_unified_20251119_181857.json` (39.2 MB)
- 82% with ISIL codes, 71.3% geocoded
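
The ISIL + DDB unification can be pictured as a join on the ISIL code. The following is a minimal sketch under stated assumptions, not the actual logic of `crossreference_german_data.py`: the record fields (`isil`, `name`, `lat`) and the `sources` list are hypothetical names chosen for illustration.

```python
def unify(isil_records, ddb_records):
    """Merge DDB records into ISIL records, matching on the ISIL code.

    Field names are illustrative assumptions, not the project's schema.
    """
    # Index the ISIL registry by code; copy each record so inputs stay untouched.
    unified = {rec["isil"]: dict(rec, sources=["isil"]) for rec in isil_records}
    unmatched = []

    for rec in ddb_records:
        code = rec.get("isil")
        if code and code in unified:
            target = unified[code]
            # ISIL data wins; DDB only fills fields that are missing or null.
            for key, value in rec.items():
                if target.get(key) is None:
                    target[key] = value
            target["sources"].append("ddb")
        else:
            # DDB-only institutions are kept as separate entries.
            unmatched.append(dict(rec, sources=["ddb"]))

    return list(unified.values()) + unmatched


isil = [{"isil": "DE-1", "name": "Staatsbibliothek", "lat": None}]
ddb = [
    {"isil": "DE-1", "name": "Staatsbibliothek (DDB)", "lat": 52.5},
    {"isil": None, "name": "Heimatmuseum"},
]
merged = unify(isil, ddb)
```

In this sketch a matched pair collapses into one record, which is why a unified count can be lower than the sum of both sources, and DDB-only entries remain without an ISIL code.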
### 2. Austrian Data Consolidation
- **4,348 institutions** (ISIL 1,928 + Wikidata 4,859 + OSM 627 → deduplicated)
- File: `data/isil/austria/austrian_institutions_consolidated_20251119_181541.json` (1.78 MB)
- 67.5% geocoded, 62.8% with Wikidata IDs
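
Coverage figures such as the geocoded and Wikidata percentages can be recomputed from a consolidated file. This is a hedged sketch: the field names `latitude`, `longitude`, and `wikidata_id` are assumptions and may differ from the keys the consolidated JSON actually uses.

```python
def coverage(records, *fields):
    """Return the percentage of records in which every given field is non-null."""
    if not records:
        return 0.0
    hits = sum(
        1 for rec in records
        if all(rec.get(field) is not None for field in fields)
    )
    return round(100 * hits / len(records), 1)


# Tiny illustrative dataset (not project data).
sample = [
    {"name": "A", "latitude": 48.2, "longitude": 16.4, "wikidata_id": "Q1"},
    {"name": "B", "latitude": 47.1, "longitude": 15.4, "wikidata_id": None},
    {"name": "C", "latitude": 47.8, "longitude": 13.0},
    {"name": "D", "latitude": None, "longitude": None, "wikidata_id": "Q2"},
]

geocoded_pct = coverage(sample, "latitude", "longitude")
wikidata_pct = coverage(sample, "wikidata_id")
```

A record counts as geocoded here only when both coordinates are present; a different rule would change the percentage.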
### 3. Scripts Created
- `scripts/scrapers/harvest_ddb_institutions.py` - DDB API harvester
- `scripts/scrapers/consolidate_austrian_data.py` - Austrian multi-source merger
- `scripts/scrapers/crossreference_german_data.py` - German ISIL+DDB cross-reference
---
## 📊 Current Data Status
| Country | Institutions | Status |
|---------|-------------|--------|
| 🇩🇪 Germany | 20,761 | ✅ Unified |
| 🇨🇿 Czech Republic | 8,694 | ✅ Complete |
| 🇦🇹 Austria | 4,348 | ✅ Consolidated |
| 🇨🇭 Switzerland | 2,379 | ✅ Complete |
| 🇳🇱 Netherlands | ~1,400 | ✅ Complete |
| 🇧🇪 Belgium | 438 | ✅ Complete |
| **Total** | **37,582** | **38.7% of global target** |
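
Because rows are added over time and the Netherlands count is approximate, the total row is easy to let drift. A small sketch for recomputing it from the per-country counts follows; the counts are copied from the table, the Netherlands value is taken as exactly 1,400 as an assumption, and `share_of_target` is a hypothetical helper that expects the caller to supply the global target.

```python
# Counts copied from the status table; Netherlands is approximate (~1,400).
counts = {
    "Germany": 20_761,
    "Czech Republic": 8_694,
    "Austria": 4_348,
    "Switzerland": 2_379,
    "Netherlands": 1_400,
    "Belgium": 438,
}


def total_institutions(counts):
    """Sum the per-country institution counts."""
    return sum(counts.values())


def share_of_target(total, target):
    """Percentage of a global target; the target must be supplied by the caller."""
    return round(100 * total / target, 1)


print(f"{total_institutions(counts):,}")
```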
---
## 🚀 Next Steps
1. **Denmark ISIL harvest** → Complete Phase 1
2. **Data quality audit** → Review a random sample of 100 records
3. **LinkML conversion** → Export to HeritageCustodian schema
4. **Wikidata enrichment** → Add Q-numbers to German institutions
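
For the data quality audit, a reproducible random draw makes it possible to revisit the same records later. The sketch below assumes the records are already loaded as a list; the function name `audit_sample` and the default seed are illustrative choices, not part of the existing scripts.

```python
import random


def audit_sample(records, k=100, seed=20251119):
    """Draw up to k records without replacement, reproducibly for a given seed."""
    rng = random.Random(seed)  # local generator, leaves global random state alone
    return rng.sample(records, min(k, len(records)))


# Illustrative stand-in for a loaded dataset.
records = [{"id": i} for i in range(500)]
picked = audit_sample(records)
```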
---
## 🔑 Key Files
```
data/isil/germany/
└── german_institutions_unified_20251119_181857.json (39.2 MB)
data/isil/austria/
└── austrian_institutions_consolidated_20251119_181541.json (1.78 MB)
scripts/scrapers/
├── harvest_ddb_institutions.py
├── consolidate_austrian_data.py
└── crossreference_german_data.py
```
---
**Last Updated**: 2025-11-19T18:30:00Z
**Session**: DDB Harvest & Unification Complete
**Status**: Ready for Phase 1 Completion