glam/data/instances/all/archive/ARCHIVE_NOTES.md
2025-11-19 23:25:22 +01:00

# Archive Notes - November 11, 2025
## Purpose
This directory contains archived backup files that have been superseded by the master dataset `globalglam-20251111.yaml`.
## Archived Files
### 1. unified_global_heritage_institutions.yaml.backup
- **Size**: 24 MB
- **Institutions**: 4,036
- **Date**: November 10, 2025 10:18
- **Reason**: Superseded by master dataset (13,502 institutions)
- **Status**: Incomplete backup from intermediate merge step
### 2. unified_global_heritage_institutions.yaml.backup2
- **Size**: 24 MB
- **Institutions**: 4,036
- **Date**: November 10, 2025 11:56
- **Reason**: Superseded by master dataset (13,502 institutions)
- **Status**: Incomplete backup from intermediate merge step
### 3. unified_global_heritage_institutions_backup_20251111_092645.yaml
- **Size**: 24 MB
- **Institutions**: 4,036
- **Date**: November 11, 2025 09:26
- **Reason**: Superseded by master dataset (13,502 institutions)
- **Status**: Timestamped backup from pre-final merge state
## Why These Files Were Archived
All three backup files contain **only 4,036 institutions** each, representing incomplete data from intermediate merge steps. The authoritative master dataset `globalglam-20251111.yaml` contains **13,502 institutions**, the result of deduplicating 25,963 raw records (a 48.0% reduction).
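The deduplication figure can be checked from the two counts given above: going from 25,963 raw records to 13,502 institutions removes 12,461 records. A minimal sketch of that arithmetic, where `pct_reduction` is a hypothetical helper name and not part of the dataset tooling:

```bash
# Hypothetical helper: share of raw records removed by deduplication, in percent.
# Arguments: <raw record count> <record count after deduplication>
pct_reduction() {
  awk -v raw="$1" -v kept="$2" \
    'BEGIN { printf "%.1f\n", (raw - kept) / raw * 100 }'
}

pct_reduction 25963 13502   # prints 48.0
```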
### Key Differences
| Aspect | Archived Backups | Master Dataset |
|--------|------------------|----------------|
| Institutions | 4,036 | 13,502 |
| Coverage | Partial (1-2 countries) | 18 countries |
| Wikidata Coverage | ~30% | 55.7% (7,520 institutions) |
| Geocoding | ~40% | 60.6% (8,178 institutions) |
| Status | Intermediate state | Final merged dataset |
## Restoration Instructions
If you need to restore these files:
```bash
# Run from inside the archive/ directory: copy the backup to the parent directory
cp unified_global_heritage_institutions.yaml.backup ../
# Rename if needed
mv ../unified_global_heritage_institutions.yaml.backup \
   ../unified_global_heritage_institutions.yaml.restored
```
⚠️ **WARNING**: These files should NOT be used for production. They contain incomplete data and have been superseded by the master dataset.
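If a restored copy is needed for comparison, a slightly more defensive variant of the restore step may be useful. The sketch below assumes it is run from inside the archive directory; `restore_backup` is a hypothetical helper, not an existing script in this repository. It refuses to overwrite an existing file and confirms that the copy is byte-identical to the archived original.

```bash
# Hypothetical helper: copy an archived backup to a new path and verify the copy.
# Arguments: <archived file> <destination path>
restore_backup() {
  src="$1"
  dest="$2"
  # Never clobber an existing file (for example the master dataset).
  if [ -e "$dest" ]; then
    echo "refusing to overwrite existing file: $dest" >&2
    return 1
  fi
  # Copy, then compare byte for byte; cmp -s is silent and only sets the exit status.
  cp "$src" "$dest" && cmp -s "$src" "$dest"
}

# Example call (run from inside the archive directory):
#   restore_backup unified_global_heritage_institutions.yaml.backup \
#     ../unified_global_heritage_institutions.yaml.restored
```

Writing to a `.restored` name keeps the copy clearly separate from `globalglam-20251111.yaml`.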
## Deletion Policy
These files may be safely deleted once all of the following hold:
- 30 days have passed since archiving (December 11, 2025)
- The master dataset has been verified as stable
- No outstanding issues require historical comparison
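The date condition is the only one that can be checked mechanically. A minimal sketch, assuming dates are compared as `YYYYMMDD` integers; `retention_elapsed` and its optional second argument (a date override for testing) are hypothetical:

```bash
# Hypothetical helper: succeeds once today's date is on or after the cutoff.
# Arguments: <cutoff as YYYYMMDD> [today as YYYYMMDD, defaults to the current date]
retention_elapsed() {
  cutoff="$1"
  today="${2:-$(date +%Y%m%d)}"
  [ "$today" -ge "$cutoff" ]
}

if retention_elapsed 20251211; then
  echo "Retention period over; confirm the other two conditions before deleting."
else
  echo "Retention period still running; keep the archived backups."
fi
```

The check deliberately deletes nothing, since the stability and outstanding-issue conditions still need a human decision.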
## Related Documentation
- `../FILE_STATUS.md` - Current file authority documentation
- `../README.md` - Master dataset overview
- `../DATASET_STATISTICS.yaml` - Statistical analysis of master dataset
---
**Archived**: November 11, 2025
**Archived By**: OpenCODE AI Assistant
**Total Size**: 72 MB (3 files × 24 MB)