Archive Notes - November 11, 2025

Purpose

This directory contains archived backup files that have been superseded by the master dataset globalglam-20251111.yaml.

Archived Files

1. unified_global_heritage_institutions.yaml.backup

  • Size: 24 MB
  • Institutions: 4,036
  • Date: November 10, 2025 10:18
  • Reason: Superseded by master dataset (13,502 institutions)
  • Status: Incomplete backup from intermediate merge step

2. unified_global_heritage_institutions.yaml.backup2

  • Size: 24 MB
  • Institutions: 4,036
  • Date: November 10, 2025 11:56
  • Reason: Superseded by master dataset (13,502 institutions)
  • Status: Incomplete backup from intermediate merge step

3. unified_global_heritage_institutions_backup_20251111_092645.yaml

  • Size: 24 MB
  • Institutions: 4,036
  • Date: November 11, 2025 09:26
  • Reason: Superseded by master dataset (13,502 institutions)
  • Status: Timestamped backup from pre-final merge state

Why These Files Were Archived

All three backup files contain only 4,036 institutions, representing incomplete data from intermediate merge steps. The authoritative master dataset globalglam-20251111.yaml contains 13,502 institutions, a 48.0% reduction from the 25,963 raw records after deduplication.
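
The 48.0% figure is the share of raw records removed by deduplication: (25,963 - 13,502) / 25,963. A minimal sketch of that arithmetic follows; `pct_reduction` is a hypothetical helper name used only for illustration and is not part of the project.

```shell
# pct_reduction RAW FINAL
# Prints the percentage of RAW records that are no longer present in FINAL.
# Hypothetical helper, shown only to make the arithmetic explicit.
pct_reduction() {
    LC_ALL=C awk -v raw="$1" -v final="$2" \
        'BEGIN { printf "%.1f\n", (raw - final) / raw * 100 }'
}

# Counts taken from these notes: 25,963 raw records, 13,502 after deduplication.
pct_reduction 25963 13502   # prints 48.0
```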

Key Differences

| Aspect | Archived Backups | Master Dataset |
|---|---|---|
| Institutions | 4,036 | 13,502 |
| Coverage | Partial (1-2 countries) | 18 countries |
| Wikidata Coverage | ~30% | 55.7% (7,520 institutions) |
| Geocoding | ~40% | 60.6% (8,178 institutions) |
| Status | Intermediate state | Final merged dataset |

Restoration Instructions

If you need to restore these files:

# Run from inside the archive directory: copy back to the parent directory
cp unified_global_heritage_institutions.yaml.backup ../

# Rename if needed
mv ../unified_global_heritage_institutions.yaml.backup \
   ../unified_global_heritage_institutions.yaml.restored

⚠️ WARNING: These files should NOT be used for production. They contain incomplete data and have been superseded by the master dataset.
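
If a restore is ever needed, it may be worth guarding against overwriting an existing file and confirming that the copy is byte-identical. The following is a sketch only, assuming it is run from inside the archive directory; `restore_backup` is a hypothetical helper and not an existing project script.

```shell
# restore_backup SRC DEST
# Copies an archived file to DEST without overwriting anything, then
# verifies that the copy matches the source byte for byte.
# Hypothetical helper for illustration; it is not part of the repository.
restore_backup() {
    src="$1"
    dest="$2"
    if [ ! -f "$src" ]; then
        echo "missing source: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "refusing to overwrite: $dest" >&2
        return 1
    fi
    # -p preserves timestamps and permissions; cmp -s is silent and
    # returns non-zero if the two files differ.
    cp -p -- "$src" "$dest" && cmp -s -- "$src" "$dest"
}

# Example call (file names as used in the instructions above):
# restore_backup unified_global_heritage_institutions.yaml.backup \
#     ../unified_global_heritage_institutions.yaml.restored
```

The function returns non-zero if the source is missing, the destination already exists, or the verification fails, so it can be chained with `&&` in a larger script.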

Deletion Policy

These files may be safely deleted after:

  • 30 days (December 11, 2025)
  • Verification that master dataset is stable
  • No outstanding issues requiring historical comparison

Related Documentation

  • ../FILE_STATUS.md - Current file authority documentation
  • ../README.md - Master dataset overview
  • ../DATASET_STATISTICS.yaml - Statistical analysis of master dataset
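
The date criterion in the Deletion Policy (December 11, 2025) can be checked mechanically; the other two criteria still need human judgment. A minimal sketch, assuming the cutoff day itself counts as elapsed; `past_retention` is a hypothetical helper name.

```shell
# past_retention CUTOFF [TODAY]
# Succeeds if TODAY (default: the current date) is on or after CUTOFF.
# Both dates are YYYYMMDD integers, so a plain numeric comparison is enough.
# Hypothetical helper; treating the cutoff day as elapsed is an assumption.
past_retention() {
    cutoff="$1"
    today="${2:-$(date +%Y%m%d)}"
    [ "$today" -ge "$cutoff" ]
}

if past_retention 20251211; then
    echo "Retention period over; confirm the remaining criteria before deleting."
else
    echo "Retention period still running; keep the archived files."
fi
```

The script only reports status and deletes nothing, which leaves the stability check and the review of outstanding issues as explicit manual steps.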

Archived: November 11, 2025
Archived By: OpenCODE AI Assistant
Total Size: 72 MB (3 files × 24 MB)