glam/NEXT_SESSION_HANDOFF.md
2025-11-19 23:25:22 +01:00

Next Session Handoff

Last Updated: 2025-11-18
Current Focus: Argentina heritage data extraction

Session Summary

Completed the investigation of the Z39.50 approach for Argentina and, based on the findings, pivoted to contacting IRAM directly.

See detailed summary: SESSION_SUMMARY_ARGENTINA_Z3950_INVESTIGATION.md

Current Argentina Status

Completed

  1. CONABIP Libraries - 288 popular libraries scraped + Wikidata enriched
  2. AGN (Archivo General de la Nación) - National archive scraped
  3. Z39.50 Investigation - Determined authority catalog unsuitable for ISIL extraction
  4. Email drafts created - Ready to contact IRAM and Biblioteca Nacional

Data Files Ready

  • data/isil/AR/conabip_libraries_wikidata_enriched.json (288 libraries)
  • data/isil/AR/agn_argentina_archives.json (1 archive)
  • data/isil/AR/EMAIL_DRAFTS_ISIL_REQUEST.md (3 email templates)

Critical Decision Made

Do NOT implement full Z39.50 client

Reason: Biblioteca Nacional's authority catalog (BNA10) lists primarily foreign institutions (Spanish archives, Korean archives, etc.) rather than Argentine institutions with ISIL codes. The estimated yield is fewer than 50 institutions, not the 200-500 hoped for.

INSTEAD: Contact IRAM directly

IRAM (Instituto Argentino de Normalización y Certificación, the Argentine standards institute) is the official ISIL agency and maintains the authoritative registry, estimated at 500-1,000 institutions.

Immediate Next Steps (Priority Order)

1. Send IRAM Email (TOP PRIORITY)

File: data/isil/AR/EMAIL_DRAFTS_ISIL_REQUEST.md (Email #1)
To: iram-iso@iram.org.ar
Subject: Solicitud de acceso al registro nacional de códigos ISIL

Action: Copy email body, customize with your name/affiliation, send

Expected outcome: 60% chance of response with ISIL registry CSV/Excel

2. Send Biblioteca Nacional Email

File: data/isil/AR/EMAIL_DRAFTS_ISIL_REQUEST.md (Email #2)
To: dpt@bn.gov.ar
Subject: Consulta sobre acceso a códigos ISIL en catálogo de autoridades

Action: Send as backup/alternative source

3. Complete CONABIP LinkML Export (While Waiting for IRAM)

Convert 288 CONABIP libraries to LinkML YAML:

# Planned invocation (the script does not exist yet; see the note below)
python3 scripts/convert_argentina_to_linkml.py \
  --input data/isil/AR/conabip_libraries_wikidata_enriched.json \
  --output data/instances/argentina/

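Note: scripts/convert_argentina_to_linkml.py does not exist yet. Create it, or adapt it from an existing parser such as src/glam_extractor/parsers/argentina_conabip.py.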
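
A minimal sketch of what the conversion could look like, using only the standard library. The input field names (`name`, `wikidata_id`) and the output slot names (`id`, `institution_type`, `country`) are assumptions for illustration; they are not taken from the actual CONABIP JSON or the project's LinkML schema and would need to be aligned with both. It relies on the fact that JSON scalars are valid YAML scalars, so no YAML library is required.

```python
import json
import re
import unicodedata
from pathlib import Path


def slugify(text: str) -> str:
    """Build an ASCII, filesystem-safe slug from an institution name."""
    ascii_text = (
        unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def to_yaml(record: dict) -> str:
    """Serialize a flat record as YAML, skipping empty (None) values."""
    lines = [
        f"{key}: {json.dumps(value, ensure_ascii=False)}"
        for key, value in record.items()
        if value is not None
    ]
    return "\n".join(lines) + "\n"


def convert(libraries: list[dict], out_dir: Path) -> int:
    """Write one YAML file per library and return the number processed."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for lib in libraries:
        record = {
            # Hypothetical slot names; align with the real LinkML schema.
            "id": f"ar-conabip-{slugify(lib['name'])}",
            "name": lib["name"],
            "institution_type": "LIBRARY",
            "country": "AR",
            "wikidata_id": lib.get("wikidata_id"),
        }
        # Caveat: two libraries with the same name would overwrite each other.
        path = out_dir / f"{record['id']}.yaml"
        path.write_text(to_yaml(record), encoding="utf-8")
    return len(libraries)
```

Calling `convert(json.loads(Path(input_file).read_text(encoding="utf-8")), Path("data/instances/argentina"))` would mirror the planned `--input` and `--output` arguments, assuming the input file is a top-level JSON array.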

4. Add AGN to Instances

Convert AGN JSON to LinkML YAML:

# Manual conversion or script
# Input: data/isil/AR/agn_argentina_archives.json
# Output: data/instances/argentina/agn_archive.yaml
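
Because the AGN record is nested (one institution with two collections), a script for this step needs to handle nested mappings and lists. The sketch below is a small hand-rolled block-style YAML emitter; it assumes simple string keys, and the sample field names and collection titles are placeholders, not the real AGN data. If PyYAML is available in the project, `yaml.safe_dump` would be the more usual choice.

```python
import json


def emit_yaml(value, indent=0):
    """Return a list of YAML lines for nested dicts, lists, and scalars."""
    pad = "  " * indent
    lines = []
    if isinstance(value, dict):
        for key, item in value.items():
            if isinstance(item, (dict, list)) and item:
                lines.append(f"{pad}{key}:")
                lines.extend(emit_yaml(item, indent + 1))
            else:
                # Scalars (and empty containers) are written in JSON form,
                # which is also valid YAML.
                lines.append(f"{pad}{key}: {json.dumps(item, ensure_ascii=False)}")
    elif isinstance(value, list):
        for item in value:
            if isinstance(item, (dict, list)) and item:
                child = emit_yaml(item, indent + 1)
                # Fold the first child line onto the "- " marker.
                lines.append(f"{pad}- {child[0].lstrip()}")
                lines.extend(child[1:])
            else:
                lines.append(f"{pad}- {json.dumps(item, ensure_ascii=False)}")
    return lines


# Placeholder record; the real keys come from agn_argentina_archives.json.
agn = {
    "name": "Archivo General de la Nación",
    "institution_type": "ARCHIVE",
    "collections": [{"title": "Collection A"}, {"title": "Collection B"}],
}
print("\n".join(emit_yaml(agn)))
```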

5. Follow-up Strategy

Timeline:

  • Week 1 (Nov 18-25): Wait for IRAM response
  • Week 2 (Nov 25-Dec 2): Send reminder + contact SISBI-UBA (Email #3)
  • Week 3 (Dec 2-9): If no response, pivot to manual extraction

Technical Artifacts Created This Session

Scripts

  1. scripts/scrapers/scrape_agn_argentina.py (AGN scraper - complete)
  2. scripts/query_biblioteca_nacional_z3950.py (Z39.50 framework - do not complete)

Data

  1. data/isil/AR/agn_argentina_archives.json (1 institution, 2 collections)

Documentation

  1. SESSION_SUMMARY_ARGENTINA_Z3950_INVESTIGATION.md (full investigation report)
  2. data/isil/AR/EMAIL_DRAFTS_ISIL_REQUEST.md (email templates)

Argentina Coverage Summary

| Dataset | Institutions | Type | Status |
|---|---|---|---|
| CONABIP | 288 | Popular libraries | Scraped + enriched |
| AGN | 1 | National archive | Scraped |
| Total Current | 289 | Mixed | Ready for LinkML export |
| IRAM Registry (potential) | 500-1,000 | All types | Awaiting email response |
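
The current totals can be re-derived from the data files before export. This is a sketch under two assumptions inferred from the `jq` filters in Quick Reference Commands: the CONABIP file is a top-level JSON array, and the AGN file is an object with an `institutions` array. The function names are illustrative.

```python
import json
from pathlib import Path


def count_institutions(path) -> int:
    """Count records in a JSON file that is either a list or
    an object holding an 'institutions' list (assumed shapes)."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        data = data.get("institutions", [])
    return len(data)


def coverage(paths: dict) -> dict:
    """Map each dataset name to its record count and add a running total."""
    counts = {name: count_institutions(path) for name, path in paths.items()}
    counts["Total Current"] = sum(counts.values())
    return counts
```

With the two files listed under Data Files Ready, `coverage({"CONABIP": "data/isil/AR/conabip_libraries_wikidata_enriched.json", "AGN": "data/isil/AR/agn_argentina_archives.json"})` should report 288, 1, and 289 if the files match the table.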

Lessons Learned

Authority Catalogs Are Not Institutional Directories

Library authority catalogs (accessible via Z39.50) are designed for bibliographic control (standardizing how institutions are cited in bibliographic records), not as comprehensive directories of heritage institutions.

Implication: Z39.50 access is useful for international citation standardization, but not for discovering domestic institutions with ISIL codes.
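
One way to check this kind of yield on any candidate source is to count the records that carry an Argentine ISIL code. The sketch below uses a simplified ISIL pattern (country prefix `AR`, a hyphen, then a unit identifier of up to 11 characters drawn from letters, digits, `:`, `/`, and `-`). The `isil` field name is an assumption, and the sample records are invented placeholders, not data from BNA10.

```python
import re

# Simplified shape of an Argentine ISIL: "AR-" plus a unit identifier
# of 1 to 11 permitted characters. Not a full validation of the standard.
AR_ISIL = re.compile(r"^AR-[A-Za-z0-9:/-]{1,11}$")


def argentine_isil_records(records):
    """Keep only records whose (hypothetical) 'isil' field looks Argentine."""
    return [r for r in records if AR_ISIL.match(r.get("isil") or "")]


# Invented placeholder records for illustration only.
sample = [
    {"name": "Placeholder foreign archive", "isil": "ES-0000000"},
    {"name": "Placeholder Argentine library", "isil": "AR-0000001"},
    {"name": "Placeholder record without a code", "isil": None},
]
matches = argentine_isil_records(sample)
print(f"{len(matches)} of {len(sample)} records carry an AR- ISIL code")
```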

Best ISIL Data Sources (Priority Order)

  1. Official ISIL agency (IRAM) - Most authoritative
  2. Library consortia (SISBI-UBA, JUBIUNA) - Network directories
  3. Ministry of Culture directories
  4. Web scraping institutional websites
  5. Authority catalogs - Primarily foreign institutions (lowest yield)

Files to Review Next Session

Investigation

  • data/isil/AR/ARGENTINA_ISIL_INVESTIGATION.md (comprehensive research, 456 lines)

Data to Process

  • data/isil/AR/conabip_libraries_wikidata_enriched.json
  • data/isil/AR/agn_argentina_archives.json

Parser Available

  • src/glam_extractor/parsers/argentina_conabip.py

Email Templates

  • data/isil/AR/EMAIL_DRAFTS_ISIL_REQUEST.md

Quick Reference Commands

# Check AGN data
jq '.institutions[]' data/isil/AR/agn_argentina_archives.json

# Check CONABIP data
jq 'length' data/isil/AR/conabip_libraries_wikidata_enriched.json  # Should show 288

# View email templates
cat data/isil/AR/EMAIL_DRAFTS_ISIL_REQUEST.md

# Test Z39.50 connection (if needed later)
python3 scripts/query_biblioteca_nacional_z3950.py --test

Ready to Resume: Send IRAM email, then work on LinkML export while waiting for response