# Session Summary: Argentina Z39.50 Investigation & Path Forward **Date**: 2025-11-18 **Status**: Investigation Complete - Recommendation Ready ## What We Accomplished ### 1. ✅ AGN Scraper Executed Successfully **File**: `scripts/scrapers/scrape_agn_argentina.py` **Output**: `data/isil/AR/agn_argentina_archives.json` **Results**: - 1 institution: Archivo General de la Nación (National Archive) - 2 collections: Main archive + document collections - KOHA catalog URLs not accessible (expected) **Data Extracted**: ```json { "name": "Archivo General de la Nación", "name_en": "National Archive of Argentina", "type": "ARCHIVE", "country": "AR", "city": "Buenos Aires", "province": "Ciudad Autónoma de Buenos Aires", "url": "https://argentina.gob.ar/interior/archivo-general-de-la-nacion" } ``` ### 2. ✅ Z39.50 Client Framework Created **File**: `scripts/query_biblioteca_nacional_z3950.py` **Confirmed**: - ✅ Z39.50 server is accessible: `200.123.191.9:9991` - ✅ Database: `BNA10` (Authority records) - ✅ Authentication: Username `Z39.50` / Password `Z39.50` - ✅ Format: MARC21, UTF-8 encoding ### 3. ✅ Biblioteca Nacional Web Catalog Investigated **URL**: https://catalogo.bn.gov.ar/ **Findings**: - Modern web interface with "Catálogo de Autoridades" (Authority Catalog) - Contains MARC21 authority records accessible via web browser - Z39.50 connection details prominently displayed - **However**: Most authority records appear to be for **foreign institutions** (e.g., Spanish archives) - Search for "archivo" returned mainly: Archivo General de Indias (Spain), Korean Film Archive, etc. - **Few Argentine institutions** found in authority catalog browsing ## Critical Discovery: Authority Catalog Limitation ### Problem Identified The Biblioteca Nacional's authority catalog (BNA10 database) appears to contain primarily: - **Foreign institutions** referenced in Argentine bibliographic records - International archives and libraries (Spanish, Korean, etc.) - **Not a comprehensive registry of Argentine institutions** ### Why This Matters The investigation document (`data/isil/AR/ARGENTINA_ISIL_INVESTIGATION.md`) suggested: > "MARC field 024 (Standard Identifier) in authority records contains ISIL codes" **However**, the authority catalog is designed for **bibliographic authority control** (standardizing how foreign institutions are cited), not as a **directory of Argentine heritage institutions**. ## Current Argentina Data Status | Dataset | Count | Type | Coverage | Status | |---------|-------|------|----------|--------| | **CONABIP Libraries** | 288 | Popular libraries | Nationwide | ✅ Scraped + Wikidata enriched | | **AGN** | 1 | National archive | Buenos Aires | ✅ Scraped (this session) | | **BN Authority Catalog** | ~10-50? | Mixed (mostly foreign) | Limited Argentine | ⚠️ Not suitable for bulk extraction | ## Recommendation: Pivot Strategy ### ❌ Do NOT Invest in Z39.50 Implementation **Reasons**: 1. Authority catalog contains mainly **foreign institutions**, not Argentine ones 2. Estimated yield of **Argentine ISIL codes: < 50 institutions** (not 200-500 as hoped) 3. High implementation effort (2-3 hours for PyZ3950 client) for minimal return 4. Web scraping the authority catalog would be tedious with old OPAC system ### ✅ INSTEAD: Contact IRAM Directly **Best approach** identified in investigation document: **Email**: iram-iso@iram.org.ar **Subject**: Solicitud de acceso al registro ISIL de Argentina **Rationale**: - IRAM is the **official ISIL agency** for Argentina - They maintain the **authoritative registry** of 500-1,000 institutions - Direct data export is the most **efficient** path - Precedent: Other countries (Netherlands, Belgium) provide ISIL registries **Email Template** (from investigation document): ``` Estimados, Soy investigador trabajando en un proyecto de patrimonio cultural global y estoy recopilando datos sobre instituciones GLAM en Argentina. ¿Sería posible acceder al registro completo de códigos ISIL asignados en Argentina? Cualquier formato (CSV, Excel, PDF) sería útil. Muchas gracias, [Your name] ``` ### ✅ Alternative: University Library Networks If IRAM doesn't respond, pursue: 1. **SISBI-UBA** (University of Buenos Aires library system) - URL: http://www.sisbi.uba.ar/ - ~40 faculty libraries - May have ISIL codes or institutional directory 2. **JUBIUNA** (National Universities Library Network) - Consortium of Argentine university libraries - Potential ISIL code aggregation ## Technical Artifacts Created ### Scripts Created This Session 1. ✅ `scripts/scrapers/scrape_agn_argentina.py` (264 lines) 2. ✅ `scripts/query_biblioteca_nacional_z3950.py` (285 lines - framework only) ### Data Files Created 1. ✅ `data/isil/AR/agn_argentina_archives.json` (1 institution + 2 collections) ### Z39.50 Client Status - **Framework**: Complete (connection, data structures, MARC parsing) - **Implementation**: Incomplete (requires PyZ3950 library) - **Recommendation**: **Do not complete** - authority catalog doesn't contain target data ## Next Steps (Priority Order) ### Immediate (Today) 1. ✅ **Send email to IRAM** requesting ISIL registry export 2. ✅ **Send email to Biblioteca Nacional** (dpt@bn.gov.ar) asking for guidance on accessing Argentine ISIL codes ### Short-term (This Week) 3. **Complete CONABIP pipeline** - Export 288 libraries to LinkML YAML 4. **Add AGN to instances** - Convert AGN JSON to LinkML YAML 5. **Document Argentina coverage** in main PROGRESS.md ### If IRAM Responds Positively 6. Parse ISIL registry CSV/Excel 7. Cross-reference with CONABIP + AGN data 8. Geocode addresses 9. Export to LinkML YAML ### If IRAM Doesn't Respond (2-week timeout) 10. Investigate SISBI-UBA library directory 11. Investigate JUBIUNA network 12. Consider manual extraction from ministerial websites ## Files for Next Session ### Ready to Process - `data/isil/AR/conabip_libraries_wikidata_enriched.json` (288 libraries) - `data/isil/AR/agn_argentina_archives.json` (1 archive) ### Parser Available - `src/glam_extractor/parsers/argentina_conabip.py` ### Investigation Complete - `data/isil/AR/ARGENTINA_ISIL_INVESTIGATION.md` (comprehensive research) ## Lessons Learned ### Authority Catalogs ≠ Institutional Directories **Key Insight**: Library authority catalogs are designed for **bibliographic control** (standardizing citations), not as **comprehensive directories** of heritage institutions. **Implication**: Z39.50 access to authority records is useful for **international citation standardization**, not for discovering **domestic institutions** with ISIL codes. ### Best Data Sources for ISIL Codes (in priority order) 1. **Official ISIL agency registries** (IRAM for Argentina) 2. **National library consortia** (SISBI-UBA, JUBIUNA) 3. **Ministry of Culture directories** 4. **Web scraping institutional websites** 5. ❌ ~~Authority catalogs~~ (foreign institutions only) ## Summary **AGN Scraper**: ✅ Success - 1 institution extracted **Z39.50 Client**: ⚠️ Framework created but not implemented **Authority Catalog**: ❌ Not suitable for Argentine ISIL extraction **Recommended Action**: ✉️ Contact IRAM directly for ISIL registry **Estimated Argentina Coverage**: - Current: 289 institutions (288 CONABIP + 1 AGN) - Potential with IRAM registry: 500-1,000 institutions - Data quality: TIER_1_AUTHORITATIVE (if IRAM responds) --- **Next session**: Send IRAM email, complete CONABIP LinkML export, add AGN to instances