Session Summary: Argentina Z39.50 Investigation & Path Forward
Date: 2025-11-18
Status: Investigation Complete - Recommendation Ready
What We Accomplished
1. ✅ AGN Scraper Executed Successfully
File: scripts/scrapers/scrape_agn_argentina.py
Output: data/isil/AR/agn_argentina_archives.json
Results:
- 1 institution: Archivo General de la Nación (National Archive)
- 2 collections: Main archive + document collections
- KOHA catalog URLs not accessible (expected)
Data Extracted:
{
  "name": "Archivo General de la Nación",
  "name_en": "National Archive of Argentina",
  "type": "ARCHIVE",
  "country": "AR",
  "city": "Buenos Aires",
  "province": "Ciudad Autónoma de Buenos Aires",
  "url": "https://argentina.gob.ar/interior/archivo-general-de-la-nacion"
}
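A quick sanity check on the extracted record could look like the sketch below. The list of required fields and the `validate_institution` helper are assumptions made for illustration; they are not taken from the scraper or from the project's schema, which may enforce different rules.

```python
import json

# Assumed required keys, based only on the record shown above; the real
# project schema may require more (or different) fields.
REQUIRED_FIELDS = ("name", "type", "country", "city", "url")


def validate_institution(record: dict) -> list[str]:
    """Return a list of human-readable problems found in one record."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    country = record.get("country")
    if isinstance(country, str) and country.strip() and len(country) != 2:
        problems.append("country should be a 2-letter ISO 3166-1 code")
    url = record.get("url")
    if isinstance(url, str) and url.strip() and not url.startswith(("http://", "https://")):
        problems.append("url should start with http:// or https://")
    return problems


# The record extracted this session, copied from the summary above.
agn_record = json.loads("""
{
  "name": "Archivo General de la Nación",
  "name_en": "National Archive of Argentina",
  "type": "ARCHIVE",
  "country": "AR",
  "city": "Buenos Aires",
  "province": "Ciudad Autónoma de Buenos Aires",
  "url": "https://argentina.gob.ar/interior/archivo-general-de-la-nacion"
}
""")

problems = validate_institution(agn_record)
```

An empty `problems` list means the record passes these basic checks. Running the same function over every entry in the CONABIP file would be one way to catch gaps before the LinkML export.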
2. ✅ Z39.50 Client Framework Created
File: scripts/query_biblioteca_nacional_z3950.py
Confirmed:
- ✅ Z39.50 server is accessible: 200.123.191.9:9991
- ✅ Database: BNA10 (Authority records)
- ✅ Authentication: Username Z39.50 / Password Z39.50
- ✅ Format: MARC21, UTF-8 encoding
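The connection details above can be captured in a small configuration object, together with a plain TCP reachability check. This is a minimal sketch using only the standard library; the `Z3950Target` class and `is_reachable` function are hypothetical names, not part of scripts/query_biblioteca_nacional_z3950.py. The check only confirms that the port accepts TCP connections and does not perform a Z39.50 Init handshake.

```python
import socket
from dataclasses import dataclass


@dataclass(frozen=True)
class Z3950Target:
    """Connection details as recorded in this session's findings."""
    host: str = "200.123.191.9"
    port: int = 9991
    database: str = "BNA10"
    record_syntax: str = "MARC21"
    encoding: str = "utf-8"


def is_reachable(target: Z3950Target, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This is a transport-level check only; it says nothing about whether
    the server would accept a Z39.50 session or the given credentials.
    """
    try:
        with socket.create_connection((target.host, target.port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_reachable(Z3950Target())` probes the Biblioteca Nacional server with the defaults listed above. The call is left out of the block so that importing it never triggers network traffic.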
3. ✅ Biblioteca Nacional Web Catalog Investigated
URL: https://catalogo.bn.gov.ar/
Findings:
- Modern web interface with "Catálogo de Autoridades" (Authority Catalog)
- Contains MARC21 authority records accessible via web browser
- Z39.50 connection details prominently displayed
- However: Most authority records appear to be for foreign institutions (e.g., Spanish archives)
- Search for "archivo" returned mainly: Archivo General de Indias (Spain), Korean Film Archive, etc.
- Few Argentine institutions found in authority catalog browsing
Critical Discovery: Authority Catalog Limitation
Problem Identified
The Biblioteca Nacional's authority catalog (BNA10 database) appears to contain primarily:
- Foreign institutions referenced in Argentine bibliographic records
- International archives and libraries (Spanish, Korean, etc.)
- Not a comprehensive registry of Argentine institutions
Why This Matters
The investigation document (data/isil/AR/ARGENTINA_ISIL_INVESTIGATION.md) suggested:
"MARC field 024 (Standard Identifier) in authority records contains ISIL codes"
However, the authority catalog is designed for bibliographic authority control (standardizing how foreign institutions are cited), not as a directory of Argentine heritage institutions.
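To make the field 024 claim concrete, the sketch below filters ISIL codes out of already-parsed authority records. Several things here are assumptions: the dictionary layout for a MARC field is invented for illustration (a real MARC library would expose its own objects), the regular expression is a simplified reading of the ISIL syntax (short prefix, hyphen, identifier of up to 11 characters), and the catalog is assumed to mark ISIL identifiers with "isil" in subfield $2. The sample codes are made up.

```python
import re

# Simplified ISIL shape: 1-4 character prefix, hyphen, then up to 11
# characters from letters, digits, ':', '/' and '-'. Treat as approximate.
ISIL_PATTERN = re.compile(r"^[A-Za-z0-9]{1,4}-[A-Za-z0-9:/-]{1,11}$")


def extract_isil_codes(fields, country_prefix="AR"):
    """Collect ISIL codes for one country from a list of parsed MARC fields.

    Each field is a hypothetical dict of the form
    {"tag": "024", "subfields": [("a", "AR-..."), ("2", "isil")]}.
    """
    codes = []
    for field in fields:
        if field.get("tag") != "024":
            continue
        subfields = field.get("subfields", [])
        # Assumption: the identifier source is named in subfield $2.
        source = next((value for code, value in subfields if code == "2"), "")
        if source.strip().lower() != "isil":
            continue
        for code, value in subfields:
            value = value.strip()
            if (
                code == "a"
                and ISIL_PATTERN.match(value)
                and value.startswith(country_prefix + "-")
            ):
                codes.append(value)
    return codes


# Made-up sample fields: one Argentine ISIL, one foreign ISIL, one non-ISIL.
sample_fields = [
    {"tag": "024", "subfields": [("a", "AR-EXAMPLE1"), ("2", "isil")]},
    {"tag": "024", "subfields": [("a", "ES-EXAMPLE2"), ("2", "isil")]},
    {"tag": "024", "subfields": [("a", "0000000123456789"), ("2", "isni")]},
    {"tag": "110", "subfields": [("a", "Archivo General de Indias")]},
]

argentine_codes = extract_isil_codes(sample_fields)
```

With a catalog dominated by foreign institutions, most 024 fields would fall into the second or third category and be filtered out, which is the low-yield problem described above.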
Current Argentina Data Status
| Dataset | Count | Type | Coverage | Status |
|---|---|---|---|---|
| CONABIP Libraries | 288 | Popular libraries | Nationwide | ✅ Scraped + Wikidata enriched |
| AGN | 1 | National archive | Buenos Aires | ✅ Scraped (this session) |
| BN Authority Catalog | ~10-50? | Mixed (mostly foreign) | Limited Argentine | ⚠️ Not suitable for bulk extraction |
Recommendation: Pivot Strategy
❌ Do NOT Invest in Z39.50 Implementation
Reasons:
- Authority catalog contains mainly foreign institutions, not Argentine ones
- Estimated yield of Argentine ISIL codes: < 50 institutions (not 200-500 as hoped)
- High implementation effort (2-3 hours for PyZ3950 client) for minimal return
- Web scraping the authority catalog would be tedious with old OPAC system
✅ INSTEAD: Contact IRAM Directly
Best approach identified in investigation document:
Email: iram-iso@iram.org.ar
Subject: Solicitud de acceso al registro ISIL de Argentina
Rationale:
- IRAM is the official ISIL agency for Argentina
- They maintain the authoritative registry of 500-1,000 institutions
- Direct data export is the most efficient path
- Precedent: Other countries (Netherlands, Belgium) provide ISIL registries
Email Template (from investigation document):
Estimados,
Soy investigador trabajando en un proyecto de patrimonio cultural global
y estoy recopilando datos sobre instituciones GLAM en Argentina.
¿Sería posible acceder al registro completo de códigos ISIL asignados
en Argentina? Cualquier formato (CSV, Excel, PDF) sería útil.
Muchas gracias,
[Your name]
✅ Alternative: University Library Networks
If IRAM doesn't respond, pursue:
- SISBI-UBA (University of Buenos Aires library system)
  - URL: http://www.sisbi.uba.ar/
  - ~40 faculty libraries
  - May have ISIL codes or institutional directory
- JUBIUNA (National Universities Library Network)
  - Consortium of Argentine university libraries
  - Potential ISIL code aggregation
Technical Artifacts Created
Scripts Created This Session
- ✅ scripts/scrapers/scrape_agn_argentina.py (264 lines)
- ✅ scripts/query_biblioteca_nacional_z3950.py (285 lines, framework only)
Data Files Created
- ✅ data/isil/AR/agn_argentina_archives.json (1 institution + 2 collections)
Z39.50 Client Status
- Framework: Complete (connection, data structures, MARC parsing)
- Implementation: Incomplete (requires PyZ3950 library)
- Recommendation: Do not complete - authority catalog doesn't contain target data
Next Steps (Priority Order)
Immediate (Today)
- Send email to IRAM requesting ISIL registry export
- Send email to Biblioteca Nacional (dpt@bn.gov.ar) asking for guidance on accessing Argentine ISIL codes
Short-term (This Week)
- Complete CONABIP pipeline - Export 288 libraries to LinkML YAML
- Add AGN to instances - Convert AGN JSON to LinkML YAML
- Document Argentina coverage in main PROGRESS.md
If IRAM Responds Positively
- Parse ISIL registry CSV/Excel
- Cross-reference with CONABIP + AGN data
- Geocode addresses
- Export to LinkML YAML
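The parse and cross-reference steps could be sketched as follows. The CSV column names (`name`, `isil`) and the sample ISIL codes are hypothetical, since the format of any IRAM export is unknown. Matching is done on normalized names only, which is a simplification and would miss institutions listed under variant names. Geocoding and the YAML export are not covered.

```python
import csv
import io
import unicodedata


def normalize_name(name: str) -> str:
    """Lowercase, strip accents, and collapse whitespace for matching."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.casefold().split())


def cross_reference(registry_csv, known_institutions):
    """Match registry rows against already-scraped institutions by name.

    registry_csv: a text file object with (assumed) columns "name" and "isil".
    known_institutions: list of dicts that each have a "name" key.
    Returns (matched, unmatched): matched records gain an "isil" key,
    unmatched holds registry rows with no counterpart in the known data.
    """
    known = {normalize_name(inst["name"]): inst for inst in known_institutions}
    matched, unmatched = [], []
    for row in csv.DictReader(registry_csv):
        key = normalize_name(row["name"])
        if key in known:
            matched.append({**known[key], "isil": row["isil"]})
        else:
            unmatched.append(row)
    return matched, unmatched


# Made-up registry content; the ISIL codes are placeholders, not real codes.
sample_registry = io.StringIO(
    "name,isil\n"
    "ARCHIVO GENERAL DE LA NACION,AR-XXXX0001\n"
    "Biblioteca Ejemplo,AR-XXXX0002\n"
)
known = [{"name": "Archivo General de la Nación", "type": "ARCHIVE"}]

matched, unmatched = cross_reference(sample_registry, known)
```

Rows left in `unmatched` would be institutions that are new relative to the CONABIP and AGN data, which is where the registry would add coverage.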
If IRAM Doesn't Respond (2-week timeout)
- Investigate SISBI-UBA library directory
- Investigate JUBIUNA network
- Consider manual extraction from ministerial websites
Files for Next Session
Ready to Process
- data/isil/AR/conabip_libraries_wikidata_enriched.json (288 libraries)
- data/isil/AR/agn_argentina_archives.json (1 archive)
Parser Available
src/glam_extractor/parsers/argentina_conabip.py
Investigation Complete
data/isil/AR/ARGENTINA_ISIL_INVESTIGATION.md (comprehensive research)
Lessons Learned
Authority Catalogs ≠ Institutional Directories
Key Insight: Library authority catalogs are designed for bibliographic control (standardizing citations), not as comprehensive directories of heritage institutions.
Implication: Z39.50 access to authority records is useful for international citation standardization, not for discovering domestic institutions with ISIL codes.
Best Data Sources for ISIL Codes (in priority order)
- Official ISIL agency registries (IRAM for Argentina)
- National library consortia (SISBI-UBA, JUBIUNA)
- Ministry of Culture directories
- Web scraping institutional websites
- ❌ Authority catalogs (mostly foreign institutions)
Summary
AGN Scraper: ✅ Success - 1 institution extracted
Z39.50 Client: ⚠️ Framework created but not implemented
Authority Catalog: ❌ Not suitable for Argentine ISIL extraction
Recommended Action: ✉️ Contact IRAM directly for ISIL registry
Estimated Argentina Coverage:
- Current: 289 institutions (288 CONABIP + 1 AGN)
- Potential with IRAM registry: 500-1,000 institutions
- Data quality: TIER_1_AUTHORITATIVE (if IRAM responds)
Next session: Send IRAM email, complete CONABIP LinkML export, add AGN to instances