Argentina CONABIP Popular Libraries Dataset

Overview

Source: CONABIP (Comisión Nacional de Bibliotecas Populares)
URL: https://www.conabip.gob.ar/buscador_bp
Date Scraped: November 17, 2025
Total Institutions: 288
Coverage: 22 provinces, 220 cities

Files

Basic Dataset (Complete)

  • conabip_libraries.csv (47 KB) - Basic institution data
  • conabip_libraries.json (115 KB) - JSON with metadata

Fields:

  • Registration number (REG)
  • Institution name
  • Province
  • City/Locality
  • Neighborhood
  • Street address
  • Profile URL
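
As a minimal sketch of reading the basic CSV, the snippet below counts rows per province with the standard `csv` module. The header names (`reg`, `name`, `province`, `city`) are assumptions, since this README does not state the actual column headers, and the sample rows are placeholders rather than real records.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_file, column):
    """Count rows per distinct value of `column` in an open CSV file."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not in header: {reader.fieldnames}")
    # Missing trailing values come back as None, so fall back to "".
    return Counter((row[column] or "").strip() for row in reader)


# Placeholder rows for illustration only; header names are assumed.
SAMPLE = """reg,name,province,city
0001,Example Library A,Santa Fe,Example City 1
0002,Example Library B,Santa Fe,Example City 2
0003,Example Library C,Jujuy,Example City 3
"""

counts = count_by_column(io.StringIO(SAMPLE), "province")
print(counts.most_common())
```

For the real file, the same function could be called on `open("conabip_libraries.csv", newline="", encoding="utf-8")`, assuming the file is UTF-8 encoded and the province column is named as above.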

Enhanced Dataset (Sample Only)

  • conabip_libraries_with_profiles_test.csv (12 KB) - 32 institutions
  • conabip_libraries_with_profiles_test.json (24 KB) - 32 institutions

Additional Fields:

  • Latitude/longitude (from Google Maps)
  • Google Maps URL
  • Services offered (WiFi, computers, workshops, etc.)

Note: The full enhanced dataset (all 288 institutions) is not yet complete because profile scraping hit timeout constraints. See SESSION_SUMMARY_ARGENTINA_CONABIP.md for details.
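
One way a long-running profile scrape could tolerate timeouts is to retry with backoff and skip URLs that are already done, so an interrupted run can resume. The sketch below is not the project's scraper: `fetch` is a hypothetical callable supplied by the caller, and the built-in `TimeoutError` stands in for whatever timeout exception the HTTP library in use actually raises.

```python
import time


def fetch_with_retry(fetch, url, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying on TimeoutError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fetch(url)
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller record the failure
            sleep(base_delay * 2 ** attempt)


def scrape_profiles(urls, fetch, done=None, **retry_kwargs):
    """Fetch each URL not already in `done`; return (done, failed_urls)."""
    done = {} if done is None else done
    failed = []
    for url in urls:
        if url in done:
            continue  # already scraped in an earlier run
        try:
            done[url] = fetch_with_retry(fetch, url, **retry_kwargs)
        except TimeoutError:
            failed.append(url)
    return done, failed


# Demo with a fake fetch that times out on its first call only.
_calls = {"n": 0}


def _flaky_fetch(url):
    _calls["n"] += 1
    if _calls["n"] == 1:
        raise TimeoutError(url)
    return f"profile for {url}"


results, failures = scrape_profiles(
    ["profile-1", "profile-2"], _flaky_fetch, sleep=lambda seconds: None
)
print(results, failures)
```

Persisting `done` to disk between runs (for example as JSON) would let a background job pick up where it stopped; that part is left out here.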

Geographic Distribution

Top 10 Provinces

  1. Buenos Aires: 82 institutions (28.5%)
  2. Santa Fe: 61 institutions (21.2%)
  3. Entre Ríos: 27 institutions (9.4%)
  4. Córdoba: 18 institutions (6.3%)
  5. Corrientes: 13 institutions (4.5%)
  6. La Pampa: 12 institutions (4.2%)
  7. Ciudad Autónoma de Buenos Aires: 10 institutions (3.5%)
  8. Jujuy: 8 institutions (2.8%)
  9. Santiago del Estero: 7 institutions (2.4%)
  10. San Juan: 6 institutions (2.1%)

Geographic Spread

  • 220 unique cities represented
  • Average: 13.1 institutions per province
  • Concentration: 49.7% of institutions (143 of 288) are in Buenos Aires and Santa Fe provinces
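
The percentages in this section can be reproduced from the counts listed above. The sketch below uses `decimal` with half-up rounding, because Python's built-in `round` rounds exact halves to even and would give 6.2 rather than 6.3 for Córdoba (18 of 288 is exactly 6.25%). The helper name `share` is illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

TOTAL_INSTITUTIONS = 288
TOTAL_PROVINCES = 22

# Counts copied from the "Top 10 Provinces" list.
TOP_10 = {
    "Buenos Aires": 82,
    "Santa Fe": 61,
    "Entre Ríos": 27,
    "Córdoba": 18,
    "Corrientes": 13,
    "La Pampa": 12,
    "Ciudad Autónoma de Buenos Aires": 10,
    "Jujuy": 8,
    "Santiago del Estero": 7,
    "San Juan": 6,
}


def share(count, total=TOTAL_INSTITUTIONS):
    """Percentage of `total`, rounded half-up to one decimal place."""
    pct = Decimal(100 * count) / Decimal(total)
    return pct.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


for province, count in TOP_10.items():
    print(f"{province}: {count} institutions ({share(count)}%)")

top_two = TOP_10["Buenos Aires"] + TOP_10["Santa Fe"]
print(f"Top two provinces: {top_two} of {TOTAL_INSTITUTIONS} ({share(top_two)}%)")

average = (Decimal(TOTAL_INSTITUTIONS) / TOTAL_PROVINCES).quantize(
    Decimal("0.1"), rounding=ROUND_HALF_UP
)
print(f"Average per province: {average}")
```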

Most Common Institution Names

Popular library names honor Argentine historical figures:

  1. Domingo Faustino Sarmiento: 41 institutions

    • 7th President of Argentina (1868-1874)
    • Champion of public education and libraries
  2. Bernardino Rivadavia: 21 institutions

    • 1st President of Argentina (1826-1827)
    • Founding father, education reformer
  3. Juan Bautista Alberdi: 14 institutions

    • Political theorist whose work "Bases" informed the Argentine Constitution of 1853
  4. Mariano Moreno: 11 institutions

    • Revolutionary leader, journalist
  5. Florentino Ameghino: 7 institutions

    • Naturalist, paleontologist, anthropologist
  6. Bartolomé Mitre: 7 institutions

    • President (1862-1868), historian, writer

Data Quality

Strengths

  • Official government source (authoritative)
  • Clean, structured data
  • Consistent formatting
  • Zero parsing errors
  • Geographic coordinates available (via profile pages)

Limitations

  • ⚠️ Only 288 institutions (may not be comprehensive)
  • ⚠️ Profile data requires additional scraping (slow)
  • ⚠️ Registration numbers not consistently extracted
  • ⚠️ Some fields may be incomplete or empty

Data Tier Classification

Per GLAM project schema:

  • Data Source: WEB_SCRAPING
  • Data Tier: TIER_2_VERIFIED (official government website)
  • Institution Type: LIBRARY (popular libraries)
  • Country Code: AR (Argentina, ISO 3166-1 alpha-2)

Usage Notes

For GLAM Project Integration

  1. Parse CSV/JSON into LinkML HeritageCustodian instances
  2. Set institution_type: LIBRARY
  3. Map provinces to ISO 3166-2 region codes (AR-B, AR-X, etc.)
  4. Geocode addresses using Nominatim API (if coordinates not available)
  5. Generate GHCIDs: AR-{ProvinceCode}-{CityCode}-L-{Abbrev}
  6. Enrich with Wikidata Q-numbers where available
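
Steps 3 and 5 above can be sketched as follows. The mapping covers only the ten provinces listed under "Top 10 Provinces", using their ISO 3166-2:AR subdivision letters; the remaining provinces would need to be added. This README does not define how `{CityCode}` and `{Abbrev}` are derived, so the hypothetical `build_ghcid` takes them as inputs, and it assumes `{ProvinceCode}` is the letter after `AR-`.

```python
import unicodedata

# ISO 3166-2:AR letters for the provinces named in this README only.
ISO_3166_2_AR = {
    "Buenos Aires": "B",
    "Santa Fe": "S",
    "Entre Ríos": "E",
    "Córdoba": "X",
    "Corrientes": "W",
    "La Pampa": "L",
    "Ciudad Autónoma de Buenos Aires": "C",
    "Jujuy": "Y",
    "Santiago del Estero": "G",
    "San Juan": "J",
}


def _fold(text):
    """Strip accents, case, and extra whitespace for tolerant matching."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.casefold().split())


_LOOKUP = {_fold(name): code for name, code in ISO_3166_2_AR.items()}


def province_code(province):
    """Return the ISO 3166-2:AR letter for a province name."""
    try:
        return _LOOKUP[_fold(province)]
    except KeyError:
        raise ValueError(f"no ISO 3166-2 code mapped for {province!r}") from None


def build_ghcid(province, city_code, abbrev):
    """Format AR-{ProvinceCode}-{CityCode}-L-{Abbrev} (assumed layout)."""
    return f"AR-{province_code(province)}-{city_code.upper()}-L-{abbrev.upper()}"


# "XXX" and "ABC" are placeholders, not real city codes or abbreviations.
print(build_ghcid("Entre Ríos", "xxx", "abc"))
```

Folding accents and case before lookup means spellings such as "cordoba" and "Córdoba" resolve to the same code, which helps when scraped province names are not perfectly consistent.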

Known Issues

  • Full enhanced dataset incomplete (requires long-running background scrape)
  • Server occasionally slow/unreliable (4 timeouts in 288 requests)
  • Some names recur across different cities, so deduplication must key on location as well as name
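
Location-based deduplication could key each record on normalized name, province, and city, so that same-named libraries in different cities stay distinct while repeated rows collapse. This is a sketch under assumed field names (`name`, `province`, `city`); the city values below are placeholders.

```python
import unicodedata


def normalize(text):
    """Fold accents, case, and whitespace; treat missing values as empty."""
    decomposed = unicodedata.normalize("NFKD", text or "")
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.casefold().split())


def dedup_key(record):
    """Key on name plus location, not on name alone."""
    return tuple(normalize(record.get(field)) for field in ("name", "province", "city"))


def deduplicate(records):
    """Keep the first record seen for each (name, province, city) key."""
    seen = {}
    for record in records:
        seen.setdefault(dedup_key(record), record)
    return list(seen.values())


records = [
    {"name": "Domingo Faustino Sarmiento", "province": "Santa Fe", "city": "City A"},
    {"name": "Domingo Faustino Sarmiento", "province": "Santa Fe", "city": "City B"},
    # Same library as the first row, differing only in case and whitespace.
    {"name": "DOMINGO FAUSTINO SARMIENTO ", "province": "Santa Fe", "city": "City A"},
]

unique = deduplicate(records)
print(len(unique))
```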

Next Steps

  1. Complete profile scraping for all 288 institutions (run in background)
  2. Parse into LinkML HeritageCustodian instances
  3. Enrich with Wikidata/VIAF identifiers
  4. Integrate with global GLAM dataset
  5. Export to RDF/JSON-LD for semantic web

Related Files

  • SESSION_SUMMARY_ARGENTINA_CONABIP.md - Detailed session notes
  • ARGENTINA_ISIL_INVESTIGATION.md - Argentina ISIL registry investigation
  • /scripts/scrapers/scrape_conabip_argentina.py - Web scraper source code
  • /tests/scrapers/test_conabip_scraper.py - Test suite (19 tests, 100% pass)

Scraper: OpenCODE AI Agent
License: Dataset public domain (Argentine government data)
Last Updated: 2025-11-17