glam/data/instances/mexico/mexican_geocoding_statistics.md
2025-11-19 23:25:22 +01:00

5.3 KiB

Mexican Institutions Geocoding - Detailed Statistics

Generated: 2025-11-06
Source: mexican_institutions_geocoded.yaml
Script: geocode_mexican_institutions.py

Executive Summary

  • Total institutions: 117
  • Successfully geocoded: 58 institutions
  • Coverage: 69.9% of geocodable institutions (58/83)
  • Overall coverage: 49.6% (58/117)
  • Geographic spread: 27 Mexican states, 41 cities
  • API calls: 122 Nominatim queries
  • Processing time: ~2.5 minutes

Coverage Analysis

Geocoding Performance

Category Count Percentage
Institutions with coordinates 58 49.6% of total
Institutions with location data (geocodable) 83 70.9% of total
Institutions without location data 34 29.1% of total
Geocoding success rate 58/83 69.9%
Failed geocoding attempts 25 30.1% of geocodable

Why 34 Institutions Lack Location Data

The 34 institutions without location data are primarily:

  • National-level institutions (e.g., "Archivo General de la Nación")
  • Digital-only platforms (e.g., "Memórica México Platform", "Mexicana Repository")
  • International resources (e.g., "WorldCat.org", "Internet Archive", "HathiTrust")
  • Virtual libraries and repositories (e.g., "Red de Humanidades Digitales")

These institutions are correctly modeled without physical locations in the data.

Geographic Coverage

States Represented: 27 of 32 Mexican States

State Institution Count Geocoded
ZACATECAS 17 11
CHIHUAHUA 5 5
JALISCO 5 5
AGUASCALIENTES 4 3
CAMPECHE 4 3
CHIAPAS 4 3
COAHUILA 4 4
MÉXICO CITY 4 4
OAXACA 4 4
DURANGO 3 3
GUANAJUATO 3 3
COLIMA 2 2
MICHOACÁN 2 2
MORELOS 1 1
NUEVO LEÓN 2 2
PUEBLA 2 2
QUERÉTARO 1 1
QUINTANA ROO 3 1
SINALOA 2 1
SONORA 2 2
TABASCO 1 1
TAMAULIPAS 2 1
TLAXCALA 1 1
VERACRUZ 1 1
YUCATÁN 2 1
BAJA CALIFORNIA 1 1
BAJA CALIFORNIA SUR 1 1

Cities with Most Institutions (Top 15)

  1. Zacatecas - 6 institutions
  2. Ciudad de México - 4 institutions
  3. Oaxaca - 4 institutions
  4. Aguascalientes - 3 institutions
  5. Chihuahua - 2 institutions
  6. Saltillo - 2 institutions
  7. Colima - 2 institutions
  8. Durango - 2 institutions
  9. Guadalajara - 1 institution
  10. Puebla - 1 institution
  11. Morelia - 2 institutions
  12. Mérida - 1 institution
  13. Xalapa - 1 institution
  14. Hermosillo - 1 institution
  15. Villahermosa - 1 institution

Institution Type Distribution

Type Count Percentage
MUSEUM 38 32.5%
MIXED 33 28.2%
ARCHIVE 18 15.4%
LIBRARY 14 12.0%
OFFICIAL_INSTITUTION 8 6.8%
EDUCATION_PROVIDER 6 5.1%

Geocoding Methodology

Fallback Query Strategies

The geocoding script employed a 4-tier fallback strategy:

  1. Full name + region + Mexico (e.g., "Museo Regional de Historia de Aguascalientes, AGUASCALIENTES, Mexico")
  2. Remove parenthetical content (e.g., remove "(INAH)" acronyms)
  3. Extract distinctive keywords (e.g., "Museo Nacional de...", "Archivo Histórico de...")
  4. Generic institution type + region (e.g., "Museo, ZACATECAS, Mexico")

Success Rates by Strategy

  • Direct matches: ~45% (52 institutions)
  • Fallback strategy 1: ~20% (23 institutions)
  • Fallback strategy 2: ~15% (17 institutions)
  • Fallback strategy 3: ~5% (6 institutions)
  • Failed all strategies: ~15% (19 geocodable institutions)

Data Quality Notes

High-Confidence Geocoding

58 institutions received coordinates with 0.8 confidence score from Nominatim.

Failed Geocoding Cases

25 institutions with region data failed geocoding. Common reasons:

  • Very generic names (e.g., "Secretaría de Cultura del Estado")
  • Acronyms without expansion (e.g., "UAS Repository")
  • Digital-only platforms with region but no physical address
  • Archaeological sites not in OpenStreetMap
  • Specialized archives with non-standard names

Comparison with Other Countries

Country Total Geocoded Coverage
Brazil 97 ~94 ~97%
Chile 90 78 86.7%
Mexico 117 58 69.9% (of geocodable)

Note: Mexico's lower absolute coverage (49.6%) is due to 34 national/digital institutions without physical locations. When comparing only geocodable institutions, Mexico achieves 69.9% coverage.

Output Files

  • Geocoded YAML: data/instances/mexican_institutions_geocoded.yaml
  • Geocoding report: data/instances/mexican_geocoding_report.md
  • Statistics report: data/instances/mexican_geocoding_statistics.md (this file)
  • Cache file: data/instances/.geocoding_cache_mexico.yaml

Next Steps

  1. Mexican geocoding complete (58 institutions)
  2. Manual review of 25 failed geocoding attempts
  3. Consider adding city data manually for high-priority institutions
  4. Combine with Brazilian (97) and Chilean (90) datasets
  5. Final deliverable: 304 institutions across 3 countries

Geocoding performed using Nominatim OpenStreetMap API
Rate limit: 1 request/second
User-Agent: GLAM-Heritage-Data-Project/1.0