# Brazil Batch 9 Enrichment Candidates

**Generated**: 2025-11-11

**Purpose**: Identify high-priority Brazilian institutions for Wikidata enrichment

**Target**: Add 10-15 Wikidata identifiers to increase Brazil coverage from 14.6% → 19.3%+

## Summary Statistics

- **Total Brazilian institutions**: 212
- **With Wikidata**: 31 (14.6%)
- **Without Wikidata**: 181 (85.4%)
- **Candidates analyzed**: 181
- **Top candidates selected**: 15

## Institution Type Distribution (Without Wikidata)

| Type | Count | % of 181 |
|------|-------|----------|
| MIXED | 61 | 33.7% |
| EDUCATION_PROVIDER | 43 | 23.8% |
| MUSEUM | 42 | 23.2% |
| OFFICIAL_INSTITUTION | 16 | 8.8% |
| ARCHIVE | 9 | 5.0% |
| RESEARCH_CENTER | 4 | 2.2% |
| GALLERY | 3 | 1.7% |
| LIBRARY | 3 | 1.7% |

## Scoring Methodology

Institutions are scored based on:

1. **Institution Type** (0-10 points)
   - MUSEUM, LIBRARY, ARCHIVE: 10 points (core heritage institutions)
   - GALLERY: 8 points
   - RESEARCH_CENTER: 7 points
   - OFFICIAL_INSTITUTION: 6 points
   - EDUCATION_PROVIDER: 4 points
   - MIXED: 3 points

2. **Name Specificity** (0-5 points)
   - Explicit institutional names: +3 points
   - National/state/federal/municipal institutions: +2 points
   - Generic educational names: 0 bonus

3. **Digital Platforms** (0-3 points)
   - Each platform: +1 point (max 3)

4. **Website Available** (0-2 points)
   - Has website identifier: +2 points

5. **Geographic Location** (0-3 points)
   - Major city (São Paulo, Rio, etc.): +3 points
   - Has city information: +1 point

6. **Description Richness** (0-2 points)
   - Detailed (>100 chars): +2 points
   - Moderate (>50 chars): +1 point

**Maximum possible score**: 25 points

## Top 15 Candidates for Batch 9

### 1. Museu Paulista

- **Score**: 18.0/25
- **Type**: MUSEUM
- **Location**: São Paulo, SP
- **Website**: Not available
- **Platforms**: 0
- **Description**: University of São Paulo museum, subject of research on collection policy (1990-2015) and acquisition strategies. Publishes Anais do Museu Paulista jou...

**Wikidata Search Strategy**:
- Search term: `Museu Paulista São Paulo Brazil`
- Filter: `instance of` → museum
- Verify: Location matches São Paulo, Brazil

---

### 2. Museu Casa de Rui Barbosa

- **Score**: 18.0/25
- **Type**: MUSEUM
- **Location**: Rio de Janeiro
- **Website**: Not available
- **Platforms**: 0
- **Description**: Developed systematic cataloging methodologies for museum environments, conducting research finding 47% of surveyed institutions perform room documenta...

**Wikidata Search Strategy**:
- Search term: `Museu Casa de Rui Barbosa Rio de Janeiro Brazil`
- Filter: `instance of` → museum
- Verify: Location matches Rio de Janeiro, Brazil

---

### 3. UnB BCE

- **Score**: 18.0/25
- **Type**: LIBRARY
- **Location**: Brasília, DISTRITO FEDERAL
- **Website**: https://bce.unb.br/
- **Platforms**: 0
- **Description**: 24/7 operations, @bceunb (38K)

**Wikidata Search Strategy**:
- Search term: `UnB BCE Brasília Brazil`
- Filter: `instance of` → library
- Verify: Location matches Brasília, Brazil

---

### 4. Museu de Arte de São Paulo (MASP)

- **Score**: 17.0/25
- **Type**: MUSEUM
- **Location**: São Paulo, SP
- **Website**: Not available
- **Platforms**: 0
- **Description**: Major art museum collaborating with Google Arts & Culture to reach global audiences.

**Wikidata Search Strategy**:
- Search term: `Museu de Arte de São Paulo (MASP) São Paulo Brazil`
- Filter: `instance of` → museum
- Verify: Location matches São Paulo, Brazil

---

### 5. MAX

- **Score**: 17.0/25
- **Type**: MUSEUM
- **Location**: Rio de Janeiro, SERGIPE
- **Website**: Not available
- **Platforms**: 0
- **Description**: Archaeological museum, UFS-administered. Contact: Phone: 21 3395 8905

**Wikidata Search Strategy**:
- Search term: `MAX Rio de Janeiro Brazil`
- Filter: `instance of` → museum
- Verify: Location matches Rio de Janeiro, Brazil

---

### 6. UFAL Natural History Museum

- **Score**: 16.0/25
- **Type**: MUSEUM
- **Location**: Maceió, ALAGOAS
- **Website**: Not available
- **Platforms**: 0
- **Description**: Not available

**Wikidata Search Strategy**:
- Search term: `UFAL Natural History Museum Maceió Brazil`
- Filter: `instance of` → museum
- Verify: Location matches Maceió, Brazil

---

### 7. Museu Sacaca

- **Score**: 16.0/25
- **Type**: MUSEUM
- **Location**: Macapá, AMAPÁ
- **Website**: http://www.museusacaca.ap.gov.br
- **Platforms**: 0
- **Description**: 21,000m², indigenous culture focus

**Wikidata Search Strategy**:
- Search term: `Museu Sacaca Macapá Brazil`
- Filter: `instance of` → museum
- Verify: Location matches Macapá, Brazil

---

### 8. Arquivo Público DF

- **Score**: 16.0/25
- **Type**: ARCHIVE
- **Location**: Brasília, DISTRITO FEDERAL
- **Website**: Not available
- **Platforms**: 0
- **Description**: @arpdf (16K), 1.445M photos

**Wikidata Search Strategy**:
- Search term: `Arquivo Público DF Brasília Brazil`
- Filter: `instance of` → archive
- Verify: Location matches Brasília, Brazil

---

### 9. UFPA

- **Score**: 16.0/25
- **Type**: MUSEUM
- **Location**: Belém, PARÁ
- **Website**: Not available
- **Platforms**: 0
- **Description**: 50,000+ students, MUFPA museum

**Wikidata Search Strategy**:
- Search term: `UFPA Belém Brazil`
- Filter: `instance of` → museum
- Verify: Location matches Belém, Brazil

---

### 10. Arquivo Blumenau

- **Score**: 16.0/25
- **Type**: ARCHIVE
- **Location**: Blumenau, SANTA CATARINA
- **Website**: http://arquivodeblumenau.com.br/
- **Platforms**: 0
- **Description**: 500,000+ photos, German/Italian

**Wikidata Search Strategy**:
- Search term: `Arquivo Blumenau Blumenau Brazil`
- Filter: `instance of` → archive
- Verify: Location matches Blumenau, Brazil

---

### 11. Museu Palacinho

- **Score**: 16.0/25
- **Type**: MUSEUM
- **Location**: Palmas, TOCANTINS
- **Website**: https://museupalacinho.com/
- **Platforms**: 0
- **Description**: Contact: Phone: +55 63 99232-8613

**Wikidata Search Strategy**:
- Search term: `Museu Palacinho Palmas Brazil`
- Filter: `instance of` → museum
- Verify: Location matches Palmas, Brazil

---

### 12. Museu Nacional

- **Score**: 15.0/25
- **Type**: MUSEUM
- **Location**: Unknown, RJ
- **Website**: Not available
- **Platforms**: 0
- **Description**: @museunacionalufrj (126K), reopening 2026

**Wikidata Search Strategy**:
- Search term: `Museu Nacional Unknown Brazil`
- Filter: `instance of` → museum
- Verify: Location matches Unknown, Brazil

---

### 13. Museu Palacinho

- **Score**: 15.0/25
- **Type**: MUSEUM
- **Location**: Unknown, TO
- **Website**: https://museupalacinho.com/
- **Platforms**: 0
- **Description**: https://museupalacinho.com/

**Wikidata Search Strategy**:
- Search term: `Museu Palacinho Unknown Brazil`
- Filter: `instance of` → museum
- Verify: Location matches Unknown, Brazil

---

### 14. Biblioteca Digital Brasileira de Teses e Dissertações (BDTD)

- **Score**: 15.0/25
- **Type**: LIBRARY
- **Location**: Unknown
- **Website**: Not available
- **Platforms**: 0
- **Description**: Aggregates graduate research from universities nationwide, providing access to Brazilian theses and dissertations.

**Wikidata Search Strategy**:
- Search term: `Biblioteca Digital Brasileira de Teses e Dissertações (BDTD) Unknown Brazil`
- Filter: `instance of` → library
- Verify: Location matches Unknown, Brazil

---

### 15. Museu da Borracha

- **Score**: 14.0/25
- **Type**: MUSEUM
- **Location**: Unknown, AC
- **Website**: Not available
- **Platforms**: 0
- **Description**: 5,300+ pieces, 31,756+ newspapers, 4,700-volume library

**Wikidata Search Strategy**:
- Search term: `Museu da Borracha Unknown Brazil`
- Filter: `instance of` → museum
- Verify: Location matches Unknown, Brazil

---

## Additional High-Priority Candidates (16-30)

These institutions scored well but didn't make the top 15. Consider for Batch 10.

| Rank | Score | Name | Type | City |
|------|-------|------|------|------|
| 16 | 14.0 | Museu dos Povos Acreanos | MUSEUM | Rio Branco |
| 17 | 14.0 | Museu Histórico | MUSEUM | Alcântara |
| 18 | 14.0 | MARCO | MUSEUM | Campo Grande |
| 19 | 14.0 | Dom Bosco Museum | MUSEUM | Campo Grande |
| 20 | 14.0 | Ouro Preto System | MUSEUM | Ouro Preto |
| 21 | 14.0 | Natural History Museum | MUSEUM | Campina Grande |
| 22 | 14.0 | Memorial do RS | MUSEUM | Pelotas |
| 23 | 14.0 | Museu Memória | MUSEUM | Porto Velho |
| 24 | 14.0 | Museu do Homem Sergipano | MUSEUM | Aracaju |
| 25 | 13.0 | Museu dos Povos Acreanos | MUSEUM | Unknown |
| 26 | 13.0 | UFAL Natural History Museum | MUSEUM | Unknown |
| 27 | 13.0 | Museu Sacaca | MUSEUM | Unknown |
| 28 | 13.0 | Museu de Arqueologia e Etnologia | MUSEUM | Unknown |
| 29 | 13.0 | Arquivo Público (APEB) | ARCHIVE | Unknown |
| 30 | 13.0 | Arquivo Público DF | ARCHIVE | Unknown |

## Recommendations

### Batch 9 Strategy (Target: 10-15 enrichments)

1. **Manual Wikidata Search** (Most Reliable)
   - Search each top candidate on Wikidata
   - Verify location and institution type match
   - Record Q-numbers in enrichment script

2. **Automated Fuzzy Matching** (Faster, Lower Precision)
   - Use existing `scripts/enrich_brazil_batch9.py` template
   - Adapt fuzzy matching from previous batches
   - Manually verify all matches before committing

3. **Hybrid Approach** (Recommended)
   - Manual search for top 10 candidates (highest confidence)
   - Fuzzy matching for candidates 11-15 (with verification)
   - This balances speed and accuracy

### Expected Outcome

- **Current coverage**: 31/212 (14.6%)
- **After Batch 9** (+10 institutions): 41/212 (19.3%)
- **After Batch 9** (+15 institutions): 46/212 (21.7%)

### Next Batches

- **Batch 10**: Focus on remaining MUSEUM institutions (42 without Wikidata)
- **Batch 11**: Focus on ARCHIVE + LIBRARY (12 total without Wikidata)
- **Batch 12**: Cherry-pick high-scoring EDUCATION_PROVIDER institutions

**Projected 30% coverage**: Batches 9-11 combined (~35-40 total enrichments)

## Files Generated

- **Candidate records**: `data/instances/brazil/batch9_candidates_analysis.yaml`
- **This report**: `data/instances/brazil/BATCH9_CANDIDATES_REPORT.md`

## Manual Enrichment Template

For each candidate, follow this workflow:

```python
# In scripts/enrich_brazil_batch9.py

# Maps institution name -> manually verified Wikidata match.
BATCH_9_ENRICHMENTS = {
    "Museu Name Example": {
        "wikidata_id": "Q12345678",
        "match_score": 1.0,  # Manual verification
        "match_method": "Manual Wikidata search",
        "verification_notes": "Verified: location, type, and name match",
    },
    # ... add 10-15 entries
}
```

---

**Status**: Ready for manual Wikidata search

**Next Action**: Create `scripts/enrich_brazil_batch9.py` with top candidates
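
## Scoring Sketch

The scoring methodology described in this report can be sketched in a few lines of Python. This is a minimal illustration, not the script that produced the scores above. The `Candidate` class, the `score_candidate` function, and the `explicit_name` / `government_level` flags are hypothetical names introduced here, and the report does not say how name specificity was actually detected. The report names only "São Paulo, Rio, etc." as major cities, so `MAJOR_CITIES` holds just those two and the rest of the list is left open.

```python
from dataclasses import dataclass

# Points per institution type, taken from the Scoring Methodology section.
TYPE_POINTS = {
    "MUSEUM": 10, "LIBRARY": 10, "ARCHIVE": 10,
    "GALLERY": 8, "RESEARCH_CENTER": 7, "OFFICIAL_INSTITUTION": 6,
    "EDUCATION_PROVIDER": 4, "MIXED": 3,
}

# Assumption: the report lists only "São Paulo, Rio, etc."; extend as needed.
MAJOR_CITIES = frozenset({"São Paulo", "Rio de Janeiro"})


@dataclass
class Candidate:
    """Hypothetical record shape; field names are illustrative."""
    name: str
    institution_type: str
    city: str = ""
    website: str = ""
    platform_count: int = 0
    description: str = ""
    explicit_name: bool = False      # assumed flag: explicit institutional name
    government_level: bool = False   # assumed flag: national/state/federal/municipal


def score_candidate(c: Candidate, major_cities=MAJOR_CITIES) -> int:
    """Return a 0-25 priority score following the report's rubric."""
    score = TYPE_POINTS.get(c.institution_type, 0)  # unknown types score 0

    # Name specificity (0-5)
    if c.explicit_name:
        score += 3
    if c.government_level:
        score += 2

    # Digital platforms (0-3): one point each, capped at 3
    score += min(max(c.platform_count, 0), 3)

    # Website available (0-2)
    if c.website:
        score += 2

    # Geographic location (0-3)
    if c.city in major_cities:
        score += 3
    elif c.city:
        score += 1

    # Description richness (0-2)
    length = len(c.description)
    if length > 100:
        score += 2
    elif length > 50:
        score += 1

    return score


masp = Candidate(
    name="Museu de Arte de São Paulo (MASP)",
    institution_type="MUSEUM",
    city="São Paulo",
    description=("Major art museum collaborating with Google Arts & Culture "
                 "to reach global audiences."),
    explicit_name=True,
)
print(score_candidate(masp))  # 17
```

With the flags set as assumed, the MASP record yields 17 (10 for type, 3 for name, 3 for city, 1 for an 84-character description), which agrees with the 17.0/25 listed for candidate 4.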