glam/data/isil/bosnia/QUICK_START_AUTOMATION.md
2025-11-19 23:25:22 +01:00

Quick Start: Bosnia ISIL Automation

TL;DR

An automated script is ready to check all 80 COBISS.BH libraries for ISIL codes.

Time: 10-20 minutes (vs. 6.5 hours manual)
Output: bosnia_isil_codes_found.json with results


Run Now

cd /Users/kempersc/apps/glam
python scripts/bosnia_isil_scraper.py

Monitor progress:

tail -f data/isil/bosnia/scraper_log.txt

What It Does

  1. Loads 80 libraries from bosnia_cobiss_libraries_raw.json
  2. For each library:
    • Checks COBISS library pages
    • Checks institutional website
    • Searches for ISIL code patterns (BA-, BO-, ISO 15511)
  3. Outputs structured JSON with findings
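The matching and output steps above can be sketched roughly as follows. This is a minimal illustration, not the code in bosnia_isil_scraper.py: the regular expression, the function names `find_isil_codes` and `build_record`, and the `name` field are assumptions made for the example. Only the `isil_found` and `isil_codes` fields come from the jq queries used later in this guide. Page fetching is left out, so the sketch works on text that has already been downloaded.

```python
import json
import re

# Assumption: codes look like "BA-" or "BO-" plus 1-11 identifier characters
# (letters, digits, ":", "/", "-"). The real script's pattern may differ.
ISIL_PATTERN = re.compile(r"\b(?:BA|BO)-[A-Za-z0-9:/-]{1,11}")


def find_isil_codes(text):
    """Return the unique ISIL-like codes found in text, sorted."""
    return sorted(set(ISIL_PATTERN.findall(text)))


def build_record(name, page_texts):
    """Combine matches from all of a library's pages into one result record."""
    codes = sorted({code for text in page_texts for code in find_isil_codes(text)})
    # "name" is a hypothetical field; "isil_found" and "isil_codes" mirror
    # the fields queried with jq in the analysis step.
    return {"name": name, "isil_found": bool(codes), "isil_codes": codes}


if __name__ == "__main__":
    # Hypothetical page text standing in for fetched COBISS and website pages.
    pages = ["Library ISIL: BA-12345.", "Contact us for details."]
    print(json.dumps([build_record("Example Library", pages)], indent=2))
```

A loose pattern like this can pick up false positives (any "BA-" token on a page), so matches are probably best treated as candidates to verify rather than confirmed codes.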

Possible Outcomes

Success: ISIL Codes Found

  • Extract 50-80 ISIL codes
  • Create LinkML records
  • Update investigation report

No Codes Found

  • Confirms ISIL codes not publicly accessible
  • Provides evidence for NUBBiH email request
  • Validates need for direct authority contact

⚠️ Partial Results

  • Some libraries publish ISIL, others don't
  • Reveals COBISS data inconsistencies
  • Prioritizes follow-up contacts

After Completion

Analyze results:

# Count libraries where an ISIL code was found
jq '[.[] | select(.isil_found == true)] | length' data/isil/bosnia/bosnia_isil_codes_found.json

# List all codes
jq '[.[].isil_codes[]] | unique' data/isil/bosnia/bosnia_isil_codes_found.json
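If jq is not installed, the same two summaries can be produced in Python. This sketch assumes, as the jq queries above do, that the results file is a JSON array of objects with an `isil_found` boolean and an `isil_codes` list. The helper names `load_results` and `summarize` are made up for this example.

```python
import json


def load_results(path):
    """Read the scraper's JSON results file into a list of records."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def summarize(records):
    """Return (number of libraries with isil_found == true, sorted unique codes)."""
    found_count = sum(1 for r in records if r.get("isil_found") is True)
    # Tolerate records where isil_codes is missing or null.
    codes = sorted({c for r in records for c in (r.get("isil_codes") or [])})
    return found_count, codes
```

Usage, from the repository root: `summarize(load_results("data/isil/bosnia/bosnia_isil_codes_found.json"))` returns the count and the de-duplicated code list in one call.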

Files Created

  • scripts/bosnia_isil_scraper.py - Main script
  • data/isil/bosnia/bosnia_isil_codes_found.json - Results
  • data/isil/bosnia/scraper_log.txt - Progress log
  • AUTOMATION_SCRIPT_CREATED.md - Full documentation
  • QUICK_START_AUTOMATION.md - This file

Ready to execute!