# Task: Curate Chilean GLAM Institutions from Conversation

## Objective

Manually enrich the 90 existing Chilean GLAM institution records by extracting comprehensive information from the source conversation JSON file.

## Source Files

- **Conversation JSON**: `/Users/kempersc/apps/glam/data/raw/chilean_glam_conversation.json`
- **Current YAML**: `/Users/kempersc/apps/glam/data/instances/chilean_institutions.yaml` (90 minimally-populated records)
- **Target Output**: `/Users/kempersc/apps/glam/data/instances/chilean_institutions_curated.yaml`

## Required Enrichments

For each of the 90 institutions, extract and add:

1. **Rich Descriptions** - Contextual information about the institution from conversation
2. **Complete Location Data** - Cities, addresses, coordinates where mentioned
3. **Identifiers** - ISIL codes, Wikidata IDs, URLs, platform IDs
4. **Digital Platforms** - SURDOC, SINAR, institutional websites, catalogs
5. **Collection Metadata** - Types, subjects, temporal coverage, extent
6. **Change History** - Founding dates, mergers, organizational events
7. **Provenance Tracking** - Enhanced confidence scores based on explicit vs. inferred data

## Schema Compliance

All records MUST conform to LinkML schema v0.2.0:

- `schemas/core.yaml` - HeritageCustodian, Location, Identifier, DigitalPlatform
- `schemas/enums.yaml` - InstitutionTypeEnum, ChangeTypeEnum, DataSource, DataTier
- `schemas/provenance.yaml` - Provenance, ChangeEvent, GHCIDHistoryEntry
- `schemas/collections.yaml` - Collection, Accession, DigitalObject

## Key Conversation Content

The conversation contains information about:

- **695+ library services** nationwide
- **500,000+ digitized archival records**
- **72,000+ catalogued museum objects**
- National platforms: SURDOC, SINAR, Memoria Chilena
- Major institutions across all Chilean regions
- Regional networks and specialized collections

## Expected Deliverables

1. Fully curated YAML file with 90 enriched records
2. Report on data completeness and quality
3. List of top 5 most complete records
4. List of institutions with minimal data (need further research)

## Instructions for NLP Agent

Read the entire conversation JSON file and extract ALL available information for EACH of the 90 institutions currently in the YAML file. Create comprehensive, LinkML-compliant records with:

- Detailed descriptions synthesized from conversation context
- All mentioned locations, identifiers, platforms
- Inferred collection information where appropriate
- Founding dates and organizational history
- Proper confidence scores (0.9-1.0 for explicit mentions, 0.5-0.8 for inferred data)

Use your full comprehension abilities to create the most complete, accurate records possible.
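
The confidence bands and the completeness deliverables described above could be sketched roughly as follows. This is a minimal illustration and not part of the project's tooling: the field names in `ENRICHMENT_FIELDS`, the `name` key, the band midpoints (0.95 and 0.65), and the `minimal_threshold` cutoff are all assumptions chosen for the example; the real slot names come from the LinkML schema files.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical slot names, one per required enrichment; the actual names
# are defined in the LinkML schema and may differ.
ENRICHMENT_FIELDS = [
    "description", "locations", "identifiers", "digital_platforms",
    "collections", "change_history", "provenance",
]


def confidence_score(explicit: bool) -> float:
    """Pick a score inside the stated bands: 0.9-1.0 for explicit
    mentions, 0.5-0.8 for inferred data. Midpoints are an assumption."""
    return 0.95 if explicit else 0.65


def completeness(record: Dict[str, Any]) -> float:
    """Fraction of enrichment fields that are present and non-empty."""
    filled = sum(1 for field in ENRICHMENT_FIELDS if record.get(field))
    return filled / len(ENRICHMENT_FIELDS)


def rank_records(
    records: List[Dict[str, Any]],
    top_n: int = 5,
    minimal_threshold: float = 0.3,  # assumed cutoff for "minimal data"
) -> Tuple[List[str], List[str]]:
    """Return (names of the top_n most complete records,
    names of records at or below the minimal-data threshold)."""
    ranked = sorted(records, key=completeness, reverse=True)
    top = [r["name"] for r in ranked[:top_n]]
    minimal = [r["name"] for r in ranked if completeness(r) <= minimal_threshold]
    return top, minimal
```

Called on the list of curated records, `rank_records` would yield the two lists requested as deliverables 3 and 4, and the per-record `completeness` values could feed the quality report in deliverable 2.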