Task: Curate Chilean GLAM Institutions from Conversation

Objective

Manually enrich the 90 existing Chilean GLAM institution records by extracting comprehensive information from the source conversation JSON file.

Source Files

Conversation JSON: /Users/kempersc/apps/glam/data/raw/chilean_glam_conversation.json
Current YAML: /Users/kempersc/apps/glam/data/instances/chilean_institutions.yaml (90 minimally-populated records)
Target Output: /Users/kempersc/apps/glam/data/instances/chilean_institutions_curated.yaml

Required Enrichments

For each of the 90 institutions, extract and add:

Rich Descriptions - Contextual information about the institution from conversation
Complete Location Data - Cities, addresses, coordinates where mentioned
Identifiers - ISIL codes, Wikidata IDs, URLs, platform IDs
Digital Platforms - SURDOC, SINAR, institutional websites, catalogs
Collection Metadata - Types, subjects, temporal coverage, extent
Change History - Founding dates, mergers, organizational events
Provenance Tracking - Enhanced confidence scores based on explicit vs. inferred data
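
The enrichment step above can be sketched as a non-destructive merge: values extracted from the conversation fill empty fields and extend list fields, but never overwrite data already present in a record. This is only an illustrative sketch. The field names (`description`, `identifiers`, and so on) and the `enrich` helper are assumptions, not names taken from the LinkML schema, and should be replaced with the real slot names from schemas/core.yaml.

```python
from copy import deepcopy
from typing import Any

# Values treated as "not populated" in a record.
EMPTY = (None, "", [], {})


def enrich(record: dict[str, Any], extracted: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `record` enriched with `extracted` values.

    Hypothetical helper: existing non-empty scalars are kept, empty
    fields are filled, and list fields are extended without duplicates.
    """
    merged = deepcopy(record)
    for key, value in extracted.items():
        if value in EMPTY:
            continue  # nothing useful was extracted for this field
        current = merged.get(key)
        if isinstance(current, list) and isinstance(value, list):
            # Append only items that are not already present.
            for item in value:
                if item not in current:
                    current.append(deepcopy(item))
        elif current in EMPTY:
            merged[key] = deepcopy(value)
        # Otherwise keep the existing value untouched.
    return merged
```

Returning a copy rather than mutating the input keeps the original 90 records available for comparison when writing the completeness report.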

Schema Compliance

All records MUST conform to LinkML schema v0.2.0:

schemas/core.yaml - HeritageCustodian, Location, Identifier, DigitalPlatform
schemas/enums.yaml - InstitutionTypeEnum, ChangeTypeEnum, DataSource, DataTier
schemas/provenance.yaml - Provenance, ChangeEvent, GHCIDHistoryEntry
schemas/collections.yaml - Collection, Accession, DigitalObject

Key Conversation Content

The conversation contains information about:

695+ library services nationwide
500,000+ digitized archival records
72,000+ catalogued museum objects
National platforms: SURDOC, SINAR, Memoria Chilena
Major institutions across all Chilean regions
Regional networks and specialized collections

Expected Deliverables

Fully curated YAML file with 90 enriched records
Report on data completeness and quality
List of the 5 most complete records
List of institutions with minimal data (these need further research)
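
One way to produce the completeness report, the top-5 list, and the minimal-data list is to score each record by the share of enrichment fields it populates. The sketch below is a suggestion, not a required method. The field names in `ENRICHMENT_FIELDS`, the `name` key, and the `minimal_threshold` cutoff of 0.3 are all assumptions chosen for illustration.

```python
from typing import Any

# Hypothetical field names, one per required enrichment category.
ENRICHMENT_FIELDS = (
    "description",
    "locations",
    "identifiers",
    "digital_platforms",
    "collections",
    "change_history",
    "provenance",
)


def completeness(record: dict[str, Any]) -> float:
    """Fraction of enrichment fields that are present and non-empty."""
    filled = sum(1 for field in ENRICHMENT_FIELDS if record.get(field))
    return filled / len(ENRICHMENT_FIELDS)


def build_report(
    records: list[dict[str, Any]],
    top_n: int = 5,
    minimal_threshold: float = 0.3,  # assumed cutoff for "minimal data"
) -> dict[str, Any]:
    """Summarize completeness across all records."""
    scored = sorted(records, key=completeness, reverse=True)
    scores = [completeness(r) for r in scored]
    return {
        "total": len(records),
        "mean_completeness": sum(scores) / len(scores) if scores else 0.0,
        "top": [r["name"] for r in scored[:top_n]],
        "needs_research": [
            r["name"] for r in scored if completeness(r) < minimal_threshold
        ],
    }
```

Counting populated fields measures coverage only; it says nothing about whether the values are correct, so the quality part of the report still needs a manual pass.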

Instructions for NLP Agent

Read the entire conversation JSON file and extract ALL available information for EACH of the 90 institutions currently in the YAML file. Create comprehensive, LinkML-compliant records with:

Detailed descriptions synthesized from conversation context
All mentioned locations, identifiers, platforms
Inferred collection information where appropriate
Founding dates and organizational history
Proper confidence scores (0.9-1.0 for explicit mentions, 0.5-0.8 for inferred data)

Aim for the most complete and accurate records that the conversation supports.
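
The confidence bands given above (0.9-1.0 for explicit mentions, 0.5-0.8 for inferred data) can be applied consistently with a small helper. This is a minimal sketch: the function name `confidence_score` and the `strength` parameter, which positions a score within its band, are assumptions introduced for illustration and are not part of the schema.

```python
def confidence_score(explicit: bool, strength: float = 1.0) -> float:
    """Map a claim to the confidence bands stated in this brief.

    explicit: True if the conversation states the fact directly,
              False if it was inferred from context.
    strength: hypothetical 0.0-1.0 weight placing the score inside
              the band (0.0 = bottom of band, 1.0 = top of band).
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    low, high = (0.9, 1.0) if explicit else (0.5, 0.8)
    return round(low + strength * (high - low), 2)
```

For example, `confidence_score(True)` gives the top of the explicit band and `confidence_score(False, 0.0)` the bottom of the inferred band. No score ever falls between 0.8 and 0.9, because the brief assigns that range to neither band.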
