glam/data/wikidata/GLAMORCUBEPSXHFN/00-QUERY_EXECUTION_GUIDE.md
2025-11-19 23:25:22 +01:00


GLAMORCUBEPSXHF SPARQL Query Execution Guide

Project: Multilingual vocabulary/thesaurus extraction for 15 heritage institution types
Date: 2025-11-12
Purpose: Execute all SPARQL queries against Wikidata to retrieve terminology


Query Execution Order

Execute these queries in the order listed below. Save results to the specified output files.

Completed Classes (8/15 = 53%)

G-Class (GALLERY) - COMPLETED

  • Query: data/wikidata/GLAMORCUBEPSXH/G/sparql/gallery_hyponyms.sparql
  • Output: data/wikidata/GLAMORCUBEPSXH/G/sparql/hyponyms_raw.json
  • Status: Results saved, analysis complete

L-Class (LIBRARY) - COMPLETED

  • Query: data/wikidata/GLAMORCUBEPSXH/L/sparql/library_hyponyms.sparql
  • Output: data/wikidata/GLAMORCUBEPSXH/L/sparql/hyponyms_raw.json
  • Status: Results saved, analysis complete

A-Class (ARCHIVE) - COMPLETED

  • Query: data/wikidata/GLAMORCUBEPSXH/A/sparql/archive_hyponyms.sparql
  • Output: data/wikidata/GLAMORCUBEPSXH/A/sparql/hyponyms_raw.json
  • Status: Results saved, analysis complete

M-Class (MUSEUM) - COMPLETED

  • Query: data/wikidata/GLAMORCUBEPSXH/M/sparql/museum_hyponyms.sparql
  • Output: data/wikidata/GLAMORCUBEPSXH/M/sparql/hyponyms_raw.json
  • Status: Results saved, analysis complete

O-Class (OFFICIAL_INSTITUTION) - COMPLETED

  • Query: data/wikidata/GLAMORCUBEPSXH/O/sparql/query.sparql
  • Output: data/wikidata/GLAMORCUBEPSXH/O/sparql/hyponyms_merged.json
  • Status: Results saved, analysis complete

H-Class (HOLY_SITES) - COMPLETED

  • Query: data/wikidata/GLAMORCUBEPSXH/H/sparql/holy_sites_hyponyms.sparql
  • Output: data/wikidata/GLAMORCUBEPSXH/H/sparql/hyponyms_raw.json
  • Status: Results saved, analysis complete

F-Class (FEATURES) - COMPLETED

  • Query: Multiple feature-specific queries in data/wikidata/GLAMORCUBEPSXH/F/sparql/
  • Output: data/wikidata/GLAMORCUBEPSXH/F/sparql/hyponyms_raw.json
  • Status: Results saved, analysis complete

R-Class (RESEARCH_CENTER) - COMPLETED

  • Query: data/wikidata/GLAMORCUBEPSXH/R/sparql/research_center_hyponyms.sparql
  • Output: data/wikidata/GLAMORCUBEPSXH/R/sparql/hyponyms_raw.json
  • Status: Results saved, analysis complete

Remaining Classes (7/15 = 47%)

Priority 1: C-Class (CORPORATION)

Status: Query ready, awaiting execution

Query File: data/wikidata/GLAMORCUBEPSXH/C/sparql/corporation_hyponyms.sparql

Target Classes:

  • Q18631232 (corporate archive)
  • Q33506 (museum) + Q4830453 (business)
  • Q1616075 (company museum)

Output File: data/wikidata/GLAMORCUBEPSXH/C/sparql/hyponyms_raw.json

Expected Terms: Corporate museum, company archive, brand heritage center, business archive, Firmenmuseum (de), 企業博物館 (ja)

Execution Command (Wikidata Query Service):

# Copy query from corporation_hyponyms.sparql
# Execute at: https://query.wikidata.org/
# Export results as JSON
# Save to: data/wikidata/GLAMORCUBEPSXH/C/sparql/hyponyms_raw.json
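The manual steps above can also be scripted. The following is a minimal sketch, not the project's own tooling: it assumes the public SPARQL endpoint at https://query.wikidata.org/sparql, and the function names and the User-Agent string are placeholders introduced here for illustration.

```python
import json
import urllib.parse
import urllib.request
from pathlib import Path

# Public SPARQL endpoint behind https://query.wikidata.org/
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"
# Placeholder value (assumption): replace with a User-Agent identifying your project.
USER_AGENT = "glam-vocabulary-extraction/0.1 (placeholder contact)"


def build_request(query: str) -> urllib.request.Request:
    """Build a GET request that asks for results in SPARQL JSON format."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url,
        headers={
            "Accept": "application/sparql-results+json",
            "User-Agent": USER_AGENT,
        },
    )


def run_query_file(query_path: str, output_path: str, timeout: float = 90.0) -> int:
    """Run the query stored in query_path, save the raw JSON, return the row count."""
    query = Path(query_path).read_text(encoding="utf-8")
    with urllib.request.urlopen(build_request(query), timeout=timeout) as response:
        data = json.load(response)
    out = Path(output_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    # ensure_ascii=False keeps labels such as 企業博物館 readable in the file.
    out.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8")
    return len(data["results"]["bindings"])
```

For the C-Class, this would be called as `run_query_file("data/wikidata/GLAMORCUBEPSXH/C/sparql/corporation_hyponyms.sparql", "data/wikidata/GLAMORCUBEPSXH/C/sparql/hyponyms_raw.json")`.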

Priority 2: U-Class (UNKNOWN)

Status: Not applicable - UNKNOWN is assigned during data extraction

Query File: N/A

Target Classes: No Wikidata query needed for U-class

Explanation: The U-class represents institutions where the type cannot be determined during data extraction. This is not a Wikidata class but rather a fallback classification used when:

  • Source data lacks type information
  • Institution description is ambiguous
  • Multiple conflicting type indicators exist

U-class institutions should be manually reviewed and reclassified to appropriate types (G, L, A, M, etc.) when more information becomes available.

Note: Universities are classified under E (EDUCATION_PROVIDER), not U-class.
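The fallback rule described above can be sketched as follows. This is an illustrative assumption about how the extraction step might decide, not the project's actual code: `assign_type` is a hypothetical helper, and missing or ambiguous information is modeled as `None` or an unrecognized value.

```python
from typing import Iterable, Optional

# Class letters used in this guide, excluding the fallback "U".
KNOWN_TYPES = set("GLAMORCBEPSXHF")


def assign_type(indicators: Iterable[Optional[str]]) -> str:
    """Return the single agreed type code, or "U" when none can be determined."""
    found = {code for code in indicators if code in KNOWN_TYPES}
    # No usable indicator (missing or ambiguous source data) -> UNKNOWN.
    # More than one distinct indicator (conflict) -> UNKNOWN.
    if len(found) != 1:
        return "U"
    return found.pop()
```

Records that come back as `"U"` are the ones to queue for manual review.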


Priority 3: B-Class (BOTANICAL_ZOO)

Status: Query ready, awaiting execution

Query File: data/wikidata/GLAMORCUBEPSXH/B/sparql/botanical_zoo_hyponyms.sparql

Target Classes:

  • Q167346 (botanical garden)
  • Q43501 (zoo)
  • Q27686 (aquarium)
  • Q1855774 (natural history museum)

Output File: data/wikidata/GLAMORCUBEPSXH/B/sparql/hyponyms_raw.json

Expected Terms: Botanical garden, zoo, aquarium, arboretum, jardin botanique (fr), zoológico (es)

Execution Command:

# Copy query from botanical_zoo_hyponyms.sparql
# Execute at: https://query.wikidata.org/
# Export results as JSON
# Save to: data/wikidata/GLAMORCUBEPSXH/B/sparql/hyponyms_raw.json

Priority 4: E-Class (EDUCATION_PROVIDER)

Status: Query ready, awaiting execution

Query File: data/wikidata/GLAMORCUBEPSXH/E/sparql/education_provider_hyponyms.sparql

Target Classes:

  • Q3914 (school)
  • Q15936437 (training center)
  • Q1390872 (vocational school)
  • Q2385804 (educational institution)

Output File: data/wikidata/GLAMORCUBEPSXH/E/sparql/hyponyms_raw.json

Expected Terms: School, training center, vocational school, Schule (de), escuela (es), 学校 (ja)

Execution Command:

# Copy query from education_provider_hyponyms.sparql
# Execute at: https://query.wikidata.org/
# Export results as JSON
# Save to: data/wikidata/GLAMORCUBEPSXH/E/sparql/hyponyms_raw.json

Priority 5: S-Class (COLLECTING_SOCIETY)

Status: Query ready, awaiting execution

Query File: data/wikidata/GLAMORCUBEPSXH/S/sparql/collecting_society_hyponyms.sparql

Target Classes:

  • Q1391145 (historical society)
  • Q2900544 (heritage society)
  • Q955824 (learned society)
  • Q5533467 (genealogical society)
  • Q564323 (antiquarian society)

Output File: data/wikidata/GLAMORCUBEPSXH/S/sparql/hyponyms_raw.json

Expected Terms: Historical society, heritage society, heemkundige kring (nl), genealogical society

Execution Command:

# Copy query from collecting_society_hyponyms.sparql
# Execute at: https://query.wikidata.org/
# Export results as JSON
# Save to: data/wikidata/GLAMORCUBEPSXH/S/sparql/hyponyms_raw.json

Priority 6: P-Class (PERSONAL_COLLECTION)

Status: Query ready, awaiting execution

Query File: data/wikidata/GLAMORCUBEPSXH/P/sparql/personal_collection_hyponyms.sparql

Target Classes:

  • Q768717 (private collection)
  • Private museums, archives, libraries

Output File: data/wikidata/GLAMORCUBEPSXH/P/sparql/hyponyms_raw.json

Expected Terms: Private collection, personal collection, private museum, colección privada (es)

Execution Command:

# Copy query from personal_collection_hyponyms.sparql
# Execute at: https://query.wikidata.org/
# Export results as JSON
# Save to: data/wikidata/GLAMORCUBEPSXH/P/sparql/hyponyms_raw.json

Priority 7: X-Class (MIXED)

Status: Query ready, awaiting execution

Query File: data/wikidata/GLAMORCUBEPSXH/X/sparql/mixed_hyponyms.sparql

Target Classes:

  • Q207694 (cultural center)
  • Q22808320 (heritage center)
  • Q3152824 (cultural institution)
  • Q1030034 (memory institution)

Output File: data/wikidata/GLAMORCUBEPSXH/X/sparql/hyponyms_raw.json

Expected Terms: Cultural center, heritage center, memory institution, multi-purpose institution

Execution Command:

# Copy query from mixed_hyponyms.sparql
# Execute at: https://query.wikidata.org/
# Export results as JSON
# Save to: data/wikidata/GLAMORCUBEPSXH/X/sparql/hyponyms_raw.json

Batch Execution Workflow

Step 1: Execute All Queries

For each remaining class that has a query (C, B, E, S, P, X; the U-Class needs no query):

  1. Open Wikidata Query Service: https://query.wikidata.org/
  2. Copy SPARQL query from the respective .sparql file
  3. Paste into query editor
  4. Click "Execute" button
  5. Wait for results (may take 30-60 seconds)
  6. Click "Download" → "JSON"
  7. Save to the specified output file (hyponyms_raw.json)
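To keep track of which file goes where, the query and output paths listed in this guide can be generated rather than typed by hand. A minimal sketch, assuming the directory layout shown in the sections above; `execution_plan` is a hypothetical helper name, and the U-Class is left out because it has no query file.

```python
from pathlib import Path

BASE = Path("data/wikidata/GLAMORCUBEPSXH")

# Query file names as listed in this guide, in priority order.
# The U-Class is omitted: it has no Wikidata query.
REMAINING_QUERIES = {
    "C": "corporation_hyponyms.sparql",
    "B": "botanical_zoo_hyponyms.sparql",
    "E": "education_provider_hyponyms.sparql",
    "S": "collecting_society_hyponyms.sparql",
    "P": "personal_collection_hyponyms.sparql",
    "X": "mixed_hyponyms.sparql",
}


def execution_plan(base: Path = BASE):
    """Return (class letter, query path, output path) tuples in priority order."""
    plan = []
    for letter, name in REMAINING_QUERIES.items():
        sparql_dir = base / letter / "sparql"
        plan.append((letter, sparql_dir / name, sparql_dir / "hyponyms_raw.json"))
    return plan
```

Iterating over `execution_plan()` and checking `output_path.exists()` gives a quick view of which classes are still pending.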

Step 2: Verify Results

After downloading each result file, verify:

  • File size > 0 bytes
  • Valid JSON format
  • Contains results.bindings array
  • Has expected fields: class, classLabel, altLabels
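The checks above can be automated. The sketch below assumes the files follow the standard SPARQL JSON results layout, where the selected variables are listed under `head.vars`; `verify_result_file` is a hypothetical helper name.

```python
import json
from pathlib import Path

EXPECTED_FIELDS = ("class", "classLabel", "altLabels")


def verify_result_file(path):
    """Return a list of problems found; an empty list means all checks passed."""
    file = Path(path)
    if not file.is_file() or file.stat().st_size == 0:
        return ["file is missing or empty"]
    try:
        data = json.loads(file.read_text(encoding="utf-8"))
    except ValueError as exc:  # covers JSON and text-decoding errors
        return [f"invalid JSON: {exc}"]
    results = data.get("results") if isinstance(data, dict) else None
    bindings = results.get("bindings") if isinstance(results, dict) else None
    if not isinstance(bindings, list):
        return ["results.bindings array not found"]
    # Individual rows may omit unbound variables, so check head.vars instead.
    head = data.get("head")
    variables = head.get("vars", []) if isinstance(head, dict) else []
    return [f"missing field: {name}" for name in EXPECTED_FIELDS if name not in variables]
```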

Step 3: Processing

Once all 6 raw result files are saved (the U-Class has no query and therefore no file), notify the AI agent to:

  1. Deduplicate by QID
  2. Generate statistics
  3. Create analysis documents
  4. Update master checklist
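For reference, the first two processing steps could look roughly like the sketch below. It is an assumption about the approach, not the agent's actual implementation: it relies on the standard SPARQL JSON layout, in which `class.value` is an entity URI ending in the QID and language-tagged labels carry an `xml:lang` key. The function names are hypothetical.

```python
from collections import Counter


def qid_of(binding):
    """Extract the QID from the entity URI, e.g. .../entity/Q33506 -> Q33506."""
    return binding["class"]["value"].rsplit("/", 1)[-1]


def deduplicate_by_qid(bindings):
    """Keep the first binding seen for each QID, preserving order."""
    unique = {}
    for binding in bindings:
        unique.setdefault(qid_of(binding), binding)
    return list(unique.values())


def label_language_counts(bindings):
    """Count classLabel values per language tag ('und' when no tag is present)."""
    return Counter(
        b["classLabel"].get("xml:lang", "und") for b in bindings if "classLabel" in b
    )
```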

Query Execution Checklist

  • C-Class (CORPORATION) - corporation_hyponyms.sparql → hyponyms_raw.json
  • U-Class (UNKNOWN) - no query needed (assigned during data extraction)
  • B-Class (BOTANICAL_ZOO) - botanical_zoo_hyponyms.sparql → hyponyms_raw.json
  • E-Class (EDUCATION_PROVIDER) - education_provider_hyponyms.sparql → hyponyms_raw.json
  • S-Class (COLLECTING_SOCIETY) - collecting_society_hyponyms.sparql → hyponyms_raw.json
  • P-Class (PERSONAL_COLLECTION) - personal_collection_hyponyms.sparql → hyponyms_raw.json
  • X-Class (MIXED) - mixed_hyponyms.sparql → hyponyms_raw.json

Expected Timeline

  • Query execution: 5-10 minutes per query (6 queries × 10 min = ~60 minutes at most; the U-Class has no query)
  • Data processing: 30-60 minutes (automated by AI agent)
  • Total completion: 2-3 hours

Success Criteria

All queries have executed successfully when:

  • 6 new hyponyms_raw.json files created (one per remaining class with a query; the U-Class has none)
  • Each file > 1 KB (contains results)
  • Valid JSON format in all files
  • Master checklist updated to 15/15 (100%)
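The file-level criteria above can be checked in one pass. A minimal sketch, assuming the `<class letter>/sparql/hyponyms_raw.json` layout used throughout this guide; the helper names are hypothetical.

```python
import json
from pathlib import Path

MIN_SIZE_BYTES = 1024  # "Each file > 1 KB"


def file_meets_criteria(path):
    """True when the file exists, is larger than 1 KB, and parses as JSON."""
    file = Path(path)
    if not file.is_file() or file.stat().st_size <= MIN_SIZE_BYTES:
        return False
    try:
        json.loads(file.read_text(encoding="utf-8"))
    except ValueError:  # covers JSON and text-decoding errors
        return False
    return True


def completion_summary(base, letters):
    """Map each class letter to whether its hyponyms_raw.json meets the criteria."""
    return {
        letter: file_meets_criteria(Path(base) / letter / "sparql" / "hyponyms_raw.json")
        for letter in letters
    }
```

Calling `completion_summary("data/wikidata/GLAMORCUBEPSXH", "CBESPX")` reports the status of each remaining class that has a query.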

Notes

  • Wikidata Query Service may time out on very large result sets (the public endpoint limits each query to 60 seconds)
  • If a timeout occurs, consider splitting the query into smaller geographic regions
  • Save each result file as soon as its query finishes, so that no progress is lost
  • JSON export preserves all language labels and altLabels
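If a query does have to be split, the partial exports need to be combined again before processing. The sketch below shows one possible way to do that, assuming each part is a standard SPARQL JSON result document; `merge_results` is a hypothetical helper, and it drops only rows that are exact duplicates.

```python
import json


def merge_results(parts):
    """Merge several SPARQL JSON result documents, dropping duplicate rows."""
    merged_vars, merged_rows, seen = [], [], set()
    for part in parts:
        # Union of the variable lists, in first-seen order.
        for var in part["head"]["vars"]:
            if var not in merged_vars:
                merged_vars.append(var)
        for row in part["results"]["bindings"]:
            # Serialize with sorted keys so identical rows compare equal.
            key = json.dumps(row, sort_keys=True)
            if key not in seen:
                seen.add(key)
                merged_rows.append(row)
    return {"head": {"vars": merged_vars}, "results": {"bindings": merged_rows}}
```

Writing the merged document to the class's output file keeps the later steps unchanged.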

Next Action: Execute the C-Class query first, then proceed through priorities 2-7 (priority 2, the U-Class, requires no query).