glam/TAXONOMY_UPDATE_SUMMARY.md
2025-11-19 23:25:22 +01:00

GLAMORCUBESFIXPHDNT Taxonomy Update Summary

Date

2025-11-14

Overview

Updated all documentation and code to reflect the GLAMORCUBESFIXPHDNT taxonomy, which expands the set of institution types from 15 to 19.

New Taxonomy Structure

Mnemonic

GLAMORCUBESFIXPHDNT - Galleries, Libraries, Archives, Museums, Official institutions, Research centers, Corporations, Unknown, Botanical gardens/zoos, Education providers, Societies, Features, Intangible heritage groups, miXed, Personal collections, Holy sites, Digital platforms, NGOs, Taste/smell heritage

Order note: The single-letter codes follow the letter order of the mnemonic, not alphabetical order: G-L-A-M-O-R-C-U-B-E-S-F-I-X-P-H-D-N-T

Complete Type List (19 types)

| Code | Type | Description |
|------|------|-------------|
| G | GALLERY | Art gallery or exhibition space |
| L | LIBRARY | Library (public, academic, specialized) |
| A | ARCHIVE | Archive (government, corporate, personal) |
| M | MUSEUM | Museum (art, history, science, etc.) |
| O | OFFICIAL_INSTITUTION | Government heritage agencies |
| R | RESEARCH_CENTER | Research institutes and documentation centers |
| C | CORPORATION | Corporate heritage collections |
| U | UNKNOWN | Institution type cannot be determined |
| B | BOTANICAL_ZOO | Botanical gardens and zoological parks |
| E | EDUCATION_PROVIDER | Educational institutions with collections |
| S | COLLECTING_SOCIETY | Societies collecting specialized materials |
| F | FEATURES | Physical landscape features with heritage significance |
| I | INTANGIBLE_HERITAGE_GROUP | Organizations preserving intangible heritage (NEW) |
| X | MIXED | Multiple types |
| P | PERSONAL_COLLECTION | Private personal collections |
| H | HOLY_SITES | Religious heritage sites and institutions |
| D | DIGITAL_PLATFORM | Digital heritage platforms and repositories (NEW) |
| N | NGO | Non-governmental heritage organizations (NEW) |
| T | TASTE_SMELL | Culinary and olfactory heritage institutions (NEW) |
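
The type list above can be mirrored in code as a plain enumeration. The following is a minimal sketch and not the project's actual InstitutionTypeEnum in src/glam_extractor/models.py; the class name `InstitutionTypeSketch` and the helper `mnemonic()` are illustrative assumptions.

```python
from enum import Enum


class InstitutionTypeSketch(Enum):
    """Hypothetical mirror of the type list: member name = type, value = code."""

    GALLERY = "G"
    LIBRARY = "L"
    ARCHIVE = "A"
    MUSEUM = "M"
    OFFICIAL_INSTITUTION = "O"
    RESEARCH_CENTER = "R"
    CORPORATION = "C"
    UNKNOWN = "U"
    BOTANICAL_ZOO = "B"
    EDUCATION_PROVIDER = "E"
    COLLECTING_SOCIETY = "S"
    FEATURES = "F"
    INTANGIBLE_HERITAGE_GROUP = "I"
    MIXED = "X"
    PERSONAL_COLLECTION = "P"
    HOLY_SITES = "H"
    DIGITAL_PLATFORM = "D"
    NGO = "N"
    TASTE_SMELL = "T"


def mnemonic() -> str:
    """Join the codes in definition order to rebuild the mnemonic string."""
    return "".join(member.value for member in InstitutionTypeSketch)
```

Because the members are declared in mnemonic order, `mnemonic()` rebuilds the taxonomy name, and a lookup by code such as `InstitutionTypeSketch("D")` returns the `DIGITAL_PLATFORM` member.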

Changes Made

1. Core Documentation

  • AGENTS.md: Updated taxonomy description from 15 to 19 types
  • schemas/enums.yaml: Added four new enum values (INTANGIBLE_HERITAGE_GROUP, DIGITAL_PLATFORM, NGO, TASTE_SMELL)
  • data/instances/all/README.md: Updated institution type table

2. Wikidata Documentation

Updated all references from GLAMORCUBEPSXHF to GLAMORCUBESFIXPHDNT in:

  • All Markdown files in data/wikidata/GLAMORCUBEPSXHFN/
  • All query YAML files
  • All SPARQL query files
  • Analysis reports and session summaries

3. Planning Documents

Updated taxonomy references in:

  • docs/plan/glamorcubepsxh_vocab/ (all files)
  • docs/plan/unesco/ (all relevant files)
  • Session summary documents in docs/sessions/

4. Code Files

  • src/glam_extractor/models.py: Updated InstitutionTypeEnum docstring
  • scripts/*.py: Updated comments and descriptions

New Type Definitions

INTANGIBLE_HERITAGE_GROUP (I)

INTANGIBLE_HERITAGE_GROUP:
  description: >-
    Organizations preserving intangible cultural heritage (traditional performance groups,
    oral history societies, folklore organizations, cultural practice preservation).
    Includes UNESCO-recognized intangible heritage bearers.    
  meaning: schema:PerformingGroup
  annotations:
    ghcid_code: I
    also_maps_to: schema:Organization

DIGITAL_PLATFORM (D)

DIGITAL_PLATFORM:
  description: >-
    Digital heritage platforms and repositories (online archives, digital libraries,
    virtual museums, aggregation portals). Born-digital heritage institutions.    
  meaning: schema:WebSite
  annotations:
    ghcid_code: D
    also_maps_to: schema:DataCatalog

NGO (N)

NGO:
  description: >-
    Non-governmental heritage organizations (heritage advocacy groups, preservation societies,
    cultural heritage NGOs). Non-profit organizations not classified under other types.    
  meaning: schema:NGO
  annotations:
    ghcid_code: N
    also_maps_to: schema:Organization

TASTE_SMELL (T)

TASTE_SMELL:
  description: >-
    Culinary and olfactory heritage institutions (restaurants, eateries, parfumeries, distilleries,
    breweries) actively preserving traditional recipes, cooking techniques, perfume formulations,
    and sensory cultural practices. Includes establishments maintaining historical food preparation
    methods and scent heritage.    
  meaning: schema:Restaurant
  annotations:
    ghcid_code: T
    also_maps_to: schema:FoodEstablishment
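
For downstream code it can help to hold the four definitions above as data and to expand their CURIEs. The sketch below copies the `ghcid_code`, `meaning` and `also_maps_to` values from the definitions; the base IRI `https://schema.org/` and the names `NEW_TYPE_MAPPINGS` and `expand_curie` are assumptions for illustration, and the project's schema may declare a different prefix expansion.

```python
# Assumption: the "schema" prefix expands to https://schema.org/ .
# The project's LinkML schema may declare a different base (e.g. http://).
SCHEMA_BASE = "https://schema.org/"

# Values copied from the four enum definitions in this section.
NEW_TYPE_MAPPINGS = {
    "INTANGIBLE_HERITAGE_GROUP": {
        "ghcid_code": "I",
        "meaning": "schema:PerformingGroup",
        "also_maps_to": "schema:Organization",
    },
    "DIGITAL_PLATFORM": {
        "ghcid_code": "D",
        "meaning": "schema:WebSite",
        "also_maps_to": "schema:DataCatalog",
    },
    "NGO": {
        "ghcid_code": "N",
        "meaning": "schema:NGO",
        "also_maps_to": "schema:Organization",
    },
    "TASTE_SMELL": {
        "ghcid_code": "T",
        "meaning": "schema:Restaurant",
        "also_maps_to": "schema:FoodEstablishment",
    },
}


def expand_curie(curie: str) -> str:
    """Expand a 'schema:Local' CURIE to a full IRI under SCHEMA_BASE."""
    prefix, _, local = curie.partition(":")
    if prefix != "schema" or not local:
        raise ValueError(f"unsupported CURIE: {curie!r}")
    return SCHEMA_BASE + local
```

For example, `expand_curie(NEW_TYPE_MAPPINGS["NGO"]["meaning"])` yields the IRI for `schema:NGO` under the assumed base.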

Migration Impact

Backward Compatibility

  • All existing 15 types remain unchanged
  • Existing GHCID codes are stable
  • No breaking changes to LinkML schema structure
  • New types are additions, not modifications
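
The claim that the new types are purely additive can be checked at the level of the single-letter codes. The sketch below compares the old and new mnemonics named in this summary; the function name `added_codes` is hypothetical, and the check covers only the codes, not whether the type definitions themselves are unchanged.

```python
OLD_MNEMONIC = "GLAMORCUBEPSXHF"      # 15-type mnemonic named in this summary
NEW_MNEMONIC = "GLAMORCUBESFIXPHDNT"  # 19-type mnemonic


def added_codes(old: str, new: str) -> list[str]:
    """Return codes present in `new` but not in `old`, in `new` order.

    Raises ValueError if any code from `old` is missing in `new`,
    which would indicate a breaking change.
    """
    dropped = set(old) - set(new)
    if dropped:
        raise ValueError(f"codes dropped from taxonomy: {sorted(dropped)}")
    return [code for code in new if code not in old]
```

Called as `added_codes(OLD_MNEMONIC, NEW_MNEMONIC)`, it returns the four new codes I, D, N and T and raises nothing, since every old code survives.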

Data Migration Required

  • ⚠️ Review existing institutions classified as UNKNOWN to potentially reclassify as:
    • DIGITAL_PLATFORM (for online-only repositories)
    • NGO (for heritage advocacy organizations)
    • INTANGIBLE_HERITAGE_GROUP (for performance/oral tradition organizations)
    • TASTE_SMELL (for historic restaurants, perfumeries, distilleries)
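
A first pass over UNKNOWN records could be a simple keyword triage that proposes a candidate type for manual review. The sketch below is only an illustration: the keyword lists, the function `suggest_type`, and the idea that each record carries a free-text description are all assumptions, not part of the existing pipeline.

```python
from typing import Optional

# Hypothetical keyword hints, loosely based on the examples in the list above.
# Real reclassification would need curated vocabularies and human review.
HINTS = {
    "DIGITAL_PLATFORM": ("online archive", "digital library", "virtual museum", "portal"),
    "NGO": ("advocacy", "preservation society", "non-governmental"),
    "INTANGIBLE_HERITAGE_GROUP": ("oral history", "folklore", "performance group"),
    "TASTE_SMELL": ("restaurant", "perfumery", "distillery", "brewery"),
}


def suggest_type(institution_type: str, description: str) -> Optional[str]:
    """Suggest a new type for an UNKNOWN record, or None if unsure.

    Only records currently typed UNKNOWN are considered, and a suggestion
    is made only when exactly one candidate type matches.
    """
    if institution_type != "UNKNOWN":
        return None
    text = description.lower()
    matches = [
        candidate
        for candidate, keywords in HINTS.items()
        if any(keyword in text for keyword in keywords)
    ]
    return matches[0] if len(matches) == 1 else None
```

Returning `None` for ambiguous descriptions keeps such records as UNKNOWN until someone reviews them, rather than guessing between two new types.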

Query Updates Needed

  • ⚠️ Update Wikidata SPARQL queries to include:
    • I-class queries for intangible heritage groups
    • D-class queries for digital platforms
    • N-class queries for NGOs
    • T-class queries for taste/smell heritage
  • ⚠️ Add new query files:
    • data/wikidata/GLAMORCUBESFIXPHDNT/I/sparql/intangible_heritage_hyponyms.sparql
    • data/wikidata/GLAMORCUBESFIXPHDNT/D/sparql/digital_platform_hyponyms.sparql
    • data/wikidata/GLAMORCUBESFIXPHDNT/N/sparql/ngo_hyponyms.sparql
    • data/wikidata/GLAMORCUBESFIXPHDNT/T/sparql/taste_smell_hyponyms.sparql
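
The four file paths above share one pattern, so they can be generated and checked rather than typed by hand. This is a minimal sketch; the names `QUERY_STEMS`, `query_path` and `missing_query_files` are hypothetical, and the repository root is assumed to be passed in by the caller.

```python
from pathlib import Path, PurePosixPath

QUERY_ROOT = PurePosixPath("data/wikidata/GLAMORCUBESFIXPHDNT")

# File-name stems taken from the four paths listed above.
QUERY_STEMS = {
    "I": "intangible_heritage",
    "D": "digital_platform",
    "N": "ngo",
    "T": "taste_smell",
}


def query_path(code: str) -> PurePosixPath:
    """Build the repository-relative path of the hyponym query for a code."""
    return QUERY_ROOT / code / "sparql" / f"{QUERY_STEMS[code]}_hyponyms.sparql"


def missing_query_files(repo_root: str = ".") -> list[str]:
    """List expected query files that do not yet exist under repo_root."""
    root = Path(repo_root)
    return [
        str(query_path(code))
        for code in QUERY_STEMS
        if not root.joinpath(*query_path(code).parts).is_file()
    ]
```

Running `missing_query_files()` from the repository root would list whichever of the four query files still need to be created.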

Next Steps

Immediate

  1. Create Wikidata query templates for new I, D, N, T classes
  2. Generate curated vocabulary lists for new types
  3. Update hyponym exclusion lists in hyponyms_curated.yaml

Short-term

  1. Review existing UNKNOWN institutions for reclassification
  2. Extract intangible heritage organizations from conversations
  3. Identify digital-only heritage platforms
  4. Map NGO heritage organizations
  5. Identify culinary and olfactory heritage institutions

Long-term

  1. Integrate new types into global extraction workflows
  2. Update NLP classification models to recognize new types
  3. Add new types to Wikidata mapping documentation

Files Modified

Total: more than 30 files (approximate count) across:

  • Documentation (AGENTS.md, README files, planning docs)
  • Schema definitions (schemas/enums.yaml)
  • Code (src/glam_extractor/models.py, scripts/*.py)
  • Wikidata queries and analysis (data/wikidata/GLAMORCUBESFIXPHDNT/)
  • Session summaries (docs/sessions/)

Testing Required

  • Validate LinkML schema with new enum values
  • Test GHCID generation for new types (including T)
  • Verify RDF serialization includes the new schema:PerformingGroup, schema:WebSite, schema:NGO, and schema:Restaurant mappings
  • Update test fixtures to include new institution types
  • Regenerate models.py from updated schema (if using LinkML code generation)
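
The RDF serialization check in this list could start as a crude text-level test before any real RDF parsing is set up. The sketch below looks for the four `meaning` values from the new type definitions in a serialized string; the function name `missing_mappings` is hypothetical, and accepting either a CURIE or a full schema.org IRI (http or https) is an assumption about how the serializer writes them.

```python
# The four `meaning` values from the new type definitions in this summary.
EXPECTED_CURIES = (
    "schema:PerformingGroup",
    "schema:WebSite",
    "schema:NGO",
    "schema:Restaurant",
)


def missing_mappings(serialized: str) -> list[str]:
    """Return expected CURIEs that appear nowhere in the serialized text.

    This is substring matching only, not RDF parsing: it can miss
    mappings written with a different prefix name.
    """
    missing = []
    for curie in EXPECTED_CURIES:
        local = curie.split(":", 1)[1]
        full_iris = (f"http://schema.org/{local}", f"https://schema.org/{local}")
        found = curie in serialized or any(iri in serialized for iri in full_iris)
        if not found:
            missing.append(curie)
    return missing
```

An empty result means all four mappings were found in some form; a parsed-graph comparison would be the more reliable follow-up once the serialization format is fixed.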

References

  • Taxonomy Definition: AGENTS.md (line 391)
  • Schema Enums: schemas/enums.yaml (line 33)
  • Type Examples: data/instances/all/README.md (line 169)
  • Mnemonic: GLAMORCUBESFIXPHDNT = Galleries, Libraries, Archives, Museums, Official, Research, Corporations, Unknown, Botanical/zoos, Education, Societies, Features, Intangible, miXed, Personal, Holy sites, Digital, NGOs, Taste/smell
  • Code Order: G-L-A-M-O-R-C-U-B-E-S-F-I-X-P-H-D-N-T (the letter order of the mnemonic, not alphabetical)

Update completed: 2025-11-14
Author: AI Assistant
Status: Complete