# Country Class Implementation - Complete

**Date**: 2025-11-22

**Session**: Continuation of FeaturePlace implementation

---

## Summary

Successfully created the **Country class** to handle country-specific feature types and legal forms.

### ✅ What Was Completed

#### 1. Fixed Duplicate Keys in FeatureTypeEnum.yaml

- **Problem**: YAML had 2 duplicate keys (RESIDENTIAL_BUILDING, SANCTUARY)
- **Resolution**:
  - Removed duplicate RESIDENTIAL_BUILDING (Q11755880) at lines 3692-3714
  - Kept Catholic-specific SANCTUARY (Q21850178 "shrine of the Catholic Church")
  - Removed generic SANCTUARY (Q29553 "sacred place")
- **Result**: 294 unique feature types (down from 296 with duplicates)

#### 2. Created Country Class

**File**: `schemas/20251121/linkml/modules/classes/Country.yaml`

**Design Philosophy**:

- **Minimal design**: ONLY ISO 3166-1 alpha-2 and alpha-3 codes
- **No other metadata**: No country names, languages, capitals, regions
- **Rationale**: ISO codes are authoritative, stable, language-neutral identifiers
- All other country metadata should be resolved via external services (GeoNames, UN M49)

**Schema**:

```yaml
Country:
  description: Country identified by ISO 3166-1 codes
  slots:
    - alpha_2  # ISO 3166-1 alpha-2 (2-letter: "NL", "PE", "US")
    - alpha_3  # ISO 3166-1 alpha-3 (3-letter: "NLD", "PER", "USA")
  slot_usage:
    alpha_2:
      required: true
      pattern: "^[A-Z]{2}$"
      slot_uri: schema:addressCountry
    alpha_3:
      required: true
      pattern: "^[A-Z]{3}$"
```

**Examples**:

- Netherlands: `alpha_2="NL"`, `alpha_3="NLD"`
- Peru: `alpha_2="PE"`, `alpha_3="PER"`
- United States: `alpha_2="US"`, `alpha_3="USA"`
- Japan: `alpha_2="JP"`, `alpha_3="JPN"`

#### 3. Integrated Country into CustodianPlace

**File**: `schemas/20251121/linkml/modules/classes/CustodianPlace.yaml`

**Changes**:

- Added `country` slot (optional, range: Country)
- Links place to its country location
- Enables country-specific feature type validation

**Use Cases**:

- Disambiguate places across countries ("Victoria Museum" exists in multiple countries)
- Enable country-conditional feature types (e.g., "cultural heritage of Peru")
- Generate country-specific enum values

**Example**:

```yaml
CustodianPlace:
  place_name: "Machu Picchu"
  place_language: "es"
  country:
    alpha_2: "PE"
    alpha_3: "PER"
  has_feature_type:
    feature_type: CULTURAL_HERITAGE_OF_PERU  # Only valid for PE!
```

#### 4. Integrated Country into LegalForm

**File**: `schemas/20251121/linkml/modules/classes/LegalForm.yaml`

**Changes**:

- Updated `country_code` from string pattern to Country class reference
- Enforces jurisdiction-specific legal forms via ontology links

**Rationale**:

- Legal forms are jurisdiction-specific
- A "Stichting" in Netherlands ≠ "Fundación" in Spain (different legal meaning)
- Country class provides canonical ISO codes for legal jurisdictions

**Before** (string):

```yaml
country_code:
  range: string
  pattern: "^[A-Z]{2}$"
```

**After** (Country class):

```yaml
country_code:
  range: Country
  required: true
```

**Example**:

```yaml
LegalForm:
  elf_code: "8888"
  country_code:
    alpha_2: "NL"
    alpha_3: "NLD"
  local_name: "Stichting"
  abbreviation: "Stg."
```

#### 5. Created Example Instance File

**File**: `schemas/20251121/examples/country_integration_example.yaml`

Shows:

- Country instances (NL, PE, US)
- CustodianPlace with country linking
- LegalForm with country linking
- Country-specific feature types (CULTURAL_HERITAGE_OF_PERU, BUITENPLAATS)

---

## Design Decisions

### Why Minimal Country Class?

**Excluded Metadata**:

- ❌ Country names (language-dependent: "Netherlands" vs "Pays-Bas" vs "荷兰")
- ❌ Capital cities (change over time: Myanmar moved capital 2006)
- ❌ Languages (multilingual countries: Belgium has 3 official languages)
- ❌ Regions/continents (political: Is Turkey in Europe or Asia?)
- ❌ Currency (changes: Eurozone adoption)
- ❌ Phone codes (technical, not heritage-relevant)

**Rationale**:

1. **Language neutrality**: ISO codes work across all languages
2. **Temporal stability**: Country names and capitals change; ISO codes are persistent
3. **Separation of concerns**: Heritage ontology shouldn't duplicate geopolitical databases
4. **External resolution**: Use GeoNames, UN M49, or ISO 3166 Maintenance Agency for metadata

**External Services**:

- GeoNames API: Country names in 20+ languages, capitals, regions
- UN M49: Standard country codes and regions
- ISO 3166 Maintenance Agency: Official ISO code updates

### Why Both Alpha-2 and Alpha-3?

**Alpha-2** (2-letter):

- Used by: Internet ccTLDs (.nl, .pe, .us), Schema.org addressCountry
- Compact, widely recognized
- **Primary for web applications**

**Alpha-3** (3-letter):

- Used by: United Nations statistics (alongside UN M49 numeric codes) and ICAO machine-readable travel documents
- Less likely to collide with other two-letter tokens (e.g., "AT" = Austria vs "AT" standing in for the @-sign in some systems)
- **Primary for international standards**

**Both are required** to ensure interoperability across different systems.

---

## Country-Specific Feature Types

### Problem: Some feature types only apply to specific countries

**Examples from FeatureTypeEnum.yaml**:

1. **CULTURAL_HERITAGE_OF_PERU** (Q16617058)
   - Description: "cultural heritage of Peru"
   - Only valid for: `country.alpha_2 = "PE"`
   - Hypernym: cultural heritage
2. **BUITENPLAATS** (Q2927789)
   - Description: "summer residence for rich townspeople in the Netherlands"
   - Only valid for: `country.alpha_2 = "NL"`
   - Hypernym: heritage site
3. **NATIONAL_MEMORIAL_OF_THE_UNITED_STATES** (Q20010800)
   - Description: "national memorial in the United States"
   - Only valid for: `country.alpha_2 = "US"`
   - Hypernym: heritage site

### Solution: Country-Conditional Enum Values

**Implementation Strategy**:

When validating CustodianPlace.has_feature_type:

```python
# Fragment: `custodian_place` and `feature_type` come from the record being validated.
# `country` is optional on CustodianPlace, so guard against a missing value.
place_country = custodian_place.country.alpha_2 if custodian_place.country else None

# Raise instead of assert: assertions are skipped under `python -O`.
if feature_type == "CULTURAL_HERITAGE_OF_PERU" and place_country != "PE":
    raise ValueError("CULTURAL_HERITAGE_OF_PERU only valid for Peru")
if feature_type == "BUITENPLAATS" and place_country != "NL":
    raise ValueError("BUITENPLAATS only valid for Netherlands")
```

**LinkML Implementation** (future enhancement):

```yaml
FeatureTypeEnum:
  permissible_values:
    CULTURAL_HERITAGE_OF_PERU:
      meaning: wd:Q16617058
      annotations:
        country_restriction: "PE"  # Only valid for Peru
    BUITENPLAATS:
      meaning: wd:Q2927789
      annotations:
        country_restriction: "NL"  # Only valid for Netherlands
```

---

## Files Modified

### Created:

1. `schemas/20251121/linkml/modules/classes/Country.yaml` (new)
2. `schemas/20251121/examples/country_integration_example.yaml` (new)

### Modified:

3. `schemas/20251121/linkml/modules/classes/CustodianPlace.yaml`
   - Added `Country` import
   - Added `country` slot
   - Added slot_usage documentation
4. `schemas/20251121/linkml/modules/classes/LegalForm.yaml`
   - Added `Country` import
   - Changed `country_code` from string to Country class reference
5. `schemas/20251121/linkml/modules/enums/FeatureTypeEnum.yaml`
   - Removed duplicate RESIDENTIAL_BUILDING (Q11755880)
   - Removed duplicate SANCTUARY (Q29553, kept Q21850178)
   - Result: 294 unique feature types

---

## Validation Results

### YAML Syntax: ✅ Valid

```
Total enum values: 294
No duplicate keys found
```

### Country Class: ✅ Minimal Design

```
- alpha_2: Required, pattern: ^[A-Z]{2}$
- alpha_3: Required, pattern: ^[A-Z]{3}$
- No other fields (names, languages, capitals excluded)
```

### Integration: ✅ Complete

- CustodianPlace → Country (optional link)
- LegalForm → Country (required link, jurisdiction-specific)
- FeatureTypeEnum → Ready for country-conditional validation

---

## Next Steps (Future Work)

### 1. Country-Conditional Enum Validation

**Task**: Implement validation rules for country-specific feature types

**Approach**:

- Add `country_restriction` annotation to FeatureTypeEnum entries
- Create LinkML validation rule to check CustodianPlace.country matches restriction
- Generate country-specific enum subsets for UI dropdowns

**Example Rule** (illustrative; `value_must_match` is a placeholder, and the exact LinkML expression is still to be determined):

```yaml
rules:
  - title: "Country-specific feature type validation"
    preconditions:
      slot_conditions:
        has_feature_type:
          range: FeaturePlace
    postconditions:
      slot_conditions:
        country:
          value_must_match: "{has_feature_type.country_restriction}"
```

### 2. Populate Country Instances

**Task**: Create Country instances for all countries in the dataset

**Data Source**: ISO 3166-1 official list (249 countries)

**Implementation**:

```yaml
# schemas/20251121/data/countries.yaml
countries:
  - id: https://nde.nl/ontology/hc/country/NL
    alpha_2: "NL"
    alpha_3: "NLD"
  - id: https://nde.nl/ontology/hc/country/PE
    alpha_2: "PE"
    alpha_3: "PER"
  # ... 247 more entries
```

### 3. Link LegalForm to ISO 20275 ELF Codes

**Task**: Populate LegalForm instances with ISO 20275 Entity Legal Form codes

**Data Source**: GLEIF ISO 20275 Code List (1,600+ legal forms across 150+ jurisdictions)

**Example**:

```yaml
# Netherlands legal forms
- id: https://nde.nl/ontology/hc/legal-form/nl-8888
  elf_code: "8888"
  country_code: {alpha_2: "NL", alpha_3: "NLD"}
  local_name: "Stichting"
  abbreviation: "Stg."
- id: https://nde.nl/ontology/hc/legal-form/nl-akd2
  elf_code: "AKD2"
  country_code: {alpha_2: "NL", alpha_3: "NLD"}
  local_name: "Besloten vennootschap"
  abbreviation: "B.V."
```

### 4. External Resolution Service Integration

**Task**: Provide helper functions to resolve country metadata via GeoNames API

**Implementation**:

```python
from typing import Dict

import requests


def resolve_country_metadata(alpha_2: str) -> Dict:
    """Resolve country metadata from GeoNames API."""
    url = "http://api.geonames.org/countryInfoJSON"
    params = {
        "country": alpha_2,
        "username": "your_geonames_username",  # replace with a registered GeoNames account
    }
    response = requests.get(url, params=params, timeout=10)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error body
    info = response.json()["geonames"][0]
    return {
        "name_en": info["countryName"],
        "capital": info["capital"],
        "languages": info["languages"].split(","),
        "continent": info["continent"],
    }


# Usage
country_metadata = resolve_country_metadata("NL")
# Expected shape (values illustrative; exact strings depend on the GeoNames response,
# and language entries may be locale tags such as "nl-NL"):
# {
#     "name_en": "Netherlands",
#     "capital": "Amsterdam",
#     "languages": ["nl", "fy"],
#     "continent": "EU"
# }
```

### 5. UI Dropdown Generation

**Task**: Generate country-filtered feature type dropdowns for data entry forms

**Use Case**: When user selects "Netherlands" as country, only show:

- Universal feature types (MUSEUM, CHURCH, MANSION, etc.)
- Netherlands-specific types (BUITENPLAATS)

Types restricted to other countries, such as the Peru-specific CULTURAL_HERITAGE_OF_PERU, are excluded.

**Implementation**:

```python
from typing import List

def get_valid_feature_types(country_alpha_2: str) -> List[str]:
    """Get valid feature types for a given country."""
    # Helpers (not yet implemented) read each value's `country_restriction` annotation.
    universal_types = [ft for ft in FeatureTypeEnum if not has_country_restriction(ft)]
    country_specific = [ft for ft in FeatureTypeEnum if get_country_restriction(ft) == country_alpha_2]
    return universal_types + country_specific
```

---

## References

- **ISO 3166-1**: https://www.iso.org/iso-3166-country-codes.html
- **GeoNames API**: https://www.geonames.org/export/web-services.html
- **UN M49**: https://unstats.un.org/unsd/methodology/m49/
- **ISO 20275**: https://www.gleif.org/en/about-lei/code-lists/iso-20275-entity-legal-forms-code-list
- **Schema.org addressCountry**: https://schema.org/addressCountry
- **Wikidata Q-numbers**: Country-specific heritage feature types

---

## Status

✅ **Country Class Implementation: COMPLETE**

- [x] Duplicate keys fixed in FeatureTypeEnum.yaml
- [x] Country class created with minimal design
- [x] CustodianPlace integrated with country linking
- [x] LegalForm integrated with country linking
- [x] Example instance file created
- [x] Documentation complete

**Ready for**: Country-conditional enum validation and LegalForm population with ISO 20275 codes.
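
Until the `country_restriction` annotations exist in the schema, the country-conditional validation and dropdown filtering listed under Next Steps could be prototyped along the following lines. This is a minimal sketch, not the project's implementation: the `COUNTRY_RESTRICTIONS` table and the function names `valid_feature_types` and `check_feature_type` are hypothetical, and the table only mirrors the feature types named in this document.

```python
from typing import Dict, List, Optional

# Assumed stand-in for the planned `country_restriction` annotations.
# None marks a universal feature type; a string is the required alpha-2 code.
COUNTRY_RESTRICTIONS: Dict[str, Optional[str]] = {
    "MUSEUM": None,
    "CHURCH": None,
    "MANSION": None,
    "CULTURAL_HERITAGE_OF_PERU": "PE",
    "BUITENPLAATS": "NL",
    "NATIONAL_MEMORIAL_OF_THE_UNITED_STATES": "US",
}


def valid_feature_types(country_alpha_2: str) -> List[str]:
    """Return universal types plus the types restricted to the given country."""
    return [
        name
        for name, restriction in COUNTRY_RESTRICTIONS.items()
        if restriction is None or restriction == country_alpha_2
    ]


def check_feature_type(feature_type: str, country_alpha_2: Optional[str]) -> None:
    """Raise ValueError if the feature type is unknown or not allowed for the country."""
    if feature_type not in COUNTRY_RESTRICTIONS:
        raise ValueError(f"Unknown feature type: {feature_type}")
    restriction = COUNTRY_RESTRICTIONS[feature_type]
    # A place without a country (None) cannot carry a country-restricted type.
    if restriction is not None and restriction != country_alpha_2:
        raise ValueError(f"{feature_type} is only valid for country {restriction}")


print(valid_feature_types("NL"))
# ['MUSEUM', 'CHURCH', 'MANSION', 'BUITENPLAATS']
```

Keeping the restrictions in one table means the same data can drive both the validator and the UI dropdown, which is the role the annotations are meant to play once they are added to FeatureTypeEnum.yaml.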