- Created the Country class with ISO 3166-1 alpha-2 and alpha-3 codes, ensuring minimal design without additional metadata.
- Integrated the Country class into CustodianPlace and LegalForm schemas to support country-specific feature types and legal forms.
- Removed duplicate keys in FeatureTypeEnum.yaml, resulting in 294 unique feature types.
- Eliminated "Hypernyms:" text from FeatureTypeEnum descriptions, verifying that semantic relationships are now conveyed through ontology mappings.
- Created example instance file demonstrating integration of Country with CustodianPlace and LegalForm.
- Updated documentation to reflect the completion of the Country class implementation and hypernyms removal.
407 lines
12 KiB
Markdown
# Country Class Implementation - Complete

**Date**: 2025-11-22
**Session**: Continuation of FeaturePlace implementation

---

## Summary

Successfully created the **Country class** to handle country-specific feature types and legal forms.

### ✅ What Was Completed

#### 1. Fixed Duplicate Keys in FeatureTypeEnum.yaml
- **Problem**: YAML had 2 duplicate keys (RESIDENTIAL_BUILDING, SANCTUARY)
- **Resolution**:
  - Removed duplicate RESIDENTIAL_BUILDING (Q11755880) at lines 3692-3714
  - Kept Catholic-specific SANCTUARY (Q21850178 "shrine of the Catholic Church")
  - Removed generic SANCTUARY (Q29553 "sacred place")
- **Result**: 294 unique feature types (down from 296 with duplicates)

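
Duplicate keys like these are easy to miss because many YAML parsers silently keep the last value. The following is a minimal sketch of a line-based check, not the tool used in this session. It assumes the enum values sit at a fixed indentation (6 spaces in the hand-written sample below) and that keys are bare identifiers; `find_duplicate_keys` and the sample layout are illustrative names, not part of the schema.

```python
import re
from collections import Counter
from typing import List

# Matches a bare mapping key with no inline value, e.g. "      SANCTUARY:".
KEY_LINE = re.compile(r"^(\s*)([A-Za-z_][A-Za-z0-9_]*):\s*(#.*)?$")


def find_duplicate_keys(yaml_text: str, indent: int) -> List[str]:
    """Return key names that occur more than once at the given indentation."""
    counts: Counter = Counter()
    for line in yaml_text.splitlines():
        match = KEY_LINE.match(line)
        if match and len(match.group(1)) == indent:
            counts[match.group(2)] += 1
    return sorted(key for key, n in counts.items() if n > 1)


# Hand-written sample that mimics the pre-fix state of the enum (assumed layout).
sample = """\
enums:
  FeatureTypeEnum:
    permissible_values:
      RESIDENTIAL_BUILDING:
        meaning: wd:Q11755880
      SANCTUARY:
        meaning: wd:Q21850178
      RESIDENTIAL_BUILDING:
        meaning: wd:Q11755880
      SANCTUARY:
        meaning: wd:Q29553
"""

print(find_duplicate_keys(sample, indent=6))
# ['RESIDENTIAL_BUILDING', 'SANCTUARY']
```

The check counts keys across the whole file at one indentation level, so it only suits files where that level holds a single mapping, as assumed here.
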
#### 2. Created Country Class
**File**: `schemas/20251121/linkml/modules/classes/Country.yaml`

**Design Philosophy**:
- **Minimal design**: ONLY ISO 3166-1 alpha-2 and alpha-3 codes
- **No other metadata**: No country names, languages, capitals, regions
- **Rationale**: ISO codes are authoritative, stable, language-neutral identifiers
- All other country metadata should be resolved via external services (GeoNames, UN M49)

**Schema**:
```yaml
Country:
  description: Country identified by ISO 3166-1 codes
  slots:
    - alpha_2  # ISO 3166-1 alpha-2 (2-letter: "NL", "PE", "US")
    - alpha_3  # ISO 3166-1 alpha-3 (3-letter: "NLD", "PER", "USA")

  slot_usage:
    alpha_2:
      required: true
      pattern: "^[A-Z]{2}$"
      slot_uri: schema:addressCountry

    alpha_3:
      required: true
      pattern: "^[A-Z]{3}$"
```

**Examples**:
- Netherlands: `alpha_2="NL"`, `alpha_3="NLD"`
- Peru: `alpha_2="PE"`, `alpha_3="PER"`
- United States: `alpha_2="US"`, `alpha_3="USA"`
- Japan: `alpha_2="JP"`, `alpha_3="JPN"`

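
For code that handles these records outside LinkML, the same two constraints can be mirrored in plain Python. This is a minimal sketch under that assumption; the dataclass below is hand-written for illustration and is not generated from the schema.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Country:
    """Illustrative mirror of the Country schema: two required ISO 3166-1 codes."""

    alpha_2: str
    alpha_3: str

    def __post_init__(self) -> None:
        # fullmatch applies the schema patterns without tolerating a trailing newline.
        if not re.fullmatch(r"[A-Z]{2}", self.alpha_2):
            raise ValueError(f"alpha_2 must be two uppercase letters, got {self.alpha_2!r}")
        if not re.fullmatch(r"[A-Z]{3}", self.alpha_3):
            raise ValueError(f"alpha_3 must be three uppercase letters, got {self.alpha_3!r}")


netherlands = Country(alpha_2="NL", alpha_3="NLD")
print(netherlands)
# Country(alpha_2='NL', alpha_3='NLD')
```

The patterns only check the shape of each code. They do not confirm that a code is actually assigned in ISO 3166-1.
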
#### 3. Integrated Country into CustodianPlace
**File**: `schemas/20251121/linkml/modules/classes/CustodianPlace.yaml`

**Changes**:
- Added `country` slot (optional, range: Country)
- Links place to its country location
- Enables country-specific feature type validation

**Use Cases**:
- Disambiguate places across countries ("Victoria Museum" exists in multiple countries)
- Enable country-conditional feature types (e.g., "cultural heritage of Peru")
- Generate country-specific enum values

**Example**:
```yaml
CustodianPlace:
  place_name: "Machu Picchu"
  place_language: "es"
  country:
    alpha_2: "PE"
    alpha_3: "PER"
  has_feature_type:
    feature_type: CULTURAL_HERITAGE_OF_PERU  # Only valid for PE!
```

#### 4. Integrated Country into LegalForm
**File**: `schemas/20251121/linkml/modules/classes/LegalForm.yaml`

**Changes**:
- Updated `country_code` from string pattern to Country class reference
- Enforces jurisdiction-specific legal forms via ontology links

**Rationale**:
- Legal forms are jurisdiction-specific
- A "Stichting" in Netherlands ≠ "Fundación" in Spain (different legal meaning)
- Country class provides canonical ISO codes for legal jurisdictions

**Before** (string):
```yaml
country_code:
  range: string
  pattern: "^[A-Z]{2}$"
```

**After** (Country class):
```yaml
country_code:
  range: Country
  required: true
```

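**Example**:
```yaml
LegalForm:
  # NOTE: placeholder code. GLEIF reserves "8888" for legal forms not yet on the
  # ELF list, so look up the actual Stichting entry before loading real data.
  elf_code: "8888"
  country_code:
    alpha_2: "NL"
    alpha_3: "NLD"
  local_name: "Stichting"
  abbreviation: "Stg."
```
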
#### 5. Created Example Instance File
**File**: `schemas/20251121/examples/country_integration_example.yaml`

Shows:
- Country instances (NL, PE, US)
- CustodianPlace with country linking
- LegalForm with country linking
- Country-specific feature types (CULTURAL_HERITAGE_OF_PERU, BUITENPLAATS)

---

## Design Decisions

### Why Minimal Country Class?

**Excluded Metadata**:
- ❌ Country names (language-dependent: "Netherlands" vs "Pays-Bas" vs "荷兰")
- ❌ Capital cities (change over time: Myanmar moved capital 2006)
- ❌ Languages (multilingual countries: Belgium has 3 official languages)
- ❌ Regions/continents (political: Is Turkey in Europe or Asia?)
- ❌ Currency (changes: Eurozone adoption)
- ❌ Phone codes (technical, not heritage-relevant)

**Rationale**:
1. **Language neutrality**: ISO codes work across all languages
2. **Temporal stability**: Country names and capitals change; ISO codes are persistent
3. **Separation of concerns**: Heritage ontology shouldn't duplicate geopolitical databases
4. **External resolution**: Use GeoNames, UN M49, or ISO 3166 Maintenance Agency for metadata

**External Services**:
- GeoNames API: Country names in 20+ languages, capitals, regions
- UN M49: Standard country codes and regions
- ISO 3166 Maintenance Agency: Official ISO code updates

### Why Both Alpha-2 and Alpha-3?

**Alpha-2** (2-letter):
- Used by: Internet ccTLDs (.nl, .pe, .us), Schema.org addressCountry
- Compact, widely recognized
- **Primary for web applications**

**Alpha-3** (3-letter):
- Used by: United Nations statistics (alongside M49 numeric codes)
- Not interchangeable with IOC country codes (e.g., IOC "NED" vs ISO "NLD") or with ISO 4217 currency codes, which build on alpha-2
- Less ambiguous (e.g., "AT" = Austria vs "AT" = @-sign in some systems)
- **Primary for international standards**

**Both are required** to ensure interoperability across different systems.

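
Requiring both codes also makes it possible to store a pair that does not belong together, which the per-slot patterns cannot detect. A minimal sketch of a cross-check follows. The lookup table is an illustrative subset built from the examples in this document; a real table would be loaded from the ISO 3166-1 list, and `codes_agree` is a hypothetical helper name.

```python
from typing import Dict

# Illustrative subset only; not the full ISO 3166-1 list.
ALPHA_2_TO_ALPHA_3: Dict[str, str] = {
    "NL": "NLD",
    "PE": "PER",
    "US": "USA",
    "JP": "JPN",
}


def codes_agree(alpha_2: str, alpha_3: str) -> bool:
    """Return True when alpha_2 is known and alpha_3 is its matching code."""
    return ALPHA_2_TO_ALPHA_3.get(alpha_2) == alpha_3


print(codes_agree("NL", "NLD"))  # True
print(codes_agree("NL", "PER"))  # False: both codes are well-formed but mismatched
```
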
---

## Country-Specific Feature Types

### Problem: Some feature types only apply to specific countries

**Examples from FeatureTypeEnum.yaml**:

1. **CULTURAL_HERITAGE_OF_PERU** (Q16617058)
   - Description: "cultural heritage of Peru"
   - Only valid for: `country.alpha_2 = "PE"`
   - Hypernym: cultural heritage

2. **BUITENPLAATS** (Q2927789)
   - Description: "summer residence for rich townspeople in the Netherlands"
   - Only valid for: `country.alpha_2 = "NL"`
   - Hypernym: heritage site

3. **NATIONAL_MEMORIAL_OF_THE_UNITED_STATES** (Q20010800)
   - Description: "national memorial in the United States"
   - Only valid for: `country.alpha_2 = "US"`
   - Hypernym: heritage site

### Solution: Country-Conditional Enum Values

**Implementation Strategy**:

When validating CustodianPlace.has_feature_type:
```python
# custodian_place and feature_type come from the record being validated.
place_country = custodian_place.country.alpha_2

if feature_type == "CULTURAL_HERITAGE_OF_PERU":
    assert place_country == "PE", "CULTURAL_HERITAGE_OF_PERU only valid for Peru"

if feature_type == "BUITENPLAATS":
    assert place_country == "NL", "BUITENPLAATS only valid for Netherlands"
```

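
The per-type checks above grow by one `if` per restricted feature type. One possible generalisation is a lookup table, sketched below. The table contents come from the three examples in this document; `COUNTRY_RESTRICTIONS` and `check_feature_type` are hypothetical names. Two choices here are assumptions rather than decided behaviour: a restricted type on a place with no country is rejected, and failures raise `ValueError` instead of using `assert`, because assertions are skipped when Python runs with `-O`.

```python
from typing import Dict, Optional

# Hypothetical restriction table: feature type -> required ISO 3166-1 alpha-2 code.
COUNTRY_RESTRICTIONS: Dict[str, str] = {
    "CULTURAL_HERITAGE_OF_PERU": "PE",
    "BUITENPLAATS": "NL",
    "NATIONAL_MEMORIAL_OF_THE_UNITED_STATES": "US",
}


def check_feature_type(feature_type: str, place_alpha_2: Optional[str]) -> None:
    """Raise ValueError if a country-specific feature type is used for another country."""
    required = COUNTRY_RESTRICTIONS.get(feature_type)
    if required is None:
        return  # universal feature type, valid everywhere
    if place_alpha_2 != required:
        raise ValueError(
            f"{feature_type} is only valid for {required}, got {place_alpha_2!r}"
        )


check_feature_type("BUITENPLAATS", "NL")  # passes silently
check_feature_type("MUSEUM", "PE")        # universal type, passes silently
```
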
**LinkML Implementation** (future enhancement):
```yaml
FeatureTypeEnum:
  permissible_values:
    CULTURAL_HERITAGE_OF_PERU:
      meaning: wd:Q16617058
      annotations:
        country_restriction: "PE"  # Only valid for Peru

    BUITENPLAATS:
      meaning: wd:Q2927789
      annotations:
        country_restriction: "NL"  # Only valid for Netherlands
```

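
Once such annotations exist, a validator needs them as a simple lookup. The sketch below assumes the enum definition has already been parsed into nested dictionaries, as a YAML parser would return it; the `feature_type_enum` literal is a hand-written stand-in for that parsed data, and `extract_country_restrictions` is a hypothetical helper name.

```python
from typing import Any, Dict


def extract_country_restrictions(enum_def: Dict[str, Any]) -> Dict[str, str]:
    """Map each restricted permissible value to its alpha-2 country code."""
    restrictions: Dict[str, str] = {}
    for name, body in enum_def.get("permissible_values", {}).items():
        # A value with no body (None) or no annotations is treated as universal.
        annotations = (body or {}).get("annotations") or {}
        code = annotations.get("country_restriction")
        if code:
            restrictions[name] = code
    return restrictions


# Stand-in for the parsed YAML, plus one unrestricted value for contrast.
feature_type_enum = {
    "permissible_values": {
        "CULTURAL_HERITAGE_OF_PERU": {
            "meaning": "wd:Q16617058",
            "annotations": {"country_restriction": "PE"},
        },
        "BUITENPLAATS": {
            "meaning": "wd:Q2927789",
            "annotations": {"country_restriction": "NL"},
        },
        "MUSEUM": None,
    }
}

print(extract_country_restrictions(feature_type_enum))
# {'CULTURAL_HERITAGE_OF_PERU': 'PE', 'BUITENPLAATS': 'NL'}
```
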
---

## Files Modified

### Created:
1. `schemas/20251121/linkml/modules/classes/Country.yaml` (new)
2. `schemas/20251121/examples/country_integration_example.yaml` (new)

### Modified:
3. `schemas/20251121/linkml/modules/classes/CustodianPlace.yaml`
   - Added `Country` import
   - Added `country` slot
   - Added slot_usage documentation

4. `schemas/20251121/linkml/modules/classes/LegalForm.yaml`
   - Added `Country` import
   - Changed `country_code` from string to Country class reference

5. `schemas/20251121/linkml/modules/enums/FeatureTypeEnum.yaml`
   - Removed duplicate RESIDENTIAL_BUILDING (Q11755880)
   - Removed duplicate SANCTUARY (Q29553, kept Q21850178)
   - Result: 294 unique feature types

---

## Validation Results

### YAML Syntax: ✅ Valid
```
Total enum values: 294
No duplicate keys found
```

### Country Class: ✅ Minimal Design
```
- alpha_2: Required, pattern: ^[A-Z]{2}$
- alpha_3: Required, pattern: ^[A-Z]{3}$
- No other fields (names, languages, capitals excluded)
```

### Integration: ✅ Complete
- CustodianPlace → Country (optional link)
- LegalForm → Country (required link, jurisdiction-specific)
- FeatureTypeEnum → Ready for country-conditional validation

---

## Next Steps (Future Work)

### 1. Country-Conditional Enum Validation
**Task**: Implement validation rules for country-specific feature types

**Approach**:
- Add `country_restriction` annotation to FeatureTypeEnum entries
- Create LinkML validation rule to check CustodianPlace.country matches restriction
- Generate country-specific enum subsets for UI dropdowns

**Example Rule**:
```yaml
# Pseudo-syntax sketch: `value_must_match` is a placeholder for the eventual
# constraint and should be checked against the slot conditions LinkML supports.
rules:
  - title: "Country-specific feature type validation"
    preconditions:
      slot_conditions:
        has_feature_type:
          range: FeaturePlace
    postconditions:
      slot_conditions:
        country:
          value_must_match: "{has_feature_type.country_restriction}"
```

### 2. Populate Country Instances
**Task**: Create Country instances for all countries in the dataset

**Data Source**: ISO 3166-1 official list (249 entries, covering countries and dependent territories)

**Implementation**:
```yaml
# schemas/20251121/data/countries.yaml
countries:
  - id: https://nde.nl/ontology/hc/country/NL
    alpha_2: "NL"
    alpha_3: "NLD"

  - id: https://nde.nl/ontology/hc/country/PE
    alpha_2: "PE"
    alpha_3: "PER"

  # ... 247 more entries
```

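
Writing 249 entries by hand invites typos, so the list is better generated from code pairs. The sketch below builds entries in the shape shown above; it assumes the `id` is always the base URI followed by the alpha-2 code, as in the two examples, and `build_country_entries` is a hypothetical helper name. Serialising the result to YAML is left out.

```python
from typing import Dict, Iterable, List, Tuple

BASE_URI = "https://nde.nl/ontology/hc/country/"


def build_country_entries(pairs: Iterable[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Build one entry per (alpha_2, alpha_3) pair, sorted by alpha-2 code."""
    return [
        {"id": BASE_URI + alpha_2, "alpha_2": alpha_2, "alpha_3": alpha_3}
        for alpha_2, alpha_3 in sorted(pairs)
    ]


entries = build_country_entries([("PE", "PER"), ("NL", "NLD")])
for entry in entries:
    print(entry)
# {'id': 'https://nde.nl/ontology/hc/country/NL', 'alpha_2': 'NL', 'alpha_3': 'NLD'}
# {'id': 'https://nde.nl/ontology/hc/country/PE', 'alpha_2': 'PE', 'alpha_3': 'PER'}
```
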
### 3. Link LegalForm to ISO 20275 ELF Codes
**Task**: Populate LegalForm instances with ISO 20275 Entity Legal Form codes

**Data Source**: GLEIF ISO 20275 Code List (1,600+ legal forms across 150+ jurisdictions)

**Example**:
```yaml
# Netherlands legal forms
- id: https://nde.nl/ontology/hc/legal-form/nl-8888
  elf_code: "8888"
  country_code: {alpha_2: "NL", alpha_3: "NLD"}
  local_name: "Stichting"
  abbreviation: "Stg."

- id: https://nde.nl/ontology/hc/legal-form/nl-akd2
  elf_code: "AKD2"
  country_code: {alpha_2: "NL", alpha_3: "NLD"}
  local_name: "Besloten vennootschap"
  abbreviation: "B.V."
```

### 4. External Resolution Service Integration
**Task**: Provide helper functions to resolve country metadata via GeoNames API

**Implementation**:
```python
from typing import Dict

import requests


def resolve_country_metadata(alpha_2: str) -> Dict:
    """Resolve country metadata from GeoNames API."""
    url = "http://api.geonames.org/countryInfoJSON"
    params = {
        "country": alpha_2,
        "username": "your_geonames_username",  # replace with a registered account
    }
    response = requests.get(url, params=params, timeout=10)
    response.raise_for_status()  # surface HTTP errors before parsing
    data = response.json()
    country = data["geonames"][0]  # single record for the requested code

    return {
        "name_en": country["countryName"],
        "capital": country["capital"],
        "languages": country["languages"].split(","),
        "continent": country["continent"],
    }


# Usage
country_metadata = resolve_country_metadata("NL")
# Returns a dict shaped like this (exact values come from GeoNames;
# language entries are locale tags such as "nl-NL"):
# {
#   "name_en": "Netherlands",
#   "capital": "Amsterdam",
#   "languages": ["nl-NL", "fy-NL"],
#   "continent": "EU"
# }
```

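
The helper above mixes the network call with response parsing, which makes it hard to test offline. A possible split is sketched below: a pure function that takes the decoded JSON payload. It assumes, based on how GeoNames usually reports problems such as an invalid username, that errors arrive as a `status` object with a `message` field. `parse_country_info` is a hypothetical name, and the sample payload is hand-written rather than captured from the service.

```python
from typing import Any, Dict


def parse_country_info(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Extract name, capital, languages and continent from a countryInfoJSON-style payload."""
    if "status" in payload:
        # Assumed error shape: {"status": {"message": "...", "value": 10}}
        raise RuntimeError(payload["status"].get("message", "GeoNames error"))
    records = payload.get("geonames") or []
    if not records:
        raise LookupError("no country record in response")
    record = records[0]
    return {
        "name_en": record["countryName"],
        "capital": record["capital"],
        "languages": record["languages"].split(","),
        "continent": record["continent"],
    }


# Hand-written sample payload (not a captured response).
sample_payload = {
    "geonames": [
        {
            "countryName": "Netherlands",
            "capital": "Amsterdam",
            "languages": "nl-NL,fy-NL",
            "continent": "EU",
        }
    ]
}

print(parse_country_info(sample_payload)["languages"])
# ['nl-NL', 'fy-NL']
```
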
### 5. UI Dropdown Generation
**Task**: Generate country-filtered feature type dropdowns for data entry forms

**Use Case**: When user selects "Netherlands" as country, only show:
- Universal feature types (MUSEUM, CHURCH, MANSION, etc.)
- Netherlands-specific types (BUITENPLAATS)
- Exclude Peru-specific types (CULTURAL_HERITAGE_OF_PERU)

**Implementation**:
```python
from typing import List


def get_valid_feature_types(country_alpha_2: str) -> List[str]:
    """Get valid feature types for a given country."""
    # has_country_restriction() and get_country_restriction() are helpers still to be
    # written; they would read the country_restriction annotation of an enum value.
    universal_types = [ft for ft in FeatureTypeEnum if not has_country_restriction(ft)]
    country_specific = [ft for ft in FeatureTypeEnum if get_country_restriction(ft) == country_alpha_2]
    return universal_types + country_specific
```

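
Until the enum helpers exist, the filtering logic can be tried against a plain dictionary. The sketch below is self-contained under that assumption: `FEATURE_TYPES` is a hypothetical in-memory view that maps each value name to its `country_restriction` (or `None` for universal types), using only names mentioned in this document.

```python
from typing import Dict, List, Optional

# Hypothetical view of the enum: value name -> country_restriction (None = universal).
FEATURE_TYPES: Dict[str, Optional[str]] = {
    "MUSEUM": None,
    "CHURCH": None,
    "MANSION": None,
    "BUITENPLAATS": "NL",
    "CULTURAL_HERITAGE_OF_PERU": "PE",
}


def get_valid_feature_types(country_alpha_2: str) -> List[str]:
    """Return universal types followed by types restricted to the given country."""
    universal = [name for name, code in FEATURE_TYPES.items() if code is None]
    specific = [name for name, code in FEATURE_TYPES.items() if code == country_alpha_2]
    return universal + specific


print(get_valid_feature_types("NL"))
# ['MUSEUM', 'CHURCH', 'MANSION', 'BUITENPLAATS']
```
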
---

## References

- **ISO 3166-1**: https://www.iso.org/iso-3166-country-codes.html
- **GeoNames API**: https://www.geonames.org/export/web-services.html
- **UN M49**: https://unstats.un.org/unsd/methodology/m49/
- **ISO 20275**: https://www.gleif.org/en/about-lei/code-lists/iso-20275-entity-legal-forms-code-list
- **Schema.org addressCountry**: https://schema.org/addressCountry
- **Wikidata Q-numbers**: Country-specific heritage feature types

---

## Status

✅ **Country Class Implementation: COMPLETE**

- [x] Duplicate keys fixed in FeatureTypeEnum.yaml
- [x] Country class created with minimal design
- [x] CustodianPlace integrated with country linking
- [x] LegalForm integrated with country linking
- [x] Example instance file created
- [x] Documentation complete

**Ready for**: Country-conditional enum validation and LegalForm population with ISO 20275 codes.