glam/COUNTRY_RESTRICTION_QUICKSTART.md
kempersc 67657c39b6 feat: Complete Country Class Implementation and Hypernyms Removal
- Created the Country class with ISO 3166-1 alpha-2 and alpha-3 codes, ensuring minimal design without additional metadata.
- Integrated the Country class into CustodianPlace and LegalForm schemas to support country-specific feature types and legal forms.
- Removed duplicate keys in FeatureTypeEnum.yaml, resulting in 294 unique feature types.
- Eliminated "Hypernyms:" text from FeatureTypeEnum descriptions, verifying that semantic relationships are now conveyed through ontology mappings.
- Created example instance file demonstrating integration of Country with CustodianPlace and LegalForm.
- Updated documentation to reflect the completion of the Country class implementation and hypernyms removal.
2025-11-23 13:09:38 +01:00


# Country Restriction Quick Start Guide
**Goal**: Ensure country-specific feature types (like "City of Pittsburgh historic designation") are only used in the correct country.

---
## TL;DR Solution
1. **Add `dcterms:spatial` annotations** to country-specific feature types in FeatureTypeEnum
2. **Implement a Python validator** that checks that `CustodianPlace.country` matches the feature type's restriction
3. **Integrate the validator** into the data validation pipeline
---
## 3-Step Implementation
### Step 1: Annotate Country-Specific Feature Types (15 min)
Edit `schemas/20251121/linkml/modules/enums/FeatureTypeEnum.yaml`:
```yaml
permissible_values:
  CITY_OF_PITTSBURGH_HISTORIC_DESIGNATION:
    title: City of Pittsburgh historic designation
    meaning: wd:Q64960148
    annotations:
      wikidata_id: Q64960148
      dcterms:spatial: "US"  # ← ADD THIS
      spatial_note: "Pittsburgh, Pennsylvania, United States"
  CULTURAL_HERITAGE_OF_PERU:
    meaning: wd:Q16617058
    annotations:
      dcterms:spatial: "PE"  # ← ADD THIS
  BUITENPLAATS:
    meaning: wd:Q2927789
    annotations:
      dcterms:spatial: "NL"  # ← ADD THIS
  NATIONAL_MEMORIAL_OF_THE_UNITED_STATES:
    meaning: wd:Q1967454
    annotations:
      dcterms:spatial: "US"  # ← ADD THIS
  # Global feature types have NO dcterms:spatial
  MANSION:
    meaning: wd:Q1802963
    # No dcterms:spatial - can be used anywhere
```
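A mistyped annotation such as `"usa"` would create a restriction that no record can satisfy. The following is a minimal sketch of a shape check, assuming the enum file has already been parsed into a plain dict; `find_malformed_spatial` and the `HYPOTHETICAL_TYPO` entry are illustrative names, not part of the schema. It only checks that the value looks like an alpha-2 code, not that the code is actually assigned in ISO 3166-1.

```python
import re

# Two uppercase ASCII letters: the shape of an ISO 3166-1 alpha-2 code
ALPHA_2_SHAPE = re.compile(r"[A-Z]{2}")


def find_malformed_spatial(permissible_values: dict) -> dict:
    """Return {value_name: bad_code} for annotations not shaped like alpha-2."""
    malformed = {}
    for name, body in permissible_values.items():
        annotations = (body or {}).get("annotations") or {}
        code = annotations.get("dcterms:spatial")
        if code is None:
            continue  # Global feature type, nothing to check
        if not isinstance(code, str) or not ALPHA_2_SHAPE.fullmatch(code):
            malformed[name] = code
    return malformed


# Plain-dict stand-in for the parsed `permissible_values` mapping
sample = {
    "BUITENPLAATS": {"annotations": {"dcterms:spatial": "NL"}},
    "MANSION": {"meaning": "wd:Q1802963"},
    "HYPOTHETICAL_TYPO": {"annotations": {"dcterms:spatial": "usa"}},
}
print(find_malformed_spatial(sample))
```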
### Step 2: Create Validator Script (30 min)
Create `scripts/validate_country_restrictions.py`:
```python
import sys

import yaml  # PyYAML (also a dependency of linkml-runtime)
from linkml_runtime.utils.schemaview import SchemaView


def validate_country_restrictions(custodian_place_data: dict, schema_view: SchemaView):
    """Validate feature type country restrictions.

    Returns an error message string, or None if the data is valid.
    """
    # Extract spatial restrictions from enum annotations
    enum_def = schema_view.get_enum("FeatureTypeEnum")
    restrictions = {}
    for pv_name, pv in enum_def.permissible_values.items():
        if pv.annotations and "dcterms:spatial" in pv.annotations:
            restrictions[pv_name] = pv.annotations["dcterms:spatial"].value

    # Get feature type and country from data
    feature_place = custodian_place_data.get("has_feature_type")
    if not feature_place:
        return None  # No restriction if no feature type

    feature_type = feature_place.get("feature_type")
    required_country = restrictions.get(feature_type)
    if not required_country:
        return None  # No restriction for this feature type

    # Check country matches
    country = custodian_place_data.get("country", {})
    actual_country = country.get("alpha_2") if isinstance(country, dict) else country
    if actual_country != required_country:
        return f"❌ ERROR: Feature type '{feature_type}' restricted to '{required_country}', but country is '{actual_country}'"
    return None  # Valid


if __name__ == "__main__":
    # Guarded so that importing this module (Step 3) does not run the test
    schema = SchemaView("schemas/20251121/linkml/01_custodian_name.yaml")
    if len(sys.argv) > 1:
        # Validate a YAML file passed on the command line (see Quick Test)
        with open(sys.argv[1], encoding="utf-8") as f:
            test_data = yaml.safe_load(f)
    else:
        # Built-in test record
        test_data = {
            "place_name": "Lima Building",
            "country": {"alpha_2": "PE"},
            "has_feature_type": {"feature_type": "CITY_OF_PITTSBURGH_HISTORIC_DESIGNATION"},
        }
    error = validate_country_restrictions(test_data, schema)
    print(error)  # Should print error message
```
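The validator rebuilds the restriction map from the enum on every call, which adds up when a pipeline checks many records. One possible refactoring, sketched here under assumptions, builds the map once and reuses it. `build_restrictions` and `check_place` are hypothetical names, and the `SimpleNamespace` objects only imitate the attributes the validator reads from the enum definition; in real use the argument would be `schema_view.get_enum("FeatureTypeEnum")`.

```python
from types import SimpleNamespace


def build_restrictions(enum_def) -> dict:
    """Map each restricted permissible value to its required country code."""
    restrictions = {}
    for pv_name, pv in enum_def.permissible_values.items():
        annotations = pv.annotations or {}
        if "dcterms:spatial" in annotations:
            restrictions[pv_name] = annotations["dcterms:spatial"].value
    return restrictions


def check_place(place: dict, restrictions: dict):
    """Return an error message, or None when the place is valid."""
    feature = place.get("has_feature_type") or {}
    feature_type = feature.get("feature_type")
    required = restrictions.get(feature_type)
    if required is None:
        return None  # Unrestricted or missing feature type
    country = place.get("country")
    actual = country.get("alpha_2") if isinstance(country, dict) else country
    if actual != required:
        return (f"Feature type '{feature_type}' restricted to "
                f"'{required}', but country is '{actual}'")
    return None


# Stand-in for the enum definition (assumption, for illustration only)
fake_enum = SimpleNamespace(permissible_values={
    "BUITENPLAATS": SimpleNamespace(
        annotations={"dcterms:spatial": SimpleNamespace(value="NL")}),
    "MANSION": SimpleNamespace(annotations=None),
})
restrictions = build_restrictions(fake_enum)  # Built once, reused per record
print(check_place(
    {"country": {"alpha_2": "BE"},
     "has_feature_type": {"feature_type": "BUITENPLAATS"}},
    restrictions,
))
```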
### Step 3: Integrate Validator (15 min)
Add the validator to the data loading pipeline:
```python
# In your data processing script
import logging

from validate_country_restrictions import validate_country_restrictions

logger = logging.getLogger(__name__)

# `data` and `schema_view` are assumed to be defined earlier in the script
for custodian_place in data:
    error = validate_country_restrictions(custodian_place, schema_view)
    if error:
        logger.warning(error)
        # Or raise ValidationError(error) to halt processing
```
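Between warning on each record and halting on the first error there is a middle option: collect every failure and decide once the run is complete. A minimal sketch, assuming the validator is passed in as a one-argument callable (for example `lambda place: validate_country_restrictions(place, schema_view)`); `collect_errors` and `stub_validate` are hypothetical names used only for illustration.

```python
def collect_errors(records, validate):
    """Run `validate` over every record and return (index, message) pairs."""
    failures = []
    for index, record in enumerate(records):
        message = validate(record)
        if message:
            failures.append((index, message))
    return failures


def stub_validate(record):
    """Stand-in validator: accepts only records whose country is 'US'."""
    return None if record.get("country") == "US" else "country mismatch"


failures = collect_errors([{"country": "US"}, {"country": "PE"}], stub_validate)
for index, message in failures:
    print(f"record {index}: {message}")
```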
---
## Quick Test
```bash
# Create test file
cat > test_country_restriction.yaml << EOF
place_name: "Lima Historic Site"
country:
  alpha_2: "PE"
has_feature_type:
  feature_type: CITY_OF_PITTSBURGH_HISTORIC_DESIGNATION  # Should fail
EOF
# Run validator
python scripts/validate_country_restrictions.py test_country_restriction.yaml
# Expected output:
# ❌ ERROR: Feature type 'CITY_OF_PITTSBURGH_HISTORIC_DESIGNATION'
# restricted to 'US', but country is 'PE'
```
---
## Country-Specific Feature Types to Annotate
**Search for these patterns in FeatureTypeEnum.yaml**:
- `CITY_OF_PITTSBURGH_*` → `dcterms:spatial: "US"`
- `CULTURAL_HERITAGE_OF_PERU` → `dcterms:spatial: "PE"`
- `BUITENPLAATS` → `dcterms:spatial: "NL"`
- `NATIONAL_MEMORIAL_OF_THE_UNITED_STATES` → `dcterms:spatial: "US"`
- Search descriptions for: "United States", "Peru", "Netherlands", "Brazil", etc.
**Regex search**:
```bash
rg "(United States|Peru|Netherlands|Brazil|Mexico|France|Germany|India|China|Japan)" \
schemas/20251121/linkml/modules/enums/FeatureTypeEnum.yaml
```
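The same search can be sketched in Python so that each hit comes with a suggested code. This assumes the enum has been parsed into a plain dict and that entries carry `title` and/or `description` text; `suggest_annotations` is a hypothetical helper, and the sample description is invented for illustration. A country name in a description does not prove that the type is restricted, so the output is a list of candidates for manual review.

```python
# Country names from the search above, mapped to ISO 3166-1 alpha-2 codes
COUNTRY_NAMES = {
    "United States": "US", "Peru": "PE", "Netherlands": "NL",
    "Brazil": "BR", "Mexico": "MX", "France": "FR", "Germany": "DE",
    "India": "IN", "China": "CN", "Japan": "JP",
}


def suggest_annotations(permissible_values: dict) -> dict:
    """Return {value_name: [candidate codes]} for entries not yet annotated."""
    suggestions = {}
    for name, body in permissible_values.items():
        body = body or {}
        if "dcterms:spatial" in (body.get("annotations") or {}):
            continue  # Already annotated
        text = f"{body.get('title', '')} {body.get('description', '')}"
        codes = sorted({code for country, code in COUNTRY_NAMES.items()
                        if country in text})
        if codes:
            suggestions[name] = codes
    return suggestions


# Invented sample entries (descriptions are not taken from the real enum)
sample = {
    "NATIONAL_MEMORIAL_OF_THE_UNITED_STATES": {
        "description": "memorial designated in the United States"},
    "BUITENPLAATS": {"annotations": {"dcterms:spatial": "NL"}},
    "MANSION": {"description": "large dwelling house"},
}
print(suggest_annotations(sample))
```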
---
## Why This Approach?
- **Ontology-aligned**: Uses the Dublin Core (DCMI Metadata Terms) `dcterms:spatial` property
- **Non-invasive**: No schema restructuring needed
- **Maintainable**: Add the annotation to restrict a feature type, remove it to lift the restriction
- **Flexible**: Easy to extend to other restrictions (temporal, etc.)
---
## FAQ
**Q: What if a feature type doesn't have `dcterms:spatial`?**
A: It's globally applicable (can be used in any country).
**Q: Can a feature type apply to multiple countries?**
A: Not with the current design. For multi-country restrictions, use a list:
```yaml
annotations:
  dcterms:spatial: ["US", "CA"]  # List format
```
Then update the validator to check `if actual_country in required_countries`.
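
A minimal sketch of that membership check, written so that the existing single-code annotations keep working alongside lists. `normalize_required` and `country_allowed` are hypothetical names. How a list-valued annotation is returned by LinkML's `.value` is not verified here, so the sketch operates on plain Python values only.

```python
def normalize_required(required):
    """Turn a single code or a list of codes into a frozenset (None = no restriction)."""
    if required is None:
        return None
    if isinstance(required, str):
        return frozenset({required})
    return frozenset(required)


def country_allowed(actual_country, required) -> bool:
    """True when the feature type is unrestricted or the country is permitted."""
    allowed = normalize_required(required)
    return allowed is None or actual_country in allowed


print(country_allowed("CA", ["US", "CA"]))  # List annotation
print(country_allowed("PE", "US"))          # Single-code annotation
print(country_allowed("PE", None))          # No annotation
```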
**Q: What about regions (e.g., "European Union")?**
A: Use ISO 3166-1 alpha-2 codes only. For regional restrictions, list all country codes.
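
Writing out every member of a region by hand is error-prone, so a small authoring-time helper could expand a named group into its country codes before they are pasted into the annotation. This is only a sketch: `REGION_MEMBERS` and `expand_spatial` are hypothetical, the Benelux group is just an example, and nothing in the schema defines region names.

```python
# Hypothetical lookup of region names to ISO 3166-1 alpha-2 member codes
REGION_MEMBERS = {
    "BENELUX": ("BE", "LU", "NL"),
}


def expand_spatial(values) -> list:
    """Replace region names with their member codes; pass other codes through."""
    expanded = set()
    for value in values:
        expanded.update(REGION_MEMBERS.get(value, (value,)))
    return sorted(expanded)


print(expand_spatial(["BENELUX", "FR"]))  # ['BE', 'FR', 'LU', 'NL']
```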
**Q: When is `CustodianPlace.country` required?**
A: Only when `has_feature_type` uses a country-restricted enum value.

---
## Complete Documentation
See `COUNTRY_RESTRICTION_IMPLEMENTATION.md` for:
- Full ontology property analysis
- Alternative approaches considered
- Detailed implementation steps
- Python validator code with tests
---
**Status**: Ready to implement
**Time**: ~1 hour total
**Priority**: Medium (validation enhancement, not blocking)