# Country Restriction Quick Start Guide

**Goal**: Ensure country-specific feature types (like "City of Pittsburgh historic designation") are only used in the correct country.

---

## TL;DR Solution

1. **Add `dcterms:spatial` annotations** to country-specific feature types in `FeatureTypeEnum`
2. **Implement a Python validator** that checks `CustodianPlace.country` against the feature type's restriction
3. **Integrate the validator** into the data validation pipeline

---

## 3-Step Implementation

### Step 1: Annotate Country-Specific Feature Types (15 min)

Edit `schemas/20251121/linkml/modules/enums/FeatureTypeEnum.yaml`:

```yaml
permissible_values:
  CITY_OF_PITTSBURGH_HISTORIC_DESIGNATION:
    title: City of Pittsburgh historic designation
    meaning: wd:Q64960148
    annotations:
      wikidata_id: Q64960148
      dcterms:spatial: "US"  # ← ADD THIS
      spatial_note: "Pittsburgh, Pennsylvania, United States"

  CULTURAL_HERITAGE_OF_PERU:
    meaning: wd:Q16617058
    annotations:
      dcterms:spatial: "PE"  # ← ADD THIS

  BUITENPLAATS:
    meaning: wd:Q2927789
    annotations:
      dcterms:spatial: "NL"  # ← ADD THIS

  NATIONAL_MEMORIAL_OF_THE_UNITED_STATES:
    meaning: wd:Q1967454
    annotations:
      dcterms:spatial: "US"  # ← ADD THIS

  # Global feature types have NO dcterms:spatial
  MANSION:
    meaning: wd:Q1802963
    # No dcterms:spatial - can be used anywhere
```

### Step 2: Create Validator Script (30 min)

Create `scripts/validate_country_restrictions.py`:

```python
from linkml_runtime.utils.schemaview import SchemaView


def validate_country_restrictions(custodian_place_data: dict, schema_view: SchemaView):
    """Validate feature type country restrictions.

    Returns an error message string, or None if the data is valid.
    """
    # Extract spatial restrictions from enum annotations
    enum_def = schema_view.get_enum("FeatureTypeEnum")
    restrictions = {}
    for pv_name, pv in enum_def.permissible_values.items():
        if pv.annotations and "dcterms:spatial" in pv.annotations:
            restrictions[pv_name] = pv.annotations["dcterms:spatial"].value

    # Get feature type and country from data
    feature_place = custodian_place_data.get("has_feature_type")
    if not feature_place:
        return None  # No restriction if no feature type

    feature_type = feature_place.get("feature_type")
    required_country = restrictions.get(feature_type)
    if not required_country:
        return None  # No restriction for this feature type

    # Check country matches (country may be a dict with alpha_2 or a bare code)
    country = custodian_place_data.get("country", {})
    actual_country = country.get("alpha_2") if isinstance(country, dict) else country
    if actual_country != required_country:
        return (
            f"❌ ERROR: Feature type '{feature_type}' restricted to "
            f"'{required_country}', but country is '{actual_country}'"
        )
    return None  # Valid


# Test: guarded so that importing this module (Step 3) does not run it
if __name__ == "__main__":
    import sys

    import yaml

    schema = SchemaView("schemas/20251121/linkml/01_custodian_name.yaml")
    if len(sys.argv) > 1:
        # Validate a YAML file passed on the command line (see Quick Test)
        with open(sys.argv[1], encoding="utf-8") as f:
            test_data = yaml.safe_load(f)
    else:
        test_data = {
            "place_name": "Lima Building",
            "country": {"alpha_2": "PE"},
            "has_feature_type": {"feature_type": "CITY_OF_PITTSBURGH_HISTORIC_DESIGNATION"},
        }
    error = validate_country_restrictions(test_data, schema)
    print(error)  # Should print error message
```

### Step 3: Integrate Validator (15 min)

Add to the data loading pipeline:

```python
# In your data processing script
import logging

from linkml_runtime.utils.schemaview import SchemaView
from validate_country_restrictions import validate_country_restrictions

logger = logging.getLogger(__name__)
schema_view = SchemaView("schemas/20251121/linkml/01_custodian_name.yaml")

# `data` is your pipeline's iterable of CustodianPlace dicts
for custodian_place in data:
    error = validate_country_restrictions(custodian_place, schema_view)
    if error:
        logger.warning(error)
        # Or raise ValidationError(error) to halt processing
```

---

## Quick Test

```bash
# Create test file
cat > test_country_restriction.yaml << EOF
place_name: "Lima Historic Site"
country:
  alpha_2: "PE"
has_feature_type:
  feature_type: CITY_OF_PITTSBURGH_HISTORIC_DESIGNATION  # Should fail
EOF

# Run validator
python scripts/validate_country_restrictions.py test_country_restriction.yaml

# Expected output:
# ❌ ERROR: Feature type 'CITY_OF_PITTSBURGH_HISTORIC_DESIGNATION'
#    restricted to 'US', but country is 'PE'
```

---

## Country-Specific Feature Types to Annotate

**Search for these patterns in FeatureTypeEnum.yaml**:

- `CITY_OF_PITTSBURGH_*` → `dcterms:spatial: "US"`
- `CULTURAL_HERITAGE_OF_PERU` → `dcterms:spatial: "PE"`
- `BUITENPLAATS` → `dcterms:spatial: "NL"`
- `NATIONAL_MEMORIAL_OF_THE_UNITED_STATES` → `dcterms:spatial: "US"`
- Search descriptions for: "United States", "Peru", "Netherlands", "Brazil", etc.

**Regex search**:

```bash
rg "(United States|Peru|Netherlands|Brazil|Mexico|France|Germany|India|China|Japan)" \
  schemas/20251121/linkml/modules/enums/FeatureTypeEnum.yaml
```

---

## Why This Approach?

- ✅ **Ontology-aligned**: Uses the Dublin Core (DCMI) `dcterms:spatial` property
- ✅ **Non-invasive**: No schema restructuring needed
- ✅ **Maintainable**: Add the annotation to restrict a feature type, remove it to lift the restriction
- ✅ **Flexible**: Easy to extend to other restrictions (temporal, etc.)

---

## FAQ

**Q: What if a feature type doesn't have `dcterms:spatial`?**
A: It's globally applicable (can be used in any country).

**Q: Can a feature type apply to multiple countries?**
A: Not with the current design. For multi-country restrictions, use:

```yaml
annotations:
  dcterms:spatial: ["US", "CA"]  # List format
```

And update the validator to check `if actual_country in required_countries`.

**Q: What about regions (e.g., "European Union")?**
A: Use ISO 3166-1 alpha-2 codes only. For regional restrictions, list all country codes.

**Q: When is `CustodianPlace.country` required?**
A: Only when `has_feature_type` uses a country-restricted enum value.

---

## Complete Documentation

See `COUNTRY_RESTRICTION_IMPLEMENTATION.md` for:

- Full ontology property analysis
- Alternative approaches considered
- Detailed implementation steps
- Python validator code with tests

---

**Status**: Ready to implement
**Time**: ~1 hour total
**Priority**: Medium (validation enhancement, not blocking)
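
## Sketch: Multi-Country Check

The FAQ notes that multi-country restrictions require the validator to test membership rather than equality. The following is a minimal sketch of that comparison only, not the project's implementation. It assumes the restrictions have already been extracted into a plain dict whose values are either a single alpha-2 code or a list of codes. The function name `check_country` and the enum value `HYPOTHETICAL_CROSS_BORDER_TYPE` are illustrative and do not exist in the schema. How a list-valued annotation actually surfaces through `SchemaView` has not been verified here, so the extraction step may need adjusting.

```python
from typing import Dict, List, Optional, Union

# Restriction values: one ISO 3166-1 alpha-2 code, or a list of them (assumed shape)
Restriction = Union[str, List[str]]


def check_country(
    feature_type: str,
    actual_country: Optional[str],
    restrictions: Dict[str, Restriction],
) -> Optional[str]:
    """Return an error message if the country is not allowed, else None."""
    allowed = restrictions.get(feature_type)
    if not allowed:
        return None  # No dcterms:spatial annotation: globally applicable

    # Normalize the single-code form to a list so both forms share one check
    allowed_codes = [allowed] if isinstance(allowed, str) else list(allowed)
    if actual_country in allowed_codes:
        return None  # Valid

    return (
        f"Feature type '{feature_type}' restricted to "
        f"{sorted(allowed_codes)}, but country is '{actual_country}'"
    )


# Example restrictions; the second key is hypothetical, for illustration only
example_restrictions: Dict[str, Restriction] = {
    "CITY_OF_PITTSBURGH_HISTORIC_DESIGNATION": "US",
    "HYPOTHETICAL_CROSS_BORDER_TYPE": ["US", "CA"],
}

print(check_country("HYPOTHETICAL_CROSS_BORDER_TYPE", "CA", example_restrictions))
print(check_country("HYPOTHETICAL_CROSS_BORDER_TYPE", "MX", example_restrictions))
```

Normalizing to a list keeps existing single-code annotations working unchanged, so the list format can be introduced one feature type at a time.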