- Created the Country class with ISO 3166-1 alpha-2 and alpha-3 codes, ensuring minimal design without additional metadata.
- Integrated the Country class into CustodianPlace and LegalForm schemas to support country-specific feature types and legal forms.
- Removed duplicate keys in FeatureTypeEnum.yaml, resulting in 294 unique feature types.
- Eliminated "Hypernyms:" text from FeatureTypeEnum descriptions, verifying that semantic relationships are now conveyed through ontology mappings.
- Created example instance file demonstrating integration of Country with CustodianPlace and LegalForm.
- Updated documentation to reflect the completion of the Country class implementation and hypernyms removal.
Country Restriction Implementation for FeatureTypeEnum
Date: 2025-11-22
Status: Implementation Plan
Related Files:
- schemas/20251121/linkml/modules/enums/FeatureTypeEnum.yaml
- schemas/20251121/linkml/modules/classes/CustodianPlace.yaml
- schemas/20251121/linkml/modules/classes/FeaturePlace.yaml
- schemas/20251121/linkml/modules/classes/Country.yaml
Problem Statement
Some feature types in FeatureTypeEnum are country-specific and should only be used when the CustodianPlace.country matches a specific jurisdiction:
Examples:
- CITY_OF_PITTSBURGH_HISTORIC_DESIGNATION (Q64960148) - US only (Pittsburgh, Pennsylvania)
- CULTURAL_HERITAGE_OF_PERU (Q16617058) - Peru only
- BUITENPLAATS (Q2927789) - Netherlands only (Dutch country estates)
- NATIONAL_MEMORIAL_OF_THE_UNITED_STATES (Q1967454) - US only
Current Issue: No validation mechanism enforces country restrictions on feature type usage.
Ontology Properties for Jurisdiction
1. Dublin Core Terms - dcterms:spatial ✅ RECOMMENDED
Property: dcterms:spatial
Definition: "The spatial or temporal topic of the resource, spatial applicability of the resource, or jurisdiction under which the resource is relevant."
Source: data/ontology/dublin_core_elements.rdf
<dcterms:spatial>
  rdfs:comment "The spatial or temporal topic of the resource, spatial applicability
    of the resource, or jurisdiction under which the resource is relevant."@en
  dcterms:description "Spatial topic and spatial applicability may be a named place or
    a location specified by its geographic coordinates. ...
    A jurisdiction may be a named administrative entity or a geographic
    place to which the resource applies."@en
</dcterms:spatial>
Why this is perfect:
- ✅ Explicitly covers "jurisdiction under which the resource is relevant"
- ✅ Allows both named places and ISO country codes
- ✅ DCMI standard (Dublin Core Metadata Terms), widely adopted
- ✅ Already used in DBpedia for HistoricalPeriod → Place relationships
Example usage:
CITY_OF_PITTSBURGH_HISTORIC_DESIGNATION:
  meaning: wd:Q64960148
  annotations:
    dcterms:spatial: "US"  # ISO 3166-1 alpha-2 code
2. RiC-O - rico:hasOrHadJurisdiction (Alternative)
Property: rico:hasOrHadJurisdiction
Inverse: rico:isOrWasJurisdictionOf
Domain: rico:Agent (organizations)
Range: rico:Place
Source: data/ontology/RiC-O_1-1.rdf
<rico:hasOrHadJurisdiction>
  rdfs:subPropertyOf rico:isAgentAssociatedWithPlace
  owl:inverseOf rico:isOrWasJurisdictionOf
  rdfs:domain rico:Agent
  rdfs:range rico:Place
  rdfs:comment "Inverse of 'is or was jurisdiction of' object relation"@en
</rico:hasOrHadJurisdiction>
Why this is less suitable:
- ⚠️ Designed for organizational jurisdiction (which organization has authority over which place)
- ⚠️ Not designed for feature type geographic applicability
- ⚠️ Domain is Agent, not Feature or EnumValue
Conclusion: Use RiC-O for organizational jurisdiction (e.g., "Netherlands National Archives has jurisdiction over Noord-Holland"), NOT for feature type restrictions.
3. Schema.org - schema:addressCountry ✅ ALREADY USED
Property: schema:addressCountry
Range: schema:Country or ISO 3166-1 alpha-2 code
Current usage: Already mapped in CustodianPlace.country:
country:
  slot_uri: schema:addressCountry
  range: Country
Why this works for validation:
- ✅ CustodianPlace.country already uses ISO 3166-1 codes
- ✅ Can cross-reference with dcterms:spatial in FeatureTypeEnum
- ✅ Validation rule: "If feature_type.spatial annotation exists, CustodianPlace.country MUST match"
LinkML Implementation Strategy
Approach 1: Annotations + Custom Validation Rules ✅ RECOMMENDED
Rationale: LinkML doesn't have built-in "enum value → class field" conditional validation, so we:
- Add dcterms:spatial annotations to country-specific enum values
- Implement custom validation rules at the CustodianPlace class level
Step 1: Add dcterms:spatial Annotations to FeatureTypeEnum
# schemas/20251121/linkml/modules/enums/FeatureTypeEnum.yaml
enums:
  FeatureTypeEnum:
    permissible_values:
      CITY_OF_PITTSBURGH_HISTORIC_DESIGNATION:
        title: City of Pittsburgh historic designation
        meaning: wd:Q64960148
        annotations:
          wikidata_id: Q64960148
          dcterms:spatial: "US"  # ← NEW: Country restriction
          spatial_note: "Pittsburgh, Pennsylvania, United States"
      CULTURAL_HERITAGE_OF_PERU:
        title: cultural heritage of Peru
        meaning: wd:Q16617058
        annotations:
          wikidata_id: Q16617058
          dcterms:spatial: "PE"  # ← NEW: Country restriction
      BUITENPLAATS:
        title: buitenplaats
        meaning: wd:Q2927789
        annotations:
          wikidata_id: Q2927789
          dcterms:spatial: "NL"  # ← NEW: Country restriction
      NATIONAL_MEMORIAL_OF_THE_UNITED_STATES:
        title: National Memorial of the United States
        meaning: wd:Q1967454
        annotations:
          wikidata_id: Q1967454
          dcterms:spatial: "US"  # ← NEW: Country restriction
      # Global feature types have NO dcterms:spatial annotation
      MANSION:
        title: mansion
        meaning: wd:Q1802963
        annotations:
          wikidata_id: Q1802963
          # NO dcterms:spatial - applicable globally
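A mistyped annotation value (for example "USA" or "nl") would never match CustodianPlace.country, so it may be worth checking the values before relying on them. The following is a minimal sketch, assuming the enum has already been parsed into a plain dictionary (for instance by a YAML loader). The helper name find_malformed_spatial_codes and the inline sample data are hypothetical and only mirror the structure shown above. It checks the two-uppercase-letter shape only, not whether the code is actually assigned in ISO 3166-1.

```python
import re
from typing import Dict, List

# Shape of an ISO 3166-1 alpha-2 code: exactly two uppercase ASCII letters.
# This verifies the format only; it does not confirm the code is assigned.
ALPHA_2_PATTERN = re.compile(r"[A-Z]{2}")


def find_malformed_spatial_codes(permissible_values: Dict[str, dict]) -> List[str]:
    """Return names of enum values whose dcterms:spatial is not alpha-2 shaped."""
    malformed = []
    for name, definition in permissible_values.items():
        annotations = definition.get("annotations") or {}
        if "dcterms:spatial" not in annotations:
            continue  # No annotation: the feature type is globally applicable
        code = annotations["dcterms:spatial"]
        if not isinstance(code, str) or not ALPHA_2_PATTERN.fullmatch(code):
            malformed.append(name)
    return malformed


# Hypothetical sample data mirroring the permissible_values structure above
sample = {
    "BUITENPLAATS": {"annotations": {"dcterms:spatial": "NL"}},
    "MANSION": {"annotations": {"wikidata_id": "Q1802963"}},
    # Deliberately wrong: a country name instead of an alpha-2 code
    "CULTURAL_HERITAGE_OF_PERU": {"annotations": {"dcterms:spatial": "Peru"}},
}
print(find_malformed_spatial_codes(sample))
```

Only the deliberately wrong entry is reported. Values without the annotation are skipped, matching the convention that absence means "applicable globally".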
Step 2: Add Validation Rules to CustodianPlace Class
# schemas/20251121/linkml/modules/classes/CustodianPlace.yaml
classes:
  CustodianPlace:
    class_uri: crm:E53_Place
    slots:
      - place_name
      - country
      - has_feature_type
      # ... other slots
    rules:
      - title: "Feature type country restriction validation"
        description: >-
          If a feature type has a dcterms:spatial annotation (country restriction),
          then the CustodianPlace.country MUST match that restriction.
          Examples:
          - CITY_OF_PITTSBURGH_HISTORIC_DESIGNATION requires country.alpha_2 = "US"
          - CULTURAL_HERITAGE_OF_PERU requires country.alpha_2 = "PE"
          - BUITENPLAATS requires country.alpha_2 = "NL"
          Feature types WITHOUT dcterms:spatial are applicable globally.
        preconditions:
          slot_conditions:
            has_feature_type:
              # If has_feature_type is populated
              required: true
            country:
              # And country is populated
              required: true
        postconditions:
          # CUSTOM VALIDATION (requires external validator)
          description: >-
            Validate that if has_feature_type.feature_type enum value has
            a dcterms:spatial annotation, then country.alpha_2 MUST equal
            that annotation value.
            Pseudocode:
              feature_enum_value = has_feature_type.feature_type
              spatial_restriction = enum_annotations[feature_enum_value]['dcterms:spatial']
              if spatial_restriction is not None:
                assert country.alpha_2 == spatial_restriction, \
                  f"Feature type {feature_enum_value} restricted to {spatial_restriction}, \
                  but CustodianPlace country is {country.alpha_2}"
Limitation: LinkML's rules block cannot directly access enum annotations. We need a custom Python validator.
Approach 2: Python Custom Validator ✅ IMPLEMENTATION REQUIRED
Since LinkML rules can't access enum annotations, implement a post-validation Python script:
# scripts/validate_country_restrictions.py
from typing import Dict, Optional

from linkml_runtime.utils.schemaview import SchemaView


def load_feature_type_spatial_restrictions(schema_view: SchemaView) -> Dict[str, str]:
    """
    Extract dcterms:spatial annotations from FeatureTypeEnum permissible values.

    Returns:
        Dict mapping feature type enum key → ISO 3166-1 alpha-2 country code
        Example: {"CITY_OF_PITTSBURGH_HISTORIC_DESIGNATION": "US", ...}
    """
    restrictions = {}
    enum_def = schema_view.get_enum("FeatureTypeEnum")
    for pv_name, pv in enum_def.permissible_values.items():
        if pv.annotations and "dcterms:spatial" in pv.annotations:
            restrictions[pv_name] = pv.annotations["dcterms:spatial"].value
    return restrictions


def validate_custodian_place_country_restrictions(
    custodian_place_data: dict,
    spatial_restrictions: Dict[str, str]
) -> Optional[str]:
    """
    Validate that feature types with country restrictions match CustodianPlace.country.

    Returns:
        None if valid, error message string if invalid
    """
    # Extract feature type and country
    feature_place = custodian_place_data.get("has_feature_type")
    if not feature_place:
        return None  # No feature type, no restriction

    feature_type_enum = feature_place.get("feature_type")
    if not feature_type_enum:
        return None

    # Check if this feature type has a country restriction
    required_country = spatial_restrictions.get(feature_type_enum)
    if not required_country:
        return None  # No restriction, globally applicable

    # Get actual country
    country = custodian_place_data.get("country")
    if not country:
        return f"Feature type '{feature_type_enum}' requires country='{required_country}', but no country specified"

    # Validate country matches (country may be a Country object or a bare code)
    actual_country = country.get("alpha_2") if isinstance(country, dict) else country
    if actual_country != required_country:
        return (
            f"Feature type '{feature_type_enum}' restricted to country '{required_country}', "
            f"but CustodianPlace.country='{actual_country}'"
        )
    return None  # Valid


# Example usage
if __name__ == "__main__":
    schema_view = SchemaView("schemas/20251121/linkml/01_custodian_name.yaml")
    restrictions = load_feature_type_spatial_restrictions(schema_view)

    # Test case 1: Invalid (Pittsburgh designation in Peru)
    invalid_data = {
        "place_name": "Lima Historic Building",
        "country": {"alpha_2": "PE"},
        "has_feature_type": {
            "feature_type": "CITY_OF_PITTSBURGH_HISTORIC_DESIGNATION"
        }
    }
    error = validate_custodian_place_country_restrictions(invalid_data, restrictions)
    assert error is not None, "Should detect country mismatch"
    print(f"❌ Validation error: {error}")

    # Test case 2: Valid (Pittsburgh designation in US)
    valid_data = {
        "place_name": "Pittsburgh Historic Building",
        "country": {"alpha_2": "US"},
        "has_feature_type": {
            "feature_type": "CITY_OF_PITTSBURGH_HISTORIC_DESIGNATION"
        }
    }
    error = validate_custodian_place_country_restrictions(valid_data, restrictions)
    assert error is None, "Should pass validation"
    print("✅ Valid: Pittsburgh designation in US")

    # Test case 3: Valid (MANSION has no restriction, can be anywhere)
    global_data = {
        "place_name": "Mansion in France",
        "country": {"alpha_2": "FR"},
        "has_feature_type": {
            "feature_type": "MANSION"
        }
    }
    error = validate_custodian_place_country_restrictions(global_data, restrictions)
    assert error is None, "Should pass validation (global feature type)"
    print("✅ Valid: MANSION (global feature type) in France")
Implementation Checklist
Phase 1: Schema Annotations ✅ START HERE
- Identify all country-specific feature types in FeatureTypeEnum.yaml
  - Search Wikidata descriptions for country names
  - Examples: "City of Pittsburgh", "cultural heritage of Peru", "buitenplaats"
  - Use a regex such as /(United States|Peru|Netherlands|Brazil|Mexico|France|Germany)/i, extending the alternation with further country names as needed
- Add dcterms:spatial annotations to country-specific enum values
  - Format: dcterms:spatial: "US" (ISO 3166-1 alpha-2)
  - Add spatial_note for human readability: "Pittsburgh, Pennsylvania, United States"
- Document annotation semantics in FeatureTypeEnum header
  # Annotations:
  #   dcterms:spatial - Country restriction (ISO 3166-1 alpha-2 code)
  #     If present, feature type only applicable in specified country
  #     If absent, feature type is globally applicable
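The regex-based audit in the first Phase 1 item could be sketched roughly as follows. This is a minimal illustration, not the project's tooling. The lookup table COUNTRY_NAME_TO_ALPHA_2 and the helpers suggest_country_code and audit_titles are hypothetical names, and the sample titles are copied from the enum examples above.

```python
import re
from typing import Dict, Optional

# Hypothetical lookup from a country name, as it might appear in a title or
# description, to its ISO 3166-1 alpha-2 code. Extend as the audit proceeds.
COUNTRY_NAME_TO_ALPHA_2 = {
    "United States": "US",
    "Peru": "PE",
    "Netherlands": "NL",
    "Brazil": "BR",
    "Mexico": "MX",
    "France": "FR",
    "Germany": "DE",
}
_CODE_BY_LOWER_NAME = {name.lower(): code for name, code in COUNTRY_NAME_TO_ALPHA_2.items()}

# Case-insensitive alternation over the known country names, on word boundaries
COUNTRY_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in COUNTRY_NAME_TO_ALPHA_2) + r")\b",
    re.IGNORECASE,
)


def suggest_country_code(text: str) -> Optional[str]:
    """Return the alpha-2 code of the first country name found in text, if any."""
    match = COUNTRY_PATTERN.search(text)
    if match is None:
        return None
    return _CODE_BY_LOWER_NAME[match.group(1).lower()]


def audit_titles(titles: Dict[str, str]) -> Dict[str, str]:
    """Map each enum key whose title names a country to a suggested code."""
    suggestions = {}
    for key, title in titles.items():
        code = suggest_country_code(title)
        if code is not None:
            suggestions[key] = code
    return suggestions


titles = {
    "CULTURAL_HERITAGE_OF_PERU": "cultural heritage of Peru",
    "NATIONAL_MEMORIAL_OF_THE_UNITED_STATES": "National Memorial of the United States",
    "BUITENPLAATS": "buitenplaats",
    "MANSION": "mansion",
}
print(audit_titles(titles))
```

With this sample, only the Peru and United States entries are suggested. BUITENPLAATS is country-specific but names no country in its title, so a name-based search is a starting point for the audit and still needs manual review.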
Phase 2: Custom Validator Implementation
- Create validation script scripts/validate_country_restrictions.py
  - Implement load_feature_type_spatial_restrictions()
  - Implement validate_custodian_place_country_restrictions()
  - Add comprehensive test cases
- Integrate with LinkML validation workflow
  - Add to linkml-validate post-validation step
  - Or create standalone validate-country-restrictions CLI command
- Add validation tests to test suite
  - Test country-restricted feature types
  - Test global feature types (no restriction)
  - Test missing country field
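One possible shape for the standalone validate-country-restrictions command is sketched below. It is an assumption-laden illustration. It presumes the restrictions have been exported to a JSON file mapping feature type to alpha-2 code and that the records are a JSON list of CustodianPlace dictionaries; neither file format is defined by the plan. check_record is a hypothetical, compact restatement of the rule implemented by validate_custodian_place_country_restrictions(), and main returns an exit status so the command can fail a CI job.

```python
import argparse
import json
from typing import Dict, List, Optional


def check_record(record: dict, restrictions: Dict[str, str]) -> Optional[str]:
    """Return an error message if the record violates a country restriction."""
    feature_type = (record.get("has_feature_type") or {}).get("feature_type")
    required = restrictions.get(feature_type)
    if required is None:
        return None  # No feature type, or a globally applicable one
    country = record.get("country")
    # country may be a nested Country object or a bare alpha-2 string
    actual = country.get("alpha_2") if isinstance(country, dict) else country
    if actual != required:
        return f"{feature_type} is restricted to {required!r}, but country is {actual!r}"
    return None


def main(argv: List[str]) -> int:
    """Validate every record in a file; return 0 if all pass, 1 otherwise."""
    parser = argparse.ArgumentParser(prog="validate-country-restrictions")
    # Both file formats are assumptions made for this sketch
    parser.add_argument("restrictions", help="JSON file: feature type -> alpha-2 code")
    parser.add_argument("records", help="JSON file: list of CustodianPlace records")
    args = parser.parse_args(argv)

    with open(args.restrictions, encoding="utf-8") as handle:
        restrictions = json.load(handle)
    with open(args.records, encoding="utf-8") as handle:
        records = json.load(handle)

    errors = []
    for index, record in enumerate(records):
        message = check_record(record, restrictions)
        if message is not None:
            errors.append(f"record {index}: {message}")
    for line in errors:
        print(line)
    return 1 if errors else 0
```

To use it as a script, the entry point would call sys.exit(main(sys.argv[1:])). A record that uses a restricted feature type without any country is reported as an error, which covers the "missing country field" test case listed above.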
Phase 3: Documentation
- Update CustodianPlace documentation
  - Explain country field is required when using country-specific feature types
  - Link to FeatureTypeEnum country restriction annotations
- Update FeaturePlace documentation
  - Explain feature type country restrictions
  - Provide examples of restricted vs. global feature types
- Create VALIDATION.md guide
  - Document validation workflow
  - Provide troubleshooting guide for country restriction errors
Alternative Approaches (Not Recommended)
❌ Approach: Split FeatureTypeEnum by Country
Create separate enums: FeatureTypeEnum_US, FeatureTypeEnum_NL, etc.
Why not:
- Duplicates global feature types (MANSION exists in every country enum)
- Breaks DRY principle
- Hard to maintain (294 feature types → 294 × N countries)
- Loses semantic clarity
❌ Approach: Create Country-Specific Subclasses of CustodianPlace
Create CustodianPlace_US, CustodianPlace_NL, etc., each with restricted enum ranges.
Why not:
- Explosion of subclasses (one per country)
- Type polymorphism issues
- Hard to extend to new countries
- Violates Open/Closed Principle
❌ Approach: Use LinkML any_of Conditional Range
has_feature_type:
  range: FeaturePlace
  any_of:
    - country.alpha_2 = "US" → feature_type in [PITTSBURGH_DESIGNATION, NATIONAL_MEMORIAL, ...]
    - country.alpha_2 = "PE" → feature_type in [CULTURAL_HERITAGE_OF_PERU, ...]
Why not:
- LinkML any_of doesn't support cross-slot conditionals
- Would require massive any_of block for every country
- Unreadable and unmaintainable
Rationale for Chosen Approach
Why Annotations + Custom Validator?
✅ Separation of Concerns:
- Schema defines what (data structure)
- Annotations define metadata (country restrictions)
- Validator enforces constraints (business rules)
✅ Maintainability:
- Add new country-specific feature type: Just add annotation
- Change restriction: Update annotation, validator logic unchanged
✅ Flexibility:
- Easy to extend with other restrictions (e.g., dcterms:temporal for time periods)
- Custom validators can implement complex logic
✅ Ontology Alignment:
- dcterms:spatial is a DCMI standard property
- Aligns with DBpedia and Schema.org spatial semantics
✅ Backward Compatibility:
- Existing global feature types unaffected (no annotation = no restriction)
- Gradual migration: Add annotations incrementally
Next Steps
- Run ontology property search to confirm dcterms:spatial is best choice
- Audit FeatureTypeEnum to identify all country-specific values
- Add annotations to schema
- Implement Python validator
- Integrate into CI/CD validation pipeline
References
Ontology Documentation
- Dublin Core Terms: data/ontology/dublin_core_elements.rdf
  - dcterms:spatial - Geographic/jurisdictional applicability
- RiC-O: data/ontology/RiC-O_1-1.rdf
  - rico:hasOrHadJurisdiction - Organizational jurisdiction
- Schema.org: data/ontology/schemaorg.owl
  - schema:addressCountry - ISO 3166-1 country codes
LinkML Documentation
- Constraints and Rules: https://linkml.io/linkml/schemas/constraints.html
- Advanced Features: https://linkml.io/linkml/schemas/advanced.html
- Conditional Validation Examples: https://linkml.io/linkml/faq/modeling.html#conditional-slot-ranges
Related Files
- schemas/20251121/linkml/modules/enums/FeatureTypeEnum.yaml - Feature type definitions
- schemas/20251121/linkml/modules/classes/CustodianPlace.yaml - Place class with country field
- schemas/20251121/linkml/modules/classes/FeaturePlace.yaml - Feature type classifier
- schemas/20251121/linkml/modules/classes/Country.yaml - ISO 3166-1 country codes
- AGENTS.md - Agent instructions (Rule 1: Ontology Files Are Your Primary Reference)
Status: Ready for implementation
Priority: Medium (nice-to-have validation, not blocking)
Estimated Effort: 4-6 hours (annotation audit + validator + tests)