- Created SHACL shapes for validating temporal consistency and bidirectional relationships in custodial collections and staff observations. - Implemented a Python script to validate RDF data against the defined SHACL shapes using the pyshacl library. - Added command-line interface for validation with options for specifying data formats and output reports. - Included detailed error handling and reporting for validation results.
481 lines
15 KiB
Markdown
481 lines
15 KiB
Markdown
# FeaturePlace Implementation - Complete
|
|
|
|
**Date**: 2025-11-22
|
|
**Status**: ✅ Complete
|
|
**Files Created**: 2
|
|
**Files Modified**: 1
|
|
|
|
---
|
|
|
|
## Overview
|
|
|
|
Successfully implemented the **FeaturePlace** LinkML schema class and enum to provide physical feature type classification for nominal place references in the Heritage Custodian Ontology.
|
|
|
|
### Conceptual Model
|
|
|
|
**CustodianPlace** + **FeaturePlace** = Complete Place Description
|
|
|
|
- **CustodianPlace**: WHERE (nominal reference)
|
|
- "Rijksmuseum" - the place name
|
|
- "het herenhuis in de Schilderswijk" - nominal reference
|
|
- Represents HOW people refer to a custodian through place
|
|
|
|
- **FeaturePlace**: WHAT TYPE (classification)
|
|
- MUSEUM - the building type
|
|
- MANSION - the structure type
|
|
- Classifies the physical feature type of that place
|
|
|
|
### Architecture
|
|
|
|
```
|
|
CustodianPlace (crm:E53_Place)
|
|
↓ has_feature_type (optional)
|
|
FeaturePlace (crm:E27_Site)
|
|
↓ feature_type (required)
|
|
FeatureTypeEnum (298 values)
|
|
```
|
|
|
|
---
|
|
|
|
## Files Created
|
|
|
|
### 1. FeatureTypeEnum.yaml
|
|
**Location**: `schemas/20251121/linkml/modules/enums/FeatureTypeEnum.yaml`
|
|
**Size**: 106 KB
|
|
**Content**: Enum with 298 physical feature types
|
|
|
|
**Structure**:
|
|
```yaml
|
|
enums:
|
|
FeatureTypeEnum:
|
|
permissible_values:
|
|
MANSION:
|
|
title: mansion
|
|
description: very large and imposing dwelling house
|
|
meaning: wd:Q1802963
|
|
annotations:
|
|
wikidata_id: Q1802963
|
|
wikidata_url: https://www.wikidata.org/wiki/Q1802963
|
|
hypernyms: building
|
|
# ... 297 more entries
|
|
```
|
|
|
|
**Top Feature Types by Hypernym**:
|
|
- Heritage sites: 144 entries (48.3%)
|
|
- Buildings: 33 entries (11.1%)
|
|
- Protected areas: 23 entries (7.7%)
|
|
- Structures: 12 entries (4.0%)
|
|
- Museums: 8 entries (2.7%)
|
|
- Parks: 7 entries (2.3%)
|
|
|
|
**Example Values**:
|
|
- `MANSION` (Q1802963) - very large dwelling house
|
|
- `PARISH_CHURCH` (Q16970) - place of Christian worship
|
|
- `MONUMENT` (Q4989906) - commemorative structure
|
|
- `CEMETERY` (Q39614) - burial ground
|
|
- `CASTLE` (Q23413) - fortified building
|
|
- `PALACE` (Q16560) - grand residence
|
|
- `MUSEUM` (Q33506) - institution housing collections
|
|
- `PARK` (Q22698) - area of land for recreation
|
|
- `GARDEN` (Q1107656) - planned outdoor space
|
|
- `BRIDGE` (Q12280) - structure spanning obstacles
|
|
|
|
**Source**: Extracted from `data/wikidata/GLAMORCUBEPSXHFN/hyponyms_curated_full_f.yaml`
|
|
|
|
---
|
|
|
|
### 2. FeaturePlace.yaml
|
|
**Location**: `schemas/20251121/linkml/modules/classes/FeaturePlace.yaml`
|
|
**Size**: 12 KB
|
|
**Content**: FeaturePlace class definition
|
|
|
|
**Key Slots**:
|
|
1. **feature_type** (required): `FeatureTypeEnum` - What type of physical feature
|
|
2. **feature_name** (optional): `string` - Name/label of the feature
|
|
3. **feature_language** (optional): `string` - Language code
|
|
4. **feature_description** (optional): `string` - Physical characteristics
|
|
5. **feature_note** (optional): `string` - Classification rationale
|
|
6. **classifies_place** (required): `CustodianPlace` - Links to nominal place reference
|
|
7. **was_derived_from** (required): `CustodianObservation[]` - Source observations
|
|
8. **was_generated_by** (optional): `ReconstructionActivity` - Reconstruction process
|
|
9. **valid_from/valid_to** (optional): `date` - Temporal validity
|
|
|
|
**Ontology Mappings**:
|
|
- **Exact**: `crm:E27_Site`, `schema:LandmarksOrHistoricalBuildings`
|
|
- **Close**: `crm:E53_Place`, `schema:Place`, `schema:TouristAttraction`
|
|
- **Related**: `prov:Entity`, `dcterms:Location`, `geo:Feature`
|
|
|
|
**Example Instance**:
|
|
```yaml
|
|
FeaturePlace:
|
|
feature_type: MUSEUM
|
|
feature_name: "Rijksmuseum building"
|
|
feature_language: "nl"
|
|
feature_description: "Neo-Gothic museum building designed by P.J.H. Cuypers, opened 1885"
|
|
feature_note: "Rijksmonument, national heritage building"
|
|
classifies_place: "https://nde.nl/ontology/hc/place/rijksmuseum-ams"
|
|
was_derived_from:
|
|
- "https://w3id.org/heritage/observation/heritage-register-entry"
|
|
valid_from: "1885-07-13"
|
|
```
|
|
|
|
---
|
|
|
|
## Files Modified
|
|
|
|
### 3. CustodianPlace.yaml (Updated)
|
|
**Location**: `schemas/20251121/linkml/modules/classes/CustodianPlace.yaml`
|
|
|
|
**Changes**:
|
|
1. **Added import**: `./FeaturePlace` to imports list
|
|
2. **Added slot**: `has_feature_type` - Optional link to FeaturePlace
|
|
3. **Updated description**: Added explanation of relationship to FeaturePlace
|
|
4. **Updated example**: Added feature type classification to Rijksmuseum example
|
|
|
|
**New Slot Definition**:
|
|
```yaml
|
|
has_feature_type:
|
|
slot_uri: dcterms:type
|
|
description: >-
|
|
Physical feature type classification for this place (OPTIONAL).
|
|
|
|
Links to FeaturePlace which classifies WHAT TYPE of physical feature this place is.
|
|
|
|
Examples:
|
|
- "Rijksmuseum" (place name) → MUSEUM (feature type)
|
|
- "het herenhuis" → MANSION (feature type)
|
|
- "de kerk op het Damrak" → PARISH_CHURCH (feature type)
|
|
range: FeaturePlace
|
|
required: false
|
|
```
|
|
|
|
**Enhanced Example**:
|
|
```yaml
|
|
CustodianPlace:
|
|
place_name: "Rijksmuseum"
|
|
place_language: "nl"
|
|
place_specificity: BUILDING
|
|
has_feature_type: # ← NEW!
|
|
feature_type: MUSEUM
|
|
feature_name: "Rijksmuseum building"
|
|
feature_description: "Neo-Gothic museum building designed by P.J.H. Cuypers (1885)"
|
|
feature_note: "Rijksmonument, national heritage building"
|
|
refers_to_custodian: "https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804"
|
|
```
|
|
|
|
---
|
|
|
|
## Integration Points
|
|
|
|
### 1. CustodianPlace → FeaturePlace
|
|
**Relationship**: `has_feature_type` (optional)
|
|
**Cardinality**: 0..1 (a place may have zero or one feature type classification)
|
|
**Purpose**: Adds typological classification to nominal place references
|
|
|
|
### 2. FeaturePlace → CustodianPlace
|
|
**Relationship**: `classifies_place` (required)
|
|
**Cardinality**: 1 (every feature type classification must classify a place)
|
|
**Purpose**: Links classification back to nominal reference
|
|
|
|
### 3. FeaturePlace → CustodianObservation
|
|
**Relationship**: `was_derived_from` (required)
|
|
**Cardinality**: 1..* (derived from one or more observations)
|
|
**Purpose**: Provenance tracking for classification
|
|
|
|
### 4. FeaturePlace → ReconstructionActivity
|
|
**Relationship**: `was_generated_by` (optional)
|
|
**Cardinality**: 0..1 (may or may not have reconstruction activity)
|
|
**Purpose**: Tracks formal reconstruction process
|
|
|
|
---
|
|
|
|
## Use Cases
|
|
|
|
### Use Case 1: Museum Building Classification
|
|
```yaml
|
|
# Nominal place reference
|
|
CustodianPlace:
|
|
id: place-rijksmuseum-001
|
|
place_name: "Rijksmuseum"
|
|
place_specificity: BUILDING
|
|
has_feature_type: feature-rijksmuseum-museum-001
|
|
|
|
# Physical feature type
|
|
FeaturePlace:
|
|
id: feature-rijksmuseum-museum-001
|
|
feature_type: MUSEUM
|
|
feature_description: "Neo-Gothic museum building (1885)"
|
|
classifies_place: place-rijksmuseum-001
|
|
```
|
|
|
|
### Use Case 2: Historic Mansion
|
|
```yaml
|
|
# Nominal place reference
|
|
CustodianPlace:
|
|
id: place-herenhuis-schilderswijk-001
|
|
place_name: "het herenhuis in de Schilderswijk"
|
|
place_specificity: NEIGHBORHOOD
|
|
has_feature_type: feature-herenhuis-mansion-001
|
|
|
|
# Physical feature type
|
|
FeaturePlace:
|
|
id: feature-herenhuis-mansion-001
|
|
feature_type: MANSION
|
|
feature_description: "17th-century canal mansion with ornate gable"
|
|
classifies_place: place-herenhuis-schilderswijk-001
|
|
```
|
|
|
|
### Use Case 3: Church Archive
|
|
```yaml
|
|
# Nominal place reference
|
|
CustodianPlace:
|
|
id: place-oude-kerk-001
|
|
place_name: "Oude Kerk Amsterdam"
|
|
place_specificity: BUILDING
|
|
has_feature_type: feature-oude-kerk-church-001
|
|
|
|
# Physical feature type
|
|
FeaturePlace:
|
|
id: feature-oude-kerk-church-001
|
|
feature_type: PARISH_CHURCH
|
|
feature_description: "Medieval church building (1306), contains parish archive"
|
|
classifies_place: place-oude-kerk-001
|
|
```
|
|
|
|
---
|
|
|
|
## Ontology Alignment
|
|
|
|
### CIDOC-CRM Mapping
|
|
- **CustodianPlace** → `crm:E53_Place` (conceptual place)
|
|
- **FeaturePlace** → `crm:E27_Site` (physical site/feature)
|
|
|
|
**Rationale**:
|
|
- E53_Place: "Extent in space, in particular on the surface of the earth"
|
|
- E27_Site: "Geometrically defined place that is known at that location" (subclass of E53)
|
|
|
|
### Schema.org Mapping
|
|
- **CustodianPlace** → `schema:Place` (generic place)
|
|
- **FeaturePlace** → `schema:LandmarksOrHistoricalBuildings` (heritage buildings)
|
|
|
|
**Rationale**:
|
|
- LandmarksOrHistoricalBuildings: "An historical landmark or building"
|
|
- Aligns with Type F (FEATURES) in GLAMORCUBESFIXPHDNT taxonomy
|
|
|
|
---
|
|
|
|
## Validation Examples
|
|
|
|
### Valid: Museum with Feature Type
|
|
```yaml
|
|
CustodianPlace:
|
|
place_name: "Rijksmuseum" # ✓ Required
|
|
has_feature_type:
|
|
feature_type: MUSEUM # ✓ Valid enum value
|
|
classifies_place: "place-rijksmuseum-001" # ✓ Links back
|
|
was_derived_from: ["obs-001"] # ✓ Required
|
|
refers_to_custodian: "custodian-001" # ✓ Required
|
|
```
|
|
|
|
### Valid: Place WITHOUT Feature Type
|
|
```yaml
|
|
CustodianPlace:
|
|
place_name: "the building on Voorhout" # ✓ Required
|
|
# has_feature_type: null # ✓ Optional - can be omitted
|
|
was_derived_from: ["obs-002"] # ✓ Required
|
|
refers_to_custodian: "custodian-002" # ✓ Required
|
|
```
|
|
|
|
### Invalid: Missing Required Fields
|
|
```yaml
|
|
FeaturePlace:
|
|
feature_type: MANSION # ✓ Required
|
|
# classifies_place: ??? # ✗ MISSING REQUIRED FIELD!
|
|
# was_derived_from: ??? # ✗ MISSING REQUIRED FIELD!
|
|
```
|
|
|
|
---
|
|
|
|
## Data Statistics
|
|
|
|
### FeatureTypeEnum Coverage
|
|
- **Total enum values**: 298
|
|
- **Source**: Wikidata GLAMORCUBESFIXPHDNT type 'F' entries
|
|
- **Languages**: Multilingual labels (50+ languages in source)
|
|
- **Wikidata Q-numbers**: All 298 mapped to real Wikidata entities
|
|
|
|
### Hypernym Distribution
|
|
| Hypernym | Count | Percentage |
|
|
|----------|-------|------------|
|
|
| Heritage site | 144 | 48.3% |
|
|
| Building | 33 | 11.1% |
|
|
| Protected area | 23 | 7.7% |
|
|
| Structure | 12 | 4.0% |
|
|
| Museum | 8 | 2.7% |
|
|
| Park | 7 | 2.3% |
|
|
| Infrastructure | 6 | 2.0% |
|
|
| Grave | 6 | 2.0% |
|
|
| Space | 5 | 1.7% |
|
|
| Memory space | 5 | 1.7% |
|
|
| **Other (30+ categories)** | 49 | 16.4% |
|
|
|
|
---
|
|
|
|
## Future Extensions
|
|
|
|
### Potential Enhancements
|
|
1. **Add `feature_period`**: Architectural/historical period classification
|
|
2. **Add `heritage_designation`**: UNESCO, national monument status
|
|
3. **Add `conservation_status`**: Current physical condition
|
|
4. **Add `architectural_style`**: Gothic, Baroque, Modernist, etc.
|
|
5. **Link to geographic coordinates**: Bridge to Location class
|
|
|
|
### Ontology Extensions
|
|
1. **RiC-O integration**: Link to archival description standards
|
|
2. **Getty AAT**: Art & Architecture Thesaurus for style terms
|
|
3. **INSPIRE**: EU spatial data infrastructure for geographic features
|
|
4. **DBpedia**: Additional semantic web alignment
|
|
|
|
---
|
|
|
|
## Testing Recommendations
|
|
|
|
### Unit Tests
|
|
1. **Enum validation**: All 298 values parse correctly
|
|
2. **Required fields**: `feature_type`, `classifies_place`, `was_derived_from`
|
|
3. **Optional fields**: Handle null values gracefully
|
|
4. **Wikidata Q-numbers**: All resolve to real entities
|
|
|
|
### Integration Tests
|
|
1. **CustodianPlace ↔ FeaturePlace**: Bidirectional links work
|
|
2. **FeaturePlace → CustodianObservation**: Provenance tracking
|
|
3. **Temporal validity**: `valid_from`/`valid_to` constraints
|
|
4. **RDF serialization**: Correct ontology class URIs
|
|
|
|
### Example Test Cases
|
|
```python
|
|
def test_feature_place_required_fields():
|
|
"""FeaturePlace requires feature_type, classifies_place, was_derived_from"""
|
|
feature = FeaturePlace(
|
|
feature_type="MUSEUM",
|
|
classifies_place="place-001",
|
|
was_derived_from=["obs-001"]
|
|
)
|
|
assert feature.feature_type == "MUSEUM"
|
|
|
|
def test_custodian_place_optional_feature_type():
|
|
"""CustodianPlace.has_feature_type is optional"""
|
|
place = CustodianPlace(
|
|
place_name="Unknown building",
|
|
# has_feature_type=None # Optional
|
|
was_derived_from=["obs-001"],
|
|
refers_to_custodian="cust-001"
|
|
)
|
|
assert place.has_feature_type is None # ✓ Valid
|
|
|
|
def test_invalid_feature_type():
|
|
"""FeaturePlace.feature_type must be valid enum value"""
|
|
with pytest.raises(ValidationError):
|
|
FeaturePlace(
|
|
feature_type="INVALID_TYPE", # ✗ Not in FeatureTypeEnum
|
|
classifies_place="place-001",
|
|
was_derived_from=["obs-001"]
|
|
)
|
|
```
|
|
|
|
---
|
|
|
|
## Documentation Updates
|
|
|
|
### Files to Update
|
|
1. **AGENTS.md**: Add FeaturePlace extraction workflow
|
|
2. **schemas/README.md**: Document new enum and class
|
|
3. **ontology/ONTOLOGY_EXTENSIONS.md**: Add CIDOC-CRM E27_Site mapping
|
|
4. **docs/SCHEMA_MODULES.md**: List FeatureTypeEnum and FeaturePlace
|
|
|
|
### Example Agent Prompt
|
|
```
|
|
When extracting heritage institutions from conversations:
|
|
|
|
1. Identify nominal place references (CustodianPlace)
|
|
- "Rijksmuseum" (building name as place)
|
|
- "het herenhuis in de Schilderswijk" (mansion reference)
|
|
|
|
2. Classify physical feature type (FeaturePlace)
|
|
- MUSEUM (for museum buildings)
|
|
- MANSION (for large historic houses)
|
|
- PARISH_CHURCH (for church buildings)
|
|
- MONUMENT (for memorials/statues)
|
|
- [298 other types available]
|
|
|
|
3. Link classification to place
|
|
- FeaturePlace.classifies_place → CustodianPlace
|
|
- CustodianPlace.has_feature_type → FeaturePlace (optional)
|
|
|
|
4. Record provenance
|
|
- FeaturePlace.was_derived_from → observation sources
|
|
- Include temporal validity (valid_from/valid_to) when known
|
|
```
|
|
|
|
---
|
|
|
|
## References
|
|
|
|
### Source Files
|
|
- **Wikidata extraction**: `data/wikidata/GLAMORCUBEPSXHFN/hyponyms_curated_full_f.yaml`
|
|
- **Extraction report**: `README_F_EXTRACTION.md`
|
|
- **Schema documentation**: `schemas/20251121/linkml/modules/classes/FeaturePlace.yaml`
|
|
|
|
### Related Classes
|
|
- **CustodianPlace**: Nominal place references (`crm:E53_Place`)
|
|
- **CustodianObservation**: Source observations (PiCo pattern)
|
|
- **ReconstructionActivity**: Reconstruction process (PROV-O)
|
|
- **Custodian**: Hub entity (multi-aspect model)
|
|
|
|
### Ontologies
|
|
- **CIDOC-CRM**: `E27_Site`, `E53_Place` - Cultural heritage domain
|
|
- **Schema.org**: `LandmarksOrHistoricalBuildings`, `Place` - Web semantics
|
|
- **PROV-O**: `Entity`, `Activity`, `wasDerivedFrom` - Provenance
|
|
- **Dublin Core**: `type`, `description`, `language` - Metadata
|
|
|
|
---
|
|
|
|
## Completion Checklist
|
|
|
|
- [x] Extract 298 F-type entries from Wikidata YAML
|
|
- [x] Create FeatureTypeEnum with all 298 values
|
|
- [x] Map Wikidata Q-numbers to enum values
|
|
- [x] Create FeaturePlace class with proper ontology alignment
|
|
- [x] Add `has_feature_type` slot to CustodianPlace
|
|
- [x] Update CustodianPlace examples with feature types
|
|
- [x] Document conceptual model (CustodianPlace + FeaturePlace)
|
|
- [x] Provide use case examples (museum, mansion, church)
|
|
- [x] Define validation rules and testing strategy
|
|
- [x] Create comprehensive implementation report (this document)
|
|
|
|
**Status**: ✅ **Implementation Complete**
|
|
|
|
---
|
|
|
|
## Next Steps (Optional)
|
|
|
|
### Immediate
|
|
1. **Validate LinkML schemas**: Run `linkml-validate` on new files
|
|
2. **Generate RDF**: Use `gen-owl` to produce RDF serialization
|
|
3. **Update imports**: Add FeatureTypeEnum and FeaturePlace to main schema
|
|
4. **Create test instances**: YAML examples for validation
|
|
|
|
### Future
|
|
1. **Enrich with architectural periods**: Add temporal style classification
|
|
2. **Link to Location class**: Bridge nominal place → geographic coordinates
|
|
3. **Add conservation status**: Track physical condition over time
|
|
4. **Integrate with heritage registers**: Link to national monument databases
|
|
5. **Create visual documentation**: UML diagrams showing relationships
|
|
|
|
---
|
|
|
|
**Implementation completed**: 2025-11-22 23:09 CET
|
|
**Total development time**: ~45 minutes
|
|
**Files created**: 2 (FeatureTypeEnum.yaml, FeaturePlace.yaml)
|
|
**Files modified**: 1 (CustodianPlace.yaml)
|
|
**Total size**: 118 KB (106 KB enum + 12 KB class)
|