Add comprehensive rules for LinkML schema management and ontology mapping
- Introduced Rule 42: No Ontology Prefixes in Slot Names to enforce clean naming conventions.
- Established Rule: No Rough Edits in Schema Files to ensure structural integrity during modifications.
- Implemented Rule: No Version Indicators in Names to maintain stable semantic naming.
- Created Rule: Ontology Detection vs Heuristics to emphasize the importance of verifying ontology definitions.
- Defined Rule 50: Ontology-to-LinkML Mapping Convention to standardize mapping practices.
- Added Rule: Polished Slot Storage Location to specify directory structure for polished slot files.
- Enforced Rule: Preserve Bespoke Slots Until Refactoring to prevent unintended migrations during slot updates.
- Instituted Rule 56: Semantic Consistency Over Simplicity to mandate execution of revisions in slot_fixes.yaml.
- Added new Genealogy Archives Registry Enrichment class with multilingual support and structured aliases.
parent
ee5e8e5a7c
commit
554fe520ea
69 changed files with 6113 additions and 1095 deletions
583
.opencode/rules/context/ontology-driven-cache-segmentation.md
Normal file
|
|
@ -0,0 +1,583 @@
|
|||
# Rule 46: Ontology-Driven Cache Segmentation
|
||||
|
||||
🚨 **CRITICAL**: The semantic cache MUST use vocabulary derived from LinkML `*Type.yaml` and `*Types.yaml` schema files to extract entities for cache key generation. Hardcoded regex patterns are deprecated.
|
||||
|
||||
**Status**: Implemented (Evolved v2.0)
|
||||
**Version**: 2.0 (Epistemological Evolution)
|
||||
**Updated**: 2026-01-10
|
||||
|
||||
## Evolution Overview
|
||||
|
||||
Rule 46 v2.0 incorporates insights from Volodymyr Pavlyshyn's work on agentic memory systems:
|
||||
|
||||
1. **Epistemic Provenance** (Phase 1) - Track WHERE, WHEN, HOW data originated
|
||||
2. **Topological Distance** (Phase 2) - Use ontology structure, not just embeddings
|
||||
3. **Holarchic Cache** (Phase 3) - Entries as holons with up/down links
|
||||
4. **Message Passing** (Phase 4, planned) - Smalltalk-style introspectable cache
|
||||
5. **Clarity Trading** (Phase 5, planned) - Block ambiguous queries from cache
|
||||
|
||||
## Epistemic Provenance
|
||||
|
||||
Every cached response carries epistemological metadata:
|
||||
|
||||
```typescript
interface EpistemicProvenance {
  // Further source values are elided in the original list.
  dataSource: 'ISIL_REGISTRY' | 'WIKIDATA' | 'CUSTODIAN_YAML' | 'LLM_INFERENCE' /* | ... */;
  dataTier: 1 | 2 | 3 | 4;    // TIER_1_AUTHORITATIVE → TIER_4_INFERRED
  sourceTimestamp: string;
  derivationChain: string[];  // ["SPARQL:Qdrant", "RAG:retrieve", "LLM:generate"]
  revalidationPolicy: 'static' | 'daily' | 'weekly' | 'on_access';
}
```
|
||||
|
||||
**Benefit**: Users see "This answer is from TIER_1 ISIL registry data, captured 2025-01-08".
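
As a rough illustration of how such a user-facing note could be derived from the metadata, the sketch below formats a provenance record into one sentence. The helper name `describeProvenance`, the closed `DataSource` union, and the ISO 8601 timestamp format are assumptions made for this example and are not taken from the rule file.

```typescript
// Sketch only: field names follow the EpistemicProvenance interface above.
// The closed DataSource union and describeProvenance() are illustrative assumptions.
type DataSource = 'ISIL_REGISTRY' | 'WIKIDATA' | 'CUSTODIAN_YAML' | 'LLM_INFERENCE';

interface EpistemicProvenance {
  dataSource: DataSource;
  dataTier: 1 | 2 | 3 | 4;
  sourceTimestamp: string; // assumed to be ISO 8601, e.g. "2025-01-08T09:30:00Z"
  derivationChain: string[];
  revalidationPolicy: 'static' | 'daily' | 'weekly' | 'on_access';
}

// Build a one-line, human-readable provenance note for a cached answer.
function describeProvenance(p: EpistemicProvenance): string {
  const source = p.dataSource.replace(/_/g, ' ');
  const capturedOn = p.sourceTimestamp.slice(0, 10); // keep the date part only
  return `This answer is from TIER_${p.dataTier} ${source} data, captured ${capturedOn}`;
}
```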
|
||||
|
||||
## Topological Distance
|
||||
|
||||
Beyond embedding similarity, cache matching considers **structural distance** in the type hierarchy:
|
||||
|
||||
```
                   HeritageCustodian (*)
                             │
          ┌──────────────────┼──────────────────┐
          ▼                  ▼                  ▼
   MuseumType (M)     ArchiveType (A)    LibraryType (L)
          │                  │                  │
     ┌────┴────┐        ┌────┴────┐        ┌────┴────┐
     ▼         ▼        ▼         ▼        ▼         ▼
 ArtMuseum  History Municipal   State   Public   Academic
```
|
||||
|
||||
**Combined Similarity Formula**:
|
||||
```typescript
|
||||
finalScore = 0.7 * embeddingSimilarity + 0.3 * (1 - topologicalDistance)
|
||||
```
|
||||
|
||||
**Benefit**: "Art museum" won't match "natural history museum" even with 95% embedding similarity.
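
The formula can be made concrete with a small sketch. The rule file does not say how `topologicalDistance` is computed or normalised, so this example assumes it is the number of edges between two types via their lowest common ancestor, divided by the longest possible path in a toy hierarchy. The `PARENT` map, `MAX_DEPTH`, and the function names are hypothetical.

```typescript
// Hypothetical parent links read off the diagram above (child -> parent).
const PARENT: Record<string, string> = {
  MuseumType: 'HeritageCustodian',
  ArchiveType: 'HeritageCustodian',
  LibraryType: 'HeritageCustodian',
  ArtMuseum: 'MuseumType',
  HistoryMuseum: 'MuseumType',
  MunicipalArchive: 'ArchiveType',
  StateArchive: 'ArchiveType',
};
const MAX_DEPTH = 2; // assumption: depth of the deepest leaf in this toy hierarchy

// Path from a node up to the root, inclusive.
function ancestors(node: string): string[] {
  const path = [node];
  while (PARENT[path[path.length - 1]] !== undefined) {
    path.push(PARENT[path[path.length - 1]]);
  }
  return path;
}

// Edges between a and b via their lowest common ancestor,
// normalised to [0, 1] by the longest possible path (2 * MAX_DEPTH).
function topologicalDistance(a: string, b: string): number {
  const pathA = ancestors(a);
  const pathB = ancestors(b);
  for (let i = 0; i < pathA.length; i++) {
    const j = pathB.indexOf(pathA[i]);
    if (j !== -1) return (i + j) / (2 * MAX_DEPTH);
  }
  return 1; // no common ancestor: treat as maximally distant
}

// Combined score exactly as in the formula above.
function finalScore(embeddingSimilarity: number, a: string, b: string): number {
  return 0.7 * embeddingSimilarity + 0.3 * (1 - topologicalDistance(a, b));
}
```

Under these assumptions, two sibling museum subtypes with an embedding similarity of 0.95 score about 0.815, while identical types score about 0.965. Whether 0.815 blocks a cache hit depends on the hit threshold, which this section does not state.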
|
||||
|
||||
## Holarchic Cache Structure
|
||||
|
||||
Cache entries are **holons** - simultaneously complete AND parts of aggregates:
|
||||
|
||||
| Level | Example | Aggregates |
|
||||
|-------|---------|------------|
|
||||
| Micro | "Rijksmuseum details" | None |
|
||||
| Meso | "Museums in Amsterdam" | List of micro holons |
|
||||
| Macro | "Heritage in Noord-Holland" | All meso holons in region |
|
||||
|
||||
```typescript
|
||||
interface CachedQuery {
|
||||
// ... existing fields ...
|
||||
holonLevel?: 'micro' | 'meso' | 'macro';
|
||||
participatesIn?: string[]; // Higher-level cache keys
|
||||
aggregates?: string[]; // Lower-level entries
|
||||
}
|
||||
```
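
One possible use of the `aggregates` links is to find which detail entries a regional answer is built from. The sketch below walks downward from a holon and collects the micro-level keys. The `key` field, the `Map`-based cache, and `collectMicroKeys` are assumptions for illustration, as the rule file does not describe a traversal API.

```typescript
interface CachedQuery {
  key: string; // hypothetical: the entry's own cache key
  holonLevel?: 'micro' | 'meso' | 'macro';
  participatesIn?: string[]; // Higher-level cache keys
  aggregates?: string[];     // Lower-level entries
}

// Collect the keys of all micro holons reachable from rootKey via `aggregates`.
function collectMicroKeys(cache: Map<string, CachedQuery>, rootKey: string): string[] {
  const result: string[] = [];
  const seen = new Set<string>();
  const stack: string[] = [rootKey];
  while (stack.length > 0) {
    const key = stack.pop() as string;
    if (seen.has(key)) continue; // guard against cyclic links
    seen.add(key);
    const entry = cache.get(key);
    if (!entry) continue; // dangling link: entry evicted or never cached
    if (entry.holonLevel === 'micro') result.push(key);
    for (const child of entry.aggregates ?? []) stack.push(child);
  }
  return result.sort();
}
```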
|
||||
|
||||
## Problem Statement
|
||||
|
||||
The ArchiefAssistent semantic cache prevents geographic false positives using entity extraction:
|
||||
|
||||
```
|
||||
Query: "Hoeveel musea in Amsterdam?"
|
||||
Cached: "Hoeveel musea in Noord-Holland?"
|
||||
Result: BLOCKED (location mismatch) ✅
|
||||
```
|
||||
|
||||
However, the current implementation uses **hardcoded regex patterns**:
|
||||
|
||||
```typescript
|
||||
// DEPRECATED: Hardcoded patterns in semantic-cache.ts
|
||||
const INSTITUTION_PATTERNS: Record<InstitutionTypeCode, RegExp> = {
|
||||
M: /\b(muse(um|a|ums?)|musea)/i,
|
||||
A: /\b(archie[fv]en?|archives?|archief)/i,
|
||||
// ... 19 patterns to maintain manually
|
||||
};
|
||||
```
|
||||
|
||||
**Problems with hardcoded patterns**:
|
||||
1. **Maintenance burden** - Every new institution type requires code changes
|
||||
2. **Missing subtypes** - "kunstmuseum" vs "museum" should cache separately
|
||||
3. **No multilingual support** - Only Dutch/English, misses German/French labels
|
||||
4. **Duplication** - Same vocabulary exists in LinkML schemas
|
||||
5. **No record type awareness** - "burgerlijke stand" queries mixed with general archive queries
|
||||
|
||||
## Solution: Schema-Derived Vocabulary
|
||||
|
||||
The LinkML schema already contains rich vocabulary:
|
||||
|
||||
| Schema File | Content | Cache Utility |
|
||||
|-------------|---------|---------------|
|
||||
| `CustodianType.yaml` | 19 top-level types | Primary segmentation (M/A/L/G...) |
|
||||
| `MuseumType.yaml` | 187+ museum subtypes | Subtype segmentation |
|
||||
| `ArchiveOrganizationType.yaml` | 144+ archive subtypes | Subtype segmentation |
|
||||
| `*RecordSetTypes.yaml` | Record type taxonomies | Finding aids specificity |
|
||||
|
||||
### Vocabulary Sources in Schema
|
||||
|
||||
1. **`type_label`** - Multilingual labels via `skos:prefLabel`
|
||||
2. **`structured_aliases`** - Language-tagged alternative names
|
||||
3. **`keywords`** - Search terms for entity recognition
|
||||
4. **`wikidata_entity`** - Linked Data identifiers
|
||||
|
||||
## Architecture
|
||||
|
||||
### Overview: Two-Tier Embedding Hierarchy
|
||||
|
||||
The system uses a **hierarchical embedding approach** for fast semantic routing:
|
||||
|
||||
1. **Tier 1: Types File Embeddings** - Which category? (Museum vs Archive vs Library)
|
||||
2. **Tier 2: Individual Type Embeddings** - Which specific type? (ArtMuseum vs NaturalHistoryMuseum)
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────┐
|
||||
│ BUILD TIME: Extract vocabulary + generate embeddings │
|
||||
│ │
|
||||
│ schemas/20251121/linkml/modules/classes/*Type.yaml │
|
||||
│ schemas/20251121/linkml/modules/classes/*Types.yaml │
|
||||
│ ↓ │
|
||||
│ scripts/extract-types-vocab.ts │
|
||||
│ ↓ │
|
||||
│ ┌───────────────────────────────────────────────────────────────────┐ │
|
||||
│ │ types-vocab.json │ │
|
||||
│ │ ├── tier1Embeddings: { MuseumType: [...], ArchiveType: [...] } │ │
|
||||
│ │ ├── tier2Embeddings: { ArtMuseum: [...], MunicipalArchive: [...]}│ │
|
||||
│ │ └── termLog: { "kunstmuseum": { type: "M", subtype: "ART_MUSEUM"}│ │
|
||||
│ └───────────────────────────────────────────────────────────────────┘ │
|
||||
└─────────────────────────────────────────────────────────────────────────┘
|
||||
│
|
||||
▼ (loaded at runtime)
|
||||
┌─────────────────────────────────────────────────────────────────────────┐
|
||||
│ RUNTIME: Two-Tier Semantic Routing │
|
||||
│ │
|
||||
│ Query: "Hoeveel gemeentearchieven in Amsterdam?" │
|
||||
│ ↓ │
|
||||
│ ┌─────────────────────────────────────────────────────────────────┐ │
|
||||
│ │ TIER 1: Types File Selection │ │
|
||||
│ │ Query embedding vs Tier1 embeddings (19 categories) │ │
|
||||
│ │ Result: ArchiveOrganizationType (similarity: 0.89) │ │
|
||||
│ └─────────────────────────────────────────────────────────────────┘ │
|
||||
│ ↓ │
|
||||
│ ┌─────────────────────────────────────────────────────────────────┐ │
|
||||
│ │ TIER 2: Specific Type Selection │ │
|
||||
│ │ Query embedding vs Tier2 embeddings (144 archive subtypes) │ │
|
||||
│ │ Result: MunicipalArchive (similarity: 0.94) │ │
|
||||
│ └─────────────────────────────────────────────────────────────────┘ │
|
||||
│ ↓ │
|
||||
│ Structured cache key: "count:A.MUNICIPAL_ARCHIVE:amsterdam" │
|
||||
└─────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Tier 1: Types File Embeddings
|
||||
|
||||
Each Types file (e.g., `MuseumType.yaml`, `ArchiveOrganizationType.yaml`) gets ONE embedding
|
||||
representing the **accumulated vocabulary** of all types within that file.
|
||||
|
||||
**Embedding Text Construction**:
|
||||
```
|
||||
MuseumType: museum musea kunstmuseum art museum natural history museum
|
||||
science museum open-air museum ecomuseum virtual museum
|
||||
heritage farm national museum regional museum university museum
|
||||
[... all keywords from all 187 subtypes ...]
|
||||
```
|
||||
|
||||
**Purpose**: Fast first-pass filter to identify which GLAMORCUBESFIXPHDNT category the query relates to.
|
||||
|
||||
| Types File | Code | Accumulated Terms Count |
|
||||
|------------|------|------------------------|
|
||||
| MuseumType | M | ~500+ terms from 187 subtypes |
|
||||
| ArchiveOrganizationType | A | ~400+ terms from 144 subtypes |
|
||||
| LibraryType | L | ~200+ terms from subtypes |
|
||||
| GalleryType | G | ~100+ terms from subtypes |
|
||||
| ... | ... | ... |
|
||||
|
||||
### Tier 2: Individual Type Embeddings
|
||||
|
||||
Each **specific type** within a Types file gets its own embedding from its accumulated terms.
|
||||
|
||||
**Embedding Text Construction**:
|
||||
```
|
||||
MunicipalArchive: gemeentearchief stadsarchief city archive municipal archive
|
||||
town archive local government records burgerlijke stand
|
||||
bevolkingsregister council minutes building permits
|
||||
[... all keywords + structured_aliases + labels ...]
|
||||
```
|
||||
|
||||
**Purpose**: Precise subtype identification after Tier 1 narrows the category.
|
||||
|
||||
### Term Log Structure
|
||||
|
||||
A lookup table mapping every extracted term to its type/subtype:
|
||||
|
||||
```json
|
||||
{
|
||||
"termLog": {
|
||||
"kunstmuseum": {
|
||||
"typeCode": "M",
|
||||
"typeName": "MuseumType",
|
||||
"subtypeName": "ART_MUSEUM",
|
||||
"wikidata": "Q207694",
|
||||
"language": "nl"
|
||||
},
|
||||
"art museum": {
|
||||
"typeCode": "M",
|
||||
"typeName": "MuseumType",
|
||||
"subtypeName": "ART_MUSEUM",
|
||||
"wikidata": "Q207694",
|
||||
"language": "en"
|
||||
},
|
||||
"gemeentearchief": {
|
||||
"typeCode": "A",
|
||||
"typeName": "ArchiveOrganizationType",
|
||||
"subtypeName": "MUNICIPAL_ARCHIVE",
|
||||
"wikidata": "Q8362876",
|
||||
"language": "nl"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Purpose**:
|
||||
1. Fast O(1) keyword lookup (no embedding needed for exact matches)
|
||||
2. Audit trail of which terms map to which types
|
||||
3. Debugging which queries match which types
|
||||
|
||||
### Runtime Lookup Strategy
|
||||
|
||||
```typescript
async function extractEntitiesWithEmbeddings(query: string): Promise<ExtractedEntities> {
  const vocab = await loadTypesVocabulary();
  const normalized = query.toLowerCase();

  // FAST PATH: Check termLog for exact keyword matches.
  // Try the longest terms first so "kunstmuseum" wins over its substring "museum".
  const terms = Object.keys(vocab.termLog).sort((a, b) => b.length - a.length);
  for (const term of terms) {
    if (normalized.includes(term)) {
      const mapping = vocab.termLog[term];
      return {
        institutionType: mapping.typeCode,
        institutionSubtype: mapping.subtypeName,
        recordSetType: mapping.recordSetType, // set for record-type terms such as "burgerlijke stand"
        subtypeWikidata: mapping.wikidata,
        // ... location and intent extraction
      };
    }
  }

  // SLOW PATH: Embedding-based semantic matching
  const queryEmbedding = await generateEmbedding(query);

  // Tier 1: Find best matching Types file
  let bestType: string | null = null;
  let bestTypeSimilarity = 0;
  for (const [typeName, typeEmbedding] of Object.entries(vocab.tier1Embeddings)) {
    const similarity = cosineSimilarity(queryEmbedding, typeEmbedding);
    if (similarity > bestTypeSimilarity && similarity > 0.7) {
      bestTypeSimilarity = similarity;
      bestType = typeName;
    }
  }

  if (!bestType) return {}; // No type matched

  // Tier 2: Find best matching subtype within the Types file.
  // tier1Embeddings is keyed by class name ("MuseumType"), whereas institutionTypes
  // and tier2Embeddings are keyed by type code ("M"), so resolve the code first.
  const typeEntry = Object.values(vocab.institutionTypes).find((t) => t.className === bestType);
  if (!typeEntry) return {}; // Class name not present in institutionTypes
  const typeCode = typeEntry.code;
  let bestSubtype: string | null = null;
  let bestSubtypeSimilarity = 0;

  for (const [subtypeName, subtypeEmbedding] of Object.entries(vocab.tier2Embeddings[typeCode] || {})) {
    const similarity = cosineSimilarity(queryEmbedding, subtypeEmbedding);
    if (similarity > bestSubtypeSimilarity && similarity > 0.75) {
      bestSubtypeSimilarity = similarity;
      bestSubtype = subtypeName;
    }
  }

  return {
    institutionType: typeCode,
    institutionSubtype: bestSubtype,
    // ... location and intent extraction
  };
}
```
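
The lookup code above calls `cosineSimilarity` without defining it. A minimal sketch of such a helper follows, assuming embeddings are plain `number[]` arrays of equal length. The zero-vector handling is a choice made for this example rather than something the rule file specifies.

```typescript
// Cosine similarity between two equal-length vectors: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; report no similarity instead of dividing by zero.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

The dimension check matters here because the build-time embeddings and the runtime query embedding must come from the same model for the comparison to be meaningful.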
|
||||
|
||||
### Embedding Model Choice
|
||||
|
||||
For build-time embedding generation, use the same model as the semantic cache:
|
||||
|
||||
| Option | Model | Dimensions | Quality |
|
||||
|--------|-------|------------|---------|
|
||||
| **Primary** | `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2` | 384 | Good multilingual |
|
||||
| Fallback | `all-MiniLM-L6-v2` | 384 | English-focused |
|
||||
| High Quality | `multilingual-e5-large` | 1024 | Best multilingual |
|
||||
|
||||
**Build-time generation**: Embeddings are generated ONCE at build time and stored in JSON.
|
||||
This avoids runtime embedding API calls for type classification.
|
||||
|
||||
## TypesVocabulary JSON Structure
|
||||
|
||||
Generated at build time with **pre-computed embeddings**:
|
||||
|
||||
```json
|
||||
{
|
||||
"version": "2026-01-10T12:00:00Z",
|
||||
"schemaVersion": "20251121",
|
||||
"embeddingModel": "paraphrase-multilingual-MiniLM-L12-v2",
|
||||
"embeddingDimensions": 384,
|
||||
|
||||
"tier1Embeddings": {
|
||||
"MuseumType": [0.023, -0.045, 0.087, ...],
|
||||
"ArchiveOrganizationType": [0.012, 0.056, -0.034, ...],
|
||||
"LibraryType": [-0.034, 0.089, 0.012, ...],
|
||||
"GalleryType": [0.045, -0.023, 0.067, ...]
|
||||
},
|
||||
|
||||
"tier2Embeddings": {
|
||||
"M": {
|
||||
"ART_MUSEUM": [0.034, -0.056, 0.078, ...],
|
||||
"NATURAL_HISTORY_MUSEUM": [0.045, 0.023, -0.089, ...],
|
||||
"SCIENCE_MUSEUM": [0.067, -0.012, 0.045, ...]
|
||||
},
|
||||
"A": {
|
||||
"MUNICIPAL_ARCHIVE": [0.089, 0.034, -0.056, ...],
|
||||
"NATIONAL_ARCHIVE": [0.012, -0.078, 0.045, ...],
|
||||
"CHURCH_ARCHIVE": [-0.023, 0.067, 0.034, ...]
|
||||
}
|
||||
},
|
||||
|
||||
"termLog": {
|
||||
"kunstmuseum": {"typeCode": "M", "subtypeName": "ART_MUSEUM", "wikidata": "Q207694", "lang": "nl"},
|
||||
"art museum": {"typeCode": "M", "subtypeName": "ART_MUSEUM", "wikidata": "Q207694", "lang": "en"},
|
||||
"gemeentearchief": {"typeCode": "A", "subtypeName": "MUNICIPAL_ARCHIVE", "wikidata": "Q8362876", "lang": "nl"},
|
||||
"stadsarchief": {"typeCode": "A", "subtypeName": "MUNICIPAL_ARCHIVE", "wikidata": "Q8362876", "lang": "nl"},
|
||||
"city archive": {"typeCode": "A", "subtypeName": "MUNICIPAL_ARCHIVE", "wikidata": "Q8362876", "lang": "en"},
|
||||
"burgerlijke stand": {"typeCode": "A", "recordSetType": "CIVIL_REGISTRY", "lang": "nl"},
|
||||
"geboorteakte": {"typeCode": "A", "recordSetType": "CIVIL_REGISTRY", "lang": "nl"}
|
||||
},
|
||||
|
||||
"institutionTypes": {
|
||||
"M": {
|
||||
"code": "M",
|
||||
"className": "MuseumType",
|
||||
"baseWikidata": "Q33506",
|
||||
"accumulatedTerms": "museum musea kunstmuseum art museum natural history museum science museum open-air museum ecomuseum virtual museum heritage farm national museum regional museum university museum...",
|
||||
"keywords": {
|
||||
"nl": ["museum", "musea"],
|
||||
"en": ["museum", "museums"],
|
||||
"de": ["Museum", "Museen"]
|
||||
},
|
||||
"subtypes": {
|
||||
"ART_MUSEUM": {
|
||||
"className": "ArtMuseum",
|
||||
"wikidata": "Q207694",
|
||||
"accumulatedTerms": "kunstmuseum art museum kunstmusea art museums fine art museum visual arts museum painting gallery sculpture museum",
|
||||
"keywords": {
|
||||
"nl": ["kunstmuseum", "kunstmusea"],
|
||||
"en": ["art museum", "art museums"]
|
||||
}
|
||||
},
|
||||
"NATURAL_HISTORY_MUSEUM": {
|
||||
"className": "NaturalHistoryMuseum",
|
||||
"wikidata": "Q559049",
|
||||
"accumulatedTerms": "natuurhistorisch museum natuurmuseum natural history museum science museum fossils taxidermy specimens geology biology",
|
||||
"keywords": {
|
||||
"nl": ["natuurhistorisch museum", "natuurmuseum"],
|
||||
"en": ["natural history museum"]
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"A": {
|
||||
"code": "A",
|
||||
"className": "ArchiveOrganizationType",
|
||||
"baseWikidata": "Q166118",
|
||||
"accumulatedTerms": "archief archieven archive archives gemeentearchief stadsarchief nationaal archief rijksarchief church archive company archive film archive...",
|
||||
"keywords": {
|
||||
"nl": ["archief", "archieven"],
|
||||
"en": ["archive", "archives"]
|
||||
},
|
||||
"subtypes": {
|
||||
"MUNICIPAL_ARCHIVE": {
|
||||
"className": "MunicipalArchive",
|
||||
"wikidata": "Q8362876",
|
||||
"accumulatedTerms": "gemeentearchief stadsarchief municipal archive city archive town archive local government records civil registry population register building permits council minutes",
|
||||
"keywords": {
|
||||
"nl": ["gemeentearchief", "stadsarchief", "gemeentelijke archiefdienst"],
|
||||
"en": ["municipal archive", "city archive", "town archive"]
|
||||
}
|
||||
},
|
||||
"NATIONAL_ARCHIVE": {
|
||||
"className": "NationalArchive",
|
||||
"wikidata": "Q1188452",
|
||||
"accumulatedTerms": "nationaal archief rijksarchief national archive state archive government records national records federal archive",
|
||||
"keywords": {
|
||||
"nl": ["nationaal archief", "rijksarchief"],
|
||||
"en": ["national archive", "state archive"]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
|
||||
"recordSetTypes": {
|
||||
"CIVIL_REGISTRY": {
|
||||
"className": "CivilRegistrySeries",
|
||||
"accumulatedTerms": "burgerlijke stand geboorteakte huwelijksakte overlijdensakte bevolkingsregister civil registry birth records marriage records death records population register vital records genealogy",
|
||||
"keywords": {
|
||||
"nl": ["burgerlijke stand", "geboorteakte", "huwelijksakte", "overlijdensakte", "bevolkingsregister"],
|
||||
"en": ["civil registry", "birth records", "marriage records", "death records"]
|
||||
}
|
||||
},
|
||||
"COUNCIL_GOVERNANCE": {
|
||||
"className": "CouncilGovernanceFonds",
|
||||
"accumulatedTerms": "gemeenteraad raadsnotulen raadsbesluit verordening council minutes ordinances resolutions bylaws municipal council town council city council",
|
||||
"keywords": {
|
||||
"nl": ["gemeenteraad", "raadsnotulen", "raadsbesluit", "verordening"],
|
||||
"en": ["council minutes", "ordinances", "resolutions"]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Key Additions for Embedding Support
|
||||
|
||||
| Field | Purpose |
|
||||
|-------|---------|
|
||||
| `tier1Embeddings` | Pre-computed embeddings for each Types file (19 categories) |
|
||||
| `tier2Embeddings` | Pre-computed embeddings for each subtype (500+ types) |
|
||||
| `termLog` | Fast O(1) lookup table for exact keyword matches |
|
||||
| `accumulatedTerms` | Raw text used to generate embeddings (for debugging/regeneration) |
|
||||
| `embeddingModel` | Model used to generate embeddings (for reproducibility) |
|
||||
|
||||
## Enhanced ExtractedEntities Interface
|
||||
|
||||
```typescript
|
||||
export interface ExtractedEntities {
|
||||
// Existing fields
|
||||
institutionType?: InstitutionTypeCode | null;
|
||||
location?: string | null;
|
||||
locationType?: 'city' | 'province' | null;
|
||||
intent?: 'count' | 'list' | 'info' | null;
|
||||
|
||||
// NEW: Ontology-derived fields
|
||||
institutionSubtype?: string | null; // e.g., 'MUNICIPAL_ARCHIVE', 'ART_MUSEUM'
|
||||
recordSetType?: string | null; // e.g., 'CIVIL_REGISTRY', 'COUNCIL_GOVERNANCE'
|
||||
subtypeWikidata?: string | null; // e.g., 'Q8362876' for LOD integration
|
||||
}
|
||||
```
|
||||
|
||||
## Enhanced Cache Key Format
|
||||
|
||||
```
|
||||
{intent}:{institutionType}[.{subtype}][:{recordSetType}]:{location}
|
||||
|
||||
Examples:
|
||||
- "count:m:amsterdam" # Basic museum count
|
||||
- "count:m.art_museum:amsterdam" # Art museum count (subtype)
|
||||
- "list:a.municipal_archive:nh" # Municipal archives in Noord-Holland
|
||||
- "query:a:civil_registry:utrecht" # Civil registry in Utrecht
|
||||
- "info:a.national_archive::nl" # National archive info (empty record set segment, location "nl")
|
||||
```
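
A key in this format could be assembled as sketched below. The `KeyParts` shape and `buildCacheKey` are hypothetical names, and lowercasing the whole key is an assumption based on the lowercase examples above. The sketch omits an absent record set type entirely and so does not reproduce the empty `::` segment seen in the last example.

```typescript
// Hypothetical input shape; field names mirror the ExtractedEntities interface.
interface KeyParts {
  intent: string;
  institutionType: string;
  institutionSubtype?: string | null;
  recordSetType?: string | null;
  location: string;
}

// {intent}:{institutionType}[.{subtype}][:{recordSetType}]:{location}
function buildCacheKey(p: KeyParts): string {
  let typePart = p.institutionType;
  if (p.institutionSubtype) typePart += `.${p.institutionSubtype}`;
  const segments = [p.intent, typePart];
  if (p.recordSetType) segments.push(p.recordSetType); // optional segment
  segments.push(p.location);
  return segments.join(':').toLowerCase();
}
```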
|
||||
|
||||
## Implementation Files
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `scripts/extract-types-vocab.ts` | Build-time vocabulary extraction from LinkML |
|
||||
| `apps/archief-assistent/public/types-vocab.json` | Generated vocabulary file |
|
||||
| `apps/archief-assistent/src/lib/types-vocabulary.ts` | Runtime vocabulary loader |
|
||||
| `apps/archief-assistent/src/lib/semantic-cache.ts` | Updated entity extraction |
|
||||
|
||||
## Build Integration
|
||||
|
||||
Add to `apps/archief-assistent/package.json`:
|
||||
|
||||
```json
|
||||
{
|
||||
"scripts": {
|
||||
"prebuild": "tsx ../../scripts/extract-types-vocab.ts",
|
||||
"build": "vite build"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Keyword Extraction Priority
|
||||
|
||||
When extracting keywords from schema files:
|
||||
|
||||
1. **`keywords`** array (highest priority) - Explicit search terms
|
||||
2. **`structured_aliases.literal_form`** - Multilingual alternative names
|
||||
3. **`type_label`** - Preferred labels per language
|
||||
4. **Class name conversion** - `MunicipalArchive` → "municipal archive"
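
The priority order above can be sketched as a single pass that gathers terms from each source in turn and drops duplicates, so a higher-priority source keeps its position. The `SchemaClass` shape is a simplified assumption. In particular, `type_label` is treated here as a plain list of strings, which may not match the real LinkML structure.

```typescript
// Simplified, hypothetical view of one parsed LinkML class.
interface SchemaClass {
  name: string; // class name, e.g. "MunicipalArchive"
  keywords?: string[];
  structured_aliases?: { literal_form: string; in_language?: string }[];
  type_label?: string[]; // assumption: flattened list of preferred labels
}

// "MunicipalArchive" -> "municipal archive"
function classNameToTerm(name: string): string {
  return name.replace(/([a-z0-9])([A-Z])/g, '$1 $2').toLowerCase();
}

// Gather terms in priority order and keep only the first occurrence of each.
function extractTerms(cls: SchemaClass): string[] {
  const ordered: string[] = [
    ...(cls.keywords ?? []),
    ...(cls.structured_aliases ?? []).map((a) => a.literal_form),
    ...(cls.type_label ?? []),
    classNameToTerm(cls.name),
  ];
  const seen = new Set<string>();
  const terms: string[] = [];
  for (const raw of ordered) {
    const term = raw.trim().toLowerCase();
    if (term && !seen.has(term)) {
      seen.add(term);
      terms.push(term);
    }
  }
  return terms;
}
```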
|
||||
|
||||
## Cache Segmentation Rules
|
||||
|
||||
### Rule 1: Subtype Specificity
|
||||
|
||||
Queries with **specific subtypes** should NOT match **generic type** cache entries:
|
||||
|
||||
```
|
||||
Query: "kunstmusea in Amsterdam" → key: "count:m.art_museum:amsterdam"
|
||||
Cached: "musea in Amsterdam" → key: "count:m:amsterdam"
|
||||
Result: MISS (subtype mismatch) ✅
|
||||
```
|
||||
|
||||
### Rule 2: Record Set Type Isolation
|
||||
|
||||
Queries about **specific record types** should cache separately:
|
||||
|
||||
```
|
||||
Query: "burgerlijke stand Utrecht" → key: "query:a:civil_registry:utrecht"
|
||||
Cached: "archieven in Utrecht" → key: "list:a:utrecht"
|
||||
Result: MISS (record set type mismatch) ✅
|
||||
```
|
||||
|
||||
### Rule 3: Subtype-to-Type Fallback
|
||||
|
||||
Generic queries must NOT be served from subtype cache entries, because a subtype result covers only a subset of what the generic query asks for:
|
||||
|
||||
```
|
||||
Query: "musea in Amsterdam" → key: "count:m:amsterdam"
|
||||
Cached: "kunstmusea in Amsterdam" → key: "count:m.art_museum:amsterdam"
|
||||
Result: MISS (don't return subset for superset query)
|
||||
```
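
All three examples above end in a MISS whenever the subtype or record set type differs, in either direction. One way to express that is a strict field-by-field comparison, sketched below. `SegmentedEntities` and `canServeFromCache` are hypothetical names, and treating a missing field as its own distinct value is an assumption drawn from the examples.

```typescript
// Hypothetical subset of ExtractedEntities used for segmentation.
interface SegmentedEntities {
  intent?: string | null;
  institutionType?: string | null;
  institutionSubtype?: string | null;
  recordSetType?: string | null;
  location?: string | null;
}

// Normalise so that undefined, null and '' compare equal, ignoring case.
function norm(value?: string | null): string {
  return (value ?? '').toLowerCase();
}

// A cached entry may answer a query only if every segment matches exactly.
// A missing subtype is NOT a wildcard: "musea" and "kunstmusea" never share an entry.
function canServeFromCache(query: SegmentedEntities, cached: SegmentedEntities): boolean {
  return (
    norm(query.intent) === norm(cached.intent) &&
    norm(query.institutionType) === norm(cached.institutionType) &&
    norm(query.institutionSubtype) === norm(cached.institutionSubtype) &&
    norm(query.recordSetType) === norm(cached.recordSetType) &&
    norm(query.location) === norm(cached.location)
  );
}
```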
|
||||
|
||||
## Migration Notes
|
||||
|
||||
1. **Backwards Compatible**: Existing cache entries without `institutionSubtype` continue to work
|
||||
2. **Gradual Rollout**: New cache entries get subtype, old entries remain valid
|
||||
3. **Cache Clear**: Consider clearing cache after deployment to ensure consistency
|
||||
|
||||
## Validation
|
||||
|
||||
Run E2E tests to verify:
|
||||
|
||||
```bash
|
||||
cd apps/archief-assistent
|
||||
npm run test:e2e
|
||||
```
|
||||
|
||||
Key test cases:
|
||||
- Geographic isolation (Amsterdam ≠ Rotterdam ≠ Noord-Holland)
|
||||
- Subtype isolation (kunstmuseum ≠ museum)
|
||||
- Record set isolation (burgerlijke stand ≠ archive)
|
||||
- Intent isolation (count ≠ list ≠ info)
|
||||
|
||||
## References
|
||||
|
||||
- **Rule 41**: Types classes define SPARQL template variables
|
||||
- **Rule 0b**: Type/Types file naming convention
|
||||
- **CustodianType.yaml**: Base taxonomy definition
|
||||
- **AGENTS.md**: GLAMORCUBESFIXPHDNT taxonomy documentation
|
||||
|
||||
---
|
||||
|
||||
**Created**: 2026-01-10
|
||||
**Author**: OpenCode Agent
|
||||
**Status**: Implemented (v2.0)
|
||||
|
||||
## References
|
||||
|
||||
- Pavlyshyn, V. "Context Graphs and Data Traces: Building Epistemology Layers for Agentic Memory"
|
||||
- Pavlyshyn, V. "The Shape of Knowledge: Topology Theory for Knowledge Graphs"
|
||||
- Pavlyshyn, V. "Beyond Hierarchy: Why Agentic AI Systems Need Holarchies"
|
||||
- Pavlyshyn, V. "Smalltalk: The Language That Changed Everything"
|
||||
- Pavlyshyn, V. "Clarity Traders: Beyond Vibe Coding"
|
||||
|
|
@ -0,0 +1,65 @@
|
|||
# Rule: Engineering Parsimony and Domain Modeling
|
||||
|
||||
## Critical Convention
|
||||
|
||||
Our ontology follows an engineering-oriented approach: practical domain utility and
|
||||
stable interoperability take priority over minimal, tool-specific class catalogs.
|
||||
|
||||
## Rule
|
||||
|
||||
1. Model domain concepts, not implementation tools.
   - Reject classes like `ExaSearchMetadata`, `OpenAIFetchResult`, `ElasticsearchHit`.

2. Prefer generic, reusable activity/entity classes for operational provenance.
   - Use classes such as `ExternalSearchMetadata`, `RetrievalActivity`, `SearchResult`.

3. Capture tool/vendor details in slot values, not class names.
   - Record with generic predicates like `has_tool`, `has_method`, `has_agent`, `has_note`.

4. Digital platforms acting as custodians are valid domain classes.
   - Platform-as-custodian classes (for example YouTube-related custodian classes) are allowed.
   - Data processing/search tools are not ontology class candidates.

5. Avoid ontology growth driven by transient engineering stack choices.
   - New class proposals must be justified by cross-tool, domain-stable semantics.
|
||||
|
||||
## Rationale
|
||||
|
||||
- Tool names are volatile implementation details and age quickly.
|
||||
- Domain-level abstractions maximize reuse, query consistency, and mapping stability.
|
||||
- This follows engineering ontology practice, where strict theoretical parsimony is not the only optimization criterion: practical semantic interoperability and maintainability come first.
|
||||
|
||||
## Examples
|
||||
|
||||
### Wrong
|
||||
|
||||
```yaml
classes:
  ExaSearchMetadata:
    class_uri: prov:Activity
```
|
||||
|
||||
### Correct
|
||||
|
||||
```yaml
classes:
  ExternalSearchMetadata:
    class_uri: prov:Activity
    slots:
      - has_tool
      - has_method
      - has_agent
```
|
||||
|
||||
## References
|
||||
|
||||
1. Liefke, K. (2024). *Natural Language Ontology and Semantic Theory*.
|
||||
Cambridge Elements in Semantics. DOI: `10.1017/9781009307789`.
|
||||
URL: https://www.cambridge.org/core/elements/abs/natural-language-ontology-and-semantic-theory/E8DDE548BB8A98137721984E26FAD764
|
||||
|
||||
2. Liefke, K. (2025). *Reduction and Unification in Natural Language Ontology*.
|
||||
Cambridge Elements in Semantics. DOI: `10.1017/9781009559683`.
|
||||
URL: https://www.cambridge.org/core/elements/abs/reduction-and-unification-in-natural-language-ontology/40F58ABA0D9C08958B5926F0CBDAD3CA
|
||||
|
||||
|
|
@ -18,7 +18,7 @@
|
|||
|
||||
## 🚫 AUTOMATED ENRICHMENT IS PROHIBITED 🚫
|
||||
|
||||
**DO NOT USE** automated scripts to enrich person profiles with web search data. The `enrich_person_comprehensive.py` script has been deprecated.
|
||||
**DO NOT USE** automated scripts to enrich person profiles with web search data.
|
||||
|
||||
**Why automated enrichment failed**:
|
||||
- Web searches return data about DIFFERENT people with similar names
|
||||
|
|
@ -184,95 +184,12 @@ Domains: geni.com, ancestry.*, familysearch.org, findagrave.com, myheritage.*
|
|||
→ Exception: If source explicitly links to living person with verifiable connection
|
||||
```
|
||||
|
||||
## Implementation in Enrichment Scripts
|
||||
|
||||
```python
def validate_entity_match(profile: dict, search_result: dict) -> tuple[bool, str]:
    """
    Validate that a search result refers to the same person as the profile.

    REQUIRES: At least 3 of 5 identity attributes must match.
    Name match alone is INSUFFICIENT and automatically rejected.

    Returns (is_valid, reason)
    """
    # `or [{}]` also covers an empty affiliations list, which would otherwise raise IndexError
    affiliations = profile.get('affiliations') or [{}]
    profile_employer = affiliations[0].get('custodian_name', '').lower()
    profile_location = profile.get('profile_data', {}).get('location', '').lower()
    profile_role = profile.get('profile_data', {}).get('headline', '').lower()

    source_text = search_result.get('answer', '').lower()
    source_url = search_result.get('source_url', '').lower()

    # AUTOMATIC REJECTION: Genealogy sources
    genealogy_domains = ['geni.com', 'ancestry.', 'familysearch.', 'findagrave.', 'myheritage.']
    if any(domain in source_url for domain in genealogy_domains):
        return False, "genealogy_source_rejected"

    # AUTOMATIC REJECTION: Profession conflicts
    heritage_roles = ['curator', 'archivist', 'librarian', 'conservator', 'registrar', 'collection', 'heritage']
    entertainment_roles = ['actress', 'actor', 'singer', 'footballer', 'politician', 'model', 'athlete']

    profile_is_heritage = any(role in profile_role for role in heritage_roles)
    source_is_entertainment = any(role in source_text for role in entertainment_roles)

    if profile_is_heritage and source_is_entertainment:
        return False, "conflicting_profession"

    # AUTOMATIC REJECTION: Location conflicts
    if profile_location:
        location_conflicts = [
            ('venezuela', 'uk'), ('mexico', 'netherlands'), ('brazil', 'france'),
            ('caracas', 'london'), ('mexico city', 'amsterdam')
        ]
        for source_loc, profile_loc in location_conflicts:
            if source_loc in source_text and profile_loc in profile_location:
                return False, "conflicting_location"

    # Count positive identity attribute matches (need 3 of 5)
    matches = 0
    match_details = []

    # 1. Employer match
    if profile_employer and profile_employer in source_text:
        matches += 1
        match_details.append(f"employer:{profile_employer}")

    # 2. Location match
    if profile_location and profile_location in source_text:
        matches += 1
        match_details.append(f"location:{profile_location}")

    # 3. Role/profession match
    if profile_role:
        role_words = [w for w in profile_role.split() if len(w) > 4]
        if any(word in source_text for word in role_words):
            matches += 1
            match_details.append("role_match")

    # 4. Education/institution match (if available)
    profile_education = profile.get('profile_data', {}).get('education', [])
    if profile_education:
        edu_names = [e.get('school', '').lower() for e in profile_education if e.get('school')]
        if any(edu in source_text for edu in edu_names):
            matches += 1
            match_details.append("education_match")

    # 5. Time period match (career dates)
    # (implementation depends on available data)

    # REQUIRE 3 OF 5 MATCHES
    if matches < 3:
        return False, f"insufficient_identity_verification (only {matches}/5 attributes matched)"

    return True, f"verified ({matches}/5 matches: {', '.join(match_details)})"
```
|
||||
|
||||
## Claim Rejection Patterns
|
||||
|
||||
The following patterns should trigger automatic claim rejection:
|
||||
The following inconsistent patterns should trigger automatic claim rejection:
|
||||
|
||||
```python
|
||||
# Genealogy sources - ALWAYS REJECT
|
||||
# Genealogy sources conflict - ALWAYS REJECT
|
||||
GENEALOGY_DOMAINS = [
|
||||
'geni.com', 'ancestry.com', 'ancestry.co.uk', 'familysearch.org',
|
||||
'findagrave.com', 'myheritage.com', 'wikitree.com', 'geneanet.org'
|
||||
|
|
@ -293,7 +210,7 @@ LOCATION_PAIRS = [
|
|||
('caracas', 'london'), ('caracas', 'amsterdam'),
|
||||
]
|
||||
|
||||
# Age impossibility - if birth year makes current career implausible, REJECT
|
||||
# Age impossibility - if the birth year makes the current career implausible, REJECT. Example bounds for a junior role:
|
||||
MIN_PLAUSIBLE_BIRTH_YEAR = 1945 # Would be 80 in 2025 - still plausible but verify
|
||||
MAX_PLAUSIBLE_BIRTH_YEAR = 2002 # Would be 23 in 2025 - plausible for junior roles
|
||||
```
|
||||
|
|
|
|||
|
|
@ -0,0 +1,248 @@
|
|||
# Rule 47: Disambiguation Entity Profiles - Prevent Repeated Entity Resolution Errors
|
||||
|
||||
## Status: CRITICAL
|
||||
|
||||
## Summary
|
||||
|
||||
When entity resolution determines that a web source describes a **different person** with a similar name, **create a PPID profile for that person** in `data/person/`. The PPID system is universal - ANY person who ever lived can have a profile, regardless of heritage relevance.
|
||||
|
||||
---
|
||||
|
||||
## The Universal PPID Principle
|
||||
|
||||
**In principle, all persons on Earth should be assigned PPIDs** - whether or not they are active in the heritage field. This includes:
|
||||
|
||||
- Heritage workers (curators, archivists, librarians, etc.)
|
||||
- Non-heritage professionals (actors, doctors, athletes, etc.)
|
||||
- Historical persons (deceased individuals from any era)
|
||||
- Public figures and private individuals
|
||||
|
||||
The `heritage_relevance` field indicates whether someone works in the heritage sector, but does NOT determine whether they can have a profile. **Anyone can have a PPID.**
|
||||
|
||||
---
|
||||
|
||||
## The Problem
|
||||
|
||||
During entity resolution, we often discover that web search results describe a **different person** with a similar name:
|
||||
|
||||
| Heritage Profile | Namesake Discovered | Why Different |
|
||||
|------------------|---------------------|---------------|
|
||||
| Carmen Juliá (UK curator) | Carmen Julia Álvarez (Venezuelan actress) | Different profession, location, timeline |
|
||||
| Jan de Vries (Rijksmuseum curator) | Jan de Vries (footballer) | Different profession |
|
||||
| Robert Ritter (heritage worker) | Robert Ritter (Nazi doctor, 1901-1951) | Different era, profession |
|
||||
|
||||
Without creating a profile for the namesake, future enrichment attempts may:
|
||||
1. Re-discover the same namesake
|
||||
2. Waste time re-investigating
|
||||
3. Risk attributing false claims again
|
||||
|
||||
---
|
||||
|
||||
## The Solution: Create PPID Profiles for Namesakes
|
||||
|
||||
When entity resolution proves two entities are different, **create a regular PPID profile for the namesake**:
|
||||
|
||||
1. Use standard PPID naming convention (no special prefix)
|
||||
2. Set `heritage_relevance.is_heritage_relevant: false`
|
||||
3. Document the disambiguation in BOTH profiles
|
||||
|
||||
---
|
||||
|
||||
## Example: Venezuelan Actress Profile
|
||||
|
||||
```json
|
||||
{
|
||||
"ppid": "ID_VE-XX-CCS_1952_VE-XX-CCS_XXXX_CARMEN-JULIA-ALVAREZ",
|
||||
"profile_data": {
|
||||
"full_name": "Carmen Julia Álvarez",
|
||||
"profession": "actress",
|
||||
"nationality": "Venezuelan",
|
||||
"birth_year": 1952,
|
||||
"birth_location": "Caracas, Venezuela",
|
||||
"active_period": "1970s-2000s"
|
||||
},
|
||||
"heritage_relevance": {
|
||||
"is_heritage_relevant": false,
|
||||
"relevance_score": 0.0,
|
||||
"reason": "Entertainment industry professional - actress in film and television"
|
||||
},
|
||||
"disambiguation_notes": {
|
||||
"commonly_confused_with": [
|
||||
{
|
||||
"ppid": "ID_UK-XX-XXX_XXXX_UK-XX-XXX_XXXX_CARMEN-JULIA",
|
||||
"name": "Carmen Juliá",
|
||||
"profession": "curator",
|
||||
"employer": "New Contemporaries",
|
||||
"location": "UK",
|
||||
"why_different": "Different profession (actress vs curator), different location (Venezuela vs UK), overlapping active periods in incompatible roles"
|
||||
}
|
||||
],
|
||||
"disambiguation_note": "This is the Venezuelan actress, NOT the UK-based art curator."
|
||||
},
|
||||
"web_claims": [
|
||||
{
|
||||
"claim_type": "birth_year",
|
||||
"claim_value": 1952,
|
||||
"provenance": {
|
||||
"source_url": "https://en.wikipedia.org/wiki/Carmen_Julia_Álvarez",
|
||||
"retrieved_on": "2026-01-11T14:30:00Z",
|
||||
"retrieval_agent": "manual-human-curator"
|
||||
}
|
||||
},
|
||||
{
|
||||
"claim_type": "profession",
|
||||
"claim_value": "actress",
|
||||
"provenance": {
|
||||
"source_url": "https://en.wikipedia.org/wiki/Carmen_Julia_Álvarez",
|
||||
"retrieved_on": "2026-01-11T14:30:00Z",
|
||||
"retrieval_agent": "manual-human-curator"
|
||||
}
|
||||
}
|
||||
],
|
||||
"extraction_metadata": {
|
||||
"created_at": "2026-01-11T15:00:00Z",
|
||||
"created_by": "manual-human-curator",
|
||||
"creation_reason": "Created during entity resolution to distinguish from heritage worker Carmen Juliá"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Update the Heritage Profile Too
|
||||
|
||||
The heritage profile should also reference the disambiguation:
|
||||
|
||||
```json
|
||||
{
|
||||
"ppid": "ID_UK-XX-XXX_XXXX_UK-XX-XXX_XXXX_CARMEN-JULIA",
|
||||
"profile_data": {
|
||||
"full_name": "Carmen Juliá",
|
||||
"headline": "Curator at New Contemporaries"
|
||||
},
|
||||
"heritage_relevance": {
|
||||
"is_heritage_relevant": true,
|
||||
"relevance_score": 0.85
|
||||
},
|
||||
"disambiguation_notes": {
|
||||
"known_namesakes": [
|
||||
{
|
||||
"ppid": "ID_VE-XX-CCS_1952_VE-XX-CCS_XXXX_CARMEN-JULIA-ALVAREZ",
|
||||
"name": "Carmen Julia Álvarez",
|
||||
"profession": "actress",
|
||||
"location": "Venezuela",
|
||||
"why_not_same_person": "Different profession, location, timeline"
|
||||
}
|
||||
],
|
||||
"disambiguation_warning": "Web searches for 'Carmen Julia' return data about Venezuelan actress Carmen Julia Álvarez (born 1952). This is a DIFFERENT person."
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## When to Create Namesake Profiles
|
||||
|
||||
Create a PPID profile for a namesake when:
|
||||
|
||||
1. **Entity resolution proves they are a different person**
|
||||
2. **They are notable enough** to appear in search results repeatedly (Wikipedia, IMDB, news)
|
||||
3. **The confusion risk is high** (similar name, some overlapping attributes)
|
||||
|
||||
**Do NOT create profiles for**:
|
||||
- Random social media accounts with no notable presence
|
||||
- Obvious mismatches unlikely to recur in searches
|
||||
|
||||
---
|
||||
|
||||
## Benefits
|
||||
|
||||
1. **Universal person database**: Any person can have a PPID
|
||||
2. **Prevents repeated mistakes**: Future enrichment can check for known namesakes
|
||||
3. **Bidirectional linking**: Both profiles reference each other
|
||||
4. **Consistent data model**: No special file naming or profile types needed
|
||||
5. **Audit trail**: Documents why profiles were created
|
||||
|
||||
---
|
||||
|
||||
## Workflow
|
||||
|
||||
### Step 1: During Entity Resolution
|
||||
|
||||
When you reject a claim due to identity mismatch with a notable namesake:
|
||||
|
||||
```
|
||||
1. Document WHY the source describes a different person
|
||||
2. Check if the namesake is notable (Wikipedia, IMDB, frequent search results)
|
||||
3. If notable → Create PPID profile for the namesake
|
||||
4. Link both profiles via disambiguation_notes
|
||||
```
|
||||
|
||||
### Step 2: Create Namesake Profile
|
||||
|
||||
Use standard PPID naming:
|
||||
```
|
||||
ID_{birth-location}_{birth-decade}_{current-location}_{death-decade}_{NAME}.json
|
||||
```
|
||||
|
||||
Example: `ID_VE-XX-CCS_1952_VE-XX-CCS_XXXX_CARMEN-JULIA-ALVAREZ.json`
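
The naming pattern above can be sketched in Python as follows. This is a minimal illustration, not the project's actual tooling: the helper names `slugify_name` and `build_ppid_filename` are hypothetical, and the name normalisation (strip diacritics, uppercase, hyphenate) is an assumption inferred from the `CARMEN-JULIA-ALVAREZ` example. Letters that do not decompose into a base letter plus an accent (for example `ø`) would need extra handling.

```python
import re
import unicodedata


def slugify_name(full_name: str) -> str:
    """Strip diacritics, uppercase, and hyphenate the name parts (assumed convention)."""
    decomposed = unicodedata.normalize("NFKD", full_name)
    ascii_only = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^A-Za-z0-9]+", "-", ascii_only).strip("-").upper()


def build_ppid_filename(birth_location, birth_date, current_location, death_date, full_name):
    """Assemble the filename; unknown date components fall back to 'XXXX'."""
    parts = [
        "ID",
        birth_location,
        birth_date or "XXXX",
        current_location,
        death_date or "XXXX",
        slugify_name(full_name),
    ]
    return "_".join(parts) + ".json"


filename = build_ppid_filename("VE-XX-CCS", "1952", "VE-XX-CCS", None, "Carmen Julia Álvarez")
print(filename)  # ID_VE-XX-CCS_1952_VE-XX-CCS_XXXX_CARMEN-JULIA-ALVAREZ.json
```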
|
||||
|
||||
### Step 3: Update Both Profiles
|
||||
|
||||
- Namesake profile: Add `commonly_confused_with` pointing to heritage profile
|
||||
- Heritage profile: Add `known_namesakes` pointing to namesake profile
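
A rough sketch of the bidirectional link described in Step 3, assuming both profiles are already loaded as Python dictionaries shaped like the JSON examples in this rule. The function name `link_namesakes` is hypothetical, and the sketch copies only the `ppid`, name, and reason fields.

```python
def link_namesakes(heritage_profile: dict, namesake_profile: dict, reason: str) -> None:
    """Add mutual disambiguation entries to both profiles (mutates them in place)."""

    def add_entry(profile, list_key, reason_key, other):
        notes = profile.setdefault("disambiguation_notes", {})
        entries = notes.setdefault(list_key, [])
        # Skip if the other profile is already referenced, so repeated runs stay idempotent.
        if any(entry.get("ppid") == other["ppid"] for entry in entries):
            return
        entries.append({
            "ppid": other["ppid"],
            "name": other.get("profile_data", {}).get("full_name", ""),
            reason_key: reason,
        })

    add_entry(namesake_profile, "commonly_confused_with", "why_different", heritage_profile)
    add_entry(heritage_profile, "known_namesakes", "why_not_same_person", namesake_profile)


heritage = {
    "ppid": "ID_UK-XX-XXX_XXXX_UK-XX-XXX_XXXX_CARMEN-JULIA",
    "profile_data": {"full_name": "Carmen Juliá"},
}
namesake = {
    "ppid": "ID_VE-XX-CCS_1952_VE-XX-CCS_XXXX_CARMEN-JULIA-ALVAREZ",
    "profile_data": {"full_name": "Carmen Julia Álvarez"},
}
link_namesakes(heritage, namesake, "Different profession, location, timeline")
```

Calling the function twice leaves a single entry in each profile, which keeps re-runs of a curation session from duplicating links.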
|
||||
|
||||
---
|
||||
|
||||
## Historical Persons
|
||||
|
||||
Historical persons (deceased) can also have PPID profiles:
|
||||
|
||||
```json
|
||||
{
|
||||
"ppid": "ID_DE-XX-XXX_1901_DE-XX-XXX_1951_ROBERT-RITTER",
|
||||
"profile_data": {
|
||||
"full_name": "Robert Ritter",
|
||||
"profession": "physician",
|
||||
"birth_year": 1901,
|
||||
"death_year": 1951,
|
||||
"nationality": "German",
|
||||
"historical_note": "Nazi-era physician involved in racial hygiene programs"
|
||||
},
|
||||
"heritage_relevance": {
|
||||
"is_heritage_relevant": false,
|
||||
"relevance_score": 0.0
|
||||
},
|
||||
"disambiguation_notes": {
|
||||
"commonly_confused_with": [
|
||||
{
|
||||
"ppid": "ID_XX-XX-XXX_XXXX_XX-XX-XXX_XXXX_ROBERT-RITTER",
|
||||
"name": "Robert Ritter",
|
||||
"profession": "heritage worker",
|
||||
"why_different": "Different era - historical figure (1901-1951) vs living heritage professional"
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Related Rules
|
||||
|
||||
- **Rule 46**: Entity Resolution - Names Are NEVER Sufficient
|
||||
- **Rule 21**: Data Fabrication is Strictly Prohibited
|
||||
- **Rule 26**: Person Data Provenance - Web Claims for Staff Information
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
**The PPID system is universal.** When you discover during entity resolution that a web source describes a different person:
|
||||
|
||||
1. **Create a regular PPID profile** for the namesake (actress, historical figure, etc.)
|
||||
2. **Set `heritage_relevance.is_heritage_relevant: false`** (unless they happen to also work in heritage)
|
||||
3. **Link both profiles** via `disambiguation_notes`
|
||||
4. **Use standard PPID naming** - no special prefixes needed
|
||||
|
||||
This builds a comprehensive person database while preventing entity resolution errors.
|
||||
|
|
@ -0,0 +1,307 @@
|
|||
# Rule 46: Entity Resolution - Names Are NEVER Sufficient
|
||||
|
||||
## Status: CRITICAL
|
||||
|
||||
## 🚨 DATA QUALITY IS OF UTMOST IMPORTANCE 🚨
|
||||
|
||||
**Wrong data is worse than no data.** Attributing a birth year, spouse, or social media profile to the wrong person is a **critical data quality failure** that undermines the entire dataset's trustworthiness.
|
||||
|
||||
**ALL enrichments MUST be done MANUALLY and double-checked.** Automated web search enrichment has been DISABLED due to catastrophic entity resolution failures (540+ false claims removed in Jan 2026).
|
||||
|
||||
**The cost of false data**:
|
||||
- Corrupts downstream analysis and reporting
|
||||
- Creates legal/privacy risks (attributing data to wrong person)
|
||||
- Destroys user trust in the dataset
|
||||
- Requires expensive manual cleanup
|
||||
|
||||
---
|
||||
|
||||
## 🚫 AUTOMATED ENRICHMENT IS PROHIBITED 🚫
|
||||
|
||||
**DO NOT USE** automated scripts to enrich person profiles with web search data.
|
||||
|
||||
**Why automated enrichment failed**:
|
||||
- Web searches return data about DIFFERENT people with similar names
|
||||
- Regex pattern matching cannot distinguish between namesakes
|
||||
- Wikipedia, IMDB, ResearchGate, Instagram all returned data from wrong people
|
||||
- Example: "Carmen Juliá" search returned Venezuelan actress, Mexican hydrogeologist, Spanish medievalist - NONE were the UK art curator
|
||||
|
||||
**ONLY ALLOWED enrichment methods**:
|
||||
1. **Manual research** - Human curator verifies source refers to the correct person
|
||||
2. **Institutional sources** - Data from the person's employer website (verified)
|
||||
3. **LinkedIn profile data** - Already verified via direct profile access
|
||||
4. **ORCID/Wikidata** - If the person has a verified identifier
|
||||
|
||||
---
|
||||
|
||||
## The Core Principle
|
||||
|
||||
🚨 **SIMILAR OR IDENTICAL NAMES ARE NEVER SUFFICIENT FOR ENTITY RESOLUTION.**
|
||||
|
||||
A web search result mentioning "Carmen Juliá born 1952" is **NOT** evidence that the Carmen Juliá in our person profile was born in 1952. Names are not unique identifiers - there are thousands of people with the same name worldwide.
|
||||
|
||||
**Entity resolution requires verification of MULTIPLE independent identity attributes:**
|
||||
|
||||
| Attribute | Purpose | Example |
|
||||
|-----------|---------|---------|
|
||||
| **Age/Birth Year** | Temporal consistency | Both sources describe someone in their 40s |
|
||||
| **Career Path** | Professional identity | Both are art curators, not one curator and one actress |
|
||||
| **Location** | Geographic consistency | Both are based in UK, not one UK and one Venezuela |
|
||||
| **Employer** | Institutional affiliation | Both work at New Contemporaries |
|
||||
| **Education** | Academic background | Same university or field |
|
||||
|
||||
**Minimum Requirement**: At least **3 of 5** attributes must match before attributing ANY claim from a web source. Name match alone = **AUTOMATIC REJECTION**.
|
||||
|
||||
## Problem Statement
|
||||
|
||||
When enriching person profiles via web search (Linkup, Exa, etc.), search results often return data about **different people with similar or identical names**. Without proper entity resolution, the enrichment process can attribute false claims to the wrong person.
|
||||
|
||||
**Example Failure** (Carmen Juliá - UK Art Curator):
|
||||
- Source profile: Carmen Juliá, Curator at New Contemporaries (UK)
|
||||
- Birth year extracted: 1952 from Carmen Julia **Álvarez** (Venezuelan actress)
|
||||
- Spouse extracted: "actors Eduardo Serrano" from the Venezuelan actress
|
||||
- ResearchGate: Carmen Julia **Navarro** (Mexican hydrogeologist)
|
||||
- Academia.edu: Carmen Julia **Gutiérrez** (Spanish medievalist)
|
||||
|
||||
All data is from **different people** - none is the actual Carmen Juliá who is a UK-based art curator.
|
||||
|
||||
**Why This Happened**: The enrichment script used regex pattern matching to extract "born 1952" without verifying that the Wikipedia article described the SAME person.
|
||||
|
||||
## The Rule
|
||||
|
||||
### DO NOT use name matching as the basis for entity resolution. EVER.
|
||||
|
||||
For person enrichment via web search:
|
||||
|
||||
**FORBIDDEN** (Name-based extraction):
|
||||
- ❌ Extracting birth years from any search result mentioning "Carmen Julia born..."
|
||||
- ❌ Attributing social media profiles just because the name appears
|
||||
- ❌ Claiming relationships (spouse, parent, child) from web text pattern matching
|
||||
- ❌ Assigning academic profiles (ResearchGate, Academia.edu, Google Scholar) based on name matching alone
|
||||
- ❌ Using Wikipedia articles without verifying ALL identity attributes
|
||||
- ❌ Trusting genealogy sites (Geni, Ancestry, MyHeritage) which describe historical namesakes
|
||||
- ❌ Using IMDB for birth years (actors with same names)
|
||||
|
||||
**REQUIRED** (Multi-Attribute Entity Resolution):
|
||||
1. **Verify identity via MULTIPLE attributes** - name alone is INSUFFICIENT
|
||||
2. **Cross-reference with known facts** (employer, location, job title from LinkedIn)
|
||||
3. **Detect conflicting signals** - actress vs curator, Venezuela vs UK, 1950s birth vs active 2020s career
|
||||
4. **Reject ambiguous matches** - if source doesn't clearly identify the same person, reject the claim
|
||||
5. **Document rejection rationale** - log why claim was rejected for audit trail
|
||||
|
||||
## Entity Resolution Verification Checklist
|
||||
|
||||
Before attributing a web claim to a person profile, verify MULTIPLE identity attributes:
|
||||
|
||||
| # | Attribute | What to Check | Example Match | Example Conflict |
|
||||
|---|-----------|---------------|---------------|------------------|
|
||||
| 1 | **Career/Profession** | Same field/industry | Both are curators | Source says "actress", profile is curator |
|
||||
| 2 | **Employer** | Same institution | Both at Rijksmuseum | Source says "film studio", profile is museum |
|
||||
| 3 | **Location** | Same city/country | Both UK-based | Source says Venezuela, profile is UK |
|
||||
| 4 | **Age Range** | Plausible for career | Birth 1980s, active 2020s | Birth 1952, but profile is a junior role in 2025 |
|
||||
| 5 | **Education** | Same university/field | Both art history | Source says "medical school" |
|
||||
|
||||
**Minimum requirement**: At least **3 of 5** attributes must match. Name match alone = **AUTOMATIC REJECTION**.
|
||||
|
||||
**Any conflicting signal = AUTOMATIC REJECTION** (e.g., source says "actress" when profile is "curator").
|
||||
|
||||
## Sources with High Entity Resolution Risk
|
||||
|
||||
These sources are NOT forbidden, but require **stricter verification thresholds** due to high false-positive rates:
|
||||
|
||||
| Source Type | Risk Level | Why | Required Matches |
|
||||
|-------------|------------|-----|------------------|
|
||||
| Genealogy sites | CRITICAL | Historical persons with same name | 5/5 attributes (or explicit link to living person) |
|
||||
| IMDB | CRITICAL | Actors with common names | 5/5 attributes (unless person works in film/TV) |
|
||||
| Wikipedia | HIGH | Many people with same name have pages | 4/5 attributes match |
|
||||
| Academic profiles | HIGH | Multiple researchers with same name | 4/5 attributes + institution match |
|
||||
| Social media | HIGH | Many accounts with similar handles | 4/5 attributes + verify employer/location in bio |
|
||||
| News articles | MEDIUM | May mention multiple people | 3/5 attributes + read full context |
|
||||
| Institutional websites | LOW | Usually about their own staff | 2/5 attributes (good source if person works there) |
|
||||
|
||||
**Key point**: High-risk sources CAN be used if you verify enough identity attributes. The risk level determines the verification threshold, not whether the source is allowed.
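
As an illustration, the thresholds in the table above could be encoded as a simple lookup. The dictionary keys and the function name are hypothetical, the fallback of 5 for unknown source types is a conservative assumption, and the table's exceptions (explicit link to a living person, person works in film/TV, institution match) are deliberately left to the human curator.

```python
# Required attribute matches per source type, taken from the table above.
REQUIRED_MATCHES_BY_SOURCE = {
    "genealogy": 5,
    "imdb": 5,
    "wikipedia": 4,
    "academic_profile": 4,
    "social_media": 4,
    "news_article": 3,
    "institutional_website": 2,
}


def meets_source_threshold(source_type: str, matched_attributes: int) -> bool:
    """Return True when enough identity attributes matched for this source type."""
    # Unknown source types get the strictest threshold (assumption, not from the rule text).
    required = REQUIRED_MATCHES_BY_SOURCE.get(source_type, 5)
    return matched_attributes >= required


print(meets_source_threshold("wikipedia", 3))              # False
print(meets_source_threshold("institutional_website", 2))  # True
```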
|
||||
|
||||
## Red Flags Requiring Investigation
|
||||
|
||||
The following are **red flags** that require careful investigation - NOT automatic rejection. People change careers and relocate.
|
||||
|
||||
### Profession Differences
|
||||
If source profession differs from profile profession, **investigate**:
|
||||
```
|
||||
Source: "actress", "actor", "singer"
|
||||
Profile: "curator", "archivist", "librarian"
|
||||
|
||||
ASK: Did this person change careers?
|
||||
- Check timeline: Did acting career END before heritage career BEGAN?
|
||||
- Check for transition evidence: "former actress turned curator"
|
||||
- If careers overlap in time → likely different people → REJECT
|
||||
- If sequential careers with clear transition → may be same person → ACCEPT with documentation
|
||||
```
|
||||
|
||||
### Location Differences
|
||||
If source location differs from profile location, **investigate**:
|
||||
```
|
||||
Source: "Venezuela", "Mexico", "Brazil"
|
||||
Profile: "UK", "Netherlands", "France"
|
||||
|
||||
ASK: Did this person relocate?
|
||||
- Check timeline: When were they in each location?
|
||||
- Check for migration evidence: education abroad, international career moves
|
||||
- If locations overlap in time → likely different people → REJECT
|
||||
- If sequential locations with clear move → may be same person → ACCEPT with documentation
|
||||
```
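
The timeline question asked in both subsections above (do the two careers or locations overlap in time?) reduces to an interval check. The sketch below is only an aid for the curator: the function name, the tuple representation, and the example years are illustrative assumptions, and an overlap is a signal to weigh, not a verdict.

```python
def periods_overlap(first, second, current_year=2025):
    """Check whether two (start_year, end_year) periods overlap.

    An end_year of None means the activity is ongoing and is treated as current_year.
    """
    first_start, first_end = first
    second_start, second_end = second
    first_end = current_year if first_end is None else first_end
    second_end = current_year if second_end is None else second_end
    return first_start <= second_end and second_start <= first_end


# Hypothetical timelines: an acting career and a later curatorial career.
print(periods_overlap((1995, 2010), (2005, None)))  # True: overlapping, likely different people
print(periods_overlap((1990, 2004), (2006, None)))  # False: sequential, look for transition evidence
```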
|
||||
|
||||
### When to Actually REJECT
|
||||
|
||||
Reject when investigation shows **no plausible connection**:
|
||||
```
|
||||
Example: Carmen Julia Álvarez (Venezuelan actress, active 1970s-2000s)
|
||||
vs Carmen Juliá (UK curator, active 2015-present)
|
||||
|
||||
- Active periods (1970s-2000s vs 2015-present) in DIFFERENT professions on DIFFERENT continents
|
||||
- No evidence of career change or relocation
|
||||
- Birth year 1952 makes current junior curator role implausible
|
||||
→ REJECT: These are clearly different people
|
||||
```
|
||||
|
||||
### Age Conflicts (Still Automatic Rejection)
|
||||
If source age is **physically implausible** for profile career stage, REJECT:
|
||||
```
|
||||
Source: Born 1922, 1915, 1939
|
||||
Profile: Currently active professional in 2025
|
||||
→ REJECT (person would be 86-110 years old)
|
||||
|
||||
Source: Born 2007, 2004
|
||||
Profile: Senior curator
|
||||
→ REJECT (person would be 18-21, too young)
|
||||
```
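
A minimal sketch of the age check. The age bounds of 23 and 80 are assumptions derived from the birth-year bounds 1945 and 2002 given for 2025 in the Claim Rejection Patterns section of this rule; a senior role would need a higher minimum age, and the function name is hypothetical.

```python
def is_birth_year_plausible(birth_year, reference_year=2025, min_age=23, max_age=80):
    """Return True when the implied age falls inside the plausible working-age window."""
    age = reference_year - birth_year
    return min_age <= age <= max_age


for year in (1922, 1939, 1985, 2007):
    print(year, is_birth_year_plausible(year))
```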
|
||||
|
||||
### Genealogy Source
|
||||
Genealogy sources require **5 of 5 attribute matches** due to high false-positive rates:
|
||||
```
|
||||
Domains: geni.com, ancestry.*, familysearch.org, findagrave.com, myheritage.*
|
||||
→ REQUIRE 5/5 attribute matches (these often describe historical namesakes)
|
||||
→ Exception: If source explicitly links to living person with verifiable connection
|
||||
```
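
One way to recognise these domains is to parse the URL's hostname rather than search the raw string. The sketch below uses only the standard library. The split into exact domains and "brand" labels (for the `ancestry.*` and `myheritage.*` wildcards) is an assumption about how the wildcards should be read, and it errs on the side of flagging too much.

```python
from urllib.parse import urlparse

EXACT_GENEALOGY_DOMAINS = {"geni.com", "familysearch.org", "findagrave.com"}
GENEALOGY_BRANDS = {"ancestry", "myheritage"}  # matches ancestry.com, ancestry.co.uk, ...


def is_genealogy_source(url: str) -> bool:
    """Return True when the URL's host belongs to a known genealogy site."""
    host = (urlparse(url).hostname or "").lower()
    if not host:
        return False  # URLs without a scheme have no parsed hostname
    if any(host == domain or host.endswith("." + domain) for domain in EXACT_GENEALOGY_DOMAINS):
        return True
    return any(label in GENEALOGY_BRANDS for label in host.split("."))


print(is_genealogy_source("https://www.ancestry.co.uk/some/record"))  # True
print(is_genealogy_source("https://en.wikipedia.org/wiki/Example"))   # False
```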
|
||||
|
||||
## Claim Rejection Patterns
|
||||
|
||||
The following inconsistent patterns should trigger automatic claim rejection:
|
||||
|
||||
```python
|
||||
# Genealogy sources conflict - ALWAYS REJECT
|
||||
GENEALOGY_DOMAINS = [
|
||||
'geni.com', 'ancestry.com', 'ancestry.co.uk', 'familysearch.org',
|
||||
'findagrave.com', 'myheritage.com', 'wikitree.com', 'geneanet.org'
|
||||
]
|
||||
|
||||
# Profession conflicts - if profile has one and source has another, REJECT
|
||||
PROFESSION_CONFLICTS = {
|
||||
'heritage': ['curator', 'archivist', 'librarian', 'conservator', 'registrar', 'collection manager'],
|
||||
'entertainment': ['actress', 'actor', 'singer', 'footballer', 'politician', 'model', 'athlete'],
|
||||
'medical': ['doctor', 'nurse', 'surgeon', 'physician'],
|
||||
'tech': ['software engineer', 'developer', 'programmer'],
|
||||
}
|
||||
|
||||
# Location conflicts - if source describes person in location X and profile is location Y, REJECT
|
||||
LOCATION_PAIRS = [
|
||||
('venezuela', 'uk'), ('venezuela', 'netherlands'), ('venezuela', 'germany'),
|
||||
('mexico', 'uk'), ('mexico', 'netherlands'), ('brazil', 'france'),
|
||||
('caracas', 'london'), ('caracas', 'amsterdam'),
|
||||
]
|
||||
|
||||
# Age impossibility - if the birth year makes the current career implausible, REJECT. Example bounds for a junior role:
|
||||
MIN_PLAUSIBLE_BIRTH_YEAR = 1945 # Would be 80 in 2025 - still plausible but verify
|
||||
MAX_PLAUSIBLE_BIRTH_YEAR = 2002 # Would be 23 in 2025 - plausible for junior roles
|
||||
```
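
The constants above only list vocabulary. The following sketch shows one possible way to turn the profession table into a check. It is not the validator's actual code: the function names are hypothetical, the dictionary is a shortened copy so the example is self-contained, and whole-word matching is an added assumption that prevents "actor" from matching inside "contractor".

```python
import re

# Shortened copy of PROFESSION_CONFLICTS for a self-contained example.
PROFESSION_CONFLICTS = {
    "heritage": ["curator", "archivist", "librarian", "conservator"],
    "entertainment": ["actress", "actor", "singer", "footballer"],
    "medical": ["doctor", "nurse", "surgeon", "physician"],
}


def categories_in(text: str) -> set:
    """Return the profession categories whose terms appear as whole words in the text."""
    lowered = text.lower()
    return {
        category
        for category, terms in PROFESSION_CONFLICTS.items()
        if any(re.search(r"\b" + re.escape(term) + r"\b", lowered) for term in terms)
    }


def has_profession_conflict(profile_role: str, source_text: str) -> bool:
    """Flag a conflict when both sides name professions but share no category."""
    profile_categories = categories_in(profile_role)
    source_categories = categories_in(source_text)
    if not profile_categories or not source_categories:
        return False  # nothing to compare, so no conflict can be asserted
    return profile_categories.isdisjoint(source_categories)


print(has_profession_conflict("Curator at New Contemporaries", "Venezuelan actress"))  # True
```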
|
||||
|
||||
## Handling Rejected Claims
|
||||
|
||||
When a claim fails entity resolution:
|
||||
|
||||
```json
|
||||
{
|
||||
"claim_type": "birth_year",
|
||||
"claim_value": 1952,
|
||||
"entity_resolution": {
|
||||
"status": "REJECTED",
|
||||
"reason": "conflicting_profession",
|
||||
"details": "Source describes Venezuelan actress, profile is UK curator",
|
||||
"source_identity": "Carmen Julia Álvarez (Venezuelan actress)",
|
||||
"profile_identity": "Carmen Juliá (UK art curator)",
|
||||
"rejected_at": "2026-01-11T15:00:00Z",
|
||||
"rejected_by": "entity_resolution_validator_v1"
|
||||
}
|
||||
}
|
||||
```
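
A rejected-claim record with this shape could be produced by a small helper like the one below. This is a sketch under assumptions: the function name `reject_claim` is hypothetical, the validator label simply reuses the value from the JSON example, and the timestamp is formatted as UTC to match the `rejected_at` value shown above.

```python
from datetime import datetime, timezone


def reject_claim(claim, reason, details, source_identity, profile_identity,
                 validator="entity_resolution_validator_v1", now=None):
    """Return a copy of the claim annotated with a REJECTED entity_resolution block."""
    timestamp = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
    annotated = dict(claim)  # leave the caller's dictionary untouched
    annotated["entity_resolution"] = {
        "status": "REJECTED",
        "reason": reason,
        "details": details,
        "source_identity": source_identity,
        "profile_identity": profile_identity,
        "rejected_at": timestamp,
        "rejected_by": validator,
    }
    return annotated


record = reject_claim(
    {"claim_type": "birth_year", "claim_value": 1952},
    reason="conflicting_profession",
    details="Source describes Venezuelan actress, profile is UK curator",
    source_identity="Carmen Julia Álvarez (Venezuelan actress)",
    profile_identity="Carmen Juliá (UK art curator)",
    now=datetime(2026, 1, 11, 15, 0, 0, tzinfo=timezone.utc),
)
print(record["entity_resolution"]["rejected_at"])  # 2026-01-11T15:00:00Z
```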
|
||||
|
||||
## Special Cases
|
||||
|
||||
### Common Names
|
||||
|
||||
For very common names (e.g., "John Smith", "Maria García", "Jan de Vries"), require **4 of 5** verification checks instead of 3. The more common the name, the higher the threshold.
|
||||
|
||||
| Name Commonality | Required Matches |
|
||||
|------------------|------------------|
|
||||
| Unique name (e.g., "Xander Vermeulen-Oosterhuis") | 2 of 5 |
|
||||
| Moderately common (e.g., "Carmen Juliá") | 3 of 5 |
|
||||
| Very common (e.g., "Jan de Vries") | 4 of 5 |
|
||||
| Extremely common (e.g., "John Smith") | 5 of 5 or reject |
|
||||
|
||||
### Abbreviated Names
|
||||
|
||||
For profiles with abbreviated names (e.g., "J. Smith"), entity resolution is inherently uncertain:
|
||||
- Set `entity_resolution_confidence: "very_low"`
|
||||
- Require **human review** for all claims
|
||||
- Do NOT attribute web claims automatically
|
||||
|
||||
### Historical Persons
|
||||
|
||||
When sources describe historical/deceased persons:
|
||||
- Check if death date conflicts with profile activity (living person active in 2025)
|
||||
- **ALWAYS REJECT** genealogy site data
|
||||
- Reject any source describing events before 1950 unless profile is known to be historical
|
||||
|
||||
### Wikipedia Articles
|
||||
|
||||
Wikipedia is particularly dangerous because:
|
||||
- Many people with the same name have articles
|
||||
- Search engines return Wikipedia first
|
||||
- The Wikipedia Carmen Julia Álvarez article describes a Venezuelan actress born 1952
|
||||
- This is a DIFFERENT PERSON from Carmen Juliá the UK curator
|
||||
|
||||
**For Wikipedia sources**:
|
||||
1. Read the FULL article, not just snippets
|
||||
2. Verify the Wikipedia subject's profession matches the profile
|
||||
3. Verify the Wikipedia subject's location matches the profile
|
||||
4. If ANY conflict detected → REJECT
|
||||
|
||||
## Audit Trail
|
||||
|
||||
All entity resolution decisions must be logged:
|
||||
|
||||
```json
|
||||
{
|
||||
"enrichment_history": [
|
||||
{
|
||||
"enrichment_timestamp": "2026-01-11T15:00:00Z",
|
||||
"enrichment_agent": "enrich_person_comprehensive.py v1.4.0",
|
||||
"entity_resolution_decisions": [
|
||||
{
|
||||
"source_url": "https://en.wikipedia.org/wiki/Carmen_Julia_Álvarez",
|
||||
"decision": "REJECTED",
|
||||
"reason": "Different person - Venezuelan actress, not UK curator"
|
||||
}
|
||||
],
|
||||
"claims_rejected_count": 5,
|
||||
"claims_accepted_count": 1
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
- Rule 21: Data Fabrication is Strictly Prohibited
|
||||
- Rule 26: Person Data Provenance - Web Claims for Staff Information
|
||||
- Rule 45: Inferred Data Must Be Explicit with Provenance
|
||||
|
|
@ -0,0 +1,422 @@
|
|||
# Rule 45: Inferred Data Must Be Explicit with Provenance
|
||||
|
||||
**Status**: Active
|
||||
**Created**: 2025-01-09
|
||||
**Applies to**: PPID enrichment, person entity profiles, any data inference
|
||||
|
||||
## Core Principle
|
||||
|
||||
**All inferred data MUST be stored in explicit `inferred_*` fields with full provenance statements. Inferred values MUST NEVER silently replace or merge with verified data.**
|
||||
|
||||
This ensures:
|
||||
1. **Transparency**: Users can distinguish verified facts from heuristic estimates
|
||||
2. **Auditability**: The inference method and source observations are traceable
|
||||
3. **Reversibility**: Inferred data can be corrected when verified data becomes available
|
||||
4. **Quality Signals**: Confidence levels and argument chains are preserved
|
||||
|
||||
## Required Structure for Inferred Data
|
||||
|
||||
Every inferred claim MUST include:
|
||||
|
||||
```yaml
|
||||
inferred_[field_name]:
|
||||
value: "the inferred value"
|
||||
edtf: "196X" # For dates: EDTF notation
|
||||
formatted: "NL-UT-UTR" # For locations: CC-RR-PPP format
|
||||
confidence: "low|medium|high"
|
||||
inference_provenance:
|
||||
method: "heuristic_name"
|
||||
inference_chain:
|
||||
- step: 1
|
||||
observation: "University start year 1986"
|
||||
source_field: "profile_data.education[0].date_range"
|
||||
source_value: "1986 - 1990"
|
||||
- step: 2
|
||||
assumption: "University entry at age 18"
|
||||
rationale: "Standard Dutch university entry age"
|
||||
- step: 3
|
||||
calculation: "1986 - 18 = 1968"
|
||||
result: "Estimated birth year 1968"
|
||||
- step: 4
|
||||
generalization: "Round to decade → 196X"
|
||||
rationale: "EDTF decade notation for uncertain years"
|
||||
inferred_at: "2025-01-09T18:00:00Z"
|
||||
inferred_by: "enrich_ppids.py"
|
||||
```
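
To make the structure concrete, here is a hedged Python sketch that builds such a record for the birth-decade case. The function name, its parameters, and the default `inferred_by` label are illustrative assumptions; only the field names and the arithmetic (start year minus assumed entry age, then generalised to an EDTF decade) come from the template above.

```python
from datetime import datetime, timezone


def infer_birth_decade(start_year, source_field, entry_age=18,
                       inferred_by="example_script.py", now=None):
    """Build an inferred_birth_decade record with a step-by-step inference chain."""
    birth_year = start_year - entry_age
    decade = f"{birth_year // 10}X"  # e.g. 1968 -> "196X"
    timestamp = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "value": decade,
        "edtf": decade,
        "confidence": "low",
        "inference_provenance": {
            "method": "earliest_education_heuristic",
            "inference_chain": [
                {"step": 1, "observation": f"University start year {start_year}",
                 "source_field": source_field},
                {"step": 2, "assumption": f"University entry at age {entry_age}"},
                {"step": 3, "calculation": f"{start_year} - {entry_age} = {birth_year}",
                 "result": f"Estimated birth year {birth_year}"},
                {"step": 4, "generalization": f"Round to decade → {decade}"},
            ],
            "inferred_at": timestamp,
            "inferred_by": inferred_by,
        },
    }


record = infer_birth_decade(1986, "profile_data.education[0].date_range")
print(record["value"])  # 196X
```

The result belongs under an `inferred_birth_decade` key and never in the verified `birth_date` field.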
|
||||
|
||||
## Explicit Inferred Fields
|
||||
|
||||
### For Person Profiles (PPID)
|
||||
|
||||
| Inferred Field | Source Observations | Heuristic |
|
||||
|----------------|---------------------|-----------|
|
||||
| `inferred_birth_year` | Earliest education/job dates | Entry age assumptions |
|
||||
| `inferred_birth_decade` | Birth year estimate | EDTF decade notation |
|
||||
| `inferred_birth_settlement` | School/university location | Residential proximity |
|
||||
| `inferred_birth_region` | Settlement location | GeoNames admin1 |
|
||||
| `inferred_birth_country` | Settlement location | GeoNames country |
|
||||
| `inferred_current_settlement` | Profile location, current job | Direct extraction |
|
||||
| `inferred_current_region` | Settlement location | GeoNames admin1 |
|
||||
| `inferred_current_country` | Settlement location | GeoNames country |
|
||||
|
||||
### Example: Complete Inferred Birth Data
|
||||
|
||||
```json
|
||||
{
|
||||
"ppid": "ID_NL-UT-UTR_196X_NL-UT-UTR_XXXX_AART-HARTEN",
|
||||
|
||||
"birth_date": {
|
||||
"edtf": "XXXX",
|
||||
"precision": "unknown",
|
||||
"note": "See inferred_birth_decade for heuristic estimate"
|
||||
},
|
||||
|
||||
"inferred_birth_decade": {
|
||||
"value": "196X",
|
||||
"edtf": "196X",
|
||||
"precision": "decade",
|
||||
"confidence": "low",
|
||||
"inference_provenance": {
|
||||
"method": "earliest_education_heuristic",
|
||||
"inference_chain": [
|
||||
{
|
||||
"step": 1,
|
||||
"observation": "University education record found",
|
||||
"source_field": "profile_data.education[0]",
|
||||
"source_value": {
|
||||
"institution": "Universiteit Utrecht",
|
||||
"degree": "Social & Organisational psychology, doctoraal",
|
||||
"date_range": "1986 - 1990"
|
||||
}
|
||||
},
|
||||
{
|
||||
"step": 2,
|
||||
"extraction": "Start year extracted from date_range",
|
||||
"extracted_value": 1986
|
||||
},
|
||||
{
|
||||
"step": 3,
|
||||
"assumption": "University entry age",
|
||||
"assumed_value": 18,
|
||||
"rationale": "Standard Dutch university entry age (post-VWO)",
|
||||
"confidence_impact": "Assumption reduces confidence; actual age 17-20 possible"
|
||||
},
|
||||
{
|
||||
"step": 4,
|
||||
"calculation": "1986 - 18 = 1968",
|
||||
"result": "Estimated birth year: 1968"
|
||||
},
|
||||
{
|
||||
"step": 5,
|
||||
"generalization": "Convert to EDTF decade",
|
||||
"input": 1968,
|
||||
"output": "196X",
|
||||
"rationale": "Decade precision appropriate for heuristic estimate"
|
||||
}
|
||||
],
|
||||
"inferred_at": "2025-01-09T18:00:00Z",
|
||||
"inferred_by": "enrich_ppids.py"
|
||||
}
|
||||
},
|
||||
|
||||
"inferred_birth_settlement": {
|
||||
"value": "Utrecht",
|
||||
"formatted": "NL-UT-UTR",
|
||||
"confidence": "low",
|
||||
"inference_provenance": {
|
||||
"method": "earliest_education_location",
|
||||
"inference_chain": [
|
||||
{
|
||||
"step": 1,
|
||||
"observation": "Earliest education institution identified",
|
||||
"source_field": "profile_data.education[0].institution",
|
||||
"source_value": "Universiteit Utrecht"
|
||||
},
|
||||
{
|
||||
"step": 2,
|
||||
"lookup": "Institution location mapping",
|
||||
"mapping_key": "Universiteit Utrecht",
|
||||
"mapping_value": "Utrecht, Netherlands"
|
||||
},
|
||||
{
|
||||
"step": 3,
|
||||
"geocoding": "GeoNames resolution",
|
||||
"query": "Utrecht",
|
||||
"country_code": "NL",
|
||||
"result": {
|
||||
"geonames_id": 2745912,
|
||||
"name": "Utrecht",
|
||||
"admin1_code": "09",
|
||||
"admin1_name": "Utrecht"
|
||||
}
|
||||
},
|
||||
{
|
||||
"step": 4,
|
||||
"formatting": "CC-RR-PPP generation",
|
||||
"country_code": "NL",
|
||||
"region_code": "UT",
|
||||
"settlement_code": "UTR",
|
||||
"result": "NL-UT-UTR"
|
||||
}
|
||||
],
|
||||
"assumption_note": "University location used as proxy for birth location; student may have relocated for education",
|
||||
"inferred_at": "2025-01-09T18:00:00Z",
|
||||
"inferred_by": "enrich_ppids.py"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## List-Valued Inferred Data (EDTF Set Notation)
|
||||
|
||||
When inference yields multiple plausible values (e.g., an estimated birth year of 1968 with a few years of uncertainty could fall in either the 1960s or the 1970s), store them as a **list** with EDTF set notation.
|
||||
|
||||
### EDTF Set Notation Standards
|
||||
|
||||
| Notation | Meaning | Use Case |
|
||||
|----------|---------|----------|
|
||||
| `[196X,197X]` | One of these values | Person born in late 1960s (uncertainty spans decades) |
|
||||
| `{196X,197X}` | All of these values | NOT for birth decade (use `[...]`) |
|
||||
| `[1965..1970]` | Range within set | Birth year between 1965-1970 |
|
||||
|
||||
### When to Use List Values
|
||||
|
||||
1. **Decade Boundary Cases**: Estimated birth year is within 3 years of a decade boundary
|
||||
- Estimated 1968 → `[196X,197X]` (could be late 60s or early 70s due to age assumption variance)
|
||||
- Estimated 1972 → `[196X,197X]` (same logic)
|
||||
- Estimated 1975 → `197X` (confidently mid-decade)
|
||||
|
||||
2. **Multiple Plausible Locations**: Student attended schools in different cities
|
||||
- `["NL-UT-UTR", "NL-NH-AMS"]` with provenance explaining each candidate
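
The decade-boundary rule above can be sketched as follows. This is a minimal illustration, not the logic of `enrich_ppids.py`: the function names are hypothetical, and it assumes the ±3-year variance used in the examples, treating "near a boundary" as "the variance range touches more than one decade".

```python
def decade_code(year: int) -> str:
    """Return the EDTF decade code for a year, e.g. 1968 -> '196X'."""
    return f"{year // 10}X"


def infer_birth_decade(estimated_year: int, variance: int = 3) -> dict:
    """Return a single decade, or an EDTF one-of set when the range spans decades."""
    years = range(estimated_year - variance, estimated_year + variance + 1)
    decades = sorted({decade_code(y) for y in years})
    if len(decades) == 1:
        return {"value": decades[0], "edtf": decades[0], "precision": "decade"}
    return {
        "values": decades,
        "edtf": "[" + ",".join(decades) + "]",
        "precision": "decade_set",
        # Assumption: the decade of the point estimate is the primary value.
        "primary_value": decade_code(estimated_year),
    }
```

Under these assumptions, estimates of 1968 and 1972 both produce `[196X,197X]`, while 1975 stays a plain `197X`.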
|
||||
|
||||
### Example: List-Valued Birth Decade
|
||||
|
||||
```json
|
||||
{
|
||||
"inferred_birth_decade": {
|
||||
"values": ["196X", "197X"],
|
||||
"edtf": "[196X,197X]",
|
||||
"edtf_meaning": "one of: 1960s or 1970s",
|
||||
"precision": "decade_set",
|
||||
"confidence": "low",
|
||||
"primary_value": "196X",
|
||||
"primary_rationale": "1968 is closer to 1960s center than 1970s",
|
||||
"inference_provenance": {
|
||||
"method": "earliest_observation_heuristic",
|
||||
"inference_chain": [
|
||||
{
|
||||
"step": 1,
|
||||
"observation": "University start 1986",
|
||||
"source_field": "profile_data.education[0].date_range"
|
||||
},
|
||||
{
|
||||
"step": 2,
|
||||
"assumption": "University entry at age 18 (±3 years)",
|
||||
"rationale": "Dutch university entry typically 17-21"
|
||||
},
|
||||
{
|
||||
"step": 3,
|
||||
"calculation": "1986 - 18 = 1968 (range: 1965-1971)",
|
||||
"result": "Birth year estimate: 1968 with variance 1965-1971"
|
||||
},
|
||||
{
|
||||
"step": 4,
|
||||
"generalization": "Birth year range spans decade boundary",
|
||||
"input_range": [1965, 1971],
|
||||
"output": ["196X", "197X"],
|
||||
"rationale": "Cannot determine which decade without additional evidence"
|
||||
}
|
||||
],
|
||||
"inferred_at": "2025-01-09T18:00:00Z",
|
||||
"inferred_by": "enrich_ppids.py"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### PPID Generation with List Values
|
||||
|
||||
When `inferred_birth_decade` is a list, use `primary_value` for PPID:
|
||||
|
||||
```json
|
||||
{
|
||||
"ppid": "ID_NL-UT-UTR_196X_NL-UT-UTR_XXXX_AART-HARTEN",
|
||||
"ppid_components": {
|
||||
"first_date": "196X",
|
||||
"first_date_source": "inferred_birth_decade.primary_value",
|
||||
"first_date_alternatives": ["197X"]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Example: List-Valued Location
|
||||
|
||||
```json
|
||||
{
|
||||
"inferred_birth_settlement": {
|
||||
"values": [
|
||||
{"settlement": "Utrecht", "formatted": "NL-UT-UTR"},
|
||||
{"settlement": "Amsterdam", "formatted": "NL-NH-AMS"}
|
||||
],
"primary_value": "NL-UT-UTR",
"primary_rationale": "Earlier education (1986) in Utrecht; later education (1990) in Amsterdam",
|
||||
"confidence": "very_low",
|
||||
"inference_provenance": {
|
||||
"method": "education_locations",
|
||||
"inference_chain": [
|
||||
{
|
||||
"step": 1,
|
||||
"observation": "Multiple education institutions found",
|
||||
"source_field": "profile_data.education",
|
||||
"candidates": ["Universiteit Utrecht (1986)", "UvA (1990)"]
|
||||
},
|
||||
{
|
||||
"step": 2,
|
||||
"assumption": "Earlier education more likely near birth location",
|
||||
"rationale": "Students often attend local university first"
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Confidence Levels
|
||||
|
||||
| Level | Criteria | Example |
|
||||
|-------|----------|---------|
|
||||
| **high** | Direct extraction from authoritative source | Profile states "Born in Amsterdam" |
|
||||
| **medium** | Single-step inference with reliable source | Current job location from employment record |
|
||||
| **low** | Multi-step heuristic with assumptions | Birth year from university start date |
|
||||
| **very_low** | Speculative, multiple assumptions, or list-valued | Birth location from first observed location, or decade spanning boundary |
|
||||
|
||||
## Anti-Patterns (FORBIDDEN)
|
||||
|
||||
### ❌ Silent Replacement
|
||||
```json
|
||||
{
|
||||
"birth_date": {
|
||||
"edtf": "196X",
|
||||
"precision": "decade"
|
||||
}
|
||||
}
|
||||
```
|
||||
**Problem**: No indication this is inferred, no provenance, no confidence level.
|
||||
|
||||
### ❌ Hidden in Metadata
|
||||
```json
|
||||
{
|
||||
"birth_date": {
|
||||
"edtf": "196X"
|
||||
},
|
||||
"enrichment_metadata": {
|
||||
"birth_date_inferred": true
|
||||
}
|
||||
}
|
||||
```
|
||||
**Problem**: Inference metadata separated from the value; easy to miss.
|
||||
|
||||
### ❌ Missing Inference Chain
|
||||
```json
|
||||
{
|
||||
"inferred_birth_decade": {
|
||||
"value": "196X",
|
||||
"method": "heuristic"
|
||||
}
|
||||
}
|
||||
```
|
||||
**Problem**: No explanation of HOW the value was derived; not auditable.
|
||||
|
||||
## Correct Pattern ✅
|
||||
|
||||
```json
|
||||
{
|
||||
"birth_date": {
|
||||
"edtf": "XXXX",
|
||||
"precision": "unknown",
|
||||
"note": "See inferred_birth_decade"
|
||||
},
|
||||
"inferred_birth_decade": {
|
||||
"value": "196X",
|
||||
"edtf": "196X",
|
||||
"confidence": "low",
|
||||
"inference_provenance": {
|
||||
"method": "earliest_education_heuristic",
|
||||
"inference_chain": [
|
||||
{"step": 1, "observation": "...", "source_field": "...", "source_value": "..."},
|
||||
{"step": 2, "assumption": "...", "rationale": "..."},
|
||||
{"step": 3, "calculation": "...", "result": "..."}
|
||||
],
|
||||
"inferred_at": "2025-01-09T18:00:00Z",
|
||||
"inferred_by": "enrich_ppids.py"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## PPID Component Handling
|
||||
|
||||
When inferred values are used in PPID components:
|
||||
|
||||
```json
|
||||
{
|
||||
"ppid": "ID_NL-UT-UTR_196X_NL-NH-AMS_XXXX_AART-HARTEN",
|
||||
"ppid_components": {
|
||||
"type": "ID",
|
||||
"first_location": "NL-UT-UTR",
|
||||
"first_location_source": "inferred_birth_settlement",
|
||||
"first_date": "196X",
|
||||
"first_date_source": "inferred_birth_decade",
|
||||
"last_location": "NL-NH-AMS",
|
||||
"last_location_source": "inferred_current_settlement",
|
||||
"last_date": "XXXX",
|
||||
"name_tokens": ["AART", "HARTEN"]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The `*_source` fields document which inferred field was used for PPID generation.
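
A minimal sketch of assembling the PPID string from `ppid_components`, assuming the `{TYPE}_{FL}_{FD}_{LL}_{LD}_{NT}` layout with hyphen-joined name tokens. The function name `build_ppid` and the defaulting of missing parts to `XX-XX-XXX` / `XXXX` are illustrative assumptions.

```python
def build_ppid(components: dict) -> str:
    """Join PPID components in TYPE_FL_FD_LL_LD_NT order.

    The *_source keys are documentation only and are ignored here.
    """
    return "_".join([
        components.get("type", "ID"),
        components.get("first_location", "XX-XX-XXX"),  # unknown location placeholder
        components.get("first_date", "XXXX"),            # unknown date placeholder
        components.get("last_location", "XX-XX-XXX"),
        components.get("last_date", "XXXX"),
        "-".join(components["name_tokens"]),
    ])
```

Because the `*_source` fields never enter the identifier, upgrading a source from inferred to verified changes the PPID only if the value itself changes.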
|
||||
|
||||
## Upgrade Path: Inferred → Verified
|
||||
|
||||
When verified data becomes available:
|
||||
|
||||
1. **Keep inferred data** in `inferred_*` fields for audit trail
|
||||
2. **Add verified data** to canonical fields
|
||||
3. **Mark inferred as superseded**:
|
||||
|
||||
```json
|
||||
{
|
||||
"birth_date": {
|
||||
"edtf": "1967-03-15",
|
||||
"precision": "day",
|
||||
"verified": true,
|
||||
"source": "official_record"
|
||||
},
|
||||
"inferred_birth_decade": {
|
||||
"value": "196X",
|
||||
"superseded": true,
|
||||
"superseded_by": "birth_date",
|
||||
"superseded_at": "2025-01-15T10:00:00Z",
|
||||
"accuracy_assessment": "Inferred decade was correct (1960s), actual year 1967"
|
||||
}
|
||||
}
|
||||
```
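
The upgrade step could be implemented along these lines. This is a sketch under assumptions: the helper name `supersede_inferred` and its signature are hypothetical; only the `superseded*` and `accuracy_assessment` keys come from the example above.

```python
from datetime import datetime, timezone


def supersede_inferred(record, inferred_key, canonical_key, assessment, now=None):
    """Return a copy of `record` with the inferred field marked superseded.

    The inferred value is kept for the audit trail; nothing is deleted.
    """
    now = now or datetime.now(timezone.utc)
    inferred = dict(record[inferred_key])  # copy so the input stays untouched
    inferred.update({
        "superseded": True,
        "superseded_by": canonical_key,
        "superseded_at": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "accuracy_assessment": assessment,
    })
    return {**record, inferred_key: inferred}
```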
|
||||
|
||||
## Implementation Checklist
|
||||
|
||||
For any enrichment script:
|
||||
|
||||
- [ ] Create explicit `inferred_*` fields for ALL inferred data
|
||||
- [ ] Include `inference_provenance` with complete `inference_chain`
|
||||
- [ ] Record each step: observation → assumption → calculation → result
|
||||
- [ ] Set appropriate `confidence` level
|
||||
- [ ] Add `*_source` references in PPID components
|
||||
- [ ] Preserve original unknown values (`XXXX`, `XX-XX-XXX`)
|
||||
- [ ] Add `note` in canonical fields pointing to inferred alternatives
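
Part of this checklist can be verified mechanically. The sketch below covers only the naming, confidence and inference-chain items; `check_inferred_field` is a hypothetical helper, not an existing project function.

```python
CONFIDENCE_LEVELS = {"high", "medium", "low", "very_low"}


def check_inferred_field(name: str, field: dict) -> list:
    """Return a list of checklist violations for one inferred_* field."""
    problems = []
    if not name.startswith("inferred_"):
        problems.append("field name must start with 'inferred_'")
    if field.get("confidence") not in CONFIDENCE_LEVELS:
        problems.append("missing or unknown confidence level")
    provenance = field.get("inference_provenance") or {}
    chain = provenance.get("inference_chain") or []
    if not chain:
        problems.append("inference_provenance.inference_chain is missing or empty")
    # Steps are expected to be numbered 1..n in order, as in the examples.
    if [step.get("step") for step in chain] != list(range(1, len(chain) + 1)):
        problems.append("inference_chain steps must be numbered 1..n")
    return problems
```

Applied to the "Missing Inference Chain" anti-pattern, this reports both the absent confidence level and the absent chain.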
|
||||
|
||||
## Related Rules
|
||||
|
||||
- **Rule 44**: PPID Birth Date Enrichment and Unknown Date Notation
|
||||
- **Rule 35**: Provenance Statements MUST Have Dual Timestamps
|
||||
- **Rule 6**: WebObservation Claims MUST Have XPath Provenance
|
||||
|
|
@ -0,0 +1,251 @@
|
|||
# Rule 40: KIEN Registry is Authoritative for Intangible Heritage Custodians
|
||||
|
||||
## Summary
|
||||
|
||||
For Intangible Heritage Custodians (Type I), the KIEN registry at `https://www.immaterieelerfgoed.nl/` is the **TIER_1_AUTHORITATIVE** source for contact data and addresses. Google Maps enrichment is **TIER_3_CROWD_SOURCED** and should NEVER override KIEN data.
|
||||
|
||||
## Empirical Validation (January 2025)
|
||||
|
||||
A comprehensive audit of 188 Type I custodian files revealed:
|
||||
|
||||
| Category | Count | Percentage |
|
||||
|----------|-------|------------|
|
||||
| ✅ Google Maps matches OK | 101 | 53.7% |
|
||||
| 🔧 **FALSE_MATCH detected** | **62** | **33.0%** |
|
||||
| ⚠️ No official website (valid) | 20 | 10.6% |
|
||||
| 📭 No Google Maps data | 5 | 2.7% |
|
||||
|
||||
**Key Finding: 33% of Google Maps enrichment data for Type I custodians was incorrect.**
|
||||
|
||||
### False Match Categories Identified
|
||||
|
||||
1. **Domain mismatches** (39 files): Google Maps website ≠ KIEN official website
|
||||
2. **Name mismatches** (8 files): Completely different organizations (e.g., "Ria Bos" heritage practitioner → "Ria Money Transfer Agent")
|
||||
3. **Wrong location** (6 files): Similar name but different city or country (Amsterdam→Den Haag, Netherlands→Suriname)
|
||||
4. **Wrong organization type** (5 files): Federation vs specific member, heritage org vs webshop
|
||||
5. **Different entity type** (3 files): Organization vs location/street name
|
||||
6. **Different event** (3 files): Horse racing vs festival, different village's event
|
||||
|
||||
### Why Google Maps Fails for Type I
|
||||
|
||||
Google Maps is optimized for commercial businesses with physical storefronts. Type I intangible heritage custodians are fundamentally different:
|
||||
|
||||
- **Virtual organizations** without commercial presence
|
||||
- **Person-based heritage** (individual practitioners preserving traditional crafts)
|
||||
- **Volunteer networks** meeting in private residences
|
||||
- **Event-based organizations** that exist only during festivals
|
||||
- **Federations** that coordinate member organizations without own premises
|
||||
|
||||
## Rationale
|
||||
|
||||
Google Maps frequently returns **false matches** for intangible heritage organizations because:
|
||||
|
||||
1. **Virtual Organizations**: Many intangible heritage custodians operate as networks/platforms without commercial storefronts
|
||||
2. **Name Collisions**: Common words in organization names (e.g., "Platform") match unrelated businesses
|
||||
3. **No Physical Presence**: Organizations focused on intangible heritage (handwriting, oral traditions, crafts) often have no Google Maps listing
|
||||
4. **Volunteer-Run**: Contact addresses are often private residences, not businesses
|
||||
|
||||
KIEN (Kenniscentrum Immaterieel Erfgoed Nederland) is the official Dutch registry for intangible cultural heritage and maintains verified contact information directly from the organizations.
|
||||
|
||||
## Data Tier Hierarchy for Type I Custodians
|
||||
|
||||
| Priority | Source | Data Tier | Trust Level |
|
||||
|----------|--------|-----------|-------------|
|
||||
| 1st | KIEN Registry (`immaterieelerfgoed.nl`) | TIER_1_AUTHORITATIVE | Highest |
|
||||
| 2nd | Organization's Official Website | TIER_2_VERIFIED | High |
|
||||
| 3rd | Wikidata | TIER_3_CROWD_SOURCED | Medium |
|
||||
| 4th | Google Maps | TIER_3_CROWD_SOURCED | Low (verify!) |
|
||||
|
||||
## Required Workflow for Type I Enrichment
|
||||
|
||||
### Step 1: Scrape KIEN Page First
|
||||
|
||||
For every intangible heritage custodian, the KIEN profile page MUST be scraped to extract:
|
||||
|
||||
```yaml
|
||||
kien_enrichment:
  kien_name: "Platform Handschriftontwikkeling"
  kien_url: "https://www.immaterieelerfgoed.nl/nl/page/2476/platform-handschriftontwikkeling"
  heritage_page_url: "https://www.immaterieelerfgoed.nl/nl/handschrift"
  heritage_forms:
    - "Ambachten, handwerk en techniek"
    - "Sociale praktijken"
  address:
    street: "De Hazelaar 41"
    postal_code: "6903 BB"
    city: "Zevenaar"
    province: "Gelderland"
    country: "NL"
  registered_since: "2019-11"
  enrichment_timestamp: "2025-01-08T00:00:00Z"
  source: "https://www.immaterieelerfgoed.nl"
|
||||
```
|
||||
|
||||
### Step 2: Validate Google Maps Match (If Any)
|
||||
|
||||
If Google Maps enrichment exists, compare against KIEN data:
|
||||
|
||||
```python
|
||||
from urllib.parse import urlparse

# Assumption: any library exposing fuzz.ratio() on a 0-100 scale works here.
from rapidfuzz import fuzz


def extract_domain(url):
    """Return the lowercase host of a URL without a leading "www.", or None."""
    if not url:
        return None
    host = urlparse(url if "://" in url else f"//{url}").netloc.lower()
    return (host[4:] if host.startswith("www.") else host) or None


def validate_google_maps_match(kien_data, gmaps_data):
    """Check if Google Maps data matches KIEN authoritative source."""

    # Check website domain match (skipped when either side has no website)
    kien_domain = extract_domain(kien_data.get('website'))
    gmaps_domain = extract_domain(gmaps_data.get('website'))

    if kien_domain and gmaps_domain and kien_domain != gmaps_domain:
        return {
            'status': 'FALSE_MATCH',
            'reason': f'Website mismatch: KIEN={kien_domain}, GMaps={gmaps_domain}'
        }

    # Check name similarity (below 70 on a 0-100 scale counts as a mismatch)
    kien_name = kien_data.get('kien_name', '').lower()
    gmaps_name = gmaps_data.get('name', '').lower()

    if fuzz.ratio(kien_name, gmaps_name) < 70:
        return {
            'status': 'FALSE_MATCH',
            'reason': f'Name mismatch: KIEN="{kien_name}", GMaps="{gmaps_name}"'
        }

    return {'status': 'VERIFIED'}
|
||||
```
|
||||
|
||||
### Step 3: Mark False Matches
|
||||
|
||||
When Google Maps returns a different organization:
|
||||
|
||||
```yaml
|
||||
google_maps_enrichment:
  status: FALSE_MATCH
  false_match_reason: >-
    Google Maps returned "Platform 9 BV" (a health/coaching business at
    Nieuwleusen) instead of "Platform Handschriftontwikkeling" (a virtual
    handwriting development platform). These are completely different
    organizations. KIEN registry is authoritative for this Type I custodian.
  original_false_match:
    place_id: ChIJNZ6o7H_fx0cR-TURAN3Bj54
    name: Platform 9 BV
    formatted_address: Burg, Burgemeester Backxlaan 321, 7711 AD Nieuwleusen
    website: http://www.platform9.nl/
  correction_timestamp: "2025-01-08T00:00:00Z"
  correction_agent: opencode-claude-sonnet-4
|
||||
```
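
A minimal sketch of producing this structure from an existing enrichment block, so the rejected match is preserved rather than deleted. The helper name `mark_false_match` and its parameters are hypothetical; the output keys are the ones shown above.

```python
def mark_false_match(gmaps_enrichment: dict, reason: str, timestamp: str, agent: str) -> dict:
    """Wrap a rejected Google Maps match, keeping the original data intact."""
    return {
        "status": "FALSE_MATCH",
        "false_match_reason": reason,
        # Copy of the rejected match, kept for audit purposes.
        "original_false_match": dict(gmaps_enrichment),
        "correction_timestamp": timestamp,
        "correction_agent": agent,
    }
```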
|
||||
|
||||
## KIEN Contact Data Extraction
|
||||
|
||||
The KIEN heritage pages follow a consistent structure. Extract from the "Contact" section:
|
||||
|
||||
```
|
||||
## Contact
|
||||
[Organization Name](link-to-profile-page)
|
||||
Street Address
|
||||
Postal Code
|
||||
City
|
||||
Province
|
||||
[Website](url)
|
||||
Bijgeschreven in inventaris vanaf: [date]
|
||||
```
|
||||
|
||||
### Example Extraction (from immaterieelerfgoed.nl/nl/handschrift):
|
||||
|
||||
```yaml
|
||||
contact:
  organization: "Platform Handschriftontwikkeling"
  profile_url: "https://www.immaterieelerfgoed.nl/nl/page/2476/platform-handschriftontwikkeling"
  address:
    street: "De Hazelaar 41"
    postal_code: "6903 BB"
    city: "Zevenaar"
    province: "Gelderland"
  website: "http://www.handschriftontwikkeling.nl/"
  registered_since: "november 2019"
|
||||
```
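
The raw extraction keeps the Dutch date text (`november 2019`), while the `kien_enrichment` block in Step 1 stores `2019-11`. A sketch of that normalization, assuming the page always gives a lowercase-able Dutch month name followed by a four-digit year; `normalize_registered_since` is a hypothetical helper name.

```python
DUTCH_MONTHS = {
    "januari": 1, "februari": 2, "maart": 3, "april": 4,
    "mei": 5, "juni": 6, "juli": 7, "augustus": 8,
    "september": 9, "oktober": 10, "november": 11, "december": 12,
}


def normalize_registered_since(text: str) -> str:
    """Convert e.g. 'november 2019' to the year-month form '2019-11'."""
    month_name, year = text.strip().lower().split()
    return f"{int(year):04d}-{DUTCH_MONTHS[month_name]:02d}"
```

Unexpected input raises `KeyError` or `ValueError` instead of guessing, which keeps bad dates out of the enrichment record.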
|
||||
|
||||
## Location Resolution for Type I
|
||||
|
||||
When KIEN provides an address:
|
||||
|
||||
1. **Use KIEN address** for `location.formatted_address`
|
||||
2. **Geocode KIEN address** to get coordinates (NOT Google Maps coordinates)
|
||||
3. **Update location_resolution** with method `KIEN_ADDRESS_GEOCODE`
|
||||
|
||||
```yaml
|
||||
location:
  street_address: "De Hazelaar 41"
  postal_code: "6903 BB"
  city: Zevenaar
  region_code: GE
  country: NL
  coordinate_provenance:
    source_type: KIEN_ADDRESS_GEOCODE
    source_url: "https://www.immaterieelerfgoed.nl/nl/handschrift"
    geocoding_service: nominatim
    geocoding_timestamp: "2025-01-08T00:00:00Z"
|
||||
```
|
||||
|
||||
## Batch Re-Enrichment Script
|
||||
|
||||
To fix all Type I custodians with potentially incorrect Google Maps data:
|
||||
|
||||
```bash
|
||||
# Find all Type I custodians
|
||||
python scripts/rescrape_kien_contacts.py --type I --output data/custodian/
|
||||
|
||||
# This script should:
|
||||
# 1. Read all NL-*-I-*.yaml files
|
||||
# 2. Fetch KIEN page for each (from kien_enrichment.kien_url)
|
||||
# 3. Extract contact/address from KIEN
|
||||
# 4. Compare with google_maps_enrichment
|
||||
# 5. Mark mismatches as FALSE_MATCH
|
||||
# 6. Update location with KIEN address
|
||||
```
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
### WRONG - Using Google Maps as primary source for Type I:
|
||||
|
||||
```yaml
|
||||
# WRONG - Google Maps overriding KIEN data
location:
  formatted_address: "Burg, Burgemeester Backxlaan 321, 7711 AD Nieuwleusen"
  coordinate_provenance:
    source_type: GOOGLE_MAPS  # WRONG for Type I!
|
||||
```
|
||||
|
||||
### CORRECT - KIEN as primary source:
|
||||
|
||||
```yaml
|
||||
# CORRECT - KIEN is authoritative
location:
  street_address: "De Hazelaar 41"
  postal_code: "6903 BB"
  city: Zevenaar
  coordinate_provenance:
    source_type: KIEN_ADDRESS_GEOCODE  # Correct!
|
||||
```
|
||||
|
||||
## Affected Files
|
||||
|
||||
This rule affects all Type I custodian files (188 in the January 2025 audit):
|
||||
- `data/custodian/NL-*-I-*.yaml`
|
||||
|
||||
All should be reviewed to ensure:
|
||||
1. `kien_enrichment` contains address from KIEN page
|
||||
2. `google_maps_enrichment` is validated against KIEN
|
||||
3. `location` uses KIEN address (not Google Maps)
|
||||
4. False matches are properly documented
|
||||
|
||||
## Related Rules
|
||||
|
||||
- **Rule 5**: NEVER Delete Enriched Data - Keep false match data in `original_false_match`
|
||||
- **Rule 6**: WebObservation Claims - KIEN data should have provenance
|
||||
- **Rule 22**: Custodian YAML Files Are Single Source of Truth
|
||||
- **Rule 35**: Provenance Timestamps - Include KIEN fetch timestamps
|
||||
|
||||
## See Also
|
||||
|
||||
- KIEN Registry: https://www.immaterieelerfgoed.nl/
|
||||
- UNESCO Intangible Cultural Heritage: https://ich.unesco.org/
|
||||
- Dutch Intangible Heritage Network documentation
|
||||
|
|
@ -0,0 +1,351 @@
|
|||
# Rule 44: PPID Birth Date Enrichment and Unknown Date Notation
|
||||
|
||||
**Version**: 1.0.0
|
||||
**Created**: 2025-01-09
|
||||
**Status**: ACTIVE
|
||||
**Related**: [PPID-GHCID Alignment](../../docs/plan/person_pid/10_ppid_ghcid_alignment.md) | [EDTF Specification](https://www.loc.gov/standards/datetime/)
|
||||
|
||||
---
|
||||
|
||||
## 1. Summary
|
||||
|
||||
When birth/death dates are missing from person entity sources, agents MUST:
|
||||
|
||||
1. **Search for dates** using Exa Search and Linkup tools
|
||||
2. **Record all enrichment data** as web claims with provenance
|
||||
3. **If not found**, use **EDTF-compliant notation** for estimated/unknown dates
|
||||
4. **Never fabricate** specific dates without source evidence
|
||||
|
||||
---
|
||||
|
||||
## 2. Enrichment Workflow
|
||||
|
||||
### 2.1 Required Search Before Using Unknown Notation
|
||||
|
||||
Before marking a date as unknown, agents MUST attempt enrichment:
|
||||
|
||||
```
|
||||
Person Entity (missing birth_date)
|
||||
↓
|
||||
1. Search Exa: "{full_name} born birth date"
|
||||
↓
|
||||
2. Search Exa: "{full_name} {known_employer}"
|
||||
↓
|
||||
3. Search Linkup: "{full_name} biography"
|
||||
↓
|
||||
4. If found → Record as web_claim with provenance
|
||||
↓
|
||||
5. If NOT found → Use EDTF unknown notation
|
||||
↓
|
||||
6. Record enrichment_attempt in metadata
|
||||
```
|
||||
|
||||
### 2.2 Enrichment Search Requirements
|
||||
|
||||
| Search Tool | Query Pattern | When to Use |
|
||||
|-------------|---------------|-------------|
|
||||
| `exa_web_search_exa` | `"{name}" born birthday birth date year` | Primary search |
|
||||
| `exa_linkedin_search_exa` | `"{name}" at "{employer}"` | For work context |
|
||||
| `linkup_linkup-search` | `"{name}" biography personal` | Deep research |
|
||||
|
||||
### 2.3 Recording Successful Enrichment
|
||||
|
||||
When birth date is found, record as web claim:
|
||||
|
||||
```yaml
|
||||
web_claims:
  - claim_type: birth_date
    claim_value: "1985-03-15"
    source_url: "https://example.org/person/bio"
    retrieved_on: "2025-01-09T14:30:00Z"
    retrieval_agent: "opencode-claude-sonnet-4"
    confidence_score: 0.85
    notes: "Found in biography section"
|
||||
```
|
||||
|
||||
### 2.4 Recording Failed Enrichment Attempts
|
||||
|
||||
Always record that enrichment was attempted:
|
||||
|
||||
```yaml
|
||||
enrichment_metadata:
  birth_date_search:
    attempted: true
    search_date: "2025-01-09T14:30:00Z"
    search_agent: "opencode-claude-sonnet-4"
    search_tools_used:
      - exa_web_search_exa
      - linkup_linkup-search
    queries_tried:
      - '"Jan van Berg" born birthday'
      - '"Jan van Berg" biography'
    result: "NOT_FOUND"
    notes: "No publicly available birth date found after comprehensive search"
|
||||
```
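
The search-then-record workflow of this section might be sketched as below. The search tools are passed in as plain callables because their real interfaces are not described here; `search_birth_date`, the tuple layout of `searchers`, and the return shapes are all illustrative assumptions.

```python
def search_birth_date(full_name, searchers, now):
    """Try each search in order; return a web claim or a NOT_FOUND attempt record.

    `searchers` is a list of (tool_name, query_template, search_fn) tuples, where
    search_fn(query) returns a web-claim dict or None. All three are stand-ins.
    """
    tools_used, queries_tried = [], []
    for tool_name, query_template, search_fn in searchers:
        query = query_template.format(name=full_name)
        queries_tried.append(query)
        if tool_name not in tools_used:
            tools_used.append(tool_name)
        claim = search_fn(query)
        if claim is not None:
            return {"web_claims": [claim]}
    # Nothing found: record the attempt so the unknown notation is justified.
    return {
        "enrichment_metadata": {
            "birth_date_search": {
                "attempted": True,
                "search_date": now,
                "search_tools_used": tools_used,
                "queries_tried": queries_tried,
                "result": "NOT_FOUND",
            }
        }
    }
```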
|
||||
|
||||
---
|
||||
|
||||
## 3. EDTF-Compliant Unknown Date Notation
|
||||
|
||||
### 3.1 Standard: Extended Date/Time Format (EDTF)
|
||||
|
||||
This project follows the **Library of Congress EDTF Specification** (ISO 8601-2:2019) for representing uncertain, approximate, and unspecified dates.
|
||||
|
||||
**Key EDTF Characters**:
|
||||
|
||||
| Character | Meaning | EDTF Level | Example |
|
||||
|-----------|---------|------------|---------|
|
||||
| `X` | Unspecified digit | Level 1+ | `19XX` = some year 1900-1999 |
|
||||
| `~` | Approximate (circa) | Level 1+ | `1985~` = circa 1985 |
|
||||
| `?` | Uncertain | Level 1+ | `1985?` = possibly 1985 |
|
||||
| `%` | Uncertain AND approximate | Level 1+ | `1985%` = possibly circa 1985 |
|
||||
| `S` | Significant digits | Level 2 | `1950S2` = 1900-1999, estimated 1950 |
|
||||
| `[..]` | One of set | Level 2 | `[1970,1980]` = either 1970 or 1980 |
|
||||
| `{..}` | All of set | Level 2 | `{1970..1980}` = all years 1970-1980 |
|
||||
|
||||
### 3.2 Unspecified Date Components (X Notation)
|
||||
|
||||
Use `X` to replace unknown digits:
|
||||
|
||||
| Known Information | EDTF Format | Meaning |
|
||||
|-------------------|-------------|---------|
|
||||
| Only decade known (1970s) | `197X` | Some year 1970-1979 |
|
||||
| Only century known (1900s) | `19XX` | Some year 1900-1999 |
|
||||
| Year unknown entirely | `XXXX` | Year unknown |
|
||||
| Year known, month unknown | `1985-XX` | Some month in 1985 |
|
||||
| Year+month known, day unknown | `1985-04-XX` | Some day in April 1985 |
|
||||
| Year known, month+day unknown | `1985-XX-XX` | Some day in 1985 |
|
||||
| Only decade known, month+day unknown | `197X-XX-XX` | Some day in 1970-1979 |
|
||||
|
||||
### 3.3 Multiple Possible Decades (Set Notation)
|
||||
|
||||
When the decade is uncertain but constrained to specific options:
|
||||
|
||||
| Scenario | EDTF Format | Meaning |
|
||||
|----------|-------------|---------|
|
||||
| Born in 1970s OR 1980s | `[197X,198X]` | One of: some year in 1970s or 1980s |
|
||||
| Born in specific years | `[1975,1985]` | Either 1975 or 1985 |
|
||||
| Born 1970-1985 range | `[1970..1985]` | One of: some year from 1970 through 1985 |
|
||||
|
||||
### 3.4 Estimated Dates with Significant Digits
|
||||
|
||||
When you can estimate a year with confidence bounds:
|
||||
|
||||
```
|
||||
1975S2 = Estimated 1975, significant to 2 digits (1900-1999)
|
||||
1975S3 = Estimated 1975, significant to 3 digits (1970-1979)
|
||||
```
|
||||
|
||||
This is useful when you can estimate based on career timeline (e.g., "started working 1998, likely born 1970s").
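
To make the significant-digits notation concrete, the sketch below computes the year range implied by a value such as `1975S3`. It handles only positive years in the `YYYYSn` form, and the function name is a hypothetical choice.

```python
def significant_digits_range(edtf: str) -> tuple:
    """Return the (low, high) year range implied by e.g. '1975S3'."""
    year_text, digits_text = edtf.split("S")
    year, digits = int(year_text), int(digits_text)
    # Digits beyond the significant ones are free to vary.
    span = 10 ** (len(year_text) - digits)
    low = year // span * span
    return low, low + span - 1
```

For example, `1975S2` covers 1900-1999 and `1975S3` covers 1970-1979, matching the two lines above.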
|
||||
|
||||
### 3.5 Living Persons - Birth Date Estimation
|
||||
|
||||
For living persons in LinkedIn data, estimate birth decade from:
|
||||
|
||||
1. **Graduation year** (if available): Subtract ~22 years for bachelor's degree
|
||||
2. **Career start** (first job): Subtract ~22-25 years
|
||||
3. **Current role seniority**: "Senior" roles suggest 35+ years old
|
||||
|
||||
```yaml
|
||||
# Example: Person graduated 2010
birth_date_estimate:
  edtf: "1988S2"  # Estimated 1988, significant to 2 digits (1900-1999)
  estimation_method: "graduation_year_inference"
  estimation_basis: "Graduated bachelor's 2010, estimated birth ~1988"
  confidence: 0.60
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. PPID Format with Unknown Dates
|
||||
|
||||
### 4.1 PPID Date Component Rules
|
||||
|
||||
The PPID format includes birth and death dates:
|
||||
|
||||
```
|
||||
{TYPE}_{FL}_{FD}_{LL}_{LD}_{NT}
             │         │
             │         └── Last Date (death) - EDTF format
             └── First Date (birth) - EDTF format
|
||||
```
|
||||
|
||||
### 4.2 Examples with Unknown Components
|
||||
|
||||
| Scenario | PPID Example |
|
||||
|----------|--------------|
|
||||
| All known | `PID_NL-NH-AMS_1985-03-15_NL-NH-HAA_2020-08-22_JAN-BERG` |
|
||||
| Birth year only | `ID_NL-NH-AMS_1985_XX-XX-XXX_XXXX_JAN-BERG` |
|
||||
| Birth decade only | `ID_XX-XX-XXX_197X_XX-XX-XXX_XXXX_JAN-BERG` |
|
||||
| Nothing known | `ID_XX-XX-XXX_XXXX_XX-XX-XXX_XXXX_JAN-BERG` |
|
||||
| Living person | `ID_NL-NH-AMS_1985_XX-XX-XXX_XXXX_JAN-BERG` |
|
||||
|
||||
### 4.3 Filename Safety
|
||||
|
||||
Only some EDTF characters are **filename-safe**:
|
||||
|
||||
| Character | Filename Safe? | Notes |
|
||||
|-----------|----------------|-------|
|
||||
| `X` | YES | Uppercase letter |
|
||||
| `~` | YES | Allowed on macOS/Linux/Windows |
|
||||
| `?` | NO | Not allowed on Windows |
|
||||
| `%` | CAUTION | URL encoding issues |
|
||||
| `[` `]` | CAUTION | Shell escaping issues |
|
||||
| `,` | YES | Allowed |
|
||||
| `/` | NO | Directory separator |
|
||||
| `\|` | CAUTION | Shell pipe, Windows disallowed |
|
||||
|
||||
**Recommendation**: For filenames, use only:
|
||||
- `X` for unknown digits
|
||||
- `~` for approximate (suffix only)
|
||||
- Avoid `?`, `%`, `[]`, `/`, `|` in filenames
|
||||
|
||||
When set notation `[..]` is needed, store in metadata but use simplified form in filename:
|
||||
- Filename: `ID_XX-XX-XXX_197X_...` (simplified)
|
||||
- Metadata: `birth_date_edtf: "[1975,1985]"` (full EDTF)
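
One way to derive the simplified filename form from the full EDTF value is sketched below. It assumes a one-of set is reduced to the decade of its first member, which reproduces both examples in this rule; the helper name `filename_date` is hypothetical.

```python
UNSAFE_FILENAME_CHARS = set("?%[]/|")


def filename_date(edtf: str) -> str:
    """Return a filename-safe date component for a full EDTF value."""
    if edtf.startswith("[") and edtf.endswith("]"):
        first = edtf[1:-1].split(",")[0].strip()
        # Assumption: sets collapse to the decade of their first member.
        edtf = first[:3] + "X"
    unsafe = UNSAFE_FILENAME_CHARS & set(edtf)
    if unsafe:
        raise ValueError(f"not filename-safe: {sorted(unsafe)}")
    return edtf
```

Values that still contain unsafe characters raise an error rather than being silently rewritten, so the full EDTF form in metadata stays the source of truth.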
|
||||
|
||||
---
|
||||
|
||||
## 5. Decision Tree
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────┐
│ Person entity missing birth_date        │
└─────────────────┬───────────────────────┘
                  ▼
┌─────────────────────────────────────────┐
│ Search Exa + Linkup for birth date      │
└─────────────────┬───────────────────────┘
                  ▼
          ┌───────┴───────┐
          │  Date found?  │
          └───────┬───────┘
         YES      │      NO
          ▼       │       ▼
┌─────────────────┐  ┌─────────────────────────────┐
│ Record as       │  │ Can estimate from career?   │
│ web_claim with  │  └───────────┬─────────────────┘
│ provenance      │    YES       │       NO
└─────────────────┘     ▼        │        ▼
                ┌───────────────┐ ┌───────────────┐
                │ Use EDTF      │ │ Use XXXX      │
                │ estimate:     │ │ (unknown)     │
                │ 1988S2 or     │ │               │
                │ 198X          │ │               │
                └───────────────┘ └───────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 6. Examples
|
||||
|
||||
### 6.1 Fully Unknown (No Enrichment Found)
|
||||
|
||||
```yaml
|
||||
# Person: Nora Ruijs (student, no public birth info)
ppid: ID_XX-XX-XXX_XXXX_XX-XX-XXX_XXXX_NORA-RUIJS

birth_date:
  edtf: "XXXX"
  precision: "unknown"

enrichment_metadata:
  birth_date_search:
    attempted: true
    search_date: "2025-01-09T14:30:00Z"
    result: "NOT_FOUND"
|
||||
```
|
||||
|
||||
### 6.2 Decade Estimated from Career
|
||||
|
||||
```yaml
|
||||
# Person: Senior curator, started career 1995
ppid: ID_NL-NH-AMS_197X_XX-XX-XXX_XXXX_JAN-BERG

birth_date:
  edtf: "197X"
  edtf_full: "1972S3"  # Estimated 1972, significant to 3 digits
  precision: "decade"
  estimation_method: "career_start_inference"
  estimation_basis: "Career started 1995 as junior curator, estimated age 23"
|
||||
```
|
||||
|
||||
### 6.3 Multiple Possible Decades
|
||||
|
||||
```yaml
|
||||
# Person: Could be born 1970s or 1980s based on conflicting sources
ppid: ID_XX-XX-XXX_197X_XX-XX-XXX_XXXX_MARIA-SILVA  # Simplified for filename

birth_date:
  edtf: "[197X,198X]"    # Full EDTF with set notation
  edtf_filename: "197X"  # Simplified for filename (earlier estimate)
  precision: "decade_uncertain"
  notes: "Sources conflict: LinkedIn suggests 1980s, university bio suggests 1970s"
|
||||
```
|
||||
|
||||
### 6.4 Exact Date Found via Enrichment
|
||||
|
||||
```yaml
|
||||
# Person: Birth date found on institutional bio page
ppid: ID_NL-NH-AMS_1985-03-15_XX-XX-XXX_XXXX_JAN-BERG

birth_date:
  edtf: "1985-03-15"
  precision: "day"

web_claims:
  - claim_type: birth_date
    claim_value: "1985-03-15"
    source_url: "https://museum.nl/team/jan-berg"
    retrieved_on: "2025-01-09T14:30:00Z"
    retrieval_agent: "opencode-claude-sonnet-4"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 7. Anti-Patterns
|
||||
|
||||
### 7.1 FORBIDDEN: Fabricating Dates
|
||||
|
||||
```yaml
|
||||
# WRONG - No source, no search attempted
birth_date:
  edtf: "1985-03-15"  # Where did this come from?!
|
||||
```
|
||||
|
||||
### 7.2 FORBIDDEN: Using Non-EDTF Notation
|
||||
|
||||
```yaml
|
||||
# WRONG - Not EDTF compliant
|
||||
birth_date: "197~8~" # Invalid notation
|
||||
birth_date: "1970s" # Use 197X instead
|
||||
birth_date: "circa 1985" # Use 1985~ instead
|
||||
birth_date: "unknown" # Use XXXX instead
|
||||
```
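
Forms like these can be rejected with a simple pattern check. The sketch below covers only a subset of EDTF: `X`-masked year, month and day, plus an optional trailing `~`, `?` or `%`. Sets, intervals and significant digits would need a full EDTF parser; the name `is_simple_edtf_date` is hypothetical.

```python
import re

# Year with optional X digits, optional month/day (or XX), optional qualifier.
SIMPLE_EDTF_DATE = re.compile(
    r"[0-9X]{4}"
    r"(-(0[1-9]|1[0-2]|XX)"
    r"(-(0[1-9]|[12][0-9]|3[01]|XX))?)?"
    r"[~?%]?"
)


def is_simple_edtf_date(value: str) -> bool:
    """True if `value` matches the X-notation subset described in this rule."""
    return SIMPLE_EDTF_DATE.fullmatch(value) is not None
```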
|
||||
|
||||
### 7.3 FORBIDDEN: Skipping Enrichment Search
|
||||
|
||||
```yaml
|
||||
# WRONG - No search attempted
birth_date:
  edtf: "XXXX"
# No enrichment_metadata showing search was attempted!
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 8. Validation Rules
|
||||
|
||||
1. **Search Required**: Cannot use `XXXX` without `enrichment_metadata.birth_date_search.attempted: true`
|
||||
2. **EDTF Compliance**: All dates must parse as valid EDTF (use validator)
|
||||
3. **Filename Safety**: PPID filenames must avoid `?`, `%`, `[]`, `/`, `|`
|
||||
4. **Provenance Required**: All found dates must have `web_claims` with source
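
Rules 1, 3 and 4 lend themselves to an automated check; rule 2 needs an EDTF validator and is left out. The sketch below is illustrative only: `check_birth_date_record` is a hypothetical name, and it treats any fully numeric date as a "found" date that requires a web claim.

```python
import re

EXACT_DATE = re.compile(r"\d{4}(-\d{2}(-\d{2})?)?")
UNSAFE_FILENAME_CHARS = set("?%[]/|")


def check_birth_date_record(record: dict) -> list:
    """Return violations of validation rules 1, 3 and 4 for one person record."""
    problems = []
    edtf = (record.get("birth_date") or {}).get("edtf", "")
    search = (record.get("enrichment_metadata") or {}).get("birth_date_search") or {}
    if edtf == "XXXX" and search.get("attempted") is not True:
        problems.append("XXXX used without a recorded enrichment search")
    if UNSAFE_FILENAME_CHARS & set(record.get("ppid", "")):
        problems.append("ppid contains characters that are unsafe in filenames")
    has_claim = any(
        claim.get("claim_type") == "birth_date"
        for claim in record.get("web_claims") or []
    )
    if EXACT_DATE.fullmatch(edtf) and not has_claim:
        problems.append("exact date has no birth_date web claim")
    return problems
```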
|
||||
|
||||
---
|
||||
|
||||
## 9. References
|
||||
|
||||
- [EDTF Specification (Library of Congress)](https://www.loc.gov/standards/datetime/)
|
||||
- [ISO 8601-2:2019](https://www.iso.org/standard/70908.html)
|
||||
- [PPID-GHCID Alignment Document](../../docs/plan/person_pid/10_ppid_ghcid_alignment.md)
|
||||
- [Rule 21: Data Fabrication Prohibition](../DATA_FABRICATION_PROHIBITION.md)
|
||||
|
|
@ -5,8 +5,8 @@
|
|||
## The Rule
|
||||
|
||||
1. **Slots (Predicates)** MUST ONLY have `exact_mappings` to ontology **predicates** (properties).
|
||||
* ❌ INVALID: Slot `analyzes_or_analyzed` maps to `schema:object` (a Class).
|
||||
* ✅ VALID: Slot `analyzes_or_analyzed` maps to `crm:P129_is_about` (a Property).
|
||||
* ❌ INVALID: Slot `analyze` maps to `schema:object` (a Class).
|
||||
* ✅ VALID: Slot `analyze` maps to `crm:P129_is_about` (a Property).
|
||||
|
||||
2. **Classes (Entities)** MUST ONLY have `exact_mappings` to ontology **classes** (entities).
|
||||
* ❌ INVALID: Class `Person` maps to `foaf:name` (a Property).
|
||||
|
|
|
|||
|
|
@ -0,0 +1,65 @@
|
|||
# Rule: Engineering Parsimony and Domain Modeling
|
||||
|
||||
## Critical Convention
|
||||
|
||||
Our ontology follows an engineering-oriented approach: practical domain utility and
|
||||
stable interoperability take priority over minimal, tool-specific class catalogs.
|
||||
|
||||
## Rule
|
||||
|
||||
1. Model domain concepts, not implementation tools.
   - Reject classes like `ExaSearchMetadata`, `OpenAIFetchResult`, `ElasticsearchHit`.

2. Prefer generic, reusable activity/entity classes for operational provenance.
   - Use classes such as `ExternalSearchMetadata`, `RetrievalActivity`, `SearchResult`.

3. Capture tool/vendor details in slot values, not class names.
   - Record with generic predicates like `has_tool`, `has_method`, `has_agent`, `has_note`.

4. Digital platforms acting as custodians are valid domain classes.
   - Platform-as-custodian classes (for example YouTube-related custodian classes) are allowed.
   - Data processing/search tools are not ontology class candidates.

5. Avoid ontology growth driven by transient engineering stack choices.
   - New class proposals must be justified by cross-tool, domain-stable semantics.
|
||||
|
||||
## Rationale
|
||||
|
||||
- Tool names are volatile implementation details and age quickly.
|
||||
- Domain-level abstractions maximize reuse, query consistency, and mapping stability.
|
||||
- This aligns with an engineering ontology practice where strict theoretical
  parsimony in candidate theories is not the only optimization criterion; practical
  semantic interoperability and maintainability are primary.
|
||||
|
||||
## Examples
|
||||
|
||||
### Wrong
|
||||
|
||||
```yaml
classes:
  ExaSearchMetadata:
    class_uri: prov:Activity
```
|
||||
|
||||
### Correct
|
||||
|
||||
```yaml
classes:
  ExternalSearchMetadata:
    class_uri: prov:Activity
    slots:
      - has_tool
      - has_method
      - has_agent
```
|
||||
|
||||
## References
|
||||
|
||||
1. Liefke, K. (2024). *Natural Language Ontology and Semantic Theory*.
|
||||
Cambridge Elements in Semantics. DOI: `10.1017/9781009307789`.
|
||||
URL: https://www.cambridge.org/core/elements/abs/natural-language-ontology-and-semantic-theory/E8DDE548BB8A98137721984E26FAD764
|
||||
|
||||
2. Liefke, K. (2025). *Reduction and Unification in Natural Language Ontology*.
|
||||
Cambridge Elements in Semantics. DOI: `10.1017/9781009559683`.
|
||||
URL: https://www.cambridge.org/core/elements/abs/reduction-and-unification-in-natural-language-ontology/40F58ABA0D9C08958B5926F0CBDAD3CA
|
||||
|
||||
|
|
@ -0,0 +1,37 @@
|
|||
# Exact Mapping Predicate/Class Distinction Rule
|
||||
|
||||
🚨 **CRITICAL**: The `exact_mappings` property implies semantic equivalence. Equivalence can only exist between elements of the same ontological category.
|
||||
|
||||
## The Rule
|
||||
|
||||
1. **Slots (Predicates)** MUST ONLY have `exact_mappings` to ontology **predicates** (properties).
   * ❌ INVALID: Slot `analyze` maps to `schema:object` (a Class).
   * ✅ VALID: Slot `analyze` maps to `crm:P129_is_about` (a Property).

2. **Classes (Entities)** MUST ONLY have `exact_mappings` to ontology **classes** (entities).
   * ❌ INVALID: Class `Person` maps to `foaf:name` (a Property).
   * ✅ VALID: Class `Person` maps to `foaf:Person` (a Class).

3. **When true equivalence exists and is verified, exact mapping is preferred.**
   * ✅ VALID: Class `Acquisition` maps to `crm:E8_Acquisition`.
   * ✅ VALID: Slot mapped to an actually equivalent ontology property.
   * ❗ Do not avoid `exact_mappings` by default; avoid them only when the target's scope is broader, narrower, or similar but not equal.
|
||||
|
||||
## Rationale
|
||||
|
||||
Mapping a slot (which defines a relationship or attribute) to a class (which defines a type of entity) is a category error. `schema:object` represents the *class* of objects, not the *relationship* of "having an object" or "analyzing an object".
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
When adding or reviewing `exact_mappings`:
|
||||
- [ ] Is the LinkML element a Class or a Slot?
|
||||
- [ ] Did you verify the target term type in the ontology definition files (do not rely on naming heuristics)?
|
||||
- [ ] Do they match? (Class↔Class, Slot↔Property)
|
||||
- [ ] If the target ontology uses opaque IDs (like CIDOC-CRM `E55_Type`), verify the type definition in the ontology file.
|
||||
- [ ] If semantic scope is truly equivalent, use `exact_mappings` (not `close`/`broad` as a conservative fallback).
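
The category check in this list can be sketched as a lookup. The `TERM_KINDS` table and the function name below are hypothetical; in practice the table would be filled from the ontology definition files, not typed by hand or guessed from term names.

```python
# ASSUMPTION: a small hand-filled lookup standing in for the ontology files.
TERM_KINDS = {
    "crm:P129_is_about": "property",
    "crm:E8_Acquisition": "class",
    "foaf:Person": "class",
    "foaf:name": "property",
}

# A LinkML slot must map to a property, a LinkML class to a class.
EXPECTED_KIND = {"slot": "property", "class": "class"}


def check_exact_mapping(element_kind: str, target: str) -> str:
    """Classify one exact_mappings entry as 'ok', 'category-error', or 'unverified'."""
    target_kind = TERM_KINDS.get(target)
    if target_kind is None:
        return "unverified"  # term type has not been looked up yet
    return "ok" if target_kind == EXPECTED_KIND[element_kind] else "category-error"
```

Returning `unverified` for unknown terms keeps the check from silently passing a mapping whose type was never confirmed.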
|
||||
|
||||
## Common Pitfalls to Fix
|
||||
|
||||
- Mapping slots to `schema:Object` or `schema:Thing`.
|
||||
- Mapping slots to `skos:Concept`.
|
||||
- Mapping classes to `schema:name` or `dc:title`.
|
||||
144
.opencode/rules/linkml/feedback-vs-revision-distinction.md
Normal file
|
|
@ -0,0 +1,144 @@
|
|||
# Rule 58: Feedback vs Revision Distinction in slot_fixes.yaml
|
||||
|
||||
## Summary
|
||||
|
||||
The `feedback` and `revision` fields in `slot_fixes.yaml` serve distinct purposes and MUST NOT be conflated or renamed.
|
||||
|
||||
## Field Definitions
|
||||
|
||||
### `revision` Field
|
||||
- **Purpose**: Defines WHAT the migration target is
|
||||
- **Content**: List of slots and classes to create
|
||||
- **Authority**: IMMUTABLE (per Rule 57)
|
||||
- **Format**: Structured YAML list with `label`, `type`, optional `link_branch`
|
||||
|
||||
### `feedback` Field
|
||||
- **Purpose**: Contains user instructions on HOW the revision needs to be applied or corrected
|
||||
- **Content**: Can be string or structured format
|
||||
- **Authority**: User directives that override previous `notes`
|
||||
- **Action Required**: Agent must interpret and act upon feedback
|
||||
|
||||
## Feedback Formats
|
||||
|
||||
### Format 1: Structured (with `done` field)
|
||||
```yaml
feedback:
  - timestamp: '2026-01-17T00:01:57Z'
    user: Simon C. Kemper
    done: false  # Becomes true after agent processes
    comment: |
      The migration should use X instead of Y.
    response: ""  # Agent fills this after completing
```
|
||||
|
||||
### Format 2: String (direct instruction)
|
||||
```yaml
feedback: I reject this! type_id should be migrated to has_or_had_identifier + Identifier
```
|
||||
|
||||
Or:
|
||||
```yaml
feedback: I altered the revision based on this feedback. Conduct this new migration accordingly.
```
|
||||
|
||||
## Interpretation Rules
|
||||
|
||||
| Feedback Contains | Meaning | Action Required |
|-------------------|---------|-----------------|
| "I reject this" | Previous `notes` were WRONG | Follow `revision` field instead |
| "I altered the revision" | User updated `revision` | Execute migration per NEW revision |
| "Conduct the migration" | Migration not yet done | Execute migration now |
| "Please conduct accordingly" | Migration pending | Execute migration now |
| "ADDRESSED" or `done: true` | Already processed | No action needed |
|
||||
|
||||
## Decision Tree
|
||||
|
||||
```
Is feedback field present?
├─ NO → Check `processed.status`
│   ├─ true → Migration complete
│   └─ false → Execute revision
│
└─ YES → What format?
    ├─ Structured with `done: true` → No action needed
    ├─ Structured with `done: false` → Process feedback, then set done: true
    └─ String format → Parse for keywords:
        ├─ "reject" → Previous notes invalid, follow revision
        ├─ "altered/adjusted revision" → Execute NEW revision
        ├─ "conduct/please" → Migration pending, execute now
        └─ "ADDRESSED" → Already done, no action
```
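
The tree can be sketched as a small Python function. The function name, the return labels, and the `needs-review` fallback for feedback without a known keyword are assumptions for illustration; keywords are tested in the order the tree lists them.

```python
def next_action(entry: dict) -> str:
    """Map one slot_fixes.yaml entry to 'none', 'execute', or 'needs-review'."""
    feedback = entry.get("feedback")

    # No feedback: fall back to processed.status.
    if feedback is None:
        done = entry.get("processed", {}).get("status") is True
        return "none" if done else "execute"

    # Structured format: act only if some item is still open.
    if isinstance(feedback, list):
        pending = any(item.get("done") is not True for item in feedback)
        return "execute" if pending else "none"

    # String format: keyword scan, in the order the tree lists the keywords.
    text = feedback.lower()
    for keyword in ("reject", "altered", "adjusted", "conduct", "please"):
        if keyword in text:
            return "execute"
    if "ADDRESSED" in feedback:
        return "none"
    return "needs-review"  # ASSUMPTION: unknown wording goes to a human
```

Every branch that ends in a migration returns `execute`, because in all of those cases the `revision` field is what gets applied.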
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
### WRONG: Renaming feedback to revision
|
||||
```yaml
# DO NOT DO THIS
# feedback contains instructions, not migration specs
revision:  # Was: feedback
  - I reject this! Use has_or_had_identifier
```
|
||||
|
||||
### WRONG: Ignoring string feedback
|
||||
```yaml
feedback: Please conduct the migration accordingly.
notes: "NO MIGRATION NEEDED"  # WRONG - feedback overrides notes
```
|
||||
|
||||
### WRONG: Treating all feedback as completed
|
||||
```yaml
feedback: I altered the revision. Conduct this new migration.
processed:
  status: true  # WRONG if migration not actually done
```
|
||||
|
||||
## Correct Workflow
|
||||
|
||||
1. **Read feedback** - Understand user instruction
|
||||
2. **Check revision** - This defines the target migration
|
||||
3. **Execute migration** - Create/update slots and classes per revision
|
||||
4. **Update processed.status** - Set to `true`
|
||||
5. **Add response** - Document what was done
   - For structured feedback: Set `done: true` and fill `response`
   - For string feedback: Add new structured feedback entry confirming completion
|
||||
|
||||
## Example: Processing String Feedback
|
||||
|
||||
Before:
|
||||
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/type_id
  feedback: I reject this! type_id should be migrated to has_or_had_identifier + Identifier
  revision:
    - label: has_or_had_identifier
      type: slot
    - label: Identifier
      type: class
  processed:
    status: false
    notes: "Previously marked as no migration needed"
```
|
||||
|
||||
After processing:
|
||||
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/type_id
  feedback:
    - timestamp: '2026-01-17T12:00:00Z'
      user: System
      done: true
      comment: "Original string feedback: I reject this! type_id should be migrated to has_or_had_identifier + Identifier"
      response: "Migration completed. type_id.yaml archived, consuming classes updated to use has_or_had_identifier slot with Identifier range."
  revision:
    - label: has_or_had_identifier
      type: slot
    - label: Identifier
      type: class
  processed:
    status: true
    notes: "Migration completed per user feedback rejecting previous notes."
```
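
The before/after transformation can be sketched as a pure function. The function name and the choice to return a copy are assumptions; the `revision` list is copied through unchanged, in line with its immutable status, and updating `notes` is left to the caller.

```python
import copy


def close_string_feedback(entry: dict, response: str, timestamp: str) -> dict:
    """Return a copy of `entry` with its string feedback converted and marked done."""
    updated = copy.deepcopy(entry)  # never mutate the curated input
    original = updated["feedback"]

    # Replace the string with one structured item that preserves the original text.
    updated["feedback"] = [{
        "timestamp": timestamp,
        "user": "System",
        "done": True,
        "comment": f"Original string feedback: {original}",
        "response": response,
    }]
    updated.setdefault("processed", {})["status"] = True
    # `revision` is intentionally left exactly as it was.
    return updated
```

Call it only after the migration itself has actually been carried out, so that `processed.status` never runs ahead of the schema.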
|
||||
|
||||
## See Also
|
||||
|
||||
- **Rule 53**: Full Slot Migration - slot_fixes.yaml is AUTHORITATIVE
|
||||
- **Rule 57**: slot_fixes.yaml Revision Key is IMMUTABLE
|
||||
- **Rule 39**: Slot Naming Convention (RiC-O Style)
|
||||
373
.opencode/rules/linkml/full-slot-migration-rule.md
Normal file
|
|
@ -0,0 +1,373 @@
|
|||
# Rule 53: Full Slot Migration - No Deprecation Notes
|
||||
|
||||
🚨 **CRITICAL**: When migrating slots from `slot_fixes.yaml`:
|
||||
|
||||
1. **Follow the `revision` section EXACTLY** - The `slot_fixes.yaml` file specifies the exact replacement slots and classes to use
|
||||
2. **Perform FULL MIGRATION** - Completely remove the deprecated slot from the entity class
|
||||
3. **Do NOT add deprecation notes** - Never keep both old and new slots with deprecation markers
|
||||
|
||||
---
|
||||
|
||||
## 🚨 slot_fixes.yaml is AUTHORITATIVE AND CURATED 🚨
|
||||
|
||||
**File Location**: `schemas/20251121/linkml/modules/slots/slot_fixes.yaml`
|
||||
|
||||
**THIS FILE IS THE SINGLE SOURCE OF TRUTH FOR ALL SLOT MIGRATIONS.**
|
||||
|
||||
The `slot_fixes.yaml` file has been **manually curated** to specify the exact replacement slots and classes for each deprecated slot. The revisions are based on:
|
||||
|
||||
1. **Ontology analysis** - Each replacement was chosen based on alignment with base ontologies (CIDOC-CRM, RiC-O, PROV-O, Schema.org, etc.)
|
||||
2. **Semantic correctness** - Revisions reflect the intended meaning of the original slot
|
||||
3. **Pattern consistency** - Follows established naming conventions (Rule 39: RiC-O style, Rule 43: singular nouns)
|
||||
4. **Class hierarchy design** - Type/Types pattern (Rule 0b) applied where appropriate
|
||||
|
||||
**YOU MUST NOT**:
|
||||
- ❌ Substitute different slots than those specified in `revision`
|
||||
- ❌ Use your own judgment to pick "similar" slots
|
||||
- ❌ Skip the revision and invent new mappings
|
||||
- ❌ Partially apply the revision (e.g., use the slot but not the class)
|
||||
|
||||
**YOU MUST**:
|
||||
- ✅ Follow the `revision` section TO THE LETTER
|
||||
- ✅ Use EXACTLY the slots and classes specified
|
||||
- ✅ Apply ALL components of the revision (both slots AND classes)
|
||||
- ✅ Interpret `link_branch` fields correctly (see below)
|
||||
- ✅ Update `processed.status: true` after completing migration
|
||||
|
||||
---
|
||||
|
||||
## Understanding `link_branch` in Revision Plans
|
||||
|
||||
🚨 **CRITICAL**: The `link_branch` field in revision plans indicates **nested class attributes**. Items with `link_branch: N` are slots/classes that belong TO the primary class, not standalone replacements.
|
||||
|
||||
### How to Interpret `link_branch`
|
||||
|
||||
| Revision Item | Meaning |
|---------------|---------|
| Items **WITHOUT** `link_branch` | **PRIMARY** slot and class to create |
| Items **WITH** `link_branch: 1` | First attribute branch that the primary class needs |
| Items **WITH** `link_branch: 2` | Second attribute branch that the primary class needs |
| Items **WITH** `link_branch: N` | Nth attribute branch for the primary class |
|
||||
|
||||
### Example: `visitor_count` Revision
|
||||
|
||||
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/visitor_count
  revision:
    - label: has_or_had_quantity  # PRIMARY SLOT (no link_branch)
      type: slot
    - label: Quantity  # PRIMARY CLASS (no link_branch)
      type: class
    - label: has_or_had_measurement_unit  # Quantity needs this slot
      type: slot
      link_branch: 1  # ← Branch 1: unit attribute
    - label: MeasureUnit  # Range of has_or_had_measurement_unit
      type: class
      value:
        - visitors
      link_branch: 1
    - label: temporal_extent  # Quantity needs this slot too
      type: slot
      link_branch: 2  # ← Branch 2: time attribute
    - label: TimeSpan  # Range of temporal_extent
      type: class
      link_branch: 2
```
|
||||
|
||||
**Interpretation**: This creates:
|
||||
1. **Primary**: `has_or_had_quantity` slot → `Quantity` class
|
||||
2. **Branch 1**: `Quantity.has_or_had_measurement_unit` → `MeasureUnit` (with value "visitors")
|
||||
3. **Branch 2**: `Quantity.temporal_extent` → `TimeSpan`
|
||||
|
||||
### Resulting Class Structure
|
||||
|
||||
```yaml
# The Quantity class should have these slots:
Quantity:
  slots:
    - has_or_had_measurement_unit  # From link_branch: 1
    - temporal_extent  # From link_branch: 2
```
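
The `link_branch` reading can be sketched as a grouping step over the parsed revision list. The function name and the output shape are assumptions for illustration; the sample data mirrors the `visitor_count` revision.

```python
def group_revision(revision: list[dict]) -> dict:
    """Split revision items into the primary pair and the numbered branches."""
    grouped = {"primary": [], "branches": {}}
    for item in revision:
        branch = item.get("link_branch")
        if branch is None:
            # No link_branch: this is the primary slot or class.
            grouped["primary"].append(item["label"])
        else:
            # link_branch: N groups a slot with the class it points to.
            grouped["branches"].setdefault(branch, []).append(item["label"])
    return grouped


visitor_count = [
    {"label": "has_or_had_quantity", "type": "slot"},
    {"label": "Quantity", "type": "class"},
    {"label": "has_or_had_measurement_unit", "type": "slot", "link_branch": 1},
    {"label": "MeasureUnit", "type": "class", "value": ["visitors"], "link_branch": 1},
    {"label": "temporal_extent", "type": "slot", "link_branch": 2},
    {"label": "TimeSpan", "type": "class", "link_branch": 2},
]
result = group_revision(visitor_count)
```

Here `result["primary"]` holds the primary slot and class, and each entry in `result["branches"]` pairs a slot on the primary class with its range class.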
|
||||
|
||||
### Complex Example: `visitor_conversion_rate`
|
||||
|
||||
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/visitor_conversion_rate
  revision:
    - label: has_or_had_conversion_rate  # PRIMARY SLOT
      type: slot
    - label: ConversionRate  # PRIMARY CLASS
      type: class
    - label: has_or_had_type  # ConversionRate.has_or_had_type
      type: slot
      link_branch: 1
    - label: ConversionRateType  # Abstract type class
      type: class
      link_branch: 1
    - label: includes_or_included  # ConversionRateType hierarchy slot
      type: slot
      link_branch: 1
    - label: ConversionRateTypes  # Concrete subclasses file
      type: class
      link_branch: 1
    - label: temporal_extent  # ConversionRate.temporal_extent
      type: slot
      link_branch: 2
    - label: TimeSpan  # Range of temporal_extent
      type: class
      link_branch: 2
```
|
||||
|
||||
**Interpretation**:
|
||||
1. **Primary**: `has_or_had_conversion_rate` → `ConversionRate`
|
||||
2. **Branch 1**: Type hierarchy with `ConversionRateType` (abstract) + `ConversionRateTypes` (concrete subclasses)
|
||||
3. **Branch 2**: Temporal tracking via `temporal_extent` → `TimeSpan`
|
||||
|
||||
### Migration Checklist for `link_branch` Revisions
|
||||
|
||||
- [ ] Create/verify PRIMARY slot exists
|
||||
- [ ] Create/verify PRIMARY class exists
|
||||
- [ ] For EACH `link_branch: N`:
  - [ ] Add the branch slot to PRIMARY class's `slots:` list
  - [ ] Import the branch slot file
  - [ ] Import the branch class file (if creating new class)
  - [ ] Verify range of branch slot points to branch class
|
||||
- [ ] Update consuming class to use PRIMARY slot (not deprecated slot)
|
||||
- [ ] Update examples to show nested structure
|
||||
|
||||
---
|
||||
|
||||
## Mandatory: Follow slot_fixes.yaml Revisions Exactly
|
||||
|
||||
**The `revision` section in `slot_fixes.yaml` is AUTHORITATIVE.** Do not substitute different slots based on your own judgment.
|
||||
|
||||
**Example from slot_fixes.yaml**:
|
||||
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/actual_start
  revision:
    - label: begin_of_the_begin  # ← USE THIS SLOT
      type: slot
    - label: TimeSpan  # ← USE THIS CLASS
      type: class
```
|
||||
|
||||
**CORRECT**: Use `begin_of_the_begin` slot (as specified)
|
||||
**WRONG**: Substitute `has_actual_start_date` (not in revision)
|
||||
|
||||
## The Problem
|
||||
|
||||
Adding deprecation notes while keeping both old and new slots:
|
||||
- Creates schema bloat with redundant properties
|
||||
- Confuses data consumers about which slot to use
|
||||
- Violates single-source-of-truth principle
|
||||
- Complicates future data validation
|
||||
|
||||
## Anti-Pattern (WRONG)
|
||||
|
||||
```yaml
# WRONG - Keeping deprecated slot with deprecation note
classes:
  TemporaryLocation:
    slots:
      - actual_start  # OLD - kept with deprecation note
      - actual_end  # OLD - kept with deprecation note
      - has_actual_start_date  # NEW
      - has_actual_end_date  # NEW
    slot_usage:
      actual_start:
        deprecated: |
          DEPRECATED: Use has_actual_start_date instead.
        # ... more deprecation documentation
```
|
||||
|
||||
## Correct Pattern
|
||||
|
||||
```yaml
# CORRECT - Only new slots, old slots completely removed
classes:
  TemporaryLocation:
    slots:
      - has_actual_start_date  # NEW - only new slots present
      - has_actual_end_date  # NEW
    # NO slot_usage for deprecated slots - they don't exist in this class
```
|
||||
|
||||
## Migration Steps
|
||||
|
||||
When processing a slot from `slot_fixes.yaml`:
|
||||
|
||||
1. **Identify affected entity class(es)**
|
||||
2. **Remove old slot from imports** (if dedicated import file exists)
|
||||
3. **Remove old slot from slots list**
|
||||
4. **Remove any slot_usage for old slot**
|
||||
5. **Add new slot import** (if not already present)
|
||||
6. **Add new slot to slots list**
|
||||
7. **Add slot_usage for new slot** (if range override or customization needed)
|
||||
8. **Update examples** to use new slot
|
||||
9. **Validate with gen-owl**
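
Steps 2 to 6 can be sketched on a parsed class file. This is an illustration under assumptions: the function name is hypothetical, the class is represented as a flat dict with `imports`, `slots`, and `slot_usage` keys, and imports follow the `../slots/<name>` layout. Steps 7 to 9 (custom `slot_usage`, examples, `gen-owl`) still need to be done separately.

```python
import copy


def migrate_class(cls: dict, old_slot: str, new_slot: str) -> dict:
    """Return a copy of a class definition with `old_slot` fully replaced."""
    out = copy.deepcopy(cls)
    old_import = f"../slots/{old_slot}"
    new_import = f"../slots/{new_slot}"

    # Steps 2-4: remove every trace of the old slot from this class.
    out["imports"] = [i for i in out.get("imports", []) if i != old_import]
    out["slots"] = [s for s in out.get("slots", []) if s != old_slot]
    out.get("slot_usage", {}).pop(old_slot, None)

    # Steps 5-6: add the new slot once, without duplicating it.
    if new_import not in out["imports"]:
        out["imports"].append(new_import)
    if new_slot not in out["slots"]:
        out["slots"].append(new_slot)
    return out
```

The `new_slot` argument should always come from the `revision` section of the entry being processed, never from the caller's own choice of a similar slot.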
|
||||
|
||||
## What Happens to Old Slot Files
|
||||
|
||||
The old slot files in `modules/slots/` (e.g., `actual_start.yaml`, `activities_societies.yaml`) are **NOT deleted** because:
|
||||
- Other entity classes might still use them
|
||||
- They serve as documentation of the old schema
|
||||
- They can be archived when all usages are migrated
|
||||
|
||||
However, the old slots are **removed from the entity class** being migrated.
|
||||
|
||||
## Example: TemporaryLocation Migration
|
||||
|
||||
**Before** (with old slots):
|
||||
```yaml
imports:
  - ../slots/actual_end
  - ../slots/actual_start
  - ../slots/has_actual_start_date
  - ../slots/has_actual_end_date

slots:
  - actual_end
  - actual_start
  - has_actual_start_date
  - has_actual_end_date
```
|
||||
|
||||
**After** (fully migrated):
|
||||
```yaml
imports:
  # actual_end and actual_start imports REMOVED
  - ../slots/has_actual_start_date
  - ../slots/has_actual_end_date

slots:
  # actual_end and actual_start REMOVED from list
  - has_actual_start_date
  - has_actual_end_date
```
|
||||
|
||||
## Slot Usage for New Slots
|
||||
|
||||
Only add `slot_usage` for the new slot if you need to:
|
||||
- Override the range for this specific class
|
||||
- Add class-specific examples
|
||||
- Add class-specific constraints
|
||||
|
||||
Do NOT add `slot_usage` just to document that it replaces an old slot.
|
||||
|
||||
## Recording in slot_fixes.yaml
|
||||
|
||||
When marking a slot as processed:
|
||||
|
||||
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/actual_start
  processed:
    status: true
    timestamp: '2026-01-14T16:00:00Z'
    session: "session-2026-01-14-type-migration"
    notes: "FULLY MIGRATED: TemporaryLocation - actual_start REMOVED, using temporal_extent with TimeSpan.begin_of_the_begin (Rule 53)"
```
|
||||
|
||||
Note the "FULLY MIGRATED" prefix in notes to confirm this was a complete removal, not a deprecation-in-place.
|
||||
|
||||
---
|
||||
|
||||
## ⚠️ Common Mistakes to Avoid ⚠️
|
||||
|
||||
### Mistake 1: Substituting Different Slots
|
||||
|
||||
**slot_fixes.yaml specifies**:
|
||||
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/actual_start
  revision:
    - label: begin_of_the_begin  # ← MUST USE THIS
      type: slot
    - label: TimeSpan  # ← WITH THIS CLASS
      type: class
```
|
||||
|
||||
| Action | Status |
|--------|--------|
| Using `begin_of_the_begin` with `TimeSpan` | ✅ CORRECT |
| Using `has_actual_start_date` (invented) | ❌ WRONG |
| Using `start_date` (different slot) | ❌ WRONG |
| Using `begin_of_the_begin` WITHOUT `TimeSpan` | ❌ WRONG (incomplete) |
|
||||
|
||||
### Mistake 2: Partial Application
|
||||
|
||||
The revision often specifies MULTIPLE components that work together:
|
||||
|
||||
```yaml
revision:
  - label: has_or_had_type  # ← Slot for linking
    type: slot
  - label: BackupType  # ← Abstract base class
    type: class
  - label: includes_or_included  # ← Slot for hierarchy
    type: slot
  - label: BackupTypes  # ← Concrete subclasses
    type: class
```
|
||||
|
||||
**All four components** are part of the migration. Don't just use `has_or_had_type` and ignore the class structure.
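
A partial application can be detected by comparing the revision labels with what was actually wired into the schema. The function name and the `applied` set below are hypothetical; how that set is collected from the schema files is outside this sketch.

```python
def missing_components(revision: list[dict], applied: set[str]) -> list[str]:
    """List revision labels that the migration has not applied yet."""
    return [item["label"] for item in revision if item["label"] not in applied]


backup_revision = [
    {"label": "has_or_had_type", "type": "slot"},
    {"label": "BackupType", "type": "class"},
    {"label": "includes_or_included", "type": "slot"},
    {"label": "BackupTypes", "type": "class"},
]

# A partial migration that only wired up the linking slot:
gaps = missing_components(backup_revision, {"has_or_had_type"})
```

A non-empty `gaps` list means the entry must not yet be marked as processed.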
|
||||
|
||||
### Mistake 3: Not Using the `temporal_extent` Slot
|
||||
|
||||
When `slot_fixes.yaml` specifies TimeSpan-based revision:
|
||||
|
||||
```yaml
revision:
  - label: begin_of_the_begin
    type: slot
  - label: TimeSpan
    type: class
```
|
||||
|
||||
This means: **Use the `temporal_extent` slot** (which has `range: TimeSpan`) and access the temporal bounds via TimeSpan's slots:
|
||||
|
||||
```yaml
# CORRECT: Use temporal_extent with TimeSpan structure
temporal_extent:
  begin_of_the_begin: '2020-06-15'
  end_of_the_end: '2022-03-15'

# WRONG: Create new has_actual_start_date slot
has_actual_start_date: '2020-06-15'  # ❌ Not in revision!
```
|
||||
|
||||
### Mistake 4: Not Updating Examples
|
||||
|
||||
When migrating slots, **update ALL examples** in the class file:
|
||||
- Description examples (in class description)
|
||||
- slot_usage examples
|
||||
- Class-level examples (at bottom of file)
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
Before marking a slot as processed:
|
||||
|
||||
- [ ] Read the `revision` section completely
|
||||
- [ ] Identified ALL slots and classes in revision
|
||||
- [ ] Removed old slot from imports
|
||||
- [ ] Removed old slot from slots list
|
||||
- [ ] Removed old slot from slot_usage
|
||||
- [ ] Added new slot(s) per revision
|
||||
- [ ] Added new class import(s) per revision
|
||||
- [ ] Updated ALL examples to use new slots
|
||||
- [ ] Validated with `linkml-lint` or `gen-owl`
|
||||
- [ ] Updated `slot_fixes.yaml` with:
  - `status: true`
  - `timestamp` (ISO 8601)
  - `session` identifier
  - `notes` with "FULLY MIGRATED:" prefix
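
The last checklist item can be sketched as a check on the parsed `processed` block. The function name is hypothetical, and `datetime.fromisoformat` is used as an approximation: it accepts only a subset of ISO 8601, and a trailing `Z` is rewritten first because older Python versions reject it.

```python
from datetime import datetime


def check_processed(processed: dict) -> list[str]:
    """Return the checklist items that a `processed` block still fails."""
    problems = []
    if processed.get("status") is not True:
        problems.append("status is not true")

    try:
        # Normalise a trailing 'Z' to an explicit UTC offset before parsing.
        raw = str(processed.get("timestamp", ""))
        datetime.fromisoformat(raw.replace("Z", "+00:00"))
    except ValueError:
        problems.append("timestamp is missing or not ISO 8601")

    if not processed.get("session"):
        problems.append("session identifier is missing")
    if not str(processed.get("notes", "")).startswith("FULLY MIGRATED:"):
        problems.append('notes lack the "FULLY MIGRATED:" prefix')
    return problems
```

This only checks the bookkeeping; it says nothing about whether the old slot was really removed from the class file.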
|
||||
|
||||
---
|
||||
|
||||
## See Also
|
||||
|
||||
- Rule 9: Enum-to-Class Promotion (single source of truth principle)
|
||||
- Rule 0b: Type/Types File Naming Convention
|
||||
- Rule: Slot Naming Convention (Current Style)
|
||||
- `.opencode/ENUM_TO_CLASS_PRINCIPLE.md`
|
||||
- `schemas/20251121/linkml/modules/slots/slot_fixes.yaml` - **AUTHORITATIVE** master list of migrations
|
||||
129
.opencode/rules/linkml/generic-slots-specific-classes.md
Normal file
|
|
@ -0,0 +1,129 @@
|
|||
# Rule: Generic Slots, Specific Classes
|
||||
|
||||
**Identifier**: `generic-slots-specific-classes`
|
||||
**Severity**: **CRITICAL**
|
||||
|
||||
## Core Principle
|
||||
|
||||
**Slots MUST be generic predicates** that can be reused across multiple classes. **Classes MUST be specific** to provide context and constraints.
|
||||
|
||||
**DO NOT** create class-specific slots when a generic predicate can be used.
|
||||
|
||||
## Rationale
|
||||
|
||||
1. **Predicate Proliferation**: Creating bespoke slots for every class explodes the schema size (e.g., `has_museum_name`, `has_library_name`, `has_archive_name` instead of `has_name`).
|
||||
2. **Interoperability**: Generic predicates (`has_name`, `has_identifier`, `has_part`) map cleanly to standard ontologies (Schema.org, Dublin Core, RiC-O).
|
||||
3. **Querying**: It's easier to query "all entities with a name" than "all entities with museum_name OR library_name OR archive_name".
|
||||
4. **Maintenance**: Updating one generic slot propagates to all classes.
|
||||
|
||||
## Examples
|
||||
|
||||
### ❌ Anti-Pattern: Class-Specific Slots
|
||||
|
||||
```yaml
# WRONG: Creating specific slots for each class
slots:
  has_museum_visitor_count:
    range: integer
  has_library_patron_count:
    range: integer

classes:
  Museum:
    slots:
      - has_museum_visitor_count
  Library:
    slots:
      - has_library_patron_count
```
|
||||
|
||||
### ✅ Correct Pattern: Generic Slot, Specific Class Usage
|
||||
|
||||
```yaml
# CORRECT: One generic slot reused
slots:
  has_or_had_quantity:
    slot_uri: rico:hasOrHadQuantity
    range: Quantity
    multivalued: true

classes:
  Museum:
    slots:
      - has_or_had_quantity
    slot_usage:
      has_or_had_quantity:
        description: The number of visitors to the museum.

  Library:
    slots:
      - has_or_had_quantity
    slot_usage:
      has_or_had_quantity:
        description: The number of registered patrons.
```
|
||||
|
||||
## Intermediate Class Pattern
|
||||
|
||||
Making slots generic often requires introducing **Intermediate Classes** to hold structured data, rather than flattening attributes onto the parent class.
|
||||
|
||||
### ❌ Anti-Pattern: Specific Flattened Slots
|
||||
|
||||
```yaml
# WRONG: Flattened specific attributes
classes:
  Museum:
    slots:
      - has_museum_budget_amount
      - has_museum_budget_currency
      - has_museum_budget_year
```
|
||||
|
||||
### ✅ Correct Pattern: Generic Slot + Intermediate Class
|
||||
|
||||
```yaml
# CORRECT: Generic slot pointing to structured class
slots:
  has_or_had_budget:
    range: Budget
    multivalued: true

classes:
  Museum:
    slots:
      - has_or_had_budget

  Budget:
    slots:
      - has_or_had_amount
      - has_or_had_currency
      - has_or_had_year
```
|
||||
|
||||
## Specificity Levels
|
||||
|
||||
| Level | Component | Example |
|-------|-----------|---------|
| **Generic** | **Slot (Predicate)** | `has_or_had_identifier` |
| **Specific** | **Class (Subject/Object)** | `ISILCode` |
| **Specific** | **Slot Usage (Context)** | "The ISIL code assigned to this library" |
|
||||
|
||||
## Migration Guide
|
||||
|
||||
If you encounter an overly specific slot:
|
||||
|
||||
1. **Identify the generic concept** (e.g., `has_museum_opening_hours` → `has_opening_hours`).
|
||||
2. **Check if a generic slot exists** in `modules/slots/`.
|
||||
3. **If yes**, use the generic slot and add `slot_usage` to the class.
|
||||
4. **If no**, create the **generic** slot, not a specific one.
|
||||
|
||||
## Naming Indicators
|
||||
|
||||
**Reject slots containing:**
|
||||
* Class names (e.g., `has_custodian_name` → `has_name`)
|
||||
* Narrow types (e.g., `has_isbn_identifier` → `has_identifier`)
|
||||
* Contextual specifics (e.g., `has_primary_email` → `has_email` + type/role)
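
The first indicator (class names inside slot names) can be approximated with a token match. This is a rough heuristic under assumptions: both function names are hypothetical, and only the classes that *use* the slot should be passed in, since a generic slot such as `has_or_had_quantity` legitimately contains the name of its range class.

```python
import re


def snake(name: str) -> str:
    """Convert a CamelCase class name to snake_case (Museum -> museum)."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def class_specific_slots(slot_names: list[str], domain_classes: list[str]) -> dict:
    """Map each suspicious slot to the domain class name embedded in it."""
    flagged = {}
    for slot in slot_names:
        padded = f"_{slot}_"  # pad so only whole underscore-delimited tokens match
        for cls in domain_classes:
            if f"_{snake(cls)}_" in padded:
                flagged[slot] = cls
                break
    return flagged
```

A flagged slot is a candidate for review, not an automatic rejection; narrow types and contextual specifics still need a human reading.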
|
||||
|
||||
## See Also
|
||||
* Rule 55: Broaden Generic Predicate Ranges
|
||||
* Rule: Slot Naming Convention (Current Style)
|
||||
157
.opencode/rules/linkml/linkml-union-type-range-any-rule.md
Normal file
|
|
@ -0,0 +1,157 @@
|
|||
# Rule 59: LinkML Union Types Require `range: Any`
|
||||
|
||||
🚨 **CRITICAL**: When using `any_of` for union types in LinkML, you MUST also specify `range: Any` at the attribute level. Without it, the union type validation does NOT work.
|
||||
|
||||
## The Problem
|
||||
|
||||
LinkML's `any_of` construct allows defining slots that accept multiple types (e.g., string OR integer). However, there's a critical implementation detail:
|
||||
|
||||
**Without `range: Any`, the `any_of` constraint is silently ignored during validation.**
|
||||
|
||||
This leads to validation failures where data that should be valid (e.g., integer value in a string/integer union field) is rejected.
|
||||
|
||||
## Correct Pattern
|
||||
|
||||
```yaml
slots:
  identifier_value:
    range: Any  # ← REQUIRED for any_of to work
    any_of:
      - range: string
      - range: integer
    description: The identifier value (can be string or integer)
```
|
||||
|
||||
## Incorrect Pattern (WILL FAIL)
|
||||
|
||||
```yaml
slots:
  identifier_value:
    # Missing range: Any - validation will fail!
    any_of:
      - range: string
      - range: integer
    description: The identifier value (can be string or integer)
```
|
||||
|
||||
## Common Use Cases
|
||||
|
||||
This pattern is required for:
|
||||
|
||||
| Use Case | Types | Example Fields |
|----------|-------|----------------|
| Identifier values | string \| integer | `identifier_value`, `geonames_id`, `viaf_id` |
| Social media IDs | string \| array | `youtube_channel_id`, `facebook_id`, `twitter_username` |
| Flexible identifiers | object \| array | `identifiers` (dict or list format) |
| Numeric strings | string \| integer | `postal_code`, `kvk_number` |
|
||||
|
||||
## Real-World Examples from GLAM Schema
|
||||
|
||||
### Example 1: OriginalEntryIdentifier.yaml
|
||||
|
||||
```yaml
# Before (BROKEN):
attributes:
  identifier_value:
    any_of:
      - range: string
      - range: integer

# After (WORKING):
attributes:
  identifier_value:
    range: Any  # Added
    any_of:
      - range: string
      - range: integer
```
|
||||
|
||||
### Example 2: WikidataSocialMedia.yaml
|
||||
|
||||
```yaml
# Social media fields that can be single value or array
attributes:
  youtube_channel_id:
    range: Any  # Required for string|array union
    any_of:
      - range: string
      - range: string
        multivalued: true
    description: YouTube channel ID (single value or array)

  facebook_id:
    range: Any
    any_of:
      - range: string
      - range: string
        multivalued: true
```
|
||||
|
||||
### Example 3: OriginalEntry.yaml (object|array union)
|
||||
|
||||
```yaml
|
||||
# identifiers field that accepts both dict and array formats
attributes:
  identifiers:
    range: Any  # Required for flexible typing
    description: >-
      Identifiers from original source. Accepts both dict format
      (e.g., {isil: "XX-123"}) and array format
      (e.g., [{scheme: "isil", value: "XX-123"}])
|
||||
```
|
||||
|
||||
### Example 4: OriginalEntryLocation.yaml
|
||||
|
||||
```yaml
|
||||
attributes:
  geonames_id:
    range: Any  # Required for string|integer
    any_of:
      - range: string
      - range: integer
    description: GeoNames ID (may be string or integer depending on source)
|
||||
```
|
||||
|
||||
## Validation Behavior
|
||||
|
||||
| Schema Definition | Integer Data | String Data | Result |
|-------------------|--------------|-------------|--------|
| `range: string` | ❌ FAIL | ✅ PASS | Strict string only |
| `range: integer` | ✅ PASS | ❌ FAIL | Strict integer only |
| `any_of` without `range: Any` | ❌ FAIL | ❌ FAIL | Broken - nothing works |
| `any_of` with `range: Any` | ✅ PASS | ✅ PASS | Correct union behavior |
|
||||
|
||||
## Why This Happens
|
||||
|
||||
LinkML's validation engine processes `range` first to determine the basic type constraint. When `range` is not specified (or defaults to `string`), it applies that constraint before checking `any_of`. The `range: Any` tells the validator to defer type checking to the `any_of` constraints.
|
||||
|
||||
## Checklist for Union Types
|
||||
|
||||
When adding a field that accepts multiple types:
|
||||
|
||||
- [ ] Define the `any_of` block with all acceptable ranges
|
||||
- [ ] Add `range: Any` at the same level as `any_of`
|
||||
- [ ] Test with sample data of each type
|
||||
- [ ] Document the accepted types in the description
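
The second checklist item can be checked mechanically. The sketch below is a hypothetical helper (it is not part of LinkML or of the GLAM tooling) that assumes the schema has already been parsed into plain dictionaries, for example with `yaml.safe_load`, and reports every slot, attribute, or `slot_usage` entry that has `any_of` but no `range: Any`.

```python
def find_union_slots_missing_any(schema: dict) -> list[str]:
    """Return dotted names of definitions that use any_of without range: Any."""
    problems = []

    def check(owner: str, definitions) -> None:
        # definitions maps a slot/attribute name to its definition dict
        if not isinstance(definitions, dict):
            return
        for name, defn in definitions.items():
            if isinstance(defn, dict) and "any_of" in defn and defn.get("range") != "Any":
                problems.append(f"{owner}.{name}")

    # Top-level slots
    check("slots", schema.get("slots"))
    # Per-class attributes and slot_usage overrides
    for class_name, class_def in (schema.get("classes") or {}).items():
        if isinstance(class_def, dict):
            check(f"{class_name}.attributes", class_def.get("attributes"))
            check(f"{class_name}.slot_usage", class_def.get("slot_usage"))
    return problems
```

An empty result only means the structural rule is met; testing with sample data of each type is still needed.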
|
||||
|
||||
## See Also
|
||||
|
||||
- LinkML Documentation: [Union Types](https://linkml.io/linkml/schemas/advanced.html#union-types)
|
||||
- GLAM Validation: `schemas/20251121/linkml/modules/classes/CustodianSourceFile.yaml`
|
||||
- Validation command: `linkml-validate -s <schema>.yaml <data>.yaml`
|
||||
|
||||
## Migration Notes
|
||||
|
||||
**Affected Files (Fixed January 2026)**:
|
||||
- `OriginalEntryIdentifier.yaml` - `identifier_value`
|
||||
- `Identifier.yaml` - `identifier_value` slot_usage
|
||||
- `WikidataSocialMedia.yaml` - `youtube_channel_id`, `facebook_id`, `instagram_username`, `linkedin_company_id`, `twitter_username`, `facebook_page_id`
|
||||
- `YoutubeEnrichment.yaml` - `channel_id`
|
||||
- `OriginalEntryLocation.yaml` - `geonames_id`
|
||||
- `OriginalEntry.yaml` - `identifiers`
|
||||
|
||||
---
|
||||
|
||||
**Version**: 1.0
|
||||
**Created**: 2026-01-18
|
||||
**Author**: AI Agent (OpenCode Claude)
|
||||
181
.opencode/rules/linkml/linkml-yaml-best-practices-rule.md
Normal file
|
|
@ -0,0 +1,181 @@
|
|||
# LinkML YAML Best Practices Rule
|
||||
|
||||
## Rule: Follow LinkML Conventions for Valid, Interoperable Schema Files
|
||||
|
||||
### 1. equals_expression Anti-Pattern
|
||||
|
||||
`equals_expression` is for dynamic formula evaluation (e.g., `"{age_in_years} * 12"`). Never use it for static value constraints.
|
||||
|
||||
**WRONG:**
|
||||
```yaml
|
||||
slot_usage:
  has_type:
    equals_expression: '["hc:ArchiveOrganizationType"]'
  hold_record_set:
    equals_expression: '["hc:Fonds", "hc:Series"]'
|
||||
```
|
||||
|
||||
**CORRECT** (single value):
|
||||
```yaml
|
||||
slot_usage:
  has_type:
    equals_string: "hc:ArchiveOrganizationType"
|
||||
```
|
||||
|
||||
**CORRECT** (multiple allowed values - if classes):
|
||||
```yaml
|
||||
slot_usage:
  hold_record_set:
    any_of:
      - range: UniversityAdministrativeFonds
      - range: StudentRecordSeries
      - range: FacultyPaperCollection
|
||||
```
|
||||
|
||||
**CORRECT** (multiple allowed values - if literals):
|
||||
```yaml
|
||||
slot_usage:
  status:
    equals_string_in:
      - "active"
      - "inactive"
      - "pending"
|
||||
```
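
One way to spot this anti-pattern is to test whether an `equals_expression` value parses as a static JSON array or string. The following is a minimal sketch under that assumption; the function name is hypothetical, and the input is a `slot_usage` mapping already loaded into Python dictionaries.

```python
import json


def static_equals_expressions(slot_usage: dict) -> list[str]:
    """Return slot names whose equals_expression is a static JSON array or string."""
    flagged = []
    for slot, defn in slot_usage.items():
        expr = (defn or {}).get("equals_expression")
        if not isinstance(expr, str):
            continue
        try:
            parsed = json.loads(expr)
        except json.JSONDecodeError:
            # Not valid JSON, so it is probably a genuine formula
            continue
        if isinstance(parsed, (list, str)):
            flagged.append(slot)
    return flagged
```

A formula such as `"{age_in_years} * 12"` is not valid JSON and is therefore left alone.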
|
||||
|
||||
### 2. Declare All Used Prefixes
|
||||
|
||||
Every CURIE prefix used in the file must be declared in the `prefixes:` block.
|
||||
|
||||
**WRONG:**
|
||||
```yaml
|
||||
prefixes:
  linkml: https://w3id.org/linkml/
  skos: http://www.w3.org/2004/02/skos/core#
slot_usage:
  has_type:
    equals_string: "hc:ArchiveOrganizationType"  # hc: not declared!
|
||||
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
|
||||
prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  skos: http://www.w3.org/2004/02/skos/core#
default_prefix: hc
slot_usage:
  has_type:
    equals_string: "hc:ArchiveOrganizationType"
|
||||
```
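
A rough way to catch undeclared prefixes is to collect every string value that looks like a CURIE and compare its prefix with the `prefixes:` block. The sketch below is a heuristic, not a LinkML API: the function name and the CURIE regular expression are assumptions, and the schema is expected as an already-parsed dictionary.

```python
import re

# Heuristic: "prefix:local" with no whitespace, excluding full URLs such as "https://..."
CURIE_RE = re.compile(r"^([A-Za-z][\w.-]*):(?!//)\S+$")


def undeclared_prefixes(schema: dict) -> set[str]:
    """Return CURIE prefixes used in string values but missing from prefixes:."""
    declared = set((schema.get("prefixes") or {}).keys())
    used = set()

    def walk(node) -> None:
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "prefixes":
                    continue  # the declarations themselves are not usages
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)
        elif isinstance(node, str):
            match = CURIE_RE.match(node)
            if match:
                used.add(match.group(1))

    walk(schema)
    return used - declared
```

Because the pattern is a heuristic, any reported prefix should be confirmed by hand before the `prefixes:` block is edited.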
|
||||
|
||||
### 3. Import Referenced Classes
|
||||
|
||||
When using external classes in `is_a`, `range`, or other references, import them.
|
||||
|
||||
**WRONG:**
|
||||
```yaml
|
||||
imports:
  - linkml:types
classes:
  AcademicArchive:
    is_a: ArchiveOrganizationType  # Not imported!
    slot_usage:
      related_to:
        range: WikidataAlignment  # Not imported!
|
||||
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
|
||||
imports:
  - linkml:types
  - ../classes/ArchiveOrganizationType
  - ../classes/WikidataAlignment
classes:
  AcademicArchive:
    is_a: ArchiveOrganizationType
    slot_usage:
      related_to:
        range: WikidataAlignment
|
||||
```
|
||||
|
||||
### 4. Quote Regex Patterns and Annotation Values
|
||||
|
||||
**Regex patterns:**
|
||||
```yaml
|
||||
# WRONG
|
||||
pattern: ^Q[0-9]+$
|
||||
|
||||
# CORRECT
|
||||
pattern: "^Q[0-9]+$"
|
||||
```
|
||||
|
||||
**Annotation values (must be strings):**
|
||||
```yaml
|
||||
# WRONG
annotations:
  specificity_score: 0.1

# CORRECT
annotations:
  specificity_score: "0.1"
|
||||
```
|
||||
|
||||
### 5. Remove Unused Imports
|
||||
|
||||
Only import slots and classes that are actually used in the file.
|
||||
|
||||
**WRONG:**
|
||||
```yaml
|
||||
imports:
|
||||
- ../slots/has_scope # Never used in slots: or slot_usage:
|
||||
- ../slots/has_score
|
||||
- ../slots/has_type
|
||||
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
|
||||
imports:
|
||||
- ../slots/has_score
|
||||
- ../slots/has_type
|
||||
```
|
||||
|
||||
### 6. Slot Usage Requires Slot Presence
|
||||
|
||||
A slot referenced in `slot_usage:` must either be:
|
||||
- Listed in the `slots:` array, OR
|
||||
- Inherited from a parent class via `is_a`
|
||||
|
||||
**WRONG:**
|
||||
```yaml
|
||||
classes:
  MyClass:
    slots:
      - has_type
    slot_usage:
      has_type: {...}
      identified_by: {...}  # Not in slots: and not inherited!
|
||||
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
|
||||
classes:
  MyClass:
    slots:
      - has_type
      - identified_by
    slot_usage:
      has_type: {...}
      identified_by: {...}
|
||||
```
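
The presence requirement can be sketched as a small check over parsed class definitions. This is an illustrative helper with a hypothetical name; it assumes all ancestors are present in the same `classes` dictionary and follows only `is_a`, so parents defined in imported files would first have to be merged in.

```python
def slot_usage_without_slot(classes: dict) -> list[tuple[str, str]]:
    """Return (class, slot) pairs where slot_usage names a slot the class does not have."""

    def available_slots(name: str, seen=None) -> set:
        seen = seen or set()
        if name in seen or name not in classes:
            return set()  # unknown parent or inheritance cycle
        seen.add(name)
        defn = classes[name] or {}
        slots = set(defn.get("slots") or [])
        slots |= set((defn.get("attributes") or {}).keys())
        parent = defn.get("is_a")
        if parent:
            slots |= available_slots(parent, seen)  # inherited via is_a
        return slots

    problems = []
    for name, defn in classes.items():
        usable = available_slots(name)
        for slot in (defn or {}).get("slot_usage") or {}:
            if slot not in usable:
                problems.append((name, slot))
    return problems
```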
|
||||
|
||||
## Checklist for Class Files
|
||||
|
||||
- [ ] All prefixes used in CURIEs are declared
|
||||
- [ ] `default_prefix` set if module belongs to that namespace
|
||||
- [ ] All referenced classes are imported
|
||||
- [ ] All used slots are imported
|
||||
- [ ] No `equals_expression` with static JSON arrays
|
||||
- [ ] Regex patterns are quoted
|
||||
- [ ] Annotation values are quoted strings
|
||||
- [ ] No unused imports
|
||||
- [ ] `slot_usage` only references slots that exist (via slots: or inheritance)
|
||||
185
.opencode/rules/linkml/mapping-specificity-hypernym-rule.md
Normal file
|
|
@ -0,0 +1,185 @@
|
|||
# Mapping Specificity Rule: Broad vs Narrow vs Exact Mappings
|
||||
|
||||
## 🚨 CRITICAL: Mapping Semantics
|
||||
|
||||
When mapping LinkML classes to external ontologies, you MUST distinguish between **equivalence**, **hypernyms** (broader concepts), and **hyponyms** (narrower concepts).
|
||||
|
||||
### The Rule
|
||||
|
||||
1. **Exact Mappings (`skos:exactMatch`)**: Use ONLY when the external concept is **semantically equivalent** to your class.
|
||||
* *Example*: `hc:Person` `exact_mappings` `schema:Person`.
|
||||
* **CRITICAL**: Exact means the SAME semantic scope - neither broader nor narrower!
|
||||
* **DO NOT AVOID EXACT BY DEFAULT**: If equivalence is verified (including class/property category match and ontology definition review), `exact_mappings` SHOULD be used.
|
||||
|
||||
2. **Broad Mappings (`skos:broadMatch`)**: Use when the external concept is a **hypernym** (a broader, more general category) of your class.
|
||||
* *Example*: `hc:AcademicArchiveRecordSetType` `broad_mappings` `rico:RecordSetType`.
|
||||
* *Rationale*: An academic archive record set *is a* record set type, but `rico:RecordSetType` is broader.
|
||||
* *Common Hypernyms*: `skos:Concept`, `prov:Entity`, `prov:Activity`, `schema:Thing`, `schema:Organization`, `schema:Action`, `rico:RecordSetType`, `crm:E55_Type`.
|
||||
|
||||
3. **Narrow Mappings (`skos:narrowMatch`)**: Use when the external concept is a **hyponym** (a narrower, more specific category) of your class.
|
||||
* *Example*: `hc:Organization` `narrow_mappings` `hc:Library` (if mapping inversely).
|
||||
|
||||
4. **Close Mappings (`skos:closeMatch`)**: Use when the external concept is similar but not exactly equivalent.
|
||||
* *Example*: `hc:AccessPolicy` `close_mappings` `dcterms:accessRights` (related but different scope).
|
||||
|
||||
5. **Related Mappings (`skos:relatedMatch`)**: Use for non-hierarchical relationships.
|
||||
* *Example*: `hc:Collection` `related_mappings` `rico:RecordSet`.
|
||||
|
||||
### 🚨 Type Compatibility Rule
|
||||
|
||||
**Classes map to classes, properties map to properties.** Never mix types in mappings.
|
||||
|
||||
| Your Element | Valid Mapping Target |
|--------------|---------------------|
| Class | Class (owl:Class, rdfs:Class) |
| Slot | Property (owl:ObjectProperty, owl:DatatypeProperty, rdf:Property) |
|
||||
|
||||
❌ **WRONG**:
|
||||
```yaml
|
||||
# AccessApplication is a CLASS, schema:Action is a CLASS - but Action is BROADER
|
||||
AccessApplication:
|
||||
exact_mappings:
|
||||
- schema:Action # WRONG: Action is a hypernym, not equivalent
|
||||
```
|
||||
|
||||
✅ **CORRECT**:
|
||||
```yaml
|
||||
AccessApplication:
|
||||
broad_mappings:
|
||||
- schema:Action # CORRECT: Action is the broader category
|
||||
```
|
||||
|
||||
### 🚨 No Self/Internal Exact Mappings
|
||||
|
||||
`exact_mappings` MUST NOT contain self-references or internal HC class references for the same concept.
|
||||
|
||||
❌ **WRONG**:
|
||||
```yaml
|
||||
AcademicArchive:
|
||||
exact_mappings:
|
||||
- hc:AcademicArchive # Self/internal reference; not an external equivalence mapping
|
||||
```
|
||||
|
||||
✅ **CORRECT**:
|
||||
```yaml
|
||||
AcademicArchive:
|
||||
exact_mappings:
|
||||
- wd:Q27032435 # External concept with equivalent semantic scope
|
||||
```
|
||||
|
||||
Use `exact_mappings` only for equivalent terms in external ontologies or external controlled vocabularies, not for repeating the class itself.
|
||||
|
||||
### ✅ Positive Guidance: When Exact Mapping Is Correct
|
||||
|
||||
Use `exact_mappings` when all checks below pass:
|
||||
|
||||
- Semantic scope is equivalent (not parent/child, not merely similar)
|
||||
- Ontological category matches (Class↔Class, Slot↔Property)
|
||||
- Target term is verified in the ontology source files under `data/ontology/` or verified Wikidata entity metadata
|
||||
- No self/internal duplication (no `hc:` self-reference for the same concept)
|
||||
|
||||
✅ **CORRECT**:
|
||||
```yaml
|
||||
Person:
|
||||
exact_mappings:
|
||||
- schema:Person
|
||||
|
||||
Acquisition:
|
||||
exact_mappings:
|
||||
- crm:E8_Acquisition
|
||||
```
|
||||
|
||||
Do not downgrade a truly equivalent mapping to `close_mappings` or `broad_mappings` just to be conservative.
|
||||
|
||||
### Common Hypernyms That Are NEVER Exact Mappings
|
||||
|
||||
These terms are always BROADER than your specific class - never use them as `exact_mappings`:
|
||||
|
||||
| Hypernym | What It Means | Use Instead |
|
||||
|----------|---------------|-------------|
|
||||
| `schema:Action` | Any action | `broad_mappings` |
|
||||
| `schema:Organization` | Any organization | `broad_mappings` |
|
||||
| `schema:Thing` | Anything at all | `broad_mappings` |
|
||||
| `schema:PropertyValue` | Any property value | `broad_mappings` |
|
||||
| `schema:Permit` | Any permit | `broad_mappings` |
|
||||
| `prov:Activity` | Any activity | `broad_mappings` |
|
||||
| `prov:Entity` | Any entity | `broad_mappings` |
|
||||
| `skos:Concept` | Any concept | `broad_mappings` |
|
||||
| `crm:E55_Type` | Any type classification | `broad_mappings` |
|
||||
| `crm:E42_Identifier` | Any identifier | `broad_mappings` |
|
||||
| `rico:Identifier` | Any identifier | `broad_mappings` |
|
||||
| `dcat:DataService` | Any data service | `broad_mappings` |
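
The table above can be turned into a simple lint. The sketch below is a hypothetical helper, not existing project tooling; it hard-codes the hypernyms from the table and expects class definitions as parsed dictionaries.

```python
# Terms from the table above; extend the set if the table grows
GENERIC_HYPERNYMS = {
    "schema:Action", "schema:Organization", "schema:Thing",
    "schema:PropertyValue", "schema:Permit", "prov:Activity",
    "prov:Entity", "skos:Concept", "crm:E55_Type",
    "crm:E42_Identifier", "rico:Identifier", "dcat:DataService",
}


def hypernyms_in_exact_mappings(classes: dict) -> list[tuple[str, str]]:
    """Return (class, URI) pairs where a generic hypernym is listed as an exact mapping."""
    hits = []
    for name, defn in classes.items():
        for uri in (defn or {}).get("exact_mappings") or []:
            if uri in GENERIC_HYPERNYMS:
                hits.append((name, uri))
    return hits
```

Each hit is a candidate for moving to `broad_mappings`; the ontology definition should still be read before the change is made.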
|
||||
|
||||
### Common Violations to Avoid
|
||||
|
||||
❌ **WRONG**:
|
||||
```yaml
|
||||
AcademicArchiveRecordSetType:
|
||||
exact_mappings:
|
||||
- rico:RecordSetType # WRONG: This implies AcademicArchiveRecordSetType == RecordSetType
|
||||
```
|
||||
|
||||
✅ **CORRECT**:
|
||||
```yaml
|
||||
AcademicArchiveRecordSetType:
|
||||
broad_mappings:
|
||||
- rico:RecordSetType # CORRECT: RecordSetType is broader
|
||||
```
|
||||
|
||||
❌ **WRONG**:
|
||||
```yaml
|
||||
SocialMovement:
|
||||
exact_mappings:
|
||||
- schema:Organization # WRONG: SocialMovement is a specific TYPE of Organization
|
||||
```
|
||||
|
||||
✅ **CORRECT**:
|
||||
```yaml
|
||||
SocialMovement:
|
||||
broad_mappings:
|
||||
- schema:Organization # CORRECT
|
||||
```
|
||||
|
||||
❌ **WRONG**:
|
||||
```yaml
|
||||
AccessApplication:
|
||||
exact_mappings:
|
||||
- schema:Action # WRONG: Action is a hypernym
|
||||
```
|
||||
|
||||
✅ **CORRECT**:
|
||||
```yaml
|
||||
AccessApplication:
|
||||
broad_mappings:
|
||||
- schema:Action # CORRECT: Action is the broader category
|
||||
```
|
||||
|
||||
### How to Determine Mapping Type
|
||||
|
||||
Ask these questions:
|
||||
|
||||
1. **Is it the SAME thing?** → `exact_mappings`
|
||||
- "Could I swap these two terms in any context without changing meaning?"
|
||||
- If NO, it's not an exact mapping
|
||||
|
||||
2. **Is the external term a PARENT category?** → `broad_mappings`
|
||||
- "Is my class a TYPE OF the external term?"
|
||||
- Example: AccessApplication IS-A Action
|
||||
|
||||
3. **Is the external term a CHILD category?** → `narrow_mappings`
|
||||
- "Is the external term a TYPE OF my class?"
|
||||
- Example: Library IS-A Organization (so Organization has narrow_mapping to Library)
|
||||
|
||||
4. **Is it similar but not hierarchical?** → `close_mappings`
|
||||
- "Related but not equivalent or hierarchical"
|
||||
|
||||
5. **Is there some other relationship?** → `related_mappings`
|
||||
- "Connected in some way"
|
||||
|
||||
### Verification Checklist
|
||||
|
||||
- [ ] Does the `exact_mapping` represent the **exact same scope**?
|
||||
- [ ] Is the external term a generic parent class (e.g., `Type`, `Concept`, `Entity`, `Action`, `Activity`, `Organization`)? → Move to `broad_mappings`
|
||||
- [ ] Is the external term a specific instance or subclass? → Check `narrow_mappings`
|
||||
- [ ] Is the external term the same type (class→class, property→property)?
|
||||
- [ ] Would swapping the terms change the meaning? If yes, not an `exact_mapping`
|
||||
177
.opencode/rules/linkml/multilingual-support-requirements.md
Normal file
|
|
@ -0,0 +1,177 @@
|
|||
# Rule: Multilingual Support Requirements
|
||||
|
||||
## Overview
|
||||
|
||||
All LinkML slot files MUST include multilingual support with translations in the following languages:
|
||||
|
||||
| Code | Language | Required |
|------|----------|----------|
| `nl` | Dutch | ✅ Yes |
| `de` | German | ✅ Yes |
| `fr` | French | ✅ Yes |
| `ar` | Arabic | ✅ Yes |
| `id` | Indonesian | ✅ Yes |
| `zh` | Chinese (Simplified) | ✅ Yes |
| `es` | Spanish | ✅ Yes |
|
||||
|
||||
---
|
||||
|
||||
## Required Multilingual Fields
|
||||
|
||||
### 1. `alt_descriptions`
|
||||
|
||||
Provide faithful translations of the English `description` field:
|
||||
|
||||
```yaml
|
||||
slots:
|
||||
my_slot:
|
||||
description: >-
|
||||
To possess a specific structural arrangement or encoding standard.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Het bezitten van een specifieke structurele rangschikking of coderingsstandaard.
|
||||
de: >-
|
||||
Das Besitzen einer spezifischen strukturellen Anordnung oder eines Kodierungsstandards.
|
||||
fr: >-
|
||||
Posséder un arrangement structurel spécifique ou une norme de codage.
|
||||
ar: >-
|
||||
امتلاك ترتيب هيكلي محدد أو معيار ترميز.
|
||||
id: >-
|
||||
Memiliki susunan struktural tertentu atau standar pengkodean.
|
||||
zh: >-
|
||||
拥有特定的结构安排或编码标准。
|
||||
es: >-
|
||||
Poseer una disposición estructural específica o un estándar de codificación.
|
||||
```
|
||||
|
||||
### 2. `structured_aliases`
|
||||
|
||||
Provide translated slot names/labels for each language:
|
||||
|
||||
```yaml
|
||||
slots:
|
||||
has_format:
|
||||
structured_aliases:
|
||||
- literal_form: heeft formaat
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: nl
|
||||
- literal_form: hat Format
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: de
|
||||
- literal_form: a un format
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: fr
|
||||
- literal_form: لديه تنسيق
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: ar
|
||||
- literal_form: memiliki format
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: id
|
||||
- literal_form: 具有格式
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: zh
|
||||
- literal_form: tiene formato
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: es
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Translation Guidelines
|
||||
|
||||
### DO:
|
||||
- Translate the semantic meaning faithfully
|
||||
- Preserve technical precision
|
||||
- Use natural phrasing for each language
|
||||
- Keep translations concise (similar length to English)
|
||||
|
||||
### DON'T:
|
||||
- Paraphrase or expand beyond the original meaning
|
||||
- Add information not present in the English description
|
||||
- Use machine translation without review
|
||||
- Skip any of the required languages
|
||||
|
||||
---
|
||||
|
||||
## Complete Example
|
||||
|
||||
```yaml
|
||||
id: https://nde.nl/ontology/hc/slot/catalogue
|
||||
name: catalogue
|
||||
title: catalogue
|
||||
|
||||
slots:
|
||||
catalogue:
|
||||
slot_uri: crm:P70_documents
|
||||
description: >-
|
||||
To systematically record, classify, and organize items within a structured
|
||||
inventory or database for the purposes of documentation and retrieval.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Het systematisch vastleggen, classificeren en ordenen van items binnen een
|
||||
gestructureerde inventaris of database voor documentatie en terugvinding.
|
||||
de: >-
|
||||
Das systematische Erfassen, Klassifizieren und Ordnen von Objekten in einem
|
||||
strukturierten Inventar oder einer Datenbank für Dokumentation und Abruf.
|
||||
fr: >-
|
||||
Enregistrer, classer et organiser systématiquement des éléments dans un
|
||||
inventaire structuré ou une base de données à des fins de documentation et de récupération.
|
||||
ar: >-
|
||||
تسجيل وتصنيف وتنظيم العناصر بشكل منهجي ضمن جرد منظم أو قاعدة بيانات لأغراض التوثيق والاسترجاع.
|
||||
id: >-
|
||||
Mencatat, mengklasifikasikan, dan mengatur item secara sistematis dalam
|
||||
inventaris terstruktur atau database untuk tujuan dokumentasi dan pengambilan.
|
||||
zh: >-
|
||||
在结构化清单或数据库中系统地记录、分类和组织项目,以便于文档编制和检索。
|
||||
es: >-
|
||||
Registrar, clasificar y organizar sistemáticamente elementos dentro de un
|
||||
inventario estructurado o base de datos con fines de documentación y recuperación.
|
||||
structured_aliases:
|
||||
- literal_form: catalogiseren
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: nl
|
||||
- literal_form: katalogisieren
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: de
|
||||
- literal_form: cataloguer
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: fr
|
||||
- literal_form: فهرسة
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: ar
|
||||
- literal_form: mengkatalogkan
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: id
|
||||
- literal_form: 编目
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: zh
|
||||
- literal_form: catalogar
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: es
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Validation Checklist
|
||||
|
||||
Before completing a slot file, verify:
|
||||
|
||||
- [ ] `alt_descriptions` provided for all 7 languages (nl, de, fr, ar, id, zh, es)
|
||||
- [ ] `structured_aliases` provided for all 7 languages
|
||||
- [ ] Translations are faithful to the English original
|
||||
- [ ] No language is skipped or left empty
|
||||
- [ ] Arabic and Chinese characters render correctly
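
The language-coverage items of this checklist lend themselves to automation. The following sketch is a hypothetical helper that assumes one slot definition parsed into a dictionary, with `alt_descriptions` keyed by language code and `structured_aliases` given as a list, as in the examples above.

```python
REQUIRED_LANGUAGES = ("nl", "de", "fr", "ar", "id", "zh", "es")


def missing_translations(slot: dict) -> dict[str, list[str]]:
    """Return the required language codes missing from each multilingual field."""
    alt = slot.get("alt_descriptions") or {}
    # A language counts only if its description is non-empty
    described = {lang for lang, text in alt.items() if text and str(text).strip()}
    aliased = {
        alias.get("in_language")
        for alias in slot.get("structured_aliases") or []
        if isinstance(alias, dict) and alias.get("literal_form")
    }
    return {
        "alt_descriptions": [l for l in REQUIRED_LANGUAGES if l not in described],
        "structured_aliases": [l for l in REQUIRED_LANGUAGES if l not in aliased],
    }
```

Two empty lists mean all seven languages are present; faithfulness of the translations still needs human review.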
|
||||
|
||||
---
|
||||
|
||||
## See Also
|
||||
|
||||
- Rule 1: Preserve Original Descriptions (LINKML_EDITING_RULES.md)
|
||||
- Rule 2: Translation Accuracy (LINKML_EDITING_RULES.md)
|
||||
- Rule 3: Description Field Purity (LINKML_EDITING_RULES.md)
|
||||
|
||||
---
|
||||
|
||||
**Version**: 1.0.0
|
||||
**Created**: 2026-02-03
|
||||
**Author**: OpenCODE
|
||||
24
.opencode/rules/linkml/no-autonomous-alias-assignment.md
Normal file
|
|
@ -0,0 +1,24 @@
|
|||
# Rule: No Autonomous Alias Assignment
|
||||
|
||||
**Status**: ACTIVE
|
||||
**Created**: 2026-02-10
|
||||
|
||||
## Rule
|
||||
|
||||
The agent MUST NOT assign aliases to canonical slot files on its own. Only the user decides which `new/` slot files are absorbed as aliases into which canonical slots.
|
||||
|
||||
## Rationale
|
||||
|
||||
Alias assignment is a semantic decision that determines the conceptual scope of a canonical slot. Incorrect alias assignment conflates distinct concepts. For example, `membership_criteria` (eligibility rules for joining) is not an alias of `has_mission` (organizational purpose), even though both relate to organizational governance.
|
||||
|
||||
## What the agent MUST do
|
||||
|
||||
1. When creating or polishing a canonical slot file, leave the `aliases` field empty unless the user has explicitly specified which aliases to include.
|
||||
2. When processing `new/` files, present candidates to the user and wait for their alias assignment decisions.
|
||||
3. Do NOT delete `new/` files until the user confirms the alias mapping.
|
||||
|
||||
## What the agent MUST NOT do
|
||||
|
||||
- Autonomously decide that a `new/` file should become an alias of a canonical slot.
|
||||
- Add alias entries without explicit user instruction.
|
||||
- Delete `new/` files based on self-determined alias assignments.
|
||||
46
.opencode/rules/linkml/no-deletion-from-slot-fixes.md
Normal file
|
|
@ -0,0 +1,46 @@
|
|||
# Rule: Do Not Delete From slot_fixes.yaml
|
||||
|
||||
**Identifier**: `no-deletion-from-slot-fixes`
|
||||
**Severity**: **CRITICAL**
|
||||
|
||||
## Core Directive
|
||||
|
||||
**NEVER delete entries from `slot_fixes.yaml`.**
|
||||
|
||||
The `slot_fixes.yaml` file serves as the historical record and audit trail for all schema migrations. Removing entries destroys this history and violates the project's data integrity principles.
|
||||
|
||||
## Workflow
|
||||
|
||||
When processing a migration:
|
||||
|
||||
1. **Do NOT Remove**: Never delete the entry for the slot you are working on.
|
||||
2. **Update `processed`**: Instead, update the `processed` block:
|
||||
* Set `status: true`.
|
||||
* Set `date` to the current date (YYYY-MM-DD).
|
||||
* Add a detailed `notes` string explaining what was done (e.g., "Fully migrated to [new_slot] + [Class] (Rule 53). [File].yaml updated. Slot archived.").
|
||||
3. **Preserve History**: The entry must remain in the file permanently as a record of the migration.
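
The workflow above can be sketched as an in-place update. This is a hypothetical helper, not existing tooling; it assumes `slot_fixes.yaml` has been loaded into a list of entry dictionaries shaped like the example in this rule.

```python
from datetime import date
from typing import Optional


def mark_processed(entries: list, slot_id: str, notes: str,
                   today: Optional[date] = None) -> dict:
    """Update the processed block of one entry in place; never remove the entry."""
    for entry in entries:
        if entry.get("original_slot_id") == slot_id:
            processed = entry.get("processed") or {}
            processed.update({
                "status": True,
                "date": (today or date.today()).isoformat(),  # YYYY-MM-DD
                "notes": notes,
            })
            entry["processed"] = processed
            return entry
    raise KeyError(f"No entry with original_slot_id {slot_id!r}")
```

The list keeps its length and every other key of the entry, such as `revision`, is left untouched.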
|
||||
|
||||
## Rationale
|
||||
|
||||
* **Audit Trail**: We need to know what was migrated, when, and how.
|
||||
* **Reversibility**: If a migration introduces a bug, the record helps us understand the original state.
|
||||
* **Completeness**: The file tracks the total progress of the schema refactoring project.
|
||||
|
||||
## Example
|
||||
|
||||
**WRONG (Deletion)**:
|
||||
```yaml
|
||||
# DELETED from file
|
||||
# - original_slot_id: ...
|
||||
```
|
||||
|
||||
**CORRECT (Update)**:
|
||||
```yaml
|
||||
- original_slot_id: https://nde.nl/ontology/hc/slot/has_some_slot
  processed:
    status: true
    date: '2026-01-27'
    notes: Fully migrated to has_or_had_new_slot + NewClass (Rule 53).
  revision:
    ...
|
||||
```
|
||||
189
.opencode/rules/linkml/no-duplicate-ontology-mappings.md
Normal file
|
|
@ -0,0 +1,189 @@
|
|||
# Rule 52: No Duplicate Ontology Mappings
|
||||
|
||||
## Summary
|
||||
|
||||
Each ontology URI MUST appear in only ONE mapping category per schema element. A URI cannot simultaneously have multiple semantic relationships to the same class or slot.
|
||||
|
||||
## The Problem
|
||||
|
||||
LinkML provides five mapping annotation types based on SKOS vocabulary alignment:
|
||||
|
||||
| Property | SKOS Predicate | Meaning |
|----------|---------------|---------|
| `exact_mappings` | `skos:exactMatch` | "This IS that" (equivalent) |
| `close_mappings` | `skos:closeMatch` | "This is very similar to that" |
| `related_mappings` | `skos:relatedMatch` | "This is conceptually related to that" |
| `narrow_mappings` | `skos:narrowMatch` | "That is MORE SPECIFIC than this" (the mapped term is narrower) |
| `broad_mappings` | `skos:broadMatch` | "That is MORE GENERAL than this" (the mapped term is broader) |
|
||||
|
||||
These relationships are **mutually exclusive**. A URI cannot simultaneously:
|
||||
- BE the element (`exact_mappings`) AND be broader than it (`broad_mappings`)
|
||||
- Be closely similar (`close_mappings`) AND be more general (`broad_mappings`)
|
||||
|
||||
## Anti-Pattern (WRONG)
|
||||
|
||||
```yaml
|
||||
# WRONG - schema:url appears in TWO mapping types
slots:
  source_url:
    slot_uri: prov:atLocation
    exact_mappings:
      - schema:url  # Says "source_url IS schema:url"
    broad_mappings:
      - schema:url  # Says "schema:url is MORE GENERAL than source_url"
|
||||
```
|
||||
|
||||
This is a **logical contradiction**: `source_url` cannot simultaneously BE `schema:url` AND be more specific than `schema:url`.
|
||||
|
||||
## Correct Pattern
|
||||
|
||||
```yaml
|
||||
# CORRECT - each URI appears in only ONE mapping type
slots:
  source_url:
    slot_uri: prov:atLocation
    exact_mappings:
      - schema:url  # source_url IS schema:url
    close_mappings:
      - dcterms:source  # Similar but not identical
|
||||
```
|
||||
|
||||
## Decision Guide: Which Mapping to Keep
|
||||
|
||||
When a URI appears in multiple categories, keep the **most precise** one:
|
||||
|
||||
### Precedence Order (keep the first match)
|
||||
|
||||
1. **exact_mappings** - Strongest claim: semantic equivalence
|
||||
2. **close_mappings** - Strong claim: nearly equivalent
|
||||
3. **narrow_mappings** / **broad_mappings** - Hierarchical relationship
|
||||
4. **related_mappings** - Weakest claim: conceptual association
|
||||
|
||||
### Decision Matrix
|
||||
|
||||
| If URI appears in... | Keep | Remove |
|---------------------|------|--------|
| exact + broad | exact | broad |
| exact + close | exact | close |
| exact + related | exact | related |
| close + broad | close | broad |
| close + related | close | related |
| related + broad | broad | related |
| narrow + broad | whichever the ontology definition supports | the other (contradictory!) |
|
||||
|
||||
### Special Case: narrow + broad
|
||||
|
||||
If a URI appears in BOTH `narrow_mappings` AND `broad_mappings`, this is a **data error** - the same URI cannot be both more specific AND more general. Investigate which is correct based on the ontology definition.
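
The decision guide can be sketched as a small resolver. This is an illustrative helper with a hypothetical name; it follows the precedence list above (exact, close, narrow/broad, related) and, per the special case, does not resolve a narrow + broad conflict but reports it for manual investigation.

```python
PRECEDENCE = ["exact_mappings", "close_mappings", "narrow_mappings",
              "broad_mappings", "related_mappings"]


def resolve_duplicates(defn: dict) -> tuple[dict, list[str]]:
    """Return (cleaned copy of defn, URIs that need manual review)."""
    categories_by_uri = {}
    for category in PRECEDENCE:
        for uri in defn.get(category) or []:
            found = categories_by_uri.setdefault(uri, [])
            if category not in found:
                found.append(category)

    resolved = dict(defn)  # shallow copy; lists are replaced, not mutated
    needs_review = []
    for uri, categories in categories_by_uri.items():
        if len(categories) < 2:
            continue
        if categories[0] == "narrow_mappings" and "broad_mappings" in categories:
            needs_review.append(uri)  # contradictory; a human must decide
            continue
        # Keep the highest-precedence category, drop the URI from the rest
        for category in categories[1:]:
            resolved[category] = [u for u in resolved[category] if u != uri]

    # Remove mapping keys that ended up empty
    for category in PRECEDENCE:
        if category in resolved and not resolved[category]:
            del resolved[category]
    return resolved, needs_review
```

The automatic choice is only a default; where the ontology definition shows that the weaker relationship is the accurate one, the mapping should be corrected by hand.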
|
||||
|
||||
## Real Examples Fixed
|
||||
|
||||
### Example 1: source_url
|
||||
|
||||
```yaml
|
||||
# BEFORE (wrong)
slots:
  source_url:
    exact_mappings:
      - schema:url
    broad_mappings:
      - schema:url  # Duplicate!

# AFTER (correct)
slots:
  source_url:
    exact_mappings:
      - schema:url  # Keep exact (strongest)
    # broad_mappings removed
|
||||
```
|
||||
|
||||
### Example 2: Custodian class
|
||||
|
||||
```yaml
|
||||
# BEFORE (wrong)
classes:
  Custodian:
    close_mappings:
      - cpov:PublicOrganisation
    narrow_mappings:
      - cpov:PublicOrganisation  # Duplicate!

# AFTER (correct)
classes:
  Custodian:
    close_mappings:
      - cpov:PublicOrganisation  # Keep close (Custodian ≈ PublicOrganisation)
    # narrow_mappings: use for URIs that are MORE SPECIFIC than Custodian
|
||||
```
|
||||
|
||||
### Example 3: geonames_id (narrow + broad conflict)
|
||||
|
||||
```yaml
|
||||
# BEFORE (wrong - logical contradiction!)
slots:
  geonames_id:
    narrow_mappings:
      - dcterms:identifier  # Says dcterms:identifier is MORE SPECIFIC than geonames_id
    broad_mappings:
      - dcterms:identifier  # Says dcterms:identifier is MORE GENERAL than geonames_id

# AFTER (correct)
slots:
  geonames_id:
    broad_mappings:
      - dcterms:identifier  # geonames_id IS a specific type of identifier, so dcterms:identifier is broader
    # narrow_mappings removed (was contradictory)
|
||||
```
|
||||
|
||||
## Detection Script
|
||||
|
||||
Run this to find duplicate mappings in the schema:
|
||||
|
||||
```python
|
||||
import yaml
from pathlib import Path
from collections import defaultdict

mapping_types = ['exact_mappings', 'close_mappings', 'related_mappings',
                 'narrow_mappings', 'broad_mappings']

dirs = [
    Path('schemas/20251121/linkml/modules/slots'),
    Path('schemas/20251121/linkml/modules/classes'),
]

for d in dirs:
    for yaml_file in d.glob('*.yaml'):
        # Skip files that cannot be parsed rather than aborting the scan
        try:
            with open(yaml_file) as f:
                content = yaml.safe_load(f)
        except Exception:
            continue
        # Empty files load as None; anything that is not a mapping has no sections
        if not isinstance(content, dict):
            continue

        for section in ['classes', 'slots']:
            items = content.get(section, {})
            if not isinstance(items, dict):
                continue
            for name, defn in items.items():
                if not isinstance(defn, dict):
                    continue
                # Collect every mapping category in which each URI appears
                uri_to_types = defaultdict(list)
                for mt in mapping_types:
                    for uri in defn.get(mt, []) or []:
                        uri_to_types[uri].append(mt)
                # A URI listed under more than one category violates this rule
                for uri, types in uri_to_types.items():
                    if len(types) > 1:
                        print(f"{yaml_file}: {name} - {uri} in {types}")
|
||||
```
|
||||
|
||||
## Validation Rule
|
||||
|
||||
**Pre-commit check**: Before committing LinkML schema changes, run the detection script. If any duplicates are found, the commit should fail.
|
||||
|
||||
## References
|
||||
|
||||
- [LinkML Mappings Documentation](https://linkml.io/linkml-model/latest/docs/mappings/)
|
||||
- [SKOS Mapping Properties](https://www.w3.org/TR/skos-reference/#mapping)
|
||||
- Rule 50: Ontology-to-LinkML Mapping Convention (parent rule)
|
||||
- Rule 51: No Hallucinated Ontology References
|
||||
316
.opencode/rules/linkml/no-hallucinated-ontology-references.md
Normal file
|
|
@ -0,0 +1,316 @@
|
|||
# Rule 51: No Hallucinated Ontology References
|
||||
|
||||
**Priority**: CRITICAL
|
||||
**Scope**: All LinkML schema files (`schemas/20251121/linkml/`)
|
||||
**Created**: 2025-01-13
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
All ontology references in LinkML schema files (`class_uri`, `slot_uri`, `*_mappings`) MUST be verifiable against actual ontology files in `/data/ontology/`. References to predicates or classes that do not exist in local ontology files are considered **hallucinated** and are prohibited.
|
||||
|
||||
---
|
||||
|
||||
## The Problem
|
||||
|
||||
AI agents may suggest ontology mappings based on training data without verifying that:
|
||||
1. The ontology file exists in `/data/ontology/`
|
||||
2. The specific predicate/class exists within that ontology file
|
||||
3. The prefix is declared and resolvable
|
||||
|
||||
This leads to schema files containing references like `dqv:value` or `adms:status` that cannot be validated or serialized to RDF.
|
||||
|
||||
---
|
||||
|
||||
## Requirements
|
||||
|
||||
### 1. All Ontology Prefixes Must Have Local Files
|
||||
|
||||
Before using a prefix (e.g., `prov:`, `schema:`, `org:`), verify the ontology file exists:
|
||||
|
||||
```bash
|
||||
# Check if ontology exists
|
||||
ls data/ontology/ | grep -i "prov\|schema\|org"
|
||||
```
|
||||
|
||||
**Available Ontologies** (as of 2025-01-13):
|
||||
|
||||
| Prefix | File | Verified |
|
||||
|--------|------|----------|
|
||||
| `prov:` | `prov-o.ttl`, `prov.ttl` | ✅ |
|
||||
| `schema:` | `schemaorg.owl` | ✅ |
|
||||
| `org:` | `org.rdf` | ✅ |
|
||||
| `skos:` | `skos.rdf` | ✅ |
|
||||
| `dcterms:` | `dublin_core_elements.rdf` | ✅ |
|
||||
| `foaf:` | `foaf.ttl` | ✅ |
|
||||
| `rico:` | `RiC-O_1-1.rdf` | ✅ |
|
||||
| `crm:` | `CIDOC_CRM_v7.1.3.rdf` | ✅ |
|
||||
| `geo:` | `geo.ttl` | ✅ |
|
||||
| `sosa:` | `sosa.ttl` | ✅ |
|
||||
| `bf:` | `bibframe.rdf` | ✅ |
|
||||
| `edm:` | `edm.owl` | ✅ |
|
||||
| `premis:` | `premis3.owl` | ✅ |
|
||||
| `dcat:` | `dcat3.ttl` | ✅ |
|
||||
| `ore:` | `ore.rdf` | ✅ |
|
||||
| `pico:` | `pico.ttl` | ✅ |
|
||||
| `gn:` | `geonames_ontology.rdf` | ✅ |
|
||||
| `time:` | `time.ttl` | ✅ |
|
||||
| `locn:` | `locn.ttl` | ✅ |
|
||||
| `dqv:` | `dqv.ttl` | ✅ |
|
||||
| `adms:` | `adms.ttl` | ✅ |
|
||||
|
||||
**NOT Available** (do not use without adding):
|
||||
|
||||
| Prefix | Status | Alternative |
|
||||
|--------|--------|-------------|
|
||||
| `qudt:` | Only referenced in era_ontology.ttl | Use `hc:` with close_mappings annotation |
|
||||
|
||||
### 2. Predicates Must Exist in Ontology Files
|
||||
|
||||
Before using a predicate, verify it exists:
|
||||
|
||||
```bash
|
||||
# Verify predicate exists
|
||||
grep -l "hasFrameRate\|frameRate" data/ontology/premis3.owl
|
||||
|
||||
# Check specific predicate definition
|
||||
grep -E "premis:hasFrameRate|:hasFrameRate" data/ontology/premis3.owl
|
||||
```
|
||||
|
||||
### 3. Use hc: Prefix for Domain-Specific Concepts
|
||||
|
||||
When no standard ontology predicate exists, use the Heritage Custodian namespace:
|
||||
|
||||
```yaml
# CORRECT - Use hc: with documentation
slots:
  heritage_relevance_score:
    slot_uri: hc:heritageRelevanceScore
    description: Heritage sector relevance score (0.0-1.0)
    annotations:
      ontology_note: >-
        No standard ontology predicate for heritage relevance scoring.
        Domain-specific metric for this project.

# WRONG - Hallucinated predicate
slots:
  heritage_relevance_score:
    slot_uri: dqv:heritageScore  # Does not exist!
```
|
||||
|
||||
### 4. Document External References in close_mappings
|
||||
|
||||
When a similar concept exists in an ontology we don't have locally, document it in `close_mappings` with a note:
|
||||
|
||||
```yaml
slots:
  confidence_score:
    slot_uri: hc:confidenceScore
    close_mappings:
      - dqv:value  # W3C Data Quality Vocabulary (not in local files)
    annotations:
      external_ontology_note: >-
        dqv:value from W3C Data Quality Vocabulary would be semantically
        appropriate but ontology not included in project. See
        https://www.w3.org/TR/vocab-dqv/
```
|
||||
|
||||
---
|
||||
|
||||
## Verification Workflow
|
||||
|
||||
### Before Adding New Mappings
|
||||
|
||||
1. **Check if ontology file exists**:
|
||||
```bash
|
||||
ls data/ontology/ | grep -i "<ontology-name>"
|
||||
```
|
||||
|
||||
2. **Search for predicate in ontology**:
|
||||
```bash
|
||||
grep -l "<predicate-name>" data/ontology/*
|
||||
```
|
||||
|
||||
3. **Verify predicate definition**:
|
||||
```bash
|
||||
grep -B2 -A5 "<predicate-name>" data/ontology/<file>
|
||||
```
|
||||
|
||||
4. **If not found**: Use `hc:` prefix with appropriate documentation
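
The lookup in steps 2 and 3 can also be scripted. The sketch below is a purely textual check, assuming that finding the local name as a whole identifier in the file is enough for a first pass; it does not parse RDF, and the helper names `term_in_text` and `term_in_file` are hypothetical.

```python
import re
from pathlib import Path


def term_in_text(ontology_text, local_name):
    """True if local_name occurs as a whole identifier in the ontology text."""
    pattern = r'(?<![A-Za-z0-9_])' + re.escape(local_name) + r'(?![A-Za-z0-9_])'
    return re.search(pattern, ontology_text) is not None


def term_in_file(ontology_path, local_name):
    """Same check against a file on disk; returns False if the file is missing."""
    path = Path(ontology_path)
    if not path.is_file():
        return False
    text = path.read_text(encoding="utf-8", errors="replace")
    return term_in_text(text, local_name)
```

A match only shows that the name appears somewhere in the file, so the definition itself should still be inspected as in step 3.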
|
||||
|
||||
### When Reviewing Existing Mappings
|
||||
|
||||
Run validation script:
|
||||
|
||||
```bash
# List the distinct prefixes used in slot_uri values (excluding hc:)
grep -r "slot_uri:" schemas/20251121/linkml/modules/slots/ | \
  grep -v "hc:" | \
  cut -d: -f3 | \
  sort -u

# Verify each prefix has a local file
for prefix in prov schema org skos dcterms foaf rico; do
  echo "Checking $prefix:"
  ls data/ontology/ | grep -i "$prefix" || echo "  NOT FOUND!"
done
```
|
||||
|
||||
---
|
||||
|
||||
## Ontology Addition Process
|
||||
|
||||
If a new ontology is genuinely needed:
|
||||
|
||||
1. **Download the ontology**:
|
||||
```bash
|
||||
curl -L -o data/ontology/<name>.ttl "<url>" -H "Accept: text/turtle"
|
||||
```
|
||||
|
||||
2. **Update ONTOLOGY_CATALOG.md**:
|
||||
```bash
|
||||
# Add entry to data/ontology/ONTOLOGY_CATALOG.md
|
||||
```
|
||||
|
||||
3. **Verify predicates exist**:
|
||||
```bash
|
||||
grep "<predicate>" data/ontology/<name>.ttl
|
||||
```
|
||||
|
||||
4. **Update LinkML prefixes** in schema files
|
||||
|
||||
---
|
||||
|
||||
## Examples
|
||||
|
||||
### CORRECT: Verified Mapping
|
||||
|
||||
```yaml
slots:
  retrieval_timestamp:
    slot_uri: prov:atTime  # Verified in data/ontology/prov-o.ttl
    range: datetime
```
|
||||
|
||||
### CORRECT: Domain-Specific with External Reference
|
||||
|
||||
```yaml
slots:
  confidence_score:
    slot_uri: hc:confidenceScore  # HC namespace (always valid)
    range: float
    close_mappings:
      - dqv:value  # External reference (documented, not required locally)
    annotations:
      ontology_note: >-
        Uses HC namespace as dqv: ontology not in local files.
        dqv:value would be semantically appropriate alternative.
```
|
||||
|
||||
### WRONG: Hallucinated Mapping
|
||||
|
||||
```yaml
slots:
  confidence_score:
    slot_uri: dqv:value  # INVALID - dqv: not in data/ontology/!
    range: float
```
|
||||
|
||||
### WRONG: Non-Existent Predicate
|
||||
|
||||
```yaml
slots:
  frame_rate:
    slot_uri: premis:hasFrameRate  # INVALID - predicate not in premis3.owl!
    range: float
```
|
||||
|
||||
---
|
||||
|
||||
## Consequences of Violation
|
||||
|
||||
1. **RDF serialization fails** - Invalid prefixes cause gen-owl errors
|
||||
2. **Schema validation errors** - LinkML validates prefix declarations
|
||||
3. **Broken interoperability** - External systems cannot resolve URIs
|
||||
4. **Data quality issues** - Semantic web tooling cannot process data
|
||||
|
||||
---
|
||||
|
||||
## PREMIS Ontology Reference (premis3.owl)
|
||||
|
||||
**CRITICAL**: The PREMIS ontology is frequently hallucinated. ALL premis: references MUST be verified.
|
||||
|
||||
### Valid PREMIS Classes
|
||||
|
||||
```
|
||||
Action, Agent, Bitstream, Copyright, Dependency, EnvironmentCharacteristic,
|
||||
Event, File, Fixity, HardwareAgent, Identifier, Inhibitor, InstitutionalPolicy,
|
||||
IntellectualEntity, License, Object, Organization, OutcomeStatus, Person,
|
||||
PreservationPolicy, Representation, RightsBasis, RightsStatus, Rule, Signature,
|
||||
SignatureEncoding, SignificantProperties, SoftwareAgent, Statute,
|
||||
StorageLocation, StorageMedium
|
||||
```
|
||||
|
||||
### Valid PREMIS Properties
|
||||
|
||||
```
|
||||
act, allows, basis, characteristic, citation, compositionLevel, dependency,
|
||||
determinationDate, documentation, encoding, endDate, fixity, governs,
|
||||
identifier, inhibitedBy, inhibits, jurisdiction, key, medium, note,
|
||||
originalName, outcome, outcomeNote, policy, prohibits, purpose, rationale,
|
||||
relationship, restriction, rightsStatus, signature, size, startDate,
|
||||
storedAt, terms, validationRules, version
|
||||
```
|
||||
|
||||
### Known Hallucinated PREMIS Terms (DO NOT USE)
|
||||
|
||||
| Hallucinated Term | Correction |
|
||||
|-------------------|------------|
|
||||
| `premis:PreservationEvent` | Use `premis:Event` |
|
||||
| `premis:RightsDeclaration` | Use `premis:RightsBasis` or `premis:RightsStatus` |
|
||||
| `premis:hasRightsStatement` | Use `premis:rightsStatus` |
|
||||
| `premis:hasRightsDeclaration` | Use `premis:rightsStatus` |
|
||||
| `premis:hasRepresentation` | Use `premis:relationship` or `dcterms:hasFormat` |
|
||||
| `premis:hasRelatedStatementInformation` | Use `premis:note` or `adms:status` |
|
||||
| `premis:hasObjectCharacteristics` | Use `premis:characteristic` |
|
||||
| `premis:rightsGranted` | Use `premis:RightsStatus` class with `premis:restriction` |
|
||||
| `premis:rightsEndDate` | Use `premis:endDate` |
|
||||
| `premis:linkingAgentIdentifier` | Use `premis:Agent` class |
|
||||
| `premis:storageLocation` (lowercase) | Use `premis:storedAt` property or `premis:StorageLocation` class |
|
||||
| `premis:hasFrameRate` | Does not exist - use `hc:frameRate` |
|
||||
| `premis:environmentCharacteristic` (lowercase) | Use `premis:EnvironmentCharacteristic` (class) |
|
||||
|
||||
### PREMIS Verification Commands
|
||||
|
||||
```bash
# List all PREMIS classes
grep -E "owl:Class.*premis" data/ontology/premis3.owl | \
  sed 's/.*v3\///' | sed 's/".*//' | sort -u

# List all PREMIS properties
grep -E "ObjectProperty|DatatypeProperty" data/ontology/premis3.owl | \
  grep -oP 'v3/\K[^"]+' | sort -u

# Verify a specific term exists
grep -c "YourTermHere" data/ontology/premis3.owl
```
|
||||
|
||||
---
|
||||
|
||||
## See Also
|
||||
|
||||
- Rule 38: Slot Centralization and Semantic URI Requirements
|
||||
- Rule 50: Ontology-to-LinkML Mapping Convention
|
||||
- `/data/ontology/ONTOLOGY_CATALOG.md` - Available ontologies
|
||||
- `.opencode/rules/slot-ontology-mapping-reference.md` - Mapping reference
|
||||
|
||||
---
|
||||
|
||||
## Version History
|
||||
|
||||
- **2025-01-13**: Added 7 more hallucinated PREMIS terms discovered during schema audit:
  - `premis:hasRightsStatement`, `premis:hasRightsDeclaration`, `premis:hasRepresentation`
  - `premis:hasRelatedStatementInformation`, `premis:rightsGranted`, `premis:rightsEndDate`
  - `premis:linkingAgentIdentifier`
- **2025-01-13**: Initial creation after discovering dqv:, adms:, qudt: references without local files
|
||||
164
.opencode/rules/linkml/no-migration-deferral-rule.md
Normal file
|
|
@ -0,0 +1,164 @@
|
|||
# Rule 60: No Migration Deferral - Execute All slot_fixes.yaml Revisions
|
||||
|
||||
## Status: MANDATORY
|
||||
|
||||
## Summary
|
||||
|
||||
**AI agents MUST execute ALL migrations specified in `slot_fixes.yaml`.** The pattern of marking migrations as "WELL_STRUCTURED_NO_MIGRATION_NEEDED" or similar deferral notes is **STRICTLY PROHIBITED** except in the rarest cases of genuine semantic conflict.
|
||||
|
||||
## The Problem
|
||||
|
||||
Previous AI agents incorrectly deferred >95% of slot_fixes.yaml migrations with notes like:
|
||||
- "WELL_STRUCTURED_NO_MIGRATION_NEEDED"
|
||||
- "Already has proper slot_uri"
|
||||
- "Simple enum is appropriate"
|
||||
- "Would add unnecessary indirection"
|
||||
- "Current structure is adequate"
|
||||
|
||||
**These deferrals were WRONG.** The slot_fixes.yaml revisions were carefully curated based on:
|
||||
- Ontology analysis (CIDOC-CRM, RiC-O, PROV-O, Schema.org alignment)
|
||||
- Semantic correctness and consistency
|
||||
- Pattern adherence (Rule 39: RiC-O naming, Rule 43: singular nouns)
|
||||
- Type/Types class hierarchy design (Rule 0b)
|
||||
|
||||
## The Rule
|
||||
|
||||
### MANDATORY: Execute ALL Migrations
|
||||
|
||||
When encountering a slot_fixes.yaml entry with `status: false`:
|
||||
|
||||
1. **READ the `revision` field** - This specifies EXACTLY what to do
|
||||
2. **EXECUTE the migration** - Create/update slots, classes, imports as specified
|
||||
3. **UPDATE status to `true`** - Document what was done in `notes`
|
||||
|
||||
### PROHIBITED: Deferral Reasoning
|
||||
|
||||
The following justifications are **INVALID** and **NEVER acceptable**:
|
||||
|
||||
| Invalid Justification | Why It's Wrong |
|
||||
|-----------------------|----------------|
|
||||
| "Already has proper slot_uri" | slot_uri handles external mapping; internal structure is separate concern |
|
||||
| "Simple string/enum is sufficient" | Consistency and extensibility trump local simplicity |
|
||||
| "Would add unnecessary indirection" | Indirection enables reuse and future extension |
|
||||
| "Creating a class would over-engineer" | Ontology design favors class-based modeling |
|
||||
| "Current structure is well-designed" | Individual optimization conflicts with schema-wide patterns |
|
||||
| "No semantic benefit" | Agent lacks context to make this judgment |
|
||||
|
||||
### VALID: Reasons to Pause (Not Defer)
|
||||
|
||||
The ONLY valid reasons to pause a migration (NOT defer indefinitely) are:
|
||||
|
||||
1. **Semantic Conflict**: The proposed revision contradicts the slot's documented semantic intent
|
||||
2. **Circular Dependency**: Migration would create import cycle
|
||||
3. **External Consumer Impact**: Breaking change affecting downstream systems
|
||||
4. **Class Already Exists**: Target class exists under different name (document and merge)
|
||||
|
||||
**When pausing**: Add a `feedback` entry explaining the specific conflict and request human review. Do NOT mark as "NO_MIGRATION_NEEDED".
|
||||
|
||||
## Decision Tree
|
||||
|
||||
```
Is there a slot_fixes.yaml entry with status: false?
├─ YES → Read the revision field
│        ├─ Does revision specify slots/classes to create?
│        │   └─ YES → EXECUTE THE MIGRATION (mandatory)
│        └─ Is there a genuine semantic conflict?
│            ├─ NO → EXECUTE THE MIGRATION (mandatory)
│            └─ YES → Document conflict in feedback, request human review
│                     (Do NOT mark as "no migration needed")
└─ NO → Nothing to do
```
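
The decision tree can be restated as a small function. This is a sketch only: it assumes an entry has been parsed into a dictionary with a `processed.status` field as in the examples in this rule, and the `has_semantic_conflict` flag stands in for a judgment that still has to be made and documented by the agent. The function name and return values are hypothetical.

```python
def next_action(entry, has_semantic_conflict=False):
    """Map a parsed slot_fixes.yaml entry to the action the rule requires."""
    processed = entry.get("processed") or {}
    if processed.get("status") is not False:
        return "nothing_to_do"        # only entries with status: false are pending
    if has_semantic_conflict:
        return "pause_for_review"     # add a feedback entry; leave status false
    return "execute_migration"        # the default and mandatory path
```

There is deliberately no return value for "no migration needed": the only alternatives to executing are doing nothing (already processed) or pausing with documented feedback.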
|
||||
|
||||
## Examples
|
||||
|
||||
### WRONG: Deferral Note
|
||||
|
||||
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/example_slot
  revision:
    - label: has_or_had_example
      type: slot
    - label: Example
      type: class
  processed:
    status: true  # WRONG - marked true without doing work
    notes: "WELL_STRUCTURED_NO_MIGRATION_NEEDED - slot already has proper
      slot_uri and the current structure is adequate"  # INVALID
```
|
||||
|
||||
### CORRECT: Execute Migration
|
||||
|
||||
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/example_slot
  revision:
    - label: has_or_had_example
      type: slot
    - label: Example
      type: class
  processed:
    status: true
    timestamp: '2026-01-19T12:00:00Z'
    notes: 'Migrated 2026-01-19 per Rule 53/56.
      - Created has_or_had_example.yaml slot file
      - Created Example.yaml class file
      - Updated ClassA.yaml, ClassB.yaml to use new slot
      - Archived: modules/slots/archive/example_slot_archived_20260119.yaml'
```
|
||||
|
||||
### CORRECT: Pause with Genuine Conflict
|
||||
|
||||
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/conflicting_slot
  revision:
    - label: has_or_had_foo
      type: slot
  processed:
    status: false  # Correctly left false
    notes: ''
  feedback:
    - timestamp: '2026-01-19T12:00:00Z'
      user: opencode-claude
      done: false
      comment: |
        PAUSED FOR HUMAN REVIEW - Genuine semantic conflict detected:
        - Revision specifies has_or_had_foo (temporal relationship)
        - But slot is used for immutable birth dates (should be has_*)
        - Request clarification on intended temporal semantics
```
|
||||
|
||||
## Statistics Context
|
||||
|
||||
The slot_fixes.yaml file contains 527 migration entries. Analysis of previous agent behavior:
|
||||
|
||||
- **Incorrectly deferred**: >95% of entries marked "NO_MIGRATION_NEEDED"
|
||||
- **Actually needing deferral**: <5% (genuine semantic conflicts)
|
||||
- **Required action**: Execute ALL migrations except those with documented semantic conflicts
|
||||
|
||||
## Related Rules
|
||||
|
||||
- **Rule 53**: Full Slot Migration - slot_fixes.yaml is AUTHORITATIVE
|
||||
- **Rule 56**: Semantic Consistency Over Simplicity - Always Execute Revisions
|
||||
- **Rule 57**: slot_fixes.yaml Revision Key is IMMUTABLE
|
||||
- **Rule 58**: Feedback vs Revision Distinction
|
||||
|
||||
## Anti-Patterns Checklist
|
||||
|
||||
Before marking ANY migration as complete, verify:
|
||||
|
||||
- [ ] Did I actually create the specified slots?
|
||||
- [ ] Did I actually create the specified classes?
|
||||
- [ ] Did I update all class files that use this slot?
|
||||
- [ ] Did I archive the old slot file?
|
||||
- [ ] Is my "notes" field documenting actual work done, not a deferral excuse?
|
||||
|
||||
If any answer is "no", the migration is NOT complete.
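
A reviewer could screen `notes` fields for the deferral wording quoted earlier in this rule. The sketch below assumes a simple case-insensitive substring match is good enough for a first pass; the phrase list is taken from the examples above, and the function name is hypothetical.

```python
DEFERRAL_PHRASES = [
    "NO_MIGRATION_NEEDED",
    "already has proper slot_uri",
    "simple enum is appropriate",
    "would add unnecessary indirection",
    "current structure is adequate",
]


def looks_like_deferral(notes):
    """True if a notes string contains one of the known deferral phrases."""
    text = (notes or "").lower()
    return any(phrase.lower() in text for phrase in DEFERRAL_PHRASES)
```

A hit is a prompt for human review, not proof of a violation, and a clean result does not show that the migration was actually executed.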
|
||||
|
||||
## Consequences
|
||||
|
||||
Agents that defer migrations without genuine semantic conflict:
|
||||
1. Create technical debt requiring human cleanup
|
||||
2. Delay schema consistency improvements
|
||||
3. Waste curator time reviewing false "completions"
|
||||
4. Undermine trust in AI-assisted schema work
|
||||
|
||||
**Execute the migrations. Do not defer.**
|
||||
215
.opencode/rules/linkml/no-ontology-prefix-in-slot-names.md
Normal file
|
|
@ -0,0 +1,215 @@
|
|||
# Rule 42: No Ontology Prefixes in Slot Names
|
||||
|
||||
**CRITICAL**: LinkML slot names MUST NOT include ontology namespace prefixes. Ontology references belong in mapping properties, NOT in element names.
|
||||
|
||||
---
|
||||
|
||||
## 1. The Problem
|
||||
|
||||
Slot names like `rico_has_or_had_holder` or `skos_broader` violate separation of concerns:
|
||||
|
||||
- **Slot names** should describe the semantic meaning in plain, readable terms
|
||||
- **Ontology mappings** belong in `slot_uri`, `exact_mappings`, `close_mappings`, `related_mappings`, `narrow_mappings`, `broad_mappings`
|
||||
|
||||
Embedding ontology prefixes in names:
|
||||
1. Creates coupling between naming and specific ontology versions
|
||||
2. Reduces readability for non-ontology experts
|
||||
3. Duplicates information already in mapping properties
|
||||
4. Makes future ontology migrations harder
|
||||
|
||||
---
|
||||
|
||||
## 2. Correct Pattern
|
||||
|
||||
### Use Descriptive Names + Mapping Properties
|
||||
|
||||
```yaml
# CORRECT: Clean name with ontology reference in slot_uri
slots:
  record_holder:
    description: The custodian that holds or held this record set.
    slot_uri: rico:hasOrHadHolder
    exact_mappings:
      - rico:hasOrHadHolder
    close_mappings:
      - schema:holdingArchive
    range: Custodian
```
|
||||
|
||||
### WRONG: Ontology Prefix in Name
|
||||
|
||||
```yaml
# WRONG: Ontology prefix embedded in slot name
slots:
  rico_has_or_had_holder:  # BAD - "rico_" prefix
    description: The custodian that holds or held this record set.
    slot_uri: rico:hasOrHadHolder
    range: string
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Prohibited Prefixes in Slot Names
|
||||
|
||||
The following prefixes MUST NOT appear at the start of slot names:
|
||||
|
||||
| Prefix | Ontology | Example Violation |
|
||||
|--------|----------|-------------------|
|
||||
| `rico_` | Records in Contexts | `rico_organizational_principle` |
|
||||
| `skos_` | SKOS | `skos_broader`, `skos_narrower` |
|
||||
| `schema_` | Schema.org | `schema_name` |
|
||||
| `dcterms_` | Dublin Core | `dcterms_created` |
|
||||
| `dct_` | Dublin Core | `dct_identifier` |
|
||||
| `prov_` | PROV-O | `prov_generated_by` |
|
||||
| `org_` | W3C Organization | `org_has_member` |
|
||||
| `crm_` | CIDOC-CRM | `crm_carried_out_by` |
|
||||
| `foaf_` | FOAF | `foaf_knows` |
|
||||
| `owl_` | OWL | `owl_same_as` |
|
||||
| `rdf_` | RDF | `rdf_type` |
|
||||
| `rdfs_` | RDFS | `rdfs_label` |
|
||||
| `cpov_` | CPOV | `cpov_public_organisation` |
|
||||
| `tooi_` | TOOI | `tooi_overheidsorganisatie` |
|
||||
| `bf_` | BIBFRAME | `bf_title` |
|
||||
| `edm_` | Europeana | `edm_provided_cho` |
|
||||
|
||||
---
|
||||
|
||||
## 4. Migration Examples
|
||||
|
||||
### Example 1: RiC-O Slots
|
||||
|
||||
```yaml
# BEFORE (wrong)
rico_has_or_had_holder:
  slot_uri: rico:hasOrHadHolder
  range: string

# AFTER (correct)
record_holder:
  description: Reference to the custodian that holds or held this record set.
  slot_uri: rico:hasOrHadHolder
  exact_mappings:
    - rico:hasOrHadHolder
  range: Custodian
```
|
||||
|
||||
### Example 2: SKOS Slots
|
||||
|
||||
```yaml
# BEFORE (wrong)
skos_broader:
  slot_uri: skos:broader
  range: uriorcurie

# AFTER (correct)
broader_concept:
  description: A broader concept in the hierarchy.
  slot_uri: skos:broader
  exact_mappings:
    - skos:broader
  range: uriorcurie
```
|
||||
|
||||
### Example 3: RiC-O Organizational Principle
|
||||
|
||||
```yaml
# BEFORE (wrong)
rico_organizational_principle:
  slot_uri: rico:hasRecordSetType
  range: string

# AFTER (correct)
organizational_principle:
  description: The organizational principle (fonds, series, collection) for this record set.
  slot_uri: rico:hasRecordSetType
  exact_mappings:
    - rico:hasRecordSetType
  range: string
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Exceptions
|
||||
|
||||
### 5.1 Identifier Slots
|
||||
|
||||
Slots that store **identifiers from external systems** may include system names (not ontology prefixes):
|
||||
|
||||
```yaml
# ALLOWED: External system identifier
wikidata_id:
  description: Wikidata entity identifier (Q-number).
  slot_uri: schema:identifier
  range: string
  pattern: "^Q[0-9]+$"

# ALLOWED: External system identifier
viaf_id:
  description: VIAF identifier for authority control.
  slot_uri: schema:identifier
  range: string
```
|
||||
|
||||
### 5.2 Internal Namespace Force Slots
|
||||
|
||||
Technical slots for namespace generation are prefixed with `internal_`:
|
||||
|
||||
```yaml
# ALLOWED: Technical workaround slot
internal_wd_namespace_force:
  description: Internal slot to force WD namespace generation. Do not use.
  slot_uri: wd:Q35120
  range: string
```
|
||||
|
||||
---
|
||||
|
||||
## 6. Validation
|
||||
|
||||
Run this command to find violations:
|
||||
|
||||
```bash
|
||||
cd schemas/20251121/linkml/modules/slots
|
||||
ls -1 *.yaml | grep -E "^(rico_|skos_|schema_|dcterms_|dct_|prov_|org_|crm_|foaf_|owl_|rdf_|rdfs_|cpov_|tooi_|bf_|edm_)"
|
||||
```
|
||||
|
||||
Expected output: No files (after migration)
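
The shell command above inspects file names. The same check can be applied to slot names directly, for example after loading the `slots:` keys from a schema. This is a minimal sketch using the prefix list from Section 3; the function name `prefix_violations` is an assumption, not an existing project helper.

```python
PROHIBITED_PREFIXES = (
    "rico_", "skos_", "schema_", "dcterms_", "dct_", "prov_", "org_", "crm_",
    "foaf_", "owl_", "rdf_", "rdfs_", "cpov_", "tooi_", "bf_", "edm_",
)


def prefix_violations(slot_names):
    """Return the slot names that start with a prohibited ontology prefix."""
    return [name for name in slot_names if name.startswith(PROHIBITED_PREFIXES)]
```

Because each prefix includes the trailing underscore, names such as `organizational_principle`, `wikidata_id`, and `internal_wd_namespace_force` are not flagged, which matches the exceptions in Section 5.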
|
||||
|
||||
---
|
||||
|
||||
## 7. Rationale
|
||||
|
||||
### LinkML Best Practices
|
||||
|
||||
LinkML provides dedicated properties for ontology alignment:
|
||||
|
||||
| Property | Purpose | Example |
|
||||
|----------|---------|---------|
|
||||
| `slot_uri` | Primary ontology predicate | `slot_uri: rico:hasOrHadHolder` |
|
||||
| `exact_mappings` | Semantically equivalent predicates | `exact_mappings: [schema:holdingArchive]` |
|
||||
| `close_mappings` | Nearly equivalent predicates | `close_mappings: [dc:creator]` |
|
||||
| `related_mappings` | Related but different predicates | `related_mappings: [prov:wasAttributedTo]` |
|
||||
| `narrow_mappings` | More specific predicates | `narrow_mappings: [rico:hasInstantiation]` |
|
||||
| `broad_mappings` | More general predicates | `broad_mappings: [schema:about]` |
|
||||
|
||||
See: https://linkml.io/linkml-model/latest/docs/mappings/
|
||||
|
||||
### Clean Separation of Concerns
|
||||
|
||||
- **Names**: Human-readable, domain-focused terminology
|
||||
- **URIs**: Machine-readable, ontology-specific identifiers
|
||||
- **Mappings**: Cross-ontology alignment documentation
|
||||
|
||||
This separation allows:
|
||||
1. Renaming slots without changing ontology bindings
|
||||
2. Adding new ontology mappings without renaming slots
|
||||
3. Clear documentation of semantic relationships
|
||||
4. Easier maintenance and evolution
|
||||
|
||||
---
|
||||
|
||||
## 8. See Also
|
||||
|
||||
- **Rule 38**: Slot Centralization and Semantic URI Requirements
|
||||
- **Rule 39**: Slot Naming Convention (RiC-O Style) - for temporal naming patterns
|
||||
- LinkML Mappings Documentation: https://linkml.io/linkml-model/latest/docs/mappings/
|
||||
61
.opencode/rules/linkml/no-rough-edits-in-schema.md
Normal file
|
|
@ -0,0 +1,61 @@
|
|||
# Rule: No Rough Edits in Schema Files
|
||||
|
||||
**Identifier**: `no-rough-edits-in-schema`
|
||||
**Severity**: **CRITICAL**
|
||||
|
||||
## Core Directive
|
||||
|
||||
**DO NOT** perform rough, imprecise, or bulk text substitutions (like `sed -i` or regex-based python scripts) on LinkML schema files (`schemas/*/linkml/`) without guaranteeing structural integrity.
|
||||
|
||||
**YOU MUST**:
|
||||
* ✅ Use proper YAML parsers/dumpers if modifying structure programmatically.
|
||||
* ✅ Manually verify edits if using text replacement.
|
||||
* ✅ Ensure indentation and nesting are preserved exactly.
|
||||
* ✅ Respect comments and ordering (which parsers often destroy, so careful text editing is sometimes necessary, but it must be PRECISE).
|
||||
|
||||
## Rationale
|
||||
|
||||
LinkML schemas are highly structured YAML files where indentation and nesting semantics are critical. Rough edits often cause:
|
||||
* **Duplicate keys** (e.g., leaving a property behind after deleting its parent key).
|
||||
* **Invalid indentation** (breaking the parent-child relationship).
|
||||
* **Silent corruption** (valid YAML but wrong semantics).
|
||||
|
||||
## Examples
|
||||
|
||||
### ❌ Anti-Pattern: Rough Deletion
|
||||
|
||||
Deleting lines containing a string without checking context:
|
||||
|
||||
```python
# WRONG: Deleting lines blindly
for line in lines:
    if "some_slot" in line:
        continue  # Deletes the line, but might leave children orphaned!
    new_lines.append(line)
```
|
||||
|
||||
**Resulting Corruption**:
|
||||
```yaml
# Original
slots:
  some_slot:
    range: string

# Corrupted (orphaned child)
slots:
    range: string  # INVALID!
```
|
||||
|
||||
### ✅ Correct Pattern: Structural Awareness
|
||||
|
||||
If removing a slot reference, ensure you remove the entire list item or key-value block.
|
||||
|
||||
```python
# BETTER: Check for list item syntax
if re.match(r'^\s*-\s*some_slot\s*$', line):
    continue
```
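
When a whole key-value block has to go, the nested lines must be removed together with the key. The sketch below shows one way to do this as a text edit that keeps comments and ordering elsewhere in the file intact. It assumes space-indented block-style YAML and a key written on its own line (`some_slot:` with no inline value); `remove_yaml_block` is a hypothetical helper, and the result should still be verified by loading the file with a YAML parser.

```python
import re


def remove_yaml_block(lines, key):
    """Drop the `key:` line and every more-indented line nested under it."""
    key_re = re.compile(r'^( *)' + re.escape(key) + r':\s*(#.*)?$')
    kept = []
    skip_indent = None  # indentation of the key being removed, while skipping
    for line in lines:
        if skip_indent is not None:
            indent = len(line) - len(line.lstrip(' '))
            if line.strip() == '' or indent > skip_indent:
                continue          # blank or nested line: still inside the block
            skip_indent = None    # back at the key's level: the block has ended
        match = key_re.match(line)
        if match:
            skip_indent = len(match.group(1))
            continue
        kept.append(line)
    return kept
```

Blank lines directly after the removed block are dropped as well, and a comment at the key's own indentation ends the block, so the output is worth a manual look before committing.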
|
||||
|
||||
## Application
|
||||
|
||||
This rule applies to ALL files in `schemas/20251121/linkml/` and future versions.
|
||||
|
|
@ -0,0 +1,53 @@
|
|||
# Rule: No Version Indicators in Names
|
||||
|
||||
## 🚨 Critical
|
||||
|
||||
Do not include version identifiers in **class names**, **slot names**, or **enum names**.
|
||||
|
||||
Version tags in semantic names create churn, break reuse, and force unnecessary migrations.
|
||||
|
||||
## The Rule
|
||||
|
||||
1. Use stable semantic names for LinkML elements.
   - ✅ `DigitalPlatform`
   - ❌ `DigitalPlatformV2`

2. If a model evolves, keep the name and update metadata/provenance.
   - Track revision in changelog, annotations, or transformation metadata.
   - Do not encode `v2`, `v3`, `_2026`, `beta`, `final` in the element name.

3. Apply this to all naming surfaces:
   - `classes:` keys
   - `slots:` keys
   - `enums:` keys
   - `name:` values in module files
|
||||
|
||||
## Allowed Versioning Locations
|
||||
|
||||
- File-level changelog/comments
|
||||
- Dedicated metadata classes/slots (e.g., transformation metadata)
|
||||
- External release tags (git tags, manifest versions)
|
||||
|
||||
## Migration Guidance
|
||||
|
||||
When you encounter versioned names:
|
||||
|
||||
1. Rename semantic elements to stable names.
|
||||
2. Update references/imports/usages accordingly.
|
||||
3. Preserve provenance of the migration in comments/annotations.
|
||||
|
||||
## Examples
|
||||
|
||||
✅ Correct:
|
||||
```yaml
classes:
  DigitalPlatformTransformationMetadata:
    description: Metadata about record transformation steps.
```
|
||||
|
||||
❌ Wrong:
|
||||
```yaml
classes:
  DigitalPlatformV2TransformationMetadata:
    description: Metadata about V2 transformation.
```
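
Versioned names can be screened with a few regular expressions. The patterns below are a heuristic sketch covering the indicators named in this rule (`v2`/`V2`, a `_2026` style year, and snake_case `beta`/`final` tags). They are assumptions rather than a project standard, and they can misfire on legitimate names such as `IPv4Address`, so hits need a human look.

```python
import re

_VERSION_PATTERNS = [
    re.compile(r'[Vv]\d+(?=[A-Z_]|$)'),                          # V2, v3 before a new word or at the end
    re.compile(r'_(?:19|20)\d{2}(?=_|$)'),                       # _2026 style year suffix
    re.compile(r'(?:^|_)(?:beta|final)(?=_|$)', re.IGNORECASE),  # snake_case release tags
]


def has_version_indicator(name):
    """True if the element name matches one of the version-indicator patterns."""
    return any(p.search(name) for p in _VERSION_PATTERNS)
```

CamelCase tags such as `PlatformBeta` are not covered by this sketch and would need an additional pattern.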
|
||||
15
.opencode/rules/linkml/ontology-detection-rule.md
Normal file
|
|
@ -0,0 +1,15 @@
|
|||
# Rule: Ontology Detection vs Heuristics
|
||||
|
||||
## Summary
|
||||
When detecting classes and predicates in `data/ontology/` or external ontology files, you must **read the actual ontology definitions** (e.g., RDF, OWL, TTL files) to determine if a term is a Class or a Property. Do not rely on naming heuristics (like "Capitalized means Class").
|
||||
|
||||
## Detail
|
||||
* **Verification**: Always read the source ontology file or use a semantic lookup tool to verify the `rdf:type` of an entity.
  * If `rdf:type` is `owl:Class` or `rdfs:Class`, it is a **Class**.
  * If `rdf:type` is `rdf:Property`, `owl:ObjectProperty`, or `owl:DatatypeProperty`, it is a **Property**.
* **Avoid Heuristics**: Do not infer a term's kind from its casing. `skos:Concept` is a class and `schema:name` is a property, but confirm that from the ontology definition, not from capitalization. Naming conventions vary between ontologies (e.g., CIDOC-CRM uses `E21_Person` for a class and `P14_carried_out_by` for a property).
* **Strictness**: If the ontology file is not available locally, attempt to fetch it or consult authoritative documentation before guessing.
|
||||
|
||||
## Violation Examples
|
||||
* Assuming `ex:MyTerm` is a class because it starts with an uppercase letter without checking the `.ttl` file.
|
||||
* Mapping a LinkML slot to `schema:Thing` (a Class) instead of a Property because you guessed based on the name.
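
Once the declared `rdf:type` values of a term have been read from the ontology, the classification itself is mechanical. The sketch below assumes the types are supplied as CURIE strings, however they were obtained; `classify_term` is a hypothetical helper and covers only the types listed in this rule.

```python
CLASS_TYPES = {"owl:Class", "rdfs:Class"}
PROPERTY_TYPES = {"rdf:Property", "owl:ObjectProperty", "owl:DatatypeProperty"}


def classify_term(declared_types):
    """Classify a term from its declared rdf:type values, never from its name."""
    types = set(declared_types)
    is_class = bool(types & CLASS_TYPES)
    is_property = bool(types & PROPERTY_TYPES)
    if is_class and not is_property:
        return "class"
    if is_property and not is_class:
        return "property"
    return "unknown"  # no recognised declaration, or conflicting ones
```

An `unknown` result means the term should not be mapped until its definition has been checked by hand.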
|
||||
306
.opencode/rules/linkml/ontology-to-linkml-mapping-convention.md
Normal file
|
|
@ -0,0 +1,306 @@
|
|||
# Rule 50: Ontology-to-LinkML Mapping Convention
|
||||
|
||||
🚨 **CRITICAL**: When mapping base ontology classes and predicates to LinkML schema elements, use LinkML's dedicated mapping properties as documented at https://linkml.io/linkml-model/latest/docs/mappings/
|
||||
|
||||
---
|
||||
|
||||
## 1. What "LinkML Mapping" Means in This Project
|
||||
|
||||
**"LinkML mapping"** refers specifically to:
|
||||
1. Connecting LinkML schema elements (classes, slots, enums) to external ontology URIs
|
||||
2. Using LinkML's built-in mapping properties (`class_uri`, `slot_uri`, `*_mappings`)
|
||||
3. Following SKOS-based vocabulary alignment standards
|
||||
|
||||
**LinkML mapping does NOT mean**:
|
||||
- Creating arbitrary crosswalks in spreadsheets
|
||||
- Writing prose descriptions of how concepts relate
|
||||
- Inventing custom `@context` JSON-LD mappings outside the schema
|
||||
|
||||
---
|
||||
|
||||
## 2. LinkML Mapping Property Reference
|
||||
|
||||
### Primary Identity Properties
|
||||
|
||||
| Property | Applies To | Purpose | Example |
|
||||
|----------|-----------|---------|---------|
|
||||
| `class_uri` | Classes | Primary RDF class URI | `class_uri: ore:Aggregation` |
|
||||
| `slot_uri` | Slots | Primary RDF predicate URI | `slot_uri: rico:hasOrHadHolder` |
|
||||
| `enum_uri` | Enums | Enum namespace URI | `enum_uri: hc:PlatformTypeEnum` |
|
||||
|
||||
### SKOS-Based Mapping Properties
|
||||
|
||||
These properties express **semantic relationships** to external ontology terms:
|
||||
|
||||
| Property | SKOS Predicate | Meaning | Use When |
|
||||
|----------|---------------|---------|----------|
|
||||
| `exact_mappings` | `skos:exactMatch` | **IDENTICAL meaning** | Different ontology, **SAME semantics** (interchangeable) |
|
||||
| `close_mappings` | `skos:closeMatch` | Very similar meaning | Similar but **NOT interchangeable** |
|
||||
| `related_mappings` | `skos:relatedMatch` | Semantically related | Broader conceptual relationship |
|
||||
| `narrow_mappings` | `skos:narrowMatch` | External term is more specific | Our element is broader than the external term |
| `broad_mappings` | `skos:broadMatch` | External term is more general | Our element is narrower than the external term |
|
||||
|
||||
### ⚠️ CRITICAL: `exact_mappings` Requires PRECISE Semantic Equivalence
|
||||
|
||||
**`exact_mappings` means the terms are INTERCHANGEABLE** - you could substitute one for the other in any context without changing meaning.
|
||||
|
||||
**Requirements for `exact_mappings`**:
|
||||
1. **Same definition**: Both terms must have equivalent definitions
|
||||
2. **Same scope**: Both terms cover the same set of instances
|
||||
3. **Same constraints**: Same domain/range restrictions apply
|
||||
4. **Bidirectional**: If A exactMatch B, then B exactMatch A
|
||||
|
||||
**DO NOT use `exact_mappings` when**:
|
||||
- One term is a subset of the other (use `narrow_mappings`/`broad_mappings`)
|
||||
- Terms are similar but have different scopes (use `close_mappings`)
|
||||
- Terms are related but not equivalent (use `related_mappings`)
|
||||
- You're uncertain about equivalence (default to `close_mappings`)
|
||||
|
||||
**Example - WRONG**:
|
||||
```yaml
# PersonProfile is NOT equivalent to foaf:Person
# PersonProfile is a structured document ABOUT a person, not the person themselves
exact_mappings:
  - foaf:Person  # ❌ WRONG - different semantics!
```
|
||||
|
||||
**Example - CORRECT**:
|
||||
```yaml
# foaf:Person and schema:Person ARE equivalent
# Both define "a person" with the same scope
exact_mappings:
  - schema:Person  # ✅ CORRECT - truly equivalent
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Mapping Workflow: Ontology → LinkML
|
||||
|
||||
### Step 1: Identify External Ontology Class/Predicate
|
||||
|
||||
Search the base ontology files in `data/ontology/`:
|
||||
|
||||
```bash
|
||||
# Find aggregation-related classes
|
||||
rg -i "aggregation|aggregate" data/ontology/*.ttl data/ontology/*.rdf data/ontology/*.owl
|
||||
|
||||
# Check specific ontology
|
||||
rg "rdfs:Class|owl:Class" data/ontology/ore.rdf | grep -i "aggregation"
|
||||
```
|
||||
|
||||
### Step 2: Determine Mapping Strength
|
||||
|
||||
| Scenario | Mapping Property |
|
||||
|----------|------------------|
|
||||
| **This IS that ontology class** (identity) | `class_uri` |
|
||||
| **Equivalent in another vocabulary** | `exact_mappings` |
|
||||
| **Similar concept, different scope** | `close_mappings` |
|
||||
| **Related but different granularity** | `narrow_mappings` / `broad_mappings` |
|
||||
| **Conceptually related** | `related_mappings` |
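
The table above can be read as a lookup. The sketch below assumes a person has already judged the relationship and passes it in as one of the hypothetical labels shown; the function name and labels are illustrative only. The `certain` flag encodes the guidance to default to `close_mappings` when equivalence is in doubt. For the two granularity rows the direction follows LinkML's own definitions: `narrow_mappings` lists external terms with a narrower meaning, `broad_mappings` lists external terms with a broader meaning.

```python
# Hypothetical relation labels mapped to the properties from the Step 2 table.
MAPPING_PROPERTY = {
    "identity": "class_uri",          # use slot_uri when mapping a slot
    "equivalent": "exact_mappings",
    "similar": "close_mappings",
    "external_narrower": "narrow_mappings",
    "external_broader": "broad_mappings",
    "related": "related_mappings",
}


def choose_mapping_property(relation, certain=True):
    """Return the LinkML mapping property for an already-judged relationship."""
    if relation not in MAPPING_PROPERTY:
        raise ValueError(f"unknown relation: {relation!r}")
    # Uncertain equivalence is downgraded instead of asserted.
    if relation == "equivalent" and not certain:
        return "close_mappings"
    return MAPPING_PROPERTY[relation]


print(choose_mapping_property("equivalent"))
print(choose_mapping_property("equivalent", certain=False))
```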
|
||||
|
||||
### Step 3: Document Mapping in LinkML Schema
|
||||
|
||||
#### For Classes
|
||||
|
||||
```yaml
classes:
  DataAggregator:
    class_uri: ore:Aggregation  # Primary identity - THIS IS an ORE Aggregation
    description: |
      A platform that harvests and STORES copies of metadata/content, causing data duplication.

      ore:Aggregation - "A set of related resources grouped together."

      Mapped to ORE because aggregators create aggregations of harvested metadata.
    close_mappings:
      - dcat:Catalog  # Similar (collects datasets) but broader scope
    narrow_mappings:
      - edm:EuropeanaAggregation  # Europeana's specialization, so narrower, not exact
      - edm:ProvidedCHO  # More specific (single cultural object)
```
|
||||
|
||||
#### For Slots
|
||||
|
||||
```yaml
slots:
  aggregates_from:
    slot_uri: ore:aggregates  # Primary predicate
    description: |
      Institutions whose data is aggregated (harvested and stored) by this platform.

      ore:aggregates - "Aggregations assert ore:aggregates relationships."
    exact_mappings:
      - edm:aggregatedCHO  # Europeana equivalent
    range: HeritageCustodian
    multivalued: true
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Aggregation vs. Linking: A Mapping Example
|
||||
|
||||
This project requires **semantic precision** in distinguishing:
|
||||
|
||||
| Concept | Primary Mapping | Semantic Pattern |
|
||||
|---------|-----------------|------------------|
|
||||
| **Data Aggregation** | `ore:Aggregation` | Data is COPIED to aggregator's server |
|
||||
| **Linking/Federation** | `dcat:DataService` | Data REMAINS at source; only links provided |
|
||||
|
||||
### Aggregation Pattern (Data Duplication)
|
||||
|
||||
```yaml
classes:
  DataAggregator:
    class_uri: ore:Aggregation
    description: |
      Harvests and stores copies of metadata from partner institutions.

      Key semantic: Data DUPLICATION occurs - the aggregator maintains its own copy.

      Examples: Europeana, DPLA, Archives Portal Europe
    narrow_mappings:
      - edm:EuropeanaAggregation  # Specialization of ore:Aggregation, so not an exact match
    annotations:
      data_storage_pattern: AGGREGATION
      causes_data_duplication: true
```
|
||||
|
||||
### Linking Pattern (Single Source of Truth)
|
||||
|
||||
```yaml
classes:
  FederatedDiscoveryPortal:
    class_uri: dcat:DataService
    description: |
      Provides unified search across multiple institutions but LINKS to original sources.

      Key semantic: NO data duplication - users are redirected to source institutions.

      Data remains at partner institutions' platforms (single source of truth).
    close_mappings:
      - schema:SearchAction  # The search functionality
    related_mappings:
      - ore:Aggregation  # Related but crucially different
    annotations:
      data_storage_pattern: LINKING
      causes_data_duplication: false
```
|
||||
|
||||
### Linking Properties from EDM
|
||||
|
||||
Use `edm:isShownAt` and `edm:isShownBy` to express links to source:
|
||||
|
||||
```yaml
slots:
  is_shown_at:
    slot_uri: edm:isShownAt
    description: |
      Unambiguous URL to the digital object on the provider's web site
      in its full information context.

      edm:isShownAt - "The URL of a web view of the object in full context."

      This property LINKS to the source institution - no data duplication.
    range: uri

  is_shown_by:
    slot_uri: edm:isShownBy
    description: |
      Direct URL to the object in best available resolution on provider's site.

      edm:isShownBy - "The URL of the object itself (not the context page)."
    range: uri
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Complete Mapping Documentation Template
|
||||
|
||||
When creating or updating a class with ontology mappings:
|
||||
|
||||
```yaml
classes:
  MyNewClass:
    # === PRIMARY IDENTITY ===
    class_uri: {prefix}:{ClassName}  # The ontology class this IS

    # === DESCRIPTION WITH ONTOLOGY REFERENCE ===
    description: |
      {Human-readable description of what this class represents}

      {Ontology}: {class} - "{Definition from ontology documentation}"

      Mapping rationale:
      - Chosen because: {why this ontology class fits}
      - Not using X because: {why alternatives were rejected}

    # === SKOS-BASED MAPPINGS ===
    exact_mappings:
      - {prefix}:{EquivalentClass}  # Same meaning, different vocabulary
    close_mappings:
      - {prefix}:{SimilarClass}  # Very similar but not identical
    narrow_mappings:
      - {prefix}:{MoreSpecificClass}  # External term is narrower than ours
    broad_mappings:
      - {prefix}:{MoreGeneralClass}  # External term is broader than ours
    related_mappings:
      - {prefix}:{RelatedClass}  # Conceptually related

    # === OPTIONAL ANNOTATIONS ===
    annotations:
      ontology_source: "{Full name of source ontology}"
      ontology_version: "{Version if applicable}"
      mapping_confidence: "high|medium|low"
      mapping_notes: "{Additional context}"
```
|
||||
|
||||
---
|
||||
|
||||
## 6. Validation Checklist
|
||||
|
||||
Before committing ontology mappings:
|
||||
|
||||
- [ ] `class_uri` / `slot_uri` points to a real URI in `data/ontology/` files
|
||||
- [ ] Description includes ontology definition (quoted from source)
|
||||
- [ ] Mapping rationale documented for non-obvious choices
|
||||
- [ ] `exact_mappings` used ONLY for truly equivalent terms
|
||||
- [ ] `close_mappings` documented with difference explanation
|
||||
- [ ] All prefixes declared in schema's `prefixes:` block
|
||||
- [ ] Prefixes resolve to valid ontology namespaces
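
The "all prefixes declared" item on this checklist lends itself to an automated check. The sketch below is an illustration only: it assumes the schema has already been loaded into a plain dictionary (for example by a YAML parser), and the function name `undeclared_prefixes` is hypothetical. It does not verify that a URI actually exists in `data/ontology/`; that still requires reading the ontology files.

```python
MAPPING_KEYS = (
    "exact_mappings", "close_mappings", "related_mappings",
    "narrow_mappings", "broad_mappings",
)


def undeclared_prefixes(schema):
    """Return prefixes used in class/slot URIs and mappings but missing from `prefixes`."""
    declared = set(schema.get("prefixes") or {})
    used = set()
    for section, uri_key in (("classes", "class_uri"), ("slots", "slot_uri")):
        for element in (schema.get(section) or {}).values():
            element = element or {}
            curies = [element.get(uri_key)]
            for key in MAPPING_KEYS:
                curies.extend(element.get(key) or [])
            for curie in curies:
                # Skip missing values and full IRIs; only CURIEs carry a prefix.
                if not curie or curie.startswith(("http://", "https://")):
                    continue
                prefix, sep, _ = curie.partition(":")
                if sep:
                    used.add(prefix)
    return sorted(used - declared)


# Hypothetical schema fragment: `dcat` is used but never declared.
example = {
    "prefixes": {"ore": "http://www.openarchives.org/ore/terms/"},
    "classes": {
        "DataAggregator": {
            "class_uri": "ore:Aggregation",
            "close_mappings": ["dcat:Catalog"],
        }
    },
}
print(undeclared_prefixes(example))
```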
|
||||
|
||||
---
|
||||
|
||||
## 7. Common Ontology Prefixes for Mappings
|
||||
|
||||
| Prefix | Namespace | Ontology | Use For |
|
||||
|--------|-----------|----------|---------|
|
||||
| `ore:` | `http://www.openarchives.org/ore/terms/` | OAI-ORE | Aggregation patterns |
|
||||
| `edm:` | `http://www.europeana.eu/schemas/edm/` | Europeana Data Model | Cultural heritage aggregation |
|
||||
| `dcat:` | `http://www.w3.org/ns/dcat#` | DCAT | Data catalogs, services |
|
||||
| `rico:` | `https://www.ica.org/standards/RiC/ontology#` | Records in Contexts | Archival description |
|
||||
| `crm:` | `http://www.cidoc-crm.org/cidoc-crm/` | CIDOC-CRM | Cultural heritage events |
|
||||
| `schema:` | `http://schema.org/` | Schema.org | Web semantics |
|
||||
| `skos:` | `http://www.w3.org/2004/02/skos/core#` | SKOS | Concepts, labels |
|
||||
| `dcterms:` | `http://purl.org/dc/terms/` | Dublin Core | Metadata properties |
|
||||
| `prov:` | `http://www.w3.org/ns/prov#` | PROV-O | Provenance |
|
||||
| `org:` | `http://www.w3.org/ns/org#` | W3C Organization | Organizations |
|
||||
| `foaf:` | `http://xmlns.com/foaf/0.1/` | FOAF | People, agents |
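
To make the role of the prefix table concrete, here is a small sketch of CURIE expansion using three namespaces copied from the table. The helper name `expand_curie` is hypothetical; LinkML tooling performs this expansion itself, so this only illustrates what a declared prefix resolves to.

```python
# Namespaces copied from the prefix table; extend as needed.
PREFIXES = {
    "ore": "http://www.openarchives.org/ore/terms/",
    "dcat": "http://www.w3.org/ns/dcat#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand `prefix:local` to a full IRI, failing loudly on unknown prefixes."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"undeclared prefix in {curie!r}")
    return prefixes[prefix] + local


print(expand_curie("ore:Aggregation"))
```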
|
||||
|
||||
---
|
||||
|
||||
## See Also
|
||||
|
||||
- [LinkML Mappings Documentation](https://linkml.io/linkml-model/latest/docs/mappings/)
|
||||
- [LinkML URIs and Mappings Guide](https://linkml.io/linkml/schemas/uris-and-mappings.html)
|
||||
- [LinkML class_uri Reference](https://linkml.io/linkml-model/latest/docs/class_uri/)
|
||||
- [LinkML slot_uri Reference](https://linkml.io/linkml-model/latest/docs/slot_uri/)
|
||||
- Rule 1: Ontology Files Are Your Primary Reference
|
||||
- Rule 38: Slot Centralization and Semantic URI Requirements
|
||||
- Rule 42: No Ontology Prefixes in Slot Names
|
||||
|
||||
---
|
||||
|
||||
**Version**: 1.0.0
|
||||
**Created**: 2026-01-12
|
||||
**Author**: OpenCODE
|
||||
45
.opencode/rules/linkml/polished-slot-storage-location.md
Normal file
|
|
@ -0,0 +1,45 @@
|
|||
# Rule: Polished Slot Storage Location
|
||||
|
||||
## Summary
|
||||
|
||||
Polished (refactored) canonical slot files MUST be stored in the parent `slots/` directory:
|
||||
|
||||
```
|
||||
schemas/20251121/linkml/modules/slots/
|
||||
```
|
||||
|
||||
They must **NOT** be stored in the `20260202_matang/` subdirectory.
|
||||
|
||||
## Rationale
|
||||
|
||||
The `new/` subdirectory contains **draft/unpolished** slot definitions that are pending review. Once a slot file has been polished (ontology-aligned, translated, cleaned), it graduates to the canonical `slots/` directory.
|
||||
|
||||
## Directory Structure
|
||||
|
||||
```
schemas/20251121/linkml/modules/slots/
├── *.yaml              ← Polished canonical slot files go HERE
└── 20260202_matang/
    ├── *.yaml          ← Draft/unpolished canonical slots (staging area)
    └── new/
        └── *.yaml      ← Raw/draft slot definitions pending triage
```
|
||||
|
||||
## Rule
|
||||
|
||||
- When polishing a slot file, write the result to `schemas/20251121/linkml/modules/slots/{slot_name}.yaml`
|
||||
- If the source file was in `20260202_matang/`, remove it from there after writing to `slots/`
|
||||
- If the source file was in `20260202_matang/new/`, it should only be deleted after user confirmation of alias absorption (per the no-autonomous-alias-assignment rule)
|
||||
- If a file already exists in `slots/` (i.e., it was previously polished in an earlier session), overwrite it in place
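
The four bullets above amount to a small decision procedure. The following sketch illustrates it with path arithmetic only; it touches no files, and the function names and returned action strings are hypothetical.

```python
from pathlib import PurePosixPath

SLOTS_DIR = PurePosixPath("schemas/20251121/linkml/modules/slots")
STAGING = "20260202_matang"


def polished_destination(source):
    """Polished files always land directly in the canonical slots/ directory."""
    return SLOTS_DIR / PurePosixPath(source).name


def source_cleanup(source):
    """Say what to do with the source file once the polished copy is written."""
    parts = PurePosixPath(source).relative_to(SLOTS_DIR).parts
    if parts[:2] == (STAGING, "new"):
        return "delete only after user confirmation"
    if parts[:1] == (STAGING,):
        return "remove"
    # The source already lives in slots/: the write overwrote it in place.
    return "overwrite in place"


draft = "schemas/20251121/linkml/modules/slots/20260202_matang/has_pattern.yaml"
print(polished_destination(draft))
print(source_cleanup(draft))
```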
|
||||
|
||||
## Examples
|
||||
|
||||
**CORRECT:**
|
||||
```
|
||||
schemas/20251121/linkml/modules/slots/has_pattern.yaml ← polished file
|
||||
```
|
||||
|
||||
**WRONG:**
|
||||
```
|
||||
schemas/20251121/linkml/modules/slots/20260202_matang/has_pattern.yaml ← should not be here after polishing
|
||||
```
|
||||
|
|
@ -0,0 +1,32 @@
|
|||
# Rule: Preserve Bespoke Slots Until Refactoring
|
||||
|
||||
**Identifier**: `preserve-bespoke-slots-until-refactoring`
|
||||
**Severity**: **CRITICAL**
|
||||
|
||||
## Core Directive
|
||||
|
||||
**DO NOT remove or migrate "additional" bespoke slots during generic migration passes unless they are the specific target of the current task.**
|
||||
|
||||
## Context
|
||||
|
||||
When migrating a specific slot (e.g., `has_approval_date`), you may encounter other bespoke or legacy slots in the same class file (e.g., `innovation_budget`, `operating_budget`).
|
||||
|
||||
**YOU MUST**:
|
||||
* ✅ Migrate ONLY the specific slot you were instructed to work on.
|
||||
* ✅ Leave other bespoke slots exactly as they are.
|
||||
* ✅ Focus strictly on the current migration target.
|
||||
|
||||
**YOU MUST NOT**:
|
||||
* ❌ Proactively migrate "nearby" slots just because they look like they need refactoring.
|
||||
* ❌ Remove slots that seem unused or redundant without specific instruction.
|
||||
* ❌ "Clean up" the class file by removing legacy attributes.
|
||||
|
||||
## Rationale
|
||||
|
||||
Refactoring is a separate, planned phase. Mixing opportunistic refactoring with systematic slot migration increases the risk of regression and makes changes harder to review. "We will refactor those later."
|
||||
|
||||
## Workflow
|
||||
|
||||
1. **Identify Target**: Identify the specific slot(s) assigned for migration (from `slot_fixes.yaml` or user prompt).
|
||||
2. **Execute Migration**: Apply changes ONLY for those slots.
|
||||
3. **Ignore Others**: Do not touch other slots in the file, even if they violate other rules (like Rule 39 or Rule 53). Those will be handled in their own dedicated tasks.
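
The workflow can be pictured as a rename that is driven only by the assigned targets. This is a minimal sketch: `migrate_slots` and the replacement name `is_or_was_approved_on` are hypothetical, chosen purely to illustrate that untargeted slots pass through unchanged.

```python
def migrate_slots(class_slots, renames):
    """Rename only the targeted slots; every other slot is returned untouched."""
    return [renames.get(slot, slot) for slot in class_slots]


slots = ["has_approval_date", "innovation_budget", "operating_budget"]
# Only has_approval_date is the assigned target (new name is a placeholder).
result = migrate_slots(slots, {"has_approval_date": "is_or_was_approved_on"})
print(result)
```

Because the function never inspects slots outside `renames`, there is no code path through which a "nearby" bespoke slot could be removed or rewritten.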
|
||||
190
.opencode/rules/linkml/semantic-consistency-over-simplicity.md
Normal file
|
|
@ -0,0 +1,190 @@
|
|||
# Rule 56: Semantic Consistency Over Simplicity - Always Execute slot_fixes.yaml Revisions
|
||||
|
||||
🚨 **CRITICAL**: When `slot_fixes.yaml` specifies a revision, agents MUST execute it. Perceived simplicity of the existing slot is NOT a valid reason to reject a migration.
|
||||
|
||||
## The Core Problem
|
||||
|
||||
Previous agents marked migrations as "NO MIGRATION NEEDED" or "WELL_STRUCTURED_NO_MIGRATION_NEEDED" citing reasons like:
|
||||
- "simple enum appropriate"
|
||||
- "would add unnecessary indirection"
|
||||
- "already has proper slot_uri"
|
||||
- "string range is sufficient for this use case"
|
||||
- "well-structured, no migration needed"
|
||||
- "current implementation is adequate"
|
||||
|
||||
**These judgments were INCORRECT.**
|
||||
|
||||
**Over 95% of entries in `slot_fixes.yaml` represent carefully curated migrations that MUST be executed.**
|
||||
|
||||
The revisions were manually designed based on:
|
||||
- Ontology analysis (CIDOC-CRM, RiC-O, PROV-O, Schema.org alignment)
|
||||
- Semantic correctness
|
||||
- Pattern consistency (Rule 39: RiC-O style naming)
|
||||
- Type/Types class hierarchy design (Rule 0b)
|
||||
|
||||
## Why Revisions MUST Be Executed
|
||||
|
||||
| Principle | Explanation |
|
||||
|-----------|-------------|
|
||||
| **Schema Consistency** | Ontology achieves semantic power through consistent patterns, not local optimizations |
|
||||
| **LinkML Mapping Separation** | `slot_uri` handles external ontology alignment; slot structure handles internal consistency |
|
||||
| **Single Responsibility Principle** | Predicates should have single, focused purposes |
|
||||
| **Extensibility First** | Structured classes enable future extension even if current use is simple |
|
||||
| **Curated Quality** | Revisions were manually designed with ontology expertise - trust them |
|
||||
|
||||
## Invalid Reasons to Reject Migrations
|
||||
|
||||
| Rejected Reason | Why It's Invalid |
|
||||
|-----------------|------------------|
|
||||
| "Already has proper slot_uri" | slot_uri is for external mapping; internal structure is separate concern |
|
||||
| "Simple string/enum is sufficient" | Consistency and extensibility trump local simplicity |
|
||||
| "Would add unnecessary indirection" | Indirection enables reuse and future extension |
|
||||
| "Creating a class would over-engineer" | Ontology design favors class-based modeling |
|
||||
| "Well-structured, no migration needed" | **If revision exists, migration IS needed** |
|
||||
| "Current implementation is adequate" | Adequacy is not the standard; consistency is |
|
||||
| "WELL_STRUCTURED_NO_MIGRATION_NEEDED" | **This note type is PROHIBITED** |
|
||||
|
||||
## Valid Reasons to PAUSE Migrations (Not Reject)
|
||||
|
||||
These warrant discussion with the user, NOT unilateral rejection:
|
||||
|
||||
1. **Semantic Conflict**: Proposed slot_uri contradicts the semantic intent of the slot
|
||||
2. **Class Already Exists**: The target class exists under a different name (propose merge)
|
||||
3. **Circular Dependency**: Migration would create import cycles
|
||||
4. **Breaking External API**: Change would break published external consumers
|
||||
|
||||
**Even these reasons require escalation to the user** - agents should NOT mark as "NO MIGRATION NEEDED".
|
||||
|
||||
## The Key Insight
|
||||
|
||||
Agents confused "has good external mapping" with "needs no migration". These are **independent concerns**:
|
||||
|
||||
- **External mapping** (slot_uri, exact_mappings): How this slot relates to external ontologies
|
||||
- **Internal structure** (range, class design): How this slot fits within the LinkML schema
|
||||
|
||||
A slot can have perfect external mappings AND still need structural migration for internal consistency.
|
||||
|
||||
## Correct Agent Behavior
|
||||
|
||||
```yaml
|
||||
# When encountering a slot_fixes.yaml entry:
|
||||
|
||||
1. READ the revision specification completely
|
||||
2. DO NOT evaluate whether migration is "necessary" - it IS necessary
|
||||
3. CREATE the specified slots and classes
|
||||
4. UPDATE the consuming class(es)
|
||||
5. ARCHIVE the old slot
|
||||
6. ADD processed section with status: true
|
||||
|
||||
# NEVER add notes like:
|
||||
# - "NO MIGRATION NEEDED"
|
||||
# - "WELL_STRUCTURED_NO_MIGRATION_NEEDED"
|
||||
# - "Current implementation is sufficient"
|
||||
# - "Would add unnecessary complexity"
|
||||
# - "Simple enum appropriate"
|
||||
```
|
||||
|
||||
## Handling Previous "NO MIGRATION NEEDED" Entries
|
||||
|
||||
Entries previously marked with notes like "WELL_STRUCTURED_NO_MIGRATION_NEEDED" should be:
|
||||
|
||||
1. **Re-processed**: The migration should be executed as specified in `revision`
|
||||
2. **Notes replaced**: Replace the incorrect notes with actual migration documentation
|
||||
3. **Status verified**: Ensure `status: true` reflects ACTUAL migration, not skipped migration
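
Finding the entries that need re-processing can be partly automated. The sketch below assumes `slot_fixes.yaml` has already been parsed into a list of dictionaries with the `revision` and `processed` keys used in this rule; the function name and the marker list are illustrative and may need extending to cover other phrasings.

```python
# Markers taken from the prohibited notes named in this rule.
PROHIBITED_MARKERS = ("NO MIGRATION NEEDED", "WELL_STRUCTURED_NO_MIGRATION_NEEDED")


def needs_reprocessing(entry):
    """True when an entry has a revision but its notes claim no migration was needed."""
    processed = entry.get("processed") or {}
    notes = (processed.get("notes") or "").upper()
    has_revision = bool(entry.get("revision"))
    return has_revision and any(marker in notes for marker in PROHIBITED_MARKERS)


skipped = {
    "revision": [{"label": "is_or_was_listed_in", "type": "slot"}],
    "processed": {
        "status": True,
        "notes": "WELL_STRUCTURED_NO_MIGRATION_NEEDED: Already has proper slot_uri",
    },
}
print(needs_reprocessing(skipped))
```

A flagged entry still has to be migrated by hand as specified in its `revision`; the check only locates candidates.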
|
||||
|
||||
## Example - WRONG Agent Behavior
|
||||
|
||||
```yaml
# WRONG - Agent decided migration wasn't needed
- original_slot_id: https://nde.nl/ontology/hc/slot/cites_appendix
  revision:
    - label: is_or_was_listed_in
      type: slot
    - label: CITESAppendix
      type: class
  processed:
    status: true  # ← Marked complete but NOT actually migrated!
    notes: "WELL_STRUCTURED_NO_MIGRATION_NEEDED: Already has proper slot_uri
      and string range is sufficient for CITES appendix values."
```
|
||||
|
||||
## Example - CORRECT Agent Behavior
|
||||
|
||||
```yaml
# CORRECT - Agent executed the migration as specified
- original_slot_id: https://nde.nl/ontology/hc/slot/cites_appendix
  revision:
    - label: is_or_was_listed_in
      type: slot
    - label: CITESAppendix
      type: class
  processed:
    status: true
    timestamp: '2026-01-19T00:00:00Z'
    session: session-2026-01-19-cites-appendix-migration
    notes: 'Migrated 2026-01-19 per Rule 53/56. Created is_or_was_listed_in.yaml.
      Created CITESAppendix.yaml class. Updated BiologicalObject.yaml.
      Archived: modules/slots/archive/cites_appendix_archived_20260119.yaml.'
```
|
||||
|
||||
## Feedback Field
|
||||
|
||||
The `feedback` field in slot_fixes.yaml entries contains user corrections to agent mistakes. When feedback says things like:
|
||||
|
||||
- "I reject this!"
|
||||
- "Conduct the migration"
|
||||
- "Please conduct accordingly"
|
||||
- "I altered the revision"
|
||||
|
||||
This means a previous agent incorrectly deferred the migration, and it MUST now be executed.
|
||||
|
||||
## Schema Consistency Examples
|
||||
|
||||
### Why "Simple URI is fine" is WRONG
|
||||
|
||||
```yaml
# WRONG - Agent judgment: "Simple URI is fine"
thumbnail_url:
  range: uri
  slot_uri: schema:thumbnailUrl

# CORRECT - Consistent with all media references
has_or_had_thumbnail:
  range: Thumbnail  # Thumbnail class with has_or_had_url → URL
```
|
||||
|
||||
**Rationale**: All media references (images, thumbnails, videos, documents) should use the same structural pattern.
|
||||
|
||||
### Why "Simple enum is appropriate" is WRONG
|
||||
|
||||
```yaml
# WRONG - "Simple enum is fine"
thinking_mode:
  range: ThinkingModeEnum  # enabled, disabled, interleaved

# CORRECT - Enables extension
has_or_had_mode:
  range: ThinkingMode
  # ThinkingMode can have: mode_type, confidence, effective_date, etc.
```
|
||||
|
||||
**Rationale**: Even if current use is simple, structured classes enable future extension without breaking changes.
|
||||
|
||||
## Summary
|
||||
|
||||
**Trust the revision. Execute the migration. Document the work.**
|
||||
|
||||
The `revision` key in `slot_fixes.yaml` represents carefully curated ontology decisions. Agents are **executors** of these decisions, **not evaluators**. The only acceptable output is a completed migration with proper documentation.
|
||||
|
||||
## Related Rules
|
||||
|
||||
- **Rule 53**: slot_fixes.yaml is AUTHORITATIVE - Full Slot Migration
|
||||
- **Rule 55**: Broaden Generic Predicate Ranges Instead of Creating Bespoke Predicates
|
||||
- **Rule 57**: The revision key in slot_fixes.yaml is IMMUTABLE
|
||||
- **Rule 39**: RiC-O Temporal Naming Conventions
|
||||
- **Rule 38**: Slot Centralization and Semantic URI Requirements
|
||||
|
||||
## Revision History
|
||||
|
||||
- 2026-01-19: Strengthened with explicit prohibition of "WELL_STRUCTURED_NO_MIGRATION_NEEDED" notes
|
||||
- 2026-01-16: Created based on analysis of 51 feedback entries in slot_fixes.yaml
|
||||
|
|
@ -1,48 +0,0 @@
|
|||
# Rule: No Tool-Specific Classes
|
||||
|
||||
## Critical Convention
|
||||
|
||||
Ontology classes MUST be domain concepts, not wrappers for specific software tools.
|
||||
|
||||
## Rule
|
||||
|
||||
1. Do not model vendor/tool names as primary class concepts.
|
||||
- Reject classes like `ExaSearchMetadata`, `OpenAIFetchResult`, `ElasticsearchHit`.
|
||||
|
||||
2. Model the generic domain activity or entity instead.
|
||||
- Use names like `ExternalSearchMetadata`, `RetrievalActivity`, `SearchResult`.
|
||||
|
||||
3. Capture tool provenance through generic slots and values.
|
||||
- Use `has_tool`, `has_method`, `has_agent`, `has_note` to record implementation details.
|
||||
|
||||
4. Platform custodians are allowed as domain classes.
|
||||
- Classes for digital platforms that act as custodians (for example YouTube-related custodian classes) are valid.
|
||||
- Operational tools used to query/process data are not valid ontology classes.
|
||||
|
||||
## Rationale
|
||||
|
||||
- Tool names are implementation details and change faster than domain semantics.
|
||||
- Tool-specific classes reduce reuse and interoperability.
|
||||
- Generic classes preserve stable meaning while still supporting full provenance.
|
||||
|
||||
## Examples
|
||||
|
||||
### Wrong
|
||||
|
||||
```yaml
classes:
  ExaSearchMetadata:
    class_uri: prov:Activity
```
|
||||
|
||||
### Correct
|
||||
|
||||
```yaml
classes:
  ExternalSearchMetadata:
    class_uri: prov:Activity
    slots:
      - has_tool
      - has_method
      - has_agent
```
|
||||
|
|
@ -12,7 +12,7 @@ They must **NOT** be stored in the `20260202_matang/` subdirectory.
|
|||
|
||||
## Rationale
|
||||
|
||||
The `20260202_matang/` directory and its `new/` subdirectory contain **draft/unpolished** slot definitions that are pending review. Once a slot file has been polished (ontology-aligned, translated, cleaned), it graduates to the canonical `slots/` directory.
|
||||
The `new/` subdirectory contains **draft/unpolished** slot definitions that are pending review. Once a slot file has been polished (ontology-aligned, translated, cleaned), it graduates to the canonical `slots/` directory.
|
||||
|
||||
## Directory Structure
|
||||
|
||||
|
|
|
|||
|
|
@ -1,5 +1,5 @@
|
|||
{
|
||||
"generated": "2026-02-15T15:25:32.418Z",
|
||||
"generated": "2026-02-15T17:46:11.976Z",
|
||||
"schemaRoot": "/schemas/20251121/linkml",
|
||||
"totalFiles": 2369,
|
||||
"categoryCounts": {
|
||||
|
|
|
|||
|
|
@ -19,7 +19,7 @@ description: |
|
|||
- provenance: Data tier tracking and source lineage
|
||||
- ghcid: Global Heritage Custodian ID with history
|
||||
- identifiers: ISIL, Wikidata, GHCID variants
|
||||
- enrichments: Google Maps, Wikidata, Genealogiewerkbalk, etc.
|
||||
- enrichments: Google Maps, Wikidata, genealogy archive registries, etc.
|
||||
- web_claims: Extracted claims with XPath provenance
|
||||
- custodian_name: Consensus name determination
|
||||
- location: Normalized geographic data
|
||||
|
|
@ -153,7 +153,7 @@ imports:
|
|||
- ./modules/classes/MergeNote
|
||||
# Dutch Enrichments Domain
|
||||
- ./modules/classes/ArchiveInfo
|
||||
- ./modules/classes/GenealogiewerkbalkEnrichment
|
||||
- ./modules/classes/GenealogyArchivesRegistryEnrichment
|
||||
- ./modules/classes/IsilCodeEntry
|
||||
- ./modules/classes/MunicipalityInfo
|
||||
- ./modules/classes/NanIsilEnrichment
|
||||
|
|
|
|||
|
|
@ -1,5 +1,5 @@
|
|||
{
|
||||
"generated": "2026-02-15T17:46:11.976Z",
|
||||
"generated": "2026-02-15T18:20:10.034Z",
|
||||
"schemaRoot": "/schemas/20251121/linkml",
|
||||
"totalFiles": 2369,
|
||||
"categoryCounts": {
|
||||
|
|
|
|||
|
|
@ -34,12 +34,10 @@ default_prefix: hc
|
|||
classes:
|
||||
DonationScheme:
|
||||
class_uri: schema:DonateAction
|
||||
description: "A donation or giving scheme offered by a heritage custodian institution.\n\n**PURPOSE**:\n\nDonationScheme provides structured representation of the various ways\nindividuals and organizations can financially support heritage institutions.\nThese range from simple one-time donations to complex membership programs,\nadoption schemes, patron circles, and legacy giving vehicles.\n\n**HERITAGE SECTOR CONTEXT**:\n\nDonation schemes are critical for heritage institution sustainability:\n\n- **Museums**: Friends schemes, patron circles, acquisition fund drives\n- **Libraries**: Adopt-a-book programs, conservation appeals\n- **Archives**: \"Adopt history\" programs, preservation sponsorships\n- **Galleries**: Artist support funds, exhibition sponsorships\n- **Historical societies**: Heritage membership, research fellowships\n- **Botanical gardens**: Plant and animal adoption programs\n\n**MULTILINGUAL TERMINOLOGY**:\n\n\"Friends\" scheme terminology varies by country:\n- Dutch:\
|
||||
\ Museumvriend, Vrienden van het museum\n- German: F\xF6rderverein, Freundeskreis\n- French: Amis du mus\xE9e, Soci\xE9t\xE9 des amis\n- Spanish: Amigos del museo\n- Italian: Amici del museo\n\n**PROVENANCE CHAIN**:\n\n```\nHeritageCustodian\n \u2502\n \u251C\u2500\u2500 offers_donation_schemes \u2500\u2500\u2192 DonationScheme[]\n \u2502 \u2502\n \u2502 \u251C\u2500\u2500 scheme_type: MEMBERSHIP_FRIENDS\n \u2502 \u251C\u2500\u2500 scheme_name: \"Rijksmuseum Vrienden\"\n \u2502 \u251C\u2500\u2500 minimum_amount: 60\n \u2502 \u251C\u2500\u2500 currency: \"EUR\"\n \u2502 \u251C\u2500\u2500 payment_frequency: \"annually\"\n \u2502 \u2502\n \u2502 \u2514\u2500\u2500 observed_in\
|
||||
\ \u2500\u2500\u2192 WebObservation\n \u2502 \u2502\n \u2502 \u251C\u2500\u2500 source_url: https://rijksmuseum.nl/steun\n \u2502 \u251C\u2500\u2500 retrieved_on: 2026-01-01T10:00:00Z\n \u2502 \u2514\u2500\u2500 extraction_confidence: 0.95\n \u2502\n \u2514\u2500\u2500 web_observations \u2500\u2500\u2192 WebObservation[] (general custodian provenance)\n```\n\n**ONTOLOGY ALIGNMENT**:\n\n- **Schema.org**: `schema:DonateAction` - Action of donating to organization\n- **Schema.org**: `schema:Offer` - Scheme as offer with price specification\n- **W3C Org**: `org:Membership` - For membership-type schemes\n- **Dublin Core**: `dcterms:isPartOf` - Scheme belongs to institution\n- **PROV-O**: `prov:wasDerivedFrom` - Links scheme to observation\n\
|
||||
\n**TAX INCENTIVE SCHEMES**:\n\nMany countries provide tax benefits for cultural donations:\n\n| Country | Scheme | Benefit |\n|---------|--------|---------|\n| Netherlands | ANBI | 100% deductible |\n| Netherlands | Cultural ANBI | 125% deductible (extra 25%) |\n| UK | Gift Aid | 25% tax reclaim for charity |\n| UK | Cultural Gifts Scheme | Tax relief on objects donated |\n| USA | 501(c)(3) | Itemized deduction |\n| Germany | Gemeinn\xFCtzigkeit | Tax deductible |\n| France | M\xE9c\xE9nat culturel | 60% tax reduction |\n\n**SCHEME CATEGORIES**:\n\nSchemes are classified via DonationSchemeTypeEnum into eight categories:\n\n1. **MEMBERSHIP_*** - Recurring membership/subscription\n - Friends, Young Friends, Family, Corporate, Research Fellow\n \n2. **PATRON_*** - High-value donor circles\n - Circle, Benefactor, Founders Circle, Life, National\n \n3. **ADOPTION_*** - Object sponsorship\n - Book, Artifact, Archive Collection, Artwork, Animal, Plant\n \n4. **LEGACY_*** - Planned/estate\
\ giving\n - Bequest, Charitable Trust, Endowment, Named Fund\n \n5. **DONATION_*** - Direct monetary gifts\n - One-off, Recurring, Appeal, Project, Tax Incentive\n \n6. **INKIND_*** - Non-monetary contributions\n - Object, Artwork, Archive, Library Collection, Expertise, Volunteer\n \n7. **SPONSORSHIP_*** - Corporate/event support\n - Exhibition, Gallery, Event, Program, Digitization, Conservation\n \n8. **CROWDFUNDING_*** - Campaign-based collective funding\n - Acquisition, Conservation, Building, Exhibition\n\n**EXTRACTION PATTERN**:\n\nWhen extracting donation schemes from institutional websites:\n\n1. Create WebObservation for the support/donate page\n2. For each scheme found:\n - Create DonationScheme with observed_in \u2192 WebObservation\n - Classify using DonationSchemeTypeEnum\n - Extract financial details (amounts, currency, frequency)\n - List benefits provided to donors\n - Note tax deductibility and applicable schemes\n - Assign extraction_confidence\
\ based on clarity\n\n**EXAMPLES**:\n\nSee class examples section for detailed instances.\n"
description: >-
Structured representation of an institutional giving program, including
donation type, financial thresholds, payment frequency, donor benefits,
tax treatment, provider organization, and source observation.
alt_descriptions:
nl: {text: Gestructureerd model van institutionele geefregelingen met bijdragevorm, voordelen, voorwaarden en toezicht., language: nl}
de: {text: Strukturiertes Modell institutioneller Spendenprogramme mit Beitragsform, Vorteilen, Bedingungen und Aufsicht., language: de}
@ -53,6 +51,7 @@ classes:
de: [{literal_form: Spendenprogramm, language: de}]
fr: [{literal_form: dispositif de don, language: fr}]
es: [{literal_form: esquema de donacion, language: es}]
it: [{literal_form: programma di donazione, language: it}]
ar: [{literal_form: برنامج تبرع, language: ar}]
id: [{literal_form: skema donasi, language: id}]
zh: [{literal_form: 捐赠计划, language: zh}]
@ -98,6 +97,8 @@ classes:
has_type:
required: true
range: DonationSchemeTypeEnum
description: Classification for the scheme modality, including membership, patron,
adoption, legacy, direct donation, in-kind, sponsorship, and crowdfunding families.
examples:
- value: MEMBERSHIP_FRIENDS
- value: ADOPTION_BOOK
@ -151,6 +152,7 @@ classes:
- value: Bookplate with donor name
offered_by:
required: true
description: Custodian organization that publishes and administers the scheme.
# range: string # uriorcurie
examples:
- value: https://nde.nl/ontology/hc/custodian/nl/rijksmuseum
@ -173,6 +175,7 @@ classes:
range: TaxScheme
multivalued: true
inlined_as_list: true
description: Applicable fiscal framework for deductibility or tax relief.
examples:
- value:
has_type: ANBI
@ -206,11 +209,14 @@ classes:
- has_percentage:
observed_in:
required: true
description: Source observation used to extract and verify scheme information.
# range: string # uriorcurie
examples:
- value: https://nde.nl/ontology/hc/observation/web/2026-01-01/rijksmuseum-support
comments:
- Each scheme links to WebObservation for full provenance chain
- Common domains include museum friends programs, archive adoption campaigns, and library conservation support
- Capture payment rhythm and thresholds as structured values, not embedded narrative
- Tax deductibility varies by jurisdiction - always document regulated_by_scheme
- Benefits should be extracted as discrete items for comparison
- Tiered schemes (e.g., Silver/Gold/Platinum) are separate DonationScheme instances
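The extraction pattern in the DonationScheme description (one WebObservation per support page, each scheme linked back to it through `observed_in`) could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the dataclasses, `scheme_family`, and `build_schemes` are hypothetical names, and the field names are borrowed from the provenance chain in the older description text rather than from the current slot definitions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# The eight scheme families named in the DonationScheme description.
SCHEME_FAMILIES = (
    "MEMBERSHIP", "PATRON", "ADOPTION", "LEGACY",
    "DONATION", "INKIND", "SPONSORSHIP", "CROWDFUNDING",
)


@dataclass
class WebObservation:
    """Hypothetical stand-in for the schema's WebObservation class."""
    source_url: str
    retrieved_on: str  # ISO 8601 timestamp, kept as text in this sketch
    extraction_confidence: float


@dataclass
class DonationScheme:
    """Hypothetical stand-in for the schema's DonationScheme class."""
    scheme_type: str
    scheme_name: str
    observed_in: WebObservation
    minimum_amount: Optional[int] = None
    currency: Optional[str] = None
    payment_frequency: Optional[str] = None
    benefits: List[str] = field(default_factory=list)


def scheme_family(scheme_type: str) -> str:
    """Return the family prefix of an enum value such as ADOPTION_BOOK."""
    family = scheme_type.split("_", 1)[0]
    if family not in SCHEME_FAMILIES:
        raise ValueError(f"unknown scheme family: {scheme_type!r}")
    return family


def build_schemes(observation: WebObservation, rows: List[Dict]) -> List[DonationScheme]:
    """Turn pre-parsed page rows into schemes that share one observation."""
    schemes = []
    for row in rows:
        scheme_family(row["scheme_type"])  # reject values outside the eight families
        schemes.append(DonationScheme(observed_in=observation, **row))
    return schemes
```

Every scheme built from the same page shares a single observation object here, which mirrors the "create the WebObservation first, then link each scheme to it" ordering in the description.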
@ -1,19 +1,43 @@
id: https://nde.nl/ontology/hc/class/FundingCall
name: FundingCall
title: Funding Call
description: A call for applications for funding. MIGRATED from funding_call slot per Rule 53. Follows CallForApplication class (schema:Offer).
prefixes:
linkml: https://w3id.org/linkml/
hc: https://nde.nl/ontology/hc/
schema: http://schema.org/
default_prefix: hc
imports:
- linkml:types
default_prefix: hc
- ../classes/CallForApplication
classes:
FundingCall:
class_uri: hc:FundingCall
is_a: CallForApplication
class_uri: schema:Offer
annotations:
specificity_score: 0.1
specificity_rationale: Generic utility class/slot created during migration
custodian_types: "['*']"
description: >-
Public invitation that opens a defined application window for submitting
proposals to a specific funding opportunity.
alt_descriptions:
nl: Openbare oproep die een afgebakende indieningsperiode opent voor voorstellen binnen een specifieke financieringskans.
de: Oeffentliche Ausschreibung mit festem Einreichungszeitraum fuer Antraege auf eine bestimmte Foerdermoeglichkeit.
fr: Appel public ouvrant une periode definie pour soumettre des propositions a une opportunite de financement specifique.
es: Convocatoria publica que abre un periodo definido para presentar propuestas a una oportunidad de financiacion concreta.
ar: دعوة عامة تفتح فترة تقديم محددة لإرسال مقترحات لفرصة تمويل معينة.
id: Undangan publik yang membuka jendela aplikasi terdefinisi untuk pengajuan proposal pada peluang pendanaan tertentu.
zh: 为特定资助机会开启明确申报期的公开征集通知。
structured_aliases:
- literal_form: financieringsoproep
in_language: nl
- literal_form: Foerderaufruf
in_language: de
- literal_form: appel de financement
in_language: fr
- literal_form: convocatoria de financiacion
in_language: es
- literal_form: دعوة تمويل
in_language: ar
- literal_form: panggilan pendanaan
in_language: id
- literal_form: 资助征集
in_language: zh
broad_mappings:
- schema:Offer
@ -1,23 +1,46 @@
id: https://nde.nl/ontology/hc/class/FundingFocus
name: FundingFocus
title: Funding Focus
description: A thematic focus or priority area for funding. MIGRATED from funding_focus slot per Rule 53. Follows skos:Concept.
prefixes:
linkml: https://w3id.org/linkml/
hc: https://nde.nl/ontology/hc/
skos: http://www.w3.org/2004/02/skos/core#
default_prefix: hc
imports:
- linkml:types
- ../slots/has_description
- ../slots/has_label
default_prefix: hc
classes:
FundingFocus:
class_uri: skos:Concept
class_uri: hc:FundingFocus
description: >-
Thematic priority category used to target funding toward specific policy,
research, or societal objectives.
alt_descriptions:
nl: Thematische prioriteitscategorie die financiering richt op specifieke beleids-, onderzoeks- of maatschappelijke doelen.
de: Thematische Prioritaetskategorie zur Ausrichtung von Foerdermitteln auf bestimmte politische, wissenschaftliche oder gesellschaftliche Ziele.
fr: Categorie de priorite thematique orientant le financement vers des objectifs politiques, de recherche ou societaux specifiques.
es: Categoria de prioridad tematica que orienta la financiacion hacia objetivos politicos, de investigacion o sociales especificos.
ar: فئة أولوية موضوعية لتوجيه التمويل نحو أهداف سياساتية أو بحثية أو مجتمعية محددة.
id: Kategori prioritas tematik yang mengarahkan pendanaan ke tujuan kebijakan, riset, atau sosial tertentu.
zh: 用于将资助导向特定政策、研究或社会目标的主题优先类别。
structured_aliases:
- literal_form: financieringsfocus
in_language: nl
- literal_form: Foerderschwerpunkt
in_language: de
- literal_form: axe de financement
in_language: fr
- literal_form: enfoque de financiacion
in_language: es
- literal_form: محور التمويل
in_language: ar
- literal_form: fokus pendanaan
in_language: id
- literal_form: 资助重点
in_language: zh
slots:
- has_label
- has_description
annotations:
specificity_score: 0.1
specificity_rationale: Generic utility class/slot created during migration
custodian_types: "['*']"
broad_mappings:
- skos:Concept
@ -1,26 +1,50 @@
id: https://nde.nl/ontology/hc/class/FundingProgram
name: FundingProgram
title: Funding Program
description: A program that provides funding, grants, or subsidies. MIGRATED from funding_program slot per Rule 53. Follows frapo:FundingProgramme.
prefixes:
linkml: https://w3id.org/linkml/
hc: https://nde.nl/ontology/hc/
frapo: http://purl.org/cerif/frapo/
skos: http://www.w3.org/2004/02/skos/core#
schema: http://schema.org/
default_prefix: hc
imports:
- linkml:types
- ../slots/has_description
- ../slots/has_label
- ../slots/targeted_at
default_prefix: hc
classes:
FundingProgram:
class_uri: frapo:FundingProgramme
class_uri: hc:FundingProgram
description: >-
Structured funding framework that groups related calls, budget lines, and
eligibility logic under a shared strategic objective.
alt_descriptions:
nl: Gestructureerd financieringskader dat verwante oproepen, budgetlijnen en subsidieregels bundelt onder een gedeeld strategisch doel.
de: Strukturiertes Foerderprogramm, das zusammenhaengende Ausschreibungen, Budgetlinien und Foerderlogiken unter einem gemeinsamen strategischen Ziel buendelt.
fr: Cadre de financement structure regroupant appels, lignes budgetaires et regles d'eligibilite autour d'un objectif strategique commun.
es: Marco de financiacion estructurado que agrupa convocatorias, lineas presupuestarias y logica de elegibilidad bajo un objetivo estrategico comun.
ar: إطار تمويلي منظم يجمع الدعوات وخطوط الميزانية ومنطق الأهلية ضمن هدف استراتيجي مشترك.
id: Kerangka pendanaan terstruktur yang mengelompokkan panggilan, lini anggaran, dan logika kelayakan di bawah tujuan strategis bersama.
zh: 在共同战略目标下整合相关征集、预算条线与资格逻辑的结构化资助框架。
structured_aliases:
- literal_form: financieringsprogramma
in_language: nl
- literal_form: Foerderprogramm
in_language: de
- literal_form: programme de financement
in_language: fr
- literal_form: programa de financiacion
in_language: es
- literal_form: برنامج تمويل
in_language: ar
- literal_form: program pendanaan
in_language: id
- literal_form: 资助计划
in_language: zh
slots:
- has_label
- has_description
- targeted_at
annotations:
specificity_score: 0.1
specificity_rationale: Generic utility class/slot created during migration
custodian_types: "['*']"
broad_mappings:
- schema:FundingScheme
close_mappings:
- schema:Grant
@ -1,23 +1,46 @@
id: https://nde.nl/ontology/hc/class/FundingRate
name: FundingRate
title: Funding Rate
description: The rate or percentage of funding provided. MIGRATED from funding_rate slot per Rule 53. Follows schema:MonetaryAmount or Percentage.
prefixes:
linkml: https://w3id.org/linkml/
hc: https://nde.nl/ontology/hc/
schema: http://schema.org/
default_prefix: hc
imports:
- linkml:types
- ../slots/has_rate
- ../slots/maximum_of_maximum
default_prefix: hc
classes:
FundingRate:
class_uri: schema:MonetaryAmount
class_uri: hc:FundingRate
description: >-
Quantified proportion or cap that determines the share of eligible costs
covered by a funding instrument.
alt_descriptions:
nl: Gekwantificeerd percentage of plafond dat bepaalt welk deel van subsidiabele kosten wordt gedekt.
de: Quantifizierter Satz oder Hoechstwert, der den Anteil foerderfaehiger Kosten festlegt.
fr: Proportion ou plafond quantifie determinant la part des couts eligibles couverte par le financement.
es: Proporcion o tope cuantificado que determina la parte de costos elegibles cubierta por la financiacion.
ar: نسبة أو سقف كمي يحدد حصة التكاليف المؤهلة التي يغطيها التمويل.
id: Proporsi atau batas kuantitatif yang menentukan porsi biaya layak yang ditanggung instrumen pendanaan.
zh: 决定可资助成本覆盖比例或上限的量化比率指标。
structured_aliases:
- literal_form: financieringspercentage
in_language: nl
- literal_form: Foerdersatz
in_language: de
- literal_form: taux de financement
in_language: fr
- literal_form: tasa de financiacion
in_language: es
- literal_form: معدل التمويل
in_language: ar
- literal_form: tingkat pendanaan
in_language: id
- literal_form: 资助比例
in_language: zh
slots:
- has_rate
- maximum_of_maximum
annotations:
specificity_score: 0.1
specificity_rationale: Generic utility class/slot created during migration
custodian_types: "['*']"
broad_mappings:
- schema:MonetaryAmount
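FundingRate is described as the proportion or cap that determines the share of eligible costs a funding instrument covers. A minimal sketch of that arithmetic is shown below; `split_funding` and its parameters are hypothetical, and treating the cap as a hard ceiling on the funded amount is an assumption of this sketch, not something the schema states.

```python
from decimal import Decimal
from typing import Optional, Tuple


def split_funding(
    eligible_costs: Decimal,
    rate_percent: Decimal,
    cap: Optional[Decimal] = None,
) -> Tuple[Decimal, Decimal]:
    """Return (funded_amount, remaining_amount) for a percentage funding rate.

    Assumption: `cap`, when given, is an absolute ceiling on the funded amount.
    """
    if not Decimal(0) <= rate_percent <= Decimal(100):
        raise ValueError("rate_percent must be between 0 and 100")
    funded = eligible_costs * rate_percent / Decimal(100)
    if cap is not None:
        funded = min(funded, cap)
    # Whatever the instrument does not cover must come from co-funding or own resources.
    return funded, eligible_costs - funded
```

With a 70% rate on 100000 of eligible costs, the function returns 70000 funded and 30000 remaining, which matches the kind of funded share versus co-funded remainder split that a percentage rate implies.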
@ -1,17 +1,15 @@
id: https://nde.nl/ontology/hc/class/FundingRequirement
name: FundingRequirement
title: FundingRequirement Class
title: Funding Requirement
prefixes:
linkml: https://w3id.org/linkml/
hc: https://nde.nl/ontology/hc/
schema: http://schema.org/
dcterms: http://purl.org/dc/terms/
schema: http://schema.org/
prov: http://www.w3.org/ns/prov#
pav: http://purl.org/pav/
skos: http://www.w3.org/2004/02/skos/core#
default_prefix: hc
imports:
- linkml:types
- ../enums/FundingRequirementTypeEnum
- ../slots/apply_to
- ../slots/has_note
- ../slots/has_score
@ -25,27 +23,35 @@ imports:
- ../slots/in_section
- ../slots/supersede
- ../slots/temporal_extent
default_prefix: hc
classes:
FundingRequirement:
class_uri: dcterms:Standard
description: "A requirement or criterion that applicants must meet to be eligible for\na funding call. Each requirement is tracked with provenance linking to\nthe source document where it was stated.\n\n**PURPOSE**:\n\nFundingRequirement provides structured, machine-readable representation\nof funding call eligibility criteria. Instead of storing requirements as\nfree-text lists in CallForApplication, each requirement becomes a\ntrackable entity with:\n\n- **Classification**: Categorized by FundingRequirementTypeEnum\n- **Provenance**: Linked to WebObservation documenting source\n- **Values**: Machine-readable value + human-readable text\n- **Temporality**: Valid date range for time-scoped requirements\n\n**PROVENANCE CHAIN**:\n\n```\nCallForApplication\n \u2502\n \u251C\u2500\u2500 requirements \u2500\u2500\u2192 FundingRequirement[]\n \u2502 \u2502\n \u2502 \u251C\u2500\u2500 requirement_type: PARTNERSHIP_MINIMUM_PARTNERS\n\
\ \u2502 \u251C\u2500\u2500 requirement_text: \"At least 3 partners from 3 EU countries\"\n \u2502 \u251C\u2500\u2500 requirement_value: \"3\"\n \u2502 \u251C\u2500\u2500 requirement_unit: \"partners\"\n \u2502 \u2502\n \u2502 \u2514\u2500\u2500 observed_in \u2500\u2500\u2192 WebObservation\n \u2502 \u2502\n \u2502 \u251C\u2500\u2500 source_url: https://ec.europa.eu/...\n \u2502 \u251C\u2500\u2500 retrieved_on: 2025-11-29T10:30:00Z\n \u2502 \u2514\u2500\u2500 extraction_confidence: 0.95\n \u2502\n \u2514\u2500\u2500 web_observations \u2500\u2500\u2192 WebObservation[] (general call provenance)\n```\n\n**ONTOLOGY\
\ ALIGNMENT**:\n\n- **Dublin Core**: `dcterms:Standard` - \"A reference point against which\n other things can be evaluated\" (requirements are standards for eligibility)\n- **Dublin Core**: `dcterms:requires` - Relates call to requirement\n- **Dublin Core**: `dcterms:conformsTo` - Applicants must conform to requirements\n- **Schema.org**: `schema:eligibleRegion` - For geographic requirements\n- **Schema.org**: `schema:eligibleQuantity` - For numeric constraints\n- **PROV-O**: `prov:wasDerivedFrom` - Links requirement to observation\n\n**REQUIREMENT CATEGORIES**:\n\nRequirements are classified into six main categories via FundingRequirementTypeEnum:\n\n1. **Eligibility** (ELIGIBILITY_*): Who can apply\n - Geographic: EU Member States, Associated Countries\n - Organizational: Non-profit, public body, SME\n - Heritage type: Museums, archives, libraries\n - Experience: Track record, previous projects\n\n2. **Financial** (FINANCIAL_*): Budget and funding\n - Co-funding: Match\
\ funding percentages\n - Budget limits: Minimum/maximum grant size\n - Funding rate: Percentage of eligible costs\n - Eligible costs: What can be funded\n\n3. **Partnership** (PARTNERSHIP_*): Consortium requirements\n - Minimum partners: Number required\n - Country diversity: Geographic spread\n - Sector mix: Organisation types needed\n - Coordinator: Lead partner constraints\n\n4. **Thematic** (THEMATIC_*): Topic and scope\n - Focus area: Required research/action themes\n - Heritage scope: Types of heritage addressed\n - Geographic scope: Where activities occur\n\n5. **Technical** (TECHNICAL_*): Outputs and approach\n - Deliverables: Required outputs\n - Open access: Publication requirements\n - Duration: Project length constraints\n - Methodology: Required approaches\n\n6. **Administrative** (ADMINISTRATIVE_*): Process requirements\n - Registration: Portal accounts needed\n - Documentation: Supporting documents\n - Language: Submission language\n\
\ - Format: Templates and page limits\n\n**TEMPORAL TRACKING**:\n\nRequirements can change between call publications. The `supersedes` field\nlinks to previous versions, and `valid_from`/`valid_to` scope applicability:\n\n```\nFundingRequirement (current)\n \u2502\n \u251C\u2500\u2500 valid_from: 2025-01-15\n \u251C\u2500\u2500 requirement_value: \"3\" (minimum partners)\n \u2502\n \u2514\u2500\u2500 supersedes \u2500\u2500\u2192 FundingRequirement (previous)\n \u2502\n \u251C\u2500\u2500 valid_from: 2024-01-15\n \u251C\u2500\u2500 valid_to: 2025-01-14\n \u2514\u2500\u2500 requirement_value: \"4\" (was 4 partners)\n```\n\n**EXTRACTION PATTERN**:\n\nWhen extracting requirements from web sources:\n\n1. Create WebObservation for the source page\n2. For each requirement found:\n - Create FundingRequirement with observed_in \u2192 WebObservation\n\
\ - Classify using FundingRequirementTypeEnum\n - Extract machine-readable value and unit\n - Record source_section for traceability\n - Assign extraction_confidence based on clarity\n\n**EXAMPLES**:\n\n1. **Partnership Requirement**\n - requirement_type: PARTNERSHIP_MINIMUM_PARTNERS\n - requirement_text: \"Minimum 3 independent legal entities from 3 different EU Member States\"\n - requirement_value: \"3\"\n - requirement_unit: \"partners\"\n - is_mandatory: true\n \n2. **Financial Requirement**\n - requirement_type: FINANCIAL_COFUNDING\n - requirement_text: \"Co-funding of minimum 25% from non-EU sources required\"\n - requirement_value: \"25\"\n - requirement_unit: \"percent\"\n - is_mandatory: true\n \n3. **Open Access Requirement**\n - requirement_type: TECHNICAL_OPEN_ACCESS\n - requirement_text: \"All peer-reviewed publications must be open access (Plan S compliant)\"\n - requirement_value: \"immediate\"\n - is_mandatory: true\n"
exact_mappings:
- dcterms:Standard
close_mappings:
- schema:QuantitativeValue
- skos:Concept
related_mappings:
- dcterms:requires
- dcterms:conformsTo
- schema:eligibleRegion
- schema:eligibleQuantity
- prov:wasDerivedFrom
class_uri: hc:FundingRequirement
description: >-
Eligibility or compliance criterion that must be satisfied for a proposal
to qualify under a specific funding call.
alt_descriptions:
nl: Subsidiabiliteits- of nalevingscriterium waaraan een voorstel moet voldoen om in aanmerking te komen binnen een specifieke oproep.
de: Eignungs- oder Compliance-Kriterium, das fuer die Foerderfaehigkeit eines Antrags in einem bestimmten Aufruf erfuellt sein muss.
fr: Critere d'eligibilite ou de conformite devant etre satisfait pour qu'une proposition soit recevable dans un appel donne.
es: Criterio de elegibilidad o cumplimiento que debe satisfacerse para que una propuesta califique en una convocatoria especifica.
ar: معيار أهلية أو امتثال يجب استيفاؤه لكي يتأهل المقترح ضمن دعوة تمويل محددة.
id: Kriteria kelayakan atau kepatuhan yang harus dipenuhi agar proposal memenuhi syarat pada panggilan pendanaan tertentu.
zh: 在特定资助征集中,提案必须满足的资格或合规条件。
structured_aliases:
- literal_form: financieringsvoorwaarde
in_language: nl
- literal_form: Foerdervoraussetzung
in_language: de
- literal_form: condition de financement
in_language: fr
- literal_form: requisito de financiacion
in_language: es
- literal_form: شرط التمويل
in_language: ar
- literal_form: persyaratan pendanaan
in_language: id
- literal_form: 资助要求
in_language: zh
slots:
- apply_to
- has_note
@ -54,7 +60,6 @@ classes:
- identified_by
- has_text
- has_type
- has_type
- has_measurement_unit
- has_value
- in_section
@ -65,153 +70,30 @@ classes:
identified_by:
identifier: true
required: true
# range: string # uriorcurie
pattern: ^https://nde\.nl/ontology/hc/requirement/[a-z0-9-]+/[a-z0-9-]+$
examples:
- value: https://nde.nl/ontology/hc/requirement/ec-cl2-2025-heritage-01/min-partners-3
- value: https://nde.nl/ontology/hc/requirement/nlhf-medium-2025/cofunding-25pct
has_type:
required: false
range: FundingRequirementTypeEnum
deprecated: 'DEPRECATED 2026-01-13: Use has_type with RequirementType class instead'
examples:
- value: PARTNERSHIP_MINIMUM_PARTNERS
- value: FINANCIAL_COFUNDING
- value: ELIGIBILITY_GEOGRAPHIC
has_type:
required: true
range: RequirementType
examples:
- value:
has_code: PARTNERSHIP_MINIMUM_PARTNERS
has_label:
- Minimum partners requirement@en
- value:
has_code: FINANCIAL_COFUNDING
has_label:
- Co-funding requirement@en
has_text:
required: true
# range: string
examples:
- value: Minimum 3 independent legal entities from 3 different EU Member States or Horizon Europe Associated Countries
- value: Applications must demonstrate at least 25% co-funding from non-EU sources
has_value:
# range: string
examples:
- value: '3'
- value: '25'
- value: eu-member-states
- value: immediate
has_measurement_unit:
# range: string
examples:
- value: partners
- value: percent
- value: EUR
- value: months
- value: countries
mandatory:
range: boolean
ifabsent: 'true'
examples:
- value: true
description: 'Mandatory: must meet to be eligible'
- value: false
description: 'Optional: preferred but not required'
observed_in:
required: true
# range: string # uriorcurie
examples:
- value: https://nde.nl/ontology/hc/observation/web/2025-11-29/eu-horizon-cl2-heritage
in_section:
# range: string
examples:
- value: Section 2.1 - Eligibility Criteria
- value: 'FAQ #7 - Consortium composition'
- value: Work Programme page 45
supersede:
# range: string # uriorcurie
examples:
- value: https://nde.nl/ontology/hc/requirement/ec-cl2-2024-heritage-01/min-partners-4
comments:
- Each requirement links to WebObservation for full provenance chain
- requirement_value + requirement_unit enable structured queries
- is_mandatory defaults to true; explicitly set false for optional requirements
- supersedes_or_superseded creates version chain for requirement changes
- extraction_confidence can differ from observation confidence
see_also:
- https://dublincore.org/specifications/dublin-core/dcmi-terms/#Standard
- https://schema.org/QuantitativeValue
- https://www.w3.org/TR/prov-o/#Entity
- http://purl.org/pav/
examples:
- value:
requirement_id: https://nde.nl/ontology/hc/requirement/ec-cl2-2025-heritage-01/min-partners-3-countries
requirement_type: PARTNERSHIP_MINIMUM_PARTNERS
requirement_text: Proposals must be submitted by a consortium of at least 3 independent legal entities established in 3 different EU Member States or Horizon Europe Associated Countries.
requirement_value: '3'
requirement_unit: partners
is_mandatory: true
apply_to: https://nde.nl/ontology/hc/call/ec/cl2-2025-heritage-01
observed_in: https://nde.nl/ontology/hc/observation/web/2025-11-29/eu-horizon-cl2-heritage
source_section: Section 2 - Eligibility Conditions
has_score:
has_score: 0.98
has_note: Clear statement in eligibility section. Standard Horizon Europe RIA requirement.
identified_by: https://nde.nl/ontology/hc/requirement/ec-call/minimum-partners
has_text: Minimum 3 independent legal entities from 3 different countries.
has_value: '3'
has_measurement_unit: partners
mandatory: true
description: Consortium size threshold requirement
- value:
requirement_id: https://nde.nl/ontology/hc/requirement/ec-cl2-2025-heritage-01/cofunding-for-profit
requirement_type: FINANCIAL_COFUNDING
requirement_text: For-profit entities receive 70% funding rate. The remaining 30% must be covered by co-funding or own resources.
requirement_value: '30'
requirement_unit: percent
is_mandatory: true
apply_to: https://nde.nl/ontology/hc/call/ec/cl2-2025-heritage-01
observed_in: https://nde.nl/ontology/hc/observation/web/2025-11-29/eu-horizon-cl2-heritage
source_section: Section 3 - Financial Conditions
has_score:
has_score: 0.95
has_note: Applies only to for-profit partners. Non-profits receive 100% funding.
- value:
requirement_id: https://nde.nl/ontology/hc/requirement/ec-cl2-2025-heritage-01/open-access
requirement_type: TECHNICAL_OPEN_ACCESS
requirement_text: Beneficiaries must ensure open access to peer-reviewed scientific publications under the conditions required by the Grant Agreement. Immediate open access is mandatory (no embargo period).
requirement_value: immediate
requirement_unit: null
is_mandatory: true
apply_to: https://nde.nl/ontology/hc/call/ec/cl2-2025-heritage-01
observed_in: https://nde.nl/ontology/hc/observation/web/2025-11-29/eu-horizon-cl2-heritage
source_section: Section 4.2 - Open Science
has_score:
has_score: 0.99
has_note: Standard Horizon Europe open access requirement. Plan S compliant.
- value:
requirement_id: https://nde.nl/ontology/hc/requirement/nlhf-medium-2025/uk-based
requirement_type: ELIGIBILITY_GEOGRAPHIC
requirement_text: Your organisation must be based in the UK (England, Northern Ireland, Scotland or Wales). Projects must take place in the UK.
requirement_value: UK
requirement_unit: country
is_mandatory: true
apply_to: https://nde.nl/ontology/hc/call/nlhf/medium-grants-2025-q4
observed_in: https://nde.nl/ontology/hc/observation/web/2025-11-28/nlhf-medium-grants
source_section: Eligibility
has_score:
has_score: 0.99
has_note: Clear UK-only restriction. Devolved nations explicitly included.
- value:
requirement_id: https://nde.nl/ontology/hc/requirement/nlhf-medium-2025/non-profit
requirement_type: ELIGIBILITY_ORGANIZATIONAL
requirement_text: We can fund not-for-profit organisations, including charities, community groups, local authorities, and social enterprises. Private individuals and for-profit companies are not eligible.
requirement_value: non-profit
requirement_unit: organization-type
is_mandatory: true
apply_to: https://nde.nl/ontology/hc/call/nlhf/medium-grants-2025-q4
observed_in: https://nde.nl/ontology/hc/observation/web/2025-11-28/nlhf-medium-grants
source_section: Who can apply
has_score:
has_score: 0.95
has_note: Explicitly excludes for-profit. Social enterprises may need verification.
annotations:
specificity_score: 0.1
specificity_rationale: Generic utility class/slot created during migration
custodian_types: "['*']"
identified_by: https://nde.nl/ontology/hc/requirement/ec-call/open-access
has_text: Immediate open access publication is required.
mandatory: true
description: Technical dissemination requirement
exact_mappings:
- dcterms:Standard
related_mappings:
- dcterms:requires
- dcterms:conformsTo
- schema:eligibleQuantity
- prov:wasDerivedFrom
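The `identified_by` slot of FundingRequirement constrains identifiers with the pattern `^https://nde\.nl/ontology/hc/requirement/[a-z0-9-]+/[a-z0-9-]+$`. The sketch below shows how that same regular expression behaves on the example identifiers from the diff. The function name `is_valid_requirement_id` is hypothetical, and this is only a standalone check with Python's `re` module, not how LinkML validation is wired up in the project.

```python
import re

# Pattern copied from the identified_by slot usage in FundingRequirement.
REQUIREMENT_ID_PATTERN = re.compile(
    r"^https://nde\.nl/ontology/hc/requirement/[a-z0-9-]+/[a-z0-9-]+$"
)


def is_valid_requirement_id(identifier: str) -> bool:
    """Return True when the identifier has exactly two lowercase slug segments."""
    # fullmatch avoids accepting a trailing newline, which `$` alone would allow.
    return REQUIREMENT_ID_PATTERN.fullmatch(identifier) is not None
```

Identifiers with uppercase letters, a missing second segment, or extra path segments are rejected, which is worth keeping in mind when minting new requirement URIs from call codes.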
@ -1,30 +1,46 @@
|
|||
id: https://nde.nl/ontology/hc/class/FundingScheme
|
||||
name: FundingScheme
|
||||
title: Funding Scheme
|
||||
description: A scheme or program providing funding. MIGRATED from funding_scheme slot per Rule 53. Follows schema:FundingScheme.
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
dcterms: http://purl.org/dc/terms/
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
crm: http://www.cidoc-crm.org/cidoc-crm/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
rdfs: http://www.w3.org/2000/01/rdf-schema#
|
||||
org: http://www.w3.org/ns/org#
|
||||
xsd: http://www.w3.org/2001/XMLSchema#
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../slots/has_description
|
||||
- ../slots/has_label
|
||||
default_prefix: hc
|
||||
classes:
|
||||
FundingScheme:
|
||||
class_uri: schema:FundingScheme
|
||||
class_uri: hc:FundingScheme
|
||||
description: >-
|
||||
Rule-governed financing arrangement defining how resources are allocated,
|
||||
evaluated, and distributed to eligible applicants.
|
||||
alt_descriptions:
|
||||
nl: Regeling met regels voor toewijzing, beoordeling en uitkering van middelen aan subsidiabele aanvragers.
|
||||
de: Regelgebundene Finanzierungsregelung zur Zuweisung, Bewertung und Verteilung von Mitteln an foerderfaehige Antragstellende.
|
||||
fr: Dispositif de financement reglemente definissant l'allocation, l'evaluation et la distribution des ressources aux candidats eligibles.
|
||||
es: Esquema de financiacion con reglas que define asignacion, evaluacion y distribucion de recursos a solicitantes elegibles.
|
||||
ar: نظام تمويلي قائم على قواعد يحدد كيفية تخصيص الموارد وتقييمها وتوزيعها على المتقدمين المؤهلين.
|
||||
id: Skema pembiayaan berbasis aturan yang menentukan alokasi, evaluasi, dan distribusi sumber daya kepada pelamar yang memenuhi syarat.
|
||||
zh: 定义资源分配、评审与发放给合格申请者方式的规则化资助机制。
|
||||
structured_aliases:
|
||||
- literal_form: financieringsregeling
|
||||
in_language: nl
|
||||
- literal_form: Foerderschema
|
||||
in_language: de
|
||||
- literal_form: dispositif de financement
|
||||
in_language: fr
|
||||
- literal_form: esquema de financiacion
|
||||
in_language: es
|
||||
- literal_form: مخطط التمويل
|
||||
in_language: ar
|
||||
- literal_form: skema pendanaan
|
||||
in_language: id
|
||||
- literal_form: 资助机制
|
||||
in_language: zh
|
||||
slots:
|
||||
- has_label
|
||||
- has_description
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
broad_mappings:
|
||||
- schema:FundingScheme
|
||||
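The classes in this commit each list `structured_aliases` in the same seven languages (nl, de, fr, es, ar, id, zh). The sketch below shows one way such coverage could be checked. The helper `missing_alias_languages` and the `EXPECTED_LANGUAGES` set are hypothetical, inferred from the hunks rather than defined in the repository, and the alias list is copied by hand from the FundingScheme hunk above instead of being parsed from YAML.

```python
# Assumption: the seven language codes seen across the hunks form the expected set.
EXPECTED_LANGUAGES = {"nl", "de", "fr", "es", "ar", "id", "zh"}


def missing_alias_languages(structured_aliases, expected=EXPECTED_LANGUAGES):
    """Return the sorted language codes that have no alias entry."""
    present = {alias.get("in_language") for alias in structured_aliases}
    return sorted(expected - present)


# Aliases copied from the FundingScheme hunk (hypothetical in-memory form).
funding_scheme_aliases = [
    {"literal_form": "financieringsregeling", "in_language": "nl"},
    {"literal_form": "Foerderschema", "in_language": "de"},
    {"literal_form": "dispositif de financement", "in_language": "fr"},
    {"literal_form": "esquema de financiacion", "in_language": "es"},
    {"literal_form": "مخطط التمويل", "in_language": "ar"},
    {"literal_form": "skema pendanaan", "in_language": "id"},
    {"literal_form": "资助机制", "in_language": "zh"},
]

if __name__ == "__main__":
    # An empty list means every expected language has at least one alias.
    print(missing_alias_languages(funding_scheme_aliases))
```

A check like this would also flag the removed GalleryType aliases, which carried `it` and `pt` entries but no `ar`, `id`, or `zh`.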
|
|
|
|||
|
|
@@ -1,33 +1,48 @@
|
|||
id: https://nde.nl/ontology/hc/class/FundingSource
|
||||
name: FundingSource
|
||||
title: Funding Source
|
||||
description: A source of funding, such as an organization or grant program. MIGRATED from funding_source slot per Rule 53. Follows frapo:FundingAgency.
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
frapo: http://purl.org/cerif/frapo/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
dcterms: http://purl.org/dc/terms/
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
crm: http://www.cidoc-crm.org/cidoc-crm/
|
||||
rdfs: http://www.w3.org/2000/01/rdf-schema#
|
||||
org: http://www.w3.org/ns/org#
|
||||
xsd: http://www.w3.org/2001/XMLSchema#
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../slots/has_description
|
||||
- ../slots/has_label
|
||||
- ../slots/has_type
|
||||
default_prefix: hc
|
||||
classes:
|
||||
FundingSource:
|
||||
class_uri: frapo:FundingAgency
|
||||
class_uri: hc:FundingSource
|
||||
description: >-
|
||||
Originating organization or mechanism from which financial support is
|
||||
provided.
|
||||
alt_descriptions:
|
||||
nl: Organisatie of mechanisme van waaruit financiele ondersteuning afkomstig is.
|
||||
de: Herkunftsorganisation oder Mechanismus, aus dem finanzielle Unterstuetzung bereitgestellt wird.
|
||||
fr: Organisation ou mecanisme d'origine a partir duquel le soutien financier est fourni.
|
||||
es: Organizacion o mecanismo de origen desde el cual se proporciona apoyo financiero.
|
||||
ar: الجهة أو الآلية المصدِّرة التي يُقدَّم منها الدعم المالي.
|
||||
id: Organisasi atau mekanisme asal dari mana dukungan finansial diberikan.
|
||||
zh: 提供资金支持的来源组织或机制。
|
||||
structured_aliases:
|
||||
- literal_form: financieringsbron
|
||||
in_language: nl
|
||||
- literal_form: Finanzierungsquelle
|
||||
in_language: de
|
||||
- literal_form: source de financement
|
||||
in_language: fr
|
||||
- literal_form: fuente de financiacion
|
||||
in_language: es
|
||||
- literal_form: مصدر التمويل
|
||||
in_language: ar
|
||||
- literal_form: sumber pendanaan
|
||||
in_language: id
|
||||
- literal_form: 资金来源
|
||||
in_language: zh
|
||||
slots:
|
||||
- has_label
|
||||
- has_description
|
||||
- has_type
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
broad_mappings:
|
||||
- schema:Organization
|
||||
|
|
|
|||
|
|
@@ -1,18 +1,46 @@
|
|||
id: https://w3id.org/nde/ontology/Fylkesarkiv
|
||||
name: Fylkesarkiv
|
||||
title: Fylkesarkiv (Norwegian County Archive)
|
||||
title: Fylkesarkiv
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
wd: http://www.wikidata.org/entity/
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../classes/ArchiveOrganizationType
|
||||
classes:
|
||||
Fylkesarkiv:
|
||||
class_uri: hc:Fylkesarkiv
|
||||
is_a: ArchiveOrganizationType
|
||||
class_uri: skos:Concept
|
||||
description: "Norwegian county archive (fylkesarkiv). These archives serve as regional\narchival institutions at the county (fylke) level in Norway.\n\n**Wikidata**: Q15119463\n\n**Geographic Restriction**: Norway (NO) only.\nThis constraint is enforced via LinkML `rules` with `postconditions`.\n\n**Scope**:\nFylkesarkiv preserve:\n- County administration records (fylkeskommunen)\n- Municipal records from constituent kommuner\n- Regional health and social services documentation\n- Education records (videreg\xE5ende skole)\n- Cultural affairs and heritage documentation\n- Private archives from regional businesses and organizations\n\n**Administrative Context**:\nIn the Norwegian archival system:\n- Arkivverket (National Archives of Norway)\n- Fylkesarkiv (county level) \u2190 This type\n- Kommunearkiv/Byarkiv (municipal level)\n- Interkommunale arkiv (inter-municipal archives)\n\n**Historical Context**:\nNorway has reorganized its counties (2020 regional reform):\n- Some fylkesarkiv have\
|
||||
\ merged following county mergers\n- County archives serve both historical fylker and new regions\n- Arkivverket coordinates national archival policy\n\n**Related Types**:\n- Landsarkiv - Regional state archives (under Arkivverket)\n- RegionalArchive (Q27032392) - Generic regional archives\n- CountyArchive - Generic county-level archives\n"
|
||||
slot_usage: {}
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
description: >-
|
||||
Regional archival institution at Norwegian county level responsible for
|
||||
preserving and providing access to county and related local documentation.
|
||||
alt_descriptions:
|
||||
nl: Regionale archiefinstelling op Noors provinciaal niveau die provinciale en gerelateerde lokale documentatie bewaart en toegankelijk maakt.
|
||||
de: Regionale Archiveinrichtung auf norwegischer Kreisebene zur Bewahrung und Bereitstellung von Kreis- und lokalbezogener Dokumentation.
|
||||
fr: Institution archivistique regionale au niveau des comtes norvegiens, chargee de conserver et diffuser la documentation comtale et locale associee.
|
||||
es: Institucion archivistica regional a nivel de condado noruego responsable de preservar y facilitar documentacion del condado y ambito local relacionado.
|
||||
ar: مؤسسة أرشيفية إقليمية على مستوى المقاطعات في النرويج مسؤولة عن حفظ وإتاحة الوثائق الإدارية والإقليمية ذات الصلة.
|
||||
id: Lembaga arsip regional tingkat county di Norwegia yang bertanggung jawab melestarikan dan menyediakan akses dokumentasi county serta lokal terkait.
|
||||
zh: 挪威郡级区域档案机构,负责保存并提供郡级及相关地方文献的访问。
|
||||
structured_aliases:
|
||||
- literal_form: Noors provinciaal archief
|
||||
in_language: nl
|
||||
- literal_form: norwegisches Kreisarchiv
|
||||
in_language: de
|
||||
- literal_form: archives de comte norvegien
|
||||
in_language: fr
|
||||
- literal_form: archivo condal noruego
|
||||
in_language: es
|
||||
- literal_form: أرشيف المقاطعة النرويجي
|
||||
in_language: ar
|
||||
- literal_form: arsip county Norwegia
|
||||
in_language: id
|
||||
- literal_form: 挪威郡档案馆
|
||||
in_language: zh
|
||||
exact_mappings:
|
||||
- wd:Q15119463
|
||||
broad_mappings:
|
||||
- schema:ArchiveOrganization
|
||||
|
|
|
|||
|
|
@@ -1,23 +1,46 @@
|
|||
id: https://nde.nl/ontology/hc/class/GBIFIdentifier
|
||||
name: GBIFIdentifier
|
||||
title: GBIF Identifier
|
||||
description: Global Biodiversity Information Facility (GBIF) identifier. MIGRATED from gbif_id slot per Rule 53. Follows dwc:occurrenceID.
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
dwc: http://rs.tdwg.org/dwc/terms/
|
||||
schema: http://schema.org/
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
default_prefix: hc
|
||||
- ./Identifier
|
||||
classes:
|
||||
GBIFIdentifier:
|
||||
is_a: Identifier
|
||||
class_uri: hc:GBIFIdentifier
|
||||
is_a: Identifier
|
||||
description: >-
|
||||
Persistent identifier used to reference a biodiversity occurrence record in
|
||||
GBIF-linked data pipelines.
|
||||
alt_descriptions:
|
||||
nl: Persistente identifier voor verwijzing naar een biodiversiteitswaarneming in GBIF-gekoppelde datastromen.
|
||||
de: Persistenter Identifikator zur Referenzierung eines Biodiversitaetsnachweises in GBIF-verbundenen Datenablaeufen.
|
||||
fr: Identifiant persistant utilise pour referencer un enregistrement d'occurrence de biodiversite dans des flux de donnees lies a GBIF.
|
||||
es: Identificador persistente para referenciar un registro de ocurrencia de biodiversidad en flujos de datos vinculados a GBIF.
|
||||
ar: معرّف دائم للإشارة إلى سجل وقوع للتنوع الحيوي ضمن مسارات بيانات مرتبطة بـ GBIF.
|
||||
id: Pengidentifikasi persisten untuk merujuk catatan kejadian keanekaragaman hayati dalam alur data terkait GBIF.
|
||||
zh: 用于在 GBIF 关联数据流程中引用生物多样性出现记录的持久标识符。
|
||||
structured_aliases:
|
||||
- literal_form: GBIF-id
|
||||
in_language: nl
|
||||
- literal_form: GBIF-Kennung
|
||||
in_language: de
|
||||
- literal_form: identifiant GBIF
|
||||
in_language: fr
|
||||
- literal_form: identificador GBIF
|
||||
in_language: es
|
||||
- literal_form: معرف GBIF
|
||||
in_language: ar
|
||||
- literal_form: pengenal GBIF
|
||||
in_language: id
|
||||
- literal_form: GBIF 标识符
|
||||
in_language: zh
|
||||
broad_mappings:
|
||||
- schema:PropertyValue
|
||||
close_mappings:
|
||||
- dwc:occurrenceID
|
||||
description: A persistent identifier for a biodiversity occurrence record.
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
|
|
|
|||
|
|
@@ -1,22 +1,23 @@
|
|||
id: https://nde.nl/ontology/hc/class/GHCIdentifier
|
||||
name: GHCIdentifier
|
||||
title: Global Heritage Custodian Identifier
|
||||
description: The Global Heritage Custodian Identifier (GHCID). MIGRATED from ghcid slot per Rule 53. Follows dcterms:identifier.
|
||||
title: GHC Identifier Class
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
dcterms: http://purl.org/dc/terms/
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
default_prefix: hc
|
||||
classes:
|
||||
GHCIdentifier:
|
||||
is_a: Identifier
|
||||
class_uri: hc:GHCIdentifier
|
||||
description: Persistent identifier assigned to a heritage custodian in the GHCID namespace.
|
||||
close_mappings:
|
||||
- dcterms:Identifier
|
||||
related_mappings:
|
||||
- dcterms:identifier
|
||||
description: 'A persistent, unique identifier for a heritage custodian. Format: CC-RR-LLL-T-ABBREVIATION'
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
specificity_score: 0.35
|
||||
specificity_rationale: Core persistent identifier class for custodian identity resolution.
|
||||
custodian_types: '["*"]'
|
||||
|
|
|
|||
|
|
@@ -1,30 +1,51 @@
|
|||
id: https://nde.nl/ontology/hc/class/Gallery
|
||||
name: Gallery
|
||||
title: Gallery
|
||||
description: An exhibition space or art gallery. MIGRATED from gallery_type_classification context. Follows schema:ArtGallery.
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../slots/has_description
|
||||
- ../slots/has_label
|
||||
- ../slots/has_type
|
||||
default_prefix: hc
|
||||
classes:
|
||||
Gallery:
|
||||
class_uri: schema:ArtGallery
|
||||
class_uri: hc:Gallery
|
||||
description: >-
|
||||
Institution or venue dedicated to exhibiting visual art through curated
|
||||
programs.
|
||||
alt_descriptions:
|
||||
nl: Instelling of locatie gewijd aan het tonen van beeldende kunst via gecureerde programma's.
|
||||
de: Einrichtung oder Ort, der sich der Praesentation bildender Kunst in kuratierten Programmen widmet.
|
||||
fr: Institution ou lieu dedie a l'exposition d'arts visuels au moyen de programmes cures.
|
||||
es: Institucion o espacio dedicado a exhibir artes visuales mediante programas curados.
|
||||
ar: مؤسسة أو فضاء مخصص لعرض الفنون البصرية عبر برامج تقييم/تنسيق فني.
|
||||
id: Institusi atau tempat yang didedikasikan untuk memamerkan seni visual melalui program kurasi.
|
||||
zh: 通过策展项目展示视觉艺术的机构或场所。
|
||||
structured_aliases:
|
||||
- literal_form: galerie
|
||||
in_language: nl
|
||||
- literal_form: Galerie
|
||||
in_language: de
|
||||
- literal_form: galerie d'art
|
||||
in_language: fr
|
||||
- literal_form: galeria de arte
|
||||
in_language: es
|
||||
- literal_form: معرض فني
|
||||
in_language: ar
|
||||
- literal_form: galeri seni
|
||||
in_language: id
|
||||
- literal_form: 美术馆
|
||||
in_language: zh
|
||||
slots:
|
||||
- has_label
|
||||
- has_description
|
||||
- has_type
|
||||
slot_usage:
|
||||
has_type:
|
||||
# range: string # uriorcurie
|
||||
required: true
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
exact_mappings:
|
||||
- schema:ArtGallery
|
||||
|
|
|
|||
|
|
@@ -1,216 +1,70 @@
|
|||
id: https://nde.nl/ontology/hc/class/GalleryType
|
||||
name: GalleryType
|
||||
title: Gallery Type Classification
|
||||
title: Gallery Type
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../enums/GalleryTypeEnum
|
||||
- ../slots/has_hypernym
|
||||
- ../slots/identified_by # was: wikidata_entity
|
||||
- ../slots/has_model # was: exhibition_model
|
||||
- ./CustodianType
|
||||
- ../slots/identified_by
|
||||
- ../slots/has_model
|
||||
- ../slots/has_objective
|
||||
- ../slots/has_percentage
|
||||
- ../slots/has_score # was: template_specificity
|
||||
- ../slots/has_score
|
||||
- ../slots/has_service
|
||||
- ../slots/has_type
|
||||
- ../slots/include # was: gallery_subtype
|
||||
- ../slots/categorized_as # was: exhibition_focus
|
||||
- ../slots/include
|
||||
- ../slots/represent
|
||||
- ../slots/has_activity
|
||||
- ../slots/take_comission
|
||||
classes:
|
||||
GalleryType:
|
||||
class_uri: hc:GalleryType
|
||||
is_a: CustodianType
|
||||
class_uri: skos:Concept
|
||||
annotations:
|
||||
skos:prefLabel: Gallery
|
||||
description: >-
|
||||
Controlled taxonomy root for classifying gallery organizational models,
|
||||
exhibition strategies, and commercial posture.
|
||||
alt_descriptions:
|
||||
nl: Gecontroleerde taxonomiewortel voor classificatie van galerie-organisatiemodellen, tentoonstellingsstrategieën en commerciële oriëntatie.
|
||||
de: Kontrollierte Taxonomie-Wurzel zur Klassifikation von Galerie-Organisationsmodellen, Ausstellungsstrategien und kommerzieller Ausrichtung.
|
||||
fr: Racine taxonomique controlee pour classifier les modeles organisationnels de galeries, les strategies d'exposition et le positionnement commercial.
|
||||
es: Raiz taxonomica controlada para clasificar modelos organizativos de galeria, estrategias expositivas y orientacion comercial.
|
||||
ar: جذر تصنيفي مضبوط لتصنيف نماذج تنظيم المعارض واستراتيجيات العرض والاتجاه التجاري.
|
||||
id: Akar taksonomi terkendali untuk mengklasifikasikan model organisasi galeri, strategi pameran, dan orientasi komersial.
|
||||
zh: 用于分类画廊组织形态、展览策略与商业定位的受控分类根节点。
|
||||
structured_aliases:
|
||||
- literal_form: galerie
|
||||
predicate: EXACT_SYNONYM
|
||||
- literal_form: galerietype
|
||||
in_language: nl
|
||||
- literal_form: galerijen
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: nl
|
||||
- literal_form: gallery
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: en
|
||||
- literal_form: galleries
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: en
|
||||
- literal_form: art gallery
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: en
|
||||
- literal_form: Galerie
|
||||
predicate: EXACT_SYNONYM
|
||||
- literal_form: Galerietyp
|
||||
in_language: de
|
||||
- literal_form: Galerien
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: de
|
||||
- literal_form: kunsthalle
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: de
|
||||
- literal_form: galeria
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: es
|
||||
- literal_form: galerías
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: es
|
||||
- literal_form: galleria
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: it
|
||||
- literal_form: gallerie
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: it
|
||||
- literal_form: galeria
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: pt
|
||||
- literal_form: galerias
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: pt
|
||||
- literal_form: galerie
|
||||
predicate: EXACT_SYNONYM
|
||||
- literal_form: type de galerie
|
||||
in_language: fr
|
||||
- literal_form: galeries
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: fr
|
||||
description: "Specialized custodian type for art galleries - institutions that exhibit\nand sometimes sell visual artworks,\
|
||||
\ providing public access to contemporary\nor historical art through temporary or rotating exhibitions.\n\n**Wikidata\
|
||||
\ Base Concept**: Q1007870 (art gallery)\n\n**Scope**:\nGalleries are distinguished by their focus on:\n- Exhibition-oriented\
|
||||
\ (not collection-based like museums)\n- Contemporary or recent art (not historical artifacts)\n- Temporary exhibitions\
|
||||
\ (rotating shows, not permanent displays)\n- Artist representation (commercial) or kunsthalle model (non-commercial)\n\
|
||||
- Visual arts (paintings, sculptures, photography, installations)\n\n**Key Gallery Subtypes** (78+ extracted from Wikidata):\n\
|
||||
\n**By Business Model**:\n- Commercial art galleries (Q56856618) - For-profit, sell artworks, represent artists\n- Noncommercial\
|
||||
\ art galleries (Q67165238) - Exhibition-only, no sales\n- Kunsthalle (Q1475403) - German model, temporary exhibitions,\
|
||||
\ no permanent collection\n- Vanity galleries (Q17111940) - Charge artists for exhibition space\n- National galleries\
|
||||
\ (Q3844310) - State-run, representative of nation\n\n**By Subject Specialization**:\n- Photography galleries (Q114023739)\
|
||||
\ - Photographic art exhibitions\n- Photo galleries (Q12303444) - Physical or digital photograph collections\n- Photography\
|
||||
\ centres (Q11900212) - Dedicated photography venues\n- Photothèques (Q135926044) - Photographic heritage preservation\n\
|
||||
- Sculpture gardens (Q1759852) - Outdoor sculpture exhibitions\n- Jewellery galleries (Q117072343) - Jewelry and decorative\
|
||||
\ arts\n- Design galleries (Q127346204) - Design and applied arts\n- Map galleries (Q125501487) - Cartographic art exhibitions\n\
|
||||
- Print rooms (Q445396) - Prints, drawings, watercolors, photographs\n\n**By Organizational Model**:\n- Artist-run centres\
|
||||
\ (Q4801243) - Managed and directed by artists\n- Artist-run initiatives (Q3325736) - Gallery operated by artists\n\
|
||||
- Artist-run spaces (Q4034417) - Organizations initiated by artists\n- Artist cooperatives (Q4801240) - Jointly owned\
|
||||
\ by artist members\n- Canadian artist-run centres (Q16020664) - Canada-specific model (1960s+)\n\n**By Art Period Focus**:\n\
|
||||
- Contemporary art galleries (Q16038801) - Current/recent art\n- Modern art galleries (Q3757717) - Modernist period\
|
||||
\ (late 19th-20th century)\n- Contemporary arts centres (Q2945053) - Focus on contemporary practice\n- National centres\
|
||||
\ for contemporary art (Q109017987) - State contemporary art venues\n\n**By Venue Type**:\n- Alternative exhibition\
|
||||
\ spaces (Q16002704) - Non-traditional venues\n- Arts venues (Q15090615) - Places for artistic works display/performance\n\
|
||||
- Arts centers (Q2190251) - Community centers for arts\n- Cast collections (Q29380643) - Plaster cast galleries (educational)\n\
|
||||
- Plaster cast galleries (Q3768550) - Sculpture reproduction collections\n\n**By Artist Association**:\n- Artist museums\
|
||||
\ (Q1747681) - Dedicated to particular artist\n- Artist houses (Q1797122) - Buildings with artist work rooms\n- Art\
|
||||
\ colonies (Q1558054) - Places where artists live and interact\n- Art communes (Q4797182) - Communal living focused\
|
||||
\ on art creation\n- Studio houses (Q2699076) - Residential spaces with studio facilities\n\n**Online & Digital**:\n\
|
||||
- Online art galleries (Q7094057) - Digital exhibition platforms\n- Galeries Fnac (Q109038036) - French retail chain\
|
||||
\ photo galleries (1970s+)\n\n**Specialized Formats**:\n- Pinacotheca (Q740437) - Public art gallery (classical term)\n\
|
||||
- Print rooms (Q445396) - Graphic arts collections\n- Photograph collections (Q130486108) - Photography collections\n\
|
||||
\n**French Model**:\n- Scientific, technical, and industrial culture centers (Q2946053) - Popular science venues\n\n\
|
||||
**Cultural Context**:\n- Arts and Culture Centres (Q4801491) - Newfoundland & Labrador system (Canada)\n- Houses of\
|
||||
\ culture (Q5061188) - Cultural institutions in socialist/social democratic contexts\n- Houses of literature (Q27908105)\
|
||||
\ - Cultural institutions for written art\n- Centrum Beeldende Kunst (Q2104985) - Dutch visual arts centers\n\n**Supporting\
|
||||
\ Organizations**:\n- Not-for-profit arts organizations (Q7062022) - Nonprofit arts foundations\n- Art institutions\
|
||||
\ (Q20897549) - Organizations dedicated to art\n- Cultural institutions (Q3152824) - Preservation/promotion of culture\n\
|
||||
\n**Commercial vs. Non-Commercial Distinction**:\n\n**Commercial Galleries**:\n- Represent artists (exclusive or non-exclusive\
|
||||
\ contracts)\n- Sell artworks (earn commission on sales)\n- Participate in art fairs\n- Primary market (new works) or\
|
||||
\ secondary market (resale)\n\n**Non-Commercial Galleries** (Kunsthalle model):\n- No permanent collection\n- Exhibition-only\
|
||||
\ mission\n- Public or nonprofit funding\n- Educational/cultural programming\n- No artwork sales\n\n**RDF Serialization\
|
||||
\ Example**:\n```turtle\n:Custodian_KunsthalRotterdam\n org:classification :GalleryType_Kunsthalle_Q1475403 .\n\n\
|
||||
:GalleryType_Kunsthalle_Q1475403\n a glamtype:GalleryType, crm:E55_Type, skos:Concept ;\n skos:prefLabel \"Kunsthalle\"\
|
||||
@en, \"kunsthalle\"@nl, \"Kunsthalle\"@de ;\n skos:broader :GalleryType_ArtGallery_Q1007870 ;\n schema:additionalType\
|
||||
\ <http://www.wikidata.org/entity/Q1475403> ;\n glamtype:glamorcubesfixphdnt_code \"GALLERY\" ;\n glamtype:has_objective\
|
||||
\ false ;\n glamtype:exhibition_focus \"contemporary art\" ;\n glamtype:sales_activity false ;\n glamtype:exhibition_model\
|
||||
\ \"temporary rotating exhibitions\" .\n```\n\n**Domain-Specific Properties**:\nThis class adds gallery-specific metadata\
|
||||
\ beyond base CustodianType:\n- `has_objective` - Structured profit objective (commercial/nonprofit/mixed)\n- `artist_representation`\
|
||||
\ - Artists represented by gallery (for commercial galleries)\n- `exhibition_focus` - Type of art exhibited (contemporary,\
|
||||
\ modern, photography, etc.)\n- `sales_activity` - Whether gallery sells artworks (not just exhibits)\n- `exhibition_model`\
|
||||
\ - Exhibition strategy (temporary, rotating, curated shows)\n- `has_service` - Art sales service with commission structure (ArtSaleService)\n\n**Getty AAT Integration**:\nThe Getty Art & Architecture Thesaurus provides standardized\
|
||||
\ vocabulary:\n- aat:300005768 - art galleries (institutions)\n- aat:300240057 - commercial galleries\n- aat:300240058\
|
||||
\ - nonprofit galleries\n- aat:300005741 - kunsthalles\n\n**Art Market Context**:\nCommercial galleries operate in the\
|
||||
\ art market ecosystem:\n- **Primary market**: Representing living artists, first sales\n- **Secondary market**: Resale\
|
||||
\ of works by established artists\n- **Art fairs**: Participation in international art fairs (Basel, Frieze, etc.)\n\
|
||||
- **Auction houses**: Different from galleries (auction vs. consignment model)\n\n**Data Population**:\nGallery subtypes\
|
||||
\ extracted from 78 Wikidata entities with type='G'\nin `data/wikidata/GLAMORCUBEPSXHFN/hyponyms_curated_full.yaml`.\n"
|
||||
- literal_form: tipo de galeria
|
||||
in_language: es
|
||||
- literal_form: نوع المعرض الفني
|
||||
in_language: ar
|
||||
- literal_form: tipe galeri
|
||||
in_language: id
|
||||
- literal_form: 画廊类型
|
||||
in_language: zh
|
||||
slots:
|
||||
- represent
|
||||
# REMOVED 2026-01-22: commercial_operation - migrated to has_objective + Profit (Rule 53)
|
||||
- has_objective
|
||||
# REMOVED 2026-01-22: commission_rate - migrated to has_service + ArtSaleService (Rule 53)
|
||||
- has_service
|
||||
- has_type
|
||||
- has_type # was: exhibition_focus - migrated per Rule 53 (2026-01-26)
|
||||
- has_model # was: exhibition_model - migrated per Rule 53 (2026-01-26)
|
||||
- include # was: gallery_subtype - migrated per Rule 53 (2026-01-26)
|
||||
- has_model
|
||||
- include
|
||||
- has_activity
|
||||
- has_score # was: template_specificity - migrated per Rule 53 (2026-01-17)
|
||||
- identified_by # was: wikidata_entity - migrated per Rule 53 (2026-01-16)
|
||||
- has_score
|
||||
- identified_by
|
||||
slot_usage:
|
||||
identified_by: # was: wikidata_entity - migrated per Rule 53 (2026-01-16)
|
||||
pattern: ^Q[0-9]+$
|
||||
identified_by:
|
||||
required: true
|
||||
has_hypernym:
|
||||
range: GalleryType
|
||||
required: false
|
||||
has_type:
|
||||
equals_expression: '["hc:GalleryType"]'
|
||||
has_type: # was: exhibition_focus - migrated per Rule 53 (2026-01-26)
|
||||
# range: string
|
||||
has_model: # was: exhibition_model - migrated per Rule 53 (2026-01-26)
|
||||
# range: string
|
||||
include: # was: gallery_subtype - migrated per Rule 53 (2026-01-26)
|
||||
equals_string: hc:GalleryType
|
||||
include:
|
||||
range: GalleryType
|
||||
any_of:
|
||||
- range: CommercialGallery
|
||||
- range: NonProfitGallery
|
||||
- range: ArtistRunSpace
|
||||
- range: Kunsthalle
|
||||
required: false
|
||||
exact_mappings:
|
||||
broad_mappings:
|
||||
- skos:Concept
|
||||
- schema:ArtGallery
|
||||
close_mappings:
|
||||
- crm:E55_Type
|
||||
- aat:300005768
|
||||
related_mappings:
|
||||
- aat:300240057
|
||||
- aat:300240058
|
||||
comments:
|
||||
- GalleryType implements SKOS-based classification for art gallery organizations
|
||||
- Distinguishes commercial (sales-oriented) from non-commercial (kunsthalle) models
|
||||
- Supports 78+ Wikidata gallery subtypes with multilingual labels
|
||||
- Getty AAT integration for art market terminology
|
||||
- 'Artist-run initiatives: Canadian model (1960s+), cooperative ownership'
|
||||
examples:
|
||||
- value:
|
||||
identified_by: https://nde.nl/ontology/hc/type/gallery/Q1475403
|
||||
has_type_code: GALLERY
|
||||
has_label:
|
||||
- Kunsthalle@en
|
||||
- kunsthalle@nl
|
||||
- Kunsthalle@de
|
||||
has_description: facility that mounts temporary art exhibitions without permanent collection # was: type_description - migrated per Rule 53/56 (2026-01-16)
|
||||
custodian_type_broader: https://nde.nl/ontology/hc/type/gallery/Q1007870
|
||||
# MIGRATED 2026-01-22: commercial_operation → has_objective + Profit (Rule 53)
|
||||
has_objective:
|
||||
has_type: contemporary art
|
||||
sales_activity: false
|
||||
has_model: temporary rotating exhibitions, no permanent collection
|
||||
- value:
|
||||
identified_by: https://nde.nl/ontology/hc/type/gallery/Q56856618
|
||||
has_type_code: GALLERY
|
||||
has_label:
|
||||
- Commercial Art Gallery@en
|
||||
- kunstgalerie@nl
|
||||
has_description: for-profit gallery that sells artworks and represents artists # was: type_description - migrated per Rule 53/56 (2026-01-16)
|
||||
custodian_type_broader: https://nde.nl/ontology/hc/type/gallery/Q1007870
|
||||
# MIGRATED 2026-01-22: commercial_operation → has_objective + Profit (Rule 53)
|
||||
has_objective:
|
||||
represents_or_represented:
|
||||
- has_label: Artist A
|
||||
- has_label: Artist B
|
||||
- has_label: Artist C
|
||||
has_type: contemporary painting and sculpture
|
||||
sales_activity: true
|
||||
has_model: curated exhibitions of represented artists
|
||||
# MIGRATED 2026-01-22: commission_rate → has_service + ArtSaleService (Rule 53)
|
||||
has_service:
|
||||
sales_activity: true
|
||||
takes_or_took_comission:
|
||||
has_percentage:
|
||||
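The `identified_by` entry in the GalleryType `slot_usage` carries the pattern `^Q[0-9]+$`, while the example values in the same hunk are full URIs ending in a Wikidata QID. The sketch below illustrates one way to reconcile the two by checking only the last path segment. The helper `trailing_qid` is hypothetical and not part of the schema or of LinkML.

```python
import re
from typing import Optional

# Pattern copied from the `identified_by` slot_usage in the hunk above.
QID_PATTERN = re.compile(r"^Q[0-9]+$")


def trailing_qid(identifier: str) -> Optional[str]:
    """Return the last path segment if it looks like a Wikidata QID, else None."""
    candidate = identifier.rstrip("/").rsplit("/", 1)[-1]
    return candidate if QID_PATTERN.match(candidate) else None


if __name__ == "__main__":
    # The first two values are the example identifiers from the GalleryType hunk.
    for value in (
        "https://nde.nl/ontology/hc/type/gallery/Q1475403",
        "https://nde.nl/ontology/hc/type/gallery/Q56856618",
        "Q1007870",
        "not-a-qid",
    ):
        print(value, "->", trailing_qid(value))
```

Applying the pattern to the full URI would reject both example values, which may be why this version of the hunk lists `identified_by` with only `required: true`. That reading is an inference from the diff, not something the commit states.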
|
|
@@ -1,38 +1,36 @@
|
|||
id: https://nde.nl/ontology/hc/class/GalleryTypes
|
||||
name: GalleryTypes
|
||||
title: Gallery Type Subclasses
|
||||
description: Concrete subclasses of GalleryType. MIGRATED from gallery_subtype slot
|
||||
per Rule 53/0b.
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- ./GalleryType
|
||||
- linkml:types
|
||||
default_prefix: hc
|
||||
classes:
|
||||
CommercialGallery:
|
||||
is_a: GalleryType
|
||||
description: A gallery that sells art.
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: '[''*'']'
|
||||
class_uri: hc:CommercialGallery
|
||||
description: Gallery model that combines exhibition with artwork sales and artist representation.
|
||||
broad_mappings:
|
||||
- skos:Concept
|
||||
NonProfitGallery:
|
||||
is_a: GalleryType
|
||||
description: A gallery that operates as a non-profit.
|
||||
class_uri: hc:NonProfitGallery
|
||||
description: Gallery model operating under nonprofit governance and mission-oriented programming.
|
||||
broad_mappings:
|
||||
- skos:Concept
|
||||
ArtistRunSpace:
|
||||
is_a: GalleryType
|
||||
description: A gallery run by artists.
|
||||
class_uri: hc:ArtistRunSpace
|
||||
description: Gallery model initiated and managed primarily by artists.
|
||||
broad_mappings:
|
||||
- skos:Concept
|
||||
Kunsthalle:
|
||||
is_a: GalleryType
|
||||
description: An art exhibition space without a permanent collection.
|
||||
class_uri: hc:Kunsthalle
|
||||
description: Exhibition-oriented gallery model without a permanent collection.
|
||||
broad_mappings:
|
||||
- skos:Concept
|
||||
|
|
|
|||
|
|
@@ -1,20 +1,43 @@
|
|||
id: https://nde.nl/ontology/hc/class/GenBankAccession
|
||||
name: GenBankAccession
|
||||
title: GenBank Accession
|
||||
description: A GenBank accession number for a nucleotide sequence. MIGRATED from genbank_accession slot per Rule 53. Follows BioProject/GenBank identifiers.
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
default_prefix: hc
|
||||
- ./Identifier
|
||||
classes:
|
||||
GenBankAccession:
|
||||
class_uri: hc:GenBankAccession
|
||||
is_a: Identifier
|
||||
class_uri: schema:PropertyValue
|
||||
description: A persistent identifier for a nucleotide sequence in GenBank.
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
description: >-
|
||||
Persistent accession identifier assigned to a nucleotide sequence record in
|
||||
GenBank.
|
||||
alt_descriptions:
|
||||
nl: Persistente toegangscode toegekend aan een nucleotide-sequentierecord in GenBank.
|
||||
de: Persistente Zugriffkennung fuer einen Nukleotidsequenz-Datensatz in GenBank.
|
||||
fr: Numero d'accession persistant attribue a un enregistrement de sequence nucleotidique dans GenBank.
|
||||
es: Accesion persistente asignada a un registro de secuencia nucleotidica en GenBank.
|
||||
ar: رقم إتاحة دائم يُسند إلى سجل تسلسل نوكليوتيدي في GenBank.
|
||||
id: Nomor aksesi persisten yang ditetapkan pada rekaman sekuens nukleotida di GenBank.
|
||||
zh: 分配给 GenBank 核苷酸序列记录的持久登录号标识。
|
||||
structured_aliases:
|
||||
- literal_form: GenBank-toegangscode
|
||||
in_language: nl
|
||||
- literal_form: GenBank-Zugangsnummer
|
||||
in_language: de
|
||||
- literal_form: numero d'accession GenBank
|
||||
in_language: fr
|
||||
- literal_form: numero de acceso GenBank
|
||||
in_language: es
|
||||
- literal_form: رقم إتاحة GenBank
|
||||
in_language: ar
|
||||
- literal_form: nomor aksesi GenBank
|
||||
in_language: id
|
||||
- literal_form: GenBank 登录号
|
||||
in_language: zh
|
||||
broad_mappings:
|
||||
- schema:PropertyValue
|
||||
@@ -1,30 +1,48 @@
id: https://nde.nl/ontology/hc/class/Gender
|
||||
name: Gender
|
||||
title: Gender
|
||||
description: Gender identity or classification. MIGRATED from gender_identity slot per Rule 53. Follows schema:GenderType.
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
dcterms: http://purl.org/dc/terms/
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
crm: http://www.cidoc-crm.org/cidoc-crm/
|
||||
rdfs: http://www.w3.org/2000/01/rdf-schema#
|
||||
org: http://www.w3.org/ns/org#
|
||||
xsd: http://www.w3.org/2001/XMLSchema#
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../slots/has_description
|
||||
- ../slots/has_label
|
||||
default_prefix: hc
|
||||
classes:
|
||||
Gender:
|
||||
class_uri: schema:GenderType
|
||||
class_uri: hc:Gender
|
||||
description: >-
|
||||
Classification term used to represent stated gender identity in descriptive
|
||||
metadata contexts.
|
||||
alt_descriptions:
|
||||
nl: Classificatieterm voor het weergeven van opgegeven genderidentiteit in beschrijvende metadata.
|
||||
de: Klassifikationsterm zur Darstellung angegebener Geschlechtsidentitaet in beschreibenden Metadatenkontexten.
|
||||
fr: Terme de classification utilise pour representer l'identite de genre declaree dans des metadonnees descriptives.
|
||||
es: Termino de clasificacion para representar identidad de genero declarada en contextos de metadatos descriptivos.
|
||||
ar: مصطلح تصنيفي لتمثيل الهوية الجندرية المصرّح بها ضمن سياقات البيانات الوصفية.
|
||||
id: Istilah klasifikasi untuk merepresentasikan identitas gender yang dinyatakan dalam konteks metadata deskriptif.
|
||||
zh: 用于在描述性元数据语境中表示申报性别认同的分类术语。
|
||||
structured_aliases:
|
||||
- literal_form: gender
|
||||
in_language: nl
|
||||
- literal_form: Geschlechtsidentitaet
|
||||
in_language: de
|
||||
- literal_form: identite de genre
|
||||
in_language: fr
|
||||
- literal_form: identidad de genero
|
||||
in_language: es
|
||||
- literal_form: الهوية الجندرية
|
||||
in_language: ar
|
||||
- literal_form: identitas gender
|
||||
in_language: id
|
||||
- literal_form: 性别认同
|
||||
in_language: zh
|
||||
slots:
|
||||
- has_label
|
||||
- has_description
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
broad_mappings:
|
||||
- schema:GenderType
|
||||
- skos:Concept
|
||||
@@ -1,33 +0,0 @@
id: https://nde.nl/ontology/hc/classes/GenealogiewerkbalkEnrichment
|
||||
name: GenealogiewerkbalkEnrichment
|
||||
title: GenealogiewerkbalkEnrichment
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
xsd: http://www.w3.org/2001/XMLSchema#
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../enums/DataTierEnum
|
||||
# default_range: string
|
||||
classes:
|
||||
GenealogiewerkbalkEnrichment:
|
||||
description: "Dutch genealogy archives registry (Genealogiewerkbalk) data including\
|
||||
\ municipality, province, and associated archive information.\nOntology mapping\
|
||||
\ rationale: - class_uri is prov:Entity because this represents enrichment data\n\
|
||||
\ derived from the Dutch genealogy archives registry\n- close_mappings includes\
|
||||
\ schema:Dataset for registry data semantics - related_mappings includes prov:PrimarySource\
|
||||
\ for source registry"
|
||||
class_uri: prov:Entity
|
||||
close_mappings:
|
||||
- schema:Dataset
|
||||
related_mappings:
|
||||
- prov:PrimarySource
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: '[''*'']'
|
||||
slots:
|
||||
- has_source
|
||||
- has_url
|
||||
@@ -0,0 +1,48 @@
id: https://nde.nl/ontology/hc/classes/GenealogyArchivesRegistryEnrichment
|
||||
name: GenealogyArchivesRegistryEnrichment
|
||||
title: Genealogy Archives Registry Enrichment Class
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
xsd: http://www.w3.org/2001/XMLSchema#
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../enums/DataTierEnum
|
||||
# default_range: string
|
||||
classes:
|
||||
GenealogyArchivesRegistryEnrichment:
|
||||
description: >-
|
||||
Enrichment data derived from genealogy-focused archive registry sources,
|
||||
including municipality, province, and linked archive information.
|
||||
class_uri: prov:Entity
|
||||
alt_descriptions:
|
||||
nl: {text: Verrijkingsdata uit genealogische archiefregisters, inclusief gemeente, provincie en gekoppelde archiefinformatie., language: nl}
|
||||
de: {text: Anreicherungsdaten aus genealogischen Archivregistern mit Angaben zu Gemeinde, Provinz und verknuepften Archiven., language: de}
|
||||
fr: {text: Donnees d enrichissement issues de registres d archives genealogiques, incluant municipalite, province et archives associees., language: fr}
|
||||
es: {text: Datos de enriquecimiento derivados de registros archivisticos genealogicos, incluidos municipio, provincia e informacion de archivo vinculada., language: es}
|
||||
ar: {text: بيانات إثراء مشتقة من سجلات أرشيفية خاصة بعلم الأنساب وتشمل البلدية والمقاطعة ومعلومات الأرشيف المرتبطة., language: ar}
|
||||
id: {text: Data pengayaan dari registri arsip genealogi, termasuk munisipalitas, provinsi, dan informasi arsip terkait., language: id}
|
||||
zh: {text: 源自家谱档案登记来源的富化数据,包含市镇、省份及关联档案信息。, language: zh}
|
||||
structured_aliases:
|
||||
nl: [{literal_form: verrijking genealogisch archiefregister, language: nl}]
|
||||
de: [{literal_form: Anreicherung genealogisches Archivregister, language: de}]
|
||||
fr: [{literal_form: enrichissement registre d archives genealogiques, language: fr}]
|
||||
es: [{literal_form: enriquecimiento de registro archivistico genealogico, language: es}]
|
||||
ar: [{literal_form: إثراء سجل الأرشيف الجينيالوجي, language: ar}]
|
||||
id: [{literal_form: pengayaan registri arsip genealogi, language: id}]
|
||||
zh: [{literal_form: 家谱档案登记富化, language: zh}]
|
||||
exact_mappings:
|
||||
- prov:Entity
|
||||
close_mappings:
|
||||
- schema:Dataset
|
||||
related_mappings:
|
||||
- prov:PrimarySource
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: '[''*'']'
|
||||
slots:
|
||||
- has_source
|
||||
- has_url
|
||||
@@ -1,106 +1,42 @@
id: https://nde.nl/ontology/hc/class/GenerationEvent
|
||||
name: generation_event_class
|
||||
name: GenerationEvent
|
||||
title: Generation Event Class
|
||||
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
schema: http://schema.org/
|
||||
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../slots/has_description
|
||||
- ../slots/has_provenance
|
||||
- ../slots/has_score
|
||||
- ../slots/temporal_extent
|
||||
default_prefix: hc
|
||||
|
||||
- ../slots/has_provenance
|
||||
- ../slots/has_description
|
||||
- ../slots/has_score
|
||||
classes:
|
||||
GenerationEvent:
|
||||
description: >-
|
||||
An event representing the generation or creation of an entity.
|
||||
|
||||
**USAGE**:
|
||||
Used for tracking when and how something was generated, including:
|
||||
- Video chapter generation (manual, AI, imported)
|
||||
- Content extraction events
|
||||
- Automated processing activities
|
||||
- Confidence scoring for generated content
|
||||
|
||||
**STRUCTURE**:
|
||||
- temporal_extent: When the generation occurred (TimeSpan)
|
||||
- has_provenance: Who/what performed the generation (Provenance)
|
||||
- has_description: Details about the generation process
|
||||
- has_score: Confidence score for the generated content (ConfidenceScore)
|
||||
|
||||
**ONTOLOGY ALIGNMENT**:
|
||||
- Maps to prov:Generation (PROV-O generation event)
|
||||
- Also maps to schema:CreateAction (Schema.org action)
|
||||
|
||||
class_uri: prov:Generation
|
||||
|
||||
description: Event in which an entity is created or generated.
|
||||
exact_mappings:
|
||||
- prov:Generation
|
||||
|
||||
close_mappings:
|
||||
- schema:CreateAction
|
||||
|
||||
slots:
|
||||
- temporal_extent
|
||||
- has_provenance
|
||||
- has_description
|
||||
- has_score
|
||||
|
||||
slot_usage:
|
||||
temporal_extent:
|
||||
range: TimeSpan
|
||||
required: false
|
||||
inlined: true
|
||||
examples:
|
||||
- value:
|
||||
begin_of_the_begin: "2024-01-15T10:30:00Z"
|
||||
end_of_the_end: "2024-01-15T10:30:00Z"
|
||||
has_provenance:
|
||||
range: Provenance
|
||||
required: false
|
||||
inlined: true
|
||||
examples:
|
||||
- value:
|
||||
has_agent:
|
||||
has_type: SOFTWARE
|
||||
has_name: "YouTube Auto-Chapters"
|
||||
has_description:
|
||||
# range: string
|
||||
required: false
|
||||
examples:
|
||||
- value: "Generated using Whisper transcript segmentation"
|
||||
has_score:
|
||||
range: ConfidenceScore
|
||||
required: false
|
||||
inlined: true
|
||||
examples:
|
||||
- value:
|
||||
has_score: 0.95
|
||||
has_method: "xpath_extraction"
|
||||
has_description: "High confidence - exact match at expected location"
|
||||
annotations:
|
||||
custodian_types: '["*"]'
|
||||
custodian_types_rationale: >-
|
||||
Generation events are universal for tracking content creation.
|
||||
custodian_types_primary: "*"
|
||||
specificity_score: 0.30
|
||||
specificity_rationale: >-
|
||||
Moderately low specificity - used across many content types.
|
||||
|
||||
examples:
|
||||
- value:
|
||||
temporal_extent:
|
||||
begin_of_the_begin: "2024-01-15T10:30:00Z"
|
||||
has_description: "AI-generated video chapters from transcript"
|
||||
has_score:
|
||||
has_score: 0.92
|
||||
has_method: "transcript_segmentation"
|
||||
comments:
|
||||
- Created from slot_fixes.yaml migration (2026-01-19)
|
||||
- Updated 2026-01-19 to include has_score for confidence tracking
|
||||
specificity_score: 0.3
|
||||
specificity_rationale: Cross-domain provenance event for generated content.
|
||||
@@ -1,40 +1,33 @@
id: https://nde.nl/ontology/hc/class/GeoFeature
|
||||
name: GeoFeature
|
||||
title: Geographic Feature
|
||||
description: 'A classification of a geographic feature (e.g., populated place, administrative division). MIGRATED from feature_class/feature_code slots.
|
||||
|
||||
Used to classify GeoSpatialPlace instances according to GeoNames feature codes.'
|
||||
title: Geo Feature Class
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
gn: http://www.geonames.org/ontology#
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
dcterms: http://purl.org/dc/terms/
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
crm: http://www.cidoc-crm.org/cidoc-crm/
|
||||
rdfs: http://www.w3.org/2000/01/rdf-schema#
|
||||
org: http://www.w3.org/ns/org#
|
||||
xsd: http://www.w3.org/2001/XMLSchema#
|
||||
gn: http://www.geonames.org/ontology#
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../slots/has_code
|
||||
- ../slots/has_type
|
||||
default_prefix: hc
|
||||
- ../slots/has_code
|
||||
classes:
|
||||
GeoFeature:
|
||||
class_uri: skos:Concept
|
||||
description: Geo feature classification entry, typically aligned to GeoNames coding.
|
||||
broad_mappings:
|
||||
- skos:Concept
|
||||
close_mappings:
|
||||
- gn:Feature
|
||||
slots:
|
||||
- has_type
|
||||
- has_code
|
||||
slot_usage:
|
||||
has_type:
|
||||
# range: string # uriorcurie
|
||||
required: true
|
||||
has_code:
|
||||
# range: string # uriorcurie
|
||||
required: true
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
specificity_score: 0.35
|
||||
specificity_rationale: Controlled geospatial classification term.
|
||||
custodian_types: '["*"]'
|
||||
@@ -1,25 +1,26 @@
id: https://nde.nl/ontology/hc/class/GeoFeatureType
|
||||
name: GeoFeatureType
|
||||
title: Geographic Feature Type
|
||||
description: Abstract base class for geographic feature types (e.g., PopulatedPlace, AdministrativeDivision). MIGRATED from feature_class slot per Rule 0b.
|
||||
title: Geo Feature Type Class
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
gn: http://www.geonames.org/ontology#
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../slots/has_description
|
||||
- ../slots/has_label
|
||||
default_prefix: hc
|
||||
- ../slots/has_description
|
||||
classes:
|
||||
GeoFeatureType:
|
||||
class_uri: skos:Concept
|
||||
abstract: true
|
||||
description: Abstract taxonomy node for geographic feature classes.
|
||||
broad_mappings:
|
||||
- skos:Concept
|
||||
slots:
|
||||
- has_label
|
||||
- has_description
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
specificity_score: 0.3
|
||||
specificity_rationale: Shared hierarchy base for geospatial feature typing.
|
||||
custodian_types: '["*"]'
|
||||
@@ -1,82 +1,77 @@
id: https://nde.nl/ontology/hc/class/GeoFeatureTypes
|
||||
name: GeoFeatureTypes
|
||||
title: Geographic Feature Type Subclasses
|
||||
description: Concrete subclasses of GeoFeatureType representing specific geographic
|
||||
feature categories. Based on GeoNames feature classes.
|
||||
title: Geo Feature Types Class Module
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
gn: http://www.geonames.org/ontology#
|
||||
schema: http://schema.org/
|
||||
crm: http://www.cidoc-crm.org/cidoc-crm/
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- ./GeoFeatureType
|
||||
- linkml:types
|
||||
default_prefix: hc
|
||||
classes:
|
||||
AdministrativeBoundary:
|
||||
is_a: GeoFeatureType
|
||||
class_uri: gn:A
|
||||
description: Country, state, region, etc. (GeoNames class A)
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: '[''*'']'
|
||||
description: Administrative division feature class.
|
||||
broad_mappings:
|
||||
- schema:Place
|
||||
- crm:E53_Place
|
||||
HydrographicFeature:
|
||||
is_a: GeoFeatureType
|
||||
class_uri: gn:H
|
||||
description: Stream, lake, etc. (GeoNames class H)
|
||||
description: Hydrographic feature class.
|
||||
broad_mappings:
|
||||
- schema:Place
|
||||
- crm:E53_Place
|
||||
AreaFeature:
|
||||
is_a: GeoFeatureType
|
||||
class_uri: gn:L
|
||||
description: Parks, area, etc. (GeoNames class L)
|
||||
description: Area feature class.
|
||||
broad_mappings:
|
||||
- schema:Place
|
||||
- crm:E53_Place
|
||||
PopulatedPlace:
|
||||
is_a: GeoFeatureType
|
||||
class_uri: gn:P
|
||||
description: City, village, etc. (GeoNames class P)
|
||||
description: Populated place feature class.
|
||||
broad_mappings:
|
||||
- schema:Place
|
||||
- crm:E53_Place
|
||||
RoadRailroad:
|
||||
is_a: GeoFeatureType
|
||||
class_uri: gn:R
|
||||
description: Road, railroad, etc. (GeoNames class R)
|
||||
description: Transport corridor feature class.
|
||||
broad_mappings:
|
||||
- schema:Place
|
||||
- crm:E53_Place
|
||||
SpotFeature:
|
||||
is_a: GeoFeatureType
|
||||
class_uri: gn:S
|
||||
description: Spot, building, farm (GeoNames class S)
|
||||
description: Spot feature class, including discrete built entities.
|
||||
broad_mappings:
|
||||
- schema:Place
|
||||
- crm:E53_Place
|
||||
HypsographicFeature:
|
||||
is_a: GeoFeatureType
|
||||
class_uri: gn:T
|
||||
description: Mountain, hill, rock (GeoNames class T)
|
||||
description: Terrain elevation feature class.
|
||||
broad_mappings:
|
||||
- schema:Place
|
||||
- crm:E53_Place
|
||||
UnderseaFeature:
|
||||
is_a: GeoFeatureType
|
||||
class_uri: gn:U
|
||||
description: Undersea feature (GeoNames class U)
|
||||
description: Undersea feature class.
|
||||
broad_mappings:
|
||||
- schema:Place
|
||||
- crm:E53_Place
|
||||
VegetationFeature:
|
||||
is_a: GeoFeatureType
|
||||
class_uri: gn:V
|
||||
description: Forest, heath, etc. (GeoNames class V)
|
||||
description: Vegetation feature class.
|
||||
broad_mappings:
|
||||
- schema:Place
|
||||
- crm:E53_Place
|
||||
@@ -1,23 +1,24 @@
id: https://nde.nl/ontology/hc/class/GeoNamesIdentifier
|
||||
name: GeoNamesIdentifier
|
||||
title: GeoNames Identifier
|
||||
description: Identifier from the GeoNames geographical database. MIGRATED from geonames_id slot per Rule 53. Follows gn:geonamesID.
|
||||
title: GeoNames Identifier Class
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
dcterms: http://purl.org/dc/terms/
|
||||
gn: http://www.geonames.org/ontology#
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
default_prefix: hc
|
||||
classes:
|
||||
GeoNamesIdentifier:
|
||||
is_a: Identifier
|
||||
class_uri: hc:GeoNamesIdentifier
|
||||
description: External identifier referencing a feature in the GeoNames gazetteer.
|
||||
close_mappings:
|
||||
- dcterms:Identifier
|
||||
related_mappings:
|
||||
- gn:geonamesID
|
||||
description: A unique identifier for a GeoNames feature. Typically an integer.
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
specificity_score: 0.3
|
||||
specificity_rationale: Specialized external place identifier class.
|
||||
custodian_types: '["*"]'
|
||||
@@ -1,158 +1,74 @@
id: https://nde.nl/ontology/hc/class/GeoSpatialPlace
|
||||
name: geospatial_place_class
|
||||
name: GeoSpatialPlace
|
||||
title: Geo Spatial Place Class
|
||||
prefixes:
|
||||
geo: http://www.opengis.net/ont/geosparql#
|
||||
rov: http://www.w3.org/ns/regorg#
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
geosparql: http://www.opengis.net/ont/geosparql#
|
||||
wgs84: http://www.w3.org/2003/01/geo/wgs84_pos#
|
||||
sf: http://www.opengis.net/ont/sf#
|
||||
gn: http://www.geonames.org/ontology#
|
||||
gn_entity: http://sws.geonames.org/
|
||||
geo: http://www.opengis.net/ont/geosparql#
|
||||
schema: http://schema.org/
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
crm: http://www.cidoc-crm.org/cidoc-crm/
|
||||
tooi: https://identifier.overheid.nl/tooi/def/ont/
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../enums/GeometryTypeEnum
|
||||
- ../metadata
|
||||
- ../slots/has_reference_system
|
||||
- ../slots/has_altitude
|
||||
- ../slots/has_coordinates
|
||||
- ../slots/has_geofeature
|
||||
- ../slots/has_altitude
|
||||
- ../slots/geographic_extent
|
||||
- ../slots/geometric_extent
|
||||
- ../slots/has_reference_system
|
||||
- ../slots/has_geofeature
|
||||
- ../slots/identified_by
|
||||
- ../slots/has_score
|
||||
- ../slots/has_resolution
|
||||
- ../slots/temporal_extent
|
||||
- ../slots/has_score
|
||||
types:
|
||||
WktLiteral:
|
||||
uri: geosparql:wktLiteral
|
||||
base: str
|
||||
description: 'Well-Known Text (WKT) representation of geometry.
|
||||
See OGC Simple Features specification.
|
||||
'
|
||||
examples:
|
||||
- value: POINT(4.2894 52.0705)
|
||||
- value: POLYGON((4.0 52.0, 4.5 52.0, 4.5 52.5, 4.0 52.5, 4.0 52.0))
|
||||
description: Well-Known Text representation of geometry.
|
||||
classes:
|
||||
GeoSpatialPlace:
|
||||
class_uri: geosparql:Feature
|
||||
description: "Geospatial location with coordinates, geometry, and projections.\n\nCRITICAL DISTINCTION FROM CustodianPlace:\n\n| Aspect | CustodianPlace | GeoSpatialPlace |\n|--------|----------------|-----------------|\n| Nature | Nominal reference | Geospatial data |\n| Content | \"het herenhuis in de Schilderswijk\" | lat: 52.0705, lon: 4.2894 |\n| Purpose | Identify custodian by place name | Locate custodian precisely |\n| Ambiguity | May be vague (\"the mansion\") | Precise, measurable |\n| Source | Archival documents, oral history | GPS, cadastral surveys, geocoding |\n\n**TOOI Ontology Alignment**:\n\nThis class follows the TOOI pattern for geospatial data:\n- `tooi:BestuurlijkeRuimte` is a subclass of `geosparql:Feature` and `prov:Entity`\n- `tooi:BestuurlijkeRuimte-hasGeometry` \u2192 `geosparql:Geometry`\n- `tooi:RegistratieveRuimte` for administrative boundaries\n- `tooi:JuridischeRuimte` for legal jurisdiction boundaries\n\nLike TOOI, we separate:\n- **geosparql:Feature**\
|
||||
\ (this class): The real-world place with location data\n- **geosparql:Geometry**: The mathematical representation (WKT, GeoJSON)\n\n**Use Cases**:\n\n1. **Building-level precision**: Museum building footprint (Polygon)\n2. **City-level approximation**: Heritage institution centroid (Point)\n3. **Administrative boundaries**: Archive jurisdiction area (MultiPolygon)\n4. **Historical boundaries**: Pre-merger municipal territory (Polygon + temporal_extent)\n\n**Relationship to CustodianPlace**:\n\nCustodianPlace.has_geospatial_location \u2192 GeoSpatialPlace\n\nA nominal place reference (\"Rijksmuseum\") links to its geospatial location\n(lat: 52.3600, lon: 4.8852, geometry: building footprint polygon).\n\n**Relationship to AuxiliaryPlace**:\n\nAuxiliaryPlace.has_geospatial_location \u2192 GeoSpatialPlace\n\nSecondary/subordinate locations (branch offices, storage depots, reading rooms)\ncan also link to precise geospatial coordinates. This enables:\n- Mapping all custodian locations\
|
||||
\ (primary + auxiliary)\n- Spatial queries across an organization's entire footprint\n- Building footprints for off-site storage facilities\n- Historical boundary tracking for branch offices\n\n**Relationship to OrganizationalChangeEvent**:\n\nOrganizational changes may affect geographic location:\n- RELOCATION: New GeoSpatialPlace, old one gets temporal_extent.end_of_the_end\n- MERGER: Multiple locations \u2192 single primary + auxiliary locations\n- SPLIT: One location \u2192 multiple successor locations\n"
|
||||
description: Measured geospatial place representation with coordinates, geometry, and reference system.
|
||||
exact_mappings:
|
||||
- geosparql:Feature
|
||||
close_mappings:
|
||||
- geo:SpatialThing
|
||||
- schema:Place
|
||||
- schema:GeoCoordinates
|
||||
- geo:SpatialThing
|
||||
related_mappings:
|
||||
- prov:Entity
|
||||
- tooi:BestuurlijkeRuimte
|
||||
- crm:E53_Place
|
||||
slots:
|
||||
- identified_by
|
||||
- has_coordinates
|
||||
- has_altitude
|
||||
- geographic_extent
|
||||
- identified_by
|
||||
- has_reference_system
|
||||
- has_geofeature
|
||||
- geographic_extent
|
||||
- geometric_extent
|
||||
- identified_by
|
||||
- has_resolution
|
||||
- has_score
|
||||
- temporal_extent
|
||||
- has_score
|
||||
slot_usage:
|
||||
has_coordinates:
|
||||
range: Coordinates
|
||||
inlined: true
|
||||
required: true
|
||||
examples:
|
||||
- value:
|
||||
latitude: 52.36
|
||||
longitude: 4.8852
|
||||
has_reference_system:
|
||||
ifabsent: string(EPSG:4326)
|
||||
identified_by:
|
||||
description: 'Cadastral identifiers for this geospatial place. MIGRATION NOTE (2026-01-14): Replaces cadastral_id per slot_fixes.yaml. Use Identifier with identifier_scheme=''cadastral'' for parcel IDs. Netherlands: Kadaster perceelnummer format {gemeente}-{sectie}-{perceelnummer}'
|
||||
examples:
|
||||
- value:
|
||||
temporal_extent:
|
||||
range: TimeSpan
|
||||
inlined: true
|
||||
required: false
|
||||
examples:
|
||||
- value:
|
||||
begin_of_the_begin: '1920-01-01'
|
||||
end_of_the_end: '2001-01-01'
|
||||
comments:
|
||||
- Follows TOOI BestuurlijkeRuimte pattern using GeoSPARQL
|
||||
- 'CRITICAL: NOT a nominal reference - this is measured/surveyed location data'
|
||||
- Use CustodianPlace for nominal references, this class for coordinates
|
||||
- lat/lon required; geometry_wkt optional for point locations
|
||||
- Link from CustodianPlace via has_geospatial_location slot
|
||||
- Link from AuxiliaryPlace via has_geospatial_location slot (subordinate sites)
|
||||
- Link from OrganizationalChangeEvent via has_affected_territory slot
|
||||
- temporal_extent tracks boundary changes over time (was valid_from_geo/valid_to_geo)
|
||||
- OSM and GeoNames IDs enable external linking
|
||||
see_also:
|
||||
- http://www.opengis.net/ont/geosparql
|
||||
- https://www.geonames.org/
|
||||
- https://www.openstreetmap.org/
|
||||
- https://identifier.overheid.nl/tooi/def/ont/
|
||||
examples:
|
||||
- value:
|
||||
geospatial_id: https://nde.nl/ontology/hc/geo/rijksmuseum-building
|
||||
has_coordinates:
|
||||
latitude: 52.36
|
||||
longitude: 4.8852
|
||||
altitude: 0.0
|
||||
geometric_extent:
|
||||
- has_format:
|
||||
has_value: POLYGON((4.8830 52.3590, 4.8870 52.3590, 4.8870 52.3610, 4.8830 52.3610, 4.8830 52.3590))
|
||||
has_type:
|
||||
has_label: POLYGON
|
||||
coordinate_reference_system: EPSG:4326
|
||||
osm_id: way/27083908
|
||||
spatial_resolution: BUILDING
|
||||
has_geofeature:
|
||||
- has_type: SpotFeature
|
||||
has_code:
|
||||
has_label: S.MUS
|
||||
- value:
|
||||
geospatial_id: https://nde.nl/ontology/hc/geo/amsterdam-centroid
|
||||
has_coordinates:
|
||||
latitude: 52.3676
|
||||
longitude: 4.9041
|
||||
geometric_extent:
|
||||
- has_type:
|
||||
has_label: POINT
|
||||
coordinate_reference_system: EPSG:4326
|
||||
spatial_resolution: CITY
|
||||
has_geofeature:
|
||||
- has_type: PopulatedPlace
|
||||
has_code:
|
||||
has_label: P.PPLC
|
||||
- value:
|
||||
geospatial_id: https://nde.nl/ontology/hc/geo/noord-holland-archive-territory-pre-2001
|
||||
has_coordinates:
|
||||
latitude: 52.5
|
||||
longitude: 4.8
|
||||
geometric_extent:
|
||||
- has_format:
|
||||
has_value: MULTIPOLYGON(((4.5 52.2, 5.2 52.2, 5.2 52.8, 4.5 52.8, 4.5 52.2)))
|
||||
has_type:
|
||||
has_label: MULTIPOLYGON
|
||||
coordinate_reference_system: EPSG:4326
|
||||
spatial_resolution: REGION
|
||||
has_geofeature:
|
||||
- has_type: AdministrativeBoundary
|
||||
has_code:
|
||||
has_label: A.ADM1
|
||||
temporal_extent:
|
||||
begin_of_the_begin: '1920-01-01'
|
||||
end_of_the_end: '2001-01-01'
|
||||
- Use this class for measurable geodata, not nominal place references.
|
||||
- Link nominal place references through dedicated place classes.
|
||||
- Temporal extent tracks boundary or footprint change over time.
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
specificity_score: 0.55
|
||||
specificity_rationale: Primary geospatial feature class for coordinates and geometry.
|
||||
custodian_types: '["*"]'
|
||||
@@ -4,11 +4,9 @@ title: Geographic Extent Class
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
dcterms: http://purl.org/dc/terms/
|
||||
|
||||
schema: http://schema.org/
|
||||
default_prefix: hc
|
||||
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../metadata
|
||||
@@ -17,18 +15,15 @@ imports:
classes:
|
||||
GeographicExtent:
|
||||
class_uri: dcterms:Location
|
||||
description: >-
|
||||
A geographic area defining the scope or extent (e.g., eligible countries).
|
||||
|
||||
**Ontology Alignment**:
|
||||
- **Primary**: `dcterms:Location`
|
||||
- **Close**: `schema:Place`
|
||||
|
||||
description: Geographic area used to define spatial applicability or coverage.
|
||||
exact_mappings:
|
||||
- dcterms:Location
|
||||
close_mappings:
|
||||
- schema:Place
|
||||
slots:
|
||||
- has_label
|
||||
- identified_by
|
||||
|
||||
- has_label
|
||||
annotations:
|
||||
custodian_types: '["*"]'
|
||||
specificity_score: 0.3
|
||||
specificity_rationale: Geographic metadata.
|
||||
specificity_rationale: Spatial scope descriptor for policies and eligibility.
|
||||
@@ -1,23 +1,25 @@
id: https://nde.nl/ontology/hc/class/GeographicScope
|
||||
name: GeographicScope
|
||||
title: Geographic Scope
|
||||
description: The geographic scope or coverage of an entity (e.g., local, regional, national). MIGRATED from geographic_scope slot per Rule 53. Follows skos:Concept.
|
||||
title: Geographic Scope Class
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../slots/has_description
|
||||
- ../slots/has_label
|
||||
default_prefix: hc
|
||||
- ../slots/has_description
|
||||
classes:
|
||||
GeographicScope:
|
||||
class_uri: skos:Concept
|
||||
description: Controlled concept describing scale of geographic coverage.
|
||||
broad_mappings:
|
||||
- skos:Concept
|
||||
slots:
|
||||
- has_label
|
||||
- has_description
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
specificity_score: 0.25
|
||||
specificity_rationale: Controlled scope vocabulary for local-to-global coverage.
|
||||
custodian_types: '["*"]'
|
||||
@@ -1,34 +1,37 @@
id: https://nde.nl/ontology/hc/class/Geometry
|
||||
name: Geometry
|
||||
title: Geometry
|
||||
description: A spatial geometry (point, polygon, etc.). MIGRATED from geometry_type/geometry_wkt slots. Follows GeoSPARQL Geometry.
|
||||
title: Geometry Class
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
geosparql: http://www.opengis.net/ont/geosparql#
|
||||
schema: http://schema.org/
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../slots/has_description
|
||||
- ../slots/has_format
|
||||
- ../slots/has_label
|
||||
- ../slots/has_description
|
||||
- ../slots/has_type
|
||||
default_prefix: hc
|
||||
- ../slots/has_format
|
||||
classes:
|
||||
Geometry:
|
||||
class_uri: geosparql:Geometry
|
||||
description: Spatial geometry representation, such as point, line, or polygon.
|
||||
exact_mappings:
|
||||
- geosparql:Geometry
|
||||
close_mappings:
|
||||
- schema:GeoShape
|
||||
slots:
|
||||
- has_label
|
||||
- has_description
|
||||
- has_type
|
||||
- has_format
|
||||
slot_usage:
|
||||
has_format:
|
||||
# range: string # uriorcurie
|
||||
required: true
|
||||
has_type:
|
||||
# range: string # uriorcurie
|
||||
required: true
|
||||
has_format:
|
||||
required: true
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
specificity_score: 0.35
|
||||
specificity_rationale: Core geometric encoding object for geospatial data.
|
||||
custodian_types: '["*"]'
|
||||
@@ -1,25 +1,26 @@
id: https://nde.nl/ontology/hc/class/GeometryType
|
||||
name: GeometryType
|
||||
title: Geometry Type
|
||||
description: Abstract base class for geometry types (e.g., Point, Polygon). MIGRATED from geometry_type slot per Rule 0b.
|
||||
title: Geometry Type Class
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
geosparql: http://www.opengis.net/ont/geosparql#
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../slots/has_description
|
||||
- ../slots/has_label
|
||||
default_prefix: hc
|
||||
- ../slots/has_description
|
||||
classes:
|
||||
GeometryType:
|
||||
class_uri: skos:Concept
|
||||
abstract: true
|
||||
description: Abstract controlled concept for geometry shape types.
|
||||
broad_mappings:
|
||||
- skos:Concept
|
||||
slots:
|
||||
- has_label
|
||||
- has_description
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
specificity_score: 0.25
|
||||
specificity_rationale: Geometry shape taxonomy base class.
|
||||
custodian_types: '["*"]'
|
||||
|
|
|
|||
|
|
@@ -1,63 +1,49 @@
|
|||
id: https://nde.nl/ontology/hc/class/GeometryTypes
|
||||
name: GeometryTypes
|
||||
title: Geometry Type Subclasses
|
||||
description: Concrete subclasses of GeometryType representing specific geometry types.
|
||||
Based on GeoSPARQL geometry types.
|
||||
title: Geometry Types Class Module
|
||||
prefixes:
|
||||
geo: http://www.opengis.net/ont/geosparql#
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
geosparql: http://www.opengis.net/ont/geosparql#
|
||||
sf: http://www.opengis.net/ont/sf#
|
||||
geo: http://www.opengis.net/ont/geosparql#
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- ./GeometryType
|
||||
- linkml:types
|
||||
default_prefix: hc
|
||||
classes:
|
||||
Point:
|
||||
is_a: GeometryType
|
||||
class_uri: sf:Point
|
||||
description: A single point geometry.
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: '[''*'']'
|
||||
description: Point geometry type.
|
||||
broad_mappings:
|
||||
- geo:Geometry
|
||||
- sf:Geometry
|
||||
Polygon:
|
||||
is_a: GeometryType
|
||||
class_uri: sf:Polygon
|
||||
description: A polygon geometry.
|
||||
description: Polygon geometry type.
|
||||
broad_mappings:
|
||||
- geo:Geometry
|
||||
- sf:Geometry
|
||||
MultiPolygon:
|
||||
is_a: GeometryType
|
||||
class_uri: sf:MultiPolygon
|
||||
description: A collection of polygons.
|
||||
description: Multi polygon geometry type.
|
||||
broad_mappings:
|
||||
- geo:Geometry
|
||||
- sf:Geometry
|
||||
LineString:
|
||||
is_a: GeometryType
|
||||
class_uri: sf:LineString
|
||||
description: A line string geometry.
|
||||
description: Line string geometry type.
|
||||
broad_mappings:
|
||||
- geo:Geometry
|
||||
- sf:Geometry
|
||||
MultiLineString:
|
||||
is_a: GeometryType
|
||||
class_uri: sf:MultiLineString
|
||||
description: A collection of line strings.
|
||||
description: Multi line string geometry type.
|
||||
broad_mappings:
|
||||
- geo:Geometry
|
||||
- sf:Geometry
|
||||
MultiPoint:
|
||||
is_a: GeometryType
|
||||
class_uri: sf:MultiPoint
|
||||
description: A collection of points.
|
||||
description: Multi point geometry type.
|
||||
broad_mappings:
|
||||
- geo:Geometry
|
||||
- sf:Geometry
|
||||
|
|
|
|||
|
|
@@ -1,20 +1,24 @@
|
|||
id: https://nde.nl/ontology/hc/class/GeospatialIdentifier
|
||||
name: GeospatialIdentifier
|
||||
title: Geospatial Identifier
|
||||
description: A unique identifier for a geospatial feature (e.g., from GeoSPARQL). MIGRATED from geospatial_id slot per Rule 53. Follows geosparql:Feature.
|
||||
title: Geospatial Identifier Class
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
dcterms: http://purl.org/dc/terms/
|
||||
geosparql: http://www.opengis.net/ont/geosparql#
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
default_prefix: hc
|
||||
classes:
|
||||
GeospatialIdentifier:
|
||||
is_a: Identifier
|
||||
class_uri: geosparql:Feature
|
||||
description: A persistent URI or identifier for a geospatial feature.
|
||||
class_uri: hc:GeospatialIdentifier
|
||||
description: Persistent identifier for a geospatial entity in an external or internal system.
|
||||
close_mappings:
|
||||
- dcterms:Identifier
|
||||
related_mappings:
|
||||
- geosparql:Feature
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
specificity_score: 0.3
|
||||
specificity_rationale: Identifier class for geospatial record linking.
|
||||
custodian_types: '["*"]'
|
||||
|
|
|
|||
|
|
@@ -1,7 +1,6 @@
|
|||
id: https://nde.nl/ontology/hc/class/GeospatialLocation
|
||||
name: GeospatialLocation
|
||||
title: GeospatialLocation
|
||||
description: A specific geospatial location.
|
||||
title: Geospatial Location Class
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
|
|
@@ -13,10 +12,12 @@ imports:
|
|||
classes:
|
||||
GeospatialLocation:
|
||||
class_uri: schema:GeoCoordinates
|
||||
description: Geospatial location.
|
||||
description: Coordinate-based geospatial location.
|
||||
exact_mappings:
|
||||
- schema:GeoCoordinates
|
||||
slots:
|
||||
- has_location
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
specificity_score: 0.25
|
||||
specificity_rationale: Coordinate wrapper used in geospatial modeling.
|
||||
custodian_types: '["*"]'
|
||||
|
|
|
|||
|
|
@@ -1,40 +1,30 @@
|
|||
id: https://nde.nl/ontology/hc/classes/GhcidBlock
|
||||
name: GhcidBlock
|
||||
title: GhcidBlock
|
||||
title: Ghcid Block Class
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
xsd: http://www.w3.org/2001/XMLSchema#
|
||||
dcterms: http://purl.org/dc/terms/
|
||||
crm: http://www.cidoc-crm.org/cidoc-crm/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
rdfs: http://www.w3.org/2000/01/rdf-schema#
|
||||
org: http://www.w3.org/ns/org#
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
schema: http://schema.org/
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
# default_range: string
|
||||
- ../slots/identified_by
|
||||
classes:
|
||||
GhcidBlock:
|
||||
description: "GHCID (Global Heritage Custodian Identifier) generation metadata\
|
||||
\ and history. Contains current GHCID string, UUID variants (v5, v8), numeric\
|
||||
\ form, generation timestamp, and history of GHCID changes due to relocations,\
|
||||
\ mergers, or collision resolution.\nOntology mapping rationale: - class_uri\
|
||||
\ is dcterms:Identifier because GHCID is fundamentally\n an identifier assignment\
|
||||
\ with associated metadata\n- close_mappings includes prov:Entity as identifier\
|
||||
\ blocks are\n traceable provenance entities themselves\n- related_mappings\
|
||||
\ includes schema:PropertyValue (identifier as\n property) and prov:Generation\
|
||||
\ (identifier creation event)"
|
||||
class_uri: dcterms:Identifier
|
||||
description: Identifier metadata block capturing assignment, variants, and lifecycle history for GHCID values.
|
||||
exact_mappings:
|
||||
- dcterms:Identifier
|
||||
close_mappings:
|
||||
- prov:Entity
|
||||
related_mappings:
|
||||
- schema:PropertyValue
|
||||
- prov:Generation
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: '[''*'']'
|
||||
slots:
|
||||
- identified_by
|
||||
annotations:
|
||||
specificity_score: 0.35
|
||||
specificity_rationale: Identifier lifecycle container for custody-level identifier governance.
|
||||
custodian_types: '["*"]'
|
||||
|
|
|
|||
|
|
@@ -21,8 +21,8 @@ enums:
|
|||
TIER_1_AUTHORITATIVE:
|
||||
description: Official registry data (NDE CSV, Nationaal Archief ISIL)
|
||||
TIER_2_VERIFIED:
|
||||
description: Verified external sources (Wikidata, Google Maps, Genealogiewerkbalk)
|
||||
description: Verified external sources (Wikidata, Google Maps, genealogy archive registries)
|
||||
TIER_3_CROWD_SOURCED:
|
||||
description: Community-contributed data (reviews, user edits)
|
||||
TIER_4_INFERRED:
|
||||
description: Algorithmically extracted (website scrape, Exa search)
|
||||
description: Algorithmically extracted (website scrape, external search)
|
||||
|
|
|
|||