Compare commits
No commits in common. "8d7a8e5362b3253fd6a521412d69eced5a80e2e3" and "ca3ce0bd11cd900f35fb736d0fbb132c53c076be" have entirely different histories.
8d7a8e5362
...
ca3ce0bd11
3742 changed files with 147890 additions and 230579 deletions
|
|
@ -1,74 +0,0 @@
|
|||
# Archive Organization Type Description Rule
|
||||
|
||||
## Rule
|
||||
|
||||
When describing archive classes that do NOT have `recordType` or `hold_record_set` as a primary distinguishing feature, emphasize that they represent the **archive as an organization/institution**, not just a collection of records.
|
||||
|
||||
## Rationale
|
||||
|
||||
Many archive type classes (e.g., `BankArchive`, `ChurchArchive`, `MunicipalArchive`) classify the **type of organization** that maintains the records, rather than the type of records themselves. This is an important semantic distinction:
|
||||
|
||||
- **Archive Organization Types** (no recordType focus): Classify the institution by its domain/sector
|
||||
- Examples: `BankArchive`, `ChurchArchive`, `MunicipalArchive`, `UniversityArchive`
|
||||
- Emphasis: The organization's mission, governance, and institutional context
|
||||
|
||||
- **Record Set Types** (have recordType): Classify the collections by record type
|
||||
- Examples: `AudiovisualArchiveRecordSetType`, `PhotographicArchiveRecordSetType`
|
||||
- Emphasis: The nature and format of the records
|
||||
|
||||
## Description Pattern
|
||||
|
||||
### For Archive Organization Types (WITHOUT recordType):
|
||||
|
||||
```yaml
|
||||
description: >-
|
||||
Type of heritage institution that [primary function], specializing in
|
||||
[domain/subject area], with organizational characteristics including
|
||||
[governance, funding, legal status, or other institutional features].
|
||||
```
|
||||
|
||||
**Key elements to include:**
|
||||
1. "Type of heritage institution" or "Type of archive organization"
|
||||
2. The institution's primary domain or sector
|
||||
3. Organizational characteristics (governance, funding, legal status)
|
||||
4. Institutional context (parent organization, regulatory framework)
|
||||
5. Typical services and public-facing functions
|
||||
|
||||
### For Record Set Types (WITH recordType):
|
||||
|
||||
```yaml
|
||||
description: >-
|
||||
Classification of archival records documenting [subject/domain],
|
||||
typically including [record formats, content types, provenance patterns].
|
||||
```
|
||||
|
||||
## Examples
|
||||
|
||||
### ✅ Correct - Archive Organization Type (BankArchive):
|
||||
|
||||
```yaml
|
||||
description: >-
|
||||
Type of heritage institution operating within the banking sector, preserving
|
||||
records of financial institutions and documenting banking history. Characterized
|
||||
by corporate governance structures, extended closure periods for personal data,
|
||||
and institutional relationships with parent banking organizations.
|
||||
```
|
||||
|
||||
### ✅ Correct - Record Set Type (has recordType):
|
||||
|
||||
```yaml
|
||||
description: >-
|
||||
Classification of archival records documenting banking activities, including
|
||||
ledgers, correspondence, customer accounts, and financial instruments.
|
||||
```
|
||||
|
||||
## Files Affected
|
||||
|
||||
All classes in the `*Archive` family that:
|
||||
- Do NOT have `hold_record_set` or `recordType` as a primary slot
|
||||
- Are subclassed from `ArchiveOrganizationType` (not `ArchiveRecordSetType`)
|
||||
|
||||
## Related Rules
|
||||
|
||||
- `mapping-specificity-hypernym-rule.md` - For correct ontology mappings
|
||||
- `class-description-quality-rule.md` - For general description quality
|
||||
|
|
@ -174,6 +174,6 @@ This approach:
|
|||
## See Also
|
||||
|
||||
- Rule 38: Slot Centralization and Semantic URI Requirements
|
||||
- Rule: Slot Naming Convention (Current Style)
|
||||
- Rule 39: Slot Naming Convention (RiC-O Style)
|
||||
- Rule 49: Slot Usage Minimization
|
||||
- LinkML Documentation: [slot_usage](https://linkml.io/linkml-model/latest/docs/slot_usage/)
|
||||
|
|
|
|||
|
|
@ -6,7 +6,7 @@ When resolving slot aliases to canonical names, a slot name that has its own `.y
|
|||
|
||||
## Context
|
||||
|
||||
Slot files in `schemas/20251121/linkml/modules/slots/` (top-level and `new/`) each define a canonical slot name. Some slot files also list aliases that overlap with canonical names from other slot files. These cross-references are accidental (e.g., indicating semantic relatedness) and should be corrected by removing the canonical names from the alias lists in which they occur. The occurrence of canonical names in alias lists does NOT mean the referenced slot should be renamed.
|
||||
Slot files in `schemas/20251121/linkml/modules/slots/20260202_matang/` (top-level and `new/`) each define a canonical slot name. Some slot files also list aliases that overlap with canonical names from other slot files. These cross-references are intentional (e.g., indicating semantic relatedness) but do NOT mean the referenced slot should be renamed.
|
||||
|
||||
## Rule
|
||||
|
||||
|
|
@ -58,4 +58,4 @@ def should_rename(slot_name, alias_map, existing_slot_files):
|
|||
|
||||
## Rationale
|
||||
|
||||
Multiple slot files may list overlapping aliases by accident or for documentation or semantic linking purposes. A canonical slot name appearing as an alias in another file does not invalidate the original slot definition. Treating it as an alias would incorrectly redirect class files away from the slot's own definition, breaking the schema's intended structure.
|
||||
Multiple slot files may list overlapping aliases for documentation or semantic linking purposes. A canonical slot name appearing as an alias in another file does not invalidate the original slot definition. Treating it as an alias would incorrectly redirect class files away from the slot's own definition, breaking the schema's intended structure.
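
The rule can be sketched in TypeScript as a mirror of the `should_rename` signature referenced above. This is a minimal sketch, not the project's implementation: the data shapes (`AliasMap` as alias to canonical name, `existingSlotFiles` as a list of slot names that have their own `.yaml` file) and the slot names in the usage note are assumptions made for illustration.

```typescript
// Hypothetical shapes: alias -> canonical slot name.
type AliasMap = Record<string, string>;

function shouldRename(
  slotName: string,
  aliasMap: AliasMap,
  existingSlotFiles: string[], // slot names that have their own .yaml file
): boolean {
  // A slot with its own file is canonical, even if another file lists it as an alias.
  if (existingSlotFiles.indexOf(slotName) !== -1) {
    return false;
  }
  // Otherwise rename only when the alias map points to a different canonical name.
  if (!Object.prototype.hasOwnProperty.call(aliasMap, slotName)) {
    return false;
  }
  return aliasMap[slotName] !== slotName;
}
```

With the illustrative map `{ has_name: "has_label", title_text: "has_label" }` and slot files for `has_name` and `has_label`, `has_name` is left alone because it has its own file, while `title_text` is renamed.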
|
||||
|
|
|
|||
|
|
@ -1,48 +0,0 @@
|
|||
# Rule: Capitalization Consistency for LinkML Names
|
||||
|
||||
## Purpose
|
||||
|
||||
Ensure naming is consistent across LinkML classes, slots, enums, and their files,
|
||||
with special care for acronyms (for example: `GLAM`, `GHC`, `GHCID`, `GLEIF`).
|
||||
|
||||
## Mandatory Requirements
|
||||
|
||||
1. **Class names**
|
||||
- Use `PascalCase`.
|
||||
- Preserve canonical acronym casing.
|
||||
- Example: `GHCIdentifier`, not `GhcidIdentifier`.
|
||||
|
||||
2. **Slot names**
|
||||
- Use project slot naming convention consistently.
|
||||
- If an acronym appears in a slot name, keep its canonical uppercase form.
|
||||
- Example: `has_GHCID_history` (if the slot name must contain the acronym), not `has_ghcid_history`.
|
||||
|
||||
3. **Enum names**
|
||||
- Use `PascalCase` with `Enum` suffix where applicable.
|
||||
- Preserve acronym casing in enum identifiers and permissible values.
|
||||
- Example: `GLAMTypeEnum`.
|
||||
|
||||
4. **File names must match primary term exactly**
|
||||
- Class file name must match class name (case-sensitive) plus `.yaml`.
|
||||
- Enum file name must match enum name (case-sensitive) plus `.yaml`.
|
||||
- Slot file name must match slot name (case-sensitive) plus `.yaml`.
|
||||
|
||||
5. **No mixed acronym variants in same schema branch**
|
||||
- Do not mix forms like `Ghcid`, `GHCID`, and `ghcid` for the same concept.
|
||||
- Pick canonical form once and use it everywhere.
|
||||
|
||||
## Refactoring Rule
|
||||
|
||||
When normalizing capitalization:
|
||||
|
||||
- Update term declaration (`name`, class/slot/enum key).
|
||||
- Update file name to match.
|
||||
- Update all imports and references transitively.
|
||||
- Do not leave aliases as operational identifiers; keep aliases only for lexical metadata.
|
||||
|
||||
## Validation Checklist
|
||||
|
||||
- [ ] Class, slot, enum declarations use canonical casing.
|
||||
- [ ] File names exactly match declaration names.
|
||||
- [ ] Acronyms are consistent across declarations and references.
|
||||
- [ ] Imports and references resolve after renaming.
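
The "no mixed acronym variants" requirement can be checked mechanically. The following is a minimal sketch under stated assumptions: `findAcronymVariants` is a hypothetical helper, the acronym is assumed to be purely alphanumeric (it is used as a regular expression without escaping), and the list of names is supplied by the caller rather than read from schema files.

```typescript
// Returns every non-canonical spelling of `acronym` found in `names`.
// Assumes `acronym` contains only letters and digits.
function findAcronymVariants(names: string[], acronym: string): string[] {
  const pattern = new RegExp(acronym, "gi"); // case-insensitive, all occurrences
  const variants: string[] = [];
  for (const name of names) {
    const matches = name.match(pattern) || [];
    for (const match of matches) {
      // Keep only spellings that differ from the canonical form, once each.
      if (match !== acronym && variants.indexOf(match) === -1) {
        variants.push(match);
      }
    }
  }
  return variants;
}
```

Calling it with `["has_ghcid_history", "GhcidIdentifier", "has_GHCID_history"]` and `"GHCID"` reports `ghcid` and `Ghcid`. Note that a shorter acronym such as `GHC` also matches inside `GHCID`, so overlapping acronyms would need extra handling.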
|
||||
|
|
@ -1,228 +0,0 @@
|
|||
# Class Description Quality Rule
|
||||
|
||||
## Rule: Write Dictionary-Style Definitions Without Repeating the Class Name
|
||||
|
||||
When writing class descriptions, follow these principles.
|
||||
|
||||
### 1. No Repetition of Class Name Components
|
||||
|
||||
**WRONG:**
|
||||
```yaml
|
||||
AcademicArchiveRecordSetType:
|
||||
description: >-
|
||||
A classification type for archival record sets created by academic
|
||||
institutions. This class represents the record set type...
|
||||
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
|
||||
AcademicArchiveRecordSetType:
|
||||
description: >-
|
||||
Category for grouping documentary materials accumulated by tertiary
|
||||
educational institutions during their administrative, academic, and
|
||||
operational activities.
|
||||
```
|
||||
|
||||
The description should define the concept using synonyms and related terms, not repeat words from the class name.
|
||||
|
||||
### 2. MIGRATE Structured Data Before Removing from Descriptions
|
||||
|
||||
**CRITICAL**: When a description contains structured data (examples, typical contents, alignment notes, etc.), you MUST:
|
||||
|
||||
1. **First check** if the structured data already exists in proper LinkML fields
|
||||
2. **If NOT present**: ADD it to the appropriate structured field
|
||||
3. **ONLY THEN**: Remove it from the description
|
||||
|
||||
**Never simply delete structured content from descriptions without preserving it elsewhere.**
|
||||
|
||||
**MIGRATION CHECKLIST:**
|
||||
|
||||
| Content Type | Target Field | Example |
|
||||
|--------------|--------------|---------|
|
||||
| Example instances | `examples:` | `- value: {...} description: "..."` |
|
||||
| Typical contents | `keywords:` or `comments:` | List of typical materials |
|
||||
| Alignment explanations | `broad_mappings`, `related_mappings` | Ontology references |
|
||||
| Usage notes | `comments:` | Operational guidance |
|
||||
| Provenance notes | `comments:` or `annotations:` | Historical context |
|
||||
| Privacy/legal notes | `comments:` | Access restrictions |
|
||||
| Definition details | Keep in description | Core semantic definition |
|
||||
|
||||
**WRONG - Deleting without migration:**
|
||||
```yaml
|
||||
# BEFORE (has rich content)
|
||||
description: |
|
||||
Records documenting student academic careers.
|
||||
|
||||
**Typical Contents**:
|
||||
- Enrollment records
|
||||
- Academic transcripts
|
||||
- Graduation records
|
||||
|
||||
Subject to privacy regulations (FERPA, GDPR).
|
||||
|
||||
# AFTER (lost information!) - DON'T DO THIS
|
||||
description: >-
|
||||
Records documenting student academic careers.
|
||||
```
|
||||
|
||||
**CORRECT - Migrate first, then clean:**
|
||||
```yaml
|
||||
# Step 1: Add to structured fields
|
||||
description: >-
|
||||
Records documenting student academic careers.
|
||||
keywords:
|
||||
- enrollment records
|
||||
- academic transcripts
|
||||
- graduation records
|
||||
comments:
|
||||
- Subject to privacy regulations (FERPA, GDPR, AVG)
|
||||
- Access restrictions typically apply for records less than 75 years old
|
||||
|
||||
# Step 2: Now description is clean but no information lost
|
||||
```
|
||||
|
||||
### 3. No Structured Data or Meta-Discussion in Descriptions
|
||||
|
||||
After migration, descriptions should contain only the definition. Do not include:
|
||||
- Alignment explanations (use `broad_mappings`, `close_mappings`, `exact_mappings`)
|
||||
- Pattern explanations (use `see_also`, `comments`)
|
||||
- Usage examples (use `examples:` annotation)
|
||||
- Rationale for mappings (use `comments:` or `annotations:`)
|
||||
- Typical contents lists (use `keywords:` or `comments:`)
|
||||
|
||||
**WRONG:**
|
||||
```yaml
|
||||
description: >-
|
||||
A type for X.
|
||||
|
||||
**RiC-O Alignment**: Maps to rico:RecordSetType because...
|
||||
|
||||
**Pattern**: This is part of a dual-class pattern with Y.
|
||||
|
||||
**Examples**: Administrative fonds, student records...
|
||||
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
|
||||
description: >-
|
||||
Category for grouping documentary materials accumulated by tertiary
|
||||
educational institutions.
|
||||
|
||||
broad_mappings:
|
||||
- rico:RecordSetType
|
||||
see_also:
|
||||
- AcademicArchive
|
||||
keywords:
|
||||
- administrative fonds
|
||||
- student records
|
||||
examples:
|
||||
- value: {...}
|
||||
description: Administrative fonds containing governance records
|
||||
```
|
||||
|
||||
### 4. Use Folded Block Scalar (`>-`) for Descriptions
|
||||
|
||||
Use `>-` (folded, strip) instead of `|` (literal) to ensure clean paragraph formatting in generated documentation.
|
||||
|
||||
**WRONG:**
|
||||
```yaml
|
||||
description: |
|
||||
A type for X.
|
||||
This spans multiple lines.
|
||||
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
|
||||
description: >-
|
||||
A type for X. This will be formatted as a single clean paragraph
|
||||
in the generated documentation.
|
||||
```
|
||||
|
||||
### 5. Use LinkML `examples:` Annotation for Examples
|
||||
|
||||
Structure examples properly with `value:` and `description:` keys.
|
||||
|
||||
```yaml
|
||||
examples:
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: University Administrative Records
|
||||
description: Administrative fonds containing governance records
|
||||
```
|
||||
|
||||
### 6. Keywords vs Examples - Know the Difference
|
||||
|
||||
**CRITICAL**: Do not confuse `keywords:` with `examples:`. They serve different purposes:
|
||||
|
||||
| Field | Purpose | Content Type |
|
||||
|-------|---------|--------------|
|
||||
| `keywords:` | Search terms, topics, categories | List of strings (topics/materials) |
|
||||
| `examples:` | Valid instance data demonstrations | Structured objects with `value` and `description` |
|
||||
|
||||
**Keywords** = Topics, material types, categories that describe what the class is about:
|
||||
```yaml
|
||||
keywords:
|
||||
- enrollment records # type of material
|
||||
- academic transcripts # type of material
|
||||
- graduation records # type of material
|
||||
```
|
||||
|
||||
**Examples** = Actual instances of the class with populated slots:
|
||||
```yaml
|
||||
examples:
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: Registrar Student Records
|
||||
has_note: Enrollment, transcripts, graduation records
|
||||
description: Student records series from the registrar's office
|
||||
```
|
||||
|
||||
**WRONG - Using keywords as examples:**
|
||||
```yaml
|
||||
# DON'T: "enrollment records" is not an instance of AcademicStudentRecordSeries
|
||||
examples:
|
||||
- value: enrollment records
|
||||
description: Type of student record
|
||||
```
|
||||
|
||||
**CORRECT - Keywords for topics, examples for instances:**
|
||||
```yaml
|
||||
keywords:
|
||||
- enrollment records
|
||||
- academic transcripts
|
||||
- graduation records
|
||||
examples:
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: Historical Student Records
|
||||
has_note: Pre-1950 student records with fewer access restrictions
|
||||
description: Historical student records open for research access
|
||||
```
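
The keywords-versus-examples distinction lends itself to a simple lint. The sketch below is an assumption about how such a check might look, not an existing tool: `invalidExamples` is a hypothetical helper that flags entries whose `value` is not a structured object or whose `description` is missing, operating on data that has already been parsed from YAML.

```typescript
// Assumed shape of one parsed `examples:` entry.
interface ExampleEntry {
  value: unknown;
  description?: string;
}

// Returns the indexes of entries that look like keywords rather than instances.
function invalidExamples(examples: ExampleEntry[]): number[] {
  const bad: number[] = [];
  examples.forEach((example, index) => {
    const isObject =
      typeof example.value === "object" &&
      example.value !== null &&
      !Array.isArray(example.value);
    if (!isObject || !example.description) {
      bad.push(index);
    }
  });
  return bad;
}
```

An entry such as `value: enrollment records` is reported, while an entry whose `value` holds populated slots passes.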
|
||||
|
||||
### 7. Multiple Examples for Different Use Cases
|
||||
|
||||
Provide multiple examples to show different contexts or configurations:
|
||||
|
||||
```yaml
|
||||
examples:
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: Recent Student Records
|
||||
description: Current records subject to privacy restrictions
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: Historical Student Records
|
||||
description: Records 75+ years old with fewer access restrictions
|
||||
```
|
||||
|
||||
## Summary
|
||||
|
||||
| Element | Placement |
|
||||
|---------|-----------|
|
||||
| Definition | `description:` (concise, no repetition) |
|
||||
| Ontology mappings | `exact_mappings`, `broad_mappings`, etc. |
|
||||
| Related concepts | `see_also:` |
|
||||
| Usage notes | `comments:` |
|
||||
| Metadata | `annotations:` |
|
||||
| Examples | `examples:` with `value` and `description` |
|
||||
| Typical contents | `keywords:` or `comments:` |
|
||||
|
|
@ -1,54 +0,0 @@
|
|||
# Rule: Class File Name Must Match Class Label/Name
|
||||
|
||||
## 🚨 Critical
|
||||
|
||||
When a class label/name is changed, the class file name must be renamed to match.
|
||||
|
||||
This keeps class modules discoverable, prevents stale imports, and avoids long-term naming drift.
|
||||
|
||||
## The Rule
|
||||
|
||||
1. If the primary class identifier changes, rename the file in the same edit set.
|
||||
- Change triggers include updates to:
|
||||
- top-level `name:`
|
||||
- class key under `classes:`
|
||||
- canonical class label used for module naming
|
||||
|
||||
2. File naming must reflect the canonical class name.
|
||||
- ✅ `DigitalPlatformProfile.yaml` for class `DigitalPlatformProfile`
|
||||
- ❌ `DigitalPlatformV2.yaml` for class `DigitalPlatformProfile`
|
||||
|
||||
3. After renaming a file, update all references.
|
||||
- `imports:` in other class/slot/type files
|
||||
- manifests/indexes/build inputs
|
||||
- any generated or curated mapping lists that include file paths
|
||||
|
||||
4. Keep semantic names versionless.
|
||||
- Do not preserve old versioned file names when class names are de-versioned.
|
||||
- Coordinate with `no-version-indicators-in-names-rule.md`.
|
||||
|
||||
## Required Checklist
|
||||
|
||||
- [ ] File name matches canonical class name
|
||||
- [ ] `id:` and `name:` are internally consistent
|
||||
- [ ] All import paths updated
|
||||
- [ ] Search confirms no stale old file-name references remain
|
||||
- [ ] YAML parses after rename
|
||||
|
||||
## Example
|
||||
|
||||
Before:
|
||||
```yaml
|
||||
# file: DigitalPlatformV2.yaml
|
||||
name: DigitalPlatformProfile
|
||||
classes:
|
||||
DigitalPlatformProfile:
|
||||
```
|
||||
|
||||
After:
|
||||
```yaml
|
||||
# file: DigitalPlatformProfile.yaml
|
||||
name: DigitalPlatformProfile
|
||||
classes:
|
||||
DigitalPlatformProfile:
|
||||
```
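
The first checklist item can be automated. The following is a minimal sketch, assuming the canonical class name has already been read from the file; `classFileNameMatches` is a hypothetical helper and performs a case-sensitive comparison of the file stem against that name.

```typescript
// True when the file is named exactly "<className>.yaml" (case-sensitive).
function classFileNameMatches(filePath: string, className: string): boolean {
  const fileName = filePath.slice(filePath.lastIndexOf("/") + 1);
  const suffix = ".yaml";
  if (fileName.length <= suffix.length) {
    return false;
  }
  if (fileName.slice(-suffix.length) !== suffix) {
    return false;
  }
  const stem = fileName.slice(0, -suffix.length);
  return stem === className;
}
```

For the example above, `DigitalPlatformProfile.yaml` passes for class `DigitalPlatformProfile`, while `DigitalPlatformV2.yaml` fails.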
|
||||
|
|
@ -135,6 +135,6 @@ The following class files have been identified as defining their own slots and r
|
|||
## See Also
|
||||
|
||||
- Rule 38: Slot Centralization and Semantic URI Requirements
|
||||
- Rule: Slot Naming Convention (Current Style)
|
||||
- Rule 39: Slot Naming Convention (RiC-O Style)
|
||||
- Rule 42: No Ontology Prefixes in Slot Names
|
||||
- Rule 43: Slot Nouns Must Be Singular
|
||||
|
|
|
|||
|
|
@ -1,158 +0,0 @@
|
|||
# Class Multilingual Support Rule
|
||||
|
||||
## Rule: All Class Files Must Include Multilingual Descriptions and Aliases
|
||||
|
||||
Every class file must provide `alt_descriptions` and `structured_aliases` in all supported languages to ensure internationalization and interoperability with multilingual heritage systems.
|
||||
|
||||
### Required Languages
|
||||
|
||||
| Code | Language |
|
||||
|------|----------|
|
||||
| `nl` | Dutch |
|
||||
| `de` | German |
|
||||
| `fr` | French |
|
||||
| `es` | Spanish |
|
||||
| `ar` | Arabic |
|
||||
| `id` | Indonesian |
|
||||
| `zh` | Chinese |
|
||||
|
||||
### Structure
|
||||
|
||||
#### alt_descriptions
|
||||
|
||||
Provide translated descriptions for each supported language:
|
||||
|
||||
```yaml
|
||||
classes:
|
||||
AcademicArchiveRecordSetType:
|
||||
description: >-
|
||||
Category for grouping documentary materials accumulated by tertiary
|
||||
educational institutions during their administrative, academic, and
|
||||
operational activities.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Categorie voor het groeperen van documentair materiaal dat door
|
||||
hogeronderwijsinstellingen is verzameld tijdens hun administratieve,
|
||||
academische en operationele activiteiten.
|
||||
de: >-
|
||||
Kategorie zur Gruppierung von Dokumentenmaterial, das von Hochschulen
|
||||
während ihrer administrativen, akademischen und betrieblichen Aktivitäten
|
||||
angesammelt wurde.
|
||||
fr: >-
|
||||
Catégorie de regroupement des documents accumulés par les établissements
|
||||
d'enseignement supérieur au cours de leurs activités administratives,
|
||||
académiques et opérationnelles.
|
||||
es: >-
|
||||
Categoría para agrupar materiales documentales acumulados por instituciones
|
||||
de educación superior durante sus actividades administrativas, académicas
|
||||
y operativas.
|
||||
ar: >-
|
||||
فئة لتجميع المواد الوثائقية التي جمعتها مؤسسات التعليم العالي
|
||||
خلال أنشطتها الإدارية والأكاديمية والتشغيلية.
|
||||
id: >-
|
||||
Kategori untuk mengelompokkan materi dokumenter yang dikumpulkan oleh
|
||||
institusi pendidikan tinggi selama aktivitas administratif, akademik,
|
||||
dan operasional mereka.
|
||||
zh: >-
|
||||
高等教育机构在行政、学术和运营活动中积累的文献材料的分类类别。
|
||||
```
|
||||
|
||||
#### structured_aliases
|
||||
|
||||
Provide language-specific aliases/alternative names:
|
||||
|
||||
```yaml
|
||||
classes:
|
||||
AcademicArchiveRecordSetType:
|
||||
structured_aliases:
|
||||
- literal_form: academisch archiefbestand
|
||||
in_language: nl
|
||||
- literal_form: Hochschularchivbestand
|
||||
in_language: de
|
||||
- literal_form: fonds d'archives académiques
|
||||
in_language: fr
|
||||
- literal_form: fondo de archivo académico
|
||||
in_language: es
|
||||
- literal_form: أرشيف أكاديمي
|
||||
in_language: ar
|
||||
- literal_form: koleksi arsip akademik
|
||||
in_language: id
|
||||
- literal_form: 学术档案集
|
||||
in_language: zh
|
||||
```
|
||||
|
||||
### Complete Example
|
||||
|
||||
```yaml
|
||||
id: https://nde.nl/ontology/hc/class/AcademicArchiveRecordSetType
|
||||
name: AcademicArchiveRecordSetType
|
||||
title: Academic Archive Record Set Type
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../classes/CollectionType
|
||||
classes:
|
||||
AcademicArchiveRecordSetType:
|
||||
description: >-
|
||||
Category for grouping documentary materials accumulated by tertiary
|
||||
educational institutions during their administrative, academic, and
|
||||
operational activities.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Categorie voor het groeperen van documentair materiaal dat door
|
||||
hogeronderwijsinstellingen is verzameld.
|
||||
de: >-
|
||||
Kategorie zur Gruppierung von Dokumentenmaterial, das von Hochschulen
|
||||
angesammelt wurde.
|
||||
fr: >-
|
||||
Catégorie de regroupement des documents accumulés par les établissements
|
||||
d'enseignement supérieur.
|
||||
es: >-
|
||||
Categoría para agrupar materiales documentales acumulados por instituciones
|
||||
de educación superior.
|
||||
ar: >-
|
||||
فئة لتجميع المواد الوثائقية التي جمعتها مؤسسات التعليم العالي.
|
||||
id: >-
|
||||
Kategori untuk mengelompokkan materi dokumenter yang dikumpulkan oleh
|
||||
institusi pendidikan tinggi.
|
||||
zh: >-
|
||||
高等教育机构积累的文献材料的分类类别。
|
||||
structured_aliases:
|
||||
- literal_form: academisch archiefbestand
|
||||
in_language: nl
|
||||
- literal_form: Hochschularchivbestand
|
||||
in_language: de
|
||||
- literal_form: fonds d'archives académiques
|
||||
in_language: fr
|
||||
- literal_form: fondo de archivo académico
|
||||
in_language: es
|
||||
- literal_form: أرشيف أكاديمي
|
||||
in_language: ar
|
||||
- literal_form: koleksi arsip akademik
|
||||
in_language: id
|
||||
- literal_form: 学术档案集
|
||||
in_language: zh
|
||||
is_a: CollectionType
|
||||
# ... rest of class definition
|
||||
```
|
||||
|
||||
### Translation Guidelines
|
||||
|
||||
1. **Accuracy over literal translation**: Translate the concept, not word for word
|
||||
2. **Use domain-appropriate terminology**: Use archival/library/museum terminology standard in each language
|
||||
3. **Consult existing vocabularies**: Reference RiC-O, ISAD(G), AAT translations when available
|
||||
4. **Maintain consistency**: The same term should be translated consistently across all class files
|
||||
|
||||
### Checklist
|
||||
|
||||
For each class file, verify:
|
||||
|
||||
- [ ] `alt_descriptions` present with all 7 languages
|
||||
- [ ] `structured_aliases` present with all 7 languages
|
||||
- [ ] Translations are accurate and domain-appropriate
|
||||
- [ ] Arabic text is correctly encoded and renders right-to-left (RTL)
|
||||
- [ ] Chinese uses simplified characters (zh) unless traditional is specified (zh-hant)
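
The first two checklist items can be verified programmatically. This is a minimal sketch under stated assumptions: the class definition has already been parsed from YAML into the shape shown in the examples above (`alt_descriptions` keyed by language code, `structured_aliases` as a list with `in_language`), and `missingLanguages` is a hypothetical helper name.

```typescript
// The seven required language codes from the table above.
const REQUIRED_LANGUAGES = ["nl", "de", "fr", "es", "ar", "id", "zh"];

interface StructuredAlias {
  literal_form: string;
  in_language: string;
}

// Assumed shape of a parsed class definition; other keys are ignored.
interface ClassDefinition {
  alt_descriptions?: Record<string, string>;
  structured_aliases?: StructuredAlias[];
}

// Reports which required languages lack a description or an alias.
function missingLanguages(def: ClassDefinition): { descriptions: string[]; aliases: string[] } {
  const described = Object.keys(def.alt_descriptions || {});
  const aliased = (def.structured_aliases || []).map((alias) => alias.in_language);
  return {
    descriptions: REQUIRED_LANGUAGES.filter((lang) => described.indexOf(lang) === -1),
    aliases: REQUIRED_LANGUAGES.filter((lang) => aliased.indexOf(lang) === -1),
  };
}
```

A class passes when both returned lists are empty. Translation quality and script direction still require human review.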
|
||||
|
|
@ -1,583 +0,0 @@
|
|||
# Rule 46: Ontology-Driven Cache Segmentation
|
||||
|
||||
🚨 **CRITICAL**: The semantic cache MUST use vocabulary derived from LinkML `*Type.yaml` and `*Types.yaml` schema files to extract entities for cache key generation. Hardcoded regex patterns are deprecated.
|
||||
|
||||
**Status**: Implemented (Evolved v2.0)
|
||||
**Version**: 2.0 (Epistemological Evolution)
|
||||
**Updated**: 2026-01-10
|
||||
|
||||
## Evolution Overview
|
||||
|
||||
Rule 46 v2.0 incorporates insights from Volodymyr Pavlyshyn's work on agentic memory systems:
|
||||
|
||||
1. **Epistemic Provenance** (Phase 1) - Track WHERE, WHEN, HOW data originated
|
||||
2. **Topological Distance** (Phase 2) - Use ontology structure, not just embeddings
|
||||
3. **Holarchic Cache** (Phase 3) - Entries as holons with up/down links
|
||||
4. **Message Passing** (Phase 4, planned) - Smalltalk-style introspectable cache
|
||||
5. **Clarity Trading** (Phase 5, planned) - Block ambiguous queries from cache
|
||||
|
||||
## Epistemic Provenance
|
||||
|
||||
Every cached response carries epistemological metadata:
|
||||
|
||||
```typescript
|
||||
interface EpistemicProvenance {
|
||||
dataSource: 'ISIL_REGISTRY' | 'WIKIDATA' | 'CUSTODIAN_YAML' | 'LLM_INFERENCE' | ...;
|
||||
dataTier: 1 | 2 | 3 | 4; // TIER_1_AUTHORITATIVE → TIER_4_INFERRED
|
||||
sourceTimestamp: string;
|
||||
derivationChain: string[]; // ["SPARQL:Qdrant", "RAG:retrieve", "LLM:generate"]
|
||||
revalidationPolicy: 'static' | 'daily' | 'weekly' | 'on_access';
|
||||
}
|
||||
```
|
||||
|
||||
**Benefit**: Users see "This answer is from TIER_1 ISIL registry data, captured 2025-01-08".
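
A user-facing message like the one above could be derived from the provenance record. The sketch below is illustrative only: it uses a narrowed copy of the interface limited to the four data sources named explicitly above, assumes `sourceTimestamp` is an ISO 8601 string, and the label table and `describeProvenance` helper are hypothetical.

```typescript
// Only the sources named explicitly in the interface above; the full union is elided there.
type KnownDataSource = "ISIL_REGISTRY" | "WIKIDATA" | "CUSTODIAN_YAML" | "LLM_INFERENCE";

interface ProvenanceSummary {
  dataSource: KnownDataSource;
  dataTier: 1 | 2 | 3 | 4;
  sourceTimestamp: string; // assumed ISO 8601, e.g. "2025-01-08T09:30:00Z"
}

// Hypothetical human-readable labels for each source.
const SOURCE_LABELS: Record<KnownDataSource, string> = {
  ISIL_REGISTRY: "ISIL registry",
  WIKIDATA: "Wikidata",
  CUSTODIAN_YAML: "custodian YAML",
  LLM_INFERENCE: "LLM inference",
};

function describeProvenance(p: ProvenanceSummary): string {
  const capturedOn = p.sourceTimestamp.slice(0, 10); // keep the YYYY-MM-DD part
  return "This answer is from TIER_" + p.dataTier + " " + SOURCE_LABELS[p.dataSource] +
    " data, captured " + capturedOn;
}
```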
|
||||
|
||||
## Topological Distance
|
||||
|
||||
Beyond embedding similarity, cache matching considers **structural distance** in the type hierarchy:
|
||||
|
||||
```
|
||||
HeritageCustodian (*)
|
||||
│
|
||||
┌──────────────────┼──────────────────┐
|
||||
▼ ▼ ▼
|
||||
MuseumType (M) ArchiveType (A) LibraryType (L)
|
||||
│ │ │
|
||||
┌────┴────┐ ┌────┴────┐ ┌────┴────┐
|
||||
▼ ▼ ▼ ▼ ▼ ▼
|
||||
ArtMuseum History Municipal State Public Academic
|
||||
```
|
||||
|
||||
**Combined Similarity Formula**:
|
||||
```typescript
|
||||
finalScore = 0.7 * embeddingSimilarity + 0.3 * (1 - topologicalDistance)
|
||||
```
|
||||
|
||||
**Benefit**: "Art museum" won't match "natural history museum" even with 95% embedding similarity.
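
The combined formula can be exercised with a small worked sketch. The weights come from the formula above; everything else is an assumption made for illustration: topological distance is taken to be the number of hops between two types via their nearest common ancestor, divided by a caller-supplied `maxHops` and capped at 1, and the parent map is a hand-written fragment of the hierarchy rather than data loaded from the schema.

```typescript
function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

// finalScore = 0.7 * embeddingSimilarity + 0.3 * (1 - topologicalDistance)
function combinedScore(embeddingSimilarity: number, topologicalDistance: number): number {
  return 0.7 * clamp01(embeddingSimilarity) + 0.3 * (1 - clamp01(topologicalDistance));
}

// Chain from a type up to the root, following child -> parent links (assumes no cycles).
function ancestorChain(type: string, parentOf: Record<string, string>): string[] {
  const chain = [type];
  let current = type;
  while (Object.prototype.hasOwnProperty.call(parentOf, current)) {
    current = parentOf[current];
    chain.push(current);
  }
  return chain;
}

// Hypothetical distance: hops via the nearest common ancestor, normalized to [0, 1].
function topologicalDistance(
  a: string,
  b: string,
  parentOf: Record<string, string>,
  maxHops: number,
): number {
  const chainA = ancestorChain(a, parentOf);
  const chainB = ancestorChain(b, parentOf);
  for (let i = 0; i < chainA.length; i++) {
    const j = chainB.indexOf(chainA[i]);
    if (j !== -1) {
      return Math.min(1, (i + j) / maxHops);
    }
  }
  return 1; // no common ancestor
}
```

Under these assumptions, with `maxHops = 4`, two sibling museum subtypes are at distance 0.5, so a 0.95 embedding similarity yields roughly 0.815 instead of the 0.965 an identical type would receive. Whether that blocks a cache hit depends on the acceptance threshold, which is not specified here.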
|
||||
|
||||
## Holarchic Cache Structure
|
||||
|
||||
Cache entries are **holons** - simultaneously complete AND parts of aggregates:
|
||||
|
||||
| Level | Example | Aggregates |
|
||||
|-------|---------|------------|
|
||||
| Micro | "Rijksmuseum details" | None |
|
||||
| Meso | "Museums in Amsterdam" | List of micro holons |
|
||||
| Macro | "Heritage in Noord-Holland" | All meso holons in region |
|
||||
|
||||
```typescript
|
||||
interface CachedQuery {
|
||||
// ... existing fields ...
|
||||
holonLevel?: 'micro' | 'meso' | 'macro';
|
||||
participatesIn?: string[]; // Higher-level cache keys
|
||||
aggregates?: string[]; // Lower-level entries
|
||||
}
|
||||
```
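
One way to keep the up and down links consistent is to always set them together. The sketch below is an assumption, not the cache's actual API: `HolonEntry` is a reduced stand-in for `CachedQuery` with a hypothetical `key` field, and `linkHolons` is an illustrative helper.

```typescript
// Reduced, hypothetical stand-in for CachedQuery; `key` is an assumed identifier field.
interface HolonEntry {
  key: string;
  holonLevel: "micro" | "meso" | "macro";
  participatesIn: string[]; // higher-level cache keys
  aggregates: string[];     // lower-level cache keys
}

// Records the parent/child relation in both directions, without duplicates.
function linkHolons(parent: HolonEntry, child: HolonEntry): void {
  if (parent.aggregates.indexOf(child.key) === -1) {
    parent.aggregates.push(child.key);
  }
  if (child.participatesIn.indexOf(parent.key) === -1) {
    child.participatesIn.push(parent.key);
  }
}
```

Linking a micro entry for a single museum to a meso entry for a city list then lets either entry be reached from the other.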
|
||||
|
||||
## Problem Statement
|
||||
|
||||
The ArchiefAssistent semantic cache prevents geographic false positives using entity extraction:
|
||||
|
||||
```
|
||||
Query: "Hoeveel musea in Amsterdam?"
|
||||
Cached: "Hoeveel musea in Noord-Holland?"
|
||||
Result: BLOCKED (location mismatch) ✅
|
||||
```
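
The blocking decision shown above can be sketched as a comparison of extracted entities. This is a minimal sketch assuming the entities have already been extracted from both queries; the `QueryEntities` shape and the `cacheMatchAllowed` helper are hypothetical, and the extraction step itself is not shown.

```typescript
// Assumed result of entity extraction for one query.
interface QueryEntities {
  institutionType?: string; // e.g. "M"
  location?: string;        // e.g. "Amsterdam"
}

function normalizeEntity(value: string | undefined): string {
  return (value || "").trim().toLowerCase();
}

// A cached answer may be reused only when type and location both agree.
function cacheMatchAllowed(query: QueryEntities, cached: QueryEntities): boolean {
  return (
    normalizeEntity(query.institutionType) === normalizeEntity(cached.institutionType) &&
    normalizeEntity(query.location) === normalizeEntity(cached.location)
  );
}
```

For the example above, a query about museums in Amsterdam is not served from an entry about museums in Noord-Holland, regardless of how similar the two embeddings are.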
|
||||
|
||||
However, the current implementation uses **hardcoded regex patterns**:
|
||||
|
||||
```typescript
|
||||
// DEPRECATED: Hardcoded patterns in semantic-cache.ts
|
||||
const INSTITUTION_PATTERNS: Record<InstitutionTypeCode, RegExp> = {
|
||||
M: /\b(muse(um|a|ums?)|musea)/i,
|
||||
A: /\b(archie[fv]en?|archives?|archief)/i,
|
||||
// ... 19 patterns to maintain manually
|
||||
};
|
||||
```
|
||||
|
||||
**Problems with hardcoded patterns**:
|
||||
1. **Maintenance burden** - Every new institution type requires code changes
|
||||
2. **Missing subtypes** - "kunstmuseum" vs "museum" should cache separately
|
||||
3. **No multilingual support** - Only Dutch/English, misses German/French labels
|
||||
4. **Duplication** - Same vocabulary exists in LinkML schemas
|
||||
5. **No record type awareness** - "burgerlijke stand" queries mixed with general archive queries
|
||||
|
||||
## Solution: Schema-Derived Vocabulary
|
||||
|
||||
The LinkML schema already contains rich vocabulary:
|
||||
|
||||
| Schema File | Content | Cache Utility |
|
||||
|-------------|---------|---------------|
|
||||
| `CustodianType.yaml` | 19 top-level types | Primary segmentation (M/A/L/G...) |
|
||||
| `MuseumType.yaml` | 187+ museum subtypes | Subtype segmentation |
|
||||
| `ArchiveOrganizationType.yaml` | 144+ archive subtypes | Subtype segmentation |
|
||||
| `*RecordSetTypes.yaml` | Record type taxonomies | Finding aids specificity |
|
||||
|
||||
### Vocabulary Sources in Schema
|
||||
|
||||
1. **`type_label`** - Multilingual labels via `skos:prefLabel`
|
||||
2. **`structured_aliases`** - Language-tagged alternative names
|
||||
3. **`keywords`** - Search terms for entity recognition
|
||||
4. **`wikidata_entity`** - Linked Data identifiers
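
The first three sources above can be merged into a single term lookup. The following is a sketch under stated assumptions, not the extraction script itself: it operates on classes already parsed from YAML, assumes `type_label` is a list of strings, and attaches `typeCode` and `typeName` to each class by hand. `wikidata_entity` is an identifier rather than a search term, so it is left out.

```typescript
// Assumed shape of an already-parsed schema class; the real layout may differ.
interface SchemaTypeClass {
  name: string;     // e.g. "ArtMuseum"
  typeCode: string; // e.g. "M"
  typeName: string; // e.g. "MuseumType"
  type_label?: string[];
  structured_aliases?: { literal_form: string; in_language: string }[];
  keywords?: string[];
}

interface TermLogEntry {
  typeCode: string;
  typeName: string;
}

// Lowercased, de-duplicated terms from labels, aliases and keywords.
function collectTerms(cls: SchemaTypeClass): string[] {
  const raw = (cls.type_label || [])
    .concat((cls.structured_aliases || []).map((alias) => alias.literal_form))
    .concat(cls.keywords || []);
  const terms: string[] = [];
  for (const term of raw) {
    const normalized = term.trim().toLowerCase();
    if (normalized !== "" && terms.indexOf(normalized) === -1) {
      terms.push(normalized);
    }
  }
  return terms;
}

function buildTermLog(classes: SchemaTypeClass[]): Record<string, TermLogEntry> {
  const termLog: Record<string, TermLogEntry> = {};
  for (const cls of classes) {
    for (const term of collectTerms(cls)) {
      // First class to claim a term wins; real collisions would need an explicit policy.
      if (!Object.prototype.hasOwnProperty.call(termLog, term)) {
        termLog[term] = { typeCode: cls.typeCode, typeName: cls.typeName };
      }
    }
  }
  return termLog;
}
```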
|
||||
|
||||
## Architecture
|
||||
|
||||
### Overview: Two-Tier Embedding Hierarchy
|
||||
|
||||
The system uses a **hierarchical embedding approach** for fast semantic routing:
|
||||
|
||||
1. **Tier 1: Types File Embeddings** - Which category? (Museum vs Archive vs Library)
|
||||
2. **Tier 2: Individual Type Embeddings** - Which specific type? (ArtMuseum vs NaturalHistoryMuseum)
|
||||
|
||||
```
┌─────────────────────────────────────────────────────────────────────────┐
│  BUILD TIME: Extract vocabulary + generate embeddings                   │
│                                                                         │
│  schemas/20251121/linkml/modules/classes/*Type.yaml                     │
│  schemas/20251121/linkml/modules/classes/*Types.yaml                    │
│                              ↓                                          │
│  scripts/extract-types-vocab.ts                                         │
│                              ↓                                          │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │  types-vocab.json                                                 │  │
│  │  ├── tier1Embeddings: { MuseumType: [...], ArchiveType: [...] }   │  │
│  │  ├── tier2Embeddings: { ArtMuseum: [...], MunicipalArchive: [...]}│  │
│  │  └── termLog: { "kunstmuseum": { type: "M", subtype: "ART_MUSEUM"}│  │
│  └───────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼ (loaded at runtime)
┌─────────────────────────────────────────────────────────────────────────┐
│  RUNTIME: Two-Tier Semantic Routing                                     │
│                                                                         │
│  Query: "Hoeveel gemeentearchieven in Amsterdam?"                       │
│                              ↓                                          │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  TIER 1: Types File Selection                                   │   │
│   │  Query embedding vs Tier1 embeddings (19 categories)            │   │
│   │  Result: ArchiveOrganizationType (similarity: 0.89)             │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                              ↓                                          │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │  TIER 2: Specific Type Selection                                │   │
│   │  Query embedding vs Tier2 embeddings (144 archive subtypes)     │   │
│   │  Result: MunicipalArchive (similarity: 0.94)                    │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                              ↓                                          │
│  Structured cache key: "count:a.municipal_archive:amsterdam"            │
└─────────────────────────────────────────────────────────────────────────┘
```
|
||||
|
||||
### Tier 1: Types File Embeddings
|
||||
|
||||
Each Types file (e.g., `MuseumType.yaml`, `ArchiveOrganizationType.yaml`) gets ONE embedding
|
||||
representing the **accumulated vocabulary** of all types within that file.
|
||||
|
||||
**Embedding Text Construction**:
|
||||
```
|
||||
MuseumType: museum musea kunstmuseum art museum natural history museum
|
||||
science museum open-air museum ecomuseum virtual museum
|
||||
heritage farm national museum regional museum university museum
|
||||
[... all keywords from all 187 subtypes ...]
|
||||
```
|
||||
|
||||
**Purpose**: Fast first-pass filter to identify which GLAMORCUBESFIXPHDNT category the query relates to.
|
||||
|
||||
| Types File | Code | Accumulated Terms Count |
|
||||
|------------|------|------------------------|
|
||||
| MuseumType | M | ~500+ terms from 187 subtypes |
|
||||
| ArchiveOrganizationType | A | ~400+ terms from 144 subtypes |
|
||||
| LibraryType | L | ~200+ terms from subtypes |
|
||||
| GalleryType | G | ~100+ terms from subtypes |
|
||||
| ... | ... | ... |
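
The accumulated Tier 1 text could be assembled roughly as follows. This is a minimal sketch, assuming the `keywords`/`subtypes` shape of the `institutionTypes` entries in `types-vocab.json`; `buildTier1Text`, `flattenKeywords`, and the two interfaces are hypothetical names and are not taken from `scripts/extract-types-vocab.ts`. Lowercasing every term is also an assumption, chosen to match the lowercased query at runtime.

```typescript
// Hypothetical shapes, modeled on the `institutionTypes` entries in types-vocab.json.
interface SubtypeVocab {
  keywords: Record<string, string[]>; // language code -> terms
}

interface TypeVocab {
  className: string;
  keywords: Record<string, string[]>;
  subtypes: Record<string, SubtypeVocab>;
}

// Flatten a per-language keyword map into a single list of terms.
function flattenKeywords(keywords: Record<string, string[]>): string[] {
  const terms: string[] = [];
  for (const lang of Object.keys(keywords)) {
    for (const term of keywords[lang]) {
      terms.push(term);
    }
  }
  return terms;
}

// Build the Tier 1 embedding text: the type's own keywords followed by the
// keywords of every subtype, lowercased and de-duplicated in first-seen order.
function buildTier1Text(type: TypeVocab): string {
  const seen: Record<string, boolean> = Object.create(null);
  const ordered: string[] = [];
  const add = (term: string): void => {
    const key = term.toLowerCase();
    if (!seen[key]) {
      seen[key] = true;
      ordered.push(key);
    }
  };
  flattenKeywords(type.keywords).forEach(add);
  for (const name of Object.keys(type.subtypes)) {
    flattenKeywords(type.subtypes[name].keywords).forEach(add);
  }
  return type.className + ": " + ordered.join(" ");
}
```

De-duplicating keeps shared terms such as "museum" from being repeated once per language, which would otherwise skew the single category embedding towards its most generic word.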
|
||||
|
||||
### Tier 2: Individual Type Embeddings
|
||||
|
||||
Each **specific type** within a Types file gets its own embedding from its accumulated terms.
|
||||
|
||||
**Embedding Text Construction**:
|
||||
```
|
||||
MunicipalArchive: gemeentearchief stadsarchief city archive municipal archive
|
||||
town archive local government records burgerlijke stand
|
||||
bevolkingsregister council minutes building permits
|
||||
[... all keywords + structured_aliases + labels ...]
|
||||
```
|
||||
|
||||
**Purpose**: Precise subtype identification after Tier 1 narrows the category.
|
||||
|
||||
### Term Log Structure
|
||||
|
||||
A lookup table mapping every extracted term to its type/subtype:
|
||||
|
||||
```json
|
||||
{
|
||||
"termLog": {
|
||||
"kunstmuseum": {
|
||||
"typeCode": "M",
|
||||
"typeName": "MuseumType",
|
||||
"subtypeName": "ART_MUSEUM",
|
||||
"wikidata": "Q207694",
|
||||
"language": "nl"
|
||||
},
|
||||
"art museum": {
|
||||
"typeCode": "M",
|
||||
"typeName": "MuseumType",
|
||||
"subtypeName": "ART_MUSEUM",
|
||||
"wikidata": "Q207694",
|
||||
"language": "en"
|
||||
},
|
||||
"gemeentearchief": {
|
||||
"typeCode": "A",
|
||||
"typeName": "ArchiveOrganizationType",
|
||||
"subtypeName": "MUNICIPAL_ARCHIVE",
|
||||
"wikidata": "Q8362876",
|
||||
"language": "nl"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Purpose**:
|
||||
1. Fast keyword lookup: each exact term is an O(1) key lookup (no embedding needed for exact matches)
|
||||
2. Audit trail of which terms map to which types
|
||||
3. Debugging which queries match which types
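
One way to get constant-time lookups out of `termLog` is to probe it with word n-grams of the query instead of scanning every term. The sketch below is illustrative only: `lookupTerm`, `TermLogEntry`, and the `maxWords` limit are hypothetical names, and the punctuation handling is deliberately simple.

```typescript
// Hypothetical entry shape, following the termLog example above.
interface TermLogEntry {
  typeCode: string;
  subtypeName?: string;
  wikidata?: string;
}

// Probe termLog with word n-grams of the query, longest n-gram first, so that
// "art museum" is preferred over a shorter term such as "museum".
function lookupTerm(
  query: string,
  termLog: Record<string, TermLogEntry>,
  maxWords: number = 3
): TermLogEntry | null {
  const words = query
    .toLowerCase()
    .replace(/[?!.,;:"'()]/g, " ")
    .split(/\s+/)
    .filter((w) => w.length > 0);

  for (let n = Math.min(maxWords, words.length); n >= 1; n--) {
    for (let i = 0; i + n <= words.length; i++) {
      const candidate = words.slice(i, i + n).join(" ");
      // Each probe is a single keyed lookup; no scan over all terms.
      if (Object.prototype.hasOwnProperty.call(termLog, candidate)) {
        return termLog[candidate];
      }
    }
  }
  return null; // No exact term found; the caller can fall back to embeddings.
}
```

With this approach an inflected form such as "gemeentearchieven" is not found unless the plural is itself in `termLog`, so such queries fall through to the embedding path.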
|
||||
|
||||
### Runtime Lookup Strategy
|
||||
|
||||
```typescript
async function extractEntitiesWithEmbeddings(query: string): Promise<ExtractedEntities> {
  const vocab = await loadTypesVocabulary();
  const normalized = query.toLowerCase();

  // FAST PATH: Check termLog for exact keyword matches.
  // Longest terms are tried first so that "kunstmuseum" wins over "museum"
  // when both are present in termLog.
  const terms = Object.entries(vocab.termLog).sort((a, b) => b[0].length - a[0].length);
  for (const [term, mapping] of terms) {
    if (normalized.includes(term)) {
      return {
        institutionType: mapping.typeCode,
        institutionSubtype: mapping.subtypeName,
        subtypeWikidata: mapping.wikidata,
        // ... location and intent extraction
      };
    }
  }

  // SLOW PATH: Embedding-based semantic matching
  const queryEmbedding = await generateEmbedding(query);

  // Tier 1: Find best matching Types file (minimum similarity 0.7)
  let bestType: string | null = null;
  let bestTypeSimilarity = 0;
  for (const [typeName, typeEmbedding] of Object.entries(vocab.tier1Embeddings)) {
    const similarity = cosineSimilarity(queryEmbedding, typeEmbedding);
    if (similarity > bestTypeSimilarity && similarity > 0.7) {
      bestTypeSimilarity = similarity;
      bestType = typeName;
    }
  }

  if (!bestType) return {}; // No type matched

  // Tier 2: Find best matching subtype within the Types file.
  // tier1Embeddings is keyed by class name ("MuseumType"), while institutionTypes
  // and tier2Embeddings are keyed by type code ("M"), so resolve the code via className.
  const typeEntry = Object.values(vocab.institutionTypes).find((t) => t.className === bestType);
  if (!typeEntry) return {}; // Class name not present in institutionTypes
  const typeCode = typeEntry.code;
  let bestSubtype: string | null = null;
  let bestSubtypeSimilarity = 0;

  for (const [subtypeName, subtypeEmbedding] of Object.entries(vocab.tier2Embeddings[typeCode] || {})) {
    const similarity = cosineSimilarity(queryEmbedding, subtypeEmbedding);
    if (similarity > bestSubtypeSimilarity && similarity > 0.75) {
      bestSubtypeSimilarity = similarity;
      bestSubtype = subtypeName;
    }
  }

  return {
    institutionType: typeCode,
    institutionSubtype: bestSubtype, // null when no subtype clears the 0.75 threshold
    // ... location and intent extraction
  };
}
```
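
The lookup function above calls `cosineSimilarity` without showing it. A minimal sketch follows, assuming embeddings are plain `number[]` arrays as stored in `types-vocab.json`. The dimension check and the zero-vector handling are choices made for this sketch, not documented behavior.

```typescript
// Cosine similarity of two equal-length vectors.
// Returns 0 when either vector has zero norm, to avoid dividing by zero.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Throwing on a dimension mismatch surfaces the case where the query is embedded with a different model than the stored vectors (for example 1024 dimensions against 384), which would otherwise produce meaningless scores.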
|
||||
|
||||
### Embedding Model Choice
|
||||
|
||||
For build-time embedding generation, use the same model as the semantic cache:
|
||||
|
||||
| Option | Model | Dimensions | Quality |
|
||||
|--------|-------|------------|---------|
|
||||
| **Primary** | `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2` | 384 | Good multilingual |
|
||||
| Fallback | `all-MiniLM-L6-v2` | 384 | English-focused |
|
||||
| High Quality | `multilingual-e5-large` | 1024 | Best multilingual |
|
||||
|
||||
**Build-time generation**: Type and subtype embeddings are generated ONCE at build time and stored in JSON.
This avoids re-embedding the vocabulary at runtime; only the query itself is embedded, and only when the termLog fast path finds no match.
|
||||
|
||||
## TypesVocabulary JSON Structure
|
||||
|
||||
Generated at build time with **pre-computed embeddings**:
|
||||
|
||||
```json
|
||||
{
|
||||
"version": "2026-01-10T12:00:00Z",
|
||||
"schemaVersion": "20251121",
|
||||
"embeddingModel": "paraphrase-multilingual-MiniLM-L12-v2",
|
||||
"embeddingDimensions": 384,
|
||||
|
||||
"tier1Embeddings": {
|
||||
"MuseumType": [0.023, -0.045, 0.087, ...],
|
||||
"ArchiveOrganizationType": [0.012, 0.056, -0.034, ...],
|
||||
"LibraryType": [-0.034, 0.089, 0.012, ...],
|
||||
"GalleryType": [0.045, -0.023, 0.067, ...]
|
||||
},
|
||||
|
||||
"tier2Embeddings": {
|
||||
"M": {
|
||||
"ART_MUSEUM": [0.034, -0.056, 0.078, ...],
|
||||
"NATURAL_HISTORY_MUSEUM": [0.045, 0.023, -0.089, ...],
|
||||
"SCIENCE_MUSEUM": [0.067, -0.012, 0.045, ...]
|
||||
},
|
||||
"A": {
|
||||
"MUNICIPAL_ARCHIVE": [0.089, 0.034, -0.056, ...],
|
||||
"NATIONAL_ARCHIVE": [0.012, -0.078, 0.045, ...],
|
||||
"CHURCH_ARCHIVE": [-0.023, 0.067, 0.034, ...]
|
||||
}
|
||||
},
|
||||
|
||||
"termLog": {
|
||||
"kunstmuseum": {"typeCode": "M", "subtypeName": "ART_MUSEUM", "wikidata": "Q207694", "lang": "nl"},
|
||||
"art museum": {"typeCode": "M", "subtypeName": "ART_MUSEUM", "wikidata": "Q207694", "lang": "en"},
|
||||
"gemeentearchief": {"typeCode": "A", "subtypeName": "MUNICIPAL_ARCHIVE", "wikidata": "Q8362876", "lang": "nl"},
|
||||
"stadsarchief": {"typeCode": "A", "subtypeName": "MUNICIPAL_ARCHIVE", "wikidata": "Q8362876", "lang": "nl"},
|
||||
"city archive": {"typeCode": "A", "subtypeName": "MUNICIPAL_ARCHIVE", "wikidata": "Q8362876", "lang": "en"},
|
||||
"burgerlijke stand": {"typeCode": "A", "recordSetType": "CIVIL_REGISTRY", "lang": "nl"},
|
||||
"geboorteakte": {"typeCode": "A", "recordSetType": "CIVIL_REGISTRY", "lang": "nl"}
|
||||
},
|
||||
|
||||
"institutionTypes": {
|
||||
"M": {
|
||||
"code": "M",
|
||||
"className": "MuseumType",
|
||||
"baseWikidata": "Q33506",
|
||||
"accumulatedTerms": "museum musea kunstmuseum art museum natural history museum science museum open-air museum ecomuseum virtual museum heritage farm national museum regional museum university museum...",
|
||||
"keywords": {
|
||||
"nl": ["museum", "musea"],
|
||||
"en": ["museum", "museums"],
|
||||
"de": ["Museum", "Museen"]
|
||||
},
|
||||
"subtypes": {
|
||||
"ART_MUSEUM": {
|
||||
"className": "ArtMuseum",
|
||||
"wikidata": "Q207694",
|
||||
"accumulatedTerms": "kunstmuseum art museum kunstmusea art museums fine art museum visual arts museum painting gallery sculpture museum",
|
||||
"keywords": {
|
||||
"nl": ["kunstmuseum", "kunstmusea"],
|
||||
"en": ["art museum", "art museums"]
|
||||
}
|
||||
},
|
||||
"NATURAL_HISTORY_MUSEUM": {
|
||||
"className": "NaturalHistoryMuseum",
|
||||
"wikidata": "Q559049",
|
||||
"accumulatedTerms": "natuurhistorisch museum natuurmuseum natural history museum science museum fossils taxidermy specimens geology biology",
|
||||
"keywords": {
|
||||
"nl": ["natuurhistorisch museum", "natuurmuseum"],
|
||||
"en": ["natural history museum"]
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"A": {
|
||||
"code": "A",
|
||||
"className": "ArchiveOrganizationType",
|
||||
"baseWikidata": "Q166118",
|
||||
"accumulatedTerms": "archief archieven archive archives gemeentearchief stadsarchief nationaal archief rijksarchief church archive company archive film archive...",
|
||||
"keywords": {
|
||||
"nl": ["archief", "archieven"],
|
||||
"en": ["archive", "archives"]
|
||||
},
|
||||
"subtypes": {
|
||||
"MUNICIPAL_ARCHIVE": {
|
||||
"className": "MunicipalArchive",
|
||||
"wikidata": "Q8362876",
|
||||
"accumulatedTerms": "gemeentearchief stadsarchief municipal archive city archive town archive local government records civil registry population register building permits council minutes",
|
||||
"keywords": {
|
||||
"nl": ["gemeentearchief", "stadsarchief", "gemeentelijke archiefdienst"],
|
||||
"en": ["municipal archive", "city archive", "town archive"]
|
||||
}
|
||||
},
|
||||
"NATIONAL_ARCHIVE": {
|
||||
"className": "NationalArchive",
|
||||
"wikidata": "Q1188452",
|
||||
"accumulatedTerms": "nationaal archief rijksarchief national archive state archive government records national records federal archive",
|
||||
"keywords": {
|
||||
"nl": ["nationaal archief", "rijksarchief"],
|
||||
"en": ["national archive", "state archive"]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
|
||||
"recordSetTypes": {
|
||||
"CIVIL_REGISTRY": {
|
||||
"className": "CivilRegistrySeries",
|
||||
"accumulatedTerms": "burgerlijke stand geboorteakte huwelijksakte overlijdensakte bevolkingsregister civil registry birth records marriage records death records population register vital records genealogy",
|
||||
"keywords": {
|
||||
"nl": ["burgerlijke stand", "geboorteakte", "huwelijksakte", "overlijdensakte", "bevolkingsregister"],
|
||||
"en": ["civil registry", "birth records", "marriage records", "death records"]
|
||||
}
|
||||
},
|
||||
"COUNCIL_GOVERNANCE": {
|
||||
"className": "CouncilGovernanceFonds",
|
||||
"accumulatedTerms": "gemeenteraad raadsnotulen raadsbesluit verordening council minutes ordinances resolutions bylaws municipal council town council city council",
|
||||
"keywords": {
|
||||
"nl": ["gemeenteraad", "raadsnotulen", "raadsbesluit", "verordening"],
|
||||
"en": ["council minutes", "ordinances", "resolutions"]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Key Additions for Embedding Support
|
||||
|
||||
| Field | Purpose |
|
||||
|-------|---------|
|
||||
| `tier1Embeddings` | Pre-computed embeddings for each Types file (19 categories) |
|
||||
| `tier2Embeddings` | Pre-computed embeddings for each subtype (500+ types) |
|
||||
| `termLog` | Fast O(1) lookup table for exact keyword matches |
|
||||
| `accumulatedTerms` | Raw text used to generate embeddings (for debugging/regeneration) |
|
||||
| `embeddingModel` | Model used to generate embeddings (for reproducibility) |
|
||||
|
||||
## Enhanced ExtractedEntities Interface
|
||||
|
||||
```typescript
|
||||
export interface ExtractedEntities {
|
||||
// Existing fields
|
||||
institutionType?: InstitutionTypeCode | null;
|
||||
location?: string | null;
|
||||
locationType?: 'city' | 'province' | null;
|
||||
intent?: 'count' | 'list' | 'info' | null;
|
||||
|
||||
// NEW: Ontology-derived fields
|
||||
institutionSubtype?: string | null; // e.g., 'MUNICIPAL_ARCHIVE', 'ART_MUSEUM'
|
||||
recordSetType?: string | null; // e.g., 'CIVIL_REGISTRY', 'COUNCIL_GOVERNANCE'
|
||||
subtypeWikidata?: string | null; // e.g., 'Q8362876' for LOD integration
|
||||
}
|
||||
```
|
||||
|
||||
## Enhanced Cache Key Format
|
||||
|
||||
```
|
||||
{intent}:{institutionType}[.{subtype}][:{recordSetType}]:{location}
|
||||
|
||||
Examples:
|
||||
- "count:m:amsterdam" # Basic museum count
|
||||
- "count:m.art_museum:amsterdam" # Art museum count (subtype)
|
||||
- "list:a.municipal_archive:nh" # Municipal archives in Noord-Holland
|
||||
- "query:a:civil_registry:utrecht" # Civil registry in Utrecht
|
||||
- "info:a.national_archive::nl" # National archive info (no location filter)
|
||||
```
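
The key format above can be sketched as a small builder. `buildCacheKey` and `CacheKeyParts` are hypothetical names; the field names follow the `ExtractedEntities` interface. The sketch omits the record-set segment when it is absent, so it reproduces the first four examples but not the empty `::` segment of the last one.

```typescript
// Hypothetical input shape; field names follow the ExtractedEntities interface.
interface CacheKeyParts {
  intent: string;                     // e.g. "count", "list", "info"
  institutionType: string;            // single-letter type code, e.g. "M"
  institutionSubtype?: string | null; // e.g. "ART_MUSEUM"
  recordSetType?: string | null;      // e.g. "CIVIL_REGISTRY"
  location: string;
}

// Build "{intent}:{institutionType}[.{subtype}][:{recordSetType}]:{location}",
// lowercasing every segment as in the examples.
function buildCacheKey(parts: CacheKeyParts): string {
  let typeSegment = parts.institutionType.toLowerCase();
  if (parts.institutionSubtype) {
    typeSegment += "." + parts.institutionSubtype.toLowerCase();
  }
  const segments = [parts.intent.toLowerCase(), typeSegment];
  if (parts.recordSetType) {
    segments.push(parts.recordSetType.toLowerCase());
  }
  segments.push(parts.location.toLowerCase());
  return segments.join(":");
}
```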
|
||||
|
||||
## Implementation Files
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `scripts/extract-types-vocab.ts` | Build-time vocabulary extraction from LinkML |
|
||||
| `apps/archief-assistent/public/types-vocab.json` | Generated vocabulary file |
|
||||
| `apps/archief-assistent/src/lib/types-vocabulary.ts` | Runtime vocabulary loader |
|
||||
| `apps/archief-assistent/src/lib/semantic-cache.ts` | Updated entity extraction |
|
||||
|
||||
## Build Integration
|
||||
|
||||
Add to `apps/archief-assistent/package.json`:
|
||||
|
||||
```json
|
||||
{
|
||||
"scripts": {
|
||||
"prebuild": "tsx ../../scripts/extract-types-vocab.ts",
|
||||
"build": "vite build"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Keyword Extraction Priority
|
||||
|
||||
When extracting keywords from schema files:
|
||||
|
||||
1. **`keywords`** array (highest priority) - Explicit search terms
|
||||
2. **`structured_aliases.literal_form`** - Multilingual alternative names
|
||||
3. **`type_label`** - Preferred labels per language
|
||||
4. **Class name conversion** - `MunicipalArchive` → "municipal archive"
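
The class name conversion in step 4 can be sketched with two regular expressions. `classNameToKeyword` is a hypothetical helper name, and the handling of acronym prefixes is an assumption of this sketch.

```typescript
// Convert a PascalCase class name into a lowercase, space-separated keyword.
function classNameToKeyword(className: string): string {
  return className
    // Split at a lowercase letter or digit followed by an uppercase letter.
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    // Split an acronym from the capitalized word that follows it.
    .replace(/([A-Z]+)([A-Z][a-z])/g, "$1 $2")
    .toLowerCase();
}
```

Because this is the lowest-priority source, it is only a fallback: it yields English-looking terms and cannot supply the multilingual names that `keywords` and `structured_aliases` provide.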
|
||||
|
||||
## Cache Segmentation Rules
|
||||
|
||||
### Rule 1: Subtype Specificity
|
||||
|
||||
Queries with **specific subtypes** should NOT match **generic type** cache entries:
|
||||
|
||||
```
|
||||
Query: "kunstmusea in Amsterdam" → key: "count:m.art_museum:amsterdam"
|
||||
Cached: "musea in Amsterdam" → key: "count:m:amsterdam"
|
||||
Result: MISS (subtype mismatch) ✅
|
||||
```
|
||||
|
||||
### Rule 2: Record Set Type Isolation
|
||||
|
||||
Queries about **specific record types** should cache separately:
|
||||
|
||||
```
|
||||
Query: "burgerlijke stand Utrecht" → key: "query:a:civil_registry:utrecht"
|
||||
Cached: "archieven in Utrecht" → key: "list:a:utrecht"
|
||||
Result: MISS (record set type mismatch) ✅
|
||||
```
|
||||
|
||||
### Rule 3: Subtype-to-Type Fallback
|
||||
|
||||
Generic queries must NOT match subtype cache entries (a subtype result covers only a subset of the generic answer):
|
||||
|
||||
```
|
||||
Query: "musea in Amsterdam" → key: "count:m:amsterdam"
|
||||
Cached: "kunstmusea in Amsterdam" → key: "count:m.art_museum:amsterdam"
|
||||
Result: MISS (don't return subset for superset query)
|
||||
```
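
All three example results above are MISS, which is what a segment-by-segment comparison of structured keys produces. The sketch below parses two keys and reports the first segment that differs; `parseCacheKey` and `cacheMismatch` are hypothetical names, and the parser assumes keys with three or four colon-separated segments.

```typescript
interface ParsedCacheKey {
  intent: string;
  institutionType: string;
  subtype: string | null;
  recordSetType: string | null;
  location: string;
}

// Parse "{intent}:{institutionType}[.{subtype}][:{recordSetType}]:{location}".
function parseCacheKey(key: string): ParsedCacheKey {
  const parts = key.split(":");
  if (parts.length !== 3 && parts.length !== 4) {
    throw new Error("Unexpected cache key: " + key);
  }
  const typeParts = parts[1].split(".");
  return {
    intent: parts[0],
    institutionType: typeParts[0],
    subtype: typeParts.length > 1 ? typeParts[1] : null,
    recordSetType: parts.length === 4 ? parts[2] : null,
    location: parts[parts.length - 1],
  };
}

// Returns null on a cache hit, or the name of the first segment that differs.
function cacheMismatch(queryKey: string, cachedKey: string): string | null {
  const q = parseCacheKey(queryKey);
  const c = parseCacheKey(cachedKey);
  if (q.intent !== c.intent) return "intent";
  if (q.institutionType !== c.institutionType) return "institutionType";
  if (q.subtype !== c.subtype) return "subtype";
  if (q.recordSetType !== c.recordSetType) return "recordSetType";
  if (q.location !== c.location) return "location";
  return null;
}
```

Reporting the differing segment rather than a bare boolean makes cache misses easier to audit in logs and E2E tests.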
|
||||
|
||||
## Migration Notes
|
||||
|
||||
1. **Backwards Compatible**: Existing cache entries without `institutionSubtype` continue to work
|
||||
2. **Gradual Rollout**: New cache entries get subtype, old entries remain valid
|
||||
3. **Cache Clear**: Consider clearing cache after deployment to ensure consistency
|
||||
|
||||
## Validation
|
||||
|
||||
Run E2E tests to verify:
|
||||
|
||||
```bash
|
||||
cd apps/archief-assistent
|
||||
npm run test:e2e
|
||||
```
|
||||
|
||||
Key test cases:
|
||||
- Geographic isolation (Amsterdam ≠ Rotterdam ≠ Noord-Holland)
|
||||
- Subtype isolation (kunstmuseum ≠ museum)
|
||||
- Record set isolation (burgerlijke stand ≠ archive)
|
||||
- Intent isolation (count ≠ list ≠ info)
|
||||
|
||||
## References
|
||||
|
||||
- **Rule 41**: Types classes define SPARQL template variables
|
||||
- **Rule 0b**: Type/Types file naming convention
|
||||
- **CustodianType.yaml**: Base taxonomy definition
|
||||
- **AGENTS.md**: GLAMORCUBESFIXPHDNT taxonomy documentation
|
||||
|
||||
---
|
||||
|
||||
**Created**: 2026-01-10
|
||||
**Author**: OpenCode Agent
|
||||
**Status**: Implemented (v2.0)
|
||||
|
||||
|
||||
|
|
@ -1,65 +0,0 @@
|
|||
# Rule: Engineering Parsimony and Domain Modeling
|
||||
|
||||
## Critical Convention
|
||||
|
||||
Our ontology follows an engineering-oriented approach: practical domain utility and
|
||||
stable interoperability take priority over minimal, tool-specific class catalogs.
|
||||
|
||||
## Rule
|
||||
|
||||
1. Model domain concepts, not implementation tools.
   - Reject classes like `ExaSearchMetadata`, `OpenAIFetchResult`, `ElasticsearchHit`.

2. Prefer generic, reusable activity/entity classes for operational provenance.
   - Use classes such as `ExternalSearchMetadata`, `RetrievalActivity`, `SearchResult`.

3. Capture tool/vendor details in slot values, not class names.
   - Record with generic predicates like `has_tool`, `has_method`, `has_agent`, `has_note`.

4. Digital platforms acting as custodians are valid domain classes.
   - Platform-as-custodian classes (for example YouTube-related custodian classes) are allowed.
   - Data processing/search tools are not ontology class candidates.

5. Avoid ontology growth driven by transient engineering stack choices.
   - New class proposals must be justified by cross-tool, domain-stable semantics.
|
||||
|
||||
## Rationale
|
||||
|
||||
- Tool names are volatile implementation details and age quickly.
|
||||
- Domain-level abstractions maximize reuse, query consistency, and mapping stability.
|
||||
- This aligns with an engineering ontology practice where strict theoretical
|
||||
parsimony in candidate theories is not the only optimization criterion; practical
|
||||
semantic interoperability and maintainability are primary.
|
||||
|
||||
## Examples
|
||||
|
||||
### Wrong
|
||||
|
||||
```yaml
classes:
  ExaSearchMetadata:
    class_uri: prov:Activity
```
|
||||
|
||||
### Correct
|
||||
|
||||
```yaml
classes:
  ExternalSearchMetadata:
    class_uri: prov:Activity
    slots:
      - has_tool
      - has_method
      - has_agent
```
|
||||
|
||||
## References
|
||||
|
||||
1. Liefke, K. (2024). *Natural Language Ontology and Semantic Theory*.
|
||||
Cambridge Elements in Semantics. DOI: `10.1017/9781009307789`.
|
||||
URL: https://www.cambridge.org/core/elements/abs/natural-language-ontology-and-semantic-theory/E8DDE548BB8A98137721984E26FAD764
|
||||
|
||||
2. Liefke, K. (2025). *Reduction and Unification in Natural Language Ontology*.
|
||||
Cambridge Elements in Semantics. DOI: `10.1017/9781009559683`.
|
||||
URL: https://www.cambridge.org/core/elements/abs/reduction-and-unification-in-natural-language-ontology/40F58ABA0D9C08958B5926F0CBDAD3CA
|
||||
|
||||
|
|
@ -18,7 +18,7 @@
|
|||
|
||||
## 🚫 AUTOMATED ENRICHMENT IS PROHIBITED 🚫
|
||||
|
||||
**DO NOT USE** automated scripts to enrich person profiles with web search data.
|
||||
**DO NOT USE** automated scripts to enrich person profiles with web search data. The `enrich_person_comprehensive.py` script has been deprecated.
|
||||
|
||||
**Why automated enrichment failed**:
|
||||
- Web searches return data about DIFFERENT people with similar names
|
||||
|
|
@ -184,12 +184,95 @@ Domains: geni.com, ancestry.*, familysearch.org, findagrave.com, myheritage.*
|
|||
→ Exception: If source explicitly links to living person with verifiable connection
|
||||
```
|
||||
|
||||
## Claim Rejection Patterns
|
||||
|
||||
The following inconsistent patterns should trigger automatic claim rejection:
|
||||
## Implementation in Enrichment Scripts
|
||||
|
||||
```python
|
||||
# Genealogy sources conflict - ALWAYS REJECT
|
||||
def validate_entity_match(profile: dict, search_result: dict) -> tuple[bool, str]:
    """
    Validate that a search result refers to the same person as the profile.

    REQUIRES: At least 3 of 5 identity attributes must match.
    Name match alone is INSUFFICIENT and automatically rejected.

    Returns (is_valid, reason)
    """
    # `or [{}]` also covers an empty affiliations list, which would otherwise raise IndexError
    affiliations = profile.get('affiliations') or [{}]
    profile_employer = affiliations[0].get('custodian_name', '').lower()
    profile_location = profile.get('profile_data', {}).get('location', '').lower()
    profile_role = profile.get('profile_data', {}).get('headline', '').lower()

    source_text = search_result.get('answer', '').lower()
    source_url = search_result.get('source_url', '').lower()

    # AUTOMATIC REJECTION: Genealogy sources
    genealogy_domains = ['geni.com', 'ancestry.', 'familysearch.', 'findagrave.', 'myheritage.']
    if any(domain in source_url for domain in genealogy_domains):
        return False, "genealogy_source_rejected"

    # AUTOMATIC REJECTION: Profession conflicts
    heritage_roles = ['curator', 'archivist', 'librarian', 'conservator', 'registrar', 'collection', 'heritage']
    entertainment_roles = ['actress', 'actor', 'singer', 'footballer', 'politician', 'model', 'athlete']

    profile_is_heritage = any(role in profile_role for role in heritage_roles)
    source_is_entertainment = any(role in source_text for role in entertainment_roles)

    if profile_is_heritage and source_is_entertainment:
        return False, "conflicting_profession"

    # AUTOMATIC REJECTION: Location conflicts
    if profile_location:
        location_conflicts = [
            ('venezuela', 'uk'), ('mexico', 'netherlands'), ('brazil', 'france'),
            ('caracas', 'london'), ('mexico city', 'amsterdam')
        ]
        for source_loc, profile_loc in location_conflicts:
            if source_loc in source_text and profile_loc in profile_location:
                return False, "conflicting_location"

    # Count positive identity attribute matches (need 3 of 5)
    matches = 0
    match_details = []

    # 1. Employer match
    if profile_employer and profile_employer in source_text:
        matches += 1
        match_details.append(f"employer:{profile_employer}")

    # 2. Location match
    if profile_location and profile_location in source_text:
        matches += 1
        match_details.append(f"location:{profile_location}")

    # 3. Role/profession match (any headline word longer than 4 characters)
    if profile_role:
        role_words = [w for w in profile_role.split() if len(w) > 4]
        if any(word in source_text for word in role_words):
            matches += 1
            match_details.append("role_match")

    # 4. Education/institution match (if available)
    profile_education = profile.get('profile_data', {}).get('education', [])
    if profile_education:
        edu_names = [e.get('school', '').lower() for e in profile_education if e.get('school')]
        if any(edu in source_text for edu in edu_names):
            matches += 1
            match_details.append("education_match")

    # 5. Time period match (career dates)
    # (implementation depends on available data; until it is implemented,
    # at most 4 of the 5 attributes can match)

    # REQUIRE 3 OF 5 MATCHES
    if matches < 3:
        return False, f"insufficient_identity_verification (only {matches}/5 attributes matched)"

    return True, f"verified ({matches}/5 matches: {', '.join(match_details)})"
|
||||
```
|
||||
|
||||
## Claim Rejection Patterns
|
||||
|
||||
The following patterns should trigger automatic claim rejection:
|
||||
|
||||
```python
|
||||
# Genealogy sources - ALWAYS REJECT
|
||||
GENEALOGY_DOMAINS = [
|
||||
'geni.com', 'ancestry.com', 'ancestry.co.uk', 'familysearch.org',
|
||||
'findagrave.com', 'myheritage.com', 'wikitree.com', 'geneanet.org'
|
||||
|
|
@ -210,7 +293,7 @@ LOCATION_PAIRS = [
|
|||
('caracas', 'london'), ('caracas', 'amsterdam'),
|
||||
]
|
||||
|
||||
# Age impossibility - if birth year makes current career implausible, REJECT. For instance, for a Junior role:
|
||||
# Age impossibility - if birth year makes current career implausible, REJECT
|
||||
MIN_PLAUSIBLE_BIRTH_YEAR = 1945 # Would be 80 in 2025 - still plausible but verify
|
||||
MAX_PLAUSIBLE_BIRTH_YEAR = 2002 # Would be 23 in 2025 - plausible for junior roles
|
||||
```
|
||||
|
|
|
|||
|
|
@ -1,248 +0,0 @@
|
|||
# Rule 47: Disambiguation Entity Profiles - Prevent Repeated Entity Resolution Errors
|
||||
|
||||
## Status: CRITICAL
|
||||
|
||||
## Summary
|
||||
|
||||
When entity resolution determines that a web source describes a **different person** with a similar name, **create a PPID profile for that person** in `data/person/`. The PPID system is universal - ANY person who ever lived can have a profile, regardless of heritage relevance.
|
||||
|
||||
---
|
||||
|
||||
## The Universal PPID Principle
|
||||
|
||||
**In principle, all persons on Earth should be assigned PPIDs** - whether or not they are active in the heritage field. This includes:
|
||||
|
||||
- Heritage workers (curators, archivists, librarians, etc.)
|
||||
- Non-heritage professionals (actors, doctors, athletes, etc.)
|
||||
- Historical persons (deceased individuals from any era)
|
||||
- Public figures and private individuals
|
||||
|
||||
The `heritage_relevance` field indicates whether someone works in the heritage sector, but does NOT determine whether they can have a profile. **Anyone can have a PPID.**
|
||||
|
||||
---
|
||||
|
||||
## The Problem
|
||||
|
||||
During entity resolution, we often discover that web search results describe a **different person** with a similar name:
|
||||
|
||||
| Heritage Profile | Namesake Discovered | Why Different |
|
||||
|------------------|---------------------|---------------|
|
||||
| Carmen Juliá (UK curator) | Carmen Julia Álvarez (Venezuelan actress) | Different profession, location, timeline |
|
||||
| Jan de Vries (Rijksmuseum curator) | Jan de Vries (footballer) | Different profession |
|
||||
| Robert Ritter (heritage worker) | Robert Ritter (Nazi doctor, 1901-1951) | Different era, profession |
|
||||
|
||||
Without creating a profile for the namesake, future enrichment attempts may:
|
||||
1. Re-discover the same namesake
|
||||
2. Waste time re-investigating
|
||||
3. Risk attributing false claims again
|
||||
|
||||
---
|
||||
|
||||
## The Solution: Create PPID Profiles for Namesakes
|
||||
|
||||
When entity resolution proves two entities are different, **create a regular PPID profile for the namesake**:
|
||||
|
||||
1. Use standard PPID naming convention (no special prefix)
|
||||
2. Set `heritage_relevance.is_heritage_relevant: false`
|
||||
3. Document the disambiguation in BOTH profiles
|
||||
|
||||
---
|
||||
|
||||
## Example: Venezuelan Actress Profile
|
||||
|
||||
```json
|
||||
{
|
||||
"ppid": "ID_VE-XX-CCS_1952_VE-XX-CCS_XXXX_CARMEN-JULIA-ALVAREZ",
|
||||
"profile_data": {
|
||||
"full_name": "Carmen Julia Álvarez",
|
||||
"profession": "actress",
|
||||
"nationality": "Venezuelan",
|
||||
"birth_year": 1952,
|
||||
"birth_location": "Caracas, Venezuela",
|
||||
"active_period": "1970s-2000s"
|
||||
},
|
||||
"heritage_relevance": {
|
||||
"is_heritage_relevant": false,
|
||||
"relevance_score": 0.0,
|
||||
"reason": "Entertainment industry professional - actress in film and television"
|
||||
},
|
||||
"disambiguation_notes": {
|
||||
"commonly_confused_with": [
|
||||
{
|
||||
"ppid": "ID_UK-XX-XXX_XXXX_UK-XX-XXX_XXXX_CARMEN-JULIA",
|
||||
"name": "Carmen Juliá",
|
||||
"profession": "curator",
|
||||
"employer": "New Contemporaries",
|
||||
"location": "UK",
|
||||
"why_different": "Different profession (actress vs curator), different location (Venezuela vs UK), overlapping active periods in incompatible roles"
|
||||
}
|
||||
],
|
||||
"disambiguation_note": "This is the Venezuelan actress, NOT the UK-based art curator."
|
||||
},
|
||||
"web_claims": [
|
||||
{
|
||||
"claim_type": "birth_year",
|
||||
"claim_value": 1952,
|
||||
"provenance": {
|
||||
"source_url": "https://en.wikipedia.org/wiki/Carmen_Julia_Álvarez",
|
||||
"retrieved_on": "2026-01-11T14:30:00Z",
|
||||
"retrieval_agent": "manual-human-curator"
|
||||
}
|
||||
},
|
||||
{
|
||||
"claim_type": "profession",
|
||||
"claim_value": "actress",
|
||||
"provenance": {
|
||||
"source_url": "https://en.wikipedia.org/wiki/Carmen_Julia_Álvarez",
|
||||
"retrieved_on": "2026-01-11T14:30:00Z",
|
||||
"retrieval_agent": "manual-human-curator"
|
||||
}
|
||||
}
|
||||
],
|
||||
"extraction_metadata": {
|
||||
"created_at": "2026-01-11T15:00:00Z",
|
||||
"created_by": "manual-human-curator",
|
||||
"creation_reason": "Created during entity resolution to distinguish from heritage worker Carmen Juliá"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Update the Heritage Profile Too
|
||||
|
||||
The heritage profile should also reference the disambiguation:
|
||||
|
||||
```json
|
||||
{
|
||||
"ppid": "ID_UK-XX-XXX_XXXX_UK-XX-XXX_XXXX_CARMEN-JULIA",
|
||||
"profile_data": {
|
||||
"full_name": "Carmen Juliá",
|
||||
"headline": "Curator at New Contemporaries"
|
||||
},
|
||||
"heritage_relevance": {
|
||||
"is_heritage_relevant": true,
|
||||
"relevance_score": 0.85
|
||||
},
|
||||
"disambiguation_notes": {
|
||||
"known_namesakes": [
|
||||
{
|
||||
"ppid": "ID_VE-XX-CCS_1952_VE-XX-CCS_XXXX_CARMEN-JULIA-ALVAREZ",
|
||||
"name": "Carmen Julia Álvarez",
|
||||
"profession": "actress",
|
||||
"location": "Venezuela",
|
||||
"why_not_same_person": "Different profession, location, timeline"
|
||||
}
|
||||
],
|
||||
"disambiguation_warning": "Web searches for 'Carmen Julia' return data about Venezuelan actress Carmen Julia Álvarez (born 1952). This is a DIFFERENT person."
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## When to Create Namesake Profiles
|
||||
|
||||
Create a PPID profile for a namesake when:
|
||||
|
||||
1. **Entity resolution proves they are a different person**
|
||||
2. **They are notable enough** to appear in search results repeatedly (Wikipedia, IMDB, news)
|
||||
3. **The confusion risk is high** (similar name, some overlapping attributes)
|
||||
|
||||
**Do NOT create profiles for**:
|
||||
- Random social media accounts with no notable presence
|
||||
- Obvious mismatches unlikely to recur in searches
|
||||
|
||||
---
|
||||
|
||||
## Benefits
|
||||
|
||||
1. **Universal person database**: Any person can have a PPID
|
||||
2. **Prevents repeated mistakes**: Future enrichment can check for known namesakes
|
||||
3. **Bidirectional linking**: Both profiles reference each other
|
||||
4. **Consistent data model**: No special file naming or profile types needed
|
||||
5. **Audit trail**: Documents why profiles were created
|
||||
|
||||
---
|
||||
|
||||
## Workflow
|
||||
|
||||
### Step 1: During Entity Resolution
|
||||
|
||||
When you reject a claim due to identity mismatch with a notable namesake:
|
||||
|
||||
```
|
||||
1. Document WHY the source describes a different person
|
||||
2. Check if the namesake is notable (Wikipedia, IMDB, frequent search results)
|
||||
3. If notable → Create PPID profile for the namesake
|
||||
4. Link both profiles via disambiguation_notes
|
||||
```
|
||||
|
||||
### Step 2: Create Namesake Profile
|
||||
|
||||
Use standard PPID naming:
|
||||
```
|
||||
ID_{birth-location}_{birth-decade}_{current-location}_{death-decade}_{NAME}.json
|
||||
```
|
||||
|
||||
Example: `ID_VE-XX-CCS_1952_VE-XX-CCS_XXXX_CARMEN-JULIA-ALVAREZ.json`
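
The naming convention can be sketched as a small helper. `ppidFileName`, `nameToSlug`, and `PpidParts` are hypothetical names. The template calls the date fields "decade" while the example uses `1952`, so this sketch passes the value through unchanged as a string and makes no assumption about how it is derived.

```typescript
// Hypothetical input shape for illustration; values are passed through as given.
interface PpidParts {
  birthLocation: string;   // e.g. "VE-XX-CCS"
  birthDate: string;       // e.g. "1952", or "XXXX" when unknown
  currentLocation: string; // e.g. "VE-XX-CCS"
  deathDate: string;       // e.g. "XXXX" when unknown or still living
  fullName: string;        // e.g. "Carmen Julia Álvarez"
}

// Uppercase ASCII slug: strip diacritics, then join word runs with hyphens.
function nameToSlug(fullName: string): string {
  return fullName
    .normalize("NFD")                 // split letters from combining accents
    .replace(/[\u0300-\u036f]/g, "")  // drop the combining accents
    .toUpperCase()
    .replace(/[^A-Z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

function ppidFileName(p: PpidParts): string {
  const segments = [
    "ID",
    p.birthLocation,
    p.birthDate,
    p.currentLocation,
    p.deathDate,
    nameToSlug(p.fullName),
  ];
  return segments.join("_") + ".json";
}
```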
|
||||
|
||||
### Step 3: Update Both Profiles
|
||||
|
||||
- Namesake profile: Add `commonly_confused_with` pointing to heritage profile
|
||||
- Heritage profile: Add `known_namesakes` pointing to namesake profile
|
||||
|
||||
---
|
||||
|
||||
## Historical Persons
|
||||
|
||||
Historical persons (deceased) can also have PPID profiles:
|
||||
|
||||
```json
|
||||
{
|
||||
"ppid": "ID_DE-XX-XXX_1901_DE-XX-XXX_1951_ROBERT-RITTER",
|
||||
"profile_data": {
|
||||
"full_name": "Robert Ritter",
|
||||
"profession": "physician",
|
||||
"birth_year": 1901,
|
||||
"death_year": 1951,
|
||||
"nationality": "German",
|
||||
"historical_note": "Nazi-era physician involved in racial hygiene programs"
|
||||
},
|
||||
"heritage_relevance": {
|
||||
"is_heritage_relevant": false,
|
||||
"relevance_score": 0.0
|
||||
},
|
||||
"disambiguation_notes": {
|
||||
"commonly_confused_with": [
|
||||
{
|
||||
"ppid": "ID_XX-XX-XXX_XXXX_XX-XX-XXX_XXXX_ROBERT-RITTER",
|
||||
"name": "Robert Ritter",
|
||||
"profession": "heritage worker",
|
||||
"why_different": "Different era - historical figure (1901-1951) vs living heritage professional"
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Related Rules
|
||||
|
||||
- **Rule 46**: Entity Resolution - Names Are NEVER Sufficient
|
||||
- **Rule 21**: Data Fabrication is Strictly Prohibited
|
||||
- **Rule 26**: Person Data Provenance - Web Claims for Staff Information
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
**The PPID system is universal.** When you discover during entity resolution that a web source describes a different person:
|
||||
|
||||
1. **Create a regular PPID profile** for the namesake (actress, historical figure, etc.)
|
||||
2. **Set `heritage_relevance.is_heritage_relevant: false`** (unless they happen to also work in heritage)
|
||||
3. **Link both profiles** via `disambiguation_notes`
|
||||
4. **Use standard PPID naming** - no special prefixes needed
|
||||
|
||||
This builds a comprehensive person database while preventing entity resolution errors.
|
||||
|
|
@ -1,307 +0,0 @@
|
|||
# Rule 46: Entity Resolution - Names Are NEVER Sufficient
|
||||
|
||||
## Status: CRITICAL
|
||||
|
||||
## 🚨 DATA QUALITY IS OF UTMOST IMPORTANCE 🚨
|
||||
|
||||
**Wrong data is worse than no data.** Attributing a birth year, spouse, or social media profile to the wrong person is a **critical data quality failure** that undermines the entire dataset's trustworthiness.
|
||||
|
||||
**ALL enrichments MUST be done MANUALLY and double-checked.** Automated web search enrichment has been DISABLED due to catastrophic entity resolution failures (540+ false claims removed in Jan 2026).
|
||||
|
||||
**The cost of false data**:
|
||||
- Corrupts downstream analysis and reporting
|
||||
- Creates legal/privacy risks (attributing data to wrong person)
|
||||
- Destroys user trust in the dataset
|
||||
- Requires expensive manual cleanup
|
||||
|
||||
---
|
||||
|
||||
## 🚫 AUTOMATED ENRICHMENT IS PROHIBITED 🚫
|
||||
|
||||
**DO NOT USE** automated scripts to enrich person profiles with web search data.
|
||||
|
||||
**Why automated enrichment failed**:
|
||||
- Web searches return data about DIFFERENT people with similar names
|
||||
- Regex pattern matching cannot distinguish between namesakes
|
||||
- Wikipedia, IMDB, ResearchGate, Instagram all returned data from wrong people
|
||||
- Example: "Carmen Juliá" search returned Venezuelan actress, Mexican hydrogeologist, Spanish medievalist - NONE were the UK art curator
|
||||
|
||||
**ONLY ALLOWED enrichment methods**:
|
||||
1. **Manual research** - Human curator verifies source refers to the correct person
|
||||
2. **Institutional sources** - Data from the person's employer website (verified)
|
||||
3. **LinkedIn profile data** - Already verified via direct profile access
|
||||
4. **ORCID/Wikidata** - If the person has a verified identifier
|
||||
|
||||
---
|
||||
|
||||
## The Core Principle
|
||||
|
||||
🚨 **SIMILAR OR IDENTICAL NAMES ARE NEVER SUFFICIENT FOR ENTITY RESOLUTION.**
|
||||
|
||||
A web search result mentioning "Carmen Juliá born 1952" is **NOT** evidence that the Carmen Juliá in our person profile was born in 1952. Names are not unique identifiers - there are thousands of people with the same name worldwide.
|
||||
|
||||
**Entity resolution requires verification of MULTIPLE independent identity attributes:**
|
||||
|
||||
| Attribute | Purpose | Example |
|
||||
|-----------|---------|---------|
|
||||
| **Age/Birth Year** | Temporal consistency | Both sources describe someone in their 40s |
|
||||
| **Career Path** | Professional identity | Both are art curators, not one curator and one actress |
|
||||
| **Location** | Geographic consistency | Both are based in UK, not one UK and one Venezuela |
|
||||
| **Employer** | Institutional affiliation | Both work at New Contemporaries |
|
||||
| **Education** | Academic background | Same university or field |
|
||||
|
||||
**Minimum Requirement**: At least **3 of 5** attributes must match before attributing ANY claim from a web source. Name match alone = **AUTOMATIC REJECTION**.
|
||||
|
||||
## Problem Statement
|
||||
|
||||
When enriching person profiles via web search (Linkup, Exa, etc.), search results often return data about **different people with similar or identical names**. Without proper entity resolution, the enrichment process can attribute false claims to the wrong person.
|
||||
|
||||
**Example Failure** (Carmen Juliá - UK Art Curator):
|
||||
- Source profile: Carmen Juliá, Curator at New Contemporaries (UK)
|
||||
- Birth year extracted: 1952 from Carmen Julia **Álvarez** (Venezuelan actress)
|
||||
- Spouse extracted: "actors Eduardo Serrano" from the Venezuelan actress
|
||||
- ResearchGate: Carmen Julia **Navarro** (Mexican hydrogeologist)
|
||||
- Academia.edu: Carmen Julia **Gutiérrez** (Spanish medieval studies)
|
||||
|
||||
All data is from **different people** - none is the actual Carmen Juliá who is a UK-based art curator.
|
||||
|
||||
**Why This Happened**: The enrichment script used regex pattern matching to extract "born 1952" without verifying that the Wikipedia article described the SAME person.
|
||||
|
||||
## The Rule
|
||||
|
||||
### DO NOT use name matching as the basis for entity resolution. EVER.
|
||||
|
||||
For person enrichment via web search:
|
||||
|
||||
**FORBIDDEN** (Name-based extraction):
|
||||
- ❌ Extracting birth years from any search result mentioning "Carmen Julia born..."
|
||||
- ❌ Attributing social media profiles just because the name appears
|
||||
- ❌ Claiming relationships (spouse, parent, child) from web text pattern matching
|
||||
- ❌ Assigning academic profiles (ResearchGate, Academia.edu, Google Scholar) based on name matching alone
|
||||
- ❌ Using Wikipedia articles without verifying ALL identity attributes
|
||||
- ❌ Trusting genealogy sites (Geni, Ancestry, MyHeritage) which describe historical namesakes
|
||||
- ❌ Using IMDB for birth years (actors with same names)
|
||||
|
||||
**REQUIRED** (Multi-Attribute Entity Resolution):
|
||||
1. **Verify identity via MULTIPLE attributes** - name alone is INSUFFICIENT
|
||||
2. **Cross-reference with known facts** (employer, location, job title from LinkedIn)
|
||||
3. **Detect conflicting signals** - actress vs curator, Venezuela vs UK, 1950s birth vs active 2020s career
|
||||
4. **Reject ambiguous matches** - if source doesn't clearly identify the same person, reject the claim
|
||||
5. **Document rejection rationale** - log why claim was rejected for audit trail
|
||||
|
||||
## Entity Resolution Verification Checklist
|
||||
|
||||
Before attributing a web claim to a person profile, verify MULTIPLE identity attributes:
|
||||
|
||||
| # | Attribute | What to Check | Example Match | Example Conflict |
|
||||
|---|-----------|---------------|---------------|------------------|
|
||||
| 1 | **Career/Profession** | Same field/industry | Both are curators | Source says "actress", profile is curator |
|
||||
| 2 | **Employer** | Same institution | Both at Rijksmuseum | Source says "film studio", profile is museum |
|
||||
| 3 | **Location** | Same city/country | Both UK-based | Source says Venezuela, profile is UK |
|
||||
| 4 | **Age Range** | Plausible for career | Birth 1980s, active 2020s | Birth 1952, still active in 2025 as junior |
|
||||
| 5 | **Education** | Same university/field | Both art history | Source says "medical school" |
|
||||
|
||||
**Minimum requirement**: At least **3 of 5** attributes must match. Name match alone = **AUTOMATIC REJECTION**.
|
||||
|
||||
**Any conflicting signal = AUTOMATIC REJECTION** (e.g., source says "actress" when profile is "curator").
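
The checklist can be sketched as a decision helper for a human curator to record their findings. The attribute keys, the `resolve_entity` name, and the reason strings are assumptions for illustration. The Red Flags section treats profession and location differences as prompts for investigation, so a real workflow might return a "needs review" status where this sketch rejects outright.

```python
ATTRIBUTES = ("profession", "employer", "location", "age_range", "education")


def resolve_entity(evidence, required_matches=3):
    """Decide on a claim from per-attribute evidence (illustrative).

    `evidence` maps an attribute name to "match", "conflict", or "unknown".
    A bare name match carries no attribute evidence and is always rejected.
    """
    matches = [a for a in ATTRIBUTES if evidence.get(a) == "match"]
    conflicts = [a for a in ATTRIBUTES if evidence.get(a) == "conflict"]
    if conflicts:
        # Follows the checklist's "any conflicting signal" clause.
        return {"status": "REJECTED",
                "reason": f"conflicting_{conflicts[0]}",
                "matched": matches}
    if len(matches) < required_matches:
        return {"status": "REJECTED",
                "reason": "insufficient_matches",
                "matched": matches}
    return {"status": "ACCEPTED",
            "reason": "multi_attribute_match",
            "matched": matches}


# Name-only hit: no attribute evidence at all.
print(resolve_entity({}))
# Source describes an actress in Venezuela; profile is a UK curator.
print(resolve_entity({"profession": "conflict", "location": "conflict"}))
```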
|
||||
|
||||
## Sources with High Entity Resolution Risk
|
||||
|
||||
These sources are NOT forbidden, but require **stricter verification thresholds** due to high false-positive rates:
|
||||
|
||||
| Source Type | Risk Level | Why | Required Matches |
|
||||
|-------------|------------|-----|------------------|
|
||||
| Genealogy sites | CRITICAL | Historical persons with same name | 5/5 attributes (or explicit link to living person) |
|
||||
| IMDB | CRITICAL | Actors with common names | 5/5 attributes (unless person works in film/TV) |
|
||||
| Wikipedia | HIGH | Many people with same name have pages | 4/5 attributes match |
|
||||
| Academic profiles | HIGH | Multiple researchers with same name | 4/5 attributes + institution match |
|
||||
| Social media | HIGH | Many accounts with similar handles | 4/5 attributes + verify employer/location in bio |
|
||||
| News articles | MEDIUM | May mention multiple people | 3/5 attributes + read full context |
|
||||
| Institutional websites | LOW | Usually about their own staff | 2/5 attributes (good source if person works there) |
|
||||
|
||||
**Key point**: High-risk sources CAN be used if you verify enough identity attributes. The risk level determines the verification threshold, not whether the source is allowed.
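
As a rough sketch, the thresholds in the table could be kept in a lookup. The dictionary keys are hypothetical labels, and the extra conditions in the table (institution match, reading the full context, the film/TV exception) are not encoded here.

```python
# Required attribute matches (out of 5) per source type, from the table above.
REQUIRED_MATCHES_BY_SOURCE = {
    "genealogy": 5,
    "imdb": 5,
    "wikipedia": 4,
    "academic_profile": 4,
    "social_media": 4,
    "news_article": 3,
    "institutional_website": 2,
}


def required_matches_for_source(source_type):
    """Return the verification threshold for a source type (illustrative)."""
    # Assumption: unknown source types fall back to the strictest threshold.
    return REQUIRED_MATCHES_BY_SOURCE.get(source_type, 5)


print(required_matches_for_source("wikipedia"))
```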
|
||||
|
||||
## Red Flags Requiring Investigation
|
||||
|
||||
The following are **red flags** that require careful investigation - NOT automatic rejection. People change careers and relocate.
|
||||
|
||||
### Profession Differences
|
||||
If source profession differs from profile profession, **investigate**:
|
||||
```
|
||||
Source: "actress", "actor", "singer"
|
||||
Profile: "curator", "archivist", "librarian"
|
||||
|
||||
ASK: Did this person change careers?
|
||||
- Check timeline: Did acting career END before heritage career BEGAN?
|
||||
- Check for transition evidence: "former actress turned curator"
|
||||
- If careers overlap in time → likely different people → REJECT
|
||||
- If sequential careers with clear transition → may be same person → ACCEPT with documentation
|
||||
```
|
||||
|
||||
### Location Differences
|
||||
If source location differs from profile location, **investigate**:
|
||||
```
|
||||
Source: "Venezuela", "Mexico", "Brazil"
|
||||
Profile: "UK", "Netherlands", "France"
|
||||
|
||||
ASK: Did this person relocate?
|
||||
- Check timeline: When were they in each location?
|
||||
- Check for migration evidence: education abroad, international career moves
|
||||
- If locations overlap in time → likely different people → REJECT
|
||||
- If sequential locations with clear move → may be same person → ACCEPT with documentation
|
||||
```
|
||||
|
||||
### When to Actually REJECT
|
||||
|
||||
Reject when investigation shows **no plausible connection**:
|
||||
```
|
||||
Example: Carmen Julia Álvarez (Venezuelan actress, active 1970s-2000s)
|
||||
vs Carmen Juliá (UK curator, active 2015-present)
|
||||
|
||||
- Sequential active periods in DIFFERENT professions on DIFFERENT continents
|
||||
- No evidence of career change or relocation
|
||||
- Birth year 1952 makes current junior curator role implausible
|
||||
→ REJECT: These are clearly different people
|
||||
```
|
||||
|
||||
### Age Conflicts (Still Automatic Rejection)
|
||||
If source age is **physically implausible** for profile career stage, REJECT:
|
||||
```
|
||||
Source: Born 1922, 1915, 1939
|
||||
Profile: Currently active professional in 2025
|
||||
→ REJECT (person would be 86-110 years old)
|
||||
|
||||
Source: Born 2007, 2004
|
||||
Profile: Senior curator
|
||||
→ REJECT (person would be 18-21, too young)
|
||||
```
|
||||
|
||||
### Genealogy Source
|
||||
Genealogy sources require **5 of 5 attribute matches** due to high false-positive rates:
|
||||
```
|
||||
Domains: geni.com, ancestry.*, familysearch.org, findagrave.com, myheritage.*
|
||||
→ REQUIRE 5/5 attribute matches (these often describe historical namesakes)
|
||||
→ Exception: If source explicitly links to living person with verifiable connection
|
||||
```
|
||||
|
||||
## Claim Rejection Patterns
|
||||
|
||||
The following inconsistent patterns should trigger automatic claim rejection:
|
||||
|
||||
```python
|
||||
# Genealogy sources conflict - ALWAYS REJECT
|
||||
GENEALOGY_DOMAINS = [
|
||||
'geni.com', 'ancestry.com', 'ancestry.co.uk', 'familysearch.org',
|
||||
'findagrave.com', 'myheritage.com', 'wikitree.com', 'geneanet.org'
|
||||
]
|
||||
|
||||
# Profession conflicts - if profile has one and source has another, REJECT
|
||||
PROFESSION_CONFLICTS = {
|
||||
'heritage': ['curator', 'archivist', 'librarian', 'conservator', 'registrar', 'collection manager'],
|
||||
'entertainment': ['actress', 'actor', 'singer', 'footballer', 'politician', 'model', 'athlete'],
|
||||
'medical': ['doctor', 'nurse', 'surgeon', 'physician'],
|
||||
'tech': ['software engineer', 'developer', 'programmer'],
|
||||
}
|
||||
|
||||
# Location conflicts - if source describes person in location X and profile is location Y, REJECT
|
||||
LOCATION_PAIRS = [
|
||||
('venezuela', 'uk'), ('venezuela', 'netherlands'), ('venezuela', 'germany'),
|
||||
('mexico', 'uk'), ('mexico', 'netherlands'), ('brazil', 'france'),
|
||||
('caracas', 'london'), ('caracas', 'amsterdam'),
|
||||
]
|
||||
|
||||
# Age impossibility - if birth year makes current career implausible, REJECT. For instance, for a Junior role:
|
||||
MIN_PLAUSIBLE_BIRTH_YEAR = 1945 # Would be 80 in 2025 - still plausible but verify
|
||||
MAX_PLAUSIBLE_BIRTH_YEAR = 2002 # Would be 23 in 2025 - plausible for junior roles
|
||||
```
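
A minimal sketch of how two of these patterns might be checked. The domain list and year bounds are repeated so the example runs on its own; the function names are hypothetical, and the subdomain matching is an assumption about how source URLs would be compared.

```python
from urllib.parse import urlparse

GENEALOGY_DOMAINS = (
    "geni.com", "ancestry.com", "ancestry.co.uk", "familysearch.org",
    "findagrave.com", "myheritage.com", "wikitree.com", "geneanet.org",
)
MIN_PLAUSIBLE_BIRTH_YEAR = 1945
MAX_PLAUSIBLE_BIRTH_YEAR = 2002


def is_genealogy_source(url):
    """True if the URL's host is a listed genealogy domain or a subdomain."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in GENEALOGY_DOMAINS)


def birth_year_is_plausible(birth_year):
    """True if the birth year falls inside the plausible working-age window."""
    return MIN_PLAUSIBLE_BIRTH_YEAR <= birth_year <= MAX_PLAUSIBLE_BIRTH_YEAR


print(is_genealogy_source("https://www.geni.com/people/example"))
print(birth_year_is_plausible(1922))
```

Comparing parsed hostnames rather than raw substrings avoids flagging unrelated URLs that merely contain a domain name in their path.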
|
||||
|
||||
## Handling Rejected Claims
|
||||
|
||||
When a claim fails entity resolution:
|
||||
|
||||
```json
|
||||
{
|
||||
"claim_type": "birth_year",
|
||||
"claim_value": 1952,
|
||||
"entity_resolution": {
|
||||
"status": "REJECTED",
|
||||
"reason": "conflicting_profession",
|
||||
"details": "Source describes Venezuelan actress, profile is UK curator",
|
||||
"source_identity": "Carmen Julia Álvarez (Venezuelan actress)",
|
||||
"profile_identity": "Carmen Juliá (UK art curator)",
|
||||
"rejected_at": "2026-01-11T15:00:00Z",
|
||||
"rejected_by": "entity_resolution_validator_v1"
|
||||
}
|
||||
}
|
||||
```
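
The record above could be assembled by a small helper like the following sketch. The function name `build_rejection` and its parameters are assumptions; only the output field names come from the example.

```python
from datetime import datetime, timezone


def build_rejection(claim_type, claim_value, reason, details,
                    source_identity, profile_identity, rejected_by, now=None):
    """Build a rejected-claim record in the shape shown above (illustrative)."""
    now = now or datetime.now(timezone.utc)
    return {
        "claim_type": claim_type,
        "claim_value": claim_value,
        "entity_resolution": {
            "status": "REJECTED",
            "reason": reason,
            "details": details,
            "source_identity": source_identity,
            "profile_identity": profile_identity,
            # UTC timestamp in the same format as the example record.
            "rejected_at": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "rejected_by": rejected_by,
        },
    }


record = build_rejection(
    "birth_year", 1952, "conflicting_profession",
    "Source describes Venezuelan actress, profile is UK curator",
    "Carmen Julia Álvarez (Venezuelan actress)",
    "Carmen Juliá (UK art curator)",
    "entity_resolution_validator_v1",
    now=datetime(2026, 1, 11, 15, 0, 0, tzinfo=timezone.utc),
)
```

Keeping the rejected value in the record means the same source can be recognized and skipped in later enrichment passes.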
|
||||
|
||||
## Special Cases
|
||||
|
||||
### Common Names
|
||||
|
||||
For very common names (e.g., "John Smith", "Maria García", "Jan de Vries"), require **4 of 5** verification checks instead of 3. The more common the name, the higher the threshold.
|
||||
|
||||
| Name Commonality | Required Matches |
|
||||
|------------------|------------------|
|
||||
| Unique name (e.g., "Xander Vermeulen-Oosterhuis") | 2 of 5 |
|
||||
| Moderately common (e.g., "Carmen Juliá") | 3 of 5 |
|
||||
| Very common (e.g., "Jan de Vries") | 4 of 5 |
|
||||
| Extremely common (e.g., "John Smith") | 5 of 5 or reject |
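
A short sketch of the name-commonality thresholds. The category keys are hypothetical, and taking the stricter of the name-based and source-based thresholds is an assumption; the rule does not state how the two interact.

```python
# Required attribute matches (out of 5) by name commonality, from the table.
REQUIRED_MATCHES_BY_NAME = {
    "unique": 2,
    "moderately_common": 3,
    "very_common": 4,
    "extremely_common": 5,
}


def required_matches(name_commonality, source_threshold=0):
    """Return the stricter of the name and source thresholds (assumed rule)."""
    return max(REQUIRED_MATCHES_BY_NAME[name_commonality], source_threshold)


# A moderately common name found on a source that itself requires 4 of 5.
print(required_matches("moderately_common", source_threshold=4))
```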
|
||||
|
||||
### Abbreviated Names
|
||||
|
||||
For profiles with abbreviated names (e.g., "J. Smith"), entity resolution is inherently uncertain:
|
||||
- Set `entity_resolution_confidence: "very_low"`
|
||||
- Require **human review** for all claims
|
||||
- Do NOT attribute web claims automatically
|
||||
|
||||
### Historical Persons
|
||||
|
||||
When sources describe historical/deceased persons:
|
||||
- Check if death date conflicts with profile activity (living person active in 2025)
|
||||
- **ALWAYS REJECT** genealogy site data
|
||||
- Reject any source describing events before 1950 unless profile is known to be historical
|
||||
|
||||
### Wikipedia Articles
|
||||
|
||||
Wikipedia is particularly dangerous because:
|
||||
- Many people with the same name have articles
|
||||
- Search engines return Wikipedia first
|
||||
- The Wikipedia Carmen Julia Álvarez article describes a Venezuelan actress born 1952
|
||||
- This is a DIFFERENT PERSON from Carmen Juliá the UK curator
|
||||
|
||||
**For Wikipedia sources**:
|
||||
1. Read the FULL article, not just snippets
|
||||
2. Verify the Wikipedia subject's profession matches the profile
|
||||
3. Verify the Wikipedia subject's location matches the profile
|
||||
4. If ANY conflict detected → REJECT
|
||||
|
||||
## Audit Trail
|
||||
|
||||
All entity resolution decisions must be logged:
|
||||
|
||||
```json
|
||||
{
|
||||
"enrichment_history": [
|
||||
{
|
||||
"enrichment_timestamp": "2026-01-11T15:00:00Z",
|
||||
"enrichment_agent": "enrich_person_comprehensive.py v1.4.0",
|
||||
"entity_resolution_decisions": [
|
||||
{
|
||||
"source_url": "https://en.wikipedia.org/wiki/Carmen_Julia_Álvarez",
|
||||
"decision": "REJECTED",
|
||||
"reason": "Different person - Venezuelan actress, not UK curator"
|
||||
}
|
||||
],
|
||||
"claims_rejected_count": 5,
|
||||
"claims_accepted_count": 1
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## See Also
|
||||
|
||||
- Rule 21: Data Fabrication is Strictly Prohibited
|
||||
- Rule 26: Person Data Provenance - Web Claims for Staff Information
|
||||
- Rule 45: Inferred Data Must Be Explicit with Provenance
|
||||
|
|
@ -1,422 +0,0 @@
|
|||
# Rule 45: Inferred Data Must Be Explicit with Provenance
|
||||
|
||||
**Status**: Active
|
||||
**Created**: 2025-01-09
|
||||
**Applies to**: PPID enrichment, person entity profiles, any data inference
|
||||
|
||||
## Core Principle
|
||||
|
||||
**All inferred data MUST be stored in explicit `inferred_*` fields with full provenance statements. Inferred values MUST NEVER silently replace or merge with verified data.**
|
||||
|
||||
This ensures:
|
||||
1. **Transparency**: Users can distinguish verified facts from heuristic estimates
|
||||
2. **Auditability**: The inference method and source observations are traceable
|
||||
3. **Reversibility**: Inferred data can be corrected when verified data becomes available
|
||||
4. **Quality Signals**: Confidence levels and argument chains are preserved
|
||||
|
||||
## Required Structure for Inferred Data
|
||||
|
||||
Every inferred claim MUST include:
|
||||
|
||||
```yaml
|
||||
inferred_[field_name]:
|
||||
value: "the inferred value"
|
||||
edtf: "196X" # For dates: EDTF notation
|
||||
formatted: "NL-UT-UTR" # For locations: CC-RR-PPP format
|
||||
confidence: "very_low|low|medium|high"
|
||||
inference_provenance:
|
||||
method: "heuristic_name"
|
||||
inference_chain:
|
||||
- step: 1
|
||||
observation: "University start year 1986"
|
||||
source_field: "profile_data.education[0].date_range"
|
||||
source_value: "1986 - 1990"
|
||||
- step: 2
|
||||
assumption: "University entry at age 18"
|
||||
rationale: "Standard Dutch university entry age"
|
||||
- step: 3
|
||||
calculation: "1986 - 18 = 1968"
|
||||
result: "Estimated birth year 1968"
|
||||
- step: 4
|
||||
generalization: "Round to decade → 196X"
|
||||
rationale: "EDTF decade notation for uncertain years"
|
||||
inferred_at: "2025-01-09T18:00:00Z"
|
||||
inferred_by: "enrich_ppids.py"
|
||||
```
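
The structure above can be produced step by step, as in this sketch. The function name `infer_birth_decade` and the default entry age of 18 are assumptions for illustration; it is not the actual `enrich_ppids.py` logic.

```python
def infer_birth_decade(start_year, source_field, source_value,
                       inferred_at, inferred_by, entry_age=18):
    """Build an inferred birth-decade record with its chain (illustrative)."""
    estimate = start_year - entry_age
    decade = f"{estimate // 10}X"  # e.g. 1968 -> "196X"
    return {
        "value": decade,
        "edtf": decade,
        "confidence": "low",  # multi-step heuristic with assumptions
        "inference_provenance": {
            "method": "earliest_education_heuristic",
            "inference_chain": [
                {"step": 1,
                 "observation": f"University start year {start_year}",
                 "source_field": source_field,
                 "source_value": source_value},
                {"step": 2,
                 "assumption": f"University entry at age {entry_age}",
                 "rationale": "Assumed typical entry age"},
                {"step": 3,
                 "calculation": f"{start_year} - {entry_age} = {estimate}",
                 "result": f"Estimated birth year {estimate}"},
                {"step": 4,
                 "generalization": f"Round to decade → {decade}",
                 "rationale": "EDTF decade notation for uncertain years"},
            ],
            "inferred_at": inferred_at,
            "inferred_by": inferred_by,
        },
    }


record = infer_birth_decade(
    1986, "profile_data.education[0].date_range", "1986 - 1990",
    inferred_at="2025-01-09T18:00:00Z", inferred_by="enrich_ppids.py",
)
```

The result belongs under an `inferred_birth_decade` key, leaving the canonical `birth_date` field untouched.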
|
||||
|
||||
## Explicit Inferred Fields
|
||||
|
||||
### For Person Profiles (PPID)
|
||||
|
||||
| Inferred Field | Source Observations | Heuristic |
|
||||
|----------------|---------------------|-----------|
|
||||
| `inferred_birth_year` | Earliest education/job dates | Entry age assumptions |
|
||||
| `inferred_birth_decade` | Birth year estimate | EDTF decade notation |
|
||||
| `inferred_birth_settlement` | School/university location | Residential proximity |
|
||||
| `inferred_birth_region` | Settlement location | GeoNames admin1 |
|
||||
| `inferred_birth_country` | Settlement location | GeoNames country |
|
||||
| `inferred_current_settlement` | Profile location, current job | Direct extraction |
|
||||
| `inferred_current_region` | Settlement location | GeoNames admin1 |
|
||||
| `inferred_current_country` | Settlement location | GeoNames country |
|
||||
|
||||
### Example: Complete Inferred Birth Data
|
||||
|
||||
```json
|
||||
{
|
||||
"ppid": "ID_NL-UT-UTR_196X_NL-UT-UTR_XXXX_AART-HARTEN",
|
||||
|
||||
"birth_date": {
|
||||
"edtf": "XXXX",
|
||||
"precision": "unknown",
|
||||
"note": "See inferred_birth_decade for heuristic estimate"
|
||||
},
|
||||
|
||||
"inferred_birth_decade": {
|
||||
"value": "196X",
|
||||
"edtf": "196X",
|
||||
"precision": "decade",
|
||||
"confidence": "low",
|
||||
"inference_provenance": {
|
||||
"method": "earliest_education_heuristic",
|
||||
"inference_chain": [
|
||||
{
|
||||
"step": 1,
|
||||
"observation": "University education record found",
|
||||
"source_field": "profile_data.education[0]",
|
||||
"source_value": {
|
||||
"institution": "Universiteit Utrecht",
|
||||
"degree": "Social & Organisational psychology, doctoraal",
|
||||
"date_range": "1986 - 1990"
|
||||
}
|
||||
},
|
||||
{
|
||||
"step": 2,
|
||||
"extraction": "Start year extracted from date_range",
|
||||
"extracted_value": 1986
|
||||
},
|
||||
{
|
||||
"step": 3,
|
||||
"assumption": "University entry age",
|
||||
"assumed_value": 18,
|
||||
"rationale": "Standard Dutch university entry age (post-VWO)",
|
||||
"confidence_impact": "Assumption reduces confidence; actual age 17-20 possible"
|
||||
},
|
||||
{
|
||||
"step": 4,
|
||||
"calculation": "1986 - 18 = 1968",
|
||||
"result": "Estimated birth year: 1968"
|
||||
},
|
||||
{
|
||||
"step": 5,
|
||||
"generalization": "Convert to EDTF decade",
|
||||
"input": 1968,
|
||||
"output": "196X",
|
||||
"rationale": "Decade precision appropriate for heuristic estimate"
|
||||
}
|
||||
],
|
||||
"inferred_at": "2025-01-09T18:00:00Z",
|
||||
"inferred_by": "enrich_ppids.py"
|
||||
}
|
||||
},
|
||||
|
||||
"inferred_birth_settlement": {
|
||||
"value": "Utrecht",
|
||||
"formatted": "NL-UT-UTR",
|
||||
"confidence": "low",
|
||||
"inference_provenance": {
|
||||
"method": "earliest_education_location",
|
||||
"inference_chain": [
|
||||
{
|
||||
"step": 1,
|
||||
"observation": "Earliest education institution identified",
|
||||
"source_field": "profile_data.education[0].institution",
|
||||
"source_value": "Universiteit Utrecht"
|
||||
},
|
||||
{
|
||||
"step": 2,
|
||||
"lookup": "Institution location mapping",
|
||||
"mapping_key": "Universiteit Utrecht",
|
||||
"mapping_value": "Utrecht, Netherlands"
|
||||
},
|
||||
{
|
||||
"step": 3,
|
||||
"geocoding": "GeoNames resolution",
|
||||
"query": "Utrecht",
|
||||
"country_code": "NL",
|
||||
"result": {
|
||||
"geonames_id": 2745912,
|
||||
"name": "Utrecht",
|
||||
"admin1_code": "09",
|
||||
"admin1_name": "Utrecht"
|
||||
}
|
||||
},
|
||||
{
|
||||
"step": 4,
|
||||
"formatting": "CC-RR-PPP generation",
|
||||
"country_code": "NL",
|
||||
"region_code": "UT",
|
||||
"settlement_code": "UTR",
|
||||
"result": "NL-UT-UTR"
|
||||
}
|
||||
],
|
||||
"assumption_note": "University location used as proxy for birth location; student may have relocated for education",
|
||||
"inferred_at": "2025-01-09T18:00:00Z",
|
||||
"inferred_by": "enrich_ppids.py"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## List-Valued Inferred Data (EDTF Set Notation)
|
||||
|
||||
When inference yields multiple plausible values (e.g., an estimated birth year of 1968 with a ±3-year margin could fall in either the 1960s or the 1970s), store them as a **list** with EDTF set notation.
|
||||
|
||||
### EDTF Set Notation Standards
|
||||
|
||||
| Notation | Meaning | Use Case |
|
||||
|----------|---------|----------|
|
||||
| `[196X,197X]` | One of these values | Person born in late 1960s (uncertainty spans decades) |
|
||||
| `{196X,197X}` | All of these values | NOT for birth decade (use `[...]`) |
|
||||
| `[1965..1970]` | Range within set | Birth year between 1965-1970 |
|
||||
|
||||
### When to Use List Values
|
||||
|
||||
1. **Decade Boundary Cases**: Estimated birth year is within 3 years of a decade boundary
|
||||
- Estimated 1968 → `[196X,197X]` (could be late 60s or early 70s due to age assumption variance)
|
||||
- Estimated 1972 → `[196X,197X]` (same logic)
|
||||
- Estimated 1975 → `197X` (confidently mid-decade)
|
||||
|
||||
2. **Multiple Plausible Locations**: Student attended schools in different cities
|
||||
- `["NL-UT-UTR", "NL-NH-AMS"]` with provenance explaining each candidate
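
The decade-boundary case can be sketched by applying the ±3-year margin to the estimate and collecting every decade the range touches. The function name and the choice of the estimate's own decade as `primary_value` are assumptions.

```python
def birth_decade_candidates(estimated_year, variance=3):
    """Return candidate decades and EDTF notation for an estimate (sketch)."""
    first = (estimated_year - variance) // 10
    last = (estimated_year + variance) // 10
    values = [f"{d}X" for d in range(first, last + 1)]
    # A single decade is written plainly; several use EDTF "one of" brackets.
    edtf = values[0] if len(values) == 1 else "[" + ",".join(values) + "]"
    return {
        "values": values,
        "edtf": edtf,
        # Assumption: the decade containing the point estimate is primary.
        "primary_value": f"{estimated_year // 10}X",
    }


for year in (1968, 1972, 1975):
    print(year, birth_decade_candidates(year)["edtf"])
```

With the default margin this reproduces the three examples listed above: 1968 and 1972 span two decades, while 1975 stays within the 1970s.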
|
||||
|
||||
### Example: List-Valued Birth Decade
|
||||
|
||||
```json
|
||||
{
|
||||
"inferred_birth_decade": {
|
||||
"values": ["196X", "197X"],
|
||||
"edtf": "[196X,197X]",
|
||||
"edtf_meaning": "one of: 1960s or 1970s",
|
||||
"precision": "decade_set",
|
||||
"confidence": "low",
|
||||
"primary_value": "196X",
|
||||
"primary_rationale": "1968 is closer to 1960s center than 1970s",
|
||||
"inference_provenance": {
|
||||
"method": "earliest_observation_heuristic",
|
||||
"inference_chain": [
|
||||
{
|
||||
"step": 1,
|
||||
"observation": "University start 1986",
|
||||
"source_field": "profile_data.education[0].date_range"
|
||||
},
|
||||
{
|
||||
"step": 2,
|
||||
"assumption": "University entry at age 18 (±3 years)",
|
||||
"rationale": "Dutch university entry typically 17-21"
|
||||
},
|
||||
{
|
||||
"step": 3,
|
||||
"calculation": "1986 - 18 = 1968 (range: 1965-1971)",
|
||||
"result": "Birth year estimate: 1968 with variance 1965-1971"
|
||||
},
|
||||
{
|
||||
"step": 4,
|
||||
"generalization": "Birth year range spans decade boundary",
|
||||
"input_range": [1965, 1971],
|
||||
"output": ["196X", "197X"],
|
||||
"rationale": "Cannot determine which decade without additional evidence"
|
||||
}
|
||||
],
|
||||
"inferred_at": "2025-01-09T18:00:00Z",
|
||||
"inferred_by": "enrich_ppids.py"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### PPID Generation with List Values
|
||||
|
||||
When `inferred_birth_decade` is a list, use `primary_value` for PPID:
|
||||
|
||||
```json
|
||||
{
|
||||
"ppid": "ID_NL-UT-UTR_196X_NL-UT-UTR_XXXX_AART-HARTEN",
|
||||
"ppid_components": {
|
||||
"first_date": "196X",
|
||||
"first_date_source": "inferred_birth_decade.primary_value",
|
||||
"first_date_alternatives": ["197X"]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Example: List-Valued Location
|
||||
|
||||
```json
|
||||
{
|
||||
"inferred_birth_settlement": {
|
||||
"values": [
|
||||
{"settlement": "Utrecht", "formatted": "NL-UT-UTR"},
|
||||
{"settlement": "Amsterdam", "formatted": "NL-NH-AMS"}
|
||||
],
|
||||
"primary_value": "NL-UT-UTR",
|
||||
"primary_rationale": "Earlier education (1986) in Utrecht; Amsterdam job later (1990)",
|
||||
"confidence": "very_low",
|
||||
"inference_provenance": {
|
||||
"method": "education_locations",
|
||||
"inference_chain": [
|
||||
{
|
||||
"step": 1,
|
||||
"observation": "Multiple education institutions found",
|
||||
"source_field": "profile_data.education",
|
||||
"candidates": ["Universiteit Utrecht (1986)", "UvA (1990)"]
|
||||
},
|
||||
{
|
||||
"step": 2,
|
||||
"assumption": "Earlier education more likely near birth location",
|
||||
"rationale": "Students often attend local university first"
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Confidence Levels
|
||||
|
||||
| Level | Criteria | Example |
|
||||
|-------|----------|---------|
|
||||
| **high** | Direct extraction from authoritative source | Profile states "Born in Amsterdam" |
|
||||
| **medium** | Single-step inference with reliable source | Current job location from employment record |
|
||||
| **low** | Multi-step heuristic with assumptions | Birth year from university start date |
|
||||
| **very_low** | Speculative, multiple assumptions, or list-valued | Birth location from first observed location, or decade spanning boundary |
|
||||
|
||||
## Anti-Patterns (FORBIDDEN)
|
||||
|
||||
### ❌ Silent Replacement
|
||||
```json
|
||||
{
|
||||
"birth_date": {
|
||||
"edtf": "196X",
|
||||
"precision": "decade"
|
||||
}
|
||||
}
|
||||
```
|
||||
**Problem**: No indication this is inferred, no provenance, no confidence level.
|
||||
|
||||
### ❌ Hidden in Metadata
|
||||
```json
|
||||
{
|
||||
"birth_date": {
|
||||
"edtf": "196X"
|
||||
},
|
||||
"enrichment_metadata": {
|
||||
"birth_date_inferred": true
|
||||
}
|
||||
}
|
||||
```
|
||||
**Problem**: Inference metadata separated from the value; easy to miss.
|
||||
|
||||
### ❌ Missing Inference Chain
|
||||
```json
|
||||
{
|
||||
"inferred_birth_decade": {
|
||||
"value": "196X",
|
||||
"method": "heuristic"
|
||||
}
|
||||
}
|
||||
```
|
||||
**Problem**: No explanation of HOW the value was derived; not auditable.
|
||||
|
||||
## Correct Pattern ✅
|
||||
|
||||
```json
|
||||
{
|
||||
"birth_date": {
|
||||
"edtf": "XXXX",
|
||||
"precision": "unknown",
|
||||
"note": "See inferred_birth_decade"
|
||||
},
|
||||
"inferred_birth_decade": {
|
||||
"value": "196X",
|
||||
"edtf": "196X",
|
||||
"confidence": "low",
|
||||
"inference_provenance": {
|
||||
"method": "earliest_education_heuristic",
|
||||
"inference_chain": [
|
||||
{"step": 1, "observation": "...", "source_field": "...", "source_value": "..."},
|
||||
{"step": 2, "assumption": "...", "rationale": "..."},
|
||||
{"step": 3, "calculation": "...", "result": "..."}
|
||||
],
|
||||
"inferred_at": "2025-01-09T18:00:00Z",
|
||||
"inferred_by": "enrich_ppids.py"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## PPID Component Handling
|
||||
|
||||
When inferred values are used in PPID components:
|
||||
|
||||
```json
|
||||
{
|
||||
"ppid": "ID_NL-UT-UTR_196X_NL-NH-AMS_XXXX_AART-HARTEN",
|
||||
"ppid_components": {
|
||||
"type": "ID",
|
||||
"first_location": "NL-UT-UTR",
|
||||
"first_location_source": "inferred_birth_settlement",
|
||||
"first_date": "196X",
|
||||
"first_date_source": "inferred_birth_decade",
|
||||
"last_location": "NL-NH-AMS",
|
||||
"last_location_source": "inferred_current_settlement",
|
||||
"last_date": "XXXX",
|
||||
"name_tokens": ["AART", "HARTEN"]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The `*_source` fields document which inferred field was used for PPID generation.
|
||||
|
||||
## Upgrade Path: Inferred → Verified
|
||||
|
||||
When verified data becomes available:
|
||||
|
||||
1. **Keep inferred data** in `inferred_*` fields for audit trail
|
||||
2. **Add verified data** to canonical fields
|
||||
3. **Mark inferred as superseded**:
|
||||
|
||||
```json
|
||||
{
|
||||
"birth_date": {
|
||||
"edtf": "1967-03-15",
|
||||
"precision": "day",
|
||||
"verified": true,
|
||||
"source": "official_record"
|
||||
},
|
||||
"inferred_birth_decade": {
|
||||
"value": "196X",
|
||||
"superseded": true,
|
||||
"superseded_by": "birth_date",
|
||||
"superseded_at": "2025-01-15T10:00:00Z",
|
||||
"accuracy_assessment": "Inferred decade was correct (1960s), actual year 1967"
|
||||
}
|
||||
}
|
||||
```
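
A sketch of step 3, assuming the inferred entry is a plain dictionary on the profile. The function name `supersede_inferred` is hypothetical; the field names follow the example above.

```python
def supersede_inferred(profile, inferred_field, canonical_field,
                       superseded_at, accuracy_assessment=None):
    """Mark an inferred field as superseded without deleting it (sketch)."""
    entry = profile[inferred_field]
    entry["superseded"] = True
    entry["superseded_by"] = canonical_field
    entry["superseded_at"] = superseded_at
    if accuracy_assessment is not None:
        entry["accuracy_assessment"] = accuracy_assessment
    return profile


profile = {
    "birth_date": {"edtf": "1967-03-15", "precision": "day",
                   "verified": True, "source": "official_record"},
    "inferred_birth_decade": {"value": "196X"},
}
supersede_inferred(
    profile, "inferred_birth_decade", "birth_date",
    superseded_at="2025-01-15T10:00:00Z",
    accuracy_assessment="Inferred decade was correct (1960s), actual year 1967",
)
```

The original inferred value stays in place, so the audit trail shows both what was estimated and what later replaced it.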
|
||||
|
||||
## Implementation Checklist
|
||||
|
||||
For any enrichment script:
|
||||
|
||||
- [ ] Create explicit `inferred_*` fields for ALL inferred data
|
||||
- [ ] Include `inference_provenance` with complete `inference_chain`
|
||||
- [ ] Record each step: observation → assumption → calculation → result
|
||||
- [ ] Set appropriate `confidence` level
|
||||
- [ ] Add `*_source` references in PPID components
|
||||
- [ ] Preserve original unknown values (`XXXX`, `XX-XX-XXX`)
|
||||
- [ ] Add `note` in canonical fields pointing to inferred alternatives
|
||||
|
||||
## Related Rules
|
||||
|
||||
- **Rule 44**: PPID Birth Date Enrichment and EDTF Unknown Date Notation
|
||||
- **Rule 35**: Provenance Statements MUST Have Dual Timestamps
|
||||
- **Rule 6**: WebObservation Claims MUST Have XPath Provenance
|
||||
|
|
@ -1,251 +0,0 @@
|
|||
# Rule 40: KIEN Registry is Authoritative for Intangible Heritage Custodians
|
||||
|
||||
## Summary
|
||||
|
||||
For Intangible Heritage Custodians (Type I), the KIEN registry at `https://www.immaterieelerfgoed.nl/` is the **TIER_1_AUTHORITATIVE** source for contact data and addresses. Google Maps enrichment is **TIER_3_CROWD_SOURCED** and should NEVER override KIEN data.
|
||||
|
||||
## Empirical Validation (January 2025)
|
||||
|
||||
A comprehensive audit of 188 Type I custodian files revealed:
|
||||
|
||||
| Category | Count | Percentage |
|
||||
|----------|-------|------------|
|
||||
| ✅ Google Maps matches OK | 101 | 53.7% |
|
||||
| 🔧 **FALSE_MATCH detected** | **62** | **33.0%** |
|
||||
| ⚠️ No official website (valid) | 20 | 10.6% |
|
||||
| 📭 No Google Maps data | 5 | 2.7% |
|
||||
|
||||
**Key Finding: 33% of Google Maps enrichment data for Type I custodians was incorrect.**
|
||||
|
||||
### False Match Categories Identified
|
||||
|
||||
1. **Domain mismatches** (39 files): Google Maps website ≠ KIEN official website
|
||||
2. **Name mismatches** (8 files): Completely different organizations (e.g., "Ria Bos" heritage practitioner → "Ria Money Transfer Agent")
|
||||
3. **Wrong location** (6 files): Similar name but a different city or country (Amsterdam→Den Haag, Netherlands→Suriname)
|
||||
4. **Wrong organization type** (5 files): Federation vs specific member, heritage org vs webshop
|
||||
5. **Different entity type** (3 files): Organization vs location/street name
|
||||
6. **Different event** (3 files): Horse racing vs festival, different village's event
|
||||
|
||||
### Why Google Maps Fails for Type I
|
||||
|
||||
Google Maps is optimized for commercial businesses with physical storefronts. Type I intangible heritage custodians are fundamentally different:
|
||||
|
||||
- **Virtual organizations** without commercial presence
|
||||
- **Person-based heritage** (individual practitioners preserving traditional crafts)
|
||||
- **Volunteer networks** meeting in private residences
|
||||
- **Event-based organizations** that exist only during festivals
|
||||
- **Federations** that coordinate member organizations without own premises
|
||||
|
||||
## Rationale
|
||||
|
||||
Google Maps frequently returns **false matches** for intangible heritage organizations because:
|
||||
|
||||
1. **Virtual Organizations**: Many intangible heritage custodians operate as networks/platforms without commercial storefronts
|
||||
2. **Name Collisions**: Common words in organization names (e.g., "Platform") match unrelated businesses
|
||||
3. **No Physical Presence**: Organizations focused on intangible heritage (handwriting, oral traditions, crafts) often have no Google Maps listing
|
||||
4. **Volunteer-Run**: Contact addresses are often private residences, not businesses
|
||||
|
||||
KIEN (Kenniscentrum Immaterieel Erfgoed Nederland) is the official Dutch registry for intangible cultural heritage and maintains verified contact information directly from the organizations.
|
||||
|
||||
## Data Tier Hierarchy for Type I Custodians
|
||||
|
||||
| Priority | Source | Data Tier | Trust Level |
|
||||
|----------|--------|-----------|-------------|
|
||||
| 1st | KIEN Registry (`immaterieelerfgoed.nl`) | TIER_1_AUTHORITATIVE | Highest |
|
||||
| 2nd | Organization's Official Website | TIER_2_VERIFIED | High |
|
||||
| 3rd | Wikidata | TIER_3_CROWD_SOURCED | Medium |
|
||||
| 4th | Google Maps | TIER_3_CROWD_SOURCED | Low (verify!) |
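
The priority order in this table could be applied in code roughly as follows. This is a minimal sketch: the source keys (`kien`, `official_website`, `wikidata`, `google_maps`) and the function name are hypothetical, not names taken from the project.

```python
# Priority order taken from the tier table; lower index = more trusted.
# The key names are illustrative assumptions.
SOURCE_PRIORITY = ["kien", "official_website", "wikidata", "google_maps"]


def pick_authoritative(candidates):
    """Return (source, value) from the most trusted source that has a value.

    `candidates` maps a source key to a value, or to None when that source
    has nothing to offer.
    """
    for source in SOURCE_PRIORITY:
        value = candidates.get(source)
        if value:
            return source, value
    return None, None
```

A lower-tier value is returned only when every higher-tier source is empty, so Google Maps never overrides KIEN.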
|
||||
|
||||
## Required Workflow for Type I Enrichment
|
||||
|
||||
### Step 1: Scrape KIEN Page First
|
||||
|
||||
For every intangible heritage custodian, the KIEN profile page MUST be scraped to extract:
|
||||
|
||||
```yaml
kien_enrichment:
  kien_name: "Platform Handschriftontwikkeling"
  kien_url: "https://www.immaterieelerfgoed.nl/nl/page/2476/platform-handschriftontwikkeling"
  heritage_page_url: "https://www.immaterieelerfgoed.nl/nl/handschrift"
  heritage_forms:
    - "Ambachten, handwerk en techniek"
    - "Sociale praktijken"
  address:
    street: "De Hazelaar 41"
    postal_code: "6903 BB"
    city: "Zevenaar"
    province: "Gelderland"
    country: "NL"
  registered_since: "2019-11"
  enrichment_timestamp: "2025-01-08T00:00:00Z"
  source: "https://www.immaterieelerfgoed.nl"
```
|
||||
|
||||
### Step 2: Validate Google Maps Match (If Any)
|
||||
|
||||
If Google Maps enrichment exists, compare against KIEN data:
|
||||
|
||||
```python
from rapidfuzz import fuzz  # assumption: the source does not name the fuzzy-matching library


def validate_google_maps_match(kien_data, gmaps_data):
    """Check if Google Maps data matches KIEN authoritative source."""

    # Check website domain match.
    # extract_domain() is assumed to be a project helper that returns the
    # bare domain of a URL, or None when no URL is given.
    kien_domain = extract_domain(kien_data.get('website'))
    gmaps_domain = extract_domain(gmaps_data.get('website'))

    if kien_domain and gmaps_domain and kien_domain != gmaps_domain:
        return {
            'status': 'FALSE_MATCH',
            'reason': f'Website mismatch: KIEN={kien_domain}, GMaps={gmaps_domain}'
        }

    # Check name similarity (fuzz.ratio returns a score from 0 to 100)
    kien_name = kien_data.get('kien_name', '').lower()
    gmaps_name = gmaps_data.get('name', '').lower()

    if fuzz.ratio(kien_name, gmaps_name) < 70:
        return {
            'status': 'FALSE_MATCH',
            'reason': f'Name mismatch: KIEN="{kien_name}", GMaps="{gmaps_name}"'
        }

    return {'status': 'VERIFIED'}
```
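
The validation function calls `extract_domain`, which is not defined in this rule. One possible implementation, using only the standard library, is sketched here. Stripping a leading `www.` is an assumption about what counts as the same domain.

```python
from urllib.parse import urlparse


def extract_domain(url):
    """Return the lowercase host of `url` without a leading 'www.', or None."""
    if not url:
        return None
    # urlparse only fills in the host when a scheme or '//' is present.
    parsed = urlparse(url if "//" in url else "//" + url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host or None
```

With this helper, `http://www.platform9.nl/` and `http://www.handschriftontwikkeling.nl/` reduce to different domains, which is what triggers the `FALSE_MATCH` branch in the Platform Handschriftontwikkeling example.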
|
||||
|
||||
### Step 3: Mark False Matches
|
||||
|
||||
When Google Maps returns a different organization:
|
||||
|
||||
```yaml
google_maps_enrichment:
  status: FALSE_MATCH
  false_match_reason: >-
    Google Maps returned "Platform 9 BV" (a health/coaching business at
    Nieuwleusen) instead of "Platform Handschriftontwikkeling" (a virtual
    handwriting development platform). These are completely different
    organizations. KIEN registry is authoritative for this Type I custodian.
  original_false_match:
    place_id: ChIJNZ6o7H_fx0cR-TURAN3Bj54
    name: Platform 9 BV
    formatted_address: Burg, Burgemeester Backxlaan 321, 7711 AD Nieuwleusen
    website: http://www.platform9.nl/
  correction_timestamp: "2025-01-08T00:00:00Z"
  correction_agent: opencode-claude-sonnet-4
```
|
||||
|
||||
## KIEN Contact Data Extraction
|
||||
|
||||
The KIEN heritage pages follow a consistent structure. Extract from the "Contact" section:
|
||||
|
||||
```
|
||||
## Contact
|
||||
[Organization Name](link-to-profile-page)
|
||||
Street Address
|
||||
Postal Code
|
||||
City
|
||||
Province
|
||||
[Website](url)
|
||||
Bijgeschreven in inventaris vanaf: [date]
|
||||
```
|
||||
|
||||
### Example Extraction (from immaterieelerfgoed.nl/nl/handschrift):
|
||||
|
||||
```yaml
contact:
  organization: "Platform Handschriftontwikkeling"
  profile_url: "https://www.immaterieelerfgoed.nl/nl/page/2476/platform-handschriftontwikkeling"
  address:
    street: "De Hazelaar 41"
    postal_code: "6903 BB"
    city: "Zevenaar"
    province: "Gelderland"
  website: "http://www.handschriftontwikkeling.nl/"
  registered_since: "november 2019"
```
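
The extraction example keeps the Dutch text `"november 2019"`, while the Step 1 example stores `registered_since: "2019-11"`. If the year-month form is wanted, the conversion could look like this sketch. The function name is hypothetical and it assumes the KIEN page always writes a Dutch month name followed by a year.

```python
DUTCH_MONTHS = {
    "januari": 1, "februari": 2, "maart": 3, "april": 4, "mei": 5, "juni": 6,
    "juli": 7, "augustus": 8, "september": 9, "oktober": 10, "november": 11,
    "december": 12,
}


def normalize_registered_since(text):
    """Convert e.g. 'november 2019' to '2019-11'; return None if unparseable."""
    parts = text.strip().lower().split()
    if len(parts) != 2 or parts[0] not in DUTCH_MONTHS or not parts[1].isdigit():
        return None
    return f"{int(parts[1]):04d}-{DUTCH_MONTHS[parts[0]]:02d}"
```

Returning `None` for unexpected input leaves the raw string available for manual review instead of guessing.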
|
||||
|
||||
## Location Resolution for Type I
|
||||
|
||||
When KIEN provides an address:
|
||||
|
||||
1. **Use KIEN address** for `location.formatted_address`
|
||||
2. **Geocode KIEN address** to get coordinates (NOT Google Maps coordinates)
|
||||
3. **Update location_resolution** with method `KIEN_ADDRESS_GEOCODE`
|
||||
|
||||
```yaml
location:
  street_address: "De Hazelaar 41"
  postal_code: "6903 BB"
  city: Zevenaar
  region_code: GE
  country: NL
  coordinate_provenance:
    source_type: KIEN_ADDRESS_GEOCODE
    source_url: "https://www.immaterieelerfgoed.nl/nl/handschrift"
    geocoding_service: nominatim
    geocoding_timestamp: "2025-01-08T00:00:00Z"
```
|
||||
|
||||
## Batch Re-Enrichment Script
|
||||
|
||||
To fix all Type I custodians with potentially incorrect Google Maps data:
|
||||
|
||||
```bash
|
||||
# Re-scrape KIEN contact data for all Type I custodians
|
||||
python scripts/rescrape_kien_contacts.py --type I --output data/custodian/
|
||||
|
||||
# This script should:
|
||||
# 1. Read all NL-*-I-*.yaml files
|
||||
# 2. Fetch KIEN page for each (from kien_enrichment.kien_url)
|
||||
# 3. Extract contact/address from KIEN
|
||||
# 4. Compare with google_maps_enrichment
|
||||
# 5. Mark mismatches as FALSE_MATCH
|
||||
# 6. Update location with KIEN address
|
||||
```
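
The source of `rescrape_kien_contacts.py` is not shown here. As an illustration of step 1 only, selecting the Type I files by the `NL-*-I-*.yaml` pattern could be sketched as below; the function name is hypothetical.

```python
from pathlib import Path


def find_type_i_files(custodian_dir):
    """Return sorted Type I custodian YAML paths directly under `custodian_dir`."""
    # The glob follows the NL-*-I-*.yaml pattern named in the steps above.
    return sorted(Path(custodian_dir).glob("NL-*-I-*.yaml"))
```

Sorting makes the batch order deterministic, which helps when a run has to be resumed.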
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
### WRONG - Using Google Maps as primary source for Type I:
|
||||
|
||||
```yaml
# WRONG - Google Maps overriding KIEN data
location:
  formatted_address: "Burg, Burgemeester Backxlaan 321, 7711 AD Nieuwleusen"
  coordinate_provenance:
    source_type: GOOGLE_MAPS  # WRONG for Type I!
```
|
||||
|
||||
### CORRECT - KIEN as primary source:
|
||||
|
||||
```yaml
# CORRECT - KIEN is authoritative
location:
  street_address: "De Hazelaar 41"
  postal_code: "6903 BB"
  city: Zevenaar
  coordinate_provenance:
    source_type: KIEN_ADDRESS_GEOCODE  # Correct!
```
|
||||
|
||||
## Affected Files
|
||||
|
||||
This rule affects all Type I custodian files (188 at the time of the January 2025 audit):
|
||||
- `data/custodian/NL-*-I-*.yaml`
|
||||
|
||||
All should be reviewed to ensure:
|
||||
1. `kien_enrichment` contains address from KIEN page
|
||||
2. `google_maps_enrichment` is validated against KIEN
|
||||
3. `location` uses KIEN address (not Google Maps)
|
||||
4. False matches are properly documented
|
||||
|
||||
## Related Rules
|
||||
|
||||
- **Rule 5**: NEVER Delete Enriched Data - Keep false match data in `original_false_match`
|
||||
- **Rule 6**: WebObservation Claims - KIEN data should have provenance
|
||||
- **Rule 22**: Custodian YAML Files Are Single Source of Truth
|
||||
- **Rule 35**: Provenance Timestamps - Include KIEN fetch timestamps
|
||||
|
||||
## See Also
|
||||
|
||||
- KIEN Registry: https://www.immaterieelerfgoed.nl/
|
||||
- UNESCO Intangible Cultural Heritage: https://ich.unesco.org/
|
||||
- Dutch Intangible Heritage Network documentation
|
||||
|
|
@ -1,351 +0,0 @@
|
|||
# Rule 44: PPID Birth Date Enrichment and Unknown Date Notation
|
||||
|
||||
**Version**: 1.0.0
|
||||
**Created**: 2025-01-09
|
||||
**Status**: ACTIVE
|
||||
**Related**: [PPID-GHCID Alignment](../../docs/plan/person_pid/10_ppid_ghcid_alignment.md) | [EDTF Specification](https://www.loc.gov/standards/datetime/)
|
||||
|
||||
---
|
||||
|
||||
## 1. Summary
|
||||
|
||||
When birth/death dates are missing from person entity sources, agents MUST:
|
||||
|
||||
1. **Search for dates** using Exa Search and Linkup tools
|
||||
2. **Record all enrichment data** as web claims with provenance
|
||||
3. **If not found**, use **EDTF-compliant notation** for estimated/unknown dates
|
||||
4. **Never fabricate** specific dates without source evidence
|
||||
|
||||
---
|
||||
|
||||
## 2. Enrichment Workflow
|
||||
|
||||
### 2.1 Required Search Before Using Unknown Notation
|
||||
|
||||
Before marking a date as unknown, agents MUST attempt enrichment:
|
||||
|
||||
```
|
||||
Person Entity (missing birth_date)
|
||||
↓
|
||||
1. Search Exa: "{full_name} born birth date"
|
||||
↓
|
||||
2. Search Exa: "{full_name} {known_employer}"
|
||||
↓
|
||||
3. Search Linkup: "{full_name} biography"
|
||||
↓
|
||||
4. If found → Record as web_claim with provenance
|
||||
↓
|
||||
5. If NOT found → Use EDTF unknown notation
|
||||
↓
|
||||
6. Record enrichment_attempt in metadata
|
||||
```
|
||||
|
||||
### 2.2 Enrichment Search Requirements
|
||||
|
||||
| Search Tool | Query Pattern | When to Use |
|
||||
|-------------|---------------|-------------|
|
||||
| `exa_web_search_exa` | `"{name}" born birthday birth date year` | Primary search |
|
||||
| `exa_linkedin_search_exa` | `"{name}" at "{employer}"` | For work context |
|
||||
| `linkup_linkup-search` | `"{name}" biography personal` | Deep research |
|
||||
|
||||
### 2.3 Recording Successful Enrichment
|
||||
|
||||
When birth date is found, record as web claim:
|
||||
|
||||
```yaml
web_claims:
  - claim_type: birth_date
    claim_value: "1985-03-15"
    source_url: "https://example.org/person/bio"
    retrieved_on: "2025-01-09T14:30:00Z"
    retrieval_agent: "opencode-claude-sonnet-4"
    confidence_score: 0.85
    notes: "Found in biography section"
```
|
||||
|
||||
### 2.4 Recording Failed Enrichment Attempts
|
||||
|
||||
Always record that enrichment was attempted:
|
||||
|
||||
```yaml
enrichment_metadata:
  birth_date_search:
    attempted: true
    search_date: "2025-01-09T14:30:00Z"
    search_agent: "opencode-claude-sonnet-4"
    search_tools_used:
      - exa_web_search_exa
      - linkup_linkup-search
    queries_tried:
      - '"Jan van Berg" born birthday'
      - '"Jan van Berg" biography'
    result: "NOT_FOUND"
    notes: "No publicly available birth date found after comprehensive search"
```
|
||||
|
||||
---
|
||||
|
||||
## 3. EDTF-Compliant Unknown Date Notation
|
||||
|
||||
### 3.1 Standard: Extended Date/Time Format (EDTF)
|
||||
|
||||
This project follows the **Library of Congress EDTF Specification** (ISO 8601-2:2019) for representing uncertain, approximate, and unspecified dates.
|
||||
|
||||
**Key EDTF Characters**:
|
||||
|
||||
| Character | Meaning | EDTF Level | Example |
|
||||
|-----------|---------|------------|---------|
|
||||
| `X` | Unspecified digit | Level 1+ | `19XX` = some year 1900-1999 |
|
||||
| `~` | Approximate (circa) | Level 1+ | `1985~` = circa 1985 |
|
||||
| `?` | Uncertain | Level 1+ | `1985?` = possibly 1985 |
|
||||
| `%` | Uncertain AND approximate | Level 1+ | `1985%` = possibly circa 1985 |
|
||||
| `S` | Significant digits | Level 2 | `1950S2` = 1900-1999, estimated 1950 |
|
||||
| `[..]` | One of set | Level 2 | `[1970,1980]` = either 1970 or 1980 |
|
||||
| `{..}` | All of set | Level 2 | `{1970..1980}` = all years 1970-1980 |
|
||||
|
||||
### 3.2 Unspecified Date Components (X Notation)
|
||||
|
||||
Use `X` to replace unknown digits:
|
||||
|
||||
| Known Information | EDTF Format | Meaning |
|
||||
|-------------------|-------------|---------|
|
||||
| Only decade known (1970s) | `197X` | Some year 1970-1979 |
|
||||
| Only century known (1900s) | `19XX` | Some year 1900-1999 |
|
||||
| Year unknown entirely | `XXXX` | Year unknown |
|
||||
| Year known, month unknown | `1985-XX` | Some month in 1985 |
|
||||
| Year+month known, day unknown | `1985-04-XX` | Some day in April 1985 |
|
||||
| Year known, month+day unknown | `1985-XX-XX` | Some day in 1985 |
|
||||
| Only decade known, month+day unknown | `197X-XX-XX` | Some day in 1970-1979 |
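
To make the year rows of this table concrete, the sketch below expands a four-character year in `X` notation into the range of years it covers. The function name is hypothetical, and it handles only unspecified digits at the right-hand end of the year.

```python
import re


def unspecified_year_range(edtf_year):
    """Return (earliest, latest) for a 4-character year with trailing X digits.

    Returns None for 'XXXX', where nothing about the year is known.
    """
    if len(edtf_year) != 4 or not re.fullmatch(r"\d{0,4}X{0,4}", edtf_year):
        raise ValueError(f"not a 4-character X-notation year: {edtf_year!r}")
    if edtf_year == "XXXX":
        return None
    # Replacing X with 0 gives the lower bound, with 9 the upper bound.
    return int(edtf_year.replace("X", "0")), int(edtf_year.replace("X", "9"))
```

For example, `197X` expands to 1970 through 1979 and `19XX` to 1900 through 1999, matching the "Meaning" column.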
|
||||
|
||||
### 3.3 Multiple Possible Decades (Set Notation)
|
||||
|
||||
When the decade is uncertain but constrained to specific options:
|
||||
|
||||
| Scenario | EDTF Format | Meaning |
|
||||
|----------|-------------|---------|
|
||||
| Born in 1970s OR 1980s | `[197X,198X]` | One of: some year in 1970s or 1980s |
|
||||
| Born in specific years | `[1975,1985]` | Either 1975 or 1985 |
|
||||
| Born 1970-1985 range | `[1970..1985]` | One of: some year from 1970 through 1985 |
|
||||
|
||||
### 3.4 Estimated Dates with Significant Digits
|
||||
|
||||
When you can estimate a year with confidence bounds:
|
||||
|
||||
```
|
||||
1975S2 = Estimated 1975, significant to 2 digits (1900-1999)
|
||||
1975S3 = Estimated 1975, significant to 3 digits (1970-1979)
|
||||
```
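
The bounds implied by the significant-digits suffix can be computed mechanically. The helper below is a hypothetical sketch limited to four-digit years.

```python
def significant_digits_range(edtf):
    """Return (estimate, earliest, latest) for a 4-digit year like '1975S3'."""
    year_text, _, digits_text = edtf.partition("S")
    if (len(year_text) != 4 or not year_text.isdigit()
            or digits_text not in {"1", "2", "3", "4"}):
        raise ValueError(f"unsupported significant-digits year: {edtf!r}")
    # Number of years covered: 100 for S2, 10 for S3, 1 for S4.
    span = 10 ** (4 - int(digits_text))
    earliest = int(year_text) // span * span
    return int(year_text), earliest, earliest + span - 1
```

`1975S2` therefore covers 1900 through 1999 and `1975S3` covers 1970 through 1979, with 1975 as the estimate in both cases.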
|
||||
|
||||
This is useful when you can estimate based on career timeline (e.g., "started working 1998, likely born 1970s").
|
||||
|
||||
### 3.5 Living Persons - Birth Date Estimation
|
||||
|
||||
For living persons in LinkedIn data, estimate birth decade from:
|
||||
|
||||
1. **Graduation year** (if available): Subtract ~22 years for bachelor's degree
|
||||
2. **Career start** (first job): Subtract ~22-25 years
|
||||
3. **Current role seniority**: "Senior" roles suggest 35+ years old
|
||||
|
||||
```yaml
# Example: Person graduated 2010
birth_date_estimate:
  edtf: "1988S2"  # Estimated 1988, significant to 2 digits (1900-1999)
  estimation_method: "graduation_year_inference"
  estimation_basis: "Graduated bachelor's 2010, estimated birth ~1988"
  confidence: 0.60
```
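
The graduation-year heuristic can be written as a small function. This is a sketch only: the function name is hypothetical, the 22-year offset is the rough figure given in the list above, and the confidence value is copied from the example rather than derived.

```python
def estimate_birth_from_graduation(graduation_year, typical_age=22):
    """Build a hedged birth-date estimate from a bachelor's graduation year."""
    estimate = graduation_year - typical_age
    return {
        "edtf": f"{estimate}S2",  # estimate, significant to 2 digits (century)
        "estimation_method": "graduation_year_inference",
        "estimation_basis": (
            f"Graduated bachelor's {graduation_year}, estimated birth ~{estimate}"
        ),
        "confidence": 0.60,  # value copied from the example above
    }
```

The result is an estimate with explicit provenance, never a fabricated specific date.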
|
||||
|
||||
---
|
||||
|
||||
## 4. PPID Format with Unknown Dates
|
||||
|
||||
### 4.1 PPID Date Component Rules
|
||||
|
||||
The PPID format includes birth and death dates:
|
||||
|
||||
```
{TYPE}_{FL}_{FD}_{LL}_{LD}_{NT}
             │         │
             │         └── Last Date (death) - EDTF format
             └── First Date (birth) - EDTF format
```
|
||||
|
||||
### 4.2 Examples with Unknown Components
|
||||
|
||||
| Scenario | PPID Example |
|
||||
|----------|--------------|
|
||||
| All known | `PID_NL-NH-AMS_1985-03-15_NL-NH-HAA_2020-08-22_JAN-BERG` |
|
||||
| Birth year only | `ID_NL-NH-AMS_1985_XX-XX-XXX_XXXX_JAN-BERG` |
|
||||
| Birth decade only | `ID_XX-XX-XXX_197X_XX-XX-XXX_XXXX_JAN-BERG` |
|
||||
| Nothing known | `ID_XX-XX-XXX_XXXX_XX-XX-XXX_XXXX_JAN-BERG` |
|
||||
| Living person | `ID_NL-NH-AMS_1985_XX-XX-XXX_XXXX_JAN-BERG` |
|
||||
|
||||
### 4.3 Filename Safety
|
||||
|
||||
EDTF characters are **filename-safe**:
|
||||
|
||||
| Character | Filename Safe? | Notes |
|
||||
|-----------|----------------|-------|
|
||||
| `X` | YES | Uppercase letter |
|
||||
| `~` | YES | Allowed on macOS/Linux/Windows |
|
||||
| `?` | NO | Not allowed on Windows |
|
||||
| `%` | CAUTION | URL encoding issues |
|
||||
| `[` `]` | CAUTION | Shell escaping issues |
|
||||
| `,` | YES | Allowed |
|
||||
| `/` | NO | Directory separator |
|
||||
| `\|` | NO | Not allowed on Windows; shell pipe |
|
||||
|
||||
**Recommendation**: For filenames, use only:
|
||||
- `X` for unknown digits
|
||||
- `~` for approximate (suffix only)
|
||||
- Avoid `?`, `%`, `[]`, `/`, `|` in filenames
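
A check for this recommendation could be as simple as the sketch below. The function name is hypothetical and the character set is the one listed above.

```python
# Characters the recommendation says to avoid in filenames.
UNSAFE_FILENAME_CHARS = set("?%[]/|")


def is_filename_safe(edtf):
    """Return True if `edtf` contains none of the characters to avoid."""
    return not (set(edtf) & UNSAFE_FILENAME_CHARS)
```

Values such as `197X` and `1985~` pass, while set notation like `[197X,198X]` is rejected and has to be simplified before it goes into a filename.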
|
||||
|
||||
When set notation `[..]` is needed, store in metadata but use simplified form in filename:
|
||||
- Filename: `ID_XX-XX-XXX_197X_...` (simplified)
|
||||
- Metadata: `birth_date_edtf: "[1975,1985]"` (full EDTF)
|
||||
|
||||
---
|
||||
|
||||
## 5. Decision Tree
|
||||
|
||||
```
┌─────────────────────────────────────────┐
│ Person entity missing birth_date        │
└─────────────────┬───────────────────────┘
                  ▼
┌─────────────────────────────────────────┐
│ Search Exa + Linkup for birth date      │
└─────────────────┬───────────────────────┘
                  ▼
          ┌───────┴───────┐
          │  Date found?  │
          └───────┬───────┘
              YES │ NO
         ▼        │        ▼
┌─────────────────┐ ┌─────────────────────────────┐
│ Record as       │ │ Can estimate from career?   │
│ web_claim with  │ └───────────┬─────────────────┘
│ provenance      │         YES │ NO
└─────────────────┘     ▼       │         ▼
                ┌───────────────┐ ┌───────────────┐
                │ Use EDTF      │ │ Use XXXX      │
                │ estimate:     │ │ (unknown)     │
                │ 1988S2 or     │ │               │
                │ 198X          │ │               │
                └───────────────┘ └───────────────┘
```
|
||||
|
||||
---
|
||||
|
||||
## 6. Examples
|
||||
|
||||
### 6.1 Fully Unknown (No Enrichment Found)
|
||||
|
||||
```yaml
# Person: Nora Ruijs (student, no public birth info)
ppid: ID_XX-XX-XXX_XXXX_XX-XX-XXX_XXXX_NORA-RUIJS

birth_date:
  edtf: "XXXX"
  precision: "unknown"

enrichment_metadata:
  birth_date_search:
    attempted: true
    search_date: "2025-01-09T14:30:00Z"
    result: "NOT_FOUND"
```
|
||||
|
||||
### 6.2 Decade Estimated from Career
|
||||
|
||||
```yaml
# Person: Senior curator, started career 1995
ppid: ID_NL-NH-AMS_197X_XX-XX-XXX_XXXX_JAN-BERG

birth_date:
  edtf: "197X"
  edtf_full: "1972S3"  # Estimated 1972, significant to 3 digits
  precision: "decade"
  estimation_method: "career_start_inference"
  estimation_basis: "Career started 1995 as junior curator, estimated age 23"
```
|
||||
|
||||
### 6.3 Multiple Possible Decades
|
||||
|
||||
```yaml
# Person: Could be born 1970s or 1980s based on conflicting sources
ppid: ID_XX-XX-XXX_197X_XX-XX-XXX_XXXX_MARIA-SILVA  # Simplified for filename

birth_date:
  edtf: "[197X,198X]"  # Full EDTF with set notation
  edtf_filename: "197X"  # Simplified for filename (earlier estimate)
  precision: "decade_uncertain"
  notes: "Sources conflict: LinkedIn suggests 1980s, university bio suggests 1970s"
```
|
||||
|
||||
### 6.4 Exact Date Found via Enrichment
|
||||
|
||||
```yaml
# Person: Birth date found on institutional bio page
ppid: ID_NL-NH-AMS_1985-03-15_XX-XX-XXX_XXXX_JAN-BERG

birth_date:
  edtf: "1985-03-15"
  precision: "day"

web_claims:
  - claim_type: birth_date
    claim_value: "1985-03-15"
    source_url: "https://museum.nl/team/jan-berg"
    retrieved_on: "2025-01-09T14:30:00Z"
    retrieval_agent: "opencode-claude-sonnet-4"
```
|
||||
|
||||
---
|
||||
|
||||
## 7. Anti-Patterns
|
||||
|
||||
### 7.1 FORBIDDEN: Fabricating Dates
|
||||
|
||||
```yaml
# WRONG - No source, no search attempted
birth_date:
  edtf: "1985-03-15"  # Where did this come from?!
```
|
||||
|
||||
### 7.2 FORBIDDEN: Using Non-EDTF Notation
|
||||
|
||||
```yaml
|
||||
# WRONG - Not EDTF compliant
|
||||
birth_date: "197~8~" # Invalid notation
|
||||
birth_date: "1970s" # Use 197X instead
|
||||
birth_date: "circa 1985" # Use 1985~ instead
|
||||
birth_date: "unknown" # Use XXXX instead
|
||||
```
|
||||
|
||||
### 7.3 FORBIDDEN: Skipping Enrichment Search
|
||||
|
||||
```yaml
# WRONG - No search attempted
birth_date:
  edtf: "XXXX"
# No enrichment_metadata showing search was attempted!
```
|
||||
|
||||
---
|
||||
|
||||
## 8. Validation Rules
|
||||
|
||||
1. **Search Required**: Cannot use `XXXX` without `enrichment_metadata.birth_date_search.attempted: true`
|
||||
2. **EDTF Compliance**: All dates must parse as valid EDTF (use validator)
|
||||
3. **Filename Safety**: PPID filenames must avoid `?`, `%`, `[]`, `/`, `|`
|
||||
4. **Provenance Required**: All found dates must have `web_claims` with source
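
Rules 1, 3 and 4 lend themselves to an automated check. The sketch below is illustrative only. It assumes records are plain dicts shaped like the examples in section 6, treats any `edtf` value without `X` or `S` as a "found" date, and leaves rule 2 (EDTF parsing) to a dedicated validator.

```python
def validate_person_dates(record):
    """Return a list of rule violations for a person record (plain dict)."""
    errors = []
    birth = record.get("birth_date") or {}
    edtf = birth.get("edtf")
    metadata = record.get("enrichment_metadata") or {}
    search = metadata.get("birth_date_search") or {}

    # Rule 1: XXXX requires a recorded search attempt.
    if edtf == "XXXX" and search.get("attempted") is not True:
        errors.append("XXXX used without a recorded birth_date_search attempt")

    # Rule 3: the PPID (used as a filename) must avoid unsafe characters.
    if set(record.get("ppid", "")) & set("?%[]/|"):
        errors.append("ppid contains filename-unsafe characters")

    # Rule 4: a found date needs a birth_date web claim with a source URL.
    is_found = bool(edtf) and "X" not in edtf and "S" not in edtf
    claims = [c for c in record.get("web_claims", [])
              if c.get("claim_type") == "birth_date" and c.get("source_url")]
    if is_found and not claims:
        errors.append("found birth date has no web_claims entry with a source")
    return errors
```

An empty list means the record passed these three checks; it does not prove the date itself is valid EDTF.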
|
||||
|
||||
---
|
||||
|
||||
## 9. References
|
||||
|
||||
- [EDTF Specification (Library of Congress)](https://www.loc.gov/standards/datetime/)
|
||||
- [ISO 8601-2:2019](https://www.iso.org/standard/70908.html)
|
||||
- [PPID-GHCID Alignment Document](../../docs/plan/person_pid/10_ppid_ghcid_alignment.md)
|
||||
- [Rule 21: Data Fabrication Prohibition](../DATA_FABRICATION_PROHIBITION.md)
|
||||
|
|
@ -5,18 +5,13 @@
|
|||
## The Rule
|
||||
|
||||
1. **Slots (Predicates)** MUST ONLY have `exact_mappings` to ontology **predicates** (properties).
|
||||
* ❌ INVALID: Slot `analyze` maps to `schema:object` (a Class).
|
||||
* ✅ VALID: Slot `analyze` maps to `crm:P129_is_about` (a Property).
|
||||
* ❌ INVALID: Slot `analyzes_or_analyzed` maps to `schema:object` (a Class).
|
||||
* ✅ VALID: Slot `analyzes_or_analyzed` maps to `crm:P129_is_about` (a Property).
|
||||
|
||||
2. **Classes (Entities)** MUST ONLY have `exact_mappings` to ontology **classes** (entities).
|
||||
* ❌ INVALID: Class `Person` maps to `foaf:name` (a Property).
|
||||
* ✅ VALID: Class `Person` maps to `foaf:Person` (a Class).
|
||||
|
||||
3. **When true equivalence exists and is verified, exact mapping is preferred.**
|
||||
* ✅ VALID: Class `Acquisition` maps to `crm:E8_Acquisition`.
|
||||
* ✅ VALID: Slot mapped to an actually equivalent ontology property.
|
||||
* ❗ Do not avoid `exact_mappings` by default; avoid only when scope is broader/narrower/similar-but-not-equal.
|
||||
|
||||
## Rationale
|
||||
|
||||
Mapping a slot (which defines a relationship or attribute) to a class (which defines a type of entity) is a category error. `schema:object` represents the *class* of objects, not the *relationship* of "having an object" or "analyzing an object".
|
||||
|
|
@ -25,10 +20,9 @@ Mapping a slot (which defines a relationship or attribute) to a class (which def
|
|||
|
||||
When adding or reviewing `exact_mappings`:
|
||||
- [ ] Is the LinkML element a Class or a Slot?
|
||||
- [ ] Did you verify the target term type in the ontology definition files (do not rely on naming heuristics)?
|
||||
- [ ] Does the target ontology term represent a Class (usually Capitalized) or a Property (usually lowercase)?
|
||||
- [ ] Do they match? (Class↔Class, Slot↔Property)
|
||||
- [ ] If the target ontology uses opaque IDs (like CIDOC-CRM `E55_Type`), verify the type definition in the ontology file.
|
||||
- [ ] If semantic scope is truly equivalent, use `exact_mappings` (not `close`/`broad` as a conservative fallback).
|
||||
|
||||
## Common Pitfalls to Fix
|
||||
|
||||
|
|
|
|||
|
|
@ -368,6 +368,6 @@ Before marking a slot as processed:
|
|||
|
||||
- Rule 9: Enum-to-Class Promotion (single source of truth principle)
|
||||
- Rule 0b: Type/Types File Naming Convention
|
||||
- Rule: Slot Naming Convention (Current Style)
|
||||
- Rule 39: Slot Naming Convention (RiC-O Style)
|
||||
- `.opencode/ENUM_TO_CLASS_PRINCIPLE.md`
|
||||
- `schemas/20251121/linkml/modules/slots/slot_fixes.yaml` - **AUTHORITATIVE** master list of migrations
|
||||
|
|
|
|||
|
|
@ -126,4 +126,4 @@ If you encounter an overly specific slot:
|
|||
|
||||
## See Also
|
||||
* Rule 55: Broaden Generic Predicate Ranges
|
||||
* Rule: Slot Naming Convention (Current Style)
|
||||
* Rule 39: Slot Naming Convention (RiC-O Style)
|
||||
|
|
|
|||
|
|
@ -1,181 +0,0 @@
|
|||
# LinkML YAML Best Practices Rule
|
||||
|
||||
## Rule: Follow LinkML Conventions for Valid, Interoperable Schema Files
|
||||
|
||||
### 1. equals_expression Anti-Pattern
|
||||
|
||||
`equals_expression` is for dynamic formula evaluation (e.g., `"{age_in_years} * 12"`). Never use it for static value constraints.
|
||||
|
||||
**WRONG:**
|
||||
```yaml
slot_usage:
  has_type:
    equals_expression: '["hc:ArchiveOrganizationType"]'
  hold_record_set:
    equals_expression: '["hc:Fonds", "hc:Series"]'
```
|
||||
|
||||
**CORRECT** (single value):
|
||||
```yaml
slot_usage:
  has_type:
    equals_string: "hc:ArchiveOrganizationType"
```
|
||||
|
||||
**CORRECT** (multiple allowed values - if classes):
|
||||
```yaml
slot_usage:
  hold_record_set:
    any_of:
      - range: UniversityAdministrativeFonds
      - range: StudentRecordSeries
      - range: FacultyPaperCollection
```
|
||||
|
||||
**CORRECT** (multiple allowed values - if literals):
|
||||
```yaml
slot_usage:
  status:
    equals_string_in:
      - "active"
      - "inactive"
      - "pending"
```
|
||||
|
||||
### 2. Declare All Used Prefixes
|
||||
|
||||
Every CURIE prefix used in the file must be declared in the `prefixes:` block.
|
||||
|
||||
**WRONG:**
|
||||
```yaml
prefixes:
  linkml: https://w3id.org/linkml/
  skos: http://www.w3.org/2004/02/skos/core#
slot_usage:
  has_type:
    equals_string: "hc:ArchiveOrganizationType"  # hc: not declared!
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  skos: http://www.w3.org/2004/02/skos/core#
default_prefix: hc
slot_usage:
  has_type:
    equals_string: "hc:ArchiveOrganizationType"
```
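
A check for undeclared prefixes could be sketched as follows. This is not a LinkML API: the function name is hypothetical, and the regular expression is a simplified notion of a CURIE that deliberately skips full URLs such as `https://...`.

```python
import re

# A prefix, then a colon that is not the start of '://'.
CURIE_PREFIX = re.compile(r"^([A-Za-z][\w.-]*):(?!//)")


def undeclared_prefixes(declared, values):
    """Return prefixes used in CURIE-like `values` but missing from `declared`."""
    used = set()
    for value in values:
        match = CURIE_PREFIX.match(value)
        if match:
            used.add(match.group(1))
    return sorted(used - set(declared))
```

Run against the WRONG example, this reports `hc`; once `hc` is added to the `prefixes:` block the result is empty.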
|
||||
|
||||
### 3. Import Referenced Classes
|
||||
|
||||
When using external classes in `is_a`, `range`, or other references, import them.
|
||||
|
||||
**WRONG:**
|
||||
```yaml
imports:
  - linkml:types
classes:
  AcademicArchive:
    is_a: ArchiveOrganizationType  # Not imported!
    slot_usage:
      related_to:
        range: WikidataAlignment  # Not imported!
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
imports:
  - linkml:types
  - ../classes/ArchiveOrganizationType
  - ../classes/WikidataAlignment
classes:
  AcademicArchive:
    is_a: ArchiveOrganizationType
    slot_usage:
      related_to:
        range: WikidataAlignment
```
|
||||
|
||||
### 4. Quote Regex Patterns and Annotation Values
|
||||
|
||||
**Regex patterns:**
|
||||
```yaml
|
||||
# WRONG
|
||||
pattern: ^Q[0-9]+$
|
||||
|
||||
# CORRECT
|
||||
pattern: "^Q[0-9]+$"
|
||||
```
|
||||
|
||||
**Annotation values (must be strings):**
|
||||
```yaml
# WRONG
annotations:
  specificity_score: 0.1

# CORRECT
annotations:
  specificity_score: "0.1"
```
|
||||
|
||||
### 5. Remove Unused Imports
|
||||
|
||||
Only import slots and classes that are actually used in the file.
|
||||
|
||||
**WRONG:**
|
||||
```yaml
|
||||
imports:
|
||||
- ../slots/has_scope # Never used in slots: or slot_usage:
|
||||
- ../slots/has_score
|
||||
- ../slots/has_type
|
||||
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
|
||||
imports:
|
||||
- ../slots/has_score
|
||||
- ../slots/has_type
|
||||
```
|
||||
|
||||
### 6. Slot Usage Requires Slot Presence
|
||||
|
||||
A slot referenced in `slot_usage:` must either be:
|
||||
- Listed in the `slots:` array, OR
|
||||
- Inherited from a parent class via `is_a`
|
||||
|
||||
**WRONG:**
|
||||
```yaml
classes:
  MyClass:
    slots:
      - has_type
    slot_usage:
      has_type: {...}
      identified_by: {...}  # Not in slots: and not inherited!
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
classes:
  MyClass:
    slots:
      - has_type
      - identified_by
    slot_usage:
      has_type: {...}
      identified_by: {...}
```
|
||||
|
||||
## Checklist for Class Files
|
||||
|
||||
- [ ] All prefixes used in CURIEs are declared
|
||||
- [ ] `default_prefix` set if module belongs to that namespace
|
||||
- [ ] All referenced classes are imported
|
||||
- [ ] All used slots are imported
|
||||
- [ ] No `equals_expression` with static JSON arrays
|
||||
- [ ] Regex patterns are quoted
|
||||
- [ ] Annotation values are quoted strings
|
||||
- [ ] No unused imports
|
||||
- [ ] `slot_usage` only references slots that exist (via slots: or inheritance)
|
||||
|
|
@ -1,98 +0,0 @@
|
|||
# Rule: Archive Folder Convention
|
||||
|
||||
**Rule ID**: archive-folder-convention
|
||||
**Created**: 2026-01-14
|
||||
**Status**: Active
|
||||
|
||||
## Summary
|
||||
|
||||
All archived files MUST be placed in an `/archive/` subfolder within their parent directory, NOT at the same level as active files.
|
||||
|
||||
## Rationale
|
||||
|
||||
1. **Clean separation**: Active files are clearly distinguished from deprecated/archived files
|
||||
2. **Discoverability**: Developers can easily find current files without wading through archived versions
|
||||
3. **Lightweight checkouts**: The archive folder can be left out of a working tree with a sparse checkout if needed (`.gitignore` has no effect on files that are already tracked)
|
||||
4. **Consistent pattern**: Same structure across all schema module types (slots, classes, enums)
|
||||
|
||||
## Directory Structure
|
||||
|
||||
```
modules/
├── slots/
│   ├── archive/                    # Archived slot files go HERE
│   │   ├── branch_id_archived_20260114.yaml
│   │   ├── all_data_real_archived_20260114.yaml
│   │   └── ...
│   ├── has_or_had_identifier.yaml  # Active slots at this level
│   └── ...
├── classes/
│   ├── archive/                    # Archived class files go HERE
│   │   └── ...
│   └── ...
└── enums/
    ├── archive/                    # Archived enum files go HERE
    │   └── ...
    └── ...
```
|
||||
|
||||
## Naming Convention for Archived Files
|
||||
|
||||
```
|
||||
{original_filename}_archived_{YYYYMMDD}.yaml
|
||||
```
|
||||
|
||||
**Examples**:
|
||||
- `branch_id.yaml` → `archive/branch_id_archived_20260114.yaml`
|
||||
- `RealnessStatus.yaml` → `archive/RealnessStatus_archived_20260114.yaml`
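
The naming convention can be expressed as a small path helper. This is a sketch under the convention stated above; the function name `archived_path` is hypothetical.

```python
from datetime import date
from pathlib import Path


def archived_path(active_path, archived_on: date):
    """Return the archive/ destination for an active module file."""
    active = Path(active_path)
    stamp = archived_on.strftime("%Y%m%d")
    # {original_filename}_archived_{YYYYMMDD}.yaml inside the archive/ subfolder
    return active.parent / "archive" / f"{active.stem}_archived_{stamp}{active.suffix}"
```

Deriving the destination from the active path keeps the archive folder next to the files it archives, as the directory structure requires.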
|
||||
|
||||
## Migration Workflow
|
||||
|
||||
When archiving a file during slot migration:
|
||||
|
||||
```bash
# 1. Copy to archive folder with date suffix (YYYYMMDD)
cp modules/slots/branch_id.yaml modules/slots/archive/branch_id_archived_20260114.yaml

# 2. Remove from active location
rm modules/slots/branch_id.yaml

# 3. Update manifest counts
# (Decrement slot count in manifest.json)

# 4. Update slot_fixes.yaml
# (Mark migration as processed: true)
```
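The copy-and-remove steps above can also be scripted. The following is a minimal sketch, not part of the project tooling: `archive_file` is a hypothetical helper name, and it only covers steps 1 and 2 (the manifest and `slot_fixes.yaml` updates are still manual).

```python
import shutil
import tempfile
from datetime import date
from pathlib import Path


def archive_file(active_path: Path, archived_on: date) -> Path:
    """Move a file into an archive/ subfolder next to it, adding a date suffix."""
    archive_dir = active_path.parent / "archive"
    archive_dir.mkdir(exist_ok=True)  # create archive/ on first use
    target_name = f"{active_path.stem}_archived_{archived_on:%Y%m%d}{active_path.suffix}"
    target = archive_dir / target_name
    if target.exists():
        raise FileExistsError(f"refusing to overwrite {target}")
    shutil.move(str(active_path), str(target))  # steps 1 and 2 in one call
    return target


# Demonstration in a throwaway directory, so no real schema files are touched.
with tempfile.TemporaryDirectory() as tmp:
    slots_dir = Path(tmp) / "modules" / "slots"
    slots_dir.mkdir(parents=True)
    (slots_dir / "branch_id.yaml").write_text("name: branch_id\n")
    archived = archive_file(slots_dir / "branch_id.yaml", date(2026, 1, 14))
    archived_relative = archived.relative_to(slots_dir).as_posix()
    still_active = (slots_dir / "branch_id.yaml").exists()
    print(archived_relative, still_active)
```

Refusing to overwrite an existing target is a design choice of this sketch; it avoids silently replacing an earlier archive made on the same day.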
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
**WRONG** - Archived files at same level as active:
|
||||
```
modules/slots/
├── branch_id_archived_20260114.yaml  # NO - clutters active directory
├── has_or_had_identifier.yaml
└── ...
```
|
||||
|
||||
**CORRECT** - Archived files in subdirectory:
|
||||
```
modules/slots/
├── archive/
│   └── branch_id_archived_20260114.yaml  # YES - clean separation
├── has_or_had_identifier.yaml
└── ...
```
|
||||
|
||||
## Validation
|
||||
|
||||
Before committing migrations, verify:
|
||||
- [ ] No `*_archived_*.yaml` files at module root level
|
||||
- [ ] All archived files are in `archive/` subdirectory
|
||||
- [ ] Archive folder exists for each module type with archived files
|
||||
- [ ] Manifest counts updated to exclude archived files
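
The first two checklist items can be checked mechanically. This is a rough sketch under the assumption that archived files always match `*_archived_*.yaml`; `find_misplaced_archives` is a hypothetical helper, and the manifest count check is not covered.

```python
import tempfile
from pathlib import Path
from typing import List


def find_misplaced_archives(modules_root: Path) -> List[str]:
    """Return archived YAML files that are not directly inside an archive/ folder."""
    misplaced = []
    for path in sorted(modules_root.rglob("*_archived_*.yaml")):
        if path.parent.name != "archive":
            misplaced.append(path.relative_to(modules_root).as_posix())
    return misplaced


# Demonstration on a temporary tree with one correct and one misplaced file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "modules"
    (root / "slots" / "archive").mkdir(parents=True)
    (root / "slots" / "archive" / "branch_id_archived_20260114.yaml").write_text("")
    (root / "slots" / "all_data_real_archived_20260114.yaml").write_text("")
    misplaced_files = find_misplaced_archives(root)
    print(misplaced_files)
```

An empty result means both checks pass for the scanned tree.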
|
||||
|
||||
## See Also
|
||||
|
||||
- Rule 53: Full Slot Migration (`full-slot-migration-rule.md`)
|
||||
- Rule 9: Enum-to-Class Promotion (`ENUM_TO_CLASS_PRINCIPLE.md`)
|
||||
- slot_fixes.yaml for migration tracking
|
||||
|
|
@ -1,74 +0,0 @@
|
|||
# Archive Organization Type Description Rule
|
||||
|
||||
## Rule
|
||||
|
||||
When describing archive classes that do NOT have `recordType` or `hold_record_set` as a primary distinguishing feature, emphasize that they represent the **archive as an organization/institution**, not just a collection of records.
|
||||
|
||||
## Rationale
|
||||
|
||||
Many archive type classes (e.g., `BankArchive`, `ChurchArchive`, `MunicipalArchive`) classify the **type of organization** that maintains the records, rather than the type of records themselves. This is an important semantic distinction:
|
||||
|
||||
- **Archive Organization Types** (no recordType focus): Classify the institution by its domain/sector
  - Examples: `BankArchive`, `ChurchArchive`, `MunicipalArchive`, `UniversityArchive`
  - Emphasis: The organization's mission, governance, and institutional context

- **Record Set Types** (have recordType): Classify the collections by record type
  - Examples: `AudiovisualArchiveRecordSetType`, `PhotographicArchiveRecordSetType`
  - Emphasis: The nature and format of the records
|
||||
|
||||
## Description Pattern
|
||||
|
||||
### For Archive Organization Types (WITHOUT recordType):
|
||||
|
||||
```yaml
description: >-
  Type of heritage institution that [primary function], specializing in
  [domain/subject area], with organizational characteristics including
  [governance, funding, legal status, or other institutional features].
```
|
||||
|
||||
**Key elements to include:**
|
||||
1. "Type of heritage institution" or "Type of archive organization"
|
||||
2. The institution's primary domain or sector
|
||||
3. Organizational characteristics (governance, funding, legal status)
|
||||
4. Institutional context (parent organization, regulatory framework)
|
||||
5. Typical services and public-facing functions
|
||||
|
||||
### For Record Set Types (WITH recordType):
|
||||
|
||||
```yaml
description: >-
  Classification of archival records documenting [subject/domain],
  typically including [record formats, content types, provenance patterns].
```
|
||||
|
||||
## Examples
|
||||
|
||||
### ✅ Correct - Archive Organization Type (BankArchive):
|
||||
|
||||
```yaml
description: >-
  Type of heritage institution operating within the banking sector, preserving
  records of financial institutions and documenting banking history. Characterized
  by corporate governance structures, extended closure periods for personal data,
  and institutional relationships with parent banking organizations.
```
|
||||
|
||||
### ✅ Correct - Record Set Type (has recordType):
|
||||
|
||||
```yaml
description: >-
  Classification of archival records documenting banking activities, including
  ledgers, correspondence, customer accounts, and financial instruments.
```
|
||||
|
||||
## Files Affected
|
||||
|
||||
All classes in the `*Archive` family that:
|
||||
- Do NOT have `hold_record_set` or `recordType` as a primary slot
|
||||
- Are subclassed from `ArchiveOrganizationType` (not `ArchiveRecordSetType`)
|
||||
|
||||
## Related Rules
|
||||
|
||||
- `mapping-specificity-hypernym-rule.md` - For correct ontology mappings
|
||||
- `class-description-quality-rule.md` - For general description quality
|
||||
|
|
@ -1,179 +0,0 @@
|
|||
# Rule 54: Broaden Generic Predicate Ranges Instead of Creating Bespoke Predicates
|
||||
|
||||
🚨 **CRITICAL**: When fixing gen-owl "Ambiguous type" warnings, **broaden the range of generic predicates** rather than creating specialized bespoke predicates.
|
||||
|
||||
## The Problem
|
||||
|
||||
gen-owl "Ambiguous type" warnings occur when a slot is used as both:
|
||||
- **DatatypeProperty** (base range: `string`, `integer`, `uri`, etc.)
|
||||
- **ObjectProperty** (slot_usage override range: a class like `Description`, `SubtitleFormatEnum`)
|
||||
|
||||
This creates OWL ambiguity because OWL requires properties to be either DatatypeProperty OR ObjectProperty, not both.
|
||||
|
||||
## ❌ WRONG Approach: Create Bespoke Predicates
|
||||
|
||||
```yaml
# DON'T DO THIS - creates proliferation of rare-use predicates
slots:
  has_or_had_subtitle_format:    # Only used by VideoSubtitle
    range: SubtitleFormatEnum
  has_or_had_transcript_format:  # Only used by VideoTranscript
    range: TranscriptFormat
```
|
||||
|
||||
**Why This Is Wrong**:
|
||||
- Creates **predicate proliferation** (schema bloat)
|
||||
- Bespoke predicates are **rarely reused** across classes
|
||||
- **Increases cognitive load** for schema users
|
||||
- **Fragments the ontology** unnecessarily
|
||||
- Violates the principle of schema parsimony
|
||||
|
||||
## ✅ CORRECT Approach: Broaden Generic Predicate Ranges
|
||||
|
||||
```yaml
# DO THIS - make the generic predicate flexible enough
slots:
  has_or_had_format:
    range: uriorcurie  # Broadened from string
    description: |
      The format of a resource. Classes narrow this to specific
      enum types (SubtitleFormatEnum, TranscriptFormatEnum) via slot_usage.
```
|
||||
|
||||
Then in class files, use `slot_usage` to narrow the range:
|
||||
|
||||
```yaml
classes:
  VideoSubtitle:
    slots:
      - has_or_had_format
    slot_usage:
      has_or_had_format:
        range: SubtitleFormatEnum  # Narrowed for this class
        required: true
```
|
||||
|
||||
## Range Broadening Options
|
||||
|
||||
| Original Range | Broadened Range | When to Use |
|----------------|-----------------|-------------|
| `string` | `uriorcurie` | When class overrides use URI-identified types or enums |
| `string` | `Any` | When truly polymorphic (strings AND class instances) |
| Specific class | Common base class | When multiple subclasses are used |
|
||||
|
||||
## Decision Tree
|
||||
|
||||
```
gen-owl warning: "Ambiguous type for: SLOTNAME"
    ↓
Is base slot range a primitive (string, integer, uri)?
├─ YES → Broaden to uriorcurie or Any
│        - Edit modules/slots/SLOTNAME.yaml
│        - Change range: string → range: uriorcurie
│        - Document change with Rule 54 reference
│        - Keep class-level slot_usage overrides (they narrow the range)
│
└─ NO → Consider if base slot needs common ancestor class
        - Create abstract base class if needed
        - Or broaden to uriorcurie
```
|
||||
|
||||
## Implementation Workflow
|
||||
|
||||
1. **Identify warning**: `gen-owl ... 2>&1 | grep "Ambiguous type for:"`

2. **Check base slot range**:
   ```bash
   grep -A5 "^slots:" modules/slots/SLOTNAME.yaml | grep "range:"
   ```

3. **Find class overrides**:
   ```bash
   for f in modules/classes/*.yaml; do
     grep -l "SLOTNAME" "$f" && grep -A3 "SLOTNAME:" "$f" | grep "range:"
   done
   ```

4. **Broaden base range**:
   - Edit `modules/slots/SLOTNAME.yaml`
   - Change `range: string` → `range: uriorcurie`
   - Add annotation documenting the change

5. **Verify fix**: Run gen-owl and confirm warning is gone

6. **Keep slot_usage overrides**: Class-level range narrowing is fine and expected
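
Steps 2 and 3 can be combined into one pass once the slot and class definitions are loaded. The sketch below assumes the YAML has already been parsed into plain dictionaries (for example with a YAML library); `find_ambiguous_slots` and the contents of `PRIMITIVE_RANGES` are illustrative assumptions, not gen-owl behavior, and `uriorcurie` is deliberately left out of the primitive set to mirror this rule.

```python
# Ranges treated as literal-valued (DatatypeProperty) for this check.
# Illustrative list, not exhaustive.
PRIMITIVE_RANGES = {"string", "integer", "float", "boolean", "date", "datetime", "uri"}


def find_ambiguous_slots(slots: dict, classes: dict) -> dict:
    """Map slot name -> [(class name, override range)] where a primitive base
    range is overridden with a non-primitive range in slot_usage."""
    conflicts = {}
    for class_name, class_def in classes.items():
        for slot_name, usage in (class_def.get("slot_usage") or {}).items():
            base_range = (slots.get(slot_name) or {}).get("range")
            override = (usage or {}).get("range")
            if override is None or base_range not in PRIMITIVE_RANGES:
                continue  # no override, or base range already broadened
            if override not in PRIMITIVE_RANGES:
                conflicts.setdefault(slot_name, []).append((class_name, override))
    return conflicts


classes = {
    "VideoSubtitle": {
        "slot_usage": {"has_or_had_format": {"range": "SubtitleFormatEnum"}}
    }
}
before = find_ambiguous_slots({"has_or_had_format": {"range": "string"}}, classes)
after = find_ambiguous_slots({"has_or_had_format": {"range": "uriorcurie"}}, classes)
print(before, after)
```

Running gen-owl remains the authoritative check; this only narrows down which slot files to edit.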
|
||||
|
||||
## Examples
|
||||
|
||||
### Example 1: has_or_had_format
|
||||
|
||||
**Before (caused warning)**:
|
||||
```yaml
# Base slot
slots:
  has_or_had_format:
    range: string  # DatatypeProperty

# Class override
classes:
  VideoSubtitle:
    slot_usage:
      has_or_had_format:
        range: SubtitleFormatEnum  # ObjectProperty → CONFLICT!
```
|
||||
|
||||
**After (fixed)**:
|
||||
```yaml
# Base slot - broadened
slots:
  has_or_had_format:
    range: uriorcurie  # Now ObjectProperty-compatible

# Class override - unchanged, still narrows
classes:
  VideoSubtitle:
    slot_usage:
      has_or_had_format:
        range: SubtitleFormatEnum  # Valid narrowing
```
|
||||
|
||||
### Example 2: has_or_had_hypernym
|
||||
|
||||
**Before**: `range: string` (DatatypeProperty)
|
||||
**After**: `range: uriorcurie` (ObjectProperty-compatible)
|
||||
|
||||
Classes that override to class ranges now work without ambiguity.
|
||||
|
||||
## Validation
|
||||
|
||||
After broadening, run:
|
||||
```bash
|
||||
gen-owl 01_custodian_name_modular.yaml 2>&1 | grep "Ambiguous type for: SLOTNAME"
|
||||
```
|
||||
|
||||
The warning should disappear without creating new predicates.
|
||||
|
||||
## Anti-Patterns to Avoid
|
||||
|
||||
| ❌ Anti-Pattern | ✅ Correct Pattern |
|----------------|-------------------|
| Create `has_or_had_subtitle_format` | Broaden `has_or_had_format` to `uriorcurie` |
| Create `has_or_had_entity_type` | Broaden `has_or_had_type` to `uriorcurie` |
| Create `has_or_had_X_label` | Broaden `has_or_had_label` to `uriorcurie` |
| Create `has_or_had_X_status` | Broaden `has_or_had_status` to `uriorcurie` |
|
||||
|
||||
## Rationale
|
||||
|
||||
This approach:
|
||||
1. **Reduces schema complexity** - Fewer predicates to understand
|
||||
2. **Promotes reuse** - Generic predicates work across domains
|
||||
3. **Maintains OWL consistency** - Single property type per predicate
|
||||
4. **Preserves type safety** - slot_usage still enforces class-specific ranges
|
||||
5. **Follows semantic web best practices** - Broad predicates, narrow contexts
|
||||
|
||||
## See Also
|
||||
|
||||
- Rule 38: Slot Centralization and Semantic URI Requirements
|
||||
- Rule: Slot Naming Convention (Current Style)
|
||||
- Rule 49: Slot Usage Minimization
|
||||
- LinkML Documentation: [slot_usage](https://linkml.io/linkml-model/latest/docs/slot_usage/)
|
||||
|
|
@ -1,61 +0,0 @@
|
|||
# Rule: Canonical Slot Protection
|
||||
|
||||
## Summary
|
||||
|
||||
When resolving slot aliases to canonical names, a slot name that has its own `.yaml` file (i.e., is itself a canonical slot) MUST NOT be replaced with a different canonical name, even if it also appears as an alias in another slot file.
|
||||
|
||||
## Context
|
||||
|
||||
Slot files in `schemas/20251121/linkml/modules/slots/` (top-level and `new/`) each define a canonical slot name. Some slot files also list aliases that overlap with canonical names from other slot files. These cross-references are accidental (e.g., indicating semantic relatedness) and should be corrected by removing the canonical names from the alias lists in which they occur. The occurrence of a canonical name in an alias list does NOT mean the referenced slot should be renamed.
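
To locate the alias entries that need removing, the overlap can be computed directly. This is a small sketch assuming the alias lists have already been read into a mapping from canonical slot name to its aliases; `find_alias_collisions` and the alias `classified_as` are hypothetical, used only for illustration.

```python
from typing import Dict, List


def find_alias_collisions(aliases_by_slot: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each slot to the aliases in its list that are canonical names of other slots."""
    canonical_names = set(aliases_by_slot)  # one key per existing slot file
    collisions = {}
    for slot_name, aliases in aliases_by_slot.items():
        clashing = sorted(a for a in aliases if a in canonical_names and a != slot_name)
        if clashing:
            collisions[slot_name] = clashing
    return collisions


collisions = find_alias_collisions({
    "categorized_as": ["has_type", "classified_as"],  # has_type has its own file
    "has_type": ["custodian_type"],                   # custodian_type does not
})
print(collisions)
```

Each reported alias is a candidate for removal from that slot file's alias list; the slot it names stays untouched.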
|
||||
|
||||
## Rule
|
||||
|
||||
1. **Before renaming any slot reference** (in `slots:`, `slot_usage:`, or `imports:` of class files), check whether the current name is itself a canonical slot name — i.e., whether a `.yaml` file exists for it in the slots directory.
|
||||
|
||||
2. **If the name IS canonical** (has its own `.yaml` file), do NOT rename it and do NOT redirect its import. The class file is correctly referencing that slot's own definition file.
|
||||
|
||||
3. **Only rename a slot reference** if the name does NOT have its own `.yaml` file and is ONLY found as an alias in another slot's file.
|
||||
|
||||
## Examples
|
||||
|
||||
### WRONG
|
||||
|
||||
```yaml
# categorized_as.yaml defines aliases: [..., "has_type", ...]
# has_type.yaml exists with canonical name "has_type"

# WRONG: Renaming has_type -> categorized_as in a class file
# This destroys the valid reference to has_type.yaml
slots:
  - categorized_as  # was: has_type -- INCORRECT REPLACEMENT
```
|
||||
|
||||
### CORRECT
|
||||
|
||||
```yaml
# has_type.yaml exists => "has_type" is canonical => leave it alone
slots:
  - has_type  # CORRECT: has_type is canonical, keep it

# "custodian_type" does NOT have its own .yaml file
# "custodian_type" is listed as an alias in has_type.yaml
# => rename custodian_type -> has_type
slots:
  - has_type  # was: custodian_type -- CORRECT REPLACEMENT
```
|
||||
|
||||
## Implementation Check
|
||||
|
||||
```python
# Alias resolution check
def should_rename(slot_name, alias_map, existing_slot_files):
    """Return True only if slot_name is an alias without its own .yaml file."""
    if slot_name in existing_slot_files:
        return False  # It's canonical: do not rename
    if slot_name in alias_map:
        return True   # It's only an alias: rename to canonical
    return False      # Unknown: leave alone
```
|
||||
|
||||
## Rationale
|
||||
|
||||
Multiple slot files may list overlapping aliases by accident or for documentation or semantic linking purposes. A canonical slot name appearing as an alias in another file does not invalidate the original slot definition. Treating it as an alias would incorrectly redirect class files away from the slot's own definition, breaking the schema's intended structure.
|
||||
|
|
@ -1,48 +0,0 @@
|
|||
# Rule: Capitalization Consistency for LinkML Names
|
||||
|
||||
## Purpose
|
||||
|
||||
Ensure naming is consistent across LinkML classes, slots, enums, and their files,
|
||||
with special care for acronyms (for example: `GLAM`, `GHC`, `GHCID`, `GLEIF`).
|
||||
|
||||
## Mandatory Requirements
|
||||
|
||||
1. **Class names**
   - Use `PascalCase`.
   - Preserve canonical acronym casing.
   - Example: `GHCIdentifier`, not `GhcidIdentifier`.

2. **Slot names**
   - Use project slot naming convention consistently.
   - If acronym appears in a slot, keep its canonical uppercase form.
   - Example: `has_GHCID_history` (if acronymed slot is required), not `has_ghcid_history`.

3. **Enum names**
   - Use `PascalCase` with `Enum` suffix where applicable.
   - Preserve acronym casing in enum identifiers and permissible values.
   - Example: `GLAMTypeEnum`.

4. **File names must match primary term exactly**
   - Class file name must match class name (case-sensitive) plus `.yaml`.
   - Enum file name must match enum name (case-sensitive) plus `.yaml`.
   - Slot file name must match slot name (case-sensitive) plus `.yaml`.

5. **No mixed acronym variants in same schema branch**
   - Do not mix forms like `Ghcid`, `GHCID`, and `ghcid` for the same concept.
   - Pick canonical form once and use it everywhere.
|
||||
|
||||
## Refactoring Rule
|
||||
|
||||
When normalizing capitalization:
|
||||
|
||||
- Update term declaration (`name`, class/slot/enum key).
|
||||
- Update file name to match.
|
||||
- Update all imports and references transitively.
|
||||
- Do not leave aliases as operational identifiers; keep aliases only for lexical metadata.
|
||||
|
||||
## Validation Checklist
|
||||
|
||||
- [ ] Class, slot, enum declarations use canonical casing.
|
||||
- [ ] File names exactly match declaration names.
|
||||
- [ ] Acronyms are consistent across declarations and references.
|
||||
- [ ] Imports and references resolve after renaming.
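
The acronym item of the checklist lends itself to a quick scan. The sketch below is a heuristic under stated assumptions: `find_noncanonical_acronyms` is a hypothetical helper, the acronym list is supplied by the caller, and plain substring matching can produce false positives (an acronym contained in an ordinary word, or one acronym contained in another).

```python
import re
from typing import Dict, List


def find_noncanonical_acronyms(names: List[str], acronyms: List[str]) -> Dict[str, List[str]]:
    """Map each name to the acronym spellings in it that differ from the canonical casing."""
    findings = {}
    for name in names:
        variants = []
        for acronym in acronyms:
            # Match the acronym in any casing, then keep only non-canonical spellings.
            for match in re.finditer(re.escape(acronym), name, flags=re.IGNORECASE):
                if match.group(0) != acronym:
                    variants.append(match.group(0))
        if variants:
            findings[name] = variants
    return findings


findings = find_noncanonical_acronyms(
    ["GLAMTypeEnum", "GhcidIdentifier", "has_ghcid_history", "has_GHCID_history"],
    ["GHCID", "GLAM"],
)
print(findings)
```

Hits are prompts for review; whether a given name should change still depends on the canonical form chosen for the schema branch.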
|
||||
|
|
@ -1,228 +0,0 @@
|
|||
# Class Description Quality Rule
|
||||
|
||||
## Rule: Write Dictionary-Style Definitions Without Repeating the Class Name
|
||||
|
||||
When writing class descriptions, follow these principles.
|
||||
|
||||
### 1. No Repetition of Class Name Components
|
||||
|
||||
**WRONG:**
|
||||
```yaml
AcademicArchiveRecordSetType:
  description: >-
    A classification type for archival record sets created by academic
    institutions. This class represents the record set type...
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
AcademicArchiveRecordSetType:
  description: >-
    Category for grouping documentary materials accumulated by tertiary
    educational institutions during their administrative, academic, and
    operational activities.
```
|
||||
|
||||
The description should define the concept using synonyms and related terms, not repeat words from the class name.
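
A simple word-overlap check can flag descriptions worth a second look. This is a heuristic sketch, not a project tool: `class_name_words` and `repeated_name_words` are hypothetical names, and only exact word matches are detected (so `archival` is not matched against `Archive`).

```python
import re
from typing import List


def class_name_words(class_name: str) -> List[str]:
    """Split a PascalCase name into lowercase words, keeping acronyms whole."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z][a-z]+|[a-z]+|\d+", class_name)
    return [part.lower() for part in parts]


def repeated_name_words(class_name: str, description: str) -> List[str]:
    """Return the class-name words that also occur as whole words in the description."""
    description_words = set(re.findall(r"[a-z]+", description.lower()))
    return sorted(set(class_name_words(class_name)) & description_words)


wrong = (
    "A classification type for archival record sets created by academic "
    "institutions. This class represents the record set type..."
)
repeated = repeated_name_words("AcademicArchiveRecordSetType", wrong)
print(repeated)
```

Treat any hit as a prompt to look for a synonym rather than as an automatic failure; some overlap with the class name can be hard to avoid.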
|
||||
|
||||
### 2. MIGRATE Structured Data Before Removing from Descriptions
|
||||
|
||||
**CRITICAL**: When a description contains structured data (examples, typical contents, alignment notes, etc.), you MUST:
|
||||
|
||||
1. **First check** if the structured data already exists in proper LinkML fields
|
||||
2. **If NOT present**: ADD it to the appropriate structured field
|
||||
3. **ONLY THEN**: Remove it from the description
|
||||
|
||||
**Never simply delete structured content from descriptions without preserving it elsewhere.**
|
||||
|
||||
**MIGRATION CHECKLIST:**
|
||||
|
||||
| Content Type | Target Field | Example |
|--------------|--------------|---------|
| Example instances | `examples:` | `- value: {...} description: "..."` |
| Typical contents | `keywords:` or `comments:` | List of typical materials |
| Alignment explanations | `broad_mappings`, `related_mappings` | Ontology references |
| Usage notes | `comments:` | Operational guidance |
| Provenance notes | `comments:` or `annotations:` | Historical context |
| Privacy/legal notes | `comments:` | Access restrictions |
| Definition details | Keep in description | Core semantic definition |
|
||||
|
||||
**WRONG - Deleting without migration:**
|
||||
```yaml
# BEFORE (has rich content)
description: |
  Records documenting student academic careers.

  **Typical Contents**:
  - Enrollment records
  - Academic transcripts
  - Graduation records

  Subject to privacy regulations (FERPA, GDPR).

# AFTER (lost information!) - DON'T DO THIS
description: >-
  Records documenting student academic careers.
```
|
||||
|
||||
**CORRECT - Migrate first, then clean:**
|
||||
```yaml
# Step 1: Add to structured fields
description: >-
  Records documenting student academic careers.
keywords:
  - enrollment records
  - academic transcripts
  - graduation records
comments:
  - Subject to privacy regulations (FERPA, GDPR, AVG)
  - Access restrictions typically apply for records less than 75 years old

# Step 2: Now description is clean but no information lost
```
|
||||
|
||||
### 3. No Structured Data or Meta-Discussion in Descriptions
|
||||
|
||||
After migration, descriptions should contain only the definition. Do not include:
|
||||
- Alignment explanations (use `broad_mappings`, `close_mappings`, `exact_mappings`)
|
||||
- Pattern explanations (use `see_also`, `comments`)
|
||||
- Usage examples (use `examples:` annotation)
|
||||
- Rationale for mappings (use `comments:` or `annotations:`)
|
||||
- Typical contents lists (use `keywords:` or `comments:`)
|
||||
|
||||
**WRONG:**
|
||||
```yaml
description: >-
  A type for X.

  **RiC-O Alignment**: Maps to rico:RecordSetType because...

  **Pattern**: This is part of a dual-class pattern with Y.

  **Examples**: Administrative fonds, student records...
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
description: >-
  Category for grouping documentary materials accumulated by tertiary
  educational institutions.

broad_mappings:
  - rico:RecordSetType
see_also:
  - AcademicArchive
keywords:
  - administrative fonds
  - student records
examples:
  - value: {...}
    description: Administrative fonds containing governance records
```
|
||||
|
||||
### 4. Use Folded Block Scalar (`>-`) for Descriptions
|
||||
|
||||
Use `>-` (folded, strip) instead of `|` (literal) to ensure clean paragraph formatting in generated documentation.
|
||||
|
||||
**WRONG:**
|
||||
```yaml
description: |
  A type for X.
  This spans multiple lines.
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
description: >-
  A type for X. This will be formatted as a single clean paragraph
  in the generated documentation.
```
|
||||
|
||||
### 5. Use LinkML `examples:` Annotation for Examples
|
||||
|
||||
Structure examples properly with `value:` and `description:` keys.
|
||||
|
||||
```yaml
examples:
  - value:
      has_type: hc:ArchiveOrganizationType
      has_label: University Administrative Records
    description: Administrative fonds containing governance records
```
|
||||
|
||||
### 6. Keywords vs Examples - Know the Difference
|
||||
|
||||
**CRITICAL**: Do not confuse `keywords:` with `examples:`. They serve different purposes:
|
||||
|
||||
| Field | Purpose | Content Type |
|-------|---------|--------------|
| `keywords:` | Search terms, topics, categories | List of strings (topics/materials) |
| `examples:` | Valid instance data demonstrations | Structured objects with `value` and `description` |
|
||||
|
||||
**Keywords** = Topics, material types, categories that describe what the class is about:
|
||||
```yaml
keywords:
  - enrollment records    # type of material
  - academic transcripts  # type of material
  - graduation records    # type of material
```
|
||||
|
||||
**Examples** = Actual instances of the class with populated slots:
|
||||
```yaml
examples:
  - value:
      has_type: hc:ArchiveOrganizationType
      has_label: Registrar Student Records
      has_note: Enrollment, transcripts, graduation records
    description: Student records series from the registrar's office
```
|
||||
|
||||
**WRONG - Using keywords as examples:**
|
||||
```yaml
# DON'T: "enrollment records" is not an instance of AcademicStudentRecordSeries
examples:
  - value: enrollment records
    description: Type of student record
```
|
||||
|
||||
**CORRECT - Keywords for topics, examples for instances:**
|
||||
```yaml
keywords:
  - enrollment records
  - academic transcripts
  - graduation records
examples:
  - value:
      has_type: hc:ArchiveOrganizationType
      has_label: Historical Student Records
      has_note: Pre-1950 student records with fewer access restrictions
    description: Historical student records open for research access
```
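
The distinction can be checked on parsed data. The sketch below assumes the `examples:` list has already been loaded into Python objects and follows this rule's convention that `value` holds a mapping of populated slots; `malformed_examples` is a hypothetical helper name.

```python
from typing import List


def malformed_examples(examples: list) -> List[int]:
    """Return indexes of examples lacking a mapping `value` or a non-empty `description`."""
    bad = []
    for index, example in enumerate(examples):
        is_mapping = isinstance(example, dict)
        value = example.get("value") if is_mapping else None
        description = example.get("description") if is_mapping else None
        # A bare string such as "enrollment records" is a keyword, not an instance.
        if not isinstance(value, dict) or not description:
            bad.append(index)
    return bad


bad_indexes = malformed_examples([
    {"value": "enrollment records", "description": "Type of student record"},
    {
        "value": {
            "has_type": "hc:ArchiveOrganizationType",
            "has_label": "Historical Student Records",
        },
        "description": "Historical student records open for research access",
    },
])
print(bad_indexes)
```

Entries flagged this way usually belong under `keywords:` instead.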
|
||||
|
||||
### 7. Multiple Examples for Different Use Cases
|
||||
|
||||
Provide multiple examples to show different contexts or configurations:
|
||||
|
||||
```yaml
examples:
  - value:
      has_type: hc:ArchiveOrganizationType
      has_label: Recent Student Records
    description: Current records subject to privacy restrictions
  - value:
      has_type: hc:ArchiveOrganizationType
      has_label: Historical Student Records
    description: Records 75+ years old with fewer access restrictions
```
|
||||
|
||||
## Summary
|
||||
|
||||
| Element | Placement |
|---------|-----------|
| Definition | `description:` (concise, no repetition) |
| Ontology mappings | `exact_mappings`, `broad_mappings`, etc. |
| Related concepts | `see_also:` |
| Usage notes | `comments:` |
| Metadata | `annotations:` |
| Examples | `examples:` with `value` and `description` |
| Typical contents | `keywords:` or `comments:` |
|
||||
|
|
@ -1,54 +0,0 @@
|
|||
# Rule: Class File Name Must Match Class Label/Name
|
||||
|
||||
## 🚨 Critical
|
||||
|
||||
When a class label/name is changed, the class file name must be renamed to match.
|
||||
|
||||
This keeps class modules discoverable, prevents stale imports, and avoids long-term naming drift.
|
||||
|
||||
## The Rule
|
||||
|
||||
1. If the primary class identifier changes, rename the file in the same edit set.
   - Change triggers include updates to:
     - top-level `name:`
     - class key under `classes:`
     - canonical class label used for module naming

2. File naming must reflect the canonical class name.
   - ✅ `DigitalPlatformProfile.yaml` for class `DigitalPlatformProfile`
   - ❌ `DigitalPlatformV2.yaml` for class `DigitalPlatformProfile`

3. After renaming a file, update all references.
   - `imports:` in other class/slot/type files
   - manifests/indexes/build inputs
   - any generated or curated mapping lists that include file paths

4. Keep semantic names versionless.
   - Do not preserve old versioned file names when class names are de-versioned.
   - Coordinate with `no-version-indicators-in-names-rule.md`.
|
||||
|
||||
## Required Checklist
|
||||
|
||||
- [ ] File name matches canonical class name
|
||||
- [ ] `id:` and `name:` are internally consistent
|
||||
- [ ] All import paths updated
|
||||
- [ ] Search confirms no stale old file-name references remain
|
||||
- [ ] YAML parses after rename
|
||||
|
||||
## Example
|
||||
|
||||
Before:
|
||||
```yaml
# file: DigitalPlatformV2.yaml
name: DigitalPlatformProfile
classes:
  DigitalPlatformProfile:
```
|
||||
|
||||
After:
|
||||
```yaml
# file: DigitalPlatformProfile.yaml
name: DigitalPlatformProfile
classes:
  DigitalPlatformProfile:
```
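
The first checklist item can be verified without a YAML parser. This is a minimal sketch assuming the top-level `name:` is written unquoted at column 0, as in the example above; `declared_name` and `file_name_matches` are hypothetical helper names.

```python
import re
from pathlib import PurePath
from typing import Optional


def declared_name(yaml_text: str) -> Optional[str]:
    """Return the value of the top-level `name:` key, or None if absent."""
    match = re.search(r"^name:\s*(\S+)", yaml_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def file_name_matches(file_name: str, yaml_text: str) -> bool:
    """True if the file stem equals the declared name (case-sensitive)."""
    return PurePath(file_name).stem == declared_name(yaml_text)


module_text = "name: DigitalPlatformProfile\nclasses:\n  DigitalPlatformProfile:\n"
stale = file_name_matches("DigitalPlatformV2.yaml", module_text)
renamed = file_name_matches("DigitalPlatformProfile.yaml", module_text)
print(stale, renamed)
```

The comparison is case-sensitive on purpose, so the same check also catches capitalization drift between file and class names.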
|
||||
|
|
@ -1,133 +0,0 @@
|
|||
# Rule 48: Class Files Must Not Define Inline Slots
|
||||
|
||||
🚨 **CRITICAL**: LinkML class files in `schemas/20251121/linkml/modules/classes/` MUST NOT define their own slots inline. All slots MUST be imported from the centralized `modules/slots/` directory.
|
||||
|
||||
## Problem Statement
|
||||
|
||||
When class files define their own slots (e.g., `AccessRestriction.yaml` defining its own slot properties), this creates:
|
||||
|
||||
1. **Duplication**: Same slot semantics defined in multiple places
|
||||
2. **Inconsistency**: Slot definitions may diverge between files
|
||||
3. **Frontend Issues**: LinkML viewer cannot properly render slot relationships
|
||||
4. **Maintenance Burden**: Changes require updates in multiple locations
|
||||
|
||||
## Architecture Requirement
|
||||
|
||||
```
schemas/20251121/linkml/
├── modules/
│   ├── classes/    # Class definitions ONLY
│   │   └── *.yaml  # NO inline slot definitions
│   ├── slots/      # ALL slot definitions go here
│   │   └── *.yaml  # One file per slot or logical group
│   └── enums/      # Enumeration definitions
```
|
||||
|
||||
## Correct Pattern
|
||||
|
||||
**Class file** (`modules/classes/AccessRestriction.yaml`):
|
||||
|
||||
```yaml
id: https://nde.nl/ontology/hc/class/AccessRestriction
name: AccessRestriction
prefixes:
  hc: https://nde.nl/ontology/hc/
  linkml: https://w3id.org/linkml/

imports:
  - linkml:types
  - ../slots/restriction_type  # Import slot from centralized location
  - ../slots/restriction_reason
  - ../slots/applies_from
  - ../slots/applies_until

default_range: string

classes:
  AccessRestriction:
    class_uri: hc:AccessRestriction
    description: >-
      Describes access restrictions on heritage collections or items.
    slots:
      - restriction_type  # Reference slot by name
      - restriction_reason
      - applies_from
      - applies_until
```
|
||||
|
||||
**Slot file** (`modules/slots/restriction_type.yaml`):
|
||||
|
||||
```yaml
id: https://nde.nl/ontology/hc/slot/restriction_type
name: restriction_type
prefixes:
  hc: https://nde.nl/ontology/hc/
  schema: http://schema.org/
  linkml: https://w3id.org/linkml/

imports:
  - linkml:types

slots:
  restriction_type:
    slot_uri: hc:restrictionType
    description: The type of access restriction applied.
    range: string
    exact_mappings:
      - schema:accessMode
```
|
||||
|
||||
## Anti-Pattern (WRONG)
|
||||
|
||||
**DO NOT** define slots inline in class files:
|
||||
|
||||
```yaml
# WRONG - AccessRestriction.yaml with inline slots
classes:
  AccessRestriction:
    slots:
      - restriction_type

slots:  # ❌ DO NOT define slots here
  restriction_type:
    description: Type of restriction
    range: string
```
|
||||
|
||||
## Identifying Violations
|
||||
|
||||
To find class files that incorrectly define slots:
|
||||
|
||||
```bash
|
||||
# Find class files with inline slot definitions
|
||||
grep -l "^slots:" schemas/20251121/linkml/modules/classes/*.yaml
|
||||
```
|
||||
|
||||
Files that match need refactoring:
|
||||
1. Extract slot definitions to `modules/slots/`
|
||||
2. Add imports for the extracted slots
|
||||
3. Remove inline `slots:` section from class file
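
The same detection can be done from Python, which may be convenient inside a validation script. This sketch mirrors the `grep "^slots:"` pattern above and works on file contents passed as text; `has_inline_slot_definitions` is a hypothetical helper name.

```python
import re


def has_inline_slot_definitions(yaml_text: str) -> bool:
    """True if the text has a top-level `slots:` key (column 0), like grep "^slots:"."""
    return re.search(r"^slots:", yaml_text, flags=re.MULTILINE) is not None


# A class that only references slots by name (indented `slots:` list).
correct_class = (
    "classes:\n"
    "  AccessRestriction:\n"
    "    slots:\n"
    "      - restriction_type\n"
)
# The same class file with an inline top-level slot definition appended.
inline_class = correct_class + (
    "\n"
    "slots:\n"
    "  restriction_type:\n"
    "    range: string\n"
)
print(has_inline_slot_definitions(correct_class), has_inline_slot_definitions(inline_class))
```

Only a `slots:` key at column 0 is flagged, so the indented `slots:` list inside a class definition does not trigger a false positive.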
|
||||
|
||||
## Migration Workflow
|
||||
|
||||
1. **Identify inline slots** in class file
|
||||
2. **Check if slot exists** in `modules/slots/`
|
||||
3. **If exists**: Remove inline definition, add import
|
||||
4. **If not exists**: Create new slot file in `modules/slots/`, then add import
|
||||
5. **Validate**: Run `linkml-validate` to ensure schema integrity
|
||||
6. **Update manifest**: Regenerate `manifest.json` if needed
|
||||
|
||||
## Rationale
|
||||
|
||||
- **Single Source of Truth**: Each slot defined exactly once
|
||||
- **Reusability**: Slots can be used across multiple classes
|
||||
- **Frontend Compatibility**: LinkML viewer depends on centralized slots for proper edge rendering in UML diagrams
|
||||
- **Semantic Consistency**: `slot_uri` and mappings defined once, applied everywhere
|
||||
- **Maintenance**: Changes to slot semantics applied in one place
|
||||
|
||||
## See Also
|
||||
|
||||
- Rule 38: Slot Centralization and Semantic URI Requirements
|
||||
- Rule: Slot Naming Convention (Current Style)
|
||||
- Rule 42: No Ontology Prefixes in Slot Names
|
||||
- Rule 43: Slot Nouns Must Be Singular
|
||||
|
|
@ -1,158 +0,0 @@
|
|||
# Class Multilingual Support Rule
|
||||
|
||||
## Rule: All Class Files Must Include Multilingual Descriptions and Aliases
|
||||
|
||||
Every class file must provide `alt_descriptions` and `structured_aliases` in all supported languages to ensure internationalization and interoperability with multilingual heritage systems.
|
||||
|
||||
### Required Languages
|
||||
|
||||
| Code | Language |
|------|----------|
| `nl` | Dutch |
| `de` | German |
| `fr` | French |
| `es` | Spanish |
| `ar` | Arabic |
| `id` | Indonesian |
| `zh` | Chinese |
|
||||
|
||||
### Structure
|
||||
|
||||
#### alt_descriptions
|
||||
|
||||
Provide translated descriptions for each supported language:
|
||||
|
||||
```yaml
classes:
  AcademicArchiveRecordSetType:
    description: >-
      Category for grouping documentary materials accumulated by tertiary
      educational institutions during their administrative, academic, and
      operational activities.
    alt_descriptions:
      nl: >-
        Categorie voor het groeperen van documentair materiaal dat door
        hogeronderwijsinstellingen is verzameld tijdens hun administratieve,
        academische en operationele activiteiten.
      de: >-
        Kategorie zur Gruppierung von Dokumentenmaterial, das von Hochschulen
        während ihrer administrativen, akademischen und betrieblichen Aktivitäten
        angesammelt wurde.
      fr: >-
        Catégorie de regroupement des documents accumulés par les établissements
        d'enseignement supérieur au cours de leurs activités administratives,
        académiques et opérationnelles.
      es: >-
        Categoría para agrupar materiales documentales acumulados por instituciones
        de educación superior durante sus actividades administrativas, académicas
        y operativas.
      ar: >-
        فئة لتجميع المواد الوثائقية التي جمعتها مؤسسات التعليم العالي
        خلال أنشطتها الإدارية والأكاديمية والتشغيلية.
      id: >-
        Kategori untuk mengelompokkan materi dokumenter yang dikumpulkan oleh
        institusi pendidikan tinggi selama aktivitas administratif, akademik,
        dan operasional mereka.
      zh: >-
        高等教育机构在行政、学术和运营活动中积累的文献材料的分类类别。
```
|
||||
|
||||
#### structured_aliases
|
||||
|
||||
Provide language-specific aliases/alternative names:
|
||||
|
||||
```yaml
classes:
  AcademicArchiveRecordSetType:
    structured_aliases:
      - literal_form: academisch archiefbestand
        in_language: nl
      - literal_form: Hochschularchivbestand
        in_language: de
      - literal_form: fonds d'archives académiques
        in_language: fr
      - literal_form: fondo de archivo académico
        in_language: es
      - literal_form: أرشيف أكاديمي
        in_language: ar
      - literal_form: koleksi arsip akademik
        in_language: id
      - literal_form: 学术档案集
        in_language: zh
```
|
||||
|
||||
### Complete Example
|
||||
|
||||
```yaml
id: https://nde.nl/ontology/hc/class/AcademicArchiveRecordSetType
name: AcademicArchiveRecordSetType
title: Academic Archive Record Set Type
prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  skos: http://www.w3.org/2004/02/skos/core#
default_prefix: hc
imports:
  - linkml:types
  - ../classes/CollectionType
classes:
  AcademicArchiveRecordSetType:
    description: >-
      Category for grouping documentary materials accumulated by tertiary
      educational institutions during their administrative, academic, and
      operational activities.
    alt_descriptions:
      nl: >-
        Categorie voor het groeperen van documentair materiaal dat door
        hogeronderwijsinstellingen is verzameld.
      de: >-
        Kategorie zur Gruppierung von Dokumentenmaterial, das von Hochschulen
        angesammelt wurde.
      fr: >-
        Catégorie de regroupement des documents accumulés par les établissements
        d'enseignement supérieur.
      es: >-
        Categoría para agrupar materiales documentales acumulados por instituciones
        de educación superior.
      ar: >-
        فئة لتجميع المواد الوثائقية التي جمعتها مؤسسات التعليم العالي.
      id: >-
        Kategori untuk mengelompokkan materi dokumenter yang dikumpulkan oleh
        institusi pendidikan tinggi.
      zh: >-
        高等教育机构积累的文献材料的分类类别。
    structured_aliases:
      - literal_form: academisch archiefbestand
        in_language: nl
      - literal_form: Hochschularchivbestand
        in_language: de
      - literal_form: fonds d'archives académiques
        in_language: fr
      - literal_form: fondo de archivo académico
        in_language: es
      - literal_form: أرشيف أكاديمي
        in_language: ar
      - literal_form: koleksi arsip akademik
        in_language: id
      - literal_form: 学术档案集
        in_language: zh
    is_a: CollectionType
    # ... rest of class definition
```
|
||||
|
||||
### Translation Guidelines
|
||||
|
||||
1. **Accuracy over literal translation**: Translate the concept, not word-by-word
|
||||
2. **Use domain-appropriate terminology**: Use archival/library/museum terminology standard in each language
|
||||
3. **Consult existing vocabularies**: Reference RiC-O, ISAD(G), AAT translations when available
|
||||
4. **Maintain consistency**: Same term should be translated consistently across all class files
|
||||
|
||||
### Checklist
|
||||
|
||||
For each class file, verify:
|
||||
|
||||
- [ ] `alt_descriptions` present with all 7 languages
|
||||
- [ ] `structured_aliases` present with all 7 languages
|
||||
- [ ] Translations are accurate and domain-appropriate
|
||||
- [ ] Arabic text is properly encoded (RTL)
|
||||
- [ ] Chinese uses simplified characters (zh) unless traditional specified (zh-hant)
|
||||
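The first two checklist items are mechanical and can be checked automatically. The following sketch assumes the class definition has already been parsed into a Python dict (for example by a YAML loader) with the same keys as the examples above. The function name `missing_languages` is hypothetical.

```python
REQUIRED_LANGUAGES = ("nl", "de", "fr", "es", "ar", "id", "zh")


def missing_languages(class_def: dict) -> dict[str, list[str]]:
    """Report which required languages lack a description or an alias."""
    described = set(class_def.get("alt_descriptions") or {})
    aliased = {
        alias.get("in_language")
        for alias in class_def.get("structured_aliases") or []
    }
    return {
        "alt_descriptions": [c for c in REQUIRED_LANGUAGES if c not in described],
        "structured_aliases": [c for c in REQUIRED_LANGUAGES if c not in aliased],
    }
```

Two empty lists mean the class covers all seven languages. Translation accuracy and terminology still need review by someone who reads the language.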
|
|
@ -1,65 +0,0 @@
|
|||
# Rule: Engineering Parsimony and Domain Modeling
|
||||
|
||||
## Critical Convention
|
||||
|
||||
Our ontology follows an engineering-oriented approach: practical domain utility and
|
||||
stable interoperability take priority over minimal, tool-specific class catalogs.
|
||||
|
||||
## Rule
|
||||
|
||||
1. Model domain concepts, not implementation tools.
   - Reject classes like `ExaSearchMetadata`, `OpenAIFetchResult`, `ElasticsearchHit`.

2. Prefer generic, reusable activity/entity classes for operational provenance.
   - Use classes such as `ExternalSearchMetadata`, `RetrievalActivity`, `SearchResult`.

3. Capture tool/vendor details in slot values, not class names.
   - Record with generic predicates like `has_tool`, `has_method`, `has_agent`, `has_note`.

4. Digital platforms acting as custodians are valid domain classes.
   - Platform-as-custodian classes (for example YouTube-related custodian classes) are allowed.
   - Data processing/search tools are not ontology class candidates.

5. Avoid ontology growth driven by transient engineering stack choices.
   - New class proposals must be justified by cross-tool, domain-stable semantics.
|
||||
|
||||
## Rationale
|
||||
|
||||
- Tool names are volatile implementation details and age quickly.
|
||||
- Domain-level abstractions maximize reuse, query consistency, and mapping stability.
|
||||
- This reflects engineering ontology practice, in which strict theoretical parsimony
  among candidate theories is not the only optimization criterion; practical
  semantic interoperability and maintainability come first.
|
||||
|
||||
## Examples
|
||||
|
||||
### Wrong
|
||||
|
||||
```yaml
classes:
  ExaSearchMetadata:
    class_uri: prov:Activity
```
|
||||
|
||||
### Correct
|
||||
|
||||
```yaml
classes:
  ExternalSearchMetadata:
    class_uri: prov:Activity
    slots:
      - has_tool
      - has_method
      - has_agent
```
|
||||
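Rule 1 can be partly enforced with a naming check. The sketch below flags class names that embed a tool or vendor name as a whole CamelCase word. The deny list contains only the three names used in this rule and is an assumption; a real list would have to be maintained by the project, and the function name `tool_names_in_class` is hypothetical.

```python
import re

# Illustrative deny list only; a real list would be maintained by the project.
TOOL_NAMES = ("Exa", "OpenAI", "Elasticsearch")


def tool_names_in_class(class_name: str) -> list[str]:
    """Return tool/vendor names that appear as whole CamelCase words."""
    hits = []
    for tool in TOOL_NAMES:
        # The match must not be glued to lowercase letters on either side,
        # so "Exa" matches "ExaSearchMetadata" but not "ExampleRecord".
        pattern = rf"(?<![a-z]){re.escape(tool)}(?![a-z])"
        if re.search(pattern, class_name):
            hits.append(tool)
    return hits
```

A non-empty result is only a prompt for review. Platform-as-custodian classes remain valid under rule 4, so a name match alone does not make a class invalid.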
|
||||
## References
|
||||
|
||||
1. Liefke, K. (2024). *Natural Language Ontology and Semantic Theory*.
|
||||
Cambridge Elements in Semantics. DOI: `10.1017/9781009307789`.
|
||||
URL: https://www.cambridge.org/core/elements/abs/natural-language-ontology-and-semantic-theory/E8DDE548BB8A98137721984E26FAD764
|
||||
|
||||
2. Liefke, K. (2025). *Reduction and Unification in Natural Language Ontology*.
|
||||
Cambridge Elements in Semantics. DOI: `10.1017/9781009559683`.
|
||||
URL: https://www.cambridge.org/core/elements/abs/reduction-and-unification-in-natural-language-ontology/40F58ABA0D9C08958B5926F0CBDAD3CA
|
||||
|
||||
|
|
@ -1,37 +0,0 @@
|
|||
# Exact Mapping Predicate/Class Distinction Rule
|
||||
|
||||
🚨 **CRITICAL**: The `exact_mappings` property implies semantic equivalence. Equivalence can only exist between elements of the same ontological category.
|
||||
|
||||
## The Rule
|
||||
|
||||
1. **Slots (Predicates)** MUST ONLY have `exact_mappings` to ontology **predicates** (properties).
    * ❌ INVALID: Slot `analyze` maps to `schema:Thing` (a Class).
    * ✅ VALID: Slot `analyze` maps to `crm:P129_is_about` (a Property).

2. **Classes (Entities)** MUST ONLY have `exact_mappings` to ontology **classes** (entities).
    * ❌ INVALID: Class `Person` maps to `foaf:name` (a Property).
    * ✅ VALID: Class `Person` maps to `foaf:Person` (a Class).

3. **When true equivalence exists and is verified, exact mapping is preferred.**
    * ✅ VALID: Class `Acquisition` maps to `crm:E8_Acquisition`.
    * ✅ VALID: Slot mapped to an actually equivalent ontology property.
    * ❗ Do not avoid `exact_mappings` by default; avoid only when scope is broader/narrower/similar-but-not-equal.
|
||||
|
||||
## Rationale
|
||||
|
||||
Mapping a slot (which defines a relationship or attribute) to a class (which defines a type of entity) is a category error. A class such as `schema:Thing` names a *type* of entity, not the *relationship* of "having an object" or "analyzing an object".
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
When adding or reviewing `exact_mappings`:
|
||||
- [ ] Is the LinkML element a Class or a Slot?
|
||||
- [ ] Did you verify the target term type in the ontology definition files (do not rely on naming heuristics)?
|
||||
- [ ] Do they match? (Class↔Class, Slot↔Property)
|
||||
- [ ] If the target ontology uses opaque IDs (like CIDOC-CRM `E55_Type`), verify the type definition in the ontology file.
|
||||
- [ ] If semantic scope is truly equivalent, use `exact_mappings` (not `close`/`broad` as a conservative fallback).
|
||||
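The category check in this list can be sketched as a lookup. The example below assumes a table of term kinds is available; here it is typed by hand for the terms named in this rule, whereas in practice it would be built from the ontology definition files. `TERM_KINDS` and `check_exact_mapping` are hypothetical names.

```python
# Hypothetical lookup of term kinds. In practice this would be built from the
# ontology definition files, not typed by hand.
TERM_KINDS = {
    "crm:P129_is_about": "property",
    "crm:E8_Acquisition": "class",
    "foaf:Person": "class",
    "foaf:name": "property",
}

# A LinkML slot must map to a property, a LinkML class to a class.
EXPECTED_KIND = {"slot": "property", "class": "class"}


def check_exact_mapping(element_kind: str, target: str) -> str:
    """Classify an exact mapping as 'ok', 'category-error', or 'unverified'."""
    target_kind = TERM_KINDS.get(target)
    if target_kind is None:
        return "unverified"  # never guess from the name; look it up first
    return "ok" if target_kind == EXPECTED_KIND[element_kind] else "category-error"
```

This only catches category errors. Whether two terms of the same category are truly equivalent in scope remains a manual judgment.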
|
||||
## Common Pitfalls to Fix
|
||||
|
||||
- Mapping slots to classes such as `schema:Thing`.
|
||||
- Mapping slots to `skos:Concept`.
|
||||
- Mapping classes to `schema:name` or `dc:title`.
|
||||
|
|
@ -1,144 +0,0 @@
|
|||
# Rule 58: Feedback vs Revision Distinction in slot_fixes.yaml
|
||||
|
||||
## Summary
|
||||
|
||||
The `feedback` and `revision` fields in `slot_fixes.yaml` serve distinct purposes and MUST NOT be conflated or renamed.
|
||||
|
||||
## Field Definitions
|
||||
|
||||
### `revision` Field
|
||||
- **Purpose**: Defines WHAT the migration target is
|
||||
- **Content**: List of slots and classes to create
|
||||
- **Authority**: IMMUTABLE (per Rule 57)
|
||||
- **Format**: Structured YAML list with `label`, `type`, optional `link_branch`
|
||||
|
||||
### `feedback` Field
|
||||
- **Purpose**: Contains user instructions on HOW the revision needs to be applied or corrected
|
||||
- **Content**: Can be string or structured format
|
||||
- **Authority**: User directives that override previous `notes`
|
||||
- **Action Required**: Agent must interpret and act upon feedback
|
||||
|
||||
## Feedback Formats
|
||||
|
||||
### Format 1: Structured (with `done` field)
|
||||
```yaml
feedback:
  - timestamp: '2026-01-17T00:01:57Z'
    user: Simon C. Kemper
    done: false  # Becomes true after agent processes
    comment: |
      The migration should use X instead of Y.
    response: ""  # Agent fills this after completing
```
|
||||
|
||||
### Format 2: String (direct instruction)
|
||||
```yaml
|
||||
feedback: I reject this! type_id should be migrated to has_or_had_identifier + Identifier
|
||||
```
|
||||
|
||||
Or:
|
||||
```yaml
|
||||
feedback: I altered the revision based on this feedback. Conduct this new migration accordingly.
|
||||
```
|
||||
|
||||
## Interpretation Rules
|
||||
|
||||
| Feedback Contains | Meaning | Action Required |
|-------------------|---------|-----------------|
| "I reject this" | Previous `notes` were WRONG | Follow `revision` field instead |
| "I altered the revision" | User updated `revision` | Execute migration per NEW revision |
| "Conduct the migration" | Migration not yet done | Execute migration now |
| "Please conduct accordingly" | Migration pending | Execute migration now |
| "ADDRESSED" or `done: true` | Already processed | No action needed |
|
||||
|
||||
## Decision Tree
|
||||
|
||||
```
Is feedback field present?
├─ NO → Check `processed.status`
│   ├─ true → Migration complete
│   └─ false → Execute revision
│
└─ YES → What format?
    ├─ Structured with `done: true` → No action needed
    ├─ Structured with `done: false` → Process feedback, then set done: true
    └─ String format → Parse for keywords:
        ├─ "reject" → Previous notes invalid, follow revision
        ├─ "altered/adjusted revision" → Execute NEW revision
        ├─ "conduct/please" → Migration pending, execute now
        └─ "ADDRESSED" → Already done, no action
```
|
||||
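The decision tree can be expressed as a small function. The sketch below assumes the `slot_fixes.yaml` entry has already been parsed into a dict. The returned action labels, the check of "ADDRESSED" before the other keywords, and the `needs_human_review` fallback are assumptions made for illustration; the tree itself does not fix a keyword order.

```python
def next_action(entry: dict) -> str:
    """Map a parsed slot_fixes.yaml entry to the action the tree prescribes."""
    feedback = entry.get("feedback")

    if feedback is None:
        done = (entry.get("processed") or {}).get("status", False)
        return "none" if done else "execute_revision"

    if isinstance(feedback, list):  # structured format
        pending = any(not item.get("done", False) for item in feedback)
        return "process_feedback" if pending else "none"

    text = str(feedback)
    lowered = text.lower()
    # Assumption: "ADDRESSED" is checked first and case-sensitively, because
    # it marks feedback that has already been handled.
    if "ADDRESSED" in text:
        return "none"
    if "reject" in lowered:
        return "follow_revision"
    if "altered" in lowered or "adjusted" in lowered:
        return "execute_new_revision"
    if "conduct" in lowered or "please" in lowered:
        return "execute_revision"
    return "needs_human_review"  # no keyword matched; do not guess
```

Keyword matching is deliberately conservative: a string that matches nothing is escalated rather than treated as completed.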
|
||||
## Anti-Patterns
|
||||
|
||||
### WRONG: Renaming feedback to revision
|
||||
```yaml
# DO NOT DO THIS
# feedback contains instructions, not migration specs
revision:  # Was: feedback
  - I reject this! Use has_or_had_identifier
```
|
||||
|
||||
### WRONG: Ignoring string feedback
|
||||
```yaml
|
||||
feedback: Please conduct the migration accordingly.
|
||||
notes: "NO MIGRATION NEEDED" # WRONG - feedback overrides notes
|
||||
```
|
||||
|
||||
### WRONG: Treating all feedback as completed
|
||||
```yaml
feedback: I altered the revision. Conduct this new migration.
processed:
  status: true  # WRONG if migration not actually done
```
|
||||
|
||||
## Correct Workflow
|
||||
|
||||
1. **Read feedback** - Understand user instruction
|
||||
2. **Check revision** - This defines the target migration
|
||||
3. **Execute migration** - Create/update slots and classes per revision
|
||||
4. **Update processed.status** - Set to `true`
|
||||
5. **Add response** - Document what was done
   - For structured feedback: Set `done: true` and fill `response`
   - For string feedback: Add new structured feedback entry confirming completion
|
||||
|
||||
## Example: Processing String Feedback
|
||||
|
||||
Before:
|
||||
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/type_id
  feedback: I reject this! type_id should be migrated to has_or_had_identifier + Identifier
  revision:
    - label: has_or_had_identifier
      type: slot
    - label: Identifier
      type: class
  processed:
    status: false
    notes: "Previously marked as no migration needed"
```
|
||||
|
||||
After processing:
|
||||
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/type_id
  feedback:
    - timestamp: '2026-01-17T12:00:00Z'
      user: System
      done: true
      comment: "Original string feedback: I reject this! type_id should be migrated to has_or_had_identifier + Identifier"
      response: "Migration completed. type_id.yaml archived, consuming classes updated to use has_or_had_identifier slot with Identifier range."
  revision:
    - label: has_or_had_identifier
      type: slot
    - label: Identifier
      type: class
  processed:
    status: true
    notes: "Migration completed per user feedback rejecting previous notes."
```
|
||||
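The before/after transformation can be sketched as a function over a parsed entry. This is an illustration under stated assumptions: the entry is a dict with the keys shown in the example, `close_string_feedback` is a hypothetical name, and `revision` is carried over unchanged because it is immutable.

```python
from datetime import datetime, timezone


def close_string_feedback(entry: dict, response: str, user: str = "System") -> dict:
    """Return a copy of the entry with string feedback wrapped as a done item."""
    feedback = entry.get("feedback")
    if not isinstance(feedback, str):
        raise ValueError("expected string-format feedback")

    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    updated = dict(entry)  # shallow copy; `revision` is carried over untouched
    updated["feedback"] = [
        {
            "timestamp": timestamp,
            "user": user,
            "done": True,
            "comment": f"Original string feedback: {feedback}",
            "response": response,
        }
    ]
    # Build a new `processed` mapping so the input entry is not modified.
    updated["processed"] = {**(entry.get("processed") or {}), "status": True}
    return updated
```

The function should only be called after the migration has actually been carried out, since it sets `processed.status` to `true`.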
|
||||
## See Also
|
||||
|
||||
- **Rule 53**: Full Slot Migration - slot_fixes.yaml is AUTHORITATIVE
|
||||
- **Rule 57**: slot_fixes.yaml Revision Key is IMMUTABLE
|
||||
- **Rule 39**: Slot Naming Convention (RiC-O Style)
|
||||
|
|
@ -1,373 +0,0 @@
|
|||
# Rule 53: Full Slot Migration - No Deprecation Notes
|
||||
|
||||
🚨 **CRITICAL**: When migrating slots from `slot_fixes.yaml`:
|
||||
|
||||
1. **Follow the `revision` section EXACTLY** - The `slot_fixes.yaml` file specifies the exact replacement slots and classes to use
|
||||
2. **Perform FULL MIGRATION** - Completely remove the deprecated slot from the entity class
|
||||
3. **Do NOT add deprecation notes** - Never keep both old and new slots with deprecation markers
|
||||
|
||||
---
|
||||
|
||||
## 🚨 slot_fixes.yaml is AUTHORITATIVE AND CURATED 🚨
|
||||
|
||||
**File Location**: `schemas/20251121/linkml/modules/slots/slot_fixes.yaml`
|
||||
|
||||
**THIS FILE IS THE SINGLE SOURCE OF TRUTH FOR ALL SLOT MIGRATIONS.**
|
||||
|
||||
The `slot_fixes.yaml` file has been **manually curated** to specify the exact replacement slots and classes for each deprecated slot. The revisions are based on:
|
||||
|
||||
1. **Ontology analysis** - Each replacement was chosen based on alignment with base ontologies (CIDOC-CRM, RiC-O, PROV-O, Schema.org, etc.)
|
||||
2. **Semantic correctness** - Revisions reflect the intended meaning of the original slot
|
||||
3. **Pattern consistency** - Follows established naming conventions (Rule 39: RiC-O style, Rule 43: singular nouns)
|
||||
4. **Class hierarchy design** - Type/Types pattern (Rule 0b) applied where appropriate
|
||||
|
||||
**YOU MUST NOT**:
|
||||
- ❌ Substitute different slots than those specified in `revision`
|
||||
- ❌ Use your own judgment to pick "similar" slots
|
||||
- ❌ Skip the revision and invent new mappings
|
||||
- ❌ Partially apply the revision (e.g., use the slot but not the class)
|
||||
|
||||
**YOU MUST**:
|
||||
- ✅ Follow the `revision` section TO THE LETTER
|
||||
- ✅ Use EXACTLY the slots and classes specified
|
||||
- ✅ Apply ALL components of the revision (both slots AND classes)
|
||||
- ✅ Interpret `link_branch` fields correctly (see below)
|
||||
- ✅ Update `processed.status: true` after completing migration
|
||||
|
||||
---
|
||||
|
||||
## Understanding `link_branch` in Revision Plans
|
||||
|
||||
🚨 **CRITICAL**: The `link_branch` field in revision plans indicates **nested class attributes**. Items with `link_branch: N` are slots/classes that belong TO the primary class, not standalone replacements.
|
||||
|
||||
### How to Interpret `link_branch`
|
||||
|
||||
| Revision Item | Meaning |
|---------------|---------|
| Items **WITHOUT** `link_branch` | **PRIMARY** slot and class to create |
| Items **WITH** `link_branch: 1` | First attribute branch that the primary class needs |
| Items **WITH** `link_branch: 2` | Second attribute branch that the primary class needs |
| Items **WITH** `link_branch: N` | Nth attribute branch for the primary class |
|
||||
|
||||
### Example: `visitor_count` Revision
|
||||
|
||||
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/visitor_count
  revision:
    - label: has_or_had_quantity  # PRIMARY SLOT (no link_branch)
      type: slot
    - label: Quantity  # PRIMARY CLASS (no link_branch)
      type: class
    - label: has_or_had_measurement_unit  # Quantity needs this slot
      type: slot
      link_branch: 1  # ← Branch 1: unit attribute
    - label: MeasureUnit  # Range of has_or_had_measurement_unit
      type: class
      value:
        - visitors
      link_branch: 1
    - label: temporal_extent  # Quantity needs this slot too
      type: slot
      link_branch: 2  # ← Branch 2: time attribute
    - label: TimeSpan  # Range of temporal_extent
      type: class
      link_branch: 2
```
|
||||
|
||||
**Interpretation**: This creates:
|
||||
1. **Primary**: `has_or_had_quantity` slot → `Quantity` class
|
||||
2. **Branch 1**: `Quantity.has_or_had_measurement_unit` → `MeasureUnit` (with value "visitors")
|
||||
3. **Branch 2**: `Quantity.temporal_extent` → `TimeSpan`
|
||||
|
||||
### Resulting Class Structure
|
||||
|
||||
```yaml
# The Quantity class should have these slots:
Quantity:
  slots:
    - has_or_had_measurement_unit  # From link_branch: 1
    - temporal_extent  # From link_branch: 2
```
|
||||
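The interpretation of `link_branch` amounts to grouping revision items by branch number. The sketch below assumes the `revision` list has been parsed into a list of dicts with `label`, `type`, and optional `link_branch` keys, as in the example above. The function name `split_revision` and the shape of the returned plan are illustrative.

```python
def split_revision(revision: list[dict]) -> dict:
    """Group revision items into the primary pair and numbered branches."""
    plan = {"primary": {"slot": [], "class": []}, "branches": {}}
    for item in revision:
        branch = item.get("link_branch")
        if branch is None:
            target = plan["primary"]  # no link_branch: primary slot or class
        else:
            target = plan["branches"].setdefault(branch, {"slot": [], "class": []})
        target[item["type"]].append(item["label"])
    return plan
```

For the `visitor_count` revision this yields `has_or_had_quantity` and `Quantity` as the primary pair, with the slots of branches 1 and 2 being the ones to add to the `Quantity` class.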
|
||||
### Complex Example: `visitor_conversion_rate`
|
||||
|
||||
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/visitor_conversion_rate
  revision:
    - label: has_or_had_conversion_rate  # PRIMARY SLOT
      type: slot
    - label: ConversionRate  # PRIMARY CLASS
      type: class
    - label: has_or_had_type  # ConversionRate.has_or_had_type
      type: slot
      link_branch: 1
    - label: ConversionRateType  # Abstract type class
      type: class
      link_branch: 1
    - label: includes_or_included  # ConversionRateType hierarchy slot
      type: slot
      link_branch: 1
    - label: ConversionRateTypes  # Concrete subclasses file
      type: class
      link_branch: 1
    - label: temporal_extent  # ConversionRate.temporal_extent
      type: slot
      link_branch: 2
    - label: TimeSpan  # Range of temporal_extent
      type: class
      link_branch: 2
```
|
||||
|
||||
**Interpretation**:
|
||||
1. **Primary**: `has_or_had_conversion_rate` → `ConversionRate`
|
||||
2. **Branch 1**: Type hierarchy with `ConversionRateType` (abstract) + `ConversionRateTypes` (concrete subclasses)
|
||||
3. **Branch 2**: Temporal tracking via `temporal_extent` → `TimeSpan`
|
||||
|
||||
### Migration Checklist for `link_branch` Revisions
|
||||
|
||||
- [ ] Create/verify PRIMARY slot exists
|
||||
- [ ] Create/verify PRIMARY class exists
|
||||
- [ ] For EACH `link_branch: N`:
  - [ ] Add the branch slot to PRIMARY class's `slots:` list
  - [ ] Import the branch slot file
  - [ ] Import the branch class file (if creating new class)
  - [ ] Verify range of branch slot points to branch class
|
||||
- [ ] Update consuming class to use PRIMARY slot (not deprecated slot)
|
||||
- [ ] Update examples to show nested structure
|
||||
|
||||
---
|
||||
|
||||
## Mandatory: Follow slot_fixes.yaml Revisions Exactly
|
||||
|
||||
**The `revision` section in `slot_fixes.yaml` is AUTHORITATIVE.** Do not substitute different slots based on your own judgment.
|
||||
|
||||
**Example from slot_fixes.yaml**:
|
||||
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/actual_start
  revision:
    - label: begin_of_the_begin  # ← USE THIS SLOT
      type: slot
    - label: TimeSpan  # ← USE THIS CLASS
      type: class
```
|
||||
|
||||
**CORRECT**: Use `begin_of_the_begin` slot (as specified)
|
||||
**WRONG**: Substitute `has_actual_start_date` (not in revision)
|
||||
|
||||
## The Problem
|
||||
|
||||
Adding deprecation notes while keeping both old and new slots:
|
||||
- Creates schema bloat with redundant properties
|
||||
- Confuses data consumers about which slot to use
|
||||
- Violates single-source-of-truth principle
|
||||
- Complicates future data validation
|
||||
|
||||
## Anti-Pattern (WRONG)
|
||||
|
||||
```yaml
# WRONG - Keeping deprecated slot with deprecation note
classes:
  TemporaryLocation:
    slots:
      - actual_start  # OLD - kept with deprecation note
      - actual_end  # OLD - kept with deprecation note
      - has_actual_start_date  # NEW
      - has_actual_end_date  # NEW
    slot_usage:
      actual_start:
        deprecated: |
          DEPRECATED: Use has_actual_start_date instead.
        # ... more deprecation documentation
```
|
||||
|
||||
## Correct Pattern
|
||||
|
||||
```yaml
# CORRECT - Only new slots, old slots completely removed
# (slot names are illustrative; always use the slots named in `revision`)
classes:
  TemporaryLocation:
    slots:
      - has_actual_start_date  # NEW - only new slots present
      - has_actual_end_date  # NEW
    # NO slot_usage for deprecated slots - they don't exist in this class
```
|
||||
|
||||
## Migration Steps
|
||||
|
||||
When processing a slot from `slot_fixes.yaml`:
|
||||
|
||||
1. **Identify affected entity class(es)**
|
||||
2. **Remove old slot from imports** (if dedicated import file exists)
|
||||
3. **Remove old slot from slots list**
|
||||
4. **Remove any slot_usage for old slot**
|
||||
5. **Add new slot import** (if not already present)
|
||||
6. **Add new slot to slots list**
|
||||
7. **Add slot_usage for new slot** (if range override or customization needed)
|
||||
8. **Update examples** to use new slot
|
||||
9. **Validate with gen-owl**
|
||||
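Steps 2 to 6 are mechanical edits to the class file and can be sketched as one function. The example assumes the class file has been parsed into a dict and that slot imports follow the `../slots/<name>` pattern used in this rule. `migrate_class` is a hypothetical helper; the replacement slots must come from the `revision` section, not from the caller's judgment.

```python
import copy


def migrate_class(class_file: dict, class_name: str, old: str, new: list[str]) -> dict:
    """Apply steps 2-6 to a parsed class file and return a modified deep copy."""
    result = copy.deepcopy(class_file)
    cls = result["classes"][class_name]

    # Steps 2-4: drop the old slot's import, list entry, and slot_usage.
    result["imports"] = [i for i in result.get("imports", []) if i != f"../slots/{old}"]
    cls["slots"] = [s for s in cls.get("slots", []) if s != old]
    usage = cls.get("slot_usage")
    if usage:
        usage.pop(old, None)

    # Steps 5-6: add each replacement slot once, preserving order.
    for slot in new:
        if f"../slots/{slot}" not in result["imports"]:
            result["imports"].append(f"../slots/{slot}")
        if slot not in cls["slots"]:
            cls["slots"].append(slot)
    return result
```

Steps 1 and 7 to 9 (finding affected classes, class-specific `slot_usage`, updating examples, and validation) are left to the person or agent performing the migration.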
|
||||
## What Happens to Old Slot Files
|
||||
|
||||
The old slot files in `modules/slots/` (e.g., `actual_start.yaml`, `activities_societies.yaml`) are **NOT deleted** because:
|
||||
- Other entity classes might still use them
|
||||
- They serve as documentation of the old schema
|
||||
- They can be archived when all usages are migrated
|
||||
|
||||
However, the old slots are **removed from the entity class** being migrated.
|
||||
|
||||
## Example: TemporaryLocation Migration
|
||||
|
||||
**Before** (with old slots):
|
||||
```yaml
imports:
  - ../slots/actual_end
  - ../slots/actual_start
  - ../slots/has_actual_start_date
  - ../slots/has_actual_end_date

slots:
  - actual_end
  - actual_start
  - has_actual_start_date
  - has_actual_end_date
```
|
||||
|
||||
**After** (fully migrated):
|
||||
```yaml
imports:
  # actual_end and actual_start imports REMOVED
  - ../slots/has_actual_start_date
  - ../slots/has_actual_end_date

slots:
  # actual_end and actual_start REMOVED from list
  - has_actual_start_date
  - has_actual_end_date
```
|
||||
|
||||
## Slot Usage for New Slots
|
||||
|
||||
Only add `slot_usage` for the new slot if you need to:
|
||||
- Override the range for this specific class
|
||||
- Add class-specific examples
|
||||
- Add class-specific constraints
|
||||
|
||||
Do NOT add `slot_usage` just to document that it replaces an old slot.
|
||||
|
||||
## Recording in slot_fixes.yaml
|
||||
|
||||
When marking a slot as processed:
|
||||
|
||||
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/actual_start
  processed:
    status: true
    timestamp: '2026-01-14T16:00:00Z'
    session: "session-2026-01-14-type-migration"
    notes: "FULLY MIGRATED: TemporaryLocation - actual_start REMOVED, using temporal_extent with TimeSpan.begin_of_the_begin (Rule 53)"
```
|
||||
|
||||
Note the "FULLY MIGRATED" prefix in notes to confirm this was a complete removal, not a deprecation-in-place.
|
||||
|
||||
---
|
||||
|
||||
## ⚠️ Common Mistakes to Avoid ⚠️
|
||||
|
||||
### Mistake 1: Substituting Different Slots
|
||||
|
||||
**slot_fixes.yaml specifies**:
|
||||
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/actual_start
  revision:
    - label: begin_of_the_begin  # ← MUST USE THIS
      type: slot
    - label: TimeSpan  # ← WITH THIS CLASS
      type: class
```
|
||||
|
||||
| Action | Status |
|--------|--------|
| Using `begin_of_the_begin` with `TimeSpan` | ✅ CORRECT |
| Using `has_actual_start_date` (invented) | ❌ WRONG |
| Using `start_date` (different slot) | ❌ WRONG |
| Using `begin_of_the_begin` WITHOUT `TimeSpan` | ❌ WRONG (incomplete) |
|
||||
|
||||
### Mistake 2: Partial Application
|
||||
|
||||
The revision often specifies MULTIPLE components that work together:
|
||||
|
||||
```yaml
revision:
  - label: has_or_had_type  # ← Slot for linking
    type: slot
  - label: BackupType  # ← Abstract base class
    type: class
  - label: includes_or_included  # ← Slot for hierarchy
    type: slot
  - label: BackupTypes  # ← Concrete subclasses
    type: class
```
|
||||
|
||||
**All four components** are part of the migration. Don't just use `has_or_had_type` and ignore the class structure.
|
||||
|
||||
### Mistake 3: Not Using the `temporal_extent` Slot Correctly
|
||||
|
||||
When `slot_fixes.yaml` specifies TimeSpan-based revision:
|
||||
|
||||
```yaml
revision:
  - label: begin_of_the_begin
    type: slot
  - label: TimeSpan
    type: class
```
|
||||
|
||||
This means: **Use the `temporal_extent` slot** (which has `range: TimeSpan`) and access the temporal bounds via TimeSpan's slots:
|
||||
|
||||
```yaml
# CORRECT: Use temporal_extent with TimeSpan structure
temporal_extent:
  begin_of_the_begin: '2020-06-15'
  end_of_the_end: '2022-03-15'

# WRONG: Create new has_actual_start_date slot
has_actual_start_date: '2020-06-15'  # ❌ Not in revision!
```
|
||||
|
||||
### Mistake 4: Not Updating Examples
|
||||
|
||||
When migrating slots, **update ALL examples** in the class file:
|
||||
- Description examples (in class description)
|
||||
- slot_usage examples
|
||||
- Class-level examples (at bottom of file)
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
Before marking a slot as processed:
|
||||
|
||||
- [ ] Read the `revision` section completely
|
||||
- [ ] Identified ALL slots and classes in revision
|
||||
- [ ] Removed old slot from imports
|
||||
- [ ] Removed old slot from slots list
|
||||
- [ ] Removed old slot from slot_usage
|
||||
- [ ] Added new slot(s) per revision
|
||||
- [ ] Added new class import(s) per revision
|
||||
- [ ] Updated ALL examples to use new slots
|
||||
- [ ] Validated with `linkml-lint` or `gen-owl`
|
||||
- [ ] Updated `slot_fixes.yaml` with:
  - `status: true`
  - `timestamp` (ISO 8601)
  - `session` identifier
  - `notes` with "FULLY MIGRATED:" prefix
|
||||
|
||||
---
|
||||
|
||||
## See Also
|
||||
|
||||
- Rule 9: Enum-to-Class Promotion (single source of truth principle)
|
||||
- Rule 0b: Type/Types File Naming Convention
|
||||
- Rule: Slot Naming Convention (Current Style)
|
||||
- `.opencode/ENUM_TO_CLASS_PRINCIPLE.md`
|
||||
- `schemas/20251121/linkml/modules/slots/slot_fixes.yaml` - **AUTHORITATIVE** master list of migrations
|
||||
|
|
@ -1,129 +0,0 @@
|
|||
# Rule: Generic Slots, Specific Classes
|
||||
|
||||
**Identifier**: `generic-slots-specific-classes`
|
||||
**Severity**: **CRITICAL**
|
||||
|
||||
## Core Principle
|
||||
|
||||
**Slots MUST be generic predicates** that can be reused across multiple classes. **Classes MUST be specific** to provide context and constraints.
|
||||
|
||||
**DO NOT** create class-specific slots when a generic predicate can be used.
|
||||
|
||||
## Rationale
|
||||
|
||||
1. **Predicate Proliferation**: Creating bespoke slots for every class explodes the schema size (e.g., `has_museum_name`, `has_library_name`, `has_archive_name` instead of `has_name`).
|
||||
2. **Interoperability**: Generic predicates (`has_name`, `has_identifier`, `has_part`) map cleanly to standard ontologies (Schema.org, Dublin Core, RiC-O).
|
||||
3. **Querying**: It's easier to query "all entities with a name" than "all entities with museum_name OR library_name OR archive_name".
|
||||
4. **Maintenance**: Updating one generic slot propagates to all classes.
|
||||
|
||||
## Examples
|
||||
|
||||
### ❌ Anti-Pattern: Class-Specific Slots
|
||||
|
||||
```yaml
# WRONG: Creating specific slots for each class
slots:
  has_museum_visitor_count:
    range: integer
  has_library_patron_count:
    range: integer

classes:
  Museum:
    slots:
      - has_museum_visitor_count
  Library:
    slots:
      - has_library_patron_count
```
|
||||
|
||||
### ✅ Correct Pattern: Generic Slot, Specific Class Usage
|
||||
|
||||
```yaml
# CORRECT: One generic slot reused
slots:
  has_or_had_quantity:
    slot_uri: rico:hasOrHadQuantity
    range: Quantity
    multivalued: true

classes:
  Museum:
    slots:
      - has_or_had_quantity
    slot_usage:
      has_or_had_quantity:
        description: The number of visitors to the museum.

  Library:
    slots:
      - has_or_had_quantity
    slot_usage:
      has_or_had_quantity:
        description: The number of registered patrons.
```
|
||||
|
||||
## Intermediate Class Pattern
|
||||
|
||||
Making slots generic often requires introducing **Intermediate Classes** to hold structured data, rather than flattening attributes onto the parent class.
|
||||
|
||||
### ❌ Anti-Pattern: Specific Flattened Slots
|
||||
|
||||
```yaml
|
||||
# WRONG: Flattened specific attributes
|
||||
classes:
|
||||
Museum:
|
||||
slots:
|
||||
- has_museum_budget_amount
|
||||
- has_museum_budget_currency
|
||||
- has_museum_budget_year
|
||||
```
|
||||
|
||||
### ✅ Correct Pattern: Generic Slot + Intermediate Class
|
||||
|
||||
```yaml
|
||||
# CORRECT: Generic slot pointing to structured class
|
||||
slots:
|
||||
has_or_had_budget:
|
||||
range: Budget
|
||||
multivalued: true
|
||||
|
||||
classes:
|
||||
Museum:
|
||||
slots:
|
||||
- has_or_had_budget
|
||||
|
||||
Budget:
|
||||
slots:
|
||||
- has_or_had_amount
|
||||
- has_or_had_currency
|
||||
- has_or_had_year
|
||||
```
|
||||
|
||||
## Specificity Levels
|
||||
|
||||
| Level | Component | Example |
|
||||
|-------|-----------|---------|
|
||||
| **Generic** | **Slot (Predicate)** | `has_or_had_identifier` |
|
||||
| **Specific** | **Class (Subject/Object)** | `ISILCode` |
|
||||
| **Specific** | **Slot Usage (Context)** | "The ISIL code assigned to this library" |
|
||||
|
||||
## Migration Guide
|
||||
|
||||
If you encounter an overly specific slot:
|
||||
|
||||
1. **Identify the generic concept** (e.g., `has_museum_opening_hours` → `has_opening_hours`).
|
||||
2. **Check if a generic slot exists** in `modules/slots/`.
|
||||
3. **If yes**, use the generic slot and add `slot_usage` to the class.
|
||||
4. **If no**, create the **generic** slot, not a specific one.
|
||||
|
||||
## Naming Indicators
|
||||
|
||||
**Reject slots containing:**
|
||||
* Class names (e.g., `has_custodian_name` → `has_name`)
|
||||
* Narrow types (e.g., `has_isbn_identifier` → `has_identifier`)
|
||||
* Contextual specifics (e.g., `has_primary_email` → `has_email` + type/role)
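
The class-name indicator above lends itself to a mechanical check. The following is a minimal Python sketch, assuming slot names are snake_case and class names are CamelCase; `camel_to_snake` and `find_class_specific_slots` are hypothetical helpers for illustration and are not part of LinkML or of this project's tooling. Acronym-style class names (e.g., `ISILCode`) would need a smarter conversion than the one shown.

```python
import re


def camel_to_snake(name: str) -> str:
    """Convert a CamelCase class name to snake_case ('RecordSet' -> 'record_set')."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def find_class_specific_slots(slot_names, class_names):
    """Return {slot_name: class_name} for slots whose name embeds a class name."""
    flagged = {}
    for slot in slot_names:
        # Pad with underscores so only whole snake_case tokens match.
        padded = f"_{slot}_"
        for cls in class_names:
            if f"_{camel_to_snake(cls)}_" in padded:
                flagged[slot] = cls
                break
    return flagged
```

A flagged slot is only a candidate for review: the narrow-type and contextual indicators still need human judgment.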
|
||||
|
||||
## See Also
|
||||
* Rule 55: Broaden Generic Predicate Ranges
|
||||
* Rule: Slot Naming Convention (Current Style)
|
||||
|
|
@ -1,157 +0,0 @@
|
|||
# Rule 59: LinkML Union Types Require `range: Any`
|
||||
|
||||
🚨 **CRITICAL**: When using `any_of` for union types in LinkML, you MUST also specify `range: Any` at the attribute level. Without it, the union type validation does NOT work.
|
||||
|
||||
## The Problem
|
||||
|
||||
LinkML's `any_of` construct allows defining slots that accept multiple types (e.g., string OR integer). However, there's a critical implementation detail:
|
||||
|
||||
**Without `range: Any`, the `any_of` constraint is silently ignored during validation.**
|
||||
|
||||
This leads to validation failures where data that should be valid (e.g., integer value in a string/integer union field) is rejected.
|
||||
|
||||
## Correct Pattern
|
||||
|
||||
```yaml
|
||||
slots:
|
||||
identifier_value:
|
||||
range: Any # ← REQUIRED for any_of to work
|
||||
any_of:
|
||||
- range: string
|
||||
- range: integer
|
||||
description: The identifier value (can be string or integer)
|
||||
```
|
||||
|
||||
## Incorrect Pattern (WILL FAIL)
|
||||
|
||||
```yaml
|
||||
slots:
|
||||
identifier_value:
|
||||
# Missing range: Any - validation will fail!
|
||||
any_of:
|
||||
- range: string
|
||||
- range: integer
|
||||
description: The identifier value (can be string or integer)
|
||||
```
|
||||
|
||||
## Common Use Cases
|
||||
|
||||
This pattern is required for:
|
||||
|
||||
| Use Case | Types | Example Fields |
|
||||
|----------|-------|----------------|
|
||||
| Identifier values | string \| integer | `identifier_value`, `geonames_id`, `viaf_id` |
|
||||
| Social media IDs | string \| array | `youtube_channel_id`, `facebook_id`, `twitter_username` |
|
||||
| Flexible identifiers | object \| array | `identifiers` (dict or list format) |
|
||||
| Numeric strings | string \| integer | `postal_code`, `kvk_number` |
|
||||
|
||||
## Real-World Examples from GLAM Schema
|
||||
|
||||
### Example 1: OriginalEntryIdentifier.yaml
|
||||
|
||||
```yaml
|
||||
# Before (BROKEN):
|
||||
attributes:
|
||||
identifier_value:
|
||||
any_of:
|
||||
- range: string
|
||||
- range: integer
|
||||
|
||||
# After (WORKING):
|
||||
attributes:
|
||||
identifier_value:
|
||||
range: Any # Added
|
||||
any_of:
|
||||
- range: string
|
||||
- range: integer
|
||||
```
|
||||
|
||||
### Example 2: WikidataSocialMedia.yaml
|
||||
|
||||
```yaml
|
||||
# Social media fields that can be single value or array
|
||||
attributes:
|
||||
youtube_channel_id:
|
||||
range: Any # Required for string|array union
|
||||
any_of:
|
||||
- range: string
|
||||
- range: string
|
||||
multivalued: true
|
||||
description: YouTube channel ID (single value or array)
|
||||
|
||||
facebook_id:
|
||||
range: Any
|
||||
any_of:
|
||||
- range: string
|
||||
- range: string
|
||||
multivalued: true
|
||||
```
|
||||
|
||||
### Example 3: OriginalEntry.yaml (object|array union)
|
||||
|
||||
```yaml
|
||||
# identifiers field that accepts both dict and array formats
|
||||
attributes:
|
||||
identifiers:
|
||||
range: Any # Required for flexible typing
|
||||
description: >-
|
||||
Identifiers from original source. Accepts both dict format
|
||||
(e.g., {isil: "XX-123"}) and array format
|
||||
(e.g., [{scheme: "isil", value: "XX-123"}])
|
||||
```
|
||||
|
||||
### Example 4: OriginalEntryLocation.yaml
|
||||
|
||||
```yaml
|
||||
attributes:
|
||||
geonames_id:
|
||||
range: Any # Required for string|integer
|
||||
any_of:
|
||||
- range: string
|
||||
- range: integer
|
||||
description: GeoNames ID (may be string or integer depending on source)
|
||||
```
|
||||
|
||||
## Validation Behavior
|
||||
|
||||
| Schema Definition | Integer Data | String Data | Result |
|
||||
|-------------------|--------------|-------------|--------|
|
||||
| `range: string` | ❌ FAIL | ✅ PASS | Strict string only |
|
||||
| `range: integer` | ✅ PASS | ❌ FAIL | Strict integer only |
|
||||
| `any_of` without `range: Any` | ❌ FAIL | ❌ FAIL | Broken - nothing works |
|
||||
| `any_of` with `range: Any` | ✅ PASS | ✅ PASS | Correct union behavior |
|
||||
|
||||
## Why This Happens
|
||||
|
||||
LinkML's validation engine processes `range` first to determine the basic type constraint. When `range` is not specified (or defaults to `string`), it applies that constraint before checking `any_of`. The `range: Any` tells the validator to defer type checking to the `any_of` constraints.
|
||||
|
||||
## Checklist for Union Types
|
||||
|
||||
When adding a field that accepts multiple types:
|
||||
|
||||
- [ ] Define the `any_of` block with all acceptable ranges
|
||||
- [ ] Add `range: Any` at the same level as `any_of`
|
||||
- [ ] Test with sample data of each type
|
||||
- [ ] Document the accepted types in the description
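
The second checklist item can be verified automatically. The sketch below is an assumption-laden illustration, not project tooling: it takes a schema that has already been parsed into a plain Python dict and reports every top-level slot, class attribute, or `slot_usage` entry that declares `any_of` without `range: Any`. The function name `find_unions_missing_any` is hypothetical. A `slot_usage` entry may be reported even when the base slot already sets `range: Any`, so treat the output as a review list.

```python
def find_unions_missing_any(schema: dict) -> list:
    """Return 'Owner.slot' paths whose definition has any_of but not range: Any."""
    missing = []

    def check(owner, definitions):
        for name, definition in (definitions or {}).items():
            if not isinstance(definition, dict):
                continue
            if "any_of" in definition and definition.get("range") != "Any":
                missing.append(f"{owner}.{name}")

    # Top-level slots first, then per-class attributes and slot_usage overrides.
    check("slots", schema.get("slots"))
    for class_name, class_def in (schema.get("classes") or {}).items():
        class_def = class_def or {}
        check(class_name, class_def.get("attributes"))
        check(class_name, class_def.get("slot_usage"))
    return missing
```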
|
||||
|
||||
## See Also
|
||||
|
||||
- LinkML Documentation: [Union Types](https://linkml.io/linkml/schemas/advanced.html#union-types)
|
||||
- GLAM Validation: `schemas/20251121/linkml/modules/classes/CustodianSourceFile.yaml`
|
||||
- Validation command: `linkml-validate -s <schema>.yaml <data>.yaml`
|
||||
|
||||
## Migration Notes
|
||||
|
||||
**Affected Files (Fixed January 2026)**:
|
||||
- `OriginalEntryIdentifier.yaml` - `identifier_value`
|
||||
- `Identifier.yaml` - `identifier_value` slot_usage
|
||||
- `WikidataSocialMedia.yaml` - `youtube_channel_id`, `facebook_id`, `instagram_username`, `linkedin_company_id`, `twitter_username`, `facebook_page_id`
|
||||
- `YoutubeEnrichment.yaml` - `channel_id`
|
||||
- `OriginalEntryLocation.yaml` - `geonames_id`
|
||||
- `OriginalEntry.yaml` - `identifiers`
|
||||
|
||||
---
|
||||
|
||||
**Version**: 1.0
|
||||
**Created**: 2026-01-18
|
||||
**Author**: AI Agent (OpenCode Claude)
|
||||
|
|
@ -1,181 +0,0 @@
|
|||
# LinkML YAML Best Practices Rule
|
||||
|
||||
## Rule: Follow LinkML Conventions for Valid, Interoperable Schema Files
|
||||
|
||||
### 1. equals_expression Anti-Pattern
|
||||
|
||||
`equals_expression` is for dynamic formula evaluation (e.g., `"{age_in_years} * 12"`). Never use it for static value constraints.
|
||||
|
||||
**WRONG:**
|
||||
```yaml
|
||||
slot_usage:
|
||||
has_type:
|
||||
equals_expression: '["hc:ArchiveOrganizationType"]'
|
||||
hold_record_set:
|
||||
equals_expression: '["hc:Fonds", "hc:Series"]'
|
||||
```
|
||||
|
||||
**CORRECT** (single value):
|
||||
```yaml
|
||||
slot_usage:
|
||||
has_type:
|
||||
equals_string: "hc:ArchiveOrganizationType"
|
||||
```
|
||||
|
||||
**CORRECT** (multiple allowed values - if classes):
|
||||
```yaml
|
||||
slot_usage:
|
||||
hold_record_set:
|
||||
any_of:
|
||||
- range: UniversityAdministrativeFonds
|
||||
- range: StudentRecordSeries
|
||||
- range: FacultyPaperCollection
|
||||
```
|
||||
|
||||
**CORRECT** (multiple allowed values - if literals):
|
||||
```yaml
|
||||
slot_usage:
|
||||
status:
|
||||
equals_string_in:
|
||||
- "active"
|
||||
- "inactive"
|
||||
- "pending"
|
||||
```
|
||||
|
||||
### 2. Declare All Used Prefixes
|
||||
|
||||
Every CURIE prefix used in the file must be declared in the `prefixes:` block.
|
||||
|
||||
**WRONG:**
|
||||
```yaml
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
slot_usage:
|
||||
has_type:
|
||||
equals_string: "hc:ArchiveOrganizationType" # hc: not declared!
|
||||
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
default_prefix: hc
|
||||
slot_usage:
|
||||
has_type:
|
||||
equals_string: "hc:ArchiveOrganizationType"
|
||||
```
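
A rough way to catch undeclared prefixes is to collect every CURIE-shaped string value in a parsed schema and compare the prefixes against the `prefixes:` block. The Python sketch below is a heuristic under stated assumptions: the schema is already loaded as a dict, only string *values* are inspected (not keys), and anything of the form `prefix:local` without whitespace counts as a CURIE. The name `find_undeclared_prefixes` is hypothetical.

```python
import re

# prefix:local with no whitespace; "(?!//)" skips full URLs such as https://...
CURIE = re.compile(r"^([A-Za-z][\w.-]*):(?!//)\S+$")


def find_undeclared_prefixes(schema: dict) -> set:
    """Return prefixes used in CURIE-like string values but missing from prefixes:."""
    declared = set(schema.get("prefixes") or {})
    used = set()

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key != "prefixes":  # the declarations themselves are not usages
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)
        elif isinstance(node, str):
            match = CURIE.match(node)
            if match:
                used.add(match.group(1))

    walk(schema)
    return used - declared
```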
|
||||
|
||||
### 3. Import Referenced Classes
|
||||
|
||||
When using external classes in `is_a`, `range`, or other references, import them.
|
||||
|
||||
**WRONG:**
|
||||
```yaml
|
||||
imports:
|
||||
- linkml:types
|
||||
classes:
|
||||
AcademicArchive:
|
||||
is_a: ArchiveOrganizationType # Not imported!
|
||||
slot_usage:
|
||||
related_to:
|
||||
range: WikidataAlignment # Not imported!
|
||||
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../classes/ArchiveOrganizationType
|
||||
- ../classes/WikidataAlignment
|
||||
classes:
|
||||
AcademicArchive:
|
||||
is_a: ArchiveOrganizationType
|
||||
slot_usage:
|
||||
related_to:
|
||||
range: WikidataAlignment
|
||||
```
|
||||
|
||||
### 4. Quote Regex Patterns and Annotation Values
|
||||
|
||||
**Regex patterns:**
|
||||
```yaml
|
||||
# WRONG
|
||||
pattern: ^Q[0-9]+$
|
||||
|
||||
# CORRECT
|
||||
pattern: "^Q[0-9]+$"
|
||||
```
|
||||
|
||||
**Annotation values (must be strings):**
|
||||
```yaml
|
||||
# WRONG
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
|
||||
# CORRECT
|
||||
annotations:
|
||||
specificity_score: "0.1"
|
||||
```
|
||||
|
||||
### 5. Remove Unused Imports
|
||||
|
||||
Only import slots and classes that are actually used in the file.
|
||||
|
||||
**WRONG:**
|
||||
```yaml
|
||||
imports:
|
||||
- ../slots/has_scope # Never used in slots: or slot_usage:
|
||||
- ../slots/has_score
|
||||
- ../slots/has_type
|
||||
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
|
||||
imports:
|
||||
- ../slots/has_score
|
||||
- ../slots/has_type
|
||||
```
|
||||
|
||||
### 6. Slot Usage Requires Slot Presence
|
||||
|
||||
A slot referenced in `slot_usage:` must either be:
|
||||
- Listed in the `slots:` array, OR
|
||||
- Inherited from a parent class via `is_a`
|
||||
|
||||
**WRONG:**
|
||||
```yaml
|
||||
classes:
|
||||
MyClass:
|
||||
slots:
|
||||
- has_type
|
||||
slot_usage:
|
||||
has_type: {...}
|
||||
identified_by: {...} # Not in slots: and not inherited!
|
||||
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
|
||||
classes:
|
||||
MyClass:
|
||||
slots:
|
||||
- has_type
|
||||
- identified_by
|
||||
slot_usage:
|
||||
has_type: {...}
|
||||
identified_by: {...}
|
||||
```
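
The presence requirement can be checked by walking the `is_a` chain. This is a simplified Python sketch, assuming all classes are available in one dict keyed by class name; it considers `slots:`, inline `attributes:`, and single inheritance only, and deliberately ignores mixins. `find_orphan_slot_usage` is a hypothetical name.

```python
def find_orphan_slot_usage(classes: dict) -> dict:
    """Return {class_name: [slot, ...]} for slot_usage keys not declared or inherited."""

    def available_slots(name, seen=()):
        class_def = classes.get(name) or {}
        slots = set(class_def.get("slots") or [])
        slots.update((class_def.get("attributes") or {}).keys())
        parent = class_def.get("is_a")
        if parent and parent not in seen:  # guard against is_a cycles
            slots |= available_slots(parent, seen + (name,))
        return slots

    orphans = {}
    for name, class_def in classes.items():
        usage = (class_def or {}).get("slot_usage") or {}
        missing = sorted(set(usage) - available_slots(name))
        if missing:
            orphans[name] = missing
    return orphans
```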
|
||||
|
||||
## Checklist for Class Files
|
||||
|
||||
- [ ] All prefixes used in CURIEs are declared
|
||||
- [ ] `default_prefix` set if module belongs to that namespace
|
||||
- [ ] All referenced classes are imported
|
||||
- [ ] All used slots are imported
|
||||
- [ ] No `equals_expression` with static JSON arrays
|
||||
- [ ] Regex patterns are quoted
|
||||
- [ ] Annotation values are quoted strings
|
||||
- [ ] No unused imports
|
||||
- [ ] `slot_usage` only references slots that exist (via slots: or inheritance)
|
||||
|
|
@ -1,185 +0,0 @@
|
|||
# Mapping Specificity Rule: Broad vs Narrow vs Exact Mappings
|
||||
|
||||
## 🚨 CRITICAL: Mapping Semantics
|
||||
|
||||
When mapping LinkML classes to external ontologies, you MUST distinguish between **equivalence**, **hypernyms** (broader concepts), and **hyponyms** (narrower concepts).
|
||||
|
||||
### The Rule
|
||||
|
||||
1. **Exact Mappings (`skos:exactMatch`)**: Use ONLY when the external concept is **semantically equivalent** to your class.
|
||||
* *Example*: `hc:Person` `exact_mappings` `schema:Person`.
|
||||
* **CRITICAL**: Exact means the SAME semantic scope - neither broader nor narrower!
|
||||
* **DO NOT AVOID EXACT BY DEFAULT**: If equivalence is verified (including class/property category match and ontology definition review), `exact_mappings` SHOULD be used.
|
||||
|
||||
2. **Broad Mappings (`skos:broadMatch`)**: Use when the external concept is a **hypernym** (a broader, more general category) of your class.
|
||||
* *Example*: `hc:AcademicArchiveRecordSetType` `broad_mappings` `rico:RecordSetType`.
|
||||
* *Rationale*: An academic archive record set *is a* record set type, but `rico:RecordSetType` is broader.
|
||||
* *Common Hypernyms*: `skos:Concept`, `prov:Entity`, `prov:Activity`, `schema:Thing`, `schema:Organization`, `schema:Action`, `rico:RecordSetType`, `crm:E55_Type`.
|
||||
|
||||
3. **Narrow Mappings (`skos:narrowMatch`)**: Use when the external concept is a **hyponym** (a narrower, more specific category) of your class.
|
||||
* *Example*: `hc:Organization` `narrow_mappings` `hc:Library` (if mapping inversely).
|
||||
|
||||
4. **Close Mappings (`skos:closeMatch`)**: Use when the external concept is similar but not exactly equivalent.
|
||||
* *Example*: `hc:AccessPolicy` `close_mappings` `dcterms:accessRights` (related but different scope).
|
||||
|
||||
5. **Related Mappings (`skos:relatedMatch`)**: Use for non-hierarchical relationships.
|
||||
* *Example*: `hc:Collection` `related_mappings` `rico:RecordSet`.
|
||||
|
||||
### 🚨 Type Compatibility Rule
|
||||
|
||||
**Classes map to classes, properties map to properties.** Never mix types in mappings.
|
||||
|
||||
| Your Element | Valid Mapping Target |
|
||||
|--------------|---------------------|
|
||||
| Class | Class (owl:Class, rdfs:Class) |
|
||||
| Slot | Property (owl:ObjectProperty, owl:DatatypeProperty, rdf:Property) |
|
||||
|
||||
❌ **WRONG**:
|
||||
```yaml
|
||||
# AccessApplication is a CLASS, schema:Action is a CLASS - but Action is BROADER
|
||||
AccessApplication:
|
||||
exact_mappings:
|
||||
- schema:Action # WRONG: Action is a hypernym, not equivalent
|
||||
```
|
||||
|
||||
✅ **CORRECT**:
|
||||
```yaml
|
||||
AccessApplication:
|
||||
broad_mappings:
|
||||
- schema:Action # CORRECT: Action is the broader category
|
||||
```
|
||||
|
||||
### 🚨 No Self/Internal Exact Mappings
|
||||
|
||||
`exact_mappings` MUST NOT contain self-references or internal HC class references for the same concept.
|
||||
|
||||
❌ **WRONG**:
|
||||
```yaml
|
||||
AcademicArchive:
|
||||
exact_mappings:
|
||||
- hc:AcademicArchive # Self/internal reference; not an external equivalence mapping
|
||||
```
|
||||
|
||||
✅ **CORRECT**:
|
||||
```yaml
|
||||
AcademicArchive:
|
||||
exact_mappings:
|
||||
- wd:Q27032435 # External concept with equivalent semantic scope
|
||||
```
|
||||
|
||||
Use `exact_mappings` only for equivalent terms in external ontologies or external controlled vocabularies, not for repeating the class itself.
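
Self or internal references are easy to detect by prefix. A minimal Python sketch, assuming classes are available as a parsed dict and that `hc` is the internal prefix (as in the examples above); `find_internal_exact_mappings` is a hypothetical helper, not existing tooling.

```python
def find_internal_exact_mappings(classes: dict, internal_prefix: str = "hc") -> dict:
    """Return {class_name: [uri, ...]} for exact_mappings that point into our own namespace."""
    flagged = {}
    for name, class_def in classes.items():
        mappings = (class_def or {}).get("exact_mappings") or []
        hits = [uri for uri in mappings if uri.startswith(internal_prefix + ":")]
        if hits:
            flagged[name] = hits
    return flagged
```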
|
||||
|
||||
### ✅ Positive Guidance: When Exact Mapping Is Correct
|
||||
|
||||
Use `exact_mappings` when all checks below pass:
|
||||
|
||||
- Semantic scope is equivalent (not parent/child, not merely similar)
|
||||
- Ontological category matches (Class↔Class, Slot↔Property)
|
||||
- Target term is verified in the ontology source files under `data/ontology/` or verified Wikidata entity metadata
|
||||
- No self/internal duplication (no `hc:` self-reference for the same concept)
|
||||
|
||||
✅ **CORRECT**:
|
||||
```yaml
|
||||
Person:
|
||||
exact_mappings:
|
||||
- schema:Person
|
||||
|
||||
Acquisition:
|
||||
exact_mappings:
|
||||
- crm:E8_Acquisition
|
||||
```
|
||||
|
||||
Do not downgrade a truly equivalent mapping to `close_mappings` or `broad_mappings` just to be conservative.
|
||||
|
||||
### Common Hypernyms That Are NEVER Exact Mappings
|
||||
|
||||
These terms are always BROADER than your specific class - never use them as `exact_mappings`:
|
||||
|
||||
| Hypernym | What It Means | Use Instead |
|
||||
|----------|---------------|-------------|
|
||||
| `schema:Action` | Any action | `broad_mappings` |
|
||||
| `schema:Organization` | Any organization | `broad_mappings` |
|
||||
| `schema:Thing` | Anything at all | `broad_mappings` |
|
||||
| `schema:PropertyValue` | Any property value | `broad_mappings` |
|
||||
| `schema:Permit` | Any permit | `broad_mappings` |
|
||||
| `prov:Activity` | Any activity | `broad_mappings` |
|
||||
| `prov:Entity` | Any entity | `broad_mappings` |
|
||||
| `skos:Concept` | Any concept | `broad_mappings` |
|
||||
| `crm:E55_Type` | Any type classification | `broad_mappings` |
|
||||
| `crm:E42_Identifier` | Any identifier | `broad_mappings` |
|
||||
| `rico:Identifier` | Any identifier | `broad_mappings` |
|
||||
| `dcat:DataService` | Any data service | `broad_mappings` |
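
The table above can double as a lint list. The Python sketch below copies the twelve hypernyms from the table into a set and reports any class that uses one of them in `exact_mappings`. It assumes classes are already parsed into a dict; `KNOWN_HYPERNYMS` and `find_hypernym_exact_mappings` are illustrative names, not part of the project.

```python
# Copied from the hypernym table; extend as new generic terms are identified.
KNOWN_HYPERNYMS = frozenset({
    "schema:Action", "schema:Organization", "schema:Thing",
    "schema:PropertyValue", "schema:Permit",
    "prov:Activity", "prov:Entity", "skos:Concept",
    "crm:E55_Type", "crm:E42_Identifier",
    "rico:Identifier", "dcat:DataService",
})


def find_hypernym_exact_mappings(classes: dict) -> dict:
    """Return {class_name: [uri, ...]} where exact_mappings contains a known hypernym."""
    flagged = {}
    for name, class_def in classes.items():
        mappings = (class_def or {}).get("exact_mappings") or []
        hits = [uri for uri in mappings if uri in KNOWN_HYPERNYMS]
        if hits:
            flagged[name] = hits
    return flagged
```

Each hit should be moved to `broad_mappings` after confirming the relationship against the ontology definition.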
|
||||
|
||||
### Common Violations to Avoid
|
||||
|
||||
❌ **WRONG**:
|
||||
```yaml
|
||||
AcademicArchiveRecordSetType:
|
||||
exact_mappings:
|
||||
- rico:RecordSetType # WRONG: This implies AcademicArchiveRecordSetType == RecordSetType
|
||||
```
|
||||
|
||||
✅ **CORRECT**:
|
||||
```yaml
|
||||
AcademicArchiveRecordSetType:
|
||||
broad_mappings:
|
||||
- rico:RecordSetType # CORRECT: RecordSetType is broader
|
||||
```
|
||||
|
||||
❌ **WRONG**:
|
||||
```yaml
|
||||
SocialMovement:
|
||||
exact_mappings:
|
||||
- schema:Organization # WRONG: SocialMovement is a specific TYPE of Organization
|
||||
```
|
||||
|
||||
✅ **CORRECT**:
|
||||
```yaml
|
||||
SocialMovement:
|
||||
broad_mappings:
|
||||
- schema:Organization # CORRECT
|
||||
```
|
||||
|
||||
❌ **WRONG**:
|
||||
```yaml
|
||||
AccessApplication:
|
||||
exact_mappings:
|
||||
- schema:Action # WRONG: Action is a hypernym
|
||||
```
|
||||
|
||||
✅ **CORRECT**:
|
||||
```yaml
|
||||
AccessApplication:
|
||||
broad_mappings:
|
||||
- schema:Action # CORRECT: Action is the broader category
|
||||
```
|
||||
|
||||
### How to Determine Mapping Type
|
||||
|
||||
Ask these questions:
|
||||
|
||||
1. **Is it the SAME thing?** → `exact_mappings`
|
||||
- "Could I swap these two terms in any context without changing meaning?"
|
||||
- If NO, it's not an exact mapping
|
||||
|
||||
2. **Is the external term a PARENT category?** → `broad_mappings`
|
||||
- "Is my class a TYPE OF the external term?"
|
||||
- Example: AccessApplication IS-A Action
|
||||
|
||||
3. **Is the external term a CHILD category?** → `narrow_mappings`
|
||||
- "Is the external term a TYPE OF my class?"
|
||||
- Example: Library IS-A Organization (so Organization has narrow_mapping to Library)
|
||||
|
||||
4. **Is it similar but not hierarchical?** → `close_mappings`
|
||||
- "Related but not equivalent or hierarchical"
|
||||
|
||||
5. **Is there some other relationship?** → `related_mappings`
|
||||
- "Connected in some way"
|
||||
|
||||
### Verification Checklist
|
||||
|
||||
- [ ] Does the `exact_mapping` represent the **exact same scope**?
|
||||
- [ ] Is the external term a generic parent class (e.g., `Type`, `Concept`, `Entity`, `Action`, `Activity`, `Organization`)? → Move to `broad_mappings`
|
||||
- [ ] Is the external term a specific instance or subclass? → Check `narrow_mappings`
|
||||
- [ ] Is the external term the same type (class→class, property→property)?
|
||||
- [ ] Would swapping the terms change the meaning? If yes, not an `exact_mapping`
|
||||
|
|
@ -1,177 +0,0 @@
|
|||
# Rule: Multilingual Support Requirements
|
||||
|
||||
## Overview
|
||||
|
||||
All LinkML slot files MUST include multilingual support with translations in the following languages:
|
||||
|
||||
| Code | Language | Required |
|
||||
|------|----------|----------|
|
||||
| `nl` | Dutch | ✅ Yes |
|
||||
| `de` | German | ✅ Yes |
|
||||
| `fr` | French | ✅ Yes |
|
||||
| `ar` | Arabic | ✅ Yes |
|
||||
| `id` | Indonesian | ✅ Yes |
|
||||
| `zh` | Chinese (Simplified) | ✅ Yes |
|
||||
| `es` | Spanish | ✅ Yes |
|
||||
|
||||
---
|
||||
|
||||
## Required Multilingual Fields
|
||||
|
||||
### 1. `alt_descriptions`
|
||||
|
||||
Provide faithful translations of the English `description` field:
|
||||
|
||||
```yaml
|
||||
slots:
|
||||
my_slot:
|
||||
description: >-
|
||||
To possess a specific structural arrangement or encoding standard.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Het bezitten van een specifieke structurele rangschikking of coderingsstandaard.
|
||||
de: >-
|
||||
Das Besitzen einer spezifischen strukturellen Anordnung oder eines Kodierungsstandards.
|
||||
fr: >-
|
||||
Posséder un arrangement structurel spécifique ou une norme de codage.
|
||||
ar: >-
|
||||
امتلاك ترتيب هيكلي محدد أو معيار ترميز.
|
||||
id: >-
|
||||
Memiliki susunan struktural tertentu atau standar pengkodean.
|
||||
zh: >-
|
||||
拥有特定的结构安排或编码标准。
|
||||
es: >-
|
||||
Poseer una disposición estructural específica o un estándar de codificación.
|
||||
```
|
||||
|
||||
### 2. `structured_aliases`
|
||||
|
||||
Provide translated slot names/labels for each language:
|
||||
|
||||
```yaml
|
||||
slots:
|
||||
has_format:
|
||||
structured_aliases:
|
||||
- literal_form: heeft formaat
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: nl
|
||||
- literal_form: hat Format
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: de
|
||||
- literal_form: a un format
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: fr
|
||||
- literal_form: لديه تنسيق
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: ar
|
||||
- literal_form: memiliki format
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: id
|
||||
- literal_form: 具有格式
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: zh
|
||||
- literal_form: tiene formato
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: es
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Translation Guidelines
|
||||
|
||||
### DO:
|
||||
- Translate the semantic meaning faithfully
|
||||
- Preserve technical precision
|
||||
- Use natural phrasing for each language
|
||||
- Keep translations concise (similar length to English)
|
||||
|
||||
### DON'T:
|
||||
- Paraphrase or expand beyond the original meaning
|
||||
- Add information not present in the English description
|
||||
- Use machine translation without review
|
||||
- Skip any of the required languages
|
||||
|
||||
---
|
||||
|
||||
## Complete Example
|
||||
|
||||
```yaml
|
||||
id: https://nde.nl/ontology/hc/slot/catalogue
|
||||
name: catalogue
|
||||
title: catalogue
|
||||
|
||||
slots:
|
||||
catalogue:
|
||||
slot_uri: crm:P70_documents
|
||||
description: >-
|
||||
To systematically record, classify, and organize items within a structured
|
||||
inventory or database for the purposes of documentation and retrieval.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Het systematisch vastleggen, classificeren en ordenen van items binnen een
|
||||
gestructureerde inventaris of database voor documentatie en terugvinding.
|
||||
de: >-
|
||||
Das systematische Erfassen, Klassifizieren und Ordnen von Objekten in einem
|
||||
strukturierten Inventar oder einer Datenbank für Dokumentation und Abruf.
|
||||
fr: >-
|
||||
Enregistrer, classer et organiser systématiquement des éléments dans un
|
||||
inventaire structuré ou une base de données à des fins de documentation et de récupération.
|
||||
ar: >-
|
||||
تسجيل وتصنيف وتنظيم العناصر بشكل منهجي ضمن جرد منظم أو قاعدة بيانات لأغراض التوثيق والاسترجاع.
|
||||
id: >-
|
||||
Mencatat, mengklasifikasikan, dan mengatur item secara sistematis dalam
|
||||
inventaris terstruktur atau database untuk tujuan dokumentasi dan pengambilan.
|
||||
zh: >-
|
||||
在结构化清单或数据库中系统地记录、分类和组织项目,以便于文档编制和检索。
|
||||
es: >-
|
||||
Registrar, clasificar y organizar sistemáticamente elementos dentro de un
|
||||
inventario estructurado o base de datos con fines de documentación y recuperación.
|
||||
structured_aliases:
|
||||
- literal_form: catalogiseren
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: nl
|
||||
- literal_form: katalogisieren
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: de
|
||||
- literal_form: cataloguer
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: fr
|
||||
- literal_form: فهرسة
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: ar
|
||||
- literal_form: mengkatalogkan
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: id
|
||||
- literal_form: 编目
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: zh
|
||||
- literal_form: catalogar
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: es
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Validation Checklist
|
||||
|
||||
Before completing a slot file, verify:
|
||||
|
||||
- [ ] `alt_descriptions` provided for all 7 languages (nl, de, fr, ar, id, zh, es)
|
||||
- [ ] `structured_aliases` provided for all 7 languages
|
||||
- [ ] Translations are faithful to the English original
|
||||
- [ ] No language is skipped or left empty
|
||||
- [ ] Arabic and Chinese characters render correctly
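
The first, second, and fourth checklist items can be automated. A minimal Python sketch, assuming a single slot definition has been parsed into a dict with `alt_descriptions` keyed by language code and `structured_aliases` as a list, as in the examples above; `missing_translations` is a hypothetical helper. It cannot judge translation fidelity or rendering, which still need human review.

```python
REQUIRED_LANGUAGES = ("nl", "de", "fr", "ar", "id", "zh", "es")


def missing_translations(slot_def: dict) -> dict:
    """Return the required language codes missing from each multilingual field."""
    descriptions = slot_def.get("alt_descriptions") or {}
    # Empty or whitespace-only translations count as missing.
    described = {
        lang for lang, text in descriptions.items() if text and str(text).strip()
    }
    aliased = {
        alias.get("in_language")
        for alias in slot_def.get("structured_aliases") or []
        if (alias.get("literal_form") or "").strip()
    }
    return {
        "alt_descriptions": [c for c in REQUIRED_LANGUAGES if c not in described],
        "structured_aliases": [c for c in REQUIRED_LANGUAGES if c not in aliased],
    }
```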
|
||||
|
||||
---
|
||||
|
||||
## See Also
|
||||
|
||||
- Rule 1: Preserve Original Descriptions (LINKML_EDITING_RULES.md)
|
||||
- Rule 2: Translation Accuracy (LINKML_EDITING_RULES.md)
|
||||
- Rule 3: Description Field Purity (LINKML_EDITING_RULES.md)
|
||||
|
||||
---
|
||||
|
||||
**Version**: 1.0.0
|
||||
**Created**: 2026-02-03
|
||||
**Author**: OpenCODE
|
||||
|
|
@ -1,24 +0,0 @@
|
|||
# Rule: No Autonomous Alias Assignment
|
||||
|
||||
**Status**: ACTIVE
|
||||
**Created**: 2026-02-10
|
||||
|
||||
## Rule
|
||||
|
||||
The agent MUST NOT assign aliases to canonical slot files on its own. Only the user decides which `new/` slot files are absorbed as aliases into which canonical slots.
|
||||
|
||||
## Rationale
|
||||
|
||||
Alias assignment is a semantic decision that determines the conceptual scope of a canonical slot. Incorrect alias assignment conflates distinct concepts. For example, `membership_criteria` (eligibility rules for joining) is not an alias of `has_mission` (organizational purpose), even though both relate to organizational governance.
|
||||
|
||||
## What the agent MUST do
|
||||
|
||||
1. When creating or polishing a canonical slot file, leave the `aliases` field empty unless the user has explicitly specified which aliases to include.
|
||||
2. When processing `new/` files, present candidates to the user and wait for their alias assignment decisions.
|
||||
3. Do NOT delete `new/` files until the user confirms the alias mapping.
|
||||
|
||||
## What the agent MUST NOT do
|
||||
|
||||
- Autonomously decide that a `new/` file should become an alias of a canonical slot.
|
||||
- Add alias entries without explicit user instruction.
|
||||
- Delete `new/` files based on self-determined alias assignments.
|
||||
|
|
@ -1,46 +0,0 @@
|
|||
# Rule: Do Not Delete From slot_fixes.yaml
|
||||
|
||||
**Identifier**: `no-deletion-from-slot-fixes`
|
||||
**Severity**: **CRITICAL**
|
||||
|
||||
## Core Directive
|
||||
|
||||
**NEVER delete entries from `slot_fixes.yaml`.**
|
||||
|
||||
The `slot_fixes.yaml` file serves as the historical record and audit trail for all schema migrations. Removing entries destroys this history and violates the project's data integrity principles.
|
||||
|
||||
## Workflow
|
||||
|
||||
When processing a migration:
|
||||
|
||||
1. **Do NOT Remove**: Never delete the entry for the slot you are working on.
|
||||
2. **Update `processed`**: Instead, update the `processed` block:
|
||||
* Set `status: true`.
|
||||
* Set `date` to the current date (YYYY-MM-DD).
|
||||
* Add a detailed `notes` string explaining what was done (e.g., "Fully migrated to [new_slot] + [Class] (Rule 53). [File].yaml updated. Slot archived.").
|
||||
3. **Preserve History**: The entry must remain in the file permanently as a record of the migration.
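
The workflow above amounts to an in-place update. The Python sketch below illustrates it on a list of entries already loaded from `slot_fixes.yaml`; the function name `mark_processed` and the error handling are assumptions for illustration, and reading or writing the YAML file is left out.

```python
from datetime import date
from typing import Optional


def mark_processed(entries: list, slot_id: str, notes: str,
                   today: Optional[date] = None) -> dict:
    """Update the matching entry in place; nothing is ever removed from `entries`."""
    for entry in entries:
        if entry.get("original_slot_id") == slot_id:
            processed = entry.get("processed") or {}
            processed["status"] = True
            processed["date"] = (today or date.today()).isoformat()  # YYYY-MM-DD
            processed["notes"] = notes
            entry["processed"] = processed
            return entry
    raise KeyError(f"no slot_fixes entry for {slot_id}")
```

Raising on an unknown `slot_id` is a design choice in this sketch: it avoids silently appending a new entry when the identifier is mistyped.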
|
||||
|
||||
## Rationale
|
||||
|
||||
* **Audit Trail**: We need to know what was migrated, when, and how.
|
||||
* **Reversibility**: If a migration introduces a bug, the record helps us understand the original state.
|
||||
* **Completeness**: The file tracks the total progress of the schema refactoring project.
|
||||
|
||||
## Example
|
||||
|
||||
**WRONG (Deletion)**:
|
||||
```yaml
|
||||
# DELETED from file
|
||||
# - original_slot_id: ...
|
||||
```
|
||||
|
||||
**CORRECT (Update)**:
|
||||
```yaml
|
||||
- original_slot_id: https://nde.nl/ontology/hc/slot/has_some_slot
|
||||
processed:
|
||||
status: true
|
||||
date: '2026-01-27'
|
||||
notes: Fully migrated to has_or_had_new_slot + NewClass (Rule 53).
|
||||
revision:
|
||||
...
|
||||
```
|
||||
|
|
@ -1,189 +0,0 @@
|
|||
# Rule 52: No Duplicate Ontology Mappings
|
||||
|
||||
## Summary
|
||||
|
||||
Each ontology URI MUST appear in only ONE mapping category per schema element. A URI cannot simultaneously have multiple semantic relationships to the same class or slot.
|
||||
|
||||
## The Problem
|
||||
|
||||
LinkML provides five mapping annotation types based on SKOS vocabulary alignment:
|
||||
|
||||
| Property | SKOS Predicate | Meaning |
|
||||
|----------|---------------|---------|
|
||||
| `exact_mappings` | `skos:exactMatch` | "This IS that" (equivalent) |
|
||||
| `close_mappings` | `skos:closeMatch` | "This is very similar to that" |
|
||||
| `related_mappings` | `skos:relatedMatch` | "This is conceptually related to that" |
|
||||
| `narrow_mappings` | `skos:narrowMatch` | "That is MORE SPECIFIC than this" (external term is narrower) |
|
||||
| `broad_mappings` | `skos:broadMatch` | "That is MORE GENERAL than this" (external term is broader) |
|
||||
|
||||
These relationships are **mutually exclusive**. A URI cannot simultaneously:
|
||||
- BE the element (`exact_mappings`) AND be broader than it (`broad_mappings`)
|
||||
- Be closely similar (`close_mappings`) AND be more general (`broad_mappings`)
|
||||
|
||||
## Anti-Pattern (WRONG)
|
||||
|
||||
```yaml
|
||||
# WRONG - schema:url appears in TWO mapping types
|
||||
slots:
|
||||
source_url:
|
||||
slot_uri: prov:atLocation
|
||||
exact_mappings:
|
||||
- schema:url # Says "source_url IS schema:url"
|
||||
broad_mappings:
|
||||
- schema:url # Says "schema:url is MORE GENERAL than source_url"
|
||||
```
|
||||
|
||||
This is a **logical contradiction**: `source_url` cannot simultaneously BE `schema:url` AND be more specific than `schema:url`.
|
||||
|
||||
## Correct Pattern
|
||||
|
||||
```yaml
|
||||
# CORRECT - each URI appears in only ONE mapping type
|
||||
slots:
|
||||
source_url:
|
||||
slot_uri: prov:atLocation
|
||||
exact_mappings:
|
||||
- schema:url # source_url IS schema:url
|
||||
close_mappings:
|
||||
- dcterms:source # Similar but not identical
|
||||
```
|
||||
|
||||
## Decision Guide: Which Mapping to Keep
|
||||
|
||||
When a URI appears in multiple categories, keep the **most precise** one:
|
||||
|
||||
### Precedence Order (keep the first match)
|
||||
|
||||
1. **exact_mappings** - Strongest claim: semantic equivalence
|
||||
2. **close_mappings** - Strong claim: nearly equivalent
|
||||
3. **narrow_mappings** / **broad_mappings** - Hierarchical relationship
|
||||
4. **related_mappings** - Weakest claim: conceptual association
|
||||
|
||||
### Decision Matrix
|
||||
|
||||
| If URI appears in... | Keep | Remove |
|
||||
|---------------------|------|--------|
|
||||
| exact + broad | exact | broad |
|
||||
| exact + close | exact | close |
|
||||
| exact + related | exact | related |
|
||||
| close + broad | close | broad |
|
||||
| close + related | close | related |
|
||||
| related + broad | broad | related |
|
||||
| narrow + broad | whichever the ontology definition supports | the other (contradictory!) |
|
||||
|
||||
### Special Case: narrow + broad
|
||||
|
||||
If a URI appears in BOTH `narrow_mappings` AND `broad_mappings`, this is a **data error** - the same URI cannot be both more specific AND more general. Investigate which is correct based on the ontology definition.
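
The precedence order and the narrow + broad special case can be sketched as a small helper. This is a minimal sketch under stated assumptions, not part of the project tooling: the names `RANK` and `resolve_duplicate_mappings` are hypothetical, each definition is assumed to be an already-parsed dict, and narrow/broad are given the same rank as in the precedence list.

```python
from collections import defaultdict

# Rank follows the precedence list above; narrow and broad share a rank.
RANK = {
    "exact_mappings": 0,
    "close_mappings": 1,
    "narrow_mappings": 2,
    "broad_mappings": 2,
    "related_mappings": 3,
}


def resolve_duplicate_mappings(defn):
    """Return (cleaned_definition, conflicts) for one slot or class definition."""
    uri_to_types = defaultdict(list)
    for mapping_type in RANK:
        for uri in defn.get(mapping_type) or []:
            uri_to_types[uri].append(mapping_type)

    # Copy every non-mapping key unchanged.
    cleaned = {key: value for key, value in defn.items() if key not in RANK}
    conflicts = []
    for uri, types in uri_to_types.items():
        if "narrow_mappings" in types and "broad_mappings" in types:
            # Data error: leave both entries in place and report for human review.
            conflicts.append(uri)
            keep = types
        else:
            # Keep only the strongest claim.
            keep = [min(types, key=RANK.__getitem__)]
        for mapping_type in keep:
            cleaned.setdefault(mapping_type, []).append(uri)
    return cleaned, conflicts
```

The helper deliberately refuses to pick a side for narrow + broad, since that choice depends on the ontology definition rather than on precedence.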
|
||||
|
||||
## Real Examples Fixed
|
||||
|
||||
### Example 1: source_url
|
||||
|
||||
```yaml
|
||||
# BEFORE (wrong)
slots:
  source_url:
    exact_mappings:
      - schema:url
    broad_mappings:
      - schema:url  # Duplicate!

# AFTER (correct)
slots:
  source_url:
    exact_mappings:
      - schema:url  # Keep exact (strongest)
    # broad_mappings removed
|
||||
```
|
||||
|
||||
### Example 2: Custodian class
|
||||
|
||||
```yaml
|
||||
# BEFORE (wrong)
classes:
  Custodian:
    close_mappings:
      - cpov:PublicOrganisation
    narrow_mappings:
      - cpov:PublicOrganisation  # Duplicate!

# AFTER (correct)
classes:
  Custodian:
    close_mappings:
      - cpov:PublicOrganisation  # Keep close (Custodian ≈ PublicOrganisation)
    # narrow_mappings: use for URIs that are MORE SPECIFIC than Custodian
|
||||
```
|
||||
|
||||
### Example 3: geonames_id (narrow + broad conflict)
|
||||
|
||||
```yaml
|
||||
# BEFORE (wrong - logical contradiction!)
slots:
  geonames_id:
    narrow_mappings:
      - dcterms:identifier  # Says dcterms:identifier is MORE SPECIFIC than geonames_id
    broad_mappings:
      - dcterms:identifier  # Says dcterms:identifier is MORE GENERAL than geonames_id

# AFTER (correct)
slots:
  geonames_id:
    broad_mappings:
      - dcterms:identifier  # geonames_id IS a specific type of identifier
    # narrow_mappings removed (was contradictory)
|
||||
```
|
||||
|
||||
## Detection Script
|
||||
|
||||
Run this to find duplicate mappings in the schema:
|
||||
|
||||
```python
|
||||
import yaml
from pathlib import Path
from collections import defaultdict

mapping_types = ['exact_mappings', 'close_mappings', 'related_mappings',
                 'narrow_mappings', 'broad_mappings']

dirs = [
    Path('schemas/20251121/linkml/modules/slots'),
    Path('schemas/20251121/linkml/modules/classes'),
]

for d in dirs:
    for yaml_file in d.glob('*.yaml'):
        try:
            with open(yaml_file) as f:
                content = yaml.safe_load(f)
        except Exception:
            continue
        if not content:
            continue

        for section in ['classes', 'slots']:
            items = content.get(section, {})
            if not isinstance(items, dict):
                continue
            for name, defn in items.items():
                if not isinstance(defn, dict):
                    continue
                uri_to_types = defaultdict(list)
                for mt in mapping_types:
                    for uri in defn.get(mt, []) or []:
                        uri_to_types[uri].append(mt)
                for uri, types in uri_to_types.items():
                    if len(types) > 1:
                        print(f"{yaml_file}: {name} - {uri} in {types}")
|
||||
```
|
||||
|
||||
## Validation Rule
|
||||
|
||||
**Pre-commit check**: Before committing LinkML schema changes, run the detection script. If any duplicates are found, the commit should fail.
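
For a commit to fail, the check has to return a non-zero exit status rather than only print its findings. One way this might look is sketched below; the function names `find_duplicates` and `main` are hypothetical, and the file list is assumed to be passed in by whatever hook mechanism the project uses.

```python
from collections import defaultdict

MAPPING_TYPES = ["exact_mappings", "close_mappings", "related_mappings",
                 "narrow_mappings", "broad_mappings"]


def find_duplicates(content):
    """Return (element_name, uri, mapping_types) for every URI listed more than once."""
    duplicates = []
    for section in ("classes", "slots"):
        items = content.get(section) or {}
        if not isinstance(items, dict):
            continue
        for name, defn in items.items():
            if not isinstance(defn, dict):
                continue
            uri_to_types = defaultdict(list)
            for mapping_type in MAPPING_TYPES:
                for uri in defn.get(mapping_type) or []:
                    uri_to_types[uri].append(mapping_type)
            for uri, types in uri_to_types.items():
                if len(types) > 1:
                    duplicates.append((name, uri, types))
    return duplicates


def main(paths):
    """Check the given YAML files; return 1 if any duplicate is found, else 0."""
    import yaml  # PyYAML, imported lazily so find_duplicates works without it

    exit_code = 0
    for path in paths:
        with open(path) as handle:
            content = yaml.safe_load(handle)
        if not isinstance(content, dict):
            continue
        for name, uri, types in find_duplicates(content):
            print(f"{path}: {name} - {uri} in {types}")
            exit_code = 1
    return exit_code
```

A hook script could end with `sys.exit(main(sys.argv[1:]))`; git aborts the commit when a pre-commit hook exits with a non-zero status.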
|
||||
|
||||
## References
|
||||
|
||||
- [LinkML Mappings Documentation](https://linkml.io/linkml-model/latest/docs/mappings/)
|
||||
- [SKOS Mapping Properties](https://www.w3.org/TR/skos-reference/#mapping)
|
||||
- Rule 50: Ontology-to-LinkML Mapping Convention (parent rule)
|
||||
- Rule 51: No Hallucinated Ontology References
|
||||
|
|
@ -1,316 +0,0 @@
|
|||
# Rule 51: No Hallucinated Ontology References
|
||||
|
||||
**Priority**: CRITICAL
|
||||
**Scope**: All LinkML schema files (`schemas/20251121/linkml/`)
|
||||
**Created**: 2025-01-13
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
All ontology references in LinkML schema files (`class_uri`, `slot_uri`, `*_mappings`) MUST be verifiable against actual ontology files in `/data/ontology/`. References to predicates or classes that do not exist in local ontology files are considered **hallucinated** and are prohibited.
|
||||
|
||||
---
|
||||
|
||||
## The Problem
|
||||
|
||||
AI agents may suggest ontology mappings based on training data without verifying that:
|
||||
1. The ontology file exists in `/data/ontology/`
|
||||
2. The specific predicate/class exists within that ontology file
|
||||
3. The prefix is declared and resolvable
|
||||
|
||||
This leads to schema files containing references like `dqv:value` or `adms:status` that cannot be validated or serialized to RDF.
|
||||
|
||||
---
|
||||
|
||||
## Requirements
|
||||
|
||||
### 1. All Ontology Prefixes Must Have Local Files
|
||||
|
||||
Before using a prefix (e.g., `prov:`, `schema:`, `org:`), verify the ontology file exists:
|
||||
|
||||
```bash
|
||||
# Check if ontology exists
|
||||
ls data/ontology/ | grep -i "prov\|schema\|org"
|
||||
```
|
||||
|
||||
**Available Ontologies** (as of 2025-01-13):
|
||||
|
||||
| Prefix | File | Verified |
|
||||
|--------|------|----------|
|
||||
| `prov:` | `prov-o.ttl`, `prov.ttl` | ✅ |
|
||||
| `schema:` | `schemaorg.owl` | ✅ |
|
||||
| `org:` | `org.rdf` | ✅ |
|
||||
| `skos:` | `skos.rdf` | ✅ |
|
||||
| `dcterms:` | `dublin_core_elements.rdf` | ✅ |
|
||||
| `foaf:` | `foaf.ttl` | ✅ |
|
||||
| `rico:` | `RiC-O_1-1.rdf` | ✅ |
|
||||
| `crm:` | `CIDOC_CRM_v7.1.3.rdf` | ✅ |
|
||||
| `geo:` | `geo.ttl` | ✅ |
|
||||
| `sosa:` | `sosa.ttl` | ✅ |
|
||||
| `bf:` | `bibframe.rdf` | ✅ |
|
||||
| `edm:` | `edm.owl` | ✅ |
|
||||
| `premis:` | `premis3.owl` | ✅ |
|
||||
| `dcat:` | `dcat3.ttl` | ✅ |
|
||||
| `ore:` | `ore.rdf` | ✅ |
|
||||
| `pico:` | `pico.ttl` | ✅ |
|
||||
| `gn:` | `geonames_ontology.rdf` | ✅ |
|
||||
| `time:` | `time.ttl` | ✅ |
|
||||
| `locn:` | `locn.ttl` | ✅ |
|
||||
| `dqv:` | `dqv.ttl` | ✅ |
|
||||
| `adms:` | `adms.ttl` | ✅ |
|
||||
|
||||
**NOT Available** (do not use without adding):
|
||||
|
||||
| Prefix | Status | Alternative |
|
||||
|--------|--------|-------------|
|
||||
| `qudt:` | Only referenced in era_ontology.ttl | Use `hc:` with close_mappings annotation |
|
||||
|
||||
### 2. Predicates Must Exist in Ontology Files
|
||||
|
||||
Before using a predicate, verify it exists:
|
||||
|
||||
```bash
|
||||
# Verify predicate exists
|
||||
grep -l "hasFrameRate\|frameRate" data/ontology/premis3.owl
|
||||
|
||||
# Check specific predicate definition
|
||||
grep -E "premis:hasFrameRate|:hasFrameRate" data/ontology/premis3.owl
|
||||
```
|
||||
|
||||
### 3. Use hc: Prefix for Domain-Specific Concepts
|
||||
|
||||
When no standard ontology predicate exists, use the Heritage Custodian namespace:
|
||||
|
||||
```yaml
|
||||
# CORRECT - Use hc: with documentation
|
||||
slots:
|
||||
heritage_relevance_score:
|
||||
slot_uri: hc:heritageRelevanceScore
|
||||
description: Heritage sector relevance score (0.0-1.0)
|
||||
annotations:
|
||||
ontology_note: >-
|
||||
No standard ontology predicate for heritage relevance scoring.
|
||||
Domain-specific metric for this project.
|
||||
|
||||
# WRONG - Hallucinated predicate
|
||||
slots:
|
||||
heritage_relevance_score:
|
||||
slot_uri: dqv:heritageScore # Does not exist!
|
||||
```
|
||||
|
||||
### 4. Document External References in close_mappings
|
||||
|
||||
When a similar concept exists in an ontology we don't have locally, document it in `close_mappings` with a note:
|
||||
|
||||
```yaml
|
||||
slots:
|
||||
confidence_score:
|
||||
slot_uri: hc:confidenceScore
|
||||
close_mappings:
|
||||
- dqv:value # W3C Data Quality Vocabulary (not in local files)
|
||||
annotations:
|
||||
external_ontology_note: >-
|
||||
dqv:value from W3C Data Quality Vocabulary would be semantically
|
||||
appropriate but ontology not included in project. See
|
||||
https://www.w3.org/TR/vocab-dqv/
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Verification Workflow
|
||||
|
||||
### Before Adding New Mappings
|
||||
|
||||
1. **Check if ontology file exists**:
|
||||
```bash
|
||||
ls data/ontology/ | grep -i "<ontology-name>"
|
||||
```
|
||||
|
||||
2. **Search for predicate in ontology**:
|
||||
```bash
|
||||
grep -l "<predicate-name>" data/ontology/*
|
||||
```
|
||||
|
||||
3. **Verify predicate definition**:
|
||||
```bash
|
||||
grep -B2 -A5 "<predicate-name>" data/ontology/<file>
|
||||
```
|
||||
|
||||
4. **If not found**: Use `hc:` prefix with appropriate documentation
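
The four steps above could be combined into one helper, sketched here in Python. Everything named in it is an assumption for illustration: `PREFIX_FILES` is a hypothetical excerpt of the prefix table, `verify_curie` is not an existing project function, and the term lookup is a plain substring search with the same limits as `grep` (it can match comments or longer identifiers).

```python
from pathlib import Path

# Hypothetical excerpt of the prefix-to-file table; extend it from ONTOLOGY_CATALOG.md.
PREFIX_FILES = {
    "prov": ["prov-o.ttl", "prov.ttl"],
    "schema": ["schemaorg.owl"],
    "premis": ["premis3.owl"],
}


def verify_curie(curie, ontology_dir, prefix_files=PREFIX_FILES):
    """Classify a CURIE as 'local', 'no-ontology-file', 'term-not-found', or 'ok'."""
    prefix, _, local_name = curie.partition(":")
    if prefix == "hc":
        return "local"  # project namespace, nothing to look up
    candidates = [Path(ontology_dir) / name for name in prefix_files.get(prefix, [])]
    files = [path for path in candidates if path.is_file()]
    if not files:
        return "no-ontology-file"
    for path in files:
        # Crude substring search, comparable to grep; review hits manually.
        if local_name in path.read_text(encoding="utf-8", errors="replace"):
            return "ok"
    return "term-not-found"
```

A result of `ok` only means the string occurs in the file, so the definition itself should still be inspected as in step 3.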
|
||||
|
||||
### When Reviewing Existing Mappings
|
||||
|
||||
Run validation script:
|
||||
|
||||
```bash
|
||||
# Find all slot_uri references
|
||||
grep -r "slot_uri:" schemas/20251121/linkml/modules/slots/ | \
|
||||
grep -v "hc:" | \
|
||||
cut -d: -f3 | \
|
||||
sort -u
|
||||
|
||||
# Verify each prefix has a local file
|
||||
for prefix in prov schema org skos dcterms foaf rico; do
|
||||
echo "Checking $prefix:"
|
||||
ls data/ontology/ | grep -i "$prefix" || echo " NOT FOUND!"
|
||||
done
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Ontology Addition Process
|
||||
|
||||
If a new ontology is genuinely needed:
|
||||
|
||||
1. **Download the ontology**:
|
||||
```bash
|
||||
curl -L -o data/ontology/<name>.ttl "<url>" -H "Accept: text/turtle"
|
||||
```
|
||||
|
||||
2. **Update ONTOLOGY_CATALOG.md**:
|
||||
```bash
|
||||
# Add entry to data/ontology/ONTOLOGY_CATALOG.md
|
||||
```
|
||||
|
||||
3. **Verify predicates exist**:
|
||||
```bash
|
||||
grep "<predicate>" data/ontology/<name>.ttl
|
||||
```
|
||||
|
||||
4. **Update LinkML prefixes** in schema files
|
||||
|
||||
---
|
||||
|
||||
## Examples
|
||||
|
||||
### CORRECT: Verified Mapping
|
||||
|
||||
```yaml
|
||||
slots:
|
||||
retrieval_timestamp:
|
||||
slot_uri: prov:atTime # Verified in data/ontology/prov-o.ttl
|
||||
range: datetime
|
||||
```
|
||||
|
||||
### CORRECT: Domain-Specific with External Reference
|
||||
|
||||
```yaml
|
||||
slots:
|
||||
confidence_score:
|
||||
slot_uri: hc:confidenceScore # HC namespace (always valid)
|
||||
range: float
|
||||
close_mappings:
|
||||
- dqv:value # External reference (documented, not required locally)
|
||||
annotations:
|
||||
ontology_note: >-
|
||||
Uses HC namespace as dqv: ontology not in local files.
|
||||
dqv:value would be semantically appropriate alternative.
|
||||
```
|
||||
|
||||
### WRONG: Hallucinated Mapping
|
||||
|
||||
```yaml
|
||||
slots:
|
||||
confidence_score:
|
||||
slot_uri: dqv:value # INVALID - dqv: not in data/ontology/!
|
||||
range: float
|
||||
```
|
||||
|
||||
### WRONG: Non-Existent Predicate
|
||||
|
||||
```yaml
|
||||
slots:
|
||||
frame_rate:
|
||||
slot_uri: premis:hasFrameRate # INVALID - predicate not in premis3.owl!
|
||||
range: float
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Consequences of Violation
|
||||
|
||||
1. **RDF serialization fails** - Invalid prefixes cause gen-owl errors
|
||||
2. **Schema validation errors** - LinkML validates prefix declarations
|
||||
3. **Broken interoperability** - External systems cannot resolve URIs
|
||||
4. **Data quality issues** - Semantic web tooling cannot process data
|
||||
|
||||
---
|
||||
|
||||
## PREMIS Ontology Reference (premis3.owl)
|
||||
|
||||
**CRITICAL**: The PREMIS ontology is frequently hallucinated. ALL premis: references MUST be verified.
|
||||
|
||||
### Valid PREMIS Classes
|
||||
|
||||
```
|
||||
Action, Agent, Bitstream, Copyright, Dependency, EnvironmentCharacteristic,
|
||||
Event, File, Fixity, HardwareAgent, Identifier, Inhibitor, InstitutionalPolicy,
|
||||
IntellectualEntity, License, Object, Organization, OutcomeStatus, Person,
|
||||
PreservationPolicy, Representation, RightsBasis, RightsStatus, Rule, Signature,
|
||||
SignatureEncoding, SignificantProperties, SoftwareAgent, Statute,
|
||||
StorageLocation, StorageMedium
|
||||
```
|
||||
|
||||
### Valid PREMIS Properties
|
||||
|
||||
```
|
||||
act, allows, basis, characteristic, citation, compositionLevel, dependency,
|
||||
determinationDate, documentation, encoding, endDate, fixity, governs,
|
||||
identifier, inhibitedBy, inhibits, jurisdiction, key, medium, note,
|
||||
originalName, outcome, outcomeNote, policy, prohibits, purpose, rationale,
|
||||
relationship, restriction, rightsStatus, signature, size, startDate,
|
||||
storedAt, terms, validationRules, version
|
||||
```
|
||||
|
||||
### Known Hallucinated PREMIS Terms (DO NOT USE)
|
||||
|
||||
| Hallucinated Term | Correction |
|
||||
|-------------------|------------|
|
||||
| `premis:PreservationEvent` | Use `premis:Event` |
|
||||
| `premis:RightsDeclaration` | Use `premis:RightsBasis` or `premis:RightsStatus` |
|
||||
| `premis:hasRightsStatement` | Use `premis:rightsStatus` |
|
||||
| `premis:hasRightsDeclaration` | Use `premis:rightsStatus` |
|
||||
| `premis:hasRepresentation` | Use `premis:relationship` or `dcterms:hasFormat` |
|
||||
| `premis:hasRelatedStatementInformation` | Use `premis:note` or `adms:status` |
|
||||
| `premis:hasObjectCharacteristics` | Use `premis:characteristic` |
|
||||
| `premis:rightsGranted` | Use `premis:RightsStatus` class with `premis:restriction` |
|
||||
| `premis:rightsEndDate` | Use `premis:endDate` |
|
||||
| `premis:linkingAgentIdentifier` | Use `premis:Agent` class |
|
||||
| `premis:storageLocation` (lowercase) | Use `premis:storedAt` property or `premis:StorageLocation` class |
|
||||
| `premis:hasFrameRate` | Does not exist - use `hc:frameRate` |
|
||||
| `premis:environmentCharacteristic` (lowercase) | Use `premis:EnvironmentCharacteristic` (class) |
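
The class and property lists above can also be used as a case-sensitive allow-list. The sketch below copies both lists as given in this rule; the function name `check_premis_term` is hypothetical, and the lists should be regenerated from `premis3.owl` if the ontology file changes.

```python
# Terms copied from the "Valid PREMIS Classes" and "Valid PREMIS Properties" lists above.
PREMIS_CLASSES = set("""
Action Agent Bitstream Copyright Dependency EnvironmentCharacteristic
Event File Fixity HardwareAgent Identifier Inhibitor InstitutionalPolicy
IntellectualEntity License Object Organization OutcomeStatus Person
PreservationPolicy Representation RightsBasis RightsStatus Rule Signature
SignatureEncoding SignificantProperties SoftwareAgent Statute
StorageLocation StorageMedium
""".split())

PREMIS_PROPERTIES = set("""
act allows basis characteristic citation compositionLevel dependency
determinationDate documentation encoding endDate fixity governs
identifier inhibitedBy inhibits jurisdiction key medium note
originalName outcome outcomeNote policy prohibits purpose rationale
relationship restriction rightsStatus signature size startDate
storedAt terms validationRules version
""".split())


def check_premis_term(curie):
    """Return True if a premis: CURIE names a term from the lists above."""
    prefix, _, local_name = curie.partition(":")
    if prefix != "premis":
        raise ValueError(f"not a premis: CURIE: {curie}")
    # Case matters: StorageLocation is a class, storageLocation is not a term.
    return local_name in PREMIS_CLASSES or local_name in PREMIS_PROPERTIES
```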
|
||||
|
||||
### PREMIS Verification Commands
|
||||
|
||||
```bash
|
||||
# List all PREMIS classes
|
||||
grep -E "owl:Class.*premis" data/ontology/premis3.owl | \
|
||||
sed 's/.*v3\///' | sed 's/".*//' | sort -u
|
||||
|
||||
# List all PREMIS properties
|
||||
grep -E "ObjectProperty|DatatypeProperty" data/ontology/premis3.owl | \
|
||||
grep -oP 'v3/\K[^"]+' | sort -u
|
||||
|
||||
# Verify a specific term exists
|
||||
grep -c "YourTermHere" data/ontology/premis3.owl
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## See Also
|
||||
|
||||
- Rule 38: Slot Centralization and Semantic URI Requirements
|
||||
- Rule 50: Ontology-to-LinkML Mapping Convention
|
||||
- `/data/ontology/ONTOLOGY_CATALOG.md` - Available ontologies
|
||||
- `.opencode/rules/slot-ontology-mapping-reference.md` - Mapping reference
|
||||
|
||||
---
|
||||
|
||||
## Version History
|
||||
|
||||
- **2025-01-13**: Added 7 more hallucinated PREMIS terms discovered during schema audit:
|
||||
- `premis:hasRightsStatement`, `premis:hasRightsDeclaration`, `premis:hasRepresentation`
|
||||
- `premis:hasRelatedStatementInformation`, `premis:rightsGranted`, `premis:rightsEndDate`
|
||||
- `premis:linkingAgentIdentifier`
|
||||
- **2025-01-13**: Initial creation after discovering dqv:, adms:, qudt: references without local files
|
||||
|
|
@ -1,164 +0,0 @@
|
|||
# Rule 60: No Migration Deferral - Execute All slot_fixes.yaml Revisions
|
||||
|
||||
## Status: MANDATORY
|
||||
|
||||
## Summary
|
||||
|
||||
**AI agents MUST execute ALL migrations specified in `slot_fixes.yaml`.** The pattern of marking migrations as "WELL_STRUCTURED_NO_MIGRATION_NEEDED" or similar deferral notes is **STRICTLY PROHIBITED** except in the rarest cases of genuine semantic conflict.
|
||||
|
||||
## The Problem
|
||||
|
||||
Previous AI agents incorrectly deferred >95% of slot_fixes.yaml migrations with notes like:
|
||||
- "WELL_STRUCTURED_NO_MIGRATION_NEEDED"
|
||||
- "Already has proper slot_uri"
|
||||
- "Simple enum is appropriate"
|
||||
- "Would add unnecessary indirection"
|
||||
- "Current structure is adequate"
|
||||
|
||||
**These deferrals were WRONG.** The slot_fixes.yaml revisions were carefully curated based on:
|
||||
- Ontology analysis (CIDOC-CRM, RiC-O, PROV-O, Schema.org alignment)
|
||||
- Semantic correctness and consistency
|
||||
- Pattern adherence (Rule 39: RiC-O naming, Rule 43: singular nouns)
|
||||
- Type/Types class hierarchy design (Rule 0b)
|
||||
|
||||
## The Rule
|
||||
|
||||
### MANDATORY: Execute ALL Migrations
|
||||
|
||||
When encountering a slot_fixes.yaml entry with `status: false`:
|
||||
|
||||
1. **READ the `revision` field** - This specifies EXACTLY what to do
|
||||
2. **EXECUTE the migration** - Create/update slots, classes, imports as specified
|
||||
3. **UPDATE status to `true`** - Document what was done in `notes`
|
||||
|
||||
### PROHIBITED: Deferral Reasoning
|
||||
|
||||
The following justifications are **INVALID** and **NEVER acceptable**:
|
||||
|
||||
| Invalid Justification | Why It's Wrong |
|
||||
|-----------------------|----------------|
|
||||
| "Already has proper slot_uri" | slot_uri handles external mapping; internal structure is separate concern |
|
||||
| "Simple string/enum is sufficient" | Consistency and extensibility trump local simplicity |
|
||||
| "Would add unnecessary indirection" | Indirection enables reuse and future extension |
|
||||
| "Creating a class would over-engineer" | Ontology design favors class-based modeling |
|
||||
| "Current structure is well-designed" | Individual optimization conflicts with schema-wide patterns |
|
||||
| "No semantic benefit" | Agent lacks context to make this judgment |
|
||||
|
||||
### VALID: Reasons to Pause (Not Defer)
|
||||
|
||||
The ONLY valid reasons to pause a migration (NOT defer indefinitely) are:
|
||||
|
||||
1. **Semantic Conflict**: The proposed revision contradicts the slot's documented semantic intent
|
||||
2. **Circular Dependency**: Migration would create import cycle
|
||||
3. **External Consumer Impact**: Breaking change affecting downstream systems
|
||||
4. **Class Already Exists**: Target class exists under different name (document and merge)
|
||||
|
||||
**When pausing**: Add a `feedback` entry explaining the specific conflict and request human review. Do NOT mark as "NO_MIGRATION_NEEDED".
|
||||
|
||||
## Decision Tree
|
||||
|
||||
```
|
||||
Is there a slot_fixes.yaml entry with status: false?
|
||||
├─ YES → Read the revision field
|
||||
│ ├─ Does revision specify slots/classes to create?
|
||||
│ │ └─ YES → EXECUTE THE MIGRATION (mandatory)
|
||||
│ └─ Is there a genuine semantic conflict?
|
||||
│ ├─ NO → EXECUTE THE MIGRATION (mandatory)
|
||||
│ └─ YES → Document conflict in feedback, request human review
|
||||
│ (Do NOT mark as "no migration needed")
|
||||
└─ NO → Nothing to do
|
||||
```
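
The tree above reduces to a short function. This is a sketch only: `next_action` and its return strings are hypothetical, the entry is assumed to be a parsed slot_fixes.yaml item with a `processed.status` field, and whether a genuine semantic conflict exists remains a judgment supplied by the caller.

```python
def next_action(entry, has_semantic_conflict):
    """Map a slot_fixes.yaml entry (parsed dict) to the action the tree requires."""
    processed = entry.get("processed") or {}
    if processed.get("status") is not False:
        return "nothing-to-do"
    if has_semantic_conflict:
        # Leave status false and add a feedback entry; never mark as not needed.
        return "document-conflict-and-request-review"
    return "execute-migration"
```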
|
||||
|
||||
## Examples
|
||||
|
||||
### WRONG: Deferral Note
|
||||
|
||||
```yaml
|
||||
- original_slot_id: https://nde.nl/ontology/hc/slot/example_slot
|
||||
revision:
|
||||
- label: has_or_had_example
|
||||
type: slot
|
||||
- label: Example
|
||||
type: class
|
||||
processed:
|
||||
status: true # WRONG - marked true without doing work
|
||||
notes: "WELL_STRUCTURED_NO_MIGRATION_NEEDED - slot already has proper
|
||||
slot_uri and the current structure is adequate" # INVALID
|
||||
```
|
||||
|
||||
### CORRECT: Execute Migration
|
||||
|
||||
```yaml
|
||||
- original_slot_id: https://nde.nl/ontology/hc/slot/example_slot
|
||||
revision:
|
||||
- label: has_or_had_example
|
||||
type: slot
|
||||
- label: Example
|
||||
type: class
|
||||
processed:
|
||||
status: true
|
||||
timestamp: '2026-01-19T12:00:00Z'
|
||||
notes: 'Migrated 2026-01-19 per Rule 53/56.
|
||||
- Created has_or_had_example.yaml slot file
|
||||
- Created Example.yaml class file
|
||||
- Updated ClassA.yaml, ClassB.yaml to use new slot
|
||||
- Archived: modules/slots/archive/example_slot_archived_20260119.yaml'
|
||||
```
|
||||
|
||||
### CORRECT: Pause with Genuine Conflict
|
||||
|
||||
```yaml
|
||||
- original_slot_id: https://nde.nl/ontology/hc/slot/conflicting_slot
|
||||
revision:
|
||||
- label: has_or_had_foo
|
||||
type: slot
|
||||
processed:
|
||||
status: false # Correctly left false
|
||||
notes: ''
|
||||
feedback:
|
||||
- timestamp: '2026-01-19T12:00:00Z'
|
||||
user: opencode-claude
|
||||
done: false
|
||||
comment: |
|
||||
PAUSED FOR HUMAN REVIEW - Genuine semantic conflict detected:
|
||||
- Revision specifies has_or_had_foo (temporal relationship)
|
||||
- But slot is used for immutable birth dates (should be has_*)
|
||||
- Request clarification on intended temporal semantics
|
||||
```
|
||||
|
||||
## Statistics Context
|
||||
|
||||
The slot_fixes.yaml file contains 527 migration entries. Analysis of previous agent behavior:
|
||||
|
||||
- **Incorrectly deferred**: >95% of entries marked "NO_MIGRATION_NEEDED"
|
||||
- **Actually needing deferral**: <5% (genuine semantic conflicts)
|
||||
- **Required action**: Execute ALL migrations except those with documented semantic conflicts
|
||||
|
||||
## Related Rules
|
||||
|
||||
- **Rule 53**: Full Slot Migration - slot_fixes.yaml is AUTHORITATIVE
|
||||
- **Rule 56**: Semantic Consistency Over Simplicity - Always Execute Revisions
|
||||
- **Rule 57**: slot_fixes.yaml Revision Key is IMMUTABLE
|
||||
- **Rule 58**: Feedback vs Revision Distinction
|
||||
|
||||
## Anti-Patterns Checklist
|
||||
|
||||
Before marking ANY migration as complete without execution, verify:
|
||||
|
||||
- [ ] Did I actually create the specified slots?
|
||||
- [ ] Did I actually create the specified classes?
|
||||
- [ ] Did I update all class files that use this slot?
|
||||
- [ ] Did I archive the old slot file?
|
||||
- [ ] Is my "notes" field documenting actual work done, not a deferral excuse?
|
||||
|
||||
If any answer is "no", the migration is NOT complete.
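
The last checklist item can be partly automated by scanning completed entries for deferral wording. The sketch below is an assumption-laden heuristic: `DEFERRAL_PHRASES` is drawn from the phrases quoted in this rule, `find_suspect_completions` is a hypothetical name, and a clean result does not prove the work was actually done.

```python
# Lower-cased phrases taken from the deferral notes quoted in this rule.
DEFERRAL_PHRASES = (
    "no_migration_needed",
    "already has proper slot_uri",
    "simple enum is appropriate",
    "would add unnecessary indirection",
    "current structure is adequate",
)


def find_suspect_completions(entries):
    """Return original_slot_id of entries marked done whose notes read like a deferral."""
    suspects = []
    for entry in entries:
        processed = entry.get("processed") or {}
        # Collapse whitespace so phrases split across lines still match.
        notes = " ".join(str(processed.get("notes") or "").lower().split())
        if processed.get("status") is True and any(p in notes for p in DEFERRAL_PHRASES):
            suspects.append(entry.get("original_slot_id"))
    return suspects
```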
|
||||
|
||||
## Consequences
|
||||
|
||||
Agents that defer migrations without genuine semantic conflict:
|
||||
1. Create technical debt requiring human cleanup
|
||||
2. Delay schema consistency improvements
|
||||
3. Waste curator time reviewing false "completions"
|
||||
4. Undermine trust in AI-assisted schema work
|
||||
|
||||
**Execute the migrations. Do not defer.**
|
||||
|
|
@ -1,215 +0,0 @@
|
|||
# Rule 42: No Ontology Prefixes in Slot Names
|
||||
|
||||
**CRITICAL**: LinkML slot names MUST NOT include ontology namespace prefixes. Ontology references belong in mapping properties, NOT in element names.
|
||||
|
||||
---
|
||||
|
||||
## 1. The Problem
|
||||
|
||||
Slot names like `rico_has_or_had_holder` or `skos_broader` violate separation of concerns:
|
||||
|
||||
- **Slot names** should describe the semantic meaning in plain, readable terms
|
||||
- **Ontology mappings** belong in `slot_uri`, `exact_mappings`, `close_mappings`, `related_mappings`, `narrow_mappings`, `broad_mappings`
|
||||
|
||||
Embedding ontology prefixes in names:
|
||||
1. Creates coupling between naming and specific ontology versions
|
||||
2. Reduces readability for non-ontology experts
|
||||
3. Duplicates information already in mapping properties
|
||||
4. Makes future ontology migrations harder
|
||||
|
||||
---
|
||||
|
||||
## 2. Correct Pattern
|
||||
|
||||
### Use Descriptive Names + Mapping Properties
|
||||
|
||||
```yaml
|
||||
# CORRECT: Clean name with ontology reference in slot_uri
|
||||
slots:
|
||||
record_holder:
|
||||
description: The custodian that holds or held this record set.
|
||||
slot_uri: rico:hasOrHadHolder
|
||||
exact_mappings:
|
||||
- rico:hasOrHadHolder
|
||||
close_mappings:
|
||||
- schema:holdingArchive
|
||||
range: Custodian
|
||||
```
|
||||
|
||||
### WRONG: Ontology Prefix in Name
|
||||
|
||||
```yaml
|
||||
# WRONG: Ontology prefix embedded in slot name
|
||||
slots:
|
||||
rico_has_or_had_holder: # BAD - "rico_" prefix
|
||||
description: The custodian that holds or held this record set.
|
||||
slot_uri: rico:hasOrHadHolder
|
||||
range: string
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Prohibited Prefixes in Slot Names
|
||||
|
||||
The following prefixes MUST NOT appear at the start of slot names:
|
||||
|
||||
| Prefix | Ontology | Example Violation |
|
||||
|--------|----------|-------------------|
|
||||
| `rico_` | Records in Contexts | `rico_organizational_principle` |
|
||||
| `skos_` | SKOS | `skos_broader`, `skos_narrower` |
|
||||
| `schema_` | Schema.org | `schema_name` |
|
||||
| `dcterms_` | Dublin Core | `dcterms_created` |
|
||||
| `dct_` | Dublin Core | `dct_identifier` |
|
||||
| `prov_` | PROV-O | `prov_generated_by` |
|
||||
| `org_` | W3C Organization | `org_has_member` |
|
||||
| `crm_` | CIDOC-CRM | `crm_carried_out_by` |
|
||||
| `foaf_` | FOAF | `foaf_knows` |
|
||||
| `owl_` | OWL | `owl_same_as` |
|
||||
| `rdf_` | RDF | `rdf_type` |
|
||||
| `rdfs_` | RDFS | `rdfs_label` |
|
||||
| `cpov_` | CPOV | `cpov_public_organisation` |
|
||||
| `tooi_` | TOOI | `tooi_overheidsorganisatie` |
|
||||
| `bf_` | BIBFRAME | `bf_title` |
|
||||
| `edm_` | Europeana | `edm_provided_cho` |
|
||||
|
||||
---
|
||||
|
||||
## 4. Migration Examples
|
||||
|
||||
### Example 1: RiC-O Slots
|
||||
|
||||
```yaml
|
||||
# BEFORE (wrong)
|
||||
rico_has_or_had_holder:
|
||||
slot_uri: rico:hasOrHadHolder
|
||||
range: string
|
||||
|
||||
# AFTER (correct)
|
||||
record_holder:
|
||||
description: Reference to the custodian that holds or held this record set.
|
||||
slot_uri: rico:hasOrHadHolder
|
||||
exact_mappings:
|
||||
- rico:hasOrHadHolder
|
||||
range: Custodian
|
||||
```
|
||||
|
||||
### Example 2: SKOS Slots
|
||||
|
||||
```yaml
|
||||
# BEFORE (wrong)
|
||||
skos_broader:
|
||||
slot_uri: skos:broader
|
||||
range: uriorcurie
|
||||
|
||||
# AFTER (correct)
|
||||
broader_concept:
|
||||
description: A broader concept in the hierarchy.
|
||||
slot_uri: skos:broader
|
||||
exact_mappings:
|
||||
- skos:broader
|
||||
range: uriorcurie
|
||||
```
|
||||
|
||||
### Example 3: RiC-O Organizational Principle
|
||||
|
||||
```yaml
|
||||
# BEFORE (wrong)
|
||||
rico_organizational_principle:
|
||||
slot_uri: rico:hasRecordSetType
|
||||
range: string
|
||||
|
||||
# AFTER (correct)
|
||||
organizational_principle:
|
||||
description: The organizational principle (fonds, series, collection) for this record set.
|
||||
slot_uri: rico:hasRecordSetType
|
||||
exact_mappings:
|
||||
- rico:hasRecordSetType
|
||||
range: string
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Exceptions
|
||||
|
||||
### 5.1 Identifier Slots
|
||||
|
||||
Slots that store **identifiers from external systems** may include system names (not ontology prefixes):
|
||||
|
||||
```yaml
|
||||
# ALLOWED: External system identifier
|
||||
wikidata_id:
|
||||
description: Wikidata entity identifier (Q-number).
|
||||
slot_uri: schema:identifier
|
||||
range: string
|
||||
pattern: "^Q[0-9]+$"
|
||||
|
||||
# ALLOWED: External system identifier
|
||||
viaf_id:
|
||||
description: VIAF identifier for authority control.
|
||||
slot_uri: schema:identifier
|
||||
range: string
|
||||
```
|
||||
|
||||
### 5.2 Internal Namespace Force Slots
|
||||
|
||||
Technical slots for namespace generation are prefixed with `internal_`:
|
||||
|
||||
```yaml
|
||||
# ALLOWED: Technical workaround slot
|
||||
internal_wd_namespace_force:
|
||||
description: Internal slot to force WD namespace generation. Do not use.
|
||||
slot_uri: wd:Q35120
|
||||
range: string
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 6. Validation
|
||||
|
||||
Run this command to find violations:
|
||||
|
||||
```bash
|
||||
cd schemas/20251121/linkml/modules/slots
|
||||
ls -1 *.yaml | grep -E "^(rico_|skos_|schema_|dcterms_|dct_|prov_|org_|crm_|foaf_|owl_|rdf_|rdfs_|cpov_|tooi_|bf_|edm_)"
|
||||
```
|
||||
|
||||
Expected output: No files (after migration)
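
The command above only inspects file names. A comparable check on slot names themselves, for example the keys under `slots:` in a parsed module, might look like the following sketch. The function name `find_prefixed_slot_names` is hypothetical; the prefix tuple mirrors the table in section 3.

```python
# Prefixes from the table in section 3; the trailing underscore is significant.
PROHIBITED_PREFIXES = (
    "rico_", "skos_", "schema_", "dcterms_", "dct_", "prov_", "org_", "crm_",
    "foaf_", "owl_", "rdf_", "rdfs_", "cpov_", "tooi_", "bf_", "edm_",
)


def find_prefixed_slot_names(slot_names):
    """Return, sorted, the slot names that start with a prohibited ontology prefix."""
    # str.startswith accepts a tuple and matches any of its members.
    return sorted(name for name in slot_names if name.startswith(PROHIBITED_PREFIXES))
```

Because the match requires the underscore, names such as `organizational_principle` pass, and the exceptions in section 5 (`wikidata_id`, `internal_wd_namespace_force`) are not flagged.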
|
||||
|
||||
---
|
||||
|
||||
## 7. Rationale
|
||||
|
||||
### LinkML Best Practices
|
||||
|
||||
LinkML provides dedicated properties for ontology alignment:
|
||||
|
||||
| Property | Purpose | Example |
|
||||
|----------|---------|---------|
|
||||
| `slot_uri` | Primary ontology predicate | `slot_uri: rico:hasOrHadHolder` |
|
||||
| `exact_mappings` | Semantically equivalent predicates | `exact_mappings: [schema:holdingArchive]` |
|
||||
| `close_mappings` | Nearly equivalent predicates | `close_mappings: [dc:creator]` |
|
||||
| `related_mappings` | Related but different predicates | `related_mappings: [prov:wasAttributedTo]` |
|
||||
| `narrow_mappings` | More specific predicates | `narrow_mappings: [rico:hasInstantiation]` |
|
||||
| `broad_mappings` | More general predicates | `broad_mappings: [schema:about]` |
|
||||
|
||||
See: https://linkml.io/linkml-model/latest/docs/mappings/
|
||||
|
||||
### Clean Separation of Concerns
|
||||
|
||||
- **Names**: Human-readable, domain-focused terminology
|
||||
- **URIs**: Machine-readable, ontology-specific identifiers
|
||||
- **Mappings**: Cross-ontology alignment documentation
|
||||
|
||||
This separation allows:
|
||||
1. Renaming slots without changing ontology bindings
|
||||
2. Adding new ontology mappings without renaming slots
|
||||
3. Clear documentation of semantic relationships
|
||||
4. Easier maintenance and evolution
|
||||
|
||||
---
|
||||
|
||||
## 8. See Also
|
||||
|
||||
- **Rule 38**: Slot Centralization and Semantic URI Requirements
|
||||
- **Rule 39**: Slot Naming Convention (RiC-O Style) - for temporal naming patterns
|
||||
- LinkML Mappings Documentation: https://linkml.io/linkml-model/latest/docs/mappings/
|
||||
|
|
@ -1,61 +0,0 @@
|
|||
# Rule: No Rough Edits in Schema Files
|
||||
|
||||
**Identifier**: `no-rough-edits-in-schema`
|
||||
**Severity**: **CRITICAL**
|
||||
|
||||
## Core Directive
|
||||
|
||||
**DO NOT** perform rough, imprecise, or bulk text substitutions (like `sed -i` or regex-based python scripts) on LinkML schema files (`schemas/*/linkml/`) without guaranteeing structural integrity.
|
||||
|
||||
**YOU MUST**:
|
||||
* ✅ Use proper YAML parsers/dumpers if modifying structure programmatically.
|
||||
* ✅ Manually verify edits if using text replacement.
|
||||
* ✅ Ensure indentation and nesting are preserved exactly.
|
||||
* ✅ Respect comments and ordering (which parsers often destroy, so careful text editing is sometimes necessary, but it must be PRECISE).
|
||||
|
||||
## Rationale
|
||||
|
||||
LinkML schemas are highly structured YAML files where indentation and nesting semantics are critical. Rough edits often cause:
|
||||
* **Duplicate keys** (e.g., leaving a property behind after deleting its parent key).
|
||||
* **Invalid indentation** (breaking the parent-child relationship).
|
||||
* **Silent corruption** (valid YAML but wrong semantics).
|
||||
|
||||
## Examples
|
||||
|
||||
### ❌ Anti-Pattern: Rough Deletion
|
||||
|
||||
Deleting lines containing a string without checking context:
|
||||
|
||||
```python
|
||||
# WRONG: Deleting lines blindly
for line in lines:
    if "some_slot" in line:
        continue  # Deletes the line, but might leave children orphaned!
    new_lines.append(line)
|
||||
```
|
||||
|
||||
**Resulting Corruption**:
|
||||
```yaml
|
||||
# Original
slots:
  some_slot:
    range: string

# Corrupted (orphaned child)
slots:
    range: string  # INVALID!
|
||||
```
|
||||
|
||||
### ✅ Correct Pattern: Structural Awareness
|
||||
|
||||
If removing a slot reference, ensure you remove the entire list item or key-value block.
|
||||
|
||||
```python
|
||||
# BETTER: Check for list item syntax
if re.match(r'^\s*-\s*some_slot\s*$', line):
    continue
|
||||
```
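
Removing a whole key-value block, rather than a list item, additionally requires tracking indentation. The following is a minimal text-based sketch, assuming space-indented block-style YAML where the key sits on its own line; `remove_key_block` is a hypothetical helper, and its output should still be parsed and reviewed before committing.

```python
import re


def remove_key_block(lines, key):
    """Drop `key:` and every more-indented line that follows it."""
    pattern = re.compile(r"^(\s*)" + re.escape(key) + r":\s*(#.*)?$")
    kept = []
    skip_indent = None  # indentation of the key being removed, while skipping
    for line in lines:
        if skip_indent is not None:
            indent = len(line) - len(line.lstrip(" "))
            if line.strip() == "" or indent > skip_indent:
                continue  # still inside the removed block (or a blank line after it)
            skip_indent = None  # back at the key's level or shallower
        match = pattern.match(line)
        if match:
            skip_indent = len(match.group(1))
            continue
        kept.append(line)
    return kept
```

Working on text keeps comments and key order intact, which a parse-and-dump round trip often does not; the cost is that inline mappings such as `some_slot: {range: string}` are not handled.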
|
||||
|
||||
## Application
|
||||
|
||||
This rule applies to ALL files in `schemas/20251121/linkml/` and future versions.
|
||||
|
|
@ -1,53 +0,0 @@
|
|||
# Rule: No Version Indicators in Names
|
||||
|
||||
## 🚨 Critical
|
||||
|
||||
Do not include version identifiers in **class names**, **slot names**, or **enum names**.
|
||||
|
||||
Version tags in semantic names create churn, break reuse, and force unnecessary migrations.
|
||||
|
||||
## The Rule
|
||||
|
||||
1. Use stable semantic names for LinkML elements.
   - ✅ `DigitalPlatform`
   - ❌ `DigitalPlatformV2`

2. If a model evolves, keep the name and update metadata/provenance.
   - Track revision in changelog, annotations, or transformation metadata.
   - Do not encode `v2`, `v3`, `_2026`, `beta`, `final` in the element name.

3. Apply this to all naming surfaces:
   - `classes:` keys
   - `slots:` keys
   - `enums:` keys
   - `name:` values in module files
|
||||
|
||||
## Allowed Versioning Locations
|
||||
|
||||
- File-level changelog/comments
|
||||
- Dedicated metadata classes/slots (e.g., transformation metadata)
|
||||
- External release tags (git tags, manifest versions)
|
||||
|
||||
## Migration Guidance
|
||||
|
||||
When you encounter versioned names:
|
||||
|
||||
1. Rename semantic elements to stable names.
|
||||
2. Update references/imports/usages accordingly.
|
||||
3. Preserve provenance of the migration in comments/annotations.
|
||||
|
||||
## Examples
|
||||
|
||||
✅ Correct:
|
||||
```yaml
classes:
  DigitalPlatformTransformationMetadata:
    description: Metadata about record transformation steps.
```
|
||||
|
||||
❌ Wrong:
|
||||
```yaml
classes:
  DigitalPlatformV2TransformationMetadata:
    description: Metadata about V2 transformation.
```
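
A rough way to spot candidates for renaming is a pattern check over class, slot, and enum names. The sketch below is an assumption-laden heuristic, not a project tool: `find_versioned_names` and the pattern list are hypothetical, and only `V<digits>`, `_v<digits>`, and `_20xx` markers are covered. Markers such as `beta` or `final` are left to manual review because they also occur in legitimate names.

```python
import re

# Heuristic version markers (an assumed, incomplete list).
VERSION_MARKER = re.compile(
    r"V\d+(?=[A-Z_]|$)"    # CamelCase: DigitalPlatformV2, DigitalPlatformV2Metadata
    r"|_v\d+(?=_|$)"       # snake_case: platform_v3
    r"|_20\d{2}(?=_|$)"    # year suffix: budget_2026
)


def find_versioned_names(names):
    """Return the names that appear to contain a version indicator."""
    return [name for name in names if VERSION_MARKER.search(name)]


flagged = find_versioned_names([
    "DigitalPlatform",
    "DigitalPlatformV2",
    "DigitalPlatformV2TransformationMetadata",
    "platform_v3",
    "budget_2026",
    "has_or_had_url",
])
```

Every hit still needs a human decision, since a pattern match cannot tell a version tag from a meaningful identifier.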
|
||||
|
|
@ -1,15 +0,0 @@
|
|||
# Rule: Ontology Detection vs Heuristics
|
||||
|
||||
## Summary
|
||||
When detecting classes and predicates in `data/ontology/` or external ontology files, you must **read the actual ontology definitions** (e.g., RDF, OWL, TTL files) to determine if a term is a Class or a Property. Do not rely on naming heuristics (like "Capitalized means Class").
|
||||
|
||||
## Detail
|
||||
* **Verification**: Always read the source ontology file or use a semantic lookup tool to verify the `rdf:type` of an entity.
    * If `rdf:type` is `owl:Class` or `rdfs:Class`, it is a **Class**.
    * If `rdf:type` is `rdf:Property`, `owl:ObjectProperty`, or `owl:DatatypeProperty`, it is a **Property**.
* **Avoid Heuristics**: Do not assume that `skos:Concept` is a class just because its name is capitalized (it is a class, but that has to be confirmed in the ontology), or that `schema:name` is a property just because it is lowercase. Many ontologies have inconsistent naming conventions (e.g., `schema:Person` vs `foaf:Person`).
* **Strictness**: If the ontology file is not available locally, attempt to fetch it or consult authoritative documentation before guessing.
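
The type check above can be sketched as a small function. This is an illustration only: `classify_term` is a hypothetical helper, and it assumes the `rdf:type` values of the term have already been read from the ontology file (for example with an RDF library) and are passed in as full IRIs.

```python
CLASS_TYPES = {
    "http://www.w3.org/2002/07/owl#Class",
    "http://www.w3.org/2000/01/rdf-schema#Class",
}
PROPERTY_TYPES = {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#Property",
    "http://www.w3.org/2002/07/owl#ObjectProperty",
    "http://www.w3.org/2002/07/owl#DatatypeProperty",
}


def classify_term(rdf_types):
    """Classify a term from its declared rdf:type IRIs, never from its name."""
    types = set(rdf_types)
    is_class = bool(types & CLASS_TYPES)
    is_property = bool(types & PROPERTY_TYPES)
    if is_class and not is_property:
        return "class"
    if is_property and not is_class:
        return "property"
    # No declaration, or conflicting declarations: do not guess.
    return "unknown"
```

Returning `"unknown"` rather than falling back to capitalization keeps the decision with the ontology source or with the user.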
|
||||
|
||||
## Violation Examples
|
||||
* Assuming `ex:MyTerm` is a class because it starts with an uppercase letter without checking the `.ttl` file.
|
||||
* Mapping a LinkML slot to `schema:Thing` (a Class) instead of a Property because you guessed based on the name.
|
||||
|
|
@ -1,306 +0,0 @@
|
|||
# Rule 50: Ontology-to-LinkML Mapping Convention
|
||||
|
||||
🚨 **CRITICAL**: When mapping base ontology classes and predicates to LinkML schema elements, use LinkML's dedicated mapping properties as documented at https://linkml.io/linkml-model/latest/docs/mappings/
|
||||
|
||||
---
|
||||
|
||||
## 1. What "LinkML Mapping" Means in This Project
|
||||
|
||||
**"LinkML mapping"** refers specifically to:
|
||||
1. Connecting LinkML schema elements (classes, slots, enums) to external ontology URIs
|
||||
2. Using LinkML's built-in mapping properties (`class_uri`, `slot_uri`, `*_mappings`)
|
||||
3. Following SKOS-based vocabulary alignment standards
|
||||
|
||||
**LinkML mapping does NOT mean**:
|
||||
- Creating arbitrary crosswalks in spreadsheets
|
||||
- Writing prose descriptions of how concepts relate
|
||||
- Inventing custom `@context` JSON-LD mappings outside the schema
|
||||
|
||||
---
|
||||
|
||||
## 2. LinkML Mapping Property Reference
|
||||
|
||||
### Primary Identity Properties
|
||||
|
||||
| Property | Applies To | Purpose | Example |
|
||||
|----------|-----------|---------|---------|
|
||||
| `class_uri` | Classes | Primary RDF class URI | `class_uri: ore:Aggregation` |
|
||||
| `slot_uri` | Slots | Primary RDF predicate URI | `slot_uri: rico:hasOrHadHolder` |
|
||||
| `enum_uri` | Enums | Enum namespace URI | `enum_uri: hc:PlatformTypeEnum` |
|
||||
|
||||
### SKOS-Based Mapping Properties
|
||||
|
||||
These properties express **semantic relationships** to external ontology terms:
|
||||
|
||||
| Property | SKOS Predicate | Meaning | Use When |
|
||||
|----------|---------------|---------|----------|
|
||||
| `exact_mappings` | `skos:exactMatch` | **IDENTICAL meaning** | Different ontology, **SAME semantics** (interchangeable) |
|
||||
| `close_mappings` | `skos:closeMatch` | Very similar meaning | Similar but **NOT interchangeable** |
|
||||
| `related_mappings` | `skos:relatedMatch` | Semantically related | Broader conceptual relationship |
|
||||
| `narrow_mappings` | `skos:narrowMatch` | External term is more specific | This element is broader than the external term |
|
||||
| `broad_mappings` | `skos:broadMatch` | External term is more general | This element is narrower than the external term |
|
||||
|
||||
### ⚠️ CRITICAL: `exact_mappings` Requires PRECISE Semantic Equivalence
|
||||
|
||||
**`exact_mappings` means the terms are INTERCHANGEABLE** - you could substitute one for the other in any context without changing meaning.
|
||||
|
||||
**Requirements for `exact_mappings`**:
|
||||
1. **Same definition**: Both terms must have equivalent definitions
|
||||
2. **Same scope**: Both terms cover the same set of instances
|
||||
3. **Same constraints**: Same domain/range restrictions apply
|
||||
4. **Bidirectional**: If A exactMatch B, then B exactMatch A
|
||||
|
||||
**DO NOT use `exact_mappings` when**:
|
||||
- One term is a subset of the other (use `narrow_mappings`/`broad_mappings`)
|
||||
- Terms are similar but have different scopes (use `close_mappings`)
|
||||
- Terms are related but not equivalent (use `related_mappings`)
|
||||
- You're uncertain about equivalence (default to `close_mappings`)
|
||||
|
||||
**Example - WRONG**:
|
||||
```yaml
# PersonProfile is NOT equivalent to foaf:Person
# PersonProfile is a structured document ABOUT a person, not the person themselves
exact_mappings:
  - foaf:Person # ❌ WRONG - different semantics!
```
|
||||
|
||||
**Example - CORRECT**:
|
||||
```yaml
# foaf:Person and schema:Person ARE equivalent
# Both define "a person" with the same scope
exact_mappings:
  - schema:Person # ✅ CORRECT - truly equivalent
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Mapping Workflow: Ontology → LinkML
|
||||
|
||||
### Step 1: Identify External Ontology Class/Predicate
|
||||
|
||||
Search base ontology files in `/data/ontology/`:
|
||||
|
||||
```bash
# Find aggregation-related classes
rg -i "aggregation|aggregate" data/ontology/*.ttl data/ontology/*.rdf data/ontology/*.owl

# Check specific ontology
rg "rdfs:Class|owl:Class" data/ontology/ore.rdf | grep -i "aggregation"
```
|
||||
|
||||
### Step 2: Determine Mapping Strength
|
||||
|
||||
| Scenario | Mapping Property |
|
||||
|----------|------------------|
|
||||
| **This IS that ontology class** (identity) | `class_uri` |
|
||||
| **Equivalent in another vocabulary** | `exact_mappings` |
|
||||
| **Similar concept, different scope** | `close_mappings` |
|
||||
| **Related but different granularity** | `narrow_mappings` / `broad_mappings` |
|
||||
| **Conceptually related** | `related_mappings` |
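
The decision table can also be written down as a lookup, which makes the direction of the narrow/broad properties explicit. This is a sketch under assumptions: the relation labels and `choose_mapping_property` are hypothetical names, and the direction follows my reading of the LinkML documentation, where `narrow_mappings` lists external terms with a narrower meaning and `broad_mappings` lists external terms with a broader meaning.

```python
# Relation of the EXTERNAL term to the local element -> LinkML mapping property.
MAPPING_PROPERTY_BY_RELATION = {
    "equivalent": "exact_mappings",
    "similar": "close_mappings",
    "uncertain": "close_mappings",  # when equivalence is unclear, stay conservative
    "external_is_narrower": "narrow_mappings",
    "external_is_broader": "broad_mappings",
    "related": "related_mappings",
}


def choose_mapping_property(relation, element_kind="class"):
    """Pick the mapping property for one external term."""
    if relation == "identity":
        # Identity is expressed with the primary URI, not with a SKOS-style mapping.
        return "class_uri" if element_kind == "class" else "slot_uri"
    if relation not in MAPPING_PROPERTY_BY_RELATION:
        raise ValueError(f"unknown relation: {relation!r}")
    return MAPPING_PROPERTY_BY_RELATION[relation]
```

Raising on an unknown label, instead of silently defaulting, forces the relation to be decided deliberately.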
|
||||
|
||||
### Step 3: Document Mapping in LinkML Schema
|
||||
|
||||
#### For Classes
|
||||
|
||||
```yaml
classes:
  DataAggregator:
    class_uri: ore:Aggregation # Primary identity - THIS IS an ORE Aggregation
    description: |
      A platform that harvests and STORES copies of metadata/content, causing data duplication.

      ore:Aggregation - "A set of related resources grouped together."

      Mapped to ORE because aggregators create aggregations of harvested metadata.
    exact_mappings:
      - edm:EuropeanaAggregation # Europeana's specialization
    close_mappings:
      - dcat:Catalog # Similar (collects datasets) but broader scope
    narrow_mappings:
      - edm:ProvidedCHO # More specific (single cultural object)
```
|
||||
|
||||
#### For Slots
|
||||
|
||||
```yaml
slots:
  aggregates_from:
    slot_uri: ore:aggregates # Primary predicate
    description: |
      Institutions whose data is aggregated (harvested and stored) by this platform.

      ore:aggregates - "Aggregations assert ore:aggregates relationships."
    exact_mappings:
      - edm:aggregatedCHO # Europeana equivalent
    range: HeritageCustodian
    multivalued: true
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Aggregation vs. Linking: A Mapping Example
|
||||
|
||||
This project requires **semantic precision** in distinguishing:
|
||||
|
||||
| Concept | Primary Mapping | Semantic Pattern |
|
||||
|---------|-----------------|------------------|
|
||||
| **Data Aggregation** | `ore:Aggregation` | Data is COPIED to aggregator's server |
|
||||
| **Linking/Federation** | `dcat:DataService` | Data REMAINS at source; only links provided |
|
||||
|
||||
### Aggregation Pattern (Data Duplication)
|
||||
|
||||
```yaml
classes:
  DataAggregator:
    class_uri: ore:Aggregation
    description: |
      Harvests and stores copies of metadata from partner institutions.

      Key semantic: Data DUPLICATION occurs - the aggregator maintains its own copy.

      Examples: Europeana, DPLA, Archives Portal Europe
    exact_mappings:
      - edm:EuropeanaAggregation
    annotations:
      data_storage_pattern: AGGREGATION
      causes_data_duplication: true
```
|
||||
|
||||
### Linking Pattern (Single Source of Truth)
|
||||
|
||||
```yaml
classes:
  FederatedDiscoveryPortal:
    class_uri: dcat:DataService
    description: |
      Provides unified search across multiple institutions but LINKS to original sources.

      Key semantic: NO data duplication - users are redirected to source institutions.

      Data remains at partner institutions' platforms (single source of truth).
    close_mappings:
      - schema:SearchAction # The search functionality
    related_mappings:
      - ore:Aggregation # Related but crucially different
    annotations:
      data_storage_pattern: LINKING
      causes_data_duplication: false
```
|
||||
|
||||
### Linking Properties from EDM
|
||||
|
||||
Use `edm:isShownAt` and `edm:isShownBy` to express links to source:
|
||||
|
||||
```yaml
slots:
  is_shown_at:
    slot_uri: edm:isShownAt
    description: |
      Unambiguous URL to the digital object on the provider's web site
      in its full information context.

      edm:isShownAt - "The URL of a web view of the object in full context."

      This property LINKS to the source institution - no data duplication.
    range: uri

  is_shown_by:
    slot_uri: edm:isShownBy
    description: |
      Direct URL to the object in best available resolution on provider's site.

      edm:isShownBy - "The URL of the object itself (not the context page)."
    range: uri
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Complete Mapping Documentation Template
|
||||
|
||||
When creating or updating a class with ontology mappings:
|
||||
|
||||
```yaml
classes:
  MyNewClass:
    # === PRIMARY IDENTITY ===
    class_uri: {prefix}:{ClassName} # The ontology class this IS

    # === DESCRIPTION WITH ONTOLOGY REFERENCE ===
    description: |
      {Human-readable description of what this class represents}

      {Ontology}: {class} - "{Definition from ontology documentation}"

      Mapping rationale:
      - Chosen because: {why this ontology class fits}
      - Not using X because: {why alternatives were rejected}

    # === SKOS-BASED MAPPINGS ===
    exact_mappings:
      - {prefix}:{EquivalentClass} # Same meaning, different vocabulary
    close_mappings:
      - {prefix}:{SimilarClass} # Very similar but not identical
    narrow_mappings:
      - {prefix}:{MoreSpecificClass} # External is narrower than ours
    broad_mappings:
      - {prefix}:{MoreGeneralClass} # External is broader than ours
    related_mappings:
      - {prefix}:{RelatedClass} # Conceptually related

    # === OPTIONAL ANNOTATIONS ===
    annotations:
      ontology_source: "{Full name of source ontology}"
      ontology_version: "{Version if applicable}"
      mapping_confidence: "high|medium|low"
      mapping_notes: "{Additional context}"
```
|
||||
|
||||
---
|
||||
|
||||
## 6. Validation Checklist
|
||||
|
||||
Before committing ontology mappings:
|
||||
|
||||
- [ ] `class_uri` / `slot_uri` points to a real URI in `data/ontology/` files
|
||||
- [ ] Description includes ontology definition (quoted from source)
|
||||
- [ ] Mapping rationale documented for non-obvious choices
|
||||
- [ ] `exact_mappings` used ONLY for truly equivalent terms
|
||||
- [ ] `close_mappings` documented with difference explanation
|
||||
- [ ] All prefixes declared in schema's `prefixes:` block
|
||||
- [ ] Prefixes resolve to valid ontology namespaces
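
The prefix item on this checklist lends itself to an automated check. The following is a minimal sketch, assuming the CURIEs used in mappings and the schema's `prefixes:` block have already been loaded into Python; `undeclared_prefixes` is a hypothetical helper, not an existing project script.

```python
def undeclared_prefixes(curies, declared_prefixes):
    """Return the sorted prefixes used in `curies` but missing from `declared_prefixes`."""
    missing = set()
    for curie in curies:
        prefix, separator, local_part = curie.partition(":")
        if not separator or local_part.startswith("//"):
            continue  # not a CURIE (plain name or full IRI such as http://...)
        if prefix not in declared_prefixes:
            missing.add(prefix)
    return sorted(missing)


declared = {
    "ore": "http://www.openarchives.org/ore/terms/",
    "dcat": "http://www.w3.org/ns/dcat#",
}
used = ["ore:Aggregation", "edm:EuropeanaAggregation", "dcat:Catalog"]
missing = undeclared_prefixes(used, declared)
```

This only confirms that a prefix is declared. Whether the namespace resolves to a valid ontology still has to be checked against the files in `data/ontology/`.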
|
||||
|
||||
---
|
||||
|
||||
## 7. Common Ontology Prefixes for Mappings
|
||||
|
||||
| Prefix | Namespace | Ontology | Use For |
|
||||
|--------|-----------|----------|---------|
|
||||
| `ore:` | `http://www.openarchives.org/ore/terms/` | OAI-ORE | Aggregation patterns |
|
||||
| `edm:` | `http://www.europeana.eu/schemas/edm/` | Europeana Data Model | Cultural heritage aggregation |
|
||||
| `dcat:` | `http://www.w3.org/ns/dcat#` | DCAT | Data catalogs, services |
|
||||
| `rico:` | `https://www.ica.org/standards/RiC/ontology#` | Records in Contexts | Archival description |
|
||||
| `crm:` | `http://www.cidoc-crm.org/cidoc-crm/` | CIDOC-CRM | Cultural heritage events |
|
||||
| `schema:` | `http://schema.org/` | Schema.org | Web semantics |
|
||||
| `skos:` | `http://www.w3.org/2004/02/skos/core#` | SKOS | Concepts, labels |
|
||||
| `dcterms:` | `http://purl.org/dc/terms/` | Dublin Core | Metadata properties |
|
||||
| `prov:` | `http://www.w3.org/ns/prov#` | PROV-O | Provenance |
|
||||
| `org:` | `http://www.w3.org/ns/org#` | W3C Organization | Organizations |
|
||||
| `foaf:` | `http://xmlns.com/foaf/0.1/` | FOAF | People, agents |
|
||||
|
||||
---
|
||||
|
||||
## See Also
|
||||
|
||||
- [LinkML Mappings Documentation](https://linkml.io/linkml-model/latest/docs/mappings/)
|
||||
- [LinkML URIs and Mappings Guide](https://linkml.io/linkml/schemas/uris-and-mappings.html)
|
||||
- [LinkML class_uri Reference](https://linkml.io/linkml-model/latest/docs/class_uri/)
|
||||
- [LinkML slot_uri Reference](https://linkml.io/linkml-model/latest/docs/slot_uri/)
|
||||
- Rule 1: Ontology Files Are Your Primary Reference
|
||||
- Rule 38: Slot Centralization and Semantic URI Requirements
|
||||
- Rule 42: No Ontology Prefixes in Slot Names
|
||||
|
||||
---
|
||||
|
||||
**Version**: 1.0.0
|
||||
**Created**: 2026-01-12
|
||||
**Author**: OpenCODE
|
||||
|
|
@ -1,45 +0,0 @@
|
|||
# Rule: Polished Slot Storage Location
|
||||
|
||||
## Summary
|
||||
|
||||
Polished (refactored) canonical slot files MUST be stored in the parent `slots/` directory:
|
||||
|
||||
```
schemas/20251121/linkml/modules/slots/
```
|
||||
|
||||
They must **NOT** be stored in the `20260202_matang/` subdirectory.
|
||||
|
||||
## Rationale
|
||||
|
||||
The `20260202_matang/` subdirectory, including its `new/` subdirectory, contains **draft/unpolished** slot definitions that are pending review. Once a slot file has been polished (ontology-aligned, translated, cleaned), it graduates to the canonical `slots/` directory.
|
||||
|
||||
## Directory Structure
|
||||
|
||||
```
schemas/20251121/linkml/modules/slots/
├── *.yaml                ← Polished canonical slot files go HERE
└── 20260202_matang/
    ├── *.yaml            ← Draft/unpolished canonical slots (staging area)
    └── new/
        └── *.yaml        ← Raw/draft slot definitions pending triage
```
|
||||
|
||||
## Rule
|
||||
|
||||
- When polishing a slot file, write the result to `schemas/20251121/linkml/modules/slots/{slot_name}.yaml`
|
||||
- If the source file was in `20260202_matang/`, remove it from there after writing to `slots/`
|
||||
- If the source file was in `20260202_matang/new/`, it should only be deleted after user confirmation of alias absorption (per the no-autonomous-alias-assignment rule)
|
||||
- If a file already exists in `slots/` (i.e., it was previously polished in an earlier session), overwrite it in place
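
The path logic of this rule can be sketched with `pathlib`. The helper names `polished_destination` and `deletion_needs_confirmation` are hypothetical, and the sketch only computes paths: it neither writes nor deletes files.

```python
from pathlib import PurePosixPath

SLOTS_DIR = PurePosixPath("schemas/20251121/linkml/modules/slots")


def polished_destination(source_path):
    """Canonical location for a polished slot file, wherever the draft lived."""
    source = PurePosixPath(source_path)
    if source.suffix != ".yaml":
        raise ValueError(f"not a slot YAML file: {source_path}")
    return SLOTS_DIR / source.name


def deletion_needs_confirmation(source_path):
    """Drafts under new/ may only be deleted after the user confirms alias absorption."""
    return PurePosixPath(source_path).parent.name == "new"


staged = "schemas/20251121/linkml/modules/slots/20260202_matang/has_pattern.yaml"
raw = "schemas/20251121/linkml/modules/slots/20260202_matang/new/has_pattern.yaml"
destination = polished_destination(staged)
```

Both the staged and the raw draft resolve to the same destination, which matches the overwrite-in-place behavior described above.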
|
||||
|
||||
## Examples
|
||||
|
||||
**CORRECT:**
|
||||
```
schemas/20251121/linkml/modules/slots/has_pattern.yaml ← polished file
```
|
||||
|
||||
**WRONG:**
|
||||
```
schemas/20251121/linkml/modules/slots/20260202_matang/has_pattern.yaml ← should not be here after polishing
```
|
||||
|
|
@ -1,32 +0,0 @@
|
|||
# Rule: Preserve Bespoke Slots Until Refactoring
|
||||
|
||||
**Identifier**: `preserve-bespoke-slots-until-refactoring`
|
||||
**Severity**: **CRITICAL**
|
||||
|
||||
## Core Directive
|
||||
|
||||
**DO NOT remove or migrate "additional" bespoke slots during generic migration passes unless they are the specific target of the current task.**
|
||||
|
||||
## Context
|
||||
|
||||
When migrating a specific slot (e.g., `has_approval_date`), you may encounter other bespoke or legacy slots in the same class file (e.g., `innovation_budget`, `operating_budget`).
|
||||
|
||||
**YOU MUST**:
|
||||
* ✅ Migrate ONLY the specific slot you were instructed to work on.
|
||||
* ✅ Leave other bespoke slots exactly as they are.
|
||||
* ✅ Focus strictly on the current migration target.
|
||||
|
||||
**YOU MUST NOT**:
|
||||
* ❌ Proactively migrate "nearby" slots just because they look like they need refactoring.
|
||||
* ❌ Remove slots that seem unused or redundant without specific instruction.
|
||||
* ❌ "Clean up" the class file by removing legacy attributes.
|
||||
|
||||
## Rationale
|
||||
|
||||
Refactoring is a separate, planned phase. Mixing opportunistic refactoring with systematic slot migration increases the risk of regression and makes changes harder to review. "We will refactor those later."
|
||||
|
||||
## Workflow
|
||||
|
||||
1. **Identify Target**: Identify the specific slot(s) assigned for migration (from `slot_fixes.yaml` or user prompt).
|
||||
2. **Execute Migration**: Apply changes ONLY for those slots.
|
||||
3. **Ignore Others**: Do not touch other slots in the file, even if they violate other rules (like Rule 39 or Rule 53). Those will be handled in their own dedicated tasks.
|
||||
|
|
@ -1,190 +0,0 @@
|
|||
# Rule 56: Semantic Consistency Over Simplicity - Always Execute slot_fixes.yaml Revisions
|
||||
|
||||
🚨 **CRITICAL**: When `slot_fixes.yaml` specifies a revision, agents MUST execute it. Perceived simplicity of the existing slot is NOT a valid reason to reject a migration.
|
||||
|
||||
## The Core Problem
|
||||
|
||||
Previous agents marked migrations as "NO MIGRATION NEEDED" or "WELL_STRUCTURED_NO_MIGRATION_NEEDED" citing reasons like:
|
||||
- "simple enum appropriate"
|
||||
- "would add unnecessary indirection"
|
||||
- "already has proper slot_uri"
|
||||
- "string range is sufficient for this use case"
|
||||
- "well-structured, no migration needed"
|
||||
- "current implementation is adequate"
|
||||
|
||||
**These judgments were INCORRECT.**
|
||||
|
||||
**Over 95% of entries in `slot_fixes.yaml` represent carefully curated migrations that MUST be executed.**
|
||||
|
||||
The revisions were manually designed based on:
|
||||
- Ontology analysis (CIDOC-CRM, RiC-O, PROV-O, Schema.org alignment)
|
||||
- Semantic correctness
|
||||
- Pattern consistency (Rule 39: RiC-O style naming)
|
||||
- Type/Types class hierarchy design (Rule 0b)
|
||||
|
||||
## Why Revisions MUST Be Executed
|
||||
|
||||
| Principle | Explanation |
|
||||
|-----------|-------------|
|
||||
| **Schema Consistency** | Ontology achieves semantic power through consistent patterns, not local optimizations |
|
||||
| **LinkML Mapping Separation** | `slot_uri` handles external ontology alignment; slot structure handles internal consistency |
|
||||
| **Single Responsibility Principle** | Predicates should have single, focused purposes |
|
||||
| **Extensibility First** | Structured classes enable future extension even if current use is simple |
|
||||
| **Curated Quality** | Revisions were manually designed with ontology expertise - trust them |
|
||||
|
||||
## Invalid Reasons to Reject Migrations
|
||||
|
||||
| Rejected Reason | Why It's Invalid |
|
||||
|-----------------|------------------|
|
||||
| "Already has proper slot_uri" | slot_uri is for external mapping; internal structure is separate concern |
|
||||
| "Simple string/enum is sufficient" | Consistency and extensibility trump local simplicity |
|
||||
| "Would add unnecessary indirection" | Indirection enables reuse and future extension |
|
||||
| "Creating a class would over-engineer" | Ontology design favors class-based modeling |
|
||||
| "Well-structured, no migration needed" | **If revision exists, migration IS needed** |
|
||||
| "Current implementation is adequate" | Adequacy is not the standard; consistency is |
|
||||
| "WELL_STRUCTURED_NO_MIGRATION_NEEDED" | **This note type is PROHIBITED** |
|
||||
|
||||
## Valid Reasons to PAUSE Migrations (Not Reject)
|
||||
|
||||
These warrant discussion with the user, NOT unilateral rejection:
|
||||
|
||||
1. **Semantic Conflict**: Proposed slot_uri contradicts the semantic intent of the slot
|
||||
2. **Class Already Exists**: The target class exists under a different name (propose merge)
|
||||
3. **Circular Dependency**: Migration would create import cycles
|
||||
4. **Breaking External API**: Change would break published external consumers
|
||||
|
||||
**Even these reasons require escalation to the user** - agents should NOT mark as "NO MIGRATION NEEDED".
|
||||
|
||||
## The Key Insight
|
||||
|
||||
Agents confused "has good external mapping" with "needs no migration". These are **independent concerns**:
|
||||
|
||||
- **External mapping** (slot_uri, exact_mappings): How this slot relates to external ontologies
|
||||
- **Internal structure** (range, class design): How this slot fits within the LinkML schema
|
||||
|
||||
A slot can have perfect external mappings AND still need structural migration for internal consistency.
|
||||
|
||||
## Correct Agent Behavior
|
||||
|
||||
```yaml
# When encountering a slot_fixes.yaml entry:

1. READ the revision specification completely
2. DO NOT evaluate whether migration is "necessary" - it IS necessary
3. CREATE the specified slots and classes
4. UPDATE the consuming class(es)
5. ARCHIVE the old slot
6. ADD processed section with status: true

# NEVER add notes like:
# - "NO MIGRATION NEEDED"
# - "WELL_STRUCTURED_NO_MIGRATION_NEEDED"
# - "Current implementation is sufficient"
# - "Would add unnecessary complexity"
# - "Simple enum appropriate"
```
|
||||
|
||||
## Handling Previous "NO MIGRATION NEEDED" Entries
|
||||
|
||||
Entries previously marked with notes like "WELL_STRUCTURED_NO_MIGRATION_NEEDED" should be:
|
||||
|
||||
1. **Re-processed**: The migration should be executed as specified in `revision`
|
||||
2. **Notes replaced**: Replace the incorrect notes with actual migration documentation
|
||||
3. **Status verified**: Ensure `status: true` reflects ACTUAL migration, not skipped migration
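
Finding the entries that need re-processing can be sketched as a filter over the parsed `slot_fixes.yaml` entries. This is an assumption-based illustration: `needs_reprocessing` is a hypothetical helper, the entries are assumed to be loaded as Python dictionaries shaped like the examples in this rule, and only the two note markers named in this rule are matched.

```python
PROHIBITED_NOTE_MARKERS = (
    "NO MIGRATION NEEDED",
    "WELL_STRUCTURED_NO_MIGRATION_NEEDED",
)


def needs_reprocessing(entry):
    """True when the processed notes claim that no migration was needed."""
    processed = entry.get("processed") or {}
    notes = str(processed.get("notes", "")).upper()
    return any(marker in notes for marker in PROHIBITED_NOTE_MARKERS)


skipped_entry = {
    "original_slot_id": "https://nde.nl/ontology/hc/slot/cites_appendix",
    "processed": {
        "status": True,
        "notes": "WELL_STRUCTURED_NO_MIGRATION_NEEDED: Already has proper slot_uri",
    },
}
migrated_entry = {
    "original_slot_id": "https://nde.nl/ontology/hc/slot/cites_appendix",
    "processed": {
        "status": True,
        "notes": "Migrated 2026-01-19 per Rule 53/56. Created CITESAppendix.yaml class.",
    },
}
```

A `status: true` value is deliberately ignored here, because the rule states that a skipped migration may have been marked complete.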
|
||||
|
||||
## Example - WRONG Agent Behavior
|
||||
|
||||
```yaml
# WRONG - Agent decided migration wasn't needed
- original_slot_id: https://nde.nl/ontology/hc/slot/cites_appendix
  revision:
    - label: is_or_was_listed_in
      type: slot
    - label: CITESAppendix
      type: class
  processed:
    status: true # ← Marked complete but NOT actually migrated!
    notes: "WELL_STRUCTURED_NO_MIGRATION_NEEDED: Already has proper slot_uri
      and string range is sufficient for CITES appendix values."
```
|
||||
|
||||
## Example - CORRECT Agent Behavior
|
||||
|
||||
```yaml
# CORRECT - Agent executed the migration as specified
- original_slot_id: https://nde.nl/ontology/hc/slot/cites_appendix
  revision:
    - label: is_or_was_listed_in
      type: slot
    - label: CITESAppendix
      type: class
  processed:
    status: true
    timestamp: '2026-01-19T00:00:00Z'
    session: session-2026-01-19-cites-appendix-migration
    notes: 'Migrated 2026-01-19 per Rule 53/56. Created is_or_was_listed_in.yaml.
      Created CITESAppendix.yaml class. Updated BiologicalObject.yaml.
      Archived: modules/slots/archive/cites_appendix_archived_20260119.yaml.'
```
|
||||
|
||||
## Feedback Field
|
||||
|
||||
The `feedback` field in slot_fixes.yaml entries contains user corrections to agent mistakes. When feedback says things like:
|
||||
|
||||
- "I reject this!"
|
||||
- "Conduct the migration"
|
||||
- "Please conduct accordingly"
|
||||
- "I altered the revision"
|
||||
|
||||
This means a previous agent incorrectly deferred the migration, and it MUST now be executed.
|
||||
|
||||
## Schema Consistency Examples
|
||||
|
||||
### Why "Simple URI is fine" is WRONG
|
||||
|
||||
```yaml
# WRONG - Agent judgment: "Simple URI is fine"
thumbnail_url:
  range: uri
  slot_uri: schema:thumbnailUrl

# CORRECT - Consistent with all media references
has_or_had_thumbnail:
  range: Thumbnail # Thumbnail class with has_or_had_url → URL
```
|
||||
|
||||
**Rationale**: All media references (images, thumbnails, videos, documents) should use the same structural pattern.
|
||||
|
||||
### Why "Simple enum is appropriate" is WRONG
|
||||
|
||||
```yaml
# WRONG - "Simple enum is fine"
thinking_mode:
  range: ThinkingModeEnum # enabled, disabled, interleaved

# CORRECT - Enables extension
has_or_had_mode:
  range: ThinkingMode
  # ThinkingMode can have: mode_type, confidence, effective_date, etc.
```
|
||||
|
||||
**Rationale**: Even if current use is simple, structured classes enable future extension without breaking changes.
|
||||
|
||||
## Summary
|
||||
|
||||
**Trust the revision. Execute the migration. Document the work.**
|
||||
|
||||
The `revision` key in `slot_fixes.yaml` represents carefully curated ontology decisions. Agents are **executors** of these decisions, **not evaluators**. The only acceptable output is a completed migration with proper documentation.
|
||||
|
||||
## Related Rules
|
||||
|
||||
- **Rule 53**: slot_fixes.yaml is AUTHORITATIVE - Full Slot Migration
|
||||
- **Rule 55**: Broaden Generic Predicate Ranges Instead of Creating Bespoke Predicates
|
||||
- **Rule 57**: The revision key in slot_fixes.yaml is IMMUTABLE
|
||||
- **Rule 39**: RiC-O Temporal Naming Conventions
|
||||
- **Rule 38**: Slot Centralization and Semantic URI Requirements
|
||||
|
||||
## Revision History
|
||||
|
||||
- 2026-01-19: Strengthened with explicit prohibition of "WELL_STRUCTURED_NO_MIGRATION_NEEDED" notes
|
||||
- 2026-01-16: Created based on analysis of 51 feedback entries in slot_fixes.yaml
|
||||
|
|
@ -1,317 +0,0 @@
|
|||
# Rule 38: Slot Centralization and Semantic URI Requirements
|
||||
|
||||
🚨 **CRITICAL**: All LinkML slots MUST be centralized in `model/symbolic/schema/modules/slots/` and MUST have semantically sound `slot_uri` predicates from base ontologies.
|
||||
|
||||
---
|
||||
|
||||
## 1. Slot Centralization is Mandatory
|
||||
|
||||
**Location**: All slot definitions MUST be in `model/symbolic/schema/modules/slots/`
|
||||
|
||||
**File Naming**: `{slot_name}.yaml` (snake_case)
|
||||
|
||||
**Import Pattern**: Classes import slots via relative imports:
|
||||
```yaml
# In modules/classes/Collection.yaml
imports:
  - ../slots/collection_name
  - ../slots/collection_type_ref
  - ../slots/parent_collection
```
|
||||
|
||||
### Why Centralization?
|
||||
|
||||
1. **UML Visualization**: The frontend's schema service determines aggregation edges from the slots it loads from the database, and that database is populated by ingesting the `modules/slots/` files. Inline slots in class files are NOT properly parsed for visualization.
|
||||
|
||||
2. **Reusability**: Slots can be used by multiple classes without duplication.
|
||||
|
||||
3. **Semantic Consistency**: Single source of truth for slot semantics prevents drift.
|
||||
|
||||
4. **Maintainability**: Changes to slot semantics propagate automatically to all classes.
|
||||
|
||||
### Anti-Pattern: Inline Slot Definitions
|
||||
|
||||
```yaml
# ❌ WRONG - Slots defined inline in class file
classes:
  Collection:
    slots:
      - collection_name
      - parent_collection

slots: # ← This section in a class file is WRONG
  collection_name:
    range: string
```
|
||||
|
||||
```yaml
# ✅ CORRECT - Slots imported from centralized files
# In modules/classes/Collection.yaml
imports:
  - ../slots/collection_name
  - ../slots/parent_collection

classes:
  Collection:
    slots:
      - collection_name
      - parent_collection
```
|
||||
|
||||
---
|
||||
|
||||
## 2. Every Slot MUST Have `slot_uri`
|
||||
|
||||
**`slot_uri`** provides the semantic meaning of the slot in a linked data context. It maps your slot to a predicate from an established ontology. Do not assign an external URI when no exact mapping exists. In that case, which is common, `slot_uri` should be a self-reference using the `hc` prefix.
|
||||
|
||||
### Required Slot File Structure
|
||||
|
||||
```yaml
# Global slot definition for {slot_name}
# Used by: {list of classes}

id: https://nde.nl/ontology/hc/slot/{slot_name}
name: {slot_name}

prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  # Add ontology prefixes as needed
  rico: https://www.ica.org/standards/RiC/ontology#
  schema: http://schema.org/
  skos: http://www.w3.org/2004/02/skos/core#

slots:
  {slot_name}:
    slot_uri: {ontology_prefix}:{predicate} # ← REQUIRED
    description: |
      Description of the slot's semantic meaning.

      {OntologyName}: {predicate} - "{definition from ontology}"
    range: {ClassName or primitive}
    required: true/false
    multivalued: true/false
    # Optional mappings for additional semantic relationships
    exact_mappings:
      - schema:alternatePredicate
    close_mappings:
      - dct:relatedPredicate
    examples:
      - value: {example}
        description: {explanation}
```
|
||||
|
||||
### Ontology Sources for `slot_uri`
|
||||
|
||||
Consult these base ontology files in `/data/ontology/`:
|
||||
|
||||
| Ontology | File | Namespace | Use Cases |
|
||||
|----------|------|-----------|-----------|
|
||||
| **RiC-O** | `RiC-O_1-1.rdf` | `rico:` | Archival records, record sets, custody |
|
||||
| **CIDOC-CRM** | `CIDOC_CRM_v7.1.3.rdf` | `crm:` | Cultural heritage objects, events |
|
||||
| **Schema.org** | `schemaorg.owl` | `schema:` | Web semantics, general properties |
|
||||
| **SKOS** | `skos.rdf` | `skos:` | Labels, concepts, mappings |
|
||||
| **Dublin Core** | `dublin_core_elements.rdf` | `dcterms:` | Metadata properties |
|
||||
| **PROV-O** | `prov-o.ttl` | `prov:` | Provenance tracking |
|
||||
| **PAV** | `pav.rdf` | `pav:` | Provenance, authoring, versioning |
|
||||
| **TOOI** | `tooiont.ttl` | `tooi:` | Dutch government organizations |
|
||||
| **CPOV** | `core-public-organisation-ap.ttl` | `cpov:` | EU public sector |
|
||||
| **ORG** | `org.rdf` | `org:` | Organizations, units, roles |
|
||||
| **FOAF** | `foaf.ttl` | `foaf:` | People, agents, social network |
|
||||
| **GLEIF** | `gleif_base.ttl` | `gleif_base:` | Legal entities |
|
||||
|
||||
### Example: Correct Slot with `slot_uri`
|
||||
|
||||
```yaml
|
||||
# modules/slots/preferred_label.yaml
|
||||
id: https://nde.nl/ontology/hc/slot/preferred_label
|
||||
name: preferred_label_slot
|
||||
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
schema: http://schema.org/
|
||||
rdfs: http://www.w3.org/2000/01/rdf-schema#
|
||||
|
||||
slots:
|
||||
preferred_label:
|
||||
slot_uri: skos:prefLabel # ← REQUIRED
|
||||
description: |
|
||||
The primary display name for this entity.
|
||||
|
||||
SKOS: prefLabel - "A preferred lexical label for a resource."
|
||||
|
||||
This is the CANONICAL name - the standardized label accepted by the
|
||||
entity itself for public representation.
|
||||
range: string
|
||||
required: false
|
||||
exact_mappings:
|
||||
- schema:name
|
||||
- rdfs:label
|
||||
examples:
|
||||
- value: "Rijksmuseum"
|
||||
description: Primary display name for the Rijksmuseum
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Mappings Can Apply to Both Classes AND Slots
|
||||
|
||||
LinkML provides SKOS-based mapping predicates that work on **both classes and slots**:
|
||||
|
||||
| Mapping Type | Predicate | Use Case |
|
||||
|--------------|-----------|----------|
|
||||
| `exact_mappings` | `skos:exactMatch` | Identical meaning |
|
||||
| `close_mappings` | `skos:closeMatch` | Very similar meaning |
|
||||
| `related_mappings` | `skos:relatedMatch` | Semantically related |
|
||||
| `narrow_mappings` | `skos:narrowMatch` | More specific |
|
||||
| `broad_mappings` | `skos:broadMatch` | More general |
|
||||
|
||||
### When to Use Mappings vs. slot_uri
|
||||
|
||||
| Scenario | Use |
|
||||
|----------|-----|
|
||||
| **Primary semantic identity** | `slot_uri` (exactly one) |
|
||||
| **Equivalent predicates in other ontologies** | `exact_mappings` (multiple allowed) |
|
||||
| **Similar but not identical predicates** | `close_mappings` |
|
||||
| **Related predicates with different scope** | `narrow_mappings` / `broad_mappings` |
|
||||
|
||||
### Example: Slot with Multiple Mappings
|
||||
|
||||
```yaml
|
||||
slots:
|
||||
website:
|
||||
slot_uri: gleif_base:hasWebsite # Primary predicate
|
||||
range: uri
|
||||
description: |
|
||||
Official website URL of the organization or entity.
|
||||
|
||||
gleif_base:hasWebsite - "A website associated with something"
|
||||
exact_mappings:
|
||||
- schema:url # Identical meaning in Schema.org
|
||||
close_mappings:
|
||||
- foaf:homepage # Similar but specifically "main" page
|
||||
```
|
||||
|
||||
### Example: Class with Multiple Mappings
|
||||
|
||||
```yaml
|
||||
classes:
|
||||
Collection:
|
||||
class_uri: rico:RecordSet # Primary class
|
||||
exact_mappings:
|
||||
- crm:E78_Curated_Holding # CIDOC-CRM equivalent
|
||||
close_mappings:
|
||||
- bf:Collection # BIBFRAME close match
|
||||
narrow_mappings:
|
||||
- edm:ProvidedCHO # Europeana (narrower - cultural heritage objects)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Workflow for Creating a New Slot
|
||||
|
||||
### Step 1: Search Base Ontologies
|
||||
|
||||
Before creating a slot, search for existing predicates:
|
||||
|
||||
```bash
|
||||
# Search for relevant predicates
|
||||
rg "website|homepage|url" /data/ontology/*.ttl /data/ontology/*.rdf /data/ontology/*.owl
|
||||
|
||||
# Check specific ontology
|
||||
rg "rdfs:label|rdfs:comment" /data/ontology/schemaorg.owl | grep -i "name"
|
||||
```
|
||||
|
||||
### Step 2: Document Ontology Alignment
|
||||
|
||||
In the slot file, document WHY you chose that predicate:
|
||||
|
||||
```yaml
|
||||
slots:
|
||||
source_url:
|
||||
slot_uri: pav:retrievedFrom
|
||||
description: |
|
||||
URL of the web page from which data was retrieved.
|
||||
|
||||
pav:retrievedFrom - "The URI from which the resource was retrieved."
|
||||
|
||||
Chosen over:
|
||||
- schema:url (too generic - refers to the entity's URL, not source)
|
||||
      - dcterms:source (refers to intellectual source, not retrieval location)
|
||||
- prov:wasDerivedFrom (refers to entity derivation, not retrieval)
|
||||
```
|
||||
|
||||
### Step 3: Create Centralized Slot File
|
||||
|
||||
```bash
|
||||
# Create new slot file
|
||||
touch schemas/20251121/linkml/modules/slots/new_slot_name.yaml
|
||||
```
|
||||
|
||||
### Step 4: Update Manifest
|
||||
|
||||
Run the manifest regeneration script, or add the new slot file to the manifest manually:
|
||||
|
||||
```bash
|
||||
cd schemas/20251121/linkml
|
||||
python3 scripts/regenerate_manifest.py
|
||||
```
|
||||
|
||||
### Step 5: Import in Class Files
|
||||
|
||||
Add the import to classes that use this slot.
|
||||
|
||||
---
|
||||
|
||||
## 5. Validation Checklist
|
||||
|
||||
Before committing slot changes:
|
||||
|
||||
- [ ] Slot file is in `modules/slots/`
|
||||
- [ ] Slot has `slot_uri` pointing to an established ontology predicate, or to an `hc:` self-reference when no exact mapping exists
|
||||
- [ ] Predicate is from `data/ontology/` files or standard vocabularies
|
||||
- [ ] Description includes ontology definition
|
||||
- [ ] Rationale documented if multiple predicates were considered
|
||||
- [ ] `exact_mappings`/`close_mappings` added for equivalent predicates
|
||||
- [ ] Manifest updated to include new slot file
|
||||
- [ ] Classes using the slot have been updated with import
|
||||
- [ ] Frontend slot files synced: `frontend/public/schemas/20251121/linkml/modules/slots/`
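
Some of the checklist items above can be checked mechanically. The following is a minimal sketch, not part of the repository tooling: it takes an already-parsed slot file (a plain dict, as a YAML loader would return it) and reports slots without `slot_uri` and CURIEs whose prefix is not declared. The helper name `check_slot_file`, the message wording, and the assumption that every URI is written in CURIE form are illustrative.

```python
MAPPING_KEYS = ("exact_mappings", "close_mappings", "related_mappings",
                "narrow_mappings", "broad_mappings")


def check_slot_file(schema):
    """Return a list of problems found in a parsed slot file (hypothetical helper)."""
    problems = []
    declared = set(schema.get("prefixes") or {})
    for name, slot in (schema.get("slots") or {}).items():
        slot = slot or {}
        slot_uri = slot.get("slot_uri")
        if not slot_uri:
            problems.append(f"{name}: missing slot_uri")
        # Collect every CURIE that must resolve against a declared prefix.
        curies = [slot_uri] if slot_uri else []
        for key in MAPPING_KEYS:
            curies.extend(slot.get(key) or [])
        for curie in curies:
            prefix = curie.split(":", 1)[0]
            if ":" not in curie or prefix not in declared:
                problems.append(f"{name}: undeclared prefix in '{curie}'")
    return problems


# Illustrative input: 'schema' is used in a mapping but never declared,
# and the second slot has no slot_uri at all.
example = {
    "prefixes": {"hc": "https://nde.nl/ontology/hc/",
                 "skos": "http://www.w3.org/2004/02/skos/core#"},
    "slots": {
        "preferred_label": {"slot_uri": "skos:prefLabel",
                            "exact_mappings": ["schema:name"]},
        "untyped": {"range": "string"},
    },
}
for problem in check_slot_file(example):
    print(problem)
```

The remaining items (documented rationale, manifest and frontend sync) still need a human or a repository-aware script.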
|
||||
|
||||
---
|
||||
|
||||
## 6. Common Slot URI Mappings
|
||||
|
||||
| Slot Concept | Recommended `slot_uri` | Alternative Mappings |
|
||||
|--------------|------------------------|---------------------|
|
||||
| Preferred name | `skos:prefLabel` | `schema:name`, `rdfs:label` |
|
||||
| Alternative names | `skos:altLabel` | `schema:alternateName` |
|
||||
| Description | `dcterms:description` | `schema:description`, `rdfs:comment` |
|
||||
| Identifier | `dcterms:identifier` | `schema:identifier` |
|
||||
| Website URL | `gleif_base:hasWebsite` | `schema:url`, `foaf:homepage` |
|
||||
| Source URL | `pav:retrievedFrom` | `prov:wasDerivedFrom` |
|
||||
| Created date | `dcterms:created` | `schema:dateCreated`, `prov:generatedAtTime` |
|
||||
| Modified date | `dcterms:modified` | `schema:dateModified` |
|
||||
| Language | `schema:inLanguage` | `dcterms:language` |
|
||||
| Part of | `dcterms:isPartOf` | `rico:isOrWasPartOf`, `schema:isPartOf` |
|
||||
| Has part | `dcterms:hasPart` | `rico:hasOrHadPart`, `schema:hasPart` |
|
||||
| Location | `schema:location` | `locn:address`, `crm:P53_has_former_or_current_location` |
|
||||
| Start date | `schema:startDate` | `prov:startedAtTime`, `rico:hasBeginningDate` |
|
||||
| End date | `schema:endDate` | `prov:endedAtTime`, `rico:hasEndDate` |
|
||||
|
||||
---
|
||||
|
||||
## See Also
|
||||
|
||||
- [LinkML slot_uri documentation](https://linkml.io/linkml-model/latest/docs/slot_uri/)
|
||||
- [LinkML mappings documentation](https://linkml.io/linkml-model/latest/docs/mappings/)
|
||||
- [LinkML URIs and Mappings guide](https://linkml.io/linkml/schemas/uris-and-mappings.html)
|
||||
- Rule 1: Ontology Files Are Your Primary Reference
|
||||
- Rule 0: LinkML Schemas Are the Single Source of Truth
|
||||
|
||||
---
|
||||
|
||||
**Version**: 1.0.0
|
||||
**Created**: 2026-01-06
|
||||
**Author**: OpenCODE
|
||||
|
|
@ -1,29 +0,0 @@
|
|||
# Rule: Slot Fixes File is Authoritative
|
||||
|
||||
**Scope:** Schema Migration / Slot Fixes
|
||||
|
||||
**Description:**
|
||||
The file `slot_fixes.yaml` is the **single authoritative source** for tracking slot migrations and fixes.
|
||||
|
||||
**Directives:**
|
||||
1. **Authoritative Source:** Always read and update `slot_fixes.yaml`.
|
||||
2. **Processed Status:** When a slot migration is completed (schema updated, data migrated), you MUST update the entry in `slot_fixes.yaml` with a `processed` block containing:
|
||||
* `status: true`
|
||||
* `date: 'YYYY-MM-DD'`
|
||||
* `notes`: Brief description of what was done.
|
||||
3. **NEVER DELETE:** You MUST NOT delete entries from `slot_fixes.yaml`. Even if a slot is removed from the schema, the record of its fix MUST remain in this file with `status: true`.
|
||||
4. **Format Compliance:** New slots added during migration must follow proper LinkML format conventions and use `slot_uri` and mappings (`exact_mappings`, `close_mappings`) that reference **legitimate predicates and classes found in `/Users/kempersc/apps/glam/data/ontology/`**.
|
||||
|
||||
**Example of Processed Entry:**
|
||||
```yaml
|
||||
- original_slot_id: https://nde.nl/ontology/hc/slot/has_old_slot
|
||||
revision:
|
||||
- label: has_new_slot
|
||||
type: slot
|
||||
- label: NewClass
|
||||
type: class
|
||||
processed:
|
||||
status: true
|
||||
date: '2026-01-27'
|
||||
notes: Migrated to has_new_slot + NewClass. Old slot archived.
|
||||
```
|
||||
|
|
@ -1,169 +0,0 @@
|
|||
# Rule: slot_fixes.yaml Revision Key Immutability
|
||||
|
||||
## Status: CRITICAL
|
||||
|
||||
## Summary
|
||||
|
||||
The `revision` key in `slot_fixes.yaml` is **IMMUTABLE**. AI agents MUST follow revision specifications exactly and are NEVER permitted to modify the content of revision entries.
|
||||
|
||||
## The Authoritative Source
|
||||
|
||||
The file `slot_fixes.yaml` serves as the **curated migration specification** for all slot consolidations in the Heritage Custodian Ontology. Each entry's `revision` section was manually curated based on:
|
||||
|
||||
- Ontology analysis (CIDOC-CRM, RiC-O, PROV-O, Schema.org alignment)
|
||||
- Semantic correctness
|
||||
- Pattern consistency (Rule 39: RiC-O style naming)
|
||||
- Type/Types class hierarchy design (Rule 0b)
|
||||
|
||||
## What Agents CAN Do
|
||||
|
||||
| Action | Permitted | Location |
|
||||
|--------|-----------|----------|
|
||||
| Add completion notes | ✅ YES | `processed.notes` |
|
||||
| Update status | ✅ YES | `processed.status` |
|
||||
| Add feedback responses | ✅ YES | `feedback.response` |
|
||||
| Mark feedback as done | ✅ YES | `feedback.done` |
|
||||
| Execute the migration per revision | ✅ YES | Class/slot files |
|
||||
|
||||
## What Agents CANNOT Do
|
||||
|
||||
| Action | Permitted | Reason |
|
||||
|--------|-----------|--------|
|
||||
| Modify `revision` content | ❌ NEVER | Authoritative specification |
|
||||
| Substitute different slots | ❌ NEVER | Violates curated design |
|
||||
| Skip revision components | ❌ NEVER | Incomplete migration |
|
||||
| Add new revision items | ❌ NEVER | Requires human curation |
|
||||
| Change revision labels | ❌ NEVER | Breaks semantic mapping |
|
||||
| Reorder revision items | ❌ NEVER | `link_branch` dependencies |
|
||||
|
||||
## Structure of slot_fixes.yaml Entries
|
||||
|
||||
```yaml
|
||||
- original_slot_id: https://nde.nl/ontology/hc/slot/example_slot
|
||||
original_slot_label: example_slot
|
||||
revision: # ← IMMUTABLE - DO NOT MODIFY
|
||||
- label: has_or_had_example # Generic slot to use
|
||||
type: slot
|
||||
- label: Example # Class for range
|
||||
type: class
|
||||
- label: has_or_had_attribute # Nested attribute (link_branch: 1)
|
||||
type: slot
|
||||
link_branch: 1
|
||||
processed:
|
||||
status: false # ← CAN UPDATE to true
|
||||
notes: "" # ← CAN ADD notes here
|
||||
feedback: # ← CAN ADD responses here
|
||||
user: "Simon C. Kemper"
|
||||
date: "2026-01-17"
|
||||
message: "Migration incomplete"
|
||||
done: false # ← CAN UPDATE to true
|
||||
response: "" # ← CAN ADD response here
|
||||
```
|
||||
|
||||
## Understanding `link_branch` in Revisions
|
||||
|
||||
The `link_branch` field indicates **nested class attributes**:
|
||||
|
||||
| Revision Item | Meaning |
|
||||
|---------------|---------|
|
||||
| Items **WITHOUT** `link_branch` | PRIMARY slot and class to create |
|
||||
| Items **WITH** `link_branch: 1` | First attribute the primary class needs |
|
||||
| Items **WITH** `link_branch: 2` | Second attribute the primary class needs |
|
||||
|
||||
**Example**:
|
||||
```yaml
|
||||
revision:
|
||||
- label: has_or_had_quantity # PRIMARY SLOT
|
||||
type: slot
|
||||
- label: Quantity # PRIMARY CLASS
|
||||
type: class
|
||||
- label: has_or_had_measurement_unit # Quantity.has_or_had_measurement_unit
|
||||
type: slot
|
||||
link_branch: 1
|
||||
- label: MeasureUnit # Range of branch 1 slot
|
||||
type: class
|
||||
link_branch: 1
|
||||
```
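
To make the grouping concrete, here is a small sketch that reads a `revision` list and separates the primary items from the `link_branch` attribute groups. It only reads the revision and never writes to it; the function name `split_revision` is an assumption made for illustration.

```python
from collections import defaultdict


def split_revision(revision):
    """Separate primary revision items from link_branch attribute groups."""
    primary = []
    branches = defaultdict(list)
    for item in revision:
        branch = item.get("link_branch")
        if branch is None:
            primary.append(item["label"])   # primary slot or class
        else:
            branches[branch].append(item["label"])  # nested attribute of the primary class
    return primary, dict(branches)


# Same data as the YAML example above, expressed as parsed Python objects.
revision = [
    {"label": "has_or_had_quantity", "type": "slot"},
    {"label": "Quantity", "type": "class"},
    {"label": "has_or_had_measurement_unit", "type": "slot", "link_branch": 1},
    {"label": "MeasureUnit", "type": "class", "link_branch": 1},
]
primary, branches = split_revision(revision)
print(primary)    # ['has_or_had_quantity', 'Quantity']
print(branches)   # {1: ['has_or_had_measurement_unit', 'MeasureUnit']}
```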
|
||||
|
||||
## Migration Workflow
|
||||
|
||||
1. **READ** the `revision` section completely
|
||||
2. **VERIFY** all referenced slots/classes exist (or create them)
|
||||
3. **REMOVE** old slot from imports, slots list, and slot_usage in consuming classes
|
||||
4. **ADD** new slot(s) and class import(s) per revision specification
|
||||
5. **UPDATE** slot_usage to narrow range to specified class
|
||||
6. **VALIDATE** with `linkml-lint` or `gen-owl`
|
||||
7. **UPDATE** slot_fixes.yaml:
|
||||
- Set `processed.status: true`
|
||||
- Add completion note to `processed.notes`
|
||||
- If feedback exists, set `feedback.done: true` and add `feedback.response`
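
Step 7 of the workflow above can be sketched as a function that touches only the permitted fields. This is an illustrative helper, not existing tooling: the name `mark_processed` and its parameters are assumptions, and the entry is a plain dict as a YAML loader would produce.

```python
import copy


def mark_processed(entry, date, notes, response=None):
    """Return a copy of a slot_fixes.yaml entry with only permitted fields updated."""
    updated = copy.deepcopy(entry)  # never mutate the loaded entry in place
    processed = updated.setdefault("processed", {})
    processed["status"] = True
    processed["date"] = date
    processed["notes"] = notes
    # Feedback is optional; close it only when a response is supplied.
    if updated.get("feedback") and response is not None:
        updated["feedback"]["done"] = True
        updated["feedback"]["response"] = response
    # The 'revision' key is copied verbatim and never written to.
    return updated


entry = {
    "original_slot_id": "https://nde.nl/ontology/hc/slot/example_slot",
    "revision": [{"label": "has_or_had_example", "type": "slot"},
                 {"label": "Example", "type": "class"}],
    "processed": {"status": False, "notes": ""},
    "feedback": {"message": "Migration incomplete", "done": False, "response": ""},
}
done = mark_processed(entry, "2026-01-27",
                      "Migrated to has_or_had_example + Example.",
                      response="Completed per revision.")
print(done["processed"]["status"], done["revision"] == entry["revision"])
```

Returning a copy keeps the original entry available for a before/after comparison, which makes it easy to confirm that `revision` is unchanged.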
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
### WRONG - Modifying Revision Content
|
||||
|
||||
```yaml
|
||||
# Agent incorrectly "improves" the revision
|
||||
revision:
|
||||
- label: has_description # ❌ CHANGED from has_or_had_description
|
||||
type: slot
|
||||
- label: TextDescription # ❌ CHANGED from Description
|
||||
type: class
|
||||
```
|
||||
|
||||
### WRONG - Substituting Different Slots
|
||||
|
||||
```yaml
|
||||
# Agent uses a different slot than specified
|
||||
# Revision says: has_or_had_type + BindingType
|
||||
# Agent uses: binding_classification + BindingClassification ❌ WRONG
|
||||
```
|
||||
|
||||
### WRONG - Partial Migration
|
||||
|
||||
```yaml
|
||||
# Agent only creates the slot, ignores the class
|
||||
revision:
|
||||
- label: has_or_had_type # ✅ Agent created this
|
||||
type: slot
|
||||
- label: BindingType # ❌ Agent ignored this
|
||||
type: class
|
||||
```
|
||||
|
||||
### CORRECT - Following Revision Exactly
|
||||
|
||||
```yaml
|
||||
# Revision specifies:
|
||||
revision:
|
||||
- label: has_or_had_description
|
||||
type: slot
|
||||
- label: Description
|
||||
type: class
|
||||
|
||||
# Agent creates/uses EXACTLY:
|
||||
# 1. Import ../slots/has_or_had_description
|
||||
# 2. Import ../classes/Description
|
||||
# 3. slot_usage: has_or_had_description with range: Description
|
||||
```
|
||||
|
||||
## Rationale
|
||||
|
||||
1. **Curated Quality**: Revisions were manually designed with ontology expertise
|
||||
2. **Consistency**: Same patterns applied across all migrations
|
||||
3. **Auditability**: Clear record of intended vs. actual changes
|
||||
4. **Reversibility**: Original specifications preserved for review
|
||||
5. **Trust**: Users can rely on revision specifications being stable
|
||||
|
||||
## Related Rules
|
||||
|
||||
- **Rule 53**: Full Slot Migration - slot_fixes.yaml is AUTHORITATIVE
|
||||
- **Rule 56**: Semantic Consistency Over Simplicity
|
||||
- **Rule 39**: Slot Naming Convention (RiC-O Style)
|
||||
- **Rule 38**: Slot Centralization and Semantic URI Requirements
|
||||
- **Rule 0b**: Type/Types File Naming Convention
|
||||
|
||||
## See Also
|
||||
|
||||
- `schemas/20251121/linkml/modules/slots/slot_fixes.yaml` - The authoritative file
|
||||
- `.opencode/rules/full-slot-migration-rule.md` - Migration execution rules
|
||||
- `.opencode/rules/semantic-consistency-over-simplicity.md` - Why revisions must be followed
|
||||
|
|
@ -1,69 +0,0 @@
|
|||
# Rule: Slot Naming Convention (Current Style)
|
||||
|
||||
🚨 **CRITICAL**: New LinkML slot names MUST follow the current verb-first naming style used in active slot files under `modules/slots/`.
|
||||
|
||||
## Core Naming Rules
|
||||
|
||||
1. Use `snake_case`.
|
||||
2. Prefer short, descriptive verb predicates as canonical names.
|
||||
3. Keep names ontology-neutral (no ontology namespace prefixes in slot names).
|
||||
4. Use singular nouns in object positions (including multivalued slots).
|
||||
5. Keep temporal semantics in mappings/definitions when needed, not by forcing a legacy prefix.
|
||||
|
||||
## Preferred Patterns
|
||||
|
||||
### 1) Simple verb predicates (default)
|
||||
|
||||
Use a single verb when it clearly expresses the relation.
|
||||
|
||||
Examples from active slots:
|
||||
- `accept`
|
||||
- `contain`
|
||||
- `catalogue`
|
||||
- `exhibit`
|
||||
|
||||
### 2) Verb + particle/preposition when needed
|
||||
|
||||
Use compact phrasal forms when a preposition carries core meaning.
|
||||
|
||||
Examples:
|
||||
- `belong_to`
|
||||
- `located_in`
|
||||
- `derived_from`
|
||||
|
||||
### 3) Symmetric or directional pair pattern
|
||||
|
||||
Use `<present>_or_<past_participle>` when both directions/states are intentionally modeled in one predicate label.
|
||||
|
||||
Examples:
|
||||
- `contains_or_contained`
|
||||
- `includes_or_included`
|
||||
- `operates_or_operated`
|
||||
|
||||
## Legacy Compatibility
|
||||
|
||||
- For migrations, keep backward compatibility via `aliases` when renaming to current-style canonical names.
|
||||
- Do not rename canonical slots opportunistically; follow migration plans and canonical-slot protection rules.
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
- ❌ `rico_has_or_had_holder` (ontology prefix in name)
|
||||
- ❌ `collections` (plural noun predicate)
|
||||
- ❌ `has_museum_visitor_count` (class-specific slot name)
|
||||
- ❌ Creating new `has_or_had_*` names by default when a verb predicate is clearer
|
||||
|
||||
## Quick Checklist
|
||||
|
||||
- [ ] Is the canonical slot name verb-first and descriptive?
|
||||
- [ ] Is it `snake_case`?
|
||||
- [ ] Is the noun part singular?
|
||||
- [ ] Is the name ontology-neutral?
|
||||
- [ ] If renaming legacy slots, are aliases/migration constraints handled?
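
Three of the checklist questions can be approximated in code. The sketch below is a heuristic under stated assumptions: the prefix list is a hand-picked sample of ontology namespaces, the plural test is a naive suffix check that will produce false positives, and the name `check_slot_name` is hypothetical. Whether a name is verb-first and descriptive still needs human review.

```python
import re

SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")
# Assumed sample of ontology namespace prefixes; extend as needed.
ONTOLOGY_PREFIXES = ("rico", "crm", "skos", "prov", "pav", "foaf", "dcterms")


def check_slot_name(name):
    """Return a list of naming issues for a candidate slot name (heuristic)."""
    issues = []
    if not SNAKE_CASE.match(name):
        issues.append("not snake_case")
    if name.split("_", 1)[0] in ONTOLOGY_PREFIXES:
        issues.append("ontology prefix in name")
    # Naive plural check on the last token; flags candidates for review only.
    last = name.rsplit("_", 1)[-1]
    if last.endswith("s") and not last.endswith(("ss", "us", "is")):
        issues.append("possibly plural noun")
    return issues


for candidate in ("belong_to", "rico_has_or_had_holder", "collections", "hasCollection"):
    print(candidate, check_slot_name(candidate))
```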
|
||||
|
||||
## See Also
|
||||
|
||||
- `.opencode/rules/archive/DEPRECATED-slot-naming-convention-rico-style.md`
|
||||
- `.opencode/rules/no-ontology-prefix-in-slot-names.md`
|
||||
- `.opencode/rules/slot-noun-singular-convention.md`
|
||||
- `.opencode/rules/generic-slots-specific-classes.md`
|
||||
- `.opencode/rules/canonical-slot-protection-rule.md`
|
||||
|
|
@ -1,80 +0,0 @@
|
|||
# Rule: Slot Nouns Must Be Singular
|
||||
|
||||
🚨 **CRITICAL**: LinkML slot names MUST use singular nouns, even for multivalued slots. The `multivalued: true` property indicates cardinality, not the slot name.
|
||||
|
||||
## Rationale
|
||||
|
||||
1. **Predicate semantics**: Slots represent predicates/relationships. In RDF, `hasCollection` can have multiple objects without changing the predicate name.
|
||||
2. **Consistency**: Singular names work for both single-valued and multivalued slots.
|
||||
3. **Ontology alignment**: Standard ontologies use singular predicates (`skos:broader`, `org:hasMember`, `rico:hasOrHadHolder`).
|
||||
4. **Readability**: `custodian.has_or_had_custodian_type` reads naturally as "custodian has (or had) custodian type".
|
||||
|
||||
## Correct Pattern
|
||||
|
||||
```yaml
|
||||
slots:
|
||||
has_or_had_custodian_type: # ✅ CORRECT - singular noun
|
||||
slot_uri: org:classification
|
||||
range: CustodianType
|
||||
multivalued: true # Cardinality expressed here, not in name
|
||||
|
||||
has_or_had_collection: # ✅ CORRECT - singular noun
|
||||
slot_uri: rico:hasOrHadPart
|
||||
range: CustodianCollection
|
||||
multivalued: true
|
||||
|
||||
has_or_had_member: # ✅ CORRECT - singular noun
|
||||
slot_uri: org:hasMember
|
||||
range: Custodian
|
||||
multivalued: true
|
||||
```
|
||||
|
||||
## Incorrect Pattern
|
||||
|
||||
```yaml
|
||||
slots:
|
||||
has_or_had_custodian_types: # ❌ WRONG - plural noun
|
||||
multivalued: true
|
||||
|
||||
collections: # ❌ WRONG - plural noun
|
||||
multivalued: true
|
||||
|
||||
members: # ❌ WRONG - plural noun
|
||||
multivalued: true
|
||||
```
|
||||
|
||||
## Migration Examples
|
||||
|
||||
| Old (Plural) | New (Singular) |
|
||||
|--------------|----------------|
|
||||
| `custodian_types` | `has_or_had_custodian_type` |
|
||||
| `collections` | `has_or_had_collection` |
|
||||
| `identifiers` | `identifier` |
|
||||
| `alternative_names` | `alternative_name` |
|
||||
| `staff_members` | `staff_member` |
|
||||
|
||||
## Exceptions
|
||||
|
||||
**Compound concepts** where the plural is part of the concept name itself:
|
||||
|
||||
- `archives_regionales` - French administrative term (proper noun)
|
||||
- `united_states` - Geographic proper noun
|
||||
|
||||
**NOT exceptions** (still use singular):
|
||||
|
||||
- `has_or_had_identifier` not `has_or_had_identifiers` (even if institution has multiple)
|
||||
- `broader_type` not `broader_types` (even if multiple broader types)
|
||||
|
||||
## Implementation
|
||||
|
||||
When creating or renaming slots:
|
||||
|
||||
1. Extract the noun from the slot name
|
||||
2. Convert to singular form
|
||||
3. Combine with relationship prefix (`has_or_had_`, `is_or_was_`, etc.)
|
||||
4. Set `multivalued: true` if multiple values are expected
|
||||
|
||||
## See Also
|
||||
|
||||
- `.opencode/rules/slot-naming-convention-current-style.md` - Current slot naming patterns
|
||||
- `.opencode/rules/slot-centralization-and-semantic-uri-rule.md` - Slot centralization requirements
|
||||
|
|
@ -1,174 +0,0 @@
|
|||
# Rule 49: Slot Usage Minimization - No Redundant Overrides
|
||||
|
||||
## Summary
|
||||
|
||||
LinkML `slot_usage` entries MUST provide meaningful modifications to the generic slot definition. Redundant `slot_usage` entries that merely re-declare the same values as the generic slot MUST be removed.
|
||||
|
||||
## Background
|
||||
|
||||
### What is slot_usage?
|
||||
|
||||
In LinkML, [`slot_usage`](https://linkml.io/linkml-model/latest/docs/slot_usage/) allows a class to customize how an inherited slot behaves within that specific class context. It enables:
|
||||
|
||||
- Narrowing the `range` to a more specific type
|
||||
- Adding class-specific `required`, `multivalued`, or `identifier` constraints
|
||||
- Providing class-specific `description`, `examples`, or `pattern` overrides
|
||||
- Adding class-specific semantic mappings (`exact_mappings`, `close_mappings`, etc.)
|
||||
|
||||
### The Problem
|
||||
|
||||
A code generation process created **874 redundant `slot_usage` entries** across **374 class files** that simply re-declare the same `range` and `inlined` values already defined in the generic slot:
|
||||
|
||||
```yaml
|
||||
# In modules/slots/template_specificity.yaml (GENERIC DEFINITION)
|
||||
slots:
|
||||
template_specificity:
|
||||
slot_uri: hc:templateSpecificity
|
||||
range: TemplateSpecificityScores
|
||||
inlined: true
|
||||
|
||||
# In modules/classes/AdministrativeOffice.yaml (REDUNDANT OVERRIDE)
|
||||
slot_usage:
|
||||
template_specificity:
|
||||
range: TemplateSpecificityScores # Same as generic!
|
||||
inlined: true # Same as generic!
|
||||
```
|
||||
|
||||
This creates:
|
||||
1. **Visual noise** in the schema viewer (slot_usage badge displayed when nothing is actually customized)
|
||||
2. **Maintenance burden** (changes to generic slot must be mirrored in 374 files)
|
||||
3. **Semantic confusion** (suggests customization where none exists)
|
||||
|
||||
## The Rule
|
||||
|
||||
### MUST Remove: Truly Redundant Overrides
|
||||
|
||||
A `slot_usage` entry is **truly redundant** and MUST be removed if:
|
||||
|
||||
1. **All properties match the generic slot definition exactly**
|
||||
2. **No additional properties are added** (no extra `examples`, `description`, `required`, etc.)
|
||||
|
||||
```yaml
|
||||
# REDUNDANT - Remove this entire slot_usage entry
|
||||
slot_usage:
|
||||
template_specificity:
|
||||
range: TemplateSpecificityScores
|
||||
inlined: true
|
||||
```
|
||||
|
||||
### MAY Keep: Description-Only Modifications
|
||||
|
||||
A `slot_usage` entry that ONLY modifies the `description` by adding articles or context MAY be kept if it provides **semantic value** by referring to a specific entity rather than a general concept.
|
||||
|
||||
**Tolerated Example** (adds definiteness):
|
||||
```yaml
|
||||
# Generic slot
|
||||
slots:
|
||||
has_or_had_record_set:
|
||||
description: Record sets associated with a custodian.
|
||||
range: RecordSet
|
||||
|
||||
# Class-specific slot_usage - TOLERABLE
|
||||
slot_usage:
|
||||
has_or_had_record_set:
|
||||
description: The record sets held by this archive. # "The" makes it definite
|
||||
```
|
||||
|
||||
**Rationale**: "The record sets" (definite) vs "record sets" (indefinite) conveys that this class specifically requires/expects record sets, rather than merely allowing them. This is a **semantic distinction** in linguistic terms (definiteness marking).
|
||||
|
||||
### MUST Keep: Meaningful Modifications
|
||||
|
||||
A `slot_usage` entry MUST be kept if it provides ANY of the following:
|
||||
|
||||
| Modification Type | Example |
|
||||
|-------------------|---------|
|
||||
| **Range narrowing** | `range: MuseumCollection` (from generic `Collection`) |
|
||||
| **Required constraint** | `required: true` (when generic is optional) |
|
||||
| **Pattern override** | `pattern: "^NL-.*"` (Dutch ISIL codes only) |
|
||||
| **Examples addition** | Class-specific examples not in generic |
|
||||
| **Inlined change** | `inlined: true` when generic is `false` |
|
||||
| **Identifier designation** | `identifier: true` for primary key |
|
||||
|
||||
## Decision Matrix
|
||||
|
||||
| Scenario | Action |
|
||||
|----------|--------|
|
||||
| All properties match generic exactly | **REMOVE** |
|
||||
| Only `range` and/or `inlined` match generic | **REMOVE** |
|
||||
| Only `description` differs by adding articles | **TOLERATE** (but consider removing) |
|
||||
| `description` provides substantive new information | **KEEP** |
|
||||
| Any other property modified | **KEEP** |
|
||||
|
||||
## Implementation
|
||||
|
||||
### Cleanup Script
|
||||
|
||||
Use the following to identify and remove redundant overrides:
|
||||
|
||||
```python
|
||||
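# scripts/cleanup_redundant_slot_usage.py
import glob
from functools import lru_cache

import yaml

SCHEMA_DIR = 'schemas/20251121/linkml'
SLOTS_TO_CHECK = ['template_specificity', 'specificity_annotation']


@lru_cache(maxsize=None)
def load_generic(slot_name):
    """Load the generic slot definition.

    Assumes each generic slot lives in modules/slots/<slot_name>.yaml
    under a top-level 'slots' key.
    """
    with open(f'{SCHEMA_DIR}/modules/slots/{slot_name}.yaml') as f:
        schema = yaml.safe_load(f) or {}
    return (schema.get('slots') or {}).get(slot_name) or {}


def is_redundant(override, slot_name):
    """True if every property in the override repeats the generic value."""
    generic = load_generic(slot_name)
    return all(generic.get(key) == value
               for key, value in (override or {}).items())


for class_file in glob.glob(f'{SCHEMA_DIR}/modules/classes/*.yaml'):
    with open(class_file) as f:
        content = yaml.safe_load(f) or {}

    modified = False
    for cls_name, cls_def in (content.get('classes') or {}).items():
        slot_usage = (cls_def or {}).get('slot_usage') or {}
        for slot_name in SLOTS_TO_CHECK:
            # Redundant = only re-declares values from the generic slot
            if slot_name in slot_usage and is_redundant(slot_usage[slot_name], slot_name):
                del slot_usage[slot_name]
                modified = True

        # Remove slot_usage only if the key exists and is now empty
        if cls_def and 'slot_usage' in cls_def and not slot_usage:
            del cls_def['slot_usage']
            modified = True

    if modified:
        with open(class_file, 'w') as f:
            yaml.dump(content, f, allow_unicode=True, sort_keys=False)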
|
||||
```
|
||||
|
||||
### Validation
|
||||
|
||||
After cleanup, validate that:
|
||||
1. `linkml-validate` passes for all schemas
|
||||
2. Generated RDF/OWL output is unchanged (redundant overrides have no semantic effect)
|
||||
3. Frontend slot viewer shows fewer `slot_usage` badges
|
||||
|
||||
## Frontend UX Implications
|
||||
|
||||
The frontend LinkML viewer should:
|
||||
|
||||
1. **Display "Slot Usage"** (with space, no underscore) instead of `slot_usage`
|
||||
2. **Add tooltip** explaining what slot_usage means, linking to [LinkML documentation](https://linkml.io/linkml-model/latest/docs/slot_usage/)
|
||||
3. **Only show badge** when `slot_usage` contains meaningful modifications
|
||||
4. **Comparison view** should highlight actual differences, not redundant re-declarations
|
||||
|
||||
## Affected Slots
|
||||
|
||||
Current analysis found redundant overrides for:
|
||||
|
||||
| Slot | Redundant Overrides | Files Affected |
|
||||
|------|---------------------|----------------|
|
||||
| `template_specificity` | 873 | 374 |
|
||||
| `specificity_annotation` | 874 | 374 |
|
||||
|
||||
## References
|
||||
|
||||
- [LinkML slot_usage documentation](https://linkml.io/linkml-model/latest/docs/slot_usage/)
|
||||
- Rule 38: Slot Centralization and Semantic URI Requirements
|
||||
- Rule 48: Class Files Must Not Define Inline Slots
|
||||
|
||||
## Version History
|
||||
|
||||
| Date | Change |
|
||||
|------|--------|
|
||||
| 2026-01-12 | Initial rule created after identifying 874 redundant slot_usage entries |
|
||||
|
|
@ -1,401 +0,0 @@
|
|||
# Rule: Specificity Score Convention for LinkML Schema Annotations
|
||||
|
||||
**Version**: 1.0.0
|
||||
**Created**: 2025-01-04
|
||||
**Status**: Active
|
||||
**Applies to**: `schemas/20251121/linkml/modules/classes/*.yaml`
|
||||
|
||||
---
|
||||
|
||||
## Rule Statement
|
||||
|
||||
Every class in the Heritage Custodian Ontology MUST have specificity score annotations to enable intelligent filtering for RAG retrieval and UML visualization.
|
||||
|
||||
---
|
||||
|
||||
## Annotation Schema
|
||||
|
||||
### Required Annotations
|
||||
|
||||
Every class YAML file MUST include these annotations:
|
||||
|
||||
```yaml
|
||||
classes:
|
||||
ClassName:
|
||||
annotations:
|
||||
specificity_score: 0.75 # Required: General specificity (0.0-1.0)
|
||||
specificity_rationale: "..." # Required: Why this score was assigned
|
||||
```
|
||||
|
||||
### Optional Annotations
|
||||
|
||||
Template-specific scores for context-aware filtering:
|
||||
|
||||
```yaml
|
||||
classes:
|
||||
ClassName:
|
||||
annotations:
|
||||
specificity_score: 0.75
|
||||
specificity_rationale: "..."
|
||||
template_specificity: # Optional: Template-specific scores
|
||||
archive_search: 0.95
|
||||
museum_search: 0.20
|
||||
person_research: 0.30
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Score Semantics
|
||||
|
||||
### General Specificity Score
|
||||
|
||||
The `specificity_score` measures how **context-dependent** a class is:
|
||||
|
||||
| Score Range | Meaning | Example Classes |
|
||||
|-------------|---------|-----------------|
|
||||
| 0.00-0.20 | **Universal** - relevant in almost all contexts | `HeritageCustodian`, `CustodianName`, `Location` |
|
||||
| 0.20-0.40 | **Broadly useful** - relevant in most contexts | `Collection`, `Identifier`, `GHCID` |
|
||||
| 0.40-0.60 | **Moderately specific** - relevant in several contexts | `ChangeEvent`, `PersonProfile`, `DigitalPlatform` |
|
||||
| 0.60-0.80 | **Fairly specific** - relevant in limited contexts | `Archive`, `Museum`, `Library`, `FindingAid` |
|
||||
| 0.80-1.00 | **Highly specific** - relevant only in specialized contexts | `LinkedInConnectionExtraction`, `GHCIDHistoryEntry` |
|
||||
|
||||
**Key Insight**: Lower scores = MORE generally relevant (always useful in RAG); Higher scores = MORE specific (only useful in specialized queries).
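
The score ranges above can be turned into a lookup. This is a minimal sketch under one explicit assumption: the table's ranges share their boundary values, so the code treats each band as half-open and assigns a boundary score such as 0.20 to the higher band. The function name `specificity_band` is illustrative.

```python
# (exclusive upper bound, label) pairs taken from the score table.
BANDS = [
    (0.20, "Universal"),
    (0.40, "Broadly useful"),
    (0.60, "Moderately specific"),
    (0.80, "Fairly specific"),
    (1.00, "Highly specific"),
]


def specificity_band(score):
    """Map a specificity_score in [0.0, 1.0] to its band label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"specificity_score out of range: {score}")
    for upper, label in BANDS:
        if score < upper:
            return label
    return BANDS[-1][1]  # score == 1.0 falls in the top band


print(specificity_band(0.15))  # Universal
print(specificity_band(0.75))  # Fairly specific
```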
|
||||
|
||||
---
|
||||
|
||||
### Template Specificity Scores
|
||||
|
||||
The `template_specificity` maps class relevance to 10 conversation templates:
|
||||
|
||||
| Template ID | Focus Area | Example High-Score Classes |
|-------------|------------|---------------------------|
| `archive_search` | Archives and archival holdings | `Archive`, `RecordSet`, `Fonds` |
| `museum_search` | Museums and exhibitions | `Museum`, `Gallery`, `Exhibition` |
| `library_search` | Libraries and catalogs | `Library`, `Catalog`, `BibliographicCollection` |
| `collection_discovery` | Collections and holdings | `Collection`, `Accession`, `Extent` |
| `person_research` | People and staff | `PersonProfile`, `Staff`, `Role` |
| `location_browse` | Geographic information | `Location`, `Address`, `GeoCoordinates` |
| `identifier_lookup` | Identifiers (ISIL, Wikidata) | `Identifier`, `GHCID`, `ISIL` |
| `organizational_change` | History and changes | `ChangeEvent`, `Founding`, `Merger` |
| `digital_platform` | Online resources | `DigitalPlatform`, `Website`, `API` |
| `general_heritage` | Fallback/general | Uses `specificity_score` directly |
|
||||
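
The lookup implied by the template table can be sketched as follows. This is an illustration under assumptions: the annotations are taken to be available as a plain dict, the helper name `effective_score` is hypothetical, and falling back to `specificity_score` when a template has no entry is an assumed policy (the table only states this for `general_heritage`):

```python
def effective_score(annotations: dict, template_id: str) -> float:
    """Return the score to use for a class under a given conversation template.

    general_heritage always uses specificity_score directly; other templates
    use their template_specificity entry when one exists (assumed fallback:
    specificity_score when the entry is missing).
    """
    template_scores = annotations.get("template_specificity") or {}
    if template_id != "general_heritage" and template_id in template_scores:
        return float(template_scores[template_id])
    return float(annotations["specificity_score"])


# Partial annotations for Archive, using values from this rule's examples.
archive = {
    "specificity_score": 0.70,
    "template_specificity": {"archive_search": 0.95, "museum_search": 0.20},
}

archive_search_score = effective_score(archive, "archive_search")
general_score = effective_score(archive, "general_heritage")
missing_entry_score = effective_score(archive, "library_search")
```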
|
||||
---
|
||||
|
||||
## Examples
|
||||
|
||||
### Example 1: Universal Class (Low Specificity)
|
||||
|
||||
```yaml
# modules/classes/HeritageCustodian.yaml
classes:
  HeritageCustodian:
    description: >-
      Base class for all heritage custodian institutions.
    annotations:
      specificity_score: 0.15
      specificity_rationale: >-
        Universal base class relevant in virtually all heritage contexts.
        Every query about heritage institutions implicitly involves this class.
      template_specificity:
        archive_search: 0.65
        museum_search: 0.65
        library_search: 0.65
        collection_discovery: 0.70
        person_research: 0.70
        location_browse: 0.75
        identifier_lookup: 0.70
        organizational_change: 0.75
        digital_platform: 0.70
        general_heritage: 0.15
```
|
||||
|
||||
### Example 2: Domain-Specific Class (High Specificity)
|
||||
|
||||
```yaml
# modules/classes/Archive.yaml
classes:
  Archive:
    is_a: HeritageCustodian
    description: >-
      An archive institution holding historical records and documents.
    annotations:
      specificity_score: 0.70
      specificity_rationale: >-
        Domain-specific institution type. Highly relevant for archival research
        but not needed for museum or library queries.
      template_specificity:
        archive_search: 0.95
        museum_search: 0.20
        library_search: 0.25
        collection_discovery: 0.75
        person_research: 0.40
        location_browse: 0.65
        identifier_lookup: 0.50
        organizational_change: 0.60
        digital_platform: 0.45
        general_heritage: 0.70
```
|
||||
|
||||
### Example 3: Technical Class (Very High Specificity)
|
||||
|
||||
```yaml
# modules/classes/LinkedInConnectionExtraction.yaml
classes:
  LinkedInConnectionExtraction:
    description: >-
      Technical class for extracting LinkedIn connection data.
    annotations:
      specificity_score: 0.95
      specificity_rationale: >-
        Internal extraction class with no semantic significance for end users.
        Only relevant when specifically researching data extraction processes.
      template_specificity:
        archive_search: 0.05
        museum_search: 0.05
        library_search: 0.05
        collection_discovery: 0.05
        person_research: 0.40
        location_browse: 0.05
        identifier_lookup: 0.10
        organizational_change: 0.05
        digital_platform: 0.15
        general_heritage: 0.95
```
|
||||
|
||||
---
|
||||
|
||||
## Score Assignment Guidelines
|
||||
|
||||
### Factors That LOWER Specificity Score
|
||||
|
||||
| Factor | Impact | Example |
|--------|--------|---------|
| Base/parent class | -0.20 to -0.30 | `HeritageCustodian` is parent of all |
| Used in identifiers | -0.10 to -0.15 | `CustodianName` used in GHCID |
| Geographic component | -0.10 to -0.15 | `Location` needed for all institutions |
| Universal attribute | -0.10 to -0.15 | `Provenance` applies to all data |
|
||||
|
||||
### Factors That RAISE Specificity Score
|
||||
|
||||
| Factor | Impact | Example |
|--------|--------|---------|
| Institution type | +0.30 to +0.40 | `Archive`, `Museum`, `Library` |
| Technical/extraction | +0.30 to +0.40 | `LinkedInConnectionExtraction` |
| Event subtype | +0.20 to +0.30 | `Merger`, `Founding`, `Closure` |
| Domain terminology | +0.15 to +0.25 | `Fonds`, `FindingAid`, `RecordSet` |
|
||||
|
||||
### Cross-Class Consistency Rules
|
||||
|
||||
1. **Inheritance**: Child classes should have equal or higher specificity than parents
|
||||
2. **Siblings**: Classes at same hierarchy level should have similar base scores
|
||||
3. **Competing types**: Institution types should reduce each other's template scores
|
||||
|
||||
```yaml
# CORRECT: Archive (0.70) inherits from HeritageCustodian (0.15)
Archive:
  is_a: HeritageCustodian  # Parent: 0.15
  annotations:
    specificity_score: 0.70  # Child: 0.70 >= 0.15 ✓

# WRONG: Child less specific than parent
Archive:
  is_a: HeritageCustodian  # Parent: 0.15
  annotations:
    specificity_score: 0.10  # Child: 0.10 < 0.15 ✗
```
|
||||
|
||||
---
|
||||
|
||||
## Validation Rules
|
||||
|
||||
### Required Validations
|
||||
|
||||
1. **Range Check**: `0.0 <= specificity_score <= 1.0`
|
||||
2. **Rationale Present**: `specificity_rationale` must not be empty
|
||||
3. **Inheritance Consistency**: Child score >= parent score
|
||||
4. **Template Score Range**: All template scores must be 0.0-1.0
|
||||
|
||||
### Recommended Validations
|
||||
|
||||
1. **No Orphan Scores**: Every class should have annotations (warn if missing)
|
||||
2. **Score Distribution**: Flag if >50% of classes have same score (lack of differentiation)
|
||||
3. **Template Coverage**: Warn if template_specificity omits common templates
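
The recommended checks above are not part of the validation script in this rule. A minimal sketch of how they could look, assuming the annotations have been extracted into plain dicts keyed by class name; `recommended_warnings` and `COMMON_TEMPLATES` are hypothetical names, and the choice of which templates count as "common" is an assumption:

```python
from collections import Counter

# Assumed subset of templates treated as "common" for the coverage warning.
COMMON_TEMPLATES = {"archive_search", "museum_search", "library_search"}


def recommended_warnings(annotations_by_class: dict[str, dict]) -> list[str]:
    """Return non-fatal warnings for the three recommended validations."""
    warnings = []
    scores = []
    for name, ann in annotations_by_class.items():
        # 1. No orphan scores: every class should carry annotations.
        if "specificity_score" not in ann:
            warnings.append(f"{name}: no specificity annotations")
            continue
        scores.append(ann["specificity_score"])
        # 3. Template coverage: only checked when template_specificity is present.
        templates = ann.get("template_specificity")
        if templates is not None:
            missing = COMMON_TEMPLATES - set(templates)
            if missing:
                warnings.append(f"{name}: template_specificity omits {sorted(missing)}")
    # 2. Score distribution: flag when more than half the classes share one score.
    if scores:
        value, count = Counter(scores).most_common(1)[0]
        if count / len(scores) > 0.5:
            warnings.append(f"{count} of {len(scores)} classes share score {value}")
    return warnings


sample = {
    "Archive": {"specificity_score": 0.5, "template_specificity": {"archive_search": 0.9}},
    "Museum": {"specificity_score": 0.5},
    "Library": {"specificity_score": 0.5},
    "Unannotated": {},
}
sample_warnings = recommended_warnings(sample)
```

Keeping these as warnings rather than errors matches the split between required and recommended validations: they point at likely annotation gaps without failing the build.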
|
||||
|
||||
### Validation Script
|
||||
|
||||
```python
# scripts/validate_specificity_scores.py

from linkml_runtime import SchemaView
from pathlib import Path
import sys

REQUIRED_TEMPLATES = [
    "archive_search", "museum_search", "library_search",
    "collection_discovery", "person_research", "location_browse",
    "identifier_lookup", "organizational_change", "digital_platform",
    "general_heritage"
]

def validate_specificity_scores(schema_path: Path) -> list[str]:
    """Validate all specificity score annotations."""
    errors = []
    schema = SchemaView(str(schema_path))

    for class_name in schema.all_classes():
        cls = schema.get_class(class_name)

        # Check required annotations
        score = cls.annotations.get("specificity_score")
        rationale = cls.annotations.get("specificity_rationale")

        if score is None:
            errors.append(f"{class_name}: Missing specificity_score")
            continue

        # Validate score range; score_val stays None if the value is not numeric
        score_val = None
        try:
            score_val = float(score.value)
            if not 0.0 <= score_val <= 1.0:
                errors.append(f"{class_name}: Score {score_val} out of range [0.0, 1.0]")
        except (ValueError, TypeError):
            errors.append(f"{class_name}: Invalid score value: {score.value}")

        # Check rationale
        if rationale is None or not rationale.value.strip():
            errors.append(f"{class_name}: Missing or empty specificity_rationale")

        # Check inheritance consistency (skipped when the score is not numeric)
        if cls.is_a and score_val is not None:
            parent = schema.get_class(cls.is_a)
            parent_score = parent.annotations.get("specificity_score")
            if parent_score and score_val < float(parent_score.value):
                errors.append(
                    f"{class_name}: Score {score.value} < parent {cls.is_a} score {parent_score.value}"
                )

    return errors

if __name__ == "__main__":
    schema_path = Path("schemas/20251121/linkml/01_custodian_name.yaml")
    errors = validate_specificity_scores(schema_path)

    if errors:
        print("Validation errors:")
        for error in errors:
            print(f"  - {error}")
        sys.exit(1)
    else:
        print("All specificity scores valid!")
        sys.exit(0)
```
|
||||
|
||||
---
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
### What NOT to Do
|
||||
|
||||
| Anti-Pattern | Why It's Wrong | Correct Approach |
|--------------|----------------|------------------|
| Score without rationale | No audit trail for decisions | Always include rationale |
| All scores = 0.5 | No differentiation, useless for filtering | Differentiate based on semantics |
| Child < parent score | Violates specificity inheritance | Child should be equal or more specific |
| Template score > 1.0 | Invalid score value | Keep all scores in [0.0, 1.0] |
| Empty rationale | Fails validation, no documentation | Write meaningful rationale |
|
||||
|
||||
### Example of Incorrect Annotation
|
||||
|
||||
```yaml
# WRONG - Multiple issues
classes:
  Archive:
    annotations:
      specificity_score: 1.5  # Out of range!
      specificity_rationale: ""  # Empty rationale!
      template_specificity:
        archive_search: 0.95
        # Missing other templates - incomplete coverage
```
|
||||
|
||||
### Example of Correct Annotation
|
||||
|
||||
```yaml
# CORRECT
classes:
  Archive:
    annotations:
      specificity_score: 0.70
      specificity_rationale: >-
        Domain-specific institution type for archives. Highly relevant
        for archival research queries but less useful for museum or
        library-focused questions.
      template_specificity:
        archive_search: 0.95
        museum_search: 0.20
        library_search: 0.25
        collection_discovery: 0.75
        person_research: 0.40
        location_browse: 0.65
        identifier_lookup: 0.50
        organizational_change: 0.60
        digital_platform: 0.45
        general_heritage: 0.70
```
|
||||
|
||||
---
|
||||
|
||||
## Migration Checklist
|
||||
|
||||
When adding specificity scores to existing classes:
|
||||
|
||||
### Phase 1: Assessment
|
||||
|
||||
- [ ] Count classes without annotations
|
||||
- [ ] Identify class hierarchy (parents → children order)
|
||||
- [ ] Review existing descriptions for scoring hints
|
||||
|
||||
### Phase 2: Annotation
|
||||
|
||||
- [ ] Start with root classes (lowest specificity)
|
||||
- [ ] Work down hierarchy (increasing specificity)
|
||||
- [ ] Assign template scores based on domain alignment
|
||||
- [ ] Write rationale explaining score decisions
|
||||
|
||||
### Phase 3: Validation
|
||||
|
||||
- [ ] Run validation script
|
||||
- [ ] Check inheritance consistency
|
||||
- [ ] Verify score distribution (not all same value)
|
||||
- [ ] Review edge cases (technical classes, mixins)
|
||||
|
||||
### Phase 4: Documentation
|
||||
|
||||
- [ ] Update class count in plan documents
|
||||
- [ ] Document any scoring decisions that were difficult
|
||||
- [ ] Create PR with all changes
|
||||
|
||||
---
|
||||
|
||||
## Related Rules
|
||||
|
||||
- **Rule 0**: LinkML Schemas Are the Single Source of Truth
|
||||
- **Rule 4**: Technical Classes Are Excluded from Visualizations
|
||||
- **Rule 13**: Custodian Type Annotations on LinkML Schema Elements
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- `docs/plan/specificity_score/README.md` - System overview
|
||||
- `docs/plan/specificity_score/04-prompt-conversation-templates.md` - Template definitions
|
||||
- `docs/plan/specificity_score/06-uml-visualization.md` - UML filtering integration
|
||||
|
||||
---
|
||||
|
||||
## Changelog
|
||||
|
||||
| Date | Version | Change |
|------|---------|--------|
| 2025-01-04 | 1.0.0 | Initial rule created for specificity score system |
|
||||
|
|
@ -1,223 +0,0 @@
|
|||
# Rule: LinkML Type/Types File Naming Convention
|
||||
|
||||
**Version**: 1.0.0
|
||||
**Created**: 2025-01-04
|
||||
**Status**: Active
|
||||
**Applies to**: `schemas/20251121/linkml/modules/classes/`
|
||||
|
||||
---
|
||||
|
||||
## Rule Statement
|
||||
|
||||
When creating class hierarchies that replace enums in LinkML schemas, follow the **Type/Types** naming pattern to clearly distinguish abstract base classes from their concrete subclasses.
|
||||
|
||||
---
|
||||
|
||||
## Pattern Definition
|
||||
|
||||
| File Name Pattern | Purpose | Contains |
|-------------------|---------|----------|
| `[Entity]Type.yaml` (singular) | Abstract base class | Single abstract class defining the type taxonomy |
| `[Entity]Types.yaml` (plural) | Concrete subclasses | All concrete subclasses inheriting from the base |
|
||||
|
||||
---
|
||||
|
||||
## Class Naming Convention
|
||||
|
||||
🚨 **CRITICAL**: Follow these naming rules for classes within the files:
|
||||
|
||||
1. **Abstract Base Class** (`[Entity]Type.yaml`):
|
||||
* **MUST** end with `Type` suffix.
|
||||
* *Example*: `DigitalPlatformType`, `WarehouseType`.
|
||||
|
||||
2. **Concrete Subclasses** (`[Entity]Types.yaml`):
|
||||
* **MUST NOT** end with `Type` suffix.
|
||||
* Use the natural entity name.
|
||||
* *Example*: `DigitalLibrary` (✅), `CentralDepot` (✅).
|
||||
* *Incorrect*: `DigitalLibraryType` (❌), `CentralDepotType` (❌).
|
||||
|
||||
**Rationale**: The file context (`WarehouseTypes.yaml`) already establishes these are types. Repeating "Type" in the class name is redundant and makes the class name less natural when used as an object instance (e.g., "This object is a CentralDepot").
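
The two class-naming rules above lend themselves to an automated check. A minimal sketch, assuming the class names and their `abstract` flags have already been read from the schema files into a dict; the function name `check_type_naming` is hypothetical:

```python
def check_type_naming(classes: dict[str, bool]) -> list[str]:
    """Check the Type/Types class-naming convention.

    `classes` maps each class name to True if the class is abstract.
    Abstract base classes must end with 'Type'; concrete subclasses must not.
    """
    problems = []
    for name, is_abstract in classes.items():
        if is_abstract and not name.endswith("Type"):
            problems.append(f"{name}: abstract base class must end with 'Type'")
        elif not is_abstract and name.endswith("Type"):
            problems.append(f"{name}: concrete subclass must not end with 'Type'")
    return problems


# Names taken from the examples in this rule.
naming_problems = check_type_naming({
    "WarehouseType": True,      # abstract base class: OK
    "CentralDepot": False,      # concrete subclass: OK
    "CentralDepotType": False,  # concrete subclass with redundant suffix
})
```

Here only `CentralDepotType` is reported, which is the incorrect form listed above.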
|
||||
|
||||
---
|
||||
|
||||
## Examples
|
||||
|
||||
### Current Implementations
|
||||
|
||||
| Base Class File | Subclasses File | Subclass Count | Description |
|-----------------|-----------------|----------------|-------------|
| `DigitalPlatformType.yaml` | `DigitalPlatformTypes.yaml` | 69 | Digital platform type taxonomy |
| `WebPortalType.yaml` | `WebPortalTypes.yaml` | ~15 | Web portal type taxonomy |
| `CustodianType.yaml` | `CustodianTypes.yaml` | 19 | Heritage custodian type taxonomy (GLAMORCUBESFIXPHDNT) |
| `DataServiceEndpointType.yaml` | `DataServiceEndpointTypes.yaml` | 7 | API/data service endpoint types |
|
||||
|
||||
### File Structure Example
|
||||
|
||||
```
modules/classes/
├── DigitalPlatformType.yaml     # Abstract base class
├── DigitalPlatformTypes.yaml    # 69 concrete subclasses
├── WebPortalType.yaml           # Abstract base class
├── WebPortalTypes.yaml          # ~15 concrete subclasses
├── CustodianType.yaml           # Abstract base class
└── CustodianTypes.yaml          # 19 concrete subclasses
```
|
||||
|
||||
---
|
||||
|
||||
## Import Pattern
|
||||
|
||||
The subclasses file MUST import the base class file:
|
||||
|
||||
```yaml
# In DigitalPlatformTypes.yaml (subclasses file)
id: https://w3id.org/heritage-custodian/linkml/digital_platform_types
name: digital_platform_types

imports:
  - linkml:types
  - ./DigitalPlatformType  # Import base class (singular)

classes:
  DigitalLibrary:
    is_a: DigitalPlatformType  # Inherit from base
    description: >-
      A digital library platform providing access to digitized collections.
    class_uri: schema:DigitalDocument

  DigitalArchive:
    is_a: DigitalPlatformType
    description: >-
      A digital archive for born-digital or digitized archival materials.
```
|
||||
|
||||
---
|
||||
|
||||
## Slot Range Pattern
|
||||
|
||||
When other classes reference the type taxonomy, use the **base class** (singular) as the range:
|
||||
|
||||
```yaml
# In DigitalPlatform.yaml
imports:
  - ./DigitalPlatformType   # Import base class for range
  - ./DigitalPlatformTypes  # Import subclasses for validation

classes:
  DigitalPlatform:
    slots:
      - platform_type
    slot_usage:
      platform_type:
        range: DigitalPlatformType  # Use base class as range
        description: >-
          The type of digital platform. Value must be one of the
          concrete subclasses defined in DigitalPlatformTypes.
```
|
||||
|
||||
---
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
### What NOT to Do
|
||||
|
||||
| Anti-Pattern | Why It's Wrong | Correct Alternative |
|--------------|----------------|---------------------|
| `DigitalPlatformTypeBase.yaml` | "Base" suffix is redundant; singular "Type" already implies base class | `DigitalPlatformType.yaml` |
| `DigitalPlatformTypeClasses.yaml` | "Classes" is less intuitive than "Types" for a type taxonomy | `DigitalPlatformTypes.yaml` |
| All types in single file | Large files are hard to navigate; separation clarifies architecture | Split into Type.yaml + Types.yaml |
| `DigitalPlatformEnum.yaml` | Enums lack extensibility; class hierarchies are preferred | Use class hierarchy pattern |
| `CentralDepotType` (Class Name) | Redundant "Type" suffix on concrete subclass | `CentralDepot` |
|
||||
|
||||
### Example of Incorrect Naming
|
||||
|
||||
```yaml
# WRONG - Don't use "Base" suffix
# File: DigitalPlatformTypeBase.yaml
classes:
  DigitalPlatformTypeBase:  # Redundant "Base"
    abstract: true
```

```yaml
# CORRECT - Use singular "Type"
# File: DigitalPlatformType.yaml
classes:
  DigitalPlatformType:  # Clean, clear naming
    abstract: true
```
|
||||
|
||||
---
|
||||
|
||||
## Rationale
|
||||
|
||||
1. **Clarity**: "Type" (singular) = one abstract concept; "Types" (plural) = many concrete implementations
|
||||
2. **Discoverability**: Related files appear adjacent in alphabetical directory listings
|
||||
3. **Consistency**: Follows established pattern across entire schema
|
||||
4. **Semantics**: Mirrors natural language ("a platform type" vs "the platform types")
|
||||
5. **Scalability**: Easy to add new types without modifying base class file
|
||||
|
||||
---
|
||||
|
||||
## Migration Checklist
|
||||
|
||||
When renaming existing files to follow this convention:
|
||||
|
||||
### Pre-Migration
|
||||
|
||||
- [ ] Identify all files referencing the old name
|
||||
- [ ] Create backup or ensure version control is clean
|
||||
- [ ] Document the old → new name mapping
|
||||
|
||||
### File Rename
|
||||
|
||||
- [ ] Rename file: `[Entity]TypeBase.yaml` → `[Entity]Type.yaml`
|
||||
- [ ] Update `id:` field in renamed file
|
||||
- [ ] Update `name:` field in renamed file
|
||||
- [ ] Update class name inside the file
|
||||
- [ ] Update all internal documentation references
|
||||
|
||||
### Update References
|
||||
|
||||
- [ ] Update imports in `[Entity]Types.yaml` (subclasses file)
|
||||
- [ ] Update `is_a:` in all subclasses
|
||||
- [ ] Update imports in consuming classes (e.g., `DigitalPlatform.yaml`)
|
||||
- [ ] Update `range:` in slot definitions
|
||||
- [ ] Update any `slot_usage:` references
|
||||
|
||||
### Documentation
|
||||
|
||||
- [ ] Update AGENTS.md if convention is documented there
|
||||
- [ ] Update any design documents
|
||||
- [ ] Add migration note to changelog
|
||||
|
||||
### Verification
|
||||
|
||||
```bash
# Verify no references to old name remain
grep -r "OldClassName" schemas/20251121/linkml/

# Verify new file exists
ls -la schemas/20251121/linkml/modules/classes/NewClassName.yaml

# Verify old file is removed
ls -la schemas/20251121/linkml/modules/classes/OldClassName.yaml  # Should fail

# Validate schema
linkml-validate schemas/20251121/linkml/01_custodian_name.yaml
```
|
||||
|
||||
---
|
||||
|
||||
## Related Rules
|
||||
|
||||
- **Rule 0**: LinkML Schemas Are the Single Source of Truth
|
||||
- **Rule 9**: Enum-to-Class Promotion - Single Source of Truth
|
||||
|
||||
---
|
||||
|
||||
## Changelog
|
||||
|
||||
| Date | Version | Change |
|------|---------|--------|
| 2025-01-04 | 1.0.0 | Initial rule created after DigitalPlatformType refactoring |
|
||||
|
|
@ -1,332 +0,0 @@
|
|||
# Rule: LinkML "Types" Classes Define SPARQL Template Variables
|
||||
|
||||
**Created**: 2025-01-08
|
||||
**Status**: Active
|
||||
**Applies to**: SPARQL template design, RAG pipeline slot extraction
|
||||
|
||||
## Core Principle
|
||||
|
||||
LinkML classes following the `*Type` / `*Types` naming pattern (Rule 0b) serve as the **single source of truth** for valid values in SPARQL template slot variables.
|
||||
|
||||
When designing SPARQL templates, **extract variables from the schema** rather than hardcoding values. This enables:
|
||||
- **Flexibility**: Same template works across all institution types
|
||||
- **Extensibility**: Adding new types to schema automatically extends templates
|
||||
- **Consistency**: Variable values always align with ontology
|
||||
- **Multilingual support**: Type labels in multiple languages available from schema
|
||||
|
||||
## Template Variable Sources
|
||||
|
||||
### 1. Institution Type Variable (`institution_type`)
|
||||
|
||||
**Schema Source**: `CustodianType` abstract class and its 19 subclasses
|
||||
|
||||
| Subclass | Code | Description |
|----------|------|-------------|
| `ArchiveOrganizationType` | A | Archives |
| `BioCustodianType` | B | Botanical gardens, zoos |
| `CommercialOrganizationType` | C | Corporations |
| `DigitalPlatformType` | D | Digital platforms |
| `EducationProviderType` | E | Universities, schools |
| `FeatureCustodianType` | F | Geographic features |
| `GalleryType` | G | Art galleries |
| `HolySacredSiteType` | H | Religious sites |
| `IntangibleHeritageGroupType` | I | Folklore organizations |
| `LibraryType` | L | Libraries |
| `MuseumType` | M | Museums |
| `NonProfitType` | N | NGOs |
| `OfficialInstitutionType` | O | Government agencies |
| `PersonalCollectionType` | P | Private collectors |
| `ResearchOrganizationType` | R | Research centers |
| `HeritageSocietyType` | S | Historical societies |
| `TasteScentHeritageType` | T | Culinary heritage |
| `UnspecifiedType` | U | Unknown |
| `MixedCustodianType` | X | Multiple types |
|
||||
|
||||
**Template Slot Definition**:
|
||||
```yaml
slots:
  institution_type:
    type: institution_type
    required: true
    schema_source: "modules/classes/CustodianType.yaml"
    # Valid values derived from CustodianType subclasses
```
|
||||
|
||||
### 2. Geographic Scope Variable (`location`)
|
||||
|
||||
Geographic scope is a **hierarchical variable** with three levels:
|
||||
|
||||
| Level | Schema Source | SPARQL Property | Example |
|-------|---------------|-----------------|---------|
| Country | ISO 3166-1 alpha-2 | `hc:countryCode` | NL, DE, BE |
| Subregion | ISO 3166-2 | `hc:subregionCode` | NL-NH, DE-BY |
| Settlement | GeoNames | `hc:settlementName` | Amsterdam, Berlin |
|
||||
|
||||
**Template Slot Definition**:
|
||||
```yaml
slots:
  location:
    type: location
    required: true
    schema_source:
      - "modules/enums/CountryCodeEnum.yaml"  # if it exists
      - "data/reference/geonames.db"
    resolution_order: [settlement, subregion, country]
    # SlotExtractor detects which level user specified
```
|
||||
|
||||
### 3. Digital Platform Type Variable (`platform_type`)
|
||||
|
||||
**Schema Source**: `DigitalPlatformType` abstract class and 69+ subclasses in `DigitalPlatformTypes.yaml`
|
||||
|
||||
Categories include:
|
||||
- REPOSITORY: DigitalLibrary, DigitalArchivePlatform, OpenAccessRepository
|
||||
- AGGREGATOR: Europeana-type aggregators, BibliographicDatabasePlatform
|
||||
- DISCOVERY: WebPortal, OnlineDatabase, OpenDataPortal
|
||||
- VIRTUAL_HERITAGE: VirtualMuseum, VirtualLibrary, OnlineArtGallery
|
||||
- RESEARCH: DisciplinaryRepository, PrePrintServer, GenealogyDatabase
|
||||
- ...and many more
|
||||
|
||||
**Template Slot Definition**:
|
||||
```yaml
slots:
  platform_type:
    type: platform_type
    required: false
    schema_source: "modules/classes/DigitalPlatformTypes.yaml"
```
|
||||
|
||||
## Template Design Pattern
|
||||
|
||||
### Before (Hardcoded - WRONG)
|
||||
|
||||
```yaml
# Separate templates for each institution type - DO NOT DO THIS
templates:
  count_museums_in_region:
    sparql: |
      SELECT (COUNT(?s) AS ?count) WHERE {
        ?s hc:institutionType "M" ;
           hc:subregionCode "{{ region }}" .
      }

  count_archives_in_region:
    sparql: |
      SELECT (COUNT(?s) AS ?count) WHERE {
        ?s hc:institutionType "A" ;
           hc:subregionCode "{{ region }}" .
      }
```
|
||||
|
||||
### After (Parameterized - CORRECT)
|
||||
|
||||
```yaml
# Single template with institution_type as variable
templates:
  count_institutions_by_type_location:
    description: "Count heritage institutions by type and location"
    slots:
      institution_type:
        type: institution_type
        required: true
        schema_source: "modules/classes/CustodianType.yaml"
      location:
        type: location
        required: true
        resolution_order: [settlement, subregion, country]

    # Multiple SPARQL variants based on location resolution
    sparql_template: |
      SELECT (COUNT(DISTINCT ?institution) AS ?count) WHERE {
        ?institution a hcc:Custodian ;
                     hc:institutionType "{{ institution_type }}" ;
                     hc:settlementName "{{ location }}" .
      }

    sparql_template_region: |
      SELECT (COUNT(DISTINCT ?institution) AS ?count) WHERE {
        ?institution a hcc:Custodian ;
                     hc:institutionType "{{ institution_type }}" ;
                     hc:subregionCode "{{ location }}" .
      }

    sparql_template_country: |
      SELECT (COUNT(DISTINCT ?institution) AS ?count) WHERE {
        ?institution a hcc:Custodian ;
                     hc:institutionType "{{ institution_type }}" ;
                     hc:countryCode "{{ location }}" .
      }
```
|
||||
|
||||
## SlotExtractor Responsibilities
|
||||
|
||||
The SlotExtractor module must:
|
||||
|
||||
1. **Detect institution type** from user query:
|
||||
- "musea" → M (Dutch plural)
|
||||
- "archives" → A (English)
|
||||
- "bibliotheken" → L (Dutch)
|
||||
- Use synonyms from `_slot_types.institution_type.synonyms`
|
||||
|
||||
2. **Detect location level** from user query:
|
||||
- "Amsterdam" → settlement level → use `sparql_template`
|
||||
- "Noord-Holland" → subregion level → use `sparql_template_region`
|
||||
- "Nederland" → country level → use `sparql_template_country`
|
||||
|
||||
3. **Normalize values** to schema-compliant codes:
|
||||
- "Noord-Holland" → "NL-NH"
|
||||
- "museum" → "M"
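
The location-level detection and normalization steps above can be sketched as follows. This is an illustration only: the function name `resolve_location` is hypothetical, and the three small lookup dicts are inline sample data standing in for labels that would be loaded from the reference files at runtime:

```python
from typing import Optional

# Maps each location level to the template key named in this rule.
TEMPLATE_BY_LEVEL = {
    "settlement": "sparql_template",
    "subregion": "sparql_template_region",
    "country": "sparql_template_country",
}

# Sample data for illustration; real values come from reference files.
SETTLEMENTS = {"amsterdam": "Amsterdam"}
SUBREGIONS = {"noord-holland": "NL-NH", "north holland": "NL-NH"}
COUNTRIES = {"nederland": "NL", "netherlands": "NL"}


def resolve_location(text: str) -> Optional[tuple[str, str, str]]:
    """Return (level, normalized value, template key), or None if unknown.

    Levels are tried in the resolution order settlement, subregion, country.
    """
    key = text.strip().casefold()
    for level, lookup in (
        ("settlement", SETTLEMENTS),
        ("subregion", SUBREGIONS),
        ("country", COUNTRIES),
    ):
        if key in lookup:
            return level, lookup[key], TEMPLATE_BY_LEVEL[level]
    return None


region_match = resolve_location("Noord-Holland")
```

Returning `None` for unrecognized input leaves it to the caller to decide whether to ask the user for clarification or fall back to another template.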
|
||||
|
||||
## Dynamic Label Resolution (NO HARDCODING)
|
||||
|
||||
**CRITICAL**: Labels MUST be resolved at runtime from schema/reference files, NOT hardcoded in templates or code.
|
||||
|
||||
### Institution Type Labels
|
||||
|
||||
The `CustodianType` classes contain multilingual labels via `type_label` slot:
|
||||
|
||||
```yaml
MuseumType:
  type_label:
    - "Museum"@en
    - "museum"@nl
    - "Museum"@de
    - "museo"@es
```
|
||||
|
||||
**Label Resolution Chain**:
|
||||
1. Load `CustodianType.yaml` and subclass files
|
||||
2. Parse `type_label` slot for each type code (M, L, A, etc.)
|
||||
3. Build runtime label dictionary keyed by code + language
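
Step 3 of the chain can be sketched as below, assuming steps 1 and 2 have already produced a mapping from type code to raw label strings in the `"text"@lang` notation shown above. The function name `build_label_index` and the input shape are assumptions, not the project's actual loader:

```python
import re

# Matches labels written as "text"@lang, e.g. "Museum"@en.
LABEL_PATTERN = re.compile(r'^"(?P<text>.*)"@(?P<lang>[A-Za-z-]+)$')


def build_label_index(type_labels: dict[str, list[str]]) -> dict[str, dict[str, str]]:
    """Build a {code: {language: label}} dictionary from raw type_label values."""
    index: dict[str, dict[str, str]] = {}
    for code, labels in type_labels.items():
        for raw in labels:
            match = LABEL_PATTERN.match(raw.strip())
            if match is None:
                continue  # Skip entries that are not in "text"@lang form
            index.setdefault(code, {})[match["lang"]] = match["text"]
    return index


# Raw labels for MuseumType (code M), as listed in this rule.
parsed = {"M": ['"Museum"@en', '"museum"@nl', '"Museum"@de', '"museo"@es']}
labels = build_label_index(parsed)
```

A lookup such as `labels["M"]["nl"]` then yields the Dutch label without any label text appearing in template or pipeline code.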
|
||||
|
||||
### Geographic Labels
|
||||
|
||||
Subregion/settlement names come from **reference data files**, not hardcoded:
|
||||
|
||||
```yaml
label_sources:
  - "data/reference/iso_3166_2_{country}.json"  # e.g., iso_3166_2_nl.json
  - "data/reference/geonames.db"                # GeoNames database
  - "data/reference/admin1CodesASCII.txt"       # GeoNames fallback
```
|
||||
|
||||
**Example**: `iso_3166_2_nl.json` contains (the `"North Holland"` key is an English synonym):
```json
{
  "provinces": {
    "Noord-Holland": "NH",
    "Zuid-Holland": "ZH",
    "North Holland": "NH"
  }
}
```
|
||||
|
||||
### SlotExtractor Label Loading
|
||||
|
||||
```python
class SlotExtractor:
    def __init__(self, schema_path: str, reference_path: str):
        # Load institution type labels from schema
        self.type_labels = self._load_custodian_type_labels(schema_path)

        # Load geographic labels from reference files
        self.subregion_labels = self._load_subregion_labels(reference_path)

    def _load_custodian_type_labels(self, schema_path: str) -> dict:
        """Load multilingual labels from CustodianType schema files."""
        # Parse YAML, extract type_label slots
        # Return: {"M": {"nl": "musea", "en": "museums"}, ...}

    def _load_subregion_labels(self, reference_path: str) -> dict:
        """Load subregion labels from ISO 3166-2 JSON files."""
        # Load iso_3166_2_nl.json, iso_3166_2_de.json, etc.
        # Return: {"NL-NH": {"nl": "Noord-Holland", "en": "North Holland"}, ...}
```
|
||||
|
||||
### UI Template Interpolation
|
||||
|
||||
```yaml
ui_template:
  nl: "Er zijn {{ count }} {{ institution_type_nl }} in {{ location }}."
  en: "There are {{ count }} {{ institution_type_en }} in {{ location }}."
```
|
||||
|
||||
The RAG pipeline populates `institution_type_nl` / `institution_type_en` from dynamically loaded labels:
|
||||
|
||||
```python
# At runtime, NOT hardcoded
template_context["institution_type_nl"] = slot_extractor.type_labels[type_code]["nl"]
template_context["institution_type_en"] = slot_extractor.type_labels[type_code]["en"]
```
|
||||
|
||||
## Adding New Types
|
||||
|
||||
When the schema gains new institution types:
|
||||
|
||||
1. **No template changes needed** - parameterized templates automatically support new types
|
||||
2. **Update synonyms** in `_slot_types.institution_type.synonyms` for NLP recognition
|
||||
3. **Labels auto-discovered** from schema files - no code changes needed
|
||||
|
||||
## Anti-Patterns (FORBIDDEN)
|
||||
|
||||
### Hardcoded Labels in Templates
|
||||
|
||||
```yaml
# WRONG - Hardcoded labels
labels:
  NL-NH: {nl: "Noord-Holland", en: "North Holland"}
  NL-ZH: {nl: "Zuid-Holland", en: "South Holland"}
```

```python
# WRONG - Hardcoded labels in code
INSTITUTION_TYPE_LABELS_NL = {
    "M": "musea", "L": "bibliotheken", ...
}
```
|
||||
|
||||
### Correct Approach
|
||||
|
||||
```yaml
# CORRECT - Reference to schema/data source
label_sources:
  - "schemas/20251121/linkml/modules/classes/CustodianType.yaml"
  - "data/reference/iso_3166_2_{country}.json"
```

```python
# CORRECT - Load labels at runtime
type_labels = load_labels_from_schema("CustodianType.yaml")
region_labels = load_labels_from_reference("iso_3166_2_nl.json")
```
|
||||
|
||||
**Why?**
|
||||
1. **Single source of truth** - Labels defined once in schema/reference files
|
||||
2. **Automatic sync** - Schema changes automatically propagate to UI
|
||||
3. **Extensibility** - Adding new countries/types doesn't require code changes
|
||||
4. **Multilingual** - All language variants come from same source
|
||||
|
||||
## Validation
|
||||
|
||||
Templates MUST validate slot values against schema:
|
||||
|
||||
```python
def validate_institution_type(value: str) -> bool:
    """Validate institution_type against CustodianType schema."""
    valid_codes = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I',
                   'L', 'M', 'N', 'O', 'P', 'R', 'S', 'T', 'U', 'X']
    return value in valid_codes
```
|
||||
|
||||
## Related Rules
|
||||
|
||||
- **Rule 0b**: Type/Types file naming convention
|
||||
- **Rule 13**: Custodian type annotations on LinkML schema elements
|
||||
- **Rule 37**: Specificity score annotations for template filtering
|
||||
|
||||
## References
|
||||
|
||||
- Schema: `schemas/20251121/linkml/modules/classes/CustodianType.yaml`
|
||||
- Types: `schemas/20251121/linkml/modules/classes/*Types.yaml`
|
||||
- Enums: `schemas/20251121/linkml/modules/enums/InstitutionTypeCodeEnum.yaml`
|
||||
- Templates: `data/sparql_templates.yaml`
|
||||
|
|
@ -1,323 +0,0 @@
|
|||
# Rule: Verified Ontology Mapping Requirements
|
||||
|
||||
## Overview
|
||||
|
||||
All LinkML slot files MUST include ontology mappings that are **verified against the actual ontology files** in `data/ontology/`. Never use hallucinated or assumed ontology terms.
|
||||
|
||||
---
|
||||
|
||||
## 1. Source Ontology Files
|
||||
|
||||
The following ontology files are available for verification:
|
||||
|
||||
| Prefix | Namespace | File | Key Properties |
|
||||
|--------|-----------|------|----------------|
|
||||
| `crm:` | `http://www.cidoc-crm.org/cidoc-crm/` | `CIDOC_CRM_v7.1.3.rdf` | P1, P2, P22, P23, P70, P82, etc. |
|
||||
| `rico:` | `https://www.ica.org/standards/RiC/ontology#` | `RiC-O_1-1.rdf` | hasOrHadHolder, isOrWasPartOf, etc. |
|
||||
| `prov:` | `http://www.w3.org/ns/prov#` | `prov.ttl` | wasInfluencedBy, wasDerivedFrom, used, etc. |
|
||||
| `schema:` | `http://schema.org/` | `schemaorg.owl` | url, name, description, etc. |
|
||||
| `dcterms:` | `http://purl.org/dc/terms/` | `dcterms.rdf` | format, rights, source, etc. |
|
||||
| `skos:` | `http://www.w3.org/2004/02/skos/core#` | `skos.rdf` | prefLabel, notation, inScheme, etc. |
|
||||
| `foaf:` | `http://xmlns.com/foaf/0.1/` | `foaf.ttl` | page, homepage, name, etc. |
|
||||
| `dcat:` | `http://www.w3.org/ns/dcat#` | `dcat3.ttl` | mediaType, downloadURL, etc. |
|
||||
| `time:` | `http://www.w3.org/2006/time#` | `time.ttl` | hasBeginning, hasEnd, etc. |
|
||||
| `org:` | `http://www.w3.org/ns/org#` | `org.rdf` | siteOf, hasSite, subOrganizationOf, etc. |
|
||||
| `sosa:` | `http://www.w3.org/ns/sosa/` | `sosa.ttl` | madeBySensor, observes, etc. |
|
||||
|
||||
---
|
||||
|
||||
## 2. Required Header Documentation
|
||||
|
||||
Every slot file MUST include a header comment block with an ontology alignment table:
|
||||
|
||||
```yaml
|
||||
# ==============================================================================
|
||||
# LinkML Slot Definition: {slot_name}
|
||||
# ==============================================================================
|
||||
# {Brief description - one line}
|
||||
#
|
||||
# ONTOLOGY ALIGNMENT (verified against data/ontology/):
|
||||
#
|
||||
# | Ontology | Property | File/Line | Mapping | Notes |
|
||||
# |---------------|-----------------------|----------------------|---------|------------------------------------|
|
||||
# | **PROV-O** | `prov:used` | prov.ttl:1046-1057 | exact | Entity used by activity |
|
||||
# | **PROV-O** | `prov:wasInfluencedBy`| prov.ttl:1099-1121 | broad | Parent property (subPropertyOf) |
|
||||
#
|
||||
# HIERARCHY: prov:used rdfs:subPropertyOf prov:wasInfluencedBy (line 1046)
|
||||
#
|
||||
# CREATED: YYYY-MM-DD
|
||||
# UPDATED: YYYY-MM-DD - Description of changes
|
||||
# ==============================================================================
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 3. Mapping Types
|
||||
|
||||
Use the correct mapping type based on semantic relationship:
|
||||
|
||||
| Mapping Type | Usage | Example |
|
||||
|--------------|-------|---------|
|
||||
| `slot_uri` | Primary RDF predicate for this slot | `slot_uri: prov:used` |
|
||||
| `exact_mappings` | Semantically equivalent properties | `- schema:dateRetrieved` |
|
||||
| `close_mappings` | Very similar but slightly different semantics | `- prov:wasGeneratedBy` |
|
||||
| `broad_mappings` | Parent/broader properties (slot is subPropertyOf these) | `- prov:wasInfluencedBy` |
|
||||
| `narrow_mappings` | Child/narrower properties (these are subPropertyOf slot) | `- prov:qualifiedUsage` |
|
||||
| `related_mappings` | Conceptually related but different scope | `- dcterms:source` |
|
||||
|
||||
---
|
||||
|
||||
## 4. Hierarchy Discovery Process
|
||||
|
||||
### Step 1: Search for subPropertyOf relationships
|
||||
|
||||
```bash
|
||||
# Find whether our property is subPropertyOf something (-> broad_mappings)
grep -n "OUR_PROPERTY.*subPropertyOf\|subPropertyOf.*OUR_PROPERTY" data/ontology/*.{ttl,rdf,owl}

# Find properties that are subPropertyOf our property (-> narrow_mappings)
grep -n "subPropertyOf.*OUR_PROPERTY" data/ontology/*.{ttl,rdf,owl}
|
||||
```
|
||||
|
||||
### Step 2: Document the hierarchy
|
||||
|
||||
When you find a hierarchy, document it in:
|
||||
1. The header comment block (HIERARCHY line)
|
||||
2. The appropriate mapping field (`broad_mappings` or `narrow_mappings`)
|
||||
3. Inline comments with file/line references
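For the RDF/XML ontology files, the hierarchy search can also be done structurally rather than by line matching. The sketch below uses only the Python standard library and is an illustration, not part of the rule: the function names are hypothetical, it only handles the common `rdf:about` plus `rdf:resource` form, and it does not resolve relative IRIs or parse Turtle files.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def sub_property_pairs(rdf_xml: str) -> list:
    """Return (child, parent) IRI pairs for every rdfs:subPropertyOf statement."""
    root = ET.fromstring(rdf_xml)
    pairs = []
    for element in root.iter():
        child_iri = element.get(f"{{{RDF}}}about")
        if child_iri is None:
            continue
        # Only direct children of the described resource are inspected.
        for sub in element.findall(f"{{{RDFS}}}subPropertyOf"):
            parent_iri = sub.get(f"{{{RDF}}}resource")
            if parent_iri is not None:
                pairs.append((child_iri, parent_iri))
    return pairs


def classify_against_slot(pairs: list, slot_iri: str) -> dict:
    """Split the hierarchy into broad (parents) and narrow (children) candidates."""
    broad = sorted(parent for child, parent in pairs if child == slot_iri)
    narrow = sorted(child for child, parent in pairs if parent == slot_iri)
    return {"broad_mappings": broad, "narrow_mappings": narrow}
```

The result is only a list of candidates; each one still needs the file/line reference described in Step 2 before it goes into a slot file.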
|
||||
|
||||
---
|
||||
|
||||
## 5. Key Ontology Hierarchies Reference
|
||||
|
||||
### PROV-O (`prov.ttl`)
|
||||
|
||||
```
|
||||
prov:wasInfluencedBy (parent of many)
|
||||
├── prov:wasDerivedFrom
|
||||
│ ├── prov:hadPrimarySource
|
||||
│ ├── prov:wasQuotedFrom
|
||||
│ └── prov:wasRevisionOf
|
||||
├── prov:wasGeneratedBy
|
||||
├── prov:used
|
||||
├── prov:wasAssociatedWith
|
||||
├── prov:wasAttributedTo
|
||||
└── prov:wasInformedBy
|
||||
|
||||
prov:influenced (inverse direction)
|
||||
├── prov:generated
|
||||
└── prov:invalidated
|
||||
```
|
||||
|
||||
### CIDOC-CRM (`CIDOC_CRM_v7.1.3.rdf`)
|
||||
|
||||
```
|
||||
crm:P1_is_identified_by
|
||||
├── crm:P48_has_preferred_identifier
|
||||
└── crm:P168_place_is_defined_by
|
||||
|
||||
crm:P82_at_some_time_within
|
||||
├── crm:P82a_begin_of_the_begin
|
||||
└── crm:P82b_end_of_the_end
|
||||
|
||||
crm:P81_ongoing_throughout
|
||||
├── crm:P81a_end_of_the_begin
|
||||
└── crm:P81b_begin_of_the_end
|
||||
|
||||
crm:P67_refers_to
|
||||
└── crm:P70_documents
|
||||
```
|
||||
|
||||
### RiC-O (`RiC-O_1-1.rdf`)
|
||||
|
||||
```
|
||||
rico:isOrWasUnderAuthorityOf
|
||||
├── rico:hasOrHadManager
|
||||
│ └── rico:hasOrHadHolder
|
||||
└── (other authority relationships)
|
||||
|
||||
rico:hasOrHadPart
|
||||
└── rico:containsOrContained
|
||||
└── rico:containsTransitive
|
||||
|
||||
rico:isSuccessorOf
|
||||
├── rico:hasAncestor
|
||||
├── rico:resultedFromTheMergerOf
|
||||
└── rico:resultedFromTheSplitOf
|
||||
```
|
||||
|
||||
### Dublin Core Terms (`dcterms.rdf`)
|
||||
|
||||
```
|
||||
dcterms:rights
|
||||
└── dcterms:accessRights
|
||||
```
|
||||
|
||||
### DCAT (`dcat3.ttl`)
|
||||
|
||||
```
|
||||
dcterms:format
|
||||
├── dcat:mediaType
|
||||
├── dcat:compressFormat
|
||||
└── dcat:packageFormat
|
||||
```
|
||||
|
||||
### FOAF (`foaf.ttl`)
|
||||
|
||||
```
|
||||
foaf:page
|
||||
├── foaf:homepage
|
||||
├── foaf:weblog
|
||||
├── foaf:interest
|
||||
├── foaf:workplaceHomepage
|
||||
└── foaf:schoolHomepage
|
||||
```
|
||||
|
||||
### Schema.org (`schemaorg.owl`)
|
||||
|
||||
```
|
||||
schema:workFeatured
|
||||
├── schema:workPerformed
|
||||
└── schema:workPresented
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 6. Verification Commands
|
||||
|
||||
### Check if a property exists
|
||||
|
||||
```bash
|
||||
grep -n "PROPERTY_NAME" data/ontology/FILE.ttl
|
||||
```
|
||||
|
||||
### Find all subPropertyOf for a property
|
||||
|
||||
```bash
|
||||
grep -B5 -A5 "subPropertyOf" data/ontology/FILE.ttl | grep -A5 -B5 "PROPERTY_NAME"
|
||||
```
|
||||
|
||||
### Validate YAML after editing
|
||||
|
||||
```bash
|
||||
python3 -c "import yaml; yaml.safe_load(open('FILENAME.yaml')); print('✅ valid')"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 7. Complete Slot File Example
|
||||
|
||||
```yaml
|
||||
# ==============================================================================
|
||||
# LinkML Slot Definition: retrieved_through
|
||||
# ==============================================================================
|
||||
# To denote the specific method, protocol, or mechanism by which a resource
|
||||
# or data was accessed, fetched, or collected.
|
||||
#
|
||||
# ONTOLOGY ALIGNMENT (verified against data/ontology/):
|
||||
#
|
||||
# | Ontology | Property | File/Line | Mapping | Notes |
|
||||
# |------------|--------------------------|--------------------|---------|------------------------------------|
|
||||
# | **PROV-O** | `prov:used` | prov.ttl:1046-1057 | exact | Entity used by activity |
|
||||
# | **PROV-O** | `prov:wasInfluencedBy` | prov.ttl:1099-1121 | broad | Parent property (subPropertyOf) |
|
||||
# | **PROV-O** | `prov:qualifiedUsage` | prov.ttl:788-798 | narrow | Qualified usage with details |
|
||||
#
|
||||
# HIERARCHY: prov:used rdfs:subPropertyOf prov:wasInfluencedBy (line 1046)
|
||||
#
|
||||
# CREATED: 2026-01-26
|
||||
# UPDATED: 2026-02-03 - Added broad/narrow mappings, header documentation
|
||||
# ==============================================================================
|
||||
|
||||
id: https://nde.nl/ontology/hc/slot/retrieved_through
|
||||
name: retrieved_through
|
||||
title: Retrieved Through
|
||||
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
schema: http://schema.org/
|
||||
|
||||
imports:
|
||||
- linkml:types
|
||||
|
||||
default_prefix: hc
|
||||
|
||||
slots:
|
||||
retrieved_through:
|
||||
slot_uri: prov:used
|
||||
description: |
|
||||
To denote the specific method, protocol, or mechanism by which a resource or data was accessed, fetched, or collected.
|
||||
range: string
|
||||
exact_mappings:
|
||||
- prov:used # prov.ttl:1046-1057
|
||||
broad_mappings:
|
||||
- prov:wasInfluencedBy # prov.ttl:1099-1121 - parent (used subPropertyOf wasInfluencedBy)
|
||||
narrow_mappings:
|
||||
- prov:qualifiedUsage # prov.ttl:788-798 - qualified form with details
|
||||
comments:
|
||||
- |
|
||||
**ONTOLOGY ALIGNMENT** (verified against data/ontology/):
|
||||
|
||||
| Ontology | Property | Line | Mapping | Notes |
|
||||
|----------|----------|------|---------|-------|
|
||||
| PROV-O | prov:used | 1046-1057 | exact | Entity used by activity |
|
||||
| PROV-O | prov:wasInfluencedBy | 1099-1121 | broad | Parent property |
|
||||
| PROV-O | prov:qualifiedUsage | 788-798 | narrow | Qualified usage |
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 8. Anti-Patterns
|
||||
|
||||
### ❌ WRONG: Hallucinated ontology terms
|
||||
|
||||
```yaml
|
||||
exact_mappings:
|
||||
- prov:retrievedWith # ❌ Does not exist in PROV-O!
|
||||
- rico:wasObtainedBy # ❌ Not a real RiC-O property!
|
||||
```
|
||||
|
||||
### ❌ WRONG: No verification references
|
||||
|
||||
```yaml
|
||||
exact_mappings:
|
||||
- prov:used # No file/line reference - how do we know this is correct?
|
||||
```
|
||||
|
||||
### ✅ CORRECT: Verified with references
|
||||
|
||||
```yaml
|
||||
exact_mappings:
|
||||
- prov:used # prov.ttl:1046-1057 - "Entity used by activity"
|
||||
broad_mappings:
|
||||
- prov:wasInfluencedBy # prov.ttl:1099-1121 - parent property (verified subPropertyOf)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 9. Validation Checklist
|
||||
|
||||
Before completing a slot file, verify:
|
||||
|
||||
- [ ] Header comment block includes ontology alignment table
|
||||
- [ ] All mappings verified against actual ontology files in `data/ontology/`
|
||||
- [ ] File/line references provided for each mapping
|
||||
- [ ] `rdfs:subPropertyOf` relationships checked for broad/narrow mappings
|
||||
- [ ] HIERARCHY line documents any property hierarchies
|
||||
- [ ] No hallucinated or assumed ontology terms
|
||||
- [ ] YAML validates correctly
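The "file/line references provided for each mapping" item lends itself to an automated check. The following is a rough sketch under stated assumptions: the function name is hypothetical, the reference format is taken to be a trailing comment such as `# prov.ttl:1046-1057`, and the file is scanned as plain text rather than parsed as YAML (a YAML parser would discard the comments).

```python
import re

MAPPING_KEYS = ("exact_mappings", "close_mappings", "broad_mappings",
                "narrow_mappings", "related_mappings")
# Matches a trailing comment such as "# prov.ttl:1046-1057".
REFERENCE = re.compile(r"#\s*\S+\.(?:ttl|rdf|owl):\d+(?:-\d+)?")


def unreferenced_mappings(slot_yaml: str) -> list:
    """Return (line_number, text) for mapping entries that lack a file:line comment."""
    missing = []
    in_mapping_block = False
    for number, line in enumerate(slot_yaml.splitlines(), start=1):
        stripped = line.strip()
        if stripped.endswith(":") and stripped.rstrip(":") in MAPPING_KEYS:
            in_mapping_block = True
            continue
        if in_mapping_block and stripped.startswith("- "):
            if not REFERENCE.search(stripped):
                missing.append((number, stripped))
        elif stripped:
            # Any other non-blank line ends the current mapping list.
            in_mapping_block = False
    return missing
```

An empty result means every mapping entry carries a reference; it does not prove that the referenced lines actually define the term, so the manual check against `data/ontology/` still applies.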
|
||||
|
||||
---
|
||||
|
||||
## See Also
|
||||
|
||||
- Rule 1: Ontology Files Are Your Primary Reference (`no-hallucinated-ontology-references.md`)
|
||||
- Rule: Verified Ontology Terms (`verified-ontology-terms.md`)
|
||||
- Ontology files: `data/ontology/`
|
||||
|
||||
---
|
||||
|
||||
**Version**: 1.0.0
|
||||
**Created**: 2026-02-03
|
||||
**Author**: OpenCODE
|
||||
|
|
@ -1,68 +0,0 @@
|
|||
# Rule 62: Verified Ontology Terms Reference
|
||||
|
||||
🚨 **CRITICAL**: All `class_uri`, `slot_uri`, and mapping properties (`exact_mappings`, `close_mappings`, etc.) MUST use verified classes and predicates that exist in the local ontology files at `data/ontology/`.
|
||||
|
||||
## 1. Verified Ontology Files
|
||||
|
||||
The following ontologies are locally available in `data/ontology/`. Always verify terms against these specific files. **NO HALLUCINATIONS ALLOWED.**
|
||||
|
||||
**Mandatory Verification Step**: Before using any `class_uri`, `slot_uri`, or mapping URI, you MUST `grep` the term in the local ontology file to confirm it exists.
|
||||
|
||||
| Prefix | Namespace | Local File | Key Classes/Predicates (Verified) |
|
||||
|--------|-----------|------------|-----------------------------------|
|
||||
| `cpov:` | `http://data.europa.eu/m8g/` | `core-public-organisation-ap.ttl` | `PublicOrganisation`, `contactPage`, `email` |
|
||||
| `crm:` | `http://www.cidoc-crm.org/cidoc-crm/` | `CIDOC_CRM_v7.1.3.rdf` | `E1_CRM_Entity`, `E5_Event`, `P2_has_type` |
|
||||
| `rico:` | `https://www.ica.org/standards/RiC/ontology#` | `RiC-O_1-1.rdf` | `Record`, `Agent`, `hasOrHadHolder` (Note: Use v1.1 file) |
|
||||
| `pico:` | `https://personsincontext.org/model#` | `pico.ttl` | `PersonObservation`, `role` |
|
||||
| `prov:` | `http://www.w3.org/ns/prov#` | `prov.ttl` | `Activity`, `Agent`, `wasGeneratedBy` |
|
||||
| `skos:` | `http://www.w3.org/2004/02/skos/core#` | `skos.rdf` | `Concept`, `prefLabel`, `broader` |
|
||||
| `schema:` | `https://schema.org/` | `frontend/public/ontology/schemaorg.owl` | `Organization`, `Place`, `name`, `url` |
|
||||
| `dcterms:` | `http://purl.org/dc/terms/` | `dublin_core_elements.rdf` | `identifier`, `title`, `description` |
|
||||
| `org:` | `http://www.w3.org/ns/org#` | `org.rdf` | `Organization`, `hasMember` |
|
||||
| `tooi:` | `https://identifier.overheid.nl/tooi/def/ont/` | `tooiont.ttl` | `Overheidsorganisatie` |
|
||||
| `dcat:` | `http://www.w3.org/ns/dcat#` | `dcat3.ttl` | `Dataset`, `Catalog`, `dataset` |
|
||||
| `gn:` | `https://www.geonames.org/ontology#` | `geonames_ontology.rdf` | `Feature` |
|
||||
| `dqv:` | `http://www.w3.org/ns/dqv#` | `dqv.ttl` | `QualityMeasurement`, `hasQualityAnnotation` |
|
||||
| `premis:` | `http://www.loc.gov/premis/rdf/v3/` | `premis3.owl` | `fixity`, `storedAt`, `Event` |
|
||||
|
||||
## 2. Verification Procedure (MANDATORY)
|
||||
|
||||
**You MUST verify every term.** Do not assume a term exists just because it sounds standard.
|
||||
|
||||
```bash
|
||||
# 1. Identify the source ontology file
|
||||
ls data/ontology/
|
||||
|
||||
# 2. Grep for the specific term (e.g., 'hasFixity')
|
||||
grep "hasFixity" data/ontology/premis3.owl
|
||||
# Result: EMPTY -> Term does not exist! DO NOT USE.
|
||||
|
||||
# 3. Grep for the correct term (e.g., 'fixity')
|
||||
grep "fixity" data/ontology/premis3.owl
|
||||
# Result: <owl:ObjectProperty rdf:about=".../fixity"> -> Term exists. USE THIS.
|
||||
```
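The same existence check can be scripted when many terms need verifying. This is a small sketch, not project tooling: `find_term_lines` is a hypothetical helper that works on the ontology file's text and matches the local name only as a whole identifier, so a search for `fixity` does not also report longer names that merely contain it.

```python
import re


def find_term_lines(ontology_text: str, local_name: str) -> list:
    """Return 1-based line numbers where `local_name` appears as a whole identifier."""
    # The name must not be glued to other identifier characters on either side.
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(local_name) + r"(?![A-Za-z0-9_])"
    )
    return [
        number
        for number, line in enumerate(ontology_text.splitlines(), start=1)
        if pattern.search(line)
    ]
```

A typical call would read the file first, for example `find_term_lines(Path("data/ontology/premis3.owl").read_text(encoding="utf-8"), "fixity")`. An empty list corresponds to the "EMPTY" grep result above, and the returned numbers can be reused as file/line references.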
|
||||
|
||||
## 3. LinkML Mapping Requirements
|
||||
|
||||
Mappings must be precise and verified.
|
||||
|
||||
* `exact_mappings` = `skos:exactMatch` (Semantic equivalence)
|
||||
* `close_mappings` = `skos:closeMatch` (Near equivalence)
|
||||
* `related_mappings` = `skos:relatedMatch` (Association)
|
||||
* `broad_mappings` = `skos:broadMatch` (Broader concept)
|
||||
* `narrow_mappings` = `skos:narrowMatch` (Narrower concept)
|
||||
|
||||
## 4. Prohibited/Invalid Terms (Hallucinations)
|
||||
|
||||
Do NOT use these commonly hallucinated or incorrect terms. They have been verified as **non-existent** in our local ontologies:
|
||||
|
||||
* ❌ `dqv:ConfidenceScore` (Use `dqv:QualityMeasurement`)
|
||||
* ❌ `premis:hasFixity` (Use `premis:fixity`)
|
||||
* ❌ `premis:hasFrameRate` (Verify specific PREMIS properties first)
|
||||
* ❌ `schema:HeritageBuilding` (Use `schema:LandmarksOrHistoricalBuildings`)
|
||||
* ❌ `rico:has_provenance` (Use `rico:history`)
|
||||
* ❌ `rico:hasProvenance` (Use `rico:history`)
|
||||
* ❌ `schema:archive` (Use `schema:archiveHeld` or `schema:archivedAt`)
|
||||
|
||||
**Always verify against the local file content.**
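The list above can be turned into a quick pre-commit style check. The sketch below is illustrative only: the dictionary simply restates the terms and suggested replacements from this section, the function name is hypothetical, and a suggested replacement must still be verified against the local ontology file before use.

```python
# Prohibited term -> replacement suggested in this rule (None where no single replacement is given).
PROHIBITED_TERMS = {
    "dqv:ConfidenceScore": "dqv:QualityMeasurement",
    "premis:hasFixity": "premis:fixity",
    "premis:hasFrameRate": None,  # rule says: verify specific PREMIS properties first
    "schema:HeritageBuilding": "schema:LandmarksOrHistoricalBuildings",
    "rico:has_provenance": "rico:history",
    "rico:hasProvenance": "rico:history",
    "schema:archive": "schema:archiveHeld or schema:archivedAt",
}


def flag_prohibited(curies) -> dict:
    """Return {term: suggestion} for every prohibited CURIE found in `curies`."""
    return {curie: PROHIBITED_TERMS[curie] for curie in curies if curie in PROHIBITED_TERMS}
```

A term that is absent from this dictionary is not thereby verified; it only means it is not one of the known hallucinations.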
|
||||
|
||||
|
|
@ -1,162 +0,0 @@
|
|||
# Wikidata Mapping Discovery Rule
|
||||
|
||||
## Rule: Use Wikidata MCP to Discover and Verify Mappings Carefully
|
||||
|
||||
When adding Wikidata mappings to class files, you MUST verify the semantic meaning and relationship before adding any mapping.
|
||||
|
||||
### 🚨 CRITICAL: Always Verify Before Adding
|
||||
|
||||
**NEVER add a Wikidata QID without verifying:**
|
||||
1. What the entity actually IS (not just the label)
|
||||
2. That it's the SAME TYPE as your class (organization→organization, NOT organization→building)
|
||||
3. That the semantic relationship makes sense
|
||||
|
||||
### Workflow
|
||||
|
||||
#### Step 1: VERIFY Existing Mappings First
|
||||
|
||||
Before trusting any existing mapping, verify it:
|
||||
|
||||
```sparql
|
||||
SELECT ?item ?itemLabel ?itemDescription WHERE {
|
||||
VALUES ?item { wd:Q22075301 wd:Q1643722 wd:Q185583 }
|
||||
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
|
||||
}
|
||||
```
|
||||
|
||||
**Example of WRONG mappings found in codebase:**
|
||||
| QID | Label | Was Mapped To | WHY WRONG |
|
||||
|-----|-------|---------------|-----------|
|
||||
| Q22075301 | textile artwork | FacultyPaperCollection | Not related at all! |
|
||||
| Q1643722 | building in Vienna | UniversityAdministrativeFonds | Not an archival concept! |
|
||||
| Q185583 | candy | AcademicStudentRecordSeries | Completely unrelated! |
|
||||
|
||||
#### Step 2: Search for Candidates
|
||||
|
||||
Search for relevant Wikidata entities by keyword or hierarchy:
|
||||
|
||||
```sparql
|
||||
SELECT ?item ?itemLabel ?itemDescription WHERE {
|
||||
?item wdt:P279 wd:Q166118 . # subclasses of "archives"
|
||||
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
|
||||
}
|
||||
```
|
||||
|
||||
#### Step 3: VERIFY Each Candidate
|
||||
|
||||
For EVERY candidate found, verify:
|
||||
1. **Read the description** - does it match your class?
|
||||
2. **Check instance of (P31)** - is it the same type?
|
||||
3. **Check subclass of (P279)** - is it in a relevant hierarchy?
|
||||
|
||||
```sparql
|
||||
SELECT ?item ?itemLabel ?itemDescription ?instanceLabel ?subclassLabel WHERE {
|
||||
VALUES ?item { wd:Q9388534 }
|
||||
OPTIONAL { ?item wdt:P31 ?instance. }
|
||||
OPTIONAL { ?item wdt:P279 ?subclass. }
|
||||
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
|
||||
}
|
||||
```
|
||||
|
||||
#### Step 4: Confirm Semantic Relationship
|
||||
|
||||
Ask: **Would a domain expert agree this mapping makes sense?**
|
||||
|
||||
| Your Class | Wikidata Entity | Verdict |
|
||||
|------------|-----------------|---------|
|
||||
| FacultyPaperCollection | Q22075301 (textile artwork) | ❌ NO - completely unrelated |
|
||||
| CampusDocumentationCollection | Q9388534 (archival collection) | ✅ YES - semantically related |
|
||||
| AcademicArchive | Q27032435 (academic archive) | ✅ YES - exact match |
|
||||
|
||||
### Type Compatibility Rules
|
||||
|
||||
| Your Class Type | Valid Wikidata Types | Invalid Wikidata Types |
|
||||
|-----------------|---------------------|------------------------|
|
||||
| Organization | organization, institution | building, person, artwork |
|
||||
| Record Set Type | collection, fonds, series | building, candy, textile |
|
||||
| Event | activity, occurrence | organization, place |
|
||||
| Type/Category | type, concept, class | specific instances |
|
||||
|
||||
### Common Mistakes to Avoid
|
||||
|
||||
❌ **WRONG: Adding any QID found in search without verification**
|
||||
```
|
||||
"Found Q1643722 in search results, adding it as mapping"
|
||||
→ Result: Mapping a "building in Vienna" to "UniversityAdministrativeFonds"
|
||||
```
|
||||
|
||||
✅ **CORRECT: Verify description and type before adding**
|
||||
```
|
||||
1. Search finds Q1643722
|
||||
2. Verify: Q1643722 = "building in Vienna, Austria"
|
||||
3. Check: Is a building related to "UniversityAdministrativeFonds"?
|
||||
4. Decision: NO - do not add this mapping
|
||||
```
|
||||
|
||||
### When to Add Wikidata Mappings
|
||||
|
||||
Add Wikidata mappings ONLY when:
|
||||
- [ ] You verified the entity's label AND description
|
||||
- [ ] The entity is the same type as your class
|
||||
- [ ] The semantic relationship is clear (exact, broader, narrower, related)
|
||||
- [ ] A domain expert would agree the mapping makes sense
|
||||
|
||||
### When NOT to Add Wikidata Mappings
|
||||
|
||||
Do NOT add Wikidata mappings when:
|
||||
- You only searched but didn't verify the description
|
||||
- The entity type doesn't match (e.g., building vs. organization)
|
||||
- The relationship is unclear or forced
|
||||
- You're just trying to "fill in" mappings
|
||||
|
||||
### Mapping Categories
|
||||
|
||||
| Category | Wikidata Property | When to Use |
|
||||
|----------|-------------------|-------------|
|
||||
| `exact_mappings` | - | Same semantic meaning (rare!) |
|
||||
| `close_mappings` | - | Similar but not identical |
|
||||
| `broad_mappings` | P279 (subclass of) | Wikidata entity is BROADER |
|
||||
| `narrow_mappings` | inverse of P279 | Wikidata entity is NARROWER |
|
||||
| `related_mappings` | - | Non-hierarchical but semantically related |
|
||||
|
||||
### Checklist
|
||||
|
||||
For each Wikidata mapping:
|
||||
|
||||
- [ ] Verified entity label matches expected meaning
|
||||
- [ ] Verified entity description confirms semantic fit
|
||||
- [ ] Entity type is compatible with class type
|
||||
- [ ] Mapping category (exact/close/broad/narrow/related) is correct
|
||||
- [ ] A domain expert would agree this makes sense
|
||||
|
||||
### Example: Proper Verification for FacultyPaperCollection
|
||||
|
||||
**Step 1: What are we looking for?**
|
||||
- Personal papers of faculty members
|
||||
- Academic archives
|
||||
- Manuscript collections
|
||||
|
||||
**Step 2: Search**
|
||||
```sparql
|
||||
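SELECT ?item ?itemLabel ?itemDescription WHERE {
  # bif:contains is Virtuoso-specific; the Wikidata Query Service offers text search via MWAPI.
  # EntitySearch takes one search string, so run the query once per phrase.
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "www.wikidata.org" ;
                    wikibase:api "EntitySearch" ;
                    mwapi:search "faculty papers" ;
                    mwapi:language "en" .
    ?item wikibase:apiOutputItem mwapi:item .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} LIMIT 10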
|
||||
```
|
||||
|
||||
**Step 3: Verify candidates**
|
||||
- If no exact match found → DO NOT add a wrong mapping
|
||||
- Better to have NO Wikidata mapping than a WRONG one
|
||||
|
||||
**Step 4: Decision**
|
||||
- No exact Wikidata match for "FacultyPaperCollection"
|
||||
- Keep ontology mappings only (rico-rst:Fonds, bf:Archival)
|
||||
- Do NOT add unrelated QIDs like Q22075301 (textile artwork!)
|
||||
|
||||
### Integration with Other Rules
|
||||
|
||||
This rule complements:
|
||||
- **mapping-specificity-hypernym-rule.md**: Rules for choosing mapping type
|
||||
- **wikidata-mapping-verification-rule.md**: Rules for verifying QIDs exist
|
||||
- **verified-ontology-mapping-requirements.md**: General ontology verification
|
||||
|
|
@ -1,97 +0,0 @@
|
|||
# Wikidata Mapping Verification Rule
|
||||
|
||||
## Rule: Always Verify Wikidata Mappings Using Authenticated Tools
|
||||
|
||||
When adding or reviewing Wikidata mappings (wd:Qxxxxx), you MUST verify the entity exists and is semantically appropriate using the available tools.
|
||||
|
||||
### Verification Methods (in order of preference)
|
||||
|
||||
#### 1. Wikidata SPARQL Query (Primary)
|
||||
|
||||
Use `wikidata-authenticated_execute_sparql` to verify entity labels and descriptions:
|
||||
|
||||
```sparql
|
||||
SELECT ?item ?itemLabel ?itemDescription WHERE {
|
||||
VALUES ?item { wd:Q38723 wd:Q2385804 }
|
||||
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
|
||||
}
|
||||
```
|
||||
|
||||
#### 2. Wikidata Metadata API
|
||||
|
||||
Use `wikidata-authenticated_get_metadata` to retrieve label and description:
|
||||
|
||||
```
|
||||
entity_id: Q38723
|
||||
language: en
|
||||
```
|
||||
|
||||
#### 3. Web Search as Fallback
|
||||
|
||||
If authenticated tools fail, use `linkup_linkup-search` or `exa_web_search_exa`:
|
||||
```
|
||||
query: "Wikidata Q38723 higher education institution"
|
||||
```
|
||||
|
||||
### Common Errors to Avoid
|
||||
|
||||
| Error | Example | Fix |
|
||||
|-------|---------|-----|
|
||||
| **Wrong QID** | Q600875 (a person) for "academic program" | Use Q600134 (course) |
|
||||
| **Too broad** | Q35120 (entity) for specific class | Use appropriate subclass |
|
||||
| **Too narrow** | Q3918 (university) for general academic institution | Use Q38723 (higher education institution) |
|
||||
| **Different concept** | Q416703 (museum building) for museum organization | Use appropriate organizational class |
|
||||
|
||||
### Verification Checklist
|
||||
|
||||
Before committing any Wikidata mapping:
|
||||
|
||||
- [ ] QID exists (not 404)
|
||||
- [ ] Label matches expected concept
|
||||
- [ ] Description confirms semantic alignment
|
||||
- [ ] Mapping specificity follows Rule 63 (exact/broad/narrow/close)
|
||||
- [ ] Not a duplicate of another mapping in the same class
|
||||
|
||||
### Example Verification
|
||||
|
||||
**WRONG:**
|
||||
```yaml
|
||||
# Q600875 was not verified - it's actually a person
|
||||
close_mappings:
|
||||
- wd:Q600875 # Juan Lindolfo Cuestas - President of Uruguay!
|
||||
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
|
||||
# Verified via SPARQL: Q600134 = "course"
|
||||
close_mappings:
|
||||
- wd:Q600134 # program of study, or unit of teaching
|
||||
```
|
||||
|
||||
### SPARQL Query Template
|
||||
|
||||
```sparql
|
||||
SELECT ?item ?itemLabel ?itemDescription ?itemAltLabel WHERE {
|
||||
VALUES ?item { wd:Q38723 }
|
||||
OPTIONAL { ?item skos:altLabel ?itemAltLabel. FILTER(LANG(?itemAltLabel) = "en") }
|
||||
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
|
||||
}
|
||||
```
|
||||
|
||||
### Batch Verification
|
||||
|
||||
For multiple QIDs in a file, verify all at once:
|
||||
|
||||
```sparql
|
||||
SELECT ?item ?itemLabel ?itemDescription WHERE {
|
||||
VALUES ?item { wd:Q38723 wd:Q2385804 wd:Q600134 wd:Q3918 }
|
||||
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
|
||||
}
|
||||
```
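When the QIDs are collected from a file, the batch query can be assembled programmatically. This is a minimal sketch with a hypothetical function name; it only builds the query string shown above and leaves execution to whichever SPARQL tool is in use. The `language` argument is inserted as-is, so it is assumed to be a trusted language code.

```python
import re

QID_PATTERN = re.compile(r"Q[1-9]\d*")


def build_verification_query(qids, language: str = "en") -> str:
    """Build the batch label/description query for a collection of QIDs."""
    unique = []
    for qid in qids:
        if not QID_PATTERN.fullmatch(qid):
            raise ValueError(f"not a Wikidata QID: {qid!r}")
        if qid not in unique:  # keep first-seen order, drop duplicates
            unique.append(qid)
    if not unique:
        raise ValueError("at least one QID is required")
    values = " ".join(f"wd:{qid}" for qid in unique)
    return (
        "SELECT ?item ?itemLabel ?itemDescription WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        f'  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}". }}\n'
        "}"
    )
```

Rejecting malformed identifiers before the query is sent catches typos such as a missing `Q` early, but a well-formed QID can still point at the wrong entity, so the label and description must be read as the checklist requires.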
|
||||
|
||||
### Integration with Other Rules
|
||||
|
||||
This rule complements:
|
||||
- **Rule 63** (mapping-specificity-hypernym-rule.md): Determines mapping type (exact/broad/narrow)
|
||||
- **no-hallucinated-ontology-references.md**: Prevents fake ontology terms
|
||||
- **verified-ontology-terms.md**: General ontology verification
|
||||
|
|
@ -8,108 +8,15 @@ When mapping LinkML classes to external ontologies, you MUST distinguish between
|
|||
|
||||
1. **Exact Mappings (`skos:exactMatch`)**: Use ONLY when the external concept is **semantically equivalent** to your class.
|
||||
* *Example*: `hc:Person` `exact_mappings` `schema:Person`.
|
||||
* **CRITICAL**: Exact means the SAME semantic scope - neither broader nor narrower!
|
||||
* **DO NOT AVOID EXACT BY DEFAULT**: If equivalence is verified (including class/property category match and ontology definition review), `exact_mappings` SHOULD be used.
|
||||
|
||||
2. **Broad Mappings (`skos:broadMatch`)**: Use when the external concept is a **hypernym** (a broader, more general category) of your class.
|
||||
* *Example*: `hc:AcademicArchiveRecordSetType` `broad_mappings` `rico:RecordSetType`.
|
||||
* *Rationale*: An academic archive record set *is a* record set type, but `rico:RecordSetType` is broader.
|
||||
* *Common Hypernyms*: `skos:Concept`, `prov:Entity`, `prov:Activity`, `schema:Thing`, `schema:Organization`, `schema:Action`, `rico:RecordSetType`, `crm:E55_Type`.
|
||||
* *Common Hypernyms*: `skos:Concept`, `prov:Entity`, `schema:Thing`, `schema:Organization`, `rico:RecordSetType`.
|
||||
|
||||
3. **Narrow Mappings (`skos:narrowMatch`)**: Use when the external concept is a **hyponym** (a narrower, more specific category) of your class.
|
||||
* *Example*: `hc:Organization` `narrow_mappings` `hc:Library` (if mapping inversely).
|
||||
|
||||
4. **Close Mappings (`skos:closeMatch`)**: Use when the external concept is similar but not exactly equivalent.
|
||||
* *Example*: `hc:AccessPolicy` `close_mappings` `dcterms:accessRights` (related but different scope).
|
||||
|
||||
5. **Related Mappings (`skos:relatedMatch`)**: Use for non-hierarchical relationships.
|
||||
* *Example*: `hc:Collection` `related_mappings` `rico:RecordSet`.
|
||||
|
||||
### 🚨 Type Compatibility Rule
|
||||
|
||||
**Classes map to classes, properties map to properties.** Never mix types in mappings.
|
||||
|
||||
| Your Element | Valid Mapping Target |
|
||||
|--------------|---------------------|
|
||||
| Class | Class (owl:Class, rdfs:Class) |
|
||||
| Slot | Property (owl:ObjectProperty, owl:DatatypeProperty, rdf:Property) |
|
||||
|
||||
❌ **WRONG**:
|
||||
```yaml
|
||||
# AccessApplication is a CLASS, schema:Action is a CLASS - but Action is BROADER
|
||||
AccessApplication:
|
||||
exact_mappings:
|
||||
- schema:Action # WRONG: Action is a hypernym, not equivalent
|
||||
```
|
||||
|
||||
✅ **CORRECT**:
|
||||
```yaml
|
||||
AccessApplication:
|
||||
broad_mappings:
|
||||
- schema:Action # CORRECT: Action is the broader category
|
||||
```
|
||||
|
||||
### 🚨 No Self/Internal Exact Mappings
|
||||
|
||||
`exact_mappings` MUST NOT contain self-references or internal HC class references for the same concept.
|
||||
|
||||
❌ **WRONG**:
|
||||
```yaml
|
||||
AcademicArchive:
|
||||
exact_mappings:
|
||||
- hc:AcademicArchive # Self/internal reference; not an external equivalence mapping
|
||||
```
|
||||
|
||||
✅ **CORRECT**:
|
||||
```yaml
|
||||
AcademicArchive:
|
||||
exact_mappings:
|
||||
- wd:Q27032435 # External concept with equivalent semantic scope
|
||||
```
|
||||
|
||||
Use `exact_mappings` only for equivalent terms in external ontologies or external controlled vocabularies, not for repeating the class itself.
|
||||
|
||||
### ✅ Positive Guidance: When Exact Mapping Is Correct
|
||||
|
||||
Use `exact_mappings` when all checks below pass:
|
||||
|
||||
- Semantic scope is equivalent (not parent/child, not merely similar)
|
||||
- Ontological category matches (Class↔Class, Slot↔Property)
|
||||
- Target term is verified in the ontology source files under `data/ontology/` or verified Wikidata entity metadata
|
||||
- No self/internal duplication (no `hc:` self-reference for the same concept)
|
||||
|
||||
✅ **CORRECT**:
|
||||
```yaml
|
||||
Person:
|
||||
exact_mappings:
|
||||
- schema:Person
|
||||
|
||||
Acquisition:
|
||||
exact_mappings:
|
||||
- crm:E8_Acquisition
|
||||
```
|
||||
|
||||
Do not downgrade a truly equivalent mapping to `close_mappings` or `broad_mappings` just to be conservative.
|
||||
|
||||
### Common Hypernyms That Are NEVER Exact Mappings
|
||||
|
||||
These terms are always BROADER than your specific class - never use them as `exact_mappings`:
|
||||
|
||||
| Hypernym | What It Means | Use Instead |
|
||||
|----------|---------------|-------------|
|
||||
| `schema:Action` | Any action | `broad_mappings` |
|
||||
| `schema:Organization` | Any organization | `broad_mappings` |
|
||||
| `schema:Thing` | Anything at all | `broad_mappings` |
|
||||
| `schema:PropertyValue` | Any property value | `broad_mappings` |
|
||||
| `schema:Permit` | Any permit | `broad_mappings` |
|
||||
| `prov:Activity` | Any activity | `broad_mappings` |
|
||||
| `prov:Entity` | Any entity | `broad_mappings` |
|
||||
| `skos:Concept` | Any concept | `broad_mappings` |
|
||||
| `crm:E55_Type` | Any type classification | `broad_mappings` |
|
||||
| `crm:E42_Identifier` | Any identifier | `broad_mappings` |
|
||||
| `rico:Identifier` | Any identifier | `broad_mappings` |
|
||||
| `dcat:DataService` | Any data service | `broad_mappings` |
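
The table above lends itself to an automated check. The following is a minimal sketch, assuming the class definitions have already been parsed from YAML into a plain dictionary; the function name `find_hypernym_violations` and the sample data are hypothetical and not part of the repository.

```python
# Hypernyms from the table above; per the rule they are never valid exact_mappings.
GENERIC_HYPERNYMS = {
    "schema:Action", "schema:Organization", "schema:Thing",
    "schema:PropertyValue", "schema:Permit", "prov:Activity",
    "prov:Entity", "skos:Concept", "crm:E55_Type",
    "crm:E42_Identifier", "rico:Identifier", "dcat:DataService",
}


def find_hypernym_violations(classes: dict) -> list[tuple[str, str]]:
    """Return (class_name, term) pairs where a hypernym sits in exact_mappings."""
    violations = []
    for class_name, definition in classes.items():
        for term in (definition or {}).get("exact_mappings") or []:
            if term in GENERIC_HYPERNYMS:
                violations.append((class_name, term))
    return violations


# Hypothetical in-memory schema fragment (as if loaded from a class file).
sample_classes = {
    "AccessApplication": {"exact_mappings": ["schema:Action"]},
    "Acquisition": {"exact_mappings": ["crm:E8_Acquisition"]},
}
print(find_hypernym_violations(sample_classes))
```

Each reported pair is a candidate for moving the term to `broad_mappings`.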
|
||||
|
||||
### Common Violations to Avoid
|
||||
|
||||
❌ **WRONG**:
|
||||
|
|
@ -140,46 +47,8 @@ SocialMovement:
|
|||
- schema:Organization # CORRECT
|
||||
```
|
||||
|
||||
❌ **WRONG**:
|
||||
```yaml
|
||||
AccessApplication:
|
||||
exact_mappings:
|
||||
- schema:Action # WRONG: Action is a hypernym
|
||||
```
|
||||
|
||||
✅ **CORRECT**:
|
||||
```yaml
|
||||
AccessApplication:
|
||||
broad_mappings:
|
||||
- schema:Action # CORRECT: Action is the broader category
|
||||
```
|
||||
|
||||
### How to Determine Mapping Type
|
||||
|
||||
Ask these questions:
|
||||
|
||||
1. **Is it the SAME thing?** → `exact_mappings`
|
||||
- "Could I swap these two terms in any context without changing meaning?"
|
||||
- If NO, it's not an exact mapping
|
||||
|
||||
2. **Is the external term a PARENT category?** → `broad_mappings`
|
||||
- "Is my class a TYPE OF the external term?"
|
||||
- Example: AccessApplication IS-A Action
|
||||
|
||||
3. **Is the external term a CHILD category?** → `narrow_mappings`
|
||||
- "Is the external term a TYPE OF my class?"
|
||||
- Example: if your class is `Organization` and the external term is `Library`, then Library IS-A Organization, so `Library` goes under `narrow_mappings`
|
||||
|
||||
4. **Is it similar but not hierarchical?** → `close_mappings`
|
||||
- "Related but not equivalent or hierarchical"
|
||||
|
||||
5. **Is there some other relationship?** → `related_mappings`
|
||||
- "Connected in some way"
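
The five questions above amount to a lookup from the relationship you identified to the mapping slot to use. A minimal sketch, assuming the relationship has already been judged by a person; the `Relation` enum and `mapping_slot_for` are illustrative names, not part of LinkML or this project.

```python
from enum import Enum


class Relation(Enum):
    SAME = "same"        # interchangeable in any context
    PARENT = "parent"    # external term is broader than the class
    CHILD = "child"      # external term is narrower than the class
    SIMILAR = "similar"  # close in meaning, not hierarchical
    OTHER = "other"      # connected in some other way


# One mapping slot per answer, in the order the questions are asked.
MAPPING_SLOT = {
    Relation.SAME: "exact_mappings",
    Relation.PARENT: "broad_mappings",
    Relation.CHILD: "narrow_mappings",
    Relation.SIMILAR: "close_mappings",
    Relation.OTHER: "related_mappings",
}


def mapping_slot_for(relation: Relation) -> str:
    """Return the LinkML mapping slot that matches the judged relationship."""
    return MAPPING_SLOT[relation]


# AccessApplication IS-A Action, so schema:Action is a parent term.
print(mapping_slot_for(Relation.PARENT))
```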
|
||||
|
||||
### Verification Checklist
|
||||
|
||||
- [ ] Does the `exact_mapping` represent the **exact same scope**?
|
||||
- [ ] Is the external term a generic parent class (e.g., `Type`, `Concept`, `Entity`, `Action`, `Activity`, `Organization`)? → Move to `broad_mappings`
|
||||
- [ ] Is the external term a specific instance or subclass? → Check `narrow_mappings`
|
||||
- [ ] Is the external term the same type (class→class, property→property)?
|
||||
- [ ] Would swapping the terms change the meaning? If yes, not an `exact_mapping`
|
||||
- [ ] If the external term is a generic parent class (e.g., `Type`, `Concept`, `Entity`), move it to `broad_mappings`.
|
||||
- [ ] If the external term is a specific instance or subclass, check `narrow_mappings`.
|
||||
|
|
|
|||
|
|
@ -1,53 +0,0 @@
|
|||
# Rule: No Version Indicators in Names
|
||||
|
||||
## 🚨 Critical
|
||||
|
||||
Do not include version identifiers in **class names**, **slot names**, or **enum names**.
|
||||
|
||||
Version tags in semantic names create churn, break reuse, and force unnecessary migrations.
|
||||
|
||||
## The Rule
|
||||
|
||||
1. Use stable semantic names for LinkML elements.
|
||||
- ✅ `DigitalPlatform`
|
||||
- ❌ `DigitalPlatformV2`
|
||||
|
||||
2. If a model evolves, keep the name and update metadata/provenance.
|
||||
- Track revision in changelog, annotations, or transformation metadata.
|
||||
- Do not encode `v2`, `v3`, `_2026`, `beta`, `final` in the element name.
|
||||
|
||||
3. Apply this to all naming surfaces:
|
||||
- `classes:` keys
|
||||
- `slots:` keys
|
||||
- `enums:` keys
|
||||
- `name:` values in module files
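
A name check along these lines could be run over the keys listed above. This is a heuristic sketch only: the patterns are assumptions derived from the examples in this rule (`v2`, `v3`, `_2026`, `beta`, `final`), the function name is hypothetical, and legitimate names that happen to contain `V` followed by a digit would need manual review.

```python
import re

# Heuristic patterns for the version tags named in this rule.
_VERSION_PATTERNS = [
    re.compile(r"V\d+"),              # CamelCase tag, e.g. DigitalPlatformV2
    re.compile(r"_v\d+(?:_|$)"),      # snake_case tag, e.g. has_pattern_v2
    re.compile(r"_\d{4}(?:_|$)"),     # year tag, e.g. has_pattern_2026
    re.compile(r"(?:Beta|Final)$"),   # CamelCase suffix
    re.compile(r"_(?:beta|final)$"),  # snake_case suffix
]


def has_version_indicator(name: str) -> bool:
    """Return True if the element name appears to carry a version tag."""
    return any(pattern.search(name) for pattern in _VERSION_PATTERNS)


for candidate in ("DigitalPlatform", "DigitalPlatformV2", "has_pattern_2026"):
    print(candidate, has_version_indicator(candidate))
```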
|
||||
|
||||
## Allowed Versioning Locations
|
||||
|
||||
- File-level changelog/comments
|
||||
- Dedicated metadata classes/slots (e.g., transformation metadata)
|
||||
- External release tags (git tags, manifest versions)
|
||||
|
||||
## Migration Guidance
|
||||
|
||||
When you encounter versioned names:
|
||||
|
||||
1. Rename semantic elements to stable names.
|
||||
2. Update references/imports/usages accordingly.
|
||||
3. Preserve provenance of the migration in comments/annotations.
|
||||
|
||||
## Examples
|
||||
|
||||
✅ Correct:
|
||||
```yaml
|
||||
classes:
|
||||
DigitalPlatformTransformationMetadata:
|
||||
description: Metadata about record transformation steps.
|
||||
```
|
||||
|
||||
❌ Wrong:
|
||||
```yaml
|
||||
classes:
|
||||
DigitalPlatformV2TransformationMetadata:
|
||||
description: Metadata about V2 transformation.
|
||||
```
|
||||
|
|
@ -1,45 +0,0 @@
|
|||
# Rule: Polished Slot Storage Location
|
||||
|
||||
## Summary
|
||||
|
||||
Polished (refactored) canonical slot files MUST be stored in the parent `slots/` directory:
|
||||
|
||||
```
|
||||
schemas/20251121/linkml/modules/slots/
|
||||
```
|
||||
|
||||
They must **NOT** be stored in the `20260202_matang/` subdirectory.
|
||||
|
||||
## Rationale
|
||||
|
||||
The `20260202_matang/` subdirectory and its `new/` child contain **draft/unpolished** slot definitions that are pending review. Once a slot file has been polished (ontology-aligned, translated, cleaned), it graduates to the canonical `slots/` directory.
|
||||
|
||||
## Directory Structure
|
||||
|
||||
```
|
||||
schemas/20251121/linkml/modules/slots/
|
||||
├── *.yaml ← Polished canonical slot files go HERE
|
||||
└── 20260202_matang/
|
||||
├── *.yaml ← Draft/unpolished canonical slots (staging area)
|
||||
└── new/
|
||||
└── *.yaml ← Raw/draft slot definitions pending triage
|
||||
```
|
||||
|
||||
## Rule
|
||||
|
||||
- When polishing a slot file, write the result to `schemas/20251121/linkml/modules/slots/{slot_name}.yaml`
|
||||
- If the source file was in `20260202_matang/`, remove it from there after writing to `slots/`
|
||||
- If the source file was in `20260202_matang/new/`, it should only be deleted after user confirmation of alias absorption (per the no-autonomous-alias-assignment rule)
|
||||
- If a file already exists in `slots/` (i.e., it was previously polished in an earlier session), overwrite it in place
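
The bullets above can be sketched as two small path helpers. This is an illustration under assumptions, not tooling that exists in the repository: the function names are hypothetical, and the code only computes paths and describes the cleanup step without touching the file system.

```python
from pathlib import PurePosixPath

SLOTS_DIR = PurePosixPath("schemas/20251121/linkml/modules/slots")
STAGING_DIR = SLOTS_DIR / "20260202_matang"
RAW_DIR = STAGING_DIR / "new"


def polished_target(source: PurePosixPath) -> PurePosixPath:
    """Polished files always land directly in slots/, keeping the file name."""
    return SLOTS_DIR / source.name


def source_cleanup(source: PurePosixPath) -> str:
    """Describe what the rule says to do with the source file after polishing."""
    if source.parent == RAW_DIR:
        return "delete only after user confirmation"
    if source.parent == STAGING_DIR:
        return "remove after writing to slots/"
    return "overwrite in place"


draft = STAGING_DIR / "has_pattern.yaml"
print(polished_target(draft))
print(source_cleanup(draft))
```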
|
||||
|
||||
## Examples
|
||||
|
||||
**CORRECT:**
|
||||
```
|
||||
schemas/20251121/linkml/modules/slots/has_pattern.yaml ← polished file
|
||||
```
|
||||
|
||||
**WRONG:**
|
||||
```
|
||||
schemas/20251121/linkml/modules/slots/20260202_matang/has_pattern.yaml ← should not be here after polishing
|
||||
```
|
||||
Binary file not shown.
|
|
@ -3,10 +3,10 @@
|
|||
**Scope:** Schema Migration / Slot Fixes
|
||||
|
||||
**Description:**
|
||||
The file `slot_fixes.yaml` is the **single authoritative source** for tracking slot migrations and fixes.
|
||||
The file `/Users/kempersc/apps/glam/data/fixes/slot_fixes.yaml` is the **single authoritative source** for tracking slot migrations and fixes.
|
||||
|
||||
**Directives:**
|
||||
1. **Authoritative Source:** Always read and update `slot_fixes.yaml`.
|
||||
1. **Authoritative Source:** Always read and update `/Users/kempersc/apps/glam/data/fixes/slot_fixes.yaml`. Do NOT use `schemas/.../slot_fixes.yaml` as the master list (though you may need to sync them if they diverge, the `data/fixes` version takes precedence).
|
||||
2. **Processed Status:** When a slot migration is completed (schema updated, data migrated), you MUST update the entry in `slot_fixes.yaml` with a `processed` block containing:
|
||||
* `status: true`
|
||||
* `date: 'YYYY-MM-DD'`
|
||||
|
|
|
|||
|
|
@ -4,7 +4,7 @@
|
|||
|
||||
## Summary
|
||||
|
||||
The `revision` key in `slot_fixes.yaml` is **IMMUTABLE**. AI agents MUST follow revision specifications exactly and are NEVER permitted to modify the content of revision entries.
|
||||
The `revision` key in `schemas/20251121/linkml/modules/slots/slot_fixes.yaml` is **IMMUTABLE**. AI agents MUST follow revision specifications exactly and are NEVER permitted to modify the content of revision entries.
|
||||
|
||||
## The Authoritative Source
|
||||
|
||||
|
|
|
|||
|
|
@ -1,69 +0,0 @@
|
|||
# Rule: Slot Naming Convention (Current Style)
|
||||
|
||||
🚨 **CRITICAL**: New LinkML slot names MUST follow the current verb-first naming style used in active slot files under `modules/slots/`.
|
||||
|
||||
## Core Naming Rules
|
||||
|
||||
1. Use `snake_case`.
|
||||
2. Prefer short, descriptive verb predicates as canonical names.
|
||||
3. Keep names ontology-neutral (no ontology namespace prefixes in slot names).
|
||||
4. Use singular nouns in object positions (including multivalued slots).
|
||||
5. Keep temporal semantics in mappings/definitions when needed, not by forcing a legacy prefix.
|
||||
|
||||
## Preferred Patterns
|
||||
|
||||
### 1) Simple verb predicates (default)
|
||||
|
||||
Use a single verb when it clearly expresses the relation.
|
||||
|
||||
Examples from active slots:
|
||||
- `accept`
|
||||
- `contain`
|
||||
- `catalogue`
|
||||
- `exhibit`
|
||||
|
||||
### 2) Verb + particle/preposition when needed
|
||||
|
||||
Use compact phrasal forms when a preposition carries core meaning.
|
||||
|
||||
Examples:
|
||||
- `belong_to`
|
||||
- `located_in`
|
||||
- `derived_from`
|
||||
|
||||
### 3) Symmetric or directional pair pattern
|
||||
|
||||
Use `<present>_or_<past_participle>` when both directions/states are intentionally modeled in one predicate label.
|
||||
|
||||
Examples:
|
||||
- `contains_or_contained`
|
||||
- `includes_or_included`
|
||||
- `operates_or_operated`
|
||||
|
||||
## Legacy Compatibility
|
||||
|
||||
- For migrations, keep backward compatibility via `aliases` when renaming to current-style canonical names.
|
||||
- Do not rename canonical slots opportunistically; follow migration plans and canonical-slot protection rules.
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
- ❌ `rico_has_or_had_holder` (ontology prefix in name)
|
||||
- ❌ `collections` (plural noun predicate)
|
||||
- ❌ `has_museum_visitor_count` (class-specific slot name)
|
||||
- ❌ Creating new `has_or_had_*` names by default when a verb predicate is clearer
|
||||
|
||||
## Quick Checklist
|
||||
|
||||
- [ ] Is the canonical slot name verb-first and descriptive?
|
||||
- [ ] Is it `snake_case`?
|
||||
- [ ] Is the noun part singular?
|
||||
- [ ] Is the name ontology-neutral?
|
||||
- [ ] If renaming legacy slots, are aliases/migration constraints handled?
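
Two of the checklist items (`snake_case` and ontology neutrality) are mechanical enough to automate. A minimal sketch, assuming an illustrative prefix list; `check_slot_name` is a hypothetical helper, and the verb-first and singular-noun checks are left to human review.

```python
import re

SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+)*$")
# Illustrative prefix list (an assumption); extend it to the prefixes the schema declares.
ONTOLOGY_PREFIXES = ("rico", "schema", "crm", "prov", "skos", "dcat")


def check_slot_name(name: str) -> list[str]:
    """Return the mechanical naming problems found in a slot name."""
    problems = []
    if not SNAKE_CASE.match(name):
        problems.append("not snake_case")
    if name.split("_", 1)[0] in ONTOLOGY_PREFIXES:
        problems.append("ontology prefix in name")
    return problems


for candidate in ("belong_to", "rico_has_or_had_holder", "BelongTo"):
    print(candidate, check_slot_name(candidate))
```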
|
||||
|
||||
## See Also
|
||||
|
||||
- `.opencode/rules/archive/DEPRECATED-slot-naming-convention-rico-style.md`
|
||||
- `.opencode/rules/no-ontology-prefix-in-slot-names.md`
|
||||
- `.opencode/rules/slot-noun-singular-convention.md`
|
||||
- `.opencode/rules/generic-slots-specific-classes.md`
|
||||
- `.opencode/rules/canonical-slot-protection-rule.md`
|
||||
|
|
@ -76,5 +76,5 @@ When creating or renaming slots:
|
|||
|
||||
## See Also
|
||||
|
||||
- `.opencode/rules/slot-naming-convention-current-style.md` - Current slot naming patterns
|
||||
- `.opencode/rules/slot-naming-convention-rico-style.md` - RiC-O naming patterns
|
||||
- `.opencode/rules/slot-centralization-and-semantic-uri-rule.md` - Slot centralization requirements
|
||||
|
|
|
|||
|
|
@ -1,162 +0,0 @@
|
|||
# Wikidata Mapping Discovery Rule
|
||||
|
||||
## Rule: Use Wikidata MCP to Discover and Verify Mappings Carefully
|
||||
|
||||
When adding Wikidata mappings to class files, you MUST verify the semantic meaning and relationship before adding any mapping.
|
||||
|
||||
### 🚨 CRITICAL: Always Verify Before Adding
|
||||
|
||||
**NEVER add a Wikidata QID without verifying:**
|
||||
1. What the entity actually IS (not just the label)
|
||||
2. That it's the SAME TYPE as your class (organization→organization, NOT organization→building)
|
||||
3. That the semantic relationship makes sense
|
||||
|
||||
### Workflow
|
||||
|
||||
#### Step 1: VERIFY Existing Mappings First
|
||||
|
||||
Before trusting any existing mapping, verify it:
|
||||
|
||||
```sparql
|
||||
SELECT ?item ?itemLabel ?itemDescription WHERE {
|
||||
VALUES ?item { wd:Q22075301 wd:Q1643722 wd:Q185583 }
|
||||
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
|
||||
}
|
||||
```
|
||||
|
||||
**Example of WRONG mappings found in codebase:**
|
||||
| QID | Label | Was Mapped To | WHY WRONG |
|
||||
|-----|-------|---------------|-----------|
|
||||
| Q22075301 | textile artwork | FacultyPaperCollection | Not related at all! |
|
||||
| Q1643722 | building in Vienna | UniversityAdministrativeFonds | Not an archival concept! |
|
||||
| Q185583 | candy | AcademicStudentRecordSeries | Completely unrelated! |
|
||||
|
||||
#### Step 2: Search for Candidates
|
||||
|
||||
Search for relevant Wikidata entities by keyword or hierarchy:
|
||||
|
||||
```sparql
|
||||
SELECT ?item ?itemLabel ?itemDescription WHERE {
|
||||
?item wdt:P279 wd:Q166118 . # subclasses of "archives"
|
||||
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
|
||||
}
|
||||
```
|
||||
|
||||
#### Step 3: VERIFY Each Candidate
|
||||
|
||||
For EVERY candidate found, verify:
|
||||
1. **Read the description** - does it match your class?
|
||||
2. **Check instance of (P31)** - is it the same type?
|
||||
3. **Check subclass of (P279)** - is it in a relevant hierarchy?
|
||||
|
||||
```sparql
|
||||
SELECT ?item ?itemLabel ?itemDescription ?instanceLabel ?subclassLabel WHERE {
|
||||
VALUES ?item { wd:Q9388534 }
|
||||
OPTIONAL { ?item wdt:P31 ?instance. }
|
||||
OPTIONAL { ?item wdt:P279 ?subclass. }
|
||||
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
|
||||
}
|
||||
```
|
||||
|
||||
#### Step 4: Confirm Semantic Relationship
|
||||
|
||||
Ask: **Would a domain expert agree this mapping makes sense?**
|
||||
|
||||
| Your Class | Wikidata Entity | Verdict |
|
||||
|------------|-----------------|---------|
|
||||
| FacultyPaperCollection | Q22075301 (textile artwork) | ❌ NO - completely unrelated |
|
||||
| CampusDocumentationCollection | Q9388534 (archival collection) | ✅ YES - semantically related |
|
||||
| AcademicArchive | Q27032435 (academic archive) | ✅ YES - exact match |
|
||||
|
||||
### Type Compatibility Rules
|
||||
|
||||
| Your Class Type | Valid Wikidata Types | Invalid Wikidata Types |
|
||||
|-----------------|---------------------|------------------------|
|
||||
| Organization | organization, institution | building, person, artwork |
|
||||
| Record Set Type | collection, fonds, series | building, candy, textile |
|
||||
| Event | activity, occurrence | organization, place |
|
||||
| Type/Category | type, concept, class | specific instances |
|
||||
|
||||
### Common Mistakes to Avoid
|
||||
|
||||
❌ **WRONG: Adding any QID found in search without verification**
|
||||
```
|
||||
"Found Q1643722 in search results, adding it as mapping"
|
||||
→ Result: Mapping a "building in Vienna" to "UniversityAdministrativeFonds"
|
||||
```
|
||||
|
||||
✅ **CORRECT: Verify description and type before adding**
|
||||
```
|
||||
1. Search finds Q1643722
|
||||
2. Verify: Q1643722 = "building in Vienna, Austria"
|
||||
3. Check: Is a building related to "UniversityAdministrativeFonds"?
|
||||
4. Decision: NO - do not add this mapping
|
||||
```
|
||||
|
||||
### When to Add Wikidata Mappings
|
||||
|
||||
Add Wikidata mappings ONLY when:
|
||||
- [ ] You verified the entity's label AND description
|
||||
- [ ] The entity is the same type as your class
|
||||
- [ ] The semantic relationship is clear (exact, broader, narrower, related)
|
||||
- [ ] A domain expert would agree the mapping makes sense
|
||||
|
||||
### When NOT to Add Wikidata Mappings
|
||||
|
||||
Do NOT add Wikidata mappings when:
|
||||
- You only searched but didn't verify the description
|
||||
- The entity type doesn't match (e.g., building vs. organization)
|
||||
- The relationship is unclear or forced
|
||||
- You're just trying to "fill in" mappings
|
||||
|
||||
### Mapping Categories
|
||||
|
||||
| Category | Wikidata Property | When to Use |
|
||||
|----------|-------------------|-------------|
|
||||
| `exact_mappings` | - | Same semantic meaning (rare!) |
|
||||
| `close_mappings` | - | Similar but not identical |
|
||||
| `broad_mappings` | P279 (subclass of) | Wikidata entity is BROADER |
|
||||
| `narrow_mappings` | inverse of P279 | Wikidata entity is NARROWER |
|
||||
| `related_mappings` | - | Non-hierarchical but semantically related |
|
||||
|
||||
### Checklist
|
||||
|
||||
For each Wikidata mapping:
|
||||
|
||||
- [ ] Verified entity label matches expected meaning
|
||||
- [ ] Verified entity description confirms semantic fit
|
||||
- [ ] Entity type is compatible with class type
|
||||
- [ ] Mapping category (exact/close/broad/narrow/related) is correct
|
||||
- [ ] A domain expert would agree this makes sense
|
||||
|
||||
### Example: Proper Verification for FacultyPaperCollection
|
||||
|
||||
**Step 1: What are we looking for?**
|
||||
- Personal papers of faculty members
|
||||
- Academic archives
|
||||
- Manuscript collections
|
||||
|
||||
**Step 2: Search**
|
||||
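```sparql
SELECT ?item ?itemLabel ?itemDescription WHERE {
  # bif:contains is Virtuoso-specific; on the Wikidata Query Service use the MediaWiki API search (one phrase per query).
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "www.wikidata.org" ;
                    wikibase:api "EntitySearch" ;
                    mwapi:search "personal papers" ;
                    mwapi:language "en" .
    ?item wikibase:apiOutputItem mwapi:item .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} LIMIT 10
```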
|
||||
|
||||
**Step 3: Verify candidates**
|
||||
- If no exact match found → DO NOT add a wrong mapping
|
||||
- Better to have NO Wikidata mapping than a WRONG one
|
||||
|
||||
**Step 4: Decision**
|
||||
- No exact Wikidata match for "FacultyPaperCollection"
|
||||
- Keep ontology mappings only (rico-rst:Fonds, bf:Archival)
|
||||
- Do NOT add unrelated QIDs like Q22075301 (textile artwork!)
|
||||
|
||||
### Integration with Other Rules
|
||||
|
||||
This rule complements:
|
||||
- **mapping-specificity-hypernym-rule.md**: Rules for choosing mapping type
|
||||
- **wikidata-mapping-verification-rule.md**: Rules for verifying QIDs exist
|
||||
- **verified-ontology-mapping-requirements.md**: General ontology verification
|
||||
|
|
@ -1,97 +0,0 @@
|
|||
# Wikidata Mapping Verification Rule
|
||||
|
||||
## Rule: Always Verify Wikidata Mappings Using Authenticated Tools
|
||||
|
||||
When adding or reviewing Wikidata mappings (wd:Qxxxxx), you MUST verify the entity exists and is semantically appropriate using the available tools.
|
||||
|
||||
### Verification Methods (in order of preference)
|
||||
|
||||
#### 1. Wikidata SPARQL Query (Primary)
|
||||
|
||||
Use `wikidata-authenticated_execute_sparql` to verify entity labels and descriptions:
|
||||
|
||||
```sparql
|
||||
SELECT ?item ?itemLabel ?itemDescription WHERE {
|
||||
VALUES ?item { wd:Q38723 wd:Q2385804 }
|
||||
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
|
||||
}
|
||||
```
|
||||
|
||||
#### 2. Wikidata Metadata API
|
||||
|
||||
Use `wikidata-authenticated_get_metadata` to retrieve label and description:
|
||||
|
||||
```
|
||||
entity_id: Q38723
|
||||
language: en
|
||||
```
|
||||
|
||||
#### 3. Web Search as Fallback
|
||||
|
||||
If authenticated tools fail, use `linkup_linkup-search` or `exa_web_search_exa`:
|
||||
```
|
||||
query: "Wikidata Q38723 higher education institution"
|
||||
```
|
||||
|
||||
### Common Errors to Avoid
|
||||
|
||||
| Error | Example | Fix |
|
||||
|-------|---------|-----|
|
||||
| **Wrong QID** | Q600875 (a person) for "academic program" | Q600134 (course) |
|
||||
| **Too broad** | Q35120 (entity) for specific class | Use appropriate subclass |
|
||||
| **Too narrow** | Q3918 (university) for general academic institution | Use Q38723 (higher education institution) |
|
||||
| **Different concept** | Q416703 (museum building) for museum organization | Use appropriate organizational class |
|
||||
|
||||
### Verification Checklist
|
||||
|
||||
Before committing any Wikidata mapping:
|
||||
|
||||
- [ ] QID exists (not 404)
|
||||
- [ ] Label matches expected concept
|
||||
- [ ] Description confirms semantic alignment
|
||||
- [ ] Mapping specificity follows Rule 63 (exact/broad/narrow/close)
|
||||
- [ ] Not a duplicate of another mapping in the same class
|
||||
|
||||
### Example Verification
|
||||
|
||||
**WRONG:**
|
||||
```yaml
|
||||
# Q600875 was not verified - it's actually a person
|
||||
close_mappings:
|
||||
- wd:Q600875 # Juan Lindolfo Cuestas - President of Uruguay!
|
||||
```
|
||||
|
||||
**CORRECT:**
|
||||
```yaml
|
||||
# Verified via SPARQL: Q600134 = "course"
|
||||
close_mappings:
|
||||
- wd:Q600134 # program of study, or unit of teaching
|
||||
```
|
||||
|
||||
### SPARQL Query Template
|
||||
|
||||
```sparql
|
||||
SELECT ?item ?itemLabel ?itemDescription ?itemAltLabel WHERE {
|
||||
VALUES ?item { wd:Q38723 }
|
||||
OPTIONAL { ?item skos:altLabel ?itemAltLabel. FILTER(LANG(?itemAltLabel) = "en") }
|
||||
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
|
||||
}
|
||||
```
|
||||
|
||||
### Batch Verification
|
||||
|
||||
For multiple QIDs in a file, verify all at once:
|
||||
|
||||
```sparql
|
||||
SELECT ?item ?itemLabel ?itemDescription WHERE {
|
||||
VALUES ?item { wd:Q38723 wd:Q2385804 wd:Q600134 wd:Q3918 }
|
||||
SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
|
||||
}
|
||||
```
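
When the QIDs come from a schema file rather than being typed by hand, the batch query can be assembled programmatically. A minimal sketch: `build_verification_query` is a hypothetical helper that only builds the query string shown above; executing it (for example through `wikidata-authenticated_execute_sparql`) is out of scope here.

```python
import re

QID = re.compile(r"^Q[1-9]\d*$")


def build_verification_query(qids: list[str], language: str = "en") -> str:
    """Build one label/description lookup query for a batch of QIDs."""
    bad = [q for q in qids if not QID.match(q)]
    if bad:
        raise ValueError(f"not valid QIDs: {bad}")
    values = " ".join(f"wd:{q}" for q in qids)
    return (
        "SELECT ?item ?itemLabel ?itemDescription WHERE {\n"
        f"  VALUES ?item {{ {values} }}\n"
        f'  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}". }}\n'
        "}"
    )


print(build_verification_query(["Q38723", "Q2385804", "Q600134", "Q3918"]))
```

Rejecting malformed identifiers before the query is sent keeps typos such as a missing `Q` from silently returning an empty result.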
|
||||
|
||||
### Integration with Other Rules
|
||||
|
||||
This rule complements:
|
||||
- **Rule 63** (mapping-specificity-hypernym-rule.md): Determines mapping type (exact/broad/narrow)
|
||||
- **no-hallucinated-ontology-references.md**: Prevents fake ontology terms
|
||||
- **verified-ontology-terms.md**: General ontology verification
|
||||
10
AGENTS.md
|
|
@ -33,16 +33,6 @@ AcademicArchiveRecordSetType:
|
|||
|
||||
**See**: `.opencode/rules/mapping-specificity-hypernym-rule.md` for complete documentation.
|
||||
|
||||
### Rule 64: Archive Organization Type Descriptions
|
||||
|
||||
🚨 **CRITICAL**: Archive classes that do NOT have `recordType` or `hold_record_set` as a primary distinguishing feature represent **archives as organizations**, not just collections of records.
|
||||
|
||||
**The Rule**:
|
||||
- **Archive Organization Types** (e.g., `BankArchive`, `ChurchArchive`, `MunicipalArchive`): Emphasize institutional characteristics—governance, funding, legal status, parent organization relationships, and organizational functions.
|
||||
- **Record Set Types** (have recordType): Focus on the nature and format of the records themselves.
|
||||
|
||||
**See**: `.opencode/rules/archive-organization-type-description-rule.md` for complete documentation.
|
||||
|
||||
### Rule: Ontology Detection vs Heuristics
|
||||
🚨 **CRITICAL**: When detecting classes and predicates in `data/ontology/` or external ontology files, you must **read the actual ontology definitions** (e.g., RDF, OWL, TTL files) to determine if a term is a Class or a Property. Do not rely on naming heuristics (like "Capitalized means Class").
|
||||
|
||||
|
|
|
|||
|
|
@ -359,7 +359,7 @@ classes:
|
|||
range: WikidataEnrichment
|
||||
description: Full Wikidata enrichment data
|
||||
ghcid:
|
||||
range: GHCIDBlock
|
||||
range: GhcidBlock
|
||||
description: GHCID generation metadata with history
|
||||
web_claims:
|
||||
range: WebClaimsBlock
|
||||
|
|
@ -1174,7 +1174,7 @@ classes:
|
|||
# GHCID BLOCK - Heritage Custodian ID with history
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
GHCIDBlock:
|
||||
GhcidBlock:
|
||||
description: GHCID generation metadata and history
|
||||
attributes:
|
||||
ghcid_current:
|
||||
|
|
@ -1203,7 +1203,7 @@ classes:
|
|||
range: datetime
|
||||
description: When GHCID was generated
|
||||
ghcid_history:
|
||||
range: GHCIDHistoryEntry
|
||||
range: GhcidHistoryEntry
|
||||
multivalued: true
|
||||
inlined_as_list: true
|
||||
description: History of GHCID changes
|
||||
|
|
@ -1220,7 +1220,7 @@ classes:
|
|||
range: boolean
|
||||
description: Whether a collision was detected and resolved
|
||||
|
||||
GHCIDHistoryEntry:
|
||||
GhcidHistoryEntry:
|
||||
description: Historical GHCID entry with validity period
|
||||
attributes:
|
||||
ghcid:
|
||||
|
|
|
|||
|
|
@ -75,8 +75,8 @@ imports:
|
|||
- ./modules/classes/ProvenanceSources
|
||||
- ./modules/classes/SourceRecord
|
||||
# Identifiers Domain
|
||||
- ./modules/classes/GHCIDBlock
|
||||
- ./modules/classes/GHCIDHistoryEntry
|
||||
- ./modules/classes/GhcidBlock
|
||||
- ./modules/classes/GhcidHistoryEntry
|
||||
- ./modules/classes/Identifier
|
||||
# Location Domain
|
||||
- ./modules/classes/CoordinateProvenance
|
||||
|
|
|
|||
|
|
@ -66,7 +66,7 @@ instances:
|
|||
examples:
|
||||
- name: Private art collector
|
||||
description: Individual maintaining personal art collection
|
||||
GHCID_type: P # Personal collection
|
||||
ghcid_type: P # Personal collection
|
||||
- name: Family archivist
|
||||
description: Individual preserving family papers and photographs
|
||||
- name: Independent researcher
|
||||
|
|
|
|||
|
|
@ -1143,13 +1143,13 @@
|
|||
"category": "classes"
|
||||
},
|
||||
{
|
||||
"name": "GHCIDBlock",
|
||||
"path": "modules/classes/GHCIDBlock.yaml",
|
||||
"name": "GhcidBlock",
|
||||
"path": "modules/classes/GhcidBlock.yaml",
|
||||
"category": "classes"
|
||||
},
|
||||
{
|
||||
"name": "GHCIDHistoryEntry",
|
||||
"path": "modules/classes/GHCIDHistoryEntry.yaml",
|
||||
"name": "GhcidHistoryEntry",
|
||||
"path": "modules/classes/GhcidHistoryEntry.yaml",
|
||||
"category": "classes"
|
||||
},
|
||||
{
|
||||
|
|
|
|||
|
|
@ -24,7 +24,7 @@ imports:
|
|||
- ./CustodianNameConsensus
|
||||
- ./DigitalPlatform
|
||||
- ./GenealogiewerkbalkEnrichment
|
||||
- ./GHCIDBlock
|
||||
- ./GhcidBlock
|
||||
- ./GoogleMapsEnrichment
|
||||
- ./GoogleMapsPlaywrightEnrichment
|
||||
- ./Identifier
|
||||
|
|
@ -94,7 +94,7 @@ classes:
|
|||
range: WikidataEnrichment
|
||||
description: Full Wikidata enrichment data
|
||||
ghcid:
|
||||
range: GHCIDBlock
|
||||
range: GhcidBlock
|
||||
description: GHCID generation metadata with history
|
||||
has_or_had_web_claim:
|
||||
range: WebClaimsBlock
|
||||
|
|
|
|||
|
|
@ -1,10 +1,10 @@
|
|||
# GHCIDBlock - GHCID generation metadata and history
|
||||
# GhcidBlock - GHCID generation metadata and history
|
||||
# Extracted from custodian_source.yaml per Rule 38 (modular schema files)
|
||||
# Extraction date: 2026-01-08
|
||||
|
||||
id: https://nde.nl/ontology/hc/classes/GHCIDBlock
|
||||
name: GHCIDBlock
|
||||
title: GHCIDBlock
|
||||
id: https://nde.nl/ontology/hc/classes/GhcidBlock
|
||||
name: GhcidBlock
|
||||
title: GhcidBlock
|
||||
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
|
|
@ -17,12 +17,12 @@ imports:
|
|||
- linkml:types
|
||||
|
||||
|
||||
- ./GHCIDHistoryEntry
|
||||
- ./GhcidHistoryEntry
|
||||
- ./LocationResolution
|
||||
default_range: string
|
||||
|
||||
classes:
|
||||
GHCIDBlock:
|
||||
GhcidBlock:
|
||||
description: GHCID generation metadata and history
|
||||
attributes:
|
||||
ghcid_current:
|
||||
|
|
@ -51,7 +51,7 @@ classes:
|
|||
range: datetime
|
||||
description: When GHCID was generated
|
||||
ghcid_history:
|
||||
range: GHCIDHistoryEntry
|
||||
range: GhcidHistoryEntry
|
||||
multivalued: true
|
||||
inlined_as_list: true
|
||||
description: History of GHCID changes
|
||||
|
|
|
|||
|
|
@ -1,10 +1,10 @@
|
|||
# GHCIDHistoryEntry - Historical GHCID entry with validity period
|
||||
# GhcidHistoryEntry - Historical GHCID entry with validity period
|
||||
# Extracted from custodian_source.yaml per Rule 38 (modular schema files)
|
||||
# Extraction date: 2026-01-08
|
||||
|
||||
id: https://nde.nl/ontology/hc/classes/GHCIDHistoryEntry
|
||||
name: GHCIDHistoryEntry
|
||||
title: GHCIDHistoryEntry
|
||||
id: https://nde.nl/ontology/hc/classes/GhcidHistoryEntry
|
||||
name: GhcidHistoryEntry
|
||||
title: GhcidHistoryEntry
|
||||
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
|
|
@ -20,7 +20,7 @@ imports:
|
|||
default_range: string
|
||||
|
||||
classes:
|
||||
GHCIDHistoryEntry:
|
||||
GhcidHistoryEntry:
|
||||
description: Historical GHCID entry with validity period
|
||||
attributes:
|
||||
ghcid:
|
||||
|
|
|
|||
|
|
@ -1,6 +1,6 @@
|
|||
id: https://nde.nl/ontology/hc/slot/ghcid
|
||||
name: ghcid_slot
|
||||
title: GHCID Slot
|
||||
title: Ghcid Slot
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
|
|
|
|||
|
|
@ -1,123 +0,0 @@
|
|||
---
|
||||
Goal
|
||||
Improve the quality of LinkML class files in a GLAM (Galleries, Libraries, Archives, Museums) ontology project by:
|
||||
1. Writing proper dictionary-style descriptions without repeating class names
|
||||
2. Adding multilingual support (alt_descriptions and structured_aliases in 7 languages: nl, de, fr, es, ar, id, zh)
|
||||
3. Structuring data properly using LinkML annotations (examples:, keywords:, comments:)
|
||||
4. Ensuring all ontology mappings use the correct category (exact_mappings vs broad_mappings vs close_mappings vs narrow_mappings vs related_mappings)
|
||||
5. Verifying all Wikidata mappings are semantically correct
|
||||
6. Creating and updating rules in .opencode/rules/
|
||||
Instructions
|
||||
Key Rules Created/Updated
|
||||
1. class-description-quality-rule.md:
|
||||
- No repetition of class name in descriptions
|
||||
- MIGRATE structured data before removing from descriptions
|
||||
- Use folded block scalar (>-)
|
||||
- Use examples: annotation properly
|
||||
2. linkml-yaml-best-practices-rule.md:
|
||||
- equals_expression anti-pattern → use equals_string or any_of
|
||||
- Declare all prefixes
|
||||
- Import referenced classes
|
||||
- Quote regex patterns and annotation values
|
||||
3. class-multilingual-support-rule.md:
|
||||
- Required languages: nl, de, fr, es, ar, id, zh
|
||||
- Structure for alt_descriptions and structured_aliases
|
||||
4. mapping-specificity-hypernym-rule.md (updated):
|
||||
- Classes that are NEVER exact mappings: schema:Action, schema:Organization, schema:Thing, schema:PropertyValue, schema:Permit, prov:Activity, skos:Concept, crm:E55_Type, crm:E42_Identifier, dcat:DataService
|
||||
- Type compatibility: class→class, property→property
|
||||
- Decision tree for mapping type
|
||||
5. wikidata-mapping-verification-rule.md:
|
||||
- Use wikidata-authenticated_execute_sparql to verify QIDs
|
||||
- SPARQL templates for batch verification
|
||||
6. wikidata-mapping-discovery-rule.md (updated):
|
||||
- ALWAYS verify BOTH label AND description before adding
|
||||
- Check type compatibility (organization→organization, NOT organization→building)
|
||||
- Examples of WRONG mappings found: Q22075301 (textile artwork) was mapped to FacultyPaperCollection!
|
||||
- "Better no mapping than wrong mapping" principle
|
||||
Discoveries
|
||||
1. Wrong Wikidata mappings found and removed:
|
||||
- Q22075301 (textile artwork) → was mapped to FacultyPaperCollection ❌
|
||||
- Q1643722 (building in Vienna) → was mapped to UniversityAdministrativeFonds ❌
|
||||
- Q185583 (candy) → was mapped to AcademicStudentRecordSeries ❌
|
||||
2. Wrong mapping categories corrected:
|
||||
- schema:Action was exact_mappings → should be broad_mappings
|
||||
- crm:E55_Type was exact_mappings → should be broad_mappings
|
||||
- prov:Activity was exact_mappings → should be broad_mappings
|
||||
- schema:Organization was exact_mappings → should be broad_mappings
|
||||
- crm:E42_Identifier was exact_mappings → should be broad_mappings
|
||||
3. Verified correct Wikidata mappings:
|
||||
- Q27032435 (academic archive) → AcademicArchive (exact) ✓
|
||||
- Q38723 (higher education institution) → AcademicInstitution (exact) ✓
|
||||
- Q9388534 (archival collection) → CampusDocumentationCollection (related) ✓
|
||||
4. Conceptual model clarification:
|
||||
- AcademicArchive = the institution (organizational entity)
|
||||
- AcademicArchiveRecordSetType = the classification of record sets
|
||||
Accomplished
|
||||
Fully Processed Class Files (25 files)
|
||||
All with: dictionary-style descriptions, 7-language alt_descriptions, 7-language structured_aliases, proper examples, keywords, correct mapping categories, verified Wikidata mappings
|
||||
| File | Key Mappings |
|------|--------------|
| AcademicArchive.yaml | wd:Q27032435 (exact), wd:Q166118 (broad) |
| AcademicArchiveRecordSetType.yaml | wd:Q27032435 (close), rico:RecordSetType (broad) |
| AcademicArchiveRecordSetTypes.yaml | 4 subclasses, rico-rst mappings |
| AcademicInstitution.yaml | wd:Q38723 (exact), wd:Q4671277 (close) |
| AcademicProgram.yaml | schema:EducationalOccupationalProgram (exact), wd:Q600134 (close) |
| Access.yaml | dcterms:RightsStatement (exact) |
| AccessApplication.yaml | schema:Action (broad) |
| AccessControl.yaml | schema:DigitalDocumentPermission (close) |
| AccessibilityFeature.yaml | schema:LocationFeatureSpecification (close) |
| AccessInterface.yaml | dcat:DataService (broad) |
| AccessionEvent.yaml | crm:E63_Beginning_of_Existence (broad) |
| AccessionNumber.yaml | crm:E42_Identifier (broad), rico:Identifier (broad) |
| AccessLevel.yaml | skos:Concept (broad) |
| AccessTriggerEvent.yaml | prov:Activity (broad) |
| AccountIdentifier.yaml | schema:PropertyValue (broad) |
| AccountStatus.yaml | skos:Concept (broad) |
| Accreditation.yaml | schema:Permit (broad) |
| AccreditationBody.yaml | schema:Organization (broad) |
| AccreditationEvent.yaml | prov:Activity (broad) |
| Accumulation.yaml | rico:AccumulationRelation (exact) |
| AccuracyLevel.yaml | skos:Concept (broad) |
| Acquisition.yaml | crm:E8_Acquisition (exact) |
| AcquisitionEvent.yaml | crm:E10_Transfer_of_Custody (exact) |
| AcquisitionMethod.yaml | crm:E55_Type (broad) |
|
||||
Rules Created (6 new/updated)
|
||||
- class-description-quality-rule.md
|
||||
- class-multilingual-support-rule.md
|
||||
- linkml-yaml-best-practices-rule.md
|
||||
- mapping-specificity-hypernym-rule.md (updated)
|
||||
- wikidata-mapping-verification-rule.md
|
||||
- wikidata-mapping-discovery-rule.md (updated)
|
||||
Remaining Work
|
||||
- Continue processing remaining class files in /Users/kempersc/apps/glam/schemas/20251121/linkml/modules/classes/
|
||||
- Fix LSP errors in CollectionType.yaml, UniversityArchiveRecordSetTypes.yaml, AccessPolicy.yaml (duplicate keys)
|
||||
Relevant files / directories
|
||||
Directories
|
||||
- Classes: /Users/kempersc/apps/glam/schemas/20251121/linkml/modules/classes/ (~1,378 YAML files)
|
||||
- Rules: /Users/kempersc/apps/glam/.opencode/rules/ (47 rule files)
|
||||
- Ontology data: /Users/kempersc/apps/glam/data/ontology/ (RDF/RDFS/OWL files)
|
||||
Key Processed Class Files
|
||||
- AcademicArchive.yaml
|
||||
- AcademicArchiveRecordSetType.yaml
|
||||
- AcademicArchiveRecordSetTypes.yaml (contains 4 subclasses)
|
||||
- AcademicInstitution.yaml
|
||||
- AcademicProgram.yaml
|
||||
- Access.yaml through AcquisitionMethod.yaml (20 Access/Accr/Acq files)
|
||||
Rules Created/Updated This Session
|
||||
- class-description-quality-rule.md
|
||||
- class-multilingual-support-rule.md
|
||||
- linkml-yaml-best-practices-rule.md
|
||||
- mapping-specificity-hypernym-rule.md
|
||||
- wikidata-mapping-verification-rule.md
|
||||
- wikidata-mapping-discovery-rule.md
|
||||
Files with LSP Errors (need fixing)
|
||||
- CollectionType.yaml (line 81 - duplicate key)
|
||||
- UniversityArchiveRecordSetTypes.yaml (lines 51, 86, 119 - duplicate keys)
|
||||
- AccessPolicy.yaml (lines 129, 133, 184 - duplicate require: keys)
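
Duplicate keys like the ones reported above can also be located without an editor. The following is a simplified sketch using only the standard library; `find_duplicate_keys` is a hypothetical helper, not an existing tool. It assumes block-style YAML with space indentation and ignores flow style, quoted keys, anchors, and text inside multi-line scalars, so it may misreport on files that use those features.

```python
import re

# Matches an optional "- " list marker followed by a plain key and a colon.
KEY_RE = re.compile(r"^(\s*)(- )?([A-Za-z_][\w.-]*):(\s|$)")


def find_duplicate_keys(text):
    """Return (line_number, key) pairs for keys repeated within one mapping."""
    duplicates = []
    stack = []  # each entry: (indent, set of keys seen in that mapping)
    for number, line in enumerate(text.splitlines(), start=1):
        match = KEY_RE.match(line)
        if not match:
            continue
        indent = len(match.group(1))
        is_list_item = match.group(2) is not None
        if is_list_item:
            indent += 2  # the key starts two columns after the dash
        key = match.group(3)
        # Leave mappings that are nested deeper than this key.
        while stack and stack[-1][0] > indent:
            stack.pop()
        if is_list_item and stack and stack[-1][0] == indent:
            stack.pop()  # a new list item starts a fresh mapping
        if not stack or stack[-1][0] < indent:
            stack.append((indent, set()))
        seen = stack[-1][1]
        if key in seen:
            duplicates.append((number, key))
        seen.add(key)
    return duplicates


sample = (
    "slot_usage:\n"
    "  has_label:\n"
    "    required: true\n"
    "    required: false\n"
    "  has_name:\n"
    "    required: false\n"
)
print(find_duplicate_keys(sample))
```

For production use, a full YAML parser configured to reject duplicate keys would be more reliable than this line-based scan.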
|
||||
|
||||
Do not list the class itself in its exact mappings. Avoid reusing terms from the class label in the text under the description header.
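
The rule about class-label terms in descriptions can be checked mechanically. The sketch below is a rough illustration under stated assumptions: `label_terms` and `terms_reused_in_description` are hypothetical helper names, the class name is assumed to be CamelCase, and matching is on exact lowercase words only (no stemming, so "archives" would not match "archive").

```python
import re


def label_terms(class_name):
    """Split a CamelCase class name into lowercase terms."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z][a-z]+|[a-z]+|\d+", class_name)
    return [part.lower() for part in parts]


def terms_reused_in_description(class_name, description):
    """Return the label terms that also occur as words in the description."""
    words = set(re.findall(r"[a-z0-9]+", description.lower()))
    return [term for term in label_terms(class_name) if term in words]


# Two descriptions of the same class: one reuses a label term, one does not.
reusing = "Archive of a higher education institution (university, college, polytechnic)."
independent = (
    "Organizational unit serving as the official custodian for the "
    "documentary heritage of a tertiary educational institution."
)

print(terms_reused_in_description("AcademicArchive", reusing))
print(terms_reused_in_description("AcademicArchive", independent))
```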
|
||||
|
||||
Classes that have 'Archive' in their label but not recordType almost always refer to the archive as an organisation; please emphasize this in their descriptions.
|
||||
|
||||
Never remove structured data that is represented as a string; structure it properly using LinkML conventions and syntax instead. Structured data must stay out of the description, but it should be preserved as structured LinkML data. See https://linkml.io/linkml/
|
||||
REMEMBER THAT MAPPING HALLUCINATED CLASSES, PREDICATES, OR QIDS IS STRICTLY PROHIBITED! Always double-check the LinkML mappings and the mapping categories (https://linkml.io/linkml-model/latest/docs/mappings/) by carefully studying @data/ontology/ and Wikidata (using the Wikidata MCP)! Continue with:
|
||||
|
|
@@ -359,7 +359,7 @@ classes:
|
|||
range: WikidataEnrichment
|
||||
description: Full Wikidata enrichment data
|
||||
ghcid:
|
||||
range: GHCIDBlock
|
||||
range: GhcidBlock
|
||||
description: GHCID generation metadata with history
|
||||
web_claims:
|
||||
range: WebClaimsBlock
|
||||
|
|
@@ -1174,7 +1174,7 @@ classes:
|
|||
# GHCID BLOCK - Heritage Custodian ID with history
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
GHCIDBlock:
|
||||
GhcidBlock:
|
||||
description: GHCID generation metadata and history
|
||||
attributes:
|
||||
ghcid_current:
|
||||
|
|
@@ -1203,7 +1203,7 @@ classes:
|
|||
range: datetime
|
||||
description: When GHCID was generated
|
||||
ghcid_history:
|
||||
range: GHCIDHistoryEntry
|
||||
range: GhcidHistoryEntry
|
||||
multivalued: true
|
||||
inlined_as_list: true
|
||||
description: History of GHCID changes
|
||||
|
|
@@ -1220,7 +1220,7 @@ classes:
|
|||
range: boolean
|
||||
description: Whether a collision was detected and resolved
|
||||
|
||||
GHCIDHistoryEntry:
|
||||
GhcidHistoryEntry:
|
||||
description: Historical GHCID entry with validity period
|
||||
attributes:
|
||||
ghcid:
|
||||
|
|
|
|||
|
|
@@ -1,165 +0,0 @@
|
|||
id: https://nde.nl/ontology/hc/class/BirthPlace
|
||||
name: birth_place_class
|
||||
title: Birth Place Class
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
crm: http://www.cidoc-crm.org/cidoc-crm/
|
||||
gn: http://www.geonames.org/ontology#
|
||||
wdt: http://www.wikidata.org/prop/direct/
|
||||
dcterms: http://purl.org/dc/terms/
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
rdfs: http://www.w3.org/2000/01/rdf-schema#
|
||||
org: http://www.w3.org/ns/org#
|
||||
xsd: http://www.w3.org/2001/XMLSchema#
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../metadata
|
||||
- ../slots/has_coordinates
|
||||
- ../slots/in_country
|
||||
- ../slots/identified_by
|
||||
- ../slots/has_score
|
||||
- ../slots/has_name
|
||||
- ../slots/has_label
|
||||
- ../slots/has_source
|
||||
- ../slots/has_code
|
||||
default_prefix: hc
|
||||
classes:
|
||||
BirthPlace:
|
||||
class_uri: schema:Place
|
||||
description: >-
|
||||
Structured representation of where a person was born with support for historical
|
||||
place names, modern equivalents, and geographic identifiers.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Gestructureerde weergave van waar een persoon geboren is met ondersteuning voor historische
|
||||
plaatsnamen, moderne equivalenten en geografische identificaties.
|
||||
de: >-
|
||||
Strukturierte Darstellung des Geburtsorts einer Person mit Unterstützung für historische
|
||||
Ortsnamen, moderne Entsprechungen und geografische Kennungen.
|
||||
fr: >-
|
||||
Représentation structurée du lieu de naissance d'une personne avec support pour les noms
|
||||
historiques, équivalents modernes et identifiants géographiques.
|
||||
es: >-
|
||||
Representación estructurada de dónde nació una persona con soporte para nombres
|
||||
históricos de lugares, equivalentes modernos e identificadores geográficos.
|
||||
ar: >-
|
||||
تمثيل منظم لمكان ميلاد الشخص مع دعم الأسماء التاريخية
|
||||
للأماكن والمعادلات الحديثة والمعرفات الجغرافية.
|
||||
id: >-
|
||||
Representasi terstruktur tempat seseorang lahir dengan dukungan nama tempat
|
||||
historis, padanan modern, dan pengidentifikasi geografis.
|
||||
zh: >-
|
||||
人员出生地点的结构化表示,支持历史地名、现代等价物和地理标识符。
|
||||
exact_mappings:
|
||||
- schema:Place
|
||||
close_mappings:
|
||||
- crm:E53_Place
|
||||
- gn:Feature
|
||||
slots:
|
||||
- has_label
|
||||
- has_name
|
||||
- in_country
|
||||
- has_code
|
||||
- identified_by
|
||||
- has_coordinates
|
||||
- has_source
|
||||
- has_score
|
||||
slot_usage:
|
||||
has_label:
|
||||
required: true
|
||||
examples:
|
||||
- value: Amsterdam
|
||||
- value: Batavia
|
||||
has_name:
|
||||
required: false
|
||||
examples:
|
||||
- value: Jakarta
|
||||
in_country:
|
||||
required: false
|
||||
pattern: "^[A-Z]{2}$"
|
||||
examples:
|
||||
- value: NL
|
||||
- value: ID
|
||||
has_code:
|
||||
required: false
|
||||
examples:
|
||||
- value: NH
|
||||
- value: 2759794
|
||||
identified_by:
|
||||
range: WikiDataIdentifier
|
||||
required: false
|
||||
examples:
|
||||
- value:
|
||||
has_coordinates:
|
||||
required: false
|
||||
examples:
|
||||
- value: 52.3676,4.9041
|
||||
has_source:
|
||||
required: false
|
||||
examples:
|
||||
- value: born at the family estate in rural Gelderland
|
||||
comments:
|
||||
- Replaces simple birth_place string slot (Rule 53)
|
||||
- Preserves historical place names while linking to modern identifiers
|
||||
- GeoNames ID is authoritative per AGENTS.md
|
||||
see_also:
|
||||
- https://schema.org/birthPlace
|
||||
- https://www.geonames.org/
|
||||
examples:
|
||||
- value:
|
||||
has_label: Amsterdam
|
||||
in_country: NL
|
||||
has_code: NH
|
||||
identified_by:
|
||||
has_coordinates: 52.3676,4.9041
|
||||
description: Dutch city with modern name
|
||||
- value:
|
||||
has_label: Batavia
|
||||
has_name: Jakarta
|
||||
in_country: ID
|
||||
identified_by:
|
||||
description: Historical colonial name with modern equivalent
|
||||
- value:
|
||||
has_label: rural Gelderland
|
||||
in_country: NL
|
||||
has_code: GE
|
||||
has_source: born at the family estate in rural Gelderland
|
||||
description: Imprecise location from source text
|
||||
keywords:
|
||||
- birth
|
||||
- place
|
||||
- location
|
||||
- geographic
|
||||
- historical
|
||||
annotations:
|
||||
specificity_score: 0.45
|
||||
specificity_rationale: Relevant for person research across heritage sectors.
|
||||
custodian_types: "['*']"
|
||||
structured_aliases:
|
||||
- literal_form: geboorteplaats
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: nl
|
||||
- literal_form: geboren te
|
||||
predicate: RELATED_SYNONYM
|
||||
in_language: nl
|
||||
- literal_form: Geburtsort
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: de
|
||||
- literal_form: lieu de naissance
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: fr
|
||||
- literal_form: lugar de nacimiento
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: es
|
||||
- literal_form: مكان الميلاد
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: ar
|
||||
- literal_form: tempat lahir
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: id
|
||||
- literal_form: 出生地
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: zh
|
||||
|
|
@@ -19,7 +19,7 @@ description: |
|
|||
- provenance: Data tier tracking and source lineage
|
||||
- ghcid: Global Heritage Custodian ID with history
|
||||
- identifiers: ISIL, Wikidata, GHCID variants
|
||||
- enrichments: Google Maps, Wikidata, genealogy archive registries, etc.
|
||||
- enrichments: Google Maps, Wikidata, Genealogiewerkbalk, etc.
|
||||
- web_claims: Extracted claims with XPath provenance
|
||||
- custodian_name: Consensus name determination
|
||||
- location: Normalized geographic data
|
||||
|
|
@@ -42,8 +42,6 @@ default_range: string
|
|||
|
||||
imports:
|
||||
- linkml:types
|
||||
# Slots bundle (required for LinkML JSON Schema generation)
|
||||
- ./modules/slots/SlotsBundle
|
||||
# =============================================================================
|
||||
# ENUMERATIONS (7 enums)
|
||||
# =============================================================================
|
||||
|
|
@@ -59,12 +57,6 @@ imports:
|
|||
# =============================================================================
|
||||
# Root Class
|
||||
- ./modules/classes/CustodianSourceFile
|
||||
# Shared base classes
|
||||
- ./modules/classes/CustodianType
|
||||
- ./modules/classes/DigitalPlatformType
|
||||
- ./modules/classes/Claim
|
||||
- ./modules/classes/ReconstructedEntity
|
||||
- ./modules/classes/Entity
|
||||
# Original Entry Domain
|
||||
- ./modules/classes/DuplicateEntry
|
||||
- ./modules/classes/MowInscription
|
||||
|
|
@@ -83,8 +75,8 @@ imports:
|
|||
- ./modules/classes/ProvenanceSources
|
||||
- ./modules/classes/SourceRecord
|
||||
# Identifiers Domain
|
||||
- ./modules/classes/GHCIDBlock
|
||||
- ./modules/classes/GHCIDHistoryEntry
|
||||
- ./modules/classes/GhcidBlock
|
||||
- ./modules/classes/GhcidHistoryEntry
|
||||
- ./modules/classes/Identifier
|
||||
# Location Domain
|
||||
- ./modules/classes/CoordinateProvenance
|
||||
|
|
@@ -101,6 +93,7 @@ imports:
|
|||
- ./modules/classes/GoogleMapsPlaywrightEnrichment
|
||||
- ./modules/classes/GooglePhoto
|
||||
- ./modules/classes/GoogleReview
|
||||
- ./modules/classes/LlmVerification
|
||||
- ./modules/classes/OpeningHours
|
||||
- ./modules/classes/OpeningPeriod
|
||||
- ./modules/classes/PhotoAttribution
|
||||
|
|
@@ -152,6 +145,7 @@ imports:
|
|||
- ./modules/classes/WebEnrichment
|
||||
- ./modules/classes/WebSource
|
||||
# Custodian Name Domain
|
||||
- ./modules/classes/AlternativeName
|
||||
- ./modules/classes/CustodianLegalNameClaim
|
||||
- ./modules/classes/CustodianNameConsensus
|
||||
- ./modules/classes/FormerName
|
||||
|
|
@@ -159,7 +153,7 @@ imports:
|
|||
- ./modules/classes/MergeNote
|
||||
# Dutch Enrichments Domain
|
||||
- ./modules/classes/ArchiveInfo
|
||||
- ./modules/classes/GenealogyArchivesRegistryEnrichment
|
||||
- ./modules/classes/GenealogiewerkbalkEnrichment
|
||||
- ./modules/classes/IsilCodeEntry
|
||||
- ./modules/classes/MunicipalityInfo
|
||||
- ./modules/classes/NanIsilEnrichment
|
||||
|
|
@@ -188,21 +182,22 @@ imports:
|
|||
- ./modules/classes/YoutubeTranscript
|
||||
- ./modules/classes/YoutubeVideo
|
||||
# CH-Annotator Domain
|
||||
- ./modules/classes/AnnotatorAnnotationMetadata
|
||||
- ./modules/classes/AnnotatorAnnotationProvenance
|
||||
- ./modules/classes/AnnotatorBlock
|
||||
- ./modules/classes/AnnotatorEntityClaim
|
||||
- ./modules/classes/AnnotatorEntityClassification
|
||||
- ./modules/classes/AnnotatorIntegrationNote
|
||||
- ./modules/classes/AnnotatorModel
|
||||
- ./modules/classes/AnnotatorProvenance
|
||||
- ./modules/classes/ChAnnotatorAnnotationMetadata
|
||||
- ./modules/classes/ChAnnotatorAnnotationProvenance
|
||||
- ./modules/classes/ChAnnotatorBlock
|
||||
- ./modules/classes/ChAnnotatorEntityClaim
|
||||
- ./modules/classes/ChAnnotatorEntityClassification
|
||||
- ./modules/classes/ChAnnotatorIntegrationNote
|
||||
- ./modules/classes/ChAnnotatorModel
|
||||
- ./modules/classes/ChAnnotatorProvenance
|
||||
- ./modules/classes/ExtractionSourceInfo
|
||||
- ./modules/classes/PatternClassification
|
||||
# Person/Staff Domain
|
||||
- ./modules/classes/CareerEntry
|
||||
- ./modules/classes/CertificationEntry
|
||||
- ./modules/classes/CurrentPosition
|
||||
- ./modules/classes/ExternalSearchMetadata
|
||||
- ./modules/classes/EducationEntry
|
||||
- ./modules/classes/ExaSearchMetadata
|
||||
- ./modules/classes/HeritageExperienceEntry
|
||||
- ./modules/classes/MediaAppearanceEntry
|
||||
- ./modules/classes/PersonProfile
|
||||
|
|
|
|||
|
|
@@ -66,7 +66,7 @@ instances:
|
|||
examples:
|
||||
- name: Private art collector
|
||||
description: Individual maintaining personal art collection
|
||||
GHCID_type: P # Personal collection
|
||||
ghcid_type: P # Personal collection
|
||||
- name: Family archivist
|
||||
description: Individual preserving family papers and photographs
|
||||
- name: Independent researcher
|
||||
|
|
|
|||
File diff suppressed because it is too large
|
|
@@ -1,94 +1,25 @@
|
|||
id: https://nde.nl/ontology/hc/class/APIEndpoint
|
||||
name: APIEndpoint
|
||||
title: API Endpoint Class
|
||||
title: APIEndpoint
|
||||
description: An API endpoint.
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
dcat: http://www.w3.org/ns/dcat#
|
||||
hydra: http://www.w3.org/ns/hydra/core#
|
||||
default_prefix: hc
|
||||
rico: https://www.ica.org/standards/RiC/ontology#
|
||||
wd: http://www.wikidata.org/entity/
|
||||
classes:
|
||||
APIEndpoint:
|
||||
class_uri: schema:EntryPoint
|
||||
description: An API endpoint.
|
||||
slots:
|
||||
- has_url
|
||||
- has_description
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../slots/has_description
|
||||
- ../slots/has_url
|
||||
classes:
|
||||
APIEndpoint:
|
||||
class_uri: schema:EntryPoint
|
||||
description: >-
|
||||
Uniform Resource Locator that provides programmatic access to a service
|
||||
or data resource through a defined interface specification.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Uniform Resource Locator die programmatische toegang biedt tot een service
|
||||
of gegevensbron via een gedefinieerde interfacespecificatie.
|
||||
de: >-
|
||||
Uniform Resource Locator, der programmatischen Zugriff auf einen Dienst
|
||||
oder eine Datenressource uber eine definierte Schnittstellenspezifikation bietet.
|
||||
fr: >-
|
||||
Localisateur uniforme de ressources fournissant un acces programmatique
|
||||
a un service ou a une ressource de donnees via une specification d'interface definie.
|
||||
es: >-
|
||||
Localizador uniforme de recursos que proporciona acceso programatico a un
|
||||
servicio o recurso de datos a traves de una especificacion de interfaz definida.
|
||||
ar: >-
|
||||
محدد موقع الموارد المنتظم الذي يوفر وصولاً برمجياً إلى خدمة أو مورد بيانات
|
||||
من خلال مواصفات واجهة محددة.
|
||||
id: >-
|
||||
Uniform Resource Locator yang menyediakan akses terprogram ke layanan atau
|
||||
sumber daya data melalui spesifikasi antarmuka yang ditentukan.
|
||||
zh: >-
|
||||
通过定义的接口规范提供服务或数据资源的编程访问的统一资源定位符。
|
||||
structured_aliases:
|
||||
- literal_form: API-eindpunt
|
||||
in_language: nl
|
||||
- literal_form: API-Endpunkt
|
||||
in_language: de
|
||||
- literal_form: point de terminaison API
|
||||
in_language: fr
|
||||
- literal_form: punto final API
|
||||
in_language: es
|
||||
- literal_form: نقطة نهاية API
|
||||
in_language: ar
|
||||
- literal_form: titik akhir API
|
||||
in_language: id
|
||||
- literal_form: API端点
|
||||
in_language: zh
|
||||
close_mappings:
|
||||
- hydra:EntryPoint
|
||||
- dcat:DataService
|
||||
broad_mappings:
|
||||
- schema:EntryPoint
|
||||
- skos:Concept
|
||||
slots:
|
||||
- has_url
|
||||
- has_description
|
||||
comments:
|
||||
- Represents a callable URL for an API operation
|
||||
- Part of schema.org for describing web services
|
||||
- Use with APIVersion for versioned endpoints
|
||||
see_also:
|
||||
- https://schema.org/EntryPoint
|
||||
- APIVersion
|
||||
- APIRequest
|
||||
examples:
|
||||
- value:
|
||||
has_url: "https://api.example.org/v2/collections"
|
||||
has_description: "Collections listing endpoint"
|
||||
description: REST API endpoint for collections
|
||||
- value:
|
||||
has_url: "https://api.example.org/v2/search"
|
||||
has_description: "Search endpoint with query parameters"
|
||||
description: Search API endpoint
|
||||
keywords:
|
||||
- API
|
||||
- endpoint
|
||||
- URL
|
||||
- web service
|
||||
- REST
|
||||
- programmatic access
|
||||
annotations:
|
||||
specificity_score: "0.3"
|
||||
specificity_rationale: Specific to API/web service endpoints
|
||||
custodian_types: "['*']"
|
||||
|
|
|
|||
|
|
@@ -1,14 +1,14 @@
|
|||
id: https://nde.nl/ontology/hc/class/APIRequest
|
||||
name: APIRequest
|
||||
title: API Request Class
|
||||
title: APIRequest
|
||||
description: An API request event.
|
||||
prefixes:
|
||||
rov: http://www.w3.org/ns/regorg#
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
dcat: http://www.w3.org/ns/dcat#
|
||||
default_prefix: hc
|
||||
rico: https://www.ica.org/standards/RiC/ontology#
|
||||
wd: http://www.wikidata.org/entity/
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../slots/has_endpoint
|
||||
|
|
@@ -16,89 +16,15 @@ imports:
|
|||
- ../slots/has_version
|
||||
classes:
|
||||
APIRequest:
|
||||
class_uri: hc:APIRequest
|
||||
description: >-
|
||||
Single invocation of an API endpoint, capturing the request context,
|
||||
provenance, and version information for audit and debugging purposes.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Enkele aanroeping van een API-eindpunt, waarbij de verzoekcontext,
|
||||
herkomst en versie-informatie worden vastgelegd voor controle en debugging.
|
||||
de: >-
|
||||
Einzelner Aufruf eines API-Endpunkts, der den Anfragekontext,
|
||||
die Provenienz und Versionsinformationen fur Audit und Debugging erfasst.
|
||||
fr: >-
|
||||
Invocation unique d'un point de terminaison API, capturant le contexte
|
||||
de la requete, la provenance et les informations de version pour l'audit
|
||||
et le debogage.
|
||||
es: >-
|
||||
Invocacion unica de un punto final API, capturando el contexto de la solicitud,
|
||||
la procedencia y la informacion de version para auditoria y depuracion.
|
||||
ar: >-
|
||||
استدعاء واحد لنقطة نهاية API، يلتقط سياق الطلب والمصدر ومعلومات الإصدار
|
||||
لأغراض التدقيق وتصحيح الأخطاء.
|
||||
id: >-
|
||||
Pemanggilan tunggal titik akhir API, menangkap konteks permintaan,
|
||||
provenans, dan informasi versi untuk audit dan debugging.
|
||||
zh: >-
|
||||
对API端点的单次调用,捕获请求上下文、来源和版本信息用于审计和调试。
|
||||
structured_aliases:
|
||||
- literal_form: API-verzoek
|
||||
in_language: nl
|
||||
- literal_form: API-Anfrage
|
||||
in_language: de
|
||||
- literal_form: requete API
|
||||
in_language: fr
|
||||
- literal_form: solicitud API
|
||||
in_language: es
|
||||
- literal_form: طلب API
|
||||
in_language: ar
|
||||
- literal_form: permintaan API
|
||||
in_language: id
|
||||
- literal_form: API请求
|
||||
in_language: zh
|
||||
broad_mappings:
|
||||
- prov:Activity
|
||||
- schema:Action
|
||||
class_uri: prov:Activity
|
||||
close_mappings:
|
||||
- dcat:DataService
|
||||
- schema:Action
|
||||
description: An API request event.
|
||||
slots:
|
||||
- has_provenance
|
||||
- has_endpoint
|
||||
- has_version
|
||||
slot_usage:
|
||||
has_endpoint:
|
||||
range: APIEndpoint
|
||||
required: true
|
||||
has_provenance:
|
||||
range: Provenance
|
||||
required: false
|
||||
has_version:
|
||||
range: APIVersion
|
||||
required: false
|
||||
comments:
|
||||
- Captures individual API call events for logging and analysis
|
||||
- Useful for rate limiting, audit trails, and usage analytics
|
||||
- Links to endpoint definition and API version
|
||||
see_also:
|
||||
- APIEndpoint
|
||||
- APIVersion
|
||||
examples:
|
||||
- value:
|
||||
has_endpoint: "https://api.example.org/v2/search"
|
||||
has_provenance:
|
||||
created_by: "https://example.org/users/app123"
|
||||
created_at: "2025-01-14T10:00:00Z"
|
||||
has_version: "v2.1.0"
|
||||
description: Logged search API request with provenance
|
||||
keywords:
|
||||
- API request
|
||||
- API call
|
||||
- invocation
|
||||
- audit
|
||||
- logging
|
||||
- provenance
|
||||
annotations:
|
||||
specificity_score: "0.4"
|
||||
specificity_rationale: Specific to API request event tracking
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
|
|
|
|||
|
|
@@ -1,107 +1,25 @@
|
|||
id: https://nde.nl/ontology/hc/class/APIVersion
|
||||
name: APIVersion
|
||||
title: API Version Class
|
||||
title: APIVersion
|
||||
description: Version of an API.
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
dcat: http://www.w3.org/ns/dcat#
|
||||
dcterms: http://purl.org/dc/terms/
|
||||
default_prefix: hc
|
||||
rico: https://www.ica.org/standards/RiC/ontology#
|
||||
wd: http://www.wikidata.org/entity/
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../slots/identified_by
|
||||
- ../slots/has_label
|
||||
classes:
|
||||
APIVersion:
|
||||
class_uri: hc:APIVersion
|
||||
description: >-
|
||||
Specific release or iteration of an Application Programming Interface,
|
||||
identified by version number and potentially associated with changelogs
|
||||
and deprecation schedules.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Specifieke release of iteratie van een Application Programming Interface,
|
||||
geidentificeerd door versienummer en mogelijk gekoppeld aan changelogs
|
||||
en deprecation-schema's.
|
||||
de: >-
|
||||
Spezifische Veroffentlichung oder Iteration einer Programmierschnittstelle,
|
||||
identifiziert durch Versionsnummer und moglicherweise verknupft mit
|
||||
Anderungsprotokollen und Deprecation-Zeitplanen.
|
||||
fr: >-
|
||||
Version specifique ou iteration d'une interface de programmation d'application,
|
||||
identifiee par numero de version et potentiellement associee aux journaux
|
||||
des modifications et aux calendriers d'obsolescence.
|
||||
es: >-
|
||||
Version especifica o iteracion de una interfaz de programacion de aplicaciones,
|
||||
identificada por numero de version y potencialmente asociada con registros
|
||||
de cambios y calendarios de obsolescencia.
|
||||
ar: >-
|
||||
إصدار محدد أو تكرار لواجهة برمجة التطبيقات، محدد برقم الإصدار وربما
|
||||
مرتبط بسجلات التغيير وجداول الإهمال.
|
||||
id: >-
|
||||
Rilis atau iterasi spesifik dari Antarmuka Pemrograman Aplikasi, diidentifikasi
|
||||
oleh nomor versi dan berpotensi terkait dengan log perubahan dan jadwal penghentian.
|
||||
zh: >-
|
||||
应用程序编程接口的特定版本或迭代,由版本号标识,可能与变更日志和弃用计划相关联。
|
||||
structured_aliases:
|
||||
- literal_form: API-versie
|
||||
in_language: nl
|
||||
- literal_form: API-Version
|
||||
in_language: de
|
||||
- literal_form: version d'API
|
||||
in_language: fr
|
||||
- literal_form: version de API
|
||||
in_language: es
|
||||
- literal_form: إصدار API
|
||||
in_language: ar
|
||||
- literal_form: versi API
|
||||
in_language: id
|
||||
- literal_form: API版本
|
||||
in_language: zh
|
||||
close_mappings:
|
||||
- schema:SoftwareVersion
|
||||
- dcterms:hasVersion
|
||||
broad_mappings:
|
||||
- skos:Concept
|
||||
class_uri: schema:SoftwareApplication
|
||||
description: Version of an API.
|
||||
slots:
|
||||
- has_label
|
||||
- identified_by
|
||||
slot_usage:
|
||||
has_label:
|
||||
pattern: "^v?[0-9]+\\.[0-9]+(\\.[0-9]+)?(-[a-zA-Z0-9]+)?$"
|
||||
examples:
|
||||
- value: "v2.1.0"
|
||||
- value: "1.0.0-beta"
|
||||
- value: "2.0"
|
||||
identified_by:
|
||||
range: string
|
||||
required: true
|
||||
comments:
|
||||
- Follows semantic versioning convention (MAJOR.MINOR.PATCH)
|
||||
- Used to track API compatibility and deprecation
|
||||
- Links to endpoint definitions for versioned access
|
||||
see_also:
|
||||
- APIEndpoint
|
||||
- https://semver.org/
|
||||
examples:
|
||||
- value:
|
||||
has_label: "v2.1.0"
|
||||
identified_by: "v2.1.0"
|
||||
description: Semantic version 2.1.0 of an API
|
||||
- value:
|
||||
has_label: "v3.0.0-beta"
|
||||
identified_by: "v3.0.0-beta"
|
||||
description: Beta release of version 3.0.0
|
||||
keywords:
|
||||
- API version
|
||||
- semantic versioning
|
||||
- release
|
||||
- changelog
|
||||
- deprecation
|
||||
- compatibility
|
||||
annotations:
|
||||
specificity_score: "0.4"
|
||||
specificity_rationale: Specific to API versioning
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
|
|
|
|||
|
|
@@ -15,66 +15,16 @@ prefixes:
|
|||
xsd: http://www.w3.org/2001/XMLSchema#
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../slots/has_name
|
||||
- ../slots/has_type
|
||||
- linkml:types
|
||||
- ../slots/has_name
|
||||
- ../slots/has_type
|
||||
classes:
|
||||
AVEquipment:
|
||||
class_uri: schema:Product
|
||||
description: >-
|
||||
Audiovisual equipment used in heritage contexts for playback, digitization,
|
||||
recording, or presentation of audiovisual materials and collections.
|
||||
alt_descriptions:
|
||||
nl: Audiovisuele apparatuur die in erfgoedcontexten wordt gebruikt voor het afspelen, digitaliseren, opnemen of presenteren van audiovisuele materialen en collecties.
|
||||
de: Audiovisuelle Ausrüstung, die in Heritage-Kontexten für die Wiedergabe, Digitalisierung, Aufnahme oder Präsentation audiovisueller Materialien und Sammlungen verwendet wird.
|
||||
fr: Équipement audiovisuel utilisé dans les contextes patrimoniaux pour la lecture, la numérisation, l'enregistrement ou la présentation de matériaux et collections audiovisuels.
|
||||
es: Equipo audiovisual utilizado en contextos patrimoniales para la reproducción, digitalización, grabación o presentación de materiales y colecciones audiovisuales.
|
||||
ar: معدات سمعية بصرية تُستخدم في السياقات التراثية للتشغيل والرقمنة والتسجيل أو عرض المواد والمجموعات السمعية البصرية.
|
||||
id: Peralatan audiovisual yang digunakan dalam konteks warisan untuk pemutaran, digitalisasi, perekaman, atau presentasi materi dan koleksi audiovisual.
|
||||
zh: 在遗产环境中用于播放、数字化、录制或展示视听材料和藏品的视听设备。
|
||||
broad_mappings:
|
||||
- schema:Product
|
||||
close_mappings:
|
||||
- crm:E19_Physical_Object
|
||||
related_mappings:
|
||||
- prov:Entity
|
||||
description: AV Equipment.
|
||||
slots:
|
||||
- has_name
|
||||
- has_type
|
||||
structured_aliases:
|
||||
- literal_form: AV-apparatuur
|
||||
in_language: nl
|
||||
- literal_form: AV-Ausrüstung
|
||||
in_language: de
|
||||
- literal_form: équipement AV
|
||||
in_language: fr
|
||||
- literal_form: equipo AV
|
||||
in_language: es
|
||||
- literal_form: معدات سمعية بصرية
|
||||
in_language: ar
|
||||
- literal_form: peralatan AV
|
||||
in_language: id
|
||||
- literal_form: 视听设备
|
||||
in_language: zh
|
||||
comments:
|
||||
- Used for playback, digitization, recording, and presentation
|
||||
- Includes projectors, players, recorders, and display equipment
|
||||
keywords:
|
||||
- audiovisual
|
||||
- equipment
|
||||
- playback
|
||||
- digitization
|
||||
- recording
|
||||
- AV hardware
|
||||
examples:
|
||||
- value:
|
||||
has_name: U-matic SP Player
|
||||
has_type: VIDEO_PLAYER
|
||||
description: Video playback equipment
|
||||
- value:
|
||||
has_name: Studer A810
|
||||
has_type: AUDIO_RECORDER
|
||||
description: Professional audio recorder
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
|
|
|
|||
|
|
@@ -1,87 +1,46 @@
|
|||
id: https://nde.nl/ontology/hc/class/AcademicArchive
|
||||
name: AcademicArchive
|
||||
title: Academic Archive
|
||||
title: Academic Archive Type
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
rico: https://www.ica.org/standards/RiC/ontology#
|
||||
wd: http://www.wikidata.org/entity/
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../classes/ArchiveOrganizationType
|
||||
- ../classes/WikidataAlignment
|
||||
- ../slots/has_hypernym
|
||||
- ../slots/has_label
|
||||
- ../slots/has_score
|
||||
- ../slots/has_type
|
||||
- ../slots/hold_record_set
|
||||
- ../slots/identified_by
|
||||
- ../slots/related_to
|
||||
- linkml:types
|
||||
- ../slots/has_hypernym
|
||||
- ../slots/identified_by
|
||||
- ../slots/has_label
|
||||
- ../slots/has_scope
|
||||
- ../slots/has_score
|
||||
- ../slots/has_type
|
||||
- ../slots/hold_record_set
|
||||
- ../slots/related_to
|
||||
classes:
|
||||
AcademicArchive:
|
||||
description: >-
|
||||
Organizational unit serving as the official custodian for the documentary heritage of a tertiary educational institution.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Organisatorische eenheid die fungeert als de officiële bewaarder van het documentair erfgoed van een instelling voor hoger onderwijs.
|
||||
de: >-
|
||||
Organisatorische Einheit, die als offizieller Verwahrer des dokumentarischen Erbes einer Hochschuleinrichtung dient.
|
||||
fr: >-
|
||||
Unité organisationnelle agissant en tant que dépositaire officiel du patrimoine documentaire d'un établissement d'enseignement supérieur.
|
||||
es: >-
|
||||
Unidad organizativa que sirve como depositario oficial del patrimonio documental de una institución de educación superior.
|
||||
ar: >-
|
||||
وحدة تنظيمية تعمل كحارس رسمي للتراث الوثائقي لمؤسسة التعليم العالي.
|
||||
id: >-
|
||||
Unit organisasi yang berfungsi sebagai penjaga resmi warisan dokumenter institusi pendidikan tinggi.
|
||||
zh: >-
|
||||
作为高等教育机构文献遗产官方保管者的组织单位。
|
||||
examples:
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: University Archives
|
||||
has_hypernym: wd:Q166118
|
||||
hold_record_set:
|
||||
- hc:UniversityAdministrativeFonds
|
||||
- hc:FacultyPaperCollection
|
||||
description: A university archives institution holding administrative records and faculty papers
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: College Archives
|
||||
has_hypernym: wd:Q166118
|
||||
hold_record_set:
|
||||
- hc:AcademicStudentRecordSeries
|
||||
description: A college archives institution preserving student records
|
||||
is_a: ArchiveOrganizationType
|
||||
class_uri: schema:ArchiveOrganization
|
||||
description: Archive of a higher education institution (university, college, polytechnic).
|
||||
slots:
|
||||
- has_type
|
||||
- has_label
|
||||
- has_hypernym
|
||||
- hold_record_set
|
||||
- has_hypernym
|
||||
- has_label
|
||||
- has_score
|
||||
- related_to
|
||||
structured_aliases:
|
||||
- literal_form: academisch archief
|
||||
in_language: nl
|
||||
- literal_form: Hochschularchiv
|
||||
in_language: de
|
||||
- literal_form: archives academiques
|
||||
in_language: fr
|
||||
- literal_form: archivo academico
|
||||
- literal_form: "archivo acad\xE9mico"
|
||||
in_language: es
|
||||
- literal_form: أرشيف أكاديمي
|
||||
in_language: ar
|
||||
- literal_form: arsip akademik
|
||||
in_language: id
|
||||
- literal_form: 学术档案馆
|
||||
in_language: zh
|
||||
- literal_form: "archives acad\xE9miques"
|
||||
in_language: fr
|
||||
- literal_form: archivio accademico
|
||||
in_language: it
|
||||
- literal_form: arquivo academico
|
||||
- literal_form: academisch archief
|
||||
in_language: nl
|
||||
- literal_form: "arquivo acad\xEAmico"
|
||||
in_language: pt
|
||||
keywords:
|
||||
- administrative records
|
||||
|
|
@ -100,40 +59,43 @@ classes:
|
|||
- campus life documentation
|
||||
slot_usage:
|
||||
hold_record_set:
|
||||
equals_string_in:
|
||||
- "hc:UniversityAdministrativeFonds"
|
||||
- "hc:AcademicStudentRecordSeries"
|
||||
- "hc:FacultyPaperCollection"
|
||||
- "hc:CampusDocumentationCollection"
|
||||
equals_expression: '["hc:UniversityAdministrativeFonds", "hc:StudentRecordSeries", "hc:FacultyPaperCollection", "hc:CampusDocumentationCollection"]
|
||||
'
|
||||
identified_by:
|
||||
pattern: "^Q[0-9]+$"
|
||||
pattern: ^Q[0-9]+$
|
||||
has_type:
|
||||
equals_string: "hc:ArchiveOrganizationType"
|
||||
equals_expression: '["hc:ArchiveOrganizationType"]'
|
||||
related_to:
|
||||
range: WikidataAlignment
|
||||
inlined: true
|
||||
has_hypernym:
|
||||
equals_string: "wd:Q166118"
|
||||
equals_expression: '["wd:Q166118"]'
|
||||
has_label:
|
||||
ifabsent: string(archive)
|
||||
exact_mappings:
|
||||
- wd:Q27032435
|
||||
close_mappings:
|
||||
- rico:CorporateBody
|
||||
- wd:Q27032435
|
||||
- skos:Concept
|
||||
broad_mappings:
|
||||
- wd:Q166118
|
||||
- wd:Q124762372
|
||||
narrow_mappings:
|
||||
- wd:Q2496264
|
||||
related_mappings:
|
||||
- wd:Q1065413
|
||||
comments:
|
||||
- Institutional custodian type for higher education archives
|
||||
- Distinguished from institutional repositories (wd:Q1065413) which manage published scholarly outputs
|
||||
- The actual holdings are represented by AcademicArchiveRecordSetType instances
|
||||
- 'Preserved from prior description: Organizational unit serving as the official custodian for the documentary heritage of a tertiary educational institution. Charged with acquiring, preserving, and providing access to administrative records, faculty papers, student records, and campus documentation. Distinguished from institutional repositories that primarily manage published scholarly outputs.'
|
||||
- Custodian type class for academic/higher education archives
|
||||
- 'Part of dual-class pattern: custodian type + rico:RecordSetType'
|
||||
- Parent institution is typically a university or college
|
||||
- class_uri is schema:ArchiveOrganization - primary semantic meaning
|
||||
- skos:broader relationship to wd:Q166118 (archive) expressed via broad_mappings
|
||||
see_also:
|
||||
- AcademicArchiveRecordSetType
|
||||
- wd:Q2496264
|
||||
- wd:Q124762372
|
||||
- wd:Q1065413
|
||||
- AcademicArchiveRecordSetType
|
||||
annotations:
|
||||
specificity_score: "0.3"
|
||||
specificity_rationale: Specific to higher education archival custodians
|
||||
custodian_types: "['AcademicArchive']"
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
|
|
|
|||
|
|
@ -1,101 +1,62 @@
|
|||
id: https://nde.nl/ontology/hc/class/AcademicArchiveRecordSetType
|
||||
name: AcademicArchiveRecordSetType
|
||||
title: Academic Archive Record Set Type
|
||||
title: AcademicArchive Record Set Type
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
rico: https://www.ica.org/standards/RiC/ontology#
|
||||
wd: http://www.wikidata.org/entity/
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../classes/CollectionType
|
||||
- ../classes/WikidataAlignment
|
||||
- ../slots/has_score
|
||||
- ../slots/has_type
|
||||
- ../slots/related_to
|
||||
- linkml:types
|
||||
- ../slots/has_scope
|
||||
- ../slots/has_score
|
||||
- ../slots/has_type
|
||||
- ../slots/related_to
|
||||
classes:
|
||||
AcademicArchiveRecordSetType:
|
||||
description: >-
|
||||
Category for grouping documentary materials accumulated by tertiary educational institutions during their administrative, academic, and operational activities.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Categorie voor het groeperen van documentair materiaal dat door hogeronderwijsinstellingen is verzameld tijdens hun administratieve, academische en operationele activiteiten.
|
||||
de: >-
|
||||
Kategorie zur Gruppierung von Dokumentenmaterial, das von Hochschulen während ihrer administrativen, akademischen und betrieblichen Aktivitäten angesammelt wurde.
|
||||
fr: >-
|
||||
Catégorie de regroupement des documents accumulés par les établissements d'enseignement supérieur au cours de leurs activités administratives, académiques et opérationnelles.
|
||||
es: >-
|
||||
Categoría para agrupar materiales documentales acumulados por instituciones de educación superior durante sus actividades administrativas, académicas y operativas.
|
||||
ar: >-
|
||||
فئة لتجميع المواد الوثائقية التي جمعتها مؤسسات التعليم العالي خلال أنشطتها الإدارية والأكاديمية والتشغيلية.
|
||||
id: >-
|
||||
Kategori untuk mengelompokkan materi dokumenter yang dikumpulkan oleh institusi pendidikan tinggi selama aktivitas administratif, akademik, dan operasional mereka.
|
||||
zh: >-
|
||||
高等教育机构在行政、学术和运营活动中积累的文献材料的分类类别。
|
||||
examples:
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: University Administrative Records
|
||||
related_to: wd:Q27032435
|
||||
description: Administrative fonds containing governance records, committee minutes, and policy documents
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: Student Records Series
|
||||
related_to: wd:Q27032435
|
||||
description: Enrollment records, academic transcripts, and graduation documentation
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: Faculty Papers Collection
|
||||
related_to: wd:Q27032435
|
||||
description: Research documentation, teaching materials, and correspondence
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: Campus Documentation Collection
|
||||
related_to: wd:Q27032435
|
||||
description: Photographs, university publications, and audiovisual materials
|
||||
description: A rico:RecordSetType for classifying collections of academic and
|
||||
higher education institutional records.
|
||||
is_a: CollectionType
|
||||
class_uri: rico:RecordSetType
|
||||
slots:
|
||||
- has_type
|
||||
- has_score
|
||||
- has_scope
|
||||
- related_to
|
||||
comments:
|
||||
- Record set TYPE classification, not the custodian organization
|
||||
- Collection type class for academic/higher education record sets
|
||||
- Part of dual-class pattern with AcademicArchive (custodian type)
|
||||
- Use AcademicArchive for the archive organization; use this class for collection types
|
||||
- 'Preserved from prior description: Category for grouping documentary materials accumulated by tertiary educational institutions during their administrative, academic, and operational activities. Distinguishes the classification of holdings from the repository organization responsible for their custody.'
|
||||
structured_aliases:
|
||||
- literal_form: academisch archiefbestand
|
||||
in_language: nl
|
||||
- literal_form: Hochschularchivbestand
|
||||
in_language: de
|
||||
- literal_form: fonds d'archives académiques
|
||||
in_language: fr
|
||||
- literal_form: fondo de archivo académico
|
||||
in_language: es
|
||||
- literal_form: مجموعة الأرشيف الأكاديمي
|
||||
in_language: ar
|
||||
- literal_form: koleksi arsip akademik
|
||||
in_language: id
|
||||
- literal_form: 学术档案集
|
||||
in_language: zh
|
||||
- literal_form: fonds d'archives académiques
|
||||
in_language: fr
|
||||
- literal_form: academisch archiefbestand
|
||||
in_language: nl
|
||||
slot_usage:
|
||||
has_type:
|
||||
equals_string: "hc:ArchiveOrganizationType"
|
||||
equals_expression: '["hc:ArchiveOrganizationType"]'
|
||||
related_to:
|
||||
range: WikidataAlignment
|
||||
inlined: true
|
||||
broad_mappings:
|
||||
- rico:RecordSetType
|
||||
close_mappings:
|
||||
exact_mappings:
|
||||
- wd:Q27032435
|
||||
- rico:RecordSetType
|
||||
broad_mappings:
|
||||
- wd:Q27032435
|
||||
close_mappings:
|
||||
- skos:Concept
|
||||
see_also:
|
||||
- AcademicArchive
|
||||
- rico:RecordSetType
|
||||
- UniversityAdministrativeFonds
|
||||
- StudentRecordSeries
|
||||
- FacultyPaperCollection
|
||||
- CampusDocumentationCollection
|
||||
annotations:
|
||||
specificity_score: "0.3"
|
||||
specificity_rationale: Specific to academic/higher education archival collections
|
||||
custodian_types: "['AcademicArchive']"
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: '[''*'']'
|
||||
|
|
|
|||
|
|
@ -6,80 +6,44 @@ prefixes:
|
|||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
crm: http://www.cidoc-crm.org/cidoc-crm/
|
||||
rico: https://www.ica.org/standards/RiC/ontology#
|
||||
rico-rst: https://www.ica.org/standards/RiC/vocabularies/recordSetTypes#
|
||||
wd: http://www.wikidata.org/entity/
|
||||
bf: http://id.loc.gov/ontologies/bibframe/
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- ./AcademicArchiveRecordSetType
|
||||
- linkml:types
|
||||
- ../slots/has_score
|
||||
- ../slots/has_type
|
||||
- ../slots/has_note
|
||||
- ../slots/has_scope
|
||||
- ./AcademicArchiveRecordSetType
|
||||
- linkml:types
|
||||
- ../slots/has_score
|
||||
- ../slots/has_type
|
||||
- ../slots/has_note
|
||||
- ../slots/has_scope
|
||||
classes:
|
||||
UniversityAdministrativeFonds:
|
||||
description: >-
|
||||
Records created or accumulated by a university's central administration in the exercise of governance, policy-making, and operational functions.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Archiefbescheiden gecreëerd of verzameld door de centrale administratie van een universiteit bij de uitoefening van bestuurlijke, beleidsmatige en operationele functies.
|
||||
de: >-
|
||||
Unterlagen, die von der Zentralverwaltung einer Universität bei der Ausübung von Regierungs-, Politik- und Betriebsfunktionen erstellt oder gesammelt wurden.
|
||||
fr: >-
|
||||
Documents créés ou accumulés par l'administration centrale d'une université dans l'exercice de fonctions de gouvernance, d'élaboration de politiques et opérationnelles.
|
||||
es: >-
|
||||
Registros creados o acumulados por la administración central de una universidad en el ejercicio de funciones de gobernanza, formulación de políticas y operaciones.
|
||||
ar: >-
|
||||
سجلات تم إنشاؤها أو تجميعها من قبل الإدارة المركزية للجامعة في ممارسة وظائف الحوكمة وصنع السياسات والعمليات.
|
||||
id: >-
|
||||
Catatan yang dibuat atau dikumpulkan oleh administrasi pusat universitas dalam menjalankan fungsi tata kelola, pembuatan kebijakan, dan operasional.
|
||||
zh: >-
|
||||
大学中央行政管理部门在行使治理、决策和运营职能时创建或积累的记录。
|
||||
examples:
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: Board of Trustees Minutes
|
||||
has_note: Meeting minutes, resolutions, and supporting documents
|
||||
description: Governance records from university board of trustees
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: Faculty Senate Records
|
||||
has_note: Senate minutes, committee reports, policy proposals
|
||||
description: Faculty governance documentation from university senate
|
||||
is_a: AcademicArchiveRecordSetType
|
||||
class_uri: rico:RecordSetType
|
||||
slots:
|
||||
- has_type
|
||||
- has_score
|
||||
- has_note
|
||||
- has_scope
|
||||
description: "A rico:RecordSetType for university administrative records organized\
|
||||
\ as a fonds.\n\n**Definition**:\nRecords created or accumulated by a university's\
|
||||
\ central administration in the \nexercise of governance, policy-making, and\
|
||||
\ operational functions. Organized \naccording to archival principles of provenance\
|
||||
\ (respect des fonds).\n\n**Typical Contents**:\n- Governance records (board\
|
||||
\ minutes, resolutions, bylaws)\n- Committee records (senate, faculty councils,\
|
||||
\ standing committees)\n- Policy records (institutional policies, procedures,\
|
||||
\ guidelines)\n- Strategic planning documents\n- Accreditation and institutional\
|
||||
\ assessment records\n- Executive correspondence\n\n**RiC-O Alignment**:\nThis\
|
||||
\ class is a specialized rico:RecordSetType. Records classified with this\n\
|
||||
type follow the fonds organizational principle as defined by rico-rst:Fonds\n\
|
||||
(respect des fonds / provenance-based organization from university central administration).\n"
|
||||
structured_aliases:
|
||||
- literal_form: universiteitsbestuursarchief
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: nl
|
||||
- literal_form: Hochschulverwaltungsbestand
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: de
|
||||
- literal_form: fonds d'administration universitaire
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: fr
|
||||
- literal_form: fondo de administracion universitaria
|
||||
predicate: EXACT_SYNONYM
|
||||
- literal_form: "fondo de administraci\xF3n universitaria"
|
||||
in_language: es
|
||||
- literal_form: أرشيف الإدارة الجامعية
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: ar
|
||||
- literal_form: arsip administrasi universitas
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: id
|
||||
- literal_form: 大学行政档案
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: zh
|
||||
- literal_form: fundo de administracao universitaria
|
||||
predicate: EXACT_SYNONYM
|
||||
- literal_form: fonds d'administration universitaire
|
||||
in_language: fr
|
||||
- literal_form: universiteitsbestuursarchief
|
||||
in_language: nl
|
||||
- literal_form: "fundo de administra\xE7\xE3o universit\xE1ria"
|
||||
in_language: pt
|
||||
keywords:
|
||||
- governance records
|
||||
|
|
@ -92,90 +56,76 @@ classes:
|
|||
- accreditation records
|
||||
- executive correspondence
|
||||
- institutional bylaws
|
||||
- resolutions
|
||||
- procedures
|
||||
- guidelines
|
||||
slot_usage:
|
||||
has_type:
|
||||
equals_string: "hc:ArchiveOrganizationType"
|
||||
broad_mappings:
|
||||
- rico:RecordSetType
|
||||
- skos:Concept
|
||||
- crm:E55_Type
|
||||
related_mappings:
|
||||
- rico-rst:Fonds
|
||||
- wd:Q1643722
|
||||
- rico:RecordSetType
|
||||
- skos:Concept
|
||||
close_mappings:
|
||||
- skos:Concept
|
||||
see_also:
|
||||
- AcademicArchiveRecordSetType
|
||||
- rico:RecordSetType
|
||||
- rico-rst:Fonds
|
||||
comments:
|
||||
- Records follow the fonds organizational principle reflecting provenance from university central administration
|
||||
- Subject to records retention schedules and institutional access policies
|
||||
- "Preserved from prior description: Records created or accumulated by a university's central administration in the exercise of governance, policy-making, and operational functions. Organized according to archival principles of provenance (respect des fonds)."
|
||||
annotations:
|
||||
specificity_score: "0.5"
|
||||
specificity_rationale: Specific to university administrative records
|
||||
custodian_types: "['AcademicArchive']"
|
||||
AcademicStudentRecordSeries:
|
||||
description: >-
|
||||
Records documenting the academic careers and activities of students, typically organized as series within a larger university fonds.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Archiefbescheiden die de academische carrières en activiteiten van studenten documenteren, doorgaans georganiseerd als series binnen een groter universitair fonds.
|
||||
de: >-
|
||||
Unterlagen, die die akademischen Karrieren und Aktivitäten von Studenten dokumentieren, typischerweise als Serie innerhalb eines größeren Universitätsfonds organisiert.
|
||||
fr: >-
|
||||
Documents recensant les carrières et activités académiques des étudiants, généralement organisés en séries au sein d'un fonds universitaire plus vaste.
|
||||
es: >-
|
||||
Registros que documentan las carreras académicas y actividades de los estudiantes, generalmente organizados como series dentro de un fondo universitario más amplio.
|
||||
ar: >-
|
||||
سجلات توثق المسارات الأكاديمية وأنشطة الطلاب، وعادة ما تكون منظمة كسلاسل ضمن صندوق جامعي أكبر.
|
||||
id: >-
|
||||
Catatan yang mendokumentasikan karir dan aktivitas akademik mahasiswa, biasanya diatur sebagai seri dalam dana universitas yang lebih besar.
|
||||
zh: >-
|
||||
记录学生学术生涯和活动的记录,通常作为较大大学档案中的系列组织。
|
||||
examples:
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: Registrar Student Records
|
||||
has_note: Enrollment, transcripts, graduation records with privacy restrictions
|
||||
description: Student academic records series with 75-year retention period
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: Historical Student Records
|
||||
has_note: Pre-1950 student records with fewer access restrictions
|
||||
description: Historical student records open for research access
|
||||
is_a: AcademicArchiveRecordSetType
|
||||
class_uri: rico:RecordSetType
|
||||
slots:
|
||||
- has_type
|
||||
- has_score
|
||||
- organizational_principle
|
||||
- organizational_principle_uri
|
||||
- has_note
|
||||
- has_type
|
||||
- has_scope
|
||||
- has_scope
|
||||
slot_usage:
|
||||
has_type:
|
||||
equals_expression: '["hc:ArchiveOrganizationType"]'
|
||||
has_type:
|
||||
equals_string: UniversityAdministrativeFonds
|
||||
organizational_principle:
|
||||
equals_string: fonds
|
||||
organizational_principle_uri:
|
||||
equals_string: https://www.ica.org/standards/RiC/vocabularies/recordSetTypes#Fonds
|
||||
has_note:
|
||||
equals_string: This RecordSetType classifies record sets following the fonds
|
||||
principle. The fonds structure reflects provenance from university central
|
||||
administration.
|
||||
has_scope:
|
||||
equals_string: '["governance records", "committee records", "policy records",
|
||||
"strategic planning", "accreditation records"]'
|
||||
has_scope:
|
||||
equals_string: '["student records", "faculty papers", "research data"]'
|
||||
annotations:
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: '[''*'']'
|
||||
AcademicStudentRecordSeries:
|
||||
is_a: AcademicArchiveRecordSetType
|
||||
class_uri: rico:RecordSetType
|
||||
description: "A rico:RecordSetType for student records organized as archival series.\n\
|
||||
\n**Definition**:\nRecords documenting the academic careers and activities of\
|
||||
\ students, typically \norganized as series within a larger university fonds.\
|
||||
\ Subject to retention \nschedules and privacy regulations (FERPA in US, GDPR\
|
||||
\ in EU, AVG in NL).\n\n**Typical Contents**:\n- Enrollment and registration\
|
||||
\ records\n- Academic transcripts and grade records\n- Graduation records and\
|
||||
\ diploma registers\n- Disciplinary records\n- Financial aid records\n- Student\
|
||||
\ organization records\n\n**Privacy Considerations**:\nAccess restrictions typically\
|
||||
\ apply due to personally identifiable information.\nHistorical student records\
|
||||
\ (typically 75+ years) may have fewer restrictions.\n\n**RiC-O Alignment**:\n\
|
||||
This class is a specialized rico:RecordSetType. Records classified with this\n\
|
||||
type follow the series organizational principle as defined by rico-rst:Series\n\
|
||||
(organizational level within the university fonds).\n"
|
||||
structured_aliases:
|
||||
- literal_form: studentendossiers
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: nl
|
||||
- literal_form: Studentenaktenserie
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: de
|
||||
- literal_form: serie de dossiers etudiants
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: fr
|
||||
- literal_form: serie de expedientes estudiantiles
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: es
|
||||
- literal_form: سجلات الطلاب
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: ar
|
||||
- literal_form: seri catatan mahasiswa
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: id
|
||||
- literal_form: 学生档案系列
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: zh
|
||||
- literal_form: serie de registros de alunos
|
||||
predicate: EXACT_SYNONYM
|
||||
- literal_form: "s\xE9rie de dossiers \xE9tudiants"
|
||||
in_language: fr
|
||||
- literal_form: studentendossiers
|
||||
in_language: nl
|
||||
- literal_form: "s\xE9rie de registros de alunos"
|
||||
in_language: pt
|
||||
keywords:
|
||||
- enrollment records
|
||||
|
|
@ -188,91 +138,78 @@ classes:
|
|||
- disciplinary records
|
||||
- student organizations
|
||||
- financial aid records
|
||||
slot_usage:
|
||||
has_type:
|
||||
equals_string: "hc:ArchiveOrganizationType"
|
||||
broad_mappings:
|
||||
- rico:RecordSetType
|
||||
- skos:Concept
|
||||
- crm:E55_Type
|
||||
related_mappings:
|
||||
- rico-rst:Series
|
||||
- wd:Q185583
|
||||
- rico:RecordSetType
|
||||
- skos:Concept
|
||||
close_mappings:
|
||||
- skos:Concept
|
||||
see_also:
|
||||
- AcademicArchiveRecordSetType
|
||||
- rico:RecordSetType
|
||||
- rico-rst:Series
|
||||
- UniversityAdministrativeFonds
|
||||
comments:
|
||||
- Records follow the series organizational principle within the university fonds
|
||||
- Access restrictions typically apply for records less than 75 years old
|
||||
- Subject to educational records privacy laws (FERPA, GDPR, AVG)
|
||||
- 'Preserved from prior description: Records documenting the academic careers and activities of students, typically organized as series within a larger university fonds. Subject to retention schedules and privacy regulations (FERPA in US, GDPR in EU, AVG in NL).'
|
||||
annotations:
|
||||
specificity_score: "0.5"
|
||||
specificity_rationale: Specific to student academic records
|
||||
custodian_types: "['AcademicArchive']"
|
||||
FacultyPaperCollection:
|
||||
description: >-
|
||||
Personal papers of faculty members documenting their academic careers, research
|
||||
activities, teaching, and professional service. Typically acquired as donations
|
||||
or bequests, distinct from official university records.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Persoonlijke archieven van hoogleraren die hun academische carrières, onderzoeksactiviteiten, onderwijs en professionele dienst documenteren.
|
||||
de: >-
|
||||
Persönliche Papiere von Fakultätsmitgliedern, die ihre akademischen Karrieren, Forschungsaktivitäten, Lehre und den professionellen Dienst dokumentieren.
|
||||
fr: >-
|
||||
Papiers personnels des membres du corps professoral documentant leurs carrières académiques, activités de recherche, enseignement et service professionnel.
|
||||
es: >-
|
||||
Papeles personales de los miembros de la facultad que documentan sus carreras académicas, actividades de investigación, docencia y servicio profesional.
|
||||
ar: >-
|
||||
أوراق شخصية لأعضاء هيئة التدريس توثق مسيرتهم الأكاديمية وأنشطة البحث والتدريس والخدمة المهنية.
|
||||
id: >-
|
||||
Arsip pribadi anggota fakultas yang mendokumentasikan karir akademik, kegiatan penelitian, pengajaran, dan layanan profesional.
|
||||
zh: >-
|
||||
教职员个人档案,记录其学术生涯、研究活动、教学和专业服务。
|
||||
examples:
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: Professor Smith Papers
|
||||
has_note: Research notes, correspondence, lecture materials, conference papers
|
||||
description: Personal papers of a professor including research documentation
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: Department Chair Archive
|
||||
has_note: Administrative correspondence, committee service records, teaching materials
|
||||
description: Faculty papers from a department chair with administrative responsibilities
|
||||
is_a: AcademicArchiveRecordSetType
|
||||
class_uri: rico:RecordSetType
|
||||
slots:
|
||||
- has_type
|
||||
- has_score
|
||||
- organizational_principle
|
||||
- organizational_principle_uri
|
||||
- has_note
|
||||
- has_note
|
||||
- has_type
|
||||
- has_scope
|
||||
- has_scope
|
||||
slot_usage:
|
||||
has_type:
|
||||
equals_expression: '["hc:ArchiveOrganizationType"]'
|
||||
has_type:
|
||||
equals_string: AcademicStudentRecordSeries
|
||||
organizational_principle:
|
||||
equals_string: series
|
||||
organizational_principle_uri:
|
||||
equals_string: https://www.ica.org/standards/RiC/vocabularies/recordSetTypes#Series
|
||||
has_note:
|
||||
equals_string: This RecordSetType classifies record sets following the series
|
||||
principle. Typically a series within the university administration fonds
|
||||
or registrar's office fonds.
|
||||
has_scope:
|
||||
equals_string: '["enrollment records", "academic transcripts", "graduation
|
||||
records", "disciplinary records", "financial aid records"]'
|
||||
has_scope:
|
||||
equals_string: '["faculty records", "research records", "administrative policy"]'
|
||||
has_note:
|
||||
equals_string: Subject to educational records privacy laws (FERPA, GDPR, AVG). Access
|
||||
restrictions typically apply for records less than 75 years old.
|
||||
FacultyPaperCollection:
|
||||
is_a: AcademicArchiveRecordSetType
|
||||
class_uri: rico:RecordSetType
|
||||
description: "A rico:RecordSetType for faculty papers and personal archives.\n\
|
||||
\n**Definition**:\nPersonal papers of faculty members documenting their academic\
|
||||
\ careers, research \nactivities, teaching, and professional service. These\
|
||||
\ are typically acquired as \ndonations or bequests, distinct from official\
|
||||
\ university records.\n\n**Typical Contents**:\n- Research documentation and\
|
||||
\ notes\n- Correspondence (professional and personal)\n- Lecture notes and course\
|
||||
\ materials\n- Manuscripts and drafts\n- Conference papers and presentations\n\
|
||||
- Professional organization records\n- Photographs and audiovisual materials\n\
|
||||
\n**Provenance**:\nUnlike administrative fonds, faculty papers are personal\
|
||||
\ archives with the \nindividual faculty member as creator/accumulator. The\
|
||||
\ university acquires \ncustody but respects original order where it exists.\n\
|
||||
\n**RiC-O Alignment**:\nThis class is a specialized rico:RecordSetType. Records\
|
||||
\ classified with this\ntype follow the fonds organizational principle as defined\
|
||||
\ by rico-rst:Fonds\n(personal papers fonds with the faculty member as creator/accumulator).\n"
|
||||
structured_aliases:
|
||||
- literal_form: hoogleraarspapieren
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: nl
|
||||
- literal_form: Professorennachlass
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: de
|
||||
- literal_form: fonds de professeurs
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: fr
|
||||
- literal_form: archivo personal de profesores
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: es
|
||||
- literal_form: أوراق هيئة التدريس
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: ar
|
||||
- literal_form: arsip fakultas
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: id
|
||||
- literal_form: 教职员档案
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: zh
|
||||
- literal_form: fonds de professeurs
|
||||
in_language: fr
|
||||
- literal_form: hoogleraarspapieren
|
||||
in_language: nl
|
||||
- literal_form: arquivo pessoal de docentes
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: pt
|
||||
keywords:
|
||||
- personal papers
|
||||
|
|
@ -283,136 +220,129 @@ classes:
|
|||
- conference papers
|
||||
- professional papers
|
||||
- academic papers
|
||||
- correspondence
|
||||
- manuscripts
|
||||
- drafts
|
||||
- professional organization records
|
||||
- photographs
|
||||
- audiovisual materials
|
||||
slot_usage:
|
||||
has_type:
|
||||
equals_string: "hc:ArchiveOrganizationType"
|
||||
broad_mappings:
|
||||
- rico:RecordSetType
|
||||
- skos:Concept
|
||||
- crm:E55_Type
|
||||
related_mappings:
|
||||
- rico-rst:Fonds
|
||||
- wd:Q22075301
|
||||
- rico:RecordSetType
|
||||
- skos:Concept
|
||||
close_mappings:
|
||||
- skos:Concept
|
||||
- bf:Archival
|
||||
see_also:
|
||||
- AcademicArchiveRecordSetType
|
||||
- rico:RecordSetType
|
||||
- rico-rst:Fonds
|
||||
comments:
|
||||
- Personal archives with individual faculty member as creator/accumulator
|
||||
- Typically acquired through donation or bequest with possible donor restrictions
|
||||
- Respects original order where it exists
|
||||
annotations:
|
||||
specificity_score: "0.5"
|
||||
specificity_rationale: Specific to faculty personal papers
|
||||
custodian_types: "['AcademicArchive']"
|
||||
acquisition_note: "Typically acquired through donation or bequest. May include restrictions on access or publication specified by donor agreement."
|
||||
CampusDocumentationCollection:
|
||||
description: >-
|
||||
Materials documenting campus life, institutional identity, and university culture beyond formal administrative records.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Materialen die campusleven, institutionele identiteit en universiteitscultuur documenteren buiten formele administratieve archieven om.
|
||||
de: >-
|
||||
Materialien, die das Campusleben, die institutionelle Identität und die Universitätskultur jenseits formaler Verwaltungsunterlagen dokumentieren.
|
||||
fr: >-
|
||||
Documents recensant la vie du campus, l'identité institutionnelle et la culture universitaire au-delà des archives administratives formelles.
|
||||
es: >-
|
||||
Materiales que documentan la vida del campus, la identidad institucional y la cultura universitaria más allá de los registros administrativos formales.
|
||||
ar: >-
|
||||
مواد توثق حياة الحرم الجامعي والهوية المؤسسية وثقافة الجامعة خارج السجلات الإدارية الرسمية.
|
||||
id: >-
|
||||
Materi yang mendokumentasikan kehidupan kampus, identitas institusional, dan budaya universitas di luar catatan administratif formal.
|
||||
zh: >-
|
||||
记录校园生活、机构身份和大学文化的材料,超越正式行政记录。
|
||||
examples:
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: Campus Photograph Collection
|
||||
has_note: Historical photographs, yearbooks, event programs, oral histories
|
||||
description: Campus life documentation including photographs and publications
|
||||
- value:
|
||||
has_type: hc:ArchiveOrganizationType
|
||||
has_label: Student Newspaper Archive
|
||||
has_note: Student newspapers, magazines, ephemera, memorabilia
|
||||
description: Student publication documentation with campus culture materials
|
||||
is_a: AcademicArchiveRecordSetType
|
||||
class_uri: rico:RecordSetType
|
||||
acquisition_note: Typically acquired through donation or bequest. May include
|
||||
restrictions on access or publication specified by donor agreement.
|
||||
slots:
|
||||
- has_type
|
||||
- has_score
|
||||
- organizational_principle
|
||||
- organizational_principle_uri
|
||||
- has_note
|
||||
- has_type
|
||||
- has_scope
|
||||
- has_scope
|
||||
slot_usage:
|
||||
has_type:
|
||||
equals_expression: '["hc:ArchiveOrganizationType", "hc:LibraryType"]'
|
||||
has_type:
|
||||
equals_string: FacultyPaperCollection
|
||||
organizational_principle:
|
||||
equals_string: fonds
|
||||
organizational_principle_uri:
|
||||
equals_string: https://www.ica.org/standards/RiC/vocabularies/recordSetTypes#Fonds
|
||||
has_note:
|
||||
equals_string: This RecordSetType classifies record sets following the fonds
|
||||
principle. Personal archives with individual faculty member as creator/accumulator.
|
||||
has_scope:
|
||||
equals_string: '["research documentation", "correspondence", "lecture notes",
|
||||
"manuscripts", "conference papers"]'
|
||||
has_scope:
|
||||
equals_string: '["official university records", "student records", "administrative
|
||||
files"]'
|
||||
CampusDocumentationCollection:
|
||||
is_a: AcademicArchiveRecordSetType
|
||||
class_uri: rico:RecordSetType
|
||||
description: "A rico:RecordSetType for campus life and institutional documentation.\n\
|
||||
\n**Definition**:\nMaterials documenting campus life, institutional identity,\
|
||||
\ and university \nculture beyond formal administrative records. Often includes\
|
||||
\ visual materials, \npublications, and ephemera that capture the lived experience\
|
||||
\ of the institution.\n\n**Typical Contents**:\n- Campus photographs and audiovisual\
|
||||
\ materials\n- University publications (yearbooks, newspapers, magazines)\n\
|
||||
- Ephemera (programs, posters, invitations)\n- Memorabilia and artifacts\n-\
|
||||
\ Oral histories\n- Event documentation\n- Building and facilities documentation\n\
|
||||
\n**Collection Nature**:\nMay be assembled collections (artificial) rather than\
|
||||
\ strictly provenance-based,\nespecially for ephemera and visual materials.\
|
||||
\ Documentation value often takes\nprecedence over strict archival arrangement.\n\
|
||||
\n**RiC-O Alignment**:\nThis class is a specialized rico:RecordSetType. Records\
|
||||
\ classified with this\ntype follow the collection organizational principle\
|
||||
\ as defined by rico-rst:Collection\n(assembled/artificial collection organized\
|
||||
\ by subject or documentation purpose).\n"
|
||||
structured_aliases:
|
||||
- literal_form: campusdocumentatiecollectie
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: nl
|
||||
- literal_form: Campus-Dokumentationssammlung
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: de
|
||||
- literal_form: collection de documentation du campus
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: fr
|
||||
- literal_form: coleccion de documentacion del campus
|
||||
predicate: EXACT_SYNONYM
|
||||
- literal_form: "colecci\xF3n de documentaci\xF3n del campus"
|
||||
in_language: es
|
||||
- literal_form: توثيق الحرم الجامعي
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: ar
|
||||
- literal_form: koleksi dokumentasi kampus
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: id
|
||||
- literal_form: 校园文献集
|
||||
predicate: EXACT_SYNONYM
|
||||
in_language: zh
|
||||
- literal_form: colecao de documentacao do campus
|
||||
predicate: EXACT_SYNONYM
|
||||
- literal_form: collection de documentation du campus
|
||||
in_language: fr
|
||||
- literal_form: campusdocumentatiecollectie
|
||||
in_language: nl
|
||||
- literal_form: "cole\xE7\xE3o de documenta\xE7\xE3o do campus"
|
||||
in_language: pt
|
||||
keywords:
|
||||
- campus photographs
|
||||
- audiovisual materials
|
||||
- university publications
|
||||
- student newspapers
|
||||
- yearbooks
|
||||
- magazines
|
||||
- oral histories
|
||||
- event documentation
|
||||
- building documentation
|
||||
- campus life
|
||||
- ephemera
|
||||
- programs
|
||||
- posters
|
||||
- memorabilia
|
||||
- artifacts
|
||||
slot_usage:
|
||||
has_type:
|
||||
equals_string: "hc:ArchiveOrganizationType"
|
||||
broad_mappings:
|
||||
- rico:RecordSetType
|
||||
- skos:Concept
|
||||
- crm:E55_Type
|
||||
related_mappings:
|
||||
- rico-rst:Collection
|
||||
- wd:Q9388534
|
||||
- rico:RecordSetType
|
||||
- skos:Concept
|
||||
close_mappings:
|
||||
- skos:Concept
|
||||
- schema:Collection
|
||||
see_also:
|
||||
- AcademicArchiveRecordSetType
|
||||
- rico:RecordSetType
|
||||
- rico-rst:Collection
|
||||
comments:
|
||||
- Often includes assembled/artificial collections organized by subject or documentation purpose
|
||||
- May prioritize documentation value over strict archival arrangement
|
||||
- Can include ephemera, memorabilia, and visual materials
|
||||
- 'Preserved from prior description: Materials documenting campus life, institutional identity, and university culture beyond formal administrative records. Often includes visual materials, publications, and ephemera capturing the lived experience of the institution.'
|
||||
annotations:
|
||||
specificity_score: "0.5"
|
||||
specificity_rationale: Specific to campus documentation materials
|
||||
custodian_types: "['AcademicArchive']"
|
||||
collection_nature_note: "Often includes artificial/assembled collections organized by subject, format, or documentation purpose rather than strict provenance."
|
||||
collection_nature_note: Often includes artificial/assembled collections organized
|
||||
by subject, format, or documentation purpose rather than strict provenance.
|
||||
slots:
|
||||
- has_type
|
||||
- has_score
|
||||
- organizational_principle
|
||||
- organizational_principle_uri
|
||||
- has_note
|
||||
- has_type
|
||||
- has_scope
|
||||
- has_scope
|
||||
slot_usage:
|
||||
has_type:
|
||||
equals_expression: '["hc:ArchiveOrganizationType", "hc:LibraryType", "hc:MuseumType"]'
|
||||
has_type:
|
||||
equals_string: CampusDocumentationCollection
|
||||
organizational_principle:
|
||||
equals_string: collection
|
||||
organizational_principle_uri:
|
||||
equals_string: https://www.ica.org/standards/RiC/vocabularies/recordSetTypes#Collection
|
||||
has_note:
|
||||
equals_string: This RecordSetType classifies record sets following the collection
|
||||
principle. May be assembled collection (artificial) organized by subject
|
||||
or documentation purpose.
|
||||
has_scope:
|
||||
equals_string: '["photographs", "audiovisual materials", "publications", "ephemera",
|
||||
"oral histories", "memorabilia"]'
|
||||
has_scope:
|
||||
equals_string: '["administrative records", "student records", "faculty papers"]'
|
||||
|
|
|
|||
|
|
@@ -1,103 +1,22 @@
|
|||
id: https://nde.nl/ontology/hc/class/AcademicInstitution
|
||||
name: AcademicInstitution
|
||||
title: Academic Institution
|
||||
title: AcademicInstitution
|
||||
description: An institution of higher education or research.
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
wd: http://www.wikidata.org/entity/
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../slots/has_label
|
||||
classes:
|
||||
AcademicInstitution:
|
||||
description: >-
|
||||
Organization providing post-secondary education or conducting advanced research.
|
||||
Includes universities, colleges, polytechnics, institutes of technology, research
|
||||
institutes, and other tertiary educational entities.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Organisatie die hoger onderwijs biedt of gevorderd onderzoek uitvoert. Omvat
|
||||
universiteiten, hogescholen, polytechnische instituten, technologische instituten,
|
||||
onderzoeksinstituten en andere tertiaire onderwijsinstellingen.
|
||||
de: >-
|
||||
Organisation, die Hochschulbildung anbietet oder fortgeschrittene Forschung
|
||||
betreibt. Umfasst Universitaeten, Hochschulen, Polytechnische Institute,
|
||||
Technologieinstitute, Forschungsinstitute und andere tertiaere Bildungseinrichtungen.
|
||||
fr: >-
|
||||
Organisation dispensant un enseignement superieur ou menant des recherches
|
||||
avancees. Comprend les universites, les colleges, les instituts polytechniques,
|
||||
les instituts de technologie, les instituts de recherche et autres etablissements
|
||||
d'enseignement tertiaire.
|
||||
es: >-
|
||||
Organizacion que proporciona educacion postsecundaria o realiza investigacion
|
||||
avanzada. Incluye universidades, colegios, institutos politecnicos, institutos
|
||||
de tecnologia, institutos de investigacion y otras entidades educativas terciarias.
|
||||
ar: >-
|
||||
منظمة توفر تعليمًا ما بعد الثانوي أو تجري أبحاثًا متقدمة. تشمل الجامعات
|
||||
والكليات والمعاهد الفنية ومعاهد التكنولوجيا ومعاهد البحث وغيرها من
|
||||
المؤسسات التعليمية العليا.
|
||||
id: >-
|
||||
Organisasi yang menyediakan pendidikan pasca-menengah atau melakukan penelitian
|
||||
lanjutan. Termasuk universitas, perguruan tinggi, politeknik, institut teknologi,
|
||||
institut penelitian, dan entitas pendidikan tersier lainnya.
|
||||
zh: >-
|
||||
提供高等教育或进行高级研究的组织。包括大学、学院、理工学院、技术学院、
|
||||
研究所和其他高等教育机构。
|
||||
examples:
|
||||
- value:
|
||||
has_label: University of Amsterdam
|
||||
description: A research university in the Netherlands
|
||||
- value:
|
||||
has_label: Technical College of Berlin
|
||||
description: A technical higher education institution in Germany
|
||||
class_uri: schema:EducationalOrganization
|
||||
description: Academic institution.
|
||||
slots:
|
||||
- has_label
|
||||
structured_aliases:
|
||||
- literal_form: onderwijsinstelling
|
||||
in_language: nl
|
||||
- literal_form: Bildungseinrichtung
|
||||
in_language: de
|
||||
- literal_form: etablissement d'enseignement
|
||||
in_language: fr
|
||||
- literal_form: institucion educativa
|
||||
in_language: es
|
||||
- literal_form: مؤسسة تعليمية
|
||||
in_language: ar
|
||||
- literal_form: institusi pendidikan
|
||||
in_language: id
|
||||
- literal_form: 教育机构
|
||||
in_language: zh
|
||||
keywords:
|
||||
- university
|
||||
- college
|
||||
- polytechnic
|
||||
- institute of technology
|
||||
- research institute
|
||||
- higher education
|
||||
- tertiary education
|
||||
- academic
|
||||
close_mappings:
|
||||
- wd:Q4671277
|
||||
- wd:Q38723
|
||||
broad_mappings:
|
||||
- wd:Q2385804
|
||||
- schema:EducationalOrganization
|
||||
- skos:Concept
|
||||
narrow_mappings:
|
||||
- wd:Q3918
|
||||
- schema:CollegeOrUniversity
|
||||
comments:
|
||||
- Encompasses both degree-granting institutions and research-focused organizations
|
||||
- Distinct from primary and secondary educational institutions
|
||||
- May include specialized academies, conservatories, and professional schools
|
||||
see_also:
|
||||
- AcademicArchive
|
||||
- AcademicProgram
|
||||
annotations:
|
||||
specificity_score: "0.4"
|
||||
specificity_rationale: Specific to tertiary education and research organizations
|
||||
custodian_types: "['AcademicArchive']"
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
|
|
|
|||
|
|
@@ -1,92 +1,22 @@
|
|||
id: https://nde.nl/ontology/hc/class/AcademicProgram
|
||||
name: AcademicProgram
|
||||
title: Academic Program
|
||||
title: AcademicProgram
|
||||
description: An educational or research program offered by an academic institution.
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
wd: http://www.wikidata.org/entity/
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../slots/has_label
|
||||
- linkml:types
|
||||
- ../slots/has_label
|
||||
classes:
|
||||
AcademicProgram:
|
||||
description: >-
|
||||
Course of study or research offered by a tertiary educational institution, leading to a credential such as a degree, diploma, or certificate.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Gestructureerd studieprogramma of onderzoeksprogramma aangeboden door een instelling voor hoger onderwijs, leidend tot een kwalificatie zoals een graad, diploma of certificaat.
|
||||
de: >-
|
||||
Strukturierter Studiengang oder Forschungsprogramm, das von einer Hochschuleinrichtung angeboten wird und zu einem Abschluss wie einem Grad, Diplom oder Zertifikat fuehrt.
|
||||
fr: >-
|
||||
Programme d'etudes ou de recherche structure propose par un etablissement d'enseignement superieur, conduisant a une qualification telle qu'un diplome, un certificat ou une attestation.
|
||||
es: >-
|
||||
Programa de estudios o investigacion estructurado ofrecido por una institucion de educacion superior, que conduce a una credencial como un titulo, diploma o certificado.
|
||||
ar: >-
|
||||
برنامج دراسي أو بحثي منظم تقدمه مؤسسة تعليم عالي، يؤدي إلى مؤهل مثل درجة علمية أو دبلوم أو شهادة.
|
||||
id: >-
|
||||
Program studi atau penelitian terstruktur yang ditawarkan oleh institusi pendidikan tinggi, yang mengarah ke kredensial seperti gelar, diploma, atau sertifikat.
|
||||
zh: >-
|
||||
高等教育机构提供的结构化学习或研究课程,可获学位、文凭或证书等资格。
|
||||
examples:
|
||||
- value:
|
||||
has_label: Bachelor of Computer Science
|
||||
description: Undergraduate degree program in computer science
|
||||
- value:
|
||||
has_label: Master of Arts in History
|
||||
description: Graduate degree program in historical studies
|
||||
- value:
|
||||
has_label: PhD Program in Molecular Biology
|
||||
description: Doctoral research program in molecular biology
|
||||
- value:
|
||||
has_label: Professional Certificate in Digital Archiving
|
||||
description: Non-degree professional development program
|
||||
class_uri: schema:EducationalOccupationalProgram
|
||||
description: Academic program.
|
||||
slots:
|
||||
- has_label
|
||||
structured_aliases:
|
||||
- literal_form: studieprogramma
|
||||
in_language: nl
|
||||
- literal_form: Studiengang
|
||||
in_language: de
|
||||
- literal_form: programme d'etudes
|
||||
in_language: fr
|
||||
- literal_form: programa de estudios
|
||||
in_language: es
|
||||
- literal_form: برنامج أكاديمي
|
||||
in_language: ar
|
||||
- literal_form: program akademik
|
||||
in_language: id
|
||||
- literal_form: 学术项目
|
||||
in_language: zh
|
||||
keywords:
|
||||
- degree program
|
||||
- diploma program
|
||||
- certificate program
|
||||
- undergraduate
|
||||
- graduate
|
||||
- doctoral
|
||||
- course of study
|
||||
- curriculum
|
||||
- major
|
||||
- specialization
|
||||
- research program
|
||||
exact_mappings:
|
||||
- schema:EducationalOccupationalProgram
|
||||
close_mappings:
|
||||
- wd:Q600134
|
||||
broad_mappings:
|
||||
- skos:Concept
|
||||
comments:
|
||||
- Programs may be full-time, part-time, or hybrid in delivery format
|
||||
- May include professional, vocational, or academic orientations
|
||||
- Often organized into semesters, quarters, or modular units
|
||||
- 'Preserved from prior description: Course of study or research offered by a tertiary educational institution, leading to a credential such as a degree, diploma, or certificate. Comprises a defined sequence of learning opportunities with clear requirements, start and end points, and intended outcomes.'
|
||||
see_also:
|
||||
- AcademicInstitution
|
||||
annotations:
|
||||
specificity_score: "0.4"
|
||||
specificity_rationale: Specific to structured academic offerings
|
||||
custodian_types: "['AcademicArchive']"
|
||||
specificity_score: 0.1
|
||||
specificity_rationale: Generic utility class/slot created during migration
|
||||
custodian_types: "['*']"
|
||||
|
|
|
|||
|
|
@@ -1,13 +1,12 @@
|
|||
id: https://nde.nl/ontology/hc/class/Access
|
||||
name: Access
|
||||
title: Access
|
||||
title: Access Class
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
dcterms: http://purl.org/dc/terms/
|
||||
crm: http://www.cidoc-crm.org/cidoc-crm/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
|
|
@@ -20,107 +19,43 @@ imports:
|
|||
- ../slots/temporal_extent
|
||||
classes:
|
||||
Access:
|
||||
description: >-
|
||||
Information describing how heritage collections, services, or facilities
|
||||
may be used or consulted. Captures access types, eligible user categories,
|
||||
conditions, and temporal availability.
|
||||
alt_descriptions:
|
||||
nl: >-
|
||||
Gestructureerde informatie over hoe erfgoedcollecties, diensten of faciliteiten
|
||||
kunnen worden gebruikt of geraadpleegd. Legt toegangstypen, in aanmerking komende
|
||||
gebruikerscategorieen, voorwaarden en tijdelijke beschikbaarheid vast.
|
||||
de: >-
|
||||
Strukturierte Informationen darueber, wie Erbesammlungen, Dienstleistungen oder
|
||||
Einrichtungen genutzt oder konsultiert werden koennen. Erfasst Zugangsarten,
|
||||
berechtigte Benutzerkategorien, Bedingungen und zeitliche Verfuegbarkeit.
|
||||
fr: >-
|
||||
Informations structurees decrivant comment les collections patrimoniales,
|
||||
les services ou les installations peuvent etre utilises ou consultes. Capture
|
||||
les types d'acces, les categories d'utilisateurs eligibles, les conditions
|
||||
et la disponibilite temporelle.
|
||||
es: >-
|
||||
Informacion estructurada que describe como se pueden utilizar o consultar
|
||||
las colecciones patrimoniales, servicios o instalaciones. Captura los tipos
|
||||
de acceso, las categorias de usuarios elegibles, las condiciones y la
|
||||
disponibilidad temporal.
|
||||
ar: >-
|
||||
معلومات منظمة تصف كيف يمكن استخدام أو استشارة مجموعات التراث أو الخدمات
|
||||
أو المرافق. تسجل أنواع الوصول وفئات المستخدمين المؤهلين والشروط
|
||||
والتوفر الزمني.
|
||||
id: >-
|
||||
Informasi terstruktur yang menjelaskan bagaimana koleksi warisan, layanan,
|
||||
atau fasilitas dapat digunakan atau dikonsultasikan. Merekam jenis akses,
|
||||
kategori pengguna yang memenuhi syarat, kondisi, dan ketersediaan temporal.
|
||||
zh: >-
|
||||
描述如何使用或查阅遗产馆藏、服务或设施的结构化信息。记录访问类型、
|
||||
符合条件的用户类别、条件和时间可用性。
|
||||
examples:
|
||||
- value:
|
||||
has_type: PUBLIC
|
||||
has_description: Open to general public during gallery hours
|
||||
has_user_category:
|
||||
- general public
|
||||
description: Museum gallery with unrestricted public access
|
||||
- value:
|
||||
has_type: BY_APPOINTMENT
|
||||
has_user_category:
|
||||
- credentialed researchers
|
||||
- graduate students with faculty sponsor
|
||||
description: Special collections reading room requiring advance booking
|
||||
- value:
|
||||
has_type: ACADEMIC
|
||||
has_description: Open to enrolled students and faculty; public by appointment
|
||||
has_user_category:
|
||||
- enrolled students
|
||||
- faculty
|
||||
- research staff
|
||||
description: University library with priority for academic community
|
||||
- value:
|
||||
has_type: DIGITAL_ONLY
|
||||
has_description: Collection accessible only through online database
|
||||
has_user_category:
|
||||
- anyone with internet access
|
||||
description: Digitized collection available remotely
|
||||
- value:
|
||||
has_type: RESTRICTED
|
||||
has_description: Fragile materials require staff supervision
|
||||
has_user_category:
|
||||
- senior researchers with institutional affiliation
|
||||
description: Conservation-restricted materials with supervised access
|
||||
class_uri: dcterms:RightsStatement
|
||||
description: |
|
||||
Structured access information for heritage collections, services, or facilities.
|
||||
**Purpose**:
|
||||
Replaces simple string descriptions of access conditions with structured
|
||||
data capturing access types, eligible users, conditions, and restrictions.
|
||||
**Key Properties**:
|
||||
- `has_type`: Type of access (PUBLIC, BY_APPOINTMENT, RESTRICTED, etc.)
|
||||
- `has_user_category`: Who can access (public, students, faculty, researchers)
|
||||
- `condition_of_access`: Conditions or requirements for access
|
||||
- `has_description`: Free-text description
|
||||
- `temporal_extent`: When this access policy applies
|
||||
**Access Types**:
|
||||
- PUBLIC: Open to general public
|
||||
- BY_APPOINTMENT: Requires advance appointment
|
||||
- ACADEMIC: Restricted to academic community
|
||||
- RESEARCHER: Restricted to credentialed researchers
|
||||
- MEMBER: Requires membership
|
||||
- RESTRICTED: Limited access with specific conditions
|
||||
- CLOSED: Not currently accessible
|
||||
- DIGITAL_ONLY: Available only in digital form
|
||||
**Ontological Alignment**:
|
||||
- **Primary**: `dcterms:RightsStatement` - Dublin Core rights statement
|
||||
- **Close**: `schema:publicAccess` - Schema.org access indicator
|
||||
- **Related**: `crm:E30_Right` - CIDOC-CRM rights
|
||||
exact_mappings:
|
||||
- dcterms:RightsStatement
|
||||
close_mappings:
|
||||
- schema:publicAccess
|
||||
related_mappings:
|
||||
- crm:E30_Right
|
||||
slots:
|
||||
- has_type
|
||||
- has_user_category
|
||||
- has_description
|
||||
- temporal_extent
|
||||
- has_frequency
|
||||
structured_aliases:
|
||||
- literal_form: toegang
|
||||
in_language: nl
|
||||
- literal_form: Zugang
|
||||
in_language: de
|
||||
- literal_form: acces
|
||||
in_language: fr
|
||||
- literal_form: acceso
|
||||
in_language: es
|
||||
- literal_form: وصول
|
||||
in_language: ar
|
||||
- literal_form: akses
|
||||
in_language: id
|
||||
- literal_form: 访问
|
||||
in_language: zh
|
||||
keywords:
|
||||
- access policy
|
||||
- access rights
|
||||
- opening hours
|
||||
- appointment required
|
||||
- restricted materials
|
||||
- public access
|
||||
- research access
|
||||
- reading room
|
||||
- digital access
|
||||
- physical access
|
||||
- user eligibility
|
||||
- has_type
|
||||
- has_user_category
|
||||
- has_description
|
||||
- temporal_extent
|
||||
- has_frequency
|
||||
slot_usage:
|
||||
has_type:
|
||||
range: AccessTypeEnum
|
||||
|
|
@@ -128,9 +63,9 @@ classes:
|
|||
has_user_category:
|
||||
required: false
|
||||
examples:
|
||||
- value: enrolled students
|
||||
- value: faculty and staff
|
||||
- value: visiting researchers with credentials
|
||||
- value: "enrolled students"
|
||||
- value: "faculty and staff"
|
||||
- value: "visiting researchers with credentials"
|
||||
temporal_extent:
|
||||
required: false
|
||||
range: TimeSpan
|
||||
|
|
@@ -140,55 +75,37 @@ classes:
|
|||
range: Frequency
|
||||
inlined: true
|
||||
examples:
|
||||
- value:
|
||||
has_label: Daily
|
||||
broad_mappings:
|
||||
- dcterms:RightsStatement
|
||||
- skos:Concept
|
||||
close_mappings:
|
||||
- schema:publicAccess
|
||||
related_mappings:
|
||||
- crm:E30_Right
|
||||
comments:
|
||||
- Replaces simple string descriptions of access conditions with structured data
|
||||
- Key slots include has_type (access type), has_user_category (eligible users), has_description (conditions), and temporal_extent (when policy applies)
|
||||
- Common access types include PUBLIC, BY_APPOINTMENT, ACADEMIC, RESEARCHER, MEMBER, RESTRICTED, CLOSED, and DIGITAL_ONLY
|
||||
- Created per slot_fixes.yaml revision for collection_access migration
|
||||
- RULE 53: Part of collection_access to offers_or_offered_access plus Access migration
|
||||
see_also:
|
||||
- AccessTypeEnum
|
||||
- dcterms:accessRights
|
||||
notes:
|
||||
- |
|
||||
Preserved from prior description (commit ae09ff81):
|
||||
|
||||
Preserved from prior description (commit ae09ff81):
|
||||
|
||||
Structured access information for heritage collections, services, or facilities.
|
||||
**Purpose**:
|
||||
Replaces simple string descriptions of access conditions with structured
|
||||
data capturing access types, eligible users, conditions, and restrictions.
|
||||
**Key Properties**:
|
||||
- `has_type`: Type of access (PUBLIC, BY_APPOINTMENT, RESTRICTED, etc.)
|
||||
- `has_user_category`: Who can access (public, students, faculty, researchers)
|
||||
- `condition_of_access`: Conditions or requirements for access
|
||||
- `has_description`: Free-text description
|
||||
- `temporal_extent`: When this access policy applies
|
||||
**Access Types**:
|
||||
- PUBLIC: Open to general public
|
||||
- BY_APPOINTMENT: Requires advance appointment
|
||||
- ACADEMIC: Restricted to academic community
|
||||
- RESEARCHER: Restricted to credentialed researchers
|
||||
- MEMBER: Requires membership
|
||||
- RESTRICTED: Limited access with specific conditions
|
||||
- CLOSED: Not currently accessible
|
||||
- DIGITAL_ONLY: Available only in digital form
|
||||
**Ontological Alignment**:
|
||||
- **Primary**: `dcterms:RightsStatement` - Dublin Core rights statement
|
||||
- **Close**: `schema:publicAccess` - Schema.org access indicator
|
||||
- **Related**: `crm:E30_Right` - CIDOC-CRM rights
|
||||
- value:
|
||||
has_label: "Daily"
|
||||
annotations:
|
||||
specificity_score: "0.5"
|
||||
specificity_rationale: Moderately specific - applies to collection and service access contexts
|
||||
custodian_types: "['*']"
|
||||
custodian_types_rationale: All institution types offer some form of access
|
||||
specificity_score: 0.50
|
||||
specificity_rationale: "Moderately specific - applies to collection and service access contexts"
|
||||
custodian_types: '["*"]'
|
||||
custodian_types_rationale: "All institution types offer some form of access"
|
||||
comments:
|
||||
- "Created per slot_fixes.yaml revision for collection_access migration"
|
||||
- "Replaces string-based collection_access with structured access data"
|
||||
- "RULE 53: Part of collection_access → offers_or_offered_access + Access migration"
|
||||
examples:
|
||||
- value:
|
||||
has_type: PUBLIC
|
||||
has_description: "Open to general public during gallery hours"
|
||||
has_user_category:
|
||||
- "general public"
|
||||
- value:
|
||||
has_type: BY_APPOINTMENT
|
||||
has_user_category:
|
||||
- "credentialed researchers"
|
||||
- "graduate students with faculty sponsor"
|
||||
- value:
|
||||
has_type: ACADEMIC
|
||||
has_description: "Open to enrolled students and faculty; public by appointment"
|
||||
has_user_category:
|
||||
- "enrolled students"
|
||||
- "faculty"
|
||||
- "research staff"
|
||||
- value:
|
||||
has_type: DIGITAL_ONLY
|
||||
has_description: "Collection accessible only through online database"
|
||||
has_user_category:
|
||||
- "anyone with internet access"
|
||||
Some files were not shown because too many files have changed in this diff.
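Review note: several hunks above list the same slot more than once. The `slots` list of `CampusDocumentationCollection` contains `has_type` and `has_scope` twice, and its `slot_usage` mapping repeats the same two keys. Because the added/removed markers are not visible in this view, it is unclear which side of the diff carries the repeats. Some YAML loaders keep only the last value of a repeated mapping key, so a check of the following kind may be useful before merging. This is a minimal sketch using only the Python standard library; the helper name `duplicate_entries` is hypothetical and is not part of the repository or of LinkML, and the two lists are copied by hand from the hunks above rather than parsed from the files.

```python
from collections import Counter


def duplicate_entries(items):
    """Return the entries that occur more than once, in first-seen order."""
    counts = Counter(items)  # Counter keeps insertion order of first occurrence
    return [item for item in counts if counts[item] > 1]


# `slots` list as it appears in the CampusDocumentationCollection hunk.
slots = [
    "has_type",
    "has_score",
    "organizational_principle",
    "organizational_principle_uri",
    "has_note",
    "has_type",
    "has_scope",
    "has_scope",
]

# Keys of the `slot_usage` mapping in the same hunk, in document order.
slot_usage_keys = [
    "has_type",
    "has_type",
    "organizational_principle",
    "organizational_principle_uri",
    "has_note",
    "has_scope",
    "has_scope",
]

duplicate_slots = duplicate_entries(slots)
duplicate_usage_keys = duplicate_entries(slot_usage_keys)
print("duplicate slots:", duplicate_slots)
print("duplicate slot_usage keys:", duplicate_usage_keys)
```

Both calls report `has_type` and `has_scope`. Repeated `slot_usage` keys are the more serious case, since the two `has_scope` entries carry different `equals_string` values and only one of them is likely to survive loading.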