Compare commits
55 commits
b68d580c82...fcb704c97e
| Author | SHA1 | Date | |
|---|---|---|---|
| | fcb704c97e | | |
| | 1516d509cf | | |
| | 7cf10084b4 | | |
| | 2a349b11bb | | |
| | 1f8776bef4 | | |
| | c60b523f29 | | |
| | 68c2274e5f | | |
| | 7b4e113a5a | | |
| | 4a518f587c | | |
| | c1946e93f9 | | |
| | c51b3e1cbf | | |
| | d69227897b | | |
| | f3c0586d09 | | |
| | fa5779bfd4 | | |
| | 7ea7e3d0d7 | | |
| | 7992e8abaa | | |
| | f800e198ff | | |
| | 8c42292235 | | |
| | 09674f7da2 | | |
| | 7f57b3e4b8 | | |
| | b2840d5db4 | | |
| | 80eb3d969c | | |
| | 140ef25b96 | | |
| | b4d1a7677f | | |
| | 3d7c52c1de | | |
| | bdba9de593 | | |
| | 73b2d21bb3 | | |
| | 9342919c79 | | |
| | 4fa0fd572f | | |
| | ec113e8811 | | |
| | fba1ab9353 | | |
| | 48d89206f9 | | |
| | 776462de90 | | |
| | 511fc99847 | | |
| | 9a0e56e23a | | |
| | b61572f08a | | |
| | 4cdf9588b2 | | |
| | 6bb8ac20ba | | |
| | 479ceae715 | | |
| | 46cb4d40fa | | |
| | 849e5354cc | | |
| | 4efaef60e4 | | |
| | 3c9926956e | | |
| | 2a75ddf7cc | | |
| | be18d6761c | | |
| | 821d040b9d | | |
| | 615910055a | | |
| | 2d09776856 | | |
| | 1cd3704762 | | |
| | 4c3978ab2f | | |
| | ba2c766dd0 | | |
| | 367aaffc27 | | |
| | f8205cbc75 | | |
| | 8d9817c99a | | |
| | b32efc208e | | |
7354 changed files with 395029 additions and 168409 deletions
46
.opencode/rules/no-deletion-from-slot-fixes.md
Normal file
@@ -0,0 +1,46 @@
# Rule: Do Not Delete From slot_fixes.yaml

**Identifier**: `no-deletion-from-slot-fixes`
**Severity**: **CRITICAL**

## Core Directive

**NEVER delete entries from `slot_fixes.yaml`.**

The `slot_fixes.yaml` file serves as the historical record and audit trail for all schema migrations. Removing entries destroys this history and violates the project's data integrity principles.

## Workflow

When processing a migration:

1. **Do NOT Remove**: Never delete the entry for the slot you are working on.
2. **Update `processed`**: Instead, update the `processed` block:
    * Set `status: true`.
    * Set `date` to the current date (YYYY-MM-DD).
    * Add a detailed `notes` string explaining what was done (e.g., "Fully migrated to [new_slot] + [Class] (Rule 53). [File].yaml updated. Slot archived.").
3. **Preserve History**: The entry must remain in the file permanently as a record of the migration.

## Rationale

* **Audit Trail**: We need to know what was migrated, when, and how.
* **Reversibility**: If a migration introduces a bug, the record helps us understand the original state.
* **Completeness**: The file tracks the total progress of the schema refactoring project.

## Example

**WRONG (Deletion)**:
```yaml
# DELETED from file
# - original_slot_id: ...
```

**CORRECT (Update)**:
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/has_some_slot
  processed:
    status: true
    date: '2026-01-27'
    notes: Fully migrated to has_or_had_new_slot + NewClass (Rule 53).
  revision:
    ...
```
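The no-deletion directive in this rule can be checked mechanically by comparing the slot IDs before and after an edit. The following is a minimal sketch, assuming the entries have already been parsed into a list of dicts keyed by `original_slot_id`; the helper name `find_deleted_slot_ids` and the `has_other_slot` ID are hypothetical and not part of the repository.

```python
def find_deleted_slot_ids(before, after):
    """Return slot IDs present in `before` but missing from `after`, in order."""
    remaining = {entry["original_slot_id"] for entry in after}
    return [
        entry["original_slot_id"]
        for entry in before
        if entry["original_slot_id"] not in remaining
    ]


# Hypothetical parsed content of slot_fixes.yaml before a migration.
before = [
    {"original_slot_id": "https://nde.nl/ontology/hc/slot/has_some_slot"},
    {"original_slot_id": "https://nde.nl/ontology/hc/slot/has_other_slot"},
]

# CORRECT: the entry stays and gains a `processed` block.
updated = [
    {
        "original_slot_id": "https://nde.nl/ontology/hc/slot/has_some_slot",
        "processed": {"status": True, "date": "2026-01-27", "notes": "Migrated."},
    },
    {"original_slot_id": "https://nde.nl/ontology/hc/slot/has_other_slot"},
]

# WRONG: the migrated entry was dropped from the list.
pruned = before[1:]

violations_after_update = find_deleted_slot_ids(before, updated)
violations_after_prune = find_deleted_slot_ids(before, pruned)
```

A non-empty result means the edit removed history and should be rejected.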
32
.opencode/rules/preserve-bespoke-slots-until-refactoring.md
Normal file
@@ -0,0 +1,32 @@
# Rule: Preserve Bespoke Slots Until Refactoring

**Identifier**: `preserve-bespoke-slots-until-refactoring`
**Severity**: **CRITICAL**

## Core Directive

**DO NOT remove or migrate "additional" bespoke slots during generic migration passes unless they are the specific target of the current task.**

## Context

When migrating a specific slot (e.g., `has_approval_date`), you may encounter other bespoke or legacy slots in the same class file (e.g., `innovation_budget`, `operating_budget`).

**YOU MUST**:
* ✅ Migrate ONLY the specific slot you were instructed to work on.
* ✅ Leave other bespoke slots exactly as they are.
* ✅ Focus strictly on the current migration target.

**YOU MUST NOT**:
* ❌ Proactively migrate "nearby" slots just because they look like they need refactoring.
* ❌ Remove slots that seem unused or redundant without specific instruction.
* ❌ "Clean up" the class file by removing legacy attributes.

## Rationale

Refactoring is a separate, planned phase. Mixing opportunistic refactoring with systematic slot migration increases the risk of regression and makes changes harder to review. "We will refactor those later."

## Workflow

1. **Identify Target**: Identify the specific slot(s) assigned for migration (from `slot_fixes.yaml` or user prompt).
2. **Execute Migration**: Apply changes ONLY for those slots.
3. **Ignore Others**: Do not touch other slots in the file, even if they violate other rules (like Rule 39 or Rule 53). Those will be handled in their own dedicated tasks.
29
.opencode/rules/slot-fixes-authoritative-rule.md
Normal file
@@ -0,0 +1,29 @@
# Rule: Slot Fixes File is Authoritative

**Scope:** Schema Migration / Slot Fixes

**Description:**
The file `/Users/kempersc/apps/glam/data/fixes/slot_fixes.yaml` is the **single authoritative source** for tracking slot migrations and fixes.

**Directives:**
1. **Authoritative Source:** Always read and update `/Users/kempersc/apps/glam/data/fixes/slot_fixes.yaml`. Do NOT use `schemas/.../slot_fixes.yaml` as the master list (though you may need to sync them if they diverge, the `data/fixes` version takes precedence).
2. **Processed Status:** When a slot migration is completed (schema updated, data migrated), you MUST update the entry in `slot_fixes.yaml` with a `processed` block containing:
    * `status: true`
    * `date: 'YYYY-MM-DD'`
    * `notes`: Brief description of what was done.
3. **NEVER DELETE:** You MUST NOT delete entries from `slot_fixes.yaml`. Even if a slot is removed from the schema, the record of its fix MUST remain in this file with `status: true`.
4. **Format Compliance:** New slots added during migration must follow proper LinkML format conventions and use `slot_uri` and mappings (`exact_mappings`, `close_mappings`) that reference **legitimate predicates and classes found in `/Users/kempersc/apps/glam/data/ontology/`**.

**Example of Processed Entry:**
```yaml
- original_slot_id: https://nde.nl/ontology/hc/slot/has_old_slot
  revision:
  - label: has_new_slot
    type: slot
  - label: NewClass
    type: class
  processed:
    status: true
    date: '2026-01-27'
    notes: Migrated to has_new_slot + NewClass. Old slot archived.
```
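The "Processed Status" directive amounts to an in-place update that never shrinks the list. A minimal sketch of that update follows, assuming the YAML has already been loaded into a list of dicts; `mark_processed` is a hypothetical helper, not a function from this repository, and loading or writing the file is left out.

```python
from copy import deepcopy
from datetime import date


def mark_processed(entries, slot_id, notes, on=None):
    """Return a copy of `entries` with the matching entry marked as processed.

    No entry is ever removed; an unknown slot ID raises instead of
    silently doing nothing.
    """
    updated = deepcopy(entries)
    for entry in updated:
        if entry.get("original_slot_id") == slot_id:
            entry["processed"] = {
                "status": True,
                "date": (on or date.today()).isoformat(),  # YYYY-MM-DD
                "notes": notes,
            }
            break
    else:
        raise KeyError(f"no entry for {slot_id}")
    return updated


entries = [
    {
        "original_slot_id": "https://nde.nl/ontology/hc/slot/has_old_slot",
        "revision": [
            {"label": "has_new_slot", "type": "slot"},
            {"label": "NewClass", "type": "class"},
        ],
    },
]

result = mark_processed(
    entries,
    "https://nde.nl/ontology/hc/slot/has_old_slot",
    "Migrated to has_new_slot + NewClass. Old slot archived.",
    on=date(2026, 1, 27),
)
```

Working on a deep copy keeps the original list untouched, so the caller can diff the two versions before writing anything back.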
68
.opencode/rules/verified-ontology-terms.md
Normal file
@@ -0,0 +1,68 @@
# Rule 62: Verified Ontology Terms Reference

🚨 **CRITICAL**: All `class_uri`, `slot_uri`, and mapping properties (`exact_mappings`, `close_mappings`, etc.) MUST use verified classes and predicates that exist in the local ontology files at `data/ontology/`.

## 1. Verified Ontology Files

The following ontologies are locally available in `data/ontology/`. Always verify terms against these specific files. **NO HALLUCINATIONS ALLOWED.**

**Mandatory Verification Step**: Before using any `class_uri`, `slot_uri`, or mapping URI, you MUST `grep` the term in the local ontology file to confirm it exists.

| Prefix | Namespace | Local File | Key Classes/Predicates (Verified) |
|--------|-----------|------------|-----------------------------------|
| `cpov:` | `http://data.europa.eu/m8g/` | `core-public-organisation-ap.ttl` | `PublicOrganisation`, `contactPage`, `email` |
| `crm:` | `http://www.cidoc-crm.org/cidoc-crm/` | `CIDOC_CRM_v7.1.3.rdf` | `E1_CRM_Entity`, `E5_Event`, `P2_has_type` |
| `rico:` | `https://www.ica.org/standards/RiC/ontology#` | `RiC-O_1-1.rdf` | `Record`, `Agent`, `hasOrHadHolder` (Note: Use v1.1 file) |
| `pico:` | `https://personsincontext.org/model#` | `pico.ttl` | `PersonObservation`, `role` |
| `prov:` | `http://www.w3.org/ns/prov#` | `prov.ttl` | `Activity`, `Agent`, `wasGeneratedBy` |
| `skos:` | `http://www.w3.org/2004/02/skos/core#` | `skos.rdf` | `Concept`, `prefLabel`, `broader` |
| `schema:` | `https://schema.org/` | `frontend/public/ontology/schemaorg.owl` | `Organization`, `Place`, `name`, `url` |
| `dcterms:` | `http://purl.org/dc/terms/` | `dublin_core_elements.rdf` | `identifier`, `title`, `description` |
| `org:` | `http://www.w3.org/ns/org#` | `org.rdf` | `Organization`, `hasMember` |
| `tooi:` | `https://identifier.overheid.nl/tooi/def/ont/` | `tooiont.ttl` | `Overheidsorganisatie` |
| `dcat:` | `http://www.w3.org/ns/dcat#` | `dcat3.ttl` | `Dataset`, `Catalog`, `dataset` |
| `gn:` | `https://www.geonames.org/ontology#` | `geonames_ontology.rdf` | `Feature` |
| `dqv:` | `http://www.w3.org/ns/dqv#` | `dqv.ttl` | `QualityMeasurement`, `hasQualityAnnotation` |
| `premis:` | `http://www.loc.gov/premis/rdf/v3/` | `premis3.owl` | `fixity`, `storedAt`, `Event` |

## 2. Verification Procedure (MANDATORY)

**You MUST verify every term.** Do not assume a term exists just because it sounds standard.

```bash
# 1. Identify the source ontology file
ls data/ontology/

# 2. Grep for the specific term (e.g., 'hasFixity')
grep "hasFixity" data/ontology/premis3.owl
# Result: EMPTY -> Term does not exist! DO NOT USE.

# 3. Grep for the correct term (e.g., 'fixity')
grep "fixity" data/ontology/premis3.owl
# Result: <owl:ObjectProperty rdf:about=".../fixity"> -> Term exists. USE THIS.
```

## 3. LinkML Mapping Requirements

Mappings must be precise and verified.

* `exact_mappings` = `skos:exactMatch` (Semantic equivalence)
* `close_mappings` = `skos:closeMatch` (Near equivalence)
* `related_mappings` = `skos:relatedMatch` (Association)
* `broad_mappings` = `skos:broadMatch` (Broader concept)
* `narrow_mappings` = `skos:narrowMatch` (Narrower concept)

## 4. Prohibited/Invalid Terms (Hallucinations)

Do NOT use these commonly hallucinated or incorrect terms. They have been verified as **non-existent** in our local ontologies:

* ❌ `dqv:ConfidenceScore` (Use `dqv:QualityMeasurement`)
* ❌ `premis:hasFixity` (Use `premis:fixity`)
* ❌ `premis:hasFrameRate` (Verify specific PREMIS properties first)
* ❌ `schema:HeritageBuilding` (Use `schema:LandmarksOrHistoricalBuildings`)
* ❌ `rico:has_provenance` (Use `rico:history`)
* ❌ `rico:hasProvenance` (Use `rico:history`)
* ❌ `schema:archive` (Use `schema:archiveHeld` or `schema:archivedAt`)

**Always verify against the local file content.**

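The `grep` procedure in section 2 of this rule can also be scripted. The sketch below is one possible approach, not tooling that exists in the repository: `term_exists` is a hypothetical helper, and the temporary file merely stands in for `data/ontology/premis3.owl` so the example runs anywhere. It matches whole tokens only, which is slightly stricter than a plain substring `grep`.

```python
import re
import tempfile
from pathlib import Path


def term_exists(ontology_file, local_name):
    """Return True if `local_name` occurs as a whole token in the ontology file."""
    # The lookarounds reject partial hits such as 'fixity' inside 'hasFixity'.
    pattern = re.compile(
        rf"(?<![A-Za-z0-9_]){re.escape(local_name)}(?![A-Za-z0-9_])"
    )
    text = Path(ontology_file).read_text(encoding="utf-8")
    return pattern.search(text) is not None


# Stand-in for data/ontology/premis3.owl (assumed content, for illustration only).
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "premis3.owl"
    sample.write_text(
        '<owl:ObjectProperty rdf:about="http://www.loc.gov/premis/rdf/v3/fixity"/>\n',
        encoding="utf-8",
    )
    fixity_found = term_exists(sample, "fixity")
    has_fixity_found = term_exists(sample, "hasFixity")
```

A check like this could run before a mapping is committed, so that a term such as `premis:hasFixity` is caught before it reaches the schema.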
13
AGENTS.md
@@ -4806,3 +4806,16 @@ def test_historical_addition():
 **Schema Version**: v0.2.1 (modular)
 **Last Updated**: 2025-12-08
 **Maintained By**: GLAM Data Extraction Project
+
+### Rule 61: Slot Fixes Authoritative File
+
+🚨 **CRITICAL**: The file `/Users/kempersc/apps/glam/data/fixes/slot_fixes.yaml` is the AUTHORITATIVE source for slot migrations. NEVER delete entries from this file. Always mark completed migrations with `processed: {status: true}`.
+
+**See**: `.opencode/rules/slot-fixes-authoritative-rule.md` for complete documentation
+
+### Rule 62: Verified Ontology Terms Reference
+
+🚨 **CRITICAL**: All `class_uri`, `slot_uri`, and mappings MUST use verified classes and predicates from local ontology files in `data/ontology/`.
+
+**See**: `.opencode/rules/verified-ontology-terms.md` for the list of verified ontologies and verification procedures.
+
5
archived_classes.txt
Normal file
@@ -0,0 +1,5 @@
DualClassLink_archived_20260126.yaml
EducationCredential_archived_20260125.yaml
EducationEntry_archived_20260125.yaml
RealnessStatus_archived_20260114.yaml
TemplateSpecificityScores_archived_20260117.yaml
1146
archived_slots.txt
Normal file
File diff suppressed because it is too large
1146
archived_slots_refresh.txt
Normal file
File diff suppressed because it is too large
@@ -42,6 +42,8 @@ RUN useradd -m -u 1000 -s /bin/bash glam

 # Install Python dependencies first (better layer caching)
 COPY requirements.txt .
+# Install CPU-only PyTorch first to avoid massive CUDA download and runtime issues
+RUN pip install --no-cache-dir torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
 RUN pip install --no-cache-dir -r requirements.txt

 # Copy application code
2534
backend/rag/hybrid_retriever.py
Normal file
File diff suppressed because it is too large
@@ -407,7 +407,7 @@ class Settings:
     # RAG uses only Qdrant (vectors) and Oxigraph (SPARQL) for retrieval

     # LLM Configuration
-    anthropic_api_key: str = os.getenv("ANTHROPIC_API_KEY", "")
+    anthropic_api_key: str = os.getenv("ANTHROPIC_API_KEY", "") or os.getenv("CLAUDE_API_KEY", "")
     openai_api_key: str = os.getenv("OPENAI_API_KEY", "")
     huggingface_api_key: str = os.getenv("HUGGINGFACE_API_KEY", "")
     groq_api_key: str = os.getenv("GROQ_API_KEY", "")
@@ -1660,6 +1660,7 @@ class MultiSourceRetriever:
         only_heritage_relevant: bool = False,
         only_wcms: bool = False,
         using: str | None = None,
+        extra_filters: dict[str, Any] | None = None,
     ) -> list[Any]:
         """Search for persons/staff in the heritage_persons collection.

@@ -1672,20 +1673,29 @@ class MultiSourceRetriever:
             only_heritage_relevant: Only return heritage-relevant staff
             only_wcms: Only return WCMS-registered profiles
             using: Optional embedding model to use (e.g., 'minilm_384', 'openai_1536')
+            extra_filters: Optional extra filters for Qdrant

         Returns:
             List of RetrievedPerson objects
         """
         if self.qdrant:
             try:
-                return self.qdrant.search_persons(  # type: ignore[no-any-return]
-                    query=query,
-                    k=k,
-                    filter_custodian=filter_custodian,
-                    only_heritage_relevant=only_heritage_relevant,
-                    only_wcms=only_wcms,
-                    using=using,
-                )
+                # Dynamically check if qdrant.search_persons supports extra_filters
+                # This handles case where HybridRetriever signature varies
+                import inspect
+                sig = inspect.signature(self.qdrant.search_persons)
+                kwargs = {
+                    "query": query,
+                    "k": k,
+                    "filter_custodian": filter_custodian,
+                    "only_heritage_relevant": only_heritage_relevant,
+                    "only_wcms": only_wcms,
+                    "using": using,
+                }
+                if "extra_filters" in sig.parameters:
+                    kwargs["extra_filters"] = extra_filters
+
+                return self.qdrant.search_persons(**kwargs)  # type: ignore[no-any-return]
             except Exception as e:
                 logger.error(f"Person search failed: {e}")
         return []
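The hunk above forwards `extra_filters` only when the backend's `search_persons` accepts it. A standalone sketch of that signature-probing pattern follows. `call_with_supported_kwargs`, `old_backend`, and `new_backend` are hypothetical names for illustration and do not exist in the codebase; unlike the hunk, the sketch also treats a `**kwargs` parameter as accepting every keyword.

```python
from __future__ import annotations

import inspect
from typing import Any, Callable


def call_with_supported_kwargs(func: Callable[..., Any], **kwargs: Any) -> Any:
    """Call `func`, dropping keyword arguments its signature does not accept."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means the callee accepts any keyword.
    accepts_any = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    if not accepts_any:
        kwargs = {name: value for name, value in kwargs.items() if name in params}
    return func(**kwargs)


def old_backend(query: str, k: int = 10) -> list[str]:
    """Stand-in for a retriever without `extra_filters` support."""
    return [f"{query}:{k}"]


def new_backend(
    query: str, k: int = 10, extra_filters: dict | None = None
) -> list[str]:
    """Stand-in for a retriever that accepts `extra_filters`."""
    return [f"{query}:{k}:{extra_filters}"]


old_result = call_with_supported_kwargs(
    old_backend, query="nos", k=5, extra_filters={"a": 1}
)
new_result = call_with_supported_kwargs(
    new_backend, query="nos", k=5, extra_filters={"a": 1}
)
```

The trade-off of this pattern is that an unsupported filter is dropped silently, so the caller may want to log when a requested keyword was not forwarded.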
@@ -2752,16 +2762,68 @@ async def person_search(request: PersonSearchRequest) -> PersonSearchResponse:
         )

     try:
+        # Augment query for better recall on domain names if it looks like a domain search
+        # "nos" -> "nos email domain nos" to guide vector search towards email addresses
+        search_query = request.query
+        extra_filters = None
+
+        # 1. Email/Domain Detection Logic
+        is_email_like = "@" in search_query
+        # Check for common TLDs at the end of the query or at the end of words
+        common_tlds = ['.nl', '.com', '.org', '.net', '.eu', '.be', '.de', '.edu', '.gov', '.uk', '.fr', '.it', '.es']
+        is_domain_like = any(tld in search_query.lower() for tld in common_tlds)
+
+        # 2. Construct Filters
+        if is_email_like or is_domain_like:
+            logger.info(f"[PersonSearch] Email/Domain pattern detected: '{search_query}'")
+            # If explicit @ is present, we might want to strip leading @ for better matching
+            # e.g. "@nos.nl" -> "nos.nl"
+            clean_term = search_query.strip()
+            if clean_term.startswith("@"):
+                clean_term = clean_term[1:]
+
+            # Apply MatchText filter on email field
+            # This prioritizes email matches by strictly filtering for them first
+            extra_filters = {"email": {"match": {"text": clean_term}}}
+
+        elif len(search_query.split()) == 1 and len(search_query) > 2:
+            # Heuristic: single word queries might be domain searches (e.g. "nos", "leiden")
+            # We use MatchText filtering on email field to find substring matches
+            # Qdrant "match": {"text": "nos"} performs token-based matching
+            extra_filters = {"email": {"match": {"text": search_query}}}
+            logger.info(f"[PersonSearch] Potential domain search detected for '{search_query}'. Applying strict email filter: {extra_filters}")
+
+        logger.info(f"[PersonSearch] Executing search for '{search_query}' (extra_filters={extra_filters})")
+
         # Use the hybrid retriever's person search
         results = retriever.search_persons(
-            query=request.query,
+            query=search_query,
             k=request.k,
             filter_custodian=request.filter_custodian,
             only_heritage_relevant=request.only_heritage_relevant,
             only_wcms=request.only_wcms,
             using=request.embedding_model,  # Pass embedding model
+            extra_filters=extra_filters,
         )
+
+        # FALLBACK: If strict domain filter yielded no results, try standard vector search
+        # This fixes the issue where searching for names like "willem" (which look like domains)
+        # would fail because they don't appear in emails.
+        if extra_filters and not results:
+            logger.info(f"[PersonSearch] No results with email filter for '{search_query}'. Falling back to standard vector search.")
+            results = retriever.search_persons(
+                query=search_query,
+                k=request.k,
+                filter_custodian=request.filter_custodian,
+                only_heritage_relevant=request.only_heritage_relevant,
+                only_wcms=request.only_wcms,
+                using=request.embedding_model,
+                extra_filters=None,  # Disable filter for fallback
+            )
+            logger.info(f"[PersonSearch] Fallback search returned {len(results)} results")
+
+        logger.info(f"[PersonSearch] Final result count: {len(results)}")

         # Determine which embedding model was actually used
         embedding_model_used = None
         qdrant = retriever.qdrant
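In the hunk above, the comment describes matching TLDs "at the end of the query or at the end of words", while the `any(tld in ...)` test matches a TLD anywhere in the string (for example `.nl` inside `a.nlp`). The sketch below shows one way to make the check match the comment. It is an assumption-laden illustration, not the project's code: `build_email_filter` and `_DOMAIN_RE` are hypothetical names, the single-word heuristic from the hunk is deliberately left out, and only the filter dict shape is taken from the diff.

```python
from __future__ import annotations

import re

COMMON_TLDS = (
    "nl", "com", "org", "net", "eu", "be", "de", "edu", "gov", "uk", "fr", "it", "es",
)

# A host label, a dot, and a known TLD that ends at a word boundary.
_DOMAIN_RE = re.compile(
    r"[a-z0-9-]+\.(?:" + "|".join(COMMON_TLDS) + r")\b",
    re.IGNORECASE,
)


def build_email_filter(query: str) -> dict | None:
    """Return a text-match filter on the `email` field, or None for plain queries."""
    term = query.strip()
    if "@" in term or _DOMAIN_RE.search(term):
        # Strip one leading "@", e.g. "@nos.nl" -> "nos.nl".
        return {"email": {"match": {"text": term.removeprefix("@")}}}
    return None


domain_filter = build_email_filter("@nos.nl")
name_filter = build_email_filter("willem")
false_positive_filter = build_email_filter("a.nlp model")
```

With a word-boundary check, a query such as `a.nlp model` no longer triggers the strict email filter, which reduces how often the fallback search has to run.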
@@ -3501,6 +3563,21 @@ async def dspy_query(request: DSPyQueryRequest) -> DSPyQueryResponse:

     logger.info(f"LLM provider requested: {requested_provider} (request.llm_provider={request.llm_provider}, server default={settings.llm_provider})")

+    # Check if requested provider has API key configured - fail early if not
+    provider_api_keys = {
+        "zai": settings.zai_api_token,
+        "groq": settings.groq_api_key,
+        "anthropic": settings.anthropic_api_key,
+        "openai": settings.openai_api_key,
+        "huggingface": settings.huggingface_api_key,
+    }
+
+    if requested_provider in provider_api_keys and not provider_api_keys[requested_provider]:
+        raise ValueError(
+            f"LLM provider '{requested_provider}' was requested but its API key is not configured. "
+            f"Please set the appropriate environment variable (e.g., ANTHROPIC_API_KEY or CLAUDE_API_KEY for anthropic)."
+        )
+
     # Provider configuration priority: requested provider first, then fallback chain
     providers_to_try = [requested_provider]
     # Add fallback chain (but not duplicates)
@@ -4201,6 +4278,22 @@ async def stream_dspy_query_response(
     llm_model_used: str | None = None
     lm = None

+    # Check if requested provider has API key configured - fail early if not
+    provider_api_keys = {
+        "zai": settings.zai_api_token,
+        "groq": settings.groq_api_key,
+        "anthropic": settings.anthropic_api_key,
+        "openai": settings.openai_api_key,
+        "huggingface": settings.huggingface_api_key,
+    }
+
+    if requested_provider in provider_api_keys and not provider_api_keys[requested_provider]:
+        yield emit_error(
+            f"LLM provider '{requested_provider}' was requested but its API key is not configured. "
+            f"Please set the appropriate environment variable (e.g., ANTHROPIC_API_KEY or CLAUDE_API_KEY for anthropic)."
+        )
+        return
+
     providers_to_try = [requested_provider]
     for fallback in ["zai", "groq", "anthropic", "openai"]:
         if fallback not in providers_to_try:
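The same provider-key check appears in both `dspy_query` and `stream_dspy_query_response`. One way to keep the two call sites in step is a shared helper that returns the error text, which the first caller can raise and the second can emit. The sketch below is only an illustration under that assumption: `Settings` here is a hypothetical stand-in dataclass carrying the attribute names seen in the diff, and `missing_key_error` is not a function from the codebase.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Settings:
    """Hypothetical stand-in for the real settings object."""

    zai_api_token: str = ""
    groq_api_key: str = ""
    anthropic_api_key: str = ""
    openai_api_key: str = ""
    huggingface_api_key: str = ""


def missing_key_error(settings: Settings, provider: str) -> str | None:
    """Return an error message if `provider` is known but has no API key."""
    provider_api_keys = {
        "zai": settings.zai_api_token,
        "groq": settings.groq_api_key,
        "anthropic": settings.anthropic_api_key,
        "openai": settings.openai_api_key,
        "huggingface": settings.huggingface_api_key,
    }
    if provider in provider_api_keys and not provider_api_keys[provider]:
        return (
            f"LLM provider '{provider}' was requested "
            "but its API key is not configured."
        )
    return None


configured = missing_key_error(Settings(groq_api_key="example-key"), "groq")
unconfigured = missing_key_error(Settings(), "anthropic")
unknown = missing_key_error(Settings(), "some-other-provider")
```

As in the hunks, an unknown provider name passes the check and is left for the later fallback logic to handle.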
846
backend/rag/multi_embedding_retriever.py
Normal file
@@ -0,0 +1,846 @@
"""
Multi-Embedding Retriever for Heritage Data

Supports multiple embedding models using Qdrant's named vectors feature.
This enables:
- A/B testing different embedding models
- Cost optimization (cheap local embeddings vs paid API embeddings)
- Gradual migration between embedding models
- Fallback when one model is unavailable

Supported Embedding Models:
- openai_1536: text-embedding-3-small (1536-dim, $0.02/1M tokens)
- minilm_384: all-MiniLM-L6-v2 (384-dim, free/local)
- bge_768: bge-base-en-v1.5 (768-dim, free/local, high quality)

Collection Architecture:
    Each collection has named vectors for each embedding model:

    heritage_custodians:
        vectors:
            "openai_1536": VectorParams(size=1536)
            "minilm_384": VectorParams(size=384)
        payload: {name, ghcid, institution_type, ...}

    heritage_persons:
        vectors:
            "openai_1536": VectorParams(size=1536)
            "minilm_384": VectorParams(size=384)
        payload: {name, headline, custodian_name, ...}

Usage:
    retriever = MultiEmbeddingRetriever()

    # Search with default model (auto-select based on availability)
    results = retriever.search("museums in Amsterdam")

    # Search with specific model
    results = retriever.search("museums in Amsterdam", using="minilm_384")

    # A/B test comparison
    comparison = retriever.compare_models("museums in Amsterdam")
"""

import hashlib
import logging
import os
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Literal

logger = logging.getLogger(__name__)


class EmbeddingModel(str, Enum):
    """Supported embedding models with their configurations."""

    OPENAI_1536 = "openai_1536"
    MINILM_384 = "minilm_384"
    BGE_768 = "bge_768"

    @property
    def dimension(self) -> int:
        """Get the vector dimension for this model."""
        dims = {
            "openai_1536": 1536,
            "minilm_384": 384,
            "bge_768": 768,
        }
        return dims[self.value]

    @property
    def model_name(self) -> str:
        """Get the actual model name for loading."""
        names = {
            "openai_1536": "text-embedding-3-small",
            "minilm_384": "all-MiniLM-L6-v2",
            "bge_768": "BAAI/bge-base-en-v1.5",
        }
        return names[self.value]

    @property
    def is_local(self) -> bool:
        """Check if this model runs locally (no API calls)."""
        return self.value in ("minilm_384", "bge_768")

    @property
    def cost_per_1m_tokens(self) -> float:
        """Approximate cost per 1M tokens (0 for local models)."""
        costs = {
            "openai_1536": 0.02,
            "minilm_384": 0.0,
            "bge_768": 0.0,
        }
        return costs[self.value]


@dataclass
class MultiEmbeddingConfig:
    """Configuration for multi-embedding retriever."""

    # Qdrant connection
    qdrant_host: str = "localhost"
    qdrant_port: int = 6333
    qdrant_https: bool = False
    qdrant_prefix: str | None = None

    # API keys
    openai_api_key: str | None = None

    # Default embedding model preference order
    # First available model is used if no explicit model is specified
    model_preference: list[EmbeddingModel] = field(default_factory=lambda: [
        EmbeddingModel.MINILM_384,  # Free, fast, good quality
        EmbeddingModel.OPENAI_1536,  # Higher quality, paid
        EmbeddingModel.BGE_768,  # Free, high quality, slower
    ])

    # Collection names
    institutions_collection: str = "heritage_custodians"
    persons_collection: str = "heritage_persons"

    # Search defaults
    default_k: int = 10


class MultiEmbeddingRetriever:
    """Retriever supporting multiple embedding models via Qdrant named vectors.

    This class manages multiple embedding models and allows searching with
    any available model. It handles:
    - Model lazy-loading
    - Automatic model selection based on availability
    - Named vector creation and search
    - A/B testing between models
    """

    def __init__(self, config: MultiEmbeddingConfig | None = None):
        """Initialize multi-embedding retriever.

        Args:
            config: Configuration options. If None, uses environment variables.
        """
        self.config = config or self._config_from_env()

        # Lazy-loaded clients
        self._qdrant_client = None
        self._openai_client = None
        self._st_models: dict[str, Any] = {}  # Sentence transformer models

        # Track available models per collection
        self._available_models: dict[str, set[EmbeddingModel]] = {}

        # Track whether each collection uses named vectors (vs single unnamed vector)
        self._uses_named_vectors: dict[str, bool] = {}

        logger.info(f"MultiEmbeddingRetriever initialized with preference: {[m.value for m in self.config.model_preference]}")

    @staticmethod
    def _config_from_env() -> MultiEmbeddingConfig:
        """Create configuration from environment variables."""
        use_production = os.getenv("QDRANT_USE_PRODUCTION", "false").lower() == "true"

        if use_production:
            return MultiEmbeddingConfig(
                qdrant_host=os.getenv("QDRANT_PROD_HOST", "bronhouder.nl"),
                qdrant_port=443,
                qdrant_https=True,
                qdrant_prefix=os.getenv("QDRANT_PROD_PREFIX", "qdrant"),
                openai_api_key=os.getenv("OPENAI_API_KEY"),
            )
        else:
            return MultiEmbeddingConfig(
                qdrant_host=os.getenv("QDRANT_HOST", "localhost"),
                qdrant_port=int(os.getenv("QDRANT_PORT", "6333")),
                openai_api_key=os.getenv("OPENAI_API_KEY"),
            )

    @property
    def qdrant_client(self):
        """Lazy-load Qdrant client."""
        if self._qdrant_client is None:
            from qdrant_client import QdrantClient

            if self.config.qdrant_https:
                self._qdrant_client = QdrantClient(
                    host=self.config.qdrant_host,
                    port=self.config.qdrant_port,
                    https=True,
                    prefix=self.config.qdrant_prefix,
                    prefer_grpc=False,
                    timeout=30,
                )
                logger.info(f"Connected to Qdrant: https://{self.config.qdrant_host}/{self.config.qdrant_prefix or ''}")
            else:
                self._qdrant_client = QdrantClient(
                    host=self.config.qdrant_host,
                    port=self.config.qdrant_port,
                )
                logger.info(f"Connected to Qdrant: {self.config.qdrant_host}:{self.config.qdrant_port}")

        return self._qdrant_client

    @property
    def openai_client(self):
        """Lazy-load OpenAI client."""
        if self._openai_client is None:
            if not self.config.openai_api_key:
                raise RuntimeError("OpenAI API key not configured")

            import openai
            self._openai_client = openai.OpenAI(api_key=self.config.openai_api_key)

        return self._openai_client

    def _load_sentence_transformer(self, model: EmbeddingModel) -> Any:
        """Lazy-load a sentence-transformers model.

        Args:
            model: The embedding model to load

        Returns:
            Loaded SentenceTransformer model
        """
        if model.value not in self._st_models:
            try:
                from sentence_transformers import SentenceTransformer
                self._st_models[model.value] = SentenceTransformer(model.model_name)
                logger.info(f"Loaded sentence-transformers model: {model.model_name}")
            except ImportError:
                raise RuntimeError(
                    "sentence-transformers not installed. Run: pip install sentence-transformers"
                )

        return self._st_models[model.value]

    def get_embedding(self, text: str, model: EmbeddingModel) -> list[float]:
        """Get embedding vector for text using specified model.

        Args:
            text: Text to embed
            model: Embedding model to use

        Returns:
            Embedding vector as list of floats
        """
        if model == EmbeddingModel.OPENAI_1536:
            response = self.openai_client.embeddings.create(
                input=text,
                model=model.model_name,
            )
            return response.data[0].embedding

        elif model in (EmbeddingModel.MINILM_384, EmbeddingModel.BGE_768):
            st_model = self._load_sentence_transformer(model)
            embedding = st_model.encode(text)
            return embedding.tolist()

        else:
            raise ValueError(f"Unknown embedding model: {model}")

    def get_embeddings_batch(
        self,
        texts: list[str],
        model: EmbeddingModel,
        batch_size: int = 32,
    ) -> list[list[float]]:
        """Get embedding vectors for multiple texts.

        Args:
            texts: List of texts to embed
            model: Embedding model to use
            batch_size: Batch size for processing

        Returns:
            List of embedding vectors
        """
        if not texts:
            return []

        if model == EmbeddingModel.OPENAI_1536:
            # OpenAI batch API (max 2048 per request)
            all_embeddings = []
            for i in range(0, len(texts), 2048):
                batch = texts[i:i + 2048]
                response = self.openai_client.embeddings.create(
                    input=batch,
                    model=model.model_name,
                )
                batch_embeddings = [item.embedding for item in sorted(response.data, key=lambda x: x.index)]
                all_embeddings.extend(batch_embeddings)
            return all_embeddings

        elif model in (EmbeddingModel.MINILM_384, EmbeddingModel.BGE_768):
            st_model = self._load_sentence_transformer(model)
            embeddings = st_model.encode(texts, batch_size=batch_size, show_progress_bar=len(texts) > 100)
            return embeddings.tolist()

        else:
            raise ValueError(f"Unknown embedding model: {model}")

    def get_available_models(self, collection_name: str) -> set[EmbeddingModel]:
        """Get the embedding models available for a collection.

        Checks which named vectors exist in the collection.
        For single-vector collections, returns models matching the dimension.

        Args:
            collection_name: Name of the Qdrant collection

        Returns:
            Set of available EmbeddingModel values
        """
        if collection_name in self._available_models:
            return self._available_models[collection_name]

        try:
            info = self.qdrant_client.get_collection(collection_name)
            vectors_config = info.config.params.vectors

            available = set()
            uses_named_vectors = False

            # Check for named vectors (dict of vector configs)
            if isinstance(vectors_config, dict):
                # Named vectors - each key is a vector name
                uses_named_vectors = True
                for vector_name in vectors_config.keys():
                    try:
                        model = EmbeddingModel(vector_name)
                        available.add(model)
                    except ValueError:
                        logger.warning(f"Unknown vector name in collection: {vector_name}")
            else:
                # Single unnamed vector - check dimension to find compatible model
                # Note: This doesn't mean we can use `using=model.value` in queries
                uses_named_vectors = False
                if hasattr(vectors_config, 'size'):
                    dim = vectors_config.size
                    for model in EmbeddingModel:
                        if model.dimension == dim:
                            available.add(model)

            # Store both available models and whether named vectors are used
            self._available_models[collection_name] = available
            self._uses_named_vectors[collection_name] = uses_named_vectors

            if uses_named_vectors:
                logger.info(f"Collection '{collection_name}' uses named vectors: {[m.value for m in available]}")
            else:
                logger.info(f"Collection '{collection_name}' uses single vector (compatible with: {[m.value for m in available]})")

            return available

        except Exception as e:
            logger.warning(f"Could not get available models for {collection_name}: {e}")
            return set()

    def uses_named_vectors(self, collection_name: str) -> bool:
        """Check if a collection uses named vectors (vs single unnamed vector).
|
||||
|
||||
Args:
|
||||
collection_name: Name of the Qdrant collection
|
||||
|
||||
Returns:
|
||||
True if collection has named vectors, False for single-vector collections
|
||||
"""
|
||||
# Ensure models are loaded (populates _uses_named_vectors)
|
||||
self.get_available_models(collection_name)
|
||||
return self._uses_named_vectors.get(collection_name, False)
|
||||
|
||||
def select_model(
|
||||
self,
|
||||
collection_name: str,
|
||||
preferred: EmbeddingModel | None = None,
|
||||
) -> EmbeddingModel | None:
|
||||
"""Select the best available embedding model for a collection.
|
||||
|
||||
Args:
|
||||
collection_name: Name of the collection
|
||||
preferred: Preferred model (used if available)
|
||||
|
||||
Returns:
|
||||
Selected EmbeddingModel or None if none available
|
||||
"""
|
||||
available = self.get_available_models(collection_name)
|
||||
|
||||
if not available:
|
||||
# No recognised model was detected - fall back to matching by vector dimension
|
||||
# This covers collections whose vector names do not map to a known EmbeddingModel
|
||||
try:
|
||||
info = self.qdrant_client.get_collection(collection_name)
|
||||
vectors_config = info.config.params.vectors
|
||||
|
||||
# Get vector dimension
|
||||
dim = None
|
||||
if hasattr(vectors_config, 'size'):
|
||||
dim = vectors_config.size
|
||||
elif isinstance(vectors_config, dict):
|
||||
# Get first vector config
|
||||
first_config = next(iter(vectors_config.values()), None)
|
||||
if first_config and hasattr(first_config, 'size'):
|
||||
dim = first_config.size
|
||||
|
||||
if dim:
|
||||
for model in self.config.model_preference:
|
||||
if model.dimension == dim:
|
||||
return model
|
||||
except Exception as e:
|
||||
logger.debug(f"Could not inspect vector config for {collection_name}: {e}")
|
||||
|
||||
return None
|
||||
|
||||
# If preferred model is available, use it
|
||||
if preferred and preferred in available:
|
||||
return preferred
|
||||
|
||||
# Otherwise, follow preference order
|
||||
for model in self.config.model_preference:
|
||||
if model in available:
|
||||
# Check if model is usable (has API key if needed)
|
||||
if model == EmbeddingModel.OPENAI_1536 and not self.config.openai_api_key:
|
||||
continue
|
||||
return model
|
||||
|
||||
return None
|
||||
|
||||
def search(
|
||||
self,
|
||||
query: str,
|
||||
collection_name: str | None = None,
|
||||
k: int | None = None,
|
||||
using: EmbeddingModel | str | None = None,
|
||||
filter_conditions: dict[str, Any] | None = None,
|
||||
) -> list[dict[str, Any]]:
|
||||
"""Search for similar documents using specified or auto-selected model.
|
||||
|
||||
Args:
|
||||
query: Search query text
|
||||
collection_name: Collection to search (default: config.institutions_collection)
|
||||
k: Number of results
|
||||
using: Embedding model to use (auto-selected if None)
|
||||
filter_conditions: Optional Qdrant filter conditions
|
||||
|
||||
Returns:
|
||||
List of results with scores and payloads
|
||||
"""
|
||||
collection_name = collection_name or self.config.institutions_collection
|
||||
k = k or self.config.default_k
|
||||
|
||||
# Resolve model
|
||||
if using is not None:
|
||||
if isinstance(using, str):
|
||||
model = EmbeddingModel(using)
|
||||
else:
|
||||
model = using
|
||||
else:
|
||||
model = self.select_model(collection_name)
|
||||
|
||||
if model is None:
|
||||
raise RuntimeError(f"No compatible embedding model for collection '{collection_name}'")
|
||||
|
||||
logger.info(f"Searching '{collection_name}' with {model.value}: {query[:50]}...")
|
||||
|
||||
# Get query embedding
|
||||
query_vector = self.get_embedding(query, model)
|
||||
|
||||
# Build filter
|
||||
from qdrant_client.http import models
|
||||
|
||||
query_filter = None
|
||||
if filter_conditions:
|
||||
query_filter = models.Filter(
|
||||
must=[
|
||||
models.FieldCondition(
|
||||
key=key,
|
||||
match=models.MatchValue(value=value),
|
||||
)
|
||||
for key, value in filter_conditions.items()
|
||||
]
|
||||
)
|
||||
|
||||
# Check if collection uses named vectors (not just single unnamed vector)
|
||||
# Only pass `using=model.value` if collection has actual named vectors
|
||||
use_named_vector = self.uses_named_vectors(collection_name)
|
||||
|
||||
# Search
|
||||
if use_named_vector:
|
||||
results = self.qdrant_client.query_points(
|
||||
collection_name=collection_name,
|
||||
query=query_vector,
|
||||
using=model.value,
|
||||
limit=k,
|
||||
with_payload=True,
|
||||
query_filter=query_filter,
|
||||
)
|
||||
else:
|
||||
# Legacy single-vector search
|
||||
results = self.qdrant_client.query_points(
|
||||
collection_name=collection_name,
|
||||
query=query_vector,
|
||||
limit=k,
|
||||
with_payload=True,
|
||||
query_filter=query_filter,
|
||||
)
|
||||
|
||||
return [
|
||||
{
|
||||
"id": str(point.id),
|
||||
"score": point.score,
|
||||
"model": model.value,
|
||||
"payload": point.payload or {},
|
||||
}
|
||||
for point in results.points
|
||||
]
|
||||
|
||||
def search_persons(
|
||||
self,
|
||||
query: str,
|
||||
k: int | None = None,
|
||||
using: EmbeddingModel | str | None = None,
|
||||
filter_custodian: str | None = None,
|
||||
only_heritage_relevant: bool = False,
|
||||
only_wcms: bool = False,
|
||||
) -> list[dict[str, Any]]:
|
||||
"""Search for persons/staff in the heritage_persons collection.
|
||||
|
||||
Args:
|
||||
query: Search query text
|
||||
k: Number of results
|
||||
using: Embedding model to use
|
||||
filter_custodian: Optional custodian slug to filter by
|
||||
only_heritage_relevant: Only return heritage-relevant staff
|
||||
only_wcms: Only return WCMS-registered profiles (heritage sector users)
|
||||
|
||||
Returns:
|
||||
List of person results with scores
|
||||
"""
|
||||
k = k or self.config.default_k
|
||||
|
||||
# Build filters
|
||||
filters = {}
|
||||
if filter_custodian:
|
||||
filters["custodian_slug"] = filter_custodian
|
||||
if only_wcms:
|
||||
filters["has_wcms"] = True
|
||||
|
||||
# Over-fetch (2 * k) so that post-filtering can still return up to k results
|
||||
results = self.search(
|
||||
query=query,
|
||||
collection_name=self.config.persons_collection,
|
||||
k=k * 2,
|
||||
using=using,
|
||||
filter_conditions=filters if filters else None,
|
||||
)
|
||||
|
||||
# Post-filter for heritage_relevant if needed
|
||||
if only_heritage_relevant:
|
||||
results = [r for r in results if r.get("payload", {}).get("heritage_relevant", False)]
|
||||
|
||||
# Format results
|
||||
formatted = []
|
||||
for r in results[:k]:
|
||||
payload = r.get("payload", {})
|
||||
formatted.append({
|
||||
"person_id": payload.get("staff_id", "") or hashlib.md5(
|
||||
f"{payload.get('custodian_slug', '')}:{payload.get('name', '')}".encode()
|
||||
).hexdigest()[:16],
|
||||
"name": payload.get("name", ""),
|
||||
"headline": payload.get("headline"),
|
||||
"custodian_name": payload.get("custodian_name"),
|
||||
"custodian_slug": payload.get("custodian_slug"),
|
||||
"location": payload.get("location"),
|
||||
"heritage_relevant": payload.get("heritage_relevant", False),
|
||||
"heritage_type": payload.get("heritage_type"),
|
||||
"linkedin_url": payload.get("linkedin_url"),
|
||||
"score": r["score"],
|
||||
"model": r["model"],
|
||||
})
|
||||
|
||||
return formatted
|
||||
|
||||
def compare_models(
|
||||
self,
|
||||
query: str,
|
||||
collection_name: str | None = None,
|
||||
k: int = 10,
|
||||
models: list[EmbeddingModel] | None = None,
|
||||
) -> dict[str, Any]:
|
||||
"""A/B test comparison of multiple embedding models.
|
||||
|
||||
Args:
|
||||
query: Search query
|
||||
collection_name: Collection to search
|
||||
k: Number of results per model
|
||||
models: Models to compare (default: all available)
|
||||
|
||||
Returns:
|
||||
Dict with results per model and overlap analysis
|
||||
"""
|
||||
collection_name = collection_name or self.config.institutions_collection
|
||||
|
||||
# Determine which models to compare
|
||||
available = self.get_available_models(collection_name)
|
||||
if models:
|
||||
models_to_test = [m for m in models if m in available]
|
||||
else:
|
||||
models_to_test = list(available)
|
||||
|
||||
if not models_to_test:
|
||||
return {"error": "No models available for comparison"}
|
||||
|
||||
results = {}
|
||||
all_ids = {}
|
||||
|
||||
for model in models_to_test:
|
||||
try:
|
||||
model_results = self.search(
|
||||
query=query,
|
||||
collection_name=collection_name,
|
||||
k=k,
|
||||
using=model,
|
||||
)
|
||||
results[model.value] = model_results
|
||||
all_ids[model.value] = {r["id"] for r in model_results}
|
||||
except Exception as e:
|
||||
results[model.value] = {"error": str(e)}
|
||||
all_ids[model.value] = set()
|
||||
|
||||
# Calculate overlap between models
|
||||
overlap = {}
|
||||
model_values = list(all_ids.keys())
|
||||
for i, m1 in enumerate(model_values):
|
||||
for m2 in model_values[i + 1:]:
|
||||
if all_ids[m1] and all_ids[m2]:
|
||||
intersection = all_ids[m1] & all_ids[m2]
|
||||
union = all_ids[m1] | all_ids[m2]
|
||||
jaccard = len(intersection) / len(union) if union else 0
|
||||
overlap[f"{m1}_vs_{m2}"] = {
|
||||
"jaccard_similarity": round(jaccard, 3),
|
||||
"common_results": len(intersection),
|
||||
"total_unique": len(union),
|
||||
}
|
||||
|
||||
return {
|
||||
"query": query,
|
||||
"collection": collection_name,
|
||||
"k": k,
|
||||
"results": results,
|
||||
"overlap_analysis": overlap,
|
||||
}
|
||||
|
||||
def create_multi_embedding_collection(
|
||||
self,
|
||||
collection_name: str,
|
||||
models: list[EmbeddingModel] | None = None,
|
||||
) -> bool:
|
||||
"""Create a new collection with named vectors for multiple embedding models.
|
||||
|
||||
Args:
|
||||
collection_name: Name for the new collection
|
||||
models: Embedding models to support (default: all)
|
||||
|
||||
Returns:
|
||||
True if created successfully
|
||||
"""
|
||||
from qdrant_client.http.models import Distance, VectorParams
|
||||
|
||||
models = models or list(EmbeddingModel)
|
||||
|
||||
vectors_config = {
|
||||
model.value: VectorParams(
|
||||
size=model.dimension,
|
||||
distance=Distance.COSINE,
|
||||
)
|
||||
for model in models
|
||||
}
|
||||
|
||||
try:
|
||||
self.qdrant_client.create_collection(
|
||||
collection_name=collection_name,
|
||||
vectors_config=vectors_config,
|
||||
)
|
||||
logger.info(f"Created multi-embedding collection '{collection_name}' with {[m.value for m in models]}")
|
||||
|
||||
# Clear cache
|
||||
self._available_models.pop(collection_name, None)
|
||||
self._uses_named_vectors.pop(collection_name, None)
|
||||
|
||||
return True
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to create collection: {e}")
|
||||
return False
|
||||
|
||||
def add_documents_multi_embedding(
|
||||
self,
|
||||
documents: list[dict[str, Any]],
|
||||
collection_name: str,
|
||||
models: list[EmbeddingModel] | None = None,
|
||||
batch_size: int = 100,
|
||||
) -> int:
|
||||
"""Add documents with embeddings from multiple models.
|
||||
|
||||
Args:
|
||||
documents: List of documents with 'text' and optional 'metadata' fields
|
||||
collection_name: Target collection
|
||||
models: Models to generate embeddings for (default: all available)
|
||||
batch_size: Batch size for processing
|
||||
|
||||
Returns:
|
||||
Number of documents added
|
||||
"""
|
||||
from qdrant_client.http import models as qmodels
|
||||
|
||||
# Determine which models to use
|
||||
available = self.get_available_models(collection_name)
|
||||
if models:
|
||||
models_to_use = [m for m in models if m in available]
|
||||
else:
|
||||
models_to_use = list(available)
|
||||
|
||||
if not models_to_use:
|
||||
raise RuntimeError(f"No embedding models available for collection '{collection_name}'")
|
||||
|
||||
# Filter valid documents
|
||||
valid_docs = [d for d in documents if d.get("text")]
|
||||
total_indexed = 0
|
||||
|
||||
for i in range(0, len(valid_docs), batch_size):
|
||||
batch = valid_docs[i:i + batch_size]
|
||||
texts = [d["text"] for d in batch]
|
||||
|
||||
# Generate embeddings for each model
|
||||
embeddings_by_model = {}
|
||||
for model in models_to_use:
|
||||
try:
|
||||
embeddings_by_model[model] = self.get_embeddings_batch(texts, model)
|
||||
except Exception as e:
|
||||
logger.warning(f"Failed to get {model.value} embeddings: {e}")
|
||||
|
||||
if not embeddings_by_model:
|
||||
continue
|
||||
|
||||
# Create points with named vectors
|
||||
points = []
|
||||
for j, doc in enumerate(batch):
|
||||
text = doc["text"]
|
||||
metadata = doc.get("metadata", {})
|
||||
point_id = doc.get("id") or hashlib.md5(text.encode()).hexdigest()
|
||||
|
||||
# Build named vectors dict
|
||||
vectors = {}
|
||||
for model, model_embeddings in embeddings_by_model.items():
|
||||
vectors[model.value] = model_embeddings[j]
|
||||
|
||||
points.append(qmodels.PointStruct(
|
||||
id=point_id,
|
||||
vector=vectors,
|
||||
payload={
|
||||
"text": text,
|
||||
**metadata,
|
||||
}
|
||||
))
|
||||
|
||||
# Upsert batch
|
||||
self.qdrant_client.upsert(
|
||||
collection_name=collection_name,
|
||||
points=points,
|
||||
)
|
||||
total_indexed += len(points)
|
||||
logger.info(f"Indexed {total_indexed}/{len(valid_docs)} documents with {len(models_to_use)} models")
|
||||
|
||||
return total_indexed
|
||||
|
||||
def get_stats(self) -> dict[str, Any]:
|
||||
"""Get statistics about collections and available models.
|
||||
|
||||
Returns:
|
||||
Dict with collection stats and model availability
|
||||
"""
|
||||
stats = {
|
||||
"config": {
|
||||
"qdrant_host": self.config.qdrant_host,
|
||||
"qdrant_port": self.config.qdrant_port,
|
||||
"model_preference": [m.value for m in self.config.model_preference],
|
||||
"openai_available": bool(self.config.openai_api_key),
|
||||
},
|
||||
"collections": {},
|
||||
}
|
||||
|
||||
for collection_name in [self.config.institutions_collection, self.config.persons_collection]:
|
||||
try:
|
||||
info = self.qdrant_client.get_collection(collection_name)
|
||||
available_models = self.get_available_models(collection_name)
|
||||
selected_model = self.select_model(collection_name)
|
||||
|
||||
stats["collections"][collection_name] = {
|
||||
"vectors_count": info.vectors_count,
|
||||
"points_count": info.points_count,
|
||||
"status": info.status.value if info.status else "unknown",
|
||||
"available_models": [m.value for m in available_models],
|
||||
"selected_model": selected_model.value if selected_model else None,
|
||||
}
|
||||
except Exception as e:
|
||||
stats["collections"][collection_name] = {"error": str(e)}
|
||||
|
||||
return stats
|
||||
|
||||
def close(self):
|
||||
"""Close all connections."""
|
||||
if self._qdrant_client:
|
||||
self._qdrant_client.close()
|
||||
self._qdrant_client = None
|
||||
self._st_models.clear()
|
||||
self._available_models.clear()
|
||||
self._uses_named_vectors.clear()
|
||||
|
||||
|
||||
def create_multi_embedding_retriever(use_production: bool | None = None) -> MultiEmbeddingRetriever:
|
||||
"""Factory function to create a MultiEmbeddingRetriever.
|
||||
|
||||
Args:
|
||||
use_production: If True, connect to production Qdrant.
|
||||
Defaults to QDRANT_USE_PRODUCTION env var.
|
||||
|
||||
Returns:
|
||||
Configured MultiEmbeddingRetriever instance
|
||||
"""
|
||||
if use_production is None:
|
||||
use_production = os.getenv("QDRANT_USE_PRODUCTION", "").lower() in ("true", "1", "yes")
|
||||
|
||||
if use_production:
|
||||
config = MultiEmbeddingConfig(
|
||||
qdrant_host=os.getenv("QDRANT_PROD_HOST", "bronhouder.nl"),
|
||||
qdrant_port=443,
|
||||
qdrant_https=True,
|
||||
qdrant_prefix=os.getenv("QDRANT_PROD_PREFIX", "qdrant"),
|
||||
openai_api_key=os.getenv("OPENAI_API_KEY"),
|
||||
)
|
||||
else:
|
||||
config = MultiEmbeddingConfig(
|
||||
qdrant_host=os.getenv("QDRANT_HOST", "localhost"),
|
||||
qdrant_port=int(os.getenv("QDRANT_PORT", "6333")),
|
||||
openai_api_key=os.getenv("OPENAI_API_KEY"),
|
||||
)
|
||||
|
||||
return MultiEmbeddingRetriever(config)
|
||||
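The overlap analysis in `compare_models` comes down to a pairwise Jaccard similarity over the result-ID sets returned per model. Below is a minimal standalone sketch of that step, assuming the per-model IDs have already been collected into sets. The function name `pairwise_overlap` and the sample model keys and IDs are hypothetical and not part of the module.

```python
from __future__ import annotations

from itertools import combinations


def pairwise_overlap(ids_by_model: dict[str, set[str]]) -> dict[str, dict[str, float | int]]:
    """Jaccard similarity for every pair of models with non-empty result sets."""
    overlap = {}
    for m1, m2 in combinations(ids_by_model, 2):
        a, b = ids_by_model[m1], ids_by_model[m2]
        if not a or not b:
            # As in compare_models: skip pairs where a model failed or returned nothing
            continue
        intersection = a & b
        union = a | b
        overlap[f"{m1}_vs_{m2}"] = {
            "jaccard_similarity": round(len(intersection) / len(union), 3),
            "common_results": len(intersection),
            "total_unique": len(union),
        }
    return overlap


# Hypothetical sample data: model keys and point IDs are made up for illustration
sample = {
    "minilm_384": {"a", "b", "c"},
    "bge_768": {"b", "c", "d"},
    "openai_1536": set(),  # e.g. the search for this model raised an error
}
result = pairwise_overlap(sample)
```

Here only the `minilm_384` / `bge_768` pair is reported: two shared IDs out of four distinct ones. A Jaccard score near 1 suggests the models retrieve largely the same documents, while a low score indicates they surface different candidates for the same query.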
19045
data/fixes/bu/slot_fixes.yaml
Normal file
File diff suppressed because it is too large
18357
data/fixes/slot_fixes.yaml
Normal file
File diff suppressed because it is too large
44
data/fixes/slot_fixes_20260129.yaml
Normal file
|
|
@ -0,0 +1,44 @@
|
|||
fixes:
|
||||
- original_slot_id: /Users/kempersc/apps/glam/schemas/20251121/linkml/modules/slots/archive_branches.yaml
|
||||
revision:
|
||||
- label: has_or_had_branch
|
||||
type: slot
|
||||
- label: Branch
|
||||
type: class
|
||||
- original_slot_id: /Users/kempersc/apps/glam/schemas/20251121/linkml/modules/slots/archive_path.yaml
|
||||
revision:
|
||||
- label: has_or_had_provenance_path
|
||||
type: slot
|
||||
- label: ProvenancePath
|
||||
type: class
|
||||
- original_slot_id: /Users/kempersc/apps/glam/schemas/20251121/linkml/modules/slots/archive_series.yaml
|
||||
revision:
|
||||
- label: is_or_was_part_of_series
|
||||
type: slot
|
||||
- label: Series
|
||||
type: class
|
||||
- original_slot_id: /Users/kempersc/apps/glam/schemas/20251121/linkml/modules/slots/condition_of_access.yaml
|
||||
revision:
|
||||
- label: has_or_had_condition_of_access
|
||||
type: slot
|
||||
- label: ConditionOfAccess
|
||||
type: class
|
||||
- original_slot_id: /Users/kempersc/apps/glam/schemas/20251121/linkml/modules/slots/connection_heritage_relevant.yaml
|
||||
revision:
|
||||
- label: is_or_was_related_to
|
||||
type: slot
|
||||
- label: Entity
|
||||
type: class
|
||||
- original_slot_id: /Users/kempersc/apps/glam/schemas/20251121/linkml/modules/slots/connection_heritage_type.yaml
|
||||
revision:
|
||||
- label: has_or_had_heritage_type
|
||||
type: slot
|
||||
- label: HeritageType
|
||||
type: class
|
||||
- original_slot_id: https://nde.nl/ontology/hc/slot/was_retrieved_at
|
||||
revision:
|
||||
- label: is_or_was_retrieved_at
|
||||
type: slot
|
||||
- label: TimeSpan
|
||||
type: class
|
||||
-
|
||||
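Each entry in this file pairs an `original_slot_id` with a `revision` list of `label`/`type` items. Below is a minimal sketch of a structural check over already-parsed entries (for example, the `fixes` list obtained after loading the YAML), assuming those two keys are required and that `slot` and `class` are the only valid types. The function name `check_fixes` and the sample data are hypothetical.

```python
from __future__ import annotations

REQUIRED_KEYS = {"original_slot_id", "revision"}
ALLOWED_TYPES = {"slot", "class"}  # assumption: the only `type` values in use


def check_fixes(entries: list) -> list[str]:
    """Return human-readable problems found in a parsed `fixes` list."""
    problems = []
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: expected a mapping, got {type(entry).__name__}")
            continue
        for key in sorted(REQUIRED_KEYS - entry.keys()):
            problems.append(f"entry {i}: missing key '{key}'")
        for key in sorted(entry.keys() - REQUIRED_KEYS):
            problems.append(f"entry {i}: unexpected key '{key}'")
        for item in entry.get("revision") or []:
            item_type = item.get("type") if isinstance(item, dict) else None
            if item_type not in ALLOWED_TYPES:
                problems.append(f"entry {i}: unknown revision type {item_type!r}")
    return problems


# Hypothetical sample data, not taken from the repository
sample = [
    {
        "original_slot_id": "slots/example_a.yaml",
        "revision": [
            {"label": "has_or_had_example", "type": "slot"},
            {"label": "Example", "type": "class"},
        ],
    },
    {"orignal_slot_id": "slots/example_b.yaml", "revision": []},  # misspelled key
    None,  # what a bare "-" list item parses to
]
problems = check_fixes(sample)
```

A check like this reports misspelled keys and empty list items without deleting anything, which fits the rule that entries are never removed from `slot_fixes.yaml`.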
|
|
@ -73,6 +73,9 @@ This document catalogs all ontologies used in the GLAM Heritage Custodian projec
|
|||
|------|----------|---------|--------|-----------|
|
||||
| `skos.rdf` | SKOS (Simple Knowledge Org System) | 2009 | https://www.w3.org/TR/skos-reference/ | `skos:` |
|
||||
| `dublin_core_elements.rdf` | Dublin Core Elements | 1.1 | https://www.dublincore.org/specifications/dublin-core/ | `dc:` |
|
||||
| `dcterms.rdf` | DCMI Metadata Terms (RDF) | 2020 | https://www.dublincore.org/specifications/dublin-core/dcmi-terms/dublin_core_terms.rdf | `dcterms:` |
|
||||
| `dctype.rdf` | DCMI Type Vocabulary | 2012 | https://www.dublincore.org/specifications/dublin-core/dcmi-type-vocabulary/ | `dcmitype:` |
|
||||
| `oa.ttl` | Open Annotation Data Model | 2013 | https://www.w3.org/TR/annotation-vocab/ | `oa:` |
|
||||
| `dcat3.ttl` | DCAT (Data Catalog Vocabulary) | 3.0 | https://www.w3.org/TR/vocab-dcat-3/ | `dcat:` |
|
||||
| `schemaorg.owl` | Schema.org | 2024 | https://schema.org/ | `schema:` |
|
||||
| `vcard.rdf` | vCard Ontology | 4.0 | https://www.w3.org/TR/vcard-rdf/ | `vcard:` |
|
||||
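The prefix column in this table maps to namespace URIs. A compact name such as `dcmitype:Collection` expands by concatenating the namespace with the local name. Below is a small sketch using the namespace URIs declared in the header of the `dctype.rdf` file added in this comparison; the helper `expand_curie` is hypothetical.

```python
# Namespace URIs as declared in the dctype.rdf entity header
NAMESPACES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "dcmitype": "http://purl.org/dc/dcmitype/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def expand_curie(curie: str) -> str:
    """Expand 'prefix:LocalName' into a full URI using NAMESPACES."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in NAMESPACES:
        raise ValueError(f"Unknown or missing prefix in {curie!r}")
    return NAMESPACES[prefix] + local


collection_uri = expand_curie("dcmitype:Collection")
```

The expanded value matches the `rdf:about` URI used for the Collection class in `dctype.rdf`.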
|
|
|
|||
1
data/ontology/RiC-O_1-0-2.rdf
Normal file
|
|
@ -0,0 +1 @@
|
|||
404: Not Found
|
||||
30839
data/ontology/RiC-O_1-1.rdf
Normal file
File diff suppressed because it is too large
2103
data/ontology/dcterms.rdf
Normal file
File diff suppressed because it is too large
21762
data/ontology/dcterms.ttl
Normal file
File diff suppressed because it is too large
152
data/ontology/dctype.rdf
Normal file
|
|
@ -0,0 +1,152 @@
|
|||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<!DOCTYPE rdf:RDF [
|
||||
<!ENTITY rdfns 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
|
||||
<!ENTITY rdfsns 'http://www.w3.org/2000/01/rdf-schema#'>
|
||||
<!ENTITY dcns 'http://purl.org/dc/elements/1.1/'>
|
||||
<!ENTITY dctermsns 'http://purl.org/dc/terms/'>
|
||||
<!ENTITY dctypens 'http://purl.org/dc/dcmitype/'>
|
||||
<!ENTITY dcamns 'http://purl.org/dc/dcam/'>
|
||||
<!ENTITY skosns 'http://www.w3.org/2004/02/skos/core#'>
|
||||
<!ENTITY owlns 'http://www.w3.org/2002/07/owl#'>
|
||||
]>
|
||||
<rdf:RDF xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:skos="http://www.w3.org/2004/02/skos/core#" xmlns:dcam="http://purl.org/dc/dcam/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
|
||||
<rdf:Description rdf:about="http://purl.org/dc/dcmitype/">
|
||||
<dcterms:title xml:lang="en">DCMI Type Vocabulary</dcterms:title>
|
||||
<dcterms:publisher rdf:resource="http://purl.org/dc/aboutdcmi#DCMI"/>
|
||||
<dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2012-06-14</dcterms:modified>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://purl.org/dc/dcmitype/Collection">
|
||||
<rdfs:label xml:lang="en">Collection</rdfs:label>
|
||||
<rdfs:comment xml:lang="en">An aggregation of resources.</rdfs:comment>
|
||||
<dcterms:description xml:lang="en">A collection is described as a group; its parts may also be separately described.</dcterms:description>
|
||||
<rdfs:isDefinedBy rdf:resource="http://purl.org/dc/dcmitype/"/>
|
||||
<dcterms:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2000-07-11</dcterms:issued>
|
||||
<dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2008-01-14</dcterms:modified>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
|
||||
<dcterms:hasVersion rdf:resource="http://dublincore.org/usage/terms/history/#Collection-003"/>
|
||||
<dcam:memberOf rdf:resource="http://purl.org/dc/terms/DCMIType"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://purl.org/dc/dcmitype/Dataset">
|
||||
<rdfs:label xml:lang="en">Dataset</rdfs:label>
|
||||
<rdfs:comment xml:lang="en">Data encoded in a defined structure.</rdfs:comment>
|
||||
<dcterms:description xml:lang="en">Examples include lists, tables, and databases. A dataset may be useful for direct machine processing.</dcterms:description>
|
||||
<rdfs:isDefinedBy rdf:resource="http://purl.org/dc/dcmitype/"/>
|
||||
<dcterms:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2000-07-11</dcterms:issued>
|
||||
<dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2008-01-14</dcterms:modified>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
|
||||
<dcterms:hasVersion rdf:resource="http://dublincore.org/usage/terms/history/#Dataset-003"/>
|
||||
<dcam:memberOf rdf:resource="http://purl.org/dc/terms/DCMIType"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://purl.org/dc/dcmitype/Event">
|
||||
<rdfs:label xml:lang="en">Event</rdfs:label>
|
||||
<rdfs:comment xml:lang="en">A non-persistent, time-based occurrence.</rdfs:comment>
|
||||
<dcterms:description xml:lang="en">Metadata for an event provides descriptive information that is the basis for discovery of the purpose, location, duration, and responsible agents associated with an event. Examples include an exhibition, webcast, conference, workshop, open day, performance, battle, trial, wedding, tea party, conflagration.</dcterms:description>
|
||||
<rdfs:isDefinedBy rdf:resource="http://purl.org/dc/dcmitype/"/>
|
||||
<dcterms:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2000-07-11</dcterms:issued>
|
||||
<dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2008-01-14</dcterms:modified>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
|
||||
<dcterms:hasVersion rdf:resource="http://dublincore.org/usage/terms/history/#Event-003"/>
|
||||
<dcam:memberOf rdf:resource="http://purl.org/dc/terms/DCMIType"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://purl.org/dc/dcmitype/Image">
|
||||
<rdfs:label xml:lang="en">Image</rdfs:label>
|
||||
<rdfs:comment xml:lang="en">A visual representation other than text.</rdfs:comment>
|
||||
<dcterms:description xml:lang="en">Examples include images and photographs of physical objects, paintings, prints, drawings, other images and graphics, animations and moving pictures, film, diagrams, maps, musical notation. Note that Image may include both electronic and physical representations.</dcterms:description>
|
||||
<rdfs:isDefinedBy rdf:resource="http://purl.org/dc/dcmitype/"/>
|
||||
<dcterms:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2000-07-11</dcterms:issued>
|
||||
<dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2008-01-14</dcterms:modified>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
|
||||
<dcterms:hasVersion rdf:resource="http://dublincore.org/usage/terms/history/#Image-004"/>
|
||||
<dcam:memberOf rdf:resource="http://purl.org/dc/terms/DCMIType"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://purl.org/dc/dcmitype/InteractiveResource">
|
||||
<rdfs:label xml:lang="en">Interactive Resource</rdfs:label>
|
||||
<rdfs:comment xml:lang="en">A resource requiring interaction from the user to be understood, executed, or experienced.</rdfs:comment>
|
||||
<dcterms:description xml:lang="en">Examples include forms on Web pages, applets, multimedia learning objects, chat services, or virtual reality environments.</dcterms:description>
|
||||
<rdfs:isDefinedBy rdf:resource="http://purl.org/dc/dcmitype/"/>
|
||||
<dcterms:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2000-07-11</dcterms:issued>
|
||||
<dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2008-01-14</dcterms:modified>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
|
||||
<dcterms:hasVersion rdf:resource="http://dublincore.org/usage/terms/history/#InteractiveResource-003"/>
|
||||
<dcam:memberOf rdf:resource="http://purl.org/dc/terms/DCMIType"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://purl.org/dc/dcmitype/Service">
|
||||
<rdfs:label xml:lang="en">Service</rdfs:label>
|
||||
<rdfs:comment xml:lang="en">A system that provides one or more functions.</rdfs:comment>
|
||||
<dcterms:description xml:lang="en">Examples include a photocopying service, a banking service, an authentication service, interlibrary loans, a Z39.50 or Web server.</dcterms:description>
|
||||
<rdfs:isDefinedBy rdf:resource="http://purl.org/dc/dcmitype/"/>
|
||||
<dcterms:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2000-07-11</dcterms:issued>
|
||||
<dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2008-01-14</dcterms:modified>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
|
||||
<dcterms:hasVersion rdf:resource="http://dublincore.org/usage/terms/history/#Service-003"/>
|
||||
<dcam:memberOf rdf:resource="http://purl.org/dc/terms/DCMIType"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://purl.org/dc/dcmitype/Software">
|
||||
<rdfs:label xml:lang="en">Software</rdfs:label>
|
||||
<rdfs:comment xml:lang="en">A computer program in source or compiled form.</rdfs:comment>
|
||||
<dcterms:description xml:lang="en">Examples include a C source file, MS-Windows .exe executable, or Perl script.</dcterms:description>
|
||||
<rdfs:isDefinedBy rdf:resource="http://purl.org/dc/dcmitype/"/>
|
||||
<dcterms:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2000-07-11</dcterms:issued>
|
||||
<dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2008-01-14</dcterms:modified>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
|
||||
<dcterms:hasVersion rdf:resource="http://dublincore.org/usage/terms/history/#Software-003"/>
|
||||
<dcam:memberOf rdf:resource="http://purl.org/dc/terms/DCMIType"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://purl.org/dc/dcmitype/Sound">
|
||||
<rdfs:label xml:lang="en">Sound</rdfs:label>
|
||||
<rdfs:comment xml:lang="en">A resource primarily intended to be heard.</rdfs:comment>
|
||||
<dcterms:description xml:lang="en">Examples include a music playback file format, an audio compact disc, and recorded speech or sounds.</dcterms:description>
|
||||
<rdfs:isDefinedBy rdf:resource="http://purl.org/dc/dcmitype/"/>
|
||||
<dcterms:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2000-07-11</dcterms:issued>
|
||||
<dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2008-01-14</dcterms:modified>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
|
||||
<dcterms:hasVersion rdf:resource="http://dublincore.org/usage/terms/history/#Sound-003"/>
|
||||
<dcam:memberOf rdf:resource="http://purl.org/dc/terms/DCMIType"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://purl.org/dc/dcmitype/Text">
|
||||
<rdfs:label xml:lang="en">Text</rdfs:label>
|
||||
<rdfs:comment xml:lang="en">A resource consisting primarily of words for reading.</rdfs:comment>
|
||||
<dcterms:description xml:lang="en">Examples include books, letters, dissertations, poems, newspapers, articles, archives of mailing lists. Note that facsimiles or images of texts are still of the genre Text.</dcterms:description>
|
||||
<rdfs:isDefinedBy rdf:resource="http://purl.org/dc/dcmitype/"/>
|
||||
<dcterms:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2000-07-11</dcterms:issued>
|
||||
<dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2008-01-14</dcterms:modified>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
|
||||
<dcterms:hasVersion rdf:resource="http://dublincore.org/usage/terms/history/#Text-003"/>
|
||||
<dcam:memberOf rdf:resource="http://purl.org/dc/terms/DCMIType"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://purl.org/dc/dcmitype/PhysicalObject">
|
||||
<rdfs:label xml:lang="en">Physical Object</rdfs:label>
|
||||
<rdfs:comment xml:lang="en">An inanimate, three-dimensional object or substance.</rdfs:comment>
|
||||
<dcterms:description xml:lang="en">Note that digital representations of, or surrogates for, these objects should use Image, Text or one of the other types.</dcterms:description>
|
||||
<rdfs:isDefinedBy rdf:resource="http://purl.org/dc/dcmitype/"/>
|
||||
<dcterms:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2002-07-13</dcterms:issued>
|
||||
<dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2008-01-14</dcterms:modified>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
|
||||
<dcterms:hasVersion rdf:resource="http://dublincore.org/usage/terms/history/#PhysicalObject-003"/>
|
||||
<dcam:memberOf rdf:resource="http://purl.org/dc/terms/DCMIType"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://purl.org/dc/dcmitype/StillImage">
|
||||
<rdfs:label xml:lang="en">Still Image</rdfs:label>
|
||||
<rdfs:comment xml:lang="en">A static visual representation.</rdfs:comment>
|
||||
<dcterms:description xml:lang="en">Examples include paintings, drawings, graphic designs, plans and maps. Recommended best practice is to assign the type Text to images of textual materials. Instances of the type Still Image must also be describable as instances of the broader type Image.</dcterms:description>
|
||||
<rdfs:isDefinedBy rdf:resource="http://purl.org/dc/dcmitype/"/>
|
||||
<dcterms:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2003-11-18</dcterms:issued>
|
||||
<dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2008-01-14</dcterms:modified>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
|
||||
<dcterms:hasVersion rdf:resource="http://dublincore.org/usage/terms/history/#StillImage-003"/>
|
||||
<dcam:memberOf rdf:resource="http://purl.org/dc/terms/DCMIType"/>
|
||||
<rdfs:subClassOf rdf:resource="http://purl.org/dc/dcmitype/Image"/>
|
||||
</rdf:Description>
|
||||
<rdf:Description rdf:about="http://purl.org/dc/dcmitype/MovingImage">
|
||||
<rdfs:label xml:lang="en">Moving Image</rdfs:label>
|
||||
<rdfs:comment xml:lang="en">A series of visual representations imparting an impression of motion when shown in succession.</rdfs:comment>
|
||||
<dcterms:description xml:lang="en">Examples include animations, movies, television programs, videos, zoetropes, or visual output from a simulation. Instances of the type Moving Image must also be describable as instances of the broader type Image.</dcterms:description>
|
||||
<rdfs:isDefinedBy rdf:resource="http://purl.org/dc/dcmitype/"/>
|
||||
<dcterms:issued rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2003-11-18</dcterms:issued>
|
||||
<dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2008-01-14</dcterms:modified>
|
||||
<rdf:type rdf:resource="http://www.w3.org/2000/01/rdf-schema#Class"/>
|
||||
<dcterms:hasVersion rdf:resource="http://dublincore.org/usage/terms/history/#MovingImage-003"/>
|
||||
<dcam:memberOf rdf:resource="http://purl.org/dc/terms/DCMIType"/>
|
||||
<rdfs:subClassOf rdf:resource="http://purl.org/dc/dcmitype/Image"/>
|
||||
</rdf:Description>
|
||||
</rdf:RDF>
|
||||
429
data/ontology/oa.ttl
Normal file
|
|
@ -0,0 +1,429 @@
|
|||
@prefix acl: <http://www.w3.org/ns/auth/acl#> .
|
||||
@prefix as: <http://www.w3.org/ns/activitystreams#> .
|
||||
@prefix bibo: <http://purl.org/ontology/bibo/> .
|
||||
@prefix cnt: <http://www.w3.org/2011/content#> .
|
||||
@prefix dc: <http://purl.org/dc/elements/1.1/> .
|
||||
@prefix dcterms: <http://purl.org/dc/terms/> .
|
||||
@prefix dctypes: <http://purl.org/dc/dcmitype/> .
|
||||
@prefix exif: <http://www.w3.org/2003/12/exif/ns#> .
|
||||
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
|
||||
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
|
||||
@prefix gr: <http://purl.org/goodrelations/v1#> .
|
||||
@prefix iana: <http://www.iana.org/assignments/relation/> .
|
||||
@prefix iiif: <http://iiif.io/api/image/2#> .
|
||||
@prefix ldp: <http://www.w3.org/ns/ldp#> .
|
||||
@prefix oa: <http://www.w3.org/ns/oa#> .
|
||||
@prefix ore: <http://www.openarchives.org/ore/terms/> .
|
||||
@prefix owl: <http://www.w3.org/2002/07/owl#> .
|
||||
@prefix prov: <http://www.w3.org/ns/prov#> .
|
||||
@prefix pcdm: <http://pcdm.org/models#> .
|
||||
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
|
||||
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
|
||||
@prefix sc: <http://iiif.io/api/presentation/2#> .
|
||||
@prefix sioc: <http://rdfs.org/sioc/ns#> .
|
||||
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
|
||||
@prefix svcs: <http://rdfs.org/sioc/services#> .
|
||||
@prefix time: <http://www.w3.org/2006/time#> .
|
||||
@prefix trig: <http://www.w3.org/2004/03/trix/rdfg-1/> .
|
||||
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
|
||||
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
|
||||
|
||||
oa:Annotation a rdfs:Class ;
|
||||
rdfs:label "Annotation" ;
|
||||
rdfs:comment "The class for Web Annotations." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:Choice a rdfs:Class ;
|
||||
rdfs:label "Choice" ;
|
||||
rdfs:comment "A subClass of as:OrderedCollection that conveys to a consuming application that it should select one of the resources in the as:items list to use, rather than all of them. This is typically used to provide a choice of resources to render to the user, based on further supplied properties. If the consuming application cannot determine the user's preference, then it should use the first in the list." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:subClassOf as:OrderedCollection .
|
||||
|
||||
oa:CssSelector a rdfs:Class ;
|
||||
rdfs:label "CssSelector" ;
|
||||
rdfs:comment "A CssSelector describes a Segment of interest in a representation that conforms to the Document Object Model through the use of the CSS selector specification." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:subClassOf oa:Selector .
|
||||
|
||||
oa:CssStyle a rdfs:Class ;
|
||||
rdfs:label "CssStyle" ;
|
||||
rdfs:comment "A resource which describes styles for resources participating in the Annotation using CSS." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:subClassOf oa:Style .
|
||||
|
||||
oa:DataPositionSelector a rdfs:Class ;
|
||||
rdfs:label "DataPositionSelector" ;
|
||||
rdfs:comment "DataPositionSelector describes a range of data by recording the start and end positions of the selection in the stream. Position 0 would be immediately before the first byte, position 1 would be immediately before the second byte, and so on. The start byte is thus included in the list, but the end byte is not." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:subClassOf oa:Selector .
|
||||
|
||||
oa:Direction a rdfs:Class ;
|
||||
rdfs:label "Direction" ;
|
||||
rdfs:comment "A class to encapsulate the different text directions that a textual resource might take. It is not used directly in the Annotation Model, only its three instances." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:FragmentSelector a rdfs:Class ;
|
||||
rdfs:label "FragmentSelector" ;
|
||||
rdfs:comment "The FragmentSelector class is used to record the segment of a representation using the IRI fragment specification defined by the representation's media type." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:subClassOf oa:Selector .
|
||||
|
||||
oa:HttpRequestState a rdfs:Class ;
|
||||
rdfs:label "HttpRequestState" ;
|
||||
rdfs:comment "The HttpRequestState class is used to record the HTTP request headers that a client SHOULD use to request the correct representation from the resource. " ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:subClassOf oa:State .
|
||||
|
||||
oa:Motivation a rdfs:Class ;
|
||||
rdfs:label "Motivation" ;
|
||||
rdfs:comment "The Motivation class is used to record the user's intent or motivation for the creation of the Annotation, or the inclusion of the body or target, that it is associated with." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:subClassOf skos:Concept .
|
||||
|
||||
oa:RangeSelector a rdfs:Class ;
|
||||
rdfs:label "RangeSelector" ;
|
||||
rdfs:comment "A Range Selector can be used to identify the beginning and the end of the selection by using other Selectors. The selection consists of everything from the beginning of the starting selector through to the beginning of the ending selector, but not including it." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:subClassOf oa:Selector .
|
||||
|
||||
oa:ResourceSelection a rdfs:Class ;
|
||||
rdfs:label "ResourceSelection" ;
|
||||
rdfs:comment "Instances of the ResourceSelection class identify part (described by an oa:Selector) of another resource (referenced with oa:hasSource), possibly from a particular representation of a resource (described by an oa:State). Please note that ResourceSelection is not used directly in the Web Annotation model, but is provided as a separate class for further application profiles to use, separate from oa:SpecificResource which has many Annotation specific features." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:Selector a rdfs:Class ;
|
||||
rdfs:label "Selector" ;
|
||||
rdfs:comment "A resource which describes the segment of interest in a representation of a Source resource, indicated with oa:hasSelector from the Specific Resource. This class is not used directly in the Annotation model, only its subclasses." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:SpecificResource a rdfs:Class ;
|
||||
rdfs:label "SpecificResource" ;
|
||||
rdfs:comment "Instances of the SpecificResource class identify part of another resource (referenced with oa:hasSource), a particular representation of a resource, a resource with styling hints for renders, or any combination of these, as used within an Annotation." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:subClassOf oa:ResourceSelection .
|
||||
|
||||
oa:State a rdfs:Class ;
|
||||
rdfs:label "State" ;
|
||||
rdfs:comment "A State describes the intended state of a resource as applied to the particular Annotation, and thus provides the information needed to retrieve the correct representation of that resource." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:Style a rdfs:Class ;
|
||||
rdfs:label "Style" ;
|
||||
rdfs:comment "A Style describes the intended styling of a resource as applied to the particular Annotation, and thus provides the information to ensure that rendering is consistent across implementations." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:SvgSelector a rdfs:Class ;
|
||||
rdfs:label "SvgSelector" ;
|
||||
rdfs:comment "An SvgSelector defines an area through the use of the Scalable Vector Graphics [SVG] standard. This allows the user to select a non-rectangular area of the content, such as a circle or polygon by describing the region using SVG. The SVG may be either embedded within the Annotation or referenced as an External Resource." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:subClassOf oa:Selector .
|
||||
|
||||
oa:TextPositionSelector a rdfs:Class ;
|
||||
rdfs:label "TextPositionSelector" ;
|
||||
rdfs:comment "The TextPositionSelector describes a range of text by recording the start and end positions of the selection in the stream. Position 0 would be immediately before the first character, position 1 would be immediately before the second character, and so on." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:subClassOf oa:Selector .
|
||||
|
||||
oa:TextQuoteSelector a rdfs:Class ;
|
||||
rdfs:label "TextQuoteSelector" ;
|
||||
rdfs:comment "The TextQuoteSelector describes a range of text by copying it, and including some of the text immediately before (a prefix) and after (a suffix) it to distinguish between multiple copies of the same sequence of characters." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:subClassOf oa:Selector .
|
||||
|
||||
oa:TextualBody a rdfs:Class ;
|
||||
rdfs:label "TextualBody" ;
|
||||
rdfs:comment "" ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:TimeState a rdfs:Class ;
|
||||
rdfs:label "TimeState" ;
|
||||
rdfs:comment "A TimeState records the time at which the resource's state is appropriate for the Annotation, typically the time that the Annotation was created and/or a link to a persistent copy of the current version." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:subClassOf oa:State .
|
||||
|
||||
oa:XPathSelector a rdfs:Class ;
|
||||
rdfs:label "XPathSelector" ;
|
||||
rdfs:comment "An XPathSelector is used to select elements and content within a resource that supports the Document Object Model via a specified XPath value." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:subClassOf oa:Selector .
|
||||
|
||||
oa:PreferContainedDescriptions a rdfs:Resource ;
|
||||
rdfs:label "PreferContainedDescriptions" ;
|
||||
rdfs:comment "An IRI to signal the client prefers to receive full descriptions of the Annotations from a container, not just their IRIs." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:PreferContainedIRIs a rdfs:Resource ;
|
||||
rdfs:label "PreferContainedIRIs" ;
|
||||
rdfs:comment "An IRI to signal that the client prefers to receive only the IRIs of the Annotations from a container, not their full descriptions." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:annotationService a rdf:Property ;
|
||||
rdfs:label "annotationService" ;
|
||||
rdfs:comment """The object of the relationship is the end point of a service that conforms to the annotation-protocol, and it may be associated with any resource. The expectation of asserting the relationship is that the object is the preferred service for maintaining annotations about the subject resource, according to the publisher of the relationship.
|
||||
|
||||
This relationship is intended to be used both within Linked Data descriptions and as the rel type of a Link, via HTTP Link Headers rfc5988 for binary resources and in HTML <link> elements. For more information about these, please see the Annotation Protocol specification annotation-protocol.
|
||||
""" ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:assessing a oa:Motivation ;
|
||||
rdfs:label "assessing" ;
|
||||
rdfs:comment "The motivation for when the user intends to provide an assessment about the Target resource." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:bodyValue a rdf:Property ;
|
||||
rdfs:label "bodyValue" ;
|
||||
rdfs:comment """The object of the predicate is a plain text string to be used as the content of the body of the Annotation. The value MUST be an xsd:string and that data type MUST NOT be expressed in the serialization. Note that language MUST NOT be associated with the value either as a language tag, as that is only available for rdf:langString .
|
||||
""" ;
|
||||
rdfs:domain oa:Annotation ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range xsd:string .
|
||||
|
||||
oa:bookmarking a oa:Motivation ;
|
||||
rdfs:label "bookmarking" ;
|
||||
rdfs:comment "The motivation for when the user intends to create a bookmark to the Target or part thereof." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:cachedSource a rdf:Property ;
|
||||
rdfs:label "cachedSource" ;
|
||||
rdfs:comment "The object of the relationship is a copy of the Source resource's representation, appropriate for the Annotation." ;
|
||||
rdfs:domain oa:TimeState ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:canonical a rdf:Property ;
|
||||
rdfs:label "canonical" ;
|
||||
rdfs:comment "The object of the relationship is the canonical IRI that can always be used to deduplicate the Annotation, regardless of the current IRI used to access the representation." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:classifying a oa:Motivation ;
|
||||
rdfs:label "classifying" ;
|
||||
rdfs:comment "The motivation for when the user intends to classify the Target as something." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:commenting a oa:Motivation ;
|
||||
rdfs:label "commenting" ;
|
||||
rdfs:comment "The motivation for when the user intends to comment about the Target." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:describing a oa:Motivation ;
|
||||
rdfs:label "describing" ;
|
||||
rdfs:comment "The motivation for when the user intends to describe the Target, as opposed to commenting about it." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:editing a oa:Motivation ;
|
||||
rdfs:label "editing" ;
|
||||
rdfs:comment "The motivation for when the user intends to request a change or edit to the Target resource." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:end a rdf:Property ;
|
||||
rdfs:label "end" ;
|
||||
rdfs:comment "The end property is used to convey the 0-based index of the end position of a range of content." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range xsd:nonNegativeInteger .
|
||||
|
||||
oa:exact a rdf:Property ;
|
||||
rdfs:label "exact" ;
|
||||
rdfs:comment "The object of the predicate is a copy of the text which is being selected, after normalization." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range xsd:string .
|
||||
|
||||
oa:hasBody a rdf:Property ;
|
||||
rdfs:label "hasBody" ;
|
||||
rdfs:comment "The object of the relationship is a resource that is a body of the Annotation." ;
|
||||
rdfs:domain oa:Annotation ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:hasEndSelector a rdf:Property ;
|
||||
rdfs:label "hasEndSelector" ;
|
||||
rdfs:comment "The relationship between a RangeSelector and the Selector that describes the end position of the range. " ;
|
||||
rdfs:domain oa:RangeSelector ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range oa:Selector .
|
||||
|
||||
oa:hasPurpose a rdf:Property ;
|
||||
rdfs:label "hasPurpose" ;
|
||||
rdfs:comment "The purpose served by the resource in the Annotation." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range oa:Motivation .
|
||||
|
||||
oa:hasScope a rdf:Property ;
|
||||
rdfs:label "hasScope" ;
|
||||
rdfs:comment "The scope or context in which the resource is used within the Annotation." ;
|
||||
rdfs:domain oa:SpecificResource ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:hasSelector a rdf:Property ;
|
||||
rdfs:label "hasSelector" ;
|
||||
rdfs:comment "The object of the relationship is a Selector that describes the segment or region of interest within the source resource. Please note that the domain ( oa:ResourceSelection ) is not used directly in the Web Annotation model." ;
|
||||
rdfs:domain oa:ResourceSelection ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range oa:Selector .
|
||||
|
||||
oa:hasSource a rdf:Property ;
|
||||
rdfs:label "hasSource" ;
|
||||
rdfs:comment "The resource that the ResourceSelection, or its subclass SpecificResource, is refined from, or more specific than. Please note that the domain ( oa:ResourceSelection ) is not used directly in the Web Annotation model." ;
|
||||
rdfs:domain oa:ResourceSelection ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:hasStartSelector a rdf:Property ;
|
||||
rdfs:label "hasStartSelector" ;
|
||||
rdfs:comment "The relationship between a RangeSelector and the Selector that describes the start position of the range. " ;
|
||||
rdfs:domain oa:RangeSelector ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range oa:Selector .
|
||||
|
||||
oa:hasState a rdf:Property ;
|
||||
rdfs:label "hasState" ;
|
||||
rdfs:comment "The relationship between the ResourceSelection, or its subclass SpecificResource, and a State resource. Please note that the domain ( oa:ResourceSelection ) is not used directly in the Web Annotation model." ;
|
||||
rdfs:domain oa:ResourceSelection ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range oa:State .
|
||||
|
||||
oa:hasTarget a rdf:Property ;
|
||||
rdfs:label "hasTarget" ;
|
||||
rdfs:comment "The relationship between an Annotation and its Target." ;
|
||||
rdfs:domain oa:Annotation ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:highlighting a oa:Motivation ;
|
||||
rdfs:label "highlighting" ;
|
||||
rdfs:comment "The motivation for when the user intends to highlight the Target resource or segment of it." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:identifying a oa:Motivation ;
|
||||
rdfs:label "identifying" ;
|
||||
rdfs:comment "The motivation for when the user intends to assign an identity to the Target or identify what is being depicted or described in the Target." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:linking a oa:Motivation ;
|
||||
rdfs:label "linking" ;
|
||||
rdfs:comment "The motivation for when the user intends to link to a resource related to the Target." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:ltrDirection a oa:Direction ;
|
||||
rdfs:label "ltrDirection" ;
|
||||
rdfs:comment "The direction of text that is read from left to right." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:moderating a oa:Motivation ;
|
||||
rdfs:label "moderating" ;
|
||||
rdfs:comment "The motivation for when the user intends to assign some value or quality to the Target." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:motivatedBy a rdf:Property ;
|
||||
rdfs:label "motivatedBy" ;
|
||||
rdfs:comment "The relationship between an Annotation and a Motivation that describes the reason for the Annotation's creation." ;
|
||||
rdfs:domain oa:Annotation ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range oa:Motivation .
|
||||
|
||||
oa:prefix a rdf:Property ;
|
||||
rdfs:label "prefix" ;
|
||||
rdfs:comment "The object of the property is a snippet of content that occurs immediately before the content which is being selected by the Selector." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range xsd:string .
|
||||
|
||||
oa:processingLanguage a rdf:Property ;
|
||||
rdfs:label "processingLanguage" ;
|
||||
rdfs:comment "The object of the property is the language that should be used for textual processing algorithms when dealing with the content of the resource, including hyphenation, line breaking, which font to use for rendering and so forth. The value must follow the recommendations of BCP47." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range xsd:string .
|
||||
|
||||
oa:questioning a oa:Motivation ;
|
||||
rdfs:label "questioning" ;
|
||||
rdfs:comment "The motivation for when the user intends to ask a question about the Target." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:refinedBy a rdf:Property ;
|
||||
rdfs:label "refinedBy" ;
|
||||
rdfs:comment "The relationship between a Selector and another Selector or a State and a Selector or State that should be applied to the results of the first to refine the processing of the source resource. " ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:renderedVia a rdf:Property ;
|
||||
rdfs:label "renderedVia" ;
|
||||
rdfs:comment "A system that was used by the application that created the Annotation to render the resource." ;
|
||||
rdfs:domain oa:SpecificResource ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:replying a oa:Motivation ;
|
||||
rdfs:label "replying" ;
|
||||
rdfs:comment "The motivation for when the user intends to reply to a previous statement, either an Annotation or another resource." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:rtlDirection a oa:Direction ;
|
||||
rdfs:label "rtlDirection" ;
|
||||
rdfs:comment "The direction of text that is read from right to left." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:sourceDate a rdf:Property ;
|
||||
rdfs:label "sourceDate" ;
|
||||
rdfs:comment "The timestamp at which the Source resource should be interpreted as being applicable to the Annotation." ;
|
||||
rdfs:domain oa:TimeState ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range xsd:dateTime .
|
||||
|
||||
oa:sourceDateEnd a rdf:Property ;
|
||||
rdfs:label "sourceDateEnd" ;
|
||||
rdfs:comment "The end timestamp of the interval over which the Source resource should be interpreted as being applicable to the Annotation." ;
|
||||
rdfs:domain oa:TimeState ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range xsd:dateTime .
|
||||
|
||||
oa:sourceDateStart a rdf:Property ;
|
||||
rdfs:label "sourceDateStart" ;
|
||||
rdfs:comment "The start timestamp of the interval over which the Source resource should be interpreted as being applicable to the Annotation." ;
|
||||
rdfs:domain oa:TimeState ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range xsd:dateTime .
|
||||
|
||||
oa:start a rdf:Property ;
|
||||
rdfs:label "start" ;
|
||||
rdfs:comment "The start position in a 0-based index at which a range of content is selected from the data in the source resource." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range xsd:nonNegativeInteger .
|
||||
|
||||
oa:styleClass a rdf:Property ;
|
||||
rdfs:label "styleClass" ;
|
||||
rdfs:comment "The name of the class used in the CSS description referenced from the Annotation that should be applied to the Specific Resource." ;
|
||||
rdfs:domain oa:SpecificResource ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range xsd:string .
|
||||
|
||||
oa:styledBy a rdf:Property ;
|
||||
rdfs:label "styledBy" ;
|
||||
rdfs:comment "A reference to a Stylesheet that should be used to apply styles to the Annotation rendering." ;
|
||||
rdfs:domain oa:Annotation ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range oa:Style .
|
||||
|
||||
oa:suffix a rdf:Property ;
|
||||
rdfs:label "suffix" ;
|
||||
rdfs:comment "The snippet of text that occurs immediately after the text which is being selected." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range xsd:string .
|
||||
|
||||
oa:tagging a oa:Motivation ;
|
||||
rdfs:label "tagging" ;
|
||||
rdfs:comment "The motivation for when the user intends to associate a tag with the Target." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa:textDirection a rdf:Property ;
|
||||
rdfs:label "textDirection" ;
|
||||
rdfs:comment "The direction of the text of the subject resource. There MUST only be one text direction associated with any given resource." ;
|
||||
rdfs:isDefinedBy oa: ;
|
||||
rdfs:range oa:Direction .
|
||||
|
||||
oa:via a rdf:Property ;
|
||||
rdfs:label "via" ;
|
||||
rdfs:comment "The object of the relationship is a resource from which the source resource was retrieved by the providing system." ;
|
||||
rdfs:isDefinedBy oa: .
|
||||
|
||||
oa: a owl:Ontology ;
|
||||
dc:title "Web Annotation Ontology" ;
|
||||
dcterms:creator [a foaf:Person; foaf:name "Benjamin Young"],
|
||||
[a foaf:Person; foaf:name "Paolo Ciccarese"],
|
||||
[a foaf:Person; foaf:name "Robert Sanderson"] ;
|
||||
dcterms:modified "2016-11-12T21:28:11Z" ;
|
||||
rdfs:comment "The Web Annotation ontology defines the terms of the Web Annotation vocabulary. Any changes to this document MUST be from a Working Group in the W3C that has established expertise in the area." ;
|
||||
rdfs:seeAlso <http://www.w3.org/TR/annotation-vocab/> ;
|
||||
prov:wasRevisionOf <http://www.openannotation.org/spec/core/20130208/oa.owl> ;
|
||||
owl:versionInfo "2016-11-12T21:28:11Z" .
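The selector comments in oa.ttl above describe ranges by position (0 is immediately before the first character, and the end position is not included) and by quotation (an exact copy of the text plus an optional prefix and suffix to disambiguate repeated text). The following is a minimal Python sketch of those two ideas, not part of the ontology or of any annotation library. The function names `select_by_position` and `select_by_quote` are hypothetical, and the sketch assumes the text has already been normalized and that a plain first-match search is acceptable.

```python
from typing import Optional, Tuple


def select_by_position(text: str, start: int, end: int) -> str:
    """Return the characters in the half-open range [start, end).

    Mirrors the oa:start / oa:end description: position 0 is immediately
    before the first character, so the character at `start` is included
    and the character at `end` is not.
    """
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    return text[start:end]


def select_by_quote(
    text: str, exact: str, prefix: str = "", suffix: str = ""
) -> Optional[Tuple[int, int]]:
    """Locate `exact` using its surrounding context.

    Returns the (start, end) positions of the first occurrence of `exact`
    that is immediately preceded by `prefix` and followed by `suffix`,
    or None when no such occurrence exists.
    """
    hit = text.find(prefix + exact + suffix)
    if hit == -1:
        return None
    start = hit + len(prefix)
    return start, start + len(exact)


if __name__ == "__main__":
    sample = "the cat sat on the mat"
    print(select_by_position(sample, 4, 7))
    print(select_by_quote(sample, "the", prefix="on "))
```

A quote match can be turned back into a position range, so the two functions compose: pass the tuple returned by `select_by_quote` to `select_by_position` to recover the quoted text.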
|
||||
41011
data/ontology/schemaorg.owl
Normal file
File diff suppressed because it is too large
2445
defined_schema_terms.txt
Normal file
File diff suppressed because it is too large
|
|
@ -4,7 +4,7 @@
|
|||
"version": "0.0.0",
|
||||
"type": "module",
|
||||
"scripts": {
|
||||
"sync-schemas": "rsync -av --delete ../schemas/20251121/linkml/ public/schemas/20251121/linkml/",
|
||||
"sync-schemas": "rsync -av --delete --exclude=\"archive/\" ../schemas/20251121/linkml/ public/schemas/20251121/linkml/",
|
||||
"generate-manifest": "node scripts/generate-schema-manifest.cjs",
|
||||
"dev": "pnpm run sync-schemas && pnpm run generate-manifest && vite",
|
||||
"build": "pnpm run sync-schemas && pnpm run generate-manifest && tsc -b && vite build",
|
||||
|
|
|
|||
|
|
@ -43,12 +43,15 @@ imports:
|
|||
- modules/slots/has_appellation_type
|
||||
- modules/slots/has_appellation_value
|
||||
- modules/slots/has_or_had_arrangement_system
|
||||
- modules/slots/collection_description
|
||||
- modules/slots/collection_name
|
||||
- modules/slots/has_or_had_description
|
||||
- modules/slots/has_or_had_label
|
||||
# collection_description ARCHIVED (2026-01-18) - migrated to has_or_had_description (Rule 53)
|
||||
# collection_name ARCHIVED (2026-01-18) - migrated to has_or_had_label (Rule 53)
|
||||
# collection_scope ARCHIVED (2026-01-18) - migrated to has_or_had_scope + CollectionScope (Rule 53)
|
||||
- modules/slots/has_or_had_scope
|
||||
- modules/slots/collection_type
|
||||
- modules/slots/collections_under_responsibility
|
||||
# collections_under_responsibility ARCHIVED (2026-01-19) - migrated to is_or_was_responsible_for (Rule 53)
|
||||
- modules/slots/is_or_was_responsible_for
|
||||
- modules/slots/confidence_method
|
||||
- modules/slots/confidence_score
|
||||
- modules/slots/confidence_value
|
||||
|
|
@ -599,7 +602,7 @@ imports:
|
|||
- modules/slots/has_or_had_area_served
|
||||
- modules/slots/has_or_had_member_custodian
|
||||
- modules/slots/membership_criteria
|
||||
- modules/slots/community_engagement
|
||||
# community_engagement ARCHIVED 2026-01-19 - migrated to has_or_had_activity (imported above)
|
||||
- modules/slots/service_offering
|
||||
- modules/slots/record_type
|
||||
- modules/slots/society_focus
|
||||
|
|
|
|||
|
|
@ -6,8 +6,14 @@ prefixes:
|
|||
linkml: https://w3id.org/linkml/
|
||||
org: http://www.w3.org/ns/org#
|
||||
schema: http://schema.org/
|
||||
dcterms: http://purl.org/dc/terms/
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
crm: http://www.cidoc-crm.org/cidoc-crm/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
rdfs: http://www.w3.org/2000/01/rdf-schema#
|
||||
xsd: http://www.w3.org/2001/XMLSchema#
|
||||
imports:
|
||||
- linkml:types
|
||||
- linkml:types
|
||||
default_prefix: hc
|
||||
slots:
|
||||
has_or_had_admin_staff_count:
|
||||
|
|
@ -27,3 +33,5 @@ slots:
|
|||
custodian_types_primary: M
|
||||
specificity_score: 0.5
|
||||
specificity_rationale: Moderately specific slot.
|
||||
exact_mappings:
|
||||
- hc:hasOrHadAdminStaffCount
|
||||
|
|
@ -0,0 +1,37 @@
|
|||
id: https://nde.nl/ontology/hc/slot/has_or_had_admission_fee
|
||||
name: has_or_had_admission_fee_slot
|
||||
title: Has Or Had Admission Fee Slot
|
||||
prefixes:
|
||||
gr: http://purl.org/goodrelations/v1#
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
linkml: https://w3id.org/linkml/
|
||||
schema: http://schema.org/
|
||||
dcterms: http://purl.org/dc/terms/
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
crm: http://www.cidoc-crm.org/cidoc-crm/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
rdfs: http://www.w3.org/2000/01/rdf-schema#
|
||||
org: http://www.w3.org/ns/org#
|
||||
xsd: http://www.w3.org/2001/XMLSchema#
|
||||
imports:
|
||||
- linkml:types
|
||||
default_prefix: hc
|
||||
slots:
|
||||
has_or_had_admission_fee:
|
||||
description: "Admission fee charged by the institution. Temporal as fees change. A string describing the fee amount or structure (free, \u20AC10, \u20AC5-15, etc.)."
|
||||
range: string
|
||||
slot_uri: hc:hasOrHadAdmissionFee
|
||||
close_mappings:
|
||||
- schema:price
|
||||
- schema:priceRange
|
||||
related_mappings:
|
||||
- schema:offers
|
||||
- gr:hasPriceSpecification
|
||||
comments:
|
||||
- schema:offers links to Offer objects, not fee amounts directly. An admission fee is a specific price value, not an offer.
|
||||
annotations:
|
||||
custodian_types: '["*"]'
|
||||
custodian_types_rationale: Applicable to all heritage custodian types.
|
||||
custodian_types_primary: M
|
||||
specificity_score: 0.5
|
||||
specificity_rationale: Moderately specific slot.
|
||||
|
|
@ -6,8 +6,14 @@ prefixes:
|
|||
hc: https://nde.nl/ontology/hc/
|
||||
linkml: https://w3id.org/linkml/
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
schema: http://schema.org/
|
||||
crm: http://www.cidoc-crm.org/cidoc-crm/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
rdfs: http://www.w3.org/2000/01/rdf-schema#
|
||||
org: http://www.w3.org/ns/org#
|
||||
xsd: http://www.w3.org/2001/XMLSchema#
|
||||
imports:
|
||||
- linkml:types
|
||||
- linkml:types
|
||||
default_prefix: hc
|
||||
slots:
|
||||
has_or_had_assigned_processor:
|
||||
|
|
@ -0,0 +1,33 @@
|
|||
id: https://nde.nl/ontology/hc/slot/has_or_had_classification
|
||||
name: has_or_had_classification_slot
|
||||
title: has_or_had_classification slot
|
||||
description: "Generic temporal classification slot following RiC-O naming pattern. Used for various classification schemes (biological, organizational, etc.).\nReplaces bespoke classification slots per Rule 53/56: - bio_type_classification \u2192 has_or_had_classification (in OutdoorSite)"
|
||||
version: 1.0.0
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
schema: http://schema.org/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
dcterms: http://purl.org/dc/terms/
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
crm: http://www.cidoc-crm.org/cidoc-crm/
|
||||
rdfs: http://www.w3.org/2000/01/rdf-schema#
|
||||
org: http://www.w3.org/ns/org#
|
||||
xsd: http://www.w3.org/2001/XMLSchema#
|
||||
imports:
|
||||
- linkml:types
|
||||
default_prefix: hc
|
||||
slots:
|
||||
has_or_had_classification:
|
||||
slot_uri: schema:additionalType
|
||||
description: "Classification or categorization scheme value. Uses schema:additionalType for type classification compatibility.\nClasses narrow this slot's range via slot_usage to specific enum types: - OutdoorSite \u2192 BioCustodianTypeEnum (biological/botanical classification)"
|
||||
range: uriorcurie
|
||||
multivalued: true
|
||||
exact_mappings:
|
||||
- schema:additionalType
|
||||
close_mappings:
|
||||
- skos:Concept
|
||||
annotations:
|
||||
custodian_types:
|
||||
- '*'
|
||||
custodian_types_rationale: Universal utility concept
|
||||
|
|
@ -0,0 +1,45 @@
|
|||
id: https://nde.nl/ontology/hc/slot/has_or_had_comprehensive_overview
|
||||
name: has_or_had_comprehensive_overview_slot
|
||||
title: Has Or Had Comprehensive Overview Slot
|
||||
description: 'Generic slot for linking to comprehensive overview collections.
|
||||
|
||||
Follows RiC-O temporal naming convention to indicate the relationship may be current or historical.'
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
dcterms: http://purl.org/dc/terms/
|
||||
schema: http://schema.org/
|
||||
rico: https://www.ica.org/standards/RiC/ontology#
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
crm: http://www.cidoc-crm.org/cidoc-crm/
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
rdfs: http://www.w3.org/2000/01/rdf-schema#
|
||||
org: http://www.w3.org/ns/org#
|
||||
xsd: http://www.w3.org/2001/XMLSchema#
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../classes/Overview
|
||||
default_prefix: hc
|
||||
slots:
|
||||
has_or_had_comprehensive_overview:
|
||||
description: "Links an entity to a comprehensive overview collection of resources.\nFollows RiC-O temporal naming convention (`hasOrHad*`) to indicate the relationship may be current or historical.\n**USAGE**:\n```yaml finding_aid:\n has_or_had_comprehensive_overview:\n id: hc:overview/findingaid-links\n title: \"All Links\"\n includes_or_included:\n - url: https://example.org/link1\n link_text: \"Related Resource\"\n```\n**DESIGN RATIONALE**:\nThis is a GENERIC slot for linking to comprehensive collections of resources. Replaces domain-specific slots like `all_links` with a typed relationship to an `Overview` class.\n**MIGRATION NOTE** (2026-01-14):\nCreated as replacement for `all_links` slot. The new pattern: - Uses typed `Overview` class instead of untyped string list - Uses `includes_or_included` for WebLink composition - Enables richer metadata about link collections\n**ONTOLOGY ALIGNMENT**:\n- `dcterms:hasPart` - Dublin Core part-whole relationship - `schema:hasPart`\
|
||||
\ - Schema.org containment - `rico:hasOrHadPart` - RiC-O temporal containment"
|
||||
range: Overview
|
||||
multivalued: false
|
||||
inlined: true
|
||||
slot_uri: dcterms:hasPart
|
||||
exact_mappings:
|
||||
- dcterms:hasPart
|
||||
close_mappings:
|
||||
- schema:hasPart
|
||||
- rico:hasOrHadPart
|
||||
annotations:
|
||||
custodian_types: '["*"]'
|
||||
custodian_types_rationale: Comprehensive overviews applicable to all heritage custodian types.
|
||||
custodian_types_primary: A
|
||||
specificity_score: 0.35
|
||||
specificity_rationale: Low-moderate specificity - applicable across many contexts where comprehensive resource collections are needed.
|
||||
comments:
|
||||
- Replaces all_links slot
|
||||
- Uses Overview class for typed collection
|
||||
- Created from slot_fixes.yaml migration (2026-01-14)
|
||||
|
|
@ -0,0 +1,115 @@
|
|||
id: https://nde.nl/ontology/hc/slot/has_or_had_custodian_type
|
||||
name: has_or_had_custodian_type_slot
|
||||
title: Has Or Had Custodian Type Slot
|
||||
prefixes:
|
||||
linkml: https://w3id.org/linkml/
|
||||
hc: https://nde.nl/ontology/hc/
|
||||
org: http://www.w3.org/ns/org#
|
||||
rov: http://www.w3.org/ns/regorg#
|
||||
skos: http://www.w3.org/2004/02/skos/core#
|
||||
crm: http://www.cidoc-crm.org/cidoc-crm/
|
||||
schema: http://schema.org/
|
||||
dcterms: http://purl.org/dc/terms/
|
||||
prov: http://www.w3.org/ns/prov#
|
||||
rdfs: http://www.w3.org/2000/01/rdf-schema#
|
||||
xsd: http://www.w3.org/2001/XMLSchema#
|
||||
default_prefix: hc
|
||||
imports:
|
||||
- linkml:types
|
||||
- ../classes/CustodianType
|
||||
slots:
|
||||
has_or_had_custodian_type:
|
||||
slot_uri: org:classification
|
||||
description: "The organizational type classification(s) of a heritage custodian within\nthe GLAMORCUBESFIXPHDNT taxonomy.\n\n**Predicate Semantics**:\nThis slot uses org:classification as its primary predicate, which links\nan organization to its type classification(s) using SKOS concepts.\n\n**Temporal Semantics** (RiC-O Pattern):\nThe \"hasOrHad\" naming follows RiC-O convention indicating this relationship\nmay be historical - an institution may have changed type over time\n(e.g., a library becoming a museum, or a mixed institution).\n\n**Ontological Alignment**:\n- **Primary** (`slot_uri`): `org:classification` - W3C Organization Ontology\n predicate for organizational classification (range: skos:Concept)\n- **Close**: `rov:orgType` - Registered Organization Vocabulary predicate\n (subPropertyOf org:classification, for legal entity types like GmbH, Ltd)\n- **Related**: `crm:P2_has_type` - CIDOC-CRM predicate for typing entities\n (domain: E1_CRM_Entity, range: E55_Type)\n- **Related**:\
|
||||
\ `schema:additionalType` - Schema.org predicate for additional\n type classification beyond the primary @type\n- **Broad**: `dcterms:type` - Dublin Core predicate for resource type\n\n**Range**:\nValues are instances of `CustodianType` or its 19 subclasses:\n\n| Code | Subclass | Wikidata | Description |\n|------|--------------------------------|-----------|--------------------------------|\n| A | ArchiveOrganizationType | Q166118 | Archives |\n| B | BioCustodianType | Q167346 | Botanical gardens, zoos |\n| C | CommercialOrganizationType | Q6881511 | Corporate archives |\n| D | DigitalPlatformType | Q3565794 | Digital platforms |\n| E | EducationProviderType | Q3152824 | Educational institutions |\n| F | FeatureCustodianType | Q4989906 | Monuments, memorials |\n| G | GalleryType \
|
||||
\ | Q1007870 | Art galleries |\n| H | HolySacredSiteType | Q1370598 | Religious heritage sites |\n| I | IntangibleHeritageGroupType | Q59544 | Intangible heritage orgs |\n| L | LibraryType | Q7075 | Libraries |\n| M | MuseumType | Q33506 | Museums |\n| N | NonProfitType | Q163740 | NGOs, advocacy groups |\n| O | OfficialInstitutionType | Q2659904 | Government agencies |\n| P | PersonalCollectionType | Q2668072 | Private collections |\n| R | ResearchOrganizationType | Q31855 | Research institutes |\n| S | HeritageSocietyType | Q476068 | Historical societies |\n| T | TasteScentHeritageType | Q5765838 | Culinary/olfactory heritage |\n| U | UnspecifiedType | Q35120 | Unknown\
|
||||
\ type |\n| X | MixedCustodianType | Q35120 | Multiple types combined |\n\nEach CustodianType subclass provides:\n- Wikidata Q-number alignment (via schema:additionalType)\n- Multilingual labels (skos:prefLabel, skos:altLabel)\n- Hierarchical relationships (skos:broader, skos:narrower)\n- GHCID single-letter code derivation\n\n**Cardinality**:\nMultivalued - institutions may have multiple types (e.g., museum + archive).\nUse MixedCustodianType (X) for institutions with complex multi-type identity.\n"
|
||||
range: CustodianType
|
||||
required: false
|
||||
multivalued: true
|
||||
inlined_as_list: true
|
||||
exact_mappings:
|
||||
- org:classification
|
||||
close_mappings:
|
||||
- rov:orgType
|
||||
related_mappings:
|
||||
- crm:P2_has_type
|
||||
- schema:additionalType
|
||||
broad_mappings:
|
||||
- dcterms:type
|
||||
annotations:
|
||||
rico_naming_convention: 'Follows RiC-O "hasOrHad" pattern for temporal predicates.
|
||||
|
||||
See Rule 39: Slot Naming Convention (RiC-O Style)
|
||||
|
||||
'
|
||||
replaces_slots: custodian_type, custodian_types
|
||||
migration_date: '2026-01-09'
|
||||
predicate_clarification: 'slot_uri and mappings reference PREDICATES (properties), not classes.
|
||||
|
||||
- org:classification is a PREDICATE (links Organization to Concept)
|
||||
|
||||
- CustodianType is a CLASS (the range of valid values)
|
||||
|
||||
'
|
||||
range_note: 'Range is CustodianType (abstract class). Valid values are the 19
|
||||
|
||||
CustodianType subclasses defined in modules/classes/:
|
||||
|
||||
- ArchiveOrganizationType.yaml
|
||||
|
||||
- BioCustodianType.yaml
|
||||
|
||||
- CommercialOrganizationType.yaml
|
||||
|
||||
- DigitalPlatformType.yaml
|
||||
|
||||
- EducationProviderType.yaml
|
||||
|
||||
- FeatureCustodianType.yaml
|
||||
|
||||
- GalleryType.yaml
|
||||
|
||||
- HolySacredSiteType.yaml
|
||||
|
||||
- IntangibleHeritageGroupType.yaml
|
||||
|
||||
- LibraryType.yaml
|
||||
|
||||
- MuseumType.yaml
|
||||
|
||||
- NonProfitType.yaml (N)
|
||||
|
||||
- OfficialInstitutionType.yaml
|
||||
|
||||
- PersonalCollectionType.yaml
|
||||
|
||||
- ResearchOrganizationType.yaml
|
||||
|
||||
- HeritageSocietyType.yaml
|
||||
|
||||
- TasteScentHeritageType.yaml
|
||||
|
||||
- UnspecifiedType.yaml
|
||||
|
||||
- MixedCustodianType.yaml
|
||||
|
||||
'
|
||||
custodian_types:
|
||||
- '*'
|
||||
custodian_types_rationale: Universal utility concept
|
||||
comments:
|
||||
- Unified slot replacing custodian_type (singular) and custodian_types (plural)
|
||||
- slot_uri=org:classification is a PREDICATE, not a class
|
||||
- range=CustodianType is an ABSTRACT CLASS - valid values are its 19 subclasses
|
||||
- 'RiC-O naming: hasOrHad indicates potentially historical relationship'
|
||||
- 'Multivalued: institutions may have multiple type classifications'
|
||||
examples:
|
||||
- value: hc:MuseumType
|
||||
description: Art museum classification (M code)
|
||||
- value: hc:ArchiveOrganizationType
|
||||
description: Archive classification (A code)
|
||||
- value: '[hc:MuseumType, hc:ArchiveOrganizationType]'
|
||||
description: Mixed institution with both museum and archive functions
|
||||
- value: hc:MixedCustodianType
|
||||
description: Explicit mixed type when institution defies single categorization (X code)
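
Review note on `has_or_had_custodian_type`: the description's table pairs each single-letter code with a `CustodianType` subclass, and the `examples` use `hc:`-prefixed values. The following is a minimal sketch of that lookup, assuming the table is the source of truth. The names `CUSTODIAN_TYPE_BY_CODE` and `custodian_types_for` are hypothetical and do not come from the repository; only the codes and subclass names are copied from the diff.

```python
# Hypothetical helper, not part of the repository.
# Codes and subclass names are copied from the table in the slot description.
CUSTODIAN_TYPE_BY_CODE = {
    "A": "ArchiveOrganizationType",
    "B": "BioCustodianType",
    "C": "CommercialOrganizationType",
    "D": "DigitalPlatformType",
    "E": "EducationProviderType",
    "F": "FeatureCustodianType",
    "G": "GalleryType",
    "H": "HolySacredSiteType",
    "I": "IntangibleHeritageGroupType",
    "L": "LibraryType",
    "M": "MuseumType",
    "N": "NonProfitType",
    "O": "OfficialInstitutionType",
    "P": "PersonalCollectionType",
    "R": "ResearchOrganizationType",
    "S": "HeritageSocietyType",
    "T": "TasteScentHeritageType",
    "U": "UnspecifiedType",
    "X": "MixedCustodianType",
}


def custodian_types_for(codes: str) -> list[str]:
    """Map a string of single-letter codes to hc: CURIEs.

    Order is preserved and repeated codes are collapsed, since the slot is
    multivalued and a repeated type adds no information.
    """
    curies: list[str] = []
    for code in codes.upper():
        if code not in CUSTODIAN_TYPE_BY_CODE:
            raise ValueError(f"unknown custodian type code: {code!r}")
        curie = f"hc:{CUSTODIAN_TYPE_BY_CODE[code]}"
        if curie not in curies:
            curies.append(curie)
    return curies


if __name__ == "__main__":
    # A museum that also acts as an archive, as in the third slot example.
    print(custodian_types_for("MA"))
```

Under these assumptions, `custodian_types_for("MA")` yields the same pair as the "mixed institution" example in the slot definition. Whether such an institution should carry both types or `hc:MixedCustodianType` remains a modelling decision that the sketch does not settle.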
|
||||
Some files were not shown because too many files have changed in this diff.