fix: correct hallucinated PREMIS terms and Schema.org namespace mismatch

PREMIS ontology fixes (8 schema files):
- Replace invalid premis:hasRepresentation with dcterms:hasFormat
- Replace invalid premis:hasAccessRestriction with odrl:hasPolicy
- Replace invalid premis:hasPreservationPolicy with dcterms:conformsTo
- Replace invalid premis:hasAccessPolicy with dcterms:accessRights
- Replace invalid premis:hasStoragePolicy with dcterms:conformsTo
- Replace invalid premis:ProcessingStatus with skos:Concept
- Add proper close_mappings to valid PREMIS classes (premis:Representation, etc.)
- Document the hallucinated terms in Rule 51 (AGENTS.md) to prevent recurrence
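
A check along these lines could catch invalid PREMIS terms before they reach a schema file. This is a minimal sketch, not part of the commit: `KNOWN_PREMIS_TERMS` and `findUnknownPremisTerms` are hypothetical names, and the allowlist only contains terms that this commit itself uses as replacements. A real check would presumably load the term list from the PREMIS OWL file instead.

```typescript
// Hypothetical allowlist: only PREMIS terms that this commit treats as valid.
// It is NOT a complete enumeration of the PREMIS 3 ontology.
const KNOWN_PREMIS_TERMS: ReadonlySet<string> = new Set([
  'premis:Agent',
  'premis:Event',
  'premis:Representation',
  'premis:RightsStatus',
  'premis:StorageLocation',
  'premis:characteristic',
  'premis:endDate',
  'premis:note',
  'premis:relationship',
  'premis:rightsStatus',
]);

// Return every premis: CURIE in the text that is not in the allowlist,
// de-duplicated and sorted. Prefix declarations such as
// "premis: http://www.loc.gov/premis/rdf/v3/" are ignored because the
// pattern requires a letter directly after the colon.
function findUnknownPremisTerms(
  text: string,
  known: ReadonlySet<string> = KNOWN_PREMIS_TERMS,
): string[] {
  const matches = text.match(/\bpremis:[A-Za-z][A-Za-z0-9_]*/g) ?? [];
  const unknown = new Set(matches.filter((term) => !known.has(term)));
  return Array.from(unknown).sort();
}

// Sample input mixing two terms this commit removes with one it keeps.
const sample = [
  'class_uri: premis:PreservationEvent',
  'close_mappings:',
  '  - premis:RightsStatus',
  'slot_uri: premis:hasRepresentation',
].join('\n');

console.log(findUnknownPremisTerms(sample));
// [ 'premis:PreservationEvent', 'premis:hasRepresentation' ]
```

An allowlist keeps the check independent of any RDF parser. The trade-off is that the list has to be maintained by hand whenever a new valid term is adopted.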

Schema.org namespace fixes (3 frontend files):
- Update OntologyTermPopup.tsx: add normalizeSchemaOrgUri() function
- Update ontology-loader.ts: change schema prefix to https://schema.org/
- Update linkml-schema-service.ts: change schema prefix to https://schema.org/
- Rationale: schemaorg.owl declares terms under https://schema.org/, while the code expanded the schema prefix to http://schema.org/, so expanded URIs did not match

These changes ensure ontology term lookups work correctly for Schema.org
terms and that LinkML schema files only reference valid ontology predicates.
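
The namespace mismatch can be illustrated with a small sketch. The `expandCurie` below is a simplified stand-in for the frontend helper of the same name, not its actual implementation, and `ontologyUris` is a hypothetical one-element list standing in for the parsed schemaorg.owl terms. `normalizeSchemaOrgUri` follows the behavior of the function this commit adds.

```typescript
// Simplified stand-in for CURIE expansion (assumption: prefix:local form).
function expandCurie(curie: string, prefixes: Record<string, string>): string {
  const idx = curie.indexOf(':');
  if (idx < 0) return curie;
  const base = prefixes[curie.slice(0, idx)];
  return base ? base + curie.slice(idx + 1) : curie;
}

// Rewrite http://schema.org/ URIs to https://schema.org/; leave others alone.
function normalizeSchemaOrgUri(uri: string): string {
  const httpBase = 'http://schema.org/';
  return uri.startsWith(httpBase)
    ? 'https://schema.org/' + uri.slice(httpBase.length)
    : uri;
}

// Hypothetical stand-in for the term URIs parsed from schemaorg.owl.
const ontologyUris = ['https://schema.org/Place'];
const hasTerm = (uri: string): boolean => ontologyUris.some((u) => u === uri);

// Old prefix map: the expanded URI uses http:// and misses the https:// term.
const before = expandCurie('schema:Place', { schema: 'http://schema.org/' });
// New prefix map: the expanded URI matches directly.
const after = expandCurie('schema:Place', { schema: 'https://schema.org/' });

console.log(hasTerm(before)); // false
console.log(hasTerm(after)); // true
console.log(hasTerm(normalizeSchemaOrgUri(before))); // true
```

Changing the prefix fixes CURIEs expanded by the frontend. The normalization step additionally covers full http://schema.org/ URIs that arrive from elsewhere.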
kempersc 2026-01-13 14:16:33 +01:00
parent caa2690ba4
commit f2b10fca19
21 changed files with 94 additions and 52 deletions

@ -218,13 +218,13 @@ instances:
scopeNote: >-
May include conservation treatment (paper repair, deacidification),
rehousing into archival containers, digitization for preservation/access,
or environmental stabilization. PREMIS premis:PreservationEvent.
or environmental stabilization. PREMIS premis:Event.
example:
- Paper documents undergoing deacidification
- Photographs being rehoused in archival sleeves
- Fragile documents being digitized for access
ontology:
class_uri: premis:PreservationEvent
class_uri: premis:Event
typical_duration: "Days to years for major conservation"
treatment_types:
- Paper repair
@ -416,7 +416,7 @@ usage:
Color-code by stage for visual processing pipeline.
rdf_generation: >-
Map to rico:Activity for RDF serialization.
Use premis:PreservationEvent for IN_PRESERVATION stage.
Use premis:Event for IN_PRESERVATION stage.
workflow_automation: >-
Enable status transitions in workflow management systems.
Track timestamps for processing metrics.

@ -87,7 +87,7 @@ instances:
note: "vault (architecture)"
ontology_mappings:
schema_org: schema:Place
premis: premis:storageLocation
premis: premis:StorageLocation
environmental_requirements:
temperature: "Controlled (varies by material)"
humidity: "45-55% RH"

@ -1,5 +1,5 @@
{
"generated": "2026-01-13T12:50:30.701Z",
"generated": "2026-01-13T13:09:16.459Z",
"schemaRoot": "/schemas/20251121/linkml",
"totalFiles": 2886,
"categoryCounts": {

@ -40,13 +40,13 @@ imports:
- ../slots/has_appointment_required_flag
classes:
AccessPolicy:
class_uri: premis:RightsDeclaration
class_uri: premis:RightsStatus
description: "Access policy defining conditions under which heritage collections can be accessed.\n\n**PURPOSE**:\n\n\
AccessPolicy captures the access conditions governing a Collection:\n- WHO can access (public, researchers, staff only)\n\
- HOW access is granted (open, by appointment, with credentials)\n- WHEN access is available (opening hours, embargo\
\ periods)\n- WHAT restrictions apply (fragile materials, privacy, cultural sensitivity)\n\n1. **PREMIS**:\n - `premis:RightsDeclaration`\
\ - \"An assertion of one or more rights or\n permissions pertaining to an object and/or its content.\"\n - Links\
\ to Collection via premis:hasRightsDeclaration\n\n2. **Dublin Core**:\n - `dcterms:accessRights` - \"Information\
\ periods)\n- WHAT restrictions apply (fragile materials, privacy, cultural sensitivity)\n\n1. **PREMIS**:\n - `premis:RightsStatus`\
\ - \"Information about the rights status of an object.\"\n - Links\
\ to Collection via premis:rightsStatus\n\n2. **Dublin Core**:\n - `dcterms:accessRights` - \"Information\
\ about who may access the resource\n or an indication of its security status.\"\n\n3. **RiC-O**:\n - `rico:hasOrHadAllMembersWithAccessConditions`\
\ - Links RecordSet to access\n conditions applying to all members\n\n4. **RightsStatements.org**:\n - Standardized\
\ rights statements for cultural heritage\n - E.g., \"In Copyright\", \"No Copyright\", \"Unknown Copyright\"\n\n\
@ -75,7 +75,7 @@ classes:
\ not current access\n- Access restricted until triggering conditions (time, event)\n- \"Gray literature\" or un-catalogued\
\ backlogs awaiting processing\n"
exact_mappings:
- premis:RightsDeclaration
- premis:RightsStatus
- dcterms:accessRights
close_mappings:
- rico:Rule
@ -236,7 +236,7 @@ classes:
- Temporal validity enables policy versioning and embargo expiration
- DimArchive (dark archive) uses AccessPolicy to express preservation-only access
see_also:
- http://www.loc.gov/premis/rdf/v3/RightsDeclaration
- http://www.loc.gov/premis/rdf/v3/RightsStatus
- https://rightsstatements.org/
- https://localcontexts.org/
- https://www.ica.org/standards/RiC/ontology#Rule

@ -73,7 +73,7 @@ classes:
- rico:RecordSet
- bf:Collection
related_mappings:
- premis:hasRepresentation
- premis:relationship
- dcterms:hasPart
slots:
- has_or_had_access_right

@ -58,10 +58,10 @@ classes:
\ granted\n- Declassification decision\n- Original system failure (disaster recovery)\n\n**MULTILINGUAL LABELS**:\n\
- Dark Archive (de) [uses English term]\n\n**RELATED TYPES**:\n- LightArchive (Q112815447) - broadly accessible\n- DimArchive\
\ (Q112796779) - limited access\n- ClosedSpace - physical restricted access areas\n\n**ONTOLOGICAL ALIGNMENT**:\n- **SKOS**:\
\ skos:Concept (type classification)\n- **PREMIS**: premis:RightsDeclaration for access restrictions\n- **RiC-O**: rico:RecordSet\
\ skos:Concept (type classification)\n- **PREMIS**: premis:RightsStatus for access restrictions\n- **RiC-O**: rico:RecordSet\
\ with access restrictions\n- **Wikidata**: Q112796578\n\n**PREMIS INTEGRATION**:\n\nDark archives typically use PREMIS\
\ for preservation metadata:\n- `premis:rightsGranted` with no access rights\n- `premis:rightsEndDate` for embargo expiration\n\
- `premis:linkingAgentIdentifier` for custodian\n"
\ for preservation metadata:\n- `premis:RightsStatus` to document access restrictions\n- `premis:endDate` for embargo expiration dates\n\
- `premis:Agent` to identify the responsible custodian\n"
slot_usage:
wikidata_entity:
equals_string: Q112796578
@ -92,7 +92,7 @@ classes:
exact_mappings:
- wd:Q112796578
close_mappings:
- premis:RightsDeclaration
- premis:RightsStatus
- rico:RecordSet
- skos:Concept
broad_mappings:

@ -98,7 +98,7 @@ classes:
exact_mappings:
- wd:Q112796779
close_mappings:
- premis:RightsDeclaration
- premis:RightsStatus
- rico:RecordSet
- skos:Concept
broad_mappings:

@ -166,7 +166,7 @@ classes:
'
range: boolean
technical_metadata_standard:
slot_uri: premis:hasObjectCharacteristics
slot_uri: premis:characteristic
description: 'Standard used for technical metadata.

@ -121,8 +121,8 @@ enums:
**Duration**: Days to years for major conservation
**PREMIS**: premis:PreservationEvent
meaning: premis:PreservationEvent
**PREMIS**: premis:Event (preservation activity)
meaning: premis:Event
PROCESSED_PENDING_TRANSFER:
description: |
@ -203,11 +203,11 @@ enums:
- "Tracks operational archive lifecycle BEFORE integration into CustodianCollection"
- "Processing backlogs commonly span decades in archival institutions"
- "RiC-O rico:Activity used for processing activities"
- "PREMIS premis:PreservationEvent for preservation activities"
- "PREMIS premis:Event for preservation activities"
- "Status changes should be tracked with timestamps for metrics"
see_also:
- "https://www.ica.org/standards/RiC/ontology#Activity"
- "http://www.loc.gov/premis/rdf/v3/PreservationEvent"
- "http://www.loc.gov/premis/rdf/v3/Event"
- "https://nde.nl/ontology/hc/class/CustodianArchive"
- "https://nde.nl/ontology/hc/class/CustodianCollection"

@ -4,22 +4,22 @@ title: Digital Surrogates Slot
prefixes:
linkml: https://w3id.org/linkml/
hc: https://nde.nl/ontology/hc/
dcterms: http://purl.org/dc/terms/
premis: http://www.loc.gov/premis/rdf/v3/
imports:
- linkml:types
default_prefix: hc
slots:
digital_surrogate:
slot_uri: premis:hasRepresentation
slot_uri: dcterms:hasFormat
range: string
multivalued: true
description: 'Digital representations/surrogates of collection materials.
PREMIS: hasRepresentation links intellectual entities to digital representations.
Dublin Core: hasFormat links resources to alternative format representations.
A premis:Representation is a digital object that instantiates or renders an
IntellectualEntity (the conceptual work).
PREMIS: premis:Representation class models digital instantiations of intellectual entities.
Values can be:
@ -40,3 +40,12 @@ slots:
- "https://archive.org/details/manuscript-collection-x"
'
exact_mappings:
- dcterms:hasFormat
close_mappings:
- premis:Representation
annotations:
custodian_types: '["*"]'
custodian_types_rationale: Applicable to all heritage custodian types.
specificity_score: 0.6
specificity_rationale: Digital surrogate links are moderately specific to digitized collections.

@ -1,15 +1,23 @@
id: https://nde.nl/ontology/hc/slot/digitization_status
name: digitization_status_slot
prefixes:
linkml: https://w3id.org/linkml/
hc: https://nde.nl/ontology/hc/
premis: http://www.loc.gov/premis/rdf/v3/
adms: http://www.w3.org/ns/adms#
imports:
- linkml:types
default_prefix: hc
slots:
digitization_status:
slot_uri: premis:hasRelatedStatementInformation
slot_uri: adms:status
range: string
description: 'Current status of collection digitization efforts.
PREMIS: hasRelatedStatementInformation for preservation metadata.
ADMS: status provides standardized vocabulary for asset lifecycle states.
PREMIS: premis:note can capture additional preservation metadata.
Values:
@ -36,5 +44,12 @@ slots:
- "IN_PROGRESS - expected completion 2026"
'
broad_mappings:
exact_mappings:
- adms:status
close_mappings:
- premis:note
annotations:
custodian_types: '["*"]'
custodian_types_rationale: Applicable to all heritage custodian types.
specificity_score: 0.6
specificity_rationale: Digitization status is moderately specific to digital preservation contexts.

@ -24,7 +24,7 @@ slots:
- rico:hasOrHadAllMembersWithContentType
- schema:conditionsOfAccess
related_mappings:
- premis:hasRightsStatement
- premis:rightsStatus
broad_mappings:
- dcterms:rights
annotations:

@ -35,12 +35,12 @@ slots:
range: string
slot_uri: hc:hasOrHadAccessControl
close_mappings:
- premis:hasRightsStatement
- premis:rightsStatus
- dcterms:accessRights
related_mappings:
- schema:conditionsOfAccess
comments:
- premis:hasRightsStatement links to rights statement objects; this slot describes the control mechanisms themselves.
- premis:rightsStatus links to rights status objects; this slot describes the control mechanisms themselves.
Related but not semantically identical.
annotations:
custodian_types: '["*"]'

@ -53,7 +53,7 @@ slots:
close_mappings:
- schema:conditionsOfAccess
related_mappings:
- premis:hasRightsStatement
- premis:rightsStatus
annotations:
custodian_types: '["*"]'
custodian_types_rationale: Applicable to all heritage custodian types.

@ -54,7 +54,7 @@ slots:
close_mappings:
- schema:conditionsOfAccess
related_mappings:
- premis:hasRightsStatement
- premis:rightsStatus
- schema:publishingPrinciples
comments:
- schema:conditionsOfAccess is semantically closer (conditions for access) than schema:publishingPrinciples (editorial

@ -18,13 +18,13 @@ slots:
Links to AccessPolicy class defining access conditions.
PREMIS: hasRightsDeclaration for access rights.
PREMIS: rightsStatus for access rights.
'
range: uri
slot_uri: premis:hasRightsDeclaration
slot_uri: premis:rightsStatus
exact_mappings:
- premis:hasRightsDeclaration
- premis:rightsStatus
close_mappings:
- dcterms:references
- schema:citation

@ -49,7 +49,7 @@ slots:
slot_uri: hc:hasOrHadAccessRestriction
close_mappings:
- dcterms:accessRights
- premis:hasRightsStatement
- premis:rightsStatus
related_mappings:
- rico:hasOrHadAllMembersWithContentType
comments:

@ -34,7 +34,7 @@ slots:
close_mappings:
- schema:conditionsOfAccess
related_mappings:
- premis:hasRightsStatement
- premis:rightsStatus
- edm:rights
broad_mappings:
- dcterms:rights

@ -41,8 +41,8 @@ const STANDARD_PREFIXES: Record<string, string> = {
'dcterms': 'http://purl.org/dc/terms/',
'dct': 'http://purl.org/dc/terms/',
// Schema.org (support both http and https)
'schema': 'http://schema.org/',
// Schema.org - use https to match schemaorg.owl which uses https://schema.org/
'schema': 'https://schema.org/',
// Domain ontologies
'rico': 'https://www.ica.org/standards/RiC/ontology#',
@ -155,6 +155,17 @@ function expandCurie(curie: string): string {
return curie;
}
/**
* Normalize Schema.org URIs to use https:// consistently
* The schemaorg.owl file uses https://schema.org/ but some references may use http://
*/
function normalizeSchemaOrgUri(uri: string): string {
if (uri.startsWith('http://schema.org/')) {
return uri.replace('http://schema.org/', 'https://schema.org/');
}
return uri;
}
/**
* Get the prefix from a CURIE
*/
@ -217,8 +228,9 @@ async function loadTermInfo(curie: string): Promise<TermInfo | null> {
return fetchWikidataEntity(curie);
}
// Expand the CURIE to full URI
const fullUri = expandCurie(curie);
// Expand the CURIE to full URI and normalize Schema.org URIs
let fullUri = expandCurie(curie);
fullUri = normalizeSchemaOrgUri(fullUri);
// Find the ontology file that contains this URI
const ontologyFile = getOntologyFileByUri(fullUri);
@ -239,8 +251,10 @@ async function loadTermInfo(curie: string): Promise<TermInfo | null> {
// Load and parse the ontology
const ontology = await loadOntology(ontologyFile.path);
// Search for the term in classes
const classMatch = ontology.classes.find(c => c.uri === fullUri);
// Search for the term in classes (normalize both sides for Schema.org)
const classMatch = ontology.classes.find(c =>
normalizeSchemaOrgUri(c.uri) === fullUri
);
if (classMatch) {
return {
uri: fullUri,
@ -255,8 +269,10 @@ async function loadTermInfo(curie: string): Promise<TermInfo | null> {
};
}
// Search for the term in properties
const propMatch = ontology.properties.find(p => p.uri === fullUri);
// Search for the term in properties (normalize both sides for Schema.org)
const propMatch = ontology.properties.find(p =>
normalizeSchemaOrgUri(p.uri) === fullUri
);
if (propMatch) {
return {
uri: fullUri,
@ -274,8 +290,10 @@ async function loadTermInfo(curie: string): Promise<TermInfo | null> {
};
}
// Search for the term in individuals
const indMatch = ontology.individuals.find(i => i.uri === fullUri);
// Search for the term in individuals (normalize both sides for Schema.org)
const indMatch = ontology.individuals.find(i =>
normalizeSchemaOrgUri(i.uri) === fullUri
);
if (indMatch) {
return {
uri: fullUri,

@ -281,7 +281,7 @@ const COMMON_PREFIXES: Record<string, string> = {
org: 'http://www.w3.org/ns/org#',
prov: 'http://www.w3.org/ns/prov#',
pico: 'https://personsincontext.org/model#',
schema: 'http://schema.org/',
schema: 'https://schema.org/',
foaf: 'http://xmlns.com/foaf/0.1/',
rico: 'https://www.ica.org/standards/RiC/ontology#',
tooi: 'https://identifier.overheid.nl/tooi/def/ont/',

@ -177,7 +177,7 @@ const NAMESPACES: Record<string, string> = {
dcterms: 'http://purl.org/dc/terms/',
skos: 'http://www.w3.org/2004/02/skos/core#',
foaf: 'http://xmlns.com/foaf/0.1/',
schema: 'http://schema.org/',
schema: 'https://schema.org/',
prov: 'http://www.w3.org/ns/prov#',
org: 'http://www.w3.org/ns/org#',
};