- Updated `entity_review.py` to map email semantic fields from JSON.
- Expanded `email_semantics.py` with additional museum mappings.
- Introduced a new rule in `.opencode/rules/no-duplicate-ontology-mappings.md` to prevent duplicate ontology mappings.
- Added a backup JSON file for entity resolution candidates.
- Created `enrich_email_semantics.py` to enrich candidates with email semantic signals.
- Developed `merge_entity_reviews.py` to merge reviewed decisions from a backup into new candidates.
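
A minimal sketch of the merge step, assuming each candidate is a dict with a stable `id` and that a reviewed candidate carries a non-null `decision`. Both field names and the `merge_reviews` helper are hypothetical and are not taken from `merge_entity_reviews.py`:

```python
from typing import Dict, List


def merge_reviews(backup: List[Dict], fresh: List[Dict], key: str = "id") -> List[Dict]:
    """Copy earlier review decisions from a backup onto regenerated candidates."""
    # Index only the backup entries that were actually reviewed.
    reviewed = {c[key]: c for c in backup if c.get("decision") is not None}
    merged = []
    for cand in fresh:
        prior = reviewed.get(cand[key])
        if prior is not None:
            # Keep the fresh candidate data, carry over the human decision.
            merged.append({**cand, "decision": prior["decision"]})
        else:
            merged.append(dict(cand))
    return merged
```

Candidates that were never reviewed, or that are new in the regenerated file, pass through unchanged.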
The VCard ontology file (and 3 others) uses the @base directive with relative
URIs like <#Address>. The Turtle parser was neither extracting @base nor
resolving relative URIs against it.
Changes:
- Extract @base directive in first pass alongside @prefix
- Add baseUri parameter to expandUri() function
- Handle relative URIs starting with # (resolve against base)
- Handle empty relative URI <> (returns base URI itself)
- Pass baseUri through to processSubject() function
This fixes the 'Term not found' error for vcard:Address and similar terms
that use relative URI notation in their ontology definitions.
Affected ontologies: vcard.rdf, prov.ttl, era_ontology.ttl, ebg-ontology.ttl
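
The resolution logic described above can be sketched in Python as follows. This is an illustrative port, not the project's parser: `first_pass` and `expand_uri` are hypothetical stand-ins for the first pass and expandUri(), the regular expressions are simplified, and only the two relative forms named in this commit (`<#...>` and `<>`) are handled. Full relative reference resolution per RFC 3986 is out of scope.

```python
import re
from typing import Dict, Optional, Tuple

BASE_RE = re.compile(r"@base\s+<([^>]*)>\s*\.")
PREFIX_RE = re.compile(r"@prefix\s+([\w-]*):\s*<([^>]*)>\s*\.")


def first_pass(turtle: str) -> Tuple[Optional[str], Dict[str, str]]:
    """Collect the @base URI (if any) and all @prefix declarations."""
    match = BASE_RE.search(turtle)
    base_uri = match.group(1) if match else None
    prefixes = {name: uri for name, uri in PREFIX_RE.findall(turtle)}
    return base_uri, prefixes


def expand_uri(term: str, prefixes: Dict[str, str], base_uri: Optional[str] = None) -> str:
    """Expand a Turtle term to an absolute URI where possible."""
    if term.startswith("<") and term.endswith(">"):
        ref = term[1:-1]
        if ref == "" and base_uri is not None:
            # Empty relative URI <> denotes the base URI itself.
            return base_uri
        if ref.startswith("#") and base_uri is not None:
            # Fragment-only reference: append to the base without its own fragment.
            return base_uri.split("#", 1)[0] + ref
        return ref
    prefix, sep, local = term.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return term
```

With `@base <http://www.w3.org/2006/vcard/ns>`, the term `<#Address>` expands to `http://www.w3.org/2006/vcard/ns#Address`, which is the form a prefixed lookup such as vcard:Address needs to match.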
- Introduced SoundArchiveRecordSetType, SpecialCollectionRecordSetType, SpecializedArchiveRecordSetType, SpecializedArchivesCzechiaRecordSetType, StateArchivesRecordSetType, StateArchivesSectionRecordSetType, StateDistrictArchiveRecordSetType, StateRegionalArchiveCzechiaRecordSetType, TelevisionArchiveRecordSetType, TradeUnionArchiveRecordSetType, UniversityArchiveRecordSetType, VereinsarchivRecordSetType, VerlagsarchivRecordSetType, VerwaltungsarchivRecordSetType, WebArchiveRecordSetType, and WomensArchivesRecordSetType.
- Each new type includes appropriate metadata, slots, and relationships to existing classes.
- Implemented a script to detect and fix Type class violations in LinkML files.
- Remove inline slot definitions from 144 class files
- Create 7 new centralized slot files in modules/slots/:
  - custodian_type_broader.yaml
  - custodian_type_narrower.yaml
  - custodian_type_related.yaml
  - definition.yaml
  - finding_aid_access_restriction.yaml
  - finding_aid_description.yaml
  - finding_aid_temporal_coverage.yaml
- Add centralize_inline_slots.py automation script
- Update manifest with new timestamp
Rule 48: Class files must NOT define inline slots - all slots
must be imported from modules/slots/ directory.
Note: Pre-existing IdentifierFormat duplicate class definition
(in Standard.yaml and IdentifierFormat.yaml) not addressed in
this commit - requires separate schema refactor.
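
A rough sketch of the centralization step behind Rule 48, operating on already-parsed YAML as plain dicts. It assumes "inline" means a top-level `slots:` mapping inside a class file and that centralized slots are imported as `../slots/<name>`. Both assumptions, and the function name, are illustrative and may differ from centralize_inline_slots.py:

```python
import copy
from typing import Dict, Tuple


def centralize_inline_slots(class_doc: Dict, import_prefix: str = "../slots/") -> Tuple[Dict, Dict[str, Dict]]:
    """Move inline slot definitions out of a class document.

    Returns the updated class document and a mapping of new slot
    file names to their contents.
    """
    inline = class_doc.get("slots") or {}
    updated = copy.deepcopy(class_doc)
    updated.pop("slots", None)  # Rule 48: no inline slots in class files
    imports = list(updated.get("imports", []))
    slot_files = {}
    for name in sorted(inline):
        # One file per slot, mirroring the modules/slots/ layout.
        slot_files[f"{name}.yaml"] = {"slots": {name: copy.deepcopy(inline[name])}}
        ref = import_prefix + name
        if ref not in imports:
            imports.append(ref)
    if imports:
        updated["imports"] = imports
    return updated, slot_files
```

The class-level `slots:` list that names which slots a class uses is left untouched; only the definitions move.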
- Add entry count badge next to schema file name showing (xC, yE, zS) counts
- Add tooltip explaining LinkML file names vs class names
- Remove redundant section headers (Classes, Enums, Slots collapsible sections)
- Add URL params for enum (?enum=) and slot (?slot=) deep linking
- Persist category filters, dev tools visibility, and legend visibility to localStorage
- Set 'Main Schema' filter to OFF by default (having it ON was confusing for users)
- Add Rule 48: Class files must not define inline slots
- Introduced EnvironmentalZoneTypeEnum.yaml to classify climate-controlled storage zones with detailed descriptions and recommended conditions for various materials.
- Created slots for environmental zone type code, description, ID, label, and HC preset URI to facilitate structured data representation.
- Implemented boolean slots for specific environmental requirements including dark storage, dust-free environment, ESD protection, and UV filtering, referencing relevant ISO standards.
- Enhanced documentation for each slot to clarify usage and preservation context.
Infrastructure changes to enable automatic frontend deployment when schemas change:
- Add .forgejo/workflows/deploy-frontend.yml workflow triggered by:
  - Changes to frontend/** or schemas/20251121/linkml/**
  - Manual workflow dispatch
- Rewrite generate-schema-manifest.cjs to properly scan all schema directories
  - Recursively scans classes, enums, slots, modules directories
  - Uses singular category names (class, enum, slot) matching TypeScript types
  - Includes all 4 main schemas at root level
  - Skips archive directories and backup files
- Update schema-loader.ts to match new manifest format
  - Add SchemaCategory interface
  - Update SchemaManifest to use categories as array
  - Add flattenCategories() helper function
  - Add getSchemaCategories() and getSchemaCategoriesSync() functions
The workflow builds the frontend with the updated manifest and deploys it to bronhouder.nl
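
The scanning rules listed above can be illustrated with a small Python sketch. The real generator is a Node script and the real loader is TypeScript, so everything here is an assumption made for illustration: the manifest shape (`categories` as a list of `{name, files}`), the skip rules (a directory named `archive`, file names containing `backup`), and the helper names. Root-level main schemas are not modeled.

```python
from pathlib import Path
from typing import Dict, List, Tuple

# Directory name -> singular category name used in the manifest.
SINGULAR = {"classes": "class", "enums": "enum", "slots": "slot"}


def is_skipped(rel: Path) -> bool:
    """Skip anything under an archive directory and any backup file."""
    in_archive = "archive" in rel.parts[:-1]
    is_backup = "backup" in rel.name
    return in_archive or is_backup


def build_manifest(root: Path) -> Dict:
    buckets: Dict[str, List[str]] = {name: [] for name in SINGULAR.values()}
    for path in sorted(root.rglob("*.yaml")):
        rel = path.relative_to(root)
        if is_skipped(rel):
            continue
        # The first ancestor directory that names a category decides the bucket.
        for part in rel.parts[:-1]:
            if part in SINGULAR:
                buckets[SINGULAR[part]].append(rel.as_posix())
                break
    return {"categories": [{"name": n, "files": f} for n, f in buckets.items()]}


def flatten_categories(manifest: Dict) -> List[Tuple[str, str]]:
    """Flatten the category array into (category, file) pairs."""
    return [(c["name"], f) for c in manifest["categories"] for f in c["files"]]
```

Keeping the categories as an ordered array rather than a keyed object lets the loader render them in manifest order and flatten them with a single pass.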
- Update VideoAnnotation class with new motivation type references
- Add AnnotationMotivationType and AnnotationMotivationTypes class files
- Add motivation_type slots (description, id, name)
- Archive deprecated AnnotationMotivationEnum
- Update slot references for derived_from_entity, has_observation, has_person_observation
Track full lineage of RAG responses: WHERE data comes from, WHEN it was
retrieved, HOW it was processed (SPARQL/vector/LLM).
Backend changes:
- Add provenance.py with EpistemicProvenance, DataTier, SourceAttribution
- Integrate provenance into MultiSourceRetriever.merge_results()
- Return epistemic_provenance in DSPyQueryResponse
Frontend changes:
- Pass EpistemicProvenance through useMultiDatabaseRAG hook
- Display provenance in ConversationPage (for cache transparency)
Schema fixes:
- Fix truncated example in has_observation.yaml slot definition
References:
- Pavlyshyn's Context Graphs and Data Traces paper
- LinkML ProvenanceBlock schema pattern
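
A sketch of what such a provenance record might look like. The class names `EpistemicProvenance` and `SourceAttribution` come from this commit, but every field, the `RetrievalMethod` enum, and the shape of `merge_results` below are assumptions for illustration. `DataTier` is omitted because its values are not stated here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Dict, List, Optional, Tuple


class RetrievalMethod(Enum):  # HOW the data was obtained (hypothetical enum)
    SPARQL = "sparql"
    VECTOR = "vector"
    LLM = "llm"


@dataclass
class SourceAttribution:
    source: str                # WHERE the data comes from
    retrieved_at: datetime     # WHEN it was retrieved
    method: RetrievalMethod    # HOW it was processed
    result_count: int


@dataclass
class EpistemicProvenance:
    sources: List[SourceAttribution] = field(default_factory=list)


def merge_results(
    batches: Dict[str, Tuple[RetrievalMethod, List[str]]],
    now: Optional[datetime] = None,
) -> Tuple[List[str], EpistemicProvenance]:
    """Merge per-source results, de-duplicating while recording lineage."""
    now = now or datetime.now(timezone.utc)
    merged: List[str] = []
    provenance = EpistemicProvenance()
    for source, (method, items) in batches.items():
        for item in items:
            if item not in merged:
                merged.append(item)
        provenance.sources.append(SourceAttribution(source, now, method, len(items)))
    return merged, provenance
```

Returning the provenance alongside the merged results lets the response carry it unchanged to the frontend.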
Copies authoritative schemas from schemas/20251121/ to:
- frontend/public/schemas/20251121/
- apps/archief-assistent/public/schemas/20251121/
This ensures slot definitions with corrected ontology property
references (commit 2808dad6cd) are available to frontend apps.
- auth.setup.ts: require env vars for test credentials (no hardcoded defaults)
- manifest.json: update schema manifest
- full_evaluation_results.json: add RAG evaluation results
- petra-links.json: update birth date from web claim
- Add 'Compare' toggle button next to slots with slot_usage overrides
- Show generic slot definition vs class-specific override in 3-column grid
- Highlight changed properties with green 'changed' badge
- Display '(inherited)' when override matches generic definition
- Display '(not defined)' when generic has no value for property
- Compare: range, description, required, multivalued, slot_uri, pattern, identifier
- Full i18n support (Dutch/English translations)
- Responsive design: stacks vertically on mobile (<640px)
- Add green 'slot_usage' badge for slots with class-specific overrides
- Add ✦ markers next to properties that are overridden vs inherited
- Add green left border styling for slots with slot_usage
- Add i18n translations (nl/en) for override indicators
- Merge generic slot definitions with class-specific slot_usage properties
This helps users understand which slot properties come from the generic
slot definition vs which are overridden at the class level via slot_usage.
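
The merge can be sketched as below, assuming both the generic slot definition and the class-level slot_usage entry are available as plain dicts of LinkML slot properties. The function name and the `origin` labels are hypothetical and do not come from the frontend code:

```python
from typing import Any, Dict


def merge_slot_usage(generic: Dict[str, Any], usage: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Merge a generic slot with a class-specific slot_usage override.

    Each property is tagged as 'overridden' when slot_usage supplies a
    value that differs from the generic definition, else 'inherited'.
    """
    merged = {}
    for key in {**generic, **usage}:  # generic keys first, then new ones
        overridden = key in usage and usage[key] != generic.get(key)
        merged[key] = {
            "value": usage.get(key, generic.get(key)),
            "origin": "overridden" if overridden else "inherited",
        }
    return merged
```

A slot_usage value that merely repeats the generic one is reported as inherited, matching the '(inherited)' display rule for the comparison view.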
- Migrate 236+ class files from custodian_types to has_or_had_custodian_type
- Archive deprecated slots: custodian_type, custodian_types, custodian_type_broader/narrower/related
- Update main schema and manifest imports
- Fix Custodian.yaml class to use new slot
- Fix annotation format (list→scalar) in has_or_had_custodian_type.yaml
Rules applied:
- Rule 39: RiC-O naming convention (hasOrHad pattern)
- Rule 43: Slot nouns must be singular (multivalued:true for cardinality)
- Rule 38: Slot centralization with semantic URI
- Add ontology cache warming at startup in lifespan() function
- Add is_factual_query() detection in template_sparql.py (12 templates)
- Add factual_result and sparql_query fields to DSPyQueryResponse
- Skip LLM generation for factual templates (count, list, compare)
- Execute SPARQL directly and return results as table (~15s → ~2s latency)
- Update ConversationPanel.tsx to render factual results table
- Add CSS styling for factual results with green theme
For queries like 'hoeveel archieven zijn er in Den Haag', the SPARQL
results ARE the answer - no need for expensive LLM prose generation.
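
A simplified sketch of the short-circuit. It assumes template ids are prefixed with their kind (for example `count_...`), which is a guess and not necessarily how template_sparql.py identifies its 12 templates. `answer_query` and the injected callables are hypothetical, and the response is a plain dict rather than DSPyQueryResponse:

```python
from typing import Any, Callable, Dict, List

FACTUAL_KINDS = ("count", "list", "compare")

Rows = List[Dict[str, Any]]


def is_factual_query(template_id: str) -> bool:
    """True when the template's results can be shown directly as a table."""
    return template_id.split("_", 1)[0] in FACTUAL_KINDS


def answer_query(
    template_id: str,
    sparql_query: str,
    run_sparql: Callable[[str], Rows],
    generate_prose: Callable[[Rows], str],
) -> Dict[str, Any]:
    rows = run_sparql(sparql_query)
    factual = is_factual_query(template_id)
    return {
        "sparql_query": sparql_query,
        # Factual templates return the rows as-is and skip the LLM call.
        "factual_result": rows if factual else None,
        "answer": None if factual else generate_prose(rows),
    }
```

The latency gain comes from never invoking `generate_prose` on the factual path; the SPARQL execution cost is the same in both branches.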
- Remove deprecated CustodianTypeCodeEnum from class_metadata_slots.yaml
- Update custodian_types slot to use uriorcurie range (references CustodianType subclasses)
- Update custodian_types_primary slot similarly
- Add migration note for legacy string format ['A'] vs new URI format
Per Rule 9: Enum-to-Class Promotion - Single Source of Truth