Commit graph

381 commits

Author SHA1 Message Date
kempersc
6812524ae5 feat(entity-review): add 'provides match' toggle for source URLs
All checks were successful
Deploy Frontend / build-and-deploy (push) Successful in 2m23s
DSPy RAG Evaluation / Layer 1 - Unit Tests (push) Successful in 5m37s
DSPy RAG Evaluation / Layer 2 - DSPy Module Tests (push) Successful in 7m24s
DSPy RAG Evaluation / Layer 3 - Integration Tests (push) Successful in 5m47s
DSPy RAG Evaluation / Layer 4 - Comprehensive Evaluation (push) Successful in 6m52s
DSPy RAG Evaluation / Quality Gate (push) Successful in 1s
- Add toggle in source URL form to indicate when a source provides
  sufficient information to create a person profile without LinkedIn
- Store provides_match boolean in source observation data
- Display green badge on existing sources that have provides_match: true
- Include bilingual tooltip (EN/NL) explaining the toggle purpose
2026-01-18 18:25:45 +01:00
kempersc
b11223277c fix(entity-review): persist source URLs for WCMS-only profiles
All checks were successful
Deploy Frontend / build-and-deploy (push) Successful in 2m1s
DSPy RAG Evaluation / Layer 1 - Unit Tests (push) Successful in 5m34s
DSPy RAG Evaluation / Layer 2 - DSPy Module Tests (push) Successful in 7m43s
DSPy RAG Evaluation / Layer 3 - Integration Tests (push) Successful in 5m54s
DSPy RAG Evaluation / Layer 4 - Comprehensive Evaluation (push) Successful in 6m58s
DSPy RAG Evaluation / Quality Gate (push) Successful in 1s
- Add source_urls to WCMS-only profile detail response
- Update _candidates_by_wcms cache when creating new WCMS-only entries
- Use correct refresh method (fetchWcmsOnlyProfileDetail) after adding source URL

Fixes issue where source URLs added to WCMS-only profiles were not
displayed after page refresh because:
1. The wcms-only-profile/{email} endpoint wasn't returning source_urls
2. The frontend was calling fetchProfileDetail instead of
   fetchWcmsOnlyProfileDetail after adding a source URL
3. New WCMS-only entries weren't added to the lookup cache
2026-01-18 15:27:04 +01:00
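The three causes listed in the commit above can be illustrated with a minimal Python sketch. Only `_candidates_by_wcms` and `source_urls` come from the commit message; the `WcmsOnlyStore` class, its method names, and keying the cache by email are assumptions made for illustration, not the project's actual code.

```python
class WcmsOnlyStore:
    """Hypothetical in-memory stand-in for the WCMS-only profile backend."""

    def __init__(self):
        # Lookup cache named after the one mentioned in the commit message.
        # Keying it by email is an assumption.
        self._candidates_by_wcms = {}

    def add_source_url(self, email, url):
        entry = self._candidates_by_wcms.get(email)
        if entry is None:
            # Cause 3: a newly created WCMS-only entry must also be
            # registered in the lookup cache, or later reads miss it.
            entry = {"email": email, "source_urls": []}
            self._candidates_by_wcms[email] = entry
        entry["source_urls"].append(url)

    def profile_detail(self, email):
        entry = self._candidates_by_wcms.get(email)
        if entry is None:
            return None
        # Cause 1: the detail response has to include source_urls.
        return {"email": entry["email"], "source_urls": list(entry["source_urls"])}


store = WcmsOnlyStore()
store.add_source_url("person@example.org", "https://example.org/staff/person")
detail = store.profile_detail("person@example.org")
```

Cause 2 sits on the frontend side (calling the WCMS-only refresh method after adding a URL) and is not modelled here.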
kempersc
a31b89f672 refactor: update manifest.json timestamp, enhance schema definitions, and migrate publication_activity slot to structured format 2026-01-18 13:16:44 +01:00
kempersc
9a00160264 refactor: update manifest.json timestamp and enhance schema definitions across multiple YAML files 2026-01-18 02:02:54 +01:00
kempersc
a6a9ba58b8 standardise slots 2026-01-18 01:23:32 +01:00
kempersc
f30b1777f4 Enhance schema definitions and introduce new classes for DigitalPlatformV2
- Added detailed descriptions for slots: collecting_scope, collection_access, custody_history, education_level, membership_size, and publication_activity to improve clarity and usability.
- Removed the publication_date slot due to migration to a new structure.
- Updated slot fixes with migration notes and adjustments for various slots, ensuring alignment with new ontology standards.
- Introduced new classes for DigitalPlatformV2, including DigitalPlatformV2DataQualityNotes, DigitalPlatformV2DataSource, DigitalPlatformV2KeyContact, DigitalPlatformV2OrganizationProfile, DigitalPlatformV2OrganizationStatus, DigitalPlatformV2PrimaryPlatform, DigitalPlatformV2Provenance, DigitalPlatformV2ServiceDetails, and DigitalPlatformV2TransformationMetadata, each with comprehensive attributes and descriptions.
- Added classes for EnrichmentProvenance and EnrichmentProvenanceEntry to track provenance for enrichment sources, including detailed attributes for verification and source tracking.
- Created LogoClaim, LogoEnrichment, and LogoEnrichmentSummary classes to manage logo and favicon data extracted from web scraping, with attributes for claims and summary statistics.
- Archived the publication_date slot to maintain historical records.
2026-01-18 00:59:51 +01:00
kempersc
146f3c5c4e refactor: update generated timestamp in manifest.json and fix spelling errors in slot_fixes.yaml 2026-01-18 00:39:01 +01:00
kempersc
44b0771936 Remove incompatible equals_string values for has_or_had_identifier across multiple archive classes to comply with uriorcurie range requirements. 2026-01-17 23:02:17 +01:00
kempersc
aac0372aaa refactor: update tax deductibility schema and migrate slots per Rule 48 2026-01-17 21:45:52 +01:00
kempersc
2f707d224b refactor: update timestamp and enhance publication slot migration notes in slot_fixes.yaml 2026-01-17 21:11:23 +01:00
kempersc
47663e7c79 Refactor schema definitions and slots for improved temporal modeling and publisher representation
- Migrated `published_at` to `is_or_was_published_at` with structured `PublicationEvent` class for enhanced temporal accuracy.
- Introduced `has_or_had_publisher` slot to replace the string-based `publisher` slot, allowing for detailed publisher information.
- Added new slots: `deduction_percentage`, `regulatory_body`, `expiration_date`, and `jurisdiction` to support tax scheme documentation.
- Archived outdated slots: `published_by` and `publisher`, ensuring compliance with updated naming conventions and ontology alignment.
- Updated `Database` types to `DatabaseSystem` for consistency in technological infrastructure classification.
- Broadened range types for slots `allows_or_allowed` and `includes_or_included` from `string` to `uriorcurie` to resolve OWL ambiguities.
- Enhanced documentation and examples across various classes and slots to clarify usage and improve understanding.
2026-01-17 21:10:50 +01:00
kempersc
ed80fb316e refactor: migrate cataloging_standard to complies_or_complied_with and create CatalogingStandard class per Rule 53/56 2026-01-17 20:58:12 +01:00
kempersc
47834df7a3 refactor: update catalog_url to has_or_had_url and migrate per Rule 53/56 2026-01-17 19:53:55 +01:00
kempersc
441a096243 Implement feature X to enhance user experience and fix bug Y in module Z 2026-01-17 19:50:28 +01:00
kempersc
4319f38c05 Add archived slots for audience size, audience type, and capacity metrics
- Created new YAML files for audience size and audience type slots, defining their properties and annotations.
- Added archived capacity slots including cubic meters, linear meters, item count, and descriptions, with appropriate URIs and ranges.
- Introduced a template specificity slot for context-aware RAG filtering.
- Consolidated capacity-related slots into a unified structure, including has_or_had_capacity, capacity_type, and capacity_value, with detailed descriptions and examples.
2026-01-17 18:53:23 +01:00
kempersc
1b829fbe82 standardise slot names 2026-01-17 17:49:56 +01:00
kempersc
f71ef52432 refactor: migrate catalog slots to has_or_had_* structure and archive deprecated slots 2026-01-17 15:44:25 +01:00
kempersc
f18c7a5c3a Refactor address and asserter slots; migrate to has_or_had_* structure
- Updated manifest.json with new generated timestamp.
- Removed deprecated address_type and algorithm_name slots; migrated to has_or_had_type and has_or_had_label respectively.
- Updated Asserter.yaml to use has_or_had_* slots for asserter_contact, asserter_type, and asserter_version.
- Introduced IndexEntry class in Index.yaml for hierarchical index entries.
- Added DigitalPlatformType import to MailingListArchive.yaml and OnlineNewsArchive.yaml.
- Removed obsolete unit_type, algorithm_name, algorithm_version, asserter_contact, asserter_type, and asserter_version slot files.
- Archived removed slots in respective archive files.
- Updated slot_fixes.yaml to reflect migration statuses for asserter slots.
2026-01-17 15:34:11 +01:00
kempersc
46757be964 Refactor ontology schema: Migrate slots and update references
- Replaced deprecated slot 'broader_type' with 'has_or_had_hypernym' in MuseumType, OrganizationBranch, and ResearchOrganizationType schemas, ensuring all references are updated accordingly.
- Removed obsolete slots: 'binding_description', 'binding_type', 'borrower', 'borrower_contact', 'bounding_box', 'branch_description', 'branch_type', and 'taxonomic_rank', archiving them for future reference.
- Introduced new generic slots: 'has_or_had_contact_point', 'has_or_had_geographic_extent', and 'has_or_had_rank' to standardize contact and spatial information, aligning with RiC-O naming conventions.
- Updated slot_fixes.yaml to reflect migration status and ensure immutability of revision entries.
- Enhanced documentation and examples for new slots to facilitate understanding and usage.
2026-01-17 15:18:34 +01:00
kempersc
69373b5a13 Refactor and archive slots; migrate to generic temporal slots
- Refactored WomensArchivesRecordSetTypes.yaml to streamline imports and slot usage.
- Deleted obsolete slots: approximation_level, benefit, bio_custodian_subtype, bio_type_classification, business_criticality, business_model, cached_token.
- Archived deleted slots to respective archive directories for future reference.
- Introduced new generic slots: has_or_had_benefit, has_or_had_classification, has_or_had_level, has_or_had_model, has_or_had_token to standardize temporal naming conventions and improve semantic clarity.
- Updated slot descriptions and annotations to reflect new structures and usage.
2026-01-17 14:41:16 +01:00
kempersc
d99a7800e3 feat: enhance entity profile saving with PPID generation and backward compatibility 2026-01-17 01:55:38 +01:00
kempersc
dc95c7f7b7 fix: update generated timestamp in manifest.json and add feedback comments in slot_fixes.yaml 2026-01-17 00:11:38 +01:00
kempersc
54b26343c9 Add initial version of QUDT ontology file 2026-01-17 00:08:39 +01:00
kempersc
196f8a1023 Refactor schema slots and classes for improved semantic clarity and consistency
- Migrated `temperature_tolerance` to `allows_or_allowed` with `TemperatureDeviation` class for structured temperature tolerance representation.
- Replaced `temporal_coverage` with `has_or_had_content` to enhance temporal modeling using the `Content` class.
- Updated `FindingAid`, `LegalResponsibilityCollection`, and `EnvironmentalZone` schemas to reflect new slot structures.
- Archived obsolete slots: `temperature_tolerance`, `temporal_coverage`, `typical_http_methods`, and `typical_response_formats`.
- Introduced `has_or_had_technological_infrastructure` slot to replace `technology_stack`, providing a structured approach to modeling technological components.
- Enhanced documentation and examples across affected schemas to ensure clarity on new structures and their usage.
2026-01-16 20:09:58 +01:00
kempersc
cbdf2c2b2b feat: Introduce new slots and classes for heritage content and modes
- Added `has_or_had_direction` slot to represent directional orientations of entities, migrating from `text_direction`.
- Introduced `has_or_had_mode` slot for linking entities to operational modes, migrating from `thinking_mode`.
- Created `Content` class to encapsulate intellectual content of heritage materials, migrating from `temporal_coverage`.
- Added `TextDirection` class to define text flow orientations, migrating from `text_direction` slot.
- Introduced `ThinkingMode` class for LLM thinking mode configurations, migrating from `thinking_mode` slot.
- Archived previous slots related to text direction, languages detected, and thinking modes for historical reference.
- Updated documentation and annotations for clarity and compliance with RiC-O naming conventions.
2026-01-16 19:43:50 +01:00
kempersc
2c1ab0b4e6 fix: update generated timestamp and totalFiles count in manifest.json 2026-01-16 19:42:51 +01:00
kempersc
d47bb5b097 standardise slots 2026-01-16 18:57:52 +01:00
kempersc
c1748d3b11 feat: broaden slot ranges to 'Any' to resolve OWL ambiguity per Rule 55 2026-01-16 15:16:29 +01:00
kempersc
db389ed0a3 Refactor schema slots to resolve OWL ambiguity and enhance flexibility
- Updated ranges for multiple slots from `string` to `uriorcurie` to address OWL "Ambiguous type" warnings and allow for URI/CURIE references.
- Removed specialized slots for subtitle and transcript formats, consolidating them under broader predicates.
- Introduced new slots for structured descriptions, observation source documents, and entity statuses to improve data modeling.
- Implemented Rule 54 to broaden generic predicate ranges instead of creating bespoke predicates, promoting schema reuse and reducing complexity.
- Added a script for generating OWL ontology with type-object handling to ensure consistent ObjectProperty treatment for polymorphic slots.
2026-01-16 15:06:36 +01:00
kempersc
0a1f6c6f34 fix: update generated timestamp in manifest.json 2026-01-16 15:04:18 +01:00
kempersc
9034045e9a feat: update manifest.json timestamp and refactor Custodian and PersonObservation schemas with new slots for entity status and observation source documents
Some checks failed
Deploy Frontend / build-and-deploy (push) Successful in 5m10s
DSPy RAG Evaluation / Layer 1 - Unit Tests (push) Failing after 8m6s
DSPy RAG Evaluation / Layer 2 - DSPy Module Tests (push) Has been skipped
DSPy RAG Evaluation / Layer 3 - Integration Tests (push) Has been skipped
DSPy RAG Evaluation / Layer 4 - Comprehensive Evaluation (push) Has been skipped
DSPy RAG Evaluation / Quality Gate (push) Failing after 2s
2026-01-16 14:33:06 +01:00
kempersc
77fb2ba9bf fix: update identifier references to external_identifier in WebPortal.yaml and update generated timestamp in manifest.json
Some checks failed
Deploy Frontend / build-and-deploy (push) Successful in 4m34s
DSPy RAG Evaluation / Layer 1 - Unit Tests (push) Failing after 9m20s
DSPy RAG Evaluation / Layer 2 - DSPy Module Tests (push) Has been skipped
DSPy RAG Evaluation / Layer 3 - Integration Tests (push) Has been skipped
DSPy RAG Evaluation / Layer 4 - Comprehensive Evaluation (push) Has been skipped
DSPy RAG Evaluation / Quality Gate (push) Failing after 2s
2026-01-16 13:29:24 +01:00
kempersc
620b521f1b chore: update generated timestamp in manifest.json 2026-01-16 13:27:42 +01:00
kempersc
24cddb82dc enrich ppid profiles 2026-01-16 12:50:50 +01:00
kempersc
7424b85352 Add new slots for heritage custodian entities
- Introduced setpoint_max, setpoint_min, setpoint_tolerance, setpoint_type, setpoint_unit, setpoint_value, temperature_target, track_id, typical_http_methods, typical_metadata_standard, typical_response_formats, typical_scope, typical_technical_feature, unit_code, unit_symbol, unit_type, wikidata_entity, wikidata_equivalent, and wikidata_id slots.
- Each slot includes a unique identifier, name, title, description, and annotations for custodian types and specificity score.
2026-01-16 01:04:38 +01:00
kempersc
f9f3cc8e74 fix: resolve YAML import indentation and add missing slot descriptions
Schema Improvements:
- Fix YAML import indentation across 800+ class files (sed: '^- ../' → '  - ../')
- Add descriptions to 26 inline slots missing them (lint warnings)
- Fix malformed imports in BirthPlace.yaml and CustodianObservation.yaml

Validation Results:
- linkml-lint: 4 warnings (intentional SCREAMING_CASE tier names)
- gen-owl: SUCCESS (164,069 lines generated)
- gen-json-schema: SUCCESS (9.4MB generated)

Files affected: 1,034 files, +23,908 -15,200 lines
2026-01-16 00:09:28 +01:00
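The sed substitution quoted in the commit above ('^- ../' → '  - ../') can be approximated in Python as follows. This is a sketch, not the script that was actually run: the function name and sample text are hypothetical, and the dots are escaped here so that only literal `../` paths match, whereas sed would treat them as wildcards.

```python
import re

# A list item starting at column 0 that points at a relative "../" path.
_UNINDENTED_IMPORT = re.compile(r"^- \.\./", flags=re.MULTILINE)


def fix_import_indentation(text: str) -> str:
    """Indent column-0 '- ../' list items by two spaces."""
    return _UNINDENTED_IMPORT.sub("  - ../", text)


sample = "imports:\n- ../slots/has_or_had_label\n  - linkml:types\n"
fixed = fix_import_indentation(sample)
```

Already-indented lines do not match the anchor at column 0, so running the function a second time leaves the text unchanged.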
kempersc
6d961feafa chore: update manifest.json timestamp after schema sync 2026-01-15 23:23:11 +01:00
kempersc
c2629f6d29 Fix LinkML schema validation errors (0 errors, 30 warnings)
Schema Migration Fixes:
- Fix YAML import indentation in ~650 slot files (linkml:types and enum imports)
- Rename slot reference: has_or_had_holds_record_set_type → hold_or_held_record_set_type
  (70+ archive class files, main schema, manifest.json)
- Fix ProvenanceBlock.yaml: remove invalid any_of range, use string with multivalued
- Fix has_or_had_provenance.yaml: remove nested template_specificity from annotations

Validation Status:
- 0 errors (was multiple import/reference errors)
- 30 warnings (missing descriptions on inline slots, intentional SCREAMING_CASE names)

Files changed: ~3,850 (slots, classes, main schema, manifest)
2026-01-15 23:21:38 +01:00
kempersc
555949798d Update manifest.json timestamp, remove deprecated slot imports, and archive obsolete slots 2026-01-15 21:01:16 +01:00
kempersc
027871c070 Update generated timestamp in manifest.json and adjust imports in ExhibitionCatalog.yaml 2026-01-15 20:46:02 +01:00
kempersc
0cc8c8ca8f Add archived slot definitions for various attributes in the HC ontology
- Introduced new YAML files for slots including typical_scope, typical_technical_feature, unit_affiliation, used, used_by, user_community, verified, web_observation, whatsapp_business_likelihood, wikidata_alignment, wikidata, wikidata_entity, wikidata_equivalent, wikidata_id, wikidata_mapping, stores_or_stored, and time_of_destruction.
- Each slot includes detailed descriptions, mappings, and examples to enhance the ontology's semantic structure.
- Migrated and centralized the 'stores_object' slot into 'stores_or_stored' to comply with RiC-O naming conventions.
- Added comprehensive documentation for temporal-aware slots to support better data integration and querying capabilities.
2026-01-15 20:44:51 +01:00
kempersc
416aa407cc Add new slots for financial and heritage documentation
- Introduced total expense, total frames analyzed, total investment, total liability, total net asset, and traditional product slots to enhance financial reporting capabilities.
- Added transition types detected, treatment description, type hypothesis, typical condition, typical HTTP methods, typical response formats, and typical scope slots for improved heritage documentation.
- Implemented user community, verified, web observation, WhatsApp business likelihood, wikidata equivalent, and wikidata mapping slots to enrich institutional data representation.
- Established has_or_had_asset, has_or_had_budget, has_or_had_expense, and is_or_was_threatened_by slots to capture asset, budget, and expense relationships as well as threats to heritage forms.
2026-01-15 19:35:39 +01:00
kempersc
37d923cae1 Refactor slot names and update imports for consistency
- Migrated `was_generated_by` to `is_or_was_generated_by` and `was_derived_from` to `is_or_was_derived_from` across multiple YAML schema files as per Rule 53.
- Updated relevant imports, slot lists, and slot usage keys to reflect the new naming conventions.
- Added migration comments for clarity and tracking.
- Introduced a migration script to automate the changes across all affected files.
2026-01-15 15:07:53 +01:00
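The migration script itself is not shown in the commit above. A rename of this kind might look like the sketch below, in which the two slot names come from the commit message and everything else (function name, sample text) is hypothetical. Word boundaries matter because the new names contain the old ones: `\b` treats `_` as a word character, so `is_or_was_generated_by` is not matched again on a second run.

```python
import re

# Old slot name -> new slot name, as listed in the commit message.
RENAMES = {
    "was_generated_by": "is_or_was_generated_by",
    "was_derived_from": "is_or_was_derived_from",
}

# Match the old names only as whole identifiers.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def rename_slots(text: str) -> str:
    """Replace old slot names in imports, slot lists, and slot_usage keys."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], text)


sample = (
    "imports:\n"
    "  - ../slots/was_generated_by\n"
    "slots:\n"
    "  - was_derived_from\n"
    "  - is_or_was_generated_by\n"
)
migrated = rename_slots(sample)
```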
kempersc
3fb27c15e2 Refactor and archive deprecated slots; update migration records
- Removed deprecated slots: storage_security_level, version_number, video_comment, visiting_hour, was_asserted_by, was_revision_of, writing_system.
- Archived corresponding YAML files for deprecated slots with detailed migration notes.
- Updated slot definitions for has_collection and encompassing_body to reflect new naming conventions and temporal patterns.
- Enhanced metadata extraction in index_persons_qdrant.py to include WCMS registration and data sources.
- Modified hybrid_retriever and multi_embedding_retriever to support filtering by WCMS registration status.
2026-01-15 13:16:59 +01:00
kempersc
8174c9692e Refactor and Archive Deprecated Slots
- Removed deprecated slots:
  - accepts_or_accepted_external_work
  - accepts_or_accepted_payment_method
  - accepts_or_accepted_visiting_scholar
  - parent_collection
  - parent_custodian
  - storage_description
  - storage_type_description
  - sub_guide_description
  - transfer_location
  - transfer_location_text
  - transfer_policy
  - transfer_to_collection_date
  - unit_description

- Archived corresponding YAML files for the removed slots with detailed notes on migration and replacements.
- Updated slot fixes to reflect the migration of deprecated slots to new structures and naming conventions.
- Introduced new slots and classes to replace deprecated ones, ensuring compliance with RiC-O standards.
2026-01-15 13:00:27 +01:00
kempersc
ea61e36a8e feat: Update generated timestamp in manifest.json and add new slot revisions in slot_fixes.yaml 2026-01-15 12:37:46 +01:00
kempersc
916f8e7247 feat: Update generated timestamp in manifest.json 2026-01-15 11:43:12 +01:00
kempersc
6c3fa6b5a3 Remove deprecated slots and add new slot definitions for enhanced data modeling
- Deleted obsolete slot definitions for work_location and workshop_space.
- Introduced new TaxonName class to represent scientific taxonomic names with detailed attributes.
- Archived existing slots related to surname_prefix, target_name, taxon_name, terminal_count, text_region_count, title, title_proper, total_chapter, total_characters_extracted, total_connections_extracted, track_name, transcript_format, traveling_venue, type_label, type_status, typical_responsibility, unesco_domain, unesco_inscription_year, unesco_list_status, uniform_title, unit_name, used_by_custodian, uv_filtered_required, valid_from_geo, valid_to_geo, validation_status, variant_of_name, verification_date, viability_status, within_auxiliary_place, and within_place.
- Updated slot descriptions and structures to improve clarity and compliance with standards.
2026-01-15 11:42:35 +01:00
kempersc
d5d970b513 Remove deprecated slot definitions and add archived versions for future reference
- Deleted the following slot definitions:
  - wikidata_class_slot
  - wikidata_entity_label_slot
  - wikidata_mapping_rationale_slot
  - word_count_slot

- Added archived versions of the deleted slots to preserve historical data:
  - wikidata_class_archived_20260114.yaml
  - wikidata_entity_label_archived_20260114.yaml
  - wikidata_mapping_rationale_archived_20260114.yaml
  - word_count_archived_20260114.yaml

- Introduced a new hook `usePersonSearch` for enhanced semantic search functionality in the frontend, supporting debounced queries and caching.
2026-01-14 22:57:09 +01:00
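The archive file names in the commit above appear to follow a `<slot>_archived_<YYYYMMDD>.yaml` pattern. A small helper that builds such names could be sketched as below; the function name is hypothetical, and dropping a trailing `_slot` suffix is an assumption inferred from the names listed in the commit.

```python
from datetime import date


def archived_filename(slot_name: str, archived_on: date) -> str:
    """Build an archive file name such as 'word_count_archived_20260114.yaml'."""
    # Assumption: a trailing "_slot" is not carried into the archive name.
    base = slot_name.removesuffix("_slot")
    return f"{base}_archived_{archived_on:%Y%m%d}.yaml"


name = archived_filename("word_count_slot", date(2026, 1, 14))
```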
kempersc
1389b744f1 feat: Update manifest timestamp and archive multiple slot definitions for compliance 2026-01-14 22:38:50 +01:00
kempersc
ad5fbe82cf Migrate valid_from and valid_to slots to temporal_extent across multiple classes
- Consolidated valid_from and valid_to slots into a single temporal_extent slot in FundingRequirement, GiftShop, OrganizationBranch, OrganizationalChangeEvent, OrganizationalStructure, SocialMediaProfile, Storage, StorageUnit classes.
- Updated slot definitions to use TimeSpan for temporal_extent, providing structured validity periods.
- Removed deprecated slots: valid_from, valid_to, verified_by, wikidata_entity_id, and worldcat_id, archiving their definitions for reference.
- Adjusted related documentation and examples to reflect the new temporal_extent structure.
2026-01-14 22:33:36 +01:00
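For a single record, the consolidation described in the commit above might look like this sketch. The commit does not say how `valid_from` and `valid_to` map onto the TimeSpan fields; mapping them to `begin_of_the_begin` and `end_of_the_end` (field names taken from another commit in this log) is an assumption, as is the function name.

```python
def to_temporal_extent(record: dict) -> dict:
    """Fold valid_from/valid_to into one nested temporal_extent entry."""
    migrated = {
        key: value
        for key, value in record.items()
        if key not in ("valid_from", "valid_to")
    }
    extent = {}
    # Assumed mapping: earliest start and latest end of the validity period.
    if record.get("valid_from") is not None:
        extent["begin_of_the_begin"] = record["valid_from"]
    if record.get("valid_to") is not None:
        extent["end_of_the_end"] = record["valid_to"]
    if extent:
        migrated["temporal_extent"] = extent
    return migrated


old = {"name": "Reading room", "valid_from": "2020-01-01", "valid_to": "2024-12-31"}
new = to_temporal_extent(old)
```

Records without either date pass through without a `temporal_extent` key, so the function can be applied to every record in a file.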
kempersc
44f8621eba refactor: consolidate valid_from and valid_to into temporal_extent per Rule 53 in ArticlesOfAssociation, AuxiliaryDigitalPlatform, and AuxiliaryPlace schemas 2026-01-14 22:21:12 +01:00
kempersc
58940582c3 Refactor warehouse and video slots
- Migrate and archive multiple slots including warehouse_security_level, warehouse_name, and video_id.
- Introduce UnitIdentifier class to replace unit_id and unit_identifier slots.
- Update consuming classes and document migration details.
- Archive obsolete slot definitions for track_id, tracking_ids_assigned, vendor_name, and others.
- Ensure compliance with migration rules and maintain historical records in the archive.
2026-01-14 22:20:44 +01:00
kempersc
13252cc5b7 fix(manifest): update generated timestamp and add new slot revisions for enhanced data modeling 2026-01-14 20:52:30 +01:00
kempersc
d3d5c5cdde feat: Update manifest and refactor EnvironmentalZone schema with new slot mappings and archived slots
- Updated generated timestamp in manifest.json
- Refactored EnvironmentalZone.yaml to replace zone_name and zone_description with has_or_had_label and has_or_had_description respectively
- Archived previous slots zone_name, zone_id, and zone_description with detailed migration notes
- Introduced new classes for ApprovalTimeType, ApprovalTimeTypes, ISO639-3Identifier, Investment, InvestmentArea, Language, Liability, NetAsset, ResourceType, ResponseFormat, ResponseFormatType, Token, TrackIdentifier, TraditionalProductType, TranscriptFormat, TypeStatus, UNESCODomain, UNESCODomainType, VenueTypes, and VideoFrames with appropriate attributes and slots
- Added subclasses for ApprovalTimeTypes, ResponseFormatTypes, TraditionalProductTypes, and UNESCODomainTypes
2026-01-14 20:40:08 +01:00
kempersc
bf7515c48f feat: Add new classes for Domain, HTTPMethod, HTTPMethodType, MetadataStandard, MetadataStandardType, Responsibility, ResponsibilityType, TechnicalFeatureTypes with associated attributes and types 2026-01-14 20:33:59 +01:00
kempersc
8123efe849 feat: Add new classes for HTTPMethod, MetadataStandard, Responsibility, and TechnicalFeature with associated attributes and types 2026-01-14 20:33:28 +01:00
kempersc
7a72a1d096 Add new classes and slots for enhanced data modeling
- Introduced VerificationStatus, Verifier, VersionNumber, ViabilityStatus, VideoCategoryIdentifier, VideoIdentifier, WhatsAppProfile, WordCount, WorkRevision, and WorldCatIdentifier classes to capture various aspects of data verification, categorization, and identification.
- Created corresponding slots such as analyzes_or_analyzed, unit_type, years_restricted, benefits_provided, consumes_or_consumed, has_or_had_contact_details, has_or_had_investment, has_or_had_liability, has_or_had_likelihood_score, has_or_had_location, has_or_had_net_asset, is_or_was_affiliated_with, is_or_was_allocated_to, is_or_was_alternative_form_of, is_or_was_categorized_as, is_or_was_used_by, and was_last_updated_at to facilitate detailed tracking and categorization of entities and their attributes.
- Each class and slot includes detailed descriptions, usage examples, and mappings to relevant ontologies to ensure interoperability and clarity in data representation.
2026-01-14 20:32:45 +01:00
kempersc
1f04a26b12 feat: Introduce MeasureUnitEnum for standardized measurement units
- Added MeasureUnitEnum.yaml to define standard measurement units for area, length, and related quantities, compliant with ISO 80000-1, QUDT, and UCUM.
- Included units such as square meters, hectares, acres, meters, kilometers, and their conversions.

feat: Create applies_or_applied_to_call slot for funding requirements

- Introduced applies_or_applied_to_call.yaml to track funding calls related to requirements, following RiC-O naming conventions.

chore: Archive annual_participants slot and migrate to has_or_had_annual_participant_count

- Archived annual_participants_archived_20260115.yaml, replaced with has_or_had_annual_participant_count for better temporal naming.

chore: Archive applies_to_call slot and migrate to applies_or_applied_to_call

- Archived applies_to_call_archived_20260115.yaml, replaced with applies_or_applied_to_call for improved naming consistency.

chore: Archive area_hectares slot and migrate to has_area_in_hectare

- Archived area_hectares_archived_20260115.yaml, replaced with has_area_in_hectare for standardized area measurement.

chore: Archive arrangement_notes slot and migrate to has_arrangement_note

- Archived arrangement_notes_archived_20260115.yaml, replaced with has_arrangement_note for better naming alignment.

chore: Archive available_caption_languages slot and migrate to has_available_caption_language

- Archived available_caption_languages_archived_20260115.yaml, replaced with has_available_caption_language for improved naming.

chore: Archive beneficiary_group slot and migrate to has_or_had_beneficiary

- Archived beneficiary_group_archived_20260115.yaml, replaced with has_or_had_beneficiary for compliance with naming conventions.

chore: Archive branch_head slot and migrate to has_or_had_head

- Archived branch_head_archived_20260114.yaml, replaced with has_or_had_head for better semantic alignment.

chore: Archive budget_currency slot

- Archived budget_currency_archived_20260114.yaml for future migration.

chore: Archive building_floor_area_sqm slot and migrate to has_or_had_area

- Archived building_floor_area_sqm_archived_20260115.yaml, replaced with has_or_had_area for standardized area measurement.

chore: Archive has_area_in_hectare slot and migrate to has_or_had_area

- Archived has_area_in_hectare_archived_20260115.yaml, replaced with has_or_had_area for compliance with generic predicates.

feat: Introduce has_or_had_area slot for area measurements

- Added has_or_had_area.yaml as the authoritative slot for area measurements, compliant with Rule 53.

feat: Introduce has_or_had_beneficiary slot for beneficiary tracking

- Added has_or_had_beneficiary.yaml to identify beneficiaries of organizational programs, following RiC-O naming conventions.

feat: Introduce has_or_had_currency slot for monetary values

- Added has_or_had_currency.yaml to track currency associated with monetary amounts, compliant with ISO 4217.

feat: Introduce has_or_had_head slot for organizational heads

- Added has_or_had_head.yaml to link organizational units with their heads, following W3C ORG standards.
2026-01-14 20:18:38 +01:00
kempersc
d4493580ea chore: Update manifest and add remaining slot migrations
All checks were successful
Deploy Frontend / build-and-deploy (push) Successful in 4m17s
2026-01-14 20:03:23 +01:00
kempersc
53c6dbc2d9 feat(schema): Migrate temporal slots and introduce new pattern classes
Major slot migrations following slot_fixes.yaml revisions:
- TimeSpan: begin_of_the_begin, begin_of_the_end, end_of_the_begin, end_of_the_end
- Quantity: has_or_had_measurement_unit with MeasureUnit class
- Description: has_or_had_description with Description class
- URL, WikiData, Timestamp, Location, Provenance pattern classes

New slots for RiC-O compliance:
- Temporal: has_or_had_time_interval, calendar_system
- Transfer: is_or_was_transferred, has_or_had_policy
- Location: starts/ends_or_started/ended_at_location
- Provenance: has_or_had_provenance_path, is_or_was_webarchived_at

Archive deprecated slots per Rule 53 workflow.
2026-01-14 20:01:55 +01:00
kempersc
13ba8fb09b style(entity-review): Improve header button styling and icon sizes
- Increase icon sizes from 16px to 20px for better visibility
- Add borders and shadows to header action buttons
- Improve hover states with color transitions
- Add proper dark mode styling for all button variants
2026-01-14 19:56:24 +01:00
kempersc
a981bb7ca3 feat(linkml): Add slot_usage comparison popup in schema viewer
- Add 'slot_usage' type to SchemaElementPopup for comparing generic slots vs class overrides
- Show side-by-side comparison table with property, generic value, and override value
- Display green 'changed' badges for modified properties
- Add dual navigation buttons (Go to class / Go to slot)
- Include comprehensive dark mode support
- Match styling to main page's comparison view (green color scheme)
2026-01-14 19:55:57 +01:00
kempersc
853419d6c2 feat: Introduce MeasureUnitEnum for standardized measurement units
- Added MeasureUnitEnum.yaml to define standard measurement units for area, length, and related quantities, compliant with ISO 80000-1, QUDT, and UCUM.
- Included units such as square meters, hectares, acres, meters, kilometers, and their conversions.

feat: Create applies_or_applied_to_call slot for funding requirements

- Introduced applies_or_applied_to_call.yaml to track funding calls related to requirements, following RiC-O naming conventions.

chore: Archive and migrate annual_participants slot

- Archived annual_participants_archived_20260115.yaml, replaced by has_or_had_annual_participant_count for better temporal naming.

chore: Archive applies_to_call slot and migrate to new naming

- Archived applies_to_call_archived_20260115.yaml, replaced by applies_or_applied_to_call for compliance with RiC-O conventions.

chore: Archive area_hectares slot and migrate to has_area_in_hectare

- Archived area_hectares_archived_20260115.yaml, replaced by has_area_in_hectare for standardized area measurement.

chore: Archive arrangement_notes slot and migrate to has_arrangement_note

- Archived arrangement_notes_archived_20260115.yaml, replaced by has_arrangement_note for improved naming consistency.

chore: Archive available_caption_languages slot and migrate to has_available_caption_language

- Archived available_caption_languages_archived_20260115.yaml, replaced by has_available_caption_language for better naming.

chore: Archive beneficiary_group slot and migrate to has_or_had_beneficiary

- Archived beneficiary_group_archived_20260115.yaml, replaced by has_or_had_beneficiary for compliance with naming conventions.

chore: Archive branch_head slot and migrate to has_or_had_head

- Archived branch_head_archived_20260114.yaml, replaced by has_or_had_head for better semantic alignment.

chore: Archive budget_currency slot

- Archived budget_currency_archived_20260114.yaml for future migration.

chore: Archive building_floor_area_sqm slot and migrate to has_or_had_area

- Archived building_floor_area_sqm_archived_20260115.yaml, replaced by has_or_had_area for standardized area measurement.

chore: Archive has_area_in_hectare slot and migrate to has_or_had_area

- Archived has_area_in_hectare_archived_20260115.yaml, replaced by has_or_had_area for compliance with naming conventions.

feat: Introduce has_or_had_area slot for area measurements

- Added has_or_had_area.yaml as the authoritative slot for area measurements, compliant with Rule 53.

feat: Introduce has_or_had_beneficiary slot for beneficiary tracking

- Added has_or_had_beneficiary.yaml to identify beneficiaries of organizational programs, following RiC-O naming conventions.

feat: Introduce has_or_had_currency slot for monetary values

- Added has_or_had_currency.yaml to associate currencies with monetary amounts, compliant with ISO 4217.

feat: Introduce has_or_had_head slot for organizational heads

- Added has_or_had_head.yaml to link organizational units with their heads, following W3C ORG standards.

feat: Introduce has_or_had_unit slot for measurement units

- Added has_or_had_unit.yaml to associate measurements with their units, compliant with Rule 53.
2026-01-14 17:28:38 +01:00
kempersc
bdf3ceafb8 feat: Migrate and standardize measurement units; introduce Area and MeasureUnit classes 2026-01-14 17:04:33 +01:00
kempersc
913a1a41a7 chore: Update generated timestamp in manifest.json and add new slot revisions in slot_fixes.yaml 2026-01-14 16:59:47 +01:00
kempersc
8902f0e082 feat: Migrate and enhance currency and area slots; introduce Currency class 2026-01-14 16:58:25 +01:00
kempersc
6da794ee38 feat: Introduce new slots and classes for enhanced heritage data modeling
- Added `has_or_had_place_of_birth` slot to capture structured birth place information with historical context.
- Introduced `has_or_had_quantity` slot for capturing quantified values with units and provenance.
- Created `has_or_had_service_area` slot to define geographic service areas for heritage custodians.
- Implemented `is_or_was_approximate` slot to indicate uncertainty in values (dates, quantities).
- Added `is_or_was_asserted_by` slot to track the agent responsible for assertions.
- Introduced `Asserter` class to model agents making assertions, including types like human, automated, and AI.
- Created `Quantity` class to represent quantified values with optional units and types.
- Added enums for `AsserterTypeEnum` and `QuantityTypeEnum` to standardize types of asserters and quantities.
- Archived outdated slots and replaced them with new structured alternatives following RiC-O conventions.
2026-01-14 16:54:10 +01:00
kempersc
4acdab5b4e chore: Update generated timestamp in manifest.json
Some checks failed
Deploy Frontend / build-and-deploy (push) Successful in 4m8s
DSPy RAG Evaluation / Layer 1 - Unit Tests (push) Failing after 8m54s
DSPy RAG Evaluation / Layer 2 - DSPy Module Tests (push) Has been skipped
DSPy RAG Evaluation / Layer 3 - Integration Tests (push) Has been skipped
DSPy RAG Evaluation / Layer 4 - Comprehensive Evaluation (push) Has been skipped
DSPy RAG Evaluation / Quality Gate (push) Failing after 2s
2026-01-14 16:22:08 +01:00
kempersc
4338d0a081 feat: Add structured representation for BirthDate and BirthPlace classes
- Introduced BirthDate class with support for EDTF notation, provenance tracking, and confidence scoring.
- Added BirthPlace class to preserve historical names, link modern equivalents, and integrate geographic identifiers.
- Created Approximation Level slot to express uncertainty levels for various values.
- Migrated existing slots to structured classes for better data modeling, including has_or_had_date_of_birth and has_or_had_place_of_birth.
- Enhanced service area representation with has_or_had_service_area slot, linking to ServiceArea class.
- Updated is_or_was_approximate slot to model uncertainty levels using ApproximationStatus class.
- Archived previous versions of slots for historical reference.
2026-01-14 16:04:09 +01:00
kempersc
5ddb7e818a Refactor schema: Migrate slots to new patterns and create new classes
- Migrated `audio_event_segments` to `has_or_had_segment` with range `AudioEventSegment` in VideoAudioAnnotation.yaml.
- Removed deprecated slots: `approved_by`, `audio_event_segments`, `bay_number`, `box_number`, and `budget_status`.
- Created new classes: `AudioEventSegment`, `BayNumber`, `BoxNumber`, and `BudgetStatus` to encapsulate previously slot-based data.
- Introduced `has_or_had_auxiliary_entities` slot to replace `auxiliary_places` and `auxiliary_platforms`.
- Archived removed slots to maintain historical context.
- Updated LinkMLViewerPage to utilize new schema element popup for better navigation.
2026-01-14 15:20:53 +01:00
kempersc
7691a11e79 chore: Update generated timestamp in manifest.json and archive budget_status slot 2026-01-14 15:14:23 +01:00
kempersc
7c7d8c0270 feat: Add SchemaElementPopup component for displaying LinkML schema element previews
- Implemented a draggable, resizable, and minimizable popup component for displaying previews of LinkML schema elements (classes, slots, enums).
- Integrated loading states and error handling for fetching element information.
- Added navigation functionality to go to full element view.
- Enhanced user experience with type badges and detailed descriptions for each element type.

chore: Migrate AudioEventSegment, BayNumber, BoxNumber, and BudgetStatus classes to new YAML schema format

- Created new YAML definitions for AudioEventSegment, BayNumber, BoxNumber, and BudgetStatus classes with detailed descriptions and attributes.
- Migrated from deprecated slots to new class structures as part of Rule 53.
- Updated imports and prefixes for consistency across schemas.

chore: Archive deprecated slots for audio_event_segments, bay_number, and box_number

- Archived previous slot definitions for audio_event_segments, bay_number, and box_number to maintain historical records.
- Updated slot descriptions and ensured proper URI mappings for future reference.
2026-01-14 15:13:06 +01:00
kempersc
b927bc4b43 Update manifest.json and migrate approved_by slot to is_or_was_approved_by; add includes_or_included slot to InformationCarrier; remove bookplate slot and archive it 2026-01-14 15:05:37 +01:00
kempersc
21c207c9da Refactor schema slots and classes for improved clarity and structure
- Migrated `archived_at` to `is_or_was_archived_at` in AuxiliaryDigitalPlatform, WebObservation, and other relevant classes to better reflect historical archival status.
- Removed `bold_id` slot and replaced it with `has_or_had_identifier` linked to the new `BOLDIdentifier` class in BiologicalObject.
- Introduced `Bookplate` and `Approver` classes to enhance provenance tracking and ownership documentation.
- Updated `InformationCarrier` to replace `bookplate` with `includes_or_included` for better representation of ownership marks.
- Added new slots `is_or_was_approved_by` and `is_or_was_archived_at` to capture historical approval and archival locations.
- Archived old slot definitions for `archived_at` and `bold_id` to maintain schema integrity.
- Enhanced LinkedIn profile extraction functionality by integrating Linkup API alongside Exa API.
2026-01-14 13:28:33 +01:00
kempersc
c8471d3a02 Update generated timestamp in manifest.json 2026-01-14 13:10:01 +01:00
kempersc
60e66d60f9 Add new slots and classes for enhanced documentation and availability tracking
- Introduced `is_or_was_created_through` slot to indicate content creation methods, replacing previous boolean flags.
- Added `is_or_was_required` slot for generic temporal boolean requirements, aligning with Schema.org.
- Created `AutoGeneration` class to represent automatic content generation, capturing methods and provenance.
- Established `AvailabilityStatus` class to model resource availability with temporal validity.
- Developed `Documentation` class for structured documentation resources, replacing domain-specific slots.
- Implemented `Taxon` class for biological classification in natural history collections.
- Archived previous slots related to API availability and documentation, ensuring a clean schema.
- Enhanced existing slots with detailed descriptions and examples for clarity and usability.
2026-01-14 13:09:31 +01:00
kempersc
b13674400f Refactor schema slots and classes for improved organization and clarity
- Removed deprecated slots: appraisal_notes, branch_id, is_or_was_real.
- Introduced new slots: has_or_had_notes, has_or_had_provenance.
- Created Notes class to encapsulate note-related metadata.
- Archived removed slots and classes in accordance with the new archive folder convention.
- Updated slot_fixes.yaml to reflect migration status and details.
- Enhanced documentation for new slots and classes, ensuring compliance with ontology alignment.
- Added new slots for note content, date, and type to support the Notes class.
2026-01-14 12:14:07 +01:00
kempersc
b8914761b8 standardise slots 2026-01-14 09:51:14 +01:00
kempersc
e3adb4ed60 feat: Introduce Overview, RealnessStatus, and WebLink classes with comprehensive documentation and migration notes
- Added Overview class to represent structured collections of web links, including detailed descriptions, examples, and ontology alignments.
- Introduced RealnessStatus class to classify data as real or synthetic, with rich provenance and temporal semantics.
- Created WebLink class for representing hyperlinks with associated metadata, enhancing structured link representation.
- Established new slots: has_or_had_comprehensive_overview, is_or_was_real, and includes_or_included to support the new classes and improve data modeling.
- Migrated existing slots to new structures, ensuring compliance with RiC-O naming conventions and enhancing specificity.
- Updated annotations and examples across all new classes and slots for clarity and usability.
2026-01-14 09:32:14 +01:00
kempersc
c807487a51 Refactor expense slots: remove program_expense slot, migrate administrative_expenses, and archive related slots
- Deleted the program_expense slot from the schema.
- Updated slot_fixes.yaml to reflect the migration of administrative_expenses, marking it as fully migrated and archiving related bespoke slots.
- Created archived YAML files for administrative_expenses, fundraising_expense, has_or_had_administrative_expense, innovation_expense, and program_expense, documenting their structure and descriptions.
- All expense types now utilize the Expenses class with ExpenseTypeEnum classification for better organization and clarity.
2026-01-14 09:15:17 +01:00
kempersc
1133749de8 fix: update manifest.json timestamp and consolidate expense slots in FinancialStatement.yaml 2026-01-14 09:07:11 +01:00
kempersc
554c5721ea fix: update generated timestamp in manifest.json and add has_or_had_expenses slot definition 2026-01-14 09:06:36 +01:00
kempersc
b30711fcfb update slots 2026-01-14 09:05:54 +01:00
kempersc
17da3a81e9 feat(review): add enhanced pagination with first/last page buttons and page input
- Add first page (<<) and last page (>>) navigation buttons
- Add direct page number input field for jumping to specific pages
- Update CSS styling for new pagination controls including input field
- Use stacked ChevronLeft/ChevronRight icons for first/last (lucide-react compatibility)
2026-01-13 23:27:28 +01:00
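A direct page-number input needs clamping so that out-of-range or non-numeric entries cannot navigate to a page that does not exist. The following is a minimal sketch of that step only; `parsePageInput` and its parameters are hypothetical names, not taken from the commit.

```typescript
// Hypothetical helper: turn raw text from a page-number input into a valid
// 1-based page index, or null when the text holds no usable number.
function parsePageInput(raw: string, totalPages: number): number | null {
  const n = Number.parseInt(raw.trim(), 10);
  if (Number.isNaN(n) || totalPages < 1) return null;
  // Clamp into [1, totalPages] so the first/last buttons and the input
  // field all agree on the same bounds.
  return Math.min(Math.max(n, 1), totalPages);
}
```

Returning `null` rather than a default page lets the caller leave the current page unchanged when the input is not a number.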
kempersc
833bb56833 feat(entity-resolution): expand consumer email domain list
All checks were successful
Deploy Frontend / build-and-deploy (push) Successful in 3m55s
Add additional Dutch ISP domains for better filtering:
- gmail.nl, icloud.nl, aol.nl, aol.com
- telfortglasvezel.nl, worldonline.nl, delta.nl, lijbrandt.nl
- t-mobilethuis.nl, compaqnet.nl, filternet.nl, onsmail.nl, box.nl
- mailinator.com (disposable email)
2026-01-13 20:54:34 +01:00
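As a rough illustration of how such a domain list can be used for filtering, the sketch below checks whether an address belongs to a consumer or disposable provider. The domains are the ones listed in the commit above; the function name `isConsumerEmail` and the set-based lookup are assumptions, not the project's actual implementation.

```typescript
// Domains copied from the commit message; the real list is presumably longer.
const CONSUMER_DOMAINS: ReadonlySet<string> = new Set([
  "gmail.nl", "icloud.nl", "aol.nl", "aol.com",
  "telfortglasvezel.nl", "worldonline.nl", "delta.nl", "lijbrandt.nl",
  "t-mobilethuis.nl", "compaqnet.nl", "filternet.nl", "onsmail.nl", "box.nl",
  "mailinator.com",
]);

// Hypothetical helper: true when the address uses a consumer or disposable
// domain and therefore carries no institutional signal for matching.
function isConsumerEmail(email: string): boolean {
  const at = email.lastIndexOf("@");
  if (at < 0) return false; // not an email address at all
  const domain = email.slice(at + 1).trim().toLowerCase();
  return CONSUMER_DOMAINS.has(domain);
}
```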
kempersc
6a3616beac feat(entity-resolution): expand Dutch heritage domain mappings
Some checks are pending
Deploy Frontend / build-and-deploy (push) Waiting to run
Add domain mappings for better email-based entity matching:
- Government: noord-holland.nl, amsterdam.nl, rotterdam.nl, denhaag.nl,
  hoorn.nl, hhnk.nl, rijksoverheid.nl, politie.nl, kadaster.nl, rvo.nl,
  rivm.nl, staatsbosbeheer.nl, vng.nl
- Museums: maritiemmuseum.nl, paleishetloo.nl, slotloevestein.nl
- Universities: student.vu.nl, cdh.leidenuniv.nl, jur.ru.nl, student.ru.nl,
  student.tudelft.nl, eshcc.eur.nl, wur.nl, ou.nl
- Hogescholen: hva.nl, student.hu.nl, student.fontys.nl

Also remove deprecated activity_id.yaml slot file
2026-01-13 20:53:49 +01:00
kempersc
408813280a refactor: simplify slot descriptions to be more concise
Some checks are pending
Deploy Frontend / build-and-deploy (push) Waiting to run
- accepts_or_accepted_external_work: Remove verbose examples list
- accepts_or_accepted_payment_method: Condense to single sentence
- accepts_or_accepted_visiting_scholar: Minor rewording for consistency
2026-01-13 20:52:05 +01:00
kempersc
ea8dc37905 feat(entity-review): add wrong person detection and confidence filtering
Some checks are pending
Deploy Frontend / build-and-deploy (push) Waiting to run
- Add is_likely_wrong_person and wrong_person_reason fields to MatchCandidate
- Add confidence_original field for tracking pre-adjustment scores
- Add visual indicators: AlertTriangle for wrong person, Star for high confidence
- Add filter checkboxes: 'Show high confidence (>80%)' and 'Hide wrong person'
- Add wrong person alert banner with bilingual labels (NL/EN)
- Add danger stat card showing count of likely wrong person matches
- Style signal badges by type: danger (birth_year_mismatch), success (validated)
- Add extensive CSS for wrong-person/high-confidence alerts and candidate styling
2026-01-13 20:49:47 +01:00
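The two filter checkboxes described above can be sketched as a single predicate over the candidate list. The fields `is_likely_wrong_person`, `wrong_person_reason`, and `confidence_original` come from the commit message; `confidence_score`, `name`, `CandidateFilters`, and `filterCandidates` are hypothetical, and the 0.8 threshold assumes scores are stored in the range 0 to 1.

```typescript
interface MatchCandidate {
  name: string;                  // hypothetical display field
  confidence_score: number;      // hypothetical: adjusted score in [0, 1]
  confidence_original?: number;  // score before adjustment
  is_likely_wrong_person: boolean;
  wrong_person_reason?: string;
}

// Hypothetical state for the two checkboxes.
interface CandidateFilters {
  highConfidenceOnly: boolean; // 'Show high confidence (>80%)'
  hideWrongPerson: boolean;    // 'Hide wrong person'
}

function filterCandidates(
  candidates: MatchCandidate[],
  filters: CandidateFilters,
): MatchCandidate[] {
  return candidates.filter((c) => {
    if (filters.hideWrongPerson && c.is_likely_wrong_person) return false;
    // Strictly greater than 80%, matching the checkbox label.
    if (filters.highConfidenceOnly && c.confidence_score <= 0.8) return false;
    return true;
  });
}
```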
kempersc
fcf36f9a11 fix: prevent ontology popup flash by using useLayoutEffect for centering
All checks were successful
Deploy Frontend / build-and-deploy (push) Successful in 3m56s
DSPy RAG Evaluation / Layer 1 - Unit Tests (push) Successful in 11m5s
DSPy RAG Evaluation / Layer 2 - DSPy Module Tests (push) Successful in 12m50s
DSPy RAG Evaluation / Layer 3 - Integration Tests (push) Successful in 10m51s
DSPy RAG Evaluation / Layer 4 - Comprehensive Evaluation (push) Successful in 11m59s
DSPy RAG Evaluation / Quality Gate (push) Successful in 1s
2026-01-13 20:38:21 +01:00
kempersc
92b490d690 edit slots 2026-01-13 20:35:11 +01:00
kempersc
2907c0372a feat: add Getty AAT support and resolve PREMIS/BIBFRAME URIs to human-readable LOC docs
- Add Getty AAT (Art & Architecture Thesaurus) vocabulary support
  - fetchGettyAATEntity() fetches term info from vocab.getty.edu JSON-LD API
  - Extracts English labels, scope notes, and aliases
  - Shows 'concept' term type for SKOS concepts

- Add getHumanReadableUrl() to map RDF URIs to documentation pages
  - PREMIS 3.0: http://www.loc.gov/premis/rdf/v3/X → id.loc.gov HTML docs
  - BIBFRAME: http://id.loc.gov/ontologies/bibframe/X → id.loc.gov HTML docs
  - Uses c_ prefix for classes, p_ for properties

- Add Getty vocabulary prefixes (aat:, tgn:, ulan:)
- Add ontology badge colors for PREMIS 3, LOCN, Getty AAT
2026-01-13 19:14:31 +01:00
kempersc
fc63164335 fix: close ontology popup when navigating to different LinkML schema files
All checks were successful
Deploy Frontend / build-and-deploy (push) Successful in 4m17s
When users click on a different class, enum, or slot in the sidebar,
the ontology term popup now automatically closes. This prevents the
popup from persisting and showing stale information from the
previously viewed schema element.
2026-01-13 18:22:49 +01:00
kempersc
635beca582 sync: update frontend schema copies with duplicate mapping fixes (Rule 52)
Some checks are pending
Deploy Frontend / build-and-deploy (push) Waiting to run
2026-01-13 18:13:54 +01:00
kempersc
3b676f3ea5 fix: remove duplicate ontology mappings rendering in LinkML viewer
Some checks are pending
Deploy Frontend / build-and-deploy (push) Waiting to run
The slot details section was rendering close_mappings, narrow_mappings,
broad_mappings, and related_mappings twice each. This caused the mappings
to appear duplicated on pages like /linkml?class=AcademicArchive.

Removed 68 lines of duplicate JSX code.
2026-01-13 18:11:54 +01:00
kempersc
8a3c907f59 fix: resolve CIDOC-CRM relative URIs, add EDM ontology, render HTML in descriptions
Some checks failed
Deploy Frontend / build-and-deploy (push) Has been cancelled
DSPy RAG Evaluation / Layer 1 - Unit Tests (push) Successful in 11m12s
DSPy RAG Evaluation / Layer 2 - DSPy Module Tests (push) Successful in 12m55s
DSPy RAG Evaluation / Layer 3 - Integration Tests (push) Successful in 10m24s
DSPy RAG Evaluation / Layer 4 - Comprehensive Evaluation (push) Successful in 11m31s
DSPy RAG Evaluation / Quality Gate (push) Successful in 2s
- Fix resolveUri() to handle bare local names like 'E27_Site' used by CIDOC-CRM
  (previously only handled URIs starting with '#')
- Add EDM (Europeana Data Model) ontology to frontend
  - Copy edm.owl to frontend/public/ontology/
  - Register in ONTOLOGY_FILES array
  - Add 'edm' prefix to STANDARD_PREFIXES
  - Add EDM color to ONTOLOGY_COLORS
- Render HTML content in ontology descriptions safely using DOMPurify
  - Sanitize HTML to allow only safe tags (a, br, em, strong, etc.)
  - Fix Schema.org relative links to absolute URLs
  - Add target='_blank' to external links
2026-01-13 16:50:40 +01:00
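The `resolveUri()` fix described above could look roughly like the sketch below, which treats a bare local name such as `E27_Site` the same way as a `#`-prefixed fragment. Only the function name comes from the commit; the signature, the separator rule, and the scheme check are assumptions, and prefixed names such as `crm:E27_Site` are assumed to be expanded before this function is called.

```typescript
// Hypothetical sketch of resolving an ontology-local reference against the
// ontology's base URI.
function resolveUri(ref: string, baseUri: string): string {
  // Already absolute (has a URI scheme): return unchanged.
  if (/^[a-zA-Z][a-zA-Z0-9+.-]*:/.test(ref)) return ref;

  // Fragment reference, the case that was handled before the fix.
  if (ref.startsWith("#")) return baseUri.replace(/#$/, "") + ref;

  // Bare local name, e.g. 'E27_Site': append to the base, adding '#'
  // only when the base does not already end in a separator.
  const needsSeparator = !(baseUri.endsWith("/") || baseUri.endsWith("#"));
  return baseUri + (needsSeparator ? "#" : "") + ref;
}
```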
kempersc
f74513e8ef feat: Enhance entity resolution with email semantics and review merging
- Updated `entity_review.py` to map email semantic fields from JSON.
- Expanded `email_semantics.py` with additional museum mappings.
- Introduced a new rule in `.opencode/rules/no-duplicate-ontology-mappings.md` to prevent duplicate ontology mappings.
- Added a backup JSON file for entity resolution candidates.
- Created `enrich_email_semantics.py` to enrich candidates with email semantic signals.
- Developed `merge_entity_reviews.py` to merge reviewed decisions from a backup into new candidates.
2026-01-13 16:43:56 +01:00
kempersc
21ed120ac2 fix: correct hallucinated RiC-O terms and add locn ontology
RiC-O hallucinated terms fixed:
- FindingAidType.yaml: rico:FindingAidType → rico:DocumentaryFormType
- has_acquisition_method.yaml: rico:hasOrHadActivityType → prov:wasGeneratedBy
- has_activity_type.yaml: rico:hasOrHadActivityType → dcterms:type
- has_arrangement.yaml: rico:hasOrHadArrangement → dcterms:description
- has_or_had_finding_aid.yaml: rico:isDescribedBy → rico:isOrWasDescribedBy

The following terms do NOT exist in RiC-O 1.1:
- rico:FindingAidType (use rico:DocumentaryFormType)
- rico:hasOrHadActivityType (no equivalent)
- rico:hasOrHadArrangement (no equivalent)
- rico:isDescribedBy (correct form: rico:isOrWasDescribedBy)

Added LOCN ontology support:
- Copied locn.ttl to frontend/public/ontology/
- Added LOCN to ONTOLOGY_FILES in ontology-loader.ts
- Added locn prefix to OntologyTermPopup.tsx
- LOCN (http://www.w3.org/ns/locn#) is W3C Location Core Vocabulary
  for addresses and geometry (used by locn:Address)
2026-01-13 16:42:32 +01:00
kempersc
6781073d06 fix: add @base directive support for Turtle/RDF parsing
The VCard ontology file (and 3 others) use @base directive with relative URIs
like <#Address>. The Turtle parser was not extracting @base or resolving
relative URIs against it.

Changes:
- Extract @base directive in first pass alongside @prefix
- Add baseUri parameter to expandUri() function
- Handle relative URIs starting with # (resolve against base)
- Handle empty relative URI <> (returns base URI itself)
- Pass baseUri through to processSubject() function

This fixes the 'Term not found' error for vcard:Address and similar terms
that use relative URI notation in their ontology definitions.

Affected ontologies: vcard.rdf, prov.ttl, era_ontology.ttl, ebg-ontology.ttl
2026-01-13 15:54:29 +01:00
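The changes listed above can be sketched as follows: pull the `@base` IRI out of the document, then resolve `<#Fragment>` and `<>` against it when expanding terms. `expandUri` and `baseUri` are named in the commit; `extractBase`, `PrefixMap`, the regular expression, and the function bodies are assumptions made for illustration, and this is not a full relative-IRI resolver.

```typescript
type PrefixMap = Record<string, string>;

// Hypothetical helper: find the first '@base <...> .' directive, if any.
function extractBase(turtle: string): string | undefined {
  const match = /@base\s+<([^>]*)>\s*\./.exec(turtle);
  return match?.[1];
}

// Sketch of expandUri() with the added baseUri parameter.
function expandUri(term: string, prefixes: PrefixMap, baseUri?: string): string {
  if (term.startsWith("<") && term.endsWith(">")) {
    const iri = term.slice(1, -1);
    const base = baseUri ?? "";
    if (iri === "") return base; // <> denotes the base URI itself
    if (iri.startsWith("#")) {
      // Replace any fragment on the base with the new one.
      return base.replace(/#.*$/, "") + iri;
    }
    return iri; // other IRIs are returned as written
  }
  // Prefixed name such as vcard:Address.
  const colon = term.indexOf(":");
  if (colon >= 0) {
    const namespace = prefixes[term.slice(0, colon)];
    if (namespace !== undefined) return namespace + term.slice(colon + 1);
  }
  return term;
}
```

With a base of `http://www.w3.org/2006/vcard/ns` (used here only as an example value), `<#Address>` expands to the same URI that `vcard:Address` produces through the prefix map, which is what lets the term lookup succeed.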
kempersc
f2b10fca19 fix: correct hallucinated PREMIS terms and Schema.org namespace mismatch
All checks were successful
Deploy Frontend / build-and-deploy (push) Successful in 3m48s
PREMIS ontology fixes (8 schema files):
- Replace invalid premis:hasRepresentation with dcterms:hasFormat
- Replace invalid premis:hasAccessRestriction with odrl:hasPolicy
- Replace invalid premis:hasPreservationPolicy with dcterms:conformsTo
- Replace invalid premis:hasAccessPolicy with dcterms:accessRights
- Replace invalid premis:hasStoragePolicy with dcterms:conformsTo
- Replace invalid premis:ProcessingStatus with skos:Concept
- Add proper close_mappings to valid PREMIS classes (premis:Representation, etc.)
- Document hallucinated terms in Rule 51 (AGENTS.md) for future prevention

Schema.org namespace fixes (3 frontend files):
- Update OntologyTermPopup.tsx: add normalizeSchemaOrgUri() function
- Update ontology-loader.ts: change schema prefix to https://schema.org/
- Update linkml-schema-service.ts: change schema prefix to https://schema.org/
- The schemaorg.owl file uses https:// but code was using http://

These changes ensure ontology term lookups work correctly for Schema.org
terms and that LinkML schema files only reference valid ontology predicates.
2026-01-13 14:16:33 +01:00
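The namespace mismatch above comes down to two spellings of the same prefix. A minimal sketch of a normalizer, assuming it only needs to rewrite the scheme of Schema.org URIs (the function name is from the commit, the body is an assumption):

```typescript
const SCHEMA_ORG_HTTP = "http://schema.org/";
const SCHEMA_ORG_HTTPS = "https://schema.org/";

// Rewrite http://schema.org/X to https://schema.org/X so lookups match the
// namespace used in schemaorg.owl; all other URIs pass through unchanged.
function normalizeSchemaOrgUri(uri: string): string {
  return uri.startsWith(SCHEMA_ORG_HTTP)
    ? SCHEMA_ORG_HTTPS + uri.slice(SCHEMA_ORG_HTTP.length)
    : uri;
}
```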
kempersc
1fb924c412 feat: add ontology mappings to LinkML schema and enhance entity resolution
Schema enhancements (443 files):
- Add class_uri with proper ontology references (schema:, prov:, skos:, rico:)
- Add close_mappings, related_mappings per Rule 50 convention
- Replace stub hc: slot_uri with standard predicates (dcterms:identifier, skos:prefLabel)
- Improve descriptions with ontology mapping rationale
- Add prefixes blocks to all schema modules

Entity Resolution improvements:
- Add entity_resolution module with email semantics parsing
- Enhance build_entity_resolution.py with email-based matching signals
- Extend Entity Review API with filtering by signal types and count
- Add candidates caching and indexing for performance
- Add ReviewLoginPage component

New rules and documentation:
- Add Rule 51: No Hallucinated Ontology References
- Add .opencode/rules/no-hallucinated-ontology-references.md
- Add .opencode/rules/slot-ontology-mapping-reference.md
- Add adms.ttl and dqv.ttl ontology files

Frontend ontology support:
- Add RiC-O_1-1.rdf and schemaorg.owl to public/ontology
2026-01-13 13:51:02 +01:00
kempersc
c5fb9ec88e feat: add route for Entity Review page with lazy loading 2026-01-13 01:49:43 +01:00
kempersc
8d7aca0f98 Refactor code structure for improved readability and maintainability 2026-01-12 19:13:35 +01:00
kempersc
3b35f4aea5 Refactor code structure for improved readability and maintainability 2026-01-12 18:31:31 +01:00
kempersc
846a6cdcec Add new Record Set Types for various archival collections
- Introduced SoundArchiveRecordSetType, SpecialCollectionRecordSetType, SpecializedArchiveRecordSetType, SpecializedArchivesCzechiaRecordSetType, StateArchivesRecordSetType, StateArchivesSectionRecordSetType, StateDistrictArchiveRecordSetType, StateRegionalArchiveCzechiaRecordSetType, TelevisionArchiveRecordSetType, TradeUnionArchiveRecordSetType, UniversityArchiveRecordSetType, VereinsarchivRecordSetType, VerlagsarchivRecordSetType, VerwaltungsarchivRecordSetType, WebArchiveRecordSetType, and WomensArchivesRecordSetType.
- Each new type includes appropriate metadata, slots, and relationships to existing classes.
- Implemented a script to detect and fix Type class violations in LinkML files.
2026-01-12 15:20:29 +01:00
kempersc
5807840bbc fix: update generated timestamp in manifest.json
Some checks failed
Deploy Frontend / build-and-deploy (push) Successful in 4m2s
DSPy RAG Evaluation / Layer 1 - Unit Tests (push) Failing after 6s
DSPy RAG Evaluation / Layer 3 - Integration Tests (push) Has been skipped
DSPy RAG Evaluation / Layer 2 - DSPy Module Tests (push) Has been skipped
DSPy RAG Evaluation / Layer 4 - Comprehensive Evaluation (push) Has been skipped
DSPy RAG Evaluation / Quality Gate (push) Failing after 2s
2026-01-12 14:46:05 +01:00
kempersc
355d8be51d centralise slots 2026-01-12 14:33:56 +01:00
kempersc
f497be98d1 chore: update schema manifest after slot centralization
All checks were successful
Deploy Frontend / build-and-deploy (push) Successful in 3m54s
2026-01-11 23:28:03 +01:00
kempersc
da5660cf4c chore: update schema manifest timestamp 2026-01-11 23:19:52 +01:00
kempersc
5d3d8530b0 chore: trigger DSPy eval workflow
All checks were successful
Deploy Frontend / build-and-deploy (push) Successful in 4m13s
2026-01-11 22:40:23 +01:00
kempersc
56c373bba8 Implement fast WCMS migration script with state file checkpointing and batch processing 2026-01-11 22:26:37 +01:00
kempersc
174a420c08 refactor(schema): centralize 1515 inline slot definitions per Rule 48
All checks were successful
Deploy Frontend / build-and-deploy (push) Successful in 3m57s
- Remove inline slot definitions from 144 class files
- Create 7 new centralized slot files in modules/slots/:
  - custodian_type_broader.yaml
  - custodian_type_narrower.yaml
  - custodian_type_related.yaml
  - definition.yaml
  - finding_aid_access_restriction.yaml
  - finding_aid_description.yaml
  - finding_aid_temporal_coverage.yaml
- Add centralize_inline_slots.py automation script
- Update manifest with new timestamp

Rule 48: Class files must NOT define inline slots - all slots
must be imported from modules/slots/ directory.

Note: Pre-existing IdentifierFormat duplicate class definition
(in Standard.yaml and IdentifierFormat.yaml) not addressed in
this commit - requires separate schema refactor.
2026-01-11 22:02:14 +01:00
kempersc
3e6c2367ad feat(linkml-viewer): UX improvements - entry counts, deep links, settings persistence
All checks were successful
Deploy Frontend / build-and-deploy (push) Successful in 4m4s
- Add entry count badge next to schema file name showing (xC, yE, zS) counts
- Add tooltip explaining LinkML file names vs class names
- Remove redundant section headers (Classes, Enums, Slots collapsible sections)
- Add URL params for enum (?enum=) and slot (?slot=) deep linking
- Persist category filters, dev tools visibility, and legend visibility to localStorage
- Set 'Main Schema' filter to OFF by default (confusing for users)
- Add Rule 48: Class files must not define inline slots
2026-01-11 21:42:35 +01:00
kempersc
eff3153f3f feat(schema): add Environmental Zone Type slot definitions
All checks were successful
Deploy Frontend / build-and-deploy (push) Successful in 3m56s
Add 4 slot files for EnvironmentalZoneType class:
- environmental_zone_type_id: URI identifier slot
- environmental_zone_type_code: code slot for zone type codes
- environmental_zone_type_label: human-readable label
- environmental_zone_type_description: detailed description

Update manifest.json with new slot count (2084 slots total)
2026-01-11 21:22:44 +01:00
kempersc
95d79d0078 fix: update manifest with new generated timestamp and file counts; add EnvironmentalZoneType classes and new slot requirements
All checks were successful
Deploy Frontend / build-and-deploy (push) Successful in 4m51s
2026-01-11 21:15:49 +01:00
kempersc
10bb5b69c5 Add Environmental Zone Type Enumeration and related slots
- Introduced EnvironmentalZoneTypeEnum.yaml to classify climate-controlled storage zones with detailed descriptions and recommended conditions for various materials.
- Created slots for environmental zone type code, description, ID, label, and HC preset URI to facilitate structured data representation.
- Implemented boolean slots for specific environmental requirements including dark storage, dust-free environment, ESD protection, and UV filtering, referencing relevant ISO standards.
- Enhanced documentation for each slot to clarify usage and preservation context.
2026-01-11 21:14:59 +01:00
kempersc
66ab2908d0 fix: remove deprecated AnnotationMotivationEnum, add European surname data
Some checks failed
Deploy Frontend / build-and-deploy (push) Failing after 3m21s
- Move deprecated AnnotationMotivationEnum to archive-deprecated/ (outside served paths)
- Add French, Italian, Polish, Spanish surname datasets for entity resolution
- Update name_commonality.py with expanded European surname detection
- Triggers GitOps workflow to test Forgejo Actions runner
2026-01-11 16:03:18 +01:00
kempersc
fd792fce2c Refactor code structure for improved readability and maintainability
Some checks failed
Deploy Frontend / build-and-deploy (push) Has been cancelled
2026-01-11 15:27:14 +01:00
kempersc
0f7fbf1ca0 feat(ci): add Forgejo Actions workflow for auto-deploy on LinkML schema changes
Some checks are pending
Deploy Frontend / build-and-deploy (push) Waiting to run
Infrastructure changes to enable automatic frontend deployment when schemas change:

- Add .forgejo/workflows/deploy-frontend.yml workflow triggered by:
  - Changes to frontend/** or schemas/20251121/linkml/**
  - Manual workflow dispatch

- Rewrite generate-schema-manifest.cjs to properly scan all schema directories
  - Recursively scans classes, enums, slots, modules directories
  - Uses singular category names (class, enum, slot) matching TypeScript types
  - Includes all 4 main schemas at root level
  - Skips archive directories and backup files

- Update schema-loader.ts to match new manifest format
  - Add SchemaCategory interface
  - Update SchemaManifest to use categories as array
  - Add flattenCategories() helper function
  - Add getSchemaCategories() and getSchemaCategoriesSync() functions

The workflow builds frontend with updated manifest and deploys to bronhouder.nl
2026-01-11 14:16:57 +01:00
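To illustrate the manifest change, the sketch below models categories as an array and flattens them into a single file list. `SchemaCategory`, `SchemaManifest`, and `flattenCategories()` are named in the commit, and the singular category names are taken from it; every field name (`generated`, `name`, `files`, `path`) and the `SchemaFile` type are hypothetical, since the commit does not show the actual manifest shape.

```typescript
interface SchemaFile {
  name: string; // hypothetical field
  path: string; // hypothetical field
}

interface SchemaCategory {
  name: "class" | "enum" | "slot"; // singular names, as in the commit
  files: SchemaFile[];
}

interface SchemaManifest {
  generated: string;            // hypothetical timestamp field
  categories: SchemaCategory[]; // categories as an array
}

type FlatSchemaFile = SchemaFile & { category: SchemaCategory["name"] };

// Flatten the per-category arrays into one list, tagging each file with
// the category it came from.
function flattenCategories(manifest: SchemaManifest): FlatSchemaFile[] {
  const flat: FlatSchemaFile[] = [];
  manifest.categories.forEach((category) => {
    category.files.forEach((file) => {
      flat.push({ ...file, category: category.name });
    });
  });
  return flat;
}
```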
kempersc
329b341bb1 refactor(schema): sync AnnotationMotivationType changes to frontend public schemas
- Update VideoAnnotation class with new motivation type references
- Add AnnotationMotivationType and AnnotationMotivationTypes class files
- Add motivation_type slots (description, id, name)
- Archive deprecated AnnotationMotivationEnum
- Update slot references for derived_from_entity, has_observation, has_person_observation
2026-01-11 14:16:39 +01:00
kempersc
9726cc7917 feat(frontend): Add AnnotationMotivationType to LinkML schema manifest
Add new AnnotationMotivationType and AnnotationMotivationTypes to the
SCHEMA_FILES array so they appear in the /linkml viewer.
2026-01-11 13:56:11 +01:00
kempersc
0a888ec682 chore: add node_modules to .gitignore and remove from tracking
- Add node_modules/ and .pnpm-store/ to .gitignore
- Remove 76k node_modules files from git tracking
- Update frontend manifest
2026-01-11 00:41:21 +01:00
kempersc
a4184cb805 feat(infra): add webhook-based schema deployment pipeline
- Add FastAPI webhook receiver for Forgejo push events
- Add setup script for server deployment
- Add Caddy snippet for webhook endpoint
- Add local sync-schemas.sh helper script
- Sync frontend schemas with source (archived deprecated slots)

Infrastructure scripts staged for optional webhook deployment.
Current deployment uses: ./infrastructure/deploy.sh --frontend
2026-01-10 21:45:02 +01:00
kempersc
6c19ef8661 feat(rag): add Rule 46 epistemic provenance tracking
Track full lineage of RAG responses: WHERE data comes from, WHEN it was
retrieved, HOW it was processed (SPARQL/vector/LLM).

Backend changes:
- Add provenance.py with EpistemicProvenance, DataTier, SourceAttribution
- Integrate provenance into MultiSourceRetriever.merge_results()
- Return epistemic_provenance in DSPyQueryResponse

Frontend changes:
- Pass EpistemicProvenance through useMultiDatabaseRAG hook
- Display provenance in ConversationPage (for cache transparency)

Schema fixes:
- Fix truncated example in has_observation.yaml slot definition

References:
- Pavlyshyn's Context Graphs and Data Traces paper
- LinkML ProvenanceBlock schema pattern
2026-01-10 18:42:43 +01:00
kempersc
28c3aaf33f enrich profiles 2026-01-10 17:31:02 +01:00
kempersc
13938c92ca chore(schemas): sync LinkML schemas to frontend apps
Copies authoritative schemas from schemas/20251121/ to:
- frontend/public/schemas/20251121/
- apps/archief-assistent/public/schemas/20251121/

This ensures slot definitions with corrected ontology property
references (commit 2808dad6cd) are available to frontend apps.
2026-01-10 15:02:25 +01:00
kempersc
8a475d5c02 refactor(linkml): apply RiC-O slot naming conventions (Rule 39)
Rename slots to follow Records in Contexts (RiC-O) style naming:
- Add 'has_' prefix for possession predicates (has_acquisition_method)
- Add 'is_or_was_' prefix for temporal relationships
- Add 'has_or_had_' for bidirectional temporal relations

Key changes across 496 schema files:
- acquisition_method → has_acquisition_method
- acquisition_date → has_acquisition_date
- acquisition_source → has_acquisition_source
- access_policy_ref → has_access_policy_reference
- arrangement → has_arrangement
- parent_custodian → is_or_was_suborganization_of (hierarchy)
- parent_custodian → associated_custodian (event association)

Also adds new slots following RiC-O patterns:
- is_or_was_aggregated_by
- is_or_was_allocated_by
- is_or_was_archive_department_of
- was_approved_by, was_archived_at, was_asserted_by

This aligns with AGENTS.md Rule 39: Slot Naming Convention (RiC-O Style)
for accurate temporal semantics in heritage custodian ontology.

Net change: +2,063 lines (new slots added, old patterns consolidated)
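
The renames listed above can be pictured as a lookup table plus one special case for `parent_custodian`, which splits into two slots. This is a minimal sketch, not the project's migration script: the function name `rename_slot` and the `context` argument are hypothetical, and only the slot names themselves come from the commit message.

```python
# Hypothetical sketch of the Rule 39 renames; not the actual migration code.
SIMPLE_RENAMES = {
    "acquisition_method": "has_acquisition_method",
    "acquisition_date": "has_acquisition_date",
    "acquisition_source": "has_acquisition_source",
    "access_policy_ref": "has_access_policy_reference",
    "arrangement": "has_arrangement",
}

# parent_custodian maps to a different slot depending on how it was used.
# The context keys ("hierarchy", "event") are assumptions for illustration.
PARENT_CUSTODIAN_BY_CONTEXT = {
    "hierarchy": "is_or_was_suborganization_of",
    "event": "associated_custodian",
}


def rename_slot(name: str, context: str = "hierarchy") -> str:
    """Return the RiC-O style name for a legacy slot name."""
    if name == "parent_custodian":
        return PARENT_CUSTODIAN_BY_CONTEXT[context]
    # Names that are not in the table are returned unchanged.
    return SIMPLE_RENAMES.get(name, name)
```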
2026-01-10 10:33:51 +01:00
kempersc
004d342935 chore: minor updates and evaluation results
- auth.setup.ts: require env vars for test credentials (no hardcoded defaults)
- manifest.json: update schema manifest
- full_evaluation_results.json: add RAG evaluation results
- petra-links.json: update birth date from web claim
2026-01-09 21:10:55 +01:00
kempersc
f7bd3e9edc feat(linkml-viewer): add slot_usage side-by-side comparison view
- Add 'Compare' toggle button next to slots with slot_usage overrides
- Show generic slot definition vs class-specific override in 3-column grid
- Highlight changed properties with green 'changed' badge
- Display '(inherited)' when override matches generic definition
- Display '(not defined)' when generic has no value for property
- Compare: range, description, required, multivalued, slot_uri, pattern, identifier
- Full i18n support (Dutch/English translations)
- Responsive design: stacks vertically on mobile (<640px)
2026-01-09 21:02:14 +01:00
kempersc
9e67d0f967 enrich profiles 2026-01-09 20:35:19 +01:00
kempersc
932ec5438c add person profiles with PPID 2026-01-09 18:26:58 +01:00
kempersc
1ad717767a feat(linkml-viewer): add visual indicators for slot_usage overrides
- Add green 'slot_usage' badge for slots with class-specific overrides
- Add ✦ markers next to properties that are overridden vs inherited
- Add green left border styling for slots with slot_usage
- Add i18n translations (nl/en) for override indicators
- Merge generic slot definitions with class-specific slot_usage properties

This helps users understand which slot properties come from the generic
slot definition vs which are overridden at the class level via slot_usage.
2026-01-09 18:23:21 +01:00
kempersc
35a057981c chore(frontend): sync schema files with custodian_type → has_or_had_custodian_type refactor
- Remove deprecated slots: custodian_type.yaml, custodian_types.yaml,
  custodian_type_broader/narrower/related.yaml, custodian_types_primary/rationale.yaml
- Add new unified slot: has_or_had_custodian_type.yaml
- Sync all 236+ class files with updated slot references
- Update manifest.json
2026-01-09 12:15:32 +01:00
kempersc
c88fd3af70 Refactor code structure for improved readability and maintainability 2026-01-09 11:05:26 +01:00
kempersc
0393b321c9 refactor(schema): unify custodian_type slots into has_or_had_custodian_type (Rule 39, 43)
- Migrate 236+ class files from custodian_types to has_or_had_custodian_type
- Archive deprecated slots: custodian_type, custodian_types, custodian_type_broader/narrower/related
- Update main schema and manifest imports
- Fix Custodian.yaml class to use new slot
- Fix annotation format (list→scalar) in has_or_had_custodian_type.yaml

Rules applied:
- Rule 39: RiC-O naming convention (hasOrHad pattern)
- Rule 43: Slot nouns must be singular (multivalued:true for cardinality)
- Rule 38: Slot centralization with semantic URI
2026-01-09 10:55:21 +01:00
kempersc
6608a207d4 update frontend 2026-01-08 15:56:28 +01:00
kempersc
0b0ea75070 feat(rag): add factual query fast path - skip LLM for count/list queries
- Add ontology cache warming at startup in lifespan() function
- Add is_factual_query() detection in template_sparql.py (12 templates)
- Add factual_result and sparql_query fields to DSPyQueryResponse
- Skip LLM generation for factual templates (count, list, compare)
- Execute SPARQL directly and return results as table (~15s → ~2s latency)
- Update ConversationPanel.tsx to render factual results table
- Add CSS styling for factual results with green theme

For queries like 'hoeveel archieven zijn er in Den Haag', the SPARQL
results ARE the answer - no need for expensive LLM prose generation.
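
The fast path described above might look roughly like the following. This is a sketch under stated assumptions: the commit does not list the 12 factual templates, so the prefix-based check, the template identifiers, and the `answer` helper with its callback parameters are all hypothetical. Only the names `is_factual_query` and `factual_result` appear in the commit message.

```python
from typing import Callable, List, Optional

# Assumed naming scheme for factual templates (count, list, compare).
FACTUAL_TEMPLATE_PREFIXES = ("count_", "list_", "compare_")


def is_factual_query(template_id: Optional[str]) -> bool:
    """True when the matched SPARQL template needs no LLM prose."""
    return template_id is not None and template_id.startswith(
        FACTUAL_TEMPLATE_PREFIXES
    )


def answer(
    template_id: Optional[str],
    run_sparql: Callable[[], List[dict]],
    generate_prose: Callable[[List[dict]], str],
) -> dict:
    """Run SPARQL, then skip the LLM step for factual templates."""
    rows = run_sparql()
    if is_factual_query(template_id):
        # The rows are the answer; return them for rendering as a table.
        return {"factual_result": rows, "answer": None}
    return {"factual_result": None, "answer": generate_prose(rows)}
```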
2026-01-08 13:34:23 +01:00
kempersc
9b769f1ca2 Update manifest timestamp and minor class fixes 2026-01-07 22:04:29 +01:00
kempersc
81da4ede50 Add comprehensive slot visualization to LinkML viewer
- Add standalone Slots section in visual view alongside Classes and Enums
- Display slot_uri, range, identifier badge, description, pattern
- Show examples with value/description pairs
- Color-coded SKOS mapping tags (exact/close/narrow/broad/related)
- Yellow highlighted comments section
- Custodian type filtering works with slots
- Shared renderSlotDetails() function for consistency
2026-01-07 22:03:08 +01:00
kempersc
d19822f958 Remove redundant sections from class descriptions
- Created cleanup_class_descriptions_v2.py script using text-based regex
- Removed redundant sections from 134 class files:
  - dual_class_pattern: 80 occurrences
  - ontological_alignment: 35 occurrences
  - ontology_alignment_upper: 33 occurrences
  - multilingual_labels: 26 occurrences
  - glamorcubes_category: 6 occurrences
  - example_structure: 6 occurrences
- Fixed ArchiveOrganizationType.yaml parse error after cleanup
- Added 49 new slot definition files
- All 395 class files validate as correct YAML
- Deployed to bronhouder.nl/linkml
2026-01-07 13:50:14 +01:00
kempersc
dfa667c90f Fix LinkML schema for valid RDF generation with proper slot_uri
Summary:
- Create 46 missing slot definition files with proper slot_uri values
- Add slot imports to main schema (01_custodian_name_modular.yaml)
- Fix YAML examples sections in 116+ class and slot files
- Fix PersonObservation.yaml examples section (nested objects → string literals)

Technical changes:
- All slots now have explicit slot_uri mapping to base ontologies (RiC-O, Schema.org, SKOS)
- Eliminates malformed URIs like 'custodian/:slot_name' in generated RDF
- gen-owl now produces valid Turtle with 153,166 triples

New slot files (46):
- RiC-O slots: rico_note, rico_organizational_principle, rico_has_or_had_holder, etc.
- Scope slots: scope_includes, scope_excludes, archive_scope
- Organization slots: organization_type, governance_authority, area_served
- Platform slots: platform_type_category, portal_type_category
- Social media slots: social_media_platform_category, post_type_*
- Type hierarchy slots: broader_type, narrower_types, custodian_type_broader
- Wikidata slots: wikidata_equivalent, wikidata_mapping

Generated output:
- schemas/20251121/rdf/01_custodian_name_modular_20260107_134534_clean.owl.ttl (6.9MB)
- Validated with rdflib: 153,166 triples, no malformed URIs
2026-01-07 13:48:03 +01:00
kempersc
98c42bf272 Fix LinkML URI conflicts and generate RDF outputs
- Fix scope_note → finding_aid_scope_note in FindingAid.yaml
- Remove duplicate wikidata_entity slot from CustodianType.yaml (import instead)
- Remove duplicate rico_record_set_type from class_metadata_slots.yaml
- Fix range types for equals_string compatibility (uriorcurie → string)
- Move class names from close_mappings to see_also in 10 RecordSetTypes files
- Generate all RDF formats: OWL, N-Triples, RDF/XML, N3, JSON-LD context
- Sync schemas to frontend/public/schemas/

Files: 1,151 changed (includes prior CustodianType migration)
2026-01-07 12:32:59 +01:00
kempersc
6c6810fa43 Replace CustodianTypeCodeEnum with CustodianType class references
- Remove deprecated CustodianTypeCodeEnum from class_metadata_slots.yaml
- Update custodian_types slot to use uriorcurie range (references CustodianType subclasses)
- Update custodian_types_primary slot similarly
- Add migration note for legacy string format ['A'] vs new URI format

Per Rule 9: Enum-to-Class Promotion - Single Source of Truth
2026-01-06 12:37:40 +01:00
kempersc
b34992b1d3 Migrate all 293 class files to ontology-aligned slots
Extends migration to all class types (museums, libraries, galleries, etc.)

New slots added to class_metadata_slots.yaml:
- RiC-O: rico_record_set_type, rico_organizational_principle,
  rico_has_or_had_holder, rico_note
- Multilingual: label_de, label_es, label_fr, label_nl, label_it, label_pt
- Scope: scope_includes, scope_excludes, custodian_only,
  organizational_level, geographic_restriction
- Notes: privacy_note, preservation_note, legal_note

Migration script now handles 30+ annotation types.
All migrated schemas pass linkml-validate.

Total: 387 class files now use proper slots instead of annotations.
2026-01-06 12:24:54 +01:00
kempersc
aa763dab25 Migrate 94 archive class annotations to ontology-aligned slots
- Add migration script: scripts/migrate_annotations_to_slots.py
- Convert custodian_types, wikidata, skos_broader, specificity_* annotations
- Replace with proper slots mapped to SKOS, PROV-O, RiC-O predicates
- Add ../slots/class_metadata_slots import to all migrated files
- Remove AcademicArchive_refactored.yaml (main file now migrated)
- Sync changes to frontend/public/schemas/

Migration converts:
  - custodian_types → hc:custodianTypes slot
  - wikidata/wikidata_label → wikidata_alignment structured slot
  - skos_broader → skos:broader slot
  - specificity_* → specificity_annotation structured slot
  - dual_class_pattern → dual_class_link structured slot
  - template_specificity → template_specificity slot

All 94 migrated schemas pass linkml-validate.
2026-01-06 11:25:37 +01:00
kempersc
f37f5208ca Copy class metadata slots to frontend public folder for deployment 2026-01-06 11:17:12 +01:00
kempersc
11983014bb Enhance specificity scoring system integration with existing infrastructure
- Updated documentation to clarify integration points with existing components in the RAG pipeline and DSPy framework.
- Added detailed mapping of SPARQL templates to context templates for improved specificity filtering.
- Implemented wrapper patterns around existing classifiers to extend functionality without duplication.
- Introduced new tests for the SpecificityAwareClassifier and SPARQLToContextMapper to ensure proper integration and functionality.
- Enhanced the CustodianRDFConverter to include ISO country and subregion codes from GHCID for better geospatial data handling.
2026-01-05 17:37:49 +01:00
kempersc
41d8905661 Fix Turtle parser multi-line string handling for PiCo ontology
- Fixed bug where closing triple-quotes (""") would incorrectly re-trigger
  multi-line string detection, causing subsequent class definitions to be skipped
- Added lineToProcess variable to track which portion of line to process after
  closing a multi-line string, preventing re-detection of opening quotes
- Moved UML large diagram confirmation logic from OntologyViewerPage to
  UMLVisualization component for better encapsulation
- PiCo ontology now correctly shows all 8 classes instead of 2

Deployed and verified on https://bronhouder.nl/ontology?ontology=PiCo
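
The fix above can be illustrated with a small scanner that, after closing a multi-line string, continues with only the remainder of the line. This is a Python sketch of the idea, not the frontend parser itself: the function name is hypothetical, only the `line_to_process` variable mirrors the `lineToProcess` mentioned in the commit, and escaped quotes are ignored for brevity.

```python
from typing import List

TRIPLE = '"""'


def lines_outside_strings(text: str) -> List[str]:
    """Return the parts of each line that lie outside triple-quoted literals."""
    outside = []
    in_multiline = False
    for line in text.splitlines():
        line_to_process = line
        kept = ""
        while True:
            idx = line_to_process.find(TRIPLE)
            if idx == -1:
                # No further delimiter: keep the rest only when outside a string.
                if not in_multiline:
                    kept += line_to_process
                break
            if not in_multiline:
                kept += line_to_process[:idx]
            # Toggle state, then continue with the text after the delimiter so
            # a closing delimiter is never re-read as an opening one.
            in_multiline = not in_multiline
            line_to_process = line_to_process[idx + len(TRIPLE):]
        if kept.strip():
            outside.append(kept.strip())
    return outside
```

With this approach, a class declaration that follows a multi-line `rdfs:comment` is still visible to the caller, which is the behaviour the commit restores for PiCo.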
2026-01-05 11:25:43 +01:00
kempersc
242bc8bb35 Add new slots for heritage custodian entities
- Created deliverables_slot for expected or achieved deliverable outputs.
- Introduced event_id_slot for persistent unique event identifiers.
- Added follow_up_date_slot for scheduled follow-up action dates.
- Implemented object_ref_slot for references to heritage objects.
- Established price_slot for price information across entities.
- Added price_currency_slot for currency codes in price information.
- Created protocol_slot for API protocol specifications.
- Introduced provenance_text_slot for full provenance entry text.
- Added record_type_slot for classification of record types.
- Implemented response_formats_slot for supported API response formats.
- Established status_slot for current status of entities or activities.
- Added FactualCountDisplay component for displaying count query results.
- Introduced ReplyTypeIndicator component for visualizing reply types.
- Created approval_date_slot for formal approval dates.
- Added authentication_required_slot for API authentication status.
- Implemented capacity_items_slot for maximum storage capacity.
- Established conservation_lab_slot for conservation laboratory information.
- Added cost_usd_slot for API operation costs in USD.
2026-01-05 00:49:05 +01:00
kempersc
89001fbc53 compact header controls on OntologyViewer and QueryBuilder pages 2026-01-04 17:29:34 +01:00
kempersc
eb61f45de2 compact UML controls toolbar to fit single line when sidebar collapsed 2026-01-04 17:21:53 +01:00
kempersc
2dca28d8c1 enrich CH entries with mission statements 2026-01-04 13:12:32 +01:00
kempersc
4f0cafe98a enrich HC profiles 2026-01-02 02:11:04 +01:00
kempersc
349f31ae6f enrich custodian profiles 2026-01-02 02:10:18 +01:00
kempersc
45e873ec0a enrich JP BE AR profiles 2025-12-30 23:07:03 +01:00
kempersc
d64f857aa9 add sparql validator and RAG injector 2025-12-30 03:43:31 +01:00
kempersc
ca219340f2 enrich entries 2025-12-26 14:30:31 +01:00
kempersc
54869589d1 fix(linkml-viewer): 3D cube visualization bugs - prevent click-to-filter and parse JSON custodian_types
- Remove onFaceClick prop from CustodianTypeIndicator3D in class/slot/enum detail views
  to prevent accidental filtering when clicking decorative cubes (Bug 3)
- Add parseCustodianTypesAnnotation() helper to handle JSON-stringified arrays like '["A"]'
  in YAML annotations, fixing Bug 2 where all 19 letters appeared on every cube
- Legend bar retains onTypeClick for intentional filtering functionality
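
A helper of the kind described above has to accept both a plain code and a JSON-stringified array such as '["A"]'. The following is a Python sketch of that logic. The frontend helper `parseCustodianTypesAnnotation()` is the real one, while this function name, its fallback behaviour, and its handling of malformed input are assumptions.

```python
import json
from typing import Any, List


def parse_custodian_types_annotation(value: Any) -> List[str]:
    """Normalise a custodian_types annotation to a list of type codes."""
    if isinstance(value, list):
        return [str(v) for v in value]
    if not isinstance(value, str):
        return []
    text = value.strip()
    if text.startswith("["):
        # Looks like a JSON-stringified array, e.g. '["A"]'.
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, list):
            return [str(v) for v in parsed]
    # Fall back to treating the text as a single code.
    return [text] if text else []
```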
2025-12-23 20:40:32 +01:00
kempersc
5e8a432ef0 enrich Japanese and Dutch custodians 2025-12-23 18:08:45 +01:00
kempersc
a1fb6344e7 enriching custodian data 2025-12-23 17:26:29 +01:00
kempersc
0c1d19e98b enrich entries 2025-12-23 13:27:35 +01:00
kempersc
aca68ea47f remove ambiguous web-claims 2025-12-21 00:01:54 +01:00
kempersc
23b1d8ee5f clean up GHCID 2025-12-17 11:58:40 +01:00
kempersc
99430c2a70 add new entries and semantic routing 2025-12-17 10:11:56 +01:00
kempersc
5fe692296d Fix RDF visualization: correct SPARQL namespaces and show all node types
- Update SPARQL CONSTRUCT query to use correct ontology namespaces:
  - hc: https://w3id.org/heritage/custodian/ (was nde)
  - Use nde:Custodian type (was crm:E39_Actor)
  - Use schema:location + geo:lat/long (was crm predicates)
- Remove LIMIT 500 clause to fetch all results
- Show all node types by default instead of random single type
- Fixes issue where Knowledge Graph showed incomplete/random data
2025-12-17 08:55:26 +01:00
kempersc
e0dd847491 extend ontology 2025-12-16 20:27:39 +01:00
kempersc
b0416efc7d enrich custodians and persons 2025-12-16 11:57:34 +01:00
kempersc
52ae711c56 add timespans 2025-12-16 09:02:52 +01:00
kempersc
b1340e30c8 add timespan 2025-12-15 22:35:35 +01:00
kempersc
cb56aa7e40 enrich all custodian timespan 2025-12-15 22:31:41 +01:00
kempersc
82aa655522 feat(conversation): Add resizable embedding projector panel with improved UX
- Larger default size (700x550) for better readability
- Resizable from all 8 edges/corners with visual SE grip indicator
- Clearer button icons (18px, strokeWidth 2.5)
- Draggable, minimizable, pinnable panel
- Dark theme and mobile responsive support
2025-12-15 17:45:27 +01:00
kempersc
31bbce13e6 fix(types): Make genealogiewerkbalk nested fields optional
Fixes TypeScript error where parseGenealogiewerkbalk returns optional
fields but Institution interface expected required fields.
2025-12-15 09:04:09 +01:00
kempersc
0a38225b36 feat(frontend): Add multi-select filters, URL params, and UI improvements
- Institution Browser: multi-select for types and countries
- URL query param sync for shareable filter URLs
- New utility: countryNames.ts with flag emoji support
- New utility: imageProxy.ts for image URL handling
- New component: SearchableMultiSelect dropdown
- Career timeline CSS and component updates
- Media gallery improvements
- Lazy load error boundary component
- Version check utility
2025-12-15 01:47:11 +01:00
kempersc
22709cc13e feat(rag): Add per-message refresh, bypass cache toggle, and cache clear improvements
- Add refresh button to assistant messages for re-running queries with fresh results
- Highlight refresh button (amber) for cached responses to draw attention
- Add spinning icon animation while refreshing
- Fix cache clear to return detailed success/failure status for local vs shared cache
- Add bypass cache toggle that forces fresh queries (one-shot, resets after query)
- Add Dutch/English translations for new UI elements
2025-12-14 19:12:25 +01:00
kempersc
1d26cade66 correct person labels 2025-12-14 17:58:55 +01:00
kempersc
c6aee998db correct person labels 2025-12-14 17:29:39 +01:00
kempersc
c50c35fd3a enrich person custodian 2025-12-14 17:09:55 +01:00
kempersc
41aace785f feat: Add SyncPanel component for database synchronization
- Add SyncPanel component with bilingual (NL/EN) support
- Add relative URL handling for production (bronhouder.nl)
- Integrate SyncPanel into Database page
- Show sync status for all 4 databases (DuckLake, PostgreSQL, Oxigraph, Qdrant)
- Support dry-run mode and file limit options
2025-12-12 23:42:22 +01:00
kempersc
505c12601a Add test script for PiCo extraction from Arabic waqf documents
- Implemented a new script `test_pico_arabic_waqf.py` to test the GLM annotator's ability to extract person observations from Arabic historical documents.
- The script includes environment variable handling for API token, structured prompts for the GLM API, and validation of extraction results.
- Added comprehensive logging for API responses, extraction results, and validation errors.
- Included a sample Arabic waqf text for testing purposes, following the PiCo ontology pattern.
2025-12-12 17:50:17 +01:00
kempersc
b1f93b6f22 enrich person profiles 2025-12-12 12:51:10 +01:00
kempersc
03263f67d6 moved web archives 2025-12-12 00:40:26 +01:00
kempersc
1b1cfbfca0 enrich custodians 2025-12-11 22:32:09 +01:00
kempersc
d4906abae4 update postgis data 2025-12-10 23:51:51 +01:00
kempersc
be3fbac601 enrich entries and persons 2025-12-10 18:04:25 +01:00
kempersc
41959f0766 correct HCID! 2025-12-10 13:01:13 +01:00
kempersc
3a6ead8fde feat: Add legal form filtering rule for CustodianName
- Introduced LEGAL-FORM-FILTER rule to standardize CustodianName by removing legal form designations.
- Documented rationale, examples, and implementation guidelines for the filtering process.

docs: Create README for value standardization rules

- Established a comprehensive README outlining various value standardization rules applicable to Heritage Custodian classes.
- Categorized rules into Name Standardization, Geographic Standardization, Web Observation, and Schema Evolution.

feat: Implement transliteration standards for non-Latin scripts

- Added TRANSLIT-ISO rule to ensure GHCID abbreviations are generated from emic names using ISO standards for transliteration.
- Included detailed guidelines for various scripts and languages, along with implementation examples.

feat: Define XPath provenance rules for web observations

- Created XPATH-PROVENANCE rule mandating XPath pointers for claims extracted from web sources.
- Established a workflow for archiving websites and verifying claims against archived HTML.

chore: Update records lifecycle diagram

- Generated a new Mermaid diagram illustrating the records lifecycle for heritage custodians.
- Included phases for active records, inactive archives, and processed heritage collections with key relationships and classifications.
2025-12-09 16:58:41 +01:00
kempersc
a7321b1bb9 reconstruct location blocks 2025-12-09 12:25:16 +01:00
kempersc
cab712659d recover location blocks 2025-12-09 11:34:56 +01:00
kempersc
62fdd35321 Refactor code structure for improved readability and maintainability 2025-12-09 11:15:51 +01:00
kempersc
131e3ca259 normalise custodian entries 2025-12-09 07:56:35 +01:00
kempersc
13f67bed19 feat(frontend): add graph visualization and data explorer features
Database Panels:
- Add D3.js force-directed graph visualization to Oxigraph and TypeDB panels
- Add 'Explore' tab with class/entity browser, graph/table toggle, and search
- Add data explorer to PostgreSQL panel with table browser, pagination, search, export
- Fix SPARQL variable naming bug in Oxigraph getGraphData() function
- Add node details panel showing selected entity attributes
- Add zoom/pan controls and node coloring by entity type

Map Features:
- Add TimelineSlider component for temporal filtering of institutions
- Support dual-handle range slider with decade histogram
- Add quick presets (Ancient, Medieval, Modern, Contemporary)
- Show institution density visualization by founding decade

Hooks:
- Extend useOxigraph with getGraphData() for graph visualization
- Extend useTypeDB with getGraphData() for graph visualization
- Extend usePostgreSQL with getTableData() and exportTableData()
- Improve useDuckLakeInstitutions with temporal filtering support

Styles:
- Add HeritageDashboard.css with shared panel styling
- Add TimelineSlider.css for timeline component styling
2025-12-08 14:56:17 +01:00
kempersc
7e3559f7e5 add new entries 2025-12-07 23:08:02 +01:00
kempersc
57c743b005 refactor(frontend): fetch NL municipalities from PostGIS API instead of static file
Replace static netherlands_municipalities_simplified.geojson with dynamic
PostGIS API call to /boundaries/countries/NL/admin2/geojson.

Transform API response properties to expected format:
- API: {code, name, name_local, admin1_code, admin1_name}
- Expected: {code, naam, provincieCode, provincieNaam}

This ensures NL boundary data comes from the authoritative PostGIS
database rather than a static file that could become outdated.
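
The property transformation above amounts to a key remapping. This is a Python sketch of it, not the frontend code: the function name is hypothetical, and mapping `naam` from `name` rather than `name_local` is an assumption, since the commit message does not say which one is used.

```python
def to_expected_properties(api_props: dict) -> dict:
    """Map PostGIS API feature properties to the format the map expects."""
    return {
        "code": api_props["code"],
        # Assumption: `name` feeds `naam`; `name_local` is dropped here.
        "naam": api_props["name"],
        "provincieCode": api_props["admin1_code"],
        "provincieNaam": api_props["admin1_name"],
    }
```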
2025-12-07 19:48:07 +01:00
kempersc
12965071be fix(frontend): improve DuckLake connection detection in map page
Wait for DuckLake loading to complete before deciding whether to use
DuckLake data or fall back to static JSON. Prevents race conditions.
2025-12-07 19:23:12 +01:00
kempersc
1981dc28ed fix(frontend): normalize org_type names to letter codes in DuckLake hook
DuckLake stores full names like 'MUSEUM' but map expects single-letter
codes like 'M' for color styling. Also includes CSS fixes for Database page.
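
The normalization above can be sketched as a lookup from full names to letter codes. Only the 'MUSEUM' to 'M' pair comes from the commit message. The other entries, the function name, and the pass-through for unknown values are assumptions made for illustration.

```python
ORG_TYPE_TO_CODE = {
    "MUSEUM": "M",   # stated in the commit message
    "ARCHIVE": "A",  # assumed
    "LIBRARY": "L",  # assumed
}


def normalize_org_type(value: str) -> str:
    """Return the single-letter code for an org_type value."""
    key = value.strip().upper()
    if len(key) == 1:
        # Already a letter code.
        return key
    # Unknown full names are passed through unchanged (uppercased).
    return ORG_TYPE_TO_CODE.get(key, key)
```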
2025-12-07 19:21:58 +01:00
kempersc
810022d524 feat(frontend): add search filter and claim page filter to DuckLakePanel
- Add search bar to filter table data across all columns
- Filter web archive claims by selected page
- Include source_page in claim queries for filtering
- Fix TypeScript unused parameter warning
2025-12-07 19:20:40 +01:00
kempersc
f82dd57903 feat(frontend): add useBoundariesAPI hook for PostGIS boundary fetching
New React hook that fetches administrative boundaries from the PostGIS API:
- Supports international boundaries (NL, JP, CZ, DE, BE, CH, AT, etc.)
- Caches admin1, admin2, and GeoJSON data
- Provides point-in-polygon lookup
- Includes utility functions for filtering boundaries by code/name
- Replaces static GeoJSON file loading pattern
2025-12-07 19:20:21 +01:00
kempersc
d9325c0bb5 feat: add web archives integration and improve enrichment scripts
Backend:
- Attach web_archives.duckdb as read-only database in DuckLake
- Create views for web_archives, web_pages, web_claims in heritage schema

Scripts:
- enrich_cities_google.py: Add batch processing and retry logic
- migrate_web_archives.py: Improve schema handling and error recovery

Frontend:
- DuckLakePanel: Add web archives query support
- Database.css: Improve layout for query results display
2025-12-07 17:49:07 +01:00
kempersc
0b06af0fb6 chore: mark unused function and ignore ducklake databases 2025-12-07 14:28:12 +01:00
kempersc
9d15cce65c docs: add enrichment reports and update manifest
Add enrichment reports from city resolution:
- Austrian, Belgian, Bulgarian, Czech, Swiss ISIL enrichment reports
- GeoNames update reports
- Custodian creation reports
- Entry-to-GHCID mapping file
2025-12-07 14:27:36 +01:00