Commit graph

31 commits

Author SHA1 Message Date
kempersc
fba1ab9353 feat: Migrate multiple slots to structured classes and update processing notes 2026-01-26 01:41:04 +01:00
kempersc
ba2c766dd0 Add new slots and update existing ones following RiC-O temporal naming conventions
- Introduced `founding_date`, `founding_date_diocese`, and `fr` slots for capturing founding dates and French language text.
- Created `collects_or_collected`, `has_or_had_objective`, `has_or_had_percentage`, `has_or_had_place`, `has_or_had_reply`, `has_or_had_web_page`, `is_or_was_acquired_by`, `is_or_was_appreciated`, `is_or_was_founded_through`, `is_or_was_part_of`, `is_or_was_part_of_total`, `start_of_the_start`, `takes_or_took_comission`, and `was_fetched_at` slots to enhance data modeling capabilities.
- Each slot includes detailed descriptions, examples, and ontology alignments to ensure clarity and usability.
- Migration notes added for slots transitioned from previous definitions to maintain historical context and facilitate understanding of changes.
2026-01-22 15:15:56 +01:00
kempersc
4a277d7d42 standardise slots 2026-01-19 00:09:28 +01:00
kempersc
4319f38c05 Add archived slots for audience size, audience type, and capacity metrics
- Created new YAML files for audience size and audience type slots, defining their properties and annotations.
- Added archived capacity slots including cubic meters, linear meters, item count, and descriptions, with appropriate URIs and ranges.
- Introduced a template specificity slot for context-aware RAG filtering.
- Consolidated capacity-related slots into a unified structure, including has_or_had_capacity, capacity_type, and capacity_value, with detailed descriptions and examples.
2026-01-17 18:53:23 +01:00
kempersc
196f8a1023 Refactor schema slots and classes for improved semantic clarity and consistency
- Migrated `temperature_tolerance` to `allows_or_allowed` with `TemperatureDeviation` class for structured temperature tolerance representation.
- Replaced `temporal_coverage` with `has_or_had_content` to enhance temporal modeling using the `Content` class.
- Updated `FindingAid`, `LegalResponsibilityCollection`, and `EnvironmentalZone` schemas to reflect new slot structures.
- Archived obsolete slots: `temperature_tolerance`, `temporal_coverage`, `typical_http_methods`, and `typical_response_formats`.
- Introduced `has_or_had_technological_infrastructure` slot to replace `technology_stack`, providing a structured approach to modeling technological components.
- Enhanced documentation and examples across affected schemas to ensure clarity on new structures and their usage.
2026-01-16 20:09:58 +01:00
kempersc
f9f3cc8e74 fix: resolve YAML import indentation and add missing slot descriptions
Schema Improvements:
- Fix YAML import indentation across 800+ class files (sed: '^- ../' → '  - ../')
- Add descriptions to 26 inline slots missing them (lint warnings)
- Fix malformed imports in BirthPlace.yaml and CustodianObservation.yaml

Validation Results:
- linkml-lint: 4 warnings (intentional SCREAMING_CASE tier names)
- gen-owl: SUCCESS (164,069 lines generated)
- gen-json-schema: SUCCESS (9.4MB generated)

Files affected: 1,034 files, +23,908 -15,200 lines
2026-01-16 00:09:28 +01:00
kempersc
37d923cae1 Refactor slot names and update imports for consistency
- Migrated `was_generated_by` to `is_or_was_generated_by` and `was_derived_from` to `is_or_was_derived_from` across multiple YAML schema files as per Rule 53.
- Updated relevant imports, slot lists, and slot usage keys to reflect the new naming conventions.
- Added migration comments for clarity and tracking.
- Introduced a migration script to automate the changes across all affected files.
2026-01-15 15:07:53 +01:00
kempersc
ad5fbe82cf Migrate valid_from and valid_to slots to temporal_extent across multiple classes
- Consolidated valid_from and valid_to slots into a single temporal_extent slot in FundingRequirement, GiftShop, OrganizationBranch, OrganizationalChangeEvent, OrganizationalStructure, SocialMediaProfile, Storage, StorageUnit classes.
- Updated slot definitions to use TimeSpan for temporal_extent, providing structured validity periods.
- Removed deprecated slots: valid_from, valid_to, verified_by, wikidata_entity_id, and worldcat_id, archiving their definitions for reference.
- Adjusted related documentation and examples to reflect the new temporal_extent structure.
2026-01-14 22:33:36 +01:00
kempersc
1fb924c412 feat: add ontology mappings to LinkML schema and enhance entity resolution
Schema enhancements (443 files):
- Add class_uri with proper ontology references (schema:, prov:, skos:, rico:)
- Add close_mappings, related_mappings per Rule 50 convention
- Replace stub hc: slot_uri with standard predicates (dcterms:identifier, skos:prefLabel)
- Improve descriptions with ontology mapping rationale
- Add prefixes blocks to all schema modules

Entity Resolution improvements:
- Add entity_resolution module with email semantics parsing
- Enhance build_entity_resolution.py with email-based matching signals
- Extend Entity Review API with filtering by signal types and count
- Add candidates caching and indexing for performance
- Add ReviewLoginPage component

New rules and documentation:
- Add Rule 51: No Hallucinated Ontology References
- Add .opencode/rules/no-hallucinated-ontology-references.md
- Add .opencode/rules/slot-ontology-mapping-reference.md
- Add adms.ttl and dqv.ttl ontology files

Frontend ontology support:
- Add RiC-O_1-1.rdf and schemaorg.owl to public/ontology
2026-01-13 13:51:02 +01:00
kempersc
355d8be51d centralise slots 2026-01-12 14:33:56 +01:00
kempersc
0d5d48568d refactor(schema): centralize slot definitions per Rule 38
- Remove slot_uri, description, mappings from slot_usage sections
- Move these properties to centralized slot files in modules/slots/
- Keep only class-specific overrides in slot_usage (required, inlined, examples)
- Update 1,499 centralized slot files with enriched definitions
- Clean 188 class files

Violations fixed:
- slot_uri in slot_usage: 1,676 → 0
- description in slot_usage: 2,287 → 0 (moved to centralized)

Schema still validates: 816 classes, 2028 slots, 127 enums
2026-01-11 23:27:17 +01:00
kempersc
56c373bba8 Implement fast WCMS migration script with state file checkpointing and batch processing 2026-01-11 22:26:37 +01:00
kempersc
174a420c08 refactor(schema): centralize 1515 inline slot definitions per Rule 48
All checks were successful
Deploy Frontend / build-and-deploy (push) Successful in 3m57s
- Remove inline slot definitions from 144 class files
- Create 7 new centralized slot files in modules/slots/:
  - custodian_type_broader.yaml
  - custodian_type_narrower.yaml
  - custodian_type_related.yaml
  - definition.yaml
  - finding_aid_access_restriction.yaml
  - finding_aid_description.yaml
  - finding_aid_temporal_coverage.yaml
- Add centralize_inline_slots.py automation script
- Update manifest with new timestamp

Rule 48: Class files must NOT define inline slots - all slots
must be imported from modules/slots/ directory.

Note: Pre-existing IdentifierFormat duplicate class definition
(in Standard.yaml and IdentifierFormat.yaml) not addressed in
this commit - requires separate schema refactor.
2026-01-11 22:02:14 +01:00
kempersc
626bd3a095 refactor(schemas): apply naming conventions to 261 class files
- Apply Rule 39: RiC-O style hasOrHad*/isOrWas* for temporal slots
- Apply Rule 43: Singular noun convention (keywords → keyword)
- Update slot references to match renamed slot files
- Maintain schema integrity across all class definitions
2026-01-10 15:36:33 +01:00
kempersc
095a3f949c refactor(linkml): apply RiC-O slot naming conventions to /schemas/ (Rule 39)
Apply same RiC-O-style slot naming refactor to /schemas/20251121/linkml/
that was previously applied to frontend/public/schemas/:

- Add 'has_' prefix for possession predicates
- Add 'is_or_was_' prefix for temporal inverse relationships
- Add 'has_or_had_' for bidirectional temporal relations
- Add new slots: is_or_was_aggregated_by, is_or_was_allocated_by, etc.
- Update count slots with proper descriptions

This ensures consistency between the source schema directory and the
frontend-served schemas.

514 files changed, +6,325 insertions, -4,255 deletions
2026-01-10 12:55:45 +01:00
kempersc
0393b321c9 refactor(schema): unify custodian_type slots into has_or_had_custodian_type (Rule 39, 43)
- Migrate 236+ class files from custodian_types to has_or_had_custodian_type
- Archive deprecated slots: custodian_type, custodian_types, custodian_type_broader/narrower/related
- Update main schema and manifest imports
- Fix Custodian.yaml class to use new slot
- Fix annotation format (list→scalar) in has_or_had_custodian_type.yaml

Rules applied:
- Rule 39: RiC-O naming convention (hasOrHad pattern)
- Rule 43: Slot nouns must be singular (multivalued:true for cardinality)
- Rule 38: Slot centralization with semantic URI
2026-01-09 10:55:21 +01:00
kempersc
b34992b1d3 Migrate all 293 class files to ontology-aligned slots
Extends migration to all class types (museums, libraries, galleries, etc.)

New slots added to class_metadata_slots.yaml:
- RiC-O: rico_record_set_type, rico_organizational_principle,
  rico_has_or_had_holder, rico_note
- Multilingual: label_de, label_es, label_fr, label_nl, label_it, label_pt
- Scope: scope_includes, scope_excludes, custodian_only,
  organizational_level, geographic_restriction
- Notes: privacy_note, preservation_note, legal_note

Migration script now handles 30+ annotation types.
All migrated schemas pass linkml-validate.

Total: 387 class files now use proper slots instead of annotations.
2026-01-06 12:24:54 +01:00
kempersc
11983014bb Enhance specificity scoring system integration with existing infrastructure
- Updated documentation to clarify integration points with existing components in the RAG pipeline and DSPy framework.
- Added detailed mapping of SPARQL templates to context templates for improved specificity filtering.
- Implemented wrapper patterns around existing classifiers to extend functionality without duplication.
- Introduced new tests for the SpecificityAwareClassifier and SPARQLToContextMapper to ensure proper integration and functionality.
- Enhanced the CustodianRDFConverter to include ISO country and subregion codes from GHCID for better geospatial data handling.
2026-01-05 17:37:49 +01:00
kempersc
4f0cafe98a enrich HC profiles 2026-01-02 02:11:04 +01:00
kempersc
aca68ea47f remove a,bihguous web-claims 2025-12-21 00:01:54 +01:00
kempersc
99430c2a70 add new entries and semantic routing 2025-12-17 10:11:56 +01:00
kempersc
41959f0766 correct HCID! 2025-12-10 13:01:13 +01:00
kempersc
131e3ca259 normalise custodian entries 2025-12-09 07:56:35 +01:00
kempersc
d661947830 update enriched entries 2025-12-03 17:38:46 +01:00
kempersc
097d116b72 enrich entries 2025-12-01 16:06:34 +01:00
kempersc
f3c149b1bb update entries 2025-11-30 23:30:29 +01:00
kempersc
cd0ff5b9c7 wrap up voorbeeld lijst 2025-11-27 10:58:53 +01:00
kempersc
a5a66eb547 add classes 2025-11-25 12:48:07 +01:00
kempersc
3ff0e33bf9 Add UML diagrams and scripts for custodian schema
- Created PlantUML diagrams for custodian types, full schema, legal status, and organizational structure.
- Implemented a script to generate GraphViz DOT diagrams from OWL/RDF ontology files.
- Developed a script to generate UML diagrams from modular LinkML schema, supporting both Mermaid and PlantUML formats.
- Enhanced class definitions and relationships in UML diagrams to reflect the latest schema updates.
2025-11-23 23:05:33 +01:00
kempersc
67657c39b6 feat: Complete Country Class Implementation and Hypernyms Removal
- Created the Country class with ISO 3166-1 alpha-2 and alpha-3 codes, ensuring minimal design without additional metadata.
- Integrated the Country class into CustodianPlace and LegalForm schemas to support country-specific feature types and legal forms.
- Removed duplicate keys in FeatureTypeEnum.yaml, resulting in 294 unique feature types.
- Eliminated "Hypernyms:" text from FeatureTypeEnum descriptions, verifying that semantic relationships are now conveyed through ontology mappings.
- Created example instance file demonstrating integration of Country with CustodianPlace and LegalForm.
- Updated documentation to reflect the completion of the Country class implementation and hypernyms removal.
2025-11-23 13:09:38 +01:00
kempersc
2761857b0d Add scripts for converting OWL/Turtle ontology to Mermaid and PlantUML diagrams
- Implemented `owl_to_mermaid.py` to convert OWL/Turtle files into Mermaid class diagrams.
- Implemented `owl_to_plantuml.py` to convert OWL/Turtle files into PlantUML class diagrams.
- Added two new PlantUML files for custodian multi-aspect diagrams.
2025-11-22 23:01:13 +01:00