Commit graph

26 commits

Author SHA1 Message Date
kempersc
6c3fa6b5a3 Remove deprecated slots and add new slot definitions for enhanced data modeling
- Deleted obsolete slot definitions for work_location and workshop_space.
- Introduced new TaxonName class to represent scientific taxonomic names with detailed attributes.
- Archived existing slots related to surname_prefix, target_name, taxon_name, terminal_count, text_region_count, title, title_proper, total_chapter, total_characters_extracted, total_connections_extracted, track_name, transcript_format, traveling_venue, type_label, type_status, typical_responsibility, unesco_domain, unesco_inscription_year, unesco_list_status, uniform_title, unit_name, used_by_custodian, uv_filtered_required, valid_from_geo, valid_to_geo, validation_status, variant_of_name, verification_date, viability_status, within_auxiliary_place, and within_place.
- Updated slot descriptions and structures to improve clarity and compliance with standards.
2026-01-15 11:42:35 +01:00
kempersc
d5d970b513 Remove deprecated slot definitions and add archived versions for future reference
- Deleted the following slot definitions:
  - wikidata_class_slot
  - wikidata_entity_label_slot
  - wikidata_mapping_rationale_slot
  - word_count_slot

- Added archived versions of the deleted slots to preserve historical data:
  - wikidata_class_archived_20260114.yaml
  - wikidata_entity_label_archived_20260114.yaml
  - wikidata_mapping_rationale_archived_20260114.yaml
  - word_count_archived_20260114.yaml

- Introduced a new hook `usePersonSearch` for enhanced semantic search functionality in the frontend, supporting debounced queries and caching.
2026-01-14 22:57:09 +01:00
kempersc
ad5fbe82cf Migrate valid_from and valid_to slots to temporal_extent across multiple classes
- Consolidated valid_from and valid_to slots into a single temporal_extent slot in FundingRequirement, GiftShop, OrganizationBranch, OrganizationalChangeEvent, OrganizationalStructure, SocialMediaProfile, Storage, StorageUnit classes.
- Updated slot definitions to use TimeSpan for temporal_extent, providing structured validity periods.
- Removed deprecated slots: valid_from, valid_to, verified_by, wikidata_entity_id, and worldcat_id, archiving their definitions for reference.
- Adjusted related documentation and examples to reflect the new temporal_extent structure.
2026-01-14 22:33:36 +01:00
kempersc
7a72a1d096 Add new classes and slots for enhanced data modeling
- Introduced VerificationStatus, Verifier, VersionNumber, ViabilityStatus, VideoCategoryIdentifier, VideoIdentifier, WhatsAppProfile, WordCount, WorkRevision, and WorldCatIdentifier classes to capture various aspects of data verification, categorization, and identification.
- Created corresponding slots such as analyzes_or_analyzed, unit_type, years_restricted, benefits_provided, consumes_or_consumed, has_or_had_contact_details, has_or_had_investment, has_or_had_liability, has_or_had_likelihood_score, has_or_had_location, has_or_had_net_asset, is_or_was_affiliated_with, is_or_was_allocated_to, is_or_was_alternative_form_of, is_or_was_categorized_as, is_or_was_used_by, and was_last_updated_at to facilitate detailed tracking and categorization of entities and their attributes.
- Each class and slot includes detailed descriptions, usage examples, and mappings to relevant ontologies to ensure interoperability and clarity in data representation.
2026-01-14 20:32:45 +01:00
kempersc
e3adb4ed60 feat: Introduce Overview, RealnessStatus, and WebLink classes with comprehensive documentation and migration notes
- Added Overview class to represent structured collections of web links, including detailed descriptions, examples, and ontology alignments.
- Introduced RealnessStatus class to classify data as real or synthetic, with rich provenance and temporal semantics.
- Created WebLink class for representing hyperlinks with associated metadata, enhancing structured link representation.
- Established new slots: has_or_had_comprehensive_overview, is_or_was_real, and includes_or_included to support the new classes and improve data modeling.
- Migrated existing slots to new structures, ensuring compliance with RiC-O naming conventions and enhancing specificity.
- Updated annotations and examples across all new classes and slots for clarity and usability.
2026-01-14 09:32:14 +01:00
kempersc
b30711fcfb update slots 2026-01-14 09:05:54 +01:00
kempersc
355d8be51d centralise slots 2026-01-12 14:33:56 +01:00
kempersc
0d5d48568d refactor(schema): centralize slot definitions per Rule 38
- Remove slot_uri, description, mappings from slot_usage sections
- Move these properties to centralized slot files in modules/slots/
- Keep only class-specific overrides in slot_usage (required, inlined, examples)
- Update 1,499 centralized slot files with enriched definitions
- Clean 188 class files

Violations fixed:
- slot_uri in slot_usage: 1,676 → 0
- description in slot_usage: 2,287 → 0 (moved to centralized)

Schema still validates: 816 classes, 2028 slots, 127 enums
2026-01-11 23:27:17 +01:00
kempersc
5d3d8530b0 chore: trigger DSPy eval workflow
All checks were successful
Deploy Frontend / build-and-deploy (push) Successful in 4m13s
2026-01-11 22:40:23 +01:00
kempersc
56c373bba8 Implement fast WCMS migration script with state file checkpointing and batch processing 2026-01-11 22:26:37 +01:00
kempersc
174a420c08 refactor(schema): centralize 1515 inline slot definitions per Rule 48
All checks were successful
Deploy Frontend / build-and-deploy (push) Successful in 3m57s
- Remove inline slot definitions from 144 class files
- Create 7 new centralized slot files in modules/slots/:
  - custodian_type_broader.yaml
  - custodian_type_narrower.yaml
  - custodian_type_related.yaml
  - definition.yaml
  - finding_aid_access_restriction.yaml
  - finding_aid_description.yaml
  - finding_aid_temporal_coverage.yaml
- Add centralize_inline_slots.py automation script
- Update manifest with new timestamp

Rule 48: Class files must NOT define inline slots - all slots
must be imported from modules/slots/ directory.

Note: Pre-existing IdentifierFormat duplicate class definition
(in Standard.yaml and IdentifierFormat.yaml) not addressed in
this commit - requires separate schema refactor.
2026-01-11 22:02:14 +01:00
kempersc
f02cffe1e8 refactor(schema): migrate 5 deprecated slots to temporal naming convention
Migrate slots to follow RiC-O-style temporal naming (Rule 39):
- accepts_external_work → accepts_or_accepted_external_work
- accepts_visiting_scholars → accepts_or_accepted_visiting_scholar
- accepts_payment_methods → accepts_or_accepted_payment_method
- access → has_or_had_access_condition
- access_policy_ref → has_or_had_access_policy_reference

Updated classes to use new slot names:
- ConservationLab.yaml
- ResearchCenter.yaml
- GiftShop.yaml
- ArchiveReference.yaml
- FindingAid.yaml
- Collection.yaml

Archived deprecated slots to schemas/20251121/linkml/archive/slots/
with _archived_20260110 suffix per Rule 9 (enum-to-class principle).
2026-01-10 21:09:29 +01:00
kempersc
28c3aaf33f enrich profiles 2026-01-10 17:31:02 +01:00
kempersc
626bd3a095 refactor(schemas): apply naming conventions to 261 class files
- Apply Rule 39: RiC-O style hasOrHad*/isOrWas* for temporal slots
- Apply Rule 43: Singular noun convention (keywords → keyword)
- Update slot references to match renamed slot files
- Maintain schema integrity across all class definitions
2026-01-10 15:36:33 +01:00
kempersc
095a3f949c refactor(linkml): apply RiC-O slot naming conventions to /schemas/ (Rule 39)
Apply same RiC-O-style slot naming refactor to /schemas/20251121/linkml/
that was previously applied to frontend/public/schemas/:

- Add 'has_' prefix for possession predicates
- Add 'is_or_was_' prefix for temporal inverse relationships
- Add 'has_or_had_' for bidirectional temporal relations
- Add new slots: is_or_was_aggregated_by, is_or_was_allocated_by, etc.
- Update count slots with proper descriptions

This ensures consistency between the source schema directory and the
frontend-served schemas.

514 files changed, +6,325 insertions, -4,255 deletions
2026-01-10 12:55:45 +01:00
kempersc
0393b321c9 refactor(schema): unify custodian_type slots into has_or_had_custodian_type (Rule 39, 43)
- Migrate 236+ class files from custodian_types to has_or_had_custodian_type
- Archive deprecated slots: custodian_type, custodian_types, custodian_type_broader/narrower/related
- Update main schema and manifest imports
- Fix Custodian.yaml class to use new slot
- Fix annotation format (list→scalar) in has_or_had_custodian_type.yaml

Rules applied:
- Rule 39: RiC-O naming convention (hasOrHad pattern)
- Rule 43: Slot nouns must be singular (multivalued:true for cardinality)
- Rule 38: Slot centralization with semantic URI
2026-01-09 10:55:21 +01:00
kempersc
dfa667c90f Fix LinkML schema for valid RDF generation with proper slot_uri
Summary:
- Create 46 missing slot definition files with proper slot_uri values
- Add slot imports to main schema (01_custodian_name_modular.yaml)
- Fix YAML examples sections in 116+ class and slot files
- Fix PersonObservation.yaml examples section (nested objects → string literals)

Technical changes:
- All slots now have explicit slot_uri mapping to base ontologies (RiC-O, Schema.org, SKOS)
- Eliminates malformed URIs like 'custodian/:slot_name' in generated RDF
- gen-owl now produces valid Turtle with 153,166 triples

New slot files (46):
- RiC-O slots: rico_note, rico_organizational_principle, rico_has_or_had_holder, etc.
- Scope slots: scope_includes, scope_excludes, archive_scope
- Organization slots: organization_type, governance_authority, area_served
- Platform slots: platform_type_category, portal_type_category
- Social media slots: social_media_platform_category, post_type_*
- Type hierarchy slots: broader_type, narrower_types, custodian_type_broader
- Wikidata slots: wikidata_equivalent, wikidata_mapping

Generated output:
- schemas/20251121/rdf/01_custodian_name_modular_20260107_134534_clean.owl.ttl (6.9MB)
- Validated with rdflib: 153,166 triples, no malformed URIs
2026-01-07 13:48:03 +01:00
kempersc
98c42bf272 Fix LinkML URI conflicts and generate RDF outputs
- Fix scope_note → finding_aid_scope_note in FindingAid.yaml
- Remove duplicate wikidata_entity slot from CustodianType.yaml (import instead)
- Remove duplicate rico_record_set_type from class_metadata_slots.yaml
- Fix range types for equals_string compatibility (uriorcurie → string)
- Move class names from close_mappings to see_also in 10 RecordSetTypes files
- Generate all RDF formats: OWL, N-Triples, RDF/XML, N3, JSON-LD context
- Sync schemas to frontend/public/schemas/

Files: 1,151 changed (includes prior CustodianType migration)
2026-01-07 12:32:59 +01:00
kempersc
b34992b1d3 Migrate all 293 class files to ontology-aligned slots
Extends migration to all class types (museums, libraries, galleries, etc.)

New slots added to class_metadata_slots.yaml:
- RiC-O: rico_record_set_type, rico_organizational_principle,
  rico_has_or_had_holder, rico_note
- Multilingual: label_de, label_es, label_fr, label_nl, label_it, label_pt
- Scope: scope_includes, scope_excludes, custodian_only,
  organizational_level, geographic_restriction
- Notes: privacy_note, preservation_note, legal_note

Migration script now handles 30+ annotation types.
All migrated schemas pass linkml-validate.

Total: 387 class files now use proper slots instead of annotations.
2026-01-06 12:24:54 +01:00
kempersc
11983014bb Enhance specificity scoring system integration with existing infrastructure
- Updated documentation to clarify integration points with existing components in the RAG pipeline and DSPy framework.
- Added detailed mapping of SPARQL templates to context templates for improved specificity filtering.
- Implemented wrapper patterns around existing classifiers to extend functionality without duplication.
- Introduced new tests for the SpecificityAwareClassifier and SPARQLToContextMapper to ensure proper integration and functionality.
- Enhanced the CustodianRDFConverter to include ISO country and subregion codes from GHCID for better geospatial data handling.
2026-01-05 17:37:49 +01:00
kempersc
242bc8bb35 Add new slots for heritage custodian entities
- Created deliverables_slot for expected or achieved deliverable outputs.
- Introduced event_id_slot for persistent unique event identifiers.
- Added follow_up_date_slot for scheduled follow-up action dates.
- Implemented object_ref_slot for references to heritage objects.
- Established price_slot for price information across entities.
- Added price_currency_slot for currency codes in price information.
- Created protocol_slot for API protocol specifications.
- Introduced provenance_text_slot for full provenance entry text.
- Added record_type_slot for classification of record types.
- Implemented response_formats_slot for supported API response formats.
- Established status_slot for current status of entities or activities.
- Added FactualCountDisplay component for displaying count query results.
- Introduced ReplyTypeIndicator component for visualizing reply types.
- Created approval_date_slot for formal approval dates.
- Added authentication_required_slot for API authentication status.
- Implemented capacity_items_slot for maximum storage capacity.
- Established conservation_lab_slot for conservation laboratory information.
- Added cost_usd_slot for API operation costs in USD.
2026-01-05 00:49:05 +01:00
kempersc
2dca28d8c1 enrich CH entries with mission statements 2026-01-04 13:12:32 +01:00
kempersc
4f0cafe98a enrich HC profiles 2026-01-02 02:11:04 +01:00
kempersc
aca68ea47f remove a,bihguous web-claims 2025-12-21 00:01:54 +01:00
kempersc
62fdd35321 Refactor code structure for improved readability and maintainability 2025-12-09 11:15:51 +01:00
kempersc
b61271220b enrich entries 2025-12-09 10:46:43 +01:00