Commit graph

40 commits

Author SHA1 Message Date
kempersc
69a22e2b5a Refactor and expand LinkML slot definitions
- Deleted the `rights_statement_url` slot definition as it is no longer needed.
- Added multiple new slots including `has_legal_basis`, `has_statement`, `impose`, `pose_condition`, and `reviewed_through` with detailed descriptions and ontology alignments.
- Updated existing slots to improve clarity and consistency, including renaming `close_mappings` to `related_mappings` in several definitions.
- Enhanced the `require` slot with additional aliases for better usability.
- Improved documentation and comments across all slot definitions to clarify their purpose and usage.
2026-02-08 23:37:44 +01:00
kempersc
90842851c2 Add slot definitions for 'updated_at' and 'written_in' with multilingual support and ontology alignment
- Created 'updated_at.yaml' to record the last modified date and time of entities, including multilingual descriptions and structured aliases.
- Created 'written_in.yaml' to specify the language in which content is composed, covering both natural and programming languages, with detailed comments and close ontology mappings.
2026-02-07 11:22:05 +01:00
kempersc
6435786556 edit slots 2026-02-04 00:24:46 +01:00
kempersc
a83f04d9c4 Refactor code structure for improved readability and maintainability 2026-02-02 15:57:17 +01:00
kempersc
fc405445c6 Refactor and update schema definitions
- Removed obsolete slots: `has_or_had_custodian_observation`, `provider`, and `specificity_annotation`.
- Updated `has_or_had_score` slot to use `SpecificityScore` class and modified its description and examples.
- Added new slots: `end_seconds`, `end_time`, `has_archive_path`, `has_or_had_custodian_name`, `protocol_name`, and `protocol_version`.
- Introduced a script `check_annotation_types.py` to validate the presence and structure of `custodian_types` in YAML files.
- Added a script `update_specificity.py` to automate updates related to `SpecificityAnnotation` to `SpecificityScore`.
2026-02-01 19:55:38 +01:00
kempersc
7e0622c755 chore: clean up code structure and remove redundant changes 2026-01-31 21:24:53 +01:00
kempersc
eb4eac24ce Update manifest.json timestamp and refactor YAML schemas for improved clarity and consistency 2026-01-31 01:23:46 +01:00
kempersc
ca4a54181e Refactor schema files to improve clarity and maintainability
- Updated WorldCatIdentifier.yaml to remove unnecessary description and ensure consistent formatting.
- Enhanced WorldHeritageSite.yaml by breaking long description into multiple lines for better readability and removed unused attributes.
- Simplified WritingSystem.yaml by removing redundant attributes and ensuring consistent formatting.
- Cleaned up XPathScore.yaml by removing unnecessary attributes and ensuring consistent formatting.
- Improved YoutubeChannel.yaml by breaking long description into multiple lines for better readability.
- Enhanced YoutubeEnrichment.yaml by breaking long description into multiple lines for better readability.
- Updated YoutubeVideo.yaml to break long description into multiple lines and removed legacy field name.
- Refined has_or_had_affiliation.yaml by removing unnecessary comments and ensuring clarity.
- Cleaned up is_or_was_retrieved_at.yaml by removing unnecessary comments and ensuring clarity.
- Added rules for generic slots and avoiding rough edits in schema files to maintain structural integrity.
- Introduced changes_or_changed_through.yaml to define a new slot for linking entities to change events.
2026-01-31 00:46:23 +01:00
kempersc
4034c2a00a Refactor schema slots across multiple classes to improve consistency and clarity
- Removed unused slots from TaxonomicAuthority, TechnicalFeature, TelevisionArchive, TentativeWorldHeritageSite, Threat, TimeSpan, Title, TradeRegister, TradeUnionArchive, TradeUnionArchiveRecordSetType, TransferEvent, UNESCODomain, UnitIdentifier, UniversityArchive, UnspecifiedType, UserCommunity, Venue, Vereinsarchiv, Verlagsarchiv, VerlagsarchivRecordSetType, Version, Verwaltungsarchiv, VideoAnnotationTypes, VideoAudioAnnotation, VideoFrame, VideoPost, VideoSubtitle, VideoTextContent, Warehouse, WebArchive, WebClaim, WebClaimsBlock, WebLink, WebPortal, WebPortalTypes, WomensArchives, WordCount, WorldHeritageSite, WritingSystem, and XPathScore.
- Introduced new slot is_or_was_retrieved_at for tracking data retrieval timestamps.
2026-01-31 00:28:09 +01:00
kempersc
6203d19875 Refactor YAML schemas for clarity and consistency
- Removed unnecessary line breaks and whitespace in descriptions across multiple classes including Taxon, TaxonomicAuthority, TechnicalFeature, TradeRegister, TransferEvent, UNESCODomain, UnspecifiedType, UserCommunity, Version, VideoAnnotationTypes, VideoFrame, VideoTextContent, WebArchive, WebClaimsBlock, WebLink, WebPortal, and WordCount.
- Updated descriptions to enhance readability and maintain a uniform style.
- Migrated attributes and slots as per the latest schema rules, ensuring alignment with the defined standards.
- Improved documentation for better understanding of class purposes and usage scenarios.
2026-01-31 00:21:50 +01:00
kempersc
0c5211e40a removed convenience slots 2026-01-31 00:15:53 +01:00
kempersc
14375c583e added hidden slots 2026-01-30 23:56:19 +01:00
kempersc
c60b523f29 Implement feature X to enhance user experience and fix bug Y in module Z 2026-01-29 00:12:27 +01:00
kempersc
f800e198ff Refactor code structure for improved readability and maintainability 2026-01-28 01:11:55 +01:00
kempersc
b4d1a7677f feat: Migrate has_air_changes_per_hour to specifies_or_specified and create AirChanges and Ventilation classes 2026-01-27 09:03:22 +01:00
kempersc
bdba9de593 feat: Add archived governance slots and update manifest generation timestamp 2026-01-27 00:49:30 +01:00
kempersc
73b2d21bb3 Refactor code structure for improved readability and maintainability 2026-01-26 23:48:27 +01:00
kempersc
ec113e8811 Add new classes and slots for archival and educational metadata
- Introduced EADIdentifier, EBook, EcclesiasticalProvince, Edition, Editor, Education, EmailAddress, and Size classes to enhance archival description capabilities.
- Added slots for digital presence types, digital surrogates, digitization status, and dimensions to support comprehensive metadata management.
- Migrated existing slots such as ead_id, edition_number, and dimension to new structured formats.
- Established relationships between works and their editions, sizes, and editors to improve data interconnectivity.
- Enhanced ontology alignment with Schema.org and BIBFRAME standards for better interoperability.
2026-01-26 09:00:29 +01:00
kempersc
511fc99847 feat: Add PriceRange, Publication, and TaxDeductibility classes
- Introduced PriceRange class to categorize price levels for hospitality services, including structured metadata for various price categories.
- Added Publication class to represent publication events, capturing details like publisher, publication place, and edition.
- Created TaxDeductibilityType as an abstract class for tax deductibility status, promoting previous enum values to a class hierarchy for richer metadata.
- Implemented TaxDeductibilityTypes with concrete subclasses detailing various tax deductibility statuses.
- Archived previous DeductibilityStatusEnum and related slots, transitioning to a more structured approach for tax deductibility classification.
- Updated multiple slot definitions to align with new class structures and naming conventions, including has_or_had_measurement and has_or_had_price.
- Enhanced documentation and examples across new and existing slots for clarity and compliance with naming conventions.
2026-01-24 17:41:06 +01:00
kempersc
b61572f08a feat: Add CreationEvent, DatePrecision, IdentificationEvent, and Image classes with structured slots
- Introduced CreationEvent class to represent the creation of objects, including temporal extent, creator, and place of creation.
- Added DatePrecision class to indicate the precision level of date values, supporting various formats from day to century.
- Implemented IdentificationEvent class for taxonomic identification, capturing identification date, method, and confidence level.
- Created Image class for visual content representation, including URL and metadata for images used in collections.
- Archived previous slots related to card images and titles, replacing them with structured slots for better data representation.
- Enhanced slots for decommission dates, degree of certainty, and identification events to improve temporal data handling.
2026-01-23 23:15:43 +01:00
kempersc
2d09776856 Refactor StorageCondition schema: Migrate compliance_status to has_or_had_status with ComplianceStatus class
- Removed compliance_status slot and replaced it with has_or_had_status.
- Updated has_or_had_status to use ComplianceStatus for structured representation.
- Adjusted examples to reflect new structure for compliance status.
- Updated documentation to indicate migration and provide details on the ComplianceStatus class.
2026-01-22 16:22:16 +01:00
kempersc
4a277d7d42 standardise slots 2026-01-19 00:09:28 +01:00
kempersc
4319f38c05 Add archived slots for audience size, audience type, and capacity metrics
- Created new YAML files for audience size and audience type slots, defining their properties and annotations.
- Added archived capacity slots including cubic meters, linear meters, item count, and descriptions, with appropriate URIs and ranges.
- Introduced a template specificity slot for context-aware RAG filtering.
- Consolidated capacity-related slots into a unified structure, including has_or_had_capacity, capacity_type, and capacity_value, with detailed descriptions and examples.
2026-01-17 18:53:23 +01:00
kempersc
d99a7800e3 feat: enhance entity profile saving with PPID generation and backward compatibility 2026-01-17 01:55:38 +01:00
kempersc
f9f3cc8e74 fix: resolve YAML import indentation and add missing slot descriptions
Schema Improvements:
- Fix YAML import indentation across 800+ class files (sed: '^- ../' → '  - ../')
- Add descriptions to 26 inline slots missing them (lint warnings)
- Fix malformed imports in BirthPlace.yaml and CustodianObservation.yaml

Validation Results:
- linkml-lint: 4 warnings (intentional SCREAMING_CASE tier names)
- gen-owl: SUCCESS (164,069 lines generated)
- gen-json-schema: SUCCESS (9.4MB generated)

Files affected: 1,034 files, +23,908 -15,200 lines
2026-01-16 00:09:28 +01:00
kempersc
c2629f6d29 Fix LinkML schema validation errors (0 errors, 30 warnings)
Schema Migration Fixes:
- Fix YAML import indentation in ~650 slot files (linkml:types and enum imports)
- Rename slot reference: has_or_had_holds_record_set_type → hold_or_held_record_set_type
  (70+ archive class files, main schema, manifest.json)
- Fix ProvenanceBlock.yaml: remove invalid any_of range, use string with multivalued
- Fix has_or_had_provenance.yaml: remove nested template_specificity from annotations

Validation Status:
- 0 errors (was multiple import/reference errors)
- 30 warnings (missing descriptions on inline slots, intentional SCREAMING_CASE names)

Files changed: ~3,850 (slots, classes, main schema, manifest)
2026-01-15 23:21:38 +01:00
kempersc
416aa407cc Add new slots for financial and heritage documentation
- Introduced total expense, total frames analyzed, total investment, total liability, total net asset, and traditional product slots to enhance financial reporting capabilities.
- Added transition types detected, treatment description, type hypothesis, typical condition, typical HTTP methods, typical response formats, and typical scope slots for improved heritage documentation.
- Implemented user community, verified, web observation, WhatsApp business likelihood, wikidata equivalent, and wikidata mapping slots to enrich institutional data representation.
- Established has_or_had_asset, has_or_had_budget, has_or_had_expense, and is_or_was_threatened_by slots to capture asset, budget, expense relationships, and threats to heritage forms.
2026-01-15 19:35:39 +01:00
kempersc
355d8be51d centralise slots 2026-01-12 14:33:56 +01:00
kempersc
0d5d48568d refactor(schema): centralize slot definitions per Rule 38
- Remove slot_uri, description, mappings from slot_usage sections
- Move these properties to centralized slot files in modules/slots/
- Keep only class-specific overrides in slot_usage (required, inlined, examples)
- Update 1,499 centralized slot files with enriched definitions
- Clean 188 class files

Violations fixed:
- slot_uri in slot_usage: 1,676 → 0
- description in slot_usage: 2,287 → 0 (moved to centralized)

Schema still validates: 816 classes, 2028 slots, 127 enums
2026-01-11 23:27:17 +01:00
kempersc
174a420c08 refactor(schema): centralize 1515 inline slot definitions per Rule 48
All checks were successful
Deploy Frontend / build-and-deploy (push) Successful in 3m57s
- Remove inline slot definitions from 144 class files
- Create 7 new centralized slot files in modules/slots/:
  - custodian_type_broader.yaml
  - custodian_type_narrower.yaml
  - custodian_type_related.yaml
  - definition.yaml
  - finding_aid_access_restriction.yaml
  - finding_aid_description.yaml
  - finding_aid_temporal_coverage.yaml
- Add centralize_inline_slots.py automation script
- Update manifest with new timestamp

Rule 48: Class files must NOT define inline slots - all slots
must be imported from modules/slots/ directory.

Note: Pre-existing IdentifierFormat duplicate class definition
(in Standard.yaml and IdentifierFormat.yaml) not addressed in
this commit - requires separate schema refactor.
2026-01-11 22:02:14 +01:00
kempersc
626bd3a095 refactor(schemas): apply naming conventions to 261 class files
- Apply Rule 39: RiC-O style hasOrHad*/isOrWas* for temporal slots
- Apply Rule 43: Singular noun convention (keywords → keyword)
- Update slot references to match renamed slot files
- Maintain schema integrity across all class definitions
2026-01-10 15:36:33 +01:00
kempersc
095a3f949c refactor(linkml): apply RiC-O slot naming conventions to /schemas/ (Rule 39)
Apply same RiC-O-style slot naming refactor to /schemas/20251121/linkml/
that was previously applied to frontend/public/schemas/:

- Add 'has_' prefix for possession predicates
- Add 'is_or_was_' prefix for temporal inverse relationships
- Add 'has_or_had_' for bidirectional temporal relations
- Add new slots: is_or_was_aggregated_by, is_or_was_allocated_by, etc.
- Update count slots with proper descriptions

This ensures consistency between the source schema directory and the
frontend-served schemas.

514 files changed, +6,325 insertions, -4,255 deletions
2026-01-10 12:55:45 +01:00
kempersc
0393b321c9 refactor(schema): unify custodian_type slots into has_or_had_custodian_type (Rule 39, 43)
- Migrate 236+ class files from custodian_types to has_or_had_custodian_type
- Archive deprecated slots: custodian_type, custodian_types, custodian_type_broader/narrower/related
- Update main schema and manifest imports
- Fix Custodian.yaml class to use new slot
- Fix annotation format (list→scalar) in has_or_had_custodian_type.yaml

Rules applied:
- Rule 39: RiC-O naming convention (hasOrHad pattern)
- Rule 43: Slot nouns must be singular (multivalued:true for cardinality)
- Rule 38: Slot centralization with semantic URI
2026-01-09 10:55:21 +01:00
kempersc
dfa667c90f Fix LinkML schema for valid RDF generation with proper slot_uri
Summary:
- Create 46 missing slot definition files with proper slot_uri values
- Add slot imports to main schema (01_custodian_name_modular.yaml)
- Fix YAML examples sections in 116+ class and slot files
- Fix PersonObservation.yaml examples section (nested objects → string literals)

Technical changes:
- All slots now have explicit slot_uri mapping to base ontologies (RiC-O, Schema.org, SKOS)
- Eliminates malformed URIs like 'custodian/:slot_name' in generated RDF
- gen-owl now produces valid Turtle with 153,166 triples

New slot files (46):
- RiC-O slots: rico_note, rico_organizational_principle, rico_has_or_had_holder, etc.
- Scope slots: scope_includes, scope_excludes, archive_scope
- Organization slots: organization_type, governance_authority, area_served
- Platform slots: platform_type_category, portal_type_category
- Social media slots: social_media_platform_category, post_type_*
- Type hierarchy slots: broader_type, narrower_types, custodian_type_broader
- Wikidata slots: wikidata_equivalent, wikidata_mapping

Generated output:
- schemas/20251121/rdf/01_custodian_name_modular_20260107_134534_clean.owl.ttl (6.9MB)
- Validated with rdflib: 153,166 triples, no malformed URIs
2026-01-07 13:48:03 +01:00
kempersc
98c42bf272 Fix LinkML URI conflicts and generate RDF outputs
- Fix scope_note → finding_aid_scope_note in FindingAid.yaml
- Remove duplicate wikidata_entity slot from CustodianType.yaml (import instead)
- Remove duplicate rico_record_set_type from class_metadata_slots.yaml
- Fix range types for equals_string compatibility (uriorcurie → string)
- Move class names from close_mappings to see_also in 10 RecordSetTypes files
- Generate all RDF formats: OWL, N-Triples, RDF/XML, N3, JSON-LD context
- Sync schemas to frontend/public/schemas/

Files: 1,151 changed (includes prior CustodianType migration)
2026-01-07 12:32:59 +01:00
kempersc
b34992b1d3 Migrate all 293 class files to ontology-aligned slots
Extends migration to all class types (museums, libraries, galleries, etc.)

New slots added to class_metadata_slots.yaml:
- RiC-O: rico_record_set_type, rico_organizational_principle,
  rico_has_or_had_holder, rico_note
- Multilingual: label_de, label_es, label_fr, label_nl, label_it, label_pt
- Scope: scope_includes, scope_excludes, custodian_only,
  organizational_level, geographic_restriction
- Notes: privacy_note, preservation_note, legal_note

Migration script now handles 30+ annotation types.
All migrated schemas pass linkml-validate.

Total: 387 class files now use proper slots instead of annotations.
2026-01-06 12:24:54 +01:00
kempersc
11983014bb Enhance specificity scoring system integration with existing infrastructure
- Updated documentation to clarify integration points with existing components in the RAG pipeline and DSPy framework.
- Added detailed mapping of SPARQL templates to context templates for improved specificity filtering.
- Implemented wrapper patterns around existing classifiers to extend functionality without duplication.
- Introduced new tests for the SpecificityAwareClassifier and SPARQLToContextMapper to ensure proper integration and functionality.
- Enhanced the CustodianRDFConverter to include ISO country and subregion codes from GHCID for better geospatial data handling.
2026-01-05 17:37:49 +01:00
kempersc
242bc8bb35 Add new slots for heritage custodian entities
- Created deliverables_slot for expected or achieved deliverable outputs.
- Introduced event_id_slot for persistent unique event identifiers.
- Added follow_up_date_slot for scheduled follow-up action dates.
- Implemented object_ref_slot for references to heritage objects.
- Established price_slot for price information across entities.
- Added price_currency_slot for currency codes in price information.
- Created protocol_slot for API protocol specifications.
- Introduced provenance_text_slot for full provenance entry text.
- Added record_type_slot for classification of record types.
- Implemented response_formats_slot for supported API response formats.
- Established status_slot for current status of entities or activities.
- Added FactualCountDisplay component for displaying count query results.
- Introduced ReplyTypeIndicator component for visualizing reply types.
- Created approval_date_slot for formal approval dates.
- Added authentication_required_slot for API authentication status.
- Implemented capacity_items_slot for maximum storage capacity.
- Established conservation_lab_slot for conservation laboratory information.
- Added cost_usd_slot for API operation costs in USD.
2026-01-05 00:49:05 +01:00
kempersc
2dca28d8c1 enrich CH entries with mission statements 2026-01-04 13:12:32 +01:00
kempersc
4f0cafe98a enrich HC profiles 2026-01-02 02:11:04 +01:00