Commit graph

25 commits

Author SHA1 Message Date
kempersc
30576d541d Refactor code structure for improved readability and maintainability 2026-02-16 23:25:16 +01:00
kempersc
a590a8d94b Refactor and enhance descriptions across multiple YAML schemas for improved clarity and consistency.
- Updated descriptions in `WikidataOrganization`, `WikidataRecognition`, `WikidataResolvedEntities`, `WikidataSitelinks`, `WikidataSocialMedia`, `WikidataTemporal`, `WikidataTimeValue`, `WikidataWeb`, `WomensArchives`, `WomensArchivesRecordSetType`, `WomensArchivesRecordSetTypes`, `WordCount`, `WorkRevision`, `WorldCatIdentifier`, `WorldHeritageSite`, `WritingSystem`, `XPath`, `XPathScore`, `YoutubeChannel`, `YoutubeComment`, `YoutubeTranscript`, and `YoutubeVideo` to enhance readability and precision.
- Adjusted mappings and slot usage in various schemas to align with updated descriptions and improve data structure.
- Added new synonyms in multiple languages for better localization support.
2026-02-16 15:53:42 +01:00
kempersc
d37a120ef2 Refactor schema definitions across multiple classes to improve clarity and consistency
- Removed unnecessary aliases and adjusted slot definitions in Timestamp, Topic, TopicType, TransferEvent, TransferPolicy, and others.
- Enhanced descriptions and added alternative language descriptions for TradeUnionArchiveRecordSetType and UnescoIchElement.
- Updated slot usage for various archive-related classes to use `equals_string` instead of `equals_expression`.
- Streamlined VideoChapter class by refining descriptions and restructuring slot usage for better navigation and organization.
- General cleanup of comments and annotations to ensure clarity and maintainability.
2026-02-16 11:17:33 +01:00
kempersc
66adec257e Add scripts for normalizing LinkML schemas and validating schema integrity
- Implement `normalize_linkml_alt_descriptions.py` to convert structured alt_descriptions to the expected scalar form.
- Implement `normalize_linkml_structured_aliases.py` to flatten language-keyed structured_aliases into a standard list-of-objects format.
- Implement `validate_linkml_schema_integrity.py` to validate the integrity of LinkML schema bundles, checking for import resolution, YAML parsing, and reference existence.
2026-02-16 10:16:51 +01:00
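The flattening step described for `normalize_linkml_structured_aliases.py` can be sketched roughly as follows. This is not the script from the commit. The input shape (a mapping from language code to a list of alias strings) and the function name `flatten_structured_aliases` are assumptions made for illustration, and the sample aliases are made up. The output keys `literal_form` and `in_language` follow the LinkML `StructuredAlias` metamodel.

```python
from typing import Dict, List


def flatten_structured_aliases(by_language: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Flatten a language-keyed alias mapping into a list of alias objects.

    Assumed input:  {"nl": ["vrouwenarchief"], "de": ["Frauenarchiv"]}
    Output: one {"literal_form": ..., "in_language": ...} dict per alias,
    in the insertion order of the input mapping.
    """
    flattened: List[Dict[str, str]] = []
    for language, forms in by_language.items():
        for form in forms:
            flattened.append({"literal_form": form, "in_language": language})
    return flattened


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the repository.
    example = {"nl": ["vrouwenarchief"], "de": ["Frauenarchiv"]}
    for alias in flatten_structured_aliases(example):
        print(alias)
```

Keeping the function free of file I/O makes it easy to test on its own; a real script would presumably also load each YAML file, apply the conversion, and write the result back.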
kempersc
2c9d3598dc Refactor Wikidata schema classes for improved clarity and multilingual support
- Updated titles for clarity in WikidataClassification, WikidataCollectionInfo, WikidataContact, WikidataCoordinates, WikidataEnrichment, WikidataEntity, WikidataIdentifiers, WikidataLocation, WikidataMedia, and WikidataOrganization classes.
- Enhanced descriptions with multilingual support, providing translations in Dutch, German, French, Spanish, Arabic, Indonesian, and Chinese.
- Added structured aliases for better synonym mapping in multiple languages.
- Improved comments and keywords for better understanding and searchability.
- Ensured consistent use of slots and mappings across classes to align with ontology standards.
2026-02-15 14:08:11 +01:00
kempersc
820d3969bb Refactor code structure for improved readability and maintainability 2026-02-11 12:11:59 +01:00
kempersc
d3a65a496c Refactor slot names and descriptions across multiple YAML files for consistency and clarity
- Updated slot names to improve semantic clarity:
  - `has_type` changed to `categorized_as`
  - `has_location` changed to `located_at`
  - `coordinates` changed to `has_coordinates`
  - `country` changed to `in_country`
  - `like_count` changed to `has_quantity`

- Adjusted descriptions and annotations for slots to enhance understanding and alignment with ontology standards.

- Modified imports in `WomensArchives.yaml` and `WomensArchivesRecordSetTypes.yaml` to reflect new slot names.

- Enhanced multilingual support in `has_record_set` slot definition with additional translations and structured aliases.

- General cleanup and standardization of slot definitions across various classes including `Wikidata`, `Youtube`, and `WorkExperience`.
2026-02-11 11:54:34 +01:00
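A minimal sketch of how the renames listed in this commit could be applied to a class's slot list. The rename pairs come from the commit message above; the helper `rename_slots` and the idea of applying it to a plain list of slot names are assumptions, not the tooling actually used in the repository.

```python
from typing import Dict, List

# Old-to-new slot names, copied from the commit message above.
SLOT_RENAMES: Dict[str, str] = {
    "has_type": "categorized_as",
    "has_location": "located_at",
    "coordinates": "has_coordinates",
    "country": "in_country",
    "like_count": "has_quantity",
}


def rename_slots(slots: List[str], renames: Dict[str, str] = SLOT_RENAMES) -> List[str]:
    """Return the slot list with old names replaced, preserving order.

    Names without a rename entry pass through unchanged. If a rename would
    produce a name that is already present, the duplicate is dropped.
    """
    renamed: List[str] = []
    for slot in slots:
        new_name = renames.get(slot, slot)
        if new_name not in renamed:
            renamed.append(new_name)
    return renamed


if __name__ == "__main__":
    # Hypothetical slot list for illustration.
    print(rename_slots(["has_type", "country", "title"]))
```

Dropping duplicates matters when a class already lists the new name next to the old one, since repeating a slot name in a class's slot list is at best redundant.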
kempersc
69a22e2b5a Refactor and expand LinkML slot definitions
- Deleted the `rights_statement_url` slot definition as it is no longer needed.
- Added multiple new slots including `has_legal_basis`, `has_statement`, `impose`, `pose_condition`, and `reviewed_through` with detailed descriptions and ontology alignments.
- Updated existing slots to improve clarity and consistency, including renaming `close_mappings` to `related_mappings` in several definitions.
- Enhanced the `require` slot with additional aliases for better usability.
- Improved documentation and comments across all slot definitions to clarify their purpose and usage.
2026-02-08 23:37:44 +01:00
kempersc
8f77f62585 update slot imports in classes 2026-02-08 19:22:13 +01:00
kempersc
90842851c2 Add slot definitions for 'updated_at' and 'written_in' with multilingual support and ontology alignment
- Created 'updated_at.yaml' to record the last modified date and time of entities, including multilingual descriptions and structured aliases.
- Created 'written_in.yaml' to specify the language in which content is composed, covering both natural and programming languages, with detailed comments and close ontology mappings.
2026-02-07 11:22:05 +01:00
kempersc
6435786556 edit slots 2026-02-04 00:24:46 +01:00
kempersc
a83f04d9c4 Refactor code structure for improved readability and maintainability 2026-02-02 15:57:17 +01:00
kempersc
fc405445c6 Refactor and update schema definitions
- Removed obsolete slots: `has_or_had_custodian_observation`, `provider`, and `specificity_annotation`.
- Updated `has_or_had_score` slot to use `SpecificityScore` class and modified its description and examples.
- Added new slots: `end_seconds`, `end_time`, `has_archive_path`, `has_or_had_custodian_name`, `protocol_name`, and `protocol_version`.
- Introduced a script `check_annotation_types.py` to validate the presence and structure of `custodian_types` in YAML files.
- Added a script `update_specificity.py` to automate updates related to `SpecificityAnnotation` to `SpecificityScore`.

2026-02-01 19:55:38 +01:00
kempersc
ca4a54181e Refactor schema files to improve clarity and maintainability
- Updated WorldCatIdentifier.yaml to remove unnecessary description and ensure consistent formatting.
- Enhanced WorldHeritageSite.yaml by breaking long description into multiple lines for better readability and removed unused attributes.
- Simplified WritingSystem.yaml by removing redundant attributes and ensuring consistent formatting.
- Cleaned up XPathScore.yaml by removing unnecessary attributes and ensuring consistent formatting.
- Improved YoutubeChannel.yaml by breaking long description into multiple lines for better readability.
- Enhanced YoutubeEnrichment.yaml by breaking long description into multiple lines for better readability.
- Updated YoutubeVideo.yaml to break long description into multiple lines and removed legacy field name.
- Refined has_or_had_affiliation.yaml by removing unnecessary comments and ensuring clarity.
- Cleaned up is_or_was_retrieved_at.yaml by removing unnecessary comments and ensuring clarity.
- Added rules for generic slots and avoiding rough edits in schema files to maintain structural integrity.
- Introduced changes_or_changed_through.yaml to define a new slot for linking entities to change events.
2026-01-31 00:46:23 +01:00
kempersc
4034c2a00a Refactor schema slots across multiple classes to improve consistency and clarity
- Removed unused slots from TaxonomicAuthority, TechnicalFeature, TelevisionArchive, TentativeWorldHeritageSite, Threat, TimeSpan, Title, TradeRegister, TradeUnionArchive, TradeUnionArchiveRecordSetType, TransferEvent, UNESCODomain, UnitIdentifier, UniversityArchive, UnspecifiedType, UserCommunity, Venue, Vereinsarchiv, Verlagsarchiv, VerlagsarchivRecordSetType, Version, Verwaltungsarchiv, VideoAnnotationTypes, VideoAudioAnnotation, VideoFrame, VideoPost, VideoSubtitle, VideoTextContent, Warehouse, WebArchive, WebClaim, WebClaimsBlock, WebLink, WebPortal, WebPortalTypes, WomensArchives, WordCount, WorldHeritageSite, WritingSystem, and XPathScore.
- Introduced new slot is_or_was_retrieved_at for tracking data retrieval timestamps.
2026-01-31 00:28:09 +01:00
kempersc
6203d19875 Refactor YAML schemas for clarity and consistency
- Removed unnecessary line breaks and whitespace in descriptions across multiple classes including Taxon, TaxonomicAuthority, TechnicalFeature, TradeRegister, TransferEvent, UNESCODomain, UnspecifiedType, UserCommunity, Version, VideoAnnotationTypes, VideoFrame, VideoTextContent, WebArchive, WebClaimsBlock, WebLink, WebPortal, and WordCount.
- Updated descriptions to enhance readability and maintain a uniform style.
- Migrated attributes and slots as per the latest schema rules, ensuring alignment with the defined standards.
- Improved documentation for better understanding of class purposes and usage scenarios.
2026-01-31 00:21:50 +01:00
kempersc
14375c583e added hidden slots 2026-01-30 23:56:19 +01:00
kempersc
1f8776bef4 Update schemas and slots with new mappings and descriptions
- Updated manifest.json with new generated timestamp.
- Added close mappings to APIRequest and Administration classes.
- Renamed slots in AccessPolicy to has_or_had_embargo_end_date and has_or_had_embargo_reason.
- Changed class_uri for Accumulation to rico:AccumulationRelation and updated description.
- Added exact mappings to Altitude, AppellationType, and ArchitecturalStyle classes.
- Removed deprecated slots from CollectionManagementSystem and updated has_or_had_type.
- Added new slots for has_or_had_embargo_end_date and has_or_had_embargo_reason.
- Updated slot definitions for has_or_had_assessment, has_or_had_sequence_index, and others with new URIs and mappings.
- Removed unused slots end_seconds and end_time.
- Added new slot definitions for has_or_had_exhibition_type, has_or_had_extent_text, and is_or_was_documented_by.
2026-01-29 13:33:23 +01:00
kempersc
f800e198ff Refactor code structure for improved readability and maintainability 2026-01-28 01:11:55 +01:00
kempersc
4319f38c05 Add archived slots for audience size, audience type, and capacity metrics
- Created new YAML files for audience size and audience type slots, defining their properties and annotations.
- Added archived capacity slots including cubic meters, linear meters, item count, and descriptions, with appropriate URIs and ranges.
- Introduced a template specificity slot for context-aware RAG filtering.
- Consolidated capacity-related slots into a unified structure, including has_or_had_capacity, capacity_type, and capacity_value, with detailed descriptions and examples.
2026-01-17 18:53:23 +01:00
kempersc
d47bb5b097 standardise slots 2026-01-16 18:57:52 +01:00
kempersc
3fb27c15e2 Refactor and archive deprecated slots; update migration records
- Removed deprecated slots: storage_security_level, version_number, video_comment, visiting_hour, was_asserted_by, was_revision_of, writing_system.
- Archived corresponding YAML files for deprecated slots with detailed migration notes.
- Updated slot definitions for has_collection and encompassing_body to reflect new naming conventions and temporal patterns.
- Enhanced metadata extraction in index_persons_qdrant.py to include WCMS registration and data sources.
- Modified hybrid_retriever and multi_embedding_retriever to support filtering by WCMS registration status.
2026-01-15 13:16:59 +01:00
kempersc
043ea868b5 fix(schema): Resolve broken imports after slot migration
- Fix empty import list elements (- # comment pattern) in Laptop, Expenses,
  FunctionType, Overview, WebLink, Photography classes
- Replace valid_from/valid_to slots with temporal_extent in class slots lists
- Update slot_usage to use temporal_extent with TimeSpan range
- Update examples to use temporal_extent with begin_of_the_begin/end_of_the_end
- Fix typo is_or_was_is_or_was_archived_at → is_or_was_archived_at in WebObservation
- Add TimeSpan imports to classes using temporal_extent
- Fix relative import paths for Timestamp in temporal slots
- Fix CustodianIdentifier → Identifier imports in FundingAgenda, ReadingRoomAnnex

Schema validates successfully with 902 classes and 2043 slots.
2026-01-15 12:25:27 +01:00
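The "empty import list element" problem fixed in this commit arises because a YAML list item that holds only a comment (`- # comment`) parses as a null entry. A rough way to locate such lines is sketched below. The function name and the sample import paths are hypothetical, and the check is a line-based heuristic rather than a YAML parser.

```python
import re
from typing import List, Tuple

# A dash followed only by optional whitespace and an optional comment.
EMPTY_ITEM = re.compile(r"^\s*-\s*(#.*)?$")


def find_empty_list_items(yaml_text: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for list items that carry no value."""
    hits: List[Tuple[int, str]] = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        if EMPTY_ITEM.match(line):
            hits.append((number, line))
    return hits


if __name__ == "__main__":
    # Hypothetical imports block; the paths are illustrative only.
    sample = (
        "imports:\n"
        "  - linkml:types\n"
        "  - # ../slots/valid_from\n"
        "  - ../slots/temporal_extent\n"
    )
    for number, line in find_empty_list_items(sample):
        print(f"line {number}: {line.strip()}")
```

Because it works on raw lines, the check also flags a bare `-` whose value starts on the next line, which is valid YAML, so hits should be reviewed rather than removed automatically.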
kempersc
6c3fa6b5a3 Remove deprecated slots and add new slot definitions for enhanced data modeling
- Deleted obsolete slot definitions for work_location and workshop_space.
- Introduced new TaxonName class to represent scientific taxonomic names with detailed attributes.
- Archived existing slots related to surname_prefix, target_name, taxon_name, terminal_count, text_region_count, title, title_proper, total_chapter, total_characters_extracted, total_connections_extracted, track_name, transcript_format, traveling_venue, type_label, type_status, typical_responsibility, unesco_domain, unesco_inscription_year, unesco_list_status, uniform_title, unit_name, used_by_custodian, uv_filtered_required, valid_from_geo, valid_to_geo, validation_status, variant_of_name, verification_date, viability_status, within_auxiliary_place, and within_place.
- Updated slot descriptions and structures to improve clarity and compliance with standards.
2026-01-15 11:42:35 +01:00
kempersc
e3adb4ed60 feat: Introduce Overview, RealnessStatus, and WebLink classes with comprehensive documentation and migration notes
- Added Overview class to represent structured collections of web links, including detailed descriptions, examples, and ontology alignments.
- Introduced RealnessStatus class to classify data as real or synthetic, with rich provenance and temporal semantics.
- Created WebLink class for representing hyperlinks with associated metadata, enhancing structured link representation.
- Established new slots: has_or_had_comprehensive_overview, is_or_was_real, and includes_or_included to support the new classes and improve data modeling.
- Migrated existing slots to new structures, ensuring compliance with RiC-O naming conventions and enhancing specificity.
- Updated annotations and examples across all new classes and slots for clarity and usability.
2026-01-14 09:32:14 +01:00