- Updated titles and descriptions in TimeSlot, TimeSpan, TimeSpanType, and TimespanBlock for improved readability and understanding.
- Enhanced multilingual support with refined alt_descriptions and structured_aliases across various classes.
- Changed mapping types from broad_mappings to exact_mappings in WebClaimsBlock, WebCollection, WebPage, WebPlatform, WebSource, WorkExperience, and various YouTube-related classes for better alignment with schema definitions.
- Improved comments and modeling notes in VariantTypes to clarify usage and examples.
- General cleanup of unnecessary comments and formatting adjustments for consistency across YAML files.
- Implement `normalize_linkml_alt_descriptions.py` to convert structured alt_descriptions to the expected scalar form.
- Implement `normalize_linkml_structured_aliases.py` to flatten language-keyed structured_aliases into a standard list-of-objects format.
- Implement `validate_linkml_schema_integrity.py` to validate the integrity of LinkML schema bundles, checking for import resolution, YAML parsing, and reference existence.
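
The flattening step attributed to `normalize_linkml_structured_aliases.py` could look roughly like the sketch below. The input shape (a mapping from language code to one or more strings) and the function name `flatten_structured_aliases` are assumptions made for illustration; the script itself may differ.

```python
from typing import Any


def flatten_structured_aliases(aliases: Any) -> list[dict[str, str]]:
    """Flatten a language-keyed mapping into a list of alias objects.

    Assumed input shape (hypothetical): {"nl": ["tijdvak"], "en": "time span"}.
    Values that are already a list are returned unchanged.
    """
    if isinstance(aliases, list):
        return aliases  # already in the list-of-objects form
    flattened = []
    for language, forms in aliases.items():
        # Accept either a single string or a list of strings per language.
        if isinstance(forms, str):
            forms = [forms]
        for form in forms:
            flattened.append({"literal_form": form, "in_language": language})
    return flattened


example = {"nl": ["tijdvak", "periode"], "en": "time span"}
print(flatten_structured_aliases(example))
```

The sketch works on already-parsed data, so reading and writing the YAML files is left out.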
- Updated SocialMediaPostType.yaml:
  - Renamed class and title for consistency.
  - Simplified description to focus on controlled vocabulary.
  - Adjusted slot definitions and removed duplicates.
  - Enhanced comments for better understanding of class purpose.
- Modified SocialMediaProfile.yaml:
  - Added a reference to Twitter in the see_also section.
  - Preserved prior description in notes for context.
- Revised VideoAudioAnnotation.yaml:
  - Updated description to clarify the purpose of audio annotations.
  - Added multilingual alt_descriptions and structured_aliases.
  - Streamlined slot definitions and removed duplicates.
- Enhanced VideoPost.yaml:
  - Added multilingual alt_descriptions and structured_aliases.
  - Clarified description to highlight video-specific properties.
  - Updated slot definitions for better clarity and consistency.
- Updated VideoSubtitle.yaml:
  - Preserved prior description in notes for context.
- Revised VideoTranscript.yaml:
  - Preserved prior description in notes for context.
- Updated slot names to improve semantic clarity:
  - `has_type` changed to `categorized_as`
  - `has_location` changed to `located_at`
  - `coordinates` changed to `has_coordinates`
  - `country` changed to `in_country`
  - `like_count` changed to `has_quantity`
- Adjusted descriptions and annotations for slots to enhance understanding and alignment with ontology standards.
- Modified imports in `WomensArchives.yaml` and `WomensArchivesRecordSetTypes.yaml` to reflect new slot names.
- Enhanced multilingual support in `has_record_set` slot definition with additional translations and structured aliases.
- General cleanup and standardization of slot definitions across various classes including `Wikidata`, `Youtube`, and `WorkExperience`.
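
As an illustration of the slot renames listed above, a rename pass over a parsed class definition might look like the following sketch. The mapping is taken from the list above; the helper name `rename_slots` and the assumption that slot names appear in `slots` and `slot_usage` are hypothetical.

```python
# Old slot name -> new slot name, as listed in this commit.
SLOT_RENAMES = {
    "has_type": "categorized_as",
    "has_location": "located_at",
    "coordinates": "has_coordinates",
    "country": "in_country",
    "like_count": "has_quantity",
}


def rename_slots(class_def: dict) -> dict:
    """Return a copy of a class definition with old slot names replaced."""
    renamed = dict(class_def)
    if "slots" in class_def:
        # The class-level `slots` entry is a list of slot names.
        renamed["slots"] = [SLOT_RENAMES.get(s, s) for s in class_def["slots"]]
    if "slot_usage" in class_def:
        # `slot_usage` is keyed by slot name, so the keys are renamed.
        renamed["slot_usage"] = {
            SLOT_RENAMES.get(name, name): usage
            for name, usage in class_def["slot_usage"].items()
        }
    return renamed


before = {"slots": ["has_type", "country", "title"], "slot_usage": {"country": {"required": True}}}
print(rename_slots(before))
```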
- Created 'updated_at.yaml' to record the last modified date and time of entities, including multilingual descriptions and structured aliases.
- Created 'written_in.yaml' to specify the language in which content is composed, covering both natural and programming languages, with detailed comments and close ontology mappings.
- Removed obsolete slots: `has_or_had_custodian_observation`, `provider`, and `specificity_annotation`.
- Updated `has_or_had_score` slot to use `SpecificityScore` class and modified its description and examples.
- Added new slots: `end_seconds`, `end_time`, `has_archive_path`, `has_or_had_custodian_name`, `protocol_name`, and `protocol_version`.
- Introduced a script `check_annotation_types.py` to validate the presence and structure of `custodian_types` in YAML files.
- Added a script `update_specificity.py` to automate the migration from `SpecificityAnnotation` to `SpecificityScore`.
- Updated WorldCatIdentifier.yaml to remove unnecessary description and ensure consistent formatting.
- Enhanced WorldHeritageSite.yaml by breaking the long description into multiple lines for better readability and removing unused attributes.
- Simplified WritingSystem.yaml by removing redundant attributes and ensuring consistent formatting.
- Cleaned up XPathScore.yaml by removing unnecessary attributes and ensuring consistent formatting.
- Improved YoutubeChannel.yaml by breaking long description into multiple lines for better readability.
- Enhanced YoutubeEnrichment.yaml by breaking long description into multiple lines for better readability.
- Updated YoutubeVideo.yaml to break the long description into multiple lines and remove the legacy field name.
- Refined has_or_had_affiliation.yaml by removing unnecessary comments and ensuring clarity.
- Cleaned up is_or_was_retrieved_at.yaml by removing unnecessary comments and ensuring clarity.
- Added rules on using generic slots and on avoiding rough edits in schema files, to maintain structural integrity.
- Introduced changes_or_changed_through.yaml to define a new slot for linking entities to change events.
- Removed unnecessary line breaks and whitespace in descriptions across multiple classes including Taxon, TaxonomicAuthority, TechnicalFeature, TradeRegister, TransferEvent, UNESCODomain, UnspecifiedType, UserCommunity, Version, VideoAnnotationTypes, VideoFrame, VideoTextContent, WebArchive, WebClaimsBlock, WebLink, WebPortal, and WordCount.
- Updated descriptions to enhance readability and maintain a uniform style.
- Migrated attributes and slots as per the latest schema rules, ensuring alignment with the defined standards.
- Improved documentation for better understanding of class purposes and usage scenarios.
- Added `fix_dual_class_link.py` to remove dual class link references from specified YAML files.
- Created `fix_specific_ghosts.py` to apply specific replacements in YAML files based on defined mappings.
- Introduced `migrate_staff_count.py` to migrate staff count references to a new structure in specified YAML files.
- Developed `migrate_type_slots.py` to replace type-related slots with new identifiers across YAML files.
- Implemented `scan_ghost_references.py` to identify and report ghost references to archived slots and classes in YAML files.
- Added `verify_ontology_terms.py` to verify the presence of ontology terms in specified ontology files against schema definitions.
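
The kind of check `scan_ghost_references.py` is described as performing can be sketched as follows. This is a minimal illustration, assuming class definitions are already parsed into dictionaries and that references live in `slots` and `slot_usage`; the function name `find_ghost_references` and the sample data are hypothetical.

```python
def find_ghost_references(
    class_defs: dict[str, dict], archived: set[str]
) -> list[tuple[str, str]]:
    """Report (class name, slot name) pairs that still point at archived slots."""
    ghosts = set()
    for class_name, class_def in class_defs.items():
        referenced = list(class_def.get("slots") or [])
        referenced += list((class_def.get("slot_usage") or {}).keys())
        for slot_name in referenced:
            if slot_name in archived:
                ghosts.add((class_name, slot_name))
    return sorted(ghosts)


# Hypothetical sample: two archived slot names and two class definitions.
archived_slots = {"text_direction", "thinking_mode"}
classes = {
    "WritingSystem": {"slots": ["text_direction", "has_or_had_direction"]},
    "Version": {"slots": ["updated_at"]},
}
print(find_ghost_references(classes, archived_slots))
```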
- Created new YAML files for audience size and audience type slots, defining their properties and annotations.
- Added archived capacity slots including cubic meters, linear meters, item count, and descriptions, with appropriate URIs and ranges.
- Introduced a template specificity slot for context-aware RAG filtering.
- Consolidated capacity-related slots into a unified structure, including has_or_had_capacity, capacity_type, and capacity_value, with detailed descriptions and examples.
- Added `has_or_had_direction` slot to represent directional orientations of entities, migrating from `text_direction`.
- Introduced `has_or_had_mode` slot for linking entities to operational modes, migrating from `thinking_mode`.
- Created `Content` class to encapsulate intellectual content of heritage materials, migrating from `temporal_coverage`.
- Added `TextDirection` class to define text flow orientations, migrating from `text_direction` slot.
- Introduced `ThinkingMode` class for LLM thinking mode configurations, migrating from `thinking_mode` slot.
- Archived previous slots related to text direction, languages detected, and thinking modes for historical reference.
- Updated documentation and annotations for clarity and compliance with RiC-O naming conventions.
- Introduced total expense, total frames analyzed, total investment, total liability, total net asset, and traditional product slots to enhance financial reporting capabilities.
- Added transition types detected, treatment description, type hypothesis, typical condition, typical HTTP methods, typical response formats, and typical scope slots for improved heritage documentation.
- Implemented user community, verified, web observation, WhatsApp business likelihood, wikidata equivalent, and wikidata mapping slots to enrich institutional data representation.
- Established has_or_had_asset, has_or_had_budget, has_or_had_expense, and is_or_was_threatened_by slots to capture asset, budget, and expense relationships as well as threats to heritage forms.
- Fix empty import list elements (- # comment pattern) in Laptop, Expenses,
FunctionType, Overview, WebLink, Photography classes
- Replace valid_from/valid_to slots with temporal_extent in class slots lists
- Update slot_usage to use temporal_extent with TimeSpan range
- Update examples to use temporal_extent with begin_of_the_begin/end_of_the_end
- Fix typo is_or_was_is_or_was_archived_at → is_or_was_archived_at in WebObservation
- Add TimeSpan imports to classes using temporal_extent
- Fix relative import paths for Timestamp in temporal slots
- Fix CustodianIdentifier → Identifier imports in FundingAgenda, ReadingRoomAnnex
Schema validates successfully with 902 classes and 2043 slots.
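
The move from `valid_from`/`valid_to` to `temporal_extent` in example data could be sketched as below. Mapping `valid_from` to `begin_of_the_begin` and `valid_to` to `end_of_the_end` is an assumption based on the field names mentioned above, and `migrate_validity` is a hypothetical helper name.

```python
def migrate_validity(instance: dict) -> dict:
    """Fold valid_from/valid_to into a nested temporal_extent mapping."""
    migrated = {
        key: value
        for key, value in instance.items()
        if key not in ("valid_from", "valid_to")
    }
    extent = {}
    if "valid_from" in instance:
        extent["begin_of_the_begin"] = instance["valid_from"]  # assumed mapping
    if "valid_to" in instance:
        extent["end_of_the_end"] = instance["valid_to"]  # assumed mapping
    if extent:
        migrated["temporal_extent"] = extent
    return migrated


print(migrate_validity({"name": "Example", "valid_from": "1990-01-01", "valid_to": "2005-12-31"}))
```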
- Introduced SoundArchiveRecordSetType, SpecialCollectionRecordSetType, SpecializedArchiveRecordSetType, SpecializedArchivesCzechiaRecordSetType, StateArchivesRecordSetType, StateArchivesSectionRecordSetType, StateDistrictArchiveRecordSetType, StateRegionalArchiveCzechiaRecordSetType, TelevisionArchiveRecordSetType, TradeUnionArchiveRecordSetType, UniversityArchiveRecordSetType, VereinsarchivRecordSetType, VerlagsarchivRecordSetType, VerwaltungsarchivRecordSetType, WebArchiveRecordSetType, and WomensArchivesRecordSetType.
- Each new type includes appropriate metadata, slots, and relationships to existing classes.
- Implemented a script to detect and fix Type class violations in LinkML files.
- Remove inline slot definitions from 144 class files
- Create 7 new centralized slot files in modules/slots/:
  - custodian_type_broader.yaml
  - custodian_type_narrower.yaml
  - custodian_type_related.yaml
  - definition.yaml
  - finding_aid_access_restriction.yaml
  - finding_aid_description.yaml
  - finding_aid_temporal_coverage.yaml
- Add centralize_inline_slots.py automation script
- Update manifest with new timestamp
Rule 48: Class files must NOT define inline slots - all slots
must be imported from modules/slots/ directory.
Note: Pre-existing IdentifierFormat duplicate class definition
(in Standard.yaml and IdentifierFormat.yaml) not addressed in
this commit - requires separate schema refactor.
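
A check for Rule 48 might be sketched as follows. What counts as an "inline slot" here (a top-level `slots` mapping in a class file, or `attributes` on a class) is an assumption for illustration, as is the helper name `find_inline_slots`.

```python
def find_inline_slots(schema: dict) -> list[str]:
    """List paths of slot definitions found inside a parsed class file."""
    # A top-level `slots` mapping defines slots in this file instead of importing them.
    inline = [f"slots.{name}" for name in (schema.get("slots") or {})]
    for class_name, class_def in (schema.get("classes") or {}).items():
        # `attributes` are slots defined inline on the class itself.
        for attribute in (class_def or {}).get("attributes") or {}:
            inline.append(f"classes.{class_name}.attributes.{attribute}")
    return inline


# Hypothetical class file content after parsing.
class_file = {
    "classes": {"FindingAid": {"slots": ["definition"], "attributes": {"note": {}}}},
    "slots": {"finding_aid_description": {"range": "string"}},
}
print(find_inline_slots(class_file))
```

An empty result would mean the file only references slots by name, which is what the rule asks for.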
- Migrate 236+ class files from custodian_types to has_or_had_custodian_type
- Archive deprecated slots: custodian_type, custodian_types, custodian_type_broader/narrower/related
- Update main schema and manifest imports
- Fix Custodian.yaml class to use new slot
- Fix annotation format (list→scalar) in has_or_had_custodian_type.yaml
Rules applied:
- Rule 39: RiC-O naming convention (hasOrHad pattern)
- Rule 43: Slot nouns must be singular (multivalued:true for cardinality)
- Rule 38: Slot centralization with semantic URI
- Updated documentation to clarify integration points with existing components in the RAG pipeline and DSPy framework.
- Added detailed mapping of SPARQL templates to context templates for improved specificity filtering.
- Implemented wrapper patterns around existing classifiers to extend functionality without duplication.
- Introduced new tests for the SpecificityAwareClassifier and SPARQLToContextMapper to ensure proper integration and functionality.
- Enhanced the CustodianRDFConverter to include ISO country and subregion codes from GHCID for better geospatial data handling.
- Introduced LEGAL-FORM-FILTER rule to standardize CustodianName by removing legal form designations.
- Documented rationale, examples, and implementation guidelines for the filtering process.
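
The idea behind LEGAL-FORM-FILTER can be illustrated with a small token filter. The list of legal forms below is a hypothetical sample, not the rule's actual list, and `strip_legal_form` is an invented name.

```python
# Hypothetical sample of legal form designations, compared without dots or case.
LEGAL_FORMS = {"stichting", "bv", "nv", "ltd", "inc", "gmbh", "ev"}


def strip_legal_form(name: str) -> str:
    """Remove tokens that match a known legal form designation."""
    kept = []
    for token in name.split():
        # Normalize "B.V." / "b.v," to "bv" before the lookup.
        key = token.strip(",").replace(".", "").casefold()
        if key not in LEGAL_FORMS:
            kept.append(token)
    # Drop a trailing comma left behind by a removed suffix.
    return " ".join(kept).strip(" ,")


print(strip_legal_form("Stichting Rijksmuseum"))
print(strip_legal_form("Example Archive B.V."))
```

A token filter like this is deliberately naive: a name whose meaningful part happens to match a legal form would be damaged, so a real rule would need exceptions.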
docs: Create README for value standardization rules
- Established a comprehensive README outlining various value standardization rules applicable to Heritage Custodian classes.
- Categorized rules into Name Standardization, Geographic Standardization, Web Observation, and Schema Evolution.
feat: Implement transliteration standards for non-Latin scripts
- Added TRANSLIT-ISO rule to ensure GHCID abbreviations are generated from emic names using ISO standards for transliteration.
- Included detailed guidelines for various scripts and languages, along with implementation examples.
feat: Define XPath provenance rules for web observations
- Created XPATH-PROVENANCE rule mandating XPath pointers for claims extracted from web sources.
- Established a workflow for archiving websites and verifying claims against archived HTML.
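
Verifying a claim against archived HTML through its XPath pointer could be sketched as below. This uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath and requires well-formed markup; real archived HTML would likely need a dedicated HTML parser. The sample document and the name `verify_claim` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical archived page, kept well-formed so ElementTree can parse it.
ARCHIVED_HTML = (
    "<html><body><div id='about'><h1>Example Archive</h1></div></body></html>"
)


def verify_claim(archived_html: str, xpath: str, claimed_value: str) -> bool:
    """Check that the node at `xpath` exists and its text equals the claim."""
    root = ET.fromstring(archived_html)
    node = root.find(xpath)
    return node is not None and (node.text or "").strip() == claimed_value


print(verify_claim(ARCHIVED_HTML, "./body/div[@id='about']/h1", "Example Archive"))
```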
chore: Update records lifecycle diagram
- Generated a new Mermaid diagram illustrating the records lifecycle for heritage custodians.
- Included phases for active records, inactive archives, and processed heritage collections with key relationships and classifications.