- Migrated `archived_at` to `is_or_was_archived_at` in AuxiliaryDigitalPlatform, WebObservation, and other relevant classes to better reflect historical archival status.
- Removed `bold_id` slot and replaced it with `has_or_had_identifier` linked to the new `BOLDIdentifier` class in BiologicalObject.
- Introduced `Bookplate` and `Approver` classes to enhance provenance tracking and ownership documentation.
- Updated `InformationCarrier` to replace `bookplate` with `includes_or_included` for better representation of ownership marks.
- Added new slots `is_or_was_approved_by` and `is_or_was_archived_at` to capture historical approval and archival locations.
- Archived old slot definitions for `archived_at` and `bold_id` to maintain schema integrity.
- Enhanced LinkedIn profile extraction functionality by integrating Linkup API alongside Exa API.
- Introduced `is_or_was_created_through` slot to indicate content creation methods, replacing previous boolean flags.
- Added `is_or_was_required` slot for generic temporal boolean requirements, aligning with Schema.org.
- Created `AutoGeneration` class to represent automatic content generation, capturing methods and provenance.
- Established `AvailabilityStatus` class to model resource availability with temporal validity.
- Developed `Documentation` class for structured documentation resources, replacing domain-specific slots.
- Implemented `Taxon` class for biological classification in natural history collections.
- Archived previous slots related to API availability and documentation, ensuring a clean schema.
- Enhanced existing slots with detailed descriptions and examples for clarity and usability.
- Removed deprecated slots: appraisal_notes, branch_id, is_or_was_real.
- Introduced new slots: has_or_had_notes, has_or_had_provenance.
- Created Notes class to encapsulate note-related metadata.
- Archived removed slots and classes in accordance with the new archive folder convention.
- Updated slot_fixes.yaml to reflect migration status and details.
- Enhanced documentation for new slots and classes, ensuring compliance with ontology alignment.
- Added new slots for note content, date, and type to support the Notes class.
- Added Overview class to represent structured collections of web links, including detailed descriptions, examples, and ontology alignments.
- Introduced RealnessStatus class to classify data as real or synthetic, with rich provenance and temporal semantics.
- Created WebLink class for representing hyperlinks with associated metadata, enhancing structured link representation.
- Established new slots: has_or_had_comprehensive_overview, is_or_was_real, and includes_or_included to support the new classes and improve data modeling.
- Migrated existing slots to new structures, ensuring compliance with RiC-O naming conventions and enhancing specificity.
- Updated annotations and examples across all new classes and slots for clarity and usability.
- Deleted the program_expense slot from the schema.
- Updated slot_fixes.yaml to reflect the migration of administrative_expenses, marking it as fully migrated and archiving related bespoke slots.
- Created archived YAML files for administrative_expenses, fundraising_expense, has_or_had_administrative_expense, innovation_expense, and program_expense, documenting their structure and descriptions.
- All expense types now utilize the Expenses class with ExpenseTypeEnum classification for better organization and clarity.
- Add first page (<<) and last page (>>) navigation buttons
- Add direct page number input field for jumping to specific pages
- Update CSS styling for new pagination controls including input field
- Use stacked ChevronLeft/ChevronRight icons for first/last (lucide-react compatibility)
- Skip YYYYMMDD and YYMMDD date patterns at the end of the email local part
- Skip digit sequences longer than 4 characters
- Require a non-digit character before a trailing 4-digit year
- Add knid.nl/kabelnoord.nl to consumer domains (Friesland ISP)
- Add 11 missing regional archive domains to HERITAGE_DOMAIN_MAP
- Update recalculation script to re-extract email semantics
Results:
- 3,151 false birth years removed
- 'Likely wrong person' reduced from 533 to 325 (-39%)
- 2,944 candidates' scores boosted
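
The birth-year rules above can be sketched roughly as follows. This is a minimal illustration, not the actual extraction code: the function name, the regular expressions, and the 1920-2010 plausibility window are all assumptions made for the example.

```python
import re
from typing import Optional

# Trailing run of digits at the end of the local part (before the "@").
TRAILING_DIGITS = re.compile(r"(\d+)$")
# A 4-digit year at the end, preceded by a non-digit character.
YEAR_AT_END = re.compile(r"\D(\d{4})$")


def extract_birth_year(email: str,
                       min_year: int = 1920,   # assumed lower bound
                       max_year: int = 2010    # assumed upper bound
                       ) -> Optional[int]:
    """Return a plausible birth year from an email address, or None."""
    local_part = email.split("@", 1)[0]

    run = TRAILING_DIGITS.search(local_part)
    if run is None:
        return None  # no trailing digits at all

    # Runs longer than 4 digits are skipped. This also covers
    # YYYYMMDD (8 digits) and YYMMDD (6 digits) date patterns.
    if len(run.group(1)) > 4:
        return None

    # Require a non-digit character directly before the 4-digit year.
    match = YEAR_AT_END.search(local_part)
    if match is None:
        return None

    year = int(match.group(1))
    return year if min_year <= year <= max_year else None
```

With these assumptions, an address such as j.devries1985@example.org yields 1985, while jan19850312@example.org and jan850312@example.org yield no birth year because the trailing digits form a date.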
- accepts_or_accepted_external_work: Remove verbose examples list
- accepts_or_accepted_payment_method: Condense to single sentence
- accepts_or_accepted_visiting_scholar: Minor rewording for consistency
- Add is_likely_wrong_person and wrong_person_reason fields to MatchCandidate
- Add confidence_original field for tracking pre-adjustment scores
- Add visual indicators: AlertTriangle for wrong person, Star for high confidence
- Add filter checkboxes: 'Show high confidence (>80%)' and 'Hide wrong person'
- Add wrong person alert banner with bilingual labels (NL/EN)
- Add danger stat card showing count of likely wrong person matches
- Style signal badges by type: danger (birth_year_mismatch), success (validated)
- Add extensive CSS for wrong-person/high-confidence alerts and candidate styling
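
The two filter checkboxes can be illustrated with a small sketch of the filtering logic. It is written in Python for brevity rather than as the frontend code; the `name` and `confidence` fields, the 0.0-1.0 confidence scale, and the `filter_candidates` helper are assumptions, while `is_likely_wrong_person`, `wrong_person_reason`, and `confidence_original` come from the change list above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MatchCandidate:
    name: str                                   # hypothetical field
    confidence: float                           # assumed scale: 0.0 to 1.0
    confidence_original: Optional[float] = None # score before adjustment
    is_likely_wrong_person: bool = False
    wrong_person_reason: Optional[str] = None


def filter_candidates(candidates: List[MatchCandidate],
                      only_high_confidence: bool = False,
                      hide_wrong_person: bool = False,
                      threshold: float = 0.80) -> List[MatchCandidate]:
    """Apply the two checkbox filters to a list of candidates."""
    result = []
    for candidate in candidates:
        # 'Show high confidence (>80%)': keep only scores above the threshold.
        if only_high_confidence and not candidate.confidence > threshold:
            continue
        # 'Hide wrong person': drop candidates flagged as likely wrong.
        if hide_wrong_person and candidate.is_likely_wrong_person:
            continue
        result.append(candidate)
    return result
```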
When users click on a different class, enum, or slot in the sidebar,
the ontology term popup now automatically closes. This prevents the
popup from persisting and showing stale information from the
previously viewed schema element.

The slot details section was rendering close_mappings, narrow_mappings,
broad_mappings, and related_mappings twice each. This caused the mappings
to appear duplicated on pages like /linkml?class=AcademicArchive.
Removed 68 lines of duplicate JSX code.
- Fix resolveUri() to handle bare local names like 'E27_Site' used by CIDOC-CRM
  (previously only handled URIs starting with '#')
- Add EDM (Europeana Data Model) ontology to frontend
  - Copy edm.owl to frontend/public/ontology/
  - Register in ONTOLOGY_FILES array
  - Add 'edm' prefix to STANDARD_PREFIXES
  - Add EDM color to ONTOLOGY_COLORS
- Render HTML content in ontology descriptions safely using DOMPurify
  - Sanitize HTML to allow only safe tags (a, br, em, strong, etc.)
  - Fix Schema.org relative links to absolute URLs
  - Add target='_blank' to external links
- Updated `entity_review.py` to map email semantic fields from JSON.
- Expanded `email_semantics.py` with additional museum mappings.
- Introduced a new rule in `.opencode/rules/no-duplicate-ontology-mappings.md` to prevent duplicate ontology mappings.
- Added a backup JSON file for entity resolution candidates.
- Created `enrich_email_semantics.py` to enrich candidates with email semantic signals.
- Developed `merge_entity_reviews.py` to merge reviewed decisions from a backup into new candidates.
- Create .opencode/rules/no-duplicate-ontology-mappings.md with detection script
- Add Rule 52 to AGENTS.md (after Rule 51)
- Fix 29 duplicate mappings: same URI in multiple mapping categories
  - 26 slot files: remove duplicates keeping most precise mapping
  - 3 class files: ExhibitionSpace, Custodian, DigitalPlatform
- Mapping precedence: exact > close > narrow/broad > related
Each ontology URI must appear in only ONE mapping category per schema
element. This follows SKOS semantics, where skos:exactMatch is disjoint
with skos:broadMatch and skos:relatedMatch, and applies the same
exclusivity to every mapping category.
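
A minimal sketch of the detection and cleanup logic, assuming each schema element is loaded as a plain dict keyed by the LinkML mapping slot names. The function names are hypothetical and this is not the detection script shipped with the rule. The precedence list places narrow before broad, which is an arbitrary tie-break since the rule treats them as equal.

```python
from typing import Dict, List

# Precedence: exact > close > narrow/broad > related.
# Assumption: narrow is checked before broad to break the tie.
PRECEDENCE = [
    "exact_mappings",
    "close_mappings",
    "narrow_mappings",
    "broad_mappings",
    "related_mappings",
]


def find_duplicates(element: dict) -> Dict[str, List[str]]:
    """Return {uri: [categories]} for URIs listed in more than one category."""
    seen: Dict[str, List[str]] = {}
    for category in PRECEDENCE:
        for uri in element.get(category) or []:
            categories = seen.setdefault(uri, [])
            if category not in categories:
                categories.append(category)
    return {uri: cats for uri, cats in seen.items() if len(cats) > 1}


def dedupe(element: dict) -> dict:
    """Keep each URI only in its most precise (highest-precedence) category."""
    result = dict(element)
    kept = set()
    for category in PRECEDENCE:
        if category not in element:
            continue
        unique = []
        for uri in element[category] or []:
            if uri not in kept:
                kept.add(uri)
                unique.append(uri)
        if unique:
            result[category] = unique
        else:
            # Drop categories that became empty after deduplication.
            del result[category]
    return result
```

Running find_duplicates over every slot and class file gives a report of violations, and dedupe shows one way to resolve them so that the most precise mapping wins.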
The VCard ontology file and three others use the @base directive with
relative URIs like <#Address>. The Turtle parser was neither extracting
@base nor resolving relative URIs against it.
Changes:
- Extract @base directive in first pass alongside @prefix
- Add baseUri parameter to expandUri() function
- Handle relative URIs starting with # (resolve against base)
- Handle empty relative URI <> (returns base URI itself)
- Pass baseUri through to processSubject() function
This fixes the 'Term not found' error for vcard:Address and similar terms
that use relative URI notation in their ontology definitions.
Affected ontologies: vcard.rdf, prov.ttl, era_ontology.ttl, ebg-ontology.ttl
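
The resolution steps can be sketched as follows. The sketch is in Python for illustration only and is not the frontend parser: `read_directives` and `expand_uri` are hypothetical stand-ins for the first pass and for expandUri(), the regular expressions are simplified, and the sample document uses a made-up example.org base.

```python
import re
from typing import Dict, Optional, Tuple

# Simplified directive patterns (assumption: one directive per statement).
BASE_RE = re.compile(r"@base\s+<([^>]*)>\s*\.")
PREFIX_RE = re.compile(r"@prefix\s+([\w-]*):\s*<([^>]*)>\s*\.")


def read_directives(turtle: str) -> Tuple[Optional[str], Dict[str, str]]:
    """First pass: collect the @base URI (if any) and all @prefix bindings."""
    base_match = BASE_RE.search(turtle)
    base_uri = base_match.group(1) if base_match else None
    prefixes = {prefix: uri for prefix, uri in PREFIX_RE.findall(turtle)}
    return base_uri, prefixes


def expand_uri(term: str,
               prefixes: Dict[str, str],
               base_uri: Optional[str] = None) -> str:
    """Expand a Turtle term such as <#Address>, <> or owl:Class."""
    if term.startswith("<") and term.endswith(">"):
        ref = term[1:-1]
        if ref == "" and base_uri is not None:
            return base_uri  # empty relative URI: the base URI itself
        if ref.startswith("#") and base_uri is not None:
            # Fragment reference: replace any fragment on the base.
            return base_uri.split("#", 1)[0] + ref
        return ref  # already absolute, or no base to resolve against
    prefix, sep, local = term.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return term


# Made-up sample document for demonstration.
SAMPLE = (
    "@base <http://example.org/onto> .\n"
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "<#Address> a owl:Class .\n"
)
```

Only fragment references and the empty reference are handled here, matching the cases listed above. A full implementation would follow the general reference resolution algorithm of RFC 3986.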
Removed conditional execution from Layer 2 and Layer 4.
All layers now run on every push, PR, and schedule.
Quality Gate requires all 4 layers to pass.
upload-artifact@v4 is not supported on GHES/Forgejo.
All 35 unit tests passed, but the job failed at the artifact upload step.
Downgrade to v3, which is compatible with self-hosted runners.
- Create venv at /opt/venv in each job
- Source /opt/venv/bin/activate before pip install and pytest commands
- Add python3-full package for complete venv support
- Fixes 'externally-managed-environment' error on Debian Bookworm
The Forgejo runner with label ubuntu-latest:docker://node:20-bookworm
does not properly support custom container overrides. Instead of using
container: image: python:3.11-slim, we now install Python via apt-get
in the node:20-bookworm base container (which is Debian-based).
Changes:
- Remove container: blocks from all 4 layer jobs
- Add 'Install Python' step to each job
- Use python3, python3 -m pip, and python3 -m pytest commands
- Remove trigger comment from test file
- Introduced SoundArchiveRecordSetType, SpecialCollectionRecordSetType, SpecializedArchiveRecordSetType, SpecializedArchivesCzechiaRecordSetType, StateArchivesRecordSetType, StateArchivesSectionRecordSetType, StateDistrictArchiveRecordSetType, StateRegionalArchiveCzechiaRecordSetType, TelevisionArchiveRecordSetType, TradeUnionArchiveRecordSetType, UniversityArchiveRecordSetType, VereinsarchivRecordSetType, VerlagsarchivRecordSetType, VerwaltungsarchivRecordSetType, WebArchiveRecordSetType, and WomensArchivesRecordSetType.
- Each new type includes appropriate metadata, slots, and relationships to existing classes.
- Implemented a script to detect and fix Type class violations in LinkML files.