- Created AuxiliaryDigitalPlatformTypeEnum.yaml to classify types of secondary digital platforms.
- Created AuxiliaryPlaceTypeEnum.yaml to classify types of secondary physical locations.
- Added OrganizationBranchTypeEnum.yaml for formal organizational branches at auxiliary locations.
- Introduced auxiliary_places.yaml slot to link CustodianPlace to subordinate physical locations.
- Introduced auxiliary_platforms.yaml slot to link DigitalPlatform to subordinate digital properties.
- Added located_at.yaml slot to connect OrganizationalStructure to physical locations.
- Created PlantUML diagrams for custodian types, full schema, legal status, and organizational structure.
- Implemented a script to generate GraphViz DOT diagrams from OWL/RDF ontology files.
- Developed a script to generate UML diagrams from modular LinkML schema, supporting both Mermaid and PlantUML formats.
- Enhanced class definitions and relationships in UML diagrams to reflect the latest schema updates.
- Created the Country class with ISO 3166-1 alpha-2 and alpha-3 codes, ensuring minimal design without additional metadata.
- Integrated the Country class into CustodianPlace and LegalForm schemas to support country-specific feature types and legal forms.
- Removed duplicate keys in FeatureTypeEnum.yaml, resulting in 294 unique feature types.
- Eliminated "Hypernyms:" text from FeatureTypeEnum descriptions, verifying that semantic relationships are now conveyed through ontology mappings.
- Created example instance file demonstrating integration of Country with CustodianPlace and LegalForm.
- Updated documentation to reflect the completion of the Country class implementation and hypernyms removal.
- Created SHACL shapes for validating temporal consistency and bidirectional relationships in custodial collections and staff observations.
- Implemented a Python script to validate RDF data against the defined SHACL shapes using the pyshacl library.
- Added command-line interface for validation with options for specifying data formats and output reports.
- Included detailed error handling and reporting for validation results.
- Implemented `owl_to_mermaid.py` to convert OWL/Turtle files into Mermaid class diagrams.
- Implemented `owl_to_plantuml.py` to convert OWL/Turtle files into PlantUML class diagrams.
- Added two new PlantUML files for custodian multi-aspect diagrams.
- Implemented three independent aspects for custodians: CustodianLegalStatus, CustodianName, and CustodianPlace.
- Renamed CustodianReconstruction to CustodianLegalStatus and updated all references.
- Created new components for CustodianPlace and PlaceSpecificityEnum.
- Removed direct links from CustodianObservation to Custodian, aligning with PROV-O standards.
- Generated comprehensive example instance demonstrating the new architecture.
- Updated documentation to reflect changes and provide guidance on multi-aspect modeling.
- Added React hook for managing IndexedDB operations, including storing and loading transformation results.
- Created complete YAML example for Rijksmuseum, illustrating the integration of all three aspects.
- Introduced custodian_hub_v3.mmd, custodian_hub_v4_final.mmd, and custodian_hub_v5_FINAL.mmd for Mermaid representation.
- Created custodian_hub_FINAL.puml and custodian_hub_v3.puml for PlantUML representation.
- Defined entities such as CustodianReconstruction, Identifier, TimeSpan, Agent, CustodianName, CustodianObservation, ReconstructionActivity, Appellation, ConfidenceMeasure, Custodian, LanguageCode, and SourceDocument.
- Established relationships and associations between entities, including temporal extents, observations, and reconstruction activities.
- Incorporated enumerations for various types, statuses, and classifications relevant to custodians and their activities.
- Introduced a new Mermaid diagram for Custodian Hub v2, detailing entities such as CustodianReconstruction, Identifier, TimeSpan, Agent, CustodianName, CustodianObservation, ReconstructionActivity, Appellation, ConfidenceMeasure, Custodian, LanguageCode, and SourceDocument.
- Established relationships between entities, including temporal extents, derivations, and revisions.
- Added a comprehensive PlantUML diagram reflecting the same structure and relationships, including enumerations for various types and statuses relevant to custodians and observations.
- Enhanced documentation to clarify the hub architecture pattern and its implications for data integrity and source authority.
- Implemented a new script to extract full metadata from 149 archive detail pages on archive-in-thueringen.de.
- Extracted data includes addresses, emails, phones, directors, collection sizes, opening hours, histories, and more.
- Introduced structured data parsing and error handling for robust data extraction.
- Added rate limiting to respect server load and improve scraping efficiency.
- Results are saved in a JSON format with detailed metadata about the extraction process.
- Introduced `test_nlp_extractor.py` with unit tests for the InstitutionExtractor, covering various extraction patterns (ISIL, Wikidata, VIAF, city names) and ensuring proper classification of institutions (museum, library, archive).
- Added tests for extracted entities and result handling to validate the extraction process.
- Created `test_partnership_rdf_integration.py` to validate the end-to-end process of extracting partnerships from a conversation and exporting them to RDF format.
- Implemented tests for temporal properties in partnerships and ensured compliance with W3C Organization Ontology patterns.
- Verified that extracted partnerships are correctly linked with PROV-O provenance metadata.
- Add Wikidata Q-numbers to 8 Brazilian institutions
- Coverage: 56/212 institutions (26.4%, +5.6pp gain)
- All Q-numbers validated via Wikidata authenticated API
- Largest single batch gain yet
- Note: Duplicate entries detected, deduplication needed
Q-numbers added:
- Q10333651 - Museu da Borracha
- Q10387829 - UFAC Repository
- Q10345196 - Parque Memorial Quilombo dos Palmares
- Q1434444 - Teatro Amazonas
- Q116921020 - Centro Cultural dos Povos da Amazônia
- Q7894381 - UNIFAP
- Q16496091 - Arquivo Público do Estado da Bahia
- Q56695457 - Museu de Arqueologia e Etnologia da UFPR