Commit graph

55 commits

Author SHA1 Message Date
kempersc
c50c35fd3a enrich person custodian 2025-12-14 17:09:55 +01:00
kempersc
41aace785f feat: Add SyncPanel component for database synchronization
- Add SyncPanel component with bilingual (NL/EN) support
- Add relative URL handling for production (bronhouder.nl)
- Integrate SyncPanel into Database page
- Show sync status for all 4 databases (DuckLake, PostgreSQL, Oxigraph, Qdrant)
- Support dry-run mode and file limit options
2025-12-12 23:42:22 +01:00
kempersc
505c12601a Add test script for PiCo extraction from Arabic waqf documents
- Implemented a new script `test_pico_arabic_waqf.py` to test the GLM annotator's ability to extract person observations from Arabic historical documents.
- The script includes environment variable handling for API token, structured prompts for the GLM API, and validation of extraction results.
- Added comprehensive logging for API responses, extraction results, and validation errors.
- Included a sample Arabic waqf text for testing purposes, following the PiCo ontology pattern.
2025-12-12 17:50:17 +01:00
kempersc
b1f93b6f22 enrich person profiles 2025-12-12 12:51:10 +01:00
kempersc
03263f67d6 moved web archives 2025-12-12 00:40:26 +01:00
kempersc
1b1cfbfca0 enrich custodians 2025-12-11 22:32:09 +01:00
kempersc
d4906abae4 update postgis data 2025-12-10 23:51:51 +01:00
kempersc
be3fbac601 enrich entries and persons 2025-12-10 18:04:25 +01:00
kempersc
41959f0766 correct HCID! 2025-12-10 13:01:13 +01:00
kempersc
3a6ead8fde feat: Add legal form filtering rule for CustodianName
- Introduced LEGAL-FORM-FILTER rule to standardize CustodianName by removing legal form designations.
- Documented rationale, examples, and implementation guidelines for the filtering process.

docs: Create README for value standardization rules

- Established a comprehensive README outlining various value standardization rules applicable to Heritage Custodian classes.
- Categorized rules into Name Standardization, Geographic Standardization, Web Observation, and Schema Evolution.

feat: Implement transliteration standards for non-Latin scripts

- Added TRANSLIT-ISO rule to ensure GHCID abbreviations are generated from emic names using ISO standards for transliteration.
- Included detailed guidelines for various scripts and languages, along with implementation examples.

feat: Define XPath provenance rules for web observations

- Created XPATH-PROVENANCE rule mandating XPath pointers for claims extracted from web sources.
- Established a workflow for archiving websites and verifying claims against archived HTML.

chore: Update records lifecycle diagram

- Generated a new Mermaid diagram illustrating the records lifecycle for heritage custodians.
- Included phases for active records, inactive archives, and processed heritage collections with key relationships and classifications.
2025-12-09 16:58:41 +01:00
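A minimal sketch of what the LEGAL-FORM-FILTER rule from commit 3a6ead8fde might look like in code, assuming legal form designations appear as a leading or trailing token of the name. The `LEGAL_FORMS` list and the `strip_legal_form` helper are hypothetical illustrations; the rule's actual list and edge cases are defined in the rule documentation, which is not shown in this log.

```python
import re

# Hypothetical, deliberately short list of legal form designations.
# The real LEGAL-FORM-FILTER rule defines its own list.
LEGAL_FORMS = ["Stichting", "Vereniging", "B.V.", "N.V.", "Ltd", "Inc.", "GmbH"]

# Longest designations first so the alternation never stops at a shorter prefix.
_alternatives = "|".join(
    re.escape(form) for form in sorted(LEGAL_FORMS, key=len, reverse=True)
)
_LEADING = re.compile(rf"^(?:{_alternatives})\s+", re.IGNORECASE)
_TRAILING = re.compile(rf"[\s,]+(?:{_alternatives})$", re.IGNORECASE)


def strip_legal_form(name: str) -> str:
    """Remove a leading or trailing legal form designation from a name."""
    cleaned = name.strip()
    cleaned = _LEADING.sub("", cleaned)
    cleaned = _TRAILING.sub("", cleaned)
    # Never return an empty name: fall back to the trimmed input.
    return cleaned.strip() or name.strip()
```

Under these assumptions, `strip_legal_form("Stichting Voorbeeld Museum")` returns `"Voorbeeld Museum"`, and a name that consists only of a legal form is left unchanged.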
kempersc
a7321b1bb9 reconstruct location blocks 2025-12-09 12:25:16 +01:00
kempersc
cab712659d recover location blocks 2025-12-09 11:34:56 +01:00
kempersc
62fdd35321 Refactor code structure for improved readability and maintainability 2025-12-09 11:15:51 +01:00
kempersc
131e3ca259 normalise custodian entries 2025-12-09 07:56:35 +01:00
kempersc
13f67bed19 feat(frontend): add graph visualization and data explorer features
Database Panels:
- Add D3.js force-directed graph visualization to Oxigraph and TypeDB panels
- Add 'Explore' tab with class/entity browser, graph/table toggle, and search
- Add data explorer to PostgreSQL panel with table browser, pagination, search, export
- Fix SPARQL variable naming bug in Oxigraph getGraphData() function
- Add node details panel showing selected entity attributes
- Add zoom/pan controls and node coloring by entity type

Map Features:
- Add TimelineSlider component for temporal filtering of institutions
- Support dual-handle range slider with decade histogram
- Add quick presets (Ancient, Medieval, Modern, Contemporary)
- Show institution density visualization by founding decade

Hooks:
- Extend useOxigraph with getGraphData() for graph visualization
- Extend useTypeDB with getGraphData() for graph visualization
- Extend usePostgreSQL with getTableData() and exportTableData()
- Improve useDuckLakeInstitutions with temporal filtering support

Styles:
- Add HeritageDashboard.css with shared panel styling
- Add TimelineSlider.css for timeline component styling
2025-12-08 14:56:17 +01:00
kempersc
7e3559f7e5 add new entries 2025-12-07 23:08:02 +01:00
kempersc
57c743b005 refactor(frontend): fetch NL municipalities from PostGIS API instead of static file
Replace static netherlands_municipalities_simplified.geojson with dynamic
PostGIS API call to /boundaries/countries/NL/admin2/geojson.

Transform API response properties to expected format:
- API: {code, name, name_local, admin1_code, admin1_name}
- Expected: {code, naam, provincieCode, provincieNaam}

This ensures NL boundary data comes from the authoritative PostGIS
database rather than a static file that could become outdated.
2025-12-07 19:48:07 +01:00
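The property transformation described in commit 57c743b005 can be sketched as follows. This is written in Python for brevity, while the actual change lives in the frontend code, which is not shown here. The function names are hypothetical, and the choice to fill `naam` from `name_local` with a fallback to `name` is an assumption, since the commit message does not say which of the two is used.

```python
from typing import Any


def to_expected_properties(api_props: dict[str, Any]) -> dict[str, Any]:
    """Map one PostGIS API feature's properties to the shape the map expects."""
    return {
        "code": api_props["code"],
        # Assumption: prefer the local name, falling back to `name`.
        "naam": api_props.get("name_local") or api_props["name"],
        "provincieCode": api_props["admin1_code"],
        "provincieNaam": api_props["admin1_name"],
    }


def transform_feature_collection(geojson: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a GeoJSON FeatureCollection with remapped properties."""
    features = [
        {**feature, "properties": to_expected_properties(feature["properties"])}
        for feature in geojson.get("features", [])
    ]
    return {**geojson, "features": features}
```

The sketch builds new dictionaries rather than mutating the API response, so a cached response can be reused safely.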
kempersc
12965071be fix(frontend): improve DuckLake connection detection in map page
Wait for DuckLake loading to complete before deciding whether to use
DuckLake data or fall back to static JSON. This prevents race conditions.
2025-12-07 19:23:12 +01:00
kempersc
1981dc28ed fix(frontend): normalize org_type names to letter codes in DuckLake hook
DuckLake stores full names like 'MUSEUM' but map expects single-letter
codes like 'M' for color styling. Also includes CSS fixes for Database page.
2025-12-07 19:21:58 +01:00
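The normalization in commit 1981dc28ed might look roughly like the sketch below, written in Python for brevity even though the real hook is frontend code. Only the `MUSEUM` to `M` pair comes from the commit message. The `ORG_TYPE_CODES` table, the `normalize_org_type` name, and the `"?"` fallback are hypothetical.

```python
# Only the MUSEUM -> M pair is taken from the commit message; a real table
# would list every organisation type that the map styles.
ORG_TYPE_CODES = {"MUSEUM": "M"}


def normalize_org_type(org_type: str, fallback: str = "?") -> str:
    """Return the single-letter code for a DuckLake org_type value."""
    value = org_type.strip().upper()
    if len(value) == 1:
        # Already a letter code: pass it through unchanged.
        return value
    # Unknown full names map to the (hypothetical) fallback code.
    return ORG_TYPE_CODES.get(value, fallback)
```

Passing existing letter codes through unchanged lets the same function handle both DuckLake rows and data that already uses the map's codes.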
kempersc
810022d524 feat(frontend): add search filter and claim page filter to DuckLakePanel
- Add search bar to filter table data across all columns
- Filter web archive claims by selected page
- Include source_page in claim queries for filtering
- Fix TypeScript unused parameter warning
2025-12-07 19:20:40 +01:00
kempersc
f82dd57903 feat(frontend): add useBoundariesAPI hook for PostGIS boundary fetching
New React hook that fetches administrative boundaries from the PostGIS API:
- Supports international boundaries (NL, JP, CZ, DE, BE, CH, AT, etc.)
- Caches admin1, admin2, and GeoJSON data
- Provides point-in-polygon lookup
- Includes utility functions for filtering boundaries by code/name
- Replaces static GeoJSON file loading pattern
2025-12-07 19:20:21 +01:00
kempersc
d9325c0bb5 feat: add web archives integration and improve enrichment scripts
Backend:
- Attach web_archives.duckdb as read-only database in DuckLake
- Create views for web_archives, web_pages, web_claims in heritage schema

Scripts:
- enrich_cities_google.py: Add batch processing and retry logic
- migrate_web_archives.py: Improve schema handling and error recovery

Frontend:
- DuckLakePanel: Add web archives query support
- Database.css: Improve layout for query results display
2025-12-07 17:49:07 +01:00
kempersc
0b06af0fb6 chore: mark unused function and ignore ducklake databases 2025-12-07 14:28:12 +01:00
kempersc
9d15cce65c docs: add enrichment reports and update manifest
Add enrichment reports from city resolution:
- Austrian, Belgian, Bulgarian, Czech, Swiss ISIL enrichment reports
- GeoNames update reports
- Custodian creation reports
- Entry-to-GHCID mapping file
2025-12-07 14:27:36 +01:00
kempersc
4825f57951 feat(frontend): improve werkgebied display and database UI
- Fix polygon rendering by using static paint properties instead of data-driven ones
- Add ensureSourceAndLayers() helper for reliable layer management
- Use setPaintProperty() for historical vs modern styling distinction
- Improve Database page layout with back buttons and cleaner navigation
- Add ResizableNestedTable component for DuckLake data display
- Optimize spacing and layout in Database.css
- Update schema manifest
2025-12-07 14:26:37 +01:00
kempersc
ee4e57bc75 add new entries 2025-12-07 00:26:01 +01:00
kempersc
1635625032 added web annotations 2025-12-06 19:50:04 +01:00
kempersc
55e2cd2340 feat: implement LLM-based extraction for Archives Lab content
- Introduced `llm_extract_archiveslab.py` script for entity and relationship extraction using LLMAnnotator with GLAM-NER v1.7.0.
- Replaced regex-based extraction with generative LLM inference.
- Added functions for loading markdown content, converting annotation sessions to dictionaries, and generating extraction statistics.
- Implemented comprehensive logging of extraction results, including counts of entities, relationships, and specific types like heritage institutions and persons.
- Results and statistics are saved in JSON format for further analysis.
2025-12-05 23:16:21 +01:00
kempersc
3a242370fc annotation standards added 2025-12-05 15:30:23 +01:00
kempersc
d661947830 update enriched entries 2025-12-03 17:38:46 +01:00
kempersc
ef89b1213a validate enrichments 2025-12-02 14:36:01 +01:00
kempersc
4b833d20b2 add pids 2025-12-01 23:55:55 +01:00
kempersc
7dce283c17 Add new enums for PersonalCollectionType, ResearchCenterType, and TasteScentHeritage classifications; implement validation script for custodian names against authoritative sources 2025-12-01 18:39:22 +01:00
kempersc
48a2b26f59 feat: Add script to generate Mermaid ER diagrams with instance data from LinkML schemas
- Implemented `generate_mermaid_with_instances.py` to create ER diagrams that include all classes, relationships, enum values, and instance data.
- Loaded instance data from YAML files and enriched enum definitions with meaningful annotations.
- Configured output paths for generated diagrams in both frontend and schema directories.
- Added support for excluding technical classes and limiting the number of displayed enum and instance values for readability.
2025-12-01 16:58:03 +01:00
kempersc
097d116b72 enrich entries 2025-12-01 16:06:34 +01:00
kempersc
2497e5913f enrich entries 2025-12-01 00:37:24 +01:00
kempersc
f3c149b1bb update entries 2025-11-30 23:30:29 +01:00
kempersc
ff92698c7a Implement feature X to enhance user experience and fix bug Y in module Z 2025-11-30 23:25:05 +01:00
kempersc
d623f0af4a store archived websites 2025-11-29 20:40:46 +01:00
kempersc
0ab8f24a6b archive websites 2025-11-29 18:05:16 +01:00
kempersc
da1eae6597 Refactor code structure for improved readability and maintainability 2025-11-29 12:27:39 +01:00
kempersc
30162e6526 Add script to validate KB library entries and generate enrichment report
- Implemented a Python script to validate KB library YAML files for required fields and data quality.
- Analyzed enrichment coverage from Wikidata and Google Maps, generating statistics.
- Created a comprehensive markdown report summarizing validation results and enrichment quality.
- Included error handling for file loading and validation processes.
- Generated JSON statistics for further analysis.
2025-11-28 14:48:33 +01:00
kempersc
5cdce584b2 Add complete schema for heritage custodian observation reconstruction
- Introduced a comprehensive class diagram for the heritage custodian observation reconstruction schema.
- Defined multiple classes including AllocationAgency, ArchiveOrganizationType, AuxiliaryDigitalPlatform, and others, with relevant attributes and relationships.
- Established inheritance and associations among classes to represent complex relationships within the schema.
- Generated on 2025-11-28, version 0.9.0, excluding the Container class.
2025-11-28 13:13:23 +01:00
kempersc
0d1741c55e Refactor code structure for improved readability and maintainability 2025-11-28 11:44:21 +01:00
kempersc
37886f0433 Refactor code structure for improved readability and maintainability 2025-11-27 17:43:14 +01:00
kempersc
5ef8ccac51 Add script to enrich NDE Register NL entries with Wikidata data
- Implemented a Python script that fetches and enriches entries from the NDE Register using data from Wikidata.
- Utilized the Wikibase REST API and SPARQL endpoints for data retrieval.
- Added logging for tracking progress and errors during the enrichment process.
- Configured rate limiting based on authentication status for API requests.
- Created a structured output in YAML format, including detailed enrichment data.
- Generated a log file summarizing the enrichment process and results.
2025-11-27 13:30:00 +01:00
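Rate limiting based on authentication status, as mentioned in commit 5ef8ccac51, could be sketched like this. The delay values, `request_delay`, and `RateLimiter` are hypothetical; the commit message does not state the actual limits or how the script enforces them.

```python
import time
from typing import Callable, Optional

# Hypothetical delays in seconds; the real limits are not stated in the commit.
AUTHENTICATED_DELAY = 0.1
ANONYMOUS_DELAY = 1.0


def request_delay(token: Optional[str]) -> float:
    """Pick the pause between requests depending on whether a token is set."""
    return AUTHENTICATED_DELAY if token else ANONYMOUS_DELAY


class RateLimiter:
    """Enforce a minimum interval between consecutive calls to wait()."""

    def __init__(
        self,
        delay: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._delay = delay
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()

    def wait(self) -> None:
        """Block until the next request is allowed, then reserve the next slot."""
        remaining = self._next_allowed - self._clock()
        if remaining > 0:
            self._sleep(remaining)
        self._next_allowed = self._clock() + self._delay
```

A caller would create `RateLimiter(request_delay(token))` once and call `wait()` before each API request. The injectable `clock` and `sleep` parameters exist only to make the limiter testable without real pauses.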
kempersc
cd0ff5b9c7 wrap up example list 2025-11-27 10:58:53 +01:00
kempersc
a6cbce1749 feat: Implement intersection calculation for UML diagram node links 2025-11-27 10:58:45 +01:00
kempersc
e99b1e644e feat: Add platform_description slot for detailed auxiliary platform information 2025-11-26 10:18:16 +01:00
kempersc
a5a66eb547 add classes 2025-11-25 12:48:07 +01:00