kempersc
89001fbc53
compact header controls on OntologyViewer and QueryBuilder pages
2026-01-04 17:29:34 +01:00
kempersc
eb61f45de2
compact UML controls toolbar to fit single line when sidebar collapsed
2026-01-04 17:21:53 +01:00
kempersc
2dca28d8c1
enrich CH entries with mission statements
2026-01-04 13:12:32 +01:00
kempersc
4f0cafe98a
enrich HC profiles
2026-01-02 02:11:04 +01:00
kempersc
349f31ae6f
enrich custodian profiles
2026-01-02 02:10:18 +01:00
kempersc
aee76fcc7f
backup html content
2025-12-31 02:36:38 +01:00
kempersc
b7701c8a8e
backup person profiles
2025-12-31 00:04:09 +01:00
kempersc
7108cb1483
backup person profiles
2025-12-31 00:00:25 +01:00
kempersc
38dcd2ce9c
Restore YAML files for Museum Dokkum and Gemeente Smallingerland with enriched data and provenance tracking
2025-12-30 23:58:21 +01:00
kempersc
1d8fd68e3a
backup custodian web profiles
2025-12-30 23:53:16 +01:00
kempersc
f6a5962c3b
backup person profiles
2025-12-30 23:48:50 +01:00
kempersc
cbf88d2a6d
backup person profiles
2025-12-30 23:44:57 +01:00
kempersc
30b701a5ec
backup HC data
2025-12-30 23:41:15 +01:00
kempersc
c417d0c758
Refactor code structure for improved readability and maintainability
2025-12-30 23:38:18 +01:00
kempersc
fb0daab718
backup JP profiles
2025-12-30 23:24:30 +01:00
kempersc
b42d6bf5d2
backup CZ and JP
2025-12-30 23:19:38 +01:00
kempersc
45e873ec0a
enrich JP BE AR profiles
2025-12-30 23:07:03 +01:00
kempersc
bc6ad46bfa
enrich CZ and JP profiles
2025-12-30 23:03:03 +01:00
kempersc
90b402dba6
enrich AR and Czech files
2025-12-30 23:01:01 +01:00
kempersc
f753d7277f
Add country code extraction for location validation in Google Places API
2025-12-30 03:45:29 +01:00
kempersc
cefc847056
Remove custodian entry for Leica AG from YAML file
2025-12-30 03:44:25 +01:00
kempersc
9159ff35db
Add custodian entry for Leica AG with data contamination fixes and location corrections
2025-12-30 03:43:47 +01:00
kempersc
d64f857aa9
add sparql validator and RAG injector
2025-12-30 03:43:31 +01:00
kempersc
84904e344b
Make AGENTS more succinct by referring to opencode rules & enrich custodians
2025-12-28 14:56:35 +01:00
kempersc
4cf3fe8a07
Logo enrichment batch: JP+170 (5,166/12,096 = 42.7%) - 14,503 total (45.6%)
2025-12-27 13:17:40 +01:00
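The progress figures in the logo enrichment commit titles are plain ratios. A minimal sketch of how such a status string could be produced, assuming a hypothetical helper `format_progress` that is not part of the repository:

```python
def format_progress(done: int, total: int) -> str:
    """Render a 'done/total = pct%' string with thousands separators."""
    if total <= 0:
        raise ValueError("total must be positive")
    # The .1% format spec multiplies by 100 and keeps one decimal place.
    return f"{done:,}/{total:,} = {done / total:.1%}"


# Figures taken from the commit title above.
jp_progress = format_progress(5166, 12096)
all_progress = format_progress(14503, 31772)
```

With the counts from the commit above, `jp_progress` reproduces the JP figure and `all_progress` the overall 45.6%.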
kempersc
3447a9cc6c
Logo enrichment batch: JP+440 (4,996/12,096 = 41.3%) - 14,333 total (45.1%)
2025-12-27 12:20:53 +01:00
kempersc
cdb633b0c9
enrich custodian entries with logo
2025-12-27 02:15:17 +01:00
kempersc
fd91fec63f
Logo enrichment batch: JP+320, 13,603 total (42.8%)
...
- JP: 4,516/12,096 (37.4%) ✅ NEW COMMIT
- CZ: 3,820/8,432 (45.3%) - batches 7-16 running
- CH, NL, BE, AT, BR: 100% complete
- Total: 13,603/31,772 (42.8%)
- Using crawl4ai favicon extraction
2025-12-26 23:25:40 +01:00
kempersc
2104a90f22
Logo enrichment COMPLETE: CZ 3,820 (45.3%)
...
- CZ: 3,820/8,432 files processed (45.3%)
- 9 parallel batches completed (500 files each)
- NL person entities added (4 staff profiles)
- scripts/discover_websites_crawl4ai.py modified
- Using crawl4ai favicon extraction
2025-12-26 21:45:14 +01:00
kempersc
6af5009444
enrich entries
2025-12-26 21:41:18 +01:00
kempersc
ca219340f2
enrich entries
2025-12-26 14:30:31 +01:00
kempersc
59963c8d3f
Logo enrichment batch: JP+300, CZ-0 - 12,833 files (40.4%)
...
- JP: 4,496 processed (37.2% of 12,096) ✅ COMPLETE
- CZ: 2,820 processed (33.4% of 8,432) - batch completed, slight decrease
- CH, NL, BE, AT, BR: 100% complete
- Total: 12,833 of 31,772 files (40.4%)
- Using crawl4ai favicon extraction
2025-12-26 13:42:21 +01:00
kempersc
fb7993e3af
fix: filter DSPy field markers from streaming output
...
Implements a state machine to filter streaming tokens:
- Only stream tokens from the 'answer' field to the frontend
- Skip tokens from 'reasoning', 'citations', 'confidence', 'follow_up' fields
- Remove DSPy field markers like '[[ ## answer ## ]]' from streamed content
This fixes the issue where raw DSPy signature field markers were being
displayed in the chat interface instead of clean answer text.
2025-12-26 03:11:44 +01:00
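The filtering described in the commit above can be sketched as a small state machine. This is an illustration under assumptions, not the repository's implementation: the function name `stream_answer_only` is hypothetical, and the marker format is inferred from the single example `[[ ## answer ## ]]` in the commit message.

```python
import re
from typing import Iterable, Iterator, Optional

# Marker shape assumed from the commit message, e.g. "[[ ## answer ## ]]".
MARKER = re.compile(r"\[\[ ## (\w+) ## \]\]")


def _safe_length(buffer: str) -> int:
    """Length of the prefix that cannot be the start of a split marker."""
    cut = buffer.rfind("[[")
    if cut != -1:
        return cut
    if buffer.endswith("["):
        return len(buffer) - 1
    return len(buffer)


def stream_answer_only(tokens: Iterable[str]) -> Iterator[str]:
    """Yield only text that belongs to the 'answer' field, without markers."""
    buffer = ""
    field: Optional[str] = None  # field whose content is currently arriving
    for token in tokens:
        buffer += token
        # Consume every complete marker currently in the buffer.
        while True:
            match = MARKER.search(buffer)
            if match is None:
                break
            if field == "answer" and match.start() > 0:
                yield buffer[:match.start()]
            field = match.group(1)
            buffer = buffer[match.end():]
        # Emit (or drop) text that is certainly not part of a marker;
        # keep a possible marker prefix buffered for the next token.
        keep = _safe_length(buffer)
        if field == "answer" and keep > 0:
            yield buffer[:keep]
        buffer = buffer[keep:]
    # Flush leftover text once the stream ends.
    if field == "answer" and buffer:
        yield buffer
```

A marker can be split across tokens, which is why text starting at the last `[[` is held back until the next token arrives. One limitation of this sketch: a literal `[[` inside the answer is delayed until the next marker or the end of the stream.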
kempersc
6b9fa33767
Logo enrichment batch: CZ+500, JP+170 - 12,513 files (39.4%)
...
- CZ: 2,820 processed (33.4% of 8,432)
- JP: 4,176 processed (34.5% of 12,096)
- Total: 12,513 of 31,772 (39.4%)
- CZ batch completed: 500 files, 52 logos found
- JP batch crashed during run (4,176 files before crash)
- Using crawl4ai favicon extraction
2025-12-26 02:03:48 +01:00
kempersc
63400392ff
Fix CZ-52-PAB-L-IPVVZOVI logo: use primary_logo.png instead of favicon.ico
...
- Primary logo (logo.png) identified via crawl4ai direct scraping
- Favicon (favicon.ico) retained as secondary asset
- Updated claims: primary_logo_url + favicon_url
- Summary shows: has_primary_logo: true, total_claims: 2
2025-12-25 21:01:05 +01:00
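The claim summary mentioned in the commit above could be derived along these lines. This is a sketch only: the `claim_type` and `value` keys, the example URLs, and the function name are hypothetical, and only the field names `primary_logo_url`, `favicon_url`, `has_primary_logo`, and `total_claims` come from the commit message.

```python
from typing import Dict, List


def summarize_logo_claims(claims: List[Dict[str, str]]) -> Dict[str, object]:
    """Build a summary block from a list of logo claims."""
    has_primary = any(c.get("claim_type") == "primary_logo_url" for c in claims)
    return {"has_primary_logo": has_primary, "total_claims": len(claims)}


# Placeholder URLs; the real asset locations are not given in the log.
claims = [
    {"claim_type": "primary_logo_url", "value": "https://example.org/logo.png"},
    {"claim_type": "favicon_url", "value": "https://example.org/favicon.ico"},
]
summary = summarize_logo_claims(claims)
```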
kempersc
6ab0b19ae2
Logo enrichment batch: CZ+260, JP+260 - 11,663 files (36.7%)
...
- CZ: 2,810 processed (33.3% of 8,432)
- JP: 3,336 processed (27.6% of 12,096)
- Total: 11,663 of 31,772 (36.7%)
- Using crawl4ai favicon extraction
2025-12-25 19:23:41 +01:00
kempersc
717ee3408a
Logo enrichment batch: JP+771, CZ+380 - 10,913 files (34%)
...
- JP: 2,846 processed (24% of 12,096)
- CZ: 2,550 processed (30% of 8,432)
- CH, NL, BE, AT, BR: 100% complete
- Total: 10,913 of 31,772 files (34%)
- Using crawl4ai favicon extraction
2025-12-25 13:44:26 +01:00
kempersc
d5f2f542ce
feat(archief-assistent): preserve SPARQL queries in semantic cache
...
- Add sparqlQuery field to CachedResponse interface
- Extract SPARQL query before cache storage (not after)
- Include sparqlQuery in cache HIT message objects
- Handle both snake_case (server) and camelCase field names
SPARQL queries are now displayed for both fresh API responses
and cached responses, improving debugging and transparency.
2025-12-24 22:26:22 +01:00
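The commit above concerns a TypeScript frontend; the same idea is sketched here in Python for consistency with the other examples in this log. The snake_case key `sparql_query`, the `answer` key, and both function names are assumptions; only `sparqlQuery` and `CachedResponse` appear in the commit message.

```python
from typing import Any, Dict, Optional


def extract_sparql_query(response: Dict[str, Any]) -> Optional[str]:
    """Return the SPARQL query from a response, whichever key style it uses."""
    for key in ("sparql_query", "sparqlQuery"):  # server style, then client style
        value = response.get(key)
        if isinstance(value, str) and value.strip():
            return value
    return None


def to_cached_response(response: Dict[str, Any]) -> Dict[str, Any]:
    """Build the cache entry, capturing the query before the entry is stored."""
    return {
        "answer": response.get("answer", ""),
        "sparqlQuery": extract_sparql_query(response),
    }
```

Extracting the query while building the cache entry, rather than after storage, means a later cache hit carries the query along with the answer.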
kempersc
c3387ef3f1
Logo enrichment batch: CZ +380, JP +125, AR +28 files
...
- CZ: 2,170 processed (26% of 8,432)
- JP: 2,075 processed (17% of 12,096)
- AR: Started processing
- Total checkpoint: 9,762 files across all countries
- Using crawl4ai favicon extraction
2025-12-24 12:50:20 +01:00
kempersc
57de5e4b11
CZ logo enrichment: 1,790 files processed (21%)
...
- Added logo_enrichment to 771 Czech custodian files
- 87% logo hit rate using crawl4ai favicon extraction
- Total checkpoint: 9,257 files across all countries
- CZ remaining: 6,642 files
2025-12-24 02:41:26 +01:00
kempersc
ce1f80d024
enrich: logo enrichment progress (CZ: 220, JP: 1600)
2025-12-23 22:08:43 +01:00
kempersc
4f6ca92084
enrich: logo enrichment progress (JP: 1500, CZ: 40 started)
2025-12-23 21:37:10 +01:00
kempersc
8036eb5a3f
enrich: logo enrichment for JP custodians (1490 processed, 10606 remaining)
2025-12-23 21:17:45 +01:00
kempersc
38292d1918
enrich: logo enrichment for JP custodians (1350 processed, 10746 remaining)
2025-12-23 20:56:21 +01:00
kempersc
54869589d1
fix(linkml-viewer): 3D cube visualization bugs - prevent click-to-filter and parse JSON custodian_types
...
- Remove onFaceClick prop from CustodianTypeIndicator3D in class/slot/enum detail views
to prevent accidental filtering when clicking decorative cubes (Bug 3)
- Add parseCustodianTypesAnnotation() helper to handle JSON-stringified arrays like '["A"]'
in YAML annotations, fixing Bug 2 where all 19 letters appeared on every cube
- Legend bar retains onTypeClick for intentional filtering functionality
2025-12-23 20:40:32 +01:00
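The helper named in the commit above, `parseCustodianTypesAnnotation()`, lives in the TypeScript viewer. A Python sketch of the same parsing idea follows, assuming an annotation may arrive as a real list, a JSON-stringified array such as `'["A"]'`, or a bare code; the fallback behaviour is a guess, not the author's confirmed logic.

```python
import json
from typing import Any, List


def parse_custodian_types_annotation(value: Any) -> List[str]:
    """Normalize a custodian_types annotation to a list of type codes."""
    if isinstance(value, list):
        return [str(item) for item in value]
    if not isinstance(value, str):
        return []
    text = value.strip()
    if text.startswith("["):
        # Looks like a JSON-stringified array, e.g. '["A"]'.
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            parsed = None
        if isinstance(parsed, list):
            return [str(item) for item in parsed]
    # Fall back to treating the whole string as a single code.
    return [text] if text else []
```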
kempersc
5e8a432ef0
enrich Japanese and Dutch custodians
2025-12-23 18:08:45 +01:00
kempersc
a1fb6344e7
enriching custodian data
2025-12-23 17:26:29 +01:00
kempersc
0c1d19e98b
enrich entries
2025-12-23 13:27:35 +01:00
kempersc
879cddc47e
fix(rag): update HeritageSPARQLGenerator with correct ontology
...
- Use hc: <https://w3id.org/heritage/custodian/> prefix
- Use hc:institutionType with single-letter codes (M, L, A, etc.)
- Use Wikidata URIs for countries (Q55=NL, Q31=BE, etc.)
- Update all SPARQL examples to use correct ontology
- Align with actual RDF data in Oxigraph
2025-12-22 22:32:08 +01:00
kempersc
0860f6094d
fix(sparql): correct ontology in dspy_sparql.py to match actual RDF data
...
- Use crm:E39_Actor instead of glam:HeritageCustodian
- Use hc:institutionType with single-letter codes (M, L, A, etc.)
- Use Wikidata URIs for countries (Q55=NL, Q31=BE, etc.)
- Use skos:prefLabel for institution names
- Update ONTOLOGY_CONTEXT with correct examples
2025-12-22 22:22:07 +01:00
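The conventions listed in the two SPARQL commits above can be combined into a query builder like the following. This is a sketch under stated assumptions: the country predicate `hc:country` is hypothetical (the log does not name it), the `crm`, `skos`, and `wd` namespace IRIs are the standard public ones rather than values confirmed by the repository, and the type code is assumed to be stored as a plain string literal.

```python
import re

PREFIXES = "\n".join([
    "PREFIX hc: <https://w3id.org/heritage/custodian/>",
    "PREFIX crm: <http://www.cidoc-crm.org/cidoc-crm/>",
    "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>",
    "PREFIX wd: <http://www.wikidata.org/entity/>",
])

# Only the two mappings given in the commit messages.
COUNTRY_QIDS = {"NL": "Q55", "BE": "Q31"}


def build_institution_query(type_code: str, country_code: str,
                            country_predicate: str = "hc:country",  # hypothetical
                            limit: int = 10) -> str:
    """Build a SELECT query for institutions of one type in one country."""
    if not re.fullmatch(r"[A-Z]", type_code):
        raise ValueError("type_code must be a single uppercase letter")
    qid = COUNTRY_QIDS[country_code]  # KeyError for countries not mapped here
    return f"""{PREFIXES}
SELECT ?institution ?name WHERE {{
  ?institution a crm:E39_Actor ;
               hc:institutionType "{type_code}" ;
               {country_predicate} wd:{qid} ;
               skos:prefLabel ?name .
}}
LIMIT {int(limit)}"""


query = build_institution_query("M", "NL")
```

Validating the type code against a single uppercase letter keeps arbitrary text out of the generated query string.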