glam/apps/archief-assistent
kempersc 7fbff2ff5f feat(archief-assistent): add entity extraction to semantic cache
Prevent geographic false positives in cache lookups. Queries like
"musea in Amsterdam" vs "musea in Noord-Holland" have ~93%
embedding similarity but completely different answers.

Changes:
- Add ExtractedEntities interface for structured cache keys
- Implement fast entity extraction (<5ms, no LLM) with regex patterns
- Extract institution types (GLAMORCUBESFIXPHDNT), locations, and intent
- Generate structured cache keys (e.g., "count:M:amsterdam")
- Raise similarity threshold from 0.85 to 0.97 to match backend DSPy
- Add 'structured' match method to CacheLookupResult
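
A minimal sketch of how regex-based extraction could produce a structured key such as "count:M:amsterdam". Only the name ExtractedEntities, the type code "M", and the key format come from the commit message; the field names, the pattern tables, and the helpers extractEntities and structuredKey are hypothetical, and the real extractor covers far more patterns.

```typescript
// Hypothetical shape: the commit names this interface but not its fields.
type Intent = "count" | "list" | "info";

interface ExtractedEntities {
  intent?: Intent;
  institutionType?: string; // one letter of GLAMORCUBESFIXPHDNT, e.g. "M"
  location?: string; // normalized lowercase place name
}

// Illustrative pattern tables (assumptions, not the commit's actual lists).
const INTENT_PATTERNS: Array<[RegExp, Intent]> = [
  [/\b(hoeveel|how many|aantal)\b/, "count"],
  [/\b(welke|which|list)\b/, "list"],
];

const TYPE_PATTERNS: Array<[RegExp, string]> = [
  [/\b(musea|museum|museums)\b/, "M"],
  [/\b(archief|archieven|archive|archives)\b/, "A"],
  [/\b(bibliotheek|bibliotheken|library|libraries)\b/, "L"],
];

const LOCATIONS: string[] = ["noord-holland", "amsterdam"];

// Pure string matching, no LLM call: lowercase once, then scan each table.
function extractEntities(query: string): ExtractedEntities {
  const q = query.toLowerCase();
  const entities: ExtractedEntities = {};
  for (const [pattern, intent] of INTENT_PATTERNS) {
    if (pattern.test(q)) {
      entities.intent = intent;
      break;
    }
  }
  for (const [pattern, code] of TYPE_PATTERNS) {
    if (pattern.test(q)) {
      entities.institutionType = code;
      break;
    }
  }
  entities.location = LOCATIONS.find((name) => q.includes(name));
  return entities;
}

// Builds "intent:type:location"; returns null when nothing useful was found.
function structuredKey(e: ExtractedEntities): string | null {
  if (!e.institutionType && !e.location) return null;
  return [e.intent ?? "info", e.institutionType ?? "*", e.location ?? "*"].join(":");
}
```

With these assumed tables, "Hoeveel musea in Amsterdam?" and "Hoeveel musea in Noord-Holland?" map to different keys, so they cannot share a cache entry through the structured path.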

The entity extractor recognizes:
- 19 institution types (Dutch + English patterns)
- 12 Dutch provinces with ISO 3166-2:NL codes
- Major Dutch cities with settlement codes
- Query intents (count, list, info)

This ensures geographic queries get different cache entries even when
embeddings are highly similar.
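
The lookup behaviour described above could be sketched as follows. The 0.97 threshold and the 'structured' match method come from the commit message; the CacheEntry shape, the other fields of CacheLookupResult, the rule that conflicting keys block a semantic match, and the lookup function itself are assumptions made for illustration.

```typescript
type MatchMethod = "structured" | "semantic" | "none";

// Hypothetical entry shape: a structured key (or null) plus the query embedding.
interface CacheEntry {
  key: string | null;
  embedding: number[];
  answer: string;
}

interface CacheLookupResult {
  found: boolean;
  method: MatchMethod;
  answer?: string;
}

const SIMILARITY_THRESHOLD = 0.97; // raised from 0.85 per the commit message

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function lookup(
  queryKey: string | null,
  queryEmbedding: number[],
  entries: CacheEntry[],
): CacheLookupResult {
  // 1. An exact structured-key match wins outright.
  if (queryKey !== null) {
    const hit = entries.find((e) => e.key === queryKey);
    if (hit) return { found: true, method: "structured", answer: hit.answer };
  }
  // 2. Semantic fallback, skipping entries whose key conflicts with the query's.
  let best: CacheEntry | undefined;
  let bestScore = SIMILARITY_THRESHOLD;
  for (const e of entries) {
    if (queryKey !== null && e.key !== null && e.key !== queryKey) continue;
    const score = cosine(queryEmbedding, e.embedding);
    if (score >= bestScore) {
      best = e;
      bestScore = score;
    }
  }
  return best
    ? { found: true, method: "semantic", answer: best.answer }
    : { found: false, method: "none" };
}
```

Under these assumptions, a "noord-holland" query whose embedding is 93% similar to a cached "amsterdam" entry is a miss twice over: the keys conflict, and 0.93 is below the 0.97 threshold.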
2026-01-10 10:33:21 +01:00
..
backend remove ambiguous web-claims 2025-12-21 00:01:54 +01:00
e2e Add Playwright test results JSON file with initial test suite and failure details 2026-01-09 21:33:31 +01:00
node_modules Add Playwright test results JSON file with initial test suite and failure details 2026-01-09 21:33:31 +01:00
playwright-report Add Playwright test results JSON file with initial test suite and failure details 2026-01-09 21:33:31 +01:00
public enrich HC profiles 2026-01-02 02:11:04 +01:00
src feat(archief-assistent): add entity extraction to semantic cache 2026-01-10 10:33:21 +01:00
test-results Add Playwright test results JSON file with initial test suite and failure details 2026-01-09 21:33:31 +01:00
tests Refactor code structure for improved readability and maintainability 2026-01-09 11:05:26 +01:00
deploy-backend.sh remove ambiguous web-claims 2025-12-21 00:01:54 +01:00
index.html remove ambiguous web-claims 2025-12-21 00:01:54 +01:00
package.json Add Playwright test results JSON file with initial test suite and failure details 2026-01-09 21:33:31 +01:00
playwright-results.json Add Playwright test results JSON file with initial test suite and failure details 2026-01-09 21:33:31 +01:00
playwright.config.ts test(archief-assistent): add Playwright E2E test suite 2026-01-09 21:09:56 +01:00
tsconfig.app.json remove ambiguous web-claims 2025-12-21 00:01:54 +01:00
tsconfig.json remove ambiguous web-claims 2025-12-21 00:01:54 +01:00
tsconfig.node.json remove ambiguous web-claims 2025-12-21 00:01:54 +01:00
vite.config.ts remove ambiguous web-claims 2025-12-21 00:01:54 +01:00