kempersc/glam - Forgejo

Author	SHA1	Message	Date
kempersc	28c3aaf33f	enrich profiles	2026-01-10 17:31:02 +01:00
kempersc	cce484c6b8	feat(archief-assistent): enhance semantic cache with ontology-driven vocabulary - Integrate tier-2 embeddings from types-vocab.json - Add segment-based caching for improved retrieval - Update tests and documentation	2026-01-10 15:38:11 +01:00
kempersc	ad74d8379e	feat(scripts): improve types-vocab extraction to derive all vocabulary from schema - Remove hardcoded type mappings, derive dynamically from LinkML - Extract keywords from annotations, structured_aliases, and comments - Add rename_plural_slot.py utility for schema slot renaming	2026-01-10 15:37:52 +01:00
kempersc	13938c92ca	chore(schemas): sync LinkML schemas to frontend apps Copies authoritative schemas from schemas/20251121/ to: - frontend/public/schemas/20251121/ - apps/archief-assistent/public/schemas/20251121/ This ensures slot definitions with corrected ontology property references (commit `2808dad6cd`) are available to frontend apps.	2026-01-10 15:02:25 +01:00
kempersc	e5a08a353d	enrich person profiles	2026-01-10 14:14:04 +01:00
kempersc	f2bc2d54cb	feat(archief-assistent): integrate ontology-driven vocabulary into semantic cache Implements Rule 46: Ontology-Driven Cache Segmentation Semantic Cache Enhancements: - Add institutionSubtype, recordSetType, wikidataEntity to ExtractedEntities - Add extractionMethod field to track vocabulary vs regex extraction - Implement async extractEntitiesWithVocabulary() using term log - Maintain sync regex fallback for cache key generation (<5ms) Build Pipeline: - Add prebuild hook to regenerate types-vocab.json from LinkML schemas - Extract vocabulary from Type.yaml and Types.yaml schema files - Generate GLAMORCUBESFIXPHDNT code mappings automatically New Script: - scripts/extract-types-vocab.ts - Extracts vocabulary from LinkML schemas - Supports --skip-embeddings flag for faster builds - Outputs to apps/archief-assistent/public/types-vocab.json This enables richer cache segmentation using ontology-derived subtypes (e.g., 'MUNICIPAL_ARCHIVE', 'ART_MUSEUM') instead of just top-level GLAMORCUBESFIXPHDNT codes.	2026-01-10 13:30:30 +01:00
kempersc	01b9d77566	feat(archief-assistent): add ontology-driven types vocabulary for cache segmentation Add LinkML-derived vocabulary for semantic cache entity extraction (Rule 46): - types-vocab.json: 10,142 lines of institution type vocabulary from LinkML - 19 GLAMORCUBESFIXPHDNT type codes with Dutch/English/German/French labels - Includes subtypes (kunstmuseum, rijksmuseum, streekarchief, etc.) - Extracted from CustodianType.yaml and CustodianTypes.yaml - types-vocabulary.ts: TypeScript module for entity extraction - Exports INSTITUTION_TYPES with regex patterns per type code - Replaces hardcoded patterns with schema-derived vocabulary - Supports multilingual matching - Rule 46 documentation (.opencode/rules/) - Specifies vocabulary extraction workflow - Defines cache key generation algorithm - Migration path from hardcoded patterns	2026-01-10 12:57:03 +01:00
kempersc	3c4f7acf87	test(archief-assistent): update E2E tests for entity extraction cache - Simplify cache spec assertions after structured matching implementation - Refactor map-panel spec for better test isolation and reliability - Remove redundant geographic false positive tests (handled by entity extraction)	2026-01-10 12:55:22 +01:00
kempersc	7fbff2ff5f	feat(archief-assistent): add entity extraction to semantic cache Prevent geographic false positives in cache lookups. Queries like "musea in Amsterdam" vs "musea in Noord-Holland" have ~93% embedding similarity but completely different answers. Changes: - Add ExtractedEntities interface for structured cache keys - Implement fast entity extraction (<5ms, no LLM) with regex patterns - Extract institution types (GLAMORCUBESFIXPHDNT), locations, and intent - Generate structured cache keys (e.g., "count:M:amsterdam") - Raise similarity threshold from 0.85 to 0.97 to match backend DSPy - Add 'structured' match method to CacheLookupResult The entity extractor recognizes: - 19 institution types (Dutch + English patterns) - 12 Dutch provinces with ISO 3166-2:NL codes - Major Dutch cities with settlement codes - Query intents (count, list, info) This ensures geographic queries get different cache entries even when embeddings are highly similar.	2026-01-10 10:33:21 +01:00
kempersc	519b0b47a8	Add Playwright test results JSON file with initial test suite and failure details	2026-01-09 21:33:31 +01:00
kempersc	004d342935	chore: minor updates and evaluation results - auth.setup.ts: require env vars for test credentials (no hardcoded defaults) - manifest.json: update schema manifest - full_evaluation_results.json: add RAG evaluation results - petra-links.json: update birth date from web claim	2026-01-09 21:10:55 +01:00
kempersc	ea35da02dc	test(archief-assistent): add Playwright E2E test suite - Add chat.spec.ts for RAG query testing - Add count-queries.spec.ts for aggregation validation - Add map-panel.spec.ts for geographic feature testing - Add cache.spec.ts for response caching verification - Add auth.setup.ts for authentication handling - Configure playwright.config.ts for multi-browser testing - Tests run against production archief.support	2026-01-09 21:09:56 +01:00
kempersc	97f85e0050	deps(archief-assistent): add playwright for E2E testing - Add @playwright/test as dev dependency - Alphabetize dependencies list	2026-01-09 21:06:12 +01:00
kempersc	9e67d0f967	enrich profiles	2026-01-09 20:35:19 +01:00
kempersc	c88fd3af70	Refactor code structure for improved readability and maintainability	2026-01-09 11:05:26 +01:00
kempersc	6608a207d4	update frontend	2026-01-08 15:56:28 +01:00
kempersc	98c42bf272	Fix LinkML URI conflicts and generate RDF outputs - Fix scope_note → finding_aid_scope_note in FindingAid.yaml - Remove duplicate wikidata_entity slot from CustodianType.yaml (import instead) - Remove duplicate rico_record_set_type from class_metadata_slots.yaml - Fix range types for equals_string compatibility (uriorcurie → string) - Move class names from close_mappings to see_also in 10 RecordSetTypes files - Generate all RDF formats: OWL, N-Triples, RDF/XML, N3, JSON-LD context - Sync schemas to frontend/public/schemas/ Files: 1,151 changed (includes prior CustodianType migration)	2026-01-07 12:32:59 +01:00
kempersc	2dca28d8c1	enrich CH entries with mission statements	2026-01-04 13:12:32 +01:00
kempersc	4f0cafe98a	enrich HC profiles	2026-01-02 02:11:04 +01:00
kempersc	d64f857aa9	add sparql validator and RAG injector	2025-12-30 03:43:31 +01:00
kempersc	84904e344b	Make AGENTS more succinct by referring to opencode rules & enrich custodians	2025-12-28 14:56:35 +01:00
kempersc	cdb633b0c9	enrich custodian entries with logo	2025-12-27 02:15:17 +01:00
kempersc	d5f2f542ce	feat(archief-assistent): preserve SPARQL queries in semantic cache - Add sparqlQuery field to CachedResponse interface - Extract SPARQL query before cache storage (not after) - Include sparqlQuery in cache HIT message objects - Handle both snake_case (server) and camelCase field names SPARQL queries are now displayed for both fresh API responses and cached responses, improving debugging and transparency.	2025-12-24 22:26:22 +01:00
kempersc	0c1d19e98b	enrich entries	2025-12-23 13:27:35 +01:00
kempersc	aca68ea47f	remove ambiguous web-claims	2025-12-21 00:01:54 +01:00

25 commits
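
Commit 7fbff2ff5f describes structured cache keys (e.g., "count:M:amsterdam") built by fast regex entity extraction, so that geographically different queries with near-identical embeddings land in separate cache entries. The sketch below illustrates that idea only; it is not the repository's code. The function names, the pattern lists, the "A" and "L" type codes, the fallback values, and the exact key layout beyond the one example in the commit message are all assumptions.

```typescript
// Minimal sketch of regex-based entity extraction for structured cache keys.
// All names and pattern lists here are hypothetical, not taken from the repo.

type Intent = "count" | "list" | "info";

interface ExtractedEntities {
  intent: Intent;
  institutionType: string | null; // single-letter type code, e.g. "M"
  location: string | null;
}

// Assumed patterns; the commit mentions Dutch and English patterns per type.
const TYPE_PATTERNS: Array<[RegExp, string]> = [
  [/\b(musea|museum|museums)\b/, "M"],
  [/\b(archief|archieven|archive|archives)\b/, "A"], // "A" is an assumed code
  [/\b(bibliotheek|bibliotheken|library|libraries)\b/, "L"], // "L" is an assumed code
];

// Assumed intent cues; anything unmatched falls back to "info".
const INTENT_PATTERNS: Array<[RegExp, Intent]> = [
  [/\b(hoeveel|aantal|how many)\b/, "count"],
  [/\b(welke|toon|list)\b/, "list"],
];

// Tiny illustrative gazetteer; a real one would cover provinces and cities.
const LOCATIONS: string[] = ["noord-holland", "amsterdam"];

function extractEntities(query: string): ExtractedEntities {
  const q = query.toLowerCase();
  const intentHit = INTENT_PATTERNS.find(([pattern]) => pattern.test(q));
  const typeHit = TYPE_PATTERNS.find(([pattern]) => pattern.test(q));
  const locationHit = LOCATIONS.find((name) => q.includes(name));
  return {
    intent: intentHit ? intentHit[1] : "info",
    institutionType: typeHit ? typeHit[1] : null,
    location: locationHit ?? null,
  };
}

// Joins the extracted parts as intent:type:location; "*" marks a missing part.
function structuredCacheKey(entities: ExtractedEntities): string {
  return [
    entities.intent,
    entities.institutionType ?? "*",
    entities.location ?? "*",
  ].join(":");
}

const keyCity = structuredCacheKey(extractEntities("Hoeveel musea in Amsterdam?"));
const keyProvince = structuredCacheKey(extractEntities("Hoeveel musea in Noord-Holland?"));
console.log(keyCity);     // count:M:amsterdam
console.log(keyProvince); // count:M:noord-holland
```

Because the two keys differ, a lookup keyed this way would not return the Amsterdam answer for the Noord-Holland query, regardless of how similar the embeddings are.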