kempersc
|
e5a08a353d
|
enrich person profiles
|
2026-01-10 14:14:04 +01:00 |
|
kempersc
|
f2bc2d54cb
|
feat(archief-assistent): integrate ontology-driven vocabulary into semantic cache
Implements Rule 46: Ontology-Driven Cache Segmentation
Semantic Cache Enhancements:
- Add institutionSubtype, recordSetType, wikidataEntity to ExtractedEntities
- Add extractionMethod field to track vocabulary vs regex extraction
- Implement async extractEntitiesWithVocabulary() using term log
- Maintain sync regex fallback for cache key generation (<5ms)
Build Pipeline:
- Add prebuild hook to regenerate types-vocab.json from LinkML schemas
- Extract vocabulary from *Type.yaml and *Types.yaml schema files
- Generate GLAMORCUBESFIXPHDNT code mappings automatically
New Script:
- scripts/extract-types-vocab.ts - Extracts vocabulary from LinkML schemas
  - Supports --skip-embeddings flag for faster builds
  - Outputs to apps/archief-assistent/public/types-vocab.json
This enables richer cache segmentation using ontology-derived subtypes
(e.g., 'MUNICIPAL_ARCHIVE', 'ART_MUSEUM') instead of just top-level
GLAMORCUBESFIXPHDNT codes.
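
A minimal sketch of how the sync regex fallback could feed a subtype-aware cache key. Only the field names institutionSubtype, recordSetType, wikidataEntity and extractionMethod come from this commit; institutionType, the patterns, the single-letter codes, the function names and the key layout are assumptions for illustration, not the actual implementation.

```typescript
// Field names follow the commit message; institutionType is an assumed field.
interface ExtractedEntities {
  institutionType?: string; // assumed: top-level type code such as "M" or "A"
  institutionSubtype?: string; // e.g. "ART_MUSEUM"
  recordSetType?: string;
  wikidataEntity?: string;
  extractionMethod: "vocabulary" | "regex";
}

// Hypothetical patterns; the real ones are generated from the LinkML schemas.
// Specific subtypes are listed before generic types so they win the match.
const FALLBACK_PATTERNS: Array<{ code: string; subtype?: string; pattern: RegExp }> = [
  { code: "M", subtype: "ART_MUSEUM", pattern: /\bkunstmuse(um|a)\b/i },
  { code: "A", subtype: "MUNICIPAL_ARCHIVE", pattern: /\b(gemeentearchief|stadsarchief)\b/i },
  { code: "M", pattern: /\bmuse(um|a)\b/i },
  { code: "A", pattern: /\barchie(f|ven)\b/i },
];

// Synchronous extraction: a plain scan over precompiled patterns, no I/O.
function extractEntitiesSync(query: string): ExtractedEntities {
  for (const { code, subtype, pattern } of FALLBACK_PATTERNS) {
    if (pattern.test(query)) {
      return { institutionType: code, institutionSubtype: subtype, extractionMethod: "regex" };
    }
  }
  return { extractionMethod: "regex" };
}

// Assumed key layout: "<type>:<subtype>", with "any" for missing parts.
function cacheSegmentKey(entities: ExtractedEntities): string {
  const type = entities.institutionType ?? "any";
  const subtype = entities.institutionSubtype ?? "any";
  return `${type}:${subtype}`;
}
```

Under these assumptions, a query mentioning "kunstmuseum" lands in a different cache segment than one that only mentions "museum", which is the segmentation effect described above.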
|
2026-01-10 13:30:30 +01:00 |
|
kempersc
|
01b9d77566
|
feat(archief-assistent): add ontology-driven types vocabulary for cache segmentation
Add LinkML-derived vocabulary for semantic cache entity extraction (Rule 46):
- types-vocab.json: 10,142 lines of institution type vocabulary from LinkML
  - 19 GLAMORCUBESFIXPHDNT type codes with Dutch/English/German/French labels
  - Includes subtypes (kunstmuseum, rijksmuseum, streekarchief, etc.)
  - Extracted from CustodianType.yaml and CustodianTypes.yaml
- types-vocabulary.ts: TypeScript module for entity extraction
  - Exports INSTITUTION_TYPES with regex patterns per type code
  - Replaces hardcoded patterns with schema-derived vocabulary
  - Supports multilingual matching
- Rule 46 documentation (.opencode/rules/)
  - Specifies vocabulary extraction workflow
  - Defines cache key generation algorithm
  - Migration path from hardcoded patterns
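
As an illustration of deriving per-code regex patterns from multilingual labels, the sketch below builds one pattern per type code. The TypeVocabEntry shape, the sample labels and the helper names are hypothetical; the actual layout of types-vocab.json and of INSTITUTION_TYPES is not shown in this log.

```typescript
// Hypothetical entry shape: one type code with labels keyed by language.
interface TypeVocabEntry {
  code: string;
  labels: { [lang: string]: string[] };
}

// Escape regex syntax characters so labels are matched literally.
function escapeRegExp(term: string): string {
  return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one case-insensitive pattern from all labels of an entry.
// Unicode-aware lookarounds replace \b so accented labels such as
// "musée" still match as whole words.
function buildPattern(entry: TypeVocabEntry): RegExp {
  const terms: string[] = [];
  for (const lang of Object.keys(entry.labels)) {
    for (const label of entry.labels[lang]) {
      terms.push(escapeRegExp(label));
    }
  }
  // Longest labels first, so longer alternatives are tried before shorter ones.
  terms.sort((a, b) => b.length - a.length);
  return new RegExp(`(?<!\\p{L})(?:${terms.join("|")})(?!\\p{L})`, "iu");
}

// Map of type code to pattern, mirroring the idea of INSTITUTION_TYPES.
function buildInstitutionTypes(vocab: TypeVocabEntry[]): { [code: string]: RegExp } {
  const result: { [code: string]: RegExp } = {};
  for (const entry of vocab) {
    result[entry.code] = buildPattern(entry);
  }
  return result;
}

// Sample vocabulary, invented for the example.
const sampleVocab: TypeVocabEntry[] = [
  { code: "M", labels: { nl: ["museum", "kunstmuseum"], fr: ["musée"], de: ["Museum"] } },
  { code: "A", labels: { nl: ["archief", "streekarchief"], en: ["archive"] } },
];

const institutionTypes = buildInstitutionTypes(sampleVocab);
```

The lookarounds are a design choice for this sketch: JavaScript's \b only treats ASCII word characters as word characters, so a label ending in an accented letter would otherwise fail to match before a space.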
|
2026-01-10 12:57:03 +01:00 |
|
kempersc
|
4f0cafe98a
|
enrich HC profiles
|
2026-01-02 02:11:04 +01:00 |
|
kempersc
|
d64f857aa9
|
add sparql validator and RAG injector
|
2025-12-30 03:43:31 +01:00 |
|
kempersc
|
aca68ea47f
|
remove ambiguous web-claims
|
2025-12-21 00:01:54 +01:00 |
|