kempersc/glam - Forgejo: Beyond coding. We Forge.

Commit graph

Author

SHA1

Message

Date

kempersc

ad74d8379e

feat(scripts): improve types-vocab extraction to derive all vocabulary from schema

- Remove hardcoded type mappings, derive dynamically from LinkML
- Extract keywords from annotations, structured_aliases, and comments
- Add rename_plural_slot.py utility for schema slot renaming

2026-01-10 15:37:52 +01:00

kempersc

e5a08a353d

enrich person profiles

2026-01-10 14:14:04 +01:00

kempersc

f2bc2d54cb

feat(archief-assistent): integrate ontology-driven vocabulary into semantic cache

Implements Rule 46: Ontology-Driven Cache Segmentation

Semantic Cache Enhancements:
- Add institutionSubtype, recordSetType, wikidataEntity to ExtractedEntities
- Add extractionMethod field to track vocabulary vs regex extraction
- Implement async extractEntitiesWithVocabulary() using term log
- Maintain sync regex fallback for cache key generation (<5ms)
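A minimal sketch of how the synchronous regex fallback and a segmented cache key might fit together. Only the field names `institutionSubtype`, `recordSetType`, `wikidataEntity`, and `extractionMethod` come from the commit message; the patterns, the `extractEntitiesSync` and `cacheSegmentKey` helpers, and the key layout are assumptions for illustration, not the repository's actual code.

```typescript
// Field names follow the commit message; the union values are assumed.
type ExtractionMethod = "vocabulary" | "regex";

interface ExtractedEntities {
  institutionSubtype?: string;
  recordSetType?: string;
  wikidataEntity?: string;
  extractionMethod: ExtractionMethod;
}

// Hypothetical patterns; the real vocabulary is generated from LinkML schemas.
const SUBTYPE_PATTERNS: Array<[RegExp, string]> = [
  [/\b(gemeentearchief|municipal archive)\b/i, "MUNICIPAL_ARCHIVE"],
  [/\b(kunstmuseum|art museum)\b/i, "ART_MUSEUM"],
];

// Synchronous fallback: plain regex matching, no I/O, so it stays cheap
// enough to run while a cache key is being built.
function extractEntitiesSync(query: string): ExtractedEntities {
  const entities: ExtractedEntities = { extractionMethod: "regex" };
  for (const [pattern, subtype] of SUBTYPE_PATTERNS) {
    if (pattern.test(query)) {
      entities.institutionSubtype = subtype;
      break;
    }
  }
  return entities;
}

// Hypothetical key layout: one slot per entity field, "*" when absent.
function cacheSegmentKey(e: ExtractedEntities): string {
  return [
    e.institutionSubtype ?? "*",
    e.recordSetType ?? "*",
    e.wikidataEntity ?? "*",
  ].join("|");
}
```

Under these assumptions, a query that mentions an art museum lands in a different cache segment from one that mentions a municipal archive, while a query with no recognised subtype falls into the wildcard segment.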

Build Pipeline:
- Add prebuild hook to regenerate types-vocab.json from LinkML schemas
- Extract vocabulary from *Type.yaml and *Types.yaml schema files
- Generate GLAMORCUBESFIXPHDNT code mappings automatically

New Script:
- scripts/extract-types-vocab.ts - Extracts vocabulary from LinkML schemas
- Supports --skip-embeddings flag for faster builds
- Outputs to apps/archief-assistent/public/types-vocab.json
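As a rough illustration of the file selection and flag handling described above, the sketch below filters schema file names matching `*Type.yaml` or `*Types.yaml` and reads the `--skip-embeddings` flag. The helper names `isTypeSchemaFile` and `parseArgs`, and the example file names in the comments, are hypothetical; the actual script may be organised differently.

```typescript
// Matches names ending in "Type.yaml" or "Types.yaml" (case-sensitive),
// mirroring the *Type.yaml / *Types.yaml globs from the commit message.
function isTypeSchemaFile(fileName: string): boolean {
  return /Types?\.yaml$/.test(fileName);
}

interface CliOptions {
  skipEmbeddings: boolean;
}

// Hypothetical argument parsing: only the --skip-embeddings flag is
// taken from the commit message.
function parseArgs(argv: string[]): CliOptions {
  return { skipEmbeddings: argv.includes("--skip-embeddings") };
}

// Example with made-up file names:
// ["MuseumType.yaml", "ArchiveTypes.yaml", "Custodian.yaml"].filter(isTypeSchemaFile)
// keeps the first two and drops "Custodian.yaml".
```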

This enables richer cache segmentation using ontology-derived subtypes
(e.g., 'MUNICIPAL_ARCHIVE', 'ART_MUSEUM') instead of just top-level
GLAMORCUBESFIXPHDNT codes.

2026-01-10 13:30:30 +01:00

3 commits