Implements a state machine to filter streaming tokens:
- Only stream tokens from the 'answer' field to the frontend
- Skip tokens from 'reasoning', 'citations', 'confidence', 'follow_up' fields
- Remove DSPy field markers like '[[ ## answer ## ]]' from streamed content
This fixes the issue where raw DSPy signature field markers were being
displayed in the chat interface instead of clean answer text.
- Add sparqlQuery field to CachedResponse interface
- Extract SPARQL query before cache storage (not after)
- Include sparqlQuery in cache HIT message objects
- Handle both snake_case (server) and camelCase field names
SPARQL queries are now displayed for both fresh API responses
and cached responses, improving debugging and transparency.
- CZ: 2,170 processed (26% of 8,432)
- JP: 2,075 processed (17% of 12,096)
- AR: Started processing
- Total checkpoint: 9,762 files across all countries
- Using crawl4ai favicon extraction
- Added logo_enrichment to 771 Czech custodian files
- 87% logo hit rate using crawl4ai favicon extraction
- Total checkpoint: 9,257 files across all countries
- CZ remaining: 6,642 files
- Remove onFaceClick prop from CustodianTypeIndicator3D in class/slot/enum detail views
to prevent accidental filtering when clicking decorative cubes (Bug 3)
- Add parseCustodianTypesAnnotation() helper to handle JSON-stringified arrays like '["A"]'
in YAML annotations, fixing Bug 2 where all 19 letters appeared on every cube
- Legend bar retains onTypeClick for intentional filtering functionality
- Use hc: <https://w3id.org/heritage/custodian/> prefix
- Use hc:institutionType with single-letter codes (M, L, A, etc.)
- Use Wikidata URIs for countries (Q55=NL, Q31=BE, etc.)
- Update all SPARQL examples to use correct ontology
- Align with actual RDF data in Oxigraph
- Use crm:E39_Actor instead of glam:HeritageCustodian
- Use hc:institutionType with single-letter codes (M, L, A, etc.)
- Use Wikidata URIs for countries (Q55=NL, Q31=BE, etc.)
- Use skos:prefLabel for institution names
- Update ONTOLOGY_CONTEXT with correct examples
- Update HeritageSPARQLGenerator docstring with correct prefixes
- Change main class from hc:Custodian to crm:E39_Actor
- Change type property from hcp:institutionType to org:classification
- Update type values from single letters to full names (MUSEUM, ARCHIVE, etc.)
- Add rate limit handling with exponential backoff for 429 errors
- Fix test_live_rag.py sample queries to use correct ontology
- Update optimized_models instructions with correct prefixes
- Update SPARQL CONSTRUCT query to use correct ontology namespaces:
- hc: https://w3id.org/heritage/custodian/ (was nde)
- Use nde:Custodian type (was crm:E39_Actor)
- Use schema:location + geo:lat/long (was crm predicates)
- Remove LIMIT 500 clause to fetch all results
- Show all node types by default instead of random single type
- Fixes issue where Knowledge Graph showed incomplete/random data
YAML arrays in LinkML annotations must be quoted strings to ensure
proper parsing. This change quotes all custodian_types annotations
from the raw array format to quoted string format.
Before: custodian_types: ["A", "G"]
After: custodian_types: '["A", "G"]'
Affected: 50+ class files in modules/classes/
Also updates: manifest.json, 01_custodian_name_modular.yaml
Social Media Post Classes:
- SocialMediaPost: Base class for platform-agnostic post modeling
- SocialMediaPostType: Abstract base for post type taxonomy
- SocialMediaPostTypes: Concrete post types (TextPost, ImagePost,
CarouselPost, StoryPost, ReelPost, ArticlePost, PollPost, EventPost)
Content Classes:
- SocialMediaContent: Rich content modeling with media attachments,
hashtags, mentions, links, and engagement metrics
Features:
- Platform-specific post type mappings (Instagram, LinkedIn, Twitter, etc.)
- Engagement analytics (likes, comments, shares, saves)
- Heritage institution content categorization
- Media attachment handling (images, videos, documents)
- Hashtag and mention extraction for heritage topic tracking
- Larger default size (700x550) for better readability
- Resizable from all 8 edges/corners with visual SE grip indicator
- Clearer button icons (18px, strokeWidth 2.5)
- Draggable, minimizable, pinnable panel
- Dark theme and mobile responsive support
- Fixed _vector_search() to check uses_named_vectors() before adding 'using' parameter
- Fixed _person_vector_search() to detect person collection vector size and use appropriate model
- Resolves 'Not existing vector name error: openai_1536' for single-vector collections
- Resolves embedding dimension mismatch between heritage_custodians (1536-dim) and heritage_persons (384-dim)
- Institution Browser: multi-select for types and countries
- URL query param sync for shareable filter URLs
- New utility: countryNames.ts with flag emoji support
- New utility: imageProxy.ts for image URL handling
- New component: SearchableMultiSelect dropdown
- Career timeline CSS and component updates
- Media gallery improvements
- Lazy load error boundary component
- Version check utility
- Add digital platform discovery data with provenance
- Cleanup duplicate/incorrect custodian entries
- Add GHCID collision resolution suffixes where needed
- Update person entity profiles with career history
- Add GLAMORCUBESFIXPHDNT heritage type detection for person profiles
- Two-stage classification: blocklist non-heritage orgs, then match keywords
- Special handling for Digital (D) type: requires heritage org context
- Add career_history heritage_relevant and heritage_type fields
- Add exponential backoff retry for Anthropic API overload errors
- Fix DSPy 3.x async context with dspy.context() wrapper
- Add refresh button to assistant messages for re-running queries with fresh results
- Highlight refresh button (amber) for cached responses to draw attention
- Add spinning icon animation while refreshing
- Fix cache clear to return detailed success/failure status for local vs shared cache
- Add bypass cache toggle that forces fresh queries (one-shot, resets after query)
- Add Dutch/English translations for new UI elements
- Add SyncPanel component with bilingual (NL/EN) support
- Add relative URL handling for production (bronhouder.nl)
- Integrate SyncPanel into Database page
- Show sync status for all 4 databases (DuckLake, PostgreSQL, Oxigraph, Qdrant)
- Support dry-run mode and file limit options