- Add jq to apt-get install for deployment verification step
- Remove orphaned submodule entries (exa-mcp-server-source, mcp-wikidata) from git index
- Rename 'Install rsync' step to 'Install system dependencies'
The setup-node action fails to cache pnpm dependencies because the
store path /workspace/kempersc/glam/.pnpm-store/v3 can't be resolved.
Caching is disabled for now to get the build working.
The frontend uses pnpm workspaces with the 'workspace:*' protocol, which
npm doesn't support. This updates the workflow to:
- Install pnpm using pnpm/action-setup
- Use pnpm for install, sync-schemas, generate-manifest, and build
- Cache pnpm dependencies using pnpm-lock.yaml
The repository has 314K+ files including backup data that exceeds
the CI runner's disk space. This change uses sparse checkout to fetch
only the frontend/ and schemas/ directories needed for the build.
The old enum was properly archived to modules/enums/archive/ with a .deprecated
suffix per Rule 9, but the manifest wasn't regenerated. The manifest now correctly
lists only AnnotationMotivationType.yaml and AnnotationMotivationTypes.yaml.
Infrastructure changes to enable automatic frontend deployment when schemas change:
- Add .forgejo/workflows/deploy-frontend.yml workflow triggered by:
  - Changes to frontend/** or schemas/20251121/linkml/**
  - Manual workflow dispatch
- Rewrite generate-schema-manifest.cjs to properly scan all schema directories
  - Recursively scans classes, enums, slots, modules directories
  - Uses singular category names (class, enum, slot) matching TypeScript types
  - Includes all 4 main schemas at root level
  - Skips archive directories and backup files
- Update schema-loader.ts to match new manifest format
  - Add SchemaCategory interface
  - Update SchemaManifest to use categories as array
  - Add flattenCategories() helper function
  - Add getSchemaCategories() and getSchemaCategoriesSync() functions
The workflow builds frontend with updated manifest and deploys to bronhouder.nl
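
The manifest scan described above can be sketched roughly as follows. This is an illustration in Python, not the actual generate-schema-manifest.cjs: the directory layout, the "module" category name, the backup suffixes, and the output shape are all assumptions.

```python
from pathlib import Path

# Assumption: plural directory names map to singular category names.
# "module" is a guess; only class, enum, and slot are named in the commit.
CATEGORY_BY_DIR = {
    "classes": "class",
    "enums": "enum",
    "slots": "slot",
    "modules": "module",
}

# Assumption: these suffixes mark backup or archived files.
BACKUP_SUFFIXES = (".bak", ".deprecated", "~")


def is_skipped(path: Path, root: Path) -> bool:
    """True for files under an 'archive' directory or with a backup suffix."""
    rel = path.relative_to(root)
    in_archive = "archive" in rel.parts[:-1]
    return in_archive or path.name.endswith(BACKUP_SUFFIXES)


def build_manifest(root: Path) -> dict:
    """Scan root and return main schemas plus categories as an array."""
    # Root-level YAML files are treated as the main schemas.
    schemas = sorted(
        p.name for p in root.glob("*.yaml") if not is_skipped(p, root)
    )
    categories = []
    for dirname, category in CATEGORY_BY_DIR.items():
        base = root / dirname
        if not base.is_dir():
            continue
        # rglob walks subdirectories, so nested schema files are included.
        files = sorted(
            p.relative_to(root).as_posix()
            for p in base.rglob("*.yaml")
            if not is_skipped(p, root)
        )
        categories.append({"name": category, "files": files})
    return {"schemas": schemas, "categories": categories}
```

Returning categories as a list rather than a dict keeps their order stable for the loader.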
- Update VideoAnnotation class with new motivation type references
- Add AnnotationMotivationType and AnnotationMotivationTypes class files
- Add motivation_type slots (description, id, name)
- Archive deprecated AnnotationMotivationEnum
- Update slot references for derived_from_entity, has_observation, has_person_observation
- Disable GitHub integration.
- Configure Git settings including path and autofetch options.
- Add Gitea instance URL and repository details.
- Enable YAML support for LinkML schemas with validation.
- Define file associations for YAML files.
- Recommend essential extensions for development and exclude unwanted ones.
Add companion_query support to fetch full entity records alongside
aggregate count queries. This enables displaying results on the map/list
when asking 'how many museums in Amsterdam?'
Backend changes:
- Add companion_query, companion_query_region, companion_query_country
  fields to TemplateDefinition and TemplateMatchResult
- Add render_template_string() for raw companion query rendering
Template changes:
- Add companion queries to count_institutions_by_type_and_location
  for settlement, region, and country level queries
- Return institution URI, name, coordinates, and city for visualization
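
A minimal sketch of how a companion query might be selected and rendered next to a count query. The class below is a hypothetical subset of TemplateDefinition. The `$slot` placeholder syntax, the fallback to the settlement-level query, and the function names `pick_companion_query` and `render_companion` are assumptions for illustration, not the project's render_template_string().

```python
from dataclasses import dataclass
from string import Template
from typing import Mapping, Optional


@dataclass
class TemplateDefinition:
    """Hypothetical subset: only the fields named in the commit message."""
    count_query: str
    companion_query: Optional[str] = None          # settlement level
    companion_query_region: Optional[str] = None
    companion_query_country: Optional[str] = None


def pick_companion_query(template: TemplateDefinition, level: str) -> Optional[str]:
    """Pick the query for a location level.

    Assumption: fall back to the settlement-level query when no
    level-specific one is defined.
    """
    by_level = {
        "region": template.companion_query_region,
        "country": template.companion_query_country,
    }
    return by_level.get(level) or template.companion_query


def render_companion(
    template: TemplateDefinition, level: str, slots: Mapping[str, str]
) -> Optional[str]:
    """Fill $placeholders in the raw companion query, or return None."""
    raw = pick_companion_query(template, level)
    if raw is None:
        return None
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(raw).safe_substitute(slots)
```

The caller would run the count query for the number and the rendered companion query for the records shown on the map or list.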
Update heritage professional profiles with:
- Separate role entries for different positions at same institution
- Employment date ranges (start_date, end_date)
- Updated observed_on timestamps
- Direct LinkedIn profile URLs as source
Profiles updated:
- Antoinet Nijssen (Noord-Hollands Archief)
- Anna Lakmaker
- Annelies Reus
- Marianne Hamersma
- Marcel Auwers
- Hans Felius
- Nico Vriend
Track full lineage of RAG responses: WHERE data comes from, WHEN it was
retrieved, HOW it was processed (SPARQL/vector/LLM).
Backend changes:
- Add provenance.py with EpistemicProvenance, DataTier, SourceAttribution
- Integrate provenance into MultiSourceRetriever.merge_results()
- Return epistemic_provenance in DSPyQueryResponse
Frontend changes:
- Pass EpistemicProvenance through useMultiDatabaseRAG hook
- Display provenance in ConversationPage (for cache transparency)
Schema fixes:
- Fix truncated example in has_observation.yaml slot definition
References:
- Pavlyshyn's Context Graphs and Data Traces paper
- LinkML ProvenanceBlock schema pattern
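
A rough sketch of the WHERE/WHEN/HOW lineage as data structures. The class names come from this commit, but every field, the tier names, and the `merge_provenance` helper are hypothetical and may not match provenance.py.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Iterable, List, Optional


class DataTier(Enum):
    # Hypothetical tiers; a higher value means a less authoritative source.
    AUTHORITATIVE = 1
    DERIVED = 2
    GENERATED = 3


@dataclass
class SourceAttribution:
    source: str             # WHERE the data came from, e.g. "sparql"
    retrieved_at: datetime  # WHEN it was retrieved
    method: str             # HOW it was processed
    tier: DataTier


@dataclass
class EpistemicProvenance:
    sources: List[SourceAttribution] = field(default_factory=list)

    @property
    def weakest_tier(self) -> Optional[DataTier]:
        """Least authoritative tier among the sources, or None if empty."""
        if not self.sources:
            return None
        return max((s.tier for s in self.sources), key=lambda t: t.value)


def merge_provenance(parts: Iterable[EpistemicProvenance]) -> EpistemicProvenance:
    """Combine per-retriever provenance, ordered by retrieval time."""
    merged = [s for part in parts for s in part.sources]
    merged.sort(key=lambda s: s.retrieved_at)
    return EpistemicProvenance(sources=merged)
```

A response object could carry the merged value so the UI can show which sources contributed to an answer.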
Document server disk architecture, PyTorch CPU-only setup, service
management, and recovery procedures learned from the disk space crisis.
- Document dual-disk architecture (/: root 75GB, /mnt/data: 49GB)
- PyTorch CPU-only installation via --index-url whl/cpu
- Custodian data symlink: /mnt/data/custodian → /var/lib/glam/api/data/
- Service restart procedures for Oxigraph, GLAM API, Qdrant, etc.
- Emergency recovery commands for disk space crises
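
As an illustration of the kind of check that could catch a disk space problem early, here is a small sketch using the standard library. The function name `disk_report` and the 90% warning threshold are assumptions, not part of the documented procedures.

```python
import shutil
from typing import Dict, Iterable


def disk_report(paths: Iterable[str], warn_fraction: float = 0.9) -> Dict[str, dict]:
    """Report free space per mount point and flag nearly full disks."""
    report = {}
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
        used_fraction = usage.used / usage.total if usage.total else 0.0
        report[path] = {
            "free_gb": round(usage.free / 1e9, 1),
            "used_fraction": round(used_fraction, 3),
            "warn": used_fraction >= warn_fraction,
        }
    return report
```

On the server described above, this might be called as `disk_report(["/", "/mnt/data"])` to cover both disks.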
- Remove hardcoded type mappings, derive dynamically from LinkML
- Extract keywords from annotations, structured_aliases, and comments
- Add rename_plural_slot.py utility for schema slot renaming
- Add GHCID references to custodian affiliations
- Add start dates for employment periods
- Expand heritage type classifications (A→[A,F])
- Add detailed rationales based on career history
- Add full_initials from archival publications
Copies authoritative schemas from schemas/20251121/ to:
- frontend/public/schemas/20251121/
- apps/archief-assistent/public/schemas/20251121/
This ensures slot definitions with corrected ontology property
references (commit 2808dad6cd) are available to frontend apps.
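
The copy step can be sketched as below. This is a minimal illustration assuming the three paths listed above sit under one repository root. `sync_schemas` is a hypothetical name, and the sketch does not remove stale files already present in the targets.

```python
import shutil
from pathlib import Path
from typing import List


def sync_schemas(repo_root: Path, version: str = "20251121") -> List[Path]:
    """Copy the authoritative schema tree into each frontend app."""
    source = repo_root / "schemas" / version
    targets = [
        repo_root / "frontend" / "public" / "schemas" / version,
        repo_root / "apps" / "archief-assistent" / "public" / "schemas" / version,
    ]
    for target in targets:
        # dirs_exist_ok (Python 3.8+) overwrites files in an existing tree.
        shutil.copytree(source, target, dirs_exist_ok=True)
    return targets
```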
Person Enrichment Scripts:
- enrich_person_comprehensive.py: Full-featured web search enrichment via Linkup
  with Rule 6/21/26/34/35 compliance (dual timestamps, no fabrication)
- enrich_ppids_linkup.py: Batch PPID enrichment pipeline
- extract_persons_with_provenance.py: Extract person data from LinkedIn HTML
  with XPath provenance tracking
LinkML Slot Management:
- update_slot_mappings.py: Update slots for RiC-O naming (Rule 39) and
  semantic URI requirements (Rule 38)
- update_class_slot_references.py: Update class files referencing renamed slots
- validate_slot_mappings.py: Validate slot definitions against ontology rules
All scripts follow established project conventions for provenance and
ontology alignment.