# Reply Type Mapping - Master Checklist
## Overview
This feature adds semantic reply type classification to RAG responses, enabling the frontend to dynamically select appropriate UI components (tables, maps, charts, cards) based on query intent and result structure.
## Problem Statement
Current RAG responses return raw data without guidance on how to display it:
- Count queries return `{"count": 116}`, but the frontend doesn't know to render it as a "factual answer"
- List queries return institution arrays with no hint whether to use a map or a table
- Statistical queries carry no chart type recommendation
## Solution
Add a `ReplyTypeClassifier` DSPy module that:
1. Analyzes query intent + result structure
2. Outputs a `ReplyType` enum value
3. Provides `ReplyContent` formatted for the selected UI component
---
## Phase 1: Core Infrastructure (Backend)
### 1.1 Define ReplyType Enum
- [ ] Create `backend/rag/reply_types.py`
- [ ] Define `ReplyType` enum with all supported types
- [ ] Define `ReplyContent` Pydantic models for each type
- [ ] Add type-specific formatting methods
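
A minimal sketch of what `reply_types.py` might contain. The checklist calls for Pydantic models; stdlib dataclasses are used here to keep the sketch dependency-free, and all member names and values are illustrative, not final:

```python
from dataclasses import dataclass
from enum import Enum


class ReplyType(str, Enum):
    """Semantic reply types the frontend maps to UI components (subset shown)."""
    FACTUAL_COUNT = "factual_count"
    FACTUAL_ANSWER = "factual_answer"
    TABLE_SIMPLE = "table_simple"
    TABLE_PAGINATED = "table_paginated"
    MAP_POINTS = "map_points"
    CARD_LIST = "card_list"


@dataclass
class FactualCountContent:
    """Hypothetical ReplyContent shape for a FACTUAL_COUNT reply."""
    count: int
    subject: str

    def to_sentence(self) -> str:
        # Example of a type-specific formatting method, per task 1.1.
        return f"There are {self.count} {self.subject}."
```

Inheriting from `str` keeps the enum JSON-serializable without a custom encoder, which matters once `reply_type` is embedded in the API response.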
### 1.2 Create ReplyTypeClassifier
- [ ] Create DSPy Signature for classification
- [ ] Implement rule-based fast-path (COUNT queries, LIST queries)
- [ ] Implement LLM-based classification for ambiguous cases
- [ ] Add confidence thresholds for fallback
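
The fast-path/LLM split with a confidence fallback could look like the sketch below. The threshold value, the string labels, and the `llm_classify` callable are all assumptions; the real implementation would call the DSPy module and return `ReplyType` members:

```python
from typing import Callable, Optional, Tuple

# Assumed cutoff; tune against labelled test queries (see Phase 4.2).
CONFIDENCE_THRESHOLD = 0.7


def classify_with_fallback(
    query: str,
    rule_result: Optional[str],
    llm_classify: Callable[[str], Tuple[str, float]],
) -> str:
    """Hybrid classification: rule-based fast path, LLM only for ambiguous cases."""
    if rule_result is not None:
        return rule_result  # fast path hit: no LLM call, no added latency
    label, confidence = llm_classify(query)
    if confidence >= CONFIDENCE_THRESHOLD:
        return label
    # Low-confidence fallback: a plain table is the safest default rendering.
    return "TABLE_SIMPLE"
```

Routing confident rule hits past the LLM is what keeps the <100ms latency target in reach for the common query shapes.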
### 1.3 Integrate into main.py
- [ ] Add classification step after SPARQL execution
- [ ] Format results using ReplyContent
- [ ] Include `reply_type` and `reply_content` in response
- [ ] Maintain backward compatibility with `raw_results`
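
The backward-compatible response shape could be assembled as below; the helper name and exact keys are a sketch, not the final API:

```python
def build_response(raw_results, reply_type, reply_content):
    """Assemble the API response; `raw_results` stays unchanged for old clients."""
    return {
        "reply_type": reply_type,        # e.g. "factual_count"
        "reply_content": reply_content,  # formatted for the selected UI component
        "raw_results": raw_results,      # untouched SPARQL output: old clients keep working
    }
```

Existing consumers that only read `raw_results` are unaffected; new consumers opt in to the two added keys.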
---
## Phase 2: Reply Content Templates
### 2.1 Text-Based Replies
- [ ] `FACTUAL_COUNT` - Single number with context sentence
- [ ] `FACTUAL_ANSWER` - Short factual response
- [ ] `GENERATED_TEXT` - Free-form LLM narrative
### 2.2 Tabular Replies
- [ ] `TABLE_SIMPLE` - Basic data table (50 rows or fewer)
- [ ] `TABLE_PAGINATED` - Large dataset (more than 50 rows)
- [ ] `TABLE_AGGREGATED` - Grouped/summarized data
### 2.3 Geographic Replies
- [ ] `MAP_POINTS` - Point markers (institutions, locations)
- [ ] `MAP_CLUSTER` - Clustered markers for dense data
- [ ] `MAP_CHOROPLETH` - Region-colored map (counts by province)
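
For the geographic replies, the content payload needs to carry enough structure for the map component to position itself. A dataclass sketch for `MAP_POINTS` (field names are assumptions):

```python
from dataclasses import dataclass


@dataclass
class MapPointsContent:
    """Illustrative payload for a MAP_POINTS reply."""
    points: list  # each point: {"label": str, "lat": float, "lon": float}

    def bounding_box(self):
        # Convenience for the frontend: fit the initial map view to all markers.
        lats = [p["lat"] for p in self.points]
        lons = [p["lon"] for p in self.points]
        return (min(lats), min(lons), max(lats), max(lons))
```

Precomputing the bounding box server-side saves the frontend one pass over the data before the first render.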
### 2.4 Chart Replies
- [ ] `CHART_BAR` - Bar chart for comparisons
- [ ] `CHART_PIE` - Pie chart for distributions
- [ ] `CHART_LINE` - Time series data
### 2.5 Hybrid Replies
- [ ] `CARD_LIST` - Institution cards (image, name, description)
- [ ] `COMPARISON` - Side-by-side institution comparison
---
## Phase 3: Frontend Integration
### 3.1 Component Selection Logic
- [ ] Create `ReplyRenderer` component
- [ ] Map `reply_type` to React components
- [ ] Implement lazy loading for heavy components
### 3.2 UI Components (existing)
- [ ] Material UI DataGrid for tables
- [ ] MapLibre GL JS for maps (see archief.support/map)
- [ ] D3.js for charts
- [ ] Custom cards for institution display
### 3.3 Testing
- [ ] Unit tests for each reply type
- [ ] E2E tests for rendering paths
- [ ] Visual regression tests
---
## Phase 4: Optimization
### 4.1 Performance
- [ ] Cache reply type classification
- [ ] Pre-compute UI hints during indexing
- [ ] Lazy load chart/map components
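
Since the rule-based fast path is a pure function of a few hashable inputs, caching can be as simple as `functools.lru_cache`; the rules inlined below are a stand-in for the Quick Win function further down:

```python
from functools import lru_cache


@lru_cache(maxsize=1024)
def cached_reply_type(intent: str, result_count: int, has_count: bool) -> str:
    """Memoised rule-based classification; all arguments must be hashable."""
    # Stand-in rules, mirroring classify_reply_type_fast below.
    if intent == "statistical" and has_count:
        return "FACTUAL_COUNT"
    if result_count > 10:
        return "TABLE_PAGINATED"
    return "TABLE_SIMPLE"
```

Caching the LLM-based path would instead need a key on the query text (or a normalised form of it), and an eviction policy sized to the query distribution.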
### 4.2 DSPy Optimization
- [ ] Create training examples for classifier
- [ ] Run GEPA optimization
- [ ] Measure classification accuracy
---
## Files to Create/Modify
### New Files
```
backend/rag/
├── reply_types.py # ReplyType enum, ReplyContent models
├── reply_classifier.py # DSPy module for classification
└── reply_templates/ # Jinja2 templates (optional)
docs/plan/reply-mapping/
├── 00-master-checklist.md # This file
├── 01-design-patterns.md # Architecture patterns
├── 02-sota-research.md # NL2VIS research summary
├── 03-tdd-specification.md # Test cases
├── 04-dependencies.md # Required packages
├── 05-reply-type-enum.md # Enum design
└── 06-reply-content-templates.md # Template designs
```
### Modified Files
```
backend/rag/main.py # Add classification + formatting
backend/rag/template_sparql.py # Add reply hints to templates
```
---
## Success Criteria
| Metric | Target |
|--------|--------|
| Classification accuracy | >90% on test queries |
| Response latency overhead | <100ms |
| Frontend render time | <500ms |
| User comprehension | Improved (A/B test) |
---
## Quick Win: Rule-Based Classification
Before implementing full LLM classification, use a rule-based fast path:
```python
def classify_reply_type_fast(intent: str, result_count: int, has_count: bool) -> ReplyType:
    """Rule-based reply type classification (no LLM needed)."""
    # Statistical queries with a COUNT result
    if intent == "statistical" and has_count:
        return ReplyType.FACTUAL_COUNT
    # Geographic queries with multiple results
    if intent == "geographic" and result_count > 1:
        return ReplyType.MAP_POINTS
    # Entity lookup with a single result
    if intent == "entity_lookup" and result_count == 1:
        return ReplyType.CARD_LIST
    # List queries with many results
    if result_count > 10:
        return ReplyType.TABLE_PAGINATED
    # Default
    return ReplyType.TABLE_SIMPLE
```
---
## Timeline
| Phase | Duration | Dependencies |
|-------|----------|--------------|
| Phase 1 | 2 days | None |
| Phase 2 | 3 days | Phase 1 |
| Phase 3 | 3 days | Phase 2 |
| Phase 4 | 2 days | Phase 3 |
**Total: ~10 days**
---
## References
- [NL2VIS Research](02-sota-research.md)
- [DSPy Compatibility](../prompt-query_template_mapping/dspy-compatibility.md)
- [RAG Integration](../prompt-query_template_mapping/rag-integration.md)
- [archief.support/map](https://archief.support/map) - Existing map implementation