# Reply Type Mapping - Master Checklist

## Overview

This feature adds semantic reply type classification to RAG responses, enabling the frontend to dynamically select appropriate UI components (tables, maps, charts, cards) based on query intent and result structure.

## Problem Statement

Current RAG responses return raw data without guidance on how to display it:
- Count queries return `{"count": 116}`, but the frontend has no signal to render it as a factual answer
- List queries return institution arrays with no hint whether to use a map or a table
- Statistical queries carry no chart type recommendations

## Solution

Add a `ReplyTypeClassifier` DSPy module that:
1. Analyzes query intent and result structure
2. Outputs a `ReplyType` enum value
3. Provides `ReplyContent` formatted for the selected UI component

---

## Phase 1: Core Infrastructure (Backend)

### 1.1 Define ReplyType Enum
- [ ] Create `backend/rag/reply_types.py`
- [ ] Define `ReplyType` enum with all supported types
- [ ] Define `ReplyContent` Pydantic models for each type
- [ ] Add type-specific formatting methods
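
A minimal sketch of what `reply_types.py` could contain, assuming the enum members are exactly the types listed under Phase 2. The lowercase string values, the `FactualCountContent` class, its field names, and `to_sentence` are illustrative assumptions, and a stdlib dataclass stands in for the planned Pydantic model to keep the example dependency-free.

```python
from dataclasses import dataclass
from enum import Enum


class ReplyType(str, Enum):
    """Reply types from Phase 2; the string values are assumed, not specified."""

    FACTUAL_COUNT = "factual_count"
    FACTUAL_ANSWER = "factual_answer"
    GENERATED_TEXT = "generated_text"
    TABLE_SIMPLE = "table_simple"
    TABLE_PAGINATED = "table_paginated"
    TABLE_AGGREGATED = "table_aggregated"
    MAP_POINTS = "map_points"
    MAP_CLUSTER = "map_cluster"
    MAP_CHOROPLETH = "map_choropleth"
    CHART_BAR = "chart_bar"
    CHART_PIE = "chart_pie"
    CHART_LINE = "chart_line"
    CARD_LIST = "card_list"
    COMPARISON = "comparison"


@dataclass
class FactualCountContent:
    """Hypothetical content model for FACTUAL_COUNT replies."""

    count: int
    entity_label: str  # e.g. "institutions"; the field name is an assumption

    def to_sentence(self) -> str:
        """Type-specific formatting: wrap the number in one context sentence."""
        return f"Found {self.count} {self.entity_label}."
```

Deriving the enum from `str` lets `json.dumps` serialize a member as a plain string, which may simplify the `reply_type` field in the response.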

### 1.2 Create ReplyTypeClassifier
- [ ] Create DSPy Signature for classification
- [ ] Implement rule-based fast-path (COUNT queries, LIST queries)
- [ ] Implement LLM-based classification for ambiguous cases
- [ ] Add confidence thresholds for fallback
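
The fast path and the confidence fallback could be combined roughly as follows. This is a sketch under assumptions: `CONFIDENCE_THRESHOLD`, `fast_path`, `classify`, and the `llm_classify` callable (a stand-in for the DSPy module, returning a label and a confidence) are hypothetical names, and the trimmed enum carries only the members the example needs.

```python
from enum import Enum
from typing import Callable, Optional, Tuple


class ReplyType(str, Enum):
    """Trimmed copy of the planned enum, just enough for this example."""

    FACTUAL_COUNT = "factual_count"
    TABLE_SIMPLE = "table_simple"
    MAP_POINTS = "map_points"


CONFIDENCE_THRESHOLD = 0.7  # assumed value; tune against test queries


def fast_path(intent: str, has_count: bool) -> Optional[ReplyType]:
    """Return a reply type when a rule matches, otherwise None."""
    if intent == "statistical" and has_count:
        return ReplyType.FACTUAL_COUNT
    return None


def classify(
    intent: str,
    has_count: bool,
    llm_classify: Callable[[], Tuple[ReplyType, float]],
) -> ReplyType:
    """Try the rules first, then the LLM, then fall back to a safe default."""
    rule_hit = fast_path(intent, has_count)
    if rule_hit is not None:
        return rule_hit  # the LLM is never called on the fast path
    label, confidence = llm_classify()
    if confidence < CONFIDENCE_THRESHOLD:
        return ReplyType.TABLE_SIMPLE  # low confidence: generic table
    return label
```

Passing the LLM step as a callable keeps the rule-based path free of any model call and makes the fallback easy to unit test with a stub.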

### 1.3 Integrate into main.py
- [ ] Add classification step after SPARQL execution
- [ ] Format results using ReplyContent
- [ ] Include `reply_type` and `reply_content` in response
- [ ] Maintain backward compatibility with `raw_results`
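
One possible shape for the response, assuming existing clients read `raw_results` at the top level. `build_response` and the `text` key inside `reply_content` are hypothetical; the point is only that the new fields sit alongside `raw_results` rather than replacing it.

```python
from typing import Any, Dict, List


def build_response(
    raw_results: List[Dict[str, Any]],
    reply_type: str,
    reply_content: Dict[str, Any],
) -> Dict[str, Any]:
    """Attach reply metadata without touching the existing raw_results field."""
    return {
        "raw_results": raw_results,      # unchanged, for existing clients
        "reply_type": reply_type,        # new: UI component hint
        "reply_content": reply_content,  # new: data shaped for that component
    }


response = build_response(
    raw_results=[{"count": 116}],
    reply_type="FACTUAL_COUNT",
    reply_content={"text": "Found 116 institutions."},
)
```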

---

## Phase 2: Reply Content Templates

### 2.1 Text-Based Replies
- [ ] `FACTUAL_COUNT` - Single number with context sentence
- [ ] `FACTUAL_ANSWER` - Short factual response
- [ ] `GENERATED_TEXT` - Free-form LLM narrative

### 2.2 Tabular Replies
- [ ] `TABLE_SIMPLE` - Basic data table (<50 rows)
- [ ] `TABLE_PAGINATED` - Large dataset (>50 rows)
- [ ] `TABLE_AGGREGATED` - Grouped/summarized data
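
The split between the two main table types could be driven by row count, as in the sketch below. The checklist leaves exactly 50 rows unassigned, so this example assumes 50 rows still counts as `TABLE_SIMPLE`; `table_reply`, `PAGE_ROW_LIMIT`, and the payload keys are hypothetical.

```python
from typing import Any, Dict, List

PAGE_ROW_LIMIT = 50  # boundary taken from the checklist; inclusive side is assumed


def table_reply(
    rows: List[Dict[str, Any]],
    page: int = 1,
    page_size: int = PAGE_ROW_LIMIT,
) -> Dict[str, Any]:
    """Return all rows for small results, or one page for large ones."""
    if len(rows) <= PAGE_ROW_LIMIT:
        return {"reply_type": "TABLE_SIMPLE", "rows": rows}
    start = (page - 1) * page_size  # pages are 1-indexed
    return {
        "reply_type": "TABLE_PAGINATED",
        "rows": rows[start:start + page_size],
        "page": page,
        "page_size": page_size,
        "total_rows": len(rows),
    }
```

Including `total_rows` in the paginated payload would let the frontend render page controls without a second request.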

### 2.3 Geographic Replies
- [ ] `MAP_POINTS` - Point markers (institutions, locations)
- [ ] `MAP_CLUSTER` - Clustered markers for dense data
- [ ] `MAP_CHOROPLETH` - Region-colored map (counts by province)
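
`MAP_POINTS` content could be emitted as a GeoJSON `FeatureCollection`, a format MapLibre GL JS can load as a source. A minimal sketch, assuming each result row carries hypothetical `name`, `lat`, and `lon` keys; rows without coordinates are skipped. Note that GeoJSON orders coordinates as longitude, then latitude.

```python
from typing import Any, Dict, List


def to_geojson_points(institutions: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Convert result rows into a GeoJSON FeatureCollection of Point features."""
    features = []
    for inst in institutions:
        lat, lon = inst.get("lat"), inst.get("lon")
        if lat is None or lon is None:
            continue  # no coordinates: cannot be placed on the map
        features.append(
            {
                "type": "Feature",
                "geometry": {"type": "Point", "coordinates": [lon, lat]},
                "properties": {"name": inst.get("name", "")},
            }
        )
    return {"type": "FeatureCollection", "features": features}
```

The same structure could feed `MAP_CLUSTER`, since clustering is typically a rendering option on the map source rather than a different data format.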

### 2.4 Chart Replies
- [ ] `CHART_BAR` - Bar chart for comparisons
- [ ] `CHART_PIE` - Pie chart for distributions
- [ ] `CHART_LINE` - Time series data

### 2.5 Hybrid Replies
- [ ] `CARD_LIST` - Institution cards (image, name, description)
- [ ] `COMPARISON` - Side-by-side institution comparison

---

## Phase 3: Frontend Integration

### 3.1 Component Selection Logic
- [ ] Create `ReplyRenderer` component
- [ ] Map `reply_type` to React components
- [ ] Implement lazy loading for heavy components

### 3.2 UI Components (existing)
- [ ] Material UI DataGrid for tables
- [ ] MapLibre GL JS for maps (see archief.support/map)
- [ ] D3.js for charts
- [ ] Custom cards for institution display

### 3.3 Testing
- [ ] Unit tests for each reply type
- [ ] E2E tests for rendering paths
- [ ] Visual regression tests

---

## Phase 4: Optimization

### 4.1 Performance
- [ ] Cache reply type classification
- [ ] Pre-compute UI hints during indexing
- [ ] Lazy load chart/map components
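
Caching could be as simple as memoizing the classifier on its inputs. The sketch below uses `functools.lru_cache`; `classify_cached`, `size_bucket`, the 50-row bucket boundary, and the stand-in rules inside the function are assumptions. Bucketing the result count keeps the number of distinct cache keys small.

```python
from functools import lru_cache


def size_bucket(result_count: int) -> str:
    """Collapse raw counts into coarse buckets so cache keys repeat often."""
    return "large" if result_count > 50 else "small"


@lru_cache(maxsize=1024)
def classify_cached(intent: str, has_count: bool, bucket: str) -> str:
    """Stand-in for the expensive classifier; all arguments must be hashable."""
    if intent == "statistical" and has_count:
        return "FACTUAL_COUNT"
    return "TABLE_PAGINATED" if bucket == "large" else "TABLE_SIMPLE"


# Two queries with different counts in the same bucket share one cache entry.
first = classify_cached("list", False, size_bucket(120))
second = classify_cached("list", False, size_bucket(300))
```

`classify_cached.cache_info()` reports hits and misses, which may help when checking the latency target.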

### 4.2 DSPy Optimization
- [ ] Create training examples for classifier
- [ ] Run GEPA optimization
- [ ] Measure classification accuracy
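
Accuracy can be measured as the fraction of labeled queries the classifier gets right. A minimal sketch, assuming test data is a list of (query, expected reply type) pairs; `classification_accuracy`, the sample queries, and the toy keyword classifier are all hypothetical and stand in for the real DSPy module and test set.

```python
from typing import Callable, Iterable, Tuple


def classification_accuracy(
    examples: Iterable[Tuple[str, str]],
    classify: Callable[[str], str],
) -> float:
    """Fraction of (query, expected_reply_type) pairs classified correctly."""
    pairs = list(examples)
    if not pairs:
        raise ValueError("need at least one labeled example")
    correct = sum(1 for query, expected in pairs if classify(query) == expected)
    return correct / len(pairs)


# Hypothetical labeled queries, for illustration only.
EXAMPLES = [
    ("How many archives are there?", "FACTUAL_COUNT"),
    ("List all museums", "TABLE_SIMPLE"),
    ("Show libraries on a map", "MAP_POINTS"),
    ("Compare visitor numbers by year", "CHART_LINE"),
]


def toy_classifier(query: str) -> str:
    """Deliberately naive stand-in for the real classifier."""
    return "FACTUAL_COUNT" if query.lower().startswith("how many") else "TABLE_SIMPLE"


accuracy = classification_accuracy(EXAMPLES, toy_classifier)
meets_target = accuracy > 0.90  # target from the Success Criteria table
```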

---

## Files to Create/Modify

### New Files
```
backend/rag/
├── reply_types.py          # ReplyType enum, ReplyContent models
├── reply_classifier.py     # DSPy module for classification
└── reply_templates/        # Jinja2 templates (optional)

docs/plan/reply-mapping/
├── 00-master-checklist.md         # This file
├── 01-design-patterns.md          # Architecture patterns
├── 02-sota-research.md            # NL2VIS research summary
├── 03-tdd-specification.md        # Test cases
├── 04-dependencies.md             # Required packages
├── 05-reply-type-enum.md          # Enum design
└── 06-reply-content-templates.md  # Template designs
```

### Modified Files
```
backend/rag/main.py              # Add classification + formatting
backend/rag/template_sparql.py   # Add reply hints to templates
```

---

## Success Criteria

| Metric | Target |
|--------|--------|
| Classification accuracy | >90% on test queries |
| Response latency overhead | <100ms |
| Frontend render time | <500ms |
| User comprehension | Improved (A/B test) |

---

## Quick Win: Rule-Based Classification

Before implementing full LLM classification, use a rule-based fast path:

```python
# Assumes ReplyType is importable from the planned backend/rag/reply_types.py.
from backend.rag.reply_types import ReplyType


def classify_reply_type_fast(intent: str, result_count: int, has_count: bool) -> ReplyType:
    """Rule-based reply type classification (no LLM needed)."""

    # Statistical queries with COUNT result
    if intent == "statistical" and has_count:
        return ReplyType.FACTUAL_COUNT

    # Geographic queries with multiple results
    if intent == "geographic" and result_count > 1:
        return ReplyType.MAP_POINTS

    # Entity lookup with single result
    if intent == "entity_lookup" and result_count == 1:
        return ReplyType.CARD_LIST

    # List queries with many results
    if result_count > 10:
        return ReplyType.TABLE_PAGINATED

    # Default
    return ReplyType.TABLE_SIMPLE
```

---

## Timeline

| Phase | Duration | Dependencies |
|-------|----------|--------------|
| Phase 1 | 2 days | None |
| Phase 2 | 3 days | Phase 1 |
| Phase 3 | 3 days | Phase 2 |
| Phase 4 | 2 days | Phase 3 |

**Total: ~10 days**

---

## References

- [NL2VIS Research](02-sota-research.md)
- [DSPy Compatibility](../prompt-query_template_mapping/dspy-compatibility.md)
- [RAG Integration](../prompt-query_template_mapping/rag-integration.md)
- [archief.support/map](https://archief.support/map) - Existing map implementation