# DSPy Compatibility

## Overview
This document describes how the template-based SPARQL system integrates with DSPy 2.6+. The key requirement is that template classification should be a DSPy module that can be optimized alongside existing modules using GEPA or other optimizers.
## DSPy Integration Points

### Current DSPy Architecture (dspy_heritage_rag.py)
The existing RAG system uses DSPy for:

- Query Intent Classification: `ClassifyQueryIntentSignature`
- Entity Extraction: `ExtractHeritageEntitiesSignature`
- SPARQL Generation: `GenerateSPARQLSignature` (LLM-based)
- Answer Generation: `GenerateHeritageAnswerSignature`
### Template Integration Strategy

We add template-based query generation as a pre-filter before LLM-based generation:
```text
User Question
      |
      v
+------------------------+
| TemplateClassifier     |  <-- NEW: DSPy Module
| (DSPy Signature)       |
+------------------------+
      |
      v
 [Template Match?]
      |
   Yes   No
    |     |
    v     v
+----------------+  +----------------+
| Template       |  | GenerateSPARQL |  <-- Existing
| Instantiation  |  | (LLM-based)    |
+----------------+  +----------------+
        |                   |
        +---------+---------+
                  |
                  v
            Valid SPARQL
```
## DSPy Signatures

### 1. TemplateClassifier Signature
```python
import dspy
from typing import Optional
from pydantic import BaseModel, Field


class TemplateMatch(BaseModel):
    """Output model for template classification."""

    template_id: str = Field(
        description="ID of the matched template, or 'none' if no match"
    )
    confidence: float = Field(
        description="Confidence score between 0.0 and 1.0",
        ge=0.0,
        le=1.0,
    )
    extracted_slots: dict[str, str] = Field(
        default_factory=dict,
        description="Extracted slot values from the question",
    )
    reasoning: str = Field(
        description="Brief explanation of why this template was selected"
    )


class ClassifyTemplateSignature(dspy.Signature):
    """Classify a heritage question and match it to a SPARQL template.

    Given a user question about Dutch heritage institutions, determine which
    predefined SPARQL template best matches the query intent. If no template
    matches, return template_id='none' to fall back to LLM generation.

    Available Templates:
    - region_institution_search: Find institutions of a type in a province
      Example: "Welke archieven zijn er in Drenthe?"
      Slots: institution_type (archieven/musea/bibliotheken), province (Dutch province name)
    - count_by_type: Count institutions by type
      Example: "Hoeveel musea zijn er in Nederland?"
      Slots: institution_type
    - count_by_type_region: Count institutions by type in a region
      Example: "Hoeveel archieven zijn er in Noord-Holland?"
      Slots: institution_type, province
    - entity_lookup: Look up a specific institution by name
      Example: "Wat is het Nationaal Archief?"
      Slots: institution_name
    - entity_lookup_by_ghcid: Look up by GHCID
      Example: "Details van NL-HaNA"
      Slots: ghcid
    - list_by_type: List all institutions of a type
      Example: "Toon alle bibliotheken"
      Slots: institution_type

    Dutch Institution Types:
    - archief/archieven -> type code "A"
    - museum/musea -> type code "M"
    - bibliotheek/bibliotheken -> type code "L"
    - galerie/galerijen -> type code "G"

    Dutch Provinces:
    - Drenthe, Flevoland, Friesland, Gelderland, Groningen,
      Limburg, Noord-Brabant, Noord-Holland, Overijssel,
      Utrecht, Zeeland, Zuid-Holland
    """

    question: str = dspy.InputField(
        desc="The user's question about heritage institutions"
    )
    language: str = dspy.InputField(
        desc="Language code: 'nl' for Dutch, 'en' for English",
        default="nl",
    )
    template_match: TemplateMatch = dspy.OutputField(
        desc="The matched template and extracted slots"
    )


class TemplateClassifier(dspy.Module):
    """DSPy module for template classification."""

    def __init__(self):
        super().__init__()
        self.classify = dspy.ChainOfThought(ClassifyTemplateSignature)

    def forward(self, question: str, language: str = "nl") -> TemplateMatch:
        """Classify question and return template match."""
        result = self.classify(question=question, language=language)
        return result.template_match
```
### 2. SlotExtractor Signature
```python
class ExtractedSlots(BaseModel):
    """Output model for slot extraction."""

    slots: dict[str, str] = Field(
        description="Mapping of slot names to extracted values"
    )
    normalized_slots: dict[str, str] = Field(
        description="Mapping of slot names to normalized values (codes)"
    )
    missing_slots: list[str] = Field(
        default_factory=list,
        description="List of required slots that could not be extracted",
    )


class ExtractSlotsSignature(dspy.Signature):
    """Extract slot values from a question for a specific template.

    Given a question and a template definition, extract the values for each
    slot defined in the template. Normalize values to their standard codes.

    Normalization Rules:
    - Province names -> ISO 3166-2 codes (e.g., "Drenthe" -> "NL-DR")
    - Institution types -> Single-letter codes (e.g., "archieven" -> "A")
    - GHCID -> Keep as-is if valid format (e.g., "NL-HaNA")

    Province Code Mappings:
    - Drenthe: NL-DR
    - Flevoland: NL-FL
    - Friesland: NL-FR
    - Gelderland: NL-GE
    - Groningen: NL-GR
    - Limburg: NL-LI
    - Noord-Brabant: NL-NB
    - Noord-Holland: NL-NH
    - Overijssel: NL-OV
    - Utrecht: NL-UT
    - Zeeland: NL-ZE
    - Zuid-Holland: NL-ZH

    Institution Type Codes:
    - A: Archive (archief, archieven)
    - M: Museum (museum, musea)
    - L: Library (bibliotheek, bibliotheken)
    - G: Gallery (galerie, galerijen)
    """

    question: str = dspy.InputField(
        desc="The user's question"
    )
    template_id: str = dspy.InputField(
        desc="ID of the matched template"
    )
    required_slots: list[str] = dspy.InputField(
        desc="List of required slot names for this template"
    )
    extracted_slots: ExtractedSlots = dspy.OutputField(
        desc="Extracted and normalized slot values"
    )


class SlotExtractor(dspy.Module):
    """DSPy module for slot extraction."""

    def __init__(self):
        super().__init__()
        self.extract = dspy.ChainOfThought(ExtractSlotsSignature)

    def forward(
        self,
        question: str,
        template_id: str,
        required_slots: list[str],
    ) -> ExtractedSlots:
        """Extract slots from question."""
        result = self.extract(
            question=question,
            template_id=template_id,
            required_slots=required_slots,
        )
        return result.extracted_slots
```
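The normalization rules in the signature docstring can also be enforced deterministically as a post-processing step, so a slip in the LLM's `normalized_slots` output is corrected before template instantiation. A minimal sketch; the mapping tables are copied from the docstring above, while the `normalize_slots` helper itself is ours:

```python
# Mappings copied from the ExtractSlotsSignature docstring.
PROVINCE_CODES = {
    "drenthe": "NL-DR", "flevoland": "NL-FL", "friesland": "NL-FR",
    "gelderland": "NL-GE", "groningen": "NL-GR", "limburg": "NL-LI",
    "noord-brabant": "NL-NB", "noord-holland": "NL-NH", "overijssel": "NL-OV",
    "utrecht": "NL-UT", "zeeland": "NL-ZE", "zuid-holland": "NL-ZH",
}
TYPE_CODES = {
    "archief": "A", "archieven": "A",
    "museum": "M", "musea": "M",
    "bibliotheek": "L", "bibliotheken": "L",
    "galerie": "G", "galerijen": "G",
}


def normalize_slots(slots: dict[str, str]) -> dict[str, str]:
    """Map raw slot values to their standard codes; unknown values pass through."""
    normalized = {}
    for name, value in slots.items():
        key = value.strip().lower()
        if name == "province":
            normalized[name] = PROVINCE_CODES.get(key, value)
        elif name == "institution_type":
            normalized[name] = TYPE_CODES.get(key, value)
        else:
            normalized[name] = value  # e.g. ghcid is kept as-is
    return normalized
```

Running this over the LLM's raw `slots` output makes `normalized_slots` reproducible regardless of model behavior.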
### 3. Combined TemplateSPARQL Module
```python
class TemplateSPARQL(dspy.Module):
    """Combined DSPy module for template-based SPARQL generation.

    This module orchestrates:
    1. Template classification
    2. Slot extraction (if template matches)
    3. Template instantiation (if slots valid)
    4. Fallback to LLM (if no match or invalid slots)
    """

    def __init__(
        self,
        template_registry: "TemplateRegistry",
        fallback_module: Optional[dspy.Module] = None,
    ):
        super().__init__()
        self.classifier = TemplateClassifier()
        self.slot_extractor = SlotExtractor()
        self.template_registry = template_registry
        self.fallback_module = fallback_module

    def forward(self, question: str, language: str = "nl") -> str:
        """Generate SPARQL query from question.

        Args:
            question: User's natural language question
            language: Language code ('nl' or 'en')

        Returns:
            Valid SPARQL query string
        """
        # Step 1: Classify template
        match = self.classifier(question=question, language=language)

        # Step 2: Check if template matched
        if match.template_id == "none" or match.confidence < 0.7:
            # Fall back to LLM generation
            if self.fallback_module:
                return self.fallback_module(question=question)
            raise ValueError("No template match and no fallback configured")

        # Step 3: Get template and extract slots
        template = self.template_registry.get(match.template_id)
        required_slots = list(template.slots.keys())
        slots = self.slot_extractor(
            question=question,
            template_id=match.template_id,
            required_slots=required_slots,
        )

        # Step 4: Check for missing required slots
        if slots.missing_slots:
            # Fall back if required slots are missing
            if self.fallback_module:
                return self.fallback_module(question=question)
            raise ValueError(f"Missing required slots: {slots.missing_slots}")

        # Step 5: Instantiate template
        return template.instantiate(slots.normalized_slots)
```
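The `TemplateRegistry` and `Template.instantiate` interfaces used above are defined elsewhere in the design; as a point of reference, here is a minimal sketch of what they might look like, including quote-escaping so slot values cannot break out of SPARQL string literals. All names, fields, and the example query text below are illustrative assumptions, not the actual implementation:

```python
from dataclasses import dataclass, field
from string import Template as StringTemplate


@dataclass
class Template:
    template_id: str
    sparql: str  # SPARQL text with $slot placeholders
    slots: dict[str, str] = field(default_factory=dict)  # slot name -> description

    def instantiate(self, values: dict[str, str]) -> str:
        """Fill slot placeholders; escape quotes so values stay inside literals.

        string.Template.substitute raises KeyError if a placeholder is missing,
        which surfaces incomplete slot extraction instead of emitting bad SPARQL.
        """
        safe = {k: v.replace('"', '\\"') for k, v in values.items()}
        return StringTemplate(self.sparql).substitute(safe)


class TemplateRegistry:
    """Lookup table of templates by ID."""

    def __init__(self, templates: list[Template]):
        self._by_id = {t.template_id: t for t in templates}

    def get(self, template_id: str) -> Template:
        return self._by_id[template_id]
```

A hypothetical usage, with made-up predicate names:

```python
registry = TemplateRegistry([
    Template(
        template_id="count_by_type_region",
        sparql='SELECT (COUNT(?org) AS ?count) WHERE { '
               '?org ex:typeCode "$institution_type" ; ex:regionCode "$province" . }',
        slots={"institution_type": "type code", "province": "province code"},
    )
])
query = registry.get("count_by_type_region").instantiate(
    {"institution_type": "A", "province": "NL-DR"}
)
```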
## GEPA Optimization

The template classification can be optimized using GEPA:
```python
from dspy.teleprompt import GEPA


def template_accuracy_metric(example, prediction, trace=None, pred_name=None, pred_trace=None):
    """Metric for template classification accuracy.

    GEPA passes additional trace arguments; they are accepted but unused here.
    """
    # Check if the correct template was selected
    if prediction.template_match.template_id != example.expected_template_id:
        return 0.0
    # Check slot extraction accuracy
    expected_slots = example.expected_slots
    predicted_slots = prediction.template_match.extracted_slots
    if not expected_slots:
        return 1.0  # No slots to check
    correct_slots = sum(
        1 for k, v in expected_slots.items()
        if predicted_slots.get(k) == v
    )
    return correct_slots / len(expected_slots)


def create_training_examples():
    """Create training examples for GEPA optimization."""
    return [
        dspy.Example(
            question="Welke archieven zijn er in Drenthe?",
            language="nl",
            expected_template_id="region_institution_search",
            expected_slots={
                "institution_type": "archieven",
                "province": "Drenthe",
            },
        ).with_inputs("question", "language"),
        dspy.Example(
            question="Welke musea zijn er in Noord-Holland?",
            language="nl",
            expected_template_id="region_institution_search",
            expected_slots={
                "institution_type": "musea",
                "province": "Noord-Holland",
            },
        ).with_inputs("question", "language"),
        dspy.Example(
            question="Hoeveel bibliotheken zijn er in Nederland?",
            language="nl",
            expected_template_id="count_by_type",
            expected_slots={
                "institution_type": "bibliotheken",
            },
        ).with_inputs("question", "language"),
        # Add more examples...
    ]


def optimize_template_classifier():
    """Run GEPA optimization on the template classifier."""
    # Create training data
    trainset = create_training_examples()
    # Initialize classifier
    classifier = TemplateClassifier()
    # Configure GEPA: exactly one budget option (auto, max_full_evals,
    # or max_metric_calls) may be set.
    optimizer = GEPA(
        metric=template_accuracy_metric,
        auto="light",  # Use light optimization
    )
    # Run optimization
    optimized_classifier = optimizer.compile(
        classifier,
        trainset=trainset,
    )
    # Save optimized module
    optimized_classifier.save("optimized_template_classifier.json")
    return optimized_classifier
```
## Integration with Existing HeritageRAG

### Modified HeritageRAG Class
```python
# In dspy_heritage_rag.py
import logging

logger = logging.getLogger(__name__)


class HeritageRAG(dspy.Module):
    """Heritage RAG with template-based SPARQL generation."""

    def __init__(
        self,
        template_registry: Optional["TemplateRegistry"] = None,
        use_templates: bool = True,
        template_confidence_threshold: float = 0.7,
        **kwargs,
    ):
        super().__init__()
        # Existing components
        self.query_intent = dspy.ChainOfThought(ClassifyQueryIntent)
        self.entity_extractor = dspy.ChainOfThought(ExtractHeritageEntities)
        self.sparql_generator = dspy.ChainOfThought(GenerateSPARQL)
        self.answer_generator = dspy.ChainOfThought(GenerateHeritageAnswer)
        # NEW: Template-based components
        self.use_templates = use_templates
        self.template_confidence_threshold = template_confidence_threshold
        if use_templates:
            self.template_classifier = TemplateClassifier()
            self.slot_extractor = SlotExtractor()
            self.template_registry = template_registry or TemplateRegistry.load_default()

    def generate_sparql(self, question: str, language: str = "nl") -> str:
        """Generate SPARQL query, trying templates first."""
        if self.use_templates:
            # Try template-based generation
            try:
                match = self.template_classifier(question=question, language=language)
                if (match.template_id != "none" and
                        match.confidence >= self.template_confidence_threshold):
                    template = self.template_registry.get(match.template_id)
                    required_slots = list(template.slots.keys())
                    slots = self.slot_extractor(
                        question=question,
                        template_id=match.template_id,
                        required_slots=required_slots,
                    )
                    if not slots.missing_slots:
                        # Successfully matched template
                        logger.info(
                            f"Using template '{match.template_id}' "
                            f"(confidence: {match.confidence:.2f})"
                        )
                        return template.instantiate(slots.normalized_slots)
            except Exception as e:
                logger.warning(f"Template generation failed: {e}, falling back to LLM")
        # Fall back to LLM-based generation
        logger.info("Using LLM-based SPARQL generation")
        return self.sparql_generator(question=question).sparql_query
```
## Signature Caching for OpenAI

To leverage OpenAI's prompt caching, which only activates for prompt prefixes of 1,024 tokens or more, we generate docstrings that are long and stable enough to be cached:
```python
def get_cacheable_template_classifier_docstring() -> str:
    """Generate a >1,024-token docstring for the template classifier."""
    # Load template definitions
    templates = TemplateRegistry.load_default()
    # Build comprehensive docstring
    parts = [
        "Classify a heritage question and match it to a SPARQL template.",
        "",
        "## Available Templates",
        "",
    ]
    for template_id, template in templates.items():
        parts.extend([
            f"### {template_id}",
            f"Description: {template.description}",
            "Example questions:",
        ])
        for pattern in template.question_patterns[:3]:
            parts.append(f"  - {pattern}")
        parts.extend([
            f"Required slots: {list(template.slots.keys())}",
            "",
        ])
    parts.extend([
        "## Slot Value Mappings",
        "",
        "### Province Codes (ISO 3166-2)",
    ])
    # Add all province mappings
    for province, code in get_subregion_mappings().items():
        parts.append(f"- {province}: {code}")
    parts.extend([
        "",
        "### Institution Type Codes",
    ])
    for type_name, code in get_institution_type_mappings().items():
        parts.append(f"- {type_name}: {code}")
    return "\n".join(parts)
```
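Since the cache only engages past the 1,024-token prefix threshold, a sanity check on the generated docstring is cheap insurance against silently losing the cache benefit. The sketch below uses a rough heuristic of about four characters per token for English prose; both helper names and the heuristic are our assumptions, not part of the source:

```python
def rough_token_count(text: str) -> int:
    """Approximate token count; ~4 characters per token for English prose."""
    return max(1, len(text) // 4)


def assert_cacheable(docstring: str, threshold: int = 1024) -> None:
    """Fail fast if the docstring is likely too short for OpenAI prompt caching."""
    estimate = rough_token_count(docstring)
    if estimate < threshold:
        raise ValueError(
            f"Docstring is ~{estimate} tokens; needs >= {threshold} to be cached"
        )
```

For an exact count, a real tokenizer (e.g. the model's own) would replace the heuristic, but the approximation is enough to catch a docstring that is far too short.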
## Testing DSPy Integration
```python
# tests/template_sparql/test_dspy_integration.py
import pytest
import dspy


class TestDSPyIntegration:
    """Test DSPy module integration."""

    @pytest.fixture
    def classifier(self):
        """Create a template classifier backed by a configured LM."""
        dspy.configure(lm=dspy.LM("gpt-4o-mini"))
        return TemplateClassifier()

    def test_classifier_forward(self, classifier):
        """Test classifier forward pass."""
        result = classifier(
            question="Welke archieven zijn er in Drenthe?",
            language="nl",
        )
        assert result.template_id == "region_institution_search"
        assert result.confidence > 0.5
        assert "institution_type" in result.extracted_slots
        assert "province" in result.extracted_slots

    def test_classifier_is_optimizable(self, classifier):
        """Test that classifier can be compiled with GEPA."""
        from dspy.teleprompt import GEPA

        trainset = [
            dspy.Example(
                question="Welke archieven zijn er in Drenthe?",
                language="nl",
                expected_template_id="region_institution_search",
            ).with_inputs("question", "language")
        ]
        optimizer = GEPA(
            # Accept GEPA's extra trace arguments via *args.
            metric=lambda e, p, *args: (
                1.0 if p.template_match.template_id == e.expected_template_id else 0.0
            ),
            max_metric_calls=5,
        )
        # Should not raise
        compiled = optimizer.compile(classifier, trainset=trainset)
        assert compiled is not None
```
## Performance Considerations

### Caching Strategy
```python
from functools import lru_cache


class CachedTemplateClassifier(TemplateClassifier):
    """Template classifier with caching."""

    # Note: lru_cache on a method keys on (self, question, language), so
    # entries are only shared across calls on the same instance.
    @lru_cache(maxsize=1000)
    def forward_cached(self, question: str, language: str = "nl") -> TemplateMatch:
        """Cached forward pass for repeated questions."""
        return super().forward(question, language)

    def forward(self, question: str, language: str = "nl") -> TemplateMatch:
        # Normalize question for better cache hits
        normalized = question.lower().strip()
        return self.forward_cached(normalized, language)
```
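Cache hit rates improve further if near-duplicate questions collapse to one key. A slightly more aggressive normalizer than `lower().strip()`, collapsing internal whitespace and stripping trailing punctuation; the `cache_key` helper is a hypothetical refinement, not part of the source:

```python
import re


def cache_key(question: str) -> str:
    """Normalize a question into a stable cache key."""
    key = question.lower().strip()
    key = re.sub(r"\s+", " ", key)  # collapse runs of whitespace
    key = key.rstrip("?!. ")        # trailing punctuation rarely changes intent
    return key
```

With this, "Welke  archieven zijn er in Drenthe ?" and "WELKE ARCHIEVEN ZIJN ER IN DRENTHE???" share one cache entry instead of two.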
### Batch Processing
```python
import asyncio


class BatchTemplateClassifier(dspy.Module):
    """Batch classification for efficiency."""

    def __init__(self):
        super().__init__()
        self.classifier = TemplateClassifier()

    async def forward_batch(
        self,
        questions: list[str],
        language: str = "nl",
    ) -> list[TemplateMatch]:
        """Classify multiple questions concurrently.

        The classifier call is synchronous, so each one runs in a worker
        thread; a bare async wrapper would still execute them sequentially.
        """
        async def classify_one(q: str) -> TemplateMatch:
            return await asyncio.to_thread(
                self.classifier, question=q, language=language
            )

        return await asyncio.gather(*[classify_one(q) for q in questions])
```
## References
- DSPy Documentation: https://dspy-docs.vercel.app/
- GEPA Paper: https://arxiv.org/abs/2507.19457
- DSPy Signatures: https://dspy-docs.vercel.app/docs/building-blocks/signatures
- DSPy Modules: https://dspy-docs.vercel.app/docs/building-blocks/modules