# Competency Questions and Ontology Coverage

## Overview

This document describes how the template-based SPARQL system serves a dual purpose:

1. **Query generation** - Translate user questions into valid SPARQL
2. **Ontology validation** - Identify gaps in the Heritage Custodian Ontology through unanswerable questions

The principle is simple: **if a relevant question cannot be mapped to a SPARQL template, the ontology likely lacks coverage for that domain**.

## Competency Questions (CQs)

### What are Competency Questions?

Competency Questions are natural language questions that an ontology should be able to answer. They serve as:

- **Requirements** during ontology design
- **Validation criteria** during ontology evaluation
- **Coverage metrics** for ongoing maintenance

> "Competency questions define what the ontology knows about. If you can't answer a competency question with a SPARQL query, your ontology is incomplete." — Grüninger & Fox (1995)

### CQs as Template Coverage Metrics

Each SPARQL template implicitly defines a set of Competency Questions:

```yaml
# Template definition implies these CQs are answerable:
region_institution_search:
  question_patterns:
    - "Welke {institution_type_nl} zijn er in {province}?"
    - "Which {institution_type_en} are in {province}?"

# Implied Competency Questions:
# CQ1: What archives exist in a given Dutch province?
# CQ2: What museums exist in a given Dutch province?
# CQ3: What libraries exist in a given Dutch province?
# etc.
```
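To make the "implied CQs" idea concrete, the pattern's slots can be expanded mechanically. This is an illustrative sketch: the slot values below are assumptions for the example, not part of the real template registry.

```python
"""Enumerate the competency questions implied by one template pattern.

Illustrative sketch: the slot values below are assumptions, not part of
the real template registry.
"""

from itertools import product

# English pattern from the region_institution_search template
pattern = "Which {institution_type_en} are in {province}?"

# Hypothetical slot fillers
slots = {
    "institution_type_en": ["archives", "museums", "libraries"],
    "province": ["Utrecht", "Gelderland"],
}

# Each combination of slot values is one concrete CQ the template answers
implied_cqs = [
    pattern.format(**dict(zip(slots, values)))
    for values in product(*slots.values())
]

for cq in implied_cqs:
    print(cq)
```

Counting the expanded questions gives a rough measure of how much CQ surface a single template covers.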
### Tracking Ontology Coverage

```python
"""Track which competency questions the ontology can answer."""

from dataclasses import dataclass
from enum import Enum


class CQStatus(Enum):
    """Status of a competency question."""

    ANSWERABLE = "answerable"        # Has a matching template
    PARTIAL = "partial"              # Template exists but coverage is limited
    UNANSWERABLE = "unanswerable"    # No template: ontology gap
    OUT_OF_SCOPE = "out_of_scope"    # Not relevant to the ontology (fyke)


@dataclass
class CompetencyQuestion:
    """A competency question for ontology validation."""

    id: str
    question_nl: str
    question_en: str
    category: str  # geographic, statistical, relational, etc.
    status: CQStatus
    template_id: str | None  # Template that answers this CQ
    ontology_classes: list[str]  # Classes needed to answer
    ontology_properties: list[str]  # Properties needed to answer
    notes: str | None = None


# Example CQ registry
COMPETENCY_QUESTIONS = [
    CompetencyQuestion(
        id="CQ-GEO-001",
        question_nl="Welke archieven zijn er in een bepaalde provincie?",
        question_en="What archives exist in a given province?",
        category="geographic",
        status=CQStatus.ANSWERABLE,
        template_id="region_institution_search",
        ontology_classes=["crm:E39_Actor"],
        ontology_properties=["hc:institutionType", "schema:addressLocality"],
    ),
    CompetencyQuestion(
        id="CQ-REL-001",
        question_nl="Welke instellingen hebben dezelfde directeur gehad?",
        question_en="Which institutions have shared the same director?",
        category="relational",
        status=CQStatus.UNANSWERABLE,  # GAP: no staff employment history
        template_id=None,
        ontology_classes=["crm:E39_Actor", "schema:Person"],
        ontology_properties=["schema:employee", "schema:worksFor"],  # MISSING
        notes="Ontology lacks employment history modeling",
    ),
    CompetencyQuestion(
        id="CQ-OUT-001",
        question_nl="Waar kan ik tandpasta met korting kopen?",
        question_en="Where can I buy toothpaste with a discount?",
        category="out_of_scope",
        status=CQStatus.OUT_OF_SCOPE,
        template_id=None,
        ontology_classes=[],
        ontology_properties=[],
        notes="Not related to heritage institutions - route to fyke",
    ),
]
```

## The Fyke: Catching Irrelevant Questions

### What is a Fyke?

A **fyke** (Dutch: *fuik*) is a type of fish trap - a net that catches fish swimming in a certain direction. In our system, the fyke catches questions that are **irrelevant to the Heritage Custodian Ontology**.

```
          User Question
               |
               v
     +------------------+
     | Relevance Filter |  <-- Is this about heritage institutions?
     +------------------+
          |         |
         Yes        No
          |         |
          v         v
       Route     +------+
        to       | FYKE |  <-- Catch irrelevant questions
      SPARQL     +------+
                    |
                    v
         Standard response:
         "Deze vraag kan niet beantwoord worden
          door de ArchiefAssistent. De service
          bevat informatie over erfgoedinstellingen
          zoals archieven, musea en bibliotheken."
```

### Fyke Implementation

```python
"""Fyke: filter for irrelevant questions."""

import dspy
from pydantic import BaseModel, Field


class RelevanceClassification(BaseModel):
    """Structured output for relevance classification."""

    is_relevant: bool = Field(
        description="Whether the question relates to heritage institutions"
    )
    confidence: float = Field(
        ge=0.0,
        le=1.0,
        description="Confidence in the classification",
    )
    reasoning: str = Field(
        description="Brief explanation of why the question is or isn't relevant"
    )
    detected_topics: list[str] = Field(
        description="Topics detected in the question"
    )


class HeritageRelevanceSignature(dspy.Signature):
    """Determine if a question is relevant to the Heritage Custodian Ontology.

    The Heritage Custodian Ontology covers:
    - Heritage institutions: museums, archives, libraries, galleries
    - Institution properties: location, founding date, type, collections
    - Staff and personnel at heritage institutions
    - Geographic distribution of heritage institutions
    - Relationships between institutions

    Questions about the following are OUT OF SCOPE and should be marked
    irrelevant:
    - Commercial products or shopping
    - Medical or health advice
    - Legal advice
    - Current news or politics (unless about heritage policy)
    - Personal relationships
    - Technical support for unrelated systems
    - General knowledge not related to heritage

    Be generous with relevance: if the question MIGHT relate to heritage
    institutions, mark it as relevant. Only flag clearly unrelated questions
    as irrelevant.
    """

    question: str = dspy.InputField(
        desc="User's question to classify"
    )
    language: str = dspy.InputField(
        desc="Language of the question (nl, en, de, fr)"
    )
    classification: RelevanceClassification = dspy.OutputField(
        desc="Structured relevance classification"
    )


class FykeFilter(dspy.Module):
    """Filter irrelevant questions before template matching.

    The fyke catches questions that cannot be answered by the Heritage
    Custodian Ontology, returning a polite standard response.
    """

    # Standard responses by language
    STANDARD_RESPONSES = {
        "nl": (
            "Deze vraag kan helaas niet beantwoord worden door de ArchiefAssistent. "
            "Deze service bevat informatie over erfgoedinstellingen in Nederland en "
            "daarbuiten, zoals archieven, musea, bibliotheken en galerieën. "
            "Stel gerust een vraag over deze instellingen!"
        ),
        "en": (
            "Unfortunately, this question cannot be answered by the ArchiefAssistent. "
            "This service contains information about heritage institutions in the "
            "Netherlands and beyond, such as archives, museums, libraries, and galleries. "
            "Feel free to ask a question about these institutions!"
        ),
        "de": (
            "Leider kann diese Frage vom ArchiefAssistent nicht beantwortet werden. "
            "Dieser Service enthält Informationen über Kulturerbe-Einrichtungen in den "
            "Niederlanden und darüber hinaus, wie Archive, Museen, Bibliotheken und Galerien. "
            "Stellen Sie gerne eine Frage zu diesen Einrichtungen!"
        ),
        "fr": (
            "Malheureusement, cette question ne peut pas être répondue par l'ArchiefAssistent. "
            "Ce service contient des informations sur les institutions patrimoniales aux "
            "Pays-Bas et au-delà, telles que les archives, les musées, les bibliothèques "
            "et les galeries. N'hésitez pas à poser une question sur ces institutions!"
        ),
    }

    # Confidence threshold for fyke activation
    IRRELEVANCE_THRESHOLD = 0.85

    def __init__(self, fast_lm: dspy.LM | None = None):
        """Initialize the fyke filter.

        Args:
            fast_lm: Optional fast LM for relevance classification.
                Recommended: gpt-4o-mini or similar for speed.
        """
        super().__init__()
        self.fast_lm = fast_lm
        self.classifier = dspy.TypedPredictor(HeritageRelevanceSignature)

    def forward(
        self,
        question: str,
        language: str = "nl",
    ) -> dspy.Prediction:
        """Classify question relevance and optionally catch in fyke.

        Returns:
            Prediction with:
            - is_relevant: bool
            - caught_by_fyke: bool
            - fyke_response: str | None (if caught)
            - reasoning: str
            - confidence: float
        """
        # Use the fast LM if configured
        if self.fast_lm:
            with dspy.settings.context(lm=self.fast_lm):
                result = self.classifier(question=question, language=language)
        else:
            result = self.classifier(question=question, language=language)

        classification = result.classification

        # Determine if caught by fyke
        caught = (
            not classification.is_relevant
            and classification.confidence >= self.IRRELEVANCE_THRESHOLD
        )

        # Get the appropriate response
        fyke_response = None
        if caught:
            fyke_response = self.STANDARD_RESPONSES.get(
                language, self.STANDARD_RESPONSES["en"]
            )

        return dspy.Prediction(
            is_relevant=classification.is_relevant,
            caught_by_fyke=caught,
            fyke_response=fyke_response,
            reasoning=classification.reasoning,
            confidence=classification.confidence,
            detected_topics=classification.detected_topics,
        )


# Example usage in the RAG pipeline
class HeritageRAGWithFyke(dspy.Module):
    """Heritage RAG with fyke pre-filter."""

    def __init__(self):
        super().__init__()
        self.fyke = FykeFilter()
        self.router = HeritageQueryRouter()
        self.template_classifier = TemplateClassifier()
        # ... other components

    async def answer(
        self,
        question: str,
        language: str = "nl",
    ) -> dspy.Prediction:
        """Answer a question with fyke pre-filtering."""
        # Step 1: Check relevance (fyke filter)
        relevance = self.fyke(question=question, language=language)

        if relevance.caught_by_fyke:
            # Question is irrelevant - return the standard response
            return dspy.Prediction(
                answer=relevance.fyke_response,
                caught_by_fyke=True,
                reasoning=relevance.reasoning,
                confidence=relevance.confidence,
            )

        # Step 2: Route the relevant question to templates
        routing = self.router(question=question, language=language)
        # ... continue with normal processing
```

### Fyke Examples

| Question | Language | Relevant? | Reasoning |
|----------|----------|-----------|-----------|
| "Welke archieven zijn er in Utrecht?" | nl | ✅ Yes | Asks about archives in a location |
| "Waar kan ik tandpasta met korting kopen?" | nl | ❌ No | Shopping query, not heritage |
| "What is the weather in Amsterdam?" | en | ❌ No | Weather query, not heritage |
| "Wie is de directeur van het Rijksmuseum?" | nl | ✅ Yes | Asks about museum staff |
| "How do I reset my password?" | en | ❌ No | Technical support, not heritage |
| "Welke musea hebben een Van Gogh collectie?" | nl | ✅ Yes | Asks about museum collections |

## Ontology Gap Detection

### Identifying Gaps Through Template Failure

When a relevant question cannot be mapped to any template, this signals a potential ontology gap:

```python
"""Detect ontology gaps through template matching failures."""

from dataclasses import dataclass
from datetime import datetime

import dspy


@dataclass
class OntologyGapReport:
    """Report of a potential ontology gap."""

    question: str
    timestamp: datetime
    detected_intent: str
    detected_entities: list[str]
    required_classes: list[str]  # Classes that would be needed
    required_properties: list[str]  # Properties that would be needed
    existing_coverage: str  # What the ontology currently covers
    gap_description: str
    priority: str  # high, medium, low


class OntologyGapDetector:
    """Detect gaps in the Heritage Custodian Ontology."""

    # Known ontology capabilities
    ONTOLOGY_COVERAGE = {
        "institution_identity": {
            "classes": ["crm:E39_Actor"],
            "properties": ["skos:prefLabel", "hc:institutionType", "schema:description"],
            "description": "Institution names, types, and descriptions",
        },
        "institution_location": {
            "classes": ["crm:E39_Actor"],
            "properties": ["schema:addressLocality", "schema:addressCountry", "schema:geo"],
            "description": "Institution geographic locations",
        },
        "institution_founding": {
            "classes": ["crm:E39_Actor"],
            "properties": ["schema:foundingDate", "schema:dissolutionDate"],
            "description": "Institution founding and closure dates",
        },
        "staff_current": {
            "classes": ["schema:Person"],
            "properties": ["schema:name", "schema:jobTitle"],
            "description": "Current staff members and roles",
        },
    }

    # Known gaps (to be expanded as gaps are discovered)
    KNOWN_GAPS = {
        "staff_history": {
            "description": "Historical employment records",
            "example_questions": [
                "Wie was de eerste directeur van het Rijksmuseum?",
                "Welke archivarissen hebben bij meerdere instellingen gewerkt?",
            ],
            "required_modeling": "Employment periods with start/end dates",
        },
        "collection_items": {
            "description": "Individual collection items",
            "example_questions": [
                "Welke musea hebben werken van Rembrandt?",
                "Waar kan ik de Nachtwacht zien?",
            ],
            "required_modeling": "Collection items linked to institutions",
        },
        "institutional_relationships": {
            "description": "Relationships between institutions",
            "example_questions": [
                "Welke instellingen zijn onderdeel van de Reinwardt Academie?",
                "Met welke musea werkt het Rijksmuseum samen?",
            ],
            "required_modeling": "Formal relationships (parent, partner, etc.)",
        },
    }

    def analyze_unmatched_question(
        self,
        question: str,
        routing: dspy.Prediction,
    ) -> OntologyGapReport | None:
        """Analyze why a relevant question couldn't be matched to a template.

        Args:
            question: The user's question
            routing: Routing prediction with intent, entities, etc.

        Returns:
            OntologyGapReport if a gap is detected, None otherwise
        """
        # Check if this matches a known gap pattern
        for gap_id, gap_info in self.KNOWN_GAPS.items():
            for example in gap_info["example_questions"]:
                if self._is_similar_question(question, example):
                    return OntologyGapReport(
                        question=question,
                        timestamp=datetime.now(),
                        detected_intent=routing.intent,
                        detected_entities=routing.entities,
                        required_classes=self._infer_required_classes(gap_id),
                        required_properties=self._infer_required_properties(gap_id),
                        existing_coverage=self._describe_current_coverage(routing.intent),
                        gap_description=gap_info["description"],
                        priority=self._assess_priority(gap_id),
                    )

        # Unknown gap - log for review
        return OntologyGapReport(
            question=question,
            timestamp=datetime.now(),
            detected_intent=routing.intent,
            detected_entities=routing.entities,
            required_classes=["UNKNOWN"],
            required_properties=["UNKNOWN"],
            existing_coverage=self._describe_current_coverage(routing.intent),
            gap_description="Unknown gap - requires manual review",
            priority="medium",
        )

    def _is_similar_question(self, q1: str, q2: str) -> bool:
        """Check if two questions are semantically similar."""
        # Simple implementation - could use embeddings instead
        q1_terms = set(q1.lower().split())
        q2_terms = set(q2.lower().split())

        # Term overlap relative to the larger question
        overlap = len(q1_terms & q2_terms) / max(len(q1_terms), len(q2_terms))
        return overlap > 0.5

    # Helper methods (_infer_required_classes, _infer_required_properties,
    # _describe_current_coverage, _assess_priority) are omitted here for brevity.
```

### Gap Reporting Dashboard

```python
"""Generate ontology coverage reports."""

from collections import Counter


def generate_coverage_report(
    competency_questions: list[CompetencyQuestion],
    gap_reports: list[OntologyGapReport],
) -> dict:
    """Generate an ontology coverage report.

    Returns:
        Dict with coverage statistics and gap analysis
    """
    # Count CQ statuses
    status_counts = Counter(cq.status for cq in competency_questions)

    # Calculate the coverage percentage
    total_cqs = len(competency_questions)
    answerable = status_counts.get(CQStatus.ANSWERABLE, 0)
    partial = status_counts.get(CQStatus.PARTIAL, 0)
    coverage_pct = (answerable + 0.5 * partial) / total_cqs * 100 if total_cqs > 0 else 0

    # Group gaps by category
    gaps_by_category = {}
    for gap in gap_reports:
        category = gap.detected_intent
        if category not in gaps_by_category:
            gaps_by_category[category] = []
        gaps_by_category[category].append(gap)

    # Identify the most common gaps
    gap_descriptions = Counter(g.gap_description for g in gap_reports)

    return {
        "summary": {
            "total_competency_questions": total_cqs,
            "answerable": answerable,
            "partial": partial,
            "unanswerable": status_counts.get(CQStatus.UNANSWERABLE, 0),
            "out_of_scope": status_counts.get(CQStatus.OUT_OF_SCOPE, 0),
            "coverage_percentage": round(coverage_pct, 1),
        },
        "gaps_by_category": {
            cat: len(gaps) for cat, gaps in gaps_by_category.items()
        },
        "top_gaps": gap_descriptions.most_common(10),
        "recommendations": _generate_recommendations(gap_reports),
    }


def _generate_recommendations(gaps: list[OntologyGapReport]) -> list[str]:
    """Generate recommendations for ontology improvements."""
    recommendations = []

    # Analyze gap patterns
    gap_types = Counter(g.gap_description for g in gaps)
    for gap_type, count in gap_types.most_common(5):
        if count >= 3:
            recommendations.append(
                f"High priority: Add modeling for '{gap_type}' "
                f"({count} unanswered questions)"
            )

    return recommendations
```

## Competency Question Registry

### YAML Format for CQ Tracking

```yaml
# data/competency_questions.yaml
version: "1.0.0"

categories:
  geographic:
    description: "Questions about institution locations"
    coverage: high
  statistical:
    description: "Questions about counts and distributions"
    coverage: high
  relational:
    description: "Questions about relationships between entities"
    coverage: low
  temporal:
    description: "Questions about historical changes"
    coverage: medium
  biographical:
    description: "Questions about people in the heritage sector"
    coverage: medium

competency_questions:
  # ANSWERABLE - Geographic
  - id: CQ-GEO-001
    question_nl: "Welke archieven zijn er in een bepaalde provincie?"
    question_en: "What archives exist in a given province?"
    category: geographic
    status: answerable
    template_id: region_institution_search
    ontology_classes: [crm:E39_Actor]
    ontology_properties: [hc:institutionType, schema:addressLocality]

  - id: CQ-GEO-002
    question_nl: "Welke musea zijn er in een bepaalde stad?"
    question_en: "What museums exist in a given city?"
    category: geographic
    status: answerable
    template_id: city_institution_search
    ontology_classes: [crm:E39_Actor]
    ontology_properties: [hc:institutionType, schema:addressLocality]

  # ANSWERABLE - Statistical
  - id: CQ-STAT-001
    question_nl: "Hoeveel musea zijn er in Nederland?"
    question_en: "How many museums are there in the Netherlands?"
    category: statistical
    status: answerable
    template_id: count_institutions_by_type
    ontology_classes: [crm:E39_Actor]
    ontology_properties: [hc:institutionType, schema:addressCountry]

  # PARTIAL - Temporal
  - id: CQ-TEMP-001
    question_nl: "Wat is het oudste archief in Nederland?"
    question_en: "What is the oldest archive in the Netherlands?"
    category: temporal
    status: partial
    template_id: find_oldest_institution
    ontology_classes: [crm:E39_Actor]
    ontology_properties: [schema:foundingDate]
    notes: "Founding dates incomplete for many institutions"

  # UNANSWERABLE - Relational (GAP)
  - id: CQ-REL-001
    question_nl: "Welke instellingen hebben dezelfde directeur gehad?"
    question_en: "Which institutions have shared the same director?"
    category: relational
    status: unanswerable
    template_id: null
    ontology_classes: [crm:E39_Actor, schema:Person]
    ontology_properties: [schema:employee]  # MISSING: employment history
    gap_description: "No modeling for historical employment relationships"
    recommended_fix: "Add employment periods with start/end dates"

  - id: CQ-REL-002
    question_nl: "Welke musea zijn onderdeel van een groter netwerk?"
    question_en: "Which museums are part of a larger network?"
    category: relational
    status: unanswerable
    template_id: null
    ontology_classes: [crm:E39_Actor, org:Organization]
    ontology_properties: [org:subOrganizationOf]  # MISSING
    gap_description: "No modeling for organizational hierarchies"

  # OUT OF SCOPE - Fyke
  - id: CQ-OUT-001
    question_nl: "Waar kan ik tandpasta met korting kopen?"
    question_en: "Where can I buy toothpaste with a discount?"
    category: out_of_scope
    status: out_of_scope
    template_id: null
    ontology_classes: []
    ontology_properties: []
    notes: "Shopping query - route to fyke"

  - id: CQ-OUT-002
    question_nl: "Wat is het weer morgen in Amsterdam?"
    question_en: "What is the weather tomorrow in Amsterdam?"
    category: out_of_scope
    status: out_of_scope
    template_id: null
    ontology_classes: []
    ontology_properties: []
    notes: "Weather query - route to fyke"
```

## Integration with Template System

### Template-CQ Bidirectional Linking

```python
"""Link templates to competency questions bidirectionally."""


def validate_template_cq_coverage(
    templates: dict[str, SPARQLTemplate],
    competency_questions: list[CompetencyQuestion],
) -> dict:
    """Validate that templates cover expected CQs and vice versa.

    Returns:
        Validation report with coverage analysis
    """
    # Templates without CQ coverage
    templates_without_cq = []
    for template_id, template in templates.items():
        matching_cqs = [
            cq for cq in competency_questions
            if cq.template_id == template_id
        ]
        if not matching_cqs:
            templates_without_cq.append(template_id)

    # CQs without template coverage
    cqs_without_template = [
        cq for cq in competency_questions
        if cq.status == CQStatus.ANSWERABLE and cq.template_id is None
    ]

    # CQs that reference a template that doesn't exist
    orphaned_cqs = [
        cq for cq in competency_questions
        if cq.template_id and cq.template_id not in templates
    ]

    return {
        "templates_without_cq": templates_without_cq,
        "cqs_without_template": [cq.id for cq in cqs_without_template],
        "orphaned_cqs": [cq.id for cq in orphaned_cqs],
        "coverage_complete": (
            len(templates_without_cq) == 0
            and len(cqs_without_template) == 0
            and len(orphaned_cqs) == 0
        ),
    }
```

## Summary

The template-based SPARQL system provides critical insights into ontology coverage:

| Aspect | Implementation | Benefit |
|--------|----------------|---------|
| **Competency Questions** | YAML registry with status tracking | Defines what the ontology should answer |
| **Fyke Filter** | DSPy relevance classifier | Catches irrelevant questions early |
| **Gap Detection** | Analysis of unmatched questions | Identifies needed ontology improvements |
| **Coverage Reports** | Automated metrics generation | Tracks ontology completeness over time |
| **Bidirectional Linking** | Templates ↔ CQs validation | Ensures consistency |

### Key Metrics

- **Coverage %** = (Answerable CQs + 0.5 × Partial CQs) / Total Relevant CQs
- **Fyke Rate** = Out-of-scope questions / Total questions
- **Gap Rate** = Unanswerable relevant questions / Total relevant questions

Target: **>90% coverage** of relevant competency questions.
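The key metrics can be computed directly from CQ status counts. A minimal self-contained sketch follows; the status labels mirror the registry, but the counts themselves are illustrative, not real registry data.

```python
"""Compute the key coverage metrics from CQ status counts.

The status labels mirror the registry; the counts are illustrative.
"""

from collections import Counter

# Illustrative registry: 12 answerable, 4 partial, 2 unanswerable, 2 out of scope
statuses = (
    ["answerable"] * 12
    + ["partial"] * 4
    + ["unanswerable"] * 2
    + ["out_of_scope"] * 2
)
counts = Counter(statuses)

total = len(statuses)
relevant = total - counts["out_of_scope"]  # out-of-scope CQs don't count as relevant

# Coverage % = (Answerable + 0.5 * Partial) / Total Relevant CQs
coverage_pct = (counts["answerable"] + 0.5 * counts["partial"]) / relevant * 100
# Fyke Rate = Out-of-scope questions / Total questions
fyke_rate = counts["out_of_scope"] / total
# Gap Rate = Unanswerable relevant questions / Total relevant questions
gap_rate = counts["unanswerable"] / relevant

print(f"Coverage: {coverage_pct:.1f}% (target: >90%)")
print(f"Fyke rate: {fyke_rate:.0%}, gap rate: {gap_rate:.0%}")
```

Note that the denominators differ: the fyke rate is taken over all questions, while coverage and gap rate exclude out-of-scope CQs.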
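The bidirectional linking check can also be exercised standalone. This sketch reduces the three consistency checks to plain dicts so it runs without the registry classes; all template ids and CQ ids below are illustrative.

```python
"""Exercise the bidirectional template-CQ consistency checks standalone.

Reduced to plain dicts; all template ids and CQ ids here are illustrative.
"""

templates = {
    "region_institution_search": {},
    "city_institution_search": {},
}
cqs = [
    {"id": "CQ-GEO-001", "status": "answerable", "template_id": "region_institution_search"},
    {"id": "CQ-STAT-001", "status": "answerable", "template_id": None},
    {"id": "CQ-TEMP-001", "status": "partial", "template_id": "find_oldest_institution"},
    {"id": "CQ-REL-001", "status": "unanswerable", "template_id": None},
]

# Templates that no CQ points at
templates_without_cq = [
    t for t in templates
    if not any(cq["template_id"] == t for cq in cqs)
]
# Answerable CQs with no template
cqs_without_template = [
    cq["id"] for cq in cqs
    if cq["status"] == "answerable" and cq["template_id"] is None
]
# CQs referencing a template that doesn't exist
orphaned_cqs = [
    cq["id"] for cq in cqs
    if cq["template_id"] and cq["template_id"] not in templates
]

print(templates_without_cq, cqs_without_template, orphaned_cqs)
```

Each non-empty list is an actionable inconsistency: an untested template, a mislabeled CQ, or a stale template reference.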