
# Competency Questions and Ontology Coverage

## Overview

This document describes how the template-based SPARQL system serves a dual purpose:

1. **Query generation** - Translate user questions to valid SPARQL
2. **Ontology validation** - Identify gaps in the Heritage Custodian Ontology through unanswerable questions

The principle is simple: **if a relevant question cannot be mapped to a SPARQL template, the ontology likely lacks coverage for that domain**.

## Competency Questions (CQs)

### What are Competency Questions?

Competency Questions are natural language questions that an ontology should be able to answer. They serve as:

- **Requirements** during ontology design
- **Validation criteria** during ontology evaluation
- **Coverage metrics** for ongoing maintenance

> Competency questions define what the ontology is expected to know. If a competency question cannot be answered by querying the ontology, the ontology is incomplete for that question. (Paraphrasing the competency-question methodology of Grüninger & Fox, 1995.)

### CQs as Template Coverage Metrics

Each SPARQL template implicitly defines a set of Competency Questions:

```yaml
# Template definition implies these CQs are answerable:
region_institution_search:
  question_patterns:
    - "Welke {institution_type_nl} zijn er in {province}?"
    - "Which {institution_type_en} are in {province}?"

  # Implied Competency Questions:
  # CQ1: What archives exist in a given Dutch province?
  # CQ2: What museums exist in a given Dutch province?
  # CQ3: What libraries exist in a given Dutch province?
  # etc.
```
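
To make the implied questions explicit, the slot placeholders in a pattern can be expanded against a list of allowed values. The following is a minimal sketch: `expand_pattern` and `INSTITUTION_TYPES_EN` are hypothetical names chosen for illustration and are not part of the template system described here.

```python
"""Enumerate the competency questions implied by a question pattern (sketch)."""

from string import Formatter

# Hypothetical slot values; the real template system may define these elsewhere.
INSTITUTION_TYPES_EN = ["archives", "museums", "libraries"]


def expand_pattern(pattern: str, slot: str, values: list[str]) -> list[str]:
    """Fill one slot with each value, leaving all other slots as placeholders."""
    slots = [name for _, name, _, _ in Formatter().parse(pattern) if name]
    if slot not in slots:
        raise ValueError(f"Pattern has no slot named {slot!r}")
    questions = []
    for value in values:
        # Map every other slot back to its own placeholder so it stays generic.
        mapping = {name: "{" + name + "}" for name in slots}
        mapping[slot] = value
        questions.append(pattern.format(**mapping))
    return questions


implied_cqs = expand_pattern(
    "Which {institution_type_en} are in {province}?",
    slot="institution_type_en",
    values=INSTITUTION_TYPES_EN,
)
for cq in implied_cqs:
    print(cq)
```

Each generated string is a candidate entry for the CQ registry, with the `{province}` slot left open because the question is meant to hold for any province.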

### Tracking Ontology Coverage

```python
"""Track which competency questions the ontology can answer."""

from dataclasses import dataclass
from enum import Enum


class CQStatus(Enum):
    """Status of a competency question."""
    ANSWERABLE = "answerable"           # Has matching template
    PARTIAL = "partial"                 # Template exists but limited
    UNANSWERABLE = "unanswerable"       # No template, ontology gap
    OUT_OF_SCOPE = "out_of_scope"       # Not relevant to ontology (fyke)


@dataclass
class CompetencyQuestion:
    """A competency question for ontology validation."""

    id: str
    question_nl: str
    question_en: str
    category: str  # geographic, statistical, relational, etc.
    status: CQStatus
    template_id: str | None  # Template that answers this CQ
    ontology_classes: list[str]  # Classes needed to answer
    ontology_properties: list[str]  # Properties needed to answer
    notes: str | None = None


# Example CQ registry
COMPETENCY_QUESTIONS = [
    CompetencyQuestion(
        id="CQ-GEO-001",
        question_nl="Welke archieven zijn er in een bepaalde provincie?",
        question_en="What archives exist in a given province?",
        category="geographic",
        status=CQStatus.ANSWERABLE,
        template_id="region_institution_search",
        ontology_classes=["crm:E39_Actor"],
        ontology_properties=["hc:institutionType", "schema:addressLocality"],
    ),
    CompetencyQuestion(
        id="CQ-REL-001",
        question_nl="Welke instellingen hebben dezelfde directeur gehad?",
        question_en="Which institutions have shared the same director?",
        category="relational",
        status=CQStatus.UNANSWERABLE,  # GAP: No staff employment history
        template_id=None,
        ontology_classes=["crm:E39_Actor", "schema:Person"],
        ontology_properties=["schema:employee", "schema:worksFor"],  # MISSING
        notes="Ontology lacks employment history modeling",
    ),
    CompetencyQuestion(
        id="CQ-OUT-001",
        question_nl="Waar kan ik tandpasta met korting kopen?",
        question_en="Where can I buy toothpaste with a discount?",
        category="out_of_scope",
        status=CQStatus.OUT_OF_SCOPE,
        template_id=None,
        ontology_classes=[],
        ontology_properties=[],
        notes="Not related to heritage institutions - route to fyke",
    ),
]
```

## The Fyke: Catching Irrelevant Questions

### What is a Fyke?

A **fyke** (Dutch: *fuik*) is a type of fish trap - a net that catches fish swimming in a certain direction. In our system, the fyke catches questions that are **irrelevant to the Heritage Custodian Ontology**.

```
User Question
     |
     v
+------------------+
| Relevance Filter |  <-- Is this about heritage institutions?
+------------------+
     |
     +--- Yes ---> Route to SPARQL templates
     |
     +--- No ----> +-------+
                   | FYKE  |  <-- Catch irrelevant questions
                   +-------+
                       |
                       v
                   Standard Response:
                   "Deze vraag kan niet beantwoord worden
                    door de ArchiefAssistent. De service
                    bevat informatie over erfgoedinstellingen
                    zoals archieven, musea en bibliotheken."
```

### Fyke Implementation

```python
"""Fyke: Filter for irrelevant questions."""

import dspy
from pydantic import BaseModel, Field


class RelevanceClassification(BaseModel):
    """Structured output for relevance classification."""

    is_relevant: bool = Field(
        description="Whether the question relates to heritage institutions"
    )
    confidence: float = Field(
        ge=0.0, le=1.0,
        description="Confidence in the classification"
    )
    reasoning: str = Field(
        description="Brief explanation of why the question is or isn't relevant"
    )
    detected_topics: list[str] = Field(
        description="Topics detected in the question"
    )


class HeritageRelevanceSignature(dspy.Signature):
    """Determine if a question is relevant to the Heritage Custodian Ontology.

    The Heritage Custodian Ontology covers:
    - Heritage institutions: museums, archives, libraries, galleries
    - Institution properties: location, founding date, type, collections
    - Staff and personnel at heritage institutions
    - Geographic distribution of heritage institutions
    - Relationships between institutions

    Questions about the following are OUT OF SCOPE and should be marked irrelevant:
    - Commercial products or shopping
    - Medical or health advice
    - Legal advice
    - Current news or politics (unless about heritage policy)
    - Personal relationships
    - Technical support for unrelated systems
    - General knowledge not related to heritage

    Be generous with relevance: if the question MIGHT relate to heritage institutions,
    mark it as relevant. Only flag clearly unrelated questions as irrelevant.
    """

    question: str = dspy.InputField(
        desc="User's question to classify"
    )
    language: str = dspy.InputField(
        desc="Language of the question (nl, en, de, fr)",
        default="nl"
    )

    classification: RelevanceClassification = dspy.OutputField(
        desc="Structured relevance classification"
    )


class FykeFilter(dspy.Module):
    """Filter irrelevant questions before template matching.

    The fyke catches questions that cannot be answered by the
    Heritage Custodian Ontology, returning a polite standard response.
    """

    # Standard responses by language
    STANDARD_RESPONSES = {
        "nl": (
            "Deze vraag kan helaas niet beantwoord worden door de ArchiefAssistent. "
            "Deze service bevat informatie over erfgoedinstellingen in Nederland en "
            "daarbuiten, zoals archieven, musea, bibliotheken en galerieën. "
            "Stel gerust een vraag over deze instellingen!"
        ),
        "en": (
            "Unfortunately, this question cannot be answered by the ArchiefAssistent. "
            "This service contains information about heritage institutions in the "
            "Netherlands and beyond, such as archives, museums, libraries, and galleries. "
            "Feel free to ask a question about these institutions!"
        ),
        "de": (
            "Leider kann diese Frage vom ArchiefAssistent nicht beantwortet werden. "
            "Dieser Service enthält Informationen über Kulturerbe-Einrichtungen in den "
            "Niederlanden und darüber hinaus, wie Archive, Museen, Bibliotheken und Galerien. "
            "Stellen Sie gerne eine Frage zu diesen Einrichtungen!"
        ),
        "fr": (
            "Malheureusement, cette question ne peut pas être répondue par l'ArchiefAssistent. "
            "Ce service contient des informations sur les institutions patrimoniales aux "
            "Pays-Bas et au-delà, telles que les archives, les musées, les bibliothèques "
            "et les galeries. N'hésitez pas à poser une question sur ces institutions!"
        ),
    }

    # Confidence threshold for fyke activation
    IRRELEVANCE_THRESHOLD = 0.85

    def __init__(self, fast_lm: dspy.LM | None = None):
        """Initialize the fyke filter.

        Args:
            fast_lm: Optional fast LM for relevance classification.
                     Recommended: gpt-4o-mini or similar for speed.
        """
        super().__init__()
        self.fast_lm = fast_lm
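        # NOTE: dspy.TypedPredictor comes from older DSPy releases; in recent
        # releases dspy.Predict is expected to handle Pydantic-typed outputs.
        self.classifier = dspy.TypedPredictor(HeritageRelevanceSignature)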

    def forward(
        self,
        question: str,
        language: str = "nl",
    ) -> dspy.Prediction:
        """Classify question relevance and optionally catch in fyke.

        Returns:
            Prediction with:
                - is_relevant: bool
                - caught_by_fyke: bool
                - fyke_response: str | None (if caught)
                - reasoning: str
                - confidence: float
        """
        # Use fast LM if configured
        if self.fast_lm:
            with dspy.settings.context(lm=self.fast_lm):
                result = self.classifier(question=question, language=language)
        else:
            result = self.classifier(question=question, language=language)

        classification = result.classification

        # Determine if caught by fyke
        caught = (
            not classification.is_relevant
            and classification.confidence >= self.IRRELEVANCE_THRESHOLD
        )

        # Get appropriate response
        fyke_response = None
        if caught:
            fyke_response = self.STANDARD_RESPONSES.get(
                language,
                self.STANDARD_RESPONSES["en"]
            )

        return dspy.Prediction(
            is_relevant=classification.is_relevant,
            caught_by_fyke=caught,
            fyke_response=fyke_response,
            reasoning=classification.reasoning,
            confidence=classification.confidence,
            detected_topics=classification.detected_topics,
        )


# Example usage in the RAG pipeline
class HeritageRAGWithFyke(dspy.Module):
    """Heritage RAG with fyke pre-filter."""

    def __init__(self):
        super().__init__()
        self.fyke = FykeFilter()
        self.router = HeritageQueryRouter()
        self.template_classifier = TemplateClassifier()
        # ... other components

    async def answer(
        self,
        question: str,
        language: str = "nl",
    ) -> dspy.Prediction:
        """Answer question with fyke pre-filtering."""

        # Step 1: Check relevance (fyke filter)
        relevance = self.fyke(question=question, language=language)

        if relevance.caught_by_fyke:
            # Question is irrelevant - return standard response
            return dspy.Prediction(
                answer=relevance.fyke_response,
                caught_by_fyke=True,
                reasoning=relevance.reasoning,
                confidence=relevance.confidence,
            )

        # Step 2: Route relevant question to templates
        routing = self.router(question=question, language=language)

        # ... continue with normal processing
```

### Fyke Examples

| Question | Language | Relevant? | Reasoning |
|----------|----------|-----------|-----------|
| "Welke archieven zijn er in Utrecht?" | nl | ✅ Yes | Asks about archives in a location |
| "Waar kan ik tandpasta met korting kopen?" | nl | ❌ No | Shopping query, not heritage |
| "What is the weather in Amsterdam?" | en | ❌ No | Weather query, not heritage |
| "Wie is de directeur van het Rijksmuseum?" | nl | ✅ Yes | Asks about museum staff |
| "How do I reset my password?" | en | ❌ No | Technical support, not heritage |
| "Welke musea hebben een Van Gogh collectie?" | nl | ✅ Yes | Asks about museum collections |
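
The decision rule inside `FykeFilter.forward` can be exercised without a language model by feeding it hand-written classifications. The sketch below assumes the same 0.85 threshold; `is_caught` is a hypothetical helper that mirrors the `caught` expression, and the confidence values are invented for illustration.

```python
"""Exercise the fyke decision rule on stubbed classifications (sketch)."""

IRRELEVANCE_THRESHOLD = 0.85  # Same value as FykeFilter.IRRELEVANCE_THRESHOLD


def is_caught(is_relevant: bool, confidence: float) -> bool:
    """A question is caught only if it is judged irrelevant with high confidence."""
    return (not is_relevant) and confidence >= IRRELEVANCE_THRESHOLD


# (question, is_relevant, confidence); the confidence values are made up.
stubbed = [
    ("Welke archieven zijn er in Utrecht?", True, 0.97),
    ("Waar kan ik tandpasta met korting kopen?", False, 0.95),
    ("Wie heeft dit gebouw ontworpen?", False, 0.60),  # uncertain: let it through
]

results = {q: is_caught(rel, conf) for q, rel, conf in stubbed}
for question, caught in results.items():
    print(f"{'FYKE' if caught else 'PASS'}  {question}")
```

The third case shows the intended generosity: a question classified as irrelevant with low confidence is not caught and continues to template matching.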

## Ontology Gap Detection

### Identifying Gaps Through Template Failure

When a relevant question cannot be mapped to any template, this signals a potential ontology gap:

```python
"""Detect ontology gaps through template matching failures."""

from dataclasses import dataclass
from datetime import datetime

import dspy


@dataclass
class OntologyGapReport:
    """Report of a potential ontology gap."""

    question: str
    timestamp: datetime
    detected_intent: str
    detected_entities: list[str]
    required_classes: list[str]  # Classes that would be needed
    required_properties: list[str]  # Properties that would be needed
    existing_coverage: str  # What the ontology currently covers
    gap_description: str
    priority: str  # high, medium, low


class OntologyGapDetector:
    """Detect gaps in the Heritage Custodian Ontology."""

    # Known ontology capabilities
    ONTOLOGY_COVERAGE = {
        "institution_identity": {
            "classes": ["crm:E39_Actor"],
            "properties": ["skos:prefLabel", "hc:institutionType", "schema:description"],
            "description": "Institution names, types, and descriptions",
        },
        "institution_location": {
            "classes": ["crm:E39_Actor"],
            "properties": ["schema:addressLocality", "schema:addressCountry", "schema:geo"],
            "description": "Institution geographic locations",
        },
        "institution_founding": {
            "classes": ["crm:E39_Actor"],
            "properties": ["schema:foundingDate", "schema:dissolutionDate"],
            "description": "Institution founding and closure dates",
        },
        "staff_current": {
            "classes": ["schema:Person"],
            "properties": ["schema:name", "schema:jobTitle"],
            "description": "Current staff members and roles",
        },
    }

    # Known gaps (to be expanded as gaps are discovered)
    KNOWN_GAPS = {
        "staff_history": {
            "description": "Historical employment records",
            "example_questions": [
                "Wie was de eerste directeur van het Rijksmuseum?",
                "Welke archivarissen hebben bij meerdere instellingen gewerkt?",
            ],
            "required_modeling": "Employment periods with start/end dates",
        },
        "collection_items": {
            "description": "Individual collection items",
            "example_questions": [
                "Welke musea hebben werken van Rembrandt?",
                "Waar kan ik de Nachtwacht zien?",
            ],
            "required_modeling": "Collection items linked to institutions",
        },
        "institutional_relationships": {
            "description": "Relationships between institutions",
            "example_questions": [
                "Welke instellingen zijn onderdeel van de Reinwardt Academie?",
                "Met welke musea werkt het Rijksmuseum samen?",
            ],
            "required_modeling": "Formal relationships (parent, partner, etc.)",
        },
    }

    def analyze_unmatched_question(
        self,
        question: str,
        routing: dspy.Prediction,
    ) -> OntologyGapReport | None:
        """Analyze why a relevant question couldn't be matched to a template.

        Args:
            question: The user's question
            routing: Routing prediction with intent, entities, etc.

        Returns:
            OntologyGapReport if a gap is detected, None otherwise
        """
        # Check if this matches a known gap pattern
        for gap_id, gap_info in self.KNOWN_GAPS.items():
            for example in gap_info["example_questions"]:
                if self._is_similar_question(question, example):
                    return OntologyGapReport(
                        question=question,
                        timestamp=datetime.now(),
                        detected_intent=routing.intent,
                        detected_entities=routing.entities,
                        required_classes=self._infer_required_classes(gap_id),
                        required_properties=self._infer_required_properties(gap_id),
                        existing_coverage=self._describe_current_coverage(routing.intent),
                        gap_description=gap_info["description"],
                        priority=self._assess_priority(gap_id),
                    )

        # Unknown gap - log for review
        return OntologyGapReport(
            question=question,
            timestamp=datetime.now(),
            detected_intent=routing.intent,
            detected_entities=routing.entities,
            required_classes=["UNKNOWN"],
            required_properties=["UNKNOWN"],
            existing_coverage=self._describe_current_coverage(routing.intent),
            gap_description="Unknown gap - requires manual review",
            priority="medium",
        )

    def _is_similar_question(self, q1: str, q2: str) -> bool:
        """Check if two questions share most of their terms (lexical overlap)."""
        # Simple implementation - could use embeddings for semantic similarity
        q1_terms = set(q1.lower().split())
        q2_terms = set(q2.lower().split())

        # Guard against empty input to avoid division by zero
        if not q1_terms or not q2_terms:
            return False

        # Share of common terms, relative to the longer question
        overlap = len(q1_terms & q2_terms) / max(len(q1_terms), len(q2_terms))
        return overlap > 0.5
```
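
A worked example helps show how the term-overlap check behaves. The sketch below is a standalone variant, not the detector's own method: `token_overlap` is a hypothetical name, and stripping punctuation before comparing is an added assumption so that `Rijksmuseum?` and `Rijksmuseum` count as the same term.

```python
"""Token-overlap similarity between two questions (sketch)."""

import string


def token_overlap(q1: str, q2: str) -> float:
    """Share of tokens in common, relative to the longer question."""
    strip = str.maketrans("", "", string.punctuation)
    t1 = set(q1.lower().translate(strip).split())
    t2 = set(q2.lower().translate(strip).split())
    if not t1 or not t2:
        return 0.0
    return len(t1 & t2) / max(len(t1), len(t2))


known = "Wie was de eerste directeur van het Rijksmuseum?"
same_gap = "Wie was de eerste directeur van het Mauritshuis?"
other = "Hoeveel musea zijn er in Nederland?"

print(round(token_overlap(known, same_gap), 3))  # 7 of 8 terms shared
print(round(token_overlap(known, other), 3))     # no terms shared
```

Because function words such as "de" and "van" count as much as content words, short questions can match on stop words alone; an embedding-based comparison would likely be more robust.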

### Gap Reporting Dashboard

```python
"""Generate ontology coverage reports."""

from collections import Counter


def generate_coverage_report(
    competency_questions: list[CompetencyQuestion],
    gap_reports: list[OntologyGapReport],
) -> dict:
    """Generate an ontology coverage report.

    Returns:
        Dict with coverage statistics and gap analysis
    """
    # Count CQ statuses
    status_counts = Counter(cq.status for cq in competency_questions)

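    # Calculate coverage percentage over relevant CQs only
    # (out-of-scope questions are excluded, matching the Coverage % metric)
    total_cqs = len(competency_questions)
    answerable = status_counts.get(CQStatus.ANSWERABLE, 0)
    partial = status_counts.get(CQStatus.PARTIAL, 0)
    relevant_cqs = total_cqs - status_counts.get(CQStatus.OUT_OF_SCOPE, 0)
    coverage_pct = (
        (answerable + 0.5 * partial) / relevant_cqs * 100 if relevant_cqs > 0 else 0.0
    )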

    # Group gaps by category
    gaps_by_category = {}
    for gap in gap_reports:
        category = gap.detected_intent
        if category not in gaps_by_category:
            gaps_by_category[category] = []
        gaps_by_category[category].append(gap)

    # Identify most common gaps
    gap_descriptions = Counter(g.gap_description for g in gap_reports)

    return {
        "summary": {
            "total_competency_questions": total_cqs,
            "answerable": answerable,
            "partial": partial,
            "unanswerable": status_counts.get(CQStatus.UNANSWERABLE, 0),
            "out_of_scope": status_counts.get(CQStatus.OUT_OF_SCOPE, 0),
            "coverage_percentage": round(coverage_pct, 1),
        },
        "gaps_by_category": {
            cat: len(gaps) for cat, gaps in gaps_by_category.items()
        },
        "top_gaps": gap_descriptions.most_common(10),
        "recommendations": _generate_recommendations(gap_reports),
    }


def _generate_recommendations(gaps: list[OntologyGapReport]) -> list[str]:
    """Generate recommendations for ontology improvements."""
    recommendations = []

    # Analyze gap patterns
    gap_types = Counter(g.gap_description for g in gaps)

    for gap_type, count in gap_types.most_common(5):
        if count >= 3:
            recommendations.append(
                f"High priority: Add modeling for '{gap_type}' "
                f"({count} unanswered questions)"
            )

    return recommendations
```

## Competency Question Registry

### YAML Format for CQ Tracking

```yaml
# data/competency_questions.yaml
version: "1.0.0"

categories:
  geographic:
    description: "Questions about institution locations"
    coverage: high
  statistical:
    description: "Questions about counts and distributions"
    coverage: high
  relational:
    description: "Questions about relationships between entities"
    coverage: low
  temporal:
    description: "Questions about historical changes"
    coverage: medium
  biographical:
    description: "Questions about people in heritage sector"
    coverage: medium

competency_questions:
  # ANSWERABLE - Geographic
  - id: CQ-GEO-001
    question_nl: "Welke archieven zijn er in een bepaalde provincie?"
    question_en: "What archives exist in a given province?"
    category: geographic
    status: answerable
    template_id: region_institution_search
    ontology_classes: [crm:E39_Actor]
    ontology_properties: [hc:institutionType, schema:addressLocality]

  - id: CQ-GEO-002
    question_nl: "Welke musea zijn er in een bepaalde stad?"
    question_en: "What museums exist in a given city?"
    category: geographic
    status: answerable
    template_id: city_institution_search
    ontology_classes: [crm:E39_Actor]
    ontology_properties: [hc:institutionType, schema:addressLocality]

  # ANSWERABLE - Statistical
  - id: CQ-STAT-001
    question_nl: "Hoeveel musea zijn er in Nederland?"
    question_en: "How many museums are there in the Netherlands?"
    category: statistical
    status: answerable
    template_id: count_institutions_by_type
    ontology_classes: [crm:E39_Actor]
    ontology_properties: [hc:institutionType, schema:addressCountry]

  # PARTIAL - Temporal
  - id: CQ-TEMP-001
    question_nl: "Wat is het oudste archief in Nederland?"
    question_en: "What is the oldest archive in the Netherlands?"
    category: temporal
    status: partial
    template_id: find_oldest_institution
    ontology_classes: [crm:E39_Actor]
    ontology_properties: [schema:foundingDate]
    notes: "Founding dates incomplete for many institutions"

  # UNANSWERABLE - Relational (GAP)
  - id: CQ-REL-001
    question_nl: "Welke instellingen hebben dezelfde directeur gehad?"
    question_en: "Which institutions have shared the same director?"
    category: relational
    status: unanswerable
    template_id: null
    ontology_classes: [crm:E39_Actor, schema:Person]
    ontology_properties: [schema:employee]  # MISSING: employment history
    gap_description: "No modeling for historical employment relationships"
    recommended_fix: "Add employment periods with start/end dates"

  - id: CQ-REL-002
    question_nl: "Welke musea zijn onderdeel van een groter netwerk?"
    question_en: "Which museums are part of a larger network?"
    category: relational
    status: unanswerable
    template_id: null
    ontology_classes: [crm:E39_Actor, org:Organization]
    ontology_properties: [org:subOrganizationOf]  # MISSING
    gap_description: "No modeling for organizational hierarchies"

  # OUT OF SCOPE - Fyke
  - id: CQ-OUT-001
    question_nl: "Waar kan ik tandpasta met korting kopen?"
    question_en: "Where can I buy toothpaste with a discount?"
    category: out_of_scope
    status: out_of_scope
    template_id: null
    ontology_classes: []
    ontology_properties: []
    notes: "Shopping query - route to fyke"

  - id: CQ-OUT-002
    question_nl: "Wat is het weer morgen in Amsterdam?"
    question_en: "What is the weather tomorrow in Amsterdam?"
    category: out_of_scope
    status: out_of_scope
    template_id: null
    ontology_classes: []
    ontology_properties: []
    notes: "Weather query - route to fyke"
```
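
Once parsed, registry entries are plain dictionaries, so basic consistency checks are cheap to run. The sketch below assumes a rule inferred from the examples above rather than stated anywhere: `answerable` and `partial` entries name a template, while `unanswerable` and `out_of_scope` entries do not. `check_entry` is a hypothetical helper.

```python
"""Check status/template_id consistency of registry entries (sketch)."""

NEEDS_TEMPLATE = {"answerable", "partial"}
NO_TEMPLATE = {"unanswerable", "out_of_scope"}


def check_entry(entry: dict) -> list[str]:
    """Return a list of problems found in one parsed registry entry."""
    problems = []
    status = entry.get("status")
    template_id = entry.get("template_id")
    if status not in NEEDS_TEMPLATE | NO_TEMPLATE:
        problems.append(f"{entry.get('id')}: unknown status {status!r}")
    elif status in NEEDS_TEMPLATE and not template_id:
        problems.append(f"{entry.get('id')}: status {status!r} needs a template_id")
    elif status in NO_TEMPLATE and template_id:
        problems.append(f"{entry.get('id')}: status {status!r} should not name a template")
    return problems


# Entries as they would look after parsing; the last one is deliberately broken.
entries = [
    {"id": "CQ-GEO-001", "status": "answerable", "template_id": "region_institution_search"},
    {"id": "CQ-REL-001", "status": "unanswerable", "template_id": None},
    {"id": "CQ-BAD-001", "status": "answerable", "template_id": None},
]

all_problems = [p for e in entries for p in check_entry(e)]
print(all_problems)
```

Running a check like this when the registry is loaded catches entries whose status and template link have drifted apart.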

## Integration with Template System

### Template-CQ Bidirectional Linking

```python
"""Link templates to competency questions bidirectionally."""

def validate_template_cq_coverage(
    templates: dict[str, SPARQLTemplate],
    competency_questions: list[CompetencyQuestion],
) -> dict:
    """Validate that templates cover expected CQs and vice versa.

    Returns:
        Validation report with coverage analysis
    """
    # Templates without CQ coverage
    templates_without_cq = []
    for template_id, template in templates.items():
        matching_cqs = [
            cq for cq in competency_questions
            if cq.template_id == template_id
        ]
        if not matching_cqs:
            templates_without_cq.append(template_id)

    # CQs without template coverage
    cqs_without_template = [
        cq for cq in competency_questions
        if cq.status == CQStatus.ANSWERABLE and cq.template_id is None
    ]

    # CQs that reference a template ID that doesn't exist
    orphaned_cqs = [
        cq for cq in competency_questions
        if cq.template_id and cq.template_id not in templates
    ]

    return {
        "templates_without_cq": templates_without_cq,
        "cqs_without_template": [cq.id for cq in cqs_without_template],
        "orphaned_cqs": [cq.id for cq in orphaned_cqs],
        "coverage_complete": (
            len(templates_without_cq) == 0
            and len(cqs_without_template) == 0
            and len(orphaned_cqs) == 0
        ),
    }
```
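
The same cross-check can be illustrated with plain sets, which makes the two directions of the link easy to see. This is a simplified sketch with hypothetical inputs (`template_ids` and `cq_links`), not a replacement for the function above.

```python
"""Cross-check template IDs against CQ references using sets (sketch)."""

# Hypothetical inputs: known template IDs and (cq_id, template_id) pairs.
template_ids = {"region_institution_search", "city_institution_search", "unused_template"}
cq_links = [
    ("CQ-GEO-001", "region_institution_search"),
    ("CQ-GEO-002", "city_institution_search"),
    ("CQ-TEMP-001", "find_oldest_institution"),  # template not in the set above
    ("CQ-REL-001", None),
]

# Template IDs that at least one CQ points to.
referenced = {tid for _, tid in cq_links if tid is not None}

# Direction 1: templates that no CQ refers to.
templates_without_cq = sorted(template_ids - referenced)

# Direction 2: CQs that refer to a template that does not exist.
orphaned_cqs = sorted(
    cq for cq, tid in cq_links if tid is not None and tid not in template_ids
)

print(templates_without_cq)
print(orphaned_cqs)
```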

## Summary

The template-based SPARQL system provides critical insights into ontology coverage:

| Aspect | Implementation | Benefit |
|--------|----------------|---------|
| **Competency Questions** | YAML registry with status tracking | Defines what ontology should answer |
| **Fyke Filter** | DSPy relevance classifier | Catches irrelevant questions early |
| **Gap Detection** | Analysis of unmatched questions | Identifies ontology improvements needed |
| **Coverage Reports** | Automated metrics generation | Tracks ontology completeness over time |
| **Bidirectional Linking** | Templates ↔ CQs validation | Ensures consistency |

### Key Metrics

- **Coverage %** = (Answerable CQs + 0.5 × Partial CQs) / Total Relevant CQs × 100
- **Fyke Rate** = Out-of-scope questions / Total questions
- **Gap Rate** = Unanswerable relevant questions / Total relevant questions

Target: **>90% coverage** of relevant competency questions.
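
As a worked example, the three metrics can be computed from raw status counts. The sketch below is illustrative: `key_metrics` is a hypothetical helper, and it applies the Fyke Rate to registry counts, whereas in production that rate would more naturally be measured over incoming user questions.

```python
"""Compute the three key metrics from raw counts (sketch)."""


def key_metrics(
    answerable: int, partial: int, unanswerable: int, out_of_scope: int
) -> dict[str, float]:
    """Return coverage, fyke rate, and gap rate as percentages."""
    relevant = answerable + partial + unanswerable
    total = relevant + out_of_scope
    if relevant == 0 or total == 0:
        raise ValueError("Need at least one relevant question")
    return {
        "coverage_pct": round((answerable + 0.5 * partial) / relevant * 100, 1),
        "fyke_rate_pct": round(out_of_scope / total * 100, 1),
        "gap_rate_pct": round(unanswerable / relevant * 100, 1),
    }


# Counts taken from the example YAML registry in this document:
# 3 answerable, 1 partial, 2 unanswerable, 2 out of scope.
metrics = key_metrics(answerable=3, partial=1, unanswerable=2, out_of_scope=2)
print(metrics)
```

For the example registry this gives (3 + 0.5 × 1) / 6 ≈ 58.3% coverage, well below the 90% target, which is expected for a registry that deliberately lists known gaps.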