glam/docs/plan/dspy_rag_automation/04-playwright-tests.md

# Playwright E2E Test Specification

## Overview

Playwright tests validate that RAG responses render correctly in the ArchiefAssistent UI. This catches bugs in:
- Frontend parsing of API responses
- UI component rendering
- Map marker placement
- Debug panel accuracy

## Test Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                    Playwright Test Suite                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                    Test Fixtures                          │   │
│  │  - Browser context with authentication                    │   │
│  │  - Golden dataset examples                                │   │
│  │  - API mocking (optional)                                 │   │
│  └──────────────────────────────────────────────────────────┘   │
│                              │                                   │
│          ┌───────────────────┼───────────────────┐              │
│          ▼                   ▼                   ▼              │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐      │
│  │   Chat UI    │    │  Map Panel   │    │ Debug Panel  │      │
│  │   Tests      │    │   Tests      │    │   Tests      │      │
│  └──────────────┘    └──────────────┘    └──────────────┘      │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
```
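
The fixtures box lists API mocking as optional. A minimal sketch of how that could look with Playwright's `page.route` follows. The URL pattern `**/api/chat` and the `answer` / `institutions` response fields are assumptions made for illustration, not the confirmed API contract, and the helper names are hypothetical.

```python
import json

# Hypothetical URL pattern for the chat endpoint; adjust to the real API route.
CHAT_ROUTE = "**/api/chat"


def build_mock_response(answer, institutions=None):
    """Build a canned chat payload. The field names are assumptions."""
    return {"answer": answer, "institutions": institutions or []}


def make_chat_handler(payload):
    """Return a route handler that answers every matched request with payload."""

    def handler(route):
        # route.fulfill answers the request in the browser; the backend is never called.
        route.fulfill(
            status=200,
            content_type="application/json",
            body=json.dumps(payload),
        )

    return handler


def mock_chat_api(page, payload):
    """Register the handler on a Playwright page before sending a question."""
    page.route(CHAT_ROUTE, make_chat_handler(payload))
```

Mocked runs isolate frontend parsing and rendering from backend behavior. Unmocked runs are still needed to cover the full RAG pipeline.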

## Implementation

### Test Configuration

```python
# tests/e2e/conftest.py

import json
import os

import pytest
from playwright.sync_api import Page

# Base URL of the deployment under test; CI sets TEST_BASE_URL.
BASE_URL = os.environ.get("TEST_BASE_URL", "https://archief.support")


@pytest.fixture(scope="session")
def browser_context_args(browser_context_args):
    """Extend pytest-playwright's default browser context options.

    The plugin already provides the `browser`, `context`, and `page` fixtures.
    Redefining them here would bypass the --headed and --slowmo options.
    """
    return {
        **browser_context_args,
        "viewport": {"width": 1280, "height": 720},
        "locale": "nl-NL",
    }


@pytest.fixture
def authenticated_page(page: Page):
    """Page that is logged in and showing the chat view."""
    # Navigate to login page
    page.goto(f"{BASE_URL}/login")

    # For testing, use test credentials or mock auth
    page.fill('[data-testid="username"]', 'test_user')
    page.fill('[data-testid="password"]', 'test_password')
    page.click('[data-testid="login-button"]')

    # Wait for redirect to chat
    page.wait_for_url("**/chat")

    return page


@pytest.fixture
def golden_examples():
    """Load golden dataset examples."""
    with open('data/rag_eval/golden_dataset.json') as f:
        data = json.load(f)
    return data['examples']
```
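
The `golden_examples` fixture is only available inside a test body, while `pytest.mark.parametrize` needs its cases at collection time. One way to bridge that is a plain loader function, sketched below. Only the top-level `examples` key comes from the fixture above; the per-example `question` and `expected_pattern` keys and the function name are assumptions for illustration.

```python
import json
from pathlib import Path

GOLDEN_PATH = Path("data/rag_eval/golden_dataset.json")


def load_golden_cases(path=GOLDEN_PATH):
    """Return (question, expected_pattern) tuples from the golden dataset.

    The "question" and "expected_pattern" keys are assumed field names.
    Examples missing either key are skipped.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    cases = []
    for example in data["examples"]:
        question = example.get("question")
        pattern = example.get("expected_pattern")
        if question and pattern:
            cases.append((question, pattern))
    return cases
```

The result could then be passed directly to a decorator such as `@pytest.mark.parametrize("question,expected_pattern", load_golden_cases())`.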

### Chat UI Tests

```python
# tests/e2e/test_chat_ui.py

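import os
import re

import pytest
from playwright.sync_api import Page, expect

# Base URL of the deployment under test; CI sets TEST_BASE_URL.
BASE_URL = os.environ.get("TEST_BASE_URL", "https://archief.support")


class TestChatBasics:
    """Basic chat functionality tests."""

    def test_chat_page_loads(self, page: Page):
        """Chat page should load without errors."""
        page.goto(f"{BASE_URL}/chat")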

        # Check title
        expect(page).to_have_title(re.compile("ArchiefAssistent"))

        # Check chat input exists
        input_field = page.locator('[data-testid="chat-input"]')
        expect(input_field).to_be_visible()

        # Check send button exists
        send_button = page.locator('[data-testid="send-button"]')
        expect(send_button).to_be_visible()

    def test_send_message(self, authenticated_page: Page):
        """User can send a message and receive response."""
        page = authenticated_page

        # Type message
        page.fill('[data-testid="chat-input"]', 'Hallo')
        page.click('[data-testid="send-button"]')

        # Wait for response
        response = page.locator('[data-testid="assistant-message"]').first
        expect(response).to_be_visible(timeout=30000)

        # Response should not be empty
        expect(response).not_to_have_text('')


class TestCountQueries:
    """Tests for COUNT query rendering."""

    @pytest.mark.parametrize("question,expected_pattern", [
        ("Hoeveel archieven zijn er in Utrecht?", r"Er zijn \d+ archieven"),
        ("Hoeveel musea zijn er in Noord-Holland?", r"Er zijn \d+ musea"),
        ("Hoeveel bibliotheken zijn er in Amsterdam?", r"Er zijn \d+ bibliotheken"),
    ])
    def test_count_query_response(
        self,
        authenticated_page: Page,
        question: str,
        expected_pattern: str
    ):
        """COUNT queries should return properly formatted answers."""
        page = authenticated_page

        # Send question
        page.fill('[data-testid="chat-input"]', question)
        page.click('[data-testid="send-button"]')

        # Wait for response
        response = page.locator('[data-testid="assistant-message"]').last
        expect(response).to_be_visible(timeout=30000)

        # Check response matches expected pattern; expect() retries until the
        # text matches, so an answer that is still rendering does not fail early
        expect(response).to_have_text(re.compile(expected_pattern))

    def test_count_shows_correct_number(self, authenticated_page: Page):
        """COUNT should show actual number from SPARQL, not Qdrant count."""
        page = authenticated_page

        # This is a known case: Utrecht has exactly 10 archives
        page.fill('[data-testid="chat-input"]', 'Hoeveel archieven zijn er in Utrecht?')
        page.click('[data-testid="send-button"]')

        # Wait for response
        response = page.locator('[data-testid="assistant-message"]').last
        expect(response).to_be_visible(timeout=30000)

        # Should contain "10" as a whole number, not "1" (the bug we fixed);
        # a plain substring check would also accept "100" or "210"
        text = response.text_content() or ""
        assert re.search(r"\b10\b", text), f"Expected '10' in response, got: {text}"


class TestListQueries:
    """Tests for LIST query rendering."""

    def test_list_shows_institution_cards(self, authenticated_page: Page):
        """LIST queries should render institution cards."""
        page = authenticated_page

        page.fill('[data-testid="chat-input"]', 'Welke archieven zijn er in Utrecht?')
        page.click('[data-testid="send-button"]')

        # Wait for response with cards
        cards = page.locator('[data-testid="institution-card"]')
        expect(cards.first).to_be_visible(timeout=30000)

        # Should have multiple cards
        count = cards.count()
        assert count >= 5, f"Expected at least 5 institution cards, got {count}"

    def test_institution_card_has_name(self, authenticated_page: Page):
        """Institution cards should show institution name."""
        page = authenticated_page

        page.fill('[data-testid="chat-input"]', 'Welke musea zijn er in Amsterdam?')
        page.click('[data-testid="send-button"]')

        # Wait for first card
        card = page.locator('[data-testid="institution-card"]').first
        expect(card).to_be_visible(timeout=30000)

        # Card should have a name element
        name = card.locator('[data-testid="institution-name"]')
        expect(name).to_be_visible()
        expect(name).not_to_have_text('')

    def test_institution_card_has_link(self, authenticated_page: Page):
        """Institution cards should have clickable website links."""
        page = authenticated_page

        page.fill('[data-testid="chat-input"]', 'Welke musea zijn er in Amsterdam?')
        page.click('[data-testid="send-button"]')

        # Find card with link
        link = page.locator('[data-testid="institution-card"] a[href^="http"]').first
        expect(link).to_be_visible(timeout=30000)

        # Link should point to an absolute http(s) URL
        expect(link).to_have_attribute('href', re.compile(r'^https?://'))


class TestMapVisualization:
    """Tests for geographic map rendering."""

    def test_map_appears_for_location_queries(self, authenticated_page: Page):
        """Map panel should appear for queries with geographic results."""
        page = authenticated_page

        page.fill('[data-testid="chat-input"]', 'Welke archieven zijn er in Utrecht?')
        page.click('[data-testid="send-button"]')

        # Wait for map panel to appear
        map_panel = page.locator('[data-testid="map-panel"]')
        expect(map_panel).to_be_visible(timeout=30000)

    def test_map_has_markers(self, authenticated_page: Page):
        """Map should display markers for institutions."""
        page = authenticated_page

        page.fill('[data-testid="chat-input"]', 'Welke musea zijn er in Haarlem?')
        page.click('[data-testid="send-button"]')

        # Wait for map markers
        markers = page.locator('[data-testid="map-marker"]')
        expect(markers.first).to_be_visible(timeout=30000)

        # Should have at least one marker
        assert markers.count() >= 1

    def test_marker_click_shows_popup(self, authenticated_page: Page):
        """Clicking a map marker should show institution popup."""
        page = authenticated_page

        page.fill('[data-testid="chat-input"]', 'Welke archieven zijn er in Utrecht?')
        page.click('[data-testid="send-button"]')

        # Wait for and click first marker
        marker = page.locator('[data-testid="map-marker"]').first
        expect(marker).to_be_visible(timeout=30000)
        marker.click()

        # Popup should appear
        popup = page.locator('[data-testid="map-popup"]')
        expect(popup).to_be_visible()


class TestDebugPanel:
    """Tests for debug panel showing SPARQL and metadata."""

    def test_debug_panel_toggle(self, authenticated_page: Page):
        """Debug panel can be toggled open/closed."""
        page = authenticated_page

        # Send a query first
        page.fill('[data-testid="chat-input"]', 'Hoeveel archieven zijn er in Utrecht?')
        page.click('[data-testid="send-button"]')

        # Wait for response
        page.wait_for_selector('[data-testid="assistant-message"]', timeout=30000)

        # Toggle debug panel
        toggle = page.locator('[data-testid="debug-toggle"]')
        expect(toggle).to_be_visible()
        toggle.click()

        # Debug panel should be visible
        debug_panel = page.locator('[data-testid="debug-panel"]')
        expect(debug_panel).to_be_visible()

    def test_debug_panel_shows_sparql(self, authenticated_page: Page):
        """Debug panel should show generated SPARQL query."""
        page = authenticated_page

        page.fill('[data-testid="chat-input"]', 'Hoeveel archieven zijn er in Utrecht?')
        page.click('[data-testid="send-button"]')

        # Open debug panel
        page.wait_for_selector('[data-testid="assistant-message"]', timeout=30000)
        page.click('[data-testid="debug-toggle"]')

        # SPARQL tab should show query
        sparql_tab = page.locator('[data-testid="debug-tab-sparql"]')
        sparql_tab.click()

        sparql_content = page.locator('[data-testid="sparql-query"]')
        expect(sparql_content).to_contain_text('SELECT')
        expect(sparql_content).to_contain_text('hc:institutionType')

    def test_debug_panel_shows_slots(self, authenticated_page: Page):
        """Debug panel should show extracted slot values."""
        page = authenticated_page

        page.fill('[data-testid="chat-input"]', 'Hoeveel musea zijn er in Noord-Holland?')
        page.click('[data-testid="send-button"]')

        # Open debug panel
        page.wait_for_selector('[data-testid="assistant-message"]', timeout=30000)
        page.click('[data-testid="debug-toggle"]')

        # Slots tab should show extracted values
        slots_tab = page.locator('[data-testid="debug-tab-slots"]')
        slots_tab.click()

        slots_content = page.locator('[data-testid="slot-values"]')
        expect(slots_content).to_contain_text('institution_type')
        # Museum code; match "M" as a standalone token, not any capital M
        expect(slots_content).to_contain_text(re.compile(r'\bM\b'))
```
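
Nearly every test above repeats the same fill, click, and wait sequence. A small helper could remove that duplication. The sketch below uses the selectors from the tests; the helper name `ask` and the timeout constant are hypothetical, and it assumes a fresh page per test so that `.last` refers to the answer to the question just sent.

```python
# Same 30 s budget the tests above use for the assistant's answer.
RESPONSE_TIMEOUT_MS = 30_000


def ask(page, question, timeout=RESPONSE_TIMEOUT_MS):
    """Send a question and return the locator of the newest assistant message."""
    page.fill('[data-testid="chat-input"]', question)
    page.click('[data-testid="send-button"]')

    response = page.locator('[data-testid="assistant-message"]').last
    # Locator.wait_for blocks until the element is visible or the timeout expires.
    response.wait_for(state="visible", timeout=timeout)
    return response
```

A test body would then shrink to something like `response = ask(authenticated_page, 'Hoeveel archieven zijn er in Utrecht?')` followed by its assertions.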

## Running Tests

```bash
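# Install Playwright and the pytest plugins used below
# (pytest-html provides --html, pytest-xdist provides -n)
pip install playwright pytest-playwright pytest-html pytest-xdist
playwright install chromium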

# Run all E2E tests
pytest tests/e2e/ -v

# Run specific test file
pytest tests/e2e/test_chat_ui.py -v

# Run with headed browser (for debugging)
pytest tests/e2e/ -v --headed

# Run with slow motion (for debugging)
pytest tests/e2e/ -v --slowmo 500

# Generate HTML report
pytest tests/e2e/ -v --html=reports/playwright_report.html

# Run in parallel (4 workers)
pytest tests/e2e/ -v -n 4
```

## CI/CD Integration

```yaml
# .github/workflows/e2e-tests.yml
name: E2E Tests

on:
  push:
    branches: [main]
  pull_request:

jobs:
  playwright:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          pip install -r requirements-test.txt
          playwright install chromium --with-deps

      - name: Run E2E tests
        run: pytest tests/e2e/ -v --html=reports/playwright.html --self-contained-html
        env:
          TEST_BASE_URL: https://archief.support

      - name: Upload report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report
          path: reports/playwright.html
```

## Required data-testid Attributes

The tests locate elements by `data-testid`. Add the following attributes to the frontend components:

| Component | data-testid | Element |
|-----------|-------------|---------|
| ChatPage | `chat-input` | TextField for input |
| ChatPage | `send-button` | Send IconButton |
| ChatPage | `assistant-message` | AI response Paper |
| ChatPage | `user-message` | User message Paper |
| InstitutionCard | `institution-card` | Card container |
| InstitutionCard | `institution-name` | Name Typography |
| ChatMapPanel | `map-panel` | Map container |
| ChatMapPanel | `map-marker` | Marker element |
| ChatMapPanel | `map-popup` | Popup container |
| DebugPanel | `debug-toggle` | Toggle button |
| DebugPanel | `debug-panel` | Panel container |
| DebugPanel | `debug-tab-sparql` | SPARQL tab |
| DebugPanel | `sparql-query` | SPARQL code block |
| DebugPanel | `debug-tab-slots` | Slots tab |
| DebugPanel | `slot-values` | Slots display |
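
The table and the test code can drift apart as selectors are added. A minimal sketch of a consistency check is shown below: it extracts `data-testid` values from test source text and reports any that the table does not list. The function names are hypothetical, and the set simply mirrors the table above.

```python
import re

# Attributes listed in the table above.
REQUIRED_TEST_IDS = {
    "chat-input", "send-button", "assistant-message", "user-message",
    "institution-card", "institution-name",
    "map-panel", "map-marker", "map-popup",
    "debug-toggle", "debug-panel", "debug-tab-sparql", "sparql-query",
    "debug-tab-slots", "slot-values",
}

# Matches selectors such as [data-testid="chat-input"] with single or double quotes.
_TEST_ID_RE = re.compile(r"""\[data-testid=["']([^"']+)["']\]""")


def extract_test_ids(source):
    """Return the set of data-testid values referenced in test source code."""
    return set(_TEST_ID_RE.findall(source))


def undocumented_test_ids(source):
    """Return test ids used in the source but missing from the table."""
    return extract_test_ids(source) - REQUIRED_TEST_IDS
```

Run against the files shown in this document, such a check would report the login form ids used in `conftest.py` (`username`, `password`, `login-button`), which the table does not list.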