Phase 3 Task 7: SPARQL Execution with Oxigraph

Goal: Connect the Visual SPARQL Query Builder to a real RDF triplestore and execute queries against heritage institution data.

Estimated Time: 9-10 hours (sum of the per-step estimates)
Status: In Progress
Dependencies: Task 6 (Query Builder) Complete


Overview

Task 7 transforms the Query Builder from a SPARQL generator into a full SPARQL IDE by:

  1. Installing Oxigraph RDF triplestore
  2. Loading sample heritage institution data
  3. Executing SPARQL queries via HTTP
  4. Displaying results in interactive tables
  5. Exporting results to multiple formats

Step-by-Step Implementation Plan

Step 1: Install & Configure Oxigraph (1-1.5 hours)

1.1 Install Oxigraph Binary

# macOS (Homebrew)
brew install oxigraph

# Or download from GitHub releases
# https://github.com/oxigraph/oxigraph/releases

1.2 Prepare Sample RDF Data

# Create data directory
mkdir -p frontend/data/sample-rdf

# Generate sample heritage institutions (N-Triples format)
# Use existing GLAM data or create minimal test dataset

1.3 Start Oxigraph Server

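# Create persistent database
# Note: newer Oxigraph releases expect a subcommand (`oxigraph_server serve ...`,
# or `oxigraph serve ...` from 0.4 on); check `--help` if the form below is rejected
oxigraph_server --location frontend/data/oxigraph-db --bind 127.0.0.1:7878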

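# Load sample data into the default graph
# (`?default` targets the default graph, which the test query below reads from)
curl -X POST \
  -H 'Content-Type: application/n-triples' \
  --data-binary '@frontend/data/sample-rdf/heritage-institutions.nt' \
  'http://localhost:7878/store?default'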

1.4 Test SPARQL Endpoint

# Test query
curl -X POST \
  -H 'Content-Type: application/sparql-query' \
  -H 'Accept: application/sparql-results+json' \
  --data 'SELECT * WHERE { ?s ?p ?o } LIMIT 10' \
  http://localhost:7878/query

Files to Create:

  • frontend/scripts/start-oxigraph.sh - Server startup script
  • frontend/scripts/load-sample-data.sh - Data loading script
  • frontend/data/sample-rdf/heritage-institutions.nt - Test data
  • frontend/OXIGRAPH_SETUP.md - Setup documentation

Step 2: Implement HTTP Client (1 hour)

2.1 Create SPARQL Client Module

File: src/lib/sparql/client.ts

/**
 * SPARQL HTTP Client for Oxigraph
 * 
 * Handles query execution, response parsing, and error handling.
 */

export interface SparqlEndpointConfig {
  url: string;
  timeout?: number;
  headers?: Record<string, string>;
}

export interface SparqlBinding {
  [variable: string]: {
    type: 'uri' | 'literal' | 'bnode';
    value: string;
    datatype?: string;
    'xml:lang'?: string;
  };
}

export interface SparqlResults {
  head: {
    vars: string[];
  };
  results: {
    bindings: SparqlBinding[];
  };
}

export interface SparqlError {
  message: string;
  statusCode?: number;
  query?: string;
}

export class SparqlClient {
  private config: SparqlEndpointConfig;

  constructor(config: SparqlEndpointConfig) {
    this.config = {
      ...config,
      // 30 seconds default; applied last so an explicit `timeout: undefined` cannot override it
      timeout: config.timeout ?? 30000,
    };
  }

  /**
   * Execute SELECT query
   */
  async executeSelect(query: string): Promise<SparqlResults> {
    // Implementation
  }

  /**
   * Execute ASK query
   */
  async executeAsk(query: string): Promise<boolean> {
    // Implementation
  }

  /**
   * Execute CONSTRUCT query
   */
  async executeConstruct(query: string): Promise<string> {
    // Implementation (returns RDF triples)
  }

  /**
   * Execute DESCRIBE query
   */
  async executeDescribe(query: string): Promise<string> {
    // Implementation (returns RDF triples)
  }

  /**
   * Test endpoint connectivity
   */
  async testConnection(): Promise<boolean> {
    // Simple ASK query to test connection
  }
}

/**
 * Default client instance (configured from environment)
 */
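export const defaultSparqlClient = new SparqlClient({
  url: import.meta.env.VITE_SPARQL_ENDPOINT || 'http://localhost:7878/query',
  // Vite exposes env values as strings; fall back to 30 s if unset or not a number
  timeout: Number(import.meta.env.VITE_SPARQL_TIMEOUT) || 30000,
});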
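
The method bodies above are left as stubs. As a rough sketch of what `executeSelect` might look like, the snippet below mirrors the curl test from Step 1.4 (direct POST with `application/sparql-query`) and uses `AbortController` for the timeout. The standalone function form, the `buildSelectRequest` helper, and the error message format are assumptions made for illustration, not the planned class API.

```typescript
// Sketch only: buildSelectRequest and SelectRequest are hypothetical names.
interface SparqlJsonResults {
  head: { vars: string[] };
  results: { bindings: Array<Record<string, { type: string; value: string }>> };
}

interface SelectRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

/** Build a "POST directly" request, the same shape as the curl test in Step 1.4. */
export function buildSelectRequest(
  endpointUrl: string,
  query: string,
  extraHeaders: Record<string, string> = {},
): SelectRequest {
  return {
    url: endpointUrl,
    method: 'POST',
    headers: {
      'Content-Type': 'application/sparql-query',
      Accept: 'application/sparql-results+json',
      ...extraHeaders,
    },
    body: query,
  };
}

/** Execute a SELECT query, aborting the request after timeoutMs milliseconds. */
export async function executeSelect(
  endpointUrl: string,
  query: string,
  timeoutMs = 30000,
): Promise<SparqlJsonResults> {
  const { url, ...init } = buildSelectRequest(endpointUrl, query);
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetch(url, { ...init, signal: controller.signal });
    if (!response.ok) {
      // Include the response body, which usually carries the parser error
      const detail = await response.text();
      throw new Error(`SPARQL endpoint returned ${response.status}: ${detail}`);
    }
    return (await response.json()) as SparqlJsonResults;
  } finally {
    clearTimeout(timer); // always clear the timer, on success or failure
  }
}
```

Keeping request construction in a pure function means headers and body can be unit-tested without mocking `fetch`.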

Features:

  • Fetch API for HTTP requests
  • Timeout handling
  • Query type detection (SELECT/ASK/CONSTRUCT/DESCRIBE)
  • Response format negotiation (JSON for SELECT, RDF for CONSTRUCT)
  • Error handling with meaningful messages
  • Retry logic for network errors
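
Query type detection and format negotiation from the list above could be sketched as two small pure functions. The names `detectQueryType` and `acceptHeaderFor` are hypothetical, the prologue handling is deliberately naive (regex based, not a SPARQL parser), and returning N-Triples for CONSTRUCT/DESCRIBE is an assumption chosen to match the sample data format. Retry logic is not covered here.

```typescript
// Illustrative helpers; names are hypothetical, not part of the planned client API.
export type QueryType = 'SELECT' | 'ASK' | 'CONSTRUCT' | 'DESCRIBE';

/**
 * Detect the query form by skipping the prologue (BASE / PREFIX declarations).
 * Returns null for anything else, e.g. SPARQL Update requests.
 */
export function detectQueryType(query: string): QueryType | null {
  const stripped = query
    // Collapse IRIs first so a '#' inside <...> is not mistaken for a comment
    .replace(/<[^<>\s]*>/g, '<>')
    // Drop '#' comments up to the end of the line
    .replace(/#[^\n]*/g, '');
  const match =
    /^\s*(?:(?:BASE\s*<>|PREFIX\s+[^\s:]*:\s*<>)\s*)*(SELECT|ASK|CONSTRUCT|DESCRIBE)\b/i.exec(
      stripped,
    );
  const keyword = match?.[1];
  return keyword ? (keyword.toUpperCase() as QueryType) : null;
}

/** Pick an Accept header: JSON results for SELECT/ASK, RDF for CONSTRUCT/DESCRIBE. */
export function acceptHeaderFor(type: QueryType): string {
  return type === 'SELECT' || type === 'ASK'
    ? 'application/sparql-results+json'
    : 'application/n-triples'; // assumption: N-Triples, to match the sample data
}
```

With something like this in place, the Execute handler could route to `executeSelect`, `executeAsk`, and so on instead of always assuming SELECT.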

Environment Variables:

# .env.local
VITE_SPARQL_ENDPOINT=http://localhost:7878/query
VITE_SPARQL_TIMEOUT=30000
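
Because Vite exposes these variables as strings, the timeout needs parsing before use. A minimal sketch, assuming a hypothetical `resolveSparqlConfig` helper that takes the env object as a parameter (in the app this would be `import.meta.env`) and falls back to the defaults used elsewhere in this plan:

```typescript
// Hypothetical helper; the env object is passed in so it can be tested outside Vite.
interface ResolvedSparqlConfig {
  url: string;
  timeout: number;
}

export function resolveSparqlConfig(
  env: Record<string, string | undefined>,
): ResolvedSparqlConfig {
  const url = env['VITE_SPARQL_ENDPOINT']?.trim() || 'http://localhost:7878/query';
  // Number(undefined) is NaN and Number('') is 0; both fall back to the default
  const parsed = Number(env['VITE_SPARQL_TIMEOUT']);
  const timeout = Number.isFinite(parsed) && parsed > 0 ? parsed : 30000;
  return { url, timeout };
}
```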

Step 3: Build Results Table Component (2-2.5 hours)

3.1 Create Results Table Component

File: src/components/query/ResultsTable.tsx

/**
 * SPARQL Query Results Table
 * 
 * Interactive table displaying query results with:
 * - Column sorting
 * - Row selection
 * - Pagination
 * - URI link rendering
 * - Literal type display
 */

import React, { useState, useMemo } from 'react';
import { type SparqlResults, type SparqlBinding } from '../../lib/sparql/client';

export interface ResultsTableProps {
  results: SparqlResults;
  onExport?: (format: 'csv' | 'json' | 'jsonld') => void;
  maxRowsPerPage?: number;
}

export const ResultsTable: React.FC<ResultsTableProps> = ({
  results,
  onExport,
  maxRowsPerPage = 100,
}) => {
  const [currentPage, setCurrentPage] = useState(1);
  const [sortColumn, setSortColumn] = useState<string | null>(null);
  const [sortDirection, setSortDirection] = useState<'asc' | 'desc'>('asc');
  const [selectedRows, setSelectedRows] = useState<Set<number>>(new Set());

  // Column headers from SPARQL query
  const columns = results.head.vars;

  // Paginated and sorted data
  const paginatedData = useMemo(() => {
    let data = [...results.results.bindings];

    // Sort if column selected
    if (sortColumn) {
      data.sort((a, b) => {
        const aVal = a[sortColumn]?.value || '';
        const bVal = b[sortColumn]?.value || '';
        return sortDirection === 'asc' 
          ? aVal.localeCompare(bVal)
          : bVal.localeCompare(aVal);
      });
    }

    // Paginate
    const start = (currentPage - 1) * maxRowsPerPage;
    const end = start + maxRowsPerPage;
    return data.slice(start, end);
  }, [results, sortColumn, sortDirection, currentPage, maxRowsPerPage]);

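  const totalPages = Math.ceil(results.results.bindings.length / maxRowsPerPage);

  // Toggle direction when the same header is clicked again; otherwise sort ascending
  const handleSort = (column: string) => {
    if (sortColumn === column) {
      setSortDirection(d => (d === 'asc' ? 'desc' : 'asc'));
    } else {
      setSortColumn(column);
      setSortDirection('asc');
    }
    setCurrentPage(1);
  };

  // Toggle selection of a row (index is relative to the current page)
  const toggleRow = (idx: number) => {
    setSelectedRows(prev => {
      const next = new Set(prev);
      if (next.has(idx)) {
        next.delete(idx);
      } else {
        next.add(idx);
      }
      return next;
    });
  };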

  // Render cell based on binding type
  const renderCell = (binding: SparqlBinding[string] | undefined) => {
    if (!binding) return <td className="results-table__cell--empty">-</td>;

    switch (binding.type) {
      case 'uri':
        return (
          <td className="results-table__cell--uri">
            <a href={binding.value} target="_blank" rel="noopener noreferrer">
              {shortenUri(binding.value)}
            </a>
          </td>
        );
      case 'literal':
        return (
          <td className="results-table__cell--literal" title={binding.datatype || binding['xml:lang']}>
            {binding.value}
          </td>
        );
      case 'bnode':
        return (
          <td className="results-table__cell--bnode">
            {binding.value}
          </td>
        );
    }
  };

  return (
    <div className="results-table">
      {/* Toolbar */}
      <div className="results-table__toolbar">
        <div className="results-table__info">
          {results.results.bindings.length} results
        </div>
        <div className="results-table__actions">
          <button onClick={() => onExport?.('csv')}>Export CSV</button>
          <button onClick={() => onExport?.('json')}>Export JSON</button>
          <button onClick={() => onExport?.('jsonld')}>Export JSON-LD</button>
        </div>
      </div>

      {/* Table */}
      <table className="results-table__table">
        <thead>
          <tr>
            <th className="results-table__header--checkbox">
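              <input
                type="checkbox"
                checked={paginatedData.length > 0 && selectedRows.size === paginatedData.length}
                onChange={e =>
                  setSelectedRows(
                    e.target.checked ? new Set(paginatedData.map((_, i) => i)) : new Set()
                  )
                }
              />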
            </th>
            {columns.map(col => (
              <th 
                key={col}
                className="results-table__header"
                onClick={() => handleSort(col)}
              >
                {col}
                {sortColumn === col && (sortDirection === 'asc' ? '↑' : '↓')}
              </th>
            ))}
          </tr>
        </thead>
        <tbody>
          {paginatedData.map((row, idx) => (
            <tr key={idx} className={selectedRows.has(idx) ? 'selected' : ''}>
              <td>
                <input 
                  type="checkbox" 
                  checked={selectedRows.has(idx)}
                  onChange={() => toggleRow(idx)}
                />
              </td>
              {columns.map(col => (
                <React.Fragment key={col}>{renderCell(row[col])}</React.Fragment>
              ))}
            </tr>
          ))}
        </tbody>
      </table>

      {/* Pagination */}
      <div className="results-table__pagination">
        <button 
          disabled={currentPage === 1}
          onClick={() => setCurrentPage(p => p - 1)}
        >
          Previous
        </button>
        <span>Page {currentPage} of {Math.max(totalPages, 1)}</span>
        <button 
          disabled={currentPage >= totalPages}
          onClick={() => setCurrentPage(p => p + 1)}
        >
          Next
        </button>
      </div>
    </div>
  );
};

// Helper: shorten long URIs (exported because the graph conversion in Step 6 reuses it)
export function shortenUri(uri: string): string {
  const prefixes: Record<string, string> = {
    'http://schema.org/': 'schema:',
    'http://data.europa.eu/m8g/': 'cpov:',
    'http://www.w3.org/2002/07/owl#': 'owl:',
    // Add more prefixes
  };

  for (const [ns, prefix] of Object.entries(prefixes)) {
    if (uri.startsWith(ns)) {
      return uri.replace(ns, prefix);
    }
  }

  // Fallback: show the last segment after / or #
  const parts = uri.split(/[/#]/);
  return parts[parts.length - 1] || uri;
}

Styling: src/components/query/ResultsTable.css

Features:

  • Responsive table layout
  • Column sorting (click header)
  • Pagination (100 rows per page default)
  • Row selection (checkboxes)
  • URI rendering (clickable links with shortened display)
  • Literal type indicators (datatype tooltip)
  • Empty cell handling
  • Export buttons (CSV, JSON, JSON-LD)
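
The sorting and pagination inside `useMemo` could be pulled into a pure function so it can be unit-tested without rendering the component. The sketch below assumes a hypothetical `sortAndPaginate` helper; the numeric-aware comparison (`numeric: true`) is an addition to the plain `localeCompare` used in the component, so that values like "10" sort after "9".

```typescript
// Hypothetical pure helper extracted from the component's useMemo logic.
type Binding = Record<string, { type: string; value: string } | undefined>;

export function sortAndPaginate(
  bindings: readonly Binding[],
  sortColumn: string | null,
  sortDirection: 'asc' | 'desc',
  page: number, // 1-based page index
  pageSize: number,
): Binding[] {
  const rows = [...bindings]; // copy so the caller's array is not mutated
  if (sortColumn !== null) {
    const sign = sortDirection === 'asc' ? 1 : -1;
    rows.sort((a, b) => {
      const aVal = a[sortColumn]?.value ?? '';
      const bVal = b[sortColumn]?.value ?? '';
      // numeric: true compares digit runs as numbers ("9" before "10")
      return sign * aVal.localeCompare(bVal, undefined, { numeric: true });
    });
  }
  const start = (page - 1) * pageSize;
  return rows.slice(start, start + pageSize);
}
```

Inside the component this would be called as `sortAndPaginate(results.results.bindings, sortColumn, sortDirection, currentPage, maxRowsPerPage)`.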

Step 4: Export Functionality (1 hour)

4.1 Create Export Utilities

File: src/lib/sparql/export.ts

/**
 * SPARQL Results Export Utilities
 * 
 * Convert SPARQL results to various formats and trigger downloads.
 */

import { type SparqlResults, type SparqlBinding } from './client';

/**
 * Export results as CSV
 */
export function exportToCsv(results: SparqlResults): string {
  const headers = results.head.vars;
  const rows = results.results.bindings;

  // CSV header row
  const csvHeaders = headers.join(',');

  // CSV data rows
  const csvRows = rows.map(row => {
    return headers.map(col => {
      const binding = row[col];
      if (!binding) return '';
      
      // Escape per RFC 4180: double embedded quotes, and wrap the field in
      // quotes if it contains a comma, quote, or line break
      const value = binding.value.replace(/"/g, '""');
      return /[",\r\n]/.test(binding.value) ? `"${value}"` : value;
    }).join(',');
  });

  return [csvHeaders, ...csvRows].join('\n');
}

/**
 * Export results as JSON
 */
export function exportToJson(results: SparqlResults): string {
  return JSON.stringify(results, null, 2);
}

/**
 * Export results as JSON-LD (if applicable)
 */
export function exportToJsonLd(results: SparqlResults): string {
  // Convert SPARQL results to JSON-LD format
  // This requires mapping bindings to @graph structure
  const graph = results.results.bindings.map(binding => {
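    const node: Record<string, unknown> = {};
    for (const [key, value] of Object.entries(binding)) {
      if (value.type === 'uri') {
        node[key] = { '@id': value.value };
      } else if (value.type === 'bnode') {
        // Blank node identifiers carry a "_:" prefix in JSON-LD
        node[key] = { '@id': `_:${value.value}` };
      } else if (value['xml:lang']) {
        // Keep the language tag instead of flattening to a plain string
        node[key] = { '@value': value.value, '@language': value['xml:lang'] };
      } else if (value.datatype) {
        node[key] = { '@value': value.value, '@type': value.datatype };
      } else {
        node[key] = value.value;
      }
    }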
    return node;
  });

  return JSON.stringify({
    '@context': {
      'schema': 'http://schema.org/',
      'cpov': 'http://data.europa.eu/m8g/',
    },
    '@graph': graph,
  }, null, 2);
}

/**
 * Trigger browser download
 */
export function downloadFile(content: string, filename: string, mimeType: string): void {
  const blob = new Blob([content], { type: mimeType });
  const url = URL.createObjectURL(blob);
  
  const link = document.createElement('a');
  link.href = url;
  link.download = filename;
  link.click();

  // Revoke on the next tick so the browser has a chance to start the download
  setTimeout(() => URL.revokeObjectURL(url), 0);
}

/**
 * Export and download results
 */
export function exportResults(
  results: SparqlResults,
  format: 'csv' | 'json' | 'jsonld',
  queryName?: string
): void {
  const timestamp = new Date().toISOString().split('T')[0];
  const baseName = queryName || 'sparql-results';

  let content: string;
  let filename: string;
  let mimeType: string;

  switch (format) {
    case 'csv':
      content = exportToCsv(results);
      filename = `${baseName}-${timestamp}.csv`;
      mimeType = 'text/csv';
      break;
    case 'json':
      content = exportToJson(results);
      filename = `${baseName}-${timestamp}.json`;
      mimeType = 'application/json';
      break;
    case 'jsonld':
      content = exportToJsonLd(results);
      filename = `${baseName}-${timestamp}.jsonld`;
      mimeType = 'application/ld+json';
      break;
  }

  downloadFile(content, filename, mimeType);
}
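
The CSV quoting rules are the part of the export most likely to go wrong, so a field-level helper that can be tested in isolation may be worth having. A minimal sketch following RFC 4180, with hypothetical names `escapeCsvField` and `toCsvRow`:

```typescript
// Hypothetical helpers; not part of the export module as planned above.

/** Quote a field if it contains a comma, a double quote, or a line break. */
export function escapeCsvField(value: string): string {
  const needsQuoting = /[",\r\n]/.test(value);
  const escaped = value.replace(/"/g, '""'); // embedded quotes are doubled
  return needsQuoting ? `"${escaped}"` : escaped;
}

/** Join already-stringified cells into one CSV row. */
export function toCsvRow(cells: readonly string[]): string {
  return cells.map(escapeCsvField).join(',');
}
```

For example, `toCsvRow(['Rijksmuseum', 'Museumstraat 1, Amsterdam'])` quotes only the second cell, because only that cell contains a comma.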

Step 5: Integrate with Query Builder Page (1.5 hours)

5.1 Update QueryBuilderPage Component

Add state for query execution:

// Add to QueryBuilderPage.tsx

import { defaultSparqlClient, type SparqlResults } from '../lib/sparql/client';
import { ResultsTable } from '../components/query/ResultsTable';
import { exportResults } from '../lib/sparql/export';

// Add state
const [isExecuting, setIsExecuting] = useState(false);
const [queryResults, setQueryResults] = useState<SparqlResults | null>(null);
const [queryError, setQueryError] = useState<string | null>(null);
const [executionTime, setExecutionTime] = useState<number | null>(null);

// Execute query handler
const handleExecuteQuery = async () => {
  if (!validationResult?.isValid) return;

  setIsExecuting(true);
  setQueryError(null);
  setQueryResults(null);

  const startTime = performance.now();

  try {
    const results = await defaultSparqlClient.executeSelect(sparqlQuery);
    const endTime = performance.now();
    
    setQueryResults(results);
    setExecutionTime(endTime - startTime);
  } catch (error) {
    setQueryError(error instanceof Error ? error.message : 'Query execution failed');
  } finally {
    setIsExecuting(false);
  }
};

// Export handler
const handleExport = (format: 'csv' | 'json' | 'jsonld') => {
  if (!queryResults) return;
  exportResults(queryResults, format, queryName || undefined);
};

Add Results section to JSX:

{/* Query Results */}
{queryResults && (
  <div className="query-builder-page__results">
    <div className="query-builder-page__results-header">
      <h3>Query Results</h3>
      <div className="query-builder-page__results-stats">
        {queryResults.results.bindings.length} results in {executionTime?.toFixed(2)}ms
      </div>
    </div>
    <ResultsTable
      results={queryResults}
      onExport={handleExport}
    />
  </div>
)}

{/* Query Error */}
{queryError && (
  <div className="query-builder-page__error">
    <h3>Execution Error</h3>
    <p>{queryError}</p>
  </div>
)}

Update Execute button:

<button
  className="query-builder-page__action-btn query-builder-page__action-btn--success"
  onClick={handleExecuteQuery}
  disabled={!validationResult?.isValid || isExecuting}
>
  {isExecuting ? 'Executing...' : 'Execute Query'}
</button>

Step 6: GraphContext Integration (1 hour)

6.1 Populate Graph from Query Results

Option A: Convert SPARQL results to graph nodes/links

// In GraphContext or new hook
function sparqlResultsToGraph(results: SparqlResults): GraphData {
  const nodes = new Map<string, Node>();
  const links: Link[] = [];

  // Analyze query to extract subject-predicate-object patterns
  // This requires parsing SPARQL SELECT to understand variable roles
  
  // Simplified approach: assume ?s ?p ?o pattern
  results.results.bindings.forEach(binding => {
    if (binding.s && binding.o) {
      // Add subject node
      if (!nodes.has(binding.s.value)) {
        nodes.set(binding.s.value, {
          id: binding.s.value,
          label: shortenUri(binding.s.value),
          type: 'resource',
        });
      }

      // Add object node (if URI)
      if (binding.o.type === 'uri' && !nodes.has(binding.o.value)) {
        nodes.set(binding.o.value, {
          id: binding.o.value,
          label: shortenUri(binding.o.value),
          type: 'resource',
        });
      }

      // Add link
      if (binding.p && binding.o.type === 'uri') {
        links.push({
          source: binding.s.value,
          target: binding.o.value,
          label: shortenUri(binding.p.value),
        });
      }
    }
  });

  return {
    nodes: Array.from(nodes.values()),
    links,
  };
}

Option B: Store results separately and allow user to visualize selected results
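
Both options need a little glue. Option A only makes sense when the result set actually has `s`, `p`, and `o` columns, and Option B needs a way to narrow the results to the rows the user selected. A minimal sketch of both, assuming hypothetical helpers `hasTripleShape` and `pickSelectedRows` and row indices relative to the full bindings array:

```typescript
// Hypothetical helpers; type and function names are illustrative only.
interface TripleResults {
  head: { vars: string[] };
  results: { bindings: Array<Record<string, { type: string; value: string }>> };
}

/** True when the query projected ?s, ?p and ?o, as the Option A conversion assumes. */
export function hasTripleShape(results: TripleResults): boolean {
  const vars = new Set(results.head.vars);
  return vars.has('s') && vars.has('p') && vars.has('o');
}

/** Keep only the rows whose index is in the selection (Option B). */
export function pickSelectedRows(
  results: TripleResults,
  selected: ReadonlySet<number>,
): TripleResults {
  return {
    head: results.head,
    results: {
      bindings: results.results.bindings.filter((_, index) => selected.has(index)),
    },
  };
}
```

The filtered result can then be passed to the Option A conversion, so only the selected rows end up in the graph.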


Step 7: Testing (1 hour)

7.1 Unit Tests for SPARQL Client

File: tests/unit/sparql-client.test.ts

import { describe, it, expect, vi, beforeEach } from 'vitest';
import { SparqlClient } from '../../src/lib/sparql/client';

describe('SparqlClient', () => {
  let client: SparqlClient;

  beforeEach(() => {
    client = new SparqlClient({ url: 'http://localhost:7878/query' });
  });

  it('should execute SELECT query', async () => {
    // Mock fetch response
    global.fetch = vi.fn().mockResolvedValue({
      ok: true,
      json: async () => ({
        head: { vars: ['s', 'p', 'o'] },
        results: { bindings: [] },
      }),
    });

    const results = await client.executeSelect('SELECT * WHERE { ?s ?p ?o } LIMIT 10');
    expect(results.head.vars).toEqual(['s', 'p', 'o']);
  });

  it('should handle network errors', async () => {
    global.fetch = vi.fn().mockRejectedValue(new Error('Network error'));

    await expect(
      client.executeSelect('SELECT * WHERE { ?s ?p ?o }')
    ).rejects.toThrow('Network error');
  });

  it('should handle HTTP errors', async () => {
    global.fetch = vi.fn().mockResolvedValue({
      ok: false,
      status: 500,
      statusText: 'Internal Server Error',
    });

    await expect(
      client.executeSelect('SELECT * WHERE { ?s ?p ?o }')
    ).rejects.toThrow();
  });

  // Add more tests...
});

7.2 Integration Tests

File: tests/integration/query-execution.test.tsx

Test full flow:

  1. Load query template
  2. Validate query
  3. Execute query (mocked endpoint)
  4. Display results
  5. Export results

Step 8: Sample Data Generation (30 min)

8.1 Create Sample Heritage Institution Data

File: frontend/data/sample-rdf/heritage-institutions.nt

# Sample N-Triples data for testing

# Amsterdam Museum
<https://w3id.org/heritage/custodian/nl/amsterdam-museum> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Museum> .
<https://w3id.org/heritage/custodian/nl/amsterdam-museum> <http://schema.org/name> "Amsterdam Museum" .
<https://w3id.org/heritage/custodian/nl/amsterdam-museum> <http://schema.org/address> <https://w3id.org/heritage/custodian/nl/amsterdam-museum/address> .

# Address
<https://w3id.org/heritage/custodian/nl/amsterdam-museum/address> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/PostalAddress> .
<https://w3id.org/heritage/custodian/nl/amsterdam-museum/address> <http://schema.org/addressLocality> "Amsterdam" .
<https://w3id.org/heritage/custodian/nl/amsterdam-museum/address> <http://schema.org/addressCountry> "NL" .

# Rijksmuseum
<https://w3id.org/heritage/custodian/nl/rijksmuseum> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Museum> .
<https://w3id.org/heritage/custodian/nl/rijksmuseum> <http://schema.org/name> "Rijksmuseum" .
<https://w3id.org/heritage/custodian/nl/rijksmuseum> <http://schema.org/url> <https://www.rijksmuseum.nl> .

# ... more institutions

Or generate programmatically from existing GLAM data:

# Convert LinkML instances to N-Triples
python scripts/export_linkml_to_ntriples.py \
  --input data/instances/ \
  --output frontend/data/sample-rdf/heritage-institutions.nt
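
If the test data is generated from TypeScript instead, each line has to follow the N-Triples syntax shown above: IRIs in angle brackets, literals in double quotes with backslash escapes, and a terminating ` .`. The sketch below assumes hypothetical names (`Term`, `toNTriplesLine`) and does not validate IRIs or handle datatyped literals.

```typescript
// Hypothetical serializer for generating small test datasets.
type Term =
  | { kind: 'iri'; value: string }
  | { kind: 'literal'; value: string; language?: string };

/** Escape the characters that N-Triples does not allow raw inside a literal. */
function escapeLiteral(value: string): string {
  return value
    .replace(/\\/g, '\\\\') // backslash first, so later escapes are not doubled
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n')
    .replace(/\r/g, '\\r');
}

function termToNt(term: Term): string {
  if (term.kind === 'iri') return `<${term.value}>`;
  const lang = term.language ? `@${term.language}` : '';
  return `"${escapeLiteral(term.value)}"${lang}`;
}

/** Serialize one triple as a single N-Triples line. */
export function toNTriplesLine(subject: string, predicate: string, object: Term): string {
  return `<${subject}> <${predicate}> ${termToNt(object)} .`;
}
```

Calling `toNTriplesLine('https://w3id.org/heritage/custodian/nl/rijksmuseum', 'http://schema.org/name', { kind: 'literal', value: 'Rijksmuseum' })` reproduces the Rijksmuseum name triple from the sample file.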

Files Summary

New Files to Create

frontend/
├── scripts/
│   ├── start-oxigraph.sh              (Bash script)
│   └── load-sample-data.sh            (Bash script)
├── data/
│   ├── sample-rdf/
│   │   └── heritage-institutions.nt   (Sample RDF data)
│   └── oxigraph-db/                   (Database directory, gitignored)
├── src/
│   ├── lib/
│   │   └── sparql/
│   │       ├── client.ts              (~200 lines)
│   │       └── export.ts              (~150 lines)
│   └── components/
│       └── query/
│           ├── ResultsTable.tsx       (~300 lines)
│           └── ResultsTable.css       (~150 lines)
└── tests/
    ├── unit/
    │   ├── sparql-client.test.ts      (~150 lines)
    │   └── sparql-export.test.ts      (~100 lines)
    └── integration/
        └── query-execution.test.tsx   (~200 lines)

Total New Code: ~1,250 lines (production + tests)


Dependencies

New Dependencies

# None required! Using native Fetch API
# Oxigraph is standalone binary, not a Node.js package

Environment Variables

# .env.local
VITE_SPARQL_ENDPOINT=http://localhost:7878/query
VITE_SPARQL_TIMEOUT=30000

Testing Strategy

Manual Testing Checklist

  1. Oxigraph Setup

    • Oxigraph binary installed
    • Server starts successfully
    • Sample data loads
    • Test query executes via curl
  2. HTTP Client

    • SELECT query returns results
    • ASK query returns boolean
    • Network errors handled
    • Timeout works
  3. Results Table

    • Table displays results
    • Column sorting works
    • Pagination works
    • URIs are clickable
    • Row selection works
  4. Export Functionality

    • CSV export downloads
    • JSON export downloads
    • JSON-LD export downloads
    • Filenames include timestamp
  5. Integration

    • Execute button enables when valid
    • Execution shows loading state
    • Results appear after execution
    • Errors display clearly
    • Execution time shown

Automated Testing

  • 15+ unit tests for SPARQL client
  • 10+ unit tests for export utilities
  • 5+ integration tests for full flow
  • All existing tests still pass (148+)

Timeline

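| Step | Task                         | Est. Time | Priority    |
|------|------------------------------|-----------|-------------|
| 1    | Install Oxigraph + load data | 1-1.5h    | 🔥 Critical |
| 2    | Implement HTTP client        | 1h        | 🔥 Critical |
| 3    | Build results table          | 2-2.5h    | 🔥 Critical |
| 4    | Export functionality         | 1h        | 🔴 High     |
| 5    | Integrate with page          | 1.5h      | 🔥 Critical |
| 6    | GraphContext integration     | 1h        | 🟡 Medium   |
| 7    | Testing                      | 1h        | 🔴 High     |
| 8    | Sample data                  | 0.5h      | 🔴 High     |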

Total: 9-10 hours (sum of the step estimates, before any extra debugging time)


Success Criteria

Task 7 Complete When:

  • Oxigraph server running locally
  • Sample heritage institution data loaded
  • SPARQL queries execute successfully
  • Results display in interactive table
  • Export to CSV/JSON/JSON-LD works
  • 30+ new tests passing (15+ client, 10+ export, 5+ integration)
  • Documentation complete
  • Phase 3 = 100% complete

Next Session Starting Point

Begin with: Step 1 - Install Oxigraph and load sample data

First Commands:

# Install Oxigraph (skip if already installed)
brew install oxigraph

# Start server
oxigraph_server --location frontend/data/oxigraph-db --bind 127.0.0.1:7878

# Test connectivity with a trivial ASK query
curl -G --data-urlencode 'query=ASK {}' http://localhost:7878/query

Created: November 22, 2025
Status: Ready to begin
Blocked By: None (Task 6 complete)
Estimated Completion: 9-10 hours from start