glam/schemas/20251121/linkml/modules/classes/SearchAPI.yaml
2025-12-14 17:09:55 +01:00

# Search API Class
# Represents REST/JSON search endpoints for heritage collection discovery
#
# Search APIs provide programmatic access to collection search functionality,
# enabling developers to build custom search interfaces and integrate heritage
# data into applications.
id: https://nde.nl/ontology/hc/class/SearchAPI
name: search_api
title: SearchAPI Class
prefixes:
linkml: https://w3id.org/linkml/
hc: https://nde.nl/ontology/hc/
dcat: http://www.w3.org/ns/dcat#
dcterms: http://purl.org/dc/terms/
schema: http://schema.org/
hydra: http://www.w3.org/ns/hydra/core#
xsd: http://www.w3.org/2001/XMLSchema#
imports:
- linkml:types
- ../metadata
- ./DataServiceEndpoint
classes:
SearchAPI:
is_a: DataServiceEndpoint
class_uri: hc:SearchAPI
description: |
REST/JSON search API endpoint for heritage collection discovery.
**Purpose:**
Models search APIs that provide programmatic access to collection search
functionality. These APIs enable:
- Custom search interface development
- Integration with aggregation platforms
- Automated data discovery
- Faceted search and filtering
**Common Patterns:**
Heritage search APIs typically support:
- **Full-text search**: Query across all fields
- **Field-specific search**: Query specific metadata fields
- **Faceted search**: Filter by category, date, type, etc.
- **Pagination**: Navigate large result sets
- **Sorting**: Order results by relevance, date, etc.
**Example - Nationaal Archief Search API:**
```yaml
search_api:
  endpoint_name: "Nationaal Archief Search API"
  endpoint_url: "https://www.nationaalarchief.nl/onderzoeken/api/zoeken"
  protocol: REST
  query_parameters:
    - name: "q"
      type: "string"
      description: "Full-text search query"
    - name: "from"
      type: "integer"
      description: "Pagination offset"
    - name: "size"
      type: "integer"
      description: "Results per page"
  pagination_method: OFFSET_LIMIT
  max_results_per_page: 100
  response_format: JSON
  supports_facets: true
  facet_fields: ["type", "periode", "archief"]
```
**Response Structure:**
Most heritage search APIs return JSON with:
- `total`: Total number of matching records
- `results`/`items`/`records`: Array of result objects
- `facets`: Aggregation counts for filtering
- `pagination`: Links or cursors for paging
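**Response sketch (illustrative only):**
The shape above might look like the following, rendered as YAML for readability.
The record fields (`id`, `title`), the facet values, and the `api.example.org`
URL are hypothetical; real APIs differ in naming and nesting.

```yaml
# Hypothetical response body; the key names follow the list above.
total: 1234
results:
  - id: "rec-001"
    title: "Example record"
  - id: "rec-002"
    title: "Another example record"
facets:
  type:            # counts per facet value
    image: 800
    document: 434
pagination:
  next: "https://api.example.org/search?q=foto&from=20&size=20"
```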
**See Also:**
- OpenSearch: https://opensearch.org/
- Hydra API vocabulary: https://www.hydra-cg.com/
attributes:
search_url:
slot_uri: dcat:endpointURL
description: |
Search endpoint URL.
The base URL for search requests. Query parameters are appended.
Example: "https://www.nationaalarchief.nl/onderzoeken/api/zoeken"
range: uri
required: true
query_parameters:
slot_uri: hydra:mapping
description: |
Query parameters supported by this search API.
Documents the available search parameters, their types, and usage.
Example:
```yaml
query_parameters:
  - name: "q"
    type: "string"
    required: true
    description: "Full-text query"
  - name: "type"
    type: "string"
    description: "Filter by record type"
```
range: SearchQueryParameter
multivalued: true
inlined_as_list: true
http_method:
slot_uri: hydra:method
description: |
HTTP method(s) supported for search requests.
Values:
- GET: Query parameters in URL (most common)
- POST: Query in request body (for complex queries)
- BOTH: Supports both methods
range: HTTPMethodEnum
pagination_method:
slot_uri: hydra:pageIndex
description: |
Pagination method used by this API.
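Values:
- OFFSET_LIMIT: Uses offset/start and limit/size parameters
- PAGE_NUMBER: Uses page number and page size
- CURSOR: Uses cursor/token for stateful pagination
- LINK_HEADER: Uses HTTP Link headers (RFC 8288, which obsoletes RFC 5988)
- SCROLL: Uses a scroll/scan API that keeps a search context open (common in Elasticsearch-based APIs)
- NONE: No pagination (returns all results)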
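Example (a sketch, not taken from a real API): a page-number based endpoint
could be described as follows. The URL, the parameter names `page` and
`per_page`, and the 1-based numbering are assumptions made for illustration.

```yaml
search_api:
  search_url: "https://api.example.org/search"   # hypothetical endpoint
  pagination_method: PAGE_NUMBER
  default_results_per_page: 20
  max_results_per_page: 100
  query_parameters:
    - name: "page"
      type: "integer"
      description: "Page number (assumed to start at 1)"
      default_value: "1"
    - name: "per_page"
      type: "integer"
      description: "Results per page"
      default_value: "20"
```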
range: PaginationMethodEnum
max_results_per_page:
slot_uri: schema:maxValue
description: |
Maximum number of results per page.
The API may return fewer results, but never more.
Example: 100
range: integer
default_results_per_page:
slot_uri: schema:defaultValue
description: |
Default number of results returned per page if not specified.
Example: 20
range: integer
total_records:
slot_uri: schema:numberOfItems
description: |
Total number of searchable records (approximate).
Example: 15000000
range: integer
response_format:
slot_uri: dcterms:format
description: |
Primary response format.
Most modern APIs use JSON.
Values: JSON, XML, JSON_LD, HTML, CSV, ATOM
range: SearchResponseFormatEnum
supports_facets:
slot_uri: schema:additionalProperty
description: |
Whether the API supports faceted search.
Faceted search allows filtering by categories (type, date, location, etc.)
with counts for each facet value.
range: boolean
facet_fields:
slot_uri: hydra:variable
description: |
Fields available for faceted filtering.
Example: ["type", "periode", "archief", "toegang"]
range: string
multivalued: true
supports_sorting:
slot_uri: schema:additionalProperty
description: |
Whether the API supports custom sorting of results.
range: boolean
sort_fields:
slot_uri: hydra:variable
description: |
Fields available for sorting.
Example: ["relevance", "date", "title", "created"]
range: string
multivalued: true
supports_boolean_operators:
slot_uri: schema:additionalProperty
description: |
Whether the API supports boolean operators (AND, OR, NOT) in queries.
range: boolean
supports_phrase_search:
slot_uri: schema:additionalProperty
description: |
Whether the API supports phrase search (exact match using quotes).
range: boolean
supports_wildcards:
slot_uri: schema:additionalProperty
description: |
Whether the API supports wildcard characters (* or ?).
range: boolean
supports_field_search:
slot_uri: schema:additionalProperty
description: |
Whether the API supports field-specific search (e.g., title:painting).
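Example (a sketch): the query-syntax flags are usually read together. The
instance below assumes a Lucene-style query syntax; the field names `title`
and `maker` and the query itself are hypothetical, and the query is shown
without URL encoding for readability.

```yaml
search_api:
  supports_boolean_operators: true   # AND, OR, NOT
  supports_phrase_search: true       # "de nachtwacht"
  supports_wildcards: true           # rembr*
  supports_field_search: true        # title:..., maker:...
  example_query: '/api/zoeken?q=title:"de nachtwacht" AND maker:rembr*'
```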
range: boolean
date_filter_format:
slot_uri: dcterms:temporal
description: |
Expected format for date filters.
Example: "YYYY-MM-DD", "ISO8601", "Unix timestamp"
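Example (a sketch): a date filter documented together with its parameters.
The parameter names `date_from` and `date_to` are hypothetical.

```yaml
search_api:
  date_filter_format: "YYYY-MM-DD"
  query_parameters:
    - name: "date_from"              # hypothetical parameter name
      type: "date"
      description: "Earliest record date (inclusive)"
      example_value: "1900-01-01"
    - name: "date_to"                # hypothetical parameter name
      type: "date"
      description: "Latest record date (inclusive)"
      example_value: "1945-12-31"
```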
range: string
result_schema_url:
slot_uri: schema:encodingFormat
description: |
URL to JSON Schema or documentation describing the result format.
Example: "https://api.example.org/schema/search-result.json"
range: uri
opensearch_description_url:
slot_uri: schema:potentialAction
description: |
URL to OpenSearch description document (if applicable).
Example: "https://example.org/opensearch.xml"
range: uri
example_query:
slot_uri: schema:workExample
description: |
Example search query demonstrating API usage.
Example: "/api/zoeken?q=foto&type=image&size=10"
range: string
slot_usage:
protocol:
description: |
Protocol for search APIs.
Typically REST, but may be GRAPHQL or OPENSEARCH.
response_formats:
description: |
Response formats supported.
Common: ["application/json"]
comments:
- "Primary interface for programmatic collection discovery"
- "Most heritage institutions expose REST/JSON search APIs"
- "Consider rate limits and pagination for large-scale harvesting"
see_also:
- "https://opensearch.org/"
- "https://www.hydra-cg.com/spec/latest/core/"
SearchQueryParameter:
class_uri: hydra:IriTemplateMapping
description: |
Describes a query parameter supported by a search API.
Documents the parameter name, type, whether it's required,
and its purpose.
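Example (a sketch using every attribute of this class; the values are
illustrative and do not describe a real API):

```yaml
query_parameters:
  - name: "type"
    type: "string"
    required: false
    description: "Filter by record type"
    default_value: "document"
    allowed_values: ["image", "document", "audio", "video"]
    example_value: "image"
```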
attributes:
name:
slot_uri: hydra:variable
description: |
Parameter name as used in the query string.
Example: "q", "type", "from", "size"
range: string
required: true
type:
slot_uri: hydra:property
description: |
Data type of the parameter value.
Values: string, integer, boolean, date, array
range: string
required:
slot_uri: hydra:required
description: |
Whether this parameter is required.
range: boolean
description:
slot_uri: dcterms:description
description: |
Human-readable description of the parameter.
Example: "Full-text search query"
range: string
default_value:
slot_uri: schema:defaultValue
description: |
Default value if not specified.
Example: "10" for a size parameter
range: string
allowed_values:
slot_uri: schema:valueReference
description: |
List of allowed values (for enumerated parameters).
Example: ["image", "document", "audio", "video"]
range: string
multivalued: true
example_value:
slot_uri: schema:workExample
description: |
Example value for this parameter.
Example: "amsterdam museum"
range: string
enums:
PaginationMethodEnum:
description: |
Methods for paginating large result sets.
permissible_values:
OFFSET_LIMIT:
description: |
Uses offset (or start/from) and limit (or size/count) parameters.
Example: ?offset=100&limit=20
Pros: Simple, allows jumping to any page
Cons: Inconsistent results if data changes between requests
PAGE_NUMBER:
description: |
Uses page number and page size parameters.
Example: ?page=5&per_page=20
Equivalent to OFFSET_LIMIT with offset = (page - 1) * per_page when pages
are numbered from 1 (page=5, per_page=20 corresponds to offset=80).
CURSOR:
description: |
Uses opaque cursor/token for stateful pagination.
Example: ?cursor=eyJsYXN0X2lkIjogMTIzfQ==
Pros: Consistent results, efficient for large datasets
Cons: Cannot jump to arbitrary page
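Example (a sketch): describing a cursor-paginated endpoint. The parameter name
`cursor` and the rule that it is omitted on the first request are assumptions;
check the target API's documentation.

```yaml
search_api:
  pagination_method: CURSOR
  query_parameters:
    - name: "cursor"                 # hypothetical parameter name
      type: "string"
      required: false
      description: "Opaque token taken from the previous response; omitted on the first request"
```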
LINK_HEADER:
description: |
Uses HTTP Link headers for pagination (RFC 8288, which obsoletes RFC 5988).
Response includes: Link: <url>; rel="next", <url>; rel="prev"
RESTful approach following HATEOAS principles.
SCROLL:
description: |
Uses scroll/scan API for deep pagination.
Common in Elasticsearch-based APIs.
Maintains search context for consistent results.
NONE:
description: |
No pagination - returns all results at once.
Only suitable for small result sets.
HTTPMethodEnum:
description: |
HTTP methods supported for API requests.
permissible_values:
GET:
description: |
HTTP GET method.
Query parameters in URL. Most common for search APIs.
POST:
description: |
HTTP POST method.
Query in request body. Used for complex queries.
BOTH:
description: |
Supports both GET and POST methods.
SearchResponseFormatEnum:
description: |
Response formats for search API results.
permissible_values:
JSON:
description: |
JSON (application/json).
Most common format for modern APIs.
XML:
description: |
XML (application/xml or text/xml).
Legacy format, still used in some heritage APIs.
JSON_LD:
description: |
JSON-LD (application/ld+json).
JSON with linked data semantics.
HTML:
description: |
HTML (text/html).
Human-readable results page (not ideal for API use).
CSV:
description: |
CSV (text/csv).
Tabular export format.
ATOM:
description: |
Atom feed (application/atom+xml).
Syndication format.