# Search API Class
# Represents REST/JSON search endpoints for heritage collection discovery
#
# Search APIs provide programmatic access to collection search functionality,
# enabling developers to build custom search interfaces and integrate heritage
# data into applications.

id: https://nde.nl/ontology/hc/class/SearchAPI
name: search_api
title: SearchAPI Class

prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  dcat: http://www.w3.org/ns/dcat#
  dcterms: http://purl.org/dc/terms/
  schema: http://schema.org/
  hydra: http://www.w3.org/ns/hydra/core#
  xsd: http://www.w3.org/2001/XMLSchema#

imports:
  - linkml:types
  - ../metadata
  - ./DataServiceEndpoint

classes:
  SearchAPI:
    is_a: DataServiceEndpoint
    class_uri: hc:SearchAPI
    description: |
      REST/JSON search API endpoint for heritage collection discovery.

      **Purpose:**

      Models search APIs that provide programmatic access to collection search
      functionality. These APIs enable:
      - Custom search interface development
      - Integration with aggregation platforms
      - Automated data discovery
      - Faceted search and filtering

      **Common Patterns:**

      Heritage search APIs typically support:
      - **Full-text search**: Query across all fields
      - **Field-specific search**: Query specific metadata fields
      - **Faceted search**: Filter by category, date, type, etc.
      - **Pagination**: Navigate large result sets
      - **Sorting**: Order results by relevance, date, etc.

      **Example - Nationaal Archief Search API:**

      ```yaml
      search_api:
        endpoint_name: "Nationaal Archief Search API"
        endpoint_url: "https://www.nationaalarchief.nl/onderzoeken/api/zoeken"
        protocol: REST
        query_parameters:
          - name: "q"
            type: "string"
            description: "Full-text search query"
          - name: "from"
            type: "integer"
            description: "Pagination offset"
          - name: "size"
            type: "integer"
            description: "Results per page"
        pagination_method: OFFSET_LIMIT
        max_results_per_page: 100
        response_format: JSON
        supports_facets: true
        facet_fields: ["type", "periode", "archief"]
      ```

      **Response Structure:**

      Most heritage search APIs return JSON with:
      - `total`: Total number of matching records
      - `results`/`items`/`records`: Array of result objects
      - `facets`: Aggregation counts for filtering
      - `pagination`: Links or cursors for paging

      **See Also:**

      - OpenSearch: https://opensearch.org/
      - Hydra API vocabulary: https://www.hydra-cg.com/

    attributes:
      search_url:
        slot_uri: dcat:endpointURL
        description: |
          Search endpoint URL.

          The base URL for search requests. Query parameters are appended to this URL.

          Example: "https://www.nationaalarchief.nl/onderzoeken/api/zoeken"
        range: uri
        required: true

      query_parameters:
        slot_uri: hydra:mapping
        description: |
          Query parameters supported by this search API.

          Documents the available search parameters, their types, and usage.

          Example:
          ```yaml
          query_parameters:
            - name: "q"
              type: "string"
              required: true
              description: "Full-text query"
            - name: "type"
              type: "string"
              description: "Filter by record type"
          ```
        range: SearchQueryParameter
        multivalued: true
        inlined_as_list: true

      http_method:
        slot_uri: hydra:method
        description: |
          HTTP method(s) supported for search requests.

          Values:
          - GET: Query parameters in URL (most common)
          - POST: Query in request body (for complex queries)
          - BOTH: Supports both methods
        range: HTTPMethodEnum

      pagination_method:
        slot_uri: hydra:pageIndex
        description: |
          Pagination method used by this API.

          Values:
          - OFFSET_LIMIT: Uses offset/start and limit/size parameters
          - PAGE_NUMBER: Uses page number and page size
          - CURSOR: Uses cursor/token for stateful pagination
          - LINK_HEADER: Uses HTTP Link headers (RFC 8288, which obsoletes RFC 5988)
          - SCROLL: Uses a scroll/scan API for deep pagination
          - NONE: No pagination (returns all results)
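
          As an illustration only, a page-numbered API might be described
          as follows. The endpoint name and the `page` and `per_page`
          parameter names are assumptions, not taken from a real service:
          ```yaml
          search_api:
            endpoint_name: "Example Collection Search"  # hypothetical
            pagination_method: PAGE_NUMBER
            query_parameters:
              - name: "page"       # assumed parameter name
                type: "integer"
                description: "Page number"
              - name: "per_page"   # assumed parameter name
                type: "integer"
                description: "Results per page"
          ```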
        range: PaginationMethodEnum

      max_results_per_page:
        slot_uri: schema:maxValue
        description: |
          Maximum number of results per page.

          The API may return fewer results, but never more.

          Example: 100
        range: integer

      default_results_per_page:
        slot_uri: schema:defaultValue
        description: |
          Default number of results returned per page if not specified.

          Example: 20
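
          A sketch of how the two page-size slots might be recorded
          together. The endpoint name is hypothetical; the values reuse
          the slot examples in this schema:
          ```yaml
          search_api:
            endpoint_name: "Example Collection Search"  # hypothetical
            default_results_per_page: 20   # applied when the request gives no size
            max_results_per_page: 100      # upper bound on any single page
          ```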
        range: integer

      total_records:
        slot_uri: schema:numberOfItems
        description: |
          Total number of searchable records (approximate).

          Example: 15000000
        range: integer

      response_format:
        slot_uri: dcterms:format
        description: |
          Primary response format.

          Most modern APIs use JSON.

          Values: JSON, XML, JSON_LD, HTML, CSV, ATOM
        range: SearchResponseFormatEnum

      supports_facets:
        slot_uri: schema:additionalProperty
        description: |
          Whether the API supports faceted search.

          Faceted search allows filtering by categories (type, date, location, etc.)
          with counts for each facet value.
        range: boolean

      facet_fields:
        slot_uri: hydra:variable
        description: |
          Fields available for faceted filtering.

          Example: ["type", "periode", "archief", "toegang"]
        range: string
        multivalued: true

      supports_sorting:
        slot_uri: schema:additionalProperty
        description: |
          Whether the API supports custom sorting of results.
        range: boolean

      sort_fields:
        slot_uri: hydra:variable
        description: |
          Fields available for sorting.

          Example: ["relevance", "date", "title", "created"]
        range: string
        multivalued: true

      supports_boolean_operators:
        slot_uri: schema:additionalProperty
        description: |
          Whether the API supports boolean operators (AND, OR, NOT) in queries.
        range: boolean

      supports_phrase_search:
        slot_uri: schema:additionalProperty
        description: |
          Whether the API supports phrase search (exact match using quotes).
        range: boolean

      supports_wildcards:
        slot_uri: schema:additionalProperty
        description: |
          Whether the API supports wildcard characters (* or ?).
        range: boolean

      supports_field_search:
        slot_uri: schema:additionalProperty
        description: |
          Whether the API supports field-specific search (e.g., title:painting).
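
          A sketch of how the query-syntax flags might be combined for a
          single endpoint. The endpoint name, flag values, and query path
          are hypothetical:
          ```yaml
          search_api:
            endpoint_name: "Example Collection Search"  # hypothetical
            supports_boolean_operators: true
            supports_phrase_search: true
            supports_wildcards: false
            supports_field_search: true
            example_query: "/search?q=title:painting"   # assumed path and syntax
          ```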
        range: boolean

      date_filter_format:
        slot_uri: dcterms:temporal
        description: |
          Expected format for date filters.

          Example: "YYYY-MM-DD", "ISO8601", "Unix timestamp"
        range: string

      result_schema_url:
        slot_uri: schema:encodingFormat
        description: |
          URL to JSON Schema or documentation describing the result format.

          Example: "https://api.example.org/schema/search-result.json"
        range: uri

      opensearch_description_url:
        slot_uri: schema:potentialAction
        description: |
          URL to OpenSearch description document (if applicable).

          Example: "https://example.org/opensearch.xml"
        range: uri

      example_query:
        slot_uri: schema:workExample
        description: |
          Example search query demonstrating API usage.

          Example: "/api/zoeken?q=foto&type=image&size=10"
        range: string

    slot_usage:
      protocol:
        description: |
          Protocol for search APIs.
          Typically REST, but may be GRAPHQL or OPENSEARCH.

      response_formats:
        description: |
          Response formats supported.
          Common: ["application/json"]
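
          A sketch of how this inherited list might sit next to the
          single-valued `response_format` slot. It assumes that
          `response_formats` holds media-type strings, as the "Common"
          value suggests; the second media type is illustrative:
          ```yaml
          search_api:
            protocol: REST
            response_format: JSON   # primary format (SearchResponseFormatEnum)
            response_formats: ["application/json", "application/ld+json"]
          ```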

    comments:
      - "Primary interface for programmatic collection discovery"
      - "Most heritage institutions expose REST/JSON search APIs"
      - "Consider rate limits and pagination for large-scale harvesting"

    see_also:
      - "https://opensearch.org/"
      - "https://www.hydra-cg.com/spec/latest/core/"

  SearchQueryParameter:
    class_uri: hydra:IriTemplateMapping
    description: |
      Describes a query parameter supported by a search API.

      Documents the parameter name, type, whether it's required,
      and its purpose.
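
      A sketch of a fully described parameter, assuming a hypothetical
      `type` filter. The default shown is an assumption; the allowed
      values reuse the `allowed_values` example in this schema:
      ```yaml
      query_parameters:
        - name: "type"
          type: "string"
          required: false
          description: "Filter by record type"
          default_value: "image"      # assumed default
          allowed_values: ["image", "document", "audio", "video"]
          example_value: "document"
      ```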

    attributes:
      name:
        slot_uri: hydra:variable
        description: |
          Parameter name as used in the query string.

          Example: "q", "type", "from", "size"
        range: string
        required: true

      type:
        slot_uri: hydra:property
        description: |
          Data type of the parameter value.

          Values: string, integer, boolean, date, array
        range: string

      required:
        slot_uri: hydra:required
        description: |
          Whether this parameter is required.
        range: boolean

      description:
        slot_uri: dcterms:description
        description: |
          Human-readable description of the parameter.

          Example: "Full-text search query"
        range: string

      default_value:
        slot_uri: schema:defaultValue
        description: |
          Default value if not specified.

          Example: "10" for a size parameter
        range: string

      allowed_values:
        slot_uri: schema:valueReference
        description: |
          List of allowed values (for enumerated parameters).

          Example: ["image", "document", "audio", "video"]
        range: string
        multivalued: true

      example_value:
        slot_uri: schema:workExample
        description: |
          Example value for this parameter.

          Example: "amsterdam museum"
        range: string

enums:
  PaginationMethodEnum:
    description: |
      Methods for paginating large result sets.
    permissible_values:
      OFFSET_LIMIT:
        description: |
          Uses offset (or start/from) and limit (or size/count) parameters.

          Example: ?offset=100&limit=20

          Pros: Simple, allows jumping to any page
          Cons: Inconsistent results if data changes between requests
      PAGE_NUMBER:
        description: |
          Uses page number and page size parameters.

          Example: ?page=5&per_page=20

          Similar to OFFSET_LIMIT but uses page numbers.
      CURSOR:
        description: |
          Uses opaque cursor/token for stateful pagination.

          Example: ?cursor=eyJsYXN0X2lkIjogMTIzfQ==

          Pros: Consistent results, efficient for large datasets
          Cons: Cannot jump to arbitrary page
      LINK_HEADER:
        description: |
          Uses HTTP Link headers for pagination (RFC 8288, which obsoletes RFC 5988).

          Response includes: Link: <url>; rel="next", <url>; rel="prev"

          RESTful approach following HATEOAS principles.
      SCROLL:
        description: |
          Uses scroll/scan API for deep pagination.

          Common in Elasticsearch-based APIs.
          Maintains search context for consistent results.
      NONE:
        description: |
          No pagination - returns all results at once.

          Only suitable for small result sets.

  HTTPMethodEnum:
    description: |
      HTTP methods supported for API requests.
    permissible_values:
      GET:
        description: |
          HTTP GET method.

          Query parameters in URL. Most common for search APIs.
      POST:
        description: |
          HTTP POST method.

          Query in request body. Used for complex queries.
      BOTH:
        description: |
          Supports both GET and POST methods.

  SearchResponseFormatEnum:
    description: |
      Response formats for search API results.
    permissible_values:
      JSON:
        description: |
          JSON (application/json).

          Most common format for modern APIs.
      XML:
        description: |
          XML (application/xml or text/xml).

          Legacy format, still used in some heritage APIs.
      JSON_LD:
        description: |
          JSON-LD (application/ld+json).

          JSON with linked data semantics.
      HTML:
        description: |
          HTML (text/html).

          Human-readable results page (not ideal for API use).
      CSV:
        description: |
          CSV (text/csv).

          Tabular export format.
      ATOM:
        description: |
          Atom feed (application/atom+xml).

          Syndication format.