glam/schemas/20251121/linkml/modules/classes/XPath.yaml
kempersc 53c6dbc2d9 feat(schema): Migrate temporal slots and introduce new pattern classes
Major slot migrations following slot_fixes.yaml revisions:
- TimeSpan: begin_of_the_begin, begin_of_the_end, end_of_the_begin, end_of_the_end
- Quantity: has_or_had_measurement_unit with MeasureUnit class
- Description: has_or_had_description with Description class
- URL, WikiData, Timestamp, Location, Provenance pattern classes

New slots for RiC-O compliance:
- Temporal: has_or_had_time_interval, calendar_system
- Transfer: is_or_was_transferred, has_or_had_policy
- Location: starts/ends_or_started/ended_at_location
- Provenance: has_or_had_provenance_path, is_or_was_webarchived_at

Archive deprecated slots per Rule 53 workflow.
2026-01-14 20:01:55 +01:00


# XPath - An XPath expression for locating elements in HTML/XML documents
# Created per slot_fixes.yaml migration for: xpath
# Creation date: 2026-01-14
id: https://nde.nl/ontology/hc/classes/XPath
name: XPath
title: XPath
prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  prov: http://www.w3.org/ns/prov#
  schema: http://schema.org/
  xsd: http://www.w3.org/2001/XMLSchema#
imports:
  - linkml:types
default_range: string
classes:
  XPath:
    description: >-
      An XPath expression used to locate a specific element within an HTML or XML document.

      **CRITICAL PROVENANCE FIELD**:
      XPath expressions provide the essential link between extracted data values and their
      original source location in archived documents. Without an XPath, a claim extracted
      from a webpage is unverifiable.

      **FORMAT**: Standard XPath 1.0 expressions

      **EXAMPLE**: `/html[1]/body[1]/div[6]/div[1]/table[3]/tbody[1]/tr[1]/td[1]/p[6]`

      **USAGE CONTEXT**:
      Used with `has_or_had_provenance_path` slot to link provenance records to
      specific locations in source documents.
    class_uri: prov:Location
    close_mappings:
      - schema:xpath
    related_mappings:
      - prov:atLocation
    attributes:
      expression:
        range: string
        required: true
        description: >-
          The XPath expression string.
          Example: /html[1]/body[1]/div[6]/div[1]/table[3]/tbody[1]/tr[1]/td[1]/p[6]
        pattern: "^/.*"
      matched_text:
        range: string
        description: >-
          The text content found at this XPath location.
          Used for verification and debugging.
      match_score:
        range: float
        minimum_value: 0.0
        maximum_value: 1.0
        description: >-
          Confidence score (0.0 to 1.0) for the XPath match.
          1.0 = exact match, <1.0 = fuzzy match.
      source_document:
        range: uriorcurie
        description: >-
          URI or path to the source document where this XPath applies.
          Example: web/GHCID/example.org/rendered.html
    annotations:
      custodian_types: '["*"]'
      custodian_types_rationale: >-
        XPath provenance is relevant for any custodian type where web content
        is extracted and archived.
      custodian_types_primary: "*"
      specificity_score: 0.7
      specificity_rationale: >-
        High specificity - only relevant for web-extracted data with HTML archival.
    examples:
      - value: |
          XPath:
            expression: "/html[1]/body[1]/div[6]/div[1]/table[3]/tbody[1]/tr[1]/td[1]/p[6]"
            matched_text: "Historische Vereniging Nijeveen"
            match_score: 1.0
            source_document: "web/0021/historischeverenigingnijeveen.nl/rendered.html"
        description: >-
          XPath extraction pointing to an institution name in archived HTML.
      - value: |
          XPath:
            expression: "//meta[@property='og:title']/@content"
            matched_text: "Amsterdam Museum - Official Website"
            match_score: 0.95
        description: >-
          XPath to OpenGraph metadata in a webpage header.
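
# Illustrative sketch only (an assumption, not part of this module).
# The class description says XPath is used with the `has_or_had_provenance_path`
# slot, which is defined elsewhere. A slot definition along the following lines
# would give that slot this class as its range. The `description` wording and
# the `inlined` setting are hypothetical choices, not taken from the real slot
# file.
#
# slots:
#   has_or_had_provenance_path:
#     description: Location in a source document that supports a claim.
#     range: XPath
#     inlined: true
#
# Under that assumption, a provenance record might carry the path like this
# (values reused from the second example above):
#
# has_or_had_provenance_path:
#   expression: "//meta[@property='og:title']/@content"
#   matched_text: "Amsterdam Museum - Official Website"
#   match_score: 0.95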