- Updated WorldCatIdentifier.yaml to remove unnecessary description and ensure consistent formatting.
- Enhanced WorldHeritageSite.yaml by breaking long description into multiple lines for better readability and removed unused attributes.
- Simplified WritingSystem.yaml by removing redundant attributes and ensuring consistent formatting.
- Cleaned up XPathScore.yaml by removing unnecessary attributes and ensuring consistent formatting.
- Improved YoutubeChannel.yaml by breaking long description into multiple lines for better readability.
- Enhanced YoutubeEnrichment.yaml by breaking long description into multiple lines for better readability.
- Updated YoutubeVideo.yaml to break long description into multiple lines and removed legacy field name.
- Refined has_or_had_affiliation.yaml by removing unnecessary comments and ensuring clarity.
- Cleaned up is_or_was_retrieved_at.yaml by removing unnecessary comments and ensuring clarity.
- Added rules for generic slots and avoiding rough edits in schema files to maintain structural integrity.
- Introduced changes_or_changed_through.yaml to define a new slot for linking entities to change events.
49 lines
2.4 KiB
YAML
id: https://nde.nl/ontology/hc/class/ExtractionMethod
name: ExtractionMethod
title: ExtractionMethod Class - Methods for Data Extraction
prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  schema: http://schema.org/
  prov: http://www.w3.org/ns/prov#
  nif: http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#
imports:
- linkml:types
- ../slots/has_or_had_label
- ./Label
default_prefix: hc
classes:
  ExtractionMethod:
    class_uri: prov:SoftwareAgent
    description: "A method or technique used to extract data from a source.\n\nExtraction methods define HOW data was obtained, providing\ntransparency and reproducibility for claim provenance.\n\n**Common Extraction Methods**:\n- `xpath_exact_match`: XPath query with exact text match\n- `xpath_fuzzy_match`: XPath query with fuzzy text matching\n- `text_search`: Full-text search within document\n- `css_selector`: CSS selector for element location\n- `json_ld_parse`: Parsing structured JSON-LD data\n- `regex_pattern`: Regular expression pattern matching\n- `nlp_ner`: Named Entity Recognition via NLP\n- `manual_annotation`: Human annotator extraction\n\n**Ontology Mapping Rationale**:\n- class_uri is prov:SoftwareAgent because extraction methods\n are typically software-based agents that perform extraction\n- close_mappings includes nif:Context as NIF models text\n extraction contexts and methods\n\n**MIGRATION NOTE (2026-01-19)**:\nCreated per slot_fixes.yaml revision for claim_extraction_method\n\
      slot migration (Rule 53/56).\n"
    exact_mappings:
    - prov:SoftwareAgent
    close_mappings:
    - nif:Context
    - schema:HowTo
    slots:
    - has_or_had_label
    slot_usage:
      has_or_had_label:
        range: Label
        inlined: true
        required: true
    comments:
    - 'CREATED 2026-01-19: Per slot_fixes.yaml revision (Rule 53/56)'
    - Replaces string-valued claim_extraction_method slot
    - Enables structured representation of extraction techniques
    examples:
    - value:
        has_or_had_label:
          has_or_had_label: xpath_exact_match
    - value:
        has_or_had_label:
          has_or_had_label: nlp_ner
    - value:
        has_or_had_label:
          has_or_had_label: json_ld_parse
    annotations:
      specificity_score: 0.1
      specificity_rationale: Generic utility class/slot created during migration
      custodian_types: "['*']"
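
# Illustrative sketch only, not part of the schema: how the migration described
# in the class comments might look in instance data. The claim-level slot name
# `has_or_had_extraction_method` is hypothetical; this file does not name the
# slot that replaces claim_extraction_method.
#
# Before (string-valued slot):
#   claim_extraction_method: xpath_exact_match
#
# After (structured ExtractionMethod, following the examples in this class):
#   has_or_had_extraction_method:   # hypothetical slot name
#     has_or_had_label:
#       has_or_had_label: xpath_exact_match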