- Removed obsolete slots: `has_or_had_custodian_observation`, `provider`, and `specificity_annotation`.
- Updated the `has_or_had_score` slot to use the `SpecificityScore` class and revised its description and examples.
- Added new slots: `end_seconds`, `end_time`, `has_archive_path`, `has_or_had_custodian_name`, `protocol_name`, and `protocol_version`.
- Introduced a script, `check_annotation_types.py`, to validate the presence and structure of `custodian_types` in YAML files.
- Added a script, `update_specificity.py`, to automate updating `SpecificityAnnotation` references to `SpecificityScore`.
71 lines
2.5 KiB
YAML
id: https://nde.nl/ontology/hc/class/ConfidenceMethod
name: confidence_method_class
title: Confidence Method
prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  prov: http://www.w3.org/ns/prov#
  schema: http://schema.org/
imports:
- linkml:types
- ../slots/has_or_had_description
- ../slots/has_or_had_identifier
- ../slots/has_or_had_type
default_prefix: hc
classes:
  ConfidenceMethod:
    description: 'A method or algorithm used to calculate confidence scores.

      **USAGE**: Documents how confidence values were computed: - Fuzzy string matching algorithms - ML model predictions
      - Rule-based validation - XPath match verification - Human assessment

      **COMMON METHODS**: | Method | Description | |--------|-------------| | fuzzy_matching | Levenshtein, Jaro-Winkler,
      etc. | | xpath_validation | XPath match confidence | | llm_classification | LLM-based entity classification | | ml_prediction
      | Machine learning model output | | human_assessment | Manual quality assessment | | ensemble | Combined multiple methods
      |'
    class_uri: prov:Plan
    exact_mappings:
    - prov:Plan
    close_mappings:
    - schema:HowTo
    slots:
    - has_or_had_type
    - has_or_had_description
    - has_or_had_identifier
    slot_usage:
      has_or_had_type:
        range: string
        required: true
        examples:
        - value: fuzzy_matching
        - value: ml_prediction
        - value: human_assessment
      has_or_had_description:
        range: string
        required: false
        examples:
        - value: Jaro-Winkler similarity with 0.7 threshold
      has_or_had_identifier:
        range: string
        required: false
        examples:
        - value: rapidfuzz-2.15.1
    annotations:
      custodian_types: '["*"]'
      custodian_types_rationale: Confidence methods apply universally to data quality assessment.
      custodian_types_primary: '*'
      specificity_score: 0.25
      specificity_rationale: Low specificity - fundamental methodology documentation.
    examples:
    - value:
        has_or_had_type: fuzzy_matching
        has_or_had_description: Levenshtein distance with ratio normalization
        has_or_had_identifier: rapidfuzz-levenshtein
    - value:
        has_or_had_type: llm_classification
        has_or_had_description: GPT-4 based entity type classification
        has_or_had_identifier: gpt-4-turbo-2024-04-09
    comments:
    - Created from slot_fixes.yaml migration (2026-01-19)
    - Documents confidence calculation methodology
    - Used with ConfidenceScore class
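The `custodian_types` annotation above is stored as a JSON-encoded list inside a YAML string (`'["*"]'`), which is easy to get subtly wrong. The following is a minimal sketch of the kind of check a script such as `check_annotation_types.py` might perform; it is not the actual script. The function name `check_custodian_types`, the error messages, and the rule that `custodian_types_primary` must appear in the decoded list are assumptions made for illustration. The annotations are given as a plain dict so the sketch needs only the standard library.

```python
import json


def check_custodian_types(annotations: dict) -> list[str]:
    """Return a list of problems found in a class's annotations (empty if none).

    Hypothetical helper: the rules below are illustrative assumptions,
    not the validation logic of the real check_annotation_types.py.
    """
    raw = annotations.get("custodian_types")
    if raw is None:
        return ["custodian_types is missing"]
    if not isinstance(raw, str):
        return ["custodian_types should be a JSON-encoded string"]

    # The annotation value is a string such as '["*"]'; decode it as JSON.
    try:
        codes = json.loads(raw)
    except json.JSONDecodeError:
        return [f"custodian_types is not valid JSON: {raw!r}"]

    problems = []
    if not isinstance(codes, list) or not codes:
        problems.append("custodian_types should decode to a non-empty list")
    elif not all(isinstance(code, str) and code for code in codes):
        problems.append("custodian_types entries should be non-empty strings")

    # Assumed rule: the primary type, when present, is one of the listed types.
    primary = annotations.get("custodian_types_primary")
    if primary is not None and isinstance(codes, list) and primary not in codes:
        problems.append("custodian_types_primary is not listed in custodian_types")
    return problems


# Annotations copied from the ConfidenceMethod class.
annotations = {
    "custodian_types": '["*"]',
    "custodian_types_rationale": "Confidence methods apply universally to data quality assessment.",
    "custodian_types_primary": "*",
    "specificity_score": 0.25,
}
print(check_custodian_types(annotations))
```

In a real run the dict would come from parsing each class file with a YAML loader and reading `classes.<ClassName>.annotations`; that step is omitted here to keep the sketch dependency-free.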