glam/data/entity_annotation/modules/hypernyms/tmp.yaml
2025-12-05 15:30:23 +01:00

# =============================================================================
# GLAM-NER: TEMPORAL HYPERNYM MODULE
# =============================================================================
# Module: hypernyms/tmp.yaml
# Parent: entity_annotation_rules_v1.7.0_unified.yaml
# Purpose: TEMPORAL entity type - time references following TimeML/TIMEX3 model
# =============================================================================
# BREAKING CHANGE v1.7.0: Restructured from TEMPORAL_REFERENCE (TMP)
# Now follows TimeML/TIMEX3 typology with absolute vs. relative distinction
# =============================================================================
id: https://w3id.org/glam/ner/hypernym/temporal
name: glam-ner-temporal-hypernym
TEMPORAL:
code: "TMP"
definition: |
Temporal expressions following the TimeML/TIMEX3 standard. Temporal
expressions denote points, intervals, or frequencies on the timeline.
Key distinctions:
- ABSOLUTE: Resolvable to a specific calendar date or clock time ("15 July 1606")
- RELATIVE: Requires context to resolve ("last year", "recently")
- DURATION: Temporal extent ("three years", "for a decade")
- SET: Recurring frequencies ("every Monday", "annually")
TimeML is standardized as ISO 24617-1 (ISO-TimeML), and its TIMEX3 tag is
widely used in NLP (TimeBank, TempEval). Following it provides
interoperability with temporal reasoning systems.
design_rationale: |
NERD's generic "Time" class lacks the semantic precision needed for:
- Temporal reasoning and event ordering
- Calendar normalization (Julian vs. Gregorian)
- Duration calculations
- Frequency detection (for opening hours, events)
TimeML/TIMEX3 (ISO-TimeML) provides:
- TYPE attribute: DATE | TIME | DURATION | SET
- VALUE attribute: ISO 8601 normalized value
- MOD attribute: START | END | MID | BEFORE | AFTER | APPROX
CIDOC-CRM provides complementary modeling:
- E52 Time-Span: Temporal extent with fuzzy boundaries
- E61 Time Primitive: Precise instant or interval
- E4 Period: Named historical periods
# ---------------------------------------------------------------------------
# ONTOLOGY MAPPINGS
# ---------------------------------------------------------------------------
ontology_mappings:
primary_class: "crm:E52_Time-Span"
primary_class_definition: |
CIDOC-CRM E52 Time-Span: "This class comprises abstract temporal extents,
in the sense of Galilean physics, having a beginning, an end and a
duration."
alternative_classes:
- "time:TemporalEntity" # W3C Time Ontology
- "time:Instant" # Specific point
- "time:Interval" # Duration with bounds
linkml_mapping:
class_uri: "crm:E52_Time-Span"
exact_mappings:
- "time:TemporalEntity"
close_mappings:
- "dct:temporal"
nerd_class: "nerd:Time"
nerd_deprecation_note: |
DEPRECATED: NERD's Time class is too generic, lacking the DATE/TIME/
DURATION/SET typology essential for temporal reasoning. Use TimeML
TIMEX3 types for NLP annotation; map to CIDOC-CRM for semantics.
Retain NERD mapping ONLY for basic NLP pipeline interchange.
timeml_mapping:
element: "TIMEX3"
attributes:
type: "DATE | TIME | DURATION | SET"
value: "ISO 8601 normalized value"
mod: "BEFORE | AFTER | APPROX | START | END | MID"
anchorTimeID: "ID of anchoring time for relative expressions"
note: |
W3C Time Ontology (OWL-Time) provides formal semantics for temporal
entities compatible with CIDOC-CRM. TimeML provides NLP annotation
conventions. Both map to ISO 8601 for interchange.
# ---------------------------------------------------------------------------
# SUBCATEGORIES
# ---------------------------------------------------------------------------
subcategories:
# ----- ABSOLUTE TEMPORAL EXPRESSIONS -----
DATE_ABS:
code: "TMP.DAB"
definition: |
Absolute calendar dates that can be resolved without context.
Maps to TimeML TIMEX3 type="DATE" without relative modifiers.
examples:
- "15 July 1606"
- "March 2023"
- "1888"
- "12/04/1943"
- "the year 2000"
ontology_class: "time:Instant"
timeml_type: "DATE"
xsd_type: "xsd:date | xsd:gYear | xsd:gYearMonth"
linkml_mapping:
class_uri: "time:Instant"
exact_mappings:
- "xsd:date"
- "xsd:gYear"
note: |
Normalize to ISO 8601 format when possible:
- Full date: 1606-07-15
- Year-month: 2023-03
- Year only: 1888
Historical dates may require calendar specification (Julian/Gregorian).
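# Illustrative sketch only: how the DATE_ABS examples above might be recorded
# after normalization. The field names (text, code, value, calendar) are
# assumptions for illustration and are not defined by this module.
#   - text: "15 July 1606"
#     code: "TMP.DAB"
#     timeml_type: "DATE"
#     value: "1606-07-15"
#     calendar: "gregorian"   # assumed; must be confirmed from the source
#   - text: "March 2023"
#     code: "TMP.DAB"
#     timeml_type: "DATE"
#     value: "2023-03"
#   - text: "1888"
#     code: "TMP.DAB"
#     timeml_type: "DATE"
#     value: "1888"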
DATE_REL:
code: "TMP.DRL"
definition: |
Relative dates requiring context (document date, speech time)
to resolve. Maps to TimeML TIMEX3 type="DATE" with anchorTimeID.
examples:
- "last year"
- "yesterday"
- "next month"
- "three years ago"
- "in recent decades"
- "since the war"
ontology_class: "crm:E52_Time-Span"
timeml_type: "DATE"
timeml_anchor: "Requires @anchorTimeID to resolve"
note: |
Relative dates are common in heritage texts but require context:
- "last year" → need document_date to resolve
- "since the war" → need historical context (which war?)
Mark with @certainty when resolution is uncertain.
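# Illustrative sketch only: resolving a relative date against an anchor.
# The document date, the IDs (t0, t1) and the field names are hypothetical.
#   - tid: "t0"
#     text: "10 May 2023"        # assumed document creation date
#     code: "TMP.DAB"
#     timeml_type: "DATE"
#     value: "2023-05-10"
#   - tid: "t1"
#     text: "last year"
#     code: "TMP.DRL"
#     timeml_type: "DATE"
#     anchorTimeID: "t0"
#     value: "2022"              # the year before the anchor year
# An expression such as "since the war" would keep its anchorTimeID empty
# until the referenced event is identified.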
TIME_ABS:
code: "TMP.TAB"
definition: |
Absolute clock times. Maps to TimeML TIMEX3 type="TIME".
examples:
- "10:00"
- "14:30"
- "midnight"
- "noon"
- "3 PM"
ontology_class: "time:Instant"
timeml_type: "TIME"
xsd_type: "xsd:time"
note: |
Normalize to 24-hour ISO 8601 format: HH:MM:SS
Named times: midnight → 00:00:00, noon → 12:00:00
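# Illustrative sketch only: TIME_ABS examples after 24-hour normalization.
# Field names are assumptions for illustration.
#   - text: "3 PM"
#     code: "TMP.TAB"
#     timeml_type: "TIME"
#     value: "15:00:00"
#   - text: "14:30"
#     code: "TMP.TAB"
#     timeml_type: "TIME"
#     value: "14:30:00"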
TIME_REL:
code: "TMP.TRL"
definition: |
Relative clock times requiring context.
examples:
- "an hour later"
- "that morning"
- "in the evening"
ontology_class: "crm:E52_Time-Span"
timeml_type: "TIME"
# ----- DURATIONS -----
DURATION:
code: "TMP.DUR"
definition: |
Temporal durations: lengths of time without fixed endpoints.
Maps to TimeML TIMEX3 type="DURATION".
examples:
- "three years"
- "for a decade"
- "two hours"
- "the 17th century (100 years)"
- "a fortnight"
- "several months"
ontology_class: "time:Duration"
timeml_type: "DURATION"
xsd_type: "xsd:duration"
linkml_mapping:
class_uri: "time:Duration"
exact_mappings:
- "xsd:duration"
note: |
Normalize to ISO 8601 duration format: P[n]Y[n]M[n]DT[n]H[n]M[n]S
Examples:
- "three years" → P3Y
- "two hours" → PT2H
- "a decade" → P10Y
- "17th century" → P100Y (with temporal bounds for period)
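# Illustrative sketch only: further DURATION examples in ISO 8601 form.
# Field names are assumptions. The "X" placeholder for an unspecified
# quantity follows TimeML annotation practice as far as we are aware and is
# not valid plain ISO 8601, so treat it as an assumption.
#   - text: "a fortnight"
#     code: "TMP.DUR"
#     timeml_type: "DURATION"
#     value: "P2W"               # equivalently P14D
#   - text: "several months"
#     code: "TMP.DUR"
#     timeml_type: "DURATION"
#     value: "PXM"               # quantity not stated in the text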
# ----- RECURRING/PERIODIC TIMES -----
SET:
code: "TMP.SET"
definition: |
Recurring or periodic temporal expressions. Maps to TimeML
TIMEX3 type="SET". Common for opening hours, event schedules.
examples:
- "every Monday"
- "annually"
- "twice a week"
- "open Tuesday-Sunday"
- "each summer"
- "quarterly"
ontology_class: "schema:OpeningHoursSpecification"
timeml_type: "SET"
alternative_classes:
- "time:GeneralDateTimeDescription"
note: |
SET expressions describe recurring patterns:
- Frequency: "twice a week" → SET with value="P1W" @freq="2X"
- Schedule: "every Monday" → SET with value="XXXX-WXX-1" @quant="EVERY"
For opening hours, schema:OpeningHoursSpecification provides
structured representation with dayOfWeek, opens, closes properties.
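# Illustrative sketch only: SET expressions written out as TIMEX3-style
# attribute sets. Field names mirror TIMEX3 attributes; the season code "SU"
# and the quant literals are assumptions based on common TimeML practice.
#   - text: "every Monday"
#     code: "TMP.SET"
#     timeml_type: "SET"
#     value: "XXXX-WXX-1"        # any year, any week, weekday 1 (Monday)
#     quant: "EVERY"
#   - text: "each summer"
#     code: "TMP.SET"
#     timeml_type: "SET"
#     value: "XXXX-SU"
#     quant: "EACH"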
# ----- OPENING HOURS (specialized SET) -----
OPENHRS:
code: "TMP.OPH"
definition: |
Institutional opening hours and operational schedules. A specialized
form of SET expression.
examples:
- "open Tuesday-Sunday 10:00-17:00"
- "closed Mondays"
- "last entry at 16:30"
- "open daily except holidays"
ontology_class: "schema:OpeningHoursSpecification"
schema_properties:
dayOfWeek: "schema:DayOfWeek (Monday, Tuesday, etc.)"
opens: "xsd:time (opening time)"
closes: "xsd:time (closing time)"
validFrom: "xsd:date (seasonal start)"
validThrough: "xsd:date (seasonal end)"
note: |
Links to GROUP hypernym (heritage institutions) via schema:openingHoursSpecification
(schema:openingHours is the plain-text form of the same information).
Use schema:specialOpeningHoursSpecification for holidays/exceptions.
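# Illustrative sketch only: one OPENHRS annotation expressed with the
# schema.org properties listed above. The wrapper key
# "opening_hours_specification" is hypothetical.
#   text: "open Tuesday-Sunday 10:00-17:00"
#   code: "TMP.OPH"
#   opening_hours_specification:
#     dayOfWeek: [Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday]
#     opens: "10:00:00"
#     closes: "17:00:00"
# "closed Mondays" is implied here by the absence of Monday from dayOfWeek.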
# ----- DATE RANGES -----
RANGE:
code: "TMP.RNG"
definition: |
Date/time ranges with explicit start and end points.
examples:
- "1888-1890"
- "from March to June 2023"
- "10 February - 4 June 2023"
- "between 1650 and 1670"
ontology_class: "time:ProperInterval"
alternative_classes:
- "crm:E52_Time-Span"
edtf_note: |
Extended Date/Time Format (EDTF, ISO 8601-2) provides:
- Intervals: 1888/1890
- Open ranges: 1888/.. (from 1888 onwards)
- Uncertain: 1888?/1890
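# Illustrative sketch only: the RANGE examples above as EDTF intervals.
# Field names are assumptions. Note that "between 1650 and 1670" may describe
# an event that happened at some point inside the interval rather than one
# spanning all of it; the interval alone does not capture that distinction.
#   - text: "1888-1890"
#     code: "TMP.RNG"
#     value: "1888/1890"
#   - text: "from March to June 2023"
#     code: "TMP.RNG"
#     value: "2023-03/2023-06"
#   - text: "10 February - 4 June 2023"
#     code: "TMP.RNG"
#     value: "2023-02-10/2023-06-04"
#   - text: "between 1650 and 1670"
#     code: "TMP.RNG"
#     value: "1650/1670"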
# ----- NAMED PERIODS -----
CENTURY:
code: "TMP.CEN"
definition: "Century references, a common periodization"
examples:
- "17th century"
- "the 1800s"
- "nineteenth century"
- "early 20th century"
ontology_class: "crm:E52_Time-Span"
timeml_type: "DATE"
note: |
Normalize centuries to date ranges:
- "17th century" → 1601-01-01/1700-12-31
- "the 1800s" → 1800-01-01/1899-12-31
Modifiers (early, mid, late) narrow the range.
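# Illustrative sketch only: a century with and without a modifier. Field
# names are assumptions. How far "early" narrows the range is a project
# decision, so this sketch keeps the full century and records the modifier
# in a TIMEX3-style mod attribute instead of inventing a cutoff.
#   - text: "17th century"
#     code: "TMP.CEN"
#     timeml_type: "DATE"
#     value: "1601-01-01/1700-12-31"
#   - text: "early 20th century"
#     code: "TMP.CEN"
#     timeml_type: "DATE"
#     value: "1901-01-01/2000-12-31"
#     mod: "START"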
ERA:
code: "TMP.ERA"
definition: |
Named historical periods, movements, and eras. These are cultural
periodizations, not calendar units.
examples:
- "the Golden Age"
- "the Renaissance"
- "Medieval period"
- "Edo period"
- "the Enlightenment"
- "Art Deco era"
ontology_class: "crm:E4_Period"
linkml_mapping:
class_uri: "crm:E4_Period"
close_mappings:
- "dct:PeriodOfTime"
note: |
Named periods have fuzzy boundaries and geographic variation:
- "Renaissance" varies by region (Italy vs. Northern Europe)
- "Golden Age" is culture-specific (Dutch vs. Spanish)
Use crm:E4_Period which explicitly allows fuzzy temporal boundaries.
Link to authority files (Getty AAT, Wikidata) for disambiguation.
EXHIBPER:
code: "TMP.EXP"
definition: "Exhibition periods and event dates"
examples:
- "10 February - 4 June 2023"
- "on view through December"
- "opening reception: May 5, 7-9 PM"
ontology_class: "crm:E52_Time-Span"
schema_mapping: "schema:Event with schema:startDate and schema:endDate"
note: "Use for temporally bounded institutional events."
# ---------------------------------------------------------------------------
# INCLUSION RULES
# ---------------------------------------------------------------------------
inclusion_rules:
- id: "TMP_INC001"
rule: "Tag complete date/time expressions as single entities"
examples:
- "15 July 1606 (single entity)"
- "between 1888 and 1890 (range)"
- "every Monday at 10:00"
- id: "TMP_INC002"
rule: "Tag named periods and eras"
examples:
- "the Dutch Golden Age"
- "the Baroque period"
- "during the Renaissance"
- id: "TMP_INC003"
rule: "Tag opening hours as complete SET expressions"
examples:
- "Tuesday to Sunday, 10:00-17:00"
- "open daily except Mondays"
- id: "TMP_INC004"
rule: "Tag relative expressions with their anchor context"
examples:
- "last year (relative to document date)"
- "since the merger (relative to event)"
- id: "TMP_INC005"
rule: "Tag durations even without specific anchoring"
examples:
- "for three years"
- "a decade of research"
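# Illustrative sketch only: the inclusion rules applied to one invented
# sentence. Offsets are 0-based character positions with an exclusive end;
# the sentence and the field names are hypothetical.
#   sentence: "The archive was open every Monday at 10:00 for three years."
#   entities:
#     - text: "every Monday at 10:00"   # TMP_INC001: one complete expression
#       start: 21
#       end: 42
#       code: "TMP.SET"
#     - text: "for three years"         # TMP_INC005: unanchored duration
#       start: 43
#       end: 58
#       code: "TMP.DUR"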
# ---------------------------------------------------------------------------
# EXCLUSION RULES
# ---------------------------------------------------------------------------
exclusion_rules:
- id: "TMP_EXC001"
rule: "Do NOT tag deictics without recoverable reference"
examples:
- "now (unanchored)"
- "today (unless document date known)"
note: "These require pragmatic resolution beyond the text itself"
- id: "TMP_EXC002"
rule: "Do NOT tag ordinal centuries as QUANTITY"
examples:
- "17th century → TMP.CEN, not QTY.ORD"
- id: "TMP_EXC003"
rule: "Do NOT tag temporal prepositions alone"
examples:
- "before (preposition, not temporal reference)"
- "during (connector)"