# =============================================================================
# PiCo Integration Module: Temporal Patterns & Calendar Systems
# =============================================================================
# Part of: data/entity_annotation/modules/integrations/pico/
# Parent: _index.yaml
#
# Description: Temporal expression handling, calendar systems, date normalization,
#              and PROV-O provenance model for tracking observation/reconstruction
#              activities.
#
# Last Updated: 2025-12-12
# =============================================================================

# -----------------------------------------------------------------------------
# Calendar Systems
# -----------------------------------------------------------------------------
# Historical documents use various calendar systems. This section defines
# how to handle and normalize dates from different calendrical traditions.

calendar_systems:
  description: |
    Historical sources use diverse calendar systems depending on culture,
    religion, and time period. Proper extraction requires:
    1. Identifying the source calendar
    2. Preserving the original date expression
    3. Providing normalized ISO 8601 equivalents where possible

  supported_calendars:
    gregorian:
      id: "gregorian"
      label: "Gregorian Calendar"
      uri: "https://www.wikidata.org/wiki/Q12138"
      description: |
        The civil calendar in near-universal use today, adopted from 1582
        (Catholic countries) or later (Protestant/Orthodox countries).
      adoption_dates:
        catholic: "1582-10-15"
        protestant: "1700-03-01"
        british_empire: "1752-09-14"
        russia: "1918-02-14"
        greece: "1923-03-01"
      usage_notes: |
        - Default for modern documents
        - Used in civil registrations after adoption
        - Standard for ISO 8601 normalization
      example:
        original: "15 October 1582"
        normalized: "1582-10-15"

    julian:
      id: "julian"
      label: "Julian Calendar"
      uri: "https://www.wikidata.org/wiki/Q11184"
      description: |
        Calendar introduced by Julius Caesar in 45 BCE.
        Used in Europe until the Gregorian reform, and still used by some
        Eastern Orthodox churches today.
      # Days to add to a Julian date to obtain the Gregorian date.
      offset_from_gregorian:
        16th_century: 10
        17th_century: 10
        18th_century: 11
        19th_century: 12
        20th_century: 13
        21st_century: 13
      usage_notes: |
        - Greek Orthodox Church records use the Julian calendar
        - Russian Empire used Julian until 1918
        - Dual dating common in transition periods
        - Format: "Julian date / Gregorian date" or "O.S./N.S." notation
      example:
        original: "14 March 1875 (O.S.)"
        gregorian_equivalent: "26 March 1875"
        normalized: "1875-03-26"
        note: "Greek Orthodox used Julian; Gregorian equivalent = Julian date + 12 days (19th-century offset)"

    hijri:
      id: "hijri"
      label: "Islamic/Hijri Calendar"
      uri: "https://www.wikidata.org/wiki/Q28892"
      alternative_names:
        - "Islamic Calendar"
        - "Muslim Calendar"
        - "Lunar Hijri"
        - "Anno Hegirae (AH)"
      description: |
        Lunar calendar used in Islamic societies.
        Year 1 = 622 CE (Hijra). 354 or 355 days per year (12 lunar months).
      months:
        1: "Muharram"
        2: "Safar"
        3: "Rabi' al-Awwal"
        4: "Rabi' al-Thani"
        5: "Jumada al-Awwal"
        6: "Jumada al-Thani"
        7: "Rajab"
        8: "Sha'ban"
        9: "Ramadan"
        10: "Shawwal"
        11: "Dhu al-Qa'dah"
        12: "Dhu al-Hijjah"
      usage_notes: |
        - Ottoman Empire, Waqf documents, Sijill records
        - Year conversion (approximate): Gregorian = (Hijri * 0.97) + 622
        - Month-level precision often sufficient
        - Some documents use both Hijri and local calendars
      example:
        original: "month of Rajab, year 1225 Hijri"
        normalized: "1810-08"
        note: "Approximate month (Rajab 1225 AH began in early August 1810); exact day unknown"

    hebrew:
      id: "hebrew"
      label: "Hebrew Calendar"
      uri: "https://www.wikidata.org/wiki/Q9644"
      alternative_names:
        - "Jewish Calendar"
        - "Anno Mundi"
      description: |
        Lunisolar calendar used in Jewish religious and civil life.
        Year 1 = 3761 BCE (traditional Creation date).
      # Leap years add a thirteenth month (Adar I) before Adar (Adar II).
      months:
        1: "Nisan"
        2: "Iyar"
        3: "Sivan"
        4: "Tammuz"
        5: "Av"
        6: "Elul"
        7: "Tishrei"
        8: "Cheshvan"
        9: "Kislev"
        10: "Tevet"
        11: "Shevat"
        12: "Adar"
      usage_notes: |
        - Ketubot (marriage contracts)
        - Get (divorce documents)
        - Synagogue records
        - Year conversion: Gregorian = Hebrew - 3760 (approx; use 3761 for
          dates between Rosh Hashanah and 31 December)
        - Month names often transliterated in various ways
      example:
        original: "23 Elul 5656"
        normalized: "1896-09-01"
        note: "Hebrew date from Creation (anno mundi)"

    french_republican:
      id: "french_republican"
      label: "French Republican Calendar"
      uri: "https://www.wikidata.org/wiki/Q181974"
      description: |
        Calendar used in France 1793-1805. Year 1 = 1792 CE.
        12 months of 30 days + 5-6 supplementary days.
      months:
        1: "Vendemiaire"
        2: "Brumaire"
        3: "Frimaire"
        4: "Nivose"
        5: "Pluviose"
        6: "Ventose"
        7: "Germinal"
        8: "Floreal"
        9: "Prairial"
        10: "Messidor"
        11: "Thermidor"
        12: "Fructidor"
      usage_notes: |
        - French civil registrations 1793-1805
        - Some Belgian/Dutch territories
        - Conversion tables widely available
      example:
        original: "14 Vendemiaire an IV"
        normalized: "1795-10-06"

    chinese:
      id: "chinese"
      label: "Chinese Calendar"
      uri: "https://www.wikidata.org/wiki/Q32823"
      description: |
        Lunisolar calendar used in China and East Asia.
        Combines 60-year cycle with lunar months.
      usage_notes: |
        - Emperor reign year + lunar month + day
        - Gregorian adopted 1912 (Republic of China)
        - Traditional dates still used for festivals
      example:
        original: "Guangxu 22, 8th month, 15th day"
        normalized: "1896-09-21"

# -----------------------------------------------------------------------------
# Date Expression Patterns
# -----------------------------------------------------------------------------

date_expression_patterns:
  description: |
    Common patterns for expressing dates in historical sources.
    GLM annotators should recognize these patterns and extract:
    1. The original expression (exact transcription)
    2. The calendar system used
    3. A normalized ISO 8601 date (where possible)

  patterns:
    full_date:
      description: "Complete date with day, month, and year"
      examples:
        - pattern: "15 October 1582"
          calendar: "gregorian"
          normalized: "1582-10-15"
        - pattern: "the fifteenth day of October in the year 1582"
          calendar: "gregorian"
          normalized: "1582-10-15"
        - pattern: "23 Elul 5656"
          calendar: "hebrew"
          normalized: "1896-09-01"

    partial_date:
      description: "Date with some components missing"
      examples:
        - pattern: "March 1875"
          calendar: "gregorian"
          normalized: "1875-03"
          precision: "month"
        - pattern: "in the year 1810"
          calendar: "gregorian"
          normalized: "1810"
          precision: "year"
        - pattern: "month of Rajab, 1225 AH"
          calendar: "hijri"
          normalized: "1810-08"
          precision: "month"

    dual_dating:
      description: "Documents showing both Julian and Gregorian dates"
      notation_styles:
        - "O.S. (Old Style = Julian)"
        - "N.S. (New Style = Gregorian)"
        - "Slash notation: 14/26 March 1875"
      examples:
        - pattern: "14/26 March 1875"
          interpretation: "14 March (Julian) = 26 March (Gregorian)"
          normalized: "1875-03-26"
          note: "Use Gregorian for normalization"
        - pattern: "6 January 1894 (Gregorian)"
          normalized: "1894-01-06"
          note: "Explicit calendar indicator"

    relative_dating:
      description: "Dates relative to events or other dates"
      examples:
        - pattern: "three days after Easter"
          requires: "Year context to calculate"
        - pattern: "the Sunday before St. Martin's Day"
          requires: "Year context and liturgical calendar"

    floruit:
      description: "Period when person was known to be active"
      notation: "fl."
      examples:
        - pattern: "fl. 1780-1820"
          interpretation: "Active between 1780 and 1820"
        - pattern: "fl. c. 1850"
          interpretation: "Active around 1850"

# -----------------------------------------------------------------------------
# Temporal Properties in PiCo
# -----------------------------------------------------------------------------

temporal_properties:
  description: |
    Properties for capturing temporal information about persons
    observed in historical sources.

  biographical_dates:
    birth_date:
      property: "sdo:birthDate"
      property_uri: "https://schema.org/birthDate"
      range: "xsd:date or xsd:gYearMonth or xsd:gYear"
      description: "Date of birth"
      extraction_notes: |
        - May be explicitly stated or inferred from age
        - Capture calendar system if non-Gregorian
        - Normalize to ISO 8601 for querying

    death_date:
      property: "sdo:deathDate"
      property_uri: "https://schema.org/deathDate"
      range: "xsd:date or xsd:gYearMonth or xsd:gYear"
      description: "Date of death"
      extraction_notes: |
        - "deceased" annotation indicates death before document date
        - Infer approximate date from context when possible

    baptism_date:
      property: "pico:baptismDate"
      range: "xsd:date"
      description: "Date of baptism/christening"
      note: "Common in church records; often within days of birth"

    burial_date:
      property: "pico:burialDate"
      range: "xsd:date"
      description: "Date of burial"
      note: "Common in church/cemetery records"

  event_dates:
    marriage_date:
      property: "pico:marriageDate"
      range: "xsd:date"
      description: "Date of marriage event"

    divorce_date:
      property: "pico:divorceDate"
      range: "xsd:date"
      description: "Date of divorce"

    document_date:
      property: "sdo:dateCreated"
      property_uri: "https://schema.org/dateCreated"
      range: "xsd:date"
      description: "Date the source document was created"
      note: "Critical for temporal context of observations"

  age_expressions:
    age_at_event:
      property: "pico:ageAtEvent"
      range: "xsd:string"
      description: "Age as stated in document"
      examples:
        - "25 years"
        - "about 30 years old"
        - "minor (under legal age)"
        - "of full age (adult)"
      note: |
        Preserve original expression; calculate birth year if needed.
        "oud 25 jaar" (Dutch) = "25 years old"

# -----------------------------------------------------------------------------
# PROV-O Provenance Model
# -----------------------------------------------------------------------------

provenance_model:
  description: |
    PiCo uses W3C PROV-O for provenance tracking at two levels:

    1. OBSERVATION LEVEL: Where did this observation come from?
       - prov:hadPrimarySource -> Source document
       - prov:wasGeneratedBy -> Extraction activity (optional)

    2. RECONSTRUCTION LEVEL: How was this person entity created?
       - prov:wasDerivedFrom -> Source observation(s)
       - prov:wasGeneratedBy -> Reconstruction activity
       - prov:wasRevisionOf -> Previous reconstruction version

  activity_class:
    class: "prov:Activity"
    class_uri: "http://www.w3.org/ns/prov#Activity"
    description: "The activity that generated a PersonReconstruction"
    properties:
      - property: "prov:wasAssociatedWith"
        description: "Agent responsible for the activity"
        range: "prov:Agent"
      - property: "prov:startedAtTime"
        description: "When the activity started"
        range: "xsd:dateTime"
      - property: "prov:endedAtTime"
        description: "When the activity completed"
        range: "xsd:dateTime"
      - property: "prov:used"
        description: "Resources/tools used in the activity"
        range: "prov:Entity"
        note: "E.g., ML model, matching algorithm, rule set"

  activity_types:
    human_reconstruction:
      description: "Manual reconstruction by researcher"
      note: "Provide: time, place, knowledge sources, researcher name"
    algorithmic_reconstruction:
      description: "Automated reconstruction by software"
      note: "Provide: algorithm name, version, configuration, parameters"

  agent_class:
    class: "prov:Agent"
    class_uri: "http://www.w3.org/ns/prov#Agent"
    description: "Person or organization responsible for reconstruction"
    properties:
      - property: "sdo:name"
        description: "Name of the agent"
        range: "xsd:string"
      - property: "sdo:url"
        description: "URL identifying the agent"
        range: "sdo:URL"
    examples:
      - name: "CBG Center for Family History"
        url: "https://cbg.nl"
        type: "organization"
      - name: "GLM-4.6 Person Extractor v1.0"
        url: null
        type: "software"

  derivation_properties:
    - property: "prov:wasDerivedFrom"
      property_uri: "http://www.w3.org/ns/prov#wasDerivedFrom"
      description: "Links PersonReconstruction to source PersonObservation(s)"
      domain: "pico:PersonReconstruction"
      range: "pico:PersonObservation"
      cardinality: "1..*"
      note: "REQUIRED for all PersonReconstructions"
    - property: "prov:wasRevisionOf"
      property_uri: "http://www.w3.org/ns/prov#wasRevisionOf"
      description: "Links to previous version of reconstruction"
      domain: "pico:PersonReconstruction"
      range: "pico:PersonReconstruction"
      cardinality: "0..1"
      note: "For tracking reconstruction updates over time"

# -----------------------------------------------------------------------------
# PiCo Vocabularies/Thesauri
# -----------------------------------------------------------------------------

pico_vocabularies:
  description: |
    PiCo defines three SKOS concept schemes for controlled terminology:
    - Roles: The role a person plays in a source (child, declarant, witness, etc.)
    - SourceTypes: Types of historical sources (birth certificate, census, etc.)
    - EventTypes: Types of life events (birth, marriage, death, etc.)

  roles_thesaurus:
    id: "picot_roles"
    uri: "https://terms.personsincontext.org/roles/"
    type: "skos:ConceptScheme"
    label: "Persons in Context role thesaurus"
    description: "Roles that persons can have in historical sources"
    usage: |
      Use pico:hasRole property with a term from this thesaurus.
      Example: picot_roles:575 (child), picot_roles:489 (declarant)
    example_concepts:
      - id: "575"
        label: "child"
        description: "Person appearing as child in a record"
      - id: "489"
        label: "declarant"
        description: "Person declaring/reporting an event"
      - id: "witness"
        label: "witness"
        description: "Person witnessing an event or signing a document"
      - id: "bride"
        label: "bride"
        description: "Female partner in a marriage"
      - id: "groom"
        label: "groom"
        description: "Male partner in a marriage"

  sourcetypes_thesaurus:
    id: "picot_sourcetypes"
    uri: "https://terms.personsincontext.org/sourcetypes/"
    type: "skos:ConceptScheme"
    label: "Persons in Context sourceType thesaurus"
    description: "Types of historical sources containing person observations"
    usage: |
      Use sdo:additionalType property on sdo:ArchiveComponent.
      Example: picot_sourcetypes:551 (civil registry: birth)
    example_concepts:
      - id: "551"
        label: "civil registry: birth"
        description: "Birth certificate from civil registration"
      - id: "marriage"
        label: "civil registry: marriage"
        description: "Marriage certificate"
      - id: "death"
        label: "civil registry: death"
        description: "Death certificate"
      - id: "census"
        label: "census"
        description: "Population census record"
      - id: "church_baptism"
        label: "church record: baptism"
        description: "Baptismal record from church register"
      - id: "notarial"
        label: "notarial record"
        description: "Notarial act or protocol"

  eventtypes_thesaurus:
    id: "picot_eventtypes"
    uri: "https://terms.personsincontext.org/eventtypes/"
    type: "skos:ConceptScheme"
    label: "Persons in Context eventType thesaurus"
    description: "Types of life events documented in sources"
    example_concepts:
      - id: "birth"
        label: "birth"
      - id: "baptism"
        label: "baptism"
      - id: "marriage"
        label: "marriage"
      - id: "death"
        label: "death"
      - id: "burial"
        label: "burial"
      - id: "emigration"
        label: "emigration"
      - id: "immigration"
        label: "immigration"

# -----------------------------------------------------------------------------
# CH-Annotator Hypernym Integration for Temporal
# -----------------------------------------------------------------------------

temporal_hypernym_mapping:
  description: |
    Mapping between temporal expressions and CH-Annotator hypernyms.

  mappings:
    - pico_property: "sdo:birthDate"
      ch_hypernym: "TMP.DAT"
      ch_code: "TMP.DAT"
      note: "Birth date temporal expression"
    - pico_property: "sdo:deathDate"
      ch_hypernym: "TMP.DAT"
      ch_code: "TMP.DAT"
      note: "Death date temporal expression"
    - pico_property: "sdo:dateCreated"
      ch_hypernym: "TMP.DAT"
      ch_code: "TMP.DAT"
      note: "Document creation date"
    - calendar_expression: "Hijri date"
      ch_hypernym: "TMP.DAT"
      normalization: "Convert to Gregorian ISO 8601"
    - calendar_expression: "Hebrew date"
      ch_hypernym: "TMP.DAT"
      normalization: "Convert to Gregorian ISO 8601"
    - calendar_expression: "Julian date"
      ch_hypernym: "TMP.DAT"
      normalization: "Convert to Gregorian ISO 8601"
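
# -----------------------------------------------------------------------------
# Illustrative Temporal Annotation (sketch)
# -----------------------------------------------------------------------------
# A minimal sketch of how the three extraction outputs listed under
# date_expression_patterns (original expression, calendar system, normalized
# ISO 8601 date) might be combined with a CH-Annotator hypernym for a single
# observed date. The values reuse the French Republican example given under
# supported_calendars.
# ASSUMPTION: the top-level key and every field name in this block are
# hypothetical illustrations; they are not defined by PiCo or CH-Annotator,
# and pairing this date with sdo:dateCreated is likewise an assumption.

example_temporal_annotation:
  original_expression: "14 Vendemiaire an IV"   # exact transcription from the source
  calendar: "french_republican"                  # id of an entry in supported_calendars
  normalized: "1795-10-06"                       # Gregorian ISO 8601 equivalent
  precision: "day"                               # day | month | year
  ch_hypernym: "TMP.DAT"                         # hypernym from temporal_hypernym_mapping
  pico_property: "sdo:dateCreated"               # hypothetical pairing for illustration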