# glam/schemas/20251121/linkml/modules/classes/VideoSubtitle.yaml

id: https://nde.nl/ontology/hc/class/VideoSubtitle
name: video_subtitle_class
title: Video Subtitle Class
imports:
- linkml:types
- ./VideoTranscript
- ./VideoTimeSegment
- ../slots/class_metadata_slots
prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  schema: http://schema.org/
  dcterms: http://purl.org/dc/terms/
  prov: http://www.w3.org/ns/prov#
  crm: http://www.cidoc-crm.org/cidoc-crm/
  skos: http://www.w3.org/2004/02/skos/core#
  ma: http://www.w3.org/ns/ma-ont#
default_prefix: hc
classes:
  VideoSubtitle:
    is_a: VideoTranscript
    class_uri: hc:VideoSubtitle
    abstract: false
    description: |
      Time-coded caption/subtitle content for video.

      **DEFINITION**:

      VideoSubtitle represents caption/subtitle tracks that provide time-coded
      text synchronized with video playback. It extends VideoTranscript because
      a subtitle track contains a complete transcription plus temporal synchronization.

      **INHERITANCE FROM VideoTranscript**:

      VideoSubtitle inherits all transcript capabilities:
      - `full_text`: Complete subtitle text concatenated
      - `segments`: Time-coded entries (REQUIRED for subtitles)
      - `includes_timestamps`: Always true for subtitles
      - `content_language`: Language of subtitle text
      - All provenance from VideoTextContent

      And adds subtitle-specific properties:
      - `subtitle_format`: SRT, VTT, TTML, SBV, ASS
      - `is_closed_caption`: CC vs regular subtitles
      - `is_sdh`: Subtitles for Deaf/Hard-of-Hearing
      - `includes_sound_descriptions`: Non-speech audio descriptions

      **SCHEMA.ORG ALIGNMENT**:

      Maps to the `schema:caption` property, which Schema.org describes as:
      > "The caption for this object. For downloadable machine formats
      >  (closed caption, subtitles etc.) use MediaObject and indicate the
      >  encodingFormat."

      **SUBTITLE vs CAPTION vs TRANSCRIPT**:

      | Type | Time-coded | Purpose | Audience |
      |------|------------|---------|----------|
      | Transcript | Optional | Reading, search | Everyone |
      | Subtitle | Required | Language translation | Hearing viewers |
      | Caption (CC) | Required | Accessibility | Deaf/HoH viewers |
      | SDH | Required | Full accessibility | Deaf/HoH viewers, noisy environments |

      **SDH (Subtitles for Deaf/Hard-of-Hearing)**:

      SDH differs from regular subtitles by including:
      - Speaker identification: "(John) Hello"
      - Sound effects: "[door slams]", "[music playing]"
      - Music descriptions: "♪ upbeat jazz ♪"
      - Emotional cues: "[laughing]", "[whispering]"
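
      As an illustration, an SDH track might set the accessibility flags as
      shown in this sketch. The cue text and timings are invented, and slots
      that parent classes may require are omitted:

      ```yaml
      subtitle_format: VTT
      is_sdh: true
      includes_speaker_identification: true
      includes_sound_descriptions: true
      includes_music_descriptions: true
      segments:
        - start_seconds: 12.0
          end_seconds: 14.5
          segment_text: "(John) Hello"
        - start_seconds: 14.5
          end_seconds: 16.0
          segment_text: "[door slams]"
        - start_seconds: 16.0
          end_seconds: 20.0
          segment_text: "♪ upbeat jazz ♪"
      ```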

      **SUBTITLE FORMATS**:

      | Format | Extension | Features | Use Case |
      |--------|-----------|----------|----------|
      | SRT | .srt | Simple, universal | Most video players |
      | VTT | .vtt | W3C standard, styling | HTML5 video, web |
      | TTML | .ttml/.dfxp | XML, rich styling | Broadcast, streaming |
      | SBV | .sbv | YouTube native | YouTube uploads |
      | ASS | .ass | Advanced styling | Anime, complex layouts |

      **SRT FORMAT EXAMPLE**:

      ```
      1
      00:00:00,000 --> 00:00:03,500
      Welcome to the Rijksmuseum.

      2
      00:00:03,500 --> 00:00:08,200
      Today we'll explore the Night Watch gallery.
      ```

      **VTT FORMAT EXAMPLE**:

      ```
      WEBVTT

      00:00:00.000 --> 00:00:03.500
      Welcome to the Rijksmuseum.

      00:00:03.500 --> 00:00:08.200
      Today we'll explore the Night Watch gallery.
      ```
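
      Turning such cues into `segments` amounts to converting each timecode to
      seconds (hours × 3600 + minutes × 60 + seconds). The two cues above might
      be represented as in this sketch, which assumes zero-based
      `segment_index` values (the indexing base is not specified here):

      ```yaml
      segments:
        - segment_index: 0
          start_seconds: 0.0      # 00:00:00.000
          end_seconds: 3.5        # 00:00:03.500
          segment_text: Welcome to the Rijksmuseum.
        - segment_index: 1
          start_seconds: 3.5      # 00:00:03.500
          end_seconds: 8.2        # 00:00:08.200
          segment_text: Today we'll explore the Night Watch gallery.
      ```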

      **HERITAGE INSTITUTION CONTEXT**:

      Subtitles are critical for heritage video accessibility:

      1. **Accessibility Compliance**: WCAG 2.1, Section 508
      2. **Multilingual Access**: Translate for international audiences
      3. **Silent Viewing**: Social media, public displays, quiet spaces
      4. **Search Discovery**: Subtitle text is indexed by platforms
      5. **Preservation**: Text outlasts video format obsolescence

      **YOUTUBE API INTEGRATION**:

      Subtitle tracks from YouTube API populate:
      - `subtitle_format`: Typically VTT or SRT
      - `generation_method`: PLATFORM_PROVIDED or ASR_AUTOMATIC
      - `content_language`: From track language code
      - `is_auto_generated`: YouTube auto-caption flag
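
      A sketch of how an auto-generated YouTube track might be recorded,
      reusing the example values given for these slots elsewhere in this
      schema. The language code `en` is an assumed illustration:

      ```yaml
      track_id: en.3OWxR1w4QfE
      track_name: English (auto-generated)
      subtitle_format: VTT
      content_language: en
      is_auto_generated: true
      generation_method: ASR_AUTOMATIC
      ```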

      **SEGMENTS ARE REQUIRED**:

      Unlike VideoTranscript where segments are optional, VideoSubtitle
      REQUIRES the `segments` slot to be populated with VideoTimeSegment
      entries that include start_seconds, end_seconds, and segment_text.
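
      For illustration, a minimal instance meeting this constraint could look
      like the sketch below. Slots inherited from parent classes may also be
      required; they are omitted because they are defined outside this file:

      ```yaml
      subtitle_format: SRT        # required via slot_usage
      segments:                   # required for VideoSubtitle
        - start_seconds: 0.0
          end_seconds: 3.5
          segment_text: Welcome to the Rijksmuseum.
      ```

      An instance carrying only `full_text` and no `segments` may be acceptable
      as a VideoTranscript but is expected to fail validation as a VideoSubtitle.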
    exact_mappings:
    - schema:caption
    close_mappings:
    - ma:CaptioningFormat
    related_mappings:
    - schema:transcript
    slots:
    - average_entry_duration_seconds
    - default_position
    - entry_count
    - includes_music_descriptions
    - includes_sound_descriptions
    - includes_speaker_identification
    - is_auto_generated
    - is_closed_caption
    - is_sdh
    - raw_subtitle_content
    - specificity_annotation
    - subtitle_format
    - template_specificity
    - track_id
    - track_name
    slot_usage:
      segments:
        required: true
        description: |
          Time-coded subtitle entries as VideoTimeSegment objects.

          **REQUIRED for VideoSubtitle** (optional in parent VideoTranscript).

          Each segment represents one caption display unit:
          - start_seconds: When caption appears
          - end_seconds: When caption disappears
          - segment_text: Caption text content
          - segment_index: Order in subtitle track
          - confidence: For auto-generated captions

          Segments are ordered by start_seconds for proper playback.
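
          A sketch of two entries from an auto-generated track. The
          `confidence` values are invented, and a 0-1 scale and zero-based
          `segment_index` are assumptions, not something this file defines:

          ```yaml
          segments:
            - segment_index: 0
              start_seconds: 0.0
              end_seconds: 3.5
              segment_text: Welcome to the museum.
              confidence: 0.94
            - segment_index: 1
              start_seconds: 3.5
              end_seconds: 8.2
              segment_text: Today we'll explore the gallery.
              confidence: 0.81
          ```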
      includes_timestamps:
        ifabsent: 'true'
        description: |
          Whether subtitle includes time markers.

          **Always true for VideoSubtitle** - time-coding is definitional.
      subtitle_format:
        slot_uri: dcterms:format
        description: |
          Subtitle file format.

          Dublin Core: format for resource format.

          Specifies the encoding format of the subtitle content.
          Affects parsing and rendering capabilities.
        range: SubtitleFormatEnum
        required: true
        examples:
        - value: VTT
          description: WebVTT format (W3C standard)
        - value: SRT
          description: SubRip format (most common)
      raw_subtitle_content:
        slot_uri: hc:rawSubtitleContent
        description: |
          Original subtitle file content as raw string.

          Preserves the complete subtitle file in its native format.
          Useful for:
          - Format conversion
          - Re-parsing with different tools
          - Archive preservation

          May be large - consider storing separately for large files.
        range: string
        required: false
        examples:
        - value: |
            WEBVTT

            00:00:00.000 --> 00:00:03.500
            Welcome to the museum.
          description: Complete VTT file content
      is_closed_caption:
        slot_uri: hc:isClosedCaption
        description: |
          Whether this is a closed caption track (CC).

          Closed captions differ from subtitles:
          - **CC (true)**: Designed for Deaf/HoH, includes non-speech audio
          - **Subtitles (false)**: Translation of dialogue only

          CC typically includes [MUSIC], [APPLAUSE], speaker ID, etc.
        range: boolean
        required: false
        ifabsent: 'false'
        examples:
        - value: true
          description: This is a closed caption track
      is_sdh:
        slot_uri: hc:isSDH
        description: |
          Whether these are Subtitles for Deaf/Hard-of-Hearing (SDH).

          SDH combines subtitle translation with CC-style annotations:
          - Dialogue translation (like subtitles)
          - Sound descriptions (like CC)
          - Speaker identification

          Typically marked "[SDH]" on streaming platforms.
        range: boolean
        required: false
        ifabsent: 'false'
        examples:
        - value: true
          description: SDH subtitle track
      includes_sound_descriptions:
        slot_uri: hc:includesSoundDescriptions
        description: |
          Whether subtitle includes non-speech sound descriptions.

          Examples of sound descriptions:
          - [door slams]
          - [phone ringing]
          - [thunder]
          - [footsteps approaching]

          Characteristic of CC and SDH tracks.
        range: boolean
        required: false
        ifabsent: 'false'
        examples:
        - value: true
          description: Contains sound effect descriptions
      includes_music_descriptions:
        slot_uri: hc:includesMusicDescriptions
        description: |
          Whether subtitle includes music/song descriptions.

          Examples:
          - ♪ upbeat jazz playing ♪
          - [classical music]
          - ♪ singing in Dutch ♪
          - [somber orchestral music]

          Important for heritage content with significant musical elements.
        range: boolean
        required: false
        ifabsent: 'false'
        examples:
        - value: true
          description: Contains music descriptions
      includes_speaker_identification:
        slot_uri: hc:includesSpeakerIdentification
        description: |
          Whether subtitle identifies speakers.

          Speaker identification patterns:
          - (John): Hello there.
          - NARRATOR: Welcome to the museum.
          - [Curator] This painting dates from 1642.

          This differs from the per-segment `speaker_id` on transcript
          segments: the flag indicates whether the cue text itself contains
          speaker labels.
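
          For example, a label can be part of the cue text while the speaker
          is also recorded as segment metadata. This sketch assumes
          `speaker_id` is a plain string on the segment:

          ```yaml
          includes_speaker_identification: true
          segments:
            - start_seconds: 0.0
              end_seconds: 3.5
              speaker_id: narrator                               # per-segment metadata
              segment_text: "NARRATOR: Welcome to the museum."   # label inside the text
          ```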
        range: boolean
        required: false
        ifabsent: 'false'
        examples:
        - value: true
          description: Subtitle text includes speaker labels
      is_auto_generated:
        slot_uri: hc:isAutoGenerated
        description: |
          Whether subtitle was auto-generated by the platform.

          Distinct from generation_method (inherited from VideoTextContent):
          - `is_auto_generated`: Flag reported by the platform (YouTube, Vimeo)
          - `generation_method`: Our own record of how the text was produced
            (e.g. PLATFORM_PROVIDED, ASR_AUTOMATIC)

          Auto-generated captions typically have lower accuracy.
        range: boolean
        required: false
        ifabsent: 'false'
        examples:
        - value: true
          description: YouTube auto-generated caption
      track_name:
        slot_uri: schema:name
        description: |
          Human-readable name of the subtitle track.

          Schema.org: name for track label.

          Examples from YouTube:
          - "English"
          - "English (auto-generated)"
          - "Dutch - Nederlands"
          - "English (United Kingdom)"
        range: string
        required: false
        examples:
        - value: English (auto-generated)
          description: YouTube auto-caption track name
      track_id:
        slot_uri: dcterms:identifier
        description: |
          Platform-specific identifier for this subtitle track.

          Dublin Core: identifier for unique ID.

          Used to fetch/update specific tracks via API.
        range: string
        required: false
        examples:
        - value: en.3OWxR1w4QfE
          description: YouTube caption track ID
      default_position:
        slot_uri: hc:defaultPosition
        description: |
          Default display position for captions.

          For formats that support positioning (VTT, TTML, ASS):
          - BOTTOM: Default, bottom of the video frame
          - TOP: Top of the video frame
          - MIDDLE: Center of the video frame

          May be overridden per-segment in advanced formats.
        range: SubtitlePositionEnum
        required: false
        ifabsent: string(BOTTOM)
        examples:
        - value: BOTTOM
          description: Standard bottom caption position
      entry_count:
        slot_uri: hc:entryCount
        description: |
          Number of subtitle entries (caption cues).

          Should equal the number of entries in `segments`.
          Useful for content sizing without loading full segments.
        range: integer
        required: false
        minimum_value: 0
        examples:
        - value: 127
          description: 127 caption cues in this track
      average_entry_duration_seconds:
        slot_uri: hc:averageEntryDuration
        description: |
          Average duration of subtitle entries in seconds.

          Typical ranges:
          - 2-4 seconds: Normal speech rate
          - < 2 seconds: Rapid dialogue
          - > 5 seconds: Slow narration or long displays

          Useful for quality assessment - very short or long entries
          may indicate timing issues.
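
          Worked example, assuming the value is the arithmetic mean of
          (end_seconds - start_seconds) over all segments:

          ```yaml
          segments:
            - start_seconds: 0.0
              end_seconds: 3.5      # duration 3.5
              segment_text: Welcome to the Rijksmuseum.
            - start_seconds: 3.5
              end_seconds: 8.2      # duration 4.7
              segment_text: Today we'll explore the Night Watch gallery.
          entry_count: 2
          average_entry_duration_seconds: 4.1   # (3.5 + 4.7) / 2
          ```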
        range: float
        required: false
        minimum_value: 0.0
        examples:
        - value: 3.2
          description: Average 3.2 seconds per caption
      specificity_annotation:
        range: SpecificityAnnotation
        inlined: true
      template_specificity:
        range: TemplateSpecificityScores
        inlined: true
    rules:
    - postconditions:
        description: |
          segments must be populated for VideoSubtitle.
          This is enforced by making segments required in slot_usage.
    comments:
    - Time-coded caption/subtitle content
    - Extends VideoTranscript - subtitles ARE transcripts plus time codes
    - 'Supports multiple formats: SRT, VTT, TTML, SBV, ASS'
    - 'Accessibility metadata: CC, SDH, sound/music descriptions'
    - Critical for heritage video accessibility compliance
    see_also:
    - https://schema.org/caption
    - https://www.w3.org/TR/webvtt1/
    - https://developer.mozilla.org/en-US/docs/Web/API/WebVTT_API
    - https://www.3playmedia.com/learn/popular-topics/closed-captioning/
enums:
  SubtitleFormatEnum:
    description: |
      Subtitle/caption file formats.

      Each format has different capabilities for timing precision,
      styling, positioning, and metadata.
    permissible_values:
      SRT:
        description: |
          SubRip subtitle format (.srt).
          Most widely supported format.
          Simple: sequence number, timecode, text.
          No styling or positioning support.
      VTT:
        description: |
          WebVTT format (.vtt).
          W3C standard for HTML5 video.
          Supports styling (CSS), positioning, cue settings.
          Recommended for web delivery.
      TTML:
        description: |
          Timed Text Markup Language (.ttml/.dfxp/.xml).
          W3C XML-based standard.
          Rich styling, regions, timing.
          Used in broadcast and streaming (Netflix, Amazon).
      SBV:
        description: |
          YouTube SubViewer format (.sbv).
          Simple format similar to SRT.
          Native YouTube caption format.
      ASS:
        description: |
          Advanced SubStation Alpha (.ass).
          Advanced styling, positioning, effects.
          Popular for anime subtitles.
          Includes SSA (.ssa) as predecessor.
      STL:
        description: |
          EBU STL format (.stl).
          European Broadcasting Union standard.
          Used in broadcast television.
          Binary format with teletext compatibility.
      CAP:
        description: |
          Scenarist Closed Caption (.scc/.cap).
          Used for broadcast closed captioning.
          CEA-608/CEA-708 compliant.
      SAMI:
        description: |
          Synchronized Accessible Media Interchange (.smi/.sami).
          Microsoft format for Windows Media.
          HTML-like markup with timing.
      LRC:
        description: |
          LRC lyrics format (.lrc).
          Simple format for song lyrics.
          Line-by-line timing, no duration.
      JSON:
        description: |
          JSON-based subtitle format.
          Used by some APIs (YouTube transcript API).
          Structure varies by source.
      UNKNOWN:
        description: |
          Unknown or unrecognized format.
          May require manual parsing or conversion.
  SubtitlePositionEnum:
    description: |
      Default caption display position on video.

      May be overridden by format-specific positioning (VTT, TTML, ASS).
    permissible_values:
      BOTTOM:
        description: |
          Bottom of video frame (standard position).
          Most common for subtitles and captions.
          Typically in lower 10-15% of frame.
      TOP:
        description: |
          Top of video frame.
          Used when bottom is occluded.
          Common for some broadcast formats.
      MIDDLE:
        description: |
          Center of video frame.
          Rarely used except for specific effects.
      LEFT:
        description: |
          Left side of frame (vertical text).
          Rare, used for specific languages/effects.
      RIGHT:
        description: |
          Right side of frame (vertical text).
          Rare, used for specific languages/effects.
slots:
  subtitle_format:
    description: Subtitle file format (SRT, VTT, TTML, etc.)
    range: SubtitleFormatEnum
  raw_subtitle_content:
    description: Original subtitle file content as raw string
    range: string
  is_closed_caption:
    description: Whether this is a closed caption (CC) track
    range: boolean
  is_sdh:
    description: Whether these are Subtitles for Deaf/Hard-of-Hearing
    range: boolean
  includes_sound_descriptions:
    description: Whether subtitle includes non-speech sound descriptions
    range: boolean
  includes_music_descriptions:
    description: Whether subtitle includes music descriptions
    range: boolean
  includes_speaker_identification:
    description: Whether subtitle text includes speaker labels
    range: boolean
  is_auto_generated:
    description: Whether subtitle was auto-generated by platform
    range: boolean
  track_name:
    description: Human-readable name of subtitle track
    range: string
  track_id:
    description: Platform-specific identifier for subtitle track
    range: string
  default_position:
    description: Default display position for captions
    range: SubtitlePositionEnum
  entry_count:
    description: Number of subtitle entries (caption cues)
    range: integer
  average_entry_duration_seconds:
    description: Average duration of subtitle entries in seconds
    range: float