glam/schemas/20251121/linkml/modules/classes/VideoSubtitle.yaml
kempersc 51554947a0 feat(schema): Add video content schema with comprehensive examples
Video Schema Classes (9 files):
- VideoPost, VideoComment: Social media video modeling
- VideoTextContent: Base class for text content extraction
- VideoTranscript, VideoSubtitle: Text with timing and formatting
- VideoTimeSegment: Time code handling with ISO 8601 duration
- VideoAnnotation: Base annotation with W3C Web Annotation alignment
- VideoAnnotationTypes: Scene, Object, OCR detection annotations
- VideoChapter, VideoChapterList: Navigation and chapter structure
- VideoAudioAnnotation: Speaker diarization, music, sound events

Enumerations (12 enums):
- VideoDefinitionEnum, LiveBroadcastStatusEnum
- TranscriptFormatEnum, SubtitleFormatEnum, SubtitlePositionEnum
- AnnotationTypeEnum, AnnotationMotivationEnum
- DetectionLevelEnum, SceneTypeEnum, TransitionTypeEnum, TextTypeEnum
- ChapterSourceEnum, AudioEventTypeEnum, SoundEventTypeEnum, MusicTypeEnum

Examples (904 lines, 10 comprehensive heritage-themed examples):
- Rijksmuseum virtual tour chapters (5 chapters with heritage entity refs)
- Operation Night Watch documentary chapters (5 chapters)
- VideoAudioAnnotation: curator interview, exhibition promo, museum lecture

All examples reference real heritage entities with Wikidata IDs:
Q5598 (Rembrandt), Q41264 (Vermeer), Q219831 (The Night Watch)
2025-12-16 20:03:17 +01:00


# Video Subtitle Class
# Time-coded caption/subtitle content extending VideoTranscript
#
# Part of Heritage Custodian Ontology v0.9.5
#
# HIERARCHY:
# E73_Information_Object (CIDOC-CRM)
#   │
#   └── VideoTextContent (abstract - provenance)
#         │
#         └── VideoTranscript (full text transcription)
#               │
#               └── VideoSubtitle (this class - time-coded captions)
#
# DESIGN RATIONALE:
# VideoSubtitle extends VideoTranscript because subtitles ARE transcripts
# with additional time-coding and display metadata:
#
# 1. A subtitle file (SRT, VTT) contains complete spoken content (transcript)
# 2. Plus precise start/end times for each caption
# 3. Plus display formatting (position, styling in some formats)
#
# You can always derive a plain transcript from subtitles by stripping times.
# This inheritance enables polymorphic handling: treat subtitles as transcripts
# when time-coding isn't needed.
#
# SUBTITLE FORMATS SUPPORTED:
# - SRT (SubRip): Most common, simple time + text
# - VTT (WebVTT): W3C standard, supports styling
# - TTML (DFXP): XML-based, broadcast standard
# - SBV (YouTube): YouTube's native format
# - ASS/SSA: Advanced styling, anime subtitles
id: https://nde.nl/ontology/hc/class/VideoSubtitle
name: video_subtitle_class
title: Video Subtitle Class
imports:
- linkml:types
- ./VideoTranscript
- ./VideoTimeSegment
prefixes:
linkml: https://w3id.org/linkml/
hc: https://nde.nl/ontology/hc/
schema: http://schema.org/
dcterms: http://purl.org/dc/terms/
prov: http://www.w3.org/ns/prov#
crm: http://www.cidoc-crm.org/cidoc-crm/
skos: http://www.w3.org/2004/02/skos/core#
ma: http://www.w3.org/ns/ma-ont#
default_prefix: hc
classes:
VideoSubtitle:
is_a: VideoTranscript
class_uri: hc:VideoSubtitle
abstract: false
description: |
Time-coded caption/subtitle content for video.
**DEFINITION**:
VideoSubtitle represents caption/subtitle tracks that provide time-coded
text synchronized with video playback. It extends VideoTranscript because
subtitles contain complete transcription PLUS temporal synchronization.
**INHERITANCE FROM VideoTranscript**:
VideoSubtitle inherits all transcript capabilities:
- `full_text`: Complete subtitle text concatenated
- `segments`: Time-coded entries (REQUIRED for subtitles)
- `includes_timestamps`: Always true for subtitles
- `content_language`: Language of subtitle text
- All provenance from VideoTextContent
And adds subtitle-specific properties:
- `subtitle_format`: SRT, VTT, TTML, SBV, ASS
- `is_closed_caption`: CC vs regular subtitles
- `is_sdh`: Subtitles for Deaf/Hard-of-Hearing
- `includes_sound_descriptions`: Non-speech audio descriptions
**SCHEMA.ORG ALIGNMENT**:
Maps to the `schema:caption` property, which Schema.org describes as:
> "The caption for this object. For downloadable machine formats
> (closed caption, subtitles etc.) use MediaObject and indicate the encodingFormat."
**SUBTITLE vs CAPTION vs TRANSCRIPT**:
| Type | Time-coded | Purpose | Audience |
|------|------------|---------|----------|
| Transcript | Optional | Reading, search | Everyone |
| Subtitle | Required | Language translation | Hearing viewers |
| Caption (CC) | Required | Accessibility | Deaf/HoH viewers |
| SDH | Required | Full accessibility | Deaf viewers, noisy environments |
**SDH (Subtitles for Deaf/Hard-of-Hearing)**:
SDH differs from regular subtitles by including:
- Speaker identification: "(John) Hello"
- Sound effects: "[door slams]", "[music playing]"
- Music descriptions: "♪ upbeat jazz ♪"
- Emotional cues: "[laughing]", "[whispering]"
**SUBTITLE FORMATS**:
| Format | Extension | Features | Use Case |
|--------|-----------|----------|----------|
| SRT | .srt | Simple, universal | Most video players |
| VTT | .vtt | W3C standard, styling | HTML5 video, web |
| TTML | .ttml/.dfxp | XML, rich styling | Broadcast, streaming |
| SBV | .sbv | YouTube native | YouTube uploads |
| ASS | .ass | Advanced styling | Anime, complex layouts |
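**FORMAT DETECTION SKETCH (ILLUSTRATIVE)**:
The extensions in the table above could drive the `subtitle_format` value
when ingesting files. The following is a minimal Python sketch under that
assumption. The helper name `guess_subtitle_format` and the fallback to
`UNKNOWN` are hypothetical and not part of any schema tooling.

```python
from pathlib import Path

# Extension -> SubtitleFormatEnum value, taken from the table above.
EXTENSION_TO_FORMAT = {
    ".srt": "SRT",
    ".vtt": "VTT",
    ".ttml": "TTML",
    ".dfxp": "TTML",
    ".sbv": "SBV",
    ".ass": "ASS",
}


def guess_subtitle_format(filename: str) -> str:
    """Return a SubtitleFormatEnum value for a file name (hypothetical helper)."""
    suffix = Path(filename).suffix.lower()
    # Anything not listed in the table falls back to UNKNOWN (an assumption).
    return EXTENSION_TO_FORMAT.get(suffix, "UNKNOWN")
```

Matching on the extension alone is a heuristic; a production ingest would
probably also inspect the file content.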
**SRT FORMAT EXAMPLE**:
```
1
00:00:00,000 --> 00:00:03,500
Welcome to the Rijksmuseum.

2
00:00:03,500 --> 00:00:08,200
Today we'll explore the Night Watch gallery.
```
**VTT FORMAT EXAMPLE**:
```
WEBVTT

00:00:00.000 --> 00:00:03.500
Welcome to the Rijksmuseum.

00:00:03.500 --> 00:00:08.200
Today we'll explore the Night Watch gallery.
```
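**PARSING SKETCH (ILLUSTRATIVE)**:
Cues in the SRT and VTT layouts shown above could be turned into segment
records roughly as follows. This is a sketch, not a complete parser. It
assumes cues are separated by blank lines, that timestamps use the full
`HH:MM:SS` form, and that `segment_index` is 0-based. The function names are
hypothetical. The dictionary keys mirror the segment fields named in this
schema.

```python
import re

# Matches "HH:MM:SS,mmm --> HH:MM:SS,mmm" (SRT) and the "." variant used by VTT.
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_seconds(h: str, m: str, s: str, ms: str) -> float:
    """Convert timestamp components to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_cues(text: str) -> list[dict]:
    """Parse SRT or VTT text into segment dicts (illustrative only)."""
    segments = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        match = TIMING.match(line)
        if match:
            parts = match.groups()
            current = {
                "segment_index": len(segments),  # 0-based (assumption)
                "start_seconds": to_seconds(*parts[:4]),
                "end_seconds": to_seconds(*parts[4:]),
                "segment_text": "",
            }
            segments.append(current)
        elif not line:
            current = None  # a blank line closes the current cue
        elif current is not None:
            # Join multi-line cue text with a single space.
            current["segment_text"] = (current["segment_text"] + " " + line).strip()
        # Lines outside a cue (SRT counters, the WEBVTT header) are ignored.
    return segments
```

Cue settings, styling blocks, and VTT timestamps without an hour field are
not handled here.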
**HERITAGE INSTITUTION CONTEXT**:
Subtitles are critical for heritage video accessibility:
1. **Accessibility Compliance**: WCAG 2.1, Section 508
2. **Multilingual Access**: Translate for international audiences
3. **Silent Viewing**: Social media, public displays, quiet spaces
4. **Search Discovery**: Subtitle text is indexed by platforms
5. **Preservation**: Text outlasts video format obsolescence
**YOUTUBE API INTEGRATION**:
Subtitle tracks from YouTube API populate:
- `subtitle_format`: Typically VTT or SRT
- `generation_method`: PLATFORM_PROVIDED or ASR_AUTOMATIC
- `content_language`: From track language code
- `is_auto_generated`: YouTube auto-caption flag
**SEGMENTS ARE REQUIRED**:
Unlike VideoTranscript where segments are optional, VideoSubtitle
REQUIRES the `segments` slot to be populated with VideoTimeSegment
entries that include start_seconds, end_seconds, and segment_text.
exact_mappings:
- schema:caption
close_mappings:
- ma:CaptioningFormat
related_mappings:
- schema:transcript
slots:
# Subtitle-specific format
- subtitle_format
- raw_subtitle_content
# Accessibility metadata
- is_closed_caption
- is_sdh
- includes_sound_descriptions
- includes_music_descriptions
- includes_speaker_identification
# Source/generation info
- is_auto_generated
- track_name
- track_id
# Positioning (for formats that support it)
- default_position
# Entry counts
- entry_count
- average_entry_duration_seconds
slot_usage:
# Override segments to be required for subtitles
segments:
required: true
description: |
Time-coded subtitle entries as VideoTimeSegment objects.
**REQUIRED for VideoSubtitle** (optional in parent VideoTranscript).
Each segment represents one caption display unit:
- start_seconds: When caption appears
- end_seconds: When caption disappears
- segment_text: Caption text content
- segment_index: Order in subtitle track
- confidence: For auto-generated captions
Segments are ordered by start_seconds for proper playback.
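**ORDERING SKETCH (ILLUSTRATIVE)**:
The requirement that segments are present and ordered by start_seconds
could be checked before serialization along these lines. This is a sketch
only. The function name `normalize_segments`, the 0-based renumbering of
`segment_index`, and the rejection of segments that end before they start
are assumptions, not rules stated by the schema.

```python
def normalize_segments(segments: list[dict]) -> list[dict]:
    """Sort by start_seconds, renumber segment_index, reject empty input."""
    if not segments:
        raise ValueError("VideoSubtitle requires at least one segment")
    ordered = sorted(segments, key=lambda seg: seg["start_seconds"])
    result = []
    for index, seg in enumerate(ordered):
        if seg["end_seconds"] < seg["start_seconds"]:
            raise ValueError(f"segment ends before it starts: {seg!r}")
        # Copy each segment so the caller's data is left untouched.
        result.append({**seg, "segment_index": index})
    return result
```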
# Override includes_timestamps to default true
includes_timestamps:
ifabsent: "true"
description: |
Whether subtitle includes time markers.
**Always true for VideoSubtitle** - time-coding is definitional.
subtitle_format:
slot_uri: dcterms:format
description: |
Subtitle file format.
Dublin Core: `dcterms:format` records the file format of the resource.
Specifies the encoding format of the subtitle content,
which determines parsing and rendering capabilities.
range: SubtitleFormatEnum
required: true
examples:
- value: "VTT"
description: "WebVTT format (W3C standard)"
- value: "SRT"
description: "SubRip format (most common)"
raw_subtitle_content:
slot_uri: hc:rawSubtitleContent
description: |
Original subtitle file content as raw string.
Preserves the complete subtitle file in its native format.
Useful for:
- Format conversion
- Re-parsing with different tools
- Archive preservation
May be large - consider storing separately for large files.
range: string
required: false
examples:
- value: |
WEBVTT

00:00:00.000 --> 00:00:03.500
Welcome to the museum.
description: "Complete VTT file content"
is_closed_caption:
slot_uri: hc:isClosedCaption
description: |
Whether this is a closed caption track (CC).
Closed captions differ from subtitles:
- **CC (true)**: Designed for Deaf/HoH, includes non-speech audio
- **Subtitles (false)**: Translation of dialogue only
CC typically includes [MUSIC], [APPLAUSE], speaker ID, etc.
range: boolean
required: false
ifabsent: "false"
examples:
- value: true
description: "This is a closed caption track"
is_sdh:
slot_uri: hc:isSDH
description: |
Whether these are Subtitles for Deaf/Hard-of-Hearing (SDH).
SDH combines subtitle translation with CC-style annotations:
- Dialogue translation (like subtitles)
- Sound descriptions (like CC)
- Speaker identification
Typically marked "[SDH]" on streaming platforms.
range: boolean
required: false
ifabsent: "false"
examples:
- value: true
description: "SDH subtitle track"
includes_sound_descriptions:
slot_uri: hc:includesSoundDescriptions
description: |
Whether subtitle includes non-speech sound descriptions.
Examples of sound descriptions:
- [door slams]
- [phone ringing]
- [thunder]
- [footsteps approaching]
Characteristic of CC and SDH tracks.
range: boolean
required: false
ifabsent: "false"
examples:
- value: true
description: "Contains sound effect descriptions"
includes_music_descriptions:
slot_uri: hc:includesMusicDescriptions
description: |
Whether subtitle includes music/song descriptions.
Examples:
- ♪ upbeat jazz playing ♪
- [classical music]
- ♪ singing in Dutch ♪
- [somber orchestral music]
Important for heritage content with significant musical elements.
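**DETECTION SKETCH (ILLUSTRATIVE)**:
The bracketed and ♪ patterns listed for this slot and for
`includes_sound_descriptions` suggest a simple heuristic for setting both
flags from caption text. The sketch below assumes that any bracketed label
containing the word "music", or any cue containing ♪, is a music
description, and that every other bracketed label is a sound description.
The function name `describe_flags` is hypothetical.

```python
import re

# Text enclosed in square brackets, e.g. "[door slams]".
BRACKETED = re.compile(r"\[([^\]]+)\]")


def describe_flags(segment_texts: list[str]) -> dict:
    """Guess the two description flags from caption text (heuristic)."""
    sound = False
    music = False
    for text in segment_texts:
        if "♪" in text:
            music = True
        for label in BRACKETED.findall(text):
            if "music" in label.lower():
                music = True
            else:
                sound = True
    return {
        "includes_sound_descriptions": sound,
        "includes_music_descriptions": music,
    }
```

A bracketed speaker label such as "[Curator]" would be counted as a sound
description by this heuristic, so real tracks need a stricter rule or
manual review.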
range: boolean
required: false
ifabsent: "false"
examples:
- value: true
description: "Contains music descriptions"
includes_speaker_identification:
slot_uri: hc:includesSpeakerIdentification
description: |
Whether subtitle identifies speakers.
Speaker identification patterns:
- (John): Hello there.
- NARRATOR: Welcome to the museum.
- [Curator] This painting dates from 1642.
Different from transcript speaker_id which is per-segment;
this indicates whether the TEXT CONTENT includes labels.
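**LABEL DETECTION SKETCH (ILLUSTRATIVE)**:
The three label patterns above can be recognized with one regular
expression. The sketch below is an assumption about how a consumer might
split a label from the spoken text. The names `SPEAKER_LABEL` and
`split_speaker` are hypothetical, and the upper-case rule for the
"NARRATOR:" style is a guess that will miss mixed-case names.

```python
import re
from typing import Optional

# (John): text | [Curator] text | NARRATOR: text
SPEAKER_LABEL = re.compile(
    r"^\s*(?:\(([^)]+)\):?|\[([^\]]+)\]|([A-Z][A-Z .'-]*):)\s+(\S.*)$"
)


def split_speaker(text: str) -> tuple[Optional[str], str]:
    """Return (speaker, remaining_text); speaker is None when no label is found."""
    match = SPEAKER_LABEL.match(text)
    if not match:
        return None, text
    # Exactly one of the first three groups matched.
    speaker = next(g for g in match.groups()[:3] if g)
    return speaker.strip(), match.group(4)
```

Because the pattern requires text after the label, a cue that consists only
of "[door slams]" is not mistaken for a speaker label.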
range: boolean
required: false
ifabsent: "false"
examples:
- value: true
description: "Subtitle text includes speaker labels"
is_auto_generated:
slot_uri: hc:isAutoGenerated
description: |
Whether subtitle was auto-generated by the platform.
Distinct from generation_method (inherited from VideoTextContent):
- `is_auto_generated`: Flag reported by the platform (YouTube, Vimeo)
- `generation_method`: Our own provenance record of how the text was produced
Auto-generated captions typically have lower accuracy.
range: boolean
required: false
ifabsent: "false"
examples:
- value: true
description: "YouTube auto-generated caption"
track_name:
slot_uri: schema:name
description: |
Human-readable name of the subtitle track.
Schema.org: `schema:name` holds the track label.
Examples from YouTube:
- "English"
- "English (auto-generated)"
- "Dutch - Nederlands"
- "English (United Kingdom)"
range: string
required: false
examples:
- value: "English (auto-generated)"
description: "YouTube auto-caption track name"
track_id:
slot_uri: dcterms:identifier
description: |
Platform-specific identifier for this subtitle track.
Dublin Core: `dcterms:identifier` holds the unique track ID.
Used to fetch/update specific tracks via API.
range: string
required: false
examples:
- value: "en.3OWxR1w4QfE"
description: "YouTube caption track ID"
default_position:
slot_uri: hc:defaultPosition
description: |
Default display position for captions.
For formats that support positioning (VTT, TTML, ASS):
- BOTTOM: Default, below video content
- TOP: Above video content
- MIDDLE: Center of video
May be overridden per-segment in advanced formats.
range: SubtitlePositionEnum
required: false
ifabsent: "string(BOTTOM)"
examples:
- value: "BOTTOM"
description: "Standard bottom caption position"
entry_count:
slot_uri: hc:entryCount
description: |
Number of subtitle entries (caption cues).
Equals length of segments array.
Useful for content sizing without loading full segments.
range: integer
required: false
minimum_value: 0
examples:
- value: 127
description: "127 caption cues in this track"
average_entry_duration_seconds:
slot_uri: hc:averageEntryDuration
description: |
Average duration of subtitle entries in seconds.
Typical ranges:
- 2-4 seconds: Normal speech rate
- < 2 seconds: Rapid dialogue
- > 5 seconds: Slow narration or long displays
Useful for quality assessment - very short or long entries
may indicate timing issues.
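**COMPUTATION SKETCH (ILLUSTRATIVE)**:
Both `entry_count` and this slot can be derived from the segments. The
sketch below assumes segments are dictionaries with `start_seconds` and
`end_seconds` keys. The function name `entry_stats`, the rounding to three
decimals, and the 0.0 average for an empty list are assumptions.

```python
def entry_stats(segments: list[dict]) -> dict:
    """Derive entry_count and average_entry_duration_seconds from segments."""
    durations = [seg["end_seconds"] - seg["start_seconds"] for seg in segments]
    count = len(durations)
    # Avoid division by zero for an empty track (assumed to average 0.0).
    average = sum(durations) / count if count else 0.0
    return {
        "entry_count": count,
        "average_entry_duration_seconds": round(average, 3),
    }
```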
range: float
required: false
minimum_value: 0.0
examples:
- value: 3.2
description: "Average 3.2 seconds per caption"
rules:
- postconditions:
description: |
segments must be populated for VideoSubtitle.
This is enforced by making segments required in slot_usage.
comments:
- "Time-coded caption/subtitle content"
- "Extends VideoTranscript - subtitles ARE transcripts plus time codes"
- "Supports multiple formats: SRT, VTT, TTML, SBV, ASS"
- "Accessibility metadata: CC, SDH, sound/music descriptions"
- "Critical for heritage video accessibility compliance"
see_also:
- "https://schema.org/caption"
- "https://www.w3.org/TR/webvtt1/"
- "https://developer.mozilla.org/en-US/docs/Web/API/WebVTT_API"
- "https://www.3playmedia.com/learn/popular-topics/closed-captioning/"
# ============================================================================
# Enumerations
# ============================================================================
enums:
SubtitleFormatEnum:
description: |
Subtitle/caption file formats.
Each format has different capabilities for timing precision,
styling, positioning, and metadata.
permissible_values:
SRT:
description: |
SubRip subtitle format (.srt).
Most widely supported format.
Simple: sequence number, timecode, text.
No styling or positioning support.
VTT:
description: |
WebVTT format (.vtt).
W3C standard for HTML5 video.
Supports styling (CSS), positioning, cue settings.
Recommended for web delivery.
TTML:
description: |
Timed Text Markup Language (.ttml/.dfxp/.xml).
W3C XML-based standard.
Rich styling, regions, timing.
Used in broadcast and streaming (Netflix, Amazon).
SBV:
description: |
YouTube SubViewer format (.sbv).
Simple format similar to SRT.
Native YouTube caption format.
ASS:
description: |
Advanced SubStation Alpha (.ass).
Advanced styling, positioning, effects.
Popular for anime subtitles.
Includes SSA (.ssa) as predecessor.
STL:
description: |
EBU STL format (.stl).
European Broadcasting Union standard.
Used in broadcast television.
Binary format with teletext compatibility.
CAP:
description: |
Scenarist Closed Caption (.scc/.cap).
Used for broadcast closed captioning.
CEA-608/CEA-708 compliant.
SAMI:
description: |
Synchronized Accessible Media Interchange (.smi/.sami).
Microsoft format for Windows Media.
HTML-like markup with timing.
LRC:
description: |
LRC lyrics format (.lrc).
Simple format for song lyrics.
Line-by-line timing, no duration.
JSON:
description: |
JSON-based subtitle format.
Used by some APIs (YouTube transcript API).
Structure varies by source.
UNKNOWN:
description: |
Unknown or unrecognized format.
May require manual parsing or conversion.
SubtitlePositionEnum:
description: |
Default caption display position on video.
May be overridden by format-specific positioning (VTT, TTML, ASS).
permissible_values:
BOTTOM:
description: |
Bottom of video frame (standard position).
Most common for subtitles and captions.
Typically in lower 10-15% of frame.
TOP:
description: |
Top of video frame.
Used when bottom is occluded.
Common for some broadcast formats.
MIDDLE:
description: |
Center of video frame.
Rarely used except for specific effects.
LEFT:
description: |
Left side of frame (vertical text).
Rare, used for specific languages/effects.
RIGHT:
description: |
Right side of frame (vertical text).
Rare, used for specific languages/effects.
# ============================================================================
# Slot Definitions
# ============================================================================
slots:
subtitle_format:
description: Subtitle file format (SRT, VTT, TTML, etc.)
range: SubtitleFormatEnum
raw_subtitle_content:
description: Original subtitle file content as raw string
range: string
is_closed_caption:
description: Whether this is a closed caption (CC) track
range: boolean
is_sdh:
description: Whether these are Subtitles for Deaf/Hard-of-Hearing
range: boolean
includes_sound_descriptions:
description: Whether subtitle includes non-speech sound descriptions
range: boolean
includes_music_descriptions:
description: Whether subtitle includes music descriptions
range: boolean
includes_speaker_identification:
description: Whether subtitle text includes speaker labels
range: boolean
is_auto_generated:
description: Whether subtitle was auto-generated by platform
range: boolean
track_name:
description: Human-readable name of subtitle track
range: string
track_id:
description: Platform-specific identifier for subtitle track
range: string
default_position:
description: Default display position for captions
range: SubtitlePositionEnum
entry_count:
description: Number of subtitle entries (caption cues)
range: integer
average_entry_duration_seconds:
description: Average duration of subtitle entries in seconds
range: float