id: https://nde.nl/ontology/hc/class/VideoSubtitle
name: video_subtitle_class
title: Video Subtitle Class
imports:
- linkml:types
- ./VideoTranscript
- ./VideoTimeSegment
- ../slots/has_average_entry_duration_seconds
- ../slots/default_position
- ../slots/entry_count
- ../slots/includes_music_description
- ../slots/includes_sound_description
- ../slots/includes_speaker_identification
- ../slots/is_auto_generated
- ../slots/is_closed_caption
- ../slots/is_sdh
- ../slots/raw_subtitle_content
- ../slots/specificity_annotation
- ../slots/subtitle_format
- ../slots/template_specificity
- ../slots/track_id
- ../slots/track_name
- ./SpecificityAnnotation
- ./TemplateSpecificityScores
prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  schema: http://schema.org/
  dcterms: http://purl.org/dc/terms/
  prov: http://www.w3.org/ns/prov#
  crm: http://www.cidoc-crm.org/cidoc-crm/
  skos: http://www.w3.org/2004/02/skos/core#
  ma: http://www.w3.org/ns/ma-ont#
default_prefix: hc
classes:
  VideoSubtitle:
    is_a: VideoTranscript
    class_uri: hc:VideoSubtitle
    abstract: false
    description: 'Time-coded caption/subtitle content for video.


      **DEFINITION**:


      VideoSubtitle represents caption/subtitle tracks that provide time-coded

      text synchronized with video playback. It extends VideoTranscript because

      subtitles contain complete transcription PLUS temporal synchronization.


      **INHERITANCE FROM VideoTranscript**:


      VideoSubtitle inherits all transcript capabilities:

      - `full_text`: Complete subtitle text concatenated

      - `has_or_had_segment`: Time-coded entries (REQUIRED for subtitles)

      - `includes_timestamp`: Always true for subtitles

      - `content_language`: Language of subtitle text

      - All provenance from VideoTextContent


      And adds subtitle-specific properties:

      - `subtitle_format`: SRT, VTT, TTML, SBV, ASS

      - `is_closed_caption`: CC vs regular subtitles

      - `is_sdh`: Subtitles for Deaf/Hard-of-Hearing

      - `includes_sound_description`: Non-speech audio descriptions


      **SCHEMA.ORG ALIGNMENT**:


      Maps to the `schema:caption` property, which Schema.org defines as:

      > "The caption for this object. For downloadable machine formats (closed

      > caption, subtitles etc.) use MediaObject and indicate the encodingFormat."


      **SUBTITLE vs CAPTION vs TRANSCRIPT**:


      | Type | Time-coded | Purpose | Audience |

      |------|------------|---------|----------|

      | Transcript | Optional | Reading, search | Everyone |

      | Subtitle | Required | Language translation | Hearing viewers |

      | Caption (CC) | Required | Accessibility | Deaf/HoH viewers |

      | SDH | Required | Full accessibility | Deaf viewers, noisy environments |


      **SDH (Subtitles for Deaf/Hard-of-Hearing)**:


      SDH differs from regular subtitles by including:

      - Speaker identification: "(John) Hello"

      - Sound effects: "[door slams]", "[music playing]"

      - Music descriptions: "♪ upbeat jazz ♪"

      - Emotional cues: "[laughing]", "[whispering]"


      **SUBTITLE FORMATS**:


      | Format | Extension | Features | Use Case |

      |--------|-----------|----------|----------|

      | SRT | .srt | Simple, universal | Most video players |

      | VTT | .vtt | W3C standard, styling | HTML5 video, web |

      | TTML | .ttml/.dfxp | XML, rich styling | Broadcast, streaming |

      | SBV | .sbv | YouTube native | YouTube uploads |

      | ASS | .ass | Advanced styling | Anime, complex layouts |


      **SRT FORMAT EXAMPLE**:


      ```

      1

      00:00:00,000 --> 00:00:03,500

      Welcome to the Rijksmuseum.


      2

      00:00:03,500 --> 00:00:08,200

      Today we''ll explore the Night Watch gallery.

      ```


      **VTT FORMAT EXAMPLE**:


      ```

      WEBVTT


      00:00:00.000 --> 00:00:03.500

      Welcome to the Rijksmuseum.


      00:00:03.500 --> 00:00:08.200

      Today we''ll explore the Night Watch gallery.

      ```


      **HERITAGE INSTITUTION CONTEXT**:


      Subtitles are critical for heritage video accessibility:


      1. **Accessibility Compliance**: WCAG 2.1, Section 508

      2. **Multilingual Access**: Translate for international audiences

      3. **Silent Viewing**: Social media, public displays, quiet spaces

      4. **Search Discovery**: Subtitle text is indexed by platforms

      5. **Preservation**: Text outlasts video format obsolescence


      **YOUTUBE API INTEGRATION**:


      Subtitle tracks from YouTube API populate:

      - `subtitle_format`: Typically VTT or SRT

      - `generation_method`: PLATFORM_PROVIDED or ASR_AUTOMATIC

      - `content_language`: From track language code

      - `is_auto_generated`: YouTube auto-caption flag


      **SEGMENTS ARE REQUIRED**:


      Unlike VideoTranscript where segments are optional, VideoSubtitle

      REQUIRES the `has_or_had_segment` slot to be populated with VideoTimeSegment

      entries that include start_seconds, end_seconds, and segment_text.

      '
    exact_mappings:
    - schema:caption
    close_mappings:
    - ma:CaptioningFormat
    related_mappings:
    - schema:transcript
    slots:
    - has_average_entry_duration_seconds
    - default_position
    - entry_count
    - includes_music_description
    - includes_sound_description
    - includes_speaker_identification
    - is_auto_generated
    - is_closed_caption
    - is_sdh
    - raw_subtitle_content
    - specificity_annotation
    - subtitle_format
    - template_specificity
    - track_id
    - track_name
    slot_usage:
      has_or_had_segment:
        required: true
        description: 'Time-coded subtitle entries as VideoTimeSegment objects.


          **REQUIRED for VideoSubtitle** (optional in parent VideoTranscript).


          Each segment represents one caption display unit:

          - start_seconds: When caption appears

          - end_seconds: When caption disappears

          - segment_text: Caption text content

          - segment_index: Order in subtitle track

          - confidence: For auto-generated captions


          Segments are ordered by start_seconds for proper playback.

          '
      includes_timestamp:
        ifabsent: 'true'
        description: 'Whether subtitle includes time markers.


          **Always true for VideoSubtitle** - time-coding is definitional.

          '
      subtitle_format:
        slot_uri: dcterms:format
        description: 'Subtitle file format.


          Dublin Core: format for resource format.


          Specifies the encoding format of the subtitle content.

          Affects parsing and rendering capabilities.

          '
        range: SubtitleFormatEnum
        required: true
        examples:
        - value: VTT
          description: WebVTT format (W3C standard)
        - value: SRT
          description: SubRip format (most common)
      raw_subtitle_content:
        slot_uri: hc:rawSubtitleContent
        description: 'Original subtitle file content as raw string.


          Preserves the complete subtitle file in its native format.

          Useful for:

          - Format conversion

          - Re-parsing with different tools

          - Archive preservation


          May be large - consider storing separately for large files.

          '
        range: string
        required: false
        examples:
        - value: 'WEBVTT


            00:00:00.000 --> 00:00:03.500

            Welcome to the museum.

            '
          description: Complete VTT file content
      is_closed_caption:
        slot_uri: hc:isClosedCaption
        description: 'Whether this is a closed caption track (CC).


          Closed captions differ from subtitles:

          - **CC (true)**: Designed for Deaf/HoH, includes non-speech audio

          - **Subtitles (false)**: Translation of dialogue only


          CC typically includes [MUSIC], [APPLAUSE], speaker ID, etc.

          '
        range: boolean
        required: false
        ifabsent: 'false'
        examples:
        - value: true
          description: This is a closed caption track
      is_sdh:
        slot_uri: hc:isSDH
        description: 'Whether these are Subtitles for Deaf/Hard-of-Hearing (SDH).


          SDH combines subtitle translation with CC-style annotations:

          - Dialogue translation (like subtitles)

          - Sound descriptions (like CC)

          - Speaker identification


          Typically marked "[SDH]" on streaming platforms.

          '
        range: boolean
        required: false
        ifabsent: 'false'
        examples:
        - value: true
          description: SDH subtitle track
      includes_sound_description:
        slot_uri: hc:includesSoundDescriptions
        description: 'Whether subtitle includes non-speech sound descriptions.


          Examples of sound descriptions:

          - [door slams]

          - [phone ringing]

          - [thunder]

          - [footsteps approaching]


          Characteristic of CC and SDH tracks.

          '
        range: boolean
        required: false
        ifabsent: 'false'
        examples:
        - value: true
          description: Contains sound effect descriptions
      includes_music_description:
        slot_uri: hc:includesMusicDescriptions
        description: 'Whether subtitle includes music/song descriptions.


          Examples:

          - ♪ upbeat jazz playing ♪

          - [classical music]

          - ♪ singing in Dutch ♪

          - [somber orchestral music]


          Important for heritage content with significant musical elements.

          '
        range: boolean
        required: false
        ifabsent: 'false'
        examples:
        - value: true
          description: Contains music descriptions
      includes_speaker_identification:
        slot_uri: hc:includesSpeakerIdentification
        description: 'Whether subtitle identifies speakers.


          Speaker identification patterns:

          - (John): Hello there.

          - NARRATOR: Welcome to the museum.

          - [Curator] This painting dates from 1642.


          Different from transcript speaker_id which is per-segment;

          this indicates whether the TEXT CONTENT includes labels.

          '
        range: boolean
        required: false
        ifabsent: 'false'
        examples:
        - value: true
          description: Subtitle text includes speaker labels
      is_auto_generated:
        slot_uri: hc:isAutoGenerated
        description: 'Whether subtitle was auto-generated by the platform.


          Distinct from generation_method (inherited from VideoTextContent):

          - `is_auto_generated`: Platform flag (YouTube, Vimeo)

          - `generation_method`: How WE know it was generated


          Auto-generated captions typically have lower accuracy.

          '
        range: boolean
        required: false
        ifabsent: 'false'
        examples:
        - value: true
          description: YouTube auto-generated caption
      track_name:
        slot_uri: schema:name
        description: 'Human-readable name of the subtitle track.


          Schema.org: name for track label.


          Examples from YouTube:

          - "English"

          - "English (auto-generated)"

          - "Dutch - Nederlands"

          - "English (United Kingdom)"

          '
        range: string
        required: false
        examples:
        - value: English (auto-generated)
          description: YouTube auto-caption track name
      track_id:
        slot_uri: dcterms:identifier
        description: 'Platform-specific identifier for this subtitle track.


          Dublin Core: identifier for unique ID.


          Used to fetch/update specific tracks via API.

          '
        range: string
        required: false
        examples:
        - value: en.3OWxR1w4QfE
          description: YouTube caption track ID
      default_position:
        slot_uri: hc:defaultPosition
        description: "Default display position for captions.\n\nFor formats that support positioning (VTT, TTML, ASS):\n-\
          \ BOTTOM: Default, below video content\n- TOP: Above video content\n- MIDDLE: Center of video\n\nMay be overridden\
          \ per-segment in advanced formats.\n"
        range: SubtitlePositionEnum
        required: false
        ifabsent: string(BOTTOM)
        examples:
        - value: BOTTOM
          description: Standard bottom caption position
      entry_count:
        slot_uri: hc:entryCount
        description: 'Number of subtitle entries (caption cues).


          Equals length of segments array.

          Useful for content sizing without loading full segments.

          '
        range: integer
        required: false
        minimum_value: 0
        examples:
        - value: 127
          description: 127 caption cues in this track
      has_average_entry_duration_seconds:
        slot_uri: hc:averageEntryDuration
        description: 'Average duration of subtitle entries in seconds.


          Typical ranges:

          - 2-4 seconds: Normal speech rate

          - < 2 seconds: Rapid dialogue

          - > 5 seconds: Slow narration or long displays


          Useful for quality assessment - very short or long entries

          may indicate timing issues.

          '
        range: float
        required: false
        minimum_value: 0.0
        examples:
        - value: 3.2
          description: Average 3.2 seconds per caption
      specificity_annotation:
        range: SpecificityAnnotation
        inlined: true
      template_specificity:
        range: TemplateSpecificityScores
        inlined: true
    rules:
    - postconditions:
        description: 'segments must be populated for VideoSubtitle.

          This is enforced by making segments required in slot_usage.

          '
    comments:
    - Time-coded caption/subtitle content
    - Extends VideoTranscript - subtitles ARE transcripts plus time codes
    - 'Supports multiple formats: SRT, VTT, TTML, SBV, ASS'
    - 'Accessibility metadata: CC, SDH, sound/music descriptions'
    - Critical for heritage video accessibility compliance
    see_also:
    - https://schema.org/caption
    - https://www.w3.org/TR/webvtt1/
    - https://developer.mozilla.org/en-US/docs/Web/API/WebVTT_API
    - https://www.3playmedia.com/learn/popular-topics/closed-captioning/
enums:
  SubtitleFormatEnum:
    description: 'Subtitle/caption file formats.


      Each format has different capabilities for timing precision,

      styling, positioning, and metadata.

      '
    permissible_values:
      SRT:
        description: 'SubRip subtitle format (.srt).

          Most widely supported format.

          Simple: sequence number, timecode, text.

          No styling or positioning support.

          '
      VTT:
        description: 'WebVTT format (.vtt).

          W3C standard for HTML5 video.

          Supports styling (CSS), positioning, cue settings.

          Recommended for web delivery.

          '
      TTML:
        description: 'Timed Text Markup Language (.ttml/.dfxp/.xml).

          W3C XML-based standard.

          Rich styling, regions, timing.

          Used in broadcast and streaming (Netflix, Amazon).

          '
      SBV:
        description: 'YouTube SubViewer format (.sbv).

          Simple format similar to SRT.

          Native YouTube caption format.

          '
      ASS:
        description: 'Advanced SubStation Alpha (.ass).

          Advanced styling, positioning, effects.

          Popular for anime subtitles.

          Includes SSA (.ssa) as predecessor.

          '
      STL:
        description: 'EBU STL format (.stl).

          European Broadcasting Union standard.

          Used in broadcast television.

          Binary format with teletext compatibility.

          '
      CAP:
        description: 'Scenarist Closed Caption (.scc/.cap).

          Used for broadcast closed captioning.

          CEA-608/CEA-708 compliant.

          '
      SAMI:
        description: 'Synchronized Accessible Media Interchange (.smi/.sami).

          Microsoft format for Windows Media.

          HTML-like markup with timing.

          '
      LRC:
        description: 'LRC lyrics format (.lrc).

          Simple format for song lyrics.

          Line-by-line timing, no duration.

          '
      JSON:
        description: 'JSON-based subtitle format.

          Used by some APIs (YouTube transcript API).

          Structure varies by source.

          '
      UNKNOWN:
        description: 'Unknown or unrecognized format.

          May require manual parsing or conversion.

          '
  SubtitlePositionEnum:
    description: 'Default caption display position on video.


      May be overridden by format-specific positioning (VTT, TTML, ASS).

      '
    permissible_values:
      BOTTOM:
        description: 'Bottom of video frame (standard position).

          Most common for subtitles and captions.

          Typically in lower 10-15% of frame.

          '
      TOP:
        description: 'Top of video frame.

          Used when bottom is occluded.

          Common for some broadcast formats.

          '
      MIDDLE:
        description: 'Center of video frame.

          Rarely used except for specific effects.

          '
      LEFT:
        description: 'Left side of frame (vertical text).

          Rare, used for specific languages/effects.

          '
      RIGHT:
        description: 'Right side of frame (vertical text).

          Rare, used for specific languages/effects.

          '
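The schema above requires each subtitle track to carry time-coded segments and offers `entry_count` and `has_average_entry_duration_seconds` as summary slots. The following is a minimal sketch, not part of the schema or its tooling, of how raw SRT or VTT text might be turned into such segments and summary values. The `Segment` dataclass, `parse_cues`, and `summarize` are hypothetical names introduced only for illustration; the field names mirror the segment properties listed in the `has_or_had_segment` description. It assumes timecodes always include an hours field, which WebVTT does not strictly require.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    """Hypothetical stand-in for one VideoTimeSegment entry."""
    segment_index: int
    start_seconds: float
    end_seconds: float
    segment_text: str


# Matches "HH:MM:SS,mmm --> HH:MM:SS,mmm" (SRT) and the "." variant used by VTT.
_TIMECODE = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _to_seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_cues(raw: str) -> List[Segment]:
    """Split raw subtitle text into cues; blocks without a timecode are skipped."""
    cues = []
    for block in re.split(r"\n\s*\n", raw.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        for i, line in enumerate(lines):
            match = _TIMECODE.match(line)
            if match:
                g = match.groups()
                text = "\n".join(lines[i + 1:])
                cues.append((_to_seconds(*g[:4]), _to_seconds(*g[4:]), text))
                break
    # Order by start time, then number the entries in playback order.
    cues.sort(key=lambda cue: cue[0])
    return [Segment(i, start, end, text) for i, (start, end, text) in enumerate(cues)]


def summarize(segments: List[Segment]) -> Dict[str, float]:
    """Derive the two summary slot values from a list of segments."""
    count = len(segments)
    total = sum(seg.end_seconds - seg.start_seconds for seg in segments)
    average = round(total / count, 3) if count else 0.0
    return {"entry_count": count, "has_average_entry_duration_seconds": average}
```

Because the timecode pattern accepts both `,` and `.` as the millisecond separator, the same helper handles the SRT and VTT examples from the class description; the `WEBVTT` header block contains no timecode and is skipped.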