feat(schema): Add video content schema with comprehensive examples
Video Schema Classes (9 files):
- VideoPost, VideoComment: Social media video modeling
- VideoTextContent: Base class for text content extraction
- VideoTranscript, VideoSubtitle: Text with timing and formatting
- VideoTimeSegment: Time code handling with ISO 8601 durations
- VideoAnnotation: Base annotation with W3C Web Annotation alignment
- VideoAnnotationTypes: Scene, object, and OCR detection annotations
- VideoChapter, VideoChapterList: Navigation and chapter structure
- VideoAudioAnnotation: Speaker diarization, music, sound events

Enumerations (15 enums):
- VideoDefinitionEnum, LiveBroadcastStatusEnum
- TranscriptFormatEnum, SubtitleFormatEnum, SubtitlePositionEnum
- AnnotationTypeEnum, AnnotationMotivationEnum
- DetectionLevelEnum, SceneTypeEnum, TransitionTypeEnum, TextTypeEnum
- ChapterSourceEnum, AudioEventTypeEnum, SoundEventTypeEnum, MusicTypeEnum

Examples (904 lines, 10 comprehensive heritage-themed examples):
- Rijksmuseum virtual tour chapters (5 chapters with heritage entity refs)
- Operation Night Watch documentary chapters (5 chapters)
- VideoAudioAnnotation: curator interview, exhibition promo, museum lecture

All examples reference real heritage entities with Wikidata IDs:
Q5598 (Rembrandt), Q41264 (Vermeer), Q219831 (The Night Watch)
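The `duration` and chapter-boundary slots in these examples use ISO 8601 time durations (`PT15M42S`, `PT45M30S`). A minimal sketch of a decoder covering only the `PTnHnMnS` forms that appear in the examples — the helper name is illustrative, not part of the schema, and a production consumer would likely use a full ISO 8601 library:

```python
import re

# Matches the "PTnHnMnS" subset of ISO 8601 durations used in the examples,
# allowing fractional seconds (e.g. "PT8.5S").
_DURATION_RE = re.compile(
    r"^PT(?:(?P<h>\d+)H)?(?:(?P<m>\d+)M)?(?:(?P<s>\d+(?:\.\d+)?)S)?$"
)


def duration_to_seconds(value: str) -> float:
    """Convert an ISO 8601 time duration like "PT15M42S" to seconds."""
    match = _DURATION_RE.match(value)
    if not match or value == "PT":
        raise ValueError(f"unsupported ISO 8601 duration: {value!r}")
    hours = int(match.group("h") or 0)
    minutes = int(match.group("m") or 0)
    seconds = float(match.group("s") or 0.0)
    return hours * 3600 + minutes * 60 + seconds
```

Applied to the example data, `"PT15M42S"` yields 942 seconds, matching the `video_duration_seconds: 942.0` recorded for the virtual tour.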
This commit is contained in:
parent
b0416efc7d
commit
51554947a0
10 changed files with 7250 additions and 0 deletions
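The subtitle cues and frame-level annotations in the added file use `HH:MM:SS.mmm` timecodes alongside the ISO 8601 durations. A small conversion helper (the name is illustrative, not defined by the schema):

```python
def timecode_to_seconds(timecode: str) -> float:
    """Convert an "HH:MM:SS.mmm" timecode, as used in subtitle_entries and
    time_segment blocks, to seconds."""
    hours, minutes, seconds = timecode.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)
```

For instance, the first cue boundary `"00:00:04.500"` converts to 4.5 seconds, and the Night Watch segment end `"00:15:42.000"` to 942 seconds.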
904
schemas/20251121/linkml/examples/video_content_examples.yaml
Normal file
@@ -0,0 +1,904 @@
# Video Content Examples
# Instance data demonstrating video schema classes for heritage institutions
# Covers: VideoPost, VideoComment, VideoTranscript, VideoSubtitle, VideoAnnotation types
#
# Part of Heritage Custodian Ontology v0.9.10
#
# HERITAGE INSTITUTION VIDEO USE CASES:
# - Virtual museum tours
# - Conservation documentation
# - Curator interviews
# - Collection spotlights
# - Educational content
# - Live event recordings

# ============================================================================
# EXAMPLE 1: Museum Virtual Tour Video
# Complete VideoPost with transcript, subtitles, and scene annotations
# ============================================================================

video_posts:

  - post_id: "https://nde.nl/ontology/hc/video/nl/rijksmuseum-gallery-honour"
    platform_type: YOUTUBE
    platform_id: "UCo2sQFl0mV4K2v6D4d8Z9bQ"
    platform_post_id: "dQw4w9WgXcQ"
    post_url: "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
    post_title: "The Gallery of Honour - Rijksmuseum Virtual Tour"
    post_description: |
      Take a virtual walk through the famous Gallery of Honour at the Rijksmuseum
      in Amsterdam. This corridor displays masterpieces of Dutch Golden Age painting,
      culminating in Rembrandt's Night Watch. Our curator guides you through the
      history and significance of these iconic works.

    # Video technical properties
    duration: "PT15M42S"
    definition: hd
    aspect_ratio: "16:9"
    frame_rate: 30.0

    # Caption and language
    caption_available: true
    default_language: "nl"
    default_audio_language: "nl"
    available_caption_languages:
      - "nl"
      - "en"
      - "de"
      - "fr"
      - "zh"

    # Engagement metrics (observational)
    view_count: 125847
    like_count: 3421
    dislike_count: 42
    comment_count: 287
    favorite_count: 892
    metrics_observed_at: "2025-12-15T10:30:00Z"

    # Platform-specific
    video_category_id: "27"  # Education
    live_broadcast_content: none
    is_licensed_content: false
    is_embeddable: true
    is_made_for_kids: false

    # Publishing info (inherited from SocialMediaPost)
    published_at: "2023-03-15T14:00:00Z"
    last_updated_at: "2023-03-15T14:00:00Z"

    # Comments
    comments_fetched: 50
    video_comments:
      - comment_id: "Ugw3x9K2mL8f7nPqR1"
        comment_author: "ArtHistoryFan"
        comment_author_channel_id: "UC7f8n2p3m4x5L6qR7sT8vW"
        comment_text: "This virtual tour is amazing! I visited last year and seeing it again brings back wonderful memories. The Night Watch looks even more spectacular in 4K."
        comment_published_at: "2023-03-16T09:22:15Z"
        comment_like_count: 45
        comment_reply_count: 3
        comment_replies:
          - comment_id: "Ugw3x9K2mL8f7nPqR1.8nRq"
            comment_author: "Rijksmuseum"
            comment_author_channel_id: "UCo2sQFl0mV4K2v6D4d8Z9bQ"
            comment_text: "Thank you for visiting and for your kind words! We hope to see you again soon."
            comment_published_at: "2023-03-16T11:45:30Z"
            comment_like_count: 12
            comment_reply_count: 0

      - comment_id: "Ugw5y7T4nM9g8oPsS2"
        comment_author: "DutchHeritageExplorer"
        comment_author_channel_id: "UC9g0n3p4m5x6L7qR8sT9vX"
        comment_text: "Great explanation of the Vermeer paintings! Would love to see more content about the restoration process."
        comment_published_at: "2023-03-17T16:33:45Z"
        comment_like_count: 28
        comment_reply_count: 1

# ============================================================================
# EXAMPLE 2: Video Transcript (Full Text)
# ============================================================================

video_transcripts:

  - content_id: "https://nde.nl/ontology/hc/transcript/nl/rijksmuseum-gallery-honour-full"
    source_video_url: "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
    content_language: "nl"

    full_transcript: |
      Welkom in de Eregalerij van het Rijksmuseum. Deze iconische gang is het hart
      van het museum en herbergt de grootste meesterwerken uit de Gouden Eeuw.

      We beginnen onze wandeling bij de ingang, waar we direct worden begroet door
      Frans Hals' portret van Isaac Massa en Beatrix van der Laen. Dit schilderij
      uit 1622 toont de levendige penseelstreek waarmee Hals bekend staat.

      Verderop zien we werken van Jan Steen, bekend om zijn humoristische taferelen
      van het dagelijks leven. Zijn schilderij "De vrolijke huishouding" illustreert
      het Nederlandse spreekwoord "een huishouden van Jan Steen."

      Aan het einde van de galerie staat het beroemdste schilderij van Nederland:
      De Nachtwacht van Rembrandt. Dit monumentale werk uit 1642 toont de
      schutterij van kapitein Frans Banninck Cocq in actie.

    word_count: 142
    generation_method: AUTOMATIC
    generation_model: "whisper-large-v3"
    generation_confidence: 0.94
    manual_corrections: true

    # Provenance
    generated_by: "OpenAI Whisper"
    generation_timestamp: "2025-12-01T08:15:00Z"
    reviewed_by: "Rijksmuseum Digital Team"
    review_timestamp: "2025-12-02T14:30:00Z"

    transcript_format: PLAIN_TEXT

# ============================================================================
# EXAMPLE 3: Video Subtitles (Time-Coded)
# ============================================================================

video_subtitles:

  - content_id: "https://nde.nl/ontology/hc/subtitle/nl/rijksmuseum-gallery-honour-en"
    source_video_url: "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
    content_language: "en"

    subtitle_format: VTT
    total_cues: 45

    subtitle_entries:
      - sequence_number: 1
        start_time: "00:00:00.000"
        end_time: "00:00:04.500"
        text: "Welcome to the Gallery of Honour at the Rijksmuseum."
        speaker_label: "Curator"

      - sequence_number: 2
        start_time: "00:00:04.500"
        end_time: "00:00:09.200"
        text: "This iconic corridor is the heart of the museum"
        speaker_label: "Curator"

      - sequence_number: 3
        start_time: "00:00:09.200"
        end_time: "00:00:14.800"
        text: "and houses the greatest masterpieces from the Golden Age."
        speaker_label: "Curator"

      - sequence_number: 4
        start_time: "00:00:14.800"
        end_time: "00:00:20.500"
        text: "We begin our walk at the entrance, where we are immediately greeted"
        speaker_label: "Curator"

      - sequence_number: 5
        start_time: "00:00:20.500"
        end_time: "00:00:27.000"
        text: "by Frans Hals' portrait of Isaac Massa and Beatrix van der Laen."
        speaker_label: "Curator"

    is_closed_captions: false
    is_sdh: false

    generation_method: HUMAN
    reviewed_by: "Rijksmuseum Translation Team"
    review_timestamp: "2023-03-10T16:00:00Z"

# ============================================================================
# EXAMPLE 4: Scene Annotations (Computer Vision)
# ============================================================================

video_scene_annotations:

  - annotation_id: "https://nde.nl/ontology/hc/annotation/scene/rijksmuseum-01"
    source_video_url: "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
    annotation_type: SCENE
    annotation_motivation: DESCRIBING

    time_segment:
      segment_id: "scene-01"
      start_time: "00:00:00.000"
      end_time: "00:00:45.000"
      duration_seconds: 45.0

    scene_type: ESTABLISHING
    scene_label: "Gallery Entrance Introduction"
    scene_description: |
      Wide shot of the Gallery of Honour entrance. Camera slowly pans
      from left to right, revealing the long corridor with paintings
      on both walls. Natural light streams in from skylights above.

    detected_elements:
      - "architectural interior"
      - "museum gallery"
      - "natural lighting"
      - "oil paintings"
      - "parquet flooring"

    dominant_colors:
      - "#8B7355"  # Brown/wood tones
      - "#F5F5DC"  # Cream walls
      - "#DAA520"  # Golden frames

    confidence_score: 0.92
    detection_model: "google-video-intelligence-v1"
    detection_timestamp: "2025-12-01T09:00:00Z"

  - annotation_id: "https://nde.nl/ontology/hc/annotation/scene/rijksmuseum-02"
    source_video_url: "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
    annotation_type: SCENE
    annotation_motivation: DESCRIBING

    time_segment:
      segment_id: "scene-02"
      start_time: "00:00:45.000"
      end_time: "00:02:30.000"
      duration_seconds: 105.0

    scene_type: CLOSE_UP
    scene_label: "Frans Hals Portrait Detail"
    scene_description: |
      Close-up shots of Frans Hals' portrait painting showing
      brushwork detail and color palette. Camera moves slowly
      across canvas surface highlighting texture.

    detected_elements:
      - "oil painting"
      - "portrait"
      - "17th century costume"
      - "lace collar"
      - "dark background"

    confidence_score: 0.88
    detection_model: "google-video-intelligence-v1"
    detection_timestamp: "2025-12-01T09:00:00Z"

# ============================================================================
# EXAMPLE 5: Object Annotations (Artwork Detection)
# ============================================================================

video_object_annotations:

  - annotation_id: "https://nde.nl/ontology/hc/annotation/object/rijksmuseum-night-watch"
    source_video_url: "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
    annotation_type: OBJECT
    annotation_motivation: IDENTIFYING

    time_segment:
      segment_id: "night-watch-segment"
      start_time: "00:12:30.000"
      end_time: "00:15:42.000"
      duration_seconds: 192.0

    detected_objects:
      - object_id: "obj-night-watch-001"
        object_label: "The Night Watch"
        object_category: "painting"
        confidence: 0.98
        bounding_box_x: 120
        bounding_box_y: 80
        bounding_box_width: 1680
        bounding_box_height: 920
        wikidata_entity: "Q219831"
        artist: "Rembrandt van Rijn"
        creation_year: 1642

      - object_id: "obj-captain-001"
        object_label: "Captain Frans Banninck Cocq"
        object_category: "person (depicted)"
        confidence: 0.91
        bounding_box_x: 450
        bounding_box_y: 150
        bounding_box_width: 380
        bounding_box_height: 720
        wikidata_entity: "Q467089"

    detection_level: FRAME
    confidence_score: 0.95
    detection_model: "artwork-recognition-v2"
    detection_timestamp: "2025-12-01T09:15:00Z"

# ============================================================================
# EXAMPLE 6: OCR Annotations (Text in Video)
# ============================================================================

video_ocr_annotations:

  - annotation_id: "https://nde.nl/ontology/hc/annotation/ocr/rijksmuseum-label-01"
    source_video_url: "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
    annotation_type: OCR
    annotation_motivation: TRANSCRIBING

    time_segment:
      segment_id: "label-segment-01"
      start_time: "00:05:15.000"
      end_time: "00:05:22.000"
      duration_seconds: 7.0

    detected_text_regions:
      - region_id: "text-001"
        detected_text: "Johannes Vermeer"
        text_language: "nl"
        text_type: ARTWORK_LABEL
        bounding_box_x: 100
        bounding_box_y: 650
        bounding_box_width: 280
        bounding_box_height: 35
        confidence: 0.97

      - region_id: "text-002"
        detected_text: "Het melkmeisje, ca. 1660"
        text_language: "nl"
        text_type: ARTWORK_LABEL
        bounding_box_x: 100
        bounding_box_y: 690
        bounding_box_width: 320
        bounding_box_height: 30
        confidence: 0.94

      - region_id: "text-003"
        detected_text: "Olieverf op doek"
        text_language: "nl"
        text_type: CAPTION
        bounding_box_x: 100
        bounding_box_y: 725
        bounding_box_width: 200
        bounding_box_height: 25
        confidence: 0.91

    detection_level: FRAME
    confidence_score: 0.94
    detection_model: "google-cloud-vision-ocr"
    detection_timestamp: "2025-12-01T09:20:00Z"

# ============================================================================
# EXAMPLE 7: Conservation Documentation Video
# Archive use case with technical annotations
# ============================================================================

conservation_videos:

  - post_id: "https://nde.nl/ontology/hc/video/nl/rijksmuseum-night-watch-restoration"
    platform_type: YOUTUBE
    platform_id: "UCo2sQFl0mV4K2v6D4d8Z9bQ"
    platform_post_id: "abcd1234efgh"
    post_url: "https://www.youtube.com/watch?v=abcd1234efgh"
    post_title: "Operation Night Watch - Restoration Process Documentary"
    post_description: |
      Follow the largest and most detailed art research and conservation project
      ever undertaken on a single painting. Operation Night Watch uses cutting-edge
      technology to study and restore Rembrandt's masterpiece.

    duration: "PT45M30S"
    definition: uhd
    aspect_ratio: "16:9"
    frame_rate: 24.0

    caption_available: true
    default_language: "en"
    default_audio_language: "en"
    available_caption_languages:
      - "en"
      - "nl"
      - "de"
      - "ja"

    view_count: 892341
    like_count: 28456
    comment_count: 1523
    metrics_observed_at: "2025-12-15T10:30:00Z"

    video_category_id: "28"  # Science & Technology
    live_broadcast_content: none
    is_licensed_content: false
    is_embeddable: true
    is_made_for_kids: false

    published_at: "2021-06-22T12:00:00Z"

# ============================================================================
# EXAMPLE 8: Video Chapters (Navigation Segments)
# YouTube chapters, virtual tour sections, conservation phases
# ============================================================================

video_chapters:

  # Rijksmuseum Virtual Tour - Gallery of Honour chapters
  - chapter_id: "dQw4w9WgXcQ_chapter_0"
    chapter_title: "Introduction - Welcome to the Rijksmuseum"
    chapter_index: 0
    chapter_start_seconds: 0.0
    chapter_end_seconds: 45.0
    chapter_start_time: "PT0S"
    chapter_end_time: "PT45S"
    chapter_description: |
      Opening shot of the Gallery of Honour entrance with curator introduction.
      Overview of what visitors will see during the virtual tour.
    auto_generated: false
    chapter_source: MANUAL
    chapter_thumbnail_url: "https://i.ytimg.com/vi/dQw4w9WgXcQ/hqdefault.jpg?sqp=-oaymwEjCNACELwBSFryq4qpAxUIARUAAAAAGAElAADIQj0AgKJDeAE=&rs=AOn4CLBp1"

  - chapter_id: "dQw4w9WgXcQ_chapter_1"
    chapter_title: "Frans Hals and Early Portraits"
    chapter_index: 1
    chapter_start_seconds: 45.0
    chapter_end_seconds: 180.0
    chapter_start_time: "PT45S"
    chapter_end_time: "PT3M"
    chapter_description: |
      Exploration of Frans Hals' portrait of Isaac Massa and Beatrix van der Laen.
      Discussion of Hals' innovative brushwork techniques.
    auto_generated: false
    chapter_source: MANUAL
    heritage_entities_mentioned:
      - entity_id: "Q167654"  # Frans Hals
        entity_type: "Person"
        entity_label: "Frans Hals"
      - entity_id: "Q2628540"  # Portrait of Isaac Massa and Beatrix van der Laen
        entity_type: "Artwork"
        entity_label: "Portrait of Isaac Massa and Beatrix van der Laen"

  - chapter_id: "dQw4w9WgXcQ_chapter_2"
    chapter_title: "Jan Steen's Household Scenes"
    chapter_index: 2
    chapter_start_seconds: 180.0
    chapter_end_seconds: 360.0
    chapter_start_time: "PT3M"
    chapter_end_time: "PT6M"
    chapter_description: |
      The humorous domestic scenes of Jan Steen and the meaning behind
      the Dutch expression "een huishouden van Jan Steen."
    auto_generated: false
    chapter_source: MANUAL
    heritage_entities_mentioned:
      - entity_id: "Q205863"  # Jan Steen
        entity_type: "Person"
        entity_label: "Jan Steen"

  - chapter_id: "dQw4w9WgXcQ_chapter_3"
    chapter_title: "Vermeer's Masterpieces"
    chapter_index: 3
    chapter_start_seconds: 360.0
    chapter_end_seconds: 600.0
    chapter_start_time: "PT6M"
    chapter_end_time: "PT10M"
    chapter_description: |
      Close examination of Johannes Vermeer's The Milkmaid and other works.
      Analysis of Vermeer's distinctive use of light and color.
    auto_generated: false
    chapter_source: MANUAL
    heritage_entities_mentioned:
      - entity_id: "Q41264"  # Johannes Vermeer
        entity_type: "Person"
        entity_label: "Johannes Vermeer"
      - entity_id: "Q154349"  # The Milkmaid
        entity_type: "Artwork"
        entity_label: "Het melkmeisje (The Milkmaid)"

  - chapter_id: "dQw4w9WgXcQ_chapter_4"
    chapter_title: "The Night Watch - Rembrandt's Masterpiece"
    chapter_index: 4
    chapter_start_seconds: 600.0
    chapter_end_seconds: 942.0
    chapter_start_time: "PT10M"
    chapter_end_time: "PT15M42S"
    chapter_description: |
      Culmination of the tour at Rembrandt's iconic Night Watch.
      Discussion of the painting's history, composition, and restoration.
    auto_generated: false
    chapter_source: MANUAL
    heritage_entities_mentioned:
      - entity_id: "Q5598"  # Rembrandt
        entity_type: "Person"
        entity_label: "Rembrandt van Rijn"
      - entity_id: "Q219831"  # The Night Watch
        entity_type: "Artwork"
        entity_label: "De Nachtwacht (The Night Watch)"

  # Conservation Documentary - Operation Night Watch chapters
  - chapter_id: "abcd1234efgh_chapter_0"
    chapter_title: "Project Overview"
    chapter_index: 0
    chapter_start_seconds: 0.0
    chapter_end_seconds: 300.0
    chapter_start_time: "PT0S"
    chapter_end_time: "PT5M"
    chapter_description: |
      Introduction to Operation Night Watch, the most extensive research
      and conservation project ever undertaken on a single painting.
    auto_generated: false
    chapter_source: MANUAL

  - chapter_id: "abcd1234efgh_chapter_1"
    chapter_title: "Technical Imaging and Analysis"
    chapter_index: 1
    chapter_start_seconds: 300.0
    chapter_end_seconds: 900.0
    chapter_start_time: "PT5M"
    chapter_end_time: "PT15M"
    chapter_description: |
      Multi-spectral imaging, X-ray analysis, and macro photography
      revealing hidden layers and underdrawings in the painting.
    auto_generated: false
    chapter_source: MANUAL
    conservation_phase: "DOCUMENTATION"

  - chapter_id: "abcd1234efgh_chapter_2"
    chapter_title: "Condition Assessment"
    chapter_index: 2
    chapter_start_seconds: 900.0
    chapter_end_seconds: 1500.0
    chapter_start_time: "PT15M"
    chapter_end_time: "PT25M"
    chapter_description: |
      Detailed examination of the painting's condition, including
      craquelure patterns, varnish degradation, and previous restorations.
    auto_generated: false
    chapter_source: MANUAL
    conservation_phase: "ASSESSMENT"

  - chapter_id: "abcd1234efgh_chapter_3"
    chapter_title: "Cleaning Process"
    chapter_index: 3
    chapter_start_seconds: 1500.0
    chapter_end_seconds: 2100.0
    chapter_start_time: "PT25M"
    chapter_end_time: "PT35M"
    chapter_description: |
      The meticulous cleaning process using specialized solvents and
      techniques to remove centuries of accumulated dirt and varnish.
    auto_generated: false
    chapter_source: MANUAL
    conservation_phase: "TREATMENT"

  - chapter_id: "abcd1234efgh_chapter_4"
    chapter_title: "AI-Assisted Reconstruction"
    chapter_index: 4
    chapter_start_seconds: 2100.0
    chapter_end_seconds: 2730.0
    chapter_start_time: "PT35M"
    chapter_end_time: "PT45M30S"
    chapter_description: |
      How artificial intelligence was used to digitally reconstruct
      missing portions of the painting that were cut off in 1715.
    auto_generated: false
    chapter_source: MANUAL
    conservation_phase: "DIGITAL_RECONSTRUCTION"

# ============================================================================
# EXAMPLE 9: Video Chapter Lists (Complete Sets)
# ============================================================================

video_chapter_lists:

  # Complete chapter list for Rijksmuseum virtual tour
  - list_id: "https://nde.nl/ontology/hc/chapterlist/rijksmuseum-gallery-honour"
    video_id: "dQw4w9WgXcQ"
    video_url: "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
    video_title: "The Gallery of Honour - Rijksmuseum Virtual Tour"

    chapters:
      - "dQw4w9WgXcQ_chapter_0"
      - "dQw4w9WgXcQ_chapter_1"
      - "dQw4w9WgXcQ_chapter_2"
      - "dQw4w9WgXcQ_chapter_3"
      - "dQw4w9WgXcQ_chapter_4"

    total_chapters: 5
    chapters_source: MANUAL
    covers_full_video: true
    video_duration_seconds: 942.0

    extraction_timestamp: "2025-12-15T14:00:00Z"
    extraction_method: "YouTube Data API v3"

  # Complete chapter list for Operation Night Watch documentary
  - list_id: "https://nde.nl/ontology/hc/chapterlist/operation-night-watch"
    video_id: "abcd1234efgh"
    video_url: "https://www.youtube.com/watch?v=abcd1234efgh"
    video_title: "Operation Night Watch - Restoration Process Documentary"

    chapters:
      - "abcd1234efgh_chapter_0"
      - "abcd1234efgh_chapter_1"
      - "abcd1234efgh_chapter_2"
      - "abcd1234efgh_chapter_3"
      - "abcd1234efgh_chapter_4"

    total_chapters: 5
    chapters_source: MANUAL
    covers_full_video: true
    video_duration_seconds: 2730.0

    extraction_timestamp: "2025-12-15T14:00:00Z"
    extraction_method: "YouTube Data API v3"

# ============================================================================
# EXAMPLE 10: Video Audio Annotations (Speech, Music, Sound Events)
# ============================================================================

video_audio_annotations:

  # Example 1: Curator Interview with Speaker Diarization
  - annotation_id: "https://nde.nl/ontology/hc/annotation/audio/rijksmuseum-interview-01"
    source_video_url: "https://www.youtube.com/watch?v=xyz789curator"
    annotation_type: AUDIO
    annotation_motivation: TRANSCRIBING

    # Primary audio characteristics
    primary_audio_event_type: SPEECH
    speech_detected: true
    speech_language: "nl"
    languages_detected:
      - "nl"
      - "en"  # Some English art terminology used

    # Speaker diarization (who spoke when)
    diarization_enabled: true
    speaker_count: 2
    speaker_labels:
      - "Dr. Taco Dibbits"
      - "Interviewer"

    diarization_segments:
      - segment_id: "diar-001"
        diarization_start_seconds: 0.0
        diarization_end_seconds: 8.5
        diarization_start_time: "PT0S"
        diarization_end_time: "PT8.5S"
        diarization_speaker_id: "spk_001"
        diarization_speaker_label: "Interviewer"
        diarization_confidence: 0.94
        transcript_snippet: "Welkom bij het Rijksmuseum. Vandaag spreken we met de directeur..."

      - segment_id: "diar-002"
        diarization_start_seconds: 8.5
        diarization_end_seconds: 45.0
        diarization_start_time: "PT8.5S"
        diarization_end_time: "PT45S"
        diarization_speaker_id: "spk_002"
        diarization_speaker_label: "Dr. Taco Dibbits"
        diarization_confidence: 0.97
        transcript_snippet: "Dank u wel. Het is een bijzonder moment voor het museum..."

      - segment_id: "diar-003"
        diarization_start_seconds: 45.0
        diarization_end_seconds: 52.0
        diarization_start_time: "PT45S"
        diarization_end_time: "PT52S"
        diarization_speaker_id: "spk_001"
        diarization_speaker_label: "Interviewer"
        diarization_confidence: 0.92
        transcript_snippet: "Kunt u ons meer vertellen over de nieuwe tentoonstelling?"

      - segment_id: "diar-004"
        diarization_start_seconds: 52.0
        diarization_end_seconds: 180.0
        diarization_start_time: "PT52S"
        diarization_end_time: "PT3M"
        diarization_speaker_id: "spk_002"
        diarization_speaker_label: "Dr. Taco Dibbits"
        diarization_confidence: 0.96
        transcript_snippet: "Jazeker. Deze tentoonstelling is uniek omdat we voor het eerst..."

    # Audio quality metrics
    audio_quality_score: 0.92
    snr_db: 28.0
    has_clipping: false
    audio_channels: 2
    sample_rate_hz: 48000

    # No music in this interview
    music_detected: false

    # Detection metadata
    detection_model: "whisper-large-v3-diarize"
    detection_timestamp: "2025-12-15T16:00:00Z"
    confidence_score: 0.94

  # Example 2: Exhibition Promotional Video with Music
  - annotation_id: "https://nde.nl/ontology/hc/annotation/audio/vangogh-exhibition-promo"
    source_video_url: "https://www.youtube.com/watch?v=promo2025vgm"
    annotation_type: AUDIO
    annotation_motivation: DESCRIBING

    # Mixed speech and music
    primary_audio_event_type: MIXED
    speech_detected: true
    music_detected: true

    speech_language: "en"
    languages_detected:
      - "en"
      - "nl"

    # Speech segments (voiceover narration)
    speech_segments:
      - segment_id: "speech-001"
        speech_start_seconds: 5.0
        speech_end_seconds: 25.0
        speech_start_time: "PT5S"
        speech_end_time: "PT25S"
        speaker_id: "narrator"
        speaker_label: "Voiceover Narrator"
        speech_type: NARRATION
        transcript_snippet: "This spring, the Van Gogh Museum presents a groundbreaking exhibition..."

      - segment_id: "speech-002"
        speech_start_seconds: 45.0
        speech_end_seconds: 60.0
        speech_start_time: "PT45S"
        speech_end_time: "PT1M"
        speaker_id: "curator"
        speaker_label: "Exhibition Curator"
        speech_type: INTERVIEW
        transcript_snippet: "Van Gogh's use of color was revolutionary..."

    # Music segments (background and featured)
    music_segments:
      - segment_id: "music-001"
        music_start_seconds: 0.0
        music_end_seconds: 120.0
        music_start_time: "PT0S"
        music_end_time: "PT2M"
        music_type: BACKGROUND
        music_genre: "classical"
        is_background: true
        volume_level: "low"
        music_title: null  # Unknown background track

      - segment_id: "music-002"
        music_start_seconds: 90.0
        music_end_seconds: 115.0
        music_start_time: "PT1M30S"
        music_end_time: "PT1M55S"
        music_type: DRAMATIC
        music_genre: "orchestral"
        is_background: false
        volume_level: "medium"
        music_description: "Dramatic orchestral swell accompanying visual climax"

    music_genres_detected:
      - "classical"
      - "orchestral"

    # Audio quality metrics
    audio_quality_score: 0.88
    snr_db: 22.0  # Lower due to music mixing
    audio_channels: 2
    sample_rate_hz: 48000

    detection_model: "audio-analysis-v2"
    detection_timestamp: "2025-12-15T16:30:00Z"
    confidence_score: 0.86

  # Example 3: Museum Lecture Recording with Audience Reactions
  - annotation_id: "https://nde.nl/ontology/hc/annotation/audio/stedelijk-lecture-2024"
    source_video_url: "https://www.youtube.com/watch?v=lecture2024sted"
    annotation_type: AUDIO
    annotation_motivation: TRANSCRIBING

    primary_audio_event_type: SPEECH
    speech_detected: true
    music_detected: false

    speech_language: "nl"
    languages_detected:
      - "nl"

    # Main lecture content
    diarization_enabled: true
    speaker_count: 1
    speaker_labels:
      - "Prof. Dr. Beatrix Ruf"

    diarization_segments:
      - segment_id: "lecture-001"
        diarization_start_seconds: 0.0
        diarization_end_seconds: 1800.0
        diarization_start_time: "PT0S"
        diarization_end_time: "PT30M"
        diarization_speaker_id: "spk_main"
        diarization_speaker_label: "Prof. Dr. Beatrix Ruf"
        diarization_confidence: 0.98

    # Sound events detected (audience reactions)
    sound_events_detected: true
    sound_event_types:
      - APPLAUSE
      - LAUGHTER
      - CROWD_NOISE

    sound_event_segments:
      - segment_id: "sound-001"
        sound_start_seconds: 420.0
        sound_end_seconds: 425.0
        sound_start_time: "PT7M"
        sound_end_time: "PT7M5S"
        sound_event_type: LAUGHTER
        sound_confidence: 0.89
        sound_description: "Audience laughter in response to humorous anecdote"

      - segment_id: "sound-002"
        sound_start_seconds: 1795.0
        sound_end_seconds: 1810.0
        sound_start_time: "PT29M55S"
        sound_end_time: "PT30M10S"
        sound_event_type: APPLAUSE
        sound_confidence: 0.96
        sound_description: "Audience applause at conclusion of lecture"

      - segment_id: "sound-003"
        sound_start_seconds: 1200.0
        sound_end_seconds: 1203.0
        sound_start_time: "PT20M"
        sound_end_time: "PT20M3S"
        sound_event_type: CROWD_NOISE
        sound_confidence: 0.72
        sound_description: "Brief audience murmuring during slide transition"

    # Audio quality metrics (live recording)
    audio_quality_score: 0.78
    snr_db: 18.0  # Lower due to room acoustics
    has_reverb: true
    audio_channels: 2
    sample_rate_hz: 44100

    detection_model: "audio-event-detector-v1"
    detection_timestamp: "2025-12-15T17:00:00Z"
    confidence_score: 0.82

# ============================================================================
# PROVENANCE METADATA
# ============================================================================

provenance:
  data_source: EXAMPLE_INSTANCES
  data_tier: TIER_4_INFERRED
  extraction_date: "2025-12-16T00:00:00Z"
  extraction_method: "Manual example creation for schema documentation"
  confidence_score: 1.0
  notes: |
    Example instances demonstrating video content modeling capabilities.
    Based on real heritage institution video patterns but with synthetic data.

    Classes demonstrated:
    - VideoPost (with VideoComment)
    - VideoTranscript
    - VideoSubtitle
    - VideoSceneAnnotation
    - VideoObjectAnnotation
    - VideoOCRAnnotation
    - VideoChapter (NEW in v0.9.10)
    - VideoChapterList (NEW in v0.9.10)
    - VideoAudioAnnotation (NEW in v0.9.10)
    - SpeechSegment
    - DiarizationSegment
    - MusicSegment
    - SoundEventSegment

    Heritage use cases covered:
    - Virtual museum tours
    - Conservation documentation
    - Artwork recognition
    - Museum label OCR
    - Video chapter navigation (NEW)
    - Speaker diarization in interviews (NEW)
    - Music detection in promotional content (NEW)
    - Audience reaction detection in lectures (NEW)

    Enumerations demonstrated:
    - ChapterSourceEnum: MANUAL, AUTO_GENERATED, YOUTUBE_API
    - AudioEventTypeEnum: SPEECH, MUSIC, MIXED, AMBIENT, SILENCE
    - SoundEventTypeEnum: APPLAUSE, LAUGHTER, CROWD_NOISE
    - MusicTypeEnum: BACKGROUND, FOREGROUND, DRAMATIC

    Heritage entities referenced (Wikidata):
    - Q5598 (Rembrandt van Rijn)
    - Q41264 (Johannes Vermeer)
    - Q167654 (Frans Hals)
    - Q205863 (Jan Steen)
    - Q219831 (The Night Watch)
    - Q154349 (The Milkmaid)
    - Q2628540 (Portrait of Isaac Massa and Beatrix van der Laen)
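Every segment in the examples above carries its timing twice: as plain seconds (`sound_start_seconds: 1795.0`) and as an ISO 8601 duration (`sound_start_time: "PT29M55S"`). A minimal sketch of that conversion, as a hypothetical helper (not part of the schema tooling), keeping the two representations consistent:

```python
# Convert plain seconds (e.g. 1795.0) into the ISO 8601 duration form
# (e.g. "PT29M55S") used by the *_time slots in the examples.
# Hypothetical helper for illustration, not part of the schema tooling.

def seconds_to_iso8601(seconds: float) -> str:
    """Render seconds as an ISO 8601 duration like the *_time slots."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    parts = ["PT"]
    if hours:
        parts.append(f"{int(hours)}H")
    if minutes:
        parts.append(f"{int(minutes)}M")
    if secs or (not hours and not minutes):
        # %g drops a trailing ".0" so 5.0 renders as "5S", matching the examples
        parts.append(f"{secs:g}S")
    return "".join(parts)

print(seconds_to_iso8601(0.0))     # PT0S
print(seconds_to_iso8601(1795.0))  # PT29M55S
print(seconds_to_iso8601(1800.0))  # PT30M
```

Zero-length components are omitted (so 1800 s is `PT30M`, not `PT0H30M0S`), which matches how the example instances serialize their durations.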
542
schemas/20251121/linkml/modules/classes/VideoAnnotation.yaml
Normal file

@@ -0,0 +1,542 @@
# Video Annotation Class
# Abstract base class for computer vision and multimodal video annotations
#
# Part of Heritage Custodian Ontology v0.9.5
#
# HIERARCHY:
#   E73_Information_Object (CIDOC-CRM)
#   │
#   └── VideoTextContent (abstract base)
#       │
#       ├── VideoTranscript (audio-derived)
#       │   │
#       │   └── VideoSubtitle (time-coded captions)
#       │
#       └── VideoAnnotation (this class - ABSTRACT)
#           │
#           ├── VideoSceneAnnotation (scene/shot detection)
#           ├── VideoObjectAnnotation (object/face/logo detection)
#           └── VideoOCRAnnotation (text-in-video extraction)
#
# DESIGN RATIONALE:
# VideoAnnotation is the abstract parent for all annotations derived from
# visual analysis of video content. Unlike VideoTranscript (audio-derived),
# these annotations come from computer vision, multimodal AI, or manual
# visual analysis.
#
# Key differences from the transcript branch:
# - Frame-based rather than audio-based analysis
# - Spatial information (bounding boxes, regions)
# - Detection thresholds and frame sampling
# - Multiple detection types per segment
#
# ONTOLOGY ALIGNMENT:
# - W3C Web Annotation (oa:Annotation) for annotation structure
# - CIDOC-CRM E13_Attribute_Assignment for attribution activities
# - IIIF Presentation API for spatial/temporal selectors

id: https://nde.nl/ontology/hc/class/VideoAnnotation
name: video_annotation_class
title: Video Annotation Class

imports:
  - linkml:types
  - ./VideoTextContent
  - ./VideoTimeSegment

prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  schema: http://schema.org/
  dcterms: http://purl.org/dc/terms/
  prov: http://www.w3.org/ns/prov#
  crm: http://www.cidoc-crm.org/cidoc-crm/
  oa: http://www.w3.org/ns/oa#
  as: https://www.w3.org/ns/activitystreams#

default_prefix: hc
classes:

  VideoAnnotation:
    is_a: VideoTextContent
    class_uri: oa:Annotation
    abstract: true
    description: |
      Abstract base class for computer vision and multimodal video annotations.

      **DEFINITION**:

      VideoAnnotation represents structured information derived from visual
      analysis of video content. This includes:

      | Subclass | Analysis Type | Output |
      |----------|---------------|--------|
      | VideoSceneAnnotation | Shot/scene detection | Scene boundaries, types |
      | VideoObjectAnnotation | Object detection | Objects, faces, logos |
      | VideoOCRAnnotation | Text extraction | On-screen text (OCR) |

      **RELATIONSHIP TO W3C WEB ANNOTATION**:

      VideoAnnotation aligns with the W3C Web Annotation Data Model:

      ```turtle
      :annotation a oa:Annotation ;
          oa:hasBody :detection_result ;
          oa:hasTarget [
              oa:hasSource :video ;
              oa:hasSelector [
                  a oa:FragmentSelector ;
                  dcterms:conformsTo <http://www.w3.org/TR/media-frags/> ;
                  rdf:value "t=30,35"
              ]
          ] ;
          oa:motivatedBy oa:classifying .
      ```

      **FRAME-BASED ANALYSIS**:

      Unlike audio transcription (a continuous stream), video annotation is
      typically frame-based:

      - `frame_sample_rate`: Frames analyzed per second (e.g., 1 fps, 5 fps)
      - `total_frames_analyzed`: Total frames processed
      - Higher sample rates = more detections but higher compute cost

      **DETECTION THRESHOLDS**:

      CV models output confidence scores. Thresholds filter noise:

      | Threshold | Use Case |
      |-----------|----------|
      | 0.9+ | High precision, production display |
      | 0.7-0.9 | Balanced, general use |
      | 0.5-0.7 | High recall, research/review |
      | < 0.5 | Raw output, needs filtering |

      **MODEL ARCHITECTURE TRACKING**:

      Different model architectures have different characteristics:

      | Architecture | Examples | Strengths |
      |--------------|----------|-----------|
      | CNN | ResNet, VGG | Fast inference, good for objects |
      | Transformer | ViT, CLIP | Better context, multimodal |
      | Hybrid | DETR, Swin | Balance of speed and accuracy |

      **HERITAGE INSTITUTION CONTEXT**:

      Video annotations enable:
      - **Discovery**: Find videos containing specific objects/artworks
      - **Accessibility**: Scene descriptions for visually impaired users
      - **Research**: Analyze visual content at scale
      - **Preservation**: Document visual content as text
      - **Linking**: Connect detected artworks to collection records

      **CIDOC-CRM E13_Attribute_Assignment**:

      Annotations are attribute assignments - asserting properties about
      video segments. The CV model or human annotator is the assigning agent.

    exact_mappings:
      - oa:Annotation

    close_mappings:
      - crm:E13_Attribute_Assignment

    related_mappings:
      - as:Activity
      - schema:ClaimReview

    slots:
      # Annotation structure
      - annotation_type
      - annotation_segments

      # Detection parameters
      - detection_threshold
      - detection_count

      # Frame analysis
      - frame_sample_rate
      - total_frames_analyzed
      - keyframe_extraction

      # Model details
      - model_architecture
      - model_task

      # Spatial information
      - includes_bounding_boxes
      - includes_segmentation_masks

      # Annotation motivation
      - annotation_motivation
    slot_usage:
      annotation_type:
        slot_uri: dcterms:type
        description: |
          High-level type classification for this annotation.

          Dublin Core: type for resource categorization.

          **Standard Types**:
          - SCENE_DETECTION: Shot/scene boundary detection
          - OBJECT_DETECTION: Object, face, logo detection
          - OCR: Text-in-video extraction
          - ACTION_RECOGNITION: Human action detection
          - SEMANTIC_SEGMENTATION: Pixel-level classification
          - MULTIMODAL: Combined audio+visual analysis
        range: AnnotationTypeEnum
        required: true
        examples:
          - value: "OBJECT_DETECTION"
            description: "Object and face detection annotation"

      annotation_segments:
        slot_uri: oa:hasBody
        description: |
          List of temporal segments with detection results.

          Web Annotation: hasBody links the annotation to its content.

          Each segment contains:
          - Time boundaries (start/end)
          - Detection text/description
          - Per-segment confidence

          Reuses VideoTimeSegment for consistent temporal modeling.
        range: VideoTimeSegment
        multivalued: true
        required: false
        inlined_as_list: true
        examples:
          - value: "[{start_seconds: 30.0, end_seconds: 35.0, segment_text: 'Night Watch painting visible'}]"
            description: "Object detection segment"

      detection_threshold:
        slot_uri: hc:detectionThreshold
        description: |
          Minimum confidence threshold used for detection filtering.

          Detections below this threshold were excluded from results.

          Range: 0.0 to 1.0

          **Common Values**:
          - 0.5: Standard threshold (balanced)
          - 0.7: High precision mode
          - 0.3: High recall mode (includes uncertain detections)
        range: float
        required: false
        minimum_value: 0.0
        maximum_value: 1.0
        examples:
          - value: 0.5
            description: "Standard detection threshold"

      detection_count:
        slot_uri: hc:detectionCount
        description: |
          Total number of detections across all analyzed frames.

          Useful for:
          - Understanding annotation density
          - Quality assessment
          - Performance metrics

          Note: May be higher than the annotation_segments count if segments
          are aggregated or filtered.
        range: integer
        required: false
        minimum_value: 0
        examples:
          - value: 342
            description: "342 total detections found"

      frame_sample_rate:
        slot_uri: hc:frameSampleRate
        description: |
          Number of frames analyzed per second of video.

          **Common Values**:
          - 1.0: One frame per second (efficient)
          - 5.0: Five frames per second (balanced)
          - 30.0: Every frame at 30 fps (thorough but expensive)
          - 0.1: One frame every 10 seconds (overview only)

          Higher rates catch more content but increase compute cost.
        range: float
        required: false
        minimum_value: 0.0
        examples:
          - value: 1.0
            description: "Analyzed 1 frame per second"

      total_frames_analyzed:
        slot_uri: hc:totalFramesAnalyzed
        description: |
          Total number of video frames that were analyzed.

          Calculated as: video_duration_seconds × frame_sample_rate

          Useful for:
          - Understanding analysis coverage
          - Cost estimation
          - Reproducibility
        range: integer
        required: false
        minimum_value: 0
        examples:
          - value: 1800
            description: "Analyzed 1,800 frames (30 min video at 1 fps)"

      keyframe_extraction:
        slot_uri: hc:keyframeExtraction
        description: |
          Whether keyframe extraction was used instead of uniform sampling.

          **Keyframe extraction** selects visually distinct frames
          (scene changes, significant motion) rather than uniform intervals.

          - true: Keyframes extracted (variable frame selection)
          - false: Uniform sampling at frame_sample_rate

          Keyframe extraction is more efficient but may miss content
          between scene changes.
        range: boolean
        required: false
        examples:
          - value: true
            description: "Used keyframe extraction"

      model_architecture:
        slot_uri: hc:modelArchitecture
        description: |
          Architecture type of the CV/ML model used.

          **Common Architectures**:
          - CNN: Convolutional Neural Network (ResNet, VGG, EfficientNet)
          - Transformer: Vision Transformer (ViT, Swin, CLIP)
          - Hybrid: Combined architectures (DETR, ConvNeXt)
          - RNN: Recurrent (for temporal analysis)
          - GAN: Generative (for reconstruction tasks)

          Useful for understanding model capabilities and limitations.
        range: string
        required: false
        examples:
          - value: "Transformer"
            description: "Vision Transformer architecture"
          - value: "CNN"
            description: "Convolutional Neural Network"

      model_task:
        slot_uri: hc:modelTask
        description: |
          Specific task the model was trained for.

          **Common Tasks**:
          - classification: Image/frame classification
          - detection: Object detection with bounding boxes
          - segmentation: Pixel-level classification
          - captioning: Image/video captioning
          - embedding: Feature extraction for similarity

          A model's task determines its output format.
        range: string
        required: false
        examples:
          - value: "detection"
            description: "Object detection task"
          - value: "captioning"
            description: "Video captioning task"

      includes_bounding_boxes:
        slot_uri: hc:includesBoundingBoxes
        description: |
          Whether the annotation includes spatial bounding box coordinates.

          Bounding boxes define rectangular regions in frames where
          objects/faces/text were detected.

          Format typically: [x, y, width, height] or [x1, y1, x2, y2]

          - true: Spatial coordinates available in segment data
          - false: Only temporal information (no spatial)
        range: boolean
        required: false
        examples:
          - value: true
            description: "Includes bounding box coordinates"

      includes_segmentation_masks:
        slot_uri: hc:includesSegmentationMasks
        description: |
          Whether the annotation includes pixel-level segmentation masks.

          Segmentation masks provide precise object boundaries
          (more detailed than bounding boxes).

          - true: Pixel masks available (typically as separate files)
          - false: No segmentation data

          Masks are memory-intensive; they are often stored externally.
        range: boolean
        required: false
        examples:
          - value: false
            description: "No segmentation masks included"

      annotation_motivation:
        slot_uri: oa:motivatedBy
        description: |
          The motivation or purpose for creating this annotation.

          Web Annotation: motivatedBy describes why the annotation was created.

          **Standard Motivations** (from W3C Web Annotation):
          - classifying: Categorizing content
          - describing: Adding description
          - identifying: Identifying depicted things
          - tagging: Adding tags/keywords
          - linking: Linking to external resources

          **Heritage-Specific**:
          - accessibility: For accessibility services
          - discovery: For search/discovery
          - preservation: For digital preservation
        range: AnnotationMotivationEnum
        required: false
        examples:
          - value: "CLASSIFYING"
            description: "Annotation for classification purposes"
    comments:
      - "Abstract base for all CV/multimodal video annotations"
      - "Extends VideoTextContent with frame-based analysis parameters"
      - "W3C Web Annotation compatible structure"
      - "Supports both temporal and spatial annotation"
      - "Tracks detection thresholds and model architecture"

    see_also:
      - "https://www.w3.org/TR/annotation-model/"
      - "http://www.cidoc-crm.org/cidoc-crm/E13_Attribute_Assignment"
      - "https://iiif.io/api/presentation/3.0/"

# ============================================================================
# Enumerations
# ============================================================================

enums:

  AnnotationTypeEnum:
    description: |
      Types of video annotation based on analysis method.
    permissible_values:
      SCENE_DETECTION:
        description: Shot and scene boundary detection
      OBJECT_DETECTION:
        description: Object, face, and logo detection
      OCR:
        description: Optical character recognition (text-in-video)
      ACTION_RECOGNITION:
        description: Human action and activity detection
      SEMANTIC_SEGMENTATION:
        description: Pixel-level semantic classification
      POSE_ESTIMATION:
        description: Human body pose detection
      EMOTION_RECOGNITION:
        description: Facial emotion/expression analysis
      MULTIMODAL:
        description: Combined audio-visual analysis
      CAPTIONING:
        description: Automated video captioning/description
      CUSTOM:
        description: Custom annotation type

  AnnotationMotivationEnum:
    description: |
      Motivation for creating an annotation (W3C Web Annotation aligned).
    permissible_values:
      CLASSIFYING:
        description: Categorizing or classifying content
        meaning: oa:classifying
      DESCRIBING:
        description: Adding descriptive information
        meaning: oa:describing
      IDENTIFYING:
        description: Identifying depicted entities
        meaning: oa:identifying
      TAGGING:
        description: Adding tags or keywords
        meaning: oa:tagging
      LINKING:
        description: Linking to external resources
        meaning: oa:linking
      COMMENTING:
        description: Adding commentary
        meaning: oa:commenting
      ACCESSIBILITY:
        description: Providing accessibility support
      DISCOVERY:
        description: Enabling search and discovery
      PRESERVATION:
        description: Supporting digital preservation
      RESEARCH:
        description: Supporting research and analysis

# ============================================================================
# Slot Definitions
# ============================================================================

slots:
  annotation_type:
    description: High-level type of video annotation
    range: AnnotationTypeEnum

  annotation_segments:
    description: List of temporal segments with detection results
    range: VideoTimeSegment
    multivalued: true

  detection_threshold:
    description: Minimum confidence threshold for detection filtering
    range: float

  detection_count:
    description: Total number of detections found
    range: integer

  frame_sample_rate:
    description: Frames analyzed per second of video
    range: float

  total_frames_analyzed:
    description: Total number of frames analyzed
    range: integer

  keyframe_extraction:
    description: Whether keyframe extraction was used
    range: boolean

  model_architecture:
    description: Architecture type of the CV/ML model (CNN, Transformer, etc.)
    range: string

  model_task:
    description: Specific task the model was trained for
    range: string

  includes_bounding_boxes:
    description: Whether the annotation includes spatial bounding boxes
    range: boolean

  includes_segmentation_masks:
    description: Whether the annotation includes pixel segmentation masks
    range: boolean

  annotation_motivation:
    description: Motivation for creating the annotation (W3C Web Annotation)
    range: AnnotationMotivationEnum
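The frame-analysis slots above encode a simple contract: `total_frames_analyzed` is the video duration times `frame_sample_rate`, and detections below `detection_threshold` are dropped before serialization. A minimal consumer-side sketch of that contract, with hypothetical function names (the schema itself prescribes no implementation):

```python
import math

# Hypothetical consumer of the VideoAnnotation frame-analysis slots:
# derives total_frames_analyzed from duration × frame_sample_rate and
# applies detection_threshold filtering as the slot descriptions specify.

def total_frames(duration_seconds: float, frame_sample_rate: float) -> int:
    """total_frames_analyzed = video_duration_seconds × frame_sample_rate."""
    return math.floor(duration_seconds * frame_sample_rate)

def filter_detections(detections: list[dict], detection_threshold: float) -> list[dict]:
    """Keep only detections at or above the configured confidence threshold."""
    return [d for d in detections if d["confidence"] >= detection_threshold]

# 30-minute video sampled at 1 fps → 1800 frames (matches the slot example)
print(total_frames(1800.0, 1.0))  # 1800

raw = [
    {"label": "painting", "confidence": 0.92},
    {"label": "person", "confidence": 0.61},
    {"label": "logo", "confidence": 0.34},
]
print(len(filter_detections(raw, 0.5)))  # 2 survive the 0.5 threshold
```

Raising the threshold to 0.7 would leave only the painting detection, mirroring the "high precision" row of the threshold table in the class description.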
1312
schemas/20251121/linkml/modules/classes/VideoAnnotationTypes.yaml
Normal file
File diff suppressed because it is too large

1108
schemas/20251121/linkml/modules/classes/VideoAudioAnnotation.yaml
Normal file
File diff suppressed because it is too large

621
schemas/20251121/linkml/modules/classes/VideoChapter.yaml
Normal file

@@ -0,0 +1,621 @@
# Video Chapter Class
# Models video chapter markers (YouTube chapters, manual/auto-generated sections)
#
# Part of Heritage Custodian Ontology v0.9.10
#
# STRUCTURE:
#   VideoChapter (this class)
#   - chapter_title, chapter_index
#   - start/end times (via VideoTimeSegment composition)
#   - auto_generated flag
#   - thumbnail references
#
# USE CASES:
# - YouTube video chapters (manual creator-defined)
# - Auto-generated chapters (YouTube AI, third-party tools)
# - Museum virtual tour sections
# - Conservation documentation phases
# - Interview segments
#
# ONTOLOGY ALIGNMENT:
# - Schema.org Clip for media segments
# - W3C Media Fragments for temporal addressing
# - CIDOC-CRM E52_Time-Span for temporal extent

id: https://nde.nl/ontology/hc/class/VideoChapter
name: video_chapter_class
title: Video Chapter Class

imports:
  - linkml:types
  - ./VideoTimeSegment

prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  schema: http://schema.org/
  dcterms: http://purl.org/dc/terms/
  prov: http://www.w3.org/ns/prov#
  crm: http://www.cidoc-crm.org/cidoc-crm/
  oa: http://www.w3.org/ns/oa#
  ma: http://www.w3.org/ns/ma-ont#
  wikidata: http://www.wikidata.org/entity/

default_prefix: hc

classes:

  VideoChapter:
    class_uri: schema:Clip
    abstract: false
    description: |
      A named chapter or section within a video, defined by temporal boundaries.

      **DEFINITION**:

      VideoChapter represents a titled segment of video content, typically used for
      navigation and content organization. Chapters appear in video player interfaces
      (YouTube chapters, Vimeo chapters), allowing viewers to jump to specific sections.

      **PLATFORM SUPPORT**:

      | Platform | Chapter Support | Auto-Generated | Custom Thumbnails |
      |----------|-----------------|----------------|-------------------|
      | YouTube | Yes (2020+) | Yes | No (keyframe) |
      | Vimeo | Yes | No | Yes |
      | Facebook | Limited | No | No |
      | Wistia | Yes | No | Yes |

      **YOUTUBE CHAPTER REQUIREMENTS**:

      For YouTube to recognize chapters:
      - First chapter MUST start at 0:00
      - Minimum 3 chapters required
      - Each chapter must be at least 10 seconds long
      - Timestamps in the description in `MM:SS` or `HH:MM:SS` format

      **HERITAGE INSTITUTION USE CASES**:

      | Content Type | Chapter Examples |
      |--------------|------------------|
      | Virtual tour | "Main Hall", "Dutch Masters", "Gift Shop" |
      | Conservation | "Assessment", "Cleaning", "Retouching", "Varnishing" |
      | Interview | "Introduction", "Early Career", "Major Works", "Legacy" |
      | Exhibition | "Curator Introduction", "Theme 1", "Theme 2", "Conclusion" |
      | Lecture | "Overview", "Case Study 1", "Case Study 2", "Q&A" |

      **AUTO-GENERATED VS MANUAL CHAPTERS**:

      | Source | Characteristics | Quality |
      |--------|-----------------|---------|
      | Manual (creator) | Semantic, meaningful titles | High |
      | YouTube AI | Scene-based, generic titles | Variable |
      | Third-party tools | Transcript-based, keyword titles | Medium |

      The `auto_generated` flag distinguishes these sources.

      **RELATIONSHIP TO VideoTimeSegment**:

      VideoChapter USES VideoTimeSegment for temporal boundaries rather than
      extending it. This composition pattern allows:
      - Reuse of segment validation (start < end)
      - Consistent time representation across the schema
      - Separation of structural (chapter) and temporal (segment) concerns

      **MEDIA FRAGMENTS URI**:

      Chapters can be addressed via W3C Media Fragments:
      ```
      https://youtube.com/watch?v=ABC123#t=120,300
      ```
      This corresponds to a chapter starting at 2:00 and ending at 5:00.

      **NESTED CHAPTERS**:

      Some platforms support hierarchical chapters (parent/child).
      Use `parent_chapter_id` for a nested structure:

      ```
      Chapter 1: Dutch Golden Age
        └─ 1.1: Rembrandt
        └─ 1.2: Vermeer
      Chapter 2: Modern Art
      ```

    exact_mappings:
      - schema:Clip

    close_mappings:
      - ma:MediaFragment
      - crm:E52_Time-Span

    related_mappings:
      - wikidata:Q1454986  # Chapter (division of a book/document)
    slots:
      # Chapter identification
      - chapter_id
      - chapter_title
      - chapter_index
      - chapter_description

      # Temporal boundaries (composition with VideoTimeSegment)
      - chapter_start_seconds
      - chapter_end_seconds
      - chapter_start_time
      - chapter_end_time

      # Generation metadata
      - auto_generated
      - chapter_source

      # Visual
      - chapter_thumbnail_url
      - chapter_thumbnail_timestamp

      # Hierarchy
      - parent_chapter_id
      - nesting_level
slot_usage:
|
||||
chapter_id:
|
||||
slot_uri: dcterms:identifier
|
||||
description: |
|
||||
Unique identifier for this chapter.
|
||||
|
||||
Dublin Core: identifier for unique identification.
|
||||
|
||||
**Format**: Platform-specific or UUID
|
||||
- YouTube: No native chapter ID (use index)
|
||||
- Generated: `{video_id}_chapter_{index}`
|
||||
range: string
|
||||
required: true
|
||||
examples:
|
||||
- value: "ABC123_chapter_0"
|
||||
description: "First chapter of video ABC123"
|
||||
- value: "550e8400-e29b-41d4-a716-446655440000"
|
||||
description: "UUID-based chapter ID"
|
||||
|
||||
chapter_title:
|
||||
slot_uri: schema:name
|
||||
description: |
|
||||
Title of the chapter as displayed to viewers.
|
||||
|
||||
Schema.org: name for the chapter's title.
|
||||
|
||||
**Best Practices**:
|
||||
- Keep titles concise (under 50 characters)
|
||||
- Use descriptive, meaningful titles
|
||||
- Avoid timestamps in title (redundant)
|
||||
|
||||
**Auto-Generated Titles**:
|
||||
- YouTube AI: Often generic ("Introduction", "Main Content")
|
||||
- May need manual refinement for heritage content
|
||||
range: string
|
||||
required: true
|
||||
examples:
|
||||
- value: "De Nachtwacht (The Night Watch)"
|
||||
description: "Chapter about specific artwork"
|
||||
- value: "Curator Interview: Conservation Process"
|
||||
description: "Interview segment chapter"
|
||||
|
||||
chapter_index:
|
||||
slot_uri: hc:chapterIndex
|
||||
description: |
|
||||
Zero-based index of this chapter within the video.
|
||||
|
||||
**Ordering**:
|
||||
- 0: First chapter (typically starts at 0:00)
|
||||
- Subsequent chapters in temporal order
|
||||
|
||||
Used for:
|
||||
- Reconstruction of chapter sequence
|
||||
- Navigation (previous/next)
|
||||
- Display ordering
|
||||
range: integer
|
||||
required: true
|
||||
minimum_value: 0
|
||||
examples:
|
||||
- value: 0
|
||||
description: "First chapter"
|
||||
- value: 5
|
||||
description: "Sixth chapter (zero-indexed)"
|
||||
|
||||
chapter_description:
|
||||
slot_uri: schema:description
|
||||
description: |
|
||||
Optional detailed description of chapter content.
|
||||
|
||||
Schema.org: description for chapter details.
|
||||
|
||||
Longer-form description than title. May include:
|
||||
- Topics covered
|
||||
- Featured artworks
|
||||
- Key points discussed
|
||||
|
||||
Not all platforms display chapter descriptions.
|
||||
range: string
|
||||
required: false
|
||||
examples:
|
||||
- value: "Dr. Dibbits discusses the restoration of Rembrandt's masterpiece, including the controversial 2019 operation."
|
||||
description: "Detailed chapter description"
|
||||
|
||||
chapter_start_seconds:
|
||||
slot_uri: ma:hasStartTime
|
||||
description: |
|
||||
Start time of chapter in seconds from video beginning.
|
||||
|
||||
Media Ontology: hasStartTime for temporal start.
|
||||
|
||||
**First Chapter Rule**:
|
||||
For YouTube chapters to be recognized, the first chapter
|
||||
MUST start at 0.0 seconds.
|
||||
|
||||
Floating point for millisecond precision.
|
||||
range: float
|
||||
required: true
|
||||
minimum_value: 0.0
|
||||
examples:
|
||||
- value: 0.0
|
||||
description: "First chapter starts at video beginning"
|
||||
- value: 120.5
|
||||
description: "Chapter starts at 2:00.5"
|
||||
|
||||
chapter_end_seconds:
|
||||
slot_uri: ma:hasEndTime
|
||||
description: |
|
||||
End time of chapter in seconds from video beginning.
|
||||
|
||||
Media Ontology: hasEndTime for temporal end.
|
||||
|
||||
**Chapter Boundaries**:
|
||||
- End time of chapter N = start time of chapter N+1
|
||||
- Last chapter ends at video duration
|
||||
- No gaps between chapters (continuous coverage)
|
||||
range: float
|
||||
required: false
|
||||
minimum_value: 0.0
|
||||
examples:
|
||||
- value: 120.0
|
||||
description: "Chapter ends at 2:00"

      chapter_start_time:
        slot_uri: hc:chapterStartTime
        description: |
          Start time as ISO 8601 duration for display/serialization.

          Derived from chapter_start_seconds.

          **Format**: ISO 8601 duration (e.g., "PT2M30S" = 2:30)
        range: string
        required: false
        pattern: "^PT(\\d+H)?(\\d+M)?(\\d+(\\.\\d+)?S)?$"
        examples:
          - value: "PT0S"
            description: "Start of video"
          - value: "PT10M30S"
            description: "10 minutes 30 seconds"

      chapter_end_time:
        slot_uri: hc:chapterEndTime
        description: |
          End time as ISO 8601 duration for display/serialization.

          Derived from chapter_end_seconds.
        range: string
        required: false
        pattern: "^PT(\\d+H)?(\\d+M)?(\\d+(\\.\\d+)?S)?$"
        examples:
          - value: "PT5M0S"
            description: "5 minutes"

      auto_generated:
        slot_uri: hc:autoGenerated
        description: |
          Whether this chapter was auto-generated by AI/ML.

          **Sources**:
          - true: YouTube AI chapters, third-party tools, ASR-based
          - false: Manual creator-defined chapters

          Auto-generated chapters may have:
          - Generic titles
          - Less semantic meaning
          - Scene-based rather than topic-based boundaries
        range: boolean
        required: false
        examples:
          - value: false
            description: "Manual creator-defined chapter"
          - value: true
            description: "YouTube AI auto-generated"

      chapter_source:
        slot_uri: prov:wasAttributedTo
        description: |
          Source or method that created this chapter.

          PROV-O: wasAttributedTo for attribution.

          **Common Values**:
          - MANUAL: Creator-defined in video description
          - YOUTUBE_AI: YouTube auto-chapters feature
          - WHISPER_CHAPTERS: Generated from Whisper transcript
          - SCENE_DETECTION: Based on visual scene changes
          - THIRD_PARTY: External tool (specify in notes)
        range: ChapterSourceEnum
        required: false
        examples:
          - value: "MANUAL"
            description: "Creator manually added chapters"

      chapter_thumbnail_url:
        slot_uri: schema:thumbnailUrl
        description: |
          URL to thumbnail image for this chapter.

          Schema.org: thumbnailUrl for preview image.

          **Platform Behavior**:
          - YouTube: Auto-selects keyframe from chapter start
          - Vimeo: Allows custom chapter thumbnails

          Thumbnail helps viewers preview chapter content.
        range: uri
        required: false
        examples:
          - value: "https://i.ytimg.com/vi/ABC123/hq1.jpg"
            description: "YouTube chapter thumbnail"

      chapter_thumbnail_timestamp:
        slot_uri: hc:thumbnailTimestamp
        description: |
          Timestamp (in seconds) of frame used for thumbnail.

          May differ slightly from chapter_start_seconds if
          a more visually representative frame was selected.
        range: float
        required: false
        minimum_value: 0.0
        examples:
          - value: 122.5
            description: "Thumbnail from 2:02.5"

      parent_chapter_id:
        slot_uri: dcterms:isPartOf
        description: |
          Reference to parent chapter for hierarchical chapters.

          Dublin Core: isPartOf for containment relationship.

          Enables nested chapter structure:
          ```
          Chapter 1: Dutch Masters
          ├─ 1.1: Rembrandt
          └─ 1.2: Vermeer
          ```

          null/empty for top-level chapters.
        range: string
        required: false
        examples:
          - value: "ABC123_chapter_0"
            description: "This is a sub-chapter of chapter 0"

      nesting_level:
        slot_uri: hc:nestingLevel
        description: |
          Depth level in chapter hierarchy.

          - 0: Top-level chapter
          - 1: First-level sub-chapter
          - 2: Second-level sub-chapter
          - etc.

          Most platforms only support level 0 (flat chapters).
        range: integer
        required: false
        minimum_value: 0
        examples:
          - value: 0
            description: "Top-level chapter"
          - value: 1
            description: "Sub-chapter"

    comments:
      - "Models video chapters for navigation (YouTube chapters, etc.)"
      - "Supports both manual and auto-generated chapters"
      - "Temporal boundaries via composition with VideoTimeSegment pattern"
      - "Hierarchical chapters supported via parent_chapter_id"
      - "Schema.org Clip alignment for semantic web compatibility"

    see_also:
      - "https://support.google.com/youtube/answer/9884579"
      - "https://schema.org/Clip"
      - "https://www.w3.org/TR/media-frags/"


  # ==========================================================================
  # Supporting Class: VideoChapterList
  # ==========================================================================

  VideoChapterList:
    class_uri: schema:ItemList
    description: |
      A collection of chapters for a video.

      Groups all chapters for a video with metadata about the chapter set.

      Enables bulk operations on chapters:
      - Import/export of chapter lists
      - Validation of chapter coverage
      - Source tracking for entire chapter set

    exact_mappings:
      - schema:ItemList

    slots:
      - video_id
      - chapters
      - total_chapters
      - chapters_source
      - chapters_generated_at
      - covers_full_video

    slot_usage:
      video_id:
        slot_uri: schema:isPartOf
        description: Reference to the parent video
        range: string
        required: true

      chapters:
        slot_uri: schema:itemListElement
        description: Ordered list of chapters
        range: VideoChapter
        multivalued: true
        required: true
        inlined_as_list: true

      total_chapters:
        slot_uri: hc:totalChapters
        description: Total number of chapters
        range: integer
        required: false
        minimum_value: 0

      chapters_source:
        slot_uri: prov:wasAttributedTo
        description: Primary source for this chapter list
        range: ChapterSourceEnum
        required: false

      chapters_generated_at:
        slot_uri: prov:generatedAtTime
        description: When chapters were generated/extracted
        range: datetime
        required: false

      covers_full_video:
        slot_uri: hc:coversFullVideo
        description: |
          Whether chapters cover the entire video duration.

          - true: No gaps, first chapter at 0:00, last ends at video end
          - false: Partial coverage (gaps between chapters)
        range: boolean
        required: false
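For orientation, a minimal VideoChapterList instance using these slots might look like the following; the video and chapter identifiers are illustrative, not taken from the examples file:

```yaml
video_id: "ABC123"
total_chapters: 2
chapters_source: MANUAL
covers_full_video: true
chapters:
  - chapter_id: "ABC123_chapter_0"
    chapter_title: "Introduction"
    chapter_index: 0
    chapter_start_seconds: 0.0
    chapter_end_seconds: 120.0
  - chapter_id: "ABC123_chapter_1"
    chapter_title: "The Night Watch"
    chapter_index: 1
    chapter_start_seconds: 120.0
    chapter_end_seconds: 300.0
```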

# ============================================================================
# Enumerations
# ============================================================================

enums:

  ChapterSourceEnum:
    description: |
      Source or method that created video chapters.
    permissible_values:
      MANUAL:
        description: Creator manually defined chapters in video description
      YOUTUBE_AI:
        description: YouTube auto-chapters feature (AI-generated)
      WHISPER_CHAPTERS:
        description: Generated from Whisper transcript analysis
      SCENE_DETECTION:
        description: Based on visual scene change detection
      TRANSCRIPT_ANALYSIS:
        description: Topic segmentation from transcript
      THIRD_PARTY:
        description: External tool or service
      IMPORTED:
        description: Imported from another platform/format
      UNKNOWN:
        description: Chapter source not determined

# ============================================================================
# Slot Definitions
# ============================================================================

slots:
  chapter_id:
    description: Unique identifier for chapter
    range: string

  chapter_title:
    description: Display title of chapter
    range: string

  chapter_index:
    description: Zero-based index in chapter sequence
    range: integer

  chapter_description:
    description: Detailed description of chapter content
    range: string

  chapter_start_seconds:
    description: Start time in seconds
    range: float

  chapter_end_seconds:
    description: End time in seconds
    range: float

  chapter_start_time:
    description: Start time as ISO 8601 duration
    range: string

  chapter_end_time:
    description: End time as ISO 8601 duration
    range: string

  auto_generated:
    description: Whether chapter was auto-generated by AI
    range: boolean

  chapter_source:
    description: Source that created this chapter
    range: ChapterSourceEnum

  chapter_thumbnail_url:
    description: URL to chapter thumbnail image
    range: uri

  chapter_thumbnail_timestamp:
    description: Timestamp of thumbnail frame
    range: float

  parent_chapter_id:
    description: Reference to parent chapter for nesting
    range: string

  nesting_level:
    description: Depth level in chapter hierarchy
    range: integer

  # VideoChapterList slots
  video_id:
    description: Reference to parent video
    range: string

  chapters:
    description: Ordered list of video chapters
    range: VideoChapter
    multivalued: true

  total_chapters:
    description: Total number of chapters
    range: integer

  chapters_source:
    description: Primary source for chapter list
    range: ChapterSourceEnum

  chapters_generated_at:
    description: When chapters were generated
    range: datetime

  covers_full_video:
    description: Whether chapters cover entire video
    range: boolean
763
schemas/20251121/linkml/modules/classes/VideoPost.yaml
Normal file
@@ -0,0 +1,763 @@
# Video Post Class
# Concrete subclass of SocialMediaPost for video content with platform-specific properties
#
# Part of Heritage Custodian Ontology v0.9.5
#
# STRUCTURE:
#   SocialMediaPost (parent)
#   └── VideoPost (this class)
#       - duration, definition, captions
#       - view/like/comment metrics
#       - YouTube-specific fields
#
# DATA SOURCE EXAMPLE:
# From data/custodian/NL-GE-AAL-M-NOM-nationaal_onderduikmuseum.yaml:
#   youtube_enrichment:
#     videos:
#       - video_id: FbIoC-Owy-M
#         duration: PT10M59S
#         definition: hd
#         caption_available: false
#         view_count: 132
#         like_count: 2
#         comment_count: 0

id: https://nde.nl/ontology/hc/class/VideoPost
name: video_post_class
title: Video Post Class

imports:
  - linkml:types
  - ./SocialMediaPost
  - ./SocialMediaPostTypes
  - ../slots/language

prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  schema: http://schema.org/
  dcterms: http://purl.org/dc/terms/
  prov: http://www.w3.org/ns/prov#
  crm: http://www.cidoc-crm.org/cidoc-crm/
  skos: http://www.w3.org/2004/02/skos/core#
  as: https://www.w3.org/ns/activitystreams#
  wikidata: http://www.wikidata.org/entity/

default_prefix: hc

classes:

  VideoPost:
    is_a: SocialMediaPost
    class_uri: as:Video
    abstract: false
    description: |
      Concrete class for video content with platform-specific properties.

      **DEFINITION**:

      VideoPost is a specialized SocialMediaPost for video content. It extends
      the base post class with video-specific slots for duration, resolution,
      captions, and engagement metrics.

      **EXTENDS**: SocialMediaPost

      This class adds:
      - Video technical properties (duration, definition, aspect ratio)
      - Caption and subtitle availability
      - Engagement metrics (views, likes, comments)
      - Platform-specific fields (YouTube category, live broadcast status)
      - Temporal markers (chapters, segments)

      **ONTOLOGY MAPPINGS**:

      | Property   | Activity Streams | Schema.org                  |
      |------------|------------------|-----------------------------|
      | Class      | as:Video         | schema:VideoObject          |
      | duration   | as:duration      | schema:duration             |
      | definition | -                | schema:videoQuality         |
      | caption    | -                | schema:caption              |
      | view_count | -                | schema:interactionStatistic |

      **PLATFORM SUPPORT**:

      | Platform        | Duration Limit      | Resolution | Captions       |
      |-----------------|---------------------|------------|----------------|
      | YouTube         | 12 hours (verified) | Up to 8K   | VTT, SRT       |
      | Vimeo           | Varies by plan      | Up to 8K   | VTT, SRT       |
      | Facebook        | 4 hours             | Up to 4K   | Auto-generated |
      | TikTok          | 10 minutes          | 1080p      | Auto-generated |
      | Instagram Reels | 90 seconds          | 1080p      | Auto-generated |

      **HERITAGE INSTITUTION USE CASES**:

      | Content Type       | Typical Duration | Platform           |
      |--------------------|------------------|--------------------|
      | Virtual tours      | 10-30 min        | YouTube            |
      | Conservation docs  | 5-20 min         | YouTube, Vimeo     |
      | Curator interviews | 15-60 min        | YouTube            |
      | Object spotlights  | 2-5 min          | YouTube, Instagram |
      | Short clips        | 15-60 sec        | TikTok, Reels      |
      | Live recordings    | 30-120 min       | YouTube            |

      **METRICS OBSERVATION**:

      Video metrics (views, likes, comments) are observational data that change
      constantly. Each metric reading should include:
      - `metrics_observed_at`: When metrics were recorded
      - `retrieval_timestamp`: When the API call was made

      **RELATIONSHIP TO VideoPostType**:

      - VideoPost is a **concrete post instance** with video content
      - VideoPostType is a **type classification** for categorizing posts
      - A VideoPost typically has `post_types: [VideoPostType]`
      - But may also have multiple types: `[LiveStreamPostType, VideoPostType]`

      **CAPTION AND SUBTITLE DISTINCTION**:

      Related classes for textual content derived from video:
      - VideoSubtitle: Time-coded text (SRT/VTT format)
      - VideoTranscript: Full text without timestamps
      - VideoAnnotation: Computer vision derived content

      See the VideoTextContent hierarchy for detailed modeling.

    exact_mappings:
      - as:Video
      - schema:VideoObject

    close_mappings:
      - crm:E73_Information_Object

    related_mappings:
      - wikidata:Q34508    # Video
      - wikidata:Q604644   # Online video

    slots:
      # ========================================
      # Video Technical Properties
      # ========================================
      - duration
      - definition
      - aspect_ratio
      - frame_rate

      # ========================================
      # Caption and Subtitle Availability
      # ========================================
      - caption_available
      - default_language
      - default_audio_language
      - available_caption_languages

      # ========================================
      # Engagement Metrics
      # ========================================
      - view_count
      - like_count
      - dislike_count
      - comment_count
      - favorite_count
      - metrics_observed_at

      # ========================================
      # Platform-Specific
      # ========================================
      - video_category_id
      - live_broadcast_content
      - is_licensed_content
      - is_embeddable
      - is_made_for_kids

      # ========================================
      # Comments/Replies
      # ========================================
      - comments_fetched
      - video_comments

    slot_usage:
      # --- Video Technical Properties ---

      duration:
        slot_uri: schema:duration
        description: |
          Duration of the video in ISO 8601 format.

          Schema.org: duration for media length.

          **Format**: ISO 8601 duration (e.g., "PT10M59S" = 10 minutes 59 seconds)

          **Common Patterns**:
          - PT30S = 30 seconds
          - PT5M = 5 minutes
          - PT1H30M = 1 hour 30 minutes
          - PT2H15M30S = 2 hours 15 minutes 30 seconds
        range: string
        required: false
        pattern: "^P(T(\\d+H)?(\\d+M)?(\\d+S)?)?$"
        examples:
          - value: "PT10M59S"
            description: "10 minutes and 59 seconds"
          - value: "PT1H30M"
            description: "1 hour 30 minutes"

      definition:
        slot_uri: schema:videoQuality
        description: |
          Video resolution/definition quality.

          Schema.org: videoQuality for resolution class.

          **Values**:
          - sd: Standard definition (480p or lower)
          - hd: High definition (720p, 1080p)
          - 4k: Ultra HD (2160p)
          - 8k: Full Ultra HD (4320p)
        range: VideoDefinitionEnum
        required: false
        examples:
          - value: "hd"
            description: "High definition (720p/1080p)"

      aspect_ratio:
        slot_uri: schema:width
        description: |
          Video aspect ratio.

          **Common Values**:
          - 16:9: Standard widescreen (YouTube default)
          - 9:16: Vertical (Shorts, Reels, TikTok)
          - 4:3: Classic TV format
          - 1:1: Square (Instagram legacy)
          - 21:9: Cinematic ultrawide
        range: string
        required: false
        examples:
          - value: "16:9"
            description: "Standard widescreen"
          - value: "9:16"
            description: "Vertical format for Shorts/Reels"

      frame_rate:
        slot_uri: hc:frameRate
        description: |
          Video frame rate in frames per second.

          **Common Values**:
          - 24: Cinema standard
          - 25: PAL standard
          - 30: NTSC standard
          - 60: High frame rate
        range: float
        required: false
        examples:
          - value: 30.0
            description: "30 frames per second"

      # --- Caption and Subtitle Availability ---

      caption_available:
        slot_uri: schema:hasPart
        description: |
          Whether captions/subtitles are available for this video.

          Indicates if the video has any caption tracks (auto-generated or manual).

          Related: Use `available_caption_languages` for specific languages.
        range: boolean
        required: false
        examples:
          - value: true
            description: "Video has captions available"
          - value: false
            description: "No captions available"

      default_language:
        slot_uri: schema:inLanguage
        description: |
          Default/primary language of the video content.

          Schema.org: inLanguage for content language.

          ISO 639-1 code (e.g., "nl", "en", "de").

          Refers to on-screen text, title, description language.
        range: string
        required: false
        examples:
          - value: "nl"
            description: "Dutch language content"

      default_audio_language:
        slot_uri: hc:defaultAudioLanguage
        description: |
          Language of the video's default audio track.

          ISO 639-1 code. May differ from `default_language` for
          dubbed or multilingual content.
        range: string
        required: false
        examples:
          - value: "nl"
            description: "Dutch audio track"

      available_caption_languages:
        slot_uri: hc:availableCaptionLanguages
        description: |
          List of languages for which captions are available.

          ISO 639-1 codes for all caption tracks.
        range: string
        multivalued: true
        required: false
        examples:
          - value: ["nl", "en", "de"]
            description: "Captions available in Dutch, English, German"

      # --- Engagement Metrics ---

      view_count:
        slot_uri: schema:interactionCount
        description: |
          Number of views for this video.

          Schema.org: interactionCount for view statistic.

          **OBSERVATIONAL**: This value changes constantly.
          Always record `metrics_observed_at` timestamp.
        range: integer
        required: false
        minimum_value: 0
        examples:
          - value: 132
            description: "132 views at observation time"

      like_count:
        slot_uri: hc:likeCount
        description: |
          Number of likes/upvotes for this video.

          Platform-specific: YouTube likes, Facebook reactions, etc.

          **OBSERVATIONAL**: Record with `metrics_observed_at`.
        range: integer
        required: false
        minimum_value: 0
        examples:
          - value: 2
            description: "2 likes at observation time"

      dislike_count:
        slot_uri: hc:dislikeCount
        description: |
          Number of dislikes/downvotes (if available).

          Note: YouTube hid public dislike counts in Nov 2021.
          API may still return dislike data for channel owners.
        range: integer
        required: false
        minimum_value: 0

      comment_count:
        slot_uri: hc:commentCount
        description: |
          Number of comments on this video.

          **OBSERVATIONAL**: Record with `metrics_observed_at`.
        range: integer
        required: false
        minimum_value: 0
        examples:
          - value: 0
            description: "No comments at observation time"

      favorite_count:
        slot_uri: hc:favoriteCount
        description: |
          Number of times video was favorited/saved.

          Platform-specific availability.
        range: integer
        required: false
        minimum_value: 0

      metrics_observed_at:
        slot_uri: prov:atTime
        description: |
          Timestamp when engagement metrics were recorded.

          PROV-O: atTime for observation timestamp.

          **CRITICAL**: Metrics change constantly. This timestamp
          indicates when view_count, like_count, etc. were observed.
        range: datetime
        required: false
        examples:
          - value: "2025-12-01T23:16:22.294232+00:00"
            description: "Metrics observed December 1, 2025"

      # --- Platform-Specific ---

      video_category_id:
        slot_uri: hc:videoCategoryId
        description: |
          Platform-specific category identifier.

          **YouTube Category IDs**:
          - 1: Film & Animation
          - 2: Autos & Vehicles
          - 10: Music
          - 15: Pets & Animals
          - 17: Sports
          - 19: Travel & Events
          - 20: Gaming
          - 22: People & Blogs
          - 23: Comedy
          - 24: Entertainment
          - 25: News & Politics
          - 26: Howto & Style
          - 27: Education
          - 28: Science & Technology
          - 29: Nonprofits & Activism
        range: string
        required: false
        examples:
          - value: "22"
            description: "YouTube: People & Blogs"
          - value: "27"
            description: "YouTube: Education"

      live_broadcast_content:
        slot_uri: hc:liveBroadcastContent
        description: |
          Live broadcast status of the video.

          **Values**:
          - none: Not a live broadcast (standard video)
          - live: Currently broadcasting live
          - upcoming: Scheduled live stream not yet started

          When `live` or `upcoming` becomes `none`, the video is archived.
        range: LiveBroadcastStatusEnum
        required: false
        examples:
          - value: "none"
            description: "Standard video (not live)"
          - value: "live"
            description: "Currently broadcasting"

      is_licensed_content:
        slot_uri: hc:isLicensedContent
        description: |
          Whether the video contains licensed content (music, clips).

          Affects monetization and regional availability.
        range: boolean
        required: false

      is_embeddable:
        slot_uri: hc:isEmbeddable
        description: |
          Whether the video can be embedded on external sites.

          Publisher-controlled setting.
        range: boolean
        required: false

      is_made_for_kids:
        slot_uri: hc:isMadeForKids
        description: |
          Whether the video is designated as made for children.

          COPPA compliance flag. Affects comments, ads, features.
        range: boolean
        required: false

      # --- Comments ---

      comments_fetched:
        slot_uri: hc:commentsFetched
        description: |
          Number of comments actually fetched/archived.

          May be less than `comment_count` due to API limits,
          deleted comments, or pagination.
        range: integer
        required: false
        minimum_value: 0
        examples:
          - value: 0
            description: "No comments fetched"

      video_comments:
        slot_uri: hc:videoComments
        description: |
          Collection of comments on this video.

          Structured comment data with author, text, timestamp, likes.

          Note: Comments may contain nested replies.
        range: VideoComment
        multivalued: true
        required: false
        inlined: true

    comments:
      - "Extends SocialMediaPost with video-specific properties"
      - "Maps to as:Video and schema:VideoObject"
      - "Metrics are observational - always include metrics_observed_at"
      - "Caption availability is signalled here, not caption content (see VideoSubtitle)"
      - "YouTube is the primary platform for heritage institution video content"

    see_also:
      - "https://www.w3.org/ns/activitystreams#Video"
      - "https://schema.org/VideoObject"
      - "https://developers.google.com/youtube/v3/docs/videos"

  # ==========================================================================
  # Supporting Class: VideoComment
  # ==========================================================================

  VideoComment:
    class_uri: schema:Comment
    description: |
      A comment on a video post.

      Models user-generated comments with author, text, timestamp,
      and engagement metrics. Supports nested reply threads.

    exact_mappings:
      - schema:Comment
      - as:Note

    slots:
      - comment_id
      - comment_author
      - comment_author_channel_id
      - comment_text
      - comment_published_at
      - comment_updated_at
      - comment_like_count
      - comment_reply_count
      - comment_replies

    slot_usage:
      comment_id:
        slot_uri: dcterms:identifier
        description: Unique identifier for the comment
        range: string
        required: true

      comment_author:
        slot_uri: schema:author
        description: Display name of comment author
        range: string
        required: true

      comment_author_channel_id:
        slot_uri: hc:authorChannelId
        description: Platform channel/account ID of author
        range: string
        required: false

      comment_text:
        slot_uri: schema:text
        description: Full text content of the comment
        range: string
        required: true

      comment_published_at:
        slot_uri: dcterms:created
        description: When comment was originally posted
        range: datetime
        required: true

      comment_updated_at:
        slot_uri: dcterms:modified
        description: When comment was last edited
        range: datetime
        required: false

      comment_like_count:
        slot_uri: hc:likeCount
        description: Number of likes on this comment
        range: integer
        required: false
        minimum_value: 0

      comment_reply_count:
        slot_uri: hc:replyCount
        description: Number of replies to this comment
        range: integer
        required: false
        minimum_value: 0

      comment_replies:
        slot_uri: schema:comment
        description: Nested reply comments
        range: VideoComment
        multivalued: true
        required: false
        inlined: true
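Because comment_replies nests VideoComment recursively, totals over a thread need a recursive walk rather than a flat count. A minimal sketch, assuming comments as dicts keyed by the schema's slot names:

```python
def total_comments(comments):
    """Count comments including all nested comment_replies threads."""
    count = 0
    for comment in comments:
        count += 1 + total_comments(comment.get("comment_replies", []))
    return count

thread = [
    {"comment_id": "a", "comment_replies": [
        {"comment_id": "a1"},
        {"comment_id": "a2"},
    ]},
    {"comment_id": "b"},
]
print(total_comments(thread))  # 4
```

Such a recursive total can be compared against `comments_fetched` to spot truncated archives.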

# ============================================================================
# Supporting Enumerations
# ============================================================================

enums:

  VideoDefinitionEnum:
    description: |
      Video resolution/definition quality categories.

      Based on common platform standards.
    permissible_values:
      sd:
        description: Standard definition (480p or lower)
      hd:
        description: High definition (720p, 1080p)
      uhd:
        description: Ultra HD (2160p/4K)
      4k:
        description: 4K resolution (2160p) - alias for uhd
      8k:
        description: Full Ultra HD (4320p)

  LiveBroadcastStatusEnum:
    description: |
      Live broadcast status values for video content.

      Based on YouTube API liveBroadcastContent values.
    permissible_values:
      none:
        description: Not a live broadcast (standard uploaded video)
      live:
        description: Currently broadcasting live
      upcoming:
        description: Scheduled live stream that hasn't started yet
|
||||
|
||||
# ============================================================================
|
||||
# Slot Definitions
|
||||
# ============================================================================
|
||||
|
||||
slots:
|
||||
duration:
|
||||
description: Duration in ISO 8601 format
|
||||
range: string
|
||||
|
||||
definition:
|
||||
description: Video resolution quality (sd, hd, 4k, 8k)
|
||||
range: VideoDefinitionEnum
|
||||
|
||||
aspect_ratio:
|
||||
description: Video aspect ratio (16:9, 9:16, 4:3, etc.)
|
||||
range: string
|
||||
|
||||
frame_rate:
|
||||
description: Frame rate in FPS
|
||||
range: float
|
||||
|
||||
caption_available:
|
||||
    description: Whether captions are available
    range: boolean

  default_audio_language:
    description: Language of default audio track
    range: string

  available_caption_languages:
    description: Languages for which captions exist
    range: string
    multivalued: true

  view_count:
    description: Number of views
    range: integer

  like_count:
    description: Number of likes
    range: integer

  dislike_count:
    description: Number of dislikes
    range: integer

  comment_count:
    description: Number of comments
    range: integer

  favorite_count:
    description: Number of favorites/saves
    range: integer

  metrics_observed_at:
    description: When metrics were recorded
    range: datetime

  video_category_id:
    description: Platform category identifier
    range: string

  live_broadcast_content:
    description: Live broadcast status
    range: LiveBroadcastStatusEnum

  is_licensed_content:
    description: Contains licensed content
    range: boolean

  is_embeddable:
    description: Can be embedded externally
    range: boolean

  is_made_for_kids:
    description: COPPA kids content flag
    range: boolean

  comments_fetched:
    description: Number of comments actually retrieved
    range: integer

  video_comments:
    description: Collection of video comments
    range: VideoComment
    multivalued: true

  # VideoComment slots
  comment_id:
    description: Unique comment identifier
    range: string

  comment_author:
    description: Comment author display name
    range: string

  comment_author_channel_id:
    description: Author's channel/account ID
    range: string

  comment_text:
    description: Comment text content
    range: string

  comment_published_at:
    description: When comment was posted
    range: datetime

  comment_updated_at:
    description: When comment was edited
    range: datetime

  comment_like_count:
    description: Likes on this comment
    range: integer

  comment_reply_count:
    description: Number of replies
    range: integer

  comment_replies:
    description: Nested reply comments
    range: VideoComment
    multivalued: true
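The `comment_replies` slot above is recursive: a `VideoComment` can contain further `VideoComment` instances to arbitrary depth, which is easy to miss when consuming instance data. A minimal sketch of walking such a thread, assuming comments are plain dicts keyed by the slot names above (illustrative only, not a generated API):

```python
# Walk a nested VideoComment tree. comment_replies is recursive and
# multivalued, so a generator is a natural fit for flattening a thread.
def iter_comments(comment, depth=0):
    """Yield (depth, comment) for a comment and all nested replies."""
    yield depth, comment
    for reply in comment.get("comment_replies", []):
        yield from iter_comments(reply, depth + 1)

thread = {
    "comment_id": "c1",
    "comment_text": "Beautiful restoration work!",
    "comment_replies": [
        {"comment_id": "c2", "comment_text": "Agreed.", "comment_replies": []},
    ],
}

flat = [(d, c["comment_id"]) for d, c in iter_comments(thread)]
# flat == [(0, "c1"), (1, "c2")]
```

The same traversal also gives a cheap consistency check: the number of yielded replies should match the sum of `comment_reply_count` values when a thread was fetched completely.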
632  schemas/20251121/linkml/modules/classes/VideoSubtitle.yaml  Normal file
@@ -0,0 +1,632 @@
# Video Subtitle Class
# Time-coded caption/subtitle content extending VideoTranscript
#
# Part of Heritage Custodian Ontology v0.9.5
#
# HIERARCHY:
#   E73_Information_Object (CIDOC-CRM)
#     │
#     └── VideoTextContent (abstract - provenance)
#           │
#           └── VideoTranscript (full text transcription)
#                 │
#                 └── VideoSubtitle (this class - time-coded captions)
#
# DESIGN RATIONALE:
#   VideoSubtitle extends VideoTranscript because subtitles ARE transcripts
#   with additional time-coding and display metadata:
#
#   1. A subtitle file (SRT, VTT) contains complete spoken content (transcript)
#   2. Plus precise start/end times for each caption
#   3. Plus display formatting (position, styling in some formats)
#
#   You can always derive a plain transcript from subtitles by stripping times.
#   This inheritance enables polymorphic handling: treat subtitles as transcripts
#   when time-coding isn't needed.
#
# SUBTITLE FORMATS SUPPORTED:
#   - SRT (SubRip): Most common, simple time + text
#   - VTT (WebVTT): W3C standard, supports styling
#   - TTML (DFXP): XML-based, broadcast standard
#   - SBV (YouTube): YouTube's native format
#   - ASS/SSA: Advanced styling, anime subtitles

id: https://nde.nl/ontology/hc/class/VideoSubtitle
name: video_subtitle_class
title: Video Subtitle Class

imports:
  - linkml:types
  - ./VideoTranscript
  - ./VideoTimeSegment

prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  schema: http://schema.org/
  dcterms: http://purl.org/dc/terms/
  prov: http://www.w3.org/ns/prov#
  crm: http://www.cidoc-crm.org/cidoc-crm/
  skos: http://www.w3.org/2004/02/skos/core#
  ma: http://www.w3.org/ns/ma-ont#

default_prefix: hc
classes:

  VideoSubtitle:
    is_a: VideoTranscript
    class_uri: hc:VideoSubtitle
    abstract: false
    description: |
      Time-coded caption/subtitle content for video.

      **DEFINITION**:

      VideoSubtitle represents caption/subtitle tracks that provide time-coded
      text synchronized with video playback. It extends VideoTranscript because
      subtitles contain complete transcription PLUS temporal synchronization.

      **INHERITANCE FROM VideoTranscript**:

      VideoSubtitle inherits all transcript capabilities:
      - `full_text`: Complete subtitle text concatenated
      - `segments`: Time-coded entries (REQUIRED for subtitles)
      - `includes_timestamps`: Always true for subtitles
      - `content_language`: Language of subtitle text
      - All provenance from VideoTextContent

      And adds subtitle-specific properties:
      - `subtitle_format`: SRT, VTT, TTML, SBV, ASS
      - `is_closed_caption`: CC vs regular subtitles
      - `is_sdh`: Subtitles for Deaf/Hard-of-Hearing
      - `includes_sound_descriptions`: Non-speech audio descriptions

      **SCHEMA.ORG ALIGNMENT**:

      Maps to `schema:caption` property:
      > "For downloadable machine formats (closed caption, subtitles etc.)
      > use the MediaObject.encodingFormat property."

      **SUBTITLE vs CAPTION vs TRANSCRIPT**:

      | Type | Time-coded | Purpose | Audience |
      |------|------------|---------|----------|
      | Transcript | Optional | Reading, search | Everyone |
      | Subtitle | Required | Language translation | Hearing viewers |
      | Caption (CC) | Required | Accessibility | Deaf/HoH viewers |
      | SDH | Required | Full accessibility | Deaf viewers, noisy environments |

      **SDH (Subtitles for Deaf/Hard-of-Hearing)**:

      SDH differs from regular subtitles by including:
      - Speaker identification: "(John) Hello"
      - Sound effects: "[door slams]", "[music playing]"
      - Music descriptions: "♪ upbeat jazz ♪"
      - Emotional cues: "[laughing]", "[whispering]"

      **SUBTITLE FORMATS**:

      | Format | Extension | Features | Use Case |
      |--------|-----------|----------|----------|
      | SRT | .srt | Simple, universal | Most video players |
      | VTT | .vtt | W3C standard, styling | HTML5 video, web |
      | TTML | .ttml/.dfxp | XML, rich styling | Broadcast, streaming |
      | SBV | .sbv | YouTube native | YouTube uploads |
      | ASS | .ass | Advanced styling | Anime, complex layouts |

      **SRT FORMAT EXAMPLE**:

      ```
      1
      00:00:00,000 --> 00:00:03,500
      Welcome to the Rijksmuseum.

      2
      00:00:03,500 --> 00:00:08,200
      Today we'll explore the Night Watch gallery.
      ```

      **VTT FORMAT EXAMPLE**:

      ```
      WEBVTT

      00:00:00.000 --> 00:00:03.500
      Welcome to the Rijksmuseum.

      00:00:03.500 --> 00:00:08.200
      Today we'll explore the Night Watch gallery.
      ```

      **HERITAGE INSTITUTION CONTEXT**:

      Subtitles are critical for heritage video accessibility:

      1. **Accessibility Compliance**: WCAG 2.1, Section 508
      2. **Multilingual Access**: Translate for international audiences
      3. **Silent Viewing**: Social media, public displays, quiet spaces
      4. **Search Discovery**: Subtitle text is indexed by platforms
      5. **Preservation**: Text outlasts video format obsolescence

      **YOUTUBE API INTEGRATION**:

      Subtitle tracks from the YouTube API populate:
      - `subtitle_format`: Typically VTT or SRT
      - `generation_method`: PLATFORM_PROVIDED or ASR_AUTOMATIC
      - `content_language`: From track language code
      - `is_auto_generated`: YouTube auto-caption flag

      **SEGMENTS ARE REQUIRED**:

      Unlike VideoTranscript, where segments are optional, VideoSubtitle
      REQUIRES the `segments` slot to be populated with VideoTimeSegment
      entries that include start_seconds, end_seconds, and segment_text.

    exact_mappings:
      - schema:caption

    close_mappings:
      - ma:CaptioningFormat

    related_mappings:
      - schema:transcript
    slots:
      # Subtitle-specific format
      - subtitle_format
      - raw_subtitle_content

      # Accessibility metadata
      - is_closed_caption
      - is_sdh
      - includes_sound_descriptions
      - includes_music_descriptions
      - includes_speaker_identification

      # Source/generation info
      - is_auto_generated
      - track_name
      - track_id

      # Positioning (for formats that support it)
      - default_position

      # Entry counts
      - entry_count
      - average_entry_duration_seconds
    slot_usage:
      # Override segments to be required for subtitles
      segments:
        required: true
        description: |
          Time-coded subtitle entries as VideoTimeSegment objects.

          **REQUIRED for VideoSubtitle** (optional in parent VideoTranscript).

          Each segment represents one caption display unit:
          - start_seconds: When caption appears
          - end_seconds: When caption disappears
          - segment_text: Caption text content
          - segment_index: Order in subtitle track
          - confidence: For auto-generated captions

          Segments are ordered by start_seconds for proper playback.

      # Override includes_timestamps to default true
      includes_timestamps:
        ifabsent: "true"
        description: |
          Whether subtitle includes time markers.

          **Always true for VideoSubtitle** - time-coding is definitional.
      subtitle_format:
        slot_uri: dcterms:format
        description: |
          Subtitle file format.

          Dublin Core: format for resource format.

          Specifies the encoding format of the subtitle content.
          Affects parsing and rendering capabilities.
        range: SubtitleFormatEnum
        required: true
        examples:
          - value: "VTT"
            description: "WebVTT format (W3C standard)"
          - value: "SRT"
            description: "SubRip format (most common)"

      raw_subtitle_content:
        slot_uri: hc:rawSubtitleContent
        description: |
          Original subtitle file content as raw string.

          Preserves the complete subtitle file in its native format.
          Useful for:
          - Format conversion
          - Re-parsing with different tools
          - Archive preservation

          May be large - consider storing separately for large files.
        range: string
        required: false
        examples:
          - value: |
              WEBVTT

              00:00:00.000 --> 00:00:03.500
              Welcome to the museum.
            description: "Complete VTT file content"

      is_closed_caption:
        slot_uri: hc:isClosedCaption
        description: |
          Whether this is a closed caption track (CC).

          Closed captions differ from subtitles:
          - **CC (true)**: Designed for Deaf/HoH, includes non-speech audio
          - **Subtitles (false)**: Translation of dialogue only

          CC typically includes [MUSIC], [APPLAUSE], speaker ID, etc.
        range: boolean
        required: false
        ifabsent: "false"
        examples:
          - value: true
            description: "This is a closed caption track"

      is_sdh:
        slot_uri: hc:isSDH
        description: |
          Whether these are Subtitles for Deaf/Hard-of-Hearing (SDH).

          SDH combines subtitle translation with CC-style annotations:
          - Dialogue translation (like subtitles)
          - Sound descriptions (like CC)
          - Speaker identification

          Typically marked "[SDH]" on streaming platforms.
        range: boolean
        required: false
        ifabsent: "false"
        examples:
          - value: true
            description: "SDH subtitle track"

      includes_sound_descriptions:
        slot_uri: hc:includesSoundDescriptions
        description: |
          Whether subtitle includes non-speech sound descriptions.

          Examples of sound descriptions:
          - [door slams]
          - [phone ringing]
          - [thunder]
          - [footsteps approaching]

          Characteristic of CC and SDH tracks.
        range: boolean
        required: false
        ifabsent: "false"
        examples:
          - value: true
            description: "Contains sound effect descriptions"

      includes_music_descriptions:
        slot_uri: hc:includesMusicDescriptions
        description: |
          Whether subtitle includes music/song descriptions.

          Examples:
          - ♪ upbeat jazz playing ♪
          - [classical music]
          - ♪ singing in Dutch ♪
          - [somber orchestral music]

          Important for heritage content with significant musical elements.
        range: boolean
        required: false
        ifabsent: "false"
        examples:
          - value: true
            description: "Contains music descriptions"

      includes_speaker_identification:
        slot_uri: hc:includesSpeakerIdentification
        description: |
          Whether subtitle identifies speakers.

          Speaker identification patterns:
          - (John): Hello there.
          - NARRATOR: Welcome to the museum.
          - [Curator] This painting dates from 1642.

          Different from transcript speaker_id, which is per-segment;
          this indicates whether the TEXT CONTENT includes labels.
        range: boolean
        required: false
        ifabsent: "false"
        examples:
          - value: true
            description: "Subtitle text includes speaker labels"
      is_auto_generated:
        slot_uri: hc:isAutoGenerated
        description: |
          Whether subtitle was auto-generated by the platform.

          Distinct from generation_method (inherited from VideoTextContent):
          - `is_auto_generated`: Platform flag (YouTube, Vimeo)
          - `generation_method`: How WE know it was generated

          Auto-generated captions typically have lower accuracy.
        range: boolean
        required: false
        ifabsent: "false"
        examples:
          - value: true
            description: "YouTube auto-generated caption"

      track_name:
        slot_uri: schema:name
        description: |
          Human-readable name of the subtitle track.

          Schema.org: name for track label.

          Examples from YouTube:
          - "English"
          - "English (auto-generated)"
          - "Dutch - Nederlands"
          - "English (United Kingdom)"
        range: string
        required: false
        examples:
          - value: "English (auto-generated)"
            description: "YouTube auto-caption track name"

      track_id:
        slot_uri: dcterms:identifier
        description: |
          Platform-specific identifier for this subtitle track.

          Dublin Core: identifier for unique ID.

          Used to fetch/update specific tracks via API.
        range: string
        required: false
        examples:
          - value: "en.3OWxR1w4QfE"
            description: "YouTube caption track ID"

      default_position:
        slot_uri: hc:defaultPosition
        description: |
          Default display position for captions.

          For formats that support positioning (VTT, TTML, ASS):
          - BOTTOM: Default, below video content
          - TOP: Above video content
          - MIDDLE: Center of video

          May be overridden per-segment in advanced formats.
        range: SubtitlePositionEnum
        required: false
        ifabsent: "string(BOTTOM)"
        examples:
          - value: "BOTTOM"
            description: "Standard bottom caption position"

      entry_count:
        slot_uri: hc:entryCount
        description: |
          Number of subtitle entries (caption cues).

          Equals length of segments array.
          Useful for content sizing without loading full segments.
        range: integer
        required: false
        minimum_value: 0
        examples:
          - value: 127
            description: "127 caption cues in this track"

      average_entry_duration_seconds:
        slot_uri: hc:averageEntryDuration
        description: |
          Average duration of subtitle entries in seconds.

          Typical ranges:
          - 2-4 seconds: Normal speech rate
          - < 2 seconds: Rapid dialogue
          - > 5 seconds: Slow narration or long displays

          Useful for quality assessment - very short or long entries
          may indicate timing issues.
        range: float
        required: false
        minimum_value: 0.0
        examples:
          - value: 3.2
            description: "Average 3.2 seconds per caption"
    rules:
      - postconditions:
          description: |
            segments must be populated for VideoSubtitle.
            This is enforced by making segments required in slot_usage.

    comments:
      - "Time-coded caption/subtitle content"
      - "Extends VideoTranscript - subtitles ARE transcripts plus time codes"
      - "Supports multiple formats: SRT, VTT, TTML, SBV, ASS"
      - "Accessibility metadata: CC, SDH, sound/music descriptions"
      - "Critical for heritage video accessibility compliance"

    see_also:
      - "https://schema.org/caption"
      - "https://www.w3.org/TR/webvtt1/"
      - "https://developer.mozilla.org/en-US/docs/Web/API/WebVTT_API"
      - "https://www.3playmedia.com/learn/popular-topics/closed-captioning/"
# ============================================================================
# Enumerations
# ============================================================================

enums:

  SubtitleFormatEnum:
    description: |
      Subtitle/caption file formats.

      Each format has different capabilities for timing precision,
      styling, positioning, and metadata.
    permissible_values:
      SRT:
        description: |
          SubRip subtitle format (.srt).
          Most widely supported format.
          Simple: sequence number, timecode, text.
          No styling or positioning support.
      VTT:
        description: |
          WebVTT format (.vtt).
          W3C standard for HTML5 video.
          Supports styling (CSS), positioning, cue settings.
          Recommended for web delivery.
      TTML:
        description: |
          Timed Text Markup Language (.ttml/.dfxp/.xml).
          W3C XML-based standard.
          Rich styling, regions, timing.
          Used in broadcast and streaming (Netflix, Amazon).
      SBV:
        description: |
          YouTube SubViewer format (.sbv).
          Simple format similar to SRT.
          Native YouTube caption format.
      ASS:
        description: |
          Advanced SubStation Alpha (.ass).
          Advanced styling, positioning, effects.
          Popular for anime subtitles.
          Includes SSA (.ssa) as predecessor.
      STL:
        description: |
          EBU STL format (.stl).
          European Broadcasting Union standard.
          Used in broadcast television.
          Binary format with teletext compatibility.
      CAP:
        description: |
          Scenarist Closed Caption (.scc/.cap).
          Used for broadcast closed captioning.
          CEA-608/CEA-708 compliant.
      SAMI:
        description: |
          Synchronized Accessible Media Interchange (.smi/.sami).
          Microsoft format for Windows Media.
          HTML-like markup with timing.
      LRC:
        description: |
          LRC lyrics format (.lrc).
          Simple format for song lyrics.
          Line-by-line timing, no duration.
      JSON:
        description: |
          JSON-based subtitle format.
          Used by some APIs (YouTube transcript API).
          Structure varies by source.
      UNKNOWN:
        description: |
          Unknown or unrecognized format.
          May require manual parsing or conversion.
  SubtitlePositionEnum:
    description: |
      Default caption display position on video.

      May be overridden by format-specific positioning (VTT, TTML, ASS).
    permissible_values:
      BOTTOM:
        description: |
          Bottom of video frame (standard position).
          Most common for subtitles and captions.
          Typically in lower 10-15% of frame.
      TOP:
        description: |
          Top of video frame.
          Used when bottom is occluded.
          Common for some broadcast formats.
      MIDDLE:
        description: |
          Center of video frame.
          Rarely used except for specific effects.
      LEFT:
        description: |
          Left side of frame (vertical text).
          Rare, used for specific languages/effects.
      RIGHT:
        description: |
          Right side of frame (vertical text).
          Rare, used for specific languages/effects.
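The formats enumerated in SubtitleFormatEnum can usually be inferred from the file extension, following the extensions named in the enum descriptions. A minimal sketch of that lookup (`detect_subtitle_format` is a hypothetical helper for illustration, not part of the schema):

```python
# Map file extensions to SubtitleFormatEnum values, mirroring the
# extensions listed in the enum descriptions above. Illustrative only.
EXTENSION_TO_FORMAT = {
    ".srt": "SRT",
    ".vtt": "VTT",
    ".ttml": "TTML", ".dfxp": "TTML",
    ".sbv": "SBV",
    ".ass": "ASS", ".ssa": "ASS",
    ".stl": "STL",
    ".scc": "CAP", ".cap": "CAP",
    ".smi": "SAMI", ".sami": "SAMI",
    ".lrc": "LRC",
    ".json": "JSON",
}

def detect_subtitle_format(filename: str) -> str:
    """Return the SubtitleFormatEnum value for a filename, or UNKNOWN."""
    ext = "." + filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return EXTENSION_TO_FORMAT.get(ext, "UNKNOWN")

print(detect_subtitle_format("tour_nl.vtt"))   # VTT
print(detect_subtitle_format("captions.scc"))  # CAP
print(detect_subtitle_format("notes.txt"))     # UNKNOWN
```

Extension sniffing is only a hint; content-based detection (e.g. a leading `WEBVTT` header) is more reliable when the extension is missing or wrong, which is why the enum keeps an UNKNOWN value.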
# ============================================================================
# Slot Definitions
# ============================================================================

slots:
  subtitle_format:
    description: Subtitle file format (SRT, VTT, TTML, etc.)
    range: SubtitleFormatEnum

  raw_subtitle_content:
    description: Original subtitle file content as raw string
    range: string

  is_closed_caption:
    description: Whether this is a closed caption (CC) track
    range: boolean

  is_sdh:
    description: Whether these are Subtitles for Deaf/Hard-of-Hearing
    range: boolean

  includes_sound_descriptions:
    description: Whether subtitle includes non-speech sound descriptions
    range: boolean

  includes_music_descriptions:
    description: Whether subtitle includes music descriptions
    range: boolean

  includes_speaker_identification:
    description: Whether subtitle text includes speaker labels
    range: boolean

  is_auto_generated:
    description: Whether subtitle was auto-generated by platform
    range: boolean

  track_name:
    description: Human-readable name of subtitle track
    range: string

  track_id:
    description: Platform-specific identifier for subtitle track
    range: string

  default_position:
    description: Default display position for captions
    range: SubtitlePositionEnum

  entry_count:
    description: Number of subtitle entries (caption cues)
    range: integer

  average_entry_duration_seconds:
    description: Average duration of subtitle entries in seconds
    range: float
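As the class description notes, a plain transcript can always be derived from subtitles by stripping time codes, and `entry_count` / `average_entry_duration_seconds` are derivable from the parsed segments. A minimal sketch for SRT input, using the cue layout from the SRT example above (hand-rolled parser for illustration; the dict keys follow the VideoTimeSegment slot names mentioned in the schema, and a real pipeline would want a robust library parser):

```python
import re

# Parse SRT cues into segment dicts (segment_index, start_seconds,
# end_seconds, segment_text), then derive the summary slots.
TIME = r"(\d{2}):(\d{2}):(\d{2}),(\d{3})"
CUE = re.compile(TIME + r" --> " + TIME)

def to_seconds(h, m, s, ms):
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000.0

def parse_srt(srt_text):
    segments = []
    for index, block in enumerate(srt_text.strip().split("\n\n")):
        lines = block.strip().splitlines()
        match = CUE.search(lines[1])  # line 0 is the sequence number
        segments.append({
            "segment_index": index,
            "start_seconds": to_seconds(*match.groups()[:4]),
            "end_seconds": to_seconds(*match.groups()[4:]),
            "segment_text": " ".join(lines[2:]),
        })
    return segments

srt = """1
00:00:00,000 --> 00:00:03,500
Welcome to the Rijksmuseum.

2
00:00:03,500 --> 00:00:08,200
Today we'll explore the Night Watch gallery."""

segs = parse_srt(srt)
full_text = " ".join(s["segment_text"] for s in segs)  # plain transcript
entry_count = len(segs)
avg = sum(s["end_seconds"] - s["start_seconds"] for s in segs) / len(segs)
```

This is the sense in which "subtitles ARE transcripts": dropping the timing fields from `segs` leaves exactly the transcript content, so the inheritance costs nothing at read time.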
524  schemas/20251121/linkml/modules/classes/VideoTextContent.yaml  Normal file
@@ -0,0 +1,524 @@
# Video Text Content Class
# Abstract base class for all textual/derived content from videos
#
# Part of Heritage Custodian Ontology v0.9.5
#
# HIERARCHY:
#   E73_Information_Object (CIDOC-CRM)
#     │
#     └── VideoTextContent (this class - ABSTRACT)
#           │
#           ├── VideoTranscript (full text transcription)
#           │     │
#           │     └── VideoSubtitle (time-coded captions)
#           │
#           └── VideoAnnotation (CV/multimodal derived)
#                 │
#                 ├── VideoSceneAnnotation
#                 ├── VideoObjectAnnotation
#                 └── VideoOCRAnnotation
#
# DESIGN RATIONALE:
#   All text derived from video (transcripts, subtitles, annotations) shares
#   common provenance requirements:
#   - Source video reference
#   - Generation method (ASR, manual, CV model)
#   - Generation timestamp
#   - Model/tool version
#   - Overall confidence score
#
#   This abstract base ensures consistent provenance tracking across all
#   video-derived text content types.

id: https://nde.nl/ontology/hc/class/VideoTextContent
name: video_text_content_class
title: Video Text Content Class

imports:
  - linkml:types
  - ./VideoPost

prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  schema: http://schema.org/
  dcterms: http://purl.org/dc/terms/
  prov: http://www.w3.org/ns/prov#
  crm: http://www.cidoc-crm.org/cidoc-crm/
  skos: http://www.w3.org/2004/02/skos/core#
  oa: http://www.w3.org/ns/oa#

default_prefix: hc
classes:

  VideoTextContent:
    class_uri: crm:E73_Information_Object
    abstract: true
    description: |
      Abstract base class for all textual/derived content from videos.

      **DEFINITION**:

      VideoTextContent is the abstract parent for all text that is extracted,
      transcribed, or derived from video content. This includes:

      | Subclass | Source | Description |
      |----------|--------|-------------|
      | VideoTranscript | Audio | Full text transcription of spoken content |
      | VideoSubtitle | Audio | Time-coded caption entries (SRT/VTT) |
      | VideoAnnotation | Visual | CV/multimodal-derived descriptions |

      **PROVENANCE REQUIREMENTS**:

      All video-derived text MUST include comprehensive provenance:

      1. **Source**: Which video was processed (`source_video`)
      2. **Method**: How the content was generated (`generation_method`)
      3. **Agent**: Who/what generated it (`generated_by`)
      4. **Time**: When it was generated (`generation_timestamp`)
      5. **Version**: Tool/model version (`model_version`)
      6. **Quality**: Overall confidence (`overall_confidence`)

      **PROV-O ALIGNMENT**:

      Maps to W3C PROV-O for provenance tracking:

      ```turtle
      :transcript a hc:VideoTranscript ;
          prov:wasGeneratedBy :asr_activity ;
          prov:wasAttributedTo :whisper_model ;
          prov:generatedAtTime "2025-12-01T10:00:00Z" ;
          prov:wasDerivedFrom :source_video .
      ```

      **CIDOC-CRM E73_Information_Object**:

      - E73 is the base for all identifiable immaterial items
      - Includes texts, computer programs, songs, recipes
      - VideoTextContent instances are E73 instances derived from video (itself an E73)

      **GENERATION METHODS**:

      | Method | Description | Typical Confidence |
      |--------|-------------|--------------------|
      | ASR_AUTOMATIC | Automatic speech recognition | 0.75-0.95 |
      | ASR_ENHANCED | ASR with post-processing | 0.85-0.98 |
      | MANUAL_TRANSCRIPTION | Human transcription | 0.98-1.0 |
      | MANUAL_CORRECTION | Human-corrected ASR | 0.95-1.0 |
      | CV_AUTOMATIC | Computer vision detection | 0.60-0.90 |
      | MULTIMODAL | Combined audio+visual AI | 0.70-0.95 |
      | OCR | Optical character recognition | 0.80-0.98 |
      | PLATFORM_PROVIDED | From YouTube/Vimeo API | 0.85-0.95 |

      **HERITAGE INSTITUTION CONTEXT**:

      Video text content is critical for:
      - **Accessibility**: Deaf/HoH users need accurate captions
      - **Discovery**: Full-text search over video collections
      - **Preservation**: Text outlasts video format obsolescence
      - **Research**: Analyzing spoken content at scale
      - **Translation**: Multilingual access to heritage content

      **LANGUAGE SUPPORT**:

      - `content_language`: Primary language of text content
      - May differ from video's default_audio_language if translated
      - ISO 639-1 codes (e.g., "nl", "en", "de")

    exact_mappings:
      - crm:E73_Information_Object

    close_mappings:
      - prov:Entity

    related_mappings:
      - schema:CreativeWork
      - dcterms:Text
    slots:
      # Source reference
      - source_video
      - source_video_url

      # Content metadata
      - content_language
      - content_title

      # Provenance - Generation
      - generated_by
      - generation_method
      - generation_timestamp
      - model_version
      - model_provider

      # Quality
      - overall_confidence
      - is_verified
      - verified_by
      - verification_date

      # Processing metadata
      - processing_duration_seconds
      - word_count
      - character_count
|
||||
source_video:
|
||||
slot_uri: prov:wasDerivedFrom
|
||||
description: |
|
||||
Reference to the VideoPost from which this content was derived.
|
||||
|
||||
PROV-O: wasDerivedFrom links derived content to source.
|
||||
|
||||
Links to the video's unique identifier (post_id).
|
||||
range: string
|
||||
required: true
|
||||
examples:
|
||||
- value: "FbIoC-Owy-M"
|
||||
description: "YouTube video ID as source reference"
|
||||
|
||||
source_video_url:
|
||||
slot_uri: schema:url
|
||||
description: |
|
||||
URL of the source video.
|
||||
|
||||
Convenience field for direct video access.
|
||||
Derived from source_video but stored for quick reference.
|
||||
range: uri
|
||||
required: false
|
||||
examples:
|
||||
- value: "https://www.youtube.com/watch?v=FbIoC-Owy-M"
|
||||
description: "Full YouTube video URL"
|
||||
|
||||
      content_language:
        slot_uri: dcterms:language
        description: |
          Primary language of the text content.

          Dublin Core: language for content language.

          ISO 639-1 code. May differ from video's audio language
          if this is a translation or localization.
        range: string
        required: true
        examples:
          - value: "nl"
            description: "Dutch language content"
          - value: "en"
            description: "English translation"

      content_title:
        slot_uri: dcterms:title
        description: |
          Title or label for this text content.

          Dublin Core: title for content name.

          Examples:
          - "Rijksmuseum Tour - Full Transcript"
          - "Dutch Subtitles - Auto-generated"
          - "Scene Annotations - CV Model v2.1"
        range: string
        required: false
        examples:
          - value: "De Vrijheidsroute Ep.3 - Dutch Transcript"
            description: "Descriptive title for transcript"

      generated_by:
        slot_uri: prov:wasAttributedTo
        description: |
          The agent (model, service, person) that generated this content.

          PROV-O: wasAttributedTo identifies the responsible agent.

          **Examples**:
          - AI Models: "openai/whisper-large-v3", "google/speech-to-text"
          - Services: "YouTube Auto-captions", "Rev.com"
          - Human: "transcriber:jane.doe@museum.nl"
        range: string
        required: true
        examples:
          - value: "openai/whisper-large-v3"
            description: "OpenAI Whisper ASR model"
          - value: "YouTube Auto-captions"
            description: "Platform-provided captions"
          - value: "manual:curator@rijksmuseum.nl"
            description: "Human transcriber"

      generation_method:
        slot_uri: prov:wasGeneratedBy
        description: |
          The method used to generate this content.

          PROV-O: wasGeneratedBy for generation activity type.

          See GenerationMethodEnum for standardized values.
        range: GenerationMethodEnum
        required: true
        examples:
          - value: "ASR_AUTOMATIC"
            description: "Automatic speech recognition"
          - value: "MANUAL_TRANSCRIPTION"
            description: "Human transcription"
generation_timestamp:
|
||||
slot_uri: prov:generatedAtTime
|
||||
description: |
|
||||
When this content was generated.
|
||||
|
||||
PROV-O: generatedAtTime for creation timestamp.
|
||||
|
||||
ISO 8601 datetime. Critical for versioning and reproducibility.
|
||||
range: datetime
|
||||
required: true
|
||||
examples:
|
||||
- value: "2025-12-01T10:30:00Z"
|
||||
description: "Generated December 1, 2025 at 10:30 UTC"
|
||||
|
||||
model_version:
|
||||
slot_uri: schema:softwareVersion
|
||||
description: |
|
||||
Version of the model or tool used for generation.
|
||||
|
||||
Schema.org: softwareVersion for version tracking.
|
||||
|
||||
Critical for reproducibility and quality assessment.
|
||||
range: string
|
||||
required: false
|
||||
examples:
|
||||
- value: "large-v3"
|
||||
description: "Whisper model version"
|
||||
- value: "v2.3.1"
|
||||
description: "Software version number"
|
||||
|
||||
model_provider:
|
||||
slot_uri: schema:provider
|
||||
description: |
|
||||
Provider or vendor of the generation model/service.
|
||||
|
||||
Schema.org: provider for service provider.
|
||||
range: string
|
||||
required: false
|
||||
examples:
|
||||
- value: "OpenAI"
|
||||
description: "Model provider"
|
||||
- value: "Google Cloud"
|
||||
description: "Cloud service provider"
|
||||
|
||||
overall_confidence:
|
||||
slot_uri: hc:overallConfidence
|
||||
description: |
|
||||
Overall confidence score for the generated content.
|
||||
|
||||
Range: 0.0 (no confidence) to 1.0 (complete certainty)
|
||||
|
||||
Aggregated from per-segment confidence scores or
|
||||
provided by the generation model.
|
||||
|
||||
**Thresholds** (suggested):
|
||||
- > 0.9: High quality, production-ready
|
||||
- 0.75-0.9: Good, may have minor errors
|
||||
- 0.6-0.75: Usable, should be reviewed
|
||||
- < 0.6: Low quality, needs significant correction
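The suggested thresholds above can be sketched as a small helper; the function name and tier labels are illustrative, not part of the schema:

```python
def quality_tier(overall_confidence: float) -> str:
    """Map an overall_confidence score (0.0-1.0) to the suggested quality tier."""
    if not 0.0 <= overall_confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if overall_confidence > 0.9:
        return "high"    # production-ready
    if overall_confidence >= 0.75:
        return "good"    # may have minor errors
    if overall_confidence >= 0.6:
        return "usable"  # should be reviewed
    return "low"         # needs significant correction

print(quality_tier(0.92))  # high
```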
range: float
required: false
minimum_value: 0.0
maximum_value: 1.0
examples:
- value: 0.92
description: "High confidence ASR output"

is_verified:
slot_uri: hc:isVerified
description: |
Whether content has been verified by a human.

- **true**: Human-reviewed and approved
- **false**: Not yet verified (default for AI-generated)

Critical for quality assurance in heritage contexts.
range: boolean
required: false
ifabsent: "false"
examples:
- value: true
description: "Human-verified transcript"

verified_by:
slot_uri: prov:wasAttributedTo
description: |
Identity of the person who verified the content.

Only populated when is_verified = true.
range: string
required: false
examples:
- value: "curator@rijksmuseum.nl"
description: "Staff member who verified"

verification_date:
slot_uri: dcterms:dateAccepted
description: |
Date when content was verified.

Dublin Core: dateAccepted for approval date.
range: datetime
required: false
examples:
- value: "2025-12-02T15:00:00Z"
description: "Verified December 2, 2025"

processing_duration_seconds:
slot_uri: hc:processingDuration
description: |
Time taken to generate this content, in seconds.

Useful for performance monitoring and cost estimation.
range: float
required: false
minimum_value: 0.0
examples:
- value: 45.3
description: "Processed in 45.3 seconds"

word_count:
slot_uri: hc:wordCount
description: |
Total number of words in the text content.

Useful for content sizing and analysis.
range: integer
required: false
minimum_value: 0
examples:
- value: 1523
description: "1,523 words in transcript"

character_count:
slot_uri: hc:characterCount
description: |
Total number of characters in the text content.

Includes spaces. Useful for storage estimation.
range: integer
required: false
minimum_value: 0
examples:
- value: 8742
description: "8,742 characters"

comments:
- "Abstract base for all video-derived text content"
- "Comprehensive PROV-O provenance tracking"
- "Confidence scoring for AI-generated content"
- "Verification workflow support"
- "Critical for heritage accessibility and discovery"

see_also:
- "https://www.w3.org/TR/prov-o/"
- "http://www.cidoc-crm.org/cidoc-crm/E73_Information_Object"
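The verification workflow these slots support can be sketched as follows; the dict keys mirror the slot names, but the helper name and record shape are illustrative, not part of the schema:

```python
from datetime import datetime, timezone

def mark_verified(content: dict, reviewer: str) -> dict:
    """Flip an AI-generated record to human-verified, filling the related slots."""
    content["is_verified"] = True
    content["verified_by"] = reviewer  # prov:wasAttributedTo
    content["verification_date"] = datetime.now(timezone.utc).isoformat()  # dcterms:dateAccepted
    return content

record = {
    "generation_method": "ASR_AUTOMATIC",
    "overall_confidence": 0.92,
    "is_verified": False,  # default for AI-generated content
}
mark_verified(record, "curator@rijksmuseum.nl")
print(record["is_verified"], record["verified_by"])
```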

# ============================================================================
# Enumerations
# ============================================================================

enums:

GenerationMethodEnum:
description: |
Methods for generating video-derived text content.

Standardized values for provenance tracking.
permissible_values:
ASR_AUTOMATIC:
description: Automatic speech recognition (raw output)
ASR_ENHANCED:
description: ASR with post-processing (punctuation, normalization)
MANUAL_TRANSCRIPTION:
description: Fully human-transcribed content
MANUAL_CORRECTION:
description: Human-corrected ASR output
CV_AUTOMATIC:
description: Computer vision detection (raw output)
CV_ENHANCED:
description: CV with post-processing or filtering
MULTIMODAL:
description: Combined audio+visual AI processing
OCR:
description: Optical character recognition from video frames
PLATFORM_PROVIDED:
description: Content from platform API (YouTube, Vimeo captions)
HYBRID:
description: Combination of automated and manual methods
UNKNOWN:
description: Generation method not recorded

# ============================================================================
# Slot Definitions
# ============================================================================

slots:
source_video:
description: Reference to source VideoPost (video ID)
range: string

source_video_url:
description: URL of the source video
range: uri

content_language:
description: Primary language of text content (ISO 639-1)
range: string

content_title:
description: Title or label for this text content
range: string

generated_by:
description: Agent that generated this content (model, service, person)
range: string

generation_method:
description: Method used to generate content
range: GenerationMethodEnum

generation_timestamp:
description: When content was generated
range: datetime

model_version:
description: Version of model/tool used
range: string

model_provider:
description: Provider of model/service
range: string

overall_confidence:
description: Overall confidence score (0.0-1.0)
range: float

is_verified:
description: Whether content has been human-verified
range: boolean

verified_by:
description: Person who verified the content
range: string

verification_date:
description: Date content was verified
range: datetime

processing_duration_seconds:
description: Time taken to generate content
range: float

word_count:
description: Total word count
range: integer

character_count:
description: Total character count
range: integer
375
schemas/20251121/linkml/modules/classes/VideoTimeSegment.yaml
Normal file

@ -0,0 +1,375 @@
# Video Time Segment Class
# Reusable temporal segment for video content (subtitles, annotations, chapters)
#
# Part of Heritage Custodian Ontology v0.9.5
#
# STRUCTURE:
# VideoTimeSegment (this class)
# - start_time, end_time (ISO 8601 duration)
# - start_seconds, end_seconds (float for computation)
# - segment_text (text content for this segment)
# - confidence (for ASR/CV generated content)
#
# USED BY:
# - VideoSubtitle (time-coded caption entries)
# - VideoAnnotation (scene/object detection segments)
# - VideoChapter (user-defined chapters)
#
# ONTOLOGY ALIGNMENT:
# - Maps to Media Fragments URI 1.0 (W3C) for temporal addressing
# - CIDOC-CRM E52_Time-Span for temporal extent
# - Web Annotation oa:FragmentSelector for annotation targets

id: https://nde.nl/ontology/hc/class/VideoTimeSegment
name: video_time_segment_class
title: Video Time Segment Class

imports:
- linkml:types

prefixes:
linkml: https://w3id.org/linkml/
hc: https://nde.nl/ontology/hc/
schema: http://schema.org/
dcterms: http://purl.org/dc/terms/
crm: http://www.cidoc-crm.org/cidoc-crm/
oa: http://www.w3.org/ns/oa#
ma: http://www.w3.org/ns/ma-ont#

default_prefix: hc

classes:

VideoTimeSegment:
class_uri: crm:E52_Time-Span
abstract: false
description: |
A temporal segment within a video, defined by start and end times.

**DEFINITION**:

VideoTimeSegment represents a bounded temporal portion of video content.
It is the foundational unit for time-coded content including:
- Subtitle/caption entries (text displayed at specific times)
- Annotation segments (detected scenes, objects, faces)
- Chapter markers (user-defined content sections)

**DUAL TIME REPRESENTATION**:

Times are stored in two formats for different use cases:

| Format | Example | Use Case |
|--------|---------|----------|
| ISO 8601 duration | PT0M30S | Human-readable, serialization |
| Seconds (float) | 30.0 | Computation, synchronization |

Both representations MUST be kept in sync. The seconds format is
primary for computation; ISO 8601 is derived for display/storage.

**MEDIA FRAGMENTS URI (W3C)**:

VideoTimeSegment aligns with W3C Media Fragments URI 1.0 specification
for addressing temporal fragments of video:

```
https://example.com/video.mp4#t=30,35
```

The `start_seconds` and `end_seconds` map directly to the `t=` parameter.

**WEB ANNOTATION COMPATIBILITY**:

When used as an annotation target selector:
- Maps to `oa:FragmentSelector` with `conformsTo` Media Fragments
- Enables interoperability with W3C Web Annotation Data Model

**CIDOC-CRM E52_Time-Span**:

In cultural heritage documentation:
- E52_Time-Span comprises abstract temporal extents
- Used for temporal properties of cultural objects
- VideoTimeSegment extends this to media-specific temporal segments

**CONFIDENCE SCORING**:

For segments generated by ASR (speech recognition) or CV (computer vision):
- `confidence`: 0.0-1.0 score for segment accuracy
- Enables filtering by quality threshold
- Critical for AI-generated transcripts and annotations

**HERITAGE USE CASES**:

| Use Case | Example | Start | End |
|----------|---------|-------|-----|
| Subtitle entry | "Welcome to the museum" | 0:30 | 0:35 |
| Scene annotation | "Exhibition hall panorama" | 1:00 | 1:30 |
| Chapter marker | "Introduction" | 0:00 | 2:00 |
| Object detection | "Painting: Night Watch" | 3:15 | 3:20 |
| Speaker change | "Curator speaking" | 5:00 | 7:30 |

exact_mappings:
- crm:E52_Time-Span
- oa:FragmentSelector

close_mappings:
- ma:MediaFragment

related_mappings:
- schema:Clip

slots:
# Time boundaries (ISO 8601 duration format)
- start_time
- end_time

# Time boundaries (seconds for computation)
- start_seconds
- end_seconds

# Content
- segment_text
- segment_index

# Quality
- confidence

# Metadata
- speaker_id
- speaker_label

slot_usage:
start_time:
slot_uri: ma:hasStartTime
description: |
Start time of segment as ISO 8601 duration from video beginning.

Media Ontology: hasStartTime for temporal start.

**Format**: ISO 8601 duration (e.g., "PT0M30S" = 30 seconds from start)

**Common Patterns**:
- PT0S = Start of video (0 seconds)
- PT30S = 30 seconds
- PT1M30S = 1 minute 30 seconds
- PT1H15M30S = 1 hour 15 minutes 30 seconds
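Keeping the two time representations in sync can be sketched as follows; the helper names are illustrative, and only the H/M/S duration subset matched by the pattern below is handled (whole seconds only):

```python
def seconds_to_duration(total: float) -> str:
    """Convert seconds to the ISO 8601 duration subset used here (PT#H#M#S)."""
    hours, rem = divmod(int(total), 3600)  # fractional seconds are truncated
    minutes, seconds = divmod(rem, 60)
    out = "PT"
    if hours:
        out += f"{hours}H"
    if minutes:
        out += f"{minutes}M"
    out += f"{seconds}S"
    return out

def media_fragment(url: str, start_seconds: float, end_seconds: float) -> str:
    """Build a W3C Media Fragments temporal URI (#t=start,end)."""
    return f"{url}#t={start_seconds:g},{end_seconds:g}"

print(seconds_to_duration(4530))  # PT1H15M30S
print(media_fragment("https://example.com/video.mp4", 30, 35))
```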
range: string
required: false
pattern: "^PT(\\d+H)?(\\d+M)?(\\d+(\\.\\d+)?S)?$"
examples:
- value: "PT0M30S"
description: "30 seconds from video start"
- value: "PT1H15M30S"
description: "1 hour 15 minutes 30 seconds"

end_time:
slot_uri: ma:hasEndTime
description: |
End time of segment as ISO 8601 duration from video beginning.

Media Ontology: hasEndTime for temporal end.

Must be greater than or equal to start_time.
range: string
required: false
pattern: "^PT(\\d+H)?(\\d+M)?(\\d+(\\.\\d+)?S)?$"
examples:
- value: "PT0M35S"
description: "35 seconds from video start"

start_seconds:
slot_uri: hc:startSeconds
description: |
Start time in seconds (floating point) from video beginning.

**PRIMARY for computation**. Use for:
- Video player synchronization
- Duration calculations
- Time-based sorting and filtering

Precision to milliseconds (3 decimal places) is typical.
range: float
required: true
minimum_value: 0.0
examples:
- value: 30.0
description: "30 seconds from start"
- value: 30.500
description: "30.5 seconds (millisecond precision)"

end_seconds:
slot_uri: hc:endSeconds
description: |
End time in seconds (floating point) from video beginning.

Must be greater than or equal to start_seconds.

For single-frame annotations (e.g., object detection in one frame),
end_seconds may equal start_seconds or be slightly greater.
range: float
required: true
minimum_value: 0.0
examples:
- value: 35.0
description: "35 seconds from start"

segment_text:
slot_uri: oa:bodyValue
description: |
Text content for this segment.

Web Annotation: bodyValue for textual content.

**Usage by content type**:
- Subtitles: Displayed caption text
- Transcripts: Spoken words during this segment
- Annotations: Description of detected content
- Chapters: Chapter title/description
range: string
required: false
examples:
- value: "Welkom bij het Rijksmuseum"
description: "Dutch subtitle text"
- value: "The curator explains the painting's history"
description: "Transcript segment"

segment_index:
slot_uri: hc:segmentIndex
description: |
Sequential index of this segment within the parent content.

Zero-based index for ordering segments:
- Subtitle: Order in which captions appear
- Annotation: Detection sequence

Enables reconstruction of segment order when times overlap
or for stable sorting.
range: integer
required: false
minimum_value: 0
examples:
- value: 0
description: "First segment"
- value: 42
description: "43rd segment (zero-indexed)"

confidence:
slot_uri: hc:confidence
description: |
Confidence score for AI-generated content.

Range: 0.0 (no confidence) to 1.0 (complete certainty)

**Applies to**:
- ASR-generated transcript/subtitle segments
- CV-detected scene or object annotations
- OCR-extracted text from video frames

**Thresholds** (suggested):
- > 0.9: High confidence, suitable for display
- 0.7-0.9: Medium, may need review
- < 0.7: Low, flag for human verification
range: float
required: false
minimum_value: 0.0
maximum_value: 1.0
examples:
- value: 0.95
description: "High confidence ASR segment"
- value: 0.72
description: "Medium confidence, may contain errors"

speaker_id:
slot_uri: hc:speakerId
description: |
Identifier for the speaker during this segment.

For transcripts with speaker diarization:
- Links to identified speaker (e.g., "SPEAKER_01")
- May be resolved to actual person identity

Enables multi-speaker transcript navigation.
range: string
required: false
examples:
- value: "SPEAKER_01"
description: "First identified speaker"
- value: "curator_taco_dibbits"
description: "Resolved speaker identity"

speaker_label:
slot_uri: hc:speakerLabel
description: |
Human-readable label for the speaker.

Display name for the speaker during this segment:
- May be generic ("Narrator", "Interviewer")
- May be specific ("Dr. Taco Dibbits, Museum Director")

Distinguished from speaker_id which is a machine identifier.
range: string
required: false
examples:
- value: "Narrator"
description: "Generic speaker label"
- value: "Dr. Taco Dibbits, Museum Director"
description: "Specific identified speaker"

rules:
- postconditions:
description: end_seconds must be >= start_seconds
# Note: LinkML doesn't support direct comparison rules,
# but this documents the constraint for validation
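Since LinkML cannot express the comparison directly, a downstream validator might enforce the documented constraints like this (a sketch; the function name and error messages are illustrative):

```python
def validate_segment(segment: dict) -> list:
    """Collect violations of the documented VideoTimeSegment constraints."""
    errors = []
    start, end = segment["start_seconds"], segment["end_seconds"]
    if start < 0.0:
        errors.append("start_seconds must be >= 0")
    if end < start:
        errors.append("end_seconds must be >= start_seconds")
    conf = segment.get("confidence")
    if conf is not None and not 0.0 <= conf <= 1.0:
        errors.append("confidence must be within 0.0-1.0")
    return errors

print(validate_segment({"start_seconds": 30.0, "end_seconds": 35.0, "confidence": 0.95}))  # []
```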

comments:
- "Reusable time segment for subtitles, annotations, chapters"
- "Dual time format: ISO 8601 for serialization, seconds for computation"
- "Aligns with W3C Media Fragments URI specification"
- "Confidence scoring for AI-generated content"
- "Speaker diarization support for multi-speaker transcripts"

see_also:
- "https://www.w3.org/TR/media-frags/"
- "https://www.w3.org/TR/annotation-model/"
- "https://www.w3.org/ns/ma-ont"
- "http://www.cidoc-crm.org/cidoc-crm/E52_Time-Span"

# ============================================================================
# Slot Definitions
# ============================================================================

slots:
start_time:
description: Start time as ISO 8601 duration from video beginning
range: string

end_time:
description: End time as ISO 8601 duration from video beginning
range: string

start_seconds:
description: Start time in seconds (float) from video beginning
range: float

end_seconds:
description: End time in seconds (float) from video beginning
range: float

segment_text:
description: Text content for this time segment
range: string

segment_index:
description: Sequential index of segment within parent
range: integer

confidence:
description: Confidence score for AI-generated content (0.0-1.0)
range: float

speaker_id:
description: Identifier for speaker during this segment
range: string

speaker_label:
description: Human-readable label for speaker
range: string
469
schemas/20251121/linkml/modules/classes/VideoTranscript.yaml
Normal file

@ -0,0 +1,469 @@
# Video Transcript Class
# Full text transcription of video audio content
#
# Part of Heritage Custodian Ontology v0.9.5
#
# HIERARCHY:
# E73_Information_Object (CIDOC-CRM)
# │
# └── VideoTextContent (abstract base - provenance)
#     │
#     └── VideoTranscript (this class)
#         │
#         └── VideoSubtitle (time-coded extension)
#
# DESIGN RATIONALE:
# VideoTranscript represents the complete textual representation of spoken
# content in a video. It extends VideoTextContent to inherit comprehensive
# provenance tracking and adds transcript-specific slots:
#
# - full_text: Complete transcript as single text block
# - transcript_format: How the text is structured (plain, paragraphed, etc.)
# - segments: Optional structured breakdown into VideoTimeSegments
# - includes_timestamps/speakers: Metadata about content structure
#
# VideoSubtitle extends this because subtitles ARE transcripts plus time-codes.

id: https://nde.nl/ontology/hc/class/VideoTranscript
name: video_transcript_class
title: Video Transcript Class

imports:
- linkml:types
- ./VideoTextContent
- ./VideoTimeSegment

prefixes:
linkml: https://w3id.org/linkml/
hc: https://nde.nl/ontology/hc/
schema: http://schema.org/
dcterms: http://purl.org/dc/terms/
prov: http://www.w3.org/ns/prov#
crm: http://www.cidoc-crm.org/cidoc-crm/
skos: http://www.w3.org/2004/02/skos/core#

default_prefix: hc

classes:

VideoTranscript:
is_a: VideoTextContent
class_uri: crm:E33_Linguistic_Object
abstract: false
description: |
Full text transcription of video audio content.

**DEFINITION**:

A VideoTranscript is the complete textual representation of all spoken
content in a video. It extends VideoTextContent with transcript-specific
properties and inherits all provenance tracking capabilities.

**RELATIONSHIP TO VideoSubtitle**:

VideoSubtitle is a subclass of VideoTranscript because:
1. A subtitle file contains everything a transcript needs PLUS time codes
2. You can derive a plain transcript from subtitles by stripping times
3. This inheritance allows polymorphic handling of text content

```
VideoTranscript            VideoSubtitle (is_a VideoTranscript)
├── full_text              ├── full_text (inherited)
├── segments[]             ├── segments[] (required, with times)
└── (optional times)       └── subtitle_format (SRT, VTT, etc.)
```

**SCHEMA.ORG ALIGNMENT**:

Maps to `schema:transcript` property:
> "If this MediaObject is an AudioObject or VideoObject,
> the transcript of that object."

**CIDOC-CRM E33_Linguistic_Object**:

E33 is the class comprising:
> "identifiable expressions in natural language or code"

A transcript is a linguistic object derived from the audio track of
a video (which is itself an E73_Information_Object).

**TRANSCRIPT FORMATS**:

| Format | Description | Use Case |
|--------|-------------|----------|
| PLAIN_TEXT | Continuous text, no structure | Simple search indexing |
| PARAGRAPHED | Text broken into paragraphs | Human reading |
| STRUCTURED | Segments with speaker labels | Research, analysis |
| TIMESTAMPED | Segments with time markers | Navigation, subtitling |

**GENERATION METHODS** (inherited from VideoTextContent):

| Method | Typical Use | Quality |
|--------|-------------|---------|
| ASR_AUTOMATIC | Whisper, Google STT | 0.80-0.95 |
| MANUAL_TRANSCRIPTION | Human transcriber | 0.98-1.0 |
| PLATFORM_PROVIDED | YouTube auto-captions | 0.75-0.90 |
| HYBRID | ASR + human correction | 0.95-1.0 |

**HERITAGE INSTITUTION CONTEXT**:

Transcripts are critical for heritage video collections:

1. **Discovery**: Full-text search over video content
2. **Accessibility**: Deaf/HoH access to spoken content
3. **Preservation**: Text outlasts video format obsolescence
4. **Research**: Corpus analysis, keyword extraction
5. **Translation**: Base for multilingual access
6. **SEO**: Search engine indexing of video content

**STRUCTURED SEGMENTS**:

When `segments` is populated, the transcript has structural breakdown:

```yaml
segments:
- segment_index: 0
  start_seconds: 0.0
  end_seconds: 5.5
  segment_text: "Welcome to the Rijksmuseum."
  speaker_label: "Narrator"
  confidence: 0.94
- segment_index: 1
  start_seconds: 5.5
  end_seconds: 12.3
  segment_text: "Today we'll explore the Night Watch gallery."
  speaker_label: "Narrator"
  confidence: 0.91
```
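Point 2 of the VideoSubtitle relationship above (deriving a plain transcript by stripping time codes) can be sketched as follows; the helper name is illustrative, not part of the schema:

```python
def derive_full_text(segments: list[dict]) -> str:
    """Join segment_text values in segment order, dropping the time codes."""
    ordered = sorted(segments, key=lambda s: (s.get("segment_index", 0), s["start_seconds"]))
    return " ".join(s["segment_text"] for s in ordered)

segments = [
    {"segment_index": 1, "start_seconds": 5.5, "end_seconds": 12.3,
     "segment_text": "Today we'll explore the Night Watch gallery."},
    {"segment_index": 0, "start_seconds": 0.0, "end_seconds": 5.5,
     "segment_text": "Welcome to the Rijksmuseum."},
]
print(derive_full_text(segments))
# Welcome to the Rijksmuseum. Today we'll explore the Night Watch gallery.
```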
|
||||
|
||||
**PROVENANCE** (inherited from VideoTextContent):
|
||||
|
||||
All transcripts include:
|
||||
- `source_video`: Which video was transcribed
|
||||
- `generated_by`: Tool/person that created transcript
|
||||
- `generation_method`: ASR_AUTOMATIC, MANUAL_TRANSCRIPTION, etc.
|
||||
- `generation_timestamp`: When transcript was created
|
||||
- `overall_confidence`: Aggregate quality score
|
||||
- `is_verified`: Whether human-reviewed
|
||||
|
||||
exact_mappings:
|
||||
- crm:E33_Linguistic_Object
|
||||
|
||||
close_mappings:
|
||||
- schema:transcript
|
||||
|
||||
related_mappings:
|
||||
- dcterms:Text
|
||||
|
||||
slots:
|
||||
# Core content
|
||||
- full_text
|
||||
- transcript_format
|
||||
|
||||
# Structural information
|
||||
- includes_timestamps
|
||||
- includes_speakers
|
||||
- segments
|
||||
|
||||
# Speaker metadata
|
||||
- speaker_count
|
||||
- primary_speaker
|
||||
|
||||
# Additional metadata
|
||||
- source_language_auto_detected
|
||||
- paragraph_count
|
||||
- sentence_count
|
||||
|
||||
slot_usage:
|
||||
full_text:
|
||||
slot_uri: schema:text
|
||||
description: |
|
||||
Complete transcript text as a single string.
|
||||
|
||||
Schema.org: text for primary textual content.
|
||||
|
||||
Contains all spoken content from the video, concatenated.
|
||||
May include:
|
||||
- Speaker labels (if includes_speakers = true)
|
||||
- Timestamps (if includes_timestamps = true)
|
||||
- Paragraph breaks (if format = PARAGRAPHED or STRUCTURED)
|
||||
|
||||
For structured access, use the `segments` slot instead.
|
||||
range: string
|
||||
required: true
|
||||
examples:
|
||||
- value: |
|
||||
Welcome to the Rijksmuseum. Today we'll explore the masterpieces
|
||||
of Dutch Golden Age painting. Our first stop is the Night Watch
|
||||
by Rembrandt van Rijn, painted in 1642.
|
||||
description: "Plain text transcript excerpt"
|
||||
- value: |
|
||||
[Narrator] Welcome to the Rijksmuseum.
|
||||
[Narrator] Today we'll explore the masterpieces of Dutch Golden Age painting.
|
||||
[Curator] Our first stop is the Night Watch by Rembrandt van Rijn.
|
||||
description: "Transcript with speaker labels"
|
||||
|
||||
transcript_format:
|
||||
slot_uri: dcterms:format
|
||||
description: |
|
||||
Format/structure of the transcript text.
|
||||
|
||||
Dublin Core: format for resource format.
|
||||
|
||||
Indicates how the full_text is structured:
|
||||
- PLAIN_TEXT: Continuous text without breaks
|
||||
- PARAGRAPHED: Broken into paragraphs
|
||||
- STRUCTURED: Includes speaker labels, times, or both
|
||||
- TIMESTAMPED: Includes inline time markers
|
||||
range: TranscriptFormatEnum
|
||||
required: false
|
||||
ifabsent: "string(PLAIN_TEXT)"
|
||||
examples:
|
||||
- value: "STRUCTURED"
|
||||
description: "Text with speaker labels and paragraph breaks"
|
||||
|
||||
includes_timestamps:
|
||||
slot_uri: hc:includesTimestamps
|
||||
description: |
|
||||
Whether the transcript includes time markers.
|
||||
|
||||
- **true**: Timestamps are embedded in full_text or segments have times
|
||||
- **false**: No temporal information (default)
|
||||
|
||||
If true, prefer using `segments` for programmatic access.
|
||||
range: boolean
|
||||
required: false
|
||||
ifabsent: "false"
|
||||
examples:
|
||||
- value: true
|
||||
description: "Transcript has time codes"
|
||||
|
||||
includes_speakers:
|
||||
slot_uri: hc:includesSpeakers
|
||||
description: |
|
||||
Whether the transcript includes speaker identification.
|
||||
|
||||
- **true**: Speaker labels/diarization available
|
||||
- **false**: Single speaker or no identification (default)
|
||||
|
||||
When true, check `speaker_count` for number of distinct speakers.
|
||||
range: boolean
|
||||
required: false
|
||||
ifabsent: "false"
|
||||
examples:
|
||||
- value: true
|
||||
description: "Multi-speaker transcript with diarization"
|
||||
|
||||
segments:
|
||||
slot_uri: hc:transcriptSegments
|
||||
description: |
|
||||
Structured breakdown of transcript into time-coded segments.
|
||||
|
||||
Optional for VideoTranscript (plain transcripts may not have times).
|
||||
Required for VideoSubtitle (subtitles must have time codes).
|
||||
|
||||
Each segment is a VideoTimeSegment with:
|
||||
- start_seconds / end_seconds: Time boundaries
|
||||
- segment_text: Text for this segment
|
||||
- confidence: Per-segment accuracy score
|
||||
- speaker_id / speaker_label: Speaker identification
|
||||
|
||||
Use segments for:
|
||||
- Video player synchronization
|
||||
- Jump-to-time navigation
|
||||
- Per-segment quality analysis
|
||||
- Speaker-separated views
|
||||
range: VideoTimeSegment
|
||||
required: false
|
||||
multivalued: true
|
||||
inlined: true
|
||||
inlined_as_list: true
|
||||
examples:
|
||||
- value: |
|
||||
- segment_index: 0
|
||||
start_seconds: 0.0
|
||||
end_seconds: 3.5
|
||||
segment_text: "Welcome to the museum."
|
||||
confidence: 0.95
|
||||
description: "Single structured segment"
|
||||
|
||||
speaker_count:
|
||||
slot_uri: hc:speakerCount
|
||||
description: |
|
||||
Number of distinct speakers identified in the transcript.
|
||||
|
||||
Only meaningful when includes_speakers = true.
|
||||
|
||||
0 = Unknown/not analyzed
|
||||
1 = Single speaker (monologue)
|
||||
2+ = Multi-speaker (dialogue, panel, interview)
|
||||
range: integer
|
||||
required: false
|
||||
minimum_value: 0
|
||||
examples:
|
||||
- value: 3
|
||||
description: "Three speakers identified"
|
||||
|
||||
primary_speaker:
|
||||
slot_uri: hc:primarySpeaker
|
||||
description: |
|
||||
Identifier or name of the main/dominant speaker.
|
||||
|
||||
For interviews: the interviewee (not interviewer)
|
||||
          For presentations: the presenter
          For tours: the guide

          May be generic ("Narrator") or specific ("Dr. Taco Dibbits").
        range: string
        required: false
        examples:
          - value: "Narrator"
            description: "Generic primary speaker"
          - value: "Dr. Taco Dibbits, Museum Director"
            description: "Named primary speaker"

      source_language_auto_detected:
        slot_uri: hc:sourceLanguageAutoDetected
        description: |
          Whether the content_language was auto-detected by ASR.

          - **true**: Language detected by ASR model
          - **false**: Language was specified/known (default)

          Useful for quality assessment - auto-detection may be wrong.
        range: boolean
        required: false
        ifabsent: "false"
        examples:
          - value: true
            description: "Language was auto-detected"

      paragraph_count:
        slot_uri: hc:paragraphCount
        description: |
          Number of paragraphs in the transcript.

          Only meaningful when transcript_format = PARAGRAPHED or STRUCTURED.

          Useful for content sizing and readability assessment.
        range: integer
        required: false
        minimum_value: 0
        examples:
          - value: 15
            description: "Transcript has 15 paragraphs"

      sentence_count:
        slot_uri: hc:sentenceCount
        description: |
          Approximate number of sentences in the transcript.

          Derived from punctuation analysis or NLP sentence segmentation.

          Useful for content analysis and readability metrics.
        range: integer
        required: false
        minimum_value: 0
        examples:
          - value: 47
            description: "Transcript has ~47 sentences"

    comments:
      - "Full text transcription of video audio content"
      - "Extends VideoTextContent with transcript-specific properties"
      - "Base class for VideoSubtitle (subtitles are transcripts + time codes)"
      - "Supports both plain text and structured segment-based transcripts"
      - "Critical for accessibility, discovery, and preservation"

    see_also:
      - "https://schema.org/transcript"
      - "http://www.cidoc-crm.org/cidoc-crm/E33_Linguistic_Object"

# ============================================================================
# Enumerations
# ============================================================================

enums:

  TranscriptFormatEnum:
    description: |
      Format/structure of transcript text content.

      Indicates how the full_text is organized.
    permissible_values:
      PLAIN_TEXT:
        description: |
          Continuous text without structural markers.
          No speaker labels, no timestamps, no paragraph breaks.
          Suitable for simple full-text search indexing.
      PARAGRAPHED:
        description: |
          Text broken into paragraphs.
          May be based on topic changes, speaker pauses, or semantic units.
          Improves human readability.
      STRUCTURED:
        description: |
          Text with speaker labels and/or section markers.
          Format: "[Speaker] Text content" or similar.
          Enables speaker-specific analysis.
      TIMESTAMPED:
        description: |
          Text with inline time markers.
          Format: "[00:30] Text content" or similar.
          Enables temporal navigation in text view.
      VERBATIM:
        description: |
          Exact transcription including fillers, false starts, overlaps.
          "[um]", "[pause]", "[crosstalk]" markers.
          Used for linguistic analysis or legal transcripts.
      CLEAN:
        description: |
          Edited for readability - fillers removed, grammar corrected.
          May diverge slightly from literal spoken content.
          Suitable for publication or accessibility.
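# Illustrative full_text samples for selected formats. These strings are
# invented for documentation only (kept as comments so the schema stays
# valid LinkML); they are not normative schema content:
#
#   STRUCTURED:   "[Curator] Welcome to the gallery. [Visitor] Thank you."
#   TIMESTAMPED:  "[00:30] The Night Watch was completed in 1642."
#   VERBATIM:     "So, [um] this painting is [pause] quite remarkable."
#   CLEAN:        "This painting is quite remarkable."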

# ============================================================================
# Slot Definitions
# ============================================================================

slots:
  full_text:
    description: Complete transcript text as single string
    range: string

  transcript_format:
    description: Format/structure of transcript text
    range: TranscriptFormatEnum

  includes_timestamps:
    description: Whether transcript includes time markers
    range: boolean

  includes_speakers:
    description: Whether transcript includes speaker identification
    range: boolean

  segments:
    description: Structured breakdown into time-coded segments
    range: VideoTimeSegment
    multivalued: true

  speaker_count:
    description: Number of distinct speakers identified
    range: integer

  primary_speaker:
    description: Identifier/name of main speaker
    range: string

  source_language_auto_detected:
    description: Whether language was auto-detected by ASR
    range: boolean

  paragraph_count:
    description: Number of paragraphs in transcript
    range: integer

  sentence_count:
    description: Number of sentences in transcript
    range: integer
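# Minimal instance sketch combining the slots above (hypothetical data,
# shown as a comment so the schema file remains valid LinkML; values echo
# the examples given in the attribute definitions):
#
#   transcript:
#     transcript_format: STRUCTURED
#     includes_timestamps: false
#     includes_speakers: true
#     speaker_count: 2
#     primary_speaker: "Narrator"
#     source_language_auto_detected: false
#     paragraph_count: 15
#     sentence_count: 47
#     full_text: "[Narrator] Welcome to the Rijksmuseum. ..."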