# glam/schemas/20251121/linkml/modules/classes/VideoAnnotation.yaml
# kempersc 4319f38c05 Add archived slots for audience size, audience type, and capacity metrics
# - Created new YAML files for audience size and audience type slots, defining their properties and annotations.
# - Added archived capacity slots including cubic meters, linear meters, item count, and descriptions, with appropriate URIs and ranges.
# - Introduced a template specificity slot for context-aware RAG filtering.
# - Consolidated capacity-related slots into a unified structure, including has_or_had_capacity, capacity_type, and capacity_value, with detailed descriptions and examples.
# 2026-01-17 18:53:23 +01:00

id: https://nde.nl/ontology/hc/class/VideoAnnotation
name: video_annotation_class
title: Video Annotation Class
imports:
- linkml:types
- ./VideoTextContent
- ./VideoTimeSegment
- ./AnnotationMotivationType
- ./AnnotationMotivationTypes
- ../slots/has_annotation_motivation
- ../slots/has_annotation_segment
- ../slots/has_annotation_type
- ../slots/detection_count
- ../slots/detection_threshold
- ../slots/frame_sample_rate
- ../slots/includes_bounding_box
- ../slots/includes_segmentation_mask
- ../slots/keyframe_extraction
- ../slots/model_architecture
- ../slots/model_task
- ../slots/specificity_annotation
- ../slots/has_or_had_score # was: template_specificity - migrated per Rule 53 (2026-01-17)
- ../slots/analyzes_or_analyzed
- ./SpecificityAnnotation
- ./TemplateSpecificityScore # was: TemplateSpecificityScores - migrated per Rule 53 (2026-01-17)
- ./TemplateSpecificityType
- ./TemplateSpecificityTypes
- ../enums/AnnotationTypeEnum
prefixes:
linkml: https://w3id.org/linkml/
hc: https://nde.nl/ontology/hc/
schema: http://schema.org/
dcterms: http://purl.org/dc/terms/
prov: http://www.w3.org/ns/prov#
crm: http://www.cidoc-crm.org/cidoc-crm/
oa: http://www.w3.org/ns/oa#
as: https://www.w3.org/ns/activitystreams#
default_prefix: hc
classes:
VideoAnnotation:
is_a: VideoTextContent
class_uri: oa:Annotation
abstract: true
description: "Abstract base class for computer vision and multimodal video annotations.\n\n**DEFINITION**:\n\nVideoAnnotation\
\ represents structured information derived from visual\nanalysis of video content. This includes:\n\n| Subclass | Analysis\
\ Type | Output |\n|----------|---------------|--------|\n| VideoSceneAnnotation | Shot/scene detection | Scene boundaries,\
\ types |\n| VideoObjectAnnotation | Object detection | Objects, faces, logos |\n| VideoOCRAnnotation | Text extraction\
\ | On-screen text (OCR) |\n\n**RELATIONSHIP TO W3C WEB ANNOTATION**:\n\nVideoAnnotation aligns with the W3C Web Annotation\
\ Data Model:\n\n```turtle\n:annotation a oa:Annotation ;\n oa:hasBody :detection_result ;\n oa:hasTarget [\n\
\ oa:hasSource :video ;\n oa:hasSelector [\n a oa:FragmentSelector ;\n dcterms:conformsTo\
\ <http://www.w3.org/TR/media-frags/> ;\n rdf:value \"t=30,35\"\n ]\n ] ;\n oa:motivatedBy oa:classifying\
\ .\n```\n\n**FRAME-BASED ANALYSIS**:\n\nUnlike audio transcription (continuous stream), video annotation is\ntypically\
\ frame-based:\n\n- `frame_sample_rate`: Frames analyzed per second (e.g., 1 fps, 5 fps)\n- `analyzes_or_analyzed`:\
\ Total frames processed\n- Higher sample rates = more detections but higher compute cost\n\n**DETECTION THRESHOLDS**:\n\
\nCV models output confidence scores. Thresholds filter noise:\n\n| Threshold | Use Case |\n|-----------|----------|\n\
| 0.9+ | High precision, production display |\n| 0.7-0.9 | Balanced, general use |\n| 0.5-0.7 | High recall, research/review\
\ |\n| < 0.5 | Raw output, needs filtering |\n\n**MODEL ARCHITECTURE TRACKING**:\n\nDifferent model architectures have\
\ different characteristics:\n\n| Architecture | Examples | Strengths |\n|--------------|----------|-----------|\n|\
\ CNN | ResNet, VGG | Fast inference, good for objects |\n| Transformer | ViT, CLIP | Better context, multimodal |\n\
| Hybrid | DETR, Swin | Balance of speed and accuracy |\n\n**HERITAGE INSTITUTION CONTEXT**:\n\nVideo annotations enable:\n\
- **Discovery**: Find videos containing specific objects/artworks\n- **Accessibility**: Scene descriptions for visually\
\ impaired\n- **Research**: Analyze visual content at scale\n- **Preservation**: Document visual content as text\n-\
\ **Linking**: Connect detected artworks to collection records\n\n**CIDOC-CRM E13_Attribute_Assignment**:\n\nAnnotations\
\ are attribute assignments - asserting properties about\nvideo segments. The CV model or human annotator is the assigning\
\ agent.\n"
exact_mappings:
- oa:Annotation
close_mappings:
- crm:E13_Attribute_Assignment
related_mappings:
- as:Activity
- schema:ClaimReview
slots:
- has_annotation_motivation
- has_annotation_segment
- has_annotation_type
- detection_count
- detection_threshold
- frame_sample_rate
- includes_bounding_box
- includes_segmentation_mask
- keyframe_extraction
- model_architecture
- model_task
- specificity_annotation
- has_or_had_score # was: template_specificity - migrated per Rule 53 (2026-01-17)
- analyzes_or_analyzed
slot_usage:
has_annotation_type:
range: AnnotationTypeEnum
required: true
examples:
- value: OBJECT_DETECTION
description: Object and face detection annotation
has_annotation_segment:
range: VideoTimeSegment
multivalued: true
required: false
inlined_as_list: true
examples:
- value: '[{start_seconds: 30.0, end_seconds: 35.0, segment_text: ''Night Watch painting visible''}]'
description: Object detection segment
detection_threshold:
range: float
required: false
minimum_value: 0.0
maximum_value: 1.0
examples:
- value: 0.5
description: Standard detection threshold
detection_count:
range: integer
required: false
minimum_value: 0
examples:
- value: 342
description: 342 total detections found
frame_sample_rate:
range: float
required: false
minimum_value: 0.0
examples:
- value: 1.0
description: Analyzed 1 frame per second
analyzes_or_analyzed:
description: Total frames analyzed during video annotation processing.
range: integer
required: false
minimum_value: 0
examples:
- value: 1800
description: Analyzed 1,800 frames (30 min video at 1 fps)
keyframe_extraction:
range: boolean
required: false
examples:
- value: true
description: Used keyframe extraction
model_architecture:
range: string
required: false
examples:
- value: Transformer
description: Vision Transformer architecture
- value: CNN
description: Convolutional Neural Network
model_task:
range: string
required: false
examples:
- value: detection
description: Object detection task
- value: captioning
description: Video captioning task
includes_bounding_box:
range: boolean
required: false
examples:
- value: true
description: Includes bounding box coordinates
includes_segmentation_mask:
range: boolean
required: false
examples:
- value: false
description: No segmentation masks included
has_annotation_motivation:
range: AnnotationMotivationType
required: false
examples:
- value: ClassifyingMotivation
description: Annotation for classification purposes
comments:
- Abstract base for all CV/multimodal video annotations
- Extends VideoTextContent with frame-based analysis parameters
- W3C Web Annotation compatible structure
- Supports both temporal and spatial annotation
- Tracks detection thresholds and model architecture
see_also:
- https://www.w3.org/TR/annotation-model/
- http://www.cidoc-crm.org/cidoc-crm/E13_Attribute_Assignment
- https://iiif.io/api/presentation/3.0/
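# Illustrative (non-normative) sketch of instance data for a concrete subclass
# such as VideoObjectAnnotation, assembled from the example values declared in
# slot_usage above. The specific values are assumptions for illustration only:
#
#   has_annotation_type: OBJECT_DETECTION
#   has_annotation_motivation: ClassifyingMotivation
#   detection_threshold: 0.5
#   detection_count: 342
#   frame_sample_rate: 1.0
#   analyzes_or_analyzed: 1800
#   keyframe_extraction: true
#   includes_bounding_box: true
#   includes_segmentation_mask: false
#   model_architecture: Transformer
#   model_task: detection
#   has_annotation_segment:
#     - start_seconds: 30.0
#       end_seconds: 35.0
#       segment_text: "Night Watch painting visible"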