id: https://nde.nl/ontology/hc/class/VideoAnnotation
name: video_annotation_class
title: Video Annotation Class
imports:
- linkml:types
- ../enums/AnnotationTypeEnum
- ../slots/analyzes_or_analyzed
- ../slots/contains_or_contained
- ../slots/filters_or_filtered
- ../slots/has_or_had_quantity
- ../slots/has_or_had_rationale
- ../slots/has_or_had_score
- ../slots/has_or_had_treshold
- ../slots/has_or_had_type
- ../slots/has_or_had_unit
- ../slots/includes_bounding_box
- ../slots/includes_segmentation_mask
- ../slots/keyframe_extraction
- ../slots/model_architecture
- ../slots/model_task
- ../slots/specificity_annotation
- ./AnnotationMotivationType
- ./AnnotationMotivationTypes
- ./DetectedEntity
- ./DetectionThreshold
- ./Quantity
- ./SpecificityAnnotation
- ./TemplateSpecificityScore
- ./TemplateSpecificityType
- ./TemplateSpecificityTypes
- ./Unit
- ./VideoFrame
- ./VideoTextContent
- ./VideoTimeSegment
- ./Segment
- ./AnnotationType
- ./Rationale
prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  schema: http://schema.org/
  dcterms: http://purl.org/dc/terms/
  prov: http://www.w3.org/ns/prov#
  crm: http://www.cidoc-crm.org/cidoc-crm/
  oa: http://www.w3.org/ns/oa#
  as: https://www.w3.org/ns/activitystreams#
default_prefix: hc
classes:
  VideoAnnotation:
    is_a: VideoTextContent
    class_uri: oa:Annotation
    abstract: true
    description: |
      Abstract base class for computer vision and multimodal video annotations.

      **DEFINITION**:

      VideoAnnotation represents structured information derived from visual
      analysis of video content. This includes:

      | Subclass | Analysis Type | Output |
      |----------|---------------|--------|
      | VideoSceneAnnotation | Shot/scene detection | Scene boundaries, types |
      | VideoObjectAnnotation | Object detection | Objects, faces, logos |
      | VideoOCRAnnotation | Text extraction | On-screen text (OCR) |

      **RELATIONSHIP TO W3C WEB ANNOTATION**:

      VideoAnnotation aligns with the W3C Web Annotation Data Model:

      ```turtle
      :annotation a oa:Annotation ;
          oa:hasBody :detection_result ;
          oa:hasTarget [
              oa:hasSource :video ;
              oa:hasSelector [
                  a oa:FragmentSelector ;
                  dcterms:conformsTo <http://www.w3.org/TR/media-frags/> ;
                  rdf:value "t=30,35"
              ]
          ] ;
          oa:motivatedBy oa:classifying .
      ```

      **FRAME-BASED ANALYSIS**:

      Unlike audio transcription (continuous stream), video annotation is
      typically frame-based:

      - `frame_sample_rate`: Frames analyzed per second (e.g., 1 fps, 5 fps)
      - `analyzes_or_analyzed`: Total frames processed
      - Higher sample rates yield more detections at higher compute cost

      **DETECTION THRESHOLDS**:

      CV models output confidence scores. Thresholds filter noise:

      | Threshold | Use Case |
      |-----------|----------|
      | 0.9+ | High precision, production display |
      | 0.7-0.9 | Balanced, general use |
      | 0.5-0.7 | High recall, research/review |
      | < 0.5 | Raw output, needs filtering |

      **MODEL ARCHITECTURE TRACKING**:

      Different model architectures have different characteristics:

      | Architecture | Examples | Strengths |
      |--------------|----------|-----------|
      | CNN | ResNet, VGG | Fast inference, good for objects |
      | Transformer | ViT, CLIP | Better context, multimodal |
      | Hybrid | DETR, Swin | Balance of speed and accuracy |

      **HERITAGE INSTITUTION CONTEXT**:

      Video annotations enable:
      - **Discovery**: Find videos containing specific objects/artworks
      - **Accessibility**: Scene descriptions for visually impaired users
      - **Research**: Analyze visual content at scale
      - **Preservation**: Document visual content as text
      - **Linking**: Connect detected artworks to collection records

      **CIDOC-CRM E13_Attribute_Assignment**:

      Annotations are attribute assignments: they assert properties about
      video segments. The CV model or human annotator is the assigning agent.
    exact_mappings:
    - oa:Annotation
    close_mappings:
    - crm:E13_Attribute_Assignment
    related_mappings:
    - as:Activity
    - schema:ClaimReview
    slots:
    - has_or_had_rationale
    - contains_or_contained
    - has_or_had_type
    - filters_or_filtered
    - includes_bounding_box
    - includes_segmentation_mask
    - keyframe_extraction
    - model_architecture
    - model_task
    - specificity_annotation
    - has_or_had_score
    - analyzes_or_analyzed
    slot_usage:
      has_or_had_type:
        range: uriorcurie
        required: true
        examples:
        - value:
            has_or_had_code: OBJECT_DETECTION
            has_or_had_label: Object Detection
      contains_or_contained:
        range: string
        multivalued: true
        required: false
        inlined_as_list: true
        examples:
        - value:
            has_or_had_label: Night Watch painting visible
            has_or_had_description: 30.0 - 35.0 seconds
      has_or_had_rationale:
        range: string
        required: false
        examples:
        - value:
            has_or_had_label: ClassifyingMotivation
      filters_or_filtered:
        description: |
          MIGRATED 2026-01-25: Replaces detection_count and detection_threshold slots.

          Links to DetectedEntity which contains:
          - has_or_had_quantity → Quantity (for detection_count)
          - has_or_had_treshold → DetectionThreshold (for detection_threshold)

          **Migration Pattern**:
          - Old: detection_count: 342, detection_threshold: 0.5
          - New: filters_or_filtered → DetectedEntity with structured data
        range: DetectedEntity
        inlined: true
        required: false
        examples:
        - value:
            has_or_had_quantity:
            has_or_had_unit:
            has_or_had_treshold:
        - value:
            has_or_had_quantity:
            has_or_had_unit:
            has_or_had_treshold:
            has_or_had_label: High Precision
      analyzes_or_analyzed:
        description: |
          MIGRATED 2026-01-22: Now supports VideoFrame class for frame_sample_rate migration.

          Frame analysis information including:
          - Total frames analyzed (integer, legacy pattern)
          - Frame sample rate and analysis parameters (VideoFrame class)

          MIGRATED SLOTS:
          - frame_sample_rate → VideoFrame.has_or_had_quantity with unit "samples per second"
        range: VideoFrame
        inlined: true
        required: false
        examples:
        - value:
            has_or_had_quantity:
            has_or_had_unit:
        - value:
            has_or_had_quantity:
            has_or_had_unit:
      keyframe_extraction:
        range: boolean
        required: false
        examples:
        - value: true
      model_architecture:
        range: string
        required: false
        examples:
        - value: Transformer
        - value: CNN
      model_task:
        range: string
        required: false
        examples:
        - value: detection
        - value: captioning
      includes_bounding_box:
        range: boolean
        required: false
        examples:
        - value: true
      includes_segmentation_mask:
        range: boolean
        required: false
        examples:
        - value: false
    comments:
    - Abstract base for all CV/multimodal video annotations
    - Extends VideoTextContent with frame-based analysis parameters
    - W3C Web Annotation compatible structure
    - Supports both temporal and spatial annotation
    - Tracks detection thresholds and model architecture
    see_also:
    - https://www.w3.org/TR/annotation-model/
    - http://www.cidoc-crm.org/cidoc-crm/E13_Attribute_Assignment
    - https://iiif.io/api/presentation/3.0/
    annotations:
      specificity_score: 0.1
      specificity_rationale: Generic utility class/slot created during migration
      custodian_types: "['*']"
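# Illustrative instance sketch, not part of the schema. It shows roughly how
# the migration pattern described under filters_or_filtered and
# analyzes_or_analyzed might look on a concrete subclass record.
# Assumptions: the record layout and the scalar placement of 342, 0.5 and 5
# are hypothetical. The actual DetectedEntity, Quantity, DetectionThreshold
# and VideoFrame classes may nest these values under their own slots.
#
# model_architecture: Transformer
# model_task: detection
# includes_bounding_box: true
# includes_segmentation_mask: false
# filters_or_filtered:
#   has_or_had_quantity: 342        # previously detection_count: 342
#   has_or_had_treshold: 0.5        # previously detection_threshold: 0.5
# analyzes_or_analyzed:
#   has_or_had_quantity: 5          # previously frame_sample_rate (e.g., 5 fps)
#   has_or_had_unit: samples per second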