glam/schemas/20251121/linkml/modules/classes/SocialMediaPost.yaml
2025-12-17 10:11:56 +01:00

561 lines
19 KiB
YAML
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Social Media Post Class
# Concrete class for social media posts with multivalued post type support
#
# CONCEPTUAL HIERARCHY:
#
# SocialMediaPlatformType → SocialMediaProfile → SocialMediaPost
# (Platform) (Account/Channel) (Individual Post)
# ↓
# post_types (multivalued)
# ↓
# SocialMediaPostType[]
#
# This class represents individual content items posted on social media platforms.
# A single post can have MULTIPLE post types (e.g., a carousel with images AND videos).
#
# DESIGN RATIONALE:
# Social media content often combines multiple formats:
# - A carousel may contain both images and videos
# - A thread consists of multiple text posts with attached media
# - A live stream becomes a standard video when archived
# - A podcast episode may also be published as a video
#
# The multivalued post_types slot captures this reality.
id: https://nde.nl/ontology/hc/class/SocialMediaPost
name: social_media_post_class
title: Social Media Post Class
imports:
- linkml:types
- ./SocialMediaProfile
- ./SocialMediaPlatformType
- ./SocialMediaPostType
- ./TimeSpan
- ./WebObservation
- ../slots/language
- ../slots/api_endpoint
- ../slots/description
- ../slots/platform_type
prefixes:
linkml: https://w3id.org/linkml/
hc: https://nde.nl/ontology/hc/
schema: http://schema.org/
foaf: http://xmlns.com/foaf/0.1/
dcterms: http://purl.org/dc/terms/
prov: http://www.w3.org/ns/prov#
crm: http://www.cidoc-crm.org/cidoc-crm/
skos: http://www.w3.org/2004/02/skos/core#
as: https://www.w3.org/ns/activitystreams#
default_prefix: hc
classes:
SocialMediaPost:
class_uri: as:Object
abstract: false
description: |
Concrete class for social media posts/content items.
**DEFINITION**:
SocialMediaPost represents a discrete piece of content published on a social media
platform. This includes videos, images, text posts, stories, carousels, threads,
and other content types. Each post is published by a SocialMediaProfile (account/channel).
**CRITICAL: MULTIVALUED POST TYPES**:
A single SocialMediaPost can have **multiple** post types via the `post_types` slot.
This reflects the reality that social media content often combines formats:
| Post Example | Primary Type | Secondary Types |
|--------------|--------------|-----------------|
| Instagram carousel with videos | CarouselPostType | ImagePostType, VideoPostType |
| Twitter thread with images | ThreadPostType | TextPostType, ImagePostType |
| YouTube Live archived as video | LiveStreamPostType | VideoPostType |
| Podcast with video recording | AudioPostType | VideoPostType |
| Story highlight (saved) | StoryPostType | ImagePostType, VideoPostType |
**TYPE ORDERING CONVENTION**:
- First type in list = primary/dominant format
- Subsequent types = secondary characteristics
- Order matters for display and categorization
**CRITICAL: POST vs PROFILE vs PLATFORM**:
| Class | Represents | Example | Cardinality |
|-------|------------|---------|-------------|
| SocialMediaPlatformType | Platform category | YouTube | ~25 types |
| SocialMediaProfile | Account/channel | @rijksmuseum | Thousands |
| **SocialMediaPost** | Individual content | A video, post | Millions |
**ACTIVITY STREAMS 2.0 ALIGNMENT**:
Maps to `as:Object` from W3C Activity Streams 2.0 vocabulary:
- as:Object is the base type for content in social media
- Individual posts may also map to as:Video, as:Image, as:Note based on post_types
- Enables federation with ActivityPub-based platforms (Mastodon, PeerTube)
**HERITAGE INSTITUTION CONTEXT**:
Social media content from heritage institutions includes:
1. **Official content** (posted by the institution):
- Collection highlights (ImagePostType)
- Exhibition announcements (TextPostType)
- Educational videos (VideoPostType)
- Behind-the-scenes content (ShortVideoPostType, StoryPostType)
- Event recordings (LiveStreamPostType → VideoPostType)
- Curator interviews (VideoPostType, AudioPostType)
- Multi-image collection stories (CarouselPostType)
2. **Third-party content** (about the institution):
- Visitor videos/photos
- News coverage
- Academic discussions
- Review content
3. **User-generated content** (mentions):
- Tagged posts
- Check-ins
- Comments/discussions
**PROVENANCE TRACKING**:
Content items are OBSERVATIONAL data retrieved via:
- Platform APIs (YouTube Data API, Twitter API, etc.)
- Web scraping (when API unavailable)
- Manual archival
Each content record includes:
- `retrieval_timestamp`: When content was fetched
- `api_endpoint`: Which API/method was used
- `metrics_observed_date`: When engagement metrics were recorded
**TEMPORAL CONSIDERATIONS**:
Content has multiple temporal dimensions:
- `published_at`: When originally posted
- `updated_at`: Last modification by author
- `retrieval_timestamp`: When we fetched it
- `metrics_observed_date`: When metrics (views, likes) were recorded
Engagement metrics change constantly; always record observation timestamp.
**SUBCLASSES FOR SPECIALIZED CONTENT**:
While SocialMediaPost can represent any content with post_types,
specialized subclasses provide additional platform-specific slots:
- **VideoPost**: YouTube, TikTok, Vimeo videos (duration, definition, captions)
- **ImagePost**: Instagram, Pinterest posts (dimensions, alt_text) [future]
- **TextPost**: Twitter/X, Mastodon posts (character_count) [future]
- **StoryPost**: Instagram/Facebook stories (ephemeral, segments) [future]
exact_mappings:
- as:Object
close_mappings:
- schema:CreativeWork
- crm:E73_Information_Object
related_mappings:
- schema:SocialMediaPosting
- dcterms:BibliographicResource
slots:
# Identification
- post_id
- post_url
# Type classification (MULTIVALUED!)
- post_types
- platform_type
# Authorship
- posted_by_profile
- is_official_content
# Content metadata
- title
- description
- language
- tags
- thumbnail_url
- content_category
# Temporal
- published_at
- updated_at
# Provenance
- retrieval_timestamp
- api_endpoint
- api_version
slot_usage:
post_id:
slot_uri: dcterms:identifier
description: |
Unique identifier for this post.
Format varies by platform:
- YouTube: Video ID (e.g., "dQw4w9WgXcQ")
- Twitter/X: Tweet ID (numeric string)
- Instagram: Media ID or shortcode
Combined with platform type, this uniquely identifies content globally.
range: string
required: true
identifier: true
examples:
- value: "FbIoC-Owy-M"
description: "YouTube video ID"
- value: "1234567890123456789"
description: "Twitter/X tweet ID"
post_url:
slot_uri: schema:url
description: |
Canonical URL to access this post on its native platform.
URL patterns by platform:
- YouTube: https://www.youtube.com/watch?v={video_id}
- Twitter/X: https://x.com/{user}/status/{tweet_id}
- Instagram: https://www.instagram.com/p/{shortcode}
- Mastodon: https://{instance}/@{user}/{post_id}
range: uri
required: true
pattern: "^https?://"
examples:
- value: "https://www.youtube.com/watch?v=FbIoC-Owy-M"
description: "YouTube video URL"
post_types:
slot_uri: dcterms:type
description: |
The type(s) of content in this post.
**MULTIVALUED**: A single post can have multiple types!
Uses SocialMediaPostType class hierarchy for standardized categorization.
**Ordering Convention**:
- First type = primary/dominant format
- Subsequent types = secondary characteristics
**Examples**:
| Post | post_types Value |
|------|------------------|
| YouTube video | [VideoPostType] |
| Instagram carousel | [CarouselPostType, ImagePostType, VideoPostType] |
| Twitter thread | [ThreadPostType, TextPostType, ImagePostType] |
| YouTube Live (archived) | [LiveStreamPostType, VideoPostType] |
| Podcast on YouTube | [AudioPostType, VideoPostType] |
range: SocialMediaPostType
multivalued: true
required: true
inlined: false
examples:
- value: "[VideoPostType]"
description: "Standard video content"
- value: "[CarouselPostType, ImagePostType, VideoPostType]"
description: "Instagram carousel with mixed media"
platform_type:
slot_uri: hc:platformType
description: |
The social media platform where this post was published.
Uses SocialMediaPlatformType class hierarchy.
Distinct from post_types which describes CONTENT format,
platform_type describes WHERE it was posted.
range: SocialMediaPlatformType
required: true
inlined: false
examples:
- value: "YouTube"
description: "Posted on YouTube"
posted_by_profile:
slot_uri: as:attributedTo
description: |
The social media profile (account/channel) that posted this content.
Activity Streams: attributedTo identifies the actor responsible for the content.
Links to SocialMediaProfile which in turn links to the Custodian hub.
range: SocialMediaProfile
required: false
inlined: false
examples:
- value: "https://nde.nl/ontology/hc/social-media/nationaal-onderduikmuseum-youtube"
description: "Museum's YouTube channel profile"
title:
slot_uri: dcterms:title
description: |
Title or headline of the post.
Dublin Core: title for the content's name.
Platform behavior:
- YouTube videos: Video title
- Twitter/X: First line of tweet (or null)
- Instagram: Caption beginning
- Threads: First post text
range: string
required: false
examples:
- value: "De Vrijheidsroute (aflevering 3) Zevenaar, Duiven, Westervoort"
description: "YouTube video title"
description:
slot_uri: dcterms:description
description: |
Full description or body text of the post.
Dublin Core: description for content text.
Platform behavior:
- YouTube: Video description
- Twitter/X: Full tweet text
- Instagram: Full caption
- Thread: Combined text of all posts in thread
range: string
required: false
examples:
- value: "De videoreeks De Vrijheidsroute is gebaseerd op de gelijknamige fietsroute..."
description: "YouTube video description"
published_at:
slot_uri: dcterms:created
description: |
Date/time when post was originally published.
Dublin Core: created for publication timestamp.
ISO 8601 format (e.g., "2025-07-30T18:05:15Z").
range: datetime
required: true
examples:
- value: "2025-07-30T18:05:15Z"
description: "Published July 30, 2025"
updated_at:
slot_uri: dcterms:modified
description: |
Date/time when post was last modified by the author.
Dublin Core: modified for last update timestamp.
May be null if post has never been edited.
range: datetime
required: false
examples:
- value: "2025-08-01T10:30:00Z"
description: "Last edited August 1, 2025"
language:
slot_uri: dcterms:language
description: |
Primary language of the post.
Dublin Core: language for content language.
ISO 639-1 code (e.g., "nl", "en", "de").
range: string
required: false
examples:
- value: "nl"
description: "Dutch language content"
tags:
slot_uri: schema:keywords
description: |
Tags, hashtags, or keywords associated with the post.
Schema.org: keywords for topic classification.
Platform-specific:
- YouTube: Video tags (author-defined)
- Twitter/X: Hashtags
- Instagram: Hashtags from caption
range: string
multivalued: true
required: false
examples:
- value: ["80 jaar vrijheid", "wo2", "vrijheidsroute"]
description: "YouTube video tags"
thumbnail_url:
slot_uri: schema:thumbnailUrl
description: |
URL to a thumbnail/preview image for the post.
Schema.org: thumbnailUrl for preview image.
range: uri
required: false
examples:
- value: "https://i.ytimg.com/vi/FbIoC-Owy-M/hqdefault.jpg"
description: "YouTube video thumbnail"
is_official_content:
slot_uri: hc:isOfficialContent
description: |
Whether this post was published by the heritage institution's official account.
- **true**: Posted by the custodian's verified/official account
- **false**: Third-party content (visitors, media, etc.) about the institution
Helps distinguish official communications from external coverage.
range: boolean
required: false
ifabsent: "true"
examples:
- value: true
description: "Posted by official museum channel"
content_category:
slot_uri: schema:genre
description: |
Category or genre of the post.
Schema.org: genre for content classification.
Platform-specific category systems:
- YouTube: Category ID (e.g., "22" = People & Blogs)
- Instagram: N/A
- TikTok: Category/trend
range: string
required: false
examples:
- value: "22"
description: "YouTube category ID for People & Blogs"
retrieval_timestamp:
slot_uri: prov:atTime
description: |
Timestamp when this post data was retrieved from the platform.
PROV-O: atTime for observation timestamp.
Critical for understanding data freshness, especially for metrics.
range: datetime
required: true
examples:
- value: "2025-12-01T23:16:22.294232+00:00"
description: "Retrieved December 1, 2025"
api_endpoint:
slot_uri: prov:wasGeneratedBy
description: |
The API endpoint or method used to retrieve this post.
PROV-O: wasGeneratedBy for generation method.
Examples:
- "https://www.googleapis.com/youtube/v3"
- "https://api.twitter.com/2/tweets"
- "web_scrape"
range: string
required: false
examples:
- value: "https://www.googleapis.com/youtube/v3"
description: "YouTube Data API v3"
api_version:
slot_uri: schema:softwareVersion
description: |
Version of the API used for retrieval.
Schema.org: softwareVersion for API version tracking.
range: string
required: false
examples:
- value: "v3"
description: "YouTube API version 3"
comments:
- "Concrete class for social media posts"
- "post_types is MULTIVALUED - a post can have multiple content types"
- "First type in post_types list is the primary format"
- "Use specialized subclasses (VideoPost) for platform-specific properties"
- "Activity Streams 2.0 alignment enables ActivityPub federation"
- "Metrics are observational - always include retrieval_timestamp"
see_also:
- "https://www.w3.org/ns/activitystreams#Object"
- "https://schema.org/CreativeWork"
- "https://schema.org/SocialMediaPosting"
# Slot definitions
slots:
post_id:
description: Platform-specific unique identifier for the post
range: string
post_url:
description: Canonical URL to access the post
range: uri
title:
description: Title/headline of the post
range: string
slot_uri: dcterms:title
# description is defined globally in ../slots/description.yaml
post_types:
description: |
Type(s) of content in this post (MULTIVALUED).
First type is primary format, subsequent types are secondary.
range: SocialMediaPostType
multivalued: true
posted_by_profile:
description: Social media profile that published this post
range: SocialMediaProfile
is_official_content:
description: Whether post is from official institutional account
range: boolean
content_category:
description: Platform-specific category/genre
range: string
retrieval_timestamp:
description: When post data was retrieved
range: datetime
# api_endpoint is defined globally in ../slots/api_endpoint.yaml
# Do not redefine here - use slot_usage in class to customize
api_version:
description: Version of API used
range: string
tags:
description: Keywords/hashtags for the post
range: string
multivalued: true
thumbnail_url:
description: URL to preview/thumbnail image
range: uri
published_at:
description: Date/time when post was originally published
range: datetime
slot_uri: dcterms:created
updated_at:
description: Date/time when post was last modified
range: datetime
slot_uri: dcterms:modified
# platform_type is defined globally in ../slots/platform_type.yaml