glam/schemas/20251121/linkml/modules/classes/ConnectionSourceMetadata.yaml
kempersc 626bd3a095 refactor(schemas): apply naming conventions to 261 class files
- Apply Rule 39: RiC-O style hasOrHad*/isOrWas* for temporal slots
- Apply Rule 43: Singular noun convention (keywords → keyword)
- Update slot references to match renamed slot files
- Maintain schema integrity across all class definitions
2026-01-10 15:36:33 +01:00

id: https://nde.nl/ontology/hc/class/ConnectionSourceMetadata
name: connection_source_metadata_class
title: Connection Source Metadata Class
version: 1.0.0
prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  schema: http://schema.org/
  prov: http://www.w3.org/ns/prov#
  dct: http://purl.org/dc/terms/
imports:
- linkml:types
- ./SpecificityAnnotation
- ./TemplateSpecificityScores
- ../enums/ScrapeMethodEnum
- ../slots/connections_extracted
- ../slots/note
- ../slots/scrape_method
- ../slots/scraped_timestamp
- ../slots/source_url
- ../slots/specificity_annotation
- ../slots/target_name
- ../slots/target_profile
- ../slots/template_specificity
default_prefix: hc
classes:
  ConnectionSourceMetadata:
    class_uri: prov:Activity
    description: |
      Provenance metadata about how the connections were extracted.

      Records the extraction context including:
      - Source URL (LinkedIn search or profile page)
      - When the extraction occurred
      - Which method was used (manual browse, automated scrape)
      - Target profile being analyzed
      - Count of connections extracted

      **Scrape Methods**:
      - manual_linkedin_browse: Manual copy-paste while logged in
      - linkedin_html_parser: Parsed from saved HTML file
      - exa_search: Extracted via Exa API
    exact_mappings:
    - prov:Activity
    slots:
    - connections_extracted
    - note
    - scrape_method
    - scraped_timestamp
    - source_url
    - specificity_annotation
    - target_name
    - target_profile
    - template_specificity
    slot_usage:
      source_url:
        description: |
          URL of the LinkedIn page from which the connections were extracted.
          Usually a LinkedIn search results URL or profile connections page.
        slot_uri: prov:used
        range: uri
        required: true
        examples:
        - value: https://www.linkedin.com/search/results/people/?network=%5B%22F%22%2C%22S%22%2C%22O%22%5D
          description: LinkedIn connection search URL
      scraped_timestamp:
        description: |
          ISO 8601 timestamp when the connections were extracted.
          Critical for tracking network changes over time.
        slot_uri: prov:endedAtTime
        range: datetime
        required: true
        examples:
        - value: "2025-12-09T22:00:00Z"
      scrape_method:
        description: |
          Method used to extract the connection data.

          Values:
          - manual_linkedin_browse: Manual extraction while logged in
          - linkedin_html_parser: Parsed from saved HTML file
          - exa_search: Extracted via Exa API
        slot_uri: prov:wasAssociatedWith
        range: ScrapeMethodEnum
        required: true
        examples:
        - value: manual_linkedin_browse
      target_profile:
        description: |
          LinkedIn slug of the profile whose connections were extracted.
          Format: lowercase alphanumeric with hyphens.
        slot_uri: dct:subject
        range: string
        required: true
        pattern: "^[a-z0-9-]+$"
        examples:
        - value: giovannafossati
        - value: alexandr-belov-bb547b46
      target_name:
        description: |
          Full display name of the target profile.
          The person whose connections were extracted.
        slot_uri: schema:name
        range: string
        required: true
        examples:
        - value: Giovanna Fossati
        - value: Alexandr Belov
      connections_extracted:
        description: |
          Total number of connections extracted from this source.
          Used for validation and completeness tracking.
        slot_uri: schema:numberOfItems
        range: integer
        required: true
        minimum_value: 0
        examples:
        - value: 776
      note:
        description: |
          Optional notes about the extraction process.
          May reference raw source files or explain any issues.
        slot_uri: schema:description
        range: string
        examples:
        - value: Raw scrape in giovannafossati_connections_20251209T220000Z_note-max100p-1st2nd3th.md
      specificity_annotation:
        range: SpecificityAnnotation
        inlined: true
      template_specificity:
        range: TemplateSpecificityScores
        inlined: true
    comments:
    - Aligns with PROV-O Activity pattern
    - scraped_timestamp maps to prov:endedAtTime
    - target_profile is the LinkedIn slug being analyzed
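
For orientation, a data instance conforming to ConnectionSourceMetadata might look like the sketch below. It is illustrative only and not taken from the repository: every value is copied from the per-slot examples in the schema, and pairing them in a single record is an assumption. The two inlined slots (specificity_annotation, template_specificity) are omitted because their structure is defined in other files.

```yaml
# Hypothetical instance assembled from the slot examples in the schema.
# All seven slots marked `required: true` are present; `note` is optional.
source_url: https://www.linkedin.com/search/results/people/?network=%5B%22F%22%2C%22S%22%2C%22O%22%5D
scraped_timestamp: "2025-12-09T22:00:00Z"   # range: datetime (ISO 8601)
scrape_method: manual_linkedin_browse       # must be a ScrapeMethodEnum value
target_profile: giovannafossati             # must match ^[a-z0-9-]+$
target_name: Giovanna Fossati
connections_extracted: 776                  # integer, minimum_value 0
note: Raw scrape in giovannafossati_connections_20251209T220000Z_note-max100p-1st2nd3th.md
```

A record like this should fail validation if target_profile contains uppercase letters or underscores, or if connections_extracted is negative, since the schema constrains both.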