id: https://nde.nl/ontology/hc/class/ConnectionSourceMetadata
name: connection_source_metadata_class
title: Connection Source Metadata Class
version: 1.0.0
prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  schema: http://schema.org/
  prov: http://www.w3.org/ns/prov#
  dct: http://purl.org/dc/terms/
imports:
- linkml:types
- ./SpecificityAnnotation
- ./TemplateSpecificityScores
- ../enums/ScrapeMethodEnum
- ../slots/connections_extracted
- ../slots/note
- ../slots/scrape_method
- ../slots/scraped_timestamp
- ../slots/source_url
- ../slots/specificity_annotation
- ../slots/target_name
- ../slots/target_profile
- ../slots/template_specificity
default_prefix: hc
classes:
  ConnectionSourceMetadata:
    class_uri: prov:Activity
    description: |
      Provenance metadata about how the connections were extracted.

      Records the extraction context including:
      - Source URL (LinkedIn search or profile page)
      - When the extraction occurred
      - Which method was used (manual browse, automated scrape)
      - Target profile being analyzed
      - Count of connections extracted

      **Scrape Methods**:
      - manual_linkedin_browse: Manual copy-paste while logged in
      - linkedin_html_parser: Parsed from saved HTML file
      - exa_search: Extracted via Exa API
    exact_mappings:
    - prov:Activity
    slots:
    - connections_extracted
    - note
    - scrape_method
    - scraped_timestamp
    - source_url
    - specificity_annotation
    - target_name
    - target_profile
    - template_specificity
    slot_usage:
      source_url:
        description: |
          URL of the LinkedIn page where connections were extracted from.
          Usually a LinkedIn search results URL or profile connections page.
        slot_uri: prov:used
        range: uri
        required: true
        examples:
        - value: https://www.linkedin.com/search/results/people/?network=%5B%22F%22%2C%22S%22%2C%22O%22%5D
          description: LinkedIn connection search URL
      scraped_timestamp:
        description: |
          ISO 8601 timestamp when the connections were extracted.
          Critical for tracking network changes over time.
        slot_uri: prov:endedAtTime
        range: datetime
        required: true
        examples:
        - value: "2025-12-09T22:00:00Z"
      scrape_method:
        description: |
          Method used to extract the connection data.

          Values:
          - manual_linkedin_browse: Manual extraction while logged in
          - linkedin_html_parser: Parsed from saved HTML file
          - exa_search: Extracted via Exa API
        slot_uri: prov:wasAssociatedWith
        range: ScrapeMethodEnum
        required: true
        examples:
        - value: manual_linkedin_browse
      target_profile:
        description: |
          LinkedIn slug of the profile whose connections were extracted.
          Format: lowercase alphanumeric with hyphens.
        slot_uri: dct:subject
        range: string
        required: true
        pattern: "^[a-z0-9-]+$"
        examples:
        - value: giovannafossati
        - value: alexandr-belov-bb547b46
      target_name:
        description: |
          Full display name of the target profile.
          The person whose connections were extracted.
        slot_uri: schema:name
        range: string
        required: true
        examples:
        - value: Giovanna Fossati
        - value: Alexandr Belov
      connections_extracted:
        description: |
          Total number of connections extracted from this source.
          Used for validation and completeness tracking.
        slot_uri: schema:numberOfItems
        range: integer
        required: true
        minimum_value: 0
        examples:
        - value: 776
      note:
        description: |
          Optional notes about the extraction process.
          May reference raw source files or explain any issues.
        slot_uri: schema:description
        range: string
        examples:
        - value: Raw scrape in giovannafossati_connections_20251209T220000Z_note-max100p-1st2nd3th.md
      specificity_annotation:
        range: SpecificityAnnotation
        inlined: true
      template_specificity:
        range: TemplateSpecificityScores
        inlined: true
    comments:
    - Aligns with PROV-O Activity pattern
    - scraped_timestamp maps to prov:endedAtTime
    - target_profile is the LinkedIn slug being analyzed
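# Illustrative instance (a sketch, not taken from an actual data file).
# It shows what a single ConnectionSourceMetadata record might look like
# when every required slot is filled. Each value is copied from the
# per-slot examples in this schema; combining them into one record is an
# assumption made for illustration only. The optional inlined slots
# (specificity_annotation, template_specificity) are left out because
# their structure is defined in the imported classes.
#
#   source_url: https://www.linkedin.com/search/results/people/?network=%5B%22F%22%2C%22S%22%2C%22O%22%5D
#   scraped_timestamp: "2025-12-09T22:00:00Z"
#   scrape_method: manual_linkedin_browse
#   target_profile: giovannafossati
#   target_name: Giovanna Fossati
#   connections_extracted: 776
#   note: Raw scrape in giovannafossati_connections_20251209T220000Z_note-max100p-1st2nd3th.md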