id: https://nde.nl/ontology/hc/class/ConnectionNetwork
name: connection_network_class
title: Connection Network Class
version: 1.0.0

prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  schema: http://schema.org/
  prov: http://www.w3.org/ns/prov#
  dct: http://purl.org/dc/terms/
  xsd: http://www.w3.org/2001/XMLSchema#

imports:
  - linkml:types
  - ../metadata
  - ./PersonConnection
  - ../slots/notes
  - ../slots/class_metadata_slots

default_range: string

classes:
  ConnectionNetwork:
    class_uri: schema:ItemList
    description: |
      Collection of LinkedIn network connections with source metadata.

      This is the root class for connection JSON files stored at:
      `data/custodian/person/connection/bu/{linkedin_slug}_connections_{timestamp}.json`

      Each file contains:
      - **source_metadata**: Provenance about the extraction (who, when, how)
      - **connections**: Array of PersonConnection entries (the actual network data)
      - **network_analysis**: Optional aggregated statistics

      **Use Cases**:
      - Heritage sector network analysis
      - Cross-custodian relationship discovery
      - Staff member connection patterns
      - Professional community mapping

      **File Naming Convention**:
      `{linkedin-slug}_connections_{ISO-timestamp}.json`

      Example: `giovannafossati_connections_20251209T220000Z.json`
    exact_mappings:
      - schema:ItemList
    close_mappings:
      - prov:Collection
    slots:
      - connections
      - network_analysis
      - source_metadata
      - specificity_annotation
      - template_specificity
    slot_usage:
      source_metadata:
        description: Provenance metadata about the connection extraction
        range: ConnectionSourceMetadata
        required: true
        inlined: true
      connections:
        description: Array of connection entries from the LinkedIn network
        range: PersonConnection
        required: true
        multivalued: true
        inlined: true
        inlined_as_list: true
      network_analysis:
        description: Aggregated statistics about the connection network
        range: NetworkAnalysis
        inlined: true
      specificity_annotation:
        range: SpecificityAnnotation
        inlined: true
      template_specificity:
        range: TemplateSpecificityScores
        inlined: true
    comments:
      - Root class for connection network JSON files (validated with -C ConnectionNetwork)
      - 'Per AGENTS.md Rule 15: ALL connections must be fully registered'
      - Enables heritage sector network analysis
      - 'File naming: {linkedin-slug}_connections_{timestamp}.json'
    see_also:
      - https://schema.org/ItemList

  ConnectionSourceMetadata:
    class_uri: prov:Activity
    description: |
      Provenance metadata about how the connections were extracted.

      Records the extraction context including:
      - Source URL (LinkedIn search or profile page)
      - When the extraction occurred
      - Which method was used (manual browse, automated scrape)
      - Target profile being analyzed
      - Count of connections extracted

      **Scrape Methods**:
      - manual_linkedin_browse: Manual copy-paste while logged in
      - linkedin_html_parser: Parsed from saved HTML file
      - exa_search: Extracted via Exa API
    exact_mappings:
      - prov:Activity
    slots:
      - connections_extracted
      - notes
      - scrape_method
      - scraped_timestamp
      - source_url
      - specificity_annotation
      - target_name
      - target_profile
      - template_specificity
    slot_usage:
      source_url:
        description: |
          URL of the LinkedIn page where connections were extracted from.
          Usually a LinkedIn search results URL or profile connections page.
        slot_uri: prov:used
        range: uri
        required: true
        examples:
          - value: https://www.linkedin.com/search/results/people/?network=%5B%22F%22%2C%22S%22%2C%22O%22%5D
            description: LinkedIn connection search URL
      scraped_timestamp:
        description: |
          ISO 8601 timestamp when the connections were extracted.
          Critical for tracking network changes over time.
        slot_uri: prov:endedAtTime
        range: datetime
        required: true
        examples:
          - value: '2025-12-09T22:00:00Z'
      scrape_method:
        description: |
          Method used to extract the connection data.

          Values:
          - manual_linkedin_browse: Manual extraction while logged in
          - linkedin_html_parser: Parsed from saved HTML file
          - exa_search: Extracted via Exa API
        slot_uri: prov:wasAssociatedWith
        range: ScrapeMethodEnum
        required: true
        examples:
          - value: manual_linkedin_browse
      target_profile:
        description: |
          LinkedIn slug of the profile whose connections were extracted.
          Format: lowercase alphanumeric with hyphens.
        slot_uri: dct:subject
        range: string
        required: true
        pattern: ^[a-z0-9-]+$
        examples:
          - value: giovannafossati
          - value: alexandr-belov-bb547b46
      target_name:
        description: |
          Full display name of the target profile.
          The person whose connections were extracted.
        slot_uri: schema:name
        range: string
        required: true
        examples:
          - value: Giovanna Fossati
          - value: Alexandr Belov
      connections_extracted:
        description: |
          Total number of connections extracted from this source.
          Used for validation and completeness tracking.
        slot_uri: schema:numberOfItems
        range: integer
        required: true
        minimum_value: 0
        examples:
          - value: 776
      notes:
        description: |
          Optional notes about the extraction process.
          May reference raw source files or explain any issues.
        slot_uri: schema:description
        range: string
        examples:
          - value: Raw scrape in giovannafossati_connections_20251209T220000Z_note-max100p-1st2nd3th.md
      specificity_annotation:
        range: SpecificityAnnotation
        inlined: true
      template_specificity:
        range: TemplateSpecificityScores
        inlined: true
    comments:
      - Aligns with PROV-O Activity pattern
      - scraped_timestamp maps to prov:endedAtTime
      - target_profile is the LinkedIn slug being analyzed

  NetworkAnalysis:
    class_uri: schema:DataFeedItem
    description: |
      Aggregated statistics about the connection network.

      Provides summary metrics for quick analysis:
      - Total connections extracted
      - Heritage-relevant count and percentage
      - Breakdown by heritage type (GLAMORCUBESFIXPHDNT)

      **Example**:
      ```json
      {
        "total_connections_extracted": 776,
        "heritage_relevant_count": 456,
        "heritage_relevant_percentage": 58.8,
        "connections_by_heritage_type": {
          "A": 45,
          "M": 89,
          "D": 112,
          "R": 78
        }
      }
      ```
    slots:
      - connections_by_heritage_type
      - heritage_relevant_count
      - heritage_relevant_percentage
      - specificity_annotation
      - template_specificity
      - total_connections_extracted
    slot_usage:
      total_connections_extracted:
        description: Total number of connections in the network
        slot_uri: schema:numberOfItems
        range: integer
        required: true
        minimum_value: 0
      heritage_relevant_count:
        description: Number of connections marked as heritage-relevant
        slot_uri: hc:heritageRelevantCount
        range: integer
        required: true
        minimum_value: 0
      heritage_relevant_percentage:
        description: Percentage of connections that are heritage-relevant (0-100)
        slot_uri: hc:heritageRelevantPercentage
        range: float
        minimum_value: 0.0
        maximum_value: 100.0
        examples:
          - value: 58.8
      connections_by_heritage_type:
        description: |
          Breakdown of heritage-relevant connections by type code.
          Keys are single-letter GLAMORCUBESFIXPHDNT codes.
        slot_uri: hc:connectionsByHeritageType
        range: HeritageTypeCount
        multivalued: true
        inlined: true
        inlined_as_list: true
      specificity_annotation:
        range: SpecificityAnnotation
        inlined: true
      template_specificity:
        range: TemplateSpecificityScores
        inlined: true
    comments:
      - Optional aggregation - can be computed from connections array
      - Useful for quick heritage sector analysis

  HeritageTypeCount:
    class_uri: schema:PropertyValue
    description: |
      Count of connections for a specific heritage type.
      Used in network_analysis.connections_by_heritage_type.
    slots:
      - count
      - heritage_type_code
      - specificity_annotation
      - template_specificity
    slot_usage:
      heritage_type_code:
        description: Single-letter heritage type code (G,L,A,M,O,R,C,U,B,E,S,F,I,X,P,H,D,N,T)
        slot_uri: schema:propertyID
        range: string
        required: true
        pattern: ^[GLAMORCUBESFIXPHDNT]$
      count:
        description: Number of connections of this heritage type
        slot_uri: schema:value
        range: integer
        required: true
        minimum_value: 0
      specificity_annotation:
        range: SpecificityAnnotation
        inlined: true
      template_specificity:
        range: TemplateSpecificityScores
        inlined: true

enums:
  ScrapeMethodEnum:
    description: |
      Methods used to extract LinkedIn connection data.
      Determines data quality and potential limitations.
    permissible_values:
      manual_linkedin_browse:
        description: Manual extraction while logged into LinkedIn
        meaning: prov:SoftwareAgent
      linkedin_html_parser:
        description: Parsed from saved LinkedIn HTML file
        meaning: prov:SoftwareAgent
      exa_search:
        description: Extracted via Exa API search
        meaning: prov:SoftwareAgent
      automated_scraper:
        description: Automated scraping tool
        meaning: prov:SoftwareAgent

slots:
  source_metadata:
    description: Provenance metadata about the extraction
    range: ConnectionSourceMetadata
  connections:
    description: Array of connection entries
    range: PersonConnection
    multivalued: true
  network_analysis:
    description: Aggregated network statistics
    range: NetworkAnalysis
  source_url:
    description: URL where data was extracted from
    range: uri
  scraped_timestamp:
    description: When the extraction occurred
    range: datetime
  scrape_method:
    description: Method used for extraction
    range: ScrapeMethodEnum
  target_profile:
    description: LinkedIn slug of target profile
    range: string
  target_name:
    description: Display name of target profile
    range: string
  connections_extracted:
    description: Number of connections extracted
    range: integer
  total_connections_extracted:
    description: Total connection count
    range: integer
  heritage_relevant_count:
    description: Count of heritage-relevant connections
    range: integer
  heritage_relevant_percentage:
    description: Percentage of heritage-relevant connections
    range: float
  connections_by_heritage_type:
    description: Breakdown by heritage type code
    range: HeritageTypeCount
    multivalued: true
  heritage_type_code:
    description: Single-letter heritage type code
    range: string
  count:
    description: Count value
    range: integer
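
# ---------------------------------------------------------------------------
# Illustrative instance (a minimal sketch, not part of the schema).
#
# Shows roughly how the classes above fit together in one ConnectionNetwork
# document, written as YAML for readability. All values are reused from the
# examples in this file. Assumptions, labeled as such:
#   - `connections` is left empty because PersonConnection is defined in
#     ./PersonConnection and its fields are not shown here.
#   - `connections_by_heritage_type` is written as a list of
#     HeritageTypeCount objects because the slot declares
#     `inlined_as_list: true`; the validator in use may expect another form.
#   - The optional specificity_annotation and template_specificity slots
#     are omitted.
#
# source_metadata:
#   source_url: https://www.linkedin.com/search/results/people/?network=%5B%22F%22%2C%22S%22%2C%22O%22%5D
#   scraped_timestamp: '2025-12-09T22:00:00Z'
#   scrape_method: manual_linkedin_browse
#   target_profile: giovannafossati
#   target_name: Giovanna Fossati
#   connections_extracted: 776
# connections: []        # PersonConnection entries omitted in this sketch
# network_analysis:
#   total_connections_extracted: 776
#   heritage_relevant_count: 456
#   heritage_relevant_percentage: 58.8
#   connections_by_heritage_type:
#     - heritage_type_code: A
#       count: 45
#     - heritage_type_code: M
#       count: 89
#     - heritage_type_code: D
#       count: 112
#     - heritage_type_code: R
#       count: 78
# ---------------------------------------------------------------------------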