- Apply Rule 39: RiC-O style hasOrHad*/isOrWas* for temporal slots
- Apply Rule 43: Singular noun convention (keywords → keyword)
- Update slot references to match renamed slot files
- Maintain schema integrity across all class definitions
582 lines
19 KiB
YAML
id: https://nde.nl/ontology/hc/class/LinkedInProfile
name: linkedin_profile_class
title: LinkedIn Profile Class
version: 1.0.0
prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  schema: http://schema.org/
  foaf: http://xmlns.com/foaf/0.1/
  prov: http://www.w3.org/ns/prov#
  dct: http://purl.org/dc/terms/
imports:
- linkml:types
- ../metadata
- ./ExtractionMetadata
- ./WorkExperience
- ./EducationCredential
- ./HeritageRelevance
- ./LanguageProficiency
- ../slots/has_or_had_about_text
- ../slots/all_data_real
- ../slots/has_assessment_date
- ../slots/connections_text
- ../slots/data_source_whatsapp
- ../slots/digital_confidence
- ../slots/digital_indicator
- ../slots/digital_professional
- ../slots/education
- ../slots/enriched_date
- ../slots/enrichment_metadata_whatsapp
- ../slots/enrichment_method_whatsapp
- ../slots/experience
- ../slots/extraction_metadata
- ../slots/headline
- ../slots/heritage_relevance
- ../slots/has_or_had_language
- ../slots/languages_raw
- ../slots/likelihood_confidence
- ../slots/likelihood_factor
- ../slots/likelihood_level
- ../slots/likelihood_score
- ../slots/likely_whatsapp_proficient
- ../slots/max_likelihood_score
- ../slots/no_fabrication
- ../slots/profile_data
- ../slots/profile_image_url
- ../slots/profile_linkedin_url
- ../slots/profile_location
- ../slots/profile_name
- ../slots/skill
- ../slots/source_organization
- ../slots/specificity_annotation
- ../slots/template_specificity
- ../slots/whatsapp_business_likelihood
- ../slots/whatsapp_enrichment
- ./DigitalProfessionalAssessment
- ./LinkedInProfileData
- ./SpecificityAnnotation
- ./TemplateSpecificityScores
- ./WhatsAppEnrichment
- ./WhatsAppEnrichmentMetadata
- ./WhatsAppLikelihood
default_range: string
classes:
  LinkedInProfile:
    class_uri: schema:ProfilePage
    description: |
      Complete LinkedIn profile extraction for a person.

      Models the content of person entity JSON files stored at
      `data/custodian/person/entity/*.json`. This is the root class
      for LinkedIn profile data extracted via Exa API or HTML parsing.

      **Relationship to PersonObservation**:
      - PersonObservation.linkedin_profile_path references the file containing
        this LinkedInProfile data
      - PersonObservation.linkedin_profile_url links to the source URL
      - This class models the CONTENT of that file

      **Relationship to SocialMediaProfile**:
      - SocialMediaProfile (in same schema) is for CUSTODIAN social media accounts
        (e.g., Rijksmuseum's Instagram, Nationaal Archief's Twitter)
      - LinkedInProfile is for PERSON LinkedIn profiles (staff members)
      - These are complementary, not overlapping classes

      **Data Flow**:
      ```
      LinkedIn URL → Exa API → JSON file → LinkedInProfile (this class)
                                   ↑
          PersonObservation.linkedin_profile_path references this file
      ```
    exact_mappings:
    - schema:ProfilePage
    close_mappings:
    - foaf:PersonalProfileDocument
    - schema:Person
    slots:
    - extraction_metadata
    - heritage_relevance
    - profile_data
    - source_organization
    - specificity_annotation
    - template_specificity
    - whatsapp_enrichment
    slot_usage:
      extraction_metadata:
        description: |
          Provenance metadata for the extraction activity.
          Records how, when, and by what agent this profile was extracted.
          See ExtractionMetadata class for field definitions.
        range: ExtractionMetadata
        required: true
        inlined: true
      profile_data:
        description: |
          Core profile data extracted from LinkedIn.
          Contains personal info, career history, education, skills, languages.
          See LinkedInProfileData class for field definitions.
        range: LinkedInProfileData
        required: true
        inlined: true
      heritage_relevance:
        description: |
          Classification of this person's relevance to heritage sectors.
          See HeritageRelevance class for scoring guidelines.
        range: HeritageRelevance
        inlined: true
      source_organization:
        description: |
          Slug identifier of the organization from which this profile was discovered.
          Matches the custodian slug used in staff list parsing.
          Format: lowercase with hyphens (e.g., "rijksmuseum", "nationaal-archief")
        slot_uri: prov:wasInfluencedBy
        range: string
        pattern: ^[a-z0-9-]+$
        examples:
        - value: the-dutch-inspectorate-of-education
          description: Organization where person was discovered as staff
        - value: rijksmuseum
          description: Heritage institution employer
      whatsapp_enrichment:
        description: |
          Optional WhatsApp business likelihood enrichment.
          Added by enrichment scripts to assess digital communication capabilities.
        range: WhatsAppEnrichment
        inlined: true
      specificity_annotation:
        range: SpecificityAnnotation
        inlined: true
      template_specificity:
        range: TemplateSpecificityScores
        inlined: true
    comments:
    - This is the root class for person entity JSON files
    - PersonObservation.linkedin_profile_path references files containing this data
    - See AGENTS.md Rule 20 for person entity file requirements
    - See AGENTS.md Rule 27 for person-custodian data architecture
    see_also:
    - https://schema.org/ProfilePage
    - https://nde.nl/ontology/hc/class/PersonObservation
    - https://nde.nl/ontology/hc/class/SocialMediaProfile
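  # Illustrative instance sketch for LinkedInProfile (an assumption for
  # orientation, not a real record). The values are assembled from the example
  # values given in this schema and do not describe an actual person. The stored
  # entity files are JSON; YAML is used here only for readability. The fields of
  # ExtractionMetadata live in ./ExtractionMetadata and are left out rather
  # than guessed, so the empty mapping below is only a placeholder.
  #
  # extraction_metadata: {}          # required; fields per ExtractionMetadata (not shown)
  # source_organization: rijksmuseum # must match ^[a-z0-9-]+$
  # profile_data:                    # required; inlined LinkedInProfileData
  #   profile_name: Jan van der Berg
  #   headline: Senior Curator | Rijksmuseum
  #   profile_location: Amsterdam, Netherlands
  #   connections_text: 500+ connections
  #   skill:
  #   - education
  #   - teaching
  #
  # heritage_relevance and whatsapp_enrichment are optional and omitted here.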
  LinkedInProfileData:
    class_uri: schema:Person
    description: |
      Core profile data extracted from a LinkedIn profile.

      Contains the person's professional information including:
      - Basic info (name, headline, location, connections)
      - About/summary text
      - Career history (experience array)
      - Education history (education array)
      - Skills and languages
      - Profile image URL

      **Note on Data Representation**:
      - Raw strings are preserved for provenance (e.g., connections text)
      - Nested objects use defined classes (WorkExperience, EducationCredential)
      - Skills are simple strings (not structured objects)
      - Languages may be raw strings or LanguageProficiency objects
    exact_mappings:
    - schema:Person
    close_mappings:
    - foaf:Person
    slots:
    - has_or_had_about_text
    - connections_text
    - education
    - experience
    - headline
    - has_or_had_language
    - languages_raw
    - profile_image_url
    - profile_linkedin_url
    - profile_location
    - profile_name
    - skill
    - specificity_annotation
    - template_specificity
    slot_usage:
      profile_name:
        description: |
          Full name of the person as displayed on LinkedIn.
        slot_uri: schema:name
        range: string
        required: true
        examples:
        - value: Sander Hulleman
        - value: Jan van der Berg
      profile_linkedin_url:
        description: |
          LinkedIn profile URL for this person.
          Duplicated from extraction_metadata for convenience.
        slot_uri: schema:url
        range: uri
        pattern: ^https://www\.linkedin\.com/in/[a-z0-9-]+/?$
        examples:
        - value: https://www.linkedin.com/in/sander-hulleman-5017b9105
      headline:
        description: |
          Professional headline/tagline from LinkedIn.
          Typically includes current job title and/or professional identity.
        slot_uri: schema:jobTitle
        range: string
        examples:
        - value: Stafadviseur PO
          description: Dutch job title
        - value: Senior Curator | Rijksmuseum
          description: Title with organization
        - value: Digital Archivist | Heritage Data Specialist
          description: Multiple roles
      profile_location:
        description: |
          Location as displayed on LinkedIn profile.
          Format varies: "City, Region, Country" or "City, Country"
        slot_uri: schema:homeLocation
        range: string
        examples:
        - value: Arnhem, Gelderland, Netherlands
        - value: Amsterdam, Netherlands
      connections_text:
        description: |
          Raw connections/followers text from LinkedIn.
          Format: "X connections • Y followers"
          Preserved as-is for provenance.
        slot_uri: schema:description
        range: string
        examples:
        - value: 246 connections • 248 followers
        - value: 500+ connections
      has_or_had_about_text:
        description: |
          About/summary section text from LinkedIn profile.
          May be absent if the person has not written a summary.
        slot_uri: schema:description
        range: string
        examples:
        - value: Third year student at Stenden University...
      experience:
        description: |
          Work experience entries from LinkedIn.
          Array of WorkExperience objects with job title, company, dates, location.
        range: WorkExperience
        multivalued: true
        inlined_as_list: true
      education:
        description: |
          Education entries from LinkedIn.
          Array of EducationCredential objects with school, degree, years.
        range: EducationCredential
        multivalued: true
        inlined_as_list: true
      skill:
        description: |
          Skills listed on LinkedIn profile.
          Simple string array (not structured objects).
        slot_uri: schema:knowsAbout
        range: string
        multivalued: true
        examples:
        - value:
          - education
          - teaching
          - curriculum development
      languages_raw:
        description: |
          Raw language strings as extracted from LinkedIn.
          Format: "Language - Proficiency level"
          Use this when storing unprocessed data.
        range: string
        multivalued: true
        examples:
        - value:
          - English - Native or bilingual
          - Dutch - Native or bilingual
      has_or_had_language:
        description: |
          Parsed language proficiency entries.
          Array of LanguageProficiency objects with language name, code, level.
          Use this when storing processed/structured data.
        range: LanguageProficiency
        multivalued: true
        inlined_as_list: true
      profile_image_url:
        description: |
          URL to the LinkedIn profile photo.
          Should be the actual CDN URL (media.licdn.com), not the overlay page.
          See AGENTS.md Rule 16 for photo URL requirements.
        slot_uri: schema:image
        range: uri
        pattern: ^https://media\.licdn\.com/.*$
        examples:
        - value: https://media.licdn.com/dms/image/v2/C4E03AQHoGyR6G0kphA/profile-displayphoto-shrink_200_200/...
      specificity_annotation:
        range: SpecificityAnnotation
        inlined: true
      template_specificity:
        range: TemplateSpecificityScores
        inlined: true
    comments:
    - Inlined within LinkedInProfile as profile_data
    - experience and education use inlined_as_list for JSON array representation
    - languages_raw preserves original strings; has_or_had_language has parsed objects
    - profile_image_url must be CDN URL per AGENTS.md Rule 16
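  # Sketch of how the two URL patterns in the LinkedInProfileData slot_usage
  # behave, worked by hand from the regular expressions. The non-matching URLs
  # are hypothetical variants made up for illustration, not observed data.
  #
  # profile_linkedin_url: ^https://www\.linkedin\.com/in/[a-z0-9-]+/?$
  #   https://www.linkedin.com/in/sander-hulleman-5017b9105       matches
  #   https://www.linkedin.com/in/sander-hulleman-5017b9105/      matches (optional trailing slash)
  #   https://linkedin.com/in/sander-hulleman-5017b9105           no match (host lacks "www.")
  #   https://www.linkedin.com/in/Sander-Hulleman                 no match (uppercase in slug)
  #   https://www.linkedin.com/in/sander-hulleman-5017b9105?x=1   no match (query string)
  #
  # profile_image_url: ^https://media\.licdn\.com/.*$
  #   https://media.licdn.com/dms/image/...    matches (any path on the CDN host)
  #   http://media.licdn.com/dms/image/...     no match (http instead of https)
  #
  # Under these patterns, a URL copied with tracking parameters or mixed case
  # would probably need normalizing before it validates.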
  WhatsAppEnrichment:
    class_uri: hc:WhatsAppEnrichment
    description: |
      WhatsApp business likelihood enrichment data.

      Added by enrichment scripts to assess whether a person is likely
      to use WhatsApp for professional/business communication.

      **Assessment Factors**:
      - Digital technology indicators in profile
      - Role type (customer-facing, technical, etc.)
      - Industry/sector norms
      - Geographic region (WhatsApp prevalence varies)
    slots:
    - digital_professional
    - enrichment_metadata_whatsapp
    - specificity_annotation
    - template_specificity
    - whatsapp_business_likelihood
    slot_usage:
      digital_professional:
        description: |
          Assessment of digital/technology proficiency.
        range: DigitalProfessionalAssessment
        inlined: true
      whatsapp_business_likelihood:
        description: |
          Likelihood score for WhatsApp business usage.
        range: WhatsAppLikelihood
        inlined: true
      enrichment_metadata_whatsapp:
        description: |
          Metadata about the enrichment process.
        range: WhatsAppEnrichmentMetadata
        inlined: true
      specificity_annotation:
        range: SpecificityAnnotation
        inlined: true
      template_specificity:
        range: TemplateSpecificityScores
        inlined: true
  DigitalProfessionalAssessment:
    class_uri: hc:DigitalProfessionalAssessment
    description: |
      Assessment of a person's digital/technology proficiency.
    slots:
    - digital_confidence
    - digital_indicator
    - likely_whatsapp_proficient
    - specificity_annotation
    - template_specificity
    slot_usage:
      likely_whatsapp_proficient:
        description: Whether person is likely proficient with WhatsApp
        range: boolean
      digital_indicator:
        description: Indicators of digital proficiency from profile
        range: string
        multivalued: true
      digital_confidence:
        description: 'Confidence level: low, medium, high'
        range: string
      specificity_annotation:
        range: SpecificityAnnotation
        inlined: true
      template_specificity:
        range: TemplateSpecificityScores
        inlined: true
  WhatsAppLikelihood:
    class_uri: hc:WhatsAppLikelihood
    description: |
      Likelihood score for WhatsApp business usage.
    slots:
    - has_assessment_date
    - likelihood_confidence
    - likelihood_factor
    - likelihood_level
    - likelihood_score
    - max_likelihood_score
    - specificity_annotation
    - template_specificity
    slot_usage:
      likelihood_score:
        description: Numeric score (0-100)
        range: integer
        minimum_value: 0
        maximum_value: 100
      max_likelihood_score:
        description: Maximum possible score (typically 100)
        range: integer
      likelihood_level:
        description: 'Categorical level: low, medium, high'
        range: string
      likelihood_confidence:
        description: Confidence in the assessment (0.0-1.0)
        range: float
        minimum_value: 0.0
        maximum_value: 1.0
      likelihood_factor:
        description: Factors contributing to the score
        range: string
        multivalued: true
      has_assessment_date:
        description: When the assessment was performed (ISO 8601)
        range: datetime
      specificity_annotation:
        range: SpecificityAnnotation
        inlined: true
      template_specificity:
        range: TemplateSpecificityScores
        inlined: true
  WhatsAppEnrichmentMetadata:
    class_uri: hc:WhatsAppEnrichmentMetadata
    description: |
      Metadata about the WhatsApp enrichment process.
    slots:
    - has_all_data_real_flag
    - data_source_whatsapp
    - enriched_date
    - enrichment_method_whatsapp
    - no_fabrication
    - specificity_annotation
    - template_specificity
    slot_usage:
      enriched_date:
        description: When enrichment was performed (ISO 8601)
        range: datetime
      enrichment_method_whatsapp:
        description: Method used for enrichment
        range: string
        examples:
        - value: linkedin_profile_analysis
      data_source_whatsapp:
        description: Source of data for enrichment
        range: string
        examples:
        - value: public_linkedin_profile
      no_fabrication:
        description: Confirms no data was fabricated
        range: boolean
      has_all_data_real_flag:
        description: Confirms all data is from real sources
        range: boolean
      specificity_annotation:
        range: SpecificityAnnotation
        inlined: true
      template_specificity:
        range: TemplateSpecificityScores
        inlined: true
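  # Illustrative whatsapp_enrichment fragment showing how WhatsAppEnrichment,
  # DigitalProfessionalAssessment, WhatsAppLikelihood and
  # WhatsAppEnrichmentMetadata nest, with the value constraints stated in their
  # slot_usage. All scores, dates and indicator strings are hypothetical; only
  # the method and data source strings are taken from the examples in this schema.
  #
  # whatsapp_enrichment:
  #   digital_professional:
  #     likely_whatsapp_proficient: true
  #     digital_indicator:
  #     - digital archiving                      # hypothetical indicator string
  #     digital_confidence: medium               # low, medium, high
  #   whatsapp_business_likelihood:
  #     likelihood_score: 65                     # integer, 0-100
  #     max_likelihood_score: 100
  #     likelihood_level: medium                 # low, medium, high
  #     likelihood_confidence: 0.7               # float, 0.0-1.0
  #     likelihood_factor:
  #     - customer-facing role                   # hypothetical factor string
  #     has_assessment_date: 2025-01-15T10:30:00Z  # hypothetical ISO 8601 datetime
  #   enrichment_metadata_whatsapp:
  #     enriched_date: 2025-01-15T10:30:00Z      # hypothetical ISO 8601 datetime
  #     enrichment_method_whatsapp: linkedin_profile_analysis
  #     data_source_whatsapp: public_linkedin_profile
  #     no_fabrication: true
  #     has_all_data_real_flag: true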
slots:
  extraction_metadata:
    description: Provenance metadata for the extraction activity
    range: ExtractionMetadata
  profile_data:
    description: Core profile data from LinkedIn
    range: LinkedInProfileData
  heritage_relevance:
    description: Heritage sector classification
    range: HeritageRelevance
  source_organization:
    description: Organization slug where person was discovered
    range: string
  whatsapp_enrichment:
    description: WhatsApp business likelihood enrichment
    range: WhatsAppEnrichment
  profile_name:
    description: Full name of the person
    range: string
  profile_linkedin_url:
    description: LinkedIn profile URL
    range: uri
  headline:
    description: Professional headline/tagline
    range: string
  profile_location:
    description: Location as displayed on profile
    range: string
  connections_text:
    description: Raw connections/followers text
    range: string
  has_or_had_about_text:
    description: About/summary section text
    range: string
  experience:
    description: Work experience entries
    range: WorkExperience
    multivalued: true
  education:
    description: Education entries
    range: EducationCredential
    multivalued: true
  skill:
    description: Skills listed on profile
    range: string
    multivalued: true
  languages_raw:
    description: Raw language strings
    range: string
    multivalued: true
  has_or_had_language:
    description: Parsed language proficiency entries
    range: LanguageProficiency
    multivalued: true
  profile_image_url:
    description: Profile photo URL
    range: uri
  digital_professional:
    description: Digital proficiency assessment
    range: DigitalProfessionalAssessment
  whatsapp_business_likelihood:
    description: WhatsApp business usage likelihood
    range: WhatsAppLikelihood
  enrichment_metadata_whatsapp:
    description: WhatsApp enrichment metadata
    range: WhatsAppEnrichmentMetadata
  likely_whatsapp_proficient:
    description: Whether person is likely WhatsApp proficient
    range: boolean
  digital_indicator:
    description: Indicators of digital proficiency
    range: string
    multivalued: true
  digital_confidence:
    description: Digital proficiency confidence level
    range: string
  likelihood_score:
    description: Numeric likelihood score
    range: integer
  max_likelihood_score:
    description: Maximum possible score
    range: integer
  likelihood_level:
    description: Categorical likelihood level
    range: string
  likelihood_confidence:
    description: Confidence in the assessment
    range: float
  likelihood_factor:
    description: Factors contributing to score
    range: string
    multivalued: true
  has_assessment_date:
    description: When assessment was performed
    range: datetime
  enriched_date:
    description: When enrichment was performed
    range: datetime
  enrichment_method_whatsapp:
    description: Method used for enrichment
    range: string
  data_source_whatsapp:
    description: Data source for enrichment
    range: string
  no_fabrication:
    description: Confirms no data was fabricated
    range: boolean
  has_all_data_real_flag:
    description: Confirms all data is real
    range: boolean