Person Profile Extraction Confidence Scoring

Version: 1.0.0
Created: 2025-12-15
Applies To: Person entity profiles in data/custodian/person/entity/


Purpose

This document defines the confidence scoring rubric for profile extraction quality: how confident we are that the extracted profile data is accurate and complete. It is distinct from heritage_sector_relevance, which measures domain expertise.

Two Different Scores:

| Field | Measures | Range |
|---|---|---|
| exa_enrichment.confidence_score | Data extraction quality/completeness | 0.50-0.95 |
| heritage_sector_relevance.score | Domain expertise in heritage sector | 0.10-1.00 |

Confidence Score Rubric

| Score Range | Level | Criteria | Examples |
|---|---|---|---|
| 0.90-0.95 | High Confidence | Senior heritage role, clear title, named institution, verifiable details | "Director at Rijksmuseum", "Chief Curator at British Museum" |
| 0.75-0.85 | Good Confidence | Mid-level heritage role, good institutional context, clear affiliation | "Junior Development at Rijksmuseum \| MA Cultural Economics" |
| 0.60-0.70 | Moderate Confidence | Entry-level/support role, or technical role at heritage institution, limited details | "Staff at Internet Archive", "Stedelijk Museum Amsterdam" (no role) |
| 0.50-0.55 | Low Confidence | Intern, unclear relationship, privacy-abbreviated name, minimal data | "Intern at Museum", "Amy B." (abbreviated name) |
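
The rubric can be read as a simple lookup from score to level. The following is a minimal sketch, not part of the rule itself: the function name `confidence_level` is hypothetical, and it assumes that a score falling between two listed bands (for example 0.87) belongs to the band below it.

```python
def confidence_level(score: float) -> str:
    """Map a confidence score to its rubric level (hypothetical helper)."""
    # Compare in integer hundredths so 0.55 or 0.60 are not affected by float noise.
    hundredths = round(score * 100)
    if not 50 <= hundredths <= 95:
        raise ValueError(f"score {score} is outside the documented 0.50-0.95 range")
    # Assumption: each band starts at its lower bound and runs up to the next band.
    if hundredths >= 90:
        return "High Confidence"
    if hundredths >= 75:
        return "Good Confidence"
    if hundredths >= 60:
        return "Moderate Confidence"
    return "Low Confidence"


print(confidence_level(0.90))  # High Confidence
print(confidence_level(0.65))  # Moderate Confidence
```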

Scoring Factors

Factors That INCREASE Confidence

| Factor | Impact | Example |
|---|---|---|
| Clear job title visible | +0.10 to +0.15 | "Curator", "Archivist", "Director" |
| Named institution in headline | +0.05 to +0.10 | "at Rijksmuseum", "at Internet Archive" |
| Education degree visible | +0.05 | "MA Cultural Economics", "PhD Art History" |
| Seniority indicator | +0.05 to +0.10 | "Senior", "Head of", "Director" |
| Multiple data points | +0.05 | Role + Education + Location |
| Full name (not abbreviated) | +0.05 | "Aliza Snoek" vs "Amy B." |
| Specific department/team | +0.05 | "Development Team", "Conservation Department" |

Factors That DECREASE Confidence

| Factor | Impact | Example |
|---|---|---|
| No role title (institution only) | -0.15 to -0.20 | Headline: "Stedelijk Museum Amsterdam" |
| Generic "staff" title | -0.10 | "staff at The Internet Archive" |
| Privacy-abbreviated name | -0.15 to -0.20 | "Amy B.", "J. Smith" |
| Intern/trainee position | -0.10 | "Intern", "Stagiair", "Trainee" |
| No location data | -0.05 | Location field is null |
| 403 privacy restriction | -0.10 | Full profile unavailable |
| Ambiguous affiliation | -0.10 | Unclear which institution |

Score Calculation Examples

Example 1: Score 0.85 (Good Confidence)

Profile: "Junior Development Rijksmuseum | MA Cultural Economics"

Base score: 0.65
+ Clear role title ("Junior Development"): +0.10
+ Named institution ("Rijksmuseum"): +0.05
+ Education visible ("MA Cultural Economics"): +0.05
= Final score: 0.85

Example 2: Score 0.65 (Moderate Confidence)

Profile: "staff at The Internet Archive"

Base score: 0.65
+ Named institution ("Internet Archive"): +0.05
- Generic title ("staff"): -0.10
+ Full name visible: +0.05
= Final score: 0.65

Example 3: Score 0.60 (Moderate Confidence)

Profile: "Stedelijk Museum Amsterdam" (no role)

Base score: 0.65
+ Named institution: +0.05
- No role title visible: -0.15
+ Full name visible: +0.05
= Final score: 0.60

Example 4: Score 0.50 (Low Confidence)

Profile: "Intern at Kröller-Müller Museum"

Base score: 0.65
+ Named institution: +0.05
- Intern position: -0.10
- 403 privacy restriction: -0.10
= Final score: 0.50

Example 5: Score 0.50 (Low Confidence - Abbreviated Name)

Profile: "Amy B. - Film Archivist"

Base score: 0.65
+ Clear role title: +0.10
- Abbreviated name: -0.20
- No institution in headline: -0.05
= Final score: 0.50
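
The worked examples all follow the same pattern: start from a base of 0.65 and add the signed factor impacts. The sketch below reproduces that arithmetic for Examples 2 and 5. It is illustrative only: the function name `confidence_score` and the dictionary labels are hypothetical, and clamping the result to the documented 0.50-0.95 range is an assumption, since the examples never exceed those bounds.

```python
BASE_SCORE = 65                 # hundredths; the 0.65 base used in every worked example
MIN_SCORE, MAX_SCORE = 50, 95   # documented range of confidence_score, in hundredths


def confidence_score(adjustments: dict[str, float]) -> float:
    """Sum the base score and the signed factor impacts (hypothetical helper)."""
    # Work in integer hundredths so repeated +/-0.05 steps do not accumulate float error.
    total = BASE_SCORE + sum(round(delta * 100) for delta in adjustments.values())
    # Assumption: results outside the documented range are clamped to it.
    clamped = max(MIN_SCORE, min(MAX_SCORE, total))
    return clamped / 100


# Example 2: "staff at The Internet Archive"
example_2 = {
    "named institution": +0.05,
    "generic 'staff' title": -0.10,
    "full name visible": +0.05,
}

# Example 5: "Amy B. - Film Archivist"
example_5 = {
    "clear role title": +0.10,
    "abbreviated name": -0.20,
    "no institution in headline": -0.05,
}

print(confidence_score(example_2))  # 0.65
print(confidence_score(example_5))  # 0.5
```

Keeping each adjustment labelled makes it straightforward to copy the same breakdown into the notes field of the entity file.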

Implementation in Entity Files

The confidence score is stored in the exa_enrichment block:

{
  "exa_enrichment": {
    "confidence_score": 0.75,
    "enrichment_date": "2025-12-15T12:45:00Z",
    "sources_consulted": [
      "LinkedIn profile headline",
      "Rijksmuseum institutional website"
    ],
    "notes": "Clear role title and educational background visible in headline. Development roles are core museum functions."
  }
}
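
As a rough illustration, a block of this shape could be assembled and serialized as follows. The helper `build_exa_enrichment` is hypothetical and not part of any existing tooling; only the field names and the timestamp format are taken from the example above, and the range check reflects the 0.50-0.95 range given earlier.

```python
import json
from datetime import datetime, timezone


def build_exa_enrichment(score, sources, notes, when=None):
    """Return an entity fragment holding the exa_enrichment block (hypothetical helper)."""
    if not 0.50 <= score <= 0.95:
        raise ValueError(f"confidence_score {score} is outside 0.50-0.95")
    # Default to the current time; always render as UTC with a trailing "Z".
    when = when or datetime.now(timezone.utc)
    return {
        "exa_enrichment": {
            "confidence_score": score,
            "enrichment_date": when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "sources_consulted": list(sources),
            "notes": notes,
        }
    }


block = build_exa_enrichment(
    0.75,
    ["LinkedIn profile headline", "Rijksmuseum institutional website"],
    "Clear role title and educational background visible in headline.",
    when=datetime(2025, 12, 15, 12, 45, tzinfo=timezone.utc),
)
print(json.dumps(block, indent=2, ensure_ascii=False))
```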

Relationship to Other Scores

exa_enrichment.confidence_score vs heritage_sector_relevance.score

| Aspect | confidence_score | heritage_sector_relevance.score |
|---|---|---|
| What it measures | Data extraction quality | Domain expertise |
| Question answered | "How sure are we about this data?" | "How relevant is this person to heritage?" |
| High score means | Rich, verifiable profile data | Deep heritage sector expertise |
| Low score means | Sparse, uncertain data | Peripheral/support role |
| Example: IT Director | 0.90 (clear role, full data) | 0.45 (enabling role, not heritage-specific) |
| Example: Intern Curator | 0.50 (intern, limited data) | 0.65 (heritage role, limited experience) |

When Both Scores Are Used

{
  "exa_enrichment": {
    "confidence_score": 0.75,
    "notes": "Good extraction with clear role title"
  },
  "heritage_sector_relevance": {
    "score": 0.85,
    "primary_domain": "Archives",
    "assessment_notes": "Senior archivist with 10+ years experience"
  }
}

Quality Control

Minimum Thresholds

| Threshold | Action |
|---|---|
| < 0.50 | Flag for manual review, consider re-extraction |
| 0.50 to < 0.60 | Accept but note uncertainty in provenance |
| 0.60-0.75 | Standard acceptance |
| > 0.75 | High-quality record |
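
A minimal sketch of how these thresholds might be applied in a quality-control step. The function name `qc_action` is hypothetical. Where two ranges touch, the sketch assigns 0.60 and 0.75 to "Standard acceptance", which matches the "below 0.60" wording used for required documentation but is otherwise an assumption.

```python
def qc_action(score: float) -> str:
    """Return the quality-control action for a confidence score (hypothetical helper)."""
    # Integer hundredths keep boundary comparisons exact for values such as 0.60.
    hundredths = round(score * 100)
    if hundredths < 50:
        return "Flag for manual review, consider re-extraction"
    if hundredths < 60:
        return "Accept but note uncertainty in provenance"
    if hundredths <= 75:  # assumption: 0.60 and 0.75 both count as standard
        return "Standard acceptance"
    return "High-quality record"


for s in (0.45, 0.55, 0.60, 0.75, 0.80):
    print(f"{s:.2f}: {qc_action(s)}")
```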

Required Documentation

For scores below 0.60, the notes field MUST explain:

  1. Why the score is low
  2. What data is missing or uncertain
  3. Potential sources for verification
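
The documentation requirement can be partly checked mechanically. The sketch below only verifies that a non-empty notes field is present when the score is below 0.60; whether the notes actually cover the three points still needs a human reader. The function name `check_low_score_notes` is hypothetical, and the entity layout is assumed to match the exa_enrichment block shown earlier.

```python
def check_low_score_notes(entity: dict) -> list[str]:
    """List documentation problems for one entity record (hypothetical helper)."""
    enrichment = entity.get("exa_enrichment", {})
    score = enrichment.get("confidence_score")
    notes = (enrichment.get("notes") or "").strip()

    problems = []
    if score is None:
        problems.append("exa_enrichment.confidence_score is missing")
    elif round(score * 100) < 60 and not notes:
        # Scores below 0.60 must carry an explanation in the notes field.
        problems.append("scores below 0.60 require an explanatory notes field")
    return problems


low_without_notes = {"exa_enrichment": {"confidence_score": 0.50}}
low_with_notes = {
    "exa_enrichment": {
        "confidence_score": 0.50,
        "notes": "Abbreviated name; institution not in headline; verify via institutional staff page.",
    }
}
print(check_low_score_notes(low_without_notes))
print(check_low_score_notes(low_with_notes))  # []
```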

References

  • AGENTS.md: Rule 30 (Person Profile Extraction Confidence Scoring)
  • AGENTS.md: Rule 20 (Person Entity Profiles)
  • HERITAGE_SECTOR_RELEVANCE_SCORING.md: Domain expertise scoring (separate metric)
  • PERSON_ENTITY_PROFILE_FORMAT_RULE.md: Entity file structure
  • DATA_FABRICATION_PROHIBITION.md: Never fabricate data to increase confidence