glam/.opencode/PERSON_PROFILE_CONFIDENCE_SCORING.md
2025-12-15 22:31:41 +01:00

# Person Profile Extraction Confidence Scoring
**Version**: 1.0.0
**Created**: 2025-12-15
**Applies To**: Person entity profiles in `data/custodian/person/entity/`
---
## Purpose
This document defines the confidence scoring rubric for **profile extraction quality**: how confident we are that the extracted profile data is accurate and complete. It is distinct from `heritage_sector_relevance`, which measures domain expertise.
**Two Different Scores:**
| Field | Measures | Range |
|-------|----------|-------|
| `exa_enrichment.confidence_score` | Data extraction quality/completeness | 0.50-0.95 |
| `heritage_sector_relevance.score` | Domain expertise in heritage sector | 0.10-1.00 |
---
## Confidence Score Rubric
| Score Range | Level | Criteria | Examples |
|-------------|-------|----------|----------|
| **0.90-0.95** | High Confidence | Senior heritage role, clear title, named institution, verifiable details | "Director at Rijksmuseum", "Chief Curator at British Museum" |
| **0.75-0.85** | Good Confidence | Mid-level heritage role, good institutional context, clear affiliation | "Junior Development at Rijksmuseum \| MA Cultural Economics" |
| **0.60-0.70** | Moderate Confidence | Entry-level/support role, or technical role at heritage institution, limited details | "Staff at Internet Archive", "Stedelijk Museum Amsterdam" (no role) |
| **0.50-0.55** | Low Confidence | Intern, unclear relationship, privacy-abbreviated name, minimal data | "Intern at Museum", "Amy B." (abbreviated name) |
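The rubric can be read as a simple lookup from score to level. The following is a minimal sketch, not project code: the function name `confidence_level` is hypothetical, and it assumes that a score falling between two listed bands (for example 0.87) belongs to the lower band.

```python
def confidence_level(score: float) -> str:
    """Map a confidence score to its rubric level.

    Assumption: scores between listed bands fall into the lower band.
    """
    s = round(score, 2)  # avoid float noise such as 0.7500000001
    if not 0.50 <= s <= 0.95:
        raise ValueError(f"score {score} is outside the documented 0.50-0.95 range")
    if s >= 0.90:
        return "High Confidence"
    if s >= 0.75:
        return "Good Confidence"
    if s >= 0.60:
        return "Moderate Confidence"
    return "Low Confidence"


if __name__ == "__main__":
    for value in (0.95, 0.80, 0.65, 0.50):
        print(value, confidence_level(value))
```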
---
## Scoring Factors
### Factors That INCREASE Confidence
| Factor | Impact | Example |
|--------|--------|---------|
| **Clear job title visible** | +0.10 to +0.15 | "Curator", "Archivist", "Director" |
| **Named institution in headline** | +0.05 to +0.10 | "at Rijksmuseum", "at Internet Archive" |
| **Education degree visible** | +0.05 | "MA Cultural Economics", "PhD Art History" |
| **Seniority indicator** | +0.05 to +0.10 | "Senior", "Head of", "Director" |
| **Multiple data points** | +0.05 | Role + Education + Location |
| **Full name (not abbreviated)** | +0.05 | "Aliza Snoek" vs "Amy B." |
| **Specific department/team** | +0.05 | "Development Team", "Conservation Department" |
### Factors That DECREASE Confidence
| Factor | Impact | Example |
|--------|--------|---------|
| **No role title (institution only)** | -0.15 to -0.20 | Headline: "Stedelijk Museum Amsterdam" |
| **Generic "staff" title** | -0.10 | "staff at The Internet Archive" |
| **Privacy-abbreviated name** | -0.15 to -0.20 | "Amy B.", "J. Smith" |
| **Intern/trainee position** | -0.10 | "Intern", "Stagiair", "Trainee" |
| **No location data** | -0.05 | Location field is null |
| **403 privacy restriction** | -0.10 | Full profile unavailable |
| **Ambiguous affiliation** | -0.10 | Unclear which institution |
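To keep individual adjustments within the documented impacts, the two tables above could be encoded as data and checked mechanically. This is an illustrative sketch only: the snake_case keys and the helper `is_valid_adjustment` are hypothetical names, not identifiers from the project.

```python
# Documented impact ranges as (min, max), copied from the two factor tables.
# The keys are hypothetical identifiers chosen for this sketch.
FACTOR_RANGES = {
    "clear_job_title": (0.10, 0.15),
    "named_institution": (0.05, 0.10),
    "education_degree": (0.05, 0.05),
    "seniority_indicator": (0.05, 0.10),
    "multiple_data_points": (0.05, 0.05),
    "full_name": (0.05, 0.05),
    "specific_department": (0.05, 0.05),
    "no_role_title": (-0.20, -0.15),
    "generic_staff_title": (-0.10, -0.10),
    "abbreviated_name": (-0.20, -0.15),
    "intern_or_trainee": (-0.10, -0.10),
    "no_location": (-0.05, -0.05),
    "privacy_restriction_403": (-0.10, -0.10),
    "ambiguous_affiliation": (-0.10, -0.10),
}

EPSILON = 1e-9  # tolerance for float comparison


def is_valid_adjustment(factor: str, delta: float) -> bool:
    """Return True if delta lies within the documented range for factor."""
    low, high = FACTOR_RANGES[factor]
    return low - EPSILON <= delta <= high + EPSILON


if __name__ == "__main__":
    print(is_valid_adjustment("clear_job_title", 0.10))
    print(is_valid_adjustment("no_location", -0.10))
```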
---
## Score Calculation Examples
### Example 1: Score 0.85 (Good Confidence)
```
Profile: "Junior Development Rijksmuseum | MA Cultural Economics"
Base score: 0.65
+ Clear role title ("Junior Development"): +0.10
+ Named institution ("Rijksmuseum"): +0.05
+ Education visible ("MA Cultural Economics"): +0.05
= Final score: 0.85
```
### Example 2: Score 0.65 (Moderate Confidence)
```
Profile: "staff at The Internet Archive"
Base score: 0.65
+ Named institution ("Internet Archive"): +0.05
- Generic title ("staff"): -0.10
+ Full name visible: +0.05
= Final score: 0.65
```
### Example 3: Score 0.60 (Moderate Confidence)
```
Profile: "Stedelijk Museum Amsterdam" (no role)
Base score: 0.65
+ Named institution: +0.05
- No role title visible: -0.15
+ Full name visible: +0.05
= Final score: 0.60
```
### Example 4: Score 0.50 (Low Confidence)
```
Profile: "Intern at Kröller-Müller Museum"
Base score: 0.65
+ Named institution: +0.05
- Intern position: -0.10
- 403 privacy restriction: -0.10
= Final score: 0.50
```
### Example 5: Score 0.50 (Low Confidence - Abbreviated Name)
```
Profile: "Amy B. - Film Archivist"
Base score: 0.65
+ Clear role title: +0.10
- Abbreviated name: -0.20
- No institution in headline: -0.05
= Final score: 0.50
```
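Every worked example starts from a base of 0.65 and sums signed adjustments. The arithmetic can be sketched as below, under stated assumptions: `compute_confidence` is a hypothetical helper, the result is rounded to two decimals to absorb floating-point noise, and no clamping is applied because this document does not say how out-of-range results should be handled.

```python
from typing import Iterable

BASE_SCORE = 0.65  # base used in the worked examples


def compute_confidence(adjustments: Iterable[float]) -> float:
    """Sum signed adjustments onto the base score, rounded to 2 decimals.

    Assumption: no clamping; the caller decides what to do with results
    outside the documented 0.50-0.95 range.
    """
    return round(BASE_SCORE + sum(adjustments), 2)


if __name__ == "__main__":
    # Example 2: named institution, generic "staff" title, full name
    print(compute_confidence([0.05, -0.10, 0.05]))
    # Example 4: named institution, intern, 403 privacy restriction
    print(compute_confidence([0.05, -0.10, -0.10]))
    # Example 5: clear role, abbreviated name, no institution in headline
    print(compute_confidence([0.10, -0.20, -0.05]))
```

Rounding matters here because sums of decimal fractions such as 0.05 and 0.10 are not exact in binary floating point; `decimal.Decimal` would be an alternative if exact arithmetic is preferred.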
---
## Implementation in Entity Files
The confidence score is stored in the `exa_enrichment` block:
```json
{
  "exa_enrichment": {
    "confidence_score": 0.75,
    "enrichment_date": "2025-12-15T12:45:00Z",
    "sources_consulted": [
      "LinkedIn profile headline",
      "Rijksmuseum institutional website"
    ],
    "notes": "Clear role title and educational background visible in headline. Development roles are core museum functions."
  }
}
```
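A block with this shape could be assembled programmatically along the following lines. This is a sketch, not the project's tooling: `build_exa_enrichment` is a hypothetical helper, and it assumes the timestamp is always written in UTC with a trailing `Z`, as in the example above.

```python
import json
from datetime import datetime, timezone
from typing import Optional, Sequence


def build_exa_enrichment(
    score: float,
    sources: Sequence[str],
    notes: str,
    when: Optional[datetime] = None,
) -> dict:
    """Build an exa_enrichment block matching the layout shown above.

    Assumption: `when` is timezone-aware; it is converted to UTC.
    """
    moment = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "exa_enrichment": {
            "confidence_score": round(score, 2),
            "enrichment_date": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "sources_consulted": list(sources),
            "notes": notes,
        }
    }


if __name__ == "__main__":
    block = build_exa_enrichment(
        0.75,
        ["LinkedIn profile headline", "Rijksmuseum institutional website"],
        "Clear role title and educational background visible in headline.",
        when=datetime(2025, 12, 15, 12, 45, tzinfo=timezone.utc),
    )
    print(json.dumps(block, indent=2, ensure_ascii=False))
```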
---
## Relationship to Other Scores
### exa_enrichment.confidence_score vs heritage_sector_relevance.score
| Aspect | confidence_score | heritage_sector_relevance.score |
|--------|------------------|--------------------------------|
| **What it measures** | Data extraction quality | Domain expertise |
| **Question answered** | "How sure are we about this data?" | "How relevant is this person to heritage?" |
| **High score means** | Rich, verifiable profile data | Deep heritage sector expertise |
| **Low score means** | Sparse, uncertain data | Peripheral/support role |
| **Example: IT Director** | 0.90 (clear role, full data) | 0.45 (enabling role, not heritage-specific) |
| **Example: Intern Curator** | 0.50 (intern, limited data) | 0.65 (heritage role, limited experience) |
### When Both Scores Are Used
```json
{
  "exa_enrichment": {
    "confidence_score": 0.75,
    "notes": "Good extraction with clear role title"
  },
  "heritage_sector_relevance": {
    "score": 0.85,
    "primary_domain": "Archives",
    "assessment_notes": "Senior archivist with 10+ years experience"
  }
}
```
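Because the two scores live in separate blocks, a consumer can read them independently. The sketch below uses a hypothetical helper, `read_scores`, and assumes that a missing block or field should yield `None` rather than an error.

```python
from typing import Optional, Tuple


def read_scores(entity: dict) -> Tuple[Optional[float], Optional[float]]:
    """Return (confidence_score, relevance_score) from an entity dict.

    Assumption: absent blocks or fields are reported as None.
    """
    confidence = entity.get("exa_enrichment", {}).get("confidence_score")
    relevance = entity.get("heritage_sector_relevance", {}).get("score")
    return confidence, relevance


if __name__ == "__main__":
    entity = {
        "exa_enrichment": {"confidence_score": 0.75},
        "heritage_sector_relevance": {"score": 0.85},
    }
    print(read_scores(entity))
```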
---
## Quality Control
### Minimum Thresholds
| Threshold | Action |
|-----------|--------|
| **< 0.50** | Flag for manual review; consider re-extraction |
| **0.50 to < 0.60** | Accept, but note the uncertainty in provenance |
| **0.60-0.75** | Standard acceptance |
| **> 0.75** | High-quality record |
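The thresholds can be applied as a small decision function. In this sketch the function name and the returned labels are hypothetical, and one boundary choice is an assumption: a score of exactly 0.60 is treated as standard acceptance, since the documentation requirement in this section applies only to scores below 0.60.

```python
def review_action(score: float) -> str:
    """Map a confidence score to a quality-control action.

    Assumptions: 0.60 counts as standard acceptance, and 0.75 is the
    top of the standard band (only scores above 0.75 are high quality).
    """
    s = round(score, 2)  # avoid float noise at the band edges
    if s < 0.50:
        return "flag_for_manual_review"
    if s < 0.60:
        return "accept_with_uncertainty_note"
    if s <= 0.75:
        return "standard_acceptance"
    return "high_quality_record"


if __name__ == "__main__":
    for value in (0.45, 0.55, 0.60, 0.75, 0.80):
        print(value, review_action(value))
```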
### Required Documentation
For scores **below 0.60**, the `notes` field MUST explain:
1. Why the score is low
2. What data is missing or uncertain
3. Potential sources for verification
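Part of this requirement can be checked automatically. The sketch below, with the hypothetical helper `missing_required_notes`, only detects a missing or blank `notes` field on low-scoring records; whether the text actually covers the three points above still needs human review.

```python
def missing_required_notes(entity: dict) -> bool:
    """Return True if the score is below 0.60 and `notes` is absent or blank.

    Assumption: records without a confidence score are not checked here.
    """
    block = entity.get("exa_enrichment", {})
    score = block.get("confidence_score")
    if score is None or round(score, 2) >= 0.60:
        return False
    notes = block.get("notes")
    return not (isinstance(notes, str) and notes.strip())


if __name__ == "__main__":
    record = {"exa_enrichment": {"confidence_score": 0.50, "notes": ""}}
    print(missing_required_notes(record))
```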
---
## References
- **AGENTS.md**: Rule 30 (Person Profile Extraction Confidence Scoring)
- **AGENTS.md**: Rule 20 (Person Entity Profiles)
- **HERITAGE_SECTOR_RELEVANCE_SCORING.md**: Domain expertise scoring (separate metric)
- **PERSON_ENTITY_PROFILE_FORMAT_RULE.md**: Entity file structure
- **DATA_FABRICATION_PROHIBITION.md**: Never fabricate data to increase confidence