glam/.opencode/CONNECTION_DATA_REGISTRATION_RULE.md
2025-12-10 13:01:13 +01:00

# Connection Data Registration Rule
## Rule: All Manually Recorded Connections MUST Be Fully Registered
**🚨 CRITICAL: When connection data is manually recorded for a person, ALL connections MUST be fully registered in a dedicated connections file in `data/custodian/person/`.**
This rule ensures:
- Complete network data preservation
- Single-source-of-truth for professional networks
- Cross-custodian relationship discovery
- Heritage sector network analysis capabilities
---
## Directory Structure
```
data/custodian/
├── person/
│   ├── {linkedin-slug}_{timestamp}.json              # Person profile
│   ├── {linkedin-slug}_connections_{timestamp}.json  # Connections file
│   └── ...
└── ...
```
---
## File Naming Convention for Connections Files
**Format**: `{linkedin-slug}_connections_{ISO-timestamp}.json`, where `{ISO-timestamp}` is a UTC timestamp in ISO 8601 basic format (`YYYYMMDDTHHMMSSZ`).
**Examples**:
```
alexandr-belov-bb547b46_connections_20251210T160000Z.json
giovanna-fossati_connections_20251211T140000Z.json
```
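
The naming convention above can be generated rather than typed by hand. The following is a minimal sketch; `connections_filename` is a hypothetical helper, not an existing project function, and it assumes the timestamp is always rendered in UTC as in the examples.

```python
from datetime import datetime, timezone


def connections_filename(linkedin_slug: str, scraped_at: datetime) -> str:
    """Build `{linkedin-slug}_connections_{ISO-timestamp}.json` (hypothetical helper)."""
    if scraped_at.tzinfo is None:
        # Refuse naive datetimes so the trailing "Z" is never a guess.
        raise ValueError("scraped_at must be timezone-aware")
    stamp = scraped_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{linkedin_slug}_connections_{stamp}.json"


print(connections_filename(
    "giovanna-fossati",
    datetime(2025, 12, 11, 14, 0, tzinfo=timezone.utc),
))
# prints: giovanna-fossati_connections_20251211T140000Z.json
```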
---
## Required Connection File Structure
```json
{
  "source_metadata": {
    "source_url": "https://www.linkedin.com/search/results/people/?...",
    "scraped_timestamp": "2025-12-10T16:00:00Z",
    "scrape_method": "manual_linkedin_browse",
    "pages_scraped": 12,
    "target_profile": "{linkedin-slug}",
    "target_name": "Full Name",
    "connection_count_reported": 94,
    "connections_extracted": 107,
    "note": "Explanation of any discrepancies"
  },
  "connections": [
    {
      "name": "Connection Name",
      "degree": "1st" | "2nd" | "3rd+",
      "headline": "Current role or description",
      "location": "City, Region, Country",
      "organization": "Primary organization",
      "organization_secondary": "Secondary organization (if applicable)",
      "mutual_connections": ["Person 1", "Person 2", "N others"],
      "followers": 1234,
      "heritage_relevant": true | false,
      "heritage_type": "A" | "L" | "M" | "R" | "E" | "D" | "G" | etc.,
      "note": "Optional context (e.g., 'Direct colleague at Eye Filmmuseum')"
    }
  ],
  "network_analysis": {
    "total_connections_extracted": 107,
    "heritage_relevant_count": 56,
    "heritage_relevant_percentage": 52.3,
    "connections_by_heritage_type": {
      "A": 10,
      "L": 16,
      "M": 7,
      "R": 13,
      "E": 18,
      "D": 1
    },
    "connections_by_degree": {
      "1st": 1,
      "2nd": 40,
      "3rd+": 66
    },
    "key_organizations": [...],
    "geographic_distribution": {...},
    "mutual_connection_hubs": [...]
  },
  "heritage_network_insights": {
    "primary_clusters": [...],
    "career_network_trace": [...],
    "strategic_value": "Analysis summary..."
  }
}
```
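
The `note` field in `source_metadata` exists to explain mismatches such as 94 reported versus 107 extracted. A small check along these lines could flag files where the counts differ but no explanation was recorded. This is a sketch under the assumption that an empty or missing `note` counts as unexplained; `discrepancy_needs_note` is a hypothetical name.

```python
def discrepancy_needs_note(source_metadata: dict) -> bool:
    """Return True when reported and extracted counts differ without a note.

    Hypothetical check; the rule itself only says the note should explain
    any discrepancies.
    """
    reported = source_metadata.get("connection_count_reported")
    extracted = source_metadata.get("connections_extracted")
    note = (source_metadata.get("note") or "").strip()
    return reported != extracted and not note


metadata = {"connection_count_reported": 94, "connections_extracted": 107}
print(discrepancy_needs_note(metadata))  # prints: True
```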
---
## Connection Entry Required Fields
### Minimum Required Fields
| Field | Type | Description |
|-------|------|-------------|
| `name` | string | Full name of connection |
| `degree` | string | Connection degree: `1st`, `2nd`, `3rd+` |
| `headline` | string | Current role/description |
| `heritage_relevant` | boolean | Is this person in heritage sector? |
### Recommended Fields (When Available)
| Field | Type | Description |
|-------|------|-------------|
| `location` | string | Geographic location |
| `organization` | string | Primary organization |
| `organization_secondary` | string | Secondary affiliation |
| `mutual_connections` | array | Shared connections |
| `followers` | integer | LinkedIn follower count |
| `heritage_type` | string | GLAMORCUBESFIXPHDNT code |
| `note` | string | Additional context |
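
The minimum required fields can be checked mechanically before a connections file is saved. The sketch below assumes entries are plain dictionaries loaded from JSON; `validate_connection`, `REQUIRED_FIELDS`, and `ALLOWED_DEGREES` are hypothetical names introduced for illustration.

```python
# Required fields and their expected Python types, taken from the table above.
REQUIRED_FIELDS = {
    "name": str,
    "degree": str,
    "headline": str,
    "heritage_relevant": bool,
}
ALLOWED_DEGREES = {"1st", "2nd", "3rd+"}


def validate_connection(entry: dict) -> list[str]:
    """Return a list of problems found in one connection entry (empty if valid)."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in entry:
            problems.append(f"missing required field: {field}")
        elif not isinstance(entry[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    degree = entry.get("degree")
    if isinstance(degree, str) and degree not in ALLOWED_DEGREES:
        problems.append(f"unknown degree: {degree}")
    return problems


print(validate_connection({"name": "Connection Name", "degree": "4th", "headline": "Role"}))
# prints: ['missing required field: heritage_relevant', 'unknown degree: 4th']
```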
---
## Heritage Type Codes for Connections
Use single-letter GLAMORCUBESFIXPHDNT codes:
| Code | Type |
|------|------|
| G | Gallery |
| L | Library |
| A | Archive |
| M | Museum |
| O | Official institution |
| R | Research center |
| C | Corporation |
| U | Unknown |
| B | Botanical/Zoo |
| E | Education provider |
| S | Collecting society |
| F | Feature custodian |
| I | Intangible heritage |
| X | Mixed types |
| P | Personal collection |
| H | Holy/sacred site |
| D | Digital platform |
| N | NGO |
| T | Taste/smell heritage |
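
For validation, the table above can be carried as a lookup. This is a minimal sketch; `HERITAGE_TYPES` and `describe_heritage_type` are hypothetical names, and combined values such as `A/R` (used for `key_organizations`) are deliberately not handled here.

```python
# Single-letter codes in GLAMORCUBESFIXPHDNT order, copied from the table above.
HERITAGE_TYPES = {
    "G": "Gallery",
    "L": "Library",
    "A": "Archive",
    "M": "Museum",
    "O": "Official institution",
    "R": "Research center",
    "C": "Corporation",
    "U": "Unknown",
    "B": "Botanical/Zoo",
    "E": "Education provider",
    "S": "Collecting society",
    "F": "Feature custodian",
    "I": "Intangible heritage",
    "X": "Mixed types",
    "P": "Personal collection",
    "H": "Holy/sacred site",
    "D": "Digital platform",
    "N": "NGO",
    "T": "Taste/smell heritage",
}


def describe_heritage_type(code: str) -> str:
    """Return the type label for a single-letter code, or raise ValueError."""
    try:
        return HERITAGE_TYPES[code]
    except KeyError:
        raise ValueError(f"not a GLAMORCUBESFIXPHDNT code: {code!r}") from None


print(describe_heritage_type("A"))  # prints: Archive
```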
---
## Referencing Connections from Person Profile
When a person profile file exists alongside a connections file, the person profile should reference the connections file through a `connections_file_path` field:
```json
{
  "exa_search_metadata": {...},
  "linkedin_profile_url": "...",
  "profile_data": {...},
  "connections_file_path": "data/custodian/person/alexandr-belov-bb547b46_connections_20251210T160000Z.json"
}
```
---
## Referencing from Custodian Files
Custodian files can reference both the person profile and the connections file:
```yaml
collection_management_specialist:
  - name: Alexandr Belov
    role: Collection/Information Specialist
    linkedin_url: https://www.linkedin.com/in/alexandr-belov-bb547b46
    current: true
    person_profile_path: data/custodian/person/alexandr-belov-bb547b46_20251210T120000Z.json
    person_connections_path: data/custodian/person/alexandr-belov-bb547b46_connections_20251210T160000Z.json
```
```
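
Because both paths embed the LinkedIn slug, a quick consistency check can catch a custodian entry that points at another person's connections file. The sketch below is an assumption-laden illustration: `same_person` and the two regular expressions are hypothetical, and they assume the `YYYYMMDDTHHMMSSZ` timestamp shape seen in the examples.

```python
import re
from pathlib import PurePosixPath

# Filename shapes inferred from the examples in this document (assumption).
PROFILE_RE = re.compile(r"^(?P<slug>.+)_\d{8}T\d{6}Z\.json$")
CONNECTIONS_RE = re.compile(r"^(?P<slug>.+)_connections_\d{8}T\d{6}Z\.json$")


def slug_of(path: str, pattern: re.Pattern) -> "str | None":
    """Extract the LinkedIn slug from a file path, or None if it does not match."""
    match = pattern.match(PurePosixPath(path).name)
    return match.group("slug") if match else None


def same_person(profile_path: str, connections_path: str) -> bool:
    """Return True when both paths carry the same LinkedIn slug."""
    profile_slug = slug_of(profile_path, PROFILE_RE)
    connections_slug = slug_of(connections_path, CONNECTIONS_RE)
    return profile_slug is not None and profile_slug == connections_slug


print(same_person(
    "data/custodian/person/alexandr-belov-bb547b46_20251210T120000Z.json",
    "data/custodian/person/alexandr-belov-bb547b46_connections_20251210T160000Z.json",
))  # prints: True
```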
---
## Network Analysis Section
The `network_analysis` section provides aggregate insights:
```json
"network_analysis": {
  "total_connections_extracted": 107,
  "heritage_relevant_count": 56,
  "heritage_relevant_percentage": 52.3,
  "connections_by_heritage_type": {...},
  "connections_by_degree": {...},
  "key_organizations": [
    {
      "name": "IISG / IISH",
      "connection_count": 11,
      "type": "A/R"
    }
  ],
  "geographic_distribution": {
    "Netherlands": 42,
    "Norway": 32
  },
  "mutual_connection_hubs": [
    {"name": "Johan Oomen", "mentions": 14}
  ]
}
```
```
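
Most of these aggregates can be derived from the `connections` array instead of being counted by hand; for instance, 56 heritage-relevant entries out of 107 gives the 52.3 shown above. The following is a rough sketch, not the project's actual tooling: `summarize_network` is a hypothetical name, it skips `key_organizations` and `geographic_distribution`, and it assumes placeholder entries like "N others" in `mutual_connections` end in " others" and should not count as hubs.

```python
from collections import Counter


def summarize_network(connections: list[dict], top_hubs: int = 5) -> dict:
    """Derive part of `network_analysis` from a list of connection entries."""
    total = len(connections)
    relevant = [c for c in connections if c.get("heritage_relevant")]
    # Count how often each named person appears as a mutual connection,
    # ignoring "<N> others" placeholders (assumed convention).
    hubs = Counter(
        name
        for c in connections
        for name in c.get("mutual_connections", [])
        if not name.endswith(" others")
    )
    percentage = round(100 * len(relevant) / total, 1) if total else 0.0
    return {
        "total_connections_extracted": total,
        "heritage_relevant_count": len(relevant),
        "heritage_relevant_percentage": percentage,
        "connections_by_heritage_type": dict(
            Counter(c["heritage_type"] for c in relevant if c.get("heritage_type"))
        ),
        # `degree` is a required field, so a missing key is treated as an error.
        "connections_by_degree": dict(Counter(c["degree"] for c in connections)),
        "mutual_connection_hubs": [
            {"name": name, "mentions": count}
            for name, count in hubs.most_common(top_hubs)
        ],
    }
```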
---
## Heritage Network Insights Section
Provide strategic analysis of heritage sector connections:
```json
"heritage_network_insights": {
  "primary_clusters": [
    {
      "cluster_name": "Dutch Social History Archives",
      "core_institution": "IISG/IISH",
      "connection_count": 11,
      "key_contacts": ["Contact 1", "Contact 2"]
    }
  ],
  "career_network_trace": [
    {
      "period": "2022-2025",
      "location": "Netherlands",
      "institutions": ["KNAW"],
      "network_legacy": "Strong KNAW Humanities Cluster connections"
    }
  ],
  "strategic_value": "Summary of network significance..."
}
```
```
---
## Eye Filmmuseum Colleagues Tracking
When extracting connections for heritage custodian staff, identify direct colleagues at the same institution, as in this Eye Filmmuseum example:
```json
"eye_filmmuseum_colleagues": [
  "Susan van Gelderen - Head of Film Related Collections/Eye Study",
  "Gerdien Smit - Policy Advisor and Researcher",
  "Maral Mohsenin - Director Collection & Knowledge Sharing",
  "Lou Burkart - Curator and Secretary General CCAAA"
]
```
This enables:
- Staff discovery for the parent custodian
- Department structure inference
- Network-based enrichment
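
A colleagues list of this shape could be drafted from the connection entries by matching on organization. This is only a sketch: `find_colleagues` is a hypothetical helper, it assumes the text after the dash is the entry's `headline`, and a substring match on the organization fields will need manual review for false positives.

```python
def find_colleagues(connections: list[dict], custodian_name: str) -> list[str]:
    """List "Name - headline" strings for entries whose organization matches."""
    needle = custodian_name.casefold()
    colleagues = []
    for entry in connections:
        organizations = (
            entry.get("organization") or "",
            entry.get("organization_secondary") or "",
        )
        # Case-insensitive substring match on either organization field.
        if any(needle in org.casefold() for org in organizations):
            colleagues.append(f"{entry['name']} - {entry['headline']}")
    return colleagues
```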
---
## Why This Rule Matters
1. **Complete Data Preservation**: Connections are expensive to extract (manual scraping, rate limits)
2. **Heritage Sector Mapping**: Understanding who knows whom in the heritage community
3. **Cross-Custodian Discovery**: Find staff who work at multiple institutions
4. **Network Analysis**: Identify key influencers and knowledge hubs
5. **Provenance Tracking**: Record when and how connections were extracted
---
## See Also
- `.opencode/PERSON_DATA_REFERENCE_PATTERN.md` - Person profile file patterns
- `AGENTS.md` - Rule 12: Person Data Reference Pattern
- `AGENTS.md` - Rule 15: Connection Data Registration (NEW)
- `data/custodian/person/alexandr-belov-bb547b46_connections_20251210T160000Z.json` - Reference implementation