glam/.opencode/ORGANIZATIONAL_SUBDIVISION_EXTRACTION.md
2025-12-15 22:31:41 +01:00

Rule 31: Organizational Subdivision Extraction

🚨 CRITICAL: When extracting person/staff affiliations, ALWAYS capture organizational subdivisions (departments, teams, units, divisions, sections) as structured data. This information is highly valuable for understanding institutional structure.

Why This Matters

Organizational subdivisions reveal:

  1. Institutional Structure: How heritage custodians organize their work
  2. Expertise Clustering: Where specific skills/knowledge are concentrated
  3. Contact Routing: Who to contact for specific inquiries
  4. Network Analysis: How teams connect across institutions
  5. Career Tracking: Movement between teams/departments over time

What to Extract

| Subdivision Type | Examples | Field Name |
|---|---|---|
| Department | Collections Department, Conservation Department | department |
| Team | Data Science Team, Digitization Team | team |
| Unit | Research Unit, Acquisitions Unit | unit |
| Division | Public Services Division, Technical Services | division |
| Section | Photographs Section, Maps Section | section |
| Lab/Center | Conservation Lab, Research Center | lab_or_center |
| Office | Director's Office, Communications Office | office |

Detection Patterns

LinkedIn Headlines

Parse subdivision indicators from headlines:

"Kadaster Data Science Team | BSc Artificial Intelligence UU"
         ↓
organization: Kadaster
team: Data Science Team

"Senior Curator, Asian Art Department | Rijksmuseum"
                ↓
organization: Rijksmuseum  
department: Asian Art Department

"Head of Conservation Lab | British Museum"
            ↓
organization: British Museum
lab_or_center: Conservation Lab

Keywords to Detect

| Language | Keywords |
|---|---|
| English | team, department, dept., division, unit, section, lab, laboratory, center, centre, office, group, branch |
| Dutch | team, afdeling, afd., divisie, eenheid, sectie, laboratorium, centrum, kantoor, groep |
| German | Team, Abteilung, Abt., Division, Einheit, Sektion, Labor, Zentrum, Büro, Gruppe |
| French | équipe, département, dép., division, unité, section, laboratoire, centre, bureau, groupe |
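
As a rough illustration of how the keywords above could be mapped onto the field names from "What to Extract", here is a minimal sketch. The names `KEYWORD_TO_TYPE` and `detect_subdivision_type` are hypothetical and not part of the project. The keywords "group" and "branch" (and their translations) are left unmapped on purpose, because this rule lists them as keywords but does not say which subdivision type they belong to.

```python
import re
from typing import Optional

# Keyword -> subdivision type, built from the keyword table above.
# Assumption: "group"/"branch"/"groep"/"Gruppe"/"groupe" are omitted because
# the rule defines no matching subdivision type for them.
KEYWORD_TO_TYPE = {
    # English (several entries also cover identical German/French spellings)
    "team": "team", "department": "department", "dept": "department",
    "division": "division", "unit": "unit", "section": "section",
    "lab": "lab_or_center", "laboratory": "lab_or_center",
    "center": "lab_or_center", "centre": "lab_or_center", "office": "office",
    # Dutch
    "afdeling": "department", "afd": "department", "divisie": "division",
    "eenheid": "unit", "sectie": "section", "laboratorium": "lab_or_center",
    "centrum": "lab_or_center", "kantoor": "office",
    # German
    "abteilung": "department", "abt": "department", "einheit": "unit",
    "sektion": "section", "labor": "lab_or_center", "zentrum": "lab_or_center",
    "büro": "office",
    # French
    "équipe": "team", "département": "department", "dép": "department",
    "unité": "unit", "laboratoire": "lab_or_center", "bureau": "office",
}


def detect_subdivision_type(text: str) -> Optional[str]:
    """Return the type of the first subdivision keyword in text, or None."""
    # \w+ is Unicode-aware in Python 3, so accented words stay whole and
    # abbreviations such as "dept." are matched without the trailing dot.
    for word in re.findall(r"\w+", text.lower()):
        if word in KEYWORD_TO_TYPE:
            return KEYWORD_TO_TYPE[word]
    return None
```

Matching on whole words keeps "Artificial" or "Intelligence" from triggering a false hit. Short keywords such as "labor" can still collide with ordinary English words, so a real implementation would probably need language detection or a review step.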

Data Structure

In Person Entity Files

{
  "profile_data": {
    "name": "Aron Noordhoek",
    "headline": "Kadaster Data Science Team | BSc Artificial Intelligence UU"
  },
  "affiliations": [
    {
      "custodian_name": "Kadaster",
      "custodian_slug": "kadaster",
      "role_title": "Data Science Team Member",
      "subdivision": {
        "type": "team",
        "name": "Data Science Team",
        "parent_subdivision": null,
        "extraction_source": "linkedin_headline"
      },
      "heritage_relevant": true,
      "heritage_type": "D",
      "current": true
    }
  ]
}

Nested Subdivisions

Some organizations have hierarchical subdivisions:

{
  "subdivision": {
    "type": "section",
    "name": "Photographs Section",
    "parent_subdivision": {
      "type": "department",
      "name": "Collections Department"
    },
    "extraction_source": "institutional_website"
  }
}
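
One way to work with the nested structure in code is a small recursive data class. This is only a sketch: the class name `Subdivision` and its `from_dict` and `path` helpers are hypothetical, and only the field names are taken from the JSON above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Subdivision:
    type: str
    name: str
    parent_subdivision: Optional["Subdivision"] = None
    extraction_source: Optional[str] = None

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "Subdivision":
        """Build a Subdivision (and any parents) from the JSON structure."""
        parent = data.get("parent_subdivision")
        return cls(
            type=data["type"],
            name=data["name"],
            parent_subdivision=cls.from_dict(parent) if parent else None,
            extraction_source=data.get("extraction_source"),
        )

    def path(self) -> list[str]:
        """Names from the outermost parent down to this subdivision."""
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent_subdivision
        return list(reversed(names))


nested = Subdivision.from_dict({
    "type": "section",
    "name": "Photographs Section",
    "parent_subdivision": {"type": "department", "name": "Collections Department"},
    "extraction_source": "institutional_website",
})
print(" > ".join(nested.path()))  # Collections Department > Photographs Section
```

Keeping the parent as a nested object rather than a flat string mirrors the JSON, so a hierarchy of any depth can be read without changing the schema.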

Extraction Workflow

1. PARSE headline/role text
   ↓
2. IDENTIFY subdivision keywords (team, department, etc.)
   ↓
3. EXTRACT subdivision name
   ↓
4. CLASSIFY subdivision type
   ↓
5. CHECK for parent subdivisions (if hierarchical)
   ↓
6. STORE in structured format
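
The steps above can be sketched for LinkedIn headlines roughly as follows. This is an illustration under several assumptions rather than the project's parser: the function name `extract_subdivision`, the English-only keyword map, and the list of role prefixes are all hypothetical, and the organization name is assumed to be known already from the custodian being processed.

```python
import re
from typing import Optional

# English keywords only, to keep the sketch short.
TYPE_BY_KEYWORD = {
    "department": "department", "team": "team", "unit": "unit",
    "division": "division", "section": "section", "lab": "lab_or_center",
    "laboratory": "lab_or_center", "center": "lab_or_center",
    "centre": "lab_or_center", "office": "office",
}
# Hypothetical role prefixes to strip, as in "Head of Conservation Lab".
ROLE_PREFIX = re.compile(r"^(head|director|manager|lead) of ", re.IGNORECASE)


def extract_subdivision(headline: str, organization: str) -> Optional[dict]:
    # 1. PARSE: split the headline into segments on "|" and ",".
    segments = [s.strip() for s in re.split(r"[|,]", headline)]
    for segment in segments:
        # 2. IDENTIFY: look for a subdivision keyword as a whole word.
        words = re.findall(r"\w+", segment.lower())
        keyword = next((w for w in words if w in TYPE_BY_KEYWORD), None)
        if keyword is None:
            continue
        # 3. EXTRACT: drop a leading organization name and any role prefix.
        name = segment
        if name.lower().startswith(organization.lower()):
            name = name[len(organization):].strip()
        name = ROLE_PREFIX.sub("", name).strip()
        if not name:
            continue
        # 4. CLASSIFY and 6. STORE. Step 5 (parent lookup) is left as None
        # here because a headline alone rarely carries hierarchy.
        return {
            "type": TYPE_BY_KEYWORD[keyword],
            "name": name,
            "parent_subdivision": None,
            "extraction_source": "linkedin_headline",
        }
    return None


print(extract_subdivision("Head of Conservation Lab | British Museum", "British Museum"))
```

Run against the three headline examples in "LinkedIn Headlines", this sketch yields "Data Science Team", "Asian Art Department", and "Conservation Lab" respectively. Headlines that follow other conventions would need additional handling.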

LinkUp Search Strategy

When using LinkUp to enrich profiles, specifically search for subdivision information:

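# Good queries for subdivision discovery
# Example values taken from this document; substitute the profile being enriched.
person_name = "Aron Noordhoek"
organization = "Kadaster"

queries = [
    f'"{person_name}" "{organization}" department team',
    f'"{organization}" organizational structure',
    f'"{organization}" staff directory departments',
]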

Examples from Real Data

Example 1: Kadaster Data Science Team

{
  "name": "Aron Noordhoek",
  "affiliations": [{
    "custodian_name": "Kadaster",
    "subdivision": {
      "type": "team",
      "name": "Data Science Team"
    }
  }]
}

Example 2: Museum Department

{
  "name": "Sarah Johnson",
  "affiliations": [{
    "custodian_name": "Rijksmuseum",
    "subdivision": {
      "type": "department", 
      "name": "Paintings Conservation Department"
    }
  }]
}

Example 3: Archive Unit

{
  "name": "Thomas van Berg",
  "affiliations": [{
    "custodian_name": "Nationaal Archief",
    "subdivision": {
      "type": "unit",
      "name": "Digital Preservation Unit",
      "parent_subdivision": {
        "type": "department",
        "name": "Collection Care"
      }
    }
  }]
}

Validation Rules

  1. Subdivision name MUST NOT be empty if type is specified
  2. Type MUST be one of: department, team, unit, division, section, lab_or_center, office
  3. extraction_source MUST be specified and MUST be one of: linkedin_headline, institutional_website, linkedin_experience, manual
  4. Parent subdivision (if any) MUST have valid type and name
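
The four rules above could be checked with a small validator along these lines. The function name `validate_subdivision` and the error messages are hypothetical. The sketch also assumes that `extraction_source` is only required on the top-level subdivision, since the nested example in this document omits it on the parent.

```python
from typing import Any

VALID_TYPES = {"department", "team", "unit", "division", "section",
               "lab_or_center", "office"}
VALID_SOURCES = {"linkedin_headline", "institutional_website",
                 "linkedin_experience", "manual"}


def validate_subdivision(sub: dict[str, Any], is_parent: bool = False) -> list[str]:
    """Return a list of rule violations; an empty list means the record is valid."""
    errors = []
    label = "parent_subdivision" if is_parent else "subdivision"

    # Rule 2: type must be one of the allowed values.
    if sub.get("type") not in VALID_TYPES:
        errors.append(f"{label}: invalid type {sub.get('type')!r}")

    # Rule 1: name must not be empty. Rule 2 makes type mandatory, so the
    # name is checked unconditionally.
    name = sub.get("name")
    if not isinstance(name, str) or not name.strip():
        errors.append(f"{label}: name is empty")

    # Rule 3: extraction_source is required (assumed: top level only).
    if not is_parent and sub.get("extraction_source") not in VALID_SOURCES:
        errors.append(f"{label}: missing or invalid extraction_source")

    # Rule 4: a parent, if present, must itself have a valid type and name.
    parent = sub.get("parent_subdivision")
    if parent is not None:
        errors.extend(validate_subdivision(parent, is_parent=True))
    return errors
```

Returning a list of messages instead of raising on the first failure lets a batch job report every problem in a record at once.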

Integration with Existing Rules

This rule complements:

  • Rule 12: Person Data Reference Pattern
  • Rule 18: Custodian Staff Parsing
  • Rule 20: Person Entity Profiles
  • Rule 27: Person-Custodian Data Architecture

Provenance

When subdivision info comes from web sources, include provenance:

{
  "subdivision": {
    "type": "department",
    "name": "Conservation Department",
    "provenance": {
      "source_url": "https://www.museum.org/about/staff",
      "retrieved_on": "2025-12-15T20:00:00Z",
      "retrieval_agent": "firecrawl"
    }
  }
}
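
A helper for producing the provenance block might look like the sketch below. The function name `make_provenance` is hypothetical. The timestamp format is taken from the example above, and `retrieved_on` is assumed to be a timezone-aware datetime when supplied.

```python
from datetime import datetime, timezone
from typing import Optional


def make_provenance(source_url: str, retrieval_agent: str,
                    retrieved_on: Optional[datetime] = None) -> dict[str, str]:
    """Build a provenance dict with a UTC timestamp such as 2025-12-15T20:00:00Z."""
    # Default to the current time; convert any supplied value to UTC.
    moment = retrieved_on or datetime.now(timezone.utc)
    return {
        "source_url": source_url,
        "retrieved_on": moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "retrieval_agent": retrieval_agent,
    }
```

Normalizing to UTC before formatting keeps timestamps comparable across agents that run in different time zones.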

See Also

  • schemas/20251121/linkml/modules/classes/OrganizationalUnit.yaml (if it exists)
  • .opencode/PERSON_CUSTODIAN_DATA_ARCHITECTURE.md
  • AGENTS.md Rule 31