# Rule 31: Organizational Subdivision Extraction **🚨 CRITICAL: When extracting person/staff affiliations, ALWAYS capture organizational subdivisions (departments, teams, units, divisions, sections) as structured data. This information is highly valuable for understanding institutional structure.** ## Why This Matters Organizational subdivisions reveal: 1. **Institutional Structure**: How heritage custodians organize their work 2. **Expertise Clustering**: Where specific skills/knowledge are concentrated 3. **Contact Routing**: Who to contact for specific inquiries 4. **Network Analysis**: How teams connect across institutions 5. **Career Tracking**: Movement between teams/departments over time ## What to Extract | Subdivision Type | Examples | Field Name | |------------------|----------|------------| | **Department** | Collections Department, Conservation Department | `department` | | **Team** | Data Science Team, Digitization Team | `team` | | **Unit** | Research Unit, Acquisitions Unit | `unit` | | **Division** | Public Services Division, Technical Services | `division` | | **Section** | Photographs Section, Maps Section | `section` | | **Lab/Center** | Conservation Lab, Research Center | `lab_or_center` | | **Office** | Director's Office, Communications Office | `office` | ## Detection Patterns ### LinkedIn Headlines Parse subdivision indicators from headlines: ``` "Kadaster Data Science Team | BSc Artificial Intelligence UU" ↓ organization: Kadaster team: Data Science Team "Senior Curator, Asian Art Department | Rijksmuseum" ↓ organization: Rijksmuseum department: Asian Art Department "Head of Conservation Lab | British Museum" ↓ organization: British Museum lab_or_center: Conservation Lab ``` ### Keywords to Detect | Language | Keywords | |----------|----------| | **English** | team, department, dept., division, unit, section, lab, laboratory, center, centre, office, group, branch | | **Dutch** | team, afdeling, afd., divisie, eenheid, sectie, laboratorium, centrum, kantoor, groep | | **German** | Team, Abteilung, Abt., Division, Einheit, Sektion, Labor, Zentrum, Büro, Gruppe | | **French** | équipe, département, dép., division, unité, section, laboratoire, centre, bureau, groupe | ## Data Structure ### In Person Entity Files ```json { "profile_data": { "name": "Aron Noordhoek", "headline": "Kadaster Data Science Team | BSc Artificial Intelligence UU" }, "affiliations": [ { "custodian_name": "Kadaster", "custodian_slug": "kadaster", "role_title": "Data Science Team Member", "subdivision": { "type": "team", "name": "Data Science Team", "parent_subdivision": null, "extraction_source": "linkedin_headline" }, "heritage_relevant": true, "heritage_type": "D", "current": true } ] } ``` ### Nested Subdivisions Some organizations have hierarchical subdivisions: ```json { "subdivision": { "type": "section", "name": "Photographs Section", "parent_subdivision": { "type": "department", "name": "Collections Department" }, "extraction_source": "institutional_website" } } ``` ## Extraction Workflow ``` 1. PARSE headline/role text ↓ 2. IDENTIFY subdivision keywords (team, department, etc.) ↓ 3. EXTRACT subdivision name ↓ 4. CLASSIFY subdivision type ↓ 5. CHECK for parent subdivisions (if hierarchical) ↓ 6. STORE in structured format ``` ## LinkUp Search Strategy When using LinkUp to enrich profiles, specifically search for subdivision information: ```python # Good queries for subdivision discovery queries = [ f'"{person_name}" "{organization}" department team', f'"{organization}" organizational structure', f'"{organization}" staff directory departments', ] ``` ## Examples from Real Data ### Example 1: Kadaster Data Science Team ```json { "name": "Aron Noordhoek", "affiliations": [{ "custodian_name": "Kadaster", "subdivision": { "type": "team", "name": "Data Science Team" } }] } ``` ### Example 2: Museum Department ```json { "name": "Sarah Johnson", "affiliations": [{ "custodian_name": "Rijksmuseum", "subdivision": { "type": "department", "name": "Paintings Conservation Department" } }] } ``` ### Example 3: Archive Unit ```json { "name": "Thomas van Berg", "affiliations": [{ "custodian_name": "Nationaal Archief", "subdivision": { "type": "unit", "name": "Digital Preservation Unit", "parent_subdivision": { "type": "department", "name": "Collection Care" } } }] } ``` ## Validation Rules 1. **Subdivision name MUST NOT be empty** if type is specified 2. **Type MUST be one of**: department, team, unit, division, section, lab_or_center, office 3. **extraction_source MUST be specified**: linkedin_headline, institutional_website, linkedin_experience, manual 4. **Parent subdivision (if any) MUST have valid type and name** ## Integration with Existing Rules This rule complements: - **Rule 12**: Person Data Reference Pattern - **Rule 18**: Custodian Staff Parsing - **Rule 20**: Person Entity Profiles - **Rule 27**: Person-Custodian Data Architecture ## Provenance When subdivision info comes from web sources, include provenance: ```json { "subdivision": { "type": "department", "name": "Conservation Department", "provenance": { "source_url": "https://www.museum.org/about/staff", "retrieved_on": "2025-12-15T20:00:00Z", "retrieval_agent": "firecrawl" } } } ``` ## See Also - `schemas/20251121/linkml/modules/classes/OrganizationalUnit.yaml` (if exists) - `.opencode/PERSON_CUSTODIAN_DATA_ARCHITECTURE.md` - `AGENTS.md` Rule 31