# Rule 59: LinkML Union Types Require `range: Any`

🚨 **CRITICAL**: When using `any_of` for union types in LinkML, you MUST also specify `range: Any` at the attribute level. Without it, the union type validation does NOT work.

## The Problem

LinkML's `any_of` construct allows defining slots that accept multiple types (e.g., string OR integer). However, there's a critical implementation detail:

**Without `range: Any`, the `any_of` constraint is silently ignored during validation.**

This leads to validation failures where data that should be valid (e.g., an integer value in a string/integer union field) is rejected.

## Correct Pattern

```yaml
slots:
  identifier_value:
    range: Any  # ← REQUIRED for any_of to work
    any_of:
      - range: string
      - range: integer
    description: The identifier value (can be string or integer)
```

## Incorrect Pattern (WILL FAIL)

```yaml
slots:
  identifier_value:
    # Missing range: Any - validation will fail!
    any_of:
      - range: string
      - range: integer
    description: The identifier value (can be string or integer)
```

## Common Use Cases

This pattern is required for:

| Use Case | Types | Example Fields |
|----------|-------|----------------|
| Identifier values | string \| integer | `identifier_value`, `geonames_id`, `viaf_id` |
| Social media IDs | string \| array | `youtube_channel_id`, `facebook_id`, `twitter_username` |
| Flexible identifiers | object \| array | `identifiers` (dict or list format) |
| Numeric strings | string \| integer | `postal_code`, `kvk_number` |

## Real-World Examples from GLAM Schema

### Example 1: OriginalEntryIdentifier.yaml

```yaml
# Before (BROKEN):
attributes:
  identifier_value:
    any_of:
      - range: string
      - range: integer

# After (WORKING):
attributes:
  identifier_value:
    range: Any  # Added
    any_of:
      - range: string
      - range: integer
```

### Example 2: WikidataSocialMedia.yaml

```yaml
# Social media fields that can be single value or array
attributes:
  youtube_channel_id:
    range: Any  # Required for string|array union
    any_of:
      - range: string
      - range: string
        multivalued: true
    description: YouTube channel ID (single value or array)
  facebook_id:
    range: Any
    any_of:
      - range: string
      - range: string
        multivalued: true
```

### Example 3: OriginalEntry.yaml (object|array union)

```yaml
# identifiers field that accepts both dict and array formats
attributes:
  identifiers:
    range: Any  # Required for flexible typing
    description: >-
      Identifiers from original source. Accepts both dict format
      (e.g., {isil: "XX-123"}) and array format
      (e.g., [{scheme: "isil", value: "XX-123"}])
```

### Example 4: OriginalEntryLocation.yaml

```yaml
attributes:
  geonames_id:
    range: Any  # Required for string|integer
    any_of:
      - range: string
      - range: integer
    description: GeoNames ID (may be string or integer depending on source)
```

## Validation Behavior

| Schema Definition | Integer Data | String Data | Result |
|-------------------|--------------|-------------|--------|
| `range: string` | ❌ FAIL | ✅ PASS | Strict string only |
| `range: integer` | ✅ PASS | ❌ FAIL | Strict integer only |
| `any_of` without `range: Any` | ❌ FAIL | ❌ FAIL | Broken - nothing works |
| `any_of` with `range: Any` | ✅ PASS | ✅ PASS | Correct union behavior |

## Why This Happens

LinkML's validation engine processes `range` first to determine the basic type constraint. When `range` is omitted, the slot falls back to the schema's default range (typically `string`), and that constraint is applied before `any_of` is checked. Setting `range: Any` tells the validator to defer type checking to the `any_of` constraints.
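
The rule above (every `any_of` needs a sibling `range: Any`) lends itself to a mechanical check. The following is a minimal sketch, not part of LinkML or the GLAM tooling: the function name `find_unions_missing_any` is hypothetical, the class name `OriginalEntryLocation` is assumed from the file name, and the schema is assumed to be already parsed into a plain Python dict. It only inspects top-level `slots` and each class's `attributes` and `slot_usage`.

```python
from typing import Any


def find_unions_missing_any(schema: dict[str, Any]) -> list[str]:
    """Return dotted paths of definitions that use `any_of` without `range: Any`."""
    problems: list[str] = []

    def check(path: str, definition: Any) -> None:
        # Only dict-shaped definitions can carry an any_of block.
        if isinstance(definition, dict) and "any_of" in definition:
            if definition.get("range") != "Any":
                problems.append(path)

    # Top-level slot definitions.
    for name, definition in (schema.get("slots") or {}).items():
        check(f"slots.{name}", definition)

    # Class-level attributes and slot_usage overrides.
    for class_name, class_def in (schema.get("classes") or {}).items():
        if not isinstance(class_def, dict):
            continue
        for section in ("attributes", "slot_usage"):
            for name, definition in (class_def.get(section) or {}).items():
                check(f"classes.{class_name}.{section}.{name}", definition)

    return problems


# Hypothetical schema fragment: one broken slot, one correct attribute.
example_schema = {
    "slots": {
        "identifier_value": {
            "any_of": [{"range": "string"}, {"range": "integer"}],
        },
    },
    "classes": {
        "OriginalEntryLocation": {  # class name assumed for illustration
            "attributes": {
                "geonames_id": {
                    "range": "Any",
                    "any_of": [{"range": "string"}, {"range": "integer"}],
                },
            },
        },
    },
}

print(find_unions_missing_any(example_schema))
# → ['slots.identifier_value']
```

In practice the dict would likely come from loading each schema module with a YAML parser (for example `yaml.safe_load` from PyYAML). The check is purely structural, so it complements rather than replaces testing with sample data of each type.
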
## Checklist for Union Types

When adding a field that accepts multiple types:

- [ ] Define the `any_of` block with all acceptable ranges
- [ ] Add `range: Any` at the same level as `any_of`
- [ ] Test with sample data of each type
- [ ] Document the accepted types in the description

## See Also

- LinkML Documentation: [Union Types](https://linkml.io/linkml/schemas/advanced.html#union-types)
- GLAM Validation: `schemas/20251121/linkml/modules/classes/CustodianSourceFile.yaml`
- Validation command: `linkml-validate -s .yaml .yaml`

## Migration Notes

**Affected Files (Fixed January 2026)**:

- `OriginalEntryIdentifier.yaml` - `identifier_value`
- `Identifier.yaml` - `identifier_value` slot_usage
- `WikidataSocialMedia.yaml` - `youtube_channel_id`, `facebook_id`, `instagram_username`, `linkedin_company_id`, `twitter_username`, `facebook_page_id`
- `YoutubeEnrichment.yaml` - `channel_id`
- `OriginalEntryLocation.yaml` - `geonames_id`
- `OriginalEntry.yaml` - `identifiers`

---

**Version**: 1.0
**Created**: 2026-01-18
**Author**: AI Agent (OpenCode Claude)