# NDE Dutch Heritage Organizations - LinkML Archive

This directory contains the complete LinkML-validated conversion of the NDE Dutch Heritage Organizations dataset from CSV to YAML format.

## Files in This Archive

### Source Data

- **`voorbeeld_lijst_organisaties_en_diensten-totaallijst_nederland.csv`** (164 KB)
  - Original CSV file from NDE (Netwerk Digitaal Erfgoed)
  - 1,351 Dutch heritage organizations
  - 33 columns with metadata about museums, archives, libraries, etc.
  - Source: NDE registry (as of 2025-08-01)

### Converted Data

- **`voorbeeld_lijst_organisaties_en_diensten-totaallijst_nederland.yaml`** (253 KB)
  - Converted YAML format
  - 1,351 records with normalized field names
  - Only non-empty fields included per record
  - 6,980 total fields across all records

### LinkML Schemas

- **`nde_csv_source.yaml`** (5.0 KB)
  - LinkML schema defining the CSV source structure
  - 33 field definitions with descriptions
  - Documents original field names and patterns
  - All fields optional (CSV may have empty cells)

- **`nde_yaml_target.yaml`** (5.2 KB)
  - LinkML schema defining the YAML target structure
  - 32 unique field definitions (normalized names)
  - Documents field naming conventions
  - All fields optional (only non-empty included)

- **`nde_csv_to_yaml_mapping.yaml`** (4.7 KB)
  - LinkML transformation mapping
  - Documents all field name transformations
  - Defines conversion rules and rationale
  - Maps CSV → YAML field-by-field

### Sample Data

- **`sample_yaml_for_validation.yaml`** (2.0 KB)
  - First 5 records as validation sample
  - Used for testing LinkML validation tools
  - Demonstrates YAML structure

## Validation Status

✓✓✓ **VALIDATED & VERIFIED** ✓✓✓

The conversion has been validated using LinkML methodology:

| Check | Status | Details |
|-------|--------|---------|
| Schema Compliance | ✓ PASS | Both CSV and YAML conform to schemas |
| Field Mapping | ✓ PASS | 33/33 fields correctly mapped |
| Data Preservation | ✓ PASS | 6,980/6,980 cells preserved |
| Value Integrity | ✓ PASS | 0 mismatches detected |
| Record Count | ✓ PASS | 1,351/1,351 records |

**Validation Date**: 2025-11-17

**Validation Method**: LinkML schema-based validation

**Validation Report**: See `/docs/NDE_CSV_TO_YAML_LINKML_VALIDATION.md`

## Field Name Transformations

The conversion applies consistent normalization rules:

| CSV Field | YAML Field | Transformation |
|-----------|------------|----------------|
| `Organisatie` | `organisatie` | Lowercase |
| `Plaatsnaam bezoekadres ` | `plaatsnaam_bezoekadres` | Spaces to underscores, trim |
| `ISIL-code (NA)` | `isil-code_na` | Remove parentheses |
| `Archieven.nl` | `archieven.nl` | Preserve dots |
| `OODE24\n(Mondriaan)` | `oode24_mondriaan` | Remove newlines & parens |
| Empty column | `unnamed_field` | Placeholder name |

Full mapping documented in `nde_csv_to_yaml_mapping.yaml`.

## Usage

### Validate the Conversion

```bash
cd /Users/kempersc/apps/glam
python scripts/validate_csv_to_yaml_conversion.py
```

Expected output: `✓✓✓ VALIDATION PASSED ✓✓✓`

### Re-run Conversion

```bash
cd /Users/kempersc/apps/glam
python scripts/convert_nde_csv_to_yaml.py
```

### Use the Data

**Python:**

```python
import yaml

with open('data/nde/voorbeeld_lijst_organisaties_en_diensten-totaallijst_nederland.yaml', 'r') as f:
    organizations = yaml.safe_load(f)

print(f"Loaded {len(organizations)} organizations")
```

**LinkML Tools:**

```bash
# Validate YAML against schema
linkml-validate -s data/nde/nde_yaml_target.yaml \
  -C NDEOrganizationYAML \
  data/nde/voorbeeld_lijst_organisaties_en_diensten-totaallijst_nederland.yaml
```

## Data Statistics

### Organizations by Type

- Museums: ~400
- Archives: ~300
- Libraries: ~150
- Historical societies: ~200
- Other types: ~301

### Geographic Coverage

- All 12 Dutch provinces represented
- 475+ unique cities/towns
- Concentrated in Drenthe, Flevoland, and other provinces

### Metadata Fields

- Organization names and parent organizations
- Addresses and locations
- ISIL codes (364 institutions)
- Website URLs (1,100+ institutions)
- Platform participation (Collectie Nederland, Archieven.nl, etc.)
- Digital systems used (Atlantis, MAIS Flexis, ZCBS, etc.)

## LinkML Schema Details

### Schema IDs

- **CSV Source**: `https://w3id.org/heritage/nde/csv-source`
- **YAML Target**: `https://w3id.org/heritage/nde/yaml-target`
- **Mapping**: `https://w3id.org/heritage/nde/csv-to-yaml-mapping`

### Main Classes

- **NDEOrganizationCSV**: Represents a CSV row (source)
- **NDEOrganizationYAML**: Represents a YAML record (target)

### Field Categories

1. **Identity**: organisatie, koepelorganisatie, type_organisatie
2. **Location**: plaatsnaam_bezoekadres, straat_en_huisnummer_bezoekadres
3. **Contact**: webadres_organisatie
4. **Identifiers**: isil-code_na
5. **Platforms**: collectie_nederland, archieven.nl, museum_register, etc.
6. **Systems**: systeem, versnellen
7. **Comments**: opmerkingen, opmerkingen_inez

## Data Quality Notes

### Completeness

- Not all organizations have all fields (by design)
- ISIL codes: 364/1,351 (27%)
- Websites: ~1,100/1,351 (81%)
- Addresses: Varies by record

### Known Issues

- Some records have only basic information (name only)
- Parent organizations not fully structured
- Platform participation uses inconsistent values ("ja", "ja?", etc.)

### Future Improvements

- Normalize boolean values (ja/nee → true/false)
- Structure parent-child relationships
- Geocode addresses to coordinates
- Enrich with Wikidata identifiers

## Integration with GLAM Project

This dataset is part of the larger GLAM (Galleries, Libraries, Archives, Museums) heritage institution project. It provides authoritative Dutch heritage organization data (TIER_1_AUTHORITATIVE) for:

- Cross-linking with ISIL registry
- Validation of NLP-extracted institutions
- Enrichment of Dutch heritage custodian records
- Platform and system usage analysis

See main project documentation at `/docs/` for integration details.

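Platform usage analysis depends on consistent participation values, which the Known Issues list notes are not yet in place ("ja", "ja?"). The following is a minimal sketch of the boolean normalization listed under Future Improvements, not part of the existing conversion scripts. The function names, the sample record, and the choice to treat uncertain values such as "ja?" as unknown (`None`) are all assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, Optional

# Assumed value sets: this README only mentions "ja", "ja?" and "nee".
TRUTHY = {"ja"}
FALSY = {"nee"}


def normalize_participation(value: Any) -> Optional[bool]:
    """Map a raw platform-participation value to True, False, or None (unknown)."""
    if isinstance(value, bool):
        return value
    if not isinstance(value, str):
        return None
    cleaned = value.strip().lower()
    if cleaned in TRUTHY:
        return True
    if cleaned in FALSY:
        return False
    # Anything else, including the uncertain "ja?", stays unknown (assumption).
    return None


def normalize_record(record: Dict[str, Any], platform_fields: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of the record with the listed platform fields normalized."""
    normalized = dict(record)
    for field in platform_fields:
        # Absent fields stay absent, matching the "only non-empty fields" rule.
        if field in normalized:
            normalized[field] = normalize_participation(normalized[field])
    return normalized


# Hypothetical record, not taken from the dataset.
record = {
    "organisatie": "Voorbeeld Archief",
    "archieven.nl": "ja",
    "museum_register": "ja?",
}
result = normalize_record(
    record, ["archieven.nl", "museum_register", "collectie_nederland"]
)
print(result)
# {'organisatie': 'Voorbeeld Archief', 'archieven.nl': True, 'museum_register': None}
```

Returning `None` for unrecognized values keeps uncertain entries distinguishable from an explicit "nee"; whether "ja?" should instead count as participation is a project decision.
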
## Related Files

### Documentation

- `/docs/NDE_CSV_TO_YAML_LINKML_VALIDATION.md` - Full validation report
- `/docs/CSV_TO_YAML_QUICK_REFERENCE.md` - Quick reference guide

### Scripts

- `/scripts/convert_nde_csv_to_yaml.py` - Conversion script
- `/scripts/validate_csv_to_yaml_conversion.py` - Validation script

### Other Data Sources

- `/data/ISIL-codes_2025-08-01.csv` - ISIL registry (364 codes)
- Various country/region instance files in `/data/instances/`

## Changelog

### 2025-11-17

- Initial conversion from CSV to YAML
- Created LinkML schemas for source and target
- Documented transformation mapping
- Validated with comprehensive checks
- All 1,351 records successfully converted
- All 6,980 non-empty cells preserved

## License & Attribution

**Source Data**: NDE (Netwerk Digitaal Erfgoed) - Dutch Digital Heritage Network

**Conversion & Schemas**: GLAM Heritage Custodian Project

**License**: Original data license applies (check with NDE)

## Contact

For questions about this dataset or the LinkML conversion:

- See main project README at `/README.md`
- Check AGENTS.md for data processing guidelines

---

**Archive Version**: 1.0

**Archive Date**: 2025-11-17

**Status**: ✓ Validated & Complete