# Library ISIL CSV to YAML Conversion Report **Date**: 2025-11-17 **Input**: `/data/isil/nl/kb/20250401 Bnetwerk overzicht ISIL-codes Bibliotheken Nederland.csv` **Output**: `/data/isil/nl/kb/20250401_Bnetwerk_ISIL_Bibliotheken_Nederland.yaml` **Script**: `/scripts/convert_library_isil_csv_to_yaml.py` **Data Date**: Stand 1 april 2025 (As of April 1, 2025) --- ## Conversion Summary ### Records Processed - **Total records**: 153 Dutch library ISIL codes - **Field preservation**: 100% (765 fields preserved exactly) - **Value mismatches**: 0 (perfect fidelity) ### CSV Structure (Original) Cleaner structure compared to national archive ISIL CSV: - UTF-8 encoding with BOM (✅ standard) - Semicolon delimiter - 3 metadata header rows (title, date, blank) - Column headers (row 4) - Data rows (rows 5+) **Fields**: 1. ISIL-code (NL-XXXXXXXXXX format) 2. Naam bibliotheek (library name) 3. Vestigingsplaats (city/location) 4. Opmerking (remarks/classification) ### YAML Structure (Output) Each record contains: **CSV Fields (preserved exactly)**: - `csv_row_number`: Original row number - `csv_isil_code`: ISIL identifier (13-character numeric code) - `csv_naam_bibliotheek`: Library name - `csv_vestigingsplaats`: City/location - `csv_opmerking`: Remarks (19 records have remarks) **LinkML Mapped Fields**: - `name`: Library name (mapped from csv_naam_bibliotheek) - `institution_type`: LIBRARY (all records) - `locations`: List with city and country (NL) - `identifiers`: ISIL identifier with scheme, value, URL - `library_type`: Classification based on remarks - `description`: Library type description - `provenance`: Data source metadata (TIER_1_AUTHORITATIVE) --- ## Data Quality Findings ### Geographic Distribution - **Unique cities**: 134 across Netherlands - **Top cities**: 1. Deventer: 5 libraries (Rijnbrink systems) 2. Den Haag: 4 libraries (KB + national orgs) 3. Groningen: 3 libraries (Biblionet + POI) 4. Assen: 3 libraries (Biblionet Drenthe) ### Library Type Classification Automated classification based on `opmerking` field: | Type | Count | % | Description | |------|-------|---|-------------| | **public_library** | 134 | 87.6% | Regular public libraries (no special classification) | | **library_automation_system** | 11 | 7.2% | POI (Public Online Information) systems | | **national_library_organization** | 5 | 3.3% | National library organizations (Muziekweb, SDI, etc.) | | **provincial_library_organization** | 2 | 1.3% | Provincial Musidesk systems | | **national_library** | 1 | 0.7% | KB (Koninklijke Bibliotheek) | ### ISIL Code Patterns **Uniform Structure**: - All codes: **exactly 13 characters** (NL-XXXXXXXXXX) - Format: `NL-` + 10 digits - All codes start with `NL-` - All characters after prefix are numeric - No duplicates (153 unique codes) **Examples by Type**: - National Library: `NL-0100030000` (KB) - Public Library: `NL-0800070000` (OBA Amsterdam) - Automation System: `NL-0700130000` (Zeeuwse Bibliotheken POI) - National Org: `NL-0735650000` (Muziekweb) ### Remarks Field Analysis 19 libraries (12.4%) have remarks documenting: **National Organizations** (5 libraries): - Muziekweb, Coöperatie SDI, online bibliotheek, Passend Lezen, Dedicon - Classification: "landelijke bibliotheekorganisatie" **Provincial Organizations** (2 libraries): - Rijnbrink Musidesk Gelderland, Rijnbrink Musidesk Overijssel - Classification: "provinciale bibliotheekorganisatie" **POI Systems** (11 libraries): - Library automation/consortium systems - Examples: Zeeuwse Bibliotheken, FERS Friesland, Rijnbrink, Cubiss, Probiblio, BiSC, Biblionet - Classification: "POI" (Public Online Information) **National Library** (1 library): - KB (Koninklijke Bibliotheek) - Classification: "KB, Nationale Bibliotheek" --- ## LinkML Schema Compliance ### Required Fields ✅ All 153 records contain: - `name` (library name) - `institution_type` (LIBRARY) - `locations` (city + country) - `identifiers` (ISIL code details) - `library_type` (automated classification) - `provenance` (data source metadata) ### Identifier Structure Each ISIL identifier includes: ```yaml identifiers: - identifier_scheme: ISIL identifier_value: NL-0800070000 identifier_url: https://isil.org/NL-0800070000 ``` ### Provenance Metadata All records marked as: - **Data source**: ISIL_REGISTRY - **Data tier**: TIER_1_AUTHORITATIVE - **Source URL**: https://www.kb.nl/organisatie/bibliotheken-in-nederland/isil-codes - **Source date**: Stand 1 april 2025 - **Confidence score**: 1.0 (authoritative) --- ## Library Network Structure ### National Level (6 organizations) 1. **KB, nationale bibliotheek** (Den Haag) - National library 2. **Muziekweb** (Rotterdam) - Music library service 3. **Coöperatie SDI** (Utrecht) - Digital library cooperation 4. **online bibliotheek** (Den Haag) - Online library platform 5. **Passend Lezen** (Den Haag) - Accessible reading service 6. **Dedicon** (Grave) - Audio book production ### Provincial Level (2 organizations) 1. **Rijnbrink Musidesk Gelderland** (Deventer) 2. **Rijnbrink Musidesk Overijssel** (Deventer) ### Library Automation Systems (11 POI systems) Regional consortia providing shared library management systems: - **Zeeland**: Zeeuwse Bibliotheken (Middelburg) - **Friesland**: FERS Friesland (Leeuwarden) - **Gelderland**: Rijnbrink Gelderland (Deventer) - **Groningen**: Biblionet Groningen + POI variant - **Limburg**: Cubiss Limburg (Heerlen) - **Noord-Brabant**: Cubiss Noord-Brabant (Tilburg) - **Noord-Holland**: Probiblio (Hoofddorp) - **Overijssel**: Rijnbrink Overijssel (Deventer) - **Utrecht**: BiSC Utrecht (Houten) - **Drenthe**: Biblionet Drenthe + POI variant - **Flevoland**: Bibliotheeknetwerk Flevoland (Lelystad) ### Public Libraries (134 organizations) Major city libraries include: - OBA (Amsterdam) - Bibliotheek Rotterdam - Bibliotheek Den Haag - Rozet (Arnhem) - Bibliotheek Schiedam - And 129 other municipal/regional libraries --- ## Validation Results ### Field Preservation Test ``` Total records: 153 Total fields: 765 Fields preserved: 765 Value mismatches: 0 Preservation rate: 100.0% ``` ✅ **VALIDATION PASSED** ### LinkML Schema Compliance ✅ All required fields present ✅ All CSV fields preserved ✅ Institution type set to LIBRARY ✅ Library type classification applied ✅ No data loss during conversion ✅ YAML structure valid --- ## Key Differences from National Archive ISIL Dataset | Aspect | National Archive ISIL | Library ISIL | |--------|----------------------|--------------| | Records | 371 | 153 | | ISIL Format | Variable (7-17 chars) | Uniform (13 chars) | | ISIL Pattern | NL-{City}{Abbrev} | NL-XXXXXXXXXX | | Institution Types | Mixed (archives, museums) | Libraries only | | Encoding | Latin-1 (problematic) | UTF-8 (clean) | | CSV Structure | Malformed (quotes/semicolons) | Clean (standard) | | Remarks | 18 records (4.9%) | 19 records (12.4%) | | Organization Types | N/A | Classified (5 types) | --- ## Use Cases This YAML file can be used for: 1. **Library Network Mapping**: Understand Dutch public library infrastructure 2. **Automation System Analysis**: Track which libraries use which POI systems 3. **Service Coverage**: Map library services by region/province 4. **Data Integration**: Merge with NDE dataset or national archive ISIL codes 5. **LinkML Validation**: Test schema compliance with library registry data 6. **Collection Management**: Link library collections to authoritative ISIL codes --- ## Insights and Patterns ### Decentralization Dutch public libraries operate through 11 regional automation consortia (POI systems), showing strong provincial/regional organization rather than a single national system. ### Key Players - **Rijnbrink** dominates eastern Netherlands (Gelderland, Overijssel) - **Cubiss** serves southern provinces (Limburg, Noord-Brabant) - **Biblionet** serves northern provinces (Groningen, Drenthe) ### National Services KB (national library) coordinates national-level services: - Digital collections (online bibliotheek) - Accessible reading (Passend Lezen, Dedicon) - Specialized collections (Muziekweb) - Shared infrastructure (Coöperatie SDI) ### ISIL Code Assignment Pattern Numeric codes suggest sequential assignment rather than semantic encoding: - KB: `NL-0100030000` (low number = early assignment) - Recent libraries likely have higher numbers - Different from archive ISIL codes which encode city/institution --- ## Next Steps ### Data Enrichment - [ ] Geocode city names to latitude/longitude - [ ] Add library website URLs - [ ] Cross-link with NDE organization dataset - [ ] Query Wikidata for Q-numbers - [ ] Link libraries to their POI systems (parent-child relationships) ### Analysis - [ ] Map library coverage by municipality - [ ] Analyze POI system membership - [ ] Identify gaps in library service coverage - [ ] Compare with population density data ### Integration - [ ] Merge with national archive ISIL dataset (371 records) - [ ] Create unified Dutch heritage custodian registry - [ ] Generate GHCID identifiers - [ ] Link to museum/archive records where libraries share buildings --- ## Files Created ### Data - `/data/isil/nl/kb/20250401_Bnetwerk_ISIL_Bibliotheken_Nederland.yaml` (3,577 lines, 153 records) ### Scripts - `/scripts/convert_library_isil_csv_to_yaml.py` (conversion + validation + classification) ### Documentation - `/docs/LIBRARY_ISIL_CSV_TO_YAML_CONVERSION_REPORT.md` (this file) --- ## Technical Notes ### Automated Library Classification The script classifies libraries into 5 types based on keyword matching in the `opmerking` field: ```python if 'landelijke bibliotheekorganisatie' in remark: return 'national_library_organization' elif 'provinciale bibliotheekorganisatie' in remark: return 'provincial_library_organization' elif 'poi' in remark.lower(): return 'library_automation_system' elif 'nationale bibliotheek' in remark: return 'national_library' else: return 'public_library' # default ``` This classification helps distinguish organizational hierarchy and service types. ### Performance - Parsing: ~0.05 seconds - Mapping: ~0.1 seconds - Classification: ~0.05 seconds - Validation: ~0.05 seconds - YAML write: ~0.3 seconds - **Total time**: < 0.6 seconds ### YAML Generation Used PyYAML with identical settings to national archive conversion for consistency. --- **Status**: ✅ Conversion complete **Quality**: 100% field preservation **Classification**: Automated library type classification applied **Ready for**: Data enrichment, integration, and network analysis