# Quick Reference: CSV to YAML Conversion & Validation

## Files Created

```
data/nde/
├── nde_csv_source.yaml # LinkML schema for CSV structure
├── nde_yaml_target.yaml # LinkML schema for YAML structure
├── nde_csv_to_yaml_mapping.yaml # Transformation mapping
├── voorbeeld_lijst_organisaties_en_diensten-totaallijst_nederland.csv # Source
├── voorbeeld_lijst_organisaties_en_diensten-totaallijst_nederland.yaml # Target
├── sample_yaml_for_validation.yaml # Sample data for testing
└── README.md # Archive documentation

scripts/
├── convert_nde_csv_to_yaml.py # Conversion script
└── validate_csv_to_yaml_conversion.py # Validation script

docs/
├── NDE_CSV_TO_YAML_LINKML_VALIDATION.md # Full validation report
└── CSV_TO_YAML_QUICK_REFERENCE.md # This file
```

## Usage

### Convert CSV to YAML

```bash
python scripts/convert_nde_csv_to_yaml.py
```

Output:

- Creates YAML file in `data/nde/` directory
- Preserves all non-empty CSV data
- Normalizes field names (lowercase, underscores)
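
As a rough illustration of the empty-value handling listed above, the sketch below parses CSV text with Python's standard `csv` module and drops empty cells from each row. It is not taken from `convert_nde_csv_to_yaml.py`: the helper name `rows_to_records` and the sample columns `Plaats` and `Opmerking` are made up for the example, and field-name normalization and YAML serialization are left out.

```python
import csv
import io


def rows_to_records(csv_text: str) -> list[dict[str, str]]:
    """Hypothetical helper: parse CSV text and drop empty cells from each row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        # Keep only string cells with non-whitespace content; values are not altered.
        record = {
            key: value
            for key, value in row.items()
            if isinstance(value, str) and value.strip()
        }
        records.append(record)
    return records


# Made-up sample: one empty cell and one quoted multi-line cell.
sample = 'Organisatie,Plaats,Opmerking\n"Archief A",,"line one\nline two"\n'
print(rows_to_records(sample))
```

Quoted cells that span several lines come through unchanged, because `csv.DictReader` keeps embedded newlines inside quoted fields.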

### Validate Conversion

```bash
python scripts/validate_csv_to_yaml_conversion.py
```

Output:

- Validates field mappings
- Checks data preservation
- Reports any missing/corrupted data
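
The checks listed above could be sketched roughly as follows. This is an assumption about how such a comparison might work, not the logic of `validate_csv_to_yaml_conversion.py`: `compare_records` is a hypothetical function, and it assumes both inputs are lists of dictionaries that already use the same (normalized) field names.

```python
def compare_records(source: list[dict], target: list[dict]) -> dict:
    """Hypothetical check: compare CSV-derived records with converted records."""
    report = {
        "records_source": len(source),
        "records_target": len(target),
        "fields_checked": 0,
        "missing": [],      # (record index, field) pairs absent from the target
        "mismatches": [],   # (record index, field) pairs whose values differ
    }
    for index, (src, tgt) in enumerate(zip(source, target)):
        for field, value in src.items():
            if not value.strip():
                continue  # empty cells are not expected in the target
            report["fields_checked"] += 1
            if field not in tgt:
                report["missing"].append((index, field))
            elif tgt[field] != value:
                report["mismatches"].append((index, field))
    # Pass only if record counts match and no field was lost or changed.
    report["passed"] = (
        len(source) == len(target)
        and not report["missing"]
        and not report["mismatches"]
    )
    return report
```

A report with `passed` set to `True`, matching record counts, and empty `missing` and `mismatches` lists would correspond to a run in which every check passes.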

## Validation Results

✓✓✓ **ALL CHECKS PASSED** ✓✓✓

- **Records**: 1,351 / 1,351 ✓
- **Fields**: 6,980 / 6,980 ✓
- **Missing data**: 0 ✓
- **Value mismatches**: 0 ✓

## Field Name Transformations

| CSV Column | YAML Field |
|------------|------------|
| `Organisatie` | `organisatie` |
| `ISIL-code (NA)` | `isil-code_na` |
| `Archieven.nl` | `archieven.nl` |
| `OODE24 (Mondriaan)` | `oode24_mondriaan` |
| Empty column | `unnamed_field` |
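
The rows in the table are consistent with a simple rule set: lowercase the header, drop parentheses, and replace whitespace with underscores, leaving hyphens and dots alone. The sketch below is inferred from the table rows only; `normalize_field_name` is a hypothetical name, and the actual conversion script may apply different or additional rules.

```python
import re


def normalize_field_name(column: str) -> str:
    """Hypothetical helper: map a CSV column header to a YAML field name."""
    name = column.strip().lower()
    if not name:
        # Columns without a header get a placeholder name.
        return "unnamed_field"
    # Drop parentheses but keep their contents: "(na)" becomes "na".
    name = re.sub(r"[()]", "", name).strip()
    # Collapse each run of whitespace into a single underscore.
    return re.sub(r"\s+", "_", name)


for header in ["Organisatie", "ISIL-code (NA)", "Archieven.nl", "OODE24 (Mondriaan)", ""]:
    print(repr(header), "->", normalize_field_name(header))
```

Hyphens and dots are deliberately left untouched here, since the table keeps them in `isil-code_na` and `archieven.nl`.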

## LinkML Schema Details

- **Source schema**: `data/nde/nde_csv_source.yaml` - 33 columns, all optional
- **Target schema**: `data/nde/nde_yaml_target.yaml` - 32 fields, all optional
- **Mapping**: 1:1 field mappings with normalization rules

## Key Features

1. **Lossless conversion**: All data preserved exactly
2. **Field normalization**: Consistent naming conventions
3. **Empty field handling**: Only non-empty values included
4. **Special character support**: Multi-line content, quotes, etc.
5. **LinkML validation**: Schema-based verification

## Re-running Validation

To re-validate at any time:

```bash
# Change to the repository root; adjust the path to your local checkout
cd /Users/kempersc/apps/glam
python scripts/validate_csv_to_yaml_conversion.py
```

Expected output: `✓✓✓ VALIDATION PASSED ✓✓✓`