- Implemented a Python script to validate KB library YAML files for required fields and data quality.
- Analyzed enrichment coverage from Wikidata and Google Maps, generating statistics.
- Created a comprehensive markdown report summarizing validation results and enrichment quality.
- Included error handling for file loading and validation processes.
- Generated JSON statistics for further analysis.

# KB Netherlands Public Libraries - Enrichment Report

**Generated**: 2025-11-28 12:28:14 UTC
**Total Entries**: 149

## Executive Summary

The KB Netherlands library ISIL data has been successfully integrated and enriched with external data sources.

| Metric | Count | Percentage |
|--------|-------|------------|
| Total KB Library Entries | 149 | 100.0% |
| Valid Entries | 149 | 100.0% |
| Wikidata Enriched | 114 | 76.5% |
| Google Maps Enriched | 149 | 100.0% |

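As a minimal sketch of how the percentages in the summary table can be reproduced, the snippet below divides each count by the total and formats the result to one decimal place. The helper name `coverage_pct` is hypothetical and not taken from `validate_kb_libraries_report.py`; only the counts come from the table.

```python
def coverage_pct(count: int, total: int) -> str:
    """Format count/total as a percentage with one decimal place."""
    if total == 0:
        # Avoid division by zero for an empty dataset.
        return "0.0%"
    return f"{100 * count / total:.1f}%"


# Counts copied from the summary table.
total_entries = 149
summary = [
    ("Valid Entries", 149),
    ("Wikidata Enriched", 114),
    ("Google Maps Enriched", 149),
]

for label, count in summary:
    print(f"{label}: {count} ({coverage_pct(count, total_entries)})")
```

The same helper applies to the per-field completeness tables, where the denominator is the number of enriched entries rather than the total.
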
---

## Wikidata Enrichment

### Coverage

| Status | Count | Percentage |
|--------|-------|------------|
| Successfully enriched | 114 | 76.5% |
| Not found in Wikidata | 35 | 23.5% |
| Not attempted | 0 | 0.0% |

### Match Methods

| Method | Count |
|--------|-------|
| isil_code_match | 64 |
| fuzzy_name_match | 50 |

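The two match methods can be read as a two-stage lookup: try an exact ISIL match first, then fall back to a name comparison. The sketch below illustrates that idea only; it is not the enrichment script's actual logic. The record layout (`isil`, `name`, `label`, `qid`), the use of `difflib.SequenceMatcher`, and the 0.85 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def match_entry(entry, candidates, threshold=0.85):
    """Return (candidate, method) for a library entry, or (None, None).

    `entry` and `candidates` are plain dicts with hypothetical keys;
    `threshold` is an assumed cut-off for the fuzzy name comparison.
    """
    # Stage 1: exact ISIL code match.
    isil = entry.get("isil")
    if isil:
        for cand in candidates:
            if cand.get("isil") == isil:
                return cand, "isil_code_match"

    # Stage 2: fuzzy name match, keeping the best-scoring candidate.
    best, best_score = None, 0.0
    name = entry.get("name", "").lower()
    for cand in candidates:
        score = SequenceMatcher(None, name, cand.get("label", "").lower()).ratio()
        if score > best_score:
            best, best_score = cand, score

    if best is not None and best_score >= threshold:
        return best, "fuzzy_name_match"
    return None, None


# Placeholder data, not real Wikidata items or ISIL codes.
candidates = [
    {"qid": "Q_EXAMPLE", "isil": "NL-0000000000", "label": "Bibliotheek Voorbeeldstad"},
]
print(match_entry({"isil": "NL-0000000000", "name": "Some library"}, candidates)[1])
print(match_entry({"isil": None, "name": "Bibliotheek Voorbeeld"}, candidates)[1])
```

Fuzzy matches are inherently less reliable than identifier matches, so a stricter threshold or a manual review step may be appropriate for the `fuzzy_name_match` group.
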
### Data Completeness (of 114 enriched)

| Field | Count | Percentage |
|-------|-------|------------|
| Coordinates | 68 | 59.6% |
| Inception Date | 11 | 9.6% |
| VIAF ID | 3 | 2.6% |
| Website | 114 | 100.0% |

---

## Google Maps Enrichment

### Coverage

| Status | Count | Percentage |
|--------|-------|------------|
| Successfully enriched | 149 | 100.0% |
| Not found | 0 | 0.0% |
| Not attempted | 0 | 0.0% |

### Data Completeness (of 149 enriched)

| Field | Count | Percentage |
|-------|-------|------------|
| Coordinates | 149 | 100.0% |
| Full Address | 149 | 100.0% |
| Phone Number | 146 | 98.0% |
| Website | 143 | 96.0% |
| Opening Hours | 145 | 97.3% |
| Rating | 147 | 98.7% |

### Business Status

| Status | Count |
|--------|-------|
| OPERATIONAL | 147 |
| CLOSED_TEMPORARILY | 1 |
| CLOSED_PERMANENTLY | 1 |

### Geographic Distribution by Province

| Province | Count |
|----------|-------|
| Zuid-Holland | 25 |
| Overijssel | 23 |
| Noord-Brabant | 18 |
| Gelderland | 18 |
| Noord-Holland | 16 |
| Limburg | 13 |
| Utrecht | 9 |
| Friesland | 6 |
| Drenthe | 5 |
| Zeeland | 4 |
| Groningen | 3 |
| Flevoland | 3 |
| Sint Eustatius | 1 |
| Saba | 1 |
| Bonaire | 1 |

---

## Geographic Distribution by City

Top 20 cities with most library entries:

| City | Count |
|------|-------|
| Deventer | 5 |
| Den Haag | 4 |
| Groningen | 3 |
| Assen | 3 |
| Middelburg | 2 |
| Leeuwarden | 2 |
| Heerlen | 2 |
| Hoofddorp | 2 |
| Lelystad | 2 |
| Rotterdam | 2 |
| Amsterdam | 1 |
| Tilburg | 1 |
| Houten | 1 |
| Utrecht | 1 |
| Grave | 1 |
| Schiedam | 1 |
| Maastricht | 1 |
| Haarlem | 1 |
| Eindhoven | 1 |
| Enschede | 1 |

---

## Validation Results

### Summary

- **Valid entries**: 149 (100.0%)
- **Entries with issues**: 0
- **Entries with warnings**: 0
- **File parsing errors**: 0

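For readers who want to see what a required-field check of this kind might look like, here is a minimal sketch. It assumes each entry has already been loaded from YAML into a Python dict. The field names in `REQUIRED_FIELDS` are hypothetical placeholders; the actual schema checked by `validate_kb_libraries_report.py` is not shown in this report.

```python
# Hypothetical required fields; the real entry schema may differ.
REQUIRED_FIELDS = ("isil", "name", "city")


def validate_entry(entry):
    """Return a list of issue strings for one entry (empty list means valid)."""
    if not isinstance(entry, dict):
        return ["entry is not a mapping"]
    issues = []
    for field in REQUIRED_FIELDS:
        value = entry.get(field)
        # Treat missing keys, None, and blank strings as missing.
        if value is None or (isinstance(value, str) and not value.strip()):
            issues.append(f"missing required field: {field}")
    return issues


def summarize(entries):
    """Count valid entries and entries with at least one issue."""
    results = [validate_entry(e) for e in entries]
    valid = sum(1 for r in results if not r)
    return {"total": len(entries), "valid": valid, "with_issues": len(entries) - valid}


# Placeholder entries for illustration only.
sample = [
    {"isil": "NL-0000000000", "name": "Bibliotheek Voorbeeld", "city": "Voorbeeldstad"},
    {"isil": "", "name": "Naamloos", "city": None},
]
print(summarize(sample))
```
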
---

## Data Sources

1. **KB Netherlands Library Network** (Primary)
   - Source file: `KB_Netherlands_ISIL_2025-04-01.xlsx`
   - URL: https://www.bibliotheeknetwerk.nl/
   - 149 library entries with ISIL codes

2. **Wikidata** (Enrichment)
   - SPARQL endpoint: https://query.wikidata.org/sparql
   - Match methods: ISIL code lookup, fuzzy name matching
   - Coverage: 114/149 (76.5%)

3. **Google Maps Places API** (Enrichment)
   - API: Places API (New)
   - Coverage: 149/149 (100.0%)

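To make the ISIL code lookup against the Wikidata SPARQL endpoint concrete, the sketch below builds a query string for a single ISIL code. It relies on the Wikidata properties P791 (ISIL), P625 (coordinate location), P571 (inception), P214 (VIAF ID) and P856 (official website). The function name, the selected variables and the input sanity check are assumptions for illustration, not the query actually used to produce this report.

```python
import re

# Loose sanity check (an assumption, not a full ISIL validator):
# letters, digits, ':', '/', '-' only, at most 16 characters.
ISIL_PATTERN = re.compile(r"^[A-Za-z0-9:/\-]{1,16}$")


def build_isil_query(isil: str) -> str:
    """Build a SPARQL query that looks up a Wikidata item by ISIL code."""
    if not ISIL_PATTERN.match(isil):
        # Reject anything that could break out of the quoted literal.
        raise ValueError(f"unexpected ISIL format: {isil!r}")
    return f"""SELECT ?item ?coord ?inception ?viaf ?website WHERE {{
  ?item wdt:P791 "{isil}" .
  OPTIONAL {{ ?item wdt:P625 ?coord . }}
  OPTIONAL {{ ?item wdt:P571 ?inception . }}
  OPTIONAL {{ ?item wdt:P214 ?viaf . }}
  OPTIONAL {{ ?item wdt:P856 ?website . }}
}}
LIMIT 5"""


# Placeholder ISIL code, not a real library.
print(build_isil_query("NL-0000000000"))
```

The resulting string can be sent to the endpoint listed above as the `query` parameter of an HTTP GET request; the `OPTIONAL` clauses reflect that fields such as coordinates or VIAF ID are absent for many items.
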
---

## Files Generated

- Entry files: `data/nde/enriched/entries/{index}_kb_isil.yaml` (149 files)
- This report: `reports/kb_libraries_enrichment_report.md`
- Statistics JSON: `reports/kb_libraries_enrichment_stats.json`

---

*Report generated by validate_kb_libraries_report.py*