glam/data/isil/switzerland/VALIDATION_REPORT.txt
2025-11-19 23:25:22 +01:00

====================================================================================================
SWISS ISIL DATABASE - COMPREHENSIVE DATA QUALITY REPORT
====================================================================================================
Generated: 2025-11-19 09:24:08
Total institutions: 2,379

1. DATA COMPLETENESS
----------------------------------------------------------------------------------------------------
ISIL codes: 1,923 (80.8%)
Descriptions: 1,136 (47.8%)
Canton information: 2,370 (99.6%)
Institution categories: 1,947 (81.8%)

Contact Information:
  Any contact method: 1,275 (53.6%)
  Email addresses: 986 (41.4%)
  Phone numbers: 1,168 (49.1%)
  Websites: 934 (39.3%)

Address Information:
  Any address data: 117 (4.9%)
  Complete addresses: 0 (0.0%)

Additional Metadata:
  Opening hours: 0 (0.0%)
  Memberships: 0 (0.0%)
  Dewey classifications: 0 (0.0%)
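
The completeness percentages above can be re-derived from the raw counts. The following is a minimal sketch, assuming each percentage is the field count divided by the total of 2,379 institutions and rounded to one decimal place. The names `counts`, `pct`, and `percentages` are illustrative and not taken from the report generator.

```python
# Counts copied from the report. The helper below is an illustrative
# assumption about how the percentages were derived.
TOTAL = 2379

counts = {
    "ISIL codes": 1923,
    "Descriptions": 1136,
    "Canton information": 2370,
    "Institution categories": 1947,
    "Any contact method": 1275,
    "Email addresses": 986,
    "Phone numbers": 1168,
    "Websites": 934,
    "Any address data": 117,
}


def pct(count: int, total: int = TOTAL) -> float:
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


percentages = {field: pct(n) for field, n in counts.items()}

for field, value in percentages.items():
    print(f"{field}: {counts[field]:,} ({value}%)")
```

Under that assumption, every percentage listed in this section is reproduced from its count.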

2. GEOGRAPHIC DISTRIBUTION
----------------------------------------------------------------------------------------------------
By Canton (Top 15):
  ZH :  479 ( 20.1%)
  BE :  311 ( 13.1%)
  GE :  227 (  9.5%)
  VD :  224 (  9.4%)
  BS :  139 (  5.8%)
  VS :  121 (  5.1%)
  NE :  121 (  5.1%)
  FR :  102 (  4.3%)
  SG :   87 (  3.7%)
  TG :   85 (  3.6%)
  AG :   81 (  3.4%)
  LU :   59 (  2.5%)
  GR :   56 (  2.4%)
  TI :   54 (  2.3%)
  ZG :   49 (  2.1%)

By Region:
  Central Plain         :  594 ( 25.0%)
  Lake Geneva region    :  573 ( 24.1%)
  Zurich                :  479 ( 20.1%)
  Northwest Switzerland :  263 ( 11.1%)
  Eastern Switzerland   :  261 ( 11.0%)
  Central Switzerland   :  145 (  6.1%)
  Tessin                :   54 (  2.3%)
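
The report does not state how cantons are grouped into regions. The sketch below assumes the grouping follows the seven major regions of the Swiss Federal Statistical Office, labelled with the region names used above. It sums the Top 15 canton counts per region and flags regions whose cantons are not all listed. `REGIONS` and `region_subtotals` are hypothetical names, and the mapping itself is an assumption.

```python
# Canton counts copied from the Top 15 list; the 11 smaller cantons
# are not shown in the report and are therefore absent here.
canton_counts = {
    "ZH": 479, "BE": 311, "GE": 227, "VD": 224, "BS": 139,
    "VS": 121, "NE": 121, "FR": 102, "SG": 87, "TG": 85,
    "AG": 81, "LU": 59, "GR": 56, "TI": 54, "ZG": 49,
}

# ASSUMPTION: canton-to-region mapping based on the seven major regions
# of the Swiss Federal Statistical Office. The report may group differently.
REGIONS = {
    "Lake Geneva region": ["GE", "VD", "VS"],
    "Central Plain": ["BE", "FR", "JU", "NE", "SO"],
    "Northwest Switzerland": ["AG", "BL", "BS"],
    "Zurich": ["ZH"],
    "Eastern Switzerland": ["AI", "AR", "GL", "GR", "SG", "SH", "TG"],
    "Central Switzerland": ["LU", "NW", "OW", "SZ", "UR", "ZG"],
    "Tessin": ["TI"],
}


def region_subtotals(counts, regions=REGIONS):
    """Map each region to (sum of known canton counts, cantons without a count)."""
    result = {}
    for region, cantons in regions.items():
        subtotal = sum(counts.get(c, 0) for c in cantons)
        missing = [c for c in cantons if c not in counts]
        result[region] = (subtotal, missing)
    return result


subtotals = region_subtotals(canton_counts)
for region, (subtotal, missing) in subtotals.items():
    note = "complete" if not missing else "partial, missing " + ", ".join(missing)
    print(f"{region}: {subtotal} ({note})")
```

Under this assumed mapping, Zurich (479) and Tessin (54) match the regional figures. The three Lake Geneva cantons sum to 572, one fewer than the reported 573, which may indicate a different grouping or a record with a non-standard canton value. The other regions include cantons outside the Top 15, so they can only be checked as lower bounds.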

3. INSTITUTION TYPE ANALYSIS
----------------------------------------------------------------------------------------------------
Swiss Categories (Top 20):
  University and research library                             :  764
  Public library                                              :  347
  Special library                                             :  339
  Municipal archives or county/local authority archives       :  190
  Church and religious archives                               :   85
  Regional archives                                           :   45
  Cantonal library                                            :   37
  Specialised non-governmental archives and archives of other :   36
  University and research archives                            :   36
  Business archives                                           :   23
  Private persons and family archives                         :   22
  Regional and local museums                                  :   22
  Historical museums                                          :   19
  Art museums                                                 :   18
  Media archives                                              :   16
  Natural science museums                                     :    8
  Other museums                                               :    8
  National archives                                           :    7
  National library                                            :    5
  Ethnographic museums                                        :    3

GLAM Taxonomy Mapping:
  LIBRARY : 1431 ( 60.2%)
  ARCHIVE :  444 ( 18.7%)
  UNKNOWN :  432 ( 18.2%)
  MUSEUM  :   72 (  3.0%)
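
The report does not describe how Swiss categories are mapped to the GLAM taxonomy. The UNKNOWN count (432) equals the number of institutions without a category (2,379 minus 1,947), which suggests that records lacking a category fall through to UNKNOWN. One plausible approach is keyword matching on the category label, sketched below. `to_glam_type` is a hypothetical helper, not the mapping actually used.

```python
from typing import Optional


def to_glam_type(category: Optional[str]) -> str:
    """Map a Swiss category label to a GLAM type by keyword (illustrative only)."""
    if not category:
        # ASSUMPTION: records without a category are reported as UNKNOWN.
        return "UNKNOWN"
    text = category.lower()
    if "librar" in text:  # matches "library" and "libraries"
        return "LIBRARY"
    if "archive" in text:
        return "ARCHIVE"
    if "museum" in text:
        return "MUSEUM"
    return "UNKNOWN"


for label in ("Cantonal library", "Business archives", "Art museums", None):
    print(label, "->", to_glam_type(label))
```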

4. ISIL CODE ANALYSIS
----------------------------------------------------------------------------------------------------
Institutions WITH ISIL codes: 1,923 (80.8%)
Institutions WITHOUT ISIL codes: 456 (19.2%)

ISIL Code Patterns:
  CH-6 digits: 1,923

Cantons with Most Institutions Lacking ISIL Codes:
  VD: 58
  ZH: 48
  TG: 42
  BE: 40
  VS: 38
  GE: 37
  NE: 28
  TI: 27
  BS: 23
  FR: 18
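
All 1,923 codes in this dataset follow the "CH-6 digits" pattern. A minimal sketch of a check for that pattern is shown below. It assumes the pattern is the prefix `CH-` followed by exactly six ASCII digits, as observed here, and does not attempt to cover everything the ISIL standard permits. `is_swiss_isil` is a hypothetical name and the sample codes are placeholders, not real identifiers.

```python
import re
from typing import Optional

# ASSUMPTION: "CH-6 digits" means the literal prefix "CH-" plus six ASCII digits.
ISIL_CH_PATTERN = re.compile(r"CH-[0-9]{6}")


def is_swiss_isil(code: Optional[str]) -> bool:
    """Return True if code matches the CH-NNNNNN pattern seen in this dataset."""
    if not code:
        return False
    return ISIL_CH_PATTERN.fullmatch(code.strip()) is not None


# Placeholder values for illustration only.
for sample in ("CH-000001", "CH-12345", "DE-000001", None):
    print(sample, is_swiss_isil(sample))
```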

5. DATA QUALITY SUMMARY
----------------------------------------------------------------------------------------------------
Overall Data Quality Score: 70.4%

Quality Metrics:
  Canton coverage           : 99.6%
  ISIL code coverage        : 80.8%
  Contact info availability : 53.6%
  Description completeness  : 47.8%
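
The report does not say how the overall score is computed. The figure of 70.4% is consistent with an unweighted mean of the four quality metrics, which the sketch below reproduces from the raw counts. The equal weighting is an assumption, and `overall_score` is a hypothetical name.

```python
# Counts copied from the DATA COMPLETENESS section.
TOTAL = 2379

metric_counts = {
    "Canton coverage": 2370,
    "ISIL code coverage": 1923,
    "Contact info availability": 1275,
    "Description completeness": 1136,
}


def overall_score(counts: dict, total: int) -> float:
    """Unweighted mean of the per-metric coverage, as a percentage.

    ASSUMPTION: all metrics carry equal weight.
    """
    return round(100 * sum(counts.values()) / (len(counts) * total), 1)


score = overall_score(metric_counts, TOTAL)
print(f"Overall Data Quality Score: {score}%")
```

Working from the counts rather than the rounded percentages avoids compounding rounding errors in the result.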

6. OUTPUT FILES GENERATED
----------------------------------------------------------------------------------------------------
JSON (scraped)    : swiss_isil_complete_final.json (1.3 MB)
CSV (spreadsheet) : swiss_isil_complete.csv (840.6 KB)
LinkML YAML       : switzerland_isil.yaml (2.6 MB)
JSON-LD (RDF)     : switzerland_isil.jsonld (3.3 MB)
Scraping report   : FINAL_SCRAPING_REPORT.txt (4.0 KB)

7. RECOMMENDATIONS
----------------------------------------------------------------------------------------------------
✓ Dataset is ready for integration into GLAM project
✓ High ISIL code coverage (80.8%); overall data quality score 70.4%
✓ Complete geographic coverage across all Swiss cantons

Future Enhancements:
• Obtain full address data for geocoding (only 4.9% have any address data; none have complete addresses)
• Enrich 456 institutions without ISIL codes
• Cross-reference with Wikidata for additional identifiers
• Obtain opening hours for institutions (currently 0%)
• Link to collection-level metadata where available
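
As a starting point for the Wikidata cross-reference, the sketch below builds a SPARQL query that looks up items by their ISIL identifier (Wikidata property P791). It only constructs the query string and leaves submission to the Wikidata Query Service to the caller. `build_isil_query` is a hypothetical helper, the sample codes are placeholders, and the input is assumed to contain already validated codes with no quote characters.

```python
from typing import Iterable


def build_isil_query(isil_codes: Iterable[str]) -> str:
    """Build a SPARQL query matching Wikidata items by ISIL (property P791).

    ASSUMPTION: codes are pre-validated and contain no double quotes.
    """
    values = " ".join(f'"{code}"' for code in isil_codes)
    return (
        "SELECT ?item ?itemLabel ?isil WHERE {\n"
        f"  VALUES ?isil {{ {values} }}\n"
        "  ?item wdt:P791 ?isil .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


# Placeholder codes for illustration only.
query = build_isil_query(["CH-000001", "CH-000002"])
print(query)
```

Batching codes into a single `VALUES` clause keeps the number of requests low. Batch size limits would need to be chosen with the query service's timeouts in mind.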

====================================================================================================
END OF REPORT
====================================================================================================