glam/data/instances/libya/VALIDATION_REPORT.md
2025-11-19 23:25:22 +01:00

Libyan Heritage Institutions - Validation Report

Date: 2025-11-09
Schema Version: v0.2.1
Validator: validate_instances.py (custom LinkML validator)


Summary

VALIDATION SUCCESSFUL

  • Total institutions: 54
  • Valid records: 54 (100%)
  • Errors: 0
  • Warnings: 3 (minor, acceptable)

Source Files Merged

| Batch | Filename | Records | Notes |
|-------|----------|---------|-------|
| 1 | libya_universities_batch1.json | 8 | Universities with library collections |
| 2 | libya_museums_batch2.json | 7 | Museums and heritage sites |
| 3 | libya_sites_digital_manuscripts_batch3.json | 13 | Archaeological sites, digital archives |
| 4 | libya_historic_buildings_museums_batch4.json | 4 | Historic buildings and regional museums |
| 5 | libya_heritage_institutions_extracted.json | 22 | Mixed institutions (libraries, archives, sites) |
| TOTAL | | 54 | |
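
The merge step is not shown in this report. A minimal sketch of how the five batches could be combined is given below, assuming each batch has already been loaded into a list of records that carry an `id` key; the function name `merge_batches` and the duplicate check are illustrative assumptions, not the project's actual code.

```python
def merge_batches(batches):
    """Merge {filename: [records]} into one list, rejecting duplicate ids.

    Hypothetical helper: assumes every record is a dict with an "id" key.
    """
    merged = []
    seen = {}  # id -> filename where the id first appeared
    for filename, records in batches.items():
        for record in records:
            record_id = record["id"]
            if record_id in seen:
                raise ValueError(
                    f"duplicate id {record_id!r} in {filename} and {seen[record_id]}"
                )
            seen[record_id] = filename
            merged.append(record)
    return merged
```

Comparing `len(merge_batches(...))` against the TOTAL row is a cheap guard against a batch being dropped or loaded twice.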

Data Quality Metrics

Field Coverage

| Field | Coverage | Count |
|-------|----------|-------|
| id (URI) | 100.0% | 54/54 |
| name | 100.0% | 54/54 |
| institution_type | 100.0% | 54/54 |
| provenance | 100.0% | 54/54 |
| description | 100.0% | 54/54 |
| locations | 100.0% | 54/54 |
| alternative_names | 96.3% | 52/54 |
| collections | 87.0% | 47/54 |
| identifiers | 74.1% | 40/54 |
| change_history | 57.4% | 31/54 |
| digital_platforms | 31.5% | 17/54 |
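
Coverage figures like these can be recomputed from the merged records. The sketch below is one way to do it; the function name `field_coverage` and the rule that empty strings, lists, and dicts count as missing are assumptions, since the report does not say how the validator treats empty values.

```python
def field_coverage(records, fields):
    """Return {field: (count, total, percent)} for each requested field."""
    total = len(records)
    report = {}
    for field in fields:
        # Assumption: a field counts as covered only when present and non-empty.
        count = sum(
            1 for record in records if record.get(field) not in (None, "", [], {})
        )
        percent = round(100.0 * count / total, 1) if total else 0.0
        report[field] = (count, total, percent)
    return report
```

For example, 52 covered records out of 54 gives `round(100.0 * 52 / 54, 1)`, which is 96.3 and matches the alternative_names row.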

Institution Type Distribution

| Type | Count |
|------|-------|
| EDUCATION_PROVIDER | 18 |
| MUSEUM | 13 |
| ARCHIVE | 7 |
| LIBRARY | 6 |
| OFFICIAL_INSTITUTION | 5 |
| RESEARCH_CENTER | 3 |
| OTHER | 2 |
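
The distribution can be tallied with a counter and checked against the figures reported here. This is a sketch only: `type_distribution` and `matches_report` are hypothetical helper names, and the `REPORTED` mapping simply restates the counts from the table above.

```python
from collections import Counter

# Counts as stated in this report; they sum to the 54 validated records.
REPORTED = {
    "EDUCATION_PROVIDER": 18,
    "MUSEUM": 13,
    "ARCHIVE": 7,
    "LIBRARY": 6,
    "OFFICIAL_INSTITUTION": 5,
    "RESEARCH_CENTER": 3,
    "OTHER": 2,
}


def type_distribution(records):
    """Count records per institution_type value."""
    return Counter(record["institution_type"] for record in records)


def matches_report(records, reported=REPORTED):
    """True when the tallied distribution equals the reported one."""
    return dict(type_distribution(records)) == reported
```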

Warnings

Minor Issues (Non-blocking)

Three institutions lack a city name in their location data because they are online-only or distributed resources:

  1. Heritage Gazetteer - Digital gazetteer (no physical location)
  2. Temehu Online Museum - Digital museum (online-only)
  3. Nafusa Libraries - Network of libraries (no single city)

Resolution: Acceptable. These are digital/distributed resources without a single physical location.
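
A check that produces this kind of warning could look like the sketch below. It assumes each record has a `locations` list whose entries may carry a `city` key; the key name `city` and the function name `find_missing_city` are assumptions, not confirmed details of validate_instances.py.

```python
def find_missing_city(records):
    """Return the names of records with no city in any of their locations."""
    flagged = []
    for record in records:
        locations = record.get("locations") or []
        # Flag the record only when none of its locations names a city.
        if not any(location.get("city") for location in locations):
            flagged.append(record.get("name", "<unnamed>"))
    return flagged
```

Reporting these as warnings rather than errors keeps digital-only and distributed resources in the dataset while still making the gap visible.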


Provenance Metadata

All 54 records include complete provenance tracking:

  • Data Source: CONVERSATION_NLP
  • Data Tier: TIER_4_INFERRED
  • Extraction Date: 2025-11-09T00:00:00Z
  • Extraction Method: AI agent comprehensive extraction
  • Confidence Scores: Range 0.82 - 0.96 (average: 0.89)
  • Source Conversation: d06ded03-ba79-4b79-b068-406c2da01f8c
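
The confidence range and average can be recomputed from the provenance blocks. The sketch below assumes each record nests a numeric `confidence_score` under `provenance`, as the required-field list in this report suggests; `confidence_summary` is a hypothetical helper name.

```python
from statistics import mean


def confidence_summary(records):
    """Return (minimum, maximum, mean rounded to 2 places) of confidence scores."""
    scores = [record["provenance"]["confidence_score"] for record in records]
    if not scores:
        raise ValueError("no records to summarise")
    return min(scores), max(scores), round(mean(scores), 2)
```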

Schema Compliance

Required Fields

All records contain required fields per LinkML schema v0.2.1:

  • HeritageCustodian: id, name, institution_type, provenance
  • Provenance: data_source, data_tier, extraction_date, extraction_method, confidence_score
  • Location: country (all have "LY")
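
The required-field rules listed above can be expressed as a small check. This is an illustrative sketch rather than the validator's real implementation: the function name `missing_required` and the dotted labels in its output are assumptions, while the field names come from the list above.

```python
REQUIRED_FIELDS = {
    "HeritageCustodian": ["id", "name", "institution_type", "provenance"],
    "Provenance": [
        "data_source",
        "data_tier",
        "extraction_date",
        "extraction_method",
        "confidence_score",
    ],
    "Location": ["country"],
}


def missing_required(record):
    """Return labels such as 'Provenance.data_tier' for every missing field."""
    missing = [
        f"HeritageCustodian.{field}"
        for field in REQUIRED_FIELDS["HeritageCustodian"]
        if not record.get(field)
    ]
    provenance = record.get("provenance")
    if isinstance(provenance, dict):
        missing += [
            f"Provenance.{field}"
            for field in REQUIRED_FIELDS["Provenance"]
            if provenance.get(field) is None
        ]
    for index, location in enumerate(record.get("locations") or []):
        missing += [
            f"Location[{index}].{field}"
            for field in REQUIRED_FIELDS["Location"]
            if not location.get(field)
        ]
    return missing
```

A record passes this check when the returned list is empty.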

Enumeration Validation

All enumerated values conform to schema:

  • institution_type: Valid values from InstitutionTypeEnum
  • data_source: All records use CONVERSATION_NLP
  • data_tier: All records use TIER_4_INFERRED
  • change_type: Valid values from ChangeTypeEnum (FOUNDING, NAME_CHANGE, etc.)
  • platform_type: Valid values including new LEARNING_MANAGEMENT type
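
Enumeration checks of this kind can be sketched as set membership tests. The value sets below contain only values that appear in this report; the real InstitutionTypeEnum and ChangeTypeEnum in the schema may define more, so treat `OBSERVED_VALUES` and `enum_violations` as hypothetical stand-ins for the schema's own definitions.

```python
# Values observed in this report only; the schema enums may be larger.
OBSERVED_VALUES = {
    "institution_type": {
        "EDUCATION_PROVIDER", "MUSEUM", "ARCHIVE", "LIBRARY",
        "OFFICIAL_INSTITUTION", "RESEARCH_CENTER", "OTHER",
    },
    "data_source": {"CONVERSATION_NLP"},
    "data_tier": {"TIER_4_INFERRED"},
    "change_type": {
        "FOUNDING", "NAME_CHANGE", "STATUS_CHANGE", "RELOCATION", "CLOSURE",
    },
}


def enum_violations(record, allowed=OBSERVED_VALUES):
    """Return (field, value) pairs whose value is outside the allowed set."""
    violations = []
    provenance = record.get("provenance") or {}
    candidates = [
        ("institution_type", record.get("institution_type")),
        ("data_source", provenance.get("data_source")),
        ("data_tier", provenance.get("data_tier")),
    ]
    # Each change_history entry is assumed to carry its own change_type.
    for event in record.get("change_history") or []:
        candidates.append(("change_type", event.get("change_type")))
    for field, value in candidates:
        if value is not None and value not in allowed[field]:
            violations.append((field, value))
    return violations
```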

Notable Features

Digital Platforms (17 institutions)

  • Moodle LMS: 3 universities (Sirte, Libyan International, and Al-Zawiya; the last is implied rather than stated)
  • Google Classroom: 1 university (Misurata)
  • Greenstone Digital Library: 1 (Libyan Academy - first ETD in Libya)
  • SPARQL Endpoints: 2 (Heritage Gazetteer, Endangered Archaeology)
  • Digital Repositories: Multiple institutions

Change Events (31 institutions)

  • FOUNDING: 19 events (university establishments, museum openings)
  • NAME_CHANGE: 3 events (University of Zintan, Omar Al-Mukhtar University)
  • STATUS_CHANGE: 6 events (UNESCO listings, rankings, reopenings)
  • RELOCATION: 2 events (Red Castle Museum, National Library)
  • CLOSURE: 1 event (temporary closure due to conflict)

Geographic Coverage

18 cities across Libya:

  • Tripoli (9 institutions) - Capital, major cultural center
  • Benghazi (6 institutions) - Eastern Libya cultural hub
  • Sabha (2) - Southern gateway to Saharan heritage
  • Misrata (2) - Central coast
  • Cyrene, Leptis Magna, Sabratha - UNESCO World Heritage Sites
  • Plus 11 other regional cities

Next Steps

  1. COMPLETE: Validation passed
  2. Geocoding: Add lat/lon coordinates for 54 locations
  3. Wikidata Enrichment: Link institutions to Wikidata Q-numbers
  4. Export: Generate RDF, JSON-LD, GeoJSON formats
  5. Integrate: Merge with global dataset

Data Enhancement Opportunities

  • Wikidata Linking: 40 institutions have identifiers but only ~10 have Wikidata Q-numbers
  • Digital Platform URLs: 17 platforms mentioned but only 7 have URLs
  • Collection Details: 47 institutions have documented collections, but extent and temporal coverage could be expanded
  • Geographic Precision: Add specific addresses and coordinates where available

Files Generated

Output Files

  • Primary: data/instances/libya/libyan_institutions.yaml (54 records, schema-compliant)
  • Report: data/instances/libya/VALIDATION_REPORT.md (this document)

Source Files (Preserved)

  • data/instances/libya_universities_batch1.json (updated with IDs)
  • data/instances/libya_museums_batch2.json
  • data/instances/libya_sites_digital_manuscripts_batch3.json
  • data/instances/libya_historic_buildings_museums_batch4.json
  • data/instances/libya_heritage_institutions_extracted.json (updated with IDs)

Compliance Statement

This dataset complies with:

  • LinkML Heritage Custodian Schema v0.2.1
  • PROV-O provenance tracking
  • Dublin Core metadata standards
  • ISO 3166-1 alpha-2 country codes (LY)
  • ISO 8601 date/time formats
  • W3C URI/IRI standards

Validation Status: PASSED
Approved for Integration: YES
Next Review Date: After geocoding/Wikidata enrichment