glam/reports/provenance_validation_final.txt
2025-12-30 03:43:31 +01:00

======================================================================
PROVENANCE VALIDATION REPORT
Generated: 2025-12-29T12:39:19.811656
======================================================================
## YAML Custodian Files
----------------------------------------
Total files scanned: 3,000
Files with enrichment: 1,924
Files with provenance: 1,924
Sections checked: 2,509
Sections with _provenance: 2,509
Sections with content_hash: 2,509
Sections with wasDerivedFrom: 2,434
Valid content_hashes: 2,509
Invalid content_hashes: 0
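
The report does not say how a content_hash is computed or checked. The following is a minimal sketch of one plausible scheme, assuming (hypothetically) that the hash is a SHA-256 digest over the section's canonical JSON with the `_provenance` block itself excluded. The function names `compute_content_hash` and `is_hash_valid` are illustrative, not taken from the validator.

```python
import hashlib
import json


def compute_content_hash(section: dict) -> str:
    """Hash a section's content, ignoring its own _provenance block.

    ASSUMPTION: SHA-256 over key-sorted, compact JSON. The real
    validator may canonicalize or hash differently.
    """
    content = {k: v for k, v in section.items() if k != "_provenance"}
    canonical = json.dumps(
        content, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_hash_valid(section: dict) -> bool:
    """Return True if the stored content_hash matches the recomputed one."""
    provenance = section.get("_provenance") or {}
    stored = provenance.get("content_hash")
    return stored is not None and stored == compute_content_hash(section)
```

Under this scheme a section counts toward "Valid content_hashes" only if the recomputed digest equals the stored one, so any edit to the section after enrichment would show up as an invalid hash.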
### Coverage Rates
_provenance coverage: 100.0%
content_hash coverage: 100.0%
wasDerivedFrom coverage: 97.0%
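
The coverage rates above follow directly from the section counts. A small sketch of the arithmetic, assuming rates are rounded to one decimal place (the helper name `coverage` is illustrative):

```python
def coverage(with_field: int, total: int) -> float:
    """Percentage of sections carrying a field, rounded to one decimal."""
    if total == 0:
        return 0.0
    return round(100.0 * with_field / total, 1)


sections_checked = 2509
print(coverage(2509, sections_checked))  # _provenance and content_hash
print(coverage(2434, sections_checked))  # wasDerivedFrom
```

The gap between 2,509 and 2,434 means 75 sections carry a provenance block without a wasDerivedFrom link.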
### By Enrichment Section
----------------------------------------
google_maps_enrichment:
  Total sections: 351
  With _provenance: 351 (100.0%)
  With content_hash: 351 (100.0%)
  With wasDerivedFrom: 348 (99.1%)

web_enrichment:
  Total sections: 167
  With _provenance: 167 (100.0%)
  With content_hash: 167 (100.0%)
  With wasDerivedFrom: 164 (98.2%)

wikidata_enrichment:
  Total sections: 1,875
  With _provenance: 1,875 (100.0%)
  With content_hash: 1,875 (100.0%)
  With wasDerivedFrom: 1,810 (96.5%)

youtube_enrichment:
  Total sections: 106
  With _provenance: 106 (100.0%)
  With content_hash: 106 (100.0%)
  With wasDerivedFrom: 102 (96.2%)

zcbs_enrichment:
  Total sections: 10
  With _provenance: 10 (100.0%)
  With content_hash: 10 (100.0%)
  With wasDerivedFrom: 10 (100.0%)
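
As a consistency check, the per-section figures can be summed and compared with the overall totals reported for the YAML custodian files. A short sketch using only the numbers in this report (the dictionary layout is illustrative):

```python
# (total sections, sections with wasDerivedFrom), copied from the report
by_section = {
    "google_maps_enrichment": (351, 348),
    "web_enrichment": (167, 164),
    "wikidata_enrichment": (1875, 1810),
    "youtube_enrichment": (106, 102),
    "zcbs_enrichment": (10, 10),
}

total_sections = sum(total for total, _ in by_section.values())
total_derived = sum(derived for _, derived in by_section.values())

# Sections per enrichment type that lack a wasDerivedFrom link
missing = {
    name: total - derived
    for name, (total, derived) in by_section.items()
    if total != derived
}

print(total_sections, total_derived)
print(missing)
```

The sums match the overall counts (2,509 sections, 2,434 with wasDerivedFrom), and most of the missing links fall in wikidata_enrichment.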
## JSON Person Entity Files
----------------------------------------
Total files scanned: 3,000
Files with web_claims: 115
Claims checked: 220
Claims with provenance: 194
Claims provenance coverage: 88.2%
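
The report does not show the layout of a person entity file. The sketch below assumes, hypothetically, that each file holds a `web_claims` list of objects and that a claim counts as covered when it has a non-empty `provenance` key. Both the key name and the helper `count_claims` are assumptions for illustration.

```python
def count_claims(entity: dict) -> tuple[int, int]:
    """Return (claims checked, claims with provenance) for one entity.

    ASSUMPTION: claims live under "web_claims" and carry a
    "provenance" key; the real schema may differ.
    """
    claims = entity.get("web_claims") or []
    with_provenance = sum(
        1 for claim in claims
        if isinstance(claim, dict) and claim.get("provenance")
    )
    return len(claims), with_provenance


# Hypothetical entity with one covered and one uncovered claim
example = {
    "web_claims": [
        {"claim": "role", "provenance": {"source": "example"}},
        {"claim": "employer"},
    ]
}
print(count_claims(example))
```

Summing these pairs over all scanned files would give figures comparable to the 220 claims checked and 194 claims with provenance reported above.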
## Error Summary
----------------------------------------
YAML parsing errors: 0
JSON parsing errors: 0
======================================================================
VALIDATION STATUS
======================================================================
STATUS: PASSED
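All content hashes validated and no parsing errors were found.
Note: 75 of 2,509 sections lack wasDerivedFrom, and 26 of 220 claims lack provenance.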