glam/reports/provenance_validation_sample.txt
2025-12-30 03:43:31 +01:00

======================================================================
PROVENANCE VALIDATION REPORT
Generated: 2025-12-29T03:17:37.406108
======================================================================
## YAML Custodian Files
----------------------------------------
Total files scanned: 3,000
Files with enrichment: 1,909
Files with provenance: 1,909
Sections checked: 2,487
Sections with _provenance: 2,487
Sections with content_hash: 2,487
Sections with wasDerivedFrom: 2,271
Valid content_hashes: 2,487
Invalid content_hashes: 0
### Coverage Rates
_provenance coverage: 100.0%
content_hash coverage: 100.0%
wasDerivedFrom coverage: 91.3%
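
The coverage rates above follow from the section counts. A minimal Python sketch, assuming each rate is the field count divided by the number of sections checked, rounded to one decimal place. The function name `coverage_pct` and the zero-total behaviour are hypothetical and not taken from the validator itself:

```python
def coverage_pct(with_field: int, total: int) -> float:
    """Return with_field / total as a percentage, rounded to one decimal."""
    if total == 0:
        # Assumption: an empty input reports 0.0 instead of raising.
        return 0.0
    return round(100.0 * with_field / total, 1)


# Counts copied from the report above.
sections_checked = 2487
print("_provenance coverage:", coverage_pct(2487, sections_checked))
print("content_hash coverage:", coverage_pct(2487, sections_checked))
print("wasDerivedFrom coverage:", coverage_pct(2271, sections_checked))
```

The same calculation applied to the person-entity claims (216 of 245) gives the 88.2% figure reported further down.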
### By Enrichment Section
----------------------------------------
google_maps_enrichment:
  Total sections: 370
  With _provenance: 370 (100.0%)
  With content_hash: 370 (100.0%)
  With wasDerivedFrom: 363 (98.1%)
web_enrichment:
  Total sections: 172
  With _provenance: 172 (100.0%)
  With content_hash: 172 (100.0%)
  With wasDerivedFrom: 170 (98.8%)
wikidata_enrichment:
  Total sections: 1,849
  With _provenance: 1,849 (100.0%)
  With content_hash: 1,849 (100.0%)
  With wasDerivedFrom: 1,644 (88.9%)
youtube_enrichment:
  Total sections: 83
  With _provenance: 83 (100.0%)
  With content_hash: 83 (100.0%)
  With wasDerivedFrom: 81 (97.6%)
zcbs_enrichment:
  Total sections: 13
  With _provenance: 13 (100.0%)
  With content_hash: 13 (100.0%)
  With wasDerivedFrom: 13 (100.0%)
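
The per-section figures can be reconciled against the overall totals. The sketch below is only an illustrative cross-check on the numbers printed in this report, not part of the validator; the variable names are hypothetical:

```python
# Per-section counts copied from the report: (total sections, with wasDerivedFrom).
sections = {
    "google_maps_enrichment": (370, 363),
    "web_enrichment": (172, 170),
    "wikidata_enrichment": (1849, 1644),
    "youtube_enrichment": (83, 81),
    "zcbs_enrichment": (13, 13),
}

# Sum the per-section counts to compare against the report-level totals.
total_sections = sum(total for total, _ in sections.values())
total_derived = sum(derived for _, derived in sections.values())

# Sections lacking wasDerivedFrom, keyed by enrichment type (complete types omitted).
missing = {
    name: total - derived
    for name, (total, derived) in sections.items()
    if total != derived
}

print(total_sections, total_derived)
print(missing)
```

The sums match the 2,487 sections checked and the 2,271 sections with wasDerivedFrom. Of the 216 sections without wasDerivedFrom, 205 are in wikidata_enrichment.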
## JSON Person Entity Files
----------------------------------------
Total files scanned: 3,000
Files with web_claims: 125
Claims checked: 245
Claims with provenance: 216
Claims provenance coverage: 88.2%
## Error Summary
----------------------------------------
YAML parsing errors: 0
JSON parsing errors: 0
======================================================================
VALIDATION STATUS
======================================================================
STATUS: PASSED
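All 2,487 content hashes are valid and no parsing errors occurred.
Coverage gaps remain: wasDerivedFrom is present on 91.3% of sections, and 88.2% of person-entity web claims carry provenance.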