glam/reports/provenance_validation_report.txt
2025-12-30 03:43:31 +01:00

======================================================================
PROVENANCE VALIDATION REPORT
Generated: 2025-12-28T23:37:00.213858
======================================================================
## YAML Custodian Files
----------------------------------------
Total files scanned: 29,073
Files with enrichment: 18,475
Files with provenance: 18,475
Sections checked: 24,277
Sections with _provenance: 24,277
Sections with content_hash: 24,277
Sections with wasDerivedFrom: 21,426
Valid content_hashes: 24,277
Invalid content_hashes: 0
### Coverage Rates
_provenance coverage: 100.0%
content_hash coverage: 100.0%
wasDerivedFrom coverage: 88.3%
### By Enrichment Section
----------------------------------------
google_maps_enrichment:
  Total sections: 3,564
  With _provenance: 3,564 (100.0%)
  With content_hash: 3,564 (100.0%)
  With wasDerivedFrom: 3,520 (98.8%)
web_enrichment:
  Total sections: 1,708
  With _provenance: 1,708 (100.0%)
  With content_hash: 1,708 (100.0%)
  With wasDerivedFrom: 1,671 (97.8%)
wikidata_enrichment:
  Total sections: 17,898
  With _provenance: 17,898 (100.0%)
  With content_hash: 17,898 (100.0%)
  With wasDerivedFrom: 15,894 (88.8%)
youtube_enrichment:
  Total sections: 965
  With _provenance: 965 (100.0%)
  With content_hash: 965 (100.0%)
  With wasDerivedFrom: 341 (35.3%)
zcbs_enrichment:
  Total sections: 142
  With _provenance: 142 (100.0%)
  With content_hash: 142 (100.0%)
  With wasDerivedFrom: 0 (0.0%)
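
The per-section figures above can be cross-checked against the overall wasDerivedFrom coverage with a few lines of arithmetic. The following is a minimal sketch, not the validator's own code: the `sections` mapping and the `coverage` helper are hypothetical names introduced here, and the counts are copied by hand from this report.

```python
# Counts copied from the "By Enrichment Section" figures in this report:
# section name -> (total sections, sections with wasDerivedFrom).
sections = {
    "google_maps_enrichment": (3564, 3520),
    "web_enrichment": (1708, 1671),
    "wikidata_enrichment": (17898, 15894),
    "youtube_enrichment": (965, 341),
    "zcbs_enrichment": (142, 0),
}


def coverage(with_field: int, total: int) -> float:
    """Percentage of sections carrying the field, rounded to one decimal."""
    return round(100.0 * with_field / total, 1) if total else 0.0


# Aggregate across all enrichment sections.
total_sections = sum(total for total, _ in sections.values())
total_derived = sum(derived for _, derived in sections.values())

# Per-section coverage and the number of sections still lacking the field.
for name, (total, derived) in sections.items():
    print(f"{name}: {coverage(derived, total)}% ({total - derived:,} missing)")

print(
    f"overall: {coverage(total_derived, total_sections)}% "
    f"({total_sections - total_derived:,} of {total_sections:,} missing)"
)
```

Summing the per-section counts this way reproduces the report's totals (24,277 sections checked, 21,426 with wasDerivedFrom) and shows that most of the gap comes from wikidata_enrichment, youtube_enrichment, and zcbs_enrichment.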
## JSON Person Entity Files
----------------------------------------
Total files scanned: 8,887
Files with web_claims: 399
Claims checked: 761
Claims with provenance: 671
Claims provenance coverage: 88.2%
## Error Summary
----------------------------------------
YAML parsing errors: 2
JSON parsing errors: 0
YAML Errors:
  JP-14-YOK-L-ISTSL.yaml: Failed to parse YAML: while parsing a block mapping
    in "/Users/kempersc/apps/glam/data/custodian/JP-14-YOK-L-ISTSL.yaml", line 1, column 1
  expected <block end>, but found '<scalar>'
    in "/Users/kempersc/apps/glam/data/custodian/JP-14-YOK-L-ISTSL.yaml", line 230, column 2
  CZ-52-PAB-L-IPVVZOVI.yaml: Failed to parse YAML: while parsing a block mapping
    in "/Users/kempersc/apps/glam/data/custodian/CZ-52-PAB-L-IPVVZOVI.yaml", line 218, column 3
  expected <block end>, but found '<block mapping start>'
    in "/Users/kempersc/apps/glam/data/custodian/CZ-52-PAB-L-IPVVZOVI.yaml", line 232, column 4
======================================================================
VALIDATION STATUS
======================================================================
STATUS: ISSUES FOUND
- 2 YAML parsing errors