glam/data/nde/enriched
kempersc 55e2cd2340 feat: implement LLM-based extraction for Archives Lab content
- Introduced `llm_extract_archiveslab.py` script for entity and relationship extraction using LLMAnnotator with GLAM-NER v1.7.0.
- Replaced regex-based extraction with generative LLM inference.
- Added functions for loading markdown content, converting annotation sessions to dictionaries, and generating extraction statistics.
- Implemented comprehensive logging of extraction results, including counts of entities, relationships, and specific types like heritage institutions and persons.
- Results and statistics are saved in JSON format for further analysis.
2025-12-05 23:16:21 +01:00
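The commit message above mentions converting annotation sessions to dictionaries, generating extraction statistics, and saving the results as JSON. A minimal sketch of what those steps might look like follows. It does not reproduce `llm_extract_archiveslab.py` or the `LLMAnnotator` API: the function names, the session dictionary layout (`entities`, `relationships`, `type`), and the type labels are all assumptions made for illustration.

```python
import json
from collections import Counter
from pathlib import Path


def load_markdown(path):
    """Read a markdown file as UTF-8 text (hypothetical helper)."""
    return Path(path).read_text(encoding="utf-8")


def extraction_stats(session):
    """Summarise a session dict.

    Assumes (not confirmed by the source) that a session has been converted
    to a dict with 'entities' and 'relationships' lists, and that each
    entity carries a 'type' key.
    """
    entities = session.get("entities", [])
    relationships = session.get("relationships", [])
    by_type = Counter(e.get("type", "UNKNOWN") for e in entities)
    return {
        "entity_count": len(entities),
        "relationship_count": len(relationships),
        "entities_by_type": dict(by_type),
    }


def save_json(data, path):
    """Write results or statistics to a JSON file."""
    text = json.dumps(data, indent=2, ensure_ascii=False)
    Path(path).write_text(text, encoding="utf-8")


# Hypothetical session data; the type labels are placeholders.
example_session = {
    "entities": [
        {"text": "Nationaal Archief", "type": "HERITAGE_INSTITUTION"},
        {"text": "Archives Lab", "type": "HERITAGE_INSTITUTION"},
        {"text": "Jane Doe", "type": "PERSON"},
    ],
    "relationships": [
        {"source": "Jane Doe", "target": "Archives Lab", "type": "AFFILIATED_WITH"},
    ],
}

example_stats = extraction_stats(example_session)
```

Calling `save_json(example_stats, "stats.json")` would then write the summary to disk. Keeping the statistics as a plain dictionary means they can be logged and serialised with the same structure.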
| Name | Last commit | Date |
| --- | --- | --- |
| entries | feat: implement LLM-based extraction for Archives Lab content | 2025-12-05 23:16:21 +01:00 |
| llm_annotations | annotation standards added | 2025-12-05 15:30:23 +01:00 |
| sources | update enriched entries | 2025-12-03 17:38:46 +01:00 |
| _entries_needing_urls.yaml | | |
| collision_analysis_detailed.json | enrich entries | 2025-12-01 16:06:34 +01:00 |
| collision_review.tsv | enrich entries | 2025-12-01 16:06:34 +01:00 |
| DATASET_STATUS.md | enrich entries | 2025-12-01 16:06:34 +01:00 |
| enrichment_log_20251127_204405.json | | |
| enrichment_stats_all_20251130_185147.json | update entries | 2025-11-30 23:30:29 +01:00 |
| enrichment_stats_kb_isil_20251130_143105.json | update entries | 2025-11-30 23:30:29 +01:00 |
| enrichment_stats_kb_isil_20251130_143124.json | update entries | 2025-11-30 23:30:29 +01:00 |
| enrichment_stats_kb_isil_20251130_181016.json | update entries | 2025-11-30 23:30:29 +01:00 |
| enrichment_stats_museum_register_20251130_142823.json | update entries | 2025-11-30 23:30:29 +01:00 |
| enrichment_stats_museum_register_20251130_180651.json | update entries | 2025-11-30 23:30:29 +01:00 |
| enrichment_stats_museum_register_20251130_180745.json | update entries | 2025-11-30 23:30:29 +01:00 |
| enrichment_stats_museum_register_20251130_180941.json | update entries | 2025-11-30 23:30:29 +01:00 |
| enrichment_stats_na_isil_20251130_143933.json | update entries | 2025-11-30 23:30:29 +01:00 |
| enrichment_stats_na_isil_20251130_144030.json | update entries | 2025-11-30 23:30:29 +01:00 |
| enrichment_stats_na_isil_20251130_144228.json | update entries | 2025-11-30 23:30:29 +01:00 |
| entity_cache.json | | |
| ghcid_collision_report.json | update enriched entries | 2025-12-03 17:38:46 +01:00 |
| google_maps_enrichment_stats_20251128_110230.json | | |
| google_maps_enrichment_stats_20251128_132314.json | | |
| kb_google_maps_enrichment_stats_20251128_132609.json | | |
| kb_wikidata_enrichment_stats_20251128_132240.json | | |
| kien_ghcid_collision_report.json | annotation standards added | 2025-12-05 15:30:23 +01:00 |