# NDE URL Discovery Report

**Date**: 2025-12-01
**Dataset**: NDE Enriched Entries
**Total Entries**: 1,674

## Summary Statistics

| Category | Count | Percentage |
|----------|-------|------------|
| With URL (any source) | 1,621 | 96.8% |
| Without URL | 53 | 3.2% |
| - With Wikidata ID | 31 | - |
| - Without Wikidata ID | 22 | - |

## URLs Discovered via Wikidata P856

Found official websites for **16 entries** by querying Wikidata property P856:

| Entry ID | Wikidata | Institution | URL |
|----------|----------|-------------|-----|
| 0006 | Q81181279 | Rijckheyt | http://www.rijckheyt.nl |
| 0011 | Q110907483 | Stichting Nyen ä Wes van Nassau | https://www.nyenaenwasvannassau.nl/site/ |
| 0047 | Q110907518 | HKK Zuidkwartier | https://www.hkk-zuidkwartier.nl/ |
| 0059 | Q110907493 | Heemkunde Ravenstein | http://www.heemkunderavenstein.nl/ |
| 0107 | Q110907545 | Heemkunde Gemonde | http://www.heemkundegemonde.nl/ |
| 0135 | Q110995942 | Saet en Cruyt | https://www.saetencruyt.nl/ |
| 0211 | Q701 | Provincie Noord-Holland | https://www.noord-holland.nl/ |
| 0387 | Q2370540 | Collectie Six | http://nl.collectiesix.nl/ |
| 0507 | Q113119338 | Stichting Omroep Muziek | https://www.omroepmuziek.nl/ |
| 0700 | Q121225126 | Stadsarchief Oldenzaal | https://www.oldenzaal.nl/stadsarchief |
| 0839 | Q110279958 | Stichting Utrecht Altijd | https://www.utrechtaltijd.nl/ |
| 1022 | Q111509619 | Stichting Historische Behangsels | https://www.historischebehangsels.nl |
| 1195 | Q2334800 | Verzetsmuseum Zuid-Holland | http://www.verzetsmuseum-zh.nl |
| 1268 | Q2248381 | Spaarnestad Photo | http://www.spaarnestadphoto.nl |
| 1637 | Q59962362 | Bibliotheek De Groene Venen | https://www.bibliotheekdegroenevenen.nl |
| 0212 | Q1897962 | Geldmuseum | *Merged into DNB "De Nieuwe Schatkamer"* |

## URLs Discovered via Web Search (Exa)

Found URLs for **10 additional entries**:

| Entry ID | Institution | URL | Notes |
|----------|-------------|-----|-------|
| 0199 | Bibliotheek TU Kampen | https://tuu.nl/bibliotheek/ | Merged with TU Utrecht |
| 0389 | Heemkunde Ambt-Delden | https://www.heemkundedelden.nl/ | |
| 0413 | Historische Vereniging Old Deep'n | https://www.olddeepn.nl/ | Diepenveen |
| 0594 | Sambeeks Heem | https://www.sambeeksheem.nl/ | |
| 0627 | Stichting Weeshuisjes | https://www.weeshuisjes.nl/ | |
| 0715 | HDC Protestants Erfgoed | https://www.hdcvu.nl/ | Within VU Library |
| 0729 | Historische Vereniging Staphorst | https://www.historischeverenigingstaphorst.nl/ | |
| 0851 | Historische Vereniging Den Dolder | https://www.historischeverenigingdendolder.nl/ | |
| 1170 | Nederlandse Vereniging voor Papierknipkunst | https://papierknippen.nl/ | Original NDE entry |
| 1504 | Nederlandse Vereniging voor Papierknipkunst | https://papierknippen.nl/ | Same org, from NAN ISIL 2025-11-06 |

## Entries Without Dedicated Websites (Parent Organization Only)

| Entry ID | Institution | Parent Organization | Notes |
|----------|-------------|---------------------|-------|
| 1512 | Diocesane Commissie Kerkelijk Kunstbezit | Bisdom Roermond (https://bisdom-roermond.org/) | Commission managing diocesan church art - operates under diocese, no dedicated website |

## Problematic Entries

### NOT Heritage Custodians

| Entry ID | Wikidata | Name | Issue |
|----------|----------|------|-------|
| 0874 | Q2789869 | HEEMAF | Was an electrical equipment manufacturer (1906-1973), NOT a heritage institution |

**Recommendation**: Remove from dataset or mark as `status: NOT_HERITAGE`

### Closed/Defunct

| Entry ID | Wikidata | Name | Issue |
|----------|----------|------|-------|
| 1130 | Q110282061 | Museum Oud Westdorpe | Permanently closed |

**Recommendation**: Mark as `status: CLOSED`

### Duplicate Entries

| Entry IDs | Wikidata | Name | Issue |
|-----------|----------|------|-------|
| 0875, 0993 | Q110891769 | (Same institution) | Duplicate Wikidata ID |
| 0967, 0950 | - | (Same institution) | Duplicate entries |

**Recommendation**: Merge records, keep one canonical entry

## Entries Still Without URLs (~16)

These entries have no discoverable website:

| Entry ID | Name | Notes |
|----------|------|-------|
| Various | Small heemkundige kringen | May only have Facebook presence |
| Various | Historical societies | May be defunct or volunteer-run |
| 0210 | Greccio Museum (Leiden) | Located inside Hartebrug church, no dedicated website |

## Actions Taken

1. ✅ Created `docs/nde/` directory
2. ✅ Created this URL Discovery Report
3. ✅ Updated 25 YAML files with discovered URLs (15 Wikidata + 9 web search + 1 dedicated page)
4. ✅ Flagged problematic entries:
   - `0874_Q2789869.yaml` (HEEMAF): `NOT_HERITAGE` - was electrical manufacturer
   - `1130_Q110282061.yaml` (Museum Oud Westdorpe): `CLOSED` - permanently closed
   - `0875_Q110891769.yaml` & `0993_Q110891769.yaml`: `DUPLICATE` - same Wikidata ID
   - `0950_unknown.yaml` & `0967_Q110891782.yaml`: `DUPLICATE` - same institution

## Final URL Coverage

| Category | Count | Percentage |
|----------|-------|------------|
| **With URL** | 1,619 | **96.7%** |
| **Without URL** | 55 | 3.3% |
| **Flagged (duplicates, not heritage, closed)** | 5 | 0.3% |

## Remaining Entries Without URLs (55)

These entries have no discoverable website. Many are:

- Small heemkundige kringen (local history societies)
- Entries with Wikidata IDs but no P856 property
- Defunct or volunteer-run organizations
- Facebook-only presence

**Example entries still missing URLs:**

- `0001_Q2679819`: Stichting Hunebedcentrum (has Wikidata, needs P856 check)
- `0139_de_hollandse_cirkel`: De Hollandse Cirkel (no Wikidata)
- `0144_Q2710899`: Nationaal Onderduik Museum (has Wikidata)
- `0148_Q69725772`: CollectieGelderland (has Wikidata)
- Various small historical societies in Overijssel

## Recommendations

1. **Query Wikidata again** for entries with Q-numbers but no P856 discovered
2. **Web search** remaining entries with organization names
3. **Check for dedicated pages** on umbrella organization websites (like Museum Greccio on Hartebrug church site)
4. **Mark as `url_status: NOT_FOUND`** for entries where no web presence exists
5. **Consider Facebook URLs** as fallback for small volunteer organizations

## Technical Notes

- Added `digital_platforms` section with proper `platform_type` classification:
  - `OFFICIAL_WEBSITE` for standalone websites
  - `DEDICATED_PAGE` for pages on host websites (like Museum Greccio)
- Preserved `data_tier` in provenance:
  - `TIER_2_VERIFIED` for Wikidata P856 URLs
  - `TIER_4_INFERRED` for web search discovered URLs

---

*Generated by NDE URL Discovery workflow*
*Last updated: 2025-12-01T16:30:00+00:00*

## Session Updates - December 2025

### 2025-12-01: NAN ISIL Batch Corrections

Fixed several issues with entries 1502-1513 (NAN ISIL 2025-11-06 batch):

| Entry ID | Issue | Resolution |
|----------|-------|------------|
| 1504 | Missing URL | Added https://papierknippen.nl/ (discovered via Exa) |
| 1508 | Wrong custodian_name ("Cookiesbeleid") | Corrected to "Parochiearchief Kampen" from NAN ISIL registry |
| 1508 | Wrong institution type (U) | Changed to A (Archive) |
| 1508 | Incorrect GHCID | Regenerated: NL-OV-KAM-A-PK |
| 1511 | Wrong custodian_name (exhibition title) | Already corrected to "Wereldmuseum Leiden" |
| 1511 | Wrong institution type (U) | Changed to M (Museum) |
| 1511 | GHCID based on exhibition title | Regenerated: NL-ZH-LEI-M-WL |
| 1512 | Wrong institution type (U) | Changed to H (Holy Sites - diocesan heritage commission) |
| 1512 | No website info | Added parent organization note (Bisdom Roermond) |
| 1512 | Incorrect GHCID | Regenerated: NL-LI-ROE-H-DCKK |
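
### Sketch: Re-querying P856 and Recording Provenance

As a follow-up to the first recommendation (re-querying Wikidata for entries that have a Q-number but no P856 yet) and to the tiering described in the Technical Notes, the lookup could be sketched roughly as below. This is a minimal illustration, not the workflow's actual code: the function names, the `source` key, and the exact layout of the platform record are assumptions, and only `platform_type`, `data_tier`, `OFFICIAL_WEBSITE`, `TIER_2_VERIFIED`, and `TIER_4_INFERRED` come from this report. The sample payload is hand-written in the standard SPARQL JSON results shape.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Tier mapping taken from the Technical Notes; the source labels are hypothetical.
DATA_TIER_BY_SOURCE = {
    "wikidata_p856": "TIER_2_VERIFIED",
    "web_search": "TIER_4_INFERRED",
}


def build_p856_query(qids):
    """Build one SPARQL query asking for P856 (official website) of many items."""
    values = " ".join(f"wd:{qid}" for qid in qids)
    return (
        "SELECT ?item ?url WHERE { "
        f"VALUES ?item {{ {values} }} "
        "?item wdt:P856 ?url . }"
    )


def fetch_bindings(query, user_agent="nde-url-discovery-sketch/0.1"):
    """Send the query to WDQS and return the decoded JSON (needs network access)."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        f"{WDQS_ENDPOINT}?{params}", headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def parse_p856_bindings(payload):
    """Map Q-number -> first official website in a SPARQL JSON result."""
    found = {}
    for row in payload["results"]["bindings"]:
        qid = row["item"]["value"].rsplit("/", 1)[-1]
        found.setdefault(qid, row["url"]["value"])  # keep the first URL only
    return found


def platform_record(url, source):
    """Build a hypothetical `digital_platforms` item with its provenance tier."""
    return {
        "platform_type": "OFFICIAL_WEBSITE",
        "url": url,
        "provenance": {"source": source, "data_tier": DATA_TIER_BY_SOURCE[source]},
    }


# Offline example using a row from the P856 table in this report.
sample_payload = {
    "results": {
        "bindings": [
            {
                "item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q81181279"},
                "url": {"type": "uri", "value": "http://www.rijckheyt.nl"},
            }
        ]
    }
}
urls = parse_p856_bindings(sample_payload)
record = platform_record(urls["Q81181279"], "wikidata_p856")
```

Entries whose Q-number is absent from the parsed result would remain candidates for web search or for `url_status: NOT_FOUND`. Batching several Q-numbers into one `VALUES` clause keeps the number of requests to the query service low.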