Audit of 188 Type I custodian files revealed:
- 62 false matches (33%) detected and corrected
- Categories: domain mismatch (39), name mismatch (8), wrong location (6), wrong org type (5), different entity (3), different event (3)
- Documents why Google Maps fails for intangible heritage: virtual orgs, person-based heritage, volunteer networks, event-based orgs

This validates KIEN as TIER_1_AUTHORITATIVE for Type I custodians.
Rule 40: KIEN Registry is Authoritative for Intangible Heritage Custodians
Summary
For Intangible Heritage Custodians (Type I), the KIEN registry at https://www.immaterieelerfgoed.nl/ is the TIER_1_AUTHORITATIVE source for contact data and addresses. Google Maps enrichment is TIER_3_CROWD_SOURCED and should NEVER override KIEN data.
Empirical Validation (January 2025)
A comprehensive audit of 188 Type I custodian files revealed:
| Category | Count | Percentage |
|---|---|---|
| ✅ Google Maps matches OK | 101 | 53.7% |
| 🔧 FALSE_MATCH detected | 62 | 33.0% |
| ⚠️ No official website (valid) | 20 | 10.6% |
| 📭 No Google Maps data | 5 | 2.7% |
Key Finding: 33% of Google Maps enrichment data for Type I custodians was incorrect.
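The percentages in the table can be reproduced from the raw counts. The snippet below is only an arithmetic check; the dictionary keys are illustrative names, not fields from the custodian files.

```python
# Counts taken from the audit table above; key names are illustrative.
audit_counts = {
    "google_maps_ok": 101,
    "false_match": 62,
    "no_official_website": 20,
    "no_google_maps_data": 5,
}

total = sum(audit_counts.values())  # number of audited Type I files


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


percentages = {name: share(n, total) for name, n in audit_counts.items()}
```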
False Match Categories Identified
- Domain mismatches (39 files): Google Maps website ≠ KIEN official website
- Name mismatches (8 files): Completely different organizations (e.g., "Ria Bos" heritage practitioner → "Ria Money Transfer Agent")
- Wrong location (6 files): Similar name but a different city or even country (e.g., Amsterdam → Den Haag, Netherlands → Suriname)
- Wrong organization type (5 files): Federation vs specific member, heritage org vs webshop
- Different entity type (3 files): Organization vs location/street name
- Different event (3 files): Horse racing vs festival, different village's event
Why Google Maps Fails for Type I
Google Maps is optimized for commercial businesses with physical storefronts. Type I intangible heritage custodians are fundamentally different:
- Virtual organizations without commercial presence
- Person-based heritage (individual practitioners preserving traditional crafts)
- Volunteer networks meeting in private residences
- Event-based organizations that exist only during festivals
- Federations that coordinate member organizations without own premises
Rationale
Google Maps frequently returns false matches for intangible heritage organizations because:
- Virtual Organizations: Many intangible heritage custodians operate as networks/platforms without commercial storefronts
- Name Collisions: Common words in organization names (e.g., "Platform") match unrelated businesses
- No Physical Presence: Organizations focused on intangible heritage (handwriting, oral traditions, crafts) often have no Google Maps listing
- Volunteer-Run: Contact addresses are often private residences, not businesses
KIEN (Kenniscentrum Immaterieel Erfgoed Nederland) is the official Dutch registry for intangible cultural heritage and maintains verified contact information directly from the organizations.
Data Tier Hierarchy for Type I Custodians
| Priority | Source | Data Tier | Trust Level |
|---|---|---|---|
| 1st | KIEN Registry (immaterieelerfgoed.nl) | TIER_1_AUTHORITATIVE | Highest |
| 2nd | Organization's Official Website | TIER_2_VERIFIED | High |
| 3rd | Wikidata | TIER_3_CROWD_SOURCED | Medium |
| 4th | Google Maps | TIER_3_CROWD_SOURCED | Low (verify!) |
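One way to apply this hierarchy is to walk the sources in priority order and take the first non-empty value. This is a minimal sketch, not the project's implementation: the source keys (`kien`, `official_website`, `wikidata`, `google_maps`) and the `pick_value` helper are hypothetical names chosen for illustration.

```python
from typing import Any, Dict, Optional, Tuple

# Source priority for Type I custodians, highest trust first (from the table above).
# The key names are hypothetical; adapt them to the real enrichment blocks.
SOURCE_PRIORITY = ["kien", "official_website", "wikidata", "google_maps"]


def pick_value(field: str, candidates: Dict[str, Dict[str, Any]]) -> Optional[Tuple[str, Any]]:
    """Return (source, value) from the highest-priority source that has the field."""
    for source in SOURCE_PRIORITY:
        value = candidates.get(source, {}).get(field)
        if value:  # skip missing or empty values
            return source, value
    return None


candidates = {
    "google_maps": {"city": "Nieuwleusen"},
    "kien": {"city": "Zevenaar"},
}
chosen = pick_value("city", candidates)  # KIEN wins over Google Maps
```

A value that falls through to `google_maps` is still only a low-trust fallback and should be verified before use.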
Required Workflow for Type I Enrichment
Step 1: Scrape KIEN Page First
For every intangible heritage custodian, the KIEN profile page MUST be scraped to extract:
kien_enrichment:
  kien_name: "Platform Handschriftontwikkeling"
  kien_url: "https://www.immaterieelerfgoed.nl/nl/page/2476/platform-handschriftontwikkeling"
  heritage_page_url: "https://www.immaterieelerfgoed.nl/nl/handschrift"
  heritage_forms:
    - "Ambachten, handwerk en techniek"
    - "Sociale praktijken"
  address:
    street: "De Hazelaar 41"
    postal_code: "6903 BB"
    city: "Zevenaar"
    province: "Gelderland"
    country: "NL"
  registered_since: "2019-11"
  enrichment_timestamp: "2025-01-08T00:00:00Z"
  source: "https://www.immaterieelerfgoed.nl"
Step 2: Validate Google Maps Match (If Any)
If Google Maps enrichment exists, compare against KIEN data:
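from urllib.parse import urlparse

# Assumption: fuzz comes from rapidfuzz; thefuzz offers the same fuzz.ratio(a, b) -> 0..100.
from rapidfuzz import fuzz


def extract_domain(url):
    """Assumed helper (not shown in the original): lowercase host without 'www.', or None."""
    if not url:
        return None
    host = urlparse(url if '//' in url else f'//{url}').hostname
    if not host:
        return None
    return host[4:] if host.startswith('www.') else host


def validate_google_maps_match(kien_data, gmaps_data):
    """Check if Google Maps data matches KIEN authoritative source."""
    # Check website domain match (skipped when either side has no website)
    kien_domain = extract_domain(kien_data.get('website'))
    gmaps_domain = extract_domain(gmaps_data.get('website'))
    if kien_domain and gmaps_domain and kien_domain != gmaps_domain:
        return {
            'status': 'FALSE_MATCH',
            'reason': f'Website mismatch: KIEN={kien_domain}, GMaps={gmaps_domain}'
        }
    # Check name similarity (fuzz.ratio returns a score from 0 to 100)
    kien_name = (kien_data.get('kien_name') or '').lower()
    gmaps_name = (gmaps_data.get('name') or '').lower()
    if fuzz.ratio(kien_name, gmaps_name) < 70:
        return {
            'status': 'FALSE_MATCH',
            'reason': f'Name mismatch: KIEN="{kien_name}", GMaps="{gmaps_name}"'
        }
    return {'status': 'VERIFIED'}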
Step 3: Mark False Matches
When Google Maps returns a different organization:
google_maps_enrichment:
  status: FALSE_MATCH
  false_match_reason: >-
    Google Maps returned "Platform 9 BV" (a health/coaching business at
    Nieuwleusen) instead of "Platform Handschriftontwikkeling" (a virtual
    handwriting development platform). These are completely different
    organizations. KIEN registry is authoritative for this Type I custodian.
  original_false_match:
    place_id: ChIJNZ6o7H_fx0cR-TURAN3Bj54
    name: Platform 9 BV
    formatted_address: Burg, Burgemeester Backxlaan 321, 7711 AD Nieuwleusen
    website: http://www.platform9.nl/
  correction_timestamp: "2025-01-08T00:00:00Z"
  correction_agent: opencode-claude-sonnet-4
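The marking step can be sketched as a small function that wraps the existing Google Maps payload instead of discarding it. The function name `mark_false_match` and its parameters are hypothetical; only the output field names come from the example above.

```python
import copy
from datetime import datetime, timezone


def mark_false_match(custodian, reason, agent, now=None):
    """Return a copy of the record with its Google Maps data flagged as FALSE_MATCH.

    The original Google Maps payload is kept under original_false_match
    rather than deleted. `now` should be a timezone-aware datetime.
    """
    now = now or datetime.now(timezone.utc)
    updated = copy.deepcopy(custodian)
    gmaps = updated.get("google_maps_enrichment") or {}
    if gmaps.get("status") == "FALSE_MATCH":
        return updated  # already marked; keep the first correction intact
    updated["google_maps_enrichment"] = {
        "status": "FALSE_MATCH",
        "false_match_reason": reason,
        "original_false_match": gmaps,
        "correction_timestamp": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "correction_agent": agent,
    }
    return updated
```

Working on a deep copy leaves the input untouched, so the caller decides when the corrected record is written back to the YAML file.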
KIEN Contact Data Extraction
The KIEN heritage pages follow a consistent structure. Extract from the "Contact" section:
## Contact
[Organization Name](link-to-profile-page)
Street Address
Postal Code
City
Province
[Website](url)
Bijgeschreven in inventaris vanaf: [date]
Example Extraction (from immaterieelerfgoed.nl/nl/handschrift):
contact:
  organization: "Platform Handschriftontwikkeling"
  profile_url: "https://www.immaterieelerfgoed.nl/nl/page/2476/platform-handschriftontwikkeling"
  address:
    street: "De Hazelaar 41"
    postal_code: "6903 BB"
    city: "Zevenaar"
    province: "Gelderland"
  website: "http://www.handschriftontwikkeling.nl/"
  registered_since: "november 2019"
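A parser for this structure might look like the sketch below. It assumes the Contact section has already been converted to the markdown-like text shown in the template, with one field per line in that order. The function names are hypothetical, and converting the Dutch date ("november 2019") to the `YYYY-MM` form used in `kien_enrichment` is an assumption about the desired normalization.

```python
import re

DUTCH_MONTHS = {
    "januari": 1, "februari": 2, "maart": 3, "april": 4, "mei": 5, "juni": 6,
    "juli": 7, "augustus": 8, "september": 9, "oktober": 10, "november": 11, "december": 12,
}
LINK = re.compile(r"\[([^\]]*)\]\(([^)]*)\)")  # [text](url)
REGISTERED_PREFIX = "Bijgeschreven in inventaris vanaf:"


def normalize_registered_since(text):
    """Convert 'november 2019' to '2019-11'; return the stripped input if unparseable."""
    parts = text.strip().lower().split()
    if len(parts) == 2 and parts[0] in DUTCH_MONTHS and parts[1].isdigit():
        return f"{int(parts[1]):04d}-{DUTCH_MONTHS[parts[0]]:02d}"
    return text.strip()


def parse_contact_section(section):
    """Parse the Contact block into a dict shaped like the extraction example."""
    lines = [ln.strip() for ln in section.splitlines() if ln.strip()]
    if lines and lines[0].lstrip("#").strip().lower() == "contact":
        lines = lines[1:]  # drop the section heading
    contact = {"address": {}}
    plain = []
    for line in lines:
        if line.startswith(REGISTERED_PREFIX):
            contact["registered_since"] = normalize_registered_since(line[len(REGISTERED_PREFIX):])
            continue
        match = LINK.fullmatch(line)
        if match:
            if "organization" not in contact:  # first link is the profile link
                contact["organization"], contact["profile_url"] = match.group(1), match.group(2)
            else:  # a later link is the organization's website
                contact["website"] = match.group(2)
            continue
        plain.append(line)
    # Remaining plain lines follow the template order.
    for key, value in zip(("street", "postal_code", "city", "province"), plain):
        contact["address"][key] = value
    return contact
```

The parser relies on line order for the address fields, so a page that omits one of them would shift the rest; real pages should be spot-checked before trusting the output.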
Location Resolution for Type I
When KIEN provides an address:
- Use KIEN address for location.formatted_address
- Geocode KIEN address to get coordinates (NOT Google Maps coordinates)
- Update location_resolution with method KIEN_ADDRESS_GEOCODE
location:
  street_address: "De Hazelaar 41"
  postal_code: "6903 BB"
  city: Zevenaar
  region_code: GE
  country: NL
  coordinate_provenance:
    source_type: KIEN_ADDRESS_GEOCODE
    source_url: "https://www.immaterieelerfgoed.nl/nl/handschrift"
    geocoding_service: nominatim
    geocoding_timestamp: "2025-01-08T00:00:00Z"
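Building such a location block from KIEN data could be sketched as follows. The geocoder is passed in as a callable so the sketch makes no claim about a particular geocoding client. The `build_location` name and the `latitude`/`longitude` keys are hypothetical, and the mapping from province to `region_code` is left out.

```python
def build_location(kien, geocode, service, timestamp):
    """Build a location block from the KIEN address.

    geocode: callable taking an address string and returning (lat, lon) or None.
    service: name of the geocoding service, recorded in the provenance.
    """
    address = kien["address"]
    country = address.get("country", "NL")
    query = f'{address["street"]}, {address["postal_code"]} {address["city"]}, {country}'
    location = {
        "street_address": address["street"],
        "postal_code": address["postal_code"],
        "city": address["city"],
        "country": country,
    }
    coords = geocode(query)
    if coords is not None:
        # Coordinates come from geocoding the KIEN address, never from Google Maps.
        location["latitude"], location["longitude"] = coords
        location["coordinate_provenance"] = {
            "source_type": "KIEN_ADDRESS_GEOCODE",
            "source_url": kien["heritage_page_url"],
            "geocoding_service": service,
            "geocoding_timestamp": timestamp,
        }
    return location
```

When the geocoder finds nothing, the address is still recorded but no provenance block is written, so a missing `coordinate_provenance` signals that coordinates are still outstanding.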
Batch Re-Enrichment Script
To fix all Type I custodians with potentially incorrect Google Maps data:
# Find all Type I custodians
python scripts/rescrape_kien_contacts.py --type I --output data/custodian/
# This script should:
# 1. Read all NL-*-I-*.yaml files
# 2. Fetch KIEN page for each (from kien_enrichment.kien_url)
# 3. Extract contact/address from KIEN
# 4. Compare with google_maps_enrichment
# 5. Mark mismatches as FALSE_MATCH
# 6. Update location with KIEN address
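The six steps could be organized roughly as below. This is a sketch under stated assumptions, not the contents of `scripts/rescrape_kien_contacts.py`: YAML loading and saving are omitted, fetching and validation are injected as callables, and the `formatted_address` layout is an assumed format.

```python
from pathlib import Path


def find_type_i_files(root):
    """Step 1: list all Type I custodian files under root."""
    return sorted(Path(root).glob("NL-*-I-*.yaml"))


def reenrich_record(record, fetch_kien_contact, validate):
    """Apply steps 2-6 to one parsed custodian record and return an updated copy.

    fetch_kien_contact(url) -> contact dict; validate(contact, gmaps) -> dict with
    'status' and, for false matches, 'reason'. Both are supplied by the caller.
    """
    updated = dict(record)
    kien_url = (record.get("kien_enrichment") or {}).get("kien_url")
    if not kien_url:
        return updated  # nothing authoritative to compare against
    contact = fetch_kien_contact(kien_url)  # steps 2-3
    gmaps = record.get("google_maps_enrichment")
    if gmaps and gmaps.get("status") != "FALSE_MATCH":
        result = validate(contact, gmaps)  # step 4
        if result["status"] == "FALSE_MATCH":  # step 5: keep the original payload
            updated["google_maps_enrichment"] = {
                "status": "FALSE_MATCH",
                "false_match_reason": result["reason"],
                "original_false_match": gmaps,
            }
    address = contact.get("address")
    if address:  # step 6: location is rebuilt from KIEN only
        updated["location"] = {
            "formatted_address": f'{address["street"]}, {address["postal_code"]} {address["city"]}',
            "street_address": address["street"],
            "postal_code": address["postal_code"],
            "city": address["city"],
        }
    return updated
```

Coordinates are deliberately not carried over from the old location, since they may have come from the false match and need to be re-geocoded from the KIEN address.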
Anti-Patterns
WRONG - Using Google Maps as primary source for Type I:
# WRONG - Google Maps overriding KIEN data
location:
  formatted_address: "Burg, Burgemeester Backxlaan 321, 7711 AD Nieuwleusen"
  coordinate_provenance:
    source_type: GOOGLE_MAPS  # WRONG for Type I!
CORRECT - KIEN as primary source:
# CORRECT - KIEN is authoritative
location:
  street_address: "De Hazelaar 41"
  postal_code: "6903 BB"
  city: Zevenaar
  coordinate_provenance:
    source_type: KIEN_ADDRESS_GEOCODE  # Correct!
Affected Files
This rule affects all Type I custodian files (188 at the time of the January 2025 audit):
data/custodian/NL-*-I-*.yaml
All should be reviewed to ensure:
- kien_enrichment contains address from KIEN page
- google_maps_enrichment is validated against KIEN
- location uses KIEN address (not Google Maps)
- False matches are properly documented
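The review checklist can be expressed as a function over a parsed custodian record. This is a sketch: `review_type_i_record` is a hypothetical name, and it assumes that a validated Google Maps block carries a `status` of either `VERIFIED` or `FALSE_MATCH`.

```python
def review_type_i_record(record):
    """Return a list of checklist violations; an empty list means the record passes."""
    problems = []
    kien = record.get("kien_enrichment") or {}
    gmaps = record.get("google_maps_enrichment")
    location = record.get("location") or {}

    if not kien.get("address"):
        problems.append("kien_enrichment has no address from the KIEN page")

    # Assumption: validation results are stored as a status on the enrichment block.
    if gmaps is not None and gmaps.get("status") not in {"VERIFIED", "FALSE_MATCH"}:
        problems.append("google_maps_enrichment has not been validated against KIEN")

    source_type = (location.get("coordinate_provenance") or {}).get("source_type")
    if source_type == "GOOGLE_MAPS":
        problems.append("location uses Google Maps instead of the KIEN address")

    if gmaps and gmaps.get("status") == "FALSE_MATCH":
        if not gmaps.get("false_match_reason") or "original_false_match" not in gmaps:
            problems.append("false match is not fully documented")

    return problems
```

Returning the problems as a list, rather than raising on the first one, lets a batch run report every outstanding issue per file in a single pass.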
Related Rules
- Rule 5: NEVER Delete Enriched Data - Keep false match data in original_false_match
- Rule 6: WebObservation Claims - KIEN data should have provenance
- Rule 22: Custodian YAML Files Are Single Source of Truth
- Rule 35: Provenance Timestamps - Include KIEN fetch timestamps
See Also
- KIEN Registry: https://www.immaterieelerfgoed.nl/
- UNESCO Intangible Cultural Heritage: https://ich.unesco.org/
- Dutch Intangible Heritage Network documentation