NDE to Heritage Custodian RDF Field Mapping
This document details which fields from the enriched NDE YAML entries are mapped to RDF and which remain unmapped.
Summary

| Category | Mapped | Unmapped | Coverage |
|---|---|---|---|
| Core Identifiers | 10 | 0 | 100% |
| Labels & Names | 3 | 1 | 75% |
| Location | 4 | 2 | 67% |
| Timestamps | 2 | 0 | 100% |
| Social Media | 5 | 0 | 100% |
| External IDs | 6 | 10+ | ~40% |
| Google Maps | 3 | 15+ | ~15% |
| Wikidata Claims | 2 | 30+ | ~5% |
| Provenance | 0 | 15+ | 0% |

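
The Coverage column appears to be the mapped count divided by the sum of mapped and unmapped counts. The sketch below reproduces the exact rows under that assumption; `coverage` is a hypothetical helper and is not taken from the conversion script. For rows such as "10+", only a lower bound on the unmapped count is known, so the computed figure is an upper bound and the table's "~" values are approximate.

```python
def coverage(mapped: int, unmapped: int) -> float:
    """Return mapped fields as a percentage of all known fields."""
    total = mapped + unmapped
    if total == 0:
        return 0.0
    return 100.0 * mapped / total


# Exact rows from the summary table.
labels = coverage(3, 1)      # Labels & Names
location = coverage(4, 2)    # Location

# Rows with "10+" unmapped fields: using the lower bound of 10
# gives an upper bound on the real coverage.
external_ids_upper_bound = coverage(6, 10)
```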
Mapped Fields
Core Identifiers (✅ Fully Mapped)

| Source Field | RDF Property | Notes |
|---|---|---|
| `ghcid.ghcid_current` | `skos:notation` on `crm:E42_Identifier` | GHCID scheme |
| `ghcid.ghcid_numeric` | `dcterms:identifier`, `skos:notation` | Primary identifier |
| `ghcid.ghcid_uuid` | `skos:notation`, `schema:url` | UUID v5 |
| `ghcid.ghcid_uuid_sha256` | `skos:notation`, `schema:url` | UUID v8 |
| `ghcid.record_id` | `skos:notation`, `schema:url` | Database record ID |
| `identifiers[].identifier_scheme` | `skos:inScheme` | Identifier scheme |
| `identifiers[].identifier_value` | `skos:notation` | Identifier value |
| `wikidata_enrichment.wikidata_entity_id` | `owl:sameAs`, `skos:notation` | Wikidata linking |

Labels & Names (✅ Mostly Mapped)

| Source Field | RDF Property | Notes |
|---|---|---|
| `custodian_name.claim_value` | `skos:prefLabel@nl` | Primary label |
| `wikidata_enrichment.wikidata_label_nl` | `skos:prefLabel@nl` | Fallback label |
| `wikidata_enrichment.wikidata_label_en` | `skos:altLabel@en` | English alt label |

Unmapped:
- `wikidata_enrichment.wikidata_aliases` - multilingual aliases
- `wikidata_enrichment.wikidata_description_*` - descriptions
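
The label table implies a precedence: `custodian_name.claim_value` is the primary Dutch label, and `wikidata_label_nl` is used only when it is absent. A minimal sketch of that precedence follows, assuming the entry has been loaded into a nested dictionary. The function name `label_triples` and the tuple layout are illustrative and not taken from the conversion script.

```python
from typing import Any


def label_triples(entry: dict[str, Any]) -> list[tuple[str, str, str]]:
    """Build (property, text, language) tuples following the label table."""
    wikidata = entry.get("wikidata_enrichment") or {}
    name = (entry.get("custodian_name") or {}).get("claim_value")

    # Primary label first; the Wikidata Dutch label is only a fallback.
    pref = name or wikidata.get("wikidata_label_nl")

    triples = []
    if pref:
        triples.append(("skos:prefLabel", pref, "nl"))

    english = wikidata.get("wikidata_label_en")
    if english:
        triples.append(("skos:altLabel", english, "en"))
    return triples
```

Emitting a single `skos:prefLabel` per language keeps the output in line with the SKOS rule that a resource has at most one preferred label per language tag.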
Custodian Type (✅ Mapped)

| Source Field | RDF Property | Notes |
|---|---|---|
| `original_entry.type[]` | `hc:custodian_type` | Type code → enum |

Location & Place (✅ Partially Mapped)

| Source Field | RDF Property | Notes |
|---|---|---|
| `google_maps_enrichment.coordinates.latitude` | `schema:latitude` | Coordinates |
| `google_maps_enrichment.coordinates.longitude` | `schema:longitude` | Coordinates |
| `google_maps_enrichment.formatted_address` | `schema:address` | Full address |
| `ghcid.location_resolution.geonames_id` | `schema:containedInPlace` | GeoNames URI |

Unmapped:
- `google_maps_enrichment.address_components[]` - structured address parts
- `google_maps_enrichment.utc_offset_minutes` - timezone
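
As an illustration of the location table, the sketch below collects the four mapped properties from a nested dictionary. It makes two assumptions that the table does not confirm: that the GeoNames URI follows the `https://sws.geonames.org/{id}/` pattern, and that missing source fields are simply skipped. `geonames_uri` and `location_properties` are hypothetical names.

```python
from typing import Any, Optional


def geonames_uri(geonames_id: Optional[int]) -> Optional[str]:
    # Assumption: the "GeoNames URI" uses the sws.geonames.org pattern.
    if geonames_id is None:
        return None
    return f"https://sws.geonames.org/{geonames_id}/"


def location_properties(entry: dict[str, Any]) -> dict[str, Any]:
    """Return the mapped location properties that have a source value."""
    gmaps = entry.get("google_maps_enrichment") or {}
    coords = gmaps.get("coordinates") or {}
    resolution = (entry.get("ghcid") or {}).get("location_resolution") or {}

    candidates = {
        "schema:latitude": coords.get("latitude"),
        "schema:longitude": coords.get("longitude"),
        "schema:address": gmaps.get("formatted_address"),
        "schema:containedInPlace": geonames_uri(resolution.get("geonames_id")),
    }
    # Compare against None so that a coordinate of 0.0 is kept.
    return {prop: value for prop, value in candidates.items() if value is not None}
```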
Timestamps (✅ Mapped)

| Source Field | RDF Property | Notes |
|---|---|---|
| `processing_timestamp` | `schema:dateCreated` | Record creation |
| `provenance.generated_at` | `schema:dateModified` | Last modification |

Digital Platform (✅ Mapped)

| Source Field | RDF Property | Notes |
|---|---|---|
| `wikidata_enrichment.wikidata_official_website` | `foaf:homepage`, `schema:url` | Primary website |
| `google_maps_enrichment.website` | `foaf:homepage` | Fallback website |
| `wikidata_claims.P8768_online_catalog_url.value` | `hc:collection_url` | Catalog URL(s) |

Social Media Profiles (✅ Mapped)

| Source Field | RDF Property | Notes |
|---|---|---|
| `web_claims.claims[].claim_type=social_*` | `hc:platform_type` | Platform type |
| `web_claims.claims[].claim_value` | `foaf:accountServiceHomepage` | Profile URL |
| Extracted from URL | `foaf:accountName` | Username |
| `web_claims.claims[].source_url` | `prov:wasDerivedFrom` | Source provenance |
| `web_claims.claims[].retrieved_on` | `prov:generatedAtTime` | Timestamp |
| `wikidata_claims.P2002_x__twitter__username.value` | `foaf:accountName` | Twitter from Wikidata |

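
The table notes that `foaf:accountName` is extracted from the profile URL but does not say how. One plausible approach, shown here as a sketch and not as the script's actual logic, is to take the last non-empty path segment and drop a leading `@`. The helper name `account_name` is hypothetical.

```python
from typing import Optional
from urllib.parse import urlparse


def account_name(profile_url: str) -> Optional[str]:
    """Return the last non-empty path segment of a profile URL, without '@'."""
    path = urlparse(profile_url).path
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        # The URL points at the platform root, so there is no account.
        return None
    return segments[-1].lstrip("@")
```

This heuristic returns an opaque identifier instead of a readable username for URL shapes such as `/channel/<id>`, so platform-specific rules may still be needed.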
External Identifiers (✅ Partially Mapped)

| Source Field | RDF Property | Notes |
|---|---|---|
| `wikidata_enrichment.wikidata_identifiers.viaf` | `skos:notation` | VIAF ID |
| `wikidata_enrichment.wikidata_identifiers.gnd` | `skos:notation` | GND ID |
| `wikidata_enrichment.wikidata_identifiers.isni` | `skos:notation` | ISNI |
| `wikidata_enrichment.wikidata_identifiers.lcnaf` | `skos:notation` | Library of Congress |
| `wikidata_enrichment.wikidata_identifiers.ringgold` | `skos:notation` | Ringgold ID |

Unmapped Fields
Google Maps Enrichment (❌ Not Mapped)
These fields contain valuable data but are not yet mapped to RDF:

| Field | Type | Potential Use |
|---|---|---|
| `opening_hours.weekday_text[]` | Array | Operating hours display |
| `opening_hours.periods[]` | Array | Structured hours |
| `rating` | Float | User rating (1-5) |
| `total_ratings` | Integer | Number of reviews |
| `reviews[]` | Array | User reviews with text, rating, author |
| `photo_urls[]` | Array | Photo URLs |
| `photos_metadata[]` | Array | Photo details, attributions |
| `phone_international` | String | Phone number |
| `phone_local` | String | Local phone format |
| `editorial_summary` | String | Google's description |
| `business_status` | String | OPERATIONAL, CLOSED, etc. |
| `google_maps_url` | String | Link to Google Maps |
| `street_view_url` | String | Street View URL |
| `google_place_types[]` | Array | Google's type classification |
| `place_id` | String | Google Places ID |

Rationale for not mapping:
- Opening hours: Requires `schema:OpeningHoursSpecification` modeling
- Reviews: Privacy considerations, volatile data
- Photos: External dependencies, storage concerns
- Phone: Could be added with `schema:telephone`
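
To give a sense of the opening-hours modeling mentioned above, the sketch below turns period records into `schema:OpeningHoursSpecification`-style dictionaries. It assumes that `opening_hours.periods[]` keeps the shape used by the Google Places API (`open` and `close` objects with a `day` from 0 for Sunday to 6 for Saturday and a `time` in `HHMM`). The enriched YAML may differ, and `opening_hours_specs` is a hypothetical name.

```python
from typing import Any

# Assumed Google Places numbering: 0 is Sunday, 6 is Saturday.
DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]


def hhmm(value: str) -> str:
    """Convert '0930' to '09:30'."""
    return f"{value[:2]}:{value[2:]}"


def opening_hours_specs(periods: list[dict[str, Any]]) -> list[dict[str, str]]:
    """Build one specification per period that has both an open and a close."""
    specs = []
    for period in periods:
        opens, closes = period.get("open"), period.get("close")
        if not opens or not closes:
            # Periods without a close (always-open places) are skipped here.
            continue
        specs.append({
            "@type": "schema:OpeningHoursSpecification",
            "schema:dayOfWeek": f"https://schema.org/{DAYS[opens['day']]}",
            "schema:opens": hhmm(opens["time"]),
            "schema:closes": hhmm(closes["time"]),
        })
    return specs
```

Periods that run past midnight have a `close.day` different from `open.day`; this sketch records only the opening day and would need extending for those cases.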
Wikidata Claims (❌ Mostly Not Mapped)
Many Wikidata properties are retrieved but not converted to RDF:

| Wikidata Property | Label | Notes |
|---|---|---|
| P131 | Located in admin entity | Administrative hierarchy |
| P276 | Location | Building/structure |
| P17 | Country | Country entity |
| P571 | Inception | Founding date |
| P576 | Dissolved | Closure date |
| P84 | Architect | Building architect |
| P669 | Located on street | Street name |
| P1619 | Date of opening | Opening date |
| P166 | Award received | Awards |
| P2652 | Partnership with | Partnerships |
| P1343 | Described by source | Sources |
| P2851 | Payment types accepted | Payment methods |
| P3273 | Actorenregister ID | Dutch actors register |
| P646 | Freebase ID | Legacy identifier |
| P402 | OSM relation ID | OpenStreetMap |

Rationale:
- Many require complex modeling (dates with qualifiers)
- Some are volatile (awards, partnerships change)
- Some are domain-specific extensions
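
Part of the date complexity comes from Wikidata's precision field: a time value such as `+1885-00-00T00:00:00Z` with precision 9 means only the year is known. The sketch below picks an XSD datatype from the precision (9 for year, 10 for month, 11 for day). It assumes the raw Wikidata time string and precision are available in the enriched entry, which this document does not confirm, and `wikidata_time_to_literal` is a hypothetical name.

```python
def wikidata_time_to_literal(time: str, precision: int) -> tuple[str, str]:
    """Map a Wikidata time value to (lexical form, XSD datatype)."""
    if not time.startswith("+"):
        raise ValueError("BCE dates are out of scope for this sketch")

    date_part = time[1:].split("T", 1)[0]  # e.g. "1885-07-13" or "1885-00-00"
    year, month, day = date_part.split("-")

    if precision >= 11:
        return f"{year}-{month}-{day}", "xsd:date"
    if precision == 10:
        return f"{year}-{month}", "xsd:gYearMonth"
    if precision == 9:
        return year, "xsd:gYear"
    # Decades, centuries and coarser precisions need a modeling decision.
    raise ValueError(f"unsupported precision: {precision}")
```

Qualifiers such as "earliest date" or "sourcing circumstances" are not handled here and would still require the richer modeling the rationale refers to.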
Provenance Metadata (❌ Not Mapped)

| Field | Notes |
|---|---|
| `provenance.sources.*` | Detailed source tracking |
| `provenance.data_tier_summary` | Data quality tiers |
| `provenance.notes` | Human notes |
| `wikidata_enrichment.api_metadata.*` | API call details |
| `web_enrichment.web_archives[]` | WARC archive info |

Rationale:
- Could use PROV-O ontology for detailed provenance
- Currently simplified to timestamps only
Museum Register Enrichment (❌ Not Mapped)

| Field | Notes |
|---|---|
| `museum_register_enrichment.registered_since` | Registration date |
| `museum_register_enrichment.province` | Province |
| `museum_register_enrichment.source_provenance` | Source details |

Future Enhancements
Priority 1: High Value, Easy to Add
- `google_maps_enrichment.phone_international` → `schema:telephone`
- `google_maps_enrichment.editorial_summary` → `schema:description`
- `wikidata_claims.P571_inception.value` → `schema:foundingDate`
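
The three Priority 1 mappings are direct field-to-property copies, so they could be expressed as a small lookup table. The following is a sketch under that assumption and does not describe how `scripts/nde_to_hc_rdf.py` is organized; `PRIORITY_1`, `get_path`, and `simple_properties` are hypothetical names.

```python
from typing import Any, Optional

# The three Priority 1 candidates as (source path, RDF property) pairs.
PRIORITY_1 = [
    ("google_maps_enrichment.phone_international", "schema:telephone"),
    ("google_maps_enrichment.editorial_summary", "schema:description"),
    ("wikidata_claims.P571_inception.value", "schema:foundingDate"),
]


def get_path(entry: dict[str, Any], dotted: str) -> Optional[Any]:
    """Walk a dotted path through nested dicts; None if any key is missing."""
    node: Any = entry
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def simple_properties(entry: dict[str, Any]) -> dict[str, Any]:
    """Collect the Priority 1 properties that are present in an entry."""
    found = {}
    for path, prop in PRIORITY_1:
        value = get_path(entry, path)
        if value is not None:
            found[prop] = value
    return found
```

Keeping the mappings in data makes it straightforward to add another one-to-one field later without touching the traversal logic.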
Priority 2: Moderate Complexity
- Opening hours → `schema:OpeningHoursSpecification`
- `google_maps_enrichment.rating` → `schema:aggregateRating`
- Wikidata relationships (P131, P276) → location hierarchy
Priority 3: Complex Modeling Required
- Full provenance chain → PROV-O
- Organizational history → change events
- Collection metadata → separate entities
Script Location
scripts/nde_to_hc_rdf.py
Output Location
data/nde/rdf/{ghcid_numeric}.ttl
Generated: 2025-12-02
Total entries converted: 1,619
Total triples: 114,705