Heritage Institution RDF Exports
This directory contains Linked Open Data exports of heritage institution datasets in W3C-compliant RDF formats.
Available Datasets
Denmark 🇩🇰 - COMPLETE (November 2025)
Dataset: denmark_complete.*
Status: ✅ Production-ready
Last Updated: 2025-11-19
| Format | File | Size | Use Case |
|---|---|---|---|
| Turtle | denmark_complete.ttl | 2.27 MB | Human-readable, SPARQL queries |
| RDF/XML | denmark_complete.rdf | 3.96 MB | Machine processing, legacy systems |
| JSON-LD | denmark_complete.jsonld | 5.16 MB | Web APIs, JavaScript applications |
| N-Triples | denmark_complete.nt | 6.24 MB | Line-oriented processing, MapReduce |
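The line-oriented use case listed for N-Triples can be sketched with the Python standard library alone. This is a minimal illustration, not part of the project tooling: the function name `count_predicates` and the three sample lines are hypothetical, and the splitting relies on the fact that subjects and predicates in N-Triples never contain whitespace.

```python
from collections import Counter


def count_predicates(lines):
    """Count how often each predicate occurs in an iterable of N-Triples lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        # Subject and predicate contain no spaces, so split off the first two tokens
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        counts[parts[1]] += 1
    return counts


# Hypothetical sample lines; a real run would iterate over open("denmark_complete.nt")
SAMPLE_LINES = [
    '<https://w3id.org/heritage/custodian/dk/710100> <http://schema.org/name> "Example Library" .',
    '<https://w3id.org/heritage/custodian/dk/710100> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Library> .',
    '<https://w3id.org/heritage/custodian/dk/190101> <http://schema.org/name> "Another Example" .',
]

if __name__ == "__main__":
    for predicate, n in count_predicates(SAMPLE_LINES).most_common():
        print(n, predicate)
```

Because each line is processed independently, the same function works on a file handle without loading the whole export into memory.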
Statistics
- Institutions: 2,348 (555 libraries, 594 archives, 1,199 branches)
- RDF Triples: 43,429
- Ontologies Used: 9 (CPOV, Schema.org, RICO, ORG, PROV-O, SKOS, Dublin Core, OWL, Heritage)
- Wikidata Links: 769 institutions (32.8%)
- ISIL Codes: 555 institutions (23.6%)
- GHCID Identifiers: 998 institutions (42.5%)
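The percentages above are shares of the full institution count. A short sketch of that arithmetic, using only the figures stated in this section (the variable and function names are illustrative):

```python
# Counts as stated in the statistics above
COUNTS = {"libraries": 555, "archives": 594, "branches": 1199}
TOTAL = sum(COUNTS.values())


def share(n, total=TOTAL):
    """Return n as a percentage of total, rounded to one decimal place."""
    return round(100 * n / total, 1)


if __name__ == "__main__":
    print("Institutions:", TOTAL)
    print("Wikidata links:", share(769), "%")
    print("ISIL codes:", share(555), "%")
    print("GHCID identifiers:", share(998), "%")
```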
Coverage by Institution Type
| Type | Count | ISIL | GHCID | Wikidata |
|---|---|---|---|---|
| Main Libraries | 555 | 100% | 78% | High |
| Archives | 594 | 0% (by design) | 95% | Moderate |
| Library Branches | 1,199 | Inherited | 0% (by design) | Low |
Ontology Alignment
All RDF exports follow these international standards:
Core Ontologies
- CPOV (Core Public Organisation Vocabulary)
  - Namespace: http://data.europa.eu/m8g/
  - Usage: Public sector organization type
  - Spec: https://joinup.ec.europa.eu/collection/semantic-interoperability-community-semic/solution/core-public-organisation-vocabulary
- Schema.org
  - Namespace: http://schema.org/
  - Usage: Names, addresses, descriptions, types
  - Types: schema:Library, schema:ArchiveOrganization, schema:Museum
  - Spec: https://schema.org/
- SKOS (Simple Knowledge Organization System)
  - Namespace: http://www.w3.org/2004/02/skos/core#
  - Usage: Preferred/alternative labels
  - Spec: https://www.w3.org/TR/skos-reference/
Specialized Ontologies
- RICO (Records in Contexts Ontology)
  - Namespace: https://www.ica.org/standards/RiC/ontology#
  - Usage: Archival description (for archives)
  - Spec: https://www.ica.org/standards/RiC/ontology
- ORG (W3C Organization Ontology)
  - Namespace: http://www.w3.org/ns/org#
  - Usage: Hierarchical relationships (library branches → main libraries)
  - Spec: https://www.w3.org/TR/vocab-org/
- PROV-O (Provenance Ontology)
  - Namespace: http://www.w3.org/ns/prov#
  - Usage: Data provenance tracking
  - Spec: https://www.w3.org/TR/prov-o/
Linking Ontologies
- OWL (Web Ontology Language)
  - Namespace: http://www.w3.org/2002/07/owl#
  - Usage: Semantic equivalence (owl:sameAs for Wikidata links)
  - Spec: https://www.w3.org/TR/owl2-primer/
- Dublin Core Terms
  - Namespace: http://purl.org/dc/terms/
  - Usage: Identifiers, descriptions, metadata
  - Spec: https://www.dublincore.org/specifications/dublin-core/dcmi-terms/
- Heritage (Project-Specific)
  - Namespace: https://w3id.org/heritage/custodian/
  - Usage: GHCID identifiers, UUID properties
  - Spec: See /docs/PERSISTENT_IDENTIFIERS.md
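When writing queries or scripts against these exports, it can help to keep the nine namespaces in one place. The following is a minimal sketch, not project code: the helper names `expand` and `sparql_prologue` are hypothetical, and the prefix labels `rico` and `heritage` are assumptions (the other labels match the ones used in the SPARQL examples in this README, apart from `skos`, which follows common convention).

```python
# Namespace IRIs as listed in the ontology sections above
PREFIXES = {
    "cpov": "http://data.europa.eu/m8g/",
    "schema": "http://schema.org/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "rico": "https://www.ica.org/standards/RiC/ontology#",  # label is an assumption
    "org": "http://www.w3.org/ns/org#",
    "prov": "http://www.w3.org/ns/prov#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "dcterms": "http://purl.org/dc/terms/",
    "heritage": "https://w3id.org/heritage/custodian/",  # label is an assumption
}


def expand(curie):
    """Expand a compact IRI such as 'schema:Library' into a full IRI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {curie!r}")
    return PREFIXES[prefix] + local


def sparql_prologue(*prefixes):
    """Build the PREFIX lines for the given prefix labels."""
    return "\n".join(f"PREFIX {p}: <{PREFIXES[p]}>" for p in prefixes)


if __name__ == "__main__":
    print(expand("schema:Library"))
    print(sparql_prologue("org", "schema"))
```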
SPARQL Query Examples
Query 1: Find all libraries in a specific city
PREFIX schema: <http://schema.org/>
PREFIX cpov: <http://data.europa.eu/m8g/>
SELECT ?library ?name ?address WHERE {
?library a cpov:PublicOrganisation, schema:Library .
?library schema:name ?name .
?library schema:address ?addrNode .
?addrNode schema:addressLocality "København K" .
?addrNode schema:streetAddress ?address .
}
Query 2: Find all institutions with Wikidata links
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX schema: <http://schema.org/>
SELECT ?institution ?name ?wikidataID WHERE {
?institution schema:name ?name .
?institution owl:sameAs ?wikidataURI .
FILTER(STRSTARTS(STR(?wikidataURI), "http://www.wikidata.org/entity/Q"))
BIND(STRAFTER(STR(?wikidataURI), "http://www.wikidata.org/entity/") AS ?wikidataID)
}
Query 3: Find library hierarchies (parent-child branches)
PREFIX org: <http://www.w3.org/ns/org#>
PREFIX schema: <http://schema.org/>
SELECT ?parent ?parentName ?child ?childName WHERE {
?child org:subOrganizationOf ?parent .
?parent schema:name ?parentName .
?child schema:name ?childName .
}
LIMIT 100
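The rows returned by Query 3 are flat parent/child pairs. A minimal sketch of regrouping them into one entry per main library follows; the function name `group_branches` and the sample rows are hypothetical and stand in for real query results.

```python
from collections import defaultdict


def group_branches(rows):
    """Group (parent_name, child_name) pairs into {parent: sorted list of children}."""
    tree = defaultdict(list)
    for parent_name, child_name in rows:
        tree[parent_name].append(child_name)
    return {parent: sorted(children) for parent, children in tree.items()}


# Hypothetical rows in the shape (?parentName, ?childName)
ROWS = [
    ("Example Main Library", "Branch B"),
    ("Example Main Library", "Branch A"),
    ("Another Main Library", "Branch C"),
]

if __name__ == "__main__":
    for parent, children in group_branches(ROWS).items():
        print(parent, "->", ", ".join(children))
```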
Query 4: Count institutions by type
PREFIX schema: <http://schema.org/>
SELECT ?type (COUNT(?inst) AS ?count) WHERE {
?inst a ?type .
FILTER(?type IN (schema:Library, schema:ArchiveOrganization, schema:Museum))
}
GROUP BY ?type
Query 5: Find archives with specific ISIL codes
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX schema: <http://schema.org/>
SELECT ?archive ?name ?isil WHERE {
?archive a schema:ArchiveOrganization .
?archive schema:name ?name .
?archive dcterms:identifier ?isil .
FILTER(STRSTARTS(STR(?isil), "DK-"))
}
Query 6: Get provenance for all institutions
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX schema: <http://schema.org/>
SELECT ?institution ?name ?source WHERE {
?institution schema:name ?name .
?institution prov:wasGeneratedBy ?activity .
?activity dcterms:source ?source .
}
LIMIT 100
Usage Examples
Loading RDF with Python (rdflib)
from rdflib import Graph
# Load Turtle format
g = Graph()
g.parse("denmark_complete.ttl", format="turtle")
print(f"Loaded {len(g)} triples")
# Query with SPARQL
qres = g.query("""
PREFIX schema: <http://schema.org/>
SELECT ?name WHERE {
?inst a schema:Library .
?inst schema:name ?name .
}
LIMIT 10
""")
for row in qres:
    print(row.name)
Loading RDF with Apache Jena (Java)
import org.apache.jena.rdf.model.*;
import org.apache.jena.query.*;
// Load RDF/XML format
Model model = ModelFactory.createDefaultModel();
model.read("denmark_complete.rdf");
// Query with SPARQL
String queryString = """
PREFIX schema: <http://schema.org/>
SELECT ?name WHERE {
?inst a schema:Library .
?inst schema:name ?name .
}
LIMIT 10
""";
Query query = QueryFactory.create(queryString);
// try-with-resources closes the QueryExecution once the results are printed
try (QueryExecution qexec = QueryExecutionFactory.create(query, model)) {
    ResultSet results = qexec.execSelect();
    ResultSetFormatter.out(System.out, results, query);
}
Loading JSON-LD with JavaScript
const jsonld = require('jsonld');
const fs = require('fs');
// Load JSON-LD
const doc = JSON.parse(fs.readFileSync('denmark_complete.jsonld', 'utf8'));
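// Convert to N-Quads (one statement per line)
jsonld.toRDF(doc, {format: 'application/n-quads'}).then(nquads => {
  // Drop the empty string left by the trailing newline before counting
  const count = nquads.split('\n').filter(line => line.trim() !== '').length;
  console.log(`Loaded ${count} triples`);
});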
Setting Up a SPARQL Endpoint
Option 1: Apache Jena Fuseki (Open Source)
# Download Jena Fuseki
wget https://dlcdn.apache.org/jena/binaries/apache-jena-fuseki-4.10.0.tar.gz
tar xzf apache-jena-fuseki-4.10.0.tar.gz
cd apache-jena-fuseki-4.10.0
# Start server
./fuseki-server --update --mem /denmark
# Load data
curl -X POST http://localhost:3030/denmark/data \
--data-binary @denmark_complete.ttl \
-H "Content-Type: text/turtle"
# Query endpoint
curl -X POST http://localhost:3030/denmark/query \
--data-urlencode "query=SELECT * WHERE { ?s ?p ?o } LIMIT 10"
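The same query can be sent from Python without extra dependencies. This is a sketch under assumptions: it presumes the Fuseki server from the steps above is running on localhost:3030 with the `/denmark` dataset, and the helper names `build_query_request` and `run_select` are hypothetical. It uses the form-encoded POST variant of the SPARQL protocol, which is what the curl command above sends.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above
ENDPOINT = "http://localhost:3030/denmark/query"


def build_query_request(sparql, endpoint=ENDPOINT):
    """Build a form-encoded SPARQL POST request asking for JSON results."""
    body = urllib.parse.urlencode({"query": sparql}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def run_select(sparql, endpoint=ENDPOINT):
    """Send the query and return the list of result bindings."""
    with urllib.request.urlopen(build_query_request(sparql, endpoint)) as resp:
        return json.load(resp)["results"]["bindings"]


if __name__ == "__main__":
    # Requires the local Fuseki server to be running
    for binding in run_select("SELECT * WHERE { ?s ?p ?o } LIMIT 10"):
        print(binding["s"]["value"], binding["p"]["value"], binding["o"]["value"])
```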
Option 2: GraphDB (Free Edition)
- Download GraphDB Free from https://www.ontotext.com/products/graphdb/download/
- Install and start GraphDB
- Create new repository "denmark"
- Import denmark_complete.ttl via the web UI
- Query via the SPARQL interface at http://localhost:7200/sparql
W3ID Persistent Identifiers
All institutions have persistent URIs following the pattern:
https://w3id.org/heritage/custodian/dk/{isil-or-id}
Examples:
- Royal Library: https://w3id.org/heritage/custodian/dk/190101
- Copenhagen Libraries: https://w3id.org/heritage/custodian/dk/710100
- Danish National Archives: https://w3id.org/heritage/custodian/dk/archive/rigsarkivet
Content Negotiation (available once w3id.org registration is complete):
# Get HTML representation
curl https://w3id.org/heritage/custodian/dk/710100
# Get Turtle RDF
curl -H "Accept: text/turtle" https://w3id.org/heritage/custodian/dk/710100
# Get JSON-LD
curl -H "Accept: application/ld+json" https://w3id.org/heritage/custodian/dk/710100
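For scripted access, the URI pattern and the Accept headers can be combined in a small helper. This is a sketch only: the function names are hypothetical, the `text/html` entry is an assumption (the curl example sends no Accept header for HTML), and the requests will only succeed once the w3id.org registration is in place.

```python
import urllib.request

# Base of the URI pattern documented above
BASE = "https://w3id.org/heritage/custodian/dk/"

# Media types from the curl examples; "html" is an assumed explicit equivalent
ACCEPT = {
    "html": "text/html",
    "turtle": "text/turtle",
    "jsonld": "application/ld+json",
}


def custodian_uri(local_id):
    """Build the persistent URI for an ISIL code or other local identifier."""
    return BASE + local_id


def negotiation_request(local_id, fmt="turtle"):
    """Build (but do not send) a request for the given representation."""
    return urllib.request.Request(custodian_uri(local_id), headers={"Accept": ACCEPT[fmt]})


if __name__ == "__main__":
    req = negotiation_request("710100", "jsonld")
    print(req.full_url, req.get_header("Accept"))
```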
Data Quality & Provenance
All RDF exports include complete provenance metadata using PROV-O:
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<https://w3id.org/heritage/custodian/dk/710100>
    prov:wasGeneratedBy [
        a prov:Activity ;
        dcterms:source "ISIL_REGISTRY" ;
        prov:startedAtTime "2025-11-19T10:00:00Z"^^xsd:dateTime ;
        prov:endedAtTime "2025-11-19T10:30:00Z"^^xsd:dateTime
    ] .
Data Tier Classification (see AGENTS.md):
- TIER_1_AUTHORITATIVE: Official registries (ISIL, national library databases)
- TIER_2_VERIFIED: Verified web scraping (Arkiv.dk)
- TIER_3_CROWD_SOURCED: Wikidata, OpenStreetMap
- TIER_4_INFERRED: NLP-extracted from conversations
Denmark Dataset:
- Main libraries (555): TIER_1 (ISIL registry)
- Archives (594): TIER_2 (Arkiv.dk verified scraping)
- Wikidata links (769): TIER_3 (crowd-sourced)
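The tiers form an ordered scale, so consumers can filter records by a minimum trust level. A minimal sketch follows, assuming a lower tier number means a more authoritative source; the enum, the dictionary keys, and the helper name are hypothetical and are not taken from AGENTS.md.

```python
from enum import IntEnum


class DataTier(IntEnum):
    """Tier names as listed above; a lower value means more authoritative."""
    TIER_1_AUTHORITATIVE = 1
    TIER_2_VERIFIED = 2
    TIER_3_CROWD_SOURCED = 3
    TIER_4_INFERRED = 4


# Denmark dataset assignments as stated above (keys are illustrative labels)
DENMARK_TIERS = {
    "main_libraries": DataTier.TIER_1_AUTHORITATIVE,
    "archives": DataTier.TIER_2_VERIFIED,
    "wikidata_links": DataTier.TIER_3_CROWD_SOURCED,
}


def meets_minimum(tier, minimum):
    """Return True if tier is at least as trustworthy as minimum."""
    return tier <= minimum


if __name__ == "__main__":
    trusted = [name for name, tier in DENMARK_TIERS.items()
               if meets_minimum(tier, DataTier.TIER_2_VERIFIED)]
    print(trusted)
```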
Validation
All RDF files have been validated using:
Syntax Validation
# Turtle syntax check
rapper -i turtle -o ntriples denmark_complete.ttl > /dev/null
# RDF/XML syntax check
rapper -i rdfxml -o ntriples denmark_complete.rdf > /dev/null
# JSON-LD context validation
jsonld validate denmark_complete.jsonld
Semantic Validation
- ✅ All institution URIs use the w3id.org namespace (they will resolve once registration is complete)
- ✅ owl:sameAs links point to valid Wikidata entities
- ✅ Hierarchical relationships use standard ORG vocabulary
- ✅ ISIL codes link to isil.org registry
- ✅ GHCID identifiers follow project specification
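The owl:sameAs check can be partly automated by testing the form of each link. The sketch below is an illustration rather than the validation the project actually ran: the function name `wikidata_qid` is hypothetical, and it only checks that a URI has the `http://www.wikidata.org/entity/Q…` shape used in Query 2. It does not confirm that the entity exists.

```python
import re

# Entity namespace as used in the Wikidata query example in this README
WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"
QID = re.compile(r"Q[1-9][0-9]*")


def wikidata_qid(uri):
    """Return the QID if uri is a well-formed Wikidata entity URI, else None."""
    if not uri.startswith(WIKIDATA_ENTITY):
        return None
    local = uri[len(WIKIDATA_ENTITY):]
    return local if QID.fullmatch(local) else None


if __name__ == "__main__":
    for candidate in (
        "http://www.wikidata.org/entity/Q42",
        "https://www.wikidata.org/wiki/Q42",
        "http://www.wikidata.org/entity/P31",
    ):
        print(candidate, "->", wikidata_qid(candidate))
```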
Citation
If you use this dataset in research, please cite:
@dataset{danish_glam_rdf_2025,
author = {GLAM Extractor Project},
title = {Danish Heritage Institutions Linked Open Data},
year = {2025},
month = {November},
version = {1.0},
url = {https://github.com/yourusername/glam-extractor},
note = {2,348 institutions (555 libraries, 594 archives, 1,199 branches), 43,429 RDF triples}
}
Related Documentation
- Project README: /README.md
- LinkML Schema: /schemas/heritage_custodian.yaml
- Persistent Identifiers: /docs/PERSISTENT_IDENTIFIERS.md
- Ontology Extensions: /docs/ONTOLOGY_EXTENSIONS.md
- Denmark Session Summary: /SESSION_SUMMARY_20251119_RDF_WIKIDATA_COMPLETE.md
Contributing
To add new country datasets or improve existing RDF exports:
- Follow ontology alignment guidelines in /docs/ONTOLOGY_EXTENSIONS.md
- Use RDF exporter template: /scripts/export_denmark_rdf.py
- Validate with SPARQL queries before publishing
- Update this README with new dataset statistics
License
This data is published under CC0 1.0 Universal (Public Domain). You may use, modify, and distribute it freely without restrictions.
Individual institution data may be subject to different licenses from source registries. Consult:
- Danish ISIL Registry: https://slks.dk/isil
- Arkiv.dk: https://arkiv.dk
- Wikidata: CC0 (https://www.wikidata.org/wiki/Wikidata:Data_access#Licensing)
Last Updated: 2025-11-19
Maintainer: GLAM Extractor Project
Contact: GitHub Issues