# Denmark GLAM Dataset - Quick Reference Card

**Last Updated**: 2025-11-19

**Status**: ✅ Complete (2,348 institutions)

---

## Files at a Glance

| File | Institutions | Size | Purpose |
|------|--------------|------|---------|
| `denmark_complete.json` | **2,348** | 3.06 MB | ⭐ **MASTER FILE** - Use this |
| `denmark_libraries_v2.json` | 555 | 964 KB | Main libraries only |
| `denmark_archives.json` | 594 | 918 KB | Archives only |
| `denmark_library_branches.json` | 1,199 | 1.2 MB | Library branches only |

**Location**: `/Users/kempersc/apps/glam/data/instances/`

---

## Quick Statistics

```
Total Institutions: 2,348
├── Libraries (main): 555
│   ├── Public libraries: 108
│   └── Research libraries (FFU): 447
├── Archives: 594
└── Library branches: 1,199
    ├── Public branches: 594
    └── FFU branches: 605

GHCID Coverage: 998/2,348 (42.5%)
ISIL Coverage: 555/2,348 (23.6%)
Hierarchical Links: 1,176/1,199 (98.1%)
```

---

## Common Queries

### Load complete dataset

```python
import json

# Read as UTF-8 explicitly: names and cities contain æ, ø and å
with open('data/instances/denmark_complete.json', 'r', encoding='utf-8') as f:
    danish_glam = json.load(f)
```

### Filter by institution type

```python
archives = [i for i in danish_glam if i['institution_type'] == 'ARCHIVE']
libraries = [i for i in danish_glam if i['institution_type'] == 'LIBRARY']

# Branches carry a parent_organization; main libraries do not
main_libs = [i for i in libraries if not i.get('parent_organization')]
branches = [i for i in libraries if i.get('parent_organization')]
```

### Find institutions by city

```python
copenhagen = [
    i for i in danish_glam
    if any(loc.get('city') == 'København K' for loc in i.get('locations', []))
]
```

### Get institutions with GHCID

```python
with_ghcid = [i for i in danish_glam if i.get('ghcid_current')]
```

### Get hierarchical structure (library + branches)

```python
def get_library_with_branches(library_id):
    """Get a library and all its branches."""
    library = next(i for i in danish_glam if i['id'] == library_id)
    branches = [
        i for i in danish_glam
        if i.get('parent_organization') == library_id
    ]
    return {'library': library, 'branches': branches}

# Example
kb_system = get_library_with_branches(
    'https://w3id.org/heritage/custodian/dk/library/k%C3%B8benhavn-k/k%C3%B8benhavns-biblioteker'
)
print(f"{kb_system['library']['name']}: {len(kb_system['branches'])} branches")
```

---

## Top Cities

| City | Count | City | Count |
|------|-------|------|-------|
| Aalborg | 35 | København K | 30 |
| Esbjerg | 30 | Hjørring | 28 |
| Vejle | 28 | Herning | 26 |
| Aarhus | 22 | Ringkøbing-Skjern | 22 |

---

## Next Steps

**For RDF Export**:

```bash
# Generate Turtle RDF
linkml-convert -s schemas/heritage_custodian.yaml -t rdf \
  data/instances/denmark_complete.json > data/rdf/denmark.ttl
```

**For Wikidata Enrichment**:

```bash
# Query Wikidata for Danish institutions
python3 scripts/enrich_denmark_wikidata.py
```

**For Analysis**:

```bash
# Generate statistics report
python3 scripts/analyze_denmark_dataset.py
```

---

## Key Design Decisions

- ✅ **Library branches use parent_organization** - Reduces redundancy
- ✅ **Archives get GHCID (no ISIL)** - GHCID is primary identifier
- ✅ **Nordic characters normalized** - æ→ae, ø→oe, å→aa in GHCID
- ✅ **98.1% hierarchical linkage** - 1,176 of 1,199 branches linked to a parent library

---

## Session Documents

- `SESSION_SUMMARY_20251119_DENMARK_COMPLETE.md` - Full session report
- `SESSION_SUMMARY_20251119_DENMARK_ARCHIVES_COMPLETE.md` - Archive processing
- `SESSION_SUMMARY_20251119_DENMARK_ISIL_COMPLETE.md` - Library processing

---

**Questions?** See `SESSION_SUMMARY_20251119_DENMARK_COMPLETE.md` for detailed documentation.
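
The Nordic character normalization listed under Key Design Decisions (æ→ae, ø→oe, å→aa) can be illustrated with a minimal sketch. Only the three lowercase mappings come from this card. The function name `normalize_nordic`, the handling of capital letters, and the NFC pre-normalization are assumptions for illustration, and the actual GHCID generator may work differently.

```python
import unicodedata

# Lowercase mappings are taken from the design note (æ→ae, ø→oe, å→aa).
# The capitalised variants are an assumption, not confirmed by the source.
NORDIC_MAP = str.maketrans({
    'æ': 'ae', 'ø': 'oe', 'å': 'aa',
    'Æ': 'Ae', 'Ø': 'Oe', 'Å': 'Aa',
})

def normalize_nordic(text: str) -> str:
    """Replace æ/ø/å with ASCII digraphs (hypothetical helper)."""
    # Compose first so that 'a' + combining ring is treated as a single 'å'
    composed = unicodedata.normalize('NFC', text)
    return composed.translate(NORDIC_MAP)

for city in ['Hjørring', 'Ringkøbing-Skjern', 'København K']:
    print(city, '->', normalize_nordic(city))
```

A translation table keeps the mapping in one place, so further characters could be added without touching the function body.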