Denmark GLAM Dataset - Quick Reference Card

Last Updated: 2025-11-19
Status: Complete (2,348 institutions)


Files at a Glance

| File | Institutions | Size | Purpose |
|------|--------------|------|---------|
| denmark_complete.json | 2,348 | 3.06 MB | MASTER FILE - Use this |
| denmark_libraries_v2.json | 555 | 964 KB | Main libraries only |
| denmark_archives.json | 594 | 918 KB | Archives only |
| denmark_library_branches.json | 1,199 | 1.2 MB | Library branches only |

Location: /Users/kempersc/apps/glam/data/instances/


Quick Statistics

Total Institutions: 2,348
├── Libraries (main): 555
│   ├── Public libraries: 108
│   └── Research libraries (FFU): 447
├── Archives: 594
└── Library branches: 1,199
    ├── Public branches: 594
    └── FFU branches: 605

GHCID Coverage: 998/2,348 (42.5%)
ISIL Coverage: 555/2,348 (23.6%)
Hierarchical Links: 1,176/1,199 (98.1%)
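
The percentages above follow directly from the listed counts. A minimal sketch that recomputes them and checks that the breakdown sums to the total; the counts are copied from this card, and the helper name `coverage` is only illustrative:

```python
# Counts copied from the statistics above.
TOTAL = 2348
MAIN_LIBRARIES = 108 + 447      # public + research (FFU)
ARCHIVES = 594
BRANCHES = 594 + 605            # public + FFU branches


def coverage(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# The three categories should add up to the reported total.
assert MAIN_LIBRARIES + ARCHIVES + BRANCHES == TOTAL

for label, part, whole in [
    ("GHCID coverage", 998, TOTAL),
    ("ISIL coverage", 555, TOTAL),
    ("Hierarchical links", 1176, BRANCHES),
]:
    print(f"{label}: {part:,}/{whole:,} ({coverage(part, whole)}%)")
```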

Common Queries

Load complete dataset

import json

# Read as UTF-8 explicitly: names and cities contain æ, ø, and å.
with open('data/instances/denmark_complete.json', 'r', encoding='utf-8') as f:
    danish_glam = json.load(f)
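
If a quick sanity check on load is useful, the sketch below wraps the same call in a small helper. It assumes, as the queries on this card do, that the file holds a top-level JSON list; `load_dataset` and `expected_count` are hypothetical names, not part of the project.

```python
import json
from pathlib import Path


def load_dataset(path, expected_count=None):
    """Load a JSON list of institution records and optionally check its length."""
    with Path(path).open(encoding="utf-8") as f:
        records = json.load(f)
    if not isinstance(records, list):
        raise TypeError(f"Expected a top-level list, got {type(records).__name__}")
    if expected_count is not None and len(records) != expected_count:
        raise ValueError(f"Expected {expected_count} records, found {len(records)}")
    return records
```

For example, `load_dataset('data/instances/denmark_complete.json', expected_count=2348)` would fail loudly if the file on disk is not the complete dataset described above.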

Filter by institution type

archives = [i for i in danish_glam if i['institution_type'] == 'ARCHIVE']
libraries = [i for i in danish_glam if i['institution_type'] == 'LIBRARY']
main_libs = [i for i in libraries if not i.get('parent_organization')]
branches = [i for i in libraries if i.get('parent_organization')]
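
The same filters can be turned into a single tally, which is handy for comparing against the Quick Statistics. This is a sketch on made-up sample records; the labels `LIBRARY_MAIN` and `LIBRARY_BRANCH` are illustrative and do not appear in the dataset.

```python
from collections import Counter


def tally_by_type(records):
    """Count records per institution_type, splitting libraries into main and branch."""
    tally = Counter()
    for rec in records:
        kind = rec.get("institution_type", "UNKNOWN")
        if kind == "LIBRARY":
            # Same rule as the filters above: a parent_organization marks a branch.
            kind = "LIBRARY_BRANCH" if rec.get("parent_organization") else "LIBRARY_MAIN"
        tally[kind] += 1
    return tally


# Hypothetical sample records, shaped like the fields used on this card.
sample = [
    {"id": "lib-1", "institution_type": "LIBRARY"},
    {"id": "br-1", "institution_type": "LIBRARY", "parent_organization": "lib-1"},
    {"id": "br-2", "institution_type": "LIBRARY", "parent_organization": "lib-1"},
    {"id": "arc-1", "institution_type": "ARCHIVE"},
]
print(dict(tally_by_type(sample)))
```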

Find institutions by city

copenhagen = [
    i for i in danish_glam 
    if any(loc.get('city') == 'København K' for loc in i.get('locations', []))
]

Get institutions with GHCID

with_ghcid = [i for i in danish_glam if i.get('ghcid_current')]

Get hierarchical structure (library + branches)

def get_library_with_branches(library_id):
    """Get a library and all its branches; raise ValueError if the id is unknown."""
    library = next((i for i in danish_glam if i['id'] == library_id), None)
    if library is None:
        raise ValueError(f"No institution with id {library_id!r}")
    branches = [
        i for i in danish_glam 
        if i.get('parent_organization') == library_id
    ]
    return {'library': library, 'branches': branches}

# Example
kb_system = get_library_with_branches(
    'https://w3id.org/heritage/custodian/dk/library/k%C3%B8benhavn-k/k%C3%B8benhavns-biblioteker'
)
print(f"{kb_system['library']['name']}: {len(kb_system['branches'])} branches")
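
When many libraries need to be looked up, scanning the full list each time gets slow. A possible alternative, sketched here on hypothetical sample records, is to group branches by `parent_organization` once. The orphan check assumes an unlinked branch is one whose parent id is missing from the file; this card does not say whether the 23 unlinked branches look like that or simply have no `parent_organization`.

```python
from collections import defaultdict


def index_branches(records):
    """Map each parent id to the list of records that name it as parent_organization."""
    by_parent = defaultdict(list)
    for rec in records:
        parent = rec.get("parent_organization")
        if parent:
            by_parent[parent].append(rec)
    return by_parent


def find_orphans(records):
    """Return records whose parent_organization id does not occur in the dataset."""
    known_ids = {rec["id"] for rec in records}
    return [
        rec for rec in records
        if rec.get("parent_organization") and rec["parent_organization"] not in known_ids
    ]


# Hypothetical sample records for illustration only.
sample = [
    {"id": "lib-1", "name": "Main library"},
    {"id": "br-1", "name": "Branch A", "parent_organization": "lib-1"},
    {"id": "br-2", "name": "Branch B", "parent_organization": "lib-1"},
    {"id": "br-3", "name": "Branch C", "parent_organization": "lib-missing"},
]
by_parent = index_branches(sample)
print(len(by_parent["lib-1"]), [r["name"] for r in find_orphans(sample)])
```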

Top Cities

| City | Count | City | Count |
|------|-------|------|-------|
| Aalborg | 35 | København K | 30 |
| Esbjerg | 30 | Hjørring | 28 |
| Vejle | 28 | Herning | 26 |
| Aarhus | 22 | Ringkøbing-Skjern | 22 |
Next Steps

For RDF Export:

# Generate Turtle RDF
linkml-convert -s schemas/heritage_custodian.yaml -t rdf \
  data/instances/denmark_complete.json > data/rdf/denmark.ttl

For Wikidata Enrichment:

# Query Wikidata for Danish institutions
python3 scripts/enrich_denmark_wikidata.py

For Analysis:

# Generate statistics report
python3 scripts/analyze_denmark_dataset.py

Key Design Decisions

  • Library branches point to their main library through parent_organization, which reduces redundancy across branch records.
  • Archives get a GHCID (no ISIL); GHCID is the primary identifier.
  • Nordic characters are normalized in GHCIDs: æ→ae, ø→oe, å→aa.
  • 98.1% hierarchical linkage (1,176 of 1,199 branches): near-perfect parent-child matching.
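
The character normalization listed above can be sketched with a translation table. Only the lowercase mappings come from this card; the uppercase variants and the function name `normalize_nordic` are assumptions, and the real GHCID generator may apply further steps that are not shown here.

```python
# Lowercase mappings are from the design decision above; uppercase ones are assumed.
NORDIC_MAP = str.maketrans({
    "æ": "ae", "ø": "oe", "å": "aa",
    "Æ": "Ae", "Ø": "Oe", "Å": "Aa",
})


def normalize_nordic(text):
    """Replace æ, ø, and å (and assumed uppercase forms) with ASCII digraphs."""
    return text.translate(NORDIC_MAP)


for name in ["København K", "Hjørring", "Ringkøbing-Skjern"]:
    print(name, "->", normalize_nordic(name))
```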


Session Documents

  • SESSION_SUMMARY_20251119_DENMARK_COMPLETE.md - Full session report
  • SESSION_SUMMARY_20251119_DENMARK_ARCHIVES_COMPLETE.md - Archive processing
  • SESSION_SUMMARY_20251119_DENMARK_ISIL_COMPLETE.md - Library processing

Questions? See SESSION_SUMMARY_20251119_DENMARK_COMPLETE.md for detailed documentation.