# Denmark GLAM Dataset - Quick Reference Card

**Last Updated**: 2025-11-19
**Status**: ✅ Complete (2,348 institutions)

---

## Files at a Glance

| File | Institutions | Size | Purpose |
|------|--------------|------|---------|
| `denmark_complete.json` | **2,348** | 3.06 MB | ⭐ **MASTER FILE** - Use this |
| `denmark_libraries_v2.json` | 555 | 964 KB | Main libraries only |
| `denmark_archives.json` | 594 | 918 KB | Archives only |
| `denmark_library_branches.json` | 1,199 | 1.2 MB | Library branches only |

**Location**: `/Users/kempersc/apps/glam/data/instances/`

---

## Quick Statistics

```
Total Institutions: 2,348
├── Libraries (main): 555
│   ├── Public libraries: 108
│   └── Research libraries (FFU): 447
├── Archives: 594
└── Library branches: 1,199
    ├── Public branches: 594
    └── FFU branches: 605

GHCID Coverage: 998/2,348 (42.5%)
ISIL Coverage: 555/2,348 (23.6%)
Hierarchical Links: 1,176/1,199 (98.1%)
```

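
The headline figures above can be re-derived from the loaded records. The following is a minimal sketch, assuming each record is a dict that uses the `institution_type`, `parent_organization`, and `ghcid_current` fields shown elsewhere in this card. The `summarize` helper and the sample records are hypothetical and only illustrate the arithmetic.

```python
from collections import Counter


def summarize(institutions):
    """Recompute headline counts from a list of institution dicts."""
    total = len(institutions)
    by_type = Counter(i.get('institution_type') for i in institutions)
    # Records that point at a parent via parent_organization
    linked = sum(1 for i in institutions if i.get('parent_organization'))
    with_ghcid = sum(1 for i in institutions if i.get('ghcid_current'))
    return {
        'total': total,
        'by_type': dict(by_type),
        'linked_to_parent': linked,
        'ghcid_coverage_pct': round(100 * with_ghcid / total, 1) if total else 0.0,
    }


# Hypothetical sample records (placeholder ids and GHCID values, not real data)
sample = [
    {'id': 'a', 'institution_type': 'LIBRARY', 'ghcid_current': 'X'},
    {'id': 'b', 'institution_type': 'LIBRARY', 'parent_organization': 'a'},
    {'id': 'c', 'institution_type': 'ARCHIVE', 'ghcid_current': 'Y'},
    {'id': 'd', 'institution_type': 'ARCHIVE'},
]
stats = summarize(sample)
print(stats)
```
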
---

## Common Queries

### Load complete dataset

```python
import json

# Read as UTF-8 explicitly: names and cities contain æ, ø, å
with open('data/instances/denmark_complete.json', 'r', encoding='utf-8') as f:
    danish_glam = json.load(f)
```

### Filter by institution type

```python
archives = [i for i in danish_glam if i['institution_type'] == 'ARCHIVE']
libraries = [i for i in danish_glam if i['institution_type'] == 'LIBRARY']
main_libs = [i for i in libraries if not i.get('parent_organization')]
branches = [i for i in libraries if i.get('parent_organization')]
```

### Find institutions by city

```python
copenhagen = [
    i for i in danish_glam
    if any(loc.get('city') == 'København K' for loc in i.get('locations', []))
]
```

### Get institutions with GHCID

```python
with_ghcid = [i for i in danish_glam if i.get('ghcid_current')]
```

### Get hierarchical structure (library + branches)

```python
def get_library_with_branches(library_id):
    """Get a library and all its branches."""
    # next() raises StopIteration if no record has this id
    library = next(i for i in danish_glam if i['id'] == library_id)
    branches = [
        i for i in danish_glam
        if i.get('parent_organization') == library_id
    ]
    return {'library': library, 'branches': branches}

# Example
kb_system = get_library_with_branches(
    'https://w3id.org/heritage/custodian/dk/library/k%C3%B8benhavn-k/k%C3%B8benhavns-biblioteker'
)
print(f"{kb_system['library']['name']}: {len(kb_system['branches'])} branches")
```

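
When branches are needed for many libraries, scanning the full list on every call can be replaced by a one-pass index. The following is a minimal sketch, assuming `parent_organization` holds the parent's `id` as a plain string, as the comparison in the function above implies. The `build_branch_index` helper and the sample records are hypothetical.

```python
from collections import defaultdict


def build_branch_index(institutions):
    """Map each parent id to the list of records that reference it."""
    index = defaultdict(list)
    for inst in institutions:
        parent = inst.get('parent_organization')
        if parent:
            index[parent].append(inst)
    return index


# Hypothetical sample records, not real data
sample = [
    {'id': 'lib-1', 'name': 'Main Library'},
    {'id': 'br-1', 'name': 'North Branch', 'parent_organization': 'lib-1'},
    {'id': 'br-2', 'name': 'South Branch', 'parent_organization': 'lib-1'},
    {'id': 'arc-1', 'name': 'Town Archive'},
]
branch_index = build_branch_index(sample)

# .get() avoids creating empty entries for ids that have no branches
branch_names = [b['name'] for b in branch_index.get('lib-1', [])]
print(branch_names)
```
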
---

## Top Cities

| City | Count | City | Count |
|------|-------|------|-------|
| Aalborg | 35 | København K | 30 |
| Esbjerg | 30 | Hjørring | 28 |
| Vejle | 28 | Herning | 26 |
| Aarhus | 22 | Ringkøbing-Skjern | 22 |

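
A ranking like the one above can be produced with `collections.Counter`. The following is a minimal sketch, assuming each record carries a `locations` list of dicts with a `city` key, as in the city query earlier in this card. The `top_cities` helper is hypothetical, and counting each institution once per city is an assumption about how the table was built.

```python
from collections import Counter


def top_cities(institutions, n=8):
    """Return the n most common cities, counting each institution once per city."""
    counts = Counter()
    for inst in institutions:
        # A set prevents double counting when one record lists a city twice
        cities = {loc.get('city') for loc in inst.get('locations', [])}
        counts.update(c for c in cities if c)
    return counts.most_common(n)


# Hypothetical sample records, not real data
sample = [
    {'id': 'a', 'locations': [{'city': 'Aalborg'}, {'city': 'Aalborg'}]},
    {'id': 'b', 'locations': [{'city': 'Aalborg'}]},
    {'id': 'c', 'locations': [{'city': 'Esbjerg'}]},
    {'id': 'd'},
]
ranking = top_cities(sample)
print(ranking)
```
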
---

## Next Steps

**For RDF Export**:
```bash
# Generate Turtle RDF
linkml-convert -s schemas/heritage_custodian.yaml -t rdf \
  data/instances/denmark_complete.json > data/rdf/denmark.ttl
```

**For Wikidata Enrichment**:
```bash
# Query Wikidata for Danish institutions
python3 scripts/enrich_denmark_wikidata.py
```

**For Analysis**:
```bash
# Generate statistics report
python3 scripts/analyze_denmark_dataset.py
```

---

## Key Design Decisions

- ✅ **Library branches use parent_organization** - Reduces redundancy
- ✅ **Archives get GHCID (no ISIL)** - GHCID is primary identifier
- ✅ **Nordic characters normalized** - æ→ae, ø→oe, å→aa in GHCID
- ✅ **98.1% hierarchical linkage** - 1,176 of 1,199 branches linked to a parent library

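
The character normalization listed above can be illustrated with `str.translate`. The following is a minimal sketch of the three substitutions only. How GHCIDs are actually assembled is not shown in this card, and both the `normalize_nordic` helper and the uppercase mappings are assumptions added for illustration.

```python
import unicodedata

# Lowercase mappings come from the design decision above;
# the uppercase variants are an assumption.
NORDIC_MAP = str.maketrans({
    'æ': 'ae', 'ø': 'oe', 'å': 'aa',
    'Æ': 'Ae', 'Ø': 'Oe', 'Å': 'Aa',
})


def normalize_nordic(text):
    """Replace æ, ø, å with their ASCII digraphs."""
    # NFC first, so 'a' + combining ring is treated as a single 'å'
    return unicodedata.normalize('NFC', text).translate(NORDIC_MAP)


examples = ['København', 'Hjørring', 'Ringkøbing-Skjern']
normalized = [normalize_nordic(name) for name in examples]
print(normalized)
```
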
---

## Session Documents

- `SESSION_SUMMARY_20251119_DENMARK_COMPLETE.md` - Full session report
- `SESSION_SUMMARY_20251119_DENMARK_ARCHIVES_COMPLETE.md` - Archive processing
- `SESSION_SUMMARY_20251119_DENMARK_ISIL_COMPLETE.md` - Library processing

---

**Questions?** See `SESSION_SUMMARY_20251119_DENMARK_COMPLETE.md` for detailed documentation.