glam/data/isil/germany/ARCHIVPORTAL_D_DISCOVERY.md
2025-11-19 23:25:22 +01:00

# Archivportal-D Discovery - National German Archive Portal
**Date**: November 19, 2025
**Discovery**: Found comprehensive German archive aggregation portal

---
## What is Archivportal-D?
**URL**: https://www.archivportal-d.de/
**Operator**: Deutsche Digitale Bibliothek (German Digital Library)
**Coverage**: ALL archives across Germany (federal, state, municipal, church, business, etc.)
### Key Features
- **National aggregator** - Covers all 16 federal states
- **Multiple archive types** - State, municipal, church, nobility, business, political, etc.
- **Digital finding aids** - Over 6,500 retrievable resources
- **3 million+ searchable items**
- **200,000+ digital copies**
---
## Coverage by Federal State
✅ All 16 German states included:
1. **Baden-Württemberg** - https://www.landesarchiv-bw.de/
2. **Bayern (Bavaria)** - State archive system
3. **Berlin** - Berlin state archives
4. **Brandenburg** - Brandenburg archives
5. **Bremen** - arcinsys.niedersachsen.de (shared with Niedersachsen)
6. **Hamburg** - Hamburg state archives
7. **Hessen** - arcinsys.hessen.de
8. **Mecklenburg-Vorpommern** - State archives
9. **Niedersachsen (Lower Saxony)** - https://www.arcinsys.niedersachsen.de/
10. **Nordrhein-Westfalen (NRW)** - https://www.archive.nrw.de/ (477 archives)
11. **Rheinland-Pfalz (Rhineland-Palatinate)** - State archives
12. **Saarland** - State archives
13. **Sachsen (Saxony)** - State archives
14. **Sachsen-Anhalt (Saxony-Anhalt)** - State archives
15. **Schleswig-Holstein** - arcinsys (shared system)
16. **Thüringen (Thuringia)** - State archives
---
## Archive Sectors in Archivportal-D
- **State archives** (Landesarchive)
- **Local/municipal archives** (Kommunalarchive)
- **Church archives** (Kirchenarchive)
- **Nobility and family archives** (Adelsarchive)
- **Business/economic archives** (Wirtschaftsarchive)
- **Political archives** (Politische Archive)
- **University archives** (Hochschularchive)
- **Media archives** (Medienarchive)
- **Other specialized archives** (Sonstige Archive)
---
## Strategic Importance
### Why Archivportal-D is Critical
1. **Comprehensive aggregation** - Single portal for ALL German archives
2. **Beyond ISIL** - Includes archives without ISIL codes
3. **Structured data** - Machine-readable archive metadata
4. **Official source** - Operated by national library infrastructure
5. **Up-to-date** - Actively maintained and updated
### Comparison with ISIL Registry
| Source | Coverage | Institutions | Data Type |
|--------|----------|--------------|-----------|
| **ISIL Registry** | ISIL-registered only | 16,979 | Libraries + Archives + Museums |
| **Archivportal-D** | All archives | ~10,000-20,000 (unverified estimate) | Archives only (focused) |
| **Combined** | Complete | ~25,000-35,000 | Comprehensive |
---
## Regional Archive Information Systems
### arcinsys - Multi-State Collaboration
**URL**: https://www.arcinsys.niedersachsen.de/
**States**: Niedersachsen, Bremen, Hessen, Schleswig-Holstein
**Features**:
- Shared archival information system
- Standardized metadata
- Cross-state searching
- Full archival descriptions (fonds, series, items)
### State-Specific Portals
1. **NRW**: https://www.archive.nrw.de/ (477 archives)
2. **Baden-Württemberg**: https://www.landesarchiv-bw.de/
3. **Niedersachsen/Bremen**: https://www.arcinsys.niedersachsen.de/
4. **Others**: Various regional systems
---
## Harvest Strategy
### Option 1: Scrape Archivportal-D (RECOMMENDED)
- **Pros**: Single source, comprehensive, national coverage
- **Cons**: May require complex scraping (pagination, filters)
- **Estimated institutions**: 10,000-20,000 archives
### Option 2: Scrape Individual State Portals
- **Pros**: More detailed metadata per state
- **Cons**: 16 different systems, inconsistent formats
- **Estimated institutions**: Similar total, more work
### Option 3: Hybrid Approach
- **Step 1**: Harvest Archivportal-D for complete list
- **Step 2**: Enrich with state portals for detailed metadata
- **Step 3**: Cross-reference with ISIL registry
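
The three steps could be wired together roughly as follows. This is a minimal sketch, not the project's actual pipeline: it assumes records are plain dicts joined on a case-folded name plus federal state, and the function names, the join key, and the sample values (including the placeholder ISIL `DE-EXAMPLE`) are all hypothetical.

```python
def record_key(record):
    """Hypothetical join key: case-folded name plus federal state."""
    return (record["name"].strip().casefold(), record["federal_state"])


def hybrid_merge(portal_records, state_records, isil_by_key):
    """Step 1 list -> Step 2 enrichment -> Step 3 ISIL cross-reference."""
    details = {record_key(r): r for r in state_records}
    merged = []
    for base in portal_records:
        key = record_key(base)
        combined = dict(base)
        # Step 2: state-portal fields fill gaps but never overwrite portal values.
        for field, value in details.get(key, {}).items():
            combined.setdefault(field, value)
        # Step 3: attach an ISIL code when the registry knows this archive.
        combined["isil"] = isil_by_key.get(key)
        merged.append(combined)
    return merged


# Placeholder data for illustration only.
portal = [{"name": "Stadtarchiv Beispielstadt", "federal_state": "Nordrhein-Westfalen"}]
state = [{"name": "stadtarchiv beispielstadt ", "federal_state": "Nordrhein-Westfalen",
          "finding_aids": 12}]
isil = {("stadtarchiv beispielstadt", "Nordrhein-Westfalen"): "DE-EXAMPLE"}
result = hybrid_merge(portal, state, isil)
```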
---
## Next Actions
### Immediate
1. Create Archivportal-D scraper
2. Harvest all German archives from portal
3. Merge with ISIL dataset (16,979 institutions)
4. Create unified German archive database
### Data Integration
1. Match Archivportal-D records with ISIL codes
2. Identify archives without ISIL (new discoveries)
3. Enrich ISIL records with archival finding aids
4. Generate comprehensive German GLAM dataset
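
Steps 1 and 2 amount to partitioning the harvested records by whether their ISIL code appears in the registry. The sketch below illustrates that idea under the assumption that each record is a dict with an optional `isil` field. The function name and the sample codes are hypothetical placeholders, not real registry entries.

```python
def split_by_isil(portal_records, registry_isils):
    """Partition portal records into ISIL-matched and unmatched lists."""
    matched, unmatched = [], []
    for record in portal_records:
        code = record.get("isil")
        if code and code in registry_isils:
            matched.append(record)
        else:
            # No code, or a code the registry does not list: a candidate new discovery.
            unmatched.append(record)
    return matched, unmatched


# Placeholder data for illustration only.
registry = {"DE-EXAMPLE-1", "DE-EXAMPLE-2"}
records = [
    {"name": "Archive A", "isil": "DE-EXAMPLE-1"},
    {"name": "Archive B", "isil": None},
    {"name": "Archive C", "isil": "DE-UNKNOWN"},
]
matched, unmatched = split_by_isil(records, registry)
```

Records that land in `unmatched` would still need a name-based comparison before being counted as new, since an archive may simply be missing its ISIL code on the portal.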
### Quality Metrics
- **Before**: 301 NRW archives in the dataset (63% of the 477 listed on the NRW portal)
- **After**: Target 100% NRW archive coverage
- **National**: Target 10,000-20,000 German archives total
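
The NRW figure can be checked with a short calculation, using the 301 archives already held and the 477 archives listed on the NRW portal. The helper name is illustrative only.

```python
def coverage_percent(harvested, total):
    """Share of portal archives already harvested, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * harvested / total, 1)


nrw_harvested, nrw_total = 301, 477
nrw_coverage = coverage_percent(nrw_harvested, nrw_total)  # 63.1
nrw_missing = nrw_total - nrw_harvested                    # 176 archives still to add
print(nrw_coverage, nrw_missing)
```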
---
## Technical Approach
### Archivportal-D API/Scraping
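```python
from urllib.parse import urlencode

# TODO: check whether a public API exists; until then, use the listing pages.
BASE_URL = "https://www.archivportal-d.de/struktur"
# Assumed to combine: zero-based pagination (?page=0, ?page=1, ...) and the
# facetValues[] filter, which urlencode writes as facetValues%5B%5D=...
params = [("lang", "en"), ("page", 0), ("facetValues[]", "federalState-Nordrhein-Westfalen")]
print(f"{BASE_URL}?{urlencode(params)}")
```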
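A pagination loop over these listing URLs might look like the sketch below. It assumes that the `lang`, `page`, and `facetValues[]` parameters can be combined in one request and that an empty page marks the end of the results. Neither assumption has been verified against the live portal. `listing_url`, `harvest_pages`, and the stand-in `fake_fetch` are hypothetical names; a real fetcher would download and parse each page.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = "https://www.archivportal-d.de/struktur"


def listing_url(page, federal_state=None):
    """Build a listing URL for one zero-based page, optionally filtered by state."""
    params = [("lang", "en"), ("page", page)]
    if federal_state:
        params.append(("facetValues[]", f"federalState-{federal_state}"))
    return f"{BASE_URL}?{urlencode(params)}"


def harvest_pages(fetch_page, federal_state=None, max_pages=1000):
    """Yield records page by page until a page comes back empty (assumed stop rule)."""
    for page in range(max_pages):
        records = fetch_page(listing_url(page, federal_state))
        if not records:
            break
        yield from records


def fake_fetch(url):
    """Stand-in fetcher for illustration: two pages of fake records, then nothing."""
    page = int(parse_qs(urlsplit(url).query)["page"][0])
    fake_pages = [["archive-1", "archive-2"], ["archive-3"]]
    return fake_pages[page] if page < len(fake_pages) else []


harvested = list(harvest_pages(fake_fetch, "Nordrhein-Westfalen"))
```

Passing the fetcher in as a parameter keeps the loop testable without network access, and `max_pages` guards against a portal that never returns an empty page.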
### Data Schema
```yaml
- name: Archive name
  location: City/region
  federal_state: State (Bundesland)
  archive_type: Sector category
  finding_aids: Available finding aids
  digital_copies: Digitized materials count
  isil: ISIL code (if available)
  source: archivportal-d
```
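In Python, the same schema could be held in a small dataclass. This is a sketch only: the class name is hypothetical, and treating `finding_aids` and `digital_copies` as optional integer counts is an assumption, since the schema above does not fix their types.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ArchiveRecord:
    """One harvested archive, mirroring the YAML schema fields."""
    name: str
    location: str
    federal_state: str
    archive_type: str
    finding_aids: Optional[int] = None    # assumed to be a count
    digital_copies: Optional[int] = None  # count of digitized materials
    isil: Optional[str] = None            # None when the archive has no ISIL code
    source: str = "archivportal-d"


# Placeholder values for illustration only.
record = ArchiveRecord(
    name="Example Archive",
    location="Example City",
    federal_state="Nordrhein-Westfalen",
    archive_type="Kommunalarchive",
)
record_dict = asdict(record)
```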
---
## References
- Archivportal-D: https://www.archivportal-d.de/
- Deutsche Digitale Bibliothek: https://www.deutsche-digitale-bibliothek.de/
- arcinsys Niedersachsen: https://www.arcinsys.niedersachsen.de/
- NRW Archive Portal: https://www.archive.nrw.de/
- Baden-Württemberg: https://www.landesarchiv-bw.de/
---
**Status**: Discovery complete, harvest planned
**Priority**: HIGH - Critical for complete German coverage
**Estimated gain**: +5,000-10,000 archives beyond ISIL registry