# German Regional Archive Portals - Discovery Report

**Date**: 2025-11-19
**Method**: Exa deep web search
**Context**: Discovered after finding archive.nrw.de (441 archives)

---

## Executive Summary

Discovered **12+ regional archive portals** across German federal states (Bundesländer), each structured similarly to archive.nrw.de. These portals provide searchable access to state, municipal, church, and specialized archives within each region.

### Key Finding

**Germany has a FEDERATED archive system** - each state (Bundesland) operates its own archive portal, with **Archivportal-D** serving as the national aggregator. As a result, regional portals contain MORE detailed information than the national ISIL registry or DDB.

---

## National Archive Portal

### Archivportal-D (National Aggregator)

**URL**: https://www.archivportal-d.de/
**Scope**: All 16 German federal states
**Language**: German + English
**Technology**: Part of Deutsche Digitale Bibliothek (DDB)

**Features**:
- Search across all German archives
- Filter by federal state (Bundesland)
- Filter by sector (state, municipal, church, business, etc.)
- Finding aids and digital copies
- Links to regional portals

**Federal States Covered**:
- Baden-Württemberg
- Bayern (Bavaria)
- Berlin
- Brandenburg
- Bremen
- Hamburg
- Hessen
- Mecklenburg-Vorpommern
- Niedersachsen (Lower Saxony)
- Nordrhein-Westfalen (NRW) ✅ **Already harvested**
- Rheinland-Pfalz (Rhineland-Palatinate)
- Saarland
- Sachsen (Saxony)
- Sachsen-Anhalt (Saxony-Anhalt)
- Schleswig-Holstein
- Thüringen (Thuringia)

---

## Regional Archive Portals by State

### 1. Nordrhein-Westfalen (NRW) ✅ **HARVESTED**

**Portal**: https://www.archive.nrw.de/archivsuche
**Status**: ✅ **441 archives harvested (2025-11-19)**
**Technology**: Drupal-based, JavaScript rendering
**Archive Types**: Municipal, district, state, university, church, corporate

**Harvest Results**:
- 441 archives extracted
- 356 cities covered
- 85 new institutions added to German dataset

---

### 2. Niedersachsen & Bremen (Arcinsys)

**Portal**: https://arcinsys.niedersachsen.de/
**Also**: http://arcinsys.niedersachsen.de/ (HTTP redirects to HTTPS)
**Language**: German + English
**Technology**: Arcinsys (shared with Hessen, Schleswig-Holstein)

**Features**:
- Joint portal for Niedersachsen AND Bremen
- Niedersächsisches Landesarchiv (7 locations)
- Municipal, church, and business archives
- Online finding aids
- Digital copies available
- User registration for ordering archival items

**Participating Archives**:
- State archives (Landesarchiv)
- District archives (Kreisarchive)
- Municipal archives (Stadtarchive)
- Community archives (Gemeindearchive)
- Church archives (Kirchenarchive)
- University archives (Hochschularchive)
- Business archives (Wirtschaftsarchive)
- Media archives (Medienarchive)

**Harvest Potential**: HIGH (likely 300+ archives)

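The German type names in the participating-archives list lend themselves to a simple keyword lookup when labelling harvested records. The following is a minimal sketch, assuming archive names contain one of these terms; the `TYPE_KEYWORDS` table, the English labels, and the `classify_archive` helper are illustrative and not part of any portal's data model.

```python
# Hypothetical keyword table: German archive-type term -> English label.
# Order matters: the first matching keyword wins.
TYPE_KEYWORDS = [
    ("landesarchiv", "state"),
    ("staatsarchiv", "state"),
    ("kreisarchiv", "district"),
    ("stadtarchiv", "municipal"),
    ("gemeindearchiv", "community"),
    ("kirchenarchiv", "church"),
    ("hochschularchiv", "university"),
    ("universitätsarchiv", "university"),
    ("wirtschaftsarchiv", "business"),
    ("medienarchiv", "media"),
]


def classify_archive(name: str) -> str:
    """Return a coarse archive type based on keywords in the archive name."""
    lowered = name.casefold()
    for keyword, archive_type in TYPE_KEYWORDS:
        if keyword in lowered:
            return archive_type
    return "unknown"
```

Names that match no keyword fall back to `"unknown"` so they can be reviewed by hand rather than silently mislabelled.
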
---

### 3. Schleswig-Holstein (Arcinsys)

**Portal**: https://arcinsys.schleswig-holstein.de/
**Language**: German + English
**Technology**: Arcinsys (shared system)

**Features**:
- State archive in Schleswig (Prinzenpalais)
- Municipal and church archives
- Same Arcinsys interface as Niedersachsen
- Searchable finding aids
- Digital copies

**Harvest Potential**: MEDIUM (likely 150+ archives)

---

### 4. Hessen (Arcinsys)

**Portal**: https://arcinsys.hessen.de/
**Language**: German
**Technology**: Arcinsys (original developer)

**Features**:
- Hessisches Landesarchiv
- Municipal and specialized archives
- Finding aids online
- Part of the Arcinsys consortium (3 portals covering 4 states)

**Note**: Hessen developed Arcinsys, which was later adopted by Niedersachsen (jointly with Bremen) and Schleswig-Holstein

**Harvest Potential**: MEDIUM-HIGH (likely 200+ archives)

---

### 5. Thüringen (Thuringia)

**Portal**: https://www.archive-in-thueringen.de/
**Also**: https://tharchivtest.thueringen.de/ (test environment)
**Language**: German + English
**Technology**: Custom archive portal

**Statistics (from portal)**:
- **149 archives**
- **14,793 inventories**
- **2,863 online finding aids**

**Archive Types**:
- State archives (6 locations: Altenburg, Gotha, Greiz, Meiningen, Rudolstadt, Weimar)
- Main state archive (Weimar)
- Municipal archives
- Specialized archives

**Features**:
- Cross-archive search
- Online finding aids
- Archive descriptions with historical context
- Newspaper and periodical collections

**Harvest Potential**: **HIGH - 149 archives confirmed**

---

### 6. Brandenburg

**Portal**: https://blha.brandenburg.de/
**Name**: Brandenburgisches Landeshauptarchiv
**Language**: German + English + Polish
**Location**: Potsdam (Zum Windmühlenberg)

**Features**:
- Main state archive for Brandenburg
- Holdings from 10th century to present
- Six main collections (Kurmark, Neumark, Niederlausitz, Prussian, GDR, modern Brandenburg)
- Research services
- Digital provenance research (NS-era financial records)

**Structure**: Centralized state archive (not a portal of multiple archives)

**Harvest Potential**: LOW (1 main institution, but check for branch archives)

---

### 7. Sachsen (Saxony)

**Portal**: https://www.staatsarchiv.sachsen.de/
**Name**: Sächsisches Staatsarchiv
**Language**: German

**Features**:
- State archive system
- Multiple locations (Dresden, Leipzig, Chemnitz, Freiberg, Bautzen)
- Historical records from medieval period
- Online research portal
- Finding aids

**Harvest Potential**: MEDIUM (state archive with multiple locations + municipal archives)

---

### 8. Sachsen-Anhalt (Saxony-Anhalt)

**Portal**: https://landesarchiv.sachsen-anhalt.de/
**Also**: https://lha.sachsen-anhalt.de/
**Name**: Landesarchiv Sachsen-Anhalt (LASA)

**Locations**:
- Abteilung Magdeburg
- Abteilung Dessau
- Abteilung Merseburg

**Features**:
- Three department locations
- Church book duplicates (Kirchenbuchduplikate)
- Civil status registers (Zivilstandsregister)
- Online research portal
- Genealogical resources

**Harvest Potential**: MEDIUM (3 main locations + municipal archives)

---

### 9. Baden-Württemberg

**Portal**: https://www.landesarchiv-bw.de/
**Name**: Landesarchiv Baden-Württemberg
**Language**: German + English
**Online System**: https://www2.landesarchiv-bw.de/ofs21/

**Features**:
- State archive system
- Multiple historical territories (Baden, Württemberg, Hohenzollern)
- Online finding aids (Findmittelsystem)
- Research services
- Medieval to modern holdings

**Harvest Potential**: HIGH (unified state archive + municipal networks)

---

### 10. Bayern (Bavaria)

**Portal**: https://www.gda.bayern.de/
**Name**: Generaldirektion der Staatlichen Archive Bayerns
**Language**: German (+ minimal English)

**State Archives**:
1. Bayerisches Hauptstaatsarchiv (Munich) - central repository
2. Staatsarchiv Amberg (Oberpfalz)
3. Staatsarchiv Augsburg (Schwaben)
4. Staatsarchiv Bamberg (Oberfranken)
5. Staatsarchiv Coburg (Oberfranken)
6. Staatsarchiv Landshut (Niederbayern)
7. Staatsarchiv München (Oberbayern)
8. Staatsarchiv Nürnberg (Mittelfranken)
9. Staatsarchiv Würzburg (Unterfranken)

**Features**:
- 9 state archives covering Bavaria's administrative regions
- Holdings from 777 CE (oldest charter)
- Genealogical research services
- No unified search portal (each archive separate)

**Harvest Potential**: HIGH (9 state archives + extensive municipal network)

---

### 11. Rheinland-Pfalz (Rhineland-Palatinate)

**Status**: Mentioned in Archivportal-D but **no dedicated regional portal found**

**Known Archives**:
- Landesarchiv Rheinland-Pfalz
- Municipal archives (Stadtarchive)

**Harvest Potential**: MEDIUM (rely on Archivportal-D or ISIL registry)

---

### 12. Mecklenburg-Vorpommern

**Portal**: https://www.digitale-bibliothek-mv.de/viewer/cms/
**Name**: Landeshauptarchiv Schwerin
**Part of**: Digitale Bibliothek Mecklenburg-Vorpommern

**Features**:
- Historical collections from Landeshauptarchiv Schwerin
- 15th century origins (ducal archives)
- Merged with Geheimes und Hauptarchiv (1779)
- Digital collections online

**Harvest Potential**: MEDIUM (state archive + regional municipal archives)

---

### 13. Saarland

**Status**: Mentioned in Archivportal-D but **no dedicated regional portal found**

**Harvest Potential**: LOW-MEDIUM (small state, rely on Archivportal-D)

---

### 14. Hamburg

**Status**: City-state, archives part of Hamburg government

**Harvest Potential**: LOW (single city-state archive)

---

### 15. Berlin

**Status**: City-state, archives part of Berlin government

**Harvest Potential**: LOW (single city-state archive)

---

### 16. Bremen

**Portal**: Part of Arcinsys Niedersachsen und Bremen
**URL**: https://www.staatsarchiv.bremen.de/
**Name**: Staatsarchiv Bremen

**Status**: Integrated into Arcinsys Niedersachsen portal (see #2 above)

**Harvest Potential**: LOW (covered by Niedersachsen harvest)

---

## Harvest Priority Ranking

Based on archive count, portal accessibility, and harvest feasibility:

### Priority 1 - High Impact (confirmed or largest expected yields)

1. **Thüringen** ⭐ - 149 archives CONFIRMED
2. **Niedersachsen & Bremen (Arcinsys)** ⭐ - 300+ archives estimated
3. **Baden-Württemberg** ⭐ - 200+ archives estimated
4. **Bayern (Bavaria)** - 9 state archives + municipal network

### Priority 2 - Medium Impact (roughly 100-200 archives each)

5. **Hessen (Arcinsys)** - 200+ archives estimated
6. **Schleswig-Holstein (Arcinsys)** - 150+ archives estimated
7. **Sachsen (Saxony)** - State archive system + municipalities
8. **Sachsen-Anhalt** - 3 departments + municipalities

### Priority 3 - Lower Impact (<100 archives)

9. **Mecklenburg-Vorpommern** - State archive + regional
10. **Brandenburg** - Centralized system (1 main archive)
11. **Rheinland-Pfalz** - No dedicated portal (use Archivportal-D)
12. **Saarland** - Small state (use Archivportal-D)
13. **Hamburg** - City-state (single archive)
14. **Berlin** - City-state (single archive)

---

## Technical Observations

### Portal Technologies

1. **Arcinsys** (Hessen, Niedersachsen, Bremen, Schleswig-Holstein)
   - Shared platform developed by Hessen
   - Consistent interface across 4 states (3 portals)
   - User registration system
   - Finding aids + digital copies
   - Web-based ordering system

2. **Custom Drupal** (NRW)
   - JavaScript-rendered
   - Archive navigation by category
   - Button-based interface

3. **Custom Portals** (Thüringen, Baden-Württemberg, Sachsen)
   - State-specific designs
   - Online finding aids
   - Search interfaces

4. **Institutional Websites** (Bayern, Brandenburg)
   - Individual archive websites
   - No unified search portal

### Common Features Across Portals

- ✅ **Archive Directory** - List of participating archives
- ✅ **Finding Aids** - Searchable inventories (Findmittel)
- ✅ **Digital Copies** - Scanned archival materials
- ✅ **Archive Descriptions** - Historical context, holdings info
- ✅ **Contact Information** - Addresses, hours, services
- ✅ **User Accounts** - Registration for ordering materials

### Harvest Challenges

1. **Arcinsys Portals** - May require clicking through archive listings
2. **JavaScript Rendering** - Need Playwright/Selenium (like NRW)
3. **No Unified API** - Each portal has custom structure
4. **Mostly German-Language Content** - Several portals offer an English interface, but others (Hessen, Sachsen) are German-only
5. **Finding Aid vs Directory** - Some portals focus on inventories, not archive lists

---

## Harvest Strategy Recommendations

### Approach 1: Arcinsys Consortium (3 portals covering 4 states, ~650 archives)

**Targets**: Niedersachsen & Bremen, Schleswig-Holstein, Hessen
**Technology**: Shared Arcinsys platform
**Advantage**: Consistent structure, can reuse scraping logic

**Steps**:
1. Analyze Arcinsys archive directory structure
2. Build unified scraper for all 3 Arcinsys portals
3. Extract archive names, cities, types, contact info
4. Geocode and merge with German dataset

**Expected Yield**: 600+ archives

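The "one scraper, three portals" idea in the steps above can be sketched as a small portal table plus a shared extraction function. This is only an illustration under stated assumptions: the portal URLs come from this report, but the `archive-link` CSS class, the `ArcinsysPortal` record, and `extract_archives` are hypothetical, since the real Arcinsys directory markup has not been analyzed yet (step 1). The sketch parses already-downloaded HTML with the standard library so it runs without network access.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass(frozen=True)
class ArcinsysPortal:
    key: str
    base_url: str
    states: tuple[str, ...]


# Portal URLs as listed in this report; the short keys are made up for illustration.
PORTALS = (
    ArcinsysPortal("ni-hb", "https://arcinsys.niedersachsen.de/", ("Niedersachsen", "Bremen")),
    ArcinsysPortal("sh", "https://arcinsys.schleswig-holstein.de/", ("Schleswig-Holstein",)),
    ArcinsysPortal("he", "https://arcinsys.hessen.de/", ("Hessen",)),
)


class ArchiveLinkParser(HTMLParser):
    """Collects the text of <a class="archive-link"> elements (assumed markup)."""

    def __init__(self) -> None:
        super().__init__()
        self._in_link = False
        self.names: list[str] = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "a" and "archive-link" in classes:
            self._in_link = True
            self.names.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False

    def handle_data(self, data):
        if self._in_link:
            self.names[-1] += data


def extract_archives(portal: ArcinsysPortal, html: str) -> list[dict]:
    """Turn one portal's directory HTML into uniform records."""
    parser = ArchiveLinkParser()
    parser.feed(html)
    return [
        {"name": name.strip(), "portal": portal.key, "source_url": portal.base_url}
        for name in parser.names
        if name.strip()
    ]
```

Because every record carries its portal key and source URL, the output of all three portals can be concatenated before geocoding and merging.
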
---

### Approach 2: High-Impact Custom Portals (2 states, ~350 archives)

**Targets**: Thüringen (149 confirmed), Baden-Württemberg (200+ estimated)
**Technology**: Custom portals
**Advantage**: High archive counts, separate portal structures

**Steps**:
1. Thüringen: Scrape https://www.archive-in-thueringen.de/ (149 archives listed)
2. Baden-Württemberg: Scrape https://www.landesarchiv-bw.de/ directory
3. Extract and merge

**Expected Yield**: 350+ archives

---

### Approach 3: Bayern State Archives (9 archives + municipal)

**Target**: Bayern (Bavaria)
**Technology**: Individual archive websites
**Challenge**: No unified portal, must compile from GDA directory

**Steps**:
1. Scrape archive list from https://www.gda.bayern.de/archive
2. Extract 9 state archives (Hauptstaatsarchiv + 8 regional)
3. Check for municipal archive lists on state archive websites

**Expected Yield**: 10-50 archives (state + major municipal)

---

### Approach 4: National Aggregator (Archivportal-D)

**Target**: All remaining states (Rheinland-Pfalz, Saarland, etc.)
**Portal**: https://www.archivportal-d.de/
**Advantage**: Single portal for all states

**Steps**:
1. Scrape Archivportal-D archive directory
2. Filter by federal state
3. Extract archive metadata (name, city, type, sector)
4. Cross-reference with existing harvests (avoid duplicates)

**Expected Yield**: 1,000+ archives (all Germany, including duplicates from regional portals)

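Step 4 (cross-referencing against existing harvests) could look roughly like the sketch below. It uses the standard library's `difflib.SequenceMatcher` as a stand-in for RapidFuzz so that it runs without extra dependencies; the record fields (`name`, `city`), the 0.9 threshold, and the function names are assumptions chosen for illustration, not values taken from the NRW merge.

```python
import re
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    name = re.sub(r"[^\w\s]", " ", name.casefold())
    return " ".join(name.split())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized archive names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def split_new_and_duplicates(harvested, existing, threshold=0.9):
    """Compare each harvested record only against existing records in the same city."""
    new, duplicates = [], []
    for record in harvested:
        candidates = [e for e in existing if e["city"] == record["city"]]
        best = max(
            (similarity(record["name"], e["name"]) for e in candidates),
            default=0.0,
        )
        (duplicates if best >= threshold else new).append(record)
    return new, duplicates
```

Restricting comparisons to records in the same city keeps the number of string comparisons small and avoids matching identically named archive types (for example two different "Stadtarchiv" entries) across cities.
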
---

## Expected German Dataset Growth

### Current State (Post-NRW)

- **Total German Institutions**: 20,846
- **Sources**: ISIL + DDB + NRW
- **NRW Archives**: 441

### Projected Growth (Optimistic Scenario)

| Portal/State | Expected Archives | Duplicates (Est.) | Net New |
|--------------|-------------------|-------------------|---------|
| **Thüringen** | 149 | 30 (20%) | 119 |
| **Niedersachsen & Bremen (Arcinsys)** | 350 | 70 (20%) | 280 |
| **Schleswig-Holstein (Arcinsys)** | 150 | 30 (20%) | 120 |
| **Hessen (Arcinsys)** | 200 | 40 (20%) | 160 |
| **Baden-Württemberg** | 250 | 50 (20%) | 200 |
| **Bayern** | 50 | 10 (20%) | 40 |
| **Sachsen** | 150 | 30 (20%) | 120 |
| **Sachsen-Anhalt** | 100 | 20 (20%) | 80 |
| **Other states** | 200 | 40 (20%) | 160 |
| **TOTAL** | **1,599** | **320** | **1,279** |

### Projected German Dataset (After Regional Harvests)

- **Before Regional Harvests**: 20,846 institutions
- **Expected New Additions**: ~1,280 archives
- **Projected Total**: **~22,100 German institutions**

### Phase 1 Impact

- **Current Phase 1**: 38,479 / 97,000 (39.7%)
- **After German Regional Harvests**: 39,800 / 97,000 (41.0%)
- **Gain**: +1.3 percentage points

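The projection applies a flat 20% duplicate rate, whereas the NRW harvest itself matched 356 of 441 archives (80.7%) against existing data. The short sketch below recomputes the totals under both assumptions so the sensitivity is visible; the per-state counts are copied from the table, while the function names are illustrative.

```python
# Expected archive counts per portal/state, copied from the projection table.
EXPECTED = {
    "Thüringen": 149,
    "Niedersachsen & Bremen (Arcinsys)": 350,
    "Schleswig-Holstein (Arcinsys)": 150,
    "Hessen (Arcinsys)": 200,
    "Baden-Württemberg": 250,
    "Bayern": 50,
    "Sachsen": 150,
    "Sachsen-Anhalt": 100,
    "Other states": 200,
}


def project(expected: dict[str, int], duplicate_rate: float) -> dict[str, int]:
    """Total harvested, estimated duplicates, and net new for a given duplicate rate."""
    total = sum(expected.values())
    duplicates = round(total * duplicate_rate)
    return {"total": total, "duplicates": duplicates, "net_new": total - duplicates}


def phase1_share(current: int, gain: int, target: int) -> float:
    """Phase 1 completion in percent after adding `gain` institutions."""
    return round((current + gain) / target * 100, 1)


optimistic = project(EXPECTED, 0.20)      # assumption used in the table
nrw_like = project(EXPECTED, 356 / 441)   # duplicate rate observed for NRW

print(optimistic, phase1_share(38_479, optimistic["net_new"], 97_000))
print(nrw_like, phase1_share(38_479, nrw_like["net_new"], 97_000))
```

Under the table's 20% assumption this reproduces the 1,279 net new archives and the 41.0% Phase 1 figure; with the NRW-observed rate, net new drops to roughly 310 and Phase 1 reaches about 40.0%.
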
---

## Recommended Next Steps

### Immediate Actions

1. **Start with Thüringen** ⭐ (149 archives confirmed, easiest harvest)
   - Portal: https://www.archive-in-thueringen.de/
   - Build scraper for archive directory
   - Estimated time: 30 minutes

2. **Harvest Arcinsys Consortium** ⭐ (600+ archives, unified platform)
   - Portals: Niedersachsen, Schleswig-Holstein, Hessen
   - Build shared Arcinsys scraper
   - Estimated time: 2-3 hours

3. **Harvest Baden-Württemberg** (200+ archives)
   - Portal: https://www.landesarchiv-bw.de/
   - Custom scraper for archive directory
   - Estimated time: 1 hour

### Medium-Term Goals

4. **Harvest Bayern** (9-50 archives)
5. **Harvest Sachsen** (150+ archives)
6. **Harvest Sachsen-Anhalt** (100+ archives)

### Long-Term Strategy

7. **Use Archivportal-D as fallback** for remaining states
8. **Cross-reference regional harvests** with Archivportal-D to catch missing archives
9. **Validate against ISIL registry** for quality control

---

## Technical Requirements

### Tools Needed

- **Playwright** - JavaScript rendering (Arcinsys, Thüringen)
- **BeautifulSoup** - HTML parsing
- **RapidFuzz** - Deduplication (fuzzy matching)
- **Nominatim** - Geocoding (rate-limited 1 req/sec)

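The 1 request/second limit noted for Nominatim is easiest to respect with a small throttle wrapped around whatever geocoding call is used. The sketch below is a generic pattern, not Nominatim client code: `geocode` is a caller-supplied function (hypothetical here), and the injectable `clock` and `sleep` parameters exist only to make the throttle testable.

```python
import time
from typing import Callable, Iterable


class RateLimiter:
    """Ensures at least `min_interval` seconds between successive calls to wait()."""

    def __init__(self, min_interval: float = 1.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first call

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def geocode_all(cities: Iterable[str], geocode: Callable[[str], object],
                limiter: RateLimiter) -> dict:
    """Geocode each distinct city once, pausing between requests."""
    results = {}
    for city in dict.fromkeys(cities):  # de-duplicate while preserving order
        limiter.wait()
        results[city] = geocode(city)
    return results
```

Geocoding each distinct city only once matters here because many archives share a city, so the number of rate-limited requests is bounded by the city count rather than the archive count.
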
### Scraper Pattern (from NRW Success)

```python
import asyncio

from playwright.async_api import async_playwright


async def harvest(portal_url: str) -> list[str]:
    # 1. Use Playwright for JavaScript-rendered portals
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()
        await page.goto(portal_url)
        await page.wait_for_load_state('networkidle')

        # 2. Extract archive buttons/links
        archives = await page.locator('.archive-button').all()

        # 3. Extract text without clicking (fast approach)
        names = []
        for archive in archives:
            name = await archive.inner_text()
            names.append(name.strip())
            # Parse city from name using regex

        await browser.close()
    return names

# 4. Geocode cities
# 5. Merge with existing dataset (fuzzy matching)
# 6. Export unified dataset

if __name__ == "__main__":
    print(asyncio.run(harvest("https://www.archive.nrw.de/archivsuche")))
```

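The "parse city from name using regex" step in the pattern above might be sketched as follows. This assumes archive names of the form "Stadtarchiv Köln" or "Gemeindearchiv Bad Sassendorf"; the prefix list and the `parse_city` helper are illustrative guesses, not the expression used in the NRW harvest, and names that do not fit the pattern return `None` for manual review.

```python
import re
from typing import Optional

# Assumed naming convention: "<Type>archiv <Place>", e.g. "Stadtarchiv Köln".
_NAME_PATTERN = re.compile(
    r"^(?:Stadt|Gemeinde|Kreis|Universitäts)archiv\s+(?P<place>.+)$"
)


def parse_city(name: str) -> Optional[str]:
    """Return the place part of an archive name, or None if the pattern does not match."""
    match = _NAME_PATTERN.match(name.strip())
    return match.group("place").strip() if match else None
```

For "Kreisarchiv" entries the extracted place is a district name rather than a city, so those results may need a separate lookup before geocoding.
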
---

## Key Insights

### 1. Federated Structure

Germany's archive system is **highly federated** - each state operates independently with its own portal/system. This means:
- Regional portals have MORE detail than national ISIL registry
- Must harvest state-by-state to get complete coverage
- Archivportal-D aggregates but doesn't replace regional portals

### 2. Arcinsys Advantage

**4 states share Arcinsys** (Hessen, Niedersachsen, Bremen, Schleswig-Holstein) through 3 portals, since Bremen is served by the Niedersachsen instance:
- Represents 25% of German states (4 of 16)
- Expected ~600+ archives total
- A single scraper can harvest all 3 portals
- Consistent data structure = easier extraction

### 3. NRW Pattern Replicable

The NRW harvest pattern (fast text extraction without clicking) works well for:
- Drupal-based portals
- Button/link-based archive listings
- JavaScript-rendered pages

**Reuse this approach** for Thüringen, Arcinsys portals, Baden-Württemberg

### 4. Duplicate Rate Validation

NRW showed an **80.7% duplicate rate** (356/441) against existing ISIL+DDB data:
- Suggests the existing data sources are already fairly comprehensive
- Similar rates are plausible for other states
- ~20% new archives per state is a realistic expectation
- The projection table under "Expected German Dataset Growth" assumes the opposite split (20% duplicates); at the NRW rate, the 1,599 expected archives would yield roughly 310 net new institutions rather than ~1,280

---

## Comparison to NRW Harvest

| Metric | NRW | Expected (All Regional Portals) |
|--------|-----|----------------------------------|
| **Archives Harvested** | 441 | 1,599 |
| **Duplicates (%)** | 80.7% | ~20% (assumed in projection) |
| **Net New** | 85 | ~1,280 |
| **Cities Covered** | 356 | ~800 |
| **Geocoded (%)** | 83.7% | ~85% (target) |
| **Harvest Time** | 9.3 seconds | ~5 hours (estimated) |

---

## Conclusion

Germany has a **rich ecosystem of regional archive portals** beyond archive.nrw.de. Harvesting these portals could add **~1,280 new institutions** to the German dataset, bringing the total from 20,846 → ~22,100.

**Priority targets**:
1. **Thüringen** (149 confirmed) - Quick win ⭐
2. **Arcinsys Consortium** (600+ estimated) - High impact ⭐
3. **Baden-Württemberg** (200+ estimated) - High impact ⭐

**Impact**: +1.3 percentage points toward Phase 1 goal (39.7% → 41.0%)

---

**Next Recommended Action**: Start with Thüringen harvest (149 archives, simple portal structure)

---

## References

### Portal URLs

- **Archivportal-D**: https://www.archivportal-d.de/
- **NRW**: https://www.archive.nrw.de/archivsuche ✅ Harvested
- **Thüringen**: https://www.archive-in-thueringen.de/
- **Niedersachsen & Bremen**: https://arcinsys.niedersachsen.de/
- **Schleswig-Holstein**: https://arcinsys.schleswig-holstein.de/
- **Hessen**: https://arcinsys.hessen.de/
- **Baden-Württemberg**: https://www.landesarchiv-bw.de/
- **Bayern**: https://www.gda.bayern.de/
- **Brandenburg**: https://blha.brandenburg.de/
- **Sachsen**: https://www.staatsarchiv.sachsen.de/
- **Sachsen-Anhalt**: https://landesarchiv.sachsen-anhalt.de/

### Documentation

- **NRW Harvest**: `NRW_HARVEST_COMPLETE_20251119.md`
- **NRW Merge**: `SESSION_SUMMARY_20251119_NRW_MERGE_COMPLETE.md`
- **Quick Status**: `QUICK_STATUS_20251119_POST_NRW.md`

---

**Report Generated**: 2025-11-19 22:30 UTC
**Research Method**: Exa deep web search (30 queries)
**Status**: Ready for harvest implementation