Persistent Identifiers for Heritage Institutions
Overview
The GLAM Data Extraction project uses multiple identifier formats optimized for different purposes:
Persistent Identifiers (Deterministic)
These can be regenerated from the GHCID string and are stable across systems:
| Format | Bits | Algorithm | Use Case | Status |
|---|---|---|---|---|
| UUID v5 | 128 | SHA-1 | PRIMARY - Europeana, DPLA, IIIF, Wikidata | RFC 4122 Standard |
| UUID SHA-256 | 128 | SHA-256 | SOTA - Security compliance, future-proofing | RFC 9562 (UUID v8) |
| Numeric | 64 | SHA-256 | CSV exports, numeric analysis | Internal |
| Human-readable | Variable | ISO format | Citations, documentation | ISO-based |
Database Record Identifiers (Non-Deterministic)
These are generated once per record and optimize database performance:
| Format | Bits | Algorithm | Use Case | Status |
|---|---|---|---|---|
| UUID v7 | 128 | Timestamp + Random | Database PKs, time-ordered queries | RFC 9562 Standard |
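The UUID v7 layout in the table above (a 48-bit Unix millisecond timestamp followed by random bits) can be sketched by hand. This is a minimal illustration, not the project's implementation: the function name `uuid7_sketch` is hypothetical, and recent Python releases (3.14+) ship a built-in `uuid.uuid7()` that should be preferred where available.

```python
import os
import time
import uuid


def uuid7_sketch() -> uuid.UUID:
    """Build a UUID v7-style value: 48-bit Unix ms timestamp, then random bits."""
    unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    # Timestamp occupies the top 48 bits, random data the lower 80 bits
    value = ((unix_ms & ((1 << 48) - 1)) << 80) | rand
    # Overwrite the version nibble (bits 76-79) with 7
    value = (value & ~(0xF << 76)) | (0x7 << 76)
    # Overwrite the variant bits (bits 62-63) with 0b10
    value = (value & ~(0x3 << 62)) | (0x2 << 62)
    return uuid.UUID(int=value)


record_id = uuid7_sketch()
print(record_id, record_id.version)
```

Because the timestamp leads, values generated in later milliseconds sort after earlier ones, which is what makes this format index-friendly. Unlike the deterministic formats, it cannot be regenerated from the GHCID string.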
Why Four Formats?
1. UUID v5 (SHA-1) - Interoperability Standard ⭐ PRIMARY
Format: 550e8400-e29b-41d4-a716-446655440000 (illustrative placeholder; a real UUID v5 carries 5 as the first digit of the third group)
Version: 5 (name-based, SHA-1)
Standard: RFC 4122 (2005)
✅ Strengths:
- RFC 4122 compliant - Universal library support
- Deterministic - Same GHCID → Same UUID always (content-addressed)
- Transparent - Publicly documented algorithm, anyone can verify
- Interoperable - Works with Europeana, DPLA, IIIF, Wikidata
- 128-bit collision resistance - P(collision) ≈ 1.5×10^-27 for 1M institutions
⚠️ SHA-1 Nuance:
- Uses SHA-1 internally (RFC 4122 specification)
- SHA-1 deprecated for cryptographic security (digital signatures, TLS, passwords)
- SHA-1 appropriate for identifier generation (non-adversarial, collision-resistant)
- See Why GHCID Uses UUID v5 and SHA-1 for detailed rationale
Why SHA-1 is Safe for GHCID:
Cryptographic Use (Vulnerable):
- Adversarial context (attacker forges signatures)
- Two-message collision attack
- Security-critical (financial, authentication)
Identifier Use (Safe):
- Non-adversarial context (no one forges museum IDs)
- Single-source generation (we control inputs)
- Uniqueness requirement (birthday paradox protection sufficient)
Use When:
- Primary identifier for all GHCID records
- Integrating with existing UUID v5 systems
- Exporting to Europeana, DPLA, IIIF
- Storing in Wikidata as external identifier
- RFC 4122 strict compliance required
- Maximum transparency required (anyone can verify)
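A minimal sketch of the deterministic property, using Python's standard `uuid.uuid5`. The namespace below is an assumption made for illustration only: the real GHCID namespace UUID is not stated in this section, and `ghcid_to_uuid5` is a hypothetical helper name.

```python
import uuid

# ASSUMPTION: placeholder namespace derived from an example domain.
# The project's actual namespace UUID must be used for real identifiers.
GHCID_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "ghcid.example.org")


def ghcid_to_uuid5(ghcid: str) -> uuid.UUID:
    """Derive a name-based (SHA-1) UUID from a GHCID string."""
    return uuid.uuid5(GHCID_NAMESPACE, ghcid)


first = ghcid_to_uuid5("NL-NH-AMS-M-RM")
second = ghcid_to_uuid5("NL-NH-AMS-M-RM")
print(first == second, first.version)
```

Anyone who knows the namespace and the GHCID string can recompute and verify the UUID, which is the transparency property listed above.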
2. UUID SHA-256 (Custom) - SOTA Cryptographic Strength
Format: a1b2c3d4-e5f6-8a1b-9c2d-3e4f5a6b7c8d
Version: 8 (custom/experimental)
Algorithm: SHA-256 (truncated to 128 bits)
✅ Strengths:
- SHA-256 - NIST-approved, SOTA cryptographic hash (2024)
- Superior collision resistance vs SHA-1
- Future-proof - No known practical attacks
- UUID-compatible - Valid UUID format, works with UUID parsers
⚠️ Nuances:
- Not RFC 4122 standard - Custom implementation
- UUID v8 is "experimental/vendor-specific" designation
- May not be recognized by strict UUID v5-only systems
Use When:
- Security policy mandates SHA-256
- Maximum collision resistance required
- Future-proofing against SHA-1 deprecation
- Custom identifier resolution service
Algorithm:
- Hash GHCID string with SHA-256 → 256 bits
- Truncate to first 128 bits (16 bytes)
- Set version bits to 8 (custom)
- Set variant bits to RFC 4122 (0b10xxxxxx)
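The four steps above can be sketched as follows. This assumes the GHCID string is hashed as UTF-8 with no namespace prefix, which this section does not confirm; `ghcid_to_uuid_sha256` is a hypothetical name.

```python
import hashlib
import uuid


def ghcid_to_uuid_sha256(ghcid: str) -> uuid.UUID:
    """SHA-256 the GHCID, keep 16 bytes, then stamp version 8 and the RFC variant."""
    digest = hashlib.sha256(ghcid.encode("utf-8")).digest()  # 32 bytes
    raw = bytearray(digest[:16])                             # truncate to 128 bits
    raw[6] = (raw[6] & 0x0F) | 0x80                          # version nibble = 8
    raw[8] = (raw[8] & 0x3F) | 0x80                          # variant bits = 0b10
    return uuid.UUID(bytes=bytes(raw))


value = ghcid_to_uuid_sha256("US-CA-SAN-A-IA")
print(value, value.version)
```

Setting the version and variant overwrites 6 of the 128 bits, so 122 bits of the hash survive in the final identifier.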
3. Numeric (64-bit) - Database Optimization
Format: 213324328442227739
Algorithm: SHA-256 → first 8 bytes → uint64
Range: 0 to 18,446,744,073,709,551,615
✅ Strengths:
- Compact - 8 bytes; fits SQL BIGINT directly only when the value is below 2^63, since BIGINT is signed in most databases (e.g. PostgreSQL)
- Fast indexing - Integer comparisons faster than UUID
- CSV-friendly - No special characters
- Deterministic - Same GHCID → Same number
⚠️ Nuances:
- 64-bit truncation reduces collision resistance vs full 256-bit
- P(collision) ≈ 2.7×10^-8 for 1M institutions (0.0000027%)
- Still negligible for heritage domain (<10M institutions expected)
Use When:
- Database primary key optimization
- CSV exports for spreadsheet analysis
- Numeric sorting required
- Systems without UUID support
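A short sketch of the numeric derivation. The big-endian byte order and the helper names are assumptions; the section only says "first 8 bytes → uint64". The second function illustrates one way to store values at or above 2^63 in a signed 64-bit column.

```python
import hashlib


def ghcid_to_numeric(ghcid: str) -> int:
    """First 8 bytes of SHA-256, read as an unsigned 64-bit integer."""
    digest = hashlib.sha256(ghcid.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], byteorder="big")  # ASSUMPTION: big-endian


def to_signed_bigint(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    return value - (1 << 64) if value >= (1 << 63) else value


numeric = ghcid_to_numeric("US-CA-SAN-A-IA")
print(numeric, to_signed_bigint(numeric))
```

The signed mapping is reversible (add 2^64 to negative values), so no information is lost, but numeric sort order then differs from the unsigned order.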
4. Human-Readable (ISO-based) - Citations & References
Format: US-CA-SAN-A-IA
Components: {Country}-{Region}-{City}-{Type}-{Abbreviation}
Example: NL-NH-AMS-M-RM (Rijksmuseum Amsterdam)
✅ Strengths:
- Human-readable - Understandable without lookup
- Geographic context - Location embedded in ID
- Type indicator - Institution type visible
- Citable - Use in academic papers, documentation
⚠️ Nuances:
- Not persistent if institution relocates or changes name
- Use the ghcid_original field (frozen) for true persistence
- The ghcid field (current) may change over time
Use When:
- Academic citations
- Documentation and reports
- Human-readable data exchange
- Debugging and logging
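Because the components are hyphen-separated, a human-readable GHCID can be split back into its parts. The pattern below is a sketch inferred from the examples in this document (two-letter country, three-character city code, single-letter type, optional lowercase name suffix); the authoritative grammar lives in the GHCID specification, and `parse_ghcid` is a hypothetical helper.

```python
import re
from typing import NamedTuple, Optional


class ParsedGHCID(NamedTuple):
    country: str
    region: str
    city: str
    institution_type: str
    abbreviation: str
    name_suffix: Optional[str]


# ASSUMPTION: character sets and lengths are inferred from the examples only.
GHCID_PATTERN = re.compile(
    r"^(?P<country>[A-Z]{2})-(?P<region>[A-Z0-9]+)-(?P<city>[A-Z0-9]{3})"
    r"-(?P<type>[A-Z])-(?P<abbr>[A-Z0-9]+)(?:-(?P<suffix>[a-z0-9_]+))?$"
)


def parse_ghcid(ghcid: str) -> ParsedGHCID:
    """Split a GHCID string into components, or raise ValueError."""
    match = GHCID_PATTERN.match(ghcid)
    if match is None:
        raise ValueError(f"Not a recognisable GHCID: {ghcid!r}")
    return ParsedGHCID(
        match["country"], match["region"], match["city"],
        match["type"], match["abbr"], match["suffix"],
    )


print(parse_ghcid("NL-NH-AMS-M-RM"))
```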
Collision Resistance Comparison
Mathematical Analysis
# Collision probability (birthday-bound approximation):
# P(collision) ≈ n² / (2 × 2^bits)
# For 1,000,000 institutions:
# UUID v5 / UUID SHA-256 (128-bit):
P = (10^6)² / (2 × 2^128) ≈ 1.5 × 10^-27
# Effectively zero, even with only 122 effective bits after the 6 fixed version/variant bits (≈ 9.4 × 10^-26)
# Numeric (64-bit):
P = (10^6)² / (2 × 2^64) ≈ 2.7 × 10^-8 (0.0000027%)
# Negligible for heritage domain
# Even at 10 million institutions:
P_64bit = (10^7)² / (2 × 2^64) ≈ 2.7 × 10^-6 (0.00027%)
# Still acceptable
Real-World Context
| Institution Count | UUID v5/SHA-256 | Numeric (64-bit) | Assessment |
|---|---|---|---|
| 100,000 | ~0% | 2.7×10^-10 (0.000000027%) | ✅ All safe |
| 1,000,000 | ~0% | 2.7×10^-8 (0.0000027%) | ✅ All safe |
| 10,000,000 | ~0% | 2.7×10^-6 (0.00027%) | ✅ UUID safe, numeric acceptable |
| 100,000,000 | ~0% | 2.7×10^-4 (0.027%) | ⚠️ Use UUID, numeric risky |
Conclusion: For the heritage domain (expected <10M institutions worldwide), all formats provide sufficient collision resistance.
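The birthday-bound approximation used in this section is easy to evaluate directly, which helps when re-checking the figures for other institution counts. The function names are illustrative; the second function uses the standard `1 - exp(-n(n-1)/(2N))` form for comparison.

```python
import math


def collision_probability(n: int, bits: int) -> float:
    """Approximation from this section: n^2 / (2 * 2^bits)."""
    return n**2 / (2 * 2**bits)


def collision_probability_exp(n: int, bits: int) -> float:
    """Tighter form 1 - exp(-n(n-1) / (2 * 2^bits)), stable for tiny values."""
    return -math.expm1(-n * (n - 1) / (2 * 2**bits))


for count in (10**5, 10**6, 10**7, 10**8):
    print(
        f"{count:>11,}  64-bit: {collision_probability(count, 64):.2e}"
        f"  128-bit: {collision_probability(count, 128):.2e}"
    )
```

For the UUID formats, passing `bits=122` instead of 128 accounts for the six bits fixed by the version and variant fields.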
Historical Collision Resolution
The Rule: Temporal Priority Determines Disambiguation
When creating GHCIDs, collisions can occur in two temporal contexts:
- First Batch Creation (initial PID assignment): Multiple institutions discovered simultaneously
- Historical Addition (post-publication): New historical institution added after existing GHCID published
Critical Design Decision: The collision resolution strategy differs based on temporal context to preserve PID stability.
Collision Resolution: Native Language Name Suffix
Key Change: Collisions are resolved by appending the full legal name in native language in snake_case format, NOT Wikidata Q-numbers.
Name Suffix Rules:
- Use the institution's full official name in its native language
- Convert to snake_case (lowercase, underscores for spaces)
- Remove apostrophes, accents, commas, and other punctuation/diacritics
- Transliterate non-Latin scripts to ASCII (e.g., Pinyin for Chinese)
Name Normalization Examples:
"Stedelijk Museum Amsterdam" → "stedelijk_museum_amsterdam"
"Musée d'Orsay" → "musee_dorsay"
"Biblioteca Nacional do Brasil" → "biblioteca_nacional_do_brasil"
"北京故宫博物院" → "beijing_gugong_bowuyuan" (pinyin transliteration)
"Österreichische Nationalbibliothek" → "osterreichische_nationalbibliothek"
First Batch Behavior (Initial PID Creation)
Scenario: During initial GHCID generation, multiple institutions with identical base GHCIDs are discovered together.
Resolution: ALL colliding institutions get name suffixes appended.
Example:
# Discovery: Two museums in Amsterdam both generate NL-NH-AMS-M-SM
# Stedelijk Museum (founded 1874)
ghcid_original: NL-NH-AMS-M-SM-stedelijk_museum_amsterdam
# Science Museum Amsterdam (founded 2010)
ghcid_original: NL-NH-AMS-M-SM-science_museum_amsterdam
Rationale: No existing PIDs to preserve; both institutions are "new" to the system.
Historical Addition Behavior (Post-Publication)
Scenario: After initial GHCID batch is published, a historical institution is added that collides with an existing GHCID.
Resolution: ONLY the newly added historical institution gets a name suffix. The existing PID remains unchanged.
Example:
# Existing GHCID (published 2025-11-01)
ghcid_original: NL-NH-AMS-M-HM # Hermitage Museum Amsterdam (2009-2023)
# Historical institution added later (2025-11-15)
# Amsterdam Historical Museum (1926-1975)
# Would also generate: NL-NH-AMS-M-HM
#
# COLLISION DETECTED → Add name suffix to NEW addition ONLY
ghcid_original: NL-NH-AMS-M-HM-amsterdam_historical_museum
Outcome:
- NL-NH-AMS-M-HM (Hermitage Museum Amsterdam) → UNCHANGED
- NL-NH-AMS-M-HM-amsterdam_historical_museum (Amsterdam Historical Museum) → Name suffix added
Rationale: Preserve stability of already-published PIDs.
Why This Matters: PID Stability Principle
Problem: Changing existing GHCIDs breaks external references.
PIDs may already be:
- Cited in academic publications
- Referenced in datasets and APIs
- Stored in institutional databases
- Embedded in IIIF manifests
- Linked from Wikidata
Principle: "Cool URIs don't change" (Tim Berners-Lee, W3C)
Once a GHCID is published (in first batch or as standalone record), it should NEVER change, even if new historical institutions create collisions.
Decision Table: Who Gets Name Suffix?
| Scenario | When | Existing GHCID | New GHCID | Who Gets Name Suffix | Rationale |
|---|---|---|---|---|---|
| First Batch | Initial PID creation (2025-11-01) | None (first time) | NL-NH-AMS-M-SM (2 institutions) | ALL colliding institutions | No existing PIDs to preserve |
| Historical Addition | Post-publication (2025-11-15) | NL-NH-AMS-M-HM (published) | NL-NH-AMS-M-HM (historical) | ONLY newly added institution | Preserve published PID stability |
| Standalone Addition | New institution (2026-01-01) | NL-NH-AMS-M-XY (published) | NL-NH-AMS-M-XY (new contemporary) | ONLY newly added institution | Preserve existing PID |
Implementation Guidance
Name Suffix Generation:
import re
import unicodedata


def generate_name_suffix(native_name: str) -> str:
    """Convert native language institution name to snake_case suffix.

    Non-Latin scripts must be transliterated before calling this function;
    characters outside a-z and 0-9 are dropped, not transliterated.

    Examples:
        "Stedelijk Museum Amsterdam" → "stedelijk_museum_amsterdam"
        "Musée d'Orsay" → "musee_dorsay"
        "Österreichische Nationalbibliothek" → "osterreichische_nationalbibliothek"
    """
    # Normalize unicode (NFD decomposition) and remove diacritics
    normalized = unicodedata.normalize('NFD', native_name)
    ascii_name = ''.join(c for c in normalized if unicodedata.category(c) != 'Mn')
    # Convert to lowercase
    lowercase = ascii_name.lower()
    # Remove apostrophes, commas, and other punctuation
    no_punct = re.sub(r"['’‘`\",.:;!?()\[\]{}]", '', lowercase)
    # Replace spaces and hyphens with underscores
    underscored = re.sub(r'[\s\-]+', '_', no_punct)
    # Remove any remaining non-alphanumeric characters (except underscores)
    clean = re.sub(r'[^a-z0-9_]', '', underscored)
    # Collapse multiple underscores
    final = re.sub(r'_+', '_', clean).strip('_')
    return final
Collision Detection Logic:
from typing import Set


def resolve_collision(new_ghcid: str, new_name: str, existing_ghcids: Set[str]) -> str:
    """
    Resolve GHCID collision based on temporal context.

    Args:
        new_ghcid: Base GHCID for new institution
        new_name: Native language name of the institution
        existing_ghcids: Set of already-published GHCIDs

    Returns:
        Final GHCID (with name suffix if needed)
    """
    if new_ghcid in existing_ghcids:
        # COLLISION DETECTED: New institution collides with existing
        # Resolution: Add name suffix to NEW institution ONLY
        name_suffix = generate_name_suffix(new_name)
        return f"{new_ghcid}-{name_suffix}"
    else:
        # No collision: Use base GHCID
        return new_ghcid
First Batch Processing (different logic):
from collections import defaultdict
from typing import List

# Institution, GHCIDRecord, generate_base_ghcid and create_record are
# defined elsewhere in the project.


def process_first_batch(institutions: List[Institution]) -> List[GHCIDRecord]:
    """
    Process initial batch of institutions.

    For first batch, ALL collisions get name suffixes appended.
    """
    # Group by base GHCID
    ghcid_groups = defaultdict(list)
    for inst in institutions:
        base_ghcid = generate_base_ghcid(inst)
        ghcid_groups[base_ghcid].append(inst)

    records = []
    for base_ghcid, group in ghcid_groups.items():
        if len(group) == 1:
            # No collision: Use base GHCID
            records.append(create_record(group[0], base_ghcid))
        else:
            # COLLISION: ALL institutions get name suffixes
            for inst in group:
                name_suffix = generate_name_suffix(inst.name)
                ghcid = f"{base_ghcid}-{name_suffix}"
                records.append(create_record(inst, ghcid))
    return records
Edge Cases
Case 1: Multiple historical institutions added simultaneously
If multiple historical institutions are added together (same date) and collide with an existing GHCID:
# Existing (published 2025-11-01)
ghcid: NL-NH-AMS-M-XY
# Both added 2025-11-15
# Historical Institution A: "Amsterdam Art Archive"
ghcid: NL-NH-AMS-M-XY-amsterdam_art_archive
# Historical Institution B: "Amsterdam Archaeology Museum"
ghcid: NL-NH-AMS-M-XY-amsterdam_archaeology_museum
Resolution: ALL newly added institutions get name suffixes (treat as mini-batch).
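One way to express the mini-batch rule is sketched below. It takes pre-computed name suffixes and never modifies the published set. Treating additions that collide only with each other as a collision is an assumption carried over from the first-batch rule, and `resolve_additions` is a hypothetical name.

```python
from collections import Counter
from typing import List, Set, Tuple


def resolve_additions(
    additions: List[Tuple[str, str]], published: Set[str]
) -> List[str]:
    """Return final GHCIDs for (base_ghcid, name_suffix) pairs added together."""
    base_counts = Counter(base for base, _ in additions)
    taken = set(published)  # copy: published PIDs are never modified
    final = []
    for base, suffix in additions:
        # Collides with a published PID, or with another addition in this batch
        collides = base in published or base_counts[base] > 1
        ghcid = f"{base}-{suffix}" if collides else base
        if ghcid in taken:
            raise ValueError(f"Name suffix does not disambiguate: {ghcid}")
        taken.add(ghcid)
        final.append(ghcid)
    return final


published = {"NL-NH-AMS-M-XY"}
batch = [
    ("NL-NH-AMS-M-XY", "amsterdam_art_archive"),
    ("NL-NH-AMS-M-XY", "amsterdam_archaeology_museum"),
]
print(resolve_additions(batch, published))
```

The explicit error covers the case where two institutions share both base GHCID and normalized name, which the name suffix alone cannot separate.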
Case 2: Existing GHCID already has name suffix
If the existing GHCID already has a name suffix (from a first batch collision), the new historical addition gets a different name suffix:
# Existing (from first batch with collision)
ghcid: NL-NH-AMS-M-SM-stedelijk_museum_amsterdam
# Historical addition (2025-11-15)
ghcid: NL-NH-AMS-M-SM-stadsmuseum_amsterdam # Different name suffix
No ambiguity: Each institution has unique name suffix derived from its native language name.
Case 3: Non-Latin script names
For institutions with non-Latin script names, transliterate to ASCII:
# Chinese institution: 北京故宫博物院 (Palace Museum Beijing)
ghcid: CN-BJ-BEI-M-PM-beijing_gugong_bowuyuan
# Japanese institution: 東京国立博物館 (Tokyo National Museum)
ghcid: JP-TK-TOK-M-TN-tokyo_kokuritsu_hakubutsukan
# Arabic institution: المتحف المصري (Egyptian Museum)
ghcid: EG-CA-CAI-M-EM-al_mathaf_al_masri
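Diacritic stripping alone does not transliterate: a name written entirely in a non-Latin script would normalize to an empty suffix. The sketch below guards against that by requiring a caller-supplied romanization when non-ASCII letters remain. The helper names are hypothetical and the romanized strings are provided by the caller, not generated.

```python
import re
import unicodedata
from typing import Optional


def _strip_marks(name: str) -> str:
    """NFD-decompose and drop combining marks (accents, umlauts)."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


def needs_romanization(name: str) -> bool:
    """True if letters outside ASCII remain after removing diacritics."""
    return any(c.isalpha() and not c.isascii() for c in _strip_marks(name))


def suffix_for(native_name: str, romanized: Optional[str] = None) -> str:
    """Build a snake_case suffix, insisting on a romanization when required."""
    source = native_name
    if needs_romanization(native_name):
        if romanized is None:
            raise ValueError(f"Romanized form required for {native_name!r}")
        source = romanized
    cleaned = re.sub(r"[^a-z0-9\s\-]", "", _strip_marks(source).lower())
    return re.sub(r"[\s\-]+", "_", cleaned).strip("_")


print(suffix_for("北京故宫博物院", romanized="Beijing Gugong Bowuyuan"))
print(suffix_for("Musée d'Orsay"))
```

Letters such as "ø" that have no combining-mark decomposition also trigger the guard, so they are not dropped silently.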
Testing Strategy
Test 1: First Batch Collision
def test_first_batch_collision():
    """Verify ALL institutions in first batch get name suffixes"""
    institutions = [
        Institution("Stedelijk Museum Amsterdam", type="M", city="AMS"),
        Institution("Science Museum Amsterdam", type="M", city="AMS")
    ]
    records = process_first_batch(institutions)
    # Both should have name suffixes
    assert records[0].ghcid == "NL-NH-AMS-M-SM-stedelijk_museum_amsterdam"
    assert records[1].ghcid == "NL-NH-AMS-M-SM-science_museum_amsterdam"
Test 2: Historical Addition Collision
def test_historical_addition_preserves_existing():
    """Verify existing GHCID unchanged when historical added"""
    # Existing GHCID (published)
    existing_ghcids = {"NL-NH-AMS-M-HM"}
    # Add historical institution
    historical = Institution(
        name="Amsterdam Historical Museum",
        type="M",
        city="AMS",
        temporal_extent={"start": "1926", "end": "1975"}
    )
    new_ghcid = resolve_collision(
        generate_base_ghcid(historical),
        historical.name,
        existing_ghcids
    )
    # New historical gets name suffix
    assert new_ghcid == "NL-NH-AMS-M-HM-amsterdam_historical_museum"
    # Existing GHCID is not part of the update: the published set is unchanged
    assert existing_ghcids == {"NL-NH-AMS-M-HM"}
Test 3: Name Suffix Generation
def test_name_suffix_generation():
    """Verify name suffix normalization"""
    assert generate_name_suffix("Musée d'Orsay") == "musee_dorsay"
    assert generate_name_suffix("Österreichische Nationalbibliothek") == "osterreichische_nationalbibliothek"
    assert generate_name_suffix("Biblioteca Nacional do Brasil") == "biblioteca_nacional_do_brasil"
    assert generate_name_suffix("Royal Museum, London") == "royal_museum_london"
Documentation References
- Collision Resolution: docs/plan/global_glam/07-ghcid-collision-resolution.md
- GHCID Specification: docs/GHCID_PID_SCHEME.md
- Implementation: src/glam_extractor/identifiers/ghcid.py
- Schema: schemas/provenance.yaml (GHCIDHistoryEntry)
- Abbreviation Special Characters: .opencode/ABBREVIATION_SPECIAL_CHAR_RULE.md (characters to exclude from abbreviations)
SHA-1 vs SHA-256: The Nuance
Why UUID v5 Uses SHA-1
RFC 4122 (2005) standardized UUID v5 with SHA-1 because:
- SHA-1 was considered secure in 2005
- 128-bit UUID space provides collision resistance even with SHA-1
- Purpose is identifier generation, not security/authentication
SHA-1 Cryptographic Weakness
SHA-1 collision attacks (2017):
- Google/CWI demonstrated practical SHA-1 collision
- Two different inputs producing same hash
- Critical for digital signatures (authentication, certificates)
- Less critical for identifiers (birthday paradox protection sufficient)
When SHA-1 Is Problematic
- ❌ Digital signatures - Attacker can forge documents
- ❌ Certificate authorities - SSL/TLS security compromised
- ❌ Password hashing - Too fast to resist brute force (true of any general-purpose hash, independent of collision attacks)
- ❌ Blockchain - Consensus security at risk
When SHA-1 Is Acceptable
- ✅ UUID generation - Collision resistance adequate for identifier space
- ✅ Git commits - Linus Torvalds has argued that SHA-1 remains adequate for Git's use case
- ✅ Non-adversarial contexts - No attacker trying to cause collisions
Recommended Usage Strategy
Default: Dual UUID Approach
Store both UUID formats for maximum flexibility:
# Example YAML record
- id: 550e8400-e29b-41d4-a716-446655440000 # Use UUID v5 as primary ID
  name: Internet Archive
  institution_type: ARCHIVE
  ghcid: US-CA-SAN-A-IA
  ghcid_uuid: 550e8400-e29b-41d4-a716-446655440000 # UUID v5 (SHA-1)
  ghcid_uuid_sha256: a1b2c3d4-e5f6-8a1b-9c2d-3e4f5a6b7c8d # UUID SHA-256
  ghcid_numeric: 213324328442227739 # Numeric (64-bit)
  identifiers:
    - identifier_scheme: GHCID
      identifier_value: US-CA-SAN-A-IA
    - identifier_scheme: GHCID_UUID_V5
      identifier_value: 550e8400-e29b-41d4-a716-446655440000
    - identifier_scheme: GHCID_UUID_SHA256
      identifier_value: a1b2c3d4-e5f6-8a1b-9c2d-3e4f5a6b7c8d
    - identifier_scheme: GHCID_NUMERIC
      identifier_value: 213324328442227739
Use Case Decision Tree
Need to integrate with existing systems?
├─ YES → Use UUID v5 (to_uuid())
│ - Europeana, DPLA, IIIF, Wikidata
│ - RFC 4122 compliance required
│
└─ NO → Building custom system?
├─ Security policy mandates SHA-256?
│ ├─ YES → Use UUID SHA-256 (to_uuid_sha256())
│ └─ NO → Use UUID v5 for standard compliance
│
└─ Database optimization critical?
├─ YES → Use Numeric (to_numeric()) as PK
│ - Store UUID v5 as alternate key
└─ NO → Use UUID v5 as primary identifier
Code Examples
Generate All Four Formats
from glam_extractor.identifiers.ghcid import GHCIDComponents
# Create GHCID components
components = GHCIDComponents(
    country_code="US",
    region_code="CA",
    city_locode="SAN",
    institution_type="A",
    abbreviation="IA"
)
# Generate all formats
uuid_v5 = components.to_uuid() # UUID v5 (SHA-1)
uuid_sha256 = components.to_uuid_sha256() # UUID SHA-256
numeric = components.to_numeric() # Numeric (64-bit)
human = components.to_string() # Human-readable
print(f"UUID v5: {uuid_v5}")
print(f"UUID SHA-256: {uuid_sha256}")
print(f"Numeric: {numeric}")
print(f"Human: {human}")
# Output:
# UUID v5: 550e8400-e29b-41d4-a716-446655440000
# UUID SHA-256: a1b2c3d4-e5f6-8a1b-9c2d-3e4f5a6b7c8d
# Numeric: 213324328442227739
# Human: US-CA-SAN-A-IA
Verify Determinism
# Same input always produces same output
comp1 = GHCIDComponents("NL", "NH", "AMS", "M", "RM")
comp2 = GHCIDComponents("NL", "NH", "AMS", "M", "RM")
assert comp1.to_uuid() == comp2.to_uuid()
assert comp1.to_uuid_sha256() == comp2.to_uuid_sha256()
assert comp1.to_numeric() == comp2.to_numeric()
assert comp1.to_string() == comp2.to_string()
Export to Different Formats
# RDF/JSON-LD (use UUID v5)
rdf_id = f"urn:uuid:{components.to_uuid()}"
# → "urn:uuid:550e8400-e29b-41d4-a716-446655440000"
# IIIF Manifest (use UUID v5)
iiif_id = f"https://iiif.example.org/manifests/{components.to_uuid()}/manifest.json"
# Database (use numeric PK)
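sql = "INSERT INTO institutions (id, name) VALUES (%s, %s)"
params = (components.to_numeric(), "Internet Archive")  # pass both to cursor.execute(sql, params); placeholder syntax depends on the driver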
# Citation (use human-readable)
citation = f"See Internet Archive ({components.to_string()}) for digital collections."
Future-Proofing Strategy
Timeline Projections
| Year | SHA-1 Status | UUID v5 Status | Recommendation |
|---|---|---|---|
| 2024 | Weak for security, OK for IDs | Standard, widely supported | ✅ Use UUID v5 as primary |
| 2030 | Likely deprecated for security | Still standard for IDs | ✅ Dual UUID (v5 + SHA-256) |
| 2040 | Possibly deprecated entirely | May be superseded | ⚠️ Migrate to UUID SHA-256 |
Migration Path
If SHA-1 is fully deprecated:
- Phase 1 (Now): Store both UUID v5 and UUID SHA-256
- Phase 2 (2030): Make UUID SHA-256 primary, keep v5 as alias
- Phase 3 (2040): Deprecate UUID v5, use SHA-256 exclusively
Critical: Because both are deterministic, you can always regenerate from GHCID string without breaking references.
Governance & Resolution
Identifier Persistence Requirements
Technical generation is only half the solution. True persistence requires:
1. Resolution Service
https://id.heritage.example.org/uuid/{uuid}
https://id.heritage.example.org/numeric/{numeric}
https://id.heritage.example.org/ghcid/{ghcid}
All three should resolve to the same institutional record.
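A minimal in-memory sketch of that behaviour: three lookup keys, one record. The class and field names are hypothetical, the values are the placeholder identifiers used elsewhere in this document, and a real service would sit on top of a persistent registry.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RegistryEntry:
    ghcid: str
    uuid_v5: str
    numeric: int
    institution_name: str


class Resolver:
    """Maps 'uuid/...', 'numeric/...' and 'ghcid/...' paths to one entry."""

    def __init__(self) -> None:
        self._by_path: Dict[str, RegistryEntry] = {}

    def register(self, entry: RegistryEntry) -> None:
        paths = (
            f"uuid/{entry.uuid_v5}",
            f"numeric/{entry.numeric}",
            f"ghcid/{entry.ghcid}",
        )
        # Check all paths first so a clash never leaves a partial registration
        for path in paths:
            if path in self._by_path:
                raise ValueError(f"Identifier already registered: {path}")
        for path in paths:
            self._by_path[path] = entry

    def resolve(self, path: str) -> Optional[RegistryEntry]:
        return self._by_path.get(path.strip("/"))


resolver = Resolver()
resolver.register(RegistryEntry(
    ghcid="US-CA-SAN-A-IA",
    uuid_v5="550e8400-e29b-41d4-a716-446655440000",
    numeric=213324328442227739,
    institution_name="Internet Archive",
))
print(resolver.resolve("/ghcid/US-CA-SAN-A-IA"))
```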
2. Mapping Database
CREATE TABLE ghcid_registry (
uuid_v5 UUID PRIMARY KEY,
uuid_sha256 UUID NOT NULL,
numeric BIGINT NOT NULL,
ghcid VARCHAR(100) NOT NULL,
ghcid_original VARCHAR(100) NOT NULL, -- Frozen
institution_name TEXT NOT NULL,
last_updated TIMESTAMP,
UNIQUE(uuid_sha256),
UNIQUE(numeric),
UNIQUE(ghcid_original)
);
3. Organizational Commitment
- Maintain resolution service for decades
- Fund infrastructure for long-term operation
- Establish governance policies for ID assignment
- Handle institution mergers/closures/relocations
4. Community Standards
- Coordinate with ISIL, Wikidata, GeoNames
- Publish GHCID specification as RFC or W3C note
- Engage with Europeana, DPLA, IIIF communities
- Establish dispute resolution process
Comparison with Existing PID Systems
| System | Format | Governance | Resolution | Adoption |
|---|---|---|---|---|
| DOI | 10.xxxx/yyyy | IDF (non-profit) | doi.org | High (scholarly) |
| ARK | ark:/nnnnn/xxx | CDL (California) | n2t.net | Medium (archives) |
| Handle | hdl:xxxx/yyyy | CNRI (non-profit) | handle.net | Medium (repositories) |
| GHCID | UUID v5 | TBD | TBD | None (new) |
Lesson: Technical mechanism is necessary but not sufficient. Governance and organizational commitment are critical.
Recommendations
For This Project (2024-2025)
- ✅ Implement dual UUID generation (v5 + SHA-256)
- ✅ Store all four identifier formats in data model
- ✅ Use UUID v5 as primary ID for current interoperability
- ✅ Document SHA-1 nuance clearly
- ⏳ Build resolution service prototype
- ⏳ Engage with Europeana/DPLA for feedback
- ⏳ Draft GHCID specification for community review
For Production Deployment
- ⏳ Establish governance body (non-profit foundation?)
- ⏳ Secure long-term funding for resolution service
- ⏳ Coordinate with existing PID systems (ISIL, VIAF, Wikidata)
- ⏳ Publish specification (W3C note or IETF RFC)
- ⏳ Deploy resolution infrastructure (multi-region, high availability)
- ⏳ Engage heritage community for adoption
References
- RFC 4122: UUID Standard (https://tools.ietf.org/html/rfc4122)
- SHA-1 Collision: Google/CWI (2017) - https://shattered.io
- UUID v8: RFC 9562 (2024), formerly the draft New UUID Formats (https://datatracker.ietf.org/doc/draft-ietf-uuidrev-rfc4122bis/)
- NIST SHA-256: FIPS 180-4 - https://csrc.nist.gov/publications/fips
- Identifiers.org: Life sciences identifiers - https://identifiers.org
- N2T: Name-to-Thing resolver - https://n2t.net
Version: 1.0
Date: 2024-11-06
Status: Draft for Community Review