glam/schemas/20251121/linkml/modules/classes/CustodianArchive.yaml
kempersc 3a6ead8fde feat: Add legal form filtering rule for CustodianName
- Introduced LEGAL-FORM-FILTER rule to standardize CustodianName by removing legal form designations.
- Documented rationale, examples, and implementation guidelines for the filtering process.

docs: Create README for value standardization rules

- Established a comprehensive README outlining various value standardization rules applicable to Heritage Custodian classes.
- Categorized rules into Name Standardization, Geographic Standardization, Web Observation, and Schema Evolution.

feat: Implement transliteration standards for non-Latin scripts

- Added TRANSLIT-ISO rule to ensure GHCID abbreviations are generated from emic names using ISO standards for transliteration.
- Included detailed guidelines for various scripts and languages, along with implementation examples.

feat: Define XPath provenance rules for web observations

- Created XPATH-PROVENANCE rule mandating XPath pointers for claims extracted from web sources.
- Established a workflow for archiving websites and verifying claims against archived HTML.

chore: Update records lifecycle diagram

- Generated a new Mermaid diagram illustrating the records lifecycle for heritage custodians.
- Included phases for active records, inactive archives, and processed heritage collections with key relationships and classifications.
2025-12-09 16:58:41 +01:00


# Custodian Archive Class
# Represents OPERATIONAL ARCHIVES that are NOT YET integrated into the formal
# heritage collection (CustodianCollection). These are records created through
# daily institutional operations that may take DECADES to process.
id: https://nde.nl/ontology/hc/class/CustodianArchive
name: custodian_archive_class
title: CustodianArchive Class
imports:
- linkml:types
- ./Custodian
- ./CustodianObservation
- ./ReconstructionActivity
- ./TimeSpan
- ./OrganizationalStructure
- ./CollectionManagementSystem
- ./Storage
- ../enums/ArchiveProcessingStatusEnum
- ../slots/access_restrictions
- ../slots/storage_location
- ./ReconstructedEntity
- ./CurrentArchive
prefixes:
linkml: https://w3id.org/linkml/
hc: https://nde.nl/ontology/hc/
crm: http://www.cidoc-crm.org/cidoc-crm/
dcterms: http://purl.org/dc/terms/
rico: https://www.ica.org/standards/RiC/ontology#
prov: http://www.w3.org/ns/prov#
time: http://www.w3.org/2006/time#
org: http://www.w3.org/ns/org#
premis: http://www.loc.gov/premis/rdf/v3/
skos: http://www.w3.org/2004/02/skos/core#
wikidata: http://www.wikidata.org/entity/
classes:
CustodianArchive:
is_a: ReconstructedEntity
class_uri: rico:RecordSet
description: |
Represents OPERATIONAL ARCHIVES created by a heritage custodian through its
daily activities that are NOT YET integrated into the formal heritage collection
(CustodianCollection).
**CRITICAL DISTINCTION - THREE-TIER RECORDS LIFECYCLE**:
```
┌─────────────────────────────────────────────────────────────────────┐
│ CustodianAdministration │
│ ════════════════════════ │
│ ACTIVE records in daily use │
│ - Current correspondence, invoices, personnel files │
│ - Digital files on shared drives │
│ - Email systems, databases │
│ - Actively referenced and modified │
│ - Managed by business units, NOT archives │
└─────────────────────────────────────────────────────────────────────┘
▼ (Retention period ends, transferred to archives)
┌─────────────────────────────────────────────────────────────────────┐
│ CustodianArchive (THIS CLASS) │
│ ════════════════════════════ │
│ INACTIVE records awaiting archival processing │
│ - Transferred from administration to archives │
│ - In BACKLOG - may wait DECADES for processing │
│ - Basic accession-level description only │
│ - NOT searchable by researchers (no finding aid) │
│ - Tracked in CMS for inventory purposes │
│ - May undergo appraisal, arrangement, description │
└─────────────────────────────────────────────────────────────────────┘
▼ (Archival processing complete, finding aid created)
┌─────────────────────────────────────────────────────────────────────┐
│ CustodianCollection │
│ ════════════════════════ │
│ PROCESSED heritage collection │
│ - Full finding aid available │
│ - Searchable by researchers │
│ - Arranged and described per archival standards │
│ - Integrated into institution's public collection │
│ - Managed as cultural heritage │
└─────────────────────────────────────────────────────────────────────┘
```
**WHY THIS MATTERS**:
Archival institutions (national archives, municipal archives, corporate archives)
create their OWN operational records through daily activities:
- Correspondence with depositors
- Acquisition documentation
- Conservation reports
- Staff files
- Financial records
- Research request logs
These institutional records are DISTINCT from the heritage collections they manage.
A national archive managing 17th-century municipal records ALSO has its own
20th-21st century operational archives that may take decades to process.
**TEMPORAL REALITY**:
Processing backlogs are measured in DECADES, not months:
- Large national archives: 30-50 year backlogs common
- Government transfers: Often 20+ years before processing
- Corporate archives: Legacy records may wait indefinitely
**RiC-O ALIGNMENT**:
- **rico:RecordSet**: Primary class for archival aggregations
- **rico:hasAccumulationDate**: When records were accumulated (created/received)
- **rico:hasAccumulator**: Agent that accumulated the records
- **rico:Activity**: Processing activities (appraisal, arrangement, description)
**RELATIONSHIP TO OTHER CLASSES**:
- **CustodianAdministration**: Active records; transfers records TO CustodianArchive
- **CustodianCollection**: Processed archives; receives records FROM CustodianArchive
- **CollectionManagementSystem**: Tracks CustodianArchive for inventory
- **Storage**: Physical location of unprocessed archives
- **OrganizationalStructure**: Unit responsible for processing
**RELATIONSHIP TO LIFECYCLE TYPE CLASSES**:
CustodianArchive (this class) is an INSTANCE class representing actual
operational archives. It can be TYPED using lifecycle phase classifications:
- **CurrentArchive** (Q3621648): Active records in daily use
- **DepositArchive** (Q244904): Intermediate/semi-current records
- **HistoricalArchive** (Q3621673): Permanent archival records
Use the `lifecycle_phase_type` slot (skos:broaderTransitive) to link a
CustodianArchive instance to one of these lifecycle phase types.
exact_mappings:
- rico:RecordSet
close_mappings:
- rico:RecordResource
- premis:IntellectualEntity
related_mappings:
- rico:hasAccumulationDate
- rico:Activity
- crm:E78_Curated_Holding
slots:
- id
- archive_name
- archive_description
- accession_number
- accession_date
- accumulation_date_start
- accumulation_date_end
- creating_agency
- processing_status
- processing_priority
- estimated_extent
- storage_location
- tracked_in_cms
- assigned_processor
- processing_started_date
- processing_completed_date
- transfer_to_collection_date
- successor_collection
- access_restrictions
- appraisal_notes
- arrangement_notes
- managing_unit
- refers_to_custodian
- was_derived_from
- was_generated_by
- valid_from
- valid_to
- lifecycle_phase_type
slot_usage:
id:
identifier: true
required: true
description: |
Unique identifier for this operational archive record (URI).
Format: https://nde.nl/ontology/hc/archive/{custodian-id}/{accession-number}
archive_name:
slot_uri: rico:name
range: string
required: true
description: |
Name/title for this operational archive accession.
Often derived from creating agency + date range.
**Examples**:
- "Rijksmuseum Director's Correspondence 2010-2020"
- "Acquisition Documentation 2015-2019"
- "Conservation Lab Reports 2000-2010"
- "Ministry of Finance Records 1990-2005"
archive_description:
slot_uri: rico:scopeAndContent
range: string
required: false
description: |
Brief description of contents, scope, and context.
**RiC-O**: rico:scopeAndContent for archival description.
At accession stage, description is typically minimal:
- General content summary
- Creating office/department
- Date range
- Volume/extent estimate
Detailed description created during IN_DESCRIPTION phase.
accession_number:
slot_uri: rico:identifier
range: string
required: true
description: |
Unique accession identifier assigned when records transferred
from CustodianAdministration to CustodianArchive.
**RiC-O**: rico:identifier for archival identifiers.
**Format varies by institution**:
- Sequential: "2024-001", "2024-002"
- Date-based: "20240115-A"
- Hybrid: "RM-2024-0042"
examples:
- value: "2024-0001"
description: "Sequential accession number"
- value: "NA-2024-GOV-0156"
description: "National Archives government transfer"
accession_date:
slot_uri: rico:hasAccumulationDate
range: date
required: true
description: |
Date when records were accessioned (transferred from
CustodianAdministration to CustodianArchive).
**RiC-O**: rico:hasAccumulationDate for when archive received records.
**Note**: This is when the ARCHIVE received the records,
NOT when the records were created (see accumulation_date_start/end).
examples:
- value: "2024-01-15"
description: "Records accessioned January 2024"
accumulation_date_start:
slot_uri: rico:hasBeginningDate
range: date
required: false
description: |
Earliest date of records in this accession (when records were CREATED).
**RiC-O**: rico:hasBeginningDate for temporal coverage.
**Example**: Correspondence files covering 2010-2020:
- accumulation_date_start: 2010-01-01
- accumulation_date_end: 2020-12-31
- accession_date: 2024-01-15 (when transferred to archive)
accumulation_date_end:
slot_uri: rico:hasEndDate
range: date
required: false
description: |
Latest date of records in this accession (when records were CREATED).
**RiC-O**: rico:hasEndDate for temporal coverage.
creating_agency:
slot_uri: rico:hasCreator
range: string
required: false
description: |
Office, department, or unit that created these records.
**RiC-O**: rico:hasCreator for provenance.
For institutional archives, this is typically an internal unit:
- "Director's Office"
- "Conservation Department"
- "Acquisition Committee"
- "Human Resources"
May link to OrganizationalStructure if modeled.
examples:
- value: "Director's Office"
description: "Executive correspondence"
- value: "Conservation Department"
description: "Treatment reports and documentation"
processing_status:
slot_uri: rico:Activity
range: ArchiveProcessingStatusEnum
required: true
description: |
Current processing status of this operational archive.
**See**: ArchiveProcessingStatusEnum for full status lifecycle.
**Common progression**:
UNPROCESSED → IN_APPRAISAL → IN_ARRANGEMENT → IN_DESCRIPTION
→ PROCESSED_PENDING_TRANSFER → TRANSFERRED_TO_COLLECTION
examples:
- value: "UNPROCESSED"
description: "In backlog awaiting processing"
- value: "IN_DESCRIPTION"
description: "Finding aid being created"
processing_priority:
slot_uri: dcterms:priority
range: string
required: false
description: |
Priority level for processing this accession.
**Values**:
- URGENT: Immediate processing required (legal hold, condition emergency)
- HIGH: Legal/regulatory requirement, researcher demand, condition issues
- MEDIUM: Standard processing queue
- LOW: No immediate need, can wait indefinitely
Priority may change based on:
- Researcher requests
- Anniversary/commemorative events
- Grant funding for specific processing
- Condition concerns (mold, pests, deterioration)
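**Sketch (illustrative)**:
The priority values listed above are documented in prose only, because the slot range is a plain string. They could instead be enforced with a LinkML enum. This is a minimal sketch that assumes a hypothetical `ProcessingPriorityEnum`; neither the name nor the enum is defined in this schema.

```yaml
# Hypothetical enum: the name is an assumption, and the descriptions
# are copied from the prose list in this slot description.
enums:
  ProcessingPriorityEnum:
    permissible_values:
      URGENT:
        description: Immediate processing required (legal hold, condition emergency)
      HIGH:
        description: Legal/regulatory requirement, researcher demand, condition issues
      MEDIUM:
        description: Standard processing queue
      LOW:
        description: No immediate need, can wait indefinitely
```

With such an enum in place, this slot would declare `range: ProcessingPriorityEnum` rather than `range: string`.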
examples:
- value: "HIGH"
description: "Researcher demand for these records"
- value: "LOW"
description: "No immediate need, stable condition"
estimated_extent:
slot_uri: rico:hasExtent
range: string
required: false
description: |
Estimated physical or digital extent of accession.
**RiC-O**: rico:hasExtent for quantity.
**Physical examples**:
- "45 linear meters"
- "120 boxes (standard archive boxes)"
- "15 filing cabinets"
**Digital examples**:
- "2.5 TB"
- "150,000 files"
- "45,000 emails"
examples:
- value: "25 linear meters"
description: "Physical extent"
- value: "500 GB, ~50,000 files"
description: "Digital extent"
storage_location:
slot_uri: rico:hasOrHadPhysicalLocation
range: Storage
multivalued: true
required: false
description: |
Physical storage location(s) for this unprocessed archive.
**RiC-O**: rico:hasOrHadPhysicalLocation for storage.
Links to Storage class for:
- Building/facility
- Room/area
- Shelf/range
- Environmental conditions
May have multiple locations if spread across facilities.
tracked_in_cms:
slot_uri: crm:P70i_is_documented_in
range: CollectionManagementSystem
multivalued: true
required: false
description: |
Collection Management System(s) tracking this accession.
**CIDOC-CRM**: P70i_is_documented_in for CMS tracking.
Even unprocessed archives are typically tracked in CMS:
- Accession-level record
- Location tracking
- Processing queue management
- Basic provenance metadata
**Note**: This is for TRACKING, not full description.
Full description created when status = IN_DESCRIPTION.
assigned_processor:
slot_uri: rico:hasOrHadManager
range: string
required: false
description: |
Archivist or staff member assigned to process this accession.
**RiC-O**: rico:hasOrHadManager for responsible agent.
May be null if:
- Status = UNPROCESSED (in queue, not yet assigned)
- Status = ON_HOLD
May link to PersonObservation if staff modeled.
examples:
- value: "Dr. Maria van den Berg"
description: "Senior archivist assigned"
processing_started_date:
slot_uri: prov:startedAtTime
range: date
required: false
description: |
Date when archival processing began.
**PROV-O**: prov:startedAtTime for activity start.
Null if status = UNPROCESSED or ON_HOLD.
Set when status changes to IN_APPRAISAL or later.
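**Sketch (illustrative)**:
The null rule above could be written as a LinkML class-level rule. This is a minimal sketch, not part of this schema. It assumes the validator in use evaluates `rules` (support varies between LinkML tools) and that `equals_string` matches the text of the enum value.

```yaml
# Hypothetical addition under the CustodianArchive class definition.
rules:
  - description: Unprocessed accessions must not carry a processing start date
    preconditions:
      slot_conditions:
        processing_status:
          equals_string: UNPROCESSED
    postconditions:
      slot_conditions:
        processing_started_date:
          value_presence: ABSENT
```

A second rule of the same shape would cover ON_HOLD.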
examples:
- value: "2024-03-01"
description: "Processing started March 2024"
processing_completed_date:
slot_uri: prov:endedAtTime
range: date
required: false
description: |
Date when archival processing completed.
**PROV-O**: prov:endedAtTime for activity completion.
Set when status changes to PROCESSED_PENDING_TRANSFER.
**Metric**: processing lag = processing_completed_date - accession_date.
This lag is often measured in YEARS or DECADES.
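**Worked example (hypothetical values)**:
The dates below are invented for illustration and are not taken from a real accession. They show how the lag is read off the two slots.

```yaml
# Hypothetical accession, for illustration only.
accession_date: "2015-06-01"
processing_completed_date: "2024-09-15"
# processing lag = 2024-09-15 minus 2015-06-01
#                = 9 years, 3 months, 14 days (about 9.3 years)
```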
examples:
- value: "2024-09-15"
description: "Processing completed September 2024"
transfer_to_collection_date:
slot_uri: rico:hasOrHadEndDate
range: date
required: false
description: |
Date when archive was transferred to CustodianCollection.
**RiC-O**: rico:hasOrHadEndDate for lifecycle end.
Set when status = TRANSFERRED_TO_COLLECTION.
After this date, primary record is CustodianCollection.
CustodianArchive retained for provenance documentation.
examples:
- value: "2024-10-01"
description: "Transferred to collection October 2024"
successor_collection:
slot_uri: prov:hadDerivation
range: uriorcurie
required: false
description: |
Link to CustodianCollection record created after archival processing.
**PROV-O**: prov:hadDerivation for derivation relationship.
This is the INVERSE of prov:wasDerivedFrom - the CustodianCollection
was DERIVED FROM this CustodianArchive through archival processing
(appraisal, arrangement, description).
**Why NOT rico:hasSuccessor?**
rico:hasSuccessor has domain/range of rico:Agent - it's for ORGANIZATIONAL
succession (institution A succeeded by institution B). For RECORD succession
we use PROV-O derivation semantics.
**Relationship direction**:
- CustodianArchive --prov:hadDerivation--> CustodianCollection
- CustodianCollection --prov:wasDerivedFrom--> CustodianArchive
Populated when status = TRANSFERRED_TO_COLLECTION.
Provides audit trail from unprocessed → processed.
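**Sketch (illustrative)**:
The two directions can be pictured as a pair of linked records. This is a minimal sketch. The identifiers reuse example values from this schema. The wrapper keys `archive` and `collection` and the `was_derived_from` key on the collection side are assumptions, since CustodianCollection is defined elsewhere.

```yaml
# CustodianArchive record (this class): points forward to the collection.
archive:
  id: "https://nde.nl/ontology/hc/archive/rm/2024-0001"
  processing_status: "TRANSFERRED_TO_COLLECTION"
  successor_collection: "https://nde.nl/ontology/hc/collection/rm-director-correspondence-2010-2020"

# CustodianCollection record (slot name assumed): points back to the archive.
collection:
  id: "https://nde.nl/ontology/hc/collection/rm-director-correspondence-2010-2020"
  was_derived_from: "https://nde.nl/ontology/hc/archive/rm/2024-0001"
```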
examples:
- value: "https://nde.nl/ontology/hc/collection/rm-director-correspondence-2010-2020"
description: "Collection derived from this archive after processing"
access_restrictions:
slot_uri: rico:hasOrHadAllMembersWithContentType
range: string
required: false
description: |
Access restrictions on this unprocessed archive.
**Common restrictions**:
- "Closed - Personal data (GDPR)"
- "Closed - Donor agreement (until 2040)"
- "Restricted - Staff access only"
- "Open - May be consulted upon request"
**Note**: Unprocessed archives often have blanket restrictions
until proper review during IN_APPRAISAL phase.
examples:
- value: "Closed - Contains personnel files with personal data"
description: "Privacy restriction"
appraisal_notes:
slot_uri: rico:history
range: string
required: false
description: |
Notes from appraisal process (retention decisions, destruction).
**RiC-O**: rico:history for processing history.
Documents:
- What was retained and why
- What was destroyed and why
- Retention schedule applied
- Appraisal methodology used
examples:
- value: "Retained all policy files; destroyed duplicate copies and routine correspondence per retention schedule RS-2020-05"
description: "Appraisal decisions documented"
arrangement_notes:
slot_uri: rico:history
range: string
required: false
description: |
Notes from arrangement process (structure, order).
**RiC-O**: rico:history for processing history.
Documents:
- Original order maintained or reconstructed
- Series structure created
- Physical rehousing performed
- Preservation concerns noted
examples:
- value: "Maintained original order by correspondent. Created 5 series by function. Rehoused into acid-free folders and boxes."
description: "Arrangement decisions documented"
managing_unit:
slot_uri: org:unitOf
range: OrganizationalStructure
required: false
description: |
Organizational unit responsible for processing this archive.
**W3C Org**: org:unitOf for unit relationship.
Typically the archives/records management department.
May differ from creating_agency (which created the records).
refers_to_custodian:
slot_uri: crm:P46i_forms_part_of
range: Custodian
required: true
description: |
Links this archive record to the Custodian hub.
**CIDOC-CRM**: P46i_forms_part_of for part-whole.
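The custodian that OWNS these operational archives. This is typically the
same institution whose operations created them, although transferred records
(for example, government transfers to a national archive) may originate from
another agency recorded in creating_agency.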
was_derived_from:
slot_uri: prov:wasDerivedFrom
range: CustodianObservation
multivalued: true
required: false
description: |
Observation(s) from which this archive record was derived.
**PROV-O**: prov:wasDerivedFrom for provenance.
was_generated_by:
slot_uri: prov:wasGeneratedBy
range: ReconstructionActivity
required: false
description: |
Reconstruction activity that generated this record.
**PROV-O**: prov:wasGeneratedBy for generation.
valid_from:
slot_uri: time:hasBeginning
range: date
required: false
description: |
Start of validity period (typically = accession_date).
valid_to:
slot_uri: time:hasEnd
range: date
required: false
description: |
End of validity period (typically = transfer_to_collection_date).
lifecycle_phase_type:
slot_uri: skos:broaderTransitive
range: uriorcurie
required: false
description: |
Links this CustodianArchive INSTANCE to its lifecycle phase TYPE.
**SKOS**: skos:broaderTransitive for instance-to-type relationship.
**Archive Lifecycle Types (Wikidata)**:
- Q3621648 (CurrentArchive) - Active records phase
- Q244904 (DepositArchive) - Intermediate/semi-current phase
- Q3621673 (HistoricalArchive) - Archival/permanent phase
**Usage**:
Classify this operational archive by its position in the records lifecycle.
Most CustodianArchive records are in the intermediate phase (awaiting processing).
**Example**:
- CustodianArchive "Ministry Records 2010-2020" → lifecycle_phase_type →
DepositArchive (Q244904) - semi-current, awaiting processing
examples:
- value: "wikidata:Q244904"
description: "Deposit archive / semi-current records"
- value: "wikidata:Q3621648"
description: "Current archive / active records"
comments:
- "Represents operational archives BEFORE integration into CustodianCollection"
- "Processing backlogs commonly span DECADES in archival institutions"
- "Distinct from CustodianAdministration (active records) and CustodianCollection (processed)"
- "RiC-O rico:RecordSet as primary ontology class"
- "PROV-O prov:hadDerivation links to successor CustodianCollection (NOT rico:hasSuccessor which is for Agents)"
- "Tracks full processing lifecycle from accession to transfer"
see_also:
- "https://www.ica.org/standards/RiC/ontology#RecordSet"
- "https://nde.nl/ontology/hc/class/CustodianAdministration"
- "https://nde.nl/ontology/hc/class/CustodianCollection"
- "https://nde.nl/ontology/hc/enum/ArchiveProcessingStatusEnum"
examples:
- value:
id: "https://nde.nl/ontology/hc/archive/rm/2024-0001"
archive_name: "Rijksmuseum Director's Correspondence 2010-2020"
archive_description: "Incoming and outgoing correspondence of the museum director including policy discussions, loan requests, and exhibition planning."
accession_number: "RM-2024-0001"
accession_date: "2024-01-15"
accumulation_date_start: "2010-01-01"
accumulation_date_end: "2020-12-31"
creating_agency: "Director's Office"
processing_status: "UNPROCESSED"
processing_priority: "MEDIUM"
estimated_extent: "12 linear meters (48 boxes)"
access_restrictions: "Restricted - Contains sensitive correspondence"
refers_to_custodian: "https://nde.nl/ontology/hc/nl-nh-ams-m-rm-q190804"
description: "Unprocessed director's correspondence awaiting archival processing"
- value:
id: "https://nde.nl/ontology/hc/archive/na/2015-gov-0234"
archive_name: "Ministry of Finance Records 1990-2005"
archive_description: "Financial policy records, budget documentation, and ministerial correspondence transferred under government archives law."
accession_number: "NA-2015-GOV-0234"
accession_date: "2015-06-01"
accumulation_date_start: "1990-01-01"
accumulation_date_end: "2005-12-31"
creating_agency: "Ministry of Finance"
processing_status: "IN_ARRANGEMENT"
processing_priority: "HIGH"
estimated_extent: "85 linear meters"
assigned_processor: "Dr. Jan de Vries"
processing_started_date: "2024-01-10"
appraisal_notes: "Retained all policy files; weeded duplicate copies per retention schedule."
refers_to_custodian: "https://nde.nl/ontology/hc/nl-na"
description: "Government records in active processing (processing began nearly 9 years after accession)"
# Slot definitions (basic - detailed in class slot_usage)
slots:
archive_name:
description: Name/title for operational archive accession
range: string
archive_description:
description: Brief description of archive contents
range: string
accession_number:
description: Unique accession identifier
range: string
accession_date:
description: Date when records were accessioned
range: date
accumulation_date_start:
description: Earliest date of records (creation date)
range: date
accumulation_date_end:
description: Latest date of records (creation date)
range: date
creating_agency:
description: Office/department that created records
range: string
processing_status:
description: Current processing status
range: ArchiveProcessingStatusEnum
processing_priority:
description: Priority level for processing
range: string
estimated_extent:
description: Estimated physical/digital extent
range: string
# NOTE: storage_location imported from global slot ../slots/storage_location.yaml
# Use slot_usage in class to customize range
tracked_in_cms:
description: CMS tracking this accession
range: CollectionManagementSystem
assigned_processor:
description: Staff member assigned to process
range: string
processing_started_date:
description: Date processing began
range: date
processing_completed_date:
description: Date processing completed
range: date
transfer_to_collection_date:
description: Date transferred to CustodianCollection
range: date
successor_collection:
description: Link to derived CustodianCollection (prov:hadDerivation)
range: uriorcurie
# NOTE: access_restrictions imported from global slot ../slots/access_restrictions.yaml
appraisal_notes:
description: Notes from appraisal process
range: string
arrangement_notes:
description: Notes from arrangement process
range: string
lifecycle_phase_type:
slot_uri: skos:broaderTransitive
description: |
Links CustodianArchive INSTANCE to lifecycle phase TYPE.
SKOS broaderTransitive for instance-to-type relationship.
Values: CurrentArchive (Q3621648), DepositArchive (Q244904),
HistoricalArchive (Q3621673).
range: uriorcurie