PiCo Ontology Analysis
Version: 0.1.0
Last Updated: 2025-01-09
Source: https://github.com/CBG-Centrum-voor-familiegeschiedenis/PiCo
Related: Executive Summary | Claims and Provenance
1. Overview
PiCo (Persons in Context) is an ontology developed by the CBG Centrum voor familiegeschiedenis (Center for Family History) in the Netherlands. It provides a conceptual framework for modeling persons in historical sources, with an explicit distinction between observations (what sources say) and reconstructions (what we conclude).
This distinction is fundamental to the PPID design and directly informs our two-level identifier architecture: POIDs identify observations and PRIDs identify reconstructions.
2. Core Philosophy
2.1 The Observation-Reconstruction Distinction
PiCo's central innovation is the explicit separation of:
| Concept | Definition | Example |
|---|---|---|
| PersonObservation | A person as described in a specific source | "The baptism register states 'Johannes, son of Pieter'" |
| PersonReconstruction | A curated identity derived from one or more observations | "Johannes Pietersen van der Berg (1692-1756)" |
This mirrors the genealogical research process:
┌─────────────────────────────────────────────────────────────────┐
│                        RESEARCH WORKFLOW                        │
│                                                                 │
│      Source A               Source B               Source C     │
│     (Baptism)              (Marriage)              (Burial)     │
│         │                      │                      │         │
│         ▼                      ▼                      ▼         │
│     ┌────────┐             ┌────────┐             ┌────────┐    │
│     │ Person │             │ Person │             │ Person │    │
│     │ Obs. A │             │ Obs. B │             │ Obs. C │    │
│     └────────┘             └────────┘             └────────┘    │
│         │                      │                      │         │
│         └──────────────────────┼──────────────────────┘         │
│                                │                                │
│                                ▼  (researcher reasoning)        │
│                       ┌─────────────────┐                       │
│                       │     Person      │                       │
│                       │ Reconstruction  │                       │
│                       │  "Johannes..."  │                       │
│                       └─────────────────┘                       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
2.2 Why This Matters
| Benefit | Description |
|---|---|
| Transparency | Clear separation of evidence from conclusions |
| Traceability | Every assertion traceable to source |
| Revision safety | New evidence can update reconstruction without losing observations |
| Scholarly integrity | Supports genealogical proof standards |
| Conflict handling | Contradictory sources can coexist |
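As a rough illustration of the separation described above, the following Python sketch keeps observations immutable and lets a reconstruction only point at them. The class and field names are assumptions made for this example (they are not PiCo terms), and the burial spelling "Johannis Pieterse" is invented to show two conflicting source readings coexisting.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class PersonObservation:
    """A person exactly as one source describes them (never edited afterwards)."""
    obs_id: str
    source: str
    name_as_recorded: str


@dataclass
class PersonReconstruction:
    """A curated identity that points back at the observations it rests on."""
    recon_id: str
    canonical_name: str
    derived_from: List[PersonObservation] = field(default_factory=list)

    def sources(self) -> List[str]:
        # Traceability: every conclusion can be walked back to its sources.
        return [obs.source for obs in self.derived_from]


baptism = PersonObservation("obs-a", "Baptism register", "Johannes, son of Pieter")
# Hypothetical variant spelling, invented for illustration only.
burial = PersonObservation("obs-c", "Burial register", "Johannis Pieterse")

johannes = PersonReconstruction(
    "recon-1", "Johannes Pietersen van der Berg", [baptism, burial]
)
print(johannes.sources())
```

Because the observations are frozen, changing the canonical name on the reconstruction never alters what either register is recorded as saying.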
3. Ontology Structure
3.1 Namespace and Prefixes
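@prefix picom: <https://personsincontext.org/model#> .
@prefix pnv: <https://w3id.org/pnv#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix bio: <http://purl.org/vocab/bio/0.1/> .
@prefix schema: <http://schema.org/> .
# Standard prefixes used by the examples in later sections
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .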
3.2 Class Hierarchy
schema:Person
│
├── picom:PersonObservation
│   │
│   └── (represents a person as found in a single source)
│
└── picom:PersonReconstruction
    │
    └── (represents a curated person identity)
3.3 Core Classes
PersonObservation
picom:PersonObservation a owl:Class ;
rdfs:subClassOf schema:Person ;
rdfs:label "Person Observation"@en ;
rdfs:comment """A person as observed/described in a specific source.
This represents what the source says, not necessarily
what is true."""@en .
Key properties:
- picom:hasName → Name as recorded in source
- picom:hasRole → Role mentioned in source
- picom:inRecord → Link to source document
- prov:wasDerivedFrom → Source provenance
PersonReconstruction
picom:PersonReconstruction a owl:Class ;
rdfs:subClassOf schema:Person ;
rdfs:label "Person Reconstruction"@en ;
rdfs:comment """A curated person identity constructed from one or more
PersonObservations through research and reasoning."""@en .
Key properties:
- prov:wasDerivedFrom → Links to source PersonObservations
- picom:hasName → Canonical name form(s)
- bio:birth / bio:death → Life events
- picom:hasRole → Aggregated roles
4. Integration with Existing Ontologies
PiCo builds on established vocabularies rather than reinventing them:
4.1 Schema.org
| PiCo Usage | Schema.org Class/Property |
|---|---|
| Person base class | schema:Person |
| Birth date | schema:birthDate |
| Death date | schema:deathDate |
| Gender | schema:gender |
| Family name | schema:familyName |
| Given name | schema:givenName |
4.2 PROV-O (Provenance Ontology)
| PiCo Usage | PROV-O Property |
|---|---|
| Observation derived from source | prov:wasDerivedFrom |
| Reconstruction generated by activity | prov:wasGeneratedBy |
| Attribution to researcher | prov:wasAttributedTo |
| Revision tracking | prov:wasRevisionOf |
# Example: Reconstruction with provenance
<reconstruction/johannes-van-der-berg>
a picom:PersonReconstruction ;
prov:wasDerivedFrom <observation/baptism-1692-123> ;
prov:wasDerivedFrom <observation/marriage-1715-456> ;
prov:wasDerivedFrom <observation/burial-1756-789> ;
prov:wasGeneratedBy <research-activity/cbg-2024-001> ;
prov:wasAttributedTo <researcher/jan-jansen> .
4.3 BIO Vocabulary
| PiCo Usage | BIO Class/Property |
|---|---|
| Birth event | bio:Birth |
| Death event | bio:Death |
| Marriage | bio:Marriage |
| Event date | bio:date |
| Event place | bio:place |
4.4 PNV (Person Name Vocabulary)
PiCo uses PNV for structured name representation:
<observation/baptism-1692-123>
picom:hasName [
a pnv:PersonName ;
pnv:givenName "Johannes" ;
pnv:patronym "Pietersen" ;
pnv:surnamePrefix "van der" ;
pnv:baseSurname "Berg" ;
pnv:literalName "Johannes Pietersen van der Berg"
] .
5. Person Name Vocabulary (PNV) Deep Dive
5.1 Background
PNV was developed to handle the complexity of Dutch historical names, but its patterns apply globally:
- Patronymics: "Pietersen" (son of Pieter)
- Surname prefixes: "van der", "de", "ten"
- Multiple given names
- Initials
- Name changes over time
5.2 PNV Properties
| Property | Description | Example |
|---|---|---|
| pnv:literalName | Full name as single string | "Johannes Pietersen van der Berg" |
| pnv:givenName | First/given name(s) | "Johannes" |
| pnv:patronym | Patronymic name | "Pietersen" |
| pnv:surnamePrefix | Particles before surname | "van der" |
| pnv:baseSurname | Core family name | "Berg" |
| pnv:surname | Combined prefix + baseSurname | "van der Berg" |
| pnv:initials | Initials only | "J.P." |
| pnv:infixTitle | Title within name | "graaf" (count) |
| pnv:disambiguatingDescription | Distinguishing info | "de oude" (the elder) |
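The relationship between the component properties and the combined ones (surname, literalName) can be sketched in Python as follows. The function names are hypothetical, and the fixed given / patronym / prefix / base order is an assumption that suits the Dutch examples in this document. For an observation, the literal name should be copied from the source rather than composed this way.

```python
from typing import Optional


def compose_surname(surname_prefix: Optional[str], base_surname: Optional[str]) -> str:
    """Join prefix and base surname, skipping whichever part is missing."""
    return " ".join(p for p in (surname_prefix, base_surname) if p)


def compose_literal_name(
    given_name: Optional[str] = None,
    patronym: Optional[str] = None,
    surname_prefix: Optional[str] = None,
    base_surname: Optional[str] = None,
) -> str:
    """Build a display string in given / patronym / prefix / base order."""
    parts = (given_name, patronym, surname_prefix, base_surname)
    return " ".join(p for p in parts if p)


print(compose_surname("van der", "Berg"))
print(compose_literal_name("Johannes", "Pietersen", "van der", "Berg"))
print(compose_literal_name("Björk", "Guðmundsdóttir"))
```

Missing parts are simply skipped, so a name with only a given name and a patronym composes without a stray space.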
5.3 Name Complexity Examples
Dutch with patronymic:
[ a pnv:PersonName ;
pnv:givenName "Jan" ;
pnv:patronym "Hendrikszoon" ;
pnv:surnamePrefix "van" ;
pnv:baseSurname "Amstel" ;
pnv:literalName "Jan Hendrikszoon van Amstel" ] .
Spanish with two family names:
[ a pnv:PersonName ;
pnv:givenName "María" ;
pnv:givenName "Elena" ;
pnv:baseSurname "García" ;
pnv:baseSurname "López" ;
pnv:literalName "María Elena García López" ] .
Icelandic patronymic (no surname):
[ a pnv:PersonName ;
pnv:givenName "Björk" ;
pnv:patronym "Guðmundsdóttir" ;
pnv:literalName "Björk Guðmundsdóttir" ] .
6. Handling Uncertainty
6.1 Date Uncertainty
PiCo allows flexibility in date representation:
# Exact date known
<observation/birth-1692>
bio:date "1692-03-15"^^xsd:date .
# Only year known
<observation/birth-approx>
bio:date "1692"^^xsd:gYear .
# Estimated from age at death
<observation/birth-estimated>
picom:estimatedBirthYear "1692"^^xsd:gYear ;
picom:birthYearEstimationMethod "calculated from age 64 at death in 1756" .
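The "estimated from age at death" case is simple arithmetic, but it yields a two-year window rather than a single year, because the source rarely says whether the birthday had already passed. A minimal Python sketch follows; the function name is hypothetical.

```python
from typing import Tuple


def estimate_birth_years(death_year: int, age_at_death: int) -> Tuple[int, int]:
    """Return the (earliest, latest) birth year consistent with an age at death.

    Someone who died aged N in year Y was born in Y - N if the birthday had
    already passed that year, otherwise in Y - N - 1.
    """
    if age_at_death < 0:
        raise ValueError("age_at_death must be non-negative")
    latest = death_year - age_at_death
    return latest - 1, latest


print(estimate_birth_years(1756, 64))  # (1691, 1692)
```

Recording the estimation method alongside the value, as the example above does, lets a later reader see that 1692 is the upper end of this window and not a documented date.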
6.2 Uncertain Identity Linkage
When observations might refer to the same person:
<observation/a> picom:possibleSameAs <observation/b> .
<observation/a> picom:certainSameAs <observation/c> .
6.3 Confidence Scores
PiCo supports confidence assertions:
<reconstruction/johannes>
picom:hasConfidence [
picom:confidenceValue 0.85 ;
picom:confidenceMethod "probabilistic record linkage" ;
picom:confidenceNote "High confidence based on matching name, date, and location"
] .
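One way to connect a numeric confidence value to the possible/certain linkage predicates from section 6.2 is a pair of thresholds. The Python sketch below is purely illustrative: the cut-off values and the function name are assumptions, and nothing in this document fixes them.

```python
from typing import Optional

# Hypothetical cut-offs; this document does not define these values.
CERTAIN_AT = 0.95
POSSIBLE_AT = 0.50


def linkage_predicate(confidence: float) -> Optional[str]:
    """Map a match score in [0, 1] to a linkage predicate, or None for no link."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= CERTAIN_AT:
        return "picom:certainSameAs"
    if confidence >= POSSIBLE_AT:
        return "picom:possibleSameAs"
    return None


print(linkage_predicate(0.85))  # picom:possibleSameAs
```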
7. Role Modeling
7.1 Persons in Context
PiCo's name reflects its focus on persons in context - roles and relationships:
<observation/baptism-1692-123>
picom:hasRole [
a picom:Role ;
picom:roleType "child" ;
picom:roleContext <event/baptism-1692>
] ;
picom:hasRole [
a picom:Role ;
picom:roleType "son" ;
picom:roleInRelationTo <observation/pieter-father>
] .
7.2 Role Types for Heritage Sector
| Role Type | Context | Example |
|---|---|---|
| archivist | Institution employment | "Chief archivist at Noord-Hollands Archief" |
| curator | Collection management | "Curator of Dutch Masters" |
| director | Leadership | "Museum director 2010-2020" |
| donor | Collection contribution | "Donated family papers in 1985" |
| researcher | Academic work | "Visiting researcher" |
| subject | Collection content | "Person depicted in portrait" |
8. PPID Alignment with PiCo
8.1 Mapping PiCo to PPID
| PiCo Concept | PPID Implementation |
|---|---|
| picom:PersonObservation | POID (Person Observation ID) |
| picom:PersonReconstruction | PRID (Person Reconstruction ID) |
| prov:wasDerivedFrom | Links PRID → POIDs |
| pnv:PersonName | Structured name storage |
| picom:hasRole | Role at heritage institution |
8.2 Extended PPID Model
PPID extends PiCo for the heritage custodian context:
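@prefix ppid: <https://ppid.org/> .
@prefix picom: <https://personsincontext.org/model#> .
@prefix ghcid: <https://w3id.org/heritage/custodian/> .
@prefix pnv: <https://w3id.org/pnv#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .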
# Person Observation (from LinkedIn)
ppid:POID-7a3b-c4d5-e6f7-8901 a picom:PersonObservation ;
picom:hasName [
pnv:givenName "Jan" ;
pnv:surnamePrefix "van den" ;
pnv:baseSurname "Berg" ;
pnv:literalName "Jan van den Berg"
] ;
picom:hasRole [
picom:roleType "Senior Archivist" ;
picom:roleAtInstitution ghcid:NL-NH-HAA-A-NHA
] ;
prov:wasDerivedFrom <https://linkedin.com/in/jan-van-den-berg> ;
ppid:retrievedOn "2025-01-09"^^xsd:date .
# Person Observation (from institutional website)
ppid:POID-8b4c-d5e6-f7g8-9012 a picom:PersonObservation ;
picom:hasName [
pnv:givenName "J." ;
pnv:surnamePrefix "van den" ;
pnv:baseSurname "Berg" ;
pnv:literalName "J. van den Berg"
] ;
picom:hasRole [
picom:roleType "Archivaris" ;
picom:roleAtInstitution ghcid:NL-NH-HAA-A-NHA
] ;
prov:wasDerivedFrom <https://noord-hollandsarchief.nl/over-ons/medewerkers> ;
ppid:retrievedOn "2025-01-08"^^xsd:date .
# Person Reconstruction (curated identity)
ppid:PRID-1234-5678-90ab-cdef a picom:PersonReconstruction ;
picom:hasName [
pnv:givenName "Jan" ;
pnv:surnamePrefix "van den" ;
pnv:baseSurname "Berg" ;
pnv:literalName "Jan van den Berg"
] ;
prov:wasDerivedFrom ppid:POID-7a3b-c4d5-e6f7-8901 ;
prov:wasDerivedFrom ppid:POID-8b4c-d5e6-f7g8-9012 ;
prov:wasGeneratedBy [
a prov:Activity ;
prov:wasAssociatedWith <agent/ppid-matcher> ;
prov:endedAtTime "2025-01-09T10:30:00Z"^^xsd:dateTime
] ;
ppid:employmentHistory [
ppid:institution ghcid:NL-NH-HAA-A-NHA ;
ppid:role "Senior Archivist" ;
ppid:startDate "2015"^^xsd:gYear ;
ppid:endDate "present"
] .
9. Implementation Considerations
9.1 When to Create POID vs PRID
| Scenario | Create |
|---|---|
| Extract person from LinkedIn | POID |
| Extract person from institutional website | POID |
| Extract person from archival document | POID |
| Match multiple POIDs to single identity | PRID |
| User claims "these are the same person" | PRID linking POIDs |
9.2 PRID Creation Rules
A PRID should be created when:
- Single authoritative source: One high-quality POID with comprehensive data
- Multiple matched POIDs: An algorithm or a human reviewer determines that multiple observations refer to the same person
- External identifier exists: Person has ORCID, ISNI, or Wikidata ID
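The three rules above can be read as a small decision function. The Python sketch below is one possible reading, not a specified algorithm: the Poid class, its fields, and the order in which the rules are checked are assumptions, and how a POID comes to be flagged as authoritative is left open here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Poid:
    poid_id: str
    authoritative: bool = False        # assumed flag for "high-quality, comprehensive"
    external_id: Optional[str] = None  # e.g. an ORCID, ISNI, or Wikidata ID


def prid_creation_reason(matched: List[Poid]) -> Optional[str]:
    """Return which rule justifies creating a PRID, or None if none applies."""
    if not matched:
        return None
    if len(matched) > 1:
        return "multiple matched POIDs"
    only = matched[0]
    if only.external_id:
        return "external identifier exists"
    if only.authoritative:
        return "single authoritative source"
    return None


print(prid_creation_reason([Poid("POID-a"), Poid("POID-b")]))
print(prid_creation_reason([Poid("POID-c")]))
```

A lone, unremarkable POID returns None, meaning it stays an observation until more evidence arrives.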
9.3 Handling Updates
# Original reconstruction
ppid:PRID-1234-5678-90ab-cdef a picom:PersonReconstruction ;
prov:generatedAtTime "2025-01-09T10:30:00Z"^^xsd:dateTime .
# Updated reconstruction (new evidence)
ppid:PRID-1234-5678-90ab-cdef-v2 a picom:PersonReconstruction ;
prov:wasRevisionOf ppid:PRID-1234-5678-90ab-cdef ;
prov:wasDerivedFrom ppid:POID-7a3b-c4d5-e6f7-8901 ;
prov:wasDerivedFrom ppid:POID-8b4c-d5e6-f7g8-9012 ;
prov:wasDerivedFrom ppid:POID-new-observation ; # New evidence
prov:generatedAtTime "2025-01-15T14:00:00Z"^^xsd:dateTime .
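The update pattern above, where a new version points back at the one it revises and the original stays untouched, might be implemented along these lines. This is a sketch under assumptions: the Reconstruction class and revise function are hypothetical, and the "-vN" suffix rule is inferred from the single "-v2" example above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Reconstruction:
    base_prid: str
    version: int
    derived_from: Tuple[str, ...]
    generated_at: str
    revision_of: Optional[str] = None

    @property
    def prid(self) -> str:
        # Version 1 keeps the bare identifier; later versions get a -vN suffix.
        if self.version == 1:
            return self.base_prid
        return f"{self.base_prid}-v{self.version}"


def revise(current: Reconstruction, new_poid: str, generated_at: str) -> Reconstruction:
    """Create the next version; the current one is frozen and left as-is."""
    return Reconstruction(
        base_prid=current.base_prid,
        version=current.version + 1,
        derived_from=current.derived_from + (new_poid,),
        generated_at=generated_at,
        revision_of=current.prid,
    )


v1 = Reconstruction(
    "PRID-1234-5678-90ab-cdef",
    1,
    ("POID-7a3b-c4d5-e6f7-8901", "POID-8b4c-d5e6-f7g8-9012"),
    "2025-01-09T10:30:00Z",
)
v2 = revise(v1, "POID-new-observation", "2025-01-15T14:00:00Z")
print(v2.prid, "revises", v2.revision_of)
```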
10. Gaps in PiCo for PPID
While PiCo provides an excellent foundation, PPID needs extensions:
| Gap | PPID Extension |
|---|---|
| Web source provenance | Add XPath, retrieval timestamp, HTML archival |
| Confidence scoring standards | Define confidence scale and methods |
| Heritage sector roles | Vocabulary for archivist, curator, director, etc. |
| Institution linking | Integration with GHCID |
| Living person data protection | GDPR-compliant access controls |
These extensions are detailed in 07_claims_and_provenance.md and 08_implementation_guidelines.md.
11. References
Primary Sources
- PiCo Ontology: https://github.com/CBG-Centrum-voor-familiegeschiedenis/PiCo
- PiCo Documentation: https://personsincontext.org/
- PNV Specification: https://w3id.org/pnv
Related Ontologies
- Schema.org Person: https://schema.org/Person
- BIO Vocabulary: http://purl.org/vocab/bio/0.1/
- PROV-O: https://www.w3.org/TR/prov-o/
Academic Papers
- Bloothooft, G., & Schraagen, M. (2015). "Learning name variants from true person resolution." Proceedings of the First Workshop on Computational Models of Reference, Anaphora and Coreference.