glam/data/ontology/ONTOLOGY_CATALOG.md
kempersc 1fb924c412 feat: add ontology mappings to LinkML schema and enhance entity resolution
Schema enhancements (443 files):
- Add class_uri with proper ontology references (schema:, prov:, skos:, rico:)
- Add close_mappings, related_mappings per Rule 50 convention
- Replace stub hc: slot_uri with standard predicates (dcterms:identifier, skos:prefLabel)
- Improve descriptions with ontology mapping rationale
- Add prefixes blocks to all schema modules

Entity Resolution improvements:
- Add entity_resolution module with email semantics parsing
- Enhance build_entity_resolution.py with email-based matching signals
- Extend Entity Review API with filtering by signal types and count
- Add candidates caching and indexing for performance
- Add ReviewLoginPage component

New rules and documentation:
- Add Rule 51: No Hallucinated Ontology References
- Add .opencode/rules/no-hallucinated-ontology-references.md
- Add .opencode/rules/slot-ontology-mapping-reference.md
- Add adms.ttl and dqv.ttl ontology files

Frontend ontology support:
- Add RiC-O_1-1.rdf and schemaorg.owl to public/ontology
2026-01-13 13:51:02 +01:00


# Ontology Catalog for Heritage Custodian Ontology
This document catalogs all ontologies used in the GLAM Heritage Custodian project.

**Last Updated**: 2026-01-13
## Core Domain Ontologies
### Heritage & Cultural Heritage
| File | Ontology | Version | Source | Namespace |
|------|----------|---------|--------|-----------|
| `CIDOC_CRM_v7.1.3.rdf` | CIDOC Conceptual Reference Model | 7.1.3 | https://cidoc-crm.org/ | `crm:` |
| `CRMgeo_v1_2.rdfs` | CRMgeo (Geospatial Extension) | 1.2 | https://cidoc-crm.org/ | `crmgeo:` |
| `RiC-O_1-1.rdf` | Records in Contexts Ontology | 1.1 | https://www.ica.org/standards/RiC/ontology | `rico:` |
| `edm.owl` | Europeana Data Model | 5.2.8 | http://www.europeana.eu/schemas/edm/ | `edm:` |
| `bibframe.rdf` | BIBFRAME (Library Bibliographic) | 2.x | https://id.loc.gov/ontologies/bibframe.html | `bf:` |
| `premis3.owl` | PREMIS (Preservation Metadata) | 3.0 | http://www.loc.gov/premis/rdf/v3/ | `premis:` |
### Persons & Organizations
| File | Ontology | Version | Source | Namespace |
|------|----------|---------|--------|-----------|
| `pico.ttl` | PiCo (Persons in Context) | 2023-09-21 | https://personsincontext.org/ | `pico:` |
| `foaf.ttl` | FOAF (Friend of a Friend) | 0.1 | http://xmlns.com/foaf/spec/ | `foaf:` |
| `org.rdf` | W3C Organization Ontology | 2014 | https://www.w3.org/TR/vocab-org/ | `org:` |
| `core-public-organisation-ap.ttl` | CPOV (Core Public Organisation) | 1.1 | https://joinup.ec.europa.eu/collection/semic-support-centre/solution/cpov-application-profile | `cpov:` |
| `regorg.ttl` | Registered Organization Vocabulary | 1.0 | https://www.w3.org/TR/vocab-regorg/ | `regorg:` |
| `tooiont.ttl` | TOOI (Dutch Government Ontology) | 2023 | https://identifier.overheid.nl/tooi/ | `tooi:` |
### Legal Entities & Business
| File | Ontology | Version | Source | Namespace |
|------|----------|---------|--------|-----------|
| `gleif_base.ttl` | GLEIF Base Ontology | 1.0 | https://www.gleif.org/ontology/ | `gleif-base:` |
| `gleif_l1.ttl` | GLEIF Level 1 (LEI Data) | 1.0 | https://www.gleif.org/ontology/ | `gleif-L1:` |
| `gleif_l2.ttl` | GLEIF Level 2 (Relationships) | 1.0 | https://www.gleif.org/ontology/ | `gleif-L2:` |
| `gleif_legal_form.ttl` | GLEIF Legal Form | 1.0 | https://www.gleif.org/ontology/ | `gleif-elf:` |
| `gleif_ra.ttl` | GLEIF Registration Authority | 1.0 | https://www.gleif.org/ontology/ | `gleif-ra:` |
| `ebg-ontology.ttl` | euBusinessGraph Ontology | 1.0 | https://data.businessgraph.io/ontology | `ebg:` |
| `fibo.rdf` | FIBO (Financial Industry) | 2024 | https://spec.edmcouncil.org/fibo/ | `fibo-fnd:` |
## Geographic & Location Ontologies
| File | Ontology | Version | Source | Namespace |
|------|----------|---------|--------|-----------|
| `geonames_ontology.rdf` | GeoNames Ontology | 3.3 | https://www.geonames.org/ontology/ | `gn:` |
| `geo.ttl` | GeoSPARQL | 1.1 | https://www.ogc.org/standard/geosparql/ | `geo:` |
| `wgs84_pos.rdf` | W3C WGS84 Geo Positioning | 2009 | https://www.w3.org/2003/01/geo/ | `wgs84_pos:` |
| `lcc-cr.rdf` | LCC Country Representation | 1.2 | https://www.omg.org/spec/LCC/ | `lcc-cr:` |
| `lcc-3166-1.rdf` | LCC ISO 3166-1 Country Codes | 1.2 | https://www.omg.org/spec/LCC/ | `lcc-3166-1:` |
| `lcc-3166-2.rdf` | LCC ISO 3166-2 Subdivisions | 1.2 | https://www.omg.org/spec/LCC/ | `lcc-3166-2:` |
## Language Ontologies
| File | Ontology | Version | Source | Namespace |
|------|----------|---------|--------|-----------|
| `lcc-lr.rdf` | LCC Language Representation | 1.2 | https://www.omg.org/spec/LCC/ | `lcc-lr:` |
| `lcc-639-1.rdf` | LCC ISO 639-1 Language Codes | 1.2 | https://www.omg.org/spec/LCC/ | `lcc-639-1:` |
## Sensor & IoT Ontologies
| File | Ontology | Version | Source | Namespace |
|------|----------|---------|--------|-----------|
| `sosa.ttl` | SOSA (Sensor, Observation, Sample, Actuator) | 1.1 | https://www.w3.org/TR/vocab-ssn/ | `sosa:` |
| `ssn.ttl` | SSN (Semantic Sensor Network) | 1.1 | https://www.w3.org/TR/vocab-ssn/ | `ssn:` |
## Foundational Ontologies
### Semantic Web Core
| File | Ontology | Version | Source | Namespace |
|------|----------|---------|--------|-----------|
| `skos.rdf` | SKOS (Simple Knowledge Org System) | 2009 | https://www.w3.org/TR/skos-reference/ | `skos:` |
| `dublin_core_elements.rdf` | Dublin Core Elements | 1.1 | https://www.dublincore.org/specifications/dublin-core/ | `dc:` |
| `dcat3.ttl` | DCAT (Data Catalog Vocabulary) | 3.0 | https://www.w3.org/TR/vocab-dcat-3/ | `dcat:` |
| `schemaorg.owl` | Schema.org | 2024 | https://schema.org/ | `schema:` |
| `vcard.rdf` | vCard Ontology | 4.0 | https://www.w3.org/TR/vcard-rdf/ | `vcard:` |
| `dqv.ttl` | Data Quality Vocabulary | 2016-12 | https://www.w3.org/TR/vocab-dqv/ | `dqv:` |
| `adms.ttl` | Asset Description Metadata Schema | 2015-07 | https://www.w3.org/TR/vocab-adms/ | `adms:` |
### Provenance & Temporal
| File | Ontology | Version | Source | Namespace |
|------|----------|---------|--------|-----------|
| `prov.ttl` | PROV-O (Provenance Ontology) | 2013 | https://www.w3.org/TR/prov-o/ | `prov:` |
| `prov-o.rdf` | PROV-O (RDF format) | 2013 | https://www.w3.org/TR/prov-o/ | `prov:` |
| `pav.rdf` | PAV (Provenance, Authoring, Versioning) | 2.3 | http://purl.org/pav/ | `pav:` |
| `time.rdf` | OWL-Time | 2017 | https://www.w3.org/TR/owl-time/ | `time:` |
### Object Reuse & Exchange
| File | Ontology | Version | Source | Namespace |
|------|----------|---------|--------|-----------|
| `ore.rdf` | OAI-ORE (Object Reuse & Exchange) | 1.0 | http://www.openarchives.org/ore/terms/ | `ore:` |
## Domain-Specific Extensions
### Railway & Transport
| File | Ontology | Version | Source | Namespace |
|------|----------|---------|--------|-----------|
| `era_ontology.ttl` | ERA Railway Ontology | 3.0 | https://data-interop.era.europa.eu/ | `era:` |
### Biomedical
| File | Ontology | Version | Source | Namespace |
|------|----------|---------|--------|-----------|
| `omrse.owl` | OMRSE (Ontology for Modeling and Representation of Social Entities) | 2024 | http://purl.obolibrary.org/obo/omrse.owl | `omrse:` |
### Software & Projects
| File | Ontology | Version | Source | Namespace |
|------|----------|---------|--------|-----------|
| `doap.rdf` | DOAP (Description of a Project) | 2023 | https://github.com/ewilderj/doap | `doap:` |
### DBpedia
| File | Ontology | Version | Source | Namespace |
|------|----------|---------|--------|-----------|
| `dbpedia_ontology.owl` | DBpedia Ontology | 2024 | https://dbpedia.org/ontology/ | `dbo:` |
| `dbpedia_classes_sample.ttl` | DBpedia Classes (sample) | 2024 | https://dbpedia.org/ | `dbo:` |
| `dbpedia_heritage_classes.ttl` | DBpedia Heritage Classes | 2024 | https://dbpedia.org/ | `dbo:` |
| `dbpedia_wikidata_mappings.ttl` | DBpedia-Wikidata Mappings | 2024 | https://dbpedia.org/ | - |
### Hypermedia APIs (W3C Hydra Community Group)
| File | Ontology | Version | Source | Namespace |
|------|----------|---------|--------|-----------|
| `hydra_cg.jsonld` | Hydra Core Vocabulary | 1.0 | https://www.hydra-cg.com/spec/latest/core/ | `hydra:` |
### Other
| File | Ontology | Version | Source | Namespace |
|------|----------|---------|--------|-----------|
| `oasis.owl` | OASIS Ontology | 2023 | (proprietary) | `oasis:` |
| `wod_thing.ttl` | Web of Data Thing | 2024 | - | `wod:` |
## Supplementary Files
| File | Description |
|------|-------------|
| `2023-09-28-elf-code-list-v1.5.csv` | GLEIF Entity Legal Form codes (CSV) |
| `dbpedia_glam_mappings_index.md` | Index of DBpedia GLAM mappings |
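
Because every table above names its file in the first column, the catalog can be compared against the contents of the ontology directory. The following is a minimal sketch, not part of the project's tooling: the function names `catalog_files` and `missing_files` are hypothetical, and it assumes each row starts with a backticked file name as in the tables above.

```bash
# Hypothetical helper: print the backticked file name from the first
# column of every table row in a Markdown catalog file.
catalog_files() {
  sed -n 's/^| `\([^`]*\)` |.*/\1/p' "$1"
}

# Hypothetical helper: list cataloged files that are absent from a directory.
missing_files() {
  local catalog="$1"
  local dir="$2"
  local name
  catalog_files "$catalog" | while IFS= read -r name; do
    [ -e "$dir/$name" ] || echo "$name"
  done
}

# Run the check only when the catalog is present in the current directory.
if [ -f ONTOLOGY_CATALOG.md ]; then
  missing_files ONTOLOGY_CATALOG.md .
fi
```

Empty output means every cataloged file was found. Header and separator rows are skipped because they contain no backticked name.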
## Usage in LinkML Schema
The Heritage Custodian Ontology (`schemas/20251121/linkml/01_custodian_name_modular.yaml`) references a subset of these ontologies, each for a specific purpose:
1. **PiCo** (`pico:`) - Core observation/reconstruction pattern for persons and entities
2. **PROV-O** (`prov:`) - Provenance tracking, activity chains
3. **SOSA/SSN** (`sosa:`, `ssn:`) - IoT devices, sensors, beacons
4. **GeoNames** (`gn:`) - Settlement and geographic feature references
5. **LCC** (`lcc-cr:`, `lcc-3166-1:`, `lcc-3166-2:`) - Country and subdivision codes
6. **GLEIF** (`gleif-base:`, `gleif-L1:`) - Legal entity identification
7. **SKOS** (`skos:`) - Concept schemes, labels, hierarchies
8. **Schema.org** (`schema:`) - Web semantics, social media profiles
9. **CIDOC-CRM** (`crm:`) - Cultural heritage event modeling
10. **EDM** (`edm:`) - Cultural heritage object aggregation
11. **ORE** (`ore:`) - Object aggregation patterns
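
Each prefix in the list above stands for a namespace IRI, and a compact identifier such as `skos:prefLabel` expands to that IRI followed by the local name. The sketch below illustrates the expansion for a few of the prefixes. The helper names `namespace_for` and `expand_curie` are hypothetical, and the IRIs are the publicly documented namespaces rather than values read from the project's LinkML schema, so the schema's own `prefixes` blocks remain the authority.

```bash
# Hypothetical helper: map a prefix to its namespace IRI (subset only).
namespace_for() {
  case "$1" in
    prov) echo "http://www.w3.org/ns/prov#" ;;
    sosa) echo "http://www.w3.org/ns/sosa/" ;;
    ssn)  echo "http://www.w3.org/ns/ssn/" ;;
    skos) echo "http://www.w3.org/2004/02/skos/core#" ;;
    crm)  echo "http://www.cidoc-crm.org/cidoc-crm/" ;;
    edm)  echo "http://www.europeana.eu/schemas/edm/" ;;
    ore)  echo "http://www.openarchives.org/ore/terms/" ;;
    *)    return 1 ;;
  esac
}

# Hypothetical helper: expand "prefix:local" into a full IRI.
expand_curie() {
  local curie="$1"
  local prefix="${curie%%:*}"      # text before the first colon
  local local_name="${curie#*:}"   # text after the first colon
  local ns
  ns="$(namespace_for "$prefix")" || {
    echo "unknown prefix: $prefix" >&2
    return 1
  }
  printf '%s%s\n' "$ns" "$local_name"
}

expand_curie skos:prefLabel       # http://www.w3.org/2004/02/skos/core#prefLabel
expand_curie prov:wasDerivedFrom  # http://www.w3.org/ns/prov#wasDerivedFrom
```

An unknown prefix returns a non-zero status instead of a guessed IRI.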
## Download Commands
```bash
# -f makes curl fail on HTTP errors instead of saving the error page;
# -L follows redirects; the Accept header requests a specific RDF serialization.
# SOSA/SSN (W3C)
curl -fL -o sosa.ttl "http://www.w3.org/ns/sosa/" -H "Accept: text/turtle"
curl -fL -o ssn.ttl "http://www.w3.org/ns/ssn/" -H "Accept: text/turtle"
# GeoNames
curl -fL -o geonames_ontology.rdf "https://www.geonames.org/ontology/ontology_v3.3.rdf"
# LCC (OMG)
curl -fL -o lcc-cr.rdf "https://www.omg.org/spec/LCC/Countries/CountryRepresentation/"
curl -fL -o lcc-3166-1.rdf "https://www.omg.org/spec/LCC/Countries/ISO3166-1-CountryCodes/"
curl -fL -o lcc-lr.rdf "https://www.omg.org/spec/LCC/Languages/LanguageRepresentation/"
curl -fL -o lcc-639-1.rdf "https://www.omg.org/spec/LCC/Languages/ISO639-1-LanguageCodes/"
# EDM (Europeana - via Wayback Machine due to Cloudflare)
curl -fL -o edm.owl "https://web.archive.org/web/2023/http://www.europeana.eu/schemas/edm/rdf/edm.owl"
# OAI-ORE
curl -fL -o ore.rdf "http://www.openarchives.org/ore/terms/" -H "Accept: application/rdf+xml"
# WGS84 Geo
curl -fL -o wgs84_pos.rdf "https://www.w3.org/2003/01/geo/wgs84_pos" -H "Accept: application/rdf+xml"
```
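
A blocked or redirected request can leave an HTML page on disk under an ontology file name, so a quick check after downloading may be useful. The sketch below is a heuristic only, not the project's validation method. The function name `looks_like_rdf` is hypothetical, and the check merely confirms that a file is non-empty and does not start like an HTML document. It does not parse the RDF.

```bash
# Hypothetical helper: succeed only if the file is non-empty and its first
# 512 bytes do not look like the start of an HTML page.
looks_like_rdf() {
  local file="$1"
  [ -s "$file" ] || return 1
  if head -c 512 "$file" | grep -qiE '<!doctype html|<html'; then
    return 1
  fi
  return 0
}

# Check whichever of the downloaded files exist in the current directory.
for f in sosa.ttl ssn.ttl geonames_ontology.rdf lcc-cr.rdf lcc-3166-1.rdf \
         lcc-lr.rdf lcc-639-1.rdf edm.owl ore.rdf wgs84_pos.rdf; do
  [ -e "$f" ] || continue
  if looks_like_rdf "$f"; then
    echo "ok       $f"
  else
    echo "suspect  $f"
  fi
done
```

A file reported as `suspect` is worth opening by hand before it is committed. Full validation would need an RDF parser.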
## Notes
- **EDM** was downloaded via the Wayback Machine because europeana.eu is behind Cloudflare protection
- **PiCo** version 2023-09-21 was the current release when this catalog was last updated
- **LCC** provides ISO 3166 and ISO 639 codes as RDF - essential for country/language modeling
- **SOSA/SSN** is a W3C/OGC joint standard for sensor observations
- **GeoNames** ontology v3.3 (2022-01-30) includes feature codes for settlements
## See Also
- LinkML Schema: `schemas/20251121/linkml/01_custodian_name_modular.yaml`
- AGENTS.md: Project documentation for AI agents
- Base ontology usage: `data/ontology/` directory