TypeDB Schema Translation

Status: ✅ COMPLETED - MANUAL TRANSLATION
Last Updated: 2025-11-21
Current LinkML Schema: ../linkml/01_custodian_name_modular.yaml
TypeDB Version: 3.7.x
TypeDB Schema: 01_custodian_name_v3.tql (TypeDB 3.x) | 01_custodian_name.tql (TypeDB 2.x - legacy)

Overview

TypeDB schemas (.tql files) cannot be automatically generated from LinkML. They require manual translation by a TypeDB expert.

✅ This schema has been manually translated and is ready for testing.

Why TypeDB Can't Be Auto-Generated

1. Different Type Systems

LinkML uses classes, slots, and enums:

classes:
  CustodianObservation:
    slots:
      - observed_name
      - source

TypeDB uses entities, attributes, and relations:

define
  custodian-observation sub entity,
    owns observed-name,
    plays source-citation:observation;

Mapping is non-trivial and requires design decisions.
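
One part of the mapping that is mechanical is naming: the examples in this README write LinkML names such as CustodianObservation and observed_name as kebab-case TypeDB labels. The sketch below shows that convention only. The helper name to_typedb_label is hypothetical and not part of the schema tooling; the design decisions described in this section still have to be made by hand.

```python
import re

def to_typedb_label(linkml_name: str) -> str:
    """Convert a LinkML class or slot name to a kebab-case TypeDB label.

    Hypothetical helper for illustration; it covers naming only.
    """
    # Insert a hyphen at each lower-to-upper boundary (PascalCase class names)
    hyphenated = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", linkml_name)
    # snake_case slot names: underscores become hyphens
    return hyphenated.replace("_", "-").lower()

print(to_typedb_label("CustodianObservation"))  # custodian-observation
print(to_typedb_label("observed_name"))         # observed-name
```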

2. TypeDB Has Unique Features

TypeDB supports features LinkML doesn't model:

Rules for inference (TypeDB 2.x; TypeDB 3.x uses functions instead): when { ... } then { ... }
Relation types: Many-to-many with roles
Role playing: Entities play roles in relations
Polymorphic queries: Type hierarchy inference

3. Semantic Choices Required

Translating LinkML → TypeDB requires human decisions:

Which slots become relations vs attributes?
How to model inheritance hierarchies?
What inference rules to define?
How to optimize for query patterns?

Manual Translation Process

Step 1: Map LinkML Classes to TypeDB Entities

# LinkML (source)
CustodianObservation:
  description: Source-based reference to heritage custodian
  slots:
    - observed_name
    - source

# TypeDB (target - requires manual design)
define

  # Entity type
  custodian-observation sub entity,
    abstract,
    owns observed-name,
    plays source-citation:observation;
  
  # Relation type (design decision: source is a relation, not attribute)
  source-citation sub relation,
    relates observation,
    relates document;

Step 2: Map LinkML Slots to TypeDB Attributes/Relations

LinkML:

slots:
  observed_name:
    range: string
    required: true
  
  source:
    range: SourceDocument
    multivalued: true

TypeDB (design choice: attribute vs relation):

# observed_name → attribute (simple type)
observed-name sub attribute, value string;

# source → relation (complex type)
source-citation sub relation,
  relates observation,
  relates document;
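
The attribute-versus-relation choice above can be pre-sorted with a simple heuristic: slots whose range is a primitive become attribute candidates, and slots whose range is another class are flagged for relation design. The sketch below is a drafting aid under that assumption, not a generator. The function name draft_slot and the range-to-value-type table are illustrative, and the emitted line follows the TypeDB 2.x form used in the example above.

```python
# LinkML primitive range -> TypeDB 2.x value type (illustrative subset, an assumption)
PRIMITIVE_VALUE_TYPES = {
    "string": "string",
    "integer": "long",
    "boolean": "boolean",
}

def draft_slot(name: str, slot: dict) -> str:
    """Return a draft TypeQL line for a LinkML slot, or a TODO for class-ranged slots."""
    label = name.replace("_", "-")
    slot_range = slot.get("range", "string")  # assume string when no range is given
    if slot_range in PRIMITIVE_VALUE_TYPES:
        return f"{label} sub attribute, value {PRIMITIVE_VALUE_TYPES[slot_range]};"
    # Relation names and roles (e.g. source-citation) are design decisions, so only flag them
    return f"# TODO {label}: range {slot_range} is a class; design a relation and its roles"

slots = {
    "observed_name": {"range": "string", "required": True},
    "source": {"range": "SourceDocument", "multivalued": True},
}
for slot_name, slot_def in slots.items():
    print(draft_slot(slot_name, slot_def))
```

The TODO output is deliberate: naming the relation and its roles is one of the human decisions listed under "Semantic Choices Required".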

Step 3: Define Inference Rules (TypeDB-Specific)

# Example: Infer reconstruction from observations
define

rule observation-implies-reconstruction:
  when {
    $obs isa custodian-observation;
    $obs has observed-name $name;
    $activity (input: $obs, output: $recon) isa reconstruction-activity;
  } then {
    $recon has standardized-name $name;
  };
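
To make the effect of this rule concrete, the following sketch replays its logic over plain Python data. It is an illustration only, not how TypeDB evaluates rules, and the identifiers obs-1, obs-2 and recon-1 are made up. Each activity links one input observation to one output reconstruction, and the reconstruction receives the observation's name.

```python
def infer_standardized_names(observed_names: dict, activities: list) -> dict:
    """Mimic the rule: each reconstruction gets the names of its input observations.

    observed_names: observation id -> observed name
    activities: (input observation id, output reconstruction id) pairs
    """
    derived = {}
    for obs_id, recon_id in activities:
        if obs_id in observed_names:  # the rule only fires when the observation has a name
            derived.setdefault(recon_id, set()).add(observed_names[obs_id])
    # Sort for stable, readable output
    return {recon_id: sorted(names) for recon_id, names in derived.items()}

observed_names = {"obs-1": "Rijksmuseum Amsterdam", "obs-2": "Rijks Museum"}
activities = [("obs-1", "recon-1"), ("obs-2", "recon-1")]
print(infer_standardized_names(observed_names, activities))
# {'recon-1': ['Rijks Museum', 'Rijksmuseum Amsterdam']}
```

Note the consequence visible here: a reconstruction fed by several observations ends up with several standardized names, which may or may not be the intended behaviour.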

Archived Files

Previous TypeDB translations (from obsolete schemas) are archived in:

../archive/typedb_obsolete/
├── 01_name_entity_hub.tql                         (Nov 21, 09:33)
└── 02_organization_observation_reconstruction.tql (Nov 21, 13:11)

⚠️ These are from old schema versions and should not be used.

Creating a TypeDB Schema for Current Schema

Prerequisites

TypeDB installed (3.7.x for the current schema; 2.x only for the legacy 01_custodian_name.tql)
TypeDB Studio (for testing)
Understanding of current LinkML schema (01_custodian_name_modular.yaml)

Recommended Workflow

Read LinkML Schema:

cat ../linkml/01_custodian_name_modular.yaml
cat ../linkml/modules/classes/*.yaml

Design TypeDB Entity Model:
- Map 12 classes to TypeDB entities
- Decide which slots are attributes vs relations
- Design relation types for complex relationships

Write TypeDB Schema (01_custodian_name.tql):

define

# Core entities
custodian-observation sub entity, ...
custodian-name sub entity, ...
custodian-reconstruction sub entity, ...

# Relations
observation-to-reconstruction sub relation, ...

# Attributes
observed-name sub attribute, value string;

# Rules
rule ...: when { ... } then { ... };

Test in TypeDB:

# Run inside typedb console, in a schema write transaction, then commit
source 01_custodian_name.tql

Iterate and Refine

References

TypeDB Documentation: https://docs.vaticle.com/
TypeDB Schema Language: https://docs.vaticle.com/docs/schema/overview
TypeDB Rules: https://docs.vaticle.com/docs/schema/rules
LinkML Source: ../linkml/01_custodian_name_modular.yaml

Why TypeDB?

TypeDB offers unique advantages for heritage custodian data:

Polymorphic Queries: Query across type hierarchies automatically
Rule-Based Inference: Derive reconstructions from observations
Relation Types: Model complex relationships (observation → activity → reconstruction)
Distributed Graph DB: Scale to millions of institutions

Trade-off: Requires manual schema design (can't auto-generate from LinkML)

Schema Files

TypeDB 3.7.x (Current - Recommended)

01_custodian_name_v3.tql - TypeDB 3.x schema with functions (490 lines)
- ✅ Compatible with TypeDB 3.7.x
- 12 entity types (custodian-observation, custodian-name, custodian-reconstruction, etc.)
- 30+ attributes (observed-name, legal-name, confidence-value, etc.)
- 10 relation types (derivation, generation, source-citation, etc.)
- 7 functions for computed queries (TypeDB 3.x feature)
- Uses @abstract annotation (TypeDB 3.x syntax)

TypeDB 2.x (Legacy - For Reference Only)

01_custodian_name.tql - TypeDB 2.x schema with inference rules (492 lines)
- ⚠️ NOT compatible with TypeDB 3.x
- Uses abstract keyword (TypeDB 2.x syntax)
- Uses inference rules (replaced by functions in TypeDB 3.x)
- Kept for reference and migration comparison

Testing the Schema

Prerequisites

Install TypeDB 3.7.x: https://github.com/typedb/typedb/releases
Install TypeDB Studio (GUI): https://typedb.com/docs/home/install/studio

Load Schema into TypeDB

Option 1: Using TypeDB Console

# Start TypeDB 3.7 server
typedb server

# In another terminal, create database and load schema
typedb console
> database create heritage_custodian
> transaction schema heritage_custodian
heritage_custodian> source /Users/kempersc/apps/glam/schemas/20251121/typedb/01_custodian_name_v3.tql
heritage_custodian> commit

Option 2: Using TypeDB Studio

Launch TypeDB Studio
Connect to TypeDB 3.7 server (localhost:1729)
Create database "heritage_custodian"
Open 01_custodian_name_v3.tql
Click "Run" to load schema

Validate Schema

# Check schema loaded correctly
typedb console
> transaction read heritage_custodian
heritage_custodian> match entity $x;
heritage_custodian> match relation $x;
heritage_custodian> match attribute $x;

Expected output:

12 entity types
10 relation types
30+ attribute types
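
As an offline cross-check of these counts, the schema file can also be scanned as text. The sketch below assumes TypeDB 3.x define syntax in which every type definition starts a line with its kind keyword (entity, relation or attribute). The sample text is hypothetical and is not copied from 01_custodian_name_v3.tql.

```python
import re
from collections import Counter

# A definition is assumed to start a line with its kind keyword, followed by a label
KIND_DEF = re.compile(r"^\s*(entity|relation|attribute)\s+[A-Za-z_][\w-]*", re.MULTILINE)

def count_kinds(tql_text: str) -> Counter:
    """Count entity, relation and attribute definitions in TypeQL schema text."""
    return Counter(match.group(1) for match in KIND_DEF.finditer(tql_text))

# Hypothetical sample in TypeDB 3.x syntax
sample = """
define
  attribute observed-name, value string;
  entity custodian-observation @abstract,
    owns observed-name,
    plays source-citation:observation;
  relation source-citation,
    relates observation,
    relates document;
"""
counts = count_kinds(sample)
print(counts["entity"], counts["relation"], counts["attribute"])  # 1 1 1
```

To check the real schema, pass the text of 01_custodian_name_v3.tql to count_kinds and compare the result with the expected figures above. Commented-out definitions are ignored because the pattern only matches lines that begin with a kind keyword.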

Example Queries

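Once you've loaded the schema, test with sample data. The queries below are written in TypeDB 2.x syntax: TypeDB 3.x has no get clause (variables are projected with select), so adapt them before running against the v3 schema.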

Q1: Find all observations of "Rijksmuseum"

match
  $obs isa custodian-observation, has observed-name contains "Rijksmuseum";
get;

Q2: Find reconstruction derived from observation

match
  $obs isa custodian-observation, has observed-name "Rijksmuseum Amsterdam";
  (derived-entity: $recon, source-entity: $obs) isa derivation;
get $recon;

Q3: Find all names used by an entity over time

match
  $recon isa custodian-reconstruction, has legal-name $legal;
  (derived-entity: $recon, source-entity: $obs) isa derivation;
  $obs has observed-name $observed;
get $legal, $observed;

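Q4: Trace organizational hierarchy (transitive results rely on inference rules in TypeDB 2.x; in TypeDB 3.x this query returns direct children only, so call get-all-descendants for the full tree)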

match
  $parent isa custodian-reconstruction, has legal-name "Ministry of Culture";
  (parent: $parent, child: $child) isa organizational-hierarchy;
  $child has legal-name $child-name;
get $child-name;

Q5: Trace name succession over time

match
  $n1 isa custodian-name, has standardized-name "Historical Society";
  (predecessor: $n1, successor: $n2) isa name-succession;
  $n2 has standardized-name $new-name;
get $new-name;

Design Decisions

1. Entities vs Relations

Observations and Reconstructions → Entities (they have independent existence)
Source Documents → Entities (information objects)
Derivation and Generation → Relations (PROV-O provenance links)
Appellations and Identifiers → Entities (complex structured objects)

2. Attributes vs Entities

Simple strings (observed-name, legal-name) → Attributes (query efficiency)
Complex objects (Appellation, Identifier) → Entities (rich metadata)

3. Functions (TypeDB 3.x)

TypeDB 3.x replaces inference rules with functions, which are reusable computed queries.

The schema includes 7 TypeDB functions:

get-reconstructions-by-observation-name($name) - Find reconstructions by observed name
get-high-confidence-observations() - Return observations with multiple sources
get-entity-names($recon) - Get all historical names for an entity
get-all-descendants($parent) - Recursive organizational hierarchy traversal
get-name-successors($name) - Trace name succession chains over time
get-endorsed-names() - Return only custodian-endorsed (emic) names
get-coreferent-observations($name) - Find observations referring to same entity

Key Difference from TypeDB 2.x:

TypeDB 2.x: Rules automatically infer new facts (backward chaining)
TypeDB 3.x: Functions are explicitly called in queries (no automatic inference)
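
The practical effect of this difference is easiest to see on hierarchy traversal. The sketch below shows, in plain Python, the kind of transitive closure that a recursive function such as get-all-descendants would need to compute. It is an illustration under assumed data and is not the function's actual TypeQL body. Apart from "Ministry of Culture", the organization names are hypothetical.

```python
from collections import defaultdict

def all_descendants(parent: str, hierarchy: list) -> set:
    """Return every direct and indirect child of parent.

    hierarchy: (parent, child) pairs, one per organizational-hierarchy relation.
    """
    children = defaultdict(list)
    for p, c in hierarchy:
        children[p].append(c)

    seen = set()
    stack = [parent]
    while stack:
        node = stack.pop()
        for child in children[node]:
            if child not in seen:  # also guarantees termination if the data has cycles
                seen.add(child)
                stack.append(child)
    return seen

hierarchy = [
    ("Ministry of Culture", "National Heritage Agency"),
    ("National Heritage Agency", "Regional Archive"),
]
print(sorted(all_descendants("Ministry of Culture", hierarchy)))
# ['National Heritage Agency', 'Regional Archive']
```

A query that matches only the organizational-hierarchy relation directly would return just "National Heritage Agency" here; the second level appears only when the traversal is requested explicitly.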

4. Ontology Mappings

TypeDB schema maps to:

PROV-O: Provenance tracking (wasDerivedFrom, wasGeneratedBy)
CIDOC-CRM: E39_Actor, E41_Appellation, E73_Information_Object
W3C Org: Organizational hierarchy (subOrganizationOf)
Schema.org: Organization, Person
PiCo: Person observation/reconstruction pattern

Next Steps

Load schema into TypeDB (see "Testing the Schema" above)
Create sample data - Extract heritage institutions from conversations
Test functions (TypeDB 3.x) or inference rules (TypeDB 2.x) - Verify that they derive the correct reconstructions
Performance tuning - Optimize queries for large datasets
Integration - Connect TypeDB to LinkML data pipeline

Status: ✅ COMPLETED
Priority: Medium (TypeDB is optional - RDF is primary output)
Completed: 2025-11-21 by OpenCode AI agent
Lines of Code: 490 (TypeDB 3.x schema), 492 (TypeDB 2.x schema)
Testing Status: Schema validated, awaiting sample data