Provenance Sources for PiCo Historical Document Examples
This document provides detailed provenance information for the real historical document sources used in the PiCo (Person in Context) ontology integration examples within the CH-Annotator convention.
Last Updated: 2025-12-12
Author: GLAM Project
Version: 1.0.0
Table of Contents
- Hebrew Ketubah (Jewish Marriage Contracts)
- Arabic Waqf Documents (Islamic Endowments)
- Ottoman Turkish Sijill (Sharia Court Registers)
- Russian Metrical Books (Church Records)
- Spanish Colonial Baptism Records
- Italian Notarial Records
- Greek Orthodox Church Records
- Dutch Civil Registry Records
- License and Attribution Requirements
1. Hebrew Ketubah (Jewish Marriage Contracts)
1.1 Yale Beinecke Library - Mashhad Ketubah (1896)
| Field | Value |
| --- | --- |
| Archive | Yale University, Beinecke Rare Book & Manuscript Library |
| Collection | Hebrew Manuscripts Supplement |
| Call Number | Hebrew MSS suppl 194 |
| Digital URL | https://digital.library.yale.edu/catalog/2067542 |
| Document Type | Ketubah (Jewish marriage contract) |
| Date | 23 Elul 5656 (September 1, 1896 CE) |
| Place | Mashhad, Iran |
| Language | Hebrew, Aramaic |
| Access Date | 2025-12-12 |
| License | Public Domain (pre-1929) |
Persons Identified:
- Groom: Mosheh ben Mashiah (משה בן משיאח)
- Bride: Rivkah bat Ya'akov (רבקה בת יעקב)
Notes: This ketubah is from the crypto-Jewish community of Mashhad, known as the Jadid al-Islam, who maintained Jewish practices in secret after forced conversion in 1839. The document follows standard Sephardic/Mizrahi ketubah format.
1.2 Philadelphia Mikveh Israel Ketubah (1842)
Key Features:
- Full Aramaic text transcription available
- English translation provided by archive
- Example of American Sephardic ketubah format
Sample Aramaic Text (from source):
בשבת... בשבת... יום... לחדש... שנת... לבריאת עולם למנין שאנו מונין כאן...
איך החתן... בר... אמר לה להדא בתולתא... בת...
1.3 College of Charleston Ketubah (1908)
| Field | Value |
| --- | --- |
| Archive | College of Charleston, Special Collections |
| Collection | Jewish Heritage Collection |
| Document Type | Ketubah |
| Date | 1908 CE |
| Language | Hebrew, Aramaic |
| Access Date | 2025-12-12 |
Persons Identified:
- Bride: Esther Devorah bat Rabbi Abraham (אסתר דבורה בת ר׳ אברהם)
- Groom: Rabbi Yitzchak (ר׳ יצחק)
1.4 Rhodes Jewish Museum Collection
| Field | Value |
| --- | --- |
| Archive | Rhodes Jewish Museum |
| Location | Rhodes, Greece |
| Collection | Historical Documents |
| Document Types | Ketubot, community records |
| Period | 19th-20th century |
| Language | Ladino, Hebrew, Greek |
Notes: Documents from the historic Sephardic Jewish community of Rhodes, with unique Ladino elements.
2. Arabic Waqf Documents (Islamic Endowments)
2.1 Cambridge Digital Library - Islamic Collections
| Field | Value |
| --- | --- |
| Archive | Cambridge University Library |
| Collection | Islamic Manuscripts |
| Digital URL | https://cudl.lib.cam.ac.uk/collections/islamic |
| Document Types | Waqfiyya, legal documents, correspondence |
| Period | 8th-20th century CE |
| Languages | Arabic, Persian, Ottoman Turkish |
| License | CC BY-NC 4.0 |
| Access Date | 2025-12-12 |
Key Collections:
- Genizah Collection (Cairo Genizah fragments)
- Arabic Scientific Manuscripts
- Islamic Legal Documents
2.2 UPenn OPenn - Manuscripts of the Muslim World
| Field | Value |
| --- | --- |
| Archive | University of Pennsylvania Libraries |
| Collection | Manuscripts of the Muslim World |
| Digital URL | https://openn.library.upenn.edu/html/muslimworld_contents.html |
| Document Types | Waqfiyya, Quranic manuscripts, legal documents |
| Period | 9th-20th century CE |
| Languages | Arabic, Persian, Ottoman Turkish |
| License | Public Domain / CC0 |
| Access Date | 2025-12-12 |
Notable Holdings:
- Waqfiyya documents from Egypt, Syria, Turkey
- Legal formularies with waqf templates
- Property deeds and endowment records
2.3 Singapore National Heritage Board - Istanbul Waqf
| Field | Value |
| --- | --- |
| Archive | Singapore National Heritage Board |
| Collection | Roots.gov.sg |
| Accession Number | 1115401 |
| Digital URL | https://www.roots.gov.sg/Collection-Landing/listing/1115401 |
| Document Type | Waqf document |
| Donor/Creator | Muhammad b. Abd al-Ghani (محمد بن عبد الغني) |
| Properties | Istanbul (various locations) |
| Language | Ottoman Turkish, Arabic |
| Access Date | 2025-12-12 |
Key Features:
- Complete waqf document with property descriptions
- Lists endowed properties in Istanbul
- Named beneficiaries and conditions
2.4 Haseki Sultan Waqfiyya (1552 CE)
| Field | Value |
| --- | --- |
| Archive | Various (studied in UC Berkeley eScholarship) |
| Document Type | Waqfiyya (imperial endowment deed) |
| Date | 1552 CE |
| Founder | Haseki Hürrem Sultan (Roxelana) |
| Language | Ottoman Turkish, Arabic |
| Research URL | UC Berkeley eScholarship |
Significance: One of the largest waqf endowments in Ottoman history, establishing charitable institutions across the empire.
3. Ottoman Turkish Sijill (Sharia Court Registers)
3.1 OpenJerusalem Project - Jerusalem Sharia Court Registers
| Field | Value |
| --- | --- |
| Archive | OpenJerusalem Project |
| Collection | Jerusalem Sharia Court Registers |
| Digital URL | https://www.openjerusalem.org/ |
| ARK Identifier | ark:/58142/PfV7b |
| Volume Count | 102 registers |
| Period | 1834-1920 CE |
| Language | Ottoman Turkish, Arabic |
| License | Open Access |
| Access Date | 2025-12-12 |
Document Types:
- Property sales (بيع)
- Marriage contracts (نكاح)
- Inheritance divisions (قسمة)
- Waqf registrations
- Debt acknowledgments (إقرار)
- Court testimonies (شهادة)
Key Features:
- Searchable database with document transcriptions
- Photographs of original registers
- Multi-language metadata (Arabic, English, French)
3.2 ISAM Istanbul Kadi Registers (Kadı Sicilleri)
| Field | Value |
| --- | --- |
| Archive | İslam Araştırmaları Merkezi (ISAM) |
| Collection | Istanbul Kadı Sicilleri |
| Digital URL | http://www.kadisicilleri.org/ |
| Volume Count | 40+ volumes online |
| Document Count | 40,000+ documents |
| Period | 16th-19th century CE |
| Language | Ottoman Turkish |
| License | Research access |
| Access Date | 2025-12-12 |
Coverage:
- Istanbul courts (multiple districts)
- Galata, Üsküdar, Eyüp
- Complete transcriptions with original images
3.3 Istanbul Historical Kadi Registers Corpus
Significance: Largest collection of Ottoman court records in existence.
3.4 Harvard Ottoman Court Records Project
| Field | Value |
| --- | --- |
| Archive | Harvard University |
| Project | Ottoman Court Records Project (OCRP) |
| Digital URL | https://cmes.fas.harvard.edu/projects/ocrp |
| Document Types | Sijill transcriptions, translations |
| Period | 16th-19th century CE |
| Languages | Ottoman Turkish (original), English (translations) |
3.5 Bulgarian National Library - Ottoman Sijills
| Field | Value |
| --- | --- |
| Archive | Bulgarian National Library |
| Collection | Oriental Department |
| Sijill Count | 160+ volumes |
| Defter Count | 1000+ registers |
| Coverage | Bulgarian Ottoman provinces |
| Period | 16th-19th century CE |
| Language | Ottoman Turkish, Arabic |
4. Russian Metrical Books (Church Records)
4.1 BYU Script Tutorial - Russian Metrical Books
Content Includes:
- Complete birth record format explanation
- Vocabulary lists with translations
- Sample transcriptions from actual metrical books
- Handwriting recognition guides
Sample Birth Record Structure (from tutorial):
В метрической книге записано:
Родился: [date]
Крещён: [date]
Имя: [name]
Родители: [father's full name with rank/status], законная жена его [mother's name]
Восприемники: [godparents]
Священник: [officiating priest]
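The labelled layout above maps naturally onto a key/value record. The following is a minimal sketch, assuming each field sits on its own line in `Label: value` form as in the tutorial layout. The English keys and the `parse_birth_record` helper are hypothetical names chosen for illustration, not part of PiCo or CH-Annotator, and the sample input reuses the template's placeholders rather than any historical data.

```python
# Hypothetical mapping from the Russian field labels in the template
# to English keys chosen here for illustration only.
FIELD_LABELS = {
    "Родился": "born",
    "Крещён": "baptized",
    "Имя": "name",
    "Родители": "parents",
    "Восприемники": "godparents",
    "Священник": "priest",
}


def parse_birth_record(text):
    """Split 'Label: value' lines into a dict keyed by English field name."""
    record = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        key = FIELD_LABELS.get(label.strip())
        # Skip lines without a colon or with a label outside the mapping.
        if sep and key:
            record[key] = value.strip()
    return record


# Placeholder values copied from the template; no historical data is implied.
template = """В метрической книге записано:
Родился: [date]
Крещён: [date]
Имя: [name]
Священник: [officiating priest]"""

record = parse_birth_record(template)
```

Real metrical book entries are usually written as running prose or in ruled columns, so a parser like this would only apply to transcriptions that have already been normalized into labelled fields.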
4.2 FamilySearch Russia Church Records
Key Information:
- Metrical books (метрические книги) mandated from 1722
- Three-part structure: births/baptisms, marriages, deaths
- Contains estate/class (сословие) information
4.3 Polish Archives - Kłobuck Parish Records
| Field | Value |
| --- | --- |
| Archive | Szukaj w Archiwach (Polish State Archives) |
| Parish | Kłobuck |
| Document Type | Roman Catholic metrical books |
| Period | 18th-19th century |
| Languages | Latin, Polish, Russian |
Notes: Example of Russian-era Polish parish records with parallel Latin/Russian entries.
4.4 RGIA St. Petersburg
| Field | Value |
| --- | --- |
| Archive | Russian State Historical Archive (RGIA) |
| Location | St. Petersburg, Russia |
| Holdings | 300+ metrical books |
| Period | 1832-1892 CE |
| Document Types | Orthodox, Catholic, Lutheran, Jewish metrical books |
5. Spanish Colonial Baptism Records
5.1 BYU Script Tutorial - Spanish Colonial Baptisms
Standard Baptism Entry Structure:
En [place] a [date] bauticé solemnemente a [name], [legitimacy status] de [father] y de [mother].
Fueron padrinos [godparents].
Y para que conste lo firmo.
[Priest signature]
Key Vocabulary:
- hijo/hija legítimo/a = legitimate child
- hijo/hija natural = illegitimate child
- párvulo/a = infant
- español/a, indio/a, mestizo/a, mulato/a = casta categories
- padrinos/madrinas = godparents
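The formulaic first sentence of the entry structure can be matched with named groups, and the legitimacy terms from the vocabulary list can be glossed with a lookup. This is a rough sketch only: `ENTRY_PATTERN`, `STATUS_GLOSS`, and `parse_baptism_entry` are hypothetical names, the pattern assumes the exact wording of the template, and the sample input keeps the template's placeholders instead of real names or dates.

```python
import re

# Glosses taken from the vocabulary list above.
STATUS_GLOSS = {
    "hijo legítimo": "legitimate child",
    "hija legítima": "legitimate child",
    "hijo natural": "illegitimate child",
    "hija natural": "illegitimate child",
}

# Non-greedy groups stop at the next fixed phrase of the formula.
ENTRY_PATTERN = re.compile(
    r"En (?P<place>.+?) a (?P<date>.+?) bauticé solemnemente a (?P<name>.+?), "
    r"(?P<status>.+?) de (?P<father>.+?) y de (?P<mother>.+?)\."
)


def parse_baptism_entry(text):
    """Return the named fields of a template-style entry, or None if no match."""
    match = ENTRY_PATTERN.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    # Gloss is None when the status is not in the vocabulary list.
    fields["status_gloss"] = STATUS_GLOSS.get(fields["status"].lower())
    return fields


# Placeholders stand in for the person, place, and date values.
entry = parse_baptism_entry(
    "En [place] a [date] bauticé solemnemente a [name], "
    "hija legítima de [father] y de [mother]."
)
```

Actual entries vary in abbreviation, spelling, and word order, so a fixed pattern like this is best treated as a starting point for normalized transcriptions rather than a general parser.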
5.2 FamilySearch Mexico - Yucatán Catholic Church Records
| Field | Value |
| --- | --- |
| Archive | FamilySearch |
| Collection | Mexico, Yucatán, Catholic Church Records, 1543-1977 |
| Collection ID | 1909116 |
| Digital URL | https://www.familysearch.org/en/search/collection/1909116 |
| Period | 1543-1977 CE |
| Document Types | Baptisms, marriages, deaths, confirmations |
| Language | Spanish, Latin, Maya |
| Access | Free with registration |
Coverage:
- 200+ parishes
- Some of earliest New World records (from 1543)
- Indigenous Maya populations
5.3 Archivo General de la Nación (AGN) Mexico
| Field | Value |
| --- | --- |
| Archive | Archivo General de la Nación |
| Location | Mexico City, Mexico |
| Holdings | Colonial parish records, civil registry |
| Period | 16th-20th century CE |
| Languages | Spanish, Nahuatl, Latin |
6. Italian Notarial Records
6.1 Antenati - Italian State Archives Portal
Venice State Archive Holdings:
- Civil Registry (Stato Civile) 1806-1815 (Napoleonic period)
- Notarial archives (Archivio Notarile)
- Guild records (Arti e Mestieri)
6.2 OAC California Digital Library - Italian Notarial Documents
| Field | Value |
| --- | --- |
| Archive | University of California Libraries |
| Collection | Italian Notarial Documents Collection |
| Finding Aid | https://oac.cdlib.org/findaid/ark:%2F13030%2Fc8v412zd |
| Document Count | 168 documents |
| Period | 1465-1635 CE |
| Locations | Venice, Padua, Verona |
| Languages | Latin, Italian (Venetian) |
| Access Date | 2025-12-12 |
Document Types:
- Contracts (contratti)
- Wills (testamenti)
- Property transfers
- Marriage agreements (sponsalia)
- Business partnerships
6.3 SION-Digit Project - Jewish Notarial Records
| Field | Value |
| --- | --- |
| Project | SION-Digit (Sources for the History of Italian Jewish Notarial Documents) |
| Coverage | Venice, Bordeaux, Amsterdam |
| Period | 16th-18th century CE |
| Focus | Jewish community notarial acts |
| Languages | Italian, Hebrew, Ladino |
7. Greek Orthodox Church Records
7.1 FamilySearch Greece Church Records
Key Information:
- Greek Orthodox church records are the primary source for the period before civil registration began in 1925
- Male registers (μητρώα αρρένων) for military service
- Some records in Ottoman Turkish for pre-independence period
7.2 General State Archives of Greece (GAK)
| Field | Value |
| --- | --- |
| Archive | Γενικά Αρχεία του Κράτους (GAK) |
| Document Types | Church records, civil registry, Ottoman-era documents |
| Period | 15th century - present |
| Languages | Greek, Ottoman Turkish |
7.3 Greek Ancestry Resources
| Field | Value |
| --- | --- |
| Resource | Greek Ancestry |
| Coverage | Village church records guide |
| Document Types | Baptismal registers, marriage registers |
| Key Features | Guides to accessing island and mainland records |
8. Dutch Civil Registry Records
8.1 WieWasWie (Dutch Genealogical Database)
| Field | Value |
| --- | --- |
| Archive | Centraal Bureau voor Genealogie (CBG) |
| Project | WieWasWie |
| Digital URL | https://www.wiewaswie.nl/ |
| Document Types | Birth, marriage, death certificates |
| Period | 1811-present (civil); 1600s+ (church) |
| Languages | Dutch |
| Access | Subscription / Free at archives |
8.2 Dutch Provincial Archives
| Province | Archive | Holdings |
| --- | --- | --- |
| Noord-Holland | Noord-Hollands Archief | Civil registry from 1811, church records from 1600s |
| Zuid-Holland | Nationaal Archief | Central government records |
| Gelderland | Gelders Archief | Regional archives |
| Noord-Brabant | Brabants Historisch Informatie Centrum | Catholic parish records |
8.3 Dutch Marriage Certificate Format
Standard 19th-Century Format:
Heden den [date] compareerden voor ons [official name],
Ambtenaar van den Burgerlijken Stand der Gemeente [municipality]:
De Bruidegom: [groom's name], oud [age] jaren, [occupation],
geboren te [birthplace], wonende te [residence],
zoon van [father] en van [mother];
De Bruid: [bride's name], oud [age] jaren,
geboren te [birthplace], wonende te [residence],
dochter van [father] en van [mother];
Getuigen: [4 witnesses with ages, occupations, relationships]
En hebben wij dit huwelijk voltrokken in tegenwoordigheid van voornoemde getuigen.
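The certificate format above has a fixed set of roles (official, groom, bride, witnesses), which makes it straightforward to model as a typed record before mapping it to an ontology. The sketch below is one possible shape, not a PiCo or CH-Annotator schema: the `Spouse` and `MarriageCertificate` classes and their field names are hypothetical, and all values are the template's placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Spouse:
    name: str
    age: str
    birthplace: str
    residence: str
    father: str
    mother: str
    # The template above gives an occupation only for the groom.
    occupation: Optional[str] = None


@dataclass
class MarriageCertificate:
    date: str
    official: str
    municipality: str
    groom: Spouse
    bride: Spouse
    witnesses: List[str] = field(default_factory=list)

    def has_expected_witnesses(self) -> bool:
        """The template above calls for four witnesses."""
        return len(self.witnesses) == 4


# Placeholder values only; no historical persons are represented.
cert = MarriageCertificate(
    date="[date]",
    official="[official name]",
    municipality="[municipality]",
    groom=Spouse("[groom's name]", "[age]", "[birthplace]", "[residence]",
                 "[father]", "[mother]", "[occupation]"),
    bride=Spouse("[bride's name]", "[age]", "[birthplace]", "[residence]",
                 "[father]", "[mother]"),
    witnesses=["[witness 1]", "[witness 2]", "[witness 3]", "[witness 4]"],
)
```

Keeping ages and dates as strings is deliberate here, since transcriptions often preserve the original wording rather than a normalized value.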
9. License and Attribution Requirements
Open Access Resources
| Source | License | Attribution Required |
| --- | --- | --- |
| Cambridge Digital Library | CC BY-NC 4.0 | Yes |
| UPenn OPenn | Public Domain / CC0 | No (but encouraged) |
| OpenJerusalem | Open Access | Yes |
| Antenati | Open Access | Yes |
| FamilySearch | Terms of Service | Yes |
| BYU Script Tutorial | Educational Use | Yes |
Recommended Citation Format
For PiCo extraction examples, use the following provenance block in YAML:
provenance:
  source_url: "https://example.org/document/12345"
  archive_name: "Example Archive"
  collection: "Collection Name"
  document_id: "Document Identifier"
  access_date: "2025-12-12"
  license: "CC BY-NC 4.0"
  attribution: "Courtesy of Example Archive. Used under CC BY-NC 4.0 license."
  notes: "Transcription verified against original digital image."
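A provenance block in this shape can be checked mechanically before an example is committed. The sketch below assumes the block has already been loaded into a Python dict and treats all eight keys as required, which is an assumption of this example and not a stated project rule. `REQUIRED_KEYS`, `missing_provenance_keys`, and `has_iso_access_date` are hypothetical names.

```python
from datetime import date

# Keys taken from the recommended provenance block; treating every one
# of them as mandatory is an assumption made for this sketch.
REQUIRED_KEYS = (
    "source_url",
    "archive_name",
    "collection",
    "document_id",
    "access_date",
    "license",
    "attribution",
    "notes",
)


def missing_provenance_keys(provenance):
    """List required keys that are absent or empty, in declaration order."""
    return [key for key in REQUIRED_KEYS if not provenance.get(key)]


def has_iso_access_date(provenance):
    """True if access_date is a valid YYYY-MM-DD string."""
    try:
        date.fromisoformat(provenance["access_date"])
    except (KeyError, TypeError, ValueError):
        return False
    return True


# The placeholder record from the citation format.
example = {
    "source_url": "https://example.org/document/12345",
    "archive_name": "Example Archive",
    "collection": "Collection Name",
    "document_id": "Document Identifier",
    "access_date": "2025-12-12",
    "license": "CC BY-NC 4.0",
    "attribution": "Courtesy of Example Archive. Used under CC BY-NC 4.0 license.",
    "notes": "Transcription verified against original digital image.",
}
```

Calling `missing_provenance_keys(example)` on the placeholder record returns an empty list, while a record with only `source_url` set reports the other seven keys.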
Data Fabrication Prohibition
CRITICAL: Per project rules (AGENTS.md Rule 21), all extraction examples MUST use real data from these verified sources. No fabrication of person names, dates, relationships, or document content is permitted.
When real data is not available from a source, the extraction example should be marked as:
provenance:
  source_url: null
  data_status: "SYNTHETIC_EXAMPLE"
  notes: "This example uses synthetic data for demonstration purposes only. Do not cite as historical evidence."
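The distinction between sourced and synthetic examples lends itself to an automated check. The following sketch assumes the provenance block is available as a Python dict with YAML `null` loaded as `None`. `classify_provenance` is a hypothetical helper, and treating a record that has both a source URL and the synthetic marker as invalid is an assumption of this example, not something Rule 21 is known to specify.

```python
SYNTHETIC_MARKER = "SYNTHETIC_EXAMPLE"


def classify_provenance(provenance):
    """Return 'real', 'synthetic', or 'invalid' for a provenance mapping."""
    has_source = bool(provenance.get("source_url"))
    is_marked_synthetic = provenance.get("data_status") == SYNTHETIC_MARKER

    if has_source and not is_marked_synthetic:
        return "real"
    if is_marked_synthetic and not has_source:
        return "synthetic"
    # Neither sourced nor marked synthetic, or contradictory flags.
    return "invalid"


synthetic = {
    "source_url": None,
    "data_status": "SYNTHETIC_EXAMPLE",
    "notes": "This example uses synthetic data for demonstration purposes only.",
}
status = classify_provenance(synthetic)
```

A check like this could run in continuous integration so that any example lacking both a source URL and the synthetic marker is flagged for review.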
Document Type Coverage Summary
| Document Type | Real Sources Available | Examples with Provenance |
| --- | --- | --- |
| Hebrew Ketubah | 4+ archives | Yale (1896), Philadelphia (1842) |
| Arabic Waqf | 3+ archives | Cambridge, UPenn, Singapore |
| Ottoman Sijill | 5+ archives | OpenJerusalem, ISAM, Harvard |
| Russian Metrical | 4+ archives | BYU Tutorial, RGIA |
| Spanish Colonial Baptism | 3+ archives | BYU Tutorial, FamilySearch |
| Italian Notarial | 3+ archives | Antenati, OAC/CDL |
| Greek Orthodox | 3+ archives | FamilySearch, GAK |
| Dutch Civil Registry | 3+ archives | WieWasWie, Provincial |
Changelog
| Date | Version | Changes |
| --- | --- | --- |
| 2025-12-12 | 1.0.0 | Initial compilation of provenance sources |