# Publications Dataset

This directory contains bibliographic metadata for academic publications in LinkML format, demonstrating the project's bibliographic schema (`schemas/bibliographic.yaml`).

## Overview

**Purpose**: Store structured metadata about academic publications, including journal articles, conference papers, books, and their citation relationships.

**Schema**: `/schemas/bibliographic.yaml` (based on FaBiO, CiTO, BIBO, FRBR ontologies)

**Current Dataset Size**:
- **20 publications** (6 journal articles, 3 conference papers, 1 data paper, 2 books, 2 book chapters, 2 technical reports, 4 preprints)
- **27 citation relationships** (cross-references between publications)
- **60+ unique authors** with institutional affiliations (universities, heritage institutions)
- **7 journals** (referenced from `/data/instances/journals/`)
- **5 conferences** (referenced from `/data/instances/conferences/`)
- **5 heritage institutions** linked as author affiliations

**Publication Type Distribution**:

| Publication Type | Count | Examples |
|------------------|-------|----------|
| Journal Articles | 6 | 3 semantic web (Knowledge Graphs, Wikidata, LOKG) + 3 heritage-linked (Rembrandt analysis, NHA digital, Collection Management Systems) |
| Conference Papers | 3 | ISWC 2024 (Best Paper), ISWC 2023 (Best Paper), Dutch GLAM Consortium |
| Books | 2 | Linked Data for Museums, Digital Preservation Handbook |
| Book Chapters | 2 | Crowdsourcing Metadata, Archival Appraisal |
| Technical Reports | 2 | KB 3D Digitization, Europeana QA Framework |
| Preprints | 4 | arXiv (GNN provenance), OSF/SocArXiv (LLM cataloging), bioRxiv (Ancient DNA), arXiv (2nd paper TBD) |
| Data Papers | 1 | Brazilian LOKG Subset (TGDK journal) |
| **TOTAL** | **20** | Diverse representation of scholarly output types |

**Citation Network Statistics**:
- **Total citations**: 27 relationships
- **Publications with citations**: 19 (1 bioRxiv paper unlinked - outside scope)
- **Citation density**: 1.42 citations per publication
- **Most cited works**:
  1. Knowledge Graphs (2021) - 8 citations
  2. Wikidata (2018) - 6 citations
  3. LOKG (2024) - 5 citations
- **Citation types used**: 6 distinct CiTO types (CITES_AS_AUTHORITY, CITES_AS_EVIDENCE, DISCUSSES, EXTENDS, USES_DATA_FROM, CITES_AS_METADATA)

## Files

### 1. `semantic_web_papers.yaml` (379 lines) ✅

**Notable semantic web publications demonstrating schema patterns**

**Publications included**:

| Title | Type | Journal/Conf | Year | Authors | DOI |
|-------|------|-------------|------|---------|-----|
| Knowledge Graphs | Journal Article | Semantic Web Journal | 2021 | 18 authors | 10.3233/SW-222793 |
| Wikidata: A Free Collaborative Knowledgebase | Journal Article | Journal of Web Semantics | 2018 | 2 authors | 10.1016/j.websem.2018.08.002 |
| The LOKG | Journal Article | TGDK | 2024 | 4 authors (synthetic) | 10.4230/TGDK.2.1.3 |
| Relationships are Complicated! | Conference Paper | ISWC 2024 (Best Paper) | 2024 | 3 authors | - |
| Spatial Link Prediction | Conference Paper | ISWC 2023 (Best Paper) | 2023 | 4 authors | 10.1007/978-3-031-47240-4_9 |

**Schema patterns demonstrated**:
- Multi-author publications (up to 18 authors)
- ORCID identifiers for authors
- Institutional affiliations (universities, research institutes, corporations)
- DOI identifiers
- Journal article metadata (volume, issue, page range)
- Conference paper metadata (proceedings, best paper awards)
- Open access status tracking
- Abstract text

### 2. `citation_relationships.yaml` (174 lines) ✅

**Citation relationships between publications using CiTO (Citation Typing Ontology)**

**Citation patterns included**:
- **27 citation relationships** linking 19 publications (5 semantic web + 5 heritage-linked + 9 diverse publications)
- **Citation types**: CITES_AS_AUTHORITY, CITES_AS_EVIDENCE, DISCUSSES, EXTENDS, CITES_AS_METADATA, USES_DATA_FROM
- **Citation context**: Textual excerpts showing how works cite each other
- **Citation intent**: Purpose and reasoning for citations
- **Page numbers**: Specific location of citations in citing work
- **Citation density**: 1.42 citations per publication (27 citations / 19 linked publications)

**Citation network**:

```
Semantic Web Publications:
  Knowledge Graphs (2021) ──cites──> Wikidata (2018)
                          └─self-cites (section reference)
                          [Most cited: 8 citations total]

  LOKG (2024) ──cites──> Knowledge Graphs (2021)
              ──cites──> Wikidata (2018)
              └─cites──> Spatial Link Prediction (2023)
              [Third most cited: 5 citations total]

  ISWC 2024 Paper ──cites──> Knowledge Graphs (2021)
                  └─extends──> ISWC 2023 Paper

Heritage-Linked Publications:
  Brazilian LOKG Subset (2024) ──extends──> LOKG (2024)

  Dutch GLAM Consortium (2023) ──cites──> Knowledge Graphs (2021)
                               └─cites──> Wikidata (2018)

  Rembrandt Analysis (2024) ──uses_data──> Wikidata (2018)

  NHA Digital Transformation (2023) ──discusses──> LOKG (2024)

  Collection Management Systems (2024) ──cites──> Wikidata (2018)
                                       └─cites──> Knowledge Graphs (2021)

Diverse Publications (Books, Reports, Chapters, Preprints):
  Linked Data for Museums (Book) ──cites──> Knowledge Graphs (2021)
                                 └─cites──> Wikidata (2018)

  Digital Preservation Handbook ──cites──> LOKG (2024)

  KB 3D Digitization Report ──discusses──> LOKG (2024)

  Europeana QA Framework ──cites──> Knowledge Graphs (2021)
                         └─cites──> LOKG (2024)

  Crowdsourcing Metadata Chapter ──cites──> Wikidata (2018)

  Archival Appraisal Chapter ──discusses──> Knowledge Graphs (2021)

  arXiv GNN Provenance ──cites──> Knowledge Graphs (2021)
                       └─uses_data──> Wikidata (2018)

  OSF LLM Cataloging ──discusses──> Knowledge Graphs (2021)

  bioRxiv Ancient DNA (unlinked - genomics focus, not heritage knowledge graphs)
```

**Most Cited Publications**:
1. Knowledge Graphs (2021) - 8 citations
2. Wikidata (2018) - 6 citations
3. LOKG (2024) - 5 citations

### 3. `heritage_linked_publications.yaml` (206 lines) ✅

**Publications with authors affiliated at heritage institutions**

**Demonstrates heritage-bibliographic integration patterns**:

| Title | Type | Authors | Heritage Institution | Year |
|-------|------|---------|---------------------|------|
| Digital Analysis of Rembrandt's Brushwork | Journal | Rijksmuseum researcher + UvA | Rijksmuseum | 2024 |
| Democratizing Access: NHA Digital Transformation | Journal | 2 Noord-Hollands Archief archivists | Noord-Hollands Archief | 2023 |
| Brazilian Cultural Heritage in the LOKG | Data Paper | USP researcher + BNB librarian | Biblioteca Nacional do Brasil | 2024 |
| The Dutch GLAM Consortium | Conference | KB director + Rijksmuseum curator + NA archivist | KB, Rijksmuseum, Nationaal Archief | 2023 |
| Comparative Analysis of Collection Management Systems | Journal | Paris-Sorbonne + KB librarian | Koninklijke Bibliotheek | 2024 |

**Integration patterns**:
- **Pattern 1**: Researcher at heritage institution as primary author
- **Pattern 2**: Multiple staff from same heritage institution as co-authors
- **Pattern 3**: Heritage institution staff collaborating with university researcher
- **Pattern 4**: Multi-institutional consortium (3+ heritage institutions)
- **Pattern 5**: International collaboration (foreign researcher + local heritage institution)

## Schema Reference

### Publication Class

**Required fields**:

```yaml
publication_id: https://w3id.org/heritage/publication/[unique-id]
title: "Publication Title"  # NOT publication_title!
publication_type: JOURNAL_ARTICLE  # Enum: JOURNAL_ARTICLE, CONFERENCE_PAPER, BOOK, etc.
```

**Key fields**:

```yaml
authors:  # List of Person objects
  - person_id: https://orcid.org/0000-0002-XXXX-XXXX  # ORCID preferred
    person_name: "Author Name"
    orcid: "0000-0002-XXXX-XXXX"  # Separate field from person_id
    affiliation:  # SINGULAR Organization object (NOT affiliations array!)
      organization_name: "University Name"
      organization_type: "University"

published_in: https://w3id.org/heritage/journal/[journal-id]  # String ID reference, NOT nested object!
volume: "12"
issue: "3"
page_range: "1-94"  # NOT 'pages'!

doi: "10.1234/example.doi"  # Separate field (NOT in identifiers array)
url: "https://..."  # Separate field

abstract: "Full abstract text..."

provenance:  # NO 'notes' field! Use 'description' in parent object instead
  data_source: CONVERSATION_NLP
  data_tier: TIER_2_VERIFIED
  extraction_date: "2025-11-09T21:00:00Z"
```

### Citation Class

**Required fields**:

```yaml
citation_id: https://w3id.org/heritage/citation/[unique-id]
citing_work: https://w3id.org/heritage/publication/[citing-pub-id]  # Required
cited_work: https://w3id.org/heritage/publication/[cited-pub-id]  # Required
citation_type: CITES_AS_AUTHORITY  # Required enum
```

**Optional enrichment fields**:

```yaml
citation_intent: "Purpose/reasoning for this citation..."
citation_context: "Textual excerpt showing the citation..."
page_number: "23"  # Page where citation appears
```

### Citation Types (CiTO Ontology)

| Type | Description | Example Use |
|------|-------------|-------------|
| `CITES` | Generic citation | Standard reference |
| `CITES_AS_AUTHORITY` | Cites as authoritative source | Citing foundational theory |
| `CITES_AS_EVIDENCE` | Cites as evidence | Supporting empirical claims |
| `CITES_AS_METADATA` | Cites for metadata/provenance | Dataset documentation |
| `DISCUSSES` | Discusses the cited work | Critical analysis |
| `EXTENDS` | Extends the cited work | Building on prior work |
| `USES_DATA_FROM` | Uses data from the cited work | Reusing a published dataset |
| `SUPPORTS` | Provides support for claims | Corroborating findings |
| `REFUTES` | Refutes or disputes | Contradicting claims |
| `CRITIQUES` | Critiques cited work | Identifying limitations |
| `AGREES_WITH` | Agrees with cited work | Confirming findings |

## Schema Quirks and Common Errors

### ❌ Common Mistakes

**1. Wrong field names**:

```yaml
# WRONG
publication_title: "Title"  # Field doesn't exist!
pages: "1-94"  # Should be 'page_range'
affiliations: [...]  # Should be singular 'affiliation'

# CORRECT
title: "Title"
page_range: "1-94"
affiliation: {...}
```

**2. Wrong `published_in` structure**:

```yaml
# WRONG - Nested object
published_in:
  journal_id: https://...
  journal_title: "Journal Name"
  volume: "12"

# CORRECT - String ID reference
published_in: https://w3id.org/heritage/journal/semantic-web
volume: "12"  # Volume at Publication level, not nested
```

**3. Wrong identifier handling**:

```yaml
# WRONG - DOI in identifiers array
identifiers:
  - identifier_scheme: DOI
    identifier_value: "10.1234/..."

# CORRECT - DOI as separate field
doi: "10.1234/..."
```

**4. Provenance notes**:

```yaml
# WRONG - Provenance has no 'notes' field
provenance:
  data_source: CONVERSATION_NLP
  notes: "Some observation"  # This will fail validation!

# CORRECT - Use 'description' at Publication level
description: "Notes and remarks about this publication"
provenance:
  data_source: CONVERSATION_NLP
```

### ✅ Schema Validation Checklist

Before committing new publications:

- [ ] `title` field (NOT `publication_title`)
- [ ] `published_in` is a string ID (NOT nested object)
- [ ] `affiliation` is singular object (NOT `affiliations` array)
- [ ] `page_range` (NOT `pages`)
- [ ] `doi` and `url` are separate fields (NOT in `identifiers`)
- [ ] `provenance` has no `notes` field
- [ ] All `publication_id`, `person_id`, `journal_id` use valid URIs
- [ ] `publication_type` is valid enum value
- [ ] Authors have either ORCID or local ID
- [ ] File validates with: `linkml-validate -s schemas/bibliographic.yaml -C Publication <file>`

## Validation Commands

### Validate Publications

```bash
# Repository root (adjust to your local checkout)
cd /Users/kempersc/apps/glam
linkml-validate -s schemas/bibliographic.yaml -C Publication \
  data/instances/publications/semantic_web_papers.yaml
```

### Validate Citations

```bash
linkml-validate -s schemas/bibliographic.yaml -C Citation \
  data/instances/publications/citation_relationships.yaml
```

### Validate Journals

```bash
linkml-validate -s schemas/bibliographic.yaml -C Journal \
  data/instances/journals/semantic_web_journals.yaml
```

### Validate Conferences

```bash
linkml-validate -s schemas/bibliographic.yaml -C Conference \
  data/instances/conferences/semantic_web_conferences.yaml
```

## Adding New Publications

### Step 1: Gather Metadata

**Required information**:
- Title, authors, publication date
- Publication type (journal article, conference paper, etc.)
- Journal or conference (must reference existing entity in `journals/` or `conferences/`)
- DOI (if available)

**Recommended information**:
- Author ORCID identifiers
- Author institutional affiliations
- Abstract text
- Volume, issue, page numbers
- URL to full text
- Open access status

### Step 2: Create Publication Record

Follow the schema patterns in `semantic_web_papers.yaml`:

```yaml
- publication_id: https://w3id.org/heritage/publication/[unique-id]
  title: "Your Publication Title"
  publication_type: JOURNAL_ARTICLE  # or CONFERENCE_PAPER, BOOK, etc.
  publication_date: "2024-11-09"

  authors:
    - person_id: https://orcid.org/0000-0002-XXXX-XXXX
      person_name: "First Author"
      orcid: "0000-0002-XXXX-XXXX"
      affiliation:
        organization_name: "University Name"
        organization_type: "University"

  published_in: https://w3id.org/heritage/journal/[journal-id]
  volume: "15"
  issue: "2"
  page_range: "123-145"

  doi: "10.1234/example.doi"
  url: "https://..."

  abstract: "Full abstract text..."

  provenance:
    data_source: MANUAL_CURATION  # or CONVERSATION_NLP, WEB_SCRAPING, etc.
    data_tier: TIER_2_VERIFIED
    extraction_date: "2024-11-09T12:00:00Z"
    extraction_method: "Manual entry from published source"
```

### Step 3: Create Citation Relationships (Optional)

If the new publication cites existing publications (or vice versa):

```yaml
- citation_id: https://w3id.org/heritage/citation/[unique-id]
  citing_work: https://w3id.org/heritage/publication/[new-pub-id]
  cited_work: https://w3id.org/heritage/publication/[existing-pub-id]
  citation_type: CITES_AS_AUTHORITY  # Choose appropriate type
  citation_intent: "Why this citation exists..."
  citation_context: "Textual excerpt around the citation..."
  page_number: "15"
```

### Step 4: Validate

Run validation before committing:

```bash
linkml-validate -s schemas/bibliographic.yaml -C Publication \
  data/instances/publications/your_file.yaml
```

Fix any validation errors (see "Schema Quirks" section above).
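The field-name pitfalls documented in the "Schema Quirks" section can be caught with a quick pre-check before running `linkml-validate`. The following is a minimal sketch and not part of the project tooling: the function name `find_quirks` and the sample record are hypothetical, and it only looks for the specific mistakes this README documents, so it complements schema validation rather than replacing it.

```python
"""Pre-check a publication record for the field-name mistakes listed in this README."""

# Wrong field name -> correct field name, taken from the "Common Mistakes" section.
RENAMED_FIELDS = {
    "publication_title": "title",
    "pages": "page_range",
}


def find_quirks(record):
    """Return a list of human-readable problems found in one publication dict."""
    problems = []

    for wrong, right in RENAMED_FIELDS.items():
        if wrong in record:
            problems.append(f"'{wrong}' should be '{right}'")

    # published_in must be a string ID, not a nested journal object.
    if isinstance(record.get("published_in"), dict):
        problems.append("'published_in' must be a string ID, not a nested object")

    # DOI belongs in the top-level 'doi' field, not in an identifiers array.
    for ident in record.get("identifiers") or []:
        if isinstance(ident, dict) and ident.get("identifier_scheme") == "DOI":
            problems.append("DOI should be the 'doi' field, not an 'identifiers' entry")

    # Provenance has no 'notes' field.
    provenance = record.get("provenance")
    if isinstance(provenance, dict) and "notes" in provenance:
        problems.append("'provenance.notes' is not allowed; use 'description'")

    # Each author carries a single 'affiliation', not an 'affiliations' array.
    for author in record.get("authors") or []:
        if isinstance(author, dict) and "affiliations" in author:
            name = author.get("person_name", "<unnamed>")
            problems.append(f"author {name}: 'affiliations' should be 'affiliation'")

    return problems


# Hypothetical record containing three of the documented mistakes.
sample = {
    "publication_title": "Example",
    "published_in": {"journal_id": "https://w3id.org/heritage/journal/semantic-web"},
    "provenance": {"data_source": "MANUAL_CURATION", "notes": "draft"},
}

for problem in find_quirks(sample):
    print(problem)
```

To apply it to a data file, load the YAML list (for example with PyYAML's `yaml.safe_load`) and call `find_quirks` on each record.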
### Step 5: Update This README

Add your publication to the table in the "Files" section.

## Integration with Heritage Custodians

Publications link to heritage institutions through **5 integration patterns**, all demonstrated in `heritage_linked_publications.yaml`:

### Pattern 1: Heritage Institution Researcher as Primary Author ✅

**Use case**: Museum curator or archivist publishes research based on institutional collections

**Example**: Rijksmuseum researcher analyzing Rembrandt paintings

```yaml
authors:
  - person_id: researcher-rijks-001
    person_name: "Dr. Maria van der Berg"
    affiliation:
      organization_id: https://w3id.org/heritage/custodian/nl/rijksmuseum  # ← Heritage institution!
      organization_name: "Rijksmuseum"
      organization_type: "Museum"
```

**Real example**: `rijksmuseum-rembrandt-2024` (Rijksmuseum:125-126)

---

### Pattern 2: Multiple Staff from Same Heritage Institution ✅

**Use case**: Collaborative research by colleagues at the same archive or museum

**Example**: Two archivists from Noord-Hollands Archief co-authoring digital transformation paper

```yaml
authors:
  - person_id: archivist-nha-001
    person_name: "Dr. Saskia de Jong"
    affiliation:
      organization_id: https://w3id.org/heritage/custodian/nl/noord-hollands-archief
      organization_name: "Noord-Hollands Archief"
      organization_type: "Archive"
  - person_id: specialist-nha-001
    person_name: "Peter Bakker"
    affiliation:
      organization_id: https://w3id.org/heritage/custodian/nl/noord-hollands-archief  # ← Same institution
      organization_name: "Noord-Hollands Archief"
      organization_type: "Archive"
```

**Real example**: `noord-hollands-archief-digital-2023` (Noord-Hollands Archief:49-60)

---

### Pattern 3: Heritage + Academic Collaboration ✅

**Use case**: University researcher collaborates with heritage institution expert

**Example**: USP researcher + Biblioteca Nacional do Brasil librarian creating Linked Open Data resource

```yaml
authors:
  - person_id: https://orcid.org/0000-0002-8888-9999
    person_name: "Dr. Carlos Silva"
    orcid: "0000-0002-8888-9999"
    affiliation:
      organization_id: https://w3id.org/heritage/organization/university-of-sao-paulo
      organization_name: "University of São Paulo"
      organization_type: "University"
  - person_id: librarian-bnb-001
    person_name: "Ana Santos"
    affiliation:
      organization_id: https://w3id.org/heritage/custodian/br/biblioteca-nacional-brasil  # ← Heritage institution
      organization_name: "Biblioteca Nacional do Brasil"
      organization_type: "Library"
```

**Real example**: `lokg-brazilian-subset-2024` (Biblioteca Nacional do Brasil:92-103)

---

### Pattern 4: Multi-Institutional Consortium (3+ Heritage Institutions) ✅

**Use case**: Regional or national collaboration between multiple heritage institutions

**Example**: Dutch GLAM Consortium with KB + Rijksmuseum + Nationaal Archief

```yaml
authors:
  - person_id: director-kb-001
    person_name: "Dr. Liesbeth van der Pol"
    affiliation:
      organization_id: https://w3id.org/heritage/custodian/nl/kb-national-library
      organization_name: "Koninklijke Bibliotheek"
      organization_type: "Library"
  - person_id: curator-rijksmuseum-002
    person_name: "Dr. Thomas de Vries"
    affiliation:
      organization_id: https://w3id.org/heritage/custodian/nl/rijksmuseum  # ← Second institution
      organization_name: "Rijksmuseum"
      organization_type: "Museum"
  - person_id: archivist-na-001
    person_name: "Dr. Emma Jansen"
    affiliation:
      organization_id: https://w3id.org/heritage/custodian/nl/nationaal-archief  # ← Third institution
      organization_name: "Nationaal Archief"
      organization_type: "Archive"
```

**Real example**: `dutch-glam-consortium-2023` (KB:132-137, Rijksmuseum:138-142, Nationaal Archief:144-148)

---

### Pattern 5: International Researcher + Local Heritage Institution ✅

**Use case**: Foreign scholar collaborates with local museum/archive/library

**Example**: French scholar + Dutch KB librarian studying European collection management systems

```yaml
authors:
  - person_id: https://orcid.org/0000-0003-7777-8888
    person_name: "Dr. Sophie Laurent"
    orcid: "0000-0003-7777-8888"
    affiliation:
      organization_id: https://w3id.org/heritage/organization/universite-paris-sorbonne
      organization_name: "Université Paris-Sorbonne"
      organization_type: "University"
  - person_id: librarian-kb-002
    person_name: "Martijn Koster"
    affiliation:
      organization_id: https://w3id.org/heritage/custodian/nl/kb-national-library  # ← Dutch heritage institution
      organization_name: "Koninklijke Bibliotheek"
      organization_type: "Library"
```

**Real example**: `collection-management-systems-2024` (Koninklijke Bibliotheek:181-184)

---

### 4. `diverse_heritage_publications.yaml` (10 publications) ✅

**Diverse publication types: books, book chapters, technical reports, preprints**

**Publications included**:

| Title | Type | Authors | Year | Key Features |
|-------|------|---------|------|--------------|
| Linked Data for Museums | Book | Getty Trust researcher | 2020 | Practical GLAM linked data guide |
| Digital Preservation Handbook | Book | DPC staff (3 co-authors) | 2021 | Multi-author handbook from heritage org |
| Crowdsourcing Metadata for Libraries | Book Chapter | Library scholar | 2019 | Chapter within larger volume |
| Archival Appraisal in the Digital Age | Book Chapter | Archival scholar | 2022 | Theory chapter in archival studies |
| 3D Digitization at Koninklijke Bibliotheek | Technical Report | KB technical staff (2 authors) | 2023 | Grey literature from institution |
| Europeana Data Quality Framework | Technical Report | Europeana Foundation (4 authors) | 2022 | Organizational documentation |
| Graph Neural Networks for Provenance | Preprint (arXiv) | CS researcher | 2024 | Machine learning for heritage |
| LLMs for Catalog Enrichment | Preprint (OSF/SocArXiv) | LIS researcher | 2024 | AI applications in libraries |
| Ancient DNA from Museum Collections | Preprint (bioRxiv) | Museum geneticist + lab | 2024 | Scientific heritage use case |

**Preprint Server Patterns**:
- **arXiv.org**: Computer science and machine learning papers (heritage AI applications)
  - Format: `https://arxiv.org/abs/YYMM.NNNNN` (e.g., `2411.12345`)
  - DOI: `10.48550/arXiv.YYMM.NNNNN`
- **OSF/SocArXiv**: Library and information science preprints
  - Format: `https://osf.io/preprints/socarxiv/[alphanumeric]` (e.g., `abc12`)
  - DOI: `10.31235/osf.io/[alphanumeric]`
- **bioRxiv**: Biology and genetics papers (museum genomics, conservation)
  - Format: `https://www.biorxiv.org/content/10.1101/YYYY.MM.DD.NNNNNN`
  - DOI: `10.1101/YYYY.MM.DD.NNNNNN` (date-based)

**Schema patterns demonstrated**:
- Book metadata (`publication_type: BOOK`)
- Book chapter with `is_part_of` relationship to parent volume
- Technical reports as grey literature from heritage organizations
- Preprint metadata with server identifiers (arXiv ID, OSF ID, bioRxiv ID)
- Pre-publication date tracking vs. official publication date

---

### Integration Patterns from Diverse Publications

**Pattern 6: Books by Heritage Institution Staff** ✅

Example: Digital Preservation Handbook authored by Digital Preservation Coalition staff

```yaml
authors:
  - person_name: "Sarah Jones"
    affiliation:
      organization_id: https://w3id.org/heritage/organization/digital-preservation-coalition
      organization_name: "Digital Preservation Coalition"
      organization_type: "Heritage consortium"
publication_type: BOOK
```

**Pattern 7: Technical Reports as Organizational Documentation** ✅

Example: KB 3D Digitization Report documenting institutional digitization workflows

```yaml
publication_type: TECHNICAL_REPORT
authors:
  - person_name: "Erik Vermeulen"
    affiliation:
      organization_id: https://w3id.org/heritage/custodian/nl/kb-national-library
      organization_name: "Koninklijke Bibliotheek"
description: "Grey literature documenting internal digitization practices"
```

**Pattern 8: Preprints Before Formal Publication** ✅

Example: Machine learning research using heritage data published on arXiv

```yaml
publication_type: PREPRINT
preprint_server: arXiv
arxiv_id: "2411.12345"
doi: "10.48550/arXiv.2411.12345"
description: "Early research results, may be updated before journal submission"
```

**Pattern 9: Book Chapters in Edited Volumes** ✅

Example: Crowdsourcing chapter within larger library science anthology

```yaml
publication_type: BOOK_CHAPTER
is_part_of: "Digital Innovations in Libraries"
editors:
  - "Jane Smith"
  - "Robert Brown"
page_range: "145-168"
```

---

### Additional Integration Patterns (Future)

**Pattern 10: Publications About Specific Collections** (not yet implemented)

When a paper describes a heritage collection:

```yaml
# Future schema extension
about_collections:
  - collection_id: https://w3id.org/heritage/collection/rijksmuseum-paintings
    collection_name: "Rijksmuseum Paintings Collection"
    collection_institution: https://w3id.org/heritage/custodian/nl/rijksmuseum
```

**Pattern 11: Data Papers Describing Heritage Datasets** (partially implemented)

When publications document heritage datasets:

```yaml
publication_type: DATASET  # Already used in lokg-brazilian-subset-2024
# Future: Add describes_dataset field
describes_dataset:
  - dataset_id: https://w3id.org/heritage/dataset/brazilian-lokg
    dataset_name: "Brazilian Heritage Institutions Linked Open Data"
    related_institutions:
      - https://w3id.org/heritage/custodian/br/biblioteca-nacional-brasil
```

## Citation Analysis Queries

### Find Most Cited Publications

```python
from collections import Counter

import yaml  # PyYAML

# Assumes the file's top level is a list of Citation records
with open('citation_relationships.yaml', encoding='utf-8') as f:
    citations = yaml.safe_load(f)

cited_counts = Counter(c['cited_work'] for c in citations)
print("Most cited publications:")
for pub_id, count in cited_counts.most_common():
    print(f"  {pub_id}: {count} citations")
```

### Build Citation Network

```python
import networkx as nx

# Reuses `citations` loaded in the previous snippet
G = nx.DiGraph()
for citation in citations:
    G.add_edge(citation['citing_work'], citation['cited_work'],
               citation_type=citation['citation_type'])

# Find influential papers (high in-degree)
influential = sorted(G.in_degree(), key=lambda x: x[1], reverse=True)
```

### Analyze Citation Types
```python
# Reuses Counter and `citations` from the "Find Most Cited Publications" snippet
citation_types = Counter(c['citation_type'] for c in citations)
print("Citation type distribution:")
for ctype, count in citation_types.items():
    print(f"  {ctype}: {count}")
```

## Related Documentation

- **Schema**: `/schemas/bibliographic.yaml` - Full LinkML schema for bibliographic entities
- **Ontologies**:
  - FaBiO (FRBR-aligned Bibliographic Ontology) - Publication modeling
  - CiTO (Citation Typing Ontology) - Citation relationships
  - BIBO (Bibliographic Ontology) - Bibliographic resources
  - FRBR (Functional Requirements for Bibliographic Records) - Work/expression/manifestation
- **Test Fixtures**: `/tests/fixtures/publications/` - Validation examples
- **Schema Documentation**: `/docs/BIBLIOGRAPHIC_SCHEMA.md` (if it exists)

## Future Enhancements

### Short-term (Next Session)

- [x] ✅ **COMPLETED**: Add publications linked to heritage institutions (5 added)
- [x] ✅ **COMPLETED**: Create citation relationships for heritage-linked pubs (8 citations added)
- [x] ✅ **COMPLETED**: Document 5 integration patterns
- [x] ✅ **COMPLETED**: Add more diverse publication types (books, book chapters, technical reports) - 10 added
- [x] ✅ **COMPLETED**: Add preprints (arXiv, bioRxiv, OSF/SocArXiv) - 4 added
- [x] ✅ **COMPLETED**: Add more cultural heritage domain papers (digital preservation, archival science) - included in diverse set
- [x] ✅ **COMPLETED**: Create 12 additional citation relationships linking diverse publications (27 total citations)
- [x] ✅ **COMPLETED**: Document preprint server patterns (arXiv, SocArXiv, bioRxiv)
- [x] ✅ **COMPLETED**: Document 4 additional integration patterns (6-9: books, technical reports, preprints, chapters)
- [ ] Create author disambiguation examples (same person with multiple IDs/ORCIDs)
- [ ] Add thesis/dissertation examples
- [ ] Add working papers (pre-publication research from institutions)

### Medium-term

- [ ] Author disambiguation (same person, multiple IDs)
- [ ] Keyword/subject term extraction
- [ ] Funding information (grants, sponsors)
- [ ] Publication metrics (citation counts from Crossref, Semantic Scholar)
- [ ] Full-text links (PDFs, preprints)

### Long-term

- [ ] RDF export (Turtle, JSON-LD)
- [ ] SPARQL endpoint for citation queries
- [ ] Bibliometric analysis dashboard
- [ ] Integration with Wikidata (author Q-numbers)
- [ ] Citation recommendation system
- [ ] Co-authorship network analysis

## Questions or Issues?

If you encounter validation errors or schema confusion:

1. Check the "Schema Quirks" section above
2. Review validated examples in `semantic_web_papers.yaml`
3. Consult test fixtures in `/tests/fixtures/publications/`
4. Read schema documentation in `/schemas/bibliographic.yaml` (inline comments)
5. File an issue or consult AI agent instructions in `/AGENTS.md`

---

**Last Updated**: 2025-11-09
**Schema Version**: bibliographic.yaml v0.2.0
**Dataset Version**: 0.3.0 (20 publications, 27 citations, 9 integration patterns demonstrated)