glam/docs/CH_ANNOTATOR_QUICK_REFERENCE.md
2025-12-07 00:26:01 +01:00


# CH-Annotator Quick Reference
**ID**: `ch_annotator-v1_7_0`
**Full Name**: CH-Annotator (Cultural Heritage Annotator)
**Status**: PRODUCTION

---
## What is CH-Annotator?
CH-Annotator is the comprehensive entity annotation convention for this project. It covers:
- Named Entity Recognition (NER)
- Property Extraction
- Entity Resolution and Linking
- Claim Validation
- Document Structure Annotation
---
## Quick Start
### Using CH-Annotator in Extraction
When extracting entities, reference the convention in provenance:
```yaml
provenance:
  extraction_method: ch_annotator-v1_7_0
  extraction_date: "2025-12-06T10:00:00Z"
```
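As a minimal sketch, the provenance stanza above could be generated in Python as shown here. The helper name `build_extraction_provenance` and the surrounding `record` dict are hypothetical and not part of the convention; only the two keys and the convention ID come from this reference.

```python
from datetime import datetime, timezone

# Convention ID as given in this quick reference.
CONVENTION_ID = "ch_annotator-v1_7_0"


def build_extraction_provenance(when=None):
    """Return a provenance mapping with the two keys shown above.

    Hypothetical helper. `when` should be a timezone-aware datetime;
    it defaults to the current UTC time.
    """
    when = when or datetime.now(timezone.utc)
    return {
        "extraction_method": CONVENTION_ID,
        # Normalize to UTC and format as an ISO 8601 "Z" timestamp.
        "extraction_date": when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


# Example: attach provenance to an (assumed) extraction record.
record = {
    "provenance": build_extraction_provenance(
        datetime(2025, 12, 6, 10, 0, tzinfo=timezone.utc)
    )
}
```

Keeping the convention ID in one constant means a future version bump only needs to change a single line.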
### Finding the Convention
| Location | Description |
|----------|-------------|
| `data/entity_annotation/ch_annotator-v1_7_0.yaml` | Complete 2500+ line convention |
| `data/entity_annotation/modules/` | Modular hypernym definitions |
| `.opencode/CH_ANNOTATOR_CONVENTION.md` | Full documentation |
| `AGENTS.md` Rule 10 | Agent usage rules |
---
## Hypernym Codes (9 Categories)
| Code | Name | Use For |
|------|------|---------|
| **AGT** | AGENT | People, AI, animals, fictional beings |
| **GRP** | GROUP | Organizations, institutions, movements |
| **TOP** | TOPONYM | Place names (cities, countries, regions) |
| **GEO** | GEOMETRY | Coordinates, polygons, spatial data |
| **TMP** | TEMPORAL | Dates, times, periods, durations |
| **APP** | APPELLATION | Names, titles, identifiers |
| **ROL** | ROLE | Occupations, positions, honorifics |
| **WRK** | WORK | Publications, artworks, records |
| **QTY** | QUANTITY | Numbers, measurements, currencies |
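
One way to work with the codes in the table above is a small enumeration. This is only a sketch: the `Hypernym` class and `hypernym_of` function are illustrative names, not part of the convention file, and the assumption that a dotted code starts with its hypernym is inferred from the `GRP.HER.*` subtypes listed in this reference.

```python
from enum import Enum


class Hypernym(Enum):
    """The nine hypernym codes from the table (member name = code, value = name)."""
    AGT = "AGENT"
    GRP = "GROUP"
    TOP = "TOPONYM"
    GEO = "GEOMETRY"
    TMP = "TEMPORAL"
    APP = "APPELLATION"
    ROL = "ROLE"
    WRK = "WORK"
    QTY = "QUANTITY"


def hypernym_of(code: str) -> Hypernym:
    """Return the hypernym for a bare code ('TOP') or a dotted code ('GRP.HER.MUS').

    Hypothetical helper: takes the first dot-separated segment as the hypernym.
    """
    head = code.strip().upper().split(".", 1)[0]
    try:
        return Hypernym[head]
    except KeyError:
        raise ValueError(f"unknown hypernym code: {head!r}") from None
```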
---
## Heritage Institution Subtypes (GRP.HER)
```
GRP.HER.GAL → Gallery
GRP.HER.LIB → Library
GRP.HER.ARC → Archive
GRP.HER.MUS → Museum
GRP.HER.OFF → Official Institution
GRP.HER.RES → Research Center
GRP.HER.COR → Corporation
GRP.HER.UNK → Unknown
GRP.HER.BOT → Botanical/Zoo
GRP.HER.EDU → Education Provider
GRP.HER.SOC → Collecting Society
...
```
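The subtype codes above can be resolved to labels with a plain lookup table. The sketch below covers only the subtypes listed here; the list is elided with `...`, so the mapping is deliberately partial, and `heritage_label` is a hypothetical helper name.

```python
# Partial mapping: only the subtypes shown in this quick reference.
HERITAGE_SUBTYPES = {
    "GAL": "Gallery",
    "LIB": "Library",
    "ARC": "Archive",
    "MUS": "Museum",
    "OFF": "Official Institution",
    "RES": "Research Center",
    "COR": "Corporation",
    "UNK": "Unknown",
    "BOT": "Botanical/Zoo",
    "EDU": "Education Provider",
    "SOC": "Collecting Society",
}


def heritage_label(code):
    """Return the label for a 'GRP.HER.<SUBTYPE>' code.

    Returns None for a subtype that is not in the partial table above,
    and raises ValueError if the code is not a GRP.HER code at all.
    """
    parts = code.split(".")
    if len(parts) != 3 or parts[:2] != ["GRP", "HER"]:
        raise ValueError(f"not a GRP.HER subtype code: {code!r}")
    return HERITAGE_SUBTYPES.get(parts[2])
```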
---
## Claim Provenance (5 Required Components)
Every claim MUST have:
```yaml
claim:
  claim_type: full_name
  claim_value: "Rijksmuseum"
  provenance:
    namespace: skos                          # 1. Ontology prefix
    path: /html/body/h1[1]                   # 2. Source path
    timestamp: "2025-12-06T10:00:00Z"        # 3. Timestamp
    agent: ch_annotator-v1_7_0               # 4. Model/agent
    context_convention: ch_annotator-v1_7_0  # 5. Convention
```
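A check for the five required components might look like the sketch below. It assumes the claim has already been loaded into a Python dict and that `provenance` is nested under the claim; `missing_provenance_keys` is a hypothetical name, not a function provided by the project.

```python
# The five required provenance components, in the order listed above.
REQUIRED_PROVENANCE_KEYS = (
    "namespace",
    "path",
    "timestamp",
    "agent",
    "context_convention",
)


def missing_provenance_keys(claim):
    """Return the required provenance keys that are absent or empty in `claim`."""
    provenance = claim.get("provenance") or {}
    return [key for key in REQUIRED_PROVENANCE_KEYS if not provenance.get(key)]


# Example using the values from the YAML above.
claim = {
    "claim_type": "full_name",
    "claim_value": "Rijksmuseum",
    "provenance": {
        "namespace": "skos",
        "path": "/html/body/h1[1]",
        "timestamp": "2025-12-06T10:00:00Z",
        "agent": "ch_annotator-v1_7_0",
        "context_convention": "ch_annotator-v1_7_0",
    },
}

# An empty result means all five components are present.
complete = missing_provenance_keys(claim) == []
```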
---
## Authority Stack
**Use These** (Digital Humanities):
- TEI P5
- CIDOC-CRM 7.1.3
- TimeML/TIMEX3
- FRBR/LRM
- GeoSPARQL

**Avoid** (Web-centric):
- NERD (deprecated, interchange only)
---
## Naming History
| Date | Name |
|------|------|
| Pre-2025-12-06 | GLAM-NER v1.7.0-unified |
| 2025-12-06+ | CH-Annotator v1.7.0 (`ch_annotator-v1_7_0`) |
---
**See Also**: `.opencode/CH_ANNOTATOR_CONVENTION.md` for comprehensive documentation