# Hybrid GLiNER2 + LLM Annotator Architecture
This document describes the hybrid annotation pipeline that combines fast encoder-based NER (GLiNER2) with powerful LLM reasoning for comprehensive entity and relationship extraction.
## Overview
The hybrid annotator addresses a fundamental trade-off in NLP annotation:
| Approach | Speed | Accuracy | Relationships | Domain Knowledge |
|---|---|---|---|---|
| GLiNER2 (encoder) | ~100x faster | Good recall | Limited | Generic |
| LLM (decoder) | Slower | High precision | Excellent | Rich |
| Hybrid | Fast + thorough | Best of both | Full support | Domain-aware |
## Pipeline Architecture

```
┌───────────────────────────────────────────────────────────────────────────┐
│                        HYBRID ANNOTATION PIPELINE                         │
├───────────────────────────────────────────────────────────────────────────┤
│                                                                           │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐ │
│  │    INPUT    │    │   STAGE 1   │    │   STAGE 2   │    │   STAGE 3   │ │
│  │    TEXT     │───▶│  FAST-PASS  │───▶│ REFINEMENT  │───▶│ VALIDATION  │ │
│  │             │    │  (GLiNER2)  │    │    (LLM)    │    │   (CROSS)   │ │
│  └─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘ │
│                            │                  │                  │        │
│                            ▼                  ▼                  ▼        │
│                 AnnotationCandidate AnnotationCandidate    EntityClaim    │
│                      (DETECTED)          (REFINED)         (VALIDATED)    │
│                                                                           │
└───────────────────────────────────────────────────────────────────────────┘
```
## Stage 1: Fast-Pass (GLiNER2)
**Purpose:** High-recall entity mention detection at roughly 100x the speed of LLM-only annotation.

**Technology:** GLiNER2 encoder model (`urchade/gliner_multi-v2.1`)

**Input:** Raw text document

**Output:** `List[AnnotationCandidate]` with status `DETECTED`
### Process

1. Tokenize the input text
2. Run GLiNER2 span prediction with a configurable threshold (default 0.5)
3. Map GLiNER2 generic labels to GLAM-NER hyponyms using `GLINER2_TO_GLAM_MAPPING`
4. Create an `AnnotationCandidate` for each detected span
### GLiNER2 to GLAM-NER Type Mapping

```python
GLINER2_TO_GLAM_MAPPING = {
    # Person types
    "person": "AGT.PER",
    "people": "AGT.PER",
    # Organization types
    "organization": "GRP",
    "museum": "GRP.HER",
    "library": "GRP.HER",
    "archive": "GRP.HER",
    "university": "GRP.EDU",
    # Location types
    "location": "TOP",
    "city": "TOP.SET",
    "country": "TOP.CTY",
    "building": "TOP.BLD",
    # Temporal types
    "date": "TMP.DAB",
    "time": "TMP.TAB",
    "period": "TMP.ERA",
    # ... (see hybrid_annotator.py for the complete mapping)
}
```
### Configuration

```python
HybridConfig(
    gliner_model="urchade/gliner_multi-v2.1",  # Model to use
    gliner_threshold=0.5,        # Detection confidence threshold
    gliner_entity_labels=None,   # Custom labels (or use defaults)
    gliner_device="cpu",         # Device (cpu/cuda)
    enable_fast_pass=True,       # Enable/disable this stage
)
```
## Stage 2: Refinement (LLM)
**Purpose:** Entity type refinement, relationship extraction, and domain knowledge injection.

**Technology:** Z.AI GLM-4 (default), Claude, or GPT-4

**Input:** Original text + `List[AnnotationCandidate]` from Stage 1

**Output:** Refined `List[AnnotationCandidate]` + `List[RelationshipCandidate]`
### Process

1. Construct a prompt with:
   - The original text
   - The GLiNER2 candidate spans (as hints)
   - The GLAM-NER type definitions
   - Relationship extraction instructions
2. The LLM performs:
   - Type refinement: upgrade generic types (e.g., `GRP` → `GRP.HER`)
   - New entity detection: find entities GLiNER2 missed
   - Relationship extraction: identify semantic relationships
   - Entity linking hints: suggest Wikidata/VIAF IDs
   - Temporal/spatial scoping: add context
3. Parse the LLM response and update the candidates
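Step 1 above amounts to string assembly. A sketch of how the refinement prompt might be put together; the wording, the candidate dict keys, and the helper name are all illustrative assumptions, not the shipped template (see hybrid_annotator.py for the real prompt):

```python
def build_refinement_prompt(text: str, candidates: list[dict]) -> str:
    """Assemble an LLM refinement prompt from Stage 1 output.

    Each candidate dict is assumed to carry 'text', 'start', 'end', and
    'hyponym'; the prompt wording here is a hypothetical example.
    """
    hints = "\n".join(
        f'- "{c["text"]}" [{c["start"]}:{c["end"]}] tentative type: {c["hyponym"]}'
        for c in candidates
    )
    return (
        "Refine the entity annotations for the text below.\n"
        "Upgrade generic types (e.g. GRP -> GRP.HER), add entities that\n"
        "were missed, extract relationships, and suggest Wikidata/VIAF IDs.\n\n"
        f"TEXT:\n{text}\n\nCANDIDATES:\n{hints}\n"
    )

prompt = build_refinement_prompt(
    "The Rijksmuseum in Amsterdam, founded in 1800.",
    [{"text": "Rijksmuseum", "start": 4, "end": 15, "hyponym": "GRP"}],
)
```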
### Relationship Extraction

The LLM extracts relationships using the GLAM-NER relationship hyponym scheme:
```python
RelationshipCandidate(
    subject_id="candidate-uuid-1",
    subject_text="Rijksmuseum",
    subject_type="GRP.HER",
    relationship_type="REL.SPA.LOC",  # Located at
    relationship_label="located in",
    object_id="candidate-uuid-2",
    object_text="Amsterdam",
    object_type="TOP.SET",
    confidence=0.92,
)
```
### Configuration

```python
HybridConfig(
    llm_model="glm-4",           # Model name (auto-detects provider)
    llm_api_key=None,            # API key (or use ZAI_API_TOKEN env var)
    enable_refinement=True,      # Enable/disable this stage
    enable_relationships=True,   # Extract relationships
)
```
## Stage 3: Validation (Cross-Check)
**Purpose:** Cross-validate outputs, detect hallucinations, and ensure consistency.

**Input:** Candidates from both Stage 1 and Stage 2

**Output:** Final `List[EntityClaim]` with status `VALIDATED` or `REJECTED`
### Process

1. Merge candidates from GLiNER2 and the LLM:
   - Match by span overlap (configurable threshold)
   - Prefer LLM types on conflict (configurable)
   - Create `MERGED` candidates from both sources
2. Hallucination detection:
   - Verify that LLM-only entities exist in the source text
   - Check for fabricated relationships
   - Flag suspicious confidence scores
3. Consistency checking:
   - Validate relationship domain/range constraints
   - Check temporal coherence
   - Verify entity type compatibility
4. Final filtering:
   - Apply the minimum confidence threshold
   - Remove rejected candidates (or keep them with a flag)
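The core of the hallucination check in step 2 is verifying that a proposed span really occurs in the source text at its claimed offsets. A minimal sketch; the pipeline's actual check may be more lenient (e.g., case-insensitive or fuzzy):

```python
def is_grounded(candidate_text: str, start: int, end: int, source_text: str) -> bool:
    """Return True if the span [start:end) of the source text matches the
    candidate exactly. LLM-only candidates that fail this check are
    suspects for hallucination."""
    return source_text[start:end] == candidate_text

source = "The Rijksmuseum in Amsterdam houses over 8,000 objects."
print(is_grounded("Rijksmuseum", 4, 15, source))  # True: span matches
print(is_grounded("Louvre", 4, 15, source))       # False: likely hallucinated
```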
### Merge Strategy

```
GLiNER2 Candidate: "Van Gogh"         (AGT.PER, confidence=0.7)
LLM Candidate:     "Vincent van Gogh" (AGT.PER, confidence=0.95)

Overlap ratio: 0.67 > threshold (0.3)
→ MERGE: Use LLM span + confidence, mark as MERGED
→ Result: "Vincent van Gogh" (AGT.PER, confidence=0.95, source=MERGED)
```
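The 0.67 in the example is consistent with a Dice-style overlap: twice the intersection length divided by the sum of the two span lengths ("Van Gogh" is 8 characters, "Vincent van Gogh" is 16, so 2·8 / (8+16) ≈ 0.67). That formula is inferred from the worked example, not confirmed from the code:

```python
def span_overlap_ratio(a_start: int, a_end: int, b_start: int, b_end: int) -> float:
    """Dice-style overlap between two character spans: 2·|∩| / (|a| + |b|).
    Inferred from the worked example; the shipped formula may differ."""
    inter = max(0, min(a_end, b_end) - max(a_start, b_start))
    return 2 * inter / ((a_end - a_start) + (b_end - b_start))

# "Van Gogh" detected at [8, 16), "Vincent van Gogh" at [0, 16)
ratio = span_overlap_ratio(8, 16, 0, 16)
print(f"{ratio:.2f}")  # → 0.67, above merge_threshold=0.3, so the spans merge
```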
### Configuration

```python
HybridConfig(
    enable_validation=True,       # Enable/disable this stage
    merge_threshold=0.3,          # Minimum overlap ratio for merging
    prefer_llm_on_conflict=True,  # LLM types take precedence
    minimum_confidence=0.3,       # Filter low-confidence results
    include_rejected=False,       # Include rejected in output
)
```
## Data Structures

### AnnotationCandidate
Shared intermediate representation used across all pipeline stages:
```python
@dataclass
class AnnotationCandidate:
    candidate_id: str                 # Unique identifier
    text: str                         # Extracted text span
    start_offset: int                 # Character start position
    end_offset: int                   # Character end position
    hypernym: Optional[str]           # Top-level type (AGT, GRP, TOP, etc.)
    hyponym: Optional[str]            # Fine-grained type (AGT.PER, GRP.HER)
    # Confidence scores
    detection_confidence: float       # GLiNER2 detection score
    classification_confidence: float  # Type classification score
    overall_confidence: float         # Combined confidence
    # Source tracking
    source: CandidateSource           # GLINER2, LLM, HYBRID, MERGED
    status: CandidateStatus           # DETECTED, REFINED, VALIDATED, REJECTED
    # Entity linking
    wikidata_id: Optional[str]
    viaf_id: Optional[str]
    isil_id: Optional[str]
    # Relationships (populated during LLM refinement)
    relationships: List[Dict[str, Any]]
    # Provenance
    provenance: Optional[Provenance]
```
### RelationshipCandidate
Intermediate representation for relationships:
```python
@dataclass
class RelationshipCandidate:
    relationship_id: str
    relationship_type: str            # e.g., REL.CRE.AUT, REL.SPA.LOC
    relationship_label: str           # Human-readable label
    subject_id: str                   # Reference to AnnotationCandidate
    subject_text: str
    subject_type: str
    object_id: str
    object_text: str
    object_type: str
    temporal_scope: Optional[str]     # e.g., "1885-1890"
    spatial_scope: Optional[str]      # e.g., "Amsterdam"
    confidence: float
    is_valid: bool                    # Domain/range validation result
```
### HybridAnnotationResult
Final output structure:
```python
@dataclass
class HybridAnnotationResult:
    entities: List[AnnotationCandidate]
    relationships: List[RelationshipCandidate]
    source_text: str
    # Pipeline stage flags
    gliner_pass: bool = False
    llm_pass: bool = False
    validation_pass: bool = False
    # Statistics
    total_candidates: int = 0
    merged_count: int = 0
    rejected_count: int = 0
```
## Usage

### Basic Usage
```python
from glam_extractor.annotators import HybridAnnotator, HybridConfig

# Default configuration
annotator = HybridAnnotator()

# Annotate text (inside an async context)
result = await annotator.annotate("""
The Rijksmuseum in Amsterdam, founded in 1800, houses over 8,000 objects.
Vincent van Gogh's works are among the most famous in the collection.
""")

# Access results
for entity in result.entities:
    print(f"{entity.text}: {entity.hyponym} ({entity.overall_confidence:.2f})")

for rel in result.relationships:
    print(f"{rel.subject_text} --[{rel.relationship_label}]--> {rel.object_text}")
```
### Custom Configuration
```python
config = HybridConfig(
    # Use a smaller, faster GLiNER2 model
    gliner_model="urchade/gliner_small",
    gliner_threshold=0.6,
    # Use Claude instead of Z.AI
    llm_model="claude-3-sonnet-20240229",
    # Disable relationship extraction for speed
    enable_relationships=False,
    # Stricter filtering
    minimum_confidence=0.5,
)

annotator = HybridAnnotator(config=config)
```
### GLiNER2-Only Mode
For maximum speed when relationships aren't needed:
```python
config = HybridConfig(
    enable_fast_pass=True,
    enable_refinement=False,  # Skip LLM
    enable_validation=True,
)

annotator = HybridAnnotator(config=config)
result = await annotator.annotate(text)
```
### LLM-Only Mode

When GLiNER2 isn't installed, or when accuracy matters more than speed:
```python
config = HybridConfig(
    enable_fast_pass=False,   # Skip GLiNER2
    enable_refinement=True,
    enable_validation=False,  # No cross-validation without GLiNER2
)

annotator = HybridAnnotator(config=config)
result = await annotator.annotate(text)
```
## Performance Characteristics

Speed is relative to the LLM-only configuration (1x):
| Configuration | Speed | Entity Recall | Entity Precision | Relationships |
|---|---|---|---|---|
| GLiNER2-only | ~100x | High | Medium | None |
| LLM-only | 1x | Medium | High | Full |
| Hybrid (default) | ~10x | High | High | Full |
| Hybrid (no-rel) | ~20x | High | High | None |
## Dependencies

### Required

- Python 3.10+
- `dataclasses`, `typing` (standard library)
### Optional

- `gliner`: for the GLiNER2 fast-pass (gracefully degrades if not installed)
- `httpx` or `aiohttp`: for LLM API calls
- `torch`: for GLiNER2 GPU acceleration
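The "gracefully degrades" behavior for the optional `gliner` dependency is typically a guarded import. A minimal sketch; the flag and helper names are assumptions, not the actual identifiers in the codebase:

```python
# Guarded import: the fast-pass stage is skipped when gliner is missing.
try:
    from gliner import GLiNER
    GLINER_AVAILABLE = True
except ImportError:
    GLiNER = None
    GLINER_AVAILABLE = False

def supports_fast_pass(enable_fast_pass: bool = True) -> bool:
    """The fast-pass stage runs only if it is both enabled in the
    configuration and the gliner package is importable."""
    return enable_fast_pass and GLINER_AVAILABLE
```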
### Install GLiNER2

```shell
pip install gliner

# For GPU support
pip install gliner torch
```
## File Structure

```
src/glam_extractor/annotators/
├── __init__.py              # Module exports
├── base.py                  # EntityClaim, Provenance, hypernyms
├── hybrid_annotator.py      # HybridAnnotator, candidates, pipeline
├── llm_annotator.py         # LLMAnnotator, provider configs
└── schema_builder.py        # GLAMSchema, field specs

tests/annotators/
├── __init__.py
└── test_hybrid_annotator.py # 24 unit tests
```