# GLM API Setup Guide
This guide explains how to configure and use the GLM-4 language model for entity recognition, verification, and enrichment tasks in the GLAM project.
## Overview
The GLAM project uses **GLM-4.6** via the **Z.AI Coding Plan** endpoint for LLM-powered tasks such as:
- **Entity Verification**: Verify that Wikidata entities are heritage institutions
- **Description Enrichment**: Generate rich descriptions from multiple data sources
- **Entity Resolution**: Match institution names across different data sources
- **Claim Validation**: Verify extracted claims against source documents
**Cost**: GLM models incur no per-token charges on the Z.AI Coding Plan; usage is covered by the plan subscription.
## Prerequisites
- Python 3.10+
- `httpx` library for async HTTP requests
- Access to Z.AI Coding Plan (same as OpenCode)
## Quick Start
### 1. Set Up Environment Variable
Add your Z.AI API token to the `.env` file in the project root:
```bash
# .env file
ZAI_API_TOKEN=your_token_here
```
### 2. Find Your Token
The token is shared with OpenCode. Check:
```bash
# View OpenCode auth file
cat ~/.local/share/opencode/auth.json | jq '.["zai-coding-plan"]'
```
Copy this token to your `.env` file.
### 3. Basic Python Usage
```python
import os
import asyncio

import httpx
from dotenv import load_dotenv

# Load environment variables from .env
load_dotenv()

async def call_glm():
    api_url = "https://api.z.ai/api/coding/paas/v4/chat/completions"
    api_key = os.environ.get("ZAI_API_TOKEN")
    async with httpx.AsyncClient(timeout=60.0) as client:
        response = await client.post(
            api_url,
            headers={
                "Authorization": f"Bearer {api_key}",
                "Content-Type": "application/json",
            },
            json={
                "model": "glm-4.6",
                "messages": [
                    {"role": "user", "content": "Hello, GLM!"}
                ],
                "temperature": 0.3,
            },
        )
        result = response.json()
        print(result["choices"][0]["message"]["content"])

asyncio.run(call_glm())
```
## API Configuration
### Endpoint Details
| Property | Value |
|----------|-------|
| **Base URL** | `https://api.z.ai/api/coding/paas/v4` |
| **Chat Endpoint** | `/chat/completions` |
| **Auth Method** | Bearer Token |
| **Header** | `Authorization: Bearer {token}` |
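The endpoint details above can be collected into a few Python constants and a helper (a minimal sketch; the constant and function names are illustrative, not part of the project):

```python
# Z.AI Coding Plan endpoint pieces, from the table above
GLM_BASE_URL = "https://api.z.ai/api/coding/paas/v4"
CHAT_ENDPOINT = "/chat/completions"
CHAT_URL = GLM_BASE_URL + CHAT_ENDPOINT

def build_headers(api_key: str) -> dict:
    """Build the Bearer-token auth headers the API expects."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```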
### Available Models
| Model | Speed | Quality | Use Case |
|-------|-------|---------|----------|
| `glm-4.6` | Medium | Highest | Complex reasoning, verification |
| `glm-4.5` | Medium | High | General tasks |
| `glm-4.5-air` | Fast | Good | High-volume processing |
| `glm-4.5-flash` | Fastest | Good | Quick responses |
| `glm-4.5v` | Medium | High | Vision/image tasks |
**Recommendation**: Use `glm-4.6` for entity verification and complex tasks.
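One way to encode the table and the recommendation above is a small lookup with `glm-4.6` as the default (a sketch; the task labels are illustrative):

```python
# Model choices from the table above; task labels are illustrative.
MODEL_FOR_TASK = {
    "verification": "glm-4.6",    # complex reasoning
    "general": "glm-4.5",
    "bulk": "glm-4.5-air",        # high-volume processing
    "quick": "glm-4.5-flash",
    "vision": "glm-4.5v",
}

def pick_model(task: str) -> str:
    """Return the recommended model for a task, defaulting to glm-4.6."""
    return MODEL_FOR_TASK.get(task, "glm-4.6")
```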
## Integration with CH-Annotator
When using GLM for entity recognition tasks, always reference the CH-Annotator convention:
### Heritage Institution Verification
````python
VERIFICATION_PROMPT = """You are a heritage institution classifier following CH-Annotator v1.7.0 convention.

## CH-Annotator GRP.HER Definition
Heritage institutions are organizations that:
- Collect, preserve, and provide access to cultural heritage materials
- Include: museums (GRP.HER.MUS), libraries (GRP.HER.LIB), archives (GRP.HER.ARC), galleries (GRP.HER.GAL)

## Entity Types That Are NOT Heritage Institutions
- Cities, towns, municipalities (places, not institutions)
- General businesses or companies
- People/individuals
- Events, festivals, exhibitions (temporary)

## Your Task
Analyze the entity and respond in JSON:
```json
{
  "is_heritage_institution": true/false,
  "subtype": "MUS|LIB|ARC|GAL|OTHER|null",
  "confidence": 0.95,
  "reasoning": "Brief explanation"
}
```
"""
````
### Entity Type Mapping
| CH-Annotator Type | GLAM Institution Type |
|-------------------|----------------------|
| GRP.HER.MUS | MUSEUM |
| GRP.HER.LIB | LIBRARY |
| GRP.HER.ARC | ARCHIVE |
| GRP.HER.GAL | GALLERY |
| GRP.HER.RES | RESEARCH_CENTER |
| GRP.HER.BOT | BOTANICAL_ZOO |
| GRP.HER.EDU | EDUCATION_PROVIDER |
## Complete Implementation Example
### Wikidata Verification Script
See `scripts/reenrich_wikidata_with_verification.py` for a complete example:
```python
import os
import re
import json
from typing import Any, Dict, List

import httpx

class GLMHeritageVerifier:
    """Verify Wikidata entities using GLM-4.6 and CH-Annotator."""

    API_URL = "https://api.z.ai/api/coding/paas/v4/chat/completions"

    def __init__(self, model: str = "glm-4.6"):
        self.api_key = os.environ.get("ZAI_API_TOKEN")
        if not self.api_key:
            raise ValueError("ZAI_API_TOKEN not found in environment")
        self.model = model
        self.client = httpx.AsyncClient(
            timeout=60.0,
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
            },
        )

    async def verify_heritage_institution(
        self,
        institution_name: str,
        wikidata_label: str,
        wikidata_description: str,
        instance_of_types: List[str],
    ) -> Dict[str, Any]:
        """Check if a Wikidata entity is a heritage institution."""
        prompt = f"""Analyze if this entity is a heritage institution (GRP.HER):

Institution Name: {institution_name}
Wikidata Label: {wikidata_label}
Description: {wikidata_description}
Instance Of: {', '.join(instance_of_types)}

Respond with JSON only."""
        response = await self.client.post(
            self.API_URL,
            json={
                "model": self.model,
                "messages": [
                    # VERIFICATION_PROMPT is the system prompt defined earlier
                    {"role": "system", "content": VERIFICATION_PROMPT},
                    {"role": "user", "content": prompt},
                ],
                "temperature": 0.1,
            },
        )
        result = response.json()
        content = result["choices"][0]["message"]["content"]
        # Parse JSON from the response text
        json_match = re.search(r'\{.*\}', content, re.DOTALL)
        if json_match:
            return json.loads(json_match.group())
        return {"is_heritage_institution": False, "error": "No JSON found"}
```
## Error Handling
### Common Errors
| Error Code | Meaning | Solution |
|------------|---------|----------|
| 401 | Unauthorized | Check ZAI_API_TOKEN |
| 403 | Forbidden/Quota | Use the Z.AI endpoint (`api.z.ai`), not BigModel |
| 429 | Rate Limited | Add delays between requests |
| 500 | Server Error | Retry with backoff |
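The table above translates into a simple dispatch on the HTTP status code (a sketch; the function name and action strings are illustrative):

```python
def classify_error(status_code: int) -> str:
    """Map an HTTP status code to a recommended action (see table above)."""
    if status_code == 401:
        return "check ZAI_API_TOKEN"
    if status_code == 403:
        return "verify endpoint is api.z.ai, not open.bigmodel.cn"
    if status_code == 429:
        return "back off and retry later"
    if status_code >= 500:
        return "retry with exponential backoff"
    return "unhandled status"
```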
### Retry Pattern
```python
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10),
)
async def call_with_retry(client, messages):
    response = await client.post(API_URL, json={"model": "glm-4.6", "messages": messages})
    response.raise_for_status()
    return response.json()
```
### JSON Parsing
LLM responses may contain text around JSON. Always parse safely:
```python
import re
import json

def parse_json_from_response(content: str) -> dict:
    """Extract JSON from LLM response text."""
    # Try to find a fenced JSON block first
    json_match = re.search(r'```json\s*(\{.*?\})\s*```', content, re.DOTALL)
    if json_match:
        return json.loads(json_match.group(1))
    # Fall back to bare JSON
    json_match = re.search(r'\{.*\}', content, re.DOTALL)
    if json_match:
        return json.loads(json_match.group())
    return {"error": "No JSON found in response"}
```
## Best Practices
### 1. Use Low Temperature for Verification
```python
{
    "temperature": 0.1  # Low for consistent, deterministic responses
}
```
### 2. Request JSON Output
Always request JSON format in your prompts for structured responses:
````
Respond in JSON format only:
```json
{"key": "value"}
```
````
### 3. Batch Processing
Process multiple entities with rate limiting:
```python
import asyncio
from typing import List

async def batch_verify(entities: List[dict], rate_limit: float = 0.5):
    """Verify entities with rate limiting."""
    results = []
    for entity in entities:
        result = await verifier.verify(entity)
        results.append(result)
        await asyncio.sleep(rate_limit)  # Respect rate limits
    return results
```
### 4. Always Reference CH-Annotator
For entity recognition tasks, include CH-Annotator context:
```python
system_prompt = """You are following CH-Annotator v1.7.0 convention.
Heritage institutions are type GRP.HER with subtypes for museums, libraries, archives, and galleries.
"""
```
## Related Scripts
| Script | Purpose |
|--------|---------|
| `scripts/reenrich_wikidata_with_verification.py` | Wikidata entity verification |
## Related Documentation
- **Agent Rules**: `AGENTS.md` (Rule 11: Z.AI GLM API)
- **Agent Config**: `.opencode/ZAI_GLM_API_RULES.md`
- **CH-Annotator**: `.opencode/CH_ANNOTATOR_CONVENTION.md`
- **Entity Annotation**: `data/entity_annotation/ch_annotator-v1_7_0.yaml`
## Troubleshooting
### "Quota exceeded" Error
**Symptom**: 403 error with "quota exceeded" message
**Cause**: Using wrong API endpoint (`open.bigmodel.cn` instead of `api.z.ai`)
**Solution**: Update API URL to `https://api.z.ai/api/coding/paas/v4/chat/completions`
### "Token not found" Error
**Symptom**: ValueError about missing ZAI_API_TOKEN
**Solution**:
1. Check `~/.local/share/opencode/auth.json` for token
2. Add to `.env` file as `ZAI_API_TOKEN=your_token`
3. Ensure `load_dotenv()` is called before accessing environment
### JSON Parsing Failures
**Symptom**: LLM returns text that can't be parsed as JSON
**Solution**: Use the `parse_json_from_response()` helper function with fallback handling
---
**Last Updated**: 2025-12-08