# GLM API Setup Guide
This guide explains how to configure and use the GLM-4 language model for entity recognition, verification, and enrichment tasks in the GLAM project.

## Overview

The GLAM project uses **GLM-4.6** via the **Z.AI Coding Plan** endpoint for LLM-powered tasks such as:

- **Entity Verification**: Verify that Wikidata entities are heritage institutions
- **Description Enrichment**: Generate rich descriptions from multiple data sources
- **Entity Resolution**: Match institution names across different data sources
- **Claim Validation**: Verify extracted claims against source documents

**Cost**: All GLM models are free (zero cost per token) on the Z.AI Coding Plan.

## Prerequisites

- Python 3.10+
- `httpx` library for async HTTP requests
- Access to the Z.AI Coding Plan (same as OpenCode)
## Quick Start
### 1. Set Up Environment Variable

Add your Z.AI API token to the `.env` file in the project root:

```bash
# .env file
ZAI_API_TOKEN=your_token_here
```

### 2. Find Your Token

The token is shared with OpenCode. Check:

```bash
# View the OpenCode auth file
cat ~/.local/share/opencode/auth.json | jq '.["zai-coding-plan"]'
```

Copy this token to your `.env` file.
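To confirm the variable is actually visible to Python before wiring up API calls, a quick sanity check (not part of the pipeline):

```python
import os

token = os.environ.get("ZAI_API_TOKEN", "")
if token:
    print(f"Token loaded ({len(token)} characters)")
else:
    print("ZAI_API_TOKEN is not set - check .env and that load_dotenv() ran")
```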
### 3. Basic Python Usage

```python
import asyncio
import os

import httpx
from dotenv import load_dotenv

# Load environment variables from .env
load_dotenv()

async def call_glm():
    api_url = "https://api.z.ai/api/coding/paas/v4/chat/completions"
    api_key = os.environ.get("ZAI_API_TOKEN")

    async with httpx.AsyncClient(timeout=60.0) as client:
        response = await client.post(
            api_url,
            headers={
                "Authorization": f"Bearer {api_key}",
                "Content-Type": "application/json",
            },
            json={
                "model": "glm-4.6",
                "messages": [
                    {"role": "user", "content": "Hello, GLM!"}
                ],
                "temperature": 0.3,
            },
        )
        result = response.json()
        print(result["choices"][0]["message"]["content"])

asyncio.run(call_glm())
```

## API Configuration
### Endpoint Details

| Property | Value |
|----------|-------|
| **Base URL** | `https://api.z.ai/api/coding/paas/v4` |
| **Chat Endpoint** | `/chat/completions` |
| **Auth Method** | Bearer token |
| **Header** | `Authorization: Bearer {token}` |

### Available Models

| Model | Speed | Quality | Use Case |
|-------|-------|---------|----------|
| `glm-4.6` | Medium | Highest | Complex reasoning, verification |
| `glm-4.5` | Medium | High | General tasks |
| `glm-4.5-air` | Fast | Good | High-volume processing |
| `glm-4.5-flash` | Fastest | Good | Quick responses |
| `glm-4.5v` | Medium | High | Vision/image tasks |

**Recommendation**: Use `glm-4.6` for entity verification and other complex tasks.
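The endpoint can also be exercised directly from the shell to sanity-check the token and request shape; a minimal sketch, assuming `ZAI_API_TOKEN` is exported:

```shell
curl -s https://api.z.ai/api/coding/paas/v4/chat/completions \
  -H "Authorization: Bearer $ZAI_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "glm-4.6", "messages": [{"role": "user", "content": "Hello, GLM!"}]}'
```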
## Integration with CH-Annotator
When using GLM for entity recognition tasks, always reference the CH-Annotator convention:

### Heritage Institution Verification
````python
VERIFICATION_PROMPT = """You are a heritage institution classifier following CH-Annotator v1.7.0 convention.

## CH-Annotator GRP.HER Definition
Heritage institutions are organizations that:
- Collect, preserve, and provide access to cultural heritage materials
- Include: museums (GRP.HER.MUS), libraries (GRP.HER.LIB), archives (GRP.HER.ARC), galleries (GRP.HER.GAL)

## Entity Types That Are NOT Heritage Institutions
- Cities, towns, municipalities (places, not institutions)
- General businesses or companies
- People/individuals
- Events, festivals, exhibitions (temporary)

## Your Task
Analyze the entity and respond in JSON:
```json
{
  "is_heritage_institution": true/false,
  "subtype": "MUS|LIB|ARC|GAL|OTHER|null",
  "confidence": 0.95,
  "reasoning": "Brief explanation"
}
```
"""
````
### Entity Type Mapping

| CH-Annotator Type | GLAM Institution Type |
|-------------------|----------------------|
| GRP.HER.MUS | MUSEUM |
| GRP.HER.LIB | LIBRARY |
| GRP.HER.ARC | ARCHIVE |
| GRP.HER.GAL | GALLERY |
| GRP.HER.RES | RESEARCH_CENTER |
| GRP.HER.BOT | BOTANICAL_ZOO |
| GRP.HER.EDU | EDUCATION_PROVIDER |
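The table above can be expressed as a lookup for converting verifier subtypes into GLAM types; a sketch (the dictionary and function names are illustrative):

```python
# CH-Annotator heritage subtypes -> GLAM institution types (from the table above)
CH_TO_GLAM = {
    "GRP.HER.MUS": "MUSEUM",
    "GRP.HER.LIB": "LIBRARY",
    "GRP.HER.ARC": "ARCHIVE",
    "GRP.HER.GAL": "GALLERY",
    "GRP.HER.RES": "RESEARCH_CENTER",
    "GRP.HER.BOT": "BOTANICAL_ZOO",
    "GRP.HER.EDU": "EDUCATION_PROVIDER",
}

def to_glam_type(subtype: str) -> str:
    """Map a CH-Annotator subtype code (e.g. "MUS") to a GLAM type, defaulting to OTHER."""
    return CH_TO_GLAM.get(f"GRP.HER.{subtype}", "OTHER")

print(to_glam_type("MUS"))  # MUSEUM
```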
## Complete Implementation Example
### Wikidata Verification Script

See `scripts/reenrich_wikidata_with_verification.py` for a complete example:
```python
import json
import os
import re
from typing import Any, Dict, List

import httpx

class GLMHeritageVerifier:
    """Verify Wikidata entities using GLM-4.6 and CH-Annotator."""

    API_URL = "https://api.z.ai/api/coding/paas/v4/chat/completions"

    def __init__(self, model: str = "glm-4.6"):
        self.api_key = os.environ.get("ZAI_API_TOKEN")
        if not self.api_key:
            raise ValueError("ZAI_API_TOKEN not found in environment")

        self.model = model
        self.client = httpx.AsyncClient(
            timeout=60.0,
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
            },
        )

    async def verify_heritage_institution(
        self,
        institution_name: str,
        wikidata_label: str,
        wikidata_description: str,
        instance_of_types: List[str],
    ) -> Dict[str, Any]:
        """Check if a Wikidata entity is a heritage institution."""

        prompt = f"""Analyze if this entity is a heritage institution (GRP.HER):

Institution Name: {institution_name}
Wikidata Label: {wikidata_label}
Description: {wikidata_description}
Instance Of: {', '.join(instance_of_types)}

Respond with JSON only."""

        response = await self.client.post(
            self.API_URL,
            json={
                "model": self.model,
                "messages": [
                    # VERIFICATION_PROMPT is the module-level system prompt defined above
                    {"role": "system", "content": VERIFICATION_PROMPT},
                    {"role": "user", "content": prompt},
                ],
                "temperature": 0.1,
            },
        )

        result = response.json()
        content = result["choices"][0]["message"]["content"]

        # Parse JSON from the response text
        json_match = re.search(r'\{.*\}', content, re.DOTALL)
        if json_match:
            return json.loads(json_match.group())

        return {"is_heritage_institution": False, "error": "No JSON found"}
```

## Error Handling
### Common Errors

| Error Code | Meaning | Solution |
|------------|---------|----------|
| 401 | Unauthorized | Check `ZAI_API_TOKEN` |
| 403 | Forbidden/quota | Wrong endpoint (use Z.AI, not BigModel) |
| 429 | Rate limited | Add delays between requests |
| 500 | Server error | Retry with backoff |
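The remediation column can be encoded as a small helper for error logging; a sketch (the function name is illustrative):

```python
def describe_glm_error(status_code: int) -> str:
    """Map the common GLM API status codes above to a remediation hint."""
    remedies = {
        401: "Unauthorized: check ZAI_API_TOKEN",
        403: "Forbidden/quota: ensure you call api.z.ai, not open.bigmodel.cn",
        429: "Rate limited: add delays between requests",
        500: "Server error: retry with backoff",
    }
    return remedies.get(status_code, f"Unexpected status {status_code}")

print(describe_glm_error(403))
```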
### Retry Pattern

```python
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10),
)
async def call_with_retry(client, messages):
    # API_URL is the chat completions endpoint defined above
    response = await client.post(API_URL, json={"model": "glm-4.6", "messages": messages})
    response.raise_for_status()  # An HTTP error here triggers tenacity's retry
    return response.json()
```

### JSON Parsing

LLM responses may contain text around the JSON. Always parse safely:

```python
import json
import re

def parse_json_from_response(content: str) -> dict:
    """Extract JSON from LLM response text."""
    # Prefer a fenced ```json block
    json_match = re.search(r'```json\s*(\{.*?\})\s*```', content, re.DOTALL)
    if json_match:
        return json.loads(json_match.group(1))

    # Fall back to the first bare JSON object
    json_match = re.search(r'\{.*\}', content, re.DOTALL)
    if json_match:
        return json.loads(json_match.group())

    return {"error": "No JSON found in response"}
```

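For example, extracting the object from a typical fenced reply (the sample response text is illustrative):

```python
import json
import re

# A typical reply that wraps its JSON in a fenced block
content = 'Here is my analysis:\n```json\n{"is_heritage_institution": true, "subtype": "MUS"}\n```'

match = re.search(r'```json\s*(\{.*?\})\s*```', content, re.DOTALL)
data = json.loads(match.group(1))
print(data["subtype"])  # MUS
```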
## Best Practices
### 1. Use Low Temperature for Verification

```python
{
    "temperature": 0.1  # Low for consistent, near-deterministic responses
}
```

### 2. Request JSON Output

Always request JSON format in your prompts for structured responses:

````
Respond in JSON format only:
```json
{"key": "value"}
```
````
### 3. Batch Processing

Process multiple entities with rate limiting:

```python
import asyncio
from typing import List

async def batch_verify(entities: List[dict], rate_limit: float = 0.5):
    """Verify entities sequentially, pausing between API calls."""
    results = []
    for entity in entities:
        # `verifier` is a GLMHeritageVerifier instance; each entity dict
        # carries the keyword arguments of verify_heritage_institution()
        result = await verifier.verify_heritage_institution(**entity)
        results.append(result)
        await asyncio.sleep(rate_limit)  # Respect rate limits
    return results
```

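Sequential processing trades speed for safety. Where the quota allows, a bounded-concurrency variant can be used instead; a sketch, where the `verify` callable stands in for a method like `GLMHeritageVerifier.verify_heritage_institution`:

```python
import asyncio
from typing import Awaitable, Callable, List

async def batch_verify_concurrent(
    entities: List[dict],
    verify: Callable[[dict], Awaitable[dict]],
    max_concurrent: int = 3,
) -> List[dict]:
    """Run verifications concurrently, capped by a semaphore."""
    sem = asyncio.Semaphore(max_concurrent)

    async def verify_one(entity: dict) -> dict:
        async with sem:  # At most max_concurrent requests in flight
            return await verify(entity)

    # gather preserves input order
    return await asyncio.gather(*(verify_one(e) for e in entities))
```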
### 4. Always Reference CH-Annotator

For entity recognition tasks, include CH-Annotator context in the system prompt:

```python
system_prompt = """You are following CH-Annotator v1.7.0 convention.
Heritage institutions are type GRP.HER with subtypes for museums, libraries, archives, and galleries.
"""
```

## Related Scripts

| Script | Purpose |
|--------|---------|
| `scripts/reenrich_wikidata_with_verification.py` | Wikidata entity verification |

## Related Documentation

- **Agent Rules**: `AGENTS.md` (Rule 11: Z.AI GLM API)
- **Agent Config**: `.opencode/ZAI_GLM_API_RULES.md`
- **CH-Annotator**: `.opencode/CH_ANNOTATOR_CONVENTION.md`
- **Entity Annotation**: `data/entity_annotation/ch_annotator-v1_7_0.yaml`
## Troubleshooting
### "Quota exceeded" Error

**Symptom**: 403 error with a "quota exceeded" message

**Cause**: Using the wrong API endpoint (`open.bigmodel.cn` instead of `api.z.ai`)

**Solution**: Update the API URL to `https://api.z.ai/api/coding/paas/v4/chat/completions`

### "Token not found" Error

**Symptom**: `ValueError` about a missing `ZAI_API_TOKEN`

**Solution**:

1. Check `~/.local/share/opencode/auth.json` for the token
2. Add it to the `.env` file as `ZAI_API_TOKEN=your_token`
3. Ensure `load_dotenv()` is called before accessing the environment

### JSON Parsing Failures

**Symptom**: The LLM returns text that can't be parsed as JSON

**Solution**: Use the `parse_json_from_response()` helper with fallback handling

---

**Last Updated**: 2025-12-08