
# GLM API Setup Guide

This guide explains how to configure and use the GLM-4 language model for entity recognition, verification, and enrichment tasks in the GLAM project.

## Overview

The GLAM project uses GLM-4.6 via the Z.AI Coding Plan endpoint for LLM-powered tasks such as:

- **Entity Verification**: Verify that Wikidata entities are heritage institutions
- **Description Enrichment**: Generate rich descriptions from multiple data sources
- **Entity Resolution**: Match institution names across different data sources
- **Claim Validation**: Verify extracted claims against source documents

**Cost**: All GLM models are **free** (0 cost per token) on the Z.AI Coding Plan.

## Prerequisites

- Python 3.10+
- `httpx` library for async HTTP requests
- Access to the Z.AI Coding Plan (same as OpenCode)

## Quick Start

### 1. Set Up the Environment Variable

Add your Z.AI API token to the `.env` file in the project root:

```bash
# .env file
ZAI_API_TOKEN=your_token_here
```

### 2. Find Your Token

The token is shared with OpenCode. Check:

```bash
# View the OpenCode auth file
cat ~/.local/share/opencode/auth.json | jq '.["zai-coding-plan"]'
```

Copy this token to your `.env` file.

### 3. Basic Python Usage

```python
import os
import asyncio

import httpx
from dotenv import load_dotenv

# Load environment variables from .env
load_dotenv()

async def call_glm():
    api_url = "https://api.z.ai/api/coding/paas/v4/chat/completions"
    api_key = os.environ.get("ZAI_API_TOKEN")

    async with httpx.AsyncClient(timeout=60.0) as client:
        response = await client.post(
            api_url,
            headers={
                "Authorization": f"Bearer {api_key}",
                "Content-Type": "application/json",
            },
            json={
                "model": "glm-4.6",
                "messages": [
                    {"role": "user", "content": "Hello, GLM!"}
                ],
                "temperature": 0.3,
            },
        )
        response.raise_for_status()  # fail loudly on auth or quota errors
        result = response.json()
        print(result["choices"][0]["message"]["content"])

asyncio.run(call_glm())
```

## API Configuration

### Endpoint Details

| Property | Value |
|----------|-------|
| Base URL | `https://api.z.ai/api/coding/paas/v4` |
| Chat Endpoint | `/chat/completions` |
| Auth Method | Bearer token |
| Header | `Authorization: Bearer {token}` |

### Available Models

| Model | Speed | Quality | Use Case |
|-------|-------|---------|----------|
| `glm-4.6` | Medium | Highest | Complex reasoning, verification |
| `glm-4.5` | Medium | High | General tasks |
| `glm-4.5-air` | Fast | Good | High-volume processing |
| `glm-4.5-flash` | Fastest | Good | Quick responses |
| `glm-4.5v` | Medium | High | Vision/image tasks |

**Recommendation**: Use `glm-4.6` for entity verification and other complex tasks.
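The table above can be encoded as a small selection helper. This is a sketch: the task labels and the `choose_model` function are our own illustration, not part of the Z.AI API.

```python
# Illustrative model-selection helper based on the table above.
# The task labels are our own categories, not Z.AI API concepts.
TASK_TO_MODEL = {
    "verification": "glm-4.6",   # complex reasoning, highest quality
    "general": "glm-4.5",
    "bulk": "glm-4.5-air",       # high-volume processing
    "quick": "glm-4.5-flash",
    "vision": "glm-4.5v",
}

def choose_model(task: str) -> str:
    """Return the recommended model for a task, defaulting to glm-4.6."""
    return TASK_TO_MODEL.get(task, "glm-4.6")
```

Defaulting to `glm-4.6` keeps unknown tasks on the highest-quality model, matching the recommendation above.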

## Integration with CH-Annotator

When using GLM for entity recognition tasks, always reference the CH-Annotator convention:

### Heritage Institution Verification

````python
VERIFICATION_PROMPT = """You are a heritage institution classifier following CH-Annotator v1.7.0 convention.

## CH-Annotator GRP.HER Definition
Heritage institutions are organizations that:
- Collect, preserve, and provide access to cultural heritage materials
- Include: museums (GRP.HER.MUS), libraries (GRP.HER.LIB), archives (GRP.HER.ARC), galleries (GRP.HER.GAL)

## Entity Types That Are NOT Heritage Institutions
- Cities, towns, municipalities (places, not institutions)
- General businesses or companies
- People/individuals
- Events, festivals, exhibitions (temporary)

## Your Task
Analyze the entity and respond in JSON:
```json
{
  "is_heritage_institution": true/false,
  "subtype": "MUS|LIB|ARC|GAL|OTHER|null",
  "confidence": 0.95,
  "reasoning": "Brief explanation"
}
```
"""
````


### Entity Type Mapping

| CH-Annotator Type | GLAM Institution Type |
|-------------------|----------------------|
| GRP.HER.MUS | MUSEUM |
| GRP.HER.LIB | LIBRARY |
| GRP.HER.ARC | ARCHIVE |
| GRP.HER.GAL | GALLERY |
| GRP.HER.RES | RESEARCH_CENTER |
| GRP.HER.BOT | BOTANICAL_ZOO |
| GRP.HER.EDU | EDUCATION_PROVIDER |

## Complete Implementation Example

### Wikidata Verification Script

See `scripts/reenrich_wikidata_with_verification.py` for a complete example:

```python
import os
import re
import json
from typing import Any, Dict, List

import httpx

class GLMHeritageVerifier:
    """Verify Wikidata entities using GLM-4.6 and CH-Annotator."""
    
    API_URL = "https://api.z.ai/api/coding/paas/v4/chat/completions"
    
    def __init__(self, model: str = "glm-4.6"):
        self.api_key = os.environ.get("ZAI_API_TOKEN")
        if not self.api_key:
            raise ValueError("ZAI_API_TOKEN not found in environment")
        
        self.model = model
        self.client = httpx.AsyncClient(
            timeout=60.0,
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
            }
        )
    
    async def verify_heritage_institution(
        self,
        institution_name: str,
        wikidata_label: str,
        wikidata_description: str,
        instance_of_types: List[str],
    ) -> Dict[str, Any]:
        """Check if a Wikidata entity is a heritage institution."""
        
        prompt = f"""Analyze if this entity is a heritage institution (GRP.HER):

Institution Name: {institution_name}
Wikidata Label: {wikidata_label}
Description: {wikidata_description}
Instance Of: {', '.join(instance_of_types)}

Respond with JSON only."""

        response = await self.client.post(
            self.API_URL,
            json={
                "model": self.model,
                "messages": [
                    {"role": "system", "content": self.VERIFICATION_PROMPT},
                    {"role": "user", "content": prompt}
                ],
                "temperature": 0.1,
            }
        )
        
        result = response.json()
        content = result["choices"][0]["message"]["content"]
        
        # Parse JSON from response
        json_match = re.search(r'\{.*\}', content, re.DOTALL)
        if json_match:
            return json.loads(json_match.group())
        
        return {"is_heritage_institution": False, "error": "No JSON found"}
```

## Error Handling

### Common Errors

| Error Code | Meaning | Solution |
|------------|---------|----------|
| 401 | Unauthorized | Check `ZAI_API_TOKEN` |
| 403 | Forbidden/quota | Wrong endpoint (use Z.AI, not BigModel) |
| 429 | Rate limited | Add delays between requests |
| 500 | Server error | Retry with backoff |
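The table above can be turned into a small triage helper; the `error_action` function and its messages are our own sketch, not an API feature.

```python
def error_action(status_code: int) -> str:
    """Map an HTTP status from the GLM endpoint to a suggested action (per the table above)."""
    if status_code == 401:
        return "check ZAI_API_TOKEN"
    if status_code == 403:
        return "verify endpoint is api.z.ai, not open.bigmodel.cn"
    if status_code == 429:
        return "back off and retry later"
    if status_code >= 500:
        return "retry with exponential backoff"
    return "ok" if status_code < 400 else "inspect response body"
```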

### Retry Pattern

```python
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10),
)
async def call_with_retry(client, messages):
    # API_URL is the chat endpoint defined above
    response = await client.post(API_URL, json={"model": "glm-4.6", "messages": messages})
    response.raise_for_status()  # raise on HTTP errors so tenacity retries
    return response.json()
```

## JSON Parsing

LLM responses may contain text around the JSON. Always parse safely:

```python
import re
import json

def parse_json_from_response(content: str) -> dict:
    """Extract JSON from LLM response text."""
    # Try to find a fenced ```json block first
    json_match = re.search(r'```json\s*(\{.*?\})\s*```', content, re.DOTALL)
    if json_match:
        return json.loads(json_match.group(1))

    # Fall back to a bare JSON object
    json_match = re.search(r'\{.*\}', content, re.DOTALL)
    if json_match:
        return json.loads(json_match.group())

    return {"error": "No JSON found in response"}
```

## Best Practices

### 1. Use Low Temperature for Verification

```python
{
    "temperature": 0.1  # low for consistent, near-deterministic responses
}
```

### 2. Request JSON Output

Always request JSON format in your prompts for structured responses:

````
Respond in JSON format only:
```json
{"key": "value"}
```
````

### 3. Batch Processing

Process multiple entities with rate limiting:

```python
import asyncio
from typing import Any, Dict, List

async def batch_verify(verifier, entities: List[dict], rate_limit: float = 0.5) -> List[Dict[str, Any]]:
    """Verify entities sequentially, pausing between requests."""
    results = []
    for entity in entities:
        # Assumes each entity dict's keys match verify_heritage_institution's parameters
        result = await verifier.verify_heritage_institution(**entity)
        results.append(result)
        await asyncio.sleep(rate_limit)  # respect rate limits
    return results
```

### 4. Always Reference CH-Annotator

For entity recognition tasks, include CH-Annotator context:

```python
system_prompt = """You are following CH-Annotator v1.7.0 convention.
Heritage institutions are type GRP.HER with subtypes for museums, libraries, archives, and galleries.
"""
```

## Related Files

| Script | Purpose |
|--------|---------|
| `scripts/reenrich_wikidata_with_verification.py` | Wikidata entity verification |

- **Agent Rules**: `AGENTS.md` (Rule 11: Z.AI GLM API)
- **Agent Config**: `.opencode/ZAI_GLM_API_RULES.md`
- **CH-Annotator**: `.opencode/CH_ANNOTATOR_CONVENTION.md`
- **Entity Annotation**: `data/entity_annotation/ch_annotator-v1_7_0.yaml`

## Troubleshooting

### "Quota exceeded" Error

**Symptom**: 403 error with a "quota exceeded" message

**Cause**: Using the wrong API endpoint (`open.bigmodel.cn` instead of `api.z.ai`)

**Solution**: Update the API URL to `https://api.z.ai/api/coding/paas/v4/chat/completions`

### "Token not found" Error

**Symptom**: `ValueError` about missing `ZAI_API_TOKEN`

**Solution**:

1. Check `~/.local/share/opencode/auth.json` for the token
2. Add it to `.env` as `ZAI_API_TOKEN=your_token`
3. Ensure `load_dotenv()` is called before accessing the environment

### JSON Parsing Failures

**Symptom**: The LLM returns text that can't be parsed as JSON

**Solution**: Use the `parse_json_from_response()` helper with fallback handling


*Last Updated: 2025-12-08*