# Z.AI GLM API Rules for AI Agents

**Last Updated:** 2025-12-08
**Status:** MANDATORY for all LLM API calls in scripts
## CRITICAL: Use Z.AI Coding Plan, NOT BigModel API

This project uses the Z.AI Coding Plan endpoint, which is the same endpoint that OpenCode uses internally. The regular BigModel API (`open.bigmodel.cn`) will NOT work with the tokens stored in this project. You MUST use the Z.AI Coding Plan endpoint.
## API Configuration

### Correct Endpoint

| Property | Value |
|---|---|
| API URL | `https://api.z.ai/api/coding/paas/v4/chat/completions` |
| Auth Header | `Authorization: Bearer {ZAI_API_TOKEN}` |
| Content-Type | `application/json` |
### Available Models

| Model | Description | Cost |
|---|---|---|
| `glm-4.5` | Standard GLM-4.5 | Free (0 per token) |
| `glm-4.5-air` | GLM-4.5 Air variant | Free |
| `glm-4.5-flash` | Fast GLM-4.5 | Free |
| `glm-4.5v` | Vision-capable GLM-4.5 | Free |
| `glm-4.6` | Latest GLM-4.6 (recommended) | Free |

**Recommended model:** `glm-4.6` for best quality.
## Authentication

### Token Location

The Z.AI API token can be obtained from two locations:

1. **Environment variable** (preferred for scripts):

   ```bash
   # In .env file at project root
   ZAI_API_TOKEN=your_token_here
   ```

2. **OpenCode auth file** (reference only): `~/.local/share/opencode/auth.json`. The token is stored under the key `zai-coding-plan`.
### Getting the Token

If you need to set up the token:

- The token is shared with OpenCode's Z.AI Coding Plan
- Check `~/.local/share/opencode/auth.json` for an existing token
- Add it to the `.env` file as `ZAI_API_TOKEN`
## Python Implementation

### Correct Implementation

```python
import os

import httpx


class GLMClient:
    """Client for Z.AI GLM API (Coding Plan endpoint)."""

    # CORRECT endpoint - Z.AI Coding Plan
    API_URL = "https://api.z.ai/api/coding/paas/v4/chat/completions"

    def __init__(self, model: str = "glm-4.6"):
        self.api_key = os.environ.get("ZAI_API_TOKEN")
        if not self.api_key:
            raise ValueError("ZAI_API_TOKEN not found in environment")
        self.model = model
        self.client = httpx.AsyncClient(
            timeout=60.0,
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
            },
        )

    async def chat(self, messages: list) -> dict:
        """Send a chat completion request and return the parsed JSON response."""
        response = await self.client.post(
            self.API_URL,
            json={
                "model": self.model,
                "messages": messages,
                "temperature": 0.3,
            },
        )
        response.raise_for_status()
        return response.json()
```
### WRONG Implementation (DO NOT USE)

```python
# WRONG - This endpoint will fail with quota errors
API_URL = "https://open.bigmodel.cn/api/paas/v4/chat/completions"

# WRONG - This is for the regular BigModel API, not Z.AI Coding Plan
api_key = os.environ.get("ZHIPU_API_KEY")
```
## Request Format

### Chat Completion Request

```json
{
  "model": "glm-4.6",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Your prompt here"
    }
  ],
  "temperature": 0.3,
  "max_tokens": 4096
}
```
### Response Format

```json
{
  "id": "request-id",
  "created": 1733651234,
  "model": "glm-4.6",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Response text here"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 100,
    "completion_tokens": 50,
    "total_tokens": 150
  }
}
```
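A small helper can pull the assistant text and token usage out of this structure, raising a clear error when the shape is unexpected. A minimal sketch against the response format shown above:

```python
def extract_reply(response: dict) -> tuple[str, int]:
    """Return (assistant_text, total_tokens) from a chat completion response."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError(f"No choices in response: {response!r}")
    content = choices[0].get("message", {}).get("content", "")
    total_tokens = response.get("usage", {}).get("total_tokens", 0)
    return content, total_tokens
```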
## Error Handling

### Common Errors

| Error | Cause | Solution |
|---|---|---|
| `401 Unauthorized` | Invalid or missing token | Check `ZAI_API_TOKEN` in `.env` |
| `403 Quota exceeded` | Wrong endpoint (BigModel) | Use the Z.AI Coding Plan endpoint |
| `429 Rate limited` | Too many requests | Add a delay between requests |
| `500 Server error` | API issue | Retry with exponential backoff |
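The table above can be encoded as a status-to-action lookup so callers decide uniformly between retrying and failing fast. A sketch; the action strings are chosen here for illustration:

```python
def action_for_status(status: int) -> str:
    """Map an HTTP status code to a handling action per the common-errors table."""
    if status == 401:
        return "fail: check ZAI_API_TOKEN in .env"
    if status == 403:
        return "fail: wrong endpoint (BigModel), use the Z.AI Coding Plan URL"
    if status == 429:
        return "retry: add a delay between requests"
    if status >= 500:
        return "retry: exponential backoff"
    return "fail: unexpected status"
```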
### Retry Strategy

```python
from tenacity import retry, stop_after_attempt, wait_exponential


@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10),
)
async def call_api_with_retry(client, messages):
    return await client.chat(messages)
```
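If `tenacity` is not available in a script's environment, the same policy can be hand-rolled with `asyncio.sleep`. A sketch roughly equivalent to the decorator above (3 attempts, capped exponential wait):

```python
import asyncio


async def call_with_retry(coro_fn, *args, attempts: int = 3, base: float = 2.0, cap: float = 10.0):
    """Await coro_fn(*args), retrying on any exception with capped exponential backoff."""
    for attempt in range(attempts):
        try:
            return await coro_fn(*args)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(min(base * 2 ** attempt, cap))
```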
## Integration with CH-Annotator

When using GLM for entity recognition or verification, always reference CH-Annotator v1.7.0:

```python
PROMPT = """You are a heritage institution classifier following CH-Annotator v1.7.0 convention.

## CH-Annotator GRP.HER Definition
Heritage institutions are organizations that:
- Collect, preserve, and provide access to cultural heritage materials
- Include: museums (GRP.HER.MUS), libraries (GRP.HER.LIB), archives (GRP.HER.ARC), galleries (GRP.HER.GAL)

## Entity to Analyze
...
"""
```

See `.opencode/CH_ANNOTATOR_CONVENTION.md` for full convention details.
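Scripts typically fill the `## Entity to Analyze` section at call time. A hypothetical helper; `build_prompt` and its `Name`/`Description` fields are illustrative, not part of the convention:

```python
PROMPT_TEMPLATE = """You are a heritage institution classifier following CH-Annotator v1.7.0 convention.

## CH-Annotator GRP.HER Definition
Heritage institutions are organizations that:
- Collect, preserve, and provide access to cultural heritage materials
- Include: museums (GRP.HER.MUS), libraries (GRP.HER.LIB), archives (GRP.HER.ARC), galleries (GRP.HER.GAL)

## Entity to Analyze
Name: {name}
Description: {description}
"""


def build_prompt(name: str, description: str) -> str:
    """Fill the entity section of the classification prompt."""
    return PROMPT_TEMPLATE.format(name=name, description=description)
```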
## Scripts Using GLM API

The following scripts use the Z.AI GLM API:

| Script | Purpose |
|---|---|
| `scripts/reenrich_wikidata_with_verification.py` | Wikidata entity verification using GLM-4.6 |

When creating new scripts that need LLM capabilities, follow this pattern.
## Environment Setup Checklist

When setting up a new environment:

- Check `~/.local/share/opencode/auth.json` for an existing Z.AI token
- Add `ZAI_API_TOKEN` to the `.env` file
- Verify the endpoint is `https://api.z.ai/api/coding/paas/v4/chat/completions`
- Test with the `glm-4.6` model
- Reference CH-Annotator v1.7.0 for entity recognition tasks
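Parts of this checklist can be automated. A sketch that reports missing setup steps given an environment mapping (pure function so it is easy to test; the `ZAI_API_URL` override variable is hypothetical, not used by existing scripts):

```python
ZAI_ENDPOINT = "https://api.z.ai/api/coding/paas/v4/chat/completions"


def check_environment(env: dict) -> list[str]:
    """Return human-readable setup problems; an empty list means the checks passed."""
    problems = []
    if not env.get("ZAI_API_TOKEN"):
        problems.append("ZAI_API_TOKEN is not set; add it to the .env file")
    url = env.get("ZAI_API_URL", ZAI_ENDPOINT)
    if "open.bigmodel.cn" in url:
        problems.append("endpoint points at BigModel; use the Z.AI Coding Plan URL")
    return problems
```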
## AI Agent Rules

### DO

- Use the `https://api.z.ai/api/coding/paas/v4/chat/completions` endpoint
- Get the token from the `ZAI_API_TOKEN` environment variable
- Use `glm-4.6` as the default model
- Reference CH-Annotator v1.7.0 for entity tasks
- Add retry logic with exponential backoff
- Handle JSON parsing errors gracefully
### DO NOT

- Use the `open.bigmodel.cn` endpoint (wrong API)
- Use the `ZHIPU_API_KEY` environment variable (wrong key)
- Hard-code API tokens in scripts
- Skip error handling for API calls
- Forget to load the `.env` file before accessing the environment
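"Handle JSON parsing errors gracefully" matters because models often wrap JSON answers in Markdown fences or surrounding prose. A minimal sketch of a tolerant parser (an assumption about typical reply shapes, not a documented GLM behavior):

```python
import json
import re


def parse_model_json(text: str):
    """Parse JSON from a model reply, tolerating ```json fences and surrounding prose."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Fall back to the first fenced block, then to the outermost brace-delimited span.
    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    candidate = fenced.group(1) if fenced else None
    if candidate is None:
        start, end = text.find("{"), text.rfind("}")
        if start != -1 and end > start:
            candidate = text[start:end + 1]
    if candidate is None:
        raise ValueError(f"No JSON found in model reply: {text[:80]!r}")
    return json.loads(candidate)
```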
## Related Documentation

- CH-Annotator Convention: `.opencode/CH_ANNOTATOR_CONVENTION.md`
- Entity Annotation: `data/entity_annotation/ch_annotator-v1_7_0.yaml`
- Wikidata Enrichment Script: `scripts/reenrich_wikidata_with_verification.py`
## Version History

| Date | Change |
|---|---|
| 2025-12-08 | Initial documentation - Fixed API endpoint discovery |