glam/.opencode/ZAI_GLM_API_RULES.md

Z.AI GLM API Rules for AI Agents

Last Updated: 2025-12-08
Status: MANDATORY for all LLM API calls in scripts


CRITICAL: Use Z.AI Coding Plan, NOT BigModel API

This project uses the Z.AI Coding Plan endpoint, which is the SAME endpoint that OpenCode uses internally.

The regular BigModel API (open.bigmodel.cn) will NOT work with the tokens stored in this project. You MUST use the Z.AI Coding Plan endpoint.


API Configuration

Correct Endpoint

| Property | Value |
| --- | --- |
| API URL | https://api.z.ai/api/coding/paas/v4/chat/completions |
| Auth Header | Authorization: Bearer {ZAI_API_TOKEN} |
| Content-Type | application/json |
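
As a minimal sketch of how these three properties combine into one request, the following builds (but does not send) a POST using only the standard library. The helper name `build_request` is hypothetical, and `urllib` is used here purely for illustration; the project's own client is based on `httpx`.

```python
import json
import urllib.request

API_URL = "https://api.z.ai/api/coding/paas/v4/chat/completions"


def build_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Coding Plan endpoint."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; keeping construction separate makes the headers easy to inspect before any network call is made.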

Available Models

| Model | Description | Cost |
| --- | --- | --- |
| glm-4.5 | Standard GLM-4.5 | Free (0 per token) |
| glm-4.5-air | GLM-4.5 Air variant | Free |
| glm-4.5-flash | Fast GLM-4.5 | Free |
| glm-4.5v | Vision-capable GLM-4.5 | Free |
| glm-4.6 | Latest GLM-4.6 (recommended) | Free |

Recommended Model: glm-4.6 for best quality
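
A script can guard against typos in the model name by checking it against the table. This is a small sketch; `resolve_model` is a hypothetical helper, and the tuple simply copies the identifiers listed above.

```python
# Model identifiers copied from the "Available Models" table.
AVAILABLE_MODELS = ("glm-4.5", "glm-4.5-air", "glm-4.5-flash", "glm-4.5v", "glm-4.6")
DEFAULT_MODEL = "glm-4.6"


def resolve_model(name=None):
    """Return a listed model name, falling back to the recommended default."""
    if name is None:
        return DEFAULT_MODEL
    if name not in AVAILABLE_MODELS:
        raise ValueError(f"Unknown model {name!r}; expected one of {AVAILABLE_MODELS}")
    return name
```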


Authentication

Token Location

The Z.AI API token can be obtained from two locations:

  1. Environment Variable (preferred for scripts):

    # In .env file at project root
    ZAI_API_TOKEN=your_token_here
    
  2. OpenCode Auth File (reference only):

    ~/.local/share/opencode/auth.json
    

    The token is stored under key zai-coding-plan.
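
Because scripts read the token from the environment, the `.env` file has to be loaded first. The sketch below is a deliberately small, dependency-free loader; `load_env_file` is a hypothetical name, it handles only plain `KEY=VALUE` lines, and a library such as python-dotenv is the more usual choice in practice.

```python
import os


def load_env_file(path=".env", environ=None):
    """Copy KEY=VALUE pairs from a .env file into environ without overriding existing keys."""
    environ = os.environ if environ is None else environ
    try:
        with open(path, encoding="utf-8") as handle:
            lines = handle.read().splitlines()
    except FileNotFoundError:
        return environ  # no .env file: leave the environment untouched
    for line in lines:
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        environ.setdefault(key.strip(), value.strip().strip("'\""))
    return environ
```

Calling `load_env_file()` once at the top of a script, before `os.environ.get("ZAI_API_TOKEN")`, is enough for the simple case shown above.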

Getting the Token

If you need to set up the token:

  1. The token is shared with OpenCode's Z.AI Coding Plan, so no separate token is needed
  2. Check ~/.local/share/opencode/auth.json for an existing token
  3. Add it to the .env file as ZAI_API_TOKEN
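
During setup, the lookup in these steps could be sketched as follows. `find_existing_token` is a hypothetical helper intended for setup checks only; scripts should still read `ZAI_API_TOKEN` from the environment. The inner layout of the `zai-coding-plan` entry is an assumption: the code accepts either a bare string or an object with a `key` field, since this document names only the top-level key.

```python
import json
import os
from pathlib import Path

AUTH_FILE = Path.home() / ".local/share/opencode/auth.json"


def find_existing_token(environ=None, auth_file=AUTH_FILE):
    """Return the Z.AI token from the environment, else from the OpenCode auth file."""
    environ = os.environ if environ is None else environ
    token = environ.get("ZAI_API_TOKEN")
    if token:
        return token
    try:
        data = json.loads(Path(auth_file).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None  # file missing, unreadable, or not valid JSON
    if not isinstance(data, dict):
        return None
    entry = data.get("zai-coding-plan")
    # ASSUMPTION: the entry is either the token string itself or an object
    # that stores it under "key".
    if isinstance(entry, dict):
        entry = entry.get("key")
    return entry if isinstance(entry, str) and entry else None
```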

Python Implementation

Correct Implementation

import os
import httpx

class GLMClient:
    """Client for Z.AI GLM API (Coding Plan endpoint)."""
    
    # CORRECT endpoint - Z.AI Coding Plan
    API_URL = "https://api.z.ai/api/coding/paas/v4/chat/completions"
    
    def __init__(self, model: str = "glm-4.6"):
        self.api_key = os.environ.get("ZAI_API_TOKEN")
        if not self.api_key:
            raise ValueError("ZAI_API_TOKEN not found in environment")
        
        self.model = model
        self.client = httpx.AsyncClient(
            timeout=60.0,
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
            }
        )
    
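    async def chat(self, messages: list) -> dict:
        """Send chat completion request and return the parsed JSON body."""
        response = await self.client.post(
            self.API_URL,
            json={
                "model": self.model,
                "messages": messages,
                "temperature": 0.3,
            }
        )
        # Raise httpx.HTTPStatusError on 4xx/5xx so callers can handle or retry.
        response.raise_for_status()
        return response.json()

    async def aclose(self) -> None:
        """Close the underlying HTTP connection pool when the client is no longer needed."""
        await self.client.aclose()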

WRONG Implementation (DO NOT USE)

# WRONG - This endpoint will fail with quota errors
API_URL = "https://open.bigmodel.cn/api/paas/v4/chat/completions"

# WRONG - This is for regular BigModel API, not Z.AI Coding Plan
api_key = os.environ.get("ZHIPU_API_KEY")

Request Format

Chat Completion Request

{
  "model": "glm-4.6",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Your prompt here"
    }
  ],
  "temperature": 0.3,
  "max_tokens": 4096
}
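
A request body of this shape can be assembled with a small helper. This is a sketch only: `build_payload` is a hypothetical name, and its defaults mirror the example values above rather than any documented API defaults.

```python
def build_payload(user_prompt, system_prompt="You are a helpful assistant.",
                  model="glm-4.6", temperature=0.3, max_tokens=4096):
    """Return a chat completion request body shaped like the example above."""
    if not user_prompt.strip():
        raise ValueError("user_prompt must not be empty")
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
```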

Response Format

{
  "id": "request-id",
  "created": 1733651234,
  "model": "glm-4.6",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Response text here"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 100,
    "completion_tokens": 50,
    "total_tokens": 150
  }
}
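
Reading the assistant text out of a response like this one could look as follows. `extract_reply` is a hypothetical helper; it assumes only the fields shown in the example and raises if no choice is present.

```python
def extract_reply(response: dict):
    """Return (content, total_tokens) from a chat completion response dict."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    content = choices[0].get("message", {}).get("content", "")
    # "usage" may be absent; report None rather than guessing a count.
    total_tokens = response.get("usage", {}).get("total_tokens")
    return content, total_tokens
```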

Error Handling

Common Errors

| Error | Cause | Solution |
| --- | --- | --- |
| 401 Unauthorized | Invalid or missing token | Check ZAI_API_TOKEN in .env |
| 403 Quota exceeded | Wrong endpoint (BigModel) | Use Z.AI Coding Plan endpoint |
| 429 Rate limited | Too many requests | Add delay between requests |
| 500 Server error | API issue | Retry with exponential backoff |
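
The table can be turned into a small lookup so that scripts log a useful hint and retry only when retrying can help. Both function names are hypothetical, and treating every 5xx status like 500 is an assumption that goes slightly beyond the table.

```python
def action_for_status(status: int) -> str:
    """Map an HTTP status code to the remedy listed in the error table."""
    if status == 401:
        return "check ZAI_API_TOKEN in .env"
    if status == 403:
        return "use the Z.AI Coding Plan endpoint"
    if status == 429:
        return "add delay between requests"
    if status >= 500:  # ASSUMPTION: all 5xx codes are handled like 500
        return "retry with exponential backoff"
    return "no action"


def is_retryable(status: int) -> bool:
    """Only rate limits and server errors are worth retrying."""
    return status == 429 or status >= 500
```

A 401 or 403 points at configuration (token or endpoint), so repeating the same request would not be expected to succeed.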

Retry Strategy

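import httpx
from tenacity import retry, retry_if_exception, stop_after_attempt, wait_exponential

def _is_retryable(exc: BaseException) -> bool:
    """Retry on 429/5xx and transport errors; 401/403 need a config fix, not a retry."""
    if isinstance(exc, httpx.HTTPStatusError):
        return exc.response.status_code == 429 or exc.response.status_code >= 500
    return isinstance(exc, httpx.TransportError)

@retry(
    retry=retry_if_exception(_is_retryable),
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10),
    reraise=True,  # surface the original exception instead of tenacity.RetryError
)
async def call_api_with_retry(client, messages):
    return await client.chat(messages)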
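
For environments without tenacity, the same idea can be sketched with the standard library alone. This is not tenacity's implementation: `retry_async` is a hypothetical helper, and its delay schedule (start at `base_delay`, double after each failure, cap at `max_delay`) is a choice made for this example.

```python
import asyncio


async def retry_async(func, attempts=3, base_delay=2.0, max_delay=10.0, sleep=asyncio.sleep):
    """Await func() up to `attempts` times, doubling the delay after each failure."""
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return await func()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            await sleep(min(delay, max_delay))
            delay *= 2
```

A call might look like `await retry_async(lambda: client.chat(messages))`. The `sleep` parameter exists so that tests can substitute a stub instead of waiting in real time.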

Integration with CH-Annotator

When using GLM for entity recognition or verification, always reference CH-Annotator v1.7.0:

PROMPT = """You are a heritage institution classifier following CH-Annotator v1.7.0 convention.

## CH-Annotator GRP.HER Definition
Heritage institutions are organizations that:
- Collect, preserve, and provide access to cultural heritage materials
- Include: museums (GRP.HER.MUS), libraries (GRP.HER.LIB), archives (GRP.HER.ARC), galleries (GRP.HER.GAL)

## Entity to Analyze
...
"""

See .opencode/CH_ANNOTATOR_CONVENTION.md for full convention details.
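
When the model is asked to answer with one of these type codes, the reply can be checked against the codes named in the prompt. This is a sketch under assumptions: `extract_heritage_code` is a hypothetical helper, and the set below contains only the four codes quoted in this file, while the full convention may define more.

```python
import re

# Codes quoted in the prompt in this file; the full convention may list others.
HERITAGE_CODES = {"GRP.HER.MUS", "GRP.HER.LIB", "GRP.HER.ARC", "GRP.HER.GAL"}
_CODE_PATTERN = re.compile(r"GRP\.HER\.[A-Z]{3}")


def extract_heritage_code(reply: str):
    """Return the first known GRP.HER.* code found in a model reply, else None."""
    for match in _CODE_PATTERN.findall(reply):
        if match in HERITAGE_CODES:
            return match
    return None
```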


Scripts Using GLM API

The following scripts use the Z.AI GLM API:

| Script | Purpose |
| --- | --- |
| scripts/reenrich_wikidata_with_verification.py | Wikidata entity verification using GLM-4.6 |

When creating new scripts that need LLM capabilities, follow the GLMClient pattern shown under Python Implementation.


Environment Setup Checklist

When setting up a new environment:

  • Check ~/.local/share/opencode/auth.json for existing Z.AI token
  • Add ZAI_API_TOKEN to .env file
  • Verify endpoint is https://api.z.ai/api/coding/paas/v4/chat/completions
  • Test with glm-4.6 model
  • Reference CH-Annotator v1.7.0 for entity recognition tasks
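
Two of the checklist items can be verified mechanically before a script makes its first call. The sketch below uses a hypothetical `check_environment` helper and covers only the token and endpoint checks; it does not contact the API.

```python
import os

EXPECTED_URL = "https://api.z.ai/api/coding/paas/v4/chat/completions"


def check_environment(api_url, environ=None):
    """Return a list of problems found; an empty list means both checks passed."""
    environ = os.environ if environ is None else environ
    problems = []
    if not environ.get("ZAI_API_TOKEN"):
        problems.append("ZAI_API_TOKEN is not set")
    if api_url != EXPECTED_URL:
        problems.append(f"unexpected endpoint: {api_url}")
    return problems
```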

AI Agent Rules

DO

  • Use https://api.z.ai/api/coding/paas/v4/chat/completions endpoint
  • Get token from ZAI_API_TOKEN environment variable
  • Use glm-4.6 as the default model
  • Reference CH-Annotator v1.7.0 for entity tasks
  • Add retry logic with exponential backoff
  • Handle JSON parsing errors gracefully
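
For the last point, one possible approach is to tolerate a Markdown code fence around the JSON and fall back to a default instead of raising. `parse_json_reply` is a hypothetical helper, and the assumption that replies may arrive wrapped in a fence is a common pattern with chat models generally, not something this document states about GLM.

```python
import json
import re

_TICKS = "`" * 3  # a Markdown code fence marker
_FENCE = re.compile(_TICKS + r"(?:json)?\s*(.*?)" + _TICKS, re.DOTALL)


def parse_json_reply(text, default=None):
    """Parse JSON from model output, tolerating a Markdown code fence around it."""
    match = _FENCE.search(text)
    candidate = match.group(1) if match else text
    try:
        return json.loads(candidate.strip())
    except json.JSONDecodeError:
        return default  # caller decides how to treat an unparseable reply
```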

DO NOT

  • Use open.bigmodel.cn endpoint (wrong API)
  • Use ZHIPU_API_KEY environment variable (wrong key)
  • Hard-code API tokens in scripts
  • Skip error handling for API calls
  • Forget to load .env file before accessing environment
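
The first two prohibitions are simple string patterns, so a script's source can be scanned for them before it is committed. This is a rough sketch with a hypothetical `find_violations` helper; it does plain substring matching and will also flag mentions inside comments.

```python
FORBIDDEN = {
    "open.bigmodel.cn": "wrong endpoint (BigModel API)",
    "ZHIPU_API_KEY": "wrong environment variable",
}


def find_violations(source: str):
    """Return (line_number, reason) pairs for lines containing a forbidden string."""
    violations = []
    for number, line in enumerate(source.splitlines(), start=1):
        for needle, reason in FORBIDDEN.items():
            if needle in line:
                violations.append((number, reason))
    return violations
```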

Related Files

  • CH-Annotator Convention: .opencode/CH_ANNOTATOR_CONVENTION.md
  • Entity Annotation: data/entity_annotation/ch_annotator-v1_7_0.yaml
  • Wikidata Enrichment Script: scripts/reenrich_wikidata_with_verification.py

Version History

| Date | Change |
| --- | --- |
| 2025-12-08 | Initial documentation - Fixed API endpoint discovery |