
# GLM API Setup Guide

This guide explains how to configure and use the GLM-4 language model for entity recognition, verification, and enrichment tasks in the GLAM project.

## Overview

The GLAM project uses GLM-4.6 via the Z.AI Coding Plan endpoint for LLM-powered tasks such as:

- **Entity Verification**: Verify that Wikidata entities are heritage institutions
- **Description Enrichment**: Generate rich descriptions from multiple data sources
- **Entity Resolution**: Match institution names across different data sources
- **Claim Validation**: Verify extracted claims against source documents

**Cost**: All GLM models are **free** (0 cost per token) on the Z.AI Coding Plan.

## Prerequisites

- Python 3.10+
- `httpx` library for async HTTP requests
- Access to the Z.AI Coding Plan (same as OpenCode)

## Quick Start

### 1. Set Up the Environment Variable

Add your Z.AI API token to the `.env` file in the project root:

```bash
# .env file
ZAI_API_TOKEN=your_token_here
```

### 2. Find Your Token

The token is shared with OpenCode. Check:

```bash
# View the OpenCode auth file
cat ~/.local/share/opencode/auth.json | jq '.["zai-coding-plan"]'
```

Copy this token to your `.env` file.

### 3. Basic Python Usage

```python
import os
import asyncio

import httpx
from dotenv import load_dotenv

# Load environment variables from .env
load_dotenv()

async def call_glm():
    api_url = "https://api.z.ai/api/coding/paas/v4/chat/completions"
    api_key = os.environ.get("ZAI_API_TOKEN")

    async with httpx.AsyncClient(timeout=60.0) as client:
        response = await client.post(
            api_url,
            headers={
                "Authorization": f"Bearer {api_key}",
                "Content-Type": "application/json",
            },
            json={
                "model": "glm-4.6",
                "messages": [
                    {"role": "user", "content": "Hello, GLM!"}
                ],
                "temperature": 0.3,
            },
        )
        response.raise_for_status()  # fail loudly on auth or quota errors
        result = response.json()
        print(result["choices"][0]["message"]["content"])

asyncio.run(call_glm())
```

## API Configuration

### Endpoint Details

| Property | Value |
|----------|-------|
| Base URL | `https://api.z.ai/api/coding/paas/v4` |
| Chat Endpoint | `/chat/completions` |
| Auth Method | Bearer token |
| Header | `Authorization: Bearer {token}` |

### Available Models

| Model | Speed | Quality | Use Case |
|-------|-------|---------|----------|
| `glm-4.6` | Medium | Highest | Complex reasoning, verification |
| `glm-4.5` | Medium | High | General tasks |
| `glm-4.5-air` | Fast | Good | High-volume processing |
| `glm-4.5-flash` | Fastest | Good | Quick responses |
| `glm-4.5v` | Medium | High | Vision/image tasks |

**Recommendation**: Use `glm-4.6` for entity verification and other complex tasks.
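The table above can be encoded as a small selection helper. This is a sketch: the task labels and the `choose_model` function are our own illustration, not part of the Z.AI API.

```python
# Illustrative model-selection helper based on the table above.
# The task labels are our own categories, not Z.AI API concepts.
TASK_TO_MODEL = {
    "verification": "glm-4.6",   # complex reasoning, highest quality
    "general": "glm-4.5",
    "bulk": "glm-4.5-air",       # high-volume processing
    "quick": "glm-4.5-flash",
    "vision": "glm-4.5v",
}

def choose_model(task: str) -> str:
    """Return the recommended model for a task, defaulting to glm-4.6."""
    return TASK_TO_MODEL.get(task, "glm-4.6")
```

Defaulting to `glm-4.6` keeps unknown tasks on the highest-quality model, matching the recommendation above.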

## Integration with CH-Annotator

When using GLM for entity recognition tasks, always reference the CH-Annotator convention:

### Heritage Institution Verification

````python
VERIFICATION_PROMPT = """You are a heritage institution classifier following CH-Annotator v1.7.0 convention.

## CH-Annotator GRP.HER Definition
Heritage institutions are organizations that:
- Collect, preserve, and provide access to cultural heritage materials
- Include: museums (GRP.HER.MUS), libraries (GRP.HER.LIB), archives (GRP.HER.ARC), galleries (GRP.HER.GAL)

## Entity Types That Are NOT Heritage Institutions
- Cities, towns, municipalities (places, not institutions)
- General businesses or companies
- People/individuals
- Events, festivals, exhibitions (temporary)

## Your Task
Analyze the entity and respond in JSON:
```json
{
  "is_heritage_institution": true/false,
  "subtype": "MUS|LIB|ARC|GAL|OTHER|null",
  "confidence": 0.95,
  "reasoning": "Brief explanation"
}
```
"""
````


### Entity Type Mapping

| CH-Annotator Type | GLAM Institution Type |
|-------------------|----------------------|
| GRP.HER.MUS | MUSEUM |
| GRP.HER.LIB | LIBRARY |
| GRP.HER.ARC | ARCHIVE |
| GRP.HER.GAL | GALLERY |
| GRP.HER.RES | RESEARCH_CENTER |
| GRP.HER.BOT | BOTANICAL_ZOO |
| GRP.HER.EDU | EDUCATION_PROVIDER |

## Complete Implementation Example

### Wikidata Verification Script

See `scripts/reenrich_wikidata_with_verification.py` for a complete example:

```python
import os
import re
import json
from typing import Any, Dict, List

import httpx

class GLMHeritageVerifier:
    """Verify Wikidata entities using GLM-4.6 and CH-Annotator."""
    
    API_URL = "https://api.z.ai/api/coding/paas/v4/chat/completions"
    
    def __init__(self, model: str = "glm-4.6"):
        self.api_key = os.environ.get("ZAI_API_TOKEN")
        if not self.api_key:
            raise ValueError("ZAI_API_TOKEN not found in environment")
        
        self.model = model
        self.client = httpx.AsyncClient(
            timeout=60.0,
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
            }
        )
    
    async def verify_heritage_institution(
        self,
        institution_name: str,
        wikidata_label: str,
        wikidata_description: str,
        instance_of_types: List[str],
    ) -> Dict[str, Any]:
        """Check if a Wikidata entity is a heritage institution."""
        
        prompt = f"""Analyze if this entity is a heritage institution (GRP.HER):

Institution Name: {institution_name}
Wikidata Label: {wikidata_label}
Description: {wikidata_description}
Instance Of: {', '.join(instance_of_types)}

Respond with JSON only."""

        response = await self.client.post(
            self.API_URL,
            json={
                "model": self.model,
                "messages": [
                    {"role": "system", "content": self.VERIFICATION_PROMPT},
                    {"role": "user", "content": prompt}
                ],
                "temperature": 0.1,
            }
        )
        
        result = response.json()
        content = result["choices"][0]["message"]["content"]
        
        # Parse JSON from response
        json_match = re.search(r'\{.*\}', content, re.DOTALL)
        if json_match:
            return json.loads(json_match.group())
        
        return {"is_heritage_institution": False, "error": "No JSON found"}
```

## Error Handling

### Common Errors

| Error Code | Meaning | Solution |
|------------|---------|----------|
| 401 | Unauthorized | Check `ZAI_API_TOKEN` |
| 403 | Forbidden/quota | Wrong endpoint (use Z.AI, not BigModel) |
| 429 | Rate limited | Add delays between requests |
| 500 | Server error | Retry with backoff |
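The table above can be turned into a small triage helper; the `error_action` function and its messages are our own sketch, not an API feature.

```python
def error_action(status_code: int) -> str:
    """Map an HTTP status from the GLM endpoint to a suggested action (per the table above)."""
    if status_code == 401:
        return "check ZAI_API_TOKEN"
    if status_code == 403:
        return "verify endpoint is api.z.ai, not open.bigmodel.cn"
    if status_code == 429:
        return "back off and retry later"
    if status_code >= 500:
        return "retry with exponential backoff"
    return "ok" if status_code < 400 else "inspect response body"
```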

### Retry Pattern

```python
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10),
)
async def call_with_retry(client, messages):
    # API_URL is the chat endpoint defined above
    response = await client.post(API_URL, json={"model": "glm-4.6", "messages": messages})
    response.raise_for_status()  # raise on HTTP errors so tenacity retries
    return response.json()
```

## JSON Parsing

LLM responses may contain text around the JSON. Always parse safely:

```python
import re
import json

def parse_json_from_response(content: str) -> dict:
    """Extract JSON from LLM response text."""
    # Try to find a fenced ```json block first
    json_match = re.search(r'```json\s*(\{.*?\})\s*```', content, re.DOTALL)
    if json_match:
        return json.loads(json_match.group(1))

    # Fall back to a bare JSON object
    json_match = re.search(r'\{.*\}', content, re.DOTALL)
    if json_match:
        return json.loads(json_match.group())

    return {"error": "No JSON found in response"}
```

## Best Practices

### 1. Use Low Temperature for Verification

```python
{
    "temperature": 0.1  # low for consistent, near-deterministic responses
}
```

### 2. Request JSON Output

Always request JSON format in your prompts for structured responses:

````
Respond in JSON format only:
```json
{"key": "value"}
```
````

### 3. Batch Processing

Process multiple entities with rate limiting:

```python
import asyncio
from typing import Any, Dict, List

async def batch_verify(verifier, entities: List[dict], rate_limit: float = 0.5) -> List[Dict[str, Any]]:
    """Verify entities sequentially, pausing between requests."""
    results = []
    for entity in entities:
        # Assumes each entity dict's keys match verify_heritage_institution's parameters
        result = await verifier.verify_heritage_institution(**entity)
        results.append(result)
        await asyncio.sleep(rate_limit)  # respect rate limits
    return results
```

### 4. Always Reference CH-Annotator

For entity recognition tasks, include CH-Annotator context:

```python
system_prompt = """You are following CH-Annotator v1.7.0 convention.
Heritage institutions are type GRP.HER with subtypes for museums, libraries, archives, and galleries.
"""
```

## Related Files

| Script | Purpose |
|--------|---------|
| `scripts/reenrich_wikidata_with_verification.py` | Wikidata entity verification |

- **Agent Rules**: `AGENTS.md` (Rule 11: Z.AI GLM API)
- **Agent Config**: `.opencode/ZAI_GLM_API_RULES.md`
- **CH-Annotator**: `.opencode/CH_ANNOTATOR_CONVENTION.md`
- **Entity Annotation**: `data/entity_annotation/ch_annotator-v1_7_0.yaml`

## Troubleshooting

### "Quota exceeded" Error

**Symptom**: 403 error with a "quota exceeded" message

**Cause**: Using the wrong API endpoint (`open.bigmodel.cn` instead of `api.z.ai`)

**Solution**: Update the API URL to `https://api.z.ai/api/coding/paas/v4/chat/completions`

### "Token not found" Error

**Symptom**: `ValueError` about missing `ZAI_API_TOKEN`

**Solution**:

1. Check `~/.local/share/opencode/auth.json` for the token
2. Add it to `.env` as `ZAI_API_TOKEN=your_token`
3. Ensure `load_dotenv()` is called before accessing the environment

### JSON Parsing Failures

**Symptom**: The LLM returns text that can't be parsed as JSON

**Solution**: Use the `parse_json_from_response()` helper with fallback handling


*Last Updated: 2025-12-08*