# GLM API Setup Guide
This guide explains how to configure and use the GLM-4 language model for entity recognition, verification, and enrichment tasks in the GLAM project.
## Overview
The GLAM project uses **GLM-4.6** via the **Z.AI Coding Plan** endpoint for LLM-powered tasks such as:
- **Entity Verification**: Verify that Wikidata entities are heritage institutions
- **Description Enrichment**: Generate rich descriptions from multiple data sources
- **Entity Resolution**: Match institution names across different data sources
- **Claim Validation**: Verify extracted claims against source documents
**Cost**: GLM models incur no per-token charges on the Z.AI Coding Plan; usage is covered by the plan subscription.
## Prerequisites
- Python 3.10+
- `httpx` library for async HTTP requests
- Access to Z.AI Coding Plan (same as OpenCode)
## Quick Start
### 1. Set Up Environment Variable
Add your Z.AI API token to the `.env` file in the project root:
```bash
# .env file
ZAI_API_TOKEN=your_token_here
```
### 2. Find Your Token
The token is shared with OpenCode. Check:
```bash
# View OpenCode auth file
cat ~/.local/share/opencode/auth.json | jq '.["zai-coding-plan"]'
```
Copy this token to your `.env` file.
### 3. Basic Python Usage
```python
import os
import asyncio

import httpx
from dotenv import load_dotenv

# Load environment variables from .env
load_dotenv()

async def call_glm():
    api_url = "https://api.z.ai/api/coding/paas/v4/chat/completions"
    api_key = os.environ.get("ZAI_API_TOKEN")
    async with httpx.AsyncClient(timeout=60.0) as client:
        response = await client.post(
            api_url,
            headers={
                "Authorization": f"Bearer {api_key}",
                "Content-Type": "application/json",
            },
            json={
                "model": "glm-4.6",
                "messages": [
                    {"role": "user", "content": "Hello, GLM!"}
                ],
                "temperature": 0.3,
            },
        )
        result = response.json()
        print(result["choices"][0]["message"]["content"])

asyncio.run(call_glm())
```
## API Configuration
### Endpoint Details
| Property | Value |
|----------|-------|
| **Base URL** | `https://api.z.ai/api/coding/paas/v4` |
| **Chat Endpoint** | `/chat/completions` |
| **Auth Method** | Bearer Token |
| **Header** | `Authorization: Bearer {token}` |
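The endpoint details above can be collected into a few Python constants and a helper (a minimal sketch; the constant and function names are illustrative, not part of the project):

```python
# Z.AI Coding Plan endpoint pieces, from the table above
GLM_BASE_URL = "https://api.z.ai/api/coding/paas/v4"
CHAT_ENDPOINT = "/chat/completions"
CHAT_URL = GLM_BASE_URL + CHAT_ENDPOINT

def build_headers(api_key: str) -> dict:
    """Build the Bearer-token auth headers the API expects."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```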
### Available Models
| Model | Speed | Quality | Use Case |
|-------|-------|---------|----------|
| `glm-4.6` | Medium | Highest | Complex reasoning, verification |
| `glm-4.5` | Medium | High | General tasks |
| `glm-4.5-air` | Fast | Good | High-volume processing |
| `glm-4.5-flash` | Fastest | Good | Quick responses |
| `glm-4.5v` | Medium | High | Vision/image tasks |
**Recommendation**: Use `glm-4.6` for entity verification and complex tasks.
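One way to encode the table and the recommendation above is a small lookup with `glm-4.6` as the default (a sketch; the task labels are illustrative):

```python
# Model choices from the table above; task labels are illustrative.
MODEL_FOR_TASK = {
    "verification": "glm-4.6",    # complex reasoning
    "general": "glm-4.5",
    "bulk": "glm-4.5-air",        # high-volume processing
    "quick": "glm-4.5-flash",
    "vision": "glm-4.5v",
}

def pick_model(task: str) -> str:
    """Return the recommended model for a task, defaulting to glm-4.6."""
    return MODEL_FOR_TASK.get(task, "glm-4.6")
```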
## Integration with CH-Annotator
When using GLM for entity recognition tasks, always reference the CH-Annotator convention:
### Heritage Institution Verification
````python
VERIFICATION_PROMPT = """You are a heritage institution classifier following CH-Annotator v1.7.0 convention.

## CH-Annotator GRP.HER Definition
Heritage institutions are organizations that:
- Collect, preserve, and provide access to cultural heritage materials
- Include: museums (GRP.HER.MUS), libraries (GRP.HER.LIB), archives (GRP.HER.ARC), galleries (GRP.HER.GAL)

## Entity Types That Are NOT Heritage Institutions
- Cities, towns, municipalities (places, not institutions)
- General businesses or companies
- People/individuals
- Events, festivals, exhibitions (temporary)

## Your Task
Analyze the entity and respond in JSON:
```json
{
  "is_heritage_institution": true/false,
  "subtype": "MUS|LIB|ARC|GAL|OTHER|null",
  "confidence": 0.95,
  "reasoning": "Brief explanation"
}
```
"""
````
### Entity Type Mapping
| CH-Annotator Type | GLAM Institution Type |
|-------------------|----------------------|
| GRP.HER.MUS | MUSEUM |
| GRP.HER.LIB | LIBRARY |
| GRP.HER.ARC | ARCHIVE |
| GRP.HER.GAL | GALLERY |
| GRP.HER.RES | RESEARCH_CENTER |
| GRP.HER.BOT | BOTANICAL_ZOO |
| GRP.HER.EDU | EDUCATION_PROVIDER |
## Complete Implementation Example
### Wikidata Verification Script
See `scripts/reenrich_wikidata_with_verification.py` for a complete example:
```python
import os
import re
import json
from typing import Any, Dict, List

import httpx

class GLMHeritageVerifier:
    """Verify Wikidata entities using GLM-4.6 and CH-Annotator."""

    API_URL = "https://api.z.ai/api/coding/paas/v4/chat/completions"

    def __init__(self, model: str = "glm-4.6"):
        self.api_key = os.environ.get("ZAI_API_TOKEN")
        if not self.api_key:
            raise ValueError("ZAI_API_TOKEN not found in environment")
        self.model = model
        self.client = httpx.AsyncClient(
            timeout=60.0,
            headers={
                "Authorization": f"Bearer {self.api_key}",
                "Content-Type": "application/json",
            },
        )

    async def verify_heritage_institution(
        self,
        institution_name: str,
        wikidata_label: str,
        wikidata_description: str,
        instance_of_types: List[str],
    ) -> Dict[str, Any]:
        """Check if a Wikidata entity is a heritage institution."""
        prompt = f"""Analyze if this entity is a heritage institution (GRP.HER):

Institution Name: {institution_name}
Wikidata Label: {wikidata_label}
Description: {wikidata_description}
Instance Of: {', '.join(instance_of_types)}

Respond with JSON only."""
        response = await self.client.post(
            self.API_URL,
            json={
                "model": self.model,
                "messages": [
                    # VERIFICATION_PROMPT is the system prompt defined earlier
                    {"role": "system", "content": VERIFICATION_PROMPT},
                    {"role": "user", "content": prompt},
                ],
                "temperature": 0.1,
            },
        )
        result = response.json()
        content = result["choices"][0]["message"]["content"]
        # Parse JSON from the response text
        json_match = re.search(r'\{.*\}', content, re.DOTALL)
        if json_match:
            return json.loads(json_match.group())
        return {"is_heritage_institution": False, "error": "No JSON found"}
```
## Error Handling
### Common Errors
| Error Code | Meaning | Solution |
|------------|---------|----------|
| 401 | Unauthorized | Check ZAI_API_TOKEN |
| 403 | Forbidden/Quota | Use the Z.AI endpoint (`api.z.ai`), not BigModel |
| 429 | Rate Limited | Add delays between requests |
| 500 | Server Error | Retry with backoff |
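The table above translates into a simple dispatch on the HTTP status code (a sketch; the function name and action strings are illustrative):

```python
def classify_error(status_code: int) -> str:
    """Map an HTTP status code to a recommended action (see table above)."""
    if status_code == 401:
        return "check ZAI_API_TOKEN"
    if status_code == 403:
        return "verify endpoint is api.z.ai, not open.bigmodel.cn"
    if status_code == 429:
        return "back off and retry later"
    if status_code >= 500:
        return "retry with exponential backoff"
    return "unhandled status"
```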
### Retry Pattern
```python
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10),
)
async def call_with_retry(client, messages):
    response = await client.post(API_URL, json={"model": "glm-4.6", "messages": messages})
    response.raise_for_status()
    return response.json()
```
### JSON Parsing
LLM responses may contain text around JSON. Always parse safely:
```python
import re
import json

def parse_json_from_response(content: str) -> dict:
    """Extract JSON from LLM response text."""
    # Try to find a fenced JSON block first
    json_match = re.search(r'```json\s*(\{.*?\})\s*```', content, re.DOTALL)
    if json_match:
        return json.loads(json_match.group(1))
    # Fall back to bare JSON
    json_match = re.search(r'\{.*\}', content, re.DOTALL)
    if json_match:
        return json.loads(json_match.group())
    return {"error": "No JSON found in response"}
```
## Best Practices
### 1. Use Low Temperature for Verification
```python
{
    "temperature": 0.1  # Low for consistent, deterministic responses
}
```
### 2. Request JSON Output
Always request JSON format in your prompts for structured responses:
````
Respond in JSON format only:
```json
{"key": "value"}
```
````
### 3. Batch Processing
Process multiple entities with rate limiting:
```python
import asyncio
from typing import List

async def batch_verify(entities: List[dict], rate_limit: float = 0.5):
    """Verify entities with rate limiting."""
    results = []
    for entity in entities:
        result = await verifier.verify(entity)
        results.append(result)
        await asyncio.sleep(rate_limit)  # Respect rate limits
    return results
```
### 4. Always Reference CH-Annotator
For entity recognition tasks, include CH-Annotator context:
```python
system_prompt = """You are following CH-Annotator v1.7.0 convention.
Heritage institutions are type GRP.HER with subtypes for museums, libraries, archives, and galleries.
"""
```
## Related Scripts
| Script | Purpose |
|--------|---------|
| `scripts/reenrich_wikidata_with_verification.py` | Wikidata entity verification |
## Related Documentation
- **Agent Rules**: `AGENTS.md` (Rule 11: Z.AI GLM API)
- **Agent Config**: `.opencode/ZAI_GLM_API_RULES.md`
- **CH-Annotator**: `.opencode/CH_ANNOTATOR_CONVENTION.md`
- **Entity Annotation**: `data/entity_annotation/ch_annotator-v1_7_0.yaml`
## Troubleshooting
### "Quota exceeded" Error
**Symptom**: 403 error with "quota exceeded" message
**Cause**: Using wrong API endpoint (`open.bigmodel.cn` instead of `api.z.ai`)
**Solution**: Update API URL to `https://api.z.ai/api/coding/paas/v4/chat/completions`
### "Token not found" Error
**Symptom**: ValueError about missing ZAI_API_TOKEN
**Solution**:
1. Check `~/.local/share/opencode/auth.json` for token
2. Add to `.env` file as `ZAI_API_TOKEN=your_token`
3. Ensure `load_dotenv()` is called before accessing environment
### JSON Parsing Failures
**Symptom**: LLM returns text that can't be parsed as JSON
**Solution**: Use the `parse_json_from_response()` helper function with fallback handling
---
**Last Updated**: 2025-12-08