# glam/schemas/20251121/linkml/modules/classes/LLMResponse.yaml
# Last commit: kempersc ca4a54181e (2026-01-31) "Refactor schema files to improve clarity and maintainability"


id: https://nde.nl/ontology/hc/class/LLMResponse
name: llm_response_class
title: LLM Response Class
version: 1.0.0
prefixes:
  linkml: https://w3id.org/linkml/
  hc: https://nde.nl/ontology/hc/
  schema: http://schema.org/
  prov: http://www.w3.org/ns/prov#
  dct: http://purl.org/dc/terms/
  xsd: http://www.w3.org/2001/XMLSchema#
imports:
  - linkml:types
  - ../enums/FinishReasonEnum
  - ../enums/LLMProviderEnum
  - ../enums/ThinkingModeEnum
  - ../metadata
  - ../slots/consumes_or_consumed
  - ../slots/content
  - ../slots/cost_usd
  - ../slots/created
  - ../slots/has_or_had_mode
  - ../slots/has_or_had_score
  - ../slots/has_or_had_token
  - ../slots/is_or_was_ceased_by
  - ../slots/latency_ms
  - ../slots/model
  - ../slots/preserves_or_preserved
  - ../slots/reasoning_content
  - ../slots/request_id
  - ../slots/specificity_annotation
  - ./CeaseEvent
  - ./ReasoningContent
  - ./SpecificityAnnotation
  - ./TemplateSpecificityScore
  - ./TemplateSpecificityType
  - ./TemplateSpecificityTypes
  - ./ThinkingMode
  - ./Token
default_range: string
classes:
  LLMResponse:
    class_uri: prov:Activity
    description: |-
      Provenance metadata for LLM API responses, including GLM 4.7 Thinking Modes.

      Captures complete response metadata from LLM providers (ZhipuAI GLM, Anthropic,
      OpenAI, etc.) for traceability and analysis. The key innovation is capturing
      `reasoning_content` - the chain-of-thought reasoning that GLM 4.7 exposes
      through its three thinking modes.

      **GLM 4.7 Thinking Modes** (https://docs.z.ai/guides/capabilities/thinking-mode):

      1. **Interleaved Thinking** (default, since GLM-4.5):
         - Model thinks between tool calls and after receiving tool results
         - Enables complex, step-by-step reasoning with tool chaining
         - Returns `reasoning_content` alongside `content` in every response

      2. **Preserved Thinking** (new in GLM-4.7):
         - Retains reasoning_content from previous assistant turns in context
         - Preserves reasoning continuity across multi-turn conversations
         - Improves model performance and increases cache hit rates
         - **Enabled by default on the Coding Plan endpoint**
         - Requires returning EXACT, UNMODIFIED reasoning_content back to the API
         - Set via `preserves_or_preserved` with `is_preserved: true` (preserve previous reasoning)

      3. **Turn-level Thinking** (new in GLM-4.7):
         - Controls reasoning computation on a per-turn basis
         - Enables/disables thinking independently for each request in a session
         - Useful for balancing speed (simple queries) vs accuracy (complex tasks)
         - Set via `"thinking": {"type": "enabled"}` or `"thinking": {"type": "disabled"}`

      **Critical Implementation Note for Preserved Thinking**:
      When using Preserved Thinking with tool calls, thinking blocks MUST be:
      1. Explicitly preserved in the messages array
      2. Returned together with tool results
      3. Kept in EXACT original sequence (no reordering or editing)

      **PROV-O Alignment**:
      - LLMResponse is the prov:Activity (the inference process)
      - content is a prov:Entity (the generated output)
      - model/provider is the prov:Agent (the AI system)
      - reasoning_content documents the prov:Plan (how the agent reasoned)
      - the prompt is the input to the activity, linked via prov:used

      **Use Cases**:
      - DSPy RAG responses with reasoning traces
      - Heritage institution extraction provenance
      - LinkML schema conformity validation
      - Ontology mapping decision logs
      - Multi-turn agent conversations with preserved context
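    # Illustrative request sketch for the thinking modes described above (not
    # part of the schema; payload shape follows the Z.AI docs linked in the
    # description, and field names should be verified against those docs):
    #
    #   POST https://api.z.ai/api/coding/paas/v4/chat/completions
    #   {
    #     "model": "glm-4.7",
    #     "messages": [{"role": "user", "content": "Describe the Rijksmuseum."}],
    #     "thinking": {"type": "enabled"}
    #   }
    #
    # For Preserved Thinking, the previous assistant turn's reasoning_content
    # must be sent back verbatim, in its original position in the messages array.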
    exact_mappings:
      - prov:Activity
    close_mappings:
      - schema:Action
      - schema:CreativeWork
    slots:
      - has_or_had_token
      - preserves_or_preserved
      - content
      - cost_usd
      - created
      - is_or_was_ceased_by
      - latency_ms
      - model
      - reasoning_content
      - request_id
      - specificity_annotation
      - has_or_had_score
      - has_or_had_mode
      - consumes_or_consumed
    slot_usage:
      content:
        range: string
        required: true
        examples:
          - value: The Rijksmuseum is a national museum in Amsterdam dedicated to Dutch arts and history.
      reasoning_content:
        range: string
        required: false
        examples:
          - value: 'The user is asking about Dutch heritage institutions. I need to identify: 1) Institution name: Rijksmuseum, 2) Type: Museum (maps to InstitutionTypeEnum.MUSEUM), 3) Location: Amsterdam (city in Noord-Holland province)...'
      model:
        range: string
        required: true
        examples:
          - value: glm-4.7
      request_id:
        range: string
        required: false
        examples:
          - value: req_8f3a2b1c4d5e6f7g
      created:
        range: datetime
        required: true
        examples:
          - value: '2025-12-23T10:30:00Z'
      consumes_or_consumed:
        range: integer
        minimum_value: 0
        examples:
          - value: 600
      has_or_had_token:
        range: Token
        multivalued: true
        inlined: true
        inlined_as_list: true
        required: false
        examples:
          - value:
              - has_or_had_type:
                  has_or_had_identifier: hc:TokenType/CACHED
                  has_or_had_label: Cached Token
                has_or_had_quantity:
                has_or_had_description: Tokens from provider KV cache
              - has_or_had_type:
                  has_or_had_identifier: hc:TokenType/OUTPUT
                  has_or_had_label: Output Token
                has_or_had_quantity:
                has_or_had_description: Completion tokens (content + reasoning)
          - value:
              - has_or_had_type:
                  has_or_had_identifier: hc:TokenType/OUTPUT
                  has_or_had_label: Output Token
                has_or_had_quantity:
      is_or_was_ceased_by:
        range: CeaseEvent
        inlined: true
        required: false
        examples:
          - value:
              has_or_had_label: stop
              has_or_had_description: Model completed naturally
          - value:
              has_or_had_label: length
              has_or_had_description: Max tokens exceeded
      latency_ms:
        range: integer
        minimum_value: 0
        required: false
        examples:
          - value: 1250
      cost_usd:
        range: float
        minimum_value: 0.0
        required: false
        examples:
          - value: 0.0
          - value: 0.015
      has_or_had_mode:
        range: ThinkingMode
        required: false
        examples:
          - value:
              has_or_had_label: Preserved Thinking
          - value:
              has_or_had_label: Interleaved Thinking
          - value:
              has_or_had_label: Disabled
      preserves_or_preserved:
        range: ReasoningContent
        inlined: true
        multivalued: true
        required: false
        examples:
          - value:
              has_or_had_label: Preserved Reasoning
          - value:
              has_or_had_label: Fresh Context
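    # Minimal conforming instance sketch (illustrative, not part of the schema;
    # includes only the slots marked required in slot_usage above):
    #
    #   content: The Rijksmuseum is a national museum in Amsterdam dedicated to Dutch arts and history.
    #   model: glm-4.7
    #   created: '2025-12-23T10:30:00Z'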
    comments:
      - reasoning_content is the key field for Interleaved Thinking (GLM 4.7)
      - Store reasoning_content for debugging, auditing, and DSPy optimization
      - 'Z.AI Coding Plan endpoint: https://api.z.ai/api/coding/paas/v4/chat/completions'
      - 'For DSPy: use LLMResponse to track all LLM calls in the pipeline'
      - See AGENTS.md Rule 11 for Z.AI API configuration
    see_also:
      - https://www.w3.org/TR/prov-o/
      - https://api.z.ai/docs
      - https://dspy-docs.vercel.app/
    annotations:
      specificity_score: 0.1
      specificity_rationale: Generic utility class/slot created during migration
      custodian_types: "['*']"
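# Instance data can be checked against this class with the LinkML validator
# CLI (sketch; data.yaml is a hypothetical instance file like the one in the
# comment above the comments section):
#
#   linkml-validate --schema LLMResponse.yaml --target-class LLMResponse data.yaml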