docs: created docs for the new graph agent system

2026-05-18 15:37:53 -06:00
parent f40cc8af27
commit 6e08cfd984
3 changed files with 966 additions and 0 deletions
@@ -4,6 +4,12 @@ Agents in Loki follow the same style as OpenAI's GPTs. They consist of 3 parts:
 * [RAG](RAG) - Pre-built knowledge bases specifically for the agent
 * [Function Calling](Tools#tools) ([#2](MCP-Servers)) - Extends the functionality of the LLM through custom functions it can call
 > **Looking for declarative, multi-step workflows?** See
 > [Graph Agents](Graph-Agents): a YAML-driven workflow engine where each step
 > (LLM call, script, user prompt, child-agent spawn) is its own typed node.
 > Useful when an agent's behavior follows a fixed shape rather than a single
 > open-ended LLM loop.
 ![Agent example](./images/agents/sql.gif)
 Agent configuration files are stored in the `agents` subdirectory of your Loki configuration directory. The location of
@@ -738,3 +744,8 @@ Loki comes packaged with some useful built-in agents:
 * `oracle`: An agent for high-level architecture, design decisions, and complex debugging
 * `sisyphus`: A powerhouse orchestrator agent for writing complex code and acting as a natural language interface for your codebase (similar to ClaudeCode, Gemini CLI, Codex, or OpenCode). Uses sub-agent spawning to delegate to `explore`, `coder`, and `oracle`.
 * `sql`: A universal SQL agent that enables you to talk to any relational database in natural language
 Loki writes these built-in agents to your agents directory on first run and never overwrites them afterward, so any
 edits you make to them are preserved across Loki updates. To discard your local changes and reinstall the built-in
 agents from the current Loki build, run `loki --install agents` (or `.install agents` in the REPL). Agents you created
 yourself are not affected.
@@ -0,0 +1,950 @@
 Graph-based agents are a declarative, YAML-driven workflow engine layered on
 top of Loki's existing agent system. Where a normal [agent](Agents) runs as a
 single LLM loop driven by tool calls, a **graph agent** is a directed graph of
 typed nodes. Each node performs one well-defined step (call an LLM, run a
 script, ask the user a question, spawn a child agent, etc.) and routes to the
 next node based on its result.
 Graph agents are best for workflows that:
 - Have a fixed shape (e.g. parse -> query -> grade -> synthesize -> verify)
 - Mix LLM calls with deterministic steps (scripts, user prompts)
 - Need explicit human-in-the-loop checkpoints
 - Benefit from per-step model / tool / temperature overrides
 If you just want an agent that takes a goal and figures out the steps on its
 own, stick with a regular [agent](Agents).
 ---
 # Directory Structure
 A graph agent is defined by a single `graph.yaml`. It holds *both* the
 agent-level config (model, tools, MCP servers) *and* the workflow:
 ```
 <loki-config-dir>/agents
    └── my-graph-agent
        ├── graph.yaml           # agent config + workflow definition
        ├── tools.sh             # optional custom tools
        ├── <rag-node-id>.yaml   # auto-built knowledge base for a rag node
        └── scripts/             # optional script-node implementations
            ├── decide.py
            └── verify.py
 ```
 `<rag-node-id>.yaml` files are generated by Loki at agent load time - one
 per `rag` node - and should not be hand-edited.
 An agent directory must contain **either** a `config.yaml` (a normal
 LLM-loop agent; see [Agents](Agents)) **or** a `graph.yaml` (a graph
 agent), never both. The presence of `graph.yaml` is what marks an agent
 as a graph agent; when Loki runs it, execution is driven entirely by the
 graph.
 **Both files present is an error.** If an agent directory contains both
 `config.yaml` and `graph.yaml`, Loki refuses to load it and tells you to
 remove one. Pick the model that fits: `config.yaml` for an open-ended
 LLM-loop agent, `graph.yaml` for a fixed-shape workflow.
 ---
 # graph.yaml Top-Level Fields
 ```yaml
 name: my-graph-agent
 description: |
  Plain prose describing what the workflow does.
 version: "1.0"
 # --- agent-level config ---
 model: anthropic:claude-sonnet-4-6   # default model for llm nodes
 temperature: 0.0                     # default sampling temperature
 top_p: null                          # default sampling top-p
 global_tools:                        # global tools available to nodes
  - web_search_loki.sh
 mcp_servers:                         # MCP servers available to nodes
  - pubmed-search
 conversation_starters:               # suggested prompts in the UI
  - "Look up LOINC 2160-0"
 settings:
  max_loop_iterations: 100     # PER-NODE visit cap; default 100 (see below)
  log_state_snapshots: true    # log state JSON before each node executes
  validate_before_run: true    # run the graph validator on startup
  timeout: 600                 # optional overall timeout in seconds
 initial_state:                 # optional seed state for the run
  topic: "auth"
 start: parse_input             # required: ID of the first node to run
 nodes:
  parse_input: { ... }
  ...
 ```
 - **`version`:** Currently only `"1.0"` is accepted by the parser. Anything
  else fails at startup. This is the *graph schema* version, not your
  agent's version.
 - **Agent-level config fields** (`model`, `temperature`, `top_p`, `global_tools`,
  `mcp_servers`, `conversation_starters`) are all optional.
  These are the same fields a normal agent's `config.yaml` carries; in a
  graph agent they live at the top of `graph.yaml` instead. `model` /
  `temperature` / `top_p` act as the defaults for `llm` nodes that don't
  set their own. `global_tools` and `mcp_servers` define the tool universe
  that an `llm` node's `tools:` whitelist selects from (a node with no
 `tools:` field gets none of them).
 - **`can_spawn_agents` is derived, not declared.** A graph agent can spawn
  child agents iff its graph contains at least one `agent` node. You don't
  set a flag. The `agent` node's presence *is* the declaration.
 - **`max_loop_iterations`:** This is a **per-node visit cap**, not a total
  graph-step cap. If the same node id is entered more than this many times,
  execution aborts with `Node 'X' visited N times (max_loop_iterations=...)`.
  Default: 100.
 - **`timeout`:** Wall-clock cap on the entire graph run. The executor
  checks this between every node transition; nodes that block longer than
  the timeout will still finish before the check fires.
 - **`initial_state`:** A JSON-compatible object. Values are seeded into
  state before any node runs and are referenced from any node via `{{key}}`
  templates.
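The per-node meaning of `max_loop_iterations` can be sketched as below. This is an illustration under assumptions, not Loki's executor: `VisitTracker` and `enter` are hypothetical names, and only the error message format is taken from the description above.

```python
from collections import Counter


class VisitTracker:
    """Counts entries per node id (hypothetical helper, not Loki's code)."""

    def __init__(self, max_loop_iterations: int = 100) -> None:
        self.max_loop_iterations = max_loop_iterations
        self.visits = Counter()

    def enter(self, node_id: str) -> None:
        # The cap applies to each node id separately, not to total graph steps.
        self.visits[node_id] += 1
        if self.visits[node_id] > self.max_loop_iterations:
            raise RuntimeError(
                f"Node '{node_id}' visited {self.visits[node_id]} times "
                f"(max_loop_iterations={self.max_loop_iterations})"
            )
```

Because the counter is keyed by node id, a long linear graph never trips the cap; only a node that is re-entered repeatedly does.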
 ### `{{initial_prompt}}`: Automatically Seeded
 When Loki invokes a graph agent with a user prompt (whether from the
 command line `loki -a my-agent "what is X?"`, from the REPL, or from a
 parent agent that spawned it as a sub-agent), the dispatcher automatically
 seeds the prompt text into state under the key **`initial_prompt`** before
 any node runs.
 This means every graph agent's first node can reference the user's request
 via `{{initial_prompt}}`:
 ```yaml
 parse_input:
  id: parse_input
  type: llm
  prompt: "{{initial_prompt}}"     # the user's command-line / REPL text
  ...
 ```
 You do not need to (and should not) put `initial_prompt` in `initial_state` as it is overwritten by the dispatcher.
 ---
 # Node Types
 There are seven node types: **agent**, **script**, **approval**, **input**,
 **llm**, **rag**, and **end**. Every node has these common fields:
 ```yaml
 my_node:
  id: my_node               # must match the map key
  type: <one of the seven>
  description: optional      # free-form
  next: another_node         # optional default next node; semantics vary per type
 ```
 The `next` field defines the default routing edge. Node types interpret it
 differently (some types ignore it in favor of internal routing; see each type
 below).
 ---
 ## agent
 Spawns a Loki sub-agent and waits for it to finish. This is how a graph agent
 delegates a sub-goal to a fully autonomous Loki agent (with its own tool loop
 and configuration).
 ```yaml
 research_topic:
  id: research_topic
  type: agent
  agent: deep-researcher          # name of an existing Loki agent
  prompt: "Research {{topic}}"    # interpolated against state
  timeout: 600                    # optional, in seconds (default 300)
  state_updates:
    findings: "{{output}}"
  output_schema: { ... }          # optional, see "Structured Output" below
  next: render
 ```
 - **`agent`:** Name of the child agent to spawn. Must exist in
  `<loki-config-dir>/agents/`.
 - **`prompt`:** The user message sent to the child agent. Templated against
  the current graph state.
 - **`timeout`:** Hard wall-clock cap. If the child agent exceeds it, the
  whole graph fails (no built-in fallback path on agent nodes).
 - **`state_updates`:** Map of `state_key: "{{template}}"`. The child agent's
  final text is available inside this map as `{{output}}`.
 ---
 ## script
 Runs a Bash, Python, or TypeScript script and merges its JSON-object stdout
 into state. Script files live under the agent's `scripts/` directory.
 **Supported extensions and runtimes**:
 | Extension | Runtime invoked            | Notes                                   |
 |-----------|----------------------------|-----------------------------------------|
 | `.sh`     | `bash <script>`            |                                         |
 | `.py`     | `python3 <script>`         | not `python`. Must be Python 3          |
 | `.ts`     | `npx tsx <script>`         | requires Node + `tsx` available on PATH |
 `.js` / `.mjs` / other extensions are **not** supported. The shebang line
 inside the script is not used for script-node dispatch (it is for normal
 custom-tools); the file extension is the source of truth.
 ```yaml
 route_after_parse:
  id: route_after_parse
  type: script
  script: scripts/route_after_parse.py
  timeout: 30                     # seconds, default 30
  fallback: handle_error          # optional: where to route on script failure
  state_updates:                  # applied after stdout merge
    last_run: "{{some_value}}"
 ```
 The script receives the current state in two forms; use whichever fits:
 | Env var            | Contents                                                      |
 |--------------------|---------------------------------------------------------------|
 | `GRAPH_STATE`      | Inline JSON when serialized state is <= 32 KiB                |
 | `GRAPH_STATE_FILE` | Path to a temp JSON file when serialized state exceeds 32 KiB |
 Exactly one of the two is set per script invocation; **always check both**. The temp
 file (when used) is cleaned up automatically after the graph finishes.
 The script must print a single JSON object on stdout. All keys merge into
 state; the reserved `_next` key is extracted and overrides the default `next`
 routing.
 ```python
 #!/usr/bin/env python3
 import json, os
 def load_state():
    if path := os.environ.get("GRAPH_STATE_FILE"):
        with open(path) as f:
            return json.load(f)
    return json.loads(os.environ.get("GRAPH_STATE", "{}"))
 state = load_state()
 codes = (state.get("loinc_codes") or "").strip()
 next_node = "query_db" if codes else "ask_for_code"
 print(json.dumps({"_next": next_node, "trimmed_codes": codes}))
 ```
 **Tolerant-fail**: if the script exits non-zero or produces invalid JSON, the
 node routes to `fallback` (if set) or to `next` (if set). Without either,
 the graph errors.
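The routing rules for a finished script can be sketched as follows. This is a simplified model, not the engine's implementation: `resolve_script_route` is a hypothetical name, and the sketch omits how the error description is exposed to downstream nodes.

```python
import json
from typing import Optional, Tuple


def resolve_script_route(
    exit_code: int,
    stdout: str,
    next_node: Optional[str] = None,
    fallback: Optional[str] = None,
) -> Tuple[str, dict]:
    """Return (target_node, keys_to_merge) for a finished script node."""
    try:
        if exit_code != 0:
            raise ValueError(f"script exited with code {exit_code}")
        result = json.loads(stdout)  # JSONDecodeError is a ValueError
        if not isinstance(result, dict):
            raise ValueError("stdout must be a single JSON object")
    except ValueError as err:
        # Tolerant-fail: prefer `fallback`, then `next`, else fail the graph.
        target = fallback or next_node
        if target is None:
            raise RuntimeError(f"script node failed with no route: {err}") from err
        return target, {}

    # `_next` is reserved: it is extracted and overrides the static `next`.
    override = result.pop("_next", None)
    target = override or next_node
    if target is None:
        raise RuntimeError("script node has neither `_next` nor `next`")
    return target, result
```

Note that `_next` is removed from the returned keys, so under this sketch it never lands in state.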
 ---
 ## approval
 Prompts the user with a question and a list of options, then routes based on
 their answer. This is the human-in-the-loop checkpoint.
 ```yaml
 approve:
  id: approve
  type: approval
  question: |
    Final report:
    {{report}}
    Approve?
  options:
    - "yes"
    - "no"
  routes:
    "yes": end_accepted
    "no": end_rejected
  on_other: clarify                # Required - see below
  state_updates:
    decision: "{{choice}}"
 ```
 ### The `on_other` field
 This field is **required** and easy to miss. Loki's `user__ask` tool *always*
 gives the user a "type your own answer" option in addition to the listed
 options. There is no way to disable this. Without `on_other`, a user who
 types something other than the listed options would crash the graph at
 runtime.
 `on_other` says **where to route when the user's answer does not match any
 `routes` key**. The free-form text they typed is available downstream via
 the `{{choice}}` template variable inside `state_updates`.
 Common patterns:
 - **Free-form means "I want to clarify"** -> `on_other: clarify_node`
  where `clarify_node` is an `input` or `llm` node that processes their text.
 - **Free-form means "rejection by default"** -> `on_other: end_rejected`.
 ---
 ## input
 Collects a free-form string from the user.
 ```yaml
 ask_for_code:
  id: ask_for_code
  type: input
  question: "Enter a LOINC code (e.g. 6690-2):"
  default: "{{last_used_code}}"   # optional, interpolated against state
  validation: "len(input) > 0"    # optional, see below
  state_updates:
    loinc_code: "{{input}}"
  next: query_db
 ```
 - **`default`:** If the user submits an empty response, this template is
  used instead. Like `question`, it is interpolated against state.
 - **`validation`:** A length predicate of the form
  `len(input) <op> <integer>`, where `<op>` is `>`, `>=`, `<`, `<=`, or `==`.
  This is a deliberately narrow grammar; regex / type / range validation are
  not yet supported. If validation fails, the node fails (no fallback).
 - The user's text is exposed to `state_updates` as `{{input}}`.
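A minimal sketch of evaluating the `validation` grammar, to make its narrowness concrete. The function name `check_length` is hypothetical, and two details are assumptions rather than documented behavior: whitespace around the operator is tolerated, and length is counted in characters.

```python
import operator
import re

_OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}
# Two-character operators come first so ">=" is not read as ">" followed by "=".
_PATTERN = re.compile(r"^\s*len\(input\)\s*(>=|<=|==|>|<)\s*(\d+)\s*$")


def check_length(validation: str, user_input: str) -> bool:
    """Evaluate a `len(input) <op> <integer>` predicate against user text."""
    match = _PATTERN.match(validation)
    if match is None:
        raise ValueError(f"unsupported validation expression: {validation!r}")
    op, limit = match.group(1), int(match.group(2))
    return _OPS[op](len(user_input), limit)
```

Anything outside the grammar, such as a regex or a type check, is rejected rather than evaluated.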
 ---
 ## llm
 A one-shot LLM call with an optional bounded tool-call loop. Unlike `agent`
 nodes, this does NOT spawn a sub-agent; it runs in a fresh isolated context
 with a caller-supplied system prompt and user prompt. Tool access is strictly
 opt-in: an `llm` node gets **no tools at all** unless its `tools` field
 explicitly lists them (see below).
 ```yaml
 grade_research:
  id: grade_research
  type: llm
  instructions: |               # optional system prompt
    You decide whether research is needed for {{topic}}.
  prompt: |                     # required user prompt
    Research context:
    {{research_text}}
    Reply with YES or NO.
  tools: []                     # see below
  model: anthropic:haiku        # optional override
  temperature: 0.0
  top_p: null
  max_attempts: 1               # transient-error retries (default 1)
  max_iterations: 10            # tool-call-loop turn cap (default 10)
  fallback: skip                # routes here if all attempts fail
  state_updates:
    grade: "{{output}}"
  output_schema: { ... }        # optional, see "Structured Output" below
  timeout: 120                  # optional; node wall-clock cap in seconds (unset = no timeout)
  next: synthesize
 ```
 ### The `tools` field (whitelist)
 The `tools` field is a strict opt-in whitelist: an `llm` node receives
 **only** the tools it explicitly lists, never the agent's full tool set.
 Three modes:
 - **Unset (field omitted)** -> **no tools**. The LLM produces output but
  cannot make any tool calls. This is identical to `tools: []`. Leaving the
  field out does _not_ inherit the agent's tools.
 - **`tools: []`** -> **no tools**. Same as unset.
 - **`tools: [a, b, mcp:server-name]`** -> only those specific tools, and
  nothing else. Entries are either exact tool names (matching `global_tools`,
  agent custom tools, or individual MCP function names) or the shorthand
  `mcp:<server-name>` (which enables all functions for that MCP server).
 Even when `tools` lists entries, the LLM receives **exactly** that set. The
 whitelist is enforced against global tools, agent custom tools, and MCP
 alike. Each entry is validated at startup against the active agent's tool
 list; an unknown entry is a startup error.
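The whitelist semantics can be sketched like this. It is an illustration only: `resolve_tools` and its parameters are hypothetical, and the MCP function names in the usage note are invented for the example.

```python
from typing import Dict, Iterable, List, Optional, Set


def resolve_tools(
    whitelist: Optional[Iterable[str]],
    named_tools: Set[str],
    mcp_functions: Dict[str, List[str]],
) -> Set[str]:
    """Expand an llm node's `tools` list into concrete tool names.

    named_tools: global tools plus agent custom tools.
    mcp_functions: MCP server name -> function names it exposes.
    """
    selected: Set[str] = set()
    all_mcp = {fn for fns in mcp_functions.values() for fn in fns}
    for entry in whitelist or []:  # unset and [] both mean "no tools"
        if entry.startswith("mcp:"):
            server = entry[len("mcp:"):]
            if server not in mcp_functions:
                raise ValueError(f"unknown MCP server in tools: {entry}")
            selected.update(mcp_functions[server])  # all functions of that server
        elif entry in named_tools or entry in all_mcp:
            selected.add(entry)
        else:
            raise ValueError(f"unknown tool in tools: {entry}")
    return selected
```

For example, with `named_tools={"web_search_loki.sh"}` and a `pubmed-search` server exposing two made-up functions, `["mcp:pubmed-search"]` selects exactly those two functions and nothing else.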
 ### Tolerant-fail routing
 | Outcome                                  | Routes to                  |
 |------------------------------------------|----------------------------|
 | Success                                  | `next`                     |
 | Failure WITH `fallback` set              | `fallback`                 |
 | Failure WITHOUT `fallback`               | `next` (output is "LLM node failed: ...") |
 `state_updates` are always applied (success or failure). On failure,
 `{{output}}` resolves to an error description so downstream nodes can detect
 it.
 ### Retries (`max_attempts`)
 `max_attempts` retries the LLM call **only on transient errors**. A failure
 counts as transient when its message contains one of: `timed out`,
 `rate limit`, `429`, `Connection reset`, `Connection refused`, or
 `produced no output`. Any other error fails immediately without consuming
 further attempts. The default is `1` (no retries).
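A sketch of this retry policy, assuming a case-sensitive substring match and no delay between attempts (neither is specified above). `is_transient` and `call_with_retries` are hypothetical names.

```python
from typing import Callable

TRANSIENT_MARKERS = (
    "timed out",
    "rate limit",
    "429",
    "Connection reset",
    "Connection refused",
    "produced no output",
)


def is_transient(message: str) -> bool:
    """True when the failure message contains a known transient marker."""
    return any(marker in message for marker in TRANSIENT_MARKERS)


def call_with_retries(call: Callable[[], str], max_attempts: int = 1) -> str:
    """Run `call`, retrying only while failures look transient."""
    last_error = None
    for _ in range(max(1, max_attempts)):
        try:
            return call()
        except Exception as err:
            last_error = err
            if not is_transient(str(err)):
                break  # non-transient: stop without using remaining attempts
    raise last_error
```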
 ---
 ## rag
 Runs a hybrid (vector + keyword) retrieval against a per-node knowledge base
 and writes the result into state. This is how a graph agent does
 Retrieval-Augmented Generation: the `rag` node retrieves context, downstream
 `llm`/`agent` nodes inject it into their prompts via normal templating.
 ```yaml
 research_context:
  id: research_context
  type: rag
  documents:                    # required; the knowledge sources
    - ./knowledge/
    - https://example.com/spec
  query: "{{initial_prompt}}"   # templated; defaults to "{{initial_prompt}}"
  top_k: 5                      # optional; default = the knowledge base's own top_k
  timeout: 120                  # optional; retrieval timeout in seconds (default 120)
  state_updates:                # required in practice (see below)
    rag_context: "{{output.context}}"
    rag_sources: "{{output.sources}}"
  next: answer
 answer:
  id: answer
  type: llm
  prompt: |
    Use this context to answer:
    {{rag_context}}
    Question: {{initial_prompt}}
 ```
 - **`documents`:** Knowledge sources: files, directories, URLs, or
  loader-protocol paths. **Required**. It's what makes the node a `rag`
  node. Relative paths resolve against the agent's directory.
 - **`query`:** The retrieval query, templated against state. Defaults to
  `{{initial_prompt}}`. Set it to `{{refined_query}}` to retrieve against a
  query an upstream `llm` node produced.
 - **`top_k`:** Number of chunks to retrieve. Defaults to the knowledge
  base's own configured `top_k`.
 - **`timeout`:** Retrieval timeout in seconds. Default 120.
 - **`state_updates`:** Where the result goes. A `rag` node with no
  `state_updates` discards its result (the validator warns).
 **Knowledge-base build config** (all optional; used only when the knowledge
 base is first built):
 - **`embedding_model`:** Embedding model for the corpus.
 - **`chunk_size`:** Document chunk size.
 - **`chunk_overlap`:** Overlap between chunks.
 - **`reranker_model`:** Reranker applied to hybrid-search results.
 - **`batch_size`:** Embedding-request batch size.
 Each falls back to the app-level `rag_*` config when omitted. **When
 `embedding_model`, `chunk_size`, and `chunk_overlap` are all set, the
 knowledge base builds with no interactive prompts**, so a fully specified
 `rag` node works in non-interactive runs.
 ### `{{output}}` shape
 Inside `state_updates`, `{{output}}` is a JSON object:
 ```json
 {
  "context": "[Source: ./knowledge/a.md]\n...chunk...",
  "sources": ["./knowledge/a.md", "https://example.com/spec"]
 }
 ```
 - `{{output.context}}`: The retrieved context block, ready to inject into a
  prompt.
 - `{{output.sources}}`: An array of source paths; `{{output.sources[0]}}`
  indexes individual sources (useful for downstream citation/verification
  nodes).
 ### Knowledge base lifecycle
 Each `rag` node's knowledge base is built **once, at agent load time**, into
 `<agent-dir>/<node-id>.yaml`:
 - If that file exists -> it is loaded (no prompt; works non-interactively).
 - If it's missing and the node is **fully specified** (`embedding_model` +
  `chunk_size` + `chunk_overlap` all set) -> it is built directly, no
  prompts. Works in non-interactive runs.
 - If it's missing, not fully specified, and Loki is interactive -> you are
  asked to initialize it, then prompted for the missing build values;
  declining is a hard error.
 - If it's missing, not fully specified, and Loki is non-interactive
  (no TTY) -> hard error, with a hint to set the build-config fields or run
  the agent once interactively.
 A graph with a `rag` node whose knowledge base isn't built **cannot run**.
 This is deliberate fail-fast behavior. (In `--info` mode the agent is only
 inspected, not run, so knowledge-base building is skipped entirely.)
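The four lifecycle cases reduce to a small decision function. The sketch below only models the decision; the names `is_fully_specified` and `knowledge_base_action` and the returned labels are hypothetical.

```python
BUILD_KEYS = ("embedding_model", "chunk_size", "chunk_overlap")


def is_fully_specified(node: dict) -> bool:
    """True when all three build-config fields are set on the rag node."""
    return all(node.get(key) is not None for key in BUILD_KEYS)


def knowledge_base_action(kb_file_exists: bool, node: dict, interactive: bool) -> str:
    """Decide what happens to a rag node's knowledge base at agent load time."""
    if kb_file_exists:
        return "load"    # existing file: no prompt, works non-interactively
    if is_fully_specified(node):
        return "build"   # built directly, no prompts
    if interactive:
        return "prompt"  # ask to initialize; declining is a hard error
    return "error"       # no TTY and nothing to build from
```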
 ### Retrieval
 Retrieval at execution time is fast (no re-embedding of the corpus). It's
 the same hybrid vector + keyword search normal Loki RAG uses. The corpus
 embedding/chunking cost is paid once, at load time.
 ---
 ## end
 Terminates execution and returns a final result.
 ```yaml
 end_accepted:
  id: end_accepted
  type: end
  output: |
    Approved report:
    {{report}}
  state_updates:                # optional last state mutations
    completed_at: "now"
 ```
 - **`output`:** Templated against state, printed as the graph's final
  result.
 - Multiple `end` nodes are fine; upstream routing decides which one a
  given run reaches.
 ---
 # State and Template Syntax
 Graph state is a `serde_json::Value` map. Templates use `{{path}}` syntax
 inside any string field.
 | Form                          | Resolves to                                  |
 |-------------------------------|----------------------------------------------|
 | `{{key}}`                     | top-level value                              |
 | `{{a.b.c}}`                   | nested object path                           |
 | `{{arr[0]}}`                  | array index                                  |
 | `{{matrix[0][1]}}`            | nested array indices                         |
 | `{{users[0].name}}`           | object field via index                       |
 | `{{a.b.arr[2].field}}`        | mixed path                                   |
 Rendering rules per value type:
 - **String** -> as-is
 - **Number / bool / null** -> stringified (`true`, `42`, `null`)
 - **Array / Object** -> JSON-encoded compactly (`["a","b"]`, `{"k":"v"}`)
 Missing keys / paths behave differently per template-evaluation site:
 - Inside a node's primary fields (`prompt`, `instructions`, `question`,
  `output`) -> strict mode, missing keys raise an error.
 - Inside `state_updates` values -> lenient mode, missing keys become empty
  strings.
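The path forms, rendering rules, and strict/lenient split can be sketched in a few lines. This is not Loki's implementation (which operates on `serde_json::Value`); `lookup`, `stringify`, `render`, and `MissingKey` are hypothetical names.

```python
import json
import re
from typing import Any

_TEMPLATE = re.compile(r"\{\{\s*(.+?)\s*\}\}")
_SEGMENT = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


class MissingKey(KeyError):
    """Raised when a template path does not resolve."""


def lookup(state: Any, path: str) -> Any:
    """Walk a path such as `a.b.arr[2].field` through nested dicts and lists."""
    value = state
    for key, index in _SEGMENT.findall(path):
        if index:
            if not isinstance(value, list) or int(index) >= len(value):
                raise MissingKey(path)
            value = value[int(index)]
        else:
            if not isinstance(value, dict) or key not in value:
                raise MissingKey(path)
            value = value[key]
    return value


def stringify(value: Any) -> str:
    """Strings render as-is; everything else renders as compact JSON."""
    if isinstance(value, str):
        return value
    return json.dumps(value, separators=(",", ":"))


def render(template: str, state: dict, strict: bool = True) -> str:
    """Strict mode raises on a missing path; lenient mode renders it as ''."""
    def replace(match):
        try:
            return stringify(lookup(state, match.group(1)))
        except MissingKey:
            if strict:
                raise
            return ""
    return _TEMPLATE.sub(replace, template)
```

In this model, primary fields such as `prompt` would call `render(..., strict=True)` and `state_updates` values would call it with `strict=False`.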
 ---
 # state_updates
 Every node type (except `end`, which has a slightly different shape) accepts
 an optional `state_updates` map:
 ```yaml
 state_updates:
  some_key: "{{template}}"
  other_key: "literal text with {{var}}"
 ```
 After the node body executes, each template is interpolated against state and
 the result is stored under the corresponding key. Three scoped variables are
 available *only inside `state_updates`*:
 | Variable     | Available in          | Resolves to                                                   |
 |--------------|-----------------------|----------------------------------------------------------------|
 | `{{output}}` | `agent`, `llm`, `rag` | The node's primary text output (or parsed JSON value if `output_schema` is set); for `rag`, the JSON object with `context` and `sources` |
 | `{{choice}}` | `approval`            | The option the user picked, or their free-form text            |
 | `{{input}}`  | `input`               | The user's text (or interpolated `default` if they submitted empty) |
 These variables are cleared after `state_updates` runs, so they don't leak
 into the next node's templates.
 > **End nodes are different.** An `end` node's `state_updates` runs with
 > plain lenient interpolation. There is no scoped `{{output}}` because
 > there is no node-body output to scope. After `state_updates` apply, the
 > `end` node's own `output` template is interpolated against the resulting
 > state and returned as the graph's final result.
 ---
 # Routing & Tolerant-Fail
 Nodes route via three mechanisms in priority order:
 1. **Script `_next` override:** `script` nodes can set `"_next": "node_id"`
   in their stdout JSON to dynamically choose the next node.
 2. **Internal routing:** `approval` routes via its `routes` map (or
   `on_other` when the answer matches no listed option).
 3. **Default `next` edge:** the `next` field on the node.
 ### Routing requirements per node type
 | Node type   | Needs `next`?                                                                                     |
 |-------------|---------------------------------------------------------------------------------------------------|
 | `agent`     | **Yes** - `next` is required (unless the agent node is unreachable). Error at runtime if missing. |
 | `script`    | Either `_next` from script output OR static `next` (or `fallback` on failure). Error if neither.  |
 | `approval`  | No - routing is via `routes` and `on_other`. `next` is ignored.                                   |
 | `input`     | **Yes** - `next` is the success route.                                                            |
 | `llm`       | **Yes** - `next` is the success route (and the default for failures without `fallback`).          |
 | `rag`       | **Yes** - `next` is required. Error at runtime if missing.                                        |
 | `end`       | No - terminal.                                                                                    |
 ### Tolerant-fail contract
 Currently honored by `script` and `llm` nodes:
 - Success -> default routing
 - Failure with `fallback` set -> `fallback` target
 - Failure without `fallback` -> default routing, with the error description
  exposed in state so the next node can react
 `agent` and `input` nodes do NOT have a tolerant-fail `fallback` path;
 their failures propagate as graph failures.
 ---
 # Structured Output (`output_schema`)
 Both `llm` and `agent` nodes can specify an `output_schema` field: a JSON
 Schema (written inline in YAML) describing the expected shape of the node's
 output:
 ```yaml
 extract_task:
  id: extract_task
  type: llm
  prompt: 'Parse: "{{raw_task}}"'
  output_schema:
    type: object
    properties:
      action: { type: string }
      items:
        type: array
        items: { type: string }
      time_minutes: { type: ["integer", "null"] }
      priority:
        type: string
        enum: [low, medium, high]
    required: [action, items, priority]
 ```
 When `output_schema` is set:
 1. The node body runs normally.
 2. The raw text output is **tried as JSON first** (with light cleanup of
    markdown code fences). This is the fast path: if parsing succeeds,
    that's the structured output.
 3. Otherwise Loki invokes a built-in `__structured_output__` role
    (constructed inline; not visible in the user's role list) to extract a
    JSON object matching the schema. If the extractor fails, Loki makes one
    repair retry.
 4. When the parsed value is a JSON **object**, its **top-level keys
   auto-merge into state permanently** (a non-object result is still
   reachable via `{{output}}` but has no top-level keys to merge).
 5. `{{output}}` (inside `state_updates`) resolves to the full parsed value.
 6. Explicit `state_updates` win over auto-merge if the same key is set in
   both.
 After the example above, downstream nodes can use `{{action}}`, `{{items}}`,
 `{{items[0]}}`, `{{priority}}`, etc. directly.
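The fast path and the merge order (steps 2, 4, and 6) can be sketched as below. This is a simplified model with hypothetical names; it skips schema validation and the extractor fallback, and it assumes `state_updates` values are already rendered.

```python
import json
from typing import Any, Tuple

FENCE = "`" * 3  # a Markdown code fence, built indirectly for readability


def strip_fence(text: str) -> str:
    """Remove one surrounding code fence (and its language tag), if present."""
    text = text.strip()
    if len(text) >= 6 and text.startswith(FENCE) and text.endswith(FENCE):
        body = text[3:-3]
        newline = body.find("\n")
        if newline != -1:
            body = body[newline + 1:]  # drop the opening fence line
        return body.strip()
    return text


def try_fast_path(raw: str) -> Tuple[bool, Any]:
    """Return (True, value) if the raw output parses as JSON, else (False, None)."""
    try:
        return True, json.loads(strip_fence(raw))
    except json.JSONDecodeError:
        return False, None  # the real engine would fall through to the extractor


def apply_structured_output(state: dict, parsed: Any, state_updates: dict) -> dict:
    """Auto-merge top-level keys of an object, then let explicit updates win."""
    new_state = dict(state)
    if isinstance(parsed, dict):
        new_state.update(parsed)
    new_state.update(state_updates)
    return new_state
```

Returning a `(bool, value)` pair rather than `None` on failure keeps a legitimately parsed JSON `null` distinguishable from a parse error.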
 ### LLM nodes vs Agent nodes: schema-hint injection
 This is the **most important behavioral difference** between the two node
 types when `output_schema` is set:
 - **LLM nodes**: Loki automatically appends a schema hint to the prompt
  (to the system prompt if `instructions` is set, otherwise to the user
  prompt). The hint tells the model to respond with JSON matching the
  schema. This means the main LLM call usually emits valid JSON directly ->
  the fast path succeeds -> the extractor LLM call is skipped entirely
  (cheaper, faster, more reliable).
 - **Agent nodes**: Loki does NOT inject any schema hint. Agents are
  multi-turn with their own tool-use loop; stuffing a schema into the
  initial prompt risks the agent fixating on JSON output instead of doing
  its actual work. The agent runs to completion freely, and the extractor
  converts its final text to JSON afterward.
 If you need an agent to emit JSON-shaped output, include schema language in
 its prompt yourself. The auto-injected hint for LLM nodes uses this form:
 ```
 Respond with a JSON object that matches this schema. Output ONLY the JSON
 object with no surrounding prose or markdown fences.
 Schema:
 {...}
 ```
 ### Tolerant-fail for extraction
 - **LLM node**: extraction failure = node failure -> routes via `fallback`
  or `next`.
 - **Agent node**: extraction failure propagates as a graph error (agent
  nodes have no `fallback`).
 ---
 # Worked Example
 A compact illustrative graph (`input` -> `llm` with `output_schema` ->
 `end`) exercising structured output and several template-path forms. For a
 **full-featured reference** covering every node type and field, see the
 heavily-commented `graph.example.yaml` at the root of the Loki repository.
 Illustrative `graph.yaml`:
 ```yaml
 name: structured-test
 version: "1.0"
 start: ask_task
 nodes:
  ask_task:
    id: ask_task
    type: input
    question: "Describe a task in free-form text."
    validation: "len(input) > 0"
    state_updates:
      raw_task: "{{input}}"
    next: extract_task
  extract_task:
    id: extract_task
    type: llm
    instructions: |
      You are a task parser. If a field cannot be determined, use a sensible
      default (empty array, null, or "medium" for priority).
    prompt: 'Parse this task description: "{{raw_task}}"'
    tools: []
    output_schema:
      type: object
      properties:
        action: { type: string }
        items:
          type: array
          items: { type: string }
        time_minutes: { type: ["integer", "null"] }
        priority:
          type: string
          enum: [low, medium, high]
        details:
          type: object
          properties:
            urgent: { type: boolean }
            deadline: { type: ["string", "null"] }
          required: [urgent]
      required: [action, items, priority, details]
    next: done
  done:
    id: done
    type: end
    output: |
      Action:        {{action}}
      Priority:      {{priority}}
      Time:          {{time_minutes}} min
      Urgent?        {{details.urgent}}
      First item:    {{items[0]}}
      All items:     {{items}}
 ```
 With the sample input `Buy groceries: milk, eggs, bread. About 15 minutes. Urgent.`,
 a sample state after `extract_task` is:
 ```json
 {
  "raw_task": "Buy groceries: milk, eggs, bread. About 15 minutes. Urgent.",
  "action": "buy",
  "items": ["milk", "eggs", "bread"],
  "time_minutes": 15,
  "priority": "high",
  "details": { "urgent": true, "deadline": null }
 }
 ```
 ---
 # Validation
 When `validate_before_run: true` (the default), Loki validates the graph at
 startup.
 **Errors (abort startup)**:
 - Start node missing or pointing to a non-existent node
 - Any `next` / `routes` / `fallback` / `on_other` target pointing to a
  non-existent node
 - Any cycle in declared static edges. Cycles are always errors: the
  per-node `max_loop_iterations` is a runtime safety net for
  dynamically-routed loops, not a license for static cycles
 - Graph has zero `end` nodes, so execution would never terminate
 - `approval` option without a matching `routes` entry
 - `script` file path does not exist relative to the agent's directory
 - `agent` node references an agent name that doesn't exist in the
  loki agents directory, or that exists but has neither a `config.yaml`
  nor a `graph.yaml`
 - `rag` node with no `documents` (at least one knowledge source is required)
 - `llm` node referencing an unknown tool or `mcp:<server>` in its `tools`
  whitelist, or an unknown `model`. Validated against the agent's tool,
  MCP-server, and model sets
 **Warnings (printed, execution continues)**:
 - Any node unreachable from the start via declared static edges
 - No `end` node reachable from the start via declared static edges
 - `approval` `routes` entry without a matching option
 - `rag` node with no `state_updates` (its retrieval result goes nowhere)
 > **Why some of these are warnings and not errors:** the validator only
 > follows **declared static edges** (`next`, `routes`, `fallback`,
 > `on_other`). Script nodes can also route dynamically at
 > runtime via `_next` in their JSON output, and those edges are invisible
 > to static analysis. To avoid false positives against dynamically-routed
 > graphs, "unreachable" and "no reachable end" are reported as warnings,
 > not errors.
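
The difference between the cycle error and the reachability warnings can be sketched in a few lines of Python. This is a hedged illustration of the kind of static analysis described above, not Loki's validator: `find_cycle`, `reachable`, and the node ids (`classify`, `handle`, `retry`, `done`) are all hypothetical. The `edges` map holds only declared static edges, so a node that a script reaches via `_next` shows up as unreachable.

```python
def find_cycle(edges):
    """Return one static cycle as a list of node ids, or None (DFS)."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in edges}
    stack = []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for target in edges.get(node, []):
            state = color.get(target, WHITE)
            if state == GREY:  # back edge: target is on the current path
                return stack[stack.index(target):] + [target]
            if state == WHITE:
                found = visit(target)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in edges:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


def reachable(edges, start):
    """Return every node reachable from `start` via static edges."""
    seen, todo = set(), [start]
    while todo:
        node = todo.pop()
        if node not in seen:
            seen.add(node)
            todo.extend(edges.get(node, []))
    return seen


# `handle` is imagined as a script node that may emit `_next: retry` at
# runtime; that edge is not declared, so static analysis cannot see it.
edges = {"classify": ["handle"], "handle": ["done"],
         "retry": ["classify"], "done": []}

print(find_cycle(edges))                                  # no static cycle
print(sorted(set(edges) - reachable(edges, "classify")))  # warning candidates

# Declaring the same loop statically would be reported as a cycle.
static_loop = dict(edges, handle=["done", "retry"])
print(find_cycle(static_loop))
```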
 ---
 # Invocation Entry Points
 A graph agent can be entered from three places, all of which seed the
 caller's prompt into state as `{{initial_prompt}}`:
 1. **Top-level CLI:** `loki -a my-graph-agent "user prompt here"`
 2. **REPL:** When the active agent has a `graph.yaml`, every user
   message in the REPL runs the graph fresh; the message becomes
   `{{initial_prompt}}`
 3. **Child-agent spawn:** When another (graph or normal) agent invokes
   this one via Loki's sub-agent mechanism, the parent's request becomes
   `{{initial_prompt}}` for the child graph
 After the graph finishes, any sub-agents this graph spawned via
 `agent`-type nodes are cancelled, so a graph cannot leak background tool
 loops. The graph's final `end` node output is what's returned to the
 caller.
 ---
 # Streaming and Observability
 Graph execution has two observability channels:
 **1. stderr narration:** Dimmed `▸` lines that let you follow along in real
 time, regardless of log level:
 ```
 ▸ graph: my-agent (start: extract_task)
 ▸ extract_task (llm)
 ▸   llm call: model=<active> tools=<none>
 ▸ extract_task -> done
 ▸ done (end)
 ▸ graph done in 2.41s
 ```
 **2. `tracing` logs:** Structured `info!`/`debug!`/`warn!`/`error!`
 records gated by `RUST_LOG` (see [Configuration](#configuration) below).
 This is the developer-facing channel and includes:
 - Graph start / completion / failure
 - Per-node entry and routing decisions (`debug`)
 - A **performance summary** at completion — every node's visit count,
  total/avg/max wall-clock time, slowest first:
  ```
  [graph:my-agent] performance summary (slowest first):
  [graph:my-agent]   deep_research: 1 visit(s), total 8200ms, avg 8200ms, max 8200ms
  [graph:my-agent]   extract_task: 1 visit(s), total 1400ms, avg 1400ms, max 1400ms
  ```
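
The aggregation behind a summary like the one above can be sketched as follows. This is an illustrative Python sketch rather than Loki's implementation: the `summarize` helper is hypothetical, the average is rounded down to whole milliseconds, and "slowest first" is assumed to mean descending total time, which matches the sample output but is not stated explicitly.

```python
from collections import defaultdict


def summarize(timings):
    """Aggregate (node_id, elapsed_ms) pairs, one pair per node visit."""
    per_node = defaultdict(list)
    for node_id, elapsed_ms in timings:
        per_node[node_id].append(elapsed_ms)
    rows = [
        (node_id, len(ms), sum(ms), sum(ms) // len(ms), max(ms))
        for node_id, ms in per_node.items()
    ]
    # Assumption: "slowest first" sorts by total wall-clock time, descending.
    return sorted(rows, key=lambda row: row[2], reverse=True)


timings = [("extract_task", 1400), ("deep_research", 8200)]
for node_id, visits, total, avg, peak in summarize(timings):
    print(f"{node_id}: {visits} visit(s), total {total}ms, "
          f"avg {avg}ms, max {peak}ms")
```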
 **State snapshots**: when `log_state_snapshots: true` (the default), before
 each node runs Loki logs the state's byte size and key list at `debug`
 level, and the *full* state at `trace` level. The full state is
 deliberately kept at `trace` because graph state can contain secrets, so
 be careful when sharing `trace`-level logs.
 ## Configuration
 Control the `tracing` channel with `RUST_LOG`:
 ```sh
 RUST_LOG=loki::graph=debug    loki -a my-agent "..."   # graph debug logs
 RUST_LOG=loki::graph=trace    loki -a my-agent "..."   # + full state snapshots
 RUST_LOG=loki::graph=info     loki -a my-agent "..."   # start/end/perf summary
 ```
 The stderr `▸` narration is always shown and is not affected by `RUST_LOG`.
 ---
 # Limitations / Gotchas
 A short, honest list of things that bite people:
 - **A graph agent is `graph.yaml`-only**. It must not also have a
  `config.yaml`. Having both files present is a hard load error.
 - **Graph agents do not support sessions**. A graph manages its own state
  (`GraphState`), so there is no conversational history to persist.
  Explicitly requesting a session is a hard error, whether through
  `--session` on the CLI, a session name passed to `.agent` in the REPL, or
  `.session` run while inside a graph agent. Any app-level `agent_session`
  default is silently skipped for graph agents rather than applied.
 - **RAG is per-node, not agent-wide**. Graph agents do RAG via `rag`
  nodes (each with its own knowledge base); there is no agent-wide
  `documents` field at the `graph.yaml` top level.
 - **A `rag` node's knowledge base is built once, at load time**. Changing
  a `rag` node's `documents` does not rebuild it. Delete
  `<agent-dir>/<node-id>.yaml` to force a fresh build on next run.
 - **`on_other` is required on every `approval` node** because `user__ask`
  always permits free-form responses (see [the approval section](#approval)).
 - **`validation` on `input` nodes is length-only**. The grammar is
  `len(input) <op> <integer>` with `<op>` in `> >= < <= ==`. No regex, no
  type coercion, no range checks. Use a follow-up `script` node for richer
  validation.
 - **An `input` node's `default` is not re-validated.** When the user
  submits an empty response and the `default` is substituted in, that
  substituted value is *not* checked against `validation`. Make sure any
  `default` you set would itself satisfy the `validation` predicate.
 - **Tool whitelist is `llm`-only**. `agent` nodes always use the child
  agent's full tool universe. They ignore any `tools:` field. This is by
  design: child agents own their tool surface.
 - **`{{output}}`, `{{choice}}`, `{{input}}` are scoped to `state_updates`**.
  Outside `state_updates` (e.g. in another node's `prompt`), these
  scoped variables are not available unless the previous node explicitly
  stored them via `state_updates`. `end` nodes do NOT get a scoped
  `{{output}}` because they have no node body output to scope.
 - **Schema-hint auto-injection happens for `llm` nodes only**, not
  `agent` nodes (see [Structured Output](#structured-output-output_schema)).
 - **Script-output JSON must be an object**, not an array or primitive,
  even if you only want to set `_next`.
 - **Cycles in declared static edges are always errors**. The per-node
  `max_loop_iterations` is a runtime *safety net* for cycles built via
  dynamic `script._next` routing, not permission to write static cycles.
 - **Schema version is fixed at `"1.0"`** today. Any other value is a
  startup error.
 - **Script extensions are exactly `.sh`, `.py`, `.ts`**. No JavaScript,
  no Ruby, no Lua. Python must be available as `python3` and TypeScript
  requires `npx tsx` on PATH.
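
To show how narrow the `input`-node `validation` grammar is, here is a minimal Python sketch of an evaluator for `len(input) <op> <integer>`. It is a hypothetical illustration, not Loki's parser: the `check` helper is invented for this example, the whitespace tolerance is an assumption, and it assumes length is counted in characters, which this page does not specify. Anything outside the grammar, such as a regex rule, is rejected, which is why richer checks belong in a follow-up `script` node.

```python
import operator
import re

# The five comparison operators the grammar allows.
_OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}

# Two-character operators are listed first so ">=" is not read as ">".
_RULE = re.compile(r"^\s*len\(input\)\s*(>=|<=|==|>|<)\s*(\d+)\s*$")


def check(rule, text):
    """Evaluate a length-only rule against the user's input text."""
    match = _RULE.match(rule)
    if match is None:
        raise ValueError(f"unsupported validation rule: {rule!r}")
    op, bound = match.groups()
    # Assumption: length is measured in characters, not bytes.
    return _OPS[op](len(text), int(bound))


print(check("len(input) >= 3", "abc"))
print(check("len(input) < 3", "abc"))
```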
 ---
 # See Also
 - [`graph.example.yaml`](https://github.com/Dark-Alex-17/loki/blob/main/graph.example.yaml) - A fully-commented, full-featured reference
  graph agent at the root of the Loki repository (every top-level field,
  every node type).
 - [Agents](Agents) - non-graph agent system (config.yaml + LLM loop)
 - [Custom Tools](Custom-Tools) - building `tools.sh` / `tools.py` /
  `tools.ts` files for use in graph nodes
 - [Roles](Roles) - note that the built-in `__structured_output__` role used
  by `output_schema` is intentionally internal and is not user-visible
 - [MCP Servers](MCP-Servers) - `mcp:<server>` shorthand inside an `llm`
  node's `tools:` whitelist
@@ -34,6 +34,11 @@
  - [Sub-Agent Spawning](Agents#7-sub-agent-spawning-system)
  - [User Interaction Tools](Agents#8-user-interaction-tools)
  - [Built-In Agents](Agents#built-in-agents)
 - [Graph Agents](Graph-Agents)
  - [Node Types](Graph-Agents#node-types)
  - [State & Templates](Graph-Agents#state-and-template-syntax)
  - [Structured Output](Graph-Agents#structured-output-output_schema)
  - [Limitations](Graph-Agents#limitations--gotchas)
 ## Knowledge & Automation
 - [RAG](RAG)