From 687b0abb32ef61b3973ac7a7f509ae5c88e7206c Mon Sep 17 00:00:00 2001
From: Alex Clarke
Date: Fri, 22 May 2026 18:33:35 -0600
Subject: [PATCH] docs: updated graph agent docs to reflect improved llm node failure behavior

---
 Graph-Agents.md | 46 +++++++++++++++++++++++++---------------------
 1 file changed, 25 insertions(+), 21 deletions(-)

diff --git a/Graph-Agents.md b/Graph-Agents.md
index e5adc8d..2238fbf 100644
--- a/Graph-Agents.md
+++ b/Graph-Agents.md
@@ -428,17 +428,18 @@ whitelist is enforced against global tools, agent custom tools, and MCP alike.
 Each entry is validated at startup against the active agent's tool list; an
 unknown entry is a startup error.
 
-### Tolerant-fail routing
+### Failure routing
 
-| Outcome | Routes to |
-|------------------------------------------|----------------------------|
-| Success | `next` |
-| Failure WITH `fallback` set | `fallback` |
-| Failure WITHOUT `fallback` | `next` (output is "LLM node failed: ...") |
+| Outcome | Behavior |
+|------------------------------------------|-------------------------------------------------------------------------------------------|
+| Success | Routes via `next`. |
+| Failure with `fallback` set | Routes via `fallback`; `state_updates` are still applied; `{{output}}` holds the error. |
+| Failure without `fallback` | **Graph fails at this node** with a clear error message naming the underlying cause. |
 
-`state_updates` are always applied (success or failure). On failure,
-`{{output}}` resolves to an error description so downstream nodes can detect
-it.
+`state_updates` are always applied when the node has a `fallback` route
+(success or failure). On failure with no `fallback`, the graph aborts before
+downstream nodes run, so downstream `{{output}}` references never see error
+strings; the upstream cause is reported instead.
 
 ### Retries (`max_attempts`)
 
@@ -746,22 +747,24 @@ Nodes route via three mechanisms in priority order:
 | `script` | Either `_next` from script output OR static `next` (or `fallback` on failure). Error if neither. | Yes (when `_next` is not emitted) |
 | `approval` | No - routing is via `routes` and `on_other`. `next` is ignored. | No - forbidden by validator |
 | `input` | **Yes** - `next` is the success route. | No - forbidden by validator |
-| `llm` | **Yes** - `next` is the success route (and the default for failures without `fallback`). | Yes (success path; failure with `fallback` routes to single target) |
+| `llm` | **Yes** - `next` is the success route. Failures without `fallback` halt the graph. | Yes (success path; failure with `fallback` routes to single target) |
 | `rag` | **Yes** - `next` is required. Error at runtime if missing. | Yes |
 | `map` | **Yes** - `next` is where the parent super-step continues after the map collects. | Yes |
 | `end` | No - terminal. | n/a |
 
-### Tolerant-fail contract
+### Failure-handling contract
 
-Currently honored by `script` and `llm` nodes:
+| Node type | Success | Failure with `fallback` | Failure without `fallback` |
+|----------------------------|-----------------------------|-------------------------|-----------------------------------------------------------|
+| `llm` | Routes via `next` | Routes via `fallback` | **Graph fails at this node** (use `fallback:` to recover) |
+| `script` | Routes via `next` | Routes via `fallback` | Routes via `next`; `{{output}}` holds the error |
+| `agent` / `input` | Routes via `next` | n/a (no `fallback`) | Graph fails at this node |
+| `rag` / `map` / `approval` | Routes via configured edges | n/a | Graph fails at this node |
 
-- Success -> default routing
-- Failure with `fallback` set -> `fallback` target
-- Failure without `fallback` -> default routing, with the error description
-  exposed in state so the next node can react
-
-`agent` and `input` nodes do NOT have a tolerant-fail `fallback` path;
-their failures propagate as graph failures.
+**Note:** LLM node failures halt the graph with
+a clear error message. This prevents downstream nodes from running against
+garbage state when an upstream LLM call fails (HTTP 4xx/5xx, timeout,
+structured-extraction error, etc.).
 
 ---
 
@@ -1127,8 +1130,9 @@ Schema:
 
 ### Tolerant-fail for extraction
 
-- **LLM node**: extraction failure = node failure -> routes via `fallback`
-  or `next`.
+- **LLM node**: extraction failure = node failure. Routes via `fallback`
+  if declared; otherwise the graph fails at this node with the extractor
+  error message.
 - **Agent node**: extraction failure propagates as a graph error (agent nodes
   have no `fallback`).
 
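
Reviewer aid, not part of the patch: the failure-handling contract this change documents can be sketched as a small routing function. This is a minimal illustration under stated assumptions, not the project's implementation. The `Node` dataclass, `route` function, and `GraphFailure` exception are hypothetical names. Only the `next` / `fallback` semantics and the `{{output}}` behavior come from the documentation text in the diff.

```python
from dataclasses import dataclass
from typing import Optional


class GraphFailure(Exception):
    """Hypothetical error raised when a node fails with no recovery route."""


@dataclass
class Node:
    # Hypothetical node model; field names mirror the docs, not real code.
    name: str
    kind: str                       # e.g. "llm", "script", "agent"
    next: Optional[str] = None
    fallback: Optional[str] = None


def route(node: Node, ok: bool, output: str, state: dict) -> Optional[str]:
    """Return the name of the node to run next, or raise GraphFailure."""
    if ok:
        state["output"] = output
        return node.next            # success always routes via `next`
    if node.fallback is not None:
        state["output"] = output    # {{output}} holds the error description
        return node.fallback        # failure with `fallback` set
    if node.kind == "script":
        state["output"] = output    # script nodes stay tolerant: route via `next`
        return node.next
    # llm (new behavior), agent, input, etc.: the graph fails at this node
    raise GraphFailure(f"node '{node.name}' failed: {output}")
```

Under these assumptions, an `llm` node without `fallback` raises on failure, so no downstream node ever reads an error string from `{{output}}`, while a `script` node in the same situation still routes via `next`.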