docs: updated graph agent docs to reflect improved llm node failure behavior

2026-05-22 18:33:35 -06:00
parent d64231280c
commit 687b0abb32
+25 -21
@@ -428,17 +428,18 @@ whitelist is enforced against global tools, agent custom tools, and MCP
 alike. Each entry is validated at startup against the active agent's tool
 list; an unknown entry is a startup error.

-### Tolerant-fail routing
+### Failure routing

-| Outcome | Routes to |
-|------------------------------------------|----------------------------|
-| Success | `next` |
-| Failure WITH `fallback` set | `fallback` |
-| Failure WITHOUT `fallback` | `next` (output is "LLM node failed: ...") |
+| Outcome | Behavior |
+|------------------------------------------|-------------------------------------------------------------------------------------------|
+| Success | Routes via `next`. |
+| Failure with `fallback` set | Routes via `fallback`; `state_updates` are still applied; `{{output}}` holds the error. |
+| Failure without `fallback` | **Graph fails at this node** with a clear error message naming the underlying cause. |

-`state_updates` are always applied (success or failure). On failure,
-`{{output}}` resolves to an error description so downstream nodes can detect
-it.
+`state_updates` are always applied when the node has a `fallback` route
+(success or failure). On failure with no `fallback`, the graph aborts before
+downstream nodes run, so downstream `{{output}}` references never see error
+strings; the upstream cause is reported instead.

 ### Retries (`max_attempts`)
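
The new failure-routing rules for `llm` nodes can be sketched in Python as follows. This is only an illustration of the table above, not the project's implementation: `LlmNode`, `GraphFailure`, and `route_llm_node` are hypothetical names, and applying `state_updates` on every success is an assumption carried over from the old wording.

```python
from dataclasses import dataclass, field
from typing import Optional


class GraphFailure(Exception):
    """Hypothetical error raised when a node fails with no recovery route."""


@dataclass
class LlmNode:
    # Hypothetical stand-in for an llm node definition; field names mirror the docs.
    name: str
    next: str
    fallback: Optional[str] = None
    state_updates: dict = field(default_factory=dict)


def route_llm_node(node, state, output=None, error=None):
    """Return the name of the next node, mutating `state` as the table describes."""
    if error is not None and node.fallback is None:
        # Failure without `fallback`: abort before any downstream node runs.
        raise GraphFailure(f"llm node '{node.name}' failed: {error}")

    # Success (assumed), or failure with `fallback`: state_updates are applied.
    state.update(node.state_updates)

    if error is not None:
        # Failure with `fallback`: {{output}} holds the error description.
        state["output"] = str(error)
        return node.fallback

    state["output"] = output
    return node.next
```

A node that must survive an upstream HTTP error or timeout would declare `fallback`; a node without one surfaces the cause through the raised error instead of passing an error string downstream.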
@@ -746,22 +747,24 @@ Nodes route via three mechanisms in priority order:
 | `script` | Either `_next` from script output OR static `next` (or `fallback` on failure). Error if neither. | Yes (when `_next` is not emitted) |
 | `approval` | No - routing is via `routes` and `on_other`. `next` is ignored. | No - forbidden by validator |
 | `input` | **Yes** - `next` is the success route. | No - forbidden by validator |
-| `llm` | **Yes** - `next` is the success route (and the default for failures without `fallback`). | Yes (success path; failure with `fallback` routes to single target) |
+| `llm` | **Yes** - `next` is the success route. Failures without `fallback` halt the graph. | Yes (success path; failure with `fallback` routes to single target) |
 | `rag` | **Yes** - `next` is required. Error at runtime if missing. | Yes |
 | `map` | **Yes** - `next` is where the parent super-step continues after the map collects. | Yes |
 | `end` | No - terminal. | n/a |

-### Tolerant-fail contract
+### Failure-handling contract

-Currently honored by `script` and `llm` nodes:
-
-- Success -> default routing
-- Failure with `fallback` set -> `fallback` target
-- Failure without `fallback` -> default routing, with the error description
-  exposed in state so the next node can react
-
-`agent` and `input` nodes do NOT have a tolerant-fail `fallback` path;
-their failures propagate as graph failures.
+| Node type | Success | Failure with `fallback` | Failure without `fallback` |
+|----------------------------|-----------------------------|-------------------------|-----------------------------------------------------------|
+| `llm` | Routes via `next` | Routes via `fallback` | **Graph fails at this node** (use `fallback:` to recover) |
+| `script` | Routes via `next` | Routes via `fallback` | Routes via `next`; `{{output}}` holds the error |
+| `agent` / `input` | Routes via `next` | n/a (no `fallback`) | Graph fails at this node |
+| `rag` / `map` / `approval` | Routes via configured edges | n/a | Graph fails at this node |
+
+**Note:** LLM node failures halt the graph with
+a clear error message. This prevents downstream nodes from running against
+garbage state when an upstream LLM call fails (HTTP 4xx/5xx, timeout,
+structured-extraction error, etc.).

 ---
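
As a reading aid, the new failure-handling contract table can be transcribed into a small Python lookup. The names below (`HALT`, `CONTINUE`, `VIA_FALLBACK`, `on_failure`) are hypothetical and exist only for this sketch; the values come straight from the table, and nothing here reflects the actual validator or runtime code.

```python
# Outcomes, phrased as in the failure-handling contract table.
HALT = "graph fails at this node"
CONTINUE = "routes via next; {{output}} holds the error"
VIA_FALLBACK = "routes via fallback"

# Node types whose table row lists a `fallback` route.
SUPPORTS_FALLBACK = {"llm", "script"}

# Behavior on failure when no `fallback` is declared, per node type.
FAILURE_WITHOUT_FALLBACK = {
    "llm": HALT,
    "script": CONTINUE,
    "agent": HALT,
    "input": HALT,
    "rag": HALT,
    "map": HALT,
    "approval": HALT,
}


def on_failure(node_type, has_fallback):
    """Look up what the contract table says happens when a node fails."""
    if has_fallback:
        if node_type not in SUPPORTS_FALLBACK:
            # The table marks `fallback` as n/a for these node types.
            raise ValueError(f"'{node_type}' nodes have no fallback route")
        return VIA_FALLBACK
    return FAILURE_WITHOUT_FALLBACK[node_type]
```

Per the table, `script` is the only node type that still continues via `next` after an unrecovered failure; after this change `llm` joins the other types in halting the graph.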
@@ -1127,8 +1130,9 @@ Schema:
 ### Tolerant-fail for extraction

-- **LLM node**: extraction failure = node failure -> routes via `fallback`
-  or `next`.
+- **LLM node**: extraction failure = node failure. Routes via `fallback`
+  if declared; otherwise the graph fails at this node with the extractor
+  error message.
 - **Agent node**: extraction failure propagates as a graph error (agent
   nodes have no `fallback`).