From 30c7854b319e478783f589231999471675c6b313 Mon Sep 17 00:00:00 2001
From: Alex Clarke
Date: Wed, 3 Jun 2026 14:44:56 -0600
Subject: [PATCH] docs: Updated docs for the LLM_*_RAW_JSON variable for agents and tools

---
 Custom-Bash-Tools.md | 48 +++++++++++------------
 Custom-Tools.md      | 92 ++++++++++++++++++++++++--------------------
 2 files changed, 74 insertions(+), 66 deletions(-)

diff --git a/Custom-Bash-Tools.md b/Custom-Bash-Tools.md
index a8d29fc..f68f663 100644
--- a/Custom-Bash-Tools.md
+++ b/Custom-Bash-Tools.md
@@ -284,36 +284,35 @@ $ ./get_current_time.sh
 Fri Oct 24 05:55:04 PM MDT 2025
 ```
 
-# Handling large or special-character argument values
+# Reading argument values from `LLM_TOOL_RAW_JSON`
 
-Coyote dispatches a tool call by converting the LLM's JSON arguments into shell `--option=` flags via `jq`, then
-`eval`-ing the result. The flag values reach your `main` function as `argc_*` variables. For most calls this works fine.
+Coyote dispatches a bash tool call by converting the LLM's JSON arguments into shell `--option=` flags via `jq`,
+then `eval`-ing the result. The flag values reach your `main` function as `argc_*` variables. For short, single-line
+values this works fine.
 
-However, for **very large values, multi-line values dense with newlines, or values with many markdown table pipes (`|`),
-single quotes, em-dashes, and other shell-significant characters**, the shell-quoting round-trip can occasionally drop
-characters or truncate the value before it reaches your `argc_*` variable. Symptoms include `argc_*` being shorter than
-what the LLM sent, or starting mid-content.
+However, for **large multi-line values, or values dense with shell-significant characters** (markdown table pipes (`|`),
+single quotes, em-dashes, etc.), the shell-quoting round-trip can occasionally drop characters or truncate the value
+before it reaches your `argc_*` variable. Symptoms include `argc_*` being shorter than what the LLM sent, or starting
+mid-content.
 
-## The escape hatch: `LLM_TOOL_RAW_JSON`
-
-Coyote exports the raw JSON envelope it received from the LLM as the `LLM_TOOL_RAW_JSON` environment variable on every
-tool invocation. To bypass argc parsing for a specific option, re-derive its value directly from the JSON using `jq`:
+To sidestep the shell-quoting layer entirely, read the value directly from the raw JSON envelope that Coyote exports as
+the `LLM_TOOL_RAW_JSON` environment variable:
 
 ```bash
-# Prefer the raw JSON when available, fall back to argc parsing if not
-if [[ -n "$LLM_TOOL_RAW_JSON" ]] && command -v jq >/dev/null 2>&1; then
+# shellcheck disable=SC2154
+main() {
     argc_contents="$(jq -r '.contents' <<< "$LLM_TOOL_RAW_JSON")"
     argc_path="$(jq -r '.path' <<< "$LLM_TOOL_RAW_JSON")"
-fi
+
+    # ... rest of your tool logic using $argc_contents and $argc_path
+}
 ```
 
 The `jq -r` ("raw") flag preserves every byte of the original LLM-sent value, including newlines, quotes, em-dashes,
-and shell-special characters, without any shell-quoting layer in between. This is the same approach Coyote's bundled
-`fs_write`, `fs_patch`, `execute_command`, `execute_sql_code`, and `send_mail` tools use for their large-payload options.
-
-The fallback (`fall back to argc parsing if not`) is intentional: if `LLM_TOOL_RAW_JSON` is unset or `jq` isn't
-installed, the tool still works via the standard argc path. You're adding a more reliable code path, not replacing the
-existing one.
+and shell-special characters, without any shell-quoting layer in between. This is the pattern Coyote's bundled
+`fs_write`, `fs_patch`, `execute_command`, `execute_sql_code`, and `send_mail` tools use for their large-payload
+options. The argc `# @option --foo!` directives stay in your script so Coyote can build the JSON schema for the LLM
+and validate the call, but your `main()` reads from `LLM_TOOL_RAW_JSON` instead of trusting argc's value capture.
 
 ## When to use this
 
@@ -327,13 +326,12 @@ flow handles those reliably.
 
 ## For agent-local tools
 
-If you're writing tools inside an agent's `tools.sh` (under `/agents//tools.sh`), the same env var
-is exposed as `LLM_AGENT_RAW_JSON` (the raw JSON for the agent function call). The bypass pattern is identical:
+If you're writing tools inside an agent's `tools.sh` (under `/agents//tools.sh`), the same value is
+exposed as `LLM_AGENT_RAW_JSON` (the raw JSON for the agent function call). The semantics are identical; only the
+variable name differs:
 
 ```bash
-if [[ -n "$LLM_AGENT_RAW_JSON" ]] && command -v jq >/dev/null 2>&1; then
-    argc_some_field="$(jq -r '.some_field' <<< "$LLM_AGENT_RAW_JSON")"
-fi
+argc_some_field="$(jq -r '.some_field' <<< "$LLM_AGENT_RAW_JSON")"
 ```
 
 ---
diff --git a/Custom-Tools.md b/Custom-Tools.md
index 93eb44f..8a143ef 100644
--- a/Custom-Tools.md
+++ b/Custom-Tools.md
@@ -27,62 +27,72 @@ to enable it globally. See the [Tools](Tools#enablingdisabling-global-tools) doc
 ## Environment Variables
 All tools have access to the following environment variables that provide context about the current execution environment:
 
-| Variable | Description |
-|----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `LLM_OUTPUT` | Indicates where the output of the tool should go.<br>In certain situations, this may be set to a temporary file instead of `/dev/stdout`. |
-| `LLM_ROOT_DIR` | The root `config_dir` directory for Coyote<br>(i.e. `dirname $(coyote --info \| grep config_file \| awk '{print $2}')`) |
-| `LLM_TOOL_NAME` | The name of the tool being executed |
-| `LLM_TOOL_CACHE_DIR` | A directory specific to the tool for storing cache or temporary files |
-| `LLM_TOOL_RAW_JSON` | The raw JSON envelope the LLM sent for this tool call, exactly as received. Useful as a robustness escape hatch. See [Handling large or special-character values](#handling-large-or-special-character-values) below. |
+| Variable | Description |
+|----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `LLM_OUTPUT` | Indicates where the output of the tool should go.<br>In certain situations, this may be set to a temporary file instead of `/dev/stdout`. |
+| `LLM_ROOT_DIR` | The root `config_dir` directory for Coyote<br>(i.e. `dirname $(coyote --info \| grep config_file \| awk '{print $2}')`) |
+| `LLM_TOOL_NAME` | The name of the tool being executed |
+| `LLM_TOOL_CACHE_DIR` | A directory specific to the tool for storing cache or temporary files |
+| `LLM_TOOL_RAW_JSON` | The raw JSON envelope the LLM sent for this tool call, exactly as received. See [Reading values via LLM_TOOL_RAW_JSON](#reading-values-via-llm_tool_raw_json) below. |
 
 Coyote also searches the tools directory on startup for a `.env` file. If found, all tools in `functions/tools/` will
 have the environment variables defined in the `.env` file available to them.
 
-## Handling large or special-character values
+## Reading values via `LLM_TOOL_RAW_JSON`
 
-Coyote dispatches a tool call by converting the LLM's JSON arguments into shell `--option=` flags via `jq`, then
-`eval`-ing the result. For most calls this works fine. For very large values, multi-line values with many newlines, or
-values dense with markdown table pipes, single quotes, and other shell-significant characters, the shell-quoting
-round-trip can occasionally drop characters or truncate the value before it reaches your tool's `argc_*` variable.
+Coyote exports the raw JSON envelope it received from the LLM as the `LLM_TOOL_RAW_JSON` environment variable on every
+tool invocation. Tools can use this to read option values directly from the JSON rather than going through the
+`argc_*` variables.
 
-If you observe corruption or truncation of a tool argument (e.g., the value reaching your tool is shorter than what the
-LLM sent, or starts mid-content), bypass the shell parsing entirely by reading directly from `LLM_TOOL_RAW_JSON`:
+### When to use it
+
+**Bash tools**: This is the recommended pattern for any option that may carry large multi-line content, code, file
+contents, or values dense with shell-significant characters (markdown table pipes, single quotes, em-dashes, etc.).
+Coyote's bash dispatcher converts JSON to shell `--option=` flags via `jq` and `eval`-s the result; for large or
+special-character values, that shell-quoting round-trip can occasionally drop characters or misalign content before it
+reaches `argc_*`. Reading from `LLM_TOOL_RAW_JSON` bypasses the shell layer entirely.
 
 ```bash
-# Bash: prefer the raw JSON when available, fall back to argc parsing
-if [[ -n "$LLM_TOOL_RAW_JSON" ]] && command -v jq >/dev/null 2>&1; then
+main() {
     argc_contents="$(jq -r '.contents' <<< "$LLM_TOOL_RAW_JSON")"
     argc_path="$(jq -r '.path' <<< "$LLM_TOOL_RAW_JSON")"
-fi
-```
 
-```python
-# Python: read os.environ["LLM_TOOL_RAW_JSON"] and parse with json.loads
-import json, os
-raw = os.environ.get("LLM_TOOL_RAW_JSON")
-if raw:
-    payload = json.loads(raw)
-    contents = payload["contents"]
-    path = payload["path"]
-```
-
-```typescript
-// TypeScript: process.env.LLM_TOOL_RAW_JSON, then JSON.parse
-const raw = process.env.LLM_TOOL_RAW_JSON;
-if (raw) {
-  const payload = JSON.parse(raw);
-  const contents = payload.contents;
-  const path = payload.path;
+    # ... rest of your tool logic using $argc_contents and $argc_path
 }
 ```
 
-This is the same approach Coyote's bundled `fs_write`, `fs_patch`, `execute_command`, `execute_sql_code`, and
-`send_mail` tools use for their large-payload options. It preserves every byte of the original LLM-sent value, including
-newlines, quotes, em-dashes, and shell-special characters. If `jq` (bash) or your language's JSON parser is available,
-prefer this path for any option that may carry user-generated multi-line content.
+This is the pattern Coyote's bundled `fs_write`, `fs_patch`, `execute_command`, `execute_sql_code`, and `send_mail` tools
+use for their large-payload options. The argc `# @option --foo!` directives stay in your script so Coyote can build the
+JSON schema for the LLM, but your `main()` reads from `LLM_TOOL_RAW_JSON` instead of trusting argc's value capture.
 
-For agent-local tools written under `/agents//tools.sh`, the same env var is exposed as
-`LLM_AGENT_RAW_JSON` (the raw JSON payload for the agent function call).
+**Python and TypeScript tools**: Coyote's Python and TypeScript dispatchers parse the JSON envelope natively (`json.loads`
+/ `JSON.parse`) and pass values directly to your `run()` function as native types. They don't go through shell quoting,
+so the `LLM_*_RAW_JSON` escape hatch that bash tools need doesn't affect them. Declared parameters arrive in your
+function correctly without needing `LLM_TOOL_RAW_JSON`.
+
+Python and TypeScript tools may still want to read `LLM_TOOL_RAW_JSON` for other reasons:
+- Accessing fields the LLM passed that aren't declared in your `run()` signature (telemetry, optional metadata).
+- Auditing or logging the original LLM-sent JSON verbatim.
+- Debugging when a value isn't what you expected.
+
+```python
+# Python: parse the raw JSON when you need beyond-signature access
+import json, os
+payload = json.loads(os.environ["LLM_TOOL_RAW_JSON"])
+extra_field = payload.get("extra_field")
+```
+
+```typescript
+// TypeScript: parse the raw JSON when you need beyond-signature access
+const payload = JSON.parse(process.env.LLM_TOOL_RAW_JSON!);
+const extraField = (payload as Record).extra_field;
+```
+
+### Agent-local tools
+
+For tools written under `/agents//tools.sh` (or `.py` / `.ts`), the same value is exposed as
+`LLM_AGENT_RAW_JSON`, the raw JSON payload for the agent function call. The semantics are identical; only the variable
+name differs.
 
 ## Custom Bash-Based Tools
 To create a Bash-based tool, refer to the [custom bash tools documentation](Custom-Bash-Tools).
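
The Python guidance in this patch (reading fields that are not declared in the `run()` signature) indexes `os.environ` directly, which raises `KeyError` when the variable is unset, for example when a tool is run by hand outside Coyote. A slightly more defensive variant might look like the sketch below. The helper name `read_raw_payload`, the `trace_id` field, and the choice to return an empty dict on bad input are illustrative assumptions, not part of Coyote.

```python
import json
import os


def read_raw_payload(var_name: str = "LLM_TOOL_RAW_JSON") -> dict:
    """Return the raw tool-call JSON as a dict (hypothetical helper).

    Falls back to an empty dict when the variable is unset, empty,
    not valid JSON, or not a JSON object.
    """
    raw = os.environ.get(var_name)
    if not raw:
        return {}
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    return payload if isinstance(payload, dict) else {}


def run(path: str) -> str:
    # `path` stands in for a declared parameter, delivered natively by the dispatcher.
    # `trace_id` is a hypothetical undeclared field, so it is read from the raw JSON.
    payload = read_raw_payload()
    trace_id = payload.get("trace_id")
    return f"path={path} trace_id={trace_id}"
```

For agent-local tools the same helper could be called as `read_raw_payload("LLM_AGENT_RAW_JSON")`, since the patch states that only the variable name differs.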