docs: Added docs on the new LLM_TOOL_RAW_JSON escape hatch

2026-06-03 14:22:31 -06:00
parent ddc8ae085f
commit 3bbe8f4b3d
2 changed files with 107 additions and 6 deletions
+54
@@ -284,6 +284,60 @@ $ ./get_current_time.sh
Fri Oct 24 05:55:04 PM MDT 2025
```
# Handling large or special-character argument values
Coyote dispatches a tool call by converting the LLM's JSON arguments into shell `--option=<value>` flags via `jq`, then
`eval`-ing the result. The flag values reach your `main` function as `argc_*` variables. For most calls this works fine.
However, for **very large values, multi-line values dense with newlines, or values with many markdown table pipes (`|`),
single quotes, em-dashes, and other shell-significant characters**, the shell-quoting round-trip can occasionally drop
characters or truncate the value before it reaches your `argc_*` variable. Symptoms include `argc_*` being shorter than
what the LLM sent, or starting mid-content.
## The escape hatch: `LLM_TOOL_RAW_JSON`
Coyote exports the raw JSON envelope it received from the LLM as the `LLM_TOOL_RAW_JSON` environment variable on every
tool invocation. To bypass argc parsing for a specific option, re-derive its value directly from the JSON using `jq`:
```bash
# Prefer the raw JSON when available, fall back to argc parsing if not
if [[ -n "$LLM_TOOL_RAW_JSON" ]] && command -v jq >/dev/null 2>&1; then
    argc_contents="$(jq -r '.contents' <<< "$LLM_TOOL_RAW_JSON")"
    argc_path="$(jq -r '.path' <<< "$LLM_TOOL_RAW_JSON")"
fi
```
The `jq -r` ("raw output") flag prints the decoded string instead of a JSON-quoted one, so newlines, quotes, em-dashes,
and shell-special characters in the LLM-sent value arrive intact, with no shell-quoting layer in between. The one
exception is trailing newlines, which bash's `$(...)` command substitution always strips. This is the same approach
Coyote's bundled `fs_write`, `fs_patch`, `execute_command`, `execute_sql_code`, and `send_mail` tools use for their
large-payload options.
The fallback to argc parsing is intentional: if `LLM_TOOL_RAW_JSON` is unset or `jq` isn't installed, the `if` block is
skipped and the tool still works via the standard argc path. You're adding a more reliable code path, not replacing the
existing one.
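One detail the pattern above leaves open: `jq -r '.some_key'` prints the literal string `null` when the key is absent from the JSON, which would overwrite a perfectly good argc value. A minimal sketch of a guard, assuming the option name is also the JSON key and is a valid shell identifier. The helper name `raw_json_override` and the sample payload are made up for illustration and are not part of Coyote:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Coyote): override argc_<key> from
# LLM_TOOL_RAW_JSON, but only when the key is actually present in the JSON.
raw_json_override() {
    local key="$1" var="argc_$1" value
    # Keep the argc value when the raw JSON or jq is unavailable.
    [[ -n "${LLM_TOOL_RAW_JSON:-}" ]] || return 0
    command -v jq >/dev/null 2>&1 || return 0
    # `jq -r '.missing'` would print "null", so check for the key first.
    jq -e --arg k "$key" 'has($k)' <<< "$LLM_TOOL_RAW_JSON" >/dev/null 2>&1 || return 0
    value="$(jq -r --arg k "$key" '.[$k]' <<< "$LLM_TOOL_RAW_JSON")" || return 0
    printf -v "$var" '%s' "$value"
}

# Demo with a made-up payload; argc_contents simulates a truncated argc value.
LLM_TOOL_RAW_JSON='{"path":"notes.md","contents":"| a | b |\nit'\''s fine"}'
argc_path="notes.md"
argc_contents="s fine"
argc_mode="append"

raw_json_override contents   # replaced with the full value from the JSON
raw_json_override mode       # key absent: argc_mode keeps its argc value
printf '%s\n' "$argc_contents"
```

Calling the helper once per large-payload option at the top of `main` keeps the fallback behavior described above: every early `return 0` leaves the argc value untouched.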
## When to use this
- Your option's value can legitimately be many KB of text (file contents, code, email bodies, SQL queries).
- Your option's value can contain shell-significant characters in dense patterns (pipes, single quotes, ANSI escapes).
- You observe that `argc_<option>` is shorter than what the LLM sent, or has corruption near the beginning/middle of the
value.
If your tool only takes short string values (paths, IDs, search queries), you don't need the bypass. The standard argc
flow handles those reliably.
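To find out whether a given option is affected in practice, one option is to compare the argc value against the raw JSON value while developing the tool. The following is a diagnostic sketch only; the function name `argc_matches_raw_json` and the sample payload are hypothetical, and it assumes the option name matches the JSON key:

```bash
#!/usr/bin/env bash
# Hypothetical diagnostic (not part of Coyote): succeed when argc_<option>
# matches the value in LLM_TOOL_RAW_JSON; warn on stderr and fail otherwise.
argc_matches_raw_json() {
    local key="$1" var="argc_$1" argc_value raw_value
    # Nothing to compare against: treat as a match.
    [[ -n "${LLM_TOOL_RAW_JSON:-}" ]] || return 0
    command -v jq >/dev/null 2>&1 || return 0
    argc_value="${!var:-}"
    raw_value="$(jq -r --arg k "$key" '.[$k] // empty' <<< "$LLM_TOOL_RAW_JSON")" || return 0
    if [[ "$argc_value" != "$raw_value" ]]; then
        echo "warning: $var has ${#argc_value} chars, raw JSON has ${#raw_value}" >&2
        return 1
    fi
}

# Demo with a made-up payload; argc_contents simulates a value that starts mid-content.
LLM_TOOL_RAW_JSON='{"contents":"line one\nline two"}'
argc_contents="line two"
argc_matches_raw_json contents || echo "contents: consider the raw JSON bypass" >&2
```

Because command substitution strips trailing newlines from the `jq` output, a value that ends in newlines may be reported as a mismatch even when nothing was lost, so treat the warning as a hint rather than proof.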
## For agent-local tools
If you're writing tools inside an agent's `tools.sh` (under `<config_dir>/agents/<agent>/tools.sh`), the equivalent
env var is `LLM_AGENT_RAW_JSON` (the raw JSON for the agent function call). The bypass pattern is identical:
```bash
if [[ -n "$LLM_AGENT_RAW_JSON" ]] && command -v jq >/dev/null 2>&1; then
    argc_some_field="$(jq -r '.some_field' <<< "$LLM_AGENT_RAW_JSON")"
fi
```
---
# Prompt Helpers
It's often useful to create interactive prompts for our bash tools so that our tools can get input from
users.
+53 -6
@@ -27,16 +27,63 @@ to enable it globally. See the [Tools](Tools#enablingdisabling-global-tools) doc
## Environment Variables
All tools have access to the following environment variables that provide context about the current execution environment:
| Variable             | Description |
|----------------------|-------------|
| `LLM_OUTPUT`         | Indicates where the output of the tool should go. <br>In certain situations, this may be set to a temporary file instead of `/dev/stdout`. |
| `LLM_ROOT_DIR`       | The root `config_dir` directory for Coyote <br>(i.e. `dirname $(coyote --info \| grep config_file \| awk '{print $2}')`) |
| `LLM_TOOL_NAME`      | The name of the tool being executed |
| `LLM_TOOL_CACHE_DIR` | A directory specific to the tool for storing cache or temporary files |
| `LLM_TOOL_RAW_JSON`  | The raw JSON envelope the LLM sent for this tool call, exactly as received. Useful as a robustness escape hatch. See [Handling large or special-character values](#handling-large-or-special-character-values) below. |
Coyote also searches the tools directory on startup for a `.env` file. If found, all tools in `functions/tools/` will have
the environment variables defined in the `.env` file available to them.
## Handling large or special-character values
Coyote dispatches a tool call by converting the LLM's JSON arguments into shell `--option=<value>` flags via `jq`, then
`eval`-ing the result. For most calls this works fine. For very large values, multi-line values with many newlines, or
values dense with markdown table pipes, single quotes, and other shell-significant characters, the shell-quoting
round-trip can occasionally drop characters or truncate the value before it reaches your tool's `argc_*` variable.
If you observe corruption or truncation of a tool argument (e.g., the value reaching your tool is shorter than what the
LLM sent, or starts mid-content), bypass the shell parsing entirely by reading directly from `LLM_TOOL_RAW_JSON`:
```bash
# Bash: prefer the raw JSON when available, fall back to argc parsing
if [[ -n "$LLM_TOOL_RAW_JSON" ]] && command -v jq >/dev/null 2>&1; then
    argc_contents="$(jq -r '.contents' <<< "$LLM_TOOL_RAW_JSON")"
    argc_path="$(jq -r '.path' <<< "$LLM_TOOL_RAW_JSON")"
fi
```
```python
# Python: read os.environ["LLM_TOOL_RAW_JSON"] and parse with json.loads
import json, os
raw = os.environ.get("LLM_TOOL_RAW_JSON")
if raw:
    payload = json.loads(raw)
    contents = payload["contents"]
    path = payload["path"]
```
```typescript
// TypeScript: process.env.LLM_TOOL_RAW_JSON, then JSON.parse
const raw = process.env.LLM_TOOL_RAW_JSON;
if (raw) {
    const payload = JSON.parse(raw);
    const contents = payload.contents;
    const path = payload.path;
}
```
This is the same approach Coyote's bundled `fs_write`, `fs_patch`, `execute_command`, `execute_sql_code`, and
`send_mail` tools use for their large-payload options. It preserves every byte of the original LLM-sent value, including
newlines, quotes, em-dashes, and shell-special characters. If `jq` (bash) or your language's JSON parser is available,
prefer this path for any option that may carry user-generated multi-line content.
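One caveat applies to the bash variant only: `$(...)` command substitution strips trailing newlines, so a value that ends in one or more newlines loses them even on the raw JSON path. If that matters for your tool, a common shell idiom is to append a sentinel character inside `jq` and remove it afterwards. A minimal sketch with a made-up payload (the variable names `stripped` and `exact` are illustrative):

```bash
#!/usr/bin/env bash
# Made-up payload whose "contents" value ends with two newlines.
LLM_TOOL_RAW_JSON='{"contents":"last line\n\n"}'

if command -v jq >/dev/null 2>&1; then
    # Plain command substitution drops the trailing newlines.
    stripped="$(jq -r '.contents' <<< "$LLM_TOOL_RAW_JSON")"

    # Sentinel idiom: append a marker inside jq, print without jq's own
    # trailing newline (-j), then remove the marker in bash.
    exact="$(jq -j '.contents + "x"' <<< "$LLM_TOOL_RAW_JSON")"
    exact="${exact%x}"

    echo "${#stripped} ${#exact}"   # prints "9 11"
fi
```

The Python and TypeScript variants are not affected, since `json.loads` and `JSON.parse` return the string without passing it through a shell.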
For agent-local tools written under `<config_dir>/agents/<agent>/tools.sh`, the equivalent env var is
`LLM_AGENT_RAW_JSON` (the raw JSON payload for the agent function call).
## Custom Bash-Based Tools
To create a Bash-based tool, refer to the [custom bash tools documentation](Custom-Bash-Tools).