docs: Updated docs for the LLM_*_RAW_JSON variable for agents and tools

2026-06-03 14:44:56 -06:00
parent 3bbe8f4b3d
commit 30c7854b31
2 changed files with 74 additions and 66 deletions
+23 -25
@@ -284,36 +284,35 @@ $ ./get_current_time.sh
Fri Oct 24 05:55:04 PM MDT 2025 Fri Oct 24 05:55:04 PM MDT 2025
``` ```
# Handling large or special-character argument values # Reading argument values from `LLM_TOOL_RAW_JSON`
Coyote dispatches a tool call by converting the LLM's JSON arguments into shell `--option=<value>` flags via `jq`, then Coyote dispatches a bash tool call by converting the LLM's JSON arguments into shell `--option=<value>` flags via `jq`,
`eval`-ing the result. The flag values reach your `main` function as `argc_*` variables. For most calls this works fine. then `eval`-ing the result. The flag values reach your `main` function as `argc_*` variables. For short, single-line
values this works fine.
However, for **very large values, multi-line values dense with newlines, or values with many markdown table pipes (`|`), However, for **large multi-line values, or values dense with shell-significant characters** (markdown table pipes (`|`),
single quotes, em-dashes, and other shell-significant characters**, the shell-quoting round-trip can occasionally drop single quotes, em-dashes, etc.), the shell-quoting round-trip can occasionally drop characters or truncate the value
characters or truncate the value before it reaches your `argc_*` variable. Symptoms include `argc_*` being shorter than before it reaches your `argc_*` variable. Symptoms include `argc_*` being shorter than what the LLM sent, or starting
what the LLM sent, or starting mid-content. mid-content.
## The escape hatch: `LLM_TOOL_RAW_JSON` To sidestep the shell-quoting layer entirely, read the value directly from the raw JSON envelope that Coyote exports as
the `LLM_TOOL_RAW_JSON` environment variable:
Coyote exports the raw JSON envelope it received from the LLM as the `LLM_TOOL_RAW_JSON` environment variable on every
tool invocation. To bypass argc parsing for a specific option, re-derive its value directly from the JSON using `jq`:
```bash ```bash
# Prefer the raw JSON when available, fall back to argc parsing if not # shellcheck disable=SC2154
if [[ -n "$LLM_TOOL_RAW_JSON" ]] && command -v jq >/dev/null 2>&1; then main() {
argc_contents="$(jq -r '.contents' <<< "$LLM_TOOL_RAW_JSON")" argc_contents="$(jq -r '.contents' <<< "$LLM_TOOL_RAW_JSON")"
argc_path="$(jq -r '.path' <<< "$LLM_TOOL_RAW_JSON")" argc_path="$(jq -r '.path' <<< "$LLM_TOOL_RAW_JSON")"
fi
# ... rest of your tool logic using $argc_contents and $argc_path
}
``` ```
The `jq -r` ("raw") flag preserves every byte of the original LLM-sent value, including newlines, quotes, em-dashes, The `jq -r` ("raw") flag preserves every byte of the original LLM-sent value, including newlines, quotes, em-dashes,
and shell-special characters, without any shell-quoting layer in between. This is the same approach Coyote's bundled and shell-special characters, without any shell-quoting layer in between. This is the pattern Coyote's bundled
`fs_write`, `fs_patch`, `execute_command`, `execute_sql_code`, and `send_mail` tools use for their large-payload options. `fs_write`, `fs_patch`, `execute_command`, `execute_sql_code`, and `send_mail` tools use for their large-payload
options. The argc `# @option --foo!` directives stay in your script so Coyote can build the JSON schema for the LLM
The fallback (`fall back to argc parsing if not`) is intentional: if `LLM_TOOL_RAW_JSON` is unset or `jq` isn't and validate the call, but your `main()` reads from `LLM_TOOL_RAW_JSON` instead of trusting argc's value capture.
installed, the tool still works via the standard argc path. You're adding a more reliable code path, not replacing the
existing one.
## When to use this ## When to use this
@@ -327,13 +326,12 @@ flow handles those reliably.
## For agent-local tools ## For agent-local tools
If you're writing tools inside an agent's `tools.sh` (under `<config_dir>/agents/<agent>/tools.sh`), the same env var If you're writing tools inside an agent's `tools.sh` (under `<config_dir>/agents/<agent>/tools.sh`), the same value is
is exposed as `LLM_AGENT_RAW_JSON` (the raw JSON for the agent function call). The bypass pattern is identical: exposed as `LLM_AGENT_RAW_JSON` (the raw JSON for the agent function call). The semantics are identical; only the
variable name differs:
```bash ```bash
if [[ -n "$LLM_AGENT_RAW_JSON" ]] && command -v jq >/dev/null 2>&1; then argc_some_field="$(jq -r '.some_field' <<< "$LLM_AGENT_RAW_JSON")"
argc_some_field="$(jq -r '.some_field' <<< "$LLM_AGENT_RAW_JSON")"
fi
``` ```
--- ---
+51 -41
@@ -27,62 +27,72 @@ to enable it globally. See the [Tools](Tools#enablingdisabling-global-tools) doc
## Environment Variables ## Environment Variables
All tools have access to the following environment variables that provide context about the current execution environment: All tools have access to the following environment variables that provide context about the current execution environment:
| Variable | Description | | Variable | Description |
|----------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| |----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `LLM_OUTPUT` | Indicates where the output of the tool should go. <br>In certain situations, this may be set to a temporary file instead of `/dev/stdout`. | | `LLM_OUTPUT` | Indicates where the output of the tool should go. <br>In certain situations, this may be set to a temporary file instead of `/dev/stdout`. |
| `LLM_ROOT_DIR` | The root `config_dir` directory for Coyote <br>(i.e. `dirname $(coyote --info \| grep config_file \| awk '{print $2}')`) | | `LLM_ROOT_DIR` | The root `config_dir` directory for Coyote <br>(i.e. `dirname $(coyote --info \| grep config_file \| awk '{print $2}')`) |
| `LLM_TOOL_NAME` | The name of the tool being executed | | `LLM_TOOL_NAME` | The name of the tool being executed |
| `LLM_TOOL_CACHE_DIR` | A directory specific to the tool for storing cache or temporary files | | `LLM_TOOL_CACHE_DIR` | A directory specific to the tool for storing cache or temporary files |
| `LLM_TOOL_RAW_JSON` | The raw JSON envelope the LLM sent for this tool call, exactly as received. Useful as a robustness escape hatch. See [Handling large or special-character values](#handling-large-or-special-character-values) below. | | `LLM_TOOL_RAW_JSON` | The raw JSON envelope the LLM sent for this tool call, exactly as received. See [Reading values via LLM_TOOL_RAW_JSON](#reading-values-via-llm_tool_raw_json) below. |
Coyote also searches the tools directory on startup for a `.env` file. If found, all tools in `functions/tools/` will have Coyote also searches the tools directory on startup for a `.env` file. If found, all tools in `functions/tools/` will have
the environment variables defined in the `.env` file available to them. the environment variables defined in the `.env` file available to them.
## Handling large or special-character values ## Reading values via `LLM_TOOL_RAW_JSON`
Coyote dispatches a tool call by converting the LLM's JSON arguments into shell `--option=<value>` flags via `jq`, then Coyote exports the raw JSON envelope it received from the LLM as the `LLM_TOOL_RAW_JSON` environment variable on every
`eval`-ing the result. For most calls this works fine. For very large values, multi-line values with many newlines, or tool invocation. Tools can use this to read option values directly from the JSON rather than going through the
values dense with markdown table pipes, single quotes, and other shell-significant characters, the shell-quoting `argc_*` variables.
round-trip can occasionally drop characters or truncate the value before it reaches your tool's `argc_*` variable.
If you observe corruption or truncation of a tool argument (e.g., the value reaching your tool is shorter than what the ### When to use it
LLM sent, or starts mid-content), bypass the shell parsing entirely by reading directly from `LLM_TOOL_RAW_JSON`:
**Bash tools**: This is the recommended pattern for any option that may carry large multi-line content, code, file
contents, or values dense with shell-significant characters (markdown table pipes, single quotes, em-dashes, etc.).
Coyote's bash dispatcher converts JSON to shell `--option=<value>` flags via `jq` and `eval`-s the result; for large or
special-character values, that shell-quoting round-trip can occasionally drop characters or misalign content before it
reaches `argc_*`. Reading from `LLM_TOOL_RAW_JSON` bypasses the shell layer entirely.
```bash ```bash
# Bash: prefer the raw JSON when available, fall back to argc parsing main() {
if [[ -n "$LLM_TOOL_RAW_JSON" ]] && command -v jq >/dev/null 2>&1; then
argc_contents="$(jq -r '.contents' <<< "$LLM_TOOL_RAW_JSON")" argc_contents="$(jq -r '.contents' <<< "$LLM_TOOL_RAW_JSON")"
argc_path="$(jq -r '.path' <<< "$LLM_TOOL_RAW_JSON")" argc_path="$(jq -r '.path' <<< "$LLM_TOOL_RAW_JSON")"
fi
```
```python # ... rest of your tool logic using $argc_contents and $argc_path
# Python: read os.environ["LLM_TOOL_RAW_JSON"] and parse with json.loads
import json, os
raw = os.environ.get("LLM_TOOL_RAW_JSON")
if raw:
payload = json.loads(raw)
contents = payload["contents"]
path = payload["path"]
```
```typescript
// TypeScript: process.env.LLM_TOOL_RAW_JSON, then JSON.parse
const raw = process.env.LLM_TOOL_RAW_JSON;
if (raw) {
const payload = JSON.parse(raw);
const contents = payload.contents;
const path = payload.path;
} }
``` ```
This is the same approach Coyote's bundled `fs_write`, `fs_patch`, `execute_command`, `execute_sql_code`, and This is the pattern Coyote's bundled `fs_write`, `fs_patch`, `execute_command`, `execute_sql_code`, and `send_mail` tools
`send_mail` tools use for their large-payload options. It preserves every byte of the original LLM-sent value, including use for their large-payload options. The argc `# @option --foo!` directives stay in your script so Coyote can build the
newlines, quotes, em-dashes, and shell-special characters. If `jq` (bash) or your language's JSON parser is available, JSON schema for the LLM, but your `main()` reads from `LLM_TOOL_RAW_JSON` instead of trusting argc's value capture.
prefer this path for any option that may carry user-generated multi-line content.
For agent-local tools written under `<config_dir>/agents/<agent>/tools.sh`, the same env var is exposed as **Python and TypeScript tools**: Coyote's Python and TypeScript dispatchers parse the JSON envelope natively (`json.loads`
`LLM_AGENT_RAW_JSON` (the raw JSON payload for the agent function call). / `JSON.parse`) and pass values directly to your `run()` function as native types. They don't go through shell quoting,
so the shell-quoting problem that makes the `LLM_*_RAW_JSON` escape hatch necessary for bash tools doesn't affect them.
Declared parameters arrive in your function intact without needing `LLM_TOOL_RAW_JSON`.
Python and TypeScript tools may still want to read `LLM_TOOL_RAW_JSON` for other reasons:
- Accessing fields the LLM passed that aren't declared in your `run()` signature (telemetry, optional metadata).
- Auditing or logging the original LLM-sent JSON verbatim.
- Debugging when a value isn't what you expected.
```python
# Python: parse the raw JSON when you need beyond-signature access
import json, os
payload = json.loads(os.environ["LLM_TOOL_RAW_JSON"])
extra_field = payload.get("extra_field")
```
```typescript
// TypeScript: parse the raw JSON when you need beyond-signature access
const payload = JSON.parse(process.env.LLM_TOOL_RAW_JSON!);
const extraField = (payload as Record<string, unknown>).extra_field;
```
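The Python snippet above raises `KeyError` if the variable is missing. A more defensive variant might look like the sketch below. It assumes the variable can be unset or empty (for example, when the tool is run by hand outside Coyote) and that a usable envelope is a JSON object. The helper name `read_raw_payload` and the `extra_field` key are hypothetical, not part of Coyote.

```python
import json
import os


def read_raw_payload(var_name="LLM_TOOL_RAW_JSON"):
    """Return the raw LLM envelope as a dict, or {} if it is unset or unusable."""
    raw = os.environ.get(var_name)
    if not raw:
        # Variable unset or empty, e.g. the tool was run by hand.
        return {}
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        # Not valid JSON; treat it as "no extra data" instead of crashing.
        return {}
    # Only a JSON object can carry named fields.
    return payload if isinstance(payload, dict) else {}


# Hypothetical undeclared field; None when the LLM did not send it.
extra_field = read_raw_payload().get("extra_field")
```

Returning an empty dict keeps the beyond-signature lookup optional: the tool's declared parameters still arrive through `run()`, and a missing envelope only means there is no extra data to read.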
### Agent-local tools
For tools written under `<config_dir>/agents/<agent>/tools.sh` (or `.py` / `.ts`), the same value is exposed as
`LLM_AGENT_RAW_JSON`, the raw JSON payload for the agent function call. The semantics are identical; only the variable
name differs.
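For Python code shared between global tools and agent-local tools, one option is to try both variable names. This is a minimal sketch: the lookup order and the helper name `load_call_payload` are assumptions, and nothing above says whether both variables are ever set at the same time.

```python
import json
import os

# Variable names come from the docs above; the lookup order is an assumption.
RAW_JSON_VARS = ("LLM_AGENT_RAW_JSON", "LLM_TOOL_RAW_JSON")


def load_call_payload(environ=os.environ):
    """Parse the first raw-JSON variable that is set; raise if none is."""
    for name in RAW_JSON_VARS:
        raw = environ.get(name)
        if raw:
            return json.loads(raw)
    raise RuntimeError("neither LLM_AGENT_RAW_JSON nor LLM_TOOL_RAW_JSON is set")
```

Passing the environment as a parameter keeps the helper easy to exercise in tests without touching the real process environment.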
## Custom Bash-Based Tools ## Custom Bash-Based Tools
To create a Bash-based tool, refer to the [custom bash tools documentation](Custom-Bash-Tools). To create a Bash-based tool, refer to the [custom bash tools documentation](Custom-Bash-Tools).