docs: Added docs on the new LLM_TOOL_RAW_JSON escape hatch
@@ -284,6 +284,60 @@ $ ./get_current_time.sh
```
Fri Oct 24 05:55:04 PM MDT 2025
```

# Handling large or special-character argument values

Coyote dispatches a tool call by converting the LLM's JSON arguments into shell `--option=<value>` flags via `jq`, then
`eval`-ing the result. The flag values reach your `main` function as `argc_*` variables. For most calls this works fine.

However, for **very large values, multi-line values dense with newlines, or values with many markdown table pipes (`|`),
single quotes, em-dashes, and other shell-significant characters**, the shell-quoting round-trip can occasionally drop
characters or truncate the value before it reaches your `argc_*` variable. Symptoms include `argc_*` being shorter than
what the LLM sent, or starting mid-content.

## The escape hatch: `LLM_TOOL_RAW_JSON`

Coyote exports the raw JSON envelope it received from the LLM as the `LLM_TOOL_RAW_JSON` environment variable on every
tool invocation. To bypass argc parsing for a specific option, re-derive its value directly from the JSON using `jq`:

```bash
# Prefer the raw JSON when available, fall back to argc parsing if not
if [[ -n "$LLM_TOOL_RAW_JSON" ]] && command -v jq >/dev/null 2>&1; then
    argc_contents="$(jq -r '.contents' <<< "$LLM_TOOL_RAW_JSON")"
    argc_path="$(jq -r '.path' <<< "$LLM_TOOL_RAW_JSON")"
fi
```

The `jq -r` (`--raw-output`) flag prints the string value as-is instead of as a quoted JSON string, so newlines, quotes,
em-dashes, and shell-special characters arrive intact, with no shell-quoting layer in between. The one exception is
trailing newlines, which the `$(...)` command substitution strips. This is the same approach Coyote's bundled
`fs_write`, `fs_patch`, `execute_command`, `execute_sql_code`, and `send_mail` tools use for their large-payload options.

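One detail worth guarding against: `jq -r '.contents'` prints the literal string `null` when the key is missing from the
JSON, which would overwrite a perfectly good `argc_contents`. The sketch below overrides an option only when the key is
actually present. The helper name `raw_json_has` and the sample values are illustrative assumptions, not part of Coyote.

```bash
# Simulated inputs so the sketch runs standalone; inside a real tool, Coyote provides both.
LLM_TOOL_RAW_JSON='{"path":"/tmp/notes.md"}'   # note: no "contents" key
argc_path="/tmp/notes.md"
argc_contents="value from the argc path"

# Hypothetical helper: succeeds only if the top-level key exists and is not null.
raw_json_has() {
    jq -e --arg key "$1" 'has($key) and .[$key] != null' <<< "$LLM_TOOL_RAW_JSON" >/dev/null 2>&1
}

if [[ -n "${LLM_TOOL_RAW_JSON:-}" ]] && command -v jq >/dev/null 2>&1; then
    # Override each option only when the raw JSON really carries it.
    if raw_json_has contents; then
        argc_contents="$(jq -r '.contents' <<< "$LLM_TOOL_RAW_JSON")"
    fi
    if raw_json_has path; then
        argc_path="$(jq -r '.path' <<< "$LLM_TOOL_RAW_JSON")"
    fi
fi

printf 'path=%s\ncontents=%s\n' "$argc_path" "$argc_contents"
```

Here `argc_contents` keeps its argc value because the envelope has no `contents` key, while `argc_path` is re-read from
the JSON.
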
The fallback (`fall back to argc parsing if not`) is intentional: if `LLM_TOOL_RAW_JSON` is unset or `jq` isn't
installed, the tool still works via the standard argc path. You're adding a more reliable code path, not replacing the
existing one.

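Because the bypass depends only on an environment variable, it can be exercised without going through an LLM. A minimal
sketch, assuming a tool whose options are `path` and `contents` (the `read_tool_args` function and the sample values are
hypothetical): build the envelope with `jq -n --arg` so the tricky value is JSON-encoded correctly, export it, and check
that the value survives.

```bash
# Hypothetical stand-in for the top of a tool's main function.
read_tool_args() {
    if [[ -n "${LLM_TOOL_RAW_JSON:-}" ]] && command -v jq >/dev/null 2>&1; then
        argc_contents="$(jq -r '.contents' <<< "$LLM_TOOL_RAW_JSON")"
        argc_path="$(jq -r '.path' <<< "$LLM_TOOL_RAW_JSON")"
    fi
}

# A value with pipes, a single quote, and a newline.
tricky=$'| col a | col b |\n| it\'s | fine |'

if command -v jq >/dev/null 2>&1; then
    # Let jq do the JSON encoding instead of hand-writing escapes.
    LLM_TOOL_RAW_JSON="$(jq -n --arg path "/tmp/table.md" --arg contents "$tricky" \
        '{path: $path, contents: $contents}')"
    export LLM_TOOL_RAW_JSON

    read_tool_args
    if [[ "$argc_contents" == "$tricky" ]]; then
        echo "round-trip ok"
    else
        echo "round-trip mismatch" >&2
    fi
fi
```

If `jq` is not installed the sketch does nothing, which mirrors the fallback behaviour described above.
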
## When to use this

- Your option's value can legitimately be many KB of text (file contents, code, email bodies, SQL queries).
- Your option's value can contain shell-significant characters in dense patterns (pipes, single quotes, ANSI escapes).
- You observe that `argc_<option>` is shorter than what the LLM sent, or has corruption near the beginning/middle of the
  value.

If your tool only takes short string values (paths, IDs, search queries), you don't need the bypass. The standard argc
flow handles those reliably.

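To find out whether a particular option is affected, one option is to compare the argc value against the raw JSON value
while developing the tool. This is a debugging sketch with simulated inputs; the names `raw_contents` and `mismatch` are
illustrative. Because `$(...)` strips trailing newlines, a difference that consists only of trailing newlines can show
up as a false positive.

```bash
# Simulated inputs so the sketch runs standalone; inside a real tool, Coyote sets both.
LLM_TOOL_RAW_JSON='{"contents":"| a | b |\n| c | d |"}'
argc_contents="| a | b |"   # pretend the argc path dropped the second line
mismatch=0

if [[ -n "${LLM_TOOL_RAW_JSON:-}" ]] && command -v jq >/dev/null 2>&1; then
    raw_contents="$(jq -r '.contents' <<< "$LLM_TOOL_RAW_JSON")"
    # Quoting the right-hand side forces a literal comparison rather than a glob match.
    if [[ "$argc_contents" != "$raw_contents" ]]; then
        echo "warning: argc_contents (${#argc_contents} chars) differs from raw JSON (${#raw_contents} chars)" >&2
        mismatch=1
    fi
fi
```

A warning here is a reasonable signal that the option should be read from `LLM_TOOL_RAW_JSON` instead.
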
## For agent-local tools

If you're writing tools inside an agent's `tools.sh` (under `<config_dir>/agents/<agent>/tools.sh`), the equivalent
variable is `LLM_AGENT_RAW_JSON` (the raw JSON for the agent function call). The bypass pattern is identical:

```bash
if [[ -n "$LLM_AGENT_RAW_JSON" ]] && command -v jq >/dev/null 2>&1; then
    argc_some_field="$(jq -r '.some_field' <<< "$LLM_AGENT_RAW_JSON")"
fi
```

---

# Prompt Helpers
It's often useful to create interactive prompts for our bash tools so that our tools can get input from
users.

+53
-6
@@ -27,16 +27,63 @@ to enable it globally. See the [Tools](Tools#enablingdisabling-global-tools) doc
## Environment Variables
All tools have access to the following environment variables that provide context about the current execution environment:

| Variable             | Description |
|----------------------|-------------|
| `LLM_OUTPUT`         | Indicates where the output of the tool should go. <br>In certain situations, this may be set to a temporary file instead of `/dev/stdout`. |
| `LLM_ROOT_DIR`       | The root `config_dir` directory for Coyote <br>(i.e. `dirname $(coyote --info \| grep config_file \| awk '{print $2}')`) |
| `LLM_TOOL_NAME`      | The name of the tool being executed |
| `LLM_TOOL_CACHE_DIR` | A directory specific to the tool for storing cache or temporary files |
| `LLM_TOOL_RAW_JSON`  | The raw JSON envelope the LLM sent for this tool call, exactly as received. Useful as a robustness escape hatch. See [Handling large or special-character values](#handling-large-or-special-character-values) below. |

Coyote also searches the tools directory on startup for a `.env` file. If found, all tools in `functions/tools/` will have
the environment variables defined in the `.env` file available to them.

## Handling large or special-character values

Coyote dispatches a tool call by converting the LLM's JSON arguments into shell `--option=<value>` flags via `jq`, then
`eval`-ing the result. For most calls this works fine. For very large values, multi-line values with many newlines, or
values dense with markdown table pipes, single quotes, and other shell-significant characters, the shell-quoting
round-trip can occasionally drop characters or truncate the value before it reaches your tool's `argc_*` variable.

If you observe corruption or truncation of a tool argument (e.g., the value reaching your tool is shorter than what the
LLM sent, or starts mid-content), bypass the shell parsing entirely by reading directly from `LLM_TOOL_RAW_JSON`:

```bash
# Bash: prefer the raw JSON when available, fall back to argc parsing
if [[ -n "$LLM_TOOL_RAW_JSON" ]] && command -v jq >/dev/null 2>&1; then
    argc_contents="$(jq -r '.contents' <<< "$LLM_TOOL_RAW_JSON")"
    argc_path="$(jq -r '.path' <<< "$LLM_TOOL_RAW_JSON")"
fi
```

```python
# Python: read os.environ["LLM_TOOL_RAW_JSON"] and parse with json.loads
import json
import os

raw = os.environ.get("LLM_TOOL_RAW_JSON")
if raw:
    payload = json.loads(raw)
    contents = payload["contents"]
    path = payload["path"]
```

```typescript
// TypeScript: process.env.LLM_TOOL_RAW_JSON, then JSON.parse
const raw = process.env.LLM_TOOL_RAW_JSON;
if (raw) {
  const payload = JSON.parse(raw);
  const contents = payload.contents;
  const path = payload.path;
}
```

This is the same approach Coyote's bundled `fs_write`, `fs_patch`, `execute_command`, `execute_sql_code`, and
`send_mail` tools use for their large-payload options. It preserves every byte of the original LLM-sent value, including
newlines, quotes, em-dashes, and shell-special characters. If `jq` (bash) or your language's JSON parser is available,
prefer this path for any option that may carry user-generated multi-line content.

For agent-local tools written under `<config_dir>/agents/<agent>/tools.sh`, the equivalent variable is
`LLM_AGENT_RAW_JSON` (the raw JSON payload for the agent function call).

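If you keep a shared helper that is sourced from both global tools and agent-local `tools.sh` files, a small sketch like
the following picks whichever variable is set. It assumes only one of the two is populated for a given call, which the
text above suggests but does not state outright; `raw_json`, `some_field`, and the sample payload are illustrative.

```bash
# Simulated agent-local call so the sketch runs standalone.
unset LLM_TOOL_RAW_JSON
LLM_AGENT_RAW_JSON='{"some_field":"line one\nline two"}'
argc_some_field="line one"   # pretend the argc path lost the second line

# Prefer the tool variable, then the agent variable, else empty.
raw_json="${LLM_TOOL_RAW_JSON:-${LLM_AGENT_RAW_JSON:-}}"

if [[ -n "$raw_json" ]] && command -v jq >/dev/null 2>&1; then
    argc_some_field="$(jq -r '.some_field' <<< "$raw_json")"
fi

printf '%s\n' "$argc_some_field"
```

When neither variable is set, or `jq` is missing, `argc_some_field` keeps the value from the standard argc path.
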
## Custom Bash-Based Tools
To create a Bash-based tool, refer to the [custom bash tools documentation](Custom-Bash-Tools).
