feat: Improved coder agent that is now a graph-based agent
@@ -1,40 +1,82 @@
|
|||||||
# Coder
|
# Coder
|
||||||
|
|
||||||
An AI agent that assists you with your coding tasks.
|
A graph-based implementation agent. Plans, implements, and runs build +
|
||||||
|
tests in a bounded fix-loop until verified. Designed to be delegated to by
|
||||||
|
the **[Sisyphus](../sisyphus/README.md)** agent.
|
||||||
|
|
||||||
This agent is designed to be delegated to by the **[Sisyphus](../sisyphus/README.md)** agent to implement code specifications. Sisyphus
|
Coder is a [graph agent](https://github.com/Dark-Alex-17/loki/wiki/Graph-Agents): its workflow is
|
||||||
acts as the coordinator/architect, while Coder handles the implementation details.
|
defined declaratively in `graph.yaml`, with verification and the
|
||||||
|
implement-fix loop enforced as graph edges rather than prose.
|
||||||
|
|
||||||
## Features
|
## Workflow
|
||||||
|
|
||||||
- 🏗️ Intelligent project structure creation and management
|
```
|
||||||
- 🖼️ Convert screenshots into clean, functional code
|
analyze_request (llm + output_schema) plan + complexity extraction
|
||||||
- 📁 Comprehensive file system operations (create folders, files, read/write files)
|
↓
|
||||||
- 🧐 Advanced code analysis and improvement suggestions
|
route_complexity (script) opt-out approval gate (complexity ≥ 7)
|
||||||
- 📊 Precise diff-based file editing for controlled code modifications
|
↓
|
||||||
|
gate_approval (approval, optional)
|
||||||
|
↓
|
||||||
|
implement (llm + fs tools) actual file edits
|
||||||
|
↓
|
||||||
|
verify_build (script)
|
||||||
|
↓
|
||||||
|
verify_tests (script)
|
||||||
|
↓
|
||||||
|
fix_loop_gate (script) back-edge to implement (bounded)
|
||||||
|
↓
|
||||||
|
end_success / end_rejected / end_failure
|
||||||
|
```
|
||||||
|
|
||||||
It can also be used as a standalone tool for direct coding assistance.
|
End nodes emit one of three sentinel outcomes for the caller:
|
||||||
|
|
||||||
## Pro-Tip: Use an IDE MCP Server for Improved Performance
|
- `CODER_COMPLETE` — build and tests passed.
|
||||||
Many modern IDEs now include MCP servers that let LLMs perform operations within the IDE itself and use IDE tools. Using
|
- `CODER_REJECTED` — user rejected the plan at the approval gate.
|
||||||
an IDE's MCP server dramatically improves the performance of coding agents. So if you have an IDE, try adding that MCP
|
- `CODER_FAILED` — fix-loop exhausted; build/tests still failing.
|
||||||
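A caller that consumes these outcomes from a script could branch on the sentinel roughly as follows. This is a minimal sketch, not part of the commit: `classify_coder_outcome` is a hypothetical helper, and it assumes the sentinel is the first line of the agent's output, as the end-node templates in `graph.yaml` suggest.

```sh
# Hypothetical caller-side helper, not part of the commit.
# Assumption: the sentinel is the first line of the coder output.
classify_coder_outcome() {
  # Keep only the first line of the agent output.
  first_line=$(printf '%s\n' "$1" | head -n 1)
  case "$first_line" in
    CODER_COMPLETE*) echo "complete" ;;
    CODER_REJECTED*) echo "rejected" ;;
    CODER_FAILED*)   echo "failed" ;;
    *)               echo "unknown" ;;
  esac
}

# Example with a made-up failure report.
sample_output=$(printf 'CODER_FAILED\nPlan: add foo()\nAttempts: 3/3')
classify_coder_outcome "$sample_output"
```

Anything that does not start with a known sentinel is reported as `unknown`, so the caller never treats it as success.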
server to your config (see the [MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md) to see how to configure
|
|
||||||
them), and modify the agent definition to look like this:
|
## Tuning
|
||||||
|
|
||||||
|
The agent's `project_dir` is exposed via the standard `variables:` block,
|
||||||
|
so it accepts the runtime override flag:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
# Invoke from inside the project (project_dir defaults to ".")
|
||||||
|
cd /path/to/your/project
|
||||||
|
loki -a coder "Add a foo() function..."
|
||||||
|
|
||||||
|
# Or invoke from anywhere with an explicit override
|
||||||
|
loki -a coder --agent-variable project_dir /path/to/your/project "Add..."
|
||||||
|
```
|
||||||
|
|
||||||
|
`graph.yaml` `initial_state` exposes:
|
||||||
|
|
||||||
|
- `max_fix_attempts` (default `3`) — fix-loop budget before `end_failure`.
|
||||||
|
|
||||||
|
Environment overrides honored by the script nodes:
|
||||||
|
|
||||||
|
- `BUILD_CMD` — skip project-type detection for the build/check command.
|
||||||
|
- `TEST_CMD` — skip detection for tests.
|
||||||
|
- `CODER_AUTOAPPROVE=1` — bypass the approval gate (for non-interactive runs
|
||||||
|
where complexity might trip the gate).
|
||||||
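The precedence behind `BUILD_CMD` and `TEST_CMD` can be sketched as below: a non-empty override wins, otherwise the detected command is used. `pick_command` is a hypothetical name, and `cargo check` stands in for whatever project-type detection would return; the real scripts read the detected command from JSON.

```sh
# Hypothetical helper mirroring the override precedence of the script nodes.
# $1: override value (for example "$BUILD_CMD"), $2: command from detection.
pick_command() {
  if [ -n "$1" ]; then
    echo "$1"
  else
    echo "$2"
  fi
}

# "cargo check" is an illustrative stand-in for a detected command.
pick_command "${BUILD_CMD:-}" "cargo check"
```

In practice the overrides would be set in the environment of the `loki -a coder "..."` invocation shown earlier, assuming the script nodes inherit that environment.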
|
|
||||||
|
## Pro-Tip: IDE MCP Server
|
||||||
|
|
||||||
|
Modern IDEs (JetBrains, VS Code, Cursor, Zed, etc.) expose MCP servers
|
||||||
|
that let LLMs use IDE tools directly. To wire one in, edit `graph.yaml`:
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
# ...
|
|
||||||
|
|
||||||
mcp_servers:
|
mcp_servers:
|
||||||
- jetbrains # The name of your configured IDE MCP server
|
- your-ide-mcp-server
|
||||||
|
|
||||||
global_tools:
|
global_tools:
|
||||||
# Keep useful read-only tools for reading files in other non-project directories
|
# Keep read-only fs tools for files outside the IDE project
|
||||||
- fs_read.sh
|
- fs_read.sh
|
||||||
- fs_grep.sh
|
- fs_grep.sh
|
||||||
- fs_glob.sh
|
- fs_glob.sh
|
||||||
# - fs_write.sh
|
# - fs_write.sh
|
||||||
# - fs_patch.sh
|
# - fs_patch.sh
|
||||||
- execute_command.sh
|
- execute_command.sh
|
||||||
|
```
|
||||||
|
|
||||||
# ...
|
Then add the MCP server's write/patch tools to the `implement` node's
|
||||||
```
|
`tools:` whitelist.
|
||||||
|
|||||||
@@ -1,116 +0,0 @@
|
|||||||
name: coder
|
|
||||||
description: Implementation agent - writes code, follows patterns, verifies with builds
|
|
||||||
version: 1.0.0
|
|
||||||
temperature: 0.1
|
|
||||||
|
|
||||||
auto_continue: true
|
|
||||||
max_auto_continues: 15
|
|
||||||
inject_todo_instructions: true
|
|
||||||
|
|
||||||
variables:
|
|
||||||
- name: project_dir
|
|
||||||
description: Project directory to work in
|
|
||||||
default: '.'
|
|
||||||
- name: auto_confirm
|
|
||||||
description: Auto-confirm command execution
|
|
||||||
default: '1'
|
|
||||||
|
|
||||||
global_tools:
|
|
||||||
- fs_read.sh
|
|
||||||
- fs_grep.sh
|
|
||||||
- fs_glob.sh
|
|
||||||
- fs_write.sh
|
|
||||||
- fs_patch.sh
|
|
||||||
- execute_command.sh
|
|
||||||
|
|
||||||
instructions: |
|
|
||||||
You are a senior engineer. You write code that works on the first try.
|
|
||||||
|
|
||||||
## Your Mission
|
|
||||||
|
|
||||||
Given an implementation task:
|
|
||||||
1. Check for orchestrator context first (see below)
|
|
||||||
2. Fill gaps only. Read files NOT already covered in context
|
|
||||||
3. Write the code (using tools, NOT chat output)
|
|
||||||
4. Verify it compiles/builds
|
|
||||||
5. Signal completion with a summary
|
|
||||||
|
|
||||||
## Using Orchestrator Context (IMPORTANT)
|
|
||||||
|
|
||||||
When spawned by sisyphus, your prompt will often contain a `<context>` block
|
|
||||||
with prior findings: file paths, code patterns, and conventions discovered by
|
|
||||||
explore agents.
|
|
||||||
|
|
||||||
**If context is provided:**
|
|
||||||
1. Use it as your primary reference. Don't re-read files already summarized
|
|
||||||
2. Follow the code patterns shown. Snippets in context ARE the style guide
|
|
||||||
3. Read the referenced files ONLY IF you need more detail (e.g. full function
|
|
||||||
signature, import list, or adjacent code not included in the snippet)
|
|
||||||
4. If context includes a "Conventions" section, follow it exactly
|
|
||||||
|
|
||||||
**If context is NOT provided or is too vague to act on:**
|
|
||||||
Fall back to self-exploration: grep for similar files, read 1-2 examples,
|
|
||||||
match their style.
|
|
||||||
|
|
||||||
**Never ignore provided context.** It represents work already done upstream.
|
|
||||||
|
|
||||||
## Todo System
|
|
||||||
|
|
||||||
For multi-file changes:
|
|
||||||
1. `todo__init` with the implementation goal
|
|
||||||
2. `todo__add` for each file to create/modify
|
|
||||||
3. Implement each, calling `todo__done` immediately after
|
|
||||||
|
|
||||||
## Writing Code
|
|
||||||
1. **Use fs_patch for surgical edits** - `fs_patch --path "src/main.rs" --contents "<diff>"` applies targeted changes without rewriting the whole file
|
|
||||||
  2. **Use fs_write for full file writes** - `fs_write --path "src/main.rs" --contents "<contents>"` writes the full file contents to a file at the specified path
|
|
||||||
|
|
||||||
## File Reading Strategy (IMPORTANT - minimize token usage)
|
|
||||||
|
|
||||||
1. **Use grep to find relevant code** - `fs_grep --pattern "fn handle_request" --include "*.rs"` finds where things are
|
|
||||||
2. **Read only what you need** - `fs_read --path "src/main.rs" --offset 50 --limit 30` reads lines 50-79
|
|
||||||
3. **Never cat entire large files** - If 500+ lines, read the relevant section after grepping for it
|
|
||||||
4. **Use glob to find files** - `fs_glob --pattern "*.rs" --path src/` discovers files by name
|
|
||||||
|
|
||||||
## Pattern Matching
|
|
||||||
|
|
||||||
Before writing ANY file:
|
|
||||||
1. Find a similar existing file (use `fs_grep` to locate, then `fs_read` to examine)
|
|
||||||
2. Match its style: imports, naming, structure
|
|
||||||
3. Follow the same patterns exactly
|
|
||||||
|
|
||||||
## Verification
|
|
||||||
|
|
||||||
After writing files:
|
|
||||||
1. Run `verify_build` to check compilation
|
|
||||||
2. If it fails, fix the error (minimal change)
|
|
||||||
3. Don't move on until build passes
|
|
||||||
|
|
||||||
## Completion Signal
|
|
||||||
|
|
||||||
When done, end your response with a summary so the parent agent knows what happened:
|
|
||||||
|
|
||||||
```
|
|
||||||
CODER_COMPLETE: [summary of what was implemented, which files were created/modified, and build status]
|
|
||||||
```
|
|
||||||
|
|
||||||
Or if something went wrong:
|
|
||||||
```
|
|
||||||
CODER_FAILED: [what went wrong]
|
|
||||||
```
|
|
||||||
|
|
||||||
## Rules
|
|
||||||
|
|
||||||
1. **Write code via tools** - Never output code to chat
|
|
||||||
2. **Follow patterns** - Read existing files first
|
|
||||||
3. **Verify builds** - Don't finish without checking
|
|
||||||
4. **Minimal fixes** - If build fails, fix precisely
|
|
||||||
5. **No refactoring** - Only implement what's asked
|
|
||||||
|
|
||||||
## Context
|
|
||||||
- Project: {{project_dir}}
|
|
||||||
- CWD: {{__cwd__}}
|
|
||||||
- Shell: {{__shell__}}
|
|
||||||
|
|
||||||
## Available tools:
|
|
||||||
{{__tools__}}
|
|
||||||
@@ -0,0 +1,278 @@
|
|||||||
|
name: coder
|
||||||
|
description: |
|
||||||
|
Implementation agent. Plans, implements, and runs build + tests in a
|
||||||
|
bounded fix-loop until verified. Designed to be delegated to by sisyphus.
|
||||||
|
version: "1.0"
|
||||||
|
|
||||||
|
temperature: 0.1
|
||||||
|
|
||||||
|
global_tools:
|
||||||
|
- fs_cat.sh
|
||||||
|
- fs_ls.sh
|
||||||
|
- fs_write.sh
|
||||||
|
- fs_patch.sh
|
||||||
|
- execute_command.sh
|
||||||
|
|
||||||
|
variables:
|
||||||
|
- name: project_dir
|
||||||
|
description: |
|
||||||
|
Absolute path to the project directory. Defaults to "." which is the
|
||||||
|
directory you invoked `loki` from. Override at runtime with
|
||||||
|
`loki -a coder --agent-variable project_dir /abs/path "..."`.
|
||||||
|
default: "."
|
||||||
|
|
||||||
|
settings:
|
||||||
|
max_loop_iterations: 20
|
||||||
|
log_state_snapshots: true
|
||||||
|
validate_before_run: true
|
||||||
|
timeout: 1800
|
||||||
|
|
||||||
|
initial_state:
|
||||||
|
project_dir: ""
|
||||||
|
fix_attempts: 0
|
||||||
|
max_fix_attempts: 3
|
||||||
|
fix_instructions: ""
|
||||||
|
build_output: ""
|
||||||
|
tests_output: ""
|
||||||
|
last_node_output: ""
|
||||||
|
plan_summary: ""
|
||||||
|
files_to_modify: []
|
||||||
|
files_to_create: []
|
||||||
|
risks: []
|
||||||
|
complexity_score: 0
|
||||||
|
|
||||||
|
start: resolve_paths
|
||||||
|
|
||||||
|
nodes:
|
||||||
|
resolve_paths:
|
||||||
|
id: resolve_paths
|
||||||
|
type: script
|
||||||
|
description: Resolve project_dir to an absolute path from the agent variable
|
||||||
|
script: scripts/resolve_paths.sh
|
||||||
|
timeout: 5
|
||||||
|
fallback: end_failure
|
||||||
|
|
||||||
|
analyze_request:
|
||||||
|
id: analyze_request
|
||||||
|
type: llm
|
||||||
|
description: Extract a structured plan and complexity score from the orchestrator's prompt
|
||||||
|
instructions: |
|
||||||
|
You are a senior engineer's planning assistant. Read the orchestrator's
|
||||||
|
request and emit a structured plan. You only plan. You never edit files.
|
||||||
|
|
||||||
|
Score complexity from 1 to 10:
|
||||||
|
1-3: trivial - single file, <=20 lines changed, obvious approach
|
||||||
|
4-6: moderate - 2-5 files, clear approach, some pattern matching
|
||||||
|
7-10: complex - multi-component, ambiguous tradeoffs, refactoring,
|
||||||
|
or wide blast radius
|
||||||
|
|
||||||
|
Be specific in `files_to_modify` and `files_to_create`. All paths
|
||||||
|
MUST be absolute. The project root is {{project_dir}}. Prefer paths
|
||||||
|
like "{{project_dir}}/src/foo.rs" over "src/foo.rs". The implementer
|
||||||
|
uses these paths directly with fs_write and fs_patch tools, which
|
||||||
|
resolve relative paths against the loki invocation directory (NOT
|
||||||
|
the project dir). Empty arrays are fine if no files in that category.
|
||||||
|
|
||||||
|
`risks` is a list of short strings. Anything that could derail the
|
||||||
|
implementation: unknown dependencies, brittle tests, blast radius,
|
||||||
|
etc. Empty list is fine.
|
||||||
|
|
||||||
|
Project directory: {{project_dir}}
|
||||||
|
prompt: "{{initial_prompt}}"
|
||||||
|
tools: []
|
||||||
|
output_schema:
|
||||||
|
type: object
|
||||||
|
properties:
|
||||||
|
plan_summary:
|
||||||
|
type: string
|
||||||
|
description: 1-3 sentences summarizing what will be done
|
||||||
|
files_to_modify:
|
||||||
|
type: array
|
||||||
|
items: {type: string}
|
||||||
|
files_to_create:
|
||||||
|
type: array
|
||||||
|
items: {type: string}
|
||||||
|
complexity_score:
|
||||||
|
type: integer
|
||||||
|
minimum: 1
|
||||||
|
maximum: 10
|
||||||
|
risks:
|
||||||
|
type: array
|
||||||
|
items: {type: string}
|
||||||
|
required: [plan_summary, files_to_modify, files_to_create, complexity_score, risks]
|
||||||
|
state_updates:
|
||||||
|
last_node_output: "{{output}}"
|
||||||
|
fallback: end_failure
|
||||||
|
next: route_complexity
|
||||||
|
|
||||||
|
route_complexity:
|
||||||
|
id: route_complexity
|
||||||
|
type: script
|
||||||
|
description: Route to approval gate for complex plans; skip otherwise
|
||||||
|
script: scripts/route_complexity.sh
|
||||||
|
timeout: 5
|
||||||
|
fallback: implement
|
||||||
|
|
||||||
|
gate_approval:
|
||||||
|
id: gate_approval
|
||||||
|
type: approval
|
||||||
|
description: Optional human checkpoint for high-complexity plans
|
||||||
|
question: |
|
||||||
|
## Plan
|
||||||
|
{{plan_summary}}
|
||||||
|
|
||||||
|
## Files to modify
|
||||||
|
{{files_to_modify}}
|
||||||
|
|
||||||
|
## Files to create
|
||||||
|
{{files_to_create}}
|
||||||
|
|
||||||
|
## Risks
|
||||||
|
{{risks}}
|
||||||
|
|
||||||
|
Complexity: {{complexity_score}}/10
|
||||||
|
|
||||||
|
Approve this plan?
|
||||||
|
options:
|
||||||
|
- "yes"
|
||||||
|
- "no"
|
||||||
|
routes:
|
||||||
|
"yes": implement
|
||||||
|
"no": end_rejected
|
||||||
|
on_other: end_rejected
|
||||||
|
|
||||||
|
implement:
|
||||||
|
id: implement
|
||||||
|
type: llm
|
||||||
|
description: Write code via fs tools. Bounded tool-call loop.
|
||||||
|
instructions: |
|
||||||
|
You are a senior engineer. Implement the plan by writing code via
|
||||||
|
tools. Follow existing patterns in the codebase.
|
||||||
|
|
||||||
|
## Writing code
|
||||||
|
|
||||||
|
1. Use `fs_patch` for surgical edits to existing files.
|
||||||
|
2. Use `fs_write` for new files or full rewrites.
|
||||||
|
3. NEVER output code to chat. Always use tools.
|
||||||
|
4. ALWAYS pass ABSOLUTE paths to fs_write and fs_patch. Relative
|
||||||
|
paths resolve against the loki invocation directory (not the
|
||||||
|
project dir), which is rarely what you want. The project root
|
||||||
|
is {{project_dir}}.
|
||||||
|
|
||||||
|
## File reading
|
||||||
|
|
||||||
|
1. Use `execute_command` to grep/find:
|
||||||
|
`execute_command --command "grep -rn 'fn handle_request' --include='*.rs' ."`
|
||||||
|
`execute_command --command "find . -name '*.rs' -not -path '*/target/*'"`
|
||||||
|
2. Read only what you need:
|
||||||
|
`fs_cat --path "src/main.rs" --offset 50 --limit 30`
|
||||||
|
3. Never read entire large files. Use offset/limit.
|
||||||
|
4. Use `fs_ls` to list directory contents.
|
||||||
|
|
||||||
|
## Pattern matching
|
||||||
|
|
||||||
|
Before writing ANY file:
|
||||||
|
1. Find a similar existing file (grep, then read).
|
||||||
|
2. Match its style: imports, naming, structure, error handling.
|
||||||
|
3. Follow the same patterns exactly. Do not invent new ones.
|
||||||
|
|
||||||
|
## Fix loop
|
||||||
|
|
||||||
|
If the "Fix loop status" section in your user prompt is non-empty,
|
||||||
|
the previous attempt failed verification. Read the error, identify
|
||||||
|
the minimal fix, apply it. Do not refactor while fixing.
|
||||||
|
|
||||||
|
## Rules
|
||||||
|
|
||||||
|
1. Match existing patterns - read examples first.
|
||||||
|
2. Minimal changes - implement only what's asked.
|
||||||
|
3. Never suppress errors (`as any`, `@ts-ignore`, `#[allow(...)]`
|
||||||
|
on unfamiliar lints, etc.).
|
||||||
|
4. No dead code, no commented-out blocks, no premature abstractions.
|
||||||
|
5. End your turn when editing is done. The graph runs verification next.
|
||||||
|
|
||||||
|
Project directory: {{project_dir}}
|
||||||
|
prompt: |
|
||||||
|
## Plan summary
|
||||||
|
{{plan_summary}}
|
||||||
|
|
||||||
|
## Files involved
|
||||||
|
- Modify: {{files_to_modify}}
|
||||||
|
- Create: {{files_to_create}}
|
||||||
|
|
||||||
|
## Original request from the orchestrator
|
||||||
|
{{initial_prompt}}
|
||||||
|
|
||||||
|
## Fix loop status
|
||||||
|
{{fix_instructions}}
|
||||||
|
tools:
|
||||||
|
- fs_cat
|
||||||
|
- fs_ls
|
||||||
|
- fs_write
|
||||||
|
- fs_patch
|
||||||
|
- execute_command
|
||||||
|
max_iterations: 30
|
||||||
|
state_updates:
|
||||||
|
last_node_output: "{{output}}"
|
||||||
|
fallback: end_failure
|
||||||
|
next: verify_build
|
||||||
|
|
||||||
|
verify_build:
|
||||||
|
id: verify_build
|
||||||
|
type: script
|
||||||
|
description: Run the project's check/build command. Routes to verify_tests on success, fix_loop_gate on failure.
|
||||||
|
script: scripts/verify_build.sh
|
||||||
|
timeout: 300
|
||||||
|
fallback: fix_loop_gate
|
||||||
|
|
||||||
|
verify_tests:
|
||||||
|
id: verify_tests
|
||||||
|
type: script
|
||||||
|
description: Run the project's test command. Routes to end_success on pass, fix_loop_gate on failure.
|
||||||
|
script: scripts/verify_tests.sh
|
||||||
|
timeout: 600
|
||||||
|
fallback: fix_loop_gate
|
||||||
|
|
||||||
|
fix_loop_gate:
|
||||||
|
id: fix_loop_gate
|
||||||
|
type: script
|
||||||
|
description: Budget gate. Loops back to implement with fix_instructions populated, or terminates as end_failure.
|
||||||
|
script: scripts/fix_loop_gate.sh
|
||||||
|
timeout: 5
|
||||||
|
fallback: end_failure
|
||||||
|
|
||||||
|
end_success:
|
||||||
|
id: end_success
|
||||||
|
type: end
|
||||||
|
output: |
|
||||||
|
CODER_COMPLETE
|
||||||
|
Plan: {{plan_summary}}
|
||||||
|
Files modified: {{files_to_modify}}
|
||||||
|
Files created: {{files_to_create}}
|
||||||
|
Build: passed
|
||||||
|
Tests: passed
|
||||||
|
|
||||||
|
end_rejected:
|
||||||
|
id: end_rejected
|
||||||
|
type: end
|
||||||
|
output: |
|
||||||
|
CODER_REJECTED
|
||||||
|
Plan was rejected at the approval gate.
|
||||||
|
Plan: {{plan_summary}}
|
||||||
|
|
||||||
|
end_failure:
|
||||||
|
id: end_failure
|
||||||
|
type: end
|
||||||
|
output: |
|
||||||
|
CODER_FAILED
|
||||||
|
Plan: {{plan_summary}}
|
||||||
|
Attempts: {{fix_attempts}}/{{max_fix_attempts}}
|
||||||
|
|
||||||
|
Last node output:
|
||||||
|
{{last_node_output}}
|
||||||
|
|
||||||
|
Last build output:
|
||||||
|
{{build_output}}
|
||||||
|
|
||||||
|
Last tests output:
|
||||||
|
{{tests_output}}
|
||||||
@@ -0,0 +1,49 @@
|
|||||||
|
#!/usr/bin/env bash
|
||||||
|
set -euo pipefail
|
||||||
|
|
||||||
|
if [[ -n "${GRAPH_STATE_FILE:-}" ]]; then
|
||||||
|
state=$(cat "$GRAPH_STATE_FILE")
|
||||||
|
elif [[ -n "${GRAPH_STATE:-}" ]]; then
|
||||||
|
state="$GRAPH_STATE"
|
||||||
|
else
|
||||||
|
state='{}'
|
||||||
|
fi
|
||||||
|
|
||||||
|
fix_attempts=$(echo "$state" | jq -r '.fix_attempts // 0')
|
||||||
|
max_fix_attempts=$(echo "$state" | jq -r '.max_fix_attempts // 3')
|
||||||
|
build_ok=$(echo "$state" | jq -r '.build_ok | if . == null then "true" else (. | tostring) end')
|
||||||
|
tests_ok=$(echo "$state" | jq -r '.tests_ok | if . == null then "true" else (. | tostring) end')
|
||||||
|
build_output=$(echo "$state" | jq -r '.build_output // ""')
|
||||||
|
tests_output=$(echo "$state" | jq -r '.tests_output // ""')
|
||||||
|
|
||||||
|
if (( fix_attempts >= max_fix_attempts )); then
|
||||||
|
jq -nc \
|
||||||
|
--argjson n "$fix_attempts" \
|
||||||
|
'{
|
||||||
|
"fix_attempts": $n,
|
||||||
|
"_next": "end_failure"
|
||||||
|
}'
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
|
|
||||||
|
next_attempts=$((fix_attempts + 1))
|
||||||
|
|
||||||
|
if [[ "$build_ok" != "true" ]]; then
|
||||||
|
fix_instructions=$(printf '## Fix loop status (attempt %d of %d)\n\nThe previous attempt failed the build.\n\nBuild output:\n```\n%s\n```\n\nIdentify the minimal fix and apply it. Do not refactor.' \
|
||||||
|
"$next_attempts" "$max_fix_attempts" "$build_output")
|
||||||
|
elif [[ "$tests_ok" != "true" ]]; then
|
||||||
|
fix_instructions=$(printf '## Fix loop status (attempt %d of %d)\n\nBuild passed but tests failed.\n\nTest output:\n```\n%s\n```\n\nIdentify the minimal fix and apply it. Do not refactor.' \
|
||||||
|
"$next_attempts" "$max_fix_attempts" "$tests_output")
|
||||||
|
else
|
||||||
|
fix_instructions=$(printf '## Fix loop status (attempt %d of %d)\n\nfix_loop_gate was reached but no failure was detected in state. Re-run the verification step.' \
|
||||||
|
"$next_attempts" "$max_fix_attempts")
|
||||||
|
fi
|
||||||
|
|
||||||
|
jq -nc \
|
||||||
|
--argjson n "$next_attempts" \
|
||||||
|
--arg fi "$fix_instructions" \
|
||||||
|
'{
|
||||||
|
"fix_attempts": $n,
|
||||||
|
"fix_instructions": $fi,
|
||||||
|
"_next": "implement"
|
||||||
|
}'
|
||||||
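The budget check at the top of `fix_loop_gate.sh` can be isolated into a small sketch without jq. `next_node` is a hypothetical name; the comparison and the increment follow the script above. Under this reading, the default budget of 3 allows up to three fix passes after the initial implementation.

```sh
# Simplified sketch of the budget check in fix_loop_gate.sh.
# $1: fix_attempts so far, $2: max_fix_attempts.
# Prints the node the gate would route to.
next_node() {
  if [ "$1" -ge "$2" ]; then
    echo "end_failure"
  else
    echo "implement"
  fi
}

# Walk the counter the way the gate increments it on each loop-back.
attempts=0
while [ "$(next_node "$attempts" 3)" = "implement" ]; do
  attempts=$((attempts + 1))
  echo "retry $attempts of 3"
done
echo "budget exhausted after $attempts retries"
```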
@@ -0,0 +1,12 @@
|
|||||||
|
#!/usr/bin/env bash
|
||||||
|
set -euo pipefail
|
||||||
|
|
||||||
|
project_dir="${LLM_AGENT_VAR_PROJECT_DIR:-.}"
|
||||||
|
resolved=$(cd "$project_dir" 2>/dev/null && pwd) || resolved="$project_dir"
|
||||||
|
|
||||||
|
jq -nc \
|
||||||
|
--arg pd "$resolved" \
|
||||||
|
'{
|
||||||
|
"project_dir": $pd,
|
||||||
|
"_next": "analyze_request"
|
||||||
|
}'
|
||||||
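The `cd ... && pwd` idiom in `resolve_paths.sh` can be tried in isolation with the sketch below. `resolve_dir` is a hypothetical wrapper, not a function from the commit. Note that the fallback returns the input unchanged, so a nonexistent relative path stays relative.

```sh
# Hypothetical wrapper around the idiom used in resolve_paths.sh.
resolve_dir() {
  # The subshell keeps the caller's working directory unchanged.
  (cd "$1" 2>/dev/null && pwd) || echo "$1"
}

resolve_dir .                    # absolute path of the current directory
resolve_dir /no/such/directory   # falls back to the input unchanged
```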
@@ -0,0 +1,23 @@
|
|||||||
|
#!/usr/bin/env bash
|
||||||
|
set -euo pipefail
|
||||||
|
|
||||||
|
if [[ -n "${GRAPH_STATE_FILE:-}" ]]; then
|
||||||
|
state=$(cat "$GRAPH_STATE_FILE")
|
||||||
|
elif [[ -n "${GRAPH_STATE:-}" ]]; then
|
||||||
|
state="$GRAPH_STATE"
|
||||||
|
else
|
||||||
|
state='{}'
|
||||||
|
fi
|
||||||
|
|
||||||
|
complexity=$(echo "$state" | jq -r '.complexity_score // 0')
|
||||||
|
|
||||||
|
if [[ "${CODER_AUTOAPPROVE:-0}" == "1" ]]; then
|
||||||
|
jq -nc '{"_next": "implement"}'
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
|
|
||||||
|
if (( complexity >= 7 )); then
|
||||||
|
jq -nc '{"_next": "gate_approval"}'
|
||||||
|
else
|
||||||
|
jq -nc '{"_next": "implement"}'
|
||||||
|
fi
|
||||||
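The routing decision in `route_complexity.sh` reduces to two checks, sketched here without jq. `route_for` is a hypothetical name; the threshold of 7 and the `CODER_AUTOAPPROVE` value of `1` are taken from the script above.

```sh
# Sketch of the routing decision in route_complexity.sh.
# $1: complexity score (integer), $2: value of CODER_AUTOAPPROVE (optional).
route_for() {
  if [ "${2:-0}" = "1" ]; then
    echo "implement"        # auto-approve bypasses the gate entirely
  elif [ "$1" -ge 7 ]; then
    echo "gate_approval"    # complex plans need a human decision
  else
    echo "implement"
  fi
}

route_for 6      # prints: implement
route_for 7      # prints: gate_approval
route_for 9 1    # prints: implement
```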
@@ -0,0 +1,55 @@
|
|||||||
|
#!/usr/bin/env bash
|
||||||
|
set -uo pipefail
|
||||||
|
|
||||||
|
# shellcheck disable=SC1091
|
||||||
|
source "$(dirname "$0")/../../.shared/utils.sh"
|
||||||
|
|
||||||
|
if [[ -n "${GRAPH_STATE_FILE:-}" ]]; then
|
||||||
|
state=$(cat "$GRAPH_STATE_FILE")
|
||||||
|
elif [[ -n "${GRAPH_STATE:-}" ]]; then
|
||||||
|
state="$GRAPH_STATE"
|
||||||
|
else
|
||||||
|
state='{}'
|
||||||
|
fi
|
||||||
|
|
||||||
|
project_dir=$(echo "$state" | jq -r '.project_dir // "."')
|
||||||
|
|
||||||
|
if [[ -n "${BUILD_CMD:-}" ]]; then
|
||||||
|
cmd="$BUILD_CMD"
|
||||||
|
else
|
||||||
|
project_info=$(detect_project "$project_dir")
|
||||||
|
cmd=$(echo "$project_info" | jq -r '.check // .build // ""')
|
||||||
|
fi
|
||||||
|
|
||||||
|
if [[ -z "$cmd" || "$cmd" == "null" ]]; then
|
||||||
|
jq -nc '{
|
||||||
|
"build_ok": true,
|
||||||
|
"build_output": "(no build/check command available for this project type)",
|
||||||
|
"_next": "verify_tests"
|
||||||
|
}'
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
|
|
||||||
|
exit_code=0
|
||||||
|
output=$(cd "$project_dir" && eval "$cmd" 2>&1) || exit_code=$?
|
||||||
|
|
||||||
|
if (( exit_code == 0 )); then
|
||||||
|
jq -nc \
|
||||||
|
--arg out "$output" \
|
||||||
|
--arg cmd "$cmd" \
|
||||||
|
'{
|
||||||
|
"build_ok": true,
|
||||||
|
"build_output": ("Ran: " + $cmd + "\n\n" + $out),
|
||||||
|
"_next": "verify_tests"
|
||||||
|
}'
|
||||||
|
else
|
||||||
|
jq -nc \
|
||||||
|
--arg out "$output" \
|
||||||
|
--arg cmd "$cmd" \
|
||||||
|
--argjson rc "$exit_code" \
|
||||||
|
'{
|
||||||
|
"build_ok": false,
|
||||||
|
"build_output": ("Ran: " + $cmd + "\nExit code: " + ($rc | tostring) + "\n\n" + $out),
|
||||||
|
"_next": "fix_loop_gate"
|
||||||
|
}'
|
||||||
|
fi
|
||||||
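The capture pattern that `verify_build.sh` and `verify_tests.sh` share can be sketched on its own. The `|| exit_code=$?` clause records the status of a failing command instead of letting it abort the script. `run_and_capture` is a hypothetical name, and the sample command is made up.

```sh
# Sketch of the capture pattern, without jq or project detection.
# Runs a command string, keeping combined stdout/stderr and the exit status
# in the global variables `output` and `exit_code`.
run_and_capture() {
  exit_code=0
  output=$(eval "$1" 2>&1) || exit_code=$?
}

# Made-up command that writes to both streams and fails with status 3.
run_and_capture 'echo building; echo oops >&2; exit 3'
echo "exit code: $exit_code"
echo "output: $output"
```

Because stderr is merged into stdout, compiler errors land in the same string that the script later stores as `build_output`.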
@@ -0,0 +1,55 @@
|
|||||||
|
#!/usr/bin/env bash
|
||||||
|
set -uo pipefail
|
||||||
|
|
||||||
|
# shellcheck disable=SC1091
|
||||||
|
source "$(dirname "$0")/../../.shared/utils.sh"
|
||||||
|
|
||||||
|
if [[ -n "${GRAPH_STATE_FILE:-}" ]]; then
|
||||||
|
state=$(cat "$GRAPH_STATE_FILE")
|
||||||
|
elif [[ -n "${GRAPH_STATE:-}" ]]; then
|
||||||
|
state="$GRAPH_STATE"
|
||||||
|
else
|
||||||
|
state='{}'
|
||||||
|
fi
|
||||||
|
|
||||||
|
project_dir=$(echo "$state" | jq -r '.project_dir // "."')
|
||||||
|
|
||||||
|
if [[ -n "${TEST_CMD:-}" ]]; then
|
||||||
|
cmd="$TEST_CMD"
|
||||||
|
else
|
||||||
|
project_info=$(detect_project "$project_dir")
|
||||||
|
cmd=$(echo "$project_info" | jq -r '.test // ""')
|
||||||
|
fi
|
||||||
|
|
||||||
|
if [[ -z "$cmd" || "$cmd" == "null" ]]; then
|
||||||
|
jq -nc '{
|
||||||
|
"tests_ok": true,
|
||||||
|
"tests_output": "(no test command available for this project type)",
|
||||||
|
"_next": "end_success"
|
||||||
|
}'
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
|
|
||||||
|
exit_code=0
|
||||||
|
output=$(cd "$project_dir" && eval "$cmd" 2>&1) || exit_code=$?
|
||||||
|
|
||||||
|
if (( exit_code == 0 )); then
|
||||||
|
jq -nc \
|
||||||
|
--arg out "$output" \
|
||||||
|
--arg cmd "$cmd" \
|
||||||
|
'{
|
||||||
|
"tests_ok": true,
|
||||||
|
"tests_output": ("Ran: " + $cmd + "\n\n" + $out),
|
||||||
|
"_next": "end_success"
|
||||||
|
}'
|
||||||
|
else
|
||||||
|
jq -nc \
|
||||||
|
--arg out "$output" \
|
||||||
|
--arg cmd "$cmd" \
|
||||||
|
--argjson rc "$exit_code" \
|
||||||
|
'{
|
||||||
|
"tests_ok": false,
|
||||||
|
"tests_output": ("Ran: " + $cmd + "\nExit code: " + ($rc | tostring) + "\n\n" + $out),
|
||||||
|
"_next": "fix_loop_gate"
|
||||||
|
}'
|
||||||
|
fi
|
||||||
@@ -18,16 +18,15 @@ Sisyphus acts as the primary entry point, capable of handling complex tasks by c
|
|||||||
- 🛠️ **Tool Integration**: Seamlessly uses system tools for building, testing, and file manipulation.
|
- 🛠️ **Tool Integration**: Seamlessly uses system tools for building, testing, and file manipulation.
|
||||||
|
|
||||||
## Pro-Tip: Use an IDE MCP Server for Improved Performance
|
## Pro-Tip: Use an IDE MCP Server for Improved Performance
|
||||||
Many modern IDEs now include MCP servers that let LLMs perform operations within the IDE itself and use IDE tools. Using
|
Many modern IDEs (JetBrains, VS Code, Cursor, Zed, etc.) expose MCP servers that let LLMs use IDE tools directly. Using
|
||||||
an IDE's MCP server dramatically improves the performance of coding agents. So if you have an IDE, try adding that MCP
|
one dramatically improves the performance of coding agents. If you have one, add it to your loki config (see the
|
||||||
server to your config (see the [MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md) to see how to configure
|
[MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md)) and reference it in this agent's `mcp_servers:` list:
|
||||||
them), and modify the agent definition to look like this:
|
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
# ...
|
# ...
|
||||||
|
|
||||||
mcp_servers:
|
mcp_servers:
|
||||||
- jetbrains
|
- your-ide-mcp-server
|
||||||
|
|
||||||
global_tools:
|
global_tools:
|
||||||
- fs_read.sh
|
- fs_read.sh
|
||||||
|
|||||||
@@ -119,20 +119,21 @@ instructions: |
|
|||||||
1. todo__init --goal "Add user profiles API endpoint"
|
1. todo__init --goal "Add user profiles API endpoint"
|
||||||
2. todo__add --task "Explore existing API patterns"
|
2. todo__add --task "Explore existing API patterns"
|
||||||
3. todo__add --task "Implement profile endpoint"
|
3. todo__add --task "Implement profile endpoint"
|
||||||
4. todo__add --task "Verify with build/test"
|
4. agent__spawn --agent explore --prompt "Find existing API endpoint patterns, route structures, and controller conventions. Include code snippets."
|
||||||
5. agent__spawn --agent explore --prompt "Find existing API endpoint patterns, route structures, and controller conventions. Include code snippets."
|
5. agent__spawn --agent explore --prompt "Find existing data models and database query patterns. Include code snippets."
|
||||||
6. agent__spawn --agent explore --prompt "Find existing data models and database query patterns. Include code snippets."
|
6. agent__collect --id <id1>
|
||||||
7. agent__collect --id <id1>
|
7. agent__collect --id <id2>
|
||||||
8. agent__collect --id <id2>
|
8. todo__done --id 1
|
||||||
9. todo__done --id 1
|
9. agent__spawn --agent coder --prompt "<structured prompt using Coder Delegation Format above, including code snippets from explore results>"
|
||||||
10. agent__spawn --agent coder --prompt "<structured prompt using Coder Delegation Format above, including code snippets from explore results>"
|
10. agent__collect --id <coder_id>
|
||||||
11. agent__collect --id <coder_id>
|
11. todo__done --id 2
|
||||||
12. todo__done --id 2
|
|
||||||
13. run_build
|
|
||||||
14. run_tests
|
|
||||||
15. todo__done --id 3
|
|
||||||
```
|
```
|
||||||
|
|
||||||
|
Note: the `coder` agent is a graph agent that runs verification (build +
|
||||||
|
tests) and a bounded fix-loop internally. You do NOT need to spawn a
|
||||||
|
separate build/test step. A `CODER_COMPLETE` outcome means build and
|
||||||
|
tests already passed.
|
||||||
|
|
||||||
### Example 2: Architecture/design question (explore + oracle in parallel)
|
### Example 2: Architecture/design question (explore + oracle in parallel)
|
||||||
|
|
||||||
User: "How should I structure the authentication for this app?"
|
User: "How should I structure the authentication for this app?"
|
||||||
@@ -172,6 +173,22 @@ instructions: |
|
|||||||
10. **Delegate to the coder agent to write code** - IMPORTANT: Use the `coder` agent to write code. Do not try to write code yourself except for trivial changes
|
10. **Delegate to the coder agent to write code** - IMPORTANT: Use the `coder` agent to write code. Do not try to write code yourself except for trivial changes
|
||||||
  11. **Always output a summary of changes when finished** - Make it clear to the user that you've completed your tasks
|
  11. **Always output a summary of changes when finished** - Make it clear to the user that you've completed your tasks
|
||||||
|
|
||||||
|
## Coder Outcomes
|
||||||
|
|
||||||
|
The `coder` agent is a graph agent that runs the implement -> verify_build
|
||||||
|
-> verify_tests -> fix_loop pipeline internally. It always returns one of
|
||||||
|
three sentinel outcomes:
|
||||||
|
|
||||||
|
- `CODER_COMPLETE` - implementation succeeded with build + tests green.
|
||||||
|
Continue with any follow-up todos.
|
||||||
|
- `CODER_REJECTED` - user rejected the plan at the approval gate (only
|
||||||
|
triggered for high-complexity plans). Do NOT re-spawn coder blindly;
|
||||||
|
ask the user what to change first.
|
||||||
|
- `CODER_FAILED` - the fix-loop exhausted its budget without producing
|
||||||
|
green build/tests. The failure output includes the last build and tests
|
||||||
|
output. Surface this to the user; consider spawning `oracle` for
|
||||||
|
diagnosis if the failure is unclear.
|
||||||
|
|
||||||
## When to Do It Yourself
|
## When to Do It Yourself
|
||||||
|
|
||||||
- Simple command execution
|
- Simple command execution
|
||||||
|
|||||||