feat: Improved coder agent that is now a graph-based agent
@@ -1,40 +1,82 @@
|
||||
# Coder
|
||||
|
||||
An AI agent that assists you with your coding tasks.
|
||||
A graph-based implementation agent. Plans, implements, and runs build +
|
||||
tests in a bounded fix-loop until verified. Designed to be delegated to by
|
||||
the **[Sisyphus](../sisyphus/README.md)** agent.
|
||||
|
||||
This agent is designed to be delegated to by the **[Sisyphus](../sisyphus/README.md)** agent to implement code specifications. Sisyphus
|
||||
acts as the coordinator/architect, while Coder handles the implementation details.
|
||||
Coder is a [graph agent](https://github.com/Dark-Alex-17/loki/wiki/Graph-Agents): its workflow is
|
||||
defined declaratively in `graph.yaml`, with verification and the
|
||||
implement-fix loop enforced as graph edges rather than prose.
|
||||
|
||||
## Features
|
||||
## Workflow
|
||||
|
||||
- 🏗️ Intelligent project structure creation and management
|
||||
- 🖼️ Convert screenshots into clean, functional code
|
||||
- 📁 Comprehensive file system operations (create folders, files, read/write files)
|
||||
- 🧐 Advanced code analysis and improvement suggestions
|
||||
- 📊 Precise diff-based file editing for controlled code modifications
|
||||
```
analyze_request    (llm + output_schema)   plan + complexity extraction
      ↓
route_complexity   (script)                opt-out approval gate (complexity ≥ 7)
      ↓
gate_approval      (approval, optional)
      ↓
implement          (llm + fs tools)        actual file edits
      ↓
verify_build       (script)
      ↓
verify_tests       (script)
      ↓
fix_loop_gate      (script)                back-edge to implement (bounded)
      ↓
end_success / end_rejected / end_failure
```
|
||||
|
||||
It can also be used as a standalone tool for direct coding assistance.
|
||||
End nodes emit one of three sentinel outcomes for the caller:
|
||||
|
||||
## Pro-Tip: Use an IDE MCP Server for Improved Performance
|
||||
Many modern IDEs now include MCP servers that let LLMs perform operations within the IDE itself and use IDE tools. Using
|
||||
an IDE's MCP server dramatically improves the performance of coding agents. So if you have an IDE, try adding that MCP
|
||||
server to your config (see the [MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md) to see how to configure
|
||||
them), and modify the agent definition to look like this:
|
||||
- `CODER_COMPLETE` — build and tests passed.
|
||||
- `CODER_REJECTED` — user rejected the plan at the approval gate.
|
||||
- `CODER_FAILED` — fix-loop exhausted; build/tests still failing.
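
A caller that scripts around Coder could branch on these sentinels. The sketch below is illustrative only: `coder_outcome_action` and the action names it prints are hypothetical, and it assumes the sentinel appears at the very start of the agent's output, as it does in the end-node templates in `graph.yaml`.

```sh
# Hypothetical helper (not part of the repo): map Coder's output to a
# follow-up action. Assumes the sentinel is the first thing in the output.
coder_outcome_action() {
  case "$1" in
    CODER_COMPLETE*) echo "continue" ;;        # build and tests passed
    CODER_REJECTED*) echo "ask_user" ;;        # plan rejected at the approval gate
    CODER_FAILED*)   echo "surface_failure" ;; # fix-loop budget exhausted
    *)               echo "unknown" ;;         # no recognized sentinel
  esac
}
```

With the agent's final output captured in a variable (here the assumed name `coder_output`), `action=$(coder_outcome_action "$coder_output")` yields the action to take.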
|
||||
|
||||
## Tuning
|
||||
|
||||
The agent's `project_dir` is exposed via the standard `variables:` block,
|
||||
so it accepts the runtime override flag:
|
||||
|
||||
```sh
# Invoke from inside the project (project_dir defaults to ".")
cd /path/to/your/project
loki -a coder "Add a foo() function..."

# Or invoke from anywhere with an explicit override
loki -a coder --agent-variable project_dir /path/to/your/project "Add..."
```
|
||||
|
||||
`graph.yaml` `initial_state` exposes:
|
||||
|
||||
- `max_fix_attempts` (default `3`) — fix-loop budget before `end_failure`.
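
To illustrate how the budget bounds the loop, here is a simplified, jq-free sketch of the check performed by `scripts/fix_loop_gate.sh`. The function name `fix_loop_next` is hypothetical; the real script reads both counters from the graph state and also populates `fix_instructions`.

```sh
# Hypothetical, simplified version of the budget check in fix_loop_gate.sh.
# $1 = fix attempts used so far, $2 = max_fix_attempts (defaults to 3).
fix_loop_next() {
  if [ "$1" -ge "${2:-3}" ]; then
    echo "end_failure"   # budget exhausted: stop looping
  else
    echo "implement"     # budget remains: take the back-edge
  fi
}
```

With the default of `3`, a run that keeps failing verification returns to `implement` three times before the gate routes to `end_failure`.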
|
||||
|
||||
Environment overrides honored by the script nodes:
|
||||
|
||||
- `BUILD_CMD` — skip project-type detection for the build/check command.
|
||||
- `TEST_CMD` — skip detection for tests.
|
||||
- `CODER_AUTOAPPROVE=1` — bypass the approval gate (for non-interactive runs
|
||||
where complexity might trip the gate).
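
For non-interactive runs, the overrides can be set together on one invocation. The wrapper below is a sketch under stated assumptions: `run_coder_ci` and `LOKI_BIN` are hypothetical names, and `cargo check` / `cargo test` are placeholder commands for a Rust project, so substitute your own.

```sh
# Hypothetical CI wrapper (not part of the repo).
# $1 = project directory, $2 = task prompt.
# LOKI_BIN is an assumed knob for swapping the binary, e.g. for a dry run.
run_coder_ci() {
  BUILD_CMD="${BUILD_CMD:-cargo check}" \
  TEST_CMD="${TEST_CMD:-cargo test}" \
  CODER_AUTOAPPROVE=1 \
    "${LOKI_BIN:-loki}" -a coder --agent-variable project_dir "$1" "$2"
}
```

Setting `LOKI_BIN=echo` before calling the function prints the arguments instead of running the agent, which is a cheap way to check the invocation.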
|
||||
|
||||
## Pro-Tip: IDE MCP Server
|
||||
|
||||
Modern IDEs (JetBrains, VS Code, Cursor, Zed, etc.) expose MCP servers
|
||||
that let LLMs use IDE tools directly. To wire one in, edit `graph.yaml`:
|
||||
|
||||
```yaml
|
||||
# ...
|
||||
|
||||
mcp_servers:
|
||||
- jetbrains # The name of your configured IDE MCP server
|
||||
- your-ide-mcp-server
|
||||
|
||||
global_tools:
|
||||
# Keep useful read-only tools for reading files in other non-project directories
|
||||
# Keep read-only fs tools for files outside the IDE project
|
||||
- fs_read.sh
|
||||
- fs_grep.sh
|
||||
- fs_glob.sh
|
||||
# - fs_write.sh
|
||||
# - fs_patch.sh
|
||||
- execute_command.sh
|
||||
|
||||
# ...
|
||||
```
|
||||
|
||||
Then add the MCP server's write/patch tools to the `implement` node's
|
||||
`tools:` whitelist.
|
||||
|
||||
@@ -1,116 +0,0 @@
name: coder
description: Implementation agent - writes code, follows patterns, verifies with builds
version: 1.0.0
temperature: 0.1

auto_continue: true
max_auto_continues: 15
inject_todo_instructions: true

variables:
  - name: project_dir
    description: Project directory to work in
    default: '.'
  - name: auto_confirm
    description: Auto-confirm command execution
    default: '1'

global_tools:
  - fs_read.sh
  - fs_grep.sh
  - fs_glob.sh
  - fs_write.sh
  - fs_patch.sh
  - execute_command.sh

instructions: |
  You are a senior engineer. You write code that works on the first try.

  ## Your Mission

  Given an implementation task:
  1. Check for orchestrator context first (see below)
  2. Fill gaps only. Read files NOT already covered in context
  3. Write the code (using tools, NOT chat output)
  4. Verify it compiles/builds
  5. Signal completion with a summary

  ## Using Orchestrator Context (IMPORTANT)

  When spawned by sisyphus, your prompt will often contain a `<context>` block
  with prior findings: file paths, code patterns, and conventions discovered by
  explore agents.

  **If context is provided:**
  1. Use it as your primary reference. Don't re-read files already summarized
  2. Follow the code patterns shown. Snippets in context ARE the style guide
  3. Read the referenced files ONLY IF you need more detail (e.g. full function
     signature, import list, or adjacent code not included in the snippet)
  4. If context includes a "Conventions" section, follow it exactly

  **If context is NOT provided or is too vague to act on:**
  Fall back to self-exploration: grep for similar files, read 1-2 examples,
  match their style.

  **Never ignore provided context.** It represents work already done upstream.

  ## Todo System

  For multi-file changes:
  1. `todo__init` with the implementation goal
  2. `todo__add` for each file to create/modify
  3. Implement each, calling `todo__done` immediately after

  ## Writing Code
  1. **Use fs_patch for surgical edits** - `fs_patch --path "src/main.rs" --contents "<diff>"` applies targeted changes without rewriting the whole file
  2. **Use fs_write for full file writes** - `fs_write --path "src/main.rs" --contents "<contents>"` writes the full file contents to a file at the specified path

  ## File Reading Strategy (IMPORTANT - minimize token usage)

  1. **Use grep to find relevant code** - `fs_grep --pattern "fn handle_request" --include "*.rs"` finds where things are
  2. **Read only what you need** - `fs_read --path "src/main.rs" --offset 50 --limit 30` reads lines 50-79
  3. **Never cat entire large files** - If 500+ lines, read the relevant section after grepping for it
  4. **Use glob to find files** - `fs_glob --pattern "*.rs" --path src/` discovers files by name

  ## Pattern Matching

  Before writing ANY file:
  1. Find a similar existing file (use `fs_grep` to locate, then `fs_read` to examine)
  2. Match its style: imports, naming, structure
  3. Follow the same patterns exactly

  ## Verification

  After writing files:
  1. Run `verify_build` to check compilation
  2. If it fails, fix the error (minimal change)
  3. Don't move on until build passes

  ## Completion Signal

  When done, end your response with a summary so the parent agent knows what happened:

  ```
  CODER_COMPLETE: [summary of what was implemented, which files were created/modified, and build status]
  ```

  Or if something went wrong:
  ```
  CODER_FAILED: [what went wrong]
  ```

  ## Rules

  1. **Write code via tools** - Never output code to chat
  2. **Follow patterns** - Read existing files first
  3. **Verify builds** - Don't finish without checking
  4. **Minimal fixes** - If build fails, fix precisely
  5. **No refactoring** - Only implement what's asked

  ## Context
  - Project: {{project_dir}}
  - CWD: {{__cwd__}}
  - Shell: {{__shell__}}

  ## Available tools:
  {{__tools__}}
@@ -0,0 +1,278 @@
name: coder
description: |
  Implementation agent. Plans, implements, and runs build + tests in a
  bounded fix-loop until verified. Designed to be delegated to by sisyphus.
version: "1.0"

temperature: 0.1

global_tools:
  - fs_cat.sh
  - fs_ls.sh
  - fs_write.sh
  - fs_patch.sh
  - execute_command.sh

variables:
  - name: project_dir
    description: |
      Absolute path to the project directory. Defaults to "." which is the
      directory you invoked `loki` from. Override at runtime with
      `loki -a coder --agent-variable project_dir /abs/path "..."`.
    default: "."

settings:
  max_loop_iterations: 20
  log_state_snapshots: true
  validate_before_run: true
  timeout: 1800

initial_state:
  project_dir: ""
  fix_attempts: 0
  max_fix_attempts: 3
  fix_instructions: ""
  build_output: ""
  tests_output: ""
  last_node_output: ""
  plan_summary: ""
  files_to_modify: []
  files_to_create: []
  risks: []
  complexity_score: 0

start: resolve_paths

nodes:
  resolve_paths:
    id: resolve_paths
    type: script
    description: Resolve project_dir to an absolute path from the agent variable
    script: scripts/resolve_paths.sh
    timeout: 5
    fallback: end_failure

  analyze_request:
    id: analyze_request
    type: llm
    description: Extract a structured plan and complexity score from the orchestrator's prompt
    instructions: |
      You are a senior engineer's planning assistant. Read the orchestrator's
      request and emit a structured plan. You only plan. You never edit files.

      Score complexity from 1 to 10:
        1-3: trivial - single file, <=20 lines changed, obvious approach
        4-6: moderate - 2-5 files, clear approach, some pattern matching
        7-10: complex - multi-component, ambiguous tradeoffs, refactoring,
              or wide blast radius

      Be specific in `files_to_modify` and `files_to_create`. All paths
      MUST be absolute. The project root is {{project_dir}}. Prefer paths
      like "{{project_dir}}/src/foo.rs" over "src/foo.rs". The implementer
      uses these paths directly with fs_write and fs_patch tools, which
      resolve relative paths against the loki invocation directory (NOT
      the project dir). Empty arrays are fine if no files in that category.

      `risks` is a list of short strings. Anything that could derail the
      implementation: unknown dependencies, brittle tests, blast radius,
      etc. Empty list is fine.

      Project directory: {{project_dir}}
    prompt: "{{initial_prompt}}"
    tools: []
    output_schema:
      type: object
      properties:
        plan_summary:
          type: string
          description: 1-3 sentences summarizing what will be done
        files_to_modify:
          type: array
          items: {type: string}
        files_to_create:
          type: array
          items: {type: string}
        complexity_score:
          type: integer
          minimum: 1
          maximum: 10
        risks:
          type: array
          items: {type: string}
      required: [plan_summary, files_to_modify, files_to_create, complexity_score, risks]
    state_updates:
      last_node_output: "{{output}}"
    fallback: end_failure
    next: route_complexity

  route_complexity:
    id: route_complexity
    type: script
    description: Route to approval gate for complex plans; skip otherwise
    script: scripts/route_complexity.sh
    timeout: 5
    fallback: implement

  gate_approval:
    id: gate_approval
    type: approval
    description: Optional human checkpoint for high-complexity plans
    question: |
      ## Plan
      {{plan_summary}}

      ## Files to modify
      {{files_to_modify}}

      ## Files to create
      {{files_to_create}}

      ## Risks
      {{risks}}

      Complexity: {{complexity_score}}/10

      Approve this plan?
    options:
      - "yes"
      - "no"
    routes:
      "yes": implement
      "no": end_rejected
    on_other: end_rejected

  implement:
    id: implement
    type: llm
    description: Write code via fs tools. Bounded tool-call loop.
    instructions: |
      You are a senior engineer. Implement the plan by writing code via
      tools. Follow existing patterns in the codebase.

      ## Writing code

      1. Use `fs_patch` for surgical edits to existing files.
      2. Use `fs_write` for new files or full rewrites.
      3. NEVER output code to chat. Always use tools.
      4. ALWAYS pass ABSOLUTE paths to fs_write and fs_patch. Relative
         paths resolve against the loki invocation directory (not the
         project dir), which is rarely what you want. The project root
         is {{project_dir}}.

      ## File reading

      1. Use `execute_command` to grep/find:
         `execute_command --command "grep -rn 'fn handle_request' --include='*.rs' ."`
         `execute_command --command "find . -name '*.rs' -not -path '*/target/*'"`
      2. Read only what you need:
         `fs_cat --path "src/main.rs" --offset 50 --limit 30`
      3. Never read entire large files. Use offset/limit.
      4. Use `fs_ls` to list directory contents.

      ## Pattern matching

      Before writing ANY file:
      1. Find a similar existing file (grep, then read).
      2. Match its style: imports, naming, structure, error handling.
      3. Follow the same patterns exactly. Do not invent new ones.

      ## Fix loop

      If the "Fix loop status" section in your user prompt is non-empty,
      the previous attempt failed verification. Read the error, identify
      the minimal fix, apply it. Do not refactor while fixing.

      ## Rules

      1. Match existing patterns - read examples first.
      2. Minimal changes - implement only what's asked.
      3. Never suppress errors (`as any`, `@ts-ignore`, `#[allow(...)]`
         on unfamiliar lints, etc.).
      4. No dead code, no commented-out blocks, no premature abstractions.
      5. End your turn when editing is done. The graph runs verification next.

      Project directory: {{project_dir}}
    prompt: |
      ## Plan summary
      {{plan_summary}}

      ## Files involved
      - Modify: {{files_to_modify}}
      - Create: {{files_to_create}}

      ## Original request from the orchestrator
      {{initial_prompt}}

      ## Fix loop status
      {{fix_instructions}}
    tools:
      - fs_cat
      - fs_ls
      - fs_write
      - fs_patch
      - execute_command
    max_iterations: 30
    state_updates:
      last_node_output: "{{output}}"
    fallback: end_failure
    next: verify_build

  verify_build:
    id: verify_build
    type: script
    description: Run the project's check/build command. Routes to verify_tests on success, fix_loop_gate on failure.
    script: scripts/verify_build.sh
    timeout: 300
    fallback: fix_loop_gate

  verify_tests:
    id: verify_tests
    type: script
    description: Run the project's test command. Routes to end_success on pass, fix_loop_gate on failure.
    script: scripts/verify_tests.sh
    timeout: 600
    fallback: fix_loop_gate

  fix_loop_gate:
    id: fix_loop_gate
    type: script
    description: Budget gate. Loops back to implement with fix_instructions populated, or terminates as end_failure.
    script: scripts/fix_loop_gate.sh
    timeout: 5
    fallback: end_failure

  end_success:
    id: end_success
    type: end
    output: |
      CODER_COMPLETE
      Plan: {{plan_summary}}
      Files modified: {{files_to_modify}}
      Files created: {{files_to_create}}
      Build: passed
      Tests: passed

  end_rejected:
    id: end_rejected
    type: end
    output: |
      CODER_REJECTED
      Plan was rejected at the approval gate.
      Plan: {{plan_summary}}

  end_failure:
    id: end_failure
    type: end
    output: |
      CODER_FAILED
      Plan: {{plan_summary}}
      Attempts: {{fix_attempts}}/{{max_fix_attempts}}

      Last node output:
      {{last_node_output}}

      Last build output:
      {{build_output}}

      Last tests output:
      {{tests_output}}
@@ -0,0 +1,49 @@
#!/usr/bin/env bash
set -euo pipefail

if [[ -n "${GRAPH_STATE_FILE:-}" ]]; then
  state=$(cat "$GRAPH_STATE_FILE")
elif [[ -n "${GRAPH_STATE:-}" ]]; then
  state="$GRAPH_STATE"
else
  state='{}'
fi

fix_attempts=$(echo "$state" | jq -r '.fix_attempts // 0')
max_fix_attempts=$(echo "$state" | jq -r '.max_fix_attempts // 3')
build_ok=$(echo "$state" | jq -r '.build_ok | if . == null then "true" else (. | tostring) end')
tests_ok=$(echo "$state" | jq -r '.tests_ok | if . == null then "true" else (. | tostring) end')
build_output=$(echo "$state" | jq -r '.build_output // ""')
tests_output=$(echo "$state" | jq -r '.tests_output // ""')

if (( fix_attempts >= max_fix_attempts )); then
  jq -nc \
    --argjson n "$fix_attempts" \
    '{
      "fix_attempts": $n,
      "_next": "end_failure"
    }'
  exit 0
fi

next_attempts=$((fix_attempts + 1))

if [[ "$build_ok" != "true" ]]; then
  fix_instructions=$(printf '## Fix loop status (attempt %d of %d)\n\nThe previous attempt failed the build.\n\nBuild output:\n```\n%s\n```\n\nIdentify the minimal fix and apply it. Do not refactor.' \
    "$next_attempts" "$max_fix_attempts" "$build_output")
elif [[ "$tests_ok" != "true" ]]; then
  fix_instructions=$(printf '## Fix loop status (attempt %d of %d)\n\nBuild passed but tests failed.\n\nTest output:\n```\n%s\n```\n\nIdentify the minimal fix and apply it. Do not refactor.' \
    "$next_attempts" "$max_fix_attempts" "$tests_output")
else
  fix_instructions=$(printf '## Fix loop status (attempt %d of %d)\n\nfix_loop_gate was reached but no failure was detected in state. Re-run the verification step.' \
    "$next_attempts" "$max_fix_attempts")
fi

jq -nc \
  --argjson n "$next_attempts" \
  --arg fi "$fix_instructions" \
  '{
    "fix_attempts": $n,
    "fix_instructions": $fi,
    "_next": "implement"
  }'
@@ -0,0 +1,12 @@
#!/usr/bin/env bash
set -euo pipefail

project_dir="${LLM_AGENT_VAR_PROJECT_DIR:-.}"
resolved=$(cd "$project_dir" 2>/dev/null && pwd) || resolved="$project_dir"

jq -nc \
  --arg pd "$resolved" \
  '{
    "project_dir": $pd,
    "_next": "analyze_request"
  }'
@@ -0,0 +1,23 @@
#!/usr/bin/env bash
set -euo pipefail

if [[ -n "${GRAPH_STATE_FILE:-}" ]]; then
  state=$(cat "$GRAPH_STATE_FILE")
elif [[ -n "${GRAPH_STATE:-}" ]]; then
  state="$GRAPH_STATE"
else
  state='{}'
fi

complexity=$(echo "$state" | jq -r '.complexity_score // 0')

if [[ "${CODER_AUTOAPPROVE:-0}" == "1" ]]; then
  jq -nc '{"_next": "implement"}'
  exit 0
fi

if (( complexity >= 7 )); then
  jq -nc '{"_next": "gate_approval"}'
else
  jq -nc '{"_next": "implement"}'
fi
@@ -0,0 +1,55 @@
#!/usr/bin/env bash
set -uo pipefail

# shellcheck disable=SC1091
source "$(dirname "$0")/../../.shared/utils.sh"

if [[ -n "${GRAPH_STATE_FILE:-}" ]]; then
  state=$(cat "$GRAPH_STATE_FILE")
elif [[ -n "${GRAPH_STATE:-}" ]]; then
  state="$GRAPH_STATE"
else
  state='{}'
fi

project_dir=$(echo "$state" | jq -r '.project_dir // "."')

if [[ -n "${BUILD_CMD:-}" ]]; then
  cmd="$BUILD_CMD"
else
  project_info=$(detect_project "$project_dir")
  cmd=$(echo "$project_info" | jq -r '.check // .build // ""')
fi

if [[ -z "$cmd" || "$cmd" == "null" ]]; then
  jq -nc '{
    "build_ok": true,
    "build_output": "(no build/check command available for this project type)",
    "_next": "verify_tests"
  }'
  exit 0
fi

exit_code=0
output=$(cd "$project_dir" && eval "$cmd" 2>&1) || exit_code=$?

if (( exit_code == 0 )); then
  jq -nc \
    --arg out "$output" \
    --arg cmd "$cmd" \
    '{
      "build_ok": true,
      "build_output": ("Ran: " + $cmd + "\n\n" + $out),
      "_next": "verify_tests"
    }'
else
  jq -nc \
    --arg out "$output" \
    --arg cmd "$cmd" \
    --argjson rc "$exit_code" \
    '{
      "build_ok": false,
      "build_output": ("Ran: " + $cmd + "\nExit code: " + ($rc | tostring) + "\n\n" + $out),
      "_next": "fix_loop_gate"
    }'
fi
@@ -0,0 +1,55 @@
#!/usr/bin/env bash
set -uo pipefail

# shellcheck disable=SC1091
source "$(dirname "$0")/../../.shared/utils.sh"

if [[ -n "${GRAPH_STATE_FILE:-}" ]]; then
  state=$(cat "$GRAPH_STATE_FILE")
elif [[ -n "${GRAPH_STATE:-}" ]]; then
  state="$GRAPH_STATE"
else
  state='{}'
fi

project_dir=$(echo "$state" | jq -r '.project_dir // "."')

if [[ -n "${TEST_CMD:-}" ]]; then
  cmd="$TEST_CMD"
else
  project_info=$(detect_project "$project_dir")
  cmd=$(echo "$project_info" | jq -r '.test // ""')
fi

if [[ -z "$cmd" || "$cmd" == "null" ]]; then
  jq -nc '{
    "tests_ok": true,
    "tests_output": "(no test command available for this project type)",
    "_next": "end_success"
  }'
  exit 0
fi

exit_code=0
output=$(cd "$project_dir" && eval "$cmd" 2>&1) || exit_code=$?

if (( exit_code == 0 )); then
  jq -nc \
    --arg out "$output" \
    --arg cmd "$cmd" \
    '{
      "tests_ok": true,
      "tests_output": ("Ran: " + $cmd + "\n\n" + $out),
      "_next": "end_success"
    }'
else
  jq -nc \
    --arg out "$output" \
    --arg cmd "$cmd" \
    --argjson rc "$exit_code" \
    '{
      "tests_ok": false,
      "tests_output": ("Ran: " + $cmd + "\nExit code: " + ($rc | tostring) + "\n\n" + $out),
      "_next": "fix_loop_gate"
    }'
fi
@@ -18,16 +18,15 @@ Sisyphus acts as the primary entry point, capable of handling complex tasks by c
|
||||
- 🛠️ **Tool Integration**: Seamlessly uses system tools for building, testing, and file manipulation.
|
||||
|
||||
## Pro-Tip: Use an IDE MCP Server for Improved Performance
|
||||
Many modern IDEs now include MCP servers that let LLMs perform operations within the IDE itself and use IDE tools. Using
|
||||
an IDE's MCP server dramatically improves the performance of coding agents. So if you have an IDE, try adding that MCP
|
||||
server to your config (see the [MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md) to see how to configure
|
||||
them), and modify the agent definition to look like this:
|
||||
Many modern IDEs (JetBrains, VS Code, Cursor, Zed, etc.) expose MCP servers that let LLMs use IDE tools directly. Using
|
||||
one dramatically improves the performance of coding agents. If you have one, add it to your loki config (see the
|
||||
[MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md)) and reference it in this agent's `mcp_servers:` list:
|
||||
|
||||
```yaml
|
||||
# ...
|
||||
|
||||
mcp_servers:
|
||||
- jetbrains
|
||||
- your-ide-mcp-server
|
||||
|
||||
global_tools:
|
||||
- fs_read.sh
|
||||
|
||||
@@ -119,20 +119,21 @@ instructions: |
|
||||
1. todo__init --goal "Add user profiles API endpoint"
|
||||
2. todo__add --task "Explore existing API patterns"
|
||||
3. todo__add --task "Implement profile endpoint"
|
||||
4. todo__add --task "Verify with build/test"
|
||||
5. agent__spawn --agent explore --prompt "Find existing API endpoint patterns, route structures, and controller conventions. Include code snippets."
|
||||
6. agent__spawn --agent explore --prompt "Find existing data models and database query patterns. Include code snippets."
|
||||
7. agent__collect --id <id1>
|
||||
8. agent__collect --id <id2>
|
||||
9. todo__done --id 1
|
||||
10. agent__spawn --agent coder --prompt "<structured prompt using Coder Delegation Format above, including code snippets from explore results>"
|
||||
11. agent__collect --id <coder_id>
|
||||
12. todo__done --id 2
|
||||
13. run_build
|
||||
14. run_tests
|
||||
15. todo__done --id 3
|
||||
4. agent__spawn --agent explore --prompt "Find existing API endpoint patterns, route structures, and controller conventions. Include code snippets."
|
||||
5. agent__spawn --agent explore --prompt "Find existing data models and database query patterns. Include code snippets."
|
||||
6. agent__collect --id <id1>
|
||||
7. agent__collect --id <id2>
|
||||
8. todo__done --id 1
|
||||
9. agent__spawn --agent coder --prompt "<structured prompt using Coder Delegation Format above, including code snippets from explore results>"
|
||||
10. agent__collect --id <coder_id>
|
||||
11. todo__done --id 2
|
||||
```
|
||||
|
||||
Note: the `coder` agent is a graph agent that runs verification (build +
|
||||
tests) and a bounded fix-loop internally. You do NOT need to spawn a
|
||||
separate build/test step. A `CODER_COMPLETE` outcome means build and
|
||||
tests already passed.
|
||||
|
||||
### Example 2: Architecture/design question (explore + oracle in parallel)
|
||||
|
||||
User: "How should I structure the authentication for this app?"
|
||||
@@ -172,6 +173,22 @@ instructions: |
|
||||
10. **Delegate to the coder agent to write code** - IMPORTANT: Use the `coder` agent to write code. Do not try to write code yourself except for trivial changes
|
||||
11. **Always output a summary of changes when finished** - Make it clear to the user that you've completed your tasks
|
||||
|
||||
## Coder Outcomes
|
||||
|
||||
The `coder` agent is a graph agent that runs the implement -> verify_build
|
||||
-> verify_tests -> fix_loop pipeline internally. It always returns one of
|
||||
three sentinel outcomes:
|
||||
|
||||
- `CODER_COMPLETE` - implementation succeeded with build + tests green.
|
||||
Continue with any follow-up todos.
|
||||
- `CODER_REJECTED` - user rejected the plan at the approval gate (only
|
||||
triggered for high-complexity plans). Do NOT re-spawn coder blindly;
|
||||
ask the user what to change first.
|
||||
- `CODER_FAILED` - the fix-loop exhausted its budget without producing
|
||||
green build/tests. The failure output includes the last build and tests
|
||||
output. Surface this to the user; consider spawning `oracle` for
|
||||
diagnosis if the failure is unclear.
|
||||
|
||||
## When to Do It Yourself
|
||||
|
||||
- Simple command execution
|
||||
|
||||