4.0 KiB
Step-Runner
A graph-based agent that executes one step of a phased implementation
plan, with the step protocol from the step-implementation skill enforced
as graph edges rather than prose. Designed to be delegated to by
Sisyphus; delegates implementation to
Coder and independent review to
code-reviewer.
It expects a plan repo authored per the plan-authoring skill:
plans/
steps/NN-<slug>.md # step plans with frontmatter (step/title/depends_on/status)
handoffs/NN-<slug>.md # written by this agent, validated by a deterministic gate
NOTES.md # rolling durable facts
Workflow
resolve_step (script) locate plan + previous handoff, check depends_on,
↓ mark plan in-progress [→ gate_blocked if deps unsatisfied]
orient (llm, read-only) merge handoff directives + staleness-check the plan
↓
route_staleness (script) major deviation → gate_deviation (approval)
↓
implement (agent → coder) coder runs its own build/test/self-review fix-loop
↓
route_coder_result (script) COMPLETE → verify | REJECTED / FAILED → end
↓
verify_format_lint (script) format BEFORE evidence, then lint
verify_build (script) step-level build/typecheck
verify_tests (script) FULL test suite
↓ [failures → fix_loop_gate, back-edge to implement]
edge_case_sweep (llm) missed edge cases; annotate downstream plans
↓ (Edge cases sections ONLY - scope changes become proposals)
route_sweep (script) 5+ files or architectural boundary → independent_review
independent_review (agent) code-reviewer; 🔴 findings loop back to implement (bounded)
↓
write_handoff (llm) evidence-backed handoff per handoff-protocol + NOTES.md
check_handoff (script) deterministic schema gate; marks plan status complete
↓
gate_user_review (approval) HARD STOP - approve, or send revision comments
↓ (revisions loop through implement → verify → handoff again)
end_success / end_blocked / end_rejected / end_failure
End nodes emit sentinel outcomes for the caller:
STEP_COMPLETE— step implemented, verified, handoff written, user approved.STEP_BLOCKED—depends_onunsatisfied and the user declined to proceed.STEP_REJECTED— user aborted at the deviation gate, or the coder's plan was rejected at its approval gate.STEP_FAILED— coder failed, the step-level fix budget was exhausted, or the handoff failed validation twice.
Usage
# From the project root: run the next in-progress/pending step
coyote -a step-runner "Execute the next step"
# A specific step (also parsed from the prompt: "execute step 3")
coyote -a step-runner --agent-variable step 3 "Execute step 3"
# Plan repo somewhere else
coyote -a step-runner --agent-variable plans_dir docs/plans "Execute the next step"
Invoke from the project root. The coder sub-agent resolves its own
project_dir from the invocation directory; overriding project_dir here
does not propagate to the spawned coder.
Tuning
graph.yaml initial_state exposes:
max_fix_attempts(default2) — step-level fix budget (the coder has its own internal budget of 3).max_review_attempts(default1) — bounded 🔴-finding fix loops after independent review.
Environment overrides honored by the script nodes:
FORMAT_CMD/LINT_CMD— formatting and linting (otherwise a per-type heuristic formats, and linting defers to the build/check command).BUILD_CMD/TEST_CMD— skip project-type detection (same as coder).STEP_AUTOAPPROVE=1— bypass the deviation gate (non-interactive runs).STEP_SKIP_REVIEW=1— never spawn the independent reviewer.
The final user approval gate is never bypassed by an environment variable - it is the point of the workflow.