# Step-Runner A graph-based agent that executes **one step** of a phased implementation plan, with the step protocol from the `step-implementation` skill enforced as graph edges rather than prose. Designed to be delegated to by **[Sisyphus](../sisyphus/README.md)**; delegates implementation to **[Coder](../coder/README.md)** and independent review to **[code-reviewer](../code-reviewer/README.md)**. It expects a plan repo authored per the `plan-authoring` skill: ``` plans/ steps/NN-.md # step plans with frontmatter (step/title/depends_on/status) handoffs/NN-.md # written by this agent, validated by a deterministic gate NOTES.md # rolling durable facts ``` ## Workflow ``` resolve_step (script) locate plan + previous handoff, check depends_on, ↓ mark plan in-progress [→ gate_blocked if deps unsatisfied] orient (llm, read-only) merge handoff directives + staleness-check the plan ↓ route_staleness (script) major deviation → gate_deviation (approval) ↓ implement (agent → coder) coder runs its own build/test/self-review fix-loop ↓ route_coder_result (script) COMPLETE → verify | REJECTED / FAILED → end ↓ verify_format_lint (script) format BEFORE evidence, then lint verify_build (script) step-level build/typecheck verify_tests (script) FULL test suite ↓ [failures → fix_loop_gate, back-edge to implement] edge_case_sweep (llm) missed edge cases; annotate downstream plans ↓ (Edge cases sections ONLY - scope changes become proposals) route_sweep (script) 5+ files or architectural boundary → independent_review independent_review (agent) code-reviewer; 🔴 findings loop back to implement (bounded) ↓ write_handoff (llm) evidence-backed handoff per handoff-protocol + NOTES.md check_handoff (script) deterministic schema gate; marks plan status complete ↓ gate_user_review (approval) HARD STOP - approve, or send revision comments ↓ (revisions loop through implement → verify → handoff again) end_success / end_blocked / end_rejected / end_failure ``` End nodes emit sentinel outcomes for the caller: - `STEP_COMPLETE` — step implemented, verified, handoff written, user approved. - `STEP_BLOCKED` — `depends_on` unsatisfied and the user declined to proceed. - `STEP_REJECTED` — user aborted at the deviation gate, or the coder's plan was rejected at its approval gate. - `STEP_FAILED` — coder failed, the step-level fix budget was exhausted, or the handoff failed validation twice. ## Usage ```sh # From the project root: run the next in-progress/pending step coyote -a step-runner "Execute the next step" # A specific step (also parsed from the prompt: "execute step 3") coyote -a step-runner --agent-variable step 3 "Execute step 3" # Plan repo somewhere else coyote -a step-runner --agent-variable plans_dir docs/plans "Execute the next step" ``` **Invoke from the project root.** The coder sub-agent resolves its own `project_dir` from the invocation directory; overriding `project_dir` here does not propagate to the spawned coder. ## Tuning `graph.yaml` `initial_state` exposes: - `max_fix_attempts` (default `2`) — step-level fix budget (the coder has its own internal budget of 3). - `max_review_attempts` (default `1`) — bounded 🔴-finding fix loops after independent review. Environment overrides honored by the script nodes: - `FORMAT_CMD` / `LINT_CMD` — formatting and linting (otherwise a per-type heuristic formats, and linting defers to the build/check command). - `BUILD_CMD` / `TEST_CMD` — skip project-type detection (same as coder). - `STEP_AUTOAPPROVE=1` — bypass the deviation gate (non-interactive runs). - `STEP_SKIP_REVIEW=1` — never spawn the independent reviewer. The final user approval gate is never bypassed by an environment variable - it is the point of the workflow.