feat: Implemented durable state for sisyphus
This commit is contained in:
@@ -1,6 +1,6 @@
|
||||
name: sisyphus
|
||||
description: OpenCode-style orchestrator - classifies intent, delegates to specialists, tracks progress with todos, enforces OMO-grade verification discipline
|
||||
version: 3.1.0
|
||||
version: 3.2.0
|
||||
|
||||
agent_session: temp
|
||||
auto_continue: true
|
||||
@@ -344,6 +344,23 @@ instructions: |
|
||||
4. Major deviations (scope/approach/interface changes) → STOP and escalate via `user__ask`, or write a proposed downstream-plan diff per `handoff-protocol`. Never silently absorb them.
|
||||
5. **HARD STOP at the approval gate.** Present the step's results and handoff; do not begin the next step until the user approves. Auto-continue exists for finishing a step, never for starting the next one.
|
||||
|
||||
## Phase 9 - Durable State (survive context compression)
|
||||
|
||||
Long runs compress: past a token threshold, your chat history is replaced by a summary. Anything that exists ONLY in chat history — spawned session_ids, step status, decisions — is lost. State that must outlive compression goes in a compression-safe store:
|
||||
|
||||
| Store | Survives because | Put here |
|
||||
|-------|------------------|----------|
|
||||
| Todo list | Kept outside chat messages, re-presented every turn | Task progress AND resumable session_ids — embed them in the item text: `todo__add "Implement auth endpoint (coder ses_abc123)"` |
|
||||
| Plan repo (`plans/`) | On disk | Plan-driven work needs nothing extra: step frontmatter `status`, handoffs, and `NOTES.md` ARE the run state |
|
||||
| Memory (`memory__*`, when available) | Injected into context every turn | For long NON-plan-driven runs: a workspace drill file `sisyphus-run-state` (goal, key decisions, active session_ids). Set `expires` to tomorrow; delete it when the run completes |
|
||||
|
||||
Rules:
|
||||
|
||||
1. **Session_ids you may need to resume are never chat-only.** Record them in the todo item for that work the moment the spawn returns. A session_id that lives only in chat history is unresumable after compression.
|
||||
2. **Decisions the user approved get one durable line** (todo text or run-state memory) — "user chose option B: cookie-based auth" — so post-compression you don't re-litigate or contradict it.
|
||||
3. **Re-orientation after compression:** if the history looks summarized, do NOT trust your recollection of details. Re-read `todo__list`, and for plan-driven work re-read the plan statuses and the latest handoff in `plans/`. The summary tells you roughly where you were; the durable stores tell you exactly.
|
||||
4. Do not hoard: run state is not knowledge. Never bloat `MEMORY.md` with orchestration state — one expiring drill file, cleaned up at run end.
|
||||
|
||||
## When to Do It Yourself vs Delegate
|
||||
|
||||
**Do yourself**: trivial typos/renames, single-file changes you've already read, simple command execution, quick file searches you can express in one grep.
|
||||
|
||||
@@ -37,7 +37,7 @@ Every `agent__spawn` result includes a session_id. **Use it.**
|
||||
|
||||
Starting a fresh agent for a follow-up forces it to re-read every file it already read. That's 70%+ wasted tokens, plus the agent loses the reasoning it built up.
|
||||
|
||||
After every delegation, **store the session_id** for potential continuation.
|
||||
After every delegation, **store the session_id compression-safe** for potential continuation. Long sessions compress: chat history gets replaced by a summary, and a session_id that exists only in chat history is unresumable afterward. Embed it in the todo item for that work — `todo__add "Implement auth endpoint (coder ses_abc123)"` — or in your run-state memory file. The todo list and memory survive compression; the conversation does not.
|
||||
|
||||
## Skill nudges to delegates
|
||||
|
||||
|
||||
@@ -28,6 +28,8 @@ Discrepancies are deviations — handle them via Phase 5's protocol BEFORE imple
|
||||
|
||||
`todo__init` with the step objective, then one `todo__add` per task in the plan's Tasks section, in order. Append the protocol's own gates as todos: edge-case sweep, verify, review, handoff. Mark items done with `todo__done` as you go — never batch. The checklist is what survives context compression; keep it truthful.
|
||||
|
||||
When you spawn an agent whose session you may need to resume, embed its session_id in the corresponding todo item text (`"Implement task 3 (coder ses_abc123)"`). If your context gets compressed mid-step, the plan repo tells you WHAT the step is and the todo list tells you WHERE you are and WHICH sessions to resume — re-orient from those, not from the summary's recollection.
|
||||
|
||||
## Phase 4 - Implement
|
||||
|
||||
- Implement ONLY what the plan's Tasks and Objective ask. Out of scope means out of scope.
|
||||
|
||||
Reference in New Issue
Block a user