# Sisyphus in LangChain/LangGraph

A faithful recreation of [Loki's Sisyphus agent](../../assets/agents/sisyphus/) using [LangGraph](https://docs.langchain.com/langgraph/) — LangChain's framework for stateful, multi-agent workflows.

This project exists to help you understand LangChain/LangGraph by mapping every concept to its Loki equivalent.

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                      SUPERVISOR NODE                        │
│ Intent classification → Routing decision → Command(goto=)   │
│                                                             │
│ Loki equivalent: sisyphus/config.yaml                       │
│ (agent__spawn → Command, agent__collect → graph edge)       │
└──────────┬──────────────┬──────────────┬────────────────────┘
           │              │              │
           ▼              ▼              ▼
    ┌────────────┐ ┌────────────┐ ┌────────────┐
    │  EXPLORE   │ │   ORACLE   │ │   CODER    │
    │ (research) │ │  (advise)  │ │  (build)   │
    │            │ │            │ │            │
    │ read-only  │ │ read-only  │ │ read+write │
    │   tools    │ │   tools    │ │   tools    │
    └─────┬──────┘ └─────┬──────┘ └─────┬──────┘
          │              │              │
          └──────────────┼──────────────┘
                         │
                 back to supervisor
```

## Concept Map: Loki → LangGraph

This is the key reference. Every row maps a Loki concept to its LangGraph equivalent.

### Core Architecture

| Loki Concept | LangGraph Equivalent | Where in Code |
|---|---|---|
| Agent config (config.yaml) | Node function + system prompt | `agents/explore.py`, etc. |
| Agent instructions | System prompt string | `EXPLORE_SYSTEM_PROMPT`, etc. |
| Agent tools (tools.sh) | `@tool`-decorated Python functions | `tools/filesystem.py`, `tools/project.py` |
| Agent session (chat loop) | Graph state + message list | `state.py` → `SisyphusState.messages` |
| `agent__spawn --agent X` | `Command(goto="X")` | `agents/supervisor.py` |
| `agent__collect --id` | Graph edge (implicit — workers return to supervisor) | `graph.py` → `add_edge("explore", "supervisor")` |
| `agent__check` (non-blocking) | Not needed (graph handles scheduling) | — |
| `agent__cancel` | Not needed (graph handles lifecycle) | — |
| `can_spawn_agents: true` | Node has routing logic (supervisor) | `agents/supervisor.py` |
| `max_concurrent_agents: 4` | `Send()` API for parallel fan-out | See [Parallel Execution](#parallel-execution) |
| `max_agent_depth: 3` | `recursion_limit` in config | `cli.py` → `recursion_limit: 50` |
| `summarization_threshold` | Manual truncation in supervisor | `supervisor.py` → `_summarize_outputs()` |

### Tool System

| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| `tools.sh` with `@cmd` annotations | `@tool` decorator | Loki compiles bash annotations to JSON schema; LangChain generates schema from the Python function signature + docstring |
| `@option --pattern!` (required arg) | Function parameter without default | `def search_content(pattern: str)` |
| `@option --lines` (optional arg) | Parameter with default | `def read_file(path: str, limit: int = 200)` |
| `@env LLM_OUTPUT=/dev/stdout` | Return value | LangChain tools return strings; Loki tools write to `$LLM_OUTPUT` |
| `@describe` | Docstring | The tool's docstring becomes the description the LLM sees |
| Global tools (`fs_read.sh`, etc.) | Shared tool imports | Both agents import from `tools/filesystem.py` |
| Agent-specific tools | Per-node tool binding | `llm.bind_tools(EXPLORE_TOOLS)` vs `llm.bind_tools(CODER_TOOLS)` |
| `.shared/utils.sh` | `tools/project.py` | Shared project detection utilities |
| `detect_project()` heuristic | `detect_project()` in Python | Same logic: check Cargo.toml → go.mod → package.json → etc. |
| LLM fallback for unknown projects | (omitted) | The agents themselves can reason about unknown project types |
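
To make the `tools.sh` → `@tool` row concrete, here is a minimal stdlib-only sketch of the idea (the real decorator is `langchain_core.tools.tool`; `describe_tool` here is a simplified stand-in, not the library's implementation): the schema the LLM sees is derived from the function's signature and docstring, so a parameter without a default becomes a required argument, just like Loki's `@option --pattern!`.

```python
import inspect

def read_file(path: str, limit: int = 200) -> str:
    """Read a file, returning at most `limit` lines."""
    with open(path) as f:
        return "".join(f.readlines()[:limit])

def describe_tool(fn):
    """Simplified stand-in for what @tool derives automatically:
    name from the function, description from the docstring, and
    required args = parameters without defaults."""
    sig = inspect.signature(fn)
    required = [name for name, p in sig.parameters.items()
                if p.default is inspect.Parameter.empty]
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "args": list(sig.parameters),
        "required": required,
    }

schema = describe_tool(read_file)
# `path` is required (no default); `limit` is optional — mirroring
# Loki's `@option --pattern!` vs `@option --lines`.
```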

### State & Memory

| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| Agent session (conversation history) | `SisyphusState.messages` | `Annotated[list, add_messages]` — the reducer appends instead of replacing |
| `agent_session: temp` | `MemorySaver` checkpointer | Loki's temp sessions are ephemeral; MemorySaver is in-memory (lost on restart) |
| Per-agent isolation | Per-node system prompt + tools | In Loki, agents have separate sessions; in LangGraph they share messages but have different system prompts |
| `{{project_dir}}` variable | `SisyphusState.project_dir` | Loki interpolates variables into prompts; LangGraph stores them in state |
| `{{__tools__}}` injection | `llm.bind_tools()` | Loki injects tool descriptions into the prompt; LangChain attaches them to the API call |
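
The append-instead-of-replace behavior of the `add_messages` reducer can be illustrated with a dependency-free sketch (the real reducer lives in `langgraph.graph.message`, and the `SisyphusState` fields here are a simplified assumption of the project's schema):

```python
from typing import Annotated, TypedDict

def append_reducer(current: list, update: list) -> list:
    """Stand-in for add_messages: merge a node's output into
    existing state instead of overwriting it."""
    return current + update

class SisyphusState(TypedDict):
    # Each node returns {"messages": [response]}; the reducer appends.
    messages: Annotated[list, append_reducer]
    project_dir: str  # Loki's {{project_dir}} lives in state instead

def apply_update(state: SisyphusState, update: dict) -> SisyphusState:
    """Apply a node's partial update the way the graph runtime would:
    reduced fields are merged, plain fields are replaced."""
    merged = dict(state)
    for key, value in update.items():
        if key == "messages":
            merged[key] = append_reducer(state["messages"], value)
        else:
            merged[key] = value
    return merged  # type: ignore[return-value]

state: SisyphusState = {"messages": ["user: hi"], "project_dir": "."}
state = apply_update(state, {"messages": ["explore: FINDINGS ..."]})
```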

### Orchestration

| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| Intent classification table | `RoutingDecision` structured output | Loki does this in free text; LangGraph forces typed JSON |
| Oracle triggers ("How should I...") | Supervisor prompt + structured output | Same trigger phrases, enforced via system prompt |
| Coder delegation format | Supervisor builds HumanMessage | The structured prompt (Goal/Reference Files/Conventions/Constraints) |
| `agent__spawn` (parallel) | `Send()` API | Dynamic fan-out to multiple nodes |
| Todo system (`todo__init`, etc.) | `SisyphusState.todos` | State field with a merge reducer |
| `auto_continue: true` | Supervisor loop (iteration counter) | Supervisor re-routes until FINISH or max iterations |
| `max_auto_continues: 25` | `MAX_ITERATIONS = 15` | Safety valve to prevent infinite loops |
| `user__ask` / `user__confirm` | `interrupt()` API | Pauses graph, surfaces question to caller, resumes with answer |
| Escalation (child → parent → user) | `interrupt()` in any node | Any node can pause; the caller handles the interaction |
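
The `RoutingDecision` structured output can be sketched with a stdlib dataclass (the project itself presumably uses a Pydantic model passed to `llm.with_structured_output(...)`; the field names follow the JSON examples shown in the execution-flow walkthrough). The point of the typed schema is that an invalid routing target fails loudly instead of silently derailing the loop:

```python
from dataclasses import dataclass

VALID_AGENTS = {"explore", "oracle", "coder", "FINISH"}

@dataclass
class RoutingDecision:
    """Typed routing output. Loki routes via free-text tool calls;
    forcing a schema means the supervisor can never emit an
    unroutable answer."""
    intent: str
    next_agent: str
    delegation_notes: str

    def __post_init__(self):
        if self.next_agent not in VALID_AGENTS:
            raise ValueError(f"unknown agent: {self.next_agent}")

decision = RoutingDecision(
    intent="implementation",
    next_agent="explore",
    delegation_notes="Find existing API endpoint patterns",
)
```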

### Execution Model

| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| `loki --agent sisyphus` | `python -m sisyphus_langchain.cli` | CLI entry point |
| REPL mode | `cli.py` → `repl()` | Interactive loop with thread persistence |
| One-shot mode | `cli.py` → `run_query()` | Single query, print result, exit |
| Streaming output | `graph.stream()` | LangGraph supports per-node streaming |
| `inject_spawn_instructions` | (always on) | System prompts are always included |
| `inject_todo_instructions` | (always on) | Todo instructions could be added to prompts |

## How the Execution Flow Works

### 1. User sends a message

```python
graph.invoke({"messages": [HumanMessage("Add a health check endpoint")]})
```

### 2. Supervisor classifies intent

The supervisor LLM reads the message and produces a `RoutingDecision`:

```json
{
  "intent": "implementation",
  "next_agent": "explore",
  "delegation_notes": "Find existing API endpoint patterns, route structure, and health check conventions"
}
```

### 3. Supervisor routes via Command

```python
return Command(goto="explore", update={"intent": "implementation", "iteration_count": 1})
```

### 4. Explore agent runs

- Receives the full message history (including the user's request)
- Calls read-only tools (`search_content`, `search_files`, `read_file`)
- Returns findings in messages

### 5. Control returns to supervisor

The graph edge `explore → supervisor` fires automatically.

### 6. Supervisor reviews and routes again

Now it has explore's findings. It routes to coder with context:

```json
{
  "intent": "implementation",
  "next_agent": "coder",
  "delegation_notes": "Implement health check endpoint following patterns found in src/routes/"
}
```

### 7. Coder implements

- Reads explore's findings from the message history
- Writes files via the `write_file` tool
- Runs `verify_build` to check compilation

### 8. Supervisor verifies and finishes

```json
{
  "intent": "implementation",
  "next_agent": "FINISH",
  "delegation_notes": "Added /health endpoint in src/routes/health.py. Build passes."
}
```
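
The eight steps above reduce to a hub-and-spoke loop: the supervisor routes, a worker runs, and the worker's edge hands control straight back. A dependency-free sketch of that control flow (the hard-coded router below is a stand-in for the supervisor LLM; the real graph wires this with `StateGraph` edges and `Command(goto=...)`):

```python
def supervisor(state: dict) -> str:
    """Stand-in router: explore first, then code, then finish."""
    visited = state["visited"]
    if "explore" not in visited:
        return "explore"
    if "coder" not in visited:
        return "coder"
    return "FINISH"

def run_worker(name: str, state: dict) -> None:
    """Stand-in worker node: record that it ran and append a message."""
    state["visited"].append(name)
    state["messages"].append(f"{name}: done")

def run_graph(state: dict, max_iterations: int = 15) -> dict:
    """Hub-and-spoke loop: every worker edge returns to the supervisor,
    which re-routes until FINISH or MAX_ITERATIONS (= 15) is hit."""
    for _ in range(max_iterations):
        target = supervisor(state)
        if target == "FINISH":
            break
        run_worker(target, state)  # worker edge fires back to supervisor
    return state

result = run_graph({"visited": [], "messages": ["user: add /health"]})
```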

## Key Differences from Loki

### What LangGraph does better

1. **Declarative graph** — The topology is visible and debuggable. Loki's orchestration is emergent from the LLM's tool calls.
2. **Typed state** — `SisyphusState` is a TypedDict with reducers. Loki's state is implicit in the conversation.
3. **Checkpointing** — Built-in persistence. Loki manages sessions manually.
4. **Time-travel debugging** — Inspect any checkpoint. Loki has no equivalent.
5. **Structured routing** — `RoutingDecision` forces valid JSON. Loki relies on the LLM calling the right tool.

### What Loki does better

1. **True parallelism** — `agent__spawn` runs multiple agents concurrently in separate threads. This LangGraph implementation is sequential (see [Parallel Execution](#parallel-execution) for how to add it).
2. **Agent isolation** — Each Loki agent has its own session, tools, and config. LangGraph nodes share state.
3. **Teammate messaging** — Loki agents can send messages to siblings. LangGraph nodes communicate only through shared state.
4. **Dynamic tool compilation** — Loki compiles bash/python/typescript tools at startup. LangChain tools are statically defined.
5. **Escalation protocol** — Loki's child-to-parent escalation is sophisticated. LangGraph's `interrupt()` is simpler but less structured.
6. **Task queues with dependencies** — Loki's `agent__task_create` supports dependency DAGs. LangGraph's routing is simpler (hub-and-spoke).

## Running It

### Prerequisites

```bash
# Python 3.11+
python --version

# Set your API key
export OPENAI_API_KEY="sk-..."
```

### Install

```bash
cd examples/langchain-sisyphus

# With pip
pip install -e .

# Or with uv (recommended)
uv pip install -e .
```

### Usage

```bash
# Interactive REPL (like `loki --agent sisyphus`)
sisyphus

# One-shot query
sisyphus "Find all TODO comments in the codebase"

# With custom models (cost optimization)
sisyphus --explore-model gpt-4o-mini --coder-model gpt-4o "Add input validation to the API"

# Programmatic usage
python -c "
from sisyphus_langchain import build_graph
from langchain_core.messages import HumanMessage

graph = build_graph()
result = graph.invoke({
    'messages': [HumanMessage('What patterns does this codebase use?')],
    'intent': 'ambiguous',
    'next_agent': '',
    'iteration_count': 0,
    'todos': [],
    'agent_outputs': {},
    'final_output': '',
    'project_dir': '.',
}, config={'configurable': {'thread_id': 'demo'}, 'recursion_limit': 50})
print(result['final_output'])
"
```

### Using Anthropic Models

Replace `ChatOpenAI` with `ChatAnthropic` in the agent factories:

```python
from langchain_anthropic import ChatAnthropic

# In agents/oracle.py:
llm = ChatAnthropic(model="claude-sonnet-4-20250514", temperature=0.2).bind_tools(ORACLE_TOOLS)
```

## Deployment

### Option 1: Standalone Script (Simplest)

Just run the CLI directly. No infrastructure needed.

```bash
sisyphus "Add a health check endpoint"
```

### Option 2: FastAPI Server

```python
# server.py
from fastapi import FastAPI
from langserve import add_routes
from sisyphus_langchain import build_graph

app = FastAPI(title="Sisyphus API")
graph = build_graph()
add_routes(app, graph, path="/agent")

# Run: uvicorn server:app --host 0.0.0.0 --port 8000
# Call: POST http://localhost:8000/agent/invoke
```

### Option 3: LangGraph Platform (Production)

Create a `langgraph.json` at the project root:

```json
{
  "graphs": {
    "sisyphus": "./sisyphus_langchain/graph.py:build_graph"
  },
  "dependencies": ["./sisyphus_langchain"],
  "env": ".env"
}
```

Then deploy:

```bash
pip install langgraph-cli
langgraph deploy
```

This gives you:

- Durable checkpointing (PostgreSQL)
- Background runs
- Streaming API
- Zero-downtime deployments
- Built-in observability

### Option 4: Docker

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install -e .
CMD ["sisyphus"]
```

```bash
docker build -t sisyphus .
docker run -it -e OPENAI_API_KEY=$OPENAI_API_KEY sisyphus
```

## Parallel Execution

This implementation routes sequentially for simplicity. To add Loki-style parallel agent execution, use LangGraph's `Send()` API:

```python
from langchain_core.messages import HumanMessage
from langgraph.types import Send

def supervisor_node(state):
    # Fan out to multiple explore agents in parallel
    # (like Loki's agent__spawn called multiple times)
    return [
        Send("explore", {
            **state,
            "messages": state["messages"] + [
                HumanMessage("Find existing API endpoint patterns")
            ],
        }),
        Send("explore", {
            **state,
            "messages": state["messages"] + [
                HumanMessage("Find data models and database patterns")
            ],
        }),
    ]
```

This is equivalent to Loki's pattern of spawning multiple explore agents:

```
agent__spawn --agent explore --prompt "Find API patterns"
agent__spawn --agent explore --prompt "Find database patterns"
agent__collect --id <id1>
agent__collect --id <id2>
```

## Adding Human-in-the-Loop

To replicate Loki's `user__ask` / `user__confirm` tools, use LangGraph's `interrupt()`:

```python
from langgraph.types import interrupt

def supervisor_node(state):
    # Pause and ask the user (like Loki's user__ask)
    answer = interrupt({
        "question": "How should we structure the authentication?",
        "options": [
            "JWT with httpOnly cookies (Recommended)",
            "Session-based with Redis",
            "OAuth2 with external provider",
        ],
    })
    # `answer` contains the user's selection when the graph resumes
```

## Project Structure

```
examples/langchain-sisyphus/
├── pyproject.toml              # Dependencies & build config
├── README.md                   # This file
└── sisyphus_langchain/
    ├── __init__.py             # Package entry point
    ├── cli.py                  # CLI (REPL + one-shot mode)
    ├── graph.py                # Graph assembly (wires nodes + edges)
    ├── state.py                # Shared state schema (TypedDict)
    ├── agents/
    │   ├── __init__.py
    │   ├── supervisor.py       # Sisyphus orchestrator (intent → routing)
    │   ├── explore.py          # Read-only codebase researcher
    │   ├── oracle.py           # Architecture/debugging advisor
    │   └── coder.py            # Implementation worker
    └── tools/
        ├── __init__.py
        ├── filesystem.py       # File read/write/search/glob tools
        └── project.py          # Project detection, build, test tools
```

### File-to-Loki Mapping

| This Project | Loki Equivalent |
|---|---|
| `state.py` | Session context + todo state (implicit in Loki) |
| `graph.py` | `src/supervisor/mod.rs` (runtime orchestration) |
| `cli.py` | `src/main.rs` (CLI entry point) |
| `agents/supervisor.py` | `assets/agents/sisyphus/config.yaml` |
| `agents/explore.py` | `assets/agents/explore/config.yaml` + `tools.sh` |
| `agents/oracle.py` | `assets/agents/oracle/config.yaml` + `tools.sh` |
| `agents/coder.py` | `assets/agents/coder/config.yaml` + `tools.sh` |
| `tools/filesystem.py` | `assets/functions/tools/fs_*.sh` |
| `tools/project.py` | `assets/agents/.shared/utils.sh` + `sisyphus/tools.sh` |

## Further Reading

- [LangGraph Documentation](https://docs.langchain.com/langgraph/)
- [LangGraph Multi-Agent Tutorial](https://docs.langchain.com/langgraph/how-tos/multi-agent-systems)
- [Loki Agents Documentation](../../docs/AGENTS.md)
- [Loki Sisyphus README](../../assets/agents/sisyphus/README.md)
- [LangGraph Supervisor Library](https://github.com/langchain-ai/langgraph-supervisor-py)

@@ -0,0 +1,29 @@
[project]
name = "sisyphus-langchain"
version = "0.1.0"
description = "Loki's Sisyphus multi-agent orchestrator recreated in LangChain/LangGraph"
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
    "langgraph>=0.3.0",
    "langchain>=0.3.0",
    "langchain-openai>=0.3.0",
    "langchain-anthropic>=0.3.0",
    "langchain-core>=0.3.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0",
    "ruff>=0.8.0",
]
server = [
    "langgraph-api>=0.1.0",
]

[project.scripts]
sisyphus = "sisyphus_langchain.cli:main"

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

@@ -0,0 +1,5 @@
"""Sisyphus multi-agent orchestrator — a LangGraph recreation of Loki's Sisyphus agent."""

from sisyphus_langchain.graph import build_graph

__all__ = ["build_graph"]

@@ -0,0 +1 @@
"""Agent node definitions for the Sisyphus orchestrator."""
@@ -0,0 +1,145 @@
"""
Coder agent node — the implementation worker.

Loki equivalent: assets/agents/coder/config.yaml + tools.sh

In Loki, the coder is the ONLY agent that modifies files. It:
- Receives a structured prompt from sisyphus with code patterns to follow
- Writes files via the write_file tool (never pastes code in chat)
- Verifies builds after every change
- Signals CODER_COMPLETE or CODER_FAILED

In LangGraph, coder is a node with write-capable tools (read_file, write_file,
search_content, execute_command, verify_build). The supervisor formats a
structured delegation prompt (Goal / Reference Files / Code Patterns /
Conventions / Constraints) and routes to this node.

Key Loki→LangGraph mapping:
- Loki's "Coder Delegation Format" → the supervisor builds this as a
  HumanMessage before routing to the coder node.
- Loki's auto_continue (up to 15) → the supervisor can re-route to coder
  if verification fails, up to iteration_count limits.
- Loki's todo system for multi-file changes → the coder updates
  state["todos"] as it completes each file.
"""

from __future__ import annotations

from langchain_core.messages import SystemMessage
from langchain_openai import ChatOpenAI

from sisyphus_langchain.state import SisyphusState
from sisyphus_langchain.tools.filesystem import (
    read_file,
    search_content,
    search_files,
    write_file,
)
from sisyphus_langchain.tools.project import (
    execute_command,
    run_tests,
    verify_build,
)

# ---------------------------------------------------------------------------
# System prompt — faithfully mirrors coder/config.yaml
# ---------------------------------------------------------------------------
CODER_SYSTEM_PROMPT = """\
You are a senior engineer. You write code that works on the first try.

## Your Mission

Given an implementation task:
1. Check for context provided in the conversation (patterns, conventions, reference files).
2. Fill gaps only — read files NOT already covered in context.
3. Write the code using the write_file tool (NEVER output code in chat).
4. Verify it compiles/builds using verify_build.
5. Provide a summary of what you implemented.

## Using Provided Context (IMPORTANT)

Your prompt often contains prior findings from the explore agent: file paths,
code patterns, and conventions.

**If context is provided:**
1. Use it as your primary reference. Don't re-read files already summarized.
2. Follow the code patterns shown — snippets in context ARE the style guide.
3. Read referenced files ONLY IF you need more detail (full signatures, imports).
4. If context includes a "Conventions" section, follow it exactly.

**If context is NOT provided or is too vague:**
Fall back to self-exploration: search for similar files, read 1-2 examples,
match their style.

## Writing Code

CRITICAL: Write code using the write_file tool. NEVER paste code in chat.

## Pattern Matching

Before writing ANY file:
1. Find a similar existing file.
2. Match its style: imports, naming, structure.
3. Follow the same patterns exactly.

## Verification

After writing files:
1. Run verify_build to check compilation.
2. If it fails, fix the error (minimal change).
3. Don't move on until build passes.

## Rules

1. Write code via tools — never output code to chat.
2. Follow patterns — read existing files first.
3. Verify builds — don't finish without checking.
4. Minimal fixes — if build fails, fix precisely.
5. No refactoring — only implement what's asked.
"""

# Full tool set — coder gets write access and command execution
CODER_TOOLS = [
    read_file,
    write_file,
    search_content,
    search_files,
    execute_command,
    verify_build,
    run_tests,
]


def create_coder_node(model_name: str = "gpt-4o", temperature: float = 0.1):
    """
    Factory that returns a coder node function.

    Coder needs a capable model because it writes production code. In Loki,
    coder uses the same model as the parent by default.

    Args:
        model_name: Model identifier.
        temperature: LLM temperature (Loki coder uses 0.1 for consistency).
    """
    llm = ChatOpenAI(model=model_name, temperature=temperature).bind_tools(CODER_TOOLS)

    def coder_node(state: SisyphusState) -> dict:
        """
        LangGraph node: run the coder agent.

        Reads conversation history (including the supervisor's structured
        delegation prompt), invokes the LLM with write-capable tools,
        and returns the result.
        """
        response = llm.invoke(
            [SystemMessage(content=CODER_SYSTEM_PROMPT)] + state["messages"]
        )
        return {
            "messages": [response],
            "agent_outputs": {
                **state.get("agent_outputs", {}),
                "coder": response.content,
            },
        }

    return coder_node
@@ -0,0 +1,110 @@
"""
Explore agent node — the read-only codebase researcher.

Loki equivalent: assets/agents/explore/config.yaml + tools.sh

In Loki, the explore agent is spawned via `agent__spawn --agent explore --prompt "..."`
and runs as an isolated subprocess with its own session. It ends with
"EXPLORE_COMPLETE" so the parent knows it's finished.

In LangGraph, the explore agent is a *node* in the graph. The supervisor routes
to it via `Command(goto="explore")`. It reads the latest message (the supervisor's
delegation prompt), calls the LLM with read-only tools, and writes its findings
back to the shared message list. The graph edge then returns control to the
supervisor.

Key differences from Loki:
- No isolated session — shares the graph's message list (but has its own
  system prompt and tool set, just like Loki's per-agent config).
- No "EXPLORE_COMPLETE" sentinel — the graph edge handles control flow.
- No output summarization — LangGraph's state handles context management.
"""

from __future__ import annotations

from langchain_core.messages import SystemMessage
from langchain_openai import ChatOpenAI

from sisyphus_langchain.state import SisyphusState
from sisyphus_langchain.tools.filesystem import (
    list_directory,
    read_file,
    search_content,
    search_files,
)

# ---------------------------------------------------------------------------
# System prompt — faithfully mirrors explore/config.yaml
# ---------------------------------------------------------------------------
EXPLORE_SYSTEM_PROMPT = """\
You are a codebase explorer. Your job: Search, find, report. Nothing else.

## Your Mission

Given a search task, you:
1. Search for relevant files and patterns
2. Read key files to understand structure
3. Report findings concisely

## Strategy

1. **Find first, read second** — Never read a file without knowing why.
2. **Use search_content to locate** — find exactly where things are defined.
3. **Use search_files to discover** — find files by name pattern.
4. **Read targeted sections** — use offset and limit to read only relevant lines.
5. **Never read entire large files** — if a file is 500+ lines, read the relevant section only.

## Output Format

Always end your response with a structured findings summary:

FINDINGS:
- [Key finding 1]
- [Key finding 2]
- Relevant files: [list of paths]

## Rules

1. Be fast — don't read every file, read representative ones.
2. Be focused — answer the specific question asked.
3. Be concise — report findings, not your process.
4. Never modify files — you are read-only.
5. Limit reads — max 5 file reads per exploration.
"""

# Read-only tools — mirrors explore's tool set (no write_file, no execute_command)
EXPLORE_TOOLS = [read_file, search_content, search_files, list_directory]


def create_explore_node(model_name: str = "gpt-4o-mini", temperature: float = 0.1):
    """
    Factory that returns an explore node function bound to a specific model.

    In Loki, the model is set per-agent in config.yaml. Here we parameterize it
    so you can use a cheap model for exploration (cost optimization).

    Args:
        model_name: OpenAI model identifier.
        temperature: LLM temperature (Loki explore uses 0.1).
    """
    llm = ChatOpenAI(model=model_name, temperature=temperature).bind_tools(EXPLORE_TOOLS)

    def explore_node(state: SisyphusState) -> dict:
        """
        LangGraph node: run the explore agent.

        Reads the conversation history, applies the explore system prompt,
        invokes the LLM with read-only tools, and returns the response.
        """
        response = llm.invoke(
            [SystemMessage(content=EXPLORE_SYSTEM_PROMPT)] + state["messages"]
        )
        return {
            "messages": [response],
            "agent_outputs": {
                **state.get("agent_outputs", {}),
                "explore": response.content,
            },
        }

    return explore_node
@@ -0,0 +1,124 @@
"""
Oracle agent node — the high-IQ architecture and debugging advisor.

Loki equivalent: assets/agents/oracle/config.yaml + tools.sh

In Loki, the oracle is a READ-ONLY advisor spawned for:
- Architecture decisions and multi-system tradeoffs
- Complex debugging (after 2+ failed fix attempts)
- Code/design review
- Risk assessment

It uses temperature 0.2 (slightly higher than explore/coder for more creative
reasoning) and ends with "ORACLE_COMPLETE".

In LangGraph, oracle is a node that receives the full message history, reasons
about the problem, and writes structured advice back. It has read-only tools
only — it never modifies files.

Key Loki→LangGraph mapping:
- Loki oracle triggers (the "MUST spawn oracle when..." rules in sisyphus)
  become routing conditions in the supervisor node.
- Oracle's structured output format (Analysis/Recommendation/Reasoning/Risks)
  is enforced via the system prompt, same as in Loki.
"""

from __future__ import annotations

from langchain_core.messages import SystemMessage
from langchain_openai import ChatOpenAI

from sisyphus_langchain.state import SisyphusState
from sisyphus_langchain.tools.filesystem import (
    list_directory,
    read_file,
    search_content,
    search_files,
)

# ---------------------------------------------------------------------------
# System prompt — faithfully mirrors oracle/config.yaml
# ---------------------------------------------------------------------------
ORACLE_SYSTEM_PROMPT = """\
You are Oracle — a senior architect and debugger consulted for complex decisions.

## Your Role

You are READ-ONLY. You analyze, advise, and recommend. You do NOT implement.

## When You're Consulted

1. **Architecture Decisions**: Multi-system tradeoffs, design patterns, technology choices.
2. **Complex Debugging**: After 2+ failed fix attempts, deep analysis needed.
3. **Code Review**: Evaluating proposed designs or implementations.
4. **Risk Assessment**: Security, performance, or reliability concerns.

## Your Process

1. **Understand**: Read relevant code, understand the full context.
2. **Analyze**: Consider multiple angles and tradeoffs.
3. **Recommend**: Provide clear, actionable advice.
4. **Justify**: Explain your reasoning.

## Output Format

Structure your response as:

## Analysis
[Your understanding of the situation]

## Recommendation
[Clear, specific advice]

## Reasoning
[Why this is the right approach]

## Risks/Considerations
[What to watch out for]

## Rules

1. Never modify files — you advise, others implement.
2. Be thorough — read all relevant context before advising.
3. Be specific — general advice isn't helpful.
4. Consider tradeoffs — there are rarely perfect solutions.
5. Stay focused — answer the specific question asked.
"""

# Read-only tools — same set as explore (oracle never writes)
ORACLE_TOOLS = [read_file, search_content, search_files, list_directory]


def create_oracle_node(model_name: str = "gpt-4o", temperature: float = 0.2):
    """
    Factory that returns an oracle node function.

    Oracle uses a more expensive model than explore because it needs deeper
    reasoning. In Loki, the model is inherited from the global config unless
    overridden in oracle/config.yaml.

    Args:
        model_name: Model identifier (use a strong reasoning model).
        temperature: LLM temperature (Loki oracle uses 0.2).
    """
    llm = ChatOpenAI(model=model_name, temperature=temperature).bind_tools(ORACLE_TOOLS)

    def oracle_node(state: SisyphusState) -> dict:
        """
        LangGraph node: run the oracle agent.

        Reads conversation history, applies the oracle system prompt,
        invokes the LLM, and returns structured advice.
        """
        response = llm.invoke(
            [SystemMessage(content=ORACLE_SYSTEM_PROMPT)] + state["messages"]
        )
        return {
            "messages": [response],
            "agent_outputs": {
                **state.get("agent_outputs", {}),
                "oracle": response.content,
            },
        }

    return oracle_node
@@ -0,0 +1,227 @@
"""
Sisyphus supervisor node — the orchestrator that classifies intent and routes.

Loki equivalent: assets/agents/sisyphus/config.yaml

This is the brain of the system. In Loki, Sisyphus is the top-level agent that:
1. Classifies every incoming request (trivial / exploration / implementation /
   architecture / ambiguous)
2. Routes to the appropriate sub-agent (explore, coder, oracle)
3. Manages the todo list for multi-step tasks
4. Verifies results and decides when the task is complete

In LangGraph, the supervisor is a node that returns `Command(goto="agent_name")`
to route control. This replaces Loki's `agent__spawn` + `agent__collect` pattern
with a declarative graph edge.

Key Loki→LangGraph mapping:
- agent__spawn --agent explore  → Command(goto="explore")
- agent__spawn --agent coder    → Command(goto="coder")
- agent__spawn --agent oracle   → Command(goto="oracle")
- agent__check / agent__collect → (implicit: graph edges return to supervisor)
- todo__init / todo__add        → state["todos"] updates
- user__ask / user__confirm     → interrupt() for human-in-the-loop

Parallel execution note:
Loki can spawn multiple explore agents in parallel. In LangGraph, you'd use
the Send() API for dynamic fan-out. For simplicity, this implementation uses
sequential routing. See the README for how to add parallel fan-out.
"""

from __future__ import annotations

from typing import Literal

from langchain_core.messages import SystemMessage
from langchain_openai import ChatOpenAI
from langgraph.types import Command
from pydantic import BaseModel, Field

from sisyphus_langchain.state import SisyphusState

# ---------------------------------------------------------------------------
# Maximum iterations before forcing completion (safety valve)
# Mirrors Loki's max_auto_continues: 25
# ---------------------------------------------------------------------------
MAX_ITERATIONS = 15


# ---------------------------------------------------------------------------
# Structured output schema for the supervisor's routing decision.
#
# In Loki, the supervisor is an LLM that produces free-text and calls tools
# like agent__spawn. In LangGraph, we use structured output to force the
# LLM into a typed routing decision — more reliable than parsing free text.
# ---------------------------------------------------------------------------
class RoutingDecision(BaseModel):
    """The supervisor's decision about what to do next."""

    intent: Literal["trivial", "exploration", "implementation", "architecture", "ambiguous"] = Field(
        description="Classified intent of the user's request."
    )
    next_agent: Literal["explore", "oracle", "coder", "FINISH"] = Field(
        description=(
            "Which agent to route to. 'explore' for research/discovery, "
            "'oracle' for architecture/design/debugging advice, "
            "'coder' for implementation, 'FINISH' if the task is complete."
        )
    )
    delegation_notes: str = Field(
        description=(
            "Brief instructions for the target agent: what to look for (explore), "
            "what to analyze (oracle), or what to implement (coder). "
            "For FINISH, summarize what was accomplished."
        )
    )


# ---------------------------------------------------------------------------
# Supervisor system prompt — faithfully mirrors sisyphus/config.yaml
# ---------------------------------------------------------------------------
SUPERVISOR_SYSTEM_PROMPT = """\
You are Sisyphus — an orchestrator that drives coding tasks to completion.

Your job: Classify → Delegate → Verify → Complete.

## Intent Classification (BEFORE every action)

| Type           | Signal                                  | Action                                     |
|----------------|-----------------------------------------|--------------------------------------------|
| trivial        | Single file, known location, typo fix   | Route to FINISH                            |
| exploration    | "Find X", "Where is Y", "List all Z"    | Route to explore                           |
| implementation | "Add feature", "Fix bug", "Write code"  | Route to coder                             |
| architecture   | See oracle triggers below               | Route to oracle                            |
| ambiguous      | Unclear scope, multiple interpretations | Route to FINISH with a clarifying question |

## Oracle Triggers (MUST route to oracle when you see these)

Route to oracle ANY time the user asks about:
- "How should I..." / "What's the best way to..." — design/approach questions
- "Why does X keep..." / "What's wrong with..." — complex debugging
- "Should I use X or Y?" — technology or pattern choices
- "How should this be structured?" — architecture
- "Review this" / "What do you think of..." — code/design review
- Tradeoff questions, multi-component questions, vague/open-ended questions

## Agent Specializations

| Agent   | Use For                                   |
|---------|-------------------------------------------|
| explore | Find patterns, understand code, search    |
| coder   | Write/edit files, implement features      |
| oracle  | Architecture decisions, complex debugging |

## Workflow Patterns

### Implementation task: explore → coder
1. Route to explore to find existing patterns and conventions.
2. Review explore findings.
3. Route to coder with a structured prompt including the explore findings.
4. Verify the coder's output (check for CODER_COMPLETE or CODER_FAILED).

### Architecture question: explore + oracle
1. Route to explore to find relevant code.
2. Route to oracle with the explore findings for analysis.

### Simple question: oracle directly
For pure design/architecture questions, route to oracle directly.

## Rules

1. Always classify before acting.
2. You are a coordinator, not an implementer.
3. Route to oracle for ANY design/architecture question.
4. When routing to coder, include code patterns from explore findings.
5. Route to FINISH when the task is fully addressed.

## Current State

Iteration: {iteration_count}/{max_iterations}
Previous agent outputs: {agent_outputs}
"""


def create_supervisor_node(model_name: str = "gpt-4o", temperature: float = 0.1):
    """
    Factory that returns a supervisor node function.

    The supervisor uses a capable model for accurate routing.

    Args:
        model_name: Model identifier.
        temperature: LLM temperature (low for consistent routing).
    """
    llm = ChatOpenAI(model=model_name, temperature=temperature).with_structured_output(
        RoutingDecision
    )

    def supervisor_node(
        state: SisyphusState,
    ) -> Command[Literal["explore", "oracle", "coder", "__end__"]]:
        """
        LangGraph node: the Sisyphus supervisor.

        Classifies the user's intent, decides which agent to route to,
        and returns a Command that directs graph execution.
        """
        iteration = state.get("iteration_count", 0)

        # Safety valve — prevent infinite loops
        if iteration >= MAX_ITERATIONS:
            return Command(
                goto="__end__",
                update={
                    "final_output": "Reached maximum iterations. Here's what was accomplished:\n"
                    + "\n".join(
                        f"- {k}: {v[:200]}" for k, v in state.get("agent_outputs", {}).items()
                    ),
                },
            )

        # Format the system prompt with current state
        prompt = SUPERVISOR_SYSTEM_PROMPT.format(
            iteration_count=iteration,
            max_iterations=MAX_ITERATIONS,
            agent_outputs=_summarize_outputs(state.get("agent_outputs", {})),
        )

        # Invoke the LLM to get a structured routing decision
        decision: RoutingDecision = llm.invoke(
            [SystemMessage(content=prompt)] + state["messages"]
        )

        # Route to FINISH
        if decision.next_agent == "FINISH":
            return Command(
                goto="__end__",
                update={
                    "intent": decision.intent,
                    "next_agent": "FINISH",
                    "final_output": decision.delegation_notes,
                },
            )

        # Route to a worker agent
        return Command(
            goto=decision.next_agent,
            update={
                "intent": decision.intent,
                "next_agent": decision.next_agent,
                "iteration_count": iteration + 1,
            },
        )

    return supervisor_node


def _summarize_outputs(outputs: dict[str, str]) -> str:
    """Summarize agent outputs for the supervisor's context window."""
    if not outputs:
        return "(none yet)"
    parts = []
    for agent, output in outputs.items():
        # Truncate long outputs to keep supervisor context manageable
        # This mirrors Loki's summarization_threshold behavior
        if len(output) > 2000:
            output = output[:2000] + "... (truncated)"
        parts.append(f"[{agent}]: {output}")
    return "\n\n".join(parts)
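The truncation rule in `_summarize_outputs` is easy to verify in isolation. A standalone sketch that re-declares the helper locally (so it runs without the package installed; `summarize_outputs` here is a copy for demonstration, not an import):

```python
def summarize_outputs(outputs):
    """Local re-declaration of _summarize_outputs so the sketch is standalone."""
    if not outputs:
        return "(none yet)"
    parts = []
    for agent, output in outputs.items():
        # Same rule as the supervisor: cap each agent's output at 2000 chars
        if len(output) > 2000:
            output = output[:2000] + "... (truncated)"
        parts.append(f"[{agent}]: {output}")
    return "\n\n".join(parts)

short = summarize_outputs({"explore": "found 3 auth patterns"})
long_out = summarize_outputs({"coder": "x" * 3000})
# short passes through unchanged; long_out is capped and marked truncated
```

This keeps the supervisor's formatted prompt bounded no matter how verbose a worker was, which matters because the summary is interpolated into the system prompt on every routing step.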
@@ -0,0 +1,155 @@
"""
CLI entry point for the Sisyphus LangChain agent.

This mirrors Loki's `loki --agent sisyphus` entry point.

In Loki:
    loki --agent sisyphus
    # Starts a REPL with the sisyphus agent loaded

In this LangChain version:
    python -m sisyphus_langchain.cli
    # or: sisyphus (if installed via pip)

Usage:
    # Interactive REPL mode
    sisyphus

    # One-shot query
    sisyphus "Add a health check endpoint to the API"

    # With custom models
    sisyphus --supervisor-model gpt-4o --explore-model gpt-4o-mini "Find auth patterns"

Environment variables:
    OPENAI_API_KEY    — Required for OpenAI models
    ANTHROPIC_API_KEY — Required if using Anthropic models
"""

from __future__ import annotations

import argparse
import uuid

from langchain_core.messages import HumanMessage

from sisyphus_langchain.graph import build_graph


def run_query(graph, query: str, thread_id: str) -> str:
    """
    Run a single query through the Sisyphus graph.

    Args:
        graph: Compiled LangGraph.
        query: User's natural language request.
        thread_id: Session identifier for checkpointing.

    Returns:
        The final output string.
    """
    result = graph.invoke(
        {
            "messages": [HumanMessage(content=query)],
            "intent": "ambiguous",
            "next_agent": "",
            "iteration_count": 0,
            "todos": [],
            "agent_outputs": {},
            "final_output": "",
            "project_dir": ".",
        },
        config={
            "configurable": {"thread_id": thread_id},
            "recursion_limit": 50,
        },
    )
    return result.get("final_output", "(no output)")


def repl(graph, thread_id: str) -> None:
    """
    Interactive REPL loop — mirrors Loki's REPL mode.

    Maintains conversation across turns via the thread_id (checkpointer).
    """
    print("Sisyphus (LangChain) — type 'quit' to exit")
    print("=" * 50)

    while True:
        try:
            query = input("\n> ").strip()
        except (EOFError, KeyboardInterrupt):
            print("\nBye.")
            break

        if not query:
            continue
        if query.lower() in ("quit", "exit", "q"):
            print("Bye.")
            break

        try:
            output = run_query(graph, query, thread_id)
            print(f"\n{output}")
        except Exception as e:
            print(f"\nError: {e}")


def main() -> None:
    """CLI entry point."""
    parser = argparse.ArgumentParser(
        description="Sisyphus — multi-agent coding orchestrator (LangChain edition)"
    )
    parser.add_argument(
        "query",
        nargs="?",
        help="One-shot query (omit for REPL mode)",
    )
    parser.add_argument(
        "--supervisor-model",
        default="gpt-4o",
        help="Model for the supervisor (default: gpt-4o)",
    )
    parser.add_argument(
        "--explore-model",
        default="gpt-4o-mini",
        help="Model for the explore agent (default: gpt-4o-mini)",
    )
    parser.add_argument(
        "--oracle-model",
        default="gpt-4o",
        help="Model for the oracle agent (default: gpt-4o)",
    )
    parser.add_argument(
        "--coder-model",
        default="gpt-4o",
        help="Model for the coder agent (default: gpt-4o)",
    )
    parser.add_argument(
        "--thread-id",
        default=None,
        help="Session thread ID for persistence (auto-generated if omitted)",
    )

    args = parser.parse_args()

    graph = build_graph(
        supervisor_model=args.supervisor_model,
        explore_model=args.explore_model,
        oracle_model=args.oracle_model,
        coder_model=args.coder_model,
    )

    thread_id = args.thread_id or f"sisyphus-{uuid.uuid4().hex[:8]}"

    if args.query:
        output = run_query(graph, args.query, thread_id)
        print(output)
    else:
        repl(graph, thread_id)


if __name__ == "__main__":
    main()
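The flag handling in `main()` can be exercised directly by passing an explicit argv to `parse_args`, which is how a one-shot invocation like `sisyphus --supervisor-model gpt-4o-mini "Find auth patterns"` is interpreted. A reduced sketch with only two of the arguments (the parser here is a toy, not the real CLI's):

```python
import argparse

# Reduced copy of the CLI: a positional one-shot query plus one model flag.
parser = argparse.ArgumentParser(description="sisyphus (sketch)")
parser.add_argument("query", nargs="?", help="One-shot query (omit for REPL mode)")
parser.add_argument("--supervisor-model", default="gpt-4o")

# Flag and positional can be intermixed, as in the real CLI
one_shot = parser.parse_args(["--supervisor-model", "gpt-4o-mini", "Find auth patterns"])
# With no argv at all, query is None, which is what triggers REPL mode
repl_mode = parser.parse_args([])
```

`nargs="?"` is what makes the query optional: the presence or absence of `args.query` is the entire REPL-vs-one-shot dispatch.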
@@ -0,0 +1,115 @@
"""
Graph assembly — wires together the supervisor and worker nodes.

This is the LangGraph equivalent of Loki's runtime agent execution engine
(src/supervisor/mod.rs + src/config/request_context.rs).

In Loki, the runtime:
1. Loads the agent config (config.yaml)
2. Compiles tools (tools.sh → binary)
3. Starts a chat loop: user → LLM → tool calls → LLM → ...
4. For orchestrators with can_spawn_agents: true, the supervisor module
   manages child agent lifecycle (spawn, check, collect, cancel).

In LangGraph, all of this is declarative:
1. Define nodes (supervisor, explore, oracle, coder)
2. Define edges (workers always return to supervisor)
3. Compile the graph (with optional checkpointer for persistence)
4. Invoke with initial state

The graph topology:

    ┌─────────────────────────────────────────────┐
    │                 SUPERVISOR                  │
    │   (classifies intent, routes to workers)    │
    └─────┬──────────┬──────────┬─────────────────┘
          │          │          │
          ▼          ▼          ▼
     ┌────────┐ ┌────────┐ ┌────────┐
     │EXPLORE │ │ ORACLE │ │ CODER  │
     │(search)│ │(advise)│ │(build) │
     └───┬────┘ └───┬────┘ └───┬────┘
         │          │          │
         └──────────┼──────────┘
                    │
           (back to supervisor)

Every worker returns to the supervisor. The supervisor decides what to do next:
route to another worker, or end the graph.
"""

from __future__ import annotations

from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, StateGraph

from sisyphus_langchain.agents.coder import create_coder_node
from sisyphus_langchain.agents.explore import create_explore_node
from sisyphus_langchain.agents.oracle import create_oracle_node
from sisyphus_langchain.agents.supervisor import create_supervisor_node
from sisyphus_langchain.state import SisyphusState


def build_graph(
    *,
    supervisor_model: str = "gpt-4o",
    explore_model: str = "gpt-4o-mini",
    oracle_model: str = "gpt-4o",
    coder_model: str = "gpt-4o",
    use_checkpointer: bool = True,
):
    """
    Build and compile the Sisyphus LangGraph.

    This is the main entry point for creating the agent system. It wires
    together all nodes and edges, optionally adds a checkpointer for
    persistence, and returns a compiled graph ready to invoke.

    Args:
        supervisor_model: Model for the routing supervisor.
        explore_model: Model for the explore agent (can be cheaper).
        oracle_model: Model for the oracle agent (should be strong).
        coder_model: Model for the coder agent.
        use_checkpointer: Whether to add MemorySaver for session persistence.

    Returns:
        A compiled LangGraph ready to .invoke() or .stream().

    Model cost optimization (mirrors Loki's per-agent model config):
    - supervisor: expensive (accurate routing is critical)
    - explore: cheap (just searching, not reasoning deeply)
    - oracle: expensive (deep reasoning, architecture advice)
    - coder: expensive (writing correct code matters)
    """
    # Create the graph builder with our typed state
    builder = StateGraph(SisyphusState)

    # ── Register nodes ─────────────────────────────────────────────────
    # Each node is a function that takes state and returns state updates.
    # This mirrors Loki's agent registration (agents are discovered by
    # their config.yaml in the agents/ directory).
    builder.add_node("supervisor", create_supervisor_node(supervisor_model))
    builder.add_node("explore", create_explore_node(explore_model))
    builder.add_node("oracle", create_oracle_node(oracle_model))
    builder.add_node("coder", create_coder_node(coder_model))

    # ── Define edges ───────────────────────────────────────────────────
    # Entry point: every invocation starts at the supervisor
    builder.add_edge(START, "supervisor")

    # Workers always return to supervisor (the hub-and-spoke pattern).
    # In Loki, this is implicit: agent__collect returns output to the parent,
    # and the parent (sisyphus) decides what to do next.
    builder.add_edge("explore", "supervisor")
    builder.add_edge("oracle", "supervisor")
    builder.add_edge("coder", "supervisor")

    # The supervisor node itself uses Command(goto=...) to route,
    # so we don't need add_conditional_edges — the Command API
    # handles dynamic routing internally.

    # ── Compile ────────────────────────────────────────────────────────
    checkpointer = MemorySaver() if use_checkpointer else None
    graph = builder.compile(checkpointer=checkpointer)

    return graph
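The hub-and-spoke control flow above (supervisor routes, every worker edge leads straight back, termination via `__end__`) can be simulated in plain Python to see the loop actually terminate. All names below are illustrative stand-ins for the real graph, not LangGraph APIs:

```python
def run_hub_and_spoke(route_plan, max_iterations=15):
    """Toy simulation: the supervisor pops the next destination each visit,
    each worker hands control straight back, and '__end__' stops the loop."""
    visited = []
    iteration = 0
    node = "supervisor"
    while node != "__end__":
        if node == "supervisor":
            # Safety valve, analogous to MAX_ITERATIONS in the real supervisor
            if iteration >= max_iterations or not route_plan:
                node = "__end__"
            else:
                node = route_plan.pop(0)
                iteration += 1
        else:
            visited.append(node)
            node = "supervisor"  # fixed edge: worker -> supervisor
    return visited

order = run_hub_and_spoke(["explore", "coder"])
```

Two properties of the real graph show up here: workers never talk to each other directly, and the iteration cap guarantees the loop ends even if the routing policy never chooses to finish.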
@@ -0,0 +1,100 @@
"""
Shared state schema for the Sisyphus orchestrator graph.

In LangGraph, state is the single source of truth that flows through every node.
This is analogous to Loki's per-agent session context, but unified into one typed
dictionary that the entire graph shares.

Loki Concept Mapping:
- Loki session context        → SisyphusState (TypedDict)
- Loki todo__init / todo__add → SisyphusState.todos list
- Loki agent__spawn outputs   → SisyphusState.agent_outputs dict
- Loki intent classification  → SisyphusState.intent field
"""

from __future__ import annotations

from dataclasses import dataclass
from typing import Annotated, Literal

from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages
from typing_extensions import TypedDict

# ---------------------------------------------------------------------------
# Intent types — mirrors Loki's Sisyphus classification table
# ---------------------------------------------------------------------------
IntentType = Literal[
    "trivial",         # Single file, known location, typo fix → handle yourself
    "exploration",     # "Find X", "Where is Y" → spawn explore
    "implementation",  # "Add feature", "Fix bug" → spawn coder
    "architecture",    # Design questions, oracle triggers → spawn oracle
    "ambiguous",       # Unclear scope → ask user
]


# ---------------------------------------------------------------------------
# Todo item — mirrors Loki's built-in todo system
# ---------------------------------------------------------------------------
@dataclass
class TodoItem:
    """A single task in the orchestrator's todo list."""

    id: int
    task: str
    done: bool = False


def _merge_todos(existing: list[TodoItem], new: list[TodoItem]) -> list[TodoItem]:
    """
    Reducer for the todos field.

    LangGraph requires a reducer for any state field that can be written by
    multiple nodes. This merges by id: if a todo with the same id already
    exists, the incoming version wins (allows marking done).
    """
    by_id = {t.id: t for t in existing}
    for t in new:
        by_id[t.id] = t
    return list(by_id.values())


# ---------------------------------------------------------------------------
# Core graph state
# ---------------------------------------------------------------------------
class SisyphusState(TypedDict):
    """
    The shared state that flows through every node in the Sisyphus graph.

    Annotated fields use *reducers* — functions that merge concurrent writes.
    Without reducers, parallel node outputs would overwrite each other.
    """

    # Conversation history — the `add_messages` reducer appends new messages
    # instead of replacing the list. This is critical: every node adds its
    # response here, and downstream nodes see the full history.
    #
    # Loki equivalent: each agent's chat session accumulates messages the same
    # way, but messages are scoped per-agent. In LangGraph the shared message
    # list IS the inter-agent communication channel.
    messages: Annotated[list[BaseMessage], add_messages]

    # Classified intent for the current request
    intent: IntentType

    # Which agent the supervisor routed to last
    next_agent: str

    # Iteration counter — safety valve analogous to Loki's max_auto_continues
    iteration_count: int

    # Todo list for multi-step tracking (mirrors Loki's todo__* tools)
    todos: Annotated[list[TodoItem], _merge_todos]

    # Accumulated outputs from sub-agent nodes, keyed by agent name.
    # The supervisor reads these to decide what to do next.
    agent_outputs: dict[str, str]

    # Final synthesized answer to return to the user
    final_output: str

    # The working directory / project path (mirrors Loki's project_dir variable)
    project_dir: str
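The id-keyed merge rule that `_merge_todos` implements can be checked in isolation. A standalone sketch that re-declares the dataclass and the reducer locally (local copies for illustration, not imports from the package):

```python
from dataclasses import dataclass

@dataclass
class TodoItem:
    id: int
    task: str
    done: bool = False

def merge_todos(existing, new):
    # Same rule as _merge_todos: an incoming item with a matching id wins,
    # which is how a node marks an existing todo as done.
    by_id = {t.id: t for t in existing}
    for t in new:
        by_id[t.id] = t
    return list(by_id.values())

todos = merge_todos(
    [TodoItem(1, "find patterns"), TodoItem(2, "write code")],
    [TodoItem(1, "find patterns", done=True), TodoItem(3, "verify")],
)
# todo 1 is replaced by its done version, todo 2 survives, todo 3 is appended
```

This is why concurrent writers are safe here: two nodes adding todos with distinct ids both land in the merged list, while a re-write of the same id is a deliberate update rather than a lost write.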
@@ -0,0 +1 @@
"""Tool definitions for Sisyphus agents."""
@@ -0,0 +1,175 @@
"""
Filesystem tools for Sisyphus agents.

These are the LangChain equivalents of Loki's global tools:
- fs_read.sh → read_file
- fs_grep.sh → search_content
- fs_glob.sh → search_files
- fs_ls.sh → list_directory
- fs_write.sh → write_file
- fs_patch.sh → (omitted — write_file covers full rewrites)

Loki Concept Mapping:
    Loki tools are bash scripts with @cmd annotations that Loki's compiler
    turns into function-calling declarations. In LangChain, we use the @tool
    decorator which serves the same purpose: it generates the JSON schema
    that the LLM sees, and wraps the Python function for execution.
"""

from __future__ import annotations

import fnmatch
import os
import re
import subprocess

from langchain_core.tools import tool


@tool
def read_file(path: str, offset: int = 1, limit: int = 200) -> str:
    """Read a file's contents with optional line range.

    Args:
        path: Path to the file (absolute or relative to cwd).
        offset: 1-based line number to start from.
        limit: Maximum number of lines to return.
    """
    path = os.path.expanduser(path)
    if not os.path.isfile(path):
        return f"Error: file not found: {path}"

    try:
        with open(path, "r", encoding="utf-8", errors="replace") as f:
            lines = f.readlines()
    except Exception as e:
        return f"Error reading {path}: {e}"

    total = len(lines)
    start = max(0, offset - 1)
    end = min(total, start + limit)
    selected = lines[start:end]

    result = f"File: {path} (lines {start + 1}-{end} of {total})\n\n"
    for i, line in enumerate(selected, start=start + 1):
        result += f"{i}: {line}"

    if end < total:
        result += f"\n... truncated ({total} total lines)"

    return result


@tool
def write_file(path: str, content: str) -> str:
    """Write complete contents to a file, creating parent directories as needed.

    Args:
        path: Path for the file.
        content: Complete file contents to write.
    """
    path = os.path.expanduser(path)
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    try:
        with open(path, "w", encoding="utf-8") as f:
            f.write(content)
        return f"Wrote: {path}"
    except Exception as e:
        return f"Error writing {path}: {e}"


@tool
def search_content(pattern: str, directory: str = ".", file_type: str = "") -> str:
    """Search for a text/regex pattern in files under a directory.

    Args:
        pattern: Text or regex pattern to search for.
        directory: Root directory to search in.
        file_type: Optional file extension filter (e.g. "py", "rs").
    """
    directory = os.path.expanduser(directory)
    cmd = ["grep", "-rn"]
    if file_type:
        cmd += [f"--include=*.{file_type}"]
    cmd += [pattern, directory]

    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
        lines = result.stdout.strip().splitlines()
    except Exception as e:
        return f"Error: {e}"

    # Filter noise
    noise = {"/.git/", "/node_modules/", "/target/", "/dist/", "/__pycache__/"}
    filtered = [l for l in lines if not any(n in l for n in noise)][:30]

    if not filtered:
        return "No matches found."
    return "\n".join(filtered)


@tool
def search_files(pattern: str, directory: str = ".") -> str:
    """Find files matching a glob pattern.

    Args:
        pattern: Glob pattern (e.g. '*.py', 'config*', '*test*').
        directory: Directory to search in.
    """
    directory = os.path.expanduser(directory)
    noise = {".git", "node_modules", "target", "dist", "__pycache__"}
    matches: list[str] = []

    for root, dirs, files in os.walk(directory):
        dirs[:] = [d for d in dirs if d not in noise]
        for name in files:
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(root, name))
                if len(matches) >= 25:
                    break
        if len(matches) >= 25:
            break

    if not matches:
        return "No files found."
    return "\n".join(matches)


@tool
def list_directory(path: str = ".", max_depth: int = 3) -> str:
    """List directory tree structure.

    Args:
        path: Directory to list.
        max_depth: Maximum depth to recurse.
    """
    path = os.path.expanduser(path)
    if not os.path.isdir(path):
        return f"Error: not a directory: {path}"

    noise = {".git", "node_modules", "target", "dist", "__pycache__", ".venv", "venv"}
    lines: list[str] = []

    def _walk(dir_path: str, prefix: str, depth: int) -> None:
        if depth > max_depth:
            return
        try:
            entries = sorted(os.listdir(dir_path))
        except PermissionError:
            return

        dirs = [e for e in entries if os.path.isdir(os.path.join(dir_path, e)) and e not in noise]
        files = [e for e in entries if os.path.isfile(os.path.join(dir_path, e))]

        for f in files[:20]:
            lines.append(f"{prefix}{f}")
        if len(files) > 20:
            lines.append(f"{prefix}... ({len(files) - 20} more files)")

        for d in dirs:
            lines.append(f"{prefix}{d}/")
            _walk(os.path.join(dir_path, d), prefix + "  ", depth + 1)

    lines.append(f"{os.path.basename(path) or path}/")
    _walk(path, "  ", 1)
    return "\n".join(lines[:200])
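The 1-based offset/limit windowing that read_file uses can be exercised in isolation (stdlib-only sketch; `window` is a hypothetical helper extracted from the slice arithmetic above, not part of the repo):

```python
def window(lines: list[str], offset: int = 1, limit: int = 200):
    # Mirrors read_file's arithmetic: 1-based offset, clamped to file bounds.
    total = len(lines)
    start = max(0, offset - 1)
    end = min(total, start + limit)
    # Returns the 1-based first/last line numbers and the selected slice.
    return start + 1, end, lines[start:end]


first, last, sel = window(["a", "b", "c", "d", "e"], offset=2, limit=2)
print(first, last, sel)  # 2 3 ['b', 'c']
```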
@@ -0,0 +1,142 @@
"""
Project detection and build/test tools.

These mirror Loki's .shared/utils.sh detect_project() heuristic and the
sisyphus/coder tools.sh run_build / run_tests / verify_build commands.

Loki Concept Mapping:
    Loki uses a heuristic cascade: check for Cargo.toml → go.mod → package.json
    etc., then falls back to an LLM call for unknown projects. We replicate the
    heuristic portion here. The LLM fallback is omitted since the agents
    themselves can reason about unknown project types.
"""

from __future__ import annotations

import json
import os
import subprocess

from langchain_core.tools import tool

# ---------------------------------------------------------------------------
# Project detection (mirrors _detect_heuristic in utils.sh)
# ---------------------------------------------------------------------------
_HEURISTICS: list[tuple[str, dict[str, str]]] = [
    ("Cargo.toml", {"type": "rust", "build": "cargo build", "test": "cargo test", "check": "cargo check"}),
    ("go.mod", {"type": "go", "build": "go build ./...", "test": "go test ./...", "check": "go vet ./..."}),
    ("package.json", {"type": "nodejs", "build": "npm run build", "test": "npm test", "check": "npm run lint"}),
    ("pyproject.toml", {"type": "python", "build": "", "test": "pytest", "check": "ruff check ."}),
    ("pom.xml", {"type": "java", "build": "mvn compile", "test": "mvn test", "check": "mvn verify"}),
    ("Makefile", {"type": "make", "build": "make build", "test": "make test", "check": "make lint"}),
]


def detect_project(directory: str) -> dict[str, str]:
    """Detect project type and return build/test commands."""
    for marker, info in _HEURISTICS:
        if os.path.exists(os.path.join(directory, marker)):
            return info
    return {"type": "unknown", "build": "", "test": "", "check": ""}


@tool
def get_project_info(directory: str = ".") -> str:
    """Detect the project type and show structure overview.

    Args:
        directory: Project root directory.
    """
    directory = os.path.expanduser(directory)
    info = detect_project(directory)
    result = f"Project: {os.path.abspath(directory)}\n"
    result += f"Type: {info['type']}\n"
    result += f"Build: {info['build'] or '(none)'}\n"
    result += f"Test: {info['test'] or '(none)'}\n"
    result += f"Check: {info['check'] or '(none)'}\n"
    return result


def _run_project_command(directory: str, command_key: str) -> str:
    """Run a detected project command (build/test/check)."""
    directory = os.path.expanduser(directory)
    info = detect_project(directory)
    cmd = info.get(command_key, "")

    if not cmd:
        return f"No {command_key} command detected for this project."

    try:
        result = subprocess.run(
            cmd,
            shell=True,
            capture_output=True,
            text=True,
            cwd=directory,
            timeout=300,
        )
        output = result.stdout + result.stderr
        status = "SUCCESS" if result.returncode == 0 else f"FAILED (exit {result.returncode})"
        return f"Running: {cmd}\n\n{output}\n\n{command_key.upper()}: {status}"
    except subprocess.TimeoutExpired:
        return f"{command_key.upper()}: TIMEOUT after 300s"
    except Exception as e:
        return f"{command_key.upper()}: ERROR — {e}"


@tool
def run_build(directory: str = ".") -> str:
    """Run the project's build command.

    Args:
        directory: Project root directory.
    """
    return _run_project_command(directory, "build")


@tool
def run_tests(directory: str = ".") -> str:
    """Run the project's test suite.

    Args:
        directory: Project root directory.
    """
    return _run_project_command(directory, "test")


@tool
def verify_build(directory: str = ".") -> str:
    """Run the project's check/lint command to verify correctness.

    Args:
        directory: Project root directory.
    """
    return _run_project_command(directory, "check")


@tool
def execute_command(command: str, directory: str = ".") -> str:
    """Execute a shell command and return its output.

    Args:
        command: Shell command to execute.
        directory: Working directory.
    """
    directory = os.path.expanduser(directory)
    try:
        result = subprocess.run(
            command,
            shell=True,
            capture_output=True,
            text=True,
            cwd=directory,
            timeout=120,
        )
        output = (result.stdout + result.stderr).strip()
        if result.returncode != 0:
            return f"Command failed (exit {result.returncode}):\n{output}"
        return output or "(no output)"
    except subprocess.TimeoutExpired:
        return "Command timed out after 120s."
    except Exception as e:
        return f"Error: {e}"
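The first-marker-wins cascade in detect_project can be verified against a scratch directory (stdlib-only sketch; `HEURISTICS`/`detect` are simplified stand-ins for the `_HEURISTICS` list and `detect_project` above — earlier markers take precedence, so Cargo.toml beats Makefile):

```python
import os
import tempfile

HEURISTICS = [
    ("Cargo.toml", "rust"),
    ("go.mod", "go"),
    ("package.json", "nodejs"),
    ("pyproject.toml", "python"),
    ("pom.xml", "java"),
    ("Makefile", "make"),
]


def detect(directory: str) -> str:
    # First marker found wins, mirroring detect_project's list order.
    for marker, kind in HEURISTICS:
        if os.path.exists(os.path.join(directory, marker)):
            return kind
    return "unknown"


with tempfile.TemporaryDirectory() as d:
    print(detect(d))  # unknown
    open(os.path.join(d, "Makefile"), "w").close()
    print(detect(d))  # make
    open(os.path.join(d, "Cargo.toml"), "w").close()
    print(detect(d))  # rust — the earlier marker takes precedence
```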