
# Sisyphus in LangChain/LangGraph

A faithful recreation of Loki's Sisyphus agent using LangGraph — LangChain's framework for stateful, multi-agent workflows.

This project exists to help you understand LangChain/LangGraph by mapping every concept to its Loki equivalent.

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                     SUPERVISOR NODE                          │
│  Intent classification → Routing decision → Command(goto=)   │
│                                                              │
│  Loki equivalent: sisyphus/config.yaml                       │
│  (agent__spawn → Command, agent__collect → graph edge)       │
└──────────┬──────────────┬──────────────┬────────────────────┘
           │              │              │
           ▼              ▼              ▼
    ┌────────────┐ ┌────────────┐ ┌────────────┐
    │  EXPLORE   │ │   ORACLE   │ │   CODER    │
    │ (research) │ │  (advise)  │ │  (build)   │
    │            │ │            │ │            │
    │ read-only  │ │ read-only  │ │ read+write │
    │ tools      │ │ tools      │ │ tools      │
    └─────┬──────┘ └─────┬──────┘ └─────┬──────┘
          │              │              │
          └──────────────┼──────────────┘
                         │
                  back to supervisor
```

## Concept Map: Loki → LangGraph

This is the key reference. Every row maps a Loki concept to its LangGraph equivalent.

### Core Architecture

| Loki Concept | LangGraph Equivalent | Where in Code |
|---|---|---|
| Agent config (`config.yaml`) | Node function + system prompt | `agents/explore.py`, etc. |
| Agent instructions | System prompt string | `EXPLORE_SYSTEM_PROMPT`, etc. |
| Agent tools (`tools.sh`) | `@tool`-decorated Python functions | `tools/filesystem.py`, `tools/project.py` |
| Agent session (chat loop) | Graph state + message list | `state.py` → `SisyphusState.messages` |
| `agent__spawn --agent X` | `Command(goto="X")` | `agents/supervisor.py` |
| `agent__collect --id` | Graph edge (implicit — workers return to supervisor) | `graph.py` → `add_edge("explore", "supervisor")` |
| `agent__check` (non-blocking) | Not needed (graph handles scheduling) | — |
| `agent__cancel` | Not needed (graph handles lifecycle) | — |
| `can_spawn_agents: true` | Node has routing logic (supervisor) | `agents/supervisor.py` |
| `max_concurrent_agents: 4` | `Send()` API for parallel fan-out | See Parallel Execution |
| `max_agent_depth: 3` | `recursion_limit` in config | `cli.py` → `recursion_limit: 50` |
| `summarization_threshold` | Manual truncation in supervisor | `supervisor.py` → `_summarize_outputs()` |

### Tool System

| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| `tools.sh` with `@cmd` annotations | `@tool` decorator | Loki compiles bash annotations to JSON schema; LangChain generates the schema from the Python function signature + docstring |
| `@option --pattern!` (required arg) | Function parameter without default | `def search_content(pattern: str)` |
| `@option --lines` (optional arg) | Parameter with default | `def read_file(path: str, limit: int = 200)` |
| `@env LLM_OUTPUT=/dev/stdout` | Return value | LangChain tools return strings; Loki tools write to `$LLM_OUTPUT` |
| `@describe` | Docstring | The tool's docstring becomes the description the LLM sees |
| Global tools (`fs_read.sh`, etc.) | Shared tool imports | Both agents import from `tools/filesystem.py` |
| Agent-specific tools | Per-node tool binding | `llm.bind_tools(EXPLORE_TOOLS)` vs `llm.bind_tools(CODER_TOOLS)` |
| `.shared/utils.sh` | `tools/project.py` | Shared project detection utilities |
| `detect_project()` heuristic | `detect_project()` in Python | Same logic: check `Cargo.toml` → `go.mod` → `package.json` → etc. |
| LLM fallback for unknown projects | (omitted) | The agents themselves can reason about unknown project types |
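
To make the first row of this mapping concrete: LangChain's `@tool` decorator derives a tool schema from the function signature and docstring, much as Loki compiles `@cmd`/`@option` annotations. Here is a simplified stand-in using only `inspect` (no LangChain), showing how required vs. optional parameters fall out of defaults:

```python
import inspect
from typing import get_type_hints

def read_file(path: str, limit: int = 200) -> str:
    """Read a file and return up to `limit` lines."""
    ...

def tool_schema(fn):
    """Derive a JSON-schema-like tool description from a function's
    signature and docstring — a rough sketch of what @tool does."""
    hints = get_type_hints(fn)
    props, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        props[name] = {"type": hints.get(name, str).__name__}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default == required, like Loki's `--pattern!`
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": {"type": "object", "properties": props, "required": required},
    }

schema = tool_schema(read_file)
```

With this sketch, `schema["parameters"]["required"]` contains only `path`, mirroring the `--pattern!` vs. `--lines` distinction in the table.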

### State & Memory

| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| Agent session (conversation history) | `SisyphusState.messages` | `Annotated[list, add_messages]` — the reducer appends instead of replacing |
| `agent_session: temp` | `MemorySaver` checkpointer | Loki's temp sessions are ephemeral; `MemorySaver` is in-memory (lost on restart) |
| Per-agent isolation | Per-node system prompt + tools | In Loki, agents have separate sessions; in LangGraph they share `messages` but have different system prompts |
| `{{project_dir}}` variable | `SisyphusState.project_dir` | Loki interpolates variables into prompts; LangGraph stores them in state |
| `{{__tools__}}` injection | `llm.bind_tools()` | Loki injects tool descriptions into the prompt; LangChain attaches them to the API call |
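
A minimal sketch of the state schema and reducer behavior described above. The `add_messages` here is a simplified stand-in for LangGraph's reducer (the real one also handles message IDs); the point is that a node returns only a delta and the reducer merges it:

```python
from typing import Annotated, TypedDict

def add_messages(existing: list, new: list) -> list:
    # Simplified stand-in for langgraph's add_messages reducer:
    # node updates are appended to the history rather than replacing it.
    return existing + new

class SisyphusState(TypedDict):
    messages: Annotated[list, add_messages]
    project_dir: str

# A node returns only a delta; the graph runs it through the reducer.
state: SisyphusState = {"messages": [("user", "Add a health check")], "project_dir": "."}
delta = {"messages": [("ai", "Routing to explore")]}
state["messages"] = add_messages(state["messages"], delta["messages"])
```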

### Orchestration

| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| Intent classification table | `RoutingDecision` structured output | Loki does this in free text; LangGraph forces typed JSON |
| Oracle triggers ("How should I...") | Supervisor prompt + structured output | Same trigger phrases, enforced via the system prompt |
| Coder delegation format | Supervisor builds `HumanMessage` | The structured prompt (Goal/Reference Files/Conventions/Constraints) |
| `agent__spawn` (parallel) | `Send()` API | Dynamic fan-out to multiple nodes |
| Todo system (`todo__init`, etc.) | `SisyphusState.todos` | State field with a merge reducer |
| `auto_continue: true` | Supervisor loop (iteration counter) | Supervisor re-routes until `FINISH` or max iterations |
| `max_auto_continues: 25` | `MAX_ITERATIONS = 15` | Safety valve to prevent infinite loops |
| `user__ask` / `user__confirm` | `interrupt()` API | Pauses the graph, surfaces the question to the caller, resumes with the answer |
| Escalation (child → parent → user) | `interrupt()` in any node | Any node can pause; the caller handles the interaction |
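
The `RoutingDecision` in the first row can be pictured as a small typed schema. In the real code it would be a Pydantic model handed to `llm.with_structured_output(...)`; this dataclass sketch shows the shape and the validation that typed routing buys over free-text tool calls:

```python
from dataclasses import dataclass

VALID_ROUTES = {"explore", "oracle", "coder", "FINISH"}

@dataclass
class RoutingDecision:
    # Shape of the supervisor's structured output. A Pydantic version of
    # this passed to llm.with_structured_output() forces the LLM to
    # return valid JSON with exactly these fields.
    intent: str
    next_agent: str
    delegation_notes: str

    def __post_init__(self):
        if self.next_agent not in VALID_ROUTES:
            raise ValueError(f"invalid route: {self.next_agent}")

decision = RoutingDecision(
    intent="implementation",
    next_agent="explore",
    delegation_notes="Find existing API endpoint patterns",
)
```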

### Execution Model

| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| `loki --agent sisyphus` | `python -m sisyphus_langchain.cli` | CLI entry point |
| REPL mode | `cli.py` → `repl()` | Interactive loop with thread persistence |
| One-shot mode | `cli.py` → `run_query()` | Single query, print result, exit |
| Streaming output | `graph.stream()` | LangGraph supports per-node streaming |
| `inject_spawn_instructions` (always on) | System prompts are always included | — |
| `inject_todo_instructions` (always on) | Todo instructions could be added to prompts | — |

## How the Execution Flow Works

### 1. User sends a message

```python
graph.invoke({"messages": [HumanMessage("Add a health check endpoint")]})
```

### 2. Supervisor classifies intent

The supervisor LLM reads the message and produces a `RoutingDecision`:

```json
{
  "intent": "implementation",
  "next_agent": "explore",
  "delegation_notes": "Find existing API endpoint patterns, route structure, and health check conventions"
}
```

### 3. Supervisor routes via Command

```python
return Command(goto="explore", update={"intent": "implementation", "iteration_count": 1})
```

### 4. Explore agent runs

- Receives the full message history (including the user's request)
- Calls read-only tools (`search_content`, `search_files`, `read_file`)
- Returns findings in `messages`

### 5. Control returns to supervisor

The graph edge `explore → supervisor` fires automatically.

### 6. Supervisor reviews and routes again

Now it has explore's findings. It routes to coder with context:

```json
{
  "intent": "implementation",
  "next_agent": "coder",
  "delegation_notes": "Implement health check endpoint following patterns found in src/routes/"
}
```

### 7. Coder implements

- Reads explore's findings from the message history
- Writes files via the `write_file` tool
- Runs `verify_build` to check compilation

### 8. Supervisor verifies and finishes

```json
{
  "intent": "implementation",
  "next_agent": "FINISH",
  "delegation_notes": "Added /health endpoint in src/routes/health.py. Build passes."
}
```
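
Stripped of the LLM calls, this whole flow is a plain loop: the supervisor inspects state, picks a worker, the worker mutates state, and control returns to the supervisor until it decides `FINISH`. A hand-rolled sketch with stub agents and hypothetical state keys (no LangGraph):

```python
def supervisor(state):
    # Classify what is still missing and route accordingly.
    if not state["findings"]:
        return "explore"
    if not state["implemented"]:
        return "coder"
    return "FINISH"

def explore(state):
    state["findings"] = "endpoint patterns in src/routes/"

def coder(state):
    state["implemented"] = True

state = {"findings": "", "implemented": False}
workers = {"explore": explore, "coder": coder}
route = supervisor(state)
while route != "FINISH":
    workers[route](state)      # worker node runs
    route = supervisor(state)  # implicit edge back to the supervisor
```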

## Key Differences from Loki

### What LangGraph does better

1. **Declarative graph** — The topology is visible and debuggable. Loki's orchestration is emergent from the LLM's tool calls.
2. **Typed state** — `SisyphusState` is a `TypedDict` with reducers. Loki's state is implicit in the conversation.
3. **Checkpointing** — Built-in persistence. Loki manages sessions manually.
4. **Time-travel debugging** — Inspect any checkpoint. Loki has no equivalent.
5. **Structured routing** — `RoutingDecision` forces valid JSON. Loki relies on the LLM calling the right tool.

### What Loki does better

1. **True parallelism** — `agent__spawn` runs multiple agents concurrently in separate threads. This LangGraph implementation is sequential (see Parallel Execution for how to add it).
2. **Agent isolation** — Each Loki agent has its own session, tools, and config. LangGraph nodes share state.
3. **Teammate messaging** — Loki agents can send messages to siblings. LangGraph nodes communicate only through shared state.
4. **Dynamic tool compilation** — Loki compiles bash/python/typescript tools at startup. LangChain tools are statically defined.
5. **Escalation protocol** — Loki's child-to-parent escalation is sophisticated. LangGraph's `interrupt()` is simpler but less structured.
6. **Task queues with dependencies** — Loki's `agent__task_create` supports dependency DAGs. LangGraph's routing is simpler (hub-and-spoke).
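
For a feel of what item 6 means, a dependency DAG like Loki's can be scheduled with Python's standard-library `graphlib`; the task names here are hypothetical:

```python
from graphlib import TopologicalSorter

# Hypothetical dependency DAG like one built with agent__task_create:
# "build" waits on "explore", and "test" waits on "build".
deps = {"build": {"explore"}, "test": {"build"}}

# Tasks become ready in dependency order; in Loki each would be
# dispatched to an agent once its prerequisites complete.
order = list(TopologicalSorter(deps).static_order())
```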

## Running It

### Prerequisites

```bash
# Python 3.11+
python --version

# Set your API key
export OPENAI_API_KEY="sk-..."
```

### Install

```bash
cd examples/langchain-sisyphus

# With pip
pip install -e .

# Or with uv (recommended)
uv pip install -e .
```

### Usage

```bash
# Interactive REPL (like `loki --agent sisyphus`)
sisyphus

# One-shot query
sisyphus "Find all TODO comments in the codebase"

# With custom models (cost optimization)
sisyphus --explore-model gpt-4o-mini --coder-model gpt-4o "Add input validation to the API"

# Programmatic usage
python -c "
from sisyphus_langchain import build_graph
from langchain_core.messages import HumanMessage

graph = build_graph()
result = graph.invoke({
    'messages': [HumanMessage('What patterns does this codebase use?')],
    'intent': 'ambiguous',
    'next_agent': '',
    'iteration_count': 0,
    'todos': [],
    'agent_outputs': {},
    'final_output': '',
    'project_dir': '.',
}, config={'configurable': {'thread_id': 'demo'}, 'recursion_limit': 50})
print(result['final_output'])
"
```

### Using Anthropic Models

Replace `ChatOpenAI` with `ChatAnthropic` in the agent factories:

```python
from langchain_anthropic import ChatAnthropic

# In agents/oracle.py:
llm = ChatAnthropic(model="claude-sonnet-4-20250514", temperature=0.2).bind_tools(ORACLE_TOOLS)
```

## Deployment

### Option 1: Standalone Script (Simplest)

Just run the CLI directly. No infrastructure needed.

```bash
sisyphus "Add a health check endpoint"
```

### Option 2: FastAPI Server

```python
# server.py
from fastapi import FastAPI
from langserve import add_routes
from sisyphus_langchain import build_graph

app = FastAPI(title="Sisyphus API")
graph = build_graph()
add_routes(app, graph, path="/agent")

# Run: uvicorn server:app --host 0.0.0.0 --port 8000
# Call: POST http://localhost:8000/agent/invoke
```

### Option 3: LangGraph Platform (Production)

Create a `langgraph.json` at the project root:

```json
{
  "graphs": {
    "sisyphus": "./sisyphus_langchain/graph.py:build_graph"
  },
  "dependencies": ["./sisyphus_langchain"],
  "env": ".env"
}
```

Then deploy:

```bash
pip install langgraph-cli
langgraph deploy
```

This gives you:

- Durable checkpointing (PostgreSQL)
- Background runs
- Streaming API
- Zero-downtime deployments
- Built-in observability

### Option 4: Docker

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install -e .
CMD ["sisyphus"]
```

```bash
docker build -t sisyphus .
docker run -it -e OPENAI_API_KEY=$OPENAI_API_KEY sisyphus
```

## Parallel Execution

This implementation routes sequentially for simplicity. To add Loki-style parallel agent execution, use LangGraph's `Send()` API:

```python
from langchain_core.messages import HumanMessage
from langgraph.types import Send

def supervisor_node(state):
    # Fan out to multiple explore agents in parallel
    # (like Loki's agent__spawn called multiple times)
    return [
        Send("explore", {
            **state,
            "messages": state["messages"] + [
                HumanMessage("Find existing API endpoint patterns")
            ],
        }),
        Send("explore", {
            **state,
            "messages": state["messages"] + [
                HumanMessage("Find data models and database patterns")
            ],
        }),
    ]
```

This is equivalent to Loki's pattern of spawning multiple explore agents:

```bash
agent__spawn --agent explore --prompt "Find API patterns"
agent__spawn --agent explore --prompt "Find database patterns"
agent__collect --id <id1>
agent__collect --id <id2>
```

## Adding Human-in-the-Loop

To replicate Loki's `user__ask` / `user__confirm` tools, use LangGraph's `interrupt()`:

```python
from langgraph.types import interrupt

def supervisor_node(state):
    # Pause and ask the user (like Loki's user__ask)
    answer = interrupt({
        "question": "How should we structure the authentication?",
        "options": [
            "JWT with httpOnly cookies (Recommended)",
            "Session-based with Redis",
            "OAuth2 with external provider",
        ],
    })
    # `answer` contains the user's selection when the graph resumes
```

## Project Structure

```
examples/langchain-sisyphus/
├── pyproject.toml                          # Dependencies & build config
├── README.md                               # This file
└── sisyphus_langchain/
    ├── __init__.py                         # Package entry point
    ├── cli.py                              # CLI (REPL + one-shot mode)
    ├── graph.py                            # Graph assembly (wires nodes + edges)
    ├── state.py                            # Shared state schema (TypedDict)
    ├── agents/
    │   ├── __init__.py
    │   ├── supervisor.py                   # Sisyphus orchestrator (intent → routing)
    │   ├── explore.py                      # Read-only codebase researcher
    │   ├── oracle.py                       # Architecture/debugging advisor
    │   └── coder.py                        # Implementation worker
    └── tools/
        ├── __init__.py
        ├── filesystem.py                   # File read/write/search/glob tools
        └── project.py                      # Project detection, build, test tools
```

## File-to-Loki Mapping

| This Project | Loki Equivalent |
|---|---|
| `state.py` | Session context + todo state (implicit in Loki) |
| `graph.py` | `src/supervisor/mod.rs` (runtime orchestration) |
| `cli.py` | `src/main.rs` (CLI entry point) |
| `agents/supervisor.py` | `assets/agents/sisyphus/config.yaml` |
| `agents/explore.py` | `assets/agents/explore/config.yaml` + `tools.sh` |
| `agents/oracle.py` | `assets/agents/oracle/config.yaml` + `tools.sh` |
| `agents/coder.py` | `assets/agents/coder/config.yaml` + `tools.sh` |
| `tools/filesystem.py` | `assets/functions/tools/fs_*.sh` |
| `tools/project.py` | `assets/agents/.shared/utils.sh` + `sisyphus/tools.sh` |

## Further Reading