# Sisyphus in LangChain/LangGraph
A faithful recreation of [Loki's Sisyphus agent](../../assets/agents/sisyphus/) using [LangGraph](https://docs.langchain.com/langgraph/) — LangChain's framework for stateful, multi-agent workflows.
This project exists to help you understand LangChain/LangGraph by mapping every concept to its Loki equivalent.
## Architecture Overview
```
┌─────────────────────────────────────────────────────────────┐
│                       SUPERVISOR NODE                       │
│  Intent classification → Routing decision → Command(goto=)  │
│                                                             │
│  Loki equivalent: sisyphus/config.yaml                      │
│  (agent__spawn → Command, agent__collect → graph edge)      │
└──────────┬──────────────┬──────────────┬────────────────────┘
           │              │              │
           ▼              ▼              ▼
     ┌────────────┐ ┌────────────┐ ┌────────────┐
     │  EXPLORE   │ │   ORACLE   │ │   CODER    │
     │ (research) │ │  (advise)  │ │  (build)   │
     │            │ │            │ │            │
     │ read-only  │ │ read-only  │ │ read+write │
     │   tools    │ │   tools    │ │   tools    │
     └─────┬──────┘ └─────┬──────┘ └─────┬──────┘
           │              │              │
           └──────────────┼──────────────┘
                  back to supervisor
```
## Concept Map: Loki → LangGraph
This is the key reference. Every row maps a Loki concept to its LangGraph equivalent.
### Core Architecture
| Loki Concept | LangGraph Equivalent | Where in Code |
|---|---|---|
| Agent config (config.yaml) | Node function + system prompt | `agents/explore.py`, etc. |
| Agent instructions | System prompt string | `EXPLORE_SYSTEM_PROMPT`, etc. |
| Agent tools (tools.sh) | `@tool`-decorated Python functions | `tools/filesystem.py`, `tools/project.py` |
| Agent session (chat loop) | Graph state + message list | `state.py` → `SisyphusState.messages` |
| `agent__spawn --agent X` | `Command(goto="X")` | `agents/supervisor.py` |
| `agent__collect --id` | Graph edge (implicit: workers return to supervisor) | `graph.py` → `add_edge("explore", "supervisor")` |
| `agent__check` (non-blocking) | Not needed (graph handles scheduling) | — |
| `agent__cancel` | Not needed (graph handles lifecycle) | — |
| `can_spawn_agents: true` | Node has routing logic (supervisor) | `agents/supervisor.py` |
| `max_concurrent_agents: 4` | `Send()` API for parallel fan-out | See [Parallel Execution](#parallel-execution) |
| `max_agent_depth: 3` | `recursion_limit` in config | `cli.py` → `recursion_limit: 50` |
| `summarization_threshold` | Manual truncation in supervisor | `supervisor.py` → `_summarize_outputs()` |
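
The "manual truncation" row can be pictured with a small helper. This is only a sketch of the idea, not the project's `_summarize_outputs()`: the function name, the `max_chars` parameter, and the `[truncated]` marker are all assumptions made for illustration.

```python
def summarize_outputs(agent_outputs: dict[str, str], max_chars: int = 500) -> str:
    """Join worker outputs, truncating each one to at most max_chars characters."""
    sections = []
    for agent, output in agent_outputs.items():
        text = output.strip()
        if len(text) > max_chars:
            # Keep the head of the output and mark the cut explicitly.
            text = text[:max_chars] + " [truncated]"
        sections.append(f"## {agent}\n{text}")
    return "\n\n".join(sections)


# Hypothetical worker outputs: one long, one short.
summary = summarize_outputs({"explore": "x" * 1000, "oracle": "Use JWT."}, max_chars=20)
print(summary)
```

Truncating per agent rather than over the joined text keeps one verbose worker from crowding out the others.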
### Tool System
| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| `tools.sh` with `@cmd` annotations | `@tool` decorator | Loki compiles bash annotations to JSON schema; LangChain generates schema from the Python function signature + docstring |
| `@option --pattern!` (required arg) | Function parameter without default | `def search_content(pattern: str)` |
| `@option --lines` (optional arg) | Parameter with default | `def read_file(path: str, limit: int = 200)` |
| `@env LLM_OUTPUT=/dev/stdout` | Return value | LangChain tools return strings; Loki tools write to `$LLM_OUTPUT` |
| `@describe` | Docstring | The tool's docstring becomes the description the LLM sees |
| Global tools (`fs_read.sh`, etc.) | Shared tool imports | Worker agents import shared tools from `tools/filesystem.py` |
| Agent-specific tools | Per-node tool binding | `llm.bind_tools(EXPLORE_TOOLS)` vs `llm.bind_tools(CODER_TOOLS)` |
| `.shared/utils.sh` | `tools/project.py` | Shared project detection utilities |
| `detect_project()` heuristic | `detect_project()` in Python | Same logic: check Cargo.toml → go.mod → package.json → etc. |
| LLM fallback for unknown projects | (omitted) | The agents themselves can reason about unknown project types |
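
To make the required/optional rows concrete, here is a standard-library sketch of how a schema can be derived from a function signature and docstring. It illustrates the idea only and is not LangChain's implementation: `describe_tool` is a hypothetical helper, and the body of `read_file` is assumed (only its signature comes from the table).

```python
import inspect


def describe_tool(func) -> dict:
    """Build a JSON-schema-like description from a function's signature and docstring."""
    type_names = {str: "string", int: "integer", float: "number", bool: "boolean"}
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": type_names.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default -> required, like `@option --pattern!`
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def read_file(path: str, limit: int = 200) -> str:
    """Read up to `limit` lines from the file at `path`."""
    with open(path, encoding="utf-8") as handle:
        return "".join(handle.readlines()[:limit])


schema = describe_tool(read_file)
print(schema["parameters"]["required"])
```

Here `path` ends up required because it has no default, while `limit` is optional, which mirrors the `--pattern!` versus `--lines` distinction in `tools.sh`.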
### State & Memory
| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| Agent session (conversation history) | `SisyphusState.messages` | `Annotated[list, add_messages]` — the reducer appends instead of replacing |
| `agent_session: temp` | `MemorySaver` checkpointer | Loki's temp sessions are ephemeral; MemorySaver is in-memory (lost on restart) |
| Per-agent isolation | Per-node system prompt + tools | In Loki agents have separate sessions; in LangGraph they share messages but have different system prompts |
| `{{project_dir}}` variable | `SisyphusState.project_dir` | Loki interpolates variables into prompts; LangGraph stores them in state |
| `{{__tools__}}` injection | `llm.bind_tools()` | Loki injects tool descriptions into the prompt; LangChain attaches them to the API call |
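
The reducer behaviour in the first row can be sketched without LangGraph. The example below is a simplified model under stated assumptions: `SketchState` and `apply_update` are hypothetical names, and `operator.add` stands in for `add_messages` (which additionally handles message IDs).

```python
import operator
from typing import Annotated, TypedDict, get_type_hints


class SketchState(TypedDict):
    # Annotated metadata names the reducer used to merge updates into this field.
    messages: Annotated[list, operator.add]
    intent: str
    project_dir: str


def apply_update(state: dict, update: dict) -> dict:
    """Merge a node's partial update into the state, honoring per-field reducers."""
    hints = get_type_hints(SketchState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata:
            merged[key] = metadata[0](state[key], value)  # reducer: combine old and new
        else:
            merged[key] = value  # no reducer: last write wins
    return merged


state = {"messages": ["user: add a health check"], "intent": "ambiguous", "project_dir": "."}
state = apply_update(state, {"messages": ["explore: found src/routes/"], "intent": "implementation"})
print(state["messages"])
```

Fields with a reducer accumulate across nodes, while plain fields such as `intent` are overwritten by the latest update.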
### Orchestration
| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| Intent classification table | `RoutingDecision` structured output | Loki does this in free text; LangGraph forces typed JSON |
| Oracle triggers ("How should I...") | Supervisor prompt + structured output | Same trigger phrases, enforced via system prompt |
| Coder delegation format | Supervisor builds HumanMessage | The structured prompt (Goal/Reference Files/Conventions/Constraints) |
| `agent__spawn` (parallel) | `Send()` API | Dynamic fan-out to multiple nodes |
| Todo system (`todo__init`, etc.) | `SisyphusState.todos` | State field with a merge reducer |
| `auto_continue: true` | Supervisor loop (iteration counter) | Supervisor re-routes until FINISH or max iterations |
| `max_auto_continues: 25` | `MAX_ITERATIONS = 15` | Safety valve to prevent infinite loops |
| `user__ask` / `user__confirm` | `interrupt()` API | Pauses graph, surfaces question to caller, resumes with answer |
| Escalation (child → parent → user) | `interrupt()` in any node | Any node can pause; the caller handles the interaction |
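
As a rough illustration of what a merge reducer for `todos` might look like, the sketch below merges updates by id. The todo shape (`id`, `task`, `status`) and the function name `merge_todos` are assumptions; the project's actual reducer may differ.

```python
def merge_todos(existing: list[dict], updates: list[dict]) -> list[dict]:
    """Merge todo updates by id: update matching items, append new ones."""
    merged = {todo["id"]: dict(todo) for todo in existing}
    for todo in updates:
        # Existing ids are updated field by field; unknown ids become new entries.
        merged.setdefault(todo["id"], {}).update(todo)
    return list(merged.values())


existing = [
    {"id": 1, "task": "explore routes", "status": "pending"},
    {"id": 2, "task": "write endpoint", "status": "pending"},
]
updates = [
    {"id": 1, "status": "done"},
    {"id": 3, "task": "run build", "status": "pending"},
]
result = merge_todos(existing, updates)
print([todo["status"] for todo in result])
```

A reducer of this shape lets a node return only the todos it changed instead of the whole list.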
### Execution Model
| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| `loki --agent sisyphus` | `python -m sisyphus_langchain.cli` | CLI entry point |
| REPL mode | `cli.py` → `repl()` | Interactive loop with thread persistence |
| One-shot mode | `cli.py` → `run_query()` | Single query, print result, exit |
| Streaming output | `graph.stream()` | LangGraph supports per-node streaming |
| `inject_spawn_instructions` | (always on) | System prompts are always included |
| `inject_todo_instructions` | (always on) | Todo instructions could be added to prompts |
## How the Execution Flow Works
### 1. User sends a message
```python
graph.invoke({"messages": [HumanMessage("Add a health check endpoint")]})
```
### 2. Supervisor classifies intent
The supervisor LLM reads the message and produces a `RoutingDecision`:
```json
{
  "intent": "implementation",
  "next_agent": "explore",
  "delegation_notes": "Find existing API endpoint patterns, route structure, and health check conventions"
}
```
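
The value of a typed decision is that an invalid route fails loudly instead of being silently followed. A minimal standard-library sketch of that validation is shown here; `RoutingDecisionSketch`, `parse_routing_decision`, and the `ALLOWED_AGENTS` set are illustrative assumptions, not the project's `RoutingDecision` class.

```python
import json
from dataclasses import dataclass

# Assumed routing targets: the three workers plus the FINISH sentinel.
ALLOWED_AGENTS = {"explore", "oracle", "coder", "FINISH"}


@dataclass(frozen=True)
class RoutingDecisionSketch:
    intent: str
    next_agent: str
    delegation_notes: str


def parse_routing_decision(raw: str) -> RoutingDecisionSketch:
    """Parse the supervisor's JSON reply and reject unknown routing targets."""
    data = json.loads(raw)
    decision = RoutingDecisionSketch(**data)  # raises TypeError on missing or extra keys
    if decision.next_agent not in ALLOWED_AGENTS:
        raise ValueError(f"unknown next_agent: {decision.next_agent!r}")
    return decision


decision = parse_routing_decision(
    '{"intent": "implementation", "next_agent": "explore", '
    '"delegation_notes": "Find existing API endpoint patterns"}'
)
print(decision.next_agent)
```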
### 3. Supervisor routes via Command
```python
return Command(goto="explore", update={"intent": "implementation", "iteration_count": 1})
```
### 4. Explore agent runs
- Receives the full message history (including the user's request)
- Calls read-only tools (search_content, search_files, read_file)
- Returns findings in messages
### 5. Control returns to supervisor
The graph edge `explore → supervisor` fires automatically.
### 6. Supervisor reviews and routes again
Now it has explore's findings. It routes to coder with context:
```json
{
  "intent": "implementation",
  "next_agent": "coder",
  "delegation_notes": "Implement health check endpoint following patterns found in src/routes/"
}
```
### 7. Coder implements
- Reads explore's findings from the message history
- Writes files via `write_file` tool
- Runs `verify_build` to check compilation
### 8. Supervisor verifies and finishes
```json
{
  "intent": "implementation",
  "next_agent": "FINISH",
  "delegation_notes": "Added /health endpoint in src/routes/health.py. Build passes."
}
```
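
The eight steps reduce to a hub-and-spoke loop: the supervisor picks a worker, the worker runs, and control returns to the supervisor until it says `FINISH` or the iteration cap is hit. The plain-Python simulation below is a sketch of that control flow only. The scripted `decisions` list stands in for the LLM, and `run_hub_and_spoke` and the lambda workers are hypothetical.

```python
MAX_ITERATIONS = 15  # safety valve, as in the Orchestration table


def run_hub_and_spoke(decisions, workers):
    """Drive a supervisor loop: route to a worker, return to the supervisor, repeat."""
    visited = []
    outputs = {}
    for iteration, next_agent in enumerate(decisions, start=1):
        if next_agent == "FINISH" or iteration > MAX_ITERATIONS:
            break
        outputs[next_agent] = workers[next_agent](outputs)  # worker sees prior findings
        visited.append(next_agent)
    return visited, outputs


workers = {
    "explore": lambda prior: "routes live in src/routes/",
    "coder": lambda prior: f"added /health using: {prior['explore']}",
}
visited, outputs = run_hub_and_spoke(["explore", "coder", "FINISH"], workers)
print(visited)
```

In the real graph the shared message history plays the role of `outputs`, which is how the coder can read explore's findings.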
## Key Differences from Loki
### What LangGraph does better
1. **Declarative graph** — The topology is visible and debuggable. Loki's orchestration is emergent from the LLM's tool calls.
2. **Typed state** — `SisyphusState` is a TypedDict with reducers. Loki's state is implicit in the conversation.
3. **Checkpointing** — Built-in persistence. Loki manages sessions manually.
4. **Time-travel debugging** — Inspect any checkpoint. Loki has no equivalent.
5. **Structured routing** — `RoutingDecision` forces valid JSON. Loki relies on the LLM calling the right tool.
### What Loki does better
1. **True parallelism** — `agent__spawn` runs multiple agents concurrently in separate threads. This LangGraph implementation is sequential (see [Parallel Execution](#parallel-execution) for how to add it).
2. **Agent isolation** — Each Loki agent has its own session, tools, and config. LangGraph nodes share state.
3. **Teammate messaging** — Loki agents can send messages to siblings. LangGraph nodes communicate only through shared state.
4. **Dynamic tool compilation** — Loki compiles bash/python/typescript tools at startup. LangChain tools are statically defined.
5. **Escalation protocol** — Loki's child-to-parent escalation is sophisticated. LangGraph's `interrupt()` is simpler but less structured.
6. **Task queues with dependencies** — Loki's `agent__task_create` supports dependency DAGs. LangGraph's routing is simpler (hub-and-spoke).
## Running It
### Prerequisites
```bash
# Python 3.11+
python --version
# Set your API key
export OPENAI_API_KEY="sk-..."
```
### Install
```bash
cd examples/langchain-sisyphus
# With pip
pip install -e .
# Or with uv (recommended)
uv pip install -e .
```
### Usage
```bash
# Interactive REPL (like `loki --agent sisyphus`)
sisyphus
# One-shot query
sisyphus "Find all TODO comments in the codebase"
# With custom models (cost optimization)
sisyphus --explore-model gpt-4o-mini --coder-model gpt-4o "Add input validation to the API"
# Programmatic usage
python -c "
from sisyphus_langchain import build_graph
from langchain_core.messages import HumanMessage
graph = build_graph()
result = graph.invoke({
    'messages': [HumanMessage('What patterns does this codebase use?')],
    'intent': 'ambiguous',
    'next_agent': '',
    'iteration_count': 0,
    'todos': [],
    'agent_outputs': {},
    'final_output': '',
    'project_dir': '.',
}, config={'configurable': {'thread_id': 'demo'}, 'recursion_limit': 50})
print(result['final_output'])
"
```
### Using Anthropic Models
Replace `ChatOpenAI` with `ChatAnthropic` in the agent factories:
```python
from langchain_anthropic import ChatAnthropic
# In agents/oracle.py:
llm = ChatAnthropic(model="claude-sonnet-4-20250514", temperature=0.2).bind_tools(ORACLE_TOOLS)
```
## Deployment
### Option 1: Standalone Script (Simplest)
Just run the CLI directly. No infrastructure needed.
```bash
sisyphus "Add a health check endpoint"
```
### Option 2: FastAPI Server
```python
# server.py
from fastapi import FastAPI
from langserve import add_routes
from sisyphus_langchain import build_graph
app = FastAPI(title="Sisyphus API")
graph = build_graph()
add_routes(app, graph, path="/agent")
# Run: uvicorn server:app --host 0.0.0.0 --port 8000
# Call: POST http://localhost:8000/agent/invoke
```
### Option 3: LangGraph Platform (Production)
Create a `langgraph.json` at the project root:
```json
{
  "graphs": {
    "sisyphus": "./sisyphus_langchain/graph.py:build_graph"
  },
  "dependencies": ["./sisyphus_langchain"],
  "env": ".env"
}
```
Then deploy:
```bash
pip install langgraph-cli
langgraph deploy
```
This gives you:
- Durable checkpointing (PostgreSQL)
- Background runs
- Streaming API
- Zero-downtime deployments
- Built-in observability
### Option 4: Docker
```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install -e .
CMD ["sisyphus"]
```
```bash
docker build -t sisyphus .
docker run -it -e OPENAI_API_KEY=$OPENAI_API_KEY sisyphus
```
## Parallel Execution
This implementation routes sequentially for simplicity. To add Loki-style parallel agent execution, use LangGraph's `Send()` API:
```python
from langchain_core.messages import HumanMessage
from langgraph.types import Command, Send


def supervisor_node(state):
    # Fan out to multiple explore agents in parallel
    # (like Loki's agent__spawn called multiple times).
    # A node dispatches Send objects by wrapping them in Command(goto=[...]).
    return Command(
        goto=[
            Send("explore", {
                **state,
                "messages": state["messages"] + [
                    HumanMessage("Find existing API endpoint patterns")
                ],
            }),
            Send("explore", {
                **state,
                "messages": state["messages"] + [
                    HumanMessage("Find data models and database patterns")
                ],
            }),
        ]
    )
```
This is equivalent to Loki's pattern of spawning multiple explore agents:
```
agent__spawn --agent explore --prompt "Find API patterns"
agent__spawn --agent explore --prompt "Find database patterns"
agent__collect --id <id1>
agent__collect --id <id2>
```
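
For readers who think in terms of threads, the spawn/collect pattern has a familiar standard-library analogue. The sketch below is only an analogy, not how LangGraph or Loki schedule work internally. The `explore` function is a stand-in that returns a canned string instead of calling an LLM.

```python
from concurrent.futures import ThreadPoolExecutor


def explore(prompt: str) -> str:
    """Stand-in for an explore agent; a real one would call an LLM with tools."""
    return f"findings for: {prompt}"


prompts = ["Find API patterns", "Find database patterns"]

with ThreadPoolExecutor(max_workers=4) as pool:  # cap mirrors max_concurrent_agents: 4
    futures = [pool.submit(explore, prompt) for prompt in prompts]  # spawn
    results = [future.result() for future in futures]  # collect, in spawn order

print(results)
```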
## Adding Human-in-the-Loop
To replicate Loki's `user__ask` / `user__confirm` tools, use LangGraph's `interrupt()`:
```python
from langgraph.types import interrupt


def supervisor_node(state):
    # Pause and ask the user (like Loki's user__ask)
    answer = interrupt({
        "question": "How should we structure the authentication?",
        "options": [
            "JWT with httpOnly cookies (Recommended)",
            "Session-based with Redis",
            "OAuth2 with external provider",
        ],
    })
    # `answer` contains the user's selection when the graph resumes
```
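
The pause-and-resume handshake can be modelled with a plain Python generator, which may help if `interrupt()` feels opaque. This is an analogy under stated assumptions, not LangGraph's mechanism: in LangGraph the caller resumes by invoking the graph again with `Command(resume=...)` on the same thread, a checkpointer is required, and the interrupted node re-runs from its start rather than continuing mid-function. The name `supervisor_with_question` is hypothetical.

```python
def supervisor_with_question():
    """Pause for user input by yielding a question; resume when the caller sends an answer."""
    answer = yield {
        "question": "How should we structure the authentication?",
        "options": ["JWT with httpOnly cookies", "Session-based with Redis"],
    }
    return f"Proceeding with: {answer}"


run = supervisor_with_question()
payload = next(run)  # runs until the pause and surfaces the question
choice = payload["options"][0]  # the caller (CLI, web UI) picks an answer
try:
    run.send(choice)  # resume with the answer
except StopIteration as finished:
    result = finished.value

print(result)
```

Because a real node re-executes from the top on resume, side effects placed before `interrupt()` should be idempotent.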
## Project Structure
```
examples/langchain-sisyphus/
├── pyproject.toml              # Dependencies & build config
├── README.md                   # This file
└── sisyphus_langchain/
    ├── __init__.py             # Package entry point
    ├── cli.py                  # CLI (REPL + one-shot mode)
    ├── graph.py                # Graph assembly (wires nodes + edges)
    ├── state.py                # Shared state schema (TypedDict)
    ├── agents/
    │   ├── __init__.py
    │   ├── supervisor.py       # Sisyphus orchestrator (intent → routing)
    │   ├── explore.py          # Read-only codebase researcher
    │   ├── oracle.py           # Architecture/debugging advisor
    │   └── coder.py            # Implementation worker
    └── tools/
        ├── __init__.py
        ├── filesystem.py       # File read/write/search/glob tools
        └── project.py          # Project detection, build, test tools
```
### File-to-Loki Mapping
| This Project | Loki Equivalent |
|---|---|
| `state.py` | Session context + todo state (implicit in Loki) |
| `graph.py` | `src/supervisor/mod.rs` (runtime orchestration) |
| `cli.py` | `src/main.rs` (CLI entry point) |
| `agents/supervisor.py` | `assets/agents/sisyphus/config.yaml` |
| `agents/explore.py` | `assets/agents/explore/config.yaml` + `tools.sh` |
| `agents/oracle.py` | `assets/agents/oracle/config.yaml` + `tools.sh` |
| `agents/coder.py` | `assets/agents/coder/config.yaml` + `tools.sh` |
| `tools/filesystem.py` | `assets/functions/tools/fs_*.sh` |
| `tools/project.py` | `assets/agents/.shared/utils.sh` + `sisyphus/tools.sh` |
## Further Reading
- [LangGraph Documentation](https://docs.langchain.com/langgraph/)
- [LangGraph Multi-Agent Tutorial](https://docs.langchain.com/langgraph/how-tos/multi-agent-systems)
- [Loki Agents Documentation](../../docs/AGENTS.md)
- [Loki Sisyphus README](../../assets/agents/sisyphus/README.md)
- [LangGraph Supervisor Library](https://github.com/langchain-ai/langgraph-supervisor-py)