# Sisyphus in LangChain/LangGraph

A faithful recreation of [Loki's Sisyphus agent](../../assets/agents/sisyphus/) using [LangGraph](https://docs.langchain.com/langgraph/) — LangChain's framework for stateful, multi-agent workflows.

This project exists to help you understand LangChain/LangGraph by mapping every concept to its Loki equivalent.

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                       SUPERVISOR NODE                       │
│  Intent classification → Routing decision → Command(goto=)  │
│                                                             │
│  Loki equivalent: sisyphus/config.yaml                      │
│  (agent__spawn → Command, agent__collect → graph edge)      │
└──────────┬──────────────┬──────────────┬────────────────────┘
           │              │              │
           ▼              ▼              ▼
    ┌────────────┐  ┌────────────┐  ┌────────────┐
    │  EXPLORE   │  │   ORACLE   │  │   CODER    │
    │ (research) │  │  (advise)  │  │  (build)   │
    │            │  │            │  │            │
    │ read-only  │  │ read-only  │  │ read+write │
    │   tools    │  │   tools    │  │   tools    │
    └─────┬──────┘  └─────┬──────┘  └─────┬──────┘
          │               │               │
          └───────────────┼───────────────┘
                          │
                 back to supervisor
```

## Concept Map: Loki → LangGraph

This is the key reference. Every row maps a Loki concept to its LangGraph equivalent.

### Core Architecture

| Loki Concept | LangGraph Equivalent | Where in Code |
|---|---|---|
| Agent config (config.yaml) | Node function + system prompt | `agents/explore.py`, etc. |
| Agent instructions | System prompt string | `EXPLORE_SYSTEM_PROMPT`, etc. |
| Agent tools (tools.sh) | `@tool`-decorated Python functions | `tools/filesystem.py`, `tools/project.py` |
| Agent session (chat loop) | Graph state + message list | `state.py` → `SisyphusState.messages` |
| `agent__spawn --agent X` | `Command(goto="X")` | `agents/supervisor.py` |
| `agent__collect --id` | Graph edge (implicit — workers return to supervisor) | `graph.py` → `add_edge("explore", "supervisor")` |
| `agent__check` (non-blocking) | Not needed (graph handles scheduling) | — |
| `agent__cancel` | Not needed (graph handles lifecycle) | — |
| `can_spawn_agents: true` | Node has routing logic (supervisor) | `agents/supervisor.py` |
| `max_concurrent_agents: 4` | `Send()` API for parallel fan-out | See [Parallel Execution](#parallel-execution) |
| `max_agent_depth: 3` | `recursion_limit` in config | `cli.py` → `recursion_limit: 50` |
| `summarization_threshold` | Manual truncation in supervisor | `supervisor.py` → `_summarize_outputs()` |

### Tool System

| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| `tools.sh` with `@cmd` annotations | `@tool` decorator | Loki compiles bash annotations to JSON schema; LangChain generates schema from the Python function signature + docstring |
| `@option --pattern!` (required arg) | Function parameter without default | `def search_content(pattern: str)` |
| `@option --lines` (optional arg) | Parameter with default | `def read_file(path: str, limit: int = 200)` |
| `@env LLM_OUTPUT=/dev/stdout` | Return value | LangChain tools return strings; Loki tools write to `$LLM_OUTPUT` |
| `@describe` | Docstring | The tool's docstring becomes the description the LLM sees |
| Global tools (`fs_read.sh`, etc.) | Shared tool imports | Both agents import from `tools/filesystem.py` |
| Agent-specific tools | Per-node tool binding | `llm.bind_tools(EXPLORE_TOOLS)` vs `llm.bind_tools(CODER_TOOLS)` |
| `.shared/utils.sh` | `tools/project.py` | Shared project detection utilities |
| `detect_project()` heuristic | `detect_project()` in Python | Same logic: check Cargo.toml → go.mod → package.json → etc. |
| LLM fallback for unknown projects | (omitted) | The agents themselves can reason about unknown project types |

### State & Memory

| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| Agent session (conversation history) | `SisyphusState.messages` | `Annotated[list, add_messages]` — the reducer appends instead of replacing |
| `agent_session: temp` | `MemorySaver` checkpointer | Loki's temp sessions are ephemeral; MemorySaver is in-memory (lost on restart) |
| Per-agent isolation | Per-node system prompt + tools | In Loki agents have separate sessions; in LangGraph they share messages but have different system prompts |
| `{{project_dir}}` variable | `SisyphusState.project_dir` | Loki interpolates variables into prompts; LangGraph stores them in state |
| `{{__tools__}}` injection | `llm.bind_tools()` | Loki injects tool descriptions into the prompt; LangChain attaches them to the API call |
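
A minimal sketch of what a schema like `state.py`'s might look like. `append_reducer` is a simplified stand-in for LangGraph's real `add_messages` (which also handles message IDs and overwrites); the field names follow the tables in this README:

```python
from typing import Annotated, TypedDict

def append_reducer(existing: list, new: list) -> list:
    # Stand-in reducer: merge by appending rather than replacing
    return existing + new

class SisyphusState(TypedDict):
    messages: Annotated[list, append_reducer]   # conversation history
    intent: str                                 # supervisor's classification
    next_agent: str                             # routing target
    iteration_count: int                        # auto-continue safety valve
    project_dir: str                            # Loki's {{project_dir}}
    todos: Annotated[list, append_reducer]      # todo items, merged
    agent_outputs: dict                         # per-agent findings
    final_output: str

# When a node returns {"messages": [new_msg]}, the reducer merges it:
merged = append_reducer(["user: hi"], ["explore: findings"])
```
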

### Orchestration

| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| Intent classification table | `RoutingDecision` structured output | Loki does this in free text; LangGraph forces typed JSON |
| Oracle triggers ("How should I...") | Supervisor prompt + structured output | Same trigger phrases, enforced via system prompt |
| Coder delegation format | Supervisor builds HumanMessage | The structured prompt (Goal/Reference Files/Conventions/Constraints) |
| `agent__spawn` (parallel) | `Send()` API | Dynamic fan-out to multiple nodes |
| Todo system (`todo__init`, etc.) | `SisyphusState.todos` | State field with a merge reducer |
| `auto_continue: true` | Supervisor loop (iteration counter) | Supervisor re-routes until FINISH or max iterations |
| `max_auto_continues: 25` | `MAX_ITERATIONS = 15` | Safety valve to prevent infinite loops |
| `user__ask` / `user__confirm` | `interrupt()` API | Pauses graph, surfaces question to caller, resumes with answer |
| Escalation (child → parent → user) | `interrupt()` in any node | Any node can pause; the caller handles the interaction |
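
The structured-routing row can be sketched as a Pydantic model handed to `with_structured_output`. The exact `Literal` values are assumptions drawn from the examples later in this README, not necessarily what `supervisor.py` uses:

```python
from typing import Literal
from pydantic import BaseModel, Field

class RoutingDecision(BaseModel):
    # Field values are illustrative; the real supervisor may define others
    intent: Literal["question", "implementation", "debugging", "ambiguous"]
    next_agent: Literal["explore", "oracle", "coder", "FINISH"]
    delegation_notes: str = Field(description="Context handed to the next agent")

# The supervisor would bind this schema to the LLM, roughly:
#   router = llm.with_structured_output(RoutingDecision)
decision = RoutingDecision(
    intent="implementation",
    next_agent="explore",
    delegation_notes="Find existing API endpoint patterns",
)
```

Because the schema is typed, a malformed routing answer fails validation instead of silently derailing the loop.
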

### Execution Model

| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| `loki --agent sisyphus` | `python -m sisyphus_langchain.cli` | CLI entry point |
| REPL mode | `cli.py` → `repl()` | Interactive loop with thread persistence |
| One-shot mode | `cli.py` → `run_query()` | Single query, print result, exit |
| Streaming output | `graph.stream()` | LangGraph supports per-node streaming |
| `inject_spawn_instructions` | (always on) | System prompts are always included |
| `inject_todo_instructions` | (always on) | Todo instructions could be added to prompts |

## How the Execution Flow Works

### 1. User sends a message

```python
graph.invoke({"messages": [HumanMessage("Add a health check endpoint")]})
```

### 2. Supervisor classifies intent

The supervisor LLM reads the message and produces a `RoutingDecision`:

```json
{
  "intent": "implementation",
  "next_agent": "explore",
  "delegation_notes": "Find existing API endpoint patterns, route structure, and health check conventions"
}
```

### 3. Supervisor routes via Command

```python
return Command(goto="explore", update={"intent": "implementation", "iteration_count": 1})
```

### 4. Explore agent runs

- Receives the full message history (including the user's request)
- Calls read-only tools (`search_content`, `search_files`, `read_file`)
- Returns findings in messages

### 5. Control returns to supervisor

The graph edge `explore → supervisor` fires automatically.

### 6. Supervisor reviews and routes again

Now it has explore's findings. It routes to coder with context:

```json
{
  "intent": "implementation",
  "next_agent": "coder",
  "delegation_notes": "Implement health check endpoint following patterns found in src/routes/"
}
```

### 7. Coder implements

- Reads explore's findings from the message history
- Writes files via the `write_file` tool
- Runs `verify_build` to check compilation

### 8. Supervisor verifies and finishes

```json
{
  "intent": "implementation",
  "next_agent": "FINISH",
  "delegation_notes": "Added /health endpoint in src/routes/health.py. Build passes."
}
```

## Key Differences from Loki

### What LangGraph does better

1. **Declarative graph** — The topology is visible and debuggable. Loki's orchestration is emergent from the LLM's tool calls.
2. **Typed state** — `SisyphusState` is a TypedDict with reducers. Loki's state is implicit in the conversation.
3. **Checkpointing** — Built-in persistence. Loki manages sessions manually.
4. **Time-travel debugging** — Inspect any checkpoint. Loki has no equivalent.
5. **Structured routing** — `RoutingDecision` forces valid JSON. Loki relies on the LLM calling the right tool.

### What Loki does better

1. **True parallelism** — `agent__spawn` runs multiple agents concurrently in separate threads. This LangGraph implementation is sequential (see [Parallel Execution](#parallel-execution) for how to add it).
2. **Agent isolation** — Each Loki agent has its own session, tools, and config. LangGraph nodes share state.
3. **Teammate messaging** — Loki agents can send messages to siblings. LangGraph nodes communicate only through shared state.
4. **Dynamic tool compilation** — Loki compiles bash/python/typescript tools at startup. LangChain tools are statically defined.
5. **Escalation protocol** — Loki's child-to-parent escalation is sophisticated. LangGraph's `interrupt()` is simpler but less structured.
6. **Task queues with dependencies** — Loki's `agent__task_create` supports dependency DAGs. LangGraph's routing is simpler (hub-and-spoke).

## Running It

### Prerequisites

```bash
# Python 3.11+
python --version

# Set your API key
export OPENAI_API_KEY="sk-..."
```

### Install

```bash
cd examples/langchain-sisyphus

# With pip
pip install -e .

# Or with uv (recommended)
uv pip install -e .
```

### Usage

```bash
# Interactive REPL (like `loki --agent sisyphus`)
sisyphus

# One-shot query
sisyphus "Find all TODO comments in the codebase"

# With custom models (cost optimization)
sisyphus --explore-model gpt-4o-mini --coder-model gpt-4o "Add input validation to the API"

# Programmatic usage
python -c "
from sisyphus_langchain import build_graph
from langchain_core.messages import HumanMessage

graph = build_graph()
result = graph.invoke({
    'messages': [HumanMessage('What patterns does this codebase use?')],
    'intent': 'ambiguous',
    'next_agent': '',
    'iteration_count': 0,
    'todos': [],
    'agent_outputs': {},
    'final_output': '',
    'project_dir': '.',
}, config={'configurable': {'thread_id': 'demo'}, 'recursion_limit': 50})
print(result['final_output'])
"
```

### Using Anthropic Models

Replace `ChatOpenAI` with `ChatAnthropic` in the agent factories:

```python
from langchain_anthropic import ChatAnthropic

# In agents/oracle.py:
llm = ChatAnthropic(model="claude-sonnet-4-20250514", temperature=0.2).bind_tools(ORACLE_TOOLS)
```

## Deployment

### Option 1: Standalone Script (Simplest)

Just run the CLI directly. No infrastructure needed.

```bash
sisyphus "Add a health check endpoint"
```

### Option 2: FastAPI Server

```python
# server.py
from fastapi import FastAPI
from langserve import add_routes
from sisyphus_langchain import build_graph

app = FastAPI(title="Sisyphus API")
graph = build_graph()
add_routes(app, graph, path="/agent")

# Run:  uvicorn server:app --host 0.0.0.0 --port 8000
# Call: POST http://localhost:8000/agent/invoke
```

### Option 3: LangGraph Platform (Production)

Create a `langgraph.json` at the project root:

```json
{
  "graphs": {
    "sisyphus": "./sisyphus_langchain/graph.py:build_graph"
  },
  "dependencies": ["./sisyphus_langchain"],
  "env": ".env"
}
```

Then deploy:

```bash
pip install langgraph-cli
langgraph deploy
```

This gives you:

- Durable checkpointing (PostgreSQL)
- Background runs
- Streaming API
- Zero-downtime deployments
- Built-in observability

### Option 4: Docker

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install -e .
CMD ["sisyphus"]
```

```bash
docker build -t sisyphus .
docker run -it -e OPENAI_API_KEY=$OPENAI_API_KEY sisyphus
```

## Parallel Execution

This implementation routes sequentially for simplicity. To add Loki-style parallel agent execution, use LangGraph's `Send()` API:

```python
from langchain_core.messages import HumanMessage
from langgraph.types import Send

def supervisor_node(state):
    # Fan out to multiple explore agents in parallel
    # (like Loki's agent__spawn called multiple times)
    return [
        Send("explore", {
            **state,
            "messages": state["messages"] + [
                HumanMessage("Find existing API endpoint patterns")
            ],
        }),
        Send("explore", {
            **state,
            "messages": state["messages"] + [
                HumanMessage("Find data models and database patterns")
            ],
        }),
    ]
```

This is equivalent to Loki's pattern of spawning multiple explore agents:

```
agent__spawn --agent explore --prompt "Find API patterns"
agent__spawn --agent explore --prompt "Find database patterns"
agent__collect --id <id1>
agent__collect --id <id2>
```

## Adding Human-in-the-Loop

To replicate Loki's `user__ask` / `user__confirm` tools, use LangGraph's `interrupt()`:

```python
from langgraph.types import interrupt

def supervisor_node(state):
    # Pause and ask the user (like Loki's user__ask)
    answer = interrupt({
        "question": "How should we structure the authentication?",
        "options": [
            "JWT with httpOnly cookies (Recommended)",
            "Session-based with Redis",
            "OAuth2 with external provider",
        ],
    })
    # `answer` contains the user's selection when the graph resumes
```

## Project Structure

```
examples/langchain-sisyphus/
├── pyproject.toml              # Dependencies & build config
├── README.md                   # This file
└── sisyphus_langchain/
    ├── __init__.py             # Package entry point
    ├── cli.py                  # CLI (REPL + one-shot mode)
    ├── graph.py                # Graph assembly (wires nodes + edges)
    ├── state.py                # Shared state schema (TypedDict)
    ├── agents/
    │   ├── __init__.py
    │   ├── supervisor.py       # Sisyphus orchestrator (intent → routing)
    │   ├── explore.py          # Read-only codebase researcher
    │   ├── oracle.py           # Architecture/debugging advisor
    │   └── coder.py            # Implementation worker
    └── tools/
        ├── __init__.py
        ├── filesystem.py       # File read/write/search/glob tools
        └── project.py          # Project detection, build, test tools
```

### File-to-Loki Mapping

| This Project | Loki Equivalent |
|---|---|
| `state.py` | Session context + todo state (implicit in Loki) |
| `graph.py` | `src/supervisor/mod.rs` (runtime orchestration) |
| `cli.py` | `src/main.rs` (CLI entry point) |
| `agents/supervisor.py` | `assets/agents/sisyphus/config.yaml` |
| `agents/explore.py` | `assets/agents/explore/config.yaml` + `tools.sh` |
| `agents/oracle.py` | `assets/agents/oracle/config.yaml` + `tools.sh` |
| `agents/coder.py` | `assets/agents/coder/config.yaml` + `tools.sh` |
| `tools/filesystem.py` | `assets/functions/tools/fs_*.sh` |
| `tools/project.py` | `assets/agents/.shared/utils.sh` + `sisyphus/tools.sh` |

## Further Reading

- [LangGraph Documentation](https://docs.langchain.com/langgraph/)
- [LangGraph Multi-Agent Tutorial](https://docs.langchain.com/langgraph/how-tos/multi-agent-systems)
- [Loki Agents Documentation](../../docs/AGENTS.md)
- [Loki Sisyphus README](../../assets/agents/sisyphus/README.md)
- [LangGraph Supervisor Library](https://github.com/langchain-ai/langgraph-supervisor-py)