
# Sisyphus in LangChain/LangGraph

A faithful recreation of Loki's Sisyphus agent using LangGraph — LangChain's framework for stateful, multi-agent workflows.

This project exists to help you understand LangChain/LangGraph by mapping every concept to its Loki equivalent.

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                     SUPERVISOR NODE                          │
│  Intent classification → Routing decision → Command(goto=)   │
│                                                              │
│  Loki equivalent: sisyphus/config.yaml                       │
│  (agent__spawn → Command, agent__collect → graph edge)       │
└──────────┬──────────────┬──────────────┬────────────────────┘
           │              │              │
           ▼              ▼              ▼
    ┌────────────┐ ┌────────────┐ ┌────────────┐
    │  EXPLORE   │ │   ORACLE   │ │   CODER    │
    │ (research) │ │  (advise)  │ │  (build)   │
    │            │ │            │ │            │
    │ read-only  │ │ read-only  │ │ read+write │
    │ tools      │ │ tools      │ │ tools      │
    └─────┬──────┘ └─────┬──────┘ └─────┬──────┘
          │              │              │
          └──────────────┼──────────────┘
                         │
                  back to supervisor
```

## Concept Map: Loki → LangGraph

This is the key reference. Every row maps a Loki concept to its LangGraph equivalent.

### Core Architecture

| Loki Concept | LangGraph Equivalent | Where in Code |
|---|---|---|
| Agent config (`config.yaml`) | Node function + system prompt | `agents/explore.py`, etc. |
| Agent instructions | System prompt string | `EXPLORE_SYSTEM_PROMPT`, etc. |
| Agent tools (`tools.sh`) | `@tool`-decorated Python functions | `tools/filesystem.py`, `tools/project.py` |
| Agent session (chat loop) | Graph state + message list | `state.py` → `SisyphusState.messages` |
| `agent__spawn --agent X` | `Command(goto="X")` | `agents/supervisor.py` |
| `agent__collect --id` | Graph edge (implicit — workers return to supervisor) | `graph.py` → `add_edge("explore", "supervisor")` |
| `agent__check` (non-blocking) | Not needed (graph handles scheduling) | — |
| `agent__cancel` | Not needed (graph handles lifecycle) | — |
| `can_spawn_agents: true` | Node has routing logic (supervisor) | `agents/supervisor.py` |
| `max_concurrent_agents: 4` | `Send()` API for parallel fan-out | See Parallel Execution |
| `max_agent_depth: 3` | `recursion_limit` in config | `cli.py` → `recursion_limit: 50` |
| `summarization_threshold` | Manual truncation in supervisor | `supervisor.py` → `_summarize_outputs()` |

### Tool System

| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| `tools.sh` with `@cmd` annotations | `@tool` decorator | Loki compiles bash annotations to JSON schema; LangChain generates the schema from the Python function signature + docstring |
| `@option --pattern!` (required arg) | Function parameter without default | `def search_content(pattern: str)` |
| `@option --lines` (optional arg) | Parameter with default | `def read_file(path: str, limit: int = 200)` |
| `@env LLM_OUTPUT=/dev/stdout` | Return value | LangChain tools return strings; Loki tools write to `$LLM_OUTPUT` |
| `@describe` | Docstring | The tool's docstring becomes the description the LLM sees |
| Global tools (`fs_read.sh`, etc.) | Shared tool imports | Both agents import from `tools/filesystem.py` |
| Agent-specific tools | Per-node tool binding | `llm.bind_tools(EXPLORE_TOOLS)` vs `llm.bind_tools(CODER_TOOLS)` |
| `.shared/utils.sh` | `tools/project.py` | Shared project detection utilities |
| `detect_project()` heuristic | `detect_project()` in Python | Same logic: check `Cargo.toml` → `go.mod` → `package.json` → etc. |
| LLM fallback for unknown projects | (omitted) | The agents themselves can reason about unknown project types |
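
To make the first row of this mapping concrete: LangChain's `@tool` decorator derives a tool schema from the function signature and docstring, much as Loki compiles `@cmd`/`@option` annotations. Here is a simplified stand-in using only `inspect` (no LangChain), showing how required vs. optional parameters fall out of defaults:

```python
import inspect
from typing import get_type_hints

def read_file(path: str, limit: int = 200) -> str:
    """Read a file and return up to `limit` lines."""
    ...

def tool_schema(fn):
    """Derive a JSON-schema-like tool description from a function's
    signature and docstring — a rough sketch of what @tool does."""
    hints = get_type_hints(fn)
    props, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        props[name] = {"type": hints.get(name, str).__name__}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default == required, like Loki's `--pattern!`
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "parameters": {"type": "object", "properties": props, "required": required},
    }

schema = tool_schema(read_file)
```

With this sketch, `schema["parameters"]["required"]` contains only `path`, mirroring the `--pattern!` vs. `--lines` distinction in the table.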

### State & Memory

| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| Agent session (conversation history) | `SisyphusState.messages` | `Annotated[list, add_messages]` — the reducer appends instead of replacing |
| `agent_session: temp` | `MemorySaver` checkpointer | Loki's temp sessions are ephemeral; `MemorySaver` is in-memory (lost on restart) |
| Per-agent isolation | Per-node system prompt + tools | In Loki, agents have separate sessions; in LangGraph they share `messages` but have different system prompts |
| `{{project_dir}}` variable | `SisyphusState.project_dir` | Loki interpolates variables into prompts; LangGraph stores them in state |
| `{{__tools__}}` injection | `llm.bind_tools()` | Loki injects tool descriptions into the prompt; LangChain attaches them to the API call |
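
A minimal sketch of the state schema and reducer behavior described above. The `add_messages` here is a simplified stand-in for LangGraph's reducer (the real one also handles message IDs); the point is that a node returns only a delta and the reducer merges it:

```python
from typing import Annotated, TypedDict

def add_messages(existing: list, new: list) -> list:
    # Simplified stand-in for langgraph's add_messages reducer:
    # node updates are appended to the history rather than replacing it.
    return existing + new

class SisyphusState(TypedDict):
    messages: Annotated[list, add_messages]
    project_dir: str

# A node returns only a delta; the graph runs it through the reducer.
state: SisyphusState = {"messages": [("user", "Add a health check")], "project_dir": "."}
delta = {"messages": [("ai", "Routing to explore")]}
state["messages"] = add_messages(state["messages"], delta["messages"])
```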

### Orchestration

| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| Intent classification table | `RoutingDecision` structured output | Loki does this in free text; LangGraph forces typed JSON |
| Oracle triggers ("How should I...") | Supervisor prompt + structured output | Same trigger phrases, enforced via the system prompt |
| Coder delegation format | Supervisor builds `HumanMessage` | The structured prompt (Goal/Reference Files/Conventions/Constraints) |
| `agent__spawn` (parallel) | `Send()` API | Dynamic fan-out to multiple nodes |
| Todo system (`todo__init`, etc.) | `SisyphusState.todos` | State field with a merge reducer |
| `auto_continue: true` | Supervisor loop (iteration counter) | Supervisor re-routes until `FINISH` or max iterations |
| `max_auto_continues: 25` | `MAX_ITERATIONS = 15` | Safety valve to prevent infinite loops |
| `user__ask` / `user__confirm` | `interrupt()` API | Pauses the graph, surfaces the question to the caller, resumes with the answer |
| Escalation (child → parent → user) | `interrupt()` in any node | Any node can pause; the caller handles the interaction |
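
The `RoutingDecision` in the first row can be pictured as a small typed schema. In the real code it would be a Pydantic model handed to `llm.with_structured_output(...)`; this dataclass sketch shows the shape and the validation that typed routing buys over free-text tool calls:

```python
from dataclasses import dataclass

VALID_ROUTES = {"explore", "oracle", "coder", "FINISH"}

@dataclass
class RoutingDecision:
    # Shape of the supervisor's structured output. A Pydantic version of
    # this passed to llm.with_structured_output() forces the LLM to
    # return valid JSON with exactly these fields.
    intent: str
    next_agent: str
    delegation_notes: str

    def __post_init__(self):
        if self.next_agent not in VALID_ROUTES:
            raise ValueError(f"invalid route: {self.next_agent}")

decision = RoutingDecision(
    intent="implementation",
    next_agent="explore",
    delegation_notes="Find existing API endpoint patterns",
)
```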

### Execution Model

| Loki Concept | LangGraph Equivalent | Notes |
|---|---|---|
| `loki --agent sisyphus` | `python -m sisyphus_langchain.cli` | CLI entry point |
| REPL mode | `cli.py` → `repl()` | Interactive loop with thread persistence |
| One-shot mode | `cli.py` → `run_query()` | Single query, print result, exit |
| Streaming output | `graph.stream()` | LangGraph supports per-node streaming |
| `inject_spawn_instructions` (always on) | System prompts are always included | — |
| `inject_todo_instructions` (always on) | Todo instructions could be added to prompts | — |

## How the Execution Flow Works

### 1. User sends a message

```python
graph.invoke({"messages": [HumanMessage("Add a health check endpoint")]})
```

### 2. Supervisor classifies intent

The supervisor LLM reads the message and produces a `RoutingDecision`:

```json
{
  "intent": "implementation",
  "next_agent": "explore",
  "delegation_notes": "Find existing API endpoint patterns, route structure, and health check conventions"
}
```

### 3. Supervisor routes via Command

```python
return Command(goto="explore", update={"intent": "implementation", "iteration_count": 1})
```

### 4. Explore agent runs

- Receives the full message history (including the user's request)
- Calls read-only tools (`search_content`, `search_files`, `read_file`)
- Returns findings in `messages`

### 5. Control returns to supervisor

The graph edge `explore → supervisor` fires automatically.

### 6. Supervisor reviews and routes again

Now it has explore's findings. It routes to coder with context:

```json
{
  "intent": "implementation",
  "next_agent": "coder",
  "delegation_notes": "Implement health check endpoint following patterns found in src/routes/"
}
```

### 7. Coder implements

- Reads explore's findings from the message history
- Writes files via the `write_file` tool
- Runs `verify_build` to check compilation

### 8. Supervisor verifies and finishes

```json
{
  "intent": "implementation",
  "next_agent": "FINISH",
  "delegation_notes": "Added /health endpoint in src/routes/health.py. Build passes."
}
```
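
Stripped of the LLM calls, this whole flow is a plain loop: the supervisor inspects state, picks a worker, the worker mutates state, and control returns to the supervisor until it decides `FINISH`. A hand-rolled sketch with stub agents and hypothetical state keys (no LangGraph):

```python
def supervisor(state):
    # Classify what is still missing and route accordingly.
    if not state["findings"]:
        return "explore"
    if not state["implemented"]:
        return "coder"
    return "FINISH"

def explore(state):
    state["findings"] = "endpoint patterns in src/routes/"

def coder(state):
    state["implemented"] = True

state = {"findings": "", "implemented": False}
workers = {"explore": explore, "coder": coder}
route = supervisor(state)
while route != "FINISH":
    workers[route](state)      # worker node runs
    route = supervisor(state)  # implicit edge back to the supervisor
```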

## Key Differences from Loki

### What LangGraph does better

1. **Declarative graph** — The topology is visible and debuggable. Loki's orchestration is emergent from the LLM's tool calls.
2. **Typed state** — `SisyphusState` is a `TypedDict` with reducers. Loki's state is implicit in the conversation.
3. **Checkpointing** — Built-in persistence. Loki manages sessions manually.
4. **Time-travel debugging** — Inspect any checkpoint. Loki has no equivalent.
5. **Structured routing** — `RoutingDecision` forces valid JSON. Loki relies on the LLM calling the right tool.

### What Loki does better

1. **True parallelism** — `agent__spawn` runs multiple agents concurrently in separate threads. This LangGraph implementation is sequential (see Parallel Execution for how to add it).
2. **Agent isolation** — Each Loki agent has its own session, tools, and config. LangGraph nodes share state.
3. **Teammate messaging** — Loki agents can send messages to siblings. LangGraph nodes communicate only through shared state.
4. **Dynamic tool compilation** — Loki compiles bash/python/typescript tools at startup. LangChain tools are statically defined.
5. **Escalation protocol** — Loki's child-to-parent escalation is sophisticated. LangGraph's `interrupt()` is simpler but less structured.
6. **Task queues with dependencies** — Loki's `agent__task_create` supports dependency DAGs. LangGraph's routing is simpler (hub-and-spoke).
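
For a feel of what item 6 means, a dependency DAG like Loki's can be scheduled with Python's standard-library `graphlib`; the task names here are hypothetical:

```python
from graphlib import TopologicalSorter

# Hypothetical dependency DAG like one built with agent__task_create:
# "build" waits on "explore", and "test" waits on "build".
deps = {"build": {"explore"}, "test": {"build"}}

# Tasks become ready in dependency order; in Loki each would be
# dispatched to an agent once its prerequisites complete.
order = list(TopologicalSorter(deps).static_order())
```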

## Running It

### Prerequisites

```bash
# Python 3.11+
python --version

# Set your API key
export OPENAI_API_KEY="sk-..."
```

### Install

```bash
cd examples/langchain-sisyphus

# With pip
pip install -e .

# Or with uv (recommended)
uv pip install -e .
```

### Usage

```bash
# Interactive REPL (like `loki --agent sisyphus`)
sisyphus

# One-shot query
sisyphus "Find all TODO comments in the codebase"

# With custom models (cost optimization)
sisyphus --explore-model gpt-4o-mini --coder-model gpt-4o "Add input validation to the API"

# Programmatic usage
python -c "
from sisyphus_langchain import build_graph
from langchain_core.messages import HumanMessage

graph = build_graph()
result = graph.invoke({
    'messages': [HumanMessage('What patterns does this codebase use?')],
    'intent': 'ambiguous',
    'next_agent': '',
    'iteration_count': 0,
    'todos': [],
    'agent_outputs': {},
    'final_output': '',
    'project_dir': '.',
}, config={'configurable': {'thread_id': 'demo'}, 'recursion_limit': 50})
print(result['final_output'])
"
```

### Using Anthropic Models

Replace `ChatOpenAI` with `ChatAnthropic` in the agent factories:

```python
from langchain_anthropic import ChatAnthropic

# In agents/oracle.py:
llm = ChatAnthropic(model="claude-sonnet-4-20250514", temperature=0.2).bind_tools(ORACLE_TOOLS)
```

## Deployment

### Option 1: Standalone Script (Simplest)

Just run the CLI directly. No infrastructure needed.

```bash
sisyphus "Add a health check endpoint"
```

### Option 2: FastAPI Server

```python
# server.py
from fastapi import FastAPI
from langserve import add_routes
from sisyphus_langchain import build_graph

app = FastAPI(title="Sisyphus API")
graph = build_graph()
add_routes(app, graph, path="/agent")

# Run: uvicorn server:app --host 0.0.0.0 --port 8000
# Call: POST http://localhost:8000/agent/invoke
```

### Option 3: LangGraph Platform (Production)

Create a `langgraph.json` at the project root:

```json
{
  "graphs": {
    "sisyphus": "./sisyphus_langchain/graph.py:build_graph"
  },
  "dependencies": ["./sisyphus_langchain"],
  "env": ".env"
}
```

Then deploy:

```bash
pip install langgraph-cli
langgraph deploy
```

This gives you:

- Durable checkpointing (PostgreSQL)
- Background runs
- Streaming API
- Zero-downtime deployments
- Built-in observability

### Option 4: Docker

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install -e .
CMD ["sisyphus"]
```

```bash
docker build -t sisyphus .
docker run -it -e OPENAI_API_KEY=$OPENAI_API_KEY sisyphus
```

## Parallel Execution

This implementation routes sequentially for simplicity. To add Loki-style parallel agent execution, use LangGraph's `Send()` API:

```python
from langchain_core.messages import HumanMessage
from langgraph.types import Send

def supervisor_node(state):
    # Fan out to multiple explore agents in parallel
    # (like Loki's agent__spawn called multiple times)
    return [
        Send("explore", {
            **state,
            "messages": state["messages"] + [
                HumanMessage("Find existing API endpoint patterns")
            ],
        }),
        Send("explore", {
            **state,
            "messages": state["messages"] + [
                HumanMessage("Find data models and database patterns")
            ],
        }),
    ]
```

This is equivalent to Loki's pattern of spawning multiple explore agents:

```bash
agent__spawn --agent explore --prompt "Find API patterns"
agent__spawn --agent explore --prompt "Find database patterns"
agent__collect --id <id1>
agent__collect --id <id2>
```

## Adding Human-in-the-Loop

To replicate Loki's `user__ask` / `user__confirm` tools, use LangGraph's `interrupt()`:

```python
from langgraph.types import interrupt

def supervisor_node(state):
    # Pause and ask the user (like Loki's user__ask)
    answer = interrupt({
        "question": "How should we structure the authentication?",
        "options": [
            "JWT with httpOnly cookies (Recommended)",
            "Session-based with Redis",
            "OAuth2 with external provider",
        ],
    })
    # `answer` contains the user's selection when the graph resumes
```

## Project Structure

```
examples/langchain-sisyphus/
├── pyproject.toml                          # Dependencies & build config
├── README.md                               # This file
└── sisyphus_langchain/
    ├── __init__.py                         # Package entry point
    ├── cli.py                              # CLI (REPL + one-shot mode)
    ├── graph.py                            # Graph assembly (wires nodes + edges)
    ├── state.py                            # Shared state schema (TypedDict)
    ├── agents/
    │   ├── __init__.py
    │   ├── supervisor.py                   # Sisyphus orchestrator (intent → routing)
    │   ├── explore.py                      # Read-only codebase researcher
    │   ├── oracle.py                       # Architecture/debugging advisor
    │   └── coder.py                        # Implementation worker
    └── tools/
        ├── __init__.py
        ├── filesystem.py                   # File read/write/search/glob tools
        └── project.py                      # Project detection, build, test tools
```

## File-to-Loki Mapping

| This Project | Loki Equivalent |
|---|---|
| `state.py` | Session context + todo state (implicit in Loki) |
| `graph.py` | `src/supervisor/mod.rs` (runtime orchestration) |
| `cli.py` | `src/main.rs` (CLI entry point) |
| `agents/supervisor.py` | `assets/agents/sisyphus/config.yaml` |
| `agents/explore.py` | `assets/agents/explore/config.yaml` + `tools.sh` |
| `agents/oracle.py` | `assets/agents/oracle/config.yaml` + `tools.sh` |
| `agents/coder.py` | `assets/agents/coder/config.yaml` + `tools.sh` |
| `tools/filesystem.py` | `assets/functions/tools/fs_*.sh` |
| `tools/project.py` | `assets/agents/.shared/utils.sh` + `sisyphus/tools.sh` |

## Further Reading