# deep-research A deep web research agent, built as a Loki graph agent. It plans an investigation, decomposes it into sub-questions researched in parallel, grounds the work in a local knowledge corpus, vets the credibility of cited sources, runs a reflexion self-critique loop to revise weak findings, delegates the final write-up to a focused sub-agent, checks that the cited sources are reachable, and gates the result behind human approval. Unlike a regular agent (which takes a goal and improvises the steps), this agent runs a fixed graph: every request goes through the same `plan -> parallel research -> vet -> critique -> synthesize -> verify -> approve` pipeline. This agent is also the **canonical reference for the Loki graph system**: it exercises every node type (`script`, `llm`, `rag`, `map`, `agent`, `input`, `approval`, `end`) and both static fan-out and dynamic `map` fan-out. If you are learning how to build a graph agent, this is the file to read alongside the [Graph-Agents wiki](https://github.com/Dark-Alex-17/loki/wiki/Graph-Agents). ## Workflow 17 nodes. `->` is the static route; a script node can also route dynamically via `_next`. The `▶▶` line is a parallel super-step — those branches run concurrently: ``` parse_request (script) -> bootstrap_research (or -> ask_topic if no topic) ask_topic (input) -> bootstrap_research bootstrap_research (script) -> [plan, knowledge_lookup] ▶▶ parallel plan (llm + output_schema) -> research_each_question knowledge_lookup (rag) -> research_each_question research_each_question (map) -> combine_findings (spawns one branch per question) └─ research_one_question (llm) (atomic; runs N×, joins at map) combine_findings (script) -> vet_sources vet_sources (llm + custom tool) -> critique critique (llm) -> reflexion_gate reflexion_gate (script) -> synthesize (or -> research_each_question: reflexion loop) synthesize (agent: report-writer) -> verify_sources verify_sources (script) -> approve approve (approval) -> end_accepted ("accept") -> end_rejected ("reject") -> incorporate_feedback (any free-form answer) incorporate_feedback (script) -> research_each_question (the human-feedback loop) ``` ### Node-type breakdown | Type | Nodes | |---|---| | `script` (Python) | `parse_request`, `bootstrap_research`, `combine_findings`, `reflexion_gate`, `verify_sources`, `incorporate_feedback` | | `llm` (tools: `[]`) | `plan`, `critique` | | `llm` (with tool whitelist) | `research_one_question`, `vet_sources` | | `rag` | `knowledge_lookup` — local corpus retrieval | | `map` | `research_each_question` — dynamic fan-out per sub-question | | `agent` | `synthesize` — spawns the `report-writer` sub-agent | | `input` | `ask_topic` | | `approval` | `approve` | | `end` | `end_accepted`, `end_rejected` | ## Parallel execution The graph has two parallel super-steps where Loki's BSP scheduler runs branches concurrently. **1. Context loading (`plan` ‖ `knowledge_lookup`)** — after `bootstrap_research`, the LLM planner (which decomposes the topic into sub-questions) and the RAG retrieval over the local `knowledge/` corpus run side by side. They write disjoint state keys (`plan` writes `research_plan` and `questions`; `knowledge_lookup` writes `local_context` and `local_sources`) so no reducer is needed. **2. Per-question research (`research_each_question` map)** — the plan emits a `questions` array (3-5 entries, enforced by its `output_schema`). The `map` node spawns one parallel branch per question (`max_concurrency: 3`). Each branch is an isolated `research_one_question` LLM invocation with web tools, instructed to investigate exactly its assigned question. Outputs collect into `question_findings` in input order, then `combine_findings` joins them into a single `findings` Markdown document for downstream nodes. `settings.max_concurrency: 4` is the graph-wide cap; the per-`map` override (`max_concurrency: 3` on `research_each_question`) is deliberately lower to leave headroom for the planner's tool calls running alongside RAG. ## Local knowledge corpus `knowledge_lookup` is a `rag` node — it runs hybrid (vector + keyword) retrieval over every file in `knowledge/`. The directory ships with a small `research-style-notes.md` so the RAG node has something to retrieve against on a clean install; drop your own Markdown notes, PDFs, or text files into `knowledge/` to bias the research toward your local context. The knowledge base is built once, at agent-load time, into `~/.config/loki/agents/deep-research/knowledge_lookup.yaml`. Because the node fully specifies its build config (`embedding_model`, `chunk_size`, `chunk_overlap`), the build is non-interactive. Delete that cached file after adding or changing knowledge to force a rebuild. ## Sub-agent: report-writer The `synthesize` node is an `agent` node that spawns the `report-writer` sub-agent (`assets/agents/report-writer/`). This is the agent-as-tool pattern: the orchestrating graph delegates the writing phase to a focused sub-agent dedicated to coherent prose, while the research phase uses different (typically cheaper) LLM nodes for fast-and-many-question investigation. The `report-writer` sub-agent has no tools — it cannot access the web, cannot search, and cannot invent facts. It reads only the findings it is given and produces a final Markdown report preserving every inline citation. See `assets/agents/report-writer/README.md` for details. ## Tools and tool scoping This agent demonstrates Loki's three tool sources and how an `llm` node's `tools:` whitelist scopes them per node. The agent's full tool universe, declared in `graph.yaml`: - **Global tools** (`global_tools`): `web_search_loki`, `fetch_url_via_curl`, `search_arxiv` - Loki's built-in tool scripts. - **MCP server** (`mcp_servers`): `ddg-search` - a DuckDuckGo web search MCP server. Referenced in a whitelist as `mcp:ddg-search`. - **Custom agent tool** (`tools.sh`): `classify_source` - a deterministic source-credibility classifier shipped with this agent. No node receives all of these. Each `llm` node's `tools:` whitelist narrows the universe to exactly what that step needs: | Node | `tools:` whitelist | Draws from | |---|---|---| | `plan`, `critique` | `[]` | nothing - pure reasoning | | `research_one_question` | `web_search_loki`, `fetch_url_via_curl`, `search_arxiv`, `mcp:ddg-search` | global tools + MCP | | `vet_sources` | `classify_source` | the custom tool only | `research_one_question` (each parallel branch of the map) can search and fetch but cannot classify sources; `vet_sources` can classify sources but cannot touch the web. That separation is the point of the `tools:` whitelist: a node gets only the tools its job calls for, never the agent's full set. The `classify_source` custom tool (`tools.sh`) takes a URL and returns a credibility tier (government, academic, preprint, organization, unverified) derived from the host and top-level domain. It is deterministic - exactly the kind of logic a tool should own rather than the LLM guessing. Web search may require API-key configuration; see the [Tools](https://github.com/Dark-Alex-17/loki/wiki/Tools) docs. `fetch_url_via_curl`, `search_arxiv`, and `classify_source` work without a key. ## Setup `research_one_question` (each parallel branch of the `map`) uses the `ddg-search` MCP server via `mcp:ddg-search`. It is one of Loki's default MCP servers; make sure it is registered in `~/.config/loki/mcp.json` (run `loki --install mcp_config` to restore the default template if it is missing). If `ddg-search` is unavailable, the branches still have their global web-search tools to fall back on. The `synthesize` node spawns the `report-writer` sub-agent. Both agents ship with `loki agents install`; if you install one manually, install both so the agent reference resolves. ## Reflexion The agent has two loops, both built with script nodes that route via `_next`. The engine allows back-edges at runtime; the validator only rejects cycles built from static `next` / `routes` edges, so script `_next` loops are always allowed. **Automated reflexion loop.** After the parallel research map and `vet_sources`, the `critique` node reviews the merged findings against the research plan and the source credibility assessment, and emits `VERDICT: PASS` or `VERDICT: REVISE` with specific feedback. `reflexion_gate.py` then: - `PASS` -> continue to `synthesize`. - `REVISE`, budget remaining -> loop back to `research_each_question`, with the critique injected as `research_feedback` so every parallel branch sees it on the retry. - `REVISE`, budget spent -> continue to `synthesize` anyway (the human approval step is the final backstop). The budget is `MAX_REFLEXION_REVISIONS` in `reflexion_gate.py` (default 2, so the research map runs at most 3 times per pass). **Human-feedback loop.** At `approve` the user answers `accept`, `reject`, or types their own feedback. A free-form answer routes via the approval node's `on_other` to `incorporate_feedback.py`, which folds that text into `research_feedback` and loops back to `research_each_question` for another parallel pass. `settings.max_loop_iterations` (40) is the engine's infinite-loop backstop: it caps the total visits to any single node. ## Running ```sh loki agents install # ships deep-research loki -a deep-research "How does HTTP/3 differ from HTTP/2?" loki -a deep-research "Recent advances in solid-state batteries" loki -a deep-research # no prompt -> triggers ask_topic ``` ## Anti-hallucination - `research_one_question` (each map branch) is instructed to back every claim with a real retrieved source and never to fabricate URLs, titles, or DOIs. - `vet_sources` classifies every cited source so weak sources are visible to the critique step. - `critique` independently reviews the merged findings and sends weak or uncited work back for another parallel research pass. - `synthesize` (the `report-writer` sub-agent) is grounded: it may use only the gathered findings and must keep each claim's inline source. It has no tools and cannot browse the web. - `verify_sources` probes every cited URL / DOI with an HTTP HEAD request and reports which are unreachable, so the human reviewer sees broken citations before approving. ## Customizing - **Loop budget.** `MAX_REFLEXION_REVISIONS` in `reflexion_gate.py`. - **Map concurrency.** The `research_each_question` node's `max_concurrency: 3` caps simultaneous web-research branches. Raise to investigate more questions in parallel; lower to be gentle on rate-limited providers. - **Per-node model.** Add `model: anthropic:...` to any `llm` node. Cheap models work well for `plan` / `critique` / `vet_sources`; the heavy intelligence is needed in `research_one_question` and the `report-writer` sub-agent. - **Tool scope.** Narrow the `research_one_question` node's `tools:` list to constrain where each branch looks (for example, drop `web_search_loki` and `mcp:ddg-search` to force arXiv-only research). - **Local knowledge.** Drop files into `knowledge/` to bias every research branch toward your local context (see the *Local knowledge corpus* section above). - **Different writer.** Replace `agent: report-writer` on the `synthesize` node with the name of any other agent. The orchestrator does not care what kind of agent the writer is. - **Skip approval.** Point both `approve` routes at `end_accepted`, or wire `verify_sources` straight to an `end` node. ## Files ``` assets/agents/deep-research/ graph.yaml - agent config + 17-node workflow tools.sh - classify_source custom tool README.md - this file knowledge/ README.md - corpus-format notes research-style-notes.md - starter knowledge file (replace with your notes) scripts/ parse_request.py - _next: bootstrap_research, or ask_topic if no topic bootstrap_research.py - fan-out source: next [plan, knowledge_lookup] combine_findings.py - joins map output (question_findings) into findings reflexion_gate.py - _next: research_each_question (revise) or synthesize verify_sources.py - HTTP HEAD on cited URLs / DOIs incorporate_feedback.py - _next: research_each_question, with user feedback ``` See also `assets/agents/report-writer/` — the sub-agent the `synthesize` node spawns.