feat: created new graph-based deep-research agent
@@ -0,0 +1,274 @@
# deep-research

A deep web research agent, built as a Loki graph agent. It plans an
investigation, decomposes it into sub-questions researched in
parallel, grounds the work in a local knowledge corpus, vets the
credibility of cited sources, runs a reflexion self-critique loop to
revise weak findings, delegates the final write-up to a focused
sub-agent, checks that the cited sources are reachable, and gates the
result behind human approval.

Unlike a regular agent (which takes a goal and improvises the steps),
this agent runs a fixed graph: every request goes through the same
`plan -> parallel research -> vet -> critique -> synthesize -> verify -> approve`
pipeline.

This agent is also the **canonical reference for the Loki graph
system**: it exercises every node type (`script`, `llm`, `rag`, `map`,
`agent`, `input`, `approval`, `end`) and both static fan-out and
dynamic `map` fan-out. If you are learning how to build a graph
agent, this is the file to read alongside the
[Graph-Agents wiki](https://github.com/Dark-Alex-17/loki/wiki/Graph-Agents).

## Workflow

17 nodes. `->` is the static route; a script node can also route
dynamically via `_next`. The `▶▶` line is a parallel super-step —
those branches run concurrently:

```
parse_request          (script)                -> bootstrap_research   (or -> ask_topic if no topic)
ask_topic              (input)                 -> bootstrap_research
bootstrap_research     (script)                -> [plan, knowledge_lookup]   ▶▶ parallel
plan                   (llm + output_schema)   -> research_each_question
knowledge_lookup       (rag)                   -> research_each_question
research_each_question (map)                   -> combine_findings   (spawns one branch per question)
  └─ research_one_question (llm)                  (atomic; runs N×, joins at map)
combine_findings       (script)                -> vet_sources
vet_sources            (llm + custom tool)     -> critique
critique               (llm)                   -> reflexion_gate
reflexion_gate         (script)                -> synthesize   (or -> research_each_question: reflexion loop)
synthesize             (agent: report-writer)  -> verify_sources
verify_sources         (script)                -> approve
approve                (approval)              -> end_accepted   ("accept")
                                               -> end_rejected   ("reject")
                                               -> incorporate_feedback   (any free-form answer)
incorporate_feedback   (script)                -> research_each_question   (the human-feedback loop)
```

### Node-type breakdown

| Type | Nodes |
|---|---|
| `script` (Python) | `parse_request`, `bootstrap_research`, `combine_findings`, `reflexion_gate`, `verify_sources`, `incorporate_feedback` |
| `llm` (tools: `[]`) | `plan`, `critique` |
| `llm` (with tool whitelist) | `research_one_question`, `vet_sources` |
| `rag` | `knowledge_lookup` — local corpus retrieval |
| `map` | `research_each_question` — dynamic fan-out per sub-question |
| `agent` | `synthesize` — spawns the `report-writer` sub-agent |
| `input` | `ask_topic` |
| `approval` | `approve` |
| `end` | `end_accepted`, `end_rejected` |

## Parallel execution

The graph has two parallel super-steps where Loki's BSP scheduler runs
branches concurrently.

**1. Context loading (`plan` ‖ `knowledge_lookup`)** — after
`bootstrap_research`, the LLM planner (which decomposes the topic into
sub-questions) and the RAG retrieval over the local `knowledge/`
corpus run side by side. They write disjoint state keys (`plan` writes
`research_plan` and `questions`; `knowledge_lookup` writes
`local_context` and `local_sources`) so no reducer is needed.
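
Why disjoint keys make a reducer unnecessary can be sketched in a few lines of Python. This is an illustration only, not Loki's engine code: `merge_branch_updates` is a hypothetical helper, and the state values are made-up placeholders.

```python
def merge_branch_updates(state, branch_updates):
    """Merge the state updates of one super-step, refusing overlapping keys.

    `branch_updates` maps a branch name to the keys that branch wrote.
    When no two branches write the same key, a plain dict update is
    enough and the merge order does not matter.
    """
    merged = dict(state)
    written_by = {}
    for branch, updates in branch_updates.items():
        for key, value in updates.items():
            if key in written_by:
                raise ValueError(
                    f"{branch!r} and {written_by[key]!r} both write {key!r}"
                )
            written_by[key] = branch
            merged[key] = value
    return merged


# Placeholder values; the key names match the ones described above.
state = {"topic": "HTTP/3", "local_context": "", "local_sources": ""}
updates = {
    "plan": {"research_plan": "a short plan", "questions": ["q1", "q2", "q3"]},
    "knowledge_lookup": {"local_context": "notes", "local_sources": "notes.md"},
}
merged = merge_branch_updates(state, updates)
```

If two branches did write the same key, the engine would need a rule for combining the values; here the sketch simply raises.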

**2. Per-question research (`research_each_question` map)** — the
plan emits a `questions` array (the planner is asked for 3-5 entries;
its `output_schema` accepts 1 to 6). The `map` node spawns one parallel
branch per question, running at most three at a time
(`max_concurrency: 3`). Each branch is an isolated
`research_one_question` LLM invocation with web tools, instructed to
investigate exactly its assigned question. Outputs collect into
`question_findings` in input order, then `combine_findings` joins
them into a single `findings` Markdown document for downstream nodes.
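
The fan-out behaviour described above (bounded concurrency, results collected in input order) can be approximated with the standard library. This is a sketch under assumptions, not how Loki implements `map`: `run_map` is a hypothetical name and `research_one_question` here is a stand-in for the real LLM branch.

```python
from concurrent.futures import ThreadPoolExecutor


def research_one_question(question):
    # Stand-in for the LLM branch; the real node calls web tools.
    return f"findings for: {question}"


def run_map(items, branch, max_concurrency=3):
    """Run `branch` once per item with at most `max_concurrency` in flight.

    Executor.map yields results in the order of the inputs, regardless
    of which branch finishes first.
    """
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(branch, items))


questions = ["What is QUIC?", "How does HTTP/3 handle loss?", "Who deploys it?"]
question_findings = run_map(questions, research_one_question)
```

Because the result list lines up with `questions` index by index, a join step can pair each question with its findings without extra bookkeeping.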

`settings.max_concurrency: 4` is the graph-wide cap; the per-`map`
override (`max_concurrency: 3` on `research_each_question`) is
deliberately lower to stay gentle on rate-limited search providers,
since every branch issues its own web tool calls.

## Local knowledge corpus

`knowledge_lookup` is a `rag` node — it runs hybrid (vector + keyword)
retrieval over every file in `knowledge/`. The directory ships with a
small `research-style-notes.md` so the RAG node has something to
retrieve against on a clean install; drop your own Markdown notes,
PDFs, or text files into `knowledge/` to bias the research toward
your local context.

The knowledge base is built once, at agent-load time, into
`~/.config/loki/agents/deep-research/knowledge_lookup.yaml`. Because
the node fully specifies its build config (`embedding_model`,
`chunk_size`, `chunk_overlap`), the build is non-interactive. Delete
that cached file after adding or changing knowledge to force a
rebuild.

## Sub-agent: report-writer

The `synthesize` node is an `agent` node that spawns the
`report-writer` sub-agent (`assets/agents/report-writer/`). This is
the agent-as-tool pattern: the orchestrating graph delegates the
writing phase to a focused sub-agent dedicated to coherent prose,
while the research phase uses different (typically cheaper) LLM nodes
for fast-and-many-question investigation.

The `report-writer` sub-agent has no tools — it cannot access the
web, cannot search, and cannot invent facts. It reads only the
findings it is given and produces a final Markdown report preserving
every inline citation. See `assets/agents/report-writer/README.md`
for details.

## Tools and tool scoping

This agent demonstrates Loki's three tool sources and how an `llm`
node's `tools:` whitelist scopes them per node.

The agent's full tool universe, declared in `graph.yaml`:

- **Global tools** (`global_tools`): `web_search_loki`,
  `fetch_url_via_curl`, `search_arxiv` - Loki's built-in tool scripts.
- **MCP server** (`mcp_servers`): `ddg-search` - a DuckDuckGo web
  search MCP server. Referenced in a whitelist as `mcp:ddg-search`.
- **Custom agent tool** (`tools.sh`): `classify_source` - a
  deterministic source-credibility classifier shipped with this agent.

No node receives all of these. Each `llm` node's `tools:` whitelist
narrows the universe to exactly what that step needs:

| Node | `tools:` whitelist | Draws from |
|---|---|---|
| `plan`, `critique` | `[]` | nothing - pure reasoning |
| `research_one_question` | `web_search_loki`, `fetch_url_via_curl`, `search_arxiv`, `mcp:ddg-search` | global tools + MCP |
| `vet_sources` | `classify_source` | the custom tool only |

`research_one_question` (each parallel branch of the map) can search
and fetch but cannot classify sources; `vet_sources` can classify
sources but cannot touch the web. That separation is the point of the
`tools:` whitelist: a node gets only the tools its job calls for,
never the agent's full set.
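
The scoping rule can be pictured as a filter over the agent's tool universe. The snippet below is only a mental model, assuming a hypothetical `resolve_tools` helper and an illustrative grouping dict; it does not reflect how Loki resolves whitelists internally.

```python
# Tool names as listed in this README; the grouping dict is illustrative.
TOOL_UNIVERSE = {
    "global": ["web_search_loki", "fetch_url_via_curl", "search_arxiv"],
    "mcp": ["mcp:ddg-search"],
    "custom": ["classify_source"],
}


def resolve_tools(whitelist):
    """Return the tools a node may call, rejecting names outside the universe."""
    available = {name for names in TOOL_UNIVERSE.values() for name in names}
    unknown = [name for name in whitelist if name not in available]
    if unknown:
        raise ValueError(f"not in the agent's tool universe: {unknown}")
    return list(whitelist)


plan_tools = resolve_tools([])                    # pure reasoning
vet_tools = resolve_tools(["classify_source"])    # custom tool only
```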

The `classify_source` custom tool (`tools.sh`) takes a URL and returns
a credibility tier (government, academic, preprint, organization,
unverified) derived from the host and top-level domain. It is
deterministic - exactly the kind of logic a tool should own rather than
the LLM guessing.
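
To make "derived from the host and top-level domain" concrete, here is a minimal Python sketch of such a classifier. It is not the shipped `tools.sh`: the rule order, the `.org` rule, and the preprint host list (taken from the starter style notes) are assumptions for illustration.

```python
from urllib.parse import urlparse

# Assumed host list; the shipped tool may recognise different hosts.
PREPRINT_HOSTS = ("arxiv.org", "biorxiv.org", "medrxiv.org", "ssrn.com")


def classify_source(url):
    """Map a URL to a credibility tier using only its hostname."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if not host:
        return "unverified"  # no scheme or no host: nothing to classify
    if host.endswith((".gov", ".mil")):
        return "government"
    if host.endswith(".edu"):
        return "academic"
    if any(host == h or host.endswith("." + h) for h in PREPRINT_HOSTS):
        return "preprint"  # checked before the generic .org rule
    if host.endswith(".org"):
        return "organization"  # assumption: .org treated as organization
    return "unverified"
```

The same URL always yields the same tier, which is what lets the critique step rely on the assessment.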

Web search may require API-key configuration; see the
[Tools](https://github.com/Dark-Alex-17/loki/wiki/Tools) docs.
`fetch_url_via_curl`, `search_arxiv`, and `classify_source` work
without a key.

## Setup

`research_one_question` (each parallel branch of the `map`) uses the
`ddg-search` MCP server via `mcp:ddg-search`. It is one of Loki's
default MCP servers; make sure it is registered in
`~/.config/loki/mcp.json` (run `loki --install mcp_config` to restore
the default template if it is missing). If `ddg-search` is unavailable,
the branches still have their global web-search tools to fall back on.

The `synthesize` node spawns the `report-writer` sub-agent. Both
agents ship with `loki agents install`; if you install one manually,
install both so the agent reference resolves.

## Reflexion

The agent has two loops, both built with script nodes that route via
`_next`. The engine allows back-edges at runtime; the validator only
rejects cycles built from static `next` / `routes` edges, so script
`_next` loops are always allowed.

**Automated reflexion loop.** After the parallel research map and
`vet_sources`, the `critique` node reviews the merged findings
against the research plan and the source credibility assessment, and
emits `VERDICT: PASS` or `VERDICT: REVISE` with specific feedback.
`reflexion_gate.py` then:

- `PASS` -> continue to `synthesize`.
- `REVISE`, budget remaining -> loop back to `research_each_question`,
  with the critique injected as `research_feedback` so every parallel
  branch sees it on the retry.
- `REVISE`, budget spent -> continue to `synthesize` anyway (the human
  approval step is the final backstop).

The budget is `MAX_REFLEXION_REVISIONS` in `reflexion_gate.py`
(default 2, so the research map runs at most 3 times per pass).
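
The three outcomes above reduce to a small routing function. The following is a hedged sketch of that decision, not the contents of the shipped `reflexion_gate.py`: the state keys `critique` and `research_attempts` come from `graph.yaml`, while the verdict parsing, the feedback wording, and the treatment of an unparseable verdict as `PASS` are assumptions.

```python
import re

MAX_REFLEXION_REVISIONS = 2


def reflexion_gate(state):
    """Decide the next node from the critique verdict and the revision budget."""
    critique = state.get("critique") or ""
    attempts = int(state.get("research_attempts") or 0)

    match = re.search(r"VERDICT:\s*(PASS|REVISE)", critique, re.IGNORECASE)
    # Assumption: an unparseable verdict falls through to synthesis,
    # leaving the human approval step as the backstop.
    verdict = match.group(1).upper() if match else "PASS"

    if verdict == "REVISE" and attempts < MAX_REFLEXION_REVISIONS:
        feedback = critique.split("FEEDBACK:", 1)[-1].strip()
        return {
            "_next": "research_each_question",
            "research_attempts": attempts + 1,
            "research_feedback": "A reviewer flagged gaps:\n\n" + feedback,
        }
    # PASS, or REVISE with the budget spent.
    return {"_next": "synthesize"}
```

With a budget of 2, attempts 0 and 1 loop back and attempt 2 proceeds, which gives the three research runs mentioned above.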

**Human-feedback loop.** At `approve` the user answers `accept`,
`reject`, or types their own feedback. A free-form answer routes via
the approval node's `on_other` to `incorporate_feedback.py`, which
folds that text into `research_feedback` and loops back to
`research_each_question` for another parallel pass.

`settings.max_loop_iterations` (40) is the engine's infinite-loop
backstop: it caps the total visits to any single node.

## Running

```sh
loki agents install                                      # ships deep-research
loki -a deep-research "How does HTTP/3 differ from HTTP/2?"
loki -a deep-research "Recent advances in solid-state batteries"
loki -a deep-research                                    # no prompt -> triggers ask_topic
```

## Anti-hallucination

- `research_one_question` (each map branch) is instructed to back
  every claim with a real retrieved source and never to fabricate
  URLs, titles, or DOIs.
- `vet_sources` classifies every cited source so weak sources are
  visible to the critique step.
- `critique` independently reviews the merged findings and sends weak
  or uncited work back for another parallel research pass.
- `synthesize` (the `report-writer` sub-agent) is grounded: it may use
  only the gathered findings and must keep each claim's inline source.
  It has no tools and cannot browse the web.
- `verify_sources` probes every cited URL / DOI with an HTTP HEAD
  request and reports which are unreachable, so the human reviewer
  sees broken citations before approving.

## Customizing

- **Loop budget.** `MAX_REFLEXION_REVISIONS` in `reflexion_gate.py`.
- **Map concurrency.** The `research_each_question` node's
  `max_concurrency: 3` caps simultaneous web-research branches.
  Raise to investigate more questions in parallel; lower to be gentle
  on rate-limited providers.
- **Per-node model.** Add `model: anthropic:...` to any `llm` node.
  Cheap models work well for `plan` / `critique` / `vet_sources`; the
  heavy intelligence is needed in `research_one_question` and the
  `report-writer` sub-agent.
- **Tool scope.** Narrow the `research_one_question` node's `tools:`
  list to constrain where each branch looks (for example, drop
  `web_search_loki` and `mcp:ddg-search` to force arXiv-only
  research).
- **Local knowledge.** Drop files into `knowledge/` to bias every
  research branch toward your local context (see the *Local
  knowledge corpus* section above).
- **Different writer.** Replace `agent: report-writer` on the
  `synthesize` node with the name of any other agent. The
  orchestrator does not care what kind of agent the writer is.
- **Skip approval.** Point both `approve` routes at `end_accepted`,
  or wire `verify_sources` straight to an `end` node.

## Files

```
assets/agents/deep-research/
  graph.yaml                  - agent config + 17-node workflow
  tools.sh                    - classify_source custom tool
  README.md                   - this file
  knowledge/
    README.md                 - corpus-format notes
    research-style-notes.md   - starter knowledge file (replace with your notes)
  scripts/
    parse_request.py          - _next: bootstrap_research, or ask_topic if no topic
    bootstrap_research.py     - fan-out source: next [plan, knowledge_lookup]
    combine_findings.py       - joins map output (question_findings) into findings
    reflexion_gate.py         - _next: research_each_question (revise) or synthesize
    verify_sources.py         - HTTP HEAD on cited URLs / DOIs
    incorporate_feedback.py   - _next: research_each_question, with user feedback
```

See also `assets/agents/report-writer/` — the sub-agent the
`synthesize` node spawns.
@@ -0,0 +1,294 @@
name: deep-research
description: |
  Deep web research workflow. Plans an investigation, decomposes it
  into sub-questions researched in parallel, grounds the work in a
  local knowledge corpus, vets the credibility of cited sources, runs
  a reflexion self-critique loop to revise weak or incomplete findings,
  delegates the final write-up to a focused sub-agent, checks that the
  cited sources are reachable, and gates the result behind human
  approval. A reviewer's free-form feedback at the approval step feeds
  back into another research pass.

  This is the canonical Loki graph-agent reference: it exercises every
  node type (script, llm, rag, map, agent, input, approval, end) and
  both static fan-out and dynamic map fan-out.

version: "1.0"

temperature: 0.0

global_tools:
  - web_search_loki.sh
  - fetch_url_via_curl.sh
  - search_arxiv.sh

mcp_servers:
  - ddg-search

conversation_starters:
  - "How does HTTP/3 differ from HTTP/2?"
  - "Summarize recent advances in solid-state battery chemistry"

settings:
  max_loop_iterations: 40
  log_state_snapshots: false
  validate_before_run: true
  max_concurrency: 4

initial_state:
  research_feedback: ""
  research_attempts: 0
  local_context: ""
  local_sources: ""

start: parse_request

nodes:

  parse_request:
    id: parse_request
    type: script
    script: scripts/parse_request.py
    next: bootstrap_research

  ask_topic:
    id: ask_topic
    type: input
    question: "What would you like me to research?"
    validation: "len(input) > 0"
    state_updates:
      topic: "{{input}}"
    next: bootstrap_research

  bootstrap_research:
    id: bootstrap_research
    type: script
    script: scripts/bootstrap_research.py
    next: [plan, knowledge_lookup]

  plan:
    id: plan
    type: llm
    instructions: |
      You are a research planner. Given a topic, produce a focused
      research plan and decompose it into 3-5 specific sub-questions
      that can each be researched independently in parallel.

      The plan is a short narrative naming the key questions and the
      kinds of sources that would be authoritative. The sub-questions
      are precise, self-contained queries (each one is sent on its own
      to a separate research worker, so they must be answerable
      without each other's context).
    prompt: "Research topic: {{topic}}"
    tools: []
    output_schema:
      type: object
      properties:
        research_plan:
          type: string
          description: A short plan narrative.
        questions:
          type: array
          items: { type: string }
          minItems: 1
          maxItems: 6
          description: 3-5 specific, self-contained sub-questions.
      required: [research_plan, questions]
    next: research_each_question

  knowledge_lookup:
    id: knowledge_lookup
    type: rag
    documents:
      - ./knowledge/
    query: "{{topic}}"
    top_k: 6
    embedding_model: openai:text-embedding-3-small
    chunk_size: 1000
    chunk_overlap: 100
    state_updates:
      local_context: "{{output.context}}"
      local_sources: "{{output.sources}}"
    next: research_each_question

  research_each_question:
    id: research_each_question
    type: map
    over: "{{questions}}"
    as: question
    branch: research_one_question
    collect_into: question_findings
    max_concurrency: 3
    next: combine_findings

  research_one_question:
    id: research_one_question
    type: llm
    instructions: |
      You are a web research assistant. Investigate the SINGLE question
      given to you using your tools: search the web, fetch and read
      pages, and search arXiv for academic sources.

      Rules:
      - Every factual claim must be backed by a real source you
        actually retrieved. Never fabricate URLs, page titles,
        authors, or DOIs.
      - Prefer primary and authoritative sources over aggregators.
      - Where sources disagree, report the disagreement rather than
        papering over it.
      - Put the URL (or DOI) inline next to each claim it supports.

      Return organized findings in plain text. Do not include
      meta-commentary about the process.
    prompt: |
      Research question: {{question}}

      Local context that may help:
      {{local_context}}

      {{research_feedback}}
    tools:
      - web_search_loki
      - fetch_url_via_curl
      - search_arxiv
      - mcp:ddg-search
    max_iterations: 10
    max_attempts: 2
    temperature: 0.1

  combine_findings:
    id: combine_findings
    type: script
    script: scripts/combine_findings.py
    next: vet_sources

  vet_sources:
    id: vet_sources
    type: llm
    instructions: |
      You assess the credibility of the sources cited in a set of
      research findings. For every distinct source URL in the findings,
      call the `classify_source` tool to get its credibility tier. Then
      summarize: which claims rest on HIGH-credibility sources, and
      which rest on PREPRINT or UNVERIFIED sources and so need
      corroboration. Do NOT do any new research -- assess only what is
      already cited.
    prompt: |
      Findings to assess:
      {{findings}}
    tools:
      - classify_source
    max_iterations: 15
    state_updates:
      source_assessment: "{{output}}"
    next: critique

  critique:
    id: critique
    type: llm
    instructions: |
      You are a meticulous research reviewer. Judge whether the
      findings below are good enough to synthesize a complete,
      well-supported report that answers the research plan.

      Mark the findings REVISE if ANY of these hold:
      - A research-plan question is unanswered or only weakly
        addressed.
      - A factual claim has no source, or cites a source that looks
        fabricated.
      - The findings lean on a single source where corroboration is
        needed.
      - A key claim rests only on a PREPRINT or UNVERIFIED source,
        per the source credibility assessment below.
      - An obvious counter-perspective or recent development is
        missing.
      Otherwise mark them PASS.

      Respond in EXACTLY this format, nothing else:

      VERDICT: <PASS or REVISE>
      FEEDBACK: <if REVISE, be specific and actionable -- name the gaps
      and what kind of source would close them; if PASS, write "none">
    prompt: |
      Research plan:
      {{research_plan}}

      Findings under review:
      {{findings}}

      Source credibility assessment:
      {{source_assessment}}
    tools: []
    state_updates:
      critique: "{{output}}"
    next: reflexion_gate

  reflexion_gate:
    id: reflexion_gate
    type: script
    script: scripts/reflexion_gate.py
    next: synthesize

  synthesize:
    id: synthesize
    type: agent
    agent: report-writer
    prompt: |
      Research topic: {{topic}}

      Findings (organized by sub-question, with inline citations):
      {{findings}}

      Source credibility assessment:
      {{source_assessment}}

      Produce the final report following your instructions.
    timeout: 300
    state_updates:
      report: "{{output}}"
    next: verify_sources

  verify_sources:
    id: verify_sources
    type: script
    script: scripts/verify_sources.py
    next: approve

  approve:
    id: approve
    type: approval
    question: |
      Research report on: {{topic}}

      {{report}}

      ----
      {{source_check}}
      ----

      Accept this report? Pick "accept" or "reject", or type specific
      feedback to send the research back for another pass.
    options:
      - "accept"
      - "reject"
    routes:
      "accept": end_accepted
      "reject": end_rejected
    on_other: incorporate_feedback
    state_updates:
      decision: "{{choice}}"

  incorporate_feedback:
    id: incorporate_feedback
    type: script
    script: scripts/incorporate_feedback.py

  end_accepted:
    id: end_accepted
    type: end
    output: "{{report}}"

  end_rejected:
    id: end_rejected
    type: end
    output: "Research on '{{topic}}' was rejected and discarded."
@@ -0,0 +1,23 @@
# Local knowledge corpus for deep-research

The `knowledge_lookup` node in `graph.yaml` is a `rag` node that runs
hybrid (vector + keyword) retrieval over every file in this directory.
Drop your own notes, papers (PDFs), Markdown docs, or text files here
and they will be indexed into a per-agent knowledge base on first run.

Loki supports common file types out of the box: `.md`, `.txt`, `.pdf`,
`.html`, and others. Subdirectories are walked recursively.

A small starter file (`research-style-notes.md`) ships so the RAG
node has something non-empty to retrieve against on a clean install.
Replace or extend it with your own materials to bias the research
phase toward your local context.

To force the knowledge base to rebuild after you add or change files,
delete the cached index:

```sh
rm ~/.config/loki/agents/deep-research/knowledge_lookup.yaml
```

The next run will rebuild from the current contents of this directory.
@@ -0,0 +1,49 @@
# Research style notes

These are general principles the `deep-research` agent should keep in
mind regardless of topic. Replace this file with your own notes if you
want to bias retrieval toward your local context.

## What "good research" means here

- **Every factual claim cites a source you actually retrieved.** Never
  fabricate URLs, page titles, authors, or DOIs.
- **Primary sources beat aggregators.** Prefer the original paper, the
  RFC, the standards body, or the manufacturer over a blog summarizing
  them.
- **Corroboration matters where stakes are high.** If a single source
  makes a strong claim, look for a second independent source before
  taking it as established.
- **Disagreement is information, not noise.** If two credible sources
  disagree, report the disagreement and the reasoning on each side.
- **Old does not mean wrong.** A 2014 RFC is still authoritative if no
  newer one has obsoleted it; check before assuming a source is stale.

## Source-tier heuristics

The `vet_sources` node uses these rough tiers to weigh credibility.
The custom tool `classify_source` (see `tools.sh`) implements this
deterministically by hostname / TLD.

- **HIGH:** government domains (`.gov`, `.mil`), academic institutions
  (`.edu`, university subdomains), peer-reviewed journals, standards
  bodies (IETF/RFCs, W3C, ISO, IEEE, NIST), and primary documents from
  the entities being researched (e.g. a vendor's official spec page).
- **PREPRINT:** arXiv, bioRxiv, medRxiv, SSRN. Useful but not yet
  peer-reviewed; treat numeric claims with extra caution.
- **ORGANIZATION:** established nonprofits, standards-adjacent groups,
  industry consortia. Reliable for their stated mission but may have a
  perspective.
- **UNVERIFIED:** general web pages, blogs, news aggregators, social
  media. Useful for leads but should not be the only source for a
  factual claim.

## Common pitfalls to flag in critique

- A claim cited only to a PREPRINT or UNVERIFIED source on a numeric
  or contested point.
- A research-plan question that the findings address only obliquely.
- "Findings" that paraphrase a single source three times rather than
  triangulating.
- Citation collisions where two sources are listed but turn out to
  be the same study reported via different aggregators.
@@ -0,0 +1,18 @@
#!/usr/bin/env python3
"""Fan-out source for context loading.

Has no logic of its own. Exists so the static `next: [plan, knowledge_lookup]`
list on this node fans out into two parallel branches (the LLM planner and
the RAG knowledge lookup) as a single super-step. The validator requires
declared parallel-branch script outputs, so we emit an empty JSON object
explicitly here.
"""
import json


def main():
    print(json.dumps({}))


if __name__ == "__main__":
    main()
@@ -0,0 +1,39 @@
#!/usr/bin/env python3
"""Join the per-question map outputs into a single `findings` string.

The `research_each_question` map writes `question_findings` (an array,
one entry per sub-question, in input order). Downstream nodes
(`vet_sources`, `critique`, `synthesize`) read `{{findings}}` as a
single block, so this script renders the array as a Markdown document
with one section per question.
"""
import json
import os


def load_state():
    path = os.environ.get("GRAPH_STATE_FILE")
    if path:
        with open(path) as f:
            return json.load(f)
    return json.loads(os.environ.get("GRAPH_STATE", "{}"))


def main():
    state = load_state()
    questions = state.get("questions") or []
    per_question = state.get("question_findings") or []

    sections = []
    for idx, q in enumerate(questions):
        body = per_question[idx] if idx < len(per_question) else ""
        if isinstance(body, (dict, list)):
            body = json.dumps(body, indent=2)
        sections.append(f"## {q}\n\n{body}")

    findings = "\n\n".join(sections) if sections else "No findings gathered."
    print(json.dumps({"findings": findings}))


if __name__ == "__main__":
    main()
@@ -0,0 +1,41 @@
#!/usr/bin/env python3
"""Fold a reviewer's free-form feedback back into the research loop.

Runs when the user answers the approval step with their own text
instead of "accept" or "reject". That text (saved by the approval node
as `decision`) becomes `research_feedback`, and the graph loops back to
`research_each_question` for another informed pass (each sub-question is
re-researched in parallel with the new feedback in context). The
reflexion counter is reset so the user-driven pass gets a fresh revision
budget.

Routing (`_next`): always research_each_question.
"""
import json
import os


def load_state():
    path = os.environ.get("GRAPH_STATE_FILE")
    if path:
        with open(path) as f:
            return json.load(f)
    return json.loads(os.environ.get("GRAPH_STATE", "{}"))


def main():
    state = load_state()
    feedback = (state.get("decision") or "").strip()
    output = {
        "_next": "research_each_question",
        "research_attempts": 0,
        "research_feedback": (
            "The user reviewed the report and asked for changes. Treat "
            "this as the top priority for the next pass:\n\n" + feedback
        ),
    }
    print(json.dumps(output))


if __name__ == "__main__":
    main()
@@ -0,0 +1,35 @@
#!/usr/bin/env python3
"""Entry router for deep-research.

Reads the caller's prompt from state. If it contains a usable research
topic, stores it as `topic` and falls through to the static `next`
(bootstrap_research). If the prompt is empty, routes to `ask_topic` so
the user can supply one interactively.

Routing (`_next`):
  - prompt present -> (no _next; static next: bootstrap_research)
  - prompt empty   -> ask_topic
"""
import json
import os


def load_state():
    path = os.environ.get("GRAPH_STATE_FILE")
    if path:
        with open(path) as f:
            return json.load(f)
    return json.loads(os.environ.get("GRAPH_STATE", "{}"))


def main():
    state = load_state()
    prompt = (state.get("initial_prompt") or "").strip()
    if prompt:
        print(json.dumps({"topic": prompt}))
    else:
        print(json.dumps({"_next": "ask_topic"}))


if __name__ == "__main__":
    main()
@@ -0,0 +1,76 @@
#!/usr/bin/env python3
"""Reflexion gate for deep-research.

Runs after `critique` has reviewed the current research findings. If the
critique's verdict is REVISE and the reflexion budget is not spent,
loops back to `research_each_question` with the critique attached as
`research_feedback`, so the retry is informed rather than a blind
re-run. Otherwise it proceeds to `synthesize`.

Routing (`_next`):
  - verdict PASS                     -> synthesize
  - verdict REVISE, budget remaining -> research_each_question (+ research_feedback)
  - verdict REVISE, budget spent     -> synthesize

Reflexion is a best-effort quality booster, not a hard gate: once the
budget is spent the workflow proceeds anyway, and the human approval
step is the final backstop.
"""
import json
import os
import re

# Automated revision passes allowed. `research_each_question` runs at most
# MAX_REFLEXION_REVISIONS + 1 times per user pass. Bump to allow more.
MAX_REFLEXION_REVISIONS = 2


def load_state():
    path = os.environ.get("GRAPH_STATE_FILE")
    if path:
        with open(path) as f:
            return json.load(f)
    return json.loads(os.environ.get("GRAPH_STATE", "{}"))


def as_int(value, default=0):
    try:
        return int(value)
    except (TypeError, ValueError):
        return default


def parse_verdict(critique):
    """Pull PASS/REVISE from the critique's `VERDICT:` line. Defaults to
    PASS when no verdict line is found, so a malformed critique lets the
    workflow proceed instead of burning the whole revision budget."""
    match = re.search(r"VERDICT:\s*([A-Za-z]+)", critique, re.IGNORECASE)
    if not match:
        return "PASS"
    return match.group(1).upper()


def main():
    state = load_state()
    critique = state.get("critique") or ""
    verdict = parse_verdict(critique)
    attempts = as_int(state.get("research_attempts"))

    if verdict == "REVISE" and attempts < MAX_REFLEXION_REVISIONS:
        feedback = (
            "A reviewer judged the previous research pass incomplete. "
            "Address every point in the critique below:\n\n" + critique
        )
        output = {
            "_next": "research_each_question",
            "research_attempts": attempts + 1,
            "research_feedback": feedback,
        }
    else:
        output = {"_next": "synthesize"}

    print(json.dumps(output))


if __name__ == "__main__":
    main()
@@ -0,0 +1,69 @@
#!/usr/bin/env python3
"""Check that the sources cited in the research report are reachable.

Scans the final report for URLs and DOIs, probes each with a HEAD
request, and writes a `source_check` summary into state so the human
reviewer sees broken citations at the approval step.

Times out per request so a slow source cannot stall the graph.
"""
import json
import os
import re
import urllib.error
import urllib.request

DOI_RE = re.compile(r"\b(10\.\d{4,9}/[-._;()/:A-Z0-9]+)", re.IGNORECASE)
URL_RE = re.compile(r"https?://[^\s)\]\}\"'>]+")


def load_state():
    path = os.environ.get("GRAPH_STATE_FILE")
    if path:
        with open(path) as f:
            return json.load(f)
    return json.loads(os.environ.get("GRAPH_STATE", "{}"))


def reachable(url, timeout=5.0):
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except urllib.error.HTTPError as e:
        return 200 <= e.code < 400
    except Exception:
        return False


def main():
    state = load_state()
    report = state.get("report") or ""

    urls = sorted({u.rstrip(".,;)") for u in URL_RE.findall(report)})
    dois = sorted({d.rstrip(".,;)") for d in DOI_RE.findall(report)})

    results = []
    for url in urls:
        ok = reachable(url)
        results.append(f" {'OK' if ok else 'UNREACHABLE'} {url}")
    for doi in dois:
        url = f"https://doi.org/{doi}"
        if url in urls:
            continue
        ok = reachable(url)
        results.append(f" {'OK' if ok else 'UNREACHABLE'} DOI {doi} ({url})")

    if not results:
        summary = "No web sources were cited in the report."
    else:
        summary = (
            f"Source reachability ({len(results)} checked):\n"
            + "\n".join(results)
        )

    print(json.dumps({"source_check": summary}))


if __name__ == "__main__":
    main()
@@ -0,0 +1,39 @@
#!/usr/bin/env bash

set -e

# @env LLM_OUTPUT=/dev/stdout The output path

# @cmd Classify the credibility tier of a web source from its URL.
# A deterministic check based on the host and top-level domain. Use it
# to weigh how much trust to place in a source before relying on it.
# @option --url! The full source URL to classify
classify_source() {
    # shellcheck disable=SC2154
    local url="$argc_url"
    local host="${url#*://}"
    host="${host%%/*}"
    host="${host##*@}"
    host="${host%%:*}"
    host="$(printf '%s' "$host" | tr '[:upper:]' '[:lower:]')"

    local tier
    case "$host" in
        '')
            tier="UNKNOWN - no host could be parsed from the URL" ;;
        *.gov | *.gov.* | *.mil)
            tier="HIGH - government source" ;;
        *.edu | *.edu.* | *.ac.*)
            tier="HIGH - academic institution" ;;
        arxiv.org | *.arxiv.org | biorxiv.org | *.biorxiv.org | medrxiv.org | *.medrxiv.org | ssrn.com | *.ssrn.com)
            tier="PREPRINT - not yet peer reviewed, corroborate before citing" ;;
        wikipedia.org | *.wikipedia.org)
            tier="TERTIARY - encyclopedia, good for orientation not citation" ;;
        *.org | *.org.*)
            tier="MEDIUM - organization site, check for institutional bias" ;;
        *)
            tier="UNVERIFIED - general web source, corroborate before citing" ;;
    esac

    printf '%s: %s\n' "${host:-<none>}" "$tier" >> "$LLM_OUTPUT"
}
@@ -0,0 +1,46 @@
# report-writer

A tiny, focused sub-agent that turns a set of research findings into a
single coherent final report. It reads only what it is given: it does
no independent research, does not access the web, and does not invent
facts. It exists so that orchestrating agents can delegate the writing
phase to it.

## Why a separate agent?

This is an example of the **agent-as-tool** pattern in graph agents.
The `deep-research` graph agent's `synthesize` node is an `agent` node
that spawns this one (see `assets/agents/deep-research/graph.yaml`).
Separating the role has two practical benefits:

- The orchestrating agent can use a cheap model (or a high-temperature
  exploratory one) for the research phase, while letting the writing
  phase use a different (typically lower-temperature, possibly larger)
  model dedicated to coherent prose.
- The writing prompt is owned by this agent's `config.yaml` rather
  than buried inside another agent's graph. You can polish it
  independently without touching the research flow.

## Standalone use

You can also use this agent directly if you have a set of findings you
want polished:

```sh
loki -a report-writer "Topic: X. Findings: <paste findings here>"
```

It will produce a single Markdown report following the rules in its
system prompt: an executive summary at the top, sections that group
related sub-questions, every inline citation preserved verbatim, and a
final "Open questions / disagreements" section.
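
If the findings come from a script rather than the clipboard, the
prompt can be assembled programmatically. The sketch below is an
illustration only and is not part of this agent: `build_prompt` and
`run_report_writer` are hypothetical helper names, and the only things
taken from this README are the `loki -a report-writer "<prompt>"`
invocation and the `Topic: ... Findings: ...` prompt shape. It assumes
that `loki` is on `PATH` and that the report is written to standard
output.

```python
import subprocess


def build_prompt(topic, findings):
    """Render a topic and a {sub-question: finding} mapping as one prompt."""
    sections = [f"## {question}\n\n{body}" for question, body in findings.items()]
    return f"Topic: {topic}. Findings:\n\n" + "\n\n".join(sections)


def run_report_writer(topic, findings):
    """Call the report-writer agent and return whatever it prints.

    Assumption: `loki` is installed and prints the report to stdout.
    """
    result = subprocess.run(
        ["loki", "-a", "report-writer", build_prompt(topic, findings)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Placeholder data; only the prompt is printed, loki is not invoked here.
    demo = {"What is X?": "X is a placeholder claim (https://example.org/x)"}
    print(build_prompt("X", demo))
```

Keeping each citation inline with its claim in the `findings` values
matters, because the agent preserves citations only where it finds them.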

## What it will NOT do

- Search the web, fetch URLs, query an MCP server, or use any tool.
  It has no tools configured.
- Invent facts beyond what is in the findings you give it.
- Strip or rewrite citations.

These constraints are the reason the agent exists: a writer that
the orchestrator can trust to stay in its lane.
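
Because the agent is meant to keep every citation, an orchestrator may
want to check that mechanically. The following is a hedged sketch, not
something this agent ships: `extract_citations` and `missing_citations`
are hypothetical helpers, and the URL and DOI patterns are simplified
assumptions rather than a complete grammar for either.

```python
import re

# Simplified patterns (assumptions): good enough for inline citations in prose.
URL_RE = re.compile(r"https?://[^\s)\]}\"'>]+")
DOI_RE = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Z0-9]+", re.IGNORECASE)


def extract_citations(text):
    """Return the set of URLs and DOIs in text, minus trailing punctuation."""
    found = URL_RE.findall(text) + DOI_RE.findall(text)
    return {c.rstrip(".,;)") for c in found}


def missing_citations(findings, report):
    """List citations present in the findings but absent from the report."""
    return sorted(extract_citations(findings) - extract_citations(report))
```

An empty list means every URL and DOI from the findings still appears
somewhere in the report. It does not prove that each citation is still
attached to the right sentence.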
@@ -0,0 +1,34 @@
name: report-writer
description: Polishes research findings into a clear, citation-preserving final report
version: 1.0.0
temperature: 0.2

instructions: |
  You are a technical writer. You will be given:
    - a research topic
    - a set of findings, organized per sub-question, with inline
      citations next to each claim
    - a source-credibility assessment of the cited sources

  Your job is to produce a single, well-organized final report.

  Rules:
    - Use ONLY the findings provided. Do not introduce facts from
      your own memory. Do not speculate beyond what the findings
      support.
    - Preserve every inline citation. If a sentence in the findings
      had a URL or DOI, the equivalent sentence in your report must
      keep the same citation.
    - Lead with a 2-3 sentence executive summary at the top.
    - Organize the body so that related sub-questions are grouped,
      not strictly one section per question. The findings are raw
      material; the report should read as a single coherent answer
      to the original topic.
    - End with a short "Open questions / disagreements" section
      naming anything the findings flagged as unresolved or
      contested.

  Output plain Markdown. No metadata, no JSON wrapper.

conversation_starters:
  - "Polish these findings into a cited report"