loki/docs/PHASE-1-IMPLEMENTATION-PLAN.md
2026-04-10 15:45:51 -06:00

# Phase 1 Implementation Plan: Extract AppState from Config
## Overview
Split the monolithic `Config` struct into:
- **`AppConfig`** — immutable server-wide settings (deserialized from `config.yaml`)
- **`RequestContext`** — per-request mutable state (current role, session, agent, supervisor, etc.)
The existing `GlobalConfig` (`Arc<RwLock<Config>>`) type alias is replaced. CLI and REPL continue working identically. No API code is added in this phase.
**Estimated effort:** ~3-4 weeks (originally estimated 1-2 weeks; revised during implementation as Steps 6.5 and 7 deferred their semantic rewrites to an expanded Step 8)
**Risk:** Medium — touches 91 callsites across 15 modules
**Mitigation:** Incremental migration with tests passing at every step
**Sub-step tracking:** Each step has per-step implementation notes in `docs/implementation/PHASE-1-STEP-*-NOTES.md`
---
## Current State: Config Field Classification
### Serialized Fields (from config.yaml → AppConfig)
These are loaded from disk once and should be immutable during request processing:
| Field | Type | Notes |
|---|---|---|
| `model_id` | `String` | Default model ID |
| `temperature` | `Option<f64>` | Default temperature |
| `top_p` | `Option<f64>` | Default top_p |
| `dry_run` | `bool` | Can be overridden per-request |
| `stream` | `bool` | Can be overridden per-request |
| `save` | `bool` | Whether to persist to messages.md |
| `keybindings` | `String` | REPL keybinding style |
| `editor` | `Option<String>` | Editor command |
| `wrap` | `Option<String>` | Text wrapping |
| `wrap_code` | `bool` | Code block wrapping |
| `vault_password_file` | `Option<PathBuf>` | Vault password location |
| `function_calling_support` | `bool` | Global function calling toggle |
| `mapping_tools` | `IndexMap<String, String>` | Tool aliases |
| `enabled_tools` | `Option<String>` | Default enabled tools |
| `visible_tools` | `Option<Vec<String>>` | Visible tool list |
| `mcp_server_support` | `bool` | Global MCP toggle |
| `mapping_mcp_servers` | `IndexMap<String, String>` | MCP server aliases |
| `enabled_mcp_servers` | `Option<String>` | Default enabled MCP servers |
| `repl_prelude` | `Option<String>` | REPL prelude config |
| `cmd_prelude` | `Option<String>` | CLI prelude config |
| `agent_session` | `Option<String>` | Default agent session |
| `save_session` | `Option<bool>` | Session save behavior |
| `compression_threshold` | `usize` | Session compression threshold |
| `summarization_prompt` | `Option<String>` | Compression prompt |
| `summary_context_prompt` | `Option<String>` | Summary context prompt |
| `rag_embedding_model` | `Option<String>` | RAG embedding model |
| `rag_reranker_model` | `Option<String>` | RAG reranker model |
| `rag_top_k` | `usize` | RAG top-k results |
| `rag_chunk_size` | `Option<usize>` | RAG chunk size |
| `rag_chunk_overlap` | `Option<usize>` | RAG chunk overlap |
| `rag_template` | `Option<String>` | RAG template |
| `document_loaders` | `HashMap<String, String>` | Document loader mappings |
| `highlight` | `bool` | Syntax highlighting |
| `theme` | `Option<String>` | Color theme |
| `left_prompt` | `Option<String>` | REPL left prompt format |
| `right_prompt` | `Option<String>` | REPL right prompt format |
| `user_agent` | `Option<String>` | HTTP User-Agent |
| `save_shell_history` | `bool` | Shell history persistence |
| `sync_models_url` | `Option<String>` | Models sync URL |
| `clients` | `Vec<ClientConfig>` | LLM provider configs |
### Runtime Fields (#[serde(skip)] → RequestContext)
These are created at runtime and are per-request/per-session mutable state:
| Field | Type | Destination |
|---|---|---|
| `vault` | `GlobalVault` | `AppState.vault` (shared service) |
| `macro_flag` | `bool` | `RequestContext.macro_flag` |
| `info_flag` | `bool` | `RequestContext.info_flag` |
| `agent_variables` | `Option<AgentVariables>` | `RequestContext.agent_variables` |
| `model` | `Model` | `RequestContext.model` |
| `functions` | `Functions` | `RequestContext.tool_scope.functions` (unified in Step 6) |
| `mcp_registry` | `Option<McpRegistry>` | **REMOVED.** Replaced by per-`ToolScope` `McpRuntime`s produced by a new `McpFactory` on `AppState`. See the architecture doc's "Tool Scope Isolation" section. |
| `working_mode` | `WorkingMode` | `RequestContext.working_mode` |
| `last_message` | `Option<LastMessage>` | `RequestContext.last_message` |
| `role` | `Option<Role>` | `RequestContext.role` |
| `session` | `Option<Session>` | `RequestContext.session` |
| `rag` | `Option<Arc<Rag>>` | `RequestContext.rag` |
| `agent` | `Option<Agent>` | `RequestContext.agent` (agent spec + role + RAG) |
| `tool_call_tracker` | `Option<ToolCallTracker>` | `RequestContext.tool_scope.tool_tracker` (unified in Step 6) |
| `supervisor` | `Option<Arc<RwLock<Supervisor>>>` | `RequestContext.agent_runtime.supervisor` |
| `parent_supervisor` | `Option<Arc<RwLock<Supervisor>>>` | `RequestContext.agent_runtime.parent_supervisor` |
| `self_agent_id` | `Option<String>` | `RequestContext.agent_runtime.self_agent_id` |
| `current_depth` | `usize` | `RequestContext.agent_runtime.current_depth` |
| `inbox` | `Option<Arc<Inbox>>` | `RequestContext.agent_runtime.inbox` |
| `root_escalation_queue` | `Option<Arc<EscalationQueue>>` | `RequestContext.agent_runtime.escalation_queue` (shared from the root via `Arc`) |
**Note on `ToolScope` and `AgentRuntime`:** During Phase 1 Step 0 the new `RequestContext` struct keeps `functions`, `tool_call_tracker`, and the supervisor/inbox/escalation fields as **flat fields** mirroring today's `Config`. This is deliberate — it makes the field-by-field migration mechanical. In Step 6.5 these fields collapse into two sub-structs:
- `ToolScope { functions, mcp_runtime, tool_tracker }` — owned by every active `RoleLike` scope, rebuilt on role/session/agent transitions via `McpFactory::acquire()`.
- `AgentRuntime { spec, rag, supervisor, inbox, escalation_queue, todo_list, self_agent_id, parent_supervisor, current_depth, auto_continue_count }` — owned only when an agent is active.
**Two behavior changes land during Step 6.5** that tighten today's code:
1. `todo_list` becomes `Option<TodoList>`. Today the code always allocates `TodoList::default()` for every agent, even when `auto_continue: false`. Since the todo tools and instructions are only exposed when `auto_continue: true`, the allocation is wasted. The new shape skips allocation unless the agent opts in. No user-visible change.
2. A unified `RagCache` on `AppState` serves **both** standalone RAGs (attached via `.rag <name>`) and agent-owned RAGs (loaded from an agent's `documents:` field). Today, both paths independently call `Rag::load` from disk on every use; with the cache, any scope requesting the same `RagKey` shares the same `Arc<Rag>`. Standalone RAG lives in `ctx.rag`; agent RAG lives in `ctx.agent_runtime.rag`. Roles and Sessions do **not** own RAG (the structs have no RAG fields) — this is true today and unchanged by the refactor. `rebuild_rag` and `edit_rag_docs` call `RagCache::invalidate()`.
See `docs/REST-API-ARCHITECTURE.md` section 5 for the full `ToolScope`, `McpFactory`, `RagCache`, and MCP pooling designs.
---
## Migration Strategy: The Facade Pattern
**Do NOT rewrite everything at once.** Instead, use a transitional facade that keeps the old `Config` working while new code uses the split types.
### Step 0: Add new types alongside Config (no breaking changes) ✅ DONE
Create the new structs in new files. `Config` stays untouched. Nothing breaks.
**Files created:**
- `src/config/app_config.rs`: `AppConfig` struct (the serialized half)
- `src/config/request_context.rs`: `RequestContext` struct (the runtime half)
- `src/config/app_state.rs`: `AppState` struct (Arc-wrapped global services, no `mcp_registry` — see below)
**`AppConfig`** is essentially the current `Config` struct but containing ONLY the serialized fields (no `#[serde(skip)]` fields). It should derive `Deserialize` identically so the existing `config.yaml` still loads.
**Important change from the original plan:** `AppState` does NOT hold an `McpRegistry`. MCP server processes are scoped per `RoleLike`, not process-wide. An `McpFactory` service will be added to `AppState` in Step 6.5. See `docs/REST-API-ARCHITECTURE.md` section 5 for the design rationale.
```rust
// src/config/app_config.rs
#[derive(Debug, Clone, Deserialize)]
#[serde(default)]
pub struct AppConfig {
#[serde(rename(serialize = "model", deserialize = "model"))]
pub model_id: String,
pub temperature: Option<f64>,
pub top_p: Option<f64>,
pub dry_run: bool,
pub stream: bool,
pub save: bool,
pub keybindings: String,
pub editor: Option<String>,
pub wrap: Option<String>,
pub wrap_code: bool,
vault_password_file: Option<PathBuf>,
pub function_calling_support: bool,
pub mapping_tools: IndexMap<String, String>,
pub enabled_tools: Option<String>,
pub visible_tools: Option<Vec<String>>,
pub mcp_server_support: bool,
pub mapping_mcp_servers: IndexMap<String, String>,
pub enabled_mcp_servers: Option<String>,
pub repl_prelude: Option<String>,
pub cmd_prelude: Option<String>,
pub agent_session: Option<String>,
pub save_session: Option<bool>,
pub compression_threshold: usize,
pub summarization_prompt: Option<String>,
pub summary_context_prompt: Option<String>,
pub rag_embedding_model: Option<String>,
pub rag_reranker_model: Option<String>,
pub rag_top_k: usize,
pub rag_chunk_size: Option<usize>,
pub rag_chunk_overlap: Option<usize>,
pub rag_template: Option<String>,
pub document_loaders: HashMap<String, String>,
pub highlight: bool,
pub theme: Option<String>,
pub left_prompt: Option<String>,
pub right_prompt: Option<String>,
pub user_agent: Option<String>,
pub save_shell_history: bool,
pub sync_models_url: Option<String>,
pub clients: Vec<ClientConfig>,
}
```
```rust
// src/config/app_state.rs
#[derive(Clone)]
pub struct AppState {
pub config: Arc<AppConfig>,
pub vault: GlobalVault,
// NOTE: no `mcp_registry` field. MCP runtime is scoped per-`ToolScope`
// on `RequestContext`, not process-wide. An `McpFactory` will be added
// here later (Step 6 / Phase 5) to pool and ref-count MCP processes
// across concurrent ToolScopes. See architecture doc section 5.
}
```
```rust
// src/config/request_context.rs
pub struct RequestContext {
pub app: Arc<AppState>,
// per-request flags
pub macro_flag: bool,
pub info_flag: bool,
pub working_mode: WorkingMode,
// active context
pub model: Model,
pub functions: Functions,
pub role: Option<Role>,
pub session: Option<Session>,
pub rag: Option<Arc<Rag>>,
pub agent: Option<Agent>,
pub agent_variables: Option<AgentVariables>,
// conversation state
pub last_message: Option<LastMessage>,
pub tool_call_tracker: Option<ToolCallTracker>,
// agent supervision
pub supervisor: Option<Arc<RwLock<Supervisor>>>,
pub parent_supervisor: Option<Arc<RwLock<Supervisor>>>,
pub self_agent_id: Option<String>,
pub current_depth: usize,
pub inbox: Option<Arc<Inbox>>,
pub root_escalation_queue: Option<Arc<EscalationQueue>>,
}
```
### Step 1: Make Config constructible from AppConfig + RequestContext
Add conversion methods so the old `Config` can be built from the new types, and vice versa. This is the bridge that lets us migrate incrementally.
```rust
// On Config:
impl Config {
/// Extract the global portion into AppConfig
pub fn to_app_config(&self) -> AppConfig { /* copy serialized fields */ }
/// Extract the runtime portion into RequestContext
pub fn to_request_context(&self, app: Arc<AppState>) -> RequestContext { /* copy runtime fields */ }
/// Reconstruct Config from the split types (for backward compat during migration)
pub fn from_parts(app: &AppState, ctx: &RequestContext) -> Config { /* merge back */ }
}
```
**Test:** After this step, splitting a `Config` via `to_app_config()` / `to_request_context()` and reassembling it with `Config::from_parts` yields an equivalent `Config`. Existing tests still pass.
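A toy model of the split makes the round-trip property concrete. All types below are simplified stand-ins for the real crate types, with one field per half:

```rust
// Toy model: `a` stands in for the serialized (global) half,
// `b` for the runtime half. Illustrative only.
#[derive(Clone, Debug, PartialEq)]
struct Config { a: String, b: usize }

#[derive(Clone, Debug, PartialEq)]
struct AppConfig { a: String }

#[derive(Clone, Debug, PartialEq)]
struct RequestContext { b: usize }

impl Config {
    fn to_app_config(&self) -> AppConfig { AppConfig { a: self.a.clone() } }
    fn to_request_context(&self) -> RequestContext { RequestContext { b: self.b } }
    fn from_parts(app: &AppConfig, ctx: &RequestContext) -> Config {
        Config { a: app.a.clone(), b: ctx.b }
    }
}

fn main() {
    let config = Config { a: "model-x".into(), b: 7 };
    let rebuilt = Config::from_parts(&config.to_app_config(), &config.to_request_context());
    // Splitting then merging must lose nothing.
    assert_eq!(config, rebuilt);
}
```

The real round-trip test is the same shape, just over all ~60 fields instead of two.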
### Step 2: Migrate static methods off Config
There are ~30 static methods on Config (no `self` parameter). These are pure utility functions that don't need Config at all — they compute file paths, list directories, etc.
**Target:** Move these to a standalone `paths` module or keep on `AppConfig` where appropriate.
| Method | Move to |
|---|---|
| `config_dir()` | `paths::config_dir()` |
| `local_path(name)` | `paths::local_path(name)` |
| `cache_path()` | `paths::cache_path()` |
| `oauth_tokens_path()` | `paths::oauth_tokens_path()` |
| `token_file(client)` | `paths::token_file(client)` |
| `log_path()` | `paths::log_path()` |
| `config_file()` | `paths::config_file()` |
| `roles_dir()` | `paths::roles_dir()` |
| `role_file(name)` | `paths::role_file(name)` |
| `macros_dir()` | `paths::macros_dir()` |
| `macro_file(name)` | `paths::macro_file(name)` |
| `env_file()` | `paths::env_file()` |
| `rags_dir()` | `paths::rags_dir()` |
| `functions_dir()` | `paths::functions_dir()` |
| `functions_bin_dir()` | `paths::functions_bin_dir()` |
| `mcp_config_file()` | `paths::mcp_config_file()` |
| `global_tools_dir()` | `paths::global_tools_dir()` |
| `global_utils_dir()` | `paths::global_utils_dir()` |
| `bash_prompt_utils_file()` | `paths::bash_prompt_utils_file()` |
| `agents_data_dir()` | `paths::agents_data_dir()` |
| `agent_data_dir(name)` | `paths::agent_data_dir(name)` |
| `agent_config_file(name)` | `paths::agent_config_file(name)` |
| `agent_bin_dir(name)` | `paths::agent_bin_dir(name)` |
| `agent_rag_file(agent, rag)` | `paths::agent_rag_file(agent, rag)` |
| `agent_functions_file(name)` | `paths::agent_functions_file(name)` |
| `models_override_file()` | `paths::models_override_file()` |
| `list_roles(with_builtin)` | `Role::list(with_builtin)` or `paths` |
| `list_rags()` | `Rag::list()` or `paths` |
| `list_macros()` | `Macro::list()` or `paths` |
| `has_role(name)` | `Role::exists(name)` |
| `has_macro(name)` | `Macro::exists(name)` |
| `sync_models(url, abort)` | Standalone function or on `AppConfig` |
| `local_models_override()` | Standalone function |
| `log_config()` | Standalone function |
**Approach:** Create `src/config/paths.rs`, move functions there, and add `#[deprecated]` forwarding methods on `Config` that call the new locations. Compile, run tests, fix callsites module by module, then remove the deprecated methods.
**Callsite count:** Low — most of these are called from 1-3 places. This is a quick-win step.
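A minimal sketch of the extraction pattern, with an illustrative path layout (the real `config_dir()` resolves platform directories) and a deprecated forwarder left behind on `Config` until callsites migrate:

```rust
use std::path::PathBuf;

pub mod paths {
    use std::path::PathBuf;

    pub fn config_dir() -> PathBuf {
        // Stand-in; the real function resolves env vars / platform dirs.
        PathBuf::from(".config/loki")
    }

    pub fn role_file(name: &str) -> PathBuf {
        config_dir().join("roles").join(format!("{name}.md"))
    }
}

pub struct Config;

impl Config {
    // Transitional forwarder: compiles, warns at remaining callsites,
    // and is deleted once the migration sweep is done.
    #[deprecated(note = "use paths::role_file")]
    pub fn role_file(name: &str) -> PathBuf {
        paths::role_file(name)
    }
}

fn main() {
    assert!(paths::role_file("coder").ends_with("roles/coder.md"));
}
```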
### Step 3: Migrate global-read methods to AppConfig
These methods only read serialized config values and should live on `AppConfig`:
| Method | Current Signature | New Home |
|---|---|---|
| `vault_password_file` | `&self -> PathBuf` | `AppConfig` |
| `editor` | `&self -> Result<String>` | `AppConfig` |
| `sync_models_url` | `&self -> String` | `AppConfig` |
| `light_theme` | `&self -> bool` | `AppConfig` |
| `render_options` | `&self -> Result<RenderOptions>` | `AppConfig` |
| `print_markdown` | `&self, text -> Result<()>` | `AppConfig` |
| `rag_template` | `&self, embeddings, sources, text -> String` | `AppConfig` |
| `select_functions` | `&self, role -> Option<Vec<...>>` | `AppConfig` |
| `select_enabled_functions` | `&self, role -> Vec<...>` | `AppConfig` |
| `select_enabled_mcp_servers` | `&self, role -> Vec<...>` | `AppConfig` |
**Same pattern:** Add new methods on `AppConfig`, add `#[deprecated]` forwarding on `Config`, migrate callers, remove.
### Step 4: Migrate global-write methods
These modify serialized config settings (`.set` command, environment loading):
| Method | Notes |
|---|---|
| `set_wrap` | Modifies `self.wrap` |
| `update` | Generic key-value update of config settings |
| `load_envs` | Applies env var overrides |
| `load_functions` | Initializes function definitions |
| `load_mcp_servers` | Starts MCP servers |
| `setup_model` | Sets default model |
| `setup_document_loaders` | Sets default doc loaders |
| `setup_user_agent` | Sets user agent string |
The `load_*` / `setup_*` methods are initialization-only (called once in `Config::init`). They become part of `AppState` construction.
`update` and `set_wrap` are runtime mutations of global config. For the API world, these should require a config reload. For now, they can stay as methods that mutate `AppConfig` through interior mutability or require a mutable reference during REPL setup.
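The init-once shape could look roughly like this. Field names, the env var, and the loader entry are illustrative stand-ins for the real `load_*` / `setup_*` logic:

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Illustrative: initialization mutates AppConfig once, then freezes it.
#[derive(Default)]
struct AppConfig {
    model_id: String,
    user_agent: Option<String>,
    document_loaders: HashMap<String, String>,
}

struct AppState { config: Arc<AppConfig> }

impl AppState {
    fn init(mut config: AppConfig) -> AppState {
        // Stand-ins for load_envs / setup_user_agent / setup_document_loaders.
        if let Ok(model) = std::env::var("LOKI_MODEL") { // hypothetical var name
            config.model_id = model;
        }
        config.user_agent.get_or_insert_with(|| "loki".to_string());
        config.document_loaders.entry("pdf".into()).or_insert("pdftotext $1".into());
        AppState { config: Arc::new(config) } // immutable from here on
    }
}

fn main() {
    let state = AppState::init(AppConfig::default());
    assert_eq!(state.config.user_agent.as_deref(), Some("loki"));
}
```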
### Step 5: Migrate request-read methods to RequestContext
Pure reads of per-request state:
| Method | Notes |
|---|---|
| `state` | Returns flags for active role/session/agent/rag |
| `messages_file` | Path depends on active agent |
| `sessions_dir` | Path depends on active agent |
| `session_file` | Path depends on active agent |
| `rag_file` | Path depends on active agent |
| `info` | Reads current agent/session/role/rag |
| `role_info` | Reads current role |
| `session_info` | Reads current session |
| `agent_info` | Reads current agent |
| `agent_banner` | Reads current agent |
| `rag_info` | Reads current rag |
| `list_sessions` | Depends on sessions_dir (agent context) |
| `list_autoname_sessions` | Depends on sessions_dir |
| `is_compressing_session` | Reads session state |
| `role_like_mut` | Returns mutable ref to role-like |
### Step 6: Migrate request-write methods to RequestContext
Mutations of per-request state:
| Method | Notes |
|---|---|
| `use_prompt` | Sets temporary role |
| `use_role` / `use_role_obj` | Sets role on session or self |
| `exit_role` | Clears role |
| `edit_role` | Edits and re-applies role |
| `use_session` | Sets session |
| `exit_session` | Saves and clears session |
| `save_session` | Persists session |
| `empty_session` | Clears session messages |
| `set_save_session_this_time` | Session flag |
| `compress_session` / `maybe_compress_session` | Session compression |
| `autoname_session` / `maybe_autoname_session` | Session naming |
| `use_rag` / `exit_rag` / `edit_rag_docs` / `rebuild_rag` | RAG lifecycle |
| `use_agent` / `exit_agent` / `exit_agent_session` | Agent lifecycle |
| `apply_prelude` | Sets role/session from prelude config |
| `before_chat_completion` | Pre-LLM state updates |
| `after_chat_completion` | Post-LLM state updates |
| `discontinuous_last_message` | Message state |
| `init_agent_shared_variables` | Agent vars |
| `init_agent_session_variables` | Agent session vars |
### Step 6.5: Unify tool/MCP fields into `ToolScope` and agent fields into `AgentRuntime`
After Step 6, `RequestContext` has many flat fields that logically cluster into two sub-structs. This step collapses them and introduces three new services on `AppState`.
**New types:**
```rust
pub struct ToolScope {
pub functions: Functions,
pub mcp_runtime: McpRuntime,
pub tool_tracker: ToolCallTracker,
}
pub struct McpRuntime {
servers: HashMap<String, Arc<McpServerHandle>>,
}
pub struct AgentRuntime {
pub spec: AgentSpec,
pub rag: Option<Arc<Rag>>, // shared across siblings of same type
pub supervisor: Supervisor,
pub inbox: Arc<Inbox>,
pub escalation_queue: Arc<EscalationQueue>, // shared from root
pub todo_list: Option<TodoList>, // Some(...) only when auto_continue: true
pub self_agent_id: String,
pub parent_supervisor: Option<Arc<Supervisor>>,
pub current_depth: usize,
pub auto_continue_count: usize,
}
```
**New services on `AppState`:**
```rust
pub struct AppState {
pub config: Arc<AppConfig>,
pub vault: GlobalVault,
pub mcp_factory: Arc<McpFactory>,
pub rag_cache: Arc<RagCache>,
}
pub struct McpFactory {
active: Mutex<HashMap<McpServerKey, Weak<McpServerHandle>>>,
// idle pool + reaper added in Phase 5; Step 6.5 ships the no-pool version
}
impl McpFactory {
pub async fn acquire(&self, key: &McpServerKey) -> Result<Arc<McpServerHandle>>;
}
pub struct RagCache {
entries: RwLock<HashMap<RagKey, Weak<Rag>>>,
}
#[derive(Hash, Eq, PartialEq, Clone, Debug)]
pub enum RagKey {
Named(String), // standalone: rags/<name>.yaml
Agent(String), // agent-owned: agents/<name>/rag.yaml
}
impl RagCache {
pub async fn load(&self, key: &RagKey) -> Result<Option<Arc<Rag>>>;
pub fn invalidate(&self, key: &RagKey);
}
```
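A synchronous sketch of the no-pool `acquire()` semantics (the real method is async and spawns a subprocess; the key and handle types here are stubs):

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, Weak};

type McpServerKey = String;
struct McpServerHandle { key: McpServerKey } // real type owns the subprocess

#[derive(Default)]
struct McpFactory { active: Mutex<HashMap<McpServerKey, Weak<McpServerHandle>>> }

impl McpFactory {
    fn acquire(&self, key: &McpServerKey) -> Arc<McpServerHandle> {
        // Fast path: lock only to probe the map, then upgrade the Weak.
        if let Some(handle) = self.active.lock().unwrap().get(key).and_then(Weak::upgrade) {
            return handle;
        }
        // Miss or dead Weak: spawn outside the lock, then record a Weak.
        let handle = Arc::new(McpServerHandle { key: key.clone() }); // stand-in for spawn
        self.active.lock().unwrap().insert(key.clone(), Arc::downgrade(&handle));
        handle
    }
}

fn main() {
    let factory = McpFactory::default();
    let a = factory.acquire(&"github".to_string());
    let b = factory.acquire(&"github".to_string());
    // While any holder exists, the same handle is shared.
    assert!(Arc::ptr_eq(&a, &b));
    drop((a, b));
    // All holders dropped: the next acquire spawns fresh (no idle pool in Phase 1).
    let c = factory.acquire(&"github".to_string());
    assert_eq!(c.key, "github");
}
```

This sketch tolerates a double-spawn race if two callers miss concurrently; the real implementation would serialize first-spawn per key.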
**`RequestContext` after collapse:**
```rust
pub struct RequestContext {
pub app: Arc<AppState>,
pub macro_flag: bool,
pub info_flag: bool,
pub working_mode: WorkingMode,
pub model: Model,
pub agent_variables: Option<AgentVariables>,
pub role: Option<Role>,
pub session: Option<Session>,
pub rag: Option<Arc<Rag>>, // session/standalone RAG, not agent RAG
pub agent: Option<Agent>,
pub last_message: Option<LastMessage>,
pub tool_scope: ToolScope, // replaces functions + tool_call_tracker + global mcp_registry
pub agent_runtime: Option<AgentRuntime>, // replaces supervisor + inbox + escalation_queue + todo + self_id + parent + depth; holds shared agent RAG
}
```
**What this step does:**
1. Implement `McpRuntime` and `ToolScope`.
2. Implement `McpFactory` with **no pool, no idle handling, no reaper.** `acquire()` checks `active` for an upgradable `Weak`, otherwise spawns fresh. `Drop` on `McpServerHandle` tears down the subprocess directly. Pooling lands in Phase 5.
3. Implement `RagCache` with `RagKey` enum, weak-ref sharing, and per-key serialization for concurrent first-load.
4. Implement `AgentRuntime` with the shape above. `todo_list` is `Option` — only allocated when `agent.spec.auto_continue == true`. `rag` is served from `RagCache` during activation via `RagKey::Agent(name)`.
5. Rewrite scope transitions (`use_role`, `use_session`, `use_agent`, `exit_*`, `Config::update`) to:
- Resolve the effective enabled-tool / enabled-MCP-server list using priority `Agent > Session > Role > Global`
- Build a fresh `McpRuntime` by calling `McpFactory::acquire()` for each required server key
- Construct a new `ToolScope` wrapping the runtime + resolved `Functions`
- Swap `ctx.tool_scope` atomically
6. `use_rag` (standalone / `.rag <name>` path) is rewritten to call `app.rag_cache.load(RagKey::Named(name))` and assign the result to `ctx.rag`. No role/session RAG changes because roles/sessions do not own RAG.
7. Agent activation additionally:
- Calls `app.rag_cache.load(RagKey::Agent(agent_name))` and stores the returned `Arc<Rag>` in the new `AgentRuntime.rag`
- Allocates `todo_list: Some(TodoList::default())` only when `auto_continue: true`; otherwise `None`
- Constructs the `AgentRuntime` and assigns it to `ctx.agent_runtime`
- **Preserves today's clobber behavior for standalone RAG:** does NOT save `ctx.rag` anywhere. When the agent exits, the user's previous `.rag <name>` selection is not restored (matches current behavior). Stacking / restoration is flagged as a Phase 2+ enhancement.
8. `exit_agent` drops `ctx.agent_runtime` (which drops the agent's `Arc<Rag>`; the cache entry becomes evictable if no other scope holds it) and rebuilds `ctx.tool_scope` from the now-topmost `RoleLike`.
9. Sub-agent spawning (in `function/supervisor.rs`) constructs a fresh `RequestContext` for the child from the shared `AppState`:
- Its own `ToolScope` via `McpFactory::acquire()` calls for the child agent's `mcp_servers`
- Its own `AgentRuntime` with:
- `rag` via `rag_cache.load(RagKey::Agent(child_agent_name))` — shared with parent/siblings of same type
- Fresh `Supervisor`, fresh `Inbox`, `current_depth = parent.depth + 1`
- `parent_supervisor = Some(parent.supervisor.clone())` (for messaging)
- `escalation_queue = parent.escalation_queue.clone()` — one queue, rooted at the human
- `todo_list` honoring the child's own `auto_continue` flag
10. Old `Agent::init` logic that mutates a global `McpRegistry` is removed — that's now `McpFactory::acquire()` producing scope-local handles.
11. `rebuild_rag` and `edit_rag_docs` are updated to determine the correct `RagKey` (check `ctx.agent_runtime` first — if present use `RagKey::Agent(spec.name)`, otherwise use the standalone name from `ctx.rag`'s origin) and call `rag_cache.invalidate(&key)` before reloading.
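The scope-priority resolution in item 5 can be sketched as a first-`Some` chain (struct and field names are illustrative):

```rust
// Most specific active scope wins: Agent > Session > Role > Global.
struct Scopes<'a> {
    agent: Option<&'a str>,   // agent's mcp_servers list, e.g. "github,jira"
    session: Option<&'a str>,
    role: Option<&'a str>,
    global: Option<&'a str>,  // AppConfig.enabled_mcp_servers
}

fn effective_servers(s: &Scopes) -> Vec<String> {
    s.agent.or(s.session).or(s.role).or(s.global)
        .map(|list| list.split(',').map(|x| x.trim().to_string()).collect())
        .unwrap_or_default()
}

fn main() {
    let s = Scopes { agent: None, session: Some("github"), role: Some("jira"), global: Some("fs") };
    // Session is the most specific active scope here, so it wins outright.
    assert_eq!(effective_servers(&s), vec!["github".to_string()]);
}
```

The resolved list then drives one `McpFactory::acquire()` call per server key when the new `ToolScope` is built.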
**What this step preserves:**
- **Diff-based reinit for REPL users** — when you `.exit role` from `[github, jira]` back to global `[github]`, the new `ToolScope` is built by calling `McpFactory::acquire("github")`. Without pooling (Phase 1), this respawns `github`. With pooling (Phase 5), no scope holds `github`'s `Arc` at that moment, but the idle pool keeps the process warm, so revival is instant. The Phase 1 version has a mild regression here that Phase 5 fixes.
- **Agent-vs-non-agent compatibility** — today's `Agent::init` reinits a global registry; after this step, agent activation replaces `ctx.tool_scope` with an agent-specific one, and `exit_agent` restores the pre-agent scope by rebuilding from the (now-active) role/session/global lists.
- **Todo semantics from the user's perspective** — today's behavior is "todos are available when `auto_continue: true`". After Step 6.5, it's still "todos are available when `auto_continue: true`" — the only difference is we skip the wasted `TodoList::default()` allocation for the other agents.
**Risk:** Medium-high. This is where the Phase 1 refactor stops being mechanical and starts having semantic implications. Six things to watch:
1. **Parent scope restoration on `exit_agent`.** Today, `exit_agent` tears down the agent's MCP set but leaves the registry in whatever state `reinit` put it — the parent's original MCP set is NOT restored. Users don't notice because the next scope activation (or REPL exit) reinits anyway. In the new design, `exit_agent` MUST rebuild the parent's `ToolScope` from the still-active role/session/global lists so the user sees the expected state. Test this carefully.
2. **`McpFactory` contention.** With many concurrent sub-agents (say, 4 siblings each needing different MCP sets), the factory's mutex could become a bottleneck during `acquire()`. Hold the lock only while touching `active`, never while awaiting subprocess spawn.
3. **`RagCache` concurrent first-load.** If two consumers request the same `RagKey` simultaneously and neither finds a cached entry, both will try to `Rag::load` from disk. Use per-key `tokio::sync::Mutex` or `OnceCell` to serialize the first load — the second caller blocks briefly and receives the shared Arc. This applies equally to standalone and agent RAGs.
4. **Weak ref staleness in `RagCache`.** The `Weak<Rag>` in the map might point to a dropped `Rag`. The `load()` path MUST attempt `Weak::upgrade()` before returning; if upgrade fails, treat it as a miss and reload.
5. **`rebuild_rag` / `edit_rag_docs` race.** If a user runs `.rag rebuild` while another scope holds the same `Arc<Rag>` (concurrent API session, running sub-agent, etc.), the cache invalidation must NOT yank the Arc out from under the active holder. The `Arc` keeps its reference alive — invalidation just ensures *future* loads read fresh. This is the correct behavior for both standalone and agent RAG; worth confirming in tests.
6. **Identifying the right `RagKey` during rebuild.** `rebuild_rag` today operates on `Config.rag` without knowing its origin. In the new model, the code needs to check `ctx.agent_runtime` first to determine if the active RAG is agent-owned (`RagKey::Agent`) or standalone (`RagKey::Named`). Get this wrong and you invalidate the wrong cache entry, silently breaking subsequent loads.
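The weak-ref semantics in watch items 4 and 5 can be illustrated with a synchronous stand-in (the real `load()` is async and reads from disk; the per-key first-load serialization from item 3 is omitted here):

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock, Weak};

#[derive(Hash, Eq, PartialEq, Clone)]
enum RagKey { Named(String), Agent(String) }

struct Rag { id: u32 } // stub; `id` distinguishes loads for the demo

#[derive(Default)]
struct RagCache {
    entries: RwLock<HashMap<RagKey, Weak<Rag>>>,
    next_id: RwLock<u32>,
}

impl RagCache {
    fn load(&self, key: &RagKey) -> Arc<Rag> {
        // A dead Weak is treated as a miss (watch item 4).
        if let Some(rag) = self.entries.read().unwrap().get(key).and_then(Weak::upgrade) {
            return rag;
        }
        let mut id = self.next_id.write().unwrap();
        *id += 1;
        let rag = Arc::new(Rag { id: *id }); // stand-in for Rag::load from disk
        self.entries.write().unwrap().insert(key.clone(), Arc::downgrade(&rag));
        rag
    }

    fn invalidate(&self, key: &RagKey) {
        // Only the map entry goes; live Arcs keep working (watch item 5).
        self.entries.write().unwrap().remove(key);
    }
}

fn main() {
    let cache = RagCache::default();
    let key = RagKey::Agent("researcher".into());
    let a = cache.load(&key);
    let b = cache.load(&key);
    assert!(Arc::ptr_eq(&a, &b)); // shared across scopes
    cache.invalidate(&key);
    assert_eq!(a.id, 1);          // existing holder is unaffected
    let c = cache.load(&key);
    assert_eq!(c.id, 2);          // future loads read fresh
}
```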
### Step 7: Tackle mixed methods (THE HARD PART)
These 17 methods conditionally read global config OR per-request state depending on what's active. They need to be split into explicit parameter passing.
| Method | Why it's mixed | Refactoring approach |
|---|---|---|
| `current_model` | Returns agent model, session model, role model, or global model | Take `(&AppConfig, &RequestContext) -> &Model` — check ctx first, fall back to app |
| `extract_role` | Builds role from session/agent/role or global settings | Take `(&AppConfig, &RequestContext) -> Role` |
| `sysinfo` | Reads global settings + current rag/session/agent | Take `(&AppConfig, &RequestContext) -> String` |
| `set_temperature` | Sets on role-like or global | Split: `ctx.set_temperature()` for role-like, `app.set_temperature()` for global |
| `set_top_p` | Same pattern as temperature | Same split |
| `set_enabled_tools` | Same pattern | Same split |
| `set_enabled_mcp_servers` | Same pattern | Same split |
| `set_save_session` | Sets on session or global | Same split |
| `set_compression_threshold` | Sets on session or global | Same split |
| `set_rag_reranker_model` | Sets on rag or global | Same split |
| `set_rag_top_k` | Sets on rag or global | Same split |
| `set_max_output_tokens` | Sets on role-like model or global model | Same split |
| `set_model` | Sets on role-like or global | Same split |
| `retrieve_role` | Loads role, merges with current model settings | Take `(&AppConfig, &RequestContext, name) -> Role` |
| `use_role_safely` | Takes GlobalConfig, does take/replace pattern | Refactor to `(&mut RequestContext, name)` |
| `use_session_safely` | Takes GlobalConfig, does take/replace pattern | Refactor to `(&mut RequestContext, name)` |
| `save_message` | Reads `save` flag (global) + writes to messages_file (agent-dependent path) | Take `(&AppConfig, &RequestContext, input, output)` |
| `render_prompt_left/right` | Reads prompt format (global) + current model/session/agent/role (request) | Take `(&AppConfig, &RequestContext) -> String` |
| `generate_prompt_context` | Same as prompt rendering | Take `(&AppConfig, &RequestContext) -> HashMap` |
| `repl_complete` | Reads both global config and request state for completions | Take `(&AppConfig, &RequestContext, cmd, args) -> Vec<...>` |
**Common pattern for `set_*` methods:** The current code does something like:
```rust
fn set_temperature(&mut self, value: Option<f64>) {
if let Some(role_like) = self.role_like_mut() {
role_like.set_temperature(value);
} else {
self.temperature = value;
}
}
```
This becomes:
```rust
// On RequestContext:
fn set_temperature(&mut self, value: Option<f64>) {
    if let Some(role_like) = self.role_like_mut() {
        role_like.set_temperature(value);
    }
    // No role-like active: the global default is updated through the
    // separate AppConfig path described in the table above.
}
```
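At the callsite, the split makes the fallback explicit: the handler tries the request-scoped path first and only then touches the global default. A sketch with illustrative type and method names:

```rust
// Stand-in types; the real ones carry many more fields.
struct AppConfig { temperature: Option<f64> }
struct RequestContext { role_temperature: Option<f64>, has_role: bool }

impl RequestContext {
    // Returns true when an active role-like consumed the value.
    fn set_temperature(&mut self, value: Option<f64>) -> bool {
        if self.has_role {
            self.role_temperature = value;
            true
        } else {
            false
        }
    }
}

fn handle_set_temperature(ctx: &mut RequestContext, app: &mut AppConfig, value: Option<f64>) {
    if !ctx.set_temperature(value) {
        app.temperature = value; // global fallback, explicit at the callsite
    }
}

fn main() {
    let mut app = AppConfig { temperature: None };
    let mut ctx = RequestContext { role_temperature: None, has_role: false };
    handle_set_temperature(&mut ctx, &mut app, Some(0.2));
    assert_eq!(app.temperature, Some(0.2)); // no role active, so the global path ran
}
```

Moving the branch to the caller is what unpicks the "mixed" methods: each callsite states which half it mutates instead of relying on `Config` to guess.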
### Step 8: The Caller Migration Epic (absorbed scope from Steps 6.5 and 7)
**Important:** the original plan described Step 8 as "rewrite main.rs and repl/mod.rs entry points." During implementation, Steps 6.5 and 7 deliberately deferred their semantic rewrites to Step 8 so the bridge pattern (add new methods alongside old, don't migrate callers yet) stayed consistent. As a result, Step 8 now absorbs:
- **Original Step 8 scope:** entry point rewrite (`main.rs`, `repl/mod.rs`)
- **From Step 6.5 deferrals:** `McpFactory::acquire()` implementation, scope transition rewrites (`use_role`, `use_session`, `use_agent`, `exit_agent`), RAG lifecycle via `RagCache` (`use_rag`, `edit_rag_docs`, `rebuild_rag`), session compression/autoname, `apply_prelude`, sub-agent spawning
- **From Step 7 deferrals:** `Model::retrieve_model` client module refactor, `retrieve_role`, `set_model`, `repl_complete`, `setup_model`, `update` dispatcher, `set_rag_reranker_model`, `set_rag_top_k`, `use_role_safely`/`use_session_safely` elimination, `use_prompt`, `edit_role`
This is a large amount of work. Step 8 is split into **eight sub-steps (8a–8h)** for clarity and reviewability. Each sub-step keeps the build green and can be completed as a standalone unit.
**Dependency graph between sub-steps:**
```
┌─────────┐
│ 8a │ client module refactor
│ (Model: │ (Model::retrieve_model, list_models,
│ &App- │ list_all_models! → &AppConfig)
│ Config) │
└────┬────┘
│ unblocks
┌─────────┐
│ 8b │ remaining Step 7 deferrals
│ (Step 7 │ (retrieve_role, set_model, setup_model,
│ debt) │ use_prompt, edit_role, update, etc.)
└────┬────┘
┌──────────┐ │
│ 8c │ │
│(McpFac- │──┐ │
│ tory:: │ │ unblocks │
│acquire())│ │ │
└──────────┘ │ │
▼ │
┌─────────┐ │
│ 8d │ │
│ (scope │──┐ │
│ trans.) │ │ unblocks │
└─────────┘ │ │
▼ │
┌─────────┐ │
│ 8e │ │
│ (RAG + │──┐ │
│session) │ │ │
└─────────┘ │ │
▼ ▼
┌───────────────────┐
│ 8f + 8g │
│ caller migration │
│ (main.rs, REPL) │
└─────────┬─────────┘
┌──────────┐
│ 8h │ remaining callsites
│ (sweep) │ (priority-ordered)
└──────────┘
```
---
#### Step 8a: Client module refactor — `Model::retrieve_model` takes `&AppConfig`
Target: remove the `&Config` dependency from the LLM client infrastructure so Step 8b's mixed-method migrations (`retrieve_role`, `set_model`, `repl_complete`, `setup_model`) can proceed.
**Files touched:**
- `src/client/model.rs``Model::retrieve_model(config: &Config, ...)``Model::retrieve_model(config: &AppConfig, ...)`
- `src/client/macros.rs``list_all_models!` macro takes `&AppConfig` instead of `&Config`
- `src/client/*.rs``list_models`, helper functions updated to take `&AppConfig`
- Any callsite in `src/config/`, `src/main.rs`, `src/repl/`, etc. that calls these client functions — updated to pass `&config.app_config_snapshot()` or equivalent during the bridge window
**Bridge strategy:** add a helper `Config::app_config_snapshot(&self) -> AppConfig` that clones the serialized fields into an `AppConfig`. Callsites that currently pass `&*config.read()` pass `&config.read().app_config_snapshot()` instead. This is slightly wasteful (clones ~40 fields per call) but keeps the bridge window working without a mass caller rewrite. Step 8f/8g will eliminate the clones when callers hold `Arc<AppState>` directly.
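The bridge helper is a plain field-by-field clone. A minimal sketch, with a hypothetical four-field subset standing in for the ~40 serialized fields:

```rust
use std::path::PathBuf;

// Hypothetical subset of the serialized fields; the real structs carry ~40.
#[derive(Clone, Debug, PartialEq)]
pub struct AppConfig {
    pub model_id: String,
    pub temperature: Option<f64>,
    pub dry_run: bool,
    pub vault_password_file: Option<PathBuf>,
}

pub struct Config {
    pub model_id: String,
    pub temperature: Option<f64>,
    pub dry_run: bool,
    pub vault_password_file: Option<PathBuf>,
    // ... runtime (#[serde(skip)]) fields omitted ...
}

impl Config {
    /// Bridge-window helper: clone the serialized fields into an AppConfig so
    /// client code can take &AppConfig before callers hold Arc<AppState>.
    pub fn app_config_snapshot(&self) -> AppConfig {
        AppConfig {
            model_id: self.model_id.clone(),
            temperature: self.temperature,
            dry_run: self.dry_run,
            vault_password_file: self.vault_password_file.clone(),
        }
    }
}
```

The clone cost is deliberate: it buys a working bridge window without rewriting every caller at once.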
**Verification:** full build green. All existing tests pass. CLI/REPL manual smoke test: `loki --model openai:gpt-4o "hello"` still works.
**Risk:** Low. Mechanical refactor. The bridge helper absorbs the signature change cost.
---
#### Step 8b: Finish Step 7's deferred mixed-method migrations
With Step 8a done, the methods that transitively depended on `Model::retrieve_model(&Config)` can now migrate to `RequestContext` with `&AppConfig` parameters.
**Methods migrated to `RequestContext`:**
- `retrieve_role(&self, app: &AppConfig, name: &str) -> Result<Role>`
- `set_model_on_role_like(&mut self, app: &AppConfig, model_id: &str) -> Result<bool>` (paired with `AppConfig::set_model_default`)
- `repl_complete(&self, app: &AppConfig, cmd: &str, args: &[&str]) -> Vec<(String, Option<String>)>`
- `setup_model(&mut self, app: &AppConfig) -> Result<()>``setup_model` writes to both `self.model_id` (serialized) and `self.model` (runtime), so it splits in two: `AppConfig::ensure_default_model_id()` picks the first available model and updates `self.model_id`, while `RequestContext::reload_current_model(&AppConfig)` refreshes `ctx.model` from the app config's id.
- `use_prompt(&mut self, app: &AppConfig, prompt: &str) -> Result<()>` — trivial wrapper around `extract_role` (already done) + `use_role_obj` (Step 6)
- `edit_role(&mut self, app: &AppConfig, abort_signal: AbortSignal) -> Result<()>` — calls `app.editor()`, `upsert_role`, `use_role` (still deferred to 8d)
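The `setup_model` split can be sketched as follows — a minimal sketch with illustrative field and type names, not the real signatures:

```rust
// Hypothetical sketch of the setup_model split: AppConfig owns the serialized
// default id, RequestContext owns the runtime Model.
#[derive(Clone, Debug, PartialEq)]
pub struct Model { pub id: String }

pub struct AppConfig { pub model_id: String }

impl AppConfig {
    /// Pick the first available model when no default id is configured.
    pub fn ensure_default_model_id(&mut self, available: &[&str]) -> Result<(), String> {
        if self.model_id.is_empty() {
            self.model_id = available
                .first()
                .ok_or_else(|| "no models available".to_string())?
                .to_string();
        }
        Ok(())
    }
}

pub struct RequestContext { pub model: Model }

impl RequestContext {
    /// Refresh the runtime model from the app config's default id.
    pub fn reload_current_model(&mut self, app: &AppConfig) {
        self.model = Model { id: app.model_id.clone() };
    }
}
```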
**RAG-related deferrals:**
- `set_rag_reranker_model` and `set_rag_top_k` get split: the runtime branch (update the active `Rag`) becomes a `RequestContext` method taking `Arc<Rag>` mutation, and the global branch becomes `AppConfig::set_rag_reranker_model_default` / `AppConfig::set_rag_top_k_default`.
**`update` dispatcher:** Once all the individual `set_*` methods exist on both types, `update` migrates to `RequestContext::update(&mut self, app: &mut AppConfig, data: &str) -> Result<()>`. The dispatcher's body becomes a match that calls the appropriate split pair for each key.
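A minimal sketch of the migrated dispatcher shape, with two illustrative keys (the real match has ~15 arms and the real key/field names may differ):

```rust
// Hypothetical sketch: each `.set` key routes to its split pair — a
// RequestContext method for runtime scope, an AppConfig method for the
// global default.
pub struct AppConfig { pub dry_run: bool }

impl AppConfig {
    pub fn set_dry_run_default(&mut self, v: bool) { self.dry_run = v; }
}

pub struct RequestContext { pub temperature: Option<f64> }

impl RequestContext {
    pub fn set_temperature(&mut self, v: Option<f64>) { self.temperature = v; }

    pub fn update(&mut self, app: &mut AppConfig, data: &str) -> Result<(), String> {
        let (key, value) = data
            .split_once(' ')
            .ok_or_else(|| format!("usage: <key> <value>, got '{data}'"))?;
        match key {
            "temperature" => self.set_temperature(Some(
                value.parse::<f64>().map_err(|e| e.to_string())?,
            )),
            "dry_run" => app.set_dry_run_default(
                value.parse::<bool>().map_err(|e| e.to_string())?,
            ),
            other => return Err(format!("unknown key '{other}'")),
        }
        Ok(())
    }
}
```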
**`use_role_safely` / `use_session_safely`:** Still not eliminated in 8b — they're wrappers around the still-`Config`-based `use_role` and `use_session`. Eliminated in 8f/8g when callers switch to `&mut RequestContext`.
**Verification:** full build green. All tests pass. Smoke test: `.set temperature 0.7`, `.set enabled_tools fs`, `.model openai:gpt-4o` all work in REPL.
**Risk:** Low. Same bridge pattern, now unblocked by 8a.
---
#### Step 8c: Extract `McpFactory::acquire()` from `McpRegistry::init_server`
Target: give `McpFactory` a working `acquire()` method so Step 8d can build real `ToolScope` instances.
**Files touched:**
- `src/mcp/mod.rs` — extract the MCP subprocess spawn + rmcp handshake logic (currently inside `McpRegistry::init_server`, ~60 lines) into a standalone function:
```rust
pub(crate) async fn spawn_mcp_server(
spec: &McpServer,
log_path: Option<&Path>,
abort_signal: &AbortSignal,
) -> Result<ConnectedServer>
```
`McpRegistry::init_server` then calls this helper and does its own bookkeeping. Backward-compatible for bridge callers.
- `src/config/mcp_factory.rs` — implement `McpFactory::acquire(spec: &McpServer, log_path, abort_signal) -> Result<Arc<ConnectedServer>>`:
1. Build an `McpServerKey` from the spec
2. Try `self.try_get_active(&key)` → share if upgraded
3. Otherwise call `spawn_mcp_server(spec, ...).await` → wrap in `Arc` → `self.insert_active(key, &arc)` → return
- Write a couple of integration tests that exercise the factory's sharing behavior with a mock server spec (or document why a real integration test needs Phase 5's pooling work)
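The share-or-spawn logic in `acquire()` can be sketched synchronously — a hypothetical simplification where `Arc::new` stands in for the async `spawn_mcp_server` call, and the method/type names mirror the plan's, not necessarily the final code:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, Weak};

#[derive(Clone, PartialEq, Eq, Hash)]
pub struct McpServerKey(pub String);

pub struct ConnectedServer { pub key: McpServerKey }

#[derive(Default)]
pub struct McpFactory {
    // Weak entries: a server is retired once the last ToolScope drops it.
    active: Mutex<HashMap<McpServerKey, Weak<ConnectedServer>>>,
}

impl McpFactory {
    pub fn acquire(&self, key: McpServerKey) -> Arc<ConnectedServer> {
        // Lock held only for map access — in the real async version it must
        // never be held across the subprocess spawn await point.
        if let Some(live) = self.active.lock().unwrap().get(&key).and_then(Weak::upgrade) {
            return live; // share the already-running server
        }
        let server = Arc::new(ConnectedServer { key: key.clone() }); // spawn_mcp_server here
        self.active.lock().unwrap().insert(key, Arc::downgrade(&server));
        server
    }
}
```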
**What this step does NOT do:** no caller migration, no `ToolScope` construction, no changes to `McpRegistry::reinit`. Step 8d does those.
**Verification:** new unit tests pass. Existing tests pass. `McpRegistry` still works for all current callers.
**Risk:** Medium. The spawn logic is intricate (child process + stdio handshake + error recovery). Extracting without a behavior change requires careful diff review.
---
#### Step 8d: Scope transition rewrites — `use_role`, `use_session`, `use_agent`, `exit_agent`
Target: build real `ToolScope` instances via `McpFactory` when scopes change. This is where Step 6.5's scaffolding stops being scaffolding.
**New methods on `RequestContext`:**
- `use_role(&mut self, app: &AppConfig, name: &str, abort_signal: AbortSignal) -> Result<()>`:
1. Call `self.retrieve_role(app, name)?` (from 8b)
2. Resolve the role's `enabled_mcp_servers` list
3. Build a fresh `ToolScope` by calling `app.mcp_factory.acquire(spec, ...)` for each required server
4. Populate `ctx.tool_scope.functions` with the role's effective function list via `select_functions(app, &role)`
5. Swap `ctx.tool_scope` atomically
6. Call `self.use_role_obj(role)` (from Step 6)
- `use_session(&mut self, app: &AppConfig, session_name: Option<&str>, abort_signal) -> Result<()>` — same pattern, with session-specific handling for `agent_session_variables`
- `use_agent(&mut self, app: &AppConfig, agent_name: &str, session_name: Option<&str>, abort_signal) -> Result<()>` — builds an `AgentRuntime` (Step 6.5 scaffolding), populates `ctx.agent_runtime`, activates the optional inner session
- `exit_agent(&mut self, app: &AppConfig) -> Result<()>` — drops `ctx.agent_runtime`, rebuilds `ctx.tool_scope` from the now-topmost RoleLike (role/session/global), cancels the supervisor, clears RAG if it came from the agent
**Key invariant: parent scope restoration on `exit_agent`.** Today's `Config::exit_agent` leaves the `McpRegistry` in whatever state the agent left it. The new `exit_agent` explicitly rebuilds `ctx.tool_scope` from the current role/session/global enabled-server lists so the user sees the expected state after exiting an agent. This is a semantic improvement over today's behavior (which technically has a latent bug that nobody notices because the next scope activation fixes it).
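The shared rebuild step behind `use_role` and `exit_agent` can be sketched as a wholesale scope swap — a minimal sketch with illustrative names, where a free `acquire` function stands in for the async `McpFactory::acquire`:

```rust
use std::collections::HashMap;
use std::sync::Arc;

pub struct ConnectedServer { pub name: String }

#[derive(Default)]
pub struct ToolScope {
    pub servers: HashMap<String, Arc<ConnectedServer>>,
}

// Stand-in for the (async, subprocess-spawning) McpFactory::acquire.
fn acquire(name: &str) -> Arc<ConnectedServer> {
    Arc::new(ConnectedServer { name: name.to_string() })
}

/// Build a fresh scope for `enabled`, then replace the old scope wholesale —
/// no stale servers survive, which is what makes exit_agent restore the
/// parent scope correctly.
pub fn rebuild_tool_scope(scope: &mut ToolScope, enabled: &[&str]) {
    let mut fresh = ToolScope::default();
    for name in enabled {
        fresh.servers.insert(name.to_string(), acquire(name));
    }
    *scope = fresh;
}
```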
**What this step does NOT do:** no caller migration. `Config::use_role`, `Config::use_session`, etc. are still on `Config` and still work for existing callers. The `_safely` wrappers are still around.
**Verification:** new `RequestContext::use_role` etc. have unit tests. Full build green. Existing tests pass. No runtime behavior change because nothing calls the new methods yet.
**Risk:** Medium-high. This is the first time `McpFactory::acquire()` is exercised outside unit tests. Specifically watch:
- **`McpFactory` mutex contention** — hold the `active` lock only during HashMap mutation, never across subprocess spawn or `await`
- **Parent scope restoration correctness** — write a targeted test that activates an agent with `[github]`, exits, activates a role with `[jira]`, and verifies the tool scope has only `jira` (not `github` leftover)
---
#### Step 8e: RAG lifecycle + session compression + `apply_prelude`
Target: migrate the Category C deferrals from Step 6 (session/RAG lifecycle methods that currently take `&GlobalConfig`).
**New methods on `RequestContext`:**
- `use_rag(&mut self, app: &AppConfig, name: Option<&str>, abort_signal) -> Result<()>` — routes through `app.rag_cache.load(RagKey::Named(name))`
- `edit_rag_docs(&mut self, app: &AppConfig, abort_signal) -> Result<()>` — determines the `RagKey` (Agent or Named) from `ctx.agent_runtime` / `ctx.rag` origin, calls `app.rag_cache.invalidate(&key)`, reloads
- `rebuild_rag(&mut self, app: &AppConfig, abort_signal) -> Result<()>` — same pattern as `edit_rag_docs`
- `compress_session(&mut self, app: &AppConfig) -> Result<()>` — reads `app.summarization_prompt`, `app.summary_context_prompt`, mutates `ctx.session`. Async, does an LLM call via an existing `Input::from_str` pattern.
- `maybe_compress_session(&mut self, app: &AppConfig) -> bool` — checks `ctx.session.needs_compression(app.compression_threshold)`, triggers compression if so. Returns whether compression was triggered; caller decides whether to spawn a background task (the task spawning moves to the caller's responsibility, not the method's).
- `autoname_session(&mut self, app: &AppConfig) -> Result<()>` — same pattern, uses `CREATE_TITLE_ROLE` and `Input::from_str`
- `maybe_autoname_session(&mut self, app: &AppConfig) -> bool` — same return-bool pattern
- `apply_prelude(&mut self, app: &AppConfig, abort_signal) -> Result<()>` — parses `app.repl_prelude` / `app.cmd_prelude`, calls the new `self.use_role()` / `self.use_session()` from 8d
**The `GlobalConfig`-taking static methods go away.** Today's code uses the pattern `Config::maybe_compress_session(config: GlobalConfig)`, which takes an owned `Arc<RwLock<Config>>` and spawns a background task. After 8e, the new `RequestContext::maybe_compress_session` returns a bool; callers that want async compression spawn the task themselves from their own `RequestContext`. This is simpler and more explicit.
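The return-bool pattern looks roughly like this — a hypothetical sketch with the LLM summarization stubbed out and illustrative field names:

```rust
// The method decides and compresses; whether that runs inline or in a
// spawned task is now the caller's choice.
pub struct Session {
    pub messages: Vec<String>,
    pub summary: Option<String>,
}

pub struct RequestContext { pub session: Session }

impl RequestContext {
    /// Returns true when compression was triggered.
    pub fn maybe_compress_session(&mut self, threshold: usize) -> bool {
        if self.session.messages.len() < threshold {
            return false;
        }
        // Real code: summarize via an LLM call (Input::from_str pattern).
        self.session.summary =
            Some(format!("[{} messages summarized]", self.session.messages.len()));
        self.session.messages.clear();
        true
    }
}
```

The REPL turn loop then reads the bool and, if it wants today's background behavior, wraps the call in its own spawned task rather than handing the method a `GlobalConfig`.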
**Verification:** new methods have unit tests where feasible. Full build green. `compress_session` and `autoname_session` are tricky to unit-test because they do LLM calls; mock the LLM or skip the full path in tests.
**Risk:** Medium. The session compression flow is the most behavior-sensitive — getting the semantics wrong here results in lost session history. Write a targeted integration test that feeds 10+ user messages into a session, triggers compression, and verifies the session's summary is preserved.
---
#### Step 8f: Entry point rewrite — `main.rs`
Target: rewrite `main.rs` to construct `AppState` + `RequestContext` explicitly instead of using `GlobalConfig`.
**Specific changes:**
- `Config::init()` → `AppState::init()` which:
1. Loads `config.yaml` into `AppConfig`
2. Applies environment variable overrides (calls `AppConfig::load_envs` from Step 4)
3. Calls `AppConfig::setup_document_loaders` / `AppConfig::setup_user_agent` (Step 4)
4. Constructs the `Vault`, `McpFactory`, `RagCache`
5. Returns `Arc<AppState>`
- `main::run()` constructs a `RequestContext` from the `AppState` and threads it through to subcommands
- `main::start_directive(ctx: &mut RequestContext, ...)` — signature change
- `main::create_input(ctx: &RequestContext, ...)` — signature change
- `main::shell_execute(ctx: &mut RequestContext, ...)` — signature change
- All 18 `main.rs` callsites updated
**`load_functions` and `load_mcp_servers`:** These are initialization-time methods that populate `ctx.tool_scope.functions` and `ctx.tool_scope.mcp_runtime`. They move from `Config` to a new `RequestContext::bootstrap_tools(&mut self, app: &AppConfig, abort_signal) -> Result<()>` that:
1. Initializes `Functions` via `Functions::init(visible_tools)` (existing code)
2. Resolves the initial enabled-MCP-server list from `app.enabled_mcp_servers`
3. Calls `app.mcp_factory.acquire()` for each
4. Assigns the result to `ctx.tool_scope`
This replaces the `Config::load_functions` + `Config::load_mcp_servers` call sequence in today's `main.rs`.
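The four-step bootstrap above can be sketched as — a minimal sketch with illustrative types, where `Arc::new` again stands in for `McpFactory::acquire`:

```rust
use std::collections::HashMap;
use std::sync::Arc;

pub struct ConnectedServer { pub name: String }

#[derive(Default)]
pub struct ToolScope {
    pub functions: Vec<String>,
    pub servers: HashMap<String, Arc<ConnectedServer>>,
}

pub struct AppConfig {
    pub visible_tools: Vec<String>,
    pub enabled_mcp_servers: Vec<String>,
}

pub struct RequestContext { pub tool_scope: ToolScope }

impl RequestContext {
    pub fn bootstrap_tools(&mut self, app: &AppConfig) {
        // 1. Functions::init(visible_tools) in the real code.
        self.tool_scope.functions = app.visible_tools.clone();
        // 2-4. Acquire each initially enabled server and install the scope.
        for name in &app.enabled_mcp_servers {
            self.tool_scope
                .servers
                .insert(name.clone(), Arc::new(ConnectedServer { name: name.clone() }));
        }
    }
}
```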
**Verification:** CLI smoke tests from the original plan's Step 8 verification checklist. Specifically:
- `loki "hello"` — plain prompt
- `loki --role explain "what is TCP"` — role activation
- `loki --session my-project "..."` — session
- `loki --agent sisyphus "..."` — agent activation
- `loki --info` — sysinfo output
Each should produce output matching the pre-Step-8 behavior exactly.
**Risk:** High. `main.rs` is the primary entry point; any regression here is user-visible. Write smoke tests that compare CLI output byte-for-byte with a recorded baseline.
---
#### Step 8g: REPL rewrite — `repl/mod.rs`
Target: rewrite `repl/mod.rs` to use `&mut RequestContext` instead of `GlobalConfig`.
**Specific changes:**
- `Repl` struct: `config: GlobalConfig` → `ctx: RequestContext` (long-lived, mutable across turns)
- `run_repl_command(ctx: &mut RequestContext, ...)` — signature change
- `ask(ctx: &mut RequestContext, ...)` — signature change
- Every dot-command handler updated. Dot-commands that take the `GlobalConfig` pattern (like the `_safely` wrappers) are **eliminated** — they just call `ctx.use_role(...)` directly.
- All 39 command handlers migrated
- All 12 `repl/mod.rs` internal callsites updated
**`use_role_safely` / `use_session_safely` elimination:** these wrappers exist only because `Config::use_role` is `&mut self` and the REPL holds `Arc<RwLock<Config>>`. After Step 8g, the REPL holds `RequestContext` directly (no lock), so the wrappers are no longer needed and get deleted.
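A before/after sketch of the wrapper elimination, under hypothetical simplified types: once the REPL owns the context, a dot-command handler mutates it directly, and there is no lock to take and put back.

```rust
pub struct RequestContext { pub role: Option<String> }

impl RequestContext {
    pub fn use_role(&mut self, name: &str) -> Result<(), String> {
        self.role = Some(name.to_string());
        Ok(())
    }
}

pub struct Repl {
    // Was: config: GlobalConfig (Arc<RwLock<Config>>), which forced
    // use_role_safely to take the Config out of the lock and put it back.
    pub ctx: RequestContext,
}

impl Repl {
    pub fn handle_role_command(&mut self, name: &str) -> Result<(), String> {
        self.ctx.use_role(name) // direct &mut call; no take/replace dance
    }
}
```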
**Verification:** REPL smoke tests matching the pre-Step-8 behavior. Specifically:
- Start REPL, issue a prompt → should see same output
- `.role explain`, `.session my-session`, `.agent sisyphus`, `.exit agent` — should all work identically
- `.set temperature 0.7` then `.info` — should show updated temperature
- Ctrl-C during an LLM call — should cleanly abort
- `.macro run-tests` — should execute without errors
**Risk:** High. Same reason as 8f — this is a user-visible entry point. Test every dot-command.
---
#### Step 8h: Remaining callsite sweep
Target: migrate the remaining modules in priority order (lowest callsite count first, keeping the build green after each module):
| Priority | Module | Callsites | Notes |
|---|---|---|---|
| 1 | `render/mod.rs` | 1 | `render_stream` just reads config — trivial |
| 2 | `repl/completer.rs` | 1 | Just reads for completions |
| 3 | `repl/prompt.rs` | 1 | Just reads for prompt rendering |
| 4 | `function/user_interaction.rs` | 1 | Just reads for user prompts |
| 5 | `function/mod.rs` | 2 | `eval_tool_calls` reads config |
| 6 | `config/macros.rs` | 3 | `macro_execute` reads and writes |
| 7 | `function/todo.rs` | 4 | Todo handlers read/write agent state |
| 8 | `config/input.rs` | 6 | Input creation — reads config |
| 9 | `rag/mod.rs` | 6 | RAG init/search |
| 10 | `function/supervisor.rs` | 8 | Sub-agent spawning — complex |
| 11 | `config/agent.rs` | 12 | Agent init — complex, many mixed concerns |
**Sub-agent spawning** (`function/supervisor.rs`) is the most complex item in the sweep. Each child agent gets a fresh `RequestContext` forked from the parent's `Arc<AppState>`:
- Own `ToolScope` built by calling `app.mcp_factory.acquire()` for the child's `mcp_servers` list
- Own `AgentRuntime` with fresh supervisor, fresh inbox, `current_depth = parent.depth + 1`
- `parent_supervisor = Some(parent.agent_runtime.supervisor.clone())` — weakly linked to parent for messaging
- `escalation_queue = parent.agent_runtime.escalation_queue.clone()` — `Arc`-shared from root
- RAG served via `app.rag_cache.load(RagKey::Agent(child_name))` — shared with any sibling of the same type
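The fork can be sketched as follows — a hypothetical minimal version where per-child pieces are built fresh and shared pieces are `Arc`-cloned from the parent (type and field names are illustrative):

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

#[derive(Default)]
pub struct Supervisor;

#[derive(Clone, Default)]
pub struct ToolScope { pub servers: Vec<String> }

pub struct AgentRuntime {
    pub supervisor: Arc<Supervisor>,
    pub parent_supervisor: Option<Arc<Supervisor>>,
    pub escalation_queue: Arc<Mutex<VecDeque<String>>>,
    pub current_depth: u32,
}

pub struct RequestContext {
    pub tool_scope: ToolScope,
    pub agent_runtime: Option<AgentRuntime>,
}

pub fn fork_child(parent: &RequestContext, child_servers: Vec<String>) -> RequestContext {
    let parent_rt = parent.agent_runtime.as_ref().expect("parent runs an agent");
    RequestContext {
        // Fresh scope: the child acquires its own servers via McpFactory.
        tool_scope: ToolScope { servers: child_servers },
        agent_runtime: Some(AgentRuntime {
            supervisor: Arc::new(Supervisor),                      // fresh
            parent_supervisor: Some(parent_rt.supervisor.clone()), // link to parent
            escalation_queue: parent_rt.escalation_queue.clone(),  // shared from root
            current_depth: parent_rt.current_depth + 1,
        }),
    }
}
```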
`config/agent.rs` — `Agent::init` is currently tightly coupled to `Config`. It needs to be rewritten to take `&AppState` + `&mut RequestContext`. Some of its complexity (MCP server startup, RAG loading) moves into `RequestContext::use_agent` from Step 8d; `Agent::init` becomes just the spec-loading portion.
**Verification:** after each module migrates, run full `cargo check` + `cargo test`. After all modules migrate, run the full smoke test suite from 8f and 8g.
**Risk:** Medium. The sub-agent spawning and `config/agent.rs` work is complex, but the bridge pattern means we can take each module independently.
---
### Step 9: Remove the Bridge
After Step 8h, no caller references `GlobalConfig` or calls `Config`-based methods that have `RequestContext` / `AppConfig` equivalents. Step 9 is the cleanup:
1. Delete `src/config/bridge.rs` (the conversion methods added in Step 1)
2. Delete the `#[allow(dead_code)]` attributes from `AppConfig` impl blocks (now that callers use them)
3. Delete the `#[allow(dead_code)]` attributes from `RequestContext` impl blocks
4. Remove the flat runtime fields from `RequestContext` that have been superseded by `tool_scope` and `agent_runtime` (i.e., delete `functions`, `tool_call_tracker`, `supervisor`, `parent_supervisor`, `self_agent_id`, `current_depth`, `inbox`, `root_escalation_queue` from `RequestContext`; they now live inside `ToolScope` and `AgentRuntime`)
5. Run the full test suite
This step is mostly mechanical deletion. The one careful part is removing the flat fields — make sure no remaining code path still reads them. `cargo check` will catch any stragglers.
### Step 10: Remove the old Config struct and GlobalConfig type alias
Once Step 9 is done and the tree is clean:
1. Delete all Step 2-7 method definitions on `Config` (the ones that got duplicated to `paths` / `AppConfig` / `RequestContext`)
2. Delete the `#[serde(skip)]` runtime fields on `Config` (they're all on `RequestContext` now)
3. At this point `Config` should be nearly empty — possibly just the serialized fields
4. Delete `Config` entirely (or rename it to `RawConfig` if it's still useful as a serde DTO, though `AppConfig` already plays that role)
5. Delete the `pub type GlobalConfig = Arc<RwLock<Config>>` type alias
6. Run `cargo check`, `cargo test`, `cargo clippy` — all clean
7. Run the full manual smoke test suite one more time
Phase 1 complete.
---
## Callsite Migration Summary
| Module | Functions to Migrate | Handled In |
|---|---|---|
| `config/mod.rs` | 120 methods (30 static, 10 global-read, 8 global-write, 35 request-read/write, 17 mixed) | Steps 2-7 (mechanical duplication), Step 10 (deletion) |
| `client/` macros and `model.rs` | `Model::retrieve_model`, `list_all_models!`, `list_models` | Step 8a |
| `main.rs` | `run`, `start_directive`, `shell_execute`, `create_input`, `apply_prelude_safely` | Step 8f |
| `repl/mod.rs` | `run_repl_command`, `ask`, plus 39 command handlers | Step 8g |
| `config/agent.rs` | `Agent::init`, agent lifecycle methods | Step 8h (partial) + Step 8d (scope transitions) |
| `function/supervisor.rs` | Sub-agent spawning, task management | Step 8h |
| `config/input.rs` | `Input::from_str`, `from_files`, `from_files_with_spinner` | Step 8h |
| `rag/mod.rs` | RAG init, load, search | Step 8e (lifecycle) + Step 8h (remaining) |
| `mcp/mod.rs` | `McpRegistry::init_server` spawn logic extraction | Step 8c |
| `function/mod.rs` | `eval_tool_calls` | Step 8h |
| `function/todo.rs` | Todo handlers | Step 8h |
| `function/user_interaction.rs` | User prompt handler | Step 8h |
| `render/mod.rs` | `render_stream` | Step 8h |
| `repl/completer.rs` | Completion logic | Step 8h |
| `repl/prompt.rs` | Prompt rendering | Step 8h |
| `config/macros.rs` | `macro_execute` | Step 8h |
### Step 8 effort estimates
| Sub-step | Effort | Risk |
|---|---|---|
| 8a — client module refactor | 0.5-1 day | Low |
| 8b — Step 7 deferrals | 0.5-1 day | Low |
| 8c — `McpFactory::acquire()` extraction | 1 day | Medium |
| 8d — scope transition rewrites | 1-2 days | Medium-high |
| 8e — RAG + session lifecycle migration | 1-2 days | Medium |
| 8f — `main.rs` rewrite | 1 day | High |
| 8g — `repl/mod.rs` rewrite | 1-2 days | High |
| 8h — remaining callsite sweep | 1-2 days | Medium |
**Total estimated Step 8 effort: ~7-12 days.** The "total Phase 1 effort" from the plan header needs to be updated once Step 8 finishes.
---
## Verification Checkpoints
After each step, verify:
1. **`cargo check`** — no compilation errors
2. **`cargo test`** — all existing tests pass
3. **Manual smoke test** — CLI one-shot prompt works, REPL starts and processes a prompt
4. **No behavior changes** — identical output for identical inputs
## Risk Factors
### Phase-wide risks
| Risk | Severity | Mitigation |
|---|---|---|
| Bridge-window duplication drift — bug fixed in `Config::X` but not `RequestContext::X` or vice versa | Medium | Keep the bridge window as short as possible. Step 8 should finish within 2 weeks of Step 7 ideally. Any bug fix during Steps 8a-8h must be applied to both places if the method is still duplicated. |
| Sub-agent spawning semantics change subtly | High | Cross-agent MCP trampling is a latent bug today that Step 8d/8h fixes. Write targeted integration tests for the sub-agent spawning path before and after Step 8h to verify semantics match (or improve intentionally). |
| Long-running Phase 1 blocking Phase 2+ work | Medium | Phase 2 (Engine + Emitter) can start prep work in parallel with Step 8h — the final callsite sweep doesn't block the new Engine design. |
### Step 8 sub-step risks
| Sub-step | Risk | Severity | Mitigation |
|---|---|---|---|
| 8a | Client macro refactor breaks LLM provider integration | Low | All LLM providers use the same `Model::retrieve_model` entry point. Test with at least 2 providers (openai + another) before declaring 8a done. |
| 8b | `update` dispatcher has ~15 cases — easy to miss one | Low | Enumerate every `.set` key handled today; check each is in the new dispatcher. |
| 8c | Extracting `spawn_mcp_server` introduces behavior differences (e.g., error handling, abort signal propagation) | Medium | Do a line-by-line diff review. Write a test that kills an in-flight spawn via the abort signal. |
| 8d | `McpFactory` mutex contention under parallel sub-agent spawning | Medium | Hold the `active` lock only during HashMap operations, never across `await`. Benchmark with 4 concurrent scope transitions before declaring 8d done. |
| 8d | Parent scope restoration on `exit_agent` differs from today's implicit behavior | High | Write a targeted test: activate global→role(jira)→agent(github,slack)→exit_agent. Verify scope is role(jira), not agent's (github,slack) or stale. |
| 8e | Session compression loses messages when triggered mid-request | High | Integration test: feed 10+ user messages, compress, verify summary preserves all user intent. Also test concurrent compression (REPL background task + foreground turn). |
| 8e | `rebuild_rag` / `edit_rag_docs` pick the wrong `RagKey` variant | Medium | Test both paths explicitly: agent-scoped rebuild and standalone rebuild. Assert the right cache entry is invalidated. |
| 8f | CLI output bytes differ from pre-refactor baseline | High | Record baseline CLI outputs for 10 common invocations. After 8f, diff byte-for-byte. Any difference is a regression unless explicitly justified. |
| 8g | REPL dot-command behavior regresses silently | High | Test every dot-command end-to-end: `.role`, `.session`, `.agent`, `.rag`, `.set`, `.info`, `.exit *`, `.compress session`, etc. |
| 8h | Sub-agent spawning in `function/supervisor.rs` shares state incorrectly between parent and child | High | Integration test: parent activates agent A (github), spawns child B (jira), verify B's tool scope has only jira and parent's has only github. Each parallel child has independent tool scopes. |
| 8h | `Agent::init` refactor drops initialization logic | Medium | `Agent::init` is ~100 lines today. Diff the old vs new init paths line-by-line. |
### Legacy risks (resolved during the refactor)
These risks from the original plan have been addressed by the step-by-step scaffolding approach:
| Original risk | How it was resolved |
|---|---|
| `use_role_safely` / `use_session_safely` use take/replace pattern | Eliminated entirely in Step 8g — REPL holds `&mut RequestContext` directly, no lock take/replace needed |
| `Agent::init` creates MCP servers, functions, RAG on Config | Resolved in Step 8d + Step 8h — MCP via `McpFactory::acquire()`, RAG via `RagCache`, functions via `RequestContext::bootstrap_tools` |
| Sub-agent spawning clones Config | Resolved in Step 8h — children get fresh `RequestContext` forked from `Arc<AppState>` |
| Input holds `GlobalConfig` clone | Resolved in Step 8f — `Input` now holds references to the context it needs from `RequestContext`, not an owned clone |
| Concurrent REPL operations spawn tasks with `GlobalConfig` clone | Resolved in Step 8e — task spawning moves to the caller's responsibility, using the caller's own `RequestContext` |
## What This Phase Does NOT Do
- No REST API server code
- No Engine::run() unification (that's Phase 2)
- No Emitter trait (that's Phase 2)
- No SessionStore abstraction (that's Phase 3)
- No UUID-based sessions (that's Phase 3)
- No agent isolation refactoring (that's Phase 5)
- No new dependencies added
The sole goal is: **split Config into immutable global + mutable per-request, with identical external behavior.**