# Phase 1 Implementation Plan: Extract AppState from Config
## Overview
Split the monolithic `Config` struct into:
- **`AppConfig`** — immutable server-wide settings (deserialized from `config.yaml`)
- **`RequestContext`** — per-request mutable state (current role, session, agent, supervisor, etc.)
The existing `GlobalConfig` (`Arc<RwLock<Config>>`) type alias is replaced. CLI and REPL continue working identically. No API code is added in this phase.
**Estimated effort:** ~3-4 weeks (originally estimated 1-2 weeks; revised during implementation as Steps 6.5 and 7 deferred their semantic rewrites to an expanded Step 8)
**Risk:** Medium — touches 91 callsites across 15 modules
**Mitigation:** Incremental migration with tests passing at every step
**Sub-step tracking:** Each step has per-step implementation notes in `docs/implementation/PHASE-1-STEP-*-NOTES.md`
---
## Current State: Config Field Classification
### Serialized Fields (from config.yaml → AppConfig)
These are loaded from disk once and should be immutable during request processing:
| Field | Type | Notes |
|---|---|---|
| `model_id` | `String` | Default model ID |
| `temperature` | `Option<f64>` | Default temperature |
| `top_p` | `Option<f64>` | Default top_p |
| `dry_run` | `bool` | Can be overridden per-request |
| `stream` | `bool` | Can be overridden per-request |
| `save` | `bool` | Whether to persist to messages.md |
| `keybindings` | `String` | REPL keybinding style |
| `editor` | `Option<String>` | Editor command |
| `wrap` | `Option<String>` | Text wrapping |
| `wrap_code` | `bool` | Code block wrapping |
| `vault_password_file` | `Option<PathBuf>` | Vault password location |
| `function_calling_support` | `bool` | Global function calling toggle |
| `mapping_tools` | `IndexMap<String, String>` | Tool aliases |
| `enabled_tools` | `Option<String>` | Default enabled tools |
| `visible_tools` | `Option<Vec<String>>` | Visible tool list |
| `mcp_server_support` | `bool` | Global MCP toggle |
| `mapping_mcp_servers` | `IndexMap<String, String>` | MCP server aliases |
| `enabled_mcp_servers` | `Option<String>` | Default enabled MCP servers |
| `repl_prelude` | `Option<String>` | REPL prelude config |
| `cmd_prelude` | `Option<String>` | CLI prelude config |
| `agent_session` | `Option<String>` | Default agent session |
| `save_session` | `Option<bool>` | Session save behavior |
| `compression_threshold` | `usize` | Session compression threshold |
| `summarization_prompt` | `Option<String>` | Compression prompt |
| `summary_context_prompt` | `Option<String>` | Summary context prompt |
| `rag_embedding_model` | `Option<String>` | RAG embedding model |
| `rag_reranker_model` | `Option<String>` | RAG reranker model |
| `rag_top_k` | `usize` | RAG top-k results |
| `rag_chunk_size` | `Option<usize>` | RAG chunk size |
| `rag_chunk_overlap` | `Option<usize>` | RAG chunk overlap |
| `rag_template` | `Option<String>` | RAG template |
| `document_loaders` | `HashMap<String, String>` | Document loader mappings |
| `highlight` | `bool` | Syntax highlighting |
| `theme` | `Option<String>` | Color theme |
| `left_prompt` | `Option<String>` | REPL left prompt format |
| `right_prompt` | `Option<String>` | REPL right prompt format |
| `user_agent` | `Option<String>` | HTTP User-Agent |
| `save_shell_history` | `bool` | Shell history persistence |
| `sync_models_url` | `Option<String>` | Models sync URL |
| `clients` | `Vec<ClientConfig>` | LLM provider configs |
### Runtime Fields (#[serde(skip)] → RequestContext)
These are created at runtime and are per-request/per-session mutable state:
| Field | Type | Destination |
|---|---|---|
| `vault` | `GlobalVault` | `AppState.vault` (shared service) |
| `macro_flag` | `bool` | `RequestContext.macro_flag` |
| `info_flag` | `bool` | `RequestContext.info_flag` |
| `agent_variables` | `Option<AgentVariables>` | `RequestContext.agent_variables` |
| `model` | `Model` | `RequestContext.model` |
| `functions` | `Functions` | `RequestContext.tool_scope.functions` (unified in Step 6) |
| `mcp_registry` | `Option<McpRegistry>` | **REMOVED.** Replaced by per-`ToolScope` `McpRuntime`s produced by a new `McpFactory` on `AppState`. See the architecture doc's "Tool Scope Isolation" section. |
| `working_mode` | `WorkingMode` | `RequestContext.working_mode` |
| `last_message` | `Option<LastMessage>` | `RequestContext.last_message` |
| `role` | `Option<Role>` | `RequestContext.role` |
| `session` | `Option<Session>` | `RequestContext.session` |
| `rag` | `Option<Arc<Rag>>` | `RequestContext.rag` |
| `agent` | `Option<Agent>` | `RequestContext.agent` (agent spec + role + RAG) |
| `tool_call_tracker` | `Option<ToolCallTracker>` | `RequestContext.tool_scope.tool_tracker` (unified in Step 6) |
| `supervisor` | `Option<Arc<RwLock<Supervisor>>>` | `RequestContext.agent_runtime.supervisor` |
| `parent_supervisor` | `Option<Arc<RwLock<Supervisor>>>` | `RequestContext.agent_runtime.parent_supervisor` |
| `self_agent_id` | `Option<String>` | `RequestContext.agent_runtime.self_agent_id` |
| `current_depth` | `usize` | `RequestContext.agent_runtime.current_depth` |
| `inbox` | `Option<Arc<Inbox>>` | `RequestContext.agent_runtime.inbox` |
| `root_escalation_queue` | `Option<Arc<EscalationQueue>>` | `RequestContext.agent_runtime.escalation_queue` (shared from the root via `Arc`) |
**Note on `ToolScope` and `AgentRuntime`:** during Phase 1 Step 0 the new `RequestContext` struct keeps `functions`, `tool_call_tracker`, and the supervisor/inbox/escalation fields as **flat fields** mirroring today's `Config`. This is deliberate — it makes the field-by-field migration mechanical. In Step 6.5 these fields collapse into two sub-structs:
- `ToolScope { functions, mcp_runtime, tool_tracker }` — owned by every active `RoleLike` scope, rebuilt on role/session/agent transitions via `McpFactory::acquire()`.
- `AgentRuntime { spec, rag, supervisor, inbox, escalation_queue, todo_list, self_agent_id, parent_supervisor, current_depth, auto_continue_count }` — owned only when an agent is active.
**Two behavior changes land during Step 6.5** that tighten today's code:
1. `todo_list` becomes `Option<TodoList>`. Today the code always allocates `TodoList::default()` for every agent, even when `auto_continue: false`. Since the todo tools and instructions are only exposed when `auto_continue: true`, the allocation is wasted. The new shape skips allocation unless the agent opts in. No user-visible change.
2. A unified `RagCache` on `AppState` serves **both** standalone RAGs (attached via `.rag <name>`) and agent-owned RAGs (loaded from an agent's `documents:` field). Today, both paths independently call `Rag::load` from disk on every use; with the cache, any scope requesting the same `RagKey` shares the same `Arc<Rag>`. Standalone RAG lives in `ctx.rag`; agent RAG lives in `ctx.agent_runtime.rag`. Roles and Sessions do **not** own RAG (the structs have no RAG fields) — this is true today and unchanged by the refactor. `rebuild_rag` and `edit_rag_docs` call `RagCache::invalidate()`.
See `docs/REST-API-ARCHITECTURE.md` section 5 for the full `ToolScope`, `McpFactory`, `RagCache`, and MCP pooling designs.
---
## Migration Strategy: The Facade Pattern
**Do NOT rewrite everything at once.** Instead, use a transitional facade that keeps the old `Config` working while new code uses the split types.
### Step 0: Add new types alongside Config (no breaking changes) ✅ DONE
Create the new structs in new files. `Config` stays untouched. Nothing breaks.
**Files created:**
- `src/config/app_config.rs` — `AppConfig` struct (the serialized half)
- `src/config/request_context.rs` — `RequestContext` struct (the runtime half)
- `src/config/app_state.rs` — `AppState` struct (Arc-wrapped global services, no `mcp_registry` — see below)
**`AppConfig`** is essentially the current `Config` struct but containing ONLY the serialized fields (no `#[serde(skip)]` fields). It should derive `Deserialize` identically so the existing `config.yaml` still loads.
**Important change from the original plan:** `AppState` does NOT hold an `McpRegistry`. MCP server processes are scoped per `RoleLike`, not process-wide. An `McpFactory` service will be added to `AppState` in Step 6.5. See `docs/REST-API-ARCHITECTURE.md` section 5 for the design rationale.
```rust
// src/config/app_config.rs
#[derive(Debug, Clone, Deserialize)]
#[serde(default)]
pub struct AppConfig {
#[serde(rename(serialize = "model", deserialize = "model"))]
pub model_id: String,
pub temperature: Option<f64>,
pub top_p: Option<f64>,
pub dry_run: bool,
pub stream: bool,
pub save: bool,
pub keybindings: String,
pub editor: Option<String>,
pub wrap: Option<String>,
pub wrap_code: bool,
vault_password_file: Option<PathBuf>,
pub function_calling_support: bool,
pub mapping_tools: IndexMap<String, String>,
pub enabled_tools: Option<String>,
pub visible_tools: Option<Vec<String>>,
pub mcp_server_support: bool,
pub mapping_mcp_servers: IndexMap<String, String>,
pub enabled_mcp_servers: Option<String>,
pub repl_prelude: Option<String>,
pub cmd_prelude: Option<String>,
pub agent_session: Option<String>,
pub save_session: Option<bool>,
pub compression_threshold: usize,
pub summarization_prompt: Option<String>,
pub summary_context_prompt: Option<String>,
pub rag_embedding_model: Option<String>,
pub rag_reranker_model: Option<String>,
pub rag_top_k: usize,
pub rag_chunk_size: Option<usize>,
pub rag_chunk_overlap: Option<usize>,
pub rag_template: Option<String>,
pub document_loaders: HashMap<String, String>,
pub highlight: bool,
pub theme: Option<String>,
pub left_prompt: Option<String>,
pub right_prompt: Option<String>,
pub user_agent: Option<String>,
pub save_shell_history: bool,
pub sync_models_url: Option<String>,
pub clients: Vec<ClientConfig>,
}
```
```rust
// src/config/app_state.rs
#[derive(Clone)]
pub struct AppState {
pub config: Arc<AppConfig>,
pub vault: GlobalVault,
// NOTE: no `mcp_registry` field. MCP runtime is scoped per-`ToolScope`
// on `RequestContext`, not process-wide. An `McpFactory` will be added
// here later (Step 6.5 / Phase 5) to pool and ref-count MCP processes
// across concurrent ToolScopes. See architecture doc section 5.
}
```
```rust
// src/config/request_context.rs
pub struct RequestContext {
pub app: Arc<AppState>,
// per-request flags
pub macro_flag: bool,
pub info_flag: bool,
pub working_mode: WorkingMode,
// active context
pub model: Model,
pub functions: Functions,
pub role: Option<Role>,
pub session: Option<Session>,
pub rag: Option<Arc<Rag>>,
pub agent: Option<Agent>,
pub agent_variables: Option<AgentVariables>,
// conversation state
pub last_message: Option<LastMessage>,
pub tool_call_tracker: Option<ToolCallTracker>,
// agent supervision
pub supervisor: Option<Arc<RwLock<Supervisor>>>,
pub parent_supervisor: Option<Arc<RwLock<Supervisor>>>,
pub self_agent_id: Option<String>,
pub current_depth: usize,
pub inbox: Option<Arc<Inbox>>,
pub root_escalation_queue: Option<Arc<EscalationQueue>>,
}
```
### Step 1: Make Config constructible from AppConfig + RequestContext
Add conversion methods so the old `Config` can be built from the new types, and vice versa. This is the bridge that lets us migrate incrementally.
```rust
// On Config:
impl Config {
/// Extract the global portion into AppConfig
pub fn to_app_config(&self) -> AppConfig { /* copy serialized fields */ }
/// Extract the runtime portion into RequestContext
pub fn to_request_context(&self, app: Arc<AppState>) -> RequestContext { /* copy runtime fields */ }
/// Reconstruct Config from the split types (for backward compat during migration)
pub fn from_parts(app: &AppState, ctx: &RequestContext) -> Config { /* merge back */ }
}
```
**Test:** After this step, splitting a `Config` via `to_app_config()` / `to_request_context()` and rebuilding it with `Config::from_parts` round-trips to an equivalent `Config`. Existing tests still pass.
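A minimal sketch of that round-trip test, assuming `Config: Default` and a hypothetical `assert_config_eq` comparison helper (since `Config` may not derive `PartialEq`):
```rust
// Sketch only: field wiring assumes the Step 0/1 shapes shown above.
#[test]
fn config_split_round_trips() {
    let original = Config::default();
    let app = Arc::new(AppState {
        config: Arc::new(original.to_app_config()),
        vault: original.vault.clone(),
    });
    let ctx = original.to_request_context(app.clone());
    let rebuilt = Config::from_parts(&app, &ctx);
    // Hypothetical helper: compare field-by-field until Config derives PartialEq.
    assert_config_eq(&original, &rebuilt);
}
```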
### Step 2: Migrate static methods off Config
There are ~30 static methods on Config (no `self` parameter). These are pure utility functions that don't need Config at all — they compute file paths, list directories, etc.
**Target:** Move these to a standalone `paths` module or keep on `AppConfig` where appropriate.
| Method | Move to |
|---|---|
| `config_dir()` | `paths::config_dir()` |
| `local_path(name)` | `paths::local_path(name)` |
| `cache_path()` | `paths::cache_path()` |
| `oauth_tokens_path()` | `paths::oauth_tokens_path()` |
| `token_file(client)` | `paths::token_file(client)` |
| `log_path()` | `paths::log_path()` |
| `config_file()` | `paths::config_file()` |
| `roles_dir()` | `paths::roles_dir()` |
| `role_file(name)` | `paths::role_file(name)` |
| `macros_dir()` | `paths::macros_dir()` |
| `macro_file(name)` | `paths::macro_file(name)` |
| `env_file()` | `paths::env_file()` |
| `rags_dir()` | `paths::rags_dir()` |
| `functions_dir()` | `paths::functions_dir()` |
| `functions_bin_dir()` | `paths::functions_bin_dir()` |
| `mcp_config_file()` | `paths::mcp_config_file()` |
| `global_tools_dir()` | `paths::global_tools_dir()` |
| `global_utils_dir()` | `paths::global_utils_dir()` |
| `bash_prompt_utils_file()` | `paths::bash_prompt_utils_file()` |
| `agents_data_dir()` | `paths::agents_data_dir()` |
| `agent_data_dir(name)` | `paths::agent_data_dir(name)` |
| `agent_config_file(name)` | `paths::agent_config_file(name)` |
| `agent_bin_dir(name)` | `paths::agent_bin_dir(name)` |
| `agent_rag_file(agent, rag)` | `paths::agent_rag_file(agent, rag)` |
| `agent_functions_file(name)` | `paths::agent_functions_file(name)` |
| `models_override_file()` | `paths::models_override_file()` |
| `list_roles(with_builtin)` | `Role::list(with_builtin)` or `paths` |
| `list_rags()` | `Rag::list()` or `paths` |
| `list_macros()` | `Macro::list()` or `paths` |
| `has_role(name)` | `Role::exists(name)` |
| `has_macro(name)` | `Macro::exists(name)` |
| `sync_models(url, abort)` | Standalone function or on `AppConfig` |
| `local_models_override()` | Standalone function |
| `log_config()` | Standalone function |
**Approach:** Create `src/config/paths.rs`, move functions there, and add `#[deprecated]` forwarding methods on `Config` that call the new locations. Compile, run tests, fix callsites module by module, then remove the deprecated methods.
**Callsite count:** Low — most of these are called from 1-3 places. This is a quick-win step.
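The move-plus-shim pattern might look like this sketch (the `roles_dir` body is illustrative, and it assumes `pub mod paths;` is declared in `config/mod.rs`):
```rust
// src/config/paths.rs — the function's new home (body illustrative)
pub fn roles_dir() -> PathBuf {
    config_dir().join("roles")
}

// src/config/mod.rs — temporary forwarding shim for the bridge window;
// callers get a deprecation warning until they migrate.
impl Config {
    #[deprecated(note = "use config::paths::roles_dir instead")]
    pub fn roles_dir() -> PathBuf {
        paths::roles_dir()
    }
}
```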
### Step 3: Migrate global-read methods to AppConfig
These methods only read serialized config values and should live on `AppConfig`:
| Method | Current Signature | New Home |
|---|---|---|
| `vault_password_file` | `&self -> PathBuf` | `AppConfig` |
| `editor` | `&self -> Result<String>` | `AppConfig` |
| `sync_models_url` | `&self -> String` | `AppConfig` |
| `light_theme` | `&self -> bool` | `AppConfig` |
| `render_options` | `&self -> Result<RenderOptions>` | `AppConfig` |
| `print_markdown` | `&self, text -> Result<()>` | `AppConfig` |
| `rag_template` | `&self, embeddings, sources, text -> String` | `AppConfig` |
| `select_functions` | `&self, role -> Option<Vec<...>>` | `AppConfig` |
| `select_enabled_functions` | `&self, role -> Vec<...>` | `AppConfig` |
| `select_enabled_mcp_servers` | `&self, role -> Vec<...>` | `AppConfig` |
**Same pattern:** Add new methods on `AppConfig`, add `#[deprecated]` forwarding on `Config`, migrate callers, remove.
### Step 4: Migrate global-write methods
These modify serialized config settings (`.set` command, environment loading):
| Method | Notes |
|---|---|
| `set_wrap` | Modifies `self.wrap` |
| `update` | Generic key-value update of config settings |
| `load_envs` | Applies env var overrides |
| `load_functions` | Initializes function definitions |
| `load_mcp_servers` | Starts MCP servers |
| `setup_model` | Sets default model |
| `setup_document_loaders` | Sets default doc loaders |
| `setup_user_agent` | Sets user agent string |
The `load_*` / `setup_*` methods are initialization-only (called once in `Config::init`). They become part of `AppState` construction.
`update` and `set_wrap` are runtime mutations of global config. For the API world, these should require a config reload. For now, they can stay as methods that mutate `AppConfig` through interior mutability or require a mutable reference during REPL setup.
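If the interior-mutability route is chosen, one possibility is copy-on-write behind a lock. This is purely illustrative; the plan leaves the choice open:
```rust
// Illustrative only: readers grab a cheap Arc snapshot; the rare writers
// (`.set`, `set_wrap`) swap in a modified copy. Uses std::sync::RwLock.
pub struct SharedAppConfig(RwLock<Arc<AppConfig>>);

impl SharedAppConfig {
    pub fn get(&self) -> Arc<AppConfig> {
        self.0.read().unwrap().clone()
    }
    pub fn set_wrap(&self, wrap: Option<String>) {
        let mut slot = self.0.write().unwrap();
        let mut next = (**slot).clone(); // AppConfig derives Clone
        next.wrap = wrap;
        *slot = Arc::new(next);
    }
}
```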
### Step 5: Migrate request-read methods to RequestContext
Pure reads of per-request state:
| Method | Notes |
|---|---|
| `state` | Returns flags for active role/session/agent/rag |
| `messages_file` | Path depends on active agent |
| `sessions_dir` | Path depends on active agent |
| `session_file` | Path depends on active agent |
| `rag_file` | Path depends on active agent |
| `info` | Reads current agent/session/role/rag |
| `role_info` | Reads current role |
| `session_info` | Reads current session |
| `agent_info` | Reads current agent |
| `agent_banner` | Reads current agent |
| `rag_info` | Reads current rag |
| `list_sessions` | Depends on sessions_dir (agent context) |
| `list_autoname_sessions` | Depends on sessions_dir |
| `is_compressing_session` | Reads session state |
| `role_like_mut` | Returns mutable ref to role-like |
### Step 6: Migrate request-write methods to RequestContext
Mutations of per-request state:
| Method | Notes |
|---|---|
| `use_prompt` | Sets temporary role |
| `use_role` / `use_role_obj` | Sets role on session or self |
| `exit_role` | Clears role |
| `edit_role` | Edits and re-applies role |
| `use_session` | Sets session |
| `exit_session` | Saves and clears session |
| `save_session` | Persists session |
| `empty_session` | Clears session messages |
| `set_save_session_this_time` | Session flag |
| `compress_session` / `maybe_compress_session` | Session compression |
| `autoname_session` / `maybe_autoname_session` | Session naming |
| `use_rag` / `exit_rag` / `edit_rag_docs` / `rebuild_rag` | RAG lifecycle |
| `use_agent` / `exit_agent` / `exit_agent_session` | Agent lifecycle |
| `apply_prelude` | Sets role/session from prelude config |
| `before_chat_completion` | Pre-LLM state updates |
| `after_chat_completion` | Post-LLM state updates |
| `discontinuous_last_message` | Message state |
| `init_agent_shared_variables` | Agent vars |
| `init_agent_session_variables` | Agent session vars |
### Step 6.5: Unify tool/MCP fields into `ToolScope` and agent fields into `AgentRuntime`
After Step 6, `RequestContext` has many flat fields that logically cluster into two sub-structs. This step collapses them and introduces three new services on `AppState`.
**New types:**
```rust
pub struct ToolScope {
pub functions: Functions,
pub mcp_runtime: McpRuntime,
pub tool_tracker: ToolCallTracker,
}
pub struct McpRuntime {
servers: HashMap<String, Arc<McpServerHandle>>,
}
pub struct AgentRuntime {
pub spec: AgentSpec,
pub rag: Option<Arc<Rag>>, // shared across siblings of same type
pub supervisor: Supervisor,
pub inbox: Arc<Inbox>,
pub escalation_queue: Arc<EscalationQueue>, // shared from root
pub todo_list: Option<TodoList>, // Some(...) only when auto_continue: true
pub self_agent_id: String,
pub parent_supervisor: Option<Arc<Supervisor>>,
pub current_depth: usize,
pub auto_continue_count: usize,
}
```
**New services on `AppState`:**
```rust
pub struct AppState {
pub config: Arc<AppConfig>,
pub vault: GlobalVault,
pub mcp_factory: Arc<McpFactory>,
pub rag_cache: Arc<RagCache>,
}
pub struct McpFactory {
active: Mutex<HashMap<McpServerKey, Weak<McpServerHandle>>>,
// idle pool + reaper added in Phase 5; Step 6.5 ships the no-pool version
}
impl McpFactory {
pub async fn acquire(&self, key: &McpServerKey) -> Result<Arc<McpServerHandle>>;
}
pub struct RagCache {
entries: RwLock<HashMap<RagKey, Weak<Rag>>>,
}
#[derive(Hash, Eq, PartialEq, Clone, Debug)]
pub enum RagKey {
Named(String), // standalone: rags/<name>.yaml
Agent(String), // agent-owned: agents/<name>/rag.yaml
}
impl RagCache {
pub async fn load(&self, key: &RagKey) -> Result<Option<Arc<Rag>>>;
pub fn invalidate(&self, key: &RagKey);
}
```
**`RequestContext` after collapse:**
```rust
pub struct RequestContext {
pub app: Arc<AppState>,
pub macro_flag: bool,
pub info_flag: bool,
pub working_mode: WorkingMode,
pub model: Model,
pub agent_variables: Option<AgentVariables>,
pub role: Option<Role>,
pub session: Option<Session>,
pub rag: Option<Arc<Rag>>, // session/standalone RAG, not agent RAG
pub agent: Option<Agent>,
pub last_message: Option<LastMessage>,
pub tool_scope: ToolScope, // replaces functions + tool_call_tracker + global mcp_registry
pub agent_runtime: Option<AgentRuntime>, // replaces supervisor + inbox + escalation_queue + todo + self_id + parent + depth; holds shared agent RAG
}
```
**What this step does:**
1. Implement `McpRuntime` and `ToolScope`.
2. Implement `McpFactory` — **no pool, no idle handling, no reaper.** `acquire()` checks `active` for an upgradable `Weak`, otherwise spawns fresh. `Drop` on `McpServerHandle` tears down the subprocess directly. Pooling lands in Phase 5.
3. Implement `RagCache` with `RagKey` enum, weak-ref sharing, and per-key serialization for concurrent first-load.
4. Implement `AgentRuntime` with the shape above. `todo_list` is `Option` — only allocated when `agent.spec.auto_continue == true`. `rag` is served from `RagCache` during activation via `RagKey::Agent(name)`.
5. Rewrite scope transitions (`use_role`, `use_session`, `use_agent`, `exit_*`, `Config::update`) to:
- Resolve the effective enabled-tool / enabled-MCP-server list using priority `Agent > Session > Role > Global` (a sketch of this resolution follows the list)
- Build a fresh `McpRuntime` by calling `McpFactory::acquire()` for each required server key
- Construct a new `ToolScope` wrapping the runtime + resolved `Functions`
- Swap `ctx.tool_scope` atomically
6. `use_rag` (standalone / `.rag <name>` path) is rewritten to call `app.rag_cache.load(RagKey::Named(name))` and assign the result to `ctx.rag`. No role/session RAG changes because roles/sessions do not own RAG.
7. Agent activation additionally:
- Calls `app.rag_cache.load(RagKey::Agent(agent_name))` and stores the returned `Arc<Rag>` in the new `AgentRuntime.rag`
- Allocates `todo_list: Some(TodoList::default())` only when `auto_continue: true`; otherwise `None`
- Constructs the `AgentRuntime` and assigns it to `ctx.agent_runtime`
- **Preserves today's clobber behavior for standalone RAG:** does NOT save `ctx.rag` anywhere. When the agent exits, the user's previous `.rag <name>` selection is not restored (matches current behavior). Stacking / restoration is flagged as a Phase 2+ enhancement.
8. `exit_agent` drops `ctx.agent_runtime` (which drops the agent's `Arc<Rag>`; the cache entry becomes evictable if no other scope holds it) and rebuilds `ctx.tool_scope` from the now-topmost `RoleLike`.
9. Sub-agent spawning (in `function/supervisor.rs`) constructs a fresh `RequestContext` for the child from the shared `AppState`:
- Its own `ToolScope` via `McpFactory::acquire()` calls for the child agent's `mcp_servers`
- Its own `AgentRuntime` with:
- `rag` via `rag_cache.load(RagKey::Agent(child_agent_name))` — shared with parent/siblings of same type
- Fresh `Supervisor`, fresh `Inbox`, `current_depth = parent.depth + 1`
- `parent_supervisor = Some(parent.supervisor.clone())` (for messaging)
- `escalation_queue = parent.escalation_queue.clone()` — one queue, rooted at the human
- `todo_list` honoring the child's own `auto_continue` flag
10. Old `Agent::init` logic that mutates a global `McpRegistry` is removed — that's now `McpFactory::acquire()` producing scope-local handles.
11. `rebuild_rag` and `edit_rag_docs` are updated to determine the correct `RagKey` (check `ctx.agent_runtime` first — if present use `RagKey::Agent(spec.name)`, otherwise use the standalone name from `ctx.rag`'s origin) and call `rag_cache.invalidate(&key)` before reloading.
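A sketch of the resolution named in item 5. Only the priority order comes from this plan; the per-scope `enabled_mcp_servers()` accessors are assumptions:
```rust
// Priority: Agent > Session > Role > Global. The first scope that
// specifies a list wins; `None` at every level means no MCP servers.
fn effective_enabled_mcp_servers(ctx: &RequestContext, app: &AppConfig) -> Option<String> {
    ctx.agent.as_ref().and_then(|a| a.enabled_mcp_servers())
        .or_else(|| ctx.session.as_ref().and_then(|s| s.enabled_mcp_servers()))
        .or_else(|| ctx.role.as_ref().and_then(|r| r.enabled_mcp_servers()))
        .or_else(|| app.enabled_mcp_servers.clone())
}
```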
**What this step preserves:**
- **Diff-based reinit for REPL users** — when you `.exit role` from `[github, jira]` back to global `[github]`, the new `ToolScope` is built by calling `McpFactory::acquire("github")`. Without pooling (Phase 1), this respawns `github`. With pooling (Phase 5), even once no scope holds `github`'s `Arc`, the idle pool keeps the process warm, so revival is instant. The Phase 1 version has a mild regression here that Phase 5 fixes.
- **Agent-vs-non-agent compatibility** — today's `Agent::init` reinits a global registry; after this step, agent activation replaces `ctx.tool_scope` with an agent-specific one, and `exit_agent` restores the pre-agent scope by rebuilding from the (now-active) role/session/global lists.
- **Todo semantics from the user's perspective** — today's behavior is "todos are available when `auto_continue: true`". After Step 6.5, it's still "todos are available when `auto_continue: true`" — the only difference is we skip the wasted `TodoList::default()` allocation for the other agents.
**Risk:** Medium–high. This is where the Phase 1 refactor stops being mechanical and starts having semantic implications. Six things to watch:
1. **Parent scope restoration on `exit_agent`.** Today, `exit_agent` tears down the agent's MCP set but leaves the registry in whatever state `reinit` put it — the parent's original MCP set is NOT restored. Users don't notice because the next scope activation (or REPL exit) reinits anyway. In the new design, `exit_agent` MUST rebuild the parent's `ToolScope` from the still-active role/session/global lists so the user sees the expected state. Test this carefully.
2. **`McpFactory` contention.** With many concurrent sub-agents (say, 4 siblings each needing different MCP sets), the factory's mutex could become a bottleneck during `acquire()`. Hold the lock only while touching `active`, never while awaiting subprocess spawn.
3. **`RagCache` concurrent first-load.** If two consumers request the same `RagKey` simultaneously and neither finds a cached entry, both will try to `Rag::load` from disk. Use per-key `tokio::sync::Mutex` or `OnceCell` to serialize the first load — the second caller blocks briefly and receives the shared Arc. This applies equally to standalone and agent RAGs. A `load()` sketch covering this item and the next follows this list.
4. **Weak ref staleness in `RagCache`.** The `Weak<Rag>` in the map might point to a dropped `Rag`. The `load()` path MUST attempt `Weak::upgrade()` before returning; if upgrade fails, treat it as a miss and reload.
5. **`rebuild_rag` / `edit_rag_docs` race.** If a user runs `.rag rebuild` while another scope holds the same `Arc<Rag>` (concurrent API session, running sub-agent, etc.), the cache invalidation must NOT yank the Arc out from under the active holder. The `Arc` keeps its reference alive — invalidation just ensures *future* loads read fresh. This is the correct behavior for both standalone and agent RAG; worth confirming in tests.
6. **Identifying the right `RagKey` during rebuild.** `rebuild_rag` today operates on `Config.rag` without knowing its origin. In the new model, the code needs to check `ctx.agent_runtime` first to determine if the active RAG is agent-owned (`RagKey::Agent`) or standalone (`RagKey::Named`). Get this wrong and you invalidate the wrong cache entry, silently breaking subsequent loads.
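A hedged sketch of `RagCache::load` covering items 3 and 4. The `loading` gate map and the `load_rag_from_disk` helper (which would wrap today's `Rag::load`) are assumptions layered on the `entries` field shown earlier:
```rust
// Sketch: entries behind std::sync::RwLock; gates are tokio mutexes
// because the first load awaits disk I/O. Gates are never pruned here.
impl RagCache {
    pub async fn load(&self, key: &RagKey) -> Result<Option<Arc<Rag>>> {
        // Risk 4: a stale Weak must be treated as a miss, so always upgrade.
        if let Some(rag) = self.entries.read().unwrap().get(key).and_then(Weak::upgrade) {
            return Ok(Some(rag));
        }
        // Risk 3: serialize concurrent first-loads per key.
        let gate = {
            let mut loading = self.loading.lock().await;
            loading
                .entry(key.clone())
                .or_insert_with(|| Arc::new(tokio::sync::Mutex::new(())))
                .clone()
        };
        let _first_load = gate.lock().await;
        // Re-check: the caller we waited on may have populated the entry.
        if let Some(rag) = self.entries.read().unwrap().get(key).and_then(Weak::upgrade) {
            return Ok(Some(rag));
        }
        let Some(rag) = load_rag_from_disk(key).await? else {
            return Ok(None); // no RAG configured for this key
        };
        let rag = Arc::new(rag);
        self.entries.write().unwrap().insert(key.clone(), Arc::downgrade(&rag));
        Ok(Some(rag))
    }
}
```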
### Step 7: Tackle mixed methods (THE HARD PART)
These ~20 methods conditionally read global config OR per-request state, depending on what's active. They need to be split into explicit parameter passing.
| Method | Why it's mixed | Refactoring approach |
|---|---|---|
| `current_model` | Returns agent model, session model, role model, or global model | Take `(&AppConfig, &RequestContext) -> &Model` — check ctx first, fall back to app (sketched after this table) |
| `extract_role` | Builds role from session/agent/role or global settings | Take `(&AppConfig, &RequestContext) -> Role` |
| `sysinfo` | Reads global settings + current rag/session/agent | Take `(&AppConfig, &RequestContext) -> String` |
| `set_temperature` | Sets on role-like or global | Split: `ctx.set_temperature()` for role-like, `app.set_temperature()` for global |
| `set_top_p` | Same pattern as temperature | Same split |
| `set_enabled_tools` | Same pattern | Same split |
| `set_enabled_mcp_servers` | Same pattern | Same split |
| `set_save_session` | Sets on session or global | Same split |
| `set_compression_threshold` | Sets on session or global | Same split |
| `set_rag_reranker_model` | Sets on rag or global | Same split |
| `set_rag_top_k` | Sets on rag or global | Same split |
| `set_max_output_tokens` | Sets on role-like model or global model | Same split |
| `set_model` | Sets on role-like or global | Same split |
| `retrieve_role` | Loads role, merges with current model settings | Take `(&AppConfig, &RequestContext, name) -> Role` |
| `use_role_safely` | Takes GlobalConfig, does take/replace pattern | Refactor to `(&mut RequestContext, name)` |
| `use_session_safely` | Takes GlobalConfig, does take/replace pattern | Refactor to `(&mut RequestContext, name)` |
| `save_message` | Reads `save` flag (global) + writes to messages_file (agent-dependent path) | Take `(&AppConfig, &RequestContext, input, output)` |
| `render_prompt_left/right` | Reads prompt format (global) + current model/session/agent/role (request) | Take `(&AppConfig, &RequestContext) -> String` |
| `generate_prompt_context` | Same as prompt rendering | Take `(&AppConfig, &RequestContext) -> HashMap` |
| `repl_complete` | Reads both global config and request state for completions | Take `(&AppConfig, &RequestContext, cmd, args) -> Vec<...>` |
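For concreteness, the first row might split like this sketch; the `model_override()` accessors are assumptions:
```rust
// Sketch of the split `current_model`. Only the precedence (agent, then
// session, then role, then the context's global model) is specified by
// this plan. `_app` is kept to match the plan's two-parameter shape.
pub fn current_model<'a>(_app: &AppConfig, ctx: &'a RequestContext) -> &'a Model {
    ctx.agent.as_ref().and_then(|a| a.model_override())
        .or_else(|| ctx.session.as_ref().and_then(|s| s.model_override()))
        .or_else(|| ctx.role.as_ref().and_then(|r| r.model_override()))
        .unwrap_or(&ctx.model)
}
```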
**Common pattern for `set_*` methods:** The current code does something like:
```rust
fn set_temperature(&mut self, value: Option<f64>) {
if let Some(role_like) = self.role_like_mut() {
role_like.set_temperature(value);
} else {
self.temperature = value;
}
}
```
This becomes:
```rust
// On RequestContext:
fn set_temperature(&mut self, value: Option<f64>, app_defaults: &AppConfig) {
if let Some(role_like) = self.role_like_mut() {
role_like.set_temperature(value);
}
// Global default mutation goes through a separate path if needed
}
```
### Step 8: The Caller Migration Epic (absorbed scope from Steps 6.5 and 7)
**Important:** the original plan described Step 8 as "rewrite main.rs and repl/mod.rs entry points." During implementation, Steps 6.5 and 7 deliberately deferred their semantic rewrites to Step 8 so the bridge pattern (add new methods alongside old, don't migrate callers yet) stayed consistent. As a result, Step 8 now absorbs:
- **Original Step 8 scope:** entry point rewrite (`main.rs`, `repl/mod.rs`)
- **From Step 6.5 deferrals:** `McpFactory::acquire()` implementation, scope transition rewrites (`use_role`, `use_session`, `use_agent`, `exit_agent`), RAG lifecycle via `RagCache` (`use_rag`, `edit_rag_docs`, `rebuild_rag`), session compression/autoname, `apply_prelude`, sub-agent spawning
- **From Step 7 deferrals:** `Model::retrieve_model` client module refactor, `retrieve_role`, `set_model`, `repl_complete`, `setup_model`, `update` dispatcher, `set_rag_reranker_model`, `set_rag_top_k`, `use_role_safely`/`use_session_safely` elimination, `use_prompt`, `edit_role`
This is a large amount of work. Step 8 is split into **8 sub-steps (8a–8h)** for clarity and reviewability. Each sub-step keeps the build green and can be completed as a standalone unit.
**Dependency graph between sub-steps:**
```
┌─────────┐
│ 8a │ client module refactor
│ (Model: │ (Model::retrieve_model, list_models,
│ &App- │ list_all_models! → &AppConfig)
│ Config) │
└────┬────┘
│ unblocks
┌─────────┐
│ 8b │ remaining Step 7 deferrals
│ (Step 7 │ (retrieve_role, set_model, setup_model,
│ debt) │ use_prompt, edit_role, update, etc.)
└────┬────┘
┌──────────┐ │
│ 8c │ │
│(McpFac- │──┐ │
│ tory:: │ │ unblocks │
│acquire())│ │ │
└──────────┘ │ │
▼ │
┌─────────┐ │
│ 8d │ │
│ (scope │──┐ │
│ trans.) │ │ unblocks │
└─────────┘ │ │
▼ │
┌─────────┐ │
│ 8e │ │
│ (RAG + │──┐ │
│session) │ │ │
└─────────┘ │ │
▼ ▼
┌───────────────────┐
│ 8f + 8g │
│ caller migration │
│ (main.rs, REPL) │
└─────────┬─────────┘
┌──────────┐
│ 8h │ remaining callsites
│ (sweep) │ (priority-ordered)
└──────────┘
```
---
#### Step 8a: Client module refactor — `Model::retrieve_model` takes `&AppConfig`
Target: remove the `&Config` dependency from the LLM client infrastructure so Step 8b's mixed-method migrations (`retrieve_role`, `set_model`, `repl_complete`, `setup_model`) can proceed.
**Files touched:**
- `src/client/model.rs` — `Model::retrieve_model(config: &Config, ...)` → `Model::retrieve_model(config: &AppConfig, ...)`
- `src/client/macros.rs` — `list_all_models!` macro takes `&AppConfig` instead of `&Config`
- `src/client/*.rs` — `list_models`, helper functions updated to take `&AppConfig`
- Any callsite in `src/config/`, `src/main.rs`, `src/repl/`, etc. that calls these client functions — updated to pass `&config.app_config_snapshot()` or equivalent during the bridge window
**Bridge strategy:** add a helper `Config::app_config_snapshot(&self) -> AppConfig` that clones the serialized fields into an `AppConfig`. Callsites that currently pass `&*config.read()` pass `&config.read().app_config_snapshot()` instead. This is slightly wasteful (clones ~40 fields per call) but keeps the bridge window working without a mass caller rewrite. Step 8f/8g will eliminate the clones when callers hold `Arc<AppState>` directly.
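A sketch of a bridge-window callsite under these assumptions (the `retrieve_model` arguments beyond the config are illustrative):
```rust
// Sketch: bridge-window callsite pattern for Step 8a.
fn bridge_example(config: &GlobalConfig) -> Result<Model> {
    // Before 8a this was: Model::retrieve_model(&*config.read(), ...)
    let app_cfg = config.read().app_config_snapshot();
    let model = Model::retrieve_model(&app_cfg, "openai:gpt-4o")?;
    // 8f/8g remove the per-call clone: callers will hold Arc<AppState>
    // and pass &app_state.config directly.
    Ok(model)
}
```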
**Verification:** full build green. All existing tests pass. CLI/REPL manual smoke test: `loki --model openai:gpt-4o "hello"` still works.
**Risk:** Low. Mechanical refactor. The bridge helper absorbs the signature change cost.
---
#### Step 8b: Finish Step 7's deferred mixed-method migrations
With Step 8a done, the methods that transitively depended on `Model::retrieve_model(&Config)` can now migrate to `RequestContext` with `&AppConfig` parameters.
**Methods migrated to `RequestContext`:**
- `retrieve_role(&self, app: &AppConfig, name: &str) -> Result<Role>`
- `set_model_on_role_like(&mut self, app: &AppConfig, model_id: &str) -> Result<bool>` (paired with `AppConfig::set_model_default`)
- `repl_complete(&self, app: &AppConfig, cmd: &str, args: &[&str]) -> Vec<(String, Option<String>)>`
- `setup_model(&mut self, app: &AppConfig) -> Result<()>` — `setup_model` writes to both `self.model_id` (serialized) and `self.model` (runtime), so it splits in two: `AppConfig::ensure_default_model_id()` picks the first available model and updates `model_id`; `RequestContext::reload_current_model(&AppConfig)` refreshes `ctx.model` from the app config's id.
- `use_prompt(&mut self, app: &AppConfig, prompt: &str) -> Result<()>` — trivial wrapper around `extract_role` (already done) + `use_role_obj` (Step 6)
- `edit_role(&mut self, app: &AppConfig, abort_signal: AbortSignal) -> Result<()>` — calls `app.editor()`, `upsert_role`, `use_role` (still deferred to 8d)
**RAG-related deferrals:**
- `set_rag_reranker_model` and `set_rag_top_k` get split: the runtime branch (update the active `Rag`) becomes a `RequestContext` method taking `Arc<Rag>` mutation, and the global branch becomes `AppConfig::set_rag_reranker_model_default` / `AppConfig::set_rag_top_k_default`.
**`update` dispatcher:** Once all the individual `set_*` methods exist on both types, `update` migrates to `RequestContext::update(&mut self, app: &mut AppConfig, data: &str) -> Result<()>`. The dispatcher's body becomes a match that calls the appropriate split pair for each key.
**`use_role_safely` / `use_session_safely`:** Still not eliminated in 8b — they're wrappers around the still-`Config`-based `use_role` and `use_session`. Eliminated in 8f/8g when callers switch to `&mut RequestContext`.
**Verification:** full build green. All tests pass. Smoke test: `.set temperature 0.7`, `.set enabled_tools fs`, `.model openai:gpt-4o` all work in REPL.
**Risk:** Low. Same bridge pattern, now unblocked by 8a.
---
#### Step 8c: Extract `McpFactory::acquire()` from `McpRegistry::init_server`
Target: give `McpFactory` a working `acquire()` method so Step 8d can build real `ToolScope` instances.
**Files touched:**
- `src/mcp/mod.rs` — extract the MCP subprocess spawn + rmcp handshake logic (currently inside `McpRegistry::init_server`, ~60 lines) into a standalone function:
```rust
pub(crate) async fn spawn_mcp_server(
spec: &McpServer,
log_path: Option<&Path>,
abort_signal: &AbortSignal,
) -> Result<ConnectedServer>
```
`McpRegistry::init_server` then calls this helper and does its own bookkeeping. Backward-compatible for bridge callers.
- `src/config/mcp_factory.rs` — implement `McpFactory::acquire(spec: &McpServer, log_path, abort_signal) -> Result<Arc<ConnectedServer>>` (a sketch follows this list):
1. Build an `McpServerKey` from the spec
2. Try `self.try_get_active(&key)` → share if upgraded
3. Otherwise call `spawn_mcp_server(spec, ...).await` → wrap in `Arc` → `self.insert_active(key, &arc)` → return
- Write a couple of integration tests that exercise the factory's sharing behavior with a mock server spec (or document why a real integration test needs Phase 5's pooling work)
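Putting the three steps together, `acquire()` might look like this sketch, inlining the `try_get_active` / `insert_active` helpers. Note the plan names the handle type `ConnectedServer` here but `McpServerHandle` in Step 6.5; the sketch uses the Step 8c names:
```rust
// Sketch: `active` assumed to be a std::sync::Mutex, held only for map
// access per the 8d risk note. McpServerKey::from is the "build a key
// from the spec" step above, assumed as a From impl.
impl McpFactory {
    pub async fn acquire(
        &self,
        spec: &McpServer,
        log_path: Option<&Path>,
        abort_signal: &AbortSignal,
    ) -> Result<Arc<ConnectedServer>> {
        let key = McpServerKey::from(spec);
        // Steps 1-2: share a live handle if one exists.
        if let Some(live) = self.active.lock().unwrap().get(&key).and_then(Weak::upgrade) {
            return Ok(live);
        }
        // Step 3: spawn with the lock released. Two concurrent misses can
        // double-spawn here; the no-pool Phase 1 version tolerates that
        // (last insert wins), or a per-key gate like RagCache's fixes it.
        let server = Arc::new(spawn_mcp_server(spec, log_path, abort_signal).await?);
        self.active.lock().unwrap().insert(key, Arc::downgrade(&server));
        Ok(server)
    }
}
```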
**What this step does NOT do:** no caller migration, no `ToolScope` construction, no changes to `McpRegistry::reinit`. Step 8d does those.
**Verification:** new unit tests pass. Existing tests pass. `McpRegistry` still works for all current callers.
**Risk:** Medium. The spawn logic is intricate (child process + stdio handshake + error recovery). Extracting without a behavior change requires careful diff review.
---
#### Step 8d: Scope transition rewrites — `use_role`, `use_session`, `use_agent`, `exit_agent`
Target: build real `ToolScope` instances via `McpFactory` when scopes change. This is where Step 6.5's scaffolding stops being scaffolding.
**New methods on `RequestContext`:**
- `use_role(&mut self, app: &AppConfig, name: &str, abort_signal: AbortSignal) -> Result<()>`:
1. Call `self.retrieve_role(app, name)?` (from 8b)
2. Resolve the role's `enabled_mcp_servers` list
3. Build a fresh `ToolScope` by calling `app.mcp_factory.acquire(spec, ...)` for each required server
4. Populate `ctx.tool_scope.functions` with the role's effective function list via `select_functions(app, &role)`
5. Swap `ctx.tool_scope` atomically
6. Call `self.use_role_obj(role)` (from Step 6)
- `use_session(&mut self, app: &AppConfig, session_name: Option<&str>, abort_signal) -> Result<()>` — same pattern, with session-specific handling for `agent_session_variables`
- `use_agent(&mut self, app: &AppConfig, agent_name: &str, session_name: Option<&str>, abort_signal) -> Result<()>` — builds an `AgentRuntime` (Step 6.5 scaffolding), populates `ctx.agent_runtime`, activates the optional inner session
- `exit_agent(&mut self, app: &AppConfig) -> Result<()>` — drops `ctx.agent_runtime`, rebuilds `ctx.tool_scope` from the now-topmost RoleLike (role/session/global), cancels the supervisor, clears RAG if it came from the agent
**Key invariant: parent scope restoration on `exit_agent`.** Today's `Config::exit_agent` leaves the `McpRegistry` in whatever state the agent left it. The new `exit_agent` explicitly rebuilds `ctx.tool_scope` from the current role/session/global enabled-server lists so the user sees the expected state after exiting an agent. This is a semantic improvement over today's behavior (which technically has a latent bug that nobody notices because the next scope activation fixes it).
**What this step does NOT do:** no caller migration. `Config::use_role`, `Config::use_session`, etc. are still on `Config` and still work for existing callers. The `_safely` wrappers are still around.
**Verification:** new `RequestContext::use_role` etc. have unit tests. Full build green. Existing tests pass. No runtime behavior change because nothing calls the new methods yet.
**Risk:** Medium–high. This is the first time `McpFactory::acquire()` is exercised outside unit tests. Specifically watch:
- **`McpFactory` mutex contention** — hold the `active` lock only during HashMap mutation, never across subprocess spawn or `await`
- **Parent scope restoration correctness** — write a targeted test that activates an agent with `[github]`, exits, activates a role with `[jira]`, and verifies the tool scope has only `jira` (not `github` leftover)
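The restoration test might be shaped like this sketch; `test_context` and `scope_server_names` are hypothetical test utilities, and the scope methods are assumed async since they call `McpFactory::acquire`:
```rust
// Sketch of the targeted 8d restoration test described above.
#[tokio::test]
async fn exit_agent_restores_role_scope() -> Result<()> {
    let (app, mut ctx, abort) = test_context(&["github", "jira"]).await?;

    ctx.use_agent(&app.config, "agent-using-github", None, abort.clone()).await?;
    assert_eq!(scope_server_names(&ctx.tool_scope), vec!["github"]);

    ctx.exit_agent(&app.config)?;
    ctx.use_role(&app.config, "role-using-jira", abort.clone()).await?;

    // No leftover `github` from the agent's scope.
    assert_eq!(scope_server_names(&ctx.tool_scope), vec!["jira"]);
    Ok(())
}
```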
---
#### Step 8e: RAG lifecycle + session compression + `apply_prelude`
Target: migrate the Category C deferrals from Step 6 (session/RAG lifecycle methods that currently take `&GlobalConfig`).
**New methods on `RequestContext`:**
- `use_rag(&mut self, app: &AppConfig, name: Option<&str>, abort_signal) -> Result<()>` — routes through `app.rag_cache.load(RagKey::Named(name))`
- `edit_rag_docs(&mut self, app: &AppConfig, abort_signal) -> Result<()>` — determines the `RagKey` (Agent or Named) from `ctx.agent_runtime` / `ctx.rag` origin, calls `app.rag_cache.invalidate(&key)`, reloads
- `rebuild_rag(&mut self, app: &AppConfig, abort_signal) -> Result<()>` — same pattern as `edit_rag_docs`
- `compress_session(&mut self, app: &AppConfig) -> Result<()>` — reads `app.summarization_prompt`, `app.summary_context_prompt`, mutates `ctx.session`. Async, does an LLM call via an existing `Input::from_str` pattern.
- `maybe_compress_session(&mut self, app: &AppConfig) -> bool` — checks `ctx.session.needs_compression(app.compression_threshold)`, triggers compression if so. Returns whether compression was triggered; caller decides whether to spawn a background task (the task spawning moves to the caller's responsibility, not the method's).
- `autoname_session(&mut self, app: &AppConfig) -> Result<()>` — same pattern, uses `CREATE_TITLE_ROLE` and `Input::from_str`
- `maybe_autoname_session(&mut self, app: &AppConfig) -> bool` — same return-bool pattern
- `apply_prelude(&mut self, app: &AppConfig, abort_signal) -> Result<()>` — parses `app.repl_prelude` / `app.cmd_prelude`, calls the new `self.use_role()` / `self.use_session()` from 8d
**The `GlobalConfig`-taking static methods go away.** Today's code uses the pattern `Config::maybe_compress_session(config: GlobalConfig)`, which takes an owned `Arc<RwLock<Config>>` and spawns a background task. After 8e, the new `RequestContext::maybe_compress_session` returns a bool; callers that want async compression spawn the task themselves from their own `RequestContext`. This is simpler and more explicit.
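A sketch of the resulting caller-side pattern, assuming the signatures listed above:
```rust
// Sketch of a post-8e caller in a REPL turn loop. The method only checks
// the threshold; the caller decides how to run the actual (LLM-calling)
// compression: inline here, or as a background task.
async fn after_turn(ctx: &mut RequestContext, app: &AppState) {
    if ctx.maybe_compress_session(&app.config) {
        if let Err(err) = ctx.compress_session(&app.config).await {
            eprintln!("session compression failed: {err}");
        }
    }
}
```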
**Verification:** new methods have unit tests where feasible. Full build green. `compress_session` and `autoname_session` are tricky to unit-test because they do LLM calls; mock the LLM or skip the full path in tests.
**Risk:** Medium. The session compression flow is the most behavior-sensitive — getting the semantics wrong here results in lost session history. Write a targeted integration test that feeds 10+ user messages into a session, triggers compression, and verifies the session's summary is preserved.
---
#### Step 8f: Entry point rewrite — `main.rs`
Target: rewrite `main.rs` to construct `AppState` + `RequestContext` explicitly instead of using `GlobalConfig`.
**Specific changes:**
- `Config::init()` → `AppState::init()` (sketched after this list), which:
1. Loads `config.yaml` into `AppConfig`
2. Applies environment variable overrides (calls `AppConfig::load_envs` from Step 4)
3. Calls `AppConfig::setup_document_loaders` / `AppConfig::setup_user_agent` (Step 4)
4. Constructs the `Vault`, `McpFactory`, `RagCache`
5. Returns `Arc<AppState>`
- `main::run()` constructs a `RequestContext` from the `AppState` and threads it through to subcommands
- `main::start_directive(ctx: &mut RequestContext, ...)` — signature change
- `main::create_input(ctx: &RequestContext, ...)` — signature change
- `main::shell_execute(ctx: &mut RequestContext, ...)` — signature change
- All 18 `main.rs` callsites updated
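A sketch of the `AppState::init` shape from the first bullet; `AppConfig::load` and `Vault::init` are assumed constructor names, and return types are assumptions:
```rust
// Sketch following the five numbered steps above.
impl AppState {
    pub fn init() -> Result<Arc<Self>> {
        // 1. Load config.yaml into the serialized half.
        let mut config = AppConfig::load(&paths::config_file())?;
        // 2-3. Environment overrides and one-time setup (Step 4 methods).
        config.load_envs();
        config.setup_document_loaders();
        config.setup_user_agent();
        // 4. Shared services.
        let vault = Vault::init(&config)?;
        Ok(Arc::new(AppState {
            config: Arc::new(config),
            vault,
            mcp_factory: Arc::new(McpFactory::default()),
            rag_cache: Arc::new(RagCache::default()),
        }))
    }
}
```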
**`load_functions` and `load_mcp_servers`:** These are initialization-time methods that populate `ctx.tool_scope.functions` and `ctx.tool_scope.mcp_runtime`. They move from `Config` to a new `RequestContext::bootstrap_tools(&mut self, app: &AppConfig, abort_signal) -> Result<()>` that:
1. Initializes `Functions` via `Functions::init(visible_tools)` (existing code)
2. Resolves the initial enabled-MCP-server list from `app.enabled_mcp_servers`
3. Calls `app.mcp_factory.acquire()` for each
4. Assigns the result to `ctx.tool_scope`
This replaces the `Config::load_functions` + `Config::load_mcp_servers` call sequence in today's `main.rs`.
**Verification:** CLI smoke tests from the original plan's Step 8 verification checklist. Specifically:
- `loki "hello"` — plain prompt
- `loki --role explain "what is TCP"` — role activation
- `loki --session my-project "..."` — session
- `loki --agent sisyphus "..."` — agent activation
- `loki --info` — sysinfo output
Each should produce output matching the pre-Step-8 behavior exactly.
**Risk:** High. `main.rs` is the primary entry point; any regression here is user-visible. Write smoke tests that compare CLI output byte-for-byte with a recorded baseline.
---
#### Step 8g: REPL rewrite — `repl/mod.rs`
Target: rewrite `repl/mod.rs` to use `&mut RequestContext` instead of `GlobalConfig`.
**Specific changes:**
- `Repl` struct: `config: GlobalConfig` → `ctx: RequestContext` (long-lived, mutable across turns)
- `run_repl_command(ctx: &mut RequestContext, ...)` — signature change
- `ask(ctx: &mut RequestContext, ...)` — signature change
- Every dot-command handler updated. Dot-commands that take the `GlobalConfig` pattern (like the `_safely` wrappers) are **eliminated** — they just call `ctx.use_role(...)` directly.
- All 39 command handlers migrated
- All 12 `repl/mod.rs` internal callsites updated
**`use_role_safely` / `use_session_safely` elimination:** these wrappers exist only because `Config::use_role` is `&mut self` and the REPL holds `Arc<RwLock<Config>>`. After Step 8g, the REPL holds `RequestContext` directly (no lock), so the wrappers are no longer needed and get deleted.
**Verification:** REPL smoke tests matching the pre-Step-8 behavior. Specifically:
- Start REPL, issue a prompt → should see same output
- `.role explain`, `.session my-session`, `.agent sisyphus`, `.exit agent` — should all work identically
- `.set temperature 0.7` then `.info` — should show updated temperature
- Ctrl-C during an LLM call — should cleanly abort
- `.macro run-tests` — should execute without errors
**Risk:** High. Same reason as 8f — this is a user-visible entry point. Test every dot-command.
---
#### Step 8h: Remaining callsite sweep
Target: migrate the remaining modules in priority order (lowest callsite count first, keeping the build green after each module):
| Priority | Module | Callsites | Notes |
|---|---|---|---|
| 1 | `render/mod.rs` | 1 | `render_stream` just reads config — trivial |
| 2 | `repl/completer.rs` | 1 | Just reads for completions |
| 3 | `repl/prompt.rs` | 1 | Just reads for prompt rendering |
| 4 | `function/user_interaction.rs` | 1 | Just reads for user prompts |
| 5 | `function/mod.rs` | 2 | `eval_tool_calls` reads config |
| 6 | `config/macros.rs` | 3 | `macro_execute` reads and writes |
| 7 | `function/todo.rs` | 4 | Todo handlers read/write agent state |
| 8 | `config/input.rs` | 6 | Input creation — reads config |
| 9 | `rag/mod.rs` | 6 | RAG init/search |
| 10 | `function/supervisor.rs` | 8 | Sub-agent spawning — complex |
| 11 | `config/agent.rs` | 12 | Agent init — complex, many mixed concerns |
**Sub-agent spawning** (`function/supervisor.rs`) is the most complex item in the sweep. Each child agent gets a fresh `RequestContext` forked from the parent's shared `Arc<AppState>` (see the sketch after this list):
- Own `ToolScope` built by calling `app.mcp_factory.acquire()` for the child's `mcp_servers` list
- Own `AgentRuntime` with fresh supervisor, fresh inbox, `current_depth = parent.depth + 1`
- `parent_supervisor = Some(parent.agent_runtime.supervisor.clone())` — weakly linked to parent for messaging
- `escalation_queue = parent.agent_runtime.escalation_queue.clone()` — `Arc`-shared from root
- RAG served via `app.rag_cache.load(RagKey::Agent(child_name))` — shared with any sibling of the same type
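The fork might be shaped like this sketch; `RequestContext::new`, `build_tool_scope`, `new_agent_id`, and `Supervisor::shared` are assumed helpers, and the field wiring follows the bullets above:
```rust
// Sketch of the child fork. Sharing the parent's supervisor implies it
// sits behind an Arc; the Step 6.5 AgentRuntime sketch holds it by value,
// so `shared()` is assumed here.
async fn fork_child_context(
    parent: &RequestContext,
    child_spec: AgentSpec,
    abort: AbortSignal,
) -> Result<RequestContext> {
    let app = parent.app.clone(); // same Arc<AppState> as the parent
    let parent_rt = parent.agent_runtime.as_ref().expect("children are spawned by agents");

    let mut child = RequestContext::new(app.clone());
    child.tool_scope = build_tool_scope(&app, &child_spec.mcp_servers, &abort).await?;
    child.agent_runtime = Some(AgentRuntime {
        rag: app.rag_cache.load(&RagKey::Agent(child_spec.name.clone())).await?,
        supervisor: Supervisor::new(),
        inbox: Arc::new(Inbox::default()),
        escalation_queue: parent_rt.escalation_queue.clone(), // one queue, rooted at the human
        todo_list: child_spec.auto_continue.then(TodoList::default),
        self_agent_id: new_agent_id(&child_spec.name),
        parent_supervisor: Some(parent_rt.supervisor.shared()),
        current_depth: parent_rt.current_depth + 1,
        auto_continue_count: 0,
        spec: child_spec,
    });
    Ok(child)
}
```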
`config/agent.rs` — `Agent::init` is currently tightly coupled to `Config`. It needs to be rewritten to take `&AppState` + `&mut RequestContext`. Some of its complexity (MCP server startup, RAG loading) moves into `RequestContext::use_agent` from Step 8d; `Agent::init` becomes just the spec-loading portion.
**Verification:** after each module migrates, run full `cargo check` + `cargo test`. After all modules migrate, run the full smoke test suite from 8f and 8g.
**Risk:** Medium. The sub-agent spawning and `config/agent.rs` work is complex, but the bridge pattern means we can take each module independently.
---
### Step 9: Remove the Bridge
After Step 8h, no caller references `GlobalConfig` or calls `Config`-based methods that have `RequestContext` / `AppConfig` equivalents. Step 9 is the cleanup:
1. Delete `src/config/bridge.rs` (the conversion methods added in Step 1)
2. Delete the `#[allow(dead_code)]` attributes from `AppConfig` impl blocks (now that callers use them)
3. Delete the `#[allow(dead_code)]` attributes from `RequestContext` impl blocks
4. Remove the flat runtime fields from `RequestContext` that have been superseded by `tool_scope` and `agent_runtime` (i.e., delete `functions`, `tool_call_tracker`, `supervisor`, `parent_supervisor`, `self_agent_id`, `current_depth`, `inbox`, `root_escalation_queue` from `RequestContext`; they now live inside `ToolScope` and `AgentRuntime`)
5. Run the full test suite
This step is mostly mechanical deletion. The one part that needs care is removing the flat fields — make sure no remaining code path still reads them. `cargo check` will catch any stragglers.
### Step 10: Remove the old Config struct and GlobalConfig type alias
Once Step 9 is done and the tree is clean:
1. Delete all Step 2-7 method definitions on `Config` (the ones that got duplicated to `paths` / `AppConfig` / `RequestContext`)
2. Delete the `#[serde(skip)]` runtime fields on `Config` (they're all on `RequestContext` now)
3. At this point `Config` should be nearly empty — possibly just the serialized fields
4. Delete `Config` entirely (or rename it to `RawConfig` if it's still useful as a serde DTO, though `AppConfig` already plays that role)
5. Delete the `pub type GlobalConfig = Arc<RwLock<Config>>` type alias
6. Run `cargo check`, `cargo test`, `cargo clippy` — all clean
7. Run the full manual smoke test suite one more time
Phase 1 complete.
---
## Callsite Migration Summary
| Module | Functions to Migrate | Handled In |
|---|---|---|
| `config/mod.rs` | ~100 methods (~30 static, 10 global-read, 8 global-write, 35 request-read/write, ~20 mixed) | Steps 2-7 (mechanical duplication), Step 10 (deletion) |
| `client/` macros and `model.rs` | `Model::retrieve_model`, `list_all_models!`, `list_models` | Step 8a |
| `main.rs` | `run`, `start_directive`, `shell_execute`, `create_input`, `apply_prelude_safely` | Step 8f |
| `repl/mod.rs` | `run_repl_command`, `ask`, plus 39 command handlers | Step 8g |
| `config/agent.rs` | `Agent::init`, agent lifecycle methods | Step 8h (partial) + Step 8d (scope transitions) |
| `function/supervisor.rs` | Sub-agent spawning, task management | Step 8h |
| `config/input.rs` | `Input::from_str`, `from_files`, `from_files_with_spinner` | Step 8h |
| `rag/mod.rs` | RAG init, load, search | Step 8e (lifecycle) + Step 8h (remaining) |
| `mcp/mod.rs` | `McpRegistry::init_server` spawn logic extraction | Step 8c |
| `function/mod.rs` | `eval_tool_calls` | Step 8h |
| `function/todo.rs` | Todo handlers | Step 8h |
| `function/user_interaction.rs` | User prompt handler | Step 8h |
| `render/mod.rs` | `render_stream` | Step 8h |
| `repl/completer.rs` | Completion logic | Step 8h |
| `repl/prompt.rs` | Prompt rendering | Step 8h |
| `config/macros.rs` | `macro_execute` | Step 8h |
### Step 8 effort estimates
| Sub-step | Effort | Risk |
|---|---|---|
| 8a — client module refactor | 0.5–1 day | Low |
| 8b — Step 7 deferrals | 0.5–1 day | Low |
| 8c — `McpFactory::acquire()` extraction | 1 day | Medium |
| 8d — scope transition rewrites | 1–2 days | Medium–high |
| 8e — RAG + session lifecycle migration | 1–2 days | Medium |
| 8f — `main.rs` rewrite | 1 day | High |
| 8g — `repl/mod.rs` rewrite | 1–2 days | High |
| 8h — remaining callsite sweep | 1–2 days | Medium |
**Total estimated Step 8 effort: ~7–12 days.** The "total Phase 1 effort" from the plan header needs to be updated once Step 8 finishes.
---
## Verification Checkpoints
After each step, verify:
1. **`cargo check`** — no compilation errors
2. **`cargo test`** — all existing tests pass
3. **Manual smoke test** — CLI one-shot prompt works, REPL starts and processes a prompt
4. **No behavior changes** — identical output for identical inputs
## Risk Factors
### Phase-wide risks
| Risk | Severity | Mitigation |
|---|---|---|
| Bridge-window duplication drift — bug fixed in `Config::X` but not `RequestContext::X` or vice versa | Medium | Keep the bridge window as short as possible. Step 8 should finish within 2 weeks of Step 7 ideally. Any bug fix during Steps 8a-8h must be applied to both places if the method is still duplicated. |
| Sub-agent spawning semantics change subtly | High | Cross-agent MCP trampling is a latent bug today that Step 8d/8h fixes. Write targeted integration tests for the sub-agent spawning path before and after Step 8h to verify semantics match (or improve intentionally). |
| Long-running Phase 1 blocking Phase 2+ work | Medium | Phase 2 (Engine + Emitter) can start prep work in parallel with Step 8h — the final callsite sweep doesn't block the new Engine design. |
### Step 8 sub-step risks
| Sub-step | Risk | Severity | Mitigation |
|---|---|---|---|
| 8a | Client macro refactor breaks LLM provider integration | Low | All LLM providers use the same `Model::retrieve_model` entry point. Test with at least 2 providers (openai + another) before declaring 8a done. |
| 8b | `update` dispatcher has ~15 cases — easy to miss one | Low | Enumerate every `.set` key handled today; check each is in the new dispatcher. |
| 8c | Extracting `spawn_mcp_server` introduces behavior differences (e.g., error handling, abort signal propagation) | Medium | Do a line-by-line diff review. Write a test that kills an in-flight spawn via the abort signal. |
| 8d | `McpFactory` mutex contention under parallel sub-agent spawning | Medium | Hold the `active` lock only during HashMap operations, never across `await`. Benchmark with 4 concurrent scope transitions before declaring 8d done. |
| 8d | Parent scope restoration on `exit_agent` differs from today's implicit behavior | High | Write a targeted test: activate global→role(jira)→agent(github,slack)→exit_agent. Verify scope is role(jira), not agent's (github,slack) or stale. |
| 8e | Session compression loses messages when triggered mid-request | High | Integration test: feed 10+ user messages, compress, verify summary preserves all user intent. Also test concurrent compression (REPL background task + foreground turn). |
| 8e | `rebuild_rag` / `edit_rag_docs` pick the wrong `RagKey` variant | Medium | Test both paths explicitly: agent-scoped rebuild and standalone rebuild. Assert the right cache entry is invalidated. |
| 8f | CLI output bytes differ from pre-refactor baseline | High | Record baseline CLI outputs for 10 common invocations. After 8f, diff byte-for-byte. Any difference is a regression unless explicitly justified. |
| 8g | REPL dot-command behavior regresses silently | High | Test every dot-command end-to-end: `.role`, `.session`, `.agent`, `.rag`, `.set`, `.info`, `.exit *`, `.compress session`, etc. |
| 8h | Sub-agent spawning in `function/supervisor.rs` shares state incorrectly between parent and child | High | Integration test: parent activates agent A (github), spawns child B (jira), verify B's tool scope has only jira and parent's has only github. Each parallel child has independent tool scopes. |
| 8h | `Agent::init` refactor drops initialization logic | Medium | `Agent::init` is ~100 lines today. Diff the old vs new init paths line-by-line. |
### Legacy risks (resolved during the refactor)
These risks from the original plan have been addressed by the step-by-step scaffolding approach:
| Original risk | How it was resolved |
|---|---|
| `use_role_safely` / `use_session_safely` use take/replace pattern | Eliminated entirely in Step 8g — REPL holds `&mut RequestContext` directly, no lock take/replace needed |
| `Agent::init` creates MCP servers, functions, RAG on Config | Resolved in Step 8d + Step 8h — MCP via `McpFactory::acquire()`, RAG via `RagCache`, functions via `RequestContext::bootstrap_tools` |
| Sub-agent spawning clones Config | Resolved in Step 8h — children get fresh `RequestContext` forked from `Arc<AppState>` |
| Input holds `GlobalConfig` clone | Resolved in Step 8f — `Input` now holds references to the context it needs from `RequestContext`, not an owned clone |
| Concurrent REPL operations spawn tasks with `GlobalConfig` clone | Resolved in Step 8e — task spawning moves to the caller, which holds its `RequestContext` explicitly |
## What This Phase Does NOT Do
- No REST API server code
- No Engine::run() unification (that's Phase 2)
- No Emitter trait (that's Phase 2)
- No SessionStore abstraction (that's Phase 3)
- No UUID-based sessions (that's Phase 3)
- No agent isolation refactoring (that's Phase 5)
- No new dependencies added
The sole goal is: **split Config into immutable global + mutable per-request, with identical external behavior.**