# Phase 1 Implementation Plan: Extract AppState from Config

## Overview

Split the monolithic `Config` struct into:

- **`AppConfig`** — immutable server-wide settings (deserialized from `config.yaml`)
- **`RequestContext`** — per-request mutable state (current role, session, agent, supervisor, etc.)

The existing `GlobalConfig` (`Arc<RwLock<Config>>`) type alias is replaced. CLI and REPL continue working identically. No API code is added in this phase.

**Estimated effort:** ~3-4 weeks (originally estimated 1-2 weeks; revised during implementation as Steps 6.5 and 7 deferred their semantic rewrites to an expanded Step 8)

**Risk:** Medium — touches 91 callsites across 15 modules

**Mitigation:** Incremental migration with tests passing at every step

**Sub-step tracking:** Each step has per-step implementation notes in `docs/implementation/PHASE-1-STEP-*-NOTES.md`

---

## Current State: Config Field Classification

### Serialized Fields (from config.yaml → AppConfig)

These are loaded from disk once and should be immutable during request processing:

| Field | Type | Notes |
|---|---|---|
| `model_id` | `String` | Default model ID |
| `temperature` | `Option<f64>` | Default temperature |
| `top_p` | `Option<f64>` | Default top_p |
| `dry_run` | `bool` | Can be overridden per-request |
| `stream` | `bool` | Can be overridden per-request |
| `save` | `bool` | Whether to persist to messages.md |
| `keybindings` | `String` | REPL keybinding style |
| `editor` | `Option<String>` | Editor command |
| `wrap` | `Option<String>` | Text wrapping |
| `wrap_code` | `bool` | Code block wrapping |
| `vault_password_file` | `Option<PathBuf>` | Vault password location |
| `function_calling_support` | `bool` | Global function calling toggle |
| `mapping_tools` | `IndexMap<String, String>` | Tool aliases |
| `enabled_tools` | `Option<String>` | Default enabled tools |
| `visible_tools` | `Option<Vec<String>>` | Visible tool list |
| `mcp_server_support` | `bool` | Global MCP toggle |
| `mapping_mcp_servers` | `IndexMap<String, String>` | MCP server aliases |
| `enabled_mcp_servers` | `Option<String>` | Default enabled MCP servers |
| `repl_prelude` | `Option<String>` | REPL prelude config |
| `cmd_prelude` | `Option<String>` | CLI prelude config |
| `agent_session` | `Option<String>` | Default agent session |
| `save_session` | `Option<bool>` | Session save behavior |
| `compression_threshold` | `usize` | Session compression threshold |
| `summarization_prompt` | `Option<String>` | Compression prompt |
| `summary_context_prompt` | `Option<String>` | Summary context prompt |
| `rag_embedding_model` | `Option<String>` | RAG embedding model |
| `rag_reranker_model` | `Option<String>` | RAG reranker model |
| `rag_top_k` | `usize` | RAG top-k results |
| `rag_chunk_size` | `Option<usize>` | RAG chunk size |
| `rag_chunk_overlap` | `Option<usize>` | RAG chunk overlap |
| `rag_template` | `Option<String>` | RAG template |
| `document_loaders` | `HashMap<String, String>` | Document loader mappings |
| `highlight` | `bool` | Syntax highlighting |
| `theme` | `Option<String>` | Color theme |
| `left_prompt` | `Option<String>` | REPL left prompt format |
| `right_prompt` | `Option<String>` | REPL right prompt format |
| `user_agent` | `Option<String>` | HTTP User-Agent |
| `save_shell_history` | `bool` | Shell history persistence |
| `sync_models_url` | `Option<String>` | Models sync URL |
| `clients` | `Vec<...>` | LLM provider configs |

### Runtime Fields (#[serde(skip)] → RequestContext)

These are created at runtime and are per-request/per-session mutable state:

| Field | Type | Destination |
|---|---|---|
| `vault` | `GlobalVault` | `AppState.vault` (shared service) |
| `macro_flag` | `bool` | `RequestContext.macro_flag` |
| `info_flag` | `bool` | `RequestContext.info_flag` |
| `agent_variables` | `Option<...>` | `RequestContext.agent_variables` |
| `model` | `Model` | `RequestContext.model` |
| `functions` | `Functions` | `RequestContext.tool_scope.functions` (unified in Step 6) |
| `mcp_registry` | `Option<McpRegistry>` | **REMOVED.** Replaced by per-`ToolScope` `McpRuntime`s produced by a new `McpFactory` on `AppState`. See the architecture doc's "Tool Scope Isolation" section. |
| `working_mode` | `WorkingMode` | `RequestContext.working_mode` |
| `last_message` | `Option<...>` | `RequestContext.last_message` |
| `role` | `Option<Role>` | `RequestContext.role` |
| `session` | `Option<Session>` | `RequestContext.session` |
| `rag` | `Option<Arc<Rag>>` | `RequestContext.rag` |
| `agent` | `Option<Agent>` | `RequestContext.agent` (agent spec + role + RAG) |
| `tool_call_tracker` | `Option<ToolCallTracker>` | `RequestContext.tool_scope.tool_tracker` (unified in Step 6) |
| `supervisor` | `Option<Arc<Mutex<Supervisor>>>` | `RequestContext.agent_runtime.supervisor` |
| `parent_supervisor` | `Option<Arc<Mutex<Supervisor>>>` | `RequestContext.agent_runtime.parent_supervisor` |
| `self_agent_id` | `Option<String>` | `RequestContext.agent_runtime.self_agent_id` |
| `current_depth` | `usize` | `RequestContext.agent_runtime.current_depth` |
| `inbox` | `Option<Arc<Inbox>>` | `RequestContext.agent_runtime.inbox` |
| `root_escalation_queue` | `Option<Arc<...>>` | `RequestContext.agent_runtime.escalation_queue` (shared from the root via `Arc`) |

**Note on `ToolScope` and `AgentRuntime`:** during Phase 1 Step 0 the new `RequestContext` struct keeps `functions`, `tool_call_tracker`, and the supervisor/inbox/escalation fields as **flat fields** mirroring today's `Config`. This is deliberate — it makes the field-by-field migration mechanical. In Step 6.5 these fields collapse into two sub-structs:

- `ToolScope { functions, mcp_runtime, tool_tracker }` — owned by every active `RoleLike` scope, rebuilt on role/session/agent transitions via `McpFactory::acquire()`.
- `AgentRuntime { spec, rag, supervisor, inbox, escalation_queue, todo_list, self_agent_id, parent_supervisor, current_depth, auto_continue_count }` — owned only when an agent is active.

**Two behavior changes land during Step 6.5** that tighten today's code:

1. `todo_list` becomes `Option<TodoList>`. Today the code always allocates `TodoList::default()` for every agent, even when `auto_continue: false`. Since the todo tools and instructions are only exposed when `auto_continue: true`, the allocation is wasted. The new shape skips the allocation unless the agent opts in. No user-visible change.
2. A unified `RagCache` on `AppState` serves **both** standalone RAGs (attached via `.rag <name>`) and agent-owned RAGs (loaded from an agent's `documents:` field). Today, both paths independently call `Rag::load` from disk on every use; with the cache, any scope requesting the same `RagKey` shares the same `Arc<Rag>`.

Standalone RAG lives in `ctx.rag`; agent RAG lives in `ctx.agent_runtime.rag`. Roles and sessions do **not** own RAG (the structs have no RAG fields) — this is true today and unchanged by the refactor. `rebuild_rag` and `edit_rag_docs` call `RagCache::invalidate()`.

See `docs/REST-API-ARCHITECTURE.md` section 5 for the full `ToolScope`, `McpFactory`, `RagCache`, and MCP pooling designs.

---

## Migration Strategy: The Facade Pattern

**Do NOT rewrite everything at once.** Instead, use a transitional facade that keeps the old `Config` working while new code uses the split types.

### Step 0: Add new types alongside Config (no breaking changes) ✅ DONE

Create the new structs in new files. `Config` stays untouched. Nothing breaks.

**Files created:**

- `src/config/app_config.rs` — `AppConfig` struct (the serialized half)
- `src/config/request_context.rs` — `RequestContext` struct (the runtime half)
- `src/config/app_state.rs` — `AppState` struct (Arc-wrapped global services, no `mcp_registry` — see below)

**`AppConfig`** is essentially the current `Config` struct but containing ONLY the serialized fields (no `#[serde(skip)]` fields). It should derive `Deserialize` identically so the existing `config.yaml` still loads.

**Important change from the original plan:** `AppState` does NOT hold an `McpRegistry`. MCP server processes are scoped per `RoleLike`, not process-wide. An `McpFactory` service will be added to `AppState` in Step 6.5. See `docs/REST-API-ARCHITECTURE.md` section 5 for the design rationale.
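The round-trip contract that Steps 0 and 1 set up can be pictured with a toy, serde-free model. All `Toy*` names here are illustrative stand-ins, not the real types:

```rust
// Toy model of the split (Step 0) and bridge (Step 1). Two fields stand in
// for the ~40 serialized and ~20 runtime fields of the real Config.
#[derive(Clone, Debug, PartialEq)]
pub struct ToyAppConfig {
    pub model_id: String,
    pub stream: bool,
}

#[derive(Clone, Debug, PartialEq)]
pub struct ToyRequestContext {
    pub info_flag: bool,
    pub last_message: Option<String>,
}

#[derive(Clone, Debug, PartialEq)]
pub struct ToyConfig {
    pub model_id: String,
    pub stream: bool,
    pub info_flag: bool,
    pub last_message: Option<String>,
}

impl ToyConfig {
    // Extract the serialized half.
    pub fn to_app_config(&self) -> ToyAppConfig {
        ToyAppConfig { model_id: self.model_id.clone(), stream: self.stream }
    }

    // Extract the runtime half.
    pub fn to_request_context(&self) -> ToyRequestContext {
        ToyRequestContext { info_flag: self.info_flag, last_message: self.last_message.clone() }
    }

    // Bridge back for legacy callers during the migration window.
    pub fn from_parts(app: &ToyAppConfig, ctx: &ToyRequestContext) -> ToyConfig {
        ToyConfig {
            model_id: app.model_id.clone(),
            stream: app.stream,
            info_flag: ctx.info_flag,
            last_message: ctx.last_message.clone(),
        }
    }
}
```

The point of the toy is the invariant: splitting and merging back must reproduce an equivalent `Config`, which is exactly what Step 1's round-trip test checks.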
```rust
// src/config/app_config.rs
#[derive(Debug, Clone, Deserialize)]
#[serde(default)]
pub struct AppConfig {
    #[serde(rename(serialize = "model", deserialize = "model"))]
    pub model_id: String,
    pub temperature: Option<f64>,
    pub top_p: Option<f64>,
    pub dry_run: bool,
    pub stream: bool,
    pub save: bool,
    pub keybindings: String,
    pub editor: Option<String>,
    pub wrap: Option<String>,
    pub wrap_code: bool,
    vault_password_file: Option<PathBuf>,
    pub function_calling_support: bool,
    pub mapping_tools: IndexMap<String, String>,
    pub enabled_tools: Option<String>,
    pub visible_tools: Option<Vec<String>>,
    pub mcp_server_support: bool,
    pub mapping_mcp_servers: IndexMap<String, String>,
    pub enabled_mcp_servers: Option<String>,
    pub repl_prelude: Option<String>,
    pub cmd_prelude: Option<String>,
    pub agent_session: Option<String>,
    pub save_session: Option<bool>,
    pub compression_threshold: usize,
    pub summarization_prompt: Option<String>,
    pub summary_context_prompt: Option<String>,
    pub rag_embedding_model: Option<String>,
    pub rag_reranker_model: Option<String>,
    pub rag_top_k: usize,
    pub rag_chunk_size: Option<usize>,
    pub rag_chunk_overlap: Option<usize>,
    pub rag_template: Option<String>,
    pub document_loaders: HashMap<String, String>,
    pub highlight: bool,
    pub theme: Option<String>,
    pub left_prompt: Option<String>,
    pub right_prompt: Option<String>,
    pub user_agent: Option<String>,
    pub save_shell_history: bool,
    pub sync_models_url: Option<String>,
    pub clients: Vec<...>,
}
```

```rust
// src/config/app_state.rs
#[derive(Clone)]
pub struct AppState {
    pub config: Arc<AppConfig>,
    pub vault: GlobalVault,
    // NOTE: no `mcp_registry` field. MCP runtime is scoped per-`ToolScope`
    // on `RequestContext`, not process-wide. An `McpFactory` will be added
    // here later (Step 6.5 / Phase 5) to pool and ref-count MCP processes
    // across concurrent ToolScopes. See architecture doc section 5.
}
```

```rust
// src/config/request_context.rs
pub struct RequestContext {
    pub app: Arc<AppState>,

    // per-request flags
    pub macro_flag: bool,
    pub info_flag: bool,
    pub working_mode: WorkingMode,

    // active context
    pub model: Model,
    pub functions: Functions,
    pub role: Option<Role>,
    pub session: Option<Session>,
    pub rag: Option<Arc<Rag>>,
    pub agent: Option<Agent>,
    pub agent_variables: Option<...>,

    // conversation state
    pub last_message: Option<...>,
    pub tool_call_tracker: Option<ToolCallTracker>,

    // agent supervision
    pub supervisor: Option<Arc<Mutex<Supervisor>>>,
    pub parent_supervisor: Option<Arc<Mutex<Supervisor>>>,
    pub self_agent_id: Option<String>,
    pub current_depth: usize,
    pub inbox: Option<Arc<Inbox>>,
    pub root_escalation_queue: Option<Arc<...>>,
}
```

### Step 1: Make Config constructible from AppConfig + RequestContext

Add conversion methods so the old `Config` can be built from the new types, and vice versa. This is the bridge that lets us migrate incrementally.

```rust
// On Config:
impl Config {
    /// Extract the global portion into AppConfig
    pub fn to_app_config(&self) -> AppConfig { /* copy serialized fields */ }

    /// Extract the runtime portion into RequestContext
    pub fn to_request_context(&self, app: Arc<AppState>) -> RequestContext { /* copy runtime fields */ }

    /// Reconstruct Config from the split types (for backward compat during migration)
    pub fn from_parts(app: &AppState, ctx: &RequestContext) -> Config { /* merge back */ }
}
```

**Test:** After this step, splitting a `Config` via `to_app_config` / `to_request_context` and merging back with `Config::from_parts` round-trips to an equivalent `Config`. Existing tests still pass.

### Step 2: Migrate static methods off Config

There are ~30 static methods on Config (no `self` parameter). These are pure utility functions that don't need Config at all — they compute file paths, list directories, etc.

**Target:** Move these to a standalone `paths` module or keep on `AppConfig` where appropriate.
| Method | Move to |
|---|---|
| `config_dir()` | `paths::config_dir()` |
| `local_path(name)` | `paths::local_path(name)` |
| `cache_path()` | `paths::cache_path()` |
| `oauth_tokens_path()` | `paths::oauth_tokens_path()` |
| `token_file(client)` | `paths::token_file(client)` |
| `log_path()` | `paths::log_path()` |
| `config_file()` | `paths::config_file()` |
| `roles_dir()` | `paths::roles_dir()` |
| `role_file(name)` | `paths::role_file(name)` |
| `macros_dir()` | `paths::macros_dir()` |
| `macro_file(name)` | `paths::macro_file(name)` |
| `env_file()` | `paths::env_file()` |
| `rags_dir()` | `paths::rags_dir()` |
| `functions_dir()` | `paths::functions_dir()` |
| `functions_bin_dir()` | `paths::functions_bin_dir()` |
| `mcp_config_file()` | `paths::mcp_config_file()` |
| `global_tools_dir()` | `paths::global_tools_dir()` |
| `global_utils_dir()` | `paths::global_utils_dir()` |
| `bash_prompt_utils_file()` | `paths::bash_prompt_utils_file()` |
| `agents_data_dir()` | `paths::agents_data_dir()` |
| `agent_data_dir(name)` | `paths::agent_data_dir(name)` |
| `agent_config_file(name)` | `paths::agent_config_file(name)` |
| `agent_bin_dir(name)` | `paths::agent_bin_dir(name)` |
| `agent_rag_file(agent, rag)` | `paths::agent_rag_file(agent, rag)` |
| `agent_functions_file(name)` | `paths::agent_functions_file(name)` |
| `models_override_file()` | `paths::models_override_file()` |
| `list_roles(with_builtin)` | `Role::list(with_builtin)` or `paths` |
| `list_rags()` | `Rag::list()` or `paths` |
| `list_macros()` | `Macro::list()` or `paths` |
| `has_role(name)` | `Role::exists(name)` |
| `has_macro(name)` | `Macro::exists(name)` |
| `sync_models(url, abort)` | Standalone function or on `AppConfig` |
| `local_models_override()` | Standalone function |
| `log_config()` | Standalone function |

**Approach:** Create `src/config/paths.rs`, move functions there, and add `#[deprecated]` forwarding methods on `Config` that call the new locations.
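A sketch of that shape, under the assumption that roles are stored as `<name>.md` under a `roles/` directory. The signatures are illustrative (the real helpers would derive the base directory internally rather than take it as a parameter):

```rust
use std::path::{Path, PathBuf};

// Hypothetical standalone paths module (Step 2): pure functions, no Config.
pub mod paths {
    use std::path::{Path, PathBuf};

    pub fn roles_dir(config_dir: &Path) -> PathBuf {
        config_dir.join("roles")
    }

    // Assumed layout: roles/<name>.md
    pub fn role_file(config_dir: &Path, name: &str) -> PathBuf {
        roles_dir(config_dir).join(format!("{name}.md"))
    }
}

// Transitional forwarder on the old Config (toy stand-in), kept until all
// callsites are migrated, then deleted.
pub struct Config {
    pub config_dir: PathBuf,
}

impl Config {
    #[deprecated(note = "use paths::role_file instead")]
    pub fn role_file(&self, name: &str) -> PathBuf {
        paths::role_file(&self.config_dir, name)
    }
}
```

The `#[deprecated]` attribute turns every remaining callsite into a compiler warning, which is what makes the "fix callsites module by module" sweep trackable.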
Compile, run tests, fix callsites module by module, then remove the deprecated methods.

**Callsite count:** Low — most of these are called from 1-3 places. This is a quick-win step.

### Step 3: Migrate global-read methods to AppConfig

These methods only read serialized config values and should live on `AppConfig`:

| Method | Current Signature | New Home |
|---|---|---|
| `vault_password_file` | `&self -> PathBuf` | `AppConfig` |
| `editor` | `&self -> Result<String>` | `AppConfig` |
| `sync_models_url` | `&self -> String` | `AppConfig` |
| `light_theme` | `&self -> bool` | `AppConfig` |
| `render_options` | `&self -> Result<...>` | `AppConfig` |
| `print_markdown` | `&self, text -> Result<()>` | `AppConfig` |
| `rag_template` | `&self, embeddings, sources, text -> String` | `AppConfig` |
| `select_functions` | `&self, role -> Option<Vec<...>>` | `AppConfig` |
| `select_enabled_functions` | `&self, role -> Vec<...>` | `AppConfig` |
| `select_enabled_mcp_servers` | `&self, role -> Vec<...>` | `AppConfig` |

**Same pattern:** Add new methods on `AppConfig`, add `#[deprecated]` forwarding on `Config`, migrate callers, remove.

### Step 4: Migrate global-write methods

These modify serialized config settings (`.set` command, environment loading):

| Method | Notes |
|---|---|
| `set_wrap` | Modifies `self.wrap` |
| `update` | Generic key-value update of config settings |
| `load_envs` | Applies env var overrides |
| `load_functions` | Initializes function definitions |
| `load_mcp_servers` | Starts MCP servers |
| `setup_model` | Sets default model |
| `setup_document_loaders` | Sets default doc loaders |
| `setup_user_agent` | Sets user agent string |

The `load_*` / `setup_*` methods are initialization-only (called once in `Config::init`). They become part of `AppState` construction. `update` and `set_wrap` are runtime mutations of global config. For the API world, these should require a config reload.
For now, they can stay as methods that mutate `AppConfig` through interior mutability or require a mutable reference during REPL setup.

### Step 5: Migrate request-read methods to RequestContext

Pure reads of per-request state:

| Method | Notes |
|---|---|
| `state` | Returns flags for active role/session/agent/rag |
| `messages_file` | Path depends on active agent |
| `sessions_dir` | Path depends on active agent |
| `session_file` | Path depends on active agent |
| `rag_file` | Path depends on active agent |
| `info` | Reads current agent/session/role/rag |
| `role_info` | Reads current role |
| `session_info` | Reads current session |
| `agent_info` | Reads current agent |
| `agent_banner` | Reads current agent |
| `rag_info` | Reads current rag |
| `list_sessions` | Depends on sessions_dir (agent context) |
| `list_autoname_sessions` | Depends on sessions_dir |
| `is_compressing_session` | Reads session state |
| `role_like_mut` | Returns mutable ref to role-like |

### Step 6: Migrate request-write methods to RequestContext

Mutations of per-request state:

| Method | Notes |
|---|---|
| `use_prompt` | Sets temporary role |
| `use_role` / `use_role_obj` | Sets role on session or self |
| `exit_role` | Clears role |
| `edit_role` | Edits and re-applies role |
| `use_session` | Sets session |
| `exit_session` | Saves and clears session |
| `save_session` | Persists session |
| `empty_session` | Clears session messages |
| `set_save_session_this_time` | Session flag |
| `compress_session` / `maybe_compress_session` | Session compression |
| `autoname_session` / `maybe_autoname_session` | Session naming |
| `use_rag` / `exit_rag` / `edit_rag_docs` / `rebuild_rag` | RAG lifecycle |
| `use_agent` / `exit_agent` / `exit_agent_session` | Agent lifecycle |
| `apply_prelude` | Sets role/session from prelude config |
| `before_chat_completion` | Pre-LLM state updates |
| `after_chat_completion` | Post-LLM state updates |
| `discontinuous_last_message` | Message state |
| `init_agent_shared_variables` | Agent vars |
| `init_agent_session_variables` | Agent session vars |

### Step 6.5: Unify tool/MCP fields into `ToolScope` and agent fields into `AgentRuntime`

After Step 6, `RequestContext` has many flat fields that logically cluster into two sub-structs. This step collapses them and introduces two new services on `AppState` (`McpFactory` and `RagCache`).

**New types:**

```rust
pub struct ToolScope {
    pub functions: Functions,
    pub mcp_runtime: McpRuntime,
    pub tool_tracker: ToolCallTracker,
}

pub struct McpRuntime {
    servers: HashMap<McpServerKey, Arc<McpServerHandle>>,
}

pub struct AgentRuntime {
    pub spec: AgentSpec,
    pub rag: Option<Arc<Rag>>,       // shared across siblings of same type
    pub supervisor: Supervisor,
    pub inbox: Arc<Inbox>,
    pub escalation_queue: Arc<...>,  // shared from root
    pub todo_list: Option<TodoList>, // Some(...) only when auto_continue: true
    pub self_agent_id: String,
    pub parent_supervisor: Option<Arc<Supervisor>>,
    pub current_depth: usize,
    pub auto_continue_count: usize,
}
```

**New services on `AppState`:**

```rust
pub struct AppState {
    pub config: Arc<AppConfig>,
    pub vault: GlobalVault,
    pub mcp_factory: Arc<McpFactory>,
    pub rag_cache: Arc<RagCache>,
}

pub struct McpFactory {
    // idle pool + reaper added in Phase 5; Step 6.5 ships the no-pool version
    active: Mutex<HashMap<McpServerKey, Weak<McpServerHandle>>>,
}

impl McpFactory {
    pub async fn acquire(&self, key: &McpServerKey) -> Result<Arc<McpServerHandle>>;
}

pub struct RagCache {
    entries: RwLock<HashMap<RagKey, Weak<Rag>>>,
}

#[derive(Hash, Eq, PartialEq, Clone, Debug)]
pub enum RagKey {
    Named(String), // standalone: rags/<name>.yaml
    Agent(String), // agent-owned: agents/<name>/rag.yaml
}

impl RagCache {
    pub async fn load(&self, key: &RagKey) -> Result<Arc<Rag>>;
    pub fn invalidate(&self, key: &RagKey);
}
```

**`RequestContext` after collapse:**

```rust
pub struct RequestContext {
    pub app: Arc<AppState>,
    pub macro_flag: bool,
    pub info_flag: bool,
    pub working_mode: WorkingMode,
    pub model: Model,
    pub agent_variables: Option<...>,
    pub role: Option<Role>,
    pub session: Option<Session>,
    pub rag: Option<Arc<Rag>>,               // session/standalone RAG, not agent RAG
    pub agent: Option<Agent>,
    pub last_message: Option<...>,
    pub tool_scope: ToolScope,               // replaces functions + tool_call_tracker + global mcp_registry
    pub agent_runtime: Option<AgentRuntime>, // replaces supervisor + inbox + escalation_queue + todo
                                             // + self_id + parent + depth; holds shared agent RAG
}
```

**What this step does:**

1. Implement `McpRuntime` and `ToolScope`.
2. Implement `McpFactory` — **no pool, no idle handling, no reaper.** `acquire()` checks `active` for an upgradable `Weak`, otherwise spawns fresh. `Drop` on `McpServerHandle` tears down the subprocess directly. Pooling lands in Phase 5.
3. Implement `RagCache` with the `RagKey` enum, weak-ref sharing, and per-key serialization for concurrent first-load.
4. Implement `AgentRuntime` with the shape above. `todo_list` is `Option<TodoList>` — only allocated when `agent.spec.auto_continue == true`. `rag` is served from `RagCache` during activation via `RagKey::Agent(name)`.
5. Rewrite scope transitions (`use_role`, `use_session`, `use_agent`, `exit_*`, `Config::update`) to:
   - Resolve the effective enabled-tool / enabled-MCP-server list using priority `Agent > Session > Role > Global`
   - Build a fresh `McpRuntime` by calling `McpFactory::acquire()` for each required server key
   - Construct a new `ToolScope` wrapping the runtime + resolved `Functions`
   - Swap `ctx.tool_scope` atomically
6. `use_rag` (the standalone / `.rag <name>` path) is rewritten to call `app.rag_cache.load(RagKey::Named(name))` and assign the result to `ctx.rag`. No role/session RAG changes, because roles/sessions do not own RAG.
7. Agent activation additionally:
   - Calls `app.rag_cache.load(RagKey::Agent(agent_name))` and stores the returned `Arc<Rag>` in the new `AgentRuntime.rag`
   - Allocates `todo_list: Some(TodoList::default())` only when `auto_continue: true`; otherwise `None`
   - Constructs the `AgentRuntime` and assigns it to `ctx.agent_runtime`
   - **Preserves today's clobber behavior for standalone RAG:** does NOT save `ctx.rag` anywhere. When the agent exits, the user's previous `.rag <name>` selection is not restored (matches current behavior).
   Stacking / restoration is flagged as a Phase 2+ enhancement.
8. `exit_agent` drops `ctx.agent_runtime` (which drops the agent's `Arc<Rag>`; the cache entry becomes evictable if no other scope holds it) and rebuilds `ctx.tool_scope` from the now-topmost `RoleLike`.
9. Sub-agent spawning (in `function/supervisor.rs`) constructs a fresh `RequestContext` for the child from the shared `AppState`:
   - Its own `ToolScope` via `McpFactory::acquire()` calls for the child agent's `mcp_servers`
   - Its own `AgentRuntime` with:
     - `rag` via `rag_cache.load(RagKey::Agent(child_agent_name))` — shared with parent/siblings of same type
     - Fresh `Supervisor`, fresh `Inbox`, `current_depth = parent.depth + 1`
     - `parent_supervisor = Some(parent.supervisor.clone())` (for messaging)
     - `escalation_queue = parent.escalation_queue.clone()` — one queue, rooted at the human
     - `todo_list` honoring the child's own `auto_continue` flag
10. Old `Agent::init` logic that mutates a global `McpRegistry` is removed — that's now `McpFactory::acquire()` producing scope-local handles.
11. `rebuild_rag` and `edit_rag_docs` are updated to determine the correct `RagKey` (check `ctx.agent_runtime` first — if present use `RagKey::Agent(spec.name)`, otherwise use the standalone name from `ctx.rag`'s origin) and call `rag_cache.invalidate(&key)` before reloading.

**What this step preserves:**

- **Diff-based reinit for REPL users** — when you `.exit role` from `[github, jira]` back to global `[github]`, the new `ToolScope` is built by calling `McpFactory::acquire("github")`. Without pooling (Phase 1), this respawns `github`. With pooling (Phase 5), `github`'s `Arc` may be held by no one, but the idle pool keeps the process warm, so revival is instant. The Phase 1 version has a mild regression here that Phase 5 fixes.
- **Agent-vs-non-agent compatibility** — today's `Agent::init` reinits a global registry; after this step, agent activation replaces `ctx.tool_scope` with an agent-specific one, and `exit_agent` restores the pre-agent scope by rebuilding from the (now-active) role/session/global lists.
- **Todo semantics from the user's perspective** — today's behavior is "todos are available when `auto_continue: true`". After Step 6.5, it's still "todos are available when `auto_continue: true`" — the only difference is that we skip the wasted `TodoList::default()` allocation for the other agents.

**Risk:** Medium–high. This is where the Phase 1 refactor stops being mechanical and starts having semantic implications. Six things to watch:

1. **Parent scope restoration on `exit_agent`.** Today, `exit_agent` tears down the agent's MCP set but leaves the registry in whatever state `reinit` put it — the parent's original MCP set is NOT restored. Users don't notice because the next scope activation (or REPL exit) reinits anyway. In the new design, `exit_agent` MUST rebuild the parent's `ToolScope` from the still-active role/session/global lists so the user sees the expected state. Test this carefully.
2. **`McpFactory` contention.** With many concurrent sub-agents (say, 4 siblings each needing different MCP sets), the factory's mutex could become a bottleneck during `acquire()`. Hold the lock only while touching `active`, never while awaiting a subprocess spawn.
3. **`RagCache` concurrent first-load.** If two consumers request the same `RagKey` simultaneously and neither finds a cached entry, both will try to `Rag::load` from disk. Use a per-key `tokio::sync::Mutex` or `OnceCell` to serialize the first load — the second caller blocks briefly and receives the shared Arc. This applies equally to standalone and agent RAGs.
4. **Weak ref staleness in `RagCache`.** The `Weak<Rag>` in the map might point to an already-dropped `Rag`.
   The `load()` path MUST attempt `Weak::upgrade()` before returning; if the upgrade fails, treat it as a miss and reload.
5. **`rebuild_rag` / `edit_rag_docs` race.** If a user runs `.rag rebuild` while another scope holds the same `Arc<Rag>` (concurrent API session, running sub-agent, etc.), the cache invalidation must NOT yank the Arc out from under the active holder. The holder's `Arc` keeps its reference alive — invalidation just ensures *future* loads read fresh. This is the correct behavior for both standalone and agent RAG; worth confirming in tests.
6. **Identifying the right `RagKey` during rebuild.** `rebuild_rag` today operates on `Config.rag` without knowing its origin. In the new model, the code needs to check `ctx.agent_runtime` first to determine whether the active RAG is agent-owned (`RagKey::Agent`) or standalone (`RagKey::Named`). Get this wrong and you invalidate the wrong cache entry, silently breaking subsequent loads.

### Step 7: Tackle mixed methods (THE HARD PART)

The methods below conditionally read global config OR per-request state depending on what's active. They need to be split into explicit parameter passing.
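The read-side precedence these splits must encode (request scopes first, then the app-wide default) can be sketched with toy types; everything here is an illustrative stand-in for the real `current_model`:

```rust
// Toy stand-ins for the split types (the real fields are Role/Session/Agent
// structs, not bare strings).
pub struct AppConfig {
    pub model_id: String,
}

pub struct RequestContext {
    pub agent_model: Option<String>,
    pub session_model: Option<String>,
    pub role_model: Option<String>,
}

// Mixed-read pattern: check the request context's active scopes in priority
// order, fall back to the app-wide default.
pub fn current_model<'a>(app: &'a AppConfig, ctx: &'a RequestContext) -> &'a str {
    ctx.agent_model
        .as_deref()
        .or(ctx.session_model.as_deref())
        .or(ctx.role_model.as_deref())
        .unwrap_or(&app.model_id)
}
```

The key property of the new signatures is visible even in the toy: the function can no longer secretly consult global state, because both halves arrive as explicit parameters.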
| Method | Why it's mixed | Refactoring approach |
|---|---|---|
| `current_model` | Returns agent model, session model, role model, or global model | Take `(&AppConfig, &RequestContext) -> &Model` — check ctx first, fall back to app |
| `extract_role` | Builds role from session/agent/role or global settings | Take `(&AppConfig, &RequestContext) -> Role` |
| `sysinfo` | Reads global settings + current rag/session/agent | Take `(&AppConfig, &RequestContext) -> String` |
| `set_temperature` | Sets on role-like or global | Split: `ctx.set_temperature()` for role-like, `app.set_temperature()` for global |
| `set_top_p` | Same pattern as temperature | Same split |
| `set_enabled_tools` | Same pattern | Same split |
| `set_enabled_mcp_servers` | Same pattern | Same split |
| `set_save_session` | Sets on session or global | Same split |
| `set_compression_threshold` | Sets on session or global | Same split |
| `set_rag_reranker_model` | Sets on rag or global | Same split |
| `set_rag_top_k` | Sets on rag or global | Same split |
| `set_max_output_tokens` | Sets on role-like model or global model | Same split |
| `set_model` | Sets on role-like or global | Same split |
| `retrieve_role` | Loads role, merges with current model settings | Take `(&AppConfig, &RequestContext, name) -> Role` |
| `use_role_safely` | Takes GlobalConfig, does take/replace pattern | Refactor to `(&mut RequestContext, name)` |
| `use_session_safely` | Takes GlobalConfig, does take/replace pattern | Refactor to `(&mut RequestContext, name)` |
| `save_message` | Reads `save` flag (global) + writes to messages_file (agent-dependent path) | Take `(&AppConfig, &RequestContext, input, output)` |
| `render_prompt_left/right` | Reads prompt format (global) + current model/session/agent/role (request) | Take `(&AppConfig, &RequestContext) -> String` |
| `generate_prompt_context` | Same as prompt rendering | Take `(&AppConfig, &RequestContext) -> HashMap<...>` |
| `repl_complete` | Reads both global config and request state for completions | Take `(&AppConfig, &RequestContext, cmd, args) -> Vec<...>` |

**Common pattern for `set_*` methods:** The current code does something like:

```rust
fn set_temperature(&mut self, value: Option<f64>) {
    if let Some(role_like) = self.role_like_mut() {
        role_like.set_temperature(value);
    } else {
        self.temperature = value;
    }
}
```

This becomes:

```rust
// On RequestContext:
fn set_temperature(&mut self, value: Option<f64>, app_defaults: &AppConfig) {
    if let Some(role_like) = self.role_like_mut() {
        role_like.set_temperature(value);
    }
    // Global default mutation goes through a separate path if needed
}
```

### Step 8: The Caller Migration Epic (absorbed scope from Steps 6.5 and 7)

**Important:** the original plan described Step 8 as "rewrite main.rs and repl/mod.rs entry points." During implementation, Steps 6.5 and 7 deliberately deferred their semantic rewrites to Step 8 so the bridge pattern (add new methods alongside old, don't migrate callers yet) stayed consistent. As a result, Step 8 now absorbs:

- **Original Step 8 scope:** entry point rewrite (`main.rs`, `repl/mod.rs`)
- **From Step 6.5 deferrals:** `McpFactory::acquire()` implementation, scope transition rewrites (`use_role`, `use_session`, `use_agent`, `exit_agent`), RAG lifecycle via `RagCache` (`use_rag`, `edit_rag_docs`, `rebuild_rag`), session compression/autoname, `apply_prelude`, sub-agent spawning
- **From Step 7 deferrals:** the `Model::retrieve_model` client module refactor, `retrieve_role`, `set_model`, `repl_complete`, `setup_model`, the `update` dispatcher, `set_rag_reranker_model`, `set_rag_top_k`, `use_role_safely`/`use_session_safely` elimination, `use_prompt`, `edit_role`

This is a large amount of work. Step 8 is split into **8 sub-steps (8a–8h)** for clarity and reviewability. Each sub-step keeps the build green and can be completed as a standalone unit.
**Dependency graph between sub-steps:**

```
8a  client module refactor (Model::retrieve_model,
 │  list_models, list_all_models! → &AppConfig)
 ▼
8b  remaining Step 7 deferrals (retrieve_role, set_model,
 │  setup_model, use_prompt, edit_role, update, etc.)
 ▼
8c  McpFactory::acquire()
 │
 ▼
8d  scope transitions
 │
 ▼
8e  RAG + session
 │
 ▼
8f + 8g  caller migration (main.rs, REPL)
 │
 ▼
8h  sweep — remaining callsites (priority-ordered)
```

---

#### Step 8a: Client module refactor — `Model::retrieve_model` takes `&AppConfig`

Target: remove the `&Config` dependency from the LLM client infrastructure so Step 8b's mixed-method migrations (`retrieve_role`, `set_model`, `repl_complete`, `setup_model`) can proceed.

**Files touched:**

- `src/client/model.rs` — `Model::retrieve_model(config: &Config, ...)` → `Model::retrieve_model(config: &AppConfig, ...)`
- `src/client/macros.rs` — the `list_all_models!` macro takes `&AppConfig` instead of `&Config`
- `src/client/*.rs` — `list_models` and helper functions updated to take `&AppConfig`
- Any callsite in `src/config/`, `src/main.rs`, `src/repl/`, etc. that calls these client functions — updated to pass `&config.app_config_snapshot()` or equivalent during the bridge window

**Bridge strategy:** add a helper `Config::app_config_snapshot(&self) -> AppConfig` that clones the serialized fields into an `AppConfig`. Callsites that currently pass `&*config.read()` pass `&config.read().app_config_snapshot()` instead. This is slightly wasteful (clones ~40 fields per call) but keeps the bridge window working without a mass caller rewrite.
Step 8f/8g will eliminate the clones when callers hold `Arc<AppConfig>` directly.

**Verification:** full build green. All existing tests pass. CLI/REPL manual smoke test: `loki --model openai:gpt-4o "hello"` still works.

**Risk:** Low. Mechanical refactor. The bridge helper absorbs the signature change cost.

---

#### Step 8b: Finish Step 7's deferred mixed-method migrations

With Step 8a done, the methods that transitively depended on `Model::retrieve_model(&Config)` can now migrate to `RequestContext` with `&AppConfig` parameters.

**Methods migrated to `RequestContext`:**

- `retrieve_role(&self, app: &AppConfig, name: &str) -> Result<Role>`
- `set_model_on_role_like(&mut self, app: &AppConfig, model_id: &str) -> Result<()>` (paired with `AppConfig::set_model_default`)
- `repl_complete(&self, app: &AppConfig, cmd: &str, args: &[&str]) -> Vec<(String, Option<String>)>`
- `setup_model(&mut self, app: &AppConfig) -> Result<()>` — note that `setup_model` writes to `self.model_id` (serialized) AND `self.model` (runtime). The split: `AppConfig::ensure_default_model_id()` picks the first available model and updates `self.model_id`; `RequestContext::reload_current_model(&AppConfig)` refreshes `ctx.model` from the app config's id.
- `use_prompt(&mut self, app: &AppConfig, prompt: &str) -> Result<()>` — a trivial wrapper around `extract_role` (already done) + `use_role_obj` (Step 6)
- `edit_role(&mut self, app: &AppConfig, abort_signal: AbortSignal) -> Result<()>` — calls `app.editor()`, `upsert_role`, and `use_role` (still deferred to 8d)

**RAG-related deferrals:**

- `set_rag_reranker_model` and `set_rag_top_k` get split: the runtime branch (update the active `Rag` through its shared `Arc`) becomes a `RequestContext` method, and the global branch becomes `AppConfig::set_rag_reranker_model_default` / `AppConfig::set_rag_top_k_default`.
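One such split pair can be sketched with toy types (names like `set_rag_top_k_default` follow the convention above, but the bodies are illustrative; the real runtime branch goes through the shared `Arc<Rag>` rather than a plain struct):

```rust
// Global half: mutates the AppConfig default (toy stand-in).
pub struct AppConfig {
    pub rag_top_k: usize,
}

impl AppConfig {
    pub fn set_rag_top_k_default(&mut self, v: usize) {
        self.rag_top_k = v;
    }
}

// Runtime half: mutates the active RAG on the request context, if any.
pub struct Rag {
    pub top_k: usize,
}

pub struct RequestContext {
    pub rag: Option<Rag>,
}

impl RequestContext {
    // Returns true if an active RAG absorbed the change; the caller falls
    // back to the AppConfig default path otherwise.
    pub fn set_rag_top_k(&mut self, v: usize) -> bool {
        match self.rag.as_mut() {
            Some(rag) => {
                rag.top_k = v;
                true
            }
            None => false,
        }
    }
}
```

The `update` dispatcher then becomes the only place that knows the "try runtime, fall back to global" wiring, instead of every `set_*` method re-implementing it.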
**`update` dispatcher:** Once all the individual `set_*` methods exist on both types, `update` migrates to `RequestContext::update(&mut self, app: &mut AppConfig, data: &str) -> Result<()>`. The dispatcher's body becomes a match that calls the appropriate split pair for each key. **`use_role_safely` / `use_session_safely`:** Still not eliminated in 8b — they're wrappers around the still-`Config`-based `use_role` and `use_session`. Eliminated in 8f/8g when callers switch to `&mut RequestContext`. **Verification:** full build green. All tests pass. Smoke test: `.set temperature 0.7`, `.set enabled_tools fs`, `.model openai:gpt-4o` all work in REPL. **Risk:** Low. Same bridge pattern, now unblocked by 8a. --- #### Step 8c: Extract `McpFactory::acquire()` from `McpRegistry::init_server` Target: give `McpFactory` a working `acquire()` method so Step 8d can build real `ToolScope` instances. **Files touched:** - `src/mcp/mod.rs` — extract the MCP subprocess spawn + rmcp handshake logic (currently inside `McpRegistry::init_server`, ~60 lines) into a standalone function: ```rust pub(crate) async fn spawn_mcp_server( spec: &McpServer, log_path: Option<&Path>, abort_signal: &AbortSignal, ) -> Result ``` `McpRegistry::init_server` then calls this helper and does its own bookkeeping. Backward-compatible for bridge callers. - `src/config/mcp_factory.rs` — implement `McpFactory::acquire(spec: &McpServer, log_path, abort_signal) -> Result<Arc<…>>`: 1. Build an `McpServerKey` from the spec 2. Try `self.try_get_active(&key)` → share if upgraded 3. Otherwise call `spawn_mcp_server(spec, ...).await` → wrap in `Arc` → `self.insert_active(key, &arc)` → return - Write a couple of integration tests that exercise the factory's sharing behavior with a mock server spec (or document why a real integration test needs Phase 5's pooling work)
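A synchronous sketch of the sharing logic inside `acquire()` (the `McpServerHandle` name is invented here; the real method is async, calls `spawn_mcp_server`, and must not hold the `active` lock across that spawn):

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, Weak};

// Invented stand-in for the running-server type returned by spawn_mcp_server.
pub struct McpServerHandle {
    pub name: String,
}

#[derive(Default)]
pub struct McpFactory {
    // Weak entries: a server stays alive only while some ToolScope holds it.
    active: Mutex<HashMap<String, Weak<McpServerHandle>>>,
}

impl McpFactory {
    pub fn acquire(&self, name: &str) -> Arc<McpServerHandle> {
        let mut active = self.active.lock().unwrap();
        // 1.-2. Share an already-running server if its Weak still upgrades.
        if let Some(existing) = active.get(name).and_then(|weak| weak.upgrade()) {
            return existing;
        }
        // 3. Otherwise "spawn" a new one and record a weak handle to it.
        //    (The real async code must release the lock before spawning.)
        let handle = Arc::new(McpServerHandle { name: name.to_string() });
        active.insert(name.to_string(), Arc::downgrade(&handle));
        handle
    }
}
```

Because the map stores `Weak` handles, dropping the last `ToolScope` that holds a server lets it shut down, and a later `acquire` respawns it.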
**Verification:** new unit tests pass. Existing tests pass. `McpRegistry` still works for all current callers. **Risk:** Medium. The spawn logic is intricate (child process + stdio handshake + error recovery). Extracting without a behavior change requires careful diff review. --- #### Step 8d: Scope transition rewrites — `use_role`, `use_session`, `use_agent`, `exit_agent` Target: build real `ToolScope` instances via `McpFactory` when scopes change. This is where Step 6.5's scaffolding stops being scaffolding. **New methods on `RequestContext`:** - `use_role(&mut self, app: &AppConfig, name: &str, abort_signal: AbortSignal) -> Result<()>`: 1. Call `self.retrieve_role(app, name)?` (from 8b) 2. Resolve the role's `enabled_mcp_servers` list 3. Build a fresh `ToolScope` by calling `app.mcp_factory.acquire(spec, ...)` for each required server 4. Populate `ctx.tool_scope.functions` with the role's effective function list via `select_functions(app, &role)` 5. Swap `ctx.tool_scope` atomically 6. Call `self.use_role_obj(role)` (from Step 6) - `use_session(&mut self, app: &AppConfig, session_name: Option<&str>, abort_signal) -> Result<()>` — same pattern, with session-specific handling for `agent_session_variables` - `use_agent(&mut self, app: &AppConfig, agent_name: &str, session_name: Option<&str>, abort_signal) -> Result<()>` — builds an `AgentRuntime` (Step 6.5 scaffolding), populates `ctx.agent_runtime`, activates the optional inner session - `exit_agent(&mut self, app: &AppConfig) -> Result<()>` — drops `ctx.agent_runtime`, rebuilds `ctx.tool_scope` from the now-topmost RoleLike (role/session/global), cancels the supervisor, clears RAG if it came from the agent **Key invariant: parent scope restoration on `exit_agent`.** Today's `Config::exit_agent` leaves the `McpRegistry` in whatever state the agent left it. 
The new `exit_agent` explicitly rebuilds `ctx.tool_scope` from the current role/session/global enabled-server lists so the user sees the expected state after exiting an agent. This is a semantic improvement over today's behavior (which technically has a latent bug that nobody notices because the next scope activation fixes it). **What this step does NOT do:** no caller migration. `Config::use_role`, `Config::use_session`, etc. are still on `Config` and still work for existing callers. The `_safely` wrappers are still around. **Verification:** new `RequestContext::use_role` etc. have unit tests. Full build green. Existing tests pass. No runtime behavior change because nothing calls the new methods yet. **Risk:** Medium–high. This is the first time `McpFactory::acquire()` is exercised outside unit tests. Specifically watch: - **`McpFactory` mutex contention** — hold the `active` lock only during HashMap mutation, never across subprocess spawn or `await` - **Parent scope restoration correctness** — write a targeted test that activates an agent with `[github]`, exits, activates a role with `[jira]`, and verifies the tool scope has only `jira` (not `github` leftover) --- #### Step 8e: RAG lifecycle + session compression + `apply_prelude` Target: migrate the Category C deferrals from Step 6 (session/RAG lifecycle methods that currently take `&GlobalConfig`). 
**New methods on `RequestContext`:** - `use_rag(&mut self, app: &AppConfig, name: Option<&str>, abort_signal) -> Result<()>` — routes through `app.rag_cache.load(RagKey::Named(name))` - `edit_rag_docs(&mut self, app: &AppConfig, abort_signal) -> Result<()>` — determines the `RagKey` (Agent or Named) from `ctx.agent_runtime` / `ctx.rag` origin, calls `app.rag_cache.invalidate(&key)`, reloads - `rebuild_rag(&mut self, app: &AppConfig, abort_signal) -> Result<()>` — same pattern as `edit_rag_docs` - `compress_session(&mut self, app: &AppConfig) -> Result<()>` — reads `app.summarization_prompt`, `app.summary_context_prompt`, mutates `ctx.session`. Async, does an LLM call via an existing `Input::from_str` pattern. - `maybe_compress_session(&mut self, app: &AppConfig) -> bool` — checks `ctx.session.needs_compression(app.compression_threshold)`, triggers compression if so. Returns whether compression was triggered; caller decides whether to spawn a background task (the task spawning moves to the caller's responsibility, not the method's). - `autoname_session(&mut self, app: &AppConfig) -> Result<()>` — same pattern, uses `CREATE_TITLE_ROLE` and `Input::from_str` - `maybe_autoname_session(&mut self, app: &AppConfig) -> bool` — same return-bool pattern - `apply_prelude(&mut self, app: &AppConfig, abort_signal) -> Result<()>` — parses `app.repl_prelude` / `app.cmd_prelude`, calls the new `self.use_role()` / `self.use_session()` from 8d **The `GlobalConfig`-taking static methods go away.** Today's code uses the pattern `Config::maybe_compress_session(config: GlobalConfig)` which takes an owned `Arc<RwLock<Config>>` and spawns a background task. After 8e, the new `RequestContext::maybe_compress_session` returns a bool; callers that want async compression spawn the task themselves with their own `RequestContext`. This is simpler and more explicit. **Verification:** new methods have unit tests where feasible. Full build green.
`compress_session` and `autoname_session` are tricky to unit-test because they do LLM calls; mock the LLM or skip the full path in tests. **Risk:** Medium. The session compression flow is the most behavior-sensitive — getting the semantics wrong here results in lost session history. Write a targeted integration test that feeds 10+ user messages into a session, triggers compression, and verifies the session's summary is preserved. --- #### Step 8f: Entry point rewrite — `main.rs` Target: rewrite `main.rs` to construct `AppState` + `RequestContext` explicitly instead of using `GlobalConfig`. **Specific changes:** - `Config::init()` → `AppState::init()` which: 1. Loads `config.yaml` into `AppConfig` 2. Applies environment variable overrides (calls `AppConfig::load_envs` from Step 4) 3. Calls `AppConfig::setup_document_loaders` / `AppConfig::setup_user_agent` (Step 4) 4. Constructs the `Vault`, `McpFactory`, `RagCache` 5. Returns `Arc<AppState>` - `main::run()` constructs a `RequestContext` from the `AppState` and threads it through to subcommands - `main::start_directive(ctx: &mut RequestContext, ...)` — signature change - `main::create_input(ctx: &RequestContext, ...)` — signature change - `main::shell_execute(ctx: &mut RequestContext, ...)` — signature change - All 18 `main.rs` callsites updated **`load_functions` and `load_mcp_servers`:** These are initialization-time methods that populate `ctx.tool_scope.functions` and `ctx.tool_scope.mcp_runtime`. They move from `Config` to a new `RequestContext::bootstrap_tools(&mut self, app: &AppConfig, abort_signal) -> Result<()>` that: 1. Initializes `Functions` via `Functions::init(visible_tools)` (existing code) 2. Resolves the initial enabled-MCP-server list from `app.enabled_mcp_servers` 3. Calls `app.mcp_factory.acquire()` for each 4. Assigns the result to `ctx.tool_scope` This replaces the `Config::load_functions` + `Config::load_mcp_servers` call sequence in today's `main.rs`.
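A trimmed-down sketch of the `bootstrap_tools` sequence (the `Functions` initialization and `McpFactory` calls are stubbed out; only the four-step shape is taken from the plan, and these struct definitions are assumptions):

```rust
// Hypothetical trimmed-down types: only the call sequence mirrors the plan.
#[derive(Default)]
pub struct Functions {
    pub names: Vec<String>,
}

pub struct AppConfig {
    pub visible_tools: Option<Vec<String>>,
    pub enabled_mcp_servers: Vec<String>,
}

#[derive(Default)]
pub struct ToolScope {
    pub functions: Functions,
    pub mcp_servers: Vec<String>, // stands in for running server handles
}

#[derive(Default)]
pub struct RequestContext {
    pub tool_scope: ToolScope,
}

impl RequestContext {
    /// Replaces today's Config::load_functions + Config::load_mcp_servers pair.
    pub fn bootstrap_tools(&mut self, app: &AppConfig) -> Result<(), String> {
        // 1. Initialize function declarations (Functions::init in the real code).
        let functions = Functions {
            names: app.visible_tools.clone().unwrap_or_default(),
        };
        // 2. Resolve the initial enabled-MCP-server list.
        // 3. The real code calls app.mcp_factory.acquire() for each entry.
        let mcp_servers = app.enabled_mcp_servers.clone();
        // 4. Assign the assembled scope in one step.
        self.tool_scope = ToolScope { functions, mcp_servers };
        Ok(())
    }
}
```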
**Verification:** CLI smoke tests from the original plan's Step 8 verification checklist. Specifically: - `loki "hello"` — plain prompt - `loki --role explain "what is TCP"` — role activation - `loki --session my-project "..."` — session - `loki --agent sisyphus "..."` — agent activation - `loki --info` — sysinfo output Each should produce output matching the pre-Step-8 behavior exactly. **Risk:** High. `main.rs` is the primary entry point; any regression here is user-visible. Write smoke tests that compare CLI output byte-for-byte with a recorded baseline. --- #### Step 8g: REPL rewrite — `repl/mod.rs` Target: rewrite `repl/mod.rs` to use `&mut RequestContext` instead of `GlobalConfig`. **Specific changes:** - `Repl` struct: `config: GlobalConfig` → `ctx: RequestContext` (long-lived, mutable across turns) - `run_repl_command(ctx: &mut RequestContext, ...)` — signature change - `ask(ctx: &mut RequestContext, ...)` — signature change - Every dot-command handler updated. Dot-commands that take the `GlobalConfig` pattern (like the `_safely` wrappers) are **eliminated** — they just call `ctx.use_role(...)` directly. - All 39 command handlers migrated - All 12 `repl/mod.rs` internal callsites updated **`use_role_safely` / `use_session_safely` elimination:** these wrappers exist only because `Config::use_role` is `&mut self` and the REPL holds `Arc>`. After Step 8g, the REPL holds `RequestContext` directly (no lock), so the wrappers are no longer needed and get deleted. **Verification:** REPL smoke tests matching the pre-Step-8 behavior. Specifically: - Start REPL, issue a prompt → should see same output - `.role explain`, `.session my-session`, `.agent sisyphus`, `.exit agent` — should all work identically - `.set temperature 0.7` then `.info` — should show updated temperature - Ctrl-C during an LLM call — should cleanly abort - `.macro run-tests` — should execute without errors **Risk:** High. Same reason as 8f — this is a user-visible entry point. 
Test every dot-command. --- #### Step 8h: Remaining callsite sweep Target: migrate the remaining modules in priority order (lowest callsite count first, keeping the build green after each module): | Priority | Module | Callsites | Notes | |---|---|---|---| | 1 | `render/mod.rs` | 1 | `render_stream` just reads config — trivial | | 2 | `repl/completer.rs` | 1 | Just reads for completions | | 3 | `repl/prompt.rs` | 1 | Just reads for prompt rendering | | 4 | `function/user_interaction.rs` | 1 | Just reads for user prompts | | 5 | `function/mod.rs` | 2 | `eval_tool_calls` reads config | | 6 | `config/macros.rs` | 3 | `macro_execute` reads and writes | | 7 | `function/todo.rs` | 4 | Todo handlers read/write agent state | | 8 | `config/input.rs` | 6 | Input creation — reads config | | 9 | `rag/mod.rs` | 6 | RAG init/search | | 10 | `function/supervisor.rs` | 8 | Sub-agent spawning — complex | | 11 | `config/agent.rs` | 12 | Agent init — complex, many mixed concerns | **Sub-agent spawning** (`function/supervisor.rs`) is the most complex item in the sweep. Each child agent gets a fresh `RequestContext` forked from the parent's `Arc<AppState>`: - Own `ToolScope` built by calling `app.mcp_factory.acquire()` for the child's `mcp_servers` list - Own `AgentRuntime` with fresh supervisor, fresh inbox, `current_depth = parent.depth + 1` - `parent_supervisor = Some(parent.agent_runtime.supervisor.clone())` — linked back to the parent for messaging - `escalation_queue = parent.agent_runtime.escalation_queue.clone()` — `Arc`-shared from root - RAG served via `app.rag_cache.load(RagKey::Agent(child_name))` — shared with any sibling of the same type `config/agent.rs` — `Agent::init` is currently tightly coupled to `Config`. It needs to be rewritten to take `&AppState` + `&mut RequestContext`. Some of its complexity (MCP server startup, RAG loading) moves into `RequestContext::use_agent` from Step 8d; `Agent::init` becomes just the spec-loading portion.
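The parent/child sharing rules can be sketched like this (`fork_for_sub_agent` is an invented name, and `Supervisor` / `AgentRuntime` are reduced to the fields under discussion; inbox, `ToolScope`, and RAG wiring are omitted):

```rust
use std::sync::{Arc, Mutex};

// Reduced stand-ins for the real supervisor and runtime types.
pub struct Supervisor {
    pub agent: String,
}

pub struct AgentRuntime {
    pub supervisor: Arc<Supervisor>,
    pub parent_supervisor: Option<Arc<Supervisor>>,
    pub current_depth: u32,
    pub escalation_queue: Arc<Mutex<Vec<String>>>,
}

pub struct RequestContext {
    pub agent_runtime: AgentRuntime,
}

impl RequestContext {
    /// Fork a child context for a sub-agent: fresh supervisor and depth,
    /// parent linked for messaging, escalation queue Arc-shared from the root.
    pub fn fork_for_sub_agent(&self, child_agent: &str) -> RequestContext {
        let parent = &self.agent_runtime;
        RequestContext {
            agent_runtime: AgentRuntime {
                supervisor: Arc::new(Supervisor { agent: child_agent.to_string() }),
                parent_supervisor: Some(parent.supervisor.clone()),
                current_depth: parent.current_depth + 1,
                escalation_queue: parent.escalation_queue.clone(),
            },
        }
    }
}
```

Fresh state (supervisor, depth) is constructed per child; the escalation queue is the one `Arc` shared all the way from the root.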
**Verification:** after each module migrates, run full `cargo check` + `cargo test`. After all modules migrate, run the full smoke test suite from 8f and 8g. **Risk:** Medium. The sub-agent spawning and `config/agent.rs` work is complex, but the bridge pattern means we can take each module independently. --- ### Step 9: Remove the Bridge After Step 8h, no caller references `GlobalConfig` or calls `Config`-based methods that have `RequestContext` / `AppConfig` equivalents. Step 9 is the cleanup: 1. Delete `src/config/bridge.rs` (the conversion methods added in Step 1) 2. Delete the `#[allow(dead_code)]` attributes from `AppConfig` impl blocks (now that callers use them) 3. Delete the `#[allow(dead_code)]` attributes from `RequestContext` impl blocks 4. Remove the flat runtime fields from `RequestContext` that have been superseded by `tool_scope` and `agent_runtime` (i.e., delete `functions`, `tool_call_tracker`, `supervisor`, `parent_supervisor`, `self_agent_id`, `current_depth`, `inbox`, `root_escalation_queue` from `RequestContext`; they now live inside `ToolScope` and `AgentRuntime`) 5. Run the full test suite This step is mostly mechanical deletion. The one careful part is removing the flat fields — make sure no remaining code path still reads them. `cargo check` will catch any stragglers. ### Step 10: Remove the old Config struct and GlobalConfig type alias Once Step 9 is done and the tree is clean: 1. Delete all Step 2-7 method definitions on `Config` (the ones that got duplicated to `paths` / `AppConfig` / `RequestContext`) 2. Delete the `#[serde(skip)]` runtime fields on `Config` (they're all on `RequestContext` now) 3. At this point `Config` should be nearly empty — possibly just the serialized fields 4. Delete `Config` entirely (or rename it to `RawConfig` if it's still useful as a serde DTO, though `AppConfig` already plays that role) 5. Delete the `pub type GlobalConfig = Arc<RwLock<Config>>` type alias 6. Run `cargo check`, `cargo test`, `cargo clippy` — all clean 7.
Run the full manual smoke test suite one more time Phase 1 complete. --- ## Callsite Migration Summary | Module | Functions to Migrate | Handled In | |---|---|---| | `config/mod.rs` | 120 methods (30 static, 10 global-read, 8 global-write, 35 request-read/write, 17 mixed) | Steps 2-7 (mechanical duplication), Step 10 (deletion) | | `client/` macros and `model.rs` | `Model::retrieve_model`, `list_all_models!`, `list_models` | Step 8a | | `main.rs` | `run`, `start_directive`, `shell_execute`, `create_input`, `apply_prelude_safely` | Step 8f | | `repl/mod.rs` | `run_repl_command`, `ask`, plus 39 command handlers | Step 8g | | `config/agent.rs` | `Agent::init`, agent lifecycle methods | Step 8h (partial) + Step 8d (scope transitions) | | `function/supervisor.rs` | Sub-agent spawning, task management | Step 8h | | `config/input.rs` | `Input::from_str`, `from_files`, `from_files_with_spinner` | Step 8h | | `rag/mod.rs` | RAG init, load, search | Step 8e (lifecycle) + Step 8h (remaining) | | `mcp/mod.rs` | `McpRegistry::init_server` spawn logic extraction | Step 8c | | `function/mod.rs` | `eval_tool_calls` | Step 8h | | `function/todo.rs` | Todo handlers | Step 8h | | `function/user_interaction.rs` | User prompt handler | Step 8h | | `render/mod.rs` | `render_stream` | Step 8h | | `repl/completer.rs` | Completion logic | Step 8h | | `repl/prompt.rs` | Prompt rendering | Step 8h | | `config/macros.rs` | `macro_execute` | Step 8h | ### Step 8 effort estimates | Sub-step | Effort | Risk | |---|---|---| | 8a — client module refactor | 0.5–1 day | Low | | 8b — Step 7 deferrals | 0.5–1 day | Low | | 8c — `McpFactory::acquire()` extraction | 1 day | Medium | | 8d — scope transition rewrites | 1–2 days | Medium–high | | 8e — RAG + session lifecycle migration | 1–2 days | Medium | | 8f — `main.rs` rewrite | 1 day | High | | 8g — `repl/mod.rs` rewrite | 1–2 days | High | | 8h — remaining callsite sweep | 1–2 days | Medium | **Total estimated Step 8 effort: ~7–12 days.** The "total 
Phase 1 effort" from the plan header needs to be updated once Step 8 finishes. --- ## Verification Checkpoints After each step, verify: 1. **`cargo check`** — no compilation errors 2. **`cargo test`** — all existing tests pass 3. **Manual smoke test** — CLI one-shot prompt works, REPL starts and processes a prompt 4. **No behavior changes** — identical output for identical inputs ## Risk Factors ### Phase-wide risks | Risk | Severity | Mitigation | |---|---|---| | Bridge-window duplication drift — bug fixed in `Config::X` but not `RequestContext::X` or vice versa | Medium | Keep the bridge window as short as possible. Step 8 should finish within 2 weeks of Step 7 ideally. Any bug fix during Steps 8a-8h must be applied to both places if the method is still duplicated. | | Sub-agent spawning semantics change subtly | High | Cross-agent MCP trampling is a latent bug today that Step 8d/8h fixes. Write targeted integration tests for the sub-agent spawning path before and after Step 8h to verify semantics match (or improve intentionally). | | Long-running Phase 1 blocking Phase 2+ work | Medium | Phase 2 (Engine + Emitter) can start prep work in parallel with Step 8h — the final callsite sweep doesn't block the new Engine design. | ### Step 8 sub-step risks | Sub-step | Risk | Severity | Mitigation | |---|---|---|---| | 8a | Client macro refactor breaks LLM provider integration | Low | All LLM providers use the same `Model::retrieve_model` entry point. Test with at least 2 providers (openai + another) before declaring 8a done. | | 8b | `update` dispatcher has ~15 cases — easy to miss one | Low | Enumerate every `.set` key handled today; check each is in the new dispatcher. | | 8c | Extracting `spawn_mcp_server` introduces behavior differences (e.g., error handling, abort signal propagation) | Medium | Do a line-by-line diff review. Write a test that kills an in-flight spawn via the abort signal. 
| | 8d | `McpFactory` mutex contention under parallel sub-agent spawning | Medium | Hold the `active` lock only during HashMap operations, never across `await`. Benchmark with 4 concurrent scope transitions before declaring 8d done. | | 8d | Parent scope restoration on `exit_agent` differs from today's implicit behavior | High | Write a targeted test: activate global→role(jira)→agent(github,slack)→exit_agent. Verify scope is role(jira), not agent's (github,slack) or stale. | | 8e | Session compression loses messages when triggered mid-request | High | Integration test: feed 10+ user messages, compress, verify summary preserves all user intent. Also test concurrent compression (REPL background task + foreground turn). | | 8e | `rebuild_rag` / `edit_rag_docs` pick the wrong `RagKey` variant | Medium | Test both paths explicitly: agent-scoped rebuild and standalone rebuild. Assert the right cache entry is invalidated. | | 8f | CLI output bytes differ from pre-refactor baseline | High | Record baseline CLI outputs for 10 common invocations. After 8f, diff byte-for-byte. Any difference is a regression unless explicitly justified. | | 8g | REPL dot-command behavior regresses silently | High | Test every dot-command end-to-end: `.role`, `.session`, `.agent`, `.rag`, `.set`, `.info`, `.exit *`, `.compress session`, etc. | | 8h | Sub-agent spawning in `function/supervisor.rs` shares state incorrectly between parent and child | High | Integration test: parent activates agent A (github), spawns child B (jira), verify B's tool scope has only jira and parent's has only github. Each parallel child has independent tool scopes. | | 8h | `Agent::init` refactor drops initialization logic | Medium | `Agent::init` is ~100 lines today. Diff the old vs new init paths line-by-line. 
| ### Legacy risks (resolved during the refactor) These risks from the original plan have been addressed by the step-by-step scaffolding approach: | Original risk | How it was resolved | |---|---| | `use_role_safely` / `use_session_safely` use take/replace pattern | Eliminated entirely in Step 8g — REPL holds `&mut RequestContext` directly, no lock take/replace needed | | `Agent::init` creates MCP servers, functions, RAG on Config | Resolved in Step 8d + Step 8h — MCP via `McpFactory::acquire()`, RAG via `RagCache`, functions via `RequestContext::bootstrap_tools` | | Sub-agent spawning clones Config | Resolved in Step 8h — children get fresh `RequestContext` forked from `Arc<AppState>` | | Input holds `GlobalConfig` clone | Resolved in Step 8f — `Input` now holds references to the context it needs from `RequestContext`, not an owned clone | | Concurrent REPL operations spawn tasks with `GlobalConfig` clone | Resolved in Step 8e — task spawning moves to the caller's responsibility, with explicit access to the caller's `RequestContext` | ## What This Phase Does NOT Do - No REST API server code - No Engine::run() unification (that's Phase 2) - No Emitter trait (that's Phase 2) - No SessionStore abstraction (that's Phase 3) - No UUID-based sessions (that's Phase 3) - No agent isolation refactoring (that's Phase 5) - No new dependencies added The sole goal is: **split Config into immutable global + mutable per-request, with identical external behavior.**