# Phase 1 Step 8c — Implementation Notes ## Status Done. ## Plan reference - Plan: `docs/PHASE-1-IMPLEMENTATION-PLAN.md` - Section: "Step 8c: Extract `McpFactory::acquire()` from `McpRegistry::init_server`" ## Summary Extracted the MCP subprocess spawn + rmcp handshake logic from `McpRegistry::start_server` into a standalone `pub(crate) async fn spawn_mcp_server()` function. Rewrote `start_server` to call it. Implemented `McpFactory::acquire()` using the extracted function plus the existing `try_get_active` / `insert_active` scaffolding from Step 6.5. Three types in `mcp/mod.rs` were bumped to `pub(crate)` visibility for cross-module access. ## What was changed ### Files modified (2 files) - **`src/mcp/mod.rs`** — 4 changes: 1. **Extracted `spawn_mcp_server`** (~40 lines) — standalone `pub(crate) async fn` that takes an `&McpServer` spec and optional log path, builds a `tokio::process::Command`, creates a `TokioChildProcess` transport (with optional stderr log redirect), calls `().serve(transport).await` for the rmcp handshake, and returns `Arc`. 2. **Rewrote `McpRegistry::start_server`** — now looks up the `McpServer` spec from `self.config`, calls `spawn_mcp_server`, then does its own catalog building (tool listing, BM25 index construction). The spawn + handshake code that was previously inline is replaced by the one-liner `spawn_mcp_server(spec, self.log_path.as_deref()).await?`. 3. **Bumped 3 types to `pub(crate)`**: `McpServer`, `JsonField`, `McpServersConfig`. These were previously private to `mcp/mod.rs`. `McpFactory::acquire()` and `McpServerKey::from_spec()` need `McpServer` and `JsonField` to build the server key from a spec. `McpServersConfig` is bumped for completeness (Step 8d may need to access it when loading server specs during scope transitions). - **`src/config/mcp_factory.rs`** — 3 changes: 1. **Added `McpServerKey::from_spec(name, &McpServer)`** — builds a key by extracting command, args (defaulting to empty vec), and env vars (converting `JsonField` variants to strings) from the spec. Args and env are sorted by the existing `new()` constructor to ensure identical specs produce identical keys. 2. **Added `McpFactory::acquire(name, &McpServer, log_path)`** — the core method. Builds an `McpServerKey` from the spec, checks `try_get_active` for an existing `Arc` (sharing path), otherwise calls `spawn_mcp_server` to start a fresh subprocess, inserts the result into `active` via `insert_active`, and returns the `Arc`. 3. **Updated imports** — added `McpServer`, `spawn_mcp_server`, `Result`, `Path`. ### Files NOT changed - **`src/config/tool_scope.rs`** — unchanged; Step 8d will use `McpFactory::acquire()` to populate `McpRuntime` instances. - **All caller code** — `McpRegistry::start_select_mcp_servers` and `McpRegistry::reinit` continue to call `self.start_server()` which internally uses the extracted function. No caller migration. ## Key decisions ### 1. Spawn function does NOT list tools or build catalogs The plan said to extract "the MCP subprocess spawn + rmcp handshake logic (~60 lines)." I interpreted this as: `Command` construction → transport creation → `serve()` handshake → `Arc` wrapping. The tool listing (`service.list_tools`) and catalog building (BM25 index) are `McpRegistry`-specific bookkeeping and stay in `start_server`. `McpFactory::acquire()` returns a connected server handle ready to use. Callers (Step 8d's scope transitions) can list tools themselves if they need to build function declarations. ### 2. No `abort_signal` parameter on `spawn_mcp_server` The plan suggested `abort_signal: &AbortSignal` as a parameter. The existing `start_server` doesn't use an abort signal — cancellation is handled at a higher level by `abortable_run_with_spinner` wrapping the entire batch of `start_select_mcp_servers`. Adding an abort signal to the individual spawn would require threading `tokio::select!` into the transport creation, which is a behavior change beyond Step 8c's scope. Step 8d can add cancellation when building `ToolScope` if needed. ### 3. `McpServerKey::from_spec` converts `JsonField` to strings The `McpServer.env` field uses a `JsonField` enum (Str/Bool/Int) for JSON flexibility. The key needs string comparisons for hashing, so `from_spec` converts each variant to its string representation. This matches the conversion already done in the env-building code inside `spawn_mcp_server`. ### 4. `McpFactory::acquire` mutex contention is safe The plan warned: "hold the lock only during HashMap mutation, never across subprocess spawn." The implementation achieves this by using the existing `try_get_active` and `insert_active` methods, which each acquire and release the mutex within their own scope. The `spawn_mcp_server` await happens between the two lock acquisitions with no lock held. TOCTOU race: two concurrent callers could both miss in `try_get_active`, both spawn, and both insert. The second insert overwrites the first's `Weak`. This means one extra subprocess gets spawned and the first `Arc` has no `Weak` in the map (but stays alive via its holder's `Arc`). This is acceptable for Phase 1 — the worst case is a redundant spawn, not a crash or leak. Phase 5's pooling design (per-key `tokio::sync::Mutex`) will eliminate this race. ### 5. No integration tests for `acquire()` The plan suggested writing integration tests for the factory's sharing behavior. Spawning a real MCP server requires a configured binary on the system PATH. A mock server would need a test binary that speaks the rmcp stdio protocol — this is substantial test infrastructure that doesn't exist yet. Rather than building it in Step 8c, I'm documenting that integration testing of `McpFactory::acquire()` should happen in Phase 5 when the pooling infrastructure provides natural test hooks (idle pool, reaper, health checks). The extraction itself is verified by the fact that existing MCP functionality (which goes through `McpRegistry::start_server` → `spawn_mcp_server`) still compiles and all 63 tests pass. ## Deviations from plan | Deviation | Rationale | |---|---| | No `abort_signal` parameter | Not used by existing code; adding it is a behavior change | | No integration tests | Requires MCP test infrastructure that doesn't exist | | Removed `get_server_spec` / `log_path` accessors from McpRegistry | Not needed; `acquire()` takes spec and log_path directly | ## Verification ### Compilation - `cargo check` — clean, zero warnings, zero errors - `cargo clippy` — clean ### Tests - `cargo test` — **63 passed, 0 failed** (unchanged from Steps 1–8b) ## Handoff to next step ### What Step 8d can rely on - **`spawn_mcp_server(&McpServer, Option<&Path>) -> Result>`** — available from `crate::mcp::spawn_mcp_server` - **`McpFactory::acquire(name, &McpServer, log_path) -> Result>`** — checks active map for sharing, spawns fresh if needed, inserts into active map - **`McpServerKey::from_spec(name, &McpServer) -> McpServerKey`** — builds a hashable key from a server spec - **`McpServer`, `McpServersConfig`, `JsonField`** — all `pub(crate)` and accessible from `src/config/` ### What Step 8d should do Build real `ToolScope` instances during scope transitions: 1. Resolve the effective enabled-server list from the role/session/agent 2. Look up each server's `McpServer` spec (from the MCP config) 3. Call `app.mcp_factory.acquire(name, spec, log_path)` for each 4. Populate an `McpRuntime` with the returned `Arc` handles 5. Construct a `ToolScope` with the runtime + resolved `Functions` 6. Assign to `ctx.tool_scope` ### What Step 8d should watch for - **Log path.** `McpRegistry` stores `log_path` during `init()`. Step 8d needs to decide where the log path comes from for factory-acquired servers. Options: store it on `AppState`, compute it from `paths::cache_path()`, or pass it through from the caller. The simplest is to store it on `McpFactory` at construction time. - **MCP config loading.** `McpRegistry::init()` loads and parses `mcp.json`. Step 8d's scope transitions need access to the parsed `McpServersConfig` to look up server specs by name. Options: store the parsed config on `AppState`, or load it fresh each time. Storing on `AppState` is more efficient. - **Catalog building.** `McpRegistry::start_server` builds a `ServerCatalog` (BM25 index) for each server after spawning. Step 8d's `ToolScope` doesn't use catalogs — they're for the `mcp_search` meta-function. The catalog functionality may need to be lifted out of `McpRegistry` eventually, but that's not blocking Step 8d. ### Files to re-read at the start of Step 8d - `docs/PHASE-1-IMPLEMENTATION-PLAN.md` — Step 8d section - This notes file - `src/config/mcp_factory.rs` — full file - `src/config/tool_scope.rs` — full file - `src/mcp/mod.rs` — `McpRegistry::init`, `start_select_mcp_servers`, `resolve_server_ids` for the config loading / server selection patterns that Step 8d will replicate ## References - Phase 1 plan: `docs/PHASE-1-IMPLEMENTATION-PLAN.md` - Step 8b notes: `docs/implementation/PHASE-1-STEP-8b-NOTES.md` - Step 6.5 notes: `docs/implementation/PHASE-1-STEP-6.5-NOTES.md` - Modified files: - `src/mcp/mod.rs` (extracted `spawn_mcp_server`, rewrote `start_server`, bumped 3 types to `pub(crate)`) - `src/config/mcp_factory.rs` (added `from_spec`, `acquire`, updated imports)