This commit is contained in:
2026-04-15 12:56:00 -06:00
parent ff3419a714
commit 63b6678e73
82 changed files with 14800 additions and 3310 deletions
@@ -0,0 +1,226 @@
# Phase 1 Step 8c — Implementation Notes
## Status
Done.
## Plan reference
- Plan: `docs/PHASE-1-IMPLEMENTATION-PLAN.md`
- Section: "Step 8c: Extract `McpFactory::acquire()` from
`McpRegistry::init_server`"
## Summary
Extracted the MCP subprocess spawn + rmcp handshake logic from
`McpRegistry::start_server` into a standalone `pub(crate) async fn
spawn_mcp_server()` function. Rewrote `start_server` to call it.
Implemented `McpFactory::acquire()` using the extracted function
plus the existing `try_get_active` / `insert_active` scaffolding
from Step 6.5. Three types in `mcp/mod.rs` were bumped to
`pub(crate)` visibility for cross-module access.
## What was changed
### Files modified (2 files)
- **`src/mcp/mod.rs`** — 4 changes:
1. **Extracted `spawn_mcp_server`** (~40 lines) — standalone
`pub(crate) async fn` that takes an `&McpServer` spec and
optional log path, builds a `tokio::process::Command`, creates
a `TokioChildProcess` transport (with optional stderr log
redirect), calls `().serve(transport).await` for the rmcp
handshake, and returns `Arc<ConnectedServer>`.
2. **Rewrote `McpRegistry::start_server`** — now looks up the
`McpServer` spec from `self.config`, calls `spawn_mcp_server`,
then does its own catalog building (tool listing, BM25 index
construction). The spawn + handshake code that was previously
inline is replaced by the one-liner
`spawn_mcp_server(spec, self.log_path.as_deref()).await?`.
3. **Bumped 3 types to `pub(crate)`**: `McpServer`, `JsonField`,
`McpServersConfig`. These were previously private to
`mcp/mod.rs`. `McpFactory::acquire()` and
`McpServerKey::from_spec()` need `McpServer` and `JsonField`
to build the server key from a spec. `McpServersConfig` is
bumped for completeness (Step 8d may need to access it when
loading server specs during scope transitions).
- **`src/config/mcp_factory.rs`** — 3 changes:
1. **Added `McpServerKey::from_spec(name, &McpServer)`** — builds
a key by extracting command, args (defaulting to empty vec),
and env vars (converting `JsonField` variants to strings) from
the spec. Args and env are sorted by the existing `new()`
constructor to ensure identical specs produce identical keys.
2. **Added `McpFactory::acquire(name, &McpServer, log_path)`**
the core method. Builds an `McpServerKey` from the spec, checks
`try_get_active` for an existing `Arc` (sharing path), otherwise
calls `spawn_mcp_server` to start a fresh subprocess, inserts
the result into `active` via `insert_active`, and returns the
`Arc<ConnectedServer>`.
3. **Updated imports** — added `McpServer`, `spawn_mcp_server`,
`Result`, `Path`.
### Files NOT changed
- **`src/config/tool_scope.rs`** — unchanged; Step 8d will use
`McpFactory::acquire()` to populate `McpRuntime` instances.
- **All caller code** — `McpRegistry::start_select_mcp_servers` and
`McpRegistry::reinit` continue to call `self.start_server()` which
internally uses the extracted function. No caller migration.
## Key decisions
### 1. Spawn function does NOT list tools or build catalogs
The plan said to extract "the MCP subprocess spawn + rmcp handshake
logic (~60 lines)." I interpreted this as: `Command` construction →
transport creation → `serve()` handshake → `Arc` wrapping. The tool
listing (`service.list_tools`) and catalog building (BM25 index) are
`McpRegistry`-specific bookkeeping and stay in `start_server`.
`McpFactory::acquire()` returns a connected server handle ready to
use. Callers (Step 8d's scope transitions) can list tools themselves
if they need to build function declarations.
### 2. No `abort_signal` parameter on `spawn_mcp_server`
The plan suggested `abort_signal: &AbortSignal` as a parameter. The
existing `start_server` doesn't use an abort signal — cancellation
is handled at a higher level by `abortable_run_with_spinner` wrapping
the entire batch of `start_select_mcp_servers`. Adding an abort signal
to the individual spawn would require threading `tokio::select!` into
the transport creation, which is a behavior change beyond Step 8c's
scope. Step 8d can add cancellation when building `ToolScope` if
needed.
### 3. `McpServerKey::from_spec` converts `JsonField` to strings
The `McpServer.env` field uses a `JsonField` enum (Str/Bool/Int) for
JSON flexibility. The key needs string comparisons for hashing, so
`from_spec` converts each variant to its string representation. This
matches the conversion already done in the env-building code inside
`spawn_mcp_server`.
### 4. `McpFactory::acquire` mutex contention is safe
The plan warned: "hold the lock only during HashMap mutation, never
across subprocess spawn." The implementation achieves this by using
the existing `try_get_active` and `insert_active` methods, which each
acquire and release the mutex within their own scope. The `spawn_mcp_server`
await happens between the two lock acquisitions with no lock held.
TOCTOU race: two concurrent callers could both miss in `try_get_active`,
both spawn, and both insert. The second insert overwrites the first's
`Weak`. This means one extra subprocess gets spawned and the first
`Arc` has no `Weak` in the map (but stays alive via its holder's
`Arc`). This is acceptable for Phase 1 — the worst case is a
redundant spawn, not a crash or leak. Phase 5's pooling design
(per-key `tokio::sync::Mutex`) will eliminate this race.
### 5. No integration tests for `acquire()`
The plan suggested writing integration tests for the factory's sharing
behavior. Spawning a real MCP server requires a configured binary on
the system PATH. A mock server would need a test binary that speaks
the rmcp stdio protocol — this is substantial test infrastructure
that doesn't exist yet. Rather than building it in Step 8c, I'm
documenting that integration testing of `McpFactory::acquire()` should
happen in Phase 5 when the pooling infrastructure provides natural
test hooks (idle pool, reaper, health checks). The extraction itself
is verified by the fact that existing MCP functionality (which goes
through `McpRegistry::start_server``spawn_mcp_server`) still
compiles and all 63 tests pass.
## Deviations from plan
| Deviation | Rationale |
|---|---|
| No `abort_signal` parameter | Not used by existing code; adding it is a behavior change |
| No integration tests | Requires MCP test infrastructure that doesn't exist |
| Removed `get_server_spec` / `log_path` accessors from McpRegistry | Not needed; `acquire()` takes spec and log_path directly |
## Verification
### Compilation
- `cargo check` — clean, zero warnings, zero errors
- `cargo clippy` — clean
### Tests
- `cargo test`**63 passed, 0 failed** (unchanged from Steps 18b)
## Handoff to next step
### What Step 8d can rely on
- **`spawn_mcp_server(&McpServer, Option<&Path>) -> Result<Arc<ConnectedServer>>`** —
available from `crate::mcp::spawn_mcp_server`
- **`McpFactory::acquire(name, &McpServer, log_path) -> Result<Arc<ConnectedServer>>`** —
checks active map for sharing, spawns fresh if needed, inserts
into active map
- **`McpServerKey::from_spec(name, &McpServer) -> McpServerKey`** —
builds a hashable key from a server spec
- **`McpServer`, `McpServersConfig`, `JsonField`** — all `pub(crate)`
and accessible from `src/config/`
### What Step 8d should do
Build real `ToolScope` instances during scope transitions:
1. Resolve the effective enabled-server list from the role/session/agent
2. Look up each server's `McpServer` spec (from the MCP config)
3. Call `app.mcp_factory.acquire(name, spec, log_path)` for each
4. Populate an `McpRuntime` with the returned `Arc<ConnectedServer>`
handles
5. Construct a `ToolScope` with the runtime + resolved `Functions`
6. Assign to `ctx.tool_scope`
### What Step 8d should watch for
- **Log path.** `McpRegistry` stores `log_path` during `init()`.
Step 8d needs to decide where the log path comes from for
factory-acquired servers. Options: store it on `AppState`,
compute it from `paths::cache_path()`, or pass it through from
the caller. The simplest is to store it on `McpFactory` at
construction time.
- **MCP config loading.** `McpRegistry::init()` loads and parses
`mcp.json`. Step 8d's scope transitions need access to the
parsed `McpServersConfig` to look up server specs by name.
Options: store the parsed config on `AppState`, or load it
fresh each time. Storing on `AppState` is more efficient.
- **Catalog building.** `McpRegistry::start_server` builds a
`ServerCatalog` (BM25 index) for each server after spawning.
Step 8d's `ToolScope` doesn't use catalogs — they're for the
`mcp_search` meta-function. The catalog functionality may need
to be lifted out of `McpRegistry` eventually, but that's not
blocking Step 8d.
### Files to re-read at the start of Step 8d
- `docs/PHASE-1-IMPLEMENTATION-PLAN.md` — Step 8d section
- This notes file
- `src/config/mcp_factory.rs` — full file
- `src/config/tool_scope.rs` — full file
- `src/mcp/mod.rs``McpRegistry::init`, `start_select_mcp_servers`,
`resolve_server_ids` for the config loading / server selection
patterns that Step 8d will replicate
## References
- Phase 1 plan: `docs/PHASE-1-IMPLEMENTATION-PLAN.md`
- Step 8b notes: `docs/implementation/PHASE-1-STEP-8b-NOTES.md`
- Step 6.5 notes: `docs/implementation/PHASE-1-STEP-6.5-NOTES.md`
- Modified files:
- `src/mcp/mod.rs` (extracted `spawn_mcp_server`, rewrote
`start_server`, bumped 3 types to `pub(crate)`)
- `src/config/mcp_factory.rs` (added `from_spec`, `acquire`,
updated imports)