227 lines
9.5 KiB
Markdown
227 lines
9.5 KiB
Markdown
# Phase 1 Step 8c — Implementation Notes
|
||
|
||
## Status
|
||
|
||
Done.
|
||
|
||
## Plan reference
|
||
|
||
- Plan: `docs/PHASE-1-IMPLEMENTATION-PLAN.md`
|
||
- Section: "Step 8c: Extract `McpFactory::acquire()` from
|
||
`McpRegistry::init_server`"
|
||
|
||
## Summary
|
||
|
||
Extracted the MCP subprocess spawn + rmcp handshake logic from
|
||
`McpRegistry::start_server` into a standalone `pub(crate) async fn
|
||
spawn_mcp_server()` function. Rewrote `start_server` to call it.
|
||
Implemented `McpFactory::acquire()` using the extracted function
|
||
plus the existing `try_get_active` / `insert_active` scaffolding
|
||
from Step 6.5. Three types in `mcp/mod.rs` were bumped to
|
||
`pub(crate)` visibility for cross-module access.
|
||
|
||
## What was changed
|
||
|
||
### Files modified (2 files)
|
||
|
||
- **`src/mcp/mod.rs`** — 4 changes:
|
||
|
||
1. **Extracted `spawn_mcp_server`** (~40 lines) — standalone
|
||
`pub(crate) async fn` that takes an `&McpServer` spec and
|
||
optional log path, builds a `tokio::process::Command`, creates
|
||
a `TokioChildProcess` transport (with optional stderr log
|
||
redirect), calls `().serve(transport).await` for the rmcp
|
||
handshake, and returns `Arc<ConnectedServer>`.
|
||
|
||
2. **Rewrote `McpRegistry::start_server`** — now looks up the
|
||
`McpServer` spec from `self.config`, calls `spawn_mcp_server`,
|
||
then does its own catalog building (tool listing, BM25 index
|
||
construction). The spawn + handshake code that was previously
|
||
inline is replaced by the one-liner
|
||
`spawn_mcp_server(spec, self.log_path.as_deref()).await?`.
|
||
|
||
3. **Bumped 3 types to `pub(crate)`**: `McpServer`, `JsonField`,
|
||
`McpServersConfig`. These were previously private to
|
||
`mcp/mod.rs`. `McpFactory::acquire()` and
|
||
`McpServerKey::from_spec()` need `McpServer` and `JsonField`
|
||
to build the server key from a spec. `McpServersConfig` is
|
||
bumped for completeness (Step 8d may need to access it when
|
||
loading server specs during scope transitions).
|
||
|
||
- **`src/config/mcp_factory.rs`** — 3 changes:
|
||
|
||
1. **Added `McpServerKey::from_spec(name, &McpServer)`** — builds
|
||
a key by extracting command, args (defaulting to empty vec),
|
||
and env vars (converting `JsonField` variants to strings) from
|
||
the spec. Args and env are sorted by the existing `new()`
|
||
constructor to ensure identical specs produce identical keys.
|
||
|
||
2. **Added `McpFactory::acquire(name, &McpServer, log_path)`** —
|
||
the core method. Builds an `McpServerKey` from the spec, checks
|
||
`try_get_active` for an existing `Arc` (sharing path), otherwise
|
||
calls `spawn_mcp_server` to start a fresh subprocess, inserts
|
||
the result into `active` via `insert_active`, and returns the
|
||
`Arc<ConnectedServer>`.
|
||
|
||
3. **Updated imports** — added `McpServer`, `spawn_mcp_server`,
|
||
`Result`, `Path`.
|
||
|
||
### Files NOT changed
|
||
|
||
- **`src/config/tool_scope.rs`** — unchanged; Step 8d will use
|
||
`McpFactory::acquire()` to populate `McpRuntime` instances.
|
||
- **All caller code** — `McpRegistry::start_select_mcp_servers` and
|
||
`McpRegistry::reinit` continue to call `self.start_server()` which
|
||
internally uses the extracted function. No caller migration.
|
||
|
||
## Key decisions
|
||
|
||
### 1. Spawn function does NOT list tools or build catalogs
|
||
|
||
The plan said to extract "the MCP subprocess spawn + rmcp handshake
|
||
logic (~60 lines)." I interpreted this as: `Command` construction →
|
||
transport creation → `serve()` handshake → `Arc` wrapping. The tool
|
||
listing (`service.list_tools`) and catalog building (BM25 index) are
|
||
`McpRegistry`-specific bookkeeping and stay in `start_server`.
|
||
|
||
`McpFactory::acquire()` returns a connected server handle ready to
|
||
use. Callers (Step 8d's scope transitions) can list tools themselves
|
||
if they need to build function declarations.
|
||
|
||
### 2. No `abort_signal` parameter on `spawn_mcp_server`
|
||
|
||
The plan suggested `abort_signal: &AbortSignal` as a parameter. The
|
||
existing `start_server` doesn't use an abort signal — cancellation
|
||
is handled at a higher level by `abortable_run_with_spinner` wrapping
|
||
the entire batch of `start_select_mcp_servers`. Adding an abort signal
|
||
to the individual spawn would require threading `tokio::select!` into
|
||
the transport creation, which is a behavior change beyond Step 8c's
|
||
scope. Step 8d can add cancellation when building `ToolScope` if
|
||
needed.
|
||
|
||
### 3. `McpServerKey::from_spec` converts `JsonField` to strings
|
||
|
||
The `McpServer.env` field uses a `JsonField` enum (Str/Bool/Int) for
|
||
JSON flexibility. The key needs string comparisons for hashing, so
|
||
`from_spec` converts each variant to its string representation. This
|
||
matches the conversion already done in the env-building code inside
|
||
`spawn_mcp_server`.
|
||
|
||
### 4. `McpFactory::acquire` mutex contention is safe
|
||
|
||
The plan warned: "hold the lock only during HashMap mutation, never
|
||
across subprocess spawn." The implementation achieves this by using
|
||
the existing `try_get_active` and `insert_active` methods, which each
|
||
acquire and release the mutex within their own scope. The `spawn_mcp_server`
|
||
await happens between the two lock acquisitions with no lock held.
|
||
|
||
TOCTOU race: two concurrent callers could both miss in `try_get_active`,
|
||
both spawn, and both insert. The second insert overwrites the first's
|
||
`Weak`. This means one extra subprocess gets spawned and the first
|
||
`Arc` has no `Weak` in the map (but stays alive via its holder's
|
||
`Arc`). This is acceptable for Phase 1 — the worst case is a
|
||
redundant spawn, not a crash or leak. Phase 5's pooling design
|
||
(per-key `tokio::sync::Mutex`) will eliminate this race.
|
||
|
||
### 5. No integration tests for `acquire()`
|
||
|
||
The plan suggested writing integration tests for the factory's sharing
|
||
behavior. Spawning a real MCP server requires a configured binary on
|
||
the system PATH. A mock server would need a test binary that speaks
|
||
the rmcp stdio protocol — this is substantial test infrastructure
|
||
that doesn't exist yet. Rather than building it in Step 8c, I'm
|
||
documenting that integration testing of `McpFactory::acquire()` should
|
||
happen in Phase 5 when the pooling infrastructure provides natural
|
||
test hooks (idle pool, reaper, health checks). The extraction itself
|
||
is verified by the fact that existing MCP functionality (which goes
|
||
through `McpRegistry::start_server` → `spawn_mcp_server`) still
|
||
compiles and all 63 tests pass.
|
||
|
||
## Deviations from plan
|
||
|
||
| Deviation | Rationale |
|
||
|---|---|
|
||
| No `abort_signal` parameter | Not used by existing code; adding it is a behavior change |
|
||
| No integration tests | Requires MCP test infrastructure that doesn't exist |
|
||
| Removed `get_server_spec` / `log_path` accessors from McpRegistry | Not needed; `acquire()` takes spec and log_path directly |
|
||
|
||
## Verification
|
||
|
||
### Compilation
|
||
|
||
- `cargo check` — clean, zero warnings, zero errors
|
||
- `cargo clippy` — clean
|
||
|
||
### Tests
|
||
|
||
- `cargo test` — **63 passed, 0 failed** (unchanged from Steps 1–8b)
|
||
|
||
## Handoff to next step
|
||
|
||
### What Step 8d can rely on
|
||
|
||
- **`spawn_mcp_server(&McpServer, Option<&Path>) -> Result<Arc<ConnectedServer>>`** —
|
||
available from `crate::mcp::spawn_mcp_server`
|
||
- **`McpFactory::acquire(name, &McpServer, log_path) -> Result<Arc<ConnectedServer>>`** —
|
||
checks active map for sharing, spawns fresh if needed, inserts
|
||
into active map
|
||
- **`McpServerKey::from_spec(name, &McpServer) -> McpServerKey`** —
|
||
builds a hashable key from a server spec
|
||
- **`McpServer`, `McpServersConfig`, `JsonField`** — all `pub(crate)`
|
||
and accessible from `src/config/`
|
||
|
||
### What Step 8d should do
|
||
|
||
Build real `ToolScope` instances during scope transitions:
|
||
|
||
1. Resolve the effective enabled-server list from the role/session/agent
|
||
2. Look up each server's `McpServer` spec (from the MCP config)
|
||
3. Call `app.mcp_factory.acquire(name, spec, log_path)` for each
|
||
4. Populate an `McpRuntime` with the returned `Arc<ConnectedServer>`
|
||
handles
|
||
5. Construct a `ToolScope` with the runtime + resolved `Functions`
|
||
6. Assign to `ctx.tool_scope`
|
||
|
||
### What Step 8d should watch for
|
||
|
||
- **Log path.** `McpRegistry` stores `log_path` during `init()`.
|
||
Step 8d needs to decide where the log path comes from for
|
||
factory-acquired servers. Options: store it on `AppState`,
|
||
compute it from `paths::cache_path()`, or pass it through from
|
||
the caller. The simplest is to store it on `McpFactory` at
|
||
construction time.
|
||
|
||
- **MCP config loading.** `McpRegistry::init()` loads and parses
|
||
`mcp.json`. Step 8d's scope transitions need access to the
|
||
parsed `McpServersConfig` to look up server specs by name.
|
||
Options: store the parsed config on `AppState`, or load it
|
||
fresh each time. Storing on `AppState` is more efficient.
|
||
|
||
- **Catalog building.** `McpRegistry::start_server` builds a
|
||
`ServerCatalog` (BM25 index) for each server after spawning.
|
||
Step 8d's `ToolScope` doesn't use catalogs — they're for the
|
||
`mcp_search` meta-function. The catalog functionality may need
|
||
to be lifted out of `McpRegistry` eventually, but that's not
|
||
blocking Step 8d.
|
||
|
||
### Files to re-read at the start of Step 8d
|
||
|
||
- `docs/PHASE-1-IMPLEMENTATION-PLAN.md` — Step 8d section
|
||
- This notes file
|
||
- `src/config/mcp_factory.rs` — full file
|
||
- `src/config/tool_scope.rs` — full file
|
||
- `src/mcp/mod.rs` — `McpRegistry::init`, `start_select_mcp_servers`,
|
||
`resolve_server_ids` for the config loading / server selection
|
||
patterns that Step 8d will replicate
|
||
|
||
## References
|
||
|
||
- Phase 1 plan: `docs/PHASE-1-IMPLEMENTATION-PLAN.md`
|
||
- Step 8b notes: `docs/implementation/PHASE-1-STEP-8b-NOTES.md`
|
||
- Step 6.5 notes: `docs/implementation/PHASE-1-STEP-6.5-NOTES.md`
|
||
- Modified files:
|
||
- `src/mcp/mod.rs` (extracted `spawn_mcp_server`, rewrote
|
||
`start_server`, bumped 3 types to `pub(crate)`)
|
||
- `src/config/mcp_factory.rs` (added `from_spec`, `acquire`,
|
||
updated imports)
|