docs: Documentation for the RESTful API POC

2026-05-01 14:45:13 -06:00
parent ca03f6f9d7
commit 7ea3044a37
73 changed files with 2 additions and 2 deletions
@@ -0,0 +1,62 @@
+# Test Plan: Config Loading and AppConfig
+
+## Feature description
+
+Loki loads its configuration from a YAML file (`config.yaml`) into
+a `Config` struct, then splits it into an `AppConfig` (immutable,
+shared) and a `RequestContext` (mutable, per-request). `AppConfig`
+holds all serialized fields; `RequestContext` holds runtime state.
+
+## Behaviors to test
+
+### Config loading
+- [ ] Config loads from YAML file with all supported fields
+- [x] Missing optional fields get correct defaults (config_defaults_match_expected)
+- [ ] `model_id` defaults to first available model if empty (requires Config::init, integration test)
+- [x] `temperature`, `top_p` default to `None`
+- [x] `stream` defaults to `true`
+- [x] `save` defaults to `false` (CORRECTED: was listed as true)
+- [x] `highlight` defaults to `true`
+- [x] `dry_run` defaults to `false`
+- [x] `function_calling_support` defaults to `true`
+- [x] `mcp_server_support` defaults to `true`
+- [x] `compression_threshold` defaults to `4000`
+- [ ] `document_loaders` populated from config and defaults (requires Config::init)
+- [x] `clients` parsed from config (to_app_config_copies_clients)
+
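The defaults listed above can be pinned down in one place. The following is a minimal sketch, assuming a hypothetical `AppConfigSketch` struct that carries only the fields named in this checklist; the real `AppConfig` has more fields, and its types may differ.

```rust
/// Hypothetical subset of the serialized settings (not the real `AppConfig`).
#[derive(Debug, Clone, PartialEq)]
pub struct AppConfigSketch {
    pub temperature: Option<f64>,
    pub top_p: Option<f64>,
    pub stream: bool,
    pub save: bool,
    pub highlight: bool,
    pub dry_run: bool,
    pub function_calling_support: bool,
    pub mcp_server_support: bool,
    pub compression_threshold: usize,
}

impl Default for AppConfigSketch {
    /// Values mirror the defaults recorded in the checklist above.
    fn default() -> Self {
        Self {
            temperature: None,
            top_p: None,
            stream: true,
            save: false,
            highlight: true,
            dry_run: false,
            function_calling_support: true,
            mcp_server_support: true,
            compression_threshold: 4000,
        }
    }
}
```

A single `config_defaults_match_expected`-style test can then compare `AppConfigSketch::default()` field by field against these values.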
+### AppConfig conversion
+- [x] `to_app_config()` copies all serialized fields correctly
+- [x] `clients` field populated on AppConfig
+- [ ] `visible_tools` correctly computed from `enabled_tools` config (deferred to plan 16)
+- [x] `mapping_tools` correctly parsed
+- [x] `mapping_mcp_servers` correctly parsed
+- [ ] `user_agent` resolved (auto → crate name/version)
+
+### RequestContext conversion
+- [x] `to_request_context()` copies all runtime fields (to_request_context_creates_clean_state)
+- [ ] `model` field populated with resolved model (requires Model::retrieve_model)
+- [ ] `working_mode` set correctly (Repl vs Cmd)
+- [x] `tool_scope` starts with default (empty)
+- [x] `agent_runtime` starts as `None`
+
+### AppConfig field accessors
+- [x] `editor()` returns configured editor or $EDITOR
+- [x] `light_theme()` returns theme flag
+- [ ] `render_options()` returns options for markdown rendering
+- [x] `sync_models_url()` returns configured or default URL
+
+### Dynamic config updates
+- [x] `update_app_config` closure correctly clones and replaces Arc
+- [x] Changes to `dry_run`, `stream`, `save` persist across calls
+- [x] Changes visible to subsequent `ctx.app.config` reads
+
+## Context switching scenarios
+- [ ] AppConfig remains immutable after construction (no field mutation)
+- [ ] Multiple RequestContexts can share the same AppState
+- [ ] Changing AppConfig fields (via clone-mutate-replace) doesn't
+      affect other references to the old Arc
+
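The clone-mutate-replace pattern behind `update_app_config` can be sketched as follows. `AppSketch` and `AppConfigSketch` are hypothetical stand-ins; only the idea (clone the config, mutate the copy, swap in a new `Arc`) is taken from the plan.

```rust
use std::sync::Arc;

/// Hypothetical subset of the shared, immutable config.
#[derive(Debug, Clone, Default)]
pub struct AppConfigSketch {
    pub dry_run: bool,
    pub stream: bool,
    pub save: bool,
}

/// Hypothetical holder for the shared config handle.
#[derive(Debug, Default)]
pub struct AppSketch {
    pub config: Arc<AppConfigSketch>,
}

impl AppSketch {
    /// Clone the current config, let the closure edit the copy,
    /// then replace the Arc. Existing clones of the old Arc are untouched.
    pub fn update_app_config(&mut self, update: impl FnOnce(&mut AppConfigSketch)) {
        let mut next = (*self.config).clone();
        update(&mut next);
        self.config = Arc::new(next);
    }
}
```

A test for the last scenario above would hold an `Arc::clone` of the config, call `update_app_config`, and check that the held clone still shows the old values while a fresh read shows the new ones.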
+## Old code reference
+- `src/config/mod.rs` — `Config` struct, `Config::init`, defaults
+- `src/config/bridge.rs` — `to_app_config`, `to_request_context`
+- `src/config/app_config.rs` — `AppConfig` struct and methods
@@ -0,0 +1,68 @@
+# Test Plan: Roles
+
+## Feature description
+
+Roles define a system prompt plus optional model/temperature/MCP config
+that customizes LLM behavior. Roles can be built-in or user-defined
+(markdown files). A role is one of several "role-like" types: sessions
+and agents also implement the `RoleLike` trait.
+
+## Behaviors to test
+
+### Role loading
+- [x] Built-in roles load correctly (shell, code)
+- [ ] User-defined roles load from markdown files (requires filesystem)
+- [x] Role parses model_id from metadata
+- [x] Role parses temperature, top_p from metadata
+- [x] Role parses enabled_tools from metadata
+- [x] Role parses enabled_mcp_servers from metadata
+- [ ] Role with no model_id inherits current model (requires retrieve_role + client config)
+- [ ] Role with no temperature inherits from AppConfig (requires retrieve_role)
+- [ ] Role with no top_p inherits from AppConfig (requires retrieve_role)
+
+### retrieve_role
+- [ ] Retrieves by name from file system
+- [ ] Resolves model via Model::retrieve_model
+- [ ] Falls back to current model if role has no model_id
+- [ ] Sets temperature/top_p from AppConfig when role doesn't specify
+
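The fallback rules for `retrieve_role` amount to "keep what the role specifies, otherwise inherit". A minimal sketch, assuming hypothetical `RoleSketch` and `DefaultsSketch` types (the real code resolves a full model through `Model::retrieve_model`, which is not modeled here):

```rust
/// Hypothetical subset of role metadata.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct RoleSketch {
    pub model_id: Option<String>,
    pub temperature: Option<f64>,
    pub top_p: Option<f64>,
}

/// Hypothetical stand-in for the current model id and the AppConfig values.
pub struct DefaultsSketch<'a> {
    pub current_model_id: &'a str,
    pub temperature: Option<f64>,
    pub top_p: Option<f64>,
}

/// Fill in whatever the role leaves unspecified.
pub fn apply_fallbacks(mut role: RoleSketch, defaults: &DefaultsSketch) -> RoleSketch {
    if role.model_id.is_none() {
        role.model_id = Some(defaults.current_model_id.to_string());
    }
    role.temperature = role.temperature.or(defaults.temperature);
    role.top_p = role.top_p.or(defaults.top_p);
    role
}
```

Values set on the role always win; `Option::or` only consults the defaults when the role's field is `None`.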
+### use_role (scope transition)
+- [x] Sets role on RequestContext (use_role_obj_sets_role)
+- [ ] Triggers rebuild_tool_scope (async, deferred to plan 05/08)
+- [ ] MCP servers start if role has enabled_mcp_servers (deferred to plan 05)
+- [ ] MCP meta functions added to function list (deferred to plan 05)
+- [ ] Previous role cleared when switching (deferred to plan 08)
+- [x] Role-like temperature/top_p take effect (role_set_temperature_works)
+
+### exit_role
+- [x] Clears role from RequestContext (exit_role_clears_role)
+- [ ] Followed by bootstrap_tools to restore global tool scope (async, deferred)
+- [ ] MCP servers from role are stopped (deferred to plan 05)
+- [ ] Global MCP servers restored (deferred to plan 05)
+
+### use_prompt (temp role)
+- [x] Creates a TEMP_ROLE_NAME role with the prompt text (use_prompt_creates_temp_role)
+- [x] Uses current model
+- [x] Activates via use_role_obj
+
+### extract_role
+- [ ] Returns role from agent if agent active (deferred to plan 04)
+- [ ] Returns role from session if session active with role (deferred to plan 03)
+- [x] Returns standalone role if active (extract_role_returns_standalone_role)
+- [x] Returns default role if none active (extract_role_returns_default_when_nothing_active)
+
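The four `extract_role` cases read as a precedence chain. The sketch below assumes the order in which the checklist lists them (agent, then session, then standalone role, then default); the types and the `"default"` role name are hypothetical.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Role {
    pub name: String,
}

/// Hypothetical view of the scopes that can supply a role.
#[derive(Debug, Default)]
pub struct ContextSketch {
    pub agent_role: Option<Role>,
    pub session_role: Option<Role>,
    pub role: Option<Role>,
}

/// First active scope wins; fall back to a default role when nothing is active.
pub fn extract_role(ctx: &ContextSketch) -> Role {
    ctx.agent_role
        .clone()
        .or_else(|| ctx.session_role.clone())
        .or_else(|| ctx.role.clone())
        .unwrap_or_else(|| Role { name: "default".to_string() })
}
```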
+### One-shot role messages (REPL)
+- [ ] `.role coder write hello` sends message with role, then exits role
+- [ ] Original state restored after one-shot
+
+## Context switching scenarios
+- [ ] Role → different role: old role replaced, MCP swapped
+- [ ] Role → session: role cleared, session takes over
+- [ ] Role with MCP → exit: MCP servers stop, global MCP restored
+- [ ] No MCP → role with MCP: servers start
+- [ ] Role with MCP → role without MCP: servers stop
+
+## Old code reference
+- `src/config/mod.rs` — `use_role`, `exit_role`, `retrieve_role`
+- `src/config/role.rs` — `Role` struct, parsing
+- `src/config/request_context.rs` — `use_role`, `exit_role`, `use_prompt`, `retrieve_role`
@@ -0,0 +1,66 @@
+# Test Plan: Sessions
+
+## Feature description
+
+Sessions persist conversation history across multiple turns. They
+store messages, role context, model info, and optional MCP config.
+Sessions can be temporary, named, or auto-named.
+
+## Behaviors to test
+
+### Session creation
+- [ ] Temp session created with TEMP_SESSION_NAME
+- [ ] Named session created at correct file path
+- [ ] New session captures current role via extract_role
+- [ ] New session captures save_session from AppConfig
+- [ ] Session tracks model_id
+
+### Session loading
+- [ ] Named session loads from YAML file
+- [ ] Loaded session resolves model via Model::retrieve_model
+- [ ] Loaded session restores role_prompt if role exists
+- [ ] Auto-named sessions (prefixed `_/`) handled correctly
+
+### Session saving
+- [ ] Session saved to correct path
+- [ ] Session file contains messages, model_id, role info
+- [ ] save_session flag controls whether session is persisted
+- [ ] set_save_session_this_time overrides for current turn
+
+### Session lifecycle
+- [ ] use_session creates or loads session
+- [ ] Already in session → error
+- [ ] exit_session saves and clears
+- [ ] empty_session clears messages but keeps session active
+
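The lifecycle rules above form a small state machine. A minimal in-memory sketch, assuming hypothetical `SessionSketch` and `ContextSketch` types and leaving out file persistence entirely:

```rust
#[derive(Debug, Default)]
pub struct SessionSketch {
    pub name: String,
    pub messages: Vec<String>,
}

#[derive(Debug, Default)]
pub struct ContextSketch {
    pub session: Option<SessionSketch>,
}

impl ContextSketch {
    /// Start a session; starting one while another is active is an error.
    pub fn use_session(&mut self, name: &str) -> Result<(), String> {
        if self.session.is_some() {
            return Err("already in a session".to_string());
        }
        self.session = Some(SessionSketch { name: name.to_string(), messages: Vec::new() });
        Ok(())
    }

    /// Drop the messages but keep the session active.
    pub fn empty_session(&mut self) {
        if let Some(session) = self.session.as_mut() {
            session.messages.clear();
        }
    }

    /// Clear the active session and hand it back so a caller could persist it.
    pub fn exit_session(&mut self) -> Option<SessionSketch> {
        self.session.take()
    }
}
```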
+### Session carry-over
+- [ ] New empty session with last_message prompts "incorporate?"
+- [ ] If accepted, last Q&A added to session
+- [ ] If declined, session starts fresh
+- [ ] Only prompts when continuous and output not empty
+
+### Session compression
+- [ ] maybe_compress_session returns true when threshold exceeded
+- [ ] compress_session reduces message count
+- [ ] Compression message shown to user
+- [ ] Session usable after compression
+
+### Session autoname
+- [ ] maybe_autoname_session returns true for new sessions
+- [ ] Auto-naming sets session name based on content
+- [ ] Autoname only triggers once per session
+
+### Session info
+- [ ] session_info returns formatted session details
+- [ ] Shows message count, model, role, tokens
+
+## Context switching scenarios
+- [ ] Session → role change: role updated within session
+- [ ] Session → exit session: messages saved, state cleared
+- [ ] Agent session → exit: agent session cleanup
+- [ ] Session with MCP → exit: MCP servers handled
+
+## Old code reference
+- `src/config/mod.rs` — `use_session`, `exit_session`, `empty_session`
+- `src/config/session.rs` — `Session` struct, new, load, save
+- `src/config/request_context.rs` — `use_session`, `exit_session`
@@ -0,0 +1,77 @@
+# Test Plan: Agents
+
+## Feature description
+
+Agents combine a role (instructions), tools (bash/python/ts scripts),
+optional RAG, optional MCP servers, and optional sub-agent spawning
+capability. Agent::init compiles tools, resolves model, loads RAG,
+and sets up the agent environment.
+
+## Behaviors to test
+
+### Agent initialization
+- [ ] Agent::init loads config.yaml from agent directory
+- [ ] Agent tools compiled from tools.sh / tools.py / tools.ts / tools.js
+- [ ] Tool file priority: .sh > .py > .ts > .js
+- [ ] Global tools loaded (from global_tools config)
+- [ ] Model resolved from agent config or defaults to current
+- [ ] Agent with no model_id uses current model
+- [ ] Temperature/top_p from agent config applied
+- [ ] Dynamic instructions (_instructions function) invoked if configured
+- [ ] Static instructions loaded from config
+- [ ] Agent variables interpolated into instructions
+- [ ] Special variables (__os__, __cwd__, __now__, etc.) interpolated
+- [ ] Agent .env file loaded if present
+- [ ] Built-in agents installed on first run (skip if exists)
+
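The tool file priority (`.sh > .py > .ts > .js`) can be checked with a pure function that never touches the filesystem. This is a sketch under assumptions: the `tools.js` file name is inferred from the priority list, and `pick_tools_file` is a hypothetical helper, not a function from the codebase.

```rust
/// Candidate file names in priority order (`tools.js` is an assumption).
const TOOL_FILE_PRIORITY: [&str; 4] = ["tools.sh", "tools.py", "tools.ts", "tools.js"];

/// Return the highest-priority tools file among the names present in a directory.
pub fn pick_tools_file(available: &[&str]) -> Option<&'static str> {
    TOOL_FILE_PRIORITY
        .iter()
        .copied()
        .find(|wanted| available.iter().any(|name| name == wanted))
}
```

Taking the directory listing as a slice keeps the priority rule unit-testable; the caller would supply the real listing.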
+### Agent tools
+- [ ] Agent-specific tools available as function declarations
+- [ ] Global tools (from global_tools) also available
+- [ ] Tool binaries built in agent bin directory
+- [ ] clear_agent_bin_dir removes old binaries before rebuild
+- [ ] Tool declarations include name, description, parameters
+
+### Agent with MCP
+- [ ] MCP servers listed in agent config started
+- [ ] MCP meta functions (invoke/search/describe) added
+- [ ] Agent with MCP but mcp_server_support=false → error
+- [ ] MCP servers stopped on agent exit
+
+### Agent with RAG
+- [ ] RAG documents loaded from agent config
+- [ ] RAG available during agent conversation
+- [ ] RAG search results included in context
+
+### Agent sessions
+- [ ] Agent session started (temp or named)
+- [ ] agent_session config used if no explicit session
+- [ ] Agent session variables initialized
+
+### Agent lifecycle
+- [ ] use_agent checks function_calling_support
+- [ ] use_agent errors if agent already active
+- [ ] exit_agent clears agent, session, rag, supervisor
+- [ ] exit_agent restores global tool scope
+
+### Auto-continuation
+- [ ] Agents with auto_continue=true continue after incomplete todos
+- [ ] max_auto_continues limits continuation attempts
+- [ ] Continuation prompt sent with todo state
+- [ ] `.clear todo` stops continuation
+
+### Conversation starters
+- [ ] Starters loaded from agent config
+- [ ] .starter lists available starters
+- [ ] .starter <n> sends the starter as a message
+
+## Context switching scenarios
+- [ ] Agent → exit: tools cleared, MCP stopped, session ended
+- [ ] Agent with MCP → exit: MCP servers released, global MCP restored
+- [ ] Already in agent → start agent: error
+- [ ] Agent with RAG → exit: RAG cleared
+
+## Old code reference
+- `src/config/agent.rs` — Agent::init, agent config parsing
+- `src/config/mod.rs` — use_agent, exit_agent
+- `src/config/request_context.rs` — use_agent, exit_agent
+- `src/function/mod.rs` — Functions::init_agent, tool compilation
@@ -0,0 +1,118 @@
+# Test Plan: MCP Server Lifecycle
+
+## Feature description
+
+MCP (Model Context Protocol) servers are external tools that run
+as subprocesses communicating via stdio. Loki manages their lifecycle
+through `McpFactory` (starts servers and shares live handles, deduplicated
+via `Weak` references) and `McpRuntime` (per-scope active server handles).
+Servers are started and stopped during scope transitions (role/session/agent enter/exit).
+
+## Behaviors to test
+
+### MCP config loading
+- [x] mcp.json parsed correctly from functions directory
+- [x] Server specs include command, args, env, cwd
+- [ ] Vault secrets interpolated in mcp.json
+- [ ] Missing secrets reported as warnings
+- [x] McpServersConfig stored on AppState.mcp_config
+
+### McpFactory
+- [ ] acquire() spawns new server when none active (requires real subprocess)
+- [ ] acquire() returns existing handle via Weak upgrade (requires real subprocess)
+- [ ] acquire() spawns fresh when Weak is dead (requires real subprocess)
+- [ ] Multiple acquire() calls for same spec share handle (requires real subprocess)
+- [x] Different specs get different handles (via key inequality)
+- [x] McpServerKey built correctly from spec (sorted args/env)
+
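The `acquire()` cases above describe deduplication through `Weak` references. The sketch below illustrates that idea without any subprocess: `ConnectedServer` is a dummy stand-in, `spawn_count` exists only so a test can observe spawns, and `ServerKey` sorts args and env as the "sorted args/env" item suggests. None of these names are the real types.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Weak};

/// Hypothetical key: two specs that differ only in arg/env order compare equal.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ServerKey {
    name: String,
    command: String,
    args: Vec<String>,
    env: Vec<(String, String)>,
}

impl ServerKey {
    pub fn new(name: &str, command: &str, mut args: Vec<String>, mut env: Vec<(String, String)>) -> Self {
        args.sort();
        env.sort();
        Self { name: name.to_string(), command: command.to_string(), args, env }
    }
}

/// Dummy stand-in for a live server handle.
#[derive(Debug)]
pub struct ConnectedServer {
    pub key: ServerKey,
}

#[derive(Default)]
pub struct FactorySketch {
    active: HashMap<ServerKey, Weak<ConnectedServer>>,
    pub spawn_count: usize,
}

impl FactorySketch {
    /// Reuse a live handle when the Weak still upgrades; otherwise "spawn" a new one.
    pub fn acquire(&mut self, key: &ServerKey) -> Arc<ConnectedServer> {
        if let Some(live) = self.active.get(key).and_then(Weak::upgrade) {
            return live;
        }
        // No entry, or every previous holder dropped its Arc.
        self.spawn_count += 1;
        let server = Arc::new(ConnectedServer { key: key.clone() });
        self.active.insert(key.clone(), Arc::downgrade(&server));
        server
    }
}
```

Because the factory holds only `Weak` references, dropping the last `Arc` (for example when a `ToolScope` is dropped) releases the server without any explicit stop call in this sketch.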
+### McpRuntime
+- [ ] insert() adds server handle by name (requires Arc<ConnectedServer>)
+- [ ] get() retrieves handle by name (requires Arc<ConnectedServer>)
+- [x] server_names() returns all active names
+- [x] is_empty() correct for empty/non-empty
+- [ ] search() finds tools by keyword (BM25 ranking) (requires live server)
+- [ ] describe() returns tool input schema (requires live server)
+- [ ] invoke() calls tool on server and returns result (requires live server)
+
+### spawn_mcp_server
+- [ ] Builds Command from spec (command, args, env, cwd) (integration test)
+- [ ] Creates TokioChildProcess transport (integration test)
+- [ ] Completes rmcp handshake (serve) (integration test)
+- [ ] Returns Arc<ConnectedServer> (integration test)
+- [ ] Log file created when log_path provided (integration test)
+
+### rebuild_tool_scope (MCP integration)
+- [x] Empty enabled_mcp_servers → no servers acquired
+- [ ] "all" → all configured servers acquired (requires real subprocess)
+- [ ] Comma-separated list → only listed servers acquired (requires real subprocess)
+- [ ] Mapping resolution: alias → actual server key(s) (requires real subprocess)
+- [ ] MCP meta functions appended for each started server (requires real subprocess)
+- [ ] Old ToolScope dropped (releasing old server handles) (requires real subprocess)
+- [ ] Loading spinner shown during acquisition (UI test)
+- [ ] AbortSignal properly threaded through (integration test)
+
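Resolving the `enabled_mcp_servers` value into concrete server names can be sketched as a pure function. This assumes the setting is an optional string and ignores alias mapping; `resolve_servers` is a hypothetical helper name.

```rust
/// Turn an `enabled_mcp_servers`-style value into the configured servers it selects.
/// `None` or an empty string selects nothing, "all" selects everything, and a
/// comma-separated list selects only names that are actually configured.
pub fn resolve_servers(enabled: Option<&str>, configured: &[&str]) -> Vec<String> {
    let Some(spec) = enabled.map(str::trim).filter(|s| !s.is_empty()) else {
        return Vec::new();
    };
    if spec == "all" {
        return configured.iter().map(|s| s.to_string()).collect();
    }
    spec.split(',')
        .map(str::trim)
        .filter(|name| configured.iter().any(|c| c == name))
        .map(str::to_string)
        .collect()
}
```

Unknown names are silently skipped here, which matches the "resolve all-nonexistent returns empty" item later in this plan; whether the real code also warns is not something this sketch models.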
+### Server lifecycle during scope transitions
+- [ ] Enter role with MCP: servers start (integration test)
+- [ ] Exit role: servers stop (handle dropped) (integration test)
+- [ ] Enter role A (MCP-X) → exit → enter role B (MCP-Y):
+      X stops, Y starts (integration test)
+- [ ] Enter role with MCP → exit to no MCP: servers stop,
+      global MCP restored (integration test)
+- [ ] Start REPL with global MCP → enter agent with different MCP:
+      agent MCP takes over (integration test)
+- [ ] Exit agent: agent MCP stops, global MCP restored (integration test)
+
+### MCP tool invocation chain
+- [ ] LLM calls mcp__search_<server> → search results returned (integration test)
+- [ ] LLM calls mcp__describe_<server> tool_name → schema returned (integration test)
+- [ ] LLM calls mcp__invoke_<server> tool args → tool executed (integration test)
+- [ ] Server not found → "MCP server not found in runtime" error (tested via McpRuntime.get)
+- [ ] Tool not found → appropriate error (requires live server)
+
+### MCP support flag
+- [x] mcp_server_support=false → no MCP servers started
+- [ ] mcp_server_support=false + agent with MCP → error (blocks) (requires agent init)
+- [ ] mcp_server_support=false + role with MCP → warning, continues (requires role init)
+- [ ] .set mcp_server_support true → MCP servers start (requires live server)
+
+### MCP in child agents
+- [ ] Child agent MCP servers acquired via factory (integration test)
+- [ ] Child agent MCP runtime populated (integration test)
+- [ ] Child agent MCP tool invocations work (integration test)
+- [ ] Child agent exit drops MCP handles (integration test)
+
+## Context switching scenarios (comprehensive)
+- [ ] No MCP → role with MCP → exit role → no MCP (integration test)
+- [ ] Global MCP-A → role MCP-B → exit role → global MCP-A (integration test)
+- [ ] Global MCP-A → agent MCP-B → exit agent → global MCP-A (integration test)
+- [ ] Role MCP-A → session MCP-B (overrides) → exit session (integration test)
+- [ ] Agent MCP → child agent MCP → child exits → parent MCP intact (integration test)
+- [ ] .set enabled_mcp_servers X → .set enabled_mcp_servers Y:
+      X released, Y acquired (integration test)
+- [ ] .set enabled_mcp_servers null → all released (integration test)
+
+## Additional behaviors tested (not in original plan)
+
+- [x] McpServerKey equality: same spec → equal keys
+- [x] McpServerKey inequality: different names → different keys
+- [x] McpServerKey inequality: different commands → different keys
+- [x] McpServerKey env coercion: Bool/Int → String
+- [x] McpFactory default has empty active map
+- [x] McpServer::is_remote() true for Http/Sse, false for Stdio
+- [x] McpServer::validate() all cross-field conflicts (6 cases)
+- [x] McpServersConfig: empty servers map, multiple servers, cwd field
+- [x] McpRegistry: default state, config accessor
+- [x] McpRegistry: resolve with whitespace trimming
+- [x] McpRegistry: resolve all-nonexistent returns empty
+- [x] rebuild_tool_scope: no mcp_config yields empty runtime
+- [x] rebuild_tool_scope: preserves tool_tracker across rebuild
+- [x] rebuild_tool_scope: REPL mode appends user interaction functions
+- [x] rebuild_tool_scope: CMD mode excludes user interaction functions
+- [x] MCP meta function name prefix constants are correct
+- [x] ToolScope default: empty functions, runtime, tracker
+
+## Old code reference
+- `src/mcp/mod.rs` — McpRegistry, init, reinit, start/stop
+- `src/config/mcp_factory.rs` — McpFactory, acquire, McpServerKey
+- `src/config/tool_scope.rs` — ToolScope, McpRuntime
+- `src/config/request_context.rs` — rebuild_tool_scope, bootstrap_tools
@@ -0,0 +1,85 @@
+# Test Plan: Tool Evaluation
+
+## Feature description
+
+When the LLM returns tool calls, `eval_tool_calls` dispatches each
+call to the appropriate handler. Handlers include: shell tools
+(bash/python/ts scripts), MCP tools, supervisor tools (agent spawn),
+todo tools, and user interaction tools.
+
+## Behaviors to test
+
+### eval_tool_calls dispatch
+- [ ] Calls dispatched to correct handler by function name prefix (requires RequestContext)
+- [ ] Tool results returned for each call (requires RequestContext)
+- [ ] Multiple concurrent tool calls processed (requires RequestContext)
+- [x] Tool call tracker updated (chain length, repeats)
+- [ ] Root agent (depth 0) checks escalation queue after eval (requires RequestContext)
+- [ ] Escalation notifications injected into results (requires RequestContext)
+
+### ToolCall::eval routing
+- [ ] agent__* → handle_supervisor_tool (requires RequestContext)
+- [ ] todo__* → handle_todo_tool (requires RequestContext)
+- [ ] user__* → handle_user_tool (depth 0) or escalate (depth > 0) (requires RequestContext)
+- [ ] mcp_invoke_* → invoke_mcp_tool (requires RequestContext + live MCP)
+- [ ] mcp_search_* → search_mcp_tools (requires RequestContext + live MCP)
+- [ ] mcp_describe_* → describe_mcp_tool (requires RequestContext + live MCP)
+- [ ] Other → shell tool execution (requires RequestContext + binary)
+
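The routing table above is a prefix match with a shell fallback. A minimal sketch, using the prefixes exactly as written in this list; the `Route` enum is hypothetical, and the depth-dependent escalation for `user__*` is left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    Supervisor,
    Todo,
    User,
    McpInvoke,
    McpSearch,
    McpDescribe,
    Shell,
}

/// Prefixes as listed in the routing checklist above.
const PREFIX_ROUTES: [(&str, Route); 6] = [
    ("agent__", Route::Supervisor),
    ("todo__", Route::Todo),
    ("user__", Route::User),
    ("mcp_invoke_", Route::McpInvoke),
    ("mcp_search_", Route::McpSearch),
    ("mcp_describe_", Route::McpDescribe),
];

/// Pick a handler by function-name prefix; anything else is a shell tool.
pub fn route(function_name: &str) -> Route {
    PREFIX_ROUTES
        .iter()
        .find(|(prefix, _)| function_name.starts_with(*prefix))
        .map(|(_, route)| *route)
        .unwrap_or(Route::Shell)
}
```

Keeping the routing decision separate from the handlers would let it be tested without a `RequestContext`, a binary, or a live MCP server.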
+### Shell tool execution
+- [ ] Tool binary found and executed (integration test)
+- [ ] Arguments passed correctly (integration test)
+- [ ] Environment variables set (LLM_OUTPUT, etc.) (integration test)
+- [ ] Tool output returned as result (integration test)
+- [ ] Tool failure → error returned as tool result (not panic) (integration test)
+
+### Tool call tracking
+- [x] Tracker counts consecutive identical calls
+- [x] Max repeats triggers warning
+- [x] Chain length tracked across turns
+- [x] Tracker state preserved across tool-result loops
+
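One way to count consecutive identical calls is a bounded history plus a trailing-run check. The sketch below is an assumption about the mechanism, not the real `ToolCallTracker`: the names, the `>` comparison against `max_repeats`, and the meaning of the capacity are all hypothetical.

```rust
use std::collections::VecDeque;

/// Hypothetical tracker: remembers the most recent calls as (name, args) pairs.
pub struct TrackerSketch {
    max_repeats: usize,
    capacity: usize,
    history: VecDeque<(String, String)>,
}

impl TrackerSketch {
    pub fn new(max_repeats: usize, capacity: usize) -> Self {
        Self { max_repeats, capacity: capacity.max(1), history: VecDeque::new() }
    }

    /// Record a call, evicting the oldest entry once the history is full.
    pub fn record_call(&mut self, name: &str, args: &str) {
        if self.history.len() == self.capacity {
            self.history.pop_front();
        }
        self.history.push_back((name.to_string(), args.to_string()));
    }

    /// Length of the trailing run of calls identical to the most recent one.
    pub fn trailing_repeats(&self) -> usize {
        let Some(last) = self.history.back() else { return 0 };
        self.history.iter().rev().take_while(|call| *call == last).count()
    }

    /// Assumed rule: a loop is flagged once the run exceeds `max_repeats`.
    pub fn is_looping(&self) -> bool {
        self.trailing_repeats() > self.max_repeats
    }

    pub fn len(&self) -> usize {
        self.history.len()
    }
}
```

A call with a different name or different arguments ends the trailing run, so the loop condition clears on the next check.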
+### Function selection
+- [ ] select_functions filters by role's enabled_tools (requires filesystem)
+- [x] select_functions includes MCP meta functions for enabled servers
+- [x] select_functions includes agent functions when agent active (via append tests)
+- [ ] "all" enables all functions (requires filesystem)
+- [ ] Comma-separated list enables specific functions (requires filesystem)
+
+## Context switching scenarios
+- [ ] Tool calls during agent → agent tools available (integration test)
+- [ ] Tool calls during role → role tools available (integration test)
+- [ ] Tool calls with MCP → MCP invoke/search/describe work (integration test)
+- [x] No agent → no agent__/todo__ tools in declarations (via Functions::default)
+
+## Additional behaviors tested (not in original plan)
+
+- [x] ToolCall::new sets name, arguments, id correctly
+- [x] ToolCall::default has empty/null fields
+- [x] ToolCall::with_thought_signature sets and clears
+- [x] ToolCall::dedup keeps last occurrence for duplicate ids
+- [x] ToolCall::dedup keeps all calls without ids
+- [x] ToolCall::dedup empty input returns empty
+- [x] ToolCall::dedup mixed with/without ids
+- [x] ToolCallTracker default values (max_repeats=2, chain_len=3)
+- [x] ToolCallTracker no loop on fresh tracker
+- [x] ToolCallTracker no loop below threshold
+- [x] ToolCallTracker different args breaks loop
+- [x] ToolCallTracker different names breaks loop
+- [x] ToolCallTracker record_call respects capacity
+- [x] ToolCallTracker loop message includes call_history
+- [x] All 6 prefix constants verified
+- [x] Functions::append_todo adds all 5 todo tools
+- [x] Functions::append_supervisor adds spawn/check/collect/list/cancel/reply + task queue
+- [x] Functions::append_teammate adds send_message/check_inbox
+- [x] Functions::append_user_interaction adds ask/confirm/input/checkbox
+- [x] Functions::append_mcp_meta creates 3 per server with correct schemas
+- [x] Functions::append_mcp_meta empty servers → no declarations
+- [x] Functions::find/contains work correctly
+- [x] ToolResult::new stores call and output
+
+## Old code reference
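The `ToolCall::dedup` rules (last occurrence wins for duplicate ids, calls without ids are all kept) can be sketched as follows. `ToolCallSketch` is a hypothetical two-field stand-in, and the output order (each surviving call keeps its original relative position) is an assumption.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq)]
pub struct ToolCallSketch {
    pub name: String,
    pub id: Option<String>,
}

/// Keep the last call for each id; keep every call that has no id.
pub fn dedup(calls: Vec<ToolCallSketch>) -> Vec<ToolCallSketch> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut kept: Vec<ToolCallSketch> = Vec::with_capacity(calls.len());
    // Walk backwards so the first time an id is seen is its last occurrence.
    for call in calls.into_iter().rev() {
        let keep = match &call.id {
            Some(id) => seen.insert(id.clone()),
            None => true,
        };
        if keep {
            kept.push(call);
        }
    }
    kept.reverse();
    kept
}
```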
+- `src/function/mod.rs` — eval_tool_calls, ToolCall::eval
+- `src/function/supervisor.rs` — handle_supervisor_tool
+- `src/function/todo.rs` — handle_todo_tool
+- `src/function/user_interaction.rs` — handle_user_tool
@@ -0,0 +1,88 @@
+# Test Plan: Input Construction
+
+## Feature description
+
+`Input` encapsulates a single chat turn's data: text, files, role,
+model, session context, RAG embeddings, and function declarations.
+It's constructed at the start of each turn and captures all needed
+state from `RequestContext`.
+
+## Behaviors to test
+
+### Input::from_str
+- [x] Creates Input from text string
+- [x] Captures role via resolve_role
+- [x] Captures session from ctx
+- [ ] Captures rag from ctx (requires RAG setup)
+- [ ] Captures functions via select_functions (tested separately)
+- [x] Captures stream_enabled from AppConfig
+- [x] app_config field set from ctx.app.config
+- [x] Empty text → is_empty() returns true
+
+### Input::from_files
+- [ ] Loads file contents (async + filesystem)
+- [ ] Supports multiple files (async + filesystem)
+- [ ] Supports directories (recursive) (async + filesystem)
+- [ ] Supports URLs (fetches content) (async + network)
+- [ ] Supports loader syntax (e.g., jina:url) (async + loader)
+- [x] Last message carry-over (%% syntax) (via resolve_paths)
+- [ ] Combines file content with text (async)
+- [ ] document_loaders from AppConfig used (async)
+
+### resolve_role
+- [x] Returns provided role if given
+- [ ] Extracts role from agent if agent active (requires agent init)
+- [x] Extracts role from session if session has role
+- [x] Returns default model-based role otherwise
+- [x] with_session flag set correctly
+- [x] with_agent flag set correctly
+
+### Input methods
+- [ ] stream() returns stream_enabled && !model.no_stream() (requires Model with no_stream)
+- [ ] create_client() uses app_config to init client (requires client config)
+- [ ] prepare_completion_data() uses captured functions (requires Model)
+- [ ] build_messages() uses captured session (requires Message setup)
+- [ ] echo_messages() uses captured session (requires Message setup)
+- [x] set_regenerate(role) refreshes role
+- [ ] use_embeddings() searches RAG if present (requires RAG)
+- [ ] merge_tool_results() creates continuation input (requires ToolResult)
+
+## Context switching scenarios
+- [ ] Input with agent → agent functions selected (requires agent init)
+- [x] Input with MCP → MCP meta functions in declarations (via select_functions tests)
+- [ ] Input with RAG → embeddings included after use_embeddings (requires RAG)
+- [x] Input without session → no session messages in build_messages (via session() test)
+
+## Additional behaviors tested (not in original plan)
+
+- [x] resolve_role: explicit role overrides session flag
+- [x] resolve_paths: empty input
+- [x] resolve_paths: URL detection (https://)
+- [x] resolve_paths: external command detection (backtick syntax)
+- [x] resolve_paths: rejects URL with glob suffix
+- [x] resolve_paths: mixed inputs (%%, URL, external cmd)
+- [x] Input::set_text changes text
+- [x] Input::patched_text overrides text()
+- [x] Input::clear_patch restores original
+- [x] Input::set_continue_output accumulates
+- [x] Input::summary truncates long text with ...
+- [x] Input::summary preserves short text
+- [x] Input::raw() with no files
+- [x] Input::render() with no medias
+- [x] Input::session() returns None when with_session=false
+- [x] Input::session() returns Some when with_session=true
+- [x] is_image recognizes png/jpeg/jpg/webp/gif
+- [x] is_image rejects non-image extensions
+- [x] resolve_data_url returns path for known hash
+- [x] resolve_data_url returns original for non-data URL
+- [x] select_functions: None when no tools enabled
+- [x] select_functions: None when function_calling disabled
+- [x] select_functions: "all" returns all non-MCP
+- [x] select_functions: comma-separated filters
+- [x] select_enabled_mcp_servers: empty when MCP disabled
+- [x] select_enabled_mcp_servers: "all" returns all MCP functions
+- [x] select_enabled_mcp_servers: comma filters by server name
+
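The `is_image` checks above come down to an extension test. A minimal sketch, assuming the match is on the file extension only and is case-insensitive (the case handling is an assumption; the extension list is from the checklist):

```rust
use std::path::Path;

/// True when the path ends in one of the image extensions named in the plan.
pub fn is_image(path: &str) -> bool {
    Path::new(path)
        .extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| {
            matches!(
                ext.to_ascii_lowercase().as_str(),
                "png" | "jpeg" | "jpg" | "webp" | "gif"
            )
        })
        .unwrap_or(false)
}
```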
+## Old code reference
+- `src/config/input.rs` — Input struct, from_str, from_files
+- `src/config/mod.rs` — select_functions, extract_role
@@ -0,0 +1,87 @@
+# Test Plan: RequestContext
+
+## Feature description
+
+`RequestContext` is the per-request mutable state container. It holds
+the active model, role, session, agent, RAG, tool scope, and agent
+runtime. It provides methods for scope transitions, state queries,
+and chat completion lifecycle.
+
+## Behaviors to test
+
+### State management
+- [ ] info() returns formatted system info (requires model provider config)
+- [x] state() returns correct StateFlags combination
+- [ ] current_model() returns active model (tested implicitly via extract_role)
+- [x] role_info() errors when no role, succeeds with role
+- [ ] session_info() format (requires filesystem for sessions)
+- [x] rag_info() errors when no rag
+- [x] agent_info() errors when no agent
+- [ ] sysinfo() returns system details (requires model provider config)
+- [x] working_mode correctly distinguishes Repl vs Cmd
+
+### Scope transitions
+- [x] use_role changes role (via use_role_obj)
+- [ ] use_session creates/loads session, rebuilds tool scope (async + filesystem)
+- [x] use_agent initializes agent with all subsystems (via exit_agent test)
+- [x] exit_role clears role
+- [x] exit_session saves and clears session
+- [x] exit_agent clears agent, supervisor, rag, session
+- [x] exit_rag clears rag
+- [ ] bootstrap_tools rebuilds tool scope with global MCP (async + MCP servers)
+
+### Chat completion lifecycle
+- [x] before_chat_completion sets up for API call
+- [ ] after_chat_completion saves messages, updates state (async + client)
+- [x] discontinuous_last_message marks last message as non-continuous
+
+### ToolScope management
+- [x] rebuild_tool_scope creates fresh Functions
+- [ ] rebuild_tool_scope acquires MCP servers via factory (requires live MCP)
+- [x] rebuild_tool_scope appends user interaction functions in REPL mode
+- [ ] rebuild_tool_scope appends MCP meta functions for started servers (requires live MCP)
+- [x] Tool tracker preserved across scope rebuilds
+
+### AgentRuntime management
+- [x] agent_runtime populated by use_agent (via exit_agent test)
+- [x] agent_runtime cleared by exit_agent
+- [x] Accessor methods (current_depth, supervisor, inbox, etc.) return
+      correct values when agent active
+- [x] Accessor methods return defaults when no agent
+
+### Settings update
+- [ ] update() handles all .set keys correctly (requires REPL command infra)
+- [x] update_app_config() clones and replaces Arc properly
+- [ ] delete() handles all delete subcommands (requires REPL command infra)
+
+### Session helpers
+- [ ] list_sessions() returns session names (requires filesystem)
+- [ ] list_autoname_sessions() returns auto-named sessions (requires filesystem)
+- [x] session_file() returns correct path
+- [ ] save_session() persists session (requires filesystem)
+- [x] empty_session() clears messages
+
+## Context switching scenarios
+- [x] No state → use_role → exit_role → no state
+- [x] No state → use_agent → exit_agent → no state
+- [x] Agent active → use_role_obj errors
+- [ ] Agent → exit_agent → use_role (clean transition) (async)
+
+## Additional behaviors tested (not in original plan)
+
+- [x] state() empty context returns empty flags
+- [x] state() role only → ROLE flag
+- [x] state() empty session → SESSION_EMPTY flag
+- [x] state() role + session flags combine
+- [x] discontinuous_last_message noop when no last_message
+- [x] before_chat_completion creates LastMessage with empty output and continuous=true
+- [x] role_like_mut returns None when no active scope
+- [x] role_like_mut returns role when only role active
+- [x] role_like_mut prefers session over role
+- [x] session_file handles subdir/name format
+- [x] is_compressing_session false with no session
+- [x] is_compressing_session false with default session
+
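The `state()` items above describe flags that combine per active scope. The sketch below hand-rolls a small bit set to show the combination rules; the flag names follow this plan, while the bit positions and the `ContextSketch` fields are hypothetical.

```rust
use std::ops::BitOr;

/// Hand-rolled bit set; bit positions are arbitrary.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct StateFlags(u8);

impl StateFlags {
    pub const ROLE: StateFlags = StateFlags(1 << 0);
    pub const SESSION_EMPTY: StateFlags = StateFlags(1 << 1);
    pub const SESSION: StateFlags = StateFlags(1 << 2);
    pub const RAG: StateFlags = StateFlags(1 << 3);
    pub const AGENT: StateFlags = StateFlags(1 << 4);

    pub const fn empty() -> Self {
        StateFlags(0)
    }

    pub fn contains(self, other: StateFlags) -> bool {
        (self.0 & other.0) == other.0
    }
}

impl BitOr for StateFlags {
    type Output = StateFlags;
    fn bitor(self, rhs: StateFlags) -> StateFlags {
        StateFlags(self.0 | rhs.0)
    }
}

/// Hypothetical summary of which scopes are active.
#[derive(Debug, Default)]
pub struct ContextSketch {
    pub has_role: bool,
    pub session_messages: Option<usize>,
    pub has_rag: bool,
    pub has_agent: bool,
}

impl ContextSketch {
    /// One flag per active scope; an empty session gets its own flag.
    pub fn state(&self) -> StateFlags {
        let mut flags = StateFlags::empty();
        if self.has_role {
            flags = flags | StateFlags::ROLE;
        }
        match self.session_messages {
            Some(0) => flags = flags | StateFlags::SESSION_EMPTY,
            Some(_) => flags = flags | StateFlags::SESSION,
            None => {}
        }
        if self.has_rag {
            flags = flags | StateFlags::RAG;
        }
        if self.has_agent {
            flags = flags | StateFlags::AGENT;
        }
        flags
    }
}
```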
+## Old code reference
+- `src/config/request_context.rs` — all methods
+- `src/config/mod.rs` — original Config methods (for parity)
@@ -0,0 +1,92 @@
+# Test Plan: REPL Commands
+
+## Feature description
+
+The REPL processes dot-commands (`.role`, `.session`, `.agent`, etc.)
+and plain text (chat messages). Each command has state assertions
+(e.g., `.info role` requires an active role).
+
+## Behaviors to test
+
+### Command parsing
+- [x] Dot-commands parsed correctly (command + args)
+- [x] Multi-line input (:::) handled (regex)
+- [x] Plain text treated as chat message (parse_command returns None)
+- [x] Empty input ignored (parse_command returns None)
+
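The parsing cases above can be illustrated without a regex. This is a sketch under assumptions: the real parser may use a regex and may treat a lone `.` differently, and the signatures of `split_first_arg` and `parse_command` here are guesses made for illustration.

```rust
/// Split "head rest of line" into ("head", Some("rest of line")).
/// Returns None for missing or blank input.
pub fn split_first_arg(args: Option<&str>) -> Option<(&str, Option<&str>)> {
    let args = args?.trim();
    if args.is_empty() {
        return None;
    }
    match args.split_once(char::is_whitespace) {
        Some((head, rest)) => {
            let rest = rest.trim();
            Some((head, if rest.is_empty() { None } else { Some(rest) }))
        }
        None => Some((args, None)),
    }
}

/// Return (command, args) for dot-commands; None means "not a command",
/// so the caller can treat plain text as chat and skip empty input.
pub fn parse_command(line: &str) -> Option<(&str, Option<&str>)> {
    let line = line.trim();
    if !line.starts_with('.') {
        return None;
    }
    split_first_arg(Some(line))
}
```

With these helpers, `.role coder write hello` parses to the command `.role` with args `coder write hello`, and a second `split_first_arg` call on the args separates the role name from the one-shot text.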
+### State assertions (REPL_COMMANDS array)
+- [x] Each command's assert_state enforced correctly
+- [x] Invalid state → command rejected (via is_valid)
+- [x] Commands with AssertState::pass() always available
+
+### Command handlers (each one)
+- [ ] .help — prints help text
+- [ ] .info [subcommand] — displays appropriate info
+- [ ] .model <name> — switches model
+- [ ] .prompt <text> — sets temp role
+- [ ] .role <name> [text] — enters role or one-shot
+- [ ] .session [name] — starts/resumes session
+- [ ] .agent <name> [session] [key=value] — starts agent
+- [ ] .rag [name] — initializes RAG
+- [ ] .starter [n] — lists or executes conversation starter
+- [ ] .set <key> <value> — updates setting
+- [ ] .delete <type> — deletes item
+- [ ] .exit [type] — exits scope or REPL
+- [ ] .save role/session [name] — saves to file
+- [ ] .edit role/session/config/agent-config/rag-docs — opens editor
+- [ ] .empty session — clears session
+- [ ] .compress session — compresses session
+- [ ] .rebuild rag — rebuilds RAG
+- [ ] .sources rag — shows RAG sources
+- [ ] .copy — copies last response
+- [ ] .continue — continues response
+- [ ] .regenerate — regenerates response
+- [ ] .file <path> [-- text] — includes files
+- [ ] .macro <name> [text] — runs/creates macro
+- [ ] .authenticate — OAuth flow
+- [ ] .vault <cmd> [name] — vault operations
+- [ ] .clear todo — clears agent todo
+
+### ask function (chat flow)
+- [ ] Input constructed from text
+- [ ] Embeddings applied if RAG active
+- [ ] Waits for compression to complete
+- [ ] before_chat_completion called
+- [ ] Streaming vs non-streaming based on config
+- [ ] Tool results loop (recursive ask with merged results)
+- [ ] after_chat_completion called
+- [ ] Auto-continuation for agents with todos
+
+## Additional behaviors tested (not in original plan)
+
+- [x] AssertState::pass() always returns true (all flag combos)
+- [x] AssertState::bare() only matches empty flags
+- [x] AssertState::True requires any matching flag present
+- [x] AssertState::True with multiple flags — any match suffices
+- [x] AssertState::False requires all specified flags absent
+- [x] AssertState::False with multiple flags (all must be absent)
+- [x] AssertState::TrueFalse — true present AND false absent
+- [x] AssertState::Equal — exact flag match
+- [x] REPL_COMMANDS has exactly 39 entries
+- [x] All commands start with '.'
+- [x] All commands have non-empty descriptions
+- [x] .help, .exit always available (pass)
+- [x] .info role requires ROLE
+- [x] .session blocked when already in session
+- [x] .exit session requires session
+- [x] .exit agent requires agent
+- [x] .agent only when bare (no role/session/agent)
+- [x] .role blocked in session/agent
+- [x] .prompt blocked in session/agent
+- [x] .rag blocked in agent
+- [x] .starter requires agent
+- [x] .clear todo requires agent
+- [x] .edit role requires ROLE, blocked in SESSION
+- [x] .exit rag requires RAG, blocked in AGENT
+- [x] split_first_arg: None, single word, two words, extra spaces
+- [x] parse_command: plain text, empty, whitespace, dot only
+- [x] ReplCommand::is_valid with pass/True/False
+- [x] Multiline regex: captures content, rejects unclosed, rejects plain text
+
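The `AssertState` semantics listed above can be summarized in a short sketch. The variant names come from the checklist; everything else is an assumption for illustration: flags are modeled as plain `u32` bit masks, the flag constants and the `holds` method name are hypothetical, and `pass()`/`bare()` are built from `False(0)` and `Equal(0)` simply because that reproduces the listed behavior.

```rust
// Hypothetical state flags, modeled as plain bit masks for this sketch.
pub const ROLE: u32 = 1 << 0;
pub const SESSION: u32 = 1 << 1;
pub const RAG: u32 = 1 << 2;
pub const AGENT: u32 = 1 << 3;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AssertState {
    /// At least one of the given flags must be present.
    True(u32),
    /// All of the given flags must be absent.
    False(u32),
    /// At least one "true" flag present AND every "false" flag absent.
    TrueFalse(u32, u32),
    /// The current flags must match exactly.
    Equal(u32),
}

impl AssertState {
    /// Always satisfied: no flag is required to be absent.
    pub fn pass() -> Self {
        AssertState::False(0)
    }

    /// Satisfied only when no flag is set.
    pub fn bare() -> Self {
        AssertState::Equal(0)
    }

    /// Checks the assertion against the current flags (hypothetical method name).
    pub fn holds(self, flags: u32) -> bool {
        match self {
            AssertState::True(t) => (flags & t) != 0,
            AssertState::False(f) => (flags & f) == 0,
            AssertState::TrueFalse(t, f) => (flags & t) != 0 && (flags & f) == 0,
            AssertState::Equal(e) => flags == e,
        }
    }
}
```

For example, the checklist item ".edit role requires ROLE, blocked in SESSION" corresponds to `AssertState::TrueFalse(ROLE, SESSION)` in this model: it holds for `ROLE` alone and fails for `ROLE | SESSION`.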
+## Old code reference
+- `src/repl/mod.rs` — run_repl_command, ask, REPL_COMMANDS
@@ -0,0 +1,67 @@
+# Test Plan: CLI Flags
+
+## Feature description
+
+Loki CLI accepts flags for model, role, session, agent, file input,
+execution mode, and various info/list commands. Flags determine
+the execution path through `main.rs`.
+
+## Behaviors to test
+
+### Early-exit flags
+- [x] --info parsed correctly
+- [x] --list-models parsed correctly
+- [x] --list-roles parsed correctly
+- [x] --list-sessions parsed correctly
+- [x] --list-agents parsed correctly
+- [x] --list-rags parsed correctly
+- [x] --list-macros parsed correctly
+- [x] --sync-models parsed correctly
+- [x] --build-tools parsed correctly
+- [ ] --authenticate runs OAuth and exits (integration)
+- [ ] --completions generates shell completions and exits (integration)
+- [x] Vault flags (--add/get/update/delete-secret, --list-secrets) parsed
+
+### Mode selection
+- [x] No text/file → text returns None (REPL indicator)
+- [x] Text provided → text joined and returned
+- [x] --agent → agent field set
+- [x] --role → role field set
+- [x] --execute (-e) → execute flag set
+- [x] --code (-c) → code flag set
+- [x] --prompt → prompt field set
+- [x] --macro → macro_name field set
+
+### Flag combinations
+- [x] --model + --role parsed together
+- [x] --session + --role parsed together
+- [ ] --session + --agent → agent with session (integration)
+- [ ] --agent + --agent-variable → variables set (integration)
+- [x] --dry-run flag parsed
+- [x] --no-stream (-S) flag parsed
+- [x] --file + text → both parsed
+- [x] --empty-session + --session parsed
+- [x] --save-session + --session parsed
+
+### Prelude
+- [ ] apply_prelude runs before main execution (async + filesystem)
+- [ ] Prelude "role:name" loads role (async + filesystem)
+- [ ] Prelude "session:name" loads session (async + filesystem)
+- [ ] Prelude "session:role" loads both (async + filesystem)
+- [ ] Prelude skipped if macro_flag set (async)
+- [ ] Prelude skipped if state already has role/session/agent (async)
+
+## Additional behaviors tested (not in original plan)
+
+- [x] Default Cli has all flags unset/empty
+- [x] Short flags: -m, -r, -a, -s, -e, -c, -S, -f
+- [x] Multiple -f flags accumulate
+- [x] Trailing text args collected as vec
+- [x] Cli::text() returns None with no args (terminal stdin)
+- [x] Cli::text() joins trailing args with spaces
+- [x] --rag flag parsed
+- [x] --macro flag parsed
+
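The `Cli::text()` behavior listed above (no arguments gives `None`, trailing arguments are joined with spaces) can be sketched as below. The struct here is a hypothetical stand-in containing only the trailing-text field; the real `Cli` has many more flags, and any stdin handling is deliberately left out of this sketch.

```rust
/// Hypothetical stand-in for the trailing-text part of the `Cli` struct.
#[derive(Debug, Default)]
pub struct Cli {
    /// Trailing positional arguments, collected as a vector.
    pub text: Vec<String>,
}

impl Cli {
    /// Returns `None` when no text was given (the REPL indicator),
    /// otherwise the arguments joined with single spaces.
    pub fn text(&self) -> Option<String> {
        if self.text.is_empty() {
            None
        } else {
            Some(self.text.join(" "))
        }
    }
}
```

With this sketch, `Cli::default().text()` is `None`, and a `Cli` holding `["hello", "world"]` returns `Some("hello world")`.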
+## Old code reference
+- `src/cli/mod.rs` — Cli struct, flag definitions
+- `src/main.rs` — run(), flag processing, mode branching
@@ -0,0 +1,106 @@
+# Test Plan: Sub-Agent Spawning
+
+## Feature description
+
+Agents with `can_spawn_agents=true` can spawn child agents that run
+in parallel as background tokio tasks. Children communicate results
+back to the parent via collect/check. Escalation allows children
+to request user input through the parent.
+
+## Behaviors to test
+
+### Spawn
+- [ ] agent__spawn creates child agent in background (requires agent config on disk)
+- [x] Child gets own RequestContext with incremented depth (new_for_child)
+- [x] Child starts with empty scope (new_for_child)
+- [x] Child gets shared root_escalation_queue (new_for_child)
+- [x] Child gets inbox for teammate messaging (new_for_child)
+- [x] Child inherits parent_supervisor (new_for_child)
+- [ ] Child MCP servers acquired if configured (requires live MCP)
+- [x] Max concurrent agents enforced (Supervisor.register)
+- [x] Max depth enforced (Supervisor.register)
+- [ ] Agent not found → error (requires agent config on disk)
+- [ ] can_spawn_agents=false → no spawn tools available (requires agent init)
+
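The two registration limits above (maximum concurrent agents and maximum depth) can be illustrated with a small sketch. The field names, the `String` error type, and the use of agent ids as plain strings are assumptions; the only behaviors taken from the plan are that both limits are enforced in `register`, that registration at exactly `max_depth` is allowed, and that taking a nonexistent agent yields nothing.

```rust
/// Hypothetical, simplified supervisor that only tracks agent ids.
#[derive(Debug)]
pub struct Supervisor {
    max_concurrent: usize,
    max_depth: usize,
    agents: Vec<String>,
}

impl Supervisor {
    pub fn new(max_concurrent: usize, max_depth: usize) -> Self {
        Supervisor { max_concurrent, max_depth, agents: Vec::new() }
    }

    /// Registers an agent, rejecting it when either limit would be exceeded.
    pub fn register(&mut self, id: &str, depth: usize) -> Result<(), String> {
        if self.agents.len() >= self.max_concurrent {
            return Err(format!("max concurrent agents ({}) reached", self.max_concurrent));
        }
        // A depth equal to max_depth is still allowed (the boundary case).
        if depth > self.max_depth {
            return Err(format!("max depth ({}) exceeded", self.max_depth));
        }
        self.agents.push(id.to_string());
        Ok(())
    }

    /// Removes and returns the agent id, or `None` if it is not registered.
    pub fn take(&mut self, id: &str) -> Option<String> {
        let pos = self.agents.iter().position(|a| a == id)?;
        Some(self.agents.remove(pos))
    }
}
```

In this model, taking an agent frees a slot, so a registration that failed because the supervisor was full can succeed afterwards.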
+### Collect/Check
+- [x] agent__check returns PENDING for running agent
+- [x] agent__check returns error for unknown agent
+- [ ] agent__collect blocks until done, returns output (requires real child completion)
+- [ ] Output summarization when exceeds threshold (requires LLM client)
+- [ ] Summarization uses configured model (requires LLM client)
+
+### Task queue (handler integration tests)
+- [x] handle_task_create creates tasks (simple, with deps, with dispatch_agent)
+- [x] handle_task_create errors when agent set without prompt
+- [x] handle_task_complete unblocks dependents
+- [x] handle_task_list shows all tasks
+- [x] handle_task_fail marks failed and reports blocked dependents
+- [x] handle_task_fail returns error for missing task
+
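The dependency behavior in the task-queue checks above (completing a task unblocks its dependents, a dependency on a nonexistent task is an error) can be sketched as follows. All names, the `u32` ids, and the `String` errors are hypothetical; the sketch also treats completing a missing task as an error, which is an assumption, and omits failure handling and dispatch fields.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskStatus {
    Pending,
    Completed,
}

#[derive(Debug)]
pub struct TaskNode {
    pub status: TaskStatus,
    /// Ids of unfinished tasks this task is waiting on.
    pub blocked_by: BTreeSet<u32>,
}

impl TaskNode {
    /// Runnable means pending and not blocked by any other task.
    pub fn is_runnable(&self) -> bool {
        self.status == TaskStatus::Pending && self.blocked_by.is_empty()
    }
}

#[derive(Debug, Default)]
pub struct TaskQueue {
    tasks: BTreeMap<u32, TaskNode>,
    next_id: u32,
}

impl TaskQueue {
    /// Creates a task; every dependency must already exist.
    pub fn create(&mut self, deps: &[u32]) -> Result<u32, String> {
        if let Some(missing) = deps.iter().find(|d| !self.tasks.contains_key(*d)) {
            return Err(format!("dependency {missing} does not exist"));
        }
        // Only dependencies that are not yet completed block the new task.
        let blocked_by: BTreeSet<u32> = deps
            .iter()
            .copied()
            .filter(|d| self.tasks[d].status != TaskStatus::Completed)
            .collect();
        self.next_id += 1;
        let id = self.next_id;
        self.tasks.insert(id, TaskNode { status: TaskStatus::Pending, blocked_by });
        Ok(id)
    }

    /// Marks a task completed and returns the ids that became runnable.
    pub fn complete(&mut self, id: u32) -> Result<Vec<u32>, String> {
        let task = self.tasks.get_mut(&id).ok_or_else(|| format!("task {id} not found"))?;
        task.status = TaskStatus::Completed;
        let mut unblocked = Vec::new();
        for (other_id, node) in self.tasks.iter_mut() {
            if node.blocked_by.remove(&id) && node.is_runnable() {
                unblocked.push(*other_id);
            }
        }
        Ok(unblocked)
    }

    pub fn get(&self, id: u32) -> Option<&TaskNode> {
        self.tasks.get(&id)
    }
}
```

Storing only the unfinished dependencies in `blocked_by` keeps `is_runnable` a cheap local check, at the cost of a scan over all tasks on completion.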
+### Escalation (handler integration tests)
+- [x] handle_reply_escalation delivers reply via oneshot channel
+- [x] handle_reply_escalation errors for missing escalation_id
+- [x] handle_reply_escalation errors when no queue
+- [x] Pending summary contains correct fields
+- [x] Reply reaches receiver via oneshot channel
+- [ ] Escalation timeout → fallback message (requires tokio timeout)
+
+### Teammate messaging (handler integration tests)
+- [x] handle_send_message delivers to registered agent's inbox
+- [x] handle_send_message errors for unknown agent
+- [x] handle_check_inbox returns messages with count
+- [x] handle_check_inbox returns empty when no inbox
+- [x] handle_check_inbox returns empty for empty inbox
+
+### Cancel/List (handler integration tests)
+- [x] handle_list returns empty for fresh supervisor
+- [x] handle_list returns registered agents
+- [x] handle_list errors when no supervisor
+- [x] handle_cancel removes agent and signals abort
+- [x] handle_cancel errors for unknown agent
+- [x] handle_cancel errors when no supervisor
+
+### Dispatch routing
+- [x] Unknown action → error with "Unknown supervisor action"
+- [x] agent__list routes to handle_list
+- [x] agent__task_list routes to handle_task_list
+
+### Child agent lifecycle
+- [ ] run_child_agent loops (requires LLM client)
+- [ ] Child uses before/after_chat_completion (requires LLM client)
+- [ ] Child tool calls evaluated (requires LLM client)
+- [ ] Child exits cleanly (requires LLM client)
+
+## Context switching scenarios
+- [ ] Parent spawns child with MCP (requires live MCP + agent config)
+- [ ] Parent exits agent → all children cancelled (requires agent init)
+- [x] Multiple children share escalation queue (new_for_child + ensure_root_escalation_queue)
+
+## Additional behaviors tested (not in original plan)
+
+- [x] EscalationQueue: default, submit, take, take_nonexistent, has_pending
+- [x] EscalationQueue: pending_summary with/without options, empty
+- [x] EscalationQueue: reply via oneshot channel
+- [x] new_escalation_id: prefix and uniqueness
+- [x] Inbox: new/default empty, deliver+drain, drain empties, multiple deliveries
+- [x] Inbox: clone preserves messages, clone is independent
+- [x] Supervisor: new defaults, register count, take removes, take nonexistent
+- [x] Supervisor: inbox accessor, list_agents, task_queue accessible
+- [x] Supervisor: register allows at max_depth boundary
+- [x] AgentExitStatus: equality/inequality
+- [x] TaskQueue: fail sets status, get missing returns None
+- [x] TaskQueue: dispatch_agent/prompt stored, claim blocked fails
+- [x] TaskQueue: list sorted by id, default empty
+- [x] TaskQueue: dependency on nonexistent errors, complete nonexistent
+- [x] TaskNode: is_runnable when pending+unblocked, not when blocked
+
+## Integration handler tests added
+
+- [x] All handle_* functions tested via handler integration tests (36 tests)
+- [x] new_for_child: depth, id, inbox, escalation queue, parent supervisor, empty scope
+- [x] ensure_root_escalation_queue: lazy init, same Arc on repeated calls
+- [x] AppState::test_default() helper added for cross-module test construction
+
+## Old code reference
+- `src/function/supervisor.rs` — all handler functions
+- `src/supervisor/` — Supervisor, EscalationQueue, Inbox, TaskQueue
@@ -0,0 +1,33 @@
+# Test Plan: RAG
+
+## Behaviors to test
+- [ ] Rag::init creates new RAG with embedding model (requires LLM client)
+- [ ] Rag::load loads existing RAG from disk (requires filesystem)
+- [ ] Rag::create builds vector store from documents (requires embedding model)
+- [ ] Rag::refresh_document_paths updates document list (requires filesystem)
+- [ ] RAG search returns relevant embeddings (requires embedding model)
+- [x] RAG template contains required placeholders
+- [ ] Reranker model applied when configured (requires LLM client)
+- [ ] top_k controls number of results (requires embedding model)
+- [ ] RAG sources tracked for .sources command (requires full Rag struct)
+- [x] exit_rag clears RAG from context (tested in iteration 8)
+
+## Additional behaviors tested
+
+- [x] DocumentId: new/split round-trip, zero/zero, large values
+- [x] DocumentId: Debug format ("file-doc"), equality, inequality, ordering
+- [x] RagDocument: new with content, default empty
+- [x] RagData: new sets all defaults, empty collections
+- [x] RagData::get: returns document, None for missing file, None for missing doc index
+- [x] RagData::del: removes files + associated vectors, noop for nonexistent
+- [x] RagData::add: inserts files, vectors, updates next_file_id
+- [x] RagData::build_bm25: empty data returns no results
+- [x] RagData::build_bm25: finds documents by keyword (BM25 ranking)
+- [x] RAG_TEMPLATE: contains __CONTEXT__, __SOURCES__, __INPUT__
+- [x] get_separators: Rust/Python/Markdown return language-specific
+- [x] get_separators: unknown extension returns defaults
+- [x] get_separators: all 22 known extensions have language-specific separators
+
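The `DocumentId` checks above (new/split round-trip, a `"file-doc"` Debug format, ordering) suggest an id that packs a file index and a document index into one integer. The sketch below shows one way to do that; the 32/32 bit split, the `u64` backing type, and the `u32` index types are assumptions, not details taken from the source.

```rust
use std::fmt;

/// Assumed layout: the low 32 bits hold the document index.
const DOC_BITS: u32 = 32;

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct DocumentId(u64);

impl DocumentId {
    /// Packs a file index and a document index into a single id.
    pub fn new(file_index: u32, document_index: u32) -> Self {
        DocumentId(((file_index as u64) << DOC_BITS) | document_index as u64)
    }

    /// Recovers the (file index, document index) pair.
    pub fn split(self) -> (u32, u32) {
        ((self.0 >> DOC_BITS) as u32, (self.0 & 0xFFFF_FFFF) as u32)
    }
}

impl fmt::Debug for DocumentId {
    /// Prints the id as "file-doc", e.g. "3-7".
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let (file, doc) = self.split();
        write!(f, "{file}-{doc}")
    }
}
```

Because the file index occupies the high bits, the derived ordering sorts first by file and then by document within a file.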
+## Old code reference
+- `src/rag/mod.rs` — Rag struct and methods
+- `src/config/request_context.rs` — use_rag, edit_rag_docs, rebuild_rag
@@ -0,0 +1,35 @@
+# Test Plan: Tab Completion and Prompt
+
+## Behaviors to test
+
+### Tab completion (repl_complete)
+- [ ] .role<TAB> → role names (no hidden files)
+- [ ] .agent<TAB> → agent names (no .shared)
+- [ ] .session<TAB> → session names
+- [ ] .rag<TAB> → RAG names
+- [ ] .macro<TAB> → macro names
+- [ ] .model<TAB> → model names with descriptions
+- [ ] .set <TAB> → setting keys (sorted)
+- [ ] .set temperature <TAB> → current value suggestions
+- [ ] .set enabled_tools <TAB> → tool names (no internal tools)
+- [ ] .set enabled_mcp_servers <TAB> → configured servers + aliases
+- [ ] .delete <TAB> → type names
+- [ ] .vault <TAB> → subcommands
+- [ ] .agent <name> <TAB> → session names for that agent
+- [ ] Fuzzy filtering applied to all completions
+
+### Prompt rendering
+- [ ] Left prompt shows role/session/agent name
+- [ ] Right prompt shows model name
+- [ ] Prompt updates after scope transitions
+- [ ] Multi-line indicator shown during ::: input
+
+## Status
+Most completion logic requires filesystem access for role/session/agent lists.
+The `split_line` function has existing tests. Prompt rendering methods are trivial
+wrappers around stored strings, so additional unit tests would yield little.
+
+## Old code reference
+- `src/config/request_context.rs` — repl_complete
+- `src/repl/completer.rs` — ReplCompleter (split_line already tested)
+- `src/repl/prompt.rs` — ReplPrompt
@@ -0,0 +1,24 @@
+# Test Plan: Macros
+
+## Behaviors to test
+- [ ] Macro loaded from YAML file (requires filesystem)
+- [ ] Macro steps executed sequentially (requires async + RequestContext)
+- [ ] Each step runs through run_repl_command (requires async)
+- [x] Variable interpolation in macro steps
+- [ ] Built-in macros installed on first run (requires filesystem)
+- [ ] macro_execute creates isolated RequestContext (requires async)
+- [ ] Macro context inherits tool scope from parent (requires async)
+- [ ] Macro context has macro_flag set (requires async)
+
+## Additional behaviors tested
+
+- [x] resolve_variables: no variables, required provided, required missing errors
+- [x] resolve_variables: default used, default overridden
+- [x] resolve_variables: rest captures remaining args, rest with default
+- [x] resolve_variables: multiple variables mixed
+- [x] usage: no variables, required, optional, rest, rest+default, mixed
+- [x] interpolate_command: single, multiple, no vars, missing var passthrough
+- [x] YAML deserialization: with variables, with defaults, no variables
+
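The `interpolate_command` cases above (single, multiple, no variables, missing variable passthrough) can be sketched as below. The `{{name}}` placeholder syntax, the `HashMap` parameter, and the exact signature are assumptions for illustration; the plan only names the function and the cases it is tested against.

```rust
use std::collections::HashMap;

/// Replaces each `{{name}}` placeholder with its value. Placeholders whose
/// variable is unknown are passed through unchanged.
pub fn interpolate_command(command: &str, vars: &HashMap<String, String>) -> String {
    let mut out = String::with_capacity(command.len());
    let mut rest = command;
    while let Some(start) = rest.find("{{") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find("}}") {
            Some(end) => {
                let name = after[..end].trim();
                match vars.get(name) {
                    Some(value) => out.push_str(value),
                    // Unknown variable: keep the placeholder text as written.
                    None => out.push_str(&rest[start..start + 2 + end + 2]),
                }
                rest = &after[end + 2..];
            }
            None => {
                // Unclosed placeholder: copy the remainder verbatim.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

Passing unknown placeholders through unchanged, rather than erroring, matches the "missing var passthrough" case and leaves required-variable validation to a separate step such as `resolve_variables`.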
+## Old code reference
+- `src/config/macros.rs` — macro_execute, Macro struct
@@ -0,0 +1,25 @@
+# Test Plan: Vault
+
+## Behaviors to test
+- [ ] Vault add stores encrypted secret (requires terminal + password file)
+- [ ] Vault get decrypts and returns secret (requires password file)
+- [ ] Vault update replaces secret value (requires terminal + password file)
+- [ ] Vault delete removes secret (requires password file)
+- [ ] Vault list shows all secret names (requires password file)
+- [ ] Secrets interpolated in MCP config (mcp.json) (requires Vault with secrets)
+- [ ] Missing secrets produce warning during MCP init (requires Vault)
+- [x] Vault accessible from CLI (flag parsing tested in iteration 10)
+- [ ] Vault accessible from REPL (.vault commands) (requires REPL infra)
+
+## Additional behaviors tested
+
+- [x] SECRET_RE matches {{DOUBLE_BRACES}}
+- [x] SECRET_RE matches with surrounding text
+- [x] SECRET_RE does not match {SINGLE_BRACES}
+- [x] SECRET_RE does not match plain text
+- [x] SECRET_RE matches with spaces inside braces
+- [x] Vault::default() creates instance with no password file
+
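The secret-matching rules above (double braces match, spaces inside the braces are tolerated, single braces and plain text do not match) can be combined with the "missing secrets produce a warning" behavior in one sketch. Note the substitution: this uses a hand-rolled scan instead of the `SECRET_RE` regex so that it needs only the standard library, and the function signature and return shape are hypothetical.

```rust
use std::collections::HashMap;

/// Replaces `{{NAME}}` with the matching secret. Returns the resulting text
/// plus the names of secrets that were referenced but not found, so the
/// caller can emit a warning for each.
pub fn interpolate_secrets(
    input: &str,
    secrets: &HashMap<String, String>,
) -> (String, Vec<String>) {
    let mut out = String::new();
    let mut missing = Vec::new();
    let mut rest = input;
    while let Some(start) = rest.find("{{") {
        // Stop at an unclosed "{{"; the remainder is copied verbatim below.
        let Some(len) = rest[start + 2..].find("}}") else { break };
        let end = start + 2 + len; // index of the closing "}}"
        let name = rest[start + 2..end].trim(); // tolerate spaces inside braces
        out.push_str(&rest[..start]);
        match secrets.get(name) {
            Some(value) => out.push_str(value),
            None => {
                missing.push(name.to_string());
                out.push_str(&rest[start..end + 2]);
            }
        }
        rest = &rest[end + 2..];
    }
    out.push_str(rest);
    (out, missing)
}
```

Single-brace text such as `{API_KEY}` never contains `{{`, so it is returned untouched and reports nothing as missing.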
+## Old code reference
+- `src/vault/mod.rs` — GlobalVault, operations
+- `src/mcp/mod.rs` — interpolate_secrets
@@ -0,0 +1,57 @@
+# Test Plan: Functions and Tools
+
+## Behaviors to test
+
+### Function declarations
+- [ ] Functions::init loads from visible_tools config
+- [ ] Tool declarations parsed from bash scripts (argc annotations)
+- [ ] Tool declarations parsed from python scripts (docstrings)
+- [ ] Tool declarations parsed from typescript (JSDoc + type inference)
+- [ ] Each declaration has name, description, parameters
+- [ ] Agent tools loaded via Functions::init_agent
+- [ ] Global tools loaded via build_global_tool_declarations
+
+### Tool compilation
+- [ ] Bash tools compiled to bin directory
+- [ ] Python tools compiled to bin directory
+- [ ] TypeScript tools compiled to bin directory
+- [ ] clear_agent_bin_dir removes old binaries
+- [ ] Tool file priority: .sh > .py > .ts > .js
+
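The tool file priority above (`.sh > .py > .ts > .js`) amounts to picking the first candidate whose extension appears earliest in a fixed list. A minimal sketch follows; the function name and the idea of passing candidate file names as string slices are assumptions, and the file names in the usage note are made up.

```rust
use std::path::Path;

/// Extensions in priority order, highest first, as listed in the plan.
const TOOL_EXTENSION_PRIORITY: [&str; 4] = ["sh", "py", "ts", "js"];

/// Returns the candidate with the highest-priority extension, if any.
pub fn pick_tool_file<'a>(candidates: &[&'a str]) -> Option<&'a str> {
    TOOL_EXTENSION_PRIORITY.iter().find_map(|ext| {
        candidates.iter().copied().find(|name| {
            Path::new(name).extension().and_then(|e| e.to_str()) == Some(*ext)
        })
    })
}
```

Given the hypothetical candidates `["tools.py", "tools.sh"]`, this returns `Some("tools.sh")`; files with an extension outside the list are ignored.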
+### User interaction functions
+- [ ] append_user_interaction_functions adds user__ask/confirm/input/checkbox
+- [ ] Only appended in REPL mode
+- [ ] User interaction tools work at depth 0 (direct prompt)
+- [ ] User interaction tools escalate at depth > 0
+
+### MCP meta functions
+- [ ] append_mcp_meta_functions adds invoke/search/describe per server
+- [ ] Meta functions removed when ToolScope rebuilt without those servers
+- [ ] Function names follow mcp_invoke_<server> pattern
+
+### Function selection
+- [ ] select_functions filters by role's enabled_tools
+- [ ] "all" enables everything
+- [ ] Specific tool names enabled selectively
+- [ ] mapping_tools aliases resolved
+- [ ] Agent functions included when agent active
+- [ ] MCP meta functions included when servers active
+
+## Status
+- Function declarations, append methods, find/contains tested in iteration 6
+- MCP meta functions tested in iterations 5-7
+- Function selection tested in iteration 7
+- User interaction functions tested in iterations 6-7
+- Python parser: extensive existing tests (400+ lines)
+- TypeScript parser: extensive existing tests (400+ lines)
+- parsers::common::underscore tested in iteration 13
+- Functions::init and tool compilation require filesystem
+
+## Additional behaviors tested
+
+- [x] parsers::common::underscore: simple, dashes, spaces, special chars, consecutive, leading/trailing, uppercase, mixed
+
+## Old code reference
+- `src/function/mod.rs` — Functions struct, init, init_agent
+- `src/config/paths.rs` — agent_functions_file (priority)
+- `src/parsers/` — bash, python, typescript parsers