2026-04-15 12:56:00 -06:00
parent ff3419a714
commit 63b6678e73
82 changed files with 14800 additions and 3310 deletions
+52
@@ -0,0 +1,52 @@
# Iteration 1 — Test Implementation Notes
## Plan file addressed
`docs/testing/plans/01-config-and-appconfig.md`
## Tests created
| File | Test name | What it verifies |
|---|---|---|
| `src/config/mod.rs` | `config_defaults_match_expected` | All Config::default() fields match old code values |
| `src/config/app_config.rs` | `to_app_config_copies_serialized_fields` | to_app_config copies model_id, temperature, top_p, dry_run, stream, save, highlight, compression_threshold, rag_top_k |
| `src/config/app_config.rs` | `to_app_config_copies_clients` | clients field populated (empty by default) |
| `src/config/app_config.rs` | `to_app_config_copies_mapping_fields` | mapping_tools and mapping_mcp_servers copied correctly |
| `src/config/app_config.rs` | `editor_returns_configured_value` | editor() returns configured value |
| `src/config/app_config.rs` | `editor_falls_back_to_env` | editor() doesn't panic without config |
| `src/config/app_config.rs` | `light_theme_default_is_false` | light_theme() default |
| `src/config/app_config.rs` | `sync_models_url_has_default` | sync_models_url() has non-empty default |
| `src/config/request_context.rs` | `to_request_context_creates_clean_state` | RequestContext starts with clean state (no role/session/agent, empty tool_scope, no agent_runtime) |
| `src/config/request_context.rs` | `update_app_config_persists_changes` | Dynamic config updates via clone-mutate-replace persist |
**Total: 10 new tests (59 → 69)**
## Bugs discovered
None. The `save` default was `false` in both old and new code
(my plan file incorrectly said `true` — corrected).
## Observations for future iterations
1. `Config::default().save` is `false`, but plan file 01
incorrectly listed it as `true`. The plan file should
reflect the actual default.
2. `AppConfig::default()` doesn't exist natively (no derive).
Tests construct it via `Config::default().to_app_config()`.
This is fine since that's how it's created in production.
3. The `visible_tools` field computation happens during
`Config::init` (not `to_app_config`). Testing the full
visible_tools resolution requires integration-level testing
with actual tool files. Deferred to plan file 16
(functions-and-tools).
4. Testing `Config::init` directly is difficult because it reads
from the filesystem, starts MCP servers, etc. The unit tests
focus on the conversion paths which are the Phase 1 surface.
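
The conversion path and the clone-mutate-replace update exercised by these tests can be sketched roughly as follows. This is a minimal illustration, not the project's code: the structs are trimmed-down, hypothetical stand-ins with only three fields, the `Arc` wrapper and the closure-based signature of `update_app_config` are assumptions, and only the `save: false` default comes from the notes above.

```rust
use std::sync::Arc;

/// Hypothetical, trimmed-down stand-in for the serialized `Config`.
#[derive(Clone, Debug)]
pub struct Config {
    pub model_id: String,
    pub temperature: Option<f64>,
    pub save: bool,
}

impl Default for Config {
    fn default() -> Self {
        // `save: false` mirrors the default recorded in these notes;
        // the other values are placeholders.
        Self { model_id: String::new(), temperature: None, save: false }
    }
}

/// Hypothetical stand-in for `AppConfig`. It deliberately has no `Default`
/// derive, so tests build it through `Config::default().to_app_config()`.
#[derive(Clone, Debug, PartialEq)]
pub struct AppConfig {
    pub model_id: String,
    pub temperature: Option<f64>,
    pub save: bool,
}

impl Config {
    /// Copy the serialized fields into the runtime view.
    pub fn to_app_config(&self) -> AppConfig {
        AppConfig {
            model_id: self.model_id.clone(),
            temperature: self.temperature,
            save: self.save,
        }
    }
}

/// Hypothetical stand-in for `RequestContext`, holding a shared config.
pub struct RequestContext {
    pub app_config: Arc<AppConfig>,
}

impl RequestContext {
    /// Clone the current config, mutate the copy, then replace the shared
    /// pointer. Other holders of the old `Arc` keep seeing the old values.
    pub fn update_app_config(&mut self, mutate: impl FnOnce(&mut AppConfig)) {
        let mut next = (*self.app_config).clone();
        mutate(&mut next);
        self.app_config = Arc::new(next);
    }
}
```

A test in the style of `update_app_config_persists_changes` would call `update_app_config` with a closure that flips a field, then assert that the context sees the new value while a previously cloned `Arc` still holds the old one.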
## Next iteration
Plan file 02: Roles — role loading, retrieve_role, use_role/exit_role,
use_prompt, extract_role, one-shot role messages, MCP context switching.
+71
@@ -0,0 +1,71 @@
# Iteration 2 — Test Implementation Notes
## Plan file addressed
`docs/testing/plans/02-roles.md`
## Tests created
### src/config/role.rs (12 new tests, 15 total)
| Test name | What it verifies |
|---|---|
| `role_new_parses_prompt` | Role::new extracts prompt text |
| `role_new_parses_metadata` | Metadata block parses model, temperature, top_p |
| `role_new_parses_enabled_tools` | enabled_tools from metadata |
| `role_new_parses_enabled_mcp_servers` | enabled_mcp_servers from metadata |
| `role_new_no_metadata_has_none_fields` | No metadata → all optional fields None |
| `role_builtin_shell_loads` | Built-in "shell" role loads |
| `role_builtin_code_loads` | Built-in "code" role loads |
| `role_builtin_nonexistent_errors` | Non-existent built-in → error |
| `role_default_has_empty_fields` | Default role has empty name/prompt |
| `role_set_model_updates_model` | set_model() changes the model |
| `role_set_temperature_works` | set_temperature() changes temperature |
| `role_export_includes_metadata` | export() includes metadata and prompt |
### src/config/request_context.rs (5 new tests, 7 total)
| Test name | What it verifies |
|---|---|
| `use_role_obj_sets_role` | use_role_obj sets role on ctx |
| `exit_role_clears_role` | exit_role clears role from ctx |
| `use_prompt_creates_temp_role` | use_prompt creates TEMP_ROLE_NAME role |
| `extract_role_returns_standalone_role` | extract_role returns active role |
| `extract_role_returns_default_when_nothing_active` | extract_role returns default role |
**Total: 17 new tests (69 → 86)**
## Bugs discovered
None. Role parsing behavior matches between old and new code.
## Observations for future iterations
1. `retrieve_role` (which calls `Model::retrieve_model`) can't be
easily unit-tested without a real client config. It depends on
having at least one configured client. Deferred to integration
testing or plan 08 (RequestContext scope transitions).
2. The `use_role` async method (which calls `rebuild_tool_scope`)
requires an async test runtime and MCP infrastructure. Deferred to
plan 05 (MCP lifecycle) and 08 (RequestContext).
3. `use_role_obj` correctly rejects the call when an agent is active.
This is only tested implicitly through the error path, because
creating a mock Agent is complex. Noted for plan 04 (agents).
4. The `extract_role` priority order (session > agent > role > default)
is an important behavioral contract. Tests verify the role and
default cases. Session and agent cases deferred to plans 03, 04.
5. Added `create_test_ctx()` helper to request_context.rs tests.
Future iterations should reuse this.
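
The `extract_role` priority order described in observation 4 could be pinned down with a test along these lines. This is a sketch under stated assumptions: every type here is a hypothetical, stripped-down stand-in, `Agent::to_role` is assumed by analogy with `session.to_role()`, and the real `RequestContext` carries far more state.

```rust
/// Hypothetical minimal role; the real `Role` has many more fields.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct Role {
    pub name: String,
    pub prompt: String,
}

/// Hypothetical stand-in for a session that remembers its role info.
pub struct Session {
    pub role_name: String,
    pub prompt: String,
}

impl Session {
    pub fn to_role(&self) -> Role {
        Role { name: self.role_name.clone(), prompt: self.prompt.clone() }
    }
}

/// Hypothetical stand-in for an agent; `to_role` is an assumed helper.
pub struct Agent {
    pub name: String,
}

impl Agent {
    pub fn to_role(&self) -> Role {
        Role { name: self.name.clone(), prompt: String::new() }
    }
}

/// Hypothetical stand-in for `RequestContext`, reduced to the three scopes.
#[derive(Default)]
pub struct RequestContext {
    pub session: Option<Session>,
    pub agent: Option<Agent>,
    pub role: Option<Role>,
}

impl RequestContext {
    /// Resolve the effective role: session > agent > role > default.
    pub fn extract_role(&self) -> Role {
        if let Some(session) = &self.session {
            return session.to_role();
        }
        if let Some(agent) = &self.agent {
            return agent.to_role();
        }
        if let Some(role) = &self.role {
            return role.clone();
        }
        Role::default()
    }
}
```

Filling the scopes from lowest to highest priority and asserting after each step covers all four branches in a single test, including the session and agent cases that are currently deferred.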
## Plan file updates
Updated 02-roles.md to mark completed items.
## Next iteration
Plan file 03: Sessions — session create/load/save, compression,
autoname, carry-over, exit, context switching.
+76
@@ -0,0 +1,76 @@
# Iteration 3 — Test Implementation Notes
## Plan file addressed
`docs/testing/plans/03-sessions.md`
## Tests created
### src/config/session.rs (15 new tests)
| Test name | What it verifies |
|---|---|
| `session_default_is_empty` | Default session is empty, no name, no role, not dirty |
| `session_new_from_ctx_captures_save_session` | new_from_ctx captures the name; new session is empty and not dirty |
| `session_set_role_captures_role_info` | set_role copies model_id, temperature, role_name, marks dirty |
| `session_clear_role` | clear_role removes role_name |
| `session_guard_empty_passes_when_empty` | guard_empty OK when empty |
| `session_needs_compression_threshold` | Empty session doesn't need compression |
| `session_needs_compression_returns_false_when_compressing` | Already compressing → false |
| `session_needs_compression_returns_false_when_threshold_zero` | Zero threshold → false |
| `session_set_compressing_flag` | set_compressing toggles flag |
| `session_set_save_session_this_time` | Doesn't panic |
| `session_save_session_returns_configured_value` | save_session get/set roundtrip |
| `session_compress_moves_messages` | compress moves messages to compressed, adds system |
| `session_is_not_empty_after_compress` | Session with compressed messages is not empty |
| `session_need_autoname_default_false` | Default session doesn't need autoname |
| `session_set_autonaming_doesnt_panic` | set_autonaming safe without autoname |
### src/config/request_context.rs (4 new tests, 11 total)
| Test name | What it verifies |
|---|---|
| `exit_session_clears_session` | exit_session removes session from ctx |
| `empty_session_clears_messages` | empty_session keeps session but clears it |
| `maybe_compress_session_returns_false_when_no_session` | No session → no compression |
| `maybe_autoname_session_returns_false_when_no_session` | No session → no autoname |
**Total: 19 new tests (86 → 105)**
## Bugs discovered
None. Session behavior matches between old and new code.
## Observations for future iterations
1. `Session::new_from_ctx` and `Session::load_from_ctx` have
`#[allow(dead_code)]` annotations — they were bridge methods.
Should verify if they're still needed or if the old `Session::new`
and `Session::load` (which take `&Config`) should be cleaned up
in a future pass.
2. The `compress` method moves messages to `compressed_messages` and
adds a single system message with the summary. This is a critical
behavioral contract — if the summary format changes, sessions
could break.
3. `needs_compression` uses `self.compression_threshold` (session-
level) with fallback to the global threshold. This priority
(session > global) is important behavior.
4. Session carry-over (the "incorporate last Q&A?" prompt) happens
inside `use_session` which is async and involves user interaction
(inquire::Confirm). Can't unit test this — needs integration test
or manual verification.
5. The `extract_role` test for session-active case should verify that
`session.to_role()` is returned. Added note to plan 02.
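
The compression contracts in observations 2 and 3 can be illustrated with a small model. This is only a sketch: the `Message` and `Session` types are hypothetical stand-ins, the character-count token estimate is a crude placeholder, and whether the real comparison is `>` or `>=` is an assumption.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum MessageRole {
    System,
    User,
    Assistant,
}

/// Hypothetical minimal message type.
#[derive(Clone, Debug, PartialEq)]
pub struct Message {
    pub role: MessageRole,
    pub content: String,
}

/// Hypothetical stand-in for `Session`, reduced to compression state.
#[derive(Default)]
pub struct Session {
    pub messages: Vec<Message>,
    pub compressed_messages: Vec<Message>,
    pub compression_threshold: Option<usize>,
    pub compressing: bool,
}

impl Session {
    /// Crude placeholder estimate: total content length in bytes.
    fn estimated_tokens(&self) -> usize {
        self.messages.iter().map(|m| m.content.len()).sum()
    }

    /// The session-level threshold wins over the global one. A zero
    /// threshold or an in-flight compression always yields `false`.
    pub fn needs_compression(&self, global_threshold: usize) -> bool {
        if self.compressing {
            return false;
        }
        let threshold = self.compression_threshold.unwrap_or(global_threshold);
        if threshold == 0 {
            return false;
        }
        self.estimated_tokens() > threshold
    }

    /// Move all live messages to `compressed_messages` and leave a single
    /// system message carrying the summary.
    pub fn compress(&mut self, summary: String) {
        self.compressed_messages.append(&mut self.messages);
        self.messages.push(Message { role: MessageRole::System, content: summary });
    }

    pub fn is_empty(&self) -> bool {
        self.messages.is_empty() && self.compressed_messages.is_empty()
    }
}
```

Keeping the threshold resolution in one place makes the session > global priority easy to assert: set a large session threshold, pass a small global one, and check that `needs_compression` returns `false`.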
## Plan file updates
Updated 03-sessions.md to mark completed items.
## Next iteration
Plan file 04: Agents — agent init, tool compilation, variables,
lifecycle, MCP, RAG, auto-continuation.
+71
@@ -0,0 +1,71 @@
# Iteration 4 — Test Implementation Notes
## Plan file addressed
`docs/testing/plans/04-agents.md`
## Tests created
### src/config/agent.rs (4 new tests)
| Test name | What it verifies |
|---|---|
| `agent_config_parses_from_yaml` | Full AgentConfig YAML with all fields |
| `agent_config_defaults` | Minimal AgentConfig gets correct defaults |
| `agent_config_with_model` | model_id, temperature, top_p from YAML |
| `agent_config_inject_defaults_true` | inject_todo/spawn_instructions default true |
### src/config/agent_runtime.rs (2 new tests)
| Test name | What it verifies |
|---|---|
| `agent_runtime_new_defaults` | All fields default correctly |
| `agent_runtime_builder_pattern` | with_depth, with_parent_supervisor work |
### src/config/request_context.rs (6 new tests, 17 total)
| Test name | What it verifies |
|---|---|
| `exit_agent_clears_all_agent_state` | exit_agent clears agent, agent_runtime, rag |
| `current_depth_returns_zero_without_agent` | Default depth is 0 |
| `current_depth_returns_agent_runtime_depth` | Depth from agent_runtime |
| `supervisor_returns_none_without_agent` | No agent → no supervisor |
| `inbox_returns_none_without_agent` | No agent → no inbox |
| `root_escalation_queue_returns_none_without_agent` | No agent → no queue |
**Total: 12 new tests (105 → 117)**
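
The builder and depth behavior covered by the tables above might look roughly like this. It is a sketch only: the field types are guesses (the parent supervisor is modeled as a plain `String` placeholder), and the real `AgentRuntime` and `RequestContext` hold much more state.

```rust
/// Hypothetical stand-in for `AgentRuntime`; field types are assumptions.
#[derive(Debug, Default)]
pub struct AgentRuntime {
    pub depth: usize,
    pub parent_supervisor: Option<String>,
}

impl AgentRuntime {
    pub fn new() -> Self {
        Self::default()
    }

    /// Builder-style setter: consumes and returns `self` so calls chain.
    pub fn with_depth(mut self, depth: usize) -> Self {
        self.depth = depth;
        self
    }

    pub fn with_parent_supervisor(mut self, supervisor: Option<String>) -> Self {
        self.parent_supervisor = supervisor;
        self
    }
}

/// Hypothetical stand-in for `RequestContext`, reduced to the runtime slot.
#[derive(Default)]
pub struct RequestContext {
    pub agent_runtime: Option<AgentRuntime>,
}

impl RequestContext {
    /// Depth is 0 when no agent is active, otherwise the runtime's depth.
    pub fn current_depth(&self) -> usize {
        self.agent_runtime.as_ref().map_or(0, |rt| rt.depth)
    }
}
```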
## Bugs discovered
None.
## Observations for future iterations
1. `Agent::init` can't be unit tested easily because it requires agent
config files and tool files on disk. Integration tests with temp
directories would be needed for full coverage.
2. AgentConfig default values verified:
- `max_concurrent_agents` = 4
- `max_agent_depth` = 3
- `max_auto_continues` = 10
- `inject_todo_instructions` = true
- `inject_spawn_instructions` = true
These are important behavioral contracts.
3. The `exit_agent` test shows that clearing agent state also
rebuilds the tool_scope with fresh functions. This is the
correct behavior for returning to the global context.
4. Agent variable interpolation (special vars like `__os__`, `__cwd__`)
happens in `Agent::init`, which is filesystem-dependent. Deferred.
5. `list_agents()` (which filters hidden dirs) is tested via the
`.shared` exclusion noted in improvements. Could add a unit test
with a temp dir if needed.
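
The temp-dir unit test suggested in observation 5 could start from a helper like the one below. This is a sketch, not the project's `list_agents()`: the function name `list_agent_dirs`, its signature, and the rule "skip any directory whose name starts with a dot" are assumptions made for illustration, using only the standard library.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for `list_agents()`: returns the names of the
/// directories directly under `root`, skipping hidden ones such as `.shared`.
pub fn list_agent_dirs(root: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        // Plain files are not agents.
        if !entry.file_type()?.is_dir() {
            continue;
        }
        let name = entry.file_name().to_string_lossy().into_owned();
        // Hidden directories hold shared assets, not agents.
        if name.starts_with('.') {
            continue;
        }
        names.push(name);
    }
    // `read_dir` order is platform-dependent, so sort for stable assertions.
    names.sort();
    Ok(names)
}
```

A test can create a scratch directory under `std::env::temp_dir()` containing `coder`, `planner`, `.shared`, and a stray file, call the helper, and assert that only `coder` and `planner` come back.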
## Next iteration
Plan file 05: MCP Lifecycle — the most critical test area. McpFactory,
McpRuntime, spawn_mcp_server, rebuild_tool_scope MCP integration,
scope transition MCP behavior.