diff --git a/README.md b/README.md index dcc335b..f55a66f 100644 --- a/README.md +++ b/README.md @@ -36,6 +36,7 @@ Coming from [AIChat](https://github.com/sigoden/aichat)? Follow the [migration g * [Sessions](/docs/SESSIONS.md): Manage and persist conversational contexts and settings across multiple interactions. * [Roles](./docs/ROLES.md): Customize model behavior for specific tasks or domains. * [Agents](/docs/AGENTS.md): Leverage AI agents to perform complex tasks and workflows. + * [Todo System](./docs/TODO-SYSTEM.md): Built-in task tracking for improved agent reliability with smaller models. * [Environment Variables](./docs/ENVIRONMENT-VARIABLES.md): Override and customize your Loki configuration at runtime with environment variables. * [Client Configurations](./docs/clients/CLIENTS.md): Configuration instructions for various LLM providers. * [Patching API Requests](./docs/clients/PATCHES.md): Learn how to patch API requests for advanced customization. diff --git a/config.agent.example.yaml b/config.agent.example.yaml index 43577ab..1accb0e 100644 --- a/config.agent.example.yaml +++ b/config.agent.example.yaml @@ -17,6 +17,13 @@ agent_session: null # Set a session to use when starting the agent. name: # Name of the agent, used in the UI and logs description: # Description of the agent, used in the UI version: 1 # Version of the agent +# Todo System & Auto-Continuation +# These settings help smaller models handle multi-step tasks more reliably. +# See docs/TODO-SYSTEM.md for detailed documentation. 
+auto_continue: false # Enable automatic continuation when incomplete todos remain +max_auto_continues: 10 # Maximum number of automatic continuations before stopping +inject_todo_instructions: true # Inject the default todo tool usage instructions into the agent's system prompt +continuation_prompt: null # Custom prompt used when auto-continuing (optional; uses default if null) mcp_servers: # Optional list of MCP servers that the agent utilizes - github # Corresponds to the name of an MCP server in the `/functions/mcp.json` file global_tools: # Optional list of additional global tools to enable for the agent; i.e. not tools specific to the agent diff --git a/docs/AGENTS.md b/docs/AGENTS.md index b53bf3f..50dd67a 100644 --- a/docs/AGENTS.md +++ b/docs/AGENTS.md @@ -34,6 +34,7 @@ If you're looking for more example agents, refer to the [built-in agents](../ass - [Python-Based Agent Tools](#python-based-agent-tools) - [Bash-Based Agent Tools](#bash-based-agent-tools) - [5. Conversation Starters](#5-conversation-starters) +- [6. Todo System & Auto-Continuation](#6-todo-system--auto-continuation) - [Built-In Agents](#built-in-agents) @@ -81,6 +82,11 @@ global_tools: # Optional list of additional global tools - web_search - fs - python +# Todo System & Auto-Continuation (see "Todo System & Auto-Continuation" section below) +auto_continue: false # Enable automatic continuation when incomplete todos remain +max_auto_continues: 10 # Maximum continuation attempts before stopping +inject_todo_instructions: true # Inject todo tool instructions into system prompt +continuation_prompt: null # Custom prompt for continuations (optional) ``` As mentioned previously: Agents utilize function calling to extend a model's capabilities. However, agents operate in @@ -421,6 +427,50 @@ conversation_starters: ![Example Conversation Starters](./images/agents/conversation-starters.gif) +## 6. 
Todo System & Auto-Continuation + +Loki includes a built-in task tracking system designed to improve the reliability of agents, especially when using +smaller language models. The Todo System helps models: + +- Break complex tasks into manageable steps +- Track progress through multi-step workflows +- Automatically continue work until all tasks are complete + +### Quick Configuration + +```yaml +# agents/my-agent/config.yaml +auto_continue: true # Enable the todo tools and auto-continuation +max_auto_continues: 10 # Max continuation attempts +inject_todo_instructions: true # Add the default todo instructions to the system prompt +``` + +### How It Works + +1. When `auto_continue` is enabled, Loki registers four built-in tools and, if `inject_todo_instructions` is also enabled, tells the agent how to use them: + - `todo__init`: Initialize a todo list with a goal + - `todo__add`: Add a task to the list + - `todo__done`: Mark a task complete + - `todo__list`: View current todo state + + The injected instructions are a reasonable default that describes how to use Loki's Todo System. If you prefer, + you can disable injection of the defaults and explain how to use the Todo System in the agent's main + `instructions` instead. + +2. When `auto_continue` is enabled and the model stops with incomplete tasks, Loki automatically sends a + continuation prompt with the current todo state, nudging the model to continue working. + +3. This repeats until all tasks are done, `max_auto_continues` is reached, or the model repeats its previous response. + +### When to Use + +- Multi-step tasks where the model might lose track +- Smaller models that need more structure +- Workflows where every step needs to be completed + +For complete documentation including all configuration options, tool details, and best practices, see the +[Todo System Guide](./TODO-SYSTEM.md). 
+ ## Built-In Agents Loki comes packaged with some useful built-in agents: * `coder`: An agent to assist you with all your coding tasks diff --git a/docs/TODO-SYSTEM.md b/docs/TODO-SYSTEM.md new file mode 100644 index 0000000..b5c42a4 --- /dev/null +++ b/docs/TODO-SYSTEM.md @@ -0,0 +1,234 @@ +# Todo System + +Loki's Todo System is a built-in task tracking feature designed to improve the reliability and effectiveness of LLM agents, +especially smaller models. It provides structured task management that helps models: + +- Break complex tasks into manageable steps +- Track progress through multistep workflows +- Automatically continue work until all tasks are complete +- Avoid forgetting steps or losing context + +![Todo System Example](./images/agents/todo-system.png) + +## Quick Links + +- [Why Use the Todo System?](#why-use-the-todo-system) +- [How It Works](#how-it-works) +- [Configuration Options](#configuration-options) +- [Available Tools](#available-tools) +- [Auto-Continuation](#auto-continuation) +- [Best Practices](#best-practices) +- [Example Workflow](#example-workflow) +- [Troubleshooting](#troubleshooting) + + +## Why Use the Todo System? +Smaller language models often struggle with: +- **Context drift**: Forgetting earlier steps in a multi-step task +- **Incomplete execution**: Stopping before all work is done +- **Lack of structure**: Jumping between tasks without clear organization + +The Loki Todo System addresses these issues by giving the model explicit tools to plan, track, and verify task completion. +The system automatically prompts the model to continue when incomplete tasks remain, ensuring work gets finished. + +## How It Works +1. **Planning Phase**: The model initializes a todo list with a goal and adds individual tasks +2. **Execution Phase**: The model works through tasks, marking each done immediately after completion +3. **Continuation Phase**: If incomplete tasks remain, the system automatically prompts the model to continue +4. 
**Completion**: When all tasks are marked done, the workflow ends naturally + +The todo state is kept for the whole conversation, survives session compression, and is injected into continuation prompts, +keeping the model focused on remaining work. + +## Configuration Options +The Todo System is configured per-agent in `/agents//config.yaml`: + +| Setting | Type | Default | Description | +|----------------------------|---------|-------------|---------------------------------------------------------------------------------| +| `auto_continue` | boolean | `false` | Enable the todo tools and automatic continuation when incomplete todos remain | +| `max_auto_continues` | integer | `10` | Maximum number of automatic continuations before stopping | +| `inject_todo_instructions` | boolean | `true` | Inject the default todo tool usage instructions into the agent's system prompt (only applies when `auto_continue` is `true`) | +| `continuation_prompt` | string | (see below) | Custom prompt used when auto-continuing | + +### Example Configuration +```yaml +# agents/my-agent/config.yaml +model: openai:gpt-4o +auto_continue: true # Enable the todo tools and auto-continuation +max_auto_continues: 15 # Allow up to 15 automatic continuations +inject_todo_instructions: true # Include todo instructions in system prompt +continuation_prompt: | # Optional: customize the continuation prompt + [CONTINUE] + You have unfinished tasks. Proceed with the next pending item. + Do not explain; just execute. +``` + +### Default Continuation Prompt +If `continuation_prompt` is not specified, the following default is used: + +``` +[SYSTEM REMINDER - TODO CONTINUATION] +You have incomplete tasks in your todo list. Continue with the next pending item. +Call tools immediately. Do not explain what you will do. +``` + +## Available Tools +When `auto_continue` is enabled, agents have access to four built-in todo management tools. The `inject_todo_instructions` setting (on by default) only controls whether usage instructions for these tools are added to the system prompt. + +### `todo__init` +Initialize a new todo list with a goal. Clears any existing todos. 
+ +**Parameters:** +- `goal` (string, required): The overall goal to achieve when all todos are completed + +**Example:** +```json +{"goal": "Refactor the authentication module"} +``` + +### `todo__add` +Add a new todo item to the list. + +**Parameters:** +- `task` (string, required): Description of the todo task + +**Example:** +```json +{"task": "Extract password validation into separate function"} +``` + +**Returns:** The assigned task ID + +### `todo__done` +Mark a todo item as done by its ID. + +**Parameters:** +- `id` (integer, required): The ID of the todo item to mark as done + +**Example:** +```json +{"id": 1} +``` + +### `todo__list` +Display the current todo list with status of each item. + +**Parameters:** None + +**Returns:** The todo list as JSON: the goal and each item's `id`, `desc`, and `done` flag + +## Auto-Continuation +When `auto_continue` is enabled, Loki automatically sends a continuation prompt when all of the following hold: + +1. The agent's response completes (model stops generating) +2. There are incomplete tasks in the todo list +3. The continuation count is still below `max_auto_continues` +4. The response isn't identical to the response that preceded the last continuation (prevents loops) + +### What Gets Injected +Each continuation prompt includes: +- The continuation prompt text (default or custom) +- The current todo list state showing: + - The goal + - Progress (e.g., "3/5 completed") + - Each task with status (✓ done, ○ pending) + +**Example continuation context:** +``` +[SYSTEM REMINDER - TODO CONTINUATION] +You have incomplete tasks in your todo list. Continue with the next pending item. +Call tools immediately. Do not explain what you will do. + +Goal: Refactor the authentication module +Progress: 2/4 completed + ✓ 1. Extract password validation into separate function + ✓ 2. Add unit tests for password validation + ○ 3. Update login handler to use new validation + ○ 4. 
Update registration handler to use new validation +``` + +### Visual Feedback +During auto-continuation, you'll see a message in your terminal: +``` +📋 Auto-continuing (3/10): 2 incomplete todo(s) remain +``` + +## Best Practices + +### For Agent Developers +1. **Enable for complex workflows**: Use `auto_continue: true` for agents that handle multistep tasks +2. **Set reasonable limits**: Adjust `max_auto_continues` based on typical task complexity +3. **Customize the prompt**: If your agent needs specific continuation behavior, set a custom `continuation_prompt` + +### For Model Behavior +The injected instructions tell models to: + +1. **Always create a todo list before starting work**: This ensures planning happens upfront +2. **Mark each task done immediately after finishing**: Don't batch completions +3. **Add all planned steps before starting**: Complete planning before execution + +### When to Use vs. Skip +**Use the Todo System when:** +- Tasks have 3+ distinct steps +- The model might lose track of progress +- You want guaranteed completion of all steps +- Working with smaller/less capable models + +**Skip the Todo System when:** +- Single-step, simple tasks +- Interactive Q&A sessions +- The overhead of task tracking isn't worth it + +## Example Workflow +Here's how a typical todo-driven workflow looks: + +**User prompt:** "Add input validation to the user registration form" + +**Model actions:** +``` +1. todo__init(goal="Add input validation to user registration form") +2. todo__add(task="Analyze current registration form fields") +3. todo__add(task="Create validation rules for email field") +4. todo__add(task="Create validation rules for password field") +5. todo__add(task="Implement client-side validation") +6. todo__add(task="Add server-side validation") +7. todo__add(task="Write tests for validation logic") +``` + +**Model executes first task, then:** +``` +8. todo__done(id=1) +9. [Proceeds with task 2...] +10. todo__done(id=2) +... 
+``` + +**If the model stops with incomplete tasks:** +- Loki automatically sends a continuation prompt +- The model sees the remaining tasks and continues +- This repeats until all tasks are done or the continuation limit is reached + +## Troubleshooting + +### Model Not Using Todo Tools +- Verify `auto_continue: true` and `inject_todo_instructions: true` in your agent config (the todo tools are only registered when `auto_continue` is enabled) +- Check that the agent is properly loaded (not just a role) +- Some models may need explicit prompting to use the tools + +### Too Many Continuations +- Lower `max_auto_continues` to a reasonable limit +- Check if the model is creating new tasks without completing old ones +- Ensure tasks are appropriately scoped (not too granular) + +### Continuation Loop +The system detects when a model's response is identical to its previous continuation response and stops +automatically. If you're seeing loops: +- The model may be stuck; check if a task is impossible to complete +- Consider adjusting the `continuation_prompt` to be more directive + +--- + +## Additional Docs +- [Agents](./AGENTS.md): Full agent configuration guide +- [Function Calling](./function-calling/TOOLS.md): How tools work in Loki +- [Sessions](./SESSIONS.md): How conversation state is managed diff --git a/docs/function-calling/TOOLS.md b/docs/function-calling/TOOLS.md index 52830d4..b555cc0 100644 --- a/docs/function-calling/TOOLS.md +++ b/docs/function-calling/TOOLS.md @@ -16,6 +16,10 @@ loki --info | grep functions_dir | awk '{print $2}' - [Enabling/Disabling Global Tools](#enablingdisabling-global-tools) - [Role Configuration](#role-configuration) - [Agent Configuration](#agent-configuration) +- [Tool Error Handling](#tool-error-handling) + - [Native/Shell Tool Errors](#nativeshell-tool-errors) + - [MCP Tool Errors](#mcp-tool-errors) + - [Why This Matters](#why-this-matters) --- @@ -137,3 +141,47 @@ The values for `mapping_tools` are inherited from the [global configuration](#gl For more information about agents, refer to the [Agents](../AGENTS.md) 
documentation. For a full example configuration for an agent, see the [Agent Configuration Example](../../config.agent.example.yaml) file. + +--- + +## Tool Error Handling +When tools fail, Loki captures error information and passes it back to the model so it can diagnose issues and +potentially retry or adjust its approach. + +### Native/Shell Tool Errors +When a shell-based tool exits with a non-zero exit code, the model receives: + +```json +{ + "tool_call_error": "Tool call 'my_tool' exited with code 1", + "stderr": "Error: file not found: config.json" +} +``` + +The `stderr` field contains the actual error output from the tool, giving the model context about what went wrong. +If the tool produces no stderr output, only the `tool_call_error` field is included. + +**Note:** Tool stdout streams to your terminal in real-time so you can see progress. Only stderr is captured for +error reporting. + +### MCP Tool Errors +When an MCP (Model Context Protocol) tool invocation fails due to connection issues, timeouts, or server errors, +the model receives: + +```json +{ + "tool_call_error": "MCP tool invocation failed: connection refused" +} +``` + +This allows the model to understand that an external service failed and take appropriate action (retry, use an +alternative approach, or inform the user). + +### Why This Matters +Without proper error propagation, models would only know that "something went wrong" without understanding *what* +went wrong. 
By including stderr output and detailed error messages, models can: + +- Diagnose the root cause of failures +- Suggest fixes (e.g., "the file doesn't exist, should I create it?") +- Retry with corrected parameters +- Fall back to alternative approaches when appropriate diff --git a/docs/images/agents/todo-system.png b/docs/images/agents/todo-system.png new file mode 100644 index 0000000..e66212a Binary files /dev/null and b/docs/images/agents/todo-system.png differ diff --git a/src/config/agent.rs b/src/config/agent.rs index 5534914..1a4852f 100644 --- a/src/config/agent.rs +++ b/src/config/agent.rs @@ -1,3 +1,4 @@ +use super::todo::TodoList; use super::*; use crate::{ @@ -14,6 +15,18 @@ use serde::{Deserialize, Serialize}; use std::{ffi::OsStr, path::Path}; const DEFAULT_AGENT_NAME: &str = "rag"; +const DEFAULT_TODO_INSTRUCTIONS: &str = "\ +\n## Task Tracking\n\ +You have built-in task tracking tools. Use them to track your progress:\n\ +- `todo__init`: Initialize a todo list with a goal. Call this at the start of every multi-step task.\n\ +- `todo__add`: Add individual tasks. Add all planned steps before starting work.\n\ +- `todo__done`: Mark a task done by id. 
Call this immediately after completing each step.\n\ +- `todo__list`: Show the current todo list.\n\ +\n\ +RULES:\n\ +- Always create a todo list before starting work.\n\ +- Mark each task done as soon as you finish it — do not batch.\n\ +- If you stop with incomplete tasks, the system will automatically prompt you to continue."; pub type AgentVariables = IndexMap; @@ -33,6 +46,9 @@ pub struct Agent { rag: Option>, model: Model, vault: GlobalVault, + todo_list: TodoList, + continuation_count: usize, + last_continuation_response: Option, } impl Agent { @@ -188,6 +204,10 @@ impl Agent { None }; + if agent_config.auto_continue { + functions.append_todo_functions(); + } + Ok(Self { name: name.to_string(), config: agent_config, @@ -199,6 +219,9 @@ impl Agent { rag, model, vault: Arc::clone(&config.read().vault), + todo_list: TodoList::default(), + continuation_count: 0, + last_continuation_response: None, }) } @@ -309,11 +332,16 @@ impl Agent { } pub fn interpolated_instructions(&self) -> String { - let output = self + let mut output = self .session_dynamic_instructions .clone() .or_else(|| self.shared_dynamic_instructions.clone()) .unwrap_or_else(|| self.config.instructions.clone()); + + if self.config.auto_continue && self.config.inject_todo_instructions { + output.push_str(DEFAULT_TODO_INSTRUCTIONS); + } + self.interpolate_text(&output) } @@ -376,6 +404,67 @@ impl Agent { self.session_dynamic_instructions = None; } + pub fn auto_continue_enabled(&self) -> bool { + self.config.auto_continue + } + + pub fn max_auto_continues(&self) -> usize { + self.config.max_auto_continues + } + + pub fn continuation_count(&self) -> usize { + self.continuation_count + } + + pub fn increment_continuation(&mut self) { + self.continuation_count += 1; + } + + pub fn reset_continuation(&mut self) { + self.continuation_count = 0; + self.last_continuation_response = None; + } + + pub fn is_stale_response(&self, response: &str) -> bool { + self.last_continuation_response + .as_ref() + 
.is_some_and(|last| last == response) + } + + pub fn set_last_continuation_response(&mut self, response: String) { + self.last_continuation_response = Some(response); + } + + pub fn todo_list(&self) -> &TodoList { + &self.todo_list + } + + pub fn init_todo_list(&mut self, goal: &str) { + self.todo_list = TodoList::new(goal); + } + + pub fn add_todo(&mut self, task: &str) -> usize { + self.todo_list.add(task) + } + + pub fn mark_todo_done(&mut self, id: usize) -> bool { + self.todo_list.mark_done(id) + } + + pub fn continuation_prompt(&self) -> String { + self.config.continuation_prompt.clone().unwrap_or_else(|| { + "[SYSTEM REMINDER - TODO CONTINUATION]\n\ + You have incomplete tasks in your todo list. \ + Continue with the next pending item. \ + Call tools immediately. Do not explain what you will do." + .to_string() + }) + } + + pub fn compression_threshold(&self) -> Option { + self.config.compression_threshold + } + pub fn is_dynamic_instructions(&self) -> bool { self.config.dynamic_instructions } @@ -498,6 +587,14 @@ pub struct AgentConfig { #[serde(skip_serializing_if = "Option::is_none")] pub agent_session: Option, #[serde(default)] + pub auto_continue: bool, + #[serde(default = "default_max_auto_continues")] + pub max_auto_continues: usize, + #[serde(default = "default_true")] + pub inject_todo_instructions: bool, + #[serde(skip_serializing_if = "Option::is_none")] + pub compression_threshold: Option, + #[serde(default)] pub description: String, #[serde(default)] pub version: String, @@ -505,6 +602,8 @@ pub struct AgentConfig { pub mcp_servers: Vec, #[serde(default)] pub global_tools: Vec, + #[serde(skip_serializing_if = "Option::is_none")] + pub continuation_prompt: Option, #[serde(default)] pub instructions: String, #[serde(default)] @@ -517,6 +616,14 @@ pub struct AgentConfig { pub documents: Vec, } +fn default_max_auto_continues() -> usize { + 10 +} + +fn default_true() -> bool { + true +} + impl AgentConfig { pub fn load(path: &Path) -> Result { let 
contents = read_to_string(path) diff --git a/src/config/mod.rs b/src/config/mod.rs index 299bdbf..fb2d5ae 100644 --- a/src/config/mod.rs +++ b/src/config/mod.rs @@ -3,6 +3,7 @@ mod input; mod macros; mod role; mod session; +pub(crate) mod todo; pub use self::agent::{Agent, AgentVariables, complete_agent_variables, list_agents}; pub use self::input::Input; @@ -1573,8 +1574,18 @@ impl Config { .summary_context_prompt .clone() .unwrap_or_else(|| SUMMARY_CONTEXT_PROMPT.into()); + + let todo_prefix = config + .read() + .agent + .as_ref() + .map(|agent| agent.todo_list()) + .filter(|todos| !todos.is_empty()) + .map(|todos| format!("[ACTIVE TODO LIST]\n{}\n\n", todos.render_for_model())) + .unwrap_or_default(); + if let Some(session) = config.write().session.as_mut() { - session.compress(format!("{summary_context_prompt}{summary}")); + session.compress(format!("{todo_prefix}{summary_context_prompt}{summary}")); } config.write().discontinuous_last_message(); Ok(()) diff --git a/src/config/session.rs b/src/config/session.rs index cfc8e02..cb0cecf 100644 --- a/src/config/session.rs +++ b/src/config/session.rs @@ -299,6 +299,9 @@ impl Session { self.role_prompt = agent.interpolated_instructions(); self.agent_variables = agent.variables().clone(); self.agent_instructions = self.role_prompt.clone(); + if let Some(threshold) = agent.compression_threshold() { + self.set_compression_threshold(Some(threshold)); + } } pub fn agent_variables(&self) -> &AgentVariables { diff --git a/src/config/todo.rs b/src/config/todo.rs new file mode 100644 index 0000000..7b070fe --- /dev/null +++ b/src/config/todo.rs @@ -0,0 +1,165 @@ +use serde::{Deserialize, Serialize}; + +#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)] +#[serde(rename_all = "lowercase")] +pub enum TodoStatus { + Pending, + Done, +} + +impl TodoStatus { + fn icon(&self) -> &'static str { + match self { + TodoStatus::Pending => "○", + TodoStatus::Done => "✓", + } + } +} + +#[derive(Debug, Clone, Serialize, 
Deserialize)] +pub struct TodoItem { + pub id: usize, + #[serde(alias = "description")] + pub desc: String, + pub done: bool, +} + +#[derive(Debug, Clone, Default, Serialize, Deserialize)] +pub struct TodoList { + #[serde(default)] + pub goal: String, + #[serde(default)] + pub todos: Vec<TodoItem>, +} + +impl TodoList { + pub fn new(goal: &str) -> Self { + Self { + goal: goal.to_string(), + todos: Vec::new(), + } + } + + pub fn add(&mut self, task: &str) -> usize { + let id = self.todos.iter().map(|t| t.id).max().unwrap_or(0) + 1; + self.todos.push(TodoItem { + id, + desc: task.to_string(), + done: false, + }); + id + } + + pub fn mark_done(&mut self, id: usize) -> bool { + if let Some(item) = self.todos.iter_mut().find(|t| t.id == id) { + item.done = true; + true + } else { + false + } + } + + pub fn has_incomplete(&self) -> bool { + self.todos.iter().any(|item| !item.done) + } + + pub fn is_empty(&self) -> bool { + self.todos.is_empty() + } + + pub fn render_for_model(&self) -> String { + let mut lines = Vec::new(); + if !self.goal.is_empty() { + lines.push(format!("Goal: {}", self.goal)); + } + lines.push(format!( + "Progress: {}/{} completed", + self.completed_count(), + self.todos.len() + )); + for item in &self.todos { + let status = if item.done { + TodoStatus::Done + } else { + TodoStatus::Pending + }; + lines.push(format!(" {} {}. 
{}", status.icon(), item.id, item.desc)); + } + lines.join("\n") + } + + pub fn incomplete_count(&self) -> usize { + self.todos.iter().filter(|item| !item.done).count() + } + + pub fn completed_count(&self) -> usize { + self.todos.iter().filter(|item| item.done).count() + } +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_new_and_add() { + let mut list = TodoList::new("Map Labs"); + assert_eq!(list.add("Discover"), 1); + assert_eq!(list.add("Map columns"), 2); + assert_eq!(list.todos.len(), 2); + assert!(list.has_incomplete()); + } + + #[test] + fn test_mark_done() { + let mut list = TodoList::new("Test"); + list.add("Task 1"); + list.add("Task 2"); + assert!(list.mark_done(1)); + assert!(!list.mark_done(99)); + assert_eq!(list.completed_count(), 1); + assert_eq!(list.incomplete_count(), 1); + } + + #[test] + fn test_empty_list() { + let list = TodoList::default(); + assert!(!list.has_incomplete()); + assert!(list.is_empty()); + } + + #[test] + fn test_all_done() { + let mut list = TodoList::new("Test"); + list.add("Done task"); + list.mark_done(1); + assert!(!list.has_incomplete()); + } + + #[test] + fn test_render_for_model() { + let mut list = TodoList::new("Map Labs"); + list.add("Discover"); + list.add("Map"); + list.mark_done(1); + let rendered = list.render_for_model(); + assert!(rendered.contains("Goal: Map Labs")); + assert!(rendered.contains("Progress: 1/2 completed")); + assert!(rendered.contains("✓ 1. Discover")); + assert!(rendered.contains("○ 2. 
Map")); + } + + #[test] + fn test_serialization_roundtrip() { + let mut list = TodoList::new("Roundtrip"); + list.add("Step 1"); + list.add("Step 2"); + list.mark_done(1); + let json = serde_json::to_string(&list).unwrap(); + let deserialized: TodoList = serde_json::from_str(&json).unwrap(); + assert_eq!(deserialized.goal, "Roundtrip"); + assert_eq!(deserialized.todos.len(), 2); + assert!(deserialized.todos[0].done); + assert!(!deserialized.todos[1].done); + } +} diff --git a/src/function/mod.rs b/src/function/mod.rs index 5fb4f65..54bee4e 100644 --- a/src/function/mod.rs +++ b/src/function/mod.rs @@ -1,3 +1,5 @@ +pub(crate) mod todo; + use crate::{ config::{Agent, Config, GlobalConfig}, utils::*, @@ -26,6 +28,7 @@ use std::{ process::{Command, Stdio}, }; use strum_macros::AsRefStr; +use todo::TODO_FUNCTION_PREFIX; #[derive(Embed)] #[folder = "assets/functions/"] @@ -262,6 +265,10 @@ impl Functions { self.declarations.is_empty() } + pub fn append_todo_functions(&mut self) { + self.declarations.extend(todo::todo_function_declarations()); + } + pub fn clear_mcp_meta_functions(&mut self) { self.declarations.retain(|d| { !d.name.starts_with(MCP_INVOKE_META_FUNCTION_NAME_PREFIX) @@ -850,7 +857,7 @@ impl ToolCall { _ if cmd_name.starts_with(MCP_SEARCH_META_FUNCTION_NAME_PREFIX) => { Self::search_mcp_tools(config, &cmd_name, &json_data).unwrap_or_else(|e| { let error_msg = format!("MCP search failed: {e}"); - println!("{}", warning_text(&format!("⚠️ {error_msg} ⚠️"))); + eprintln!("{}", warning_text(&format!("⚠️ {error_msg} ⚠️"))); json!({"tool_call_error": error_msg}) }) } @@ -859,7 +866,7 @@ impl ToolCall { .await .unwrap_or_else(|e| { let error_msg = format!("MCP describe failed: {e}"); - println!("{}", warning_text(&format!("⚠️ {error_msg} ⚠️"))); + eprintln!("{}", warning_text(&format!("⚠️ {error_msg} ⚠️"))); json!({"tool_call_error": error_msg}) }) } @@ -868,10 +875,17 @@ impl ToolCall { .await .unwrap_or_else(|e| { let error_msg = format!("MCP tool invocation 
failed: {e}"); - println!("{}", warning_text(&format!("⚠️ {error_msg} ⚠️"))); + eprintln!("{}", warning_text(&format!("⚠️ {error_msg} ⚠️"))); json!({"tool_call_error": error_msg}) }) } + _ if cmd_name.starts_with(TODO_FUNCTION_PREFIX) => { + todo::handle_todo_tool(config, &cmd_name, &json_data).unwrap_or_else(|e| { + let error_msg = format!("Todo tool failed: {e}"); + eprintln!("{}", warning_text(&format!("⚠️ {error_msg} ⚠️"))); + json!({"tool_call_error": error_msg}) + }) + } _ => match run_llm_function(cmd_name, cmd_args, envs, agent_name) { Ok(Some(contents)) => serde_json::from_str(&contents) .ok() @@ -1052,7 +1066,7 @@ pub fn run_llm_function( eprintln!("{stderr}"); } let tool_error_message = format!("Tool call '{command_name}' exited with code {exit_code}"); - println!("{}", warning_text(&format!("⚠️ {tool_error_message} ⚠️"))); + eprintln!("{}", warning_text(&format!("⚠️ {tool_error_message} ⚠️"))); let mut error_json = json!({"tool_call_error": tool_error_message}); if !stderr.is_empty() { error_json["stderr"] = json!(stderr); diff --git a/src/function/todo.rs b/src/function/todo.rs new file mode 100644 index 0000000..e4c2738 --- /dev/null +++ b/src/function/todo.rs @@ -0,0 +1,160 @@ +use super::{FunctionDeclaration, JsonSchema}; +use crate::config::GlobalConfig; + +use anyhow::{Result, bail}; +use indexmap::IndexMap; +use serde_json::{Value, json}; + +pub const TODO_FUNCTION_PREFIX: &str = "todo__"; + +pub fn todo_function_declarations() -> Vec<FunctionDeclaration> { + vec![ + FunctionDeclaration { + name: format!("{TODO_FUNCTION_PREFIX}init"), + description: "Initialize a new todo list with a goal. Clears any existing todos." 
+ .to_string(), + parameters: JsonSchema { + type_value: Some("object".to_string()), + properties: Some(IndexMap::from([( + "goal".to_string(), + JsonSchema { + type_value: Some("string".to_string()), + description: Some( + "The overall goal to achieve when all todos are completed".into(), + ), + ..Default::default() + }, + )])), + required: Some(vec!["goal".to_string()]), + ..Default::default() + }, + agent: false, + }, + FunctionDeclaration { + name: format!("{TODO_FUNCTION_PREFIX}add"), + description: "Add a new todo item to the list.".to_string(), + parameters: JsonSchema { + type_value: Some("object".to_string()), + properties: Some(IndexMap::from([( + "task".to_string(), + JsonSchema { + type_value: Some("string".to_string()), + description: Some("Description of the todo task".into()), + ..Default::default() + }, + )])), + required: Some(vec!["task".to_string()]), + ..Default::default() + }, + agent: false, + }, + FunctionDeclaration { + name: format!("{TODO_FUNCTION_PREFIX}done"), + description: "Mark a todo item as done by its id.".to_string(), + parameters: JsonSchema { + type_value: Some("object".to_string()), + properties: Some(IndexMap::from([( + "id".to_string(), + JsonSchema { + type_value: Some("integer".to_string()), + description: Some("The id of the todo item to mark as done".into()), + ..Default::default() + }, + )])), + required: Some(vec!["id".to_string()]), + ..Default::default() + }, + agent: false, + }, + FunctionDeclaration { + name: format!("{TODO_FUNCTION_PREFIX}list"), + description: "Display the current todo list with status of each item.".to_string(), + parameters: JsonSchema { + type_value: Some("object".to_string()), + ..Default::default() + }, + agent: false, + }, + ] +} + +pub fn handle_todo_tool(config: &GlobalConfig, cmd_name: &str, args: &Value) -> Result<Value> { + let action = cmd_name + .strip_prefix(TODO_FUNCTION_PREFIX) + .unwrap_or(cmd_name); + + match action { + "init" => { + let goal = 
args.get("goal").and_then(Value::as_str).unwrap_or_default(); + let mut cfg = config.write(); + let agent = cfg.agent.as_mut(); + match agent { + Some(agent) => { + agent.init_todo_list(goal); + Ok(json!({"status": "ok", "message": "Initialized new todo list"})) + } + None => bail!("No active agent"), + } + } + "add" => { + let task = args.get("task").and_then(Value::as_str).unwrap_or_default(); + if task.is_empty() { + return Ok(json!({"error": "task description is required"})); + } + let mut cfg = config.write(); + let agent = cfg.agent.as_mut(); + match agent { + Some(agent) => { + let id = agent.add_todo(task); + Ok(json!({"status": "ok", "id": id})) + } + None => bail!("No active agent"), + } + } + "done" => { + let id = args + .get("id") + .and_then(|v| { + v.as_u64() + .or_else(|| v.as_str().and_then(|s| s.parse().ok())) + }) + .map(|v| v as usize); + match id { + Some(id) => { + let mut cfg = config.write(); + let agent = cfg.agent.as_mut(); + match agent { + Some(agent) => { + if agent.mark_todo_done(id) { + Ok( + json!({"status": "ok", "message": format!("Marked todo {id} as done")}), + ) + } else { + Ok(json!({"error": format!("Todo {id} not found")})) + } + } + None => bail!("No active agent"), + } + } + None => Ok(json!({"error": "id is required and must be a number"})), + } + } + "list" => { + let cfg = config.read(); + let agent = cfg.agent.as_ref(); + match agent { + Some(agent) => { + let list = agent.todo_list(); + if list.is_empty() { + Ok(json!({"goal": "", "todos": []})) + } else { + Ok(serde_json::to_value(list) + .unwrap_or(json!({"error": "serialization failed"}))) + } + } + None => bail!("No active agent"), + } + } + _ => bail!("Unknown todo action: {action}"), + } +} diff --git a/src/repl/mod.rs b/src/repl/mod.rs index f0cca2d..5618640 100644 --- a/src/repl/mod.rs +++ b/src/repl/mod.rs @@ -826,6 +826,14 @@ pub async fn run_repl_command( _ => unknown_command()?, }, None => { + if config + .read() + .agent + .as_ref() + .is_some_and(|a| 
a.continuation_count() > 0) + { + config.write().agent.as_mut().unwrap().reset_continuation(); + } let input = Input::from_str(config, line, None); ask(config, abort_signal.clone(), input, true).await?; } @@ -874,9 +882,60 @@ async fn ask( ) .await } else { - Config::maybe_autoname_session(config.clone()); - Config::maybe_compress_session(config.clone()); - Ok(()) + let should_continue = { + let cfg = config.read(); + if let Some(agent) = &cfg.agent { + agent.auto_continue_enabled() + && agent.continuation_count() < agent.max_auto_continues() + && !agent.is_stale_response(&output) + && agent.todo_list().has_incomplete() + } else { + false + } + }; + + if should_continue { + let full_prompt = { + let mut cfg = config.write(); + let agent = cfg.agent.as_mut().expect("agent checked above"); + agent.set_last_continuation_response(output.clone()); + agent.increment_continuation(); + let count = agent.continuation_count(); + let max = agent.max_auto_continues(); + + let todo_state = agent.todo_list().render_for_model(); + let remaining = agent.todo_list().incomplete_count(); + let prompt = agent.continuation_prompt(); + + let color = if cfg.light_theme() { + nu_ansi_term::Color::LightGray + } else { + nu_ansi_term::Color::DarkGray + }; + eprintln!( + "\n📋 {}", + color.italic().paint(format!( + "Auto-continuing ({count}/{max}): {remaining} incomplete todo(s) remain" + )) + ); + + format!("{prompt}\n\n{todo_state}") + }; + let continuation_input = Input::from_str(config, &full_prompt, None); + ask(config, abort_signal, continuation_input, false).await + } else { + if config + .read() + .agent + .as_ref() + .is_some_and(|a| a.continuation_count() > 0) + { + config.write().agent.as_mut().unwrap().reset_continuation(); + } + Config::maybe_autoname_session(config.clone()); + Config::maybe_compress_session(config.clone()); + Ok(()) + } } }
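
The `ask` change in `src/repl/mod.rs` mixes the continuation decision with lock handling and terminal output, which makes the four conditions harder to review in isolation. As a reading aid, here is a minimal sketch of only the decision logic. It assumes a hypothetical `ContinuationState` struct that flattens the relevant `Agent` fields into plain values; the struct and the `continuation_message` helper are illustrative names and are not part of this PR.

```rust
/// Hypothetical, flattened stand-in for the agent state read by the
/// continuation check. Field names mirror the diff; the struct does not
/// exist in the PR.
#[derive(Debug, Default)]
struct ContinuationState {
    auto_continue: bool,
    max_auto_continues: usize,
    continuation_count: usize,
    incomplete_todos: usize,
    last_response: Option<String>,
}

impl ContinuationState {
    /// Mirrors the four conditions of the `should_continue` block:
    /// feature enabled, budget left, response not stale, work remaining.
    fn should_continue(&self, output: &str) -> bool {
        self.auto_continue
            && self.continuation_count < self.max_auto_continues
            && self.last_response.as_deref() != Some(output)
            && self.incomplete_todos > 0
    }

    /// Bookkeeping done before re-prompting: remember the response that
    /// triggered this continuation and count the attempt.
    fn record_continuation(&mut self, output: &str) {
        self.last_response = Some(output.to_string());
        self.continuation_count += 1;
    }

    /// Reset applied on fresh user input or when the loop ends.
    fn reset(&mut self) {
        self.continuation_count = 0;
        self.last_response = None;
    }
}

/// Joins the continuation prompt and the rendered todo state with a blank
/// line, matching the `format!("{prompt}\n\n{todo_state}")` call in the diff.
fn continuation_message(prompt: &str, todo_state: &str) -> String {
    format!("{prompt}\n\n{todo_state}")
}
```

Calling `should_continue`, then `record_continuation`, then `should_continue` again with the same output shows the stale-response guard: the second check returns `false` even though todos remain. Keeping the decision as a pure function like this would also make it straightforward to unit test without a `GlobalConfig`, though whether that refactor is worthwhile is a judgment call for the PR author.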