8 Commits

11 changed files with 739 additions and 200 deletions
+36
View File
@@ -0,0 +1,36 @@
# Code Reviewer
A CodeRabbit-style code review orchestrator that coordinates per-file reviews and synthesizes findings into a unified
report.
This agent acts as the manager for the review process, delegating actual file analysis to **[File Reviewer](../file-reviewer/README.md)**
agents while handling coordination and final reporting.
## Features
- 🤖 **Orchestration**: Spawns parallel reviewers for each changed file.
- 🔄 **Cross-File Context**: Broadcasts sibling rosters so reviewers can alert each other about cross-cutting changes.
- 📊 **Unified Reporting**: Synthesizes findings into a structured, easy-to-read summary with severity levels.
- ⚡ **Parallel Execution**: Runs reviews concurrently for maximum speed.
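Conceptually, the orchestration is a parallel map over changed files followed by a reduce into one report. A minimal sketch of that shape (the `review_file` worker and its signature are hypothetical, not this repo's actual agent API):

```python
from concurrent.futures import ThreadPoolExecutor

def review_file(path: str, siblings: list[str]) -> dict:
    # Placeholder for a spawned File Reviewer; `siblings` is the roster
    # broadcast so reviewers can flag cross-cutting changes to each other.
    return {"file": path, "findings": []}

def review(changed_files: list[str]) -> list[dict]:
    # Fan out one reviewer per file, then collect findings in input order.
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(review_file, f, [s for s in changed_files if s != f])
            for f in changed_files
        ]
        return [fut.result() for fut in futures]
```

The real agent delegates to spawned sub-agents rather than threads, but the fan-out/fan-in structure is the same.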
## Pro-Tip: Use an IDE MCP Server for Improved Performance
Many modern IDEs now ship MCP servers that let LLMs perform operations inside the IDE itself and use its tooling.
Using an IDE's MCP server dramatically improves the performance of coding agents, so if you have one, add its MCP
server to your config (see the [MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md) for setup
instructions) and modify the agent definition to look like this:
```yaml
# ...
mcp_servers:
- jetbrains # The name of your configured IDE MCP server
global_tools:
- fs_read.sh
- fs_grep.sh
- fs_glob.sh
# - execute_command.sh
# ...
```
+24
View File
@@ -14,3 +14,27 @@ acts as the coordinator/architect, while Coder handles the implementation detail
- 📊 Precise diff-based file editing for controlled code modifications
It can also be used as a standalone tool for direct coding assistance.
## Pro-Tip: Use an IDE MCP Server for Improved Performance
Many modern IDEs now ship MCP servers that let LLMs perform operations inside the IDE itself and use its tooling.
Using an IDE's MCP server dramatically improves the performance of coding agents, so if you have one, add its MCP
server to your config (see the [MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md) for setup
instructions) and modify the agent definition to look like this:
```yaml
# ...
mcp_servers:
- jetbrains # The name of your configured IDE MCP server
global_tools:
# Keep useful read-only tools for reading files in other non-project directories
- fs_read.sh
- fs_grep.sh
- fs_glob.sh
# - fs_write.sh
# - fs_patch.sh
- execute_command.sh
# ...
```
+22
View File
@@ -13,3 +13,25 @@ It can also be used as a standalone tool for understanding codebases and finding
- 📂 File system navigation and content analysis
- 🧠 Context gathering for complex tasks
- 🛡️ Read-only operations for safe investigation
## Pro-Tip: Use an IDE MCP Server for Improved Performance
Many modern IDEs now ship MCP servers that let LLMs perform operations inside the IDE itself and use its tooling.
Using an IDE's MCP server dramatically improves the performance of coding agents, so if you have one, add its MCP
server to your config (see the [MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md) for setup
instructions) and modify the agent definition to look like this:
```yaml
# ...
mcp_servers:
- jetbrains # The name of your configured IDE MCP server
global_tools:
- fs_read.sh
- fs_grep.sh
- fs_glob.sh
- fs_ls.sh
- web_search_loki.sh
# ...
```
+35
View File
@@ -0,0 +1,35 @@
# File Reviewer
A specialized worker agent that reviews a single file's diff for bugs, style issues, and cross-cutting concerns.
This agent is designed to be spawned by the **[Code Reviewer](../code-reviewer/README.md)** agent. It focuses deeply on
one file while communicating with sibling agents to catch issues that span multiple files.
## Features
- 🔍 **Deep Analysis**: Focuses on bugs, logic errors, security issues, and style problems in a single file.
- 🗣️ **Teammate Communication**: Sends and receives alerts to/from sibling reviewers about interface or dependency
changes.
- 🎯 **Targeted Reading**: Reads only relevant context around changed lines to stay efficient.
- 🏷️ **Structured Findings**: Categorizes issues by severity (🔴 Critical, 🟡 Warning, 🟢 Suggestion, 💡 Nitpick).
## Pro-Tip: Use an IDE MCP Server for Improved Performance
Many modern IDEs now ship MCP servers that let LLMs perform operations inside the IDE itself and use its tooling.
Using an IDE's MCP server dramatically improves the performance of coding agents, so if you have one, add its MCP
server to your config (see the [MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md) for setup
instructions) and modify the agent definition to look like this:
```yaml
# ...
mcp_servers:
- jetbrains # The name of your configured IDE MCP server
global_tools:
- fs_read.sh
- fs_grep.sh
- fs_glob.sh
# ...
```
+22
View File
@@ -15,3 +15,25 @@ It can also be used as a standalone tool for design reviews and solving difficul
- ⚖️ Tradeoff analysis and technology selection
- 📝 Code review and best practices advice
- 🧠 Deep reasoning for ambiguous problems
## Pro-Tip: Use an IDE MCP Server for Improved Performance
Many modern IDEs now ship MCP servers that let LLMs perform operations inside the IDE itself and use its tooling.
Using an IDE's MCP server dramatically improves the performance of coding agents, so if you have one, add its MCP
server to your config (see the [MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md) for setup
instructions) and modify the agent definition to look like this:
```yaml
# ...
mcp_servers:
- jetbrains # The name of your configured IDE MCP server
global_tools:
- fs_read.sh
- fs_grep.sh
- fs_glob.sh
- fs_ls.sh
- web_search_loki.sh
# ...
```
+23
View File
@@ -16,3 +16,26 @@ Sisyphus acts as the primary entry point, capable of handling complex tasks by c
- 💻 **CLI Coding**: Provides a natural language interface for writing and editing code.
- 🔄 **Task Management**: Tracks progress and context across complex operations.
- 🛠️ **Tool Integration**: Seamlessly uses system tools for building, testing, and file manipulation.
## Pro-Tip: Use an IDE MCP Server for Improved Performance
Many modern IDEs now ship MCP servers that let LLMs perform operations inside the IDE itself and use its tooling.
Using an IDE's MCP server dramatically improves the performance of coding agents, so if you have one, add its MCP
server to your config (see the [MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md) for setup
instructions) and modify the agent definition to look like this:
```yaml
# ...
mcp_servers:
- jetbrains
global_tools:
- fs_read.sh
- fs_grep.sh
- fs_glob.sh
- fs_ls.sh
- web_search_loki.sh
- execute_command.sh
# ...
```
+195 -132
View File
@@ -81,6 +81,7 @@
supports_vision: true
supports_function_calling: true
- name: o4-mini
max_output_tokens: 100000
max_input_tokens: 200000
input_price: 1.1
output_price: 4.4
@@ -93,6 +94,7 @@
temperature: null
top_p: null
- name: o4-mini-high
max_output_tokens: 100000
real_name: o4-mini
max_input_tokens: 200000
input_price: 1.1
@@ -107,6 +109,7 @@
temperature: null
top_p: null
- name: o3
max_output_tokens: 100000
max_input_tokens: 200000
input_price: 2
output_price: 8
@@ -133,6 +136,7 @@
temperature: null
top_p: null
- name: o3-mini
max_output_tokens: 100000
max_input_tokens: 200000
input_price: 1.1
output_price: 4.4
@@ -145,6 +149,7 @@
temperature: null
top_p: null
- name: o3-mini-high
max_output_tokens: 100000
real_name: o3-mini
max_input_tokens: 200000
input_price: 1.1
@@ -192,23 +197,23 @@
models:
- name: gemini-2.5-flash
max_input_tokens: 1048576
max_output_tokens: 65536
input_price: 0
output_price: 0
max_output_tokens: 65535
input_price: 0.3
output_price: 2.5
supports_vision: true
supports_function_calling: true
- name: gemini-2.5-pro
max_input_tokens: 1048576
max_output_tokens: 65536
input_price: 0
output_price: 0
input_price: 1.25
output_price: 10
supports_vision: true
supports_function_calling: true
- name: gemini-2.5-flash-lite
max_input_tokens: 1000000
max_output_tokens: 64000
input_price: 0
output_price: 0
max_input_tokens: 1048576
max_output_tokens: 65535
input_price: 0.1
output_price: 0.4
supports_vision: true
supports_function_calling: true
- name: gemini-2.0-flash
@@ -226,10 +231,11 @@
supports_vision: true
supports_function_calling: true
- name: gemma-3-27b-it
max_input_tokens: 131072
max_output_tokens: 8192
input_price: 0
output_price: 0
supports_vision: true
max_input_tokens: 128000
max_output_tokens: 65536
input_price: 0.04
output_price: 0.15
- name: text-embedding-004
type: embedding
input_price: 0
@@ -509,8 +515,8 @@
output_price: 10
supports_vision: true
- name: command-r7b-12-2024
max_input_tokens: 131072
max_output_tokens: 4096
max_input_tokens: 128000
max_output_tokens: 4000
input_price: 0.0375
output_price: 0.15
- name: embed-v4.0
@@ -547,6 +553,7 @@
- provider: xai
models:
- name: grok-4
supports_vision: true
max_input_tokens: 256000
input_price: 3
output_price: 15
@@ -583,14 +590,18 @@
- provider: perplexity
models:
- name: sonar-pro
max_output_tokens: 8000
supports_vision: true
max_input_tokens: 200000
input_price: 3
output_price: 15
- name: sonar
max_input_tokens: 128000
supports_vision: true
max_input_tokens: 127072
input_price: 1
output_price: 1
- name: sonar-reasoning-pro
supports_vision: true
max_input_tokens: 128000
input_price: 2
output_price: 8
@@ -663,13 +674,13 @@
hipaa_safe: true
max_input_tokens: 1048576
max_output_tokens: 65536
input_price: 0
output_price: 0
input_price: 2
output_price: 12
supports_vision: true
supports_function_calling: true
- name: gemini-2.5-flash
max_input_tokens: 1048576
max_output_tokens: 65536
max_output_tokens: 65535
input_price: 0.3
output_price: 2.5
supports_vision: true
@@ -683,16 +694,16 @@
supports_function_calling: true
- name: gemini-2.5-flash-lite
max_input_tokens: 1048576
max_output_tokens: 65536
input_price: 0.3
max_output_tokens: 65535
input_price: 0.1
output_price: 0.4
supports_vision: true
supports_function_calling: true
- name: gemini-2.0-flash-001
max_input_tokens: 1048576
max_output_tokens: 8192
input_price: 0.15
output_price: 0.6
input_price: 0.1
output_price: 0.4
supports_vision: true
supports_function_calling: true
- name: gemini-2.0-flash-lite-001
@@ -1194,10 +1205,16 @@
- provider: qianwen
models:
- name: qwen3-max
input_price: 1.2
output_price: 6
max_output_tokens: 32768
max_input_tokens: 262144
supports_function_calling: true
- name: qwen-plus
max_input_tokens: 131072
input_price: 0.4
output_price: 1.2
max_output_tokens: 32768
max_input_tokens: 1000000
supports_function_calling: true
- name: qwen-flash
max_input_tokens: 1000000
@@ -1213,14 +1230,14 @@
- name: qwen-coder-flash
max_input_tokens: 1000000
- name: qwen3-next-80b-a3b-instruct
max_input_tokens: 131072
input_price: 0.14
output_price: 0.56
max_input_tokens: 262144
input_price: 0.09
output_price: 1.1
supports_function_calling: true
- name: qwen3-next-80b-a3b-thinking
max_input_tokens: 131072
input_price: 0.14
output_price: 1.4
max_input_tokens: 128000
input_price: 0.15
output_price: 1.2
- name: qwen3-235b-a22b-instruct-2507
max_input_tokens: 131072
input_price: 0.28
@@ -1228,35 +1245,39 @@
supports_function_calling: true
- name: qwen3-235b-a22b-thinking-2507
max_input_tokens: 131072
input_price: 0.28
output_price: 2.8
input_price: 0
output_price: 0
- name: qwen3-30b-a3b-instruct-2507
max_input_tokens: 131072
input_price: 0.105
output_price: 0.42
max_output_tokens: 262144
max_input_tokens: 262144
input_price: 0.09
output_price: 0.3
supports_function_calling: true
- name: qwen3-30b-a3b-thinking-2507
max_input_tokens: 131072
input_price: 0.105
output_price: 1.05
max_input_tokens: 32768
input_price: 0.051
output_price: 0.34
- name: qwen3-vl-32b-instruct
max_output_tokens: 32768
max_input_tokens: 131072
input_price: 0.28
output_price: 1.12
input_price: 0.104
output_price: 0.416
supports_vision: true
- name: qwen3-vl-8b-instruct
max_output_tokens: 32768
max_input_tokens: 131072
input_price: 0.07
output_price: 0.28
input_price: 0.08
output_price: 0.5
supports_vision: true
- name: qwen3-coder-480b-a35b-instruct
max_input_tokens: 262144
input_price: 1.26
output_price: 5.04
- name: qwen3-coder-30b-a3b-instruct
max_input_tokens: 262144
input_price: 0.315
output_price: 1.26
max_output_tokens: 32768
max_input_tokens: 160000
input_price: 0.07
output_price: 0.27
- name: deepseek-v3.2-exp
max_input_tokens: 131072
input_price: 0.28
@@ -1332,9 +1353,9 @@
output_price: 8.12
supports_vision: true
- name: kimi-k2-thinking
max_input_tokens: 262144
input_price: 0.56
output_price: 2.24
max_input_tokens: 131072
input_price: 0.47
output_price: 2
supports_vision: true
# Links:
@@ -1343,10 +1364,10 @@
- provider: deepseek
models:
- name: deepseek-chat
max_input_tokens: 64000
max_output_tokens: 8192
input_price: 0.56
output_price: 1.68
max_input_tokens: 163840
max_output_tokens: 163840
input_price: 0.32
output_price: 0.89
supports_function_calling: true
- name: deepseek-reasoner
max_input_tokens: 64000
@@ -1424,9 +1445,10 @@
- provider: minimax
models:
- name: minimax-m2
max_input_tokens: 204800
input_price: 0.294
output_price: 1.176
max_output_tokens: 65536
max_input_tokens: 196608
input_price: 0.255
output_price: 1
supports_function_calling: true
# Links:
@@ -1442,8 +1464,8 @@
supports_vision: true
supports_function_calling: true
- name: openai/gpt-5.1-chat
max_input_tokens: 400000
max_output_tokens: 128000
max_input_tokens: 128000
max_output_tokens: 16384
input_price: 1.25
output_price: 10
supports_vision: true
@@ -1456,8 +1478,8 @@
supports_vision: true
supports_function_calling: true
- name: openai/gpt-5-chat
max_input_tokens: 400000
max_output_tokens: 128000
max_input_tokens: 128000
max_output_tokens: 16384
input_price: 1.25
output_price: 10
supports_vision: true
@@ -1498,18 +1520,21 @@
supports_vision: true
supports_function_calling: true
- name: openai/gpt-4o
max_output_tokens: 16384
max_input_tokens: 128000
input_price: 2.5
output_price: 10
supports_vision: true
supports_function_calling: true
- name: openai/gpt-4o-mini
max_output_tokens: 16384
max_input_tokens: 128000
input_price: 0.15
output_price: 0.6
supports_vision: true
supports_function_calling: true
- name: openai/o4-mini
max_output_tokens: 100000
max_input_tokens: 200000
input_price: 1.1
output_price: 4.4
@@ -1522,6 +1547,7 @@
temperature: null
top_p: null
- name: openai/o4-mini-high
max_output_tokens: 100000
max_input_tokens: 200000
input_price: 1.1
output_price: 4.4
@@ -1535,6 +1561,7 @@
temperature: null
top_p: null
- name: openai/o3
max_output_tokens: 100000
max_input_tokens: 200000
input_price: 2
output_price: 8
@@ -1560,6 +1587,7 @@
temperature: null
top_p: null
- name: openai/o3-mini
max_output_tokens: 100000
max_input_tokens: 200000
input_price: 1.1
output_price: 4.4
@@ -1571,6 +1599,7 @@
temperature: null
top_p: null
- name: openai/o3-mini-high
max_output_tokens: 100000
max_input_tokens: 200000
input_price: 1.1
output_price: 4.4
@@ -1583,50 +1612,57 @@
top_p: null
- name: openai/gpt-oss-120b
max_input_tokens: 131072
input_price: 0.09
output_price: 0.45
input_price: 0.039
output_price: 0.19
supports_function_calling: true
- name: openai/gpt-oss-20b
max_input_tokens: 131072
input_price: 0.04
output_price: 0.16
input_price: 0.03
output_price: 0.14
supports_function_calling: true
- name: google/gemini-2.5-flash
max_output_tokens: 65535
max_input_tokens: 1048576
input_price: 0.3
output_price: 2.5
supports_vision: true
supports_function_calling: true
- name: google/gemini-2.5-pro
max_output_tokens: 65536
max_input_tokens: 1048576
input_price: 1.25
output_price: 10
supports_vision: true
supports_function_calling: true
- name: google/gemini-2.5-flash-lite
max_output_tokens: 65535
max_input_tokens: 1048576
input_price: 0.3
input_price: 0.1
output_price: 0.4
supports_vision: true
- name: google/gemini-2.0-flash-001
max_input_tokens: 1000000
input_price: 0.15
output_price: 0.6
max_output_tokens: 8192
max_input_tokens: 1048576
input_price: 0.1
output_price: 0.4
supports_vision: true
supports_function_calling: true
- name: google/gemini-2.0-flash-lite-001
max_output_tokens: 8192
max_input_tokens: 1048576
input_price: 0.075
output_price: 0.3
supports_vision: true
supports_function_calling: true
- name: google/gemma-3-27b-it
max_input_tokens: 131072
input_price: 0.1
output_price: 0.2
max_output_tokens: 65536
supports_vision: true
max_input_tokens: 128000
input_price: 0.04
output_price: 0.15
- name: anthropic/claude-sonnet-4.5
max_input_tokens: 200000
max_output_tokens: 8192
max_input_tokens: 1000000
max_output_tokens: 64000
require_max_tokens: true
input_price: 3
output_price: 15
@@ -1634,7 +1670,7 @@
supports_function_calling: true
- name: anthropic/claude-haiku-4.5
max_input_tokens: 200000
max_output_tokens: 8192
max_output_tokens: 64000
require_max_tokens: true
input_price: 1
output_price: 5
@@ -1642,7 +1678,7 @@
supports_function_calling: true
- name: anthropic/claude-opus-4.1
max_input_tokens: 200000
max_output_tokens: 8192
max_output_tokens: 32000
require_max_tokens: true
input_price: 15
output_price: 75
@@ -1650,15 +1686,15 @@
supports_function_calling: true
- name: anthropic/claude-opus-4
max_input_tokens: 200000
max_output_tokens: 8192
max_output_tokens: 32000
require_max_tokens: true
input_price: 15
output_price: 75
supports_vision: true
supports_function_calling: true
- name: anthropic/claude-sonnet-4
max_input_tokens: 200000
max_output_tokens: 8192
max_input_tokens: 1000000
max_output_tokens: 64000
require_max_tokens: true
input_price: 3
output_price: 15
@@ -1666,7 +1702,7 @@
supports_function_calling: true
- name: anthropic/claude-3.7-sonnet
max_input_tokens: 200000
max_output_tokens: 8192
max_output_tokens: 64000
require_max_tokens: true
input_price: 3
output_price: 15
@@ -1681,21 +1717,24 @@
supports_vision: true
supports_function_calling: true
- name: meta-llama/llama-4-maverick
max_output_tokens: 16384
max_input_tokens: 1048576
input_price: 0.18
input_price: 0.15
output_price: 0.6
supports_vision: true
supports_function_calling: true
- name: meta-llama/llama-4-scout
max_output_tokens: 16384
max_input_tokens: 327680
input_price: 0.08
output_price: 0.3
supports_vision: true
supports_function_calling: true
- name: meta-llama/llama-3.3-70b-instruct
max_output_tokens: 16384
max_input_tokens: 131072
input_price: 0.12
output_price: 0.3
input_price: 0.1
output_price: 0.32
- name: mistralai/mistral-medium-3.1
max_input_tokens: 131072
input_price: 0.4
@@ -1703,9 +1742,10 @@
supports_function_calling: true
supports_vision: true
- name: mistralai/mistral-small-3.2-24b-instruct
max_output_tokens: 131072
max_input_tokens: 131072
input_price: 0.1
output_price: 0.3
input_price: 0.06
output_price: 0.18
supports_vision: true
- name: mistralai/magistral-medium-2506
max_input_tokens: 40960
@@ -1726,8 +1766,8 @@
supports_function_calling: true
- name: mistralai/devstral-small
max_input_tokens: 131072
input_price: 0.07
output_price: 0.28
input_price: 0.1
output_price: 0.3
supports_function_calling: true
- name: mistralai/codestral-2508
max_input_tokens: 256000
@@ -1735,6 +1775,7 @@
output_price: 0.9
supports_function_calling: true
- name: ai21/jamba-large-1.7
max_output_tokens: 4096
max_input_tokens: 256000
input_price: 2
output_price: 8
@@ -1745,88 +1786,98 @@
output_price: 0.4
supports_function_calling: true
- name: cohere/command-a
max_output_tokens: 8192
max_input_tokens: 256000
input_price: 2.5
output_price: 10
supports_function_calling: true
- name: cohere/command-r7b-12-2024
max_input_tokens: 128000
max_output_tokens: 4096
max_output_tokens: 4000
input_price: 0.0375
output_price: 0.15
- name: deepseek/deepseek-v3.2-exp
max_output_tokens: 65536
max_input_tokens: 163840
input_price: 0.27
output_price: 0.40
output_price: 0.41
- name: deepseek/deepseek-v3.1-terminus
max_input_tokens: 163840
input_price: 0.23
output_price: 0.90
input_price: 0.21
output_price: 0.79
- name: deepseek/deepseek-chat-v3.1
max_input_tokens: 163840
input_price: 0.2
output_price: 0.8
max_output_tokens: 7168
max_input_tokens: 32768
input_price: 0.15
output_price: 0.75
- name: deepseek/deepseek-r1-0528
max_input_tokens: 128000
input_price: 0.50
output_price: 2.15
max_output_tokens: 65536
max_input_tokens: 163840
input_price: 0.4
output_price: 1.75
patch:
body:
include_reasoning: true
- name: qwen/qwen3-max
max_output_tokens: 32768
max_input_tokens: 262144
input_price: 1.2
output_price: 6
supports_function_calling: true
- name: qwen/qwen-plus
max_input_tokens: 131072
max_output_tokens: 8192
max_input_tokens: 1000000
max_output_tokens: 32768
input_price: 0.4
output_price: 1.2
supports_function_calling: true
- name: qwen/qwen3-next-80b-a3b-instruct
max_input_tokens: 262144
input_price: 0.1
output_price: 0.8
input_price: 0.09
output_price: 1.1
supports_function_calling: true
- name: qwen/qwen3-next-80b-a3b-thinking
max_input_tokens: 262144
input_price: 0.1
output_price: 0.8
max_input_tokens: 128000
input_price: 0.15
output_price: 1.2
- name: qwen/qwen3-235b-a22b-2507 # Qwen3 235B A22B Instruct 2507
max_input_tokens: 262144
input_price: 0.12
output_price: 0.59
supports_function_calling: true
- name: qwen/qwen3-235b-a22b-thinking-2507
max_input_tokens: 262144
input_price: 0.118
output_price: 0.118
- name: qwen/qwen3-30b-a3b-instruct-2507
max_input_tokens: 131072
input_price: 0.2
output_price: 0.8
input_price: 0
output_price: 0
- name: qwen/qwen3-30b-a3b-instruct-2507
max_output_tokens: 262144
max_input_tokens: 262144
input_price: 0.09
output_price: 0.3
- name: qwen/qwen3-30b-a3b-thinking-2507
max_input_tokens: 262144
input_price: 0.071
output_price: 0.285
max_input_tokens: 32768
input_price: 0.051
output_price: 0.34
- name: qwen/qwen3-vl-32b-instruct
max_input_tokens: 262144
input_price: 0.35
output_price: 1.1
max_output_tokens: 32768
max_input_tokens: 131072
input_price: 0.104
output_price: 0.416
supports_vision: true
- name: qwen/qwen3-vl-8b-instruct
max_input_tokens: 262144
max_output_tokens: 32768
max_input_tokens: 131072
input_price: 0.08
output_price: 0.50
output_price: 0.5
supports_vision: true
- name: qwen/qwen3-coder-plus
max_input_tokens: 128000
max_output_tokens: 65536
max_input_tokens: 1000000
input_price: 1
output_price: 5
supports_function_calling: true
- name: qwen/qwen3-coder-flash
max_input_tokens: 128000
max_output_tokens: 65536
max_input_tokens: 1000000
input_price: 0.3
output_price: 1.5
supports_function_calling: true
@@ -1836,19 +1887,20 @@
output_price: 0.95
supports_function_calling: true
- name: qwen/qwen3-coder-30b-a3b-instruct
max_input_tokens: 262144
input_price: 0.052
output_price: 0.207
max_output_tokens: 32768
max_input_tokens: 160000
input_price: 0.07
output_price: 0.27
supports_function_calling: true
- name: moonshotai/kimi-k2-0905
max_input_tokens: 262144
input_price: 0.296
output_price: 1.185
max_input_tokens: 131072
input_price: 0.4
output_price: 2
supports_function_calling: true
- name: moonshotai/kimi-k2-thinking
max_input_tokens: 262144
input_price: 0.45
output_price: 2.35
max_input_tokens: 131072
input_price: 0.47
output_price: 2
supports_function_calling: true
- name: moonshotai/kimi-dev-72b
max_input_tokens: 131072
@@ -1856,21 +1908,26 @@
output_price: 1.15
supports_function_calling: true
- name: x-ai/grok-4
supports_vision: true
max_input_tokens: 256000
input_price: 3
output_price: 15
supports_function_calling: true
- name: x-ai/grok-4-fast
max_output_tokens: 30000
supports_vision: true
max_input_tokens: 2000000
input_price: 0.2
output_price: 0.5
supports_function_calling: true
- name: x-ai/grok-code-fast-1
max_output_tokens: 10000
max_input_tokens: 256000
input_price: 0.2
output_price: 1.5
supports_function_calling: true
- name: amazon/nova-premier-v1
max_output_tokens: 32000
max_input_tokens: 1000000
input_price: 2.5
output_price: 12.5
@@ -1893,14 +1950,18 @@
input_price: 0.035
output_price: 0.14
- name: perplexity/sonar-pro
max_output_tokens: 8000
supports_vision: true
max_input_tokens: 200000
input_price: 3
output_price: 15
- name: perplexity/sonar
supports_vision: true
max_input_tokens: 127072
input_price: 1
output_price: 1
- name: perplexity/sonar-reasoning-pro
supports_vision: true
max_input_tokens: 128000
input_price: 2
output_price: 8
@@ -1915,20 +1976,22 @@
body:
include_reasoning: true
- name: perplexity/sonar-deep-research
max_input_tokens: 200000
max_input_tokens: 128000
input_price: 2
output_price: 8
patch:
body:
include_reasoning: true
- name: minimax/minimax-m2
max_output_tokens: 65536
max_input_tokens: 196608
input_price: 0.15
output_price: 0.45
input_price: 0.255
output_price: 1
- name: z-ai/glm-4.6
max_output_tokens: 131072
max_input_tokens: 202752
input_price: 0.5
output_price: 1.75
input_price: 0.35
output_price: 1.71
supports_function_calling: true
# Links:
+2
View File
@@ -0,0 +1,2 @@
requests
ruamel.yaml
+255
View File
@@ -0,0 +1,255 @@
import requests
import sys
import re
import json

# Provider mapping from models.yaml to OpenRouter prefixes
PROVIDER_MAPPING = {
    "openai": "openai",
    "claude": "anthropic",
    "gemini": "google",
    "mistral": "mistralai",
    "cohere": "cohere",
    "perplexity": "perplexity",
    "xai": "x-ai",
    "openrouter": "openrouter",
    "ai21": "ai21",
    "deepseek": "deepseek",
    "moonshot": "moonshotai",
    "qianwen": "qwen",
    "zhipuai": "zhipuai",
    "minimax": "minimax",
    "vertexai": "google",
    "groq": "groq",
    "bedrock": "amazon",
    "hunyuan": "tencent",
    "ernie": "baidu",
    "github": "github",
}


def fetch_openrouter_models():
    print("Fetching models from OpenRouter...")
    try:
        response = requests.get("https://openrouter.ai/api/v1/models")
        response.raise_for_status()
        data = response.json()["data"]
        print(f"Fetched {len(data)} models.")
        return data
    except Exception as e:
        print(f"Error fetching models: {e}")
        sys.exit(1)


def get_openrouter_model(models_data, provider_prefix, model_name, is_openrouter_provider=False):
    if is_openrouter_provider:
        # For the openrouter provider, the model_name in yaml is usually the full ID
        for model in models_data:
            if model["id"] == model_name:
                return model
        return None
    expected_id = f"{provider_prefix}/{model_name}"
    # 1. Try exact match on ID
    for model in models_data:
        if model["id"] == expected_id:
            return model
    # 2. Try match by suffix
    for model in models_data:
        if model["id"].split("/")[-1] == model_name:
            if model["id"].startswith(f"{provider_prefix}/"):
                return model
    return None


def format_price(price_per_token):
    if price_per_token is None:
        return None
    try:
        price_per_1m = float(price_per_token) * 1_000_000
        if price_per_1m.is_integer():
            return str(int(price_per_1m))
        else:
            return str(round(price_per_1m, 4))
    except (TypeError, ValueError):
        return None


def get_indentation(line):
    return len(line) - len(line.lstrip())


def process_model_block(block_lines, current_provider, or_models):
    if not block_lines:
        return []
    # 1. Identify model name and indentation
    name_line = block_lines[0]
    name_match = re.match(r"^(\s*)-\s*name:\s*(.+)$", name_line)
    if not name_match:
        return block_lines
    name_indent_str = name_match.group(1)
    model_name = name_match.group(2).strip()
    # 2. Find the matching OpenRouter model
    or_prefix = PROVIDER_MAPPING.get(current_provider)
    is_openrouter_provider = (current_provider == "openrouter")
    if not or_prefix and not is_openrouter_provider:
        return block_lines
    or_model = get_openrouter_model(or_models, or_prefix, model_name, is_openrouter_provider)
    if not or_model:
        return block_lines
    print(f"  Updating {model_name}...")
    # 3. Prepare updates
    updates = {}
    # Pricing
    pricing = or_model.get("pricing", {})
    p_in = format_price(pricing.get("prompt"))
    p_out = format_price(pricing.get("completion"))
    if p_in:
        updates["input_price"] = p_in
    if p_out:
        updates["output_price"] = p_out
    # Context
    ctx = or_model.get("context_length")
    if ctx:
        updates["max_input_tokens"] = str(ctx)
    max_out = None
    if "top_provider" in or_model and or_model["top_provider"]:
        max_out = or_model["top_provider"].get("max_completion_tokens")
    if max_out:
        updates["max_output_tokens"] = str(max_out)
    # Capabilities
    arch = or_model.get("architecture", {})
    modality = arch.get("modality", "")
    if "image" in modality:
        updates["supports_vision"] = "true"
    # 4. Detect field indentation
    field_indent_str = None
    existing_fields = {}  # key -> line_index
    for i, line in enumerate(block_lines):
        if i == 0:
            continue  # Skip the name line
        # Skip comments
        if line.strip().startswith("#"):
            continue
        # Look for "key: value"
        m = re.match(r"^(\s*)([\w_-]+):", line)
        if m:
            indent = m.group(1)
            key = m.group(2)
            # Must be deeper than the name line
            if len(indent) > len(name_indent_str):
                if field_indent_str is None:
                    field_indent_str = indent
                existing_fields[key] = i
    if field_indent_str is None:
        field_indent_str = name_indent_str + "  "
    # 5. Apply updates
    new_block = list(block_lines)
    # Update existing fields
    for key, value in updates.items():
        if key in existing_fields:
            idx = existing_fields[key]
            # Preserve original key indentation exactly
            original_line = new_block[idx]
            m = re.match(r"^(\s*)([\w_-]+):", original_line)
            if m:
                current_indent = m.group(1)
                new_block[idx] = f"{current_indent}{key}: {value}\n"
    # Insert missing fields just after the name line
    insertion_idx = 1
    for key, value in updates.items():
        if key not in existing_fields:
            new_line = f"{field_indent_str}{key}: {value}\n"
            new_block.insert(insertion_idx, new_line)
            insertion_idx += 1
    return new_block


def main():
    or_models = fetch_openrouter_models()
    print("Reading models.yaml...")
    with open("models.yaml", "r") as f:
        lines = f.readlines()
    new_lines = []
    current_provider = None
    i = 0
    while i < len(lines):
        line = lines[i]
        # Check for a provider line: "- provider: name"
        p_match = re.match(r"^\s*-?\s*provider:\s*(.+)$", line)
        if p_match:
            current_provider = p_match.group(1).strip()
            new_lines.append(line)
            i += 1
            continue
        # Check for a model block start: "- name: ..."
        m_match = re.match(r"^(\s*)-\s*name:\s*.+$", line)
        if m_match:
            # Start of a model block
            start_indent = len(m_match.group(1))
            # Collect block lines
            block_lines = [line]
            j = i + 1
            while j < len(lines):
                next_line = lines[j]
                stripped = next_line.strip()
                # If empty or comment, include it
                if not stripped or stripped.startswith("#"):
                    block_lines.append(next_line)
                    j += 1
                    continue
                # Check indentation
                next_indent = get_indentation(next_line)
                # If indentation is greater, it's part of the block (a property)
                if next_indent > start_indent:
                    block_lines.append(next_line)
                    j += 1
                    continue
                # If indentation is equal or less, it's the end of the block
                break
            # Process the block
            processed_block = process_model_block(block_lines, current_provider, or_models)
            new_lines.extend(processed_block)
            # Advance past the block
            i = j
            continue
        # Otherwise, just a regular line
        new_lines.append(line)
        i += 1
    print("Saving models.yaml...")
    with open("models.yaml", "w") as f:
        f.writelines(new_lines)
    print("Done.")


if __name__ == "__main__":
    main()
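The block-collection step above — find a `- name:` line, then absorb every following line that is indented deeper — can be exercised in isolation. A self-contained sketch on a tiny in-memory snippet (the sample YAML is illustrative, not taken from models.yaml):

```python
import re

def get_indentation(line):
    # Same helper as in the script: leading-whitespace width.
    return len(line) - len(line.lstrip())

lines = """\
- provider: openai
  models:
    - name: gpt-4o
      input_price: 2.5
    - name: o3
      input_price: 2
""".splitlines(keepends=True)

# Collect the lines belonging to the first "- name:" block.
start = next(i for i, l in enumerate(lines) if re.match(r"^(\s*)-\s*name:", l))
start_indent = get_indentation(lines[start])
block = [lines[start]]
for line in lines[start + 1:]:
    # A non-blank line at the same or lesser indent ends the block.
    if line.strip() and get_indentation(line) <= start_indent:
        break
    block.append(line)
```

Here `block` holds exactly the `gpt-4o` entry and its one property line; the second `- name:` item terminates the scan because it sits at the same indentation as the first.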
+44 -7
View File
@@ -22,7 +22,7 @@ use serde_json::{Value, json};
use std::collections::VecDeque;
use std::ffi::OsStr;
use std::fs::File;
use std::io::Write;
use std::io::{Read, Write};
use std::{
    collections::{HashMap, HashSet},
    env, fs, io,
@@ -1064,7 +1064,7 @@ impl ToolCall {
            function_name.clone(),
            function_name,
            vec![],
            Default::default(),
            agent.variable_envs(),
        ))
    }
}
@@ -1117,18 +1117,55 @@ pub fn run_llm_function(
     #[cfg(windows)]
     let cmd_name = polyfill_cmd_name(&cmd_name, &bin_dirs);
-    let output = Command::new(&cmd_name)
+    envs.insert("CLICOLOR_FORCE".into(), "1".into());
+    envs.insert("FORCE_COLOR".into(), "1".into());
+    let mut child = Command::new(&cmd_name)
         .args(&cmd_args)
         .envs(envs)
-        .stdout(Stdio::inherit())
+        .stdout(Stdio::piped())
         .stderr(Stdio::piped())
         .spawn()
-        .and_then(|child| child.wait_with_output())
         .map_err(|err| anyhow!("Unable to run {command_name}, {err}"))?;
-    let exit_code = output.status.code().unwrap_or_default();
+    let stdout = child.stdout.take().expect("Failed to capture stdout");
+    let mut stderr = child.stderr.take().expect("Failed to capture stderr");
+    let stdout_thread = std::thread::spawn(move || {
+        let mut buffer = [0; 1024];
+        let mut reader = stdout;
+        let mut out = io::stdout();
+        while let Ok(n) = reader.read(&mut buffer) {
+            if n == 0 { break; }
+            let chunk = &buffer[0..n];
+            let mut last_pos = 0;
+            for (i, &byte) in chunk.iter().enumerate() {
+                if byte == b'\n' {
+                    let _ = out.write_all(&chunk[last_pos..i]);
+                    let _ = out.write_all(b"\r\n");
+                    last_pos = i + 1;
+                }
+            }
+            if last_pos < n {
+                let _ = out.write_all(&chunk[last_pos..n]);
+            }
+            let _ = out.flush();
+        }
+    });
+    let stderr_thread = std::thread::spawn(move || {
+        let mut buf = Vec::new();
+        let _ = stderr.read_to_end(&mut buf);
+        buf
+    });
+    let status = child.wait().map_err(|err| anyhow!("Unable to run {command_name}, {err}"))?;
+    let _ = stdout_thread.join();
+    let stderr_bytes = stderr_thread.join().unwrap_or_default();
+    let exit_code = status.code().unwrap_or_default();
     if exit_code != 0 {
-        let stderr = String::from_utf8_lossy(&output.stderr).trim().to_string();
+        let stderr = String::from_utf8_lossy(&stderr_bytes).trim().to_string();
         if !stderr.is_empty() {
             eprintln!("{stderr}");
         }
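The byte-scanning loop in this hunk rewrites each bare `\n` as `\r\n` while relaying child stdout, which is what a terminal in raw mode needs for lines to render correctly. The translation can be isolated as a pure function (a sketch of the same logic, not an API from this crate):

```rust
// Rewrite bare LF bytes as CRLF, as the stdout relay thread does when
// splicing child output into a raw-mode terminal.
fn lf_to_crlf(chunk: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(chunk.len() + 8);
    let mut last_pos = 0;
    for (i, &byte) in chunk.iter().enumerate() {
        if byte == b'\n' {
            out.extend_from_slice(&chunk[last_pos..i]); // bytes before the LF
            out.extend_from_slice(b"\r\n");
            last_pos = i + 1;
        }
    }
    out.extend_from_slice(&chunk[last_pos..]); // trailing bytes with no LF
    out
}

fn main() {
    assert_eq!(lf_to_crlf(b"a\nb"), b"a\r\nb");
    assert_eq!(lf_to_crlf(b"no newline"), b"no newline");
}
```

Note that, like the loop in the diff, this does not special-case input that already contains `\r\n`, which would come out as `\r\r\n`; that is harmless for most terminals but worth knowing.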
+64 -44
View File
@@ -158,27 +158,31 @@ impl McpRegistry {
     }
 
     pub async fn reinit(
-        registry: McpRegistry,
+        mut registry: McpRegistry,
        enabled_mcp_servers: Option<String>,
         abort_signal: AbortSignal,
     ) -> Result<Self> {
         debug!("Reinitializing MCP registry");
-        debug!("Stopping all MCP servers");
-        let mut new_registry = abortable_run_with_spinner(
-            registry.stop_all_servers(),
-            "Stopping MCP servers",
+        let desired_ids = registry.resolve_server_ids(enabled_mcp_servers.clone());
+        let desired_set: HashSet<String> = desired_ids.iter().cloned().collect();
+        debug!("Stopping unused MCP servers");
+        abortable_run_with_spinner(
+            registry.stop_unused_servers(&desired_set),
+            "Stopping unused MCP servers",
             abort_signal.clone(),
         )
         .await?;
         abortable_run_with_spinner(
-            new_registry.start_select_mcp_servers(enabled_mcp_servers),
+            registry.start_select_mcp_servers(enabled_mcp_servers),
             "Loading MCP servers",
             abort_signal,
         )
         .await?;
-        Ok(new_registry)
+        Ok(registry)
     }
 
     async fn start_select_mcp_servers(
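The reworked `reinit` is a reconcile pass: compute the desired set, stop only the running servers that fall outside it, then start the ones not yet running. Stripped of MCP details, the pattern looks like this (illustrative names only, not this crate's API):

```rust
use std::collections::{HashMap, HashSet};

// Reconcile a map of running servers against a desired set: drop the
// stale ones, return the sorted ids that still need starting. Stand-ins
// for stop_unused_servers + start_select_mcp_servers.
fn reconcile(running: &mut HashMap<String, u32>, desired: &HashSet<String>) -> Vec<String> {
    let stale: Vec<String> = running
        .keys()
        .filter(|id| !desired.contains(*id))
        .cloned()
        .collect();
    for id in &stale {
        running.remove(id); // "stop" a server we no longer want
    }
    let mut to_start: Vec<String> = desired
        .iter()
        .filter(|id| !running.contains_key(*id))
        .cloned()
        .collect();
    to_start.sort();
    to_start
}

fn main() {
    let mut running = HashMap::from([("a".to_string(), 1), ("b".to_string(), 2)]);
    let desired = HashSet::from(["b".to_string(), "c".to_string()]);
    assert_eq!(reconcile(&mut running, &desired), vec!["c".to_string()]);
    assert!(running.contains_key("b") && !running.contains_key("a"));
}
```

Because servers already running in the desired set are left untouched, a `reinit` that changes nothing becomes a no-op instead of a full restart, which is the point of this change.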
@@ -192,27 +196,19 @@ impl McpRegistry {
             return Ok(());
         }
-        if let Some(servers) = enabled_mcp_servers {
-            debug!("Starting selected MCP servers: {:?}", servers);
-            let config = self
-                .config
-                .as_ref()
-                .with_context(|| "MCP Config not defined. Cannot start servers")?;
-            let mcp_servers = config.mcp_servers.clone();
-            let enabled_servers: HashSet<String> =
-                servers.split(',').map(|s| s.trim().to_string()).collect();
-            let server_ids: Vec<String> = if servers == "all" {
-                mcp_servers.into_keys().collect()
-            } else {
-                mcp_servers
-                    .into_keys()
-                    .filter(|id| enabled_servers.contains(id))
-                    .collect()
-            };
+        let desired_ids = self.resolve_server_ids(enabled_mcp_servers);
+        let ids_to_start: Vec<String> = desired_ids.into_iter()
+            .filter(|id| !self.servers.contains_key(id))
+            .collect();
+        if ids_to_start.is_empty() {
+            return Ok(());
+        }
+        debug!("Starting selected MCP servers: {:?}", ids_to_start);
         let results: Vec<(String, Arc<_>, ServerCatalog)> = stream::iter(
-            server_ids
+            ids_to_start
                 .into_iter()
                 .map(|id| async { self.start_server(id).await }),
         )
@@ -220,15 +216,9 @@ impl McpRegistry {
         .try_collect()
         .await?;
-        self.servers = results
-            .clone()
-            .into_iter()
-            .map(|(id, server, _)| (id, server))
-            .collect();
-        self.catalogs = results
-            .into_iter()
-            .map(|(id, _, catalog)| (id, catalog))
-            .collect();
+        for (id, server, catalog) in results {
+            self.servers.insert(id.clone(), server);
+            self.catalogs.insert(id, catalog);
+        }
         Ok(())
@@ -309,19 +299,49 @@ impl McpRegistry {
         Ok((id.to_string(), service, catalog))
     }
 
-    pub async fn stop_all_servers(mut self) -> Result<Self> {
-        for (id, server) in self.servers {
-            Arc::try_unwrap(server)
-                .map_err(|_| anyhow!("Failed to unwrap Arc for MCP server: {id}"))?
-                .cancel()
-                .await
-                .with_context(|| format!("Failed to stop MCP server: {id}"))?;
-            info!("Stopped MCP server: {id}");
-        }
-        self.servers = HashMap::new();
-        Ok(self)
+    fn resolve_server_ids(&self, enabled_mcp_servers: Option<String>) -> Vec<String> {
+        if let Some(config) = &self.config
+            && let Some(servers) = enabled_mcp_servers
+        {
+            if servers == "all" {
+                config.mcp_servers.keys().cloned().collect()
+            } else {
+                let enabled_servers: HashSet<String> =
+                    servers.split(',').map(|s| s.trim().to_string()).collect();
+                config.mcp_servers
+                    .keys()
+                    .filter(|id| enabled_servers.contains(*id))
+                    .cloned()
+                    .collect()
+            }
+        } else {
+            vec![]
+        }
+    }
+
+    pub async fn stop_unused_servers(&mut self, keep_ids: &HashSet<String>) -> Result<()> {
+        let mut ids_to_remove = Vec::new();
+        for (id, _) in self.servers.iter() {
+            if !keep_ids.contains(id) {
+                ids_to_remove.push(id.clone());
+            }
+        }
+        for id in ids_to_remove {
+            if let Some(server) = self.servers.remove(&id) {
+                match Arc::try_unwrap(server) {
+                    Ok(server_inner) => {
+                        server_inner.cancel().await
+                            .with_context(|| format!("Failed to stop MCP server: {id}"))?;
+                        info!("Stopped MCP server: {id}");
+                    }
+                    Err(_) => {
+                        info!("Detaching from MCP server: {id} (still in use)");
+                    }
+                }
+                self.catalogs.remove(&id);
+            }
+        }
+        Ok(())
     }
pub fn list_started_servers(&self) -> Vec<String> {
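The `Arc::try_unwrap` match in `stop_unused_servers` is what chooses between stopping a server and merely detaching from one that other tasks still reference: it only yields ownership when the reference count is exactly one. A minimal demonstration of that behavior:

```rust
use std::sync::Arc;

// Arc::try_unwrap returns the inner value only when this Arc is the last
// reference; otherwise it hands the Arc back untouched. The registry uses
// the Err case to detach instead of cancelling a server still in use.
fn main() {
    let sole = Arc::new(String::from("server"));
    assert!(Arc::try_unwrap(sole).is_ok()); // sole owner: safe to stop

    let shared = Arc::new(String::from("server"));
    let handle = Arc::clone(&shared);
    assert!(Arc::try_unwrap(shared).is_err()); // still shared: detach
    drop(handle);
}
```

This is also why the old `stop_all_servers` returned an error when the unwrap failed, while the new code downgrades that situation to an informational log and moves on.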