8 Commits

11 changed files with 739 additions and 200 deletions
+36
View File
@@ -0,0 +1,36 @@
# Code Reviewer
A CodeRabbit-style code review orchestrator that coordinates per-file reviews and synthesizes findings into a unified
report.
This agent acts as the manager for the review process, delegating actual file analysis to **[File Reviewer](../file-reviewer/README.md)**
agents while handling coordination and final reporting.
## Features
- 🤖 **Orchestration**: Spawns parallel reviewers for each changed file.
- 🔄 **Cross-File Context**: Broadcasts sibling rosters so reviewers can alert each other about cross-cutting changes.
- 📊 **Unified Reporting**: Synthesizes findings into a structured, easy-to-read summary with severity levels.
- ⚡ **Parallel Execution**: Runs reviews concurrently for maximum speed.
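Conceptually, the orchestration is a parallel map over changed files followed by a reduce into one report. A minimal sketch of that shape (the `review_file` worker and its signature are hypothetical, not this repo's actual agent API):

```python
from concurrent.futures import ThreadPoolExecutor

def review_file(path: str, siblings: list[str]) -> dict:
    # Placeholder for a spawned File Reviewer; `siblings` is the roster
    # broadcast so reviewers can flag cross-cutting changes to each other.
    return {"file": path, "findings": []}

def review(changed_files: list[str]) -> list[dict]:
    # Fan out one reviewer per file, then collect findings in input order.
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(review_file, f, [s for s in changed_files if s != f])
            for f in changed_files
        ]
        return [fut.result() for fut in futures]
```

The real agent delegates to spawned sub-agents rather than threads, but the fan-out/fan-in structure is the same.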
## Pro-Tip: Use an IDE MCP Server for Improved Performance
Many modern IDEs now ship MCP servers that let LLMs perform operations inside the IDE itself and use its tooling.
Using an IDE's MCP server dramatically improves the performance of coding agents, so if you have one, add its MCP
server to your config (see the [MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md) for setup
instructions) and modify the agent definition to look like this:
```yaml
# ...
mcp_servers:
- jetbrains # The name of your configured IDE MCP server
global_tools:
- fs_read.sh
- fs_grep.sh
- fs_glob.sh
# - execute_command.sh
# ...
```
+24
View File
@@ -14,3 +14,27 @@ acts as the coordinator/architect, while Coder handles the implementation detail
- 📊 Precise diff-based file editing for controlled code modifications
It can also be used as a standalone tool for direct coding assistance.
## Pro-Tip: Use an IDE MCP Server for Improved Performance
Many modern IDEs now ship MCP servers that let LLMs perform operations inside the IDE itself and use its tooling.
Using an IDE's MCP server dramatically improves the performance of coding agents, so if you have one, add its MCP
server to your config (see the [MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md) for setup
instructions) and modify the agent definition to look like this:
```yaml
# ...
mcp_servers:
- jetbrains # The name of your configured IDE MCP server
global_tools:
# Keep useful read-only tools for reading files in other non-project directories
- fs_read.sh
- fs_grep.sh
- fs_glob.sh
# - fs_write.sh
# - fs_patch.sh
- execute_command.sh
# ...
```
+22
View File
@@ -13,3 +13,25 @@ It can also be used as a standalone tool for understanding codebases and finding
- 📂 File system navigation and content analysis
- 🧠 Context gathering for complex tasks
- 🛡️ Read-only operations for safe investigation
## Pro-Tip: Use an IDE MCP Server for Improved Performance
Many modern IDEs now ship MCP servers that let LLMs perform operations inside the IDE itself and use its tooling.
Using an IDE's MCP server dramatically improves the performance of coding agents, so if you have one, add its MCP
server to your config (see the [MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md) for setup
instructions) and modify the agent definition to look like this:
```yaml
# ...
mcp_servers:
- jetbrains # The name of your configured IDE MCP server
global_tools:
- fs_read.sh
- fs_grep.sh
- fs_glob.sh
- fs_ls.sh
- web_search_loki.sh
# ...
```
+35
View File
@@ -0,0 +1,35 @@
# File Reviewer
A specialized worker agent that reviews a single file's diff for bugs, style issues, and cross-cutting concerns.
This agent is designed to be spawned by the **[Code Reviewer](../code-reviewer/README.md)** agent. It focuses deeply on
one file while communicating with sibling agents to catch issues that span multiple files.
## Features
- 🔍 **Deep Analysis**: Focuses on bugs, logic errors, security issues, and style problems in a single file.
- 🗣️ **Teammate Communication**: Sends and receives alerts to/from sibling reviewers about interface or dependency
changes.
- 🎯 **Targeted Reading**: Reads only relevant context around changed lines to stay efficient.
- 🏷️ **Structured Findings**: Categorizes issues by severity (🔴 Critical, 🟡 Warning, 🟢 Suggestion, 💡 Nitpick).
## Pro-Tip: Use an IDE MCP Server for Improved Performance
Many modern IDEs now ship MCP servers that let LLMs perform operations inside the IDE itself and use its tooling.
Using an IDE's MCP server dramatically improves the performance of coding agents, so if you have one, add its MCP
server to your config (see the [MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md) for setup
instructions) and modify the agent definition to look like this:
```yaml
# ...
mcp_servers:
- jetbrains # The name of your configured IDE MCP server
global_tools:
- fs_read.sh
- fs_grep.sh
- fs_glob.sh
# ...
```
+22
View File
@@ -15,3 +15,25 @@ It can also be used as a standalone tool for design reviews and solving difficul
- ⚖️ Tradeoff analysis and technology selection
- 📝 Code review and best practices advice
- 🧠 Deep reasoning for ambiguous problems
## Pro-Tip: Use an IDE MCP Server for Improved Performance
Many modern IDEs now ship MCP servers that let LLMs perform operations inside the IDE itself and use its tooling.
Using an IDE's MCP server dramatically improves the performance of coding agents, so if you have one, add its MCP
server to your config (see the [MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md) for setup
instructions) and modify the agent definition to look like this:
```yaml
# ...
mcp_servers:
- jetbrains # The name of your configured IDE MCP server
global_tools:
- fs_read.sh
- fs_grep.sh
- fs_glob.sh
- fs_ls.sh
- web_search_loki.sh
# ...
```
+23
View File
@@ -16,3 +16,26 @@ Sisyphus acts as the primary entry point, capable of handling complex tasks by c
- 💻 **CLI Coding**: Provides a natural language interface for writing and editing code.
- 🔄 **Task Management**: Tracks progress and context across complex operations.
- 🛠️ **Tool Integration**: Seamlessly uses system tools for building, testing, and file manipulation.
## Pro-Tip: Use an IDE MCP Server for Improved Performance
Many modern IDEs now ship MCP servers that let LLMs perform operations inside the IDE itself and use its tooling.
Using an IDE's MCP server dramatically improves the performance of coding agents, so if you have one, add its MCP
server to your config (see the [MCP Server docs](../../../docs/function-calling/MCP-SERVERS.md) for setup
instructions) and modify the agent definition to look like this:
```yaml
# ...
mcp_servers:
- jetbrains
global_tools:
- fs_read.sh
- fs_grep.sh
- fs_glob.sh
- fs_ls.sh
- web_search_loki.sh
- execute_command.sh
# ...
```
+195 -132
View File
@@ -81,6 +81,7 @@
supports_vision: true
supports_function_calling: true
- name: o4-mini
max_output_tokens: 100000
max_input_tokens: 200000
input_price: 1.1
output_price: 4.4
@@ -93,6 +94,7 @@
temperature: null
top_p: null
- name: o4-mini-high
max_output_tokens: 100000
real_name: o4-mini
max_input_tokens: 200000
input_price: 1.1
@@ -107,6 +109,7 @@
temperature: null
top_p: null
- name: o3
max_output_tokens: 100000
max_input_tokens: 200000
input_price: 2
output_price: 8
@@ -133,6 +136,7 @@
temperature: null
top_p: null
- name: o3-mini
max_output_tokens: 100000
max_input_tokens: 200000
input_price: 1.1
output_price: 4.4
@@ -145,6 +149,7 @@
temperature: null
top_p: null
- name: o3-mini-high
max_output_tokens: 100000
real_name: o3-mini
max_input_tokens: 200000
input_price: 1.1
@@ -192,23 +197,23 @@
models:
- name: gemini-2.5-flash
max_input_tokens: 1048576
max_output_tokens: 65536
input_price: 0
output_price: 0
max_output_tokens: 65535
input_price: 0.3
output_price: 2.5
supports_vision: true
supports_function_calling: true
- name: gemini-2.5-pro
max_input_tokens: 1048576
max_output_tokens: 65536
input_price: 0
output_price: 0
input_price: 1.25
output_price: 10
supports_vision: true
supports_function_calling: true
- name: gemini-2.5-flash-lite
max_input_tokens: 1000000
max_output_tokens: 64000
input_price: 0
output_price: 0
max_input_tokens: 1048576
max_output_tokens: 65535
input_price: 0.1
output_price: 0.4
supports_vision: true
supports_function_calling: true
- name: gemini-2.0-flash
@@ -226,10 +231,11 @@
supports_vision: true
supports_function_calling: true
- name: gemma-3-27b-it
max_input_tokens: 131072
max_output_tokens: 8192
input_price: 0
output_price: 0
supports_vision: true
max_input_tokens: 128000
max_output_tokens: 65536
input_price: 0.04
output_price: 0.15
- name: text-embedding-004
type: embedding
input_price: 0
@@ -509,8 +515,8 @@
output_price: 10
supports_vision: true
- name: command-r7b-12-2024
max_input_tokens: 131072
max_output_tokens: 4096
max_input_tokens: 128000
max_output_tokens: 4000
input_price: 0.0375
output_price: 0.15
- name: embed-v4.0
@@ -547,6 +553,7 @@
- provider: xai
models:
- name: grok-4
supports_vision: true
max_input_tokens: 256000
input_price: 3
output_price: 15
@@ -583,14 +590,18 @@
- provider: perplexity
models:
- name: sonar-pro
max_output_tokens: 8000
supports_vision: true
max_input_tokens: 200000
input_price: 3
output_price: 15
- name: sonar
max_input_tokens: 128000
supports_vision: true
max_input_tokens: 127072
input_price: 1
output_price: 1
- name: sonar-reasoning-pro
supports_vision: true
max_input_tokens: 128000
input_price: 2
output_price: 8
@@ -663,13 +674,13 @@
hipaa_safe: true
max_input_tokens: 1048576
max_output_tokens: 65536
input_price: 0
output_price: 0
input_price: 2
output_price: 12
supports_vision: true
supports_function_calling: true
- name: gemini-2.5-flash
max_input_tokens: 1048576
max_output_tokens: 65536
max_output_tokens: 65535
input_price: 0.3
output_price: 2.5
supports_vision: true
@@ -683,16 +694,16 @@
supports_function_calling: true
- name: gemini-2.5-flash-lite
max_input_tokens: 1048576
max_output_tokens: 65536
input_price: 0.3
max_output_tokens: 65535
input_price: 0.1
output_price: 0.4
supports_vision: true
supports_function_calling: true
- name: gemini-2.0-flash-001
max_input_tokens: 1048576
max_output_tokens: 8192
input_price: 0.15
output_price: 0.6
input_price: 0.1
output_price: 0.4
supports_vision: true
supports_function_calling: true
- name: gemini-2.0-flash-lite-001
@@ -1194,10 +1205,16 @@
- provider: qianwen
models:
- name: qwen3-max
input_price: 1.2
output_price: 6
max_output_tokens: 32768
max_input_tokens: 262144
supports_function_calling: true
- name: qwen-plus
max_input_tokens: 131072
input_price: 0.4
output_price: 1.2
max_output_tokens: 32768
max_input_tokens: 1000000
supports_function_calling: true
- name: qwen-flash
max_input_tokens: 1000000
@@ -1213,14 +1230,14 @@
- name: qwen-coder-flash
max_input_tokens: 1000000
- name: qwen3-next-80b-a3b-instruct
max_input_tokens: 131072
input_price: 0.14
output_price: 0.56
max_input_tokens: 262144
input_price: 0.09
output_price: 1.1
supports_function_calling: true
- name: qwen3-next-80b-a3b-thinking
max_input_tokens: 131072
input_price: 0.14
output_price: 1.4
max_input_tokens: 128000
input_price: 0.15
output_price: 1.2
- name: qwen3-235b-a22b-instruct-2507
max_input_tokens: 131072
input_price: 0.28
@@ -1228,35 +1245,39 @@
supports_function_calling: true
- name: qwen3-235b-a22b-thinking-2507
max_input_tokens: 131072
input_price: 0.28
output_price: 2.8
input_price: 0
output_price: 0
- name: qwen3-30b-a3b-instruct-2507
max_input_tokens: 131072
input_price: 0.105
output_price: 0.42
max_output_tokens: 262144
max_input_tokens: 262144
input_price: 0.09
output_price: 0.3
supports_function_calling: true
- name: qwen3-30b-a3b-thinking-2507
max_input_tokens: 131072
input_price: 0.105
output_price: 1.05
max_input_tokens: 32768
input_price: 0.051
output_price: 0.34
- name: qwen3-vl-32b-instruct
max_output_tokens: 32768
max_input_tokens: 131072
input_price: 0.28
output_price: 1.12
input_price: 0.104
output_price: 0.416
supports_vision: true
- name: qwen3-vl-8b-instruct
max_output_tokens: 32768
max_input_tokens: 131072
input_price: 0.07
output_price: 0.28
input_price: 0.08
output_price: 0.5
supports_vision: true
- name: qwen3-coder-480b-a35b-instruct
max_input_tokens: 262144
input_price: 1.26
output_price: 5.04
- name: qwen3-coder-30b-a3b-instruct
max_input_tokens: 262144
input_price: 0.315
output_price: 1.26
max_output_tokens: 32768
max_input_tokens: 160000
input_price: 0.07
output_price: 0.27
- name: deepseek-v3.2-exp
max_input_tokens: 131072
input_price: 0.28
@@ -1332,9 +1353,9 @@
output_price: 8.12
supports_vision: true
- name: kimi-k2-thinking
max_input_tokens: 262144
input_price: 0.56
output_price: 2.24
max_input_tokens: 131072
input_price: 0.47
output_price: 2
supports_vision: true
# Links:
@@ -1343,10 +1364,10 @@
- provider: deepseek
models:
- name: deepseek-chat
max_input_tokens: 64000
max_output_tokens: 8192
input_price: 0.56
output_price: 1.68
max_input_tokens: 163840
max_output_tokens: 163840
input_price: 0.32
output_price: 0.89
supports_function_calling: true
- name: deepseek-reasoner
max_input_tokens: 64000
@@ -1424,9 +1445,10 @@
- provider: minimax
models:
- name: minimax-m2
max_input_tokens: 204800
input_price: 0.294
output_price: 1.176
max_output_tokens: 65536
max_input_tokens: 196608
input_price: 0.255
output_price: 1
supports_function_calling: true
# Links:
@@ -1442,8 +1464,8 @@
supports_vision: true
supports_function_calling: true
- name: openai/gpt-5.1-chat
max_input_tokens: 400000
max_output_tokens: 128000
max_input_tokens: 128000
max_output_tokens: 16384
input_price: 1.25
output_price: 10
supports_vision: true
@@ -1456,8 +1478,8 @@
supports_vision: true
supports_function_calling: true
- name: openai/gpt-5-chat
max_input_tokens: 400000
max_output_tokens: 128000
max_input_tokens: 128000
max_output_tokens: 16384
input_price: 1.25
output_price: 10
supports_vision: true
@@ -1498,18 +1520,21 @@
supports_vision: true
supports_function_calling: true
- name: openai/gpt-4o
max_output_tokens: 16384
max_input_tokens: 128000
input_price: 2.5
output_price: 10
supports_vision: true
supports_function_calling: true
- name: openai/gpt-4o-mini
max_output_tokens: 16384
max_input_tokens: 128000
input_price: 0.15
output_price: 0.6
supports_vision: true
supports_function_calling: true
- name: openai/o4-mini
max_output_tokens: 100000
max_input_tokens: 200000
input_price: 1.1
output_price: 4.4
@@ -1522,6 +1547,7 @@
temperature: null
top_p: null
- name: openai/o4-mini-high
max_output_tokens: 100000
max_input_tokens: 200000
input_price: 1.1
output_price: 4.4
@@ -1535,6 +1561,7 @@
temperature: null
top_p: null
- name: openai/o3
max_output_tokens: 100000
max_input_tokens: 200000
input_price: 2
output_price: 8
@@ -1560,6 +1587,7 @@
temperature: null
top_p: null
- name: openai/o3-mini
max_output_tokens: 100000
max_input_tokens: 200000
input_price: 1.1
output_price: 4.4
@@ -1571,6 +1599,7 @@
temperature: null
top_p: null
- name: openai/o3-mini-high
max_output_tokens: 100000
max_input_tokens: 200000
input_price: 1.1
output_price: 4.4
@@ -1583,50 +1612,57 @@
top_p: null
- name: openai/gpt-oss-120b
max_input_tokens: 131072
input_price: 0.09
output_price: 0.45
input_price: 0.039
output_price: 0.19
supports_function_calling: true
- name: openai/gpt-oss-20b
max_input_tokens: 131072
input_price: 0.04
output_price: 0.16
input_price: 0.03
output_price: 0.14
supports_function_calling: true
- name: google/gemini-2.5-flash
max_output_tokens: 65535
max_input_tokens: 1048576
input_price: 0.3
output_price: 2.5
supports_vision: true
supports_function_calling: true
- name: google/gemini-2.5-pro
max_output_tokens: 65536
max_input_tokens: 1048576
input_price: 1.25
output_price: 10
supports_vision: true
supports_function_calling: true
- name: google/gemini-2.5-flash-lite
max_output_tokens: 65535
max_input_tokens: 1048576
input_price: 0.3
input_price: 0.1
output_price: 0.4
supports_vision: true
- name: google/gemini-2.0-flash-001
max_input_tokens: 1000000
input_price: 0.15
output_price: 0.6
max_output_tokens: 8192
max_input_tokens: 1048576
input_price: 0.1
output_price: 0.4
supports_vision: true
supports_function_calling: true
- name: google/gemini-2.0-flash-lite-001
max_output_tokens: 8192
max_input_tokens: 1048576
input_price: 0.075
output_price: 0.3
supports_vision: true
supports_function_calling: true
- name: google/gemma-3-27b-it
max_input_tokens: 131072
input_price: 0.1
output_price: 0.2
max_output_tokens: 65536
supports_vision: true
max_input_tokens: 128000
input_price: 0.04
output_price: 0.15
- name: anthropic/claude-sonnet-4.5
max_input_tokens: 200000
max_output_tokens: 8192
max_input_tokens: 1000000
max_output_tokens: 64000
require_max_tokens: true
input_price: 3
output_price: 15
@@ -1634,7 +1670,7 @@
supports_function_calling: true
- name: anthropic/claude-haiku-4.5
max_input_tokens: 200000
max_output_tokens: 8192
max_output_tokens: 64000
require_max_tokens: true
input_price: 1
output_price: 5
@@ -1642,7 +1678,7 @@
supports_function_calling: true
- name: anthropic/claude-opus-4.1
max_input_tokens: 200000
max_output_tokens: 8192
max_output_tokens: 32000
require_max_tokens: true
input_price: 15
output_price: 75
@@ -1650,15 +1686,15 @@
supports_function_calling: true
- name: anthropic/claude-opus-4
max_input_tokens: 200000
max_output_tokens: 8192
max_output_tokens: 32000
require_max_tokens: true
input_price: 15
output_price: 75
supports_vision: true
supports_function_calling: true
- name: anthropic/claude-sonnet-4
max_input_tokens: 200000
max_output_tokens: 8192
max_input_tokens: 1000000
max_output_tokens: 64000
require_max_tokens: true
input_price: 3
output_price: 15
@@ -1666,7 +1702,7 @@
supports_function_calling: true
- name: anthropic/claude-3.7-sonnet
max_input_tokens: 200000
max_output_tokens: 8192
max_output_tokens: 64000
require_max_tokens: true
input_price: 3
output_price: 15
@@ -1681,21 +1717,24 @@
supports_vision: true
supports_function_calling: true
- name: meta-llama/llama-4-maverick
max_output_tokens: 16384
max_input_tokens: 1048576
input_price: 0.18
input_price: 0.15
output_price: 0.6
supports_vision: true
supports_function_calling: true
- name: meta-llama/llama-4-scout
max_output_tokens: 16384
max_input_tokens: 327680
input_price: 0.08
output_price: 0.3
supports_vision: true
supports_function_calling: true
- name: meta-llama/llama-3.3-70b-instruct
max_output_tokens: 16384
max_input_tokens: 131072
input_price: 0.12
output_price: 0.3
input_price: 0.1
output_price: 0.32
- name: mistralai/mistral-medium-3.1
max_input_tokens: 131072
input_price: 0.4
@@ -1703,9 +1742,10 @@
supports_function_calling: true
supports_vision: true
- name: mistralai/mistral-small-3.2-24b-instruct
max_output_tokens: 131072
max_input_tokens: 131072
input_price: 0.1
output_price: 0.3
input_price: 0.06
output_price: 0.18
supports_vision: true
- name: mistralai/magistral-medium-2506
max_input_tokens: 40960
@@ -1726,8 +1766,8 @@
supports_function_calling: true
- name: mistralai/devstral-small
max_input_tokens: 131072
input_price: 0.07
output_price: 0.28
input_price: 0.1
output_price: 0.3
supports_function_calling: true
- name: mistralai/codestral-2508
max_input_tokens: 256000
@@ -1735,6 +1775,7 @@
output_price: 0.9
supports_function_calling: true
- name: ai21/jamba-large-1.7
max_output_tokens: 4096
max_input_tokens: 256000
input_price: 2
output_price: 8
@@ -1745,88 +1786,98 @@
output_price: 0.4
supports_function_calling: true
- name: cohere/command-a
max_output_tokens: 8192
max_input_tokens: 256000
input_price: 2.5
output_price: 10
supports_function_calling: true
- name: cohere/command-r7b-12-2024
max_input_tokens: 128000
max_output_tokens: 4096
max_output_tokens: 4000
input_price: 0.0375
output_price: 0.15
- name: deepseek/deepseek-v3.2-exp
max_output_tokens: 65536
max_input_tokens: 163840
input_price: 0.27
output_price: 0.40
output_price: 0.41
- name: deepseek/deepseek-v3.1-terminus
max_input_tokens: 163840
input_price: 0.23
output_price: 0.90
input_price: 0.21
output_price: 0.79
- name: deepseek/deepseek-chat-v3.1
max_input_tokens: 163840
input_price: 0.2
output_price: 0.8
max_output_tokens: 7168
max_input_tokens: 32768
input_price: 0.15
output_price: 0.75
- name: deepseek/deepseek-r1-0528
max_input_tokens: 128000
input_price: 0.50
output_price: 2.15
max_output_tokens: 65536
max_input_tokens: 163840
input_price: 0.4
output_price: 1.75
patch:
body:
include_reasoning: true
- name: qwen/qwen3-max
max_output_tokens: 32768
max_input_tokens: 262144
input_price: 1.2
output_price: 6
supports_function_calling: true
- name: qwen/qwen-plus
max_input_tokens: 131072
max_output_tokens: 8192
max_input_tokens: 1000000
max_output_tokens: 32768
input_price: 0.4
output_price: 1.2
supports_function_calling: true
- name: qwen/qwen3-next-80b-a3b-instruct
max_input_tokens: 262144
input_price: 0.1
output_price: 0.8
input_price: 0.09
output_price: 1.1
supports_function_calling: true
- name: qwen/qwen3-next-80b-a3b-thinking
max_input_tokens: 262144
input_price: 0.1
output_price: 0.8
max_input_tokens: 128000
input_price: 0.15
output_price: 1.2
- name: qwen/qwen3-235b-a22b-2507 # Qwen3 235B A22B Instruct 2507
max_input_tokens: 262144
input_price: 0.12
output_price: 0.59
supports_function_calling: true
- name: qwen/qwen3-235b-a22b-thinking-2507
max_input_tokens: 262144
input_price: 0.118
output_price: 0.118
- name: qwen/qwen3-30b-a3b-instruct-2507
max_input_tokens: 131072
input_price: 0.2
output_price: 0.8
input_price: 0
output_price: 0
- name: qwen/qwen3-30b-a3b-instruct-2507
max_output_tokens: 262144
max_input_tokens: 262144
input_price: 0.09
output_price: 0.3
- name: qwen/qwen3-30b-a3b-thinking-2507
max_input_tokens: 262144
input_price: 0.071
output_price: 0.285
max_input_tokens: 32768
input_price: 0.051
output_price: 0.34
- name: qwen/qwen3-vl-32b-instruct
max_input_tokens: 262144
input_price: 0.35
output_price: 1.1
max_output_tokens: 32768
max_input_tokens: 131072
input_price: 0.104
output_price: 0.416
supports_vision: true
- name: qwen/qwen3-vl-8b-instruct
max_input_tokens: 262144
max_output_tokens: 32768
max_input_tokens: 131072
input_price: 0.08
output_price: 0.50
output_price: 0.5
supports_vision: true
- name: qwen/qwen3-coder-plus
max_input_tokens: 128000
max_output_tokens: 65536
max_input_tokens: 1000000
input_price: 1
output_price: 5
supports_function_calling: true
- name: qwen/qwen3-coder-flash
max_input_tokens: 128000
max_output_tokens: 65536
max_input_tokens: 1000000
input_price: 0.3
output_price: 1.5
supports_function_calling: true
@@ -1836,19 +1887,20 @@
output_price: 0.95
supports_function_calling: true
- name: qwen/qwen3-coder-30b-a3b-instruct
max_input_tokens: 262144
input_price: 0.052
output_price: 0.207
max_output_tokens: 32768
max_input_tokens: 160000
input_price: 0.07
output_price: 0.27
supports_function_calling: true
- name: moonshotai/kimi-k2-0905
max_input_tokens: 262144
input_price: 0.296
output_price: 1.185
max_input_tokens: 131072
input_price: 0.4
output_price: 2
supports_function_calling: true
- name: moonshotai/kimi-k2-thinking
max_input_tokens: 262144
input_price: 0.45
output_price: 2.35
max_input_tokens: 131072
input_price: 0.47
output_price: 2
supports_function_calling: true
- name: moonshotai/kimi-dev-72b
max_input_tokens: 131072
@@ -1856,21 +1908,26 @@
output_price: 1.15
supports_function_calling: true
- name: x-ai/grok-4
supports_vision: true
max_input_tokens: 256000
input_price: 3
output_price: 15
supports_function_calling: true
- name: x-ai/grok-4-fast
max_output_tokens: 30000
supports_vision: true
max_input_tokens: 2000000
input_price: 0.2
output_price: 0.5
supports_function_calling: true
- name: x-ai/grok-code-fast-1
max_output_tokens: 10000
max_input_tokens: 256000
input_price: 0.2
output_price: 1.5
supports_function_calling: true
- name: amazon/nova-premier-v1
max_output_tokens: 32000
max_input_tokens: 1000000
input_price: 2.5
output_price: 12.5
@@ -1893,14 +1950,18 @@
input_price: 0.035
output_price: 0.14
- name: perplexity/sonar-pro
max_output_tokens: 8000
supports_vision: true
max_input_tokens: 200000
input_price: 3
output_price: 15
- name: perplexity/sonar
supports_vision: true
max_input_tokens: 127072
input_price: 1
output_price: 1
- name: perplexity/sonar-reasoning-pro
supports_vision: true
max_input_tokens: 128000
input_price: 2
output_price: 8
@@ -1915,20 +1976,22 @@
body:
include_reasoning: true
- name: perplexity/sonar-deep-research
max_input_tokens: 200000
max_input_tokens: 128000
input_price: 2
output_price: 8
patch:
body:
include_reasoning: true
- name: minimax/minimax-m2
max_output_tokens: 65536
max_input_tokens: 196608
input_price: 0.15
output_price: 0.45
input_price: 0.255
output_price: 1
- name: z-ai/glm-4.6
max_output_tokens: 131072
max_input_tokens: 202752
input_price: 0.5
output_price: 1.75
input_price: 0.35
output_price: 1.71
supports_function_calling: true
# Links:
+2
View File
@@ -0,0 +1,2 @@
requests
ruamel.yaml
+255
View File
@@ -0,0 +1,255 @@
import requests
import sys
import re
import json

# Provider mapping from models.yaml to OpenRouter prefixes
PROVIDER_MAPPING = {
    "openai": "openai",
    "claude": "anthropic",
    "gemini": "google",
    "mistral": "mistralai",
    "cohere": "cohere",
    "perplexity": "perplexity",
    "xai": "x-ai",
    "openrouter": "openrouter",
    "ai21": "ai21",
    "deepseek": "deepseek",
    "moonshot": "moonshotai",
    "qianwen": "qwen",
    "zhipuai": "zhipuai",
    "minimax": "minimax",
    "vertexai": "google",
    "groq": "groq",
    "bedrock": "amazon",
    "hunyuan": "tencent",
    "ernie": "baidu",
    "github": "github",
}


def fetch_openrouter_models():
    print("Fetching models from OpenRouter...")
    try:
        response = requests.get("https://openrouter.ai/api/v1/models")
        response.raise_for_status()
        data = response.json()["data"]
        print(f"Fetched {len(data)} models.")
        return data
    except Exception as e:
        print(f"Error fetching models: {e}")
        sys.exit(1)


def get_openrouter_model(models_data, provider_prefix, model_name, is_openrouter_provider=False):
    if is_openrouter_provider:
        # For the openrouter provider, the model_name in yaml is usually the full ID
        for model in models_data:
            if model["id"] == model_name:
                return model
        return None
    expected_id = f"{provider_prefix}/{model_name}"
    # 1. Try exact match on ID
    for model in models_data:
        if model["id"] == expected_id:
            return model
    # 2. Try match by suffix
    for model in models_data:
        if model["id"].split("/")[-1] == model_name:
            if model["id"].startswith(f"{provider_prefix}/"):
                return model
    return None


def format_price(price_per_token):
    if price_per_token is None:
        return None
    try:
        price_per_1m = float(price_per_token) * 1_000_000
        if price_per_1m.is_integer():
            return str(int(price_per_1m))
        else:
            return str(round(price_per_1m, 4))
    except (TypeError, ValueError):
        return None


def get_indentation(line):
    return len(line) - len(line.lstrip())


def process_model_block(block_lines, current_provider, or_models):
    if not block_lines:
        return []
    # 1. Identify model name and indentation
    name_line = block_lines[0]
    name_match = re.match(r"^(\s*)-\s*name:\s*(.+)$", name_line)
    if not name_match:
        return block_lines
    name_indent_str = name_match.group(1)
    model_name = name_match.group(2).strip()
    # 2. Find the matching OpenRouter model
    or_prefix = PROVIDER_MAPPING.get(current_provider)
    is_openrouter_provider = (current_provider == "openrouter")
    if not or_prefix and not is_openrouter_provider:
        return block_lines
    or_model = get_openrouter_model(or_models, or_prefix, model_name, is_openrouter_provider)
    if not or_model:
        return block_lines
    print(f"  Updating {model_name}...")
    # 3. Prepare updates
    updates = {}
    # Pricing
    pricing = or_model.get("pricing", {})
    p_in = format_price(pricing.get("prompt"))
    p_out = format_price(pricing.get("completion"))
    if p_in:
        updates["input_price"] = p_in
    if p_out:
        updates["output_price"] = p_out
    # Context
    ctx = or_model.get("context_length")
    if ctx:
        updates["max_input_tokens"] = str(ctx)
    max_out = None
    if "top_provider" in or_model and or_model["top_provider"]:
        max_out = or_model["top_provider"].get("max_completion_tokens")
    if max_out:
        updates["max_output_tokens"] = str(max_out)
    # Capabilities
    arch = or_model.get("architecture", {})
    modality = arch.get("modality", "")
    if "image" in modality:
        updates["supports_vision"] = "true"
    # 4. Detect field indentation
    field_indent_str = None
    existing_fields = {}  # key -> line_index
    for i, line in enumerate(block_lines):
        if i == 0:
            continue  # Skip the name line
        # Skip comments
        if line.strip().startswith("#"):
            continue
        # Look for "key: value"
        m = re.match(r"^(\s*)([\w_-]+):", line)
        if m:
            indent = m.group(1)
            key = m.group(2)
            # Must be deeper than the name line
            if len(indent) > len(name_indent_str):
                if field_indent_str is None:
                    field_indent_str = indent
                existing_fields[key] = i
    if field_indent_str is None:
        field_indent_str = name_indent_str + "  "
    # 5. Apply updates
    new_block = list(block_lines)
    # Update existing fields
    for key, value in updates.items():
        if key in existing_fields:
            idx = existing_fields[key]
            # Preserve original key indentation exactly
            original_line = new_block[idx]
            m = re.match(r"^(\s*)([\w_-]+):", original_line)
            if m:
                current_indent = m.group(1)
                new_block[idx] = f"{current_indent}{key}: {value}\n"
    # Insert missing fields just after the name line
    insertion_idx = 1
    for key, value in updates.items():
        if key not in existing_fields:
            new_line = f"{field_indent_str}{key}: {value}\n"
            new_block.insert(insertion_idx, new_line)
            insertion_idx += 1
    return new_block


def main():
    or_models = fetch_openrouter_models()
    print("Reading models.yaml...")
    with open("models.yaml", "r") as f:
        lines = f.readlines()
    new_lines = []
    current_provider = None
    i = 0
    while i < len(lines):
        line = lines[i]
        # Check for a provider line: "- provider: name"
        p_match = re.match(r"^\s*-?\s*provider:\s*(.+)$", line)
        if p_match:
            current_provider = p_match.group(1).strip()
            new_lines.append(line)
            i += 1
            continue
        # Check for a model block start: "- name: ..."
        m_match = re.match(r"^(\s*)-\s*name:\s*.+$", line)
        if m_match:
            # Start of a model block
            start_indent = len(m_match.group(1))
            # Collect block lines
            block_lines = [line]
            j = i + 1
            while j < len(lines):
                next_line = lines[j]
                stripped = next_line.strip()
                # If empty or comment, include it
                if not stripped or stripped.startswith("#"):
                    block_lines.append(next_line)
                    j += 1
                    continue
                # Check indentation
                next_indent = get_indentation(next_line)
                # If indentation is greater, it's part of the block (a property)
                if next_indent > start_indent:
                    block_lines.append(next_line)
                    j += 1
                    continue
                # If indentation is equal or less, it's the end of the block
                break
            # Process the block
            processed_block = process_model_block(block_lines, current_provider, or_models)
            new_lines.extend(processed_block)
            # Advance past the block
            i = j
            continue
        # Otherwise, just a regular line
        new_lines.append(line)
        i += 1
    print("Saving models.yaml...")
    with open("models.yaml", "w") as f:
        f.writelines(new_lines)
    print("Done.")


if __name__ == "__main__":
    main()
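The block-collection step above — find a `- name:` line, then absorb every following line that is indented deeper — can be exercised in isolation. A self-contained sketch on a tiny in-memory snippet (the sample YAML is illustrative, not taken from models.yaml):

```python
import re

def get_indentation(line):
    # Same helper as in the script: leading-whitespace width.
    return len(line) - len(line.lstrip())

lines = """\
- provider: openai
  models:
    - name: gpt-4o
      input_price: 2.5
    - name: o3
      input_price: 2
""".splitlines(keepends=True)

# Collect the lines belonging to the first "- name:" block.
start = next(i for i, l in enumerate(lines) if re.match(r"^(\s*)-\s*name:", l))
start_indent = get_indentation(lines[start])
block = [lines[start]]
for line in lines[start + 1:]:
    # A non-blank line at the same or lesser indent ends the block.
    if line.strip() and get_indentation(line) <= start_indent:
        break
    block.append(line)
```

Here `block` holds exactly the `gpt-4o` entry and its one property line; the second `- name:` item terminates the scan because it sits at the same indentation as the first.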
+44 -7
View File
@@ -22,7 +22,7 @@ use serde_json::{Value, json};
use std::collections::VecDeque;
use std::ffi::OsStr;
use std::fs::File;
use std::io::Write;
use std::io::{Read, Write};
use std::{
    collections::{HashMap, HashSet},
    env, fs, io,
@@ -1064,7 +1064,7 @@ impl ToolCall {
            function_name.clone(),
            function_name,
            vec![],
            Default::default(),
            agent.variable_envs(),
        ))
    }
}
@@ -1117,18 +1117,55 @@ pub fn run_llm_function(
     #[cfg(windows)]
     let cmd_name = polyfill_cmd_name(&cmd_name, &bin_dirs);
-    let output = Command::new(&cmd_name)
+    envs.insert("CLICOLOR_FORCE".into(), "1".into());
+    envs.insert("FORCE_COLOR".into(), "1".into());
+    let mut child = Command::new(&cmd_name)
         .args(&cmd_args)
         .envs(envs)
-        .stdout(Stdio::inherit())
+        .stdout(Stdio::piped())
         .stderr(Stdio::piped())
         .spawn()
-        .and_then(|child| child.wait_with_output())
         .map_err(|err| anyhow!("Unable to run {command_name}, {err}"))?;
-    let exit_code = output.status.code().unwrap_or_default();
+    let stdout = child.stdout.take().expect("Failed to capture stdout");
+    let mut stderr = child.stderr.take().expect("Failed to capture stderr");
+    let stdout_thread = std::thread::spawn(move || {
+        let mut buffer = [0; 1024];
+        let mut reader = stdout;
+        let mut out = io::stdout();
+        while let Ok(n) = reader.read(&mut buffer) {
+            if n == 0 { break; }
+            let chunk = &buffer[0..n];
+            let mut last_pos = 0;
+            for (i, &byte) in chunk.iter().enumerate() {
+                if byte == b'\n' {
+                    let _ = out.write_all(&chunk[last_pos..i]);
+                    let _ = out.write_all(b"\r\n");
+                    last_pos = i + 1;
+                }
+            }
+            if last_pos < n {
+                let _ = out.write_all(&chunk[last_pos..n]);
+            }
+            let _ = out.flush();
+        }
+    });
+    let stderr_thread = std::thread::spawn(move || {
+        let mut buf = Vec::new();
+        let _ = stderr.read_to_end(&mut buf);
+        buf
+    });
+    let status = child.wait().map_err(|err| anyhow!("Unable to run {command_name}, {err}"))?;
+    let _ = stdout_thread.join();
+    let stderr_bytes = stderr_thread.join().unwrap_or_default();
+    let exit_code = status.code().unwrap_or_default();
     if exit_code != 0 {
-        let stderr = String::from_utf8_lossy(&output.stderr).trim().to_string();
+        let stderr = String::from_utf8_lossy(&stderr_bytes).trim().to_string();
         if !stderr.is_empty() {
             eprintln!("{stderr}");
         }
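The byte-scanning loop in this hunk rewrites each bare `\n` as `\r\n` while relaying child stdout, which is what a terminal in raw mode needs for lines to render correctly. The translation can be isolated as a pure function (a sketch of the same logic, not an API from this crate):

```rust
// Rewrite bare LF bytes as CRLF, as the stdout relay thread does when
// splicing child output into a raw-mode terminal.
fn lf_to_crlf(chunk: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(chunk.len() + 8);
    let mut last_pos = 0;
    for (i, &byte) in chunk.iter().enumerate() {
        if byte == b'\n' {
            out.extend_from_slice(&chunk[last_pos..i]); // bytes before the LF
            out.extend_from_slice(b"\r\n");
            last_pos = i + 1;
        }
    }
    out.extend_from_slice(&chunk[last_pos..]); // trailing bytes with no LF
    out
}

fn main() {
    assert_eq!(lf_to_crlf(b"a\nb"), b"a\r\nb");
    assert_eq!(lf_to_crlf(b"no newline"), b"no newline");
}
```

Note that, like the loop in the diff, this does not special-case input that already contains `\r\n`, which would come out as `\r\r\n`; that is harmless for most terminals but worth knowing.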
+64 -44
View File
@@ -158,27 +158,31 @@ impl McpRegistry {
     }
 
     pub async fn reinit(
-        registry: McpRegistry,
+        mut registry: McpRegistry,
        enabled_mcp_servers: Option<String>,
         abort_signal: AbortSignal,
     ) -> Result<Self> {
         debug!("Reinitializing MCP registry");
-        debug!("Stopping all MCP servers");
-        let mut new_registry = abortable_run_with_spinner(
-            registry.stop_all_servers(),
-            "Stopping MCP servers",
+        let desired_ids = registry.resolve_server_ids(enabled_mcp_servers.clone());
+        let desired_set: HashSet<String> = desired_ids.iter().cloned().collect();
+        debug!("Stopping unused MCP servers");
+        abortable_run_with_spinner(
+            registry.stop_unused_servers(&desired_set),
+            "Stopping unused MCP servers",
             abort_signal.clone(),
         )
         .await?;
         abortable_run_with_spinner(
-            new_registry.start_select_mcp_servers(enabled_mcp_servers),
+            registry.start_select_mcp_servers(enabled_mcp_servers),
             "Loading MCP servers",
             abort_signal,
         )
         .await?;
-        Ok(new_registry)
+        Ok(registry)
     }
 
     async fn start_select_mcp_servers(
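The reworked `reinit` is a reconcile pass: compute the desired set, stop only the running servers that fall outside it, then start the ones not yet running. Stripped of MCP details, the pattern looks like this (illustrative names only, not this crate's API):

```rust
use std::collections::{HashMap, HashSet};

// Reconcile a map of running servers against a desired set: drop the
// stale ones, return the sorted ids that still need starting. Stand-ins
// for stop_unused_servers + start_select_mcp_servers.
fn reconcile(running: &mut HashMap<String, u32>, desired: &HashSet<String>) -> Vec<String> {
    let stale: Vec<String> = running
        .keys()
        .filter(|id| !desired.contains(*id))
        .cloned()
        .collect();
    for id in &stale {
        running.remove(id); // "stop" a server we no longer want
    }
    let mut to_start: Vec<String> = desired
        .iter()
        .filter(|id| !running.contains_key(*id))
        .cloned()
        .collect();
    to_start.sort();
    to_start
}

fn main() {
    let mut running = HashMap::from([("a".to_string(), 1), ("b".to_string(), 2)]);
    let desired = HashSet::from(["b".to_string(), "c".to_string()]);
    assert_eq!(reconcile(&mut running, &desired), vec!["c".to_string()]);
    assert!(running.contains_key("b") && !running.contains_key("a"));
}
```

Because servers already running in the desired set are left untouched, a `reinit` that changes nothing becomes a no-op instead of a full restart, which is the point of this change.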
@@ -192,27 +196,19 @@ impl McpRegistry {
             return Ok(());
         }
-        if let Some(servers) = enabled_mcp_servers {
-            debug!("Starting selected MCP servers: {:?}", servers);
-            let config = self
-                .config
-                .as_ref()
-                .with_context(|| "MCP Config not defined. Cannot start servers")?;
-            let mcp_servers = config.mcp_servers.clone();
-            let enabled_servers: HashSet<String> =
-                servers.split(',').map(|s| s.trim().to_string()).collect();
-            let server_ids: Vec<String> = if servers == "all" {
-                mcp_servers.into_keys().collect()
-            } else {
-                mcp_servers
-                    .into_keys()
-                    .filter(|id| enabled_servers.contains(id))
-                    .collect()
-            };
+        let desired_ids = self.resolve_server_ids(enabled_mcp_servers);
+        let ids_to_start: Vec<String> = desired_ids.into_iter()
+            .filter(|id| !self.servers.contains_key(id))
+            .collect();
+        if ids_to_start.is_empty() {
+            return Ok(());
+        }
+        debug!("Starting selected MCP servers: {:?}", ids_to_start);
         let results: Vec<(String, Arc<_>, ServerCatalog)> = stream::iter(
-            server_ids
+            ids_to_start
                 .into_iter()
                 .map(|id| async { self.start_server(id).await }),
         )
@@ -220,15 +216,9 @@ impl McpRegistry {
         .try_collect()
         .await?;
-        self.servers = results
-            .clone()
-            .into_iter()
-            .map(|(id, server, _)| (id, server))
-            .collect();
-        self.catalogs = results
-            .into_iter()
-            .map(|(id, _, catalog)| (id, catalog))
-            .collect();
+        for (id, server, catalog) in results {
+            self.servers.insert(id.clone(), server);
+            self.catalogs.insert(id, catalog);
+        }
         Ok(())
@@ -309,19 +299,49 @@ impl McpRegistry {
         Ok((id.to_string(), service, catalog))
     }
 
-    pub async fn stop_all_servers(mut self) -> Result<Self> {
-        for (id, server) in self.servers {
-            Arc::try_unwrap(server)
-                .map_err(|_| anyhow!("Failed to unwrap Arc for MCP server: {id}"))?
-                .cancel()
-                .await
-                .with_context(|| format!("Failed to stop MCP server: {id}"))?;
-            info!("Stopped MCP server: {id}");
-        }
-        self.servers = HashMap::new();
-        Ok(self)
+    fn resolve_server_ids(&self, enabled_mcp_servers: Option<String>) -> Vec<String> {
+        if let Some(config) = &self.config
+            && let Some(servers) = enabled_mcp_servers
+        {
+            if servers == "all" {
+                config.mcp_servers.keys().cloned().collect()
+            } else {
+                let enabled_servers: HashSet<String> =
+                    servers.split(',').map(|s| s.trim().to_string()).collect();
+                config.mcp_servers
+                    .keys()
+                    .filter(|id| enabled_servers.contains(*id))
+                    .cloned()
+                    .collect()
+            }
+        } else {
+            vec![]
+        }
+    }
+
+    pub async fn stop_unused_servers(&mut self, keep_ids: &HashSet<String>) -> Result<()> {
+        let mut ids_to_remove = Vec::new();
+        for (id, _) in self.servers.iter() {
+            if !keep_ids.contains(id) {
+                ids_to_remove.push(id.clone());
+            }
+        }
+        for id in ids_to_remove {
+            if let Some(server) = self.servers.remove(&id) {
+                match Arc::try_unwrap(server) {
+                    Ok(server_inner) => {
+                        server_inner.cancel().await
+                            .with_context(|| format!("Failed to stop MCP server: {id}"))?;
+                        info!("Stopped MCP server: {id}");
+                    }
+                    Err(_) => {
+                        info!("Detaching from MCP server: {id} (still in use)");
+                    }
+                }
+                self.catalogs.remove(&id);
+            }
+        }
+        Ok(())
     }
pub fn list_started_servers(&self) -> Vec<String> {
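The `Arc::try_unwrap` match in `stop_unused_servers` is what chooses between stopping a server and merely detaching from one that other tasks still reference: it only yields ownership when the reference count is exactly one. A minimal demonstration of that behavior:

```rust
use std::sync::Arc;

// Arc::try_unwrap returns the inner value only when this Arc is the last
// reference; otherwise it hands the Arc back untouched. The registry uses
// the Err case to detach instead of cancelling a server still in use.
fn main() {
    let sole = Arc::new(String::from("server"));
    assert!(Arc::try_unwrap(sole).is_ok()); // sole owner: safe to stop

    let shared = Arc::new(String::from("server"));
    let handle = Arc::clone(&shared);
    assert!(Arc::try_unwrap(shared).is_err()); // still shared: detach
    drop(handle);
}
```

This is also why the old `stop_all_servers` returned an error when the unwrap failed, while the new code downgrades that situation to an informational log and moves on.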