10 Commits

Author SHA1 Message Date
3baa3102a3 style: Cleaned up an anyhow error 2025-12-16 14:51:35 -07:00
github-actions[bot] 2d4fad596c bump: version 0.1.2 → 0.1.3 [skip ci] 2025-12-13 20:57:37 +00:00
7259e59d2a ci: Prep for 0.1.3 release 2025-12-13 13:38:09 -07:00
cec04c4597 style: Improved error message for un-fully configured MCP configuration 2025-12-13 13:37:01 -07:00
github-actions[bot] a7f5677195 chore: bump Cargo.toml to 0.1.3 2025-12-13 20:28:10 +00:00
github-actions[bot] 6075f0a190 bump: version 0.1.2 → 0.1.3 [skip ci] 2025-12-13 20:27:58 +00:00
15310a9e2c chore: Updated the models 2025-12-11 09:05:41 -07:00
f7df54f2f7 docs: Removed the warning about MCP token usage since that has been fixed 2025-12-05 12:38:15 -07:00
212d4bace4 docs: Fixed an unclosed backtick typo in the Environment Variables docs 2025-12-05 12:37:59 -07:00
f4b3267c89 docs: Fixed typo in vault readme 2025-12-05 11:05:14 -07:00
9 changed files with 586 additions and 504 deletions
+6
@@ -1,3 +1,9 @@
## v0.1.3 (2025-12-13)
### Feat
- Improved MCP implementation to minimize the tokens needed to utilize it so it doesn't quickly overwhelm the token space for a given model
## v0.1.2 (2025-11-08) ## v0.1.2 (2025-11-08)
### Refactor ### Refactor
Generated
+364 -279
File diff suppressed because it is too large
+1 -1
@@ -1,6 +1,6 @@
[package] [package]
name = "loki-ai" name = "loki-ai"
version = "0.1.2" version = "0.1.3"
edition = "2024" edition = "2024"
authors = ["Alex Clarke <alex.j.tusa@gmail.com>"] authors = ["Alex Clarke <alex.j.tusa@gmail.com>"]
description = "An all-in-one, batteries included LLM CLI Tool" description = "An all-in-one, batteries included LLM CLI Tool"
+1 -1
@@ -84,7 +84,7 @@ You can also customize the location of full agent configurations using the follo
| Environment Variable | Description | | Environment Variable | Description |
|------------------------------|-------------------------------------------------------------------------------------------------------------------------------------| |------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|
| `<AGENT_NAME>_CONFIG_FILE | Customize the location of the agent's configuration file; e.g. `SQL_CONFIG_FILE` | | `<AGENT_NAME>_CONFIG_FILE` | Customize the location of the agent's configuration file; e.g. `SQL_CONFIG_FILE` |
| `<AGENT_NAME>_MODEL` | Customize the `model` used for the agent; e.g `SQL_MODEL` | | `<AGENT_NAME>_MODEL` | Customize the `model` used for the agent; e.g `SQL_MODEL` |
| `<AGENT_NAME>_TEMPERATURE` | Customize the `temperature` used for the agent; e.g. `SQL_TEMPERATURE` | | `<AGENT_NAME>_TEMPERATURE` | Customize the `temperature` used for the agent; e.g. `SQL_TEMPERATURE` |
| `<AGENT_NAME>_TOP_P` | Customize the `top_p` used for the agent; e.g. `SQL_TOP_P` | | `<AGENT_NAME>_TOP_P` | Customize the `top_p` used for the agent; e.g. `SQL_TOP_P` |
+1 -1
@@ -114,7 +114,7 @@ At the time of writing, the following files support Loki secret injection:
|-------------------------|-----------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------| |-------------------------|-----------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------|
| `config.yaml` | The main Loki configuration file | Cannot use secret injection on the `vault_password_file` field | | `config.yaml` | The main Loki configuration file | Cannot use secret injection on the `vault_password_file` field |
| `functions/mcp.json` | The MCP server configuration file | | | `functions/mcp.json` | The MCP server configuration file | |
| `<agent>/tools.<py/sh>` | Tool files for agents | Specific configuration and only supported for Agents, not all global tools ([see below](#environment-variable-secret-injection-in-agents) | | `<agent>/tools.<py/sh>` | Tool files for agents | Specific configuration and only supported for Agents, not all global tools ([see below](#environment-variable-secret-injection-in-agents)) |
Note that all paths are relative to the Loki configuration directory. The directory varies by system, so you can find yours by Note that all paths are relative to the Loki configuration directory. The directory varies by system, so you can find yours by
+1 -3
@@ -83,9 +83,7 @@ enabled_mcp_servers: null # Which MCP servers to enable by default (e.g.
``` ```
A special note about `enabled_mcp_servers`: a user can set this to `all` to enable all configured MCP servers in the A special note about `enabled_mcp_servers`: a user can set this to `all` to enable all configured MCP servers in the
`functions/mcp.json` configuration. However, **this should be used with caution**. When there is a significant number `functions/mcp.json` configuration.
of configured MCP servers, enabling all MCP servers may overwhelm the context length of a model, and quickly exceed
token limits.
(See the [Configuration Example](../../config.example.yaml) file for an example global configuration with all options.) (See the [Configuration Example](../../config.example.yaml) file for an example global configuration with all options.)
+210 -217
@@ -3,6 +3,20 @@
# - https://platform.openai.com/docs/api-reference/chat # - https://platform.openai.com/docs/api-reference/chat
- provider: openai - provider: openai
models: models:
- name: gpt-5.1
max_input_tokens: 400000
max_output_tokens: 128000
input_price: 1.25
output_price: 10
supports_vision: true
supports_function_calling: true
- name: gpt-5.1-chat-latest
max_input_tokens: 400000
max_output_tokens: 128000
input_price: 1.25
output_price: 10
supports_vision: true
supports_function_calling: true
- name: gpt-5 - name: gpt-5
max_input_tokens: 400000 max_input_tokens: 400000
max_output_tokens: 128000 max_output_tokens: 128000
@@ -31,13 +45,6 @@
output_price: 0.4 output_price: 0.4
supports_vision: true supports_vision: true
supports_function_calling: true supports_function_calling: true
- name: gpt-5-codex
max_input_tokens: 400000
max_output_tokens: 128000
input_price: 1.25
output_price: 10
supports_vision: true
supports_function_calling: true
- name: gpt-4.1 - name: gpt-4.1
max_input_tokens: 1047576 max_input_tokens: 1047576
max_output_tokens: 32768 max_output_tokens: 32768
@@ -259,6 +266,30 @@
thinking: thinking:
type: enabled type: enabled
budget_tokens: 16000 budget_tokens: 16000
- name: claude-haiku-4-5-20251001
max_input_tokens: 200000
max_output_tokens: 8192
require_max_tokens: true
input_price: 1
output_price: 5
supports_vision: true
supports_function_calling: true
- name: claude-haiku-4-5-20251001:thinking
real_name: claude-haiku-4-5-20251001
max_input_tokens: 200000
max_output_tokens: 24000
require_max_tokens: true
input_price: 1
output_price: 5
supports_vision: true
supports_function_calling: true
patch:
body:
temperature: null
top_p: null
thinking:
type: enabled
budget_tokens: 16000
- name: claude-opus-4-1-20250805 - name: claude-opus-4-1-20250805
max_input_tokens: 200000 max_input_tokens: 200000
max_output_tokens: 8192 max_output_tokens: 8192
@@ -660,6 +691,29 @@
thinking: thinking:
type: enabled type: enabled
budget_tokens: 16000 budget_tokens: 16000
- name: claude-haiku-4-5@20251001
max_input_tokens: 200000
max_output_tokens: 8192
require_max_tokens: true
input_price: 1
output_price: 5
supports_vision: true
supports_function_calling: true
- name: claude-haiku-4-5@20251001:thinking
real_name: claude-haiku-4-5@20251001
max_input_tokens: 200000
max_output_tokens: 24000
require_max_tokens: true
input_price: 1
output_price: 5
supports_vision: true
patch:
body:
temperature: null
top_p: null
thinking:
type: enabled
budget_tokens: 16000
- name: claude-opus-4-1@20250805 - name: claude-opus-4-1@20250805
max_input_tokens: 200000 max_input_tokens: 200000
max_output_tokens: 8192 max_output_tokens: 8192
@@ -817,6 +871,31 @@
thinking: thinking:
type: enabled type: enabled
budget_tokens: 16000 budget_tokens: 16000
- name: us.anthropic.claude-haiku-4-5-20251001-v1:0
max_input_tokens: 200000
max_output_tokens: 8192
require_max_tokens: true
input_price: 1
output_price: 5
supports_vision: true
supports_function_calling: true
- name: us.anthropic.claude-haiku-4-5-20251001-v1:0:thinking
real_name: us.anthropic.claude-haiku-4-5-20251001-v1:0
max_input_tokens: 200000
max_output_tokens: 24000
require_max_tokens: true
input_price: 1
output_price: 5
supports_vision: true
patch:
body:
inferenceConfig:
temperature: null
topP: null
additionalModelRequestFields:
thinking:
type: enabled
budget_tokens: 16000
- name: us.anthropic.claude-opus-4-1-20250805-v1:0 - name: us.anthropic.claude-opus-4-1-20250805-v1:0
max_input_tokens: 200000 max_input_tokens: 200000
max_output_tokens: 8192 max_output_tokens: 8192
@@ -1004,6 +1083,12 @@
require_max_tokens: true require_max_tokens: true
input_price: 0 input_price: 0
output_price: 0 output_price: 0
- name: '@cf/qwen/qwen3-30b-a3b-fp8'
max_input_tokens: 131072
max_output_tokens: 2048
require_max_tokens: true
input_price: 0
output_price: 0
- name: '@cf/qwen/qwen2.5-coder-32b-instruct' - name: '@cf/qwen/qwen2.5-coder-32b-instruct'
max_input_tokens: 131072 max_input_tokens: 131072
max_output_tokens: 2048 max_output_tokens: 2048
@@ -1030,8 +1115,8 @@
max_batch_size: 100 max_batch_size: 100
# Links: # Links:
# - https://cloud.baidu.com/doc/WENXINWORKSHOP/s/Wm9cvy6rl # - https://cloud.baidu.com/doc/qianfan/s/rmh4stp0j
# - https://cloud.baidu.com/doc/WENXINWORKSHOP/s/Qm9cw2s7m # - https://cloud.baidu.com/doc/qianfan/s/wmh4sv6ya
- provider: ernie - provider: ernie
models: models:
- name: ernie-4.5-turbo-128k - name: ernie-4.5-turbo-128k
@@ -1043,8 +1128,12 @@
input_price: 0.42 input_price: 0.42
output_price: 1.26 output_price: 1.26
supports_vision: true supports_vision: true
- name: ernie-x1-turbo-32k - name: ernie-5.0-thinking-preview
max_input_tokens: 32768 max_input_tokens: 131072
input_price: 1.4
output_price: 5.6
- name: ernie-x1.1-preview
max_input_tokens: 65536
input_price: 0.14 input_price: 0.14
output_price: 0.56 output_price: 0.56
- name: bge-large-zh - name: bge-large-zh
@@ -1064,75 +1153,31 @@
max_input_tokens: 1024 max_input_tokens: 1024
input_price: 0.07 input_price: 0.07
# Links: # Links:
# - https://help.aliyun.com/zh/model-studio/getting-started/models # - https://help.aliyun.com/zh/model-studio/getting-started/models
# - https://help.aliyun.com/zh/model-studio/developer-reference/use-qwen-by-calling-api # - https://help.aliyun.com/zh/model-studio/developer-reference/use-qwen-by-calling-api
- provider: qianwen - provider: qianwen
models: models:
- name: qwen-max-latest
max_input_tokens: 32678
max_output_tokens: 8192
input_price: 1.6
output_price: 6.4
supports_function_calling: true
- name: qwen-plus-latest
max_input_tokens: 131072
max_output_tokens: 8192
input_price: 0.112
output_price: 0.28
supports_function_calling: true
- name: qwen-turbo-latest
max_input_tokens: 1000000
max_output_tokens: 8192
input_price: 0.042
output_price: 0.084
supports_function_calling: true
- name: qwen-long
max_input_tokens: 1000000
input_price: 0.07
output_price: 0.28
- name: qwen-omni-turbo-latest
max_input_tokens: 32768
max_output_tokens: 2048
supports_vision: true
- name: qwen-coder-plus-latest
max_input_tokens: 131072
max_output_tokens: 8192
input_price: 0.49
output_price: 0.98
- name: qwen-coder-turbo-latest
max_input_tokens: 131072
max_output_tokens: 8192
input_price: 0.28
output_price: 0.84
- name: qwen-vl-max-latest
max_input_tokens: 30720
max_output_tokens: 2048
input_price: 0.42
output_price: 1.26
supports_vision: true
- name: qwen-vl-plus-latest
max_input_tokens: 30000
max_output_tokens: 2048
input_price: 0.21
output_price: 0.63
supports_vision: true
- name: qwen3-max - name: qwen3-max
max_input_tokens: 262144 max_input_tokens: 262144
input_price: 2.1 supports_function_calling: true
output_price: 8.4 - name: qwen-plus
max_input_tokens: 131072
supports_function_calling: true
- name: qwen-flash
max_input_tokens: 1000000
supports_function_calling: true supports_function_calling: true
- name: qwen3-vl-plus - name: qwen3-vl-plus
max_input_tokens: 262144 max_input_tokens: 262144
input_price: 0.42
output_price: 4.2
supports_vision: true supports_vision: true
- name: qwen3-max-preview - name: qwen3-vl-flash
max_input_tokens: 262144 max_input_tokens: 262144
max_output_tokens: 32768 supports_vision: true
input_price: 1.4 - name: qwen-coder-plus
output_price: 5.6 max_input_tokens: 1000000
supports_function_calling: true - name: qwen-coder-flash
max_input_tokens: 1000000
- name: qwen3-next-80b-a3b-instruct - name: qwen3-next-80b-a3b-instruct
max_input_tokens: 131072 max_input_tokens: 131072
input_price: 0.14 input_price: 0.14
@@ -1160,6 +1205,16 @@
max_input_tokens: 131072 max_input_tokens: 131072
input_price: 0.105 input_price: 0.105
output_price: 1.05 output_price: 1.05
- name: qwen3-vl-32b-instruct
max_input_tokens: 131072
input_price: 0.28
output_price: 1.12
supports_vision: true
- name: qwen3-vl-8b-instruct
max_input_tokens: 131072
input_price: 0.07
output_price: 0.28
supports_vision: true
- name: qwen3-coder-480b-a35b-instruct - name: qwen3-coder-480b-a35b-instruct
max_input_tokens: 262144 max_input_tokens: 262144
input_price: 1.26 input_price: 1.26
@@ -1168,32 +1223,10 @@
max_input_tokens: 262144 max_input_tokens: 262144
input_price: 0.315 input_price: 0.315
output_price: 1.26 output_price: 1.26
- name: qwen2.5-72b-instruct - name: deepseek-v3.2-exp
max_input_tokens: 129024
max_output_tokens: 8192
input_price: 0.56
output_price: 1.68
supports_function_calling: true
- name: qwen2.5-vl-72b-instruct
max_input_tokens: 129024
max_output_tokens: 8192
input_price: 2.24
output_price: 6.72
supports_vision: true
- name: qwen2.5-coder-32b-instruct
max_input_tokens: 129024
max_output_tokens: 8192
input_price: 0.49
output_price: 0.98
supports_function_calling: true
- name: deepseek-v3.1
max_input_tokens: 131072 max_input_tokens: 131072
input_price: 0.28 input_price: 0.28
output_price: 1.12 output_price: 0.42
- name: deepseek-r1-0528
max_input_tokens: 65536
input_price: 0.28
output_price: 1.12
- name: text-embedding-v4 - name: text-embedding-v4
type: embedding type: embedding
input_price: 0.1 input_price: 0.1
@@ -1247,10 +1280,10 @@
# - https://platform.moonshot.cn/docs/api/chat#%E5%85%AC%E5%BC%80%E7%9A%84%E6%9C%8D%E5%8A%A1%E5%9C%B0%E5%9D%80 # - https://platform.moonshot.cn/docs/api/chat#%E5%85%AC%E5%BC%80%E7%9A%84%E6%9C%8D%E5%8A%A1%E5%9C%B0%E5%9D%80
- provider: moonshot - provider: moonshot
models: models:
- name: kimi-latest - name: kimi-k2-turbo-preview
max_input_tokens: 131072 max_input_tokens: 262144
input_price: 1.4 input_price: 1.12
output_price: 4.2 output_price: 8.12
supports_vision: true supports_vision: true
supports_function_calling: true supports_function_calling: true
- name: kimi-k2-0905-preview - name: kimi-k2-0905-preview
@@ -1259,16 +1292,15 @@
output_price: 2.24 output_price: 2.24
supports_vision: true supports_vision: true
supports_function_calling: true supports_function_calling: true
- name: kimi-k2-turbo-preview - name: kimi-k2-thinking-turbo
max_input_tokens: 131072 max_input_tokens: 262144
input_price: 1.12 input_price: 1.12
output_price: 4.48 output_price: 8.12
supports_vision: true supports_vision: true
supports_function_calling: true - name: kimi-k2-thinking
- name: kimi-thinking-preview max_input_tokens: 262144
max_input_tokens: 131072 input_price: 0.56
input_price: 28 output_price: 2.24
output_price: 28
supports_vision: true supports_vision: true
# Links: # Links:
@@ -1293,7 +1325,7 @@
# - https://open.bigmodel.cn/dev/api#glm-4 # - https://open.bigmodel.cn/dev/api#glm-4
- provider: zhipuai - provider: zhipuai
models: models:
- name: glm-4.5 - name: glm-4.6
max_input_tokens: 202752 max_input_tokens: 202752
input_price: 0.28 input_price: 0.28
output_price: 1.12 output_price: 1.12
@@ -1353,25 +1385,35 @@
input_price: 0.112 input_price: 0.112
# Links: # Links:
# - https://platform.minimaxi.com/document/pricing # - https://platform.minimaxi.com/docs/guides/pricing
# - https://platform.minimaxi.com/document/ChatCompletion%20v2 # - https://platform.minimaxi.com/document/ChatCompletion%20v2
- provider: minimax - provider: minimax
models: models:
- name: minimax-text-01 - name: minimax-m2
max_input_tokens: 1000192 max_input_tokens: 204800
input_price: 0.14 input_price: 0.294
output_price: 1.12 output_price: 1.176
supports_vision: true supports_function_calling: true
- name: minimax-m1
max_input_tokens: 131072
input_price: 0.112
output_price: 1.12
# Links: # Links:
# - https://openrouter.ai/models # - https://openrouter.ai/models
# - https://openrouter.ai/docs/api-reference/chat-completion # - https://openrouter.ai/docs/api-reference/chat-completion
- provider: openrouter - provider: openrouter
models: models:
- name: openai/gpt-5.1
max_input_tokens: 400000
max_output_tokens: 128000
input_price: 1.25
output_price: 10
supports_vision: true
supports_function_calling: true
- name: openai/gpt-5.1-chat
max_input_tokens: 400000
max_output_tokens: 128000
input_price: 1.25
output_price: 10
supports_vision: true
supports_function_calling: true
- name: openai/gpt-5 - name: openai/gpt-5
max_input_tokens: 400000 max_input_tokens: 400000
max_output_tokens: 128000 max_output_tokens: 128000
@@ -1400,13 +1442,6 @@
output_price: 0.4 output_price: 0.4
supports_vision: true supports_vision: true
supports_function_calling: true supports_function_calling: true
- name: openai/gpt-5-codex
max_input_tokens: 400000
max_output_tokens: 128000
input_price: 1.25
output_price: 10
supports_vision: true
supports_function_calling: true
- name: openai/gpt-4.1 - name: openai/gpt-4.1
max_input_tokens: 1047576 max_input_tokens: 1047576
max_output_tokens: 32768 max_output_tokens: 32768
@@ -1563,6 +1598,14 @@
output_price: 15 output_price: 15
supports_vision: true supports_vision: true
supports_function_calling: true supports_function_calling: true
- name: anthropic/claude-haiku-4.5
max_input_tokens: 200000
max_output_tokens: 8192
require_max_tokens: true
input_price: 1
output_price: 5
supports_vision: true
supports_function_calling: true
- name: anthropic/claude-opus-4.1 - name: anthropic/claude-opus-4.1
max_input_tokens: 200000 max_input_tokens: 200000
max_output_tokens: 8192 max_output_tokens: 8192
@@ -1696,11 +1739,10 @@
patch: patch:
body: body:
include_reasoning: true include_reasoning: true
- name: qwen/qwen-max - name: qwen/qwen3-max
max_input_tokens: 32768 max_input_tokens: 262144
max_output_tokens: 8192 input_price: 1.2
input_price: 1.6 output_price: 6
output_price: 6.4
supports_function_calling: true supports_function_calling: true
- name: qwen/qwen-plus - name: qwen/qwen-plus
max_input_tokens: 131072 max_input_tokens: 131072
@@ -1708,22 +1750,6 @@
input_price: 0.4 input_price: 0.4
output_price: 1.2 output_price: 1.2
supports_function_calling: true supports_function_calling: true
- name: qwen/qwen-turbo
max_input_tokens: 1000000
max_output_tokens: 8192
input_price: 0.05
output_price: 0.2
supports_function_calling: true
- name: qwen/qwen-vl-plus
max_input_tokens: 7500
input_price: 0.21
output_price: 0.63
supports_vision: true
- name: qwen/qwen3-max
max_input_tokens: 262144
input_price: 1.2
output_price: 6
supports_function_calling: true
- name: qwen/qwen3-next-80b-a3b-instruct - name: qwen/qwen3-next-80b-a3b-instruct
max_input_tokens: 262144 max_input_tokens: 262144
input_price: 0.1 input_price: 0.1
@@ -1733,7 +1759,7 @@
max_input_tokens: 262144 max_input_tokens: 262144
input_price: 0.1 input_price: 0.1
output_price: 0.8 output_price: 0.8
- name: qwen/qwen3-235b-a22b-2507 - name: qwen/qwen3-235b-a22b-2507 # Qwen3 235B A22B Instruct 2507
max_input_tokens: 262144 max_input_tokens: 262144
input_price: 0.12 input_price: 0.12
output_price: 0.59 output_price: 0.59
@@ -1750,6 +1776,16 @@
max_input_tokens: 262144 max_input_tokens: 262144
input_price: 0.071 input_price: 0.071
output_price: 0.285 output_price: 0.285
- name: qwen/qwen3-vl-32b-instruct
max_input_tokens: 262144
input_price: 0.35
output_price: 1.1
supports_vision: true
- name: qwen/qwen3-vl-8b-instruct
max_input_tokens: 262144
input_price: 0.08
output_price: 0.50
supports_vision: true
- name: qwen/qwen3-coder-plus - name: qwen/qwen3-coder-plus
max_input_tokens: 128000 max_input_tokens: 128000
input_price: 1 input_price: 1
@@ -1760,30 +1796,26 @@
input_price: 0.3 input_price: 0.3
output_price: 1.5 output_price: 1.5
supports_function_calling: true supports_function_calling: true
- name: qwen/qwen3-coder # Qwen3 Coder 480B A35B
max_input_tokens: 262144
input_price: 0.22
output_price: 0.95
supports_function_calling: true
- name: qwen/qwen3-coder-30b-a3b-instruct - name: qwen/qwen3-coder-30b-a3b-instruct
max_input_tokens: 262144 max_input_tokens: 262144
input_price: 0.052 input_price: 0.052
output_price: 0.207 output_price: 0.207
supports_function_calling: true supports_function_calling: true
- name: qwen/qwen-2.5-72b-instruct
max_input_tokens: 131072
input_price: 0.35
output_price: 0.4
supports_function_calling: true
- name: qwen/qwen2.5-vl-72b-instruct
max_input_tokens: 32000
input_price: 0.7
output_price: 0.7
supports_vision: true
- name: qwen/qwen-2.5-coder-32b-instruct
max_input_tokens: 32768
input_price: 0.18
output_price: 0.18
- name: moonshotai/kimi-k2-0905 - name: moonshotai/kimi-k2-0905
max_input_tokens: 262144 max_input_tokens: 262144
input_price: 0.296 input_price: 0.296
output_price: 1.185 output_price: 1.185
supports_function_calling: true supports_function_calling: true
- name: moonshotai/kimi-k2-thinking
max_input_tokens: 262144
input_price: 0.45
output_price: 2.35
supports_function_calling: true
- name: moonshotai/kimi-dev-72b - name: moonshotai/kimi-dev-72b
max_input_tokens: 131072 max_input_tokens: 131072
input_price: 0.29 input_price: 0.29
@@ -1804,6 +1836,11 @@
input_price: 0.2 input_price: 0.2
output_price: 1.5 output_price: 1.5
supports_function_calling: true supports_function_calling: true
- name: amazon/nova-premier-v1
max_input_tokens: 1000000
input_price: 2.5
output_price: 12.5
supports_vision: true
- name: amazon/nova-pro-v1 - name: amazon/nova-pro-v1
max_input_tokens: 300000 max_input_tokens: 300000
max_output_tokens: 5120 max_output_tokens: 5120
@@ -1850,29 +1887,15 @@
patch: patch:
body: body:
include_reasoning: true include_reasoning: true
- name: minimax/minimax-01 - name: minimax/minimax-m2
max_input_tokens: 1000192 max_input_tokens: 196608
input_price: 0.2 input_price: 0.15
output_price: 1.1 output_price: 0.45
- name: z-ai/glm-4.6 - name: z-ai/glm-4.6
max_input_tokens: 202752 max_input_tokens: 202752
input_price: 0.5 input_price: 0.5
output_price: 1.75 output_price: 1.75
supports_function_calling: true supports_function_calling: true
- name: z-ai/glm-4.5
max_input_tokens: 131072
input_price: 0.2
output_price: 0.2
supports_function_calling: true
- name: z-ai/glm-4.5-air
max_input_tokens: 131072
input_price: 0.2
output_price: 1.1
- name: z-ai/glm-4.5v
max_input_tokens: 65536
input_price: 0.5
output_price: 1.7
supports_vision: true
# Links: # Links:
# - https://github.com/marketplace?type=models # - https://github.com/marketplace?type=models
@@ -2068,10 +2091,6 @@
input_price: 0.08 input_price: 0.08
output_price: 0.3 output_price: 0.3
supports_vision: true supports_vision: true
- name: meta-llama/Llama-3.3-70B-Instruct
max_input_tokens: 131072
input_price: 0.23
output_price: 0.40
- name: Qwen/Qwen3-Next-80B-A3B-Instruct - name: Qwen/Qwen3-Next-80B-A3B-Instruct
max_input_tokens: 262144 max_input_tokens: 262144
input_price: 0.14 input_price: 0.14
@@ -2100,27 +2119,15 @@
input_price: 0.07 input_price: 0.07
output_price: 0.27 output_price: 0.27
supports_function_calling: true supports_function_calling: true
- name: Qwen/Qwen3-235B-A22B
max_input_tokens: 40960
input_price: 0.15
output_price: 0.6
- name: Qwen/Qwen3-30B-A3B - name: Qwen/Qwen3-30B-A3B
max_input_tokens: 40960 max_input_tokens: 40960
input_price: 0.1 input_price: 0.1
output_price: 0.3 output_price: 0.3
- name: Qwen/Qwen3-32B - name: Qwen/Qwen3-VL-8B-Instruct
max_input_tokens: 40960 max_input_tokens: 262144
input_price: 0.1 input_price: 0.18
output_price: 0.3 output_price: 0.69
- name: Qwen/Qwen2.5-72B-Instruct supports_vision: true
max_input_tokens: 32768
input_price: 0.23
output_price: 0.40
supports_function_calling: true
- name: Qwen/Qwen2.5-Coder-32B-Instruct
max_input_tokens: 32768
input_price: 0.07
output_price: 0.16
- name: deepseek-ai/DeepSeek-V3.2-Exp - name: deepseek-ai/DeepSeek-V3.2-Exp
max_input_tokens: 163840 max_input_tokens: 163840
input_price: 0.27 input_price: 0.27
@@ -2145,35 +2152,21 @@
max_input_tokens: 32768 max_input_tokens: 32768
input_price: 0.06 input_price: 0.06
output_price: 0.12 output_price: 0.12
- name: mistralai/Devstral-Small-2507
max_input_tokens: 131072
input_price: 0.07
output_price: 0.28
- name: moonshotai/Kimi-K2-Instruct-0905 - name: moonshotai/Kimi-K2-Instruct-0905
max_input_tokens: 262144 max_input_tokens: 262144
input_price: 0.5 input_price: 0.5
output_price: 2.0 output_price: 2.0
supports_function_calling: true supports_function_calling: true
- name: moonshotai/Kimi-K2-Thinking
max_input_tokens: 262144
input_price: 0.55
output_price: 2.5
supports_function_calling: true
- name: zai-org/GLM-4.6 - name: zai-org/GLM-4.6
max_input_tokens: 202752 max_input_tokens: 202752
input_price: 0.6 input_price: 0.6
output_price: 1.9 output_price: 1.9
supports_function_calling: true supports_function_calling: true
- name: zai-org/GLM-4.5
max_input_tokens: 131072
input_price: 0.55
output_price: 2.0
supports_function_calling: true
- name: zai-org/GLM-4.5-Air
max_input_tokens: 131072
input_price: 0.2
output_price: 1.1
supports_function_calling: true
- name: zai-org/GLM-4.5V
max_input_tokens: 65536
input_price: 0.5
output_price: 1.7
supports_vision: true
- name: BAAI/bge-large-en-v1.5 - name: BAAI/bge-large-en-v1.5
type: embedding type: embedding
input_price: 0.01 input_price: 0.01
@@ -2271,4 +2264,4 @@
- name: rerank-2-lite - name: rerank-2-lite
type: reranker type: reranker
max_input_tokens: 8000 max_input_tokens: 8000
input_price: 0.02 input_price: 0.02
+1 -1
@@ -228,7 +228,7 @@ macro_rules! config_get_fn {
std::env::var(&env_name) std::env::var(&env_name)
.ok() .ok()
.or_else(|| self.config.$field_name.clone()) .or_else(|| self.config.$field_name.clone())
.ok_or_else(|| anyhow::anyhow!("Miss '{}'", stringify!($field_name))) .ok_or_else(|| anyhow::anyhow!("Missing '{}'", stringify!($field_name)))
} }
}; };
} }
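The hunk above only changes the error text, but the lookup order it sits in (environment variable first, then the configured value, then a `Missing '<field>'` error) may be easier to follow as a standalone function. The following is a minimal sketch under stated assumptions, not the project's macro: `resolve_setting` is a hypothetical name, a `HashMap` stands in for the process environment so the behavior can be checked without touching real variables, and a plain `String` error replaces `anyhow`.

```rust
use std::collections::HashMap;

/// Resolve one setting the way the macro in the diff appears to:
/// 1. an environment-style override wins,
/// 2. otherwise the configured value is used,
/// 3. otherwise a "Missing '<field>'" error is returned.
///
/// Assumption: `overrides` is a stand-in for `std::env::var`, and the
/// function name and signature are illustrative only.
fn resolve_setting(
    overrides: &HashMap<String, String>,
    env_name: &str,
    field_name: &str,
    configured: Option<&str>,
) -> Result<String, String> {
    overrides
        .get(env_name)
        .cloned()
        // Fall back to the value from the configuration file, if any.
        .or_else(|| configured.map(str::to_owned))
        // Neither source had a value: report the field by name.
        .ok_or_else(|| format!("Missing '{}'", field_name))
}
```

With an empty override map and no configured value, a call such as `resolve_setting(&HashMap::new(), "SQL_MODEL", "model", None)` returns the error string `Missing 'model'`, which matches the corrected wording in this commit.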
+1 -1
@@ -800,7 +800,7 @@ impl Config {
|| s == "all" || s == "all"
}) { }) {
bail!( bail!(
"Some of the specified MCP servers in 'enabled_mcp_servers' are configured. Please check your MCP server configuration." "Some of the specified MCP servers in 'enabled_mcp_servers' are not fully configured. Please check your MCP server configuration."
); );
} }
} }
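The visible fragment suggests that each entry in `enabled_mcp_servers` must either name a configured server or be the literal `all`, and that anything else triggers the corrected error. A rough sketch of such a check follows. It assumes the setting is a comma-separated string and that configured servers are available as a set of names. Neither detail is confirmed by the diff, and `validate_enabled_mcp_servers` is a hypothetical helper, not the function in `Config`.

```rust
use std::collections::HashSet;

/// Check that every requested MCP server is known.
///
/// Assumptions (not confirmed by the diff): `enabled` is a comma-separated
/// list, `configured` holds the server names found in `functions/mcp.json`,
/// and the special value "all" is always accepted.
fn validate_enabled_mcp_servers(
    enabled: &str,
    configured: &HashSet<&str>,
) -> Result<(), String> {
    let all_known = enabled
        .split(',')
        .map(str::trim)
        .all(|s| configured.contains(s) || s == "all");

    if !all_known {
        // Error text taken from the new side of the diff above.
        return Err(
            "Some of the specified MCP servers in 'enabled_mcp_servers' are not fully configured. \
             Please check your MCP server configuration."
                .to_string(),
        );
    }
    Ok(())
}
```

For example, with configured servers `github` and `slack` (made-up names), the inputs `"github,slack"` and `"all"` pass, while `"github,jira"` is rejected with the message above.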