* feat: add gemma 4 for AI studio
* fix: update max context return value for gemma-3n-e4b-it model
* refactor: iterate array of [regex, number]
* gemma4: enable tool calling and system prompt
Co-authored-by: Copilot <copilot@github.com>
---------
Co-authored-by: Copilot <copilot@github.com>
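For illustration, the "[regex, number]" refactor mentioned above could look roughly like the sketch below. The table contents, the default value, and the name getMaxContext are placeholders invented for this example. They are not the project's actual patterns or limits.

```javascript
// Hypothetical lookup table: patterns and sizes are placeholders, not real limits.
const CONTEXT_SIZES = [
    [/^example-small/, 8192],
    [/^example-large/, 131072],
];
const DEFAULT_CONTEXT = 4096; // assumed fallback for unmatched models

function getMaxContext(model) {
    // First matching pattern wins, so more specific patterns should come first.
    for (const [pattern, size] of CONTEXT_SIZES) {
        if (pattern.test(model)) {
            return size;
        }
    }
    return DEFAULT_CONTEXT;
}

console.log(getMaxContext('example-large-it'));
```

Keeping the pairs in one array means a new model family only needs one extra row rather than another if/else branch.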
* feat: add MiniMax as a chat completion provider
Add MiniMax (https://www.minimax.io) as a first-class chat completion
provider. MiniMax already has TTS integration in SillyTavern; this
extends support to LLM chat completions via their OpenAI-compatible API.
Supported models:
- MiniMax-M2.5 (default) — 204K context
- MiniMax-M2.5-highspeed — same capability, faster inference
Key implementation details:
- Reuses existing SECRET_KEYS.MINIMAX (shared with TTS)
- API endpoint: https://api.minimax.io/v1
- Temperature clamped to (0.0, 1.0] as required by MiniMax API
- Returns hardcoded model list since MiniMax doesn't expose /v1/models
- Full UI integration: model selector, sampler parameters, streaming
Co-Authored-By: octo-patch <octo-patch@users.noreply.github.com>
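A minimal sketch of the temperature constraint listed above, assuming a small positive floor stands in for the open lower bound. The helper name and the 0.01 floor are assumptions for illustration, not the code from this commit.

```javascript
// Hypothetical helper; the name and the 0.01 floor are illustrative assumptions.
const MINIMAX_TEMP_FLOOR = 0.01; // assumed smallest value treated as "greater than 0"
const MINIMAX_TEMP_MAX = 1.0;

function clampMinimaxTemperature(temperature) {
    const value = Number(temperature);
    // Fall back to the upper bound when the input is not a finite number.
    if (!Number.isFinite(value)) {
        return MINIMAX_TEMP_MAX;
    }
    return Math.min(MINIMAX_TEMP_MAX, Math.max(MINIMAX_TEMP_FLOOR, value));
}

console.log(clampMinimaxTemperature(0));
console.log(clampMinimaxTemperature(1.5));
```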
* feat: upgrade MiniMax default model to M2.7
- Add MiniMax-M2.7 and MiniMax-M2.7-highspeed to model list
- Set MiniMax-M2.7 as default model
- Keep all previous models as alternatives
* feat: independent request function, vision support, temp clamping for MiniMax
- Extract sendMinimaxRequest() following Chutes pattern (PR #4844)
with function calling and JSON Schema structured output support
- Clamp temperature to (0.01, 1.0] on backend; limit frontend UI max to 1.0
- Enable image inlining for MiniMax M2.7 model
- Add MiniMax to slash-commands model selector and tokenizer mapping
- Add minimax_model to default preset
* feat: add VLM-based vision support for MiniMax M2.7
M2.7 does not natively accept image input. When images are detected
in messages, pre-process them via the MiniMax VLM endpoint
(/v1/coding_plan/vlm) to convert images to text descriptions before
sending to the chat completions API. Uses the same API key.
* feat: add M2-her model to MiniMax provider
M2-her is MiniMax's dialogue/roleplay-optimized model with 64K context
and 2048 max completion tokens. Text-only (no vision).
* feat: add MiniMax China endpoint (minimaxi.com) support
Add endpoint selector (Global/China) for MiniMax, mirroring the
SiliconFlow pattern. Users can now choose between api.minimax.io
(international) and api.minimaxi.com (China domestic).
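The endpoint choice described above could be resolved along these lines. The two hosts come from this commit message. The '/v1' path on the China host, the 'global' / 'china' setting values, and the function name are assumptions made for the sketch.

```javascript
// Hosts are from the commit message; the setting values and the '/v1' path
// on the China host are assumptions for illustration.
const MINIMAX_API_URLS = new Map([
    ['global', 'https://api.minimax.io/v1'],
    ['china', 'https://api.minimaxi.com/v1'],
]);

function getMinimaxApiUrl(endpoint) {
    // Unknown or missing values fall back to the global endpoint.
    return MINIMAX_API_URLS.get(endpoint) ?? MINIMAX_API_URLS.get('global');
}

console.log(getMinimaxApiUrl('china'));
```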
* fix: merge consecutive same-role messages for MiniMax
MiniMax API rejects consecutive messages with the same role with
error 'invalid chat setting (2013)'. Merge them before sending.
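One way to perform the merge described above is sketched here. It assumes every message carries a plain string content field and joins merged text with a blank line. Both are assumptions, and the function name is hypothetical.

```javascript
// Sketch only: assumes every message has a string `content` field.
function mergeConsecutiveRoles(messages, separator = '\n\n') {
    const merged = [];
    for (const message of messages) {
        const previous = merged[merged.length - 1];
        if (previous && previous.role === message.role) {
            // Same role as the previous entry: append instead of adding a new turn.
            previous.content += separator + message.content;
        } else {
            // Copy so the caller's objects are not mutated.
            merged.push({ ...message });
        }
    }
    return merged;
}

const example = mergeConsecutiveRoles([
    { role: 'user', content: 'Hello' },
    { role: 'user', content: 'Are you there?' },
    { role: 'assistant', content: 'Yes.' },
]);
console.log(example.length);
```

Messages with array-style content (for example, multimodal parts) would need separate handling, which this sketch does not attempt.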
* review: address PR feedback on MiniMax provider
Backend (src/endpoints/backends/chat-completions.js):
- Drop the entire MiniMax VLM image-preprocessing path; vision is no
longer advertised for this provider, so M2.7 messages now go straight
to /chat/completions without a separate VLM round-trip.
- Drop the json_schema -> response_format mapping (MiniMax does not
document structured-output support; relying on it was speculative).
- Drop the backend temperature clamp; the same clamp now lives in the
frontend so the wire payload matches what the user sees.
- Drop the MINIMAX branch in /status that returned a hard-coded model
list; the frontend hardcodes the same list and bypasses /status via
noValidateSources, so the round-trip was wasted.
- Add a streaming Transform + non-streaming helper that move
<think>...</think> blocks from delta.content / message.content to
reasoning_content. MiniMax M2.x models emit chain-of-thought inline in
content; without this transform the raw <think> tags leak into the
rendered chat. Includes a state machine that holds back partial
marker bytes so a marker split across SSE chunks is still detected.
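The hold-back idea in the last backend item can be sketched as a small splitter that is fed one streamed chunk at a time. This is an illustrative reconstruction under stated assumptions, not the code from the PR: the class name, method names, and return shape are all hypothetical.

```javascript
// Hypothetical sketch of a chunk-safe <think> splitter.
const OPEN = '<think>';
const CLOSE = '</think>';

// Length of the longest suffix of `text` that is a proper prefix of `marker`.
function partialMarkerLength(text, marker) {
    const max = Math.min(text.length, marker.length - 1);
    for (let length = max; length > 0; length--) {
        if (text.endsWith(marker.slice(0, length))) {
            return length;
        }
    }
    return 0;
}

class ThinkSplitter {
    constructor() {
        this.inThink = false; // true while between <think> and </think>
        this.pending = '';    // held-back text that may be the start of a marker
    }

    // Feed one streamed chunk; returns the text to route to each field.
    push(chunk) {
        const out = { content: '', reasoning: '' };
        let buffer = this.pending + chunk;
        this.pending = '';
        while (buffer.length > 0) {
            const marker = this.inThink ? CLOSE : OPEN;
            const key = this.inThink ? 'reasoning' : 'content';
            const index = buffer.indexOf(marker);
            if (index === -1) {
                // No full marker: emit everything except a possible marker prefix.
                const keep = partialMarkerLength(buffer, marker);
                out[key] += buffer.slice(0, buffer.length - keep);
                this.pending = buffer.slice(buffer.length - keep);
                break;
            }
            out[key] += buffer.slice(0, index);
            buffer = buffer.slice(index + marker.length);
            this.inThink = !this.inThink;
        }
        return out;
    }

    // Call at end of stream to release any held-back text.
    flush() {
        const out = { content: '', reasoning: '' };
        out[this.inThink ? 'reasoning' : 'content'] = this.pending;
        this.pending = '';
        return out;
    }
}

const splitter = new ThinkSplitter();
for (const chunk of ['Hi <th', 'ink>plan', ' more</thi', 'nk>Answer']) {
    console.log(splitter.push(chunk));
}
```

Only the trailing characters that could begin a marker are held back, so ordinary text is still forwarded as soon as it arrives.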
Frontend:
- public/scripts/openai.js: add MINIMAX to noValidateSources so the key
is accepted without a /models call; remove the dead saveModelList
branch; clamp temperature to (0.0, 1.0] in createGenerationParameters.
- public/scripts/reasoning.js: add MINIMAX to the non-streaming
reasoning_content extraction case (the backend transform now produces
this field for MiniMax responses).
- public/scripts/slash-commands.js: add MINIMAX to the /api enum and
add a MiniMax case to /api-url so users can switch endpoint by
command.
- public/scripts/custom-request.js: pass minimax_endpoint through the
override-payload merge alongside the other per-source endpoint fields.
- public/scripts/tokenizers.js: stop returning openai_model (which was
always a MiniMax model id and thus an unknown tokenizer); fall back
to gpt-3.5-turbo for a coarse but functional estimate.
- public/scripts/tool-calling.js: add MINIMAX to supportedSources so
function-calling settings are exposed.
- public/index.html: drop the "-- Connect to the API --" placeholder
option from the model select (the model list is hardcoded and always
populated); remove minimax from the vision data-source attributes
on the inline-media controls.
- public/img/minimax.svg: replace the multicolor brand SVG with a
single-color currentColor version that matches the other provider
icons in the connect panel.
* review: drop backend <think> parsing, defer to frontend
Per reviewer feedback: SillyTavern's reasoningHandler / reasoning_auto_parse
setting already extracts <think>...</think> blocks on the client side, so the
backend doesn't need to rewrite MiniMax responses. Removes the SSE Transform,
the non-streaming helper, and the corresponding case in reasoning.js.
* fix: remove isImageInliningSupported declaration for MINIMAX
* fix: remove MINIMAX from stream reasoning parsing
* fix: add to autoconnect logic
* fix: add missing MINIMAX models from docs
* fix: frequency and presence penalties aren't supported for MINIMAX
* fix: use clamp function for adjusting temperature
* fix: pass minimax_endpoint from connection profile to ChatCompletionService
* fix: update supported APIs in slash command documentation
* fix: replace bespoke merge with standard MERGE_TOOLS processing
* fix: add data-i18n attributes for headers
---------
Co-authored-by: octo-patch <octo-patch@users.noreply.github.com>
Co-authored-by: octo-patch <octo-patch@github.com>
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
* feat: add Workers AI text embeddings and multimodal captioning
Extends the Cloudflare Workers AI integration to the vectors and
caption extensions.
Embeddings: adds workers_ai source to the vectors extension using the
OpenAI-compatible /v1/embeddings endpoint, with dynamic model listing
from the Cloudflare model search API.
Captioning: adds workers_ai as a multimodal caption API with dynamic
vision model discovery via the multimodal-models endpoint.
* Add logo svg
* Refactor caption dropdown population
* Fix order of sources
* feat: add error handling for missing Workers AI account ID
---------
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
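As a rough illustration of the embeddings source and the missing-account-ID check described above, a request could be assembled as shown below. The function name, parameter names, and error text are assumptions. The URL follows Cloudflare's documented OpenAI-compatible path for Workers AI, and nothing is sent over the network here.

```javascript
// Hypothetical helper: builds (but does not send) an embeddings request.
function buildWorkersAiEmbeddingRequest({ accountId, apiToken, model, input }) {
    // Fail early with a clear message instead of producing a malformed URL.
    if (!accountId) {
        throw new Error('Workers AI account ID is required');
    }
    const url = `https://api.cloudflare.com/client/v4/accounts/${encodeURIComponent(accountId)}/ai/v1/embeddings`;
    return {
        url,
        options: {
            method: 'POST',
            headers: {
                'Authorization': `Bearer ${apiToken}`,
                'Content-Type': 'application/json',
            },
            body: JSON.stringify({ model, input }),
        },
    };
}

const request = buildWorkersAiEmbeddingRequest({
    accountId: 'example-account', // placeholder value
    apiToken: 'example-token',    // placeholder value
    model: '@cf/baai/bge-base-en-v1.5',
    input: ['hello world'],
});
console.log(request.url);
```

The result can be passed to fetch as fetch(request.url, request.options).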
* feat: add Cloudflare Workers AI provider
Adds support for Cloudflare Workers AI using its OpenAI-compatible API.
Workers AI-specific handling includes:
- Model list fetching and capabilities detection
- Tokenizer auto-detection for typical hosted model families
- Streaming not supported when using structured output
Closes #5305
* Make the entire header clickable
* Add missing samplers
* Fix non-streaming reasoning parsing
---------
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
* feat(secrets): update readSecret function to accept optional secret ID
* add secret_id to ConnectionManagerRequestService payload
* fix: pass secret_id for Text Completion types
---------
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
* feat: add SiliconFlow.cn endpoint support and embedding vectors
Chat completion:
- Add endpoint selection dropdown (Global/.com vs China/.cn) to existing
SiliconFlow provider, following the Z.AI endpoint pattern
- Backend switches API URL based on selected endpoint
- Add /api-url slash command support for endpoint switching
Embeddings:
- Add SiliconFlow as a vector/embedding source (OpenAI-compatible)
- Support both .com and .cn endpoints via siliconflow_endpoint setting
borrowed from the main connection panel (Vertex AI pattern)
- Superset model list with platform attribution (.cn) markers
- Models: Qwen3-Embedding (0.6B/4B/8B) + BGE/BCE models (.cn only)
* Add filter by models type
* Load embedding models from endpoint
* Improve api-url command declaration
* Support endpoint override in custom-request service
---------
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
* Add gpt-5.3-chat-latest model support
- Add to OpenAI model dropdown (index.html)
- Add to captioning multimodal model list (caption/settings.html)
- Add to OPENAI_REASONING_EFFORT_MODELS (constants.js)
- Add OPENAI_FIXED_REASONING_EFFORT map to clamp effort to 'medium' (the only value this model accepts)
- Apply fixed effort override in both Azure and general OpenAI request paths (chat-completions.js)
- Update frontend gpt-5.x regex for parameter handling (openai.js)
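The fixed-effort override listed above could be expressed as a simple lookup. The constant name is taken from this commit message, but its shape (a Map keyed by model ID) and the helper name are assumptions made for the sketch.

```javascript
// Constant name is from the commit message; the Map shape is an assumption.
const OPENAI_FIXED_REASONING_EFFORT = new Map([
    ['gpt-5.3-chat-latest', 'medium'],
]);

// Hypothetical helper applied before building the request body.
function resolveReasoningEffort(model, requestedEffort) {
    // A fixed entry wins over whatever the user selected.
    return OPENAI_FIXED_REASONING_EFFORT.get(model) ?? requestedEffort;
}

console.log(resolveReasoningEffort('gpt-5.3-chat-latest', 'high'));
```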
* Update public/scripts/openai.js
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* feat(openrouter): disable reasoning if "Request model reasoning" is disabled
* feat(openrouter): map minimum reasoning to none if request reasoning is off
* Add hint how to disable reasoning
---------
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
* Implement Gemini thought signatures
* Implement streaming support for Gemini thought signatures
* Implement OpenRouter support for Gemini thought signatures
* Remove unnecessary extraction of thought signatures from response parts
* Update thought signature comments to remove explicit Gemini mention
* Fix thought_signature naming convention in message.extra
* Add thought_signatures to ReasoningMessageExtra typedef
* Prevent thought signatures from being sent to incompatible endpoints
* Move signatures to populateChatHistory, update for consistent casing
* Code clean-up
* Only send thought signatures if target model and API match original
* Implement content-hash thought signature mapping
* Change the data model + split for text/functions
* Don't include signatures in invocations if the model doesn't match
* Fix function description
* Remove misleading comment
* Handle OpenRouter signatures
* Improve message extra types
* Prevent modifying original invocations when removing signatures
* Fix return of openrouter non-streaming signatures
* Remove redundant array check
---------
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
* Send toolConfig block to Gemini if it is defined and a tools block is also present.
* Convert OAI tool_choice to Gemini functionCallingConfig for Gemini requests
* Remove blank line
---------
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
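A sketch of the tool_choice conversion mentioned above, assuming the usual OpenAI values ('auto', 'none', 'required', or a named function object) and Gemini's functionCallingConfig modes (AUTO, NONE, ANY). The function name is hypothetical and the mapping is one plausible reading, not necessarily the one used in this change.

```javascript
// Hypothetical converter from OpenAI-style tool_choice to a Gemini toolConfig.
function toGeminiToolConfig(toolChoice) {
    if (toolChoice === 'auto') {
        return { functionCallingConfig: { mode: 'AUTO' } };
    }
    if (toolChoice === 'none') {
        return { functionCallingConfig: { mode: 'NONE' } };
    }
    if (toolChoice === 'required') {
        return { functionCallingConfig: { mode: 'ANY' } };
    }
    // Named function: force a call and restrict it to that one function.
    const name = toolChoice?.type === 'function' ? toolChoice.function?.name : undefined;
    if (name) {
        return { functionCallingConfig: { mode: 'ANY', allowedFunctionNames: [name] } };
    }
    // Anything else: send no toolConfig and let the API default apply.
    return undefined;
}

console.log(JSON.stringify(toGeminiToolConfig({ type: 'function', function: { name: 'get_weather' } })));
```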
* Chat Completion: Reduce number of toggles in AI Response Configuration
* Consolidate migration logic
* Don't enable media inlining if image inlining was disabled
* Fix icons showing on media toggle off
* Update i18n