* feat: add MiniMax as a chat completion provider
Add MiniMax (https://www.minimax.io) as a first-class chat completion
provider. MiniMax already has TTS integration in SillyTavern; this
extends support to LLM chat completions via their OpenAI-compatible API.
Supported models:
- MiniMax-M2.5 (default) — 204K context
- MiniMax-M2.5-highspeed — same capability, faster inference
Key implementation details:
- Reuses existing SECRET_KEYS.MINIMAX (shared with TTS)
- API endpoint: https://api.minimax.io/v1
- Temperature clamped to (0.0, 1.0] as required by MiniMax API
- Returns hardcoded model list since MiniMax doesn't expose /v1/models
- Full UI integration: model selector, sampler parameters, streaming
Co-Authored-By: octo-patch <octo-patch@users.noreply.github.com>
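A minimal sketch of the temperature clamp mentioned above, assuming the half-open interval (0.0, 1.0] is approximated with a small positive lower bound. The helper name and the 0.01 floor are illustrative assumptions, not the project's actual code.

```javascript
// Hypothetical constants: 0.01 is an assumed stand-in for "just above zero",
// because the interval (0.0, 1.0] excludes 0 itself.
const MINIMAX_TEMP_MIN = 0.01;
const MINIMAX_TEMP_MAX = 1.0;

/**
 * Clamps a user-supplied temperature into the range MiniMax accepts.
 * Non-numeric input falls back to the upper bound.
 * @param {unknown} temperature Raw value from settings
 * @returns {number} Value within [MINIMAX_TEMP_MIN, MINIMAX_TEMP_MAX]
 */
function clampMinimaxTemperature(temperature) {
    const value = Number(temperature);
    if (!Number.isFinite(value)) {
        return MINIMAX_TEMP_MAX;
    }
    return Math.min(MINIMAX_TEMP_MAX, Math.max(MINIMAX_TEMP_MIN, value));
}
```

Values already inside the range pass through unchanged, so only out-of-range settings are altered before the request is built.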
* feat: upgrade MiniMax default model to M2.7
- Add MiniMax-M2.7 and MiniMax-M2.7-highspeed to model list
- Set MiniMax-M2.7 as default model
- Keep all previous models as alternatives
* feat: independent request function, vision support, temp clamping for MiniMax
- Extract sendMinimaxRequest() following Chutes pattern (PR #4844)
with function calling and JSON Schema structured output support
- Clamp temperature to (0.01, 1.0] on backend; limit frontend UI max to 1.0
- Enable image inlining for MiniMax M2.7 model
- Add MiniMax to slash-commands model selector and tokenizer mapping
- Add minimax_model to default preset
* feat: add VLM-based vision support for MiniMax M2.7
M2.7 does not natively accept image input. When images are detected
in messages, pre-process them via the MiniMax VLM endpoint
(/v1/coding_plan/vlm) to convert images to text descriptions before
sending to the chat completions API. Uses the same API key.
* feat: add M2-her model to MiniMax provider
M2-her is MiniMax's dialogue/roleplay-optimized model with 64K context
and 2048 max completion tokens. Text-only (no vision).
* feat: add MiniMax China endpoint (minimaxi.com) support
Add endpoint selector (Global/China) for MiniMax, mirroring the
SiliconFlow pattern. Users can now choose between api.minimax.io
(international) and api.minimaxi.com (China domestic).
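A rough sketch of how an endpoint selector like this could resolve to a base URL. The setting values (`global`, `china`), the helper name, and the `/v1` path on the China host are assumptions for illustration; only the two hostnames and the global `/v1` URL come from the commit messages.

```javascript
// Assumed mapping from a hypothetical setting value to a base URL.
const MINIMAX_ENDPOINTS = {
    global: 'https://api.minimax.io/v1',
    // The "/v1" suffix on the China host is an assumption mirroring the global URL.
    china: 'https://api.minimaxi.com/v1',
};

/**
 * Resolves the base URL for the selected endpoint, defaulting to the global one.
 * @param {string} endpoint Selected endpoint key
 * @returns {string} Base URL for API requests
 */
function getMinimaxBaseUrl(endpoint) {
    // Own-property check so keys like "toString" do not resolve to inherited members.
    const known = Object.prototype.hasOwnProperty.call(MINIMAX_ENDPOINTS, endpoint);
    return known ? MINIMAX_ENDPOINTS[endpoint] : MINIMAX_ENDPOINTS.global;
}
```

Falling back to the global endpoint keeps older presets, which have no endpoint setting, working without migration.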
* fix: merge consecutive same-role messages for MiniMax
MiniMax API rejects consecutive messages with the same role with
error 'invalid chat setting (2013)'. Merge them before sending.
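The merge step can be sketched roughly as follows. This is an illustrative helper, not the project's code (a later commit in this log replaces the bespoke merge with the standard MERGE_TOOLS processing); the function name and the blank-line separator are assumptions.

```javascript
/**
 * Merges adjacent messages that share a role into a single message.
 * Only string contents are merged; messages with non-string content
 * (for example, arrays of parts) are copied through untouched.
 * @param {{role: string, content: any}[]} messages Chat messages in order
 * @param {string} separator Text placed between merged contents (assumed default)
 * @returns {{role: string, content: any}[]} New array; inputs are not mutated
 */
function mergeConsecutiveRoles(messages, separator = '\n\n') {
    const merged = [];
    for (const message of messages) {
        const previous = merged[merged.length - 1];
        const canMerge = previous !== undefined
            && previous.role === message.role
            && typeof previous.content === 'string'
            && typeof message.content === 'string';
        if (canMerge) {
            // `previous` is already a copy, so the caller's objects stay intact.
            previous.content += separator + message.content;
        } else {
            merged.push({ ...message });
        }
    }
    return merged;
}
```

For example, two consecutive `user` messages followed by an `assistant` message come out as one `user` message and one `assistant` message.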
* review: address PR feedback on MiniMax provider
Backend (src/endpoints/backends/chat-completions.js):
- Drop the entire MiniMax VLM image-preprocessing path; vision is no
longer advertised for this provider, so M2.7 messages now go straight
to /chat/completions without a separate VLM round-trip.
- Drop the json_schema -> response_format mapping (MiniMax does not
document structured-output support; relying on it was speculative).
- Drop the backend temperature clamp; the same clamp now lives in the
frontend so the wire payload matches what the user sees.
- Drop the MINIMAX branch in /status that returned a hard-coded model
list; the frontend hardcodes the same list and bypasses /status via
noValidateSources, so the round-trip was wasted.
- Add a streaming Transform + non-streaming helper that move
<think>...</think> blocks from delta.content / message.content to
reasoning_content. MiniMax M2.x models emit chain-of-thought inline in
content; without this transform the raw <think> tags leak into the
rendered chat. Includes a state machine that holds back partial
marker bytes so a marker split across SSE chunks is still detected.
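The hold-back technique described in the last backend item can be illustrated with the sketch below. It is not the removed implementation (a later commit in this log drops the backend parsing in favor of the frontend); the class and function names are hypothetical, and it works on plain strings rather than SSE frames.

```javascript
const OPEN = '<think>';
const CLOSE = '</think>';

// Length of the longest suffix of `text` that is a proper prefix of `marker`.
function partialMarkerLength(text, marker) {
    const max = Math.min(text.length, marker.length - 1);
    for (let len = max; len > 0; len--) {
        if (text.endsWith(marker.slice(0, len))) {
            return len;
        }
    }
    return 0;
}

// Splits a chunked stream into visible content and reasoning text.
class ThinkSplitter {
    constructor() {
        this.inThink = false; // true while between <think> and </think>
        this.pending = '';    // held-back tail that may be the start of a marker
    }

    // Feeds one chunk; returns the text that is safe to emit for each channel.
    push(chunk) {
        let text = this.pending + chunk;
        this.pending = '';
        const out = { content: '', reasoning: '' };
        while (text.length > 0) {
            const marker = this.inThink ? CLOSE : OPEN;
            const key = this.inThink ? 'reasoning' : 'content';
            const index = text.indexOf(marker);
            if (index !== -1) {
                // Full marker found: emit what precedes it and switch state.
                out[key] += text.slice(0, index);
                text = text.slice(index + marker.length);
                this.inThink = !this.inThink;
                continue;
            }
            // No full marker: hold back a tail that could be a split marker.
            const held = partialMarkerLength(text, marker);
            out[key] += text.slice(0, text.length - held);
            this.pending = text.slice(text.length - held);
            text = '';
        }
        return out;
    }

    // Call at end of stream: a held-back tail was ordinary text after all.
    flush() {
        const out = { content: '', reasoning: '' };
        out[this.inThink ? 'reasoning' : 'content'] = this.pending;
        this.pending = '';
        return out;
    }
}
```

Feeding the chunks `'<thi'`, `'nk>plan</th'`, `'ink>Hello'` in order yields reasoning `plan` and content `Hello`, even though both markers are split across chunk boundaries.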
Frontend:
- public/scripts/openai.js: add MINIMAX to noValidateSources so the key
is accepted without a /models call; remove the dead saveModelList
branch; clamp temperature to (0.0, 1.0] in createGenerationParameters.
- public/scripts/reasoning.js: add MINIMAX to the non-streaming
reasoning_content extraction case (the backend transform now produces
this field for MiniMax responses).
- public/scripts/slash-commands.js: add MINIMAX to the /api enum and
add a MiniMax case to /api-url so users can switch endpoint by
command.
- public/scripts/custom-request.js: pass minimax_endpoint through the
override-payload merge alongside the other per-source endpoint fields.
- public/scripts/tokenizers.js: stop returning openai_model (which was
always a MiniMax model id and thus an unknown tokenizer); fall back
to gpt-3.5-turbo for a coarse but functional estimate.
- public/scripts/tool-calling.js: add MINIMAX to supportedSources so
function-calling settings are exposed.
- public/index.html: drop the "-- Connect to the API --" placeholder
option from the model select (the model list is hardcoded and always
populated); remove minimax from the vision data-source attributes
on the inline-media controls.
- public/img/minimax.svg: replace the multicolor brand SVG with a
single-color currentColor version that matches the other provider
icons in the connect panel.
* review: drop backend <think> parsing, defer to frontend
Per reviewer feedback: SillyTavern's reasoningHandler / reasoning_auto_parse
setting already extracts <think>...</think> blocks on the client side, so the
backend doesn't need to rewrite MiniMax responses. Removes the SSE Transform,
the non-streaming helper, and the corresponding case in reasoning.js.
* fix: remove isImageInliningSupported declaration for MINIMAX
* fix: remove MINIMAX from stream reasoning parsing
* fix: add to autoconnect logic
* fix: add missing MINIMAX models from docs
* fix: frequency and presence penalties aren't supported for MINIMAX
* fix: use clamp function for adjusting temperature
* fix: pass minimax_endpoint from connection profile to ChatCompletionService
* fix: update supported APIs in slash command documentation
* fix: replace bespoke merge with standard MERGE_TOOLS processing
* fix: add data-i18n attributes for headers
---------
Co-authored-by: octo-patch <octo-patch@users.noreply.github.com>
Co-authored-by: octo-patch <octo-patch@github.com>
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
* Add StreamingDisplay class for live LLM generation output with floating toast panel
- Add StreamingDisplay class to show streaming reasoning and content in a floating toast panel
- Extract createModelIcon() helper from insertSVGIcon() for reusable API/model icon creation
- StreamingDisplay automatically appends inside topmost open dialog (same pattern as fixToastrForDialogs)
- Add CSS with fade-in animation, pulsating activity indicator, and separate reasoning/content sections
- Support optional model icon in header
* Add ConnectionManagerRequestService.getProfileIcon() method for retrieving profile API icons
- Add static getProfileIcon() method to ConnectionManagerRequestService
- Returns HTMLImageElement created via createModelIcon() for a given profile's API/model
- Accepts optional profileId parameter, defaults to currently selected profile
- Returns null if Connection Manager is disabled, profile not found, or profile has no API
- Import createModelIcon from script.js
* Use animation_duration directly in hide() and CSS transition instead of constant
- Remove ANIMATION_DURATION_MS constant and use animation_duration directly in hide() method
- Replace hardcoded 0.3s CSS transitions with CSS variable var(--animation-duration, 125ms)
- Read animation_duration value inline in hide() for accurate timing
* Add /genstream slash command with live streaming display and reasoning support
- Add /genstream slash command that generates text via Connection Manager with live streaming UI
- Add formatReasoning() helper function (inverse of parseReasoningFromString) to format reasoning/content into template-wrapped strings
- Add connectionProfiles enum provider for profile selection in slash commands
- StreamingDisplay: add delay parameter to hide() method (default 1000ms) to show final result before dismiss
* Add /reasoning-format slash command to format reasoning and content into template-wrapped strings
- Add /reasoning-format (alias: /format-reasoning) slash command that wraps reasoning/content using Reasoning Formatting settings
- Accept required 'reasoning' named argument and optional unnamed 'content' argument
- Validate that prefix/suffix are configured before formatting
- Return formatted string via formatReasoning() helper for use with /reasoning-parse
- Show warning toasts if prefix/suffix missing
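As a rough illustration of what template-wrapping means here, the sketch below joins reasoning and content using a prefix, suffix, and separator. The function name, the template object shape, and the `null` return for a missing prefix or suffix are assumptions; the real formatReasoning() helper and the Reasoning Formatting settings may differ.

```javascript
/**
 * Wraps reasoning in a prefix/suffix pair and appends the content.
 * Hypothetical stand-in for the formatting step; returns null when the
 * template is incomplete so the caller can show a warning instead.
 * @param {string} reasoning Reasoning text to wrap
 * @param {string} [content] Optional message content to append
 * @param {{prefix?: string, suffix?: string, separator?: string}} template Assumed template shape
 * @returns {string|null} Wrapped string, or null if prefix/suffix is missing
 */
function formatReasoningSketch(reasoning, content, template) {
    const { prefix, suffix, separator = '' } = template;
    if (!prefix || !suffix) {
        return null;
    }
    return `${prefix}${reasoning}${suffix}${separator}${content ?? ''}`;
}
```

With prefix `<think>\n`, suffix `\n</think>`, and separator `\n\n`, the inputs `plan` and `Hi` produce a string that a matching parser can split back into the same two parts.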
* Rename /genstream command to /profile-genstream and move to appropriate module
* Apply messageFormatting to StreamingDisplay reasoning and content text for proper rendering
- Import messageFormatting from script.js
- Replace textContent with innerHTML using messageFormatting() in updateReasoning() and updateText()
- Pass isSystem=true for reasoning, isSystem=false for content to match formatting expectations
- Update CSS to handle pre-formatted paragraphs correctly
* Strip auto-added quotes from <q> tags in StreamingDisplay and add 'mes_text' class for consistent chat message formatting
- Add CSS rules to remove browser-default quotes from <q> tags in reasoning and content sections
- Add 'mes_text' class to textContent div to match chat message formatting behavior
- Prevents double quotes when messageFormatting already adds them via <q> tags
* Add minimize/close buttons and complete state to StreamingDisplay with configurable auto-hide
- Add minimize button to collapse/restore content sections while keeping header visible
- Add close button to manually dismiss display (generation continues in background)
- Replace CSS pseudo-element with explicit LED indicator element for better state control
- Add complete() method to mark generation done: changes LED from pulsing orange to solid green
- Add configurable auto-hide delay after completion
* Add stop button to StreamingDisplay with abort support and onStop/onComplete closures for /profile-genstream
- Add stop button to StreamingDisplay when onStop handler is provided
- Add markStopped() method with solid red LED state indicator
- Add AbortController integration to /profile-genstream for request cancellation
- Add onStop and onComplete closure arguments to /profile-genstream command
- Update complete() method signature to use options object with label and delay
- Disable stop button immediately
* Position StreamingDisplay above bottom form block using CSS variable with fallback
- Change bottom positioning from fixed 20px to dynamic calculation
- Use max() to position above --bottomFormBlockSize + 5px or minimum 20px
- Ensures StreamingDisplay doesn't overlap with bottom UI elements
* Rename /profile-genstream arguments for clarity: label→generating, completedLabel→completed, hideDelay→delay
- Rename `label` argument to `generating` to better reflect its purpose as the in-progress state label
- Rename `completedLabel` to `completed` for consistency and brevity
- Rename `hideDelay` to `delay` for simpler naming
- Update all internal references and variable names to match new argument names
- Update argument descriptions and default values accordingly
* Remove variable resolution from /profile-genstream arguments: system, length, and delay
- Remove ARGUMENT_TYPE.VARIABLE_NAME from typeList for system, length, and delay arguments
- Replace resolveVariable() calls with direct argument access for system, length, and delay
- Simplify type checking to use typeof directly on args properties
- Maintain existing default values and validation logic
* Add warning toast and early return when connection profile not found in /profile-genstream
- Display toastr warning when fuzzy search fails to find matching profile
- Return empty string to prevent execution with invalid profile
- Improves user feedback for incorrect profile names or IDs
* Extract buildResultText() helper in /profile-genstream to return partial results when stopped
- Add buildResultText() helper function to centralize result formatting logic
- Return partial generated text when user stops generation instead of empty string
- Reuse buildResultText() for both stopped and completed states
- Maintains consistent reasoning formatting in both cases
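The idea of returning partial output on stop can be sketched as below, assuming the generation loop checks an AbortSignal between chunks. All names are hypothetical, the chunk shape is assumed, and the hard-coded `<think>` wrapper stands in for whatever reasoning template is configured.

```javascript
// Formats whatever has accumulated so far; used for stopped and completed runs alike.
function buildResultTextSketch(state) {
    if (!state.reasoning) {
        return state.text;
    }
    // Placeholder wrapper; a real implementation would use the configured template.
    return `<think>${state.reasoning}</think>${state.text}`;
}

/**
 * Consumes chunks until the list ends or the signal is aborted.
 * @param {{text?: string, reasoning?: string}[]} chunks Assumed chunk shape
 * @param {AbortSignal} signal Aborted when the user presses stop
 * @param {(state: {text: string, reasoning: string}) => void} onChunk Called after each chunk
 * @returns {{stopped: boolean, result: string}} Partial or complete result
 */
function consumeChunks(chunks, signal, onChunk = () => {}) {
    const state = { text: '', reasoning: '' };
    for (const chunk of chunks) {
        if (signal.aborted) {
            // Stopped early: return what was generated instead of an empty string.
            return { stopped: true, result: buildResultTextSketch(state) };
        }
        state.reasoning += chunk.reasoning ?? '';
        state.text += chunk.text ?? '';
        onChunk(state);
    }
    return { stopped: false, result: buildResultTextSketch(state) };
}
```

Because both exits go through the same formatter, a stopped run and a completed run produce results with the same structure.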
* fix lint
* Update documentation to reflect argument name change from hideDelay to delay
---------
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
* feat: add Workers AI text embeddings and multimodal captioning
Extends the Cloudflare Workers AI integration to the vectors and
caption extensions.
Embeddings: adds workers_ai source to the vectors extension using the
OpenAI-compatible /v1/embeddings endpoint, with dynamic model listing
from the Cloudflare model search API.
Captioning: adds workers_ai as a multimodal caption API with dynamic
vision model discovery via the multimodal-models endpoint.
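A hedged sketch of building an embeddings request against the OpenAI-compatible endpoint, failing early when the account ID is missing. The helper name and parameter shape are hypothetical, and the URL layout is an assumption based on Cloudflare's public REST API conventions rather than something stated in this log.

```javascript
/**
 * Builds the URL and fetch options for a Workers AI embeddings call.
 * @param {{accountId: string, apiKey: string, model: string, input: string|string[]}} params Assumed parameter shape
 * @returns {{url: string, options: object}} Arguments suitable for fetch(url, options)
 */
function buildWorkersAiEmbeddingRequest({ accountId, apiKey, model, input }) {
    if (!accountId) {
        // The account ID is part of the URL path, so nothing can be sent without it.
        throw new Error('Workers AI account ID is required');
    }
    // Assumed URL layout for the OpenAI-compatible endpoint.
    const base = `https://api.cloudflare.com/client/v4/accounts/${encodeURIComponent(accountId)}/ai`;
    return {
        url: `${base}/v1/embeddings`,
        options: {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                'Authorization': `Bearer ${apiKey}`,
            },
            body: JSON.stringify({ model, input }),
        },
    };
}
```

Keeping request construction separate from the network call makes the missing-account-ID path easy to check without sending anything.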
* Add logo svg
* Refactor caption dropdown population
* Fix order of sources
* feat: add error handling for missing Workers AI account ID
---------
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
* feat(secrets): update readSecret function to accept optional secret ID
* add secret_id to ConnectionManagerRequestService payload
* fix: pass secret_id for Text Completion types
---------
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
* feat: add SiliconFlow.cn endpoint support and embedding vectors
Chat completion:
- Add endpoint selection dropdown (Global/.com vs China/.cn) to existing
SiliconFlow provider, following the Z.AI endpoint pattern
- Backend switches API URL based on selected endpoint
- Add /api-url slash command support for endpoint switching
Embeddings:
- Add SiliconFlow as a vector/embedding source (OpenAI-compatible)
- Support both .com and .cn endpoints via siliconflow_endpoint setting
borrowed from the main connection panel (Vertex AI pattern)
- Superset model list with platform attribution (.cn) markers
- Models: Qwen3-Embedding (0.6B/4B/8B) + BGE/BCE models (.cn only)
* Add filter by models type
* Load embedding models from endpoint
* Improve api-url command declaration
* Support endpoint override in custom-request service
---------
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
* Separate prompt-building functionality from request-sending functionality
* removing logs and clarifying comments
* separating parameter construction functionality to allow ConnectionManagerRequestService to use all other preset parameters
* fixing chat completion issues, adding documentation to new functions.
* Improving ConnectionManagerRequestService errors. Adding parseReasoningFromString option to override reasoning template.
* Adjusting TextCompletionService prompt formatting
* linting
* Use settingsToUpdate to convert from OAI preset to OAI settings.
* lint
* throw errors when profile ID not found
* Fix missed instances of global completion settings being used (CC and TC), replaced with optional argument. Specified typing for ChatCompletionSettings and TextCompletionSettings.
* Adjusting parameters of parseReasoningFromString and adding getReasoningTemplateByName
* using messages.role as a fallback for custom requests, fixing newline removal.
* parameters => settings
I like how it sounds better
* ditto
* You know I had to do it to 'em
* Update getCustomTokenBans
* Fix calculateLogitBias
* Fix param attributes
* Fix type checks
* Less strict role type on ChatCompletionMessage
* Add missing space
* fixing getChatCompletionModel to use an arbitrary chat completion settings object
* Fixing issues with preset overriding custom data passed.
* Pass model to createGenerationParameters externally
* Unify seed param handling for CHUTES
* Fix non-existing CC source
* Use strict comparison
* Use global settings as a base for generation parameters creation
* removing unnecessary handling of preset fields
* don't pass preset prompts, use the passed payload override messages
* refactoring text generation prompt building of last line
* Pass model to getReasoningEffort
* Pass model name to canPerformToolCalls
* Pass model to createTextGenGenerationData
---------
Co-authored-by: qvink <qvink@users.noreply.github.com>
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
* Add captioning for video attachments
* Unify error toast titles
* Add MEDIA_SOURCE enum and update media handling to include source information
* Unify attachment handling logic
* Add error handling for auto-captioning failures
* Use string formatting for console error
* Add UI elements
* Add support for model configuration
* fix: update API request parameters for improved handling
* Add logo img
* Fix tool calling with negative index
* Include tool calls into 'last in context' calculation
* feat: add support for captioning
* fixed merge conflicts
* Supported max tokens + fixed wrong image model mapping
* updated the logic
* replaced hard-coded reasoning_effort model list with a dynamic function
* Fix eslint
* Adjust reasoning effort logic
* Code clean-up
* Add logo
* Add inline image quality
* Fix multimodal models list
* Fix seed not passed
* Add "detail" error parser
---------
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
* add region option for vertex express mode to access more models
* fix: Consolidate duplicate 'vertexai_region' input
* feat: add vertex region suggestion datalist & update caption endpoint to correctly handle region
* Adjust global projectless Vertex endpoint
* Remove pointless slash trim
* Skill issue
---------
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
This commit refactors the Vertex AI integration to automatically derive the
Project ID from the provided Service Account JSON. This simplifies the
configuration process for users in "Full" (service account) authentication
mode by removing the need to specify the Project ID separately.
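Deriving the project ID can be sketched as follows. Google service account key files carry a `project_id` field; the helper name and error messages are illustrative assumptions, not the project's actual code.

```javascript
/**
 * Extracts the project ID from a Service Account JSON string.
 * @param {string} serviceAccountJson Raw JSON text of the key file
 * @returns {string} The project ID embedded in the key
 * @throws {Error} If the text is not valid JSON or has no project_id
 */
function getProjectIdFromServiceAccount(serviceAccountJson) {
    let parsed;
    try {
        parsed = JSON.parse(serviceAccountJson);
    } catch {
        throw new Error('Service Account JSON is not valid JSON');
    }
    const projectId = parsed?.project_id;
    if (typeof projectId !== 'string' || projectId.length === 0) {
        throw new Error('Service Account JSON has no project_id field');
    }
    return projectId;
}
```

Validating at this point lets the caller report a clear configuration error before any request is made.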
- Updated index.html to include options for Vertex AI authentication modes (Express and Full).
- Enhanced openai.js to manage Vertex AI settings, including project ID, region, and service account JSON.
- Added validation and handling for service account JSON in the backend.
- Modified API request handling in google.js to support both authentication modes for Vertex AI.
- Updated secrets.js to include a key for storing Vertex AI service account JSON.
- Improved error handling and user feedback for authentication issues.
* Add Vertex AI express mode support
Split Google AI Studio and Vertex AI
* Add support for Vertex AI, including updating default models and related settings, modifying frontend HTML to include Vertex AI options, and adjusting request processing logic in the backend API.
* Log API name in the console
* Merge sysprompt toggles back
* Use Gemma tokenizers for Vertex and LearnLM
* AI Studio parity updates
* Add link to express mode doc. Also technically it's not a form
* Split title
* Use array includes
* Add support for Google Vertex AI in image captioning feature
* Specify caption API name, add to compression list
---------
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>