25 Commits

Author SHA1 Message Date
Octopus aecbb9a2ee feat: add MiniMax as a chat completion provider (#5452)
* feat: add MiniMax as a chat completion provider

Add MiniMax (https://www.minimax.io) as a first-class chat completion
provider. MiniMax already has TTS integration in SillyTavern; this
extends support to LLM chat completions via their OpenAI-compatible API.

Supported models:
- MiniMax-M2.5 (default) — 204K context
- MiniMax-M2.5-highspeed — same capability, faster inference

Key implementation details:
- Reuses existing SECRET_KEYS.MINIMAX (shared with TTS)
- API endpoint: https://api.minimax.io/v1
- Temperature clamped to (0.0, 1.0] as required by MiniMax API
- Returns hardcoded model list since MiniMax doesn't expose /v1/models
- Full UI integration: model selector, sampler parameters, streaming

Co-Authored-By: octo-patch <octo-patch@users.noreply.github.com>

* feat: upgrade MiniMax default model to M2.7

- Add MiniMax-M2.7 and MiniMax-M2.7-highspeed to model list
- Set MiniMax-M2.7 as default model
- Keep all previous models as alternatives

* feat: independent request function, vision support, temp clamping for MiniMax

- Extract sendMinimaxRequest() following Chutes pattern (PR #4844)
  with function calling and JSON Schema structured output support
- Clamp temperature to (0.01, 1.0] on backend; limit frontend UI max to 1.0
- Enable image inlining for MiniMax M2.7 model
- Add MiniMax to slash-commands model selector and tokenizer mapping
- Add minimax_model to default preset
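
The temperature clamp described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the function and constant names are hypothetical, and 0.01 is taken from the commit text as the smallest accepted value because the lower bound of the range excludes zero.

```javascript
// Hypothetical helper; bounds follow the commit text, names are illustrative only.
const MINIMAX_TEMP_MIN = 0.01; // assumed smallest accepted value (0 itself is excluded)
const MINIMAX_TEMP_MAX = 1.0;  // upper bound stated in the commit message

function clampMinimaxTemperature(temperature) {
    const value = Number(temperature);
    // Fall back to the upper bound when the input is not a usable number.
    if (!Number.isFinite(value)) {
        return MINIMAX_TEMP_MAX;
    }
    // Standard clamp: raise values below the minimum, lower values above the maximum.
    return Math.min(MINIMAX_TEMP_MAX, Math.max(MINIMAX_TEMP_MIN, value));
}
```

Values already inside the range pass through unchanged, so the payload only differs from the user's setting when that setting is out of bounds.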

* feat: add VLM-based vision support for MiniMax M2.7

M2.7 does not natively accept image input. When images are detected
in messages, pre-process them via the MiniMax VLM endpoint
(/v1/coding_plan/vlm) to convert images to text descriptions before
sending to the chat completions API. Uses the same API key.

* feat: add M2-her model to MiniMax provider

M2-her is MiniMax's dialogue/roleplay-optimized model with 64K context
and 2048 max completion tokens. Text-only (no vision).

* feat: add MiniMax China endpoint (minimaxi.com) support

Add endpoint selector (Global/China) for MiniMax, mirroring the
SiliconFlow pattern. Users can now choose between api.minimax.io
(international) and api.minimaxi.com (China domestic).
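
A minimal sketch of how such an endpoint selector might resolve to a base URL. The key names `global` and `china`, the function name, and the `/v1` path on the China host are assumptions; the commit text only names the two hosts.

```javascript
// Hypothetical mapping; only the host names come from the commit message.
const MINIMAX_ENDPOINTS = new Map([
    ['global', 'https://api.minimax.io/v1'],
    ['china', 'https://api.minimaxi.com/v1'], // '/v1' path assumed to mirror the global endpoint
]);

function getMinimaxBaseUrl(endpoint) {
    // Unknown or missing selections fall back to the international endpoint.
    return MINIMAX_ENDPOINTS.get(endpoint) ?? MINIMAX_ENDPOINTS.get('global');
}
```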

* fix: merge consecutive same-role messages for MiniMax

MiniMax API rejects consecutive messages with the same role with
error 'invalid chat setting (2013)'. Merge them before sending.
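
The merge step can be illustrated with a short sketch. The function name and the blank-line separator are assumptions, and only plain string contents are merged here; this is not the project's own merge routine, which a later commit in this PR replaces with the standard MERGE_TOOLS processing.

```javascript
// Hypothetical helper: collapse runs of same-role messages into one message each.
function mergeConsecutiveRoles(messages) {
    const merged = [];
    for (const message of messages) {
        const previous = merged[merged.length - 1];
        const canMerge = previous
            && previous.role === message.role
            && typeof previous.content === 'string'
            && typeof message.content === 'string';
        if (canMerge) {
            // Separator is an assumption; any delimiter the model tolerates would do.
            previous.content += '\n\n' + message.content;
        } else {
            // Shallow copy so the caller's message objects are never mutated.
            merged.push({ ...message });
        }
    }
    return merged;
}
```

Messages with non-string content (for example, arrays of content parts) are left unmerged in this sketch, so a real implementation would need to handle them separately.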

* review: address PR feedback on MiniMax provider

Backend (src/endpoints/backends/chat-completions.js):
- Drop the entire MiniMax VLM image-preprocessing path; vision is no
  longer advertised for this provider, so M2.7 messages now go straight
  to /chat/completions without a separate VLM round-trip.
- Drop the json_schema -> response_format mapping (MiniMax does not
  document structured-output support; relying on it was speculative).
- Drop the backend temperature clamp; the same clamp now lives in the
  frontend so the wire payload matches what the user sees.
- Drop the MINIMAX branch in /status that returned a hard-coded model
  list; the frontend hardcodes the same list and bypasses /status via
  noValidateSources, so the round-trip was wasted.
- Add a streaming Transform + non-streaming helper that move
  <think>...</think> blocks from delta.content / message.content to
  reasoning_content. MiniMax M2.x emit chain-of-thought inline in
  content; without this transform the raw <think> tags leak into the
  rendered chat. Includes a state machine that holds back partial
  marker bytes so a marker split across SSE chunks is still detected.
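
The hold-back idea in the last backend item can be sketched as a small splitter. This is an illustration of the technique under stated assumptions, not the transform from this commit (which a later commit in this PR removes): the class and function names are hypothetical, and the sketch works on already-decoded text chunks rather than on SSE frames.

```javascript
const THINK_OPEN = '<think>';
const THINK_CLOSE = '</think>';

// Length of the longest suffix of `text` that is a proper prefix of `marker`.
function partialMarkerLength(text, marker) {
    const max = Math.min(text.length, marker.length - 1);
    for (let len = max; len > 0; len--) {
        if (text.endsWith(marker.slice(0, len))) {
            return len;
        }
    }
    return 0;
}

// Hypothetical splitter: routes streamed text to `content` or `reasoning`.
class ThinkSplitter {
    constructor() {
        this.buffer = '';     // text not yet emitted (may end in a partial marker)
        this.inThink = false; // true while between <think> and </think>
    }

    // Feed one chunk; returns the text that is safe to emit right now.
    push(chunk) {
        this.buffer += chunk;
        const out = { content: '', reasoning: '' };
        while (true) {
            const marker = this.inThink ? THINK_CLOSE : THINK_OPEN;
            const key = this.inThink ? 'reasoning' : 'content';
            const index = this.buffer.indexOf(marker);
            if (index === -1) {
                // No full marker: emit everything except a possible marker prefix at the end.
                const held = partialMarkerLength(this.buffer, marker);
                const cut = this.buffer.length - held;
                out[key] += this.buffer.slice(0, cut);
                this.buffer = this.buffer.slice(cut);
                return out;
            }
            // Full marker found: emit the text before it, drop the marker, switch state.
            out[key] += this.buffer.slice(0, index);
            this.buffer = this.buffer.slice(index + marker.length);
            this.inThink = !this.inThink;
        }
    }

    // Call at end of stream: anything still held back was not a marker after all.
    flush() {
        const out = { content: '', reasoning: '' };
        out[this.inThink ? 'reasoning' : 'content'] = this.buffer;
        this.buffer = '';
        return out;
    }
}
```

Holding back at most one marker length minus one character keeps the added latency small while still catching a marker that is split across two or more chunks.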

Frontend:
- public/scripts/openai.js: add MINIMAX to noValidateSources so the key
  is accepted without a /models call; remove the dead saveModelList
  branch; clamp temperature to (0.0, 1.0] in createGenerationParameters.
- public/scripts/reasoning.js: add MINIMAX to the non-streaming
  reasoning_content extraction case (the backend transform now produces
  this field for MiniMax responses).
- public/scripts/slash-commands.js: add MINIMAX to the /api enum and
  add a MiniMax case to /api-url so users can switch endpoint by
  command.
- public/scripts/custom-request.js: pass minimax_endpoint through the
  override-payload merge alongside the other per-source endpoint fields.
- public/scripts/tokenizers.js: stop returning openai_model (which was
  always a MiniMax model id and thus an unknown tokenizer); fall back
  to gpt-3.5-turbo for a coarse but functional estimate.
- public/scripts/tool-calling.js: add MINIMAX to supportedSources so
  function-calling settings are exposed.
- public/index.html: drop the "-- Connect to the API --" placeholder
  option from the model select (the model list is hardcoded and always
  populated); remove minimax from the vision data-source attributes
  on the inline-media controls.
- public/img/minimax.svg: replace the multicolor brand SVG with a
  single-color currentColor version that matches the other provider
  icons in the connect panel.

* review: drop backend <think> parsing, defer to frontend

Per reviewer feedback: SillyTavern's reasoningHandler / reasoning_auto_parse
setting already extracts <think>...</think> blocks on the client side, so the
backend doesn't need to rewrite MiniMax responses. Removes the SSE Transform,
the non-streaming helper, and the corresponding case in reasoning.js.

* fix: remove isImageInliningSupported declaration for MINIMAX

* fix: remove MINIMAX from stream reasoning parsing

* fix: add to autoconnect logic

* fix: add missing MINIMAX models from docs

* fix: frequency and presence penalties aren't supported for MINIMAX

* fix: use clamp function for adjusting temperature

* fix: pass minimax_endpoint from connection profile to ChatCompletionService

* fix: update supported APIs in slash command documentation

* fix: replace bespoke merge with standard MERGE_TOOLS processing

* fix: add data-i18n attributes for headers

---------

Co-authored-by: octo-patch <octo-patch@users.noreply.github.com>
Co-authored-by: octo-patch <octo-patch@github.com>
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-04-24 00:43:05 +03:00
Ben 55a07d445d Chutes integration (#4844)
* Chutes integration

* Fix eslint

* Fix key saving

* Fix logo coloration

* Fix tool checks

* Unhide image inlining controls

* Fix order of options

* Fix type use in TTS extension script

* Add Chutes as a vector storage source

* Change log levels to debug

* Fix streamed reasoning parsing

* Skip remote models update

* TTS: Fix API key highlight

* Sort image models A-Z

* TTS: Fixes

* Remove unused SD endpoint

* Skip setting context size if models list is not yet loaded

* remove chutes quota / balance

* Fix: streamed tool calling

* Hide reasoning effort control

* Add image request debug log

* Fix: scroll down on media load in extensions

* Unhide some samplers

* Bring back reasoning effort

* This code will never execute

* Reformat else if cases

* Add stop strings to request

* Remove conditional from reasoning_effort body param

* Preserve original pricing fields

* Unhide logit bias setting

* Pass repetition penalty and logit bias to backend

* Swap llama tokenizer for llama3

* Pass min_p, remove supported_sampling_parameters checks

* Enable logprobs

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2025-12-01 00:17:49 +02:00
Cohee 59ba22fa9e "Squash system messages" is back 2025-11-28 17:21:12 +02:00
Cohee 0a22856faf Chat Completion: Reduce number of toggles in AI Response Configuration (#4821)
* Chat Completion: Reduce number of toggles in AI Response Configuration

* Consolidate migration logic

* Don't enable media inlining if image inlining was disabled

* Fix icons showing on media toggle off

* Update i18n
2025-11-28 00:16:23 +02:00
cloak1505 bc40d93b49 Remove dead Gemini 1.5 models, and clean up (#4636)
* Remove dead Gemini 1.5 models, and clean up

* Remove dead models (error 404): Gemini 1.5, `gemini-2.5-pro-exp-03-25`, `gemini-2.5-flash-preview-04-17`
* Adjust the Gemini → descriptions
* Assign default models to 2.5 Pro and Sonnet 4.5 (3.5 and 3.7 will be retiring soon)
* Add `gemini-2.5-flash-image`

* Don't forget learnlm-1.5-pro

* Update default claude

* Vertex: Clean-up 2.5 preview models

* Disable thinking for 2.5-flash-image

* Bring back banana preview

* Update defaults in more places

* Add gemini preview-09-2025 and robotics-er

* unbrick my last commit

* Add gemini-robotics-er to captions

* Set max context for gemini-robotics-er

dang

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2025-10-10 20:11:04 +03:00
Ngo Dinh Gia Bao 07a4007363 feat: [Electron Hub] Added Text-to-Speech, Prompt cost, Sort/Group/Se… (#4528)
* feat: [Electron Hub] Added Text-to-Speech, Prompt cost, Sort/Group/Search for model list

* feat: [Electron Hub] Added Text-to-Speech, Prompt cost, Sort/Group/Search for model list

* Update public/scripts/extensions/tts/electronhub.js

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Update public/scripts/openai.js

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Update public/scripts/openai.js

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* feat: [Electron Hub] Added Text-to-Speech, Prompt cost, Sort/Group/Search for model list

* feat: [Electron Hub] Show model capabilities

* Support logit_bias

* Small tweaks

* Added tokenizer selection logic

* Added tokenizer selection logic

* Fixed ESLint

* Small tweaks

* Split localization tags

* Fix formatting

* Refactor icons, add tool icon

* Support newer oai model tokenizers

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2025-09-16 22:22:27 +03:00
Cohee d641cbecc4 Remove Scale Spellbook from CC sources (#4293) 2025-07-24 22:02:45 +03:00
Cohee 7429726eb1 Remove Window AI from CC sources (#4294) 2025-07-22 19:45:54 +03:00
Cohee 5be2323e23 AI21: Add Jamba 1.7 models 2025-07-09 20:09:34 +03:00
NijikaMyWaifu 157315cd68 Add Vertex AI express mode support (#3977)
* Add Vertex AI express mode support
Split Google AI Studio and Vertex AI

* Add support for Vertex AI, including updating default models and related settings, modifying frontend HTML to include Vertex AI options, and adjusting request processing logic in the backend API.

* Log API name in the console

* Merge sysprompt toggles back

* Use Gemma tokenizers for Vertex and LearnLM

* AI Studio parity updates

* Add link to express mode doc. Also technically it's not a form

* Split title

* Use array includes

* Add support for Google Vertex AI in image captioning feature

* Specify caption API name, add to compression list

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2025-05-22 20:10:53 +03:00
Cohee 070de9df2d (CC) Move continue nudge at the end of completion (#3611)
* Move continue nudge at the end of completion
Closes #3607

* Move continue message together with nudge
2025-03-09 18:17:02 +02:00
bmen25124 7d568dd4e0 Generic generate methods (#3566)
* Improve sendOpenAIRequest/getTextGenGenerationData so they can use a custom API instead of only the active one

* Added missing model param

* Removed unnecessary variable

* active_oai_settings -> settings

* settings -> textgenerationwebui_settings

* Better presetToSettings names, simpler settings name in getTextGenGenerationData

* Removed unused jailbreak_system

* Reverted most core changes, new custom-request.js file

* Forced stream to false, removed duplicate method, exported settingsToUpdate

* Rewrite typedefs to define props one by one

* Added extractData param for simplicity

* Fixed typehints

* Fixed typehints (again)

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2025-03-03 10:30:20 +02:00
Cohee 7f94cb4bee CC: Simplify default wrappers for personality and scenario 2024-12-22 23:36:58 +02:00
Cohee 9382845dee Claude: remove user filler from prompt converter 2024-11-24 19:05:41 +02:00
Cohee 30af741c3e Deprecated forced instruct on OpenRouter for Chat Completion 2024-09-15 10:54:12 +03:00
Cohee 7d4b3e0800 Use J1.5 Large by default 2024-08-26 12:11:29 +03:00
Cohee 5fc16a2474 New AI21 Jamba + tokenizer 2024-08-26 12:07:36 +03:00
Cohee 32c48cf9fa Fix default value for OpenRouter Top A 2024-08-07 20:58:19 +03:00
Cohee 3a8614db94 Update models in default files 2024-08-01 00:53:45 +03:00
Cohee 5f0e74bd56 Rename PHI/aux UI fields 2024-07-21 14:29:13 +03:00
Succubyss c822b9e2da Implements Assistant Impersonation Prefill 2024-05-16 21:59:58 -05:00
Cohee e25c419491 Update Default chat comps preset 2024-03-24 17:09:28 +02:00
Cohee 965bb54f7d Option to add names to completion contents 2024-03-19 21:53:40 +02:00
Cohee e24fbfdc1d Update default OAI sampler parameters 2024-03-13 02:25:20 +02:00
Cohee 45df576f1c Re-add default presets for content manager 2023-12-03 15:07:21 +02:00