26 Commits

Author SHA1 Message Date
Cohee e5ae782705 Add option to return malformed JSON string from extractJsonFromData (#5578)
* feat: add option to return malformed JSON string from extractJsonFromData
Fixes #5569

* fix: add fallback for Perplexity case
2026-05-02 18:31:06 +03:00
Octopus aecbb9a2ee feat: add MiniMax as a chat completion provider (#5452)
* feat: add MiniMax as a chat completion provider

Add MiniMax (https://www.minimax.io) as a first-class chat completion
provider. MiniMax already has TTS integration in SillyTavern; this
extends support to LLM chat completions via their OpenAI-compatible API.

Supported models:
- MiniMax-M2.5 (default) — 204K context
- MiniMax-M2.5-highspeed — same capability, faster inference

Key implementation details:
- Reuses existing SECRET_KEYS.MINIMAX (shared with TTS)
- API endpoint: https://api.minimax.io/v1
- Temperature clamped to (0.0, 1.0] as required by MiniMax API
- Returns hardcoded model list since MiniMax doesn't expose /v1/models
- Full UI integration: model selector, sampler parameters, streaming

Co-Authored-By: octo-patch <octo-patch@users.noreply.github.com>

* feat: upgrade MiniMax default model to M2.7

- Add MiniMax-M2.7 and MiniMax-M2.7-highspeed to model list
- Set MiniMax-M2.7 as default model
- Keep all previous models as alternatives

* feat: independent request function, vision support, temp clamping for MiniMax

- Extract sendMinimaxRequest() following Chutes pattern (PR #4844)
  with function calling and JSON Schema structured output support
- Clamp temperature to (0.01, 1.0] on backend; limit frontend UI max to 1.0
- Enable image inlining for MiniMax M2.7 model
- Add MiniMax to slash-commands model selector and tokenizer mapping
- Add minimax_model to default preset
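
As a rough illustration of the temperature clamping described in this commit, here is a minimal sketch. The helper names and the fallback for non-numeric input are assumptions made for illustration; the bounds 0.01 and 1.0 come from the commit text, with 0.01 treated here as the smallest accepted value.

```javascript
// Hypothetical helpers for illustration; not the functions used in the PR.
const MINIMAX_TEMPERATURE_MIN = 0.01; // assumed lowest accepted value
const MINIMAX_TEMPERATURE_MAX = 1.0;

// Generic clamp: limits value to the inclusive range [min, max].
function clamp(value, min, max) {
    return Math.min(Math.max(value, min), max);
}

// Coerces the setting to a number and clamps it into the assumed MiniMax range.
// Non-numeric input falls back to the maximum (an arbitrary choice for this sketch).
function clampMinimaxTemperature(temperature) {
    const value = Number(temperature);
    if (!Number.isFinite(value)) {
        return MINIMAX_TEMPERATURE_MAX;
    }
    return clamp(value, MINIMAX_TEMPERATURE_MIN, MINIMAX_TEMPERATURE_MAX);
}
```

Clamping in one place keeps the value sent on the wire consistent with whatever limit the UI slider enforces.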

* feat: add VLM-based vision support for MiniMax M2.7

M2.7 does not natively accept image input. When images are detected
in messages, pre-process them via the MiniMax VLM endpoint
(/v1/coding_plan/vlm) to convert images to text descriptions before
sending to the chat completions API. Uses the same API key.

* feat: add M2-her model to MiniMax provider

M2-her is MiniMax's dialogue/roleplay-optimized model with 64K context
and 2048 max completion tokens. Text-only (no vision).

* feat: add MiniMax China endpoint (minimaxi.com) support

Add endpoint selector (Global/China) for MiniMax, mirroring the
SiliconFlow pattern. Users can now choose between api.minimax.io
(international) and api.minimaxi.com (China domestic).
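
The endpoint selection described above could look roughly like the sketch below. The setting values `global` and `china`, the function name, and the `/v1` path on the China host are assumptions; only the two host names and the global `/v1` URL appear in the commit text.

```javascript
// Hypothetical mapping for illustration; the keys and the China URL path are assumed.
const MINIMAX_ENDPOINTS = {
    global: 'https://api.minimax.io/v1',
    china: 'https://api.minimaxi.com/v1',
};

// Returns the base URL for the selected endpoint, defaulting to the global one.
function getMinimaxBaseUrl(endpoint) {
    return Object.hasOwn(MINIMAX_ENDPOINTS, endpoint)
        ? MINIMAX_ENDPOINTS[endpoint]
        : MINIMAX_ENDPOINTS.global;
}
```

Falling back to the global endpoint means a missing or unknown setting value still produces a usable URL.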

* fix: merge consecutive same-role messages for MiniMax

MiniMax API rejects consecutive messages with the same role with
error 'invalid chat setting (2013)'. Merge them before sending.
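
A minimal sketch of the merge step described above, assuming plain string message content and a blank-line separator. The function name is hypothetical, and a later commit in this PR replaces the bespoke merge with the standard MERGE_TOOLS processing, so this only illustrates the idea.

```javascript
// Hypothetical helper for illustration; not the merge code from the PR.
// Joins consecutive messages that share a role so no two neighbors have the same role.
function mergeConsecutiveRoles(messages) {
    const merged = [];
    for (const message of messages) {
        const last = merged[merged.length - 1];
        const canMerge = last
            && last.role === message.role
            && typeof last.content === 'string'
            && typeof message.content === 'string';
        if (canMerge) {
            // Assumed separator: a blank line between the joined texts.
            last.content += '\n\n' + message.content;
        } else {
            // Shallow copy so the caller's message objects are not mutated.
            merged.push({ ...message });
        }
    }
    return merged;
}
```

Messages with non-string content (for example, arrays of content parts) are left unmerged in this sketch.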

* review: address PR feedback on MiniMax provider

Backend (src/endpoints/backends/chat-completions.js):
- Drop the entire MiniMax VLM image-preprocessing path; vision is no
  longer advertised for this provider, so M2.7 messages now go straight
  to /chat/completions without a separate VLM round-trip.
- Drop the json_schema -> response_format mapping (MiniMax does not
  document structured-output support; relying on it was speculative).
- Drop the backend temperature clamp; the same clamp now lives in the
  frontend so the wire payload matches what the user sees.
- Drop the MINIMAX branch in /status that returned a hard-coded model
  list; the frontend hardcodes the same list and bypasses /status via
  noValidateSources, so the round-trip was wasted.
- Add a streaming Transform + non-streaming helper that move
  <think>...</think> blocks from delta.content / message.content to
  reasoning_content. MiniMax M2.x models emit chain-of-thought inline in
  content; without this transform the raw <think> tags leak into the
  rendered chat. Includes a state machine that holds back partial
  marker bytes so a marker split across SSE chunks is still detected.
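
The hold-back state machine mentioned in the last item can be sketched as follows. This is an illustrative reconstruction of the technique, not the code from the PR (which a later commit removes in favor of frontend parsing); the class and function names are hypothetical.

```javascript
// Hypothetical sketch for illustration; not the transform that was in the PR.
class ThinkTagSplitter {
    constructor() {
        this.inThink = false; // true while between <think> and </think>
        this.buffer = '';     // held-back text that may be the start of a marker
    }

    // Feeds one streamed chunk and returns the text that is safe to emit now.
    push(chunk) {
        const out = { content: '', reasoning: '' };
        this.buffer += chunk;
        for (;;) {
            const marker = this.inThink ? '</think>' : '<think>';
            const target = this.inThink ? 'reasoning' : 'content';
            const index = this.buffer.indexOf(marker);
            if (index === -1) {
                // No full marker: emit everything except a possible partial marker.
                const keep = partialMarkerLength(this.buffer, marker);
                out[target] += this.buffer.slice(0, this.buffer.length - keep);
                this.buffer = this.buffer.slice(this.buffer.length - keep);
                return out;
            }
            // Full marker found: emit the text before it, drop the marker, switch state.
            out[target] += this.buffer.slice(0, index);
            this.buffer = this.buffer.slice(index + marker.length);
            this.inThink = !this.inThink;
        }
    }

    // Call at end of stream to release any held-back text.
    flush() {
        const out = { content: '', reasoning: '' };
        out[this.inThink ? 'reasoning' : 'content'] = this.buffer;
        this.buffer = '';
        return out;
    }
}

// Length of the longest suffix of text that is a proper prefix of marker.
function partialMarkerLength(text, marker) {
    const max = Math.min(marker.length - 1, text.length);
    for (let length = max; length > 0; length--) {
        if (text.endsWith(marker.slice(0, length))) {
            return length;
        }
    }
    return 0;
}
```

A caller would invoke `push()` once per SSE delta, append the returned `content` and `reasoning` strings to their respective fields, and call `flush()` when the stream ends.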

Frontend:
- public/scripts/openai.js: add MINIMAX to noValidateSources so the key
  is accepted without a /models call; remove the dead saveModelList
  branch; clamp temperature to (0.0, 1.0] in createGenerationParameters.
- public/scripts/reasoning.js: add MINIMAX to the non-streaming
  reasoning_content extraction case (the backend transform now produces
  this field for MiniMax responses).
- public/scripts/slash-commands.js: add MINIMAX to the /api enum and
  add a MiniMax case to /api-url so users can switch endpoint by
  command.
- public/scripts/custom-request.js: pass minimax_endpoint through the
  override-payload merge alongside the other per-source endpoint fields.
- public/scripts/tokenizers.js: stop returning openai_model (which was
  always a MiniMax model id and thus an unknown tokenizer); fall back
  to gpt-3.5-turbo for a coarse but functional estimate.
- public/scripts/tool-calling.js: add MINIMAX to supportedSources so
  function-calling settings are exposed.
- public/index.html: drop the "-- Connect to the API --" placeholder
  option from the model select (the model list is hardcoded and always
  populated); remove minimax from the vision data-source attributes
  on the inline-media controls.
- public/img/minimax.svg: replace the multicolor brand SVG with a
  single-color currentColor version that matches the other provider
  icons in the connect panel.

* review: drop backend <think> parsing, defer to frontend

Per reviewer feedback: SillyTavern's reasoningHandler / reasoning_auto_parse
setting already extracts <think>...</think> blocks on the client side, so the
backend doesn't need to rewrite MiniMax responses. Removes the SSE Transform,
the non-streaming helper, and the corresponding case in reasoning.js.

* fix: remove isImageInliningSupported declaration for MINIMAX

* fix: remove MINIMAX from stream reasoning parsing

* fix: add to autoconnect logic

* fix: add missing MINIMAX models from docs

* fix: frequency and presence penalties aren't supported for MINIMAX

* fix: use clamp function for adjusting temperature

* fix: pass minimax_endpoint from connection profile to ChatCompletionService

* fix: update supported APIs in slash command documentation

* fix: replace bespoke merge with standard MERGE_TOOLS processing

* fix: add data-i18n attributes for headers

---------

Co-authored-by: octo-patch <octo-patch@users.noreply.github.com>
Co-authored-by: octo-patch <octo-patch@github.com>
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-04-24 00:43:05 +03:00
Wolfsblvt f4f390f325 Fix: Missing signature and toolSignatures fields in ChatCompletionService streaming state (#5439)
* Fix: Add signature and toolSignatures fields to ChatCompletionService streaming state object

* fix: pass images array to getStreamingReply

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-04-13 20:19:02 +03:00
Xiangzhe 2cb1861db6 feat: add SiliconFlow.cn chat completion and embedding support (#5316)
* feat: add SiliconFlow.cn endpoint support and embedding vectors

Chat completion:
- Add endpoint selection dropdown (Global/.com vs China/.cn) to existing
  SiliconFlow provider, following the Z.AI endpoint pattern
- Backend switches API URL based on selected endpoint
- Add /api-url slash command support for endpoint switching

Embeddings:
- Add SiliconFlow as a vector/embedding source (OpenAI-compatible)
- Support both .com and .cn endpoints via siliconflow_endpoint setting
  borrowed from the main connection panel (Vertex AI pattern)
- Superset model list with platform attribution (.cn) markers
- Models: Qwen3-Embedding (0.6B/4B/8B) + BGE/BCE models (.cn only)

* Add filter by models type

* Load embedding models from endpoint

* Improve api-url command declaration

* Support endpoint override in custom-request service

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-03-22 00:52:03 +02:00
Cohee 8ca836c3a1 custom-request: Pass api-url for Z.AI and Vertex and fix if omitted in profile (#4859) 2025-12-03 21:17:41 +02:00
qvink a4cc9b3989 Facilitate extension use of ConnectionManagerRequestService (#4841)
* Separate prompt-building functionality from request-sending functionality

* removing logs and clarifying comments

* separating parameter construction functionality to allow ConnectionManagerRequestService to use all other preset parameters

* fixing chat completion issues, adding documentation to new functions.

* Improving ConnectionManagerRequestService errors. Adding parseReasoningFromString option to override reasoning template.

* Adjusting TextCompletionService prompt formatting

* linting

* Use settingsToUpdate to convert from OAI preset to OAI settings.

* lint

* throw errors when profile ID not found

* Fix missed instances of global completion settings being used (CC and TC), replaced with optional argument. Specified typing for ChatCompletionSettings and TextCompletionSettings.

* Adjusting parameters of parseReasoningFromString and adding getReasoningTemplateByName

* using messages.role as a fallback for custom requests, fixing newline removal.

* parameters => settings
I like how it sounds better

* ditto

* You know I had to do it to 'em

* Update getCustomTokenBans

* Fix calculateLogitBias

* Fix param attributes

* Fix type checks

* Less strict role type on ChatCompletionMessage

* Add missing space

* fixing getChatCompletionModel to use an arbitrary chat completion settings object

* Fixing issues with preset overriding custom data passed.

* Pass model to createGenerationParameters externally

* Unify seed param handling for CHUTES

* Fix non-existing CC source

* Use strict comparison

* Use global settings as a base for generation parameters creation

* removing unnecessary handling of preset fields

* don't pass preset prompts, use the passed payload override messages

* refactoring text generation prompt building of last line

* Pass model to getReasoningEffort

* Pass model name to canPerformToolCalls

* Pass model to createTextGenGenerationData

---------

Co-authored-by: qvink <qvink@users.noreply.github.com>
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2025-12-03 20:09:02 +02:00
Cohee 0a22856faf Chat Completion: Reduce number of toggles in AI Response Configuration (#4821)
* Chat Completion: Reduce number of toggles in AI Response Configuration

* Consolidate migration logic

* Don't enable media inlining if image inlining was disabled

* Fix icons showing on media toggle off

* Update i18n
2025-11-28 00:16:23 +02:00
bmen25124 d48a3639ed Added "custom_prompt_post_processing" to custom-request (#4639) 2025-10-10 20:37:23 +03:00
bmen25124 6d4be9116d Added saveMetadataDebounced to context, fixed 0 values for custom-req… (#4386)
* Added saveMetadataDebounced to context, fixed 0 values for custom-request payloads, fixed system role prefix

* Update filtering logic

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2025-08-15 21:39:49 +03:00
bmen25124 cd176039ef Added structured output for common APIs (#4272)
* Added structured output for common APIs

* eslint

* Added frontend impl

* Type name change

* Unprefix json_schema, apply review suggestions

* Add schema to generateQuietPrompt, add comments

* Prettify diff

* Extract JSON from Claude response

* Add structured gen for Mistral

* Hack to support schema for DeepSeek

* Hack JSON schema for AI21

* Add Groq structured gen

* Add JSON mode for pollinations

* Add JSON schema for perplexity

* Add JSON schema for AIML

* Using extractJsonFromData in custom-request, added google rules for flattenSchema

* Fix response parsing

* Fix Google

* Fixed json parse

* Expose generateRaw to getContext

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2025-07-15 21:54:36 +03:00
omahs d7d20a67fa Fix typos 2025-05-29 11:56:59 +02:00
bmen25124 294dc3b3b1 Fixed index out of range error 2025-05-12 23:00:12 +03:00
bmen25124 6609c941a9 Added prefill for custom-request->text completion 2025-05-12 21:52:16 +03:00
bmen25124 fc5e0563ba Added ability to override request payload 2025-04-11 19:07:00 +03:00
bmen25124 4736f533a5 Added proxy support to ChatCompletionService 2025-04-11 19:04:32 +03:00
bmen25124 4e207c2cf0 Removed duplicate code 2025-03-26 23:38:02 +03:00
bmen25124 972b1e5fa7 Fixed variable naming, better jsdoc 2025-03-26 23:30:09 +03:00
bmen25124 a7d48b1aed Added overridable instruct settings, removed macro override 2025-03-26 23:21:48 +03:00
bmen25124 c5f251c6e3 Added stop string cleanup, better stopping string param 2025-03-26 22:35:10 +03:00
bmen25124 17e0058763 Changed options type 2025-03-21 22:16:57 +03:00
bmen25124 7619396053 Better naming 2025-03-21 22:09:41 +03:00
bmen25124 ec474f5571 Added stream support to "custom-request" 2025-03-21 20:44:09 +03:00
bmen25124 86de927ab9 Added "custom_url" to ChatCompletionService 2025-03-17 14:54:59 +03:00
bmen25124 d42a81f97c New connection manager events, ConnectionManagerRequestService (#3603) 2025-03-16 16:58:34 +02:00
bmen25124 28fc498ee6 Added error field check 2025-03-04 03:39:00 +03:00
bmen25124 7d568dd4e0 Generic generate methods (#3566)
* Improved sendOpenAIRequest/getTextGenGenerationData methods; they can now use a custom API instead of the active one

* Added missing model param

* Removed unnecessary variable

* active_oai_settings -> settings

* settings -> textgenerationwebui_settings

* Better presetToSettings names, simpler settings name in getTextGenGenerationData

* Removed unused jailbreak_system

* Reverted most core changes, new custom-request.js file

* Forced stream to false, removed duplicate method, exported settingsToUpdate

* Rewrite typedefs to define props one by one

* Added extractData param for simplicity

* Fixed typehints

* Fixed typehints (again)

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2025-03-03 10:30:20 +02:00