Commit Graph

883 Commits

Author SHA1 Message Date
Octopus aecbb9a2ee feat: add MiniMax as a chat completion provider (#5452)
* feat: add MiniMax as a chat completion provider

Add MiniMax (https://www.minimax.io) as a first-class chat completion
provider. MiniMax already has TTS integration in SillyTavern; this
extends support to LLM chat completions via their OpenAI-compatible API.

Supported models:
- MiniMax-M2.5 (default) — 204K context
- MiniMax-M2.5-highspeed — same capability, faster inference

Key implementation details:
- Reuses existing SECRET_KEYS.MINIMAX (shared with TTS)
- API endpoint: https://api.minimax.io/v1
- Temperature clamped to (0.0, 1.0] as required by MiniMax API
- Returns hardcoded model list since MiniMax doesn't expose /v1/models
- Full UI integration: model selector, sampler parameters, streaming

Co-Authored-By: octo-patch <octo-patch@users.noreply.github.com>

* feat: upgrade MiniMax default model to M2.7

- Add MiniMax-M2.7 and MiniMax-M2.7-highspeed to model list
- Set MiniMax-M2.7 as default model
- Keep all previous models as alternatives

* feat: independent request function, vision support, temp clamping for MiniMax

- Extract sendMinimaxRequest() following Chutes pattern (PR #4844)
  with function calling and JSON Schema structured output support
- Clamp temperature to (0.01, 1.0] on backend; limit frontend UI max to 1.0
- Enable image inlining for MiniMax M2.7 model
- Add MiniMax to slash-commands model selector and tokenizer mapping
- Add minimax_model to default preset

* feat: add VLM-based vision support for MiniMax M2.7

M2.7 does not natively accept image input. When images are detected
in messages, pre-process them via the MiniMax VLM endpoint
(/v1/coding_plan/vlm) to convert images to text descriptions before
sending to the chat completions API. Uses the same API key.

* feat: add M2-her model to MiniMax provider

M2-her is MiniMax's dialogue/roleplay-optimized model with 64K context
and 2048 max completion tokens. Text-only (no vision).

* feat: add MiniMax China endpoint (minimaxi.com) support

Add endpoint selector (Global/China) for MiniMax, mirroring the
SiliconFlow pattern. Users can now choose between api.minimax.io
(international) and api.minimaxi.com (China domestic).

* fix: merge consecutive same-role messages for MiniMax

MiniMax API rejects consecutive messages with the same role with
error 'invalid chat setting (2013)'. Merge them before sending.

* review: address PR feedback on MiniMax provider

Backend (src/endpoints/backends/chat-completions.js):
- Drop the entire MiniMax VLM image-preprocessing path; vision is no
  longer advertised for this provider, so M2.7 messages now go straight
  to /chat/completions without a separate VLM round-trip.
- Drop the json_schema -> response_format mapping (MiniMax does not
  document structured-output support; relying on it was speculative).
- Drop the backend temperature clamp; the same clamp now lives in the
  frontend so the wire payload matches what the user sees.
- Drop the MINIMAX branch in /status that returned a hard-coded model
  list; the frontend hardcodes the same list and bypasses /status via
  noValidateSources, so the round-trip was wasted.
- Add a streaming Transform + non-streaming helper that move
  <think>...</think> blocks from delta.content / message.content to
  reasoning_content. MiniMax M2.x emit chain-of-thought inline in
  content; without this transform the raw <think> tags leak into the
  rendered chat. Includes a state machine that holds back partial
  marker bytes so a marker split across SSE chunks is still detected.

Frontend:
- public/scripts/openai.js: add MINIMAX to noValidateSources so the key
  is accepted without a /models call; remove the dead saveModelList
  branch; clamp temperature to (0.0, 1.0] in createGenerationParameters.
- public/scripts/reasoning.js: add MINIMAX to the non-streaming
  reasoning_content extraction case (the backend transform now produces
  this field for MiniMax responses).
- public/scripts/slash-commands.js: add MINIMAX to the /api enum and
  add a MiniMax case to /api-url so users can switch endpoint by
  command.
- public/scripts/custom-request.js: pass minimax_endpoint through the
  override-payload merge alongside the other per-source endpoint fields.
- public/scripts/tokenizers.js: stop returning openai_model (which was
  always a MiniMax model id and thus an unknown tokenizer); fall back
  to gpt-3.5-turbo for a coarse but functional estimate.
- public/scripts/tool-calling.js: add MINIMAX to supportedSources so
  function-calling settings are exposed.
- public/index.html: drop the "-- Connect to the API --" placeholder
  option from the model select (the model list is hardcoded and always
  populated); remove minimax from the vision data-source attributes
  on the inline-media controls.
- public/img/minimax.svg: replace the multicolor brand SVG with a
  single-color currentColor version that matches the other provider
  icons in the connect panel.

* review: drop backend <think> parsing, defer to frontend

Per reviewer feedback: SillyTavern's reasoningHandler / reasoning_auto_parse
setting already extracts <think>...</think> blocks on the client side, so the
backend doesn't need to rewrite MiniMax responses. Removes the SSE Transform,
the non-streaming helper, and the corresponding case in reasoning.js.

* fix: remove isImageInliningSupported declaration for MINIMAX

* fix: remove MINIMAX from stream reasoning parsing

* fix: add to autoconnect logic

* fix: add missing MINIMAX models from docs

* fix: freq. and pres. pen aren't supported for MINIMAX

* fix: use clamp function for adjusting temperature

* fix: pass minimax_endpoint from connection profile to ChatCompletionService

* fix: update supported APIs in slash command documentation

* fix: replace bespoke merge with standard MERGE_TOOLS processing

* fix: add data-i18n attributes for headers

---------

Co-authored-by: octo-patch <octo-patch@users.noreply.github.com>
Co-authored-by: octo-patch <octo-patch@github.com>
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-04-24 00:43:05 +03:00
Wolfsblvt 552936a0d8 fix: exclude other group members' reasoning from prompt context in group chats (#5473)
In group chats, only include reasoning from the currently generating character instead of all group members. This prevents reasoning from other characters being injected into the prompt context when generating responses.

- Filter reasoning in coreChat loop based on message author matching name2
- Filter reasoning in setOpenAIMessages based on message author matching name2
- Add isOtherGroupMember check before adding reasoning to messages
2026-04-19 16:08:52 +03:00
ashishch432 d1e719eb48 add claude-opus-4-7 (#5465) 2026-04-19 15:47:40 +03:00
Reithan 051346c517 Enable interleaved tool reasoning for custom OpenAI-compat endpoints (#5445)
* enable interleaved tool reasoning for custom OpenAI-compat endpoints

Add chat_completion_sources.CUSTOM to interleaved_reasoning_providers so
that local OpenAI-compatible endpoints (e.g. KoboldCPP in Chat Completions
mode) can forward reasoning context in tool-call chains when the user has
configured Interleaved Thinking.

Also expose the Interleaved Thinking UI control for the Custom source so
users can actually opt in — previously the dropdown was hidden behind a
data-source="openrouter" guard.

The custom streaming path already correctly accumulates delta.reasoning_content
from streaming chunks; this change only removes the provider gate that was
silently discarding that data before it reached the API payload.

* don't override invocation reasoning with prior-turn assistant reasoning

When an invocation already has its own reasoning captured at execution
time, preserve it instead of replacing it with previousAssistantReasoning
from the backward scan. The override was correct when invocations never
carried their own reasoning, but now that the custom/openrouter paths
capture per-invocation reasoning, the unconditional replacement caused
all tool calls in a chain to receive the same stale reasoning from an
earlier unrelated assistant turn.

Fall back to previousAssistantReasoning only when clone.reasoning is empty.
2026-04-15 23:16:39 +03:00
Tony Gies 700fc05411 feat: add Cloudflare Workers AI provider (#5385)
* feat: add Cloudflare Workers AI provider

Adds support for Cloudflare Workers AI using its OpenAI-compatible API.

Workers AI-specific stuff includes:
- Model list fetching and capabilities detection
- Tokenizer auto-detection for typical hosted model families
- Streaming not supported when using structured output

Closes #5305

* Make the entire header clickable

* Add missing samplers

* Fix non-streaming reasoning parsing

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-04-06 00:24:47 +03:00
Cohee 21e8cf9060 glm-5v-turbo (#5393)
* glm-5v-turbo

* Add support for image and video inlining
2026-04-02 20:19:01 +03:00
Cohee bef7b39cbe Add glm-5.1 to models list (#5361) 2026-03-28 02:08:28 +02:00
Xiangzhe 2cb1861db6 feat: add SiliconFlow.cn chat completion and embedding support (#5316)
* feat: add SiliconFlow.cn endpoint support and embedding vectors

Chat completion:
- Add endpoint selection dropdown (Global/.com vs China/.cn) to existing
  SiliconFlow provider, following the Z.AI endpoint pattern
- Backend switches API URL based on selected endpoint
- Add /api-url slash command support for endpoint switching

Embeddings:
- Add SiliconFlow as a vector/embedding source (OpenAI-compatible)
- Support both .com and .cn endpoints via siliconflow_endpoint setting
  borrowed from the main connection panel (Vertex AI pattern)
- Superset model list with platform attribution (.cn) markers
- Models: Qwen3-Embedding (0.6B/4B/8B) + BGE/BCE models (.cn only)

* Add filter by models type

* Load embedding models from endpoint

* Improve api-url command declaration

* Support endpoint override in custom-request service

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-03-22 00:52:03 +02:00
Cohee bae4fd9f98 glm-5-turbo 2026-03-17 22:49:42 +02:00
Cohee 92dd9ecab2 gpt-5.4 2026-03-07 20:39:34 +02:00
Spicy Marinara f20aed95d0 Add gpt-5.3-chat-latest model support (#5241)
* Add gpt-5.3-chat-latest model support

- Add to OpenAI model dropdown (index.html)
- Add to captioning multimodal model list (caption/settings.html)
- Add to OPENAI_REASONING_EFFORT_MODELS (constants.js)
- Add OPENAI_FIXED_REASONING_EFFORT map to clamp effort to 'medium' (the only value this model accepts)
- Apply fixed effort override in both Azure and general OpenAI request paths (chat-completions.js)
- Update frontend gpt-5.x regex for parameter handling (openai.js)

* Update public/scripts/openai.js

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-03-04 20:04:04 +02:00
GERNOMA 8f5c8f0a8e Open router provider filter (#5208)
* Added filter for OpenRouter models provider selection

Now if a model is selected, only available providers for that model will show. Wanted to do the same for the quants, but I think the API is not returning the quants available for each model at the moment. Used existing API that for some reason was not consumed.

* Added filter for OpenRouter providers

Now if a model is selected, only the providers available show. Wanted to do the same with the quants but it seems the OpenRouter API is not giving the available quants list at the moment for each model.

* gua

* Now it also works on chat completion and only disables options

* detail

* Warning added

* eslint

* Move inline styles to CSS

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-02-24 20:40:42 +02:00
Brioch 0cef10f63f feat(openrouter): disable reasoning if Request model reasoning is off and effort is minimum (#5079)
* feat(openrouter): disable reasoning if "Request model reasoning" is disabled

* feat(openrouter): map minimum reasoning to none if request reasoning is off

* Add hint how to disable reasoning

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-02-23 21:19:04 +02:00
ZhenyaPav 0ba0418fac OpenRouter interleaved reasoning forwarding for tool-call continuations (#5160)
* fix(openrouter): forward reasoning across active tool-call chains

* feat(reasoning): add tool-chain forwarding toggle and honor edited reasoning

* feat(reasoning): add OpenRouter interleaved forwarding modes

* moved the reasoning forwarding dropdown into a separate line

* feat(reasoning): default tool reasoning forwarding to disabled

* refactor(openrouter): move tool reasoning mode to CC settings

Move OpenRouter tool reasoning forwarding control to response configuration and scope it to OpenRouter.

Store mode in chat completion settings (presettable), remove legacy power_user boolean/fallback, and use constants for mode values.

Preserve OpenRouter Gemini signature forwarding independently from plaintext tool reasoning mode.

* fix(openrouter): tighten active-chain reasoning forwarding

Use trailing contiguous tool-chain boundary for active-chain eligibility.

Also rename the UI control to Interleaved Thinking Forwarding and place selector on its own line.

* fix(openrouter): use adjacent assistant reasoning for tool calls

For interleaved thinking forwarding, source reasoning only from the immediately preceding assistant non-tool message.

Keep mode gating behavior unchanged and avoid history-window reasoning carryover.

* fix(openrouter): skip tool messages for reasoning source

When forwarding interleaved reasoning, ignore intervening tool result messages when resolving the preceding assistant reasoning source.

This keeps only the first tool call in a chain tied to a prior assistant reasoning block unless a later invocation carries its own reasoning.

* fix(openrouter): keep plaintext reasoning with signatures

Do not suppress forwarded tool-call reasoning when thought signatures are present.

* fix(openrouter): split interleaved thinking mode behavior

Restore distinct mode semantics: active_chain uses nearest assistant-text boundary after skipping tool/tool-call messages, while since_last_user scans for latest assistant reasoning since user.

Update UI label to Interleaved Thinking with right-aligned dropdown and explanatory tooltip.

* style(openrouter): align interleaved thinking dropdown row

Match OpenRouter interleaved thinking control layout with existing oneline-dropdown patterns.

Also update reasoning-forwarding inline comment wording for current mode behavior.

* docs(ui): clarify interleaved thinking tooltip

Use explicit API-request wording for OpenRouter interleaved thinking tooltip text.

* i18n(openrouter): localize interleaved thinking UI

Add locale keys for OpenRouter interleaved thinking label, mode options, and inline helper description.

Wire dropdown option text to data-i18n in index.html.

* fixed helper text wrapping

* fix(ui): make interleaved thinking helper text wrap

* i18n(openrouter): translate interleaved thinking labels

Replace placeholder English values for interleaved thinking keys in non-English locale files.

* fix(ui): restore interleaved thinking dropdown alignment

* Remove changes from en.json

* Type fixes

* Reworked the interleaved reasoning provider logic

* Renamed the variables in preparation for potential implementation for other providers

* Gate interleaved tool reasoning on reasoning request setting

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-02-22 13:15:52 +02:00
Spicy Marinara a923b0eefe Add gemini-3.1-pro-preview to Google AI Studio and Vertex model lists with thinking support (#5188) 2026-02-19 14:28:48 +02:00
Cohee 3bd1034639 claude-sonnet-4-6 2026-02-17 21:33:19 +02:00
Cohee 4d1619ba47 Chore: enable brace-style eslint check (#5159)
* eslint: enable brace-style check

* Fix jsdoc and color

* fix: correct CSS color syntax in CreateZenSliders function
2026-02-15 01:46:32 +02:00
Cohee 357da3219b Chore: Add code formatting conventions as eslint rules (#5158)
* Add code formatting conventions as eslint rules

* Improve formatting in addQuickReply
2026-02-15 01:16:34 +02:00
Copilot 1b5d65e34c Add GLM-5 to Z.AI model list (#5138)
* Initial plan

* Add glm-5 to Z.AI model list with 200k context

Co-authored-by: Cohee1207 <18619528+Cohee1207@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Cohee1207 <18619528+Cohee1207@users.noreply.github.com>
2026-02-12 00:21:08 +02:00
Lumi 39c8eb343c add option for claude-opus-4-6 (#5103)
* add option for claude-opus-4-6

* fix: add claude-opus-4-6 to limited sampling and verbosity model lists

* fix: disable assistant prefill for claude-opus-4-6

* refacor: merge fixthinkingPrefill and noPrefillModel

* 1m context

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-02-05 21:42:27 +02:00
Brioch 6c864e8bb2 feat(openrouter): add model quantizations setting (#5080)
* feat(openrouter): add model quantizations setting

* Remove bogus setting

* Simplify nullish coalescing assignment

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-01-30 23:51:22 +02:00
Cohee 10e8e01a55 Moonshot: Map "Request reasoning" to thinking type
Fixes #5072
2026-01-28 00:55:11 +02:00
Cohee 0e5b4de10c Moonshot: Pull vision flag from model data
Fixes #5068
2026-01-28 00:26:50 +02:00
Cohee 5a7875ba28 Update Pollinations API (#5060)
* Upgrade Pollinations API
Done: text, caption
To do: TTS, image
Fixes #5020

* Update Pollinations TTS to new API

* Update Pollinations API for images
2026-01-26 20:31:13 +02:00
DeclineThyself a09c1a7a84 Added 'dot-notation': ['error'] to .eslint.cjs (#5042)
* Added 'dot-notation': ['error'], to `.eslint.cjs`

* Ran `eslint --fix` to correct `dot-notation` errors.

* Added `eslint-disable dot-notation` anywhere errors were caused.

* Allowed dot-notation for uppercase properties: 'allowPattern': '[A-Z]\\w*$'

* Check if `rule instanceof CSSStyleRule`
https://github.com/SillyTavern/SillyTavern/pull/5042#discussion_r2711827148

* Fixed `await result.json();` types.

* refactor: update dot-notation usage in CoquiTtsProvider and PresetManager

---------

Co-authored-by: user <user@exmaple.com>
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-01-23 00:11:03 +02:00
Cohee ff78290fd0 glm-4.7-flash 2026-01-20 23:22:29 +02:00
Cohee 1bfabd9b77 Merge branch 'release' into staging 2026-01-11 02:49:11 +02:00
Cohee f861beb244 OpenRouter: Fix fallback model request 2026-01-11 02:48:58 +02:00
DeclineThyself 8372e7bf9d "gradually replacing property access with a dot operator" (#4965)
* "gradually replacing property access with a dot operator"
https://github.com/SillyTavern/SillyTavern/pull/4963#discussion_r2663003561

(?<=\w|\])\['([a-zA-Z]\w+)'\]
My regex found 593 matches across 47 files.
Also, two typos.

* Fixed chat[0].chat_metadata type error.
https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664275854

* Fixed `swipedElementsDiv[0]?.getAnimations().filter((a) => a.animationName` type error.
https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664274593

* Fixed config.MESSAGE_SANITIZE and config.MESSAGE_ALLOW_SYSTEM_UI type errors.
https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664266271

* Fixed group.date_last_chat type error.
https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664295652

* Reverted SlashCommandParser dot property access.
https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664310931

* LLM fixed canUseNegativeLookbehind.result; type error.
https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664314288

* Reverted chat-completions.js bodyParams and headers dot property access.

https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664317848
https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664320088
https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664324438

* Reverted openai.js data dot property access.

https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664326244

* Reverted tests/frontend/MacroEnvBuilder.e2e.js env.dynamicMacros dot property access.

https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664330990

* Partially reverted `window` dot property access.

* Reverted result.json() and settings dot property access.

* Reverted google.js headers dot property access.

* Fixed regex: `(?<=\w|\])\['([a-zA-Z]\w*)'\]`

* Swapped window to globalThis with dot property access.

* LLM fixed canUseNegativeLookbehind type.

* Refactor property access

* Consistency

---------

Co-authored-by: user <user@exmaple.com>
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-01-08 23:58:21 +02:00
Cohee 7320aa948d Audio inlining for OpenAI and Custom-compatible (#4964)
* Audio inlining for OpenAI and Custom-compatible

* Add context sizes

* chatgpt-image-latest

* Add quality control for gpt-image
2026-01-06 13:27:13 +02:00
Subwolf a8eb154517 Zai moonshot reverse proxy (#4923)
* adding reverse proxy support

* update

* added handling for the image caption extension
2025-12-28 23:52:04 +02:00
Cohee bb53da4c09 add missing zai vision models 2025-12-27 23:07:57 +02:00
equal-l2 d46fd60a57 Fix OpenRouter compatibility for OpenAI models (#4917)
* Fix OpenRouter compatibility for OpenAI models

* Don't disable o1 streaming on OpenRouter

* Fix OpenRouter compatibility for o1 model handling

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2025-12-27 18:44:37 +02:00
Cohee 11337d40ef Fix chutes status check 2025-12-27 01:39:48 +02:00
Cohee 373da39bf4 Add select2 style for NanoGPT list
Closes #4911
2025-12-23 00:09:39 +02:00
Cohee 0539583197 Add glm-4.7 model option and context mapping for Z.AI 2025-12-22 23:14:09 +02:00
Cohee c92939e56c Z.AI: Video inlining and 'coding' captions
Closes #4899
2025-12-17 23:31:14 +02:00
mightytribble 2cd2bd4a4d Implement Gemini thought signatures (#4886)
* Implement Gemini thought signatures

* Implement streaming support for Gemini thought signatures

* Implement OR support for Gemini thought signatures

* Remove unnecessary extraction of thought sigs from response parts

* Update thought sig comments to remove explicit Gemini mention

* Fix thought_signature naming convention in message.extra

* Add thought_signatures to ReasoningMessageExtra typedef

* Prevent thought sigs being sent to incompatible endpoints

* Move signatures to populateChatHistory, update for consistent casing

* Code clean-up

* Only send thought signatures if target model and API match original

* Implement content-hash thought signature mapping

* Change the data model + split for text/functions

* Don't include signature to invocations if the model doesn't match

* Fix function description

* Remove misleading comment

* Handle OpenRouter signatures

* Improve message extra types

* Prevent modifying original invocations when removing signatures

* Fix return of openrouter non-streaming signatures

* Remove redundant array check

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2025-12-17 22:23:47 +02:00
Cohee 5accfbc5d8 Gemini: add media resolution select (#4775) 2025-12-17 20:46:02 +02:00
Cohee 081c3e7b1c jerma 3 flash 2025-12-17 20:35:00 +02:00
Cohee 204d384d9b glm 4.6v 2025-12-13 17:41:24 +02:00
Cohee f3c963fa8e jippity 5.2 2025-12-13 17:26:31 +02:00
Cohee 9046fe8d2d Refactor CC API async route handlers (#4885)
* Improve error handling in CC /status and /generate endpoints

* Cancel pending status check on switching CC source
2025-12-11 23:31:46 +02:00
kashmirmydon c40caa050f Fix Mistral's Max Temperature 2025-12-03 21:27:31 +02:00
Cohee 090c8ec560 feat: add function to retrieve chat completion preset from settings 2025-12-03 21:17:06 +02:00
qvink a4cc9b3989 Facillitate extension use of ConnectionManagerRequestService (#4841)
* Separate prompt-building functionality from request-sending functionality

* removing logs and clarifying comments

* separating parameter construction functionality to allow ConnectionManagerRequestService to use all other preset parameters

* fixing chat completion issues, adding documentation to new functions.

* Improving ConnectionManagerRequestService errors. Adding parseReasoningFromString option to override reasoning template.

* Adjusting TextCompletionService prompt formatting

* linting

* Use settingsToUpdate to convert from OAI preset to OAI settings.

* lint

* throw errors when profile ID not found

* Fix missed instances of global completion settings being used (CC and TC), replaced with optional argument. Specified typing for ChatCompletionSettings and TextCompletionSettings.

* Adjusting parameters of parseReasoningFromString and adding getReasoningTemplateByName

* using messages.role as a fallback for custom requests, fixing newline removal.

* parameters => settings
I like how it sounds better

* ditto

* You know I had to do it to 'em

* Update getCustomTokenBans

* Fix calculateLogitBias

* Fix param attributes

* Fix type checks

* Less strict role type on ChatCompletionMessage

* Add missing space

* fixing getChatCompletionModel to use an arbitrary chat completion settings object

* Fixing issues with preset overriding custom data passed.

* Pass model to createGenerationParameters externally

* Unify seed param handling for CHUTES

* Fix non-existing CC source

* Use strict comparison

* Use global settings as a base for generation parameters creation

* removing unnecessary handling of preset fields

* don't pass preset prompts, use the passed payload override messages

* refactoring text generation prompt building of last line

* Pass model to getReasoningEffort

* Pass model name to canPerformToolCalls

* Pass model to createTextGenGenerationData

---------

Co-authored-by: qvink <qvink@users.noreply.github.com>
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2025-12-03 20:09:02 +02:00
Cohee fdc9b4c949 Fix: add a case in logprobs parsing function for AIMLAPI 2025-12-01 00:26:35 +02:00
Ben 55a07d445d Chutes integration (#4844)
* Chutes integration

* Fix eslint

* Fix key saving

* Fix logo coloration

* Fix tool checks

* Unhide image inlining controls

* Fix order of options

* Fix type use in TTS extension script

* Add Chutes as a vector storage source

* Change log levels to debug

* Fix streamed reasoning parsing

* Skip remote models update

* TTS: Fix API key highlight

* Sort image models A-Z

* TTS: Fixes

* Remove unused SD endpoint

* Skip setting context size if models list is not yet loaded

* remove chutes quota / balance

* Fix: streamed tool calling

* Hide reasoning effort control

* Add image request debug log

* Fix: scroll down on media load in extensions

* Unhide some samplers

* Bring back reasoning effort

* This code will never execute

* Reformat else if cases

* Add stop strings to request

* Remove conditional from reasoning_effort body param

* Preserve original pricing fields

* Unhide logit bias setting

* Pass repetition penalty and logit bias to backend

* Swap llama tokenizer for llama3

* Pass min_p, remove supported_sampling_parameters checks

* Enable logprobs

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2025-12-01 00:17:49 +02:00
Cohee 2eba10fa7f Gemini: Add image request settings (#4838)
* Gemini: Add image request settings

* Allow aspect ratio for 2.5 flash
2025-11-29 00:59:09 +02:00
Cohee 965b86da62 Add verbosity control (#4837)
* Add verbosity control

* Remove for Azure OpenAI
2025-11-28 19:49:59 +02:00