shaw/SillyTavern

Author	SHA1	Message	Date
Cohee	c249e5384c	feat: pass koboldcpp reasoning effort (#5491 ) Fixes #5489	2026-04-26 00:02:07 +03:00
Cohee	09d72828cb	feat: add gemma 4 for AI studio (#5493 ) * feat: add gemma 4 for AI studio * fix: update max context return value for gemma-3n-e4b-it model * refactor: iterate array of [regex, number] * gemma4: enable tool calling and sysprompt Co-authored-by: Copilot <copilot@github.com> --------- Co-authored-by: Copilot <copilot@github.com>	2026-04-25 22:22:55 +03:00
Dclef	77cbcd8774	feat: add DeepSeek V4 model support with thinking mode and reasoning effort (#5522 ) * fix: align DeepSeek provider with V4 API * Fix DeepSeek beta routing for standard chat completions * feat: add DeepSeek V4 model support with thinking mode and reasoning effort * Address DeepSeek review feedback * Set DeepSeek default model to v4 flash * fix: clean-up deprecated models, add migration * fix: move reasoning effort mapping to resolveReasoningEffort * fix: lint empty line * fix: remove duplicate code * fix: add coder model to migration logic --------- Co-authored-by: dclef <drclef233@gmail.com> Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2026-04-24 21:47:30 +03:00
Octopus	aecbb9a2ee	feat: add MiniMax as a chat completion provider (#5452 ) * feat: add MiniMax as a chat completion provider Add MiniMax (https://www.minimax.io) as a first-class chat completion provider. MiniMax already has TTS integration in SillyTavern; this extends support to LLM chat completions via their OpenAI-compatible API. Supported models: - MiniMax-M2.5 (default) — 204K context - MiniMax-M2.5-highspeed — same capability, faster inference Key implementation details: - Reuses existing SECRET_KEYS.MINIMAX (shared with TTS) - API endpoint: https://api.minimax.io/v1 - Temperature clamped to (0.0, 1.0] as required by MiniMax API - Returns hardcoded model list since MiniMax doesn't expose /v1/models - Full UI integration: model selector, sampler parameters, streaming Co-Authored-By: octo-patch <octo-patch@users.noreply.github.com> * feat: upgrade MiniMax default model to M2.7 - Add MiniMax-M2.7 and MiniMax-M2.7-highspeed to model list - Set MiniMax-M2.7 as default model - Keep all previous models as alternatives * feat: independent request function, vision support, temp clamping for MiniMax - Extract sendMinimaxRequest() following Chutes pattern (PR #4844) with function calling and JSON Schema structured output support - Clamp temperature to (0.01, 1.0] on backend; limit frontend UI max to 1.0 - Enable image inlining for MiniMax M2.7 model - Add MiniMax to slash-commands model selector and tokenizer mapping - Add minimax_model to default preset * feat: add VLM-based vision support for MiniMax M2.7 M2.7 does not natively accept image input. When images are detected in messages, pre-process them via the MiniMax VLM endpoint (/v1/coding_plan/vlm) to convert images to text descriptions before sending to the chat completions API. Uses the same API key. * feat: add M2-her model to MiniMax provider M2-her is MiniMax's dialogue/roleplay-optimized model with 64K context and 2048 max completion tokens. Text-only (no vision). 
* feat: add MiniMax China endpoint (minimaxi.com) support Add endpoint selector (Global/China) for MiniMax, mirroring the SiliconFlow pattern. Users can now choose between api.minimax.io (international) and api.minimaxi.com (China domestic). * fix: merge consecutive same-role messages for MiniMax MiniMax API rejects consecutive messages with the same role with error 'invalid chat setting (2013)'. Merge them before sending. * review: address PR feedback on MiniMax provider Backend (src/endpoints/backends/chat-completions.js): - Drop the entire MiniMax VLM image-preprocessing path; vision is no longer advertised for this provider, so M2.7 messages now go straight to /chat/completions without a separate VLM round-trip. - Drop the json_schema -> response_format mapping (MiniMax does not document structured-output support; relying on it was speculative). - Drop the backend temperature clamp; the same clamp now lives in the frontend so the wire payload matches what the user sees. - Drop the MINIMAX branch in /status that returned a hard-coded model list; the frontend hardcodes the same list and bypasses /status via noValidateSources, so the round-trip was wasted. - Add a streaming Transform + non-streaming helper that move <think>...</think> blocks from delta.content / message.content to reasoning_content. MiniMax M2.x emit chain-of-thought inline in content; without this transform the raw <think> tags leak into the rendered chat. Includes a state machine that holds back partial marker bytes so a marker split across SSE chunks is still detected. Frontend: - public/scripts/openai.js: add MINIMAX to noValidateSources so the key is accepted without a /models call; remove the dead saveModelList branch; clamp temperature to (0.0, 1.0] in createGenerationParameters. - public/scripts/reasoning.js: add MINIMAX to the non-streaming reasoning_content extraction case (the backend transform now produces this field for MiniMax responses). 
- public/scripts/slash-commands.js: add MINIMAX to the /api enum and add a MiniMax case to /api-url so users can switch endpoint by command. - public/scripts/custom-request.js: pass minimax_endpoint through the override-payload merge alongside the other per-source endpoint fields. - public/scripts/tokenizers.js: stop returning openai_model (which was always a MiniMax model id and thus an unknown tokenizer); fall back to gpt-3.5-turbo for a coarse but functional estimate. - public/scripts/tool-calling.js: add MINIMAX to supportedSources so function-calling settings are exposed. - public/index.html: drop the "-- Connect to the API --" placeholder option from the model select (the model list is hardcoded and always populated); remove minimax from the vision data-source attributes on the inline-media controls. - public/img/minimax.svg: replace the multicolor brand SVG with a single-color currentColor version that matches the other provider icons in the connect panel. * review: drop backend <think> parsing, defer to frontend Per reviewer feedback: SillyTavern's reasoningHandler / reasoning_auto_parse setting already extracts <think>...</think> blocks on the client side, so the backend doesn't need to rewrite MiniMax responses. Removes the SSE Transform, the non-streaming helper, and the corresponding case in reasoning.js. * fix: remove isImageInliningSupported declaration for MINIMAX * fix: remove MINIMAX from stream reasoning parsing * fix: add to autoconnect logic * fix: add missing MINIMAX models from docs
* fix: freq. and pres. pen aren't supported for MINIMAX * fix: use clamp function for adjusting temperature * fix: pass minimax_endpoint from connection profile to ChatCompletionService * fix: update supported APIs in slash command documentation * fix: replace bespoke merge with standard MERGE_TOOLS processing * fix: add data-i18n attributes for headers --------- Co-authored-by: octo-patch <octo-patch@users.noreply.github.com> Co-authored-by: octo-patch <octo-patch@github.com> Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2026-04-24 00:43:05 +03:00
ashishch432	d1e719eb48	add claude-opus-4-7 (#5465 )	2026-04-19 15:47:40 +03:00
Tony Gies	a9c377c3c8	feat: add Workers AI text embeddings and multimodal captioning (#5414 ) * feat: add Workers AI text embeddings and multimodal captioning Extends the Cloudflare Workers AI integration to the vectors and caption extensions. Embeddings: adds workers_ai source to the vectors extension using the OpenAI-compatible /v1/embeddings endpoint, with dynamic model listing from the Cloudflare model search API. Captioning: adds workers_ai as a multimodal caption API with dynamic vision model discovery via the multimodal-models endpoint. * Add logo svg * Refactor caption dropdown population * Fix order of sources * feat: add error handling for missing Workers AI account ID --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2026-04-08 23:43:21 +03:00
Tony Gies	700fc05411	feat: add Cloudflare Workers AI provider (#5385 ) * feat: add Cloudflare Workers AI provider Adds support for Cloudflare Workers AI using its OpenAI-compatible API. Workers AI-specific stuff includes: - Model list fetching and capabilities detection - Tokenizer auto-detection for typical hosted model families - Streaming not supported when using structured output Closes #5305 * Make the entire header clickable * Add missing samplers * Fix non-streaming reasoning parsing --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2026-04-06 00:24:47 +03:00
KKTsN	c9c652eece	fix: improve streaming error propagation and forwarded response logging (#5317 ) * Fix: Improve streaming error handling and forwarded response logging * Fix: fix ESLint error Strings must use singlequote quotes * fix: preserve and log forwarded stream errors * chore: narrow forwarded stream error fix scope * fix: make forwardFetchResponse awaitable and forward upstream error text * Restore original happy path handling * Remove redundant checks in forwardFetchResponse function * Don't send anything on parsing error end --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2026-04-05 23:01:47 +03:00
lunar sheep	ff1ca1412a	feat(secrets): update readSecret function to accept optional secret ID (#5356 ) * feat(secrets): update readSecret function to accept optional secret ID * add secret_id to ConnectionManagerRequestService payload * fix: pass secret_id for Text Completion types --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2026-03-30 22:30:45 +03:00
Xiangzhe	2cb1861db6	feat: add SiliconFlow.cn chat completion and embedding support (#5316 ) * feat: add SiliconFlow.cn endpoint support and embedding vectors Chat completion: - Add endpoint selection dropdown (Global/.com vs China/.cn) to existing SiliconFlow provider, following the Z.AI endpoint pattern - Backend switches API URL based on selected endpoint - Add /api-url slash command support for endpoint switching Embeddings: - Add SiliconFlow as a vector/embedding source (OpenAI-compatible) - Support both .com and .cn endpoints via siliconflow_endpoint setting borrowed from the main connection panel (Vertex AI pattern) - Superset model list with platform attribution (.cn) markers - Models: Qwen3-Embedding (0.6B/4B/8B) + BGE/BCE models (.cn only) * Add filter by models type * Load embedding models from endpoint * Improve api-url command declaration * Support endpoint override in custom-request service --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2026-03-22 00:52:03 +02:00
equal-l2	e834d3724b	Remove xAI web search capability (#5255 ) With web search on, the API now returns 410 Gone.	2026-03-07 16:58:56 +02:00
Spicy Marinara	f20aed95d0	Add gpt-5.3-chat-latest model support (#5241 ) * Add gpt-5.3-chat-latest model support - Add to OpenAI model dropdown (index.html) - Add to captioning multimodal model list (caption/settings.html) - Add to OPENAI_REASONING_EFFORT_MODELS (constants.js) - Add OPENAI_FIXED_REASONING_EFFORT map to clamp effort to 'medium' (the only value this model accepts) - Apply fixed effort override in both Azure and general OpenAI request paths (chat-completions.js) - Update frontend gpt-5.x regex for parameter handling (openai.js) * Update public/scripts/openai.js Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-03-04 20:04:04 +02:00
Cohee	3070cf26cd	Add config for adaptive thinking Fixes #5236	2026-03-03 20:10:39 +02:00
Cohee	63fa9c1d07	Claude: map Reasoning Effort to adaptive thinking config (#5219 ) Supersedes #5105	2026-03-01 17:11:22 +02:00
Cohee	744ce7705d	gemini-3.1-flash-image-preview	2026-02-27 20:26:22 +02:00
Brioch	0cef10f63f	feat(openrouter): disable reasoning if Request model reasoning is off and effort is minimum (#5079 ) * feat(openrouter): disable reasoning if "Request model reasoning" is disabled * feat(openrouter): map minimum reasoning to none if request reasoning is off * Add hint how to disable reasoning --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2026-02-23 21:19:04 +02:00
Spicy Marinara	a923b0eefe	Add gemini-3.1-pro-preview to Google AI Studio and Vertex model lists with thinking support (#5188 )	2026-02-19 14:28:48 +02:00
Cohee	3bd1034639	claude-sonnet-4-6	2026-02-17 21:33:19 +02:00
Cohee	46ea79bab5	Merge branch 'release' into staging	2026-02-15 15:57:51 +02:00
SenatusSPQR1	4672647293	Fix NanoGPT Claude cache detection for prefixed model IDs (#5164 )	2026-02-15 15:57:14 +02:00
Cohee	4d1619ba47	Chore: enable brace-style eslint check (#5159 ) * eslint: enable brace-style check * Fix jsdoc and color * fix: correct CSS color syntax in CreateZenSliders function	2026-02-15 01:46:32 +02:00
Lumi	39c8eb343c	add option for claude-opus-4-6 (#5103 ) * add option for claude-opus-4-6 * fix: add claude-opus-4-6 to limited sampling and verbosity model lists * fix: disable assistant prefill for claude-opus-4-6 * refactor: merge fixthinkingPrefill and noPrefillModel * 1m context --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2026-02-05 21:42:27 +02:00
Brioch	6c864e8bb2	feat(openrouter): add model quantizations setting (#5080 ) * feat(openrouter): add model quantizations setting * Remove bogus setting * Simplify nullish coalescing assignment --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2026-01-30 23:51:22 +02:00
Cohee	10e8e01a55	Moonshot: Map "Request reasoning" to thinking type Fixes #5072	2026-01-28 00:55:11 +02:00
Cohee	0e5b4de10c	Moonshot: Pull vision flag from model data Fixes #5068	2026-01-28 00:26:50 +02:00
Cohee	5a7875ba28	Update Pollinations API (#5060 ) * Upgrade Pollinations API Done: text, caption To do: TTS, image Fixes #5020 * Update Pollinations TTS to new API * Update Pollinations API for images	2026-01-26 20:31:13 +02:00
DeclineThyself	a09c1a7a84	Added `'dot-notation': ['error']` to `.eslint.cjs` (#5042 ) * Added 'dot-notation': ['error'], to `.eslint.cjs` * Ran `eslint --fix` to correct `dot-notation` errors. * Added `eslint-disable dot-notation` anywhere errors were caused. * Allowed dot-notation for uppercase properties: 'allowPattern': '[A-Z]\\w$' Check if `rule instanceof CSSStyleRule` https://github.com/SillyTavern/SillyTavern/pull/5042#discussion_r2711827148 * Fixed `await result.json();` types. * refactor: update dot-notation usage in CoquiTtsProvider and PresetManager --------- Co-authored-by: user <user@exmaple.com> Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2026-01-23 00:11:03 +02:00
Cohee	372db63cd5	NanoGPT: Add reasoning effort control Closes #4999	2026-01-12 21:05:02 +02:00
DeclineThyself	8372e7bf9d	"gradually replacing property access with a dot operator" (#4965 ) * "gradually replacing property access with a dot operator" https://github.com/SillyTavern/SillyTavern/pull/4963#discussion_r2663003561 (?<=\w\|\])\['([a-zA-Z]\w+)'\] My regex found 593 matches across 47 files. Also, two typos. * Fixed chat[0].chat_metadata type error. https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664275854 * Fixed `swipedElementsDiv[0]?.getAnimations().filter((a) => a.animationName` type error. https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664274593 * Fixed config.MESSAGE_SANITIZE and config.MESSAGE_ALLOW_SYSTEM_UI type errors. https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664266271 * Fixed group.date_last_chat type error. https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664295652 * Reverted SlashCommandParser dot property access. https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664310931 * LLM fixed canUseNegativeLookbehind.result; type error. https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664314288 * Reverted chat-completions.js bodyParams and headers dot property access. https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664317848 https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664320088 https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664324438 * Reverted openai.js data dot property access. https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664326244 * Reverted tests/frontend/MacroEnvBuilder.e2e.js env.dynamicMacros dot property access. https://github.com/SillyTavern/SillyTavern/pull/4965#discussion_r2664330990 * Partially reverted `window` dot property access. * Reverted result.json() and settings dot property access. * Reverted google.js headers dot property access. * Fixed regex: `(?<=\w\|\])\['([a-zA-Z]\w)'\]` Swapped window to globalThis with dot property access. 
* LLM fixed canUseNegativeLookbehind type. * Refactor property access * Consistency --------- Co-authored-by: user <user@exmaple.com> Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2026-01-08 23:58:21 +02:00
Cohee	7320aa948d	Audio inlining for OpenAI and Custom-compatible (#4964 ) * Audio inlining for OpenAI and Custom-compatible * Add context sizes * chatgpt-image-latest * Add quality control for gpt-image	2026-01-06 13:27:13 +02:00
Subwolf	a8eb154517	Zai moonshot reverse proxy (#4923 ) * adding reverse proxy support * update * added handling for the image caption extension	2025-12-28 23:52:04 +02:00
Ngo Dinh Gia Bao	829db7f2d0	[Electron Hub] Prompt Caching Support for Claude models (#4918 ) * Prompt Caching support Claude models * Prompt Caching support Claude models * Diff clean-up --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2025-12-28 17:05:54 +02:00
Ben	3668e95d95	filter out models that don't have a valid id (#4920 )	2025-12-27 02:33:25 +02:00
Chanho Chung	ca43796795	Add caching system prompt feature for OpenRouter Gemini (#4903 ) * feat: add caching system prompt for OpenRouter Gemini * fix: resolve reviews	2025-12-20 19:01:42 +02:00
Cohee	83ea6e5cbf	Better thonk effort for Gem 3	2025-12-18 21:40:52 +02:00
mightytribble	2cd2bd4a4d	Implement Gemini thought signatures (#4886 ) * Implement Gemini thought signatures * Implement streaming support for Gemini thought signatures * Implement OR support for Gemini thought signatures * Remove unnecessary extraction of thought sigs from response parts * Update thought sig comments to remove explicit Gemini mention * Fix thought_signature naming convention in message.extra * Add thought_signatures to ReasoningMessageExtra typedef * Prevent thought sigs being sent to incompatible endpoints * Move signatures to populateChatHistory, update for consistent casing * Code clean-up * Only send thought signatures if target model and API match original * Implement content-hash thought signature mapping * Change the data model + split for text/functions * Don't include signature to invocations if the model doesn't match * Fix function description * Remove misleading comment * Handle OpenRouter signatures * Improve message extra types * Prevent modifying original invocations when removing signatures * Fix return of openrouter non-streaming signatures * Remove redundant array check --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2025-12-17 22:23:47 +02:00
Cohee	081c3e7b1c	jerma 3 flash	2025-12-17 20:35:00 +02:00
Cohee	9046fe8d2d	Refactor CC API async route handlers (#4885 ) * Improve error handling in CC /status and /generate endpoints * Cancel pending status check on switching CC source	2025-12-11 23:31:46 +02:00
Chanho Chung	6fdeaa2cd9	fix: caching system prompt functionality for OpenRouter Claude (#4872 )	2025-12-10 20:22:54 +02:00
Cohee	9aff57c9c4	Add dummy reasoning_content for deepseek-reasoner tool calls #4857	2025-12-07 23:40:52 +02:00
Ben	55a07d445d	Chutes integration (#4844 ) * Chutes integration * Fix eslint * Fix key saving * Fix logo coloration * Fix tool checks * Unhide image inlining controls * Fix order of options * Fix type use in TTS extension script * Add Chutes as a vector storage source * Change log levels to debug * Fix streamed reasoning parsing * Skip remote models update * TTS: Fix API key highlight * Sort image models A-Z * TTS: Fixes * Remove unused SD endpoint * Skip setting context size if models list is not yet loaded * remove chutes quota / balance * Fix: streamed tool calling * Hide reasoning effort control * Add image request debug log * Fix: scroll down on media load in extensions * Unhide some samplers * Bring back reasoning effort * This code will never execute * Reformat else if cases * Add stop strings to request * Remove conditional from reasoning_effort body param * Preserve original pricing fields * Unhide logit bias setting * Pass repetition penalty and logit bias to backend * Swap llama tokenizer for llama3 * Pass min_p, remove supported_sampling_parameters checks * Enable logprobs --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2025-12-01 00:17:49 +02:00
mightytribble	1f78094322	Convert OAI tool_choice to Gemini functionCallingConfig for Gemini requests (#4840 ) * Send toolConfig block to Gemini, if defined and tools block also present. * Convert OAI tool_choice to Gemini functionCallingConfig for Gemini requests * Remove blank line --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2025-11-30 19:18:41 +02:00
Cohee	2eba10fa7f	Gemini: Add image request settings (#4838 ) * Gemini: Add image request settings * Allow aspect ratio for 2.5 flash	2025-11-29 00:59:09 +02:00
Cohee	3b59eae7c0	Gemini: Only register custom tools when there are no function tools	2025-11-28 20:22:05 +02:00
Cohee	068c6bdccd	Gemini: Fix search tool is not supported when function tools are used	2025-11-28 20:18:02 +02:00
mightytribble	32bbf4ec10	Support non-function native tools for Gemini * Enable retrieval tool type for VertexAI Gemini endpoints * Apply code suggestion --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2025-11-28 20:02:32 +02:00
Cohee	965b86da62	Add verbosity control (#4837 ) * Add verbosity control * Remove for Azure OpenAI	2025-11-28 19:49:59 +02:00
Cohee	0a22856faf	Chat Completion: Reduce number of toggles in AI Response Configuration (#4821 ) * Chat Completion: Reduce number of toggles in AI Response Configuration * Consolidate migration logic * Don't enable media inlining if image inlining was disabled * Fix icons showing on media toggle off * Update i18n	2025-11-28 00:16:23 +02:00
Cohee	3efcfbd1a2	Add new Claude model options and update regex checks for model validation	2025-11-24 21:55:29 +02:00
Cohee	248f5aa892	NanoGPT: Expose additional samplers	2025-11-24 20:36:51 +02:00
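
The MiniMax commit (aecbb9a2ee) describes two request-shaping steps: clamping temperature to (0.0, 1.0] and merging consecutive same-role messages, which the API otherwise rejects with 'invalid chat setting (2013)'. The following is a minimal sketch of that idea only, not the code from the repository; the same commit notes that the bespoke merge was later replaced with standard MERGE_TOOLS processing. The names clamp and mergeConsecutiveRoles, the '\n\n' separator, and the 0.01 lower bound are assumptions made for illustration.

```javascript
// Hypothetical helpers for illustration; not taken from the SillyTavern source.

/** Clamp a number into the closed range [min, max]. */
function clamp(value, min, max) {
    return Math.min(Math.max(value, min), max);
}

/**
 * Merge runs of consecutive messages that share a role into a single message.
 * Assumes every message has a string `content`; the input array is not mutated.
 */
function mergeConsecutiveRoles(messages) {
    const merged = [];
    for (const message of messages) {
        const last = merged[merged.length - 1];
        if (last && last.role === message.role) {
            // Same role as the previous entry: append instead of adding a new turn.
            last.content += '\n\n' + message.content;
        } else {
            // Shallow copy so the caller's objects stay untouched.
            merged.push({ ...message });
        }
    }
    return merged;
}

const messages = [
    { role: 'system', content: 'You are a narrator.' },
    { role: 'user', content: 'Hello.' },
    { role: 'user', content: 'Are you there?' },
    { role: 'assistant', content: 'Yes.' },
];

const payload = {
    messages: mergeConsecutiveRoles(messages),
    // 0.01 stands in for the open lower bound of (0.0, 1.0]; this is an assumption.
    temperature: clamp(1.7, 0.01, 1.0),
};
```

Here the two user turns collapse into one message, so the payload carries three messages, and the out-of-range temperature is pulled back to 1.0.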
