shaw/SillyTavern

Author	SHA1	Message	Date
Octopus	aecbb9a2ee	feat: add MiniMax as a chat completion provider (#5452 ) * feat: add MiniMax as a chat completion provider Add MiniMax (https://www.minimax.io) as a first-class chat completion provider. MiniMax already has TTS integration in SillyTavern; this extends support to LLM chat completions via their OpenAI-compatible API. Supported models: - MiniMax-M2.5 (default) — 204K context - MiniMax-M2.5-highspeed — same capability, faster inference Key implementation details: - Reuses existing SECRET_KEYS.MINIMAX (shared with TTS) - API endpoint: https://api.minimax.io/v1 - Temperature clamped to (0.0, 1.0] as required by MiniMax API - Returns hardcoded model list since MiniMax doesn't expose /v1/models - Full UI integration: model selector, sampler parameters, streaming Co-Authored-By: octo-patch <octo-patch@users.noreply.github.com> * feat: upgrade MiniMax default model to M2.7 - Add MiniMax-M2.7 and MiniMax-M2.7-highspeed to model list - Set MiniMax-M2.7 as default model - Keep all previous models as alternatives * feat: independent request function, vision support, temp clamping for MiniMax - Extract sendMinimaxRequest() following Chutes pattern (PR #4844) with function calling and JSON Schema structured output support - Clamp temperature to (0.01, 1.0] on backend; limit frontend UI max to 1.0 - Enable image inlining for MiniMax M2.7 model - Add MiniMax to slash-commands model selector and tokenizer mapping - Add minimax_model to default preset * feat: add VLM-based vision support for MiniMax M2.7 M2.7 does not natively accept image input. When images are detected in messages, pre-process them via the MiniMax VLM endpoint (/v1/coding_plan/vlm) to convert images to text descriptions before sending to the chat completions API. Uses the same API key. * feat: add M2-her model to MiniMax provider M2-her is MiniMax's dialogue/roleplay-optimized model with 64K context and 2048 max completion tokens. Text-only (no vision). 
* feat: add MiniMax China endpoint (minimaxi.com) support Add endpoint selector (Global/China) for MiniMax, mirroring the SiliconFlow pattern. Users can now choose between api.minimax.io (international) and api.minimaxi.com (China domestic). * fix: merge consecutive same-role messages for MiniMax MiniMax API rejects consecutive messages with the same role with error 'invalid chat setting (2013)'. Merge them before sending. * review: address PR feedback on MiniMax provider Backend (src/endpoints/backends/chat-completions.js): - Drop the entire MiniMax VLM image-preprocessing path; vision is no longer advertised for this provider, so M2.7 messages now go straight to /chat/completions without a separate VLM round-trip. - Drop the json_schema -> response_format mapping (MiniMax does not document structured-output support; relying on it was speculative). - Drop the backend temperature clamp; the same clamp now lives in the frontend so the wire payload matches what the user sees. - Drop the MINIMAX branch in /status that returned a hard-coded model list; the frontend hardcodes the same list and bypasses /status via noValidateSources, so the round-trip was wasted. - Add a streaming Transform + non-streaming helper that move <think>...</think> blocks from delta.content / message.content to reasoning_content. MiniMax M2.x emit chain-of-thought inline in content; without this transform the raw <think> tags leak into the rendered chat. Includes a state machine that holds back partial marker bytes so a marker split across SSE chunks is still detected. Frontend: - public/scripts/openai.js: add MINIMAX to noValidateSources so the key is accepted without a /models call; remove the dead saveModelList branch; clamp temperature to (0.0, 1.0] in createGenerationParameters. - public/scripts/reasoning.js: add MINIMAX to the non-streaming reasoning_content extraction case (the backend transform now produces this field for MiniMax responses). 
- public/scripts/slash-commands.js: add MINIMAX to the /api enum and add a MiniMax case to /api-url so users can switch endpoint by command. - public/scripts/custom-request.js: pass minimax_endpoint through the override-payload merge alongside the other per-source endpoint fields. - public/scripts/tokenizers.js: stop returning openai_model (which was always a MiniMax model id and thus an unknown tokenizer); fall back to gpt-3.5-turbo for a coarse but functional estimate. - public/scripts/tool-calling.js: add MINIMAX to supportedSources so function-calling settings are exposed. - public/index.html: drop the "-- Connect to the API --" placeholder option from the model select (the model list is hardcoded and always populated); remove minimax from the vision data-source attributes on the inline-media controls. - public/img/minimax.svg: replace the multicolor brand SVG with a single-color currentColor version that matches the other provider icons in the connect panel. * review: drop backend <think> parsing, defer to frontend Per reviewer feedback: SillyTavern's reasoningHandler / reasoning_auto_parse setting already extracts <think>...</think> blocks on the client side, so the backend doesn't need to rewrite MiniMax responses. Removes the SSE Transform, the non-streaming helper, and the corresponding case in reasoning.js. * fix: remove isImageInliningSupported declaration for MINIMAX * fix: remove MINIMAX from stream reasoning parsing * fix: add to autoconnect logic * fix: add missing MINIMAX models from docs * fix: freq. and pres. 
pen aren't supported for MINIMAX * fix: use clamp function for adjusting temperature * fix: pass minimax_endpoint from connection profile to ChatCompletionService * fix: update supported APIs in slash command documentation * fix: replace bespoke merge with standard MERGE_TOOLS processing * fix: add data-i18n attributes for headers --------- Co-authored-by: octo-patch <octo-patch@users.noreply.github.com> Co-authored-by: octo-patch <octo-patch@github.com> Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2026-04-24 00:43:05 +03:00
Wolfsblvt	64c96e895c	Add Streaming Display Utility and New Generation Slash Commands (`/genstream`, `/reasoning-format`) (#5438 ) * Add StreamingDisplay class for live LLM generation output with floating toast panel - Add StreamingDisplay class to show streaming reasoning and content in a floating toast panel - Extract createModelIcon() helper from insertSVGIcon() for reusable API/model icon creation - StreamingDisplay automatically appends inside topmost open dialog (same pattern as fixToastrForDialogs) - Add CSS with fade-in animation, pulsating activity indicator, and separate reasoning/content sections - Support optional model icon in header * Add ConnectionManagerRequestService.getProfileIcon() method for retrieving profile API icons - Add static getProfileIcon() method to ConnectionManagerRequestService - Returns HTMLImageElement created via createModelIcon() for a given profile's API/model - Accepts optional profileId parameter, defaults to currently selected profile - Returns null if Connection Manager is disabled, profile not found, or profile has no API - Import createModelIcon from script.js * Use animation_duration directly in hide() and CSS transition instead of constant - Remove ANIMATION_DURATION_MS constant and use animation_duration directly in hide() method - Replace hardcoded 0.3s CSS transitions with CSS variable var(--animation-duration, 125ms) - Read animation_duration value inline in hide() for accurate timing * Add /genstream slash command with live streaming display and reasoning support - Add /genstream slash command that generates text via Connection Manager with live streaming UI - Add formatReasoning() helper function (inverse of parseReasoningFromString) to format reasoning/content into template-wrapped strings - Add connectionProfiles enum provider for profile selection in slash commands - StreamingDisplay: add delay parameter to hide() method (default 1000ms) to show final result before dismiss * Add /reasoning-format slash command 
to format reasoning and content into template-wrapped strings - Add /reasoning-format (alias: /format-reasoning) slash command that wraps reasoning/content using Reasoning Formatting settings - Accept required 'reasoning' named argument and optional unnamed 'content' argument - Validate that prefix/suffix are configured before formatting - Return formatted string via formatReasoning() helper for use with /reasoning-parse - Show warning toasts if prefix/suffix missing * Rename /genstream command to /profile-genstream and move to appropriate module * Apply messageFormatting to StreamingDisplay reasoning and content text for proper rendering - Import messageFormatting from script.js - Replace textContent with innerHTML using messageFormatting() in updateReasoning() and updateText() - Pass isSystem=true for reasoning, isSystem=false for content to match formatting expectations - Update css to utilize pre-formatted paragraphs correctly * Strip auto-added quotes from <q> tags in StreamingDisplay and add 'mes_text' class for consistent chat message formatting - Add CSS rules to remove browser-default quotes from <q> tags in reasoning and content sections - Add 'mes_text' class to textContent div to match chat message formatting behavior - Prevents double quotes when messageFormatting already adds them via <q> tags * Add minimize/close buttons and complete state to StreamingDisplay with configurable auto-hide - Add minimize button to collapse/restore content sections while keeping header visible - Add close button to manually dismiss display (generation continues in background) - Replace CSS pseudo-element with explicit LED indicator element for better state control - Add complete() method to mark generation done: changes LED from pulsing orange to solid green - Add configurable auto-hide delay after completion * Add stop button to StreamingDisplay with abort support and onStop/onComplete closures for /profile-genstream - Add stop button to StreamingDisplay when onStop 
handler is provided - Add markStopped() method with solid red LED state indicator - Add AbortController integration to /profile-genstream for request cancellation - Add onStop and onComplete closure arguments to /profile-genstream command - Update complete() method signature to use options object with label and delay - Disable stop button immediately * Position StreamingDisplay above bottom form block using CSS variable with fallback - Change bottom positioning from fixed 20px to dynamic calculation - Use max() to position above --bottomFormBlockSize + 5px or minimum 20px - Ensures StreamingDisplay doesn't overlap with bottom UI elements * Rename /profile-genstream arguments for clarity: label→generating, completedLabel→completed, hideDelay→delay - Rename `label` argument to `generating` to better reflect its purpose as the in-progress state label - Rename `completedLabel` to `completed` for consistency and brevity - Rename `hideDelay` to `delay` for simpler naming - Update all internal references and variable names to match new argument names - Update argument descriptions and default values accordingly * Remove variable resolution from /profile-genstream arguments: system, length, and delay - Remove ARGUMENT_TYPE.VARIABLE_NAME from typeList for system, length, and delay arguments - Replace resolveVariable() calls with direct argument access for system, length, and delay - Simplify type checking to use typeof directly on args properties - Maintain existing default values and validation logic * Add warning toast and early return when connection profile not found in /profile-genstream - Display toastr warning when fuzzy search fails to find matching profile - Return empty string to prevent execution with invalid profile - Improves user feedback for incorrect profile names or IDs * Extract buildResultText() helper in /profile-genstream to return partial results when stopped - Add buildResultText() helper function to centralize result formatting logic - Return partial 
generated text when user stops generation instead of empty string - Reuse buildResultText() for both stopped and completed states - Maintains consistent reasoning formatting in both cases * fix lint * Update documentation to reflect argument name change from hideDelay to delay --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2026-04-15 21:38:13 +03:00
Tony Gies	a9c377c3c8	feat: add Workers AI text embeddings and multimodal captioning (#5414 ) * feat: add Workers AI text embeddings and multimodal captioning Extends the Cloudflare Workers AI integration to the vectors and caption extensions. Embeddings: adds workers_ai source to the vectors extension using the OpenAI-compatible /v1/embeddings endpoint, with dynamic model listing from the Cloudflare model search API. Captioning: adds workers_ai as a multimodal caption API with dynamic vision model discovery via the multimodal-models endpoint. * Add logo svg * Refactor caption dropdown population * Fix order of sources * feat: add error handling for missing Workers AI account ID --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2026-04-08 23:43:21 +03:00
lunar sheep	ff1ca1412a	feat(secrets): update readSecret function to accept optional secret ID (#5356) * feat(secrets): update readSecret function to accept optional secret ID * add secret_id to ConnectionManagerRequestService payload * fix: pass secret_id for Text Completion types --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2026-03-30 22:30:45 +03:00
Xiangzhe	2cb1861db6	feat: add SiliconFlow.cn chat completion and embedding support (#5316 ) * feat: add SiliconFlow.cn endpoint support and embedding vectors Chat completion: - Add endpoint selection dropdown (Global/.com vs China/.cn) to existing SiliconFlow provider, following the Z.AI endpoint pattern - Backend switches API URL based on selected endpoint - Add /api-url slash command support for endpoint switching Embeddings: - Add SiliconFlow as a vector/embedding source (OpenAI-compatible) - Support both .com and .cn endpoints via siliconflow_endpoint setting borrowed from the main connection panel (Vertex AI pattern) - Superset model list with platform attribution (.cn) markers - Models: Qwen3-Embedding (0.6B/4B/8B) + BGE/BCE models (.cn only) * Add filter by models type * Load embedding models from endpoint * Improve api-url command declaration * Support endpoint override in custom-request service --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2026-03-22 00:52:03 +02:00
Cohee	5a7875ba28	Update Pollinations API (#5060) * Upgrade Pollinations API Done: text, caption To do: TTS, image Fixes #5020 * Update Pollinations TTS to new API * Update Pollinations API for images	2026-01-26 20:31:13 +02:00
Cohee	293ee0a310	Caption: Add custom model input field (#4956)	2026-01-05 02:15:52 +02:00
Subwolf	a8eb154517	Zai moonshot reverse proxy (#4923) * adding reverse proxy support * update * added handling for the image caption extension	2025-12-28 23:52:04 +02:00
Cohee	c92939e56c	Z.AI: Video inlining and 'coding' captions Closes #4899	2025-12-17 23:31:14 +02:00
Cohee	8ca836c3a1	custom-request: Pass api-url for Z.AI and Vertex and fix if omitted in profile (#4859)	2025-12-03 21:17:41 +02:00
qvink	a4cc9b3989	Facilitate extension use of ConnectionManagerRequestService (#4841) * Separate prompt-building functionality from request-sending functionality * removing logs and clarifying comments * separating parameter construction functionality to allow ConnectionManagerRequestService to use all other preset parameters * fixing chat completion issues, adding documentation to new functions. * Improving ConnectionManagerRequestService errors. Adding parseReasoningFromString option to override reasoning template. * Adjusting TextCompletionService prompt formatting * linting * Use settingsToUpdate to convert from OAI preset to OAI settings. * lint * throw errors when profile ID not found * Fix missed instances of global completion settings being used (CC and TC), replaced with optional argument. Specified typing for ChatCompletionSettings and TextCompletionSettings. * Adjusting parameters of parseReasoningFromString and adding getReasoningTemplateByName * using messages.role as a fallback for custom requests, fixing newline removal. * parameters => settings I like how it sounds better * ditto * You know I had to do it to 'em * Update getCustomTokenBans * Fix calculateLogitBias * Fix param attributes * Fix type checks * Less strict role type on ChatCompletionMessage * Add missing space * fixing getChatCompletionModel to use an arbitrary chat completion settings object * Fixing issues with preset overriding custom data passed.
* Pass model to createGenerationParameters externally * Unify seed param handling for CHUTES * Fix non-existing CC source * Use strict comparison * Use global settings as a base for generation parameters creation * removing unnecessary handling of preset fields * don't pass preset prompts, use the passed payload override messages * refactoring text generation prompt building of last line * Pass model to getReasoningEffort * Pass model name to canPerformToolCalls * Pass model to createTextGenGenerationData --------- Co-authored-by: qvink <qvink@users.noreply.github.com> Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2025-12-03 20:09:02 +02:00
Ben	55a07d445d	Chutes integration (#4844 ) * Chutes integration * Fix eslint * Fix key saving * Fix logo coloration * Fix tool checks * Unhide image inlining controls * Fix order of options * Fix type use in TTS extension script * Add Chutes as a vector storage source * Change log levels to debug * Fix streamed reasoning parsing * Skip remote models update * TTS: Fix API key highlight * Sort image models A-Z * TTS: Fixes * Remove unused SD endpoint * Skip setting context size if models list is not yet loaded * remove chutes quota / balance * Fix: streamed tool calling * Hide reasoning effort control * Add image request debug log * Fix: scroll down on media load in extensions * Unhide some samplers * Bring back reasoning effort * This code will never execute * Reformat else if cases * Add stop strings to request * Remove conditional from reasoning_effort body param * Preserve original pricing fields * Unhide logit bias setting * Pass repetition penalty and logit bias to backend * Swap llama tokenizer for llama3 * Pass min_p, remove supported_sampling_parameters checks * Enable logprobs --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2025-12-01 00:17:49 +02:00
Cohee	38679897c6	Add captioning for video attachments (#4749 ) * Add captioning for video attachments * Unify error toast titles * Add MEDIA_SOURCE enum and update media handling to include source information * Unify attachment handling logic * Add error handling for auto-captioning failures * Use string formatting for console error	2025-11-08 02:07:28 +02:00
Cohee	4add4f0090	Add official GLM API as CC provider (#4678 ) * Add UI elements * Add support for model configuration * fix: update API request parameters for improved handling * Add logo img * Fix tool calling with negative index * Include tool calls into 'last in context' calculation * feat: add support for captioning	2025-10-21 22:37:27 +03:00
bmen25124	d48a3639ed	Added "custom_prompt_post_processing" to custom-request (#4639)	2025-10-10 20:37:23 +03:00
Ngo Dinh Gia Bao	8687bb99f3	Add Electron Hub as Chat Completions Provider (#4458 ) * fixed merge conflicts * Supported max tokens + fixed wrong image model mapping * fixed merge conflicts * fixed merge conflicts * updated the logic * updated the logic * replaced hard coded reasoning_effort mode list with a dynamic function * replaced hard coded reasoning_effort model list with a dynamic function * Fix eslint * Adjust reasoning effort logic * Code clean-up * Add logo * Add inline image quality * Fix multimodal models list * Fix seed not passed * Add "detail" error parser --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2025-09-04 21:25:31 +03:00
Cohee	285b09ceb7	NanoGPT: Add as captioning source	2025-08-31 19:00:27 +03:00
Cohee	80fa7f96f1	Caption: Improve Ollama models handling	2025-08-27 00:45:24 +03:00
Cohee	0e38bfbf05	Feat/moonshot api (#4330 ) * moonshot * Partial mode + JSON schema * Add logo image * Limit max temp to 1 * Add to captioning extension	2025-07-31 00:01:04 +03:00
Cohee	086f873f2f	Deprecate 01.ai chat completion source (#4327)	2025-07-28 21:29:01 +03:00
Cohee	28189bb1c7	Move default slash commands to respective module (#4240 ) * Move default slash commands to respective module * Update public/scripts/slash-commands.js Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-07-06 17:17:19 +03:00
Cohee	dbe0111034	Refactor saveBase64AsFile uploads (#4200 ) * Refactor saveBase64AsFile uploads * Add request body check * Extract server-side constants * Allow .jfif media attachments * Allow .bmp uploads * Enhance image prompt handling: support additional MIME types and prevent upscaling in thumbnails * Convert file extension to lowercase * Enhance thumbnail creation: improve image quality and add white background * Add toast for error in media upload	2025-06-25 21:34:08 +03:00
Compass	083f92b2b2	Add region option for vertex express mode to access more models (#4155 ) * add region option for vertex express mode to access more models * add region option for vertex express mode to access more models * fix: Consolidate duplicate 'vertexai_region' input * feat: add vertex region suggestion datalist & update caption endpoint to correctly handle region * Adjust global projectless Vertex endpoint * Remove pointless slash trim * Skill issue --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2025-06-15 15:26:12 +03:00
Dmitry	fece612f09	Merge pull request #4135 from D1m7asis/release feat: Added AI/ML API Provider Support	2025-06-13 20:59:18 +03:00
InterestingDarkness	75e3f599e6	Derive Vertex AI Project ID from Service Account JSON This commit refactors the Vertex AI integration to automatically derive the Project ID from the provided Service Account JSON. This simplifies the configuration process for users in "Full" (service account) authentication mode by removing the need to specify the Project ID separately.	2025-05-28 21:57:17 +08:00
InterestingDarkness	1e2bec1751	Removed direct references to 'vertexai_project_id' from openai.js and related files, ensuring it is now managed through backend secrets for enhanced security.	2025-05-27 21:25:53 +08:00
InterestingDarkness	5656c7950d	Implement Vertex AI authentication modes and configuration in UI - Updated index.html to include options for Vertex AI authentication modes (Express and Full). - Enhanced openai.js to manage Vertex AI settings, including project ID, region, and service account JSON. - Added validation and handling for service account JSON in the backend. - Modified API request handling in google.js to support both authentication modes for Vertex AI. - Updated secrets.js to include a key for storing Vertex AI service account JSON. - Improved error handling and user feedback for authentication issues.	2025-05-26 22:09:59 +08:00
Aykut Akgün	e4217dbeba	custom endpoint handling (#4031)	2025-05-24 01:41:03 +03:00
Cohee	57b81be9ce	Caption - allow custom endpoint for xAI	2025-05-22 23:03:04 +03:00
NijikaMyWaifu	157315cd68	Add Vertex AI express mode support (#3977 ) * Add Vertex AI express mode support Split Google AI Studio and Vertex AI * Add support for Vertex AI, including updating default models and related settings, modifying frontend HTML to include Vertex AI options, and adjusting request processing logic in the backend API. * Log API name in the console * Merge sysprompt toggles back * Use Gemma tokenizers for Vertex and LearnLM * AI Studio parity updates * Add link to express mode doc. Also technically it's not a form * Split title * Use array includes * Add support for Google Vertex AI in image captioning feature * Specify caption API name, add to compression list --------- Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>	2025-05-22 20:10:53 +03:00
Cohee	aef005007f	Do not remove data URI prefix from llamacpp caption requests	2025-05-09 23:23:34 +03:00
Cohee	8a4da487dd	llamacpp: use generic CC endpoint for captioning	2025-05-09 22:33:25 +03:00
Cohee	91fc50b82d	Merge branch 'staging' into gork-ai	2025-04-11 21:15:54 +03:00
bmen25124	fc5e0563ba	Added ability to override request payload	2025-04-11 19:07:00 +03:00
Cohee	17cdc78a91	Add xAI for image captioning	2025-04-11 19:05:03 +03:00
bmen25124	4736f533a5	Added proxy support to ChatCompletionService	2025-04-11 19:04:32 +03:00
bmen25124	50b2eeb61f	Added api check for ConnectionManagerRequestService.handleDropdown	2025-04-01 04:39:41 +03:00
bmen25124	972b1e5fa7	Fixed variable naming, better jsdoc	2025-03-26 23:30:09 +03:00
bmen25124	a7d48b1aed	Added overridable instruct settings, removed macro override	2025-03-26 23:21:48 +03:00
bmen25124	ec474f5571	Added stream support to "custom-request"	2025-03-21 20:44:09 +03:00
bmen25124	86de927ab9	Added "custom_url" to ChatCompletionService	2025-03-17 14:54:59 +03:00
bmen25124	d42a81f97c	New connection manager events, ConnectionManagerRequestService (#3603)	2025-03-16 16:58:34 +02:00
Cohee	c167890d26	Add multimodal captioning for Cohere	2025-03-05 21:36:43 +02:00
Cohee	d7b3a56c3d	chore: await before returning	2025-01-07 20:07:41 +02:00
Cohee	81841ca2a6	WebLLM: use current tokenizer if not available	2025-01-07 20:01:59 +02:00
Cohee	6706cce10d	Groq: Add new models and multimodal captions	2024-10-03 08:41:45 +03:00
Cohee	60df924bec	MistralAI: Add Pixtral to models and captioning	2024-09-17 21:44:25 +03:00
Cohee	8bb964515a	Fix Gemini multimodal with JPG images Fixes #2763	2024-09-08 10:48:28 +03:00
Cohee	06e3d5f8de	Rename MakerSuite => AI Studio	2024-08-21 21:00:17 +03:00
Cohee	8921b78f87	Add debug logs to WebLLM completions	2024-08-13 19:57:38 +03:00
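
Commit aecbb9a2ee notes that the MiniMax API rejects consecutive messages with the same role ('invalid chat setting (2013)') and that they are merged before sending. The idea can be sketched roughly as below. This is an illustration only, not the code from the commit: the function name `mergeConsecutiveRoles`, the `separator` parameter, and the assumption that every `content` is a plain string are all hypothetical, and the same commit later replaced its bespoke merge with the standard MERGE_TOOLS processing.

```javascript
// Hypothetical helper, NOT the SillyTavern implementation.
// Assumes each message is { role: string, content: string }.
function mergeConsecutiveRoles(messages, separator = '\n\n') {
    const merged = [];
    for (const message of messages) {
        const previous = merged[merged.length - 1];
        if (previous && previous.role === message.role) {
            // Same role as the previous entry: append the text to it.
            previous.content = `${previous.content}${separator}${message.content}`;
        } else {
            // Role changed (or first message): start a new entry.
            // Shallow copy so the caller's objects are not mutated.
            merged.push({ ...message });
        }
    }
    return merged;
}
```

Called on a chat whose first two entries are both `user` messages, the helper returns a single `user` entry containing both texts joined by the separator, followed by the remaining messages unchanged.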
