Commit Graph

1691 Commits

Author SHA1 Message Date
Cohee c249e5384c feat: pass koboldcpp reasoning effort (#5491)
Fixes #5489
2026-04-26 00:02:07 +03:00
Cohee 09d72828cb feat: add gemma 4 for AI studio (#5493)
* feat: add gemma 4 for AI studio

* fix: update max context return value for gemma-3n-e4b-it model

* refactor: iterate array of [regex, number]

* gemma4: enable tool calling and sysprompt

Co-authored-by: Copilot <copilot@github.com>

---------

Co-authored-by: Copilot <copilot@github.com>
2026-04-25 22:22:55 +03:00
Cohee 09bb7622ed OpenAI: Add gpt-5.5, gpt-5.4-mini/nano, gpt-image-2 (#5529)
* feat: gpt-image-2 for OpenAI image generation

* gpt-5.5

Co-authored-by: Copilot <copilot@github.com>

* fix: adjust reasoning effort mapping

Co-authored-by: Copilot <copilot@github.com>

* fix: html format

---------

Co-authored-by: Copilot <copilot@github.com>
2026-04-25 21:46:52 +03:00
DeathStalker471 b1ef254f78 fix: disable HTTP keepAlive (Node 18 behavior) with a config toggle (#5519)
* implement disable keepalive, handle request-proxy and config logic

* Invert keep-alive boolean setting

* fix: clean-up server.js diff

* fix: boolean flag type

* feat: disable keep-alive by default

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-04-24 22:53:35 +03:00
Dclef 77cbcd8774 feat: add DeepSeek V4 model support with thinking mode and reasoning effort (#5522)
* fix: align DeepSeek provider with V4 API

* Fix DeepSeek beta routing for standard chat completions

* feat: add DeepSeek V4 model support with thinking mode and reasoning effort

* Address DeepSeek review feedback

* Set DeepSeek default model to v4 flash

* fix: clean-up deprecated models, add migration

* fix: move reasoning effort mapping to resolveReasoningEffort

* fix: lint empty line

* fix: remove duplicate code

* fix: add coder model to migration logic

---------

Co-authored-by: dclef <drclef233@gmail.com>
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-04-24 21:47:30 +03:00
Octopus aecbb9a2ee feat: add MiniMax as a chat completion provider (#5452)
* feat: add MiniMax as a chat completion provider

Add MiniMax (https://www.minimax.io) as a first-class chat completion
provider. MiniMax already has TTS integration in SillyTavern; this
extends support to LLM chat completions via their OpenAI-compatible API.

Supported models:
- MiniMax-M2.5 (default) — 204K context
- MiniMax-M2.5-highspeed — same capability, faster inference

Key implementation details:
- Reuses existing SECRET_KEYS.MINIMAX (shared with TTS)
- API endpoint: https://api.minimax.io/v1
- Temperature clamped to (0.0, 1.0] as required by MiniMax API
- Returns hardcoded model list since MiniMax doesn't expose /v1/models
- Full UI integration: model selector, sampler parameters, streaming

Co-Authored-By: octo-patch <octo-patch@users.noreply.github.com>

* feat: upgrade MiniMax default model to M2.7

- Add MiniMax-M2.7 and MiniMax-M2.7-highspeed to model list
- Set MiniMax-M2.7 as default model
- Keep all previous models as alternatives

* feat: independent request function, vision support, temp clamping for MiniMax

- Extract sendMinimaxRequest() following Chutes pattern (PR #4844)
  with function calling and JSON Schema structured output support
- Clamp temperature to (0.01, 1.0] on backend; limit frontend UI max to 1.0
- Enable image inlining for MiniMax M2.7 model
- Add MiniMax to slash-commands model selector and tokenizer mapping
- Add minimax_model to default preset

* feat: add VLM-based vision support for MiniMax M2.7

M2.7 does not natively accept image input. When images are detected
in messages, pre-process them via the MiniMax VLM endpoint
(/v1/coding_plan/vlm) to convert images to text descriptions before
sending to the chat completions API. Uses the same API key.

* feat: add M2-her model to MiniMax provider

M2-her is MiniMax's dialogue/roleplay-optimized model with 64K context
and 2048 max completion tokens. Text-only (no vision).

* feat: add MiniMax China endpoint (minimaxi.com) support

Add endpoint selector (Global/China) for MiniMax, mirroring the
SiliconFlow pattern. Users can now choose between api.minimax.io
(international) and api.minimaxi.com (China domestic).

* fix: merge consecutive same-role messages for MiniMax

MiniMax API rejects consecutive messages with the same role with
error 'invalid chat setting (2013)'. Merge them before sending.

* review: address PR feedback on MiniMax provider

Backend (src/endpoints/backends/chat-completions.js):
- Drop the entire MiniMax VLM image-preprocessing path; vision is no
  longer advertised for this provider, so M2.7 messages now go straight
  to /chat/completions without a separate VLM round-trip.
- Drop the json_schema -> response_format mapping (MiniMax does not
  document structured-output support; relying on it was speculative).
- Drop the backend temperature clamp; the same clamp now lives in the
  frontend so the wire payload matches what the user sees.
- Drop the MINIMAX branch in /status that returned a hard-coded model
  list; the frontend hardcodes the same list and bypasses /status via
  noValidateSources, so the round-trip was wasted.
- Add a streaming Transform + non-streaming helper that move
  <think>...</think> blocks from delta.content / message.content to
  reasoning_content. MiniMax M2.x emit chain-of-thought inline in
  content; without this transform the raw <think> tags leak into the
  rendered chat. Includes a state machine that holds back partial
  marker bytes so a marker split across SSE chunks is still detected.

Frontend:
- public/scripts/openai.js: add MINIMAX to noValidateSources so the key
  is accepted without a /models call; remove the dead saveModelList
  branch; clamp temperature to (0.0, 1.0] in createGenerationParameters.
- public/scripts/reasoning.js: add MINIMAX to the non-streaming
  reasoning_content extraction case (the backend transform now produces
  this field for MiniMax responses).
- public/scripts/slash-commands.js: add MINIMAX to the /api enum and
  add a MiniMax case to /api-url so users can switch endpoint by
  command.
- public/scripts/custom-request.js: pass minimax_endpoint through the
  override-payload merge alongside the other per-source endpoint fields.
- public/scripts/tokenizers.js: stop returning openai_model (which was
  always a MiniMax model id and thus an unknown tokenizer); fall back
  to gpt-3.5-turbo for a coarse but functional estimate.
- public/scripts/tool-calling.js: add MINIMAX to supportedSources so
  function-calling settings are exposed.
- public/index.html: drop the "-- Connect to the API --" placeholder
  option from the model select (the model list is hardcoded and always
  populated); remove minimax from the vision data-source attributes
  on the inline-media controls.
- public/img/minimax.svg: replace the multicolor brand SVG with a
  single-color currentColor version that matches the other provider
  icons in the connect panel.

* review: drop backend <think> parsing, defer to frontend

Per reviewer feedback: SillyTavern's reasoningHandler / reasoning_auto_parse
setting already extracts <think>...</think> blocks on the client side, so the
backend doesn't need to rewrite MiniMax responses. Removes the SSE Transform,
the non-streaming helper, and the corresponding case in reasoning.js.

* fix: remove isImageInliningSupported declaration for MINIMAX

* fix: remove MINIMAX from stream reasoning parsing

* fix: add to autoconnect logic

* fix: add missing MINIMAX models from docs

* fix: freq. and pres. pen aren't supported for MINIMAX

* fix: use clamp function for adjusting temperature

* fix: pass minimax_endpoint from connection profile to ChatCompletionService

* fix: update supported APIs in slash command documentation

* fix: replace bespoke merge with standard MERGE_TOOLS processing

* fix: add data-i18n attributes for headers

---------

Co-authored-by: octo-patch <octo-patch@users.noreply.github.com>
Co-authored-by: octo-patch <octo-patch@github.com>
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-04-24 00:43:05 +03:00
Stagnating a028bec87b Display OpenRouter credit balance in UI (#5513)
* Display OpenRouter credit balance in UI

Adds a "View Remaining Credits" click handler that fetches the
current balance from the OpenRouter /credits endpoint via a new
server-side /api/openrouter/credits route, and renders it next
to the link. The anchor still points at openrouter.ai/account
so middle-click / right-click "open in new tab" keeps working.

* Return 500 on OpenRouter credits failure

* Reduce to two decimals

* Update view credits URL

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-04-23 23:21:28 +03:00
Cohee b8f2b1cfa6 fix: enhance URL validation for Z.AI image generation (#5482)
* fix: enhance URL validation for Z.AI image generation and handle potential delays
Fixes #5480

* fix: retry 404 in a loop

* fix: handle Z.AI image and video unavailability with appropriate warnings and responses
2026-04-20 00:28:55 +03:00
Wolfsblvt d720605be8 Bulk extension field updates via merge-attributes with UNSET_VALUE sentinel (#5471)
* feat: add bulk extension field updates with UNSET_VALUE sentinel for key deletion

- Add `UNSET_VALUE` sentinel constant to signal complete field removal from character cards
- Add `writeExtensionFieldBulk()` function to update extension fields across multiple characters in a single API call
- Add `deleteValueByPath()` utility function to remove nested object keys by dot-path
- Update `writeExtensionField()` to support `UNSET_VALUE` for deleting extension keys
- Extend `/api/characters/merge-attributes

* Revert package-lock.json changes

* Allow null values in merge-attributes filter path validation

Change filter.path existence check to only skip on undefined, not null. This allows merging attributes when the existing value is explicitly null, treating null as a valid value rather than absence of a value.

* fix: share forbiddenRegExp between modules

* feat: add writeExtensionFieldBulk and UNSET_VALUE constant to getContext

* Update src/endpoints/characters.js

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: validate for .png extension

* Update public/scripts/extensions.js

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* refactor: extract shouldSkip logic as a function param to avoid double parsing

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-20 00:06:28 +03:00
ashishch432 d1e719eb48 add claude-opus-4-7 (#5465) 2026-04-19 15:47:40 +03:00
Cohee 3f72d3df80 Improve OpenRouter model lists in extensions (#5459)
* fix: extensions OpenRouter model lists

* fix: update JSDoc for optional mapping function parameter in fetchModelsByModality

* fix: update JSDoc to clarify return type of fetchModelsByModality function

* fix: encode output modality in fetchModelsByModality API request
2026-04-15 23:18:26 +03:00
Copilot 78628f7dbb Integrate Cloudflare Workers AI text-to-image into SD extension (#5434)
* feat: integrate Cloudflare Workers AI for text-to-image generation in SD extension

Agent-Logs-Url: https://github.com/SillyTavern/SillyTavern/sessions/efc79e4d-2119-4cdb-8afb-f26e318a38ef

Co-authored-by: Cohee1207 <18619528+Cohee1207@users.noreply.github.com>

* fix: address review - use oai_settings for account ID, sort dropdown alphabetically, remove Account ID input, move debug log

Agent-Logs-Url: https://github.com/SillyTavern/SillyTavern/sessions/bf0dda38-df40-44f4-8a63-0c952b48905d

Co-authored-by: Cohee1207 <18619528+Cohee1207@users.noreply.github.com>

* Clean-up diffs

* feat: add refresh models button to Workers AI section

Agent-Logs-Url: https://github.com/SillyTavern/SillyTavern/sessions/ab6b5e7a-84d2-44d1-9f6e-3d330de04ef1

Co-authored-by: Cohee1207 <18619528+Cohee1207@users.noreply.github.com>

* fix: revert unrelated package-lock.json changes

Agent-Logs-Url: https://github.com/SillyTavern/SillyTavern/sessions/ab6b5e7a-84d2-44d1-9f6e-3d330de04ef1

Co-authored-by: Cohee1207 <18619528+Cohee1207@users.noreply.github.com>

* Fix models loading

* refactor: update model refresh button ID and add class to select elements

* Send formData to BFL models

* fix: adjust use FormData condition

* fix: validate Workers AI account ID before proceeding with image model loading

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Cohee1207 <18619528+Cohee1207@users.noreply.github.com>
2026-04-15 22:00:08 +03:00
Alex Dills fa9a28c6f3 Fix stable-diffusion.cpp model routing and URL path handling (#5427)
* fix: include model field in sd.cpp SDAPI requests and preserve URL path

The sd.cpp integration overwrites the URL pathname when constructing
requests, which breaks proxy servers like llama-swap that use path-based
routing (e.g. /upstream/model-name). Additionally, the model field was
never included in SDAPI requests, which is required by llama-swap to
route requests to the correct backend.

Changes:
- Server: Append to URL pathname instead of overwriting (same pattern as #5178)
- Server: Pass model field through to sd-server payload
- Client: Add model name text input for sd.cpp source settings
- Client: Send model name in generate request payload

* fix: fetch models from server and populate standard Model dropdown

Instead of a separate text input for the model name, fetch the model
list from the sd.cpp server's /v1/models endpoint and populate the
standard Model dropdown. This provides a seamless experience where
users just pick a model from the dropdown like any other source.

Works with both standalone sd-server and proxy servers like llama-swap
that expose multiple models via the OpenAI-compatible models endpoint.

* fix: don't send clip_skip=1 to sd.cpp, it produces blank images

sd-server generates blank white images when clip_skip is set to 1.
Since clip_skip=1 means 'use all CLIP layers' (the default behavior),
only send the parameter when it's > 1.

* Fix eslint

* Replace string appends with urlJoin

* fix: convert URL strings to URL objects in sdcpp routes

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-04-09 23:12:21 +03:00
Tony Gies a9c377c3c8 feat: add Workers AI text embeddings and multimodal captioning (#5414)
* feat: add Workers AI text embeddings and multimodal captioning

Extends the Cloudflare Workers AI integration to the vectors and
caption extensions.

Embeddings: adds workers_ai source to the vectors extension using the
OpenAI-compatible /v1/embeddings endpoint, with dynamic model listing
from the Cloudflare model search API.

Captioning: adds workers_ai as a multimodal caption API with dynamic
vision model discovery via the multimodal-models endpoint.

* Add logo svg

* Refactor caption dropdown population

* Fix order of sources

* feat: add error handling for missing Workers AI account ID

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-04-08 23:43:21 +03:00
Cohee 5ec635aa40 fix npm audit in src/electron (#5405) 2026-04-06 00:46:27 +03:00
Tony Gies 700fc05411 feat: add Cloudflare Workers AI provider (#5385)
* feat: add Cloudflare Workers AI provider

Adds support for Cloudflare Workers AI using its OpenAI-compatible API.

Workers AI-specific stuff includes:
- Model list fetching and capabilities detection
- Tokenizer auto-detection for typical hosted model families
- Streaming not supported when using structured output

Closes #5305

* Make the entire header clickable

* Add missing samplers

* Fix non-streaming reasoning parsing

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-04-06 00:24:47 +03:00
KKTsN c9c652eece fix: improve streaming error propagation and forwarded response logging (#5317)
* Fix: Improve streaming error handling and forwarded response logging

* Fix: fix ESLint error  Strings must use singlequote  quotes

* fix: preserve and log forwarded stream errors

* chore: narrow forwarded stream error fix scope

* fix: make forwardFetchResponse awaitable and forward upstream error text

* Restore original happy path handling

* Remove redundant checks in forwardFetchResponse function

* Don't send anything on parsing error end

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-04-05 23:01:47 +03:00
Cohee d96d1451ab Add IP whitelist for SSO authentication headers (#5404)
* feat: add trusted proxies configuration for SSO authentication

* Refactor check to accept IP address directly

* Refactor IP patterns validation

* Unify warning message format
2026-04-05 22:20:39 +03:00
Cohee 8e8f501279 Immutable public and global content management (#5390)
* Use custom init script instead of postinstall

* Revert changes to start scripts in src\electron

* Add global data to content manager

* Add migration for public overrides and user.css location update

* Update npm publish workflow to use 'omit=dev' flag in npm ci commands

* Rename user.css readme file

* Fix indentation in userCssMiddleware function

* Add directory creation for content target

* Restore template compile location

* Move stylesheet up in index.json

* Use path.resolve for user.css file path in userCssMiddleware

* Correct capitalization in "Not Found" error page title and heading

* Remove init run from startup scripts

* Simplify user CSS file path resolution

* Update userCssMiddleware comment
2026-04-05 19:32:28 +03:00
Cohee e2d8c0200f Use custom init script instead of postinstall (#5384)
* Use custom init script instead of postinstall

* Revert changes to start scripts in src\electron

* feat: add --ignore-scripts flag to npm install commands in batch and shell scripts

* feat: add --ignore-scripts flag to npm ci in Dockerfile
2026-04-01 23:34:00 +03:00
lunar sheep ff1ca1412a feat(secrets): update readSecret function to accept optional secret ID (#5356)
* feat(secrets): update readSecret function to accept optional secret ID

* add secret_id to ConnectionManagerRequestService payload

* fix: pass secret_id for Text Completion types

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-03-30 22:30:45 +03:00
Tony Gies 271c1e22ca Fix async file deletion bugs in assets endpoint (#5363)
The delete handler had a missing `return` before `sendStatus(400)`,
causing execution to fall through to `sendStatus(200)`, a double-send
that triggers ERR_HTTP_HEADERS_SENT, which the catch block then
compounds by attempting a third `sendStatus(500)`.

Both the delete and download handlers used callback-based `fs.unlink()`
without awaiting completion. In the download handler, this caused a race
with `createWriteStream({ flags: 'wx' })` (which fails if the file
still exists). In both handlers, `throw err` inside the callback was an
unhandled exception that could never be caught by the outer try/catch.

Replace callback-based `fs.unlink()` with `await fs.promises.unlink()`
and add missing `return` statements to prevent response cascades.
2026-03-28 15:39:29 +02:00
Cohee c78f978ede fix: conditionally include secrets in user data backup (#5360)
* fix: conditionally include secrets in user data backup

* feat: add full data backup toggle

* 418 -> 403
I'm not a teapot

* Distinguish fails from disabled
2026-03-28 01:52:03 +02:00
Copilot 319c647e13 Fix vLLM vector embeddings URL construction to preserve custom API path prefixes (#5350)
* Initial plan

* fix: use trimV1 and url-join for vLLM vector embeddings URL construction

Fixes URL path construction in vllm-vectors.js to preserve custom API
path prefixes (e.g. /compatible-mode/v1). Previously url.pathname
assignment would overwrite the entire path, stripping any prefix.

Now uses the same trimV1 + urlJoin pattern as llamacpp-vectors.js.

Co-authored-by: Cohee1207 <18619528+Cohee1207@users.noreply.github.com>
Agent-Logs-Url: https://github.com/SillyTavern/SillyTavern/sessions/f708dd66-8961-4c23-8b8b-3ab868bf676a

* Revert package-lock

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Cohee1207 <18619528+Cohee1207@users.noreply.github.com>
2026-03-25 00:12:52 +02:00
SoniaNvm a794a3780a Renaming a lorebook now re-links itself to cards using it (with a confirmation prompt) (#5323)
* renaming a lorebook prompts to update existing links

* used suggested api and logic

* add world property to shallow function

* Fix type error in assignLorebookToChat invoke

* Remove debug console logs

* Fix activeCharacterUpdated

* Extract updateWorldInfoLinks into a func

* Invert if for an early return

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-03-24 23:48:43 +02:00
Raymond Flanagan 4839c76fb5 Handle port conflicts during server startup (#5349)
* Handle port conflicts during server startup

* Fix return type of startHTTPorHTTPS

* Update language in getAddressInUseMessage

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-03-24 23:29:25 +02:00
allo- d306194c51 Fix missing model name in tokenize requests for llama.cpp (fixes #4962) (#5344)
* Fix missing model name in tokenize requests for llama.cpp (fixes #4962)

The new router mode of llama.cpp allows to switch models on the fly,
what is already supported by SillyTavern. The call to the `/tokenize`
endpoint did not contain the model name, and failed in router mode.
This patch adds the `model` parameter similar to the implementation
for other backends.

* fix: migrate vllm and aphrodite to new payload field

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-03-23 23:12:02 +02:00
Xiangzhe 2cb1861db6 feat: add SiliconFlow.cn chat completion and embedding support (#5316)
* feat: add SiliconFlow.cn endpoint support and embedding vectors

Chat completion:
- Add endpoint selection dropdown (Global/.com vs China/.cn) to existing
  SiliconFlow provider, following the Z.AI endpoint pattern
- Backend switches API URL based on selected endpoint
- Add /api-url slash command support for endpoint switching

Embeddings:
- Add SiliconFlow as a vector/embedding source (OpenAI-compatible)
- Support both .com and .cn endpoints via siliconflow_endpoint setting
  borrowed from the main connection panel (Vertex AI pattern)
- Superset model list with platform attribution (.cn) markers
- Models: Qwen3-Embedding (0.6B/4B/8B) + BGE/BCE models (.cn only)

* Add filter by models type

* Load embedding models from endpoint

* Improve api-url command declaration

* Support endpoint override in custom-request service

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-03-22 00:52:03 +02:00
cloak1505 e3fbc8510a Correct input_video to video_url in embedOpenRouterMedia() (#5331) 2026-03-21 23:55:42 +02:00
DN2331 c3f36b2b9f fix(openrouter): respect enableThoughtSignatures setting for message signatures (#5318)
The addOpenRouterSignatures function was previously converting and appending message.signature to reasoning_details unconditionally, ignoring the `enableThoughtSignatures` setting.

This change adds a check for `enableThoughtSignatures` before converting message.signature, while still ensuring the original signature property is deleted to prevent API schema validation errors (HTTP 400).
2026-03-18 18:20:40 +02:00
GentleBurr c4024fe208 Fix AICC direct link import parsing (#5307)
* Fix AICC direct link import parsing

Update parseAICC in src/endpoints/content-manager.js to dynamically extract the author and character name from the end of the URL path. This resolves a 404 import error caused by AICC adding category subfolders and changing their base URL structure from /character-cards/ to /charactercards/.

* Clean up whitespace in content-manager.js

Remove unnecessary whitespace in URL path processing.

* Use isValidUrl for URL validation

---------

Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-03-17 20:35:22 +02:00
Cohee 645b8063e1 Intellectual Webpack cache management (#5295)
* Intellectual Webpack cache management

* Check if webpackRoot exists.

* Wrap webpack directory reading in try-catch

* Enhance cache pruning console output
2026-03-15 23:30:57 +02:00
Cohee e0ed67357c Use string byte length for token guesstimation (#5267)
* Use string byte length for token guesstimation

* Use Buffer.byteLength on backend

* Preserve TextEncoder instance
2026-03-11 01:28:23 +02:00
Roland4396 1c5091539c feat: optionally gzip large save uploads with fallback (#5259)
* feat: optionally gzip large save uploads with fallback

* fix: replace Safari-prone save compression with fflate fallback

* refactor: align save upload compression with review feedback

* refactor: use compressRequest wrapper for save uploads

* Refactor request compression settings

* Fix default value

* Avoid null in bytes parsing result

* fix: switch request compression to fflate gzip

* fix: add request compression maxBytes cap and clarify timeout semantics

* Refresh package-lock.json

* Unify payload limit setting names

* Expose compression termination function

* Add compression to group chat saves

---------

Co-authored-by: Roland4396 <Roland4396@users.noreply.github.com>
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-03-10 23:32:36 +02:00
Copilot 3ad9b05e27 Implement extension manifest hooks for lifecycle events (#5261)
* Initial plan

* Implement extension manifest hooks for install, delete, enable, disable

Co-authored-by: Cohee1207 <18619528+Cohee1207@users.noreply.github.com>

* Revert unrelated package-lock.json changes

Co-authored-by: Cohee1207 <18619528+Cohee1207@users.noreply.github.com>

* Address review: use Object.hasOwn, add activate hook, simplify await, return folderName from backend

Co-authored-by: Cohee1207 <18619528+Cohee1207@users.noreply.github.com>

* Add 'update' hook that triggers on extension update

Co-authored-by: Cohee1207 <18619528+Cohee1207@users.noreply.github.com>

* Revert package-lock

* Add 5-second timeout for extension hook calls using delay and Promise.race

Co-authored-by: Cohee1207 <18619528+Cohee1207@users.noreply.github.com>

* Revert unintended package-lock.json changes

Co-authored-by: Cohee1207 <18619528+Cohee1207@users.noreply.github.com>

* Add timeout warning log when extension hook exceeds 5 seconds

Co-authored-by: Cohee1207 <18619528+Cohee1207@users.noreply.github.com>

* Refactor extension hook call to handle synchronous results

* Refactor callExtensionHook to use constants for timeout results

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Cohee1207 <18619528+Cohee1207@users.noreply.github.com>
2026-03-08 13:10:28 +02:00
Cohee 92dd9ecab2 gpt-5.4 2026-03-07 20:39:34 +02:00
Cohee 7a9483efba Remove BOS from KoboldCpp token encoding
Fixes #4663
2026-03-07 18:28:20 +02:00
Cohee e19c4f7e19 Remove BOS token from Tabby encoding
Fixes #5254
2026-03-07 18:25:58 +02:00
equal-l2 e834d3724b Remove xAI web search capability (#5255)
With web search on, the API now returns 410 Gone.
2026-03-07 16:58:56 +02:00
Spicy Marinara f20aed95d0 Add gpt-5.3-chat-latest model support (#5241)
* Add gpt-5.3-chat-latest model support

- Add to OpenAI model dropdown (index.html)
- Add to captioning multimodal model list (caption/settings.html)
- Add to OPENAI_REASONING_EFFORT_MODELS (constants.js)
- Add OPENAI_FIXED_REASONING_EFFORT map to clamp effort to 'medium' (the only value this model accepts)
- Apply fixed effort override in both Azure and general OpenAI request paths (chat-completions.js)
- Update frontend gpt-5.x regex for parameter handling (openai.js)

* Update public/scripts/openai.js

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-03-04 20:04:04 +02:00
Cohee 83a16d144c Add path validation for chat directory operations 2026-03-03 23:43:23 +02:00
Cohee ff8b029c9d Fortify user data path checks 2026-03-03 22:00:44 +02:00
Cohee 2abade3f80 Add sanitize of chat import names 2026-03-03 21:55:47 +02:00
Cohee 3070cf26cd Add config for adaptive thinking
Fixes #5236
2026-03-03 20:10:39 +02:00
Cohee 63fa9c1d07 Claude: map Reasoning Effort to adaptive thinking config (#5219)
Supersedes #5105
2026-03-01 17:11:22 +02:00
Sanitised 3db508a759 Support for isomorphic-git as an alternative git backend, part 1 (#5229)
* Initial version of git adapter for alternate backend. Only clone is
implemented.

* Regenerate package-lock.json

* Clarify comments in config.yaml regarding git backend options

---------

Co-authored-by: Sanitised <sanitised@users.noreply.github.com>
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-03-01 17:08:07 +02:00
Tony Gies 15e2f24046 fix: skip system messages in OpenRouter cacheAtDepth calculation (#5230)
System messages in the OpenRouter messages array were being counted
toward depth and could receive cache_control breakpoints. Since
OpenRouter hoists system messages into Claude's separate system
parameter, this misplaced breakpoints and could prevent caching
entirely if the hoisted content fell below minimum cache size.

Closes #5227
2026-03-01 16:58:51 +02:00
Cohee 744ce7705d gemini-3.1-flash-image-preview 2026-02-27 20:26:22 +02:00
shifusen329 d789efba07 Use Ollama /api/embed endpoint for vector embeddings (#5221)
* Use Ollama /api/embed endpoint for vector embeddings

The deprecated /api/embeddings endpoint does not properly support the
truncate parameter, causing "input length exceeds context length" errors
when vectorizing files. Migrate to /api/embed which correctly handles
truncation and supports native batch input.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Wrap single Ollama vector calculation into batch
Fixes https://github.com/SillyTavern/SillyTavern/pull/5221/changes#r2850052729

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Cohee <18619528+Cohee1207@users.noreply.github.com>
2026-02-25 23:44:12 +02:00
Cohee 3f8b9998ca gork-imagine
Closes #5216
2026-02-24 23:35:21 +02:00