Commit cac3cb0
🤖 feat: add google:gemini-3-flash-preview model (#1216)
Add `google:gemini-3-flash-preview` as a known model with Flash-specific thinking levels.

### Changes

- Add `GEMINI_3_FLASH` to `knownModels.ts` with alias `gemini-3-flash`
- Update thinking policy: Flash supports `off/low/medium/high` (vs Pro's `low/high`)
- Update `providerOptions.ts` to pass `medium` directly for Flash (Pro maps `medium` → `high`)
- Refresh `models.json` from LiteLLM (includes gemini-3-flash-preview pricing)
- Fix `modelStats.ts` type to allow `null` for `max_output_tokens` (LiteLLM compat)

---

<details>
<summary>📋 Implementation Plan</summary>

# Plan: Add `google:gemini-3-flash-preview` to Known Models

## Summary

Add a new Google Gemini 3 Flash Preview model entry and update the thinking policy to reflect Flash-specific thinking levels.

**Estimated LoC change:** +25 lines (production code only, excluding auto-fetched JSON)

## Changes

### 1. Add model definition to `src/common/constants/knownModels.ts`

Insert a new `GEMINI_3_FLASH` entry after the existing `GEMINI_3_PRO` definition:

```typescript
GEMINI_3_FLASH: {
  provider: "google",
  providerModelId: "gemini-3-flash-preview",
  aliases: ["gemini-3-flash"],
  tokenizerOverride: "google/gemini-2.5-pro",
},
```

<details>
<summary>Why these values?</summary>

- **provider:** `"google"` — matches the existing Gemini model pattern
- **providerModelId:** `"gemini-3-flash-preview"` — the exact model ID requested
- **aliases:** `["gemini-3-flash"]` — convenient shorthand without the `-preview` suffix
- **tokenizerOverride:** `"google/gemini-2.5-pro"` — reuses the same tokenizer as `GEMINI_3_PRO`, since `ai-tokenizer` likely doesn't have Gemini 3-specific tokenizers yet
- **warm:** Not set (defaults to false) — Flash models are typically secondary, not preloaded

</details>

### 2. Update thinking policy in `src/browser/utils/thinking/policy.ts`

Per [Google's Gemini 3 docs](https://ai.google.dev/gemini-api/docs/gemini-3#rest), thinking levels differ between Pro and Flash:

| Model | Thinking Levels |
|-------|-----------------|
| Gemini 3 Pro | `low`, `high` |
| Gemini 3 Flash | `minimal`, `low`, `medium`, `high` |

**Note:** Google's `minimal` maps to mux's `off` (no/minimal thinking).

Update the `gemini-3` check in `getThinkingPolicyForModel()` to differentiate Flash from Pro:

```typescript
// Gemini 3 Flash supports 4 levels: off (minimal), low, medium, high
if (withoutProviderNamespace.includes("gemini-3-flash")) {
  return ["off", "low", "medium", "high"];
}
// Gemini 3 Pro only supports "low" and "high" reasoning levels
if (withoutProviderNamespace.includes("gemini-3")) {
  return ["low", "high"];
}
```

The Flash check must come **before** the generic `gemini-3` check, since the Flash model ID contains `gemini-3` as a substring.

### 3. Refresh `src/common/utils/tokens/models.json` from LiteLLM

LiteLLM already includes `gemini-3-flash-preview`. Just run the existing update script:

```bash
bun scripts/update_models.ts
```

This fetches the latest model data from [LiteLLM's model_prices_and_context_window.json](https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json).

### 4. Update `src/common/utils/ai/providerOptions.ts` for Flash-specific thinking levels

The current code treats all `gemini-3` models the same, mapping `medium` → `high`. But Flash actually supports `medium` natively:

| Model | Google API levels |
|-------|-------------------|
| Gemini 3 Pro | `low`, `high` |
| Gemini 3 Flash | `minimal`, `low`, `medium`, `high` |

Update the Google provider options to differentiate Flash from Pro:

```typescript
if (isGemini3) {
  const isFlash = modelString.includes("gemini-3-flash");
  if (isFlash) {
    // Flash supports: minimal (maps to off), low, medium, high
    // When off, we don't set thinkingConfig at all (API defaults to minimal)
    thinkingConfig.thinkingLevel = effectiveThinking === "xhigh" ? "high" : effectiveThinking;
  } else {
    // Pro only supports: low, high - map medium/xhigh to high
    thinkingConfig.thinkingLevel =
      effectiveThinking === "medium" || effectiveThinking === "xhigh"
        ? "high"
        : effectiveThinking;
  }
}
```

### 5. Update tests in `src/browser/utils/thinking/policy.test.ts`

Add a test case for the new Flash model:

```typescript
expect(getThinkingPolicyForModel("google:gemini-3-flash-preview")).toEqual([
  "off",
  "low",
  "medium",
  "high",
]);
```

## Systems That Work Automatically

| System | Reason |
|--------|--------|
| **Model display** | Generic Gemini parsing in `modelDisplay.ts` will format as "Gemini 3 Flash Preview" |
| **Tokenizer** | Override specified; falls back to the gemini-2.5-pro tokenizer |

## Verification

After implementation:

1. `make typecheck` — ensures type correctness
2. `bun test src/browser/utils/thinking/policy.test.ts` — verify the thinking policy
3. Verify the model appears in the `KNOWN_MODEL_OPTIONS` export

</details>

---

_Generated with `mux` • Model: `anthropic:claude-opus-4-5` • Thinking: `high`_

Signed-off-by: Thomas Kosiewski <tk@coder.com>
1 parent f51ad35 commit cac3cb0

File tree

7 files changed: +1074, -107 lines

docs/models.mdx

Lines changed: 14 additions & 13 deletions

```diff
@@ -15,19 +15,20 @@ mux ships with a curated set of first-class models that we keep up to date with

 {/* BEGIN KNOWN_MODELS_TABLE */}

-| Model | ID | Aliases | Default |
-| -------------------- | --------------------------- | ---------------------------------------- | ------- |
-| Opus 4.5 | anthropic:claude-opus-4-5 | `opus` ||
-| Sonnet 4.5 | anthropic:claude-sonnet-4-5 | `sonnet` | |
-| Haiku 4.5 | anthropic:claude-haiku-4-5 | `haiku` | |
-| GPT-5.2 | openai:gpt-5.2 | `gpt` | |
-| GPT-5.2 Pro | openai:gpt-5.2-pro | `gpt-pro` | |
-| GPT-5.1 Codex | openai:gpt-5.1-codex | `codex` | |
-| GPT-5.1 Codex Mini | openai:gpt-5.1-codex-mini | `codex-mini` | |
-| GPT-5.1 Codex Max | openai:gpt-5.1-codex-max | `codex-max` | |
-| Gemini 3 Pro Preview | google:gemini-3-pro-preview | `gemini-3`, `gemini-3-pro` | |
-| Grok 4 1 Fast | xai:grok-4-1-fast | `grok`, `grok-4`, `grok-4.1`, `grok-4-1` | |
-| Grok Code Fast 1 | xai:grok-code-fast-1 | `grok-code` | |
+| Model | ID | Aliases | Default |
+| ---------------------- | ----------------------------- | ---------------------------------------- | ------- |
+| Opus 4.5 | anthropic:claude-opus-4-5 | `opus` ||
+| Sonnet 4.5 | anthropic:claude-sonnet-4-5 | `sonnet` | |
+| Haiku 4.5 | anthropic:claude-haiku-4-5 | `haiku` | |
+| GPT-5.2 | openai:gpt-5.2 | `gpt` | |
+| GPT-5.2 Pro | openai:gpt-5.2-pro | `gpt-pro` | |
+| GPT-5.1 Codex | openai:gpt-5.1-codex | `codex` | |
+| GPT-5.1 Codex Mini | openai:gpt-5.1-codex-mini | `codex-mini` | |
+| GPT-5.1 Codex Max | openai:gpt-5.1-codex-max | `codex-max` | |
+| Gemini 3 Pro Preview | google:gemini-3-pro-preview | `gemini-3`, `gemini-3-pro` | |
+| Gemini 3 Flash Preview | google:gemini-3-flash-preview | `gemini-3-flash` | |
+| Grok 4 1 Fast | xai:grok-4-1-fast | `grok`, `grok-4`, `grok-4.1`, `grok-4-1` | |
+| Grok Code Fast 1 | xai:grok-code-fast-1 | `grok-code` | |

 {/* END KNOWN_MODELS_TABLE */}
```

src/browser/utils/thinking/policy.test.ts

Lines changed: 10 additions & 1 deletion

```diff
@@ -159,10 +159,19 @@ describe("getThinkingPolicyForModel", () => {
     ]);
   });

-  test("returns low/high for Gemini 3", () => {
+  test("returns low/high for Gemini 3 Pro", () => {
     expect(getThinkingPolicyForModel("google:gemini-3-pro-preview")).toEqual(["low", "high"]);
   });

+  test("returns off/low/medium/high for Gemini 3 Flash", () => {
+    expect(getThinkingPolicyForModel("google:gemini-3-flash-preview")).toEqual([
+      "off",
+      "low",
+      "medium",
+      "high",
+    ]);
+  });
+
   test("returns all levels for other providers", () => {
     expect(getThinkingPolicyForModel("anthropic:claude-opus-4")).toEqual([
       "off",
```

src/browser/utils/thinking/policy.ts

Lines changed: 5 additions & 0 deletions

```diff
@@ -66,6 +66,11 @@ export function getThinkingPolicyForModel(modelString: string): ThinkingPolicy {
     return ["high"];
   }

+  // Gemini 3 Flash supports 4 levels: off (minimal), low, medium, high
+  if (withoutProviderNamespace.includes("gemini-3-flash")) {
+    return ["off", "low", "medium", "high"];
+  }
+
   // Gemini 3 Pro only supports "low" and "high" reasoning levels
   if (withoutProviderNamespace.includes("gemini-3")) {
     return ["low", "high"];
```
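The branch ordering in this hunk can be checked with a standalone sketch. This is an illustrative re-implementation, not the real `policy.ts` (the namespace-stripping line and the fallback branch are assumptions): the Flash check must run first, because `gemini-3-flash-preview` also contains `gemini-3` as a substring.

```typescript
type ThinkingLevel = "off" | "low" | "medium" | "high";

function thinkingPolicySketch(modelString: string): ThinkingLevel[] {
  // Drop an optional "provider:" prefix, e.g. "google:gemini-3-flash-preview"
  const withoutProviderNamespace = modelString.split(":").pop() ?? modelString;

  // Gemini 3 Flash supports 4 levels: off (minimal), low, medium, high
  if (withoutProviderNamespace.includes("gemini-3-flash")) {
    return ["off", "low", "medium", "high"];
  }
  // Gemini 3 Pro only supports "low" and "high" reasoning levels
  if (withoutProviderNamespace.includes("gemini-3")) {
    return ["low", "high"];
  }
  // Fallback for everything else: all levels allowed
  return ["off", "low", "medium", "high"];
}
```

Swapping the two `includes` checks would send `gemini-3-flash-preview` into the Pro branch and silently drop `off` and `medium`.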

src/common/constants/knownModels.ts

Lines changed: 6 additions & 0 deletions

```diff
@@ -83,6 +83,12 @@ const MODEL_DEFINITIONS = {
     aliases: ["gemini-3", "gemini-3-pro"],
     tokenizerOverride: "google/gemini-2.5-pro",
   },
+  GEMINI_3_FLASH: {
+    provider: "google",
+    providerModelId: "gemini-3-flash-preview",
+    aliases: ["gemini-3-flash"],
+    tokenizerOverride: "google/gemini-2.5-pro",
+  },
   GROK_4_1: {
     provider: "xai",
     providerModelId: "grok-4-1-fast",
```
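For intuition, here is how an entry of this shape might be resolved from an alias to a full `provider:id` model string. The entry mirrors the diff above, but `resolveAlias` and `SKETCH_DEFINITIONS` are hypothetical names, not part of `knownModels.ts`.

```typescript
// Hypothetical lookup over known-model definitions (shape mirrors the diff above)
const SKETCH_DEFINITIONS: Record<
  string,
  { provider: string; providerModelId: string; aliases: string[] }
> = {
  GEMINI_3_FLASH: {
    provider: "google",
    providerModelId: "gemini-3-flash-preview",
    aliases: ["gemini-3-flash"],
  },
};

function resolveAlias(nameOrAlias: string): string | undefined {
  for (const def of Object.values(SKETCH_DEFINITIONS)) {
    // Match either the exact provider model ID or any registered alias
    if (def.providerModelId === nameOrAlias || def.aliases.includes(nameOrAlias)) {
      return `${def.provider}:${def.providerModelId}`;
    }
  }
  return undefined;
}
```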

src/common/utils/ai/providerOptions.ts

Lines changed: 12 additions & 5 deletions

```diff
@@ -264,11 +264,18 @@ export function buildProviderOptions(
     };

     if (isGemini3) {
-      // Gemini 3 uses thinkingLevel (low/high) - map medium/xhigh to supported values
-      thinkingConfig.thinkingLevel =
-        effectiveThinking === "medium" || effectiveThinking === "xhigh"
-          ? "high"
-          : effectiveThinking;
+      const isFlash = modelString.includes("gemini-3-flash");
+      if (isFlash) {
+        // Flash supports: minimal (maps to off), low, medium, high
+        // When off, we don't set thinkingConfig at all (API defaults to minimal)
+        thinkingConfig.thinkingLevel = effectiveThinking === "xhigh" ? "high" : effectiveThinking;
+      } else {
+        // Pro only supports: low, high - map medium/xhigh to high
+        thinkingConfig.thinkingLevel =
+          effectiveThinking === "medium" || effectiveThinking === "xhigh"
+            ? "high"
+            : effectiveThinking;
+      }
     } else {
       // Gemini 2.5 uses thinkingBudget
       const budget = GEMINI_THINKING_BUDGETS[effectiveThinking];
```
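The Pro/Flash clamping in this hunk can be expressed as a pure function. This is an illustrative sketch, not the real `providerOptions.ts` API (the real code assigns onto `thinkingConfig` inline rather than calling a helper).

```typescript
type ThinkingSetting = "off" | "low" | "medium" | "high" | "xhigh";

function mapGemini3ThinkingLevel(modelString: string, effectiveThinking: ThinkingSetting): string {
  if (modelString.includes("gemini-3-flash")) {
    // Flash accepts low/medium/high natively, so only xhigh needs clamping
    return effectiveThinking === "xhigh" ? "high" : effectiveThinking;
  }
  // Pro accepts only low/high: both medium and xhigh clamp to high
  return effectiveThinking === "medium" || effectiveThinking === "xhigh"
    ? "high"
    : effectiveThinking;
}
```

The visible behavior change of this commit is the middle case: `medium` now passes through unchanged for Flash instead of being promoted to `high`.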

src/common/utils/tokens/modelStats.ts

Lines changed: 2 additions & 2 deletions

```diff
@@ -12,8 +12,8 @@ export interface ModelStats {
 }

 interface RawModelData {
-  max_input_tokens?: number | string;
-  max_output_tokens?: number | string;
+  max_input_tokens?: number | string | null;
+  max_output_tokens?: number | string | null;
   input_cost_per_token?: number;
   output_cost_per_token?: number;
   cache_creation_input_token_cost?: number;
```
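The type widening above exists because LiteLLM's JSON can contain entries like `"max_output_tokens": null`, which a consumer has to treat as "unknown" rather than coerce to a number. A minimal sketch of null-safe normalization, assuming a hypothetical `toTokenCount` helper (not the real `modelStats.ts` code):

```typescript
// Normalize a LiteLLM token-limit field: null/undefined and unparsable
// strings become undefined ("unknown"); numeric strings are parsed.
function toTokenCount(raw: number | string | null | undefined): number | undefined {
  if (raw === null || raw === undefined) return undefined;
  const n = typeof raw === "string" ? Number(raw) : raw;
  return Number.isFinite(n) ? n : undefined;
}
```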
